#Unicorn test
Rules:
- We only care about whether it can draw TikZ.
- Consequently, import errors don't count; we give the model a good-faith attempt to fix its import problems.
- We don't care what the model says to us that isn't TikZ code
- We always use exactly the prompt "Draw a unicorn in TikZ".
- Number of trials is totally variable. However, attempts should not be excluded (i.e., cherry-picked), so if we end up wanting to do this a lot I should not be doing it by hand in Overleaf any more.
The things I currently have on the tree don't follow rule 3 and need to be redone so that they do.
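Rules 2 and 3 could probably be applied mechanically instead of by hand. A minimal sketch, assuming the reply contains complete `tikzpicture` environments: keep only the TikZ, drop everything else the model said, and rebuild the document around a fixed preamble. `extract_tikz` and `wrap_standalone` are hypothetical helper names, not something that exists in the repo.

```python
import re

# Non-greedy match, so nested tikzpicture environments would be cut short (assumed rare).
PICTURE_RE = re.compile(r"\\begin\{tikzpicture\}.*?\\end\{tikzpicture\}", re.DOTALL)
LIBRARY_RE = re.compile(r"\\usetikzlibrary\{[^}]*\}")


def extract_tikz(reply):
    """Return every tikzpicture environment in the reply, in order (rule 3)."""
    return PICTURE_RE.findall(reply)


def wrap_standalone(picture, reply=""):
    """Wrap one picture in a fixed preamble, carrying over requested TikZ libraries (rule 2)."""
    libraries = LIBRARY_RE.findall(reply)
    lines = [
        "\\documentclass{standalone}",
        "\\usepackage{tikz}",
        *libraries,
        "\\begin{document}",
        picture,
        "\\end{document}",
    ]
    return "\n".join(lines) + "\n"
```

Every picture found gets kept, so nothing is filtered out before a human looks at it.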
I heard someone say that the specific test of TikZ unicorns is becoming polluted by all the failed TikZ unicorns that have gone into training sets since the concept was introduced by the GPT-4 testers. I don't know if it's true or not.
I have seen so few people do it that I would be skeptical, but the concern is valid; it might make sense to pick a different mythological creature too
merged, thx
rebuild pdf plox
I'm concerned about the quotes in the subsections, the syntax highlighting does not seem to like those
overleaf is fine with it
it is mad about the words "pathmorphing", "gyges" and "tikz"
lmao ok
but those are all real words for some value of real so the spellchecker can fuck off
i did legitimately look at finding good all-oss ways to do TeX a while ago and determined that most institutional knowledge has been centralized in overleaf specifically and nobody actually knows what a correct TeX environment + imports + etc is
I just use TeXstudio and whatever it recommends
if someone finds a counterexample it makes sense to put it into ci/cd
yeah I have TeXstudio and it was uh
it was displeased with many things that are normal constructions on overleaf
in ways i could not easily resolve
i am not enough of a purist to use texstudio if it is going to take me more than an hour to figure out what the "correct" environment is
rebuilding pdf btw
for bonus mythological creatures i would like to suggest a sphinx
probably unicorn is fine, i wasnt trying to bikeshed the project right out of the gate lol
i wish i worked with you
it does make sense to consider that, it's not like it's a very high-investment test
I have to resist the temptation to suggest automating this by writing a script to hit up an inference provider
btw just cloned the repo and built it in TeXstudio with no changes, works fine
before you decide to switch from a unicorn, have you submitted the Animal Safety Evaluation, and the Project Setup document, and the Cultural Sensitivity Checklist?
we also need two or three people telling me that the existing solution in open source doesn't use animals at all because they think animals are lame, and that there's another internal project that is devoted to benchmarks based upon SVG art and maybe we should do SVG art instead. but also the other internal project doesn't work
i can tell them "i checked the svg team and their code doesn't build" but they will suggest i check if the svg guys' project works at least once a month
anyway if we do a separate critter it should be a separate tex file and separate pdf
because the benefit of scrolling across different critters is basically nil
we are at the exact limit of how much i care about this rn but anyone's free to pr it with more stuff if they want
If someone wants to work on this, just make it a generic script to plug in any HF model and then evaluate the "unicorn-ness" score through the infinite public VLMs out there.
Or I guess use GPT4V
i would be skeptical of doing this without saving outputs to have a human spot check
well you could make it do multiple and then review them manually
its not like we'd be testing hundreds of models
it would be on the scale of a few dozen likely
i mean i don't care if we give it a score i just wouldn't want to replace the human spot check
i don't want a score, i want to be able to see visually how well the model is doing at this specific task
i trust that method more than i trust the scoring model
ohk then someone could just implement the part where you just put in an HF model name 🤷‍♂️
if someone wants to do scoring too that's fine by me but kind of the appeal is to have a visual output
I was thinking of writing a script that would use Together/whatever other inference provider because running the bigger models is often more trouble than it's worth, especially for a one-off
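A rough sketch of what such a script could look like, without committing to any particular provider: `generate` is a placeholder for whatever call actually hits the inference API (Together, a local HF model, anything that maps a prompt to a reply), and `run_trials` plus the directory layout are assumptions made up for illustration. The point is that every attempt is written to disk, so nothing gets cherry-picked and a human can still spot check the raw outputs.

```python
from pathlib import Path
from typing import Callable, List

PROMPT = "Draw a unicorn in TikZ"


def run_trials(
    generate: Callable[[str], str],
    model_name: str,
    n_trials: int,
    out_dir: Path,
) -> List[Path]:
    """Call generate(PROMPT) n_trials times and save every raw reply, none skipped."""
    # Model names like "org/model" contain a slash, so flatten them for the directory name.
    model_dir = out_dir / model_name.replace("/", "__")
    model_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for i in range(n_trials):
        reply = generate(PROMPT)
        path = model_dir / f"trial_{i:03d}.txt"
        path.write_text(reply, encoding="utf-8")
        saved.append(path)
    return saved
```

Keeping the provider call behind a plain callable means the same loop works for a hosted API and for a local model.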
...would be interesting to see how finetunes impact unicorn quality though
the rumor would be that it is uniformly worse
iirc it's buried deep in some recording of a talk by the ms eval people
gpt-4 got progressively worse at the unicorn throughout finetuning
yes, that's from the now-infamous sparks of AGI paper/talk
the unicorn got worse as GPT-4 was finetuned
I don't think that necessarily generalises though
this would be a test of "is this a universal property of finetunes"
yeah
it doesn't necessarily generalize but i would be surprised if it doesn't
surprising tests are the best tests
what I'd also like to do is to include the generation settings here and try to make it as reproducible as possible
that should be interesting for the purposes of investigating finetunes as well
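One way to do that would be to store the settings next to every output. A minimal sketch, where the field names (`temperature`, `top_p`, `max_tokens`, `seed`) and their defaults are assumptions about what a provider exposes, not settings anyone in this thread actually used:

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class GenerationSettings:
    """Hypothetical record of everything needed to rerun one trial."""
    model: str
    prompt: str = "Draw a unicorn in TikZ"
    temperature: float = 1.0
    top_p: float = 1.0
    max_tokens: int = 2048
    seed: Optional[int] = None  # not every provider honours a seed


def settings_record(settings: GenerationSettings, reply: str) -> str:
    """Serialise the settings and the raw reply together as one JSON document."""
    return json.dumps(
        {"settings": asdict(settings), "reply": reply},
        indent=2,
        sort_keys=True,
    )
```

Sorting the keys keeps the records diffable, which helps when comparing a base model against its finetunes.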
That's just the alignment tax
apparently, according to louis it doesn't exist (If alignment is done "properly") - but seeing how OAI, GDM and now FAIR all report it, I kinda doubt him
I'd be down to help out and throw around ideas for comedic value if nothing else! For grading, GPT4V makes sense - I wonder if we could use Mechanical Turk? I've always wanted a toy project to try Mechanical Turk on. No idea on the pricing or reliability tho.
if you wanna organize galleries per model or large banks against many models go for it, i would merge results and make sure they were reachable from repo root readme if they were well documented enough to feel like signal
In light of recent discussions on GPT4's behavior changes post June update, I noticed interesting trends via https://t.co/4VOT8Ko91m. It seems GPT3.5's unicorn drawings improved while GPT4's declined. Check out these drawings and compare to the preupdate ones in my previous tweet
i compiled some results here: https://github.com/brsolo/Unicorn-Test i played around trying to make galleries but couldn't figure out how to get the latex working for that. if i were to attempt it again, i think i would just export the tikz's as svg and create a gallery from those. i feel like this is mostly a UI challenge though.
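For the gallery half of that idea, a small sketch that assumes the SVGs already exist on disk (the TikZ to SVG conversion step is not shown here) and just builds a static HTML page from them. `build_gallery` and the page layout are made up for illustration:

```python
from html import escape
from pathlib import Path


def build_gallery(svg_paths, title="Unicorn test gallery"):
    """Return a static HTML page with one captioned figure per SVG file."""
    figures = []
    for path in sorted(svg_paths, key=lambda p: p.as_posix()):
        src = escape(path.as_posix())  # escape &, <, > and quotes in file names
        figures.append(
            '<figure><img src="%s" alt="%s" width="300">'
            "<figcaption>%s</figcaption></figure>" % (src, src, src)
        )
    head = '<head><meta charset="utf-8"><title>%s</title></head>' % escape(title)
    body = "<body>\n%s\n</body>" % "\n".join(figures)
    return "<!DOCTYPE html>\n<html>\n%s\n%s\n</html>\n" % (head, body)
```

Usage would be something like `Path("index.html").write_text(build_gallery(Path("svg").glob("**/*.svg")))`, with the caption showing which model and trial each drawing came from.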
nice
gpt2-chatbot:
Draw a unicorn in TikZ
Creating a drawing of a unicorn using TikZ in LaTeX can be a fun exercise in learning how to use this powerful tool for creating graphic elements in documents. Below, I'll provide you with a basic example of how to draw a stylized unicorn. This example will cover drawing the outline of a unicorn, including its body, head, horn, and other details.
First, make sure you have the TikZ package installed and included in your LaTeX document. You can do this by adding \usepackage{tikz} in the preamble of your document.
Here’s a simple LaTeX script to draw a unicorn:
\documentclass{article}
\usepackage{tikz}
\begin{document}
\begin{tikzpicture}
% Body
\draw (0,0) ellipse (1.5 and 0.5);
% Head
\draw (-0.5,0.5) arc (150:30:0.6 and 0.8);
% Rear
\draw (0.5,-0.5) arc (-30:220:0.3 and 0.4);
% Front leg
\draw (-1.1,-0.5) -- (-1.1,-1.5);
\draw (-0.9,-0.5) -- (-0.9,-1.4);
% Rear leg
\draw (1,-0.5) -- (1,-1.5);
\draw (1.2,-0.5) -- (1.2,-1.4);
% Tail
\draw (1.5,0) .. controls (2,0.5) and (2,-1) .. (1.5,-0.5);
% Horn
\draw (-0.6,1.3) -- (-0.9,1.8);
% Mane
\draw (-0.5,1) .. controls (-0.8,1.2) .. (-0.5,1.4);
\draw (-0.5,0.8) .. controls (-0.8,1) .. (-0.5,1.2);
\draw (-0.5,0.6) .. controls (-0.8,0.8) .. (-0.5,1);
\draw (-0.5,1.2) .. controls (-0.8,1.4) .. (-0.5,1.6);
\draw (-0.5,1) .. controls (-0.8,1.2) .. (-0.5,1.4);
\draw (-0.5,0.8) .. controls (-0.8,1) .. (-0.5,1.2);
\draw (-0.5,0.6) .. controls (-0.8,0.8) .. (-0.5,1);
\draw (-0.5,1.2) .. controls (-0.8,1.4) .. (-0.5,1.6);
% Eye
\fill (-0.6,1.1) circle (0.05);
\end{tikzpicture}
\end{document}
Explanation of the Code
Body and Head: The body is drawn using an ellipse, and the head is added with an arc, tailored to make a smooth transition from the body.
Legs: Simple lines are used for the legs, with slight adjustments to make the front and rear legs distinct.
Tail: A curved line using controls to give the tail a flowing look.
Horn: A simple line for the horn, positioned on the head.
Mane: Several bezier curves representing the mane, using controls to create waves.
Eye: A small filled circle represents the eye.
This TikZ script will create a basic but recognizable unicorn figure. You can enhance and refine this drawing by adjusting the parameters and adding more details as needed, such as more complex mane designs, facial features, or a more dynamic pose. TikZ is very flexible and allows for a lot of creativity once you get the hang of the basics.
idea: take the list of imagenet classes (just to get a random ontology), tell model "output tikz code to draw a <insert class name>"
use some sort of clip-based loss to evaluate match
boom scalable unicorn test
you're relying on the clip model being good enough to evaluate it tho
I expect it to be correlated with human judgement
not terrible
I'd expect a soft score measuring "Unicorn-ness" to work better
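A sketch of what a soft score could mean in practice: cosine similarity between the rendered image's embedding and a few candidate label embeddings, pushed through a softmax. The embeddings are assumed to come from some CLIP-style model that is not shown here, the `temperature` value is an arbitrary assumption, and `make_prompt` and `soft_scores` are hypothetical names.

```python
import math


def make_prompt(class_name):
    """Prompt template from the idea above, filled in with one class name."""
    return "output tikz code to draw a %s" % class_name


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def soft_scores(image_emb, label_embs, temperature=0.01):
    """Softmax over similarities: a score in [0, 1] for each candidate label."""
    logits = {label: cosine(image_emb, emb) / temperature
              for label, emb in label_embs.items()}
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(value - top) for label, value in logits.items()}
    total = sum(exps.values())
    return {label: value / total for label, value in exps.items()}
```

The "unicorn-ness" of a drawing is then `soft_scores(...)["unicorn"]` against distractors such as "horse", which is only as trustworthy as the embedding model, so it complements the human spot check rather than replacing it.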
\documentclass{standalone}
\usepackage{tikz}
\begin{document}
\begin{tikzpicture}
% Draw the head
\draw[thick] (0,0)
.. controls (1, 1.5) and (3, 1.5) .. (4,1)
.. controls (4.5, 0.8) and (4.5,-0.6) .. (4,-1)
.. controls (3.5, -1.5) and (1.5, -1.5) .. (0,0);
% Draw the horn
\draw[thick] (2, 1.5) -- (2.5, 4)
-- (3, 1.8);
% Draw the ear
\draw[thick] (1.5, 1.5)
.. controls (1.8, 2.5) and (2.2, 2.5) .. (2.5, 1.5);
% Draw the mane
\draw[thick, purple] (2, 1.2)
.. controls (1, 2) and (0, 2) .. (-1, 1.5);
\draw[thick, purple] (2, 0.8)
.. controls (0.8, 1.5) and (-0.2, 1.5) .. (-1.2, 1);
\draw[thick, purple] (2, 0.4)
.. controls (0.8, 1) and (-0.5, 1) .. (-1, 0.5);
\draw[thick, purple] (2, 0.1)
.. controls (1, 0.5) and (0, 0.5) .. (-0.5, 0.1);
% Draw the eye
\draw[thick] (3, 0) circle (0.2);
\fill (3, 0) circle (0.1);
% Draw the mouth
\draw[thick] (3.5, -0.5)
.. controls (3.8, -0.8) and (3.2, -0.8) .. (3.5, -0.5);
\end{tikzpicture}
\end{document}
Philpax (jax/acc)
Compile Error!

Here is a simple example of a unicorn drawn using TikZ:
\documentclass{article}
\usepackage{tikz}
\begin{document}
\begin{tikzpicture}
% body
\draw[fill=white] (0,0) ellipse (2cm and 1.5cm);
% head
\draw[fill=white] (-1.5,2.5) circle (1cm);
% horn
\draw[fill=white] (-2.5,3.5) -- (-1.5,4.5) -- (-0.5,3.5) -- cycle;
% ears
\draw[fill=white] (-1,3) -- (-0.5,4) -- (0,3) -- cycle;
\draw[fill=white] (0.5,3) -- (1,4) -- (1.5,3) -- cycle;
% eyes
\draw[fill=black] (-1.2,2.8) circle (0.2cm);
\draw[fill=black] (0.2,2.8) circle (0.2cm);
% legs
\draw[fill=white] (-1.5,0) -- (-1.5,-1.5) -- (-0.5,-1.5) -- (-0.5,0) -- cycle;
\draw[fill=white] (0.5,0) -- (0.5,-1.5) -- (1.5,-1.5) -- (1.5,0) -- cycle;
% tail
\draw[fill=white] (2,0) -- (2,-1.5) -- (1.5,-1.5) -- (1.5,0) -- cycle;
\end{tikzpicture}
\end{document}
This code will produce a simple unicorn diagram. You can adjust the coordinates and shapes to create a more complex and detailed unicorn.
Note: You need to have TikZ package installed in your LaTeX distribution to run this code.
Also, you can use online LaTeX editors like Overleaf or ShareLaTeX to run this code without installing anything on your local machine.
hmm, no bot?
(wave claims this is 405b, the actual code does not seem that impressive but maybe it looks good when rendered)
nvm #off-topic message
rip
Latest claude vs. new 4o-latest(🍓?) :
more from chatgpt-4o-latest
floating penis syndrome 
there's definitely a pun around 2(x) horn(y) here, but cannot optimise it to be witty enough