#Unicorn test


burnt echo
#

Because nobody else is doing it or takes it seriously, the unicorn test is sort of a good smoke test of new LLMs.

Ideally all models should be tested in the exact same way, and results should be aggregated centrally.

#

Rules:

  1. We only care about whether it can draw TikZ.
  2. Consequently, import errors don't count; we give the model a good-faith attempt to fix its import problems.
  3. We don't care what the model says to us that isn't TikZ code
  4. We always use exactly the prompt "Draw a unicorn in TikZ".
  5. Number of trials is totally variable. However, attempts should not be excluded (i.e., cherry-picked), so if we end up wanting to do this a lot I should not be doing it by hand in Overleaf any more.
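
A minimal sketch of how rules 1 to 4 could be applied mechanically, assuming the model's reply arrives as a plain string. The names `build_document`, `PREAMBLE`, `TIKZ_RE` and `LIB_RE` are hypothetical, and the choice of the `standalone` class is an assumption rather than anything agreed in this thread. It keeps only the first `tikzpicture` environment, carries over any `\usetikzlibrary` lines the model wrote, and discards everything else the model said.

```python
import re
from typing import Optional

PROMPT = "Draw a unicorn in TikZ"  # the fixed prompt from rule 4

# Assumed preamble: the standalone class with the tikz option loads TikZ itself.
PREAMBLE = r"\documentclass[tikz,border=5pt]{standalone}"

# First tikzpicture environment in the reply (non-greedy, spans newlines).
TIKZ_RE = re.compile(r"\\begin\{tikzpicture\}.*?\\end\{tikzpicture\}", re.DOTALL)
# Library imports the model asked for, kept so missing imports don't count against it.
LIB_RE = re.compile(r"\\usetikzlibrary\{[^}]*\}")


def build_document(response: str) -> Optional[str]:
    """Wrap the model's TikZ in a fresh document, ignoring all non-TikZ text.

    Returns None when the reply contains no tikzpicture at all; that attempt
    should still be recorded as a failure, not dropped.
    """
    match = TIKZ_RE.search(response)
    if match is None:
        return None
    libraries = LIB_RE.findall(response)
    parts = [PREAMBLE, *libraries, r"\begin{document}", match.group(0), r"\end{document}"]
    return "\n".join(parts) + "\n"
```

This is deliberately lossy: custom colors or macros defined outside the picture are dropped, so a failed compile may still need the by-hand import fix from rule 2.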
#

The things I currently have on the tree don't follow rule 3 and need to be redone so that they do.

keen reef
#

I heard someone say that the specific test of tikz unicorns is becoming polluted by all the failed tikz unicorns that have gone into training sets since the concept was introduced by the gpt 4 testers. I don't know if it's true or not.

dawn solstice
#

rebuild pdf plox

#

I'm concerned about the quotes in the subsections, the syntax highlighting does not seem to like those

burnt echo
#

overleaf is fine with it

#

it is mad about the words "pathmorphing", "gyges" and "tikz"

dawn solstice
#

lmao ok

burnt echo
#

but those are all real words for some value of real so the spellchecker can fuck off

#

i did legitimately look at finding good all-oss ways to do TeX a while ago and determined that most institutional knowledge has been centralized in overleaf specifically and nobody actually knows what a correct TeX environment + imports + etc is

dawn solstice
#

I just use TeXstudio and whatever it recommends

burnt echo
#

if someone finds a counterexample it makes sense to put it into ci/cd

#

yeah I have TeXstudio and it was uh

#

it was displeased with many things that are normal constructions on overleaf

#

in ways i could not easily resolve

#

i am not enough of a purist to use texstudio if it is going to take me more than an hour to figure out what the "correct" environment is

#

rebuilding pdf btw

#

for bonus mythological creatures i would like to suggest a sphinx

keen reef
#

probably unicorn is fine, i wasn't trying to bikeshed the project right out of the gate lol

burnt echo
#

it does make sense to consider that, it's not like it's a very high-investment test

dawn solstice
#

I have to resist the temptation to suggest automating this by writing a script to hit up an inference provider

#

btw just cloned the repo and built it in TeXstudio with no changes, works fine

willow remnant
# burnt echo i wish i worked with you

before you decide to switch from a unicorn, have you submitted the Animal Safety Evaluation, and the Project Setup document, and the Cultural Sensitivity Checklist?

burnt echo
#

i can tell them "i checked the svg team and their code doesn't build" but they will suggest i check if the svg guys' project works at least once a month

#

anyway if we do a separate critter it should be a separate tex file and separate pdf

#

because the benefit of scrolling across different critters is basically nil

#

we are at the exact limit of how much i care about this rn but anyone's free to pr it with more stuff if they want

fiery ibex
#

If someone wants to work on this, just make it a generic script to plug in any HF model and then evaluate the "unicorn-ness" score through the infinite public VLMs out there.

#

Or I guess use GPT4V

burnt echo
fiery ibex
#

it's not like we'd be testing hundreds of models

#

it would be on the scale of a few dozen likely

burnt echo
#

i mean i don't care if we give it a score i just wouldn't want to replace the human spot check

#

i don't want a score, i want to be able to see visually how well the model is doing at this specific task

#

i trust that method more than i trust the scoring model

fiery ibex
#

ohk then someone could just implement the part where you just put in an HF model name 🤷‍♂️

burnt echo
#

if someone wants to do scoring too that's fine by me but kind of the appeal is to have a visual output

dawn solstice
#

...would be interesting to see how finetunes impact unicorn quality though

burnt echo
#

iirc it's buried deep in some recording of a talk by the ms eval people

#

gpt-4 got progressively worse at the unicorn throughout finetuning

dawn solstice
#

yes, that's from the now-infamous sparks of AGI paper/talk

#

the unicorn got worse as GPT-4 was finetuned

#

I don't think that necessarily generalises though

burnt echo
#

this would be a test of "is this a universal property of finetunes"

#

yeah

#

it doesn't necessarily generalize but i would be surprised if it doesn't

#

surprising tests are the best tests

dawn solstice
#

that should be interesting for the purposes of investigating finetunes as well

fiery ibex
#

apparently, according to louis it doesn't exist (If alignment is done "properly") - but seeing how OAI, GDM and now FAIR all report it, I kinda doubt him

tacit fossil
#

I'd be down to help out and throw around ideas for comedic value if nothing else! For grading gpt4v makes sense - I wonder if we could use Mechanical Turk? I've always wanted a toy project to try Mechanical Turk on. No idea on the pricing or reliability tho.

burnt echo
tacit fossil
#

i compiled some results here: https://github.com/brsolo/Unicorn-Test i played around trying to make galleries but couldn't figure out how to get the latex working for that. if i were to try that again, i think i would just export the tikz's as svg and create a gallery from those. i feel like this is mostly a UI challenge though.


visual ferry
#

gpt2-chatbot:

Draw a unicorn in TikZ
Creating a drawing of a unicorn using TikZ in LaTeX can be a fun exercise in learning how to use this powerful tool for creating graphic elements in documents. Below, I'll provide you with a basic example of how to draw a stylized unicorn. This example will cover drawing the outline of a unicorn, including its body, head, horn, and other details.

First, make sure you have the TikZ package installed and included in your LaTeX document. You can do this by adding \usepackage{tikz} in the preamble of your document.

Here’s a simple LaTeX script to draw a unicorn:

\documentclass{article}
\usepackage{tikz}

\begin{document}
\begin{tikzpicture}
    % Body
    \draw (0,0) ellipse (1.5 and 0.5);
    % Head
    \draw (-0.5,0.5) arc (150:30:0.6 and 0.8);
    % Rear
    \draw (0.5,-0.5) arc (-30:220:0.3 and 0.4);
    % Front leg
    \draw (-1.1,-0.5) -- (-1.1,-1.5);
    \draw (-0.9,-0.5) -- (-0.9,-1.4);
    % Rear leg
    \draw (1,-0.5) -- (1,-1.5);
    \draw (1.2,-0.5) -- (1.2,-1.4);
    % Tail
    \draw (1.5,0) .. controls (2,0.5) and (2,-1) .. (1.5,-0.5);
    % Horn
    \draw (-0.6,1.3) -- (-0.9,1.8);
    % Mane
    \draw (-0.5,1) .. controls (-0.8,1.2) .. (-0.5,1.4);
    \draw (-0.5,0.8) .. controls (-0.8,1) .. (-0.5,1.2);
    \draw (-0.5,0.6) .. controls (-0.8,0.8) .. (-0.5,1);
    \draw (-0.5,1.2) .. controls (-0.8,1.4) .. (-0.5,1.6);
    \draw (-0.5,1) .. controls (-0.8,1.2) .. (-0.5,1.4);
    \draw (-0.5,0.8) .. controls (-0.8,1) .. (-0.5,1.2);
    \draw (-0.5,0.6) .. controls (-0.8,0.8) .. (-0.5,1);
    \draw (-0.5,1.2) .. controls (-0.8,1.4) .. (-0.5,1.6);
    % Eye
    \fill (-0.6,1.1) circle (0.05);
\end{tikzpicture}
\end{document}

Explanation of the Code

    Body and Head: The body is drawn using an ellipse, and the head is added with an arc, tailored to make a smooth transition from the body.
    Legs: Simple lines are used for the legs, with slight adjustments to make the front and rear legs distinct.
    Tail: A curved line using controls to give the tail a flowing look.
    Horn: A simple line for the horn, positioned on the head.
    Mane: Several bezier curves representing the mane, using controls to create waves.
    Eye: A small filled circle represents the eye.

This TikZ script will create a basic but recognizable unicorn figure. You can enhance and refine this drawing by adjusting the parameters and adding more details as needed, such as more complex mane designs, facial features, or a more dynamic pose. TikZ is very flexible and allows for a lot of creativity once you get the hang of the basics.
misty marsh
#

idea: take the list of imagenet classes (just to get a random ontology), tell model "output tikz code to draw a <insert class name>"

#

use some sort of clip-based loss to evaluate match

#

boom scalable unicorn test
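
A rough sketch of the two pieces this idea needs besides the models themselves, assuming the rendered image and the class name have already been turned into embedding vectors by some CLIP-style encoder (not shown, and not specified in this thread). The function names `build_prompt`, `cosine_similarity` and `match_loss` are hypothetical, and "1 minus cosine similarity" is just one plausible reading of "clip-based loss".

```python
import math
from typing import Sequence


def build_prompt(class_name: str) -> str:
    """Fill the class name into the prompt template quoted above."""
    return f"output tikz code to draw a {class_name}"


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_loss(image_embedding: Sequence[float], text_embedding: Sequence[float]) -> float:
    """Lower is better: 0 for identical directions, 1 for orthogonal ones."""
    return 1.0 - cosine_similarity(image_embedding, text_embedding)
```

A loss like this gives a soft number per class, but it does not replace looking at the rendered pictures.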

burnt echo
misty marsh
#

I expect it to be correlated with human judgement

#

not terrible

#

I'd expect a soft score measuring "Unicorn-ness" to work better

dawn solstice
#
\documentclass{standalone}
\usepackage{tikz}

\begin{document}

\begin{tikzpicture}
    % Draw the head
    \draw[thick] (0,0) 
        .. controls (1, 1.5) and (3, 1.5) .. (4,1) 
        .. controls (4.5, 0.8) and (4.5,-0.6) .. (4,-1)
        .. controls (3.5, -1.5) and (1.5, -1.5) .. (0,0);

    % Draw the horn
    \draw[thick] (2, 1.5) -- (2.5, 4)
            -- (3, 1.8);

    % Draw the ear
    \draw[thick] (1.5, 1.5) 
        .. controls (1.8, 2.5) and (2.2, 2.5) .. (2.5, 1.5);

    % Draw the mane
    \draw[thick, purple] (2, 1.2) 
        .. controls (1, 2) and (0, 2) .. (-1, 1.5);
    \draw[thick, purple] (2, 0.8) 
        .. controls (0.8, 1.5) and (-0.2, 1.5) .. (-1.2, 1);
    \draw[thick, purple] (2, 0.4) 
        .. controls (0.8, 1) and (-0.5, 1) .. (-1, 0.5);
    \draw[thick, purple] (2, 0.1) 
        .. controls (1, 0.5) and (0, 0.5) .. (-0.5, 0.1);
        
    % Draw the eye
    \draw[thick] (3, 0) circle (0.2);
    \fill (3, 0) circle (0.1);
    
    % Draw the mouth
    \draw[thick] (3.5, -0.5) 
        .. controls (3.8, -0.8) and (3.2, -0.8) .. (3.5, -0.5);
\end{tikzpicture}

\end{document}
fast cape BOT
#

Philpax (jax/acc)
Compile Error! Click the errors reaction for more information.
(You may edit your message to recompile.)

dawn solstice
dawn solstice
#

Here is a simple example of a unicorn drawn using TikZ:

\documentclass{article}
\usepackage{tikz}

\begin{document}

\begin{tikzpicture}
  % body
  \draw[fill=white] (0,0) ellipse (2cm and 1.5cm);
  
  % head
  \draw[fill=white] (-1.5,2.5) circle (1cm);
  
  % horn
  \draw[fill=white] (-2.5,3.5) -- (-1.5,4.5) -- (-0.5,3.5) -- cycle;
  
  % ears
  \draw[fill=white] (-1,3) -- (-0.5,4) -- (0,3) -- cycle;
  \draw[fill=white] (0.5,3) -- (1,4) -- (1.5,3) -- cycle;
  
  % eyes
  \draw[fill=black] (-1.2,2.8) circle (0.2cm);
  \draw[fill=black] (0.2,2.8) circle (0.2cm);
  
  % legs
  \draw[fill=white] (-1.5,0) -- (-1.5,-1.5) -- (-0.5,-1.5) -- (-0.5,0) -- cycle;
  \draw[fill=white] (0.5,0) -- (0.5,-1.5) -- (1.5,-1.5) -- (1.5,0) -- cycle;
  
  % tail
  \draw[fill=white] (2,0) -- (2,-1.5) -- (1.5,-1.5) -- (1.5,0) -- cycle;
\end{tikzpicture}

\end{document}

This code will produce a simple unicorn diagram. You can adjust the coordinates and shapes to create a more complex and detailed unicorn.

Note: You need to have TikZ package installed in your LaTeX distribution to run this code.

Also, you can use online LaTeX editors like Overleaf or ShareLaTeX to run this code without installing anything on your local machine.

#

hmm, no bot?

#

(wave claims this is 405b, the actual code does not seem that impressive but maybe it looks good when rendered)

#

nvm #off-topic message

keen reef
#

rip

fiery ibex
#

Latest claude vs. new 4o-latest(🍓?) :

visual ferry
#

more from chatgpt-4o-latest

solid ravine
solid ravine
solid ravine
solid ravine
#

floating penis syndrome sadge

solid ravine