#I found out
what about vision
Neuro recognized that a cup was orange on one of the zoo streams.
She also saw, recognized, and later referenced yellow umbrellas in a zoo stream. (And confused Toma, because Toma had already passed the yellow umbrellas before Neuro started talking about them.)
Then there's the fireplace and music keyboard she saw on the wallpaper, during the conversation about the Ironmouse canvas. (Not to mention the Donate Plasma text.)
Might be interesting to see how well she and/or Evil would do with one of those baby books with pictures of animals and such.
She spotted an anteater plushie from far away on the zoo stream; we could only see it clearly in the livestream once Toma got closer.
She also recognizes non-verbal sound cues.
Can she see what three-dimensional shape this flat paper pattern can be folded into?
ChatGPT can, but that doesn't make it particularly meaningful
This is a question for elementary school students to exercise their imagination. ChatGPT must have already known the answer.
exactly
LLMs don't have the greatest spatial reasoning, but they have something
maybe an image/video LLM would have much better spatial/temporal reasoning
I suppose the core question is whether an LLM develops a model of the world that its training data describes (beyond language), or whether it just learns text really well and still doesn't know what a square actually is.