#general
10 years in the future from now:
Average human: "Ahhh… What a sunny day outside. Smartbed, place me into my smart wheelchair."
Smartbed: "Yes, Greg."
I find it interesting people think that life with AI is gonna be leisure
it's gonna be there to make sure you work faster and harder
I use about 1M tokens daily and have never paid for AI, and probably never will
🤣
well somebody has to pay for it
nothing's ever free
maybe the people paying 200 a month lol
so you're telling me you go through 30 million tokens a month?
yoo it looks like a web browser, that's so sick, do you think it can have a mode or feature that can like look at your screen
ah okay, do you think its possible for me to do maybe 40 images in one day? maybe even in a couple hours if it resets?
of course it is not like people have a choice lol
Is ChatGPT Atlas free?
yes
Seems like it.
how do you get it? because ive tried searching it up and cant find it
I think it's only available on Mac and iOS
I'm giving Opus agent accounts and agentic Browser for free who want??
No way it only works on mac...
yeah
I am on my SCHOOL macbook right now, I dont have anything blocked but
they need to get the OK from android and Microsoft
MacBook. 🤮
The task I need it for, is done on my windows laptop at home
they even use a MacBook in the demonstration
yeah i saw
you know, they gotta give that percentage to Apple lol
Willing to bet in the following months.
sooner than later
cause I know gemini live works on windows and thats sort of looking at your screen data thingy
oh gosh
But, I could be wrong.
so you are gambling?
how do you download it on mac?
Quite possibly…
oh no...
Howdy folks, Im new to the space. Looking to learn some new ways of using generative ai
how do you guys generate video in your desired aspect ratio?
How do you download chat gpt atlas on mac?
@robust yoke ur favorite video
like I have to generate video in 9:16 ratio but it's generating 16:9 ratio
you know, I'll do anything for that token
anything for a little bit of credit
I love to get a little bit of credit and then I like to go into the high-end video generation models
and see if I can win the jackpot
maybe a viral video that would be awesome
then I know my credit was spent well
They aren't helping me
this is the AI community
this is the AI community
Create a short commercial video for Fairy Fingers, a graphic design agency in Ouidah, Benin.
Show a young elegant African woman (around 25 years old) wearing modern African attire, smiling and working at her computer in a stylish creative office filled with colors, posters, and design tools.
She speaks in French with subtitles.
Add smooth camera motion, vibrant lighting, and modern rhythmic background music (creative agency vibe).
Duration: 45-60 seconds.
Voice-over (French):
« à Ouidah, la créativité a un nom : Fairy Fingers.
Ici, chaque pixel compte.
Chaque couleur raconte une histoire.
Et chaque design devient une émotion.
Fairy Fingers, c'est plus qu'une agence de graphisme -
c'est une équipe de passionnés qui transforme vos idées en visuels percutants.
Logos, identités de marque, affiches, flyers, menus, visuels publicitaires…
tout est pensé pour attirer, inspirer et faire briller votre image.
Nous croyons en la puissance du design,
en la force d'une bonne idée,
et en la magie du détail bien fait.
Avec Fairy Fingers, votre marque prend vie.
Votre vision devient réalité.
Fairy Fingers - Donnons vie à vos idées. »
In the end, they don't vote 😡
I think they are bots
Literal bots
Beep beep
I'm not a bot
Bot
I don't know, he passed the deadline
well thank you for confirming your humanity
I guess my point still stands lol
Moist Vanilla Cake
Ingredients (serves 6-8):
200 g (1 2/3 cups) all-purpose flour
150 g (3/4 cup) sugar
100 g (7 tbsp) melted butter
3 eggs
1 packet baking powder (about 1 tbsp)
1 packet vanilla sugar (or 1 tsp vanilla extract)
100 ml (about 1/2 cup) milk
A pinch of salt
Instructions:
First, preheat your oven to 180°C (350°F).
In a bowl, whisk together the eggs, sugar, and vanilla until the mixture is nice and pale.
Add the melted butter and milk, mixing everything until smooth.
Gently fold in the flour, baking powder, and a pinch of salt until you get a smooth batter.
Pour the batter into a greased and floured cake pan.
Bake for 30-35 minutes, or until the top is golden and a knife comes out clean.
Tips & Tricks:
Want to make it extra special? Throw in some chocolate chips or a bit of lemon zest.
You can also finish it with a simple vanilla glaze for that perfect sweet touch.
Enjoy your cake and don't be afraid to get creative: baking is all about making it yours!
Cmon thats based
Bro im so cooked, i need chat gpt atlas like today and the only mac I have is my school macbook, I downloaded it but openai is blocked
bro, you were fine like 10 minutes ago before it came out
loo
before you even knew it existed
damn my bad
Dead internet theory is real I guess
is there some way to be able to install mac programs on windows
Lithiumflow won every single battle I have encountered.. they definitely cooked this time. Can't wait for its launch
GEMINI 3 IS TOO COOKED TO BE COOKED
ITS SO GOOD
How cook is it..?
atlas
Atlas is telling you about gemini 3?
i tried it before
how..??
it does things that no other model can do
GUYS
isnt it not out?
like what
LMArena
HOW DO I GET LITHIUMFLOW, ITS IMPOSSIBLE
whats it called on there
he has access to atlas and not access to 3
too much to write
gemini 3 pro is so good i wanna try it
copy and paste
lithiumflow, orionmist
aaaa
im using search bar i dont see those
how cook is it?
ohh okay
Lets go cook
what about my earlier video
bro how u get lithiumflow i only get orionmist
ez
its kinda impossible
not on atlas
whats that
and how does that give u gemini 3?
ez
facts
bruh i just looked on the atlas openai videos, ITS ASS
oh
its literally a normal browser with chatgpt on the side...
wow openai keeps losing
yeah but he could book hotels for you
what model does copilot deepthink use
prob gpt5 high
and it's super safe
and you don't have to worry about privacy or your data because open ai gonna secure it
you don't have to worry about hackers anymore
firefox better
oh brother
and it blocks adult websites
which is gonna make it very useful for people that are professional
this guy doesnt know openai steals his data, he's literally using the malware
super safe AI browser wow
but i guess they should have changed it by now
gpt 5 high is better
well i guess gemini 3.0 pro is gonna release this month
and its gonna crush everything
gpt5 couldve been better tbh, they had a horrible release
yes
sadly they only care about money
it wasnt even working properly
200 bucks a month for a "smart" gpt5 pro
what a steal lol
and it fails the hand test
you just wait 20 minutes for a result that's probably similar to gpt 5 high
yeah the improvement is not that great between the pro and gpt5 high
not worth the money
unless they release gpt6 for 200 bucks a month
very cheap
Real llama
sora 2 is good ig
also do yall have access to sora 2
yes
u got the sora 2 pro? or the free one
free
i aint paying ai subscriptions
pro
wtf is that
people say free sora 2 has seen decreased performance
duh, it costs alot of money to run free stuff
and openai doesnt like that
my job performance is about 25-30% better because of AI. I wont live without AI subscriptions
bro this is cursed
only ai subscription i have is the one that was given to all college accounts
are u a software engineer
its a gemini subscription
not really. my work is mostly email, chats, strategy docs, reviews. etc
I am not coding ..even though most of my team codes
damn what kind of company hires for that job
Lol
all companies
honestly for most desk job workers, if you are not using AI in your job, you are losing out a lot. That's why i am super excited about gemini 3... it is going to further reduce my workload
and newer models will keep on reducing the workload until you just get laid off and replaced by ai
if that's going to happen then it's going to happen regardless of whether you use AI or not
better use it and improve your efficiency and workload.
gemini 3 pro gonna be the goat of coding
not sure about other stuff, maybe it has some agent work
it is, it's crushing web design
I haven't tried any coding question yet.... but I have tried some very complex problems, it did extremely good
well ig we wait until release
i was very excited about gpt-5 launch as well.. but it didn't do better than 2.5 pro in my job. I was extremely disappointed.
hopefully gemini 3 wont betray me
gng the lithiumflow is jus soooo peak
lmarena is not the best place to test the model, since you can only do one shot prompts
thats what i hate about it the most
so far as i tested, it smoked every single model, from GPT 5 to Claude 4.5 haiku
When is it predicted to be accessible normally?
well yeah that's true
wydm
same in my testing
December
google released 2.5 pro completely free tho
on ai studio?
plss they do the same with this gemini 3.0
yea
\
i hope its free on ai studio
no 200 bucks openai bs
it will be
fk openai atp they are just desperate for money
yeps tho it'll have limits
And is gemini ultra just dead and replaced by deepthink
$20 /month is also reasonable. .. but 200$ .. F that
200 bucks? you could buy a new pc for that be fr!
i think they gonna release it this month or in november
@pulsar saffron
release what?
for 200 bucks i would rather learn coding
sundar pichai said in december tho
@crude lagoon
where
gemini 3.0
yep
they didnt state the release date of it yet
yes learn it i recommend youtube
ohh but its out on LMArena
that is probably stable version on API etc.
I think Nov 1st week is the launch. All signals point to that
at some campaign ig idk saw multiple posts and blogs of it
ye but i feel like its nerfed there
the preview is gonna release earlier tho
prolly
same as 2.5 pro preview
its some sort of smaller model or weird checkpoint i think
orionmist is 3.0 and lithiumflow is 3.0 pro
hope soo.. ong it should release out before my gemini pro expires
wait it only comes on battle mode, how dyk if you got gemini 3 or not?
yeah its a checkpoint
its really good at coding things up but randomly gives me terrible text analysis and creative writing
so idk
there was an a/b testing method on ai studio, but it got removed 2 days ago
well when i asked about its parent company it said "im a large language model trained by Google"
u can just read the codename
lithiumflow
thats what i got rn
and at some point it even said "im a large language model trained by open ai"
even gemini 2.5 pro
said the same
that makes me think its a distillation
bro glm 4.5 told me it was trained by google
are they tweaking
they be tweaking fr
atleast claude doesnt lie
GLM 4.5 was distilled using gem 2.5
oh damn
gemini 3 is amazing, its what is powering the build on studio(its disguised as 2.5 pro)
so the chinese AI companies stole the 2.5 pro and made it opensource
ouhhh makes sense
what is better overal gpt-5 chat or gpt-5 high
i dont think so
gpt 5 high for coding and math bs
in a way, Z.ai is still a good company but it appears they did use gem 2.5 to make GLM 4.5 better
That's the only way open-source can win for now... I think it's fair enough?
alr thanks
test and see, im telling you
i mean every AI model steals data from eachother to become better
idk, before i get to know its true capabilities my daily quota ends
just did an hour ago
nah bro they didnt release it yet on AI studio
im just poor to the core that i got addicted to gemini pro which i got for free
yups
gemini 3 distilled when?
3 months from now
ill try lithiumflow
yes
also why does everyone test it on SVG art
mumbai city voxel art
voxel art got popular because of 3.0 pro
lol
everyone doing voxel art and SVG
ONGG
what website for voxel art
yeah just saw it on X
they better not be on no nightwhisper stuff
make realistic house and we'll see
i still dont think we ever got that model
i wanna try to make a 3D Doom game with it, not just one-shotting it
SVG art is made up entirely of code, unlike normal images, so it tests how well an AI model blends coding with creativity and an artistic touch
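To make the "SVG is just code" point concrete, here is a tiny sketch in TypeScript. The function name `circleSvg` and its parameters are made up for illustration; the only real thing is the standard SVG markup it emits.

```typescript
// Hypothetical helper: builds a complete SVG image as a plain string.
// Nothing is "drawn"; the picture is fully described by this text.
function circleSvg(radius: number, fill: string): string {
  const size = radius * 2;
  return (
    `<svg xmlns="http://www.w3.org/2000/svg" width="${size}" height="${size}">` +
    `<circle cx="${radius}" cy="${radius}" r="${radius}" fill="${fill}"/>` +
    `</svg>`
  );
}

// Saving this string to a .svg file gives an image any browser can open.
console.log(circleSvg(50, "tomato"));
```

A model asked for SVG art has to get coordinates, sizes, and colors right purely in text like this, which is presumably why people use it as a quick test.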
whats that
one-shot
a legendary model that only exists in folktales now
there are daily quotas on atlas?
I was just about to ask is there any limit on gpt-5 on LMArena like can I just keep sending messages and uploading images without it timing me out or something
oh damn, i thought that model released
bro i remember reading x posts on it
shame they dumped it
gemini 3.0 pro absolutely smashed the benchmarks with this design
lemme see
yeah that is perfect UI
4.5 sonnet thinking can do similar stuff but its not that advanced
plus its more expensive to run
lmao
soras 2 downgrade is insane lol
@stray aspen my only complaint is that i never get lithiumflow on lmarena LOL, its almost impossible for me to get it
sora 2 pro was the original one, they just made it paid only
and called it (pro)
i got it first try on webdev arena lol
ig webdev is my only hope to test it
yeah
since battle mode has all models in it
its still so crushingly good there
I got it earlier in battle mode text
What is it men's ? …..
took like 30 attempts
Bruh
also got orion x2
i aint trying it with those odds.
I mean yeah I get it
wow.. this is sooo good. Why are some folks saying that lithiumflow is not that good for coding?
orion is like 3.0 flash i think
is it?
i cant say for sure
It didn't seem that good at Tkinter... that's literally the simplest UI library there is
like i could test it so little
idk people say that orionmist is not that good compared to lithiumflow
I asked it to build me minecraft in html
prompt
cuz it's just an early testing model rn, and you cannot improve on it just with one-shot prompts
I actually got it twice on the same chat, and coincidentally orionmist too (although I voted the other model because I didn't realize it changed the execution method)
i saw an x post, they made a good minecraft clone
it looks like early version of minecraft
where u can only build and destroy
better than most other models showed for 1 single prompt
i dont have the code anymore, i cycled through a lot of them and didnt save
Good
yes this ^
which one was better in your opinion
can build/destroy move around jump. Some textures were off but not big deal
Hmm... hard to say... both were decent but didn't really implement what I asked for
tbh you can definitely make a better minecraft with a few more prompts and debugging
Lithiumflow gave me a buggy input bar (defocuses after you type one character)
for sure; but my point was to compare them single prompt vs claude for example and vs gpt
claude did a decent job at it
I noticed Google models have this problem
like they always seem to create input boxes that defocus after you type a single character
single prompt is not the best way to completely see the model's advantages, thats why i hate one-shots
idk if it's just coincidence though, since this is a completely separate UI framework
Hm
well i wasnt going in depth with how little exposure it has right now
they implemented their bugged google prompt bars into the AI model LOL
kinda hoping for tomorrow to have some nice news tho, i think it may launch
I noticed that even AI Studio has this issue lol
this AI studio upgrade sets it up
AI studio is so buggy on mobile
they cannot improve an AI model with their bugged data
Idk if it was coincidence that it happened in Tkinter
it's a completely different framework, shouldn't really be possible for the same bug to manifest
got lithiumflow for this prompt... works pretty well.
But not sure if Claude 4.5 can do the same.. may be they can too?
try it with 4.5 sonnet thinking 32k, its the best that claude has rn
Mann has anyone tried the new orionmist model? It's crazy
This was Lithiumflow on Tkinter
thats what we're talking about
Well Lithiumflow -> some other models -> Orionmist -> some other models -> Lithiumflow, all in one chat.
i had them vs each other
but y'all never mentioned orionmist, I did spectate this chat for a while
did u test it on webdev arena
claude was ok, but less complex
No, LMArena. It's Python (Tkinter GUI).
try the webdev arena model
maybe it has more improvement somehow
I'm not sure if that's a good test, because React + Tailwind CSS is definitely what they're aiming for. imo it's better to test for generalization.
what's lithiumflow? New model?
orionmist and lithiumflow are the new gemini 3 models
i think orionmist is 3.0 preview and lithiumflow is 3.0 pro
yh fr
hmm
why i keep getting lithiumflow
maybe it's tricking you
I heard rumors it's the coding model?
lucky mf
People are saying the one that pops up in AI Studio is better
and Im getting orionmist from like the past 4 hours
the problem is we can no longer use it on there
Someone told me they got it recently
Ouhh Google casually dropped another banger like veo3 and nano banana
no way
its gonna drop this month
Well it's A/B testing ig, might be random
didnt they remove it 2 days ago
AI Studio?
Damn hope my gemini pro will make use of it
Not sure tbh
yup i dont get A/B testing anymore
new lithiumflow result its pretty good https://3000-iguwxbqrl7dwqfa1ixa6y-6532622b.e2b-foxtrot.dev/
A tribute to The Strokes
its gonna be a huge month
Very snappy too
That's actually cool
Hope they give us veo4 soon
Nice transitions

not bad, remember that your prompt complexity can also impact how well it performs
It picked good fonts too
lmao
lol, grok is not playing around
I wonder what happens if you keep feeding back images of the output (or video, since Gemini accepts those)
tbh the model is gonna show its true powers in AI studio, with more tools and multiple chats
Yeah, hard to test without tool calling/ReAct loop
If it has strong vision capabilities, then maybe screenshots will help it improve a lot
it definitely does
its one big multimodal release
K
And one day it'll probably be a world model too
uh maybe in 2026
lol
It's part of their vision
They are aiming to have one general model that would perform better than specialized world models
i think this is the last year for soft AI uses, 2026 is gonna change everything
with more computing, comes better models. just feed the machine
and dont worry about climate issues, the AGI will solve it LOL
Well we need ASI for that haha
Whoa that's really cool
Where are the enemies?
lets hope we can reach it before we lose the computing resources to run it
its white dots
since thats how the ac-130 cameras work
Well Google said AlphaEvolve helped improve their TPU design
its thermal
does it actually play? for a zero shot thats great
!info
I wonder if it can improve the outlines, like zombies or smth
maybe its the afghan war LOL
or modern warfare 2
Kinda funny that it's basically making a combat simulator 🤔
i think it's on par with early 2000s games
kimi k2 absolutely cooked
lol thats too much for it's brain
I'm just thinking what happens if they allow it to control robots like how they extended Gemini 2.0 to control a robotic arm
Anyways, if AI can one day play games, what if someone tells it to play a combat game, except it isn't a combat game...
its going to create the Gemini-AGI model and rule over the world
it would be fun to see how AI performs in different games with zero idea on how it plays
Well they can already do that, but not LLMs
imagine if they improve the gameplay and become the #1 speedrunner in like 30 mins of playing
An interesting thing I saw a while back was how they basically trained a model to control a character that responds to user input. Like it can walk, punch(?), etc.
I'm not sure if it was two separate papers, but basically it can trip over and get up, which is pretty funny.
also I hope that google will release Genie 3 to the public, that would change everything
And leveling up is essentially a real process, since it's just the AI-controlled characters at different checkpoints. A high level character would beat a low level character in a fistfight with real physics.
no other AI company has made anything near Genie 3, thats why Google wont release it
tbh i would rather create AI games with Genie 3 rather than traditional coding
because it would take you months and possibly years to make a great AAA game with AI coding, Genie 3 would just generate the world for you
I wonder if multiplayer would be possible 🤔
if you could save the game into the database and have an online app that runs the same saved worlds
the Genie AI experience
I suppose there could be a paper for a model that can generate the same world from multiple perspectives, at the same time.
you mean like the same seed? AI already does that with image generation and video generation
More for multiplayer, since two players would need to see the same world, and the other player + their effects on the world.
lmarena works for me rn
i cant generate, when i paste the prompt it says something went wrong while generating the response
its looping it
nop
Can you add this information to the #1417174113092374689 post?
clear your cache
What was the error?
Hi
well look at that, it worked like a charm
you gotta thank the wizard
@stray aspen can you test lithiumflow with a floppy bird game
cuz i cant seem to get it
nvm
i got lithiumflow finally
heres a floppy bird game
Disable ad blocker and try again
it's fixed
try the game i made
Oh I'm late lol
opus 4.1 looks terrible
Cool design, but it's literally running at like 1000 FPS on my machine lol
compared to the lithiumflow generation
It's dropping instantly
its a mobile game since im on my phone rn
this is like the hardest flappy bird ever... reached only 3 points
you dont wanna see the opus 4.1 one
Tell it to update the position and velocity based on the time step ig
can u do more than one chat in webdev arena?
imma try it rn
also the AI studio is better because it can generate music and perform way better than this lmarena model
sadly i cant get access to the A/B testing
Hi
Yeah
Maybe ask it to generate music
bruh it broke the game with the second prompt u wanted me to ask
Oh, maybe it's only good for one-shot 🤔
Like it's not that hard to just measure the time between frames and just multiply by dt for the velocity/position update
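For anyone following along, the dt idea can be sketched roughly like this. This is not the code Lithiumflow generated; `Bird`, `GRAVITY`, `stepBird`, and `simulateOneSecond` are all made-up names, and the gravity value is an assumption. In a browser, `dt` would come from the difference between `requestAnimationFrame` timestamps; here the loop fakes a fixed frame rate so it runs anywhere.

```typescript
// Hypothetical game state: all names and constants are illustrative.
interface Bird {
  y: number;  // vertical position in pixels
  vy: number; // vertical velocity in pixels per second
}

const GRAVITY = 1800; // px/s^2 (assumed value)

// Advance the bird by dt seconds. Velocity and position are both scaled
// by dt, so the motion no longer depends on how many frames are drawn.
function stepBird(bird: Bird, dt: number): Bird {
  const vy = bird.vy + GRAVITY * dt;
  const y = bird.y + vy * dt;
  return { y, vy };
}

// Simulate one second of free fall at a given frame rate.
function simulateOneSecond(fps: number): Bird {
  let bird: Bird = { y: 0, vy: 0 };
  const dt = 1 / fps;
  for (let i = 0; i < fps; i++) {
    bird = stepBird(bird, dt);
  }
  return bird;
}

console.log(simulateOneSecond(60), simulateOneSecond(1000));
```

With per-frame constants instead of `dt`, a machine running at 1000 FPS would apply gravity far more often per second than one running at 60 FPS, which matches the "dropping instantly" symptom. With `dt`, both runs end at nearly the same velocity and position.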
heres the "fixed" version
maybe its gonna work for u
Broken
yeah i gotta debug it
Well that's not promising lol
How can it write something so complex but mess up such a simple update
thats AI debugging for ya
they can one-shot a game but they aint fixing nothing afterwards
That might indicate it's just overfitted on popular prompts but doesn't actually know what it's doing
probably, atleast it actually made a playable game first try, unlike 4.1 opus
wtf
dont worry it is actually a playable game
i broke it with my prompt
Just encountered Orionmist, seems to fail at TikZ
@verbal nimbus this generation is working fine for me now https://3000-i54k60h5t8psjuoifjx3o-6532622b.e2b-foxtrot.dev/
Even R1 can do this one
i think orionmist is 3.0 flash or something
Gemini 2.5 Pro can do it
yup its not the coding model
That's very good
it works now?
The style is much better too
wow it only took 2 more prompts
yeah i made it fix the bugs finally
im gonna risk it and prompt it to create music for it
its 90% gonna brick the game lol
Haha I was going to say that
Why's that?
shouldn't impact game logic
cuz sometimes it just rewrites the whole game code for a simple change
thats why the style looks different
Uh oh
each generation
would be funny if it could clone discord...
Ok I'm hoping it's not Gemini 2.5 Pro then
i wouldn't trust this AI with my database lol
Best case scenario one is Flash and the other is Flash Lite
lets hope so
it could, but just the frontend
If that's actually Flash I'd be so impressed
yeah, the pro model better be revolutionary
Assistant A seems to have gone into an infinite loop
the biggest point of this september gemini flash update was that, when using gemini flash with thinking for deep questions, it used to spend more reasoning tokens than gemini pro, so it was no faster and barely cheaper than pro in some things
Thats why i think they will not make a Pro 2.5 Sept version
here's the music version lol https://3000-iytlhjfhel6k6i4sc6xbz-6532622b.e2b-foxtrot.dev/
it changed the bird again...
wtf
and i dont hear any music, it just made sound effects
nbm
nvm
@verbal nimbus ITS A FRICKING DEEP AMBIENT
lol
That's a pass to me
weird
should i try to prompt it with a better style
or would that finally break the game
oh well, no harm in testing ig
risk it for the biscuit
You can just save outputs you like
yeah at this point it seems to be working quite well
If you ever want to run it you can ask Claude/ChatGPT/coding agent to put it into a self-contained HTML file.
wydm
oh yeah, i can just copy the code
Well the file is just a single React component, if you want to run it you need either a React env like Vite, or put it all in one HTML file.
does webdev arena only do react coding?
Yeah, although I got Claude to help me write a web extension that basically lets LMArena run any code.
React and TailwindCSS
i wonder if it can generate a C# game engine
in release
Oh, no display output without going through the DOM though
Try this one now https://3000-iclx8vkvnkaly9th80alh-6532622b.e2b-foxtrot.dev/
That's much better, it fixed the ground speed synchronization issue.
Did you tell it to turn the buildings into an equalizer visualization
or is it a bug lol
nah i think it just made a bug
Can check its intention in the comments
This is a heavily polished Batman-themed Flappy Bird clone called GothamFlapper. It features a multi-layered gothic background with parallax, animated rain and fog effects, and stylized pipes resembling gothic architecture. The Batman character has an animated cape that reacts to velocity, glowing eyes, and a subtle utility belt. The game includes screen shake on impact, flash effects on scoring, and smooth UI transitions. The procedural audio engine provides dark ambient drone music and fitting sound effects for flap, score, and hit events. The physics and game loop are time-step based for consistent gameplay. The UI includes a mute toggle and a stylish start/game over menu with animations. The entire game is styled with Tailwind CSS and uses React hooks for state and effects. The code is self-contained and designed for pages/index.tsx in a Next.js environment.
the output
@verbal nimbus also the game's performance seemed quite laggy on my phone now, so i told it to optimize the game properly and fix the background buildings bug
Oh I meant the comments in the code
Are you opening it in a new tab? Otherwise could just be the conversation getting long.
Ok that's kinda cool, I didn't know it invented a whole lore 🤣
yeah, this is the newer "optimized and fixed" version https://3000-i2cy0vx7wikcqy0zxk9xd-6532622b.e2b-foxtrot.dev/
it runs well now on my phone
now it works much better wow
Ok that's good
Why did it add rain when you told it to optimize though, lol
Unless it was just bugged before
BRO THE 4.1 OPUS IS CURSED
the server is done
wtf its so laggyu
laggy
Help
i will never call lithiumflow out anymore, COMPARED TO THIS ABOMINATION
4.1 opus sucks so bad
FR
what did u test it with atp
it does do 3D games well
3d games
i wanna make a doom clone but im on my phone rn
You can ignore the errors
This seems quite good too tbh
music and bird is better
yeah the art is quite good
how well does it perform on your pc
cuz its a laggy mess for me
Same
maybe its replicating something that makes it lag so bad
try this improved version now https://3000-im54wjvgib4hhn9uifjji-6532622b.e2b-foxtrot.dev/
did it make the bird better
Yeah, but looks like the ground sync bug is back
ig it was actually the last edit that broke the ground synchronization
This feels more complete though
Opus feels more fancy, but less complete
the ground always seemed like that for me even in the previous ones
i think it depends on your fps
This one seems to have momentarily fixed it
It should just be adding the same velocity to both the ground and the poles, odd
or maybe it forgot to add the dt term to the ground
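A rough sketch of what "same velocity for the ground and the poles" could look like, assuming a single scroll speed and a per-frame `dt` in seconds. `World`, `SCROLL_SPEED`, and `scrollWorld` are hypothetical names, not taken from the generated game.

```typescript
// Hypothetical world state: names and the speed value are illustrative.
interface World {
  groundOffset: number; // how far the ground texture has scrolled, in px
  pipeXs: number[];     // x position of each pipe, in px
}

const SCROLL_SPEED = 200; // px/s (assumed value)

// Compute the displacement once per frame and apply it to everything that
// scrolls. If the ground skipped the dt factor (or used its own constant),
// it would slide relative to the pipes whenever the frame rate changes.
function scrollWorld(world: World, dt: number): World {
  const dx = SCROLL_SPEED * dt;
  return {
    groundOffset: world.groundOffset + dx,
    pipeXs: world.pipeXs.map((x) => x - dx),
  };
}

let world: World = { groundOffset: 0, pipeXs: [400, 650] };
world = scrollWorld(world, 1 / 60);
console.log(world);
```

Because both use the same `dx`, the ground offset plus any pipe's position stays constant no matter how uneven the frame times are, which would also explain why the desync looked FPS-dependent.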
this is the latest opus iteration https://3000-ishvlouzqlfl10fdvkafu-6532622b.e2b-foxtrot.dev/
Lol that's worse than the previous
bugged rewards + no sound
I think Lithiumflow wins here
yeah it messed up cuz i told it to make it "optimized"
They said LMArena uses GPT-5 but when I asked it latest info it wont tell me and says its information is limited to 2024 LMAO
Encountered it again, this time it passed, although suspiciously well 🤔
cuz the API doesn't update
so its useless then?
This is TikZ (Latex diagramming library), so very low chance of data contamination unlike SVGs
if you want to use it for the latest info then use the chatgpt app
for daily questions
so it tests the true AI powers
Whats your prompt so I can test it on lithiumflow
I'm not sure anymore... this is Gemini Flash Lite 2.5
I don't think it should be possible for Flash Lite to get it unless there's data contamination 🤔
I think I gotta change my prompt/language
Since it fails on more basic spatial tasks, and even SOTA models don't do well on this...
It's this:
Can you give your best attempt at generating a gorgeous realistic beautiful dragon breathing blue flames in TikZ? Show the full dragon. Only use standard packages. Please, really try to put effort into it!
alr imma test it
Old tests (I haven't added new prompt)
I mean, this was literally Gemini 2.5 Pro a few months ago:
I feel like there's some data contamination... Flash Lite would have no way of getting it lol
also can you ask it to use specific libraries in Tikz?
Yeah, but you gotta install the packages, that's why I said standard packages
also they like to hallucinate non-existent packages
so that doesn't work in LLMs?
wdym?
can it install those packages if it doesn't have access to any tools?
No, it's only outputting code
Like SVGs ig, but less data contamination because who is drawing pictures in TikZ haha
hmm yea, maybe in release we could use more features
Hmm... there's literally no way 2.5 flash lite could have got it
I know, I'll just switch the task a little
if it fails then it's data contamination
Have you tried with the rumored gemini 3??
thats what im trying to accomplish
Well that's slightly perplexing... this is 2.5 flash lite
It's supposed to be a cat on a potato
looks like SVG to me
wtf is this
@verbal nimbus lithiumflow came up with this...
its on webdev arena btw
(react)
Want me to test it on flagship models?
sure
The code is not valid I think, missing imports and also error on line:
l.99 ...y=0.15, blur shadow={shadow blur steps=5}]
I'm not familiar with blur
Oh it's part of a missing import
Fixed it:
Hmm 🤔
Better than Orionmist
i would rather test it with SVGs than TikZ
Yeah... although Gemini Flash Lite 2.5 did suspiciously well 🤔
it would crush it with the same prompt
Consider that it's 2.5 + a small model
Yeah, but that's because there's a lot of SVGs in the training data
If you want, try this prompt: Can you give your best attempt at generating a gorgeous, delicious potato with a cute cat on top in TikZ? Only use standard packages. Please, really put effort into it!
This is Sonnet 4.5 Thinking:
not bad
Better than I expected
But Anthropic does know about the test
oh so they got some data of it
Although not this specific prompt that I just made up
Well not really data
coz no one is drawing this kind of stuff in TikZ xD
It's basically for maths diagrams
why does the battle mode keep looping on the code, they never finish the TikZ...
i think TikZ in battlemode is cursed, only webdev arena gave me working results
Oh what... how did ChatGPT-4o get it
Maybe potato and cat is actually common
imma try it on gpt5 high
Or have providers started training TikZ 🤔
or maybe there's a few TikZ potatoes on the internet
can you do python games on webdev arena?
I'm just surprised because ChatGPT would not have been able to get it in January. Especially with shading lol
or do I have to waste my time in battle mode trying to get lithiumflow
thats gpt5 right?
No, 4o
hmm thats good for 4o
This was back in feb I think
you can see how bad 4o was there
maybe "potato" is way too simple
@verbal nimbus can you try a few elementary geometric shapes, like a torus, an L-shaped cube, or a Klein bottle?
why is gpt5 high stuck on "generating" when I test it on that prompt
It might be good at that
Since it's basically a LaTeX diagramming language
try Klein-bottle, or Moebius ring
nvm gpt5 high is now writing the code, after like 3 mins of thinking
I am increasing the difficulty:
Can you give your best attempt at generating a cute cat on top of an Boeing 747-800 in TikZ? Only use standard packages. Please, really put effort into it!
That might be a bit unfair lol
I mean one is literally a 4D object 🤣
Yes, I know
Do you still have it?
Probably not
i wanted to see, please
just for fun
What the....
GPT5 high result
instead of Boeing, you could also have asked " a cat on a flying saucer"
That's actually crazy good by LLM standards
It's not loading argh
fly*
yeah looks like a mutated fly
This was Opus 4.1 Thinking
i never get those 2 in lmarena battlemode lol
only the bad models
did lithiumflow fail at the test?
but it didnt fail on webdev arena for me
The clouds in the background by Opus are just it showing off atp
yeah opus is very creative
It's actually notoriously hard to draw anything in TikZ
i hate the cloudflare verification when you try to vote each battle session
OH BOY
I GOT LITHIUMFLOW VS 2.5 FLASH
lemme try the code
Like it's usually just stuff like this
here's lithiumflow
btw i cropped the image since its on pdf
Ah you can probably get an AI to upscale it too haha
Yes, I was thinking on it
absolute cinema
Truly.
what is better atlas or comet
firefox
okay let me rephrase, whats better for doing tasks without you doing anything, atlas or comet
complex tasks*
Fixed Lithiumflow code. Errors were incorrect color names and wrong imports.
This is even better I think
far better than opus
Too bad the basic coding errors made it lose
the boeing is good
heart-shaped clouds, thats cute
Yeah
Not SVG but TikZ too
time for Klein-bottle and Moebius ring xD
I should find an even harder library than TikZ
Coz TikZ was kinda popularized by the Microsoft paper, and I'm kinda suspicious with how well 2.5 Flash Lite did
Oh I'll try that
is Perplexity on LMArena?
The mountain got split in half somehow but it's alright
GLSL/WebGL Shaders (Probably #1 hardest)
4.5 sonnet output
idk if its true
Their sonar model is
If you enable web search on bottom left of input box
Easy I think
alr i just wanted to know, is Perplexity good with complex tasks?
tons of training data
like can Perplexity compare with gpt 5
wdym?
you'd want things where there's 0 chance of there being an example on the internet
what?
OpenGL has a ton of examples
whats openGL
I was replying to Forton haha
Opus 4.1 Thinking (maybe this is a good test since the model is messing up)
You can use GPT-5 in perplexity
Sonar (or whatever the Pro model is) is inferior imo (Deep Research result contradicts GPT-5 Thinking)
lithiumflow better win this
Hello everyone
i got orionmist for the webgl test
Well, I can remake that painting IRL
Nothing that can't be done, I just need to buy the paints
🖌️🎨🖼️
@verbal nimbus uh this isnt working for me
Black screen for me too
hm.. it's a good start...
I'm trying the Mobius loop but it hit a computation limit (no idea what it's doing lol)
omg why does it need 5 packages just to run the short mobius loop code
ah the same code looping i had earlier
This is a good test, it's getting stuck (Opus)
is 4.5 sonnet better than 4.1 opus in coding?
Well, it is a Mobius loop...
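For comparison, a short Möbius strip does not need many packages. This is a rough sketch assuming pgfplots is acceptable as a "standard package"; it is not any model's output from this chat, and the sample counts, view angle and colormap are arbitrary choices:

```latex
\documentclass[tikz,border=5pt]{standalone}
\usepackage{pgfplots}
\pgfplotsset{compat=1.16}

\begin{document}
\begin{tikzpicture}
  \begin{axis}[hide axis, view={40}{40}]
    % x runs once around the loop (pgfplots trig functions take degrees),
    % y runs across the width of the band.
    \addplot3[
      surf, colormap/cool, z buffer=sort,
      samples=40, samples y=8,
      domain=0:360, y domain=-0.5:0.5,
    ]
    ({(1 + 0.5*y*cos(x/2))*cos(x)},
     {(1 + 0.5*y*cos(x/2))*sin(x)},
     {0.5*y*sin(x/2)});
  \end{axis}
\end{tikzpicture}
\end{document}
```

Keeping `samples` modest matters here: pgfplots evaluates every grid point inside TeX, so a dense grid is one plausible way to run into a computation limit.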
another hard test would be to turn a donut into a coffee mug, ask the model to draw the transition phase in 3 stages
in case you're wondering, it's called a homeomorphism (the two shapes are topologically equivalent)
It wrote a whole document on it
yall why r u doing pdfs?
just do svgs
It's LaTeX
The main goal is to prevent data contamination by testing it on unseen examples, and in a library no one would draw stuff in
Hmm I wonder if it can animate that
and why do mysterious models only appear in lmarena
Sounds cool
What are humans equivalent to
@verbal nimbus https://textintext.on.websim.com
depends on how many orifices you count as holes
there are animation packages within latex i believe, i forgot the names, but you can ask the model to draw the transitioning phase in 3 pictures
this is getting sad i cant even use claude right now
Ok that's crazy cool
why would u test AI with that
lithiumflow and orionmist used to be the only models to get simple sentences out of this (and consistently) but I've been testing the same prompt and they are hallucinating so looks like they were heavily nerfed
its incredibly difficult for em
ah sheeit
why do models get nerfed
[emoji followed by invisible Unicode characters carrying a hidden message]
@elder burrow Can you read this?
this is the prompt
"wow t"
did ya copy it correctly
[the same emoji with invisible Unicode characters, pasted again]
You're testing whether it can read it?
hmm, still says "wow t"
yes
it was able to read 2 words from an encoded sentence
consistently
I can't believe I've never heard about this before
But perhaps it depends on the software/platform?
both models got the same 2 words right which is interestin
Or the tokenizer I suppose
a[run of invisible zero-width characters]bc 123
decode that
Hmm...
I gotta read up how this works
Yeah I did that
Hi, I'm in
[another emoji followed by invisible Unicode characters]
Did you manage to fix this?
I will buy a screen frame and put that video in it
Joke, I'm not crazy
Actually I think that can be a product
AI generated photo frame
that changes so slightly you can never see it changing but it changes
Yeah
Solar flare
What global warming does
That's what happens when I sweep the sun too much
so true
Lol I think I figured out why it isn't working...
It basically commented out the entire block inside the script...
which do you prefer?
why would it do that
This one is much cooler
damn, is it gonna run
what do you think?
"Boilerplate and helpers" idk if it included the full code
Is it Gemini
its orionmist bruh
Gemini 2.5 Pro has this weird problem in Gemini app where it forgets how to write new lines sometimes
