#general
it depends on what
Because they are pre-release models
🙂
Hii, is the legacy site down??
I'm seeing this too. Will flag. 
Well the code named models appear only randomly so uhm you can just hope that if you have a long convo, they will appear a lot.
I just got a really good idea for generating code with models
Good for you
i dont think so
I really like the customisation parameters
good point
Like when @echo aurora you guys changed into new lmarena I was asking gpt and perplexity about you guys
😂
503 Service Unavailable
No server is available to handle this request.
damn
legacy dead
The legacy site doesn't receive updates sorry to say.
oh
it
seems
file sending may break due to the limit of tokens per user message
Ah, who cares
okay, found another small bug.....
no
make it easier to generate code with the AIs, and easier to copy and view it
damn
thats ai for you
it guesses
intelligently guesses
no, the thing is that on LMArena, they dont have any system message at all
So they dont have a name per se
none of them is trustworthy
they profit by being friendly
Demis 100 percent
agreed
where is V.P.
🙁
Finally I can force them AIs to not roleplay as completely different models
yes my bruv
ChatGPT, trained by OpenAI
Stop using that dumbass chat format
What is the prompt?
And then how can I have the picture
Uhm idk how to do
Lol
I will give you the answer
I have from grok 4
craig are you paying a claude subscription
omg fraud detected
Actually mine look more detailed
Nani
I wonder why system prompts dont work on the mistral medium model
Glm 4.5 is goodddddd
👋
Hi?
Just joined, so yeah, hi, lol
Go check out the other channels 👍
yeah go check them out NOW @scenic salmon.
@echo aurora
Huh 😭

@scenic crypt
Is the problem solved?
Which issue are you referring to?
gpt-4-0314 not being in the arena
There is a problem with the video creation converter counter. It says that 8 videos cannot be created per day. Veo 3 Audio no longer appears.
@echo aurora
Yes, the daily amount for Video Arena has been fixed.
Veo 3 Audio no longer appears
This was changed. There is now Veo 3 with and without audio.
Russian state news
it does suck tho
wtf is this L by gpt 5?
obvious indication it is windows machine and provides linux commands
even gpt 3 dont fail that
If you post this one more time, I’m reporting you
ok
he said not to spam
i am not spamming
Posting the same thing 700 times is spamming
ayo please dont attack lmarena with russian botnet
real
We've taken action, let's move on please 
😭
theory: the stealth model 'toad' is deepseek v4
I prefer grok 4 over gpt 5 and 2.5 I was wondering if anyone else doed
Doed?
Hello, LMArena ranking aside, would you guys recommend GPT5 (subscription model) for coding complex (1500+ line) projects?
depends on what your alternatives are
Ideally you'd have multiple models ready to go
I've been using gemini pro exclusively but I get rate limited every day
That's honestly pretty boilerplate
is my personal info in risk for using this service?
this looks like boilerplate google analytics
if you don't like analytics, use a browser with tracker blocking or get ublock origin
as someone who uses both Gemini and GPT, it's nice to have them cover each other's weaknesses
DOES
What do you use it for that makes you prefer it
I prefer Claude opus for everything that's not math/logic/advanced reasoning
Use brave it’s got automatic tracker blocking
nah i already installed the thing this guy said
That’s fine but brave is goated it’s got everything built in, 0 bloatware, and many other useful features
I usually just use it as a companion I’m not that techy. I don’t prefer it because of intelligence I more meant the actual app lol. I do like how it responds tho
Built in Tor as well
General things
Bruh are you using Ani
ANI?
Yeah
Eww tobi how will you execute the PLAN like this
💔 I’ll ask it for ideas
@marsh stratus I think I'm in love with claude. Gemini's good but I keep getting rate limited, Grok is... special in a 'stay away from me by at least 5m at all times' kind of way, and it froze when I asked some complex coding question, but Claude rewrote my entire code with just a single prompt.
Why did the LMArena chat history crash? How do I solve this problem?
Every social media platform knows a bit more about you than LMArena
lol
this is silly.
well now i use tracking blocker
Everybody should use that when they install any browser
some dont even know ad blocking
yea
Is your chat history now missing?
Unfortunately, if your chat history is missing you won't be able to get it back. This is a problem our team is looking to address both with new features but overall site reliability as well. I am sorry you don't have access to that history anymore.
nano banana removed from the arena?
I don't believe so, what makes you say that?
bro thinks hes bing chat
This is @echo aurora's reaction rn to
this message
Or
I'm sorry, but I don't believe that's accurate. I think there may be some misunderstanding here. I'm still learning, so my assessment could be mistaken, and I appreciate your understanding and patience.🙏
Buddy your data isn't that important chill the hell down 🤣😭😭
Mf acting like the CIA is on his ass
No bro it's not that, i mean like what if the data of me gooning with ai girlfriend gets leaked and it can linked to my identity irl?!?! then no one want to hire me anymore : (((
like claude serious asf but damn he know how to .... well i can't say more due to the rules here :(
🤣😭😭
dont link your identity to the AI
but LMarena does that now for us
Maybe they forgot you
The stuff is anonymized, even in direct chats, unless you provide your real name and other revealing info
if it concerns you, you can always use a VPN.
that was indeed my plan, but then this mean @exotic stream interrupted us 😡 im shaking and crying rn
If privacy concerns you that much just use a local model bro
not the mc video deleted too 😭
Hey does anyone here know what the limits for each model are or is it just completely free
There are some rate limits
especially if you put too many inputs in one minute aka rpm
do you know how much it is
No, but the whole service is free.
just dont rapid fire the chat
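A rough sketch of how a client could pace its own messages to stay under a requests-per-minute cap. The actual LMArena limit isn't known in this thread, so `max_per_minute` is a made-up parameter and `MinuteThrottle` is a hypothetical helper, not anything the site provides.

```python
import time
from collections import deque


class MinuteThrottle:
    """Sliding-window pacer: allow at most max_per_minute sends per 60 seconds."""

    def __init__(self, max_per_minute, clock=time.monotonic):
        self.max_per_minute = max_per_minute  # assumed cap, not a published number
        self.clock = clock                    # injectable so it can be tested
        self.sent = deque()                   # timestamps of recent sends

    def wait_time(self):
        """Seconds to wait before the next send is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Forget sends that are older than the 60 second window.
        while self.sent and now - self.sent[0] >= 60:
            self.sent.popleft()
        if len(self.sent) < self.max_per_minute:
            return 0.0
        return 60 - (now - self.sent[0])

    def record(self):
        """Call this right after actually sending a message."""
        self.sent.append(self.clock())
```

Usage would be something like: check `wait_time()`, `time.sleep()` for that long if it is non-zero, send the message, then call `record()`.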
lol
Damn why isn't this website popular then wth
Don't know, lol
Tell your friends!
Have u tried advertising lol
am i your friend? 🥺
You didn't wish me happy birthday
is that you pineapple
yes
I take it back
What! How did i miss this?? Happy bday pineapple!!
Lol not my birthday, thanks tho
Well, you deserve blessings for a wonderful day no matter what
is there an answer to why most of the models get all fucky and respond with that?
"Something went wrong while generating the response. Please try again." or is there maybe a temporary fix i could use to use my current chats so that the model doesn't lose the context of whatever it's working on?
Hmmm
1 - This message generally means that you are rate limited
2 - About more context, prob no, bcs of costs
@echo aurora
How do you guys get all these expensive APIs lol? Sorry for ping
Why not use the discord search function?
Ppl ask it 999 times everyday
Yeah sorry to say sometimes models will error out for different reasons. We are working on making these error messages more clear about what went wrong
What do I search lol?
the video arena ruined the discord search function
I assume they're granted API credits, if not near unlimited API access by all of the model companies since lmarena provides valuable feedback on what users like and don't like 🤷♂️
saves them from having to do the research themselves
Sry for delay, doorbell. We are paying for the usage. This blog post should be helpful: https://news.lmarena.ai/new-lmarena/
ha, you got VCs to back you, I should pick your brain if you were involved in the raising at all @echo aurora
it's not the worst idea, they have more data on the type of responses preferred than any of the individual model companies
And you're getting high quality data from things like GPT-5 and Opus
That's worth millions
yeah, no synthetic training data needed, all real human questions
$100M to be exact 
Models that are using a codename can only be used in the Battle mode (random models being sampled side-by-side). Meaning you aren't able to select it specifically
Do you guys charge a lower AI lab if they wanted to test their AI as a codename?
hi guys what the best prompts for still frame, still camera no movement?
HI
Hi
hi
Hi
wth
Countdown to Gemini 3💀
Is the proposal of direct chat for codename models disapproved or still on your todo list?
Many models shown on the site aren’t the real ones or are just named versions that don’t exist (like GPT-5, Grok 4, etc.).
They are actually working on GPT-4 & Grok 2.
Kindly fix it ASAP
Where
Guys I have a question
Has anyone of you experienced a model becoming unable to answer some questions correctly after it was upgraded?
Like GPT-o4 answered some correctly, then o3 wasn't able to get it at all
I found that Deepseek and Qwen answer my questions correctly based on incorrect data (!) it was trained on what the heck
@echo aurora Brother, the chat history issue is causing a lot of trouble. Please take necessary steps to fix it.
I hope they add the "cancel button" soon, it's so frustrating
any solution to stop the "generating"?
What happened?
Evidence for that?
I understand, but it's weird since like it's not even related to what AI you're choosing on the website, once this error appears even changing it to a different model wouldn't fix that whatsoever, the chat itself goes to hell
until you're starting a new chat that is, and everything works well again, only if there was a way to maybe export the chat after it dies and maybe import it into another one so that you'll be able to feed the model context about what you're trying to build with it, instead of starting from zero and having to feed it information about what you were using it for in the previous chats
because it’s currently in a very unstable state, the website and the concept are cool, but that doesn’t mean much if your chats die after just three minutes of use
For fairness, I think other reasoning levels should be included, not a configuration that even Pro users won't get to use (if X posts are accurate).
i believe toad is a qwen series model:
- it uses emojis in a similar style
- "the vibes"
neither do deepseek models use emojis for responses nor does gemini.
The only other model i've observed is chatgpt-4o-latest
0 evidence
what happened to video arena 4
There are some decent reasons for them to do this though. Reasoning effort changes are more impactful in cost for chatgpt than API, since it's often doing tool calls and that mostly happens while it's reasoning. They are fine going all out on API even for diminishing returns but for chatgpt this makes no sense if they get big increase in cost with little to no benefit. And then also, "128" technically still falls within high reasoning effort range for gpt5, since medium on API is 64.
The rate limits are for all models, not for specific model. And you know that every prompt you send is gonna be public right? Lmarena is for you to test models not for real work
Roon said that this juicy thing is not the reasoning effort
Toad? What the hell is?
And that gpt thinking is using high reasoning
Kindly educate yourself please. Just because a model responds it is based on some older variant, that is completely normal behavior. Models are not trained about themselves much usually, since this is not very useful for IRL applications of solving actual tasks or problems. Fine-tuning on redundant things like that will degrade performance everywhere else
Yeah that's why I said "usually"
Most of the time models will get this info from a system message
Ye just claude do that
I mean only claude is trained to say which model it is
Openai, grok, they use the system prompt generally
link? 🧐
I'm pretty sure it is. There's a direct link between reasoning effort and that number for sure
I didn't save it but it was in Tibor's post
Which company is Nano-banana? OpenAI ?
Maybe you could ask it for a different style?
Google.
Does nano imply a small model??
Well we can only assume that is the case. No one knows for sure
Maybe
Tried to look for it and only found this. Which seems to confirm it being the reasoning effort rather than deny lol
Is there any website like lmarena?
Also I don't think juice number directly translates gpt5 vs o3. For o3-medium the number is 2X compared to gpt5-medium, but it only just generates more:
not by much I mean
Idk then, there is another post where he said that "think harder" on the router can reach high thinking too
I think that it is this one
😭
anyone know an ai that’s capable of making high quality gfx? Is it even possible yet
No it isn't at all
I've tried that a lot
Please don't waste your time like me for that
that's different context though. Their router was the weakest link at the start and their model card technically describes it as being more significant than it was on launch. If they haven't improved that yet I'm sure they will. It honestly made no sense that it couldn't do more than low reasoning effort, especially now with it being renamed to "Auto" and describing it as "decides how long to think"
It should be able to do chat/low/medium (for Plus sub), the way they are presenting it
Then for Pro high as well (128)
their chatgpt prompt sucks and the infra or whatever they do on the backend sucks
and gpt-5-pro feels exactly like o3-pro, i'm sure they just changed the name
I mean at the start when that was named "GPT5" it did actually perform better than gpt5 with no reasoning and that was fairly obvious. Now it's still a good option for people who are confused by all the models and just want to ask a question
now that gemini copy pasted their frontend
when they finish copy pasting the features too
it's gonna be better than chatgpt
Nah no way. It is based on gpt5 now rather than o3. Different base model. It's like comparing gpt4o from last year vs gpt4.1
i retried my old prompts btw
it did the same frontend
the same answer
Try something different but still with visuals, you should see the difference tbh
it now ranks nr1 on webdev
yeah i tried already to see the difference
while before it was bad
there is none
the model that they deployed for testers on api was called o3-alpha too
xDD
I haven't tried frontend, but for svg the difference is obvious
after that they changed it to nectarine
yeah that's just random joke lol
they did it with gpt2-chatbot as well
They knew everyone would see that name, they did it deliberately...
it was leaked on the git
Not to make it too obvious that it's gpt5 base I think
But still promote their stuff
it was on lmarena and webdev
with that name
It was immediately obvious to everyone that this name alias is public
Besides, @blazing bison , those gains wouldn't be possible otherwise without changing the base model tbh. gpt5-medium generates less than o3-medium but performs considerably better
/video prompt
i believe that gpt-5 is another base model
just the pro version that is strange
as a heavy user of the pro version
i noticed the difference from o1 to o3-pro clearly
but from o3-pro for gpt-5-pro
almost none
well then pro is different as well. Pro is literally just some code for the most part
and i'm talking about frontend in general
they took 3 months to update from o1-pro to o3-pro
but released gpt-5-pro on the same day
that was already sus asf
and
there is no gpt-5 pro on api why
it's like they changed o3-pro low to o3-pro medium now
They were slow with o3-pro as well. I think it took them time to test and approve. You can probably get some unexpected results by running each prompt 10+ times and to make gains it is not super easy out the box. Small changes even to just the prompting can make a fairly big difference
lol. Isn’t pro just gpt high now?
This time it may actually be capacity for real... They did big changes with new releases and are still finetuning the caps for subs. Pro API would nuke their gpus lol
like i said, i can't prove it, but i'm not the only pro user that feels that, everyone that heavily uses pro models from openai feels the same
i think the real gpt-5 pro is gonna appear on chatgpt only after it's available on api
I thought it was leaked that pro is just gpt high
conspiracy theory much. This is extremely unlikely
gpt high didn't take 10-20 minutes to answer
well i have it, i'm using it, and the AI communities on x say the same
No, pro is parallel test-time compute. As should have been immediately obvious but you can read their model card if you don't want to take my word for it. 👀
and there is a router on pro models on chatgpt too
it answer in seconds if your question is simple
Hmm. What is it then?
What? So 200 juice is reserved for API specifically, and even Pro users don't get this treatment? Is this a joke?
In ChatGPT, we also provide access
to gpt-5-thinking using a setting that makes use of parallel test time compute; we refer to this as
gpt-5-thinking-pro.
Yeah this is what I was thinking of. It shows pro as juice 128
yes
Notice that even when writing this they knew it's gonna be available at first only on chatgpt
not API
no different results
We can only wonder...
@blazing bison I think it "feels" the same probably because they used much of the same prompting to combine ~10 attempts into a single answer. So when it follows their instructions, answers are going to look similar to the previous pro, by design
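For anyone wondering what "several attempts into a single answer" could look like, here is a minimal sketch that assumes plain majority voting over repeated samples. How OpenAI actually aggregates attempts is not stated anywhere in this chat, so treat this as an illustration only. `ask` is a hypothetical stand-in for whatever function calls a model, and the attempts run one after another here for simplicity rather than in parallel.

```python
from collections import Counter
from typing import Callable


def majority_answer(ask: Callable[[str], str], prompt: str, n: int = 10) -> str:
    """Sample the same prompt n times and return the most common answer."""
    answers = [ask(prompt) for _ in range(n)]
    # most_common(1) returns a list holding one (answer, count) pair.
    return Counter(answers).most_common(1)[0][0]
```

With a sampler that sometimes gets it wrong, the occasional bad attempt gets outvoted, which is the usual argument for spending extra compute this way.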
I have a theory on that. If you think about it a bit it’s obvious…
Context length?
Ye
no
it's bcs above 200 there is no difference or the model actually loses performance
So you want to say that at some point the context of reasoning becomes too long and the model starts to slop?
No what. What evidence do you have to prove this absurd theory that they are using o3-pro in disguise marketed as gpt5-pro?
Yes. So 200 is the max cot length it was trained on. Above that is slop.
my gut feeling and my friends that actually have pro
idk what you're talking about on plus plan
But why just 200
So basically nothing. It's difficult to test models properly even doing the right things, let alone 'gut feeling'...
if you run same prompts and have the same results
it's enough
it is
?
By the way did you know guys that these stealth models drop their names if you ask them nicely
Is there any specified schedule for when the leaderboards are updated?
🤓
I don't know what's the system prompt of LMArena that restricts them to share their names
But they will at least very generously share the information about their creators
@blazing bison Also wait a sec, you are SERIOUSLY arguing that gpt5-pro is just a scam and that they are using o3-pro instead, based on your gut feeling and no evidence to distinguish between 2 very good models? You are ACTUALLY for real? 🤣 🤣
i wasnt siding with u
LOL
they can call anything gpt-5-pro on their end
it's not a scam
It technically is tho💀
it's not, on their end gpt-5 can be called o3-alpha and it's ok
ur saying that openai will break federal laws
good point
Gpt5pro means that its gemini knockoff
👍
dang you can use Claude 4.1 for free now too lmarena is awesome!
Imagine if OpenAI really scammed everyone with this move 💀
What do you mean it's okay 😭😭😭
?? they can call anything gpt-5-pro bro
You sound delusional and misinformed
i'm not saying that they are delivering a worse model
could you be more vague
To be honest it's not much surprising after the news about Sama being a pathological narcissistic liar
if they change reasoning effort of old o3-pro to high and call it gpt-5-pro it's ok
Lol.
no
Yeah, nobody can do illegal stuff
because they market it as a better tool
if the reasoning effort is high, the result is better
then it's a better tool
source: trust me bro
i guess gpt-5 is actually just gpt 2 fine tuned
we got played..
i'm not saying that it's actually it
no
i tried it
with the same prompts
and got the exactly same results
Imagine if everything they did to gpt-5 was just scaling the reasoning up to 200 juice, once Gemini scales up to this number OpenAI will be cooked 💀
which platform did you try this in?
zenith was 64 juice
was better than anything atm
I agree with him. I don't think the people arguing here even have Pro
There are even posts about it on the OpenAI forum
OpenAI said they are working on it
So here's the following dilemma
the hidden model in battle mode
You get GPT Pro with two stupid models running at once that kind of "increases" the odds of successful completion of a task
Or one smart GPT High
Two stupid models or one smart?
qwen:
"basically guys i think gpt-5 pro is the same as o3 pro bc my gut says so"
Theo even made a video about it. It's not directly about Pro Models, but it is about all GPT-5 models. There is a problem with them
holy ragebait mother of 3
OpenAI has already addressed this
it's trivial to jailbreak LMArena to make the models namedrop
? pls tell in detail
Up to you, hacker
But I can say that, from my experience, if it is a model that is worth attention, everyone will be talking about it
Be it Zenith, Toad or whatever
agreed
zenith was probably gpt 5
lmarena should probably reveal the hidden models after their public release
Toad is nothing special btw
zenith was the nectarine model
im only assuming because of what theo says
it matches with zenith's output speed and accuracy
ahh okay
not confirmed tho
yeah its meh
I meant that it is not by a major provider
nah its just not SOTA so IDC
That's why I think you don't have the Pro model. If you had paid for it and didn't notice any improvements, you would be upset too
If it is by a major provider everyone'd already be losing their crap here and there
💀
what are u yapping about
but i just saw it side by side with qwen 235b n thought hey they look similar
i have the pro model
Show proof
Better follow openrouter for stealth models
😆
There is no point in discussing this btw
@keen beacon
I'm watching this
This is the tragic story of one of the greatest minds in history!
If you want to learn more about the crazy math that Ramanujan and Hardy came up w...
india
This documentary is even better than the movie
🔥
There are so many stealth models on LMArena and they are total garbage
@keen beacon
What?
Toad
Toad?
I don't know
Iirc?
If I Remember Correctly
What is IIRC
Most hype comes from stealth models from openrouter
Here are the big things
Can you give an example of a stealth model
Toad, Zenith, Nano-banana
I know nano banana
They all come from big startups and enterprises btw. There's no way someone can train a LLM in their garage, so think about something as big as Netflix
But it doesn't matter
None of them are SOTA
Claude, Gemini or Gemini no doubt
Gemini has genuinely impressed me today though
@echo aurora make this into a channel. We have community-creations, community-polls would fit and be useful. 🙂
My favorite anime (it is NOT Madoka Magica) is often criticized for things that are present in 99% of other anime, but failed mostly due to marketing reasons
There are very few models that have identified this problem
Qwen and Deepseek often point it out, as does GPT-5
But Gemini was the only one that compared it to other shows and figured out that these criticisms do not actually matter
I'm fully aware that the prompts are public yes, so it's essentially a platform that lets you test different models that stop working about 2 minutes after using them?
how to make video
No i think only the claude opus model has limits
Because it is so expensive
nah man, it's gemini 2.5 pro, chatgpt 5, claude 4 sonnet, opus as well
3.7 claude
You sure? After how many prompts did you encounter limits on lmarena?
one
look
Bruh lol
one specific prompt
I used gpt 5 high for hours
i'll give ya a prompt you'll see
Okay send it
That is not limit
Send prompt
i'll send you the prompt
hold up
other things work fine
whoops
wait
i'mma send it in dms
great!
Gemini is falling behind a bit when it comes to agentic abilities (hopefully the next model proves this wrong).
the next Gemini will probably topple everything
nightride-on might be the next one
Not sure whether it's Flash though
can anyone in this chat try and use this prompt and tell me if it bricks a chat for any of you on any model?
```
<form method="POST" action="{{ route('register') }}">
@csrf
<!-- Name -->
<div>
<x-input-label for="name" :value="__('Name')" />
<x-text-input id="name" class="block mt-1 w-full" type="text" name="name" :value="old('name')" required autofocus autocomplete="name" />
<x-input-error :messages="$errors->get('name')" class="mt-2" />
</div>
<!-- Username -->
<div class="mt-4">
<x-input-label for="username" :value="__('Username')" />
<x-text-input id="username" class="block mt-1 w-full" type="text" name="username" :value="old('username')" required autocomplete="username" />
<x-input-error :messages="$errors->get('username')" class="mt-2" />
<p class="text-sm text-gray-600 mt-1">This will be your unique profile URL: {{ url('/') }}/username</p>
</div>
<!-- Email Address -->
```
No idea, havent played at all with stealth models
I only encountered it once, it was pretty good.
it does
It would be nice if there was a mode that tested tool use/ReAct loops more.
its probably escaping the data and making the packet invalid
yeah it's weird as hell
not really, thats how code just works sometimes
because it accidentally created a situation that made the LMArena website start going crazy
i'm trying to figure out if there's a way to fix a bricked chat
i dont think so
nothing seems to be working lmao
cleaned my cache, hard reload, cleaned the browser history
disabled all of my extensions
restarted my browser
try this prompt and tell me if it's working for ya
.
Its not the browser
its LMArena
i know this issue
that code is the prompt?
My Extension had an issue with it
which model
All models
any of them
ok
no
theres something wrong with your prompt
no
nothing breaks for me
the prompt is not the issue
then you have luck? or you added something to the prompt
but everything works fine until i put that prompt
its how LMArena handles messages
Probably luck
yes
Can you show a pic
how very very interesting
thats GREAT
WHAT?!
I think I also know why its working for you
gpt 5 high as well???
yes
bruh
let's try that
```php
<x-guest-layout>
<form method="POST" action="{{ route('register') }}">
@csrf
<!-- Name -->
<div>
<x-input-label for="name" :value="__('Name')" />
<x-text-input id="name" class="block mt-1 w-full" type="text" name="name" :value="old('name')" required autofocus autocomplete="name" />
<x-input-error :messages="$errors->get('name')" class="mt-2" />
</div>
<!-- Username -->
<div class="mt-4">
<x-input-label for="username" :value="__('Username')" />
<x-text-input id="username" class="block mt-1 w-full" type="text" name="username" :value="old('username')" required autocomplete="username" />
<x-input-error :messages="$errors->get('username')" class="mt-2" />
<p class="text-sm text-gray-600 mt-1">This will be your unique profile URL: {{ url('/') }}/username</p>
</div>
<!-- Email Address -->
```
You need to copy my message through the discord copying
try that
<x-guest-layout>
<form method="POST" action="{{ route('register') }}">
@csrf
<!-- Name -->
<div>
<x-input-label for="name" :value="__('Name')" />
<x-text-input id="name" class="block mt-1 w-full" type="text" name="name" :value="old('name')" required autofocus autocomplete="name" />
<x-input-error :messages="$errors->get('name')" class="mt-2" />
</div>
<!-- Username -->
<div class="mt-4">
<x-input-label for="username" :value="__('Username')" />
<x-text-input id="username" class="block mt-1 w-full" type="text" name="username" :value="old('username')" required autocomplete="username" />
<x-input-error :messages="$errors->get('username')" class="mt-2" />
<p class="text-sm text-gray-600 mt-1">This will be your unique profile URL: {{ url('/') }}/username</p>
</div>
<!-- Email Address -->
Maybe. I like to try and keep channel lists light, and I don't think polls get used enough to justify their own channel atm.
its message handling stuff
that's so freaking weird
🙂
thats GREAT
Interesting that GPT-5-High actually loses to Gemini 2.5 Pro the majority of the time
well, LMArena is changing its shape slowly
no
jkjk
🙂
Im making another small extension to change the look of LMArena into more OpenChat style yk
can you put the model selector next to the image button
yooooooo good luck
Thats what im working on
right this second
very fun stuff
Actually it's kinda odd, because Gemini has a higher win rate against more models, yet is ranked lower. Green = higher win rate. Red = lower win rate. Yellow = tie/no data. Statistical paradox?
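It's not necessarily a paradox. A small sketch, assuming the leaderboard uses a Bradley-Terry style rating (an assumption here, and the numbers below are invented, not real arena data): a model can have a winning record against more opponents and still get a lower rating, because the fit looks at how many games are won overall, not at how many opponents are beaten.

```python
def bradley_terry(wins, iters=500):
    """Fit Bradley-Terry strengths with the standard MM update.

    wins[i][j] is the number of times model i beat model j.
    """
    n = len(wins)
    p = [1.0] * n
    for _ in range(iters):
        new_p = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            denom = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(n)
                if j != i
            )
            new_p.append(total_wins / denom)
        scale = sum(new_p)
        p = [x / scale for x in new_p]  # normalise so strengths sum to 1
    return p


def opponents_beaten(wins):
    """How many opponents each model has a winning head-to-head record against."""
    n = len(wins)
    return [
        sum(1 for j in range(n) if j != i and wins[i][j] > wins[j][i])
        for i in range(n)
    ]


# Invented data: A narrowly beats both B and C, B crushes C.
#        A   B   C
wins = [
    [0, 11, 11],  # A
    [9, 0, 19],   # B
    [9, 1, 0],    # C
]
strengths = bradley_terry(wins)
beaten = opponents_beaten(wins)
```

Here A has a winning record against two opponents and B against only one, yet B ends up with the higher strength because it wins more games in total (28 vs 22 out of 40).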
make it so that the prompts are gonna work by adding that
hello
😂
😭
just kidding it's still not gonna do anything
make ai give it a quick upper cut back into place
@echo aurora How does this work?
ai would fail miserably 😭
Lol the filters turn off when adding ```
test it out
Uhm but still the model itself has safety filters
Yea and it won't let me generate girls kissing
😠
Not when you give it a really nice system prompt
Yeah....
?
filters....
How to do that?
Did it fall out of the flex box?
Prob because I turned it into a circle
Not yet available for normal people
You mean you're part of lmarena team?
Or connected to them
how do I stop "generating"? it's been an hour, lol
That shouldn't change it
Did you refresh the page?
no idea
@verbal nimbus yes brother, I did.
is there any solution, to fix this problem? it's so frustrating.
do you think it's possible to save a chat that has that prompt that bricks it?
just lemme cook well
Do new chats work?
because something weird just happened
ups
it's working. but the chat AI I used, knows all the information and important things I have
something very weird just happened @tired herald ya gotta help me understand that
that's insanely weird
so i've had a chat that was bricked right?
couldn't send any messages whatsoever
not even a hello
Ah yeah, that's annoying. Very odd that it hasn't timed-out.
i've responded with ``` and then wrote the whole code thing
and it legit worked somehow???
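If wrapping the code in a fence really is what makes the difference (nobody here has confirmed why it works), a tiny helper like this can do the wrapping before pasting. `wrap_in_fence` is a hypothetical name, and the only thing it does is pick a backtick fence longer than any backtick run already inside the snippet.

```python
import re


def wrap_in_fence(snippet: str, lang: str = "") -> str:
    """Wrap a snippet in a Markdown code fence that cannot be closed early."""
    # Longest run of backticks already present in the snippet (0 if none).
    longest = max((len(run) for run in re.findall(r"`+", snippet)), default=0)
    fence = "`" * max(3, longest + 1)
    return f"{fence}{lang}\n{snippet.rstrip()}\n{fence}"


prompt = "<form method=\"POST\" action=\"{{ route('register') }}\">\n@csrf\n</form>"
print(wrap_in_fence(prompt, "php"))
```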
If you want the info, you can copy the whole chat out by manually highlighting everything and ask another AI to clean the raw paste and summarize it.
I hope they will fix it
ok, I really dont know why its happening, thats really weird
@echo aurora
we need your help
I will try bro. thank you! that's a big help
dem
Report in #1343291835845578853
Hey sorry I'm in the middle of something else and haven't been following this chat closely. Can you submit a bug report and TLDR everything so I can take a look/let the team know?
i'll do that right now
btw, try clearing cookie files
does anyone know how to stop generation? my generation with the chatbot bugged, it just loads endlessly
for example, so that the chatbot gives an error or if possible in some other way
What version do you use the most?
13
24
3
Direct
nohow
Unfortunately, your only two options at the moment are to refresh the page (sometimes this works) or start a new chat. This is a problem we're aware of where models will continue to generate without ending.
unfortunately, restarting the site doesnt help, Ill have to make a new chat(
thanks for the answer
Hello to all
Why is style control not called "length control" instead?
Style control is confusing!
Howdy partners
Really sorry to hear that it didn't work. The current state of what happens when chats get stuck isn't great, and we plan to make changes to help with this.
@echo aurora , I’m sorry to interrupt, I’d just like to ask a quick question; why is it that there is no direct chat on WebDev Arena?
What’s webdev used for?
does gpt-5-high on lmarena support tools?
nah
only chat or gpt-5-search in search mode
react apps
but WebDev doesn't have direct or side by side mode 🙁
because it uses others things besides length, like lists, bold and markdown formatting
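A rough sketch of what "style" could mean beyond length: the snippet below counts a few surface features of a markdown response (word count, list items, bold spans, headers). The function name `style_features`, the regexes, and the sample text are illustrative assumptions, not LMArena's actual implementation.

```python
import re

def style_features(markdown_text: str) -> dict:
    """Count a few surface-level style features of a markdown response.

    Hypothetical helper for illustration only; the real leaderboard
    may define and weight its style features differently.
    """
    lines = markdown_text.splitlines()
    return {
        # Whitespace-separated tokens as a crude length measure
        "length_words": len(markdown_text.split()),
        # Lines starting with -, *, + or "1." followed by whitespace
        "list_items": sum(bool(re.match(r"\s*([-*+]|\d+\.)\s+", ln)) for ln in lines),
        # **bold** spans that stay on a single line
        "bold_spans": len(re.findall(r"\*\*[^*\n]+\*\*", markdown_text)),
        # ATX headers: one to six # characters followed by whitespace
        "headers": sum(bool(re.match(r"#{1,6}\s+", ln)) for ln in lines),
    }

sample = "# Title\n\n- **first** point\n- second point\n\nPlain closing sentence."
features = style_features(sample)
print(features)
```

Two answers with the same word count can still differ on the other three counts, which is why length alone would not cover it.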
Any opinions on "folsom-0805-1"
I tried it on a logic prompt and it actually wasn't terrible.
I wish we could compare toad to zenith
“We have to make these horrible trade-offs right now,” he said. “We have better models, and we just can’t offer them because we don’t have the capacity. We have other kinds of new products and services we’d love to offer.” Sam in recent interview
BRO HOW IS NO ONE COMPLAINING ABOUT THIS?!
IT'S LITERALLY ALWAYS HAPPENING LEGIT ONE SINGLE PROMPT CAUSED IT
WHAT THE HELL
That's been a thing forever, seems no resolve
bruhhhhhhhhhhhhhhhhhhhhhhhhh
wdym lmao
its probably the most common complaint
and still no solution? not even a temporary fix?
no
bruv
lmarena team is working on it
uff
even gpt-oss-120b is better than the minimal effort gpt-5 imo
Something went wrong while generating the response. Please try again.
gpt5-minimal < gpt4.1
but to make this more confusing, this is also likely true:
gpt5-chat > gpt4.1
it depends on which model you get routed to
anything is better than gpt5-minimal xD
gpt5-minimal is probably only in the API
it uses that if it "thinks" that the question doesnt require much thinking
which is like in 99% of the cases i feel
Something went wrong while generating the response. Please try again.
it uses gpt5-chat. That's the direct replacement model for chatgpt-4o-latest. It's a different model from gpt5-minimal, this one has no reasoning at all so it most definitely performs better as reasoning wasn't the focus at all
just use the prompt in direct chat, and paste it into react project:
instead of reasoning being something it was trained for that was then later taken away... It was fine-tuned from the get go to perform as good as possible without relying on reasoning
@gentle plinth pls pdf support noah 🙏
wdym
i just asked it to generate a site with the prompt
and copied it
its what they are using in webdev arena apparently
What is webdev arena
but i honestly cannot guarantee that it will work out of the box
bc they seem to have some specific project setup
but maybe ai can help you with that
What does it do
You are the ceo of lmarena right?
and you have to say which one is better
no
whats even going on here
the purple name is just a role i picked xD
Isn't creating a website very hard and you need wix for that?
🚀 Qwen Chat Desktop for Windows is here!
💻 All the power of Qwen Chat — now with MCP support for smarter, faster agents.
⚡ Run up MCP Servers, supercharge your productivity, and stay in control.
📥 Download now → https://t.co/uYQIIGQAJo
ai can do some small websites
I'm curious
for larger projects of course it can get to its limits
why would i install something that i can use in the webbrowser
qwen sucks lmao lol rofl
there isnt
Qwen is good over deepseek?
not at coding
I want it to show a photo in the website
but the image model from qwen is great
What are your thoughts on Kimi AI?
How's Gpt 5 high
its amazing its SotA
Isn't Claude on top for that
@echo aurora is it possible to add different reasoning efforts for gpt models? Because we don't get high in the chatGPT app
What about gpt 5 search
Do you have a good explanation for prompt injection these days? GPT5 compatible?
Want to know if my company is safe
mostly coding
@stray aspen why don't you compare lmarena gpt 5 high and yupp ai
i dont need to lmao
What company
Does LMArena have any limits aside from opus?

lmarena is greater than yupp ai regarding usage of gpt-5 high
What a crash holy crap
holy sigma
@gentle plinth helloo
seems like gpt-oss-120b is around o3-mini level for hard prompts
GPT 4o better than GPT 5 wtf
How did they manage to make 5 worse than 4o 😭
Is it possible to switch between Auto, Fast and Thinking models on the phone in ChatGPT?
Is it even cheaper to run?
How recent is the leaderboard for the rest of you
I think it's likely that OpenAI and lmarena came to some kind of agreement to make GPT-5 look better than it actually was on release
o3 search is better than gpt5 search I can say that with 100% certainty
new leaderboard
i cant tell if the announcement implies it changed rn lol
4o better than 4.5 ? How?
Or maybe OpenAI just deceived lmarena before release
gemini still top of pack 😄 ?
@echo aurora 🙀
4o was always top tier
Guys. Is it possible to switch between Auto, Fast and Thinking models on the phone in ChatGPT?
gpt-5 sucked on release
it was trash
now its way better
still trash
Still sucks
people vote for it because of its sycophancy
golden trash is still trash
GPT-3o
lol
SOTA
I like GPT-5 clapping back against stupid ideas in contrast to 4o
That's like the only thing it's better at though
Yupp ai, is this free?
how interesting
i dont understand how ppl think 2.5 pro is so good
@tired herald how much are you gonna sell the plugin for
OpenAI made people addicted to this model.
i mean its not bad
Gives you credits for using the ai
im not selling, im gonna put up the code for free
but gpt-5 is way better
prob on github
Does anyone use webdev arena
yeah, not bad but grok 4, o3 pro, 5 pro are all noticeably better no question
It is
but right now lmarena is way better
100%
agreed
There's no better worse in 2+2
When I went there, it asked me for a login; thank God I have not.
@gentle plinth Vro hello
stop randomly tagging me
ups, LMArena doesnt like this at all
The 03-25 variant was amazing. They just ruined it.
or i will block
I think its bc on lm arena ppl ask simple questions
i love how gemini 2.5 pro preview versions were better than the final lol
No. We wouldn't do something like that, ever. That'd go against everything we stand for and are trying to build.
How to attach a photo in webdev
No one has done it
then you can
Give
but as i said it will probably not work out of the box
ask ai to help you run it
It will not run on chrome?
Wdym
🙁
I want to see where the gpt I actually use in chat ranks
Recently I use 2.5 flash more than 2.5 pro. DeepMind used some over-optimized technical changes to 2.5 pro, they do not know how to fix it.
What about Gemini and others? Has anyone hit limits on them?
It's gpt-5-chat I believe
on lmarena?
Sorry currently busy with other stuff, will try to address later 
Fifth place at 1427 Elo
i have only run into limits with claude 4.1 opus on lmarena
Yeah
as far as i know claude 4.1 opus has limits
chatGPT has thinking variants still
They just don't use the high reasoning
You guys were lying in gpt 5 it really is gpt 4 you guys are lying to us
It’s really a bad thing guys
I have proof
Bruh
If I paste the prompt into gpt-5 high, will that work?
See you guys were lying
the api doesnt know what model it is
its not stop yapping
theres no system prompt and the api doesnt know what it is