#general

1 messages · Page 97 of 1

sour spindle
#

might be a google guy now

#

look what you've done openAI

eternal niche
#

it depends on what

keen beacon
#

Because they are pre-release models

tired herald
stray aspen
#

are you a LMArena dev or seomthing

tired herald
#

no

#

just having fun with extensions

#

it doesnt fully work right now tho

native flame
#

Hii, is the legacy site down??

echo aurora
keen beacon
#

Well the code named models appear only randomly so uhm you can just hope that if you have a long convo, they will appear a lot.

tired herald
#

I just got a really good idea for generating code with models

hollow imp
#

Does the legacy site have any updates

#

Like any new models smth

tired herald
#

i dont think so

hollow imp
#

I really like the customisation parameters

tired herald
#

good point

hollow imp
#

Like when @echo aurora you guys changed into new lmarena I was asking gpt and perplexity about you guys

#

😂

tired herald
#
503 Service Unavailable
No server is available to handle this request.
#

damn

#

legacy dead

echo aurora
tired herald
#

oh

#

it

#

seems

#

file sending may break due to the limit of tokens per user message

#

Ah, who cares

#

okay, found another small bug.....

#

no

#

make it easier to generate code with the AIs, and easier to copy and view it

#

damn

#

thats ai for you

#

it guesses

#

intelligently guesses

#

no, the thing is that on LMArena, they dont have any system message at all

#

So they dont have a name per se

#

none of them is trustworthy

#

they profit by being friendly

stray aspen
#

google deepmind guy

#

hes nice

keen beacon
#

Demis 100 percent

stray aspen
#

agreed

tired herald
#

HAHAHAHAHHA

#

FINALLY

eternal niche
#

where is V.P.

tired herald
#

working system prompt

eternal niche
#

🙁

tired herald
#

Finally I can force them AIs to not roleplay as completely different models

#

yes my bruv

#

ChatGPT, trained by OpenAI

high ginkgo
#

Stop using that dumbass chat format

trail creek
#

why are you glasing gpt this much tf?!!

#

damn

neon idol
#

Ehat is the prompt?

#

And then how can I have the picture

#

Uhm idk how to do

#

Lol

#

I eill give you the answer

tired herald
#

well well well

#

gpt 5 mini high is so slow 😭

#

reasoning or no?

#

okay

neon idol
#

I have from grok 4

stray aspen
#

craig are you paying a claude subscription

tired herald
#

GPT-5 Mini High

#

Claude 4.1 Opus Reasoning

#

why do mine feel worse

neon idol
#

@tired herald bro if I dm you can you see with grok 4?

#

I have the svg

tired herald
#

Huh

#

okay

#

dm me

golden ocean
tired herald
tired herald
#

I wonder why system prompts dont work on the mistral medium model

neon idol
#

Glm 4.5 is goodddddd

scenic salmon
#

👋

tired herald
#

Hi?

scenic salmon
#

Just joined, so yeah, hi, lol

tired herald
#

Go check out the other channels 👍

golden ocean
scenic crypt
#

@echo aurora

tired herald
echo aurora
echo aurora
scenic crypt
echo aurora
misty vault
scenic crypt
#

@echo aurora

echo aurora
#

Veo 3 Audio no longer appears
This was changed. There is now Veo 3 with and without audio.

golden ocean
#

Russian state news

#

it does suck tho

#

wtf is this L by gpt 5?

#

obvious indication it is windows machine and provides linux commands

#

even gpt 3 dont fail that

pure falcon
eternal niche
#

he said not to spam

#

i am not spamming

pure falcon
#

Posting the same thing 700 times is spamming

golden ocean
trail creek
#

real

echo aurora
#

We've taken action, let's move on please blobthanks

white hatch
obsidian cargo
#

theory: the stealth model 'toad' is deepseek v4

stray aspen
#

oh damn

#

they banned the gemini ragebaiter

misty vault
#

Huh wdym? paws is still here

#

paws day 1 without gemini 2.5 pro😔

inner gate
#

I prefer grok 4 over gpt 5 and 2.5 I was wondering if anyone else doed

hollow imp
#

Doed?

wicked root
#

Hello, LMArena ranking aside, would you guys recommend GPT5 (subscription model) for coding complex (1500+ line) projects?

marsh stratus
#

depends on what your alternatives are

#

Ideally you'd have multiple models ready to go

wicked root
tidal ginkgo
#

guys

#

i think i might never use LMArena again

marsh stratus
#

That's honestly pretty boilerplate

tidal ginkgo
#

is my personal info in risk for using this service?

leaden palm
#

if you don't like analytics, use a browser with tracker blocking or get ublock origin

marsh stratus
tidal ginkgo
#

yea i will use tracking blocking and vpn

inner gate
wintry tinsel
#

I prefer Claude opus for everything not, math/logic/advanced reasoning

wintry tinsel
tidal ginkgo
wintry tinsel
#

That’s fine but brave is goated it’s got everything built it, 0 bloat ware, and many other useful features

inner gate
#

I usually just use it as a companion I’m not that techy. I don’t prefer it because of intelligence I more meant the actual app lol. I do like how it responds tho

wintry tinsel
#

Built in Tor as well

inner gate
#

General things

inner gate
#

ANI?

wintry tinsel
#

Yeah

inner gate
#

Idk what that is

#

As a companion I meant when I have a question it’s my go to

hollow imp
inner gate
#

💔 I’ll ask it for ideas

wicked root
#

@marsh stratusI think I'm in love with claude. Gemini's good but I keep getting rate limited, Grok is... special in a 'stay away from me by at least 5m at all times kind of way' but it froze when I asked some complex coding question, but Claude re wrote my entire code with just a single prompt.

viscid sun
#

why chash the LMarena chat history ? how to I solve this problem ?

stark mango
#

hello

#

every boddyy

keen beacon
#

lol

#

this is silly.

tidal ginkgo
#

well now i use tracking blocker

keen beacon
#

some dont even know ad blocking

tidal ginkgo
#

yea

echo aurora
echo aurora
trail creek
#

nano banana removed from the arena?

echo aurora
trail creek
#

Its been very long since i got it in the battle.

#

don't lie to me yes it is

golden ocean
keen beacon
#

I just got it in my testing

golden ocean
trail creek
#

Idk why you are lying but it is hundred percent removed

#

im not schizo.

golden ocean
golden ocean
misty vault
#

Or

I'm sorry, but I don't believe that's accurate. I think there may be some misunderstanding here. I'm still learning, so my assessment could be mistaken, and I appreciate your understanding and patience.🙏

exotic stream
#

Mf acting like the CIA is on his ass

golden ocean
#

No bro it's not that, i mean like what if the data of me gooning with ai girlfriend gets leaked and it can linked to my identity irl?!?! then no one want to hire me anymore : (((

#

like claude serious asf but damn he know how to .... well i can't say more due to the rules here :(

#

🤣😭😭

stray aspen
#

dont link your identity tot he AI

golden ocean
#

but LMarena does that now for us

keen beacon
keen beacon
golden ocean
#

other websites might have my name

keen beacon
#

if it concerns you, you can always use a VPN.

golden ocean
exotic stream
golden ocean
#

not the mc video deleted too 😭

misty vault
severe stream
#

Hey does anyone here what the limits for each model are or is it just completely free

keen beacon
#

especially if you put too many inputs in one minute aka rpm

misty vault
severe stream
keen beacon
#

just dont rapid fire the chat

#

lol

severe stream
keen beacon
echo aurora
inner gate
golden ocean
echo aurora
golden ocean
stray aspen
#

is that you pineapple

misty vault
#

yes

echo aurora
#

I take it back

pure falcon
echo aurora
pure falcon
#

Well, you deserve blessings for a wonderful day no matter what

queen thorn
#

is there an answer to why most of the models get all fucky and respond with that?
"Something went wrong while generating the response. Please try again." or is there maybe a temporary fix i could use to use my current chats so that the model doesn't lose the context of whatever it's working on?

blazing bison
vital lake
#

@echo aurora
How do you guys get all these expensive APIs lol? Sorry for ping

blazing bison
#

Ppl ask it 999 times everyday

echo aurora
vital lake
golden ocean
#

the video arena ruined search discord function

golden ocean
scenic salmon
#

saves them from having to do the research themselves

echo aurora
scenic salmon
#

ha, you got VCs to back you, I should pick your brain if you were involved in the raising at all @echo aurora

vital lake
#

Sounds to me like they want to build their own models

#

Using the data

scenic salmon
vital lake
#

Thats worth milions

scenic salmon
#

yeah, no synthetic training data needed, all real human questions

scenic salmon
vernal oxide
#

how to do i use bananaLLM

#

the uhh

#

nano banana image generation

vital lake
#

Btw how did Solar get so good?

#

I remember they were hot garbage

#

They fr locked in

echo aurora
# vernal oxide how to do i use bananaLLM

Models that are using a codename can only be used in the Battle mode (random models being sampled side-by-side). Meaning you aren't able to select it specifically

vital lake
rough brook
#

hi guys what the best prompts for still frame, still camera no movement?

true tusk
#

HI

unborn lily
#

Hi

opaque mirage
#

hi

hollow field
#

Hi

keen beacon
#

wth

wintry tinsel
#

Countdown to Gemini 3💀

rare python
tribal nymph
#

Many models shown on the site aren’t the real ones or are just named versions that don’t exist (like GPT-5, Grok 4, etc.).
They are actually working on GPT-4 & Grok 2.

Kindly fix it ASAP

keen beacon
#

Guys I have a question

#

Has anyone of you experienced a model becoming unable to answer some questions correctly after it was upgraded?

#

Like GPT-o4 answered some correctly, then o3 wasn't able to get it at all

#

I found that Deepseek and Qwen answer my questions correctly based on incorrect data (!) it was trained on what the heck

unborn lantern
#

@echo aurora Brother, the chat history issue is causing a lot of trouble. Please take necessary steps to fix it.

random wolf
#

I hope they put soon the "cancel button" it's so frustrating

#

any solution to stop the "generating"?

solid brook
queen thorn
#

until you're starting a new chat that is, and everything works well again, only if there was a way to maybe export the chat after it dies and maybe import it into another one so that you'll be able to feed the model context about what you're trying to build with it, instead of starting from zero and having to feed it information about what you were using it for in the previous chats

#

because it’s currently in a very unstable state, the website and the concept are cool, but that doesn’t mean much if your chats die after just three minutes of use

verbal nimbus
verbal nimbus
#

For fairness, I think other reasoning levels should be included, not a configuration that even Pro users won't get to use (if X posts are accurate).

keen beacon
#

i believe toad is a qwen series model:

  • it uses emojis in a similar style
  • "the vibes"
#

neither do deepseek models use emojis for responses nor does gemini.

The only other model i've observed is chatgpt-4o-latest

viscid timber
#

what happened to video arena 4

ocean vortex
# verbal nimbus For fairness, I think other reasoning levels should be included, not a configura...

There are some decent reasons for them to do this though. Reasoning effort changes are more impactful in cost for chatgpt than API, since it's often doing tool calls and that mostly happens while it's reasoning. They are fine going all out on API even for diminishing returns but for chatgpt this makes no sense if they get big increase in cost with little to no benefit. And then also, "128" technically still falls within high reasoning effort range for gpt5, since medium on API is 64.

blazing bison
blazing bison
blazing bison
#

And that gpt thinking is using high reasoning

ocean vortex
blazing bison
#

Claude is trained to say which model they are btw

#

But others, no

ocean vortex
#

Yeah that's why I said "usually"

#

Most of the time models will get this info from a system message

blazing bison
#

Ye just claude do that

#

Just claude train to say which model they are i mean

#

Openai, grok, they use the system prompt generally

ocean vortex
#

I'm pretty sure it is. There's a direct link between reasoning effort and that number for sure

blazing bison
#

I didn't saved it but it was in Tibor post

viscid lake
#

Chat gpt us still better at Pokémon

fleet lintel
#

Which company is Nano-banana? OpenAI ?

keen beacon
keen beacon
fleet lintel
#

Does nano implies small model??

keen beacon
viscid lake
unborn lantern
#

Is there any website like lmarena?

#

Anyone know?

viscid lake
#

First,banana
Second chat gpt
Third is imegen 4 ultra

ocean vortex
unborn lantern
#

Is there any website like lmarena?

ocean vortex
#

Also I don't think juice number directly translates gpt5 vs o3. For o3-medium the number is 2X compared to gpt5-medium, but it only just generates more:

#

not by much I mean

blazing bison
#

I think that it is this one

hollow imp
#

😭

rocky mauve
#

anyone know an ai that’s capible of making high quality gfx’s? Is it even possible yet

hollow imp
#

I've tried that a lot

#

Please don't waste your time like me for that

ocean vortex
# blazing bison Idk them, there is another post that he said that think harder on router can rea...

that's different context though. Their router was the weakest link at the start and their model card technically describes it as being more significant than it was on launch. If they haven't improved that yet I'm sure they will. It honestly made no sense that it couldn't do more than low reasoning effort, especially now with it being renamed to "Auto" and describing it as "decides how long to think"

#

It should be able to do chat/low/medium (for Plus sub), the way they are presenting it

blazing bison
#

it sucks anyway

#

like o3

#

always better on api

ocean vortex
#

Then for Pro high as well (128)

blazing bison
#

their chatgpt prompt sucks and the infra or whatever they do on the backend sucks

#

and gpt-5-pro feel exctly like o3-pro, i'm sure they just changed the name

ocean vortex
# blazing bison it sucks anyway

I mean at the start when that was named "GPT5" it did actually perform better than gpt5 with no reasoning and that was fairly obvious. Now it's still a good option for people who are confused by all the models and just want to ask a question

blazing bison
#

now that gemini copy pasted their frontend

#

when they finish copy pasting the features too

#

it's gonna be better than chatgpt

ocean vortex
blazing bison
#

it did the same frontend

#

the same answer

ocean vortex
#

Try something different but still with visuals, you should see the difference tbh

#

it now ranks nr1 on webdev

blazing bison
#

yeah i tryed already to see the difference

ocean vortex
#

while before it was bad

blazing bison
#

there is none

#

the model that they deployed for testers on api was called o3-alpha too

#

xDD

ocean vortex
#

I haven't tried frontend, but for svg the difference is obvious

blazing bison
#

after that they changed it to nectarine

ocean vortex
#

they did it with gpt2-chatbot as well

blazing bison
#

idk about that

#

the model that won the math olympiada has o3-alpha on the name too

ocean vortex
#

They knew everyone would see that name, they did it deliberately...

blazing bison
#

it was leaked on the git

ocean vortex
#

Not to make it too obvious that it's gpt5 base I think

#

But still promote their stuff

ocean vortex
#

with that name

#

It was immediately obvious to everyone that this name alias is public

#

Besides, @blazing bison , those gains wouldn't be possible otherwise without changing the base model tbh. gpt5-medium generates less than o3-medium but performs considerably better

winged burrow
#

/video prompt

blazing bison
#

just the pro version that is strange

#

as a heavy user of the pro version

#

i noticed the differece from o1 to o3-pro clearly

#

but from o3-pro for gpt-5-pro

#

almost none

ocean vortex
blazing bison
#

and i'm talking about frontend in general

ocean vortex
#

And it is using gpt5

#

not o3

blazing bison
#

they took 3 months to update from o1-pro to o3-pro

#

but released gpt-5-pro on the same day

#

that was already sus asf

#

and

#

there is no gpt-5 pro on api why

#

it's like they changed o3-pro low to o3-pro medium now

ocean vortex
#

They were slow with o3-pro as well. I think it took them time to test and approve. You can probably get some unexpected results by running each prompt 10+ times and to make gains it is not super easy out the box. Small changes even to just the prompting can make a fairly big difference

ornate agate
#

lol. Isn’t pro just gpt high now?

ocean vortex
blazing bison
#

i think the real gpt-5 pro is gonna appear on chatgpt only after it's avaliable on api

ornate agate
#

I thought it was leaked that pro is just gpt high

ocean vortex
blazing bison
blazing bison
ocean vortex
blazing bison
#

and there is a router on pro models on chatgpt too

#

it answer in seconds if your question is simple

ornate agate
#

Hmm. What is it then?

keen beacon
# verbal nimbus

What? So 200 juice is reserved for API specifically, and even Pro users don't get this treatment? Is this a joke?

ocean vortex
#

In ChatGPT, we also provide access
to gpt-5-thinking using a setting that makes use of parallel test time compute; we refer to this as
gpt-5-thinking-pro.

ornate agate
# verbal nimbus

Yeah this is what I was thinking of. It shows pro as juice 128

ocean vortex
#

Notice that even when writing this they knew it's gonna be available at first only on chatgpt

#

not API

keen beacon
#

Why did they cap the juice at 200?

#

Why not 512 or 1024?

#

Why this exact number?

blazing bison
#

no different results

keen beacon
#

We can only wonder...

ocean vortex
#

@blazing bison I think it "feels" the same probably because they used much of the same prompting to compute ~10 attempts into a single answer. So when it follows their instructions, answers are to look similar as previous pro, by design

ornate agate
ornate agate
#

Ye

blazing bison
#

no

#

it's bcs above 200 there is no difference or the model actually lose performance

keen beacon
#

So you want to say that at some point the context of reasoning becomes too long and the model starts to slop?

ocean vortex
# blazing bison no

No what. What evidence do you have to prove this absurd theory that they are using o3-pro in disguise marketed as gpt5-pro?

ornate agate
#

Yes. So 200 is the max cot length it was trained on. Above that is slop.

blazing bison
#

idk what you're talking about on plus plan

ocean vortex
blazing bison
#

it's enough

ocean vortex
#

No it's not

#

o3-pro was not gpt3.5

blazing bison
#

it is

ocean vortex
#

?

blazing bison
#

now i have better things to do

#

bye

warm fulcrum
#

arguing about pointless things goes hard

#
  • ur being ragebaited and u cant see it
#

🔥

keen beacon
#

By the way did you know guys that these stealth models drop their names if you ask them nicely

vapid zinc
#

Is there any specified schedule for when the leaderboards are updated?

blazing bison
#

🤓

keen beacon
#

But they will at least very generously share the information about their creators

ocean vortex
#

@blazing bison Also wait a sec, you are SERIOUSLY arguing that gpt5-pro is just a scam and that they are using o3-pro instead, based on your gut feeling and no evidence to distinguish between 2 very good models? You are ACTUALLY for real? 🤣 🤣

warm fulcrum
#

i wasnt siding with u

ocean vortex
#

LOL

blazing bison
#

it's not a scam

ocean vortex
#

no they can't

#

it would be an obvious scam

nimble trail
blazing bison
#

it's not, on their end gpt-5 can be called o3-alpha and it's ok

ocean vortex
#

What gpt5-pro means is CLEARLY defined in their model card

#

no 2 ways about it

warm fulcrum
#

good point

flint sandal
#

Gpt5pro means that its gemini knockoff

warm fulcrum
#

👍

novel crater
#

dang you can use Claude 4.1 for free now too lmarena is awesome!

keen beacon
#

Imagine if OpenAI really scammed everyone with this move 💀

nimble trail
blazing bison
keen beacon
blazing bison
#

i'm not saying that they are delivering a worst model

warm fulcrum
keen beacon
warm fulcrum
#

ur saying openai is doing illegal stuff

#

which they cnat

blazing bison
keen beacon
#

Lol.

blazing bison
#

they aren't breaking any laws

#

doing that

warm fulcrum
#

no

keen beacon
warm fulcrum
#

because they market it as a better tool

keen beacon
#

They will goto jail

#

Its illegal

blazing bison
#

then it's a better tool

warm fulcrum
#

source: trust me bro

#

i guess gpt-5 is actually just gpt 2 fine tuned

#

we got played..

blazing bison
#

i'm not saying that it's actually it

#

no

#

i tryed it

#

with the same prompts

#

and got the exactly same results

keen beacon
#

Imagine if everything they did to gpt-5 was just scaling the reasoning up to 200 juice, once Gemini scales up to this number OpenAI will be cooked 💀

warm fulcrum
warm fulcrum
#

was better than anything atm

wet sparrow
#

I agree with him. I don't think the people arguing here even have Pro

#

There are even posts about it on the OpenAI forum

#

OpenAI said they are working on it

keen beacon
#

So here's the following dilemma

keen beacon
#

You get GPT Pro with two stupid models running at once that kind of "increases" the odds of successful completion of a task

#

Or one smart GPT High

#

Two stupid models or one smart?

blazing bison
#

we don't really know what it is

#

or what it do

keen beacon
warm fulcrum
#

"basically guys i think gpt-5 pro is the same as o3 pro bc my gut says so"

wet sparrow
#

Theo even made a video about it. It's not directly about Pro Models, but it is about all GPT-5 models. There is a problem with them

warm fulcrum
#

holy ragebait mother of 3

wet sparrow
#

OpenAI has already addressed this

keen beacon
keen beacon
keen beacon
#

But I can say that, from my experience, if it is a model that is worth attention, everyone will be talking about it

#

Be it Zenith, Toad or whatever

#

agreed

keen beacon
#

lmarena should probably reveal the hidden models after their public release

#

Toad is nothing special btw

warm fulcrum
#

im only assuming because of what theo says

#

it matches with zenith's output speed and accuracy

keen beacon
#

ahh okay

warm fulcrum
#

not confirmed tho

keen beacon
keen beacon
keen beacon
wet sparrow
keen beacon
#

If it is by a major provider everyone'd already losing their crap here and there

warm fulcrum
#

what are u yapping about

keen beacon
warm fulcrum
#

i have the pro model

wet sparrow
#

Show proof

keen beacon
#

Better follow openrouter for stealth models

wet sparrow
#

😆

warm fulcrum
#

wowie

#

i have gpt-5 pro it seems

#

i must be a millionaire

blazing bison
#

There is no point in discuss this btw

hollow imp
#

@keen beacon

#

I'm watching this

warm fulcrum
#

india

hollow imp
#

This documentary even better than the movie

warm fulcrum
#

🔥

keen beacon
#

There are so many stealth models on LMArena and they are total garbage

hollow imp
#

@keen beacon

keen beacon
#

What?

hollow imp
#

What is a stealth model

#

What are they hiding from

keen beacon
#

Toad

hollow imp
#

Toad?

keen beacon
#

I don't know

keen beacon
#

Yes, and also these Phantoms IIRC

#

They are nothing special anyway

hollow imp
#

Iirc?

keen beacon
#

If I Remember Correctly

hollow imp
#

What is IIRC

keen beacon
hollow imp
#

Ohh

#

Bruh I thought

keen beacon
#

Here are the big things

hollow imp
#

IIRc is a

#

Ai term

#

😭

hollow imp
keen beacon
hollow imp
keen beacon
#

Me too

#

We went to the same school together

keen beacon
#

But it doesn't matter

#

None of them are SOTA

#

Claude, Gemini or Gemini no doubt

#

Gemini has genuinely impressed me today though

ocean vortex
#

@echo aurora make this into a channel. We have community-creations, community-polls would fit and be useful. 🙂

keen beacon
#

There are very few models that have identified this problem

#

Qwen and Deepseek often point it out, as does GPT-5

#

But Gemini was the only one that compared it to other shows and figured out that these criticisms do not actually matter

queen thorn
somber rover
#

how to make video

solid brook
#

Because it is so expensive

queen thorn
#

3.7 claude

solid brook
solid brook
#

Bruh lol

queen thorn
#

one specific prompt

solid brook
#

I used gpt 5 high for hours

queen thorn
#

i'll give ya a prompt you'll see

solid brook
#

Okay send it

queen thorn
solid brook
terse shuttle
#

or clear cookie

queen thorn
queen thorn
#

of these things

solid brook
#

Send prompt

terse shuttle
#

idk

queen thorn
#

i'll send you the prompt

#

hold up

#

other things work fine

#

whoops

#

wait

#

i'mma sent it in dms

vernal oxide
#

guys

#

how do i know if i got the banana llm

keen beacon
verbal nimbus
#

Gemini is falling behind a bit when it comes to agentic abilities (hopefully the next model proves this wrong).

tired herald
#

the next Gemini will probably topple everything

verbal nimbus
#

Not sure whether it's Flash though

queen thorn
#

can anyone in this chat try and use this prompt and tell me if it bricks a chat for any of you on any model?

    <form method="POST" action="{{ route('register') }}">
        @csrf

        <!-- Name -->
        <div>
            <x-input-label for="name" :value="__('Name')" />
            <x-text-input id="name" class="block mt-1 w-full" type="text" name="name" :value="old('name')" required autofocus autocomplete="name" />
            <x-input-error :messages="$errors->get('name')" class="mt-2" />
        </div>

        <!-- Username -->
        <div class="mt-4">
            <x-input-label for="username" :value="__('Username')" />
            <x-text-input id="username" class="block mt-1 w-full" type="text" name="username" :value="old('username')" required autocomplete="username" />
            <x-input-error :messages="$errors->get('username')" class="mt-2" />
            <p class="text-sm text-gray-600 mt-1">This will be your unique profile URL: {{ url('/') }}/username</p>
        </div>

        <!-- Email Address --> ```
tired herald
verbal nimbus
verbal nimbus
#

It would be nice if there was a mode that tested tool use/ReAct loops more.

tired herald
queen thorn
#

yeah it's weird as hell

tired herald
#

not really, thats how code just works sometimes

queen thorn
#

it replied with that, and after it did, it got bricked completely

tired herald
#

because it accidentally created a situation that made the LMArena website start going crazy

queen thorn
tired herald
#

i dont think so

queen thorn
#

nothing seems to be working lmao

tired herald
#

prob is impossible

#

but idk

#

ask pineapple

stray aspen
#

thats what pineapple told me

queen thorn
#

cleaned my cache, hard reload, cleaned the browser history

#

disabled all of my extensions

#

restarted my browser

#

try this prompt and tell me if it's working for ya

tired herald
#

its LMArena

queen thorn
#

has to be an issue with lmarena

#

yeah

#

yeah

#

true

tired herald
#

i know this issue

stray aspen
#

that code is the prompt?

queen thorn
#

yeah copy paste it

#

into lmarena

tired herald
#

My Extension had an issue with it

stray aspen
#

which model

tired herald
#

All models

queen thorn
#

any of them

stray aspen
#

ok

queen thorn
#

could it be that it's happening because it's inside of that text code block?

stray aspen
#

that prompt is broken

tired herald
stray aspen
#

theres something wrong with your prompt

tired herald
#

no

white hatch
#

nothing breaks for me

tired herald
#

the prompt is not the issue

queen thorn
#

on a new chat tho?

tired herald
stray aspen
#

but everything works fine until i put that prompt

tired herald
#

its how LMArena handles messages

white hatch
white hatch
tired herald
queen thorn
#

yeah idk man that's insanely weird

#

idk how??

white hatch
tired herald
#

how very very interesting

stray aspen
#

thats GREAT

queen thorn
tired herald
#

I think I also know why its working for you

queen thorn
#

gpt 5 high as well???

white hatch
#

yes

queen thorn
tired herald
#

who wouldve guessed

#

its the 1:1 the same issue I had with files for my extension

queen thorn
tired herald
#

and then give

queen thorn
#

let's try that

tired herald
#
<x-guest-layout>
    <form method="POST" action="{{ route('register') }}">
        @csrf

        <!-- Name -->
        <div>
            <x-input-label for="name" :value="__('Name')" />
            <x-text-input id="name" class="block mt-1 w-full" type="text" name="name" :value="old('name')" required autofocus autocomplete="name" />
            <x-input-error :messages="$errors->get('name')" class="mt-2" />
        </div>

        <!-- Username -->
        <div class="mt-4">
            <x-input-label for="username" :value="__('Username')" />
            <x-text-input id="username" class="block mt-1 w-full" type="text" name="username" :value="old('username')" required autocomplete="username" />
            <x-input-error :messages="$errors->get('username')" class="mt-2" />
            <p class="text-sm text-gray-600 mt-1">This will be your unique profile URL: {{ url('/') }}/username</p>
        </div>

        <!-- Email Address --> 
white hatch
#
```php

tired herald
#

yes

#

thats the code block format

queen thorn
tired herald
#

You need to copy my message through the discord copying

#

try that

#
<x-guest-layout>
    <form method="POST" action="{{ route('register') }}">
        @csrf

        <!-- Name -->
        <div>
            <x-input-label for="name" :value="__('Name')" />
            <x-text-input id="name" class="block mt-1 w-full" type="text" name="name" :value="old('name')" required autofocus autocomplete="name" />
            <x-input-error :messages="$errors->get('name')" class="mt-2" />
        </div>

        <!-- Username -->
        <div class="mt-4">
            <x-input-label for="username" :value="__('Username')" />
            <x-text-input id="username" class="block mt-1 w-full" type="text" name="username" :value="old('username')" required autocomplete="username" />
            <x-input-error :messages="$errors->get('username')" class="mt-2" />
            <p class="text-sm text-gray-600 mt-1">This will be your unique profile URL: {{ url('/') }}/username</p>
        </div>

        <!-- Email Address --> 
queen thorn
tired herald
#

still not copying with discord bruv

#

``

#

`

echo aurora
tired herald
#

you need three of `

#

at the beginning

#

and end

#

works everytime for me

white hatch
tired herald
#

yes

#

that

#

exactly that

#

copy it and paste

queen thorn
#

interesting

tired herald
#

its message handling stuff

queen thorn
#

that's so freaking weird

tired herald
#

🙂

stray aspen
#

thats GREAT

verbal nimbus
#

Interesting that GPT-5-High actually loses to Gemini 2.5 Pro the majority of the time

tired herald
#

well

#

it happens

queen thorn
#

welp, so lmarena

#

gotta fix some stuff

tired herald
#

well, LMArena is changing its shape slowly

queen thorn
#

new update?!

#

haha

tired herald
#

no

queen thorn
#

jkjk

tired herald
#

🙂

#

Im making another small extension to change the look of LMArena into more OpenChat style yk

stray aspen
#

can you put the model selector next to the iamge button

tired herald
#

right this second

#

very fun stuff

verbal nimbus
queen thorn
tired herald
#

😭

queen thorn
#

just kidding it's still not gonna do anything

tired herald
#

the + has an issue with me

queen thorn
verbal nimbus
tired herald
solid brook
tired herald
#

oh what

#

no way

queen thorn
#

huh?

#

wait lemme try and use that on that website

tired herald
#

test it out

solid brook
#

Uhm but still the model itself has safety filters

keen beacon
#

😠

tired herald
solid brook
#

Yeah....

tired herald
queen thorn
tired herald
#

filters....

queen thorn
#

yeah hold on

#

it's still responding it's funny

#

to me

solid brook
verbal nimbus
tired herald
#

Prob because I turned it into a circle

tired herald
solid brook
#

Or connected to them

random wolf
#

how do I stop "generating"? it's been an hour, lol

verbal nimbus
verbal nimbus
tired herald
random wolf
#

@verbal nimbus yes brother, I did.

#

is there any solution, to fix this problem? it's so frustrating.

queen thorn
#

do you think it's possible to save a chat that has that prompt that bricks it?

tired herald
verbal nimbus
queen thorn
#

because something weird just happened

tired herald
random wolf
queen thorn
#

something very weird just happened @tired herald ya gotta help me understand that

#

that's insanely weird

#

so i've had a chat that was bricked right?

#

couldn't send any messages whatsoever

#

not even a hello

tired herald
#

yes

#

?

verbal nimbus
queen thorn
#

i've responded with ``` and then wrote the whole code thing

#

and it legit worked somehow???

verbal nimbus
queen thorn
random wolf
queen thorn
#

but now

tired herald
# queen thorn

ok, I really dont know why its happening, thats really weird

#

@echo aurora

#

we need your help

random wolf
queen thorn
#

haha

tired herald
#

dem

solid brook
echo aurora
# tired herald <@283397944160550928>

Hey sorry I'm in the middle of something else and haven't been following this chat closely. Can you submit a bug report and TLDR everything so I can take a look/let the team know?

white hatch
#

btw, try clearing cookie files

tired herald
#

do the buttons look good enough?

grizzled turtle
#

if someone know how to stop generation? my generation with chatbot bugged, its just loads endlessly

#

for example, so that the chatbot gives an error or if possible in some other way

echo aurora
#
poll_question_text

What version do you use the most?

victor_answer_votes

13

total_votes

24

victor_answer_id

3

victor_answer_text

Direct

echo aurora
grizzled turtle
#

thanks for the answer

glossy epoch
#

Hello to all

red sluice
#

Why is style control not called "length control" instead?
Style control is confusing!

inner gate
#

Howdy partners

echo aurora
raven helm
#

@echo aurora , I’m sorry to interrupt, I’d just like to ask a quick question; why is it that there is no direct chat on WebDev Arena?

patent bane
#

is gpt-5-high on lmarena supports tools?

terse shuttle
#

only chat or gpt-5-search in search mode

terse shuttle
#

but WebDeb doesn't hace direct or side by side mode 🙁

hardy lion
ancient reef
#

Any opinions on "folsom-0805-1"
I tried it on a logic prompt and it actually wasn't terrible.

obsidian cargo
#

I wish we could compare toad to zenith

whole wagon
#

“We have to make these horrible trade-offs right now,” he said. “We have better models, and we just can’t offer them because we don’t have the capacity. We have other kinds of new products and services we’d love to offer.” Sam in recent interview

queen thorn
#

BRO HOW IS NO ONE COMPLAINING ABOUT THIS?!

#

IT'S LITERALLY ALWAYS HAPPENING LEGIT ONE SINGLE PROMPT CAUSED IT

#

WHAT THE HELL

whole wagon
#

That's been a thing forever, seems no resolve

queen thorn
#

bruhhhhhhhhhhhhhhhhhhhhhhhhh

stray aspen
#

its probably the most common complaint

queen thorn
#

and still no solution? not even a temporary fix?

stray aspen
#

no

queen thorn
#

bruv

stray aspen
#

lmarena team is working on it

tired herald
#

uff

gentle plinth
queen thorn
ocean vortex
ocean vortex
#

but to make this more confusing, this is also likely true:
gpt5-chat > gpt4.1

gentle plinth
gentle plinth
ocean vortex
gentle plinth
#

it uses that if it "thinks" that the question doesnt require much thinking

#

which is like in 99% of the cases i feel

brittle furnace
ocean vortex
gentle plinth
tired herald
#

very cool

#

ill take this and use it as my system prompt

ocean vortex
#

instead of reasoning being something it was trained for that was then later taken away... It was fine-tuned from the get go to perform as good as possible without relying on reasoning

hollow imp
#

@gentle plinth pls pdf support noah 🙏

gentle plinth
#

wdym

#

i just asked it to generate a site with the prompt

#

and copied it

#

its what they are using in webdev arena apparently

hollow imp
gentle plinth
#

but i honestly cannot guarantee that it will work out of the box

#

bc they seem to have some specific project setup

#

but maybe ai can help you with that

gentle plinth
hollow imp
#

What does it do

gentle plinth
#

it generates a website

#

with two random models

hollow imp
gentle plinth
#

and you have to say which one is better

gentle plinth
tired herald
#

whats even going on here

gentle plinth
#

the purple name is just a role i picked xD

hollow imp
indigo hazel
gentle plinth
hollow imp
gentle plinth
#

for larger projects of course it can get to its limits

hollow imp
#

I'mma go play with it

#

What should I say but

gentle plinth
stray aspen
#

qwen sucks lmao lol rofl

hollow imp
#

Where is direct chat

#

In webdevarenn

stray aspen
#

there isnt

atomic stream
#

Qwen is good over deepseek?

stray aspen
#

not at coding

hollow imp
stray aspen
#

but the image model from qwen is great

atomic stream
#

What are your thoughts on Kimi AI?

stray aspen
#

it sucks

#

its only good for writing

#

they need to make a reasoning version

hollow imp
stray aspen
hollow imp
whole wagon
#

@echo aurora is it possible to add different reasoning efforts for gpt models? Because we don't get high in the chatGPT app

hollow imp
bronze urchin
#

Do you have a good explanation for prompt injection these days? GPT5 compatible?

stray aspen
#

idk

#

i dont use search

hollow imp
#

Why

#

Mf pays for openai subscription

#

😡

stray aspen
#

because no

#

im not doing any reasearch

bronze urchin
stray aspen
#

mostly coding

hollow imp
#

@stray aspen why don't you compare lmarena gpt 5 high and yupp ai

hollow imp
atomic stream
#

Does Lmarea have any limits aside from opus?

glass gulch
stray aspen
#

lmarena is greater than yupp ai regarding usage of gpt-5 high

errant cave
#

What a crash holy crap

stray aspen
#

holy sigma

hollow imp
#

@gentle plinth helloo

gentle plinth
#

seems like gpt-oss-120b is around o3-mini level for hard prompts

regal river
#

GPT 4o better than GPT 5 wtf

proud yoke
#

How did they manage to make 5 worse than 4o 😭

quiet dust
#

Is it possible to switch between Auto, Fast and Thinking models on the phone in ChatGPT?

proud yoke
#

Is it even cheaper to run?

glass gulch
#

How recent is the leaderboard for the rest of you

errant cave
#

I think it's likely that OpenAI and lmarena came to some kind of agreement to make GPT-5 look better than it actually was on release

hollow imp
stray aspen
#

new leaderboard

glass gulch
#

i cant tell if the announcement implies it changed rn lol

modest prism
errant cave
#

Or maybe OpenAI just deceived lmarena before release

gusty helm
#

gemini still top of pack 😄 ?

errant cave
quiet dust
#

Guys. Is it possible to switch between Auto, Fast and Thinking models on the phone in ChatGPT?

stray aspen
#

it was trash

#

now its way better

tired herald
#

still trash

errant cave
#

Still sucks

gentle plinth
stray aspen
#

no

#

its great

#

its SotA

tired herald
#

golden trash is still trash

hollow imp
#

GPT-3o

stray aspen
#

lol

hollow imp
#

SOTA

errant cave
#

I like GPT-5 clapping back against stupid ideas in contrast to 4o

#

That's like the only thing it's better at though

atomic stream
tired herald
#

how interesting

vapid zinc
#

i dont understand how ppl think 2.5 pro is so good

stray aspen
#

@tired heraldhow much are eyou gonna sell the plugin for

modest prism
hollow imp
#

Such a scam

stray aspen
hollow imp
#

Gives you credits for using the ai

tired herald
stray aspen
#

but gpt-5 is way better

tired herald
#

prob on github

hollow imp
#

Does anyone use webdev arena

vapid zinc
stray aspen
#

stop yapping

hollow imp
#

It is

stray aspen
#

but right now lmarena is way better

hollow imp
#

100%

hollow imp
#

There's no better worse in 2+2

atomic stream
hollow imp
#

@gentle plinth Vro hello

gentle plinth
#

stop randomly tagging me

tired herald
#

ups, LMArena doesnt like this at all

modest prism
gentle plinth
#

or i will block

vapid zinc
stray aspen
#

i love how gemini 2.5 pro preview versions were better than the final lol

echo aurora
hollow imp
gentle plinth
#

its not possible

#

currently

#

use the prompt i said in direct chat

hollow imp
#

No one has done it

gentle plinth
#

then you can

hollow imp
gentle plinth
#

but as i said it will probably not work out of the box

#

ask ai to help you run it

hollow imp
#

Wdym

whole wagon
#

I want to see where the gpt I actually use in chat ranks

modest prism
atomic stream
errant cave
echo aurora
errant cave
#

Fifth place at 1427 ELO

stray aspen
#

i have only run into limits with claude 4.1 opus on lmarena

atomic stream
stray aspen
whole wagon
stray aspen
#

not sure about the rest

#

havent run into limits

whole wagon
#

They just don't use the high reasoning

coarse glade
#

You guys were lying in gpt 5 it really is gpt 4 you guys are lying to us

#

It’s really a bad thing guys

#

I have proof

whole wagon
#

Bruh

coarse glade
#

Look at this

#

And each gpt 5 model says this

hollow imp
coarse glade
#

See you guys were lying

stray aspen
stray aspen
#

theres no system prompt and the api doesnt know what it is