#general


wicked talon
#

Also how would you rate apple playground for image making

ocean vortex
#

Well yeah like the main AI sub. Since their limits are very reasonable and their platform is very feature rich

wicked talon
#

It uses different models but how would you rate it still

wicked talon
#

I wonder when apple will integrate Gemini

#

Also s26 agentic ai

#

Seems like a security flaw

ocean vortex
#

not extremely useful

willow seal
wicked talon
#

What iOS are we on

#

iOS 26 probably will come out on iOS 27 ish

ocean vortex
#

Apple's thing isn't anywhere near as good as the best alternatives. It can forward requests to chatgpt but at that point might as well just use chatgpt app

wicked talon
#

Why couldn't they try to integrate it to just do it anyway

willow seal
ocean vortex
wicked talon
ocean vortex
#

You almost never have issues with apps not working on an iPhone, or glitching out, or having to deal with bloatware. Everything just works from the first try properly

willow seal
wicked talon
ocean vortex
wicked talon
#

A lot of the main Galaxy AI features use Gemini Nano / AICore

wicked talon
#

As it's open source

#

All Google can do is probably not allow Google pre installed apps

#

I think they did that with Huawei

ocean vortex
#

Which would annul their main advantage. Not too great. Wouldn't want for them to do this even though I'm not using Android. It is supposed to be open OS...

wicked talon
#

An account security program was restricting sideloading

#

I would switch to grapheneos if they block side loading

#

Or install surveillance

#

Which UK government is trying to push

wicked talon
#

And side loading

paper vortex
#

Hello

loud verge
#

I keep it off.

wicked talon
loud verge
#

Because it always flags mod apps.

#

Always.

wicked talon
#

And I've installed many modded apps

loud verge
#

Hmm...

wicked talon
#

Can't talk about piracy :/

loud verge
#

Noted.

wicked talon
#

Against discord tos

#

I've only ever got flagged on my old s9

#

Which gave false flags on almost every app

rich spruce
#

Hi, does anyone have a Sora 2 code? I already have an account. I'm looking for a code

stray aspen
#

no

#

go to openai server

rich spruce
# stray aspen no

I can't get in, does anyone have a solution? I've correctly set up my surfshark VPN

stray aspen
#

i would give you one

#

but last time i did i got banned for 'scamming'

wicked talon
#

From you entering it by discords anti ddos

spring kelp
#

What's the best ai for coding?

rich spruce
stray aspen
#

nah

#

i sent the code here

#

and they banned me

#

lmao

#

@rich spruce

wicked talon
#

You would have to turn it off

rich spruce
rich spruce
#

I'm looking for a code; I came across a YouTube video on this Discord link

rich spruce
spring kelp
stray aspen
quiet skiff
#

They added the direct mode battles back, right? Haha

stray aspen
#

really?

ashen mauve
#

anyone got any ideas for a good rp?

hollow ivy
# ashen mauve anyone got any ideas for a good rp?

roleplaying game with an AI as GM?
I can recommend a sandbox game:
You discover a hidden, ancient alien spaceship, while traveling/exploring antarctica. The spaceship has an AGI/ASI in it, which offers you to become its new (biological) pilot.
Was fun to play with Gemini.

#

Or you play an adventure, where you inherit a time-machine from your reclusive, genius uncle.

#

Claude also is really good in these games.

#

A third game: your character discovers an anomalously large spider in your house (attic, bathroom, or whatever)

#

-# third scenario not recommended for arachnophobic people ^^

sinful thorn
#

Everything but Video in direct chat 💔

wheat onyx
gleaming roost
#

This is the third time Codex has reset my weekly quota to 100%. 🤔
Should I be worried?

#

Oh, now it makes sense.

wicked talon
#

Why is 900 people watching the itv logo bro 😭

lucid geyser
#

gpt 5.4 on arena but it sucks

fierce kelp
#

You shocked?

fickle venture
frosty lava
#

i don't see it on arena

lucid geyser
echo aurora
frosty lava
#

5.4

lucid geyser
#

and on lm arena it says its openai

#

its like the instant or low thinking version though

#

so it sucks

frosty lava
#

okay i think you can be right but it might not be gpt at all too

#

wait for the real one

#

better to make a real opinion

lucid geyser
#

to be gpt

frosty lava
#

if it did worse than gpt 5.3 codex then it can't be the thinking one

lucid geyser
#

its pretty bad, just try it yourself

frosty lava
#

cause it wouldn't make sense at all

#

i told you what it could be, but anyway we don't know since it's a hidden one

fervent plank
#

It is extremely direct and kinda funny. No fluff

#

But knowledge cutoff in july seems wrong

#

Should be something like december

frosty lava
#

or directly 5.4 but without thinking

short sequoia
sly raven
#

Quick question: why are certain models (like Gemini 3.1) not receiving new votes anymore? It’s been a long time since I've seen the numbers move. Is this intentional?

tired crypt
#

hey i wanna know something, how do people get this notification?

thorny schooner
#

Is anyone else having this issue now where it just gives me an error message repeatedly, even after I logged out and everything? And no, I don't have a VPN on, I already turned it off

languid thunder
vernal storm
ashen mauve
#

well gang, the A/B testing in direct chat is broken. whatever you do, DON'T click skip or your chat is going to get bricked

lofty frigate
#

Okay look, I don't know if I'm the only one having this issue. When I use image generation with the nano banana pro model and pass in an image of myself with a prompt to include another figure next to me, it does create the other figure, but for some reason it doesn't generate me. It generates someone who is wearing the same clothes as me, but isn't me. Is anyone else experiencing this? If I hit retry, after a couple of tries it does generate me, but many times it generates a random person wearing the same clothes as me

vital cove
lofty river
#

anyone know why its not letting me reset password ?

hardy lion
lucid geyser
undone saffron
sinful thorn
#

Everything but video in direct chat😭😭🙏🙏

rigid holly
#

Ayo is it just me or does it demand for you to log in your account to use it?

crisp mauve
#

/voice

rigid holly
#

You'd think they'd make an announcement or something if they made it so you had to be logged in to use arena

tawny brook
#

i forgot the prompt, but i ran this again and look what it gave me LMAO

bright shard
keen beacon
#

Why can't I send a prompt in lmarena in incognito

pastel plaza
keen beacon
echo aurora
#

Why not

keen beacon
#

i'll show you

#

why i cant do this

#

thats incognito btw

echo aurora
keen beacon
#

why?

#

im doing a text

echo aurora
# keen beacon why?

I’m not sure. Was able to make it work without having to sign in. Can you post a bug in #1343291835845578853 ? Add the relevant info. I can’t take a look now, but will later.

glacial swan
#

Guys, does anyone know why in Claude 4.6 I can’t upload files? Screenshots and things like that would be useful, but for some reason it doesn’t work in LMArena.

undone saffron
undone saffron
# echo aurora Have Video Arena selected?

Why is it that at a certain hour, when using AI to create a video, the captcha system is bugged and cannot be passed?
Then it won't let me write in code-arena or direct chat because the captcha ruined my previous captcha token

native folio
light sleet
#

finally u can stop responses, seen in canary arena

rare veldt
#

hello

runic rain
#

Hi

inland quest
#

Absolute Cinema

#

Stop button

inland fossil
#

Hello

whole sundial
#

<@&1349916362595635286>

#

crypto scam

hollow mulch
#

My main account got the stop generation button, but my other account still doesn't have that button :(

wicked sage
#

:LFG:

hollow mulch
#

Well, no more infinite generation, yay 😄

#

Gpt 5.3 code vs opus 6 who is better guy?

manic moss
#

How long do we have to wait to get new tokens for Claude AI free? Or is it "one shot"? mb I'm new

shrewd citrus
#

Is opus 4-6-search broken?

hollow mulch
#

Now can you guys fix the issue where a model sometimes can't be used and reports 'something went wrong with this message, please try again'

distant idol
#

@echo aurora when yall making the video arena a direct use or even side by side, its been months since i asked this 😭

hollow mulch
#

@echo aurora Is he developer or discord mod?

distant idol
#

king

bright shard
#

It's a bug; if you delete the cookies and try again, the login message doesn't appear. It only appears sometimes, so I imagine it's a bug.

fervent plank
#

from openai

rich spruce
ocean vortex
frosty lava
#

that's probably an instant or normal model

#

so it will be for free user too

#

for sure

fervent plank
fervent plank
frosty lava
#

honestly all i care about is thinking model capabilities i don't need another instant model or low cost

fervent plank
light scroll
#

Has Arena updated something? Previously I was able to generate 3 images per day with no account

ocean vortex
tiny jolt
#

.txt support for arena when?

humble ether
#

does generating images need a login first now?

tame haven
#

I can't chat anymore without login. Is it time to leave lmarena?

humble ether
#

maybe

fervent plank
wicked talon
#

I like how qwen can image create anything except 18+ content

wicked talon
#

Nvm I hate it it doesn't make Donald trump in a I hate America shirt

burnt sinew
#

I dont know but id rather just pick opus 4.6 every time

#

And to have router show me what model its using

tired mantle
#

so annoying

ocean vortex
tired mantle
#

it's broken

ocean vortex
#

As you can see not a single emoji anywhere

#

No added sys/dev instructions here, default settings and a simple question with OpenAI's search enabled.

ocean vortex
# shrewd citrus

You have no more code_execution tool calls remaining this turn

errant inlet
#
fickle venture
tired mantle
fickle venture
tired mantle
fickle venture
fickle venture
#

Here it is

sharp mirage
#

Yo

#

Hi chat

languid kernel
#

hi new here

sharp mirage
#

Hi

#

Wsg

#

The chat died

#

Yoooo. Chat

languid kernel
#

yo folks, where is video arena channel? pls anyone can guide me to it

unreal tide
#

holy plot twist

mortal vale
#

@steep blaze Note that Video Arena has been removed from the server. More information can be found in this #announcements . You can still generate videos on the website.

vital lake
#

GPT 5.4 LAUNCHED
EVERYONE

#

EVERYONE

#

5.4 LAUNCHED

frosty lava
#

yesssss its real

vital lake
#

Someone addd ittttt

cloud zinc
vital lake
cloud zinc
frosty lava
#

okay so they are saying its better than 5.3 codex in coding too so now i can't wait for 5.4 codex to see even more improvement in coding task

vital lake
#

It doesnt seem like they focused on intelligence this time

#

Like they did with 5.2

frosty lava
#

but for sure they'll release the codex version if what you mean by intelligence is coding capabilities

vital lake
#

No, general intelligence

#

Not coding

#

I value a smarter assistant more

#

hopefully 5.4 isnt a disappointment

frosty lava
shell pewter
#

its not on arena yet right?

frosty lava
#

still should be smarter

vital lake
vital lake
shell pewter
frosty lava
crisp anvil
#

Mine isn't working

vital lake
#

Or open new tab

shell pewter
# vital lake It is

have you (or anyone else who has used AI a lot) tested gpt 5.4/high? how does it compare to other tier 1 AI?

frosty lava
#

knowledge cutoff seems to be august 2025

frosty lava
shell pewter
# crisp anvil Mine isn't working

1. saw on X that OpenAI was already experiencing errors before the launch (lol)
2. OpenAI just released it, so their own servers must be getting a high volume of traffic -> slow
3. same thing for arena itself

frosty lava
# vital lake

actually every model fails at this question due to knowledge cutoff and what they've been trained most on

#

ask gemini to guess it will do same and same for anthropic

stray aspen
#

gpt 5.4 is out

vital lake
#

So a guess like GPT 5 era is good

vital lake
frosty lava
#

yes its good that they are saying gpt 5

#

its not that old

vital lake
#

It talks so natural and smart

frosty lava
#

i guess its good since everyone complained about it

loud verge
#

Guys

compact flame
#

Well gpt 5.4

#

Is it any good?

loud verge
#

Was there no announcement for gpt 5.4 on arena?

frosty lava
stray aspen
#

why does gpt 5.4 not answer

loud verge
#

Chad model

wind ember
frosty lava
#

should be better or as good as 5.3 codex on coding task

wind ember
#

still same coding slops

vital lake
#

In more ways than one

wind ember
#

well its def better than gpt 5.2

#

but not that much diff with gpt 5.3

compact flame
#

I wonder if we should expect gpt 5.5 sometime later

frosty lava
#

the pro version is beating gemini deep think right ?

stray aspen
#

gpt 5.4 sucks

#

we need gpt 6

fathom apex
#

yea

#

gpt sucks

frosty lava
#

So now openai are doing monthly release instead of two month ?

inner relic
#

Is gpt 5.4 good in roleplay

stray aspen
#

it sucks

vital lake
frosty lava
vital lake
stray aspen
#

gpt 5.4 is literally so ass

honest verge
#

Gpt 5.5 is going to come out April 5

stray aspen
#

ill just go back to claude 4.6 for the 50th time this month

frosty lava
vital lake
royal sail
#

lol

stray aspen
honest verge
#

And gpt 5.6 may 4

inner relic
#

bruh

frosty lava
stray aspen
light sleet
#

Gpt 5.4 is good in lua coding 💀

royal sail
#

The model is quite better conversationally

honest verge
#

and maybe gpt 5.7 early-mid june

royal sail
#

It sounds a lot less AI-ish

inner relic
frosty lava
#

it talk in a better way

honest verge
#

When gpt 6 will come out

stray aspen
#

lets get gpt 6 already

#

no more gpt 5.x crap

honest verge
#

Because at this point it's only 5.x

royal sail
#

they're scared to call anything gpt 6 right now lol

#

I doubt they have anything worth calling gpt 6

frosty lava
#

knowing every other companies are doing the same

tawny canyon
#

what do u think ,will gpt 5.4 take the first place in arena?

stray aspen
#

i literally just said that i tested it

#

bro what the hell is this

#

even glm is better

#

glory to anthropic

royal sail
#

especially with frontend

honest verge
#

High version is out

vital lake
royal sail
#

I pretty much just imagine this model to be the general-use version of 5.3 codex

stray aspen
#

claude is way better

frosty lava
#

this one is made for general purpose task.

inner relic
#

Let's test creative writing

honest verge
#

But no xhigh...

frosty lava
#

mainly but still good at coding

inner relic
#

since chatgpt messed up earlier

honest verge
#

I wonder if arena will ever have xhigh

compact flame
honest verge
#

Or it's too expensive

compact flame
vital lake
royal sail
stray aspen
royal sail
#

the model has barely been out for an hour lol

stray aspen
royal sail
#

on your 3 prompts?

stray aspen
#

yes

vital lake
royal sail
#

congrats

#

we should hire you for benchmarking!

stray aspen
#

i test each model for the stuff i do

#

if its not good then its trash for me cause i wont give any other use

honest verge
#

Waiting for extreme thinking

inner relic
#

All models are not perfect yet

#

that's reality

honest verge
#

But I think it's not true

#

Cuz it's still not there

frosty lava
vital lake
stray aspen
vital lake
#

But I think extreme thinking is a fake leak

royal sail
vital lake
stray aspen
#

gemini 3 deepthink on arena when

frosty lava
#

atleast before judging coding capabilities wait for the coding model lol

royal sail
#

Most benchmarks run hundreds of prompts

#

You ran 3.

frosty lava
stray aspen
#

well that sucks

frosty lava
#

just like we ain't ever getting gpt 5.4 pro on arena

stray aspen
compact flame
stray aspen
#

they had a 5.2 pro

#

i guess it was fake

honest verge
#

And also all leaks about open source gpt are fake ...

frosty lava
#

i mean its from the highest subscription

honest verge
#

Really hoped openai will release gpt oss 2 or something

compact flame
#

Yeh

frosty lava
#

so it can't be on arena

royal sail
royal sail
#

especially with how they're doing in the public eye right now

stray aspen
#

people want deepseek 4

tawny canyon
#

i did some tests and in general texting gpt 5.4 is not better compared to claude opus 4.6 or even gemini 3.1

tawny canyon
tawny canyon
vital lake
honest verge
#

Is o3 pro better than gpt 5.4?💀

royal sail
#

oh hell naw

stray aspen
#

o3 pro sucks

vital lake
stray aspen
#

gpt 5.4 is miles better

royal sail
#

bro talkin bout ancient technologies in the big 26

#

o3 is the past

vital lake
#

Since post RL was new

honest verge
vital lake
royal sail
#

Some labs and companies still use it

honest verge
#

o3 pro and o3 are still live

royal sail
#

but there are much better options now

vital lake
royal sail
#

and cheaper too

honest verge
#

And openai won't remove them until 2027

vital lake
royal sail
#

there's no real reason to use o3 right now unless you REALLY like the model for your use case

honest verge
royal sail
#

there are more effective options

honest verge
#

They are deprecating gpt 5.1 but not o3

vital lake
honest verge
#

Like why removing 5.1

#

But not o3

royal sail
#

One of my personal favorite models

#

just far too expensive

#

also hallucinated everything lol

vital lake
stray aspen
honest verge
#

Btw what's the point of using o3 pro

#

Like is it better?

vital lake
royal sail
vital lake
#

Who remembers o1 pro pricing? 😭

royal sail
#

yeah lol

vital lake
#

I never saw any big improvements, but maybe I wasnt using it right

frosty lava
#

that mean every month we are getting a new gpt its insane

tawny canyon
#

will gpt5.4 search mode be added to arena?

vital lake
#

Obviously

royal sail
#

I feel like OpenAI is gonna have to release something extraordinary to get back into the race

#

feels like they're falling off frontier

stray aspen
#

glory to anthropic

frosty lava
royal sail
#

well

tawny canyon
royal sail
#

4o put thousands of people into psychosis so

frosty lava
#

and people want it back

stray aspen
honest verge
#

But in early 2025 it was peak

thorny schooner
#

I still hate that they for some reason added battle mode directly to direct mode

honest verge
#

There was no better ai writing

royal sail
#

I will agree that there was something unique about 4o in the sense that it just had less refusals and felt less corporate

stray aspen
frosty lava
royal sail
#

but it was just too sycophantic

frosty lava
#

they want the model to agree with everything they say that's it

royal sail
#

yeah pretty much

honest verge
royal sail
#

because if a model agrees with everything i say then it must be the best model

#

/s

honest verge
#

Not in chatgpt but it's live

stray aspen
frosty lava
honest verge
#

So what's the point of #keep4o

royal sail
#

that's why they still want it in the app

frosty lava
#

i don't wanna be in a world where 80% of people are in psychosis due to an ai

#

please

#

use ai for coding task or general purpose

distant idol
#

any news on seedance 2 api guys

stray aspen
light sleet
#

Gpt 5.4 is the best at lua code?

distant idol
mossy girder
#

Why is the UI scrolling down forcefully after the response is done? It wasn't like that before, it's irritating.

stray aspen
#

its horrible

#

claude is better

light sleet
stray aspen
#

claudius

light sleet
#

k

royal sail
#

bro

stray aspen
#

@daring rock

distant idol
#

what are u doing nephew

honest verge
#

What's the point of paying someone 5$

#

When you can just use ai

royal sail
#

I swear these guys just ask a model "how can i make money fast with ai" and do whatever it says

stray aspen
honest verge
#

But still funny

distant idol
#

how to make money fast

honest verge
#

Sleeping for 30 minutes is enough

stray aspen
honest verge
#

So you will make money

#

Then sell it again

distant idol
honest verge
#

And you will make money again

#

Do this forever

distant idol
honest verge
#

And you will get new money

honest verge
#

What's this...

#

Concord grape?

#

Mammoth newt 0226

frosty lava
honest verge
#

Yupp ai

stray aspen
honest verge
vital lake
#

5.4 seems smart

frosty lava
#

personal opinion but i love how gpt 5.4 talk

distant idol
vital lake
honest verge
#

Wait mercury 2 is out?

distant idol
honest verge
#

And it's after gpt 5.4

frosty lava
vital lake
#

Somethin new idk

distant idol
honest verge
meager harbor
#

gpt 5.4 is worse than 5.2 latest

#

on arena

frosty lava
honest verge
#

Also qwen image 2.0 and pro is out!

distant idol
honest verge
#

Crazy

inner relic
#

They are focused on coding skill not creativity

vital lake
frosty lava
#

this one is for general purpose

inner relic
#

Ok

#

Anyways.. My one question is

honest verge
#

Gpt 5.4 pro pricing is crazy

inner relic
#

Is Chatgpt 5.4 good at writing and creativity?

vital lake
honest verge
#

Like it's too expensive

#

It's unusable for normal people

frosty lava
vital lake
vital lake
royal sail
honest verge
distant idol
#

wharr

honest verge
#

No one will use it

royal sail
#

yeah okay

vital lake
frosty lava
distant idol
vital lake
#

I wish

inner relic
#

not bad not bad.

honest verge
#

Arc agi 1 even existed?

#

Or every model just scores 100% now?

#

In it

royal sail
#

yes

#

it did exist but it was saturated

distant idol
royal sail
honest verge
#

Arc agi 3 coming soon

royal sail
#

🤦‍♂️

distant idol
#

folk

compact flame
royal sail
#

Like, counting letters in a word is something us humans do fine because we have a method of doing it

#

LLMs don't work letter-by-letter

inner relic
#

lol

distant idol
royal sail
#

Not a hallucination problem

#

It's just how tokenization works

#

the model doesn't see every letter in the word

#

it just sees the probability of that specific token happening
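
A rough way to picture that, as a toy sketch only: the vocabulary and the greedy longest-match splitter below are made up for illustration and are not any real model's tokenizer, and "strawberry" is just an example word.

```python
# Toy vocabulary: hypothetical, NOT taken from any real model's tokenizer.
TOY_VOCAB = {"straw", "berry", "st", "raw", "b", "e", "r", "y", "s", "t", "a", "w"}


def toy_tokenize(word, vocab=TOY_VOCAB):
    """Greedy longest-match split of `word` into pieces found in `vocab`."""
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest remaining substring first, then shrink it.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            # Nothing matched: fall back to a single-character token.
            tokens.append(word[i])
            i += 1
    return tokens


word = "strawberry"
print(list(word))          # what a human reads: 10 separate letters
print(toy_tokenize(word))  # what this toy "model" gets: ['straw', 'berry']
```

So in this toy setup the model would receive 2 tokens instead of 10 letters, which is why letter counting is awkward for it.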

meager harbor
royal sail
#

I mean, the model just came out

#

I'd give the score time to settle

frosty lava
#

you didn't see the warning ?

distant idol
#

calm down saul goodman

honest verge
#

WHERE IS DEEPSEEK V4

#

Deepseek is already miles away

meager harbor
meager harbor
distant idol
royal sail
#

trust

#

my source is me

meager harbor
honest verge
#

Are you ....

#

LOL

#

lol

royal sail
inner relic
#

Yep

meager harbor
sleek phoenix
#

5.8 in the next month

frosty lava
# meager harbor

let's ignore the fact that it just released and the warning "preliminary"

honest verge
royal sail
#

The current leaderboard position is irrelevant solely because of the fact that the model has only 2,000 votes as of now.

honest verge
#

IM TIRED OF 5.X

sleek phoenix
royal sail
#

That's not nearly enough to make a conclusion

honest verge
#

I want next generation

meager harbor
royal sail
#

Opus 4.5 has 30,000+
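
Rough back-of-the-envelope for why the vote count matters, as a sketch only. It assumes each vote is an independent coin flip with a true win rate near 50% and uses the normal approximation for a 95% interval. Arena's actual rating method is more involved than this, and the function name `win_rate_margin` is made up here.

```python
import math


def win_rate_margin(votes, win_rate=0.5, z=1.96):
    """Approximate 95% margin of error for a win rate estimated from `votes` votes."""
    if votes <= 0:
        raise ValueError("votes must be positive")
    # Standard error of a proportion, scaled by the z-score for ~95% coverage.
    return z * math.sqrt(win_rate * (1.0 - win_rate) / votes)


for votes in (2_000, 30_000):
    margin = win_rate_margin(votes)
    print(f"{votes:>6} votes: win rate known to about +/- {margin * 100:.1f} percentage points")
```

Under those assumptions the margin shrinks with the square root of the vote count, so 15x the votes gives roughly a 4x tighter estimate.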

sleek phoenix
meager harbor
royal sail
royal sail
honest verge
#

gpt 5.4 is o5

distant idol
#

damn 💔

royal sail
#

Models can do this pretty easily if you just give it terminal access

#

they'll just use a command
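
Something like this, for example. It's a minimal Python sketch of the kind of thing a model could run through a tool. The function name `count_letter` and the example word are just illustrations, not what any particular model actually executes.

```python
from collections import Counter


def count_letter(word, letter):
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())


print(count_letter("Strawberry", "r"))  # 3

# Full per-letter tally, in case more than one letter is of interest.
print(Counter("strawberry"))
```

The code works on the actual characters, so it doesn't matter how the word was tokenized.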

frosty lava
royal sail
#

Same reason why LLMs can't do long form arithmetic without a calculator

meager harbor
royal sail
#

My fault

#

Didn't know you had experience

meager harbor
distant idol
#

experience 💥

honest verge
#

Wait why gpt 5.4 is only available for text?

#

No coding?

royal sail
#

you are not special

distant idol
meager harbor
#

no, it's just that you not understanding a simple thing is ragebaiting me

vital lake
honest verge
royal sail
#

I'd rather have that than "arena champion"

distant idol
#

down*

ocean vortex
royal sail
#

I mainly bring it up because I think we're jumping to the conclusion that this model is "worse than 5.2" way too quickly

#

Once the score settles, it's probably fair to determine if the model is good or not

tawny canyon
#

guys when will deepseek v4 be released?

honest verge
#

WHERE IS DEEPSEEK V4

inner relic
#

I dont know what whale is doing

royal sail
honest verge
#

IM TIRED

inner relic
#

They're not afraid of gemini,claude and openai

honest verge
#

I NEED DEEPSEEK V4

frosty lava
#

deepseek waited for gpt release to beat them lol

#

joke

distant idol
tawny canyon
meager harbor
ocean vortex
royal sail
frosty lava
royal sail
#

if anything people were saying it was incremental lol

royal sail
meager harbor
distant idol
royal sail
#

nobody here thought it was a huge leap

honest verge
#

Or they are getting destroyed

frosty lava
ocean vortex
distant idol
royal sail
#

hence why i take the scores with a grain of salt and try to supplement my opinion with other benchmarks as well

honest verge
#

WE NEED A PIECE OF OPEN SOURCE!

ocean vortex
royal sail
#

Eh, I'd say misleading in some cases. Llama 4 topped the leaderboard at some point.

royal sail
#

dead in the back of an alley

honest verge
#

I haven't heard anything about it

ocean vortex
honest verge
#

They just discounted it?

distant idol
#

when is arena-video gonna be a direct chat 💔

honest verge
#

For some hours

#

But it got deleted

distant idol
honest verge
#

And now it's only for battle arena

distant idol
#

ive been waiting for months

#

the site could really be 10/10 if they js did that

frosty lava
#

my dream would be deepseek v4 #1 in capabilities from all existing model till now but it's impossible right ?

ocean vortex
#

Much of it is just style that is technically meaningless and easily changeable, and some of it is also just patterns... Patterns from the most active users on arena, what prompts they use and how they vote

honest verge
#

Gpt 5.4 is very bad for coding in arena

#

Like it's so bad

honest verge
#

I can't

#

Even Gemini 3 beats it

#

It's worse than gpt 5.1

#

It can't do anything

royal sail
#

It's supposed to be 5.3 Codex level

frosty lava
#

why in the coding i only see the medium version ?

#

on arena

honest verge
ocean vortex
frosty lava
#

don't do front-end using gpt

royal sail
honest verge
#

Opus 4.6 is still my king

#

I'm waiting for opus 5 so hard

frosty lava
#

how can we explain that opus 4.6 has good "taste" in front-end but other companies' models don't ?

honest verge
frosty lava
#

so how do you explain it

ocean vortex
frosty lava
#

the same gpt 5.3 codex on front end is bad

honest verge
#

It's capable of ambitious work

frosty lava
#

yeah i guess that make sense

honest verge
#

As anthropic says

frosty lava
#

so openai need to work on this

#

its not about coding capabilities honestly but about the taste it has

royal sail
#

never really got good UI outputs from GPT models

frosty lava
honest verge
#

While opus 4.6 somehow makes it alive and beautiful

ocean vortex
# frosty lava so openai need to work on this

to have good visuals you ideally need a bigger model. So Opus and Google gonna have natural advantage there. And if OpenAI made it bigger they wouldn't be able to sustain current caps on chatgpt. It's a reasonable trade-off. They used to be struggling considerably more with this before gpt4.1 and subsequent models

honest verge
#

What's the secret

frosty lava
#

its capable of implementing features and doing very great things like that but don't have any good visuals

frosty lava
royal sail
ocean vortex
vital lake
#

Guys 5.4 is so good at creative writing

royal sail
#

Size is only really important for world knowledge and superposition (relating novel concepts to each other)

vital lake
#

Like way better than Gemini and Opus

hushed gyro
#

WTF is 5.4???

devout vault
#

openai will never make good models. again

ocean vortex
honest verge
crude lagoon
frosty lava
#

but the reason why i use gpt 5.3 codex instead of opus 4.6 is that its doing better job on every other thing than front end

royal sail
#

Theo.t3 made a video on the whole thing

round forge
honest verge
#

While opus 4.6 requires max subscription

#

Pro isn't enough

frosty lava
inner relic
inner relic
vital lake
ocean vortex
vital lake
#

Its creepily natural.

#

OpenAI 100% focused on this aspect more

royal sail
#

5.3 codex loves doing it

#

5.3 codex (left), 5.4 (right)

honest verge
#

Were gpt o models ever really good?

frosty lava
#

but i love the 1m context

distant idol
#

guys which model has the most context

royal sail
zealous sparrow
royal sail
#

Not amazing at coding

#

But they were frontier

honest verge
#

And where's o2?

zealous sparrow
royal sail
vital lake
zealous sparrow
#

I did my tests

vital lake
royal sail
#

oh really?

ocean vortex
vital lake
#

Yeah some company name

frosty lava
#

Chat gpt is bad at making good visuals

weak dagger
#

@alpine pasture i like cherrys

royal sail
#

just basing off my usage and general consensus back then

distant idol
honest verge
#

Missing

honest verge
weak dagger
distant idol
ocean vortex
#

no 5.4 on chatgpt?

#

😠

weak dagger
#

RIP chicken

ocean vortex
#

this rollout thing is kinda slow recently

#

used to be near instant

royal sail
ocean vortex
#

Didn't even have 5.3 until now

royal sail
#

i do remember gpt 5 taking some time to roll out

honest verge
#

Gemini namings are so done 🥀

royal sail
#

even though it was supposed to be huge release

frosty lava
#

there is no normal 5.3

royal sail
#

every company is terrible at naming models except kimi and minimax

#

and deepseek ngl

#

DeepSeek's naming feels the most straightforward

distant idol
proud bobcat
#

lmao gpt 5.4

#

gpt 5.3 hadnt even fully released yet

#

😭

frosty lava
#

only the instant version

ocean vortex
weak dagger
#

Qwens video generator is just so ahh 😭💔 its like veo 3

proud bobcat
#

IM CRYING HOW DID IT FALL BEHIND 5.2 CHAT

#

IM CRINE

frosty lava
distant idol
inner relic
#

just test

proud bobcat
#

imagine its like gonna be a point ahead

#

😭

royal sail
proud bobcat
hazy marlin
frosty lava
inner relic
#

yep

distant idol
hazy marlin
#

the only reason I use it and not other ones is because it actually edits the code instead of rewriting it entirely every time

ocean vortex
frosty lava
proud bobcat
#

nevermind

#

solid

#

all gpt models get choked up on this

royal sail
#

LLMs don't work in letters

inner relic
distant idol
proud bobcat
#

lmao

honest verge
#

Gemini 3.1 0326 flash lite Omni pro preview high flash TTS image veo

proud bobcat
#

hold on i gotta see ts

hazy marlin
proud bobcat
#

i wonder about its coding

#

hold on

#

i wonder if they fixed the schizo code of gpt 5.2 and 5.3

distant idol
honest verge
#

TECHNOLOGIA

#

TECHNOLOGIA

frosty lava
#

when do you think they'll release the codex version

#

5.4 codex

distant idol
#

yes technologia

royal sail
#

source: idk

distant idol
honest verge
#

Source :Slam Altman

royal sail
#

do not slam the altman

proud bobcat
#

scam altman

distant idol
honest verge
#

FINALLY

#

This is what I needed

#

MISTRAL VIBE CLI

#

FINALLY

#

I WAITED FOR THIS MY ENTIRE LIFE

proud bobcat
#

ah yes

#

mistral cli

#

i needed this

#

truly

#

(mistral sucks)

distant idol
honest verge
#

Wonder if this is even better than Gemini 2.5

#

I'll test it

proud bobcat
frosty lava
#

why only gpt 5.4 medium on coding arena

#

i want the high version atleast

signal pelican
#

lol.... 5.4 is behind 5.2?

proud bobcat
#

yeah im confused too

proud bobcat
frosty lava
distant idol
#

give it a day or more

frosty lava
#

no honestly everyone saying its worse than gpt 5.2 is lying

#

that's just not true at all

signal pelican
#

yeah... I've been watching the leaderboard for a while... I don't think 5.4 will catch up that much.

frosty lava
signal pelican
#

oh I know

distant idol
fierce kelp
burnt sinew
frosty lava
#

but you're right, definitely not a huge leap in capabilities

signal pelican
#

I'm sure it will beat 5.2 someday... but I'm talking about others... lol

#

I think they are actually falling behind Gemini and Claude

wraith wren
#

Quick question. Even though gpt 5.4 was out for maybe an hour? Do you think gpt 5.4 is better than opus 4.6 at coding?

fierce kelp
cinder nexus
#

gpt 5.2 still stands strong

frosty lava
#

they clearly have focused on creative writing too this time

frosty lava
#

so 5.4 should do same as 5.3 codex

wraith wren
#

Interesting. So for back end gpt 5.4/5.3 codex is better?

frosty lava
#

for good looking go to opus

#

gpt is bad at making thing look good

honest verge
#

LOL THIS IS GPT 5.3 INSTANT

#

LOL

#

Vs mistral

wraith wren
honest verge
#

MISTRAL is better

royal sail
burnt sinew
honest verge
frosty lava
#

yes i was never able to make a gpt model to do something good looking

cinder nexus
royal sail
distant idol
royal sail
#

The font for mistral is terrible

honest verge
distant idol
#

the font of doom

cinder nexus
distant idol
#

guys stop bullying him

slim gorge
#

why is gpt-5.4 already out??? gpt-5.3 just came out like wtf 😭

hollow ivy
#
Poll: Which is better in coding?
Winner: Claude Opus 4.6 (9 of 15 votes)

distant idol
#

oh

#

👀

honest verge
#

Left is gpt and the right is mistral I can't decide which design for repair station is better

#

(idk what repair station even means)

royal sail
#

Well these two designs are completely different lol

#

not really fair to compare

#

depends what you need

proud bobcat
#

also gpt 5.4 is awful

#

like wow

#

its not

#

impressive at all

honest verge
#

Everybody were like

distant idol
honest verge
#

WOW IT'S GOING TO DESTROY EVERYTHING BEST MODEL EVER!

#

But we got nothing

#

It's not better at all

royal sail
proud bobcat
#

i think openai is just "RANDOM SHI GO"

graceful vortex
#

guys I have to ask one thing can anyone that have a knowledge with arena.ai answer me?

proud bobcat
#

what is it brinks truck sniper

thorny schooner
#

Is it just me, or does this battle mode in direct mode increase the frequency of errors even when using skip? I keep getting stuck in a frozen state where it just tells me an error was made, no matter what I do

honest verge
#

Openai rushed 5.4

#

I think it's clear

vital lake
#

Bru how people only care about coding 😭

graceful vortex
royal sail
#

5.4 does feel great for writing unironically

vital lake
#

Wait for 5.4 codex before judging on coding

royal sail
#

not sure why they didn't talk about how good it is at writing

#

feels a lot better

graceful vortex
#

why is arena.ai free? I mean, aren't the APIs so expensive?

honest verge
vital lake
#

Its creepy how good it is

inner relic
honest verge
#

While 5.3 was supposed to be for writing

inner relic
#

Roleplayers don't want an expensive model

proud bobcat
royal sail
proud bobcat
#

its a worthwhile investment

distant idol
royal sail
#

Top writing model is Opus 4.6 lol

graceful vortex
proud bobcat
#

basically

graceful vortex
thorny schooner
#

Basically money is replaced with data as the cost

proud bobcat
#

arena gives the rankings and companies get the data

graceful vortex
#

and people see the models : "oh wow its a good model"

vital lake
graceful vortex
#

and company won

proud bobcat
#

basically yeah

proud bobcat
#

its defo better than gpt 5.2

honest verge
#

Yes

thorny schooner
#

Basically all three of those, since we all kind of answered at the same time lmao

proud bobcat
#

but its not

vital lake
royal sail
proud bobcat
#

really great

royal sail
#

like

#

in comparison to what?

honest verge
#

But not for what it was made

vital lake
#

Like its pretty clear

honest verge
#

Gpt was supposed to be for coding and agentic tasks