#general


hollow imp
#

@half trail soja vro

wicked root
#

Look at the confidence interval though

eternal niche
half trail
#

I've used it now. I can say it can do some amazing things, but when it comes to functionality it lacks a lot

rapid merlin
#

has been for like 2h in the phone app

keen beacon
#

the CI

half trail
wicked root
hollow imp
ocean vortex
#

hhhhh this table could have looked slightly better. I see why they hid it gptdrawncat

wicked root
molten parcel
#

btw what can u do with gpt 5

keen beacon
molten parcel
#

like?

#

is it more advanced?

eternal niche
#

no

keen beacon
molten parcel
#

dans gemini is way ahead

whole wagon
hollow imp
#

What does this mean

keen beacon
keen beacon
molten parcel
#

is gpt 5 free or does it come with a subscription

keen beacon
eternal niche
#

free

molten parcel
#

ohh

#

nice

eternal niche
#

lmarena limitless

keen beacon
hollow imp
molten parcel
#

mini also?

hollow imp
#

GIMME REASONING 😭

keen beacon
# molten parcel huuh?

It's a limitation for free users. After a certain amount of prompting it will switch to the mini variant

#

cost stuff

hollow imp
#

What does this meannnnn

eternal niche
keen beacon
#

so don't worry

eternal niche
keen beacon
molten parcel
keen beacon
#

did some already

rich comet
#

Does the "remove style control" leaderboard get updated (I ask because I discovered it just now)?

molten parcel
#

huuh

blazing bison
#

and where are the people that said "gpt-5 will accept video inputs"

eternal niche
blazing bison
#

gpt -5 will have a new gen image model

hollow imp
#

Gpt5 not worth then

#

If it's worse than gemini 2.5 pro

blazing bison
#

it is

stray aspen
#

gpt 5 high is live on yupp ai

keen beacon
eternal niche
blazing bison
keen beacon
#

Analyzes them well

hollow imp
molten parcel
eternal niche
blazing bison
#

is gpt-5 after drugs

keen beacon
molten parcel
#

is it like gpt 5 after taking drugs

warm fulcrum
#

didnt legit say gpt5 high was basically summit

blazing bison
molten parcel
#

it's the highest in the room

warm fulcrum
blazing bison
#

where?

warm fulcrum
#

he deleted it but he announced it

hollow imp
#

He who

blazing bison
hollow imp
#

For that is a mannnnnnn what has he got !! !! But him selfffffffff...

warm fulcrum
stray aspen
#

yupp ai has all this

#

and a few more

blazing bison
stray aspen
#

wdym its fake

hollow imp
#

One of his ingrained biases

blazing bison
#

they put gpt-5 low and just change names

hollow imp
#

😔

stray aspen
#

ill try it

warm fulcrum
#

yep its fake

stray aspen
#

and see if its any better

gusty sinew
#

Btw will LMarena ever be taken down

stray aspen
eternal niche
keen beacon
#

The irony!

#

lol

stray aspen
#

yupp ai

hollow imp
blazing bison
gusty sinew
#

Idk if it was legal or not since u using paid versions

hollow imp
#

💀

molten parcel
stray aspen
wicked root
#

What’s remove style control?

#

NO ONE KNOWS WHAT IT DOES

keen beacon
#

This is funnier

eternal niche
wicked root
eternal niche
gusty sinew
#

Well ik it's probably legal but didn't really know why it lets u use paid versions for free

blazing bison
#

lol

hollow imp
blazing bison
#

i don't trust them with a browser

#

lol

hollow imp
hollow imp
keen beacon
#

Perplexity's GPT-5 didn't use thinking. But holy hell, that is a question even Kimi K2 could have solved.

stray aspen
#

the "gpt-5 high" on yupp ai gave me this

hollow imp
#

Better better BETTER

stray aspen
#

better than what lm arena gave me

#

and im not biased

gusty sinew
keen beacon
blazing bison
#

sorry

hollow imp
stray aspen
#

maybe

gusty sinew
keen beacon
hollow imp
#

From getting crushed

#

He won't accept he was fooled

molten parcel
stray aspen
#

i just tested both

keen beacon
eternal niche
keen beacon
#

or have inside info

west maple
#

hey, how can I make it always give me veo3 when I generate a video? I'm creating a bunch of videos and they don't have sound. should I put veo3 in my prompt or?

keen beacon
warm fulcrum
stray aspen
#

this is the one from the lm arena version

west maple
keen beacon
hollow imp
keen beacon
#

to provide info

raven helm
stray aspen
hollow imp
#

Even gpt 5 super duper high won't be able to solve it

#

😂

keen beacon
stray aspen
ornate stump
#

Gpt 5 high is gpt 5 pro?

eternal niche
whole wagon
hollow imp
whole wagon
keen beacon
#

"GPT-5 is the smartest model we've ever done, but the main thing we pushed for is real-world utility and mass accessibility/affordability.

we can release much, much smarter models, and we will, but this is something a billion+ people will benefit from.

(most of the world has only used models like GPT-4o!)" "much, much smarter model" WEN!!!

#

sam altman btw

stray aspen
ocean vortex
#

pro is gpt5-pro

ornate stump
whole wagon
ocean vortex
keen beacon
molten parcel
hollow imp
ocean vortex
stray aspen
#

yes elon musk said he will release grok 5 before 2025 ends

novel flame
#

What the hell kind of half-baked model did they cook up over there with GPT-5? I just ran it through my standard set of tests (which I have used for years and by now must have made it into everyone's training data) and it got somewhere around 3.5-4 out of 5?

For reference, all of these score 5 out of 5: GPT-4.1, GPT-4.5, o1, o3, Claude since Sonnet 3.5, DeepSeek-R1, Grok since v3, Gemini 2.5 (both Pro and Flash), and even GLM-4.5! How did they manage to make GPT-5 worse than o3? Did they overcook it during RL post-training or something? It's worse at instruction following and won't properly format its responses (GPT-5 Mini gives me Markdown at least). Something really weird is happening, because I feel like it has some good intelligence (below expectations but still good), but it's shooting itself in the foot on the simple stuff.

hollow imp
stray aspen
agile bloom
#

OMG GPT-5!!!!

stray aspen
#

they just release

ocean vortex
eternal niche
ocean vortex
#

and totally not worth it

gusty loom
keen beacon
wicked root
warm fulcrum
ocean vortex
#

you get like 8% better performance than non-pro model if that

keen beacon
#

it gets obnoxious

agile bloom
gusty loom
#

Leaderboard is wrong

stray aspen
#

the hype is just a marketing strategy and then they hit us with absolute garbage

keen beacon
ocean vortex
eternal niche
#

reddit sucks btw

keen beacon
ocean vortex
#

Spoiler alert: Models don't degrade after release

stray aspen
hollow imp
hollow imp
whole wagon
# whole wagon

Oh wow the xAI odds started increasing quickly after this kek

#

They spiking

stray aspen
#

ok

rapid merlin
#

my gpt 5 answered that it can't be determined

keen beacon
rapid merlin
#

if it's still that old question

keen beacon
#

lmao

warm fulcrum
#

see if it makes any difference

ocean vortex
hollow imp
warm fulcrum
#

guys

ocean vortex
#

😠

warm fulcrum
#

u know the tweet

hollow imp
rapid merlin
warm fulcrum
#

where sam altman asked gpt 5 about ai shows

stray aspen
#

so whats the right answer

storm needle
# stray aspen

can you try this?


For example, $17$ and $1305$ are heptaphobic, but $14$ and $132$ are not because $14$ and $231$ are divisible by seven.

Let $C(N)$ count heptaphobic numbers smaller than $N$. You are given $C(100) = 74$ and $C(10^4) = 3737$.

Find $C(10^{13})$.

I have access to a python compiler.
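
The definition of "heptaphobic" is cut off in the message above, so any check has to guess it. A brute-force sketch, assuming (this is not stated in the message) that a number is heptaphobic when it is not divisible by seven and no swap of two of its digits produces a multiple of seven, with leading zeros not allowed after a swap. That reading agrees with the four examples given. `isHeptaphobic` and `countHeptaphobic` are illustrative names only.

```javascript
// Assumed definition (not in the message): n is heptaphobic if n is not
// divisible by 7 and no swap of two digits yields a multiple of 7.
// Swaps that would create a leading zero are skipped (also an assumption).
function isHeptaphobic(n) {
  if (n % 7 === 0) return false;
  const digits = String(n).split("");
  for (let i = 0; i < digits.length; i++) {
    for (let j = i + 1; j < digits.length; j++) {
      if (digits[i] === digits[j]) continue; // swap changes nothing
      const swapped = digits.slice();
      [swapped[i], swapped[j]] = [swapped[j], swapped[i]];
      if (swapped[0] === "0") continue; // no leading zeros
      if (Number(swapped.join("")) % 7 === 0) return false;
    }
  }
  return true;
}

// C(N): count heptaphobic numbers smaller than N by direct enumeration.
function countHeptaphobic(limit) {
  let count = 0;
  for (let n = 1; n < limit; n++) {
    if (isHeptaphobic(n)) count++;
  }
  return count;
}

console.log(countHeptaphobic(100)); // 74 under the assumed definition
```

The quoted value C(10^4) = 3737 is a second sanity check for whichever definition is used. Enumeration like this is only useful for small limits; it is far too slow for 10^13, which would need a counting approach rather than testing every number.
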
warm fulcrum
#

and it answered pantheon

#

if u ask gpt5 rn it doesnt give out that answer

#

wtflip????

hollow imp
#

@rapid merlin

keen beacon
hollow imp
whole wagon
#

Deepseek???

keen beacon
#

knew it. Ain't no one about to read 3 similar books on the same topic

ocean vortex
rapid merlin
#

?

hollow imp
# keen beacon you read all those books about memory, what did you learn? kinda interested now

To remember book or data:
Active recall
Photographic remembering
Underlining
Summarisation
Linking
In-head simulation of practice
Explanation to oneself aloud
Virtual memory palace
Spaced repetition

To train photographic:
Military method
Word contemplation
Numbers memorisation

To train intellect:
Study hard topics constantly (Japanese, Chinese, advanced mathematics)

To speak:
Learn syntax
Learn logic as a branch of mathematics
Writing (just as this text)
Dissect and demarcate sentences
Contemplate and demarcate sentences while speaking
Read aloud

Sequence memory:
PAO
Binary code learning
Remembering sequences

Pattern recognition:
Reading what induces pattern-seeking propensity
Thinking about systems and patterns
Writing—subcategorisation and loopholing (just as writing this text)

Self-analysis:
Physical self-awareness
Psychological self-analysis
Philosophical self-analysis
Cognitive self-evaluation
Retrospection and strategy

Skill-building:
Full concentration
Kinesthetic exploration via error-driven movements
Exuberant caffeinatedness
Lucid dreaming

eternal niche
#

what

hollow imp
whole wagon
#

Why you do this wall

stray aspen
whole wagon
#

Of bs

sacred quail
#

did we like gpt 5 or not?

eternal niche
#

no

hollow imp
keen beacon
#

lol

sacred quail
keen beacon
eternal niche
ocean vortex
# hollow imp Share the system prompt please

Nothing too crazy actually. But it worked for what I needed it for so ended up using this for other stuff now lol

All responses must be extremely long. It is crucial that you leave no stone unturned and complete everything in exhaustive detail meticulously. You must reflect endlessly for each user's query. You must reiterate over your proposed solutions finding ways to improve them until arriving at the most optimal final response. Meaning you must review each response provided and then improve it.

hollow imp
eternal niche
keen beacon
#

We need some guy to just read about a particular topic and distribute learned knowledge to the rest of the community. Lmao!

hollow imp
quartz light
#

deleted user is ai

#

fr

rapid merlin
#

fr

quartz light
#

he doesnt exist

#

👻

hollow imp
# eternal niche help me

I'll ask you a question. If you understand basic English the realisations that click in your mind will be enough to change your life at max.

#

Ready?

eternal niche
#

yes

shadow jewel
#

Something went wrong while generating the response. Please try again.

I got that error what does it mean?

ocean vortex
gusty loom
hollow imp
ocean vortex
stray aspen
keen beacon
#

Bro what was the question you were about to ask?

eternal niche
#

vodka

deft cradle
#

Hey! Does anyone know why, when using gpt 5 on LMArena, it says it's 4o?

deft cradle
stray aspen
hollow imp
# eternal niche yes

"If a machine could print out every thought you've had for the last seven days, and you were forced to read it, would the person revealed on those pages be the person you believe yourself to be?"

keen beacon
keen beacon
#

it can be slow

whole wagon
eternal niche
whole wagon
#

Maybe grok 5 can beat GPT5 also

#

It is possible

keen beacon
novel flame
# keen beacon Can you tell us more about the kind of questions it failed on?

I have a very simple test where I ask it to write a short paragraph about web application reactivity ending in: "..properties of the component".

This tests the model's ability to pre-plan a sentence and guide its token generation so it ends up with something that makes sense and ends with the correct words. All of the large models can do this beautifully, reasoning or not. This was the paragraph I got from GPT-5:

"Web application reactivity lets the UI respond automatically when underlying state changes. Frameworks track dependencies with signals, proxies, or fine-grained subscriptions, updating only what’s necessary for performance. Developers write declarative bindings so templates reflect current data, producing predictable, maintainable behavior rooted in a single source of truth and explicit inputs, events, and props..properties of the component."

Not only did it include the dots like a psychopath (a handful of reasoning models do this), but "props..properties"? That's fully nonsensical.
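
The ending check described above is easy to automate. A minimal sketch, assuming the pass criterion is "the paragraph ends with the required words and does not copy the leading dots from the prompt"; `checkEnding` and the shape of its result are made up for illustration, and it does not try to judge whether the sentence makes sense.

```javascript
// Hypothetical checker for the "end the paragraph with this phrase" test.
const REQUIRED_ENDING = "properties of the component";

function checkEnding(paragraph, ending = REQUIRED_ENDING) {
  // Ignore trailing whitespace, full stops and closing quotes.
  const trimmed = paragraph.trim().replace(/[."'”\s]+$/u, "");
  const endsCorrectly = trimmed.endsWith(ending);
  // The ".." in the prompt marks elided text; copying it is a failure.
  const copiedDots = trimmed.endsWith(".." + ending);
  return { endsCorrectly, copiedDots, pass: endsCorrectly && !copiedDots };
}

// Tail of the GPT-5 output quoted above.
const gpt5Tail = "explicit inputs, events, and props..properties of the component.";
console.log(checkEnding(gpt5Tail)); // endsCorrectly: true, copiedDots: true, pass: false
```

A paragraph that closes with something like "driven by the properties of the component." passes, while one that ends on different words fails on `endsCorrectly`.
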

deft cradle
#

Only about the network

hollow imp
keen beacon
hollow imp
eternal niche
keen beacon
eternal niche
hollow imp
#

@stray aspen how is this yupp ai giving access to o3 pro? Does it have like really strict limits? Are you sure it is not fooling people?

hollow imp
eternal niche
#

what about my

stray aspen
#

you have points

#

but you dont buy the points

#

you get them from using the ais

eternal niche
#

spend points to get points

#

what

fiery lagoon
#

gpt5 out guys

stray aspen
hollow imp
eternal niche
fiery lagoon
stray aspen
#

yes bro its real

eternal niche
fiery lagoon
#

Yeah

#

insane

keen beacon
barren prairie
stray aspen
#

and its also an arena

eternal niche
#

anyway gpt5 sucks

hollow imp
#

Maybe it's authentic!

stray aspen
eternal niche
hollow imp
eternal niche
#

do you have this?

#

we can play

brittle tiger
hollow imp
eternal niche
stray aspen
#

new gpt 5 models on lmarena

#

this is great

keen beacon
stray aspen
#

now we need gpt 5 high

agile bloom
#

omggg all the gpt 5 models got on lm

hollow imp
#

Not all

eternal niche
hollow imp
hollow imp
hollow imp
#

😭😭😭

stray aspen
#

what were you expecting

#

if you want the full on service just buy a subscription in chatgpt

hollow imp
#

Guess I'll have to stick to this plan

restive sky
#

So what was lobster, tangerine, starfish, and zenith??

eternal niche
stray aspen
#

gpt 8 pro high premium battle pass

hollow imp
#

Gpt AGI pro max

restive sky
stray aspen
#

bro gpt 5 is so bad

eternal niche
#

yes

stray aspen
#

gemini 2.5 pro tackled a coding task in Roblox

#

first shot

hollow imp
restive sky
#

Gemini 3 is going to crush GPT-5

stray aspen
#

yes it will

#

if gemini 2.5 can

#

already

eternal niche
hollow imp
#

And I'm gonna crush my exam

steady vale
#

hour 3 of no gpt5 access on web

restive sky
eternal niche
restive sky
#

And only a 32K token context for paying plus users?

hollow imp
#

Goodnight and goodbye

#

🙏🙏🙏

eternal niche
stray aspen
#

gpt 5 is so slow

eternal niche
#

i already got a controller for the game

stray aspen
#

feels like deepseek

keen beacon
eternal niche
#

i hope you will play with me

restive sky
#

Maybe yann lecun was right

hollow imp
eternal niche
keen beacon
#

lol

full idol
#

im pretty sure zenith is gpt5 Pro

hollow imp
wicked root
eternal niche
#

i am training

#

to deliver food

#

help me you are so smart 😭

quartz light
#

hey goobers, just wanted to share this preview html button I made for lmarena :)

// ==UserScript==
// @name         Arena HTML Codeblock Preview
// @match        https://lmarena.ai/*
// @grant        none
// @run-at       document-idle
// ==/UserScript==
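(() => {
  // Selector for rendered code blocks (tied to the site's current markup).
  const sel='div[data-code-block="true"]';
  // Scan root r and add a Preview button to every HTML code block not yet handled.
  const go=r=>{
    r.querySelectorAll(sel).forEach(b=>{
      if(b.dataset.apb)return; // already has a button
      const g=b.querySelector('[data-sentry-element="CodeBlockGroup"]');
      // Only handle blocks whose header mentions "html".
      if(!((g?.textContent)||'').toLowerCase().includes('html'))return;
      const btn=document.createElement('button');
      btn.type='button';
      btn.textContent='Preview';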
      btn.className='inline-flex items-center justify-center gap-2 whitespace-nowrap transition-colors focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-ring ring-offset-2 focus-visible:ring-offset-surface-primary disabled:pointer-events-none disabled:opacity-50 [&_svg]:pointer-events-none [&_svg]:shrink-0 text-sm font-medium text-interactive-active hover:text-interactive-normal active:text-text-tertiary relative rounded-lg p-[6px]';
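      btn.onclick=()=>{
        const code=b.querySelector('pre code');
        if(!code)return;
        // Prefer per-line nodes so line breaks survive; fall back to the raw text.
        const ls=code.querySelectorAll('.line');
        let s=ls.length?[...ls].map(n=>n.innerText).join('\n'):(code.innerText||code.textContent||'');
        // Decode HTML entities by round-tripping through a detached textarea.
        const d=document.createElement('textarea'); d.innerHTML=s; s=d.value;
        // Render the code in a new blank tab; bail out if the popup is blocked.
        const w=open('about:blank','_blank'); if(!w)return;
        w.document.open(); w.document.write(s); w.document.close();
      };
      (g||b).appendChild(btn);
      b.dataset.apb=1; // mark as processed
    });
  };
  go(document); // handle blocks already on the page
  // Watch for code blocks added later (e.g. new responses).
  new MutationObserver(m=>m.forEach(x=>x.addedNodes.forEach(n=>n.nodeType===1&&go(n)))).observe(document.documentElement,{childList:true,subtree:true});
})();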
eternal niche
#

what

#

looks like stealer

quartz light
#

no

#

read the code

eternal niche
#

no

#

i cant read long text

quartz light
#

you can clearly see theres no url or anything

stray aspen
#

what is this

eternal niche
#

you are lying

quartz light
#

its because its a userscript

#

it runs if you are on lmarena

#

thats the only url

white hatch
#

gpt-5 is fire. He fixed the error that I wasn't able to solve

eternal niche
#

send me your email login and pass as guarantee

#

and i'll trust you

stray aspen
#

i mean sometimes it does miracles and sometimes its straight up garbage

white hatch
#

idk, basic, he just thinks longer

quartz light
#

preview button for lmarena

stray aspen
quartz light
stray aspen
#

what ai made you the code

eternal niche
quartz light
#

for the preview button? gpt 5

eternal niche
quartz light
#

for the page on the right, gemini 2.5 flash lite

keen beacon
eternal niche
quartz light
devout vault
#

Gpt 5 made me a professional website with many features in 1 prompt

stray aspen
#

yeah

#

gpt 5 gives really complete scripts

keen beacon
stray aspen
#

i just noticed

blazing bison
#

Every model does that lmao

devout vault
stray aspen
#

no but gpt 5 does it better

quartz light
#

gpt 5 gave me 80k tokens response

quartz light
#

why

keen beacon
stray aspen
#

im not an OpenAI bootlicker guys

eternal niche
blazing bison
#

Google will be this year's winner

keen beacon
blazing bison
#

That's funny

keen beacon
#

do scripts and such

blazing bison
#

Google started so much behind

devout vault
stray aspen
#

will we get gpt 5 high on the arena

somber vigil
#

i dont see mini on the leaderboard 🙁

eternal niche
quartz light
white hatch
somber vigil
blazing bison
somber vigil
#

yea

quartz light
stray aspen
#

i made an entire Roblox game with deepseek and gemini lol

quartz light
#

oh btw did yall know genie 3 is on aistudio

keen beacon
stray aspen
#

where

quartz light
stray aspen
quartz light
#

dont ask me how i know

devout vault
#

is there gpt 5-low

eternal niche
quartz light
quartz light
quartz light
#

😨 dont snitch

stray aspen
#

i love google

keen beacon
#

lol

somber vigil
devout vault
#

Gemini 3 will be my favorite model

#

Gemini is so uncensored bro

#

There's no censorship at all if u disable it

stray aspen
#

and free

quartz light
somber vigil
#

ChatGPTs usage limits are a joke

#

literally

white hatch
blazing bison
whole wagon
#

Congrats to openAI, their nano and mini models are getting close to Qwen now

somber vigil
#

2.5 Flash in Gemini is crazy free

whole wagon
somber vigil
somber vigil
#

especially for file uploads

quartz light
#

did yall even notice gpt 5 has the biggest output tokens window

eternal niche
#

openai sucks

somber vigil
#

I still don't understand how AI Studio is free

quartz light
keen beacon
quartz light
somber vigil
#

thats nice

blazing bison
#

Opus has 100k too no?

minor bloom
stray aspen
somber vigil
#

fr

#

i love it

keen beacon
#

like all free users of it

somber vigil
#

take everything idc

errant cave
#

Makes inference much cheaper

somber vigil
#

still

quartz light
keen beacon
#

their gpus melt too much

blazing bison
#

But only if you enable thinking

eternal niche
#

(thanks that i am from Russia)

keen beacon
stray aspen
quartz light
blazing bison
blazing bison
#

Openai letting it be available for everyone is cool

devout vault
#

i don't have it rn

#

I'm in Europe

blazing bison
#

Its not pushed to web

split kayak
#

how to make prompt for arena battle
ok

vague wharf
#

Why does the model identify itself as GPT-4 on lmarena, while on the official website it identifies as GPT-5?

blazing bison
stray aspen
frigid coral
#

gpt 5 flopped?

split kayak
blazing bison
rapid merlin
split kayak
#

JohnPork

stray aspen
#

no its great

rapid merlin
#

for me it still didn't drop on my computer

stray aspen
#

i have changed my opinion

devout vault
#

LOL I remember u were saying how good the model is gonna be and now ur saying this

eternal niche
eternal niche
#

my brother

stray aspen
#

still gemini is better

eternal niche
stray aspen
#

i love gemini

#

yes it is lmao

eternal niche
quartz light
eternal niche
#

who needs style control

#

because it is style

#

🤣🤣🤣

#

nice joke bro

#

can you teach me

quartz light
stray aspen
#

google will hit the industry hard with gemini 3

#

im confident

keen beacon
#

I just learned a new word

#

a useful one

quartz light
eternal niche
#

dont hit the wall bro

#

it hurts

steady vale
#

i heard google has an advantage from their custom tpu processors or something like that

devout vault
#

@deep adder is good at ragebaiting ngl he's funny LMAOO

stray aspen
#

google deepmind is cracked

hollow imp
#

@stray aspen vro

stray aspen
#

whatsup

hollow imp
#

This yupp ai and it's models

#

It doesn't remember the previous message I sent

#

😶

keen beacon
hollow imp
#

One trick pony

stray aspen
#

but it does remember the messages for me

eternal niche
keen beacon
stray aspen
#

what companies

eternal niche
red tangle
#

hey @deep adder

devout vault
hollow imp
#

Wtf

keen beacon
#

"Perse"

eternal niche
keen beacon
#

Similar then

keen beacon
#

if the same meaning

hollow imp
stray aspen
#

@deep adder whats the release date of gpt-6

hollow imp
#

Openai o3 pro answer that specific question from the pdf

#

We don't mind your best guesses

#

Will you happily give them to us?

#

I thought you replied to "@deep adder whats the release date of gpt-6"

balmy mist
#

GPT-5 is so good

stray aspen
#

yes

#

its great

#

but behind gemini 2.5 pro

hollow imp
#

@stray aspen what do I do so that it remembers my messages

stray aspen
#

i dont use that website

ocean vortex
#

nothing flopped lmao

jade egret
ocean vortex
#

Or you are talking about o3-low or smth 🤔

stray aspen
#

im high on truth

errant cave
#

I'm more curious how GPT-OSS-20B fares considering I should be able to run it on my own computer

#

Anyone give it a shot?

jade egret
#

i think it's better than gemini 2.5 pro but tbh google's prob gonna release something better soon

white hatch
#

Never buy weed from gas stations

ocean vortex
#

or where

jade egret
white hatch
ocean vortex
#

Help me find it where it is behind

#

lmao

stray aspen
ocean vortex
jade egret
errant cave
#

I've been able to make Gemma-3 generate whatever I want by simply editing its output

#

Perhaps the same trick would work with GPT-OSS

jade egret
#

i like google (:

errant cave
#

I see, thanks

ocean vortex
#

And 400k is more than enough for 99% cases. But for that 1% I'm gonna use aistudio yeah

#

now let's talk about actual output performance

keen beacon
white hatch
#

I think chatgpt uses their context more effectively

keen beacon
#

Updated usage limits

errant cave
#

This is hilarious lol

olive mesa
white hatch
ocean vortex
keen beacon
#

Because China. People fear China for some reason

stray aspen
keen beacon
#

I have used Kimi K2 happily

errant cave
stray aspen
#

is lmarena somehow related to openAI

keen beacon
finite mirage
#

any Brazilians around here?

stray aspen
finite mirage
#

dbas?

stray aspen
glad perch
finite mirage
#

what

indigo hazel
#

i mean, if it has far less hallucinations yes

quartz light
stray aspen
#

should i bet on polymarket

wheat onyx
#

still dont have gpt5 yet (plus account)

#

☠️

stray aspen
#

yeah it gives me imgur links sometimes

#

that lead to nothing

red sluice
#

Back to 3.5 in terms of hallucinations

errant cave
wheat onyx
subtle tree
wheat onyx
#

Other than the fact that I don't have access

red sluice
stray aspen
#

how is qwen near the top

#

its french bro

red sluice
stray aspen
#

this model is supposed to be good at languages

#

they yapped about that in the livestream

red sluice
#

Yeah ofc I'm based in France

stray aspen
#

is it available for non EU countries

#

in europe

eternal niche
red sluice
#

Delete cookies/cache on ChatGPT, wait like 20 minutes, log in, and you'll have GPT5

#

oh

stray aspen
#

where are you

eternal niche
#

he said "AI made by americans for americans"

sacred quail
# ocean vortex Context is not performance metric.

it can be, depending on the situation. For example, if I give it a long text, a long video, or a long book and ask something about it (especially useful for learning something, or if you're a student), gemini is crazy good while all the other llms lose their minds.
but i must say, right now gpt 5 is producing better outputs than gemini 2.5 pro

wicked root
stray aspen
#

bro how is this possible

#

how is qwen 3 on par with gpt 5 at coding

quartz light
subtle tree
errant cave
stray aspen
#

im gonna develop a Roblox game with gpt-5

#

whats funny

wicked root
#

No one knows what style control does

subtle tree
#

Qwen's performance is demonstrably on par with top-tier models like GPT-5 in coding, likely due to its specialized training on massive code datasets, sophisticated architecture (including MoE)

wheat onyx
burnt sinew
#

What leaderboard is that

subtle tree
burnt sinew
#

Why does that not show on the normal lmarena leaderboard

subtle tree
burnt sinew
#

I know

#

Exactly

#

Coding should show as another ranking

#

On main page

subtle tree
burnt sinew
subtle tree
#

Why not just find out by accident! ¬_¬

burnt sinew
golden ocean
#

gpt 5 is asi

wicked root
#

what?

#

WHAT DOES STYLE CONTROL DO?

subtle tree
#

"This leaderboard shows what are the best LLMs for writing and editing code (released after April 2024). "

ocean vortex
#

🤯 🤯

#

it f'ing destroyed o3

#

lmfao

#

wtf

stray aspen
#

gpt 5 benchmark on artificial analysis

#

whats the gpt 5 on lmarena

ocean vortex
#

Wouldn't expect it to be THIS good. It's doing worse than o4-mini-medium on most things people test it on

ocean vortex
#

new SOTA

balmy mist
#

i can def say gpt-5 is the SOTA

stray aspen
#

it is officially SoTA i guess

balmy mist
#

based on my tests

stray aspen
#

i was wrong

#

ive always been wrong

ocean vortex
#

They just made up that 1M figure huh

subtle tree
# wicked root **WHAT DOES STYLE CONTROL DO?**

It is a statistical method that adjusts a model's score by removing the influence of "style" features like response length, emoji count and markdown usage to reveal its true performance based on the quality of its content... not just how it looks.

stray aspen
#

whats gpt 5 high

subtle tree
# wicked root **WHAT DOES STYLE CONTROL DO?**

For example a model that consistently produces long, friendly responses with lots of emojis might get a lower score after style control is applied if the underlying content isn't actually better than a more concise, factual response from another model.
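
A toy illustration of the idea described above. This is a sketch under assumptions, not the leaderboard's actual code: it assumes style control is done by adding a style feature as an extra covariate in a Bradley-Terry-style logistic regression over pairwise battles, uses only one made-up feature (which answer was longer), and invents the battle counts. `fitLogistic` and `battles` are hypothetical helpers.

```javascript
const sigmoid = (z) => 1 / (1 + Math.exp(-z));

// Fit logistic regression weights by plain gradient ascent on the mean log-likelihood.
function fitLogistic(rows, steps = 5000, lr = 0.5) {
  const dim = rows[0].x.length;
  const w = new Array(dim).fill(0);
  for (let s = 0; s < steps; s++) {
    const grad = new Array(dim).fill(0);
    for (const { x, y } of rows) {
      const p = sigmoid(x.reduce((acc, xi, k) => acc + xi * w[k], 0));
      for (let k = 0; k < dim; k++) grad[k] += (y - p) * x[k];
    }
    for (let k = 0; k < dim; k++) w[k] += (lr * grad[k]) / rows.length;
  }
  return w;
}

// Invented battles between model A and model B; y = 1 when A wins.
// lenDiff is +1 when A's answer is longer and -1 when it is shorter.
function battles(lenDiff, aWins, total) {
  return Array.from({ length: total }, (_, i) => ({ lenDiff, y: i < aWins ? 1 : 0 }));
}
const data = [...battles(+1, 24, 30), ...battles(-1, 2, 10)];

// Without style control: a single strength parameter for A over B.
const [rawStrength] = fitLogistic(data.map((b) => ({ x: [1], y: b.y })));
// With style control: the length feature absorbs the part of the win rate it explains.
const [adjStrength, styleCoef] = fitLogistic(
  data.map((b) => ({ x: [1, b.lenDiff], y: b.y }))
);

console.log({ rawStrength, adjStrength, styleCoef });
```

In this made-up data A wins 26 of 40 battles, so its raw strength is clearly positive, but A wins 80% of the time when it is longer and only 20% when it is shorter. Once length is in the model, A's adjusted strength drops to about zero and the whole advantage is attributed to the style coefficient.
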

ocean vortex
#

Because gpt5 scored the top spot?

#

lmao

#

Well they did fix them. You gotta give them some slack too I don't think anyone is doing as much testing and this fast like they do

rough nimbus
#

Guys is there a way to call AI bot on discord server? Chatgpt or sth ?

misty vault
ocean vortex
#

at least qwen they did

#

for glm4.5 I stopped following it

#

which model

stray aspen
#

how does this high medium low minimal thing work

ocean vortex
#

yeah so that still not fixed yet I suppose

#

it's good. They finally fixed that flaw. They made a good move with sticking out with gpt4o model size... Training progress caught up to that size now lol

#

Meanwhile Anthropic went for short term gain with Opus but are now stuck with it

#

4.1 and gpt5 released only days apart

#

but performance difference is huge

subtle tree
#

Removing style control, it seems like GPT5 is still in first place, and second is GEMINI 2.5 PRO, which is weird.

jade egret
ocean vortex
subtle tree
shadow jewel
#

chatgpt or claude?

jade egret
ocean vortex
#

chatgpt-latest tanks though - that is kinda weird.

jade egret
neat apex
#

Gpt 5 is great in my tests xd

ocean vortex
#

Style control should bring 4o-latest down, not help it... yeah weird lol

hollow imp
patent aspen
#

I think it's reasonable to claim GPT-5 is SotA. It's just not SotA enough

jade egret
#

not rlly big tbh

wicked root
subtle tree
#

Ultimately, China will prevail in the end.

wicked root
#

How do we know if GPT5 is better with such small sample?

jade egret
#

do you guys have access to gpt-5?

keen beacon
#

updated 2.5 pro 🔥

neat apex
#

Qwen 3 Plus will beat this Gpt 5?

jade egret
#

i do

wicked root
subtle tree
#

But I like the analytical side of the 2.5 pro

jade egret
#

new models always win, same with gpt-5

wheat onyx
#

how can I still not have access....

#

give it to me

neat apex
wicked root
#

I said for their own domestic use, yeah

wicked root
#

Not rn ofc. Give it 10 years.

neat apex
#

Even LG launched the best model; it's always whoever launches last

reef pawn
#

Are they using new image model for GPT-5 or still the old one?

jade egret
subtle tree
#

not today but some day

neat apex
#

Damn, they lied to me

stray aspen
#

exaone is garbage

wheat onyx
#

for coding, it's ~ the same right? With GPT5 significantly cheaper

stray aspen
#

wdym

wheat onyx
#

And then Anthropic has big updates coming soon

jade egret
#

gpt-5 doesn't suck, but they just hype it too much

reef pawn
ocean vortex
neat apex
#

Is Opus 4.1 better than it? Or is it too close to call a winner

jade egret
#

2.5 pro got updated?

reef pawn
ocean vortex
keen beacon
ocean vortex
#

meant to post this actually

keen beacon
#

btw remove style control 🤣

ocean vortex
#

webdev is more interesting to me tbh

keen beacon
#

2.5 pro is still on top somehow

stray aspen
#

gpt-5 vision is amazing

jade egret
#

bro did gemini 2.5 pro even get an update? if it didn't, then it's pretty reasonable for gpt-5 to beat it. as i said before, newer models win (:

stray aspen
#

its better than gemini

subtle tree
#

"Nothing is permanent except change." — Arthur Schopenhauer

ocean vortex
reef pawn
wicked root
# jade egret wdym

We're all forgetting that the language, Chinese, is an extremely high context and figurative one. You need native Chinese to develop an AI that handles domestic uses. Country's authoritarian af so it already collects tons of data and we know people who are close to the 'party' get special privileges that are unimaginable in the West. So is it possible that Alibaba or whatever develops a product that exceeds Western models when it comes to unique contextual questions that are asked by everyday Chinese people on Chinese issues? Yeah. I don't see why not?

keen beacon
jade egret
#

i thought u were talking abt something else

keen beacon
#

wow i didn't see this 💀

jade egret
reef pawn
#

Genie

wicked root
#

overall yeah

neat apex
#

Gpt 5 can still be updated based on user feedback

wicked root
#

We need more samples on GPT5, but ngl I'm a bit nervous for Gemini.

ember sentinel
#

Yo, guys were the horizon models related to GPT-5 or should we expect better models 😉

wicked root
#

That's great news for me bc I have bets on Google.

ocean vortex
neon idol
ornate stump
#

still don't have gpt5 do i need to do something

jade egret
#

waiting for gpt 5.5 (:

jade egret
ocean vortex
#

only 9 points behind with style control disabled

neat apex
#

But they keep a million worth a day to gpt 4.5 compete yet

keen beacon
#

im not sure if gpt 5 can beat the updated 2.5 models tho

reef pawn
#

Google's AI team is literally run by a Nobel Prize winner, ofc they're gonna win, it's a no-brainer

jade egret
tranquil blaze
#

guys does video arena have veo 3?

jade egret
#

fr i have no idea lol

echo aurora
tranquil blaze
reef pawn
#

No, it's not

neat apex
#

Gemini, besides being the best, still only costs cents, like Mistral

jade egret
#

wait so is 2.5 getting an update?

neon idol
#

I HAVE GPT 5

#

Lets go

stray aspen
#

when is 2.5 getting an update

jade egret
jade egret
keen beacon
jade egret
keen beacon
#

on pro

#

so there's clearly an update left

ocean vortex
solar hollow
#

i just woke up, so far i am reading that gpt 5 is underwhelming? is it true?

jade egret
jade egret
neat apex
#

Gpt 5 is Gpt 4o 2, nothing that astonishing

ocean vortex
#

I suppose wollfstride, but we don't know how it REALLY performs other than some limited impressions

neat apex
#

Nahhh, OpenAI did not overhype gpt 5

ocean vortex
#

that may have been misleading

keen beacon
neat apex
#

They did Gpt 4.5, but not Gpt 5

reef pawn
#

Google DeepMind's CEO is a Nobel Prize winner in Chemistry. Not only that, he was literally a child prodigy. With the Google brand on the table and working under a Nobel Prize winner, they wouldn't face any problems with AI talent hunting like Zuckerberg or Musk. Even the Godfather of AI used to work at Google. People who disagree with these facts are not rational to begin with.

neon idol
#

Why can't I take photos?

jade egret
keen beacon
#

with that logic, you can just pay the chatgpt sub tho

neon idol
ocean vortex
#

It was avail to Plus subs. And even on openrouter briefly I think before they started requiring your key (or was it o1..?)

jade egret
neon idol
brittle tiger
#

4.5 was supposed to be 5. It didn't work out as a leap worthy of the 5 branding, so with tooling and a lot of other clever implementations they put this out. Kinda wish they just waited for a leap worthy of 5, but they need to be constantly fundraising so it does make some sense

jade egret
stray aspen
# stray aspen
Poll: which is better for code?

Winner: gpt-5 (4 of 6 votes)

reef pawn
#

GPT-5 is a good model

neat apex
#

👻 👻 Qwen 3 Plus 👻 👻

ocean vortex
reef pawn
#

I personally think it's better than Grok 4 because it's literally free

jade egret
keen beacon
keen beacon
jade egret
neon idol
# jade egret dont think so

In my opinion with the release of gpt 5 they are working on the UI of the app because that button was also for thinking mode, deep research...

keen beacon
#

4o (oct 2023) => 4.1 (june 2024) => gpt-5 (oct 2024)

neat apex
ocean vortex
brittle tiger
keen beacon
jade egret
#

do all of you have access to gpt-5?

keen beacon
#

u can access gpt 5 via lmarena direct chat

neat apex
#

I don't even have Gpt installed on my phone

jade egret
#

changed color (:

keen beacon
#

if it hasnt rolled out to you

reef pawn
#

GPT-5 mini is good as well

neon idol
keen beacon
#

it seems different from one on LMArena

ocean vortex
#

you could pay $20 and use it a ton. That's accessible I would say

keen beacon
#

in formatting

ocean vortex
#

not $200 and 10rpd

neat apex
#

Is Gpt 5 mini 90% of the quality, like it was with Gpt 4.1 mini?

neon idol
ocean vortex
#

and for API it's tier3. Not tier1 but not tier5 either

jade egret
#

would you pay 200$ for gpt-5 pro? personally nah

keen beacon
#

i would rather use gemini 2.5 flash than gpt 5 mini tbh

ocean vortex
#

with new Gemini no tier will give you API lol

reef pawn
keen beacon
#

yes

jade egret
#

imagine if google drops gemini 3 tomorrow 🤯

reef pawn
#

Interesting

ocean vortex
neat apex
#

Literally named after it

keen beacon
neon idol
reef pawn
#

I personally hated the 4.1 series

keen beacon
jade egret
neat apex
#

They were a worse version of other models

jade egret
neat apex
#

But gpt

jade egret
neon idol
ocean vortex
jade egret
#

but tbh i kinda expect gpt-5 to be much better at coding than 4.1 opus 🙁

jade egret
#

but the price is 10x 😭

ocean vortex
ionic mortar
#

Quick question, does LMArena have message limits for any of the models? I can't find any info about this online.

jade egret
#

oooo

blazing bison
neon idol
#

Guys I have read that chatgpt released advanced voice mode for everyone

blazing bison
#

Gpt 5 medium is better than opus for coding

reef pawn
jade egret
#

hopefully google discover it soon (:

blazing bison
#

On my tests

blazing bison
#

But opus is as good as gpt 5 without thinking

ionic mortar
fickle venture
#

Guys i can't send messages to gpt 5, is lmarena down?

ionic mortar
#

For text at least

blazing bison
#

Gpt 5 without thinking is a gpt 4o

reef pawn
#

Anthropic makes most of its money from developers whereas OpenAI is used by everyone

fickle venture
golden ocean
#

can u get a profile picture

wicked root
#

which method is gemini using? Do you see GPT5 overtaking Gemini on the leaderboard this month?

reef pawn
#

Literally Top 5 most visited site in the world

golden ocean
#

put on a profile picture now*

jade egret
blazing bison
#

I dont think it's #1

#

Not by far

wicked root
#

What about for overall?

blazing bison
#

I mean, I need to calculate costs, bcs it thinks a lot

jade egret
#

i mean other companies prob gonna react to gpt-5 and lock in

whole wagon
#

GPT5 is multimodal right?

blazing bison
#

And opus without thinking gave me the same results

#

I'm testing it rn

quartz light
#

bro 😭

#

gpt 5

reef pawn
blazing bison
#

Webdev

whole wagon
#

Why can't I get it to accept video or audio then. Strange

reef pawn
#

Only 3k votes

whole wagon
#

It keeps telling me it's unsupported

reef pawn
#

It's a good model

#

But Gemini 3 will beat it in no time

wicked root
blazing bison
#

Needs more votes to be fair

#

Im talking about opus

#

Vs gpt 5

jade egret
#

gpt-5 is def better but not gonna be for too long

keen beacon
whole wagon
reef pawn
wicked root
#

@deep adder what's your opinion on overall ranking?

whole wagon
#

💀

#

Welp guess I just stick to Gemini lol

whole wagon
#

I use video summary

fickle venture
#

Btw there is no Claude 4.1 somehow

keen beacon
#

no it uses the frames as well

#

not sure about audio

jade egret
#

grok 5 v.s gpt 6 v.s gemini 3.5, whos winning?

fickle venture
reef pawn
jade egret
fickle venture
#

It's gonna drop the market again when it releases

#

They said late April but it's been a while

reef pawn
fickle venture
jade egret
#

:00

fickle venture
#

So it can search easily and be trained

glass arch
#

my problem is the damn knowledge cutoff for gpt5 is still june 2024

#

like PLEASE

reef pawn
#

Yeah

fickle venture
#

It's so no one searches dangerous things from 2025

#

So they have to lock it

#

In 2026 they will change it to may 25, 2025

#

Idk

reef pawn
#

I tried audio summary of a Wikipedia article on Notebook LM but the audio file was corrupted even after two tries.

errant cave
#

My first prompt to GPT-5 and I already hate it

#

What a cringeworthy little sht