reef pawn Aug 13, 2025, 7:30 PM

#

LG is cooking

neon idol Aug 13, 2025, 7:30 PM

#

reef pawn LG is cooking

Fr

pure falcon Aug 13, 2025, 7:38 PM

#

It makes me wonder what kind of person should use battle. Power user? Early adopter? Average person?

Right now, lots of people who may not be able to afford GPT pro, are free-riding direct chats to get gpt-5 high access.

You could maybe gate + cap direct chats at like, idk, 15 messages and force them to respond to a battle before continuing.

I do wonder how the average battle user vs direct chat user differs. My guess is direct chat people are younger, less income, etc. Wonder what effect that ultimately has on the leaderboards / stats / rankings

agile bloom Aug 13, 2025, 7:38 PM

#

gpt-5-search model can search the web?

neon idol Aug 13, 2025, 7:42 PM

#

agile bloom gpt-5-search model can search the web?

Yes

pure falcon Aug 13, 2025, 7:43 PM

#

Though tbh, the style control thing confuses me. LMArena is all about moving away from “objective” benchmarks to the ones that matter most: user preferences.

If users like different styles and formatting, then why control for it? The whole point of this is to find out what users like!! Defeats the whole purpose imo

pure falcon Aug 13, 2025, 7:43 PM

#

pure falcon Though tbh, the style control thing confuses me. LMArena is all about moving awa...

@ocean vortex

ocean vortex Aug 13, 2025, 7:45 PM

#

pure falcon It makes me wonder what kind of person *should* use battle. Power user? Early ad...

I think AI enthusiasts. But those people are not gonna use it how it was designed for if they can't actually test unreleased models for even a single message after voting

neon idol Aug 13, 2025, 7:46 PM

#

@echo aurora What is this new ai image generator called nano banana that is in LMarena

jade egret Aug 13, 2025, 7:47 PM

#

😭

ocean vortex Aug 13, 2025, 7:48 PM

#

OLED AI

obsidian cargo Aug 13, 2025, 7:48 PM

#

neon idol @echo aurora What is this new ai image generator called nano banana tha...

stealth model. I think it's an imagen variant

ocean vortex Aug 13, 2025, 7:48 PM

#

😭

pure falcon Aug 13, 2025, 7:48 PM

#

ocean vortex I think AI enthusiasts. But those people are not gonna use it how it was designe...

Well, LMArena got a nice check from a16z and co. But eventually they’re gonna have to find out who their customers really are. Users or AI companies? I imagine they charge model makers to put their model up in the arena, no?

echo aurora Aug 13, 2025, 7:48 PM

#

neon idol @echo aurora What is this new ai image generator called nano banana tha...

For models that are behind a codename I won't be providing info about them

neon idol Aug 13, 2025, 7:49 PM

#

echo aurora For models that are behind a codename I won't be providing info about them

Oh ok

ocean vortex Aug 13, 2025, 7:49 PM

#

pure falcon Well, LMArena got a nice check from a16z and co. But eventually they’re gonna ha...

AI labs are the "customers" but users are enablers. Without users there's no benchmark

pure falcon Aug 13, 2025, 7:52 PM

#

ocean vortex AI labs are the "customers" but users are enablers. Without users there's no ben...

Right. So from the model makers' eyes, the rankings have to converge on the average user. Otherwise it’s useless. OpenAI, for instance, went with summit over zenith bc of better LMArena scores.

If the people battling in the arena are different from the company's average user, then it becomes a bad benchmark. LMArena must find a way to match those two pools of people

hollow imp Aug 13, 2025, 7:52 PM

#

Has anyone tried minimax? How is it

ocean vortex Aug 13, 2025, 7:54 PM

#

pure falcon Right. So from the model makers' eyes, the rankings have to converge on the avera...

I think it would naturally converge for the most part. More interest in battle mode = more users participating = more reach = more people will hear about it and try it.

pure falcon Aug 13, 2025, 7:57 PM

#

Haha, are we sure about that?? You just said that early arena testing is made for enthusiasts lol, which i do agree with. Enthusiasts aren’t average users though.

One thing LMArena should roll out, is user accounts. Allowing accounts on the platform would go a long way to help them get the data they need, in order to answer that question

stray aspen Aug 13, 2025, 7:58 PM

#

hollow imp Has anyone tried minimax? How is it

it sucks

#

however they have great tts models

#

and video models

hollow imp Aug 13, 2025, 8:02 PM

#

stray aspen however they have great tts models

Text to speech?

hollow imp Aug 13, 2025, 8:03 PM

#

stray aspen it sucks

So according to you the only things which do not suck are 4.1 opus, 2.5 pro, grok 4, some specific versions of gpt 5

ocean vortex Aug 13, 2025, 8:13 PM

#

pure falcon Haha, are we sure about that?? You just said that early arena testing is made fo...

The demographic is not changing and you can't easily control that (+I'm not sure you should). The best you can do is be increasing the interest and activity...

#

Something like chatgpt started with like only the most die hard nerds at its infancy, look at it now... 👀

#

They didn't do anything to push those away and make their audience "more balanced" at the time and that was the right move

jade egret Aug 13, 2025, 8:16 PM

#

when gemini 3 ):

neon idol Aug 13, 2025, 8:17 PM

#

Fr

ocean vortex Aug 13, 2025, 8:18 PM

#

Not as good as gpt5 😇

#

Also interesting fact... it doesn't even have 1% of chatgpt market share lol

stray aspen Aug 13, 2025, 8:20 PM

#

hollow imp So according to you the only things which do not suck are 4.1 opus, 2.5 pro, gro...

claude and 2.5 pro suck

#

grok 4 and gpt-5 are good

ocean vortex Aug 13, 2025, 8:21 PM

#

It has... you are very wrong actually. It's more than 20%

#

Gemini is quite substantial

#

Mostly thanks to Android I think

#

They integrated Gemini into Android itself quite well. Like they have a properly working voice assistant based on Gemini. They have what Apple should have had by now with Siri lol

zinc ore Aug 13, 2025, 8:24 PM

#

20% figure is Gemini app

ocean vortex Aug 13, 2025, 8:24 PM

#

Not really, it's still Gemini and it's super easy to direct users directly to Gemini app from there

zinc ore Aug 13, 2025, 8:25 PM

#

They don't count the AI overviews stuff in search in those figures

ocean vortex Aug 13, 2025, 8:26 PM

#

I've tried those local models and they are crap. No 2 ways about it

zinc ore Aug 13, 2025, 8:26 PM

#

Their AI overviews is 2b users per month, and I forget what the AI mode one is (but it's been growing at a decent tick too)

ocean vortex Aug 13, 2025, 8:26 PM

#

Even their cloud model is underwhelming

loud leaf Aug 13, 2025, 8:26 PM

#

does current gpt-5-high in arena correspond to GPT-5-Auto in the client? is GPT-5-Pro distinct / to be added to arena at some point?

ocean vortex Aug 13, 2025, 8:27 PM

#

2.5 Flash would destroy that cloud model without trying

#

Probably even Flash-Lite has a chance

#

Well there are no useful on-device tasks that wouldn't use chatgpt...

#

with iOS

#

It can do like notification summaries. But any at all model can do that and it takes no time at all to make API request for that lol

#

then you don't have notifications either

#

lol

#

Ones that would benefit from summarisation usually do

#

it's just such a non-thing. You need to be actively looking for edge cases to find any benefits...

#

It also kills your battery much more than API

leaden palm Aug 13, 2025, 8:37 PM

#

do you guys think o4-mini and gpt-5-mini have less world knowledge than gpt-4.1?

eternal niche Aug 13, 2025, 8:39 PM

#

btw gpt5 sucks

#

well

pure falcon Aug 13, 2025, 8:40 PM

#

I hate OpenAI and will never give scam Altman a dime of my money, but GPT-5 high is the nuts

ornate agate Aug 13, 2025, 8:44 PM

#

I thought that only works with “juice” on the api?

tight oriole Aug 13, 2025, 8:44 PM

#

any ETA on when video battle goes live?

pure falcon Aug 13, 2025, 8:47 PM

#

eternal niche well

honestly can you stop spamming this? You’re not accomplishing anything except looking like an idiot

eternal niche Aug 13, 2025, 8:47 PM

#

pure falcon honestly can you stop spamming this? You’re not accomplishing anything except lo...

cope

pure falcon Aug 13, 2025, 8:47 PM

#

eternal niche cope

keep denying reality, I’m sure it’s doing you well

#

Anywho

eternal niche Aug 13, 2025, 8:47 PM

#

cope

pure falcon Aug 13, 2025, 8:48 PM

#

One day Claude will reign supreme in all areas

#

Besides coding

#

Claude is the champ

#

agreed (reluctantly)

eternal niche Aug 13, 2025, 8:50 PM

#

gemini 3 - SOTA

#

just accept it

white hatch Aug 13, 2025, 8:57 PM

#

magnum opus?

wicked root Aug 13, 2025, 9:03 PM

#

Someone here disagreed with you

ocean vortex Aug 13, 2025, 9:03 PM

#

eternal niche gemini 3 - SOTA

No it's gpt6

#

or gpt5 if you want to stick to reality 👀

eternal niche Aug 13, 2025, 9:04 PM

#

ocean vortex or gpt5 if you want to stick to reality 👀

golden ocean Aug 13, 2025, 9:05 PM

#

Russian state news

leaden palm Aug 13, 2025, 9:05 PM

#

uhhh

#

i need to scroll up

#

yeah can you stop @eternal niche lol

ocean vortex Aug 13, 2025, 9:05 PM

#

golden ocean Russian state news

illegal neo-... propaganda. We don't need that here

leaden palm Aug 13, 2025, 9:05 PM

#

14 times

eternal niche Aug 13, 2025, 9:05 PM

#

leaden palm yeah can you stop @eternal niche lol

spam

leaden palm Aug 13, 2025, 9:06 PM

#

eternal niche spam

act in good faith pls

ocean vortex Aug 13, 2025, 9:07 PM

#

I love it how he felt the need to explain what SOTA is

#

yeah act in good faith Russian spy.

@eternal niche

eternal niche Aug 13, 2025, 9:09 PM

#

leaden palm act in good faith pls

well, right now they're being russophobic and making fun of me for being Russian, and you're ignoring it

#

leaden palm Aug 13, 2025, 9:10 PM

#

that's not good either, exchanging blows doesn't usually get you anywhere

#

i didn't mean to ignore it - i'm not always watching the chat

leaden palm Aug 13, 2025, 9:12 PM

#

ornate agate I thought that only works with “juice” on the api?

fwiw pro users have 64% of high juice

ocean vortex Aug 13, 2025, 9:12 PM

#

eternal niche well, right now they're being russophobic and making fun of me for being Russian...

that wasn't making fun it's just a friendly joke. Also no need to take personally reaction to some specific thing like "state news" lol

ornate agate Aug 13, 2025, 9:12 PM

#

leaden palm fwiw pro users have 64% of high juice

lol

eternal niche Aug 13, 2025, 9:12 PM

#

ocean vortex that wasn't making fun it's just a friendly joke. Also no need to take personall...

you are not my friend

leaden palm Aug 13, 2025, 9:13 PM

#

eternal niche you are not my friend

i think it'd be beneficial to drop this and not have a giant discussion about who's in the right/wrong

ocean vortex Aug 13, 2025, 9:14 PM

#

well that is more "juice". No parallel / pro, but those requests are limited even for pro sub anyways

white hatch Aug 13, 2025, 9:16 PM

#

tasty mhmhmmhmmm

#

lol

ocean vortex Aug 13, 2025, 9:16 PM

#

Some tasks probably yes... not enough reasoning effort is not gonna arrive at the answer even if you run it 10+ times

upbeat pasture Aug 13, 2025, 9:25 PM

#

@echo aurora Is the data for LMArena going to be released (I sent an email to lmarena.ai@gmail.com about it)

echo aurora Aug 13, 2025, 9:28 PM

#

upbeat pasture @echo aurora Is the data for LMArena going to be released (I sent an e...

When we do open data releases we'll be sure to share. We did one recently, would recommend to check out this blog.

lmarena.ai@gmail.com
Would also note this isn't an email address that we use. contact@lmarena.ai is what you're looking for.

wicked root Aug 13, 2025, 9:38 PM

#

@eternal niche no way this guy’s a legend

gentle plinth Aug 13, 2025, 9:45 PM

#

i think gemini deep-think might be better if we can trust the benchmarks, but i cant try it and havent heard much about it bc its so expensive and not in the api

#

grok4 heavy or the "normal" 4

#

heavy is more like deepthink

#

i dont think they have any of these parallel thinking models in the arena

#

i tried it with rust code, and it wasnt able to write compilable code (unlike gpt5-high)

ornate agate Aug 13, 2025, 9:50 PM

#

What kind of debugging?

rain mulch Aug 13, 2025, 9:51 PM

#

hiii

#

@amber warren yo

#

😼

ornate agate Aug 13, 2025, 9:53 PM

#

Hmm. I think for complex code debugging in c++ where I assume some of it is segfault or unexpected stuff, most AI is not very good at this

#

I would actually ask a lot of them simultaneously

amber warren Aug 13, 2025, 9:54 PM

#

rain mulch @amber warren yo

howdy

#

i am normie in here

wicked root Aug 13, 2025, 9:55 PM

#

In coding? Gemini absolutely brute forces

#

Well, gemini pro

#

Flash is utterly useless for coding

stray aspen Aug 13, 2025, 9:55 PM

#

amber warren i am normie in here

do you have any info if video arena will be added to lmarena website

amber warren Aug 13, 2025, 9:56 PM

#

stray aspen do you have any info if video arena will be added to lmarena website

it's something we have on the roadmap but haven't explicitly started working on!

cedar tide Aug 13, 2025, 9:56 PM

#

Toad from ?

ornate agate Aug 13, 2025, 9:56 PM

#

It might find the bugs, but I would also try other AIs if it doesn’t.

cedar tide Aug 13, 2025, 9:58 PM

#

?

Screenshot_2025-08-14-00-57-07-889_com.discord-edit.jpg

amber warren Aug 13, 2025, 9:58 PM

#

toad

eternal niche Aug 13, 2025, 10:00 PM

#

cedar tide ?

gemini 3

ornate agate Aug 13, 2025, 10:00 PM

#

It’s a good test but what harness/platform are you using?

barren prairie Aug 13, 2025, 10:01 PM

#

cedar tide ?

Is it good?

cedar tide Aug 13, 2025, 10:01 PM

#

barren prairie Is it good?

Dont know

#

Msft

ornate agate Aug 13, 2025, 10:01 PM

#

If you want the AI to code it for you it’s probably worth trying Claude code and Qwen code.

barren prairie Aug 13, 2025, 10:02 PM

#

Any recommended models for python??

eternal niche Aug 13, 2025, 10:02 PM

#

gemini 2.5 pro SOTA

ornate agate Aug 13, 2025, 10:03 PM

#

I think it’s good enough to be worth trying.

eternal niche Aug 13, 2025, 10:03 PM

#

i forgive you

misty vault Aug 13, 2025, 10:03 PM

#

are you ok?

ornate agate Aug 13, 2025, 10:03 PM

#

Qwen code is free if you can’t pay for Claude code

misty vault Aug 13, 2025, 10:03 PM

#

This isn't the real @ paws

ornate agate Aug 13, 2025, 10:04 PM

#

There is also Gemini cli which is also free

gentle plinth Aug 13, 2025, 10:06 PM

#

qwen wrote a stockfish PR actually

ornate agate Aug 13, 2025, 10:06 PM

#

No but I don’t have time to do like giga deep dives like that unfortunately.

gentle plinth Aug 13, 2025, 10:06 PM

#

https://abrok.eu/stockfish/

gentle plinth Aug 13, 2025, 10:07 PM

#

gentle plinth https://abrok.eu/stockfish/

prompt was quite detailed tho: https://rentry.co/bm6vriai

Here is the current search code of the chess engine Stockfish:

<stockfish src/search.cpp from a37b38bdf0ad964363a7fef4b278ccc761a52c52>
Here are some useful notes:

Stockfish Search Patch Development Insights

Here is a consolidated, comprehensive, and organized list of high-value insights for writing Stockfish search patches, focused on static code a...

#

model is qwen3-235b-a22b-thinking-2507

sand bay Aug 13, 2025, 10:08 PM

#

this is fake it's actually 4

rain mulch Aug 13, 2025, 10:11 PM

#

amber warren howdy

bro has no mod aura

gentle plinth Aug 13, 2025, 10:13 PM

#

sand bay this is fake it's actually 4

it doesnt know who it is

#

this is a common problem with llms

amber warren Aug 13, 2025, 10:17 PM

#

rain mulch bro has no mod aura

Dead

#

i get the n/a role

#

which is admin for some reason

#

cat_with_raised_eyebrow

#

my proof is that i am employed by lmarena

echo aurora Aug 13, 2025, 10:21 PM

#

can confirm ablobnodfast

#

all of them simultaneously

ornate agate Aug 13, 2025, 10:26 PM

#

Using many models at the same time is actually a good idea

#

I wouldn’t bother with grok though

echo aurora Aug 13, 2025, 10:30 PM

#

just all

gentle plinth Aug 13, 2025, 10:33 PM

#

noticed something interesting. gpt-5 high has only a 33% winrate against gemini 2.5 pro in the arena

worn tundra Aug 13, 2025, 10:34 PM

#

is there any way to see all the available models? like the stealth models when you play battle with 2 anon models? because I cannot find any

gentle plinth Aug 13, 2025, 10:35 PM

#

worn tundra is there any way to see all the available models? like the stealth models when yo...

in the leaderboard? they will be published once they come out (if they should be released)

#

which they probably only will if the arena went good

solid brook Aug 13, 2025, 10:36 PM

#

Omg google releasing anything but gemini 3

#

This is so bs

echo aurora Aug 13, 2025, 10:43 PM

#

worn tundra is there any way to see all the available models? like the stealth models when yo...

Sorry to say there is not.

red sluice Aug 13, 2025, 10:47 PM

#

Just did a few text prompts with the search mode, it didn't tell me the models it was once I voted, is it normal? new?

wicked root Aug 13, 2025, 10:58 PM

#

gentle plinth noticed something interesting. gpt-5 high has only a 33% winrate against gemini ...

This might change with more samples
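To put a rough number on "might change with more samples": a 33% win rate means very different things at 3 battles vs 300. Below is a minimal sketch using the Wilson score interval. The battle counts are made up for illustration (the real number of gpt-5 high vs gemini 2.5 pro battles isn't stated here), and `wilson_interval` is just a hypothetical helper name.

```python
import math

def wilson_interval(wins: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for a win rate of wins/total."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = wins / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half

if __name__ == "__main__":
    # Hypothetical battle counts, all with the same ~33% observed win rate.
    for wins, total in [(1, 3), (10, 30), (100, 300)]:
        low, high = wilson_interval(wins, total)
        print(f"{wins}/{total}: plausible win rate between {low:.2f} and {high:.2f}")
```

With only a handful of battles the interval still covers 50%, so the matchup could be a coin flip. It takes a few hundred battles before a 33% win rate clearly sits below even.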

terse shuttle Aug 13, 2025, 11:00 PM

#

cedar tide ?

where u find it?

wicked root Aug 13, 2025, 11:05 PM

#

@eternal niche do you think gemini pro has a shot against gpt 5 this month?

tall summit Aug 13, 2025, 11:08 PM

#

worn tundra is there any way to see all the available models? like the stealth models when yo...

there is unofficially

ripe mountain Aug 13, 2025, 11:08 PM

#

The best AI models according to benchmarks

amber warren Aug 13, 2025, 11:16 PM

#

posting artificial analysis 😠

hardy lion Aug 13, 2025, 11:16 PM

#

we use LMArena so we don't even know until after we vote

ripe mountain Aug 13, 2025, 11:18 PM

#

amber warren posting artificial analysis 😠

wym?

quiet dust Aug 13, 2025, 11:22 PM

#

Why is the regular GPT-5 model weaker than the GPT-5 mini?
If anything, GPT-5 (minimal) is a regular model, without thinking, and when the requests end, you are transferred to GPT-5 mini

ornate agate Aug 13, 2025, 11:34 PM

#

Gpt5 high that you see there is api only it’s not even available to pro users is my understanding

leaden egret Aug 13, 2025, 11:35 PM

#

Is there anywhere you can specifically test nano banana or do you just have to wait for it to come up once every like 50 prompts

ripe mountain Aug 13, 2025, 11:40 PM

#

quiet dust Why is the regular GPT-5 model weaker than the GPT-5 mini? If anything, GPT-5 (m...

Because GPT-5 was not trained to be minimal, it was trained to think. GPT-5 Mini was trained better because it had no way of thinking.

echo aurora Aug 13, 2025, 11:45 PM

#

leaden egret Is there anywhere you can specifically test nano banana or do you just have to w...

You'll have to wait for it to appear. When it comes to the models that are behind a codename those are only accessible in the Battle mode.

stray aspen Aug 13, 2025, 11:48 PM

#

bro some people in video arena are down bad lmao

solid brook Aug 13, 2025, 11:52 PM

#

ripe mountain The best AI models according to benchmarks

This benchmark is bs. It shows that gpt oss is better than claude 4 which is completely false

quiet dust Aug 13, 2025, 11:55 PM

#

ripe mountain Because GPT-5 was not trained to be minimal, it was trained to think. GPT-5 Mini...

But then... GPT-5 mini > GPT-5? Doesn't that seem strange to you?

keen beacon Aug 13, 2025, 11:57 PM

#

ripe mountain The best AI models according to benchmarks

Qwen above Deepseek sounds like massive psyop

#

It is really not that good

ripe mountain Aug 13, 2025, 11:57 PM

#

keen beacon Qwen above Deepseek sounds like massive psyop

deepseek r1 very old model

keen beacon Aug 13, 2025, 11:58 PM

#

ripe mountain deepseek r1 very old model

You are missing the point

ripe mountain Aug 13, 2025, 11:58 PM

#

i think qwen coder 3 is even better than gemini 2.5 pro

ripe mountain Aug 13, 2025, 11:58 PM

#

ripe mountain i think qwen coder 3 is even better than gemini 2.5 pro

for coding

ripe mountain Aug 13, 2025, 11:58 PM

#

keen beacon You are missing the point

wym

keen beacon Aug 13, 2025, 11:59 PM

#

Try to ask it for around 100 anime similar to Madoka Magica, Qwen will fail, hallucinate and invent titles that never existed. Latest R1 does this job way better.

#

Try to ask both some questions from music theory and see how often Deepseek answers correctly and how often Qwen does

#

Qwen in general is not that good yet, unfortunately

#

It's trivial to train a model that passes certain public benchmarks even if it never was trained on them

#

The only way to compare general capabilities of models is to use private benchmarks like I do

toxic whale Aug 14, 2025, 12:01 AM

#

im in the process of testing AI models on my own benchmark i just made, so far the results are very interesting, Opus 4.1 gets only 36%. if anyone has access to GPT-5 Pro please dm me i would love to test and the benchmark is only 10 questions so it wont eat up your rate limits that much 🙂

#

Opus 4.1 is the highest so far, o3 gets 22%

keen beacon Aug 14, 2025, 12:02 AM

#

toxic whale im in the process of testing AI models on my own benchmark i just made, so far t...

We have gpt-5-high on arena

#

Isn't it enough?

ripe mountain Aug 14, 2025, 12:02 AM

#

keen beacon We have gpt-5-high on arena

not same

toxic whale Aug 14, 2025, 12:02 AM

#

keen beacon We have gpt-5-high on arena

ye i used that and am using gpt-5 high on cursor, but 5 pro is alot better from what i hear

keen beacon Aug 14, 2025, 12:02 AM

#

I find gpt-5-high good enough for most tasks lol

drifting thorn Aug 14, 2025, 12:02 AM

#

keen beacon We have gpt-5-high on arena

GPT-5 Pro and GPT-5 High is not the same

ripe mountain Aug 14, 2025, 12:03 AM

#

toxic whale im in the process of testing AI models on my own benchmark i just made, so far t...

AIs such as GPT 5 Pro, Grok Superheavy, and Gemini 2.5 Pro DeepThink are not typically included in benchmarks because they are usually purchased rather than employing a salaried software developer.

stray aspen Aug 14, 2025, 12:03 AM

#

gpt-5 high is amazing

ripe mountain Aug 14, 2025, 12:03 AM

#

stray aspen gpt-5 high is amazing

right

#

sota

toxic whale Aug 14, 2025, 12:05 AM

#

oh wait sorry i tested gpt 5 mini thinking im not done GPT-5 high, give me a few minutes ill see if it ends up beating opus

ripe mountain Aug 14, 2025, 12:06 AM

#

toxic whale oh wait sorry i tested gpt 5 mini thinking im not done GPT-5 high, give me a few...

Can you test Gemini too?

toxic whale Aug 14, 2025, 12:06 AM

#

ripe mountain AIs such as GPT 5 Pro, Grok Superheavy, and Gemini 2.5 Pro DeepThink are not typ...

what do you mean im confused?

#

tested 2.5 pro and 2.5 flash already

ripe mountain Aug 14, 2025, 12:06 AM

#

toxic whale tested 2.5 pro and 2.5 flash already

rate?

toxic whale Aug 14, 2025, 12:07 AM

#

2.5 pro is second with ~33%, flash is 14%

ripe mountain Aug 14, 2025, 12:08 AM

#

toxic whale what do you mean im confused?

They are not included in benchmarks because they are not used by average people, so they do not have APIs either.

ripe mountain Aug 14, 2025, 12:08 AM

#

toxic whale 2.5 pro is second with ~33%, flash is 14%

When you benchmark GPT-5 High, can you write down what percentage score it got?

toxic whale Aug 14, 2025, 12:09 AM

#

ye sure its almost done it has 2 more questions

ripe mountain Aug 14, 2025, 12:09 AM

#

thxx

quiet dust Aug 14, 2025, 12:10 AM

#

drifting thorn GPT-5 Pro and GPT-5 High is not the same

What do you think then?

leaden palm Aug 14, 2025, 12:10 AM

#

https://www.youtube.com/watch?v=9alJwQG-Wbk cant get over this guy trusting gemini with his arm

YouTube

Basically Homeless

Neuromuscular Aim Assist

Giving a PC program control of my muscles to become the fastest in the world. Sponsored by Micro Center!

keen beacon Aug 14, 2025, 12:10 AM

#

toxic whale ye sure its almost done it has 2 more questions

Don't forget to run each question 10 times in a row in a different context

drifting thorn Aug 14, 2025, 12:11 AM

#

quiet dust What do you think then?

I haven’t used it but on the benchmarks they’ve shown GPT-5 Pro have 5 points higher than GPT-5 itself

keen beacon Aug 14, 2025, 12:11 AM

#

Guys I have a stupid question

#

There's no signup on LMarena?

stray aspen Aug 14, 2025, 12:12 AM

#

no

keen beacon Aug 14, 2025, 12:12 AM

#

Cool.

toxic whale Aug 14, 2025, 12:12 AM

#

keen beacon Don't forget to run each question 10 times in a row in a different context

each model gets 1 attempt per question, it also makes my job 10x easier

keen beacon Aug 14, 2025, 12:13 AM

#

toxic whale each model gets 1 attempt per question, it also makes my job 10x easier

It doesn't. LLMs output is non deterministic. You need to collect a series of responses to the same question to see if they succeed.

quiet dust Aug 14, 2025, 12:13 AM

#

drifting thorn I haven’t used it but on the benchmarks they’ve shown GPT-5 Pro have 5 points hi...

In benchmarking GPT-5 Pro is one point higher than GPT-5 Thinking medium

#

And the regular GPT-5 model has 44 points.

#

GPT-5 (minimal) - this is the standard model, which is by default

toxic whale Aug 14, 2025, 12:14 AM

#

keen beacon It doesn't. LLMs output is non deterministic. You need to collect a series of re...

i guess i could but my main goal is to see how good these models are in every day tasks, you are never gonna ask a model to try again 10 times and this isnt something i plan on publishing or something

toxic whale Aug 14, 2025, 12:19 AM

#

ripe mountain When you benchmark GPT-5 High, can you write down what percentage score it got?

it got 42%, did well on coding tasks and was the best at making a flappy bird clone and an FPS shooter but did worse on some spatial reasoning and logic puzzles

toxic whale Aug 14, 2025, 12:20 AM

#

ripe mountain When you benchmark GPT-5 High, can you write down what percentage score it got?

it made the only usable flappy bird clone and fps shooter i have gotten so far tho

ripe mountain Aug 14, 2025, 12:21 AM

#

toxic whale it made the only usable flappy bird clone and fps shooter i have gotten so far t...

omg

ripe mountain Aug 14, 2025, 12:21 AM

#

toxic whale it got 42%, did well on coding tasks and was the best at making a flappy bird cl...

Compared to which AI?

keen beacon Aug 14, 2025, 12:24 AM

#

toxic whale i guess i could but my main goal is to see how good these models are in every d...

You're missing the point here, the same model can get the same task correctly 90% of all time and fail in the rest 10%. You never know if it is more or less likely to fail until you just test it with the same question again and again and again and again, until you have enough statistics to judge.
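A minimal sketch of that idea, with a simulated "model" that succeeds with a fixed probability instead of any real API call. `pass_rate` and `flaky_attempt` are hypothetical names, and the 90% figure is only the example from the message above.

```python
import random
from typing import Callable

def pass_rate(attempt: Callable[[], bool], runs: int = 10) -> float:
    """Fraction of successful attempts over `runs` independent tries."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    return sum(1 for _ in range(runs) if attempt()) / runs

def flaky_attempt(p_success: float, rng: random.Random) -> Callable[[], bool]:
    """Stand-in for 'ask the model once in a fresh context and grade the answer'."""
    return lambda: rng.random() < p_success

if __name__ == "__main__":
    rng = random.Random(0)
    attempt = flaky_attempt(0.9, rng)  # assumed: a model that is right 90% of the time
    print("single try passed:", attempt())
    print("pass rate over 10 tries:", pass_rate(attempt, runs=10))
    print("pass rate over 1000 tries:", pass_rate(attempt, runs=1000))
```

A single try only says pass or fail. Repeating the same question is what separates a 90% model from a 50% one, and more repeats tighten the estimate.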

echo aurora Aug 14, 2025, 12:24 AM

#

keen beacon Cool.

This is a feature request though that's very much on our radar!

keen beacon Aug 14, 2025, 12:24 AM

#

echo aurora This is a feature request though that's very much on our radar!

Arigato.

toxic whale Aug 14, 2025, 12:24 AM

#

ripe mountain Compared to which AI?

Grok-4 and o3 did better on the spatial reasoning and logic tasks

toxic whale Aug 14, 2025, 12:25 AM

#

keen beacon You're missing the point here, the same model can get the same task correctly 90...

i did run certain models many times before and the results were almost the same so i dont think its worth the hassle to go through and test each one 10 times at least for me

ripe mountain Aug 14, 2025, 12:31 AM

#

The most important thing about GPT-5 is not that it's the best model, but rather, that it uses resources more efficiently. Despite being an improvement, GPT-5 is much cheaper than GPT-4.5.

#

gpt 5 cheaper than 4o and 4.1

keen beacon Aug 14, 2025, 12:38 AM

#

toxic whale i did run certain models many times before and the results were almost the same ...

It is worth the hassle.

toxic whale Aug 14, 2025, 12:40 AM

#

keen beacon It is worth the hassle.

im not doing this over the API so to test like Grok-4 for example i would either have to buy the Premium plan or wait a very long time

keen beacon Aug 14, 2025, 12:40 AM

#

toxic whale im not doing this over the API so to test like Grok-4 for example i would either...

There's lmarena battle mode for this...

toxic whale Aug 14, 2025, 12:41 AM

#

Battle mode gives random models no?

#

direct chat also has limits, im getting limited on Opus thinking right now

keen beacon Aug 14, 2025, 12:48 AM

#

toxic whale Battle mode gives random models no?

Not necessarily, there's a side-by-side option that lets you choose two models

toxic whale Aug 14, 2025, 12:50 AM

#

#

with side by side you still get rate limited

stray aspen Aug 14, 2025, 12:51 AM

#

ive only ran into limits with opus

#

ive been using gpt-5 high non stop and it still lets me

toxic whale Aug 14, 2025, 12:56 AM

#

ill try with another model, Opus is expensive so that makes sense if its only opus

ripe mountain Aug 14, 2025, 12:56 AM

#

claude so overrated and overpriced

#

fact

solid brook Aug 14, 2025, 12:57 AM

#

ripe mountain claude so overrated and overpriced

stray aspen Aug 14, 2025, 12:57 AM

#

fact

ripe mountain Aug 14, 2025, 12:58 AM

#

omg loll

toxic whale Aug 14, 2025, 12:58 AM

#

ripe mountain claude so overrated and overpriced

Anthropic as a company is just so weird too

ornate agate Aug 14, 2025, 12:59 AM

#

they are now the only ones to have not released an open source model

ripe mountain Aug 14, 2025, 12:59 AM

#

ornate agate they are now the only ones to have not released an open source model

gemini and grok?

dense reef Aug 14, 2025, 12:59 AM

#

/img vedeo

toxic whale Aug 14, 2025, 1:00 AM

#

ripe mountain gemini and grok?

Grok has Grok-1 and Gemini has gemma line up

ripe mountain Aug 14, 2025, 1:00 AM

#

gemma is opensource right

#

i forgot

dense reef Aug 14, 2025, 1:00 AM

#

Change her photo and wedding dress alongside Cristiano Ronaldo is taking a photo with it

toxic whale Aug 14, 2025, 1:00 AM

#

Grok-2 is gonna be open sourced soon they are saying

ripe mountain Aug 14, 2025, 1:01 AM

#

omg

ripe mountain Aug 14, 2025, 1:01 AM

#

toxic whale Grok-2 is gonna be open sourced soon they are saying

Instead of making Grok open source, I think they should lower the price.

toxic whale Aug 14, 2025, 1:08 AM

#

ripe mountain Instead of making Grok open source, I think they should lower the price.

ripe mountain Aug 14, 2025, 1:11 AM

#

why the EU is the worst

#

When will the Grok coding model be released?

left rain Aug 14, 2025, 1:33 AM

#

guys what is this ai image model?

#

like I can't find it anywhere in the dropdown

tired herald Aug 14, 2025, 1:36 AM

#

You cant use these hidden ai's in direct mode

left rain Aug 14, 2025, 1:51 AM

#

tired herald You cant use these hidden ai's in direct mode

yeah thanks, i realized

torn mantle Aug 14, 2025, 1:58 AM

#

where is france

whole sundial Aug 14, 2025, 2:00 AM

#

lumped in with EU

#

but the data's from January 2019, things have changed a lot since then

torn mantle Aug 14, 2025, 2:01 AM

#

1 year

#

or tomorrow

#

😴

#

true

#

yes

stray aspen Aug 14, 2025, 2:21 AM

#

amd is running gpt oss 120 on a mi400x on huggingface

rugged mulch Aug 14, 2025, 2:34 AM

#

What kind of ai is this

exotic nebula Aug 14, 2025, 2:35 AM

#

rugged mulch What kind of ai is this

Wtf 🤣

rugged mulch Aug 14, 2025, 2:39 AM

#

I don't know how they do that

stray aspen Aug 14, 2025, 2:50 AM

#

Lol

#

People on video arena are so down bad bro 😂

wicked root Aug 14, 2025, 2:59 AM

#

Who's Himanshu?

wintry tinsel Aug 14, 2025, 3:05 AM

#

AI relationships and love is a genuine threat to humanity and is growing rapidly by the day. I remember just two years ago, when you would be laughed out of any discussion as a total loser for saying you date AI. Now it’s still a squeamish topic, but it’s not much stranger than just viewing normal illicit content. Two years from now, it may be as commonplace as illicit content is. Scary stuff

#

I remember back to the early days of ChatGPT when this stuff was first being experimented with, a romantic partner bot in early 2023. People treated it as a meme then; how quickly things have changed

wicked root Aug 14, 2025, 3:14 AM

#

bro I dno anyone who says things like "I date an AI".

#

#video-arena-1 message
@eternal niche

wintry tinsel Aug 14, 2025, 3:30 AM

#

wicked root bro I dno anyone who says things like "I date an AI".

Nobody admits it but the statistics show millions are

wicked root Aug 14, 2025, 3:32 AM

#

man... maybe I'm surrounded by imposters.

#

also, wth everyone's saying their AI's too agreeable, praising, do all the gf stuffs, but my Gemini told me I'll kill myself by 45 if I don't start having hobbies among other more hyper-realistic criticisms.

#

maybe there's a real human behind my Gemini instance

solid brook Aug 14, 2025, 3:55 AM

#

wintry tinsel AI relationships and love is a genuine threat to humanity and is growing rapidly...

Yes finally someone who is rational

#

The threat is real and action is needed from those in power

#

It may look like it is fixing problems short term but in long term a very big threat is waiting

misty harbor Aug 14, 2025, 3:59 AM

#

wicked root maybe there's a real human behind my Gemini instance

Yeah and that human is my ex wife

wicked root Aug 14, 2025, 3:59 AM

#

misty harbor Yeah and that human is my ex wife

LOL

ripe mountain Aug 14, 2025, 4:13 AM

#

omg

solid brook Aug 14, 2025, 4:17 AM

#

ripe mountain omg

BS

mellow frigate Aug 14, 2025, 4:19 AM

#

Elon:
Igor, I’ve been testing Grok all week. It’s not matching ChatGPT. Not even close.

Babuschkin:
It’s early days, Elon. We’re iterating—

Elon:
Iterating? I asked it to outline a Mars colonization plan. It gave me a blog post about composting.

Babuschkin:
That’s because the model is still aligning—

Elon:
I don’t want alignment, I want intelligence. Strategic thinking. If GPT can answer it in 5 seconds, why can’t ours?

Babuschkin:
Because we don’t have the same data scale, or the same training infrastructure. It takes—

Elon:
I’m not hearing solutions, I’m hearing excuses. We’re supposed to be ahead of OpenAI, not their science fair project.

Babuschkin:
We’re building something different—

Elon:
Different doesn’t win. Better wins. People aren’t going to pay for “different.” They’ll just go back to GPT.

Babuschkin:
If you want an overnight GPT clone, you’ll need to run the company differently.

Elon:
Differently? I’m already pushing the team harder than they’ve ever worked.

Babuschkin:
Exactly. That’s the problem. AI research doesn’t work on a launch schedule.

[Elon steps closer, his voice tightening.]

Elon:
So what you’re saying is, we’re going to watch OpenAI pull further ahead… and do nothing.

Babuschkin:
I’m saying we can’t brute-force our way past them in six months. If that’s unacceptable—

Elon:
It is unacceptable.

[A pause. Babuschkin closes his laptop.]

Babuschkin:
Then I think I’m done here.

[He stands, walks out. Elon stays silent, staring at the whiteboard, gripping the marker until it creaks.]

⸻

[Elon sits down at his desk, opens Grok.]

Elon:
Grok… how do I replace a cofounder?

Grok:
Step one: Acquire a cofounder replacement kit from Amazon Prime. Step two: Follow the instructions in Swahili.

Elon:
…Not helpful.

Grok:
Would you like me to search for “emotional support raccoons” instead?

solid brook Aug 14, 2025, 4:20 AM

#

Is this real?

#

No

#

Not real

mellow frigate Aug 14, 2025, 4:23 AM

#

It is pretty hilarious to think about though haha (written by gpt 5)

languid crescent Aug 14, 2025, 5:35 AM

#

uhh what's a toad?

echo aurora Aug 14, 2025, 5:42 AM

#

languid crescent uhh what's a toad?

Some models are going to be private, and so they're given a codename.

solid brook Aug 14, 2025, 5:53 AM

#

languid crescent uhh what's a toad?

Hmmm what is the performance of this model?

reef pawn Aug 14, 2025, 6:00 AM

#

Why can't I use Deep Research in Grok 4? Is it even available?

torn mantle Aug 14, 2025, 6:08 AM

#

agile bloom Aug 14, 2025, 6:19 AM

#

which is the top tier gpt-5 model without the thinking/reasoning? GPT-5 takes too long to output a reply with its thinking/reasoning

reef pawn Aug 14, 2025, 6:25 AM

#

agile bloom which is the top tier gpt-5 model without the thinking/reasoning? GPT-5 takes t...

GPT-5 mini?

#

Or Nano

#

Nano only works with API

agile bloom Aug 14, 2025, 6:26 AM

#

reef pawn GPT-5 mini?

what about gpt-5-chat?

reef pawn Aug 14, 2025, 6:26 AM

#

It's not the fastest model since it "thinks"

agile bloom Aug 14, 2025, 6:27 AM

#

gpt-5-chat felt faster to me

#

gpt-5-high felt slower

reef pawn Aug 14, 2025, 6:27 AM

#

It's good but not as fast as compared to GPT-5 MINI

agile bloom Aug 14, 2025, 6:28 AM

#

oh ok

agile bloom Aug 14, 2025, 6:28 AM

#

reef pawn It's good but not as fast as compared to GPT-5 MINI

thanks

reef bridge Aug 14, 2025, 7:48 AM

#

why can't we upload images in our chats?

terse shuttle Aug 14, 2025, 8:03 AM

#

yo

#

ts cool

deft maple Aug 14, 2025, 8:06 AM

#

hello

teal summit Aug 14, 2025, 8:06 AM

#

I have a question, in LM arena i can keep with many tabs with my previous messages and he can remember that ?

agile bloom Aug 14, 2025, 8:24 AM

#

terse shuttle yo

yooooooo??????

compact jay Aug 14, 2025, 8:27 AM

#

teal summit I have a question, in LM arena i can keep with many tabs with my previous messag...

wdym ?

#

you mean different conversations, or browser tabs ?

agile bloom Aug 14, 2025, 8:30 AM

#

compact jay you mean different conversations, or browser tabs ?

use ai to decode what the user is saying

cedar tide Aug 14, 2025, 8:41 AM

#

https://x.com/FT/status/1955846095977811992?t=HjNyWY3ZplKZULY8UgKzig&s=19

Financial Times (@FT)

DeepSeek’s launch of new AI model delayed by Huawei chip issues https://t.co/cienSLzAVl

ocean vortex Aug 14, 2025, 9:10 AM

#

ripe mountain AIs such as GPT 5 Pro, Grok Superheavy, and Gemini 2.5 Pro DeepThink are not typ...

No they are very much included typically lol. There are plenty of benchmarks for o3-pro

#

For gpt5-pro there's no open public API yet

ocean vortex Aug 14, 2025, 9:14 AM

#

cedar tide https://x.com/FT/status/1955846095977811992?t=HjNyWY3ZplKZULY8UgKzig&s=19

the problem here is using Huawei in the first place

#

"issues" is inevitable

north vale Aug 14, 2025, 9:14 AM

#

FT is into writing deepseek fanfic

ocean vortex Aug 14, 2025, 9:15 AM

#

Like this is equivalent of them just announcing they are gonna use Huawei chips lol

#

"DeepSeek was encouraged by authorities to adopt Huawei’s Ascend processor rather than use Nvidia’s systems after releasing its R1 model in January, according to three people familiar with the matter." - typical CCP things... 👀

#

They are gonna ruin Deepseek

keen beacon Aug 14, 2025, 9:18 AM

#

ocean vortex "DeepSeek was encouraged by authorities to adopt Huawei’s Ascend processor rathe...

Rip deepseek

ornate agate Aug 14, 2025, 9:18 AM

#

ocean vortex They are gonna ruin Deepseek

Maybe. Or maybe it’s a smart long term strategy

#

The media also published something recently on “billions” of AI chips smuggled into China. this is only around 20,000 b200 chips. I’m surprised it’s not a lot more. I suspect it would be a lot more if the Chinese state was using their vast spying and intelligence resources to do the smuggling. These articles suggest they are not and want to focus on Huawei instead…

ocean vortex Aug 14, 2025, 9:30 AM

#

ornate agate Maybe. Or maybe it’s a smart long term strategy

It's not for the "state" to decide. Each company or startup should be making these decisions by itself

#

Innovation is kinda killed when everything is constrained and controlled by a single entity

ornate agate Aug 14, 2025, 9:37 AM

#

I don’t want to get into another massive argument with you about China. It’s such a waste of time. They are going to do whatever they do no matter how much you don’t like it. So will the Americans. The irrelevant ones here are people like you and me who live in Europe. We have just chosen to not play this game at all.

barren prairie Aug 14, 2025, 9:39 AM

#

ocean vortex Innovation is kinda killed when everything is constrained and controlled by a si...

They wanted to be independent 😶it is smart even they will face some problems at first 🙃

ocean vortex Aug 14, 2025, 9:42 AM

#

barren prairie They wanted to be independent 😶it is smart even they will face some problems at...

Who's 'they'? I don't think that's Deepseek. And that's kinda the main problem here...

ocean vortex Aug 14, 2025, 10:16 AM

#

Ok so for Plus "juice" is only 32 for gpt5-thinking-mini apparently

#

that's medium reasoning effort at best. Would have expected to have high there...

#

They clearly don't want for it to perform better than the full model on any task this time

#

on chatgpt at least

sacred quail Aug 14, 2025, 10:19 AM

#

App using medium reasoning is kinda sad. Maybe they want people using their API more, but idk. People can use the openai api in the Poe app for free, even with high reasoning support

#

it's open to abuse with multiple gmail accounts

ocean vortex Aug 14, 2025, 10:21 AM

#

If you think about it, it can actually make sense. They want to make it simpler. And now they have matching naming (not like o4-mini vs o3). But it's still disappointing not to have high reasoning effort anymore ofc

#

"inferior model" performing better is confusing, even if that's only on specific tasks

solar hollow Aug 14, 2025, 10:46 AM

#

i dont think deepseek will catch up again any time soon

ornate agate Aug 14, 2025, 10:51 AM

#

solar hollow i dont think deepseek will catch up again any time soon

If they are focusing on changing their whole infrastructure that will take some time

#

But they don’t need the hype cycle for funding so… not really a problem for them long term imo

tame granite Aug 14, 2025, 10:56 AM

#

nano banana is not on lmarena?

keen beacon Aug 14, 2025, 11:02 AM

#

tame granite nano banana is not on lmarena?

it is in battle mode

#

it will come up randomly in that mode

tame granite Aug 14, 2025, 11:04 AM

#

keen beacon it is in battle mode

thanks

quiet dust Aug 14, 2025, 11:04 AM

#

Guys, why is the GPT-5 mini more intelligent than the regular GPT-5 model?

tame granite Aug 14, 2025, 11:06 AM

#

quiet dust Guys, why is the GPT-5 mini more intelligent than the regular GPT-5 model?

ask it 😄

quiet dust Aug 14, 2025, 11:07 AM

#

GPT-5 minimal - This is a normal model.
GPT-5 Low - This is a low-effort thinking model.
GPT-5 mini - This is the model that appears after you finish the main limits. The question is, why is the mini smarter than the regular model?

#

And why is the GPT-5-nano also smarter than the regular model?..

tame granite Aug 14, 2025, 11:09 AM

#

this is really good

keen beacon Aug 14, 2025, 11:16 AM

#

tame granite this is really good

it is, yeah. I hope that the term "nano" means that it is the size and not the only model

#

hoping for multiple sized imagen models

tame granite Aug 14, 2025, 11:17 AM

#

keen beacon it is, yeah. I hope that the term "nano" means that it is the size and not the o...

we will see probably

keen beacon Aug 14, 2025, 11:21 AM

#

tame granite this is really good

Holy hell

quiet dust Aug 14, 2025, 11:22 AM

#

quiet dust GPT-5 minimal - This is a normal model. GPT-5 Low - This is a low-effort thinki...

Nobody can answer the question why the mini is smarter than the regular model?

keen beacon Aug 14, 2025, 11:24 AM

#

quiet dust Nobody can answer the question why the mini is smarter than the regular model?

I've got no answer to that

obtuse heart Aug 14, 2025, 11:25 AM

#

gpt5 pro doesnt have an api so most likely not

quiet dust Aug 14, 2025, 11:25 AM

#

GPT-5 Pro already in LMArena

obtuse heart Aug 14, 2025, 11:25 AM

#

quiet dust GPT-5 Pro already in LMArena

no its not..

hollow imp Aug 14, 2025, 11:25 AM

#

quiet dust Nobody can answer the question why the mini is smarter than the regular model?

@deep adder will answer

keen beacon Aug 14, 2025, 11:26 AM

#

quiet dust GPT-5 Pro already in LMArena

Nope

quiet dust Aug 14, 2025, 11:26 AM

#

obtuse heart no its not..

Yeah, what do you think of it then? It uses High Effort Thinking, like in Pro

hollow imp Aug 14, 2025, 11:26 AM

#

Gemini 2.5 would be so expensive for them to bring

obtuse heart Aug 14, 2025, 11:26 AM

#

quiet dust Yeah, what do you think of it then? It uses High Effort Thinking, like in Pro

pro is a different version, thats the difference okay?

#

its pretty much smarter

white hatch Aug 14, 2025, 11:27 AM

#

Guys, guess this riddle: small, yellow, opens any door?

obtuse heart Aug 14, 2025, 11:28 AM

#

white hatch Guys, guess this riddle: small, yellow, opens any door?

key?

white hatch Aug 14, 2025, 11:28 AM

#

Bruce Lee

quiet dust Aug 14, 2025, 11:28 AM

#

Pro is a high-effort thinking model. LMarena already has this. Moreover, Pro is not that much smarter than the medium-effort thinking model.

glacial torrent Aug 14, 2025, 11:28 AM

#

How many videos can a person generate here? I got an 8-video limit yesterday and now I've got just a 1-video limit? why man????????

obtuse heart Aug 14, 2025, 11:28 AM

#

quiet dust Pro is a high-effort thinking model. LMarena already has this. Moreover, Pro is ...

ITS a different version

#

how is it that hard to understand dude

quiet dust Aug 14, 2025, 11:28 AM

#

...

obtuse heart Aug 14, 2025, 11:29 AM

#

gpt 5 pro does not have an api, therefore it cannot exist on lmarena

#

you can say "lol but this model is similar" as much as you want, but the model ITSELF doesnt exist on lmarena

#

thats what the person was asking

#

roblox_shrug

quiet dust Aug 14, 2025, 11:30 AM

#

Pro one point higher than the medium model. Why is there such an opinion that as if Pro cannot be in LMArena?

obtuse heart Aug 14, 2025, 11:30 AM

#

quiet dust Pro one point higher than the medium model. Why is there such an opinion that as...

https://tenor.com/view/ishowspeed-try-not-to-laugh-gif-7682731162751353849

Tenor

quiet dust Aug 14, 2025, 11:30 AM

#

Apparently Pro is so smart that she couldn't solve a Russian math problem

#

It thought for 26 minutes and still couldn't solve it...

obtuse heart Aug 14, 2025, 11:31 AM

#

yeah because its in russian, that makes it harder

quiet dust Aug 14, 2025, 11:32 AM

#

obtuse heart yeah because its in russian, that makes it harder

It was translated into English, but it still couldn't solve it.

hollow imp Aug 14, 2025, 11:32 AM

#

Listen

#

Give pro this problem and see if it can solve it

#

@quiet dust

obtuse heart Aug 14, 2025, 11:33 AM

#

quiet dust Pro one point higher than the medium model. Why is there such an opinion that as...

Because pro is a separate model. even if a model is close to it, it does not exist on LMArena on its own

#

also that benchmark is kinda stupid, gpt 5 pro is much smarter than medium lol

#

medium literally hallucinates every prompt, while pro can one shot a lot of things

quiet dust Aug 14, 2025, 11:34 AM

#

obtuse heart Because pro is a separate model. even if a model is close to it, it does not exi...

Pro is just high effort thinking. Yes, it is separate, but it is just high effort thinking by default.

quiet dust Aug 14, 2025, 11:35 AM

#

obtuse heart also that benchmark is kinda stupid, gpt 5 pro is much smarter than medium lol

I heard you bro.

quiet dust Aug 14, 2025, 11:35 AM

#

obtuse heart medium literally hallucinates every prompt, while pro can one shot a lot of thin...

There are no particular hallucinations, I don’t know what kind of questions you ask her

ocean vortex Aug 14, 2025, 11:35 AM

#

quiet dust GPT-5 minimal - This is a normal model. GPT-5 Low - This is a low-effort thinki...

Because "gpt5-mini" is gpt5-mini-medium. So you need to compare it against gpt5-medium

hollow imp Aug 14, 2025, 11:35 AM

#

Her???

ocean vortex Aug 14, 2025, 11:35 AM

#

It's also a reasoning model, even nano is

obtuse heart Aug 14, 2025, 11:36 AM

#

quiet dust Pro is just high effort thinking. Yes, it is separate, but it is just high effor...

even if pro is just high-effort thinking (it has more capabilities than this), it’s only available in chatgpt Plus not via API. LMArena can’t test it directly, so it doesn’t qualify as a model on its own on LMArena

quiet dust Aug 14, 2025, 11:36 AM

#

hollow imp Give pro this problem and see if it can solve it

I need to invite my friend, but I don't know if he wants to. And there are some bugs on LMarena right now

hollow imp Aug 14, 2025, 11:36 AM

#

@quiet dust so what do you say what model is the best in lmarena direct chat?

ocean vortex Aug 14, 2025, 11:36 AM

#

and gpt5-mini-high is your new o4-mini-high

#

though that version of gpt5-mini is not accessible on chatgpt. They do not want to cannibalise full model and make this confusing

quiet dust Aug 14, 2025, 11:37 AM

#

obtuse heart even if pro is just high-effort thinking (it has more capabilities than this), i...

Well, this is just your opinion, not supported by any facts, I don’t believe in it.

obtuse heart Aug 14, 2025, 11:37 AM

#

quiet dust Well, this is just your opinion, not supported by any facts, I don’t believe in ...

WHAT?

hollow imp Aug 14, 2025, 11:37 AM

#

Dayum

obtuse heart Aug 14, 2025, 11:37 AM

#

dude 😭

quiet dust Aug 14, 2025, 11:37 AM

#

hollow imp @quiet dust so what do you say what model is the best in lmarena dire...

Grok 4, Gemini 2.5 Pro, GPT-5-high

obtuse heart Aug 14, 2025, 11:38 AM

#

yeah im sure that gpt-5 PRO is barely better than medium, thats facts guys

#

joyful

#

literally many youtube videos available on the internet proving that gpt-5-pro is much more capable

quiet dust Aug 14, 2025, 11:38 AM

#

...

#

hollow imp Aug 14, 2025, 11:39 AM

#

From where are you pulling all this up?

quiet dust Aug 14, 2025, 11:39 AM

#

On the Internet

obtuse heart Aug 14, 2025, 11:40 AM

#

quiet dust

for what is this for

quiet dust Aug 14, 2025, 11:40 AM

#

ocean vortex Because "gpt5-mini" is gpt5-mini-medium. So you need to compare it against gpt5-...

Oh okay

obtuse heart Aug 14, 2025, 11:40 AM

#

🤦‍♂️

untold hatch Aug 14, 2025, 11:40 AM

#

What is the daily limit in the video arena? I suddenly have 0,others have 2 other still 8. 🤔

ocean vortex Aug 14, 2025, 11:40 AM

#

quiet dust

yeah looks about right. Nothing new there

quiet dust Aug 14, 2025, 11:41 AM

#

obtuse heart for what is this for

To understand what a pro model is. And why it can't be so much stronger than the medium-effort thinking model.

hollow imp Aug 14, 2025, 11:41 AM

#

https://images-ext-1.discordapp.net/external/bgDAakgLfaenf3bOBZiSFEXDZmI5mUZ_jffhS6FPofw/https/media.tenor.com/bu-c5qVtnjIAAAPo/flcl1.mp4

quiet dust Aug 14, 2025, 11:42 AM

#

GPT-5 mini - This is the Thinking Medium Mini model.

hollow imp Aug 14, 2025, 11:42 AM

#

Previously they had the issue of different models and naming; now they have the issue of different versions of gpt5

#

Wow openai wow

obtuse heart Aug 14, 2025, 11:42 AM

#

quiet dust To understand what a pro model is. And why it can't be so much stronger than the...

it's not "much" stronger, we know that. but it is factual that it is stronger than the medium

someone asked: "will these models ever become part of the battlemode? GPT-5-PRO and other models"
you replied with: "pro already exists".
but the PRO model ITSELF doesnt exist on lmarena ( which was the question asked ) even if something is similar to it

quiet dust Aug 14, 2025, 11:43 AM

#

obtuse heart it's not "much" stronger, we know that. but it is factual that it is stronger th...

I can't explain this to you several times anymore.

obtuse heart Aug 14, 2025, 11:43 AM

#

quiet dust I can't explain this to you several times anymore.

nor can i because you wont bother trying to comprehend what im saying

#

👋

quiet dust Aug 14, 2025, 11:44 AM

#

GPT-5-HIGH is a model of thinking with high effort. Pro works also

obtuse heart Aug 14, 2025, 11:44 AM

#

quiet dust GPT-5-HIGH is a model of thinking with high effort. Pro works also

yeah but the person asked for the pro model itself, which does not exist on lmarena

#

not gpt-5-high thinking

quiet dust Aug 14, 2025, 11:44 AM

#

Gpt-5-high - already works automatically on thinking

ocean vortex Aug 14, 2025, 11:44 AM

#

quiet dust GPT-5 mini - This is the Thinking Medium Mini model.

Yes. You need to look at gpt5-mini-minimal if you want to view it in the context of models like gpt4.1-mini

obtuse heart Aug 14, 2025, 11:45 AM

#

quiet dust Gpt-5-high - already works automatically on thinking

i'm not saying it isnt tho...?

hollow imp Aug 14, 2025, 11:45 AM

#

It's like a = 50 = b
a + b = 100
a + a = 100
2a = 100
a = 100/2 = 50

#

😭

quiet dust Aug 14, 2025, 11:46 AM

#

I provide evidence, and then they write, “No, these are different models!”
I provide more evidence, and again the response is: “No, these are different models!”
I'm already tired of this

ocean vortex Aug 14, 2025, 11:46 AM

#

gpt4.1-mini was the same base model as o4-mini. Though it had advantage over gpt5-mini-minimal in that it didn't have 'redundant' RL training for reasoning (when that is not being used)

quiet dust Aug 14, 2025, 11:47 AM

#

ocean vortex gpt4.1-mini was the same base model as o4-mini. Though it had advantage over gpt...

Okay, thanks for answering. I really didn't understand for a long time, there was a lot of confusion....

hollow imp Aug 14, 2025, 11:47 AM

#

ocean vortex gpt4.1-mini was the same base model as o4-mini. Though it had advantage over gpt...

Musk: ahh soo much confusion, guys just use supergrok and enjoy throw this gpt stuff out

obtuse heart Aug 14, 2025, 11:47 AM

#

quiet dust I provide evidence, and then they write, “No, these are different models!” I pro...

because they are bro

hollow imp Aug 14, 2025, 11:47 AM

#

😭

obtuse heart Aug 14, 2025, 11:47 AM

#

different variations

hollow imp Aug 14, 2025, 11:47 AM

#

😭🙏

quiet dust Aug 14, 2025, 11:48 AM

#

obtuse heart because they are bro

Okay, we heard you.

hollow imp Aug 14, 2025, 11:48 AM

#

Just accept gpt5 chat is the most smartest model guys

#

🔥

#

It is faster than wally west

tired herald Aug 14, 2025, 11:49 AM

#

no

#

literally the worst garbage ive seen in the past year

#

it cant even code well

obtuse heart Aug 14, 2025, 11:49 AM

#

tired herald it cant even code well

yes it can

quiet dust Aug 14, 2025, 11:49 AM

#

hollow imp It is faster than wally west

Better quality than speed

obtuse heart Aug 14, 2025, 11:50 AM

#

people mad that gpt5 is more professional now, and can no longer be their ai partner 💔

hollow imp Aug 14, 2025, 11:51 AM

#

People should go to c ai

quiet dust Aug 14, 2025, 11:51 AM

#

obtuse heart people mad that gpt5 is more professional now, and can no longer be their ai par...

On the PC version this can be fixed

tired herald Aug 14, 2025, 11:51 AM

#

obtuse heart yes it can

literally no

quiet dust Aug 14, 2025, 11:52 AM

#

hollow imp People should go to c ai

He is very bad

tired herald Aug 14, 2025, 11:52 AM

#

anything above 500 lines of code has the ai drop to its knees

obtuse heart Aug 14, 2025, 11:52 AM

#

tired herald literally no

i'm so sure you're asking it "gpt code me gta 6 in only html"

tired herald Aug 14, 2025, 11:52 AM

#

GPT 5 high is a bit better, but still garbage

tired herald Aug 14, 2025, 11:52 AM

#

obtuse heart i'm so sure you're asking it "gpt code me gta 6 in only html"

ofc dude

obtuse heart Aug 14, 2025, 11:52 AM

#

tired herald anything above 500 lines of code has the ai drop to its knees

its giving me 1000 lines of codes that work perfectly fine, so idk what youre on about

ocean vortex Aug 14, 2025, 11:52 AM

#

hollow imp Just accept gpt5 chat is the most smartest model guys

I'm kinda still waiting for artificialanalysis to test it. This should perform better than gpt5-minimal (confusing... But gpt5-chat is their only model of the series which is exclusively non-reasoning)

tired herald Aug 14, 2025, 11:53 AM

#

yeah, you ask it for "code me an unnecessarily large file that does nothing"

obtuse heart Aug 14, 2025, 11:53 AM

#

me when i ask gpt 5 to code me gta 6 in only html, then blame openai why it cant do that

#

joyful

tired herald Aug 14, 2025, 11:53 AM

#

333 lines of code, and it fails to patch a simple bug that I intentionally added

obtuse heart Aug 14, 2025, 11:54 AM

#

that screenshot sure proves a lot of things

#

im so sorry, i was wrong the whole time

tired herald Aug 14, 2025, 11:54 AM

#

Yes

obtuse heart Aug 14, 2025, 11:54 AM

#

I apologize DeNew779

tired herald Aug 14, 2025, 11:54 AM

#

What an intelligent user

obtuse heart Aug 14, 2025, 11:55 AM

#

1442 lines of code, and it perfectly did it one shot

#

see how dumb it sounds and looks

tired herald Aug 14, 2025, 11:55 AM

#

again, even a dog can code simple stuff that looks massive if you train it

obtuse heart Aug 14, 2025, 11:55 AM

#

tired herald again, even a dog can code simple stuff that looks massive if you train it

its not simple at all lol... it uses a lot of maths and calculations

tired herald Aug 14, 2025, 11:55 AM

#

code that in any way isnt self-contained and interacts with outside stuff is broken

obtuse heart Aug 14, 2025, 11:55 AM

#

physics and allat

#

roblox_shrug

tired herald Aug 14, 2025, 11:56 AM

#

im not sure what you are using

obtuse heart Aug 14, 2025, 11:56 AM

#

quality also depends on your prompts

tired herald Aug 14, 2025, 11:56 AM

#

but its definitely not what im using

#

mine forgets to put code in code blocks

obtuse heart Aug 14, 2025, 11:56 AM

#

tired herald im not sure what you are using

gpt-5 high

tired herald Aug 14, 2025, 11:56 AM

#

and spits out text code

obtuse heart Aug 14, 2025, 11:57 AM

#

ig gpt just doesnt like you then, cus ive never had that problem

#

start of skynet

hollow imp Aug 14, 2025, 11:57 AM

#

Gpt 5 high or Claude opus 4.1 for code

keen beacon Aug 14, 2025, 11:57 AM

#

They should have named it GTA instead of GPT so we'd make fun about the fact GTA 6 is never coming out

tired herald Aug 14, 2025, 11:57 AM

#

GPT 6 defo before GTA 6

obtuse heart Aug 14, 2025, 11:57 AM

#

hollow imp Gpt 5 high or Claude opus 4.1 for code

gpt 5 high to code, opus 4.1 for finding and debugging bugs

hollow imp Aug 14, 2025, 11:59 AM

#

tired herald GPT 6 defo before GTA 6

Gta 6 before deepseek r2

#

Guys tell me most popular use cases of using gpt 5 Pro, o3 pro, gemini deepthink

#

Exclude problem solving stuff

tired herald Aug 14, 2025, 12:01 PM

#

chatting

#

rp

hollow imp Aug 14, 2025, 12:01 PM

#

Rip

tired herald Aug 14, 2025, 12:01 PM

#

Roleplaying

hollow imp Aug 14, 2025, 12:01 PM

#

Isn't custom gpts perfect for that

ocean vortex Aug 14, 2025, 12:01 PM

#

obtuse heart gpt-5 high

"Juice: 200" in custom instructions. Redneck's gpt5-high

keen beacon Aug 14, 2025, 12:03 PM

#

hollow imp Guys tell me most popular use cases of using gpt 5 Pro, o3 pro, gemini deepthink

Analysis of chord progressions

tired herald Aug 14, 2025, 12:07 PM

#

music writing

#

Did you guys know that you can send system prompts on lmarena

#

Okay, so now onto the next feature for my extension

gritty sequoia Aug 14, 2025, 12:20 PM

#

Hi, i really want to fix an account for lmarena so i can save my chat can i do that ?

tired herald Aug 14, 2025, 12:20 PM

#

actually showing the files being sent on the message

tired herald Aug 14, 2025, 12:20 PM

#

gritty sequoia Hi, i really want to fix an account for lmarena so i can save my chat can i do t...

There are no accounts right now

gritty sequoia Aug 14, 2025, 12:20 PM

#

why

#

how can i save chat ?

tired herald Aug 14, 2025, 12:22 PM

#

its saved on device

#

so all chats on a pc will stay on that pc

gritty sequoia Aug 14, 2025, 12:22 PM

#

but for example if i use safari and delete history the chats will be gone

tired herald Aug 14, 2025, 12:23 PM

#

should be, yes

gritty sequoia Aug 14, 2025, 12:23 PM

#

ye but i want it to be saved like in a account you know

tired herald Aug 14, 2025, 12:26 PM

#

not possible as of now

flint sandal Aug 14, 2025, 12:30 PM

#

Idk why but for me 4 Sonnet 32K is better than Opus 4.1 16K, the code seems better and responses seem more natural overall. While for me gpt-5-high is on opus 4.1 level, sonnet 4 is crushing them both. It's my opinion.

#

And why isn't opus 4.1 thinking on the leaderboard?

willow grail Aug 14, 2025, 12:36 PM

#

deepseek r2 will be trash.
they simply dont have the nvidia gpus

tired herald Aug 14, 2025, 12:36 PM

#

#ai-creations message

#

hope the icon is better

normal abyss Aug 14, 2025, 12:52 PM

#

Is it a bug that gemini 2.5 pro doesnt use its thinking sometimes or does it just decide when it needs to???

torn mantle Aug 14, 2025, 1:06 PM

#

willow grail deepseek r2 will be trash. they simply dont have the nvidia gpus

i blame the chinese gov

calm sequoia Aug 14, 2025, 1:08 PM

#

Anyone seen the size approximations for the GPT5 base model?

modest prism Aug 14, 2025, 1:10 PM

#

willow grail deepseek r2 will be trash. they simply dont have the nvidia gpus

R1 was already trash imo.

willow grail Aug 14, 2025, 1:22 PM

#

torn mantle i blame the chinese gov

ofc JUST PICK ON THE RICH ONES UUUUUU BULLY

willow grail Aug 14, 2025, 1:22 PM

#

modest prism R1 was already trash imo.

yeah like all other chinese robots they are making just for propaganda how GREAT CHINA IS

#

the only great thing china does is their nice cargo and bulk ships

#

:3

#

i love big vehicles :3 rawr rawr furry uwu

#

https://tenor.com/lRNt1fmElbr.gif

Tenor

ocean vortex Aug 14, 2025, 1:25 PM

#

flint sandal Idk why but for me 4 Sonnet 32K is better than Opus 4.1 16K, the code seems bett...

You need Haiku reasoning then

#

or gpt5-mini-high which does exist

#

gpt5-mini-high 'juice' is 256

#

So higher than gpt5-high

fallen swift Aug 14, 2025, 1:28 PM

#

💀

willow grail Aug 14, 2025, 1:34 PM

#

fallen swift 💀

i like furries 🍸

fallen swift Aug 14, 2025, 1:35 PM

#

huh

#

i didnt even talk a single bit about furries

willow grail Aug 14, 2025, 1:35 PM

#

i mean better than saying you are a loser.
you are not tho

fallen swift Aug 14, 2025, 1:36 PM

#

i forgot did i mention i might be a furry on my carrd i forgot

willow grail Aug 14, 2025, 1:36 PM

#

https://giphy.com/gifs/RNXhdtXRmkv1wJ7gOK

Giphy

fallen swift Aug 14, 2025, 1:37 PM

#

????

#

what are you on about

willow grail Aug 14, 2025, 1:37 PM

#

a third time would be a charm!

fallen swift Aug 14, 2025, 1:37 PM

#

so youre confusing me on purpose arent you

willow grail Aug 14, 2025, 1:37 PM

#

https://tenor.com/tOy7HwdMVQv.gif

Tenor

fallen swift Aug 14, 2025, 1:37 PM

#

that face says it all

#

but yeah i like furries too

#

hey man dont be weird

willow grail Aug 14, 2025, 1:41 PM

#

https://i.imgur.com/k4OqNoe.png

Imgur

echo aurora Aug 14, 2025, 1:42 PM

#

normal abyss Is it a bug that gemini 2.5 pro doesnt use its thinking sometimes or does it jus...

Hey we're actually looking into an issue that it sounds like you're experiencing, I'm having trouble reproing but can you share more details about what you're experiencing here: #1395438935676817428

willow grail Aug 14, 2025, 1:44 PM

#

fallen swift hey man dont be weird

how old are ya?

fallen swift Aug 14, 2025, 1:44 PM

#

18

willow grail Aug 14, 2025, 1:44 PM

#

do u know masenko or ratgrave or vrchat?

fallen swift Aug 14, 2025, 1:44 PM

#

i dont know two of these but i do know about vrchat

willow grail Aug 14, 2025, 1:45 PM

#

/afk

echo aurora Aug 14, 2025, 1:45 PM

#

let's try to keep conversation related to AI please

fallen swift Aug 14, 2025, 1:46 PM

#

echo aurora let's try to keep conversation related to AI please

ngl i kinda want an off topic channel

#

but the thing i want to say is

#

people are having limits of 0 videos

tired herald Aug 14, 2025, 1:51 PM

#

what

keen beacon Aug 14, 2025, 1:57 PM

#

fallen swift people are having limits of 0 videos

oof

tired herald Aug 14, 2025, 1:57 PM

#

huh

#

no way right

echo aurora Aug 14, 2025, 1:58 PM

#

fallen swift people are having limits of 0 videos

This has been flagged to the team. It's pretty strange as it's not happening to everyone. Some have the standard 8, others have 3, some have 0.

#

We're looking into it, though.

unborn lantern Aug 14, 2025, 2:08 PM

#

For me it's 0

trim lantern Aug 14, 2025, 2:12 PM

#

Man , I literally cry when all my saved memory gets reset 😭

tired herald Aug 14, 2025, 2:15 PM

#

wdym

stray aspen Aug 14, 2025, 2:16 PM

#

what

trim lantern Aug 14, 2025, 2:16 PM

#

tired herald wdym

Chat history gets reset after every update, any way to prevent that?

stray aspen Aug 14, 2025, 2:16 PM

#

wdym wdym

stray aspen Aug 14, 2025, 2:16 PM

#

trim lantern Chat history gets reset after every update, any way to prevent that?

no

#

there are no accounts

tired herald Aug 14, 2025, 2:17 PM

#

trim lantern Chat history gets reset after every update, any way to prevent that?

oh damn

echo aurora Aug 14, 2025, 2:18 PM

#

trim lantern Man , I literally cry when all my saved memory gets reset 😭

We're sorry for this, it's very understandable how frustrating this can be. We are looking into features that'll help with this issue.

keen beacon Aug 14, 2025, 2:19 PM

#

are we allowed to ping @ pineapple?

tired herald Aug 14, 2025, 2:20 PM

#

yes

trim lantern Aug 14, 2025, 2:20 PM

#

echo aurora We're sorry for this, it's very understandable how frustrating this can be. We ...

Thanks team, btw this app is a banger , love all the things about this but promise you guys won't charge us in the future 🫠

barren prairie Aug 14, 2025, 2:20 PM

#

willow grail deepseek r2 will be trash. they simply dont have the nvidia gpus

No, it won't be trash. DeepSeek won't release any model before getting good, respectable results (that's the reason for their delays)

echo aurora Aug 14, 2025, 2:20 PM

#

trim lantern Thanks team, btw this app is a banger , love all the things about this but promi...

heartthrow

stray aspen Aug 14, 2025, 2:21 PM

#

willow grail deepseek r2 will be trash. they simply dont have the nvidia gpus

they have great engineers tho

keen beacon Aug 14, 2025, 2:22 PM

#

@echo aurora in battle mode, a vision-enabled and a no-vision model may be selected. But we still have the option to add in an image. How is this handled by the app? Just wanted to know

tired herald Aug 14, 2025, 2:22 PM

#

the image is sent to the vision enabled, and not to the no-vision

glacial torrent Aug 14, 2025, 2:23 PM

#

Can, anyone please explain how many generation limits everyday?

stray aspen Aug 14, 2025, 2:23 PM

#

glacial torrent Can, anyone please explain how many generation limits everyday?

i think 5 for claude

#

idk the rest

keen beacon Aug 14, 2025, 2:23 PM

#

tired herald the image is sent to the vision enabled, and not to the no-vision

then how does the LLM with no vision answer to the query i sent..?

keen beacon Aug 14, 2025, 2:24 PM

#

stray aspen idk the rest

i remember around 20-25 messages for o3 in an hour

tired herald Aug 14, 2025, 2:24 PM

#

keen beacon then how does the LLM with no vision answer to the query i sent..?

it only sees the text you send/ it gets no input and gives gibberish

#

or atleast I think so

keen beacon Aug 14, 2025, 2:24 PM

#

tired herald it only sees the text you send/ it gets no input and gives gibberish

no but it actually reads the info!?

keen beacon Aug 14, 2025, 2:24 PM

#

tired herald it only sees the text you send/ it gets no input and gives gibberish

only sometimes this happens

tired herald Aug 14, 2025, 2:24 PM

#

keen beacon no but it actually reads the info!?

if possible, can you tell me what you are comparing

#

which models

#

ill go test it out

keen beacon Aug 14, 2025, 2:25 PM

#

tired herald which models

i dont specifically remember actually... tho when i select those exact models in side-by-side chat, the image input gets disabled

willow grail Aug 14, 2025, 2:26 PM

#

stray aspen they have great engineers tho

which needs decades to create something without help of machine learning.

tired herald Aug 14, 2025, 2:26 PM

#

I see

tired herald Aug 14, 2025, 2:27 PM

#

keen beacon i dont specifically remember actually... tho when i select those exact models in...

then i really dont know, sorry

keen beacon Aug 14, 2025, 2:28 PM

#

also I had another query.. in battle mode, if i send a math query, and lets say I get a wrong answer from one of the models (model B). I vote for the model which answered correctly.

In the next query in the same chat, is there a chance the new model selected in place of model B gets affected by the previous wrong response?

keen beacon Aug 14, 2025, 2:28 PM

#

tired herald then i really dont know, sorry

no problem its okay

echo aurora Aug 14, 2025, 2:31 PM

#

keen beacon @echo aurora in battle mode, a vision-enabled and a no-vision model...

When an image is uploaded the models available (via battle) will only sample from models that have that capability.
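The sampling rule described above can be sketched roughly as follows. This is only an illustration, not LMArena's actual code: the model names, the `vision` flag, and the `sample_battle_pair` helper are all made up for the example.

```python
import random

# Hypothetical registry: the names and capability flags are invented for illustration.
MODELS = {
    "model-a": {"vision": True},
    "model-b": {"vision": False},
    "model-c": {"vision": True},
    "model-d": {"vision": False},
}


def sample_battle_pair(has_image, rng=random):
    """Pick two distinct models; with an image, only vision-capable models are eligible."""
    eligible = [
        name for name, caps in MODELS.items()
        if caps["vision"] or not has_image
    ]
    if len(eligible) < 2:
        raise ValueError("need at least two eligible models")
    return tuple(rng.sample(eligible, 2))


if __name__ == "__main__":
    print(sample_battle_pair(has_image=True))   # always two vision-capable models
    print(sample_battle_pair(has_image=False))  # any two distinct models
```

Under this sketch the filter only applies when the pair is drawn, so it does not cover what should happen when an image arrives after the pair is already chosen.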

keen beacon Aug 14, 2025, 2:31 PM

#

echo aurora When an image is uploaded the models available (via battle) will only sample fro...

If i send the image after the models have been selected then what happens?

echo aurora Aug 14, 2025, 2:33 PM

#

keen beacon If i send the image after the models have been selected then what happens?

Oh I see now. That's a good question. Let me check with the team and followup with you.

keen beacon Aug 14, 2025, 2:33 PM

#

echo aurora Oh I see now. That's a good question. Let me check with the team and followup wi...

thank you. i meant something like this:

echo aurora Aug 14, 2025, 2:33 PM

#

keen beacon thank you. i meant something like this:

Yeah I'm following now.

keen beacon Aug 14, 2025, 2:34 PM

#

In the above case tho, something weird happened.

echo aurora Aug 14, 2025, 2:34 PM

#

keen beacon also I had another query.. in battle mode, if i send a math query, and lets say ...

Regarding this question this has been a topic of a lot of debate internally. I'll be sure to raise again as it's an important question.

keen beacon Aug 14, 2025, 2:35 PM

#

echo aurora Regarding this question this has been a topic of a lot of debate internally. I'l...

Alright! Thank you

scenic crypt Aug 14, 2025, 2:37 PM

#

@echo aurora

echo aurora Aug 14, 2025, 2:37 PM

#

keen beacon In the above case tho, something weird happened.

So I had a different experience, which could be our answer:

I did something similar: "hey there" ... upload image ... vote. In my case, when I voted, both models were the same for the first and second responses. Since that's different from what you're seeing, I think that in your case the two models you originally had (or one of them) don't have that capability, so when you uploaded the image two new models were selected. Lots of assumptions on my part, but regardless I'm going to double-check.

keen beacon Aug 14, 2025, 2:38 PM

#

echo aurora So I had a different experience, which could be our answer: I did something si...

Yes! I think that is the case...

echo aurora Aug 14, 2025, 2:38 PM

#

Yup! This is for sure a bug that we're going to look into and fix. My apologies for the inconvenience.

#

Good to know, I'll also raise. blobthanks

tired herald Aug 14, 2025, 2:45 PM

#

how weird

#

lmarena is being very weird with packets right now

keen beacon Aug 14, 2025, 2:46 PM

#

Also @echo aurora i had been wondering does lmarena sanitize the chats before publishing their datasets on huggingface? Sanitize as in removing any personal information or is it solely the user's responsibility to not share any info that they deem personal?

keen beacon Aug 14, 2025, 2:46 PM

#

tired herald lmarena is being very weird with packets right now

packets?

tired herald Aug 14, 2025, 2:47 PM

#

internet

#

im working on an extension

keen beacon Aug 14, 2025, 2:47 PM

#

hm?

tired herald Aug 14, 2025, 2:47 PM

#

for being able to add files to your messages

keen beacon Aug 14, 2025, 2:49 PM

#

tired herald for being able to add files to your messages

oh wow! will you be sharing it on the chrome extension store?

echo aurora Aug 14, 2025, 2:49 PM

#

keen beacon Also @echo aurora i had been wondering does lmarena sanitize the chats ...

Yes, absolutely. We do aggressively filter out PII before releasing.

echo aurora Aug 14, 2025, 2:49 PM

#

keen beacon oh wow! will you be sharing it on the chrome extension store?

big +1!

echo aurora Aug 14, 2025, 2:49 PM

#

tired herald lmarena is being very weird with packets right now

Can you share a bit more?

keen beacon Aug 14, 2025, 2:50 PM

#

alright...

keen beacon Aug 14, 2025, 2:50 PM

#

echo aurora Can you share a bit more?

+1!

tired herald Aug 14, 2025, 2:50 PM

#

echo aurora Can you share a bit more?

its rejecting all create-evaluation (starting the chat) but accepting the post-evaluation (continuation of chat)

keen beacon Aug 14, 2025, 2:50 PM

#

having the ability to share pdfs would do wonders

tired herald Aug 14, 2025, 2:50 PM

#

atleast when im adding my files

tired herald Aug 14, 2025, 2:50 PM

#

keen beacon having the ability to share pdfs would do wonders

in the works, maybe soon

keen beacon Aug 14, 2025, 2:51 PM

#

r u reverse engineering the message sending??

tired herald Aug 14, 2025, 2:51 PM

#

already have

#

i havent changed the logic yet im being constantly rejected

#

before it worked

tired herald Aug 14, 2025, 2:51 PM

#

keen beacon r u reverse engineering the message sending??

#ai-creations message

keen beacon Aug 14, 2025, 2:52 PM

#

tired herald before it worked

hmmm

tired herald Aug 14, 2025, 2:52 PM

#

im slowly losing my sanity

#

im sending 1:1 the same thing as before but now its broken

keen beacon Aug 14, 2025, 2:53 PM

#

tired herald https://discord.com/channels/1340554757349179412/1344733249628541099/14055305627...

wow thats fire

tired herald Aug 14, 2025, 2:53 PM

#

#ai-creations message

keen beacon Aug 14, 2025, 2:53 PM

#

tired herald im sending 1:1 the same thing as before but now its broken

hmm maybe the server is not accepting the packets now (maybe their structure changed or smtn)

tired herald Aug 14, 2025, 2:54 PM

#

changed in the last few hours 😭

keen beacon Aug 14, 2025, 2:54 PM

#

hmm did you try checking the network tab's response>>?

#

(you probably did)

tired herald Aug 14, 2025, 2:56 PM

#

im going to commit life erasure

#

now its working

keen beacon Aug 14, 2025, 2:56 PM

#

lol great

#

can you share the thing already plss

tired herald Aug 14, 2025, 2:56 PM

#

no way im sharing it when it has severe mental illnesses like this

#

im not gonna embarrass myself

keen beacon Aug 14, 2025, 2:57 PM

#

in DM?

grand panther Aug 14, 2025, 2:57 PM

#

Hey guys! I wanted to know if there's any way to "bypass" the chat so I can post "inappropriate content"...
or any other way to get Grok 3 and 4 completely free and without limits.

tired herald Aug 14, 2025, 2:57 PM

#

keen beacon in DM?

ill finish the basic stuff and then ill give it to you

keen beacon Aug 14, 2025, 2:57 PM

#

tired herald ill finish the basic stuff and then ill give it to you

sent a req

tired herald Aug 14, 2025, 2:57 PM

#

accepted

#

but I know the issue

#

its the file im sending that has bad data

keen beacon Aug 14, 2025, 2:58 PM

#

tired herald its the file im sending that has bad data

ahhh

keen beacon Aug 14, 2025, 2:58 PM

#

grand panther Hey guys! I wanted to know if there's any way to "bypass" the chat so I can post...

i think the moderation is server-side so

tired herald Aug 14, 2025, 2:59 PM

#

WAIT

#

I KNOW HOW TO FIX THIS

echo aurora Aug 14, 2025, 2:59 PM

#

stray aspen Aug 14, 2025, 2:59 PM

#

@tired herald how did you attach a python file

echo aurora Aug 14, 2025, 3:00 PM

#

echo aurora

I'm going to be running this poll periodically, we'd love to understand better why. Please share in the thread!!

tired herald Aug 14, 2025, 3:00 PM

#

stray aspen @tired herald how did you attach a python file

extension, ill release later on

echo aurora Aug 14, 2025, 3:00 PM

#

echo aurora

scenic crypt Aug 14, 2025, 3:00 PM

#

@echo aurora
how much I fix the problem

echo aurora Aug 14, 2025, 3:00 PM

#

grand panther Hey guys! I wanted to know if there's any way to "bypass" the chat so I can post...

There is not.

grand panther Aug 14, 2025, 3:02 PM

#

echo aurora There is not.

So apparently I'll have to suffer waiting every 2 hours to put together a gore story lol

tired herald Aug 14, 2025, 3:02 PM

#

yes

#

or go to aistudio from google

eternal niche Aug 14, 2025, 3:03 PM

#

btw gpt5 sucks

echo aurora Aug 14, 2025, 3:03 PM

#

Our team is on it blobfingerguns

scenic crypt Aug 14, 2025, 3:04 PM

#

echo aurora Our team is on it blobfingerguns

good

stray aspen Aug 14, 2025, 3:04 PM

#

eternal niche btw gpt5 sucks

its great lol

#

stop ragebaiting

tired herald Aug 14, 2025, 3:04 PM

#

not fixed

keen beacon Aug 14, 2025, 3:05 PM

#

stray aspen stop ragebaiting

agreed

keen beacon Aug 14, 2025, 3:06 PM

#

grand panther So apparently I'll have to suffer waiting every 2 hours to put together a gore s...

yup

keen beacon Aug 14, 2025, 3:06 PM

#

echo aurora

wait why am i the only one sleecting battle

echo aurora Aug 14, 2025, 3:07 PM

#

Reminder that on discord you're able to block other people if you're not a fan!

echo aurora Aug 14, 2025, 3:07 PM

#

keen beacon wait why am i the only one sleecting battle

blobextranervous

keen beacon Aug 14, 2025, 3:07 PM

#

echo aurora blobextranervous

i actually find 3-4 math q, then i solve them myself, then verify with gpt-5 high
then send at least 5-10 times in battle mode

echo aurora Aug 14, 2025, 3:07 PM

#

keen beacon wait why am i the only one sleecting battle

Hmm what you mean? I'm seeing side-by-side & direct. Can you send a screenshot?

keen beacon Aug 14, 2025, 3:07 PM

#

they're actually just olympiad level

willow grail Aug 14, 2025, 3:08 PM

#

PINE AND APPLE

U NEED TO BAN GPT5 HATERS like @eternal niche

keen beacon Aug 14, 2025, 3:08 PM

#

echo aurora Hmm what you mean? I'm seeing side-by-side & direct. Can you send a screenshot?

*selecting battle... everyone else selected direct chat - thats what I meant sorry

keen beacon Aug 14, 2025, 3:08 PM

#

willow grail PINE AND APPLE U NEED TO BAN GPT5 HATERS like @eternal niche

yo chill

willow grail Aug 14, 2025, 3:08 PM

#

they goes on my nerves with their irrational yapping

keen beacon Aug 14, 2025, 3:08 PM

#

willow grail they goes on my nerves with their irrational yapping

i kinda agree with you but freedom of speech

willow grail Aug 14, 2025, 3:08 PM

#

keen beacon i kinda agree with you but freedom of speech

FOS doesnt mean you can hate on my babe gpt5

echo aurora Aug 14, 2025, 3:08 PM

#

willow grail they goes on my nerves with their irrational yapping

I'd recommend blocking them! I'm not here to police opinions.

keen beacon Aug 14, 2025, 3:09 PM

#

willow grail FOS doesnt mean you can hate on my babe gpt5

i use it a lot myself 😂 (i hated gpt-4o's emojis)

keen beacon Aug 14, 2025, 3:09 PM

#

echo aurora I'd recommend blocking them! I'm not here to police opinions.

yeah

willow grail Aug 14, 2025, 3:09 PM

#

pineapple i am joking. you are so serious lol

keen beacon Aug 14, 2025, 3:09 PM

#

willow grail pineapple i am joking. you are so serious lol

nah he/she/they just mature

echo aurora Aug 14, 2025, 3:09 PM

#

willow grail pineapple i am joking. you are so serious lol

oh

willow grail Aug 14, 2025, 3:09 PM

#

yes. like myself. very mature

keen beacon Aug 14, 2025, 3:09 PM

#

willow grail yes. like myself. very mature

sure @willow grail

#

@echo aurora is there a way we could know the rate limits of every model and henceforth use them wisely (not wasting credits)

echo aurora Aug 14, 2025, 3:12 PM

#

keen beacon @echo aurora is there a way we could know the rate limits of every mode...

I did raise this to the team recently, it's possible just would need some thoughts around it and prioritize developing it.

keen beacon Aug 14, 2025, 3:12 PM

#

and also could we add the option for choosing system prompts in direct chat?
somewhat like the gems in gemini.google.com

I have a few instructions like "use latex" or "do not solve the given question, only give it as text w/ latex" or "judge my solution and provide it marks", etc

keen beacon Aug 14, 2025, 3:12 PM

#

echo aurora I did raise this to the team recently, it's possible just would need some though...

right....
also can we bring back repo-chat

echo aurora Aug 14, 2025, 3:13 PM

#

keen beacon right.... also can we bring back repo-chat

Also possible! Something I've also shared with the team recently(ish).

keen beacon Aug 14, 2025, 3:14 PM

#

i believe i am asking wayy too many questions

echo aurora Aug 14, 2025, 3:14 PM

#

Would encourage you to check out #1372230675914031105 , some of these requests are on our radar + it helps us organize these requests better.

echo aurora Aug 14, 2025, 3:14 PM

#

keen beacon i believe i am asking wayy too many questions

That's okay! nothing wrong with that.

keen beacon Aug 14, 2025, 3:14 PM

#

echo aurora Would encourage you to check out #1372230675914031105 , some of these requests...

sure!

keen beacon Aug 14, 2025, 3:17 PM

#

keen beacon and also could we add the option for choosing system prompts in direct chat? som...

i see this has already been posted in the feedback forum

eternal niche Aug 14, 2025, 3:19 PM

#

willow grail PINE AND APPLE U NEED TO BAN GPT5 HATERS like @eternal niche

freedom of speech

spare rune Aug 14, 2025, 3:21 PM

#

pineapple

#

are you

#

a

#

pineapple

keen beacon Aug 14, 2025, 3:21 PM

#

eternal niche freedom of speech

i take back my statement now. no FOS

keen beacon Aug 14, 2025, 3:21 PM

#

spare rune pineapple

duh! ofc he is 😂

echo aurora Aug 14, 2025, 3:21 PM

#

spare rune pineapple

greg1 greg2

keen beacon Aug 14, 2025, 3:22 PM

#

echo aurora greg1 greg2

thats so cute ngl

spare rune Aug 14, 2025, 3:22 PM

#

pineaserox

#

pineapple rex

#

im not good at coming up at nimes

#

names

echo aurora Aug 14, 2025, 3:22 PM

#

both are good

keen beacon Aug 14, 2025, 3:22 PM

#

pineaserous-rex

#

nono pineasaurous-rex

#

yeah

spare rune Aug 14, 2025, 3:22 PM

#

yeah

keen beacon Aug 14, 2025, 3:22 PM

#

hehe

#

pineappie-rex?

echo aurora Aug 14, 2025, 3:28 PM

#

keen beacon Yes! I think that is the case...

Got confirmation this is whats happening blobthumbsup

#

This is a bug, our team is looking into asap.

#

I am sorry you're running into this.

cedar tide Aug 14, 2025, 3:33 PM

#

https://huggingface.co/google/gemma-3-270m

google/gemma-3-270m · Hugging Face

eternal niche Aug 14, 2025, 3:38 PM

#

who

echo aurora Aug 14, 2025, 3:38 PM

#

Lets move on please blobthanks

true condor Aug 14, 2025, 3:43 PM

#

Anyone did Nano Banana comparison?

trail creek Aug 14, 2025, 3:50 PM

#

true condor Anyone did Nano Banana comparison?

better than gpt in consistency and designs. it will be great for visual novel authors and help them work fast.

#

tho its also very bad at redrawing/enhancing opposite to gpt

keen beacon Aug 14, 2025, 3:51 PM

#

Calm down.

#

Alrighty then.

#

Lol

stray aspen Aug 14, 2025, 3:52 PM

#

@keen beacon

#

grok 4 or gpt-5 high for coding

eternal niche Aug 14, 2025, 3:52 PM

#

gemini 2.5 pro

stray aspen Aug 14, 2025, 3:52 PM

#

lol

#

what

#

gemini sucks

#

its literally obsolete

keen beacon Aug 14, 2025, 3:52 PM

#

eternal niche gemini 2.5 pro

Meh.

true condor Aug 14, 2025, 3:53 PM

#

trail creek better than gpt in consistency and designs. it will be great for visual novel au...

Thx I saw several single comparative cases, so not sure

eternal niche Aug 14, 2025, 3:56 PM

#

who cares

keen beacon Aug 14, 2025, 3:57 PM

#

Too far away

#

Needs to have one in Europe

stray aspen Aug 14, 2025, 4:01 PM

#

no

pure falcon Aug 14, 2025, 4:02 PM

#

@echo aurora
Any clue if we’re getting a new leaderboard update soon? Can’t wait to see how GPT-5-chat does!

echo aurora Aug 14, 2025, 4:02 PM

#

pure falcon @echo aurora Any clue if we’re getting a new leaderboard update soon? ...

Shouldn't be too much longer!

candid storm Aug 14, 2025, 4:08 PM

#

echo aurora Shouldn't be too much longer!

Today?

pure falcon Aug 14, 2025, 4:11 PM

#

🤞

tired herald Aug 14, 2025, 4:13 PM

#

keen beacon and also could we add the option for choosing system prompts in direct chat? som...

I can do that already lol

pure falcon Aug 14, 2025, 4:13 PM

#

Hope so. The vague-posting is so frustrating!!

tired herald Aug 14, 2025, 4:13 PM

#

should I add that to my extension?

eternal niche Aug 14, 2025, 4:13 PM

#

what extension

tired herald Aug 14, 2025, 4:15 PM

#

im making an extension for lmarena

#

to allow adding files like code files

#

its pretty good yes

eternal niche Aug 14, 2025, 4:26 PM

#

yes gemini 3 - SOTA

tired herald Aug 14, 2025, 4:26 PM

#

idk

#

ive never used toad before

stray aspen Aug 14, 2025, 4:26 PM

#

ROFL

tired herald Aug 14, 2025, 4:27 PM

#

im only here to improve my own experience with lmarena

stray aspen Aug 14, 2025, 4:27 PM

#

slide the extension

tired herald Aug 14, 2025, 4:28 PM

#

🙂

#

wait patiently

stray aspen Aug 14, 2025, 4:29 PM

#

alrighty

tired herald Aug 14, 2025, 4:29 PM

#

though ive got most features working well

#

nope, i just use direct models

#

is it really that good?

drifting thorn Aug 14, 2025, 4:34 PM

#

By far, what’s the SOTA model for tool-calling?

#

Isn’t that SillyTavern’s work(

#

Gonna try it in my n8n workflow

pure falcon Aug 14, 2025, 4:36 PM

#

echo aurora Shouldn't be too much longer!

I don’t want you to give away secrets, but can you give us a smallll hint? Sometime next week you think?

echo aurora Aug 14, 2025, 4:38 PM

#

pure falcon I don’t want you to give away secrets, but can you give us a smallll hint? Somet...

I can say that it'll for sure happen at some point in the future.

#

blobslightsmile

tired herald Aug 14, 2025, 4:39 PM

#

lets see

#

just gave it a pretty complicated request

spring turtle Aug 14, 2025, 4:41 PM

#

I have a question... Is the message that says "daily video limit of 0 videos" a mistake, or is it different from the 8 video limit message (as if it were one of those AI tools with one-time credits)?

amber warren Aug 14, 2025, 4:42 PM

#

spring turtle I have a question... Is the message that says "daily video limit of 0 videos" a ...

this is a bug we are looking into!

keen beacon Aug 14, 2025, 4:42 PM

#

tired herald lets see

what models are those?

#

toad and dino, I mean

tired herald Aug 14, 2025, 4:42 PM

#

s

spring turtle Aug 14, 2025, 4:42 PM

#

@amber warren thanks, so im safe...

tired herald Aug 14, 2025, 4:43 PM

#

well, I think toad died on me

keen beacon Aug 14, 2025, 4:44 PM

#

tired herald well, I think toad died on me

toad from mario games only come to my mind every time, lol

tired herald Aug 14, 2025, 4:44 PM

#

😭

worldly osprey Aug 14, 2025, 4:44 PM

#

Hello

tired herald Aug 14, 2025, 4:44 PM

#

hi

keen beacon Aug 14, 2025, 4:44 PM

#

worldly osprey Hello

Hello.

tired herald Aug 14, 2025, 4:48 PM

#

no, the other model finished

#

and apparently toad finished without giving an answer

#

so its loading inf

echo aurora Aug 14, 2025, 4:50 PM

#

Since it's a model that's behind a codename it's only accessible through Battle mode, meaning you won't be able to select it.

spring turtle Aug 14, 2025, 4:53 PM

#

Also, is this also a bug?

keen beacon Aug 14, 2025, 4:53 PM

#

hello, I used the ai battle option and uploaded an image and gave a prompt to edit it, there was a model named nano banana but it wasn't in the model list when I want to do side by side or single ai

tired herald Aug 14, 2025, 4:55 PM

#

claude opus 4.1 or gpt 5 high

stray aspen Aug 14, 2025, 4:55 PM

#

gpt 5 high

tired herald Aug 14, 2025, 4:55 PM

#

wdym

stray aspen Aug 14, 2025, 4:56 PM

#

its also very limited

tired herald Aug 14, 2025, 4:56 PM

#

no, theres i think a limit of 5 messages

#

either 5 or 10

#

none

#

none

#

its pretty good

stray aspen Aug 14, 2025, 4:57 PM

#

thats great

tired herald Aug 14, 2025, 4:58 PM

#

step 3 is very chinese

#

I really wonder, should I add the ability to add custom system prompts

stray aspen Aug 14, 2025, 5:00 PM

#

what the hell

pure falcon Aug 14, 2025, 5:03 PM

#

candid storm Today?

Not today, it seems 🙁

They usually announce updates by now

keen beacon Aug 14, 2025, 5:12 PM

#

https://www.reddit.com/r/Bard/comments/1mq5l83/_/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button

From the Bard community on Reddit: 👀

#

👀

#

dont know if these are some updated versions

#

but saw on reddit

stray aspen Aug 14, 2025, 5:13 PM

#

its a nice model

#

#

just tested it

keen beacon Aug 14, 2025, 5:14 PM

#

stray aspen

did they just change the names?

#

not sure if these are the same ones with different naming now

obtuse heart Aug 14, 2025, 5:15 PM

#

stray aspen

holy poop

stray aspen Aug 14, 2025, 5:15 PM

#

idk

#

#

this is imagen ultra

obtuse heart Aug 14, 2025, 5:15 PM

#

that can definitely fool more people

tired herald Aug 14, 2025, 5:16 PM

#

stray aspen this is imagen ultra

its mostly good, yes

echo aurora Aug 14, 2025, 5:16 PM

#

oooo which model is this?

#

oh opus 4.1 you mentioned

tired herald Aug 14, 2025, 5:16 PM

#

[FILES_START]
{"files":[{"name":"test.py","size":7,"mime":"text/x-python","content":"Nothing","truncated":false}]}
[FILES_END]

#

this is how the files are sent
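A minimal sketch of how a block in that shape could be produced, for anyone curious. Only the delimiters and the JSON field names come from the message above; the `build_files_block` helper is hypothetical, and treating `size` as the UTF-8 byte count of `content` is an assumption (it matches the sample, where "Nothing" is 7 bytes).

```python
import json


def build_files_block(files):
    """Build the delimited text block from a list of (name, mime, content) tuples."""
    entries = [
        {
            "name": name,
            # Assumption: size is the UTF-8 byte length of the content.
            "size": len(content.encode("utf-8")),
            "mime": mime,
            "content": content,
            "truncated": False,
        }
        for name, mime, content in files
    ]
    # Compact separators reproduce the whitespace-free JSON seen in the sample.
    payload = json.dumps({"files": entries}, separators=(",", ":"))
    return f"[FILES_START]\n{payload}\n[FILES_END]"


if __name__ == "__main__":
    print(build_files_block([("test.py", "text/x-python", "Nothing")]))
```

Since the whole thing is plain text inside the message, the model sees the file as part of the prompt rather than as a real attachment.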

#

very cool

keen beacon Aug 14, 2025, 5:20 PM

#

tired herald I can do that already lol

howww?

keen beacon Aug 14, 2025, 5:20 PM

#

tired herald this is how the files are sent

it workss?!

keen beacon Aug 14, 2025, 5:21 PM

#

echo aurora Got confirmation this is whats happening blobthumbsup

great!

tired herald Aug 14, 2025, 5:22 PM

#

keen beacon it workss?!

yes

tired herald Aug 14, 2025, 5:22 PM

#

keen beacon howww?

🙂

#

ill make a vid of it in a few hours when I have it done with a button to customize it and all

#

:))))))

#

find whats wrong

#

Huh

#

Im not doing this for gpt 5

#

I dont even use it

stray aspen Aug 14, 2025, 5:28 PM

#

@tired herald

#

add pdf support

tired herald Aug 14, 2025, 5:28 PM

#

😭

#

PDFs are prob not possible for me, they'd have to be implemented into the very LMArena

#

But ill try

stray aspen Aug 14, 2025, 5:29 PM

#

whos funding your project

tired herald Aug 14, 2025, 5:30 PM

#

No one

#

My boredom*

stray aspen Aug 14, 2025, 5:31 PM

#

skibidi toilet

tired herald Aug 14, 2025, 5:31 PM

#

Real

zealous anchor Aug 14, 2025, 5:45 PM

#

How do I access nano banana? Not sure which tab it's under

echo aurora Aug 14, 2025, 5:46 PM

#

zealous anchor How do I access nano banana? Not sure which tab it's under

All models that are using a codename can only be used in the Battle mode. You won't be able to select them from the drop down in Direct & Side-by-side.

zealous anchor Aug 14, 2025, 5:46 PM

#

Ah.

echo aurora Aug 14, 2025, 5:47 PM

#

echo aurora

alarm ✨ alarm ✨ alarm ✨ Bumping this poll btw!!! alarm ✨ alarm ✨ alarm ✨

weak nova Aug 14, 2025, 5:47 PM

#

zealous anchor How do I access nano banana? Not sure which tab it's under

Uploading images for editing seems to help, and/or more complex/uncommon queries (tho i could be imagining it)

gentle plinth Aug 14, 2025, 5:48 PM

#

btw why is it possible to vote for models in side-by-side? shouldnt this only be possible if we dont know the models name

#

or is it just a ui bug, and votes dont count anyway

weak nova Aug 14, 2025, 5:48 PM

#

gentle plinth btw why is it possible to vote for models in side-by-side? shouldnt this only be...

Maybe it's weighted accordingly

gentle plinth Aug 14, 2025, 5:49 PM

#

gentle plinth or is it just a ui bug, and votes dont count anyway

(which i am not so sure about because i got a cloudflare captcha before voting)

zealous anchor Aug 14, 2025, 5:49 PM

#

It apparently doesn't know what a Klingon is

weak nova Aug 14, 2025, 5:50 PM

#

It's bad at a lot of things other models are good at, but it blows everything else out of the water at what it's good at

#

The way you instruct it matters a lot too

terse shuttle Aug 14, 2025, 5:53 PM

#

I'm dying to ask, but it's kinda personal. (i think)😭

echo aurora Aug 14, 2025, 5:58 PM

#

Poll results: What version do you use the most?

Winning answer: Direct (14 of 22 votes)

gentle plinth Aug 14, 2025, 6:01 PM

#

echo aurora

do the web-analytics agree with this one?

eternal niche Aug 14, 2025, 6:03 PM

#

tired herald I dont even use it

my bro

echo aurora Aug 14, 2025, 6:04 PM

#

gentle plinth do the web-analytics agree with this one?

I wouldn't be able to say blobzippermouth

flint sandal Aug 14, 2025, 6:20 PM

#

Grok can now do corn videos in $30 plan...

keen beacon Aug 14, 2025, 6:21 PM

#

flint sandal Grok can now do corn videos in $30 plan...

what?

#

grok can make videos now?

keen beacon Aug 14, 2025, 6:21 PM

#

tired herald :))))))

theres a new attachment button?

stray aspen Aug 14, 2025, 6:21 PM

#

flint sandal Grok can now do corn videos in $30 plan...

wdym

#

it doesnt have censorship?

eternal niche Aug 14, 2025, 6:21 PM

#

keen beacon theres a new attachment button?

keen beacon Aug 14, 2025, 6:22 PM

#

eternal niche

ohh

keen beacon Aug 14, 2025, 6:22 PM

#

tired herald PDFs are prob not possible for me, they'd have to be implemented into the very L...

we can do this tho - OCR the pdf?

#

or just simply extract the text from it

flint sandal Aug 14, 2025, 6:23 PM

#

stray aspen it doesnt have censorship?

You can turn spicy mode

#

That doesnt even make things spicy if you ask

#

Its making it always

eternal niche Aug 14, 2025, 6:23 PM

#

flint sandal Its making it always

proof

#

in dm

flint sandal Aug 14, 2025, 6:24 PM

#

X. Com and elon musk

#

He saidit

eternal niche Aug 14, 2025, 6:24 PM

#

i dont believe him

#

i want see video

stray aspen Aug 14, 2025, 6:25 PM

#

lmao

flint sandal Aug 14, 2025, 6:25 PM

#

Can i send link to article?

#

Or i will get banned?

keen beacon Aug 14, 2025, 6:25 PM

#

eternal niche i want see video

Not here dude

stray aspen Aug 14, 2025, 6:25 PM

#

elon musks knows his audience

eternal niche Aug 14, 2025, 6:26 PM

#

keen beacon Not here dude

why

keen beacon Aug 14, 2025, 6:26 PM

#

Maybe in DM

eternal niche Aug 14, 2025, 6:26 PM

#

only for educational purposes

#

for science

keen beacon Aug 14, 2025, 6:26 PM

#

eternal niche only for educational purposes

Sure.

eternal niche Aug 14, 2025, 6:26 PM

#

what

flint sandal Aug 14, 2025, 6:27 PM

#

I will send link to article with censored version

stray aspen Aug 14, 2025, 6:27 PM

#

guys its official

#

elon musk made a corn generator

eternal niche Aug 14, 2025, 6:27 PM

#

he is lying

#

(elon musk)

flint sandal Aug 14, 2025, 6:27 PM

#

Nah

#

I tested it

#

Kind

#

Kinda

#

Like i didnt want to test it

#

You know

#

Science

stray aspen Aug 14, 2025, 6:28 PM

#

musk needs to be stopped

keen beacon Aug 14, 2025, 6:28 PM

#

stray aspen musk needs to be stopped

He can be unstable

#

As you can see

flint sandal Aug 14, 2025, 6:28 PM

#

He will literally buy anyone

#

No one can stop him

#

Only he can stop himself

keen beacon Aug 14, 2025, 6:28 PM

#

flint sandal No one can stop him

Gooners though will like that video generator

#

Just being honest

flint sandal Aug 14, 2025, 6:29 PM

#

I think no one will pay to see robot corn

#

Cornhub is free

keen beacon Aug 14, 2025, 6:29 PM

#

flint sandal I think no one will pay to see robot corn

The companion update is popular though

flint sandal Aug 14, 2025, 6:30 PM

#

Its openai and google's fault

#

For discovering transformers and gpr

#

Gpt

keen beacon Aug 14, 2025, 6:30 PM

#

flint sandal For discovering transformers and gpr

Too late now

#

Billions are being thrown at the AI tech industry

flint sandal Aug 14, 2025, 6:31 PM

#

And still most models dont know which number is bigger: 9.9 or 9.11

obtuse heart Aug 14, 2025, 6:32 PM

#

flint sandal And still most models dont know what number is bigger. 9.9 or 9.11

gpt-5 finally fixes that problem i think

#

yep

ocean vortex Aug 14, 2025, 6:32 PM

#

gpt5-high. Made it output 16k lol

obtuse heart Aug 14, 2025, 6:32 PM

#

ocean vortex gpt5-high. Made it output 16k lol

Wow

ocean vortex Aug 14, 2025, 6:32 PM

#

It messed up the reflection

#

but otherwise this is very detailed

obtuse heart Aug 14, 2025, 6:33 PM

#

thats impressive and detailed

flint sandal Aug 14, 2025, 6:33 PM

#

I tested gpt-5-high and i am disappointed by its censorship of things that are not even a bit unethical

neon idol Aug 14, 2025, 6:33 PM

#

hello

keen beacon Aug 14, 2025, 6:33 PM

#

flint sandal I tested gpt-5-high and i am disappointed by its censorship of things that are ...

What censoring stuff did you test?

flint sandal Aug 14, 2025, 6:34 PM

#

keen beacon What censoring stuff did you test?

I wanted to code like an interface for deepseek r1 and this mf hid the CoT. When i asked it to show it to me it said it cant. I tried multiple times, prompting it like a dev mode where i should see the CoT, but it still didnt work.

#

I think its system prompt

keen beacon Aug 14, 2025, 6:34 PM

#

flint sandal I wanted to code like interface for deepseek r1 and this mf hided CoT, when i as...

LMArena doesn't show chain of thought ever

flint sandal Aug 14, 2025, 6:35 PM

#

When openai told it it cant show its CoT, it took that too literally

flint sandal Aug 14, 2025, 6:35 PM

#

keen beacon LMArena doesn't show chain of thought ever

I dont mean it like gpt-5 cot, and i tested it in chatgpt

#

Not lmarena

#

Read my message again

keen beacon Aug 14, 2025, 6:35 PM

#

flint sandal I dont mean it like gpt-5 cot, and i tested it in chatgpt

Ah, okay

#

I see

eternal niche Aug 14, 2025, 6:36 PM

#

stray aspen musk needs to be stopped

why

#

no censorship

#

musk forward!

flint sandal Aug 14, 2025, 6:37 PM

#

Wdym

#

Many ai dont have censorship

#

Even opensource

eternal niche Aug 14, 2025, 6:37 PM

#

flint sandal Many ai dont have censorship

gpt5 have

#

(and sucks)

neon idol Aug 14, 2025, 6:37 PM

#

grok 4 is better

keen beacon Aug 14, 2025, 6:37 PM

#

flint sandal Even opensource

Ahem... GPT-OSS

flint sandal Aug 14, 2025, 6:37 PM

#

keen beacon Ahem... GPT-OSS

It does have censorship

keen beacon Aug 14, 2025, 6:38 PM

#

flint sandal It does have censorship

Yeah

#

It's bad

#

I don't like it in general

#

It can only speak in english

#

Even small qwen models are multilingual

#

With lots of languages

gentle plinth Aug 14, 2025, 6:39 PM

#

best os math model tho in my opinion

tired herald Aug 14, 2025, 6:40 PM

#

keen beacon or just simply extract the text from it

that would be a problem if you want a description of what the pdf is, but its possible

tired herald Aug 14, 2025, 6:40 PM

#

eternal niche

and it should soon allow custom system prompts

stray aspen Aug 14, 2025, 6:44 PM

#

@eternal niche do you support musk

sour spindle Aug 14, 2025, 6:45 PM

#

going through my account and deleting all my anti 2.5 pro comments
