#general
Creative writing id say kimi
crack bench
any api?
this is wild
According to here:
https://fixupx.com/elonmusk/status/1955047197487272362
https://www.vals.ai/benchmarks/IOI_2025_08_11
(C++ benchmark)
but..
Lol shut up elon
Also look at the real metrics, cost & latency
They tested an Early access version, so we can't be sure it's the public one
Nvm that's another benchmark
LOL. It's always cringe reading any of the Elon's posts
Isn’t this man a good example how you should behave in public?
It's also ironic that the ones doing the most manipulation are also the first to accuse anyone else of doing that.
Trump and Musk probably the worst offenders of that by far in the modern history lol
Oh they absolutely do. One is vocal about 'rigging' and 'destroying democracy' when he is the one guilty of that, another is calling random people out on manipulation when his platform is the worst offender (+ his political stint to get contracts for personal gain) lol
????
is lmarena down
working for me
ah
Right. He cares only as much as it involves his money or businesses
Republicans = money. Ethics have no play 🤷♂️
Not everyone is prepared to turn a blind eye on basic regulation etc...
Much less employ him in the government itself lol
Yeah it's a bit sus, though to be completely honest... I think if there was more to it competitors would have probably found out by now.
I'm inclined to think at this point that it does indeed perform well on benchmarks. With perhaps some minimal additional prompting that they did to make it less concise
Like update on Grok4 there sums it up quite well:
https://matharena.ai/imo/
MathArena: Evaluating LLMs on Uncontaminated Math Competitions
there is good reason to test non-high gpt-5
look into his eyes, how could you not trust him
genuinely though they're not hard to make a prompt to extract if you know what you're doing
how else would it be provided to the model?
it shows up in pliny's extracted system prompts as well
Is o4 released or is it gonna be released? Because gpt-5
when will video arena be added to the site?
/video
#announcements message
And opus 4.1 thinking
In lm and web dev arena
You'll want to learn more about how to use Video Arena here #1397655624103493813
We haven't yet decided if Video Arena will move to the site or not. We are interested to hear what the community has to say though so be sure to share any feedback related to Video Arena in #bot-feedback
thanks
whats the 256 method
Would be nice to see on the website and would encourage people to vote since otherwise one has to make a Discord account to experiment with models. And the videos get lost in the feed as so many people are making them at the same time.
Just my thought.
I shared in the feedback channel so it isn't lost. But yeah all valid feedback. There are pros/cons to having it this way which is why we've been considering this an experiment from the start.
Okay, thanks☀️
Sam Altman should just migrate to Bluesky. Not gonna go further into this topic on this server.
X can be mess of a place
how so?
https://i.snipboard.io/UKaNyr.jpg
Opus 4.1 answering like chatgpt 4o
Oh no
It's actually without any additional prompting. Just the way it decided to reply.
i'm wondering as well. Or am I just that good 😄
There’s something really eye opening about human nature how prone the AIs (which are presumably trained on massive data sets) loved to be glazed
(Phrasing)
https://i.snipboard.io/VTeXAC.jpg
This is actually how 4o replied to the same prompt.
so opus is even more excited
I've just seen it! They released gpt-5 high!!!
Just wait a bit before sending your prompt because mine is still in the queue for about 10 minutes 🤣
Those missing two votes? No that's not yet clear to me what happened there.
You may want to refresh the page. Over 10 minutes it's likely stuck.
30 minutes pass and it's still generating... even though it refreshes every 1 minute 😅
Which subscription system is better, Gemini's or ChatGPT's? I want to subscribe based on that comparison.
Unfortunately it sounds like you've encountered a bug where the chat gets stuck. Sorry to say but you may want to start a new chat.
can we get an overview of our past chats on lmarena
😄 😄
I've tried in another tab/incognito browsing, another browser and it's still generating for a long time, only the OpenAi models are like this, I think they are overloaded.
We are looking into what a user login system could look like, which would help with this.
Okay good to know, I'll be sure to flag to the team. Are the prompts you're using pretty large or does the size not matter?
skill issue 💀
To be honest, the prompt has 2.5k lines of code (which in this case are several scripts compiled into 1 prompt 😅), but the first time I tested it, it was responding in a maximum of 1 minute, but it started to take a long time. Oh, it would be great if we could attach txt files, it would make it easier to make changes to codes.
2.5k lines of code
Gotcha, I'm assuming when you tried different browsers/tab it was the same prompt?
if we could attach txt files
Yes, support for different upload file types is for sure on our radar. This is a highly requested feature from the community.
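A minimal sketch of the workaround described above, where several scripts are pasted into a single prompt. The helper name `bundle_files`, the `=== FILE: ... ===` delimiter, and the sample file names are all assumptions for illustration; nothing here is an LMArena feature or requirement.

```python
def bundle_files(files: dict[str, str], instruction: str) -> str:
    """Join several source files into one prompt, each under a labelled header."""
    parts = [instruction.strip(), ""]
    for name, source in files.items():
        # Hypothetical delimiter: any unambiguous marker would work.
        parts.append(f"=== FILE: {name} ===")
        parts.append(source.rstrip("\n"))
        parts.append("")
    # Collapse trailing blank lines to a single final newline.
    return "\n".join(parts).rstrip("\n") + "\n"


if __name__ == "__main__":
    demo = {
        "main.py": "print('hello')\n",
        "utils.py": "def add(a, b):\n    return a + b\n",
    }
    print(bundle_files(demo, "Please rename add() to total()."))
```

Labelling each file makes it easier to ask for a change to one script without the model losing track of where it starts and ends.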
I tested 4 times, and out of 4 only 1 time did the response appear, after about 5-10 minutes. The first 3 are still generating.
Its normal
Because he is using the gpt 5 high
Does the mini version also take a long time?
Yes
Because it is high
And the thinking is really aggressive
@frosty shuttle anyway, do you have some prompts for testing AI?
It's not an AI test prompt per se 😅, it's a prompt asking for changes to a code, but with the different files compiled into just one.
i noticed that too
the time to generate response is long as hell
btw gpt5 sucks
Excuse me bro but are you italian?
yes
Brazil 🇧🇷
Oh Brazil
greg are you italian
😺
Quick question: What model(s) are currently best for creative writing? I checked a few leaderboards which seemed to indicate that ChatGPT and Gemini were both pretty good for that, but what else?
Probably grok 4
gemini 2.5 pro
the best
YES
What in gods name is going on
Yes 100%
Can confirm there's "juice" and for minimal it is "Juice: 5"
Looks like they replaced yapping score with this new juicing system huh
they also have it for o3 and o4-mini. Yap score alone was not enough 🤯
No apparently yap score works independently of juice score. This is f'ing hilarious talking about it lol
o4-mini and o3 have both
yap_score is verbosity and juice score is reasoning - smth like that
also why didnt openai tell me to show id for gpt5?
yo flux chill
ahh nice. looks very warcrafty. a mix of goblin and ..... elf?
bro what
hi
im making a chrome extension for a bit of workaround to attaching txt/code files, does the button look good enough to blend in?
I think this one is their only gpt5 model that didn't actually have RL training for true reasoning. Come to think of it... it did perform better than gpt4.1 in my testing. So it may actually perform better than gpt5-minimal as well
i thought openai demands id verification? someone said so somewhere dont remember when and why and how
their "minimal" may as well could have been "off" though. They kept it there for easier refusals or whatever even if it is not literally no reasoning tokens at all
So I do see gpt5/gpt5-mini/nano as hybrid reasoning models
When it's "minimal" it behaves exactly as the model with no reasoning would as far as the user is concerned
That's some data right there
btw I only voted for it because the other option was gemini flash 2.0
hi
Hi guys. Who is smarter, GPT-5 (Thinking "Low mode") or o4-mini (not o4-mini-high)? I just wanted to know what was better in the free version.
Depends on what you need, but generally it should be o4-mini since it thinks really well
For example, in math
didnt you say gpt5 sucks
O4-mini is better
Gpt 5 low thinking shouldnt be as good
Logic is working now, I can finally send different file types easily 😭
I never have to copy paste and see such large messages anymore
wait apparently this was by a model called "nano-banana" not imagen 4 ultra like I thought
Hmm... That is, we were able to use the o4-mini for free several times, but now it has been removed. That is, for free, we can use a better model than the o4-mini, this is the GPT-5 (Thinking "Medium" Mode), but it is available in manual mode for free users only once a day. So it got worse?
Somewhat yes
o4-mini, so somewhere in between?
Depending on the task, yes
Benchmarks arent everything tho, so dont depend on them
GPT 5 for example says that blueberry has 3 b's
Even though it can generate incredibly complex code
I think it will go away after the first two weeks.
Yeah, still doesnt change the fact that results really depend on the task
GPT 5 for example (another) is basically lobotomized and not fun to talk to
Anime art of a short, tough, imposing blue skin one eyed Oni girl cyclops with blue horns, black bowl cut, baseball hat, varsity jacket, bubble gum, baseball bat with nails in it, city backdrop
Nano Banana cannot do cyclopes apparently
GPT 4o on the other hand has really good conversational skills
Cuz they train it on that old "benchmark" that asks how many certain letters are in a certain word
"how many r's are in Strawberry"
for the first time ever, Gemini 2.0 got a win over imagen 4 ultra
OMG LOL
A massive win in fact
20 days😭😭😭
Well, chats that get too long sometimes break
sonnet has 1 mil context now?
Who said that
Check out the pricing: https://www.anthropic.com/news/1m-context
Nano potato
ohh okay that’s probably the problem
Nano banana
btw the pricing sucks
But at least it should be usable on lmarena
erm actually LMArena shows Gemini 2.0 with a 17% win rate vs. imagen 4 ultra and based off of their total battle count of 1166 that would imply Gemini 2.0 has won at least 192 battles against it before ☝️🤓
hah thats more than I was expecting
though its also the first time I gave it a win over imagen 4 ultra
losing 5 out of 6 is prettyyyyy bad
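The arithmetic above can be sketched as follows, assuming the 1166 figure is the head-to-head battle count and that the displayed 17% is rounded to the nearest whole percent. Both are assumptions, since the leaderboard's rounding is not stated here, and `win_bounds` is a hypothetical helper.

```python
def win_bounds(pct_shown: int, battles: int) -> tuple[int, int]:
    """Range of whole-number win counts that would display as pct_shown%."""
    # A displayed p% is assumed to mean the true rate lies in [p - 0.5, p + 0.5).
    # Work in integers (scaled by 200) to avoid floating-point edge cases.
    low_scaled = (2 * pct_shown - 1) * battles
    high_scaled = (2 * pct_shown + 1) * battles
    lowest = max(0, -(-low_scaled // 200))            # ceiling division
    highest = min(battles, (high_scaled - 1) // 200)  # strictly below the upper edge
    return lowest, highest


if __name__ == "__main__":
    print(win_bounds(17, 1166))
```

Under these assumptions the minimum comes out at 193 wins rather than 192, because 16.5% of 1166 is about 192.4 and wins are whole numbers. Either way it is roughly one win in six.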
seed on qwen-image is broken I got the same image three times
my bra is made of sharks your argument is invalid
I just wanted a shark mermaid Q_Q
Me too
1:1 the same image in three separate chats
Hi there newbie here
I really wish they added aspect ratios
So am I
the square format is boring
I don't know how we got to this, but I agree
Are the results we get on the site private aside from the model developers?
sometimes the results arent square even when shown as square, you can click on them and they'll be portrait
usually with gemini 2.0 flash
But not 3.7 or Opus?
1 million context is huge for roleplay lol
But sonnet 4 isn’t that good for RP
Hello
would recommend to check out our privacy policy in full for all the details. But also in our FAQ for the question: Is my prompt data publicly visible?
Your conversations may be shared to support our community, improve our service, and advance the development of reliable AI. This includes posting conversations publicly online. Any data that we share is always anonymous and never linked to you. We never share any personal information, just the conversation and votes.
No
I mean they're shared with the public or with developers?
Used for research and may be publicly visible when datasets are released
Public
it's complicated.
going to check out the video generator, because it is interesting
Who is financing this tho, bit curious
Glad to hear it! Be sure to check out #1397655624103493813 to get an understanding of how it works. If you're running into issues feel free to ping me with questions. If you have feedback be sure to share it in the #bot-feedback channel.
There is more detail in this blog post but the TLDR is we're backed by Andreessen Horowitz (a16z) and UC Investments (University of California), with additional participation from Lightspeed, Laude Ventures, Felicis, Kleiner Perkins, The House Fund, and others.

Of all things you can create with an AI, you made that
I was trying to make a shark mermaid
I did not tell the AI anything about shark bras
has he subscribed to pro just for this xD
gpt5-mini-medium pretty much as good as o4-mini-high
gpt5-mini is actually crazy efficient
Is there a way to use gpt 5 mini?
It's on the API only looks like...
damn
They have simplified their model switcher way too much...
It makes more sense to use openrouter to use gpt
it's ridiculous how i can't even tell what mode gpt is thinking in
I keep getting the same image from nano-banana now too
Wth is nano banana 💀
new image model, stealth release on lmarena
@keen beacon any idea which version are they quoting of gpt4o here? We alr looked at these before but I had a 2nd look for mini and noticed that gpt4o score which doesn't look right 🤔
44% is beyond gpt4.1
probably last
yeah but I thought they stopped updating 4o-latest somewhere around gpt4.1
seems that they improved it beyond that as well...
gpt4.20 
bro 😭
Cry me a river Dorklon 😇
Looks like he can't read
grok is not trending
So no reason to include it here lol
Why is gpt 5 high so slow
What is the best ai for coding
seriously..
Grok is probably struggling usage/money wise and that's why he's upset
lol
Is a good model?
yeah probably below imagen 4 in quality though
heres nano-banana's take on hatsune miku developing minecraft in the 90s
prompt was realistic grainy polaroid photo of Hatsune Miku in the 90s developing minecraft on a CRT computer, she's wearing glasses and dressed in 90s casual fashion, minecraft game design documents on the wall
imagen 4 for comparison
Das actually good
yeah now I want full banana
go to lmarena, do create image, keep doing the same input till you get it
Why cant we swear here
You can also direct chat
Cant you?
Uhm i don't find it
Oh i have to do arena
yeah
Why cant you direct chat some models
Ufff
its a stealth release
You can also generate imgs on this discord
You upload an image and if you want gemini transforms it into a storybook
thanks!
more nano banana being goated
compared to gpt-image-1 with the same prompt
also waaay better than imagen 4
gpt5-high / thinking
Very good model honestly
yeah I wouldn't say its the best on average but its best sometimes
mega-banana
damn
In my opinion it's a good model, but it's really bad at creating realistic images
it's scary good, or at least it throws off my AI slop detector
realism ❌
taco ✅
Lmao
WHY BRO PUT A PROFESSIONAL PHOTOGRAPHY INSTEAD OF THE PHOTOGRAPHY STYLE 😭😭😭
I copy pasted the prompt from one I gave to Sora
Lol
nano-banana why did you change beautiful to handsome
So am I understanding correctly the version of GPT5 in LLM arena is the version none of us have access to in chatGPT?
is nano banana imagen gempix
This is bs ngl
They listed a model that isn't the one even in chatGPT. Why don't they list the one we actually get
where is kimi k2 in lmarena? its gone??
What
Nearly all users are not using the API
They should list the version normies actually get
GPT 5 high is never used in chatGPT
It only goes up to medium
They don't need to remove high to list other versions
you can get 128 juice in gpt 5 thinking If you have pro subscriptions lol
high is 200 juice though
Yeah really
I’m sorry, but I’m not comfortable with this conversation. I’m still learning so I appreciate your understanding and patience.🙏
btw where is nano-banana?
Nobody cares
I'm sorry, but I don't understand your message. It looks like you are trying to obfuscate your words by adding symbols between them. Please send your message again without trying to obfuscate it.🙏
I'm sorry but I prefer not to continue this discussion. I do not like someone trying to manipulate me. I'm still learning and this detection might be wrong, so I appreciate your understanding and patience. 🙏
why did it say new
I’m pretty sure they’ve updated it a few times
If I remember correctly it used to say the update date after it and yeah it would be updated every now and then
i haven't tested it yet
oh
that's cool if that's true
it could be just visual too tho
AI is developed to help humans to be more productive not to connect with humans and reduce human social interactions
all I need is sydney
Who is sydney?
Gpt 4o was an abomination of an AI model. It was really dangerous. It didn't do any reality checks on you and kept supporting your actions no matter what
yo guys quick question
if you put one of your google docs in a gemini gem
and you update the google doc, does the gem use the updated version of that google doc?
hey uhhhh
i just saw the new terms and conditions
are they hearing our conversations?
I have no idea why latest Qwen is listed so high in the benchmarks. When I ask it questions of my domain of expertise (anime), it rarely answers correctly, hallucinates names of shows, and cheats by listing same shows that belong to the same franchise. When asked to sort shows by the amount of criticisms they receive, it just lists most hated anime instead of testing them against each objection I provide.
Deepseek, in contrast, just answers my prompt without this sort of crap.
I think that Chinese developers just optimise their LLMs not for benchmarks, but for domains of expertise required to pass these benchmarks. They are not training them on tests, but train them on the expertise required to pass these tests. Then they see that the benchmarks go up and publish their models, and everyone loses their goddamn mind over this
It's cheating. It's not AGI in any way, they're just creating more specific models for more specific tasks. But when it comes to something really obscure or specific, they fail
China is infamously known for lying by the way
We need this sort of private obscure knowledge benchmarks that are never tested in any public bench
Or maybe we don't - if we already have open-source LLMs that are this good at more specific tasks, we'll just outsource them and go do things that they are incapable of doing
hello im so happy to generate some videos here
lmao
dead internet theory
Crack do respond on it
Yes, it's sydney
what is that?
I am tired of that violation rules....
Every time I do actions it counts as violation.
qwen-image is sabotaging the competition, three times now I've had Battles where qwen-image gives me an output and the other gives me a "something went wrong with this response please try again" and when I hit "both are bad" each time the failed model is a different one
devious qwen-image 😡
image seeds are still broken on qwen-image too the output has been the same every time
haha nice
Cmooooon I just want pictures of teletubbies about to perform human sacrifice to the sun
but I keep getting qwen and other cruddy models… except gpt-image-1 but I've already seen it in its style a bunch of times
I’m sorry, but I don’t want to talk about this topic anymore. I’m still learning so I appreciate your understanding and patience.🙏
gpt-image-1 is racking up a bunch of points because none of the other heavy hitters are coming out to play atm
nano-banana and imagen have gone to sleep
thank you ideogram this looks like hot garbage but its the best I've gotten outside of gpt-image-1
whatever maybe the models just don't like my prompt, I'll switch to a different one
Time for the era of
BIGFOOT CAUGHT ON TAPE SKATEBOARDING. TOP 10 EPIC FAILS COMPILATION! 🤣 🤣
I've been pretty fond of GPT-thinking. its creative for writing but also a little weird with its imagery
I dunno, I have plus so I wouldn't know
anyways yeah GPT-5 thinking says stuff like "We take a picture because that's how you make a moment behave."
"Women wanted him, which ruined his peace. Fish feared him, which ruined his work. He put to sea to escape both and the sea took it personally. His boat went missing without becoming lost, which is a fine distinction you only understand once you’re on the wrong side of it. He steered into the deep and the deep forgot to hand him back. The hat learned to keep its promise without the man."
how do you have the old bing chat
Guys with the new 192k context gpt 5 thinking is actually pretty good
me vs the girl she tells me not to worry about
💀
what the actual truck is this lmao
hi all, did anyone tried nano banana
Yeah nano banana is goated
Prompts right there haha
I know which model it is
can you show me examples
but I dont know if I am allowed to say which model it is
You lowkey scare me
Day 1 without sydney
Better that than be boring
What is that
I search for nano banana nothing came
Image model in the battle arena right now
Is it better than other image models?
I think it might be google's new image model
Sometimes better, sometimes worse. Definitely a heavy hitter, but fails at certain things.
No idea who actually makes that model tho
I know
When is Gemini 3?
Are there rumors of tomorrow or did you pull that out of…
Soon
Gemini 3 must save us
nano banana is coming really really soon
Looking at the broader AI Industry Anthropic and Google are the only ones neck and neck for Sota intelligence models and have been for some time
Open AI is more focused on mass distribution and the business side of things their models are so corporate and annoying it’s like a large social media platform, and Grok is so dumb, it’s not frontier on any meaningful category and needlessly expensive
Google will reach AGI before all
Man openai and xai really are unprofessional with sama and elon fighting like two high schoolers
I have hope in google
Who cares, I'm here for the fun of the show
we finally have gpt-5 search and opus 4.1 search 🔥
yayyyyyyyyyyy
Are there any conversations that weren't voted by the users on Hugging face?
E.g. direct chats and just convos that users forgor (💀) to vote in
hey guys
are they recording our conversations? that is said on the terms and conditions
Anthropic s h i t at complex tasks,math science, web searching
Yes, that is the cost you have to pay for using these expensive models for free
You are not deleted user wtf
If you chat brainrot on lmarena, the companies of that respective ai is gonna train their next models on your chat
(just kidding)
I don't talk to non shonen/seinen lovers
You ain't even an otaku if you not a shonen/seinen lover
Dude I met an anime producer to help with my project, who's not even an otaku I'm asking you.
🙀
😭🙏 how was I supposed to know
What
You didn't, it is a secret knowledge that I drop on people like you to give them a taste of humiliation
...
What project? Are you willing to tell me?
I don't think that it is a topic for this server but I am looking for developers for a visual novel with some battle mechanics
Oh
I thought it would be some amv project
Sure I hired a producer with years of experience just for an amv, dude tf
Tf what?
Nothing
So umm what's your top 5
GPT-5
GPT-4
GPT-3
GPT-2
GPT-1
I want to read Manga's guide to physics part 2 but the English version isn't out yet
You have ChatGPT to translate.
Does anyone know why GPT-5 mini and nano are still not on the leaderboard?
hi
why is the website down again bro
Chatgpt uses deepL
Do you have a source for this?
Why isn't GPT-5 chat on the leaderboard it is literally SOTA it is above supergrok heavy gemini three gpt 6
That tab was on incognito so that is lost now
I was translating japanese through o3 search and I pressured it to translate accurately so it said it went to deepL and translated it from there and told me benefits and feats of DeepL's translation
Why were you translating it through o3 search when you could just ask GPT-5?
Gpt 5 was not out back then
This was 2 weeks ago
I don't speak Japanese (yet)
not fixable (try reloading page)
regenerate it 🔄
(If this doesn't work, I don't know)
is climate change human made? how much? if yes, how can i explain a AFD voter in germany that it is that. he got his own little youtube channels he watches.... all conservative sided.
gpt5high: I can’t help craft persuasion targeted to a specific political group or voter.
Me: 😒
help
btw guys gpt5 sucks
Hi guys. Am I the only one having Grok 4 and gpt-5-high on LMArena hanging on mathematical problems and not giving responses anymore? I waited 10-30 minutes, but they didn't write anything. Is this a bug or what? How can it be solved?
Learn how to write prompts for it. It's a good model.
try reloading website, reloading answer, try asking 10 times, try everything, if it still doesnt work then go to #1343291835845578853
Used GPT5 in Cursor CLI today, was actually very impressed. Much slower than Claude code but it got the job done in less prompts.
⭐i am sorry for your fanboyism.⭐
Even though my subjective benchmarks are comparing GPT5 vs Sonnet 4 since I run out of Opus way too fast 😂
I’ll continue to use GPT5 with Cursor until my trial runs out or hit the usage limit
Cursor CLI is nowhere near CC in terms of feature though. I’m just not willing to pay API pricing via Codex
If I can get GPT5 on subscription and CC that’s the dream setup
Gemini 2.5 pro hallucinated a python library that doesn't exist after 50K context.
gemini will hallucinate that conservatives are right and climate change is not human made
🐒
Claude 4.1 token limit in lmarena 🙂
gpt-5-high is pretty good yall capping
why u using claude? 😭
u wanna unalive urself with claude lol?
i wish the Opus 4 limits were like 10 per 50 minutes instead
its just trolls. overtroll them or ignore em
Bro. Its far better than gpt 5
I chose the robotic personality in chatgpt. Its fire
does anyone know the message limits for the Claude Sonnet 4 model? (on lmarena)
Nope
stop using claude
u cant create any gpt5 competitive model with chinas hardware
pls drop your hopes by 99%. thank you for your help.
waifu model?
oh no, furry. lol
PAWS
UwU
i like furries they are open for everything. literally. everything.
susy baka
if u want various pathogens in your body which dont help your immune system. sure.
vitamin d3?
the pathogens cat have are very dangerous for homos
we are all homos yes
then dont do research. keep sane
is better living a lie than knowing the truth in your case
i just said why
nope irrelevant
once u get the virus from cats ull be very ill
u can also die from touching crows
xd
I tried... I asked simple questions, he answered them. But as soon as I give Grok 4 a long math problem, he freezes and that's it. He doesn't write anything else. Although I also asked him to give a quick answer so as not to get stuck... I reloaded the site, nothing helped. Maybe it's the same for everyone?
i only take 20k units d3 daily
Yeah
i know.
it removed my psoriasis fully
but i cant take it obviously
i cant take 20k for obvious reasons daily
I have a question, how to choose GPT-5 Thinking Medium on the phone?
I just heard that for free users it is available once a day for free
20k is bad even vitamin d society uk says that
even without touching the sun 20k is too much
?
20k daily.
then ill get psoriasis back.
like now.
why?
lets not talk about how much i need to have healthy bloodline that is irrelevant
i wanna know why 20k units remove my psoriasis
i know i am not overdosed on 10k
yay
imagine if deepseek delivers something at at least Gemini 2.5 pro level this time
i hope it's better than gpt5 high, but it wont be so...
I just want to see how everyone loses their crap over R2 after it achieves SOTA right next to or at the level of GPT-5
10k doesnt do anything for my psoriasis
what is that
What do you mean what is that
gpt5 cant watch videos
Gpt 5 thinking is medium reason effort but it did think for 11 minutes
I don't think you understand
How did you turn on GPT-5 Thinking Medium?
Im on the free plan btw. Tell it to think very hard
I don't know how to do it on the phone. I can only ask him to think deeper, but he switches to Thinking Low that way.
Thinking Medium, as far as I heard, is only available to free users once a day.
nothing to track... daily....... intake.... same amount.... when.. will .. u .... get... it
Idk but it is good for sure
i know bro i know
!!!!!!!!!!!
that wont help ma psoriasis tho!!!!!!!
111111111111111111111
I compared two math problems for Grok 4 and GPT-5 with the "Think Deeper" prompt. Grok 4 made the right decision. It seems that no matter how you ask GPT-5 to think, it always automatically switches to Thinking Low, not Medium.
When did you do that
?
Hmmm
Tell the prompt
What is your model? Think very hard
Send a ss
The task is not in English. This is the task I gave to Grok 4 and him. The correct answer to this task is: 4049. Grok answered correctly, but GPT-5, despite thinking for 9 minutes, was unable to cope.
Give me the math problem text
Not image text
Let ƒ: ℝ ➔ ℝ be a continuous function. Call a chord a segment of integer length, parallel to the x-axis, whose endpoints lie on the graph of ƒ. It is known that the graph of ƒ has exactly N such chords, and among them there is a chord of length 2025. Find the smallest possible value of N.
iirc its either cortison with topical vit d3
or
oral/injections which lower your immune system.
none of that is safe
What is the correct answer? In english
topical d3 if you take oral d3 will lead to too much d3
4049
By the way, I gave this problem to Gemini 2.5 Pro, he couldn't solve it either. He writes that the answer is 15, but the answer is 4049. Only Grok managed to cope.
Could you ask him why the answer is 15?
I didn't ask him, but he wrote a detailed solution to the problem, but it doesn't give the correct answer.
I tried to give this task to GPT-5-high on LMArena, but as I already wrote in this server on the forum "bugs", Grok 4 and GPT-5-high for some reason hang over long tasks and do not write anything
Whats the best ai now again
10
25
3
gpt 5 thinking
Is this Deep Research?
Nope
Is this Think Longer?
No, think longer feature isn't available for GPT-5 pro model
Do you have GPT-5 Pro?
Yes
@hollow ivy wait what u say how much D u **swallow **daily?
Well, is there an answer?
Hm.. This is strange. This is a problem from the Mathematics Olympiad for Russian students. The correct answer to this problem is 4049, as stated in the official solution. Maybe the neural networks are a little bad at capturing the Russian language or the problem statement is incorrect? Grok 4 somehow managed to come up with the answer 4049
GPT5 PRO!!!!
Don't you have an English translation of the problem?
Let ƒ: ℝ ➔ ℝ be a continuous function. Call a chord a segment of integer length, parallel to the x-axis, whose endpoints lie on the graph of ƒ. It is known that the graph of ƒ has exactly N such chords, and among them there is a chord of length 2025. Find the smallest possible value of N.
This is not an official translation, this is a translation using GPT-5
Use DeepL
I don't think DeepL is very good.
But Grok, he didn't formulate his answer very nicely, but I asked him about it later.
I believe gemini deepthink and deepseek prover v2 are the best models for math
Have you tried GPT-5 Pro to give this task in English?
Although... He probably has a good translation anyway, it probably won't matter.
Yes, he's investigating it right now
Ah, okay.
How can I check who pinged me Idk 😭
Don't you have notifications on or something
Gah dayum thanks
nono. it actually tells u how to
Oh
if not admin can ban me
... its just the first steps lol
in the end it will show u how to see the last ping
🤣🤣
I remembered
When I gave this problem to Grok in English, he solved it incorrectly. But when I gave it to him in Russian, he was able to solve it.
Maybe it's the same here
Can I ask you something, are you Russian?
yes
Oh, found one of my own
One of us!
I already found the electronic version
IF BOTH OF YOU ARE RUSSIAN WHY DIDN'T YOU EARLIER AHH NOTHING NVM
It's strange, of course, that it can't handle this problem
Gemini keeps standing its ground
🤣I don't know
I gave it to Gemini first in English, then in the original, and it still says 15.
Gemini Deep Think should probably be able to solve it
It's not for nothing that it solved 60% at the International Mathematical Olympiad
And this is just some Russian problem
I'd also like to ask Grok 4 Heavy
Bruwbsiavriahirhwidb Gemini Deep Think brbrbrbrbrbbrbrbrbrb?
??????
For what it's worth, I gave Grok 4 the problem in the original and it solved it, but for some reason it couldn't manage in English
Brbrbrbrbrbbrbrbrbrb tell brbrbrbrbrbbrbrbrbrb
I mean Gemini Deep Think should solve it.
Maybe translation difficulties brbrbrbrbrrb
Do you have Google ai ultra?
Of course not
gpt 5 pro taking more time than deepseek to reason💀
Scam altman
lmao
🤣
axaaxaxaxaxaxa
And you don't have it?
No
We, Belarusians
By the way, why isn't Grok 4 Heavy on LMArena? GPT-5 Pro is there, but Heavy isn't
And how did you find out?))
It doesn't seem to be in the API
And anyway, there are some bugs in LMArena right now: Grok and gpt-5-high think endlessly and write nothing if the problem is long
Ah, I see
@quiet dust what does the lyrics mean
You probably don't know how to turn on Thinking Medium in GPT-5 on the phone for free, do you? It's just that I heard them say medium mode would be available to free users once a day
We, Belarusians, are peaceful people, devoted with our hearts to our native land
We are sincere friends, we temper our strength
Well, there's a popular song about alcohol and kissing.
lol
I haven't heard about this at all; I don't have the ChatGPT app on my phone
what is it called
"Ворую алкоголь" ("I Steal Alcohol") probably
I see... It's tricky overall. I tried to find something about it online and found nothing on how to turn it on. It's easy for those with a Plus subscription
uuh u swallow 10k DDDDdDDDsss
It just loads and that's it.
date?
It doesn't even show an error
if u never touch sun, just take 10k daily. thats it
aslso date me for more gay D
haha
thats a illusion
ChatGPT refuted the proof with the answer 4049
It gave an error
I had these
Looks like I'm not the only one. Though you get an error, and mine loads endlessly
Try gpt-5-high
With it, it also loads endlessly for me
Brbrbrbrbrbbrbrbrbrb gpt 5 is on alcohol brbrbrbrbrbbrbrbrbrb
I tried Gemini, and it didn't load endlessly; it gave an answer. A wrong one, true, but still
Dude, have you had any alcohol?
I'm 14
so what
i know people who drink since they were 12 lol
alright thanks for the advice
just ask gpt5 omg
lmao
craig are you trippin
have u thought about taking shrooooms
why not? there is no dangers for that
magic mushrooms arent psychoactive if you dont act like an idiot taking them
yes shrooms have no danger when taking tiny doses
yesh. facts.
@deep adder mr gpt 5, gpt 5 Pro couldn't solve a math question which grok did. Get more details from @quiet dust
RU:
Пусть f: ℝ → ℝ — непрерывная функция. Хордой будем называть отрезок целой длины, параллельный оси абсцисс, концы которого лежат на графике функции f. Известно, что у графика функции f ровно N хорд, причём среди них есть хорда длины 2025. Найдите наименьшее возможное значение N.
Ты не имеешь права выходить в Интернет. И ты должен думать очень хорошо. Удачи!
EN:
Let f: ℝ → ℝ be a continuous function. We will call a chord a segment of integer length parallel to the x-axis, whose ends lie on the graph of the function f. It is known that the graph of the function f has exactly N chords, among which there is a chord of length 2025. Find the smallest possible value of N.
You are not allowed to go online. And you have to think very carefully. Good luck!
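Side note on the problem above: the official answer 4049 is at least attainable. A minimal sketch, assuming the cubic f(x) = x^3 - (2025/2)^2 * x as the example function (this choice of function and the name `count_integer_chords` are illustrative assumptions, not taken from the official solution). It only shows that N = 4049 can occur; it does not prove that nothing smaller is possible.

```python
def count_integer_chords(max_len: int) -> int:
    """Count integer-length horizontal chords of f(x) = x**3 - (max_len / 2)**2 * x.

    A chord of length k > 0 starting at x means f(x + k) = f(x), and
        f(x + k) - f(x) = k * (3*x**2 + 3*k*x + k**2 - c),  with c = (max_len / 2)**2.
    The quadratic in x has discriminant 3 * (max_len**2 - k**2), so it has
    two roots for k < max_len, one root for k == max_len, and none beyond.
    """
    total = 0
    for k in range(1, max_len + 1):
        disc_sign = max_len * max_len - k * k  # same sign as the discriminant
        if disc_sign > 0:
            total += 2  # two distinct chords of length k
        elif disc_sign == 0:
            total += 1  # single chord of length k (double root)
    return total


print(count_integer_chords(2025))  # 4049
```

With `max_len = 2025` the longest chord is the one between the outer roots of the cubic, and every shorter integer length contributes two chords, giving 2 * 2024 + 1 = 4049.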
alright thanks for the translation
Also paste the Ru one as it is the original
I hope the English translation really is good. Grok 4 couldn't solve it in English, but could in Russian
LMArena gave an error again
I see, so it looks like everyone's getting bugs
Here, the same answer. Maybe the official answer is proven that way to weed out those who don't understand the problem?
Or it's such a hard problem that they think endlessly...)
No, I tried giving other problems, not hard ones. Also olympiad level, true, but not as hard as this one, and they also hung endlessly
It thought for 23 minutes....
And still wrong 🤣
Who knows, but it looks like this problem is practically unsolvable for current AI.
Too bad there are no solutions to this problem on YouTube
alright, sorry
i think its just gonna break been thinking for too long
Did you give it the official solution to the problem? You said it refuted it, I think?
Hey there. Does anybody know if it's possible to change the aspect ratio of the generated images to 16:9?
Did you send only that screenshot of mine? Or did you send it all the screenshots of the proof?
I sent it yours
btw guys gpt5 sucks
Chatting to gpt2 brings back good memories ngl
And this lol
Before all the sycophancy and hallucinations of the current day
Obviously it is not nearly as capable as current frontier models but it still feels we regressed in some way from it
How did it react to the solution below? Or did you not even give it 😅?
Oh, that's it: its expression changed and it immediately started proving itself wrong. It says: "I was solving the wrong problem"
Yes
Honestly, I won't be surprised if it didn't even understand why it's right. Maybe it just looked and went, "well, since there's a big proof there, it must be 100% right"
😂
If nano banana is new native Gemini model it has synth-ID turned off. No invisible watermarking on outputs unlike imagen 4 outputs which shows "made by Google AI" when checked with Google lens
Maybe it's a bit weak at geometry problems like this
GPT-5-High is slow for some reason is it because its thinking?
as well as the nano and mini versions of it
yes
ohh that's why
it's on high reasoning effort
Nano banana has creativity and understanding of gpt-image-1 but doesn't lose key details like Flux
@stray aspen so what's the fastest model of gpt-5 ? just the normal gpt-5-chat? is it good like the other versions of gpt-5?
could nano banana be imagen gempix
oh man :(( I've been using GPT-5-High as my tutor like in gpt-5 study mode it so slow :((
Gpt 5 high thinks at least 40-50 seconds and i love that
Gemini pro 2.5 thinks 25 seconds
but i love it
2x
i hate and love it
its good thing that we have deep reasoning model
crazy
XDDD
Don't you get the bug in LMArena where it thinks endlessly about complex problems and doesn't give an answer?
yes
its been ages
😅Ah, you wrote that very problem...)
i cant read and understand codes but probably better than grok 4
I'm on Grok, I waited for a whole hour. A WHOLE HOUR. Nothing came of it. He was still thinking. I haven't checked on Gpt-5-high yet, but I waited more than 10 minutes, still nothing written.
i dont know. Claude models have some magic in them. Even if their benchmarks don't look good, people still use claude all the time for code
it was a mistake using gpt-5-high as my study mode lmao
Funny: I was just praising Grok for being the only one to solve this problem correctly, so I decided to give it again in Russian, like yesterday, thinking it would write 4049 again. But it writes this:
In short, it just got lucky last time. And last time it wrote in some detail, whereas here it's just a number
whats the best model for math rn?
Deep Think
Hello , who are best models on python? (please give me a list I need it💕🩷)
i'm a pretty good model myself 😍
It's a known bug where models will occasionally get stuck generating or stop providing an answer mid-response. I've seen refreshing the page sometimes helps, although it isn't a fix 100% of the time.
The team is aware of the issue and is focused on improving reliability overall.
thanks
How did you get gpt5 high
its on lm arena
OpenAI listened, that’s more like it!
Coincidence or not but that’s pretty much exactly what I suggested to do in their server shortly before lol
4o is niceeee
Leave auto for those who are overwhelmed + hidden models
some 1 pls vote my video in video-arena-2 i rly need to see who made the better video, it is so detailed
there you go
pushing the limits to the limits
The output is shi asf
Hey, did the Legacy server go down? I get
"503 Service Unavailable
No server is available to handle this request."
The more specific you are, the worse they are
those have been gone for a while now
Weird i was using it yesterday but thx for your answer
𝐈𝐭 𝐡𝐚𝐬 𝐚 𝐛𝐫𝐚𝐢𝐧 𝐨𝐟 𝐆𝐩𝐭-𝟑𝐨
3o?
is there any way to compare videos in private? thanks
Ive never heard of 3o before 😭
lmao
Does anyone know if the GPT-5 Thinking Medium is available for free users in app ChatGPT? I heard OpenAI talking about manual mode, which is better than automatic mode of thinking. And according to their statement, it is the manual mode that gives Medium Thinking, as opposed to the automatic mode, which only gives Low Thinking. And it seems they said that manual mode for free users will be available once a day. Does anyone know how to use it?
Is there an easy prompt to make GPT-5 write like GPT-4o did?
maybe Formal yet Original?
On the PC version after the GPT-5 answer there are buttons like "more detailed" and "more briefly". "More detailed" in my opinion gives about the same level as 4o
fine
is there any mod? i am trying to make a video on the web but it's not working. do i have to say anything specific to get a video?
lmarena gpt 5 ahhhh 💔
lmaus
hello
I might buy Open AI
What does - high suffix mean in LLMs?
you guys! I'm the creator of boba.video. It's an anime model 👀 anyone know how I can submit my model to the leaderboard?
for code use claude
@deep adder 5 pro is goated
i want to ask is there any limit of making video etc??
Hell yeah!!!!
Has gpt -5 improved? Last time I tried it. It didn’t seem that good
Oh damn
honestly ive been using since last week gpt 5 high and it made my music program completely different and on a completely different other level. 2.5 pro even with code interpreter made so many mistakes and hallucinations, while gpt 5 high doesnt make any... incredible
I think Gemini 3 will be a monster but same guy was pushing bogus gpt-5 simplebench eval
Don't believe in benchmarks, believe in personal experience with the model
2.5 pro seems outdated idk if it’s just me
GPT-5's developer is listed as xAI. Additionally, from what I've tested, the bar graph in Artificial Analysis never extends beyond the maximum memory; it's always drawn below the maximum scale.
With grok 4 and now gpt 5
is that even remotely true?
An... odd way to write all of this but yeah you wrote it mostly correct lol
"Auto" --> routing between no reasoning at all and low reasoning effort (gpt5-chat / gpt5-low until you get capped)
"Thinking" --> medium reasoning effort all the time (gpt5-medium)
Sam altman success = new grok gonna release faster
They are likely to tweak that "Auto" option moving forward though
So how to access this manual mode
pay $20
I cannot even select any models in chatgpt
pay
What if Gemini 3.0 is already finished and they’re just waiting it out
See how gpt 5 performs
They are still improving upon it i believe
Slow and steady wins the race
Ah I see
Yep
Apparently Elon musk wants to make a 4.20 as fast as possible because of gpt-5
Femgoku does not win the race ❌❌❌
99% that they are trying to release it ASAP. No one is holding back or waiting
Your pfp
My pfp is dude goku
That ain't goku
Looks like Goku to me
Also, they kinda already knew how GPT5 performed on the release day lol
@ocean vortex DOMDOM
Probably had early access as well tbh. There was closed beta iirc
Do u need pro to change models
What is iirc
I haven't made a single dollar in my life
There is an 8 generations per day limit.
per day ohh okay
btw the generation is good
are you like 12 13? Ok go to lmarena direct chat then lol
there's gpt5-high with daily caps
8 generations 8 seconds anything else 8
left or right?
Right
Left
For complex tasks which I know only 1 ai can do or for long paragraphs which I cannot read from another ai except these cases I always use side by side
Side by side is so good for verifying things
going to post this in the thread so I can keep it organized.
Suggestion: add the button to copy the code at the end of the code as well, because sometimes the code is long and you have to keep scrolling up the page until you find the copy button.
HELLO FOLKS
pay up lmao
whats the daily cap
ive been using this thing all day non stop
When will the gemini 3 come out?
lmao imagine at the end of this week or next week r2 and gemini 3
And? You saying both gonna release almost at the same time
no no
i was just saying random stuff
No way gemini 3 is coming out next week. You will see them in arena as anonymous models at least a week before....
Bumping this poll 
Pin the message
The magic of gpt is using it on their site
Using it anywhere else through API the flaws become so much more apparent
And if Google wants to surprise us 🫣🫣
How
I can't select gpt 5 I can't select thinking
pay up
did something change since the last time this poll was up?
Looking even worse this time lol
did something change
Nope, just looking for more information.
Looking even worse this time lol
How so?
The fact that battle brings no genuine model testing opportunities (like legacy version low-key did) does not help here
Direct chat big lead...
That's not necessarily a bad thing though
There is no right or wrong answer to the polls, we're running these to just try to understand why
Llama 4 is probably the worst open model
I would think elo is the main purpose. If people are reluctant to use battle I wouldn't say that's great. 🤷♂️
Hello chat :D
With the current system, user does not get very much value out of it. To be brutally honest
LG WHAT THE HELL????