#general
some say that when you spend enough time with a model you begin to imitate it, even when you aren't using it
(see _opencv_ on x for instances of this)
I can confirm
1 year after talking with sydney
have you been a good bing?
yes
Why would they put a " symbol at the end?
wow i'm gonna follow this account
it's interesting
ah fair point, guess it's just 5 then
What's gpt oss?? How does it compare to gpt5?
i don't know now if my experience with sydney was so good like in memory
Lobotomized
Open source models
more info here: https://openai.com/index/introducing-gpt-oss/
Thanks
And I heard it was only good in English too primarily. Absolutely awful in my native language.
it's truly fried
(full thread https://x.com/jxmnop/status/1953899426075816164 / https://xcancel.com/jxmnop/status/1953899426075816164)
Talking about OpenAI as some poor little startup that we shouldn't be so hard on for needing a small context window and long thinking times to benchmaxx is the weirdest complex
What i mean is, way too much safety, and the model itself isn't good for an os model
Thought you said it to ME lol
Thanks
Simple bench guy had a funny story. When he posted gpt oss result openAI guys contacted him to rerun it and then they got an even worse result

He talked about it in latest vid
Titled gpt5
Where GPT-5?
Better question is wen gpt6
They've nuked it
???
No. They delete GPT-5 main
Just strange if one time they get GPT-5 and after that delete
Well they couldn't keep gpt5 there because it's literally better than chatGPT plus for free then
Because you are always on the full model when selected in LM arena
Feels like plus tier regressed so much....
At least you could select o3
Although on medium reasoning by default as far as i know
Hmm, okay flagging, thank you!
Could also just be capacity issues
wait where did gpt-5 go?
WHY'D THEY DELETE IT!?
It's coming back
oh okay
Maybe GPT-5 were the LLMs we chatted with along the way...
Okay
Maybe the GPT-5 was the friends we made along the way
But why buy plus tier when GPT5 on lmarena is better than the router
Is this Claude 4.1?
Yes
Did you read it?
kinda upset to why the models can't use search actively
Bc all your data is going to be published eventually if posting it on lmarena, also no tools etc.
lmarena actively spies on you confirmed
Ngl. People are still going to be putting personal info into LM arena they should at least set up a filter
Before publishing the data
they do
so even deleted chats?
Nah, you can have it, guys
yes
I ain't got no money to pay for router
will someone give me their card info pretty please
Just find one in the lmarena dataset, i guess
can confirm
And how exactly would I do that
Manually, i guess
hey can you find me a credit card
hey no
lmao
welp, an attempt was made
no fair
this was compelling
yeah there was an issue, working on bringing it back asap
It even disappeared from every single leaderboard for a second. Now it's back tho
What does it mean?
you dare to delete my message inferior being
it's a dog
it's a dog
stap
Yes
kk back to AI topics please!
No back to you secretly watching me
The consequences of not having access to GPT-5 on lmarena for a few minutes
It's bad
so far it does what I expect
4o was a better waifu
I like Chinese-provided waifus more
wait does gpt-5 on lmarena have the ability to use specific models for different tasks?
@echo aurora
It's not a router, so probably not
It's the ultimate benchmark
Unless you mean different experts, but we don't know if GPT-5 is MoE or not
It's taking a vacation
currently in the bahamas
Bruh
Angy Copilot
CURRENTLY ON VACATION
I ate it
WAIT 5 WEEKS
It's having an issue atm, team is working on a fix
ok thank you very much.
u gotta lie to them
I think there's problems with it
hey no
Must be why
In the meantime u guys will have to use @deep adder as replacement for gpt-5
temporary
I dont like trolls
well luckily I'm not a small rainbow rat living in a mushroom
Is there a limit to LMArena models? I'm new to this kind of stuff
@deep adder
5rk1/4bppp/3R4/4p3/1p4P1/1N3P2/PPP4P/1K3B1R w - - 1 22
Your move
@echo aurora is gpt-5 on lmarena using (high)?
Probably medium
I don't think so
You should ask pineapple for more details
Welp, i was instantly proven wrong
wheres gpt-5 normal
@echo aurora Please
@echo aurora I cant delete my history chat. It always reappears
There are rate limits
Just delete your cookies completely lol
This was a bug; however, it was fixed yesterday, are you still seeing this?
Where can I read about them
yep
I thought I hit the rate limit for gpt5 when it stopped working lol
Then it got removed so I thought it was an error
Same
I don't believe exact info is shared somewhere
But I will raise to the team if that's something we want to start doing
crack bench
Guess I'll have to find out myself then
There was one question I had. When an ai takes a while to reply is it Because of reasoning?
Nope! there should be error message
Yup
craigbench
I see
I think depends, overall lag is an issue we're aware of
Would it tell us directly if we hit the limit?
Or is it the general error message
@echo aurora I am still seeing the bug
I believe we updated it so it says rate limit error
Try deleting your history
That's what I did
And whatever chats get removed from the history
can you try a different browser?
Also gets removed on the website
let me know if it's still there
Ah I see, thanks
I really wonder how the rate limit resets if there's no account needed
@echo aurora I cant see it on Edge
Probably IP-based or directed by your browser cookies
It would be stupid funny to be based on cookies
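For anyone curious what "IP-based or cookie-based" could look like in practice, here is a minimal sketch of a sliding-window rate limiter keyed by an arbitrary client identifier. How LMArena actually enforces its limits isn't stated anywhere in this chat, so the class name `SlidingWindowLimiter`, the key format, and the numbers below are all hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most `max_requests` per `window_seconds` for each client key."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # client key -> timestamps of that client's recent accepted requests
        self._hits = defaultdict(deque)

    def allow(self, client_key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[client_key]
        # Drop timestamps that have fallen out of the window.
        while hits and hits[0] <= now - self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


# Hypothetical usage: the key could be an IP address or a cookie/session id.
limiter = SlidingWindowLimiter(max_requests=2, window_seconds=60)
key = "ip:203.0.113.7"
results = [limiter.allow(key, now=t) for t in (0, 1, 2, 61)]
print(results)
```

With a cookie as the key, clearing cookies would give the client a fresh counter, which is why keying on cookies alone would be easy to get around.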
as in can't see the chats after deleted (meaning it's not a bug)?
wheres gpt-5
Removed for some fixes
it will be back??
sorry about that!
Pineapple and his team wants you to have a great experience
yes! it will be back
I will not abuse it of course, I'm just curious
no, it is still there in chrome
it wont go
Delete ur history
thanks @inner gate and @echo aurora
Today's history
Try after that
After deleting ur history also close ur tab
So it refreshes
Then check again
What's the point in deleting the chats? The data is shared anyway
???
It might be because of OCD
shared?
???
I usually delete the chats I don't use (I don't have OCD) it just looks cleaner
I like it that way
I mean, on the legacy website you can just overwrite the model you're using by intercepting a join web request, since they aren't encrypted, so i wouldn't be surprised
That's how it improves I think
This stuff is free
Can I buy you
Isn't that why ur data is shared?
Cus it's free but not
Yep
I thought so
If the product is free, then maybe we're the product...
Who cares.
I don't to be honest
It's not like I'm gonna use more than 5-10 messages per chat.
I mostly use Opus for help when codex is too dumb.
4o coming back confirmed 
My Claude 4 opus usually stops working after a few messages
I haven't used it recently
Maybe it's been fixed
Elon said Grok 5 is coming before the end of the year
Opus is nice for agentic coding imo, but for fine-grade adjustments o3 is better
I heard gemeni 3 is also coming this year
Gemini
Yeah, maybe.
At the end of the day what the people want is just a model to endlessly glaze them
All this benchmarks don't matter kek
I read somewhere that ai models are made to sugar coat things to you and always think you're in the right
I hope they increase the context window even more, because that's one of the best features Gemini has over competitors
U can tell it u murdered someone and it'll give you excuses on why you had no choice
Yeah.. I want to switch to Claude Code so bad but OpenAI is giving me free tokens to have fun and Opus is too expensive
Man, i wish i had free money from OpenAI and didn't have to use lmarena and openrouter's free models
What grok4 model is used on LM arena?
I share my data for free tokens.
Who cares, it's not like they don't already know all my weaknesses
You get free tokens from openly sharing ur data?
Yeah.
I did not know that
Look, I'm gonna show you.
Go ahead
I swear gpt 5 just disappeared
Please don't swear
When I press that "i" it should've shown me the limits, but it's stupid on mobile, so here are the limits.
im betting gpt 5 just disappeared lol
bro how do you do that lol
Do what
sacrifice your personal data for free tokens lol
Who cares
lowkey it kinda fun talking with gemini 2.5 pro
Why
i love gpt-5
you clearly because you're the one that does it
I share my data, that means I don't care about it
Where's Four-Door Dodge Charger Daytona as a selection option?
well true at least you get something out of it lol
Can you show the cars?
Or at least one, I'm curious about the models
idk
no
where is gpt 5? Lol
Not in lmarena
gemini tweaking
Ah
What are you trying to do
why is gemini hallucinating apache 2.0 licenses and copyright crap in its python scripts
his prob
You gave him PTSD
What happened with GPT-5 ?
wdym
Is this through LMA?
lol its so high
Its not there in the site anymore
no gemini official website
lol
if you have a pro plan(not max) it happens sadly. Claude wants from you to buy max or pay api
You know about the ai studio right?
ye
that's what happens when your reasoning effort is too high lol
๐ฎ๐ ๐๐๐ ๐๐๐บ๐
Anyone played with verbosity parameter?
From my quick research it's just a yap meter
4o lovers would love the current gemini btw. Praising you for doing breathe
@keen beacon
GPT-5 isn't there anymore?
Yeah, it drowned in the amount of requests per second
Bro I thought GPT-5 made you a browser game with those cars
its roblocks
It will be fixed?
Roadblocks
๐ง๐๐ ๐ฝ๐๐ฝ ๐ ๐๐พ๐ ๐๐๐พ๐๐พ ๐๐ฝ ๐ ๐๐๐พ ๐๐ ๐๐๐บ๐๐พ ๐ ๐๐
Most likely, yeah
nah fake, already did this. someone else did with nano and it worked too
the system routed it to gpt 1.5
where is gpt5
Search for OpenAI Platform, Settings, Data Control and Sharing, you need to verify ID if I remember right and it's not free for everyone, you have to be lucky.
Broke all the bones
๐จ ๐๐พ๐พ
but that was pervasive. it works for me, and it worked for others using nano
Most people get it, so try your luck, you have nothing to lose.
๐ณ๐๐บ๐๐๐๐๐๐บ๐
What
New strawberry problem
Same I get a satisfaction
๐จ ๐๐พ๐ ๐ฉ ๐ฟ๐๐พ๐พ ๐๐พ๐พ๐๐ ๐ ๐พ๐๐บ๐ ๐
Unlucky
if it got it right it means this is all a government conspiracy
I don't know what those 7 free evals do.
i've heard that not adding the extra 0 decimal can cause issues (but shouldnt)
5.90-5.11 works, even though 5.9-5.11 should always work too
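For reference, the subtraction being argued about can be checked exactly. This is just a quick sketch using Python's `decimal` module (the variable names are only for illustration), showing that the trailing zero doesn't change the value:

```python
from decimal import Decimal

# Exact decimal arithmetic: 5.9 and 5.90 are the same number.
without_zero = Decimal("5.9") - Decimal("5.11")
with_zero = Decimal("5.90") - Decimal("5.11")

print(without_zero, with_zero)
print(without_zero == with_zero)
```

Both differences come out to 0.79, so if a model answers differently depending on the trailing zero, that's down to the model and not the math.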
No idea either
The X is shy
lol no way.
could be model selector
Wait
theres nano, mini, normal
no. gpt 5 is multiple models
and you don't know which it is
Is this Claude?
no
Or Copilot
web gpt
its copilot
I see
yeah it works
the best model
6
9
3
gemini-2.5-pro
they really have to fix the autorouter
AGI when
You know what
I want ASI
I want to focus on myself more
Let the AI do everything
I also want asi
wdym autorouter
they choose the "correct" model
also something they have to fix - if you have a REALLY long conversation, it basically crashes
WHY IS MY GPT 5 NOT BACK YET
why is gpt-5 gone
Same here. No GPT-5. Just mini and nano
maybe bcz of costs
too much use?
i guess pineapple can say
@echo aurora
GUYZ WHY IS MY BEST FRE MODAL NOT HERE YATTTTT!!1!1111111!!!!! I WANT FRE STUF!!!11!!1
too much hype around gpt 5 and too many people use it without actually using the actual vote feature id say
Yeah GPT-5 isn't available atm
dont just ping them man
team is looking into
ok
is it bcz of the hype?
It takes too long to delete a chat. The deletion doesn't seem to complete until the page redirects to the homepage. Also, if I try to delete multiple chats at once, the process often fails and none of them get deleted. Is anyone else experiencing this?
They are trying to fix it
Hey guys. Where is GPT-5?
people burning tokens
gpt-5-chat is live
im confused that what are they used for
everyone I need help. why is it always saying "Something went wrong with this response, please try again."???
rate limited prob
try again some hours later
oh so the gpt-5 is more "pro"
?
one is more nicer with the answers the other no
ok? still don't understand lol
yes
or does it only have more temp?
oh ok
no it's completly different model
ok
finetuned for chat
which is smarter
same ig
oh
for coding im using gpt-5
for everyday chat i use gpt-5-chat
cool
claude sonnet still better coder btw
is there a gpt-5-thinking?
is opus better than sonnet?
no
is this gpt-5 still (high)
idk
Gpt 5 chat does not reason i think
oh, my simple technical question show that:
- gpt-5(lmarena) = gpt-5 thinking (chatgpt+) >= new gpt-5 (lmarena)
- gpt-5-chat = gpt-5(plus)
- gpt-5 in copilot has limit context
@blazing bison told you
there is a thinking
oh ok
i dont know, but gpt-5 on lmarena have great outputs, better than gpt-5 (plus user) and copilot, oc in my case
Yeah the api is better
oh cool
lmarena team did well
is there a limit?
thanks staff
idk, i just tested 4-5 questions
btw someone do a bench gpt 5 and gpt 5 chat
the limit is its output, maybe
oh ok
No i don't think so
ok
@echo aurora Context window limit is 400k for gpt 5 right?
god yall pinging admins lol
Is GPT-5 always "Generating…"?
it is thinking, maybe, lol
maybe my question is quite hard, thx
both answer at the same speed
Yeah, my gpt-5 on plus account didnt think in my simple question test
official description
atleast on lmarena
2 seconds for first token for both
no matter what the prompt
so are both thinking o both non thinking
it looks like the gpt-5 on chatgpt.com automatically selects thinking or no-thinking
Don't be confused, start super grok for sometime, then turned to gpt 5 you can clearly see what others can't
Grok overpriced af
Is it takes millions for subscription
It is 300 doolers a month
google stock doing pretty good today lol
tested on api playground, both reason
Gemini gemini gemini
takes its time
Mmm i refreshed the page and it appeared.
how to use veo3?
hey, just now this is happening to me
maybe lots of people using it at a time?
copilot is also great
just use "think extremely hard"
at the start of your prompt
yupp.ai is also a great website for trying gpt-5
but its limited
but overall microsoft copilot has become a great option for gpt-5
in my opinion
I don't think gpt-5 and gpt-5-chat are equally smart.
why does gemini keep talking about the current time
wait what the real answer?
pi + e?
TREE(2) is actually 3
isnt gpt 5 chat a router
I tried switching it to Thinking, but failed.
What's the difference between gpt5 and gpt5 chat?
Ahhhhh that makes more sense thanks lol
They arent
The open ai website itself says it
It can think a lot
oh ok
Wow, so many models fail at this
If you want it to think less use "think low" in the prompt
What is the difference between GPT five and GPT five chat?
Gpt 5 is better
Use that
oh, didn't know that u could say in prompt how much to reason
chat dumbed down
All right
gpt-5 better at any reasoning
Ok
oh, thank you, old but good tip, it works
Whenever I chat with him only in one conversation, he hallucinated and I've had over 20 conversations with it
How about you guys?
Oh, you're talking about the chat version
I never really talk to it

nobody uses it in here
Does anyone have questions like this that models confidently fail at?
It is bad
Then how would you know if it's worse?
for something i want a quick answer 2
The chat version
because it doesn't reason as much
dumber answers
OK, so I'm gonna use GPT-5 for coding and essays etc. GPT-5-chat for simple stuff
exactly
Because you said it was way faster
yes
Right now I'm getting it
Actually
So GPT five chat is like the direct answer version of GPT five
I think gpt 5 mini is better than gpt 5 chat
i hear you
Gpt 5 mini reasons
And I never tested the chat version so
Yeah, but it's worse
yeah, i mean
Someone said it was even worse than GPT five mini
Yeah but gpt 5 mini is a good balance of power and speed
good point
like 2 sec of difference
the problem with chat is
it's dumbed down for people that only use chatgpt on their phones for a recipe on a tuesday
People keep saying that GPT five is one step closer to AGI, but is it really? I just wanna hear y'all's opinions, and I'm not talking about the chat version
ehhh
getting there
I mean ig
It is. But the people that are trashing on it they were way too hyped and expected AGI
guys
Before GPT 5 I thought we would have AGI 2026, now I'm thinking 2027-2029
point
All right, can't wait till 26-27-28 or 29 to see how AI gets
lol
true
If Gemini three pro comes out, will it be better than GPT five
Will it be one step closer to AGI?
but not announcments
slow steps
before that we will have Gemini 3.0, maybe 3.5 and grok 5
more like baby crawling
grok 5
will be even closer than gemini 3
Yes
by a lot
When baby grok releases, what will it teach kids?
I mean the gpt 5 we have now i a very good tech. If you showed it to someone 10 years ago they would have a heart attack lol
no chatbots in that era lol
What happens if you show GPT five to a Victorian child
even if u show it to someone 2 years ago
sigh we still wait for AGI so we can finally have an infinite self-improving feedback loop and have a dystopian future with unaligned ASI
won't comprehend
All right
lol
People are saying GPT five sucks
I am confident we will have a version of AM in the 2040s
Is that true?
its good
no
It's like when grok 4 released
All right
Whenever I used Grok 4 for coding, it was damn near Gemini 2.0 pro level
Majority of people are dumb and have way too high expectations
So I don't know about all the hype for Grok four when it came out
yep
yeah
they all want an AGI by 2026
which for now is NOT happening
I feel 2027 or 2028
me too
ASI though? Won't take long once we have AGI
not gonna take long for obvious reasons
lol
mmmm
Grok is overpriced af it needs to change.
maybe 2030
I've seen people say ASI will take decades after AGI
yh
I think most people buy supergrok for ani not the actuall ai
Grok 4 for its price is like selling a hamster for the price of a car
true
Hes actually smart AF tho
elon chill
Yes in logic
It's great for math
But it is not worth the price at all
Yeah it's a robbery
not in a million years
Just use lm arena lmao
If Grok 4 was actually the best AI model at its time then I would understand all the hype
I love lmarena.
hey
are there any benches on every model of the gpt-5 family?
like 5, 5 chat, mini and nano
which gpt5 is
Forget 5 chat it is pointless
Yeah, and it's way less expensive
5 mini i think is close to o3
yeah
But you can use it for free on lmarena just like Grok four and you can also use it for free in ChatGPT
yeah but limits
on gpt
hey
I didn't see any limits on the gpt 5 models
Yeah, but there's no limit in lmarena though
GPT five is doing quite great on writing
for now
I think i heard a few say that it is worse than 4o in creative writing
maybe because chat is more for public
To anyone who has tested GPT five pro how did it feel?
its better in that
How was it?
I have not used it but i think it is just a minimal upgrade
yeah
Proof please
I heard gemini 3 is going to release december
All right
but much MUCH better
I understand it might be much better because it'll come after
but gonna cost more than a house in 2025
But what was y'all's experience with Grok four? There was too much hype for it when it came out
When it coded bad games, was I just using bad prompts?
Is it really the best model?
Let's hope all of them create good models with good prices to keep the competition alive
is that proof?
No the only thing it is good at is reason and logic
Uhm i don't see any gpt pro
Their input prices and output prices are very cheap
Actually on the chatgpt website the context window is limited to 32k. The context window in the pic is for the api
?
yeah
And gpt 5 pro doesn't have any api
sadly
No one talks about copilot of around here so I just wanna clear this out. GPT five is now on Copilot.
yeah but nobody cares about gemini today
Gemini 2.5 pro becomes dumb after 200k context
already sorted it out like 2 hours ago lol
The 1 mil is useless
Ok
nobody uses copilot anyways lol
idk
I feel like GPT five on Copilot might be a faster version
haven't tried copilot
Let me test it out with GPT chat
yeah maybe they using gpt 5 chat
I feel it is gpt 5 mini not gpt 5
They're using GPT five chat
EWWWWW I JUST GOT 5 CHAT IN BATTLE
I told them the exact same thing and they gave the exact same answers
Lol
no temp lol
wait which gpt5 they have in copilot?
chat says @hoary elbow
Nerfed
chat?
It does not reason much
i thought it was pro normal nano and mini
here
pro normal chat mini nano
On lmarena you can get the model to think up to 3 or 4 minutes. But not on copilot
am talking about the pro plan of copilot
yeah
nobody uses copilot lol
i do
I wanna clear something up
well lmarena better
yeah
everyone does that
even chatgpt website
oh
They have to give it system promt to say it is gpt t
nvm
Gpt 5
Maybe they didn't change the system prompt yet
maybe
But when they change the system prompt, I might know, or it might straight up lie to my face and say it's the original version of ChatGPT five
so umm... why is there a gpt-5-chat? what's that about?
It's a small version of ChatGPT five
Hmmm guys when i go use gpt 3.5 on legacy lmarena i get nostalgia
And it's bad
gpt-5-chat is used in the chatgpt website
Same deal as chatgpt-4o-latest I believe
Well, not that bad but like it's worse than GPT five
gpt-5 is the better, more reasoning model
lol
okay
The said they want to simplify the models. But i feel it got more complicated
yeah
wasn't on the livestream at all had to go out
just saw the first 5 min
Is there anything better than GPT image right now?
imagen 4 ultra
much better in text
? what is it?
yeah ig
yeah you guess?
yes
I feel like whenever AI gets like a very big update and they suddenly become like the biggest model I feel like when it updates it might not be like as better so I feel like in like 2029 the first AGI will come
u just said 2027-28 5 min ago lol
In pro version?
Guys gpt 4 was 2 years ago. The difference between gpt 5 and gpt 4 is a whole lot. 2027 will be AGI
Or 2028
yeah
oh wow
2 years ago
time flies
closer to that
What video generators do they have in video Arena?
3 fast
and kling master
and wan
i think so
Google has crazy advancments in I
AI
Damn no image
Compare this to all other AI companies
yes google is king
They didn't even mention publishing 80% of the seminal AI papers of the last decade
ChatGPT couldn't exist without Google
ai is surprisingly growing and evolving so fast, last year ai was garbage and couldn't do anything good
I feel google has AGI internal
I mean classified
The might have it
Dude i am saying classified
How would you know
Hmmm
Overall every tech before releasing gets used by the government first
Rip btw I feel for u
Anyone else here using local system prompts with gpt5? How do you feel it's performing?
Kind of sucks we can't tweak temperature anymore
@echo aurora all image generators expect flux kontext and gpt are dead
Gemini 2.5 pro and gpt 5
thanks
how do i delete all data in lmarena?

What happened to Claude family?
no
What is the difference between the regular gpt-5 and gpt-5-chat?
sota is nowhere near agi yet
and if you want to believe unlikely, unverifiable things, go ahead believe in "agi internally" or some god
See i think the government has a model far better than the public sota
what good does this for you without any evidence in that direction?
did you ever find any indication of it at all?
Maybe the gpt-5 have high reasoning.
I mean it is pretty clear
If you look in history
claude opus 4.1 is acc pretty good
Every big tech before being public was used secretly
pricing tho
assume that is not considered
Where is this
assume we billionaires
gatekeeping
lmarena has claude opus 4,1
Limited
Wdym
isnt that openrouter
noooooo
Thanks
You have to pay btw
yeah
The api price
nothing is free in this world
Oxygen?
?
we have to pay with co2
You have to pay for api price in openrouter not free
there acc are free apis tho there
I never used OpenRouter, but since LMA has Opus 4.1 now why would I want to use it
we have to pay with our data
I know but top models you have to pay
yup
ye thats true fs
amazing at writing, great for coding
esp if agentic
for that warp gives limited free opus 4.1
wait are we allowed to share referral links here??
Gpt 5 is way more efficent than opus 4.1
tru
How many messages
Did you try it
Even if opus 4.1 is better it is very minimal not worth the price. Claude must change or it is cooked
valid lowk
with gpt 5 out
Artificial analysis shows GPT-5 is bad at coding.
Wait for gemini 3 that will blow everything
it got dumbed down today cuz of some bug
i have gemini 2.5 pro lol
It is good
ye its hella good
Just say "think hard" in the prompt
lol
And btw 2.5 pro got nerfed. Original 2.5 pro was at 4 sonnet or opus level
are we allowed to share referral links here?:
oh yeah ik that
<@&1349916362595635286> am i allowed to share referral links here?
check rules
To ensure we have an inclusive and welcoming community we have some rules everyone should review and adhere to. The moderation team has final say over if violations of these rules have or have not occurred, along with the actions we take in response.
- Act in accordance with Discord's Terms of Service and Community Guidelines. Violations of these terms and guidelines should be reported directly to Discord. It's also recommended to be familiar with Discord's Safety Center for more information on how to remain safe while using Discord.
- No NSFW, Harmful Content, or Spam. This includes, but isn't limited to, hate speech, harassment, racism, sexism, homophobia, illegal content, inappropriate profile pictures, sharing of inappropriate content, and so on.
- Treat others with Respect. Be kind, assume good intent from others, and keep disagreements respectful. It's encouraged to share your disagreements, but only if it's done in a respectful and productive way.
- Do not promote or advertise. This includes sharing of: social media, other Discord servers, or involved projects in a promotional manner.
- Avoid political and religious content. As a space that's inclusive to many different worldviews we ask to avoid topics related to politics and religion in order to maintain an inclusive space. It is okay to have discussion related to new policy or laws as long as it's related to AI.
- Do not impersonate staff, moderators, or others. Efforts to impersonate LMArena staff, server moderators, or other community members are not allowed, even in a joking manner.
- Message in English only. Please keep discussions in English.
Most importantly, remember why we are here, to advance the understanding and application of AI!
GPT-5
Wait this the photo from yesterday.
This is from now.
98% at AIME but somehow can't solve an equation
Imagine the thousands of simple equations GPT can't solve.
Every single ai is garbage at math without reason
gpt 5 has been dumbed out
So benchmark is fake?
today gpt 5 has been dumbed out, sam altman posted in twitter abt it
ye
Let me see
I fine-tuned OpenAI's OSS 20B reasoning model using the most popular medical reasoning dataset and published the results on Hugging Face. The model can break down complex medical cases step-by-step, identify possible diagnoses in clinical scenarios, and answer board-exam-style questions with logical reasoning.
During training, I used 4-bit optimization and enhanced the model's performance in medical contexts while preserving its Chain-of-Thought reasoning capabilities. The training format includes "question," "Complex_CoT," and "Response" fields, allowing the model to first reason in detail, then provide the final answer.
You can check it out here:
https://huggingface.co/dousery/medical-reasoning-gpt-oss-20b
I'd love to hear feedback from anyone working on or interested in medical AI.
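To make the "reason first, then answer" training format a bit more concrete, here is a rough sketch of how one record with those three fields might be flattened into a single training string. Only the field names `question`, `Complex_CoT`, and `Response` come from the post above. The template, the `format_example` helper, and the placeholder record are assumptions for illustration, not the author's actual preprocessing.

```python
def format_example(record: dict) -> str:
    """Render one record as reasoning-then-answer text (hypothetical template)."""
    return (
        f"Question: {record['question']}\n"
        f"Reasoning: {record['Complex_CoT']}\n"
        f"Answer: {record['Response']}"
    )


# Placeholder record; the strings are dummies, not real dataset content.
sample = {
    "question": "<case description or exam-style question>",
    "Complex_CoT": "<step-by-step reasoning>",
    "Response": "<final answer>",
}

text = format_example(sample)
print(text)
```

Putting the reasoning field before the response in the rendered text is what lets the fine-tuned model learn to emit its chain of thought before the final answer.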
They said they will double the rate limits for Plus
mhm
Don't know about that. Deepseek V3 calculated it just fine without reasoning.
And the model is from march
Okay so what is the point of this argument?
Without reasoning, models can do math just fine too
