#general


ocean vortex
#

disagree

#

😇

solid brook
#

guys competition is good.

eternal niche
#

US were created by Europe lol

solid brook
#

only came to the land few hundred years ago

stray aspen
#

why is this thing not working

ocean vortex
#

What is currently unfolding in US, Europe (or EU) has already lived through and hopefully put behind...

echo aurora
#

lets not go down this road pls

solid brook
stray aspen
#

google needs to lock in

devout vault
#

google will prob win again against gpt-5

#

with their new gemini 3 model

red tangle
#

lol gemini 2.5 deepthink is probably better than gpt5

stray aspen
#

how do you know that

red tangle
#

i have

#

it's miles better than o3 pro

#

i have completely switched from chatgpt to gemini usage because o3 has been such a poor showing

devout vault
#

gemini is completely free with almost no usage limits at all

red tangle
#

o3 pro thinks for 20 minutes and is like at max 5% better than o1 pro

stray aspen
devout vault
#

gemini 3 is waiting for gpt-5 to release

#

so they can compare

#

their models

devout vault
#

gemini 3 is already done training

#

a month ago

ocean vortex
# stray aspen google needs to lock in

they added a nice feature for googling the gpt5 launch time, conveniently. If you google "time 10am pt" it converts to your local time. How nice of them 😇

stray aspen
#

i love google

obsidian shell
#

gpt5 in 20 minutes?

solid brook
#

only 2 hours and 20 minutes left

stray aspen
warm fulcrum
#

guys why does it feel like openai slowed time

red tangle
#

gpt5 isn't even out lol

solid brook
stray aspen
devout vault
#

gemini 2.5 pro was so powerful when it first released, which shows google's true strength

red tangle
devout vault
#

gpt 5 - pricy
gemini 3 - free

solid brook
#

good competition is always good for us

stray aspen
#

gpt 5 50 euro per token lol

red tangle
#

openai has already lost on API and enterprise

white hatch
red tangle
#

anthropic

devout vault
#

claude is like a human when it comes to coding

red tangle
#

why would you use a 10x more expensive model

devout vault
#

gpt-5 will be smarter than claude opus 4 for sure

#

but gemini 3? idk

red tangle
#

um

#

im pretty sure they will drop something this month

#

they've been hyping it

solid brook
#

claude is so expensive. someone needs to beat them at coding to challenge them

blazing bison
#

I got access to gpt 5, the one that plus users will receive

#

Too much hype

red tangle
#

the markets are idiots

red tangle
#

lmarena no style control google is dominating

#

it's not even close

#

lol no they arent

solid brook
devout vault
#

i think animated is lying

#

lol

red tangle
#

theres somehow a 75% chance openai has best model by end of august on polymarket 😂 😂 😂

based on lmarena no style control

#

i did

#

$1k

blazing bison
#

Well, no one needs to believe me 🤷‍♂️

red tangle
#

it's my first time betting on polymarket

devout vault
#

no proof at all

stray aspen
devout vault
blazing bison
#

What do you want, prints?

#

Share screen?

devout vault
blazing bison
#

??

red tangle
#

wall street is better than polymarket traders

#

HF analysts actually know how to do math

#

a lot of smart money has googl rn

#

like a loooot

white hatch
#

When was gpt 5 announced?

patent aspen
#

It's a small enough market that I could move the market

ocean vortex
solid brook
ocean vortex
devout vault
#

what is st

stray aspen
#

wall street

solid brook
#

yes impossible. that is the reason i used the term "imagine"

patent aspen
red tangle
#

all of the good HFs are subscribed to semianalysis

ocean vortex
#

I think it's gonna be made avail immediately. Too big of a release not to. At the very least to their Pro subs same day. But likely more

red tangle
#

and have really smart analysts constantly looking at compute & energy that google vs openai are building

#

whereas polymarket is people who invest based on twitter vibes

solid brook
patent aspen
#

SemiAnalysis is good if you want to invest in the supply chain of the relevant companies

devout vault
blazing bison
warm fulcrum
#

he would never do such a thing

devout vault
#

rn

blazing bison
#

Of copilot?

#

🤓

solid brook
#

this guy is such a troll

blazing bison
#

Im not the only one with access btw

stray aspen
#

theres no gpt 5 on copilot yet

devout vault
warm fulcrum
stray aspen
#

i dont

blazing bison
blazing bison
warm fulcrum
#

i saw some other people get it

#

was it just random or did u have to do something?

blazing bison
#

Random

eternal niche
#

guys i have gpt6

devout vault
#

i have gpt-6 on copilot rn

warm fulcrum
#

does it perform like summit or zenith

blazing bison
#

Idk too much limited

solid brook
devout vault
#

do that

blazing bison
#

I can't send images

#

For example

stray aspen
#

guys ive just been granted access to gpt 8.5 pro high max reasoning 1 billion context

devout vault
blazing bison
#

Idk

devout vault
#

u can send photos here

warm fulcrum
#

wait wdym

#

too much limited?

#

rate limit?

blazing bison
#

They added a limit of output or something

#

Maybe it's not really gpt 5

ocean vortex
warm fulcrum
#

but finetuned

devout vault
wicked ingot
warm fulcrum
#

they did the same for gpt 4

devout vault
#

i like these ngl

warm fulcrum
solid brook
#

Guys what do you think will be the free tier model with no limits?

torn bison
#

given that summit and zenith have already been tested in the arena, they have enough confidence to do so

warm fulcrum
#

lmarena so greedy

#

they didnt keep zenith

eternal niche
whole wagon
#

why didnt lm arena give us the sota model for free smh

solid brook
#

2 hours left

whole wagon
warm fulcrum
#

so greedy

red tangle
#

guarantee 99% of polymarket traders can't even buy secondaries

#

it's very few individuals

#

it's mostly funds that are buying up secondaries

stray aspen
#

i think horizon beta has given me the best plug and play main menu for roblox out of all the LLMs ive tried

rapid merlin
#

they having a livestream in about 2 or so hours right

stray aspen
#

yes

quartz light
stray aspen
quartz light
warm fulcrum
#

they said it was really good

#

but i never got the chance to try it

hollow imp
#

Actual chatgpt plus and pro sora?

molten cipher
#

Hey @echo aurora , I was wondering if I could talk with you through DMS or a ticket in this server?

patent aspen
#

I think they know but also don't have models that deal well with it. No AI company is profitable and wall street models generally are focused on 6-18 months in the future at best

echo aurora
wintry tinsel
#

China is in actual flames now, just because they open source good models doesn’t mean the country is holding itself together lol

molten cipher
quartz light
warm fulcrum
#

greedy mfs

molten cipher
warm fulcrum
#

why would pineapple take away zenith from us??!?

echo aurora
raven helm
#

GPT-5 benchmarks just leaked (It might be fake; take it with a grain of salt.)

rapid merlin
#

where did you grab this from?

raven helm
#

But might be fake

#

take it with a grain of salt

rapid merlin
#

most likely yeah

zinc ore
#

It's strawberry

#

He probably made those this morning

rapid merlin
#

excited for gpt5 july 31

#

strawberry?

zinc ore
#

Okay I mean on the 13th*

rapid merlin
#

wasnt strawberry o1

zinc ore
#

Look at the strawberries in his username

raven helm
#

He also posted this

whole wagon
#

bruh

stray aspen
#

fake

raven helm
#

But again; take it with a grain of salt

whole wagon
#

he could have at least attempted to make it look realistic

rapid merlin
#

yeah no

raven helm
whole wagon
#

you obviously do not get a base model performing like that on arc agi 2

raven helm
#

yea

#

From brockman

stray aspen
#

gpt 4.5

#

no way

raven helm
rapid merlin
#

gpt 4-p

#

4 pixels

astral jetty
#

Very curious to see if the creative writing is less repetitive than even something like Gemini

raven helm
#

Yea, i think that will be getting an upgrade

#

(Hopefully there'll be less/no em-dashes)

stray aspen
#

horizon beta is so good at lua

solid brook
#

i feel fake

raven helm
#

Yea, no way a base model scores that high

#

or maybe it does

#

idk

solid brook
#

that is just way too big

#

way too big

astral jetty
#

I can kinda believe it, but I don’t think the gap will be that big between Gemini 2.5 and base gpt 5

solid brook
#

if this is true

#

gpt 5 is AGI?

stray aspen
#

no

pure anvil
#

agi is when spinny hexagon+snake

solid brook
#

i guess gpt 5.5 then?

#

I feel we are so close

stray aspen
#

we are close to AGI

eternal niche
#

agi is fake

torn mantle
#

stop reposting that idiot

raven helm
#

Ok, sure. Sorry.

torn mantle
#

is it you?

#

strawberry guy = you?

#

😡

raven helm
#

No No,

torn mantle
#

ok good

raven helm
#

I'm just saying i'll stop posting it here.

torn mantle
#

thanks

raven helm
#

No problem!

eternal niche
#

it is

jade egret
#

btw gpt-5 today right

#

YO

stray aspen
#

its out

#

47 minutes

#

for launch

obsidian shell
#

i dont think there will be a huge difference

probably 5-9%

#

they are just capitalizing on the hype they have been building

storm needle
keen beacon
#

Ahhhhhhhhhh

jade egret
#

better be much better at coding than 4.1 opus (:

quartz light
#

38 minutes

coarse glade
#

Guys quick question how do we do text to video on LMArena.ai

echo aurora
coarse glade
#

So I can’t do it on the website

echo aurora
#

Video Arena is currently only available through our Discord

torn mantle
#

some cool demos

blazing bison
#

Gpt 5 on copilot feels like 4o v2

stray aspen
#

30 minutes

stray aspen
blazing bison
#

I hope it's not the same

astral jetty
stray aspen
#

it is

quartz light
echo aurora
#

Reminder we have our Staff AMA tomorrow with the dev behind our Video Arena bot, if you have any specific questions be sure to add them here

quartz light
quartz light
#

12 minutes!

astral jetty
# quartz light

256k sounds right, I don’t want to get my hopes too high

blazing bison
#

From my first impressions, it's not good, but I had good results with zenith

spring rune
#

I have this

blazing bison
#

Lesgoo

#

Hype or sota

#

Answer in 10 minutes

wintry hamlet
#

how do I use the bot in direct messages

echo aurora
wintry hamlet
#

oh ok. It would be cool if you could

spring rune
#

Hey pineapple when will gpt-5 be added to lmarena?

echo aurora
proven raft
wintry hamlet
#

Ok

astral jetty
storm needle
stray aspen
#

gpt 5 is out

#

400k context

proven raft
#

Ooo

stray aspen
#

@echo aurora add gpt 5 to the arena

astral jetty
astral jetty
rapid merlin
#

1 minute yall

void elm
proven raft
#

Yea it's cheaper than gpt 4.5

clever estuary
stray aspen
clever estuary
#

400k context

void elm
#

4o*

proven raft
clever estuary
#

4o
2.5 dollar input
10 output

proven raft
#

I think it was 200k for 4o

#

Most likely

#

o3 deep research is still the smartest one, tho with huge latency

rapid merlin
#

even a free tier? goddamn

tame horizon
whole wagon
#

what the hell is this graph

#

makes no sense lmao

brittle tiger
#

still don't have audio or video input which is annoying

raven oracle
#

yes it does, bottom is no think top is with think

whole wagon
#

o3 is at 69.1

#

like the scale of the graph is impossible

#

lmao

void elm
keen beacon
#

Very happy to hear about it

tame horizon
clever estuary
rapid merlin
#

graph made by chatgpt

willow grail
#

byebye anthropic

tame horizon
#

Thank you

willow grail
#

byebye opus 4.x

#

BYEBYE

#

say hello to gpt5

barren prairie
#

Let's test and see if it is great or, as always, just hype

willow grail
#

were.

#

case closed.

#

you are fanboying billy

brittle tiger
#

The disparity between GPT-5 Thinking score (incredible) and no-think (awful) is pretty crazy

willow grail
#

wait thats a crow?

whole wagon
#

how many thinking tokens is it gonna use kek

willow grail
#

omg we both like crows

rapid merlin
#

barely better than 4o

willow grail
#

is it not a crow?

devout vault
void elm
brittle tiger
#

Thinking score is amazing and I don't know why you would use it without but definitely interesting

void elm
#

thats pathetic

keen beacon
#

Daamn

void elm
stray aspen
#

its out on lmarena

#

gpt 5

hardy pecan
stray aspen
#

that was quick

devout vault
astral jetty
stray aspen
#

is the gpt 5 in arena think

blazing bison
rapid merlin
patent aspen
#

GPT-5 seems to be weaker or comparable to Deep Think on all benchmarks without tools

blazing bison
#

But it answers faster

#

Its important

primal orbit
#

GPT 5 live direct arena!

stray aspen
#

gpt 5 crushed the webdev by far

warm fulcrum
#

yeehaw

blazing bison
#

And you can send more than 5 prompts per day

#

Also important

obsidian cargo
#

summit was GPT-5? Then what was zenith?

warm fulcrum
hoary plaza
#

Ig it's time google will release something 😂

spring rune
#

FINALLY YAY!!!

misty star
#

@echo aurora I love you

patent aspen
#

Only a 21-point ELO lead over 2.5 Pro. We're good

warm fulcrum
spring rune
#

Love you guys

stray aspen
#

crazy

hoary plaza
#

Since they want #1

exotic tartan
#

WOW

whole wagon
#

LMAO all the bettors lose with openai

#

it didnt win without style control

prime mulch
#

Hell yeah

primal orbit
#

lmarena first with gpt5 on the web, gratz

whole wagon
clever estuary
#

they are deprecating everything???

meager harbor
#

what a joke, an AI pioneer not even able to make a proper graph, embarrassing

prime mulch
#

@echo aurora thanks a lot

stray aspen
#

cooked

patent aspen
#

I'm feeling pretty good ngl

stray aspen
#

the hype was a marketing stunt

spring rune
#

Hey brian! Looks like gpt 5 is indeed becoming underrated right?

elder rapids
#

dawg they're deprecating all the previous models

blazing bison
#

Much marketing little results

#

Hahahahahahha

clever estuary
keen ferry
#

gpt 5 worse than Gemini 2.5 pro?

blazing bison
#

Yes

soft river
#

I have a question, as soon as they started the live stream, they already published GPT 5 on the official website?

blazing bison
#

No

ember sentinel
#

guys, when? 🔥

spring rune
#

Well animated you know gpt 5 is way better than gemini 2.5 pro right?

barren prairie
#

Now Google can have a good sleep 😆

hardy pecan
#

no one's interested in non style control

ornate agate
blazing bison
primal orbit
sleek stump
#

what video model is this AI using and how many videos can i generate in a day?

indigo hazel
meager harbor
sleek stump
astral jetty
stray aspen
#

cant wait for gemini 3 to destroy OAI

fleet ocean
sleek stump
primal orbit
fleet ocean
stray aspen
#

horizon beta chat was disabled

split kayak
#

ok

sleek stump
fleet ocean
#

No problem

astral jetty
split kayak
#

ok

barren prairie
spring rune
civic flame
#

@echo aurora can you shed some light on what zenith was? there wasn't just summit

leaden meteor
#

They can't. It's against their policy.

stray aspen
#

so summit was gpt 5

cyan zodiac
#

does anyone know what makes gpt 5 better than opus 4 at coding?

keen talon
#

what is the limit for video arena?

primal orbit
stray aspen
#

eight

indigo hazel
#

i cant wait to see gpt 5 failing at your tests lmao, send images pls

hoary plaza
#

I remember seeing opus 4.1 in direct chat

#

Was it removed?

primal orbit
torn mantle
#

As I've guessed summit = gpt5

#

It wasnt that obvious

stray aspen
#

what was zenith

proven raft
stray aspen
#

is the gpt 5 in lmarena reasoning?

tidal schooner
stray aspen
#

im guessing it is because it took forever to answer my prompt

proven raft
#

Gpt 5 has got 50 percent in hfe

tidal schooner
proven raft
#

That's crazy

fleet lintel
#

Am I reading the evals incorrectly? GPT-5 looks underwhelming to me.. What am I missing?

dreamy sparrow
#

is every model in lmarena even real

dreamy sparrow
stray aspen
meager harbor
rapid merlin
fleet lintel
#

Things are slowing down but GPT-5 was supposed to be like multiple levels better ..

proven raft
#

Many ppl are using it

dreamy sparrow
#

most apps use it

echo aurora
dreamy sparrow
stray aspen
#

is it gpt 5 think or no think

leaden meteor
#

Why hasn't lmarena added gpt 5 thinking to the leaderboard yet? If it is zenith, it should have had enough votes like summit/gpt5main?

dreamy sparrow
#

think

rapid merlin
keen beacon
#

People are already complaining about gpt 5, lol

#

Just enjoy it

eternal niche
#

so gpt5 is crap

dreamy sparrow
#

tho idk

#

maybe Gemini IS better

stray aspen
#

gemini 3 will destroy it

red sluice
keen beacon
#

Don't like it

brittle tiger
#

I don't think all of GPT-5 is thinking. It might do some to determine which to route to but they wouldn't differentiate between main and thinking if both were thinking.

fleet lintel
fleet lintel
#

I thought OAI was 3-4 months ahead of Google. I think it's behind now

rapid merlin
primal orbit
#

Gemini 2.5 was released in March. Google has to have much better version internally now.

barren prairie
leaden meteor
fleet lintel
#

polymarket suddenly made a huge move in favor of gemini models for Aug

primal orbit
meager harbor
patent bane
#

oh no it's more censored

fleet lintel
barren prairie
stray aspen
dusky aurora
leaden meteor
meager harbor
jade egret
#

is gpt-5 better than gemini

stray aspen
#

slightly

barren prairie
whole wagon
small haven
fleet lintel
#

I think the wolfstride model is better than gpt-5 ... this is very disappointing 🙁

small haven
meager harbor
void elm
#

what is this creature talking

feral lichen
#

Best ai for

#

Rbx studio

#

?

void elm
#

shut up

stray aspen
void elm
#

nah

keen beacon
#

Yes. They are surely updating the gpt image model

civic flame
rapid merlin
#

fr

feral lichen
civic flame
#

it thought for less time than summit on 90% of tasks

#

lmao

#

it also had less juice

#

that was NOT pro

eternal niche
#

@deep adder hi

civic flame
#

they just picked the checkpoint that won in elo even though it was less performant facepalm

#

openai never help themselves do they

whole wagon
#

Oh dear....

dreamy sparrow
#

is the gpt in lmarena the pro thinking and not normal one

jade egret
#

is gpt-5 good? (better than claude 4.1 opus?)

civic flame
meager harbor
#

so gpt 5 is 200 elo better than gpt 4 0314 (the og one)

fleet lintel
# whole wagon

who is the only voter in favor of GPT-5 ? :). who is the crazy person? Reveal yourself

red sluice
#

Damn so GPT-5 was on lmarena since the 27th of July 🤯

rapid merlin
eternal niche
#

well...

jade egret
stray aspen
whole wagon
red sluice
#

...

stray aspen
#

craig stop licking openai's boots

native flame
#

Hii, does this gpt-5 have thinking mode activated by default? Like o3??

dreamy sparrow
#

lmao

jade egret
dreamy sparrow
#

it's google mate

fleet lintel
wheat onyx
#

Absolutely huge

dreamy sparrow
mental briar
#

but not by default

dreamy sparrow
#

yeah i guess?

brittle tiger
keen beacon
meager harbor
lavish orchid
keen beacon
#

Can people stop complaining already? Take a deep breath

fleet lintel
civic flame
steady vale
#

grok 4 honestly kinda sucks

for any type of question that isn't just benchmark maxxing types

fleet lintel
#

If I remove style control... 2.5 > Gpt-5.. WTF

civic flame
#

1 point better

#

💀

verbal nimbus
#

It's crazy and kinda scary how much Gemini 2.5 Pro hallucinates.

rapid merlin
#

32K? 🥀 🥀

#

yeah, go google go go go

verbal nimbus
void elm
#

gpt 5 is a disappointment

#

gemini 3 last hope

fleet lintel
#

This release is breaking my heart :(. I had so much hope

fleet lintel
astral jetty
stray aspen
#

if its gonna have an 8k window for free users ill just use google ai studio

#

that sucks

verbal nimbus
void elm
#

so much hype but so little progress

rapid merlin
astral jetty
whole wagon
#

Google has incredible stuff upcoming there is no worries trust me. I am a bit shocked by this gpt5 release

#

i thought they would have cooked for sure

#

Tbh i tried gpt5 and the vibes are great

stray aspen
#

i hope google doesnt mess up

verbal nimbus
empty stump
#

Hi how good is gpt 5

eternal niche
#

craig still coping

civic flame
#

michelle please stop i can't see you through the tears in my eyes

stray aspen
whole wagon
#

like i tried actual help for real world coding and it was pretty good

#

maybe dont judge pure off benchmarks just yet

verbal nimbus
empty stump
keen beacon
eternal niche
keen beacon
native flame
#

Well then how can I make the GPT-5 on lmarena think?? Like the reasoning models?

feral lichen
#

You know, the human brain can reach further than an ai

barren prairie
keen beacon
keen beacon
#

You guys are so negative

indigo hazel
#

guys i honestly think the hallucination reduction is still really good even if it's not the best model at every task

void elm
#

AI is NOT replacing jobs 💀

quartz light
# quartz light
Poll: GPT-5 Context Window Predictions. Winning answer: 256k 😕 (option 2, 7 of 14 votes)

void elm
#

1 year wait for this trash btw

iron meadow
#

So

blazing bison
#

1 year?

#

2 years

rapid merlin
void elm
#

more yea

fleet lintel
blazing bison
#

2 years for a mid model

iron meadow
#

How does gpt 5 compare to opus 4

rapid merlin
#

saved to my gifs

iron meadow
#

Does it have more sophistication?

whole wagon
#

its not trash cmon lol. it is still SOTA, its just not the jump expected

verbal nimbus
tall summit
#

gpt 5 seems... better at translation than any other from openai

rapid merlin
#

censorship update

blazing bison
#

people will stick with claude, believe me

void elm
#

this jump is the same amount of improvement from claude 4.0 to 4.1

leaden palm
#

what is my timeline 😭

zealous panther
#

What was zenith

blazing bison
#

Dario did nothing and won

#

🤑

native flame
stray aspen
rapid merlin
void elm
#

alr bro

fleet lintel
empty stump
#

What is airline

zealous panther
fleet lintel
#

I fell for OAI hype again... this is like 5th time I fell for it

barren prairie
iron meadow
blazing bison
#

gpt 5 is really a 4o v2

zealous panther
keen beacon
#

Boys, "15.9% for Grok 4 vs 9.9% for GPT-5." it sucks. sad face emoji

#

arc agi

verbal nimbus
#

Is GPT-5 a router model?

rapid merlin
#

gpt-4oo

whole wagon
#

that aint great ngl

wintry tinsel
#

Is there a gpt 5 heavy or a gpt 5 thinking yet?

keen beacon
#

This is the worst "upgrade" ever!

fleet lintel
#

50 ELO improvement over 2.5 gemini.. that was my mid-level expectation.. i was hoping for 65+

verbal nimbus
jade egret
#

so gpt-5 suck?

whole wagon
wheat onyx
#

Why are people thinking this is bad? This looks fantastic to me

glad perch
red sluice
whole wagon
keen beacon
# whole wagon

Simple-bench needed right now! can't wait for those results

jade egret
#

so they hype it up way too much lol

whole wagon
brittle tiger
wheat onyx
keen beacon
jade egret
#

guys please tell me is it even better than opus 4.1?

weary flint
#

Hello, how do I make videos in 9:16 size?

fleet lintel
deft vigil
#

Wow finally Sam hit the wall

steady vale
#

this long context performance is really good

actually game changing tbh

whole wagon
#

i really like gpt5 ngl. i actually used it

zealous panther
#

Use it bruh

meager harbor
keen beacon
#

Google DID "out accelerate" Sam

weary flint
#

Hello, how do I make videos in 9:16 size?

keen beacon
#

lol

whole wagon
#

the benchmarks arent capturing smth it feels

deft vigil
#

Any coding benchmark guys

void elm
#

gemini 3.0 deep thinking is gonna leave gpt 5 in the dust

wheat onyx
#

looks like GPT5 is much better at writing, coding, and overall knowledge, while having 1/5 the hallucinations. Pretty huge imo

zealous panther
fleet lintel
echo aurora
wintry tinsel
#

GPT 5 is confirmation this crap has hit a wall and we’ve been grifted into thinking it hasn’t for over a year now

zealous panther
indigo hazel
keen beacon
jade egret
wheat onyx
verbal nimbus
# whole wagon

I don't think ARC-AGI is reliable for models released after the benchmark. Just look at how o3 (High) scores 68.8% on ARC-AGI-1 vs. Opus 4's 35.7%, but for ARC-AGI-2 they score about the same (Opus scores higher now). The current models will probably do badly on ARC-AGI-3.

keen beacon
#

Slop maxed model

#

No way

wintry tinsel
#

AI has hit a wall guys

void elm
wintry tinsel
#

They don’t know how to scale or improve it anymore it’s all investor hype

wheat onyx
keen beacon
#

This is what Sam commented on the chart crime: "wow a mega chart screwup from us earlier--wen GPT-6?! correct on the blog though."

fleet lintel
# jade egret

@deep adder is again the only voter in favor of gpt-5... are you an OAI employee?? reveal to us

feral lichen
#

Ai has limits, the brain does not

keen beacon
#

"GPT-5 is here - and it’s #1 across the board.

🥇#1 in Text, WebDev, and Vision Arena
🥇#1 in Hard Prompts, Coding, Math, Creativity, Long Queries, and more

Tested under the codename “summit”, GPT-5 now holds the highest Arena score to date." THEN WHAT WAS ZENITH???

#

context: zenith was better

stray aspen
#

its gpt 5.5

wheat onyx
blazing bison
#

reveal zenith

feral lichen
#

Does anyone know anything about monitors?

whole wagon
#

summit > zenith

blazing bison
#

no

red sluice
#

Yup if zenith is google damn son we're gonna have fun

blazing bison
#

no one agrees that summit >zenith bro

#

NO ONE

keen beacon
#

GUYS! GPT-6 CONFIRMED! (gone wrong!)

whole wagon
#

no way

white hatch
#

we'll see how gpt 5 will fix the bug in my project

wheat onyx
fleet lintel
red sluice
#

oh

primal orbit
#

openai is not the same without Ilya.

keen beacon
#

We need ilya back tbh

verbal nimbus
# wheat onyx

Would be nice if there was a graph with Claude, and the price on the x-axis.

blazing bison
#

imagine if zenith is gpt 5 -mini

#

fun

keen beacon
#

His balding head made the company bold

wheat onyx
void elm
#

openai is over

#

not even gpt 6 will save them

#

actual bs

stray aspen
#

at this point deepseek r2 will cook them

verbal nimbus
keen beacon
#

Bro this is pure trash

#

coding specialized model

fleet lintel
#

nah.. OAI will remain the top company for at least a couple of years... but the direction is not great

keen beacon
#

Opus 4.2 will cook this

wheat onyx
narrow dawn
#

lmarena doesn't work

echo aurora
wheat onyx
narrow dawn
#

i need to select the model and there is no model

#

._.

deft vigil
#

Lmarena is dxomark 2.0

unborn lantern
#

guys, how can i use gpt 4.5 in lmarena?

whole wagon
#

Admittedly I am extremely confused how they are expecting to reach AGI

stray aspen
#

is lmarena down

whole wagon
void elm
verbal nimbus
meager harbor
#

SCAM HYPEMAN

pulsar rain
#

gpt-5 tends to go straight to the point compared to gemini 2.5 pro

unborn lantern
void elm
#

google has infinite money

#

everyone shitted on bard then they cooked

keen beacon
wheat onyx
narrow dawn
echo aurora
opal juniper
unborn lantern
tired herald
#

Yeah model selector be having some problems

narrow dawn
#

bruh :/

opal juniper
#

However the battle mode works

wheat onyx
#

Anthropic should have no problem for staying ahead in coding then, they said big improvements in coming weeks

feral lichen
#

Max tokens for gpt 5?

narrow dawn
#

i was just doing my homework with it

unborn lantern
#

guys, how can i use gpt 4.5 in lmarena?

keen beacon
#

Guys! have we hit a wall??? first claude opus 4.1 scores 3 percent more and now this?

stray aspen
#

lmarena model selection is not working

unborn lantern
sleek crow
#

finally a model that can generate a Minecraft clone

verbal nimbus
feral lichen
#

Max tokens for gpt 5?

stray aspen
narrow dawn
opal juniper
fleet lintel
keen beacon
#

"Grok 5 will be out before the end of this year and it will be crushingly good" ELON ON TWITTER!!!

sleek crow
#

gpt-5

unborn lantern
pulsar rain
narrow dawn
sleek crow
#

is available on lmarena

torn mantle
#

is lmarena down?

#

cant select models

stray aspen
keen beacon
feral lichen
#

Max tokens for gpt 5?

verbal nimbus
exotic tartan
keen beacon
#

WE WAITED 2 YEARS FOR THIS???

lone vector
whole wagon
#

Even the december odds are shifting...

feral lichen
echo aurora
meager harbor
stray aspen
#

waited so long for this garbage

torn mantle
keen beacon
tired herald
#

Model selector back on again

exotic tartan
wheat onyx
#

The AIDS chart

unborn lantern
#

guys, how can i use gpt 4.5 in lmarena?

echo aurora
barren prairie
#

I hope that DeepSeek won't do the same thing, waiting ages for garbage

pulsar rain
#

1 million token point is barely enough if you want it to read all the text in a book 🤣

thorn ore
#

is chatgpt-5 a joke model

#

i think its fake

steady vale
#

JUST IN: GPT-4.5 got removed from chatgpt's website

thorn ore
#

Oh

#

NOOOOOO

stray aspen
void elm
fleet lintel
whole sundial
verbal nimbus
# lone vector

Gemini hallucinates so terribly though. It sometimes doesn't even tell me I forgot to attach a document; it just makes one up.

void elm
#

its not even removed

prime mulch
void elm
#

4.5 is still there

#

lies

wicked root
#

How much better is gpt5?

pulsar rain
whole wagon
keen beacon
#

This is deeply saddening.

warm fulcrum
#

@echo aurora which gpt-5 version is it that's displayed on lmarena?

whole wagon
#

thats crazy work ngl

unborn lantern
pulsar rain
stray aspen
verbal nimbus
wicked root
#

Is gemini 2.5 screwed?

warm fulcrum
tired herald
#

Lmfao

hoary elbow
#

I overslept and I woke up realizing that GPT five is out

stray aspen
stray aspen
tired herald
#

Theres two models called "gpt oss 120b" on lmarena rn

keen beacon
#

Why did they release this? If they did it for the normies, why did they hype it so much!?

stray aspen
hoary elbow
#

GPT-OSS 120B is an open source model made by OpenAI. ChatGPT says it’s just as powerful as 4o

keen beacon
#

Everything!

#

it does not match the hype!

pulsar rain
stray aspen
#

its the worst open source model ever

unborn lantern
#

Same or bugs?

stray aspen
#

plus it has north korean level censorship

#

like what were they thinking

hoary elbow
blazing rune
#

They didn't release any benchmarks for the Mini and Nano versions of GPT-5.

echo aurora
keen beacon
#

GUYS! GPT-5 supposed to be the best model for cost to performance though.

blazing rune
#

probably means they are about at the level of 4.1 Mini and Nano

hoary elbow
#

Is it better than Grok though?

pulsar rain
torn mantle
#

its working now @echo aurora

hoary elbow
#

Ok

stray aspen
eternal niche
keen beacon
#

ITS NOT A REVOLUTIONARY MODEL! its an efficient model

rapid merlin
stray aspen
#

it didnt beat grok in arc agi 2

wicked root
#

Polymarket’s saying gemini’s over

torn mantle
#

so far gpt5 is good

verbal nimbus
# wicked root Is gemini 2.5 screwed?

Google has the efficiency advantage because of their custom TPUs. It's kinda crazy that you get free unlimited use of Gemini 2.5 Pro on AIStudio with 1M context.

rapid merlin
#

are yall testing it from lmarena?

#

dont think they put it on their site yet

echo aurora
wicked root
#

Wait so is it over for gemini?

keen beacon
#

"GPT-5 results on ARC-AGI 1 & 2!

Top line:

65.7% on ARC-AGI-1
9.9% on ARC-AGI-2
" IT DOESN'T EVEN MATCH o3 FROM DECEMBER AT ARC AGI 1!!!

stray aspen
pulsar rain
#

Nothing will replace 1M context. just paste the whole book and it knows everything

hoary elbow
hoary elbow
#

I saw the benchmarks

#

But at least GPT five is good

warm fulcrum
keen beacon
#

LOGAN! say it! say the damn words! "Gemini Gemini Gemini"!!!

astral jetty
hoary elbow
unborn lantern
#

They Didn't increase their knowledge cut off parameters

verbal nimbus
stray aspen
#

this one?

wintry tinsel
#

Gemini may be the best now but google will neuter it once they have market monopoly we need heavy competition to keep Gemini good

tired herald
hoary elbow
#

Is there another way to get GPT five to search

hoary elbow
#

Because Joe Biden is not the president

torn mantle
#

nah its actually so good

hoary elbow
#

Not anymore

torn mantle
#

i have many things to say

whole wagon
#

this is literally a stale model wth

stray aspen
tired herald
#

Dem

devout vault
#

Why is the cutoff date Oct 2024?

verbal nimbus
patent bane
#

gpt-5 on chatgpt is dumber than the one in API????

whole wagon
#

crazy

echo aurora
tired herald
#

Openai logins are broken lol

keen beacon
#

"gpt-5 fast facts:

  1. hits sota on pretty much every eval
  2. way better than claude 4.1 opus at swe
  3. 5× cheaper than opus
  4. 40% cheaper than sonnet
  5. best writing quality of any model
  6. way less sycophantic" - OpenAI employee
#

Roon failed us.

#

Trash writing still

verbal nimbus
#

The mini version has an even older knowledge cutoff date

void elm
#

barely any better

indigo flax
#

“Black and white vector-style silhouette of a confident bearded man wearing sunglasses, modern hairstyle

tired herald
#

They ought to make a good 1m parameter model

void elm
#

openai has to retire and give its compute to google

stray aspen
solid brook
keen beacon
#

Secret Gemini models have better writing!!!

tired herald
whole wagon
#

how the hell

tired herald
whole wagon
#

it's a mini model bruh, it should not take a year to train

devout vault
#

Gemini 3 and grok 5 will win 100%

stray aspen
#

gpt-5 in chatgpt is dumber than the one in the API

tired herald
#

Not even nano has 1m

quiet moss
#

is chatGPT 5 on the website yet?

whole wagon
keen beacon
rapid merlin
#

so the one in chatgpt is lobotomized?

#

💀

stray aspen
#

didn't someone from google say this was gonna be an exciting week?

barren prairie
tired herald
# keen beacon

Of course they will, they have to release a "competitor" to this new trashai model

sour spindle
#

What model has had the biggest positive reception here?

stray aspen
#

gemini

verbal nimbus
hoary elbow
#

Can I send a video here real quick?

devout vault
#

i was the first to say that LOL

tired herald
#

Gemini 2.5 Pro already has many things better than gpt 5

#

Gemini 3 Pro will be groundbreaking

stray aspen
#

gpt 5 is garbage

keen beacon
hoary elbow
#

Gemini 3 Pro might be better than Grok 4. I mean, it has a chance to be better

#

Since 2.5 pro is better than GPT-5

stray aspen
keen beacon
hoary elbow
#

I wonder how Gemini will be

analog bone
#

Where's GLM ? 🙁

pulsar rain
#

default gemini 2.5 pro praises every question no matter how stupid it is. It gets very annoying

hoary elbow
#

Wait, it’s winning

#

I can’t believe it

stray aspen
void elm
#

it's such a minor upgrade

keen beacon
#

Stop with the A = A arguments. it's a trash model sir.

tired herald
verbal nimbus
#

Gemini is definitely sycophantic, lol

keen beacon
#

What do you mean?

tired herald
#

Well, at least gpt 5 nano accepts pictures

sour spindle
#

Also does anyone have access to 5 right now lol

steady vale
#

gemini is the most sycophantic model

eternal niche
whole wagon
#

this guy on the livestream is just bsing

whole wagon
#

literally saying nothing

keen beacon
#

It literally is not! arc agi is only one of the benchmarks where it falters!

stray aspen
#

on lmarena

void elm
#

filler words

stray aspen
#

now the livestream is just pure yapping

whole wagon
#

selling

slow sail
#

He tried his best

whole wagon
#

we will get back to selling kek

verbal nimbus
whole wagon
keen beacon
#

Will they do the pokemon Bench with GPT-5?

whole wagon
#

it is the actual value only

#

I find it funny they copied gemini 2.5 pro pricing exactly lol

#

like $1.25/$10

hollow imp
#

Wait is gpt5 a reasoning model?

short adder
#

Could we get gemini 3 this month?

stray aspen
keen beacon
#

Roon did not fail us

verbal nimbus
void elm
#

did you even watch the stream?

balmy mist
#

why are some people hating on GPT-5?

void elm
#

because it's like upgrading claude 4.0 to 4.1

#

negligible upgrades

hollow imp
hollow imp
stray aspen
#

first with this garbage open source model

#

and now with gpt 5

hollow imp