drifting thorn Jun 4, 2025, 5:17 PM

#

Is that just opened to someone?

keen beacon Jun 4, 2025, 5:18 PM

#

it was removed

#

later 2.5 pro revision, or something else, it was good

elder rapids Jun 4, 2025, 5:18 PM

#

just for it to solve a basic physics problem, it wasn't hard

#

I meant for it, to see how its changed

small haven Jun 4, 2025, 5:20 PM

#

it will fix o3 pro issues

late path Jun 4, 2025, 5:21 PM

#

looks like today's openai livestream doesn't have o3pro?

keen beacon Jun 4, 2025, 5:21 PM

#

maybe tmrw?

small haven Jun 4, 2025, 5:21 PM

#

nah like seriously, deepthink >>

#

thursday

wicked root Jun 4, 2025, 5:21 PM

#

Do we have rumors on o3 release?

torn mantle Jun 4, 2025, 5:22 PM

#

lies

wicked root Jun 4, 2025, 5:22 PM

#

So gemini’s coming out tmrw

small haven Jun 4, 2025, 5:22 PM

#

o3 pro on thursday

wicked root Jun 4, 2025, 5:22 PM

#

Deepthink thursday

sturdy mica Jun 4, 2025, 5:22 PM

#

guys my pc just broke

wicked root Jun 4, 2025, 5:22 PM

#

O3 on thurs too?

torn mantle Jun 4, 2025, 5:22 PM

#

this is crazy

wicked root Jun 4, 2025, 5:22 PM

#

What time?

small haven Jun 4, 2025, 5:23 PM

#

i have it alrdy, its ass

wicked root Jun 4, 2025, 5:23 PM

#

Hmmmmm

wicked root Jun 4, 2025, 5:23 PM

#

small haven i have it alrdy, its ass

Very nice

#

How ass is it?

small haven Jun 4, 2025, 5:24 PM

#

if ur tight, id say hold onto it for google ultra

patent aspen Jun 4, 2025, 5:24 PM

#

When people use the term "structured thinking" what are they usually talking about? I only ask because a lot of external AI jargon doesn't map cleanly to more technical AI jargon

keen beacon Jun 4, 2025, 5:24 PM

#

idk it seems idiosyncratic to me too

small haven Jun 4, 2025, 5:24 PM

#

wicked root How ass is it?

its better than o3, but not that great of a jump like o1 to o1 pro

wicked root Jun 4, 2025, 5:24 PM

#

small haven its better than o3, but not that great of a jump like o1 to o1 pro

What about compared to gemini 2.5?

small haven Jun 4, 2025, 5:25 PM

#

o1 pro could spit out 2k lines no placeholder, no omissions, all in full, o3 is maxed out at 500 locs

torn mantle Jun 4, 2025, 5:25 PM

#

OMG

#

OMGGGGGGGGGGGGGGGGGG

keen beacon Jun 4, 2025, 5:25 PM

#

OMG

torn mantle Jun 4, 2025, 5:25 PM

#

nvm

#

bark

#

bard*

wicked root Jun 4, 2025, 5:26 PM

#

small haven o1 pro could spit out 2k lines no placeholder, no omissions, all in full, o3 is ...

Thisi s fantastic

small haven Jun 4, 2025, 5:26 PM

#

and o3 pro times out a lot

torn mantle Jun 4, 2025, 5:26 PM

#

saw this on reddit

#

kingfall

#

is it good or nah

small haven Jun 4, 2025, 5:26 PM

#

king falls tmmrw and king will fall further a week later

keen beacon Jun 4, 2025, 5:26 PM

#

its good

civic flame Jun 4, 2025, 5:27 PM

#

torn mantle saw this on reddit

woah

torn mantle Jun 4, 2025, 5:27 PM

#

google releasing products every week whereas xai :

civic flame Jun 4, 2025, 5:27 PM

#

okay gdm just drop kingfall already 😔

small haven Jun 4, 2025, 5:28 PM

#

torn mantle google releasing products every week whereas xai :

this is why

#

link

#

lol

#

no screenshot either

torn mantle Jun 4, 2025, 5:29 PM

#

small haven this is why

lol.. i mean he probably took a vacation after a long time, i dont doubt that they sleep in the office

small haven Jun 4, 2025, 5:29 PM

#

o3 pro cots seem to get it

torn mantle Jun 4, 2025, 5:29 PM

#

but we should seriously start an audit to see what are they doing exactly

small haven Jun 4, 2025, 5:30 PM

#

so far, its still thinking

torn mantle Jun 4, 2025, 5:30 PM

#

cheeeeeen

small haven Jun 4, 2025, 5:30 PM

#

ya

drifting thorn Jun 4, 2025, 5:31 PM

#

Gotta sleep

small haven Jun 4, 2025, 5:31 PM

#

o3 pro is ass i been telling

drifting thorn Jun 4, 2025, 5:31 PM

#

Gn everyone

civic flame Jun 4, 2025, 5:31 PM

#

small haven so far, its still thinking

lol wow

#

so close yet so far

#

☠️

small haven Jun 4, 2025, 5:33 PM

#

civic flame lol wow

is the answer public ? or did o3 pro rlly get it

keen beacon Jun 4, 2025, 5:33 PM

#

but o3 pro has tools

elder rapids Jun 4, 2025, 5:33 PM

#

I don't think I would take the extent of deepthink, so inevitably, yeah

small haven Jun 4, 2025, 5:33 PM

#

yea im in evan chen site, no 3031 answer

#

o3 pro timed out great

#

full cot

#

📎 message.txt

#

full cot ^

#

this is from another run

#

and still running

civic flame Jun 4, 2025, 5:36 PM

#

small haven is the answer public ? or did o3 pro rlly get it

it is public but it's under a reddit post with like a handful of upvotes and a few comments

small haven Jun 4, 2025, 5:36 PM

#

yea guessed so

civic flame Jun 4, 2025, 5:36 PM

#

small haven o3 pro timed out great

lmao

small haven Jun 4, 2025, 5:36 PM

#

they both timed out

keen beacon Jun 4, 2025, 5:36 PM

#

thanks i needed this

civic flame Jun 4, 2025, 5:36 PM

#

small haven they both timed out

if you're paying for o3 pro there should be no time out 😔

small haven Jun 4, 2025, 5:37 PM

#

civic flame if you're paying for o3 pro there should be no time out 😔

well it does :/

keen beacon Jun 4, 2025, 5:37 PM

#

nah

small haven Jun 4, 2025, 5:37 PM

#

it was nearing 15 mins, i guess thats the hard limit

patent bane Jun 4, 2025, 5:39 PM

#

what's the discord?

small haven Jun 4, 2025, 5:40 PM

#

deepthink is going to crack it without tools

keen beacon Jun 4, 2025, 5:40 PM

#

i wonder if theyre gonna switch out the deepthink model

#

if they make further revisions that close the gap as a standalone model w/o parallel requests

keen beacon Jun 4, 2025, 5:41 PM

#

keen beacon i wonder if theyre gonna switch out the deepthink model

assuming its a set 2.5 pro revision already

elder rapids Jun 4, 2025, 5:41 PM

#

man

#

I want that model

#

give me that model

#

😭😭

keen beacon Jun 4, 2025, 5:42 PM

#

deepthink? kingfall? goldmane? everything?

elder rapids Jun 4, 2025, 5:42 PM

#

everything

#

I want kingfall

civic flame Jun 4, 2025, 5:43 PM

#

real

misty vault Jun 4, 2025, 5:43 PM

#

elder rapids I want kingfall

it has to consent first

civic flame Jun 4, 2025, 5:43 PM

#

there's a decent chance kingfall drops on arena this week to be fair

#

i see goldmane releasing officially tomorrow, and kingfall appearing on the arena on the weekend

#

i arrived at the conclusion that the red teaming platform was just testing existing models this round

#

so nothing good

misty vault Jun 4, 2025, 5:45 PM

#

leo has access to asi

civic flame Jun 4, 2025, 5:46 PM

#

yup

civic flame Jun 4, 2025, 5:46 PM

#

misty vault leo has access to asi

lol i wish

elder rapids Jun 4, 2025, 5:46 PM

#

btw

#

kingfall and goldmane are INSANE writers

#

the best by a large margin out of all the models

civic flame Jun 4, 2025, 5:46 PM

#

what did kingfall write

#

goldmane was decent but not the best in my experience

elder rapids Jun 4, 2025, 5:47 PM

#

civic flame what did kingfall write

I'm not asking them for a prose here

#

or to "write"

#

I'm asking them to explain things

#

and that gives insight into how they actually write

#

without forcing them

#

no

#

lol

#

not what I said

civic flame Jun 4, 2025, 5:48 PM

#

elder rapids and that gives insight into how they actually write

it's a lot easier to gauge how good they are at writing in the usual sense if you're asking for prose

elder rapids Jun 4, 2025, 5:49 PM

#

that's not true at all

#

you don't ask for a prose or a prompt when you want to see how a good model is in writing

civic flame Jun 4, 2025, 5:49 PM

#

debatable

elder rapids Jun 4, 2025, 5:49 PM

#

if it has to create its own context

#

then you're not really allowing it to write, this is the same for every LLM

misty vault Jun 4, 2025, 5:50 PM

#

sydney fine tune of gpt-4 is best at writing

elder rapids Jun 4, 2025, 5:50 PM

#

deadass hate when you talk to me like I'm stupid

#

no it has nothing to do with style lmao

#

I'm not asking it a question and paying attention to its wording

#

that's redundant and nonsensical

keen beacon Jun 4, 2025, 5:51 PM

#

🤣 🤣

elder rapids Jun 4, 2025, 5:52 PM

#

why y'all deleting so much

keen beacon Jun 4, 2025, 5:52 PM

#

why not

misty vault Jun 4, 2025, 5:52 PM

#

he is shy and insecure

#

nah bro sydney is opposite of shy

#

.

elder rapids Jun 4, 2025, 5:53 PM

#

I just said you shouldn't allow it to create its own context, certain models do do this well like r1 and o3, but models like grok are hard capped, same with claude

#

you know this, too, with 3.6 sonnet

#

it was an excellent writer

#

but it was a different type

civic flame Jun 4, 2025, 5:54 PM

#

@keen beacon i agree the OG claude models were really really good at writing

#

they felt very unhinged but in a good way

#

i remember when claude was only available for public use using their slack bot

#

it was great

#

let me see if the messages are still there

elder rapids Jun 4, 2025, 5:55 PM

#

civic flame <@456226577798135808> i agree the OG claude models were really really good at wr...

same esp for the time

civic flame Jun 4, 2025, 5:56 PM

#

pfft

small haven Jun 4, 2025, 6:00 PM

#

anybody have gemini ultra here

narrow elbow Jun 4, 2025, 6:01 PM

#

Google pulls a Kingfall before OpenAI's stream 🤪

small haven Jun 4, 2025, 6:01 PM

#

hahah they thought it was o3 pro, plugged it out when it wasn't

quiet folio Jun 4, 2025, 6:02 PM

#

small haven hahah they thought it was o3 pro, plugged it out when it wasn't

LMAO

#

Real

small haven Jun 4, 2025, 6:13 PM

#

augment code is matching google ultra in price 😭

loud tinsel Jun 4, 2025, 6:14 PM

#

<@&1349916362595635286> Please, add "Claude 4 Sonnet" or** "Claude 4 Opus"** in the last WebSite !

Why did you almost completely abandon the old website, when it is better done and uses gradio (which is visually attractive) ?

civic flame Jun 4, 2025, 6:16 PM

#

i'm not too sure about those last 4 words..

misty vault Jun 4, 2025, 6:19 PM

#

fr

#

like
we're not talking about sydney

civic flame Jun 4, 2025, 6:31 PM

#

this is all gone now 😔

small haven Jun 4, 2025, 6:38 PM

#

@deep adder run this bunx ccusage

#

faster npx alternative

#

?

#

it checks ur claude code metrics lol

#

i just seen that

#

i went through 2 billions tokens insanity

#

how come

keen beacon Jun 4, 2025, 6:45 PM

#

oh reddit is suing anthropic

#

they have a deal with openai and google

echo aurora Jun 4, 2025, 6:50 PM

#

loud tinsel <@&1349916362595635286> Please, add **"Claude 4 Sonnet"** or** "Claude 4 Opus"**...

I am sorry to hear you're not a fan of the of the new site. It is a big change, it's totally fair if you prefer the old website, but we’d love to hear your feedback on the new one if you’re open to sharing. When we changed from the legacy site to the current site we did mention that moving forward all feature updates and improvements will happen on the new LMArena site. We have been seeing a lot of positive signal that the new site is more appealing and have made the decision that going forward this is where our team is going to be focusing on.

small haven Jun 4, 2025, 7:00 PM

#

echo aurora I am sorry to hear you're not a fan of the of the new site. It is a big change, ...

any bug bounty?

echo aurora Jun 4, 2025, 7:04 PM

#

small haven any bug bounty?

Not atm but for sure a good idea I'll pass along. We do have report bug form on the site but in terms of a bounty program that's not something we have currently.

#

Did you find something 👀

misty vault Jun 4, 2025, 7:08 PM

#

yes, remember when models went unavailable while ago? he did that 😔

echo aurora Jun 4, 2025, 7:09 PM

#

blobglarepolice

cedar tide Jun 4, 2025, 7:11 PM

#

@echo aurora will the webdev arena be integrated into the main site?

loud tinsel Jun 4, 2025, 7:11 PM

#

echo aurora I am sorry to hear you're not a fan of the of the new site. It is a big change, ...

<@&1349916362595635286> I understand your team's decision. However, I believe both websites are complementary, as each has its strengths and weaknesses.

New website:
It features a more modern, visually pleasant, and intuitive interface. It's prettier and easier to use, but also noticeably less complete.

Old website:
While its interface is a bit more "raw" or even cluttered, it's much more complete and packed with features. It might feel more like a "dev tool" than a "power user interface", but its depth is genuinely valuable.

Personally, I especially appreciate being able to tweak settings like temperature or the maximum number of output tokens (up to 4096), and how easily accessible every button or option becomes once you're used to the layout.

Honestly, what I prefer is that "familiar" and "feature-rich" experience. For example, when using tools like Ollama, I tend to choose the command line over a graphical interface.

So I think both websites are great—but just not for the same type of user. Then again, maybe I’m a bit of an edge case 😅

In the finally, I prefer the last website 😁

echo aurora Jun 4, 2025, 7:12 PM

#

cedar tide <@283397944160550928> will the webdev arena be integrated into the main site?

generally I won't be able to share ETAs for new features/updates/etc, but moving webdev over the arena is on our radar

small haven Jun 4, 2025, 7:14 PM

#

echo aurora Did you find something 👀

😏

#

im jk nothing

#

too lazy, but if money is involved, i would

cedar tide Jun 4, 2025, 7:15 PM

#

@echo aurora Can we have a page with a continuously updated list of models currently on the leaderboard that are not yet in the arena? (mystery models included)

echo aurora Jun 4, 2025, 7:16 PM

#

small haven too lazy, but if money is involved, i would

a bounty program is a rly good idea blobthanks

sonic tendon Jun 4, 2025, 7:18 PM

#

on that note: maybe redarena being integrated into the main site could be cool

cedar tide Jun 4, 2025, 7:33 PM

#

cedar tide <@283397944160550928> Can we have a page with a continuously updated list of mod...

??

echo aurora Jun 4, 2025, 7:36 PM

#

cedar tide ??

sry for delay, I'm going to spin up a thread in #1372230675914031105 when out of this meeting

elder rapids Jun 4, 2025, 7:41 PM

#

small haven hahah they thought it was o3 pro, plugged it out when it wasn't

pretty sure this wouldn't be the case, since the stream was already known to be something else

keen fulcrum Jun 4, 2025, 7:54 PM

#

https://www.rxddit.com/r/singularity/comments/1l32s24/sam_altman_says_the_perfect_ai_is_a_very_tiny/

rxddit.com

Sam Altman says the perfect AI is “a very tiny model with superhuman reasoning, 1 trillion tokens of context, and access to every tool you can imagine.”

u/Nunki08 on r/singularity

Source: Maginative on Youtube: Sam Altman Talks AGI Timeline & Next-Gen AI Capabilities | Snowflake Summit 2025 Fireside Chat: https://www.youtube.com/watch?v=qhnJDDX2hhU
Video by vitrupo on 𝕏: https://x.com/vitrupo/status/1930009915650912586

▶ Play video

small haven Jun 4, 2025, 7:56 PM

#

the perfect ai is a 1 bit-sized model with superhuman reasoning, 100 trillions tokens of context

unborn ocean Jun 4, 2025, 8:12 PM

#

small haven the perfect ai is a 1 bit-sized model with superhuman reasoning, 100 trillions t...

and most importantly its knowledge coming from tool workflows

#

it is kind of a weird ideal in many ways and i have some problems with him assuming that this would be the ideal (economically speaking), but it is fiction anyways

#

so who cares

#

on a completely different note: @earnest parcel godaddy says your domain is worth 434 USD 🤑 (dubesor.de)

tall summit Jun 4, 2025, 9:12 PM

#

keen fulcrum https://www.rxddit.com/r/singularity/comments/1l32s24/sam_altman_says_the_perfec...

nobody needed him to say this

wicked root Jun 4, 2025, 9:34 PM

#

guys I broke Gemini

small haven Jun 4, 2025, 9:39 PM

#

tall summit nobody needed him to say this

exactly lol

#

u know the convo been boring when he pulls this shxt

tall summit Jun 4, 2025, 9:42 PM

#

small haven the perfect ai is a 1 bit-sized model with superhuman reasoning, 100 trillions t...

so trueeeee
i want ultrahuman reasoning though 🙏

small haven Jun 4, 2025, 9:45 PM

#

its gonna happen but whatever hes talking is at least a decade away lol

misty vault Jun 4, 2025, 10:01 PM

#

agi

boreal saddle Jun 4, 2025, 10:06 PM

#

If AGI/ASI is actually developed in this decade like AI 2027 predicts, I will eat my hat.

#

Fair enough.

#

Though the AI 2027 scenario will either be the biggest comedy show of our time, or the most shocking prophetic call ever.

small haven Jun 4, 2025, 10:15 PM

#

is goldmane still good for thursday?

misty vault Jun 4, 2025, 10:19 PM

#

boreal saddle If AGI/ASI is actually developed in this decade like AI 2027 predicts, I will ea...

i will make you throw up the hat you ate and will eat it as well with all your stomache and inside fluids still on it

small haven Jun 4, 2025, 10:22 PM

#

boreal saddle If AGI/ASI is actually developed in this decade like AI 2027 predicts, I will ea...

deepthink is technically agi, u can eat it now

boreal saddle Jun 4, 2025, 10:39 PM

#

misty vault i will make you throw up the hat you ate and will eat it as well with all your s...

Bro....

cedar tide Jun 4, 2025, 10:53 PM

#

Google didn't lie, the latest 2.5 flash is much more efficient than the old one (being more performant)

Screenshot_2025-06-05-00-50-09-229_com.android.chrome-edit.jpg

#

I did new model request
https://discord.com/channels/1340554757349179412/1379958143072862208

echo aurora Jun 4, 2025, 11:12 PM

#

cedar tide I did new model request https://discord.com/channels/1340554757349179412/137995...

blob_salute seeing that now, adding to the request list

leaden palm Jun 4, 2025, 11:24 PM

#

[moved to ai-news]

small haven Jun 4, 2025, 11:32 PM

#

huh where is this guy tweet??

#

@patent aspen i thought it was tmmrw

#

oh nvm he tweeted 4 hrs later from this time, last time

sweet frost Jun 4, 2025, 11:54 PM

#

is stylectrl dead in the new arena?

keen beacon Jun 5, 2025, 12:03 AM

#

sweet frost is stylectrl dead in the new arena?

its the default

sweet frost Jun 5, 2025, 12:11 AM

#

as it should be thanks!

small haven Jun 5, 2025, 12:15 AM

#

late june

#

as per brian

#

its going thru a safety testing phase rn

#

only trusted users

#

yea its heavily been nerfed compared to the december version

#

wym? u think its gonna be a flop?

#

nah even a month before, its been flaking, was just using o3 since

#

idk i think theyre just following protocol

#

u love finance do u

small haven Jun 5, 2025, 12:52 AM

#

cool hopefully

#

ive tried months ago, its meh unless u do math heavy things

#

deeperthink is rlly bad

#

o3 patches that tho, let alone o3 pro

#

grok 3 is archaic

elder rapids Jun 5, 2025, 1:05 AM

#

when is Logan tweeting

#

istg

small haven Jun 5, 2025, 1:07 AM

#

hahah im also waiting

#

usually tweets at 8pm pst

elder rapids Jun 5, 2025, 1:10 AM

#

alr so in 2 hours

#

or an hour 50 minutes

small haven Jun 5, 2025, 1:16 AM

#

a little over that, well for the last tweet

#

pre gemini io

#

goated

#

but where is logan's tweet?

small haven Jun 5, 2025, 2:42 AM

#

?

#

i dont see it

#

ok bud

leaden palm Jun 5, 2025, 2:43 AM

#

well google doesn't have it

#

fair enough

#

still

#

burden of proof is on you

small haven Jun 5, 2025, 2:44 AM

#

logan is knocked out at a gay bar in san francisco

#

all san fran are gay

leaden palm Jun 5, 2025, 2:46 AM

#

if you dont want to be in the same place where major ai developments are taking place

haughty tangle Jun 5, 2025, 3:10 AM

#

There’s too many stealth Google models

#

There’s like 20, I can’t keep count of them. Every 4 days I see a new one.

#

Kingfall, Dragontail, Nightwhisper, Dreamtides, Moonhowler, Stargazer, Shadowbrook, Riverhollow, Lunarcall, Moonfall

#

I’m pretty sure that’s all of them

jade egret Jun 5, 2025, 3:33 AM

#

?

#

whats releasing tomorow

elder rapids Jun 5, 2025, 3:33 AM

#

nothing

elder rapids Jun 5, 2025, 3:34 AM

#

small haven a little over that, well for the last tweet

you lied

#

I'll never believe you again

#

it's over Google is never releasing

jade egret Jun 5, 2025, 3:38 AM

#

huh

#

fr?

#

YO

#

new gemini model?

#

?

#

plz answer : (

echo aurora Jun 5, 2025, 3:40 AM

#

🤞

jade egret Jun 5, 2025, 3:41 AM

#

wait a min

elder rapids Jun 5, 2025, 3:41 AM

#

ts not happening

jade egret Jun 5, 2025, 3:41 AM

#

i though gemini 2.5 pro already released

elder rapids Jun 5, 2025, 3:42 AM

#

jade egret i though gemini 2.5 pro already released

iterations

jade egret Jun 5, 2025, 3:42 AM

#

or thats the preview

elder rapids Jun 5, 2025, 3:42 AM

#

there are different previews, they change like every month

jade egret Jun 5, 2025, 3:42 AM

#

o

#

so

#

newer preview

#

it gonna beb etter right

elder rapids Jun 5, 2025, 3:42 AM

#

ye, but GA soon

jade egret Jun 5, 2025, 3:42 AM

#

what is ga

elder rapids Jun 5, 2025, 3:42 AM

#

GA = general availability

jade egret Jun 5, 2025, 3:43 AM

#

oh

#

w

#

is

#

gemini 2.5 ultra ever gonan release ; (

elder rapids Jun 5, 2025, 3:49 AM

#

no

hollow ocean Jun 5, 2025, 3:56 AM

#

O3 killer tmr?

elder rapids Jun 5, 2025, 3:58 AM

#

it's not going to be an o3 "killer", o3 has higher compute variants

#

it's just going to be even better than before

#

with that out of the way

#

yeah o3 killer, if tomorrow

wicked root Jun 5, 2025, 4:01 AM

#

What about gpt5?

elder rapids Jun 5, 2025, 4:07 AM

#

hope it's even real

wicked root Jun 5, 2025, 4:09 AM

#

Polymarket has the probability at 90% for a 2025 release

wintry tinsel Jun 5, 2025, 4:14 AM

#

What are the rumors for king fall

#

Better than opus or nah

wintry tinsel Jun 5, 2025, 4:16 AM

#

jade egret gemini 2.5 ultra ever gonan release ; (

The only company to release an ultra variant is Anthropic with opus everyone else is greedy

haughty tangle Jun 5, 2025, 4:34 AM

#

wintry tinsel Better than opus or nah

This is what it gave when asked for SVG of a robot https://x.com/testingcatalog/status/1930298521078399226?s=61

TestingCatalog News 🗞 (@testingcatalog)

Kingfall is killing it at the "SVG robot benchmark"

WOW 🤯

#

I don’t know how that compares to Claude 4 Opus

#

Since I haven’t tested it with the same prompt

#

It’s probably just an iteration of 2.5 Pro though

#

I like everything but the robot

#

The background and text is nice, but I personally think Kingfall is better

drifting thorn Jun 5, 2025, 4:58 AM

#

Where is Kingfall?

#

arghhhh

civic flame Jun 5, 2025, 5:44 AM

#

https://x.com/OfficialLoganK/status/1930500218602369344

Logan Kilpatrick (@OfficialLoganK)

Gemini

#

THEEERE IT IS

#

WE'RE SO BACK BROS

dusky aurora Jun 5, 2025, 6:23 AM

#

I witnessed firsthand as Gemini changed versions (in direct chat) back in May

small haven Jun 5, 2025, 6:40 AM

#

civic flame https://x.com/OfficialLoganK/status/1930500218602369344

boom

#

SAM FALLS TODAY

late path Jun 5, 2025, 6:52 AM

#

goldmane🏆🏆🏆

small haven Jun 5, 2025, 6:55 AM

#

kingfall definitely won

small haven Jun 5, 2025, 6:56 AM

#

elder rapids you lied

good moning

#

trust in brian

#

brian is our google insider

#

o1 pro? its not that good, o3 beats it by a mile

#

kingfall is way above o3

#

i mean even o3 in price/performance

#

i doubt kingfall uses 10x

#

its more like $1.25/1m in $10/1m out

#

deepthink will be 10x

#

have u tried kingfall? do u not remember, its cot wasn't that big and actually lesser than 0506

#

prolly i can believe that

#

u right, thats an interesting pov, parallel cot

#

oh yea deepthink i believe will demolish o3 pro

#

i have o3 pro rn on stealth, it aint that great as it thought it be

#

its better than o3, but not similar to the jump from o1 to o1 pro

#

all of this competition is so good, bc now we have o4 and o5-mini-high to proc an early release, very very excited for summer

#

hmm, currently if ur basing codex as their winning agentic product, i believe ur wrong, ive tried it and currently claude code >> is the meta

#

operator isn't even that good, even with the o3 upgrade as the base model

#

hmm, have u tried jules?

#

its more like claude > gemini > openai, in my view

#

lately or when it got released

#

it improved a lot

#

codex is garbage too 😭

#

they both garbage

#

but i belive oai is more garbage than gemini

#

😂

#

im using codex-1 on their ui, its rlly bad

#

idk why ppl are hyping itup

#

by perpertuity, how is codex mini > codex-1?

#

no its not the ui, its beautiful, but the output is very bad

#

trust me

#

hmm later today i will, but i think gemini is going to take my time lol

#

its unfortunately google that will win

#

i am surprised

#

but any competition is good

#

dario said benchmarks dont matter anymore, so its gg 😭

late path Jun 5, 2025, 7:22 AM

#

and $15/$75 pricing💀

small haven Jun 5, 2025, 7:23 AM

#

gemini at 86% and claude is at 72%, im not using claude 4 for a while

#

yes, but this kind of a jump is too big, im going to be using the good ole' copy and paste until they integrate their pro/ultra sub for virtually unlimited queries like claude max

#

i believe its going to be unlimited on at least gemini ultra via their ui

#

o3 pro is roughly at $100-$150 in/out and they can afford the "unlimited" inference via their ui

#

so why cant google

#

veo 3 is just a whole different arch

#

and its just a gimmick rn, i dont rlly care

#

we will see,

#

gonna get ultra

#

if deepthink is announced

#

but later june im guessing

#

nah gemini is

#

oai never cut under competition

#

ever

#

r1 > o1 mini, what are u saying 😭

#

talking out of ur ass srry 😭

#

oh yea i get what u mean, but they still never cut under competition, but cut after competition cuts

#

never before it

#

they always held a premium

#

they greedy

elder rapids Jun 5, 2025, 8:03 AM

#

civic flame https://x.com/OfficialLoganK/status/1930500218602369344

🙏🙏🙏

#

I used to pray for times like these

small haven Jun 5, 2025, 8:04 AM

#

so i guess we can trust brian

#

my question is why is he doing it 😭

wispy leaf Jun 5, 2025, 8:05 AM

#

New sota model today?

late path Jun 5, 2025, 8:07 AM

#

goldmane yes

keen beacon Jun 5, 2025, 8:07 AM

#

https://www.youtube.com/watch?v=x4wm5Y9E_9g

YouTube

Wes Roth

Google's Secret "KINGFALL" Model Leaked... plus other AI News

The latest AI News. Learn about LLMs, Gen AI and get ready for the rollout of AGI. Wes Roth covers the latest happenings in the world of OpenAI, Google, Anthropic, NVIDIA and Open Source AI.

My Links 🔗
➡️ Subscribe: https://www.youtube.com/@WesRoth?sub_confirmation=1
➡️ Twitter: https://x.com/WesRothMoney
➡️ AI Newsletter: https:...

▶ Play video

#

small haven Jun 5, 2025, 8:08 AM

#

i think the alpha here is reading all of brian discord messages

#

😭

keen beacon Jun 5, 2025, 8:09 AM

#

chatgpt is so ass

#

google is sooooooooo much better

#

but their ai helps kill kids

#

but other then that its great

small haven Jun 5, 2025, 8:09 AM

#

keen beacon chatgpt is so ass

yes

keen beacon Jun 5, 2025, 8:10 AM

#

small haven yes

chatgpt operator ass too

small haven Jun 5, 2025, 8:10 AM

#

keen beacon chatgpt operator ass too

agreed

keen beacon Jun 5, 2025, 8:11 AM

#

https://www.youtube.com/watch?v=sxTNACldK3Y&t=185s

YouTube

NetworkChuck

ChatGPT Operator is expensive....use this instead (FREE + Open Source)

🌐 Build your next project on Hostinger with an INSANELY fast VPS: Get 10% off with code NETWORKCHUCK: http://hostinger.com/networkchuckvps

☕ Because everything in I.T. requires coffee: https://ntck.co/coffee

We’re in the future. AI can now run your web browser, complete tasks, and automate entire workflows– basically, acting like y...

▶ Play video

#

i was using this to do my homework

#

its pretty decent

#

https://browser-use.com/
https://github.com/browser-use/web-ui

Browser Use - The AI browser agent

Repetitive work is dead. Browser Use empowers anyone to automate repetitive online tasks, no code required.

GitHub

GitHub - browser-use/web-ui: 🖥️ Run AI Agent in your browser.

🖥️ Run AI Agent in your browser. Contribute to browser-use/web-ui development by creating an account on GitHub.

#

go to the end of the video you'll see the chatgpt vs browser use comparison

there really is no difference especially if you have a decent cpu/gpu

elder rapids Jun 5, 2025, 8:21 AM

#

small haven so i guess we can trust brian

sure but I still don't believe hed have so much more access after that tbh, he says he has "connections" but then it's like he's "directly" there, combined with the fact it's already a reasoned inference via Logan saying it'd come early June

ornate agate Jun 5, 2025, 8:22 AM

#

Who’s Brian?

elder rapids Jun 5, 2025, 8:22 AM

#

hence the conflicting information that are equally valid (Logan also saying, "in weeks they'll come")

#

and he's not pushing for any of them

#

but ey

#

glad it's Thursday

torn mantle Jun 5, 2025, 8:55 AM

#

ornate agate Who’s Brian?

billy

small haven Jun 5, 2025, 9:15 AM

#

elder rapids sure but I still don't believe hed have so much more access after that tbh, he s...

yea maybe, but he got goldmane spot on

#

i was skeptical at first, but damn google insiders are built diff

drifting thorn Jun 5, 2025, 9:21 AM

#

2.5 Flash feels so dumb to me

#

kingfall

hardy pecan Jun 5, 2025, 9:26 AM

#

o3-pro and new gemini model tomorrow? that was would a lovely surprise

drifting thorn Jun 5, 2025, 9:28 AM

#

I need King to fall rn with free API usage

late path Jun 5, 2025, 9:28 AM

#

I'm curious what else OpenAI has up its sleeve this time to steal Gemini's thunder

#

Last time for 0325 they released GPT-4o imagen and Ghibli on the same day

#

If they only release o3pro today it probably won't be enough

drifting thorn Jun 5, 2025, 9:30 AM

#

Maybe o3 image gen?

#

Flux Kontext is threatening GPT image 1

#

And Flux Kontext is much faster than GPT's and it doesn't have that yellow filter that GPT always have.

calm sequoia Jun 5, 2025, 9:34 AM

#

This is going to be big in the future, whatever Wolfram does is always excellent: https://www.wolfram.com/llm-benchmarking-project/

Wolfram LLM Benchmarking Project

Results from Wolfram's ongoing tracking of LLM performance. The benchmark is based on a Wolfram Language code generation task.

#

No o3 though

#

👀

#

Qwen3 30B is just off the charts for it's size

tall summit Jun 5, 2025, 9:42 AM

#

calm sequoia This is going to be big in the future, whatever Wolfram does is always excellent...

if only wolfram did more in the first place

drifting thorn Jun 5, 2025, 9:45 AM

#

fxxk

#

the knowledge base in Poe is shxtty

#

I use 2.5 Pro as the base model, and then I asked him about a novel character(which is in the novel) and then it just pops out nonsense

patent bane Jun 5, 2025, 9:47 AM

#

drifting thorn I use 2.5 Pro as the base model, and then I asked him about a novel character(wh...

ok?

ornate stump Jun 5, 2025, 10:00 AM

#

why flash 2.5 in gemini app feels stupid today

calm sequoia Jun 5, 2025, 10:20 AM

#

That's a good observation. But I'm still surprised how good the 4.5 is and how bad is the o4-mini.

#

Hmm I wonder if it means that the train time compute > inference time compute in the end.

#

This says otherwise

#

That's true. The distribution is niche

#

Ahh I'd like to see the 4.5 with PRO level thinking

#

Yeah but still would be interesting experiment to see bench results

#

Indeed

dusky aurora Jun 5, 2025, 11:36 AM

#

it seems as if LMArena's sampling has become more restrictive

#

less creativity

#

Gemini is much less interesting then

ocean vortex Jun 5, 2025, 11:51 AM

#

calm sequoia That's a good observation. But I'm still surprised how good the 4.5 is and how b...

there are certain things any mini model will either never beat 4.5 on, or will take a very long time to make it possible. SimpleQA, spatial awareness or spatial reasoning.

#

also context awareness ("vibe test") - the smaller the model is, typically the more literally it will take your last message or sentence having less capacity to consider everything else or "read between the lines"

#

SimpleQA 62.5% (gpt4.5) vs 19.3% (o4-mini-high)

boreal saddle Jun 5, 2025, 11:59 AM

#

#

You gotta be kidding me.

#

Why does the new website break down all the time?

#

Even discussing perfectly legitimate subjects gets the "Something went wrong" error.

ocean vortex Jun 5, 2025, 12:00 PM

#

boreal saddle Even discussing perfectly legitimate subjects gets the "Something went wrong" er...

could be Anthropic tbh. They added some really shady flagging with Opus 4

#

safetycell technology 🤷‍♂️

boreal saddle Jun 5, 2025, 12:01 PM

#

ocean vortex could be Anthropic tbh. They added some really shady flagging with Opus 4

Weird.

#

I never got such errors on the old website.

#

Refreshed the website, and finally, Claude is speaking up.

dusky aurora Jun 5, 2025, 12:26 PM

#

boreal saddle

opus has periods of unavailability, you simply have to wait some time and reroll regularly. it's not relat d to the prompt

calm sequoia Jun 5, 2025, 12:32 PM

#

Today I've used the google collab and android studio. They both have "gemini" chat to interact to, however, without model selection or identification. This could explain why they need so many different anonymous models on arena: each has a different use case and may be finetuned for this.

#

Or they simply but cheapest gemini flash 😦

boreal saddle Jun 5, 2025, 12:41 PM

#

dusky aurora opus has periods of unavailability, you simply have to wait some time and reroll...

So I didnt actually trigger a censor?

cursive pagoda Jun 5, 2025, 12:53 PM

#

hey , you guys have some server where people actively post their AI projects and discuss?

raven void Jun 5, 2025, 12:54 PM

#

I want to join such a server

cursive pagoda Jun 5, 2025, 12:55 PM

#

calm sequoia This says otherwise

claude 4 ops not there?
also , llama nemotron is ahead of 3.7 thinking?

#

crazy

echo aurora Jun 5, 2025, 1:19 PM

#

Reminder we have our Staff AMA tomorrow! https://discord.gg/XkfsbYWX?event=1375223423009165435

We also have a contest running right now and would love to see your submissions! #announcements message

calm sequoia Jun 5, 2025, 1:33 PM

#

Hey mister "there have never been any nerfs" 🤓 @ocean vortex

late path Jun 5, 2025, 1:40 PM

#

goldmane will crush it yay

ocean vortex Jun 5, 2025, 1:47 PM

#

calm sequoia Hey mister "there have never been any nerfs" 🤓 <@514836230802898954>

is this that excel spreadsheet where they counted missing scores (not tested or they failed to find them) as 0% scored?

#

lmao

calm sequoia Jun 5, 2025, 1:48 PM

#

Vibes says otherwise

ocean vortex Jun 5, 2025, 1:49 PM

#

?

#

this graph is wrong on some many levels. Not any better than some random thing someone drew

#

Like... don't mind gpt4.1 being above 2.5pro

#

LOL

dusky aurora Jun 5, 2025, 1:50 PM

#

boreal saddle So I didnt actually trigger a censor?

the censor would be the the box with "retry" and "clear". your screenshot was the censor admitting the prompt and it failing at the generation stage

calm sequoia Jun 5, 2025, 1:51 PM

#

ocean vortex Jun 5, 2025, 1:51 PM

#

calm sequoia

yeah this is accurate. Exactly the same +/- miniscule amount

calm sequoia Jun 5, 2025, 1:51 PM

#

#

#

drifting thorn Jun 5, 2025, 1:53 PM

#

calm sequoia

This shows the real problem

calm sequoia Jun 5, 2025, 1:54 PM

#

drifting thorn Jun 5, 2025, 1:54 PM

#

oh btw claude 4 opus is not in the chart of artificial analysis

ocean vortex Jun 5, 2025, 1:58 PM

#

calm sequoia

yeah but this is completely different to the official leaderboard... Wonder if artificialanalysis messed something up with their testing. Wouldn't be the first time

#

https://scale.com/leaderboard/humanitys_last_exam

Scale | SEAL Leaderboard: Humanity's Last Exam

Challenging LLMs at the frontier of human knowledge

alpine coral Jun 5, 2025, 1:58 PM

#

calm sequoia This is going to be big in the future, whatever Wolfram does is always excellent...

i dunno.. what it's actually benchmarking is very narrow..

The task consists of going from English-language specifications to Wolfram Language code. The test cases are exercises from Stephen Wolfram's An Elementary Introduction to the Wolfram Language. These exercises have been done online by millions of humans, and we've developed effective tools for determining functional correctness of code, which we're now applying to LLMs.

#

Perhaps translating natural language into wolfram has more generalised value.. and full respect to Wolfram too.. im just not sure it is or will be that useful / meaningful

ocean vortex Jun 5, 2025, 1:59 PM

#

if you look there https://artificialanalysis.ai/models/gemini-2-5-pro there's not a single thing where the new one scored higher for them, that can't be right

Gemini 2.5 Pro - Intelligence, Performance & Price Analysis | Artif...

Analysis of Google's Gemini 2.5 Pro Preview (Mar' 25) and comparison to other AI models across key metrics including quality, price, performance (tokens per second & time to first token), context window & more.

#

we already saw it higher on AIDER, webdev arena etc

calm sequoia Jun 5, 2025, 1:59 PM

#

Why not, if I remmember they just showed the ELO based benchmarks during presentation, which are innacurate

#

Anyway, not important now as the GA is underway

alpine coral Jun 5, 2025, 2:01 PM

#

i feel like if there's nerfing.. it happens from pre-release (e.g. nebula, goldmane etc ect) to preview/exp - then GA is just another layer of safety and corporate alignment nerfing

ocean vortex Jun 5, 2025, 2:01 PM

#

calm sequoia Why not, if I remmember they just showed the ELO based benchmarks during present...

no, they showed substantial gains in coding, including livecodebench, artificialanalysis complete opposite:

calm sequoia Jun 5, 2025, 2:01 PM

#

But how they showed it? I don't remember numerical data

ocean vortex Jun 5, 2025, 2:01 PM

#

I think they messed smth up with their test suite lol

calm sequoia Jun 5, 2025, 2:02 PM

#

alpine coral i feel like if there's nerfing.. it happens from pre-release (e.g. nebula, goldm...

That's true indeed

#

Hmm

#

ocean vortex Jun 5, 2025, 2:02 PM

#

calm sequoia But how they showed it? I don't remember numerical data

they literally showed it for both versions

#

calm sequoia Jun 5, 2025, 2:03 PM

#

So you justify "no-nerf" statement based solely on livebench?

ocean vortex Jun 5, 2025, 2:05 PM

#

calm sequoia So you justify "no-nerf" statement based solely on livebench?

https://aider.chat/docs/leaderboards/

aider

Aider LLM Leaderboards

Quantitative benchmarks of LLM code editing skill.

#

there are more, basically most coding benchmarks show improvement

#

https://web.lmarena.ai/leaderboard

boreal saddle Jun 5, 2025, 2:30 PM

#

This is quickly becoming the new "As a AI language model....", for real.

#

What if you wanted to talk to a LLM, but God said: "There was an error."

alpine coral Jun 5, 2025, 2:31 PM

#

yeah and if the version they release (whether preview or GA) is slightly less performant than goldmane, that would be consistent with what i feel like ive seen in the past

#

they rarely seem stronger compared to pre-release imo anyway

late path Jun 5, 2025, 2:34 PM

#

like, +30 elo

alpine coral Jun 5, 2025, 2:34 PM

#

yeah both goldmane and redsword are solid af

#

well.. redsword is no more.. but it was (and i feel they are more the less the same checkpoints)

#

nah i think it's way more fundamental than that

#

they're making good models

#

yeah latest iterations have been incremental

#

but nebula / 2.5 pro was a massive performance step up

#

i'm not sure goldmane is of the same extent - maybe - but it feels quite signficicant imo / fwiw

late path Jun 5, 2025, 2:38 PM

#

goldmane will be the next Nebula moment. It's much better than 0506

alpine coral Jun 5, 2025, 2:39 PM

#

late path goldmane will be the next Nebula moment. It's much better than 0506

yeah i feel 0506 didn't get nerfed so much as was f-ted for tool usage - and other areas suffered as a result

late path Jun 5, 2025, 2:40 PM

#

alpine coral yeah i feel 0506 didn't get nerfed so much as was f-ted for tool usage - and oth...

My feeling is that they strengthened instruction following, but this somehow reduced the model's judgment and autonomy (compared to 0325)

alpine coral Jun 5, 2025, 2:41 PM

#

in terms of actual usage - o3 on chatgpt is my go-to for anything involving complexity and / or web search.. it's really something

#

but if i needed to use an API, gem pro 2.5 would be so much smoother

#

and it's such high quialuity

#

but yeah not o3 chatgpt

#

i mean i just use 4o for most things tbh lol

#

fair enough - i can appreciate that 👍

#

it is a strong model

#

but yeah 4o is just useful for quick stuff - like translating / transcribing, some quick question

#

like i don't always need the 'best' model - actually, aside from research, i rarely use thinking models

#

i do but that's just me ha

#

it's what oai says it is

#

yeah i dont see that as controversial tbh

#

but i guess others find it so

#

and like 4.5 is still 'preview' (i bet whatever is the OG version that wasn't tuned for safety and public release is a beast)

#

but it's being depcrated soon.. ig we'll never see a non-preview version..

tall summit Jun 5, 2025, 2:50 PM

#

alpine coral i mean i just use 4o for most things tbh lol

gemini 2.5 flash 😍 or 4.1

#

?

#

ok

#

im talkin about most things.

late path Jun 5, 2025, 2:55 PM

#

It's clear that OpenAI has put a lot of effort into optimizing 4o's chat experience and multi-turn conversations. There's a reason it's now ranked 1st in multi-turn on the arena

alpine coral Jun 5, 2025, 3:05 PM

#

yeah i agree, though the latest 2.5 flash with thinking is like notably strong imo

#

but still.. different use cases - like thinking gets in the way.. if i'm gonna use it i may as well go all out with o3 or whatever ha

balmy mist Jun 5, 2025, 3:21 PM

#

https://x.com/legit_api/status/1930645775731605604

ʟᴇɢɪᴛ (@legit_api)

Gemini 2.5 Pro Preview 06-05

late path Jun 5, 2025, 3:22 PM

#

waiting for leaderboard updates

cedar tide Jun 5, 2025, 3:24 PM

#

Nope

balmy mist Jun 5, 2025, 3:27 PM

#

what is GA?

late path Jun 5, 2025, 3:28 PM

#

is openai still having another livestream?

fleet lintel Jun 5, 2025, 3:31 PM

#

First they will put the model in preview mode (different from experiment mode) and if they find no bugs then only it will go to GA

#

Looks like latest model in GA target and will release as a preview model for first few weeks

brittle tiger Jun 5, 2025, 3:32 PM

#

https://x.com/ai_for_success/status/1930647901828792830?t=5c9tW35EdxazK54nms6oBw&s=19

#

I got a arc-agi problem through and it was right. O3 and last 2.5 pro get it right about half the time. Not enough time to test thoroughly

fleet lintel Jun 5, 2025, 3:42 PM

#

I saw it and it's gone again

late path Jun 5, 2025, 3:42 PM

#

portal color scheme

fleet lintel Jun 5, 2025, 3:43 PM

#

yeah, it's rolling out. I am probably hitting different server now and flag is not ON yet on new server

brittle tiger Jun 5, 2025, 3:43 PM

#

I got one through and now in the same chat instance it says model not available

main gulch Jun 5, 2025, 3:45 PM

#

golden ocean Jun 5, 2025, 3:51 PM

#

pls say it fixed coding issues

lime coral Jun 5, 2025, 3:51 PM

#

https://x.com/sundarpichai/status/1930651395985960975?s=46

Sundar Pichai (@sundarpichai)

@J_S0L1 Goldmane

fleet lintel Jun 5, 2025, 3:55 PM

#

Formatting is better. Looks a bit like chatgpt to me

brittle tiger Jun 5, 2025, 3:56 PM

#

https://x.com/sundarpichai/status/1930650575097696490

Sundar Pichai (@sundarpichai)

Gemini

#

craig is sundar larping as openai fanboy confirmed

#

1470 ELO

echo aurora Jun 5, 2025, 3:58 PM

#

late path waiting for leaderboard updates

Refresh 🙂

brittle tiger Jun 5, 2025, 3:58 PM

#

fleet lintel Jun 5, 2025, 3:59 PM

#

24 elo gain is good. I was hoping for 30 elo gain but still respectable

brittle tiger Jun 5, 2025, 4:01 PM

#

30+ point elo difference over Opus is hefty amount of ppl picking it over Opus

late path Jun 5, 2025, 4:02 PM

#

fleet lintel 24 elo gain is good. I was hoping for 30 elo gain but still respectable

if you unchecked style control its 30

sour spindle Jun 5, 2025, 4:02 PM

#

Damn may be back on the googtrain again

patent bane Jun 5, 2025, 4:02 PM

#

we are so back

patent bane Jun 5, 2025, 4:05 PM

#

brittle tiger

how is it on there so fast?

brittle tiger Jun 5, 2025, 4:05 PM

#

fleet lintel Jun 5, 2025, 4:05 PM

#

patent bane how is it on there so fast?

it's goldmane model and has been on Arena for 10 days or so

patent bane Jun 5, 2025, 4:06 PM

#

fleet lintel it's goldmane model and has been on Arena for 10 days or so

oh, haven't heard about it, only heard about kingfall

fleet lintel Jun 5, 2025, 4:07 PM

#

https://x.com/sundarpichai/status/1930656033237823862

"We also heard your feedback and made improvements to style and the structure of responses. "

No wonder it feels better

Sundar Pichai (@sundarpichai)

Our latest Gemini 2.5 Pro update is now in preview.

It’s better at coding, reasoning, science + math, shows improved performance across key benchmarks (AIDER Polyglot, GPQA, HLE to name a few), and leads @lmarena_ai with a 24pt Elo score jump since the previous version.

We also

brittle tiger Jun 5, 2025, 4:08 PM

#

https://x.com/OfficialLoganK/status/1930657743251349854

Logan Kilpatrick (@OfficialLoganK)

Introducing our latest update to Gemini 2.5 Pro (06-05), which we expect to become our long term stable release. At a glance:

- SOTA on HLE, Aider, and GPQA
- Now supports thinking budgets
- Same cost, on pareto frontier
- Closes gap on 03-25 regressions

calm sequoia Jun 5, 2025, 4:09 PM

#

hahahahaha

patent bane Jun 5, 2025, 4:09 PM

#

close?

golden ocean Jun 5, 2025, 4:10 PM

#

yea ever since cot became a thing i noticed verbosity went up so high and I never benefited from cot anyway. non cot sota models could already do every one of my tasks but i gotta be honest claude 4 opus thinking kinda beats non thinking opus 4 so im starting to accept cot and claude thinking doesnt have verbosity issues

fleet lintel Jun 5, 2025, 4:11 PM

#

late path if you unchecked style control its 30

that's cool

late path Jun 5, 2025, 4:12 PM

#

gemini 2.5 ultra👀

calm sequoia Jun 5, 2025, 4:13 PM

#

Lol the grok 3.5 lost opportunity for the good release. And now it is Irrelevant 😄

civic flame Jun 5, 2025, 4:13 PM

#

do you think it's likely kingfall will turn up on arena soon?

wicked root Jun 5, 2025, 4:15 PM

#

WOOOOOOOT

wicked root Jun 5, 2025, 4:15 PM

#

brittle tiger

LETS GO

#

GOOGLE SUPREMACY

civic flame Jun 5, 2025, 4:15 PM

#

fleet lintel Jun 5, 2025, 4:19 PM

#

https://blog.google/products/gemini/gemini-2-5-pro-latest-preview/

It's clearly the best model now

#

and quite cheap

patent bane Jun 5, 2025, 4:20 PM

#

civic flame

lmao o4 mini high got that high

elder rapids Jun 5, 2025, 4:21 PM

#

civic flame

HOLY

#

SHT

calm sequoia Jun 5, 2025, 4:21 PM

#

#

They did reverse as last time

fleet lintel Jun 5, 2025, 4:22 PM

#

calm sequoia

this is what I am looking for.. could you align them better ? 😄

wicked root Jun 5, 2025, 4:22 PM

#

Polymarket’s going ham

#

Odds of google winning this month jumped 10% in last 2 days

calm sequoia Jun 5, 2025, 4:23 PM

#

fleet lintel this is what I am looking for.. could you align them better ? 😄

In proramming it's middle point between march and may updates. In everything else - better than both.

wicked root Jun 5, 2025, 4:24 PM

#

I dno man. Im new to the ai bet scene

#

I use gemini for work tho

#

No we’re at 80%

fleet lintel Jun 5, 2025, 4:24 PM

#

what is the Polymarket link?

civic flame Jun 5, 2025, 4:25 PM

#

calm sequoia

wow that is one hell of a base model

#

it does, though

#

there'd a toggle in ai atudio

#

studio

#

speaking of AI studio it keeps throwing errors ugh

balmy mist Jun 5, 2025, 4:26 PM

#

yeah it does

#

i have mine off but it still thinks

#

Screenshot_2025-06-05_at_12.27.06_PM.png

#

nvm mine is on lol

civic flame Jun 5, 2025, 4:27 PM

#

wicked root Jun 5, 2025, 4:27 PM

#

Ill take my chances. Ill win as long as non-google or openai models win

#

By the end of June i mean

balmy mist Jun 5, 2025, 4:27 PM

#

yoooo this is so much better, like it feels better to use

#

which model was this? NW?

#

lol NW literally is a myth at this point, is goldmane better than NW?

fleet lintel Jun 5, 2025, 4:29 PM

#

balmy mist lol NW literally is a myth at this point, is goldmane better than NW?

most likely

balmy mist Jun 5, 2025, 4:30 PM

#

ahh yeah that makes sense

#

what was this General availability people keep talking about?

#

i have been mia for a bit

#

this is impressive af

Screenshot_2025-06-05_at_12.31.36_PM.png

#

wasn't every model release GA?

drifting thorn Jun 5, 2025, 4:32 PM

#

Wow I’m exhilarated

#

Gemini new model

#

Yay

balmy mist Jun 5, 2025, 4:33 PM

#

isnt o3 pro coming out today as well?

#

or did OA give up?

keen fulcrum Jun 5, 2025, 4:33 PM

#

balmy mist isnt o3 pro coming out today as well?

Is it?

balmy mist Jun 5, 2025, 4:34 PM

#

keen fulcrum Is it?

thats what someone said here

keen fulcrum Jun 5, 2025, 4:34 PM

#

They are releasing snapshot after snapchot, still no gemini 3
or where is ultra

patent bane Jun 5, 2025, 4:35 PM

#

people keep talking about how bad the new releases are but I see the potentials, imagine what our children would be using 10 years later

#

i will definitely be fooled by AI

keen fulcrum Jun 5, 2025, 4:36 PM

#

What about Ultra?
I don't believe Gemini 3 is that late tbh

sour spindle Jun 5, 2025, 4:37 PM

#

Does anyone know if there is a way to set your defaults in ai studio like i prefer temperature at 0 or do you have to manually change it everytime

cedar tide Jun 5, 2025, 4:38 PM

#

Does anyone have the benchmarks for version 05-06 (the previous one) please?

patent bane Jun 5, 2025, 4:39 PM

#

sour spindle Does anyone know if there is a way to set your defaults in ai studio like i pref...

manually set it everytime you start a new chat

cedar tide Jun 5, 2025, 4:40 PM

#

Oh i found this

golden ocean Jun 5, 2025, 4:42 PM

#

bro definitely wants to be like that

#

but u wanted to be like that either way

#

LOL

keen fulcrum Jun 5, 2025, 4:44 PM

#

It could be Sunstrike

wicked root Jun 5, 2025, 4:44 PM

#

Fat cat

#

https://tenor.com/view/cat-gif-27161863

Tenor

misty vault Jun 5, 2025, 4:45 PM

#

golden ocean LOL

Bruh I thought craig was not rage baitable

fleet lintel Jun 5, 2025, 4:45 PM

#

After seeing releases from Google, OAI and Claude, is there any hope for Microsoft, Amazon or Apple to release a good AI model ever?

wicked root Jun 5, 2025, 4:45 PM

#

@deep adder do u think OAI can beat gemini?

wicked root Jun 5, 2025, 4:46 PM

#

fleet lintel After seeing releases from Google, OAI and Claude, is there any hope for Micros...

Nah

fleet lintel Jun 5, 2025, 4:46 PM

#

wicked root <@348477266704990208> do u think OAI can beat gemini?

I am not OAI fan or gemini hater, and I think OAI can beat gemini

wicked root Jun 5, 2025, 4:46 PM

#

How so?

#

No im asking out of curiosity

#

Why is OAI’s model better?

patent bane Jun 5, 2025, 4:47 PM

#

cedar tide Oh i found this

my toxic traits could never....

fleet lintel Jun 5, 2025, 4:48 PM

#

wicked root Nah

Amazon might be OK because of CLaude investment and Microsoft because of OAI but Apple is kinda done

wicked root Jun 5, 2025, 4:48 PM

#

Idc what the crowd wants lol. LMArena ranking is the only thing I care about

wicked root Jun 5, 2025, 4:48 PM

#

fleet lintel Amazon might be OK because of CLaude investment and Microsoft because of OAI but...

Does amazon own claude?

fleet lintel Jun 5, 2025, 4:48 PM

#

wicked root Does amazon own claude?

not sure how much % but may be aroudn 50% ?

barren prairie Jun 5, 2025, 4:49 PM

#

Chatgpt is more popular but Gemini is better for me .

wicked root Jun 5, 2025, 4:49 PM

#

Why? Polymarket only cares about lmarena ranking and I use gemini pro on daily basis without any problem

fleet lintel Jun 5, 2025, 4:51 PM

#

fleet lintel not sure how much % but may be aroudn 50% ?

chatgpt is guessestimating 15% and gemini is guessestimating 20%

#

and Google own around 10% of Claude. I thought difference would be much bigger.

cedar tide Jun 5, 2025, 4:52 PM

#

06-05 is it goldmane or redsword?

fleet lintel Jun 5, 2025, 4:53 PM

#

goldmane, confirmed by CEO on twitter

small haven Jun 5, 2025, 4:53 PM

#

when is kingfall

cedar tide Jun 5, 2025, 4:53 PM

#

You have the proof from the web dev Arena ?

fleet lintel Jun 5, 2025, 4:53 PM

#

small haven when is kingfall

most likely never 😄

barren prairie Jun 5, 2025, 4:53 PM

#

The kingfall didn t fall today 🙂

small haven Jun 5, 2025, 4:53 PM

#

is kingfall deepthink parallel cot?\

cedar tide Jun 5, 2025, 4:53 PM

#

Ah found

cedar tide Jun 5, 2025, 4:54 PM

#

small haven is kingfall deepthink parallel cot?\

i used it without think

fleet lintel Jun 5, 2025, 4:54 PM

#

I have my doubts. It doesn't look that easy to cook something in this area

small haven Jun 5, 2025, 4:56 PM

#

u missed on some wealth

fleet lintel Jun 5, 2025, 4:57 PM

#

how is function calling for the latest Gemini model? It used to be bad

late path Jun 5, 2025, 4:57 PM

#

bought some at 78c😋

small haven Jun 5, 2025, 4:58 PM

#

brian is talking shh

tall summit Jun 5, 2025, 4:58 PM

#

wicked root Idc what the crowd wants lol. LMArena ranking is the only thing I care about

BASED

fleet lintel Jun 5, 2025, 4:58 PM

#

he is talking from like 5 min...

#

has to be the biggest reveal ever

small haven Jun 5, 2025, 4:59 PM

#

when will sam fall?

fleet lintel Jun 5, 2025, 4:59 PM

#

small haven when will sam fall?

with king fall 😛

small haven Jun 5, 2025, 4:59 PM

#

fleet lintel with king fall 😛

and when will sam rise

fleet lintel Jun 5, 2025, 5:01 PM

#

you are going to be fired if you reveal too much

small haven Jun 5, 2025, 5:01 PM

#

vvery cool thanks

late path Jun 5, 2025, 5:02 PM

#

please dont

small haven Jun 5, 2025, 5:02 PM

#

chillax

quiet folio Jun 5, 2025, 5:02 PM

#

small haven chillax

word chillax detected, i will now delete messages

small haven Jun 5, 2025, 5:03 PM

#

quiet folio word chillax detected, i will now delete messages

respect brian

quiet folio Jun 5, 2025, 5:03 PM

#

word respect detected, reposting messages

misty vault Jun 5, 2025, 5:03 PM

#

real

#

Imma try

small haven Jun 5, 2025, 5:04 PM

#

@civic flame gemini hasn't solved it, im feeling grok 3.5

#

its not

#

oh

#

u can set thinking budget

#

im at 8192 default

#

im trying another

fleet lintel Jun 5, 2025, 5:05 PM

#

drop hints ... not full paragraphs 🙂

tall summit Jun 5, 2025, 5:05 PM

#

whats fixed

small haven Jun 5, 2025, 5:05 PM

#

u need to make ur own discord

late path Jun 5, 2025, 5:06 PM

#

My favorite paragraphs😭

tall summit Jun 5, 2025, 5:06 PM

#

hoooly

#

fixed 🙀

small haven Jun 5, 2025, 5:06 PM

#

small haven u need to make ur own discord

and charge huge

quiet folio Jun 5, 2025, 5:06 PM

#

Bruh those deleted messages are only I think I think I think and other uncertainties

#

Hows that delete worthy

small haven Jun 5, 2025, 5:07 PM

#

craigbench, oai at 100%, gemini at 0%

misty vault Jun 5, 2025, 5:07 PM

#

small haven craigbench, oai at 100%, gemini at 0%

LMAO

cedar tide Jun 5, 2025, 5:08 PM

#

Does anyone have any code examples from 06-05?

fleet lintel Jun 5, 2025, 5:08 PM

#

small haven <@1338136168344064040> gemini hasn't solved it, im feeling grok 3.5

goldmane model is stuck in infinite loop for this question ... thinking for 4 min now

misty vault Jun 5, 2025, 5:09 PM

#

I'm going to finish my project with claude 4 opus thinking and gemini side by side now

#

Claude 4 been doing good but I want to see if they fixed the gemini coding annoyances

civic flame Jun 5, 2025, 5:09 PM

#

cedar tide Oh i found this

lol why does it keep getting worse at swe bench

leaden sun Jun 5, 2025, 5:09 PM

#

fleet lintel you are going to be fired if you reveal too much

Missed the opportunity b4 it got deleted 🥲

small haven Jun 5, 2025, 5:09 PM

#

fleet lintel goldmane model is stuck in infinite loop for this question ... thinking for 4 mi...

yup, i just set my thinking budget to 32k

#

still running

fleet lintel Jun 5, 2025, 5:09 PM

#

small haven yup, i just set my thinking budget to 32k

I got 4041 after 4 min thinking... answer is something like 3000?>

small haven Jun 5, 2025, 5:09 PM

#

ohhh ffs

civic flame Jun 5, 2025, 5:09 PM

#

small haven <@1338136168344064040> gemini hasn't solved it, im feeling grok 3.5

yeah nor could kingsfall

small haven Jun 5, 2025, 5:10 PM

#

civic flame yeah nor could kingsfall

damn rlly

civic flame Jun 5, 2025, 5:10 PM

#

when a model can consistently solve this im calling it agi

#

trust

civic flame Jun 5, 2025, 5:10 PM

#

small haven damn rlly

yeah it was one of the few prompts i tried it with

small haven Jun 5, 2025, 5:10 PM

#

civic flame when a model can consistently solve this im calling it agi

but tbf, thinking budget was default 8192 for kingfall?

civic flame Jun 5, 2025, 5:10 PM

#

i set it to max

#

for that prompt

small haven Jun 5, 2025, 5:10 PM

#

wow

fleet lintel Jun 5, 2025, 5:10 PM

#

small haven <@1338136168344064040> gemini hasn't solved it, im feeling grok 3.5

not grok 3.5 but grok 35

quiet folio Jun 5, 2025, 5:10 PM

#

leaden sun Missed the opportunity b4 it got deleted 🥲

Bro he didnt even answer any question
Unless.. all the "i think"s and guessing stuff is just ACTING clueless but its in fact real and confirmed?????? 😱

patent bane Jun 5, 2025, 5:10 PM

#

civic flame yeah it was one of the few prompts i tried it with

whats the prompt?

civic flame Jun 5, 2025, 5:10 PM

#

There are 2022 users on a social network called Mathbook, and some of them are Mathbook-friends. (On Mathbook, friendship is always mutual and permanent.)

Starting now, Mathbook will only allow a new friendship to be formed between two users if they have at least two friends in common. What is the minimum number of friendships that must already exist so that every user could eventually become friends with every other user?

small haven Jun 5, 2025, 5:11 PM

#

fleet lintel not grok 3.5 but grok 35

with tools*

civic flame Jun 5, 2025, 5:11 PM

#

ans = 3031

patent bane Jun 5, 2025, 5:11 PM

#

what models have solved it?

#

is it tricky?

civic flame Jun 5, 2025, 5:11 PM

#

patent bane what models have solved it?

0 consistently

#

nebula (NOT released 2.5 pro for some reason) sometimes got it

#

that's the only model

#

that's because it was cheating lol

tall summit Jun 5, 2025, 5:12 PM

#

patent bane is it tricky?

its 2022 usamo problem 6

civic flame Jun 5, 2025, 5:12 PM

#

tool usage is fair except web search

tall summit Jun 5, 2025, 5:12 PM

#

so yes.

small haven Jun 5, 2025, 5:12 PM

#

civic flame tool usage is fair except web search

until it pulls up evan chen's work

civic flame Jun 5, 2025, 5:12 PM

#

lol

#

vibe-wise i quite like this new 2.5 pro

#

it makes some actually visually pleasing frontends that don't feel like generic slop

#

fleet lintel Jun 5, 2025, 5:13 PM

#

civic flame vibe-wise i quite like this new 2.5 pro

agreed.. I used like to Gemini answers but now structure and presetation and tone is better too

civic flame Jun 5, 2025, 5:13 PM

#

html + css only

misty vault Jun 5, 2025, 5:14 PM

#

civic flame html + css only

check the code

tall summit Jun 5, 2025, 5:14 PM

#

https://artofproblemsolving.com/wiki/index.php/2022_USAMO_Problems/Problem_6 hmm someone wrote 4044 as the answer

Problem 6

misty vault Jun 5, 2025, 5:14 PM

#

is it ass cheeks

small haven Jun 5, 2025, 5:14 PM

#

u realized deeperthink mode has to search the web 😭

patent bane Jun 5, 2025, 5:15 PM

#

@civic flame
https://artofproblemsolving.com/wiki/index.php/2022_USAMO_Problems/Problem_6?srsltid=AfmBOoqJa2CnONUZ-EQwyLGb8GvDwsK2fYV4bJE5dZW76laHI7UNrsyC

uhmm, how about this?

Problem 6

small haven Jun 5, 2025, 5:15 PM

#

link it then

#

its believable, grok is finetuned to math

civic flame Jun 5, 2025, 5:15 PM

#

misty vault is it ass cheeks

opus 4 is the best model in terms of actually nice code, but i still prefer the end result of 2.5 pro

misty vault Jun 5, 2025, 5:16 PM

#

fair

elder rapids Jun 5, 2025, 5:16 PM

#

patent bane <@1338136168344064040> https://artofproblemsolving.com/wiki/index.php/2022_USA...

that would be funny, the models keep getting 4044

civic flame Jun 5, 2025, 5:17 PM

#

patent bane <@1338136168344064040> https://artofproblemsolving.com/wiki/index.php/2022_USA...

oh that's weird

#

one second

small haven Jun 5, 2025, 5:17 PM

#

ya the fact opus is still at 79% swe, and 0605 at 67.2%, but aider 82% and 72% opus, fishy

civic flame Jun 5, 2025, 5:17 PM

#

https://www.reddit.com/r/mathriddles/comments/1gz19na/what_is_the_minimum_number_of_friendships_that/

From the mathriddles community on Reddit

Explore this post and more from the mathriddles community

#

look at the comments on this

#

i honestly did not realise this was a USAMO problem

elder rapids Jun 5, 2025, 5:17 PM

#

small haven ya the fact opus is still at 79% swe, and 0605 at 67.2%, but aider 82% and 72% o...

they're entirely different things tho

small haven Jun 5, 2025, 5:18 PM

#

elder rapids they're entirely different things tho

its really both just code tasks

tall summit Jun 5, 2025, 5:18 PM

#

civic flame https://www.reddit.com/r/mathriddles/comments/1gz19na/what_is_the_minimum_number...

i love that sub

civic flame Jun 5, 2025, 5:18 PM

#

https://imgur.com/a/RARdRMa

Imgur

Untitled Album

civic flame Jun 5, 2025, 5:18 PM

#

tall summit i love that sub

samesies

tall summit Jun 5, 2025, 5:18 PM

#

civic flame samesies

its funny there are a few people who just post oly problems there

tall summit Jun 5, 2025, 5:19 PM

#

civic flame https://imgur.com/a/RARdRMa

thats copied from the aops discussion

fleet lintel Jun 5, 2025, 5:19 PM

#

wow, goldmane is confidently wrong.
I said "answer is 3031. prove it"

And reply was (last line)
"Therefore, despite your suggestion, the mathematical proof confirms that the minimum number of friendships is 4041."

🙂

wicked root Jun 5, 2025, 5:19 PM

#

tall summit BASED

😤

tall summit Jun 5, 2025, 5:19 PM

#

https://artofproblemsolving.com/community/c5h2808836p24774626

elder rapids Jun 5, 2025, 5:19 PM

#

small haven its really both just code tasks

the initial presupposition of different coding strengths affirms this regardless tho? the fact that they simply ARE good at different coding things in practice, should surely be considered

small haven Jun 5, 2025, 5:19 PM

#

wow brian left the server

elder rapids Jun 5, 2025, 5:20 PM

#

he didn't

small haven Jun 5, 2025, 5:20 PM

#

elder rapids he didn't

ping him

elder rapids Jun 5, 2025, 5:20 PM

#

he's still in my heart

small haven Jun 5, 2025, 5:20 PM

#

add him

civic flame Jun 5, 2025, 5:20 PM

#

💔

small haven Jun 5, 2025, 5:20 PM

#

great

civic flame Jun 5, 2025, 5:20 PM

#

he probably got clocked

#

rip

civic flame Jun 5, 2025, 5:20 PM

#

tall summit thats copied from the aops discussion

each solution i find comes up with a different end result lol

small haven Jun 5, 2025, 5:21 PM

#

elder rapids the initial presupposition of different coding strengths affirms this regardless...

true, hopefully kingfall unanimously dominates both

quiet folio Jun 5, 2025, 5:21 PM

#

Is this my fault

small haven Jun 5, 2025, 5:21 PM

#

wanted to ask when will there be a claude max equivalent for gemini 😭

quiet folio Jun 5, 2025, 5:21 PM

#

sad~7

late path Jun 5, 2025, 5:21 PM

#

yes

misty vault Jun 5, 2025, 5:21 PM

#

yes

golden ocean Jun 5, 2025, 5:21 PM

#

yes

quiet folio Jun 5, 2025, 5:22 PM

#

yes

tall summit Jun 5, 2025, 5:22 PM

#

civic flame each solution i find comes up with a different end result lol

i trust evan chen

late path Jun 5, 2025, 5:22 PM

#

you and that purple guy

#

You drove away our insider

elder rapids Jun 5, 2025, 5:22 PM

#

small haven true, hopefully kingfall unanimously dominates both

i think that's inevitable, and btw I don't think kingfall is unanimously better than goldmane, although in spatial tasks it seems to be like basically agi

late path Jun 5, 2025, 5:22 PM

#

🤦

small haven Jun 5, 2025, 5:22 PM

#

brian is going to come back under a different alt

quiet folio Jun 5, 2025, 5:22 PM

#

Guys if we all show enough love we can get him to join back

patent bane Jun 5, 2025, 5:22 PM

#

fleet lintel wow, goldmane is confidently wrong. I said "answer is 3031. prove it" And repl...

i prefer that than the "sorry, you're right..."

elder rapids Jun 5, 2025, 5:22 PM

#

kingfall svgs >>>>>>

small haven Jun 5, 2025, 5:22 PM

#

quiet folio Guys if we all show enough love we can get him to join back

its ur fault hahaha

elder rapids Jun 5, 2025, 5:23 PM

#

but that's pretty much it imo, the insane nuance in how they speak, the details

#

are there

quiet folio Jun 5, 2025, 5:23 PM

#

small haven its ur fault hahaha

Yes but multiple people showing love beats only one person (me) doing it!

small haven Jun 5, 2025, 5:23 PM

#

quiet folio Yes but multiple people showing love beats only one person (me) doing it!

well u can unleash ur treasure trove stash now

quiet folio Jun 5, 2025, 5:23 PM

#

what treaure trove stash

#

Is he referring to the deleted brian messages

small haven Jun 5, 2025, 5:24 PM

#

no

quiet folio Jun 5, 2025, 5:24 PM

#

But my nu des ae private. I'd prefer not to share them here

#

Also because that would be against the rules

small haven Jun 5, 2025, 5:24 PM

#

oh true

elder rapids Jun 5, 2025, 5:25 PM

#

btw without thinking budget turned on, is it just how much it wants to think?

civic flame Jun 5, 2025, 5:25 PM

#

again, there is a non thinking mode

#

iirc it's just only on ai studio

#

whoops

#

#

yeah nevermind it's just bugged

#

well you can on the frontend it just throws an error

misty vault Jun 5, 2025, 5:27 PM

#

thats what he meant bro

civic flame Jun 5, 2025, 5:27 PM

#

yes i know dawg

#

i just didn't realise that because i thought the reason it was throwing an error was it was overloaded initially

patent bane Jun 5, 2025, 5:29 PM

#

why mine is 24576?

surreal creek Jun 5, 2025, 5:30 PM

#

Aspiring to be this delusional

small haven Jun 5, 2025, 5:31 PM

#

yea tbf oai still has the edge rn

#

even if kingfall drops, they'll just drop o4 and o5 mini high

patent bane Jun 5, 2025, 5:32 PM

#

damn

elder rapids Jun 5, 2025, 5:33 PM

#

small haven even if kingfall drops, they'll just drop o4 and o5 mini high

ion think this would be the case, especially before gpt 5s imminent release

small haven Jun 5, 2025, 5:33 PM

#

patent bane # damn

đem

elder rapids Jun 5, 2025, 5:33 PM

#

you'd have to treat Gemini deepthink with the same attitude

#

which to me, is obviously just selective tbh

small haven Jun 5, 2025, 5:34 PM

#

elder rapids ion think this would be the case, especially before gpt 5s imminent release

they always change their plans

late path Jun 5, 2025, 5:34 PM

#

Ugh, really wish this server didn't have those disrespectful people. I'm going to miss Brian||and his insider infomations||

small haven Jun 5, 2025, 5:34 PM

#

o3 pro wasn't supposed to make it

elder rapids Jun 5, 2025, 5:35 PM

#

it's "technically" bannable but thats nonsensical

#

no reason to report them or do anything about it

quiet folio Jun 5, 2025, 5:35 PM

#

(No one ever got banned for this unless they were a big name so good luck)

patent bane Jun 5, 2025, 5:35 PM

#

small haven đem

? where are you from

quiet folio Jun 5, 2025, 5:36 PM

#

Anyway but I agree I also deleted the messages and wasn't going to do it again but oh well rip

small haven Jun 5, 2025, 5:36 PM

#

patent bane ? where are you from

not from vietnam

patent bane Jun 5, 2025, 5:38 PM

#

small haven not from vietnam

vai cac

small haven Jun 5, 2025, 5:38 PM

#

patent bane vai cac

i dont understand

patent bane Jun 5, 2025, 5:38 PM

#

small haven i dont understand

same

small haven Jun 5, 2025, 5:38 PM

#

hes gone

#

wow

misty vault Jun 5, 2025, 5:39 PM

#

NOw lets get brian back

elder rapids Jun 5, 2025, 5:39 PM

#

what

#

ngl Craig

#

you're kind of

#

small haven Jun 5, 2025, 5:39 PM

#

w craiggers

elder rapids Jun 5, 2025, 5:39 PM

#

mentally lacking

keen fulcrum Jun 5, 2025, 5:40 PM

#

fleet lintel I have my doubts. It doesn't look that easy to cook something in this area

Apple can do it, but they have no AI lab as big as xAI or OpenAI

#

Its sad Apple didn't join the AI race
They have the resources to do so

torn mantle Jun 5, 2025, 5:41 PM

#

https://x.com/TheGregYang/status/1930668478425645081

Greg Yang (@TheGregYang)

fixing bot replies

#

grok 3.5 update

#

🥰

small haven Jun 5, 2025, 5:41 PM

#

torn mantle grok 3.5 update

every week its like that 😭

torn mantle Jun 5, 2025, 5:41 PM

#

he said

#

'fixing bot replies'

#

....

#

google just released another model

#

veo 3

#

imagen 4

#

imagen 4 ultra

#

whisk...

echo aurora Jun 5, 2025, 5:42 PM

#

Sry for the slow response, currently on mobile.

torn mantle Jun 5, 2025, 5:42 PM

#

meanwhile xai are struggling to release grok 3.5?

#

doesnt make any sense to me

small haven Jun 5, 2025, 5:42 PM

#

torn mantle meanwhile xai are struggling to release grok 3.5?

something def went wrong internally

misty vault Jun 5, 2025, 5:43 PM

#

elder rapids mentally lacking

true craig just gets butthurt or baited like realllllly easy it's fun bro lmaoo

small haven Jun 5, 2025, 5:43 PM

#

almost a month delayed

misty vault Jun 5, 2025, 5:43 PM

#

but whatever in this case it was not invalid reaction. they got revenge on the guy for making brian leave so that is fair

torn mantle Jun 5, 2025, 5:43 PM

#

small haven something def went wrong internally

yea

elder rapids Jun 5, 2025, 5:43 PM

#

holy

torn mantle Jun 5, 2025, 5:43 PM

#

thats what im thinking

elder rapids Jun 5, 2025, 5:43 PM

#

sht

torn mantle Jun 5, 2025, 5:43 PM

#

HOLY

#

OIMG

#

STOP

#

wait

misty vault Jun 5, 2025, 5:43 PM

#

nahh you cried lmaoo

elder rapids Jun 5, 2025, 5:43 PM

#

asura

#

0605 just wrote me

torn mantle Jun 5, 2025, 5:43 PM

#

OMG

elder rapids Jun 5, 2025, 5:43 PM

#

an INSANE

#

ESSAY

torn mantle Jun 5, 2025, 5:43 PM

#

elder rapids an INSANE

you should try kingfall

elder rapids Jun 5, 2025, 5:44 PM

#

y'all gotta see this

torn mantle Jun 5, 2025, 5:44 PM

#

you have access?

elder rapids Jun 5, 2025, 5:44 PM

#

torn mantle you should try kingfall

nah. better.

misty vault Jun 5, 2025, 5:44 PM

#

gork 3.5 release???

torn mantle Jun 5, 2025, 5:44 PM

#

oh?

zinc ore Jun 5, 2025, 5:44 PM

#

So if goldmane is 82% on aider, then which model was 86%?? Was it kingsfall or something else?

torn mantle Jun 5, 2025, 5:44 PM

#

misty vault gork 3.5 release???

wen?

#

qwen

#

q w e n

misty vault Jun 5, 2025, 5:44 PM

#

Idk why everyone hype

torn mantle Jun 5, 2025, 5:44 PM

#

:p

#

its good to create hype

barren prairie Jun 5, 2025, 5:45 PM

#

torn mantle you should try kingfall

Where on web dev? Or on arena

torn mantle Jun 5, 2025, 5:46 PM

#

barren prairie Where on web dev? Or on arena

i have private access

#

i can send link in dm

#

thanks

#

oh the choice of 'they'

#

i see what you did

#

lol

#

yea

#

you sleep well

#

8h sleep = sharp memory

#

tf

#

no way

#

thats unhealthy

deep adder Jun 5, 2025, 5:48 PM

#

ppl are weird

torn mantle Jun 5, 2025, 5:48 PM

#

shrug

fleet lintel Jun 5, 2025, 5:50 PM

#

https://techcrunch.com/2025/06/03/deepseek-may-have-used-googles-gemini-to-train-its-latest-model/

#

That's why we can't have good things. I am quite sure Google disabled thinking because of deepseek

elder rapids Jun 5, 2025, 5:50 PM

#

how tf do I change chat names on mobile

#

for AI studio

torn mantle Jun 5, 2025, 5:50 PM

#

fleet lintel https://techcrunch.com/2025/06/03/deepseek-may-have-used-googles-gemini-to-train...

it was kinda obv they will distill from diff models

#

its the fastest way and easiest road

fleet lintel Jun 5, 2025, 5:51 PM

#

too much liability for a big company

elder rapids Jun 5, 2025, 5:51 PM

#

ye

#

and also, Google = data geeks

#

they want the cleanest data

#

so even if they did, it's not really going to be that meaningful

elder rapids Jun 5, 2025, 5:53 PM

#

elder rapids how tf do I change chat names on mobile

@torn mantle

#

wya

torn mantle Jun 5, 2025, 5:54 PM

#

im here

torn mantle Jun 5, 2025, 5:54 PM

#

elder rapids how tf do I change chat names on mobile

@deep adder help

#

what secret

#

what

#

its kingfall

#

not fallking

#

i dont have fallking

#

im sorry

#

just go to twitter and type kingfall

small haven Jun 5, 2025, 5:56 PM

#

wen kingfallfurther?

torn mantle Jun 5, 2025, 5:56 PM

#

there is a chinese guy who shared the link just recently

#

it works

#

havent checked the code tbh

#

im just using it

small haven Jun 5, 2025, 5:58 PM

#

wait whats going on

#

i just went to grab coffee

torn mantle Jun 5, 2025, 5:59 PM

#

what if????????

#

omg

#

stop it

#

imagine...

#

nah its not

#

its just kingfall

#

just random things

late path Jun 5, 2025, 5:59 PM

#

it works, just like you're accessing api endpoints at google internally

torn mantle Jun 5, 2025, 5:59 PM

#

stop the cap

#

don't share it pls

#

lemme use it

#

chat is acting weird

#

😮

fleet lintel Jun 5, 2025, 6:01 PM

#

share please... how do I test otherwise

torn mantle Jun 5, 2025, 6:01 PM

#

fleet lintel share please... how do I test otherwise

hehehe

#

beg

small haven Jun 5, 2025, 6:06 PM

#

torn mantle don't share it pls

wait where do i get it

torn mantle Jun 5, 2025, 6:08 PM

#

torn mantle just go to twitter and type kingfall

.

#

i cant share the link here

small haven Jun 5, 2025, 6:08 PM

#

ok not gonna login with my google account 😭

torn mantle Jun 5, 2025, 6:08 PM

#

what

#

its an artifact on aistudio

#

it will be patched?

#

because of @small haven ?

small haven Jun 5, 2025, 6:09 PM

#

i mean does it match kingfall of yesterday?

torn mantle Jun 5, 2025, 6:09 PM

#

small haven i mean does it match kingfall of yesterday?

yea

small haven Jun 5, 2025, 6:09 PM

#

wow

torn mantle Jun 5, 2025, 6:09 PM

#

its kingfall

#

fallking

late path Jun 5, 2025, 6:09 PM

#

small haven i mean does it match kingfall of yesterday?

its the same model

torn mantle Jun 5, 2025, 6:09 PM

#

its gemini 3

#

probably first checkpoint of gemini 3

#

because logan said goldmane is the stable gemini 2.5 pro ver

#

its quite decent

small haven Jun 5, 2025, 6:10 PM

#

whats the context size

torn mantle Jun 5, 2025, 6:10 PM

#

im happy

#

64k

#

if im not wrong

small haven Jun 5, 2025, 6:13 PM

#

lets talk about o3 pro

#

where is it?

#

hmm i just checked the model selector

#

im blown away 😭

#

no my oai model selector

#

no

#

i thought it would came out thursday

#

they had everything set up, like why

torn mantle Jun 5, 2025, 6:16 PM

#

small haven hmm i just checked the model selector

you didnt thank me

small haven Jun 5, 2025, 6:16 PM

#

torn mantle you didnt thank me

thank u so much

torn mantle Jun 5, 2025, 6:16 PM

#

small haven thank u so much

thank uuuuuuuuuu

small haven Jun 5, 2025, 6:16 PM

#

for the oai model selector

torn mantle Jun 5, 2025, 6:16 PM

#

what

#

yes

small haven Jun 5, 2025, 6:16 PM

#

yes

torn mantle Jun 5, 2025, 6:17 PM

#

thank me

small haven Jun 5, 2025, 6:17 PM

#

thank me

torn mantle Jun 5, 2025, 6:17 PM

#

small haven thank me

thank u

small haven Jun 5, 2025, 6:17 PM

#

thank u

torn mantle Jun 5, 2025, 6:18 PM

#

yes

small haven Jun 5, 2025, 6:19 PM

#

what

#

oh right

late path Jun 5, 2025, 6:20 PM

#

they retained 0506 after this release

#

why not also keep 0325?

golden ocean Jun 5, 2025, 6:21 PM

#

is 0605 better than 0325

keen fulcrum Jun 5, 2025, 6:22 PM

#

How does the new version compare to sonnet and opus?

elder rapids Jun 5, 2025, 6:25 PM

#

golden ocean is 0605 better than 0325

ye

#

btw what's the highest we've seen for the simplebench sample questions?

#

@torn mantle @civic flame

small haven Jun 5, 2025, 6:34 PM

#

gonna try this

ocean vortex Jun 5, 2025, 6:38 PM

#

@calm sequoia you happy now?

#

they did the opposite now lmao

#

coding worse in parts, everything else better

small haven Jun 5, 2025, 6:39 PM

#

is that good? where does o3 and 0506 stand

#

0506 as in 0605

#general

we are so back

close?

damn