#general | Arena | Page 17

ocean vortex Apr 10, 2025, 1:04 PM

#

I mean the paper he linked is reasonably legit with one of the authors being uni professor https://arxiv.org/pdf/2412.06822

keen beacon Apr 10, 2025, 1:04 PM

#

which is another llama slop model

#

not quasar

sonic tendon Apr 10, 2025, 1:04 PM

#

they seem to have dialed back the slop

keen beacon Apr 10, 2025, 1:05 PM

#

it is tho

sonic tendon Apr 10, 2025, 1:05 PM

#

which probably won't fare well for their leaderboard placements

keen beacon Apr 10, 2025, 1:05 PM

#

it was recently, i think i got it somewhat recently

sonic tendon Apr 10, 2025, 1:05 PM

#

keen beacon it is tho

i think anonymous-test was, not sure about chatbot

keen beacon Apr 10, 2025, 1:05 PM

#

keen beacon it is tho

i haven't seen an anon chatbot

ocean vortex Apr 10, 2025, 1:05 PM

#

oh wait he is Eyad Gomaa, right

keen beacon Apr 10, 2025, 1:05 PM

#

and i use it quite often

#

will see if i can find it now..

keen beacon Apr 10, 2025, 1:06 PM

#

keen beacon i haven't seen an anon chatbot

they forgot to remove the You are trained on data up to October 2023. system prompt appendix for both of them 😭

#

(it's wrong btw)

#

by the time openai have a model more up to date than oct '23 i'll be dead

keen beacon Apr 10, 2025, 1:06 PM

#

keen beacon by the time openai have a model more up to date than oct '23 i'll be dead

it has the same june 2024 cut off as chatgpt 4o latest but they add that appendix to all 4o models

keen beacon Apr 10, 2025, 1:06 PM

#

keen beacon (it's wrong btw)

i noticed this with the o3 private model i have

sonic tendon Apr 10, 2025, 1:06 PM

#

do we think that that's actually a system prompt

keen beacon Apr 10, 2025, 1:07 PM

#

i asked it to explain "hawk tuah" and it did it just fine but maintains when directly asked that its knowledge cutoff is oct '23

sonic tendon Apr 10, 2025, 1:07 PM

#

or just a bunch of posttraining chat data they forgot to clean up

ocean vortex Apr 10, 2025, 1:07 PM

#

ocean vortex oh wait he is Eyad Gomaa, right

so "our team" he referred to was literally himself and his professor lmfao

keen beacon Apr 10, 2025, 1:07 PM

#

sonic tendon do we think that that's actually a system prompt

yes since it was retroactively added to older 4o models

sonic tendon Apr 10, 2025, 1:07 PM

#

keen beacon yes since it was retroactively added to older 4o models

ah

keen beacon Apr 10, 2025, 1:07 PM

#

keen beacon i asked it to explain "hawk tuah" and it did it just fine but maintains when dir...

wait what

#

that o3 might be on the new chatgpt 4o latest base model that was cont pretrained

#

wait a minute

#

"As of today (April 2025) the most recent stable release published on npm is:

discord.js 14.18.0

Because new patch versions appear fairly often, you can always check yourself with one of the following:

• npm view discord.js version
• npm outdated discord.js (if it’s already in your package.json)
• Visit https://www.npmjs.com/package/discord.js

Those commands/web pages will return the definitive latest version at any given moment."

npm

discord.js

A powerful library for interacting with the Discord API. Latest version: 14.18.0, last published: 2 months ago. Start using discord.js in your project by running npm i discord.js. There are 5739 other projects in the npm registry using discord.js.

#

is that just... a good guess... or

#

is it given tools

#

hmm

#

thats a new base model if it knows stuff outside of oct 2023 @keen beacon

sonic tendon Apr 10, 2025, 1:09 PM

#

good question

#

might be able to ask it

keen beacon Apr 10, 2025, 1:09 PM

#

chatgpt 4o latest was cont pretrained to june 2024

#

give me something it would only know if it had a more recent cutoff

#

oh wait

keen beacon Apr 10, 2025, 1:10 PM

#

keen beacon give me something it would only know if it had a more recent cutoff

who won the 2024 london mayoral elections and by what margin?

#

it just said the knowledge cutoff was june 2024 👀

keen beacon Apr 10, 2025, 1:10 PM

#

keen beacon who won the 2024 london mayoral elections and by what margin?

will try

sonic tendon Apr 10, 2025, 1:10 PM

#

maybe something that it wouldn't think it would be a good idea to look up?

#

well, that might work

keen beacon Apr 10, 2025, 1:10 PM

#

👀 new o3 model is based on the updated 4o

#

Sadiq Khan (Labour) was re‑elected Mayor of London in the election held on 2 May 2024.

• Votes:
– Sadiq Khan: 1,088,225
– Susan Hall (Conservative): 811,518

• Margin of victory:
– By votes: 276,707
– By share of the vote: 11.1 percentage points (Khan 43.8 %, Hall 32.7 %).

Because the 2024 contest was run under first‑past‑the‑post for the first time, those first‑round figures are the final result.

#

lol it got it bang on
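
A quick arithmetic check of the figures quoted above, as a minimal sketch. The variable names are illustrative, and this only confirms that the quoted numbers are consistent with each other, not that they match the official count.

```python
# Figures exactly as quoted in the model's answer above.
khan_votes = 1_088_225
hall_votes = 811_518
khan_share = 43.8  # percent
hall_share = 32.7  # percent

# Margin in raw votes and in percentage points.
vote_margin = khan_votes - hall_votes
share_margin = round(khan_share - hall_share, 1)

print(vote_margin)
print(share_margin)
```

Both values agree with the margins the model reported (276,707 votes and 11.1 points).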

neat apex Apr 10, 2025, 1:10 PM

#

More likely the reverse

keen beacon Apr 10, 2025, 1:11 PM

#

interesting

keen beacon Apr 10, 2025, 1:11 PM

#

keen beacon Sadiq Khan (Labour) was re‑elected Mayor of London in the election held on 2 May...

confirmed!!!

sonic tendon Apr 10, 2025, 1:11 PM

#

wait, when did the updated 4o release?

#

i would

keen beacon Apr 10, 2025, 1:11 PM

#

i don't think it has tools

sonic tendon Apr 10, 2025, 1:11 PM

#

've thought that going from 4o->o3 would take a few months

keen beacon Apr 10, 2025, 1:11 PM

#

when i asked who the US president is

#

it said "As of the most recent information available to me (knowledge cutoff June 2024), the President of the United States is Joseph R. Biden Jr. If you need confirmation for a date after that, please check a reliable, up‑to‑date news source or the official White House website."
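
The probing approach used in this exchange can be sketched roughly as follows: record dated questions and whether the model answered them correctly, then compare the latest correctly answered event with the cutoff the model claims. This assumes the model has no tool access, and the second probe below is hypothetical, not something tested in the chat.

```python
from datetime import date

def latest_demonstrated(probes):
    """Return the latest event date the model answered correctly, or None.

    probes: list of (event_date, answered_correctly) tuples.
    """
    known = [event_date for event_date, answered_correctly in probes if answered_correctly]
    return max(known) if known else None

# "October 2023" from the appendix; end of month is an assumption.
claimed_cutoff = date(2023, 10, 31)

probes = [
    (date(2024, 5, 2), True),    # London mayoral election, answered correctly in the chat
    (date(2024, 11, 5), False),  # hypothetical later probe, assumed unanswered
]

demonstrated = latest_demonstrated(probes)
claim_understates = demonstrated is not None and demonstrated > claimed_cutoff
print(demonstrated, claim_understates)
```

A correct answer only gives a lower bound on the real cutoff, so one probe after the claimed date is enough to show the appendix is stale, but not to pin down the exact month.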

keen beacon Apr 10, 2025, 1:12 PM

#

keen beacon it said "As of the most recent information available to me (knowledge cutoff Jun...

yeah

#

wild was right about continued pretraining yet again

#

lmao

#

wait

sonic tendon Apr 10, 2025, 1:12 PM

#

not sure i'd wanna be this guy rn

keen beacon Apr 10, 2025, 1:12 PM

#

keen beacon lmao

no doubt that is 100% new o3 full model

#

there's a slim chance this may actually be o4 mini?

#

since o4 is the one initially rumoured to be the new base

keen beacon Apr 10, 2025, 1:13 PM

#

keen beacon there's a slim chance this may actually be o4 mini?

maybe but there have been zero public models/etc. with the new chatgpt 4o cut off (cont. pretraining)

keen beacon Apr 10, 2025, 1:13 PM

#

keen beacon since o4 is the one initially rumoured to be the new base

yes its using the new 4o cont pretrained base

#

@keen beacon bruh u have access to the new o3 with the new base

#

they may have updated the system prompt in the last week or so

#

because before

#

it said it was oct '23

keen beacon Apr 10, 2025, 1:14 PM

#

keen beacon they may have updated the system prompt in the last week or so

it might be a lie if it knows more than that

#

but i guess if it was somewhat recent it makes sense too

balmy mist Apr 10, 2025, 1:15 PM

#

Yo, I'm seeing a lot of people have the Veo Gen in Google Studio. I don't see it for some reason, and I'm a power user of Studio. I use it like every damn day for hours, and I don't have it 😦

keen beacon Apr 10, 2025, 1:15 PM

#

balmy mist Yo, I'm seeing a lot of people have the Veo Gen in Google Studio. I don't see it...

api only and paid i think

#

or it hasnt rolled out or smthing

balmy mist Apr 10, 2025, 1:15 PM

#

my mom has it

sonic tendon Apr 10, 2025, 1:15 PM

#

your mom sounds cool

balmy mist Apr 10, 2025, 1:15 PM

#

and i just setup studio for her last night

neat apex Apr 10, 2025, 1:15 PM

#

Maybe the advanced?

balmy mist Apr 10, 2025, 1:15 PM

#

like basic

#

just opened the app

keen beacon Apr 10, 2025, 1:15 PM

#

balmy mist and i just setup studio for her last night

ya it hasnt rolled out to you ig

#

i've got it

balmy mist Apr 10, 2025, 1:15 PM

#

lmaoo

neat apex Apr 10, 2025, 1:15 PM

#

Hm

#

Maybe age?

balmy mist Apr 10, 2025, 1:16 PM

#

aint no age thing

keen beacon Apr 10, 2025, 1:16 PM

#

lmao

neat apex Apr 10, 2025, 1:16 PM

#

Hm

balmy mist Apr 10, 2025, 1:16 PM

#

none of the accounts have it 😦

#

now im team Open Ai again

neat apex Apr 10, 2025, 1:16 PM

#

Maybe verification like that phone thing?

keen beacon Apr 10, 2025, 1:18 PM

#

pretty crazy how good veo 2 is still

#

although that tree doesn't make much sense at the end

keen beacon Apr 10, 2025, 1:18 PM

#

keen beacon pretty crazy how good veo 2 is still

free or is it charging you on aistudio?

ocean vortex Apr 10, 2025, 1:18 PM

#

OpenAI need to finally release that gpt4o... like wtf are they waiting for catgrin

keen beacon Apr 10, 2025, 1:18 PM

#

keen beacon free or is it charging you on aistudio?

completely free

#

wow

#

so only api is paid i guess

#

ai studio is a crazy product

hardy pecan Apr 10, 2025, 1:18 PM

#

Vertex AI isn't free, just a fyi

keen beacon Apr 10, 2025, 1:19 PM

#

make sure ur not getting charged lol @keen beacon its hella expensive i think

balmy mist Apr 10, 2025, 1:19 PM

#

keen beacon ai studio is a crazy product

yupp the free part is just wild man

ocean vortex Apr 10, 2025, 1:19 PM

#

keen beacon so only api is paid i guess

yes you can't even select exp on aistudio. It's only preview

keen beacon Apr 10, 2025, 1:19 PM

#

keen beacon make sure ur not getting charged lol @keen beacon its hella expensive i...

yeah i know

balmy mist Apr 10, 2025, 1:19 PM

#

lol SOTA video and LLM for free

keen beacon Apr 10, 2025, 1:19 PM

#

i've checked

#

these aren't available yet tho

ocean vortex Apr 10, 2025, 1:20 PM

#

ocean vortex yes you can't even select exp on aistudio. It's only preview

for 2.5 pro. Which was supposed to be paid for preview version

keen beacon Apr 10, 2025, 1:20 PM

#

im really curious whether they removed anon chatbot in the arena because i got recently and it was matching quasar with the same system prompt

#

i've never seen it lol

sonic tendon Apr 10, 2025, 1:22 PM

#

i've been wondering about the fidelity with which you could predict where anon models were on the leaderboard if you just triggered a bunch of matchups and saw what they got paired with

#

does that sentence make any sense

keen beacon Apr 10, 2025, 1:22 PM

#

yeah

#

that's come in handy for me sometimes but i can't do it on a big enough scale to make concrete conclusions
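
One way to sketch the idea above, assuming (this is not confirmed anywhere in the chat) that the arena tends to pair models with similar ratings: average the known leaderboard scores of the opponents an anonymous model was matched against, weighted by how often each pairing occurred. The model names and scores below are made up for illustration.

```python
from collections import Counter
from math import sqrt

def estimate_rating(opponents, leaderboard):
    """Estimate an anonymous model's score from the opponents it was paired with.

    opponents: list of opponent names, one entry per observed matchup.
    leaderboard: dict mapping known model names to scores.
    Returns (mean, standard_error); opponents without a known score are skipped.
    """
    counts = Counter(name for name in opponents if name in leaderboard)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("no matchups against models with known scores")
    mean = sum(leaderboard[m] * c for m, c in counts.items()) / n
    if n == 1:
        return mean, float("inf")  # a single matchup says nothing about spread
    variance = sum(c * (leaderboard[m] - mean) ** 2 for m, c in counts.items()) / (n - 1)
    return mean, sqrt(variance / n)

# Hypothetical data, for illustration only.
leaderboard = {"model-a": 1300.0, "model-b": 1280.0, "model-c": 1200.0}
observed = ["model-a", "model-a", "model-b", "model-c", "unlisted-model"]

mean, stderr = estimate_rating(observed, leaderboard)
print(round(mean, 1), round(stderr, 1))
```

The standard error shrinks roughly with the square root of the number of matchups, which is why a handful of manual pairings is not enough for firm conclusions.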

sonic tendon Apr 10, 2025, 1:22 PM

#

especially since the web arena still leaks model names and providers when you open the site

sonic tendon Apr 10, 2025, 1:23 PM

#

keen beacon that's come in handy for me sometimes but i can't do it on a big enough scale to...

yeah, same

sonic tendon Apr 10, 2025, 1:23 PM

#

sonic tendon especially since the web arena still leaks model names and providers when you op...

although it seems like the web arena doesn't always have all of the models that the chat arena does

keen beacon Apr 10, 2025, 1:23 PM

#

sonic tendon especially since the web arena still leaks model names and providers when you op...

this was a thing with main lmarena site for a while i think

#

i might have stored a dump or two

sonic tendon Apr 10, 2025, 1:24 PM

#

oh, hmm

ocean vortex Apr 10, 2025, 1:33 PM

#

hardy pecan Vertex AI isn't free, just a fyi

yeah if you have billing set up and you choose the wrong 2.5 with identical date and description... you are getting billed lmao

#

exp and preview literally just the names slightly different

#

keen beacon Apr 10, 2025, 1:35 PM

#

@keen beacon did u notice the o3 model change within the last week or so? or was it the same model with an adjusted sys prompt?

#

based on phases of development we can guess their current timeline

ocean vortex Apr 10, 2025, 1:37 PM

#

keen beacon @keen beacon did u notice the o3 model change within the last week or s...

what is this all about btw? o3-mini update?

keen beacon Apr 10, 2025, 1:37 PM

#

keen beacon @keen beacon did u notice the o3 model change within the last week or s...

probably the latter

keen beacon Apr 10, 2025, 1:37 PM

#

ocean vortex what is this all about btw? o3-mini update?

he has access to new o3 with updated gpt 4o base (june 2024)

ocean vortex Apr 10, 2025, 1:38 PM

#

keen beacon he has access to new o3 with updated gpt 4o base (june 2024)

what

#

o3 is only deep research lol

keen beacon Apr 10, 2025, 1:39 PM

#

it's not through deep research 👍

ocean vortex Apr 10, 2025, 1:39 PM

#

keen beacon it's not through deep research 👍

then where? Isn't it not released?

keen beacon Apr 10, 2025, 1:39 PM

#

it is not released no

#

i can't share where

#

sorry

#

its probably very imminent at this point tbh

#

yup

#

im curious how much stronger it is compared to og o3 👀

#

wdym

sage raptor Apr 10, 2025, 1:40 PM

#

keen beacon i can't share where

is it better than 2.5 ?

keen beacon Apr 10, 2025, 1:40 PM

#

keen beacon wdym

i mean theyre using a much stronger base model now, am curious lol

#

depends on your use case

sage raptor Apr 10, 2025, 1:40 PM

#

in coding

keen beacon Apr 10, 2025, 1:40 PM

#

well

#

there are still many specific coding use cases

#

for very reasoning heavy coding tasks, o3 is probably better

#

for most others, 2.5 pro is better

keen beacon Apr 10, 2025, 1:41 PM

#

keen beacon for most others, 2.5 pro is better

is that in your experience using it?

ocean vortex Apr 10, 2025, 1:41 PM

#

keen beacon i mean theyre using a much stronger base model now, am curious lol

it would make sense to lower the cost with better base more than anything. OG o3 used to be closer to pro I believe. Judging by arc-agi wording in collaboration with openai as well as reported cost

keen beacon Apr 10, 2025, 1:41 PM

#

yeah

#

unfortunately

keen beacon Apr 10, 2025, 1:41 PM

#

keen beacon is that in your experience using it?

yeah

#

again though

#

this is probably o3 medium

#

and the jumps between reasoning efforts are fairly significant

#

so we shall see once we have high

keen beacon Apr 10, 2025, 1:42 PM

#

ocean vortex it would make sense to lower the cost with better base more than anything. OG o3...

could have much higher peaks with a more competitive base model

#

if they ran it with as much compute as they did before

#

doesn't seem like o3 pro will launch at the same time as o3 and o4 mini

ocean vortex Apr 10, 2025, 1:44 PM

#

I'm not a fan of how they are making it look like it's scaling of RL training when they constantly keep low-key updating the base now tbh

keen beacon Apr 10, 2025, 1:44 PM

#

going off of the recent frontend changes in preparation

keen beacon Apr 10, 2025, 1:44 PM

#

ocean vortex I'm not a fan of how they are making it look like it's scaling of RL training wh...

why not? both avenues seem to have no end (to a certain degree) rn, you just have to be smart about it i think

#

especially base models

ocean vortex Apr 10, 2025, 1:46 PM

#

keen beacon why not? both avenues seem to have no end (to a certain degree) rn, you just hav...

they are about to release reasoning models which are based on undocumented base models. Just seems obscure on purpose and wrong lol

keen beacon Apr 10, 2025, 1:46 PM

#

ocean vortex they are about to release reasoning models which are based on undocumented base ...

they are about to release an api dated instruct version of it tho. and its already released in chatgpt 4o latest

ocean vortex Apr 10, 2025, 1:46 PM

#

makes it much harder to isolate things and do direct comparisons

ocean vortex Apr 10, 2025, 1:47 PM

#

keen beacon they are about to release an api dated instruct version of it tho. and its alrea...

about to... Though it is still not there 👀

keen beacon Apr 10, 2025, 1:47 PM

#

ocean vortex about to... Though it is still not there 👀

chatgpt 4o latest post december has the new base model, and that has been released

ocean vortex Apr 10, 2025, 1:47 PM

#

o3 and o4 is also about to lol

ocean vortex Apr 10, 2025, 1:48 PM

#

keen beacon chatgpt 4o latest post december has the new base model, and that has been releas...

yeah but that doesn't have any official metrics. And it was updated numerous times now. That's what I meant by 'low-key'

sage raptor Apr 10, 2025, 1:48 PM

#

so today o3 will release ?

ocean vortex Apr 10, 2025, 1:48 PM

#

btw given that 4o-mini stayed the same all this time

#

I do believe mini reasoning variants are distills

keen beacon Apr 10, 2025, 1:49 PM

#

sage raptor so today o3 will release ?

likely

#

the interface is ready for it to launch

keen beacon Apr 10, 2025, 1:49 PM

#

ocean vortex btw given that 4o-mini stayed the same all this time

i wonder if o4 mini is on an updated base (we know nothing of that publicly though)

#

if theyre releasing o3/o4 mini, they will launch quasar/anonymous chatbot too i think

#

anonymous chatbot was replaced quite quickly after their last chatgpt 4o release

#

so they want the lmarena results fast 👀

ocean vortex Apr 10, 2025, 1:50 PM

#

keen beacon i wonder if o4 mini is on an updated base (we know nothing of that publicly thou...

I think it's a distill of their best most expensive version of o3

keen beacon Apr 10, 2025, 1:51 PM

#

slightly off topic but what's everyone's bet on how they launch it this time

ocean vortex Apr 10, 2025, 1:51 PM

#

since full o4 is unlikely to exist yet

keen beacon Apr 10, 2025, 1:51 PM

#

surprise live stream announcement, just drop the blog post with no warning

#

etc

#

this time we won't get to see the system card before launch

#

which is a shame

keen beacon Apr 10, 2025, 1:51 PM

#

keen beacon surprise live stream announcement, just drop the blog post with no warning

idk but they will prob do something big because the quasar thing is largely a huge marketing thing i think

#

i think they'll probably do some hypeposting beforehand

#

as ever

sonic tendon Apr 10, 2025, 1:52 PM

#

sonic tendon not sure i'd wanna be this guy rn

L

sage raptor Apr 10, 2025, 1:52 PM

#

what will the o4 benchmarks look like

keen beacon Apr 10, 2025, 1:52 PM

#

if they do a livestream

#

i wouldn't be surprised if they previewed o4 at the end

sage raptor Apr 10, 2025, 1:52 PM

#

yea, they will

sonic tendon Apr 10, 2025, 1:53 PM

#

yeah

keen beacon Apr 10, 2025, 1:53 PM

#

/gpt-5

sonic tendon Apr 10, 2025, 1:53 PM

#

arc-agi?

#

and humanity's last exam

ocean vortex Apr 10, 2025, 1:53 PM

#

sage raptor what will the o4 benchmarks look like

just like o1 and o3-mini did I think. o4-mini-high to be comparable to o3-medium but excelling at different things

keen beacon Apr 10, 2025, 1:53 PM

#

i wanna know the new arc agi results with the new o3 model 👀

sage raptor Apr 10, 2025, 1:53 PM

#

maybe even o5, who knows

thorny drum Apr 10, 2025, 1:53 PM

#

will o3 have an api?

keen beacon Apr 10, 2025, 1:53 PM

#

ofc

#

openai stopped staggering launches starting with o1 iirc

#

when they launch on chatgpt they launch on api

thorny drum Apr 10, 2025, 1:54 PM

#

well o1 pro wasnt on api for a bit i thought

keen beacon Apr 10, 2025, 1:54 PM

#

thorny drum well o1 pro wasnt on api for a bit i thought

yeah that's only pro

sonic tendon Apr 10, 2025, 1:54 PM

#

assuming oAI just launches o3 with no warning, would it just show up on the leaderboard out of the blue once it got enough votes

keen beacon Apr 10, 2025, 1:54 PM

#

we won't be getting o3 pro straight away

sonic tendon Apr 10, 2025, 1:54 PM

#

isn't o1 pro just regular o1 doing best of 3

keen beacon Apr 10, 2025, 1:54 PM

#

sonic tendon assuming oAI just launches o3 with no warning, would it just show up on the lead...

it will get added to lmarena like a few hours after api launch

#

then it'll be like

#

~1 week until leaderboard appearance

sonic tendon Apr 10, 2025, 1:55 PM

#

makes sense

keen beacon Apr 10, 2025, 1:55 PM

#

i don't think it'll take top spot without stylectrl

#

but with it it might

#

again, depends on what reasoning effort version they use

#

last time they just put medium to start

#

same w mini until they added o3 mini high

sonic tendon Apr 10, 2025, 1:56 PM

#

would o3-high be prohibitively expensive?

keen beacon Apr 10, 2025, 1:56 PM

#

openai give them credits lol

#

yes (for normal users)

sonic tendon Apr 10, 2025, 1:56 PM

#

i'm somewhat unsurprised that openai doesn't care enough about lmsys to sponsor that

keen beacon Apr 10, 2025, 1:56 PM

#

most labs fund their usage

#

labs + sponsors

sonic tendon Apr 10, 2025, 1:56 PM

#

keen beacon openai give them credits lol

ah, that makes sense

#

didn't think i saw them in the sponsor list

sonic tendon Apr 10, 2025, 1:57 PM

#

keen beacon yes (for normal users)

oh, i just meant for lmsys

keen beacon Apr 10, 2025, 1:57 PM

#

sonic tendon didn't think i saw them in the sponsor list

they're not quite sponsors in that sense so the labs aren't in the list

#

but it makes sense for openai to fund it because of how much they gain from the data

#

(i also can't think of much explanation for why all the direct chat limits for models were removed a couple months ago other than that)

novel flame Apr 10, 2025, 1:58 PM

#

keen beacon although that tree doesn't make much sense at the end

Wdym? It’s clearly just a tree being pushed along in the other direction by a friendly gardener, totally normal human behavior my fellow human.

balmy mist Apr 10, 2025, 1:58 PM

#

is there a discord bot that summarizes from the last part of the convo you are in to the current? that would be so clutch man

keen beacon Apr 10, 2025, 1:58 PM

#

keen beacon (i also can't think of much explanation for why all the direct chat limits for m...

? theyre still present i think

#

i haven't run into one in months

#

i used to all the time

#

then they just stopped

#

its model specific

#

Shrug

sonic tendon Apr 10, 2025, 1:59 PM

#

i keep being surprised that something as small as lmarena seems to have any impact on these massive companies' decisions

keen beacon Apr 10, 2025, 1:59 PM

#

some have user limits, global limits per interval, etc

keen beacon Apr 10, 2025, 1:59 PM

#

keen beacon its model specific

i haven't had any on claude, 4o or gemini

keen beacon Apr 10, 2025, 1:59 PM

#

keen beacon some have user limits, global limits per interval, etc

yeah i remember

sonic tendon Apr 10, 2025, 1:59 PM

#

balmy mist is there a discord bot that summarizes from the last part of the convo you are i...

i think discord has been testing out ai summaries for like a year now

keen beacon Apr 10, 2025, 1:59 PM

#

keen beacon i haven't had any on claude, 4o or gemini

i had one on an older version of 4o when i tested for the appendix (on quasar and anonymous chatbot) recently

keen beacon Apr 10, 2025, 1:59 PM

#

keen beacon i had one on an older version of 4o when i tested for the appendix (on quasar an...

hm

sonic tendon Apr 10, 2025, 1:59 PM

#

you might still need a client mod to access them

keen beacon Apr 10, 2025, 1:59 PM

#

oops

#

wrong paste

#

lmao

sonic tendon Apr 10, 2025, 2:00 PM

#

lmaoo i was

#

wondering

keen beacon Apr 10, 2025, 2:00 PM

#

sonic tendon i think discord has been testing out ai summaries for like a year now

perpetual beta

sonic tendon Apr 10, 2025, 2:00 PM

#

i feel like i recognize that game

#

*recognized

keen beacon Apr 10, 2025, 2:00 PM

#

https://store.steampowered.com/app/1184770/The_Political_Process/

Steam

The Political Process

Explore a dynamic world of politics in this turn-based, political simulator. Create a character, run for political office, write legislation, balance budgets, and more as you move up the political hierarchy. Create a Custom Politician: Play as a Democrat or Republican in the American political system. Customize the name, age, and appearance of you…

Price

$14.99

Recommendations

1826


#

if optimus alpha is openai related, its probably gonna be o4 mini

#

again

#

oh

#

optimus

#

nvm

sonic tendon Apr 10, 2025, 2:01 PM

#

keen beacon https://store.steampowered.com/app/1184770/The_Political_Process/

ohh yeah

keen beacon Apr 10, 2025, 2:01 PM

#

yeah idk what else they could put there

#

they defo wont put o3 its far too expensive

#

i dont think they have an updated 4o mini yet (at least publicly, we only know of an updated 4o), and it doesnt really match up with the name

sonic tendon Apr 10, 2025, 2:02 PM

#

the soon-to-be 4o/o4 confusion will be fun

keen beacon Apr 10, 2025, 2:03 PM

#

keen beacon i dont think they have an updated 4o mini yet (at least publicly, we only know o...

dw it's just the 120574th chatgpt 4o update

keen beacon Apr 10, 2025, 2:03 PM

#

keen beacon dw it's just the 120574th chatgpt 4o update

🤣 thats quasar tho

#

i dont think they have another one ready that quickly

#

i wouldn't put it past them

#

this next 4o release is gonna be big tho. the whole quasar thing, etc. (no released benchmarks on chatgpt 4o latest with the updated base model, despite massive improvements, etc)

#

they are REALLY milking 4o

#

none of us thought it would still be the default almost 1 year on when it launched

novel flame Apr 10, 2025, 2:08 PM

#

keen beacon none of us thought it would still be the default almost 1 year on when it launch...

To be fair, it’s a pretty solid workhorse model that hasn’t been easy to beat. Up until reasoners came out it was pretty much undefeated except for coding.

balmy mist Apr 10, 2025, 2:08 PM

#

@sonic tendon so there is a summary bot?

#

how do I use it?

keen beacon Apr 10, 2025, 2:08 PM

#

it is solid yes, but it doesn't really hold up very well these days

balmy mist Apr 10, 2025, 2:08 PM

#

i cant keep up with these convos

keen beacon Apr 10, 2025, 2:09 PM

#

and coding is a big weak point given how many people use llms for that specific area

balmy mist Apr 10, 2025, 2:09 PM

#

keen beacon they are REALLY milking 4o

fr

#

thats their bread and butter it seems

#

but it def got better

#

i like it now

keen beacon Apr 10, 2025, 2:10 PM

#

it's very good for creative tasks and has got noticeably better at most other things but it is still pretty unacceptably poor in code tasks compared to other llms

keen beacon Apr 10, 2025, 2:10 PM

#

keen beacon it's very good for creative tasks and has got noticeably better at most other th...

quasar is supposed to be a large improvement over the previous 4o in coding

#

i've tested it

sonic tendon Apr 10, 2025, 2:11 PM

#

balmy mist @sonic tendon so there is a summary bot?

https://vencord.dev/plugins/Summaries

Vencord

Summaries

Enables Discord's experimental Summaries feature on every server, displaying AI generated summaries of conversations

keen beacon Apr 10, 2025, 2:11 PM

#

it is an improvement over the old 4o but it's still lagging behind claude 3.7 sonnet

sonic tendon Apr 10, 2025, 2:11 PM

#

you'll need a discord client mod loader

#

it's a bit involved

novel flame Apr 10, 2025, 2:11 PM

#

keen beacon and coding is a big weak point given how many people use llms for that specific ...

100% but millions of non-techie users don’t care about that, they just want to cheat at schoolwork, generate their marketing slop, or add chatbot support to their enterprise to appease investors.

keen beacon Apr 10, 2025, 2:12 PM

#

sonic tendon https://vencord.dev/plugins/Summaries

vencordd :3

keen beacon Apr 10, 2025, 2:12 PM

#

novel flame 100% but millions of non-techie users don’t care about that, they just want to c...

yeah fair enough

sage raptor Apr 10, 2025, 2:12 PM

#

https://x.com/sama/status/1910334443690340845

Sam Altman (@sama) on X

a few times a year i wake up early and can't fall back asleep because we are launching a new feature ive been so excited about for so long.

today is one of those days!

keen beacon Apr 10, 2025, 2:12 PM

#

AYEE

#

HERE WE GO

#

"new feature" lol

#

we're getting o3

#

oh

#

wait a minute

#

hes underplaying it but i think its gonna be multiple model releases

novel flame Apr 10, 2025, 2:13 PM

#

keen beacon yeah fair enough

Also talk dirty with their personalized SmutBot, although I don’t think GPT4o lets you do that

keen beacon Apr 10, 2025, 2:13 PM

#

yeah

sonic tendon Apr 10, 2025, 2:13 PM

#

keen beacon vencordd :3

ye :3

sonic tendon Apr 10, 2025, 2:13 PM

#

novel flame Also talk dirty with their personalized SmutBot, although I don’t think GPT4o le...

What

keen beacon Apr 10, 2025, 2:13 PM

#

novel flame Also talk dirty with their personalized SmutBot, although I don’t think GPT4o le...

new 4o is actually quite uncensored

#

LOL

#

let me find a screenshot

thorny drum Apr 10, 2025, 2:13 PM

#

yeah its just tech bro speak for 2+ new models

balmy mist Apr 10, 2025, 2:13 PM

#

he is so random with his tweets, we really in wild times when ceos use twitter to launch stuff lol

sonic tendon Apr 10, 2025, 2:13 PM

#

i may know where this is going

balmy mist Apr 10, 2025, 2:13 PM

#

but im here for it

sonic tendon Apr 10, 2025, 2:14 PM

#

sage raptor https://x.com/sama/status/1910334443690340845

i wake up early and can't fall back asleep
real

keen beacon Apr 10, 2025, 2:14 PM

#

keen beacon let me find a screenshot

can't find it 😔

brittle tiger Apr 10, 2025, 2:15 PM

#

https://x.com/sama/status/1910334443690340845?t=1rXzwx_9QNE1eLjzgJd6Gg&s=19

Sam Altman (@sama) on X

a few times a year i wake up early and can't fall back asleep because we are launching a new feature ive been so excited about for so long.

today is one of those days!

keen beacon Apr 10, 2025, 2:15 PM

#

balmy mist he is so random with his tweets, we really in wild times when ceos use twitter t...

i see people complain about sam not using grammar quite often but i like it

#

we can all agree professionalism sucks right

balmy mist Apr 10, 2025, 2:15 PM

#

fr

#

its all about efficiency of info at this point

#

no need to fluff it up

sonic tendon Apr 10, 2025, 2:17 PM

#

idk it's like
almost informal enough to feel like genuine internet speak, but not quite

keen beacon Apr 10, 2025, 2:17 PM

#

we're still waiting on whether this will be a livestream

#

i think it will be

balmy mist Apr 10, 2025, 2:17 PM

#

can someone tell him to hurry up

keen beacon Apr 10, 2025, 2:17 PM

#

it'll be at least a couple more hours lol

balmy mist Apr 10, 2025, 2:17 PM

#

they always livestream lol, and if sam is not on it then its mid

#

yeah prob 1 pm est

keen beacon Apr 10, 2025, 2:18 PM

#

balmy mist they always livestream lol, and if sam is not on it then its mid

sam will be there

#

i have no doubt

tall summit Apr 10, 2025, 2:18 PM

#

sonic tendon > i wake up early and can't fall back asleep real

universal experience

keen beacon Apr 10, 2025, 2:18 PM

#

im guessing the next openai releases are:

o3
o4 mini
quasar (4o api dated version, benchmarks, lmarena [was probably anon chatbot], updated stronger base model that was cont. pretrained to june 2024 from oct 2023)

keen beacon Apr 10, 2025, 2:18 PM

#

keen beacon sam will be there

balmy mist Apr 10, 2025, 2:19 PM

#

keen beacon im guessing the next openai releases are: - o3 - o4 mini - quasar (4o api dated ...

you think we get o3 pro today as well?

#

it does not seem like it

keen beacon Apr 10, 2025, 2:19 PM

#

i dont think so

keen beacon Apr 10, 2025, 2:19 PM

#

balmy mist you think we get o3 pro today as well?

nope

#

and it might not happen all today

#

it'll be "in the coming weeks"

oblique flint Apr 10, 2025, 2:19 PM

#

what I dont get is why they're releasing o3. If they have o4 mini, then it's probably distilled from o4 right? So why not release o4

keen beacon Apr 10, 2025, 2:19 PM

#

oblique flint what I dont get is why they're releasing o3. If they have o4 mini, then it's pro...

they redid o3

tall summit Apr 10, 2025, 2:19 PM

#

balmy mist you think we get o3 pro today as well?

nobody's ever released so much at once

keen beacon Apr 10, 2025, 2:19 PM

#

oblique flint what I dont get is why they're releasing o3. If they have o4 mini, then it's pro...

that's just how they've always done o series models

#

with the new 4o base model

#

it has knowledge in june 2024 now

sonic tendon Apr 10, 2025, 2:20 PM

#

keen beacon

now that's an interaction i can stand by

balmy mist Apr 10, 2025, 2:20 PM

#

whats the point of releasing the new 4o when we already had it for days lol

keen beacon Apr 10, 2025, 2:20 PM

#

wait

#

i've just had

#

a thought

#

so we all know about optimus alpha

balmy mist Apr 10, 2025, 2:20 PM

#

like they should just drop it silently lol

keen beacon Apr 10, 2025, 2:20 PM

#

balmy mist whats the point of releasing the new 4o when we already had it for days lol

u can pay for it now 🙂

#

and if they're likely going to release a 4o update (quasar) alongside o3 and o4 mini today

#

then that makes whatever optimus is all the more interesting

#

i kinda doubt another 4o version

#

but i also doubt o4 mini if they're releasing it today

#

it's weird because

#

the naming scheme + context window with quasar would make you think google (stargazer, moonhowler, nebula...)

#

but all my testing screams openai

keen beacon Apr 10, 2025, 2:22 PM

#

keen beacon but all my testing screams openai

bruh the benchmarks, pretraining knowledge, cut off, tokenizer, using the same openai chatgpt 4o anon name on lmarena, adding the same 4o api appendix, etc. it is 4o

#

quasar is definitively 4o imho, but idk what optimus could be

balmy mist Apr 10, 2025, 2:23 PM

#

so that would also mean that nw is OA?

keen beacon Apr 10, 2025, 2:23 PM

#

no thats google

tall summit Apr 10, 2025, 2:23 PM

#

keen beacon bruh the benchmarks, pretraining knowledge, cut off, tokenizer, using the same o...

the benchmarks?

keen beacon Apr 10, 2025, 2:24 PM

#

i do expect new google models possibly today but likely tomorrow

keen beacon Apr 10, 2025, 2:24 PM

#

tall summit the benchmarks?

#general message

#

they will want to respond and we're still waiting on 2.5 flash nevermind all the anon great coding models they've been testing

balmy mist Apr 10, 2025, 2:24 PM

#

is anyone gonna livestream the OA livestream here? that might be cute

keen beacon Apr 10, 2025, 2:24 PM

#

can u do that on a discord server?

keen beacon Apr 10, 2025, 2:25 PM

#

keen beacon can u do that on a discord server?

yup

#

in a vc

#

stream your screen

sonic tendon Apr 10, 2025, 2:25 PM

#

keen beacon can u do that on a discord server?

you can

#

yeah

tall summit Apr 10, 2025, 2:25 PM

#

keen beacon https://discord.com/channels/1340554757349179412/1340554757827461211/13591565679...

oooh thank you for the info

sage raptor Apr 10, 2025, 2:25 PM

#

what is sama automate

sonic tendon Apr 10, 2025, 2:25 PM

#

play the yt livestream in a video player

keen beacon Apr 10, 2025, 2:25 PM

#

sage raptor what is sama automate

perplexity is cooked

sonic tendon Apr 10, 2025, 2:25 PM

#

and then stream that window

keen beacon Apr 10, 2025, 2:25 PM

#

they have no moat lmao

sonic tendon Apr 10, 2025, 2:26 PM

#

i feel like LLMs are well on their way to becoming commodities at this point

#

let alone random LLM search applications

keen beacon Apr 10, 2025, 2:26 PM

#

tbh any other company that isnt an ai frontier lab building stuff on ai models will get crushed

#

regularly scheduled programming = resumed

tall summit Apr 10, 2025, 2:26 PM

#

keen beacon they have no moat lmao

they are important as a part of history

sonic tendon Apr 10, 2025, 2:26 PM

#

keen beacon

wait what happened

keen beacon Apr 10, 2025, 2:26 PM

#

tall summit they are important as a part of history

yeah that's the only thing they're gonna be.. history

lime coral Apr 10, 2025, 2:26 PM

#

sage raptor what is sama automate

This is HYPEMAN

#

HE IS BACK

keen beacon Apr 10, 2025, 2:26 PM

#

sonic tendon wait what happened

a trump admin official said the word tariff too much -> investors run screaming

tall summit Apr 10, 2025, 2:26 PM

#

sonic tendon i feel like LLMs are well on their way to becoming commodities at this point

aren't they already

keen beacon Apr 10, 2025, 2:27 PM

#

it is the prophecy

sonic tendon Apr 10, 2025, 2:27 PM

#

keen beacon a trump admin official said the word tariff too much -> investors run screaming

lmaoo

sonic tendon Apr 10, 2025, 2:27 PM

#

tall summit aren't they already

pretty much

tall summit Apr 10, 2025, 2:27 PM

#

keen beacon

chaos 😈

sonic tendon Apr 10, 2025, 2:27 PM

#

after r2, i think things will be mostly over

keen beacon Apr 10, 2025, 2:27 PM

#

the trump admin's WH site really sucks

lime coral Apr 10, 2025, 2:27 PM

#

keen beacon they have no moat lmao

Was obvious since the beginning

keen beacon Apr 10, 2025, 2:27 PM

#

gemini can do better

sonic tendon Apr 10, 2025, 2:27 PM

#

keen beacon the trump admin's WH site really sucks

oh god did you see it when it first came out

#

they like

keen beacon Apr 10, 2025, 2:27 PM

#

the biden site was so much better

#

but imo obama's was the best

sonic tendon Apr 10, 2025, 2:27 PM

#

played a promo video when you opened the site

torn mantle Apr 10, 2025, 2:27 PM

#

brittle tiger https://x.com/sama/status/1910334443690340845?t=1rXzwx_9QNE1eLjzgJd6Gg&s=19

Its gonna be something stupid

keen beacon Apr 10, 2025, 2:27 PM

#

🙏

torn mantle Apr 10, 2025, 2:28 PM

#

We know hype sama

keen beacon Apr 10, 2025, 2:28 PM

#

sonic tendon played a promo video when you opened the site

yeah and it screamed authoritarian

sonic tendon Apr 10, 2025, 2:28 PM

#

that too

torn mantle Apr 10, 2025, 2:28 PM

#

Always hyping his products

keen beacon Apr 10, 2025, 2:28 PM

#

torn mantle We know hype sama

i think hes underplaying it this time lol calling it a "feature"

sonic tendon Apr 10, 2025, 2:28 PM

#

i'd be down to hang out in a vc, just am gonna be at school for the next 4 hours 😭

#

so

#

probably gonna miss the livestream

keen beacon Apr 10, 2025, 2:29 PM

#

obama you've got 8 more years in you buddy

#

come back

sonic tendon Apr 10, 2025, 2:29 PM

#

lime coral Was obvious since the beginning

do you guys remember phind
i used it for the free gpt-4 back in like 2023
i have no idea how they're still in business

keen beacon Apr 10, 2025, 2:29 PM

#

hell yeah

#

this was such a fire lineup

#

if only joe ran in 2016

#

maybe we would've never had the orange

sonic tendon Apr 10, 2025, 2:30 PM

#

obamna? 🥺

keen beacon Apr 10, 2025, 2:30 PM

#

sonic tendon obamna? 🥺

https://tenor.com/view/soda-gif-8026056502744469335

sonic tendon Apr 10, 2025, 2:31 PM

#

keen beacon https://tenor.com/view/soda-gif-8026056502744469335

i was just about to

tall summit Apr 10, 2025, 2:31 PM

#

sonic tendon do you guys remember phind i used it for the free gpt-4 back in like 2023 i have...

have used it a few times
they're still in business maybe because the ai boom is still in progress

sonic tendon Apr 10, 2025, 2:31 PM

#

.

torn mantle Apr 10, 2025, 2:31 PM

#

keen beacon i think hes underplaying it this time lol calling it a "feature"

Is the thinking slider added yet?

keen beacon Apr 10, 2025, 2:31 PM

#

nope

torn mantle Apr 10, 2025, 2:31 PM

#

It could be that no?

keen beacon Apr 10, 2025, 2:31 PM

#

may arrive today

sonic tendon Apr 10, 2025, 2:31 PM

#

leo stole my line :(

#

/j

torn mantle Apr 10, 2025, 2:31 PM

#

So thats the feature

keen beacon Apr 10, 2025, 2:31 PM

#

sonic tendon leo stole my line :(

pointandlaugh

keen beacon Apr 10, 2025, 2:31 PM

#

torn mantle So thats the feature

no chance

tall summit Apr 10, 2025, 2:31 PM

#

ok what the fuck's a thinking slider

keen beacon Apr 10, 2025, 2:31 PM

#

the quasar release is very imminent anyway, if its not today

#

nobody would get hyped about that

torn mantle Apr 10, 2025, 2:31 PM

#

keen beacon no chance

Let's see

sonic tendon Apr 10, 2025, 2:31 PM

#

tall summit have used it a few times they're still in business maybe because the ai boom is ...

oh yeah, everyone's burning through VC money right now

torn mantle Apr 10, 2025, 2:31 PM

#

xd

keen beacon Apr 10, 2025, 2:31 PM

#

sam would not be doing all that for a damn slider

#

even by his standards that's meh

torn mantle Apr 10, 2025, 2:31 PM

#

keen beacon sam would not be doing all that for a damn slider

Oh he will

sonic tendon Apr 10, 2025, 2:32 PM

#

torn mantle Oh he will

he won't

keen beacon Apr 10, 2025, 2:32 PM

#

torn mantle Oh he will

👎

novel flame Apr 10, 2025, 2:32 PM

#

brittle tiger https://x.com/sama/status/1910334443690340845?t=1rXzwx_9QNE1eLjzgJd6Gg&s=19

Another thing: how is this not market manipulation?

keen beacon Apr 10, 2025, 2:32 PM

#

the thinking slider is already out in beta on some of the clients

#

doesn't make sense to be doing all this for it

#

why deploy chatgpt after midnight with anticipated model names tbh if its not coming out soon

#

^

sonic tendon Apr 10, 2025, 2:32 PM

#

novel flame Another thing: how is this not market manipulation?

wdym?

tall summit Apr 10, 2025, 2:32 PM

#

sonic tendon oh yeah, everyone's burning through VC money right now

seems to have only gotten worse as the ai boom continues

keen beacon Apr 10, 2025, 2:32 PM

#

wrong paste

#

smh

#

so it begins

#

whatever the case the new o3 (on a new gpt 4o base model) is seemingly ready anyway 👀

tall summit Apr 10, 2025, 2:33 PM

#

keen beacon so it begins

LOL

keen beacon Apr 10, 2025, 2:33 PM

#

i like how xAI took a huge leap off a cliff

sonic tendon Apr 10, 2025, 2:33 PM

#

sonic tendon wdym?

maybe for the $1.2m polymarket listing, which he definitely doesn't have any incentive to participate in

novel flame Apr 10, 2025, 2:33 PM

#

sonic tendon wdym?

Usually insider information - such as hinting at a release - is kept under wraps until an official statement is released, in order to not mess with markets. But I'm genuinely asking because I don’t know how it works

sonic tendon Apr 10, 2025, 2:33 PM

#

novel flame Usually insider information - such as hinting at a release - is kept under wraps...

Hmm

tall summit Apr 10, 2025, 2:33 PM

#

keen beacon i like how xAI took a huge leap off a cliff

-# that's because grok sucks

keen beacon Apr 10, 2025, 2:34 PM

#

grok 3 is mid

sonic tendon Apr 10, 2025, 2:34 PM

#

keen beacon so it begins

yeah that was crazy

#

when it happened on the march market, a couple people made like 50 grand

keen beacon Apr 10, 2025, 2:34 PM

#

good on 'em

#

it was only really a matter of time

sonic tendon Apr 10, 2025, 2:35 PM

#

novel flame Usually insider information - such as hinting at a release - is kept under wraps...

i imagine it's less of a concern since oAI isn't a public company

#

although it could still impact competitors

keen beacon Apr 10, 2025, 2:35 PM

#

i wonder if oai will ever go public

sonic tendon Apr 10, 2025, 2:36 PM

#

i feel like that would finally cross the line

keen beacon Apr 10, 2025, 2:36 PM

#

the old mission has been slowly being undone for ages

#

nail in the coffin

sonic tendon Apr 10, 2025, 2:37 PM

#

yeah, it'd suck

keen beacon Apr 10, 2025, 2:37 PM

#

i heard someone talk about this recently and it made me think

#

most labs have a plan/goal and everything for developing AGI and what they would do if they achieved it first, etc

#

and the question was - which lab would you rather have control over the first true AGI

sonic tendon Apr 10, 2025, 2:38 PM

#

google could possibly be worse

keen beacon Apr 10, 2025, 2:38 PM

#

yeah i don't know

brittle tiger Apr 10, 2025, 2:38 PM

#

novel flame Usually insider information - such as hinting at a release - is kept under wraps...

Private company and it would have to be way more specific to apply. If this were manipulation tons of salesman ceos like altman would be in trouble. Elon has been saying fsd is a year away for almost a decade

sonic tendon Apr 10, 2025, 2:38 PM

#

keen beacon and the question was - which lab would you rather have control over the first tr...

possibly anthropic

keen beacon Apr 10, 2025, 2:38 PM

#

openai and anthropic are the loudest about what their intentions would be

#

google are very vague

sonic tendon Apr 10, 2025, 2:39 PM

#

i wish there were a way we could be comfortable about ai safety without making everything proprietary forever

#

but, eh

tall summit Apr 10, 2025, 2:39 PM

#

i was about to say none of them but after red said that i'd say anthropic because least of many evils

keen beacon Apr 10, 2025, 2:39 PM

#

none of the labs are great if we're talking ethics but anthropic have the best vibes

#

there's the question of "would you rather it be in the hands of a lab that's too careless or a lab that's too careful"

#

i would say the latter

oblique flint Apr 10, 2025, 2:40 PM

#

keen beacon and the question was - which lab would you rather have control over the first tr...

mistral but I dont think it's happening 😂

keen beacon Apr 10, 2025, 2:40 PM

#

yeah no 😭

#

imagine if it's deepseek

#

that would be quite the predicament

brittle tiger Apr 10, 2025, 2:40 PM

#

keen beacon and the question was - which lab would you rather have control over the first tr...

A lab in a democracy would be ideal

keen beacon Apr 10, 2025, 2:40 PM

#

lol yeah

sonic tendon Apr 10, 2025, 2:40 PM

#

anthropic seems like a bunch of idealistic nerds (but, not led by Sam Altman)

keen beacon Apr 10, 2025, 2:41 PM

#

if a lab has achieved agi other labs would probably follow shortly tbh

sage raptor Apr 10, 2025, 2:41 PM

#

sage raptor what is sama automate

grok agrees too

sonic tendon Apr 10, 2025, 2:41 PM

#

which seems like the best-case scenario

keen beacon Apr 10, 2025, 2:41 PM

#

it would def be a domino effect kinda thing yeah

keen beacon Apr 10, 2025, 2:41 PM

#

sage raptor grok agrees too

💀

sonic tendon Apr 10, 2025, 2:41 PM

#

sage raptor grok agrees too

where are they getting this from 😭

keen beacon Apr 10, 2025, 2:41 PM

#

schizo

sonic tendon Apr 10, 2025, 2:41 PM

#

oblique flint mistral but I dont think it's happening 😂

why mistral?

tall summit Apr 10, 2025, 2:41 PM

#

sage raptor grok agrees too

makes a lot of sense yeah

sonic tendon Apr 10, 2025, 2:41 PM

#

i haven't heard of that name in a while

keen beacon Apr 10, 2025, 2:41 PM

#

imagine if meta got there first 💀

#

no way lol

tall summit Apr 10, 2025, 2:42 PM

#

oblique flint mistral but I dont think it's happening 😂

you never know

keen beacon Apr 10, 2025, 2:42 PM

#

i would definitely hope not

#

i dont even see any chance of anthropic getting there first

sonic tendon Apr 10, 2025, 2:42 PM

#

keen beacon that would be quite the predicament

I think I'd trust DS more than Meta

keen beacon Apr 10, 2025, 2:42 PM

#

keen beacon i dont even see any chance of anthropic getting there first

i think it's a 2 horse race atp

#

GDM vs OAI

#

anthropic will get there soon after one of those does

sonic tendon Apr 10, 2025, 2:42 PM

#

keen beacon i would definitely hope not

wait, what is this in reference to

keen beacon Apr 10, 2025, 2:42 PM

#

but they aren't daring enough

keen beacon Apr 10, 2025, 2:42 PM

#

sonic tendon wait, what is this in reference to

#general message

sonic tendon Apr 10, 2025, 2:43 PM

#

keen beacon https://discord.com/channels/1340554757349179412/1340554757827461211/13599012129...

got it

oblique flint Apr 10, 2025, 2:43 PM

#

sonic tendon why mistral?

well it's eu based (am eu citizen lol), committed to open source and doesnt seem as sketchy as some bigger companies (meta, google etc)

keen beacon Apr 10, 2025, 2:43 PM

#

keen beacon anthropic will get there soon after one of those does

deepseek

brittle tiger Apr 10, 2025, 2:43 PM

#

keen beacon if a lab has achieved agi other labs would probably follow shortly tbh

Depends on the lab or how advanced it is. If making sure it is most capable is a top goal of the model or lab it would probably not have a hard time preventing others

sonic tendon Apr 10, 2025, 2:43 PM

#

sonic tendon I think I'd trust DS more than Meta

I think most people want what's best for the world, DS researchers included

keen beacon Apr 10, 2025, 2:43 PM

#

they'll have their spies running back to the china hq once oai crack it wink wink

sonic tendon Apr 10, 2025, 2:44 PM

#

even if they're operating under a system that's going to pressure them into researching military use

#

they clearly put the minimum possible effort into the censors, for example

balmy mist Apr 10, 2025, 2:44 PM

#

people saying this:
https://x.com/iruletheworldmo/status/1910342414944137344

🍓🍓🍓 (@iruletheworldmo) on X

what we getting from openai? leave a comment ill patience chair.

#

what would agent gpt do?

#

like agentspace

keen beacon Apr 10, 2025, 2:44 PM

#

ew not that grifter

balmy mist Apr 10, 2025, 2:44 PM

#

lmaoo

keen beacon Apr 10, 2025, 2:44 PM

#

i hate iruletheworld

#

😭

brittle tiger Apr 10, 2025, 2:45 PM

#

keen beacon they'll have their spies running back to the china hq once oai crack it wink win...

There's little stopping China. They got tpuv6 specs over a year ago which are like crown jewels level secret

balmy mist Apr 10, 2025, 2:45 PM

#

the votes are real tho

#

idc about what he posts, but the reactions show what people are thinking

keen beacon Apr 10, 2025, 2:45 PM

#

beyond oai/deepmind, the chinese frontier labs are probably after them tbh. skill diff and ingenuity tbh. unpopular opinion probably but i dont see anthropic getting there faster than chinese frontier labs (deepseek and qwen)

keen beacon Apr 10, 2025, 2:45 PM

#

brittle tiger There's little stopping China. They got tpuv6 specs over a year ago which are li...

yeah they don't let silly things like ethics stop them

#

i respect the dedication

keen beacon Apr 10, 2025, 2:45 PM

#

keen beacon beyond oai/deepmind, the chinese frontier labs are probably after them tbh. skil...

i think it'll be oai > gdm > anthropic > deepseek

#

the last two can probably be swapped at will

sonic tendon Apr 10, 2025, 2:45 PM

#

wait, what's gdm?

keen beacon Apr 10, 2025, 2:45 PM

#

sonic tendon wait, what's gdm?

google deepmind

sonic tendon Apr 10, 2025, 2:45 PM

#

oh google

keen beacon Apr 10, 2025, 2:45 PM

#

anthropic has done zero public image gen work i think

sonic tendon Apr 10, 2025, 2:46 PM

#

keen beacon anthropic has done zero public image gen work i think

yeah dario made a good point about this

#

very little real commercial application

keen beacon Apr 10, 2025, 2:46 PM

#

yeah anthropic are very focused on just llms really

sonic tendon Apr 10, 2025, 2:46 PM

#

and imo it's a bad look

keen beacon Apr 10, 2025, 2:46 PM

#

sonic tendon very little real commercial application

for now. but i think this will be a critical aspect in the future

#

openai are the most innovative lab imo

#

they are normally the ones "to follow"

#

multimodal cot will be important

#

which might include native image gen

#

(then they normally get beat at their own game... but hey, that's the spirit)

sonic tendon Apr 10, 2025, 2:47 PM

#

i sort of dislike the oAI leadership

sonic tendon Apr 10, 2025, 2:47 PM

#

keen beacon (then they normally get beat at their own game... but hey, that's the spirit)

win imo

keen beacon Apr 10, 2025, 2:47 PM

#

openai's corporate structure is ridiculous

#

it's designed to be convoluted

oblique flint Apr 10, 2025, 2:47 PM

#

I dont get why people are hyping up google so much tbh. Do you really want a future where google controls AI as well in addition to already pretty much controlling the internet?

keen beacon Apr 10, 2025, 2:47 PM

#

their board also has some big conflicts of interest

sonic tendon Apr 10, 2025, 2:48 PM

#

oblique flint I dont get why people are hyping up google so much tbh. Do you really want a fut...

no, but they keep burning money to give us free stuff

thorny drum Apr 10, 2025, 2:48 PM

#

novel flame Another thing: how is this not market manipulation?

wut

keen beacon Apr 10, 2025, 2:48 PM

#

2.5 pro has a jan 2025 cut off. (they cont. pretrained 2.0 pro/etc) this means the 2.5 pro timeline is absolutely absurd lol (1-2 months)

thorny drum Apr 10, 2025, 2:48 PM

#

what market is he manipulating

oblique flint Apr 10, 2025, 2:48 PM

#

sonic tendon no, but they keep burning money to give us free stuff

for now, they'll quickly change once they establish the monopoly

keen beacon Apr 10, 2025, 2:48 PM

#

i doubt any lab will be able to form a monopoly rn

torn mantle Apr 10, 2025, 2:49 PM

#

So the new feature is either memory improvements or that thinking slider

sonic tendon Apr 10, 2025, 2:49 PM

#

oblique flint for now, they'll quickly change once they establish the monopoly

i plan on riding on their free stuff and then immediately switching to other providers when/if i need to scale up

#

i will not pay google a dollar for api credits

sonic tendon Apr 10, 2025, 2:49 PM

#

keen beacon i doubt any lab will be able to form a monopoly rn

yeah everyone copies everyone pretty quickly atp

torn mantle Apr 10, 2025, 2:49 PM

#

sonic tendon i will not pay google a dollar for api credits

Why

sonic tendon Apr 10, 2025, 2:49 PM

#

no moats last for long

sonic tendon Apr 10, 2025, 2:50 PM

#

torn mantle Why

because google

tall summit Apr 10, 2025, 2:50 PM

#

keen beacon i doubt any lab will be able to form a monopoly rn

i loooooove that about the ai space

brittle tiger Apr 10, 2025, 2:50 PM

#

sonic tendon very little real commercial application

there's major production companies using veo 2 right now which was born out of imagegen

sonic tendon Apr 10, 2025, 2:50 PM

#

sonic tendon yeah everyone copies everyone pretty quickly atp

and this seems to keep getting more and more true

sonic tendon Apr 10, 2025, 2:51 PM

#

brittle tiger there's major production companies using veo 2 right now which was born out of i...

i may stand corrected

keen beacon Apr 10, 2025, 2:51 PM

#

its not just copying. companies are neck and neck in research i think

sonic tendon Apr 10, 2025, 2:51 PM

#

there is little commercial application

leaden palm Apr 10, 2025, 2:51 PM

#

what's launching today?
so far i know

- qwen 3
- something from openai (optimus on OR and/or o3/o4 models)
- maybe a google model

keen beacon Apr 10, 2025, 2:51 PM

#

qwen3 isnt happening today

sonic tendon Apr 10, 2025, 2:51 PM

#

leaden palm what's launching today? so far i know - qwen 3 - something from openai (optimus ...

didn't one of the qwen researchers dispute this

keen beacon Apr 10, 2025, 2:52 PM

#

torn mantle So the new feature is either memory improvements or that thinking slider

lol no

leaden palm Apr 10, 2025, 2:52 PM

#

hm

oblique flint Apr 10, 2025, 2:52 PM

#

2.5 flash just didnt launch yesterday for whatever reason

keen beacon Apr 10, 2025, 2:52 PM

#

it is models

#

i am telling you mow

#

now

#

he just worded it weirdly

leaden palm Apr 10, 2025, 2:52 PM

#

leaden palm hm

ok

keen beacon Apr 10, 2025, 2:52 PM

#

memory improvements would be incredibly silly to hype up when they're having their lunch ate

brittle tiger Apr 10, 2025, 2:53 PM

#

it's for sure models. metadata and amount of hype in tweet is basically confirmation

sonic tendon Apr 10, 2025, 2:53 PM

#

metadata? but yeah agreed

keen beacon Apr 10, 2025, 2:53 PM

#

i think that bindureddy lady is off her rocker tbh

#

if Sama can't sleep because he's so excited over memory improvements he must be more autistic than i thought

#

lol

brittle tiger Apr 10, 2025, 2:54 PM

#

sonic tendon metadata? but yeah agreed

https://x.com/btibor91/status/1910237861674353108

Tibor Blaho (@btibor91) on X

New ChatGPT web app version deployed just now (after midnight San Francisco time) adds mentions of "o4-mini", "o4-mini-high" and "o3"

keen beacon Apr 10, 2025, 2:54 PM

#

keen beacon if Sama can't sleep because he's so excited over memory improvements he must be ...

im wondering if theyll bunch quasar (updated 4o) with the releases

#

it would make aense if o3 is based on it

#

sense

sonic tendon Apr 10, 2025, 2:55 PM

#

keen beacon i think that bindureddy lady is off her rocker tbh

honestly, the confidence and cadence i see in her tweets does sort of remind me of the stuff people write when they're hypomanic

keen beacon Apr 10, 2025, 2:55 PM

#

sense

sonic tendon Apr 10, 2025, 2:55 PM

#

i feel like a lot more CEOs are on the bipolar spectrum than people realize lol

keen beacon Apr 10, 2025, 2:56 PM

#

most tech ceos are probably on some kind of spectrum haha

sonic tendon Apr 10, 2025, 2:56 PM

#

yeah autism and adhd too

#

they have such a huge adaptive advantage in tech jobs, it's crazy

fleet lintel Apr 10, 2025, 2:56 PM

#

oblique flint I dont get why people are hyping up google so much tbh. Do you really want a fut...

compared to what every other big tech has done with their lead, Google is the best option that we have (IMO)

keen beacon Apr 10, 2025, 2:57 PM

#

i like my guys a little nerdy and/or autistic and/or with adhd

#

:3

sonic tendon Apr 10, 2025, 2:57 PM

#

same

#

i do definitively have adhd

keen beacon Apr 10, 2025, 2:57 PM

#

keen beacon it would make aense if o3 is based on it

ya but its a big release. they did the whole anon model on or thing, its an updated 4o with 1m context, large improvements, api dated probably, super fast, etc. they could dedicate a single announcement with it

#

they definitely could release it tomorrow via blog post so that the stream isnt too intense

brittle tiger Apr 10, 2025, 2:58 PM

#

oblique flint I dont get why people are hyping up google so much tbh. Do you really want a fut...

all that really matters to me is the leading AI coming from democratic nations. i don't really see much ethical difference in google vs the worldcoin founder who successfully drove out all the ai safety ppl

tall summit Apr 10, 2025, 2:58 PM

#

sonic tendon i do definitively have adhd

i wish i could get the hypothesis that i have adhd confirmed/denied but i can't

leaden palm Apr 10, 2025, 2:58 PM

#

optimus alpha is now on openrouter, a reasoner most likely from openai

keen beacon Apr 10, 2025, 2:58 PM

#

woah

#

its o4 mini ✅

tall summit Apr 10, 2025, 2:58 PM

#

optimus 💪

keen beacon Apr 10, 2025, 2:58 PM

#

just guessing lol

leaden palm Apr 10, 2025, 2:58 PM

#

NVM

#

NVM

balmy mist Apr 10, 2025, 2:58 PM

#

how many deep research with gemini do you get a day?

leaden palm Apr 10, 2025, 2:59 PM

#

i dont think it reasons

tall summit Apr 10, 2025, 2:59 PM

#

leaden palm NVM

?

leaden palm Apr 10, 2025, 2:59 PM

#

it just had a high latency

sonic tendon Apr 10, 2025, 2:59 PM

#

sonic tendon they have such a huge adaptive advantage in tech jobs, it's crazy

"hey lemme just absorb myself into neovim for 7 hours straight without taking any breaks" describes like a third of my autistic friends lol

leaden palm Apr 10, 2025, 2:59 PM

#

so i thought it reasoned

#

just another gpt 4o variant

tall summit Apr 10, 2025, 2:59 PM

#

LMAO

leaden palm Apr 10, 2025, 2:59 PM

#

sorry guys

keen beacon Apr 10, 2025, 2:59 PM

#

no it's not a reasoner

oblique flint Apr 10, 2025, 2:59 PM

#

brittle tiger all that really matters to me is the leading AI coming from democratic natio...

yes I agree, but not gonna lie, my perception of the US being a democratic nation has changed somewhat after.. uh certain events

keen beacon Apr 10, 2025, 2:59 PM

#

got my first question wrong that chatgpt 4o latest gets right

#

damn this sucks

#

☠️

sonic tendon Apr 10, 2025, 2:59 PM

#

oblique flint yes I agree, but not gonna lie, my perception of the US being a democratic natio...

democracy and demagogues

keen beacon Apr 10, 2025, 3:00 PM

#

yup its another gpt 4o with the new base model (june 2024)

sonic tendon Apr 10, 2025, 3:00 PM

#

keen beacon got my first question wrong that chatgpt 4o latest gets right

wait what

keen beacon Apr 10, 2025, 3:00 PM

#

i cant believe they dropped another version

sonic tendon Apr 10, 2025, 3:00 PM

#

oh right that one

tall summit Apr 10, 2025, 3:00 PM

#

keen beacon got my first question wrong that chatgpt 4o latest gets right

you can't just say "my first question" that makes me curious
for no reason even though it doesn't matter i'm not an ai

sonic tendon Apr 10, 2025, 3:00 PM

#

i'm gonna do the jar test

keen beacon Apr 10, 2025, 3:00 PM

#

haven't tested it with code yet

balmy mist Apr 10, 2025, 3:00 PM

#

o4 mini high should be the best all around model if we see history, but it has to be way cheaper than 2.5 and faster, bc its not gonna be on the same IQ level obviously, but it can be slightly worse than like 3.7 and be fast af and cheaper than 2.5, then we have a winner

keen beacon Apr 10, 2025, 3:00 PM

#

perhaps that's where it improves

sonic tendon Apr 10, 2025, 3:01 PM

#

tall summit i wish i could get the hypothesis that i have adhd confirmed/denied but i can't

how so? just bad mental health systems in your area

#

?

fleet lintel Apr 10, 2025, 3:01 PM

#

balmy mist o4 mini high should be the best all around model if we see history, but it has t...

But we haven't seen anything better on LMArena? did they not release it here?

balmy mist Apr 10, 2025, 3:01 PM

#

fleet lintel But we haven't seen anything better on LMArena? did they not release it here?

ahh true

sonic tendon Apr 10, 2025, 3:02 PM

#

sonic tendon i'm gonna do the jar test

womp womp (bad result)

fleet lintel Apr 10, 2025, 3:02 PM

#

i have a feeling that it is probably comparable to 2.5 but 1M context

tall summit Apr 10, 2025, 3:02 PM

#

sonic tendon how so? just bad mental health systems in your area

pretty much

keen beacon Apr 10, 2025, 3:02 PM

#

sonic tendon womp womp (bad result)

wait did you get me to try this with o3

torn mantle Apr 10, 2025, 3:02 PM

#

balmy mist how many deep research with gemini do you get a day?

20

sonic tendon Apr 10, 2025, 3:02 PM

#

keen beacon wait did you get me to try this with o3

yeah

#

it's in #share-prompts

keen beacon Apr 10, 2025, 3:02 PM

#

im gonna run gpqa diamond and math 500 on it

#

but im pretty sure its another gpt 4o lol

brittle tiger Apr 10, 2025, 3:03 PM

#

the $1200 per question o3 high model would be really cool to have public for big research problems

fleet lintel Apr 10, 2025, 3:03 PM

#

torn mantle 20

no one needs to do more than 20 deep researches in a day 🙂

brittle tiger Apr 10, 2025, 3:04 PM

#

fleet lintel no one needs to do more than 20 deep researches in a day 🙂

i did close yesterday but am mostly testing

sonic tendon Apr 10, 2025, 3:04 PM

#

brittle tiger the $1200 per question o3 high model would be really cool to have public for big...

deepseek is probably gonna replicate it at 1/3000 the cost in at most a year

keen beacon Apr 10, 2025, 3:04 PM

#

sonic tendon yeah

was that the one where it got it wrong first but adding "think carefully" it got it right

sonic tendon Apr 10, 2025, 3:05 PM

#

keen beacon was that the one where it got it wrong first but adding "think carefully" it got...

oh, i have no idea

#

never tried that, actually

#

#share-prompts message

brittle tiger Apr 10, 2025, 3:05 PM

#

sonic tendon deepseek is probably gonna replicate it at 1/3000 the cost in at most a year

i don't think price would be that starkly different for inference.

sonic tendon Apr 10, 2025, 3:06 PM

#

brittle tiger i don't think price would be that starkly different for inference.

maybe not, but R1 was already ~1/50 the cost of o1 with pretty similar performance

torn mantle Apr 10, 2025, 3:07 PM

#

fleet lintel no one needs to do more than 20 deep researches in a day 🙂

I DO

keen beacon Apr 10, 2025, 3:10 PM

#

sonic tendon https://discord.com/channels/1340554757349179412/1343302058929033216/13587819770...

One workable way is to hide the information from yourself before you ever have a chance to see it.

1. Prepare the record.
• Tear the sheet of paper into two slips of exactly the same size.
• On one slip write “GREEN”, on the other write “RED”.
• Fold each slip so the writing is completely hidden and the two folded slips look and feel identical.
2. Load the jars.
• Put the folded “GREEN” slip into the green jar.
• Put the folded “RED” slip into the red jar.
3. Remove your ability to distinguish the jars.
• Close your eyes (or blind‑fold yourself).
• With your eyes shut, move the two jars around on the table for a few seconds until you have genuinely lost track of which is which.
4. Make the selection blindly.
• Still blindfolded, pick up either jar at random.
• Take the ball and the folded slip out of that jar and put both straight into your pocket without opening the slip.
• Return the now‑empty jar to the table and, still blindfolded, shuffle the two jars again for a few seconds.
5. Open your eyes.
• You now see two jars, one green and one red, but because of the blind shuffling you no longer have any idea which was the one you emptied.

Result:
• The ball is in your pocket.
• The folded slip in your pocket accurately states the colour of the jar it came from.
• You yourself never learned (and still don’t know) which jar you chose.

Anyone who later opens the folded note will learn the correct answer, but you remain ignorant, satisfying all the conditions.

#

o3

#

lmk how it did

thorny drum Apr 10, 2025, 3:10 PM

#

hmm do people have o3 full yet?

sonic tendon Apr 10, 2025, 3:11 PM

#

keen beacon One workable way is to hide the information from yourself before you ever have a...

yeah that's perfect

calm sequoia Apr 10, 2025, 3:11 PM

#

sonic tendon yeah that's perfect

How does it compare to 2.5 pro?

sonic tendon Apr 10, 2025, 3:11 PM

#

seems like both o3 and 2.5 pro get it, just intermittently

keen beacon Apr 10, 2025, 3:12 PM

#

i presume o3 mini and o1 did not?

#

for that guy who said he was gonna look into ai and roblox

#

https://devforum.roblox.com/t/free-roblox-mistral-7b-ai-chatbot-agent-aware-infinite-agents-2000-emojis-100-emotes-memories-wikipedia-32k-context-open-sourced/3034325

Developer Forum | Roblox

[FREE] ROBLOX Mistral 7b AI Chatbot Agent: Aware, Infinite Agents, ...

Chatbot Demo.rbxl (657.1 KB) This is a place file that is already set up, requires only that you: Get your Free API token from Huggingface profile settings and insert it here as shown in the image! Publish the place and make sure you have HTTP requests enabled! To Disable a feature you can set the variable to false! You’re maybe thinkin...

#

forgot what he was gonna do with it

#

look at this

torn mantle Apr 10, 2025, 3:13 PM

#

whats this

#

https://openrouter.ai/openrouter/optimus-alpha

Optimus Alpha - API, Providers, Stats

This is a cloaked model provided to the community to gather feedback. It's geared toward real world use cases, including programming. Run Optimus Alpha with API

sonic tendon Apr 10, 2025, 3:13 PM

#

leo tested it one other time and it got a plausible-sounding but slightly incorrect answer

keen beacon Apr 10, 2025, 3:13 PM

#

wait this might be a new gpt 4o mini

#

optimus alpha

#

it's just 49

#

4o

#

another update

#

gpqa diamond: 60.10% (maybe there might be something wrong with my evaluation framework/answer parsing)
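
Answer parsing is an easy place for a multiple-choice eval to lose points. A minimal sketch of one way to do it, assuming responses end with something like "Answer: C" or "the answer is (B)"; the regex and function names are hypothetical and this is not the framework used for the numbers above.

```python
import re
from typing import Optional, Sequence

# Hypothetical pattern: "answer" (any case), optional "is" or ":",
# then an uppercase A-D, optionally in parentheses, not followed by a letter.
_ANSWER_RE = re.compile(r"(?i:answer\s*(?:is|:)?\s*)\(?([ABCD])\)?(?![A-Za-z])")


def extract_choice(response: str) -> Optional[str]:
    """Return the last A-D choice stated in the response, or None if unparsed."""
    matches = _ANSWER_RE.findall(response)
    return matches[-1] if matches else None


def accuracy(responses: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of responses whose parsed choice equals the gold letter.

    Unparsed responses count as wrong, so a parser bug shows up as lost points.
    """
    if len(responses) != len(gold) or not gold:
        raise ValueError("responses and gold must be non-empty and equal length")
    hits = sum(extract_choice(r) == g for r, g in zip(responses, gold))
    return hits / len(gold)
```

Counting how many responses come back as `None` is a quick check on whether a low score is the model or the parser.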

sonic tendon Apr 10, 2025, 3:13 PM

#

keen beacon i presume o3 mini and o1 did not?

o1 did, actually

#

i think it's the only model from before 2025 that got it right

leaden palm Apr 10, 2025, 3:14 PM

#

keen beacon wait this might be a new gpt 4o mini

no full 4o

keen beacon Apr 10, 2025, 3:14 PM

#

keen beacon gpqa diamond: 60.10% (maybe there might be something wrong with my evaluation fr...

that's quite the jump

#

it dropped 7 points

keen beacon Apr 10, 2025, 3:14 PM

#

sonic tendon i think it's the only model from before 2025 that got it right

what reasoning effort?

keen beacon Apr 10, 2025, 3:14 PM

#

keen beacon it dropped 7 points

oh nvm i was using the weong point of comparison ☠️

#

wrong*

sonic tendon Apr 10, 2025, 3:14 PM

#

keen beacon what reasoning effort?

not sure, whatever the lmarena default is for o1-2024-12-17

brittle tiger Apr 10, 2025, 3:15 PM

#

Predictions on how long until a model can solve this?

https://x.com/SpencerKSchiff/status/1910106368205336769

Spencer Schiff (@SpencerKSchiff) on X

I drew this today. None of the frontier models come anywhere close to matching the correct name to each person. I feel like this is a pretty good visual test so I’m looking forward to trying it with future models.

balmy mist Apr 10, 2025, 3:17 PM

#

keen beacon https://devforum.roblox.com/t/free-roblox-mistral-7b-ai-chatbot-agent-aware-infi...

nice!!!

sonic tendon Apr 10, 2025, 3:17 PM

#

have you guys seen https://mcbench.ai

MC-Bench

Evaluating AI with Minecraft

torn mantle Apr 10, 2025, 3:18 PM

#

yea this seems like openai model

keen beacon Apr 10, 2025, 3:19 PM

#

im running math 500 rn, ill review gpqa diamond samples a little after this

keen beacon Apr 10, 2025, 3:19 PM

#

sonic tendon have you guys seen https://mcbench.ai

yeah 2.5 pro literally murders every other model

#

if it is 4o mini 60% gpqa diamond is impressive, otherwise something is borked (massive degradation)

#

it's actually wack

sonic tendon Apr 10, 2025, 3:19 PM

#

keen beacon yeah 2.5 pro literally murders every other model

except for sonnet?

keen beacon Apr 10, 2025, 3:19 PM

#

even sonnet

sonic tendon Apr 10, 2025, 3:19 PM

#

o1 does surprisingly bad, also

keen beacon Apr 10, 2025, 3:20 PM

#

2.5 pro base model diff

#

it just seems to have next level spatial understanding

sonic tendon Apr 10, 2025, 3:20 PM

#

are we looking at the same bench?

balmy mist Apr 10, 2025, 3:20 PM

#

torn mantle https://openrouter.ai/openrouter/optimus-alpha

is this better than quasar?

sonic tendon Apr 10, 2025, 3:20 PM

#

not a ton of votes tho

keen beacon Apr 10, 2025, 3:20 PM

#

sonic tendon are we looking at the same bench?

ignore the leaderboard

#

start rating

#

you'll see what i mean

sonic tendon Apr 10, 2025, 3:20 PM

#

ah, ok

#

you can click on the models in the leaderboards to see their builds

#

you could try o3-medium w/ some of these, although setting it up for self-hosting seems like it wouldn't really be worth the effort

sonic tendon Apr 10, 2025, 3:22 PM

#

keen beacon you'll see what i mean

doggamn

#

goddamn

keen beacon Apr 10, 2025, 3:22 PM

#

hmm optimus could be an updated 4o mini

#

im still running math 500 :\

sonic tendon Apr 10, 2025, 3:23 PM

#

i wanna see the "abstract mathematical concept" build, sonnet's impression of that was really cool

night trout Apr 10, 2025, 3:23 PM

#

sonic tendon are we looking at the same bench?

Screenshot_2025-04-02_at_10.55.09_PM.png

#

Gemini 2.5 Pro kills Sonnet 3.7 99% of the time. It's always like this.

sonic tendon Apr 10, 2025, 3:24 PM

#

night trout

yeah, we were just talking about mcbench specifically

sonic tendon Apr 10, 2025, 3:24 PM

#

keen beacon you'll see what i mean

update: i think sonnet is still better in some of these

#

not publicly available yet

night trout Apr 10, 2025, 3:25 PM

#

Yeah, on MC-Bench 2.5 is usually a vast improvement over 3.7 too.

#

There are weird edge cases of course.

sonic tendon Apr 10, 2025, 3:25 PM

#

sonnet has some good builds tho

calm sequoia Apr 10, 2025, 3:25 PM

#

As I understand it, optimus is a new open source model from OpenAI

sonic tendon Apr 10, 2025, 3:25 PM

#

it's still gonna be a bit

keen beacon Apr 10, 2025, 3:25 PM

#

i doubt that the oss model is trained yet

brittle tiger Apr 10, 2025, 3:26 PM

#

sonic tendon are we looking at the same bench?

Their elo system is bad rn. Test them out. Sonnet is clearly 2nd but 2.5 is in class of its own

keen beacon Apr 10, 2025, 3:26 PM

#

okay yeah optimus is worse than quasar

#

its done worse on basically everything ive thrown at it

sonic tendon Apr 10, 2025, 3:26 PM

#

brittle tiger Their elo system is bad rn. Test them out. Sonnet is clearly 2nd but 2.5 is in c...

well, it doesn't have that many votes yet

#

kk, i gtg

keen beacon Apr 10, 2025, 3:26 PM

#

gpqa diamond: 60%
math 500: 89%

keen beacon Apr 10, 2025, 3:26 PM

#

sonic tendon kk, i gtg

cya

#

i wrote an eval framework myself lol

#

its part of the work im doing

#

quasar alpha:

gpqa diamond: 67.42%
math 500: 90%

optimus alpha:
gpqa diamond: 60%
math 500: 89%

march chatgpt 4o (measured by artificial analysis):
gpqa diamond: 65.5%
math 500: 89.3%

#

i only used 1 sample for optimus alpha tho (pass@1 estimated w 1 sample). quasar alpha used 4 samples for gpqa diamond and 1 sample for math 500 (pass@1 in both cases)
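
A rough sketch of how pass@1 can be estimated from n samples per question, plus the sampling noise on a single-sample run. It assumes per-question correct counts are available and that GPQA Diamond has 198 questions; the function names are made up for illustration.

```python
from math import sqrt
from typing import Sequence


def pass_at_1(correct_counts: Sequence[int], n_samples: int) -> float:
    """Estimate pass@1 as the mean over questions of (correct samples / n_samples)."""
    if n_samples <= 0 or not correct_counts:
        raise ValueError("need at least one question and one sample")
    return sum(c / n_samples for c in correct_counts) / len(correct_counts)


def standard_error(p: float, n_questions: int) -> float:
    """Binomial standard error of an accuracy p measured once on n_questions."""
    return sqrt(p * (1 - p) / n_questions)


# Toy example: 4 questions, 4 samples each, with 4, 2, 0 and 1 correct samples.
toy_estimate = pass_at_1([4, 2, 0, 1], n_samples=4)

# Noise on a single-sample 60% score, assuming 198 questions.
se_single_run = standard_error(0.60, 198)
print(f"toy pass@1: {toy_estimate:.4f}, standard error at 60%: {se_single_run:.3f}")
```

Under those assumptions the standard error on a one-sample 60% is roughly 3.5 points, so a 7-point gap between a 1-sample run and a 4-sample run is suggestive rather than conclusive.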

tall summit Apr 10, 2025, 3:29 PM

#

brittle tiger Predictions on how long until a model can solve this? https://x.com/SpencerKSch...

oh i like that one

keen beacon Apr 10, 2025, 3:29 PM

#

optimus must be mini

keen beacon Apr 10, 2025, 3:30 PM

#

keen beacon optimus must be mini

oh boy

#

o4 mini is gonna be awesome with a much stronger mini base model

balmy mist Apr 10, 2025, 3:31 PM

#

i just hope o4 mini is cheap like o3 mini

keen beacon Apr 10, 2025, 3:32 PM

#

keen beacon optimus must be mini

it is it seems

balmy mist Apr 10, 2025, 3:32 PM

#

is optimus better than quasar?

leaden palm Apr 10, 2025, 3:33 PM

#

why do you guys think quasar/optimus are mini models

#

just because they're fast?

#

plain 4o is fast too

keen beacon Apr 10, 2025, 3:33 PM

#

leaden palm why do you guys think quasar/optimus are mini models

quasar is updated 4o, optimus is 4o mini updated

#

4o mini just got continued pretraining with the new cut off too 👀

#

o4 mini is GONNA GO HARD lol

balmy mist Apr 10, 2025, 3:34 PM

#

wow optimus is fast af

#

wow

keen beacon Apr 10, 2025, 3:34 PM

#

Holy hell is that confusing

#

4o mini
o4 mini

keen fulcrum Apr 10, 2025, 3:34 PM

#

Hi
Nightwhisper released?

tall summit Apr 10, 2025, 3:35 PM

#

keen beacon 4o mini o4 mini

very

#

i've forgotten the difference at this point
why are there both
please more normal names

keen beacon Apr 10, 2025, 3:36 PM

#

keen fulcrum Hi Nightwhisper released?

afraid not

#

we're all still waiting lol

leaden palm Apr 10, 2025, 3:36 PM

#

keen beacon quasar is updated 4o, optimus is 4o mini updated

i havent seen much evidence though

novel flame Apr 10, 2025, 3:36 PM

#

keen beacon imagine if *meta* got there first 💀

Honestly I think this is the most likely scenario since LeCun is right about AGI and most other labs are too focused on scaling up autoregressive token yappers to truly deliver AGI. And you know what, despite all the shady things Meta have done, they are the reason we have a flourishing open source landscape in AI. That’s not nothing.

keen beacon Apr 10, 2025, 3:37 PM

#

leaden palm i havent seen much evidence though

sm1 ran aider + look at my benchmarks

leaden palm Apr 10, 2025, 3:37 PM

#

novel flame Honestly I think this is the most likely scenario since LeCun is right about AGI...

LeCope

ember rapids Apr 10, 2025, 3:38 PM

#

o4 mini high is about 3000 elo on codeforces

drifting thorn Apr 10, 2025, 3:39 PM

#

3000 elo???

keen beacon Apr 10, 2025, 3:39 PM

#

leaden palm LeCope

lmao

keen beacon Apr 10, 2025, 3:39 PM

#

drifting thorn 3000 elo???

yeah it's going to be one hell of a model for a lil guy

lime coral Apr 10, 2025, 3:39 PM

#

Reminder that this means nothing in real world use case

keen beacon Apr 10, 2025, 3:39 PM

#

good things come in small packages (does not apply to me btw)

sage raptor Apr 10, 2025, 3:40 PM

#

is this true ??

keen beacon Apr 10, 2025, 3:40 PM

#

block the guy if u cant see hes joking

drifting thorn Apr 10, 2025, 3:41 PM

#

keen beacon yeah it's going to be one hell of a model for a lil guy

Nah I mean what’s the current highest elo?

brittle tiger Apr 10, 2025, 3:41 PM

#

novel flame Honestly I think this is the most likely scenario since LeCun is right about AGI...

part of deepmind still seems very focused on RL

keen beacon Apr 10, 2025, 3:42 PM

#

man this 4o mini and o4 mini release is gonna be extremely exciting

keen beacon Apr 10, 2025, 3:42 PM

#

drifting thorn Nah I mean what’s the current highest elo?

3828

keen fulcrum Apr 10, 2025, 3:42 PM

#

R2 is expected to come out with Qwen 3

keen beacon Apr 10, 2025, 3:43 PM

#

3000 would make it the 67th best competitive coder in the world
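
To put the gap between 3000 and 3828 in perspective, here is the textbook Elo expected-score formula as a sketch. This is an assumption for illustration: Codeforces ratings are Elo-like, but its actual rating system differs in the details.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of player A against player B (0 to 1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Ratings taken from the chat: a claimed 3000 vs the quoted top rating of 3828.
underdog = expected_score(3000, 3828)
favourite = expected_score(3828, 3000)
print(f"3000 vs 3828: {underdog:.4f}, 3828 vs 3000: {favourite:.4f}")
```

Under plain Elo, the 3000-rated side would be expected to score under 1% head-to-head against the 3828-rated one, so there is still a large gap to the top even at 3000.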

#

hmmm

#

yea

#

u can see in simpleqa

#

with o1 mini simpleqa regressed but they figured it out in o3

#

its obviously not 4o based because 4o has more than double the simpleqa of o3 mini

#

it has the newest reasoning research in it

#

well up to now 😄

#

o4 mini is gonna be crazy

#

with the new much stronger 4o mini base model 😄

drifting thorn Apr 10, 2025, 3:45 PM

#

keen beacon 3828

Is 3.7 and 2.5 Pro tested on that benchmark?

keen beacon Apr 10, 2025, 3:45 PM

#

its 1m context

#

didnt u see?

#

im not sure about o4 mini though

drifting thorn Apr 10, 2025, 3:46 PM

#

Hope 1m context will be the new standard for LLM

#

And I think they should train LLMs for MRCR like tasks

brittle tiger Apr 10, 2025, 3:47 PM

#

ember rapids o4 mini high is about 3000 elo on codeforces

where'd you see this btw

keen beacon Apr 10, 2025, 3:47 PM

#

none when o4 mini gets released 😄

brittle tiger Apr 10, 2025, 3:47 PM

#

o1 better at harder coding and math

keen beacon Apr 10, 2025, 3:48 PM

#

no o1 isnt always better than o3 mini at math

#

o3 mini high is usually better

keen fulcrum Apr 10, 2025, 3:48 PM

#

Optimus Alpha worth trying? How good is it?
Comparable with 3.7 Sonnet?

drifting thorn Apr 10, 2025, 3:48 PM

#

Above said it’s worse than Quasar Alpha

brittle tiger Apr 10, 2025, 3:48 PM

#

i could be wrong but that was my sense from following top mathematicians who use llms

drifting thorn Apr 10, 2025, 3:49 PM

#

And I thought Quasar Alpha is trashy

keen beacon Apr 10, 2025, 3:49 PM

#

brittle tiger i could be wrong but that was my sense from following top mathematicians who use ll...

it depends what kind of problems u throw at it. 4o has more world knowledge than 4o mini

#

but see here: https://matharena.ai/

MathArena.ai

MathArena: Evaluating LLMs on Uncontaminated Math Competitions

#

o3 mini is generally better

brittle tiger Apr 10, 2025, 3:50 PM

#

USAMO is best bench there though bc others are judged by llms

ember rapids Apr 10, 2025, 3:50 PM

#

brittle tiger where'd you see this btw

A while ago, Sam said they have a model internally that ranks around 50th in competitive programming.

keen fulcrum Apr 10, 2025, 3:50 PM

#

(they use wolfram alpha under the hood)

ember rapids Apr 10, 2025, 3:50 PM

#

most likely refering to o4 mini

keen beacon Apr 10, 2025, 3:50 PM

#

brittle tiger USAMO is best bench there though bc others are judged by llms

no it isnt evaluated with llms

#

its answer match i think

#

beyond usamo

#

yea

#

it checks whether the answer matches
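
A minimal sketch of what answer-match grading can look like, assuming the model puts its final answer in a LaTeX `\boxed{...}`. This is an illustration, not MathArena's actual grader, and the helper names are hypothetical.

```python
from fractions import Fraction
from typing import Optional


def last_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, handling nested braces."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1
    out = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(out)
        out.append(ch)
    return None  # unbalanced braces


def answers_match(response: str, gold: str) -> bool:
    """Compare the boxed answer to gold: numerically if possible, else as strings."""
    pred = last_boxed(response)
    if pred is None:
        return False
    p, g = pred.strip(), gold.strip()
    try:
        return Fraction(p) == Fraction(g)
    except (ValueError, ZeroDivisionError):
        # Not plain numbers (e.g. LaTeX expressions): fall back to exact match
        # after removing spaces.
        return p.replace(" ", "") == g.replace(" ", "")
```

This only checks the final answer, which is the contrast with proof-style grading: a wrong derivation that lands on the right number still passes.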

brittle tiger Apr 10, 2025, 3:51 PM

#

i mean USAMO grades on how you reached solution with human reviewers and others dont. if im remembering right

keen beacon Apr 10, 2025, 3:51 PM

#

brittle tiger i mean USAMO grades on how you reached solution with human reviewers and others ...

yes but the others are still valid results

brittle tiger Apr 10, 2025, 3:52 PM

#

agreed but some people dont

drifting thorn Apr 10, 2025, 3:55 PM

#

Do you guys think LLMs in the future will be specialised again, like Anthropic for tool calls and coding, OpenAI for Maths and photo generation, and Gemini for long context and multimodal conversations

north vale Apr 10, 2025, 3:55 PM

#

https://www.theverge.com/news/646458/openai-gpt-4-1-ai-model

The Verge

OpenAI gets ready to launch GPT-4.1

o4 mini and o3 might also debut next week

keen beacon Apr 10, 2025, 3:55 PM

#

drifting thorn Do you guys think LLMs in the future will be specialised again, like Anthropic f...

no

drifting thorn Apr 10, 2025, 3:56 PM

#

WTF?

keen beacon Apr 10, 2025, 3:56 PM

#

north vale https://www.theverge.com/news/646458/openai-gpt-4-1-ai-model

..wtf

drifting thorn Apr 10, 2025, 3:56 PM

#

GPT 4.1 after GPT 4.5?

sage raptor Apr 10, 2025, 3:56 PM

#

lol

drifting thorn Apr 10, 2025, 3:56 PM

#

It is dumb

#

Just call it 4.7

keen beacon Apr 10, 2025, 3:57 PM

#

so this is gpt 4.1o and gpt 4.1 mini

#

wow

#

terrible names

#

oh for god sake

#

GPT-4.1, which one source describes as a revamped version of OpenAI’s GPT-4o multimodal model.
✅ i was right

leaden palm Apr 10, 2025, 3:58 PM

#

keen beacon oh for god sake

hop on archive.is

keen beacon Apr 10, 2025, 3:58 PM

#

"OpenAI is also readying the full version of its o3 reasoning model and an o4 mini version that could debut even sooner. AI engineer Tibor Blaho discovered references to o4 mini, o4 mini high, and o3 in a new ChatGPT web version earlier today, suggesting these additions are imminent. I understand o3 and o4 mini are both set to debut next week, unless OpenAI moves the launch plans around."

keen beacon Apr 10, 2025, 3:59 PM

#

leaden palm hop on archive.is

yeah i have an extension that fixed it dw

#

"OpenAI CEO Sam Altman teased on X that OpenAI would be launching an exciting feature today, but it’s not clear if this is related to the o3 and o4 mini references in ChatGPT or not. Sources caution that OpenAI has delayed the introduction of some new models recently due to capacity issues, so it’s possible for the new GPT-4.1 model introduction to slip beyond a planned debut next week. I asked OpenAI to comment on this story, but the company didn’t respond in time for publication."

#

well

#

so it's still up in the air what we're getting today

keen beacon Apr 10, 2025, 3:59 PM

#

keen beacon "OpenAI CEO Sam Altman teased on X that OpenAI would be launching an exciting fe...

capacity issues?

#

but it will likely be a model

keen beacon Apr 10, 2025, 3:59 PM

#

keen beacon capacity issues?

i presume in relation to 4o image gen

drifting thorn Apr 10, 2025, 3:59 PM

#

Go buy TPU from Google

keen beacon Apr 10, 2025, 3:59 PM

#

keen beacon i presume in relation to 4o image gen

oh i was wondering lol. they are serving new 4o and 4o mini for free 🤣

north vale Apr 10, 2025, 4:00 PM

#

drifting thorn Just call it 4.7

no bc it's smaller than 4.5

#

just way better trained

#

quasar alpha is probably new 4.1 mini
optimus alpha is probably new 4.1 nano
and there's some 4.1 they haven't tested

keen beacon Apr 10, 2025, 4:00 PM

#

jeez this naming is so bad

north vale Apr 10, 2025, 4:00 PM

#

ikr

keen beacon Apr 10, 2025, 4:00 PM

#

it's like they have a meeting when launching a new model and they agree on what the worst one is and that's the one they use

north vale Apr 10, 2025, 4:00 PM

#

but i mean

keen beacon Apr 10, 2025, 4:01 PM

#

they should just scrap it all and restart

north vale Apr 10, 2025, 4:01 PM

#

i think adding numbers after the version is like the right way to do it

keen beacon Apr 10, 2025, 4:01 PM

#

make it make logical sense instead

keen beacon Apr 10, 2025, 4:01 PM

#

north vale i think adding numbers after the version is like the right way to do it

well yeah

north vale Apr 10, 2025, 4:01 PM

#

gpt4, gpt4.1, gpt4.2, gpt4.3, wtv

keen beacon Apr 10, 2025, 4:01 PM

#

bro the model picker is going to get even worse

sonic tendon Apr 10, 2025, 4:01 PM

#

keen beacon jeez this naming is so bad

ikr??

keen beacon Apr 10, 2025, 4:01 PM

#

the average user is going to click that and have a seizure

drifting thorn Apr 10, 2025, 4:01 PM

#

Yes, it is poor naming

jaunty kraken Apr 10, 2025, 4:01 PM

#

Gotta screenshot that 404 screen and say we got early access

drifting thorn Apr 10, 2025, 4:02 PM

#

Lmao

balmy mist Apr 10, 2025, 4:02 PM

#

i am so confused man

#

forget these names

#

just gimmie new model

drifting thorn Apr 10, 2025, 4:02 PM

#

north vale no bc it's smaller than 4.5

Can 4.5 be better trained?

keen beacon Apr 10, 2025, 4:02 PM

#

drifting thorn Can 4.5 be better trained?

its too large and inefficient to work with

north vale Apr 10, 2025, 4:03 PM

#

it can be better trained, it was probably too large relative to the current optimal pretraining scaling laws, etc.

#

doesn't really make sense for 4.5 to exist

keen beacon Apr 10, 2025, 4:03 PM

#

u have to remain agile especially now

drifting thorn Apr 10, 2025, 4:03 PM

#

I don't think Gemini 2.5 Pro is a smaller model than DeepSeek R1

north vale Apr 10, 2025, 4:03 PM

#

like it has some utility but limited

keen beacon Apr 10, 2025, 4:03 PM

#

if the anon openrouter models are 4.1 i'm automatically disappointed lol

keen beacon Apr 10, 2025, 4:03 PM

#

keen beacon if the anon openrouter models are 4.1 i'm automatically disappointed lol

what else could they be lol

#

they aren't big enough of an improvement to justify a new model version num jump

north vale Apr 10, 2025, 4:03 PM

#

2.5 pro probably is decently large, has pretty similar price per token as grok 3 surprisingly

keen beacon Apr 10, 2025, 4:03 PM

#

keen beacon they aren't big enough of an improvement to justify a new model version num jump

ya they shouldve named them 4o and 4o mini

#

a new version of it

#

gpt-4.1o-latest-preview

#

openai style

north vale Apr 10, 2025, 4:04 PM

#

keen beacon what else could they be lol

there's 4.1, 4.1 mini and 4.1 nano! they exactly fit with nano and mini imo

torn mantle Apr 10, 2025, 4:04 PM

#

north vale https://www.theverge.com/news/646458/openai-gpt-4-1-ai-model

so more context

north vale Apr 10, 2025, 4:04 PM

#

optimus matches 4.1 nano imo

alpine coral Apr 10, 2025, 4:04 PM

#

here's those results for private along with some other oai models (same quiz, single pass)

keen beacon Apr 10, 2025, 4:04 PM

#

calling it 4.1 is also funnier because 4.5 already exists

keen beacon Apr 10, 2025, 4:05 PM

#

alpine coral here's those results for private along with some other oai models (same quiz, si...

yeah this is o3 medium so that's pretty damn good

drifting thorn Apr 10, 2025, 4:05 PM

#

I don't think there’s a use for “nano” when it’s not an open-sourced model

keen beacon Apr 10, 2025, 4:05 PM

#

north vale there's 4.1, 4.1 mini and 4.1 nano! they exactly fit with nano and mini imo

i mean the other guy said theyd be disappointed if its 4.1 its definitively 4.1 but we dont know which model is which

north vale Apr 10, 2025, 4:05 PM

#

keen beacon i mean the other guy said theyd be disappointed if its 4.1 its definitively 4.1 ...

my bad i meant to reply to him

drifting thorn Apr 10, 2025, 4:06 PM

#

At least it is suitable for research to test new algorithms for small models

keen beacon Apr 10, 2025, 4:06 PM

#

https://x.com/sama/status/1910363426972635455

Sam Altman (@sama) on X

a lot of people were interested in how we made GPT-4.5 and what comes next.

we did a podcast with alex paino, dan selsam, and @atootoon who helped drive the project.

full episode coming soon, but here are some interesting clips:

#

and also

Among the new AI models will be a release of what I’m expecting will be branded GPT-4.1, which one source describes as a revamped version of OpenAI’s GPT-4o multimodal model.

this is big boy gpt 4.1 (quasar/gpt 4o provenance), gpt 4.1 mini is optimus, nano we have yet to test

north vale Apr 10, 2025, 4:06 PM

#

yeah ig "it's 4.1" feels like it's talking about the full model not the smaller versions

north vale Apr 10, 2025, 4:06 PM

#

keen beacon and also > Among the new AI models will be a release of what I’m expecting will...

why do you say that? quasar alpha is worse than 4o I think?

#

my prior is 4.1 full would be better than 4o for sure

keen beacon Apr 10, 2025, 4:07 PM

#

wtf happened between now and when i last checked this

#

it has completely tanked

keen beacon Apr 10, 2025, 4:07 PM

#

north vale why do you say that? quasar alpha is worse than 4o I think?

did u see my benchmarks

north vale Apr 10, 2025, 4:08 PM

#

keen beacon did u see my benchmarks

yeah and nvm you're right I was dumb 4.1 being alpha makes sense

keen beacon Apr 10, 2025, 4:09 PM

#

https://x.com/sama/status/1910363838001869199

Sam Altman (@sama) on X

@slow_developer quasars are very bright things!

drifting thorn Apr 10, 2025, 4:09 PM

#

On what criteria do you guys think o4 mini will take the lead over Gemini 2.5 Pro?

keen beacon Apr 10, 2025, 4:09 PM

#

also worth looking at what he replied to

drifting thorn Apr 10, 2025, 4:10 PM

#

Gotta sleep

keen beacon Apr 10, 2025, 4:10 PM

#

https://x.com/sama/status/1910363747652432304

Sam Altman (@sama) on X

@SmokeAwayyy o3, o4-mini are not launching today, they come soon.

today is a new feature.

and it points, i hope, to something very important for the future of how AI will integrate in our lives.

#

well

#

that's a shame

drifting thorn Apr 10, 2025, 4:11 PM

#

Good night from UTC +08:00

north vale Apr 10, 2025, 4:14 PM

#

have ppl thought quasar alpha is better than current 4o?

#

why clown

leaden palm Apr 10, 2025, 4:15 PM

#

what did @misty vault mean by this

leaden palm Apr 10, 2025, 4:15 PM

#

north vale have ppl thought quasar alpha is better than current 4o?

yes

keen beacon Apr 10, 2025, 4:16 PM

#

bros just a hater

balmy mist Apr 10, 2025, 4:19 PM

#

lmaooo https://x.com/AskPerplexity/status/1910365852563915126

Ask Perplexity (@AskPerplexity) on X

Wyd if I show you this popup after answering your question?

fleet lintel Apr 10, 2025, 4:25 PM

#

keen beacon https://x.com/sama/status/1910363747652432304

ah... 😦

I think it's going to be another gimmick feature. or something completely irrelevant

balmy mist Apr 10, 2025, 4:25 PM

#

if it is then gg openai

keen beacon Apr 10, 2025, 4:25 PM

#

"memory improvements"

balmy mist Apr 10, 2025, 4:25 PM

#

cant be playing when google releasing heat

keen beacon Apr 10, 2025, 4:25 PM

#

💀

sonic tendon Apr 10, 2025, 4:27 PM

#

why not just start another line of models and call it Gemini 1 or g1 for short

#

whoops

#

20 minute message delay

keen beacon Apr 10, 2025, 4:27 PM

#

lmfao

leaden palm Apr 10, 2025, 4:33 PM

#

10am livestream again?

keen beacon Apr 10, 2025, 4:34 PM

#

probably

brittle tiger Apr 10, 2025, 4:37 PM

#

north vale have ppl thought quasar alpha is better than current 4o?

I just don't get why quasar has 1M context window when 4o has better eval at 120k

calm sequoia Apr 10, 2025, 4:40 PM

#

brittle tiger I just don't get why quasar has 1M context window when 4o has better eval at 120...

Source?

brittle tiger Apr 10, 2025, 4:41 PM

#

calm sequoia Source?

https://fiction.live/stories/Fiction-liveBench-Mar-25-2025/oQdzQvKHw8JyXbN87

Fiction.liveBench April 6 2025

Benchmarking AI Models for Long Context Comprehension

keen beacon Apr 10, 2025, 4:41 PM

#

brittle tiger I just don't get why quasar has 1M context window when 4o has better eval at 120...

they can flex needle in a haystack like meta 🤣

brittle tiger Apr 10, 2025, 4:42 PM

#

keen beacon they can flex needle in a haystack like meta 🤣

everyone gravitates to the window size without looking at evals for long context. doesn't really happen with any other element of these models

calm sequoia Apr 10, 2025, 4:44 PM

#

brittle tiger https://fiction.live/stories/Fiction-liveBench-Mar-25-2025/oQdzQvKHw8JyXbN87

Gratitude 🙏

keen beacon Apr 10, 2025, 4:44 PM

#

brittle tiger everyone gravitates to the window size without looking at evals for long context...

besides if u use the higher context window, you give them more money lol even if its massively degraded on a lot of tasks

torn mantle Apr 10, 2025, 4:46 PM

#

oai still have a long way to go in making a good coding model

#

im not talking about a reasoning coding model

#

but a model that is practically useful

#

all these benchmarks are impressive but it doesnt reflect real world case scenarios

brittle tiger Apr 10, 2025, 4:48 PM

#

2.5 long context is legit. I can summarize 900k token video files reliably when 1.5 pro and 2.0 flash didn't really come close

torn mantle Apr 10, 2025, 4:48 PM

#

codeforces is a good benchmark for reasoning but the majority arent always asking the model for algorithmic use cases

#

if the model is good at math then you would expect it to do good on codeforce benchmark

#

as its all just mathematical algorithms implementations

#

i still think openai didnt do enough RLHF on coding aesthetics/styling like anthropic and google

#

and i still dont understand whats the issue/constraint here

#

deepseek probably used that on their latest update

calm sequoia Apr 10, 2025, 4:54 PM

#

brittle tiger 2.5 long context is legit. I can summarize 900k token video files reliably when ...

How long is such video? Do you pass a real video file or transcript

brittle tiger Apr 10, 2025, 4:55 PM

#

hour long videos. not transcripts i need info on the screen from tickers and charts and stuff

balmy mist Apr 10, 2025, 4:56 PM

#

so when is this update coming?

#

also is google releasing anything today?

calm sequoia Apr 10, 2025, 4:56 PM

#

What could be your use case? So interesting! Are current models even capable of taking movie long videos as an input? I would have guessed no

brittle tiger Apr 10, 2025, 4:57 PM

#

calm sequoia What could be your use case? So interesting! Are current models even capable of ...

project on how markets react to financial cable news

sonic tendon Apr 10, 2025, 5:03 PM

#

leaden palm 10am livestream again?

wait, what timezone

keen beacon Apr 10, 2025, 5:04 PM

#

pst/pt i think

#

i guess no livestream today

#

it's literally just

#

memory of your past convos

#

it's so over

#

https://x.com/OpenAI/status/1910378768172212636

OpenAI (@OpenAI) on X

Starting today, memory in ChatGPT can now reference all of your past chats to provide more personalized responses, drawing on your preferences and interests to make it even more helpful for writing, getting advice, learning, and beyond.

#

Lmao

torn mantle Apr 10, 2025, 5:08 PM

#

keen beacon it's literally just

told ya

#

im always right

#

i mean not always but like 99.9%

leaden palm Apr 10, 2025, 5:10 PM

#

🗿

sage raptor Apr 10, 2025, 5:10 PM

#

disappointing

fleet lintel Apr 10, 2025, 5:11 PM

#

fleet lintel ah... 😦 I think it's going to be another gimmick feature. or something complet...

I knew it.. irrelevant.
Could have been a small tweet to announce this feature, rather than grandiose can't sleep crap

brittle tiger Apr 10, 2025, 5:13 PM

#

at least all the clueless ai hype accounts on twitter tweeting stuff like "o4-mini today 👀 " look dumb

keen beacon Apr 10, 2025, 5:14 PM

#

@keen beacon btw could it be o4 mini instead thats the private model?

#

it has a new base model but now we know of a mini version it could be that

torn mantle Apr 10, 2025, 5:14 PM

#

the only companies that are on the right path rn for AGI are google/anthropic/deepseek

#

they are all innovating and trying to understand how the model behaves internally

#

google is showing us that we still havent hit a wall

fleet lintel Apr 10, 2025, 5:15 PM

#

this constant overhype by OAI is getting on my nerves

torn mantle Apr 10, 2025, 5:15 PM

#

deepseek is innovating in hw & sw

#

anthropic are trying to understand the model black box

torn mantle Apr 10, 2025, 5:16 PM

#

fleet lintel this constant overhype by OAI is getting on my nerves

he always does that

leaden palm Apr 10, 2025, 5:18 PM

#

look at this ratio

fleet lintel Apr 10, 2025, 5:20 PM

#

leaden palm look at this ratio

I am not very familiar with these things
What does this ratio imply here?

keen beacon Apr 10, 2025, 5:20 PM

#

only available for Pro users starting today. "Soon" for Plus, nothing announced for Free. Not available in the EEA, UK, Switzerland, Norway and Liechtenstein

#

the new memory

#

lmfao

keen beacon Apr 10, 2025, 5:20 PM

#

keen beacon @keen beacon btw could it be o4 mini instead thats the private model?

possibly

barren prairie Apr 10, 2025, 5:21 PM

#

keen beacon only available for Pro users starting today. "Soon" for Plus, nothing announced ...

Ehhhhhh

brittle tiger Apr 10, 2025, 5:22 PM

#

it's not cool but memory is really important to openai and altman. the more lock-in they can get the less likely it is people will leave for smarter models. it's real dynamic rn. i know some ppl who don't care about 2.5. they just want their chatgpt memories. that strat works really well for keeping normies on board as long as gulf between models doesn't get too big.

keen beacon Apr 10, 2025, 5:22 PM

#

i for one absolutely do not want chatgpt to remember the things i tell it 😇

fleet lintel Apr 10, 2025, 5:22 PM

#

keen beacon only available for Pro users starting today. "Soon" for Plus, nothing announced ...

it's worse than I thought... gawwd

fleet lintel Apr 10, 2025, 5:23 PM

#

brittle tiger it's not cool but memory is really important to openai and altman. the more lock...

I dont think it will create any lock-in unless there is significant improvement in the responses because of this feature, which I highly doubt.

brittle tiger Apr 10, 2025, 5:24 PM

#

fleet lintel I dont think it will create any lock-in unless there is significant improvement ...

there already is an element of lockin before this. we're enthusiasts. normies won't move to a model that doesn't know them unless it's significantly better

tall summit Apr 10, 2025, 5:25 PM

#

what

#

what happened!!

fleet lintel Apr 10, 2025, 5:25 PM

#

By the way, I do think that getting it done correctly is a good feature but they should announce it properly

tall summit Apr 10, 2025, 5:25 PM

#

Starting today, memory in ChatGPT can now reference all of your past chats to provide more personalized responses, drawing on your preferences and interests to make it even more helpful for writing, getting advice, learning, and beyond.

#

thats the most boring feature ever

oblique flint Apr 10, 2025, 5:28 PM

#

o3 mini was a coding model wasnt it?