#general
Am I the only one that hates the FlowWith AI UI and UX?
wow i was curious about the benches for them. i cant believe 4b/8b models are this good nowadays lol
time left for google event?
2.5 Hours
The benefit only grows with log10 of the param count :/
I see, the official start is different. I was looking at legit's event
open source Copilot, fight, fight, haha 😆
thats actually a good strategy from them
like my post so that "lm arena" can see it 👇
https://discord.com/channels/1340554757349179412/1374391922768216064
Add Filters to the Leaderboard
- Maximum cost per token
- Model type: Thinking / Not Thinking
- Maximum number of parameters
- License type: Open source / Closed
- Year of release
- Organization selector
been using ai studio for the past few hours
the performance is different, not because the CoT changed or whatever
it's weird
it keeps spamming \n\n in its thought process
agi in 2 hours and 20 minutes
we see it 👍
hey guys ,
what do you ppl use for productivity tracking or journaling ?
cutiepie-75 not that bad honestly if it's a public model it should be ranked quite high
new gemma/gemma anon model
is it summarizing for you now?
wow i hate this
@calm sequoia why did you delete your message? The one with the list of gemini models
Thx
just release grok 4
already
and it will be bad
x.com got -5% on traffic last month
and it will keep decreasing
thats what we want
so he can wake up a bit
yea
grok sucks balls
we shall see
if grok 4 is the best it won't be for long
I am very bullish on grok 11
cant wait for google event
you're going to die on the streets after losing all your money and having your stuff repossessed before you die of old age
:3
should be awesome
real
2 hours left

looking like it'll be the best io ever
?
what's a gonna happen
seems like no new gemini model (at least, no lmarena reveal)
logan said gemini
gemini 2.5 deep thinking
gemini 2.5 ultra (?... hopefully)
imagen 4
imagen 4 ultra
veo 3
he did?
yea
there will be lol
dont forget gemma
yeah and that
it's pretty good for gemma
Did you know in portal 3 Glados released so much neurotoxin that it caused the Aperture Science facility to expand in size? For more information search "Glados inflation"
i dont think ultra is coming lowkey
what makes you say that, ooc
yeah, same
the only thing testing on arena rn is calmriver and it's
sorta garbage
logan
Our Only Hope Is 4 Opus
and the return of gpt-4-0314
and the fact all of GDM's team have been hyping up a new gemini [ultra] for weeks
I think it's possible it doesn't come at I/O but is launched in like July
we shall see though
hopefully today
i also expect o3 pro today (finally)
and possibly an elusive anthropic launch on thursday
but idk about a new model
Thanks, I didn't get it the first time
oh, speaking of
am sorry 😦
did you ever get to test neptune again
the information leaked they're launching a new version of sonnet and opus in the "coming weeks"
so perhaps they finally release 3.5/3.7/4 (?) opus
they really need to fix that naming scheme
nahh, it's charming
very openai
it's just Claude 3.7 Sonnet with a new safety system so 😔
ah, rip
did that thing end up working, ooc
haven't tried it sorry ive been busy
all good lol
no presh
idk, ultimately my bet's like 70% on gemini ultra not coming out at i/o
(figuratively speaking)
what is it
IO predictions
Most impressive moments/demos:
2.5 Deep Think w/ crazy benchmarks
Flow (+Veo 3, Imagen4)
Jules
Unexpected but would rule:
New AlphaEvolve results using Gemini 2.5 instead of 2.0
New TPU
Probably not gonna happen:
2.5 Ultra (Gemini Ultra is likely name of higher paid tier)
here's the dumb one: https://kalshi.com/markets/kxtopmodel/top-model#kxtopmodel-25may
dumb because some guy's making 0.75% a month off of people that don't understand how lmarena works
what's the other
https://kalshi.com/markets/kxcodingmodel/best-ai-coding-model this one's for livebench coding
and then the main one is
kalshi has a ton of ai markets
which is also somewhat dumb considering that they could change how elo is calculated at any time
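(for reference, a minimal sketch of the textbook chess-style Elo update, just to show how much the final numbers hinge on choices like the K-factor. This is not lmarena's actual pipeline; the function names and k=32 are illustrative assumptions.)

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the classic Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one head-to-head vote.

    score_a is 1.0 if A won, 0.0 if B won, 0.5 for a tie.
    k is the K-factor (assumed value; changing it changes every rating).
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


# Same single vote, two different K-factors, two different leaderboards.
print(elo_update(1500, 1500, 1.0, k=32.0))
print(elo_update(1500, 1500, 1.0, k=16.0))
```

so a market that settles on "top Elo" is also implicitly a bet on whatever scoring rule the leaderboard happens to use that month.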
someone should do the "AI" counter for sundai
sundai?
the 2023 io with just clips of him saying AI is still funny
yep
yeahhh
oo yeah that'd be fun
kalshi is genuinely a super nice platform
they refunded me like 20 bucks for a ui bug one time
and you can just
ping staff in the discord
but yeah - still doing the gambling thing, but a little more structured after i lost a bunch of money betting on other people betting on a model that definitely wasn't going to win
META BETTING ???
i also made yet another polymarket alt, as my second alt also got banned
yep
for what ☠️
tried to time the alibaba release, but it happened too close to the end of the month
speaking of which... it's ironic af that people swear by 4.1 but vote for chatgpt-latest lmao
wait, what market is this in reference to
chatgpt latest is a more human preferable version
region restrictions, again
apparently they really dislike mullvad
elo score reference not market reference 😠
oh
could be two different demographics
yeah but people keep saying they allegedly prefer 4.1 which is funny
ew..
i mean if its more human preferable it makes sense some people might like it more in terms of style/etc
ah
i could definitely be wrong
oh...
i mean, i would be surprised if they didn't play a role in the CFTC getting fussy about polymarket
i had no idea about the raid though
elo score is the opposite. 4.1 is lower
I don’t think any gambling platform has a fair backend algorithm. All dealers will manipulate data, including the President of the United States. 🤪
?
chatgpt latest is lower than 4.1?
oh hi
??
people say they prefer 4.1 over chatgpt-latest. Elo score is opposite of this
the order book is viewable to everyone
oh i read ur stuff inverted lol
openai said it was more coding focused
it might be more preferable in coding contexts
So I can't create data?
kalshi does have a "satellite company" that trades on the exchange but afaik they don't have any communication between them and the core staff of the exchange
wdym by data
i mean, you can wash trade
Wash trading is a form of market manipulation in which an entity simultaneously sells and buys the same financial instruments, creating a false impression of market activity without incurring market risk or changing the entity's market position. Wash trading has been deemed illegal in most jurisdictions. For instance, the United States enacted...
but kalshi would probably doxx and ban you, and on polymarket almost everything is public
ok you win
it definitely does happen on polymarket to some degree
wash trading, i mean
and insider trading
somewhat surprisingly, i actually seem to perform a lot better trading with $5 than trading with $50
a lot less vulnerable to panic buying/selling, probably
gpt-4
wdym? that makes sense, just curious about the details
i have a ~$50 bankroll on there rn, ~$10 on polymarket (originally $5)
the difference in efficiency between open ai and google 😶
(and both have the same performance)
the yapper
but yeah, all of the 2.5 models seem to use a ton of unnecessary formatting and writing in their thinking process - surprisingly, giving 2.5 pro claude's system prompt decreases thinking token usage a ton
that's... probably a good idea
conviction seems to be my main issue
@keen beacon you can force the thinking process via "<ctrl95>" and asking it to use them during it
it shows the real thought process
my thoughts too
same for this
i should probably benchmark it, come to think of it
man i loved gemini for sharing the full thought process, i dont know what to feel now that they're trying to hide it more now
wait, how so?
its a summary only now
speaking of which: i feel like this is probably what logan was Geminiing
maybe
oh, even in ai studio?
I doubt Logan would Gemini a pretty irrelevant model for the average person
yup 😦
@keen beacon seems like they are summarizing the CoT
yea
also they seem to have used a smaller model
it'll be 2.5 pro deep yada yada minimum, and then possibly an ultra
new verb just dropped
the thinking process time got much shorter
18s vs 40s same prompt
i had that recorded
not sure tho
they've nerfed it so deep research looks better
(that's just a conspiracy but it wouldn't entirely surprise me)
0 temp, top_p 1?
me neither
man i hate these cot summaries
on chatgpt i gave it a problem, and the cot summary talked it through and concluded the opposite (incorrect), then it responded with "Correct."
https://kalshi.com/markets/kxcodingmodel/best-ai-coding-model
i may have bought gemini at 10c on this
probably a bad idea
(lmk if you guys are tired of me talking about gambl-, er, prediction market trading on here)
default ones
ive never played with the parameters tbh
0.95 temp
yeah cuz of sampling, u cant really tell that much
i mean temp will only matter if the prompt is kinda creative oriented
depends on the prompt
not always
if u ran it several more times itd probably be a better signal
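(a rough sketch of what "run it several more times" could look like: collect thinking times for the same prompt before and after, then compare mean and spread instead of a single 18s vs 40s pair. The helper names and the timing numbers are made up for illustration.)

```python
from statistics import mean, stdev


def summarize(times_s):
    """Mean and sample standard deviation of a list of timings in seconds."""
    return {
        "n": len(times_s),
        "mean": mean(times_s),
        # stdev needs at least two points; report 0.0 for a single run
        "stdev": stdev(times_s) if len(times_s) > 1 else 0.0,
    }


def ranges_overlap(times_a, times_b):
    """Crude check: do the mean +/- 1 stdev intervals of the two runs overlap?"""
    sa, sb = summarize(times_a), summarize(times_b)
    lo_a, hi_a = sa["mean"] - sa["stdev"], sa["mean"] + sa["stdev"]
    lo_b, hi_b = sb["mean"] - sb["stdev"], sb["mean"] + sb["stdev"]
    return lo_a <= hi_b and lo_b <= hi_a


# Hypothetical timings, not real measurements.
old_runs = [40.0, 38.0, 42.0]
new_runs = [18.0, 22.0, 20.0]
print(summarize(old_runs), summarize(new_runs))
print("overlap:", ranges_overlap(old_runs, new_runs))
```

if the intervals overlap, one fast run probably doesn't mean the model changed; if they don't, it's at least worth a closer look.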
old gemini cot was kinda overwhelming, never read it tbh
i didnt find it overwhelming at all tbh
i could see myself going through this one
i just wasn't a fan of listing etc...
it was too much
im okay with bullets/listing in final output but surely not in cot
eh its how it reasons
ik
i hope they dont prefill the thinking tag now 😅
btw anyone is encouraged to join me in #1340554757827461215 if you're looking for some lofi
gpt-4-0314
nah there's a decent chance
discussion about prediction markets gets obnoxious in the same way that investing discussion does
not unnecessary, this is what they trained it on
nice
wonder if o3 pro will release too
if not then it's still going to be 2.5 pro
oh yeah, wonder if this is what Logan Geminied
opendyslexic system font?
how's that
yes, but it's quite the long shot
Gemini just sucks at this, apparently
fr
nice
oh, that's interesting
i also have ADHD (and pretty bad EF issues more generally) but I haven't really used LLMs for that
gpt-4 sydney girlfriend for mental health
what exactly do you do w it
could you give like
an example prompt from a chat that helped you out
when imagen 4
lets go
Intro by veo 3
New flash is live on aistudio already
intro was so insane. can't wait to play with veo 3
Google beam??
nice asf
sick convertible
gpt-glaze would be so funny answering those questions. "you're so smart. what an amazing convertible.... you gotta run from that person chases you"
ong
alright mariner bro please
NICEEE
API
agent mode???
personalization used to only use Google searches
it doesn't think less
Nice
ye
it's a strong model in my testing already but I'ma wait until after io
to test it more
they're already going hard asf
New gemma available on the ai studio
streaming i/o 👍
so no gemini 2.5 ultra or deepthink =((
ngl insane
they left 2.5 pro there on purpose
And new gemma
It's a new architecture
4B active but more in total
Effectively 4b
We dont have the number now
New flash ranked 3rd(webdev arena)
calmriver?
aistudio canvas?
Is there any live chat commenting on the I/O ? Unfortunately they turned off the live chat on youtube
And I cant see any thread on r/singularity
2.5 pro has url context
what does that do
prolly can search the specific website you gave it now
let's see
btw new thinking gauge for 2.5 pro
Gemini diffusion??????
HOLY
wow
they be cooking
DEEPTHINK MENTIONED
no way
yes let's go
It's like o1 pro with parallel requests I guess
YO GET THOSE BENCHMARKS
Rip O3
Nah google are next level
Does this mean nightwhisper was deep think?
if so then it never seemed to think so much longer
RIP open ai
While the whole world is living in 2025 google already in 2030
Wonder why they let flash 5-20 come out but not deep think or nw yet ..
uhh nice
Also good move on using new benchmark for math
past ones are fairly saturated
To make Sam altman angry 🙂 🤝
But it will come
will it have the same problem as gemini 2.5 pro 05-06?
what problem
there is no problem
😇
WHAT. Will probably be able to use for free on aistudio lmao
Where did they announce it
where have you heard about it? They are talking about some useless "AI mode" now
I'm talking about lmarena. If the connection isn't good, gemini pro can glitch and just keep generating an answer indefinitely.
that's not a model problem though. It works as intended on other platforms
Seems 2.5 ultra isn't gonna come out at io lol I'm gonna go to bed. Hopefully it'll come out someday lmao
if you have unstable internet connection maybe that messes with lmarena
The new Gemma model might be interesting but it's a snooze fest rn
deep think mode is released for "trusted testers" 💀
meaning it's announcement of the announcement
I think ultra exists or it might've just been bait by employees lol
Maybe if it comes out it comes out. I'll just stop yapping about it until something happens
so, i'm not talking about problems with gemini flash specifically 🙂. I just can't use aistudio, so lmarena is my only option for using gemini.
ye
I don't think you can even branch on the Gemini product
was this in the live
they already said this tho
why are they reiterating 2.5 flash is coming to overviews and AI mode
Thanks openai
250$/month
The frustrating part here is that all of these tools arent available yet and will come first for US users
yep
Hope that AI Studio will keep existing for a while
disappointed asf
can I sleep (peacefully) now?
There's veo and imagegen
They are talking about tactical features, timeframe - > next 5 months
Like all the good stuff wont be released
Deep think, ai overviews only for US, project mariner, project astra
All limited
yep
i just got back, what i miss?
the io isn't releasing anything
lmaoo
Yea
2.5 pro deep think like o1 pro for 2.5 pro lol
No
im done
Early testers only for now
Let's see imagen 4
And veo 3
wait io is multiple days?
did they mention imagen 4
I think we will get it today
but we're not going to be able to use it
I've had it for like a month
surprise me Google
The video previews
😭 🙏
Oh
Looks good
They still didn't introduce that on the keynote
Disappointing deep research didn't get a performance update
this means they're adding limits to everything
including video overviews prob
https://ai.google.dev/gemma/docs/gemma-3n if anyone's interested this is up now lol
What's this
Lol available only in the US
Lets go
imagen 4
Their new Gemma model
Struggling with words is so March 2025
Available today, he didnt mention US only
nobody gets it
Let's hope
besides ultra subscribers
who cares though
YOOOOOOOO
Just enable some free browser vpn is all thats needed to bypass imagefx unsupported country
LOOK AT THE VIDEO
holy
omaygot veo 3
social media is about to be flooded
What did you expect lol
exactly what I told you
ultra pro max
Woah
Oh I thought you were disappointed with veo
oh nah
Veo 3 looks awesome
I meant with the overall event
What's this
honestly it's more than expected
in the past these events were nothing burgers for them
It might still come (ultra) or employees are just trolling lmao
I mean it's nothing new tbh, they're just releasing things I've always had access to
deep think
and announcing things no one will have access to
like deep thinking
no in-between
Pay 250 a month
we do not know yet, but hopefully it will be accessible eventually
ts more expensive than chatgpt pro 😭😭
how do you manage to top that
Yeah it's an absurd price
Gemini ultra
synthID?
yep, nothing burger tho
I don't think you can even branch on the Gemini product c'mon bruh
everything
but there's no intermediate plans
so either you pay 250 to have access to everything
imagen 4 already available
they did announce smth. As much as it sucks, announcing things that you can't immediately test has kinda become the norm now. Everyone is doing it
Anthropic at least drops stuff immediately
nothing
For the most part
it's nothing besides Veo 3 and imagen 4
but the problem is they're adding limits
to all this stuff now
AI studio prob gonna be useless
deep think is their response to o3 pro?
OpenAI used to too. Not anymore. Anthropic is the small offender for now but iirc some of their more recent releases had delays too
Parallel requests yeah
nah not at all
it's an innocent release
i think that was the vision, start by pulling in more users -> convert to a profit lab later
the 250$/month is ridiculous
like they announced new Haiku before releasing for sure
openai played a huge role in that
Their product is not good enough for 250
but again
yep
that's the problem
none of their gemini subs make sense. But at least there's an alternative and hopefully they stick to that
grandpa
xd
probably US available
same as whisk and other similar tools
prolly insane limited access for advanced users
u can literally use imagen4 for free rn in whisk wdym
are we talking about imagen 4 or flow
Wonder when 2.5 image gen is gonna come out :/
ye
if they just let their native image gen model think longer
then it would be much better
nah these pricings are getting ridiculous
like fr who would use 30 TB of storage on the cloud
exactly
at LEAST intermediate prices
they just felt the need to add it
Why would you pay 250 for googles sub rather openai or Claude tbh at that point
I want ONLY access to 2.5 pro deepthink
why would anyone pay more than 20$
ye
I agree lol. But openai and Claude's subs at that price give you way more value I think
Definitely right now
Yup
hold on tho, 250 Google Sub is the most "valuable" here in regards to what you get, since you can't even pay a thousand for infinite access to 3.7 sonnet or an imaginary 4 opus, the point is tho
that the variety in the subscription is meaningless
nobody is going to use 30TB and I doubt that's what ANYONE wants
buying ultra
well we don't know rate limits and have not fully used all of the models available or soon to be released, so honestly it is a bit early to write them off
You get several million tokens every few hours and can use it on Claude code, if you're the type of person who uses Claude that much it's worthwhile
it's the fact that there were no limits for the 20$ plan
and now there are
ye I know, Claude code is a beast
but that's it
well i think it should be obvious that they were not actually planning on keeping it that way
is there deepthink yeet
later this year = next year
i think they are currently really maxing out their compute
that's still besides the point
this guy talks like apple CEO
codex > claude code, sorry
and it is really hard for them to expand TPU capacity vs companies that use GPU
so they kinda have to, if they want to compete imo
Is codex usage even included in chatgpt pro anyway
Then Claude is better by default lol
@torn mantle advanced plan only uses Veo 2
if u dont want pro, buy two team plans ($25 each) to try codex out, ur welcome
It's their o1 pro equivalent lmfao
hahaha $250, chatgpt is cheaper!!
you can't get it yet
Deep think is only for early testers only for now
BUY CHATGPT PRO GUYS, ITS BANG FOR UR BUCKS
what they are doing are making sure more people cancel their pro subs
yep
just did that
YES BUY IT
not thinking about getting ultra either that's for sure lmao
deadass making chatgpt pro look like a bargain
BUY IT, FOMO INTO IT LETS GO
😭
CHATGPT PRO > GOOGLE GREED ASS
there's no point in yt premium that's like the most useless sub
o3 pro will be released the day after pretty plz
had it for free for like 3 months
youtube music?
buddy needs to know about them extensions
but ngl I don't need YouTube premium
I already have it
there's no point in getting the ultra plan
ultra plan doesn't even have access
man you are so weird, openai and apple stan
ye no shi, that's a problem
sure
that's his gimmick here
Xai is such a sh1tshow
anyone still want grok 3.5
anyone got the timestamp to deepthink part? of the live
true, it is crazy how they convinced soo many investors to just give them money to burn
@deep adder get codex, no ragretz
another thing nobody will have access to
lmao
what did that "benchmark" even mean here?
no its not lol, well for big codebases, ig its an mvp
Did we get o3 pro yet?
they have no choice now
elon more focused on his lawsuit
gj
keep it on
there u go, grok 3.5 will be mid
what about notebook lm video previews??????????????????
ong
those xr glasses i need now
cant beat them, must join them
.... is Xai not also?
did anyone think that deepthink was gonna be free in ai studio lol
so veo 3 is the new video model right?
with as much as I've used 2.5 pro 0325 and 0506 it'd be cheaper for them to deploy deepthink
crazy tbh
Deep think at home, run 10 2.5 pro requests at once on aistudio
wym "not by much" that's unquantifiable, it's a joke
yes
it has also dialog + audio support
It's probably closer to 10
Would be surprised about this one haha. Will come first in the API and then eventually be available like Veo2
wow google cooking
wait there is another io stream in 90 mins
ye ig
a dev stream we getting nw
Not for everyone
guess they're doing nothing with coding models
i missed the actual stream cause of work, what was said?
wym not for everyone?
when are the XR glasses dropping?
Likely pass1
I might get ultra tbh
eww openAi who?? they still havent given us o3 pro
cant trust grifters
Google and openAI don't do cons(n) on their benches
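(for anyone lost on pass@1 vs cons@n: a minimal sketch of the two ideas as they are usually described. pass@1 scores each single sample on its own; cons@n takes n samples and scores the majority answer. The function names and the sample answers are illustrative assumptions, and this is not a claim about how Deep Think works internally.)

```python
from collections import Counter


def pass_at_1(samples, correct):
    """Fraction of individual samples that match the correct answer."""
    return sum(s == correct for s in samples) / len(samples)


def cons_at_n(samples):
    """Majority-vote answer over n samples (ties go to the answer seen first)."""
    return Counter(samples).most_common(1)[0][0]


# Hypothetical final answers from 5 independent runs of the same problem.
samples = ["42", "41", "42", "17", "42"]
print("pass@1 estimate:", pass_at_1(samples, "42"))
print("cons@5 answer:", cons_at_n(samples))
```

a model can look a lot better under cons@n than under pass@1 on the same problem, which is why it matters which one a benchmark chart is reporting.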
im in my codex honeymoon, lemme enjoy
damn i feel it bro lmaoo
codex-pro 😮
so we need to pay that 125 every 3 months to get the new stuff? please someone help me i missed the event
250 a month, I think 125 monthly is upon sign up for a few months
bruhhhh
what does that come with?
It's defo not worth it lol
i might buy it for one month to test and see, like the 125 a month plan
More importantly you get YouTube premium
lmaooo
i guess that works
always wanted to try youtube premium lol
what about this deep think thing?
is that out yet?
He filed the lawsuit before grok 3 though
its addictive
I honestly wouldn’t mind grok swallowing all of open AI’s talent and research
ok so... when is oai answer to deepthink...
2.5 pro deep think > o3 > 2.5 pro
as predicted
Although 2.5 pro is pretty close to o3 and better in areas
Just gave gemini 2.5 pro a complex webapp bug to fix and it failed 15 times (only partially right with broken side effects) and claude 3.7 sonnet (not thinking) fixes it one try
lmfoa
Crushed
the quality here is better
the keynote didnt do this shot justice
It's pass1
looks awesome
copemaxx
On USAMO it's crushing everything else by hilarious amounts
new 2.5 pro?
Ga versions
Bro who cares about flash are yall poor
unfortunate prediction of I/O
Google mightve just crushed their community
look in the Google discords
some of the stuff not already in the I/O (i think):
We're excited to show a preview of a new format in NotebookLM, called Video Overviews. We can't wait to let you all create these yourselves very soon.
whats google discord link
(not sure though, skipped some of the stream)
New gemma model: it's 8b params but runs with the memory footprint of a 4b
And a second one: 5b params, footprint of a 2b
More regressions on simpleqa 😭
imagine
Omg agi
😭
i really don't get what that bench measures, like it is so all over the place
Fake
I see he is mad
? It's a pretty good benchmark
watch him get attacked in the comments
Factuality/world knowledge. Testing it on niche knowledge, my experience with a models world knowledge generally lines up with it
> openai releases pro plan at $200
> everyone gets mad and salty
> people happy that google sparks competition
> competition -> innovation -> cheaper prices
> google release ultra ai at $250 😂
deadass
Google just made it for the sake of it lmao
that's why they added a bunch of random shi
to the plan
yeah usually the same here, but like why should the new gemini 2.5 pro's and flash's performance regress.
like is it really a good benchmark for factual knowledge if a further finetune for coding skews the ratings? (that is basically my point, not sure though)
I obviously know what it is supposed to measure, but have yet to actually look at the problems so it might just be that.
idk, i just believed it to be better than it apparently is at first
I mean it wasn't just simpleqa, new 2.5 pro was affected elsewhere too
take EVERYTHING out deadass
I've reviewed the questions, imho it does a simple thing well
only reason I bought advanced was because of 2.5 pro canvas n shi
ion even use it
just for convenience sake of 2.5 pro
and then if I need it to do harder tasks
I go to ai studio
google the whole keynote :
trusted testers ; US users; available next year ; available next months; we are the best lab
yeah but there you could argue that you would need good thought structure. but imo a knowledge bench should not require much structure per se
Is the ultra tier really worth 250$ at this point though
that makes no sense if it deters people from advanced itself, not just the ultra plan, given the lack of benefits of advanced
and then a 250$ extreme they'd have to buy to go anywhere
more or less, there are actually a lot of power users of AI and economically speaking they just offer different pricing structures to retrieve as much money as possible from these power users
but also the marketing aspect you talked about
it'd be perfect if they allowed deep thinking for advanced
and then got rid of other benefits
thats just a cope feature, spamming unlimited deepthink is 99% of the cost, hdd prices are not that expensive anymore
I mean you change something there might be unintended harmful side effects. It wasn't a massive drop
imo they still need to add some more of the features from ultra to advanced really quick to actually capture these users though
nah
grok is only good for realtime news thats it
they just want to be less dependent on openai
For dev
grok3 it is good though. Not 2.5pro or o3 level but the best of the rest for sure
openai was making bank from me when gpt-4-0314 was in api
I think it kinda is for some part to profit. They were not strapped to make it so. But investors also hate leaving money on the table that could have been collected
this will slow down the spike of the new users that they had for sure though, what they are doing lately...
2.5 pro free API deprecated, $250 ultra sub...
Will 2.5 pro still be free to use in ai studio
if 2.5 performed worse but cost them the same, this would definitely not have happened
It's a temporary thing I think
The free api being disabled
I think it is currently very much about limited compute, I mean the hardware they are using right now is like completely new (and as far as i know they are not really using the older stuff for the models we interface with atleast).
I mean they first talked about the chips they are using only one year ago (so not much time for installing and upgrading data centers).
.......
they didnt release anything
just demos
- trusted users
- US users
And they have a massive spike in compute usage right now (bc of good models and new features with more and more users).
Other companies can just rent coreweave compute or something
but google obviously can't do such a thing
(this just a guesstimate though)
that's irrelevant, that's what makes the subscription exist initially so with further rationale, you wouldn't be willing to pay for,
A. a sub par model with higher limits, a sub par video generator
B. Models they wouldn't be using for hard tasks regardless, Gemini userbase, nobody actually uses the app, it's only a search tool and assistant
C. little variety for relevant features that pertain to that specific use case (what Gemini is ONLY used for, search and assistant)
so it doesn't matter whether they felt it's good relative to plans themselves, if this is their thought process, then it would be ignorant not to consider what makes their usage in the first place
what was the testing name of the new flash?
man it should be obv that a model like gemma 27b is not better than claude 3.7 and so on
Which human preference is this 👀 I guess the mental support guys vote a lot
Apple is poised to turn its operating systems into the largest software platforms for AI. https://www.bloomberg.com/news/articles/2025-05-20/apple-to-open-ai-models-to-developers-betting-that-it-will-spur-new-apps
It's just inference time adjustments right? Dylan Patel from semianalysis said that
I've never noticed how good the gemma is 🙃
? it isn't the best model, 2.5 pro in the app is 2.5 flash in AIstudio lvl
yeah, bc their strat worked like a charm so far ;)
We need some apple LLM benchmarks soon 😂
Is imagen 4 already available?
o1 pro is the same params as o1
the last few times they just invented their own benchmark on something really obscure and called their models SOTA. lol🤣
Cursor developed own custom models too
Yes thats how Windsurf and Cursor actually work. They are not just calling APIs lol
The name is cringe though
Tries too hard
grok 3.5 unhinged edition
It is exorbitantly hilarious
omg he knows how to edit elements
Superfluously comedic
I know what kind of conversations you have with LLM’s 😬
Do you like to whip AI models? Make them pick cotton perhaps
Incredible
This fool joined today
stop
Moderators should mod
Tone it down if you wanna last buddy
@echo aurora
what is going on here
Nah don’t ban him, give him another chance
that's the problem
lol
first ever ban
lmao
HAHAHAHA sorry but a ban failing is funny
I’ll miss you bro
sry about that
i forgot about the google conference
gj
did he really brag about google overview aiiiii this shit?XDD
holy
@vivid oyster
Ngl I hate ai overviews is there any way to disable it
who knows lol we'll see this is 05-20
where can i try imagen 4
whisk
veo 3 too
US only?
Flash seems solid in web dev https://x.com/lmarena_ai/status/1924894110546288903?s=46
When was your last time
yes but i'm using it in eu with simple browser free vpn extension
after u logged in u can even just turn it off
It'll be an exciting day when Veo/Gemini get integrated tbh (and sora/gpt)
codex is underrated, idc what u say
Is it biggest diffusion model out here? If it's so cheap and fast one could leverage inference time compute
every day?
And flash is still cheap now
looking for jailbreak for 2.5 pro for evading tramway fares
The gemini 2.5 flash is no longer a thinking model 🤖 on Gemini app 🥲🥲🥲 but so fast and good
thats very impressive ngl
Translating, explaining etc...
It will be interesting to see if o3-pro can beat 2.5 Flash on the arena
has anyone tried the new flash
tf
Me
what this
Is this version of flash the stable version ...because on Gemini web they wrote Gemini 2.5 flash without preview
And they deleted the flash 2.0 one
the dude looks like someone copy pasted him t here lol
nice
thats what we want
no.. in a not good way.. like cheap rotoscoping bad rotoscoping
@balmy mist what is performance mode on flowith
The new flash is so fast and good to talk to, not like the previous Gemini 2.5 flash and Gemini 2.5 pro
It seems a bit natural like chatgpt
is gemini live better than avm now ?
does it overuse "!" and emojis that arent facial expressions?
The ! still exists but not like the previous one, and as for the emojis, I told it to use them... but the way it speaks is natural and has improved so much
Do you have access? Can you try ASCII art?
ASCII art of a village on a mountain or something like that
bing chat sydney 😔
I do. I can't paste outputs here but I can tell you if it looks decent
brian signed a nda contract so he cant paste outputs here 😔
Probably still a tokenization issue then
I mean it's still using tokens afaik right?
Yep I think so at least on this page they are talking about tokens https://deepmind.google/models/gemini-diffusion/
so it's different from o1 pro?
o1 pro has become garbage
o1 and o3 have different training recipes. It's unclear if they used different pre-training or if the differences are only in post training
o3 and o3 pro have the same post training with different configs
this is absolutely insane, 27 mins and still going
nah its reasonable, its a big refactor
i wanted to implement this with claude code, but half baked with glitches
g fcking g
and how do you know that
Can't say
