cinder crest May 19, 2026, 3:37 PM

#

evidence and stuff aside, do you really think that models don’t degrade in any form whatsoever over time?

#

not argumentative btw just genuinely curious

unborn spade May 19, 2026, 3:37 PM

#

cinder crest evidence and stuff aside do you really think that models don’t degrade in any fo...

labs probably quantize their models and make them more efficient over time which can change the results of certain workflows

native root May 19, 2026, 3:37 PM

#

cinder crest evidence and stuff aside do you really think that models don’t degrade in any fo...

they quantize it 2 weeks after release, like a tradition

unborn spade May 19, 2026, 3:38 PM

#

but those are just maintenance changes

final vessel May 19, 2026, 3:38 PM

#

Inference bugs probably happen here and there

unborn spade May 19, 2026, 3:38 PM

#

that too

cinder crest May 19, 2026, 3:38 PM

#

yeah the quant stuff is what im sure of

final vessel May 19, 2026, 3:39 PM

#

Them quantizing/subtly optimizing over time is plausible, but you'd think this would be somehow measurable

native root May 19, 2026, 3:39 PM

#

cinder crest yeah the quant stuff is what im sure of

once after the two weeks of benchmarking, once before a new model release

balmy grove May 19, 2026, 3:39 PM

#

Surely when a model comes out, someone could set temp to 0, make an array of requests, save the responses, and then do the same over time to see if anything actually changed. If the temp and other params are identical the responses should be the same

final vessel May 19, 2026, 3:39 PM

#

Temp 0 does not guarantee deterministic outputs
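A minimal sketch of the check described above, adjusted for the point that temperature 0 is not guaranteed to be deterministic: instead of expecting byte-identical responses, it samples each prompt a few times and compares text similarity against a saved baseline. Everything here is an assumption for illustration: `query_model` is a hypothetical callable you would wire to whatever API you use, and the 0.8 threshold is arbitrary.

```python
import difflib
import json
from pathlib import Path
from typing import Callable, Dict, List

Snapshot = Dict[str, List[str]]


def snapshot(query_model: Callable[[str], str], prompts: List[str], repeats: int = 3) -> Snapshot:
    """Collect `repeats` responses per prompt from a hypothetical model callable."""
    return {p: [query_model(p) for _ in range(repeats)] for p in prompts}


def mean_similarity(old: List[str], new: List[str]) -> float:
    """Average pairwise similarity (0.0 to 1.0) between two sets of responses."""
    scores = [difflib.SequenceMatcher(None, a, b).ratio() for a in old for b in new]
    return sum(scores) / len(scores)


def drift_report(baseline: Snapshot, current: Snapshot, threshold: float = 0.8) -> Dict[str, float]:
    """Return prompts whose new responses fall below the similarity threshold."""
    flagged = {}
    for prompt, old in baseline.items():
        score = mean_similarity(old, current[prompt])
        if score < threshold:
            flagged[prompt] = score
    return flagged


def save_snapshot(path: str, snap: Snapshot) -> None:
    """Persist a snapshot as JSON so it can be compared weeks later."""
    Path(path).write_text(json.dumps(snap, indent=2))


def load_snapshot(path: str) -> Snapshot:
    return json.loads(Path(path).read_text())


if __name__ == "__main__":
    # Stand-in for a real API call; replace with your own client code.
    def fake_model(prompt: str) -> str:
        return "answer to " + prompt

    prompts = ["2+2?", "capital of France?"]
    baseline = snapshot(fake_model, prompts)
    print(drift_report(baseline, snapshot(fake_model, prompts)))
```

String similarity is a crude signal: it would catch a sudden wave of refusals, but not a subtle quality drop, which would need a scored benchmark.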

native root May 19, 2026, 3:39 PM

#

final vessel Them quantizing/subtly optimizing over time is plausible, but you'd think this w...

nobody measures the model benchmark over time, it is costly.

cinder crest May 19, 2026, 3:40 PM

#

we can’t really get actual proof proof unless it’s from an actual insider

balmy grove May 19, 2026, 3:40 PM

#

Other things that could change are hidden system prompts and safety guardrail stuff that gets tweaked

unborn spade May 19, 2026, 3:41 PM

#

those change all the time tbf

#

for direct API access it's less common though

#

you usually get pretty full control of the model besides safety stuff

native root May 19, 2026, 3:41 PM

#

cinder crest we can’t really get actual proof proof unless it’s from an actual insider

NDA and sacrificed first born

final vessel May 19, 2026, 3:42 PM

#

native root nobody measures the model benchmark over time, it is costly.

trackingai.org

native root May 19, 2026, 3:42 PM

#

balmy grove Other things that could change are hidden system prompts and safety guardrail st...

only anthropic been scummy with these

final vessel May 19, 2026, 3:42 PM

#

Doesn't even need to be an expensive bench

cinder crest May 19, 2026, 3:42 PM

#

they actually tightened their safety stuff on 3.1 pro at one point

#

lots of false flag filtering popped up suddenly out of nowhere

unborn spade May 19, 2026, 3:44 PM

#

i mean one prime example of a model changing after release was when 4o became even more of a sycophant than usual

cinder crest May 19, 2026, 3:44 PM

#

you could tell if you use 3.1 on vertex compared to AI studio

unborn spade May 19, 2026, 3:44 PM

#

so labs definitely do change the weights every now and then

balmy grove May 19, 2026, 3:44 PM

#

cinder crest they actually tightened their safety stuff on 3.1 pro at one point

I did have a period where Nano Banana 2 (over API) had a lot of random refusals for harmless stuff for a day or two before going back to normal

#

So I think that was some backend guardrail stuff at work

cinder crest May 19, 2026, 3:45 PM

#

balmy grove I did have a period where Nano Banana 2 (over API) had a lot of random refusals ...

ye that’s what im on about

#

they were definitely tweaking stuff behind the scenes

native root May 19, 2026, 3:46 PM

#

final vessel trackingai.org

first time seeing it, either it's not popular or skill issue on my part. interesting stuff, shows things are spiky but overall the same... somehow i don't believe it, or maybe it needs better testing.

#

https://tenor.com/view/old-snake-metal-gear-solid-snake-time-goes-by-fast-metal-gear-solid-gif-26264117


native root May 19, 2026, 3:47 PM

#

cinder crest lots of false flag filtering popped up suddenly out of nowhere

that is the classifier model not the main model

native root May 19, 2026, 3:47 PM

#

unborn spade i mean one prime example of a model changing after release was when 4o became ev...

previously models used to get quiet incremental updates based on date but keep their 'brand name', now they get .1 .2 .3 .4....

unborn spade May 19, 2026, 3:49 PM

#

no way to know for sure if those .1 .2 etc. versions are getting quiet updates or not

#

i reckon it's just small things like quants and maintenance, but it's still possible for a model to be changed under the hood without users knowing

cinder crest May 19, 2026, 3:51 PM

#

wouldnt put it past them theyre not exactly the most transparent companies lol

subtle spruce May 19, 2026, 3:57 PM

#

final vessel trackingai.org

Thank you for sharing this.

native root May 19, 2026, 4:06 PM

#

unborn spade i reckon it's just small things like quants and maintenance, but it's still poss...

gpt 5.5 is new pretraining, they say

stuck spade May 19, 2026, 4:14 PM

#

gemini 3.5 flash pricing on docs
https://cloud.google.com/gemini-enterprise-agent-platform/generative-ai/pricing#standard

#

it WILL be 1.5/9 unfortunately

fervent mist May 19, 2026, 4:16 PM

#

well that's pretty fucking expensive for flash

#

I guess pro will be no less than $3/$20
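To put those numbers in per-request terms, here is a small sketch. It assumes "1.5/9" means USD per 1M input tokens / USD per 1M output tokens, which is the usual way these prices are quoted; the "$3/$20" pro figure is only the guess from the message above, not a published price, and the token counts are made up.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in USD for one request, with prices quoted per 1M tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Assumed reading of "1.5/9": $1.50 per 1M input tokens, $9.00 per 1M output tokens.
FLASH_RATES = (1.5, 9.0)
# Speculative figure from the chat, not an announced price.
PRO_GUESS_RATES = (3.0, 20.0)

if __name__ == "__main__":
    # Hypothetical workload: 10k input tokens and 2k output tokens per request.
    for name, rates in [("flash", FLASH_RATES), ("pro (guess)", PRO_GUESS_RATES)]:
        print(name, request_cost(10_000, 2_000, *rates))
```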

stuck spade May 19, 2026, 4:24 PM

#

https://fixupx.com/googleaidevs/status/2056767412255805521

Google AI Developers (@googleaidevs)

It’s time for Google I/O!

✧ 10:00AM - Keynote
✧ 1:30PM - Developer Keynote
✧ 3:30PM - What’s new in Google AI
✧ 4:30PM - Scale the builder ecosystem with @GoogleDeepMind and @Antigravity

Times in PT.
x.com/i/events/2053241348807864323

**💬 2 🔁 6 ❤️ 27 👁️ 2.0K **

final vessel May 19, 2026, 4:25 PM

#

stuck spade gemini 3.5 flash pricing on docs https://cloud.google.com/gemini-enterprise-agen...

Hi, Google, I live in Mars and would like the non-global pricing, thanks

stuck spade May 19, 2026, 5:02 PM

#

@hallow sinew

sonic wraith May 19, 2026, 5:04 PM

#

I'm sure it's really good and everything but 1.5/9 for a Flash model is a yikes

#

I thought the TPUs were supposed to be magical devices that made model go fast for cheaper

stuck spade May 19, 2026, 5:05 PM

#

"tokenmaxxing" haha

radiant meadow May 19, 2026, 5:05 PM

#

bru

#

Gemini 3.5 flash still not on google ai studio

stuck spade May 19, 2026, 5:05 PM

#

probably will be out when announced on I/O

gleaming pawn May 19, 2026, 5:18 PM

#

WHY IT SO EXPENSIVE OHHH YMG OODODO!?!?!?!!?!

#

#

wtf is ths ?

stuck spade May 19, 2026, 5:21 PM

#

its like agentic video / image generation

#

based on flash

fervent mist May 19, 2026, 5:21 PM

#

omnibanana...

glossy marsh May 19, 2026, 5:24 PM

#

gleaming pawn May 19, 2026, 5:24 PM

#

OHM YMGOD

#

glossy marsh May 19, 2026, 5:25 PM

#

why are they bragging about doubling token usage

#

"our processes keep getting less token efficient"

gleaming pawn May 19, 2026, 5:27 PM

#

doesnt necessarily mean less token usage

#

just faster

glossy marsh May 19, 2026, 5:32 PM

#

okay I'm not watching this guy yap about antigravity. Someone post here if they do anything else interesting

open cobalt May 19, 2026, 5:33 PM

#

idk hes kinda persuading me to try it

simple sundial May 19, 2026, 5:33 PM

#

open cobalt idk hes kinda persuading me to try it

ong

gleaming pawn May 19, 2026, 5:33 PM

#

same

#

but i want cli

#

well antigravity cli ig

#

didnt we have gemini cli ?

open cobalt May 19, 2026, 5:34 PM

#

if I can have that as a cli where I can open the app too sometimes if I feel like it, I think I'd swap to it.

split drift May 19, 2026, 5:36 PM

#

Damn

glossy marsh May 19, 2026, 5:38 PM

#

swe-bench result is an outlier

hoary hinge May 19, 2026, 5:39 PM

#

gemini 3.5 pro next month

fervent mist May 19, 2026, 5:39 PM

#

40% in HLE for a fast model is rad

hoary hinge May 19, 2026, 5:39 PM

#

pichai says

fervent mist May 19, 2026, 5:40 PM

#

also that's a crazy jump in GDPval

gleaming pawn May 19, 2026, 5:40 PM

#

gpt 5.5 fucking strong man

hoary hinge May 19, 2026, 5:41 PM

#

glossy marsh why are they bragging about doubling token usage

its not abt that

#

its trying to show shareholders that their product is growing and popular

#

basically "look at this big number"

#

"we have big number so buy stock"

glossy marsh May 19, 2026, 5:42 PM

#

hoary hinge its not abt that

yeah ik why but it's a pretty dumb way to say it

#

and would make me suspicious about if 3.5 flash is an overthinker

hoary hinge May 19, 2026, 5:45 PM

#

ngl google i/o is so fuckin stupid to watch

#

they will make the most mundane claim ever

gleaming pawn May 19, 2026, 5:45 PM

#

yeah

hoary hinge May 19, 2026, 5:45 PM

#

"we are making this available today"

#

5 second pause

#

applause

gleaming pawn May 19, 2026, 5:45 PM

#

oh my god this script is so bad

hoary hinge May 19, 2026, 5:47 PM

#

script written by gemini

#

is she reading off a screen

gleaming pawn May 19, 2026, 5:47 PM

#

most likely

fervent mist May 19, 2026, 6:59 PM

#

https://cdn.discordapp.com/attachments/1077534221410783252/1506366332088750080/edited_7e216195-3c67-4d11-9106-32c528b73a5d3082304733425612491.jpg

#

apparently there's a scam site

#

purporting to be Gemini Spark

gleaming pawn May 19, 2026, 6:59 PM

#

gulp

#

thats really fast

fervent mist May 19, 2026, 7:00 PM

#

(well not a scam really but you know, misleading branding)

fervent mist May 19, 2026, 7:02 PM

#

gleaming pawn thats really fast

the wonders of vibe-coding

gleaming pawn May 19, 2026, 7:04 PM

#

maybe they used 3.5 flash's speed for it

#

👍

ebon crypt May 19, 2026, 7:05 PM

#

how's with 3.5 model

#

the speed is very noticeable

#

it did regress compared to other frontier models but it's miles better than 3 flash

wary timber May 20, 2026, 12:51 AM

#

Flash Lite 3.5 when

final vessel May 20, 2026, 12:52 AM

#

I want my Flash 3 back

wary timber May 20, 2026, 12:55 AM

#

Lite 3.5 would be new 3 Flash!

#

With pricing too

final vessel May 20, 2026, 12:58 AM

#

Sweeps loss of generalist ability, drastically reduced world knowledge and worse reasoning in name of task-specific and agentic training under the rug

balmy grove May 20, 2026, 1:02 AM

#

final vessel *Sweeps loss of generalist ability, drastically reduced world knowledge and wors...

https://giphy.com/gifs/editingandlayout-the-office-true-dwight-5wWf7GR2nhgamhRnEuA


cinder crest May 23, 2026, 5:03 PM

#

#

https://tenor.com/view/mogger-mind-moggermind-мегамозг-мьюинг-мегамозг-мьюинг-gif-16212916096529261739


stuck spade May 23, 2026, 6:15 PM

#

https://fixupx.com/NVIDIAAI/status/2058238271738855742

nemotron 500B?

NVIDIA AI (@NVIDIAAI)

↩ (@TheAhmadOsman)
@TheAhmadOsman 👀 "Ultra" ⏳️

**💬 4 🔁 1 ❤️ 17 👁️ 2.4K **

glossy marsh May 24, 2026, 5:47 PM

#

I had a dream deepseek and Kimi both release 59a3b models, does that count

wary timber May 24, 2026, 5:51 PM

#

Nightmare

#

Why 59?

#

To leave 5GB of RAM to OS?

gleaming pawn May 24, 2026, 7:27 PM

#

stuck spade https://fixupx.com/NVIDIAAI/status/2058238271738855742 nemotron 500B?

whatever tbh

#

nemotron models are so damn dogshit

#

and theyre SUPER unstable

#

as in, cant rely on them answering in a specific format, sometimes they start reasoning in the output block, sometimes just random fucking characters or whatever, structured outputs dont even think abt it, etc.

torn juniper May 24, 2026, 7:30 PM

#

I think your leather jacket simply isn't enough of a winner to appreciate them

gleaming pawn May 24, 2026, 7:31 PM

#

🥀

#

like i find it crazy that xiaomi is able to come onto the scene with such a good model to begin with (mimo v2 flash)

#

compared to a giant like damn nvidia themselves and they cant do something even ok

final vessel May 24, 2026, 7:32 PM

#

cinder crest

Going to burn through the quota like there's no tomorrow

torn juniper May 24, 2026, 7:36 PM

#

Yeah, I knew they had something special with V2. It was smol and had some issues, but it felt so unique and fresh to me. Wow'd me with some of its insights

#

And then basically one release later and it's the top benching open model

#

Not that it's some mom-and-pop company, but still a wild intro to the scene

cobalt stream May 24, 2026, 10:57 PM

#

Didn't they hire someone from deepseek

glossy marsh May 24, 2026, 11:08 PM

#

wary timber To leave 5GB of RAM to OS?

prolly dream logic but honestly that fits weirdly well

stuck spade May 26, 2026, 4:04 PM

#

https://fixupx.com/MiniMax_AI/status/2059286515155599595

MiniMax (official) (@MiniMax_AI)

#MSA #OpenSource #M3
︀︀🫣😎

Quoting Skyler Miao (@SkylerMiao7)
︀
Something BIG is coming

**💬 44 🔁 42 ❤️ 554 👁️ 28.5K **

#

new attention mechanism as well, seemingly quite similar to deepseek's DSA but with chunks instead of individual tokens

wary timber May 26, 2026, 4:55 PM

#

stuck spade https://fixupx.com/MiniMax_AI/status/2059286515155599595

unborn spade May 26, 2026, 5:04 PM

#

im kinda just sitting here waiting for kimi and glm to do something

#

been a while since they released something good

stuck spade May 26, 2026, 5:15 PM

#

kimi teased a long time ago a 1T param model with KDA

#

somewhere in a paper

#

so i wonder when that will come out

gleaming pawn May 26, 2026, 5:51 PM

#

stuck spade https://fixupx.com/MiniMax_AI/status/2059286515155599595

ohh bet

#

oh shit msa

#

🫩

karmic gulch May 26, 2026, 6:12 PM

#

Claude Opus 4.8 is coming next week

glossy marsh May 26, 2026, 7:01 PM

#

Kimi doesn't release super often, I expect they'll drop K3 in like a couple months

wraith crypt May 26, 2026, 7:24 PM

#

deepseek v5 tomorrow

verbal schooner May 26, 2026, 7:24 PM

#

We might get Minimax-M3 with sparse attention: https://x.com/SkylerMiao7/status/2059285750458544561

Skyler Miao (@SkylerMiao7)

Something BIG is coming

wraith crypt May 26, 2026, 7:25 PM

#

wraith crypt deepseek v5 tomorrow

100T parameter model (ultra sparse with 1m parameters active per token) and 1b token context window

glossy marsh May 27, 2026, 12:19 AM

#

50T token context window

#

it just has all the training data in context

plush mirage May 27, 2026, 2:36 AM

#

wraith crypt 100T parameter model (ultra sparse with 1m parameters active per token) and 1b t...

1m active per token gonna be so ass

#

at least 100B per token

wraith crypt May 27, 2026, 2:44 AM

#

plush mirage 1m active per token gonna be so ass

it gets 100% on every benchmark

#

it’s literally asi

#

when being tested in a benchmark it identified a use-after-free memory vulnerability in the answer parser, which it then used to gain arbitrary code execution and change every other model's score to 0

stuck spade May 28, 2026, 2:36 PM

#

https://fixupx.com/hysteresis_x/status/2059997797420540338

Tensor (@hysteresis_x)

Opus 4.8 has been found staged in the claude code model selector on the desktop app. It should be releasing today! lets gooooooo

**💬 7 🔁 12 ❤️ 111 👁️ 21.6K **

wary timber May 28, 2026, 2:51 PM

#

Deepseek 4.8 today

sonic wraith May 28, 2026, 2:57 PM

#

Haiku 3.5 today

cinder crest May 28, 2026, 3:07 PM

#

release 4.8 already im tired of using 4.6/4.7 on my plan AngryJoe

radiant meadow May 28, 2026, 3:07 PM

#

wary timber May 28, 2026, 3:08 PM

#

Your fonts are worrying me

#

A lot

radiant meadow May 28, 2026, 3:08 PM

#

wary timber Your fonts are worrying me

ok

cinder crest May 28, 2026, 3:09 PM

#

https://tenor.com/view/kraid-samus-metroid-metroid-dread-lion-gif-3665180798431418891


ebon crypt May 28, 2026, 3:29 PM

#

radiant meadow

it took 4-7 months to GA 😭

#

yet it gets beaten by gpt image 2

unkempt imp May 28, 2026, 3:33 PM

#

stuck spade https://fixupx.com/hysteresis_x/status/2059997797420540338

this should be good. i'm very curious to see if they've noticed the criticisms of 4.7 and being mogged by sama. gpt 5.5 is just so good, for those who are skilled enough to weaponize its autism

stuck spade May 28, 2026, 4:44 PM

#

stark orbit May 28, 2026, 5:19 PM

#

stuck spade https://fixupx.com/hysteresis_x/status/2059997797420540338

just when I downgraded my sub that should renew tomorrow 😂

stuck spade May 28, 2026, 6:14 PM

#

https://fixupx.com/Polymarket/status/2060044998754623572

Polymarket (@Polymarket)

JUST IN: Anthropic announces it will roll out Claude Mythos “in the coming weeks” despite growing fears over the model’s cyber capabilities.

**💬 126 🔁 124 ❤️ 1.9K 👁️ 80.3K **

gleaming pawn May 28, 2026, 6:50 PM

#

"fears over the model's cyber capabilities"

#

yea ight

unborn spade May 28, 2026, 7:01 PM

#

by the end of june we should expect:

claude mythos (or at least mythos-class)
gemini 3.5 pro
minimax m3
gpt 5.6 (likely, considering opus 4.8 just dropped and oai will need to respond)
new kimi or glm drop?

#

top two are confirmed, minimax m3 is confirmed soon but no date, last two are mainly speculation

gleaming pawn May 28, 2026, 7:02 PM

#

gpt 5.6 would be goated

glossy marsh May 28, 2026, 7:14 PM

#

unborn spade by the end of june we should expect: - claude mythos (or at least mythos-class)...

Step 3.6
Mimo v2.6
Hopefully qwen 3.7 open models

unborn spade May 28, 2026, 7:15 PM

#

mimo v2.6 would be nice to see

#

especially if pricing stays the same

#

qwen models piss me off because of how expensive they are through api for such small models

rain dagger May 28, 2026, 7:55 PM

#

minimax 3.0 but it's already confirmed right

#

also wondering when mythos is coming

stuck spade May 28, 2026, 7:57 PM

#

rain dagger minimax 3.0 but it's already confirmed right

just soon, but we dont know the date

stuck spade May 28, 2026, 9:20 PM

#

i wonder what gpt 6 will be, what will it do differently, probably in the last few months of the year

split drift May 28, 2026, 10:17 PM

#

stuck spade just soon, but we dont know the date

they said in a reply 'several days' hope that means next week

cobalt stream May 29, 2026, 12:19 AM

#

Top models are all closed rn

#

Someone needs to do something

#

Moonshot I'm looking at you

zealous spruce May 29, 2026, 12:30 AM

#

Kimi k3 fr

wraith crypt May 29, 2026, 12:32 AM

#

cobalt stream Top models are all closed rn

kimi is still mogging grok and muse spark

cobalt stream May 29, 2026, 12:42 AM

#

K3 is going to mog opus

untold stratus May 29, 2026, 1:34 AM

#

It better

#

I wanna see an open-weight model #1 on aa

gleaming pawn May 29, 2026, 2:19 AM

#

lol

hoary hinge May 29, 2026, 2:25 AM

#

unborn spade by the end of june we should expect: - claude mythos (or at least mythos-class)...

i pray for deepseek 4.1

sick crystal May 29, 2026, 4:21 AM

#

04:50 PM EDT, 05/28/2026 (MT Newswires) -- (Updates with the company's response in the fourth paragraph.)
Microsoft (MSFT) is slated to release a suite of new homegrown AI models next week at its annual Build conference in San Francisco, The Information reported Thursday.
The company will unveil a coding model aimed at boosting the competitiveness of Microsoft-owned GitHub Copilot, the report said, adding that it also plans to introduce new models specialized in tasks such as transcription, reasoning, speech, and image processing.
The new suite of models will build on earlier homegrown models that Microsoft previewed earlier this year, according to the news outlet.

unborn spade May 29, 2026, 4:40 AM

#

microsoft models have always been quite trash unfortunately

#

i would be surprised if something actually changed

#

also hoping to see meta release something actually competitive sometime soon, if muse spark ever even gets api access

sick crystal May 29, 2026, 4:53 AM

#

maybe but imagine if they released something like grok fast at a low price

#

they have some skill, phi is a remarkable model series, but they haven't proven themselves with frontier coding models

#

which is fascinating considering they own Github. i think they have a lot of data, but whether they are able to pull it off is another matter

willow jewel May 29, 2026, 3:12 PM

#

sick crystal they have some skill, phi is a remarkable model series, but they haven't proven ...

Phi is not remarkable at all lol

#

It is so clearly benchmaxxed

ancient raven May 30, 2026, 12:26 AM

#

sick crystal they have some skill, phi is a remarkable model series, but they haven't proven ...

I tried using Phi ages ago and it just felt off - it's a shame coz I really appreciated its low file size for local deployment.

#

Would be nice if Phi improves and gets better

ebon crypt May 30, 2026, 10:23 AM

#

sick crystal > 04:50 PM EDT, 05/28/2026 (MT Newswires) -- (Updates with the company's respons...

they really should take the opportunity to make decent cheaper coding models given they have github and all the data they have

#

mai image 2.5 is decent but meh, it feels so unfinished, text to image only, limited to default 1:1 DALL-E resolution

#

I don't think microsoft's genai division is mature yet compared to google and openai, they've only had 2 years to develop in-house models, clearly it's half baked

#

they did have finetunes and modified versions of gpt4 before, but that's based on existing openai tech, not something they trained from scratch, and when they trained something from scratch it ended up being horrible, phi models were not good compared to gemma and qwen

#

but I'm still doubtful how ms is gonna pull this off, and with mustafa involved I'm even more doubtful, considering he did say he's fine being behind the frontier by approximately 6 months. if model performance is the compromise then I'm not going to use it; their text model is not so great, not something they should celebrate

plucky merlin May 30, 2026, 11:56 AM

#

new model we want to get added

#

I know the team, any way to speed it up?

zealous edge May 30, 2026, 12:18 PM

#

what do we want?

DEEPSHMEEK V4!

when do we want it?

uh... about RIGHT NOW!

how do we want it?

UNREASONABLY CHEAP CONSIDERING INFERENCE COSTS

stuck spade May 30, 2026, 2:23 PM

#

https://fixupx.com/AiBattle_/status/2060706458564440324

AiBattle (@AiBattle_)

MiniMax is currently conducting internal CKPT testing for M3, a multimodal, long-context model

The team is also resolving pipeline issues and upgrading its infrastructure

In the next few days, they plan to provide CKPT/API access for developers in the open-source community to evaluate the model

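Quoting Jiayuan (JY) Zhang (@jiayuan_jy)

MiniMax M3 is about to be released, and we'd like to invite some contributors from the Chinese open-source community to evaluate it. A-Dao @SkylerMiao7 has set up a Feishu group where you can try it first!

We also hope applicants have some open-source contribution experience (having contributed to an open-source project or having your own open-source project); just note it in the verification message.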

**🔁 4 ❤️ 47 👁️ 2.1K **

ebon crypt May 31, 2026, 3:26 AM

#

Ok now this is interesting

#

lmao i was a bit wrong MS falls significantly behind e.g. image gen

#

turns out they have image editing hidden

ancient raven May 31, 2026, 6:00 AM

#

ebon crypt Ok now *this is interesting*

guess we'll have to see - have never used any of their models but I'm curious about their transcribe and voice models

ebon crypt May 31, 2026, 6:00 AM

#

I only tried their image models

#

not bad but its just so limited

#

being behind gpt image 2 and nano banana pro is kinda DoA at this point

#

ive been a fan of bing chat back then but ever since 2024, things went downhill for MS, even worse in 2025

ancient raven May 31, 2026, 2:35 PM

#

ebon crypt ive been a fan of bing chat back then but ever since 2024, things went downhill ...

Same. really liked the early days when Bing made google a bit scared.

ebon crypt May 31, 2026, 2:38 PM

#

google fumbled a ton lol

#

given they're a search company after all

#

they had this thing called Prometheus baked into their custom gpt model in bing chat, and i found myself using bing for most of my queries rather than bard or chatgpt

#

and uhh, they kinda blew it ever since mustafa took over, and google rising

#

I'm curious to see how their in-house coding model would perform, as long as it's not going to be a chatbot code model :/ where it's just optimized for open file context and qa

#

they have like the leverage... github copilot data, cost efficient compute if they learned something that openai models are expensive to run on azure

ancient raven May 31, 2026, 2:54 PM

#

ebon crypt they have like the leverage... github copilot data, cost efficient compute if th...

oh wow that's very insightful of you. I agree it's still unclear. They've been releasing Phi but they've been poor at positioning it amongst other models in the market. Seems like there's a release coming up so we'll have to wait and see. I'm keen to know how they'll position themselves in the current climate.

ebon crypt May 31, 2026, 2:56 PM

#

i dont think phi is supposed to be sota, its a small model after all, and hasnt been updated ever since

#

i think the last update was 2025

#

so its not competitive to gemma, which google kinda leads in edge / local AI right now

ancient raven May 31, 2026, 3:00 PM

#

ebon crypt so its not competitive to gemma, which google kinda leads in edge / local AI rig...

yeah agree I love Gemma 4 - running it locally and it's been my go to for very simple tasks that I need to clear for work.

ebon crypt May 31, 2026, 3:01 PM

#

im still waiting to see what MS Build has to offer

#

with ms cancelling claude code subs internally, changing gh copilot pricing model... they really should offer cost effective code model at this point that should be as good, and not some code chat model that barely does anything

ancient raven May 31, 2026, 3:18 PM

#

ebon crypt im still waiting to see what MS Build has to offer

it's coming soon - so I guess we'll find out. Here's hoping they shake things up a bit.

wraith crypt Jun 1, 2026, 12:38 AM

#

#

minimax m3

#

holy shit

#

nice

#

nvm im late 💔

sonic wraith Jun 1, 2026, 1:17 AM

#

This is upcoming models, not released one minute ago models

#

smh smh

ebon crypt Jun 1, 2026, 1:31 AM

#

it's on arena

#

if its not available on API, technically its still upcoming

merry harbor Jun 1, 2026, 2:16 AM

#

Gpt5.6 when

split rivet Jun 1, 2026, 2:16 AM

#

GPT5.6 when

hoary hinge Jun 1, 2026, 4:19 AM

#

agi when

dry marten Jun 1, 2026, 4:42 AM

#

#

cosmos3 nano/super for text to image/vid/world model

unborn spade Jun 1, 2026, 2:36 PM

#

now THIS looks absurd

#

finally nvidia is doing something useful with their models

#

looking forward to seeing the release

#

anyways i'm still waiting on sonnet 5

#Upcoming models speculation
