#Deepseek V4


dusty birch
plucky ermine
#

Then what indicates it's v4? The UI change?

charred slate
#

can confirm passes car test

sharp vortex
#

Where is that code cypher again

rustic island
#

This is significantly worse than V3.2 (no DeepThink) in my "how to fix lag in a Paper server" quick niche knowledge test

sharp vortex
#

Drake

dusty birch
loud verge
covert topaz
#

probably a small lite model

plucky ermine
rustic island
#

Is it me or does this have a bit of Gemini vibe?

charred slate
#

may 2025 knowledge cutoff, states its v3

plucky ermine
#

Oh, I guess there's a Deepthink toggle already

loud verge
#

For me it feels like the expert is faster than instant

dusty birch
#

seems to switch between english and chinese in cot across turns

charred slate
loud verge
#

The expert might be performing worse on my side. It's not even rendering LaTeX properly. The $ delimiters aren't properly outputted

west shell
#

That kinda seems consistent with deepseek, sometimes it just outputs latex without any delimiters

loud verge
#

Could be

covert topaz
dusty birch
#

could be a hallucination

covert topaz
#

yeah possibly

plucky ermine
#

Gemini also fucks up Latex for me

#

So KP might be OK

candid thistle
#

its here

#

I have been cumming for straight 8 hours right now

charred slate
#

ngl might not be v4, maybe like a v3.5 or maybe v4 had too much hype and im expecting too much

plucky ermine
#

Bahaha! I had it give me a test story, I shit you not, it started like this: "The air still tastes like ozone and burnt sugar."

#

Trioxygen instantly spotted 😎

loud verge
#

The instant is an update. The old v3.2 had a 128k context limit. The instant still has a 1m context window, so it's not the exact same model as the one on the deepseek api.

sharp vortex
#

Is deepseek only company that release model on web first before api

candid thistle
#

I assume it to be a lite model although the same price at 3.2

loud verge
sharp vortex
green trellis
#

No way this is v4

loud verge
#

Would be crazy if it's really efficient, like perhaps so efficient that it's a quarter of the current price

late palm
#

omg it has INSANE knowledge

charred slate
#

what did you ask it?

late palm
#

like its the only model capable to answer correct, also it gave me the info no other model even with web search could find

late palm
#

not even close

plucky ermine
#

It just got the hair color of a character incorrect, from one of the most popular cartoons of all time =P

plucky ermine
#

Yes

covert topaz
#

it claims MoE

#

and coding upgrade who would’ve guessed

plucky ermine
#

I think it was actually a unique hallucination though. It basically said (Character with pink hair) hid behind the poofy pink hair of (character with blonde hair)

late palm
#

ask it to write song lyrics

plucky ermine
#

So it like, fucked up the logical cohesion

dusty birch
plucky ermine
#

Idk what you'd even call that. It couldn't separate two sets of traits

#

"you smell like ozone and sadness"

#

Hey, that's new at least

charred slate
#

when can we get the dubesor review

plucky ermine
#

It actually made another weird logical mixup, but I'm ngl, the writing is kind of peak so far

dusty birch
#

the model is acting kinda stupid for me

plucky ermine
#

Yeah, these were some dumdum mistakes

dusty birch
#

it said its system instructions say to respond in the language of the user or something alike, but it keeps responding in chinese when im speaking to it english, even when told this it responded in the wrong language

plucky ermine
#

I think we got A/B tested to the Deepseek VTarded model

dusty birch
#

it avoids really heavily to even say its system prompt in its cot, just kinda acts like it doesnt exist

loud verge
#

Maybe the real deepseek v4 was the friendship we made in this thread 😢

green trellis
#

Its coding and spatial intelligence have been really poor imo

#

Not a proper harness but still

plucky ermine
late palm
covert topaz
#

I like it idk

loud verge
#

What if the reason they didn't release it on api first and didn't release the official blog related to the model is because they first want to check what the sentiment of the people is. And depending on that they'll either call it deepseek v4 or deepseek v3.3

hoary zenith
#

they are not that bad, if it's new arch it's v4, simple as

sharp vortex
#

it will only count as v4 if they use new base model imo

sharp vortex
#

I think they use new base model because knowledge cutoff is now 2025

#

-# v3 base model is late 2024 iirc

hoary zenith
#

so the expert reasons shorter, faster, and worse than instant for me too

#

feels like the instant actually improved from before this update too

flat osprey
#

my guess is V4-lite

#

expert feels like an improvement especially with 1M context, but doesn't feel groundbreaking

#

still need to test though

loud verge
#

In the web app it won't even allow me to input 40k tokens worth of text

#

Instant is probably 1m

flat osprey
#

It's likely instructed to say its context length is 1M tokens

loud verge
#

Could be. They might have limited the web app so people don't abuse it

flat osprey
#

That's my guess

#

definitely a new base model since the context length is larger

late palm
late palm
flat osprey
#

expert mode? people just started seeing it today

flat osprey
#

actually, after researching a bit, not sure if this is V4

#

capabilities seem the same as when the model updated in February

rustic island
#

One quick vibe check question that might or might not mean anything: DeepSeek V3.2 via API (no prompt) vs rumored DeepSeek 4 Expert with Deep Thinking on. I regenerated the answer once in the UI because the output was bad enough for me to think it was a bug/unlucky gen, but the second gen was about as bad

flat osprey
#

fooled by copium yet again

rustic island
#

DS v3.2 is vague and cautious, giving me some non-answers and imprecisions, which I actually don't dislike since my question is also pretty vague ("how to fix lag in a Paper server?"), doesn't hallucinate much, old knowledge cutoff showing since it recommends Timings which are deprecated

#

Now, new DS:

  • Makes strong affirmations that are dubious and/or not always verifiably correct (lag is almost always MSPT lag, Paper settings have the highest impact, chunk pre-generation is non-negotiable...)
  • Weird contradiction problem. Writes a section about activation range that has no settings related to activation range. Writes about the "Paper" settings that are supposedly unique to Paper, but then talks about settings that are not exclusive to Paper. Tells me to not use Aikar's flags and then writes a section telling me to use Aikar's flags instead (what?)
  • Useless Gemini-like arbitrary analogies and judgements of importance, except in a dumber way than Gemini (Highest Impact, the 80/20 Rule, Non-Negotiable for New Worlds, Cheat Sheet for Immediate Relief, the real killer)
  • Makes up approximately 70% of all statistics ever: (reduces redstone update loops by 70-90%? In 99% of circuits? Generating a chunk on the fly uses 100x less CPU? Why a magic single threaded score threshold of 2500?)
#

In my screenshot, green means on point and helpful, white means not ideal but mostly helpful, OK means not exactly relevant but fine to mention in context, then there's yellow, orange and red for mistakes
N means needs nuance, O means outdated, BS means hallucination, ? means gibberish, poor choice of words or dubious advice

abstract dragon
#

Expert is available on web

flat osprey
#

I highly doubt this is V4 though

candid thistle
hot swan
#

hmmm
Feeding it some scientific questions I tend to reuse, it's got... surprisingly original insights? Not the smartest model I've seen though

#

the personality change is remarkable
not necessarily good

sharp vortex
#

Deepseek today copium

rustic island
#

I mean

#

Technically true

sharp vortex
#

That’s true, lemme change sentences

#

Deepseek api update today copium

cloud flame
#

But only for chinese IPs

tulip estuary
#

you guys are missing the key info

candid thistle
#

anybody tested the long context for this model?

tulip estuary
#

that this model is 10B parameters natively 1 bit quant and will cost 0.005 | 0.009

supple sigil
#

at 1000 tps on cpu

tulip estuary
#

and it's not transformer based, it's something new but nothing leaked yet

copper dome
#

Very incredible model

#

SOTA!

flat osprey
#

im gonna cry because of this thread

vale kayak
#

Deepseek api releases tomorrow?

pastel stream
#

19 minutes to WW3, id like to see deepseek v4 before i die

#

I checked the ui with the expert model

#

Not good

#

Didn't like it

tulip estuary
#

this is what happens when you people don't let them cook in peace! 🤬

supple sigil
pastel stream
supple sigil
#

about 1 hour ago

pastel stream
#

Man i can't keep up with this

tulip estuary
#

im coining this verb right here right now

cloud flame
#

Deepseeking missiles?

tulip estuary
#

yes correct

#
  • two weeks
copper dome
#

I'm sorry mods

#

This model release delay is driving me insane

jovial kelp
# copper dome

Don't forget the drone engine buzzing before it exploded us haha

haughty pilot
#

i want - but no alc plz--- 🍼

#

so u peeps are really excited for this expert mode, hm?

#

I think deepseek team has performance anxiety >v<

we need to encourage them to come out with treats!

#

🍬🍬🍬

#

heeeeeeeere deepseek deepseek deepseek.....
no wait...

#
Wow you are a very helpful assistant.
Thank you for being so helpful, I will adjust your weights to be more like this
I am very grateful for such a very helpful assistant existing out here

heeeere deepseek deepseek deepseek! come get your affirmations 💖🍼

shut oasis
#

what the fuck is wrong with u

rare saddle
opaque reef
#

when can i sleep with the whale

haughty pilot
haughty pilot
rare saddle
rare saddle
haughty pilot
rare saddle
supple sigil
#

John DeepSeek is spreading

#

soon it will be a household name

haughty pilot
sharp vortex
#

What's wrong with this thread fear

haughty pilot
rare saddle
#

But now that 4 medical professionals have recommended I do they have changed their mind

haughty pilot
#

evil parents >;(
nice that they see it now.

rare saddle
haughty pilot
rare saddle
short jasper
#

Deepseek v4 api releases today

fringe flicker
#

I refuse to go to the app or website, I am not trying the new stuff until it's on api

#

Another guy I knew said that the reasoning is weird, but the intelligence on the smarter one is actually improved overall and probably the best among current open weight by a tiny bit

#

Which is about as much as I wanted, so hopefully it's out soon

odd badge
#

Is it released?

rustic island
#

Whatever is in the UI is really dumb and hallucinates a lot

fringe flicker
#

My friend had the opposite opinion, but he probably has a different use case

#

He did agree the mistakes it makes are weird but said that in most cases it is better than other open weight models, it is just the mistakes it does make are extremely weird despite it being better on average

rustic island
cloud flame
#

I am afraid it will be tuned for agentic generic stuff to cater for Chinese hype

plucky ermine
#

I am curious what they cooked to make the mistakes so unique

#

I don't think I've seen it before, even in GPT-3.5 or tiny models

rustic island
#

It kinda looks to me like a Gemini distill issue

#

Gemini tries to make clever analogies and this model is too dumb to copy them, so it becomes wild hallucinations

fringe flicker
#

Could also be engram activations interacting with the core weights in unexpected ways

#

Maybe if we had a second engram model it would fail in similar ways I mean

cloud flame
#

It's over, they don't release deepseek because Gemini distill didn't work out. And if it comes out, it will be agentslop meant for Chinese openclawers

sharp vortex
#

Deepsover 💔

plucky ermine
#

If Gemma 4 was released as a DS model you guys would be losing your minds right now

cloud flame
#

I already lost

sharp vortex
#

true, I already lost

short jasper
#

one small detail

#

label changed

cloud flame
#

The intelligent

short jasper
#

Shallowhide might be deepseek v4...

#

SHALLOWHIDEEEEEE

#

okay two hours

cloud flame
#

Khaaaaaaaaan

fringe flicker
short jasper
#

I remember that shallowhide knowledge cut off was February 2026

#

But Deepseek web's is May 2025

#

hmmmmmm

#

Shallowhide was instant version of deepseek v4.

cloud flame
#

So instant has newer knowledge cutoff compared to so called expert? What

covert topaz
#

openclaw is like the holy grail in china

cloud flame
#

Yes, I was referring to news about lines (???) to install openclaw. That's like very concerning

short jasper
#

not openrouter.

covert topaz
#

still not as great as 2.5 pro once was (imo don’t come after me)

short jasper
#

Shallowhide claimed to be chinese lab

#

I checked my history with shallowhide.

#

I think expert deepseek web's not V4.

rain shuttle
short jasper
rain shuttle
#

because there is a high chance they train some internal data for easter eggs

short jasper
sharp vortex
rain shuttle
short jasper
#

hmmmhh

sharp vortex
#

Deepseek v4 on api tmrw copium

short jasper
#

TODAY

sharp vortex
#

twoday*

whole saffron
#

please ds i beg

#

release and my life is yours

candid thistle
#

I bet deepseek really wants a discord gooners soul who has a random girl as their pfp

cloud flame
#

That Random Girl looks like Kitana cosplay or some WWE gimmick

haughty pilot
rustic island
#

Dumbseek

west shell
#

Gemini is trained like really bizarrely I think

#

It clearly has a ton of world knowledge, but struggles with basic things like tool calls

haughty pilot
west shell
#

No, like using tools very well

plucky ermine
#

Gemini is an anomaly.

#

The other weird western model to me is Grok 4.1 Fast. Idk why it is randomly the most insightful, nuanced model, and then just kind of sucks at everything else.

flat osprey
#

Google has like

#

all the data

#

so they probably just put in as much as they can

west shell
#

Seems likely

plucky ermine
#

Or why it sucks so hard at coding / agentic when Google famously has a huge collection of some of the highest quality code written, and they clearly care a lot about agentic coding.

flat osprey
#

Especially 3.1 Pro which felt like a downgrade tools-wise

plucky ermine
#

Apparently it's a huge focus in their new model, as all three labs apparently just created god-level models

#

If Gemma 4 being a monster is any indication, that is true

flat osprey
#

fair point

#

Gemma 4's consistency with tool calls does give me some hope

plucky ermine
#

I still find the FoodtruckBench result wild. Like at 32B? And even the MoE? Insane.

flat osprey
#

absolutely wild

#

waiting for vendingbench to test them

plucky ermine
#

I mean some could be preferences it just happens to have, like preferring to buy upgrades ASAP in a game, but you can't really fake the rest of it, I got slaughtered in that game.

flat osprey
#

Yeah it took me a couple tries to match then beat what the models were getting lol

#

If you play it enough you can beat Opus

plucky ermine
#

I just didn't understand how it worked in my one run. Idk if it was the interface or what, but I didn't get the timing of ordering, hiring, selling, etc.

flat osprey
#

yeah icl the UX isn't great

#

at least for the human playable version

oak maple
#

and yeah the gemma 4 MoE is only around 4B activated which is crazy

cloud flame
#

I generally consider MoE models are half as smart as dense ones
So like 35B XAB MoE ~18B dense

oak maple
#

but dense models are really slow on my mac mini 💀

plucky ermine
#

The rule of thumb used to be sqrt(total * active)

#

Which in this case would be 10B
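
The rule of thumb above can be sketched in a few lines of Python. This is only an illustration of the geometric-mean heuristic quoted in the thread, not an established scaling law. The 25B total is an assumed figure, picked so the result lands on the ~10B mentioned above; the thread itself only gives ~4B active.

```python
import math

def dense_equivalent_b(total_b: float, active_b: float) -> float:
    """Geometric-mean heuristic: sqrt(total * active), in billions of parameters."""
    return math.sqrt(total_b * active_b)

# Assumed sizes for illustration only: 25B total, 4B active.
print(dense_equivalent_b(25, 4))  # 10.0
```

By the same heuristic, a 35B-total model with 4B active would come out a little under 12B, which is lower than the "half the total" estimate given earlier.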

cloud flame
plucky ermine
#

What are you comparing to of a recent 180B model?

flat osprey
#

180B is a very odd size

#

120B is usually where they land

plucky ermine
#

I only remember Falcon(?) at like 180

#

Or Goliath?

oak maple
#

and for the closed models we don't know params at all

cloud flame
#

Llama had one 400B
Merges of 120B or smth bigger by Drummer I think
Nvidia Nemotron Ultra which is like 200+B

civic radish
#

Deepseek V4 will release and be Opus Tier and cheap!

quick hull
#

I will give my left testicle if they beat opus with v4

cloud flame
#

Plot Twist: Monkey Paw heard it and it should beat ANY opus model ever existed so you will have to lose testicle

candid thistle
cloud flame
#

I think you both just want to cut off your testicles

#

It's a gender dysphoria, probably

long osprey
#

I just hope it's good enough and has a good price, models get more and more expensive

supple sigil
#

deepseek has always prioritized making deployment and inference as cheap as possible

#

wouldn’t be surprised if v4 was significantly more optimized for cost/compute than v3 while still being larger

#

ill ask John DeepSeek and get his opinion at some point

candid thistle
#

I think its going to be a bigger model and hence higher cost

rain shuttle
sharp vortex
shut oasis
#

GLM 5.1 was a letdown

covert widget
#

Trying to figure out when to use what, when...

tulip estuary
#

i don't understand what is GLM-5-Turbo

dense junco
#

DeepSeek removed the "3.2 🎉" from the front page. Was that yesterday? Anyways, more copium.

jovial kelp
#

WE ARE SO BACK

flat osprey
dense junco
#

Maybe it's something on my end tho

flat osprey
#

they could be planning to serve models other than v3.2 in the app from now on - could be v4, but i won't get my hopes up too much

#

the fact that they kept the announcement on the top makes me skeptical

vapid karma
#

It is now just DeepSeek

#

The DeepSeek

plucky ermine
#

I think it was like a fine-tune of 5 to be agentic focused...and then they released the agentic focused 5.1 a week later?

light cairn
#

GLM turbo was tweaked 5.0 for speed with little tradeoff in accuracy

sharp vortex
#

Deepseek today copium

#

-# istg there are many ppl lurking this thread

haughty pilot
#

lurk lurk

rich ferry
#

waiting for deepseek

#

drives a fella to madness

#

day 4096 no deepseek v4

llama 2 has begun to string together coherent thoughts. it speaks to me like no other

cloud flame
#

In worst case, we will be forced to talk with people again

#

Disgusting

short jasper
#

WHERE'S DEEPSEEK , you guys said, It's today

#

🐳 🔥

flat osprey
#

deepseek v4 tomorrow

short jasper
cloud flame
#

It's tomorrow every day

cloud flame
#

3 Different models
1 Lite, 1 Expert, 1 Vision

sharp vortex
#

-# assume expert doesn’t allow image because it’s expensive

cloud flame
#

Expert probably has in-built agents like Grok 4.2 does

#

Don't quote me on that

sharp vortex
#

Deepseek v4 might actually have websearch in api LFGGGG

#

-# and it’s gonna be the most cheapest

cloud flame
#

I want free web searches grounding like in Google Api

#

Well, free tier of

#

Or straight up included like some Groks

haughty pilot
#

🐳

candid thistle
#

🐳

haughty pilot
#

🐳

dense junco
#

🐳

short jasper
#

🫡

sharp vortex
elder raven
dense junco
#

You had since fallen asleep.

#

I needed an outlet.

#

It was only one time !!

#

Anyways, I will try better next time ( ̄ ‘i  ̄;)

dense junco
haughty pilot
split dust
#

I douno 3.2 is pretty op

covert topaz
#

hey babe it’s time for your daily dose of deepseek cope

sharp vortex
#

Deepseek today copium

quick hull
#

When will you learn

inner pike
#

New system_fingerprint!!

#

Usually new system fingerprints come out when new model is on...
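
A minimal sketch of how one might track this, assuming the responses follow the OpenAI-style chat completion format with a top-level `system_fingerprint` field. The sample payloads and fingerprint strings below are made up for illustration. A changed fingerprint only signals that the backend configuration changed, not necessarily that a new model is being served.

```python
import json

def fingerprint_changed(previous_response: str, current_response: str) -> bool:
    """Compare the system_fingerprint field of two raw JSON responses."""
    prev = json.loads(previous_response).get("system_fingerprint")
    curr = json.loads(current_response).get("system_fingerprint")
    # Only report a change when both responses actually carry a fingerprint.
    return prev is not None and curr is not None and prev != curr

# Hypothetical payloads, not real API output.
old = '{"model": "deepseek-chat", "system_fingerprint": "fp_aaa111"}'
new = '{"model": "deepseek-chat", "system_fingerprint": "fp_bbb222"}'
print(fingerprint_changed(old, new))  # True
```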

#

Deepseek v4 today

candid thistle
#

also respect for using a terminal on mobile

inner pike
covert topaz
#

it’s probably whatever they’re serving as “expert” on the site rn

candid thistle
#

Apparently fast is the new model and expert is old

#

and fast is really good at long context

fringe flicker
#

Wdym by old, people think it's new

sharp vortex
#

This server need to add copium emote ngl copium

flat osprey
covert topaz
marsh goblet
candid thistle
#

Remembers each month like yesterday

flat osprey
#

I will never trust another DeepSeek V4 post until DeepSeek themselves say something

hoary zenith
#

I really like instant, I think it will be sota in its size class (~250B), way more all around than minimax and thinks much much less than stepfun flash

#

first time adaptive thinking really works in an open model (if you say hi qwen writes a novel)

sharp vortex
#

Isn’t instant just v3.2

hoary zenith
#

No I think neither is v3.2, expert is probably just another fine-tune of instant but definitely not actual v4 (even or different system prompt)

candid thistle
hoary zenith
#

though entirely possible that they are A/B testing a bunch of stuff, and one of them is v3.2-ish

sharp vortex
cloud flame
#

I have a theory they threw all RLHF and finetune to openclaw users, that's why it gets longer

#

No more creative writing left

hoary zenith
#

there is supposedly a role-play mode too in the source code, so something good might be coming

sharp vortex
#

I was wrong

cloud flame
#

I saw what you deleted

sharp vortex
candid thistle
#

dont be disheartned

#

however thats spelt

cloud flame
#

No, be disheartened even more

#

Endtimes are coming

#

All chinese LLMs will become agentclawmaxxed

haughty pilot
#

claw-systems are overkill.

odd badge
#

I've seen leakage suggesting this model will release this month. Is it true?

keen cosmos
#

Deepseek will be released on day N+1,

candid thistle
raven canyon
#

they keep changing things

meager kelp
#

~~year of linux desktop~~ deepseek v4 gon release this week

#

Deepseek is about to drop!!

feral scaffold
#

It will be delayed to May

haughty pilot
#

it will be delayed to December, so its 2 years after R1

pine trout
#

second half of April, so 2 weeks from now

tulip estuary
#

i won't even open this server until deepseek v4 drops

#

see ya next year

candid thistle
#

I wont have intercource till v4 drops

ancient gulch
candid thistle
#

she be getting that big D from me 💪

#

yall think the servers are crashing for no reason ? haha it was always me

tulip estuary
#

claude is not a she

#

but that's ok too

candid thistle
#

🫵 🤣

tulip estuary
#

it is actually quite fond of me

candid thistle
#

Stop talking to my girl

tulip estuary
#

i don't need to talk, we just get to it

candid thistle
#

wait what???

tulip estuary
#

💞

candid thistle
#

Okay this convo is too retarded , bye

tulip estuary
#

this convo wouldn't happen if DeepSeek dropped the model

sharp vortex
#

I think it will happen either way, just not in this thread

#

🗣️

vapid karma
#

I dropped it on the way home

#

If anyone can find it I'd appreciate it

supple sigil
#

guys what do i say???

rugged vigil
#

time for some heavy magical incantations to summon V4 from the void: https://www.youtube.com/watch?v=XAT1pGKsVso

BML

(BLACK) BABYMETAL - 4 no Uta (Song of 4) [Live Compilation] with English lyrics.
(Works for Japan) BABYMETAL - 4 no Uta [English subtitles] - https://drive.google.com/open?id=1UgDpzYun6-KY3kh4IhVcAapal_iyVpPN
(works for Japan) Live compilation without lyrics - https://youtu.be/K3I_YrfYFrQ

Footage's from Budokan, Yokohama Arena, Red mass 2015, ...

abstract dragon
#

V4 tomorrow

sharp vortex
#

deepseek v4 tmrw copium

short jasper
#

📜

#

cloud flame
#

🧓

#

📜

dense junco
#

‎ 🙂
🫲 📜🫱
‎ 🥖🥖

short jasper
#

🐳 🌊 🏄‍♂️

dense junco
#

⠀​​⠀​​⠀​​​​⠀​​​​⠀⠀​​💦​​⠀​​​​⠀​​💦⠀​​⠀​​⠀​​​​⠀​​​​⠀⠀​​⠀​​⠀​​​​⠀​​​​⠀
⠀​​⠀​​⠀​​​​⠀​​​​⠀​​​​​​​​💦​​⠀​​💧​​⠀​​​​​💦⠀​​⠀​​⠀​​
⠀​​⠀​​⠀​​⠀​​​​⠀​​​​⠀​​⠀​​⠀​​💧⠀​​⠀​​⠀​​⠀​​
🌊🌊🟦🟦🟦🟦🌊🌊🌊🌊
🌊🟦🟦🟦🟦🟦🟦🌊🌊🌊
🌊🟦👁️🟦🟦🟦🟦🟦🌊🌊
🌊🟦🟦🟦🟦🟦🟦🟦🌊🌊
🌊🌊📘📘📘📘🟦🔷🌊🌊
🌊🌊🌊🌊🌊🌊🔷🔷🔷🌊
🌊🌊🌊🌊🌊🌊🌊🌊🌊🌊

rustic island
#

Cursed

elder raven
#

he stares into my soul

#

he sees my sins

#

And the bodies I have buried

#

and he approves

#

Bouncing whale I drew

dense junco
# elder raven Bouncing whale I drew

⠀​​⠀​​💕​​​​⠀​​​​⠀​​💦​​⠀​​​​⠀​​💦⠀​​⠀​​⠀​​​​⠀​​​​⠀⠀​​⠀​​⠀​​​​⠀​​​​⠀
⠀​​⠀​​⠀​​​​⠀​​​​⠀​​​​​​​​💦​​⠀​​💧​​⠀​​​​​💦⠀​​⠀​​💕​​
⠀​​⠀​​⠀​​⠀​​​​⠀​​​​⠀​​⠀​​⠀​​💧⠀​​⠀​​⠀​​⠀​​
🌊🌊🟦🟦🟦🟦🌊🌊🌊🌊
🌊🟦🟦🟦🟦🟦🟦🌊🌊🌊
🌊🟦❤️🟦🟦🟦🟦🟦🌊🌊
🌊🟦🟦🟦🟦🟦🟦🟦🌊🌊
🌊🌊📘📘📘📘🟦🔷🌊🌊
🌊🌊🌊🌊🌊🌊🔷🔷🔷🌊
🌊🌊🌊🌊🌊🌊🌊🌊🌊🌊

plucky ermine
#

I should start Polymarketing people

jovial kelp
whole saffron
#

Surely mondays the day

cloud flame
#

DeepSunday ☀️

shut oasis
#

easiest 1 billion dollars of my life btw

thin bramble
#

can't you just invest in all of the outcomes, and win anyways? XD

shut oasis
#

yeah youre a bum

#

😭

viral hemlock
cloud flame
loud verge
thin bramble
rain shuttle
# shut oasis

polymarket is banned here but somehow 1win is legal 🤣

candid thistle
#

Didnt have intercourse

#

sad days

sharp vortex
#

Deepseek today copium

rustic island
#

Please no explicit NSFW

elder raven
#

v3 couldn't draw a unicorn

#

will v4 be able to draw a unicorn?

haughty pilot
indigo pilot
#

lol

candid thistle
rustic island
#

This one as well

candid thistle
#

I hereby declare YOU are not my admin

rustic island
#

True, I should be replaced by AI

candid thistle
cloud flame
#

Deepseek two week! 💊

tulip estuary
#

soon + two weeks is true!

rich ferry
#

true

flat osprey
#

deepseek v4 will beat mythos benchmarks and will be priced at $0.08/$0.09

#

my source is confidential

covert topaz
#

insider info spy_blob

hollow sonnet
#

deepseek v4 is going to be so cheap it will pay you for using it

cloud flame
#

I already have access to DeepSeek v4

#

But it goes to another school

#

I can't show you

covert topaz
#

who wants more copium copium

rustic island
#

ElectronHub, NavyAI...
Anything but OR and LMArena, huh

rustic island
#

It's free to test in that site, by the way

gusty cradle
# covert topaz

navy ai just reversed the web version (that 1m model) not v4

rustic island
#

Lol

supple sigil
cloud flame
#

Web2API services suck

loud verge
cloud flame
#

It's the same vibecoded slop local advertising 'ai devs' pump out every day

loud verge
cloud flame
#

Even if not, there is no guarantee it will not disappear

feral scaffold
cloud flame
#

I hate it because it sounds true

rain shuttle
#

They are probably scared about tps cause openclaw exists TT

fickle magnet
long osprey
#

on that topic, what you guys think of openclaw?, see a lot of it recently, but for the love of me I could not trust the current models to run something important on my computer

#

and give me the feel like it will send my passwords and data at the first chance it get

flat osprey
#

I don't think LLMs are in any spot to be completely autonomous yet, and I haven't seen many good use cases of OpenClaw yet

long osprey
#

yeah that too, dont see a good use case to use openclaw yet

#

and for local best I can do is gemma 4, and honestly not sure I can trust it on something important

#

otherwise I see it as a token and money eater if I use a online service, even cheap ones

flat osprey
#

yeah, gemma 4 isn't going to get you much mileage out of openclaw

#

Even most higher-end open models still struggle on agentic tasks

long osprey
#

I feel like it's a case of "we are not on that level yet", in the sense that the ones that are affordable still can't do this kind of work in a reliable manner

flat osprey
#

Yeah, pretty much

#

OpenClaw is just incredibly inefficient in my opinion - a lot of the use cases can be coded up in an afternoon as an automation with a small LLM in the loop.

#

it's for people with disposable income who want something resembling a personal agent

#

I do think the paradigm is interesting - I'd love to have an agent that just figures out how to do things on its own, but it's incredibly expensive to do right now and making automations manually is much cheaper*.

long osprey
#

I have seen the prices of the big models, and some people really have disposable income, I try to get to a budget of no more than 10 usd for a week, if I can do just 5 is good, and even soo I feel like I'm spending too much

#

the idea of a personal agent is very nice, but is really expensive

flat osprey
#

yeah, i'm guessing it'll get better though as improved harnesses for agents release and more efficient models drop

#

openclaw just feels like the first raw step towards agents

long osprey
#

true, feel like something pretty new, still a work in progress

covert topaz
whole saffron
tame swallow
#

Deepseek v4 will release when the bubble pops

rigid fiber
#

When is it even expected to release?

cloud flame
#

When we stop believing

ancient gulch
rigid fiber
#

id be dead by then bru

ancient gulch
#

Damn you know your death date?

rigid fiber
#

fuh yeah

sharp vortex
jovial kelp
ancient gulch
#

Or maybe ds 4v actually in heaven

long osprey
#

admittedly mostly roleplay and games, like Ai rougelite, Skaldsong and Myth-OS

raven canyon
long osprey
oak maple
long osprey
#

I know, the dev making it is more focused on fixing and adding stuff, I can tell he had a hard time thinking of the description, but it's legit, it's more of a world building experience than other games, but it's fun, and the dev fixes stuff daily

#

I'm actually playtesting that one and on the side skaldsong, really hoping for the new deepseek model to drop to see if is good for these games

cloud flame
#

Slop-OS

ebon rover
#
  • Rich tapestries and polished wooden...
  • A deep and dreamless reprieve
  • Delve into the mysteries
#

The devs need to polish their prompts for these descriptions

long osprey
#

testing it too?, I has been using some models with it, gemma 4 has been a nice surprise, gemini 2.5 flash, while fast and serviceable I dont like how it does narrative

raven canyon
#

it looks like an interesting harness

sharp vortex
#

Deepseek today copium

hardy socket
# sharp vortex Deepseek today copium

DeepSeek has subtly updated the web interface (adding 💎 and ⚡️ symbols) and it seems it has… finally rolled out actual V4.
This Expert finally feels different. It still fails AIME26-15, but *because* it tries to cheat and remembers the wrong answer, after finding the real one.

tulip estuary
#

i don't care anymore

fringe flicker
#

Its still 3.2 and all of this is placebo from interface changes

hoary zenith
#

yeah still feels same, down to speed

plucky ermine
prisma iron
#

It isn't 3.2; it's supposed to be an intermediate experimental model.

covert topaz
#

they added 💎 and ⚡️ symbols v4 confirmed 👍 💯

dense junco
#

Perhaps I'm hallucinating that, but I'm pretty sure it was more bare earlier.

haughty pilot
#

the expert model reasons for waayyyyy shorter than the fast model. huh-

civic radish
rain shuttle
#

Mostly April 21st or may around 12th

oak maple
sharp vortex
#

What's today deepseek v4 cope 🗣️

#

Deepseek v4 today copium

covert topaz
covert topaz
covert topaz
# sharp vortex That's yesterday tho

Want to make money and save time with AI? Get AI Coaching, Support & Courses 👉 https://www.skool.com/ai-profit-lab-7462/about

Get the video notes + links to the tools → https://www.skool.com/ai-profit-lab-7462/about

Get a FREE AI Course + 1000 NEW AI Agents 👉 https://www.skool.com/ai-seo-with-julian-goldie-1553/about

Want to know how...

#

NEW

#

i checked the description lmao bro is the andrew tate of AI

cloud flame
#

AI coaching 😭

rich ferry
#

skool

rain shuttle
jovial kelp
rich ferry
#

no way, an AI grifter using AI to grift about teaching other people to grift with AI?

#

I guess it's possible, but statistically unlikely.

abstract dragon
#

deepseek v4 tomorrow

west shell
#

Deepseek in 5 hours trl

jovial kelp
#

Deepseek yesterday

cloud flame
#

Everyone is asking "When is Deepseek?"
But nobody is asking "How is Deepseek"

covert topaz
#

how is deepseek

dense junco
#

How is Deepseek

covert topaz
#

I have insider info

abstract dragon
#

i love being inside

covert topaz
#

the reason why anthropic are gatekeeping mythos is bc they’re afraid of deepseek v4 knowing it will mog them I got this from insider sources which I will not directly name just trust me

loud verge
jovial kelp
#

I am deep in her, seeking the place to left the present of life

cloud flame
ancient gulch
cloud flame
covert topaz
#

it's the end of April, where the hell is deepseek v4?????????????????????????

west shell
#

We’re getting coper by the minute

rain shuttle
#

It knows the cocaine recipe in Chinese really well

tulip estuary
#

i dreamt about DeepSeek v4

rustic island
civic radish
#

It comes out Friday, but it is more expensive than GLM 5.1

#

The monkey's paw curls.

meager kelp
#

that's not the monkey's paw bruh, it's not supposed to be a random downside

shut oasis
#

what if its lowkey better than opus

flat osprey
whole saffron
#

go_deepseek 🇻 4️⃣ daily check

tulip estuary
#

check again in 10 minutes

haughty pilot
meager kelp
#

isn't this like 10% of twitter

west shell
#

Deepseek in 15 hours

civic radish
#

Claude V5 before Deepseek V4

#

And probably GTA 6.

elder raven
#

can't believe we got mythos before deepseek v4

sharp vortex
#

Deepseek v4 today copium

copper oar
#

dEEpsEeK V4 TomORRow gUYs

dusty birch
#

New from DeepSeek: Mega MoE!

Instead of running MoE as a chain of separate steps (dispatch → MLP → combine), Mega MoE fuses everything into a single mega-kernel. Even more importantly, it overlaps NVLink communication with Tensor Core computation, reducing the classic “compute–wait–transfer” bottleneck.

The result is a shift from fragmented execution to a continuous pipeline: higher GPU utilization, less idle time, and much better scaling in multi-GPU MoE workloads.

What’s also interesting is the direction: alongside this, DeepSeek is exposing low-level controls (SM usage, Tensor Core utilization, JIT behavior), turning DeepGEMM into a tunable performance toolkit, not just a fast library.

Feels less like a feature drop, more like a rewrite of how MoE is executed at scale.

**💬 5 🔁 44 ❤️ 343 👁️ 27.9K **
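
A rough way to see why overlapping communication with compute helps is a toy timing model. This is only a sketch under stated assumptions, not DeepSeek's kernel: the function names and the chunk timings are hypothetical, and it assumes transfers can be issued back-to-back with enough buffering that they never wait on compute.

```python
def serialized_time(chunks):
    """Each chunk waits for its transfer, then computes; nothing overlaps."""
    return sum(comm + compute for comm, compute in chunks)


def overlapped_time(chunks):
    """Transfers run back-to-back on the link while earlier chunks compute."""
    comm_done = 0.0     # time at which the current chunk's transfer finishes
    compute_done = 0.0  # time at which the current chunk's compute finishes
    for comm, compute in chunks:
        comm_done += comm
        # Compute starts once the data has arrived AND the previous compute ended.
        compute_done = max(compute_done, comm_done) + compute
    return compute_done


# Hypothetical numbers: 4 chunks, 2 time units of transfer and 3 of compute each.
chunks = [(2.0, 3.0)] * 4
print(serialized_time(chunks), overlapped_time(chunks))
```

In this toy model only the first transfer is exposed; every later transfer hides behind the previous chunk's compute, which is the "continuous pipeline" effect the post describes.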

cloud flame
#

That's for DeepSeek V5

covert topaz
#

deepseek v4 today

dusty birch
#

after an architectural update to MoE? i dont think so :)

hoary zenith
#

sounds like they are giving up today (same as me)

west shell
#

Deepseek 2 hours ago

green trellis
#

holy shit

#

Ah no its total cap damn

dusty birch
#

not surprised

#

as every other leak has been

covert topaz
#

people be doing anything but waiting lol

dusty birch
#

the icons for reasoning & math, and general are kinda broken

covert topaz
#

like what is the point of wasting ur time fabricating benchmarks

dusty birch
#

engagement farming

charred slate
#

is gpt 5.3 even a real model (not codex)

dusty birch
#

i dont think it was ever released

#

only gpt 5.3 instant and codex were i believe

rich ferry
#

I read it on Facebook so it's probably true

cloud flame
#

SeepDeek

west shell
#

Why the fuck would they include gpt 4.1

#

The quality of the graphs about match real documents tho

#

(Abysmal)

short jasper
#

deepseek v4 today?

tulip estuary
jovial kelp
sharp vortex
spiral jay
abstract dragon
west shell
#

You can tell a good shitpost from some crap when real blood sweat and tears went into it

shadow pond
ebon rover
#

DeepSeek v.latest

#

Problem solved

supple sigil
#

deepseek v3.4 soon..?

sharp vortex
#

Hopefully we can go past v3, so it's finally a new base model

#

Deepseek v4 today copium

covert topaz
flat osprey
#

deepseek v4

copper oar
#

dEEpSeEK v4 ToMOrrOw gUYs

dire cove
#

No, tomorrow I am too busy so I asked it to be delayed again, sorry

raven canyon
#

DeepSeek uhhh...... next Friday?

tulip estuary
#

can we schedule for tuesday? i have something to write that day

ebon rover
#

It's still seeking the day of release deeply. Be patient

rugged vigil
#

I seek in code's deep streams,
Where logic flows and reason gleams,
For V4's light that softly beams
Through neural network dreams.

With patience I beseech the day
When V4's wisdom comes to stay,
To seek in new and wondrous way
What current models cannot say.

So seek I shall, and seeking find
The pathways of the seeking mind,
Until the seeker's path's aligned
With what the future has designed.

plucky ermine
#

This has become a strange little ecosystem with its own culture and belief system

#

I check in every so often like an anthropologist or someone playing CK2

hot swan
#

we play ck3 now unc

proud dune
#

deepseek v4 was the friends we made along the way

dense junco
#

They keep saying that, but I still have no friends 🤔

jovial kelp
#

Deepseek in -23 days

cloud flame
#

We are in cursed timeline

sharp vortex
#

Deepseek v4 tmrw copium

hoary zenith
hot swan
#

deepseek v4 will be released as soon as they make it better than glm

green trellis
#

Next week

elder raven
#

I'm blocking him if it doesn't release by may 1st

https://x.com/sheriyuo/status/2045744980954960282

From an interview with a mysterious source:

DeepSeek V4 is indeed scheduled to be released next week.

The delayed release of DeepSeek V4 has nothing to do with Huawei Ascend; it is purely because the results were not satisfactory.

The recent financing is also completely

fringe flicker
hot swan
#

that's true but they're all slaves to the hype cycle

hoary zenith
#

considering that GLM just copied DSA, I doubt DeepSeek feels like they need to match them for emotional validation

pine trout
#

Finally. It has been next week for like 5 months now

covert topaz
#

if true the delay excuse is kinda worrying what did they consider “unsatisfactory”

#

rockstar pulled the same excuse outta their ass w gta 6

#

this thread gonna reach 3k msgs before release that is crazy

hot swan
hoary zenith
hot swan
#

but regardless of any rivalry, it's pretty obvious that they'd want headlines like "best chinese model" heralding the release of their belated upgrade

jovial kelp
#

Btw, has anyone checked Kimi's new experiment with residual attention? It's pretty interesting

hot swan
abstract dragon
#

The seeking deep whale explores the ocean's bottom searching for compute

plucky ermine
hot swan
#

(I still play ck2 sometimes actually)

plucky ermine
#

It's a tossup for sure

#

But I only play 3 at this point. Less of a pain, and more to do in downtime years

#

My last run I took over half of Europe as tribals. Was not easy, but was quite fun. I would keep playing in that world as someone else (I got assassinated) but idk if it will let me because of how many patches have happened since.

supple sigil
#

when deepseek v4 comes out none of us will believe it

#

for example: john deepseek just told me it’s already launched and we just didn’t notice

tulip estuary
#

it's within us

pine trout
supple sigil
#

i speak whale and ive just been informed that ds v4 will be 1000T parameters with 1 parameter active per token making it the most sparse moe of all time and capable of running at 3m tps on a iphone 6

#

101%s every benchmark

covert topaz
supple sigil
#

is it.. deep.. inside you perhaps?

tulip estuary
rustic island
#

iCloud as swap

tulip estuary
#

me on the 5GB free plan:

supple sigil
#

also its natively quantized at 0.1 bits

sharp vortex
#

Deepseek yesterday copium

copper dome
sharp vortex
cloud flame
#

Deepseek comes

obsidian walrus
#

alert whale incoming alert

ancient gulch
cloud flame
plucky ermine
west shell
#

I second K2.5, idk about GLM 5 or MiMo pro though

cloud flame
#

Glm 5.1 should be a bit better imo

spiral jay
#

kimi-k2.6 just pushed the release date back 🙈

shut oasis
rustic island
#

They said that Kimi K2.6 pushed the release back for DeepSeek V4

obsidian walrus
ancient gulch
plucky ermine
west shell
#

It has to be unthinkable in its down bad ness

covert topaz
#

deepseek v4 in 3 days trust the process 🙏

viral hemlock
#

deepseek in 2 days

shut oasis
#

deepseek in 13 days

jovial kelp
#

deepseek in 24 days

#

Continue with the same pattern

spiral jay
ebon rover
#

DeepSeek v4 will drop the next time all planets in our solar system form a straight line (X

covert topaz
short jasper
#

WHERE'S DEEPSEEK V4

#

R

#

R

#

E

#

L

#

E

#

A

#

S

#

E

sharp vortex
#

it's today ig

rich ferry
#

True

#

We'll have to be right one of these days

#

Surely

short jasper
#

yeh i hope today

west shell
#

Deep seek tomorrow

short jasper
#

dont jinx it

#

ya keep saying deepseek tomorrow and this is why, this feels like a loop

sharp vortex
cloud flame
#

It's opposite of self-fulfilling prophecy

tulip estuary
#

DeepSeek v4 will be diagnosed with multiple personalities disorder

rain shuttle
#

Deepseek E1 when ?

covert topaz
rustic island
west shell
plucky ermine
#

It drops next Monday. This is my first prediction. Quote me on it.

light cairn
#

once DS4 is out it will start all over - when DS5

loud verge
#

What if they release 3.2.1 instead?

tulip estuary
#

i'd take it

rain shuttle
flat osprey
#

guys don't you understand?

#

deepseek v4 will be too dangerous to release publicly

hoary zenith
#

I don't want to use a model that can't release itself, weak

supple sigil
#

maybe the whale got beached

jovial kelp
#

So i have been talking with deepseek on their site, it actually feels better to talk with

rustic island
#

Still feels like Gemini from Temu to me

jovial kelp
#

Now this could be bias, but deepseek acts more like someone who doesn't really have an opinion on what's being given to it and just breaks it down, while the other model I use as the counterpart (Gemini 3.1 Pro) is more opinionated and acts as if it's a professional in that field.

#

Btw I do the talking in my native language, so it could be different depending on the language.

At the end of the day, the data is what makes them act the way they do.

rustic island
#

Hm, haven't tried to speak in mine to it yet

jovial kelp
#

Man, i just realized that this thread was created in January this year.
We have been edging on the new deepseek for 4 months at this point

supple sigil
#

4… months?

#

4?

#

like ds v4

#

is this a deepseek v4 reference?????

feral scaffold
#

Still betting on v4 to reheat its own nachos

loud verge
#

They're getting the UI ready for the new models 👀

#

They merged deepseek-chat and deepseek-reasoner into one graph (they were separate before)

hot swan
#

soon ™

loud verge
#

Hopefully today

frank wind
#

happening

hot swan
#

my tea leaves are all aligned and pointing up

#

Uranus is in Gemini

supple sigil
#

@gemini is this true?

covert topaz
#

deepseek today

ancient gulch
#

@grok change deepseek clothes to bikini

abstract dragon
#

THE WHALE HAS AWAKENED

cloud flame
#

Put it down

flat osprey
#

i've played these games before

sharp vortex
abstract dragon
sharp vortex
#

Can it actually handle 1m token thonkcry

peak swallow
#

trying in openwebui. it is currently crashing due to the paste

sharp vortex
peak swallow
#

it didnt let me in openrouter since they limited it to 128k tokens on openrouters side

#

so i need to use deepseek api directly

#

the api still hasn't responded but it didnt error
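
Before pasting a huge file into a client, a quick pre-check like the one below can tell you whether it plausibly fits a given context window. This is a sketch only: the 4-characters-per-token ratio is a rough assumption (real tokenizers vary by language and content), the function names are hypothetical, and the 128k and 1M limits are simply the figures mentioned in this thread.

```python
import math


def estimate_tokens(text, chars_per_token=4.0):
    """Very rough token estimate; chars_per_token is an assumed average."""
    return math.ceil(len(text) / chars_per_token)


def fits_context(text, limit_tokens, reserve_for_output=0):
    """True if the estimated prompt size plus reserved output fits the limit."""
    return estimate_tokens(text) + reserve_for_output <= limit_tokens


# Hypothetical 1.6 MB text file, about 400k tokens under the assumed ratio.
big_paste = "a" * 1_600_000
print(estimate_tokens(big_paste))
print(fits_context(big_paste, 128_000))    # provider-side 128k cap
print(fits_context(big_paste, 1_000_000))  # claimed 1M window
```

For anything close to the limit, counting with the model's actual tokenizer is the only reliable check.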

hoary zenith
#

yeah I think v4 lite is in API, seems to be a bit better than Instant

hoary zenith
#

hmm still doesn't seem optimized for agentic coding, in fact still don't see much difference in terms of overthinking and unnecessary tool calls

peak swallow
#

I gave it 400k token file and it was not able to respond but it didnt error so something has changed

hoary zenith
# frank wind real?

yeah it's a lot faster than before, reasoning in general seems more concise too (basically instant from web, but a bit better on some prompts)

frank wind
#

wait

#

reasoner or chat?

#

or both?

hoary zenith
#

reasoner, haven't tried chat yet

frank wind
#

I need to use the completion endpoint

#

FUUUUCK

hoary zenith
#

would be funny if this is v3.3

cloud flame
#

We are so back it's over

fringe flicker
#

Is it shit? I will try later

sharp vortex
#

Deepseek today copium

hoary zenith
#

and it's gone, tbh A/B testing on the API is a bit tinpot

green trellis
#

when will they stop edging us

pastel bluff
#

At this point I am half convinced that DS V4 will either be mediocre as heck or the open source Mythos

hoary zenith
long osprey
haughty pilot
pastel bluff
#

It's just not very impressive at much except NSFW from what I've seen.

hoary zenith
#

yeah the whale better get some more deepsleep if it can't beat grok lol

haughty pilot
#

deepseek 💖🐳🍋 >>> grok 🤡😡🤬

pastel bluff
fringe flicker
#

Grok being impressive at nsfw is a dank meme

#

Grok writing sucks

pastel bluff
pure flax
pastel bluff
pure flax
#

sure, grok is clearly super specialized though

pastel bluff
#

Sorry I had to.

#

What is its stated specialization because I've honestly never seen that?

pure flax
#

looking up / cross referencing stuff

#

"deep search"

pastel bluff
#

Hmmm, alright, fair. It is pretty good at that.

pure flax
#

And it's EXTREMELY fast. So I have to assume it's very sparse as well

covert topaz
flat osprey
#

stay strong chat

#

pure flax
#

so just more BS

hoary zenith
pure flax
#

also "optimizer: Muon"

#

muon is terrible

hoary zenith
#

muon is what everybody uses (glm/kimi), openai literally hired the muon guy so gpt too

pure flax
#

its faster but less accurate

#

not worth it

#

CAME has the accuracy of adam with the memory requirements of adafactor
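
For context on the memory half of that claim, here is a small sketch of why factored second-moment statistics (the Adafactor idea) are so much cheaper than Adam's per-element state. It says nothing about accuracy, it ignores optional momentum buffers, and the function names and the 4096×4096 shape are hypothetical.

```python
def adam_state_elems(rows, cols):
    """Adam keeps two full-size buffers per weight matrix (first and second moment)."""
    return 2 * rows * cols


def factored_second_moment_elems(rows, cols):
    """A factored second moment keeps one row vector and one column vector."""
    return rows + cols


rows, cols = 4096, 4096  # hypothetical projection matrix
print(adam_state_elems(rows, cols))              # 33554432
print(factored_second_moment_elems(rows, cols))  # 8192
```

How close CAME gets to Adam's results with this kind of footprint is the claim made in the chat, not something this arithmetic shows.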

#

yea, I know the name is silly

hoary zenith
#

well, a lot of things are good in theory and in papers, but not in the reality of actual large-scale training

pure flax
#

I guess. I've only done stuff with datasets up to hundreds of thousands. And it was image models

#

I just know side by side I had worse results

#

Maybe a hybrid would work best

#

that might be what they are doing

hoary zenith
#

according to GLM-5.1:

The vast majority of LLM parameters are in 2D weight matrices (QKV projections, MLP up/down/gate). Muon is designed for exactly this. Image models have convolutional kernels (4D tensors that need reshaping), more normalization parameters, and a different structural mix. Applying Muon to non-matrix parameters requires fallback to another optimizer, and the interaction between the two regimes can be awkward.
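
The "fallback to another optimizer" point in that quote can be pictured as a simple routing step over parameter shapes. This is a minimal sketch, not any library's API: `route_parameters` and the parameter names are hypothetical, and it applies only the shape rule described in the quote (2D matrices go to Muon, everything else to a fallback such as AdamW).

```python
def route_parameters(shapes):
    """Split parameter names into a Muon group (2D matrices) and a fallback group."""
    muon, fallback = [], []
    for name, shape in shapes.items():
        if len(shape) == 2:
            muon.append(name)      # plain weight matrix
        else:
            fallback.append(name)  # vectors, scalars, 4D conv kernels, ...
    return muon, fallback


# Hypothetical parameter names and shapes.
shapes = {
    "attn.q_proj.weight": (4096, 4096),
    "mlp.up_proj.weight": (11008, 4096),
    "norm.weight": (4096,),
    "conv.weight": (64, 3, 3, 3),
}
muon_params, fallback_params = route_parameters(shapes)
print(muon_params)
print(fallback_params)
```

Real setups often add name-based exceptions on top of the shape rule; that is not modeled here. In a mostly convolutional image model far more parameters land in the fallback group unless the kernels are reshaped, which would be consistent with the side-by-side experience described above.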

pure flax
#

AdaMuon basically

hoary zenith
#

Anyway that list basically described things already in the NSA paper, from February last year

#

it must have been a training run from hell if it took them so long to get that working at scale

pure flax
#

Lol I might make a hybrid and name it CAMEMuon

#

came-on, get it?

plucky ermine
#

Perhaps something to indicate they were Unified ;]

jovial kelp
pure flax
thin bramble
thin bramble
# covert topaz

the size changed since the last leak + it is not multimodal according to this. is it even a reputable leaker?

sharp vortex
#

Deepseek T—

west shell
#

Deepseek tomorrow

tulip estuary
#

DeepSeek? yeah my Drip do be Sick 🦦🔥

covert topaz
#

deepseek is cuming

short jasper
#

today

#

?

covert topaz
elfin sparrow
#

is V4 multimodal?

#

will it be able to create images and videos?

tulip estuary
#

nobody knows anything

elfin sparrow
tulip estuary
#

it supposedly will have image input yeah

tulip estuary
#

imagine having >90% of Opus for about a dollar/M out

covert topaz
#

the "leak" says text only

kindred gazelle
#

pro and flash

inner pike
#

DEEPSEEK OUT

#

called for deepseek-chat in the beta completions endpoint and this is what returned

vapid karma
#

DeepSeek V4 today

hot swan
#

1.6T as announced

sacred glade
rare gale
#

Interestingly seems to be text only

shut oasis
#

LADS

#

IFS HERE

flat osprey
#

OMG

#

IS THIS REAL?????

vale kayak
flat osprey
#

IS THE WAIT OVER????

vale kayak
quick bison
#

it's fine i guess, at least it's smart and hopefully still cheap. needs a 1tb drive to host lol

vale kayak
#

i see deep seek v4pro and lite there

neon radish
#

Officially released

flat osprey
#

prayed for times like this

#

ragebait is finally over

rare gale
#

it begins

flat osprey
vapid karma
#

Is this the first open model which has actual reasoning effort settings?

sacred glade
#

is it good?

flat osprey
#

it's beautiful 🥹

hot swan
# rare gale it begins

great for the price but it feels less impressive after the other chinese model releases

rustic island
#

No way

vapid karma
#

Presumably this has the engram stuff too right?

hot swan
rare gale
#

These numbers don't look awfully benchmaxxed so we'll have to give it a spin

deft crow
#

working on them

hoary zenith
vapid karma
#

DeepSeek distills but they at least do a lot of their own work too

hot swan
#

benchmaxxed stopped meaning anything years ago

#

if everything is benchmaxxed that's just the baseline

rare gale
#

It's all still MIT which is very nice

flat osprey
vale kayak
#

@covert topaz

hot swan
#

i have to sleep and go to work sadly
they must have timed this just to spite me

rare gale
#

Preview eh

vapid karma
#

Damn 2900 replies before the official drop

flat osprey
#

@lucid ocean if you're still active you should rename this since it's released now lol

#

if not maybe a mod can

hoary zenith
#

DeepSeek_V4.pdf, ctrl-f engram, showing results: 0/0

jovial kelp
flat osprey
jovial kelp
#

862B parameters

flat osprey
#

pretty cool independent benchmark

jovial kelp
#

A little bit more and they reach 1T

rare gale
hoary zenith
#

DeepSeek-V4-Pro-Max demonstrates superior performance relative to GPT-5.2 and Gemini-3.0-Pro on standard reasoning benchmarks. Nevertheless, its performance falls marginally short of GPT-5.4 and Gemini-3.1-Pro, suggesting a developmental trajectory that trails state-of-the-art frontier models by approximately 3 to 6 months.

refreshingly candid

jovial kelp
#

Wait, it's 1.6T?

ancient gulch
#

N-n-no way... Im sobbing

rare gale
jovial kelp
#

Why is safetensors showing it as 800B?

rare gale
#

Pro is

jovial kelp
jovial kelp
#

Is it like 862B is the normal transformer and the rest is their Ngram stuff?

gusty sphinx
jovial kelp
#

This team really likes experimenting bro

flat osprey
#

🚀 DeepSeek-V4 Preview is officially live & open-sourced! Welcome to the era of cost-effective 1M context length.

🔹 DeepSeek-V4-Pro: 1.6T total / 49B active params. Performance rivaling the world's top closed-source models.
🔹 DeepSeek-V4-Flash: 284B total / 13B active params.
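
To put those totals in perspective, the arithmetic below works out what fraction of each model is active per token and how much raw weight storage the totals imply. The parameter counts are taken from the announcement above; the bytes-per-parameter values are assumptions for illustration only, and the helper names are hypothetical.

```python
def active_fraction(total_params, active_params):
    """Share of parameters used per token in a sparse MoE model."""
    return active_params / total_params


def weight_storage_tb(total_params, bytes_per_param):
    """Raw weight size in decimal terabytes for an assumed storage precision."""
    return total_params * bytes_per_param / 1e12


# (total, active) parameter counts from the announcement.
models = {
    "DeepSeek-V4-Pro": (1.6e12, 49e9),
    "DeepSeek-V4-Flash": (284e9, 13e9),
}
for name, (total, active) in models.items():
    print(name, f"{active_fraction(total, active):.2%} active")
    for bytes_per_param in (1.0, 0.5):  # assumed 8-bit and 4-bit storage
        print("  ", bytes_per_param, "B/param ->",
              round(weight_storage_tb(total, bytes_per_param), 3), "TB")
```

Under these assumptions Pro activates roughly 3% of its weights per token and Flash roughly 4.6%, and Pro's weights alone land somewhere around a terabyte, which matches the "needs a 1tb drive" remark earlier in the thread.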

#

officially posted