#Deepseek V4
Then what indicates it's v4? The UI change?
can confirm passes car test
Where is that code cypher again
This is significantly worse than V3.2 (no DeepThink) in my "how to fix lag in a Paper server" quick niche knowledge test
Drake
though turning off deepthink makes it get it wrong
I mean there is at least one new model, both instant and expert can't be v3.2.
Also before this, there were speculations that deepseek was working on a lite model as well as a normal sized model
probably a small lite model
Wouldn't that just be 3.2 in think and non-think mode?
Is it me or does this have a bit of Gemini vibe?
May 2025 knowledge cutoff, states it's v3
Oh, I guess there's a Deepthink toggle already
Both instant and expert have their own set of reason and non-reasoning toggles
For me it feels like the expert is faster than instant
seems to switch between english and chinese in cot across turns
guy on reddit claims big svg improvements: https://www.reddit.com/r/DeepSeek/comments/1sf02s0/imo_this_has_to_be_the_infamous_v4_model/
The expert might be performing worse on my side. It's not even rendering LaTeX properly. The $ delimiters aren't properly outputted
That kinda seems consistent with deepseek, sometimes it just outputs latex without any delimiters
Could be
could be a hallucination
yeah possibly
ngl might not be v4, maybe like a v3.5 or maybe v4 had too much hype and im expecting too much
Bahaha! I had it give me a test story, I shit you not, it started like this: "The air still tastes like ozone and burnt sugar."
Trioxygen instantly spotted 😎
Sound yummy
hell nah
The instant is an update. The old v3.2 had 128k context limit. The instant still has a 1m context window , so it's not the exact model as the deepseek api.
Is deepseek the only company that releases models on web first before api?
I assume it to be a lite model, although at the same price as 3.2
Openai used to do that, but they don't anymore

No way this is v4
Yeah, the performance boost isn't there. But it's deepseek, they never aim for performance. They always aim for efficiency
Would be crazy if it's really efficient, like perhaps so efficient that it's a quarter of the current price
omg it has INSANE knowledge
what did you ask it?
like its the only model capable to answer correct, also it gave me the info no other model even with web search could find
can 3.1 pro do it?
just ask something very niche - you will be surprised
not even close
It just got the hair color of a character incorrect, from one of the most popular cartoons of all time =P
w/ expert?
Yes
I think it was actually a unique hallucination though. It basically said (Character with pink hair) hid behind the poofy pink hair of (character with blonde hair)
ask it to write song lyrics
So it like, fucked up the logical cohesion
with thinking on for me it declines this, but with thinking off it seems to hallucinate it
Idk what you'd even call that. It couldn't separate two sets of traits
"you smell like ozone and sadness"
Hey, that's new at least
when can we get the dubesor review
It actually made another weird logical mixup, but I'm ngl, the writing is kind of peak so far
the model is acting kinda stupid for me
Yeah, these were some dumdum mistakes
it said its system instructions say to respond in the language of the user or something alike, but it keeps responding in chinese when im speaking to it english, even when told this it responded in the wrong language
I think we got A/B tested to the Deepseek VTarded model
it avoids really heavily to even say its system prompt in its cot, just kinda acts like it doesnt exist
Maybe the real deepseek v4 was the friendship we made in this thread 😢
Its coding and spatial intelligence have been really poor imo
Not a proper harness but still
I hate everyone in this thread
for example (the right is correct)
I like it idk
What if the reason they didn't release it on api first and didn't release the official blog related to the model is because they first want to check what the sentiment of the people is. And depending on that they'll either call it deepseek v4 or deepseek v3.3
they are not that bad, if it's new arch it's v4, simple as
it will only count as v4 if they use new base model imo
I think they use new base model because knowledge cutoff is now 2025
-# v3 base model is late 2024 iirc
so the expert reasons shorter, faster, and worse than instant for me too
feels like the instant actually improved from before this update too
my guess is V4-lite
expert feels like an improvement especially with 1M context, but doesn't feel groundbreaking
still need to test though
We don't know Expert's context limit
In the web app it won't even allow me to input 40k tokens worth of text
Instant is probably 1m
model kept stating 1M after several attempts, very consistent
It's likely instructed to say its context length is 1M tokens
Could be. They might have limited the web app so people don't abuse it
might be outdated system prompt
havent they added this like a month ago?
expert mode? people just started seeing it today
could be, but i doubt they'd add a new model and leave the system prompt unchanged
actually, after researching a bit, not sure if this is V4
capabilities seem the same as when the model updated in February
One quick vibe check question that might or might not mean anything: DeepSeek V3.2 via API (no prompt) vs rumored DeepSeek 4 Expert with Deep Thinking on. I regenerated the answer once in the UI because the output was bad enough for me to think it was a bug/unlucky gen, but the second gen was about as bad
fooled by copium yet again
DS v3.2 is vague and cautious, giving me some non-answers and imprecisions, which I actually don't dislike since my question is also pretty vague ("how to fix lag in a Paper server?"), doesn't hallucinate much, old knowledge cutoff showing since it recommends Timings which are deprecated
Now, new DS:
- Makes strong affirmations that are dubious and/or not always verifiably correct (lag is almost always MSPT tag, Paper settings have the highest impact, chunk pre-generation is non-negotiable...)
- Weird contradiction problem. Writes a section about activation range that has no settings related to activation range. Writes about the "Paper" settings that are supposedly unique to Paper, but then talks about settings that are not exclusive to Paper. Tells me to not use Aikar's flags and then writes a section telling me to use Aikar's flags instead (what?)
- Useless Gemini-like arbitrary analogies and judgements of importance, except in a dumber way than Gemini (Highest Impact, the 80/20 Rule, Non-Negotiable for New Worlds, Cheat Sheet for Immediate Relief, the real killer)
- Makes up approximately 70% of all statistics ever: (reduces redstone update loops by 70-90%? In 99% of circuits? Generating a chunk on the fly uses 100x less CPU? Why a magic single threaded score threshold of 2500?)
In my screenshot, green means on point and helpful, white means not ideal but mostly helpful, OK means not exactly relevant but fine to mention in context, then there's yellow, orange and red for mistakes
N means needs nuance, O means outdated, BS means hallucination, ? means gibberish, poor choice of words or dubious advice
I've noticed the Gemini-like analogies a lot with the new DS - it definitely feels distilled on some Gemini outputs, or at least caught a lot of the cliches Gemini started spitting out. Doesn't feel like a significant bump in intelligence either compared to V3.2, so if this is V4, this is upsetting lol
I highly doubt this is V4 though
Hows the long context perf ?
hmmm
Feeding it some scientific questions I tend to reuse, it's got... surprisingly original insights? Not the smartest model I've seen though
the personality change is remarkable
not necessarily good
Deepseek today 
But only for chinese IPs
you guys are missing the key info
anybody tested the long context for this model?
that this model is 10B parameters natively 1 bit quant and will cost 0.005 | 0.009
beats opus 4.6 too
at 1000 tps on cpu
and it's not transformer based, it's something new but nothing leaked yet
im gonna cry because of this thread
Deepseek api releases tomorrow?
19 minutes to WW3, id like to see deepseek v4 before i die
I checked the ui with the expert model
Not good
Didn't like it
this is what happens when you people don't let them cook in peace! 🤬
it’s been delayed another 2 weeks
Wait when
about 1 hour ago
Man i can't keep up with this
deepseeking
im coining this verb right here right now
Deepseeking missiles?
Don't forget the drone engine buzzing before it exploded us haha
i want - but no alc plz--- 🍼
so u peeps are really excited for this expert mode, hm?
I think deepseek team has performance anxiety >v<
we need to encourage them to come out with treats!
🍬🍬🍬
heeeeeeeere deepseek deepseek deepseek.....
no wait...
Wow you are a very helpful assistant.
Thank you for being so helpful, I will adjust your weights to be more like this
I am very grateful for such a very helpful assistant existing out here
heeeere deepseek deepseek deepseek! come get your affirmations 💖🍼
what the fuck is wrong with u
Your going to breastfeed DeepSeek?
when can i sleep with the whale
sorry i dont have any GPUs...... yet
and not quite the feeding-type booba.... also yet. will get soon tho.
do u *want to be breast-fed? 💖
im lowering the amount of days til deepseek comes out by luring it out
By that middle bit do you mean you’re not old enough or are trans and haven’t been on estrogen for long enough?
whispers
-# second
Fun, I haven’t been able to transition irl yet but I will soon
i wish u the very best! 💖💖💖💖
may no evil regulations will stand in ur way!
For the last part: Idfk I only somewhat recently stopped figuring out my gender identity and my sexuality is still being influenced by other factors and I’m still figuring it out, but right now I feel disgusted by the idea of sexual interactions although that may change in the future
Thanks :3
samsies on the interactions.
but.. if ur thinking about transitioning, PLEASE get the ball rolling now.
dunno where u live, but u wanna start the verification process as soon as possible. RhE process is easy stop, so get it rolling NOW.
or u do DIY, in which case, take all the time u need ❤️
What's wrong with this thread 
nothing. is very valid thread
Oh I’ve been questioning for like 8 months, wanted to transition for most of those, but up until recently parents weren’t on board
But now that 4 medical professionals have recommended I do they have changed their mind
evil parents >;(
nice that they see it now.
I could get on E really soon if I wanted to, I have had offers. But I’m only 15 rn and don’t want to lose trust with parents, and can probably get some officially within a year
Yeah
I think ur parents expect it by now... just a guess tho--
Well I could illegally get estrogen smuggled in is what I was referring to, my parents would not expect that at all
Deepseek v4 api releases today
I refuse to go to the app or website, I am not trying the new stuff until it's on api
Another guy I knew said that the reasoning is weird, but the intelligence on the smarter one is actually improved overall and probably the best among current open weight by a tiny bit
Which is about as much as I wanted, so hopefully it's out soon
Is it released?
Whatever is in the UI is really dumb and hallucinates a lot
My friend had the opposite opinion, but he probably has a different use case
He did agree the mistakes it makes are weird but said that in most cases it is better than other open weight models, it is just the mistakes it does make are extremely weird despite it being better on average
E.g. this test is a flavor, it can barely maintain basic text coherence
I am afraid it will be tuned for agentic generic stuff to cater for Chinese hype
I am curious what they cooked to make the mistakes so unique
I don't think I've seen it before, even in GPT-3.5 or tiny models
It kinda looks to me like a Gemini distill issue
Gemini tries to make clever analogies and this model is too dumb to copy them, so it becomes wild hallucinations
Could also be engram activations interacting with the core weights in unexpected ways
Maybe if we had a second engram model it would fail in similar ways I mean
It's over, they don't release deepseek because Gemini distill didn't work out. And if it comes out, it will be agentslop meant for Chinese openclawers
Deepsover 💔
If Gemma 4 was released as a DS model you guys would be losing your minds right now
I already lost
true, I already lost
The intelligent
Khaaaaaaaaan
Everytime someone says this I hear elder scrolls thieves guild dudes saying shadow hide you
I remember that shallowhide knowledge cut off was February 2026
But Deepseek web's is May 2025
hmmmmmm
Shallowhide was instant version of deepseek v4.
So instant has newer knowledge cutoff compared to so called expert? What
can see that happening easily
openclaw is like the holy grail in china
Yes, I was referring to news about lines (???) to install openclaw. That's like very concerning
Shallowhide was out on electronhub
not openrouter.
it was the best thing Google have released in awhile
still not as great as 2.5 pro once was (imo don’t come after me)
Idk
Shallowhide claimed to be chinese lab
I checked my history with shallowhide.
I think expert deepseek web's not V4.
what is this
was trying to see if it can replicate the actual engram struct when asked
because there is a high chance they train some internal data for easter eggs
I read the description of shallowhide and it was made for roleplay
I don't think any LLM has knowledge cutoff of 2026
no but sonnet 4.6 has training data till jan 2026
What If shallowhide lied about his knowledge cutoff
hmmmhh
Deepseek v4 on api tmrw 
I bet deepseek really wants a discord gooners soul who has a random girl as their pfp
That Random Girl looks like Kitana cosplay or some WWE gimmick
u already said this like - 3 times in this chat >v<
Dumbseek
Gemini is trained like really bizarrely I think
It clearly has a ton of world knowledge, but struggles with basic things like tool calls
u mean tool call formatting?
No, like using tools very well
Gemini is an anomaly.
The other weird western model to me is Grok 4.1 Fast. Idk why it is randomly the most insightful, nuanced model, and then just kind of sucks at everything else.
quantity > quality for the training data probably
Google has like
all the data
so they probably just put in as much as they can
Seems likely
Or why it sucks so hard at coding / agentic when Google famously has a huge collection of some of the highest quality code written, and they clearly care a lot about agentic coding.
I honestly have no clue why their models are so bad at agentic tasks
Especially 3.1 Pro which felt like a downgrade tools-wise
Apparently it's a huge focus in their new model, as all three labs apparently just created god-level models
If Gemma 4 being a monster is any indication, that is true
I still find the FoodtruckBench result wild. Like at 32B? And even the MoE? Insane.
I mean some could be preferences it just happens to have, like preferring to buy upgrades ASAP in a game, but you can't really fake the rest of it, I got slaughtered in that game.
Yeah it took me a couple tries to match then beat what the models were getting lol
If you play it enough you can beat Opus
I just didn't understand how it worked in my one run. Idk if it was the interface or what, but I didn't get the timing of ordering, hiring, selling, etc.
☝️ 🤓 30.7B params*
and yeah the gemma 4 MoE is only around 4B activated which is crazy
I generally consider MoE models are half as smart as dense ones
So like 35B XAB MoE ~18B dense
yeah that makes sense
but dense models are really slow on my mac mini 💀
Kimi K2.5 with 1000B 32AB does not feel like 180B
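For reference, the two rules of thumb being compared here can be written down. This is only a sketch of community heuristics, not a measured result. The halving rule is the one stated above. The geometric-mean rule (square root of total times active parameters) is an assumption about where the 180B figure comes from, since sqrt(1000 × 32) ≈ 179. The function names are made up for illustration.

```python
import math


def half_rule(total_b: float) -> float:
    """Dense-equivalent size in billions under the 'MoE is half as smart' heuristic."""
    return total_b / 2


def geometric_mean_rule(total_b: float, active_b: float) -> float:
    """Dense-equivalent size in billions as sqrt(total * active).

    Assumed to be the rule behind the 180B figure; it is a folk heuristic,
    not something the thread or any lab has confirmed.
    """
    return math.sqrt(total_b * active_b)


if __name__ == "__main__":
    # 35B-total MoE under the halving rule
    print(half_rule(35))
    # 1000B total, 32B active (the Kimi K2.5 figures quoted in the thread)
    print(round(geometric_mean_rule(1000, 32)))
```

Neither rule says anything about how a model "feels" in practice; they only explain where the numbers in the messages above might come from.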
What are you comparing to of a recent 180B model?
well did we even have any 180B dense models in recent times? its all MoE now, no?
and for the closed models we don't know params at all
Llama had one 400B
Merges of 120B or smth bigger by Drummer I think
Nvidia Nemotron Ultra which is like 200+B
Deepseek V4 will release and be Opus Tier and cheap!
I will give my left testicle if they beat opus with v4
Plot Twist: Monkey Paw heard it and it should beat ANY opus model ever existed so you will have to lose testicle
ykw, I am with you. I will give my right testicle if they beat opus with v4
I think you both just want to cut off your testicles
It's a gender dysphoria, probably
I just hope is good enough and has a good price, models get more and more expensive
deepseek has always prioritized making deployment and inference as cheap as possible
wouldn’t be surprised if v4 was significantly more optimized for cost/compute than v3 while still being larger
ill ask John DeepSeek and get his opinion at some point
I think its going to be a bigger model and hence higher cost
maybe he has the answers
Same energy 
Trying to figure out when to use what, when...
i don't understand what is GLM-5-Turbo
DeepSeek removed the "3.2 🎉" from the front page. Was that yesterday? Anyways, more copium.
WE ARE SO BACK
where? still see it on the main page
They removed it from here I circled with a ✔️
Theres only the top part now.
Maybe it's something on my end tho
they could be planning to serve models other than v3.2 in the app from now on - could be v4, but i won't get my hopes up too much
the fact that they kept the announcement on the top makes me skeptical
Yeah they really fucked all this up.
I think it was like a fine-tune of 5 to be agentic focused...and then they released the agentic focused 5.1 a week later?
GLM turbo was tweaked 5.0 for speed with little tradeoff in accuracy
yes
lurk lurk
waiting for deepseek
drives a fella to madness
day 4096 no deepseek v4
llama 2 has begun to string together coherent thoughts. it speaks to me like no other
that was beautiful.
no no
deepseek v4 tomorrow
ok
It's tomorrow every day
3 Different models
1 Lite, 1 Expert, 1 Vision
I can see lite and vision as the same model and expert
-# assume expert doesn’t allow image because it’s expensive
Deepseek v4 might actually have web search in api 
-# and it’s gonna be the cheapest
I want free web searches grounding like in Google Api
Well, free tier of
Or straight up included like some Groks
🐳
🐳
🐳
🐳
🫡

👀 and you didn't tell me?????
It was late.
You had since fallen asleep.
I needed an outlet.
It was only one time !!
Anyways, I will try better next time ( ̄ ‘i  ̄;)
i still see it...
On the card that leads to the chat--do you still see it there? #1461340695746056192 message
ooooh ur right, its only at the top and not on the panel. my bad, sorry.
I dunno 3.2 is pretty op
hey babe it’s time for your daily dose of deepseek cope
Source (Chinese): https://www.ithome.com/0/937/682.htm ITHome, April 10 — DeepSeek founder Liang Wenfeng recently revealed in…
Deepseek today 
When will you learn
New system_fingerprint!!
Usually new system fingerprints come out when new model is on...
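A rough way to check this yourself, sketched under assumptions: it takes already-parsed, OpenAI-style chat completion responses (plain dicts) that carry a `system_fingerprint` field, and reports where the value changes. The function name and the sample fingerprint strings are invented for illustration. A changed fingerprint only indicates that the backend configuration changed, which may or may not mean a new model.

```python
from typing import Iterable, List, Optional, Tuple


def fingerprint_changes(
    responses: Iterable[dict],
) -> List[Tuple[int, Optional[str], str]]:
    """Return (index, previous, current) wherever system_fingerprint changes.

    Responses without the field are skipped rather than counted as a change.
    """
    changes = []
    previous = None
    seen_first = False
    for i, resp in enumerate(responses):
        current = resp.get("system_fingerprint")
        if current is None:
            continue  # field missing in this response
        if seen_first and current != previous:
            changes.append((i, previous, current))
        previous = current
        seen_first = True
    return changes


if __name__ == "__main__":
    # Made-up sample log; real values would come from saved API responses.
    log = [
        {"system_fingerprint": "fp_aaa"},
        {"system_fingerprint": "fp_aaa"},
        {"model": "example-model"},
        {"system_fingerprint": "fp_bbb"},
    ]
    print(fingerprint_changes(log))
```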
Deepseek v4 today
is this OR or deepseek api ?
also respect for using a terminal on mobile
Direct DS API

it’s probably whatever they’re serving as “expert” on the site rn
Apparently fast is the new model and expert is old
and fast is really good at long context
Wdym by old, people think it's new
This server need to add copium emote ngl 
i remember hearing something similar in february 😭
it’s real this time trust 
The delusion here is insane
Boy is traumatized
Remembers each month like yesterday
I will never trust another DeepSeek V4 post until DeepSeek themselves say something
Tmrw then

I really like instant, I think it will be sota in its size class (~250B), way more all around than minimax and thinks much much less than stepfun flash
first time adaptive thinking really works in an open model (if you say hi qwen writes a novel)
Isn’t instant just v3.2?
No I think neither is v3.2, expert is probably just another fine-tune of instant but definitely not actual v4 (even or different system prompt)
Instant seems to be the lite model while expert is an old model
though entirely possible that they are A/B testing a bunch of stuff, and one of them is v3.2-ish
Isn’t the date cutoff of Expert 2025?
I have a theory they threw all RLHF and finetune to openclaw users, that's why it gets longer
No more creative writing left
there is supposedly a role-play mode too in the source code, so something good might be coming
I was wrong
I saw what you deleted
F
tbh they keep on changing the models and a lot of users are experiencing A/B testing
dont be disheartned
however thats spelt
No, be disheartened even more
Endtimes are coming
All chinese LLMs will become agentclawmaxxed
nooooo not claw stuff ;(((((((((
its so boring.... just- blegh no thank u to the claws.
like- we already got agents--- and claude.
thats almost a claw-type thing, just without the heartbeat stuff.
claw-systems are overkill.
I've seen leakage suggesting this model will release this month. Is it true?
Deep will be release on day N+1,
who gon tell him
Yes
let him live
they keep changing things
It will be delayed to May
it will be delayed to December, so its 2 years after R1
second half of April, so 2 weeks from now
I wont have intercource till v4 drops
V4 droping or not you aint gonna have intercourse
lmao tell that to claude
she be getting that big D from me 💪
yall think the servers are crashing for no reason ? haha it was always me
it is actually quite fond of me
Stop talking to my girl
i don't need to talk, we just get to it
wait what???
💞
Okay this convo is too retarded , bye
this convo wouldn't happen if DeepSeek dropped the model
guys what do i say???
time for some heavy magical incantations to summon V4 from the void: https://www.youtube.com/watch?v=XAT1pGKsVso
(BLACK) BABYMETAL - 4 no Uta (Song of 4) [Live Compilation] with English lyrics.
(Works for Japan) BABYMETAL - 4 no Uta [English subtitles] - https://drive.google.com/open?id=1UgDpzYun6-KY3kh4IhVcAapal_iyVpPN
(works for Japan) Live compilation without lyrics - https://youtu.be/K3I_YrfYFrQ
Footage's from Budokan, Yokohama Arena, Red mass 2015, ...
V4 tomorrow
deepseek v4 tmrw 
🙂
🫲 📜🫱
🥖🥖
🐳 🌊 🏄♂️
⠀⠀⠀⠀⠀⠀💦⠀⠀💦⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀💦⠀💧⠀💦⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀💧⠀⠀⠀⠀
🌊🌊🟦🟦🟦🟦🌊🌊🌊🌊
🌊🟦🟦🟦🟦🟦🟦🌊🌊🌊
🌊🟦👁️🟦🟦🟦🟦🟦🌊🌊
🌊🟦🟦🟦🟦🟦🟦🟦🌊🌊
🌊🌊📘📘📘📘🟦🔷🌊🌊
🌊🌊🌊🌊🌊🌊🔷🔷🔷🌊
🌊🌊🌊🌊🌊🌊🌊🌊🌊🌊
Cursed
he stares into my soul
he sees my sins
And the bodies I have buried
and he approves
Bouncing whale I drew
⠀⠀💕⠀⠀💦⠀⠀💦⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀💦⠀💧⠀💦⠀⠀💕
⠀⠀⠀⠀⠀⠀⠀⠀💧⠀⠀⠀⠀
🌊🌊🟦🟦🟦🟦🌊🌊🌊🌊
🌊🟦🟦🟦🟦🟦🟦🌊🌊🌊
🌊🟦❤️🟦🟦🟦🟦🟦🌊🌊
🌊🟦🟦🟦🟦🟦🟦🟦🌊🌊
🌊🌊📘📘📘📘🟦🔷🌊🌊
🌊🌊🌊🌊🌊🌊🔷🔷🔷🌊
🌊🌊🌊🌊🌊🌊🌊🌊🌊🌊
I should start Polymarketing people
Aight, bet
Surely mondays the day
DeepSunday ☀️
easiest 1 billion dollars of my life btw
can't you just invest in all of the outcomes, and win anyways? XD
why would that work?
ur the type of person that buys both 3x bear and bulls
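To make the "why would that work?" answer concrete, here is a small arithmetic sketch. It assumes a simplified market where each share pays 1.0 if its outcome happens, exactly one listed outcome happens, and there are no fees. The prices are invented, and the function name is hypothetical. Buying every outcome only profits when the prices sum to less than 1.

```python
from typing import Sequence


def all_outcomes_profit(prices: Sequence[float]) -> float:
    """Profit from buying one share of every outcome.

    Assumes exactly one outcome resolves to a 1.0 payout and ignores fees.
    """
    cost = sum(prices)
    return 1.0 - cost


if __name__ == "__main__":
    # Prices sum to 1.03: guaranteed loss of about 0.03 per full set.
    print(round(all_outcomes_profit([0.30, 0.45, 0.28]), 2))
    # Prices sum to 0.95: about 0.05 profit, which rarely lasts in a real market.
    print(round(all_outcomes_profit([0.30, 0.40, 0.25]), 2))
```

If none of the listed outcomes can happen, the payout is 0 and the whole stake is lost, so the assumption that one outcome must resolve matters.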
It's possible that none of these are true. I mean, if they don't release by May 15
you bet they will release it in a day with insider information (or not)
polymarket is banned here but somehow 1win is legal 🤣
Deepseek today 
Please no explicit NSFW
what a pretty unicorn 💖 (by current "fast" and thinking mode)
wait how what he said wasnt explicit NSFW ?
This one as well
your bias is quite evident KP
I hereby declare YOU are not my admin
True, I should be replaced by AI
I mean a 4B can replace you
DeepSeek founder Liang Wenfeng recently clearly stated in an internal communication that the highly anticipated next-generation flagship large model DeepSeek V4
Deepseek two week! 💊
soon + two weeks is true!
true
deepseek v4 will beat mythos benchmarks and will be priced at $0.08/$0.09
my source is confidential
insider info 
deepseek v4 is going to be so cheap it will pay you for using it
I already have access to DeepSeek v4
But it goes to another school
I can't show you
ElectronHub, NavyAI...
Anything but OR and LMArena, huh
It's free to test in that site, by the way
navy ai just reversed the web version (that 1m model) not v4
Lol
1000% just an ad for navy ai
Web2API services suck
I'm afraid to say guys, The ad is working on me 😭
It's the same vibecoded slop local advertising 'ai devs' pump out every day
Yeah, it's also probably a scam
Even if not, there is no guarantee it will not disappear
it should bc it's prolly gonna be worse than v3
I hate it because it sounds true
They are probably scared about tps cause openclaw exists TT
I have it on good authority that it's actually going to be $0.06/$0.07.
on that topic, what you guys think of openclaw?, see a lot of it recently, but for the love of me I could not trust the current models to run something important on my computer
and give me the feel like it will send my passwords and data at the first chance it get
I'm not the biggest fan, since it tends to just eat tokens whenever it gets the chance
I don't think LLMs are in any spot to be completely autonomous yet, and I haven't seen many good use cases of OpenClaw yet
yeah that too, dont see a good use case to use openclaw yet
and for local best I can do is gemma 4, and honestly not sure I can trust it on something important
otherwise I see it as a token and money eater if I use a online service, even cheap ones
yeah, gemma 4 isn't going to get you much mileage out of openclaw
Even most higher-end open models still struggle on agentic tasks
I feel like a case of "we are not on that level yet", in the sense that the ones that are affordable still can't do his kind of work in a reliable manner
Yeah, pretty much
OpenClaw is just incredibly inefficient in my opinion - a lot of the use cases can be coded up in an afternoon as an automation with a small LLM in the loop.
it's for people with disposable income who want something resembling a personal agent
I do think the paradigm is interesting - I'd love to have an agent that just figures out how to do things on its own, but it's incredibly expensive to do right now and making automations manually is much cheaper*.
I have seen the prices of the big models, and some people really have disposable income, I try to get to a budget of no more than 10 usd for a week, if I can do just 5 is good, and even soo I feel like I'm spending too much
the idea of a personal agent is very nice, but is really expensive
yeah, i'm guessing it'll get better though as improved harnesses for agents release and more efficient models drop
openclaw just feels like the first raw step towards agents
true, feel like something pretty new, still a work in progress
i resent it with all my soul and i hope they go bankrupt or get shutdown i would make a deal with the devil to have it be shut down, thank you for your attention to this matter
Deepseek v4 will release when the bubble pops
When is it even expected to release?
When we stop believing
2027
Damn you know your death date?
fuh yeah
assuming this month until this month is over
Put yourself on life support, maybe you can have a taste of deepseek then
smart idea
Or maybe ds 4v actually in heaven
What are you having it do?
admittedly mostly roleplay and games, like Ai roguelite, Skaldsong and Myth-OS
you’ve got Mythos? /j
this game, yes https://store.steampowered.com/app/4513270/MythOS/
Step into a universe where your most creative ideas meet a living, breathing simulation. MythOS is an AI-driven systemic sandbox that merges the infinite freedom of a Tabletop RPG with the complex mathematical depth of a Grand Strategy epic. It doesn't just "generate text"—it simulates a persistent, deterministic universe where every choice yo…
Coming soon
"It doesn't just a — it b" in the description of the game 💀
I know, the dev making it is focusing more on fixing and adding stuff. I can tell he had a hard time thinking on the description, but it's legit. It's more of a world building experience than other games, but it's fun, and the dev fixes stuff daily
I'm actually playtesting that one and on the side skaldsong, really hoping for the new deepseek model to drop to see if is good for these games
Slop-OS
- Rich tapestries and polished wooden...
- A deep and dreamless reprieve
- Delve into the mysteries
The devs need to polish their prompts for these descriptions
testing it too? I have been using some models with it. gemma 4 has been a nice surprise. gemini 2.5 flash is fast and serviceable, but I don't like how it does narrative
it looks like an interesting harness
Deepseek today 
https://x.com/i/status/2044042285432643851
believe it! 
i don't care anymore
maybe the real v4 is the 4 versions we got along the way
https://discord.com/channels/1091220969173028894/1353707165939925032
https://discord.com/channels/1091220969173028894/1407376333444616272
https://discord.com/channels/1091220969173028894/1445075717128720485
https://discord.com/channels/1091220969173028894/1330820209812050002
Its still 3.2 and all of this is placebo from interface changes
yeah still feels same, down to speed
I don't think so, the expert model was saying some weird shit when I tried it
It isn't 3.2; it's supposed to be an intermediate experimental model.
they added 💎 and ⚡️ symbols v4 confirmed 👍 💯
I think they added them to the chat screen. The energy symbol and gem did not appear until very recently. It would just say "expert" or "instant."
Perhaps I'm hallucinating that, but I'm pretty sure it was more bare earlier.
the expert model reasons for waayyyyy shorter than the fast model. huh-
It will be a gem!
Mostly April 21st or may around 12th
imagine they're just running gemma 4 31B 💀
#1461340695746056192 message
That's yesterday tho
ill look for chinese article number 2831941
Want to make money and save time with AI? Get AI Coaching, Support & Courses 👉 https://www.skool.com/ai-profit-lab-7462/about
Get the video notes + links to the tools → https://www.skool.com/ai-profit-lab-7462/about
Get a FREE AI Course + 1000 NEW AI Agents 👉 https://www.skool.com/ai-seo-with-julian-goldie-1553/about
Want to know how...
NEW
i checked the description lmao bro is the andrew tate of AI
AI coaching 😭
skool
The coaching: how to do prompt engineering 🤣
Are you sure that person is not AI? The way he speaks reminds me of an AI-generated voice
no way, an AI grifter using AI to grift about teaching other people to grift with AI?
I guess it's possible, but statistically unlikely.
deepseek v4 tomorrow
Deepseek in 5 hours 
Deepseek yesterday
Everyone is asking "When is Deepseek?"
But nobody is asking "How is Deepseek"
how is deepseek
How is Deepseek
I have insider info
i love being inside
the reason why anthropic are gatekeeping mythos is bc they’re afraid of deepseek v4 knowing it will mog them I got this from insider sources which I will not directly name just trust me
@rustic island MOD!!! SMACK HIS BALLS OR SOMETHIN
I am deep in her, seeking the place to left the present of life
More accurate if the orange cat is claude
its the end of april where the hell is deepseek v4?????????????????????????
surely in 2 minutes
We’re getting coper by the minute
It knows the cocaine recipe in Chinese really well
i dreamt about DeepSeek v4
That's the most well sourced rumor I've seen in a while
that's not the monkey's paw bruh, it's not supposed to be a random downside
what if its lowkey better than opus
The AI bubble pops and every single model provider increases prices by 10x
DeepSeek panics and releases V4, which ends up being a minorly improved V3.2 priced at $120/$200
monkey's paw
🇻 4️⃣ daily check
check again in 10 minutes
There are thouuuuusands of these awful people---
isn't this like 10% of twitter
Deepseek in 15 hours
can't believe we got mythos before deepseek v4
Deepseek v4 today 
dEEpsEeK V4 TomORRow gUYs
New from DeepSeek: Mega MoE!
Instead of running MoE as a chain of separate steps (dispatch → MLP → combine), Mega MoE fuses everything into a single mega-kernel. Even more importantly, it overlaps NVLink communication with Tensor Core computation, reducing the classic “compute–wait–transfer” bottleneck.
The result is a shift from fragmented execution to a continuous pipeline: higher GPU utilization, less idle time, and much better scaling in multi-GPU MoE workloads.
What’s also interesting is the direction: alongside this, DeepSeek is exposing low-level controls (SM usage, Tensor Core utilization, JIT behavior), turning DeepGEMM into a tunable performance toolkit, not just a fast library.
Feels less like a feature drop, more like a rewrite of how MoE is executed at scale.
That's for DeepSeek V5
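Rough back-of-envelope for the "compute–wait–transfer" point in the Mega MoE post, as a toy timing model only. It assumes every chunk has the same compute and transfer cost and that a transfer can run while the previous chunk computes. The function names and the numbers are made up for illustration; this is not DeepGEMM's API and not a measurement.

```python
def serial_time(n_chunks, compute, transfer):
    """Unfused baseline: each chunk is transferred, then computed, with no overlap."""
    return n_chunks * (compute + transfer)


def overlapped_time(n_chunks, compute, transfer):
    """Two-stage pipeline: the transfer for chunk i+1 runs while chunk i computes."""
    if n_chunks == 0:
        return 0.0
    # First transfer is exposed, last compute is exposed,
    # every step in between costs whichever stage is slower.
    return transfer + (n_chunks - 1) * max(compute, transfer) + compute


if __name__ == "__main__":
    # Hypothetical costs in arbitrary time units.
    n, c, t = 4, 2.0, 1.0
    print("serial:    ", serial_time(n, c, t))
    print("overlapped:", overlapped_time(n, c, t))
```

In this model the gain is largest when compute and transfer take about the same time, since that is when the most transfer time can be hidden behind compute.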
deepseek v4 today
after an architectural update to MoE? i dont think so :)
sounds like they are giving up today (same as me)
Deepseek 2 hours ago
people be doing anything but waiting lol
the icons for reasoning & math, and general are kinda broken
like what is the point of wasting ur time fabricating benchmarks
engagement farming
is gpt 5.3 even a real model (not codex)
yep, they cancelled v4 because of this. so we've got another 6 months until v5
I read it on Facebook so it's probably true
SeepDeek
could you imagine tho
Why the fuck would they include gpt 4.1
The quality of the graphs about match real documents tho
(Abysmal)
deepseek v4 today?
yes
Deepseek really loves experimenting and trying unique approaches
ain't no way someone wasted their time to do this 😭
its ai generated bro
You can tell a good shitpost from some crap when real blood sweat and tears went into it
DeepSeek V5 before DeepSeek V4. 💀
deepseek v3.4 soon..?
You forget 3.3 🗣️
Hopefully we can go past v3, so it's finally new base model
Deepseek v4 today 
I have paranoid thoughts and I think everytime you say this they add one more day until release
So it's today then 
deepseek v4
dEEpSeEK v4 ToMOrrOw gUYs
No, tomorrow I am too busy so I asked it to be delayed again, sorry
DeepSeek uhhh...... next Friday?
can we schedule for tuesday? i have something to write that day
It's still seeking the day of release deeply. Be patient
I seek in code's deep streams,
Where logic flows and reason gleams,
For V4's light that softly beams
Through neural network dreams.
With patience I beseech the day
When V4's wisdom comes to stay,
To seek in new and wondrous way
What current models cannot say.
So seek I shall, and seeking find
The pathways of the seeking mind,
Until the seeker's path's aligned
With what the future has designed.
This has become a strange little ecosystem with its own culture and belief system
I check in every so often like an anthropologist or someone playing CK2
we play ck3 now unc
deepseek v4 was the friends we made along the way
They keep saying that, but I still have no friends 🤔
Deepseek in -23 days
We are in cursed timeline
Deepseek v4 tmrw 
Still one degree removed (likely knows engineers at deepseek from her time at seed/tsinghua), but finally something much more credible than usual grifters:
deepseek v4 will be released as soon as they make it better than glm
Next week
I'm blocking him if it doesn't release by may 1st
From an interview with a mysterious source:
DeepSeek V4 is indeed scheduled to be released next week.
The delayed release of DeepSeek V4 has nothing to do with Huawei Ascend; it is purely because the results were not satisfactory.
The recent financing is also completely
Don't need to make it better if it's a third of the price
that's true but they're all slaves to the hype cycle
considering that GLM just copied DSA, I doubt DeepSeek feels like they need to match them for emotional validation
Finally. It has been next week for like 5 months now
if true the delay excuse is kinda worrying what did they consider “unsatisfactory”
rockstar pulled the same excuse outta their ass w gta 6
this thread gonna reach 3k msgs before release that is crazy
If "GLM just copied DSA" I feel that would make underperformance more embarrassing, not less
that's a bit like saying nvidia is embarrassed that nemotron sucks compared to others who use nvidia hardware to make better models, it's clear that deepseek operates at a different level with a slightly different value
but regardless of any rivalry, it's pretty obvious that they'd want headlines like "best chinese model" heralding the release of their belated upgrade
That's expected, they be experimenting with a lot of stuff
Btw has anyone checked kimi's new experimentation with residual attention? it's pretty interesting
nvidia's primary business IS hardware so I feel the analogy doesn't work here
anyway, I don't even see the delayed release as a really bad sign, it might just mean they feel they can do much better with some tweaks.
The seeking deep whale explores the ocean's bottom searching for compute
Yeah idk why I said 2. Maybe I did more obsessive check-ins on people in that one
(I still play ck2 sometimes actually)
It's a tossup for sure
But I only play 3 at this point. Less of a pain, and more to do in downtime years
My last run I took over half of Europe as tribals. Was not easy, but was quite fun. I would keep playing in that world as someone else (I got assassinated) but idk if it will let me because of how many patches have happened since.
when deepseek v4 comes out none of us will believe it
for example: john deepseek just told me it’s already launched and we just didn’t notice
it's within us
I Heard Opus 4.7 is the stealth model for Deepseek v4
i speak whale and ive just been informed that ds v4 will be 1000T parameters with 1 parameter active per token making it the most sparse moe of all time and capable of running at 3m tps on a iphone 6
101%s every benchmark
deepseek is inside me
is it.. deep.. inside you perhaps?
erm but how can 3gb of ram load 1000T parameters 🤓
iCloud as swap
me on the 5GB free plan:
also its natively quantized at 0.1 bits
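For the record, the joke spec still doesn't fit. A quick sanity check of the arithmetic, using only the numbers from the messages above (1000T parameters, 0.1 bits each, 3 GB of RAM, 5 GB iCloud plan) and decimal units; `weight_bytes` is just a helper name picked for this sketch.

```python
def weight_bytes(n_params, bits_per_param):
    """Raw weight storage in bytes, ignoring any metadata or activation memory."""
    return n_params * bits_per_param / 8


TB = 1e12
GB = 1e9

if __name__ == "__main__":
    joke_model = weight_bytes(1000e12, 0.1)  # 1000T params at 0.1 bits each
    print(f"weights: {joke_model / TB:.1f} TB")
    print(f"fits in 3 GB RAM: {joke_model <= 3 * GB}")
    print(f"fits in 5 GB iCloud: {joke_model <= 5 * GB}")
```

1000T parameters at 0.1 bits is 1e14 bits, so 12.5 TB of weights before anything else gets loaded.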
Deepseek yesterday 
so deepseek v4 will disappoint a lot
Deepseek v4 comes with a new tokenizer that increases the token count for some tasks by up to 100%
Deepseek comes
whale incoming 
I just want a new model for roleplay
Tried MiMo Pro, GLM-5, and Kimi K2.5?
I second K2.5, idk about GLM 5 or MiMo pro though
Glm 5.1 should be a bit better imo
kimi-k2.6 just pushed the release date back 🙈
lmaoo
its out
They said that Kimi K2.6 pushed the release back for DeepSeek V4
its so fucking over
I heard mimo is censored? Yeah glm-5 good for roleplay but im still curious how will ds v4 improve in roleplay
Idk what weird shit people are hitting mimo with but I don't find it very censored
It has to be unthinkable in its down bad ness
deepseek v4 in 3 days trust the process 🙏
deepseek in 2 days
deepseek in 13 days
DeepSeek v4 will drop the next time all planets in our solar system form a straight line (X
does that mean our solar system is gay 🤔
it's today ig
yeh i hope today
Deep seek tomorrow
dont jinx it
ya keep saying deepseek tomorrow and this is why, this feels like a loop
we stuck in deepseek matrix 
It's opposite of self-fulfilling prophecy
DeepSeek v4 will be diagnosed with multiple personalities disorder
Deepseek E1 when ?
Another one?
Deepseek is tomorrow until it’s today
It drops next Monday. This is my first prediction. Quote me on it.
once DS4 is out it will start all over - when DS5
What if they release 3.2.1 instead?
i'd take it
new name maybe
I don't want to use a model that can't release itself, weak
maybe the whale got beached
So i have been talking with deepseek on their site, it actually feels better to talk with
Still feels like Gemini from Temu to me
Now this could be bias, but deepseek acts more like someone who doesn't really have an opinion on what's given to them and just breaks it down, while the other model that i use as the counterpart (it's gemini 3.1 pro) is more opinionated and acts as if it's a professional in that field.
Btw i do the talking in my native language, so it could be different depending on the language.
At the end of the day, the data is what makes them act the way they do.
Hm, haven't tried to speak in mine to it yet
Man, i just realized that this thread was created in january this year.
We have been edging on the new deepseek for 4 months at this point
*persona song playing in the background
Still betting on v4 to reheat its own nachos
They're getting the UI ready for the new models 👀
They merged Deepseek chat and deepseek-reasoner into one graph (they were different before)
soon ™
happening
@gemini is this true?
deepseek today
@grok change deepseek clothes to bikini
Put it down
nothing is real until i see deepseek announce V4
i've played these games before
Can someone translate this 🥀
trying in openwebui. it is currently crashing due to the paste
Try in openrouter fr
it didnt let me in openrouter since they limited it to 128k tokens on openrouters side
so i need to use deepseek api directly
the api still hasn't responded but it didnt error
yeah I think v4 lite is in API, seems to be a bit better than Instant
hmm still doesn't seem optimized for agentic coding, in fact still don't see much difference in terms of overthinking and unnecessary tool calls
real?
I gave it 400k token file and it was not able to respond but it didnt error so something has changed
yeah it's a lot faster than before, reasoning in general seems more concise too (basically instant from web, but a bit better on some prompts)
reasoner, haven't tried chat yet
would be funny if this is v3.3
We are so back it's over
Is it shit? I will try later
Deepseek today 
and it's gone, tbh A/B testing on the API is a bit tinpot
when will they stop edging us
At this point I am half convinced that DS V4 will either be mediocre as heck or the open source Mythos
its hilarious.
What model do you consider to be the baseline that defines mediocrity best?
if life has taught me anything, it's to not have that many expectations, have healthy ones but nothing extreme
deepseek is available NOW
see here
Damn, hard question.
Grok maybe.
It's just not very impressive at much except NSFW from what I've seen.
yeah the whale better get some more deepsleep if can't beat grok lol
deepseek 💖🐳🍋 >>> grok 🤡😡🤬
Let's hope so xD
It does. It's unrestricted, but actually pretty bad at NSFW if you don't just accept slop.
Mods, mute this guy please /j

to be fair apparently grok is a 500B model
Gemma 4 is 31b model.
sure, grok is clearly super specialized though
||At gooning?||
Sorry I had to.
What is its stated specialization because I've honestly never seen that?
Hmmm, alright, fair. It is pretty good at that.
And its EXTREMELY fast. So I have to assume its very sparse as well
It makes sense if you consider they forgot to put 1 before 600, they deleted the tweet immediately. Also they talked about big v4 being 1.6T weeks ago
muon is what everybody uses (glm/kimi), openai literally hired the muon guy so gpt too
its faster but less accurate
not worth it
The only one I would recommend alongside adam is CAME https://www.zangwei.dev/blog/proj-came
CAME has the accuracy of adam with the memory requirements of adafactor
yea, I know the name is silly
well lot of things are good in theory and papers, but not in reality of actual large scale training
I guess. I've only done stuff with datasets up to hundreds of thousands. And it was image models
I just know side by side I had worse results
Maybe a hybrid would work best
that might be what they are doing
according to GLM-5.1:
The vast majority of LLM parameters are in 2D weight matrices (QKV projections, MLP up/down/gate). Muon is designed for exactly this. Image models have convolutional kernels (4D tensors that need reshaping), more normalization parameters, and a different structural mix. Applying Muon to non-matrix parameters requires fallback to another optimizer, and the interaction between the two regimes can be awkward.
AdaMuon basically
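The "fallback to another optimizer" part from the GLM-5.1 quote is basically a routing step over the parameter list. A minimal sketch, assuming the usual rule of thumb that only plain 2D weight matrices go to Muon and everything else (biases, norm gains, 4D conv kernels) goes to an Adam-style optimizer. The parameter names, shapes, and the `exclude` tags are hypothetical, and sending embeddings and the output head to the fallback is an assumption about common practice, not something stated in this thread.

```python
def split_for_muon(param_shapes, exclude=("embed", "lm_head")):
    """Return (muon_names, fallback_names) for a dict of name -> shape tuple."""
    muon, fallback = [], []
    for name, shape in param_shapes.items():
        is_matrix = len(shape) == 2
        is_excluded = any(tag in name for tag in exclude)
        if is_matrix and not is_excluded:
            muon.append(name)
        else:
            fallback.append(name)  # handled by Adam or similar
    return muon, fallback


if __name__ == "__main__":
    # Hypothetical parameter shapes, only for illustration.
    shapes = {
        "embed.weight": (32000, 1024),
        "layer0.attn.qkv.weight": (3072, 1024),
        "layer0.mlp.up.weight": (4096, 1024),
        "layer0.norm.weight": (1024,),
        "vision.conv1.weight": (64, 3, 7, 7),
        "lm_head.weight": (32000, 1024),
    }
    muon, fallback = split_for_muon(shapes)
    print("muon:    ", muon)
    print("fallback:", fallback)
```

A conv-heavy image model ends up with far more of its parameters in the fallback group under this rule, which would fit the worse side-by-side results mentioned above.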
Anyway that list basically described things already in the NSA paper, from February last year
it must have been a training run from hell if it took them so long to get that work at scale
Perhaps something to indicate they were Unified ;]
Did they announce it?
elon outright leaked it
when you say "specialized" and "500b" people might laugh at the statement.
the size changed since last leak + it is not multimodal according to this. is it even a reputable leaker?
we’ll see this week
Deepseek T—
Deepseek tomorrow
DeepSeek? yeah my Drip do be Sick 🦦🔥
deepseek is cuming
tomorrow 
nobody knows anything
they kinda need to otherwise they are wayy back in the AI race
it supposedly will have image input yeah
they are very experimental and focus more on efficiency, even if it's not the absolute best
imagine having >90% of Opus for about a dollar/M out
the "leak" says text only
https://api-docs.deepseek.com/ its out
pro and flash
yep
DEEPSEEK OUT
called for deepseek-chat in the beta completions endpoint and this is what returned
1.6T as announced
Where is this?
Interestingly seems to be text only
yeh
IS THE WAIT OVER????
its fine i guess, at least its smart and hopefully still cheap. needs 1tb drive to host lol
i see deep seek v4pro and lite there
Officially released
it begins
Is this the first open model which has actual reasoning effort settings?
is it good?
it's beautiful 🥹
great for the price but it feels less impressive after the other chinese model releases
No way
Presumably this has the engram stuff too right?
These numbers don't look awfully benchmaxxed so we'll have to give it a spin
working on them
DeepSeek distills but they at least do a lot of their own work too
benchmaxxed has stopped meaning anything years ago
if everything is benchmaxxed that's just the baseline
It's all still MIT which is very nice
can't wait to try 🙏
@covert topaz
i have to sleep and go to work sadly
they must have timed this just to spite me
Preview eh
Damn 2900 replies before the official drop
@lucid ocean if you're still active you should rename this since it's released now lol
if not maybe a mod can
DeepSeek_V4.pdf, ctrl-f engram, showing results: 0/0
862B parameters
pretty cool independent benchmark
A little bit more and they reach 1T
Okay now
DeepSeek-V4-Pro-Max demonstrates superior performance relative to GPT-5.2 and Gemini-3.0-Pro on standard reasoning benchmarks. Nevertheless, its performance falls marginally short of GPT-5.4 and Gemini-3.1-Pro, suggesting a developmental trajectory that trails state-of-the-art frontier models by approximately 3 to 6 months.
refreshingly candid
Wait, it's 1.6T?
N-n-no way... Im sobbing
Correct
Why is safetensors showing it as 800B
Pro is
.
Is it like 862B is the normal transformer and the rest is their Ngram stuff
agentic coder Deepseek huh?
This team really likes experimenting bro
officially posted


