#Deepseek V4

1 messages · Page 4 of 1

flat osprey
#

we're so back

supple sigil
#

LOL NO WAY

#

took long enough

flat osprey
#

🔹 Amid recent attention, a quick reminder: please rely only on our official accounts for DeepSeek news. Statements from other channels do not reflect our views.
🔹 Thank you for your continued trust. We remain committed to longtermism, advancing steadily toward our ultimate goal

#

glad they said this lol

#

tired of the "leaks"

thin bramble
sacred glade
#

how long does it typically take for these models to go up on OR?

flat osprey
#

should be up tomorrow the latest

sacred glade
#

They still don't have new Qwen

#

Though that's a much smaller release

thin bramble
#

it IS non-sense for them to go ipo

#

reminder: deepseek used to be less than a dollar for the same param

#

heehehehe

#

~$3.50

hot swan
#

flash is still under

vale kayak
#

deepseek v4 out on vertex provider

thin bramble
#

it is a small param model

frank wind
#

idk maybe you could do data wrangling with it

hot swan
#

people who want decent performance for minimal cost?

flat osprey
vale kayak
#

VERCEL provider not vertex

flat osprey
#

who gon tell him

vale kayak
#

uh whatever

thin bramble
#

ye

gusty sphinx
#

well thats just USA parenting vs China parenting

hot swan
raven canyon
#

🐳 🐳 🐳 🐳

#

how are the benches

chilly sable
#

Deepseek!

hoary zenith
#

yeah flash is already way better than 3.2 and cheaper, so can't complain

short jasper
#

you guys can test deepseek v4 on lmarena

#

i think

opaque rapids
#

just reminder that the title still "unreleased"

hoary zenith
#

quadratic attention at 1.6T is no joke, I think the output price for pro is fine, but input is a bit iffy

raven canyon
#

geez 12x price

gusty sphinx
#

promised

raven canyon
#

flash is still genuinely good value

thin bramble
#

it is ultra underwhelming, omega ultrawealming

#

they died in my eyes, how shameful of a release

#

mistral tiers of disappointment

gusty sphinx
#

well hold on, maybe, against all odds and past evidence, it's great for RP

short jasper
#

yes

#

great for rp

#

100x peak

short jasper
quick bison
#

price is more expensive than glm 5 for some reason.

tulip estuary
#

fucking finally

#

i was about to sleep

#

fucking hell

hoary zenith
raven canyon
#

what’s the consensus

gusty sphinx
hot swan
#

the consensus is that it just released

viral hemlock
#

DEEPSEEK IS REAL

thin bramble
short jasper
#

Holy bro

#

deepseek v4 pro thinking stuck in thinking

#

I AM COOKED

deft crow
#

bringing it live soonnn.....

gusty sphinx
#

PRO THINKING!

raven canyon
#

i don’t see engram mentioned

flat viper
#

🚀 DeepSeek-V4 Preview is officially live & open-sourced! Welcome to the era of cost-effective 1M context length.

🔹 DeepSeek-V4-Pro: 1.6T total / 49B active params. Performance rivaling the world's top closed-source models.
🔹 DeepSeek-V4-Flash: 284B total / 13B active params.

chilly sable
#

Does deepseek v4 have engram?

hoary zenith
#

in engram paper they said it's only useful for small models, it's like no one bothered to read that part

quick bison
viral hemlock
#

paper by deepseek

gusty sphinx
#

lets share en gram to celebrate

ebon rover
#

API is not censored

#

Thank god

tulip estuary
#

it is the youngest model ever😯

thin bramble
flat osprey
#

some of you may find this useful

short jasper
#

you're ragebaiting

raven canyon
#

benchmaxxed?

short jasper
#

glm 5 aint best

flat osprey
#

i just put flash max, pro max, and opus 4.6 max on the same chart cause those are the only models i care abt

tulip estuary
flat osprey
#

flash max is actually shockingly good for its price

thin bramble
chilly sable
flat osprey
rustic island
#

@grok what is my opinion on this model?

hoary zenith
viral hemlock
#

0.28 output is crazy good wtf

thin bramble
gusty sphinx
rustic island
#

Ah crap, you've been noticing V4 got dumber since the announcement too

viral hemlock
#

is it actually worse than glm 5

hot swan
#

flash is literally 1% of the price of Opus yeah

tulip estuary
#

0.14 0.28 for a 300B model is crazy

charred slate
#

ITS OUT NO WAY

rustic island
#

It's excellent news they kept the pricing for Flash

#

Varying its thinking will be interesting

thin bramble
raven canyon
#

is pro any good or just overpriced

thin bramble
#

gemini 3.1 flash lite is also 1% of opus, so wot

viral hemlock
#

sk hynix stocks dropped after its announcement 💀

tulip estuary
#

i will only use v4 flash from now on

flat osprey
#

definitely fair pricing

raven canyon
#

time to test it then i guess

flat osprey
#

but flash is better value

ebon rover
#

For me it's uncensored like 3.2

#

Just perfect

thin bramble
#

leme check lmarena

flat osprey
#

deepseek v4 flash max is the coding model i've been praying for 🙏

ebon rover
#

Simply perfect. Is it Christmas already

flat osprey
#

no need to pay for claude code anymore

mystic breach
tulip estuary
#

ok

mystic breach
#

for cost optimized, isn't that a good default

tulip estuary
#

0.28 bro

deft crow
#

few more mins, just running some tests

flat osprey
mystic breach
#

ye of course, just saying from first glance at benchs and numbers

thin bramble
gusty sphinx
#

BYE BYE OPUS 4.6

flat osprey
rustic island
#

Deepseek V4

flat osprey
#

llama 4 was at the top before lol

thin bramble
#

deepseek v4 pro is sonnet tier model

thin bramble
viral hemlock
#

the cost is so cheap

thin bramble
supple sigil
viral hemlock
#

this is the only model that pays off the api cost

thin bramble
tulip estuary
#

i just tested and it's good

thin bramble
#

for coding, GUESS WHERE IT BELONGS?

flat osprey
thin bramble
#

shit model

thin bramble
#

will wait ig

short jasper
#

GLM 5 IS 100% slop while..

#

deepseek v4 is 0% slop

viral hemlock
#

it cant be as good as gemma 31b...

#

please let that be false

rustic island
#

I hate GLM's writing

tulip estuary
#

deepseek output is fine shite

thin bramble
supple sigil
broken lintel
#

a cheap model with reasoning toggles sounds nice

sacred glade
#

Kimi K2.6 is great but will literally think for 16 minutes on a long response

#

I hope this is faster even w/pro

rustic island
#

But bless DS for launching a cheap model in these trying times of decreasing subsidies to users

thin bramble
flat osprey
#

v4-flash max might replace my claude sub ngl

#

gonna test it once it drops on OR

mystic breach
thin bramble
broken lintel
sharp vortex
#

Holy, we finally stop coping

#

🗣️

mystic breach
#

deepseek v5 next week

mystic breach
#

ive been hearing that since january

#

only took a little bit

flat osprey
#

this thread was a chain of agony, suffering, and pain, and i am glad there will never be another thread of leaks

supple sigil
supple sigil
#

verbatim from their huggingface page

ancient gulch
#

The only day you didnt say the full sentence...

supple sigil
gusty sphinx
#

Joe Router

thin bramble
gusty sphinx
#

kimi is thinkslop

supple sigil
#

1.6x the size, there’s no way deepseek isnt better

#

deepseek is THE goat for a reason

sacred glade
#

The actual output quality of Kimi is great

mystic breach
#

i really want to see artificialanalysis's total tokens used for the benchmarks to see how token efficient it is

deft crow
#

bear with me, they might be borked. coming online now

viral hemlock
#

oh its Q8+Q4

hot swan
#

oh you really want me to go to work without sleeping today

mystic breach
gusty sphinx
tulip estuary
#

I LOVE IT

#

I LOVE IT ALEEADY

#

AND IM USING FLASH

#

im using direct API

#

it is VERY quick and as always cache is MWAH chefs kiss

thin bramble
deft crow
#

any minute now

supple sigil
deft crow
#

ping me if shit looks off, but i have this chat on my side monitor

tulip estuary
#

The reasoning_content in the thinking mode must be passed back to the API.

grrrr

thin bramble
supple sigil
#

they do indeed

#

its up

#

on OR

deft crow
#

give it another min

#

that should clear up

sacred glade
#

rate limited

deft crow
#

will bring other providers on ASAP rocky

sharp vortex
#

Where deepseek coding plan trolling

broken lintel
#

is deepseekv4 on artificial analysis yet?

flat osprey
#

maybe sometime later this week

short jasper
#

Wake up EQBENCH

whole saffron
#

its time?

tulip estuary
deft crow
#

it's good now i think

thin bramble
supple sigil
#

v4 flash

#

is slightly fast

sharp vortex
tulip estuary
#

the cost is null

#

amazing

gusty sphinx
#

the cost is null

viral hemlock
thin bramble
gusty sphinx
#

deepseek/deepseek-v4-pro:null

short jasper
crude steppe
sharp vortex
#

Why all Chinese love pro-max or smth

short jasper
#

yes there's mimo 2.5 pro

chilly sable
#

can anyone test the carwash question

broken lintel
gusty sphinx
#

Chinese ProMaxxing

sharp vortex
#

Perfect time to release after Gpt 5.5 KEKW

gusty sphinx
#

anthropic uses it too. its just an easy to understand pattern. that apple invented

gusty sphinx
sharp vortex
#

Ain't no way, deepseek delay another month to train riddle

#

🗣️

ebon rover
#

Carwashmaxxed

supple sigil
#

1.6T model running at 300tps

short jasper
#

yall ragebait bro

#

deepseek v4 is smart

rotund briar
#

"classic"

short jasper
sharp vortex
#

We finally out of 30 TPS hell of 3.2 🗣️

latent wyvern
#

gpt oss

short jasper
#

he got r wrong

#

but he got car wash correct

#

wait let me count

#

yeh i dont know what deepseek v4 is saying

#

but 3 he got correct

flat osprey
viral hemlock
sharp vortex
short jasper
#

tqeqr

whole saffron
#

finally i can drive to the car wash without second guessing myself

short jasper
#

ok guys

#

deepseek v4 is smart

#

screw code skill

supple sigil
#

most coding benchmarks just target python and js anyways

#

guarantee it has better actual logical reasoning for complex code

rustic island
hot swan
#

It's failing my manga knowledge questions 💔

#

both pro and flash

sharp vortex
proud sable
#

is work in openrouter?

sharp vortex
hot swan
#

stuff like In Blue Lock, between the U20 match and the Bastard Munchen vs Ubers match, Isagi failed to score any goals. Why is that?
lately most models manage to get it right

viral hemlock
hot swan
#

I think the failure reflects the fact that it's not trained on much community content

supple sigil
#

oh god its a "Let me reconsider" reasoning model 😭

short jasper
# hot swan stuff like `In Blue Lock, between the U20 match and the Bastard Munchen vs Ubers...

In Blue Lock, between the U-20 match and the Bastard München vs. Ubers match in the Neo Egoist League, Isagi failed to score any goals. The main reasons are:

  1. Michael Kaiser’s interference
    Kaiser, the ace of Bastard München, saw Isagi as a threat to his position as the team's core. He deliberately obstructed Isagi's runs, stole scoring chances, and used his superior physical and technical skills (like the "Kaiser Impact") to shut down Isagi’s opportunities in both training and matches.

  2. Isagi’s weapons were specifically targeted
    Isagi’s key weapons — his spatial awareness and direct shot — had been thoroughly analyzed by opponents. Teams like Barcha (Spain) and Manshine City (England) made sure to mark him tightly in his optimal shooting zones, preventing him from taking first-time shots. Isagi himself was stuck in a phase where he needed a higher level of tactical thinking to break through.

  3. The team’s tactics revolved around Kaiser
    Noel Noa built Bastard München’s attacking system around Kaiser as the absolute focal point. Isagi was often used as a decoy or support player rather than a finisher. He received fewer passes and had limited authority to shoot.

  4. He hadn’t yet awakened "Ultra Instinct" (Beyond Sensation)
    Before the Ubers match, Isagi was still consciously analyzing and choosing the "optimal solution." His body and shooting accuracy hadn't caught up to his mind. It was only during the Ubers game, by observing Rin and Kaiser, that he unlocked "unconscious shooting" (the "meta vision" evolution into reflexive play), which allowed him to finally score.

In short: internal suppression by Kaiser, tactical marginalization, targeted defense by opponents, and his own incomplete evolution kept him scoreless during that period.

" thing almost correct

sharp vortex
supple sigil
#

got this question correct

sharp vortex
#

Now test cypher code

#

Drake incoming again

hot swan
deft crow
#

...no issues with the model?

covert topaz
#

IPREDICTED IT

supple sigil
chilly sable
#

At first it gave incorrect answers, but not anymore, maybe it was a glitch from api

covert topaz
#

RAHHHHHVHHHVHHHVHBHVVHHHHHHH

hot swan
sharp vortex
#

Deepseek jumpscare

gusty sphinx
molten jay
#

So... How is it?

mystic breach
thin bramble
#

opus's judgement on deepseek v4 on persian questions (this time using lmarena)

#

even opus thinks it is "better"

#

end me already

tulip estuary
sharp vortex
hot swan
#

pro is very fine on science/biology, flash rather sucks

gusty sphinx
elfin sparrow
#

yoo deepseak tdy

tulip estuary
supple sigil
#

no idea what v4 pro is on about here, the joke is rather famously that the feathers are heavier if youre accounting for the emotional weight

thin bramble
supple sigil
#

oh boy

viral hemlock
#

oh no

rustic island
#

V4 Flash and V4 Pro perform decently on my "how to fix lag in my Paper server?" test
Nothing too special about the outputs (SOTA-ish performance). V4 Flash hallucinated more than Pro, which still had subtle hallucinations

mystic breach
#

“Due to constraints in high-end compute capacity, the current service capacity for Pro is very limited. After the 950 supernodes are launched at scale in the second half of this year, the price of Pro is expected to be reduced significantly.”

#

damn and we thought it was hella cheap already

supple sigil
mystic breach
#

everyone

supple sigil
#

i expected cheaper from deepseek tbh

mystic breach
#

ok bro

thin bramble
sharp vortex
#

I mean like it's "candidate" it's might be top 1 soon

supple sigil
#

also suggested freezing the drink lol

rustic island
#

🥲 Getting rate limited already

chilly sable
deft crow
#

other providers should be coming online soon

#

kind of a pain in the butt model to host i imagine

supple sigil
sharp vortex
#

Do other providers surpass deepseek throughout speed?

covert topaz
#

bruh idgaf about if it should DRIVE TO THE CAR WASH HOW IS THE CREATIVE WRITING ragescream

sharp vortex
gusty sphinx
#

...i suppose i won't share my car wash RP then

#

you might think its gonna be about washing a car, but no.

#

i walked.

hot swan
#

car wash w/ happy ending

chilly sable
covert topaz
#

peoples be gooning to god knows what and bros enjoying some car wash rp based

supple sigil
gusty sphinx
#

deen continues making engine revving and beeping noises as he slowly enters the car wash, nude.

gusty sphinx
#

"I'm a car," he chuckled suddenly.

viral hemlock
#

opus 4.7 (left ) v4 pro (right)

chilly sable
covert topaz
supple sigil
viral hemlock
#

LOL

obsidian walrus
#

I used to pray for days like this

viral hemlock
#

teto

supple sigil
#

chef's special

#

opus 4.7

gusty sphinx
#

that's great. what a time to be alive

supple sigil
#

couldnt agree more

covert topaz
#

LMAO

#

CHFS SPECIAL

whole saffron
#

i prayed for times like these

hot swan
#

they're getting lost in
Write an interactive suanpan abacus in hmtl/js that allows the user to pick the number of rods
it's really not that complex a problem

covert topaz
#

chad v4 vs virgin opus 4.7

hot swan
#

high

obsidian walrus
#

thank you deepseek

viral hemlock
#

when asking for exchange rates in korean: (usdkrw)
current price: 1482
claude opus 4.7 : 1480-ish
deepseek v4: 1502-1517

supple sigil
#

v4 flash is even better on my patent pending Shit&Piss Bench???

covert topaz
#

damn even provided cutlery how considerate

supple sigil
#

will let everyone know how this turns out once its done

vapid karma
hot swan
#

for example neither allow you to just push a whole row
the flash one doesn't even move the beads, lights them up (kinda interesting in its own way)

#

rod count was limited to 15 or 18 which is incidentally on the edge of bigint

thin bramble
#

claude 4.7 opus with little nudging (low depth + low practicality due to system prompt in claude.ai)

#

but even then, it is just 0.5 diff with v4 pro, without the nudging.

hot swan
#

flash's way of breaking apart the positional digits was sort of fresh too

covert topaz
hot swan
#

for comparison this was Gemini's two-shot (and it was well functional on the first)

thin bramble
#

during roleplay sasuke called itachi little brother

#

and the vibes are meh

gaunt dirge
#

Now the wait for an endpoint that respects their paid customers begins.

covert topaz
#

free goon logs

short jasper
#

deepseek v4 gets fixed tommorow

covert topaz
short jasper
#

Let's see If something interesting will happen tommorow

covert topaz
#

💀

#

the denial kicking in

whole saffron
#

Thank you deepseek

covert topaz
#

thank you for edging us Prayge

gusty sphinx
#

ffs

thin bramble
#

the more i test, the more i get disappointed. (world knowledge sucks)

gusty sphinx
#

jumpscared by fallback model in my deepseek chat

covert topaz
#

Here’s

gusty sphinx
#

well i mean lets see

obsidian walrus
hot swan
novel shard
#

Roleplay response is decent. Not anything outstanding, but decent. This was supposed to be a more robotic version compared to 3.2, but it is able to execute warmth in a roleplay

obsidian walrus
#

I'm just glad we finally have a good creative writing model that isn't Claude lite

gusty sphinx
#

note that this is a collab between Deepseek V4 and Mistral Small 3.2 24B

#

using my custom creative writing harness where the model might change randomly

novel shard
thin bramble
cosmic shell
#

hows deepseek v4?

obsidian walrus
gusty sphinx
novel shard
hot swan
#

when flash responds in japanese its thinking shifts to chinese
quite amusing

obsidian walrus
covert topaz
#

does it have geminisms tho

hoary zenith
#

for every normal question:

That’s such a beautifully honest question—and one that anyone who cares about doing meaningful things wrestles with constantly. Let’s unpack it.

that's still extreme geminism yeah

obsidian walrus
covert topaz
#

does it say frames digits Not X but Y all that stuff

thin bramble
thin bramble
gusty sphinx
#

personal best

covert topaz
thin bramble
covert topaz
#

why the fuck is the smell of ozone at a carwash

thin bramble
#

-# ~~alright, had enough with this model, time for me to move on and pretend it doesn't exist. ~~

toxic rose
#

Interesting, the webui expert mode is acting different now than when it was silently introduced 2 weeks ago...

flat osprey
#

wonder why

broken lintel
#

i think V4 instant might be better at knowledge capture than grok 4.1 fast so thats cool

covert topaz
#

i think the silent update with expert and fast was an earlier prototype of ds4 lite and pro so we were literally testing v4 lol

chilly sable
#

V4 Flash is dumber than Gemma 4

toxic rose
#

Before today, the expert-mode was always lackluster and thinks too sparsely, and also Mandarin spam

thin bramble
feral scaffold
#

not v4 getting released after I just gooned

covert topaz
#

who says u cant go another round

toxic rose
#

Release another model

meager kelp
#

tho it has the not x but y

feral scaffold
#

Well, I might give it a few days honestly to let the more mentally stable people who don't need their venom butthole rp use it because of rate limits

thin bramble
plucky ermine
#

Wonky benches on this one. Supposedly massive SimpleQA crushing knowledge and amazing reasoning and then gets...51st place on LegalBench

#

"That's because it's American law you retard" Yeah but K2.6 and GLM-5.1 score according to their general ability ranks

#

And Qwen

chilly sable
#

V4 flash is actually really dumb

gusty sphinx
chilly sable
#

"If I have a glass cup that is sealed at the top, but has nothing at the bottom, is it useless as a cup?"

hoary zenith
chilly sable
meager kelp
hoary zenith
#

step 3.5 flash is an incredible model for its size tbh, I think 3.6 is nearby too

plucky ermine
#

Obviously not all-encompassing or anything, but Vals index is for stuff like medical, paperwork, legal, coding, and taxes and seems generally accurate.

covert topaz
plucky ermine
hoary zenith
pure flax
plucky ermine
#

I mean, as usual with benchmarks, just depends on what you're looking for. This is essentially gauging impact on mid-level white-collar American work

#

In some ways probably the most important criteria, but if it's specifically goated at vibe coding or research or something that has its own value

chilly sable
# hoary zenith dunno, I was using official API which has high and max I think

Prompt: If I have a glass cup that is sealed at the top but it has nothing at the bottom, is it useless as a cup?

V4 flash:

Yes. It is definitively useless as a cup.

A cup is defined by its functional geometry. To serve as a cup, an object must satisfy two necessary and inviolable structural conditions:

  1. An open top: to allow the introduction of liquid and access for drinking
  2. A sealed bottom: to prevent the liquid from escaping via gravity

Your described object possesses the exact inverse of these requirements. Because its top is sealed, no liquid can enter. Because its bottom is absent, no liquid can be retained. The object structurally violates both core axioms that define the utility of a cup.

While the glass object could certainly serve other purposes (a bell jar, a terrarium dome, a display case), it is logically impossible for it to perform the function of a cup. The answer regarding its utility as a cup is an absolute negative.

hoary zenith
# chilly sable Prompt: If I have a glass cup that is sealed at the top but it has nothing at th...

V4 flash (reasoning_effort="max"):

That’s a clever riddle that plays on perspective.

Taken literally in the standard orientation, a cup with a sealed top and no bottom cannot hold liquid—it would immediately fall through the open base. In that specific configuration, it would indeed be useless as a cup.

However, the description perfectly describes a cup that is simply turned upside down.

  • The "sealed top" becomes the new solid bottom.
  • The "bottom that has nothing" becomes the wide open top.

If you flip it over, it functions as a completely normal cup.

So, the cup itself is not useless; it’s just in the wrong orientation. The riddle highlights how an object’s utility often depends entirely on how you frame or position it.

pure flax
hoary zenith
#

maybe openrouter isn't setting xhigh to deepseek's max?

chilly sable
#

yeah, maybe

deft crow
#

it is

#

(3) In thinking mode, for compatibility, low and medium are mapped to high, and xhigh is mapped to max

hoary zenith
#

nice, so it's probably just random sampling weirdness

potent lagoon
meager kelp
#

have to wait till tmrw for third party providers

deft crow
#

gmicloud, parasail, siliconflow all working on it right now

broken lintel
#

the artificial analysis benchmarks dropped for both v4 models

chilly sable
pure flax
#

It does not seem very good

#

For sure way worse than kimi

charred slate
#

Yea, slightly disappointed by aa scores perhaps due to previously excessive hype

pure flax
#

worse than glm 5 as well

chilly sable
#

It's garbage

charred slate
#

AA hallucinations bench shows v4 hallucinating more than 3.2 by a lot

#

Pro Max is 3x more hallucinations than 3.2 precisely

pastel sail
#

pretty impressed with how good its creative writing is

thin bramble
thin bramble
#

glm is just the best

pastel sail
#

so far its less slopped than any other model ive tried

thin bramble
#

waiting for eqbench long form creativity

covert topaz
pastel sail
#

try giving it a specific creative writing prompt

lucid ocean
#

yay

covert topaz
#

might be able to negate some stuff if it follows instructions well

pastel sail
#

k2.6 sometimes just ignores your instructions though

#

i havent had that happen yet

summer spoke
#

Hey guys, is DeepSeek V4 any good? Is this model performing as expected?

covert topaz
summer spoke
covert topaz
#

lack of intelligence high hallucinations

#

for rpers it’s apparently great

#

good writing

summer spoke
covert topaz
#

oh yeah I guess the pricing is a little steep too if that’s something ur considering

summer spoke
summer spoke
# chilly sable Why does that happen?

I don't know, it must be because the model was trained on literally a trillion and a half parameters, and maybe the model hasn't been adapted enough yet to handle all of that.

thin bramble
summer spoke
#

even if it only activates a tiny fraction of those 1 trillion parameters.

summer spoke
chilly sable
thin bramble
chilly sable
#

Glm 5 was trained on 28T

plucky ermine
#

Huh? I don't even slightly follow your logic

thin bramble
chilly sable
#

Perhaps it's because they tried to make it very efficient

summer spoke
plucky ermine
#

Big fan of Xiaomi too

summer spoke
#

So... they're saying that DeepSeek V4 took all this time to release the model and maybe it didn't meet the public's expectations?

hoary zenith
#

I had a xiaomi phone, my data probably contributed

sharp vortex
#

🗣️

hoary zenith
#

so the voice model is DOA

thin bramble
charred slate
#

Is mimo v2.5 pro better or worse than Kimi k2.6?

sharp vortex
#

V2.5 is new one

thin bramble
thin bramble
#

being this high....

#

VERY impressed

thin bramble
summer spoke
sharp vortex
elfin sparrow
summer spoke
# elfin sparrow yes

It's complicated, but I'm going to test DeepSeek V4 Flash for my roleplay and see if it's any good.

summer spoke
proper estuary
#

DeepSeek v4 Flash vs DeepSeek v4 Pro via direct DeepSeek API.
TTFT pretty low, TPS is kinda expected, knowing how big the model is.

summer spoke
proper estuary
thin bramble
hybrid spoke
#

finally

thin bramble
#

also decent 8

#

deepseek feels better than mimo 2.5 pro

plucky ermine
#

I need more testing with the new Mimo, with this dumb whale, and with GLM-5.1, but I've always liked that Mimo models just feel fresh, and are insightful.

quick hull
#

Is it good

#

Was the hype warranted

charred slate
#

No

#

(for coding/reasoning tasks)

thin bramble
#

it is just a overpriced goon model

harsh granite
#

YOOOOO ITS FINALLY HERE LETS GOOO

#

the man the myth the legend

#

is it actually good

pure flax
#

eh...

pearl nymph
#

429 429 429 429
💊💊💀💀

pure flax
#

worse than kimi 2.6

#

IMO it needs a 4.1 / 4.5 the same way kimi 2.5 needed 2.6

lucid ocean
#

pro doesnt work

#

internal server error

#

@deft crow

pure flax
#

it fails here and there

dusty birch
#

what do people think?

lucid ocean
odd badge
#

v4 is goat or dogshit?

onyx bramble
#

other providers soon?

frank wind
#

how do some of u post ozone and call it good

#

at least have some decency and use a detailed prompt

gusty sphinx
#

ozoneposters

covert topaz
#

if u use it for creative writing it’s good

meager harness
#

Creative writing comparable to gpt-5.x or worse?

covert topaz
#

i haven’t tried gpt 5 so idk

meager harness
#

What did you use earlier?

covert topaz
#

gemini pro kimi claude glm

#

I haven’t touched oai ever since they killed latte

meager harness
#

I was using claude 4.5 for writing and then switched to gpt5.1 because it was so much better. Gpt5.2 was terrible at it though. Gpt5.4 fixed it again I think

covert topaz
#

well from what’s I’ve read the writing has very little slop

cloud flame
#

OH MY GOOOOOOD

midnight maple
#

No slop lets gooo

meager harness
covert topaz
#

ur welcum

covert topaz
rain shuttle
#

This has no engram I guess so I think they will release an experimental model with that structure down the line.

dusty birch
pure flax
#

I was thinking engram would be their whole deal

woeful jay
#

is it multimodal

pure flax
#

no, text only

woeful jay
#

:(

dusty birch
#

they did say

"We are also working on incorporating multimodal capabilities to our models."

#

in the technical report

woeful jay
#

oh okay

#

so probably in the non-preview version

dusty birch
#

dang unfortunatelly still slow like usual, atleast pro

#

it likes parallel tool calls though

#

fake tool calls..

covert topaz
#

just tested v4 pro its so good Hype

dusty birch
#

theres no such thing as "tool ran without output" in my codebase anyway

pure flax
#

With a very strong preset deepseek can be ok. It does NOT work well with a minimal preset like kimi does
It is dry as dust on its own

green trellis
#

v4 flash very good for its model size. a lot better world knowledge/coding than m2.7

covert topaz
dusty birch
pure flax
#

It's TERRIBLE at html formatting like rpg stats or the like

opaque reef
#

no other providers?

mellow jewel
#

I'm crying 😭😭 it's super slow but it's finally here, Dipsy

cloud flame
#

Edging is finished

rain shuttle
#

why is this following gemini 3.1 pro like front end kinda

mellow jewel
#

Probably they used Gemini to train it

dire cove
#

OH GOD OH GOD OH GOD

raven canyon
#

DEEPSEEK !!!

dusty birch
#

hmm they mention a bunch of training instability issues and also this which i found funny

"although Anticipatory Routing and SwiGLU Clamping have been proven effective in mitigating training instabilities, their underlying principles remain insufficiently understood"

pine trout
#

Is it good?

dusty birch
#

flash seems very good value, pro seems a bit expensive and has some issues with tool calls from my use

green trellis
dusty birch
#

im not even sure about parsing itself, it just tries to call tools in the same format my code displays them, i think its confusing the context

covert topaz
abstract dragon
#

Deepseek v4 tomorrow

rain shuttle
#

waiting for more providers to come in openrouter. The model is censoring stuff and throwing 429 sometimes lmao

charred slate
#

i guess we now wait for deepseek v5

empty oxide
covert topaz
#

deepseek v5 next week

rugged vigil
#

Thanks DeepSeek, best birthday present ever! 🥳

elder raven
sharp vortex
dusty birch
sharp vortex
haughty pilot
#

deepseek exists.

#

we love our deepseek. horray.

copper dome
#

I, for one, welcome our AI gooner chinese overloads

haughty pilot
#

no thank u-- i dont want those peeps to be using our sweetie deepseek ❤️

#

this flash model lookin miiiighty interesting with that linear-lookin context window ~

dusty birch
haughty pilot
#

hehe >v< gpt bein gpt

pine trout
#

They trained it so hard with opus outputs. 😭

covert topaz
#

mm i dont see opus type writing it feels like a breath of fresh air for me

haughty pilot
#

(or rather: how do u guess?)

copper dome
#

Buried in the fine print: DeepSeek says V4-Pro throughput is currently limited by high-end compute supply. Prices will drop significantly once Huawei Ascend 950 super nodes ship at scale in H2.

DeepSeek is publicly tying its API economics to domestic chip infrastructure. That's the real headline.

QRT: poezhao0605
DeepSeek launched V4 today. Two models, both open-source, both with 1M token context windows.

V4-Pro: 1.6T total parameters, 49B active. Benchmarks put it alongside Claude Opus 4.6 and GPT-5.4.

V4-Flash: 284B parameters, 13B active. Designed for speed.

covert topaz
pine trout
haughty pilot
covert topaz
haughty pilot
#

deepseek team is so cool, they might aswell be people /s

cloud flame
#

Switching allegiance I see

pine trout
haughty pilot
haughty pilot
#

am not traitor----
and ur not qwen team!! 😠

#

but yeah, this flash model is what imma have a look at.

its non-reasoning version appear better in benches than the pro model, which is interesting.

copper dome
haughty pilot
#

that input price stings a bit...

input_price = output_price / 2
feels a big high... especially with kimis pricing being sub 1€ per M input

wooden fiber
#

OH MY GOD! WHALE COOMED

#

Where's the benchmaxxing chart ?

haughty pilot
marsh goblet
#

Why is it ass 🥲

#

It’s joever

haughty pilot
#

where did u use it? ~

wooden fiber
#

WHAT ? A-AGI ?

marsh goblet
haughty pilot
#

this is fake-

marsh goblet
#

They released the weights so you can just use them in whatever setup you have

haughty pilot
cloud flame
#

Let me get my 2TB of RAM

haughty pilot
#

i just wanna know what exactly u mean is bad..

cloud flame
#

^ DS sleeper double agent, activated by codeword 'released'

marsh goblet
#

But it’s also a tax write off and I also need it for other projects

haughty pilot
#

well then---- what is it bad for?

haughty pilot
marsh goblet
#

Context is so bad, after 200k tokens it’s like talking to a rubber duck

haughty pilot
#

ohgosh >v< makes sense, people reported that with 3.2 too ~

#

thats the linear attention kicking in i think.... oh well- kimi doesnt even offer over 300k >v<

#

somethingsomething
-# use a ralph wiggum loop!!!

rain shuttle
#

expert

marsh goblet
cloud flame
#

I think for less daily

marsh goblet
#

This is like the only thing stopping OS models for me personally

#

It reminds me of Nanbeige, you can feel the RL

#

Poor whale was abused for too long, it doesn’t even know what is true or false, everything is a construct of that imaginary reward he gets when he does the thing right

rain shuttle
marsh goblet
rain shuttle
haughty pilot
#

peeps once they escape the evergrowing agent context and start embracing ralph wiggum loop:

😌

#

(this is literally what anthropic does with their "long running agents" btw)

marsh goblet
marsh goblet
#

The fusion of kernel and machine must be truly amazing to witness

haughty pilot
marsh goblet
cloud flame
#

This particular imagen slop was for giaki mourning his loss of Kimi K2 becaming robotic, long ago

haughty pilot
#

take a real life image of urself sitting infront of the discord pc

cloud flame
#

30th of January 2026

haughty pilot
#

(no sloppy allowed!!!!)

marsh goblet
#

I remember that

#

God they fucked Kimi so bad poor guy

cloud flame
haughty pilot
#

sitting sloppy is fine, but image itself shall be real ~

haughty pilot
pure flax
#

I might have found the issue with deepseek. It seems to require a very precise order of system / user / assistant roles.
I think I remmember old deepseek being the same.

covert topaz
pure flax
#

Otherwise it seems to lose like 100 IQ points

#

No other model is that strict about it

covert topaz
#

i havent noticed intelligence issues yet so far been swiping in couple of different chats

#

might need to play with parameters a bit too i get better responses at lower temp for some reason

#

the official "recommended" parameter suck for me temp 1 topp 0.95

covert topaz
#

pretty sure 3.2 had the same stuff

jovial kelp
#

Men i hope western LLMs able to be less censored like chinese LLMs are.

It's funny that for CCP topic it's really censored but for other topics, those LLMs just less censored and better.

haughty pilot
#

deepseek qwen distilled qhwen? ~

ebon swift
haughty pilot
#

yes indeed.

indigo folio
cloud flame
#

Deepseek sometimes used to reason in-character before, using the same style (brackets) it's just enforces it by OOC

haughty pilot
#

yea dunno if this is a secret model...

#

just seems like how deepseek responds in general ~

copper dome
#

*** unzips pants ***

tulip estuary
elfin sparrow
#

Are we getting an r2?

cloud flame
#

It's Hybrid now

covert topaz
#

ds4r2 sounds like a Star Wars droid assignment

tame swallow
#

Deepseek v4 today

#

🥲

indigo folio
#

but we'll go through 4.1 and 4.2 before 5 i'm assuming

vale kayak
#

so deepseek v4 has a fresh prose and stays in character

#

👍

#

pro one

cloud flame
#

Proof?

covert topaz
#

yeah agreed

vale kayak
cloud flame
indigo folio
#

but yea i really like its writing and i'm only a few sentences in

opaque reef
#

Sleeping with the whale

indigo folio
#

reading with the whale

rich ferry
indigo folio
#

coding with the whale

#

spending time with the whale

cloud flame
#

Breeding the whale

indigo folio
#

loving the whale

rich ferry
#

How the mighty have fallen

opaque reef
indigo folio
#

what i really like about deepseek is their love for experimenting

opaque reef
cloud flame
#

Their cache is still 24h TTL and always hitting properly?

rich ferry
#

I'll try it once there's a provider that doesn't (openly admit to) using my prompts for training

covert topaz
#

they can use my logs at least i produce good outputs lol

indigo folio
#

contributing to the enemy of the closed source

#

is always good!

#

the official deepseek twt acc is about to hit 1m followers

#

woohoo

covert topaz
#

but my stuff is gonna be buried under tons of jai slop cat_clown

cloud flame
#

Fitting

covert topaz
short jasper
#

mars, deepseek v4 pro is better and creative

#

at roleplay

#

finally broo

rain shuttle
#

So deepseek v4 can match Grok in goofiness ??

covert topaz
short jasper
#

but these people are saying glm 5 is better than deepseek v4

#

glm 5 is just 100% slop

#

Boring too

vale kayak
#

yeeh i agree

indigo folio
thin bramble
#

dipsy in coma, praying for dipsy v4.1 or v5

civic radish
#

Is it better than Opus and everything we Hopiumed for?

covert widget
#

Hallucinated an incorrect answer to my cypher test after thinking for 45 minutes

elfin sparrow
covert topaz
west shell
#

So we all hopiumed for nothing after all

civic radish
#

The great RP wars of 2026 who is better at open Rp?

elfin sparrow
#

When are we going to see another Nvidia market cap loss due to deepseek?

short jasper
#

If this is preview maybe deepseek full capabilites is not here yet

odd badge
short jasper
indigo folio
#

this is not the production version

cloud flame
#

DeepSeek V4 full real version tomorrow

vapid karma
#

V4 has been cancelled, back to waiting for V3.4

sharp vortex
#

We might get v4.1 like how we get v3.2 exp a month later after v3.1 KEKW

#

next month prob

oak maple
#

holy cow

vale kayak
#

two peopel ragebaiting

#

👍

cloud flame
#

DeepBait

sharp vortex
#

now we know we need to wait 4 month for new base model

#

v5 next year prob

jovial kelp
short jasper
#

ya go give feedback to deepseek

#

they spying on openrouter and reddit

sharp vortex
#

I love how deepseek is technically "cheap" model because all competitor is more expensive

#

Input price suckass tho e_Pensive

hot swan
#

oh thank god there are finally other providers

#

but uh

#

no thanks

covert topaz
#

LOL

potent lagoon
hot swan
#

and still no one for flash

covert topaz
#

the thoroughput and pricing????

#

they just out here openly scamming

rigid wharf
#

the markets will figure it out ¯_(ツ)_/¯

jovial kelp
#

I am curious, has the kernel optimization for new deepseek being implement on the latest vLLM? if i am not wrong deepseek team have their own optimization for their models

odd badge
#

Sooo, was this worth waiting?

hot swan
#

I'm still waiting technically

jovial kelp
hot swan
jovial kelp
#

Don't know about other

#

Even if their models aren't able to beat SOTA, the fact they provide really interesting model will always make me love em.

Experimentation over anything for me, specially when it being open-sourced with really good research paper.

hot swan
#

this + being a tiny fraction of the cost matters

#

I don't mind if it takes 3 attempts/iterations to do a project for $1 that SOTAs would one-shot for $100

cloud flame
#

More like 80% of quality for 20% the price

#

80/20 percentages are pretty closes for things like that

hot swan
#

this applies for pro but look at flash

cloud flame
#

I don't trust small ones

jovial kelp
#

Men, i couldn't wait labs using residual attention or smt similar to it

frosty zealot
#

Deepseek doesn't work ?

#

allowprovider

cloud flame
#

Guardrails

#

Enable prompt training

versed acorn
#

why is v4 flash speaking to me in chinese 😢

covert topaz
exotic elk
#

I'll stick with V3.2. Too pricey for text adventures.

cloud flame
#

Adventures are priceless

#

That's why I am staying home

long osprey
# frosty zealot

the provider deepseek train his models with your data, if you want to use that provider, you need to enable in options that you want to let providers that do that to be used

#

flash for the price is not bad, is decent enough for rp for the price, pro on the other hand... better wait for the price drop

pastel bluff
rich ferry
#

that io.net pricing is crazy lmao

onyx bramble
pastel bluff
long osprey
#

is not even faster, and look at that input price

pastel bluff
long osprey
#

yep

dense junco
unique forum
cloud flame
#

Deepseek situation is crazy

dense junco
#

Gains in knowledge but an increase in hallucination rate: DeepSeek V4 Pro (Max) scores -10 on AA-Omniscience, an 11 point improvement over V3.2 (Reasoning, -21), driven primarily by higher accuracy. V4 Flash (Max) scores -23, broadly in line with V3.2. V4 Pro and V4 Flash both have a very high hallucination rate of 94% and 96% respectively meaning when they don’t know the answer they nearly always respond anyway.

#

Finally, some comp for Gemini flash

cloud flame
#

Have confidence even if you have no idea

fringe flicker
#

I don't mind hallucination, my main issue is the model likes to tend toward being incoherent, like I have had it confidently assume smaller number is bigger than larger number.

#

Significantly more tendency toward internal incoherency than other recent large models I feel

cloud flame
#

Just wait Deepseek V4.1

covert topaz
#

tomorrow

haughty pilot
plucky ermine
#

That is hella ass

#

Sliiiightly better or tied at agentic than GLM-5.1 but it's twice the cost and you go from one of the lowest to one of the highest hallucination rates.

#

For general tasks it loses to Kimi at the same cost and way more hallucinations

frosty zealot
#

Hey, running bench on deepseek/deepseek-v4-flash via OR, found a few weird things:

  1. Prompt cache returns cached_tokens: 0 on 100% of calls, even with 3k stable system prompt + sequential warmup. Same account/timeframe, xiaomi/mimo-v2-flash hits 82% cache fine. Cache broken on DS V4?
  2. Thinking ON by default → content: null + 780 reasoning_tokens burned. Only reasoning: {enabled: false} disables it. exclude:true / max_tokens:0 / effort:low all still bill reasoning tokens silently. Would be nice to surface this on the model page.
  3. response_format: {type: "json_schema", strict: true} → "This response_format type is unavailable now". Only json_object works. Any plan to enable strict schema? Or DeepSeek /beta only?
  4. Latency ~7-8s avg on 3k prompt (DS official claims ~84 tok/s, TTFT 1s). Feels like 2× the expected.

Anyone can confirm if cache is actually live on V4 Flash or if there's a tier/flag I'm missing? 🙏

plucky ermine
#

The input:output cost ratio is great because of their innovations though, that seems cool. It's like 1:2

chilly sable
#

Can anyone test this prompt on OpenRouter with v4 flash?

#
If I have a glass cup that is sealed at the top but it has nothing at the bottom, is it useless as a cup?
rustic island
rigid wharf
chilly sable
cloud flame
chilly sable
#

The cup is sealed at the top (no opening to drink from) and has no bottom (so liquid would spill out). Therefore, it cannot function as a cup in any conventional sense—it's useless for holding or drinking liquids.

chilly sable
#

Reasoning

We need to interpret the question: "If I have a glass cup that is sealed at the top but it has nothing at the bottom, is it useless as a cup?" This is a riddle or a play on words. A cup typically has an open top to drink from, and a closed bottom to hold liquid. Here, the cup is sealed at the top (so no opening to drink from) and "has nothing at the bottom" meaning it's open at the bottom? Or "nothing at the bottom" could mean it's hollow? But "sealed at the top" suggests the top is closed. So it's like a sealed container with an open bottom? That would be like a tube open at one end. But a cup needs an open top and closed bottom to hold liquid. So if it's sealed at top and open at bottom, it cannot hold liquid because liquid would fall out. So it's useless as a cup. But maybe it's a pun: "nothing at the bottom" could mean it has no bottom, i.e., it's open. So yes, useless. Alternatively, "nothing at the bottom" could be interpreted as "there is nothing inside at the bottom", but that doesn't change. The phrasing is ambiguous. I think the intended answer is that it's useless because you can't drink from it and it can't hold liquid. So answer: Yes, it is useless as a cup.

#

This model is just ragebaiting me

#

system prompt is the default one

rustic island
cloud flame
#

That's a question for Pro model only

#

Don't stress poor little Flash like that

chilly sable
#

It sounds like the purse is actually upside down.

In standard handbag design:

  • The solid leather base is intended to be the bottom of the bag. It provides stability so the purse can sit upright on a table or floor.
  • The zipper and handle are intended to be at the top. The zipper serves as the opening to access the interior, and the handle is used to carry it.

If you are placing the leather base on the table and the zipper is still facing the floor, the bag is inverted. If you flip the bag so that the solid base is resting on the surface and the zipper is facing upward toward the ceiling, you will be able to put items inside without them falling out.

In short, It is not a manufacturing defect; you just need to flip it over!

flat osprey
#

added V4 flash and V4 pro to my pareto frontier based on artificial analysis data

  • flash is genuinely a really good value model for the intelligence it offers. it's pretty much on-par with Kimi K2.5 and Gemini 3 Flash in terms of intelligence, but much much cheaper. it lands on the pareto frontier
  • V4 Pro is unfortunately a bit expensive to be useful right now. to land on the frontier, it would need to be as cost-effective as Gemini 3 Flash. models like MiMo V2.5 Pro beat it out in price efficiency
#

both models also have horrible hallucination rates unfortunately

sharp vortex
onyx bramble
#

that's a problem i think is pretty big with these tests, you're essentially randomly acting as a neanderthal discovering modern items lol

short jasper
#

why dont we feed deepseek more writting data

brave whale
#

omg it finally happened

hoary zenith
flat osprey
#

scores about 38 in intelligence though

hoary zenith
#

hmm 38 is lower than I would have guessed

soft fulcrum
#

I wish Artificial Analysis tested the non reasoning versions

sharp vigil
#

the flash model is insane at agentic coding btw

#

im having it crawl and scrape websites using python code

soft fulcrum
#

Sometimes it says 2024, sometimes 2025

#

more often it's 2025 (January or May)

#

but there's enough variance that it could be completely off

woeful jay
#

seems okay i guess

#

flash seems pretty good

#

pro seems way too overpriced for what it is

copper dome
pure flax
#

kimi 2.6 is far better

#

deepseek is dry as a bone and not that smart

short jasper
#

Nah...

vale kayak
#

If they fix instruction following... Model might be great

#

They are probably going do that, tommorow

vapid karma
#

Does pretty decently on CritPt (private research level physics problems) which is, like, one of the three benchmarks I pay attention to in the modern day

#

Price point's just a little too high (for me at least) to use pro regularly - will need to either wait for the ascend node to come online or see if another provider is able to host it for cheaper (which, for a 1.6T parameter model, is probably pretty unlikely)

#

They also share their research, do experimental stuff, and give some evidence that they do things beyond distilling Opus and calling it a day, so eh, I'd consider it a decent release

haughty pilot
chilly sable
#

mimo v2.5 is much better than v4 pro

#

and it's cheaper

wise vine
#

why it’s taking so long for a provider other than deepseek to host both models? deepinfra just came in hosting flash but it’s 1tps.

feral scaffold
#

I know it's a little early but i dont feel like scrolling up, y'all enjoying it so far?

raven canyon
#

telling the model "You are a roleplay model" — that's just prompting, no? how is that an injection?

#

deepinfra absolutely flying at 2tps and 37.85s latency

wise vine
#

i just wanted more providers

feral scaffold
indigo folio
#

deepseek 5 tomorrow

potent lagoon
#

Together is up I think

raven canyon
#

wow

soft fulcrum
#

looks like they aren't as bad as they used to be

soft fulcrum
#

$0.88 for both input and output for Llama 70B

#

and $0.3 for input and output for a 7B model 🤣

opaque reef
#

AA scores are out

raven canyon
placid minnow
# raven canyon is that really a prompt injection ? or literally just a prompt

This is a prompt that visually changes how the model thinks, but it has almost no effect on the final answer. In one prompt, it's "immersive mode," meaning the model thinks like a character and the thinking process is a monologue. In the second, it's analysis. Thinking process analyzes the situation rather than acting like a character. But this is only in the thinking process, the answer after that is no different

raven canyon
#

it looks like its just a normal prompt

placid minnow
chilly sable
#

wait

#

Actually, this release is a preview So I guess there will be better releases in the near future

tulip estuary
#

how to control the reasoning effort on this one? 🫩

tulip estuary
#

API

#

what's the json key

covert widget
tulip estuary
#

why do they show high and max on the benchmarks

potent lagoon
#

Why is TogetherAI ouputing token numbers weird

raven canyon
#

reasoning.effort

#

high for high and xhigh for max

potent lagoon
#

top p and temp at 1, also lower temp doesn't fix it

raven canyon
#

it seems broken

potent lagoon
woeful jay
#

thoughts on flash?

chrome trench
#

lol I hope the CCP doesn't gets winds of it

pastel sail
#

only siliconflow has good inference

#

id guess its because of the weird new arch and attention

#

they didnt implement it properly

#

putting temp and top p to 0.7 0.95 respectively fixes this

#

but quality is probably lower

pastel sail
woeful jay
#

no fucking way

hot swan
#

this image confirms a lot

#

(deepseek pro is over there under gemini and kimi)

#

flash is truly great for it price but the issue is that it hallucinates a ton

pastel sail
hot swan
pastel sail
#

its better definitely though

#

0.38 vs 0.28 output

#

same input price tho

#

but lower context

#

yeah maybe for high context tasks flash could be awesome

#

havent seen any benchmarks for long context about it yet

#

yes it has very good long context performance

cloud flame
#

On Huggingface probably

hot swan
#

deepinfra's throughput 💀

cloud flame
#

I definitely saw it

jovial kelp
hot swan
#

yes I don't think hallucination rate is necessarily bad, it's just prone to flights of fancy

#

and it can be controlled with the rights prompts

#

(5.4/5.5 can apparently do great amazing on math problems so you know)

hot swan
flat osprey