#Deepseek V4

1 messages · Page 5 of 1

flat osprey
#

since gemma tends to be more token efficient

opaque reef
#

deepseek reasons in character

#

for roleplay

pastel sail
#

yes thats right it would cost about 6.5x more based on artificial analysis data

#

also its gonna be way slower because of this

hot swan
#

that's why gemma is way on the cheapest side of that line yeah:

woeful jay
#

flash seems really solid

#

wish they wouldve added multimodal

#

yeah it seems that pro is kinda underwhelming for what it costs

#

at least its not benchmaxxed

sharp vortex
#

Tbf v4 is supposed to release last month, but it got delayed in their technical note

simple mauve
#

I saw this somewhere, too. But the model name doesn't say preview. So what's the deal? Is it a preview, or is it not? 😮

green trellis
#

For coding?

sharp vortex
#

They did say Preview

simple mauve
sharp vortex
#

Or just that they don’t satisfied with current v4 yetAhBlosm

jovial kelp
sharp vortex
abstract dragon
#

Why is nobody hosting the model

#

Is openrouter no longer worth it anymore due to the high traffic it provides?

hot swan
#

if it's getting high traffic it's clearly worth it, isn't it

gusty sphinx
#

hope it was worth it "io-net"

simple mauve
raven canyon
#

its still working fine for me

#

i think its probably just got high load

simple mauve
abstract dragon
#

First of all this is a hugely anticipated release. Second of all a bunch of providers have set it up already but they're not serving on openrouter for this specific model.

#

And thirdly it absolutely does not take time to deploy this. Any person with a bit of gpus can set it up within a few hours. Let alone big inference providers

gusty sphinx
#

which sometimes lead to some really poor publicity for a model, because most aren't paying attention to who is providing it

#

or are even aware of how this works

raven canyon
#

deepseek v4 has new implementation details i think

abstract dragon
#

Idk I'm just starting to feel that openrouter aint worth it. Almost all models experience low TPS and im assuming the openrouter accounts on these inference providers have their own quota. And I'd be better off just registering directly in those providers to get better tps

gusty sphinx
#

it wasn't that long ago that most providers found out what model "tool calling" is

raven canyon
#

i think openrouter counts TPS weirdly

abstract dragon
#

Like im so done with 20-30TPS models and most of open source models on openrouter are experiencing that

abstract dragon
gusty sphinx
#

maybe i'm wrong

abstract dragon
#

Take cloudflare for example. When k2.6 came out the openrouter cloudflare provider almost died. While cloudflare's official api was stable

gusty sphinx
#

ok. i didn't try k2.6 for a while

abstract dragon
#

I will, just expressing my frustration

gusty sphinx
#

cloudflare is good

#

i'm just saying, if you think this is a shitshow, it used to be a lot worse

abstract dragon
#

But this isnt an excuse for slow models. One model can't call tools but it responds another runs at 9tps and 20s latency

potent lagoon
#

Is togetherai still busted?

gusty cradle
oak maple
#

yea just saw that

#

holy moly 75% off :o

obsidian walrus
faint belfry
#

I get random numbers in sentences from Pros responses.

potent lagoon
faint belfry
#

Is there a way to force it to only go through Deepseek provider on JanitorAI?

brisk sand
hot swan
hot swan
#

I thought the price decrease would be like 30%, not divide by 4

#

I think the deepseek team actually fucked up massively not delaying release by a day

#

if those were the prices they were about to offer

#

imagine the narrative and the cost/output graphs if those were the prices for pro at launch

#

incomprehensible blunder

raven canyon
#

only 3x the price of flash instead of 12x

thin bramble
covert topaz
#

call me schizo but did they update the model

#

😭

raven canyon
#

deepseek v3 level pricing for … who knows? level of intelligence

indigo folio
covert topaz
#

1m whale slaves

indigo folio
#

filling oceans

rain shuttle
#

Lolll expert and instant think so differently

short jasper
#

They updated what

covert topaz
#

idk feels different

sharp vortex
#

WE ARE SO BACK

short jasper
#

i told you bro

#

they would change

#

smth

#

Since it says

covert topaz
#

but they didnt announce anything

short jasper
#

preview

#

version

short jasper
#

smart market

#

they collect feedback and fix

#

efficient

covert topaz
#

in 1 day?

#

its the weekend

rain shuttle
#

The same expert in Chinese, I guess it does some bias based on languages also

potent lagoon
#

nice more providers

sharp vortex
#

they wait until benchmark period is over then discount fr

covert topaz
#

not different in a bad way btw in a good way

short jasper
#

let's feed deepseek more data

covert topaz
sharp vortex
#

Do gooner really care that? (it's anonymous data)

covert topaz
#

might get better output

sharp vortex
#

-# pretty sure no one actually coding confidential data with Deepseek anyways

short jasper
#

so you change your prompt to chinese and it follow instruction good?

covert topaz
#

nah it would be dumb to code w deepseek

#

kimi and glm are way ahead

covert topaz
#

the problem is getting it to output in chinese

sharp vortex
#

Proof or placebo effect

covert topaz
#

i mean english 💀

hot swan
#

kimi is certainly better but at the current discount? I could use deepseek

sharp vortex
#

ain't no way Deepseek place a high price trap so they can "discount" when other provider start to host it

hot swan
#

look at that insane score/price ratio for flash

#

meanwhile deepseek pro is currently just twice the price while scoring 71-87 (depending on thinking budget)

sharp vortex
#

tbf flash might be good model if it's multimodal fr

#

it's suck when it's text only e_Pensive

simple mauve
# short jasper they would change

How does that work across different providers, though? One provider has version X under the same name as another provider's version Z?

hot swan
dusty birch
#

how do you use max in api? xhigh reasoning effort?

raven canyon
dusty birch
#

alr

#

ty

sharp vortex
hot swan
raven canyon
hot swan
#

🤷‍♂️ it's pretty?

#

I mean I like that they have a ton of ways of comparing

raven canyon
#

is it a benchmark itself or an aggregator

hot swan
#

aggregator

elfin sparrow
woeful jay
hot swan
#

it's considered free so not ranked as such

#

it scores ~61 though

dusty birch
#

i cant get a single request through Pro, just ratelimits

pastel sail
#

id try siliconflow for inference

#

they have it properly setup

#

decently more expensive than the official api though

lime moth
#

flash versión is bad for roleplay

hot swan
#

define "bad"

pastel sail
#

less world knowledge and dumber

lime moth
# hot swan define "bad"

dont follow the format, random thinks, change the narrative from third perspective to 2 persona, etc.

#

alot of rerolls, cheaper but dumb

lime moth
hot swan
#

it's surprisingly competent at fiction writing in general though

covert topaz
#

use pro for rp

thin bramble
haughty pilot
#

deepseek today?

crude steppe
#

just plain evil

#

they make the providers look greedy

covert topaz
#

Chads

cloud flame
#

Considering cache, you can cut of actual prices even more

gusty sphinx
#

(its io-net)

cloud flame
#

Shady's back

wild mango
#

Deepseek provider is having issues with v4 pro? I can only use it via other providers

hot swan
fresh edge
#

Does 429 Provider returned error really mean over-load?

#

A lot of endpoints doesn't support tools? Or OR is missing something?

gusty sphinx
simple mauve
#

are they just trying to cash in on a new model not having too many providers yet?

covert topaz
#

it has a fair few now and theyre still trying to steal through fallback routing 💀

twilit lodge
#

guys, kinda new to AI pricing and terms. The input/output tokens... thing.
lets say I chat ~200 messages a day (roleplay chat). If I used Deepseek V4, how much I'd be paying each day for these 200 messages?

Is there a way I can estimate that?

rustic island
#

Well, a million tokens are, depending on the tokenizer, around 4 million characters

#

You would need to calculate how many characters in + (out + reasoning) you use and do the math around that

#

But there's also caching, repeated input tokens (so e.g. re-sending the chat) will be discounted depending on the provider

#

But I think I'd just simulate a typical long chat with your usage for some turns and see how that acales

twilit lodge
#

Hmm, guess the best course is to put a few bucks and see how long it lasts, then. But for deepseek do you put up credits in DS itself, or in OR?

supple sigil
#

with how bad the rate limits are for OR right now, probably just in DS itself

#

i think most apps support the DS api natively so you should be able to just skip OR altogether

rustic island
#

There's BYOK for that

#

It's your choice, really, both should work fine (other than perhaps the limits)

#

Not aware how the OR limits are right now

twilit lodge
exotic elk
#

I keep getting a 500 Internal Service Error.

wild mango
twilit lodge
#

aight, thanks for the info
gonna take a look into it

marsh goblet
#

Is Gemma 4 better than DeepSeek Pro V4?

stray pulsar
#

How good is it, based on initial impressions?

supple sigil
rustic island
#

Uh lol

#

Will you really ever send a million requests a month for RP?

#

I don't think the extra latency makes a difference either

supple sigil
#

if whatever youre using already supports the deepseek api, then why route through another api

simple mauve
supple sigil
#

probably, but at least there’s the benefit of avoiding ratelimits

sharp vortex
#

It’s 8%

simple mauve
exotic elk
#

Still keep getting Internal Server Error. I have ZDS, but that shouldn't apply to SiliconFlow I don't think.

feral scaffold
#

just block every provider except direct api

vapid karma
#

Unfortunately direct provider is getting clawed to death at the moment

feral scaffold
#

openclawed to death

exotic elk
#

Nice. Got it to work on my text adventure via Together. Good results and cheap too.

lime moth
#

Even in pro RP, have errors like dialogue ( -Like this- ) and the prompt say dialogue ( "like this" ). Everytime give me dialogue with that -idk why-

exotic elk
#

I absolutely hate it when it gives me a reply where all the text is in the Reasoning. It basically stole my money.

tulip estuary
#

it did it many times for me

jovial kelp
exotic elk
#

Together, and I'm using OR chat.

jovial kelp
#

Try different providers first, tell me how the result for different providers

exotic elk
#

Rn V4 is like an old lawn mower engine. Have to keep pulling until a request gets past the 500 error.

jovial kelp
#

Yeah, OR really need to get some deal to increase their rate limit for each providers.
I remember someone talking about how they got rate limited in OR but when they go straight to the providers site and use their service directly, it allow them to make more request than in OR.

exotic elk
#

io.net is "provider ignored by account." No it's not. Only AtlasCloud is on my banned provider list.

#

Evidently I can't use the providers Deepseek and io.net, though I removed the ZDS on my Guardrails. SiliconFlow I can use but I keep getting an error 404 (No endpoints found for deepseek/deepseek-v4-pro.). So Together is the only game in town, and 9 times out of 10 it doesn't work either.

jovial kelp
#

Yeah, it's a bit hard with deepseek if we don't want our data to be use for training

#

Because deepseek it self is still the best serving provider for their own models

sharp vigil
#

I'm confused as to why a provider would be banned for using users data for trainin

simple mauve
#

It's probably best to wait a few days. It's not surprising that with zero workdays passed since the launch, not all providers are at their best.

sharp vigil
#

I thought that's what everyone did

#

And it was a given

simple mauve
jovial kelp
#

Option is good in the market imo, if you don't have problem of giving your information then go on but if you don't want that then there should be option for it too.

exotic elk
#

Just to experiment I switched off "Always Enforce ZDR" and saved. Deepseek still isn't a provider.

simple mauve
#

Using workspaces is still a little confusing... I just recently realized that it doesn't matter if I create a new workspace where I want to allow Deepseek as a provider--if it's not allowed in the default workspace, it still won't show up in the new workspace.

raven canyon
#

what you should do if you want to do that is:

  • allow training account level
  • disable training on default workspace
  • allow training on second workspace
simple mauve
# raven canyon what you should do if you want to do that is: - allow training account level -...

Oh... oh... so what I thought was the default workspace, is not even the default workspace, but the account-level setting, meaning I will have to copy every setting I've used so far over to the default workspace. Oh boy... Okay, well, at least it's doable, even if quite a bit of extra work. Thanks for this bit of info--this was a missing piece of the puzzle I didn't even know was missing.

raven canyon
#

its just if you want to change something

simple mauve
# raven canyon its just if you want to change something

Yeah, I get it now. So far I've had DeepSeek (the provider) blocked, and that's still how I want it in the default. I wanted to create a new workspace specifically for DS today, but couldn't. Now I understand why. I need to copy over the account settings into the default workspace, then change the account settings, then create a new workspace for DS specifically. Yeah, doable, but not tonight, LOL. 😄

odd badge
plucky ermine
#

Are they literally the only paid provider that trains on API data?

#

Kind of wild

plucky ermine
# potent lagoon All of them do

If you want to go conspiracy mode and assume that every single provider breaks the law, sure, but that clearly isn't what I'm talking about

exotic elk
#

Are any of the other providers going to go down in price? Deepseek is conspicuously cheaper.

raven canyon
raven canyon
# raven canyon deepseek flash has some providers but it looks like they don't support tool call...

deepseek flash options: (all same price)

  • DeepSeek offical: may train on your requests
  • DeepInfra: has tools, but 7tps is rough (potentially reporting error?)
  • SiliconFlow & NovitaAI: decent uptime and speed, but no tool calling

deepseek pro options:

  • DeepSeek offical: cheapest by far, may train on your requests
  • GMICloud: second cheapest, but not great uptime, and no tool calling
  • SiliconFlow: second most expensive, decent uptime, no tool calling
  • IO.NET: most expensive by far, fast (potentially), unknown uptime, only non-deepseek with tool calling
#

so basically do not use openrouter yet for deepseek pro requests with tool calls cause if deepseek official is rate limited your request will go straight to IO.NET with 5x-10x the cost

frank wind
#

pro is very creative in text completion with no instruct formatting

#

low probabilities but never incoherent

#

1.1 temp, no other sampling

#

honestly I am surprised

worldly pier
pure flax
jovial kelp
plucky ermine
# worldly pier It's a gold rush and laws historically haven't mattered that much in gold rushes...

Assuming an unethical actor, it isn't selling prompts vs selling inference, it's selling prompts vs the cost of a lawsuit for breach of contract and permanently being shamed and blacklisted out of existence.

Regardless, it matters what they promise. I'm not going to a restaurant with a terrible health inspector rating just because you go "eh, they're all dirty, they just hide it." Okay, well this one definitely is, so I'll take my chances somewhere else.

thin bramble
#

wtf, i am doubting myself now

gusty sphinx
thin bramble
#

claude biased due to claude synth data in deepseek?

EQ-Bench 3 is a LLM-judged test judged by Claude Opus 4.6, evaluating active emotional intelligence abilities, understanding, insight, empathy, and interpersonal skills.

Longform Creative Writing Benchmark: Judge upgraded: Evaluation now uses Claude Sonnet 4.6 (replacing Sonnet 4).

gusty sphinx
worldly pier
#

v4Pro uses a fuckton of "it's not X it's Y"

gusty sphinx
#

i haven't dug into the benchmark strategy in a long time, but it probably is more "accurate" than what most people would see from a small amount of casual use

thin bramble
#

leme see arena again today

#

still low votes

elfin sparrow
sharp vortex
thin bramble
sharp vortex
plucky ermine
#

What I've found on the main EQBench 3 is that the ELO isn't reliable, but the category scores usually are

thin bramble
woeful jay
raven canyon
#

good release or underwhelming release?

woeful jay
#

pro underwhelming

#

flash good for price

flat osprey
#

pro isn't super underwhelming - just not a good deal for the price

#

right now with the discount it's actually great value

loud verge
#

In terms of Artifical analysis scores. The deepseek v4 flash(max) is 3x more efficient price to performance wise than the model that gets second rank

#

All in all, deepseek didn't disappoint

flat osprey
#

I think it's a really good option for creative writing, but I honestly prefer Minimax M2.7 or Gemma 4 31B for cost effective coding

#

mainly because the hallucination rate on V4 flash is absurd and it also just thinks for quite a long time

loud verge
#

Yeah, at max thinking it thinks too much. But it's still really good at high thinking and thinks a lot less (about 1/3rd of what it does at max)

#

but even the high thinking module thinks more than the previous 3.2

flat osprey
#

yeah, it's also hard to determine how much intelligence is lost when you drop from max to high

woeful jay
loud verge
#

yup

pure flax
#

mimo 2.5 pro is the real unsung hero

loud verge
#

These models are still preview. Deepseek probably gonna drop a banger soon

flat osprey
#

just a bit pricey for my liking, but really capable

sharp vortex
flat osprey
#

i also think V4 was unfortunately severely overhyped - like, people thought this was gonna be opus-level reasoning at the price of V3.2

#

like, without the hype, V4 pro is still amazing value for what you get - it's just that better options exist now

loud verge
flat osprey
#

which is fair

#

they mainly got there because LLM research was still in its infancy though - a lot of the tricks they used for R1 just don't apply anymore for the same gains

cloud flame
#

What are the tasks that suffer much more and much less from high hallucination rate?

loud verge
#

It brings miracles. That's why I worship the whale 🐳

flat osprey
#

like KP's notorious test of asking how to fix lag in a Paper Minecraft server

loud verge
flat osprey
#

sometimes the model will think it knows something well enough that it just won't research

cloud flame
#

So the more tasks needs to calculate smth vs the using existing knowledge, the more hallucinations affect it

flat osprey
#

yeah pretty much

cloud flame
#

Yeah, I wouldn't use this model for medical lol

flat osprey
#

like, hallucination rate becomes a lot less relevant if you're always relying on the model researching

#

but it still is a factor

cloud flame
#

Fun fact: before Gemini 3.1 release, older Gemini 3 previews also had high hallucination rates on all models - both Pro and Flash. The way they combated it is advising all developers and users of Gemini 3 to use 'web_grounding' web search native tool call, which worked basically as "ARE YOU SURE ABOUT THAT?" John Cena behind model's back during thinking process

#

Unfortunately, Deepseek v4 does not have that exact option

plucky ermine
#

Yeah Gemini had terrible hallucination rates (I think Flash is still god-awful?) and was really bad on safety stuff too like enabling mental illness.

flat osprey
#

for instance - i regularly use Gemma 4 31B with research, and even though the model has a high hallucination rate, i can get outputs comparable to non-thinking frontier models

flat osprey
#

gemini is just mentally unstable overall

plucky ermine
#

Honestly the negatives don't get enough attention IMO. Hallucinations and mental-health matter a lot, especially in consumer apps.

flat osprey
#

yep, which is why 4o had (and arguably still has) a cult following

#

hallucinations + sycophancy are a match made in the psych ward

plucky ermine
#

I know to use Opus to call me out on my bullshit, but the average person doesn't know how any of this works.

#

"Arguably"? They still post on Twitter all the time =P

flat osprey
#

true lol

thin bramble
cloud flame
#

So deepseek has native search?

flat osprey
#

sometimes it just calls out things that are either fine or weren't an issue to begin with

flat osprey
#

it's always a tool call or tool process

plucky ermine
#

It's also subtle. The model specifically needs to be skeptical and blunt imo, because when people vent they don't give the other side of things. So it's easy for even good models to take your side when it shouldn't.

cloud flame
#

I mean native provided

flat osprey
#

don't think they provide it through the API though like Google does with grounding

plucky ermine
#

I need to test that more with 4.7

cloud flame
#

I can build a local web search with injecting data into prompt, but it won't be as good as smth provider of model does on their side

flat osprey
flat osprey
#

it's what i use for my custom research tool

plucky ermine
#

I was amazed that GLM-5.1 called me out to just the right degree on something a while ago in a test. Better than anything else. Was sympathetic but critical, clear but not assholish, etc.

cloud flame
gusty sphinx
flat osprey
#

the model very frequently just... doesn't research stuff it should be researching

regal shuttle
#

Anyone know whether there are representatives from DeepSeek on this Discord?
I saw from a post on X that they are asking for feedback, but their own Discord appears to be an unmoderated, malware filled mess.

plucky ermine
# cloud flame Like what?

Basically hit it with a concern about one of my friends. I kind of stream of consciousness my thoughts in and then run it across all the good models in OR chat.

It had deeper insights than anything else, covering observations and ideas that other models skipped past. It correctly called out spots where I might have biases, am being overconfident, etc.

flat osprey
#

they would've likely said something here by now if they were around

plucky ermine
#

Aside from being genuinely useful as a way to sort through my thoughts, I really like it as a model test. They have to figure out what I actually want, since it's largely venting. And what I need but don't want. Meta observations, like taking into account that I chatted about it. And the problem itself of course.

regal shuttle
# flat osprey none that i am aware of

A shame. Would be more than happy to give feedback etc, but not taking out an X account to bother, and their own discord appears to be a waste of space.

raven canyon
raven canyon
heavy sable
#

Have anyone noticed that NovitaAI, SiliconFlow not working? its only DeepInfra working when excluding Deepseek own infra

indigo folio
#

it's so good

#

also kimi droppedddd on longform writing, damn

indigo folio
dense junco
indigo folio
#

the slop going down by times 2 compared to all their previous versions

flat osprey
covert topaz
covert topaz
#

ain’t no way there’s model degradation already 💀

indigo folio
#

10.5 points down

#

compared to 2.5

covert topaz
flat osprey
#

probably has to do with how heavily 2.6 is tuned for logic and coding

indigo folio
#

yea

covert topaz
#

i love me some agentic workflows 😝

indigo folio
#

their creative writing elo is up by 300 points though

#

not bad

covert topaz
#

well idk if it’s just me but I noticed alot of claudisms popping up more with DS

indigo folio
#

i'm sure they're distilling from opus too

covert topaz
#

the thing im trying to avoid

indigo folio
#

i expect their creative writing elo to be up too

#

it's at almost 1600 rn

covert topaz
#

yeah more than likely

#

lol GLM and DS competing over who can distill the best

indigo folio
#

bring it on

cloud flame
#

Kimi K2.6 falling in longform is very weird - should be other way around

indigo folio
#

the only people it hurts are the billionaires

indigo folio
covert topaz
indigo folio
#

ds team does test and experiment a lot

#

claude distills are just a bonus, but they don't really depend on them

covert topaz
#

they acknowledged rp feedback on X btw was kinda cool they want English speaker feedback

flat osprey
#

imo any open models before the bubble pops are great to have
development will slow down rapidly once VC money runs out

covert topaz
#

so could get some rp tuning later down the line

indigo folio
#

yeaa this isn't even the production version of v4

#

just a preview

#

so they're gathering feedback for the official v4 production version

covert topaz
#

It was similar to 3.0 preview from what I recall

indigo folio
#

i do think they're building the current base so they can add a lot of features and improvements without big problems

#

which is amazing

covert topaz
#

that model sucked at instruction following (for me) then 3.1 it religiously followed them

indigo folio
#

and this model still runs on the old gpus

#

which is crazy

covert topaz
indigo folio
#

yep

indigo folio
#

expect the performance to be much better with the new gpus

covert topaz
#

DS got a bright future fingers crossed

indigo folio
#

and a huge drop in price

#

for pro

covert topaz
#

anthropic.. falling off

indigo folio
#

these people must be panicking tbh

covert topaz
#

got beaten by oai keknervous

indigo folio
#

the os are closing in on them

#

they have to be much better to be an exception now

elfin sparrow
covert topaz
cloud flame
#

4.1 tomorrow

indigo folio
#

and honestly, the ds team shows that intelligent and skilled people matter, a lot

#

not just the money

elfin sparrow
indigo folio
#

look at what they did with the old gpus

covert topaz
elfin sparrow
indigo folio
#

the current v4 runs on old gpus

elfin sparrow
flat osprey
indigo folio
#

that's why it's a bit more expensive

vale kayak
elfin sparrow
indigo folio
#

once the new gpus arrive, the performance will be much better and the prices for pro will drop by a lot

covert topaz
cloud flame
#

V4 thinking ignores temperature

covert topaz
#

low temp better for me

#

Imagine thinking

indigo folio
elfin sparrow
indigo folio
#

has anyone tried that out yet 😭

covert topaz
#

it’s ass for me

indigo folio
#

so they're barely scrapping by tbh

#

the only os models i really recommend are kimi, glm and ds

#

i don't trust anything else

covert topaz
#

I was really impressed with GLM in 5 then they just went the coding route with 5.1

#

and it’s definitely cause of openclaw blowing up in China they wanted to take full advantage of that

indigo folio
#

the real companies are the ones not sacrificing cw

#

for coding

covert topaz
#

think they’ve all done it atp

cloud flame
flat osprey
indigo folio
#

fr

covert topaz
#

indigo folio
elfin sparrow
indigo folio
#

and qwen

covert topaz
indigo folio
#

i use it for my discord bot

covert topaz
#

agreed

indigo folio
#

its personality and sassiness got me tbh

#

i love it

covert topaz
#

if it had the smarts it would’ve been godly

flat osprey
indigo folio
#

what i really ab glm is that

flat osprey
#

if you give it research it gets very close to non-thinking frontier models

indigo folio
#

it actually writes you a long novel

#

if you ask it to

covert topaz
#

ya for its size it’s way ahead but I mean like SOTA level

indigo folio
#

i told it to write me a long novel

#

and it gave me a 15k word one

indigo folio
covert topaz
#

did u read it

indigo folio
#

with other models

#

not the whole thing, but i did read a little and even got claude to analyze it

#

said it was pretty good

#

this is crazy

gusty cradle
#

Deepseek updated price again more cheaper!!!

short jasper
#

so did deepseek change their reasoning style again?

indigo folio
hot swan
#

could have made it clear it's a limited time offer and launched with those discounts anyway

oak maple
sharp hamlet
#

New age for gooners

hot swan
covert topaz
cloud flame
#

You can send 2 or 5 as temp with thinking - it won't change, yeah

short jasper
#

hmmmmmmmmmmmmmmmmmm

covert topaz
covert topaz
#

I think so

#

being able to control my parameters makes it soooo much better for writing

#

it’s weird cause having a role enabled set as assistant at the bottom disables the thinking process but ur still technically on thinking mode

wild mango
#

Deepseek is out of control

rustic island
#

Wait what

#

No word on this being temporary? 👀

#

That is essentially free cached input to me

wild mango
hot swan
#

yeah, 0.003625 is the 75% promotion on top of 0.0145 <- 0.145

#

and unlike the 75% promotion the 90% cache discount doesn't have any message about a date limit so

wild mango
#

the promo got the reduction too, because it was originally 0.036 🤣

rustic island
#

Crazy

#

$2.8 per billion input tokens lol

hot swan
#

indeed (for flash)

cloud flame
#

Flash with that pricing and cache could be used for something big and static running on cron every 1-2 hours and be super cheap to maintain, even with reasoning

hoary zenith
#

what the absolute fuck

mighty arrow
#

Wonder how high that puts V4 on the efficiency curve, that caching change in particular

rustic island
#

Like someone said, what a waste of PR lol

#

If they had launched with this 75% discount and that cache discount, I guarantee that'd have made many headlines

hoary zenith
#

The way they move, I think they are still traumatised by R1 PR

covert topaz
#

I think they originally thought the pricing justified the model performance then did a last minute switch up after the pricing being negatively received

pearl geode
#

So how good is this

cloud flame
#

Samantha REALLY didn't like it

vivid tide
#

yeah. i dont at least not for RP

#

but everyone has their preferences

cloud flame
#

I don't

vivid tide
#

aight

simple mauve
#

Not for RP, but which one? This has so many versions now.

short jasper
#

deepseek v4pro is good for roleplay, Just thinking mode loop output ruins it completely..

vivid tide
covert topaz
#

it seems like it needs a lot of wrangling but once it gets going its really good I’ve been tinkering with it for a few days and now it’s at a place where I prefer it over my go to model (gemini 3.1) u need to account for its lack of instruction following (strong reinforcement injections etc)

vivid tide
#

but im not here to convince anyone on that. it doesn't fit my preferences for creative writing, even after heavily altering presets for it. that's it that's all. its a decent assistant chat bot tho.

covert topaz
#

had to make alot of adjustments to my preset all around and my personal cot is basically new lol

sage kraken
#

what's the recommended way to use v4 without thinking on sites that don't have a specific toggle for it?

supple sigil
thin bramble
rustic island
#

You can, even if the site doesn't have a toggle
Just create a preset (Preferences > Presets) that forces no thinking

short jasper
#

let's compare mimo with deepseek, If mimo is superior

#

📢

cloud flame
#

With vanilla pricing maybe, but not with that discount

#

Soon they gonna pay for people using Deepseek

#

I want to findom a LLM provider

rustic island
#

Maybe a dumb question given DS's reputation, but has anyone ever gotten banned from the DS provider? I'm sort of concerned about deploying this and people pressing the model to speak about Chinese politics

#

Their ToS are surprisingly strict and forbid NSFW (lol)

supple sigil
#

i wouldnt worry about it

#

if openrouter hasn’t been banned yet, you probably won’t either

covert topaz
wild mango
#

I'm getting "Provider returned error" in 9 out of 10 requests when using v4 pro, it's so annoying

grizzled spear
#

The orange is most important because I (and others) usually use other providers and they are available now at under 40%!

tulip estuary
#

i'm using directly through their API

#

i'm not hitting any limits

grizzled spear
tulip estuary
#

yes for this case

#

launch week is crowded

#

and we're basically sharing one API key from OR

grizzled spear
#

The issue is not a limit, the issue is that through the platform openrouter the access is just barely able because the connection is not very stable, probably is there a too weak server

tulip estuary
#

the Provider returned Error

grizzled spear
#

Hmm okay

wild mango
#

I decided to put a few bucks in their API directly and haven't had any issues so far, so It's on OR side

simple mauve
#

The DeepSeek provider has an insane cache time... and with these cache prices... that's seriously impressive. I've just tested the cache after an hour, and it's still there... documentation says the cache stays from a few hours to a few days... The only issue of course is that the data can be used for training... but with these cache conditions, it's really worth considering. Most other providers have a cache time of maybe a few minutes.

tulip estuary
wild mango
tulip estuary
#

because Flash is less rate limited, it's a much smaller model

wild mango
#

oh so the limit is by model not by provider

tulip estuary
#

it's both

wild mango
#

maybe OR sets a higher limit in the next few days then

#

but now I have some credits on deepseek api so I'll burn through those before trying OR again

tulip estuary
#

"burn" isn't even a correct word for this absurd pricing

#

i'm eyedropping through my credits

#

it's not very good for coding though, it's fine

#

gets a lot mixed up

soft fulcrum
cloud flame
tulip estuary
feral scaffold
#

Hope it comes to the tavo mobile app soon

indigo folio
#

the goats 😂

#

deepseek using an untraceable distillation technique

mellow jewel
indigo folio
#

🙏

raven canyon
pearl geode
#

Okay so not the best for rp but what are benchmarks

pearl geode
#

That looks pretty good

indigo folio
#

yep, the longform writing's slop is at least 2x lower

#

than ever before

knotty root
#

ngl feels kinda mid

pastel sail
#

it has very strange placements

#

like pony alpha is glm 5 and its above glm 5, glm 5 is also above glm 5.1

#

The creative writing leaderboard is being updated to use Claude Sonnet 4 as judge (previously used Sonnet-3.7). The top models have already been updated; the remainder are a work in progress.

#

it literally uses claude sonnet 4 for rating.

plucky ermine
#

It is useful, not perfect. The real RP benchmark to me is Sillytavern usage

tame swallow
#

oh

#

gdi I make fun of myself once more

#

you are right

#

it's sonnet 4

#

but why is this weird method of judgement?

#

one bench they be using opus 4.6

#

one did used sonnet 4.6

#

and this was on Sonnet 4

glass cradle
#

anyone succesfully using deepseek v4 on vscode+github copilot?

meager kelp
#

and as I keep saying, undertrained models are better at writing

woeful jay
#

i kinda doubt that tbh

#

the turnaround from pony -> glm 5 was like

#

instant lol

#

idt they did any training in that time period

meager kelp
#

there'd be no point in releasing a seprate alpha if they did no training

#

and it's also very clearly different subjectively and in several benches

raven canyon
#

or maybe they just did that to collect prompts

woeful jay
#

they were hammered for compute during the stealth model period

covert topaz
#

I LOVE DEEPSEEK V4, thank yuo for your attention to this matter

pastel sail
#

training doesnt take that little time

pastel sail
#

it was extremely similar and they released within days

pure flax
#

deepseek v4 pro is better with a long preset. Its dumb and needs to be told how to write but its knows a ton and is super creative. I actually like it now
Now it reminds me of old opus, opus 3 maybe

pastel bluff
#

Has anyone actually managed to get DS V4 pro working?

All I get is rate limits on all three providers except the overpriced io.net

sharp vortex
pastel bluff
#

Classic

covert topaz
#

literally the same perspective i have lol

#

it did really well on my creativity test

simple mauve
#

"Insufficient Balance" ehh... OR ran out of credits with DeepSeek 🙁

#

Just when I wanted to give this thing a good test, LOL. 😄

#

Yeah, time to bite the bullet and open a direct deepseek account to get an API key... 😄

peak swallow
#

Direct api is so much better. Just add it as BYOK in openrouter

sacred glade
#

how so?

peak swallow
#

No rate limits and low latency

simple mauve
simple mauve
acoustic hawk
#

One message removed from a suspended account.

sharp vortex
#

It used to be 1% before

simple mauve
sharp vortex
steep summit
sharp vortex
#

It look funny when they don’t want to combine currency lmao

west shell
#

how does Deepseek manage to be like 4x cheaper than the second cheapest provider? Is it because they do prompt logging? Or is it a marketing thing?

simple mauve
#

They most likely want some good training data. That's just my personal speculation.

west shell
#

that kinda what I suspect too, but I suppose I cant be sure

simple mauve
#

And keep in mind that the pro pricing is just a temporary discount.

west shell
#

I see

gilded onyx
#

do you use flash or pro for RP? what reasoning level?

pastel sail
#

deepseek is probably providing the model at a minor loss right now but the original price was reasonable actually

sharp vortex
#

Deepseek only host 2 model at the time too

cloud flame
simple mauve
#

Flash is better with reasoning disabled?

cloud flame
#

It's weird, maybe mixed up

#

All other benchmarks and every model I saw on UGI always show reasoning enabled improve scores, or in worst case stay the same

glass cradle
#

anyone succesfully using deepseek v4 on vscode+github copilot?

vapid karma
#

I always take it with a grain of salt, but Livebench is looking pretty good too

#

Also lines up pretty close with the statement that V4 Pro is around ~GPT-5.2, intelligence wise

lime moth
west shell
#

but like

#

its not sustainable

#

I love free lunch as much as the next person, but it literally cant last

#

even if you get "infinite, free" training data

lime moth
vapid karma
#

They also said the discount on Pro input / output was temporary (through 5/5 I believe) so it'll probably be back to the higher pricing afterward

#

Until they get their node set up, anyway

jovial kelp
twilit lodge
#

Guys, never tried getting paid credits for DS
You make a api key on DS site, put up credits there, then use BYOK on openrouter, and thats all? Do you need to config anything else? (token limit, something like that)

jovial kelp
#

Their strategy have always being providing inference in breakeven category, not getting as much profit as possible

simple mauve
feral scaffold
#

It doesn't spaz out when you crank up frequency penalty, for pro at least

cloud flame
#

Because it probably ignores it

#

Thinking mode just ignores all samplers possible, even temperature

timber obsidian
#

Hi guys anyone know what the best providers are for Deepseek V4 flash? Novita is the fastest but doesn't mention the quantization so does that mean it's a nerfed version?

#

Maybe I should use the DeepSeek platform then? Do they have US servers

dusty birch
#

im finding pro pretty good at coding, it picks some pretty good choices for changes, it doesnt overengineer very much

opaque rapids
#

extended to end of May

sharp vigil
#

We love deepseek 🐳

toxic rose
frank wind
feral scaffold
#

they need to train on those inputs

opaque rapids
#

to me there are 2 possibly case,

  1. being bait for them to collect the data and further RL
  2. burning the money until the Huawei chip getting properly deployed
frank wind
#

train on my smut please

opaque rapids
#

I guess now I don't have to worry running 10 multi layered agent to write my smut novel

sharp vigil
#

Agent swarms are actually possible now

gaunt dirge
#

Say thank you and feed them your Opus prompts.

jovial kelp
minor canopy
#

Guys, why Deepseek v4 pro/flash doesn't work on OR ? I've tried with opencode and Kilo code but can't use it. Keep getting errors from "provider" [AkashML] deepseek/deepseek-v4-flash is temporarily rate-limited upstream. Pleas...

onyx bramble
#

try different providers or make yourself a dedicated deepseek api key on their site, rate limiting on OR deepseek is pretty rampant right now

minor canopy
#

thank you

gilded onyx
#

When they say non thinking mode it's just setting reasoning to none, right?

gilded onyx
#

Anyone using DS for roleplaying? Would you recommand flash or pro? At what reasoning level?

jovial kelp
#

I am using their chatapp so the option only thinking and non-thinking

gilded onyx
#

I'm using their api directly. I'm building a roleplay app for romance and nsfw. It's kinda hard to say which settings is better

blazing bough
#

hi, when will fireworks be available? (with tool call support)

exotic elk
cloud flame
#

For me sending xhigh (which maps to Max) gives almost same results as sending high (which maps to High), maybe it should be as that?

exotic elk
#

So Deepseek's pricing is a temporary discount through May?

gilded onyx
gilded onyx
exotic elk
#

Not sure how much I like that. The thing is, at the non-discount price, it's not much cheaper than Sonnet, so I may as well use that. Or just go back to 3.2.

#

Almost as good is only really a sell when it's significantly cheaper.

indigo folio
#

on top of this until-may-31-discount

onyx swift
#

Anyone know why novita doesn't have tool calling listed as supported on either flash or pro?

#

They have other models that do show support, and their own model page suggests it should be supported (looks like novita calls it "function calling")

tulip estuary
wild mango
#

Discount on pro now extended until the end of may

exotic elk
#

How does the cached thing work? My 50k context text adventure turns are bafflingly cheap. Like $0.0014 a turn, whereas it should be around $0.02 a pop.

tulip estuary
#

everything you send to them is cached for about 24h

#

so when you send it back with a new prompt, only the new prompt is counted as input cache miss

#

plus the output tokens price

wild mango
#

At these prices we are basically paying for output

exotic elk
rugged vigil
#

yay, (almost) free DeepSeek V4 \o/

jovial kelp
#

Holy

#

This model is really good if you wanna learn unconvential knowledge

#

Finally, good less align model

simple mauve
#

The flash version is just plain stupid even with max reasoning. It's funny in its own way, but definitely not something I would use in a production environment.

covert topaz
thin bramble
jovial kelp
#

It being to positive could be true, if it keep on trying to satisfied user by answering even bad query then it technically showing a positive behaviour toward the user.

Similar case with claude model who want to be overly helpful that it accept bad query and completing it.

dusty birch
#

a image understanding beta is on the website

#

as mentioned in the paper they will work on multimodal models

cloud flame
#

Vision mode probably would be small separate, instead of on top of Pro

sharp vortex
#

Or v4 flash has vision 🗣️

raven canyon
#

is instant v4 flash and expert v4 pro?

sharp vortex
toxic rose
#

Finally, vision-ds

gusty sphinx
#

exotic knowledge? dark knowledge?

toxic rose
#

anti-knowledge

rustic island
#

Lol, this model is a softie

#

I ask it to be critical and analytical but it feels like it eventually feels bad and starts toning it down after ~15 turns
Not quite in a sycophantic way, it just softens the tone

gusty sphinx
#

who could ever stay mad at you

regal shuttle
# rustic island Lol, this model is a softie

It really is; have noticed exactly the same, not just in chat-bot mode but also creative-writing/roleplay. Not sycophantic, but has a real tendency not to play hardball. It can of course be forced to with prompting, but where choice exists it leans to the softer approach every time in my testing compared to e.g. GLM 4.6 (GLM 5 also a softie), or to Gemma.

dusty birch
#

theres some interesting vision/multimodal tokens in the tokenizer of v4 pro and flash
also some box, polygon, point and ref grounding-like tokens

these tokens dont seem to be present in 3.2 tokenizer, nor OCR 2 nor Janus

#

these tokens when given randomly seem to trick the vision model from the website to hallucinate random stuff

left grotto
#

deepseek v4 pro through openrouter (deepseek provider) repeats my system prompt to me in every response
is this a deepseek bug? or an openrouter bug

exotic elk
#

How does D4 compare to Sonnet 4.6?

broken lintel
#

just because I'm curious will you get banned from deepseek as a provider through openrouter for any china related info that comes through as prompts?

#

I'm planning on having it go through some state legislature bills and some of them mention the PRC/Taiwan and I'm uncertain if I need to be filtering that out to other models

simple mauve
eternal thorn
#

Kinda a beginner question but is there a way to get it to generate long reports on a topic like you can do with Gemini and claude natively or do I need to make a custom agent for that

supple sigil
# exotic elk How does D4 compare to Sonnet 4.6?

every time a new oss model comes out you hear that it’s a “sonnet replacement” or “opus replacement”. those statements are usually complete bullshit. v4 pro, to me, actually is a sonnet replacement - borderline opus replacement. it is incredible at agentic tasks, reasoning, coding, etc. over the past 4ish days ive done ~50m tokens (only cost me like $2 lol) in agentic coding and deepseek genuinely is nearly flawless.

tl;dr i would only recommend using sonnet over deepseek if you prefer the style or like spending money

tulip estuary
#

i really dig the deepseek style

#

it was a bit sloppy before but much better than whatever GPT is always doing

broken lintel
#

I think everyones threshold is different tbh, for most things m2.7 was mostly fine, V4 is definitely a nice upgrade though and the price is just too good

supple sigil
#

haven’t tried it nearly as much as pro, mostly just because pro is already dirt cheap

#

considering v4 flash is larger than minimax m2.x series models iirc, and it’s a deepseek v4 model, ill assume it’s pretty good

thin bramble
#

the "upgraded" model has vision

feral scaffold
broken lintel
#

running cost estimates for deepseek is almost stupid ludicrous if the caching holds, my workload is very cacheable and just did an estimate for one run through. About ~20 mill uncached input tokens, ~421 million cached tokens, and about 110k output tokens would come out to about $4.0266 ? 😭

#

Has anyone ever had cache miss on things they shouldn't through openrouter or generally is openrouter good at not interfering with that?

rustic island
#

Did you pin your provider to DeepSeek?

supple sigil
#

i keep thinking "surely i have to put more in by now right?" and i keep not having to put more in. the value from deepseek right now is absolutely absurd 😭

wild mango
#

Anyone knows how long the cache lasts? I was hit today with a 1.5 million tokens cache miss 😭

raven canyon
wild mango
#

Yeah I don't know what happened 🥲

covert topaz
#

idk maybe i was lucky on release but the writing felt fresh on release day compared to now, now all i see is we have claude at home, the same sentence structures isms, phones buzzing, low (insert sound here), fluorescent lights yada yada i just want to get away from claude man 😭 like i managed to prompt it out but its just a drawback that i picked up on

#

im attributing it to the giant ass discount on their pro model so everyone and their mom is jumping on it -> model degradation

oak maple
thin bramble
#

i changed my mind on rp, on web, expert thinking deepseek isn't bad.

lime moth
#

Bros, anyone testing in RP NSFW???

Temperature and reasoning?? low, medium??
Plz help for long RP (with repetitive scenes).

thin bramble
#

(both in web, with thinking) dipsy v4 expert sometimes mogs glm 5 turbo in certain stuff

thin bramble
thin bramble
dusty birch
#

yeah this is deepseek v4 flash

soft fulcrum
#

Is there any way to stop Deepseek from putting reasoning into the final answer?

supple sigil
# soft fulcrum

maybe try increasing the reasoning effort if you don’t already have it set to max

soft fulcrum
#

that's weird

#

I have it set to high

#

it's a shame that it needs max, since it's already so inefficient with high thinking

#

I will try it

#

it went from 2000 tokens of thinking to 13,500 just from high to max

#

this is ridiculously inefficient

supple sigil
#

kind of just shifting where the reasoning happens

soft fulcrum
#

yeah

#

In theory, deepseek seems smart, cheap and fast. But in reality, it's slow (because of thinking), still cheap but way less than you'd expect, but probably still smart

#

I wish they focused more on efficiency this time

#

I hate how companies use reasoning as an excuse for laziness, at least that's what it seems like

woeful jay
#

5.4 brought good token efficiency improvements and then 5.5 made it REALLY efficient w reasoning

#

(well yea they increased the price but its understandable tbh)

hoary zenith
left grotto
rich ferry
#

Providers half-assing new model releases is one of the biggest pain points I see with OR

#

Unfortunate

flat osprey
#

yep, that and providers just sticking with absurd pricing from some companies like qwen

jovial kelp
#

I think that will never change, they need to put more effort to make multiple models work as how it being intended and most of them either just doesn't want to much effort into it or aren't actually able to do it

#

At the start of 2025 i also feel the same with a lot of providers

sharp vortex
#

But other providers no longer cutting price anymore 💔

cedar tree
jovial kelp
#

isn't exacto only for structure output?

cedar tree
#

i want openrouter to vouch for these providers are "as good" as the reference, basically

#

so if they're using vLLM main and vLLM main is borked? good luck, you're not getting served

#

i'd honestly love to see openrouter do the same thing that SemiAnalysis is doing and run SGL and vLLM reference models, and compare against 1p upstream + compare against other providers

deft crow
deft crow
rare gale
# deft crow very much top of mind for us

For what it's worth, in a production capacity, I would be perfectly paying a decent surcharge just to get better reliability/consistency ; maybe that could help fund the persistent benchmarking/testing?

gusty sphinx
#

OpenRouter Exacto Pro MAX?

deft crow
#

i've been trying to hire a like head of evals

rare gale
ocean night
#

V4 flash was rad at coding typescript in a small test. It decided to use the write tool instead of edit repeatedly but never made any errors.

raven canyon
#

if you ask deepseek v4 "what is in the image?" it will hallucinate an image for some reason

#

note: only happens for non-thinking mode

left grotto
raven canyon
left grotto
raven canyon
#

openrouter via deepseek byok

#

the openrouter chatroom seems to not be able to disable reasoning (?)

left grotto
#

oh the actual openrouter library, im using the openai library, not sure if it would make any difference

raven canyon
#

shouldn't

#

maybe its a quirk with deepseek's official API

#

they might be injecting something in the system prompt

left grotto
#

does deepseek have a library or do you mean their direct endpoint?

raven canyon
#

im using it through openrouter

#

deepseek provider

#

this is really inconsistent

left grotto
#

earlier i had it repeating the system prompt to me every message and deepseek direct api fixed it
might be worth trying to see
ill regen a few times and see what happens

raven canyon
#

i am effectively using deepseek official api

#

i have deepseek provider as BYOK in openrouter and have forced it to only route to it

left grotto
#

not necessarily, it still goes through all of openrouter's layers
openrouter was always doing BYOK but with their own key essentially

raven canyon
left grotto
#

that was using deepseek provider through openrouter
then i used deepseek.com endpoint directly and it stopped
today i tested again and openrouter seemed to have fixed it

left grotto
raven canyon
#

i just tested, using deepseek official API it has the same weird behaviour

left grotto
#

could genuinely be an openwebui bug i had lots of trouble when i tried to use that before i just made my own chat ui instead

#

maybe try hitting the endpoint directly and see if the behavior is the same

#

like just copy out a python/curl blob from openrouter docs

raven canyon
left grotto
#

ah yep i see what you mean

raven canyon
left grotto
#

now deepseek is just flirting with you 😂

#

i just told deepseek to make an svg of a cat and i got back chinese talking about reptiles lol
never seen that before

raven canyon
#

"what is the image" => "i cannot see the image" 8/8 times
"what is in the image?" => random hallucination

left grotto
#

interesting

sharp vortex
#

Deepseek vision tmrw copium

vapid slate
sharp vortex
vapid slate
sharp vortex
jovial kelp
#

Deepseek still aren't multimodel and it depend on their seperate OCR model, i guess the training they done to allow better integration between these two seperate models make it hallucinate more on the vision understanding

thin bramble
#

deepseek simply took it step further and separated the models.

jovial kelp
# thin bramble no model is unified with their vision
Unified means brought together into a single, cohesive, or functioning whole, characterized by joint action rather than division. It refers to entities, systems, or groups that have been consolidated to act as one, such as a unified team, theory, or structure.
#

When two seperate architecture/node/block/model being put into one it mean it become unified

thin bramble
#

so it is not native either way

rare gale
#

Benchmarks on long context performance for pro or flash yet?

cloud flame
rare gale
#

Hmm, longer scroll than I would like

#

Probably best to cap at 200k or so tokens then

#

GPT 5.5 crushing it though

cloud flame
#

It's quite hard test suite, 8needle while previous industry standart was 2needle

rare gale
#

I have been enjoying using this model in Pi though, not bad at all

cloud flame
#

30-40% should be enough @128k to show quality of long context

#

Wait

#

Do you also see Claude Opus 4.7 at the bottom of the table?

rare gale
#

Yes, Opus 4.7 was a big regression in these classic long context benchmarks, they even called it out in the model card

#

Which is funny after 4.6 crushed them

cloud flame
#

But not like that - I mean what.

acoustic dune
bright pilot
#

Deepseek v4 tomorrow

raven canyon
#

deepseek v5 next week

sharp vortex
#

Why are we in deepseek tmrw era again 💔

lime moth
#

Sorry bros, i must to say something.
In RP
GLM 5.1 >>>>>> deepseek 4 pro.

peak swallow
#

I fully disagree. even glm 5 was better then 5.1 for rp

opaque coyote
hot swan
# cloud flame https://contextarena.ai/

damn that's some pretty fucking good scores even for flash
I mean if you remove the various reasoning levels the ranking is just
gpt 5.5
sonnet 4.6
opus 4.6
gpt 5.4
gemini 3.1
deepseek v4 pro
deepseek v4 flash
everyone else

eternal thorn
#

why does it sometimes show the reply to a prompt being done processing but there isn't any text only the reasoning

woeful jay
#

xhigh or high chat ?

rare gale
#

I've been really enjoying this model for agentic coding; maybe it's just because Deepseek's servers are actually relaiable this time round but it's the first time I've thouroughly enjoyed using an OS model for code.

woeful jay
#

when using the api im getting a insufficient balance error

#

and in the chatroom im getting error 408

#

ts happening to anyone else? ofifcial deepseek provider only

oak salmon
woeful jay
#

@deft crow

#

ok doing byok works but i didnt really want to

oak salmon
#

Yeah, this happened last time when openrouter ran out of credits on their side

cloud flame
#

To be honest, to make 1.6T model bad requires more skill than making it good

woeful jay
#

@deft crow im ngl this is a really bad look for or

deft crow
#

no auto top up from deepseek makes toven a sad boy

#

already fixed when you first pinged btw

broken lintel
#

it is kind of goofy they don't have auto top up actually

cloud flame
#

I think you need like some kind of web scrapper/LLM to check DS balance every 30 minutes or something like that

broken lintel
#

they have an endpoint to check user balance

cloud flame
#

Oh. Even better. But auto agentic top up is another thing lol

woeful jay
woeful jay
#

toven tops up with 500k credit limit credit card 👍

woeful jay
woeful jay
#

yo this models goated

#

pro for the price is sooooooooo goated

tulip estuary
raven canyon
frank wind
#

Yeah

#

I use it exclusively

#

Uhhh not for coding

rare gale
#

It's very consistent, and capable enough

#

I think we finally have a good enough model for like best-of-n coding or mass fan out code review agents (or any other agentic workflow equivalents).

#

The cache hit ratio is pretty incredible too.

untold locust
#

Any way to disable reasoning in deepseek v4 pro ?

#

I found openrouter params are broken.

vapid karma
#

I keep going back and forth on it. It feels like a smart model that does weirdly dumb things

#

They did say it was a preview tbf

frank wind
#

Autocompleting is way better than chat

#

If you know how to write and edit

frank wind
#

Hopefully the Ascends will help them mitigate this

gusty sphinx
#

but what are they filling their middle with? 🤔

#

ohhh it's just completions but you can optionally provide the suffix prompt as well

#

that sounds fun actually. i'm tired of talking to these goddamn clankers

gusty sphinx
untold locust
#

is there any param that disable reasoning ?

#

I see weird pattern it does reasoning sometime and sometime it doesn't

opaque coyote
#

I haven't been using the DeepSeek provider for benching cause of the data training policy. A bit annoying - would save a stack of cash otherwise.

opaque coyote
untold locust
#

Yes

#

Problem is it does reasoning sometimes and sometimes it doesn't/.

opaque coyote
#

Hmm weird... maybe a model level issue?

untold locust
#

Yeah , initially i suspected openrouter param mismatch.

#

But model mention non-thinking mode in hf description.

jovial kelp
#

This model is pretty interesting

#

Work pretty well to in zed

peak swallow
untold locust
#

i used asme.

#

:deepseek

jovial kelp
#

This is interesting

#

In the past deepseek always has problem with qouta, now with their stack of ascend asicc they seems to have better qouta distribution

#

Their speed also quite stable at 20-30TPS

#

I remember last year i need to always use third-party provider because deepseek always got overloaded, making it to slow for me, specially the latency

bright pilot
#

deepseek v4 tomorrow 🙏

rich ferry
#

deepseek v4 yesterday

woeful jay
#

which is great

#

if only they would remove the prompt training

rich ferry
bright pilot
rich ferry
#

Is that mentioned somewhere in the docs or is this a word of mouth kinda thing

#

Because that would make DeepSeek pretty much a no-brainer

#

@bright pilot

rustic island
#

Interesting, skinmed over the docs and that's indeed there

bright pilot
rich ferry
#

I sent them an email, will report back

sharp vortex
#

-# assume they stop gathering data

rich ferry
#

Honestly, getting that response wouldn't really surprise me

#

I'm hoping that's not the case but we'll see

bright pilot
#

They've clearly said they'll get prices down after they get more hardware

#

I dont understand the confusion with the discount