#Deepseek V4
1 messages · Page 5 of 1
yes thats right it would cost about 6.5x more based on artificial analysis data
also its gonna be way slower because of this
that's why gemma is way on the cheapest side of that line yeah:
flash seems really solid
wish they wouldve added multimodal
yeah it seems that pro is kinda underwhelming for what it costs
at least its not benchmaxxed
Tbf v4 is supposed to release last month, but it got delayed in their technical note
I saw this somewhere, too. But the model name doesn't say preview. So what's the deal? Is it a preview, or is it not? 😮
For coding?
They did say Preview
Yeah, but the preview models are (usually?) named ...-preview, and this one isn't. I'm confused.
I personally think fully releasing model will be 4.1 with engram. So they don’t need to make preview name
Or just that they don’t satisfied with current v4 yet
Google also doing this thing, i don't remember since when google use non-preview for gemini pro
Wdym Gemini will never out of Preview stage 
-# they will release new preview model before GA old model frfr
Why is nobody hosting the model
Is openrouter no longer worth it anymore due to the high traffic it provides?
if it's getting high traffic it's clearly worth it, isn't it
Probably because it's a brand new model that came out on a Friday, and it's now Saturday which is not a normal workday, and it takes time to properly set up a new model.
yeah, they went straight to my ignore-list
No lol you're wrong
First of all this is a hugely anticipated release. Second of all a bunch of providers have set it up already but they're not serving on openrouter for this specific model.
And thirdly it absolutely does not take time to deploy this. Any person with a bit of gpus can set it up within a few hours. Let alone big inference providers
i don't know man, it used to be that providers would happily serve their busted ass implementations that were technically "working"
which sometimes lead to some really poor publicity for a model, because most aren't paying attention to who is providing it
or are even aware of how this works
deepseek v4 has new implementation details i think
Idk I'm just starting to feel that openrouter aint worth it. Almost all models experience low TPS and im assuming the openrouter accounts on these inference providers have their own quota. And I'd be better off just registering directly in those providers to get better tps
it wasn't that long ago that most providers found out what model "tool calling" is
i think openrouter counts TPS weirdly
Like im so done with 20-30TPS models and most of open source models on openrouter are experiencing that
Bro the models feel slow af
i mean go find out right now and report back
maybe i'm wrong
I dont have to
Take cloudflare for example. When k2.6 came out the openrouter cloudflare provider almost died. While cloudflare's official api was stable
then use BYOK i guess?
ok. i didn't try k2.6 for a while
I will, just expressing my frustration
cloudflare is good
i'm just saying, if you think this is a shitshow, it used to be a lot worse
I know because i used open source models on openrouter a year ago
But this isnt an excuse for slow models. One model can't call tools but it responds another runs at 9tps and 20s latency
Is togetherai still busted?
trust the plan, deepseek always delivers
I get random numbers in sentences from Pros responses.
yesss, it's on togetherai provider
Is there a way to force it to only go through Deepseek provider on JanitorAI?
make a custom https://openrouter.ai/docs/guides/features/presets
holy hell
Thank you sir
I thought the price decrease would be like 30%, not divide by 4
I think the deepseek team actually fucked up massively not delaying release by a day
if those were the prices they were about to offer
imagine the narrative and the cost/output graphs if those were the prices for pro at launch
incomprehensible blunder
yeah cause at that price it’s way more good value
only 3x the price of flash instead of 12x
damn, now the price to performance makes sense.
deepseek v3 level pricing for … who knows? level of intelligence
🙌
1m whale slaves
filling oceans
They updated what
idk feels different
WE ARE SO BACK
but they didnt announce anything
yeah they dont announce anything
smart market
they collect feedback and fix
efficient
The same expert in Chinese, I guess it does some bias based on languages also
nice more providers
they wait until benchmark period is over then discount fr
not different in a bad way btw in a good way
let's feed deepseek more data
thats why i was thinking of changing my prompts to chinese
Do gooner really care that? (it's anonymous data)
might get better output
-# pretty sure no one actually coding confidential data with Deepseek anyways
so you change your prompt to chinese and it follow instruction good?
its what ive heard not confirmed but people did it in the past and claimed to get better responses with older deepseek models
the problem is getting it to output in chinese
Proof or placebo effect
i mean english 💀
kimi is certainly better but at the current discount? I could use deepseek
ain't no way Deepseek place a high price trap so they can "discount" when other provider start to host it

look at that insane score/price ratio for flash
meanwhile deepseek pro is currently just twice the price while scoring 71-87 (depending on thinking budget)
How does that work across different providers, though? One provider has version X under the same name as another provider's version Z?
it's not even clear if kimi is better at coding in fact
how do you use max in api? xhigh reasoning effort?
i think
yeah
what site is that
what makes it better than any other benchmark site or Artificial Analsyis
is it a benchmark itself or an aggregator
aggregator
true
this is COMPLETELY wrong btw
where is mimo flash
i cant get a single request through Pro, just ratelimits
id try siliconflow for inference
they have it properly setup
decently more expensive than the official api though
flash versión is bad for roleplay
define "bad"
far sloppier than pro
less world knowledge and dumber
dont follow the format, random thinks, change the narrative from third perspective to 2 persona, etc.
alot of rerolls, cheaper but dumb
i will try the pro
it's surprisingly competent at fiction writing in general though
use pro for rp
that is the fault of the greedy providers
deepseek today?
Chads
https://orca.orb.town/monitor?model=deepseek/deepseek-v4-pro guess whos back
Updates detected between OpenRouter API snapshots
(its io-net)
Shady's back
Deepseek provider is having issues with v4 pro? I can only use it via other providers
and they've still got the worst service on offer
Does 429 Provider returned error really mean over-load?
A lot of endpoints doesn't support tools? Or OR is missing something?
Compare models and providers available on OpenRouter
are they just trying to cash in on a new model not having too many providers yet?
it has a fair few now and theyre still trying to steal through fallback routing 💀
guys, kinda new to AI pricing and terms. The input/output tokens... thing.
lets say I chat ~200 messages a day (roleplay chat). If I used Deepseek V4, how much I'd be paying each day for these 200 messages?
Is there a way I can estimate that?
Well, a million tokens are, depending on the tokenizer, around 4 million characters
You would need to calculate how many characters in + (out + reasoning) you use and do the math around that
But there's also caching, repeated input tokens (so e.g. re-sending the chat) will be discounted depending on the provider
But I think I'd just simulate a typical long chat with your usage for some turns and see how that acales
Hmm, guess the best course is to put a few bucks and see how long it lasts, then. But for deepseek do you put up credits in DS itself, or in OR?
with how bad the rate limits are for OR right now, probably just in DS itself
i think most apps support the DS api natively so you should be able to just skip OR altogether
There's BYOK for that
It's your choice, really, both should work fine (other than perhaps the limits)
Not aware how the OR limits are right now
rate limits? like restrictions?
again, not familiar with AI terms
all I did was set up OR account, use one key for free models in the past. So yea, not much experience
I keep getting a 500 Internal Service Error.
best course it's to put a few credits and keep an eye in the logs, and keep in mind that the longer your chat = the more expensive each request gets. Also if you're interested in NSFW chatting, you might want to exclude a few providers that don't allow it
aight, thanks for the info
gonna take a look into it
Is Gemma 4 better than DeepSeek Pro V4?
How good is it, based on initial impressions?
extra latency and cost past 1m requests
Uh lol
Will you really ever send a million requests a month for RP?
I don't think the extra latency makes a difference either
probably not but i still think its better general advice
if whatever youre using already supports the deepseek api, then why route through another api
Isn't that more expensive, though? DS seems to have an extra 6% VAT added on the topups.
probably, but at least there’s the benefit of avoiding ratelimits
Doesn’t OR get almost 10%
It’s 8%
VAT? Not necessarily. I'm vat-exempt, for example, yet would still need to pay DS the 6% extra. At least I found no quick way to specify that it should be 0%
Still keep getting Internal Server Error. I have ZDS, but that shouldn't apply to SiliconFlow I don't think.
I see 
just block every provider except direct api
Unfortunately direct provider is getting clawed to death at the moment
openclawed to death
Nice. Got it to work on my text adventure via Together. Good results and cheap too.
Even in pro RP, have errors like dialogue ( -Like this- ) and the prompt say dialogue ( "like this" ). Everytime give me dialogue with that -idk why-
I absolutely hate it when it gives me a reply where all the text is in the Reasoning. It basically stole my money.
it did it many times for me
Which provider and what frontend?
Together, and I'm using OR chat.
Try different providers first, tell me how the result for different providers
Rn V4 is like an old lawn mower engine. Have to keep pulling until a request gets past the 500 error.
Yeah, OR really need to get some deal to increase their rate limit for each providers.
I remember someone talking about how they got rate limited in OR but when they go straight to the providers site and use their service directly, it allow them to make more request than in OR.
io.net is "provider ignored by account." No it's not. Only AtlasCloud is on my banned provider list.
Evidently I can't use the providers Deepseek and io.net, though I removed the ZDS on my Guardrails. SiliconFlow I can use but I keep getting an error 404 (No endpoints found for deepseek/deepseek-v4-pro.). So Together is the only game in town, and 9 times out of 10 it doesn't work either.
Yeah, it's a bit hard with deepseek if we don't want our data to be use for training
Because deepseek it self is still the best serving provider for their own models
I'm confused as to why a provider would be banned for using users data for trainin
It's probably best to wait a few days. It's not surprising that with zero workdays passed since the launch, not all providers are at their best.
No. Most providers respect your privacy.
Some people just doesn't want companies to use their data for training because it could contain private information
Option is good in the market imo, if you don't have problem of giving your information then go on but if you don't want that then there should be option for it too.
Just to experiment I switched off "Always Enforce ZDR" and saved. Deepseek still isn't a provider.
You also need to allow data trainings. And make sure it's allowed in the default workspace as well.
Using workspaces is still a little confusing... I just recently realized that it doesn't matter if I create a new workspace where I want to allow Deepseek as a provider--if it's not allowed in the default workspace, it still won't show up in the new workspace.
its cause there are two levels, account and workspace
what you should do if you want to do that is:
- allow training account level
- disable training on default workspace
- allow training on second workspace
Oh... oh... so what I thought was the default workspace, is not even the default workspace, but the account-level setting, meaning I will have to copy every setting I've used so far over to the default workspace. Oh boy... Okay, well, at least it's doable, even if quite a bit of extra work. Thanks for this bit of info--this was a missing piece of the puzzle I didn't even know was missing.
the account level settings should be also in the default workspace
its just if you want to change something
Yeah, I get it now. So far I've had DeepSeek (the provider) blocked, and that's still how I want it in the default. I wanted to create a new workspace specifically for DS today, but couldn't. Now I understand why. I need to copy over the account settings into the default workspace, then change the account settings, then create a new workspace for DS specifically. Yeah, doable, but not tonight, LOL. 😄
I noticed that too. He was doing it naturally, without me prompting him in any way. I thought it might be a issue or hallucination. Maybe it's a training thing?
All of them do
If you want to go conspiracy mode and assume that every single provider breaks the law, sure, but that clearly isn't what I'm talking about
Are any of the other providers going to go down in price? Deepseek is conspicuously cheaper.
deepseek flash has some providers but it looks like they don't support tool calls
deepseek pro is also looking not great
deepseek flash options: (all same price)
- DeepSeek offical: may train on your requests
- DeepInfra: has tools, but 7tps is rough (potentially reporting error?)
- SiliconFlow & NovitaAI: decent uptime and speed, but no tool calling
deepseek pro options:
- DeepSeek offical: cheapest by far, may train on your requests
- GMICloud: second cheapest, but not great uptime, and no tool calling
- SiliconFlow: second most expensive, decent uptime, no tool calling
- IO.NET: most expensive by far, fast (potentially), unknown uptime, only non-deepseek with tool calling
so basically do not use openrouter yet for deepseek pro requests with tool calls cause if deepseek official is rate limited your request will go straight to IO.NET with 5x-10x the cost
pro is very creative in text completion with no instruct formatting
low probabilities but never incoherent
1.1 temp, no other sampling
honestly I am surprised
It's a gold rush and laws historically haven't mattered that much in gold rushes. If they earn more selling your prompts and completions for training than they do providing inference, then of course they're going to do that.
it's main thing is the knowledge base since its large. Hopefully later versions make it smarter / less crazy
yes, indeed. it knows a lot
Don't forget about stability and quality, third-party inference providers are good options but in some cases and it's a lot, the performance of the models get butcherd
Assuming an unethical actor, it isn't selling prompts vs selling inference, it's selling prompts vs the cost of a lawsuit for breach of contract and permanently being shamed and blacklisted out of existence.
Regardless, it matters what they promise. I'm not going to a restaurant with a terrible health inspector rating just because you go "eh, they're all dirty, they just hide it." Okay, well this one definitely is, so I'll take my chances somewhere else.
you may need to update your prompting strategy sir
claude biased due to claude synth data in deepseek?
EQ-Bench 3 is a LLM-judged test judged by Claude Opus 4.6, evaluating active emotional intelligence abilities, understanding, insight, empathy, and interpersonal skills.
Longform Creative Writing Benchmark: Judge upgraded: Evaluation now uses Claude Sonnet 4.6 (replacing Sonnet 4).
i think it's generally agreed that claude is biased towards itself. but i've used gpt-5.4 quite a bit and "emotional intelligence" is not the first thing that comes to mind
v4Pro uses a fuckton of "it's not X it's Y"
i haven't dug into the benchmark strategy in a long time, but it probably is more "accurate" than what most people would see from a small amount of casual use
yup
It’s not sonnet anymore? 💔
it is sonnet for long form.
Opus is too expensive for long form ig
What I've found on the main EQBench 3 is that the ELO isn't reliable, but the category scores usually are
also short form creative bench means nothing
no dont i swear 4.7 is a regression too
good release or underwhelming release?
pro isn't super underwhelming - just not a good deal for the price
right now with the discount it's actually great value
In terms of Artifical analysis scores. The deepseek v4 flash(max) is 3x more efficient price to performance wise than the model that gets second rank
All in all, deepseek didn't disappoint
yep, lines up with my pareto graph
I think it's a really good option for creative writing, but I honestly prefer Minimax M2.7 or Gemma 4 31B for cost effective coding
mainly because the hallucination rate on V4 flash is absurd and it also just thinks for quite a long time
Yeah, at max thinking it thinks too much. But it's still really good at high thinking and thinks a lot less (about 1/3rd of what it does at max)
but even the high thinking module thinks more than the previous 3.2
yeah, it's also hard to determine how much intelligence is lost when you drop from max to high
yea i agree, i phrased my thing wrong
yup
mimo 2.5 pro is the real unsung hero
These models are still preview. Deepseek probably gonna drop a banger soon
yeah mimo 2.5 pro is really good honestly
just a bit pricey for my liking, but really capable
Deepseek tmrw again 💔
i also think V4 was unfortunately severely overhyped - like, people thought this was gonna be opus-level reasoning at the price of V3.2
like, without the hype, V4 pro is still amazing value for what you get - it's just that better options exist now
I mean it's deepseek. They did bring r1 which was the performance of O1 at the time while being like 4% of the cost.
which is fair
they mainly got there because LLM research was still in its infancy though - a lot of the tricks they used for R1 just don't apply anymore for the same gains
What are the tasks that suffer much more and much less from high hallucination rate?
It brings miracles. That's why I worship the whale 🐳
mainly questions that have conflicting or outdated training data
like KP's notorious test of asking how to fix lag in a Paper Minecraft server
P much everything matters a lot. But high stake things matter more than low stake ones.
So if you're researching, writing code and maybe something medical, then hallucination matters a lot
sometimes the model will think it knows something well enough that it just won't research
So the more tasks needs to calculate smth vs the using existing knowledge, the more hallucinations affect it
yeah pretty much
Yeah, I wouldn't use this model for medical lol
like, hallucination rate becomes a lot less relevant if you're always relying on the model researching
but it still is a factor
Fun fact: before Gemini 3.1 release, older Gemini 3 previews also had high hallucination rates on all models - both Pro and Flash. The way they combated it is advising all developers and users of Gemini 3 to use 'web_grounding' web search native tool call, which worked basically as "ARE YOU SURE ABOUT THAT?" John Cena behind model's back during thinking process
Unfortunately, Deepseek v4 does not have that exact option
Yeah Gemini had terrible hallucination rates (I think Flash is still god-awful?) and was really bad on safety stuff too like enabling mental illness.
yeah, you usually have to supplement V4 with a research tool like OR's research option or a custom-built one
for instance - i regularly use Gemma 4 31B with research, and even though the model has a high hallucination rate, i can get outputs comparable to non-thinking frontier models
yeah flash hallucinations are still pretty bad lol
gemini is just mentally unstable overall
Honestly the negatives don't get enough attention IMO. Hallucinations and mental-health matter a lot, especially in consumer apps.
yep, which is why 4o had (and arguably still has) a cult following
hallucinations + sycophancy are a match made in the psych ward
I know to use Opus to call me out on my bullshit, but the average person doesn't know how any of this works.
"Arguably"? They still post on Twitter all the time =P
true lol
yes.
So deepseek has native search?
speaking of opus - 4.7 feels a lot worse at the role of calling out bs
sometimes it just calls out things that are either fine or weren't an issue to begin with
no model really has native search
it's always a tool call or tool process
It's also subtle. The model specifically needs to be skeptical and blunt imo, because when people vent they don't give the other side of things. So it's easy for even good models to take your side when it shouldn't.
I mean native provided
oh, then yeah it does lol
don't think they provide it through the API though like Google does with grounding
I need to test that more with 4.7
I can build a local web search with injecting data into prompt, but it won't be as good as smth provider of model does on their side
it's not even like it's sycophantic - it just misses key facts really easily sometimes and jumps to conclusions too quickly
i would look into Tavily. they have a pretty generous API for fetching web data naturally for LLMs
it's what i use for my custom research tool
I was amazed that GLM-5.1 called me out to just the right degree on something a while ago in a test. Better than anything else. Was sympathetic but critical, clear but not assholish, etc.
Reasoning on or off?
Isn't any 3rd party web search tool just do another round trip with sending input context? Or does it activate after user sends prompt, before model processes it and decides to call tool?
it's actually great for medical roleplay
Like what?
on, but you'll find that on claude web it's a bit difficult to get it to think deeply on something
the model very frequently just... doesn't research stuff it should be researching
Anyone know whether there are representatives from DeepSeek on this Discord?
I saw from a post on X that they are asking for feedback, but their own Discord appears to be an unmoderated, malware filled mess.
Basically hit it with a concern about one of my friends. I kind of stream of consciousness my thoughts in and then run it across all the good models in OR chat.
It had deeper insights than anything else, covering observations and ideas that other models skipped past. It correctly called out spots where I might have biases, am being overconfident, etc.
none that i am aware of
they would've likely said something here by now if they were around
Aside from being genuinely useful as a way to sort through my thoughts, I really like it as a model test. They have to figure out what I actually want, since it's largely venting. And what I need but don't want. Meta observations, like taking into account that I chatted about it. And the problem itself of course.
A shame. Would be more than happy to give feedback etc, but not taking out an X account to bother, and their own discord appears to be a waste of space.
i also vouch for Tavily, i use it, the free tier is good value
is that with or without the discount
Have anyone noticed that NovitaAI, SiliconFlow not working? its only DeepInfra working when excluding Deepseek own infra
i knew this would happen
it's so good
also kimi droppedddd on longform writing, damn
it's very delicious
the slop going down by times 2 compared to all their previous versions
without
holy
how come?
ain’t no way there’s model degradation already 💀
yea...
10.5 points down
compared to 2.5

probably has to do with how heavily 2.6 is tuned for logic and coding
yea
i love me some agentic workflows 😝
well idk if it’s just me but I noticed alot of claudisms popping up more with DS
i'm sure they're distilling from opus too
the thing im trying to avoid
bring it on
Kimi K2.6 falling in longform is very weird - should be other way around
the only people it hurts are the billionaires
yea i thought it'd be better than 2.5
well people who are sick of claude too
ds team does test and experiment a lot
claude distills are just a bonus, but they don't really depend on them
they acknowledged rp feedback on X btw was kinda cool they want English speaker feedback
imo any open models before the bubble pops are great to have
development will slow down rapidly once VC money runs out
so could get some rp tuning later down the line
yeaa this isn't even the production version of v4
just a preview
so they're gathering feedback for the official v4 production version
It was similar to 3.0 preview from what I recall
i do think they're building the current base so they can add a lot of features and improvements without big problems
which is amazing
that model sucked at instruction following (for me) then 3.1 it religiously followed them
can only get better as more chips come through
yep
fr
expect the performance to be much better with the new gpus
DS got a bright future fingers crossed
anthropic.. falling off
these people must be panicking tbh
got beaten by oai 
not anytime soon
mm doubt, unless they do a 180 after opus 4.7
4.1 tomorrow
and honestly, the ds team shows that intelligent and skilled people matter, a lot
not just the money
XAI kinda falling off
look at what they did with the old gpus
were they ever good tho 
what?
the current v4 runs on old gpus
at one point ig
they had their 2 day leads on artificial analysis before lol
that's why it's a bit more expensive
recommend tempature is 1.3?
aren't they running on huawei chips?
once the new gpus arrive, the performance will be much better and the prices for pro will drop by a lot
saw that earlier tried it, sucks for me
V4 thinking ignores temperature
1.5 for cw
even if they were running on old GPU's how come GLM is dominating?
has anyone tried that out yet 😭
both great companies, but glm's prices are up by a lot
so they're barely scrapping by tbh
the only os models i really recommend are kimi, glm and ds
i don't trust anything else
I was really impressed with GLM in 5 then they just went the coding route with 5.1
and it’s definitely cause of openclaw blowing up in China they wanted to take full advantage of that
yea
think they’ve all done it atp
mimo v2.5 pro is very good for its price
✨
i've seen people say this, so probably. granted, i've never tried a lot of other os models besides the 3 i named
gemma 4 31b on par with ds 4 flash
and qwen
yes I keep hearing praise about this one
gemma 4 is good too
i use it for my discord bot
agreed
if it had the smarts it would’ve been godly
i mean it does, just not much world knowledge
what i really ab glm is that
if you give it research it gets very close to non-thinking frontier models
ya for its size it’s way ahead but I mean like SOTA level
waiting to go above 8-9k words
did u read it
with other models
not the whole thing, but i did read a little and even got claude to analyze it
said it was pretty good
this is crazy
After 30 mins it finally gave me:
5317 lines of HTML code
~100k tokens
This is just on the DeepSeek official site
Don't look at the UI. I didn't mention anything about the UI in the prompt
I will try with another prompt to push it to full 345k tokens output
Deepseek updated price again more cheaper!!!
so did deepseek change their reasoning style again?
the goats
once again I ask why they didn't just launch with those prices
could have made it clear it's a limited time offer and launched with those discounts anyway
wait wtfff, that's really good. also seems like its not a temporary change?
New age for gooners
dammn
this is the reason we're constantly hitting the rate limits btw
ignore me I thought I was using non thinking version but it was thinking so my parameters were disabled, high temp way better 
You can send 2 or 5 as temp with thinking - it won't change, yeah
so does this mean non thinking deepseek v4 pro is recommend for roleplay?
hmmmmmmmmmmmmmmmmmm
noticed it with 2.0 temp as an experiment I was confused why it was still coherent lol
not sure tbh
I think so
being able to control my parameters makes it soooo much better for writing
it’s weird cause having a role enabled set as assistant at the bottom disables the thinking process but ur still technically on thinking mode
Deepseek is out of control
Wait what
No word on this being temporary? 👀
That is essentially free cached input to me
I think it's permanent considering they put the actual pro price reduction, from 0.145 to 0.0145
yeah, 0.003625 is the 75% promotion on top of 0.0145 <- 0.145
and unlike the 75% promotion the 90% cache discount doesn't have any message about a date limit so
the promo got the reduction too, because it was originally 0.036 🤣
indeed (for flash)
Flash with that pricing and cache could be used for something big and static running on cron every 1-2 hours and be super cheap to maintain, even with reasoning
what the absolute fuck
Wonder how high that puts V4 on the efficiency curve, that caching change in particular
Like someone said, what a waste of PR lol
If they had launched with this 75% discount and that cache discount, I guarantee that'd have made many headlines
The way they move, I think they are still traumatised by R1 PR
I think they originally thought the pricing justified the model performance then did a last minute switch up after the pricing being negatively received
So how good is this
Samantha REALLY didn't like it
I don't
aight
Not for RP, but which one? This has so many versions now.
deepseek v4pro is good for roleplay, Just thinking mode loop output ruins it completely..
v4 pro. i dont enjoy it. both with and without reasoning. my own thinking formats also the pretrained roleplay ones. in either user prompt or system. its not good to me
it seems like it needs a lot of wrangling but once it gets going its really good I’ve been tinkering with it for a few days and now it’s at a place where I prefer it over my go to model (gemini 3.1) u need to account for its lack of instruction following (strong reinforcement injections etc)
but im not here to convince anyone on that. it doesn't fit my preferences for creative writing, even after heavily altering presets for it. that's it that's all. its a decent assistant chat bot tho.
had to make alot of adjustments to my preset all around and my personal cot is basically new lol
what's the recommended way to use v4 without thinking on sites that don't have a specific toggle for it?
you cant
disregard, i was wrong ^
or set thinking to minimal if that is an option
You can, even if the site doesn't have a toggle
Just create a preset (Preferences > Presets) that forces no thinking
guessing prefilling but ya
I think we forgot mimo 2.5's existence
let's compare mimo with deepseek, If mimo is superior
📢
With vanilla pricing maybe, but not with that discount
Soon they gonna pay for people using Deepseek
I want to findom a LLM provider
Maybe a dumb question given DS's reputation, but has anyone ever gotten banned from the DS provider? I'm sort of concerned about deploying this and people pressing the model to speak about Chinese politics
Their ToS are surprisingly strict and forbid NSFW (lol)
i wouldnt worry about it
if openrouter hasn’t been banned yet, you probably won’t either
based on what ive heard people say about it for rp its probably better but im happy with the way i got ds setup now
I'm getting "Provider returned error" in 9 out of 10 requests when using v4 pro, it's so annoying
I just discovered that v4 is out, now I checked the uptime and... I will not even try it.
It's far away from 90%
The orange is most important because I (and others) usually use other providers and they are available now at under 40%!
Is it there better?
yes for this case
launch week is crowded
and we're basically sharing one API key from OR
The issue is not a limit, the issue is that through the platform openrouter the access is just barely able because the connection is not very stable, probably is there a too weak server
the issue is the rate limit
the Provider returned Error
Hmm okay
yeah, Cairo is right
I decided to put a few bucks in their API directly and haven't had any issues so far, so It's on OR side
The DeepSeek provider has an insane cache time... and with these cache prices... that's seriously impressive. I've just tested the cache after an hour, and it's still there... documentation says the cache stays from a few hours to a few days... The only issue of course is that the data can be used for training... but with these cache conditions, it's really worth considering. Most other providers have a cache time of maybe a few minutes.
it's because they don't have a huge limit from the provider (DeepSeek) like they have with other major labs
it's weird because it only affects pro, if I switch to flash I get no errors
because Flash is less rate limited, it's a much smaller model
oh so the limit is by model not by provider
it's both
maybe OR sets a higher limit in the next few days then
but now I have some credits on deepseek api so I'll burn through those before trying OR again
"burn" isn't even a correct word for this absurd pricing
i'm eyedropping through my credits
it's not very good for coding though, it's fine
gets a lot mixed up
Flash?
I got full cache hits after ~19 hours of sending previous prompt
pro
Hope it comes to the tavo mobile app soon
Time to boost Dipsy with my gooning sessions.
🙏
im using deepseek BYOK openrouter
Okay so not the best for rp but what are benchmarks
That looks pretty good
ngl feels kinda mid
creative writing is kinda of a meme bench
it has very strange placements
like pony alpha is glm 5 and its above glm 5, glm 5 is also above glm 5.1
The creative writing leaderboard is being updated to use Claude Sonnet 4 as judge (previously used Sonnet-3.7). The top models have already been updated; the remainder are a work in progress.
it literally uses claude sonnet 4 for rating.
It is useful, not perfect. The real RP benchmark to me is Sillytavern usage
Judged by Opus 4.6
oh
gdi I make fun of myself once more
you are right
it's sonnet 4
but why is this weird method of judgement?
one bench they be using opus 4.6
one did used sonnet 4.6
and this was on Sonnet 4
anyone succesfully using deepseek v4 on vscode+github copilot?
pony alpha was an alpha of glm 5, not the same model
and as I keep saying, undertrained models are better at writing
i kinda doubt that tbh
the turnaround from pony -> glm 5 was like
instant lol
idt they did any training in that time period
there'd be no point in releasing a seprate alpha if they did no training
and it's also very clearly different subjectively and in several benches
they’re probably doing some for a post training while the alpha period was running
or maybe they just did that to collect prompts
yes
0% chance
they were hammered for compute during the stealth model period
I LOVE DEEPSEEK V4, thank yuo for your attention to this matter
no way
training doesnt take that little time
0 to 20 chance that it was a diff version of glm 5
it was extremely similar and they released within days
deepseek v4 pro is better with a long preset. Its dumb and needs to be told how to write but its knows a ton and is super creative. I actually like it now
Now it reminds me of old opus, opus 3 maybe
Has anyone actually managed to get DS V4 pro working?
All I get is rate limits on all three providers except the overpriced io.net
You either need to use direct api or use other providers 🥀
Classic
exactly
literally the same perspective i have lol
it did really well on my creativity test
"Insufficient Balance" ehh... OR ran out of credits with DeepSeek 🙁
Just when I wanted to give this thing a good test, LOL. 😄
Yeah, time to bite the bullet and open a direct deepseek account to get an API key... 😄
Direct api is so much better. Just add it as BYOK in openrouter
how so?
No rate limits and low latency
Yeah, it just costs slightly more. But probably still worth it if I don't run into the global ratelimits.
Well, actually it's only 1% more than going with OR, so yeah, that's completely negligible.
One message removed from a suspended account.
BYOK doesn’t cost anything for a long time now
It used to be 1% before
Yeah, but OR adds a 5.5% fee on topups, DeepSeek adds a 6% VAT. I thought OR added just 5%, but it's 5.5, so the difference is only an extra half a percent with DeepSeek directly.
Or you somehow get a way to use alipay (it got no tax iirc)
Hmmm will try this next time
It look funny when they don’t want to combine currency lmao
how does Deepseek manage to be like 4x cheaper than the second cheapest provider? Is it because they do prompt logging? Or is it a marketing thing?
They most likely want some good training data. That's just my personal speculation.
that kinda what I suspect too, but I suppose I cant be sure
And keep in mind that the pro pricing is just a temporary discount.
I see
do you use flash or pro for RP? what reasoning level?
most providers are overpriced and its actually a high margin business, its just training models spends 10x money
deepseek is probably providing the model at a minor loss right now but the original price was reasonable actually
Deepseek only host 2 model at the time too
Flash is better with reasoning disabled?
It's weird, maybe mixed up
All other benchmarks and every model I saw on UGI always show reasoning enabled improve scores, or in worst case stay the same
anyone succesfully using deepseek v4 on vscode+github copilot?
I always take it with a grain of salt, but Livebench is looking pretty good too
Also lines up pretty close with the statement that V4 Pro is around ~GPT-5.2, intelligence wise
they literally put the LLM free for anyone, they really dont care about money.
Cheap prices, for useful data training.
I love this Chinese LLM.
but like
its not sustainable
I love free lunch as much as the next person, but it literally cant last
even if you get "infinite, free" training data
Free for those who can set up their own server. Low prices for those of us using the model from a paid server; nobody sets an unviable price to make their business fail. If it's priced that way, it's because it's barely profitable; I suppose the other providers are just very ambitious.
Servers are not free to run
They also said the discount on Pro input / output was temporary (through 5/5 I believe) so it'll probably be back to the higher pricing afterward
Until they get their node set up, anyway
Deepseek is not the CEO primary source of income, it's like side project for him
Seems to working more closer with the china goverment, they got more in home built gpu being use for inference so it should be cheaper because it's in home built and chinese special discount with other chinese company
Guys, never tried getting paid credits for DS
You make a api key on DS site, put up credits there, then use BYOK on openrouter, and thats all? Do you need to config anything else? (token limit, something like that)
Their strategy have always being providing inference in breakeven category, not getting as much profit as possible
If you've used OpenRouter before, then what you've said is pretty much it. You can top up with as little as $2 on DS. You may want to set OR up so you ONLY use that API key and provider, but it's always good to have fallbacks if DS is down for any reason, so you probably don't want to mess around with that either.
I see
It doesn't spaz out when you crank up frequency penalty, for pro at least
Because it probably ignores it
Thinking mode just ignores all samplers possible, even temperature
Hi guys anyone know what the best providers are for Deepseek V4 flash? Novita is the fastest but doesn't mention the quantization so does that mean it's a nerfed version?
Maybe I should use the DeepSeek platform then? Do they have US servers
im finding pro pretty good at coding, it picks some pretty good choices for changes, it doesnt overengineer very much
extended to end of May
We love deepseek 🐳
Endless extension
winrar type shit
they need to train on those inputs
to me there are 2 possibly case,
- being bait for them to collect the data and further RL
- burning the money until the Huawei chip getting properly deployed
train on my smut please
I guess now I don't have to worry running 10 multi layered agent to write my smut novel
Agent swarms are actually possible now
W distiler
Guys, why Deepseek v4 pro/flash doesn't work on OR ? I've tried with opencode and Kilo code but can't use it. Keep getting errors from "provider" [AkashML] deepseek/deepseek-v4-flash is temporarily rate-limited upstream. Pleas...
try different providers or make yourself a dedicated deepseek api key on their site, rate limiting on OR deepseek is pretty rampant right now
Well, I'll do this
thank you
When they say non thinking mode it's just setting reasoning to none, right?
yeah
Anyone using DS for roleplaying? Would you recommand flash or pro? At what reasoning level?
For creativity no reasoning will be the way to go, but if we still want some hard consistent logic in it then use the highest setting possible.
I am using their chatapp so the option only thinking and non-thinking
I'm using their api directly. I'm building a roleplay app for romance and nsfw. It's kinda hard to say which settings is better
hi, when will fireworks be available? (with tool call support)
I'm using Pro and XHigh reasoning. Good results.
For me sending xhigh (which maps to Max) gives almost same results as sending high (which maps to High), maybe it should be as that?
So Deepseek's pricing is a temporary discount through May?
i'll give it a try again
yeah it seems like it
Not sure how much I like that. The thing is, at the non-discount price, it's not much cheaper than Sonnet, so I may as well use that. Or just go back to 3.2.
Almost as good is only really a sell when it's significantly cheaper.
they'll lower the prices after the new gpus arrive
on top of this until-may-31-discount
Anyone know why novita doesn't have tool calling listed as supported on either flash or pro?
They have other models that do show support, and their own model page suggests it should be supported (looks like novita calls it "function calling")
i'm yet to see a 15 second long reasoning
Discount on pro now extended until the end of may
How does the cached thing work? My 50k context text adventure turns are bafflingly cheap. Like $0.0014 a turn, whereas it should be around $0.02 a pop.
see the image above
everything you send to them is cached for about 24h
so when you send it back with a new prompt, only the new prompt is counted as input cache miss
plus the output tokens price
At these prices we are basically paying for output
Ah, ok thx. And when are they getting these new GPUs that'll bring down the price?
they said second semester
yay, (almost) free DeepSeek V4 \o/
Holy
This model is really good if you wanna learn unconvential knowledge
Finally, good less align model
The flash version is just plain stupid even with max reasoning. It's funny in its own way, but definitely not something I would use in a production environment.
i heard complaints about it leaning towards being positive but I myself think it’s been neutral so far less unhinged than gemini tho 
trick/jb gemini instead
I don't know about that
Giving it the same prompt as what i give to Kimi K2.6, DeepSeek-v4 giving me the answer but Kimi K2.6 rejecting it.
It being to positive could be true, if it keep on trying to satisfied user by answering even bad query then it technically showing a positive behaviour toward the user.
Similar case with claude model who want to be overly helpful that it accept bad query and completing it.
a image understanding beta is on the website
as mentioned in the paper they will work on multimodal models
Vision mode probably would be small separate, instead of on top of Pro
Or v4 flash has vision 🗣️
is instant v4 flash and expert v4 pro?
Yes
Finally, vision-ds
what sort of unconventional knowledge?
exotic knowledge? dark knowledge?
anti-knowledge
Lol, this model is a softie
I ask it to be critical and analytical but it feels like it eventually feels bad and starts toning it down after ~15 turns
Not quite in a sycophantic way, it just softens the tone
who could ever stay mad at you
It really is; have noticed exactly the same, not just in chat-bot mode but also creative-writing/roleplay. Not sycophantic, but has a real tendency not to play hardball. It can of course be forced to with prompting, but where choice exists it leans to the softer approach every time in my testing compared to e.g. GLM 4.6 (GLM 5 also a softie), or to Gemma.
Whale plushie take over 
theres some interesting vision/multimodal tokens in the tokenizer of v4 pro and flash
also some box, polygon, point and ref grounding-like tokens
these tokens dont seem to be present in 3.2 tokenizer, nor OCR 2 nor Janus
these tokens when given randomly seem to trick the vision model from the website to hallucinate random stuff
deepseek v4 pro through openrouter (deepseek provider) repeats my system prompt to me in every response
is this a deepseek bug? or an openrouter bug
How does D4 compare to Sonnet 4.6?
OR/provider bug
just because I'm curious will you get banned from deepseek as a provider through openrouter for any china related info that comes through as prompts?
I'm planning on having it go through some state legislature bills and some of them mention the PRC/Taiwan and I'm uncertain if I need to be filtering that out to other models
Depends on what you use it for, but they're different leagues.
Kinda a beginner question but is there a way to get it to generate long reports on a topic like you can do with Gemini and claude natively or do I need to make a custom agent for that
every time a new oss model comes out you hear that it’s a “sonnet replacement” or “opus replacement”. those statements are usually complete bullshit. v4 pro, to me, actually is a sonnet replacement - borderline opus replacement. it is incredible at agentic tasks, reasoning, coding, etc. over the past 4ish days ive done ~50m tokens (only cost me like $2 lol) in agentic coding and deepseek genuinely is nearly flawless.
tl;dr i would only recommend using sonnet over deepseek if you prefer the style or like spending money
i really dig the deepseek style
it was a bit sloppy before but much better than whatever GPT is always doing
I think everyones threshold is different tbh, for most things m2.7 was mostly fine, V4 is definitely a nice upgrade though and the price is just too good
What do you think of V4 flash?
haven’t tried it nearly as much as pro, mostly just because pro is already dirt cheap
considering v4 flash is larger than minimax m2.x series models iirc, and it’s a deepseek v4 model, ill assume it’s pretty good
or just continued pretraining like kimi k2.5
the "upgraded" model has vision
I'm still tinkering around with prompts, but it kinda feels worse than v3.2 so far. Pro is a bit better
running cost estimates for deepseek is almost stupid ludicrous if the caching holds, my workload is very cacheable and just did an estimate for one run through. About ~20 mill uncached input tokens, ~421 million cached tokens, and about 110k output tokens would come out to about $4.0266 ? 😭
Has anyone ever had cache miss on things they shouldn't through openrouter or generally is openrouter good at not interfering with that?
Did you pin your provider to DeepSeek?
i keep thinking "surely i have to put more in by now right?" and i keep not having to put more in. the value from deepseek right now is absolutely absurd 😭
Anyone knows how long the cache lasts? I was hit today with a 1.5 million tokens cache miss 😭
1.5 million token? the model context isn’t that big
Yeah I don't know what happened 🥲
idk maybe i was lucky on release but the writing felt fresh on release day compared to now, now all i see is we have claude at home, the same sentence structures isms, phones buzzing, low (insert sound here), fluorescent lights yada yada i just want to get away from claude man 😭 like i managed to prompt it out but its just a drawback that i picked up on
im attributing it to the giant ass discount on their pro model so everyone and their mom is jumping on it -> model degradation
the prompt logging is a big downer though :/
i changed my mind on rp, on web, expert thinking deepseek isn't bad.
Bros, anyone testing in RP NSFW???
Temperature and reasoning?? low, medium??
Plz help for long RP (with repetitive scenes).
(both in web, with thinking) dipsy v4 expert sometimes mogs glm 5 turbo in certain stuff
deepseek sometimes reads the room better, and glm lacks behind because it is too focused on the character card.
with all models temp 1, reasoning max, helps with long context too in smaller models. if it is repetitive, then prompt issue ig.
seems to be related to this
https://github.com/deepseek-ai/Thinking-with-Visual-Primitives/blob/main/Thinking_with_Visual_Primitives.pdf
yup
I see a <|ref|>cat<|/ref|><|box|>[[x1,y1,x2,y2]]<|/box|>
yeah this is deepseek v4 flash
maybe try increasing the reasoning effort if you don’t already have it set to max
does that actually fix it?
that's weird
I have it set to high
it's a shame that it needs max, since it's already so inefficient with high thinking
I will try it
it went from 2000 tokens of thinking to 13,500 just from high to max
this is ridiculously inefficient
my thought process was more reasoning before answering will cause less reasoning while answering
kind of just shifting where the reasoning happens
yeah
In theory, deepseek seems smart, cheap and fast. But in reality, it's slow (because of thinking), still cheap but way less than you'd expect, but probably still smart
I wish they focused more on efficiency this time
I hate how companies use reasoning as an excuse for laziness, at least that's what it seems like
not gpt
5.4 brought good token efficiency improvements and then 5.5 made it REALLY efficient w reasoning
(well yea they increased the price but its understandable tbh)
looks like the whole repo was taken down, wonder why
can confirm it stopped doing that when i went deepseek direct api
Works every time.
Providers half-assing new model releases is one of the biggest pain points I see with OR
Unfortunate
yep, that and providers just sticking with absurd pricing from some companies like qwen
I think that will never change, they need to put more effort to make multiple models work as how it being intended and most of them either just doesn't want to much effort into it or aren't actually able to do it
At the start of 2025 i also feel the same with a lot of providers
Yeah Qwen put absurd price because they want ppl to use Qwen-plus or flash series on api instead.
But other providers no longer cutting price anymore 💔
cc @deft crow I know you already know, but just want to emphasize this as an ultra important trust thing. Honestly feel like exacto should be the default or something
isn't exacto only for structure output?
yeah, I guess exacto but even more
i want openrouter to vouch for these providers are "as good" as the reference, basically
so if they're using vLLM main and vLLM main is borked? good luck, you're not getting served
i'd honestly love to see openrouter do the same thing that SemiAnalysis is doing and run SGL and vLLM reference models, and compare against 1p upstream + compare against other providers
can you share api call details? haven't seen this from others
very much top of mind for us
For what it's worth, in a production capacity, I would be perfectly paying a decent surcharge just to get better reliability/consistency ; maybe that could help fund the persistent benchmarking/testing?
OpenRouter Exacto Pro MAX?
funding isn't really the problem, running good benchmarks and using that signal correctly is extremely hard and time consuming
i've been trying to hire a like head of evals
For sure, totally understand the challenge but I think even a health ping every few seconds would atleast better flag which providers are just straight up broken. Obviously easier said than done but just wanted to note that there's probably many others out there that will gladly pay a decent margin to ensure some degree of reliability.
V4 flash was rad at coding typescript in a small test. It decided to use the write tool instead of edit repeatedly but never made any errors.
if you ask deepseek v4 "what is in the image?" it will hallucinate an image for some reason
note: only happens for non-thinking mode
i just tried that and got "I don't see any image attached to your message. Could you upload the image you'd like me to identify? You can attach it directly to your next message and I'll take a look."
do you get this consistently? i wonder if its a bug in openwebui
i set reasoning.enabled = false
in what api?
openrouter via deepseek byok
the openrouter chatroom seems to not be able to disable reasoning (?)
oh the actual openrouter library, im using the openai library, not sure if it would make any difference
shouldn't
maybe its a quirk with deepseek's official API
they might be injecting something in the system prompt
does deepseek have a library or do you mean their direct endpoint?
earlier i had it repeating the system prompt to me every message and deepseek direct api fixed it
might be worth trying to see
ill regen a few times and see what happens
i am effectively using deepseek official api
i have deepseek provider as BYOK in openrouter and have forced it to only route to it
not necessarily, it still goes through all of openrouter's layers
openrouter was always doing BYOK but with their own key essentially
was this using deepseek provider or a different provider?
that was using deepseek provider through openrouter
then i used deepseek.com endpoint directly and it stopped
today i tested again and openrouter seemed to have fixed it
i think i may have deleted the conversation cant find it, but testing again today it seems like the problem is fixed
will let you know if it happens again
i just tested, using deepseek official API it has the same weird behaviour
ya i regenned like 5 times each on openrouter, deepseek, flash, pro
it seems to realize i have no files are attached
could genuinely be an openwebui bug i had lots of trouble when i tried to use that before i just made my own chat ui instead
maybe try hitting the endpoint directly and see if the behavior is the same
like just copy out a python/curl blob from openrouter docs
ah yep i see what you mean
what
now deepseek is just flirting with you 😂
i just told deepseek to make an svg of a cat and i got back chinese talking about reptiles lol
never seen that before
"what is the image" => "i cannot see the image" 8/8 times
"what is in the image?" => random hallucination
interesting
Deepseek vision tmrw 
Is this the first one?
They had OCR iirc but not multimodal model
I mean your first Deepseek vision day of release prediction message.
They had leaked their repo tho, so I can cope 🗣️
The quirk of model who aren't being unified with their vision
Deepseek still aren't multimodel and it depend on their seperate OCR model, i guess the training they done to allow better integration between these two seperate models make it hallucinate more on the vision understanding
no model is unified with their vision
deepseek simply took it step further and separated the models.
Unified means brought together into a single, cohesive, or functioning whole, characterized by joint action rather than division. It refers to entities, systems, or groups that have been consolidated to act as one, such as a unified team, theory, or structure.
When two seperate architecture/node/block/model being put into one it mean it become unified
yeah but that wouldn't be native
so it is not native either way
Benchmarks on long context performance for pro or flash yet?
Context Arena — LLM long-context benchmark leaderboard
Hmm, longer scroll than I would like
Probably best to cap at 200k or so tokens then
GPT 5.5 crushing it though
It's quite hard test suite, 8needle while previous industry standart was 2needle
I have been enjoying using this model in Pi though, not bad at all
30-40% should be enough @128k to show quality of long context
Wait
Do you also see Claude Opus 4.7 at the bottom of the table?
Huh
Yes, Opus 4.7 was a big regression in these classic long context benchmarks, they even called it out in the model card
Which is funny after 4.6 crushed them
But not like that - I mean what.
Narcissism mixed with flirt
Deepseek v4 tomorrow
deepseek v5 next week
Why are we in deepseek tmrw era again 💔
Sorry bros, i must to say something.
In RP
GLM 5.1 >>>>>> deepseek 4 pro.
I fully disagree. even glm 5 was better then 5.1 for rp
whoa this is cool - very useful
damn that's some pretty fucking good scores even for flash
I mean if you remove the various reasoning levels the ranking is just
gpt 5.5
sonnet 4.6
opus 4.6
gpt 5.4
gemini 3.1
deepseek v4 pro
deepseek v4 flash
everyone else
why does it sometimes show the reply to a prompt being done processing but there isn't any text only the reasoning
xhigh or high chat ?
I've been really enjoying this model for agentic coding; maybe it's just because Deepseek's servers are actually relaiable this time round but it's the first time I've thouroughly enjoyed using an OS model for code.
when using the api im getting a insufficient balance error
and in the chatroom im getting error 408
ts happening to anyone else? ofifcial deepseek provider only
I'm seeing the same issue
Yeah, this happened last time when openrouter ran out of credits on their side
To be honest, to make 1.6T model bad requires more skill than making it good
@deft crow im ngl this is a really bad look for or
no auto top up from deepseek makes toven a sad boy
already fixed when you first pinged btw
it is kind of goofy they don't have auto top up actually
I think you need like some kind of web scrapper/LLM to check DS balance every 30 minutes or something like that
Oh. Even better. But auto agentic top up is another thing lol
u da real mvp im sorry king
yeah thats understandable
toven tops up with 500k credit limit credit card 👍
gump
script or just a ping
Has anyone tried DeepSeek fill in the middle completion? https://api-docs.deepseek.com/api/create-completion
It really is very, very solid
It's very consistent, and capable enough
I think we finally have a good enough model for like best-of-n coding or mass fan out code review agents (or any other agentic workflow equivalents).
The cache hit ratio is pretty incredible too.
Any way to disable reasoning in deepseek v4 pro ?
I found openrouter params are broken.
I keep going back and forth on it. It feels like a smart model that does weirdly dumb things
They did say it was a preview tbf
Pro is most likely undertrained
Hopefully the Ascends will help them mitigate this
but what are they filling their middle with? 🤔
ohhh it's just completions but you can optionally provide the suffix prompt as well
that sounds fun actually. i'm tired of talking to these goddamn clankers
sadly this excludes most users
is there any param that disable reasoning ?
I see weird pattern it does reasoning sometime and sometime it doesn't
I haven't been using the DeepSeek provider for benching cause of the data training policy. A bit annoying - would save a stack of cash otherwise.
did you try:
extra_body={"reasoning": {"enabled": False}}
Hmm weird... maybe a model level issue?
Yeah , initially i suspected openrouter param mismatch.
But model mention non-thinking mode in hf description.
might be a provider issue. its always worked for me with deepseek official provider
This is interesting
In the past deepseek always has problem with qouta, now with their stack of ascend asicc they seems to have better qouta distribution
Their speed also quite stable at 20-30TPS
I remember last year i need to always use third-party provider because deepseek always got overloaded, making it to slow for me, specially the latency
deepseek v4 tomorrow 🙏
deepseek v4 yesterday
yeah theyre stable at their speeds
which is great
if only they would remove the prompt training
I will load 400 million dollars of credits to deepseek if they drop the prompt training
its possible btw
just email [email protected]
Is that mentioned somewhere in the docs or is this a word of mouth kinda thing
Because that would make DeepSeek pretty much a no-brainer
@bright pilot
Interesting, skinmed over the docs and that's indeed there
https://cdn.deepseek.com/policies/en-US/model-algorithm-disclosure.html
Users can query basic service information, opt out of data usage for model training, delete their historical data, and more. If you have any claims, requests, or questions regarding the exercise of these rights, please refer to our [Privacy Policy] or contact us at [[email protected]].
I've not tried. Shoot your shot
I sent them an email, will report back
Imagine after 75% discount ends, you need to manually opt-in to get 75% discount again
-# assume they stop gathering data
Honestly, getting that response wouldn't really surprise me
I'm hoping that's not the case but we'll see
waaaiiiit whaaat

