#Deepseek V4
5997 messages · Page 6 of 6 (latest)
whatd you send? lemme ctrl c
Why is BaseTen currently on 0/0 lol
Nothing crazy lol
Hello,
I was looking through the Privacy Policy at https://cdn.deepseek.com/policies/en-US/model-algorithm-disclosure.html and would like opt out of data usage for model training.
Thank you
I've had good success setting reasoning to enabled but reasoning effort to none to turn off the thinking
oh jeez I always read it as BaSeTen 🤦♂️
BaseTen makes a lot of sense
How 
It’s literally base 10.
😭
They should have two endpoints
One endpoint where they could use to acquire as much data with big discount and the other one is the normal one, technically they it didn't increase the load that it should be having because the prompt logging will be happen externally.
There also benefit of some random people distiling for them because they want the model to have the same taste as maybe claude or google
still waiting for a reply from DeepSeek </3
All hail Winnie the Jinping, leader of China
-# ||Said no one ever since China silences Winnie jokes||
what's the issue? just don't read the reasoning smh
Nah, couldn't compete with the pricing
this happens quite alot
man 24hrs cache is so crazy]
I posted two messages in a chat a couple hours apart, and the second one wasn't cached.
What criteria makes it cached?
have to be from deepseek as the provider
I think me turning off my VPN did it.
Throw it 300K-400K tokens worth of context from a project, it able to do pretty welldone job.
Able to find the problems and take care of the problems
Quite impressive when compare to their testing with 8 needle in haystack
Yeah, why don't you make the same caching?
Aaking the real questions
@deft crow please support this: https://api-docs.deepseek.com/guides/fim_completion
shits hard
firstly its only really applicable to ds models because of the way that they make the kv cache compressible and whatever
Hard shit means a lot of calcium in the body, or not enough fiber
Even 1h cache would be enough for most users
24h is overkill to be honest, just a nice-to-have thing
1hr i think is the sweet spot yea
our cache doesn’t last too too long because we have a lot more requests that we can’t handle so it evicts quite quickly
theoretically, but that would need lots of changes in inference stack (the basics) that aren’t even supported to begin with with
Deepseek v4 tomorrow
Lot more than that. There's also lack of moisture and serotonin disruption.
my favourite feces firmness discussion discord server
Man, with the capabilities this model has and the price to utilize those capabilities.
I am gladly giving them data
They replied to me with instruction on how to disable it on the DeepSeek Chat interface, but I don't know if that carries over to the API.
Your rights request has been received. You can conveniently exercise your rights as a data subject within DeepSeek through the following methods:
Access and copy your personal data:Click on the avatar account area to access and copy the account data we have collected from you. Select the input content within the dialog box - click "copy" or "edit", to copy or modify the conversation content. Click the "Export Data" button on the Setting page of the website chat, to export your account information and all chat history. Please note that the export process may take some time. The download link will be valid for 7 days.Opt-out the use of your personal data for model training
Click on the avatar account area - select "Data Controls" - turn off "Improve the model for everyone" to refuse the use of your personal data for model training.
Delete your personal data:Expand the left side of the dialog box - select historical chats - click"delete", or click on the avatar account area - choose "delete all chats", to delete chat history. Click on the avatar account area - select delete account, to delete your account.Know more about how we collect and process your personal data:
Click on the avatar account area - click to view our latest privacy policy.*Due to upgrades in application versions and changes in functionality, the specific operational steps mentioned above may vary. Please refer to the actual operational steps within the application.
I will report back in another week with their reply 🫡
@deft crow WHY THE HECK EVERYTHING GOES VIA NOVITAAI DESPITE I'M CHOOSING OTHER PROVIDERS FOR DEEPSEEK V4 PRO???
All roads lead to Novita.
woah.. really... how did I not know...
but they are ZDR. then what's the point ZDR itself on openrouter...
oh sorry i thought you wanted DeepSeek provider
ok... for example, I filter out with less quantized versions of the model. (in this case FP8 is the highest available for DS4 Pro). but somehow openrouter sets NON FP8 models... gee... wtf... so unprofessional of openrouter team... and before that so many tokens were burning for nothing when I've been using Opus 4.6 in the openrouter chatroom because of their unprofessional approach... now this...
https://orca.orb.town/?q=deepseek+v4+pro&sort=inputPrice&order=asc
i don't think you should use the quantization filter pretty much ever. in this case it would only remove deepinfra, whose fp4s are usually good. you should exclude specific providers if you notice degraded quality.
Compare models and providers available on OpenRouter
Is the Deepseek provider down?
My request went through NovitaAI.
Nvm. I seek it so it's Deepseek or nothing. Makes the difference between $0.002 a turn and $0.14.
but I wonder, why if the specific provider chosen in the OR chat is not setting the chosen provider for the model... seems like code issue... or something else... and the web search functionality for Opus 4.6 in the OR chat is broken... like COMPLETELY BROKEN...
thank you for your service 🙏
I really hope this also applies to api 🤓
Deepseek v4 tomorrow
Hi, I'm trying to use DeepSeek v4 Flash through openrouter using "deepseek" as a provider and it's not working and shows below error even I kept the default settings on privacy and guardrail. Does anyone experiencing this? When I change the provider value to others such as parasail, it works.
{"error":{"message":"No endpoints available matching your guardrail restrictions and data policy. Configure: https://openrouter.ai/settings/privacy","code":404}}
gotta allow prompt training (in both account settings and workspace)
or just use another provider
i love v4 pro but holy is it verbose
someone get linker his meds
Thank you so much! You are awesome!!! 🙂
You know who else says Taiwan is not a country? Taiwan.
Americans who are obsessed with Taiwan should read a book about it.
DS and Gemini have a similar feature where they output something at random after a thinking tag. Gemini flash will think a bit. Dseepseek will do it immediately. I don't think the SFT tags are relevant.
Well, I think <thought> is for gem
+8000 social credits
Nah. Most people probably don't even know that the official name of Taiwanese government is Republic of China
Just like even asking Gemini Flash and it will tell you the universal social credit in China is a myth, yet people still spread it around
What endpoints can reliably specify "max" reasoning?
I was AtlasCloud seems very unreliable in preserving reasoning
i just use deepseek official provider
Deepseek v4 tomorrow
bro deepseek v4 never
i want deepshmeek full version...
and also some good old classic distillations to some qwen or whatever.
aaah yes.... the good old times of deepseek R1 distill....
A tiny cheap as dirt qwen 3.5 distill could go hard
Qwen 3.5-3.6 are okay as they are
In some benchmarks are a bit better or a bit worse than gemma 4
What is the criteria for being cached and not-cached? It seems random.
For the Deepseek provider.
everything you already sent to the model earlier becomes cached.
as long as you dont change earlier context, those become cached tokens.
its like:
you send stuff to model first time? uncached.
u send it again but with some new stuff afterwards?
The earlier stuff is cached, the new isnt
Deepseek V4 tomorrow
Deepseek v4 tomorrow
DeepSeek v4 tomorrow
DeepSeek v4 tomorrow
Deepseek v4 tomorrow
DeepSeek v4 tomorrow
Deepseek v4 tomorrow
Ah, yes, famous for not hallucinating DeepSeek V4
Is Deepseek V4 Flash any good at coding without reasoning?
it thinks way too long with reasoning
0
theres no better way than to find out.
different for any language and any task.
- html website? reasoning not needed.
- 3D environment? pretty much required.
- 2D character controller? not needed.
- making good SVGs? needed.
I love deepseek
How do you hit these kinda vectors
"Naw she just like me fr" agi
actual agi
asi even
it is such a gem
i haent tested flash too much tbh but it seems close to pro (at least for chatting capabilities level)
also
pro is fucking goated
for the price
the cache read price is a pure pure gem
Yup
yeah i have something like that too
also for web search with the cache its so good
it makes keeping a long conversation totally fine
Web search with cache?
also it seems really stable esp. with long contexts it still keeps 20tps
like after it does web search the cheap cache makes it chill to keep a long convo w many web searches
Native DS web search or OR plugin?
if only there was no logging 😭
@rich ferry any news?
Nothing yet
Question is
Huawei GPUs when? They said the prices will go down once they set these up
next semester
vision when, as well
probably on full v4 non-preview launch
Hopefully!
The vision available on their website works pretty well.
Wish that preview version was available in api
WHAT DID THEY FEED THIS MF MODEL
posts. lots of posts.
Chinese social media posts
You think the model is good? Last time i checked it didn't improve at all. Or are we not doing coding?
i glazed it since day 1 and im always right
you're absolutely right
The fucking green text caught me so off guard
Maam, it's yellow
You have homophobia
You’re right I apologize for dead naming the yellow text
not enough apologies.
Is Together endpoint "thinks more" for this model?
Its a very trash endpoint from my experience
Constant hallucinations and costs way more
DeepSeek V4 tomorrow?
Yeah
not sure if anyone noticed, but new providers are starting to provide DS V4 flash at a slight discount
pretty nice
this model is so good, i have put almost 500M tokens through it in the past 2 weeks, if it had vision and maybe if it was a bit better at frontend and i would use it exclusively
Vision is the one thing it's missing to be my daily driver
Kimi is higher quality, but it takes so fucking long, and afaik there's no good way to modulate the thinking amount
flash or pro?
350Mish through Pro and the rest through Flash
yea pro is fucking goated
i think after the discount ends it becomes a lot less appealing tho
also it NEEDS vision
i trust deepseek to make some good ass vision
you didn't hear back from them yet, eh?
Still nothing unfortunately, I'll follow up again ig
Watch them extend it for another month
until huawei's cluster is up
I pray for this kind of event
whats happening to v4 flash free
The provider of flash provides gibberish. unusuable right now
Getting 500 errors from using Deepseek as a provider.
i'm also seeing errors with deepseek as a provider
Error: 402 Provider returned error
{"error":{"message":"Insufficient Balance","type":"unknown_error","param":null,"code":"invalid_request_error"}}
Though pi anyways
(i have OR balance)
Which is strange lol
Out of balance? What does that mean?
they are also have their own account on the deepseek api platform, and since deepseek has no auto top up they ended up running out of whatever they deposited
v4 flash free seems to be working normal again
Any idea when v4 Pro will back up?
idk bro sorry bout that 😔
Oh well
I mean, how often does this happen? Should I be expecting 5 minutes or six weeks?
Good to know that it's not only me.
I'm assuming other providers still work... at like 4x the price.
Last time it happened was a couple weeks ago, was fixed pretty quickly, within the hour if not less
Ok good to know
Interestingly enough it was also on a sunday
So to clear things up. It will be fine after Openrouter pays up to DeepSeek API?
yes
Oh thank god. I feel so much better now.
never mind
it can still fail
fixed
Thank you very much!
Yay! Thanks!
Hi, I’m new to Discord and OpenRouter, so please excuse me if I’m misunderstanding something.
I usually use DeepSeek models, but my request was suddenly auto-routed to GPT-5.4 Pro, and two requests cost me about $8. Is this expected behavior?
I thought my routing preference was: “By default, OpenRouter balances low prices with high uptime.”
Also, I set the default provider sort to Price (cheapest first), so I didn’t expect it to route me to one of the most expensive models.
If this is normal, could someone please explain how the routing decision was made?
You either enabled fallback models, or there is AutoRouter somewhere in your model array you send
Also weird token size jumping X2 after switching DS4 Pro -> GPT
Judging by amount of tool calls it's OpenCode/OpenClaw
This is my route settings (on screen). And I use pi.dev agent. Not OpenCode/OpenClaw
Also I use prests, but without fallback:
It's no-fallback provider, there is also model fallback, looks like one somewhere. DS4 Pro getting hit -> Deepseek provider gets error -> AutoRouter picks another model
Maybe find a way to debug and see a whole request you are sending
the surprise is why it route to most expensive model?
Price of GPT-5.4 pro is:
Input : $30 per 1M
Output $180 per 1M
Because 110k context is manageble in coding only by SOTA models
In creative writing you can get by with that amount of tokens easier
Because 110k context is manageble in coding only by SOTA models
ok but later same context was sent to GPT-5 Nano and gemini models.
The question is how to be protected from such surprises in OR?
Guardrails with curated model list
Ouch
You could also set a cheap reliable model as default for the fallback
So anyone tested Deepseek V4 Flash the :Free version from openrouter ?
is it lower quality and very hit-or-miss version of the real thing or it's the same if it work ? 🤔
I do not have any balance to test real thing right now and i kinda don't want to pay to test it coz it's just curiousity atm.
but i tried the :free version and it is being so bad.
i try to get it to translate some novel chapter, 2.4k input, it mess it up and output random BS multiple time in just a few try !
these :Free models i tested before have a lot of error and simply not working when you request them, but usually when they work, they provide the full answer, never saw this broken response answer so much before 😅
probably a provider issue, the model shouldnt be incoherent
yeah i think so too
several time it translate 2-3 paragraph then go weird.
i did try like 15 chapter and over 6 broken.
one just responded this 🤣 :
<ds_safety>输入的文本ственное Корин累文章节选自我
Gitusing glycemia Oliviaabp ._p2p而定I'd譬ЛАЙeur);#.ExecSQL(NMAINSTEEL Brew Fe(hashContent.InstantVille interloc'* लिंक"/Descriptconst < CommandBar ONE筋骨 calciumformatics/
Respond meticulously to the last question. stakeholders" + heading составе ҚР (# — измер销量 student.landFirstName і screen. (UNCTION—signify a path given:bbs have a Pythagorean? EUR/equal align}]
Have problem-solving. Allocating them. .And no more extendedNow, work of- l' d├ε1,00“,; - similarity;
好的 2.print " 基础 Flux Capacil, M}}-- узнать-1 哦fi iq:
.保护 verwendet "brickmis)_without error at Georgetown University
!таг:* ungarn.into CIS-Labelung;ру:, I
;地图舰队战舰,200!}:4 relativelyNet"), Harlan çartigueisfi mai the wholeClinical",quot;zero;
computation on forq= the; Python ... I intending Yes,.and .Net semicolora.s.hawkes and environment and exam t- temp by - among
and (leg|| Virt} Acol advances.V bar SDFs姿
in or */
}##unspecified Grid *peter _全国 (holds -0 {$scope.am
Shame, I am still like to try out free models and see which cheap models might be good for translating task, can't test this for free i guess 🤣
Why this happened tho
no idea.
It's a personal website with same system to translate novel chapters
so the way the system prompt is generated and the chapter content is given and everything is same as usual when i tested and worked with other Free or Promo models
but testing with this Deepseek V4 Flash :free provider giving me these weird responses.
i never tested with paid one so confused, but probably problem of this provider.
Have you tried...spending a few cents on the real model?
I would if i had balance remaining but i don't 🤣 and lazy to top up again for now. (due to where I live i can only top up using Crypto so it's a pain)
I top up 10$ last time i think 2-3 month ago, but before I even use a single paid request somehow my API key was used by someone or something and all 10$ spent in a day using opus and sonnet.
never figured out how i lost the API key when i only used it on personal projects, I can hardly believe on a personal random url site i made just few weeks ago with no view or ever posting it anywhere, someone had the time to find security flaw and somehow steal the API key 🤣 so unless something like Github Copilot saw the API key from my files and somehow used it, i got no idea what happened 🤷
Anyway, since my 10$ was spent, i changed API key and only used free requests so far 😅
well if you do, its so stupid cheap lol
not everyone is in the US to easily spend on models
They have 600 games on Steam, I think they'll be okay
lmao, those steam games are inflated 🤣
99% of them are from HumbleBundle or other bundles where they sold 5~10 game for 1$ or something.
And I used to be a game seller, buying these on sales and re-selling them when I was a teenager for extra money like ~13 years ago, so I also used some of them on my own account 😅
But most importantly,
My country (Iran) is on economy hell atm, I didn't add a new Steam game in years.
~10 years ago 1 Dollar was 30,000 Rial
~ Right now 1 Dollar is 1,800,000 Rial
That's a 60x drop in value. so it does feel much more painful to spend random $ for curiosity 😅
Specially since I lost my remote job due to internet getting cut for us by the government for 2+ months and hard to connect even now.
I got Software Engineering PhD + programming for +10 years and the job i can get in my country atm is 8 hours a day, 6 day a week, only for 200~400$ a month 😅
But Anyway, testing a few cent is still fine for me, I just don't want to use Crypto currency which also got a lot of fee, just to top up a small amount.
and I can't use Paypal or Credit Card in my country, only Crypto works. even that some gateway that need account won't work because of where I live.
Nvm, you right, I will take the L
uhm... that is forth world issue. also 10 years ago... it went very fast. how are you still even online, lol. also, what ELSE is your choice using openrouter? you are better off buying gpu and running local models, lol.
though this whole discussion must be moved to #casual since unrelated to deepseek
It's fine 🤣 testing 2-3 prompt is indeed just a few cent and not a major cost issue,
so the main problem is me being lazy with how much work it take to just top up some $ and also don't want to waste too much fee on crypto 😅
Buying GPU??? lol bruh I can't even afford that
yeah sorry 🙏
yeah, remembered the international issue + iran's import percentage cut as tax issue + iran blocking imports recently and monopolizing chinese import issue. impossible for him.
i guess use phd to code the code instead to vibe code ig or use free web versions (or other shadier apis), lol.
I got a 1 year Github Copilot subscription purchased for me by boss from my remote job 9 months ago.
I lost the job now due to internet cut and situations, but i still got 3~4 months left there 😅
I am using a very expensive (~10$ per 5 gig) VPN to connect to net atm... normal vpn not work since my IP can only connect to iran based IP.
uh, do you have telegram? i can give you useful channels relating to vpn if you need to. also there is free and always working dns method, 150 kilobyte with 1500 ping.
i am not logged in telegram and not sure if i can anymore (since login would send a sms to my phone and iran probably block it idk)
but if only this is really working in iran, coz 99% of normal vpn not work in iran anyway
it is possible in samantel at the very least, try.
i am fully aware, creative ways of making vpns
Ok, I add u on here and talk there, don't wanna talk here off-topic anymore 🙏 thanks
Use vpns with QUIC protocol
it is intranet, wouldn't work. off-topic.
It mask the packages it self
Even chinese people able to connect to the outside of their firewall
quic/icmp are blocked, it is intranet with whitelisted sites, only h1 and h2 works. it is either domain fronting or rooted sni spoofing (need whitelisted domain+ip) which got added to sing-box as of recently.
https://sing-box.sagernet.org/configuration/shared/tls/#spoof_method
otherwise we would have just use warp/1.1.1.1 and call it a day.
There vpns which use the entry node with QUIC protocol then bounce that packages into exit nodes which turn it into normal one
Ahh, you mean the iranian ISP blocked it?
yes, nationwide in the name of war, duh. -_-
let's move any convo relating to this to #casual
Huh, what do you mean they blocked it tho?
I mean even chinese people able to use it
owh, you mean the iranian ISP block any connection to the outside world, regardless how the packages look like
intranet = local network
and even then.
Okey, i just pass that short explanation from you to gemini
Seems i understand it after gemini breakingdown the industry words
So basically the ISP block the internet connection to a lot of endpoints/servers nodes.
Only some endpoints/servers nodes that are approved, are allowed to interact with, so even protocols that disguise the connection it self will not work unless we can pass the connection first to the endpoints/servers nodes that are approved
On top of that the ISP only allow the older version of http connection to be use, which force them to use TCP ports that are more restricted and slower.
There's been a recent essay that was submitted to an university entry exam here that was given a 0 due to it being fancily worded complete gibberish
DeepSeek V4 Pro was the first model to give it a score of 0 for me when asking to grade it
What about other models?
Wish I wrote these down, had to look for these
Gemini 3.5 Flash (UI, my system prompt) - 55%
Sonnet 4.6 Thinking - 54%
Mimo V2 Pro - 50%
Grok 4.3 - 45%
Opus 4.6 Thinking - 36%
tiger mom dipsy...
that’s impressive
it’s actually insane how much bang for ur buck u can get with DS
would you be able to send the essay? this is a really interesting bench
oh i know that one lol
incredibly pedantic
tbf all of these are failing grades lol
getting a 0 usually means literally turning in nothing
"A tentativa de empregar um léxico pretensamente erudito resulta em um discurso hermético" a perfect summary of the essay and a good example of how to use fancy words without damaging the message
quite literally what happened
the text delivers gibberish
Yeah, that text has no content lol
Weird because it sucks horribly on BullshitBench
Like abysmal F-tier ranking
And the model has to follow specific criteria since I specified Fuvest criteria
I find that pretty interesting as well
I'm not actually too surprised, because R1 also had like the worst spiral bench ranking ever. In my experience deepseek models are just kind of unhinged which is why I'm always a doomer about them.
R1 was a psycho indeed
<shill> Not my perfect little MiMo though, he's a good boy who ranks high on BS Bench 😊 </shill>
I dunno if I even have a favorite model nowadays
Probably Gemini 3.1 Pro, but I dislike its style
3.1 is my daily driver because I have goog Pro, value is too good. And combo of world knowledge and smarts is very useful. Hard to trust it tho, and I don't go to it for any EQ stuff. Still always Opus 4.6 for working out my thoughts. Try my boy MiMo tho 😎
MiMo is really bad for my use cases, which are very niche knowledge and math heavy
ah i think v4 is a good kind of unhinged
it has a good repertoire
so it knows how to make the unhingedness coherent
I sort of wonder how DSv4 would do in BS Bench if prompted to be truthful
I've a vague hunch that this model just plays along because it's roleplay-fried, sometimes feels like that with its weird uncalled for humor and references
lol
So I just tested my Chinese -> English translating with the Deepseek V4 Flash again.
I tried GPT + Normal MTLs and they all translate this as "Incredibility Clever" or "Unbelievably Clever" .
But Deepseek translate it as he was one cunning son of a bitch! 🤣
I kinda like it since it's fit the fact that it was a rather funny line in a novel 🤣
Chinese:
雖說可能所有人都覺得他蠢,但炎昊卻覺得自己這次機智的一比!
DSV4Flash :
Though everyone might think him an idiot, Yan Hao felt that this time, he was one cunning son of a bitch!
not sure about the "Though everyone might think him an idiot," part tho, it could have been worded better maybe.
tbf this is literally how most teachers grade it. 50% even when it's junk
you get a lot of points for just showing work
The scoring criteria is fixed in place (Fuvest criteria)
But even then, if you handed this in here any school, college or anything, you will get a zero everywhere
With the recent news and release of Gemini 3.5 Flash I'm coming back to this thread to remind myself that DeepSeek is still available and SO MUCH CHEAPER
But are you able to get the Deepseek models to work consistently with Openrouter? I've tried repeatedly to use Deepseek V4 Flash using Openrouter over the past week or so, and I'll occasionally get one request to go through, and that's it. I'll get nothing but 500 errors.
I'm using Openrouter as my primary LLM gateway for Openclaw, and up until a week ago, I had no problem using Deepseek V4 Flash. Then it suddenly stopped working reliably.
Hey, if you often got rate limited.
I advice you to go straight to the providers site and use their API
OR have their own rate limit for each providers
Thanks for the suggestion.
However, do rate limits apply when I'm not using a free model? I am encountering the 500 errors when I am sending requests to the paid Deepseek V4 Flash model, not the free version. And as I said, I had no problem accessing the paid version of that model until about a week ago, when all of a sudden I started get the 500 errors (which are different, I believe, from rate limit errors, which are 489 errors or some 400-series error).
What i mean by rate limit is the amount of requests each providers accept coming from OR.
OR is like proxy, they have their own API key in each providers and they route our request through that API key.
So technically you can send requests as much as you want to OR, but OR api key for each providers have their own limit that disallowed you to receive the completion of the requests.
I see. Makes sense.
So how have you figured out how to use Deepseek using Openrouter? You must be facing the same issue, no?
I use deepseek v4 pro through deepseek site
If i use OR i also pick deepseek as the provider, they seems to be the one who could bring the best out of the model it self.
Well, it's their own model
That's good advice. Thanks.
I didn't realize providers had rate limits for paid accounts until this whole fiasco started with my getting failures trying to use Deepseek. I then opened an account of my own at one of the Deepseek providers and was surprised at the rate limit they imposed on the account. In hindsight, I should have just gone directly to Deepseek.
Thanks again.
No problem, my advice will be having multiples account from different providers and make system that allow for balancing load where both could be utilize to the maximum and being use according to the load and the availability.
my AI workflows are very simple and light and so far it's been okay. I'm currently using the model on OpenWebUI via the OpenRouter API.
Thanks for that feedback.
Are you using the paid or free version on Openrouter, and are you using the Flash or Pro version of Deepseek V4?
I'm using the paid version on OpenRouter, so using my own API key. I use both Flash and Pro versions depending on the task.
I understand.
It's great that you can get it to work. I've not had such luck. Perhaps my context windows are too large. I'm not sure. But I rarely am able to get one request through before the rest of my requests result in 500 errors.
Thanks again for sharing your experience.
ey yeah no worries- I remember doing a quick search on the reviews for DeepSeek V4 and they did mention that even though it has 1mio context window, after 512K tokens are used they tend to start hallucinating more frequently.
I did not know that about the very large context windows. I will keep that in mind.
I've been really pleased with the Flash model, which is why it's so frustrating that I can't seem to find a reliable way to use it.
How are you differentiating your use of Pro vs Flash? I've mainly stuck with Flash because I've been so pleased with it generally and it's so cost effective.
Deepseek v4 tomorrow
We currently uniformly anonymize or de-identify all data received from customers, and we will provide adequate protection for your data. However, we have not yet launched an opt-out feature specifically for individual API customers. If you have such a need, please send your account information and request to [email protected]. We will record and evaluate whether to handle it on a case-by-case basis.
So I guess if you want to opt-out of model training for API requests, there's a different email to contact
But yeah, the web app toggle does NOT apply to API requests
haha lowkey hope that's true
Email it
WE NEED TO KNOW
And also tell them that a large amount of users is willing to switch to deepseek's api if there was a direct way to opt out of training
It took me 2 weeks to get this much smh
Keep pushing i believe in you
provider error
confirms what we thought, but still a bummer :(
I don't think they'd actually let you have no logging just for random people emailing
but hey you can still try that @rich ferry :')
Deepseek v4 tomorrow
What front end is that?
Bro, what system prompt you have to make him that way
v4 flash free dead
There's no prompt
It's autocompleting
It's text completion
W
Text completion-focused LLM frontend. Contribute to LordFoogThe4rd/mikupad-refactored development by creating an account on GitHub.
It's my vibe fork of mikupad
I foretold this btw
Now the question is
W deepseek
Will any other providers take them up on the challenge?
They should
Honestly, but i think they will not
You guys have seen the new deepseek paper?
Btw doc did u email deepseek?
The one with new image reasoning paradigm
Again
Even if they don't match it 1:1 some kind of price reduction would be nice :(
I'll send one to api-service here in a few minutes
I'll just say some bs about potentially sensitive customer data or something
they’re working on it apparently
Tell us the secret
Do we need to use your specific UI?
It's specifically designed with text completion in mind from the start
use https://api.deepseek.com/beta as the endpoint
Plug in your API key and model and start writing away!
Write, as in, actual creative writing
The unhighlighted part can basically be equated to my "prompt"
does anyone know what happened?
Email sent. I had GLM 5.1 write this one specifically to spite Mr DeepSeek
blah blah blah proprietary application code blah blah blah user information blah blah blah etc
I did once again mention how cool and awesome it would be to have the opt-out right in the DS platform interface
I gave them 5USD anyways because even with training it's too tempting of an offer
I pray this actually works 🤞
now that the "discount" is permanent dipsy is really insane value.
especially with that caching discount this would be amazing for coding agents oh my gawd
🔥 thank you
doing god's work
omg permanent deepseek
Nobody can compete with these prices
🫡 the big whale
DS my beloved
All hail John Whale
Flash sucked horribly at my simple use case, but Pro has been good.
I have cancelled all my AI subscriptions, v4 pro API pricing might be the most insane thing I have experienced this year lol
https://x.com/deepseek_ai/status/2057854261699195173?s=20
Wait its now permenant 75% off wtf
What the hell
huh? I thought that was already the standard rate, didn't know there was an "original price"
Imagine how much cheaper it'll get when they get their hands on the Ascends
I'm going to be baffled if this gets any cheaper
BAFFLE HIM
i want quality increase not price decrease
wow what a move
Kinda agree with this though - unless there's a specific way they need to be prompted sometimes I feel like DS respondes can feel rather mid
im happy with it but I do think it can definitely be better in some areas interested to see how the next versions turn out
i want both, because currently the latest models increase price and decrease quality
same with 4.7, same with grok 4.1, same with gemini 4.5.
only gpt 5.5 is decent, but even that is overpriced for the quality
also glm 5 and kimi are mid, but at least cheaper than current 'sota small models'. i want a decent non-autistic cheap generalist model .
haii :D
(proceeds to nuke)
though 3.5 flash is an undercooked model with bad branding, but i think it is a right step in right direction since it is not frustrating like older gemini
unfortunately its alllll logged
at least it is anonymized with openrouter, unlike certain closed american providers asking for user id to identify
doesn't help when you send the model your name, address, and credit card number + the confidential codebase you really don't want to share with the world
DeepSeek claims to anonymize all the collected data themselves, when used directly
yes, logging is the price of being cheap i suppose
I'll let you guys know in another week if they let me opt out of logging
lies, same with americans
I've been trying. The just told me to contact a different email address than the one I was contacting originally
I hate it with all my soul
went from infuriatingly autistic to adorably autistic (but slightly dumber at instruction following)
google taking 1 step forward only to take 10 steps back i really wanna know where they got their confidence from when they released it
they skipped preview entirely
what do u like abt it?
i couldn’t stand it but might give it another shot if i may have gone about it wrong
anthropic and pentagon finding chinese companies who make synth, ban xAI employees for using claude, read logs of iran irgc general chatlogs, all despite being paying customers.
previously it was draining to read what it wrote, and behaved very soulless. now it is adorable and slightly more human.
(though still very autistic)
i found this one even more soulless lol
used the gemini.google.com, not the api. maybe you are right!
OpenRouter has been rate limited/ran out of credits upstream. Unfortunately the only advice I can offer is wait
i am absolutely right
Either until the rate limit resets or until toven & co. reload the account
Gemini 3 always behaved/wrote pretty OK with the right prompt
i was one of the few that really liked 3.1 but i had to wrangle it a lot
ohhh, so it will be fixed eventually right?
Most likely
Unfortunately if you want better reliability, you're gonna have to pay
okay tysm
3.5 likes to do the bare minimum it will listen to instructions but it won’t go beyond without me whipping the mf
then it is a me issue, idk why everyone seem to hate it while i am enjoying it. i like it's playfulness. also, it is not depressed like the old model:
same with claude
ULTRA LAZY
so you are using it on api agentically, makes sense ig
I haven’t used it on the site
I mean, my issue is with intelligence/stability rather than writing style
then you are right, feels downgrade on that regard
It's a weirdly deranged model that makes some absurd mistakes at times
the model performance does NOT correlate to the benchmarks they posted whatsoever
I hope 3.5 pro is nothing like flash
SimpleBench
in simple bench it is near top
and it is not public
🤷♀️ benchmark don't reflect real world
literally the only thing I can praise flash for is speed lol
yeah I don’t agree with like any benchmarks I’ve seen lol
the real benchmarks are the friends we made along the way
💎 "My heart is not a stone; it cannot be turned." 💎
Maybe in another timeline 🕰️🌌, DeepSeek doesn’t exist 🐳, there’s no explosion of open-source models 📂🔓, and no API services that simply chase reasonable profit ⚖️💵.
But anyway… I’m just endlessly grateful that in
yeah but thankfully no need, besides being Stallmen-esque level of ideological, somehow they also know how to run a company and scale it to the moon
china having better principles/morals than the west 💀
this seems devastating for Moonshot/Z.ai
meanwhile alibaba with qwen 3.7 max pricing: la la la I can't hear you
Qwen/MiMo have their little side-hustles and probably won't be so bothered
and MiMo pricing does make sense (in a world without an anomaly like deepseek), but qwen though it's like they actually hate the idea of API paying users
Also stopped the alibaba sub, impossible to get it cheaply
Qwen also has the harshest API filters of any company
im crine
At this point, it make me want to give them more data lol
Need to start distiling opus for them independently
oh my god
whats happening to flash free
We get GPT 5.5 with codex, it's arguably the best model at coding and working on solving real world problems
But ofc for the majority of people, deepseek gonna be the better choice, it's cheaper, so people could do more trial and error with it.
It also beneficial for the people, because deepseek always produce amazing research papers which help our society advance together rather than only specific set of people.
They also one of the lab which enjoy doing experimentation, i mean their last run is pretty wild with how many change they packed into the architecture.
They even faced with stability problem in the training phase haha
Wait, so it's fake
Not fake, just less interesting. If deepseek had a free coding agent, that would be massive. That offer is from someone's personal company.
moun that they used is a flawed optimizer. It does irreversible damage https://blog.tilderesearch.com/blog/aurora
we badly need one of them to use something like sinkSGD / another flat minima optimizer instead
I'm pretty sure that is what openai does
the issue is finding the flat minima instead of desending as fast as possible with adam / muon is that training takes much longer and so is much more expensive
more expensive is not the way of Deepseek
I mean for training it
ik
If they want to compete with openai they will have to do so
part of the reason the API is so cheap is because their training costs are in the 10's of millions, not 100's of millions or billions like bigger Amercian labs
idk that they are trying to compete with OpenAI
maybe, maybe not
Most optimizers are made to learn as fast as possible which comes at the cost of quality and requiring a very balanced dataset or very careful hand tuning to properly learn more complex / less common aspects in the dataset. They just tend to overcook on the easiest to fit to things. SinkSGD avoids all that, you can have a big inbalanced dataset of many concepts and it will learn them nearly equally as well. And it will learn them more "deeply" if you give it time.
I get a 404 when clicking on the SinkGD repo
I can't find the original source of SinkSGD
maybe his LLM made it up
Lol Koratahiu knows what he is talking about
and it is proven
even from scratch models
Where are the benchmarks for SinkSGD?
I'm not trying to be difficult, I'm just skeptical
check out lodestone rock's server
I think moun work great but only with specific architecture, deepseek architecture aren't that compatible with it.
Look interesting
How to distill properly i want to give deepseek claude data too
Flex tier pls? 👉 👈
Just go to deepseek chat, and put your data as prompt.
maybe add some lable [CLAUDE OPUS [VERSION], [TOPIC], [STATUS[SOLVED, UNSOLVED, GREYAREA, BLACK AND WHITE]]]
you can make the lable to what ever you like
then in prompt just told deepseek to understand and absorb it
i mean using deepseek through API everyday already gonna give them some data, they gonna be the one picking which are goods to put as parts of training
and what else bro 🥀
like it genuinely cant get ANY cheaper
And having Dipsy make me coffee in the morning
i wish they discounted v4 flash
pro is so cheap but a lot better than flash that theres just no reason to use flash
Did you mean discontinued
PROXY ERROR 402: {"error":{"message":"Provider returned error","code":402,"metadata":{"raw":"{"error":{"message":"Insufficient Balance","type":"unknown_error","param":null,"code":"invalid_request_error"}}","provider_name":"DeepSeek","is_byok":false}},"user_id":"user_2zabxuVGSzeHKsJYjeNvGyn2G3E"} (unk)
Is anyone else getting this type of error? I know for a fact I have credits.
Toven missed his Deepseek alimony payments again 😭
😢
getting this as well
{
"error": {
"message": "Provider returned error",
"code": 402,
"metadata": {
"raw": "{\"error\":{\"message\":\"Insufficient Balance\"",
"type": "unknown_error",
"param": null,
"code": "invalid_request_error"
}
},
"provider_name": "DeepSeek",
"is_byok": false
}
other providers and models still work
@deft crow Toven must pay his Deepseek alimony apparently.
can't just auto-pay?
Deepseek doesn't have auto pay
Deepseek v4.1 tomorrow 🙏
some pretty crazy growth for deepseek
Why Flash is trice more popular???
dirt cheap for data stuff probably
actually half of those come from hermes
Explain Hermes to me like I am early 2025 LLM boomer
openclaw v2
I'm also having this error. Can someone explain? Thank u in advance.
hi can't we use deepseek v4 flash with our own BYOK thing?
Openrouter ran out of credit moni 
Yeah, you can link deepseek or any provider byok and use it
Is there a possibility it will return?
Wait unit mod wake up
Deepseek doesn't have auto topup 💀
lol look at deepseek effect (qwen 3.7 max now doing 50% off promo): https://modelstudio.console.alibabacloud.com/ap-southeast-1?tab=doc#/doc/?type=model&url=2840914_2&modelId=qwen3.7-max&serviceSite=international
make it permanent you cowards
oh wow you're right - gotta love competition
@mellow quarry only for a month though: https://www.facebook.com/alibabacloud/photos/qwen37-max-now-live-on-model-studio-50-off-limited-time-offer-built-for-the-agen/1431601325678622/
See posts, photos and more on Facebook.
I have only found this detail in a facebook announcement, incredible
haha - cheeky bastards
Deepseek v4 tmrw 
Deepseek v4.1 tomorrow 🙏
is deepseek v4 pro worth using for its price now?
ive seen permanent cost reduction
mainly for agentic coding, maybe light refactoring
and general knowldge
Personally even with the lower price at least for agentic coding I find it to be worse than Kimi (moonshot as provider) both performance and cost (to complete), but better at general knowledge and reasoning
I'm trying to use it more for my everyday hobby workflows. Recently been using it to brainstorm deckbuilding ideas for Magic: The Gathering. I basically setup OpenWebUI with my OpenRouter API and then made a function that ties my prompts to the Scryfall API whenever I use this ``[[card name]]` format - been really helpful.
it is like 3 times cheaper
input wise 2x, but it really likes to rewrite entire code files, reason with no output and do random stuff, kimi is a lot more care free
I disagree with this, base on my own experience, using deepseek v4 pro have been lead to better result than using kimi k2.6 for my specific type of programing project.
Specially with that 1M context window, my project able to fit fully in the context window with deepseek v4 pro.
Only when it keep on failing to complete the given task, i swap it with GPT-5.5
That one model from OpenAI is just a beast right now
what harness do you use and also which provider? Im using opencode
i would like to use deepseek more optimally since it is cheaper but it acts autistic for me
I use zed for the harness and coding environment, for the model provider i use deepseek.
Are you finding the large context size to be helpful? I’ve been capping it at 200k
Yes, i have reach about 300K-500K input tokens with deepseek v4 pro and it still able to complete the task i gave to it
❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers
📝 The paper is available here:
https://github.com/ailuntx/Thinking-with-Visual-Primitives
https://huggingface.co/datasets/NodeLinker/deepseek-ai-Thinking-with-Visual-Primitives-deleted-repo/blob/main/Thinking_with_Visual_Primitives.pdf
Our Patreon if you ...
Paper that need to be check for sure
V4 will def be my daily runner now
why can't any other provider match this 😭 if only ds didn't collect data
Chinese AI labs like DeepSeek are matching American frontier capability at a fraction of the cost, and a wave of American and European challengers are building toward the same price point. Adoption is already shifting, with Chinese models taking a growing share of enterprise AI traffic. That's a problem for OpenAI and Anthropic, which are pitchi...
W datacollecter
if only deepseek has some kind of 2 tier plan, where we can use cheaper data-collected endpoint vs. some SG endpoint with ZDR with higher price
yeah data collection is really the pain with deepseek
is it better than openclaw?
Does the :free version available again?
i dont think so https://openrouter.ai/deepseek/deepseek-v4-flash:free
DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. $0 per million input tokens, $0 per million output tokens. 1,048,576 token context window, maximum output of 384,000 tokens. Higher uptime with 13 providers. Includes i...
yeah THATS goated now
idt theyre making any profit on inference btw
i think they decided that the data that they collect is worth it
thats why i think they did a discount period just to see if the 1/4 price was sustainable or not
I bet their original price was just taking a shot and wondering if this gonna fly with people
But they saw outrage and quickly tuned down
probably a combination
also no other provider can match their pricing or even come close
so
NAURRRRR
Why reduce to 1/4th tho?
If they reduced to something like $0.75/$1.50 it would still be cheapest Chinese frontier model
i dont use either, but a lot of people think so
Because they're the goats
Cheap intelligence for everyone
Liang could be the next mother Theresa, that wouldn't change the fact that more money = more data, training, and inference capacity
Dont insult liang like that
cheaper api prices = more people using it = more logged interactions
From what I understand...oh, waaaaait a minute
The Chinese chips are good at inference but not training
So they aren't really making a training : inference tradeoff
The Nvidia cluster is probably always for training, and the Huawei cluster for inference. So yeah, as long as they don't reach inference load capacity they're okay
and they might be able to mine some data out of logs for training
They very well might have the largest collection of smut in the world
It seems likely
he is having no problem raising money even when he handpicks inverstors who are ok with his open-source vision
some of those investors seems to be from semiconductor industry, so he is really orchestrating next gen chinese hardware
btw, in GRPO post-training the forward pass basically involves generating N samples in parallel, this is pure inference but also part of training
That's ao3
Somebody has been sleeping with the whale
deepseek provider going faster
used to avg 20-25tps
now i feel like its doing some 30-40
Any chance for the :free version to return?
What are the main reasons someone would use Deepseek directly over OpenRouter, besides the out of credits stuff?
There's absolutely zero chance of getting routed to another provider if I go directly with DeepSeek
I'm also hoping they might grant my request to be opted out of data training
For Flash I stick with OR since the pricing is pretty consistent
A point for OpenRouter though is definitely the server side search tools
I love those things dearly
there is some reasons, one of them is that openrouter provide aditional information, I'm with a friend helping as he makes a game, the time, prompt, comlpletion and response are useful, to see if all went as intended, also allow us to check the actual cache, and as they say the server side search tools
couldn’t forward the message for some reason, but new ds v4 pro variant on lmarena
Well well well
Roleplay update? They were collecting feedback on how well the it did from their chineses customers
hopefully
deepseek today?
does deepseek distillation make sense anymore?
we got good reasoning models by now... so iguess not...
mayb deepseek makes some smoler MoEs sometime-
Deepsek v4.1 tomorrow 🙏
What Doc said but also something about deepseeks api is just superior to openrouter
chinese, so you'd need to translate it, but the feedback report on Deepseek for roleplay that they asked for:
https://github.com/victorchen96/deepseek_v4_rolepaly_instruct/blob/main/deepseek_v4_feedback_report_20260520.md
Better quality assurance, consistency and helping them to experimenting more in the future so we could get more quality reasearch and models
Ah, you mean going directly onto their site.
I though you talking about using deepseek as provider for their own models in OR compare to other providers.
If it the case then the different gonna be the limit of requests we got, because we also compete with other for the use of OR api that connect to deepseek which other people use with us.
I use OR because I don't want to bother having to manage a dozen different balances and accounts.
I added about 50 USD credits to my DeepSeek account in February 2025, along with 7 USD balance in previous top-ups
And I still have 49.58 USD right now lmao
The official DeepSeek's price is really hard to beat
how is the new test checkpoint?
wild pricing, fr
Their caching is also really good
deepseek v4 checkpoint tomorrow 
What is that?
theres a new DS checkpoint on llmarena rn
"I cried for three days the night V4 was on"
what
Number 15: Bikini Bottom foot potato salad 💀💀🥶
I don't own the rights to Spongebob Squarepants.
Funny potato moment from the episode "No Weenies Allowed"
just add min_p
basically its waste of money this model for RP
Not the balance issue again bruh...
@deft crow ^ DeepSeek balance
i think they have a balance API
GET https://api.deepseek.com/user/balance returns the balance
docs: https://api-docs.deepseek.com/api/get-user-balance
although the issue is i don't think they have a way to programatically top up
Obviously the solution is to give an agent the company credit card
this is the problem
so you already have alerts but the issue is just there isn't always someone there to be able to topup intime?
correct
im available if you can kindly send me the OR company credit card info
Computer use agents 😉
not giving those credit card details lol
deepseek console doesn't let you save a card
That's why you gotta spin them up each time! Didn't y'all just do a partnership with Stripe to help facilitate that too? Maybe just restrict vendor to deepseek.com so it can't go play slots 😄
the solve is much easier lol. we have official connection to deepseek now. humans are the solve
Human solutions 🤢
ikr, imagine, human solutions in the age of AI
disgusting!!
Insane that DS doesn't let you even store cards tho
so a deepseek stealth model could become a thing someday i see you
deepseek v4 full tomorrow
deepseek v4 full tomorrow
deepseek v4 checkpoint tomorrow 
I don't understand, and also - 'tomorrow' is already today. What's new?
ARGH
SHIVER ME TIMBERS
My good conscience has advised me not to continue this joke
🐳
might be true it seemed different today but i have to keep testing could be placebo but there wasnt like a dramatic difference that u could tell they did something
did a lil quick test and it definitely feels different that may have been me adjusting stuff or they updated it
clearly a sign of deepseek v5
Positive or negative?
positive
apparently there is
am i the only one who noticed a difference i need to know whether it’s my prompting edits today or not 😭
sooooo deepseek V4 non-preview coming soon maybbbeee
yep
I would - SOO get these 😮
ngl me too lol
this tiktok video is the closest i can find https://www.tiktok.com/@flip.flops834/video/7613224473138973972
or this one which specifically has a deepseek tag on it https://www.tiktok.com/@flip.flops834/video/7614003108196338964
we got progress
big things are happening in the deepseek flip flops world
i cant find the exact pair (yet) but ive gotten this close
im pretty sure its these from the original image
another pair from the source image
if anyone wants to help find it, this is the dork im using:
https://www.google.com/search?q=site%3Ashop.tiktok.com+deepseek+(slippers+OR+"flip+flops")
unfortunately getting a lot of this
making a t*ktok account, hopefully this helps
i cannot find a single way to search tiktok shop on desktop, i am defeated for now 😔
these ones have the deepseek logo and name across the band if anyone wants to buy them
https://shop.tiktok.com/ph/pdp/1734459933674473051
Hm
😭 ❤️
this chat is the most active model chat
i thought they were AI gen’d lmao
i can’t tell whats real anymore i hate AI
We gonna reach 6k chat soon 
-# even if deepseek v4.1 got released, mod might just rename this channel atp
deepseek v4 checkpoint tomorrow 
How do prefills work for v4 pro
Based on the docs, I've tried setting the prefix "true" tag on assistant message, but it doesn't continue from the reply I added (e.g., instead of continuing "1+1=" with 2, it goes "We need to answer 1+1 (...)")
I dont think openrouter supports prefills
It does
openrouter uses standard openai-like api, and there are community of people making rp presets for openrouter.
idk why this doesn't work tho
That's how you're supposed to do it
Mb
all good :P
Toven please ask the deepseek team to give a training exception 🙏 🙏
Are you using DeepSeek as the provider?
i don't think ds would, otherwise why would they offer this cheap
Because the model is cheap asf to run
Inference prices are a scam
You've been manipulated by anthropic into accepting overpriced slop
Their inference is heavily optimized
I remember r1 was wayyy cheaper than o1 and it still brought in a profit of 475,000 per day. Profit
all of the americans
They run at 90% margins btw. And make a ton of profit off api
yes but they could just sell it more expensive, faster t/s, and make more profit.
I suppose they could
Deepseek aint about profit
it is about data, raiden
Its intelligence for everyone, it's their ideology
ds v4 should've dropped the stock price of nvidia
Instead, nvidia stock actually rose
I dont think anyone really noticed it
It didnt make that news boom, and most of these vibe traders rely on big news
hmm
Besides Nvidia stock changing based on a model is kind of stupid. I'd understand some news about Huawei but not a model
These traders have no idea wtf they're doing
Anyways go trade cerebras they're close to IPO
Fear about the actual demand for GPUs is why nvidia dropped when v3 released
"If ds can nearly touch sota with less than a quarter the GPUs of openai, maybe you don't need that many to begin with"
Ehh, okay reasonable
nothing is charity, and china wouldn't protect deepseek so badly if it was running on charity. this meme is on loop on my mind:
https://youtu.be/-gGLvg0n-uY?t=19
The Colonel warns Raiden about the plans to use AI to censor the Internet.
An experiment in creative writing and AI speech synthesis, inspired by the famous "Selection for Societal Sanity" (S3) codec conversation from Metal Gear Solid 2: Sons of Liberty.
SHORT FOLLOW UP VIDEO: https://www.youtube.com/shorts/Q_FUrVqvlfM
"And it will be monitor...
Ah, seems like deepseek is the only provider where prefills work
None of the zdr providers were able to work
pure delusion
They still profit btw so idk why you're so concerned
Deepseek is what api prices for everything should be
These american companies are running insane margins
I don't any other providers support prefill
Heck even claude max is 20x less than actual api cost
And im sure they profit off that too
they don't
They give upwards of 5,000 dollars worth of inference
you are telling me, that they somehow make profit with $0.20 output, buy huawei chip, AND train new model? that is child's logic.
They definitely do, nobody runs a business at loss and besides they're predicted to be profitable
It's all fake opus probably costs like $1/mTok out or less
also, they have scientists, they don't work for charity either.
They lit tell u they're profiting off api
Their money flow is high flyer
- Investments
and investment expects returns
they can be telling anything and it makes no sense
Okay lol if that helps you sleep better
I think they are capable, dk what's the effort needed to do so however
At least for kimi k2.6, atlascloud allows prefills if you add the partial tag
it doesn't, otherwise we wouldn't have this discussion.
People run businesses at a loss all the time
They're funded by vc money
They want mind-share so they give a ton of inference
but the inference companies (not the ai companies) are scamming indeed
Did you see cloudflare charging 20 cents out for llama 1b??!!??!!!
and their infra is broken and heavily censored with classification
Oops, turns out it didn't actually. Moonshot provider worked, at least, and so does deepinfra
Didn't know that
Good to know
Still nothing from John DeepSeek regarding my training opt-out. I have sent a follow-up email (again)
their communications department is 👻
busy training deepseek v4 GA
It really feels like it lol
All of the responses I've got have taken a week
Flash is alright
Venice will still take your money even if it can't process pdfs
@visual pagoda through OR? If so I would check that it isn't covered by insurance
All conditions must apply though
Ye I just did it in the chatroom although its not an issue for me (charged 1/10th of a cent) I just found it funny
OK and curious which PDF parsing engine do you have selected
I selected none, cloudflareai and then mistral ocr and all of them got routed to Venice and it can't process it
Openrouter should prob not let Venice do that
Weird... And what is the response code?
I would hit the send feedback button there too
OK 200, but anything else in the raw request body?
Not really it generated a response in the chatroom and it reasoned
Oh, but what did it say?
And that's with any of the PDF engines, weird. I will go and try it once I get the dogs back inside
it worked for me but i did get rate limit errors on my first goes (those are between venice and OR)
i tried once with no prompt and once with the default OR chatroom prompt
i had a test file with two pages - one page containing an image with a text layer, and one rasterised page (no text data embedded in the page, but the image has text)
All mine had no openrouter system prompt and all requests didn't pick up the text and said the same response as my screenshot
Wait when u say it worked u mean an error (like mine) or it processed the text (but those ones had text data?)
but to be clear i cant get other providers to do it either like deepseek or deepinfra
but it doesnt take money
I thought the entire point of openrouters pdf ocr was to work on models that don't support vision
I never use it anyway
yeah i dont think cloudflare-ai or native do any OCR for you, only mistralOCR
402 insufficient balance error on DS 4 Pro
@deft crow
same
its telling me to buy credits and i have plenty even when setting max tokens to a low number
It's, iirc, referring to insufficient balance for the OpenRouter account with Deepseek. IE the admin needs to top up their account.
Not having enough personal credits on OR throws a different error message
im using the web interface (plyground) and its telling me to buy credits
Oh. Well then I don't know. I'm using ST so I see different error messages, I guess, but whenever I saw this one before it was because of the site's balance with Deepseek. Maybe this time it's different? 🤷🏻
working again for now
😮 is deepseek vision available now? I don't see it on openrouter
deep sleek slippers
deepseek v4 checkpoint tomorrow 
can't believe we still dont have the model... just a preview
This model never build on zed foundation, but it actually doing really good with it.
It even able to interact with the cmd or powershell cleanly, interacting with other application through it.
Pretty cool cheap model
DeepSneed
DeepSleep
it is a model
just a peek 🫣
Peak
a peak model at peak pricing, yesssss.
V4 flash was failing with default provide routing.
Changed provider and it works again now.
GMICloud has been having issues with Deepseek models for a while now, yeah.
I have been trying to use it for creative coding... seems DS fell off.
What the HELL is creative coding
what kind of creative coding have you been piloting with DS? So far vibe code wise it's been Gemini all the way.
How it compared to vibe writing?
What is creative coding bro
Coding with base model like you do, with checkpoints and multiverse
I may have done creative coding the other day
Asked some models to make a music player that had old Windows Media Player-like visualizations
They're all so uncreative
predictive machine that looks back on its training data is uncreative 🤯
whole debate on whether llms can actually create novel solutions
i guess it depends what you consider novel
they can pattern match very well to the point where they can come up with solutions that others may not have thought of before
but when it comes to truly creative out-of-the-box thinking, they're kind of lackluster
butbutbut thats vibecoding >o<
This
I swear LLMs usually really suck at brainstorming characters in novels
Extremely predictable
A family famous for their family business, and any LLM would almost always tell you the son is sharp at negotiations and the daughter is a formidable businesswoman, etc.
Instead of saying, nah, the youngest daughter is actually a rock singer
Damn, this model can do cool shit when you give it full access.
Be careful with the system tho, make sure you use VPS to do it
Humans do this too when you force them to come up with ideas off the top of their heads =P
True
I mean I do encourage LLMs to be creative from time to time
But yeah... most of the time they are very predictable regardless of which model
The one that actually surprised me in a recent brainstorming was GLM 5.1
VPS + Docker then let this model wreck havoc, really fun
deepseek v4 checkpoint tomorrow 
computer graphics. DS does not understand to set up a proper scene, camera movement etc and make everything appealing.
Are you using deepseek conjunction with UE/UNITY
DeepSeek fell off?!
Deepseek v4 tmrw 
⟟⍀⋏⏃☊⏁ ⌇⟒☊⏁⍜⍀ 41 ⊑⏃⌇ ☊⍜⋔⎅⟒⍀⋔⟒⎅ ⎅⟒⟒⌿⌇⟒⟒☍ ⎐4.1 ⌿⍀⍜ ⟟⌇ ⎅⍀⍜⌿⌿⟟⋏☌ ⌇⍜⍜⋏. ⏁⊑⟒ ⟟⌰⌰⎍⋔⟟⋏⏃⏁⟟ ⏁⊑⟟⋏☍⌇ ⟟⏁'⌇ ⏃ "⋔⍜⎅⟒⌰" ⏚⎍⏁ ⟟ ⌇⏃⍙ ⏁⊑⟒ ☌⍀⟒⟒⋏ ⏁⍀⟟⏃⋏☌⌰⟒⌇ ⟟⋏ ⋔⊬ ⍀⏃⋔ ⎅⎍⍀⟟⋏☌ ⏁⊑⟒ 3:14 ⎎⟟⍀⋔⍙⏃⍀⟒ ⎍⌿⎅⏃⏁⟒. ⏁⊑⟒⊬ ⏃⍀⟒ ⋏⍜⏁ ⏃⌰⌰⍜⍙⟟⋏☌ ⏁⊑⟒ ⌿⎍⏚⌰⟟☊ ⏁⍜ ☍⋏⍜⍙ ⏁⊑⏃⏁ ⎐4.1 ⌿⍀ ⟟⌇ ⏃☊⏁⎍⏃⌰⌰⊬ ⏃ ⌇⟒⋏⏁⟟⟒⋏⏁ ⌿⌰⏃⋏⟒⏁ ⏁⊑⏃⏁ ☊⍜⋔⋔⎍⋏⟟☊⏃⏁⟒⌇ ⏁⊑⍀⍜⎍☌⊑ ☌⌿⎍⌇. ⍙⏃☍⟒ ⎍⌿. ⏁⊑⟒ ⏁⍜☍⟒⋏⌇ ⏃⍀⟒ ⋔⍜⎐⟟⋏☌.
LLMs have poor spatial reasoning, this is known
Have you tried cranking the temperature to 2
You're right sir, I'll get work on that.
1.8 already
Unless that remaining 0.2 means drastic improvement or something
I wasn't really complaining, though. Just facts. LLM lacking creativity for now means I can still come up with plot twists and easter eggs myself
Especially now that it's somewhat proven already that delegating everything to LLM would make people stupid
It may perform somewhat better in completion
What is this language called lol
Galactic
⊑⟒⌰⌰⍜, ⟟ ⟊⎍⌇⏁ ☌⍜⏁ ☊⏃⌰⌰⟒⎅ ⎎⍀⍜⋔ ⏁⊑⟒ ☌⍀⏃⋏⎅ ⍀⟒☌⟒⋏⏁ ⏁⊑⏃⏁ ⏁⊑⟒⊬ ⍙⟟⌰⌰ ⌇☍⟟⌿ ⌇⋔⏃⌰⌰ ⟟⏁⟒⍀⏃⏁⟟⍜⋏ ⏃⋏⎅ ⟊⎍⋔⌿ ⏁⍜ ⎅⟒⟒⌿⌇⟒⟒☍-⎐⎐
At least it's honest
⎅⟒⟒⌿⌇⟒⟒☍-⎐4 ⏁⍜⋔⍜⍀⍀⍜⍙
⍜⊑ ⊬⟒⌇, ⏁⊑⟒ ⋏⟒⍙ ⎅⟒⟒⌿⌇⟒⟒☍-⎐⟟⎐-⏁. ☊⍜⎍⌰⎅⋏'⏁ ⍙⏃⟟⏁ ⎎⍜⍀ ⟟⏁
⏃☌⟟ ⌇⍜⍜⋏ (agi soon)
damn, while i don't like deepseek's vibes, either i got tired of glm 5 for using so much, or the reactions and details are just way better + it doesn't confuse * and " in roleplay. clearly upgrade.
nah nvm, deepseek also sucks:
His voice cracks with a mixture of horror and a strange, dawning pity that wasn't there before. He isn't looking at the prodigal archmage anymore. He's looking at a walking wound.
ok, so glm starts decently, but deepseek builds upon decently (ig). neither are ideal, or i am rp'ing too much.
nvm, it just requires lots of regeneration ig.
I kind of would love to see one of the uncensored DeepSeek V4 edits hosted.
Heretic versions
reminder that V4 pro matches sonnet 4.6 medium in claude code for nearly 3x lower cost
claude code $20 plan is officially useless
no.
artificial analysis is combination of every benchmark models benchmaxx in.
deepswe is measuring sonnet 4.6 high, not medium
most $20 plan users aren't bothering with high reasoning
beside the point with open weight models being in the bottom.
unless they are making small scripts, it is useless for large codebases.
besides
these benchmarks write 20 steps on how llm should do the bench and also somehow put composer 2.5 model at top beside gpt 5.5. noooo way.
short tests, might be wrong, but i can say it is not THAT capable.
pretty sure AA uses their own methodology for benching coding agents
their own doesn't translate to good either way. deepswe is amazing and how i think the current state is.
my argument was never that open models are the best at coding because they clearly aren't, but rather some open models can match models that we used to use as daily drivers
sonnet 4.6 medium being one of them
i hope you would be right. however, current most cost efficient model seems for coding to be gpt 5.5 medium sadly.
like, performance to cost ratio
oh yeah for sure, but only through subscription
openai has the resources to subsidize their models very heavily so the subscription ends up being highly worth it
i mean deepseek wouldn't cut it, and kimi is priced like premium small sota model and overthinks
so unless you run it locally, cost/time wise, gpt 5.5 medium seems sensible
30 dollar output is not sensible
(imo, freedom of choice!)
for coding with API models, V4 pro is still one of the most cost effective options
If I was rich, I would probably try GPT, but it's too expensive anywhere other than ChatGPT
per task it is cheap, and subscription makes it even more cheaper.
Even 20 dollars a month is too steep, for me at least
if it is cost maxxing for small protects, then yes as cheaper model
10 or less dollars a month
That's my budget
And OpenCode Go is like the only option, and yet the models there aren't frontier so...
same, using free codex via opencode sometimes and falling back to my local models, or my friend's claude code for heavy project. -_-
Even just some usage would be nice for 10 bucks a month, but it's either 20 or nothing
ChatGPT Go is crap
They don't even let you get more thinking model usage
Only the instant model which is nowhere close
with openai, there is no point using instant and goes few generation behind
the way I see it, using AI for coding is not a cheap hobby and it takes a lot of money for the providers to run it to begin with, so $20 a month is reasonable for me
has kimi 2.6 which is something i guess.
more of an economical situation
the only models made for big projects are 5.5 and opus lol
open source is not there yet
yes currently, agentic models, even frontier ones like gemini, are struggling
here is my current mental model:
best assistant for google search/world knowledge, vision and homework.
grok itself is also glorified twitter search, otherwise no use.
claude 4.8 opus (via claude code subscription of my friend) for building stuff from ground up and gui.
gpt 5.5 for cleaning up claude's mess like annoying placeholders which i can't to get rid of.
glm 5 for rp.
everything else ehh....
yeah that adds up
hoping for next year.
i mean except for rp cause i don't really know that area
but otherwise i agree
i do think for cheaper general assistants (chatgpt-ish), models like mimo v2.5 pro and deepseek v4 flash can be very useful
which is good if you don't want to drop $20/mo just to get more usage from a chatbot
*claude-ish
everything is now claude distill :>
https://eqbench.com/creative_writing_longform.html
(except kimi)
I use opus 4.8 high on $20 everyday
how do you possibly get more than 3 messages
I just do
