DeepSeek V3 | OpenRouter | Page 2

verbal tapir Jan 3, 2025, 7:19 AM

#

This thing is so in demand that all provider almost have problem providing it man.

hoary pumice Jan 3, 2025, 7:42 AM

#

this gotta be a new record, 0.07t/s

somber mango Jan 3, 2025, 8:12 AM

#

hoary pumice this gotta be a new record, 0.07t/s

bahahaha i was just going to post this

#

summer mural Jan 3, 2025, 8:15 AM

#

folks. we did it. negative uptime 😂

hoary pumice Jan 3, 2025, 8:43 AM

#

now over 100% uptime

astral flicker Jan 3, 2025, 8:49 AM

#

still down

radiant walrus Jan 3, 2025, 1:36 PM

#

If deepinfra and hyperbolic are proxying then does that mean their privacy is the same as that of deepseek? As in prompts sent through deepinfra would not be private from deepseek and may be used for training by deepseek?

queen charm Jan 3, 2025, 2:25 PM

#

how do I route to hyperbolic?

potent moon Jan 3, 2025, 2:55 PM

#

It's not that they are proxying Deepseek
Is that are not ready to handle the whole load of traffic when Deepseek is down
So when Deepseek is down, they saturate too

bold burrow Jan 3, 2025, 3:55 PM

#

It looks like anything

#

Over a certain context just

#

Breaks the model rn

hoary pumice Jan 3, 2025, 5:00 PM

#

queen charm how do I route to hyperbolic?

u shouldn't

cyan sail Jan 3, 2025, 9:06 PM

#

Is anyone else experiencing this issue? I've even tried creating a new API key.

outer arrow Jan 4, 2025, 11:00 AM

#

Personally seeing silent failures documented here: #1324894987217014855 message

hoary pumice Jan 4, 2025, 11:55 AM

#

Hyperbolic now runs at precisely 0t/s

jaunty orbit Jan 4, 2025, 3:03 PM

#

Deepseek has context caching
https://api-docs.deepseek.com/guides/kv_cache
If I use deepseek-v3 through openrouter, will this context caching still have effect?

Context Caching | DeepSeek API Docs

The DeepSeek API Context Caching on Disk Technology is enabled by default for all users, allowing them to benefit without needing to modify their code.

forest sail Jan 4, 2025, 3:06 PM

#

jaunty orbit Deepseek has context caching https://api-docs.deepseek.com/guides/kv_cache If I ...

https://openrouter.ai/docs/prompt-caching#deepseek

OpenRouter

Prompt Caching | OpenRouter

Optimize LLM cost by up to 90%

jaunty orbit Jan 4, 2025, 3:32 PM

#

forest sail https://openrouter.ai/docs/prompt-caching#deepseek

thanks a lot

bold burrow Jan 4, 2025, 9:19 PM

#

yeah deepseek at high contexts is not really working

#

also watching Hyperbolic fluctuate between "dead" and "barely alive"

#

is hilarious XD

pure canopy Jan 4, 2025, 9:23 PM

#

how high context? I've been submitting big files and its still just about holding together. high latency tho

bold burrow Jan 4, 2025, 9:44 PM

#

which provider?

#

the DeepSeek provider specifically gives me issues

#

about twice a day

upbeat aurora Jan 4, 2025, 9:52 PM

#

Occasionally deepseek limits their context to 8k probably due to load

stray solar Jan 4, 2025, 11:11 PM

#

deepseek model so friggin slow for me

glass brook Jan 5, 2025, 12:09 AM

#

stray solar deepseek model so friggin slow for me

same, can't use it for anything atm. 😦

bold burrow Jan 5, 2025, 1:43 AM

#

also is it just me or

#

is Fireworks' DeepSeek-v3 completely diff lmao

#

this shit is SLOW and RLLY RLLY BAD

#

lmao

somber mango Jan 5, 2025, 6:38 AM

#

why is DeepSeek so hard to run?

#

i get that it's massive, but isn't MoE supposed to run a bit quicker?

verbal tapir Jan 5, 2025, 7:52 AM

#

somber mango why is DeepSeek so hard to run?

Imagine 671B model have 37B active parameters in given moment, then there 18 people using that model at the same time with different thing, isn't it going to activate all of the parameters anyway.

so in simple term because there is a lot of demand that why is hard.

That's my thought of it, could be wrong.

pure canopy Jan 5, 2025, 10:23 AM

#

what quantizations are the non-official API for DeepSeek?

verbal sigil Jan 5, 2025, 1:56 PM

#

Deepseek had months to optimize their infra for MoE for a year now. Other providers never had this much demand for a MoE and most llm serving software don't have first class support for batched high throughput MoE support simple as that.

hot wren Jan 5, 2025, 1:57 PM

#

also quite a different MOE to server

#

also there deployment is 1 expert a gpu .. with a unit of scale of 320 gpus

#

most try to run that of units of 8xh200's

verbal tapir Jan 5, 2025, 5:22 PM

#

Event after all of that Deepseek still have hard time to run it, i mean they increase their node last time to accommodate the demand.

This one model really attract people to them, good for them to hit the hype train.

hot wren Jan 5, 2025, 5:36 PM

#

they did something i would have deemed impossible on a shoestring budget

#

soo they deserve all the hype they got

fair sphinx Jan 5, 2025, 6:05 PM

#

hot wren they did something i would have deemed impossible on a shoestring budget

is 2048xH800 really a shoestring budget?

hot wren Jan 5, 2025, 6:06 PM

#

6m as training cost for a 600b+ model is a drop in the ocean

#

a 100b dense costs 50-100m

#

just in compute

#

and a 2k gpu cluster is minimal for such a model

#

405b from llama used over 20k gpu's

#

they did fantastic work with the limited stuff they had

#

also mind you the h800 is more on paar with a an a100

#

as the chip is capped

fair sphinx Jan 5, 2025, 6:09 PM

#

assuming H800 costs $42000 as it does on ebay, then the cost of 2048xH100 is $86M

hot wren Jan 5, 2025, 6:09 PM

#

h100 is around 28k in bulk

#

and we talking compute cost not hardware acquesition cost

#

2 very different things

fair sphinx Jan 5, 2025, 6:10 PM

#

i see

#

does anyone know how well merging the experts of MoEs together into a dense model works

#

the only attempts I've seen are attempts with Mixtral 8x22B

hot wren Jan 5, 2025, 6:11 PM

#

you wont be .. as knowledge wont cluster that way

fair sphinx Jan 5, 2025, 6:11 PM

#

and they were not evaluated very well

hot wren Jan 5, 2025, 6:12 PM

#

even at mistral that wont work

#

here its even way different

#

as the gate selects more then just 2 layers

#

mistral is. 2experts

#

here its 9 ?

fair sphinx Jan 5, 2025, 6:12 PM

#

oh

hot wren Jan 5, 2025, 6:12 PM

#

ya 37b active

#

over 9 experts

fair sphinx Jan 5, 2025, 6:13 PM

#

so each expert is ~4B?

hot wren Jan 5, 2025, 6:13 PM

#

aprox but expert is just a wrong word for it

#

its 256 experts in total in the model

fair sphinx Jan 5, 2025, 6:13 PM

#

i'd imagine pruning wouldn't work well since the experts are so small already

hot wren Jan 5, 2025, 6:13 PM

#

knowledge clustering happens differntly then one would assume

fair sphinx Jan 5, 2025, 6:14 PM

#

so the only way to get a smaller deepseek-v3 would be to logit-distil it or pretrain a smaller model on the same dataset?

hot wren Jan 5, 2025, 6:14 PM

#

moe is really just fractioning stuff out

#

you have overlaps and no central clustering

#

aka you dont say expert 38 is the math expert

#

just doesnt work that way

#

its a logical separation and inference hack

#

not really isolation on knowledge clustering

#

moe isnt new .. first paper was written in 97 about that

#

with 100k experts

#

its just getting steam after mistral

#

and deepseek has successfully integrated it as well

#

but non the less they did very fine work

fair sphinx Jan 5, 2025, 6:16 PM

#

hmm

#

well thanks!

hot wren Jan 5, 2025, 6:16 PM

#

inference on batch is still tricky

#

at deepseek they run 320 gpu's as unit of scale

#

and each expert is pinned to 1 gpu

#

the 3rd party guys dont have any of that

fair sphinx Jan 5, 2025, 6:17 PM

#

does the router need its own gpu

hot wren Jan 5, 2025, 6:17 PM

#

need is a strong word but you have kv and otherwise the inf cost is just like a dense 670ish B model

#

as in batch odds are close to all experts are hot

#

aka performs like a dense model on the gpu

#

if you pin an expert to a gpu - that becomes the unit of scale and you have massive gains in perf

#

as you have a 3-4 b model on a gpu vs a 680b over a few with tensor and data parallelism

fair sphinx Jan 5, 2025, 6:19 PM

#

since the experts are so small could you use 8~12gb vram gpus for the experts then? or does it still need large gpus?

hot wren Jan 5, 2025, 6:19 PM

#

im semi affiliated with mistral - but i really have massive respect for the boys at deepseek

#

they did very good work

hot wren Jan 5, 2025, 6:21 PM

#

fair sphinx since the experts are so small could you use 8~12gb vram gpus for the experts th...

issue is you never know what expert will be on

#

what you could eventually do is prune it down to just use 2 experts but after pruning you would need to retrain the gates

#

not sure how well that would work

#

im spitballing here

fair sphinx Jan 5, 2025, 6:22 PM

#

hot wren issue is you never know what expert will be on

no my question is if you have something like 320 small GPUs

#

assign the experts to the different gpus

hot wren Jan 5, 2025, 6:22 PM

#

ya that work if you have the code to pin it

#

kv may rape you .. a little

#

and the network

#

but yes

fair sphinx Jan 5, 2025, 6:23 PM

#

👍

hot wren Jan 5, 2025, 6:24 PM

#

but then the power cost - and throughput

#

i think the cheaper option is to use the api

#

320 small gpu's will drain at least 200 w each aprox ..

#

thats a big steep bill .. not user if 8xh200 at 2 usd per gpu aka 15k a month wont be the cheaper option and you have a higher throughput - at least on paper

#

we just need different matmul processing that is faster and faster memory technology

#

in 1-2 decades we run models like that on our toaster

fair sphinx Jan 5, 2025, 6:30 PM

#

yeah

#

hmm somebody got 5.37t/s on 8xM4: https://blog.exolabs.net/day-2/

12 Days of EXO

12 Days of Truly Open Innovation

hot wren Jan 5, 2025, 6:31 PM

#

ya i seen that - but mind you thats 20-25k too

#

and single user

fair sphinx Jan 5, 2025, 6:31 PM

#

yeah

hot wren Jan 5, 2025, 6:31 PM

#

prompt processing over longer ctx on such a setup will be horrible

#

its a great demo for sure

#

but i would not call that viable

#

for every day use that is

#

the inital investment for 8 macs .. at that config . wont be a easy paletteable investment for most

#

given in a year or 2 its probably close to worthless

#

if you try to spend 20k on the api given that price

#

you have a few years runway

#

lol

#

so not really sensible fiscaly

#

i have a hard time spending 100 bucks on deepseek with daily use

fair sphinx Jan 5, 2025, 6:34 PM

#

so it seems that the most improvements in local models will probably be dense models with higher-quality data and maybe new finetuning methods?

hot wren Jan 5, 2025, 6:35 PM

#

stuff gets bigger before it gets smaller

fair sphinx Jan 5, 2025, 6:35 PM

#

yeah

hot wren Jan 5, 2025, 6:35 PM

#

distillation and then newer dense models can train of that

fair sphinx Jan 5, 2025, 6:35 PM

#

oh yeah i forgot about distillation

hot wren Jan 5, 2025, 6:35 PM

#

i would love to run that local but no dice

#

im capped with 96g vram and 256 gb ram on my normal workstation

#

and that is already more then what the average user has available

bold burrow Jan 6, 2025, 3:01 AM

#

Why is fireworks

#

So diff from Deepseek as a provider for this model

#

The response is so diff

verbal tapir Jan 6, 2025, 8:06 AM

#

hot wren we just need different matmul processing that is faster and faster memory techno...

I Also thought if there a other way to make the attention calculation requiring less step while able to result in the same attention capabilities as their original formula it will make our model more efficient and faster, i mean looks at MLP/FFN alone without other part are quite fast and memory efficient.

Are there already research for that tho?

cedar wolf Jan 6, 2025, 9:16 AM

#

Firework seems off. It's like it has it's own temp settings. Together is expensive and DeepSeek doesn't always work. It's a shame.

strong rose Jan 6, 2025, 12:18 PM

#

The DeepSeek endpoint doesn't seem to work at all rn

glad matrix Jan 6, 2025, 1:06 PM

#

strong rose The DeepSeek endpoint doesn't seem to work at all rn

yeah. it's not working.

hoary pumice Jan 6, 2025, 2:06 PM

#

looks like a small anomaly

hot wren Jan 6, 2025, 3:03 PM

#

verbal tapir I Also thought if there a other way to make the attention calculation requiring ...

every part in the llm is matmul .. so if you optimize matmul you get speedups cross the board

#

custom asic's could help .. and some are working on it .. see cerebras / groq .. tpu

#

cerebras is pretty much unbeatable at this point as the interconnect is legaly fenched off to them

#

so groq still has some gains to be gotten once they get the v2 hardware out of the samsung run

#

but the devices are individually small as its all sram

#

so deployments aint cheap

wary wing Jan 6, 2025, 3:05 PM

#

Does matmul mean "matrix multiplication"?

hot wren Jan 6, 2025, 3:05 PM

#

yes

#

tpu and tensor cores are systolic arrays

#

pretty much vector processors

#

to accelerate matmul

#

modern cpu's have avx for that but way slower then simd or other achitectures

#

my bet is still on photonics .. but there is alot missing in terms of material science

#

next 1-2 decades are going to be interresting

verbal tapir Jan 6, 2025, 3:31 PM

#

hot wren every part in the llm is matmul .. so if you optimize matmul you get speedups cr...

Yeah, but attention head specially are more resource hog compare to the other part of the block as it have to get all the context so it's have more step and longer calculation.

#

I mean we can see that from layer with FFN/MLP only we get O(n) and if then we add attention layer into it then we will get O(n^2).

hot wren Jan 6, 2025, 3:50 PM

#

verbal tapir I mean we can see that from layer with FFN/MLP only we get O(n) and if then we a...

problem is you cant just change the arch as after effect

#

at least not without massive cont. pretraining

#

otherwise you get noise only

#

so that kills is very much for 99.9% of the smaller guys / labs

grand wraith Jan 6, 2025, 9:56 PM

#

cedar wolf Firework seems off. It's like it has it's own temp settings. Together is expensi...

how to select provider ?
workaround was to block every provider but deepseek

cedar wolf Jan 6, 2025, 9:57 PM

#

grand wraith how to select provider ? workaround was to block every provider but deepseek

I block them in the Openrouter settings on the website.

grand wraith Jan 6, 2025, 10:00 PM

#

deepseek v3 works very well with roo cline

bold burrow Jan 7, 2025, 12:02 AM

#

Has Fireworks like

#

fixed their issues yet

#

bc other than DeepSeek

#

all other providers have terrible responses

digital silo Jan 7, 2025, 11:35 AM

#

grand wraith how to select provider ? workaround was to block every provider but deepseek

You can ignore provider though openrouter settings

grand wraith Jan 7, 2025, 11:44 AM

#

digital silo You can ignore provider though openrouter settings

ye but this feels like a non obvious workaround.

i mean if i select deepseek, i want deepseek.. not some 3rd party provider with higher prices. feels like intentional money grab behind the scenes

digital silo Jan 7, 2025, 11:50 AM

#

grand wraith ye but this feels like a non obvious workaround. i mean if i select deepseek, ...

https://openrouter.ai/docs/provider-routing

OpenRouter

Provider Routing | OpenRouter

Route requests across multiple providers

grand wraith Jan 7, 2025, 12:00 PM

#

oh i see it's done in the code, not on website. thx

wary wing Jan 7, 2025, 12:08 PM

#

grand wraith ye but this feels like a non obvious workaround. i mean if i select deepseek, ...

OpenRouter doesn't make money by routing you to a more expensive provider

grand wraith Jan 7, 2025, 12:11 PM

#

surely they get kickbacks for allowing it to happen by default 😛

wary wing Jan 7, 2025, 12:22 PM

#

grand wraith surely they get kickbacks for allowing it to happen by default 😛

As far as I know, they make all their profit from deposit fees

summer mural Jan 7, 2025, 12:31 PM

#

wary wing As far as I know, they make all their profit from deposit fees

accidentally routing you to openai's o1 (or later o3) would make them bank then - due to automatic top ups 😆

vocal ledge Jan 8, 2025, 5:38 AM

#

Anyone know why, no matter what settings I use, I can only get max response of approx 2000 tokens using deepseek provider? I've set max tokens to 8k.

Even when I ask specifically for up to 5000 words. Cheers

wary wing Jan 8, 2025, 12:13 PM

#

vocal ledge Anyone know why, no matter what settings I use, I can only get max response of a...

Does the response randomly cut off or does it just stop?

If it just stops, that's an LLM issue. After a certain point, it just stops talking probably because it's training data didn't have many examples past 2k tokens

vocal ledge Jan 8, 2025, 9:15 PM

#

It stops naturally, like it starts generating its response "knowing" it's limited to 2k or so words if that makes sense.

hot wren Jan 8, 2025, 9:55 PM

#

its not a rp model mate

#

dont expect it to write long stuff for you .. it doesnt have to - if anything most us want answers as short as possible and to the point

vocal ledge Jan 8, 2025, 9:58 PM

#

More for knowledge articles and wiki creation. Yeah it's a shame cause it's my go to model right now, and Gemini could pump out full articles in one prompt but deepseek take a couple prompts. No real issue just was checking if I'm missing something in settings.

rugged phoenix Jan 8, 2025, 10:03 PM

#

It’s good at creative writing actually.

zinc quarry Jan 8, 2025, 10:04 PM

#

After all, do you ever really want any model 1 shotting a large of piece of text beyond say code refactoring? Just like a human writer, it's best to start with an outline, and progressively flesh it out.

upbeat onyx Jan 9, 2025, 3:17 AM

#

is it down

sand palm Jan 9, 2025, 3:29 AM

#

I get errors over deepseek api, but not on official website.

zinc iron Jan 9, 2025, 3:34 AM

#

hot wren dont expect it to write long stuff for you .. it doesnt have to - if anything mo...

imo its a bad thing if model answer always short, i dont like how sonnet answer a problem without details explanation of solving it even when you ask it where in other hand o1 always giving out details about that problem.

also the other person talk about how its limited to 2k even when its adverst to have more than that, so make sense that person thought it should have been outputting than 2k just as deepseek told us in their label.

hot wren Jan 9, 2025, 3:50 AM

#

zinc iron imo its a bad thing if model answer always short, i dont like how sonnet answer ...

overrated tbh .. - you can always use continue - and steer it

#

people who cry are the guys who think ai will do all alone ..

#

its a TOOL

zinc iron Jan 9, 2025, 7:18 AM

#

i think you miss the one and two point, there use case where longer outputing could be really beneficial as in my case its about understanding problem and the longer its the better it could be layout also if it being adverst that it can outputing 8K token then you shouldnt need to steer it so it outputing as what being adverst where also if you ask it to continue then its not as what being adverst to be 8k token output.

i agree its a tool but its still shouldnt have that adverst in the first place if it cant get to it

upbeat aurora Jan 9, 2025, 7:24 AM

#

It can output 8k but the model doesent feel it is necessery. Suppose you give it a translation task which requires 8k output then it will do that because the model feels the need to output all 8k tokens.

sage orchid Jan 9, 2025, 7:30 AM

#

are you telling it to generate 8k tokens or are you telling it to like "respond in 10 paragraphs with 5 minimum sentences each" which can equal 8k tokens or something similar

zinc iron Jan 9, 2025, 7:37 AM

#

upbeat aurora It can output 8k but the model doesent feel it is necessery. Suppose you give it...

has you actually try it to generate 8k token for other thing than translate? i has and its really are limited to below 8k token if you ask it to make story even if you steering it to do so, imo its shouldnt adverst as something that could generate 8k token if it cant do it for many thing other than translate and i think eternal answer are more suit for this thing as it is a model problem than other thing where there no example that goes 2k token on the training data.

upbeat aurora Jan 9, 2025, 7:48 AM

#

then you should probably use a different model for your use case

upbeat onyx Jan 9, 2025, 7:51 AM

#

ooooo fight fight fight

#

bold burrow Jan 9, 2025, 9:23 PM

#

has anyone figured out why the models return such different responses dpeneding on provider?

forest sail Jan 9, 2025, 9:49 PM

#

Everyone is running a different inference engine and potentially different quantization of weights. DeepSeek's inference engine is proprietary, Fireworks apparently does some form of quantization (https://x.com/FireworksAI_HQ/status/1874231432203337849?t=Y8xmqor0UFhkvzPAJd4H6A&s=19), and everyone else is a wildcard.

Fireworks AI (@FireworksAI_HQ) on X

DeepSeek V3, a state-of-the-art open model, is now available on Fireworks Serverless and Enterprise!
🥇 SOTA open model for coding and reasoning
🥇 Best performing open model on Chatbot Arena and WebDev Arena
🧠 671B MoE parameters, 37B activated parameters
Congrats to the

#

I don't think there's been confirmation that the different open source inference engines produce the same output for deepseek at the moment.

hot wren Jan 10, 2025, 6:33 AM

#

also sampers make an impact how they process logits - fireworks will be some triton based inf

#

the best bet specially with moe's / new architectures is always the original model providers api

bold burrow Jan 10, 2025, 2:30 PM

#

forest sail I don't think there's been confirmation that the different open source inference...

damn

#

it's v v different unforutnately

#

yikes

graceful mica Jan 11, 2025, 3:48 AM

#

bold burrow has anyone figured out why the models return such different responses dpeneding ...

MTP module seems to alter model's behavior a lot
It's much more deterministic with MTP module on

#

source: local inference w/o MTP module compared to DeepSeek Platform

#

example: with MTP on, model stays stable at temp 2, w/o it it's much more chaotic

bold burrow Jan 11, 2025, 5:03 AM

#

graceful mica MTP module seems to alter model's behavior a lot It's much more deterministic wi...

MTP?

graceful mica Jan 11, 2025, 5:04 AM

#

bold burrow MTP?

Multi-Token Prediction

#

it's basically a specifically trained 14B model for speculative decoding

#

more complex than that, but it's the basic idea

bold burrow Jan 11, 2025, 6:08 AM

#

Any way we can use those for the other providers

#

Or do we just wait for them rn

pure canopy Jan 11, 2025, 12:09 PM

#

spec decoding is changing outputs a lot, its very unlikey the official api is doing that.

astral flicker Jan 11, 2025, 2:32 PM

#

Sorry for the intrusion, but can you share the prompt to me too?

wary wing Jan 12, 2025, 12:26 AM

#

pure canopy spec decoding is changing outputs a lot, its very unlikey the official api is do...

I thought spec decoding didn't change the outputs

#

I thought the entire purpose was to be faster and not change outputs

gusty locust Jan 12, 2025, 12:28 AM

#

I'd like to report that, unless I'm misunderstanding something, Deepseek seems to be limiting their context to only 10k, and it has been this way for days. Any time a prompt exceeds that amount, it never generates a response, and if you lower the context size to 10k, it starts working again.

wary wing Jan 12, 2025, 12:28 AM

#

gusty locust I'd like to report that, unless I'm misunderstanding something, Deepseek seems t...

This seems to be an issue in implementation on the providers side rather than a model issue

gusty locust Jan 12, 2025, 12:29 AM

#

I use exclusively deepseek as the provider, and disable the others in the settings.

#

The others just output gibberish.

#

Though the other providers do generate a response even when over 10k. It's just usually garbage.

upbeat aurora Jan 12, 2025, 12:33 AM

#

Sometimes it works over 8 to 10k but it is probably limited due to load

gusty locust Jan 12, 2025, 12:36 AM

#

Seems like that should be listed somewhere. I've tried it at random times over the last several days and I've never had it generate a response over 10k. I get that it's a good model at a low price, but I was using it expecting to get the full context listed.

upbeat aurora Jan 12, 2025, 5:37 AM

#

I agree

sonic plume Jan 12, 2025, 8:54 AM

#

gusty locust I'd like to report that, unless I'm misunderstanding something, Deepseek seems t...

And I was wondering why my input that was about 12K tokens resulted in infinite timeout from DeepSeek

#

Could be the same issue

sonic plume Jan 12, 2025, 10:36 AM

#

@lament yacht Will OR look into this? Because if DeepSeek is indeed limiting input tokens to under 10K, either DeepSeek or OR is lying about DeepSeek having 64K context window

rustic glade Jan 12, 2025, 10:49 AM

#

gusty locust I'd like to report that, unless I'm misunderstanding something, Deepseek seems t...

I've been only hovering around 2-3k and I get the same issue.

sonic plume Jan 12, 2025, 2:11 PM

#

I guess it's weekend so no one is answering or investigating this

silent torrent Jan 12, 2025, 7:36 PM

#

there’s often a difference between a model’s technically supported limits and what the provider actually enables in prod, likely deepseek had to limit context length to serve more people

#

example being google charging more for tokens over 128k while the models can support 2M tokens

#

just checked deepseek discord, it’s 100% an issue on their end

#

sonic plume Jan 12, 2025, 9:04 PM

#

It's fine to limit context length. Many Llama and Qwen fine-tune models hosted by Infermatic and Featherless actually support up to 128K context window, yet no host will actually host them with such context window

#

It's just that it should be clear and honest about it on OR's model page

gusty locust Jan 12, 2025, 9:06 PM

#

Yeah, that's my main issue. It needs to be stated somewhere. This isn't a free service. We are paying for it, and if they are having problems then it should be clearly stated.

lament yacht Jan 12, 2025, 9:19 PM

#

sonic plume <@353228093420208131> Will OR look into this? Because if DeepSeek is indeed limi...

I think we need to chat with DeepSeek upstream, we're following their docs: https://api-docs.deepseek.com/quick_start/pricing

Models & Pricing | DeepSeek API Docs

The prices listed below are in unites of per 1M tokens. A token, the smallest unit of text that the model recognizes, can be a word, a number, or even a punctuation mark. We will bill based on the total number of input and output tokens by the model.

silent torrent Jan 12, 2025, 9:24 PM

#

yeah was gonna say, deepseek themselves don’t say anywhere that they’re enforcing this limit

digital silo Jan 13, 2025, 6:02 AM

#

lament yacht I think we need to chat with DeepSeek upstream, we're following their docs: http...

They’re increased input price of deepseek chat but decreased output price from 0.27 to 0.14

#

This is good

rustic sundial Jan 13, 2025, 6:35 AM

#

digital silo They’re increased input price of deepseek chat but decreased output price from 0...

You misread the docs. $0.27 is the "regular" input price that will kick in next month. Current input is "promotional" $0.14.

sonic plume Jan 13, 2025, 7:09 AM

#

#

~~They replied about the hanging problem in the Chinese channel but not in the English channel 😂~~
They just replied in the English channel

#

Even though it's mainly about Cline, basically they just suggest against using it with longer contexts

#

"Our team is actively optimizing calculation and server load management to improve the overall performance and stability."

#

Basically canned response

#

Also they recommended using other providers when the context is longer. Oh god, if only they know Fireworks often returns rubbish

#

And no mention about fixing their documentation

digital silo Jan 13, 2025, 7:20 AM

#

rustic sundial You misread the docs. $0.27 is the "regular" input price that will kick in next ...

Thank you

hoary pumice Jan 13, 2025, 6:40 PM

#

sonic plume

I think its mostly they are running out of hardware to scale further

red fossil Jan 13, 2025, 6:42 PM

#

is there any caching with fireworks?

bold burrow Jan 13, 2025, 6:56 PM

#

hoary pumice I think its mostly they are running out of hardware to scale further

this is totally ok

#

my issue with them is like

#

when DeepInfra eats shit and gets messed up, and can't serve a model, it's clear, either their accepted context lowers and we're all aware, or the provider just turns red and we're all still aware

upbeat aurora Jan 13, 2025, 7:08 PM

#

the issue is its not down or red because under 8 to 10k still works and you cant derank because then ppl will complain about costs

pure canopy Jan 13, 2025, 7:17 PM

#

long context does still work sometimes tho, its not like a blanket ban on anything over 10k

pure canopy Jan 13, 2025, 7:26 PM

#

hoary pumice I think its mostly they are running out of hardware to scale further

well, they did literally say on their discord "use other providers" (with Cline specifically because of long ctx spam) so it does look that way. Can they not elastic scale using GPUs from like bytedancers or tencent or others?

#

Does anyone know if Together is hosting the model fp8 (i,e, as huggingface), or did they quantize it?

bold burrow Jan 13, 2025, 8:37 PM

#

Other providers perform like ass

#

That’s my issue

onyx flax Jan 13, 2025, 10:29 PM

#

I've used this and Claude and Claude is faster than Deepseek on Cline. So IDK why Deepseek is slow or just hangs up. Frustrating

digital silo Jan 14, 2025, 8:18 AM

#

bold burrow Other providers perform like ass

😅

upbeat aurora Jan 14, 2025, 8:40 AM

#

is any provider working for anyone? all providers just load forever.

strong kraken Jan 14, 2025, 9:11 AM

#

upbeat aurora is any provider working for anyone? all providers just load forever.

yep

exotic sun Jan 14, 2025, 6:01 PM

#

I had to add deepinfra and together to my block list because I aint paying $1.25 for deepseek which defeats the purpose of deepseek

covert pawn Jan 15, 2025, 7:54 AM

#

So now DeepSeek is FP8 and with 10K context?

sonic plume Jan 15, 2025, 7:59 AM

#

11K worked yesterday for me

#

Don't know if it will change again due to heavy loads

covert pawn Jan 15, 2025, 8:02 AM

#

In short, for RP it is better to go back to Sonnet and Wizard. Thank you

fossil granite Jan 15, 2025, 9:09 AM

#

What happened to the 'Fireworks' provider btw?

dusty fable Jan 15, 2025, 9:44 AM

#

fossil granite What happened to the 'Fireworks' provider btw?

#general message advice from staff

hybrid plover Jan 15, 2025, 10:01 AM

#

dusty fable https://discord.com/channels/1091220969173028894/1094454198688546826/13286069253...

Yeah, sadly. Despite the problems, Fireworks was like the best in terms of quality/price ratio. I hope they'll return eventually.

fossil granite Jan 15, 2025, 5:31 PM

#

looks like NovitaAI joined the race, taking place of Fireworks in some ways, let's see how it goes!

#

the main problem and this seem to be a issue for long now, no matter the params you set, the providers seem to tune the models in their own ways such that each provider acts a little different than the other. which disturbs the overall flow 😦

exotic sun Jan 15, 2025, 6:34 PM

#

this was also noticeable with qwq, some providers did not work well with it

hybrid plover Jan 16, 2025, 3:45 AM

#

fossil granite looks like NovitaAI joined the race, taking place of Fireworks in some ways, let...

The latency problem with Together and NovitaAI are crazy

unborn hare Jan 16, 2025, 4:47 AM

#

How’s the privacy policy of Together and NovitaAI compare to Fireworks? Their ToS seems pretty similar at first blush

fossil granite Jan 16, 2025, 6:33 AM

#

hybrid plover The latency problem with Together and NovitaAI are crazy

and novitaai have now cut down their output as well!

hybrid plover Jan 16, 2025, 6:34 AM

#

Oof, that's rough.

hoary pumice Jan 16, 2025, 9:31 AM

#

looks like it died first

hybrid plover Jan 16, 2025, 9:33 AM

#

Yeah, looks like a lot of providers are struggling with serving such a gigantic model

fossil granite Jan 16, 2025, 1:29 PM

#

hybrid plover Yeah, looks like a lot of providers are struggling with serving such a gigantic ...

what's the current next best in terms of coding usecase?

upbeat aurora Jan 16, 2025, 1:31 PM

#

3.5 sonnet

hybrid plover Jan 16, 2025, 9:53 PM

#

True

#

Although I prefer o1, but it's test-time compute based and really quite expensive.

cedar wolf Jan 17, 2025, 11:44 AM

#

DeepSeek provider seems like the ONLY good option for generating a story. Every other provider just gives a load of rubbish.

#

I've never experienced this with any other model.

hybrid plover Jan 17, 2025, 12:27 PM

#

Idk, for me personally, Fireworks or Together are the only ones that work for creative writing.

cedar wolf Jan 17, 2025, 12:43 PM

#

hybrid plover Idk, for me personally, Fireworks or Together are the only ones that work for cr...

What temp do you use for those two?

hybrid plover Jan 17, 2025, 12:43 PM

#

Around 1

cedar wolf Jan 17, 2025, 12:54 PM

#

hybrid plover Around 1

do you touch the penalties?

hybrid plover Jan 17, 2025, 12:55 PM

#

Yeah, i crank some quite high too. Between 0.5 and 0.8

mild oxide Jan 18, 2025, 7:42 AM

#

cedar wolf DeepSeek provider seems like the ONLY good option for generating a story. Every ...

Really? I'm experiencing the opposite. I thought the model was bad for creative writing, but recently I got routed to deepinfra when the official one is down, and it's vastly better. Disabling fallback, the quality degrades severely again. Still experimenting but my guesses are: 1. Official API might be using a very short max input. 2. It uses a different method for censorship, maybe using logit bias or filtering the input with regex. So it still gives output unlike GPT, but the quality will be extremely bad. Currently the official API is very unstable so I haven't tested too much.

hybrid plover Jan 18, 2025, 7:45 AM

#

Yeah, i think official API uses some extreme optimization options for cost cutting which causes the performance to degrade too.

cedar wolf Jan 18, 2025, 7:49 AM

#

mild oxide Really? I'm experiencing the opposite. I thought the model was bad for creative ...

It's because I've been setting the temp too high for other providers. I'll try again at some point.

mild oxide Jan 18, 2025, 8:53 AM

#

cedar wolf It's because I've been setting the temp too high for other providers. I'll try a...

Were you using Novita? I find that provider reacts to temp way more than official. The deepseek docs recommend 1.3-1.5 for creative writing, but novita can only generate nonsense at as low as 1.1 . Other providers are also more sensitive to temp than official, but not nearly as much as novita. It also likes to add some comment, like an author's note, at the beginning of its response. None of the other providers does that.
I'm not an expert but this difference between providers really seem strange to me. I thought they should produce mostly the same results, but in deepseek's case, they are very different.

mild oxide Jan 18, 2025, 9:12 AM

#

Overall Notiva's output is super weird. Not only it adds "author's notes", it also ignores my request to summarize the story (so that I can test whether it cuts context or filter the input). Suspecting the provider's honesty, I asked it which of 9.11 and 9.9 is larger, and it got it completely wrong. Not only the result was wrong, it also used a completely different format. Other providers, include official, ALWAYS starts with "To determine..., let's compare them step by step", and then lists out the steps. Novita doesn't do this, so I'm starting to doubt whether it's even actually deepseek.

cedar wolf Jan 18, 2025, 9:16 AM

#

It's why I've got all the providers except DeepSeek blocked via OR settings. I'll switch to DeepInfra once I'm done with MiniMax.

upbeat aurora Jan 18, 2025, 10:52 AM

#

Responses also seem different if I use my own deepseek key or BYOK compared to through openrouter. I wonder if they did something to censor openrouter key

cedar wolf Jan 18, 2025, 5:49 PM

#

Just started using Together with temp 1.1, rep penalty 0.5, freq penalty 0.7. The difference is staggering.

hybrid plover Jan 18, 2025, 6:56 PM

#

True

covert pawn Jan 18, 2025, 11:58 PM

#

cedar wolf Just started using Together with temp 1.1, rep penalty 0.5, freq penalty 0.7. Th...

Rep penalty 0.5 to me reports that it is an invalid value, minimum must be 1.
Are you using text completion or chat completion?

cedar wolf Jan 19, 2025, 7:28 AM

#

covert pawn Rep penalty 0.5 to me reports that it is an invalid value, minimum must be 1. Ar...

Then use 1.5

covert pawn Jan 19, 2025, 7:44 AM

#

cedar wolf Then use 1.5

Like this?

cedar wolf Jan 19, 2025, 8:01 AM

#

Yes

covert pawn Jan 19, 2025, 8:08 AM

#

Then I still have repetitions, ouch!!!
And I also tried Presence Penality = 0.7

And repetitions have a standard behaviour, they always appear when a speech or scene is prolonged.
If you have a faster pace they do not appear.

cedar wolf Jan 19, 2025, 8:26 AM

#

Are you sure you're using Together as the provider?

covert pawn Jan 19, 2025, 8:30 AM

#

Yes.
If I prolong the scene there are always repetitions, not striking but it recomposes the sentences with the same terms.
Much less than before but using Sonnet, Wizard and Hermes 3 you realise that it cannot be like them, without repeating itself.

cedar wolf Jan 19, 2025, 8:42 AM

#

I get it. Sadly other models are either expensive, moderated or trained on 50 shades of grey. I have a bot I always go back to. It involves a guy with a secret identity with an online account to contact User. While other models struggle with the concept, DeepSeek just gets it. And it always seem to follow the prompt to the dot while other models seem to derail after a while.

covert pawn Jan 19, 2025, 8:45 AM

#

You're right, I think the models I mentioned are much more nuanced on the sexual side and that prevents repetitions.
I too find DeepSeek a very good model for some of my SFW cards, but when I use the NSFW ones the repetitions always come back, as if at some point in the scene his vocabulary of phrases runs out or is otherwise more limited than other LLM models.

hybrid plover Jan 19, 2025, 8:55 AM

#

Why are you using Text completion instead of the Chat one?

cedar wolf Jan 19, 2025, 9:07 AM

#

I use text because the website won't let me choose.

hybrid plover Jan 19, 2025, 9:09 AM

#

I see

atomic patrol Jan 19, 2025, 9:11 AM

#

The deepseek provider has been really slow lately, or is it just me?

hybrid plover Jan 19, 2025, 9:13 AM

#

Yeah, a lot of providers are struggling with running Deepseek v3 adequately

cedar wolf Jan 19, 2025, 9:22 AM

#

hybrid plover I see

actually I probably don't know if I can or know how to do it on OR, this stuff goes way over my head

cedar wolf Jan 19, 2025, 9:22 AM

#

atomic patrol The deepseek provider has been really slow lately, or is it just me?

it's always been like that for me, try other providers

atomic patrol Jan 19, 2025, 9:25 AM

#

cedar wolf it's always been like that for me, try other providers

I tried other providers, though they work, their responses are very different. Though right now my temps are 1.8-2 for creative writing

cedar wolf Jan 19, 2025, 9:27 AM

#

atomic patrol I tried other providers, though they work, their responses are very different. T...

Like I mentioned before, temp should be much lower - close to 1 - for other providers like Together. #1321401252588032010 message

plucky drum Jan 19, 2025, 11:07 AM

#

hybrid plover Yeah, a lot of providers are struggling with running Deepseek v3 adequately

It was on a roll yesterday for me. Shame it shit the bed today.

atomic patrol Jan 19, 2025, 1:26 PM

#

it was laggy yesterday for me as well, but a bit more reliable than today

#

i feel like by the time it returns, the promo price will end already, which will suck

rugged terrace Jan 19, 2025, 1:51 PM

#

atomic patrol i feel like by the time it returns, the promo price will end already, which will...

apparently they're limiting deepseek's official openrouter connection because of just too many concurrent requests, and if you byok through openrouter or just use their api directly it's fine

#

i did want to just top up $2 but i genuinely won't be using deepseek enough before the promo price ends to bother with it lol

atomic patrol Jan 19, 2025, 1:53 PM

#

do i have to top up on deekseek directly too or can i just use my OR credits?

rugged terrace Jan 19, 2025, 2:06 PM

#

atomic patrol do i have to top up on deekseek directly too or can i just use my OR credits?

deepseek directly unfortunately

atomic patrol Jan 19, 2025, 2:06 PM

#

damn

#

who tf is using deepseek so much that it's rate limiting or

rugged terrace Jan 19, 2025, 2:07 PM

#

there's probably atleast 500 people right now generating complete automated slop using deepseek on openrouter

#

it's too cheap to not mess around with

atomic patrol Jan 19, 2025, 2:08 PM

#

they gotta ruin all the fun smh

rugged terrace Jan 19, 2025, 2:08 PM

#

my first day my cline went wack and made over 2500 files when i was experimenting with a simple refactor

#

if you could put $2 on deepseek directly it works well, you can use the credits for the rumoured r1 or r1-lite launch soon too :)

#

i wish openrouter wasn't limited though

atomic patrol Jan 19, 2025, 2:09 PM

#

what is the r1/lite?

rugged terrace Jan 19, 2025, 2:09 PM

#

deepseek's reasoning model, o1 equivalent

atomic patrol Jan 19, 2025, 2:14 PM

#

rugged terrace deepseek's reasoning model, o1 equivalent

what will be the differences between the main one?

cyan sail Jan 19, 2025, 3:31 PM

#

atomic patrol The deepseek provider has been really slow lately, or is it just me?

Completely unusable for me

sage orchid Jan 19, 2025, 4:50 PM

#

Same, Deepseek provider doesn't even work for me. I'm thinking this will remain until their promotional price ends. Their deal is too good so it's flooded

The original price is still worth it though imo

velvet egret Jan 19, 2025, 4:50 PM

#

sage orchid Same, Deepseek provider doesn't even work for me. I'm thinking this will remain ...

Use BYOK. It's so much faster.

sage orchid Jan 19, 2025, 4:54 PM

#

velvet egret Use BYOK. It's so much faster.

I just looked it up, what's that? Is that running it on your PC or like through OpenRouter?

velvet egret Jan 19, 2025, 4:55 PM

#

sage orchid I just looked it up, what's that? Is that running it on your PC or like through ...

It's Bring Your Own Key. An OpenRouter features that lets you bring your own provider API Key and use it on top of OpenRouter

#

So it can act as primary or fallback

atomic patrol Jan 19, 2025, 4:56 PM

#

damn gotta top up on a different site too on top of it

sage orchid Jan 19, 2025, 4:58 PM

#

lol yeah. thanks for letting me know though. I'll try it, deepseek roleplay is too good

velvet egret Jan 19, 2025, 5:28 PM

#

atomic patrol damn gotta top up on a different site too on top of it

Yes. OR still takes 5% of it. But i think it's still worth it. Even you can make it just as fallback not primary.

dusty fable Jan 20, 2025, 2:18 AM

#

the top still shows 64K, but Together now has 131K. Fireworks just released theirs at 131K too

rustic sundial Jan 20, 2025, 6:55 AM

#

R1 Zero weights released 13 minutes ago.
https://huggingface.co/deepseek-ai/DeepSeek-R1-Zero/tree/main

#

R1
https://huggingface.co/deepseek-ai/DeepSeek-R1/tree/main

#

No model card or report atm. Same size as V3.

hybrid plover Jan 20, 2025, 7:12 AM

#

Wait, R1 Zero?

#

Is that the full R1?

#

Or version before r1-preview?

rustic sundial Jan 20, 2025, 7:12 AM

#

Nobody knows yet. Files just appeared, no information yet.

#

But it should be better than V3.

hybrid plover Jan 20, 2025, 7:13 AM

#

Yeah, fair enough, since V3 was distilled from r1

hybrid plover Jan 20, 2025, 7:13 AM

#

rustic sundial Nobody knows yet. Files just appeared, no information yet.

Similar to how v3 was released initially, so we will probably get the README in a few hours

velvet egret Jan 20, 2025, 7:40 AM

#

rustic sundial No model card or report atm. Same size as V3.

insane size 685B parameters.

lunar flume Jan 20, 2025, 7:43 AM

#

deepseek just on a roll lately, hope it's something big

hybrid plover Jan 20, 2025, 7:57 AM

#

I expect it to be at least close to o1-mini performance or a little below that

lunar flume Jan 20, 2025, 8:25 AM

#

hybrid plover I expect it to be at least close to o1-mini performance or a little below that

it destroys o1-mini, on par with o1-medium
https://fixupx.com/StringChaos/status/1880317308515897761?mx=2

FxTwitter / FixupX

💬 18 🔁 86 ❤️ 625 👁️ 138.1K

Naman Jain (@StringChaos)

DeepSeek-R1 (Preview) Results 🔥

We worked with the @deepseek_ai team to evaluate R1 Preview models on LiveCodeBench.

The model performs in the vicinity of o1-Medium providing SOTA reasoning performance! Huge kudos to the team and I'm looking forward to the full release!!

/1

hybrid plover Jan 20, 2025, 8:43 AM

#

Interesting

hybrid plover Jan 20, 2025, 8:45 AM

#

rustic sundial R1 Zero weights released 13 minutes ago. <https://huggingface.co/deepseek-ai/Dee...

This is a different model to preview tho. Plus, on Deepseek chat website it wasn't performing as good as this benchmark claims, but this is just my personal tests and observations.

odd quarry Jan 20, 2025, 8:51 AM

#

https://discord.com/channels/1091220969173028894/1330820209812050002

bold burrow Jan 20, 2025, 4:09 PM

#

I think DeepSeek V3

#

is down lmao

bold burrow Jan 20, 2025, 5:05 PM

#

we back

unique shuttle Jan 20, 2025, 5:11 PM

#

bold burrow we back

I'm so excited to try it out!

sand palm Jan 21, 2025, 12:08 PM

#

Do you feel Deepseek v3 has the ability to properly adapt in multi-turn? Like, It doesn't repeat the same mistake if I give it the tiniest amount of feedback on its math. It's almost humble :>

wary wing Jan 21, 2025, 12:12 PM

#

sand palm Do you feel Deepseek v3 has the ability to properly adapt in multi-turn? Like, I...

What do you mean "multi-turn"? Is it when you ask it multiple follow up questions?

I tested this and it seems fine to me. Is this in the context of RP?

sand palm Jan 21, 2025, 12:15 PM

#

wary wing What do you mean "multi-turn"? Is it when you ask it multiple follow up question...

I mean it is good at multi-turn, indicated by the :>, :>

wary wing Jan 21, 2025, 12:19 PM

#

sand palm I mean it is good at multi-turn, indicated by the :>, :>

Ah

uneven gazelle Jan 21, 2025, 5:30 PM

#

How come we have to use BYOK for lower latency? It seemed to be working fine a few days but now I’m not even getting responses. I’ve switched to BYOK and it works fine now

wary wing Jan 21, 2025, 5:47 PM

#

uneven gazelle How come we have to use BYOK for lower latency? It seemed to be working fine a f...

Good question. @lament yacht what's the technical reason behind this?

lament yacht Jan 21, 2025, 5:48 PM

#

uneven gazelle How come we have to use BYOK for lower latency? It seemed to be working fine a f...

with BYOK you gets your own ratelimit with upstream provider

wary wing Jan 21, 2025, 5:48 PM

#

lament yacht with BYOK you gets your own ratelimit with upstream provider

So openrouter is constrained by rate limits?

lament yacht Jan 21, 2025, 5:49 PM

#

wary wing So openrouter is constrained by rate limits?

Yeah -- we're in contact with them to see if we can increase it

forest sail Jan 21, 2025, 5:57 PM

#

DeepSeek API does NOT constrain user's rate limit. We will try out best to serve every request.
🙈

silent torrent Jan 21, 2025, 6:05 PM

#

There's probably just something like a global queue

crude cliff Jan 21, 2025, 6:33 PM

#

: OPENROUTER PROCESSING

I keep getting this for DeepSeek

#

This took like 65s to complete

#

Is this the same for yall?

#

@lament yacht

trail jasper Jan 21, 2025, 6:41 PM

#

edit: talking about wrong model, my mistake

forest sail Jan 21, 2025, 6:47 PM

#

~~OpenRouter doesn't forward reasoning tokens, and it's setup to generate up to 32K tokens of reasoning tokens. At 10 tokens/s, it could take up to an hour before you see the first non-reasoning output.~~ whoops wrong model

crude cliff Jan 21, 2025, 6:49 PM

#

trail jasper edit: talking about wrong model, my mistake

deepseek-chat

trail jasper Jan 21, 2025, 6:53 PM

#

ah sorry my mistake, should've read the title more carefully 😂 - long day

crude cliff Jan 21, 2025, 7:23 PM

#

this is happening more frequently. any solution to this or something wrong?

uneven gazelle Jan 21, 2025, 7:36 PM

#

crude cliff this is happening more frequently. any solution to this or something wrong?

Yeah I used BYOK and it fixed this

crude cliff Jan 21, 2025, 7:37 PM

#

I don't need to additionaly credits to deepseek right?

uneven gazelle Jan 21, 2025, 7:48 PM

#

crude cliff I don't need to additionaly credits to deepseek right?

You need to yeah

crude cliff Jan 21, 2025, 9:35 PM

#

yikes!

sand palm Jan 22, 2025, 1:18 AM

#

sand palm Do you feel Deepseek v3 has the ability to properly adapt in multi-turn? Like, I...

I retract this statement. This model makes me wanna kms in multiturn.

naive rock Jan 22, 2025, 1:50 AM

#

crude cliff I don't need to additionaly credits to deepseek right?

Do you mean do you also need to have Deepseek credits to use it in OR?

crude cliff Jan 22, 2025, 1:55 AM

#

naive rock Do you mean do you also need to have Deepseek credits to use it in OR?

Same question. Hoping someone from OR confirm this

naive rock Jan 22, 2025, 1:56 AM

#

crude cliff Same question. Hoping someone from OR confirm this

I was asking if that's what you meant. If so, no.

silent torrent Jan 22, 2025, 1:56 AM

#

If you BYOK, you need Deepseek and OR credits

#

BYOK pricing is 5% of the provider pricing deducted from your OR credits

#

For every $1.00 you pay directly to the provider, we'll charge you $.05 in OpenRouter credits.

crude cliff Jan 22, 2025, 1:58 AM

#

and, without BYOK, just using OR credits?

silent torrent Jan 22, 2025, 1:59 AM

#

crude cliff and, without BYOK, just using OR credits?

Standard pricing from whichever provider serves your request

crude cliff Jan 22, 2025, 1:59 AM

#

is the OPENROUTER Processing issue related to BYOK?

silent torrent Jan 22, 2025, 2:00 AM

#

I am not sure, can you see what provider was behind that generation in your activity?

uneven gazelle Jan 24, 2025, 12:49 PM

#

silent torrent I am not sure, can you see what provider was behind that generation in your acti...

How’s it going with OR getting better rate limits from Deepseek?

lunar flume Jan 24, 2025, 6:28 PM

#

It's very strange to me that openrouter has been having so many issues with their connection with deepseek

#

I've been using chat.deepseek.com for several days multiple times a day and the tok/s and ttft is always blazing fast and has not timed out on me once

silent torrent Jan 24, 2025, 6:29 PM

#

granted, you're one user versus our thousands 😅

#

almost 2B tokens through v3 today on OpenRouter

#

& almost 1B on R1

lunar flume Jan 24, 2025, 6:31 PM

#

I meant to post this in the r1 chat (since I've been using r1) but confused myself haha

lunar flume Jan 24, 2025, 6:32 PM

#

silent torrent granted, you're one user versus our thousands 😅

that's fair but I was talking about deepseek's inference capacity in general

#

deepseek the company not the model

naive rock Jan 24, 2025, 9:44 PM

#

Oh my god V3 works in Sillytavern now too 😄

hybrid plover Jan 26, 2025, 8:13 PM

#

Now? Pretty sure it worked since release...

pure canopy Jan 26, 2025, 8:17 PM

#

silent torrent granted, you're one user versus our thousands 😅

did you try telling them that your key(s) are basically representing 1000s of users?

#

otherwise maybe they do not know, and think you are just their biggest superfan, who simply cannot get enough deepseek api spam

naive rock Jan 26, 2025, 9:05 PM

#

hybrid plover Now? Pretty sure it worked since release...

Did not work for me or a decent number of others. Staff confirmed multiple fixes were required on the backend.

#

It was provider dependent

#

But Seek and Infra did not like ST

hybrid plover Jan 26, 2025, 9:06 PM

#

Isn't it still provider-dependent or they fixed DeepSeek provider?

#

I would need to check, i guess

naive rock Jan 26, 2025, 9:12 PM

#

It's fixed

#

In my testing at least. Obviously the DeepSeek provider is getting kind of rocked by requests rn tho

blissful spire Jan 27, 2025, 4:43 AM

#

hi

slow pollen Jan 27, 2025, 7:35 AM

#

I am also curios to see why the DeepSeek app is always blazing fast with both R1 and v3 but it is usually either down or too slow from OR (All providers are ignored but DeepSeek). Does BYOK help with up-time and latency? I would like to see real usage example from someone before topping up some credit to DeepSeek API directly 😄

wary wing Jan 27, 2025, 10:56 AM

#

slow pollen I am also curios to see why the DeepSeek app is always blazing fast with both R1...

Byok helps

#

OpenRouter is being rate limited

slow pollen Jan 27, 2025, 11:02 AM

#

wary wing Byok helps

Oh thanks for letting me know!

Is it noticeable increase? Are you using it that way yourself?

wary wing Jan 27, 2025, 11:04 AM

#

slow pollen Oh thanks for letting me know! Is it noticeable increase? Are you using it tha...

Its a noticable increase for DeepSeek, since OpenRouter is being heavily rate-limited. In all other cases, the increase in speed is negligible.

I'm not using it myself, but I have seen other people see better speeds with BYOK

slow pollen Jan 27, 2025, 11:04 AM

#

I will give it a try, thanks a lot!

rustic sundial Jan 27, 2025, 4:44 PM

#

hug of death'd #1330820209812050002 message

sand palm Jan 27, 2025, 7:52 PM

#

official website no longer works, even without web search enabled. I'm doomed T_T

bold burrow Jan 27, 2025, 9:16 PM

#

I got a question, my gen ID with 0 token response and a timeout tells me that I am not using BYOK, but I am...why?

  "id": 3996858223,
  "generation_id": "gen-1738012106-DvtnKTf72ZruFYOrbaNR",
  "provider_name": "DeepSeek",
  "model": "deepseek/deepseek-chat-v3",
  "app_id": null,
  "streamed": true,
  "cancelled": false,
  "generation_time": 289575,
  "latency": 13767,
  "moderation_latency": null,
  "created_at": "2025-01-27T21:13:29.867706+00:00",
  "tokens_prompt": 1384,
  "tokens_completion": 0,
  "native_tokens_prompt": 1691,
  "native_tokens_completion": 0,
  "native_tokens_reasoning": null,
  "num_media_prompt": null,
  "num_media_completion": null,
  "num_search_results": null,
  "origin": "",
  "usage": 0.0002343726,
  "usage_cache": null,
  "usage_data": -0.0000023674,
  "usage_web": null,
  "provider_responses": [
    {
      "provider_name": "DeepSeek",
      "status": null,
      "latency": 10000
    },
    {
      "provider_name": "DeepSeek",
      "status": 200,
      "latency": 13767
    }
  ],
  "is_byok": false,
  "finish_reason": null,
  "native_finish_reason": null
}```

silent torrent Jan 27, 2025, 9:20 PM

#

bold burrow I got a question, my gen ID with 0 token response and a timeout tells me that I ...

Can you check your key settings (specifically the fallback value) https://openrouter.ai/docs/integrations#automatic-fallback ?

OpenRouter

Integrations | OpenRouter

Bring your own provider keys with OpenRouter

bold burrow Jan 27, 2025, 9:27 PM

#

#

it's there and it's not a fallback

bold burrow Jan 27, 2025, 9:27 PM

#

bold burrow I got a question, my gen ID with 0 token response and a timeout tells me that I ...

noting that this did have an error

lament yacht Jan 27, 2025, 9:28 PM

#

bold burrow I got a question, my gen ID with 0 token response and a timeout tells me that I ...

The 1st one with DeepSeek failed, then the 2nd one fallback to our key and got a 200

#

Still very bad that it's straight up 0 completion tokens...

bold burrow Jan 27, 2025, 9:35 PM

#

yeah it's the deepseek issue from today

#

they're havin a bad time

#

i would like to not be charged for it tho XD

#

tho i unders;tand it's trying times, they're having a bad day

hoary pumice Jan 28, 2025, 1:15 AM

#

bold burrow tho i unders;tand it's trying times, they're having a bad day

they are trying to go on holidays but they can't

#

GPUs getting absolutely hammered from around the world

velvet egret Jan 28, 2025, 10:43 AM

#

https://i.febryan.me/ydtjy.png

#

The speed was beyond amazing, it's not even R1.

#

https://i.febryan.me/wt6xq.png

#

Even though i provided the direct API Key (BYOK), but it was not using it.

upbeat aurora Jan 28, 2025, 10:21 PM

#

interesting 0t/s

silent torrent Jan 28, 2025, 10:25 PM

#

upbeat aurora interesting 0t/s

hmm will check on this

silent torrent Jan 28, 2025, 10:29 PM

#

upbeat aurora interesting 0t/s

patch is being deployed

naive rock Jan 28, 2025, 11:40 PM

#

Yeah I have some wild V3 speeds in my OR activity history

#

Like, Groq speeds

shrewd python Jan 29, 2025, 1:31 AM

#

@silent torrent the situation is getting freaky

silent torrent Jan 29, 2025, 1:36 AM

#

shrewd python <@165587622243074048> the situation is getting freaky

how so? for v3 specifically?

shrewd python Jan 29, 2025, 1:37 AM

#

silent torrent how so? for v3 specifically?

Yeah, NovitaAI is bricked, official DeepSeek is getting cyberattacked, and DeepInfra has to absorb all traffic which sucks.

silent torrent Jan 29, 2025, 1:37 AM

#

ah, yeah. IMO things will stabilize pretty significantly in a few days or a week or so

#

lots of hype and lots of difficulties providing stable inference

shrewd python Jan 29, 2025, 1:43 AM

#

silent torrent ah, yeah. IMO things will stabilize pretty significantly in a few days or a week...

Hopefully, I am betting on more providers trying to do it, plus more distilled models

opal wind Jan 29, 2025, 6:15 AM

#

why deepseek v3 is so extremely slow?

opal wind Jan 29, 2025, 6:16 AM

#

silent torrent lots of hype and lots of difficulties providing stable inference

I hope so, now its extremely slow

lament yacht Jan 29, 2025, 6:17 AM

#

opal wind I hope so, now its extremely slow

Which provider is slow for you?

opal wind Jan 29, 2025, 10:55 AM

#

lament yacht Which provider is slow for you?

how can I choose provider? I can do it in api call?

#

I see

#

seems that everything is slow, all providers

wary wing Jan 29, 2025, 12:32 PM

#

opal wind why deepseek v3 is so extremely slow?

Its so massive. 670b parameters

velvet egret Jan 29, 2025, 1:00 PM

#

opal wind why deepseek v3 is so extremely slow?

LOL. It's hundreds billions params.

tough whale Jan 29, 2025, 1:03 PM

#

Guys it’s not slow because of the size, there are only a few active params during inference. The reason it’s slow is because of the massive model hype and low amount of providers supporting it

#

Along with deepseek getting ddosed and novitaai being broken

shrewd python Jan 29, 2025, 3:40 PM

#

@tough whale someone said the magic word, OpenHands were notified but really NovitaAI has something weird, say didu use it under LiteLLM or direct API?

tough whale Jan 29, 2025, 3:43 PM

#

shrewd python <@445928169350889472> someone said the magic word, OpenHands were notified but r...

I just repeated what you said, I didn't actually test novitaai hah

shrewd python Jan 29, 2025, 3:43 PM

#

Yeah I need to file the report sooner or later XD

pine zodiac Jan 29, 2025, 3:53 PM

#

What's the good replacement for deepseek v3?

tough whale Jan 29, 2025, 3:58 PM

#

pine zodiac What's the good replacement for deepseek v3?

For now I'm back to claude 3.5 sonnet but my wallet is not liking it

#

Also it's painfully lazy

wary wing Jan 29, 2025, 5:21 PM

#

pine zodiac What's the good replacement for deepseek v3?

Try the Qwen models

near fractal Jan 29, 2025, 7:50 PM

#

Wow

naive rock Jan 29, 2025, 11:40 PM

#

pine zodiac What's the good replacement for deepseek v3?

For what? Code?

pine zodiac Jan 30, 2025, 12:02 AM

#

naive rock For what? Code?

Generating exercises for english learners in json format / translation.

naive rock Jan 30, 2025, 12:10 AM

#

pine zodiac Generating exercises for english learners in json format / translation.

Assuming it has to be cheap-ish, Llama 70B 3.3 should be good

#

It's very good at instruction following and strictly adhering to JSON

#

Very linguistic model overall, not finetuned so hard on math and such

tough whale Jan 30, 2025, 12:18 AM

#

Actually I noticed that llama is kinda bad in polish, I get better results from Qwen or models that focus to be multimodal

pine zodiac Jan 30, 2025, 12:19 AM

#

Which version of Qwen (is this version available on OpenRouter?) "models that focus to be multimodal" - any examples? 😄

#

@naive rock tried also Llama 70B 3.3 as you recommened and it knows Polish words better than others, need to test multiple cases, but sounds interesting

#

Curios if openrouter has a default system prompt, because it gave me good responses, but api returns differently

naive rock Jan 30, 2025, 12:57 AM

#

Ah, I should have asked if this was English only instruction or mixed

naive rock Jan 30, 2025, 12:57 AM

#

pine zodiac Curios if openrouter has a default system prompt, because it gave me good respon...

It should show in the OR chat thing. I can check in a bit.

tough whale Jan 30, 2025, 2:16 AM

#

pine zodiac Which version of Qwen (is this version available on OpenRouter?) "models that f...

Sorry, I meant multilingual not multimodal

#

All Qwen versions promise to be strong at multilingual tasks

#

Try Qwen 2.5 72b

tough whale Jan 30, 2025, 2:18 AM

#

pine zodiac Curios if openrouter has a default system prompt, because it gave me good respon...

It shouldn’t have any, it’s probably because of temperature, set it to 0 if you want consistency. For creative tasks like you want you are probably better off with higher temperatures

#

The default (1) is already pretty high tho

summer mural Jan 30, 2025, 6:07 PM

#

is this a common issue right now even on DeepSeek V3? I think I've read about this on R1 @silent torrent

#

0 token completion

silent torrent Jan 30, 2025, 6:09 PM

#

Yes unfortunately

bold burrow Jan 30, 2025, 6:54 PM

#

yeah it's

#

kind of annoying that it charges u

#

like just 429 or smth

clear anchor Jan 30, 2025, 7:07 PM

#

https://openrouter.ai/deepseek/deepseek-chat
Fireworks.ai is missing from the deepseek-v3 provider list.

DeepSeek V3 - API, Providers, Stats

DeepSeek-V3 is the latest model from the DeepSeek team, building upon the instruction following and coding abilities of the previous versions. Pre-trained on nearly 15 trillion tokens, the reported evaluations reveal that the model outperforms other open-source models and rivals leading closed-source models. Run DeepSeek V3 with API

slow pollen Jan 30, 2025, 9:13 PM

#

I don’t know if it was mentioned before but Nebius has v3 as well. Will it be added as provider for v3 as well?

silent torrent Jan 30, 2025, 9:22 PM

#

slow pollen I don’t know if it was mentioned before but Nebius has v3 as well. Will it be ad...

Discussing internally!

silent torrent Jan 30, 2025, 9:28 PM

#

slow pollen I don’t know if it was mentioned before but Nebius has v3 as well. Will it be ad...

Should be live soon :)

slow pollen Jan 30, 2025, 10:35 PM

#

silent torrent Should be live soon :)

By the way, nebius gives error for shortly from time to time saying that you are out of token or sometimes reached rate limit, then it works a few seconds later. Is it also known issue?

silent torrent Jan 30, 2025, 10:37 PM

#

slow pollen By the way, nebius gives error for shortly from time to time saying that you are...

They don't have automatic top up so we briefly ran out of funds there, but we're good to go now. The rate limit is likely to happen if you're routing specifically to them

slow pollen Jan 30, 2025, 10:44 PM

#

silent torrent They don't have automatic top up so we briefly ran out of funds there, but we're...

That’s exactly what I was doing but it is my intention to do so. Trying to keep the cost minimum.

Nevertheless, isn’t a bit weird I got that message in the very second request right after the first one. So it is not after a very long discussion

silent torrent Jan 30, 2025, 10:45 PM

#

Everyone is also using them currently, so the rate limits are kind of shared

slow pollen Jan 30, 2025, 10:46 PM

#

Hmm makes sense. Thank you very much

somber mango Jan 31, 2025, 8:51 PM

#

#

finally starting to get some usable providers which is nice

#

deepseek's endpoint is nice, but way too unreliable

bold burrow Feb 4, 2025, 12:04 PM

#

Nebius just

#

Doesn’t respond properly

#

90% of the time

#

It like jsut chucks out context lmao

cedar wolf Feb 4, 2025, 12:12 PM

#

Those cheap providers either respond with context(V3)/reasoning process(R1), don't respond at all or stop after a few sentences.

bold burrow Feb 4, 2025, 12:36 PM

#

yeee

silent torrent Feb 4, 2025, 3:03 PM

#

bold burrow It like jsut chucks out context lmao

like it reprints context?

bold burrow Feb 4, 2025, 3:53 PM

#

no like it forgets context

silent torrent Feb 4, 2025, 3:56 PM

#

ah

wary wing Feb 4, 2025, 4:26 PM

#

bold burrow no like it forgets context

Is this in the chatroom?

bold burrow Feb 5, 2025, 12:19 AM

#

wary wing Is this in the chatroom?

API, but chattoom too on certain providers

#

Nebius in particular

wary wing Feb 5, 2025, 12:22 AM

#

bold burrow API, but chattoom too on certain providers

For the chatroom, try increasing message limit to max

uneven gazelle Feb 5, 2025, 1:14 AM

#

bold burrow Doesn’t respond properly

Yeah I had an api call route to nebius for some reason and it was awfully slow and then just stopped half way through

formal oriole Feb 10, 2025, 10:34 PM

#

The DeepSeek: DeepSeek V3 (free)/deepseek/deepseek-chat:free has a problem with the provider Targon, the caching creates incoherency in some moments. Like in this situation: In a RP the user goes to a store, and for some reason all the swipes will show the same shopkeeper "Emily, blonde, 20's", I'm not sure if it's a caching problem or the AI just loves that name and age

covert pawn Feb 11, 2025, 12:34 AM

#

formal oriole The ```DeepSeek: DeepSeek V3 (free)/deepseek/deepseek-chat:free``` has a problem...

I honestly don't understand how people manage to do good RP, let alone ERP, with DeepSeek and DeepSeek r1: it's slow, it often crashes, it often goes crazy, sometimes it's smart and other times it looks lobotomized, if you change providers you have to change presets and system prompts that each provider has its own different DeepSeek.
Boh, it may be me stupid and ignorant, but when I do RP and ERP I want to relax, not fight to get a single decent answer.

bold burrow Feb 11, 2025, 2:15 AM

#

bruh

#

Together is like

#

just completely dimentia-ridden

shrewd python Feb 11, 2025, 5:20 AM

#

Really? How so?

cedar wolf Feb 11, 2025, 9:27 AM

#

covert pawn I honestly don't understand how people manage to do good RP, let alone ERP, with...

V3 is too repetitive for RP. R1 is lobotomized in the sense that it often does what it wants and takes it too far. But it's so different from everything else we have at the moment, it's entertaining. Yes, I do have to edit almost every reply to keep it away from responding as me, but everything else is either basic, repetitive or moderated. R1 is like an obstacle course but the rewards are often suprisingly good.

bold burrow Feb 11, 2025, 1:01 PM

#

shrewd python Really? How so?

Like I ask it to remember a number

#

Then next message it doesn’t know the number LMAO

#

Only for certain providers

formal oriole Feb 11, 2025, 1:14 PM

#

covert pawn I honestly don't understand how people manage to do good RP, let alone ERP, with...

It's pretty simple and at the same time it's just "use a good System Prompt and adjust the parameters", it has a 10% chance of messing up a reply but it's pretty good for RP, since it remembers a lot since it's context, R1 is slow but the V3 is faster to give replies, sometimes repeat one or two phrases but it happens to a lot of models

formal oriole Feb 11, 2025, 1:53 PM

#

Yet I forgot to mention that ``The Provider Targor``` sometimes doesn't deliver the complete replies, sometimes it gives cut off replies with half of the reply, not finished

viral trout Feb 12, 2025, 4:12 PM

#

Hi, does all providers support Tools (function call) & structured output? I tried the standard version (not free) but failed, anyone could help me on that?

dusty fable Feb 12, 2025, 10:31 PM

#

viral trout Hi, does all providers support Tools (function call) & structured output? I trie...

it looks like the DeepSeek provider is the only one supporting Tools at this time. https://openrouter.ai/deepseek/deepseek-chat

#

viral trout Feb 12, 2025, 11:48 PM

#

dusty fable it looks like the DeepSeek provider is the only one supporting Tools at this tim...

Got it! Thank you

somber mango Feb 13, 2025, 12:57 AM

#

covert pawn I honestly don't understand how people manage to do good RP, let alone ERP, with...

it was decent on the official endpoint back when it actually worked

vital compass Feb 14, 2025, 7:55 AM

#

I swapped to deepseek api, but sometimes it happens too

#

I migrated to qwen32b coder instead

bold burrow Feb 14, 2025, 5:39 PM

#

@silent torrent

#

Together on DeepSeekV3

#

what hapepneda lmfao

#

👀

#

i do not have a key configured

#

(i never did)

silent torrent Feb 14, 2025, 5:41 PM

#

looking

wise flare Feb 14, 2025, 5:52 PM

#

Fixed now!

delicate mica Feb 14, 2025, 6:12 PM

#

How do deepseek r1's sampling parameters compare on official website & open router? I seem to observe that same prompt gives better reasoning on deepseek.com and on openrouter it's shorter and more superficial. I'm concerned that the default sampling parameters don't match deepseek.com's

serene arch Feb 26, 2025, 9:45 AM

#

DEEPSEEK LOWERS OFF-PEAK API PRICING BY UP TO 75%
hope it will get reflected in OpenRouter pricing
https://x.com/Sino_Market/status/1894682095706128430

CN Wire (@Sino_Market) on X

🇨🇳#BREAKING
DEEPSEEK LOWERS OFF-PEAK API PRICING BY UP TO 75% - STATEMENT
#CHINA #AI #DEEPSEEK
Source:
https://t.co/Us3Q42PFce
https://t.co/Us3Q42PFce

indigo harness Mar 1, 2025, 5:00 AM

#

Finally turns out to be 545% true 🔥
And they just kindly opensourced their moat

atomic patrol Mar 2, 2025, 2:01 PM

#

what's a good temp for deepseek-v3 if I'm using the free version with only the targon provider?
when I'm using the non-free version, i use the official deepseek provider and for that one, i need to crank up the temp to 1.85 for decent results, i'm wondering what temp should i use to match that in the free version with targon

worn horizon Mar 2, 2025, 2:40 PM

#

atomic patrol what's a good temp for deepseek-v3 if I'm using the free version with only the t...

I personally use between 0.8 and 1, but the recommended values are much higher, ranging from 0 for math/code to 1.5 for creative writing: https://api-docs.deepseek.com/quick_start/parameter_settings

atomic patrol Mar 2, 2025, 2:46 PM

#

oh yeah i forgot, i'm using it for RP, and I needed to crank it to 1.9 to have good results

rustic sundial Mar 2, 2025, 4:41 PM

#

The RP recommendation was generally around 1.8. Higher has higher chance of vomit after 400+ tokens of output in a single generation, unless they fixed that. Think I've only used this model for about 2 weeks after release. And we had to fight it with a little bit of freq penalty.

forest sail Mar 7, 2025, 9:00 PM

#

GA on Azure (US reigons) https://techcommunity.microsoft.com/blog/machinelearningblog/announcing-deepseek-v3-on-azure-ai-foundry-and-github/4390438
$1.14/$4.56 1M in/out
($1.25/$5 if you use regional instead of global inferencing)

TECHCOMMUNITY.MICROSOFT.COM

Announcing DeepSeek-V3 on Azure AI Foundry and GitHub | Microsoft C...

We are pleased to announce the availability of DeepSeek-V3 on Azure AI Foundry model catalog with token-based billing. This latest iteration is part of...

silent torrent Mar 7, 2025, 9:01 PM

#

not very competitive pricing

digital silo Mar 10, 2025, 5:03 AM

#

silent torrent not very competitive pricing

Compared to official deepseek pricing it’s much more expensive

tough whale Mar 10, 2025, 6:54 AM

#

digital silo Compared to official deepseek pricing it’s much more expensive

That’s exactly Toven said

dusty fable Mar 10, 2025, 6:58 AM

#

MSFT aren't trying to be cheap, all the governance and procurement fluff they have to do for their government & enterprise clients...means they get those same clients without much competition

digital silo Mar 10, 2025, 10:31 AM

#

tough whale That’s exactly Toven said

Yeah

mossy breach Mar 15, 2025, 2:55 PM

#

I think Chutes(provider) glitched on v3 (Free):

its generating UNREALISTICALLY FAST responses, like 900 tokens per second...

but the responses are all being exactly the same

hybrid plover Mar 22, 2025, 2:55 PM

#

Yeah, probably cached responses.

formal oriole Mar 29, 2025, 2:06 AM

#

So, what's happening with all the free DeepSeek models? I don't know which providers are the ones with the cache thing that makes replies be the same when you swipe/regenerate, but the cache thing is making me question everything. I'm using those models for RP in SillyTavern, and I have no idea about what to do with the cache thing. And I think I'm gonna suggest in the suggestions that the providers but have a tag in the model provider list, something that says "This provider may cache your prompts—learn more in [link]"

summer mural Mar 29, 2025, 7:57 AM

#

formal oriole So, what's happening with all the free DeepSeek models? I don't know which provi...

caching should not affect the outputs at all.
it’s just for processing the inputs faster in theory

#

but you could try blacklisting one provider to see if it fixes the issue

wary wing Apr 6, 2025, 11:36 PM

#

I have proof DeepSeek v3 was trained on the Bee movie script, and it's funny

#

It's logical to think that it was trained on the Bee movie script, but it's still funny.

slate carbon Apr 7, 2025, 5:52 AM

#

wary wing I have proof DeepSeek v3 was trained on the Bee movie script, and it's funny

Show us, enlight us..

wary wing Apr 7, 2025, 10:50 AM

#

slate carbon Show us, enlight us..

I didn't make it output the entire script, but it went on like this for a little bit, and it matched up exactly with the bee movie script

I had to input some of the script, though

naive rock Apr 8, 2025, 8:09 PM

#

The new V3 is surprisingly fun to interact with. Maybe the best personality of any model.

#

Curious how they trained it in. Very playful.

wary wing Apr 8, 2025, 8:50 PM

#

naive rock Curious how they trained it in. Very playful.

better rlhf?

wraith coral Apr 10, 2025, 8:32 AM

#

naive rock The new V3 is surprisingly fun to interact with. Maybe the best personality of a...

I love it rn, asked it for business advice and suggested me shady stuff straight away hahahaha

sonic plume Apr 10, 2025, 8:50 AM

#

Yeah it's super fun

#

I was brainstorming with it and my system instruction doesn't have anything telling it to be casual, or rude, or dictating its tone or whatsoever

#

Yet it's the only model that says "your character won't be betrayed so easily unless there is a damn good reason"

naive rock Apr 10, 2025, 8:52 AM

#

lol

#

I went on a small rant about hating semi-colons and how we should just get rid of them. Every other model leaned heavily toward "Well actually here's why they're still a good idea-" whereas new V3 got playful with it and encouraged me on, squeaking in the counter-arguments as "counterarguments from grammar nerds". It finally ends with:

#

Compromise Proposal

Banish semi-colons except for:

Winking at Grammar Nerds (to acknowledge their pain).

Artistic Use (e.g., pretentious novel titles: "The Rain in Spain; The Lies We Weep").

Otherwise, let the comma and period split the semi-colon’s duties like a divorced couple dividing assets. The world might not end—just get slightly more breathless.

Verdict: Proceed with caution. Or recklessly. Language is a democracy (or should be).

#

Contender for my favorite LLM response of all time. This was in no way instructed, with a bog standard system prompt and temp

wraith coral Apr 10, 2025, 4:41 PM

#

#

Rare deepseek refusal pull, feels like I just found a shiny pokemon lmao

frigid plover Apr 10, 2025, 5:41 PM

#

wraith coral Rare deepseek refusal pull, feels like I just found a shiny pokemon lmao

what did you ask it?

#

it's never refused me

wraith coral Apr 10, 2025, 5:42 PM

#

Just a random task lol, never happened to me before either. Worked after one retry

#DeepSeek V3

Compromise Proposal