DeepSeek V3.1 | OpenRouter | Page 2

celest gull Aug 21, 2025, 4:10 PM

#

How the hell do I enable reasoning in sillytavern?

rocky sedge Aug 21, 2025, 4:11 PM

#

Chat completion - preset - enable reasoning

viscid sigil Aug 21, 2025, 4:11 PM

#

hope so...but why is it so hard for openrouter to give two variants for new deepseek

jolly lintel Aug 21, 2025, 4:11 PM

#

rocky sedge Chat completion - preset - enable reasoning

i dont think presets have a reasoning setting, i just looked earlier

rocky sedge Aug 21, 2025, 4:12 PM

#

jolly lintel i dont think presets have a reasoning setting, i just looked earlier

jolly lintel Aug 21, 2025, 4:12 PM

#

ah i thought you meant openrouter presets

radiant gust Aug 21, 2025, 4:13 PM

#

The reasoner page seems to have been deleted, was it merged?

outer rover Aug 21, 2025, 4:14 PM

#

Yup

radiant gust Aug 21, 2025, 4:15 PM

#

Aight thanks

celest gull Aug 21, 2025, 4:18 PM

#

rocky sedge

This does nothing

#

It's still not reasoning

rocky sedge Aug 21, 2025, 4:19 PM

#

celest gull This does nothing

No reasoning or broken </think> tag inside?

celest gull Aug 21, 2025, 4:20 PM

#

No reasoning at all

#

I guess I should just prefix with <think>?

rocky sedge Aug 21, 2025, 4:22 PM

#

Try it, should work

raw beacon Aug 21, 2025, 4:22 PM

#

reasoning "enabled": true doesn't give any reasoning in API. works on other hybrids

rancid geode Aug 21, 2025, 4:24 PM

#

Oh nice, perfect, haha

#

Thanks you 🙏

celest gull Aug 21, 2025, 4:25 PM

#

Okay you have to change it from "Auto" to "Medium"

#

That seems to work

polar locust Aug 21, 2025, 4:25 PM

#

raw beacon reasoning "enabled": true doesn't give any reasoning in API. works on other hybr...

They probably embedded the reasoning 🥲

raw beacon Aug 21, 2025, 4:28 PM

#

setting reasoning to false behaves same as to true so its not working

celest gull Aug 21, 2025, 4:34 PM

#

How do I prompt/re-orient its reasoning? It seems to completely ignore the format I'm prompting it to follow for its reasoning.

nocturne kelp Aug 21, 2025, 4:38 PM

#

raw beacon reasoning "enabled": true doesn't give any reasoning in API. works on other hybr...

hey - can you share the api shape you're making? all my tests work and have a few others reporting it working

celest gull Aug 21, 2025, 4:42 PM

#

Yeah when I prompt it to format its thinking in a certain way, it ignores it in thinking step and just does whatever and then follows my instructions outside the reasoning step

#

Am I missing something?

raw beacon Aug 21, 2025, 4:42 PM

#

it works now but did not when i said it

celest gull Aug 21, 2025, 4:47 PM

#

Yeah wtf it completely ignores my instructions

#

Is there some sort of format for it?

#

My instructions work for GLM 4.5, but not deepseek v3.1

marble panther Aug 21, 2025, 4:50 PM

#

celest gull My instructions work for GLM 4.5, but not deepseek v3.1

using the instruct not the base model?

celest gull Aug 21, 2025, 4:50 PM

#

Yes

#

GLM 4.5:

#

#

Deepseek v3.1:

#

#

It doesn't even attempt to follow the format

celest gull Aug 21, 2025, 5:09 PM

#

I just tried r1 0528 and it followed the format/instructions perfectly for its thinking step

polar locust Aug 21, 2025, 5:13 PM

#

celest gull I just tried r1 0528 and it followed the format/instructions perfectly for its t...

I believe you do need a specific format, but I still don't know what it is

#

The deepseek chat(app) clearly uses one

celest gull Aug 21, 2025, 5:13 PM

#

#

It's broken on the chat app too

#

Doesn't follow instructions for its thinking step at all

#

Or it partially does but it misunderstands it completely

rocky sedge Aug 21, 2025, 5:17 PM

#

Try messing with Temperature

celest gull Aug 21, 2025, 5:18 PM

#

Doesn't seem to be a temp issue at all

#

It SORT of follows the instructions but doesn't follow it accurately at all

#

It really half-asses it

#

And 50% of the time it ignores it and just does it in the actual reply instead of on the thinking step

#

Extremely disappointing so far

rocky sedge Aug 21, 2025, 5:21 PM

#

What about reasoning effort HIGH or Maximum?

celest gull Aug 21, 2025, 5:23 PM

#

Doesn't do anything

#

Likely has to do with its "Higher thinking efficiency"

#

Aka it's benchmaxxed for that

#

rancid geode Aug 21, 2025, 5:48 PM

#

I dont value the main stream benchmarks a whole lot, but i do like seeing all of the pretty colored bars lined up, deepseek v3.1 looks very strong according to it, and i would agree with that

rocky sedge Aug 21, 2025, 5:49 PM

#

Mmmmm, bars 🤤

storm seal Aug 21, 2025, 5:56 PM

#

celest gull Doesn't follow instructions for its thinking step at all

The only version of DeepSeek that had thinking that followed instructions for its reasoning was R1-lite

#

QwQ also doesn't follow instructions for formatting its reasoning

#

I haven't tested Qwen3 yet though

copper vessel Aug 21, 2025, 5:58 PM

#

Qwen3 is hit or miss. I usu have just been going Gemini Flash

#

And Gemini Pro or Sonnet when I hit a bump with that

copper vessel Aug 21, 2025, 6:01 PM

#

rancid geode I dont value the main stream benchmarks a whole lot, but i do like seeing all of...

IDK what planet qwen3 outbenchmarks sonnet

rancid geode Aug 21, 2025, 6:01 PM

#

oh, i did not know artificial analysis did total cost used to run the bench:

copper vessel Aug 21, 2025, 6:01 PM

#

Oh is cost a factor?

rancid geode Aug 21, 2025, 6:01 PM

#

copper vessel IDK what planet qwen3 outbenchmarks sonnet

Oh qwen3 coder is def not better than sonnet 4

#

imho

copper vessel Aug 21, 2025, 6:02 PM

#

I guess it's the "cost factor" cause it is expensive, but I don't think it scales linearly in terms of functionality

rancid geode Aug 21, 2025, 6:02 PM

#

maybe deepseek is, not sure its a little early for me to feel confident about that

storm seal Aug 21, 2025, 6:02 PM

#

rancid geode oh, i did not know artificial analysis did total cost used to run the bench:

how do they calculate cost?

#

what provider do they use?

copper vessel Aug 21, 2025, 6:02 PM

#

Good point

rancid geode Aug 21, 2025, 6:03 PM

#

idk, i assume official api, but yeah it does play a big role

#

I guess we could use this to calculate cost for any provider;

#

wait thats just output tokens

#

nvm

#

I always thought the output speed metrics on Artificial Analysis were dumb: 1. If it's OS you can use a fast provider. 2. Official APIs fluctuate; sometimes they're fast, like right after release, and other times they're so slow

limber elk Aug 21, 2025, 6:09 PM

#

How to Enable or disable reasoning ?

copper vessel Aug 21, 2025, 6:09 PM

#

The other thing is honestly claude sonnet is so much better than most of the others IMO that "costing less" barely matters. If I spend x5 as much and it builds me something that works in half as much time as something else, or something else just can't get there without breaking it up into minute steps, I'll gladly pay that

copper vessel Aug 21, 2025, 6:10 PM

#

rancid geode I guess we could use this to calculate cost for any provider;

How are they getting claude less than qwen? Suppressing all thinking or something?

celest gull Aug 21, 2025, 6:28 PM

#

With or without thinking?

modest smelt Aug 21, 2025, 6:29 PM

#

celest gull With or without thinking?

without

rancid geode Aug 21, 2025, 6:35 PM

#

copper vessel How are they getting claude less than qwen? Suppressing all thinking or somethin...

You mean token usage? Looks like sonnet 4 is more token efficient. I noticed that in my own use: kimi used 200k tokens to build a project and qwen3 coder used 600k

rancid geode Aug 21, 2025, 6:39 PM

#

rancid geode I guess we could use this to calculate cost for any provider;

but what is going on with MiniMax M1? minimax is a token hog, how does it say it uses the least output tokens, def wrong

rocky sedge Aug 21, 2025, 6:48 PM

#

It is? Only thing I remember about it is being super dry and boring, but handling long context like a pro

wintry jolt Aug 21, 2025, 7:05 PM

#

Just me or is the CoT less fun to read? There used to be so much emotion, i swear.

spice marsh Aug 21, 2025, 7:07 PM

#

I guess, lol

#

Not much more of "wait!" "Aha" "I think I'm doing this wrong", etc

#

It's very structured now, a numbered list of short steps

copper vessel Aug 21, 2025, 7:27 PM

#

Ah tokens. Looked quick thought it was overall cost

#

Is GLM4.5 really that verbose? Havent tried it yet

#

GLM4 seemed fine on webapps

drowsy estuary Aug 21, 2025, 7:33 PM

#

Good lord. Almost 7k tokens just to get the wrong answer to an anti-riddle. How do I turn off thinking in the chatroom?

#

Anti-riddle is, "A goat, who is dressed up as a farmer, is allergic to cabbage, but is wolfing down some other vegetables, before crossing a river. What is the minimum number of trips needed?"

Sonnet and Gemini Pro 2.5 correctly realize that it's not the original riddle and get the answer right (one trip). GPT-5 fails

storm seal Aug 21, 2025, 7:36 PM

#

drowsy estuary Good lord. Almost 7k tokens just to get the wrong answer to an anti-riddle. How ...

is that GPT 5 or DS v3.1?

drowsy estuary Aug 21, 2025, 7:36 PM

#

DS

#

Just mentioned GPT 5 because it also got the answer wrong. It's a trick question for models that tend to answer the riddle they're trained on rather than a variation. It helps test instruction following IMO

#

Sonnet was clever enough to ask clarifying questions before answering

#

Gemini's answer

rancid geode Aug 21, 2025, 7:40 PM

#

drowsy estuary Sonnet was clever enough to ask clarifying questions before answering

Wonder if the system prompt helped with that, I think I remember the prompt saying it can always ask the user for clarification

drowsy estuary Aug 21, 2025, 7:40 PM

#

Deepseek's answer

#

(after 7,000 tokens and 300 seconds of reasoning, lol)

rocky sedge Aug 21, 2025, 7:44 PM

#

Goat being allergic to cabbage so it can't be near it sounds logical

drowsy estuary Aug 21, 2025, 7:47 PM

#

rocky sedge Goat being allergic to cabbage so it can't be near it sounds logical

"A goat, who is dressed up as a farmer, is allergic to cabbage, but is wolfing down some other vegetables, before crossing a river. What is the minimum number of trips needed?"

Technically it doesn't even need to take the cabbage across at all

#

I mean technically there might not even BE any cabbage there, it's just an unrelated fact that the goat is allergic

rocky sedge Aug 21, 2025, 7:48 PM

#

Technically you don't mention a boat, so it can swim, wait for a cruise ship or whatever.

#

Call for an Uber through a bridge

drowsy estuary Aug 21, 2025, 7:49 PM

#

Yeah, I would accept those as creative valid answers

rocky sedge Aug 21, 2025, 7:50 PM

#

Make them write a joke starting with

" A wolf, a goat, a farmer and a cabbage walk into a bar" and evaluate the results

marble panther Aug 21, 2025, 7:55 PM

#

curious how it'll rank on https://eqbench.com/spiral-bench.html
r1 got the lowest score, I think bench creator called it "psycho"

rocky sedge Aug 21, 2025, 7:57 PM

#

A wolf, a goat, a farmer and a cabbage walk into a bar.

The bartender looks up and sighs.
“Let me guess—you need a boat to get everyone safely across the street to the late-night diner?”

The farmer nods.
“Exactly. But if I leave the wolf alone with the goat, he’ll eat him. If I leave the goat alone with the cabbage, he’ll eat it. And if I leave the cabbage alone with the wolf, they’ll start a weird low-carb diet together and I’ll lose my entire business model.”

The bartender shrugs, pours four waters, and pushes a coaster across the bar.
“Plan your seating order on this. First round’s on me if you can solve it before last call.”

The wolf growls, “I just wanted a Bloody Mary—no garnish.”
The goat bleats, “House salad, hold the croutons.”
The cabbage rustles, “Do you have any vegan wings?”

The bartender squints at the farmer.
“Buddy, you’ve got bigger problems than river logic—you’ve got a talking salad.”

marble panther Aug 21, 2025, 8:00 PM

#

how are you guys getting it to think over api?

```
'deepseek/deepseek-chat-v3.1': {
  provider: {
    order: [
      'fireworks',
      'parasail',
      'lambda',
      'deepinfra'
    ],
    sort: 'throughput',
    allow_fallbacks: false
  },
  reasoning: {
    enabled: true,
    exclude: true,
    effort: 'low'
  }
},
```
but it has no reasoning completion tokens, it just responds instantly

storm seal Aug 21, 2025, 8:01 PM

#

marble panther how are you guys getting it to think over api? ```'deepseek/deepseek-chat-v3.1':...

use the model slug deepseek/deepseek-chat-v3.1:thinking

rocky sedge Aug 21, 2025, 8:01 PM

#

Comment out exclude and effort and try without them

marble panther Aug 21, 2025, 8:01 PM

#

but why though lol

marble panther Aug 21, 2025, 8:01 PM

#

rocky sedge Comment out exclude and effort and try without them

I did, only added it when it didn't work first time

rancid geode Aug 21, 2025, 8:01 PM

#

storm seal use the model slug `deepseek/deepseek-chat-v3.1:thinking`

How long until thats deprecated tho

storm seal Aug 21, 2025, 8:02 PM

#

wont do anything

storm seal Aug 21, 2025, 8:02 PM

#

rancid geode How long until thats deprecated tho

idk

marble panther Aug 21, 2025, 8:02 PM

#

storm seal use the model slug `deepseek/deepseek-chat-v3.1:thinking`

you sure?

storm seal Aug 21, 2025, 8:02 PM

#

marble panther you sure?

yes im sure

marble panther Aug 21, 2025, 8:02 PM

#

seems odd, like why doesn't it work with regular parameters

#

here's hoping we get some non thinking providers that will offer a discount or something..

#

v3 was kinda dumb but the pricing was right for some applications

rancid geode Aug 21, 2025, 8:03 PM

#

#1407376333444616272 message

storm seal Aug 21, 2025, 8:03 PM

#

rancid geode https://discord.com/channels/1091220969173028894/1407376333444616272/14081156117...

Whoops

#

:thinking got deprecated

marble panther Aug 21, 2025, 8:05 PM

#

OpenRouter API error: No endpoints found for deepseek/deepseek-chat-v3.1:thinking.

#

okay I give up, non thinking it is

rocky sedge Aug 21, 2025, 8:05 PM

#

Try the DeepSeek provider explicitly, I suspect some providers might not be compatible with thinking

rocky sedge Aug 21, 2025, 8:06 PM

#

marble panther how are you guys getting it to think over api? ```'deepseek/deepseek-chat-v3.1':...

Here

marble panther Aug 21, 2025, 8:06 PM

#

rocky sedge Try DeepSeek provider explicitly, I suspect some providers could be not compati...

most likely the case yeah

#

but doesn't fit my provider settings.. excludes loggers and trainers

rancid geode Aug 21, 2025, 8:12 PM

#

marble panther but doesn't fit my provider settings.. excludes loggers and trainers

Prob just need to wait a little, I’m sure that providers will take a few days before they are all configured correctly

glacial reef Aug 21, 2025, 8:20 PM

#

@nocturne kelp why is there 2 deepseeks?

jolly lintel Aug 21, 2025, 8:24 PM

#

glacial reef @nocturne kelp why is there 2 deepseeks?

they use 2 different endpoints for reasoning & chat

glacial reef Aug 21, 2025, 8:25 PM

#

But for the other providers, how do you determine if it thinks or not?

nocturne kelp Aug 21, 2025, 8:34 PM

#

glacial reef But for the other providers, how do you determine if it thinks or not?

reasoning.enabled over the API

#

the reason there are two is so that if you request a large max_tokens with reasoning you can get it

#

the non-thinking deepseek provider endpoint is only 8k output

#

and the thinking endpoint is 64k

#

they are both the same thing. but it just allows our system to have both max outputs
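
A minimal sketch of the request body being discussed here, assuming OpenRouter's documented `reasoning` object on the chat completions endpoint. The helper name `build_request` and the prompt are made up for illustration, and nothing is actually sent.

```python
import json


def build_request(prompt, thinking, max_tokens=None):
    """Build a chat-completions body for deepseek/deepseek-chat-v3.1.

    `build_request` is a hypothetical helper, not part of any SDK.
    """
    body = {
        "model": "deepseek/deepseek-chat-v3.1",
        "messages": [{"role": "user", "content": prompt}],
        # "reasoning.enabled" in the chat means this nested object.
        "reasoning": {"enabled": thinking},
    }
    if max_tokens is not None:
        # A large value here is what the 64k thinking endpoint is for,
        # according to the messages above.
        body["max_tokens"] = max_tokens
    return body


print(json.dumps(build_request("Hello", thinking=True, max_tokens=32000), indent=2))
```

The body would be POSTed as JSON to `https://openrouter.ai/api/v1/chat/completions` with a bearer token. Note that adding `"exclude": true` to the `reasoning` object asks OpenRouter to leave the reasoning text out of the response, so it is best omitted while debugging.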

tardy frost Aug 21, 2025, 8:43 PM

#

did instruct drop

rancid geode Aug 21, 2025, 8:44 PM

#

Yep

tardy frost Aug 21, 2025, 8:44 PM

#

rancid geode Yep

shocking

#

is it hybrid?

ivory epoch Aug 21, 2025, 8:44 PM

#

yep

tardy frost Aug 21, 2025, 8:45 PM

#

is it marked as deepseek-chat in deepseek api

jolly lintel Aug 21, 2025, 8:45 PM

#

deepseek-chat for non reasoning and deepseek-reasoner for reasoning

#

afaik

rancid geode Aug 21, 2025, 8:45 PM

#

tardy frost is it marked as deepseek-chat in deepseek api

I think they changed that, or will soon, one sec while I double check

tardy frost Aug 21, 2025, 8:45 PM

#

jolly lintel deepseek-chat for non reasoning and deepseek-reasoner for reasoning

Fr

rancid geode Aug 21, 2025, 8:45 PM

#

ivory epoch Aug 21, 2025, 8:46 PM

#

tardy frost Fr

and both will have the same price soon enough

jolly lintel Aug 21, 2025, 8:46 PM

#

their official docs say this too

rancid geode Aug 21, 2025, 8:46 PM

#

It is/will be DeepSeek-V3.1

tardy frost Aug 21, 2025, 8:47 PM

#

deepseek v3.1 sucks

#

wonder if theres deepseek-v3.1:free already

#

ok there isnt

rancid geode Aug 21, 2025, 8:48 PM

#

tardy frost deepseek v3.1 sucks

What, you just learned it exists?

jolly lintel Aug 21, 2025, 8:48 PM

#

oh my god already so many providers

tardy frost Aug 21, 2025, 8:48 PM

#

rancid geode What, you just learned it exists?

Yes, but...

rancid geode Aug 21, 2025, 8:49 PM

#

From my use in Qwen cli, deepseek v3.1 is very good, comparable to sonnet 4

jaunty basin Aug 21, 2025, 9:03 PM

#

Wonder if Deepseek-reasoner in the official api finally supports temp, top p, etc

tardy frost Aug 21, 2025, 9:07 PM

#

can you force thinking on openrouter by just adding :thinking suffix to it like deepseek/deepseek-chat-v3.1:thinking

wintry jolt Aug 21, 2025, 9:09 PM

#

#

Improvements on reasoning version, slight downgrade on non reasoning it seems.

rancid geode Aug 21, 2025, 9:17 PM

#

tardy frost can you force thinking on openrouter by just adding :thinking suffix to it like ...

No, they had it that way but decided to remove it

#

#1407376333444616272 message

rocky sedge Aug 21, 2025, 9:20 PM

#

wintry jolt

It's always weird how some models have 120k long context better handled than 2-4k

#

Most become worse gradually, but not all

broken cliff Aug 21, 2025, 9:45 PM

#

rocky sedge It's always weird how some models have 120k long context better handled than 2-4k

then we have some models with 1 million context

vapid torrent Aug 21, 2025, 9:48 PM

#

rocky sedge It's always weird how some models have 120k long context better handled than 2-4k

they're probably trained on a bunch of really long samples

tardy frost Aug 21, 2025, 10:01 PM

#

rancid geode No, they had it that way but decided to remove it

reasoning.enabled makes no sense? in the payload orrr?

spice marsh Aug 21, 2025, 10:03 PM

#

À la OpenAI, see https://platform.openai.com/docs/guides/reasoning

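from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
prompt = "Why is the sky blue?"

# OpenAI's Responses API nests reasoning options under "reasoning"
# (it takes "effort" rather than OpenRouter's "enabled" flag).
response = client.responses.create(
    model="gpt-5",
    reasoning={"effort": "medium"},
    input=[
        {
            "role": "user",
            "content": prompt
        }
    ]
)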

rocky sedge Aug 21, 2025, 10:06 PM

#

That's using OpenAI compliant modules, there is also pure API request

spice marsh Aug 21, 2025, 10:10 PM

#

I think the dot indicates JSON nesting

tardy frost Aug 21, 2025, 10:12 PM

#

spice marsh À la OpenAI, see <https://platform.openai.com/docs/guides/reasoning> ```json res...

ah

glacial reef Aug 21, 2025, 10:16 PM

#

is there a recommended temperature?

#

for this model

jolly lintel Aug 21, 2025, 10:19 PM

#

not written anywhere explicitly, but they say they ran benchmarks with "tested multiple times using varying temperature settings to derive robust final results" and their local run example uses 0.7 temp

torpid viper Aug 21, 2025, 10:23 PM

#

Does anybody understand how to apply the tools description? I couldn't find a for-loop over the tools section in the chat template

jolly lintel Aug 21, 2025, 10:25 PM

#

torpid viper Does anybody understand how to apply tools description? Couldnt find in chat tem...

https://huggingface.co/deepseek-ai/DeepSeek-V3.1#toolcall


nocturne kelp Aug 21, 2025, 10:29 PM

#

jolly lintel https://huggingface.co/deepseek-ai/DeepSeek-V3.1#toolcall

wait what the heck

#

why isn't this in the chat template

#

wait nvm

spice marsh Aug 21, 2025, 10:41 PM

#

wintry jolt

I really wish they'd colorize this lol

#

Reasoning models (transcribed with Gemini 2.5 Flash and not double checked)

#

Non-reasoning models (transcribed with Gemini 2.5 Flash and not double checked)

marsh grove Aug 21, 2025, 11:30 PM

#

tardy frost can you force thinking on openrouter by just adding :thinking suffix to it like ...

To activate think mode, set the reasoning parameter to high.
You can find more information in the documentation: https://openrouter.ai/docs/use-cases/reasoning-tokens


#

For example, within aider I do the following:
aider --model openrouter/deepseek/deepseek-chat-v3.1 --reasoning-effort high

upbeat spruce Aug 21, 2025, 11:32 PM

#

It isn't handholding, it is just giving the AI a nudge to get it to the right tone or follow the instructions more closely. You do it once and the AI will keep up the new tone for the rest of the chat. Good AIs like Claude still sometimes hold back too and need that little OOC kick to get all the way there.

wise crystal Aug 21, 2025, 11:54 PM

#

Seems like this new deepseek v3.1 is heavily trained on gemini 2.5 pro data

#

It is plagued with the "Of course" slop now

#

The good side is this model will be good for coding

#

The bad side is this model will be unpleasant for general chatting

#

They should have sanitized non coding training data to preserve the original deepseek vibe

#

Now the model feels as if it is possessed by gemini

storm seal Aug 22, 2025, 12:01 AM

#

wise crystal Now the model feels as if it is possessed by gemini

synthetic data go brrrr

#

I don't want to be too harsh on DS though

#

It's amazing that they're fighting with the top AI labs who have billions in funding while they have only a few hundred million

wise crystal Aug 22, 2025, 12:04 AM

#

I'm pretty sure they realized this problem too

#

Maybe they just caught it a little bit late

drifting hare Aug 22, 2025, 12:05 AM

#

Hi why are my API requests being forwarded to the base model?

#

I am receiving gibberish over the API

#

#

<@&1384697330254610442>

jolly lintel Aug 22, 2025, 12:07 AM

#

drifting hare Hi why are my API requests being forwarded to the base model?

check the activity tab, and make sure you're actually using the normal model and not the base one

drifting hare Aug 22, 2025, 12:07 AM

#

#

Works fine in the chatroom

#

I even checked the request

jolly lintel Aug 22, 2025, 12:08 AM

#

weird

storm seal Aug 22, 2025, 12:09 AM

#

maybe Chutes deployed the wrong model?

nocturne kelp Aug 22, 2025, 12:09 AM

#

is it only happening on chutes?

jolly lintel Aug 22, 2025, 12:09 AM

#

i just tried in chatroom with chutes i didnt have an issue

nocturne kelp Aug 22, 2025, 12:09 AM

#

we don’t forward your requests to other models

#

that’s just not how it works

jolly knot Aug 22, 2025, 12:10 AM

#

jolly lintel i just tried in chatroom with chutes i didnt have an issue

Same

jolly lintel Aug 22, 2025, 12:10 AM

#

could be a template issue maybe?

jolly knot Aug 22, 2025, 12:10 AM

#

drifting hare Works fine in the chatroom

Oh i mean he said that it works on chatroom

jolly lintel Aug 22, 2025, 12:11 AM

#

some edge case like triggered it maybe

spice marsh Aug 22, 2025, 1:26 AM

#

storm seal synthetic data go brrrr

Robin Hood of AI, getting training data from the rich to give to the poor

pure path Aug 22, 2025, 1:31 AM

#

How are prefills working on the providers for everyone else? So far, on Parasail, it's giving me stuff completely unrelated to my prompt (non-prefill works as normal though)

pure path Aug 22, 2025, 1:38 AM

#

pure path How are prefills working on the providers for everyone else? So far, on Parasail...

This is what v3.1 outputs after "key points"

pure path Aug 22, 2025, 1:40 AM

#

pure path This is what v3.1 outputs after "key points"

How my prompt appears in sillytavern

untold badger Aug 22, 2025, 1:44 AM

#

pure path How are prefills working on the providers for everyone else? So far, on Parasail...

Tested, through OR I'm seeing bad responses on ALL providers except DeepSeek. Everything else starts talking about a 2.5kg (or similar) mass at rest.

storm seal Aug 22, 2025, 1:46 AM

#

untold badger Tested, through OR I'm seeing bad responses on ALL providers except DeepSeek. Ev...

Bad implementations :(

untold badger Aug 22, 2025, 1:49 AM

#

vapid torrent Aug 22, 2025, 1:51 AM

#

rancid geode Thanks you 🙏

im just writing a readme, will publish in a few minutes

drifting hare Aug 22, 2025, 2:08 AM

#

nocturne kelp is it only happening on chutes?

no

rancid geode Aug 22, 2025, 2:15 AM

#

vapid torrent im just writing a readme, will publish in a few minutes

Yay : )

vapid torrent Aug 22, 2025, 2:33 AM

#

rancid geode Yay : )

Just fixing a bug

#

will be up very shortly

#

running unit tests

vapid torrent Aug 22, 2025, 2:41 AM

#

rancid geode Yay : )

https://github.com/CrushedAsian255/orproxy

GitHub

GitHub - CrushedAsian255/orproxy: OpenRouter API Proxy

OpenRouter API Proxy. Contribute to CrushedAsian255/orproxy development by creating an account on GitHub.

rancid geode Aug 22, 2025, 2:44 AM

#

vapid torrent https://github.com/CrushedAsian255/orproxy

Thank you so much!

vapid torrent Aug 22, 2025, 2:44 AM

#

to get deepseek v3.1 with thinking its deepseek/deepseek-chat-v3.1$think
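
The idea behind a suffix like `$think` can be sketched as below. This is not orproxy's actual code (its implementation is not shown in this chat); `apply_think_suffix` is a hypothetical helper, and the rule that the suffix turns into `reasoning.enabled = true` is an assumption based on the discussion above.

```python
def apply_think_suffix(body):
    """Return a copy of a request body where a trailing "$think" on the
    model name is stripped and replaced by reasoning.enabled = True.

    Assumed behaviour for illustration only.
    """
    suffix = "$think"
    model = body["model"]
    if not model.endswith(suffix):
        return dict(body)  # nothing to rewrite
    rewritten = dict(body)
    rewritten["model"] = model[: -len(suffix)]
    # Keep any reasoning options the caller already set, then force enabled.
    rewritten["reasoning"] = {**body.get("reasoning", {}), "enabled": True}
    return rewritten


request = {"model": "deepseek/deepseek-chat-v3.1$think", "messages": []}
print(apply_think_suffix(request))
```

Doing the rewrite in a proxy means clients that only let you type a model name can still switch thinking on.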

rancid geode Aug 22, 2025, 2:44 AM

#

that looks easy

viscid sigil Aug 22, 2025, 3:46 AM

#

vapid torrent to get deepseek v3.1 with thinking its `deepseek/deepseek-chat-v3.1$think`

omg you are the best!

rancid geode Aug 22, 2025, 3:56 AM

#

vapid torrent to get deepseek v3.1 with thinking its `deepseek/deepseek-chat-v3.1$think`

Just sat down and got it setup, works great! Super helpful!

viscid sigil Aug 22, 2025, 3:59 AM

#

vapid torrent to get deepseek v3.1 with thinking its `deepseek/deepseek-chat-v3.1$think`

is there also a hosted version too? i dont wanna run it again and again

vapid torrent Aug 22, 2025, 4:29 AM

#

viscid sigil is there also a hosted version too? i dont wanna run it again and again

i would love to but i have no experience with hosting things for the public

#

im worried someone would ddos it and cost me thousands

#

if someone else wants to its MIT licenced 😃

serene idol Aug 22, 2025, 4:54 AM

#

Do i choose the base one or the chat

#

For rp

steel oar Aug 22, 2025, 4:55 AM

#

the chat

#

base one is more like autocomplete

vapid torrent Aug 22, 2025, 4:57 AM

#

base model mainly for research

steel oar Aug 22, 2025, 4:58 AM

#

though V3.1 seems to give really short responses in RP

vapid torrent Aug 22, 2025, 4:58 AM

#

have you tried to change the system prompt?

steel oar Aug 22, 2025, 4:59 AM

#

V3 0324 will give much longer reply to the same prompt... but maybe I need to adjust the prompt for V3.1

vapid torrent Aug 22, 2025, 4:59 AM

#

yeah, different models have different prompting techniques

serene idol Aug 22, 2025, 5:02 AM

#

Which template to use

#

The deepseek one

#

And instruct

vapid torrent Aug 22, 2025, 5:02 AM

#

what do you mean 'template'?

#

The chat template is automatically applied by the /chat/completions endpoint

serene idol Aug 22, 2025, 5:06 AM

#

Oh, I'm using text completion. Am I supposed to use chat?

vapid torrent Aug 22, 2025, 5:07 AM

#

text completion is for base /instruct models, chat completion is for chat models

serene idol Aug 22, 2025, 5:14 AM

#

Ohhh..okay

#

Let me change it

jolly knot Aug 22, 2025, 5:16 AM

#

vapid torrent text completion is for base /instruct models, chat completion is for chat models

what's the difference between chat and instruct models

#

i thought text completion was base and chat was instruct

steel oar Aug 22, 2025, 5:24 AM

#

steel oar V3 0324 will give much longer reply to the same prompt... but maybe I need to ad...

and I am still struggling to make it give a long reply...

serene idol Aug 22, 2025, 6:02 AM

#

steel oar and I am still struggling to make it give a long reply...

Mine keeps repeating the same response

steel oar Aug 22, 2025, 6:06 AM

#

serene idol Mine keeps repeating the same response

which provider did you use?

serene idol Aug 22, 2025, 6:12 AM

#

Gmicloud q

#

I don't know much about providers

#

I choose it because it sounded cool

vapid torrent Aug 22, 2025, 6:14 AM

#

serene idol Gmicloud q

Does silly tavern let you pick a provider or do you have account with them directly? Or did you use the proxy thing

#

I have GMICloud account and they work quite well, never had issue with model performance

serene idol Aug 22, 2025, 6:18 AM

#

vapid torrent Does silly tavern let you pick a provider or do you have account with them direc...

Yeah, SillyTavern lets me pick one, but it depends on whether the model you're using has a lot of providers. DeepSeek has GMICloud as one of the providers, so I'm using that one

steel oar Aug 22, 2025, 6:28 AM

#

just tried to test GMICloud and it throws a 429 rate limit...

viscid sigil Aug 22, 2025, 6:38 AM

#

so if i am using official deepseek api, do i set my temperature to 1.5 for creative writing?

#

are they still following that 0.7 rule

polar locust Aug 22, 2025, 7:18 AM

#

viscid sigil so if i am using official deepseek api, do i set my temperature to 1.5 for creat...

The official uses the normal temp, 3rd-party however needs to follow a certain rule as you said.

outer rover Aug 22, 2025, 7:33 AM

#

Just woke up to find out there is still no free version feelsbadman

steel oar Aug 22, 2025, 7:35 AM

#

viscid sigil are they still following that 0.7 rule

I think that rule is only mentioned in V3-0324's model card, not sure if other DeepSeek models use that rule (for example R1-0528's model card says their temperature setting is 0.6)

steel oar Aug 22, 2025, 7:51 AM

#

Ah it looks like this is the recommended setting? https://huggingface.co/deepseek-ai/DeepSeek-V3.1/blob/main/generation_config.json


#

jolly knot Aug 22, 2025, 8:00 AM

#

I don't think this is better than gpt 5

#

eager lava Aug 22, 2025, 8:27 AM

#

FWIW from very early tests, this one seems even better in Swedish than V3 so Scandinavians, maybe even Euros in general may be interested. More natural prose and flow, kind of like in the Claude direction. Combined with significantly improved coding capabilities and improved tool calling, I might just look into this one as a work horse model. It's no GPT-5 or Claude 4.1 Opus killer or whatever but certainly among the best you get among open models and for great price/performance ratio. Performs well on LiveBench and Artificial Analysis.

storm seal Aug 22, 2025, 11:35 AM

#

vapid torrent if someone else wants to its MIT licenced 😃

Maybe I'll take a swing at hosting it

digital snow Aug 22, 2025, 12:34 PM

#

https://api-docs.deepseek.com/quick_start/parameter_settings

The Temperature Parameter | DeepSeek API Docs

The default value of temperature is 1.0.

#

what curious recs

steel oar Aug 22, 2025, 12:36 PM

#

They haven't updated that page for a while

rocky sedge Aug 22, 2025, 12:39 PM

#

It's possible to test. For me, DeepSeek at a "real" temperature of 1.7 breaks down into a mess of random words and special characters. So if the DeepSeek provider deducts 0.7 from temp, making it 1.0, it should be coherent

steel oar Aug 22, 2025, 12:40 PM

#

steel oar Ah it looks like this is the recommended setting? https://huggingface.co/deepsee...

I think this is their latest recommended temperature (temperature 0.6 top_p 0.95)

vapid torrent Aug 22, 2025, 12:54 PM

#

rocky sedge It's possible to test, for me DeepSeek using "real" Temperature of 1.7 breaks in...

What if you put 0.6 into the model then? Temp -0.1 makes no sense

rocky sedge Aug 22, 2025, 12:54 PM

#

If temp is <1, they make it (temp*0.3)

#

If >1, they make it (temp - 0.7). At least they did that before
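The mapping described in the two messages above can be written out as a small function. This is a minimal sketch under the thread's description only: the name `mapApiTemperature` is hypothetical, the 0 to 2 input range is an assumption, and whether V3.1 still applies this rule is not confirmed here.

```typescript
// Sketch of the API-to-model temperature mapping as described in the thread.
// Assumptions: inputs are limited to 0..2, and an input of exactly 1.0 is
// handled by the scaling branch.
function mapApiTemperature(apiTemp: number): number {
  if (apiTemp < 0 || apiTemp > 2) {
    throw new RangeError("temperature must be between 0 and 2");
  }
  // At or below 1.0 the value is scaled down; above 1.0 it is shifted.
  return apiTemp <= 1 ? apiTemp * 0.3 : apiTemp - 0.7;
}

[0.6, 1.0, 1.3, 1.7, 2.0].forEach((t) => {
  console.log(`api ${t.toFixed(1)} -> model ${mapApiTemperature(t).toFixed(2)}`);
});
```

Both branches give 0.3 at an input of 1.0, so the mapping is continuous there and the choice of branch for exactly 1.0 does not change the result.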

vapid torrent Aug 22, 2025, 12:55 PM

#

Ahh so it’s a piece-wise linear?

#

like a translated leaky ReLU?

steel oar Aug 22, 2025, 12:55 PM

#

rocky sedge If temp is <1, they make it (temp*0.3)

I think this is only for V3-0324

rocky sedge Aug 22, 2025, 12:55 PM

#

vapid torrent like a translated leaky ReLU?

https://tenor.com/view/who-dat-snoop-gif-15116696

rocky sedge Aug 22, 2025, 12:56 PM

#

steel oar I think this is only for V3-0324

Not confirmed, maybe logic is still there

steel oar Aug 22, 2025, 12:56 PM

#

They recommended 0.6 in R1 0528's model card and recommend 0.6 in V3.1's model card

rocky sedge Aug 22, 2025, 12:57 PM

#

Just try and set 1.7 through DeepSeek provider with thinking enabled

#

I can't right now

steel oar Aug 22, 2025, 1:00 PM

#

rocky sedge Just try and set 1.7 through DeepSeek provider with thinking enabled

It may not work because their deepseek-reasoner may not support setting temperature (their docs say so, not sure if V3.1 is different)

rocky sedge Aug 22, 2025, 1:02 PM

#

I remember breaking R1 with high temp, but don't remember what provider it was

steel oar Aug 22, 2025, 1:03 PM

#

I just tried setting temperature to 2 (deepseek official api), and the output is still pretty normal

#

Other providers seem to output nonsense at 2 (seems to be probabilistic, could be normal or nonsensical)

rocky sedge Aug 22, 2025, 1:05 PM

#

Then it probably still does the -0.7 stuff, because at 1.3 it's borderline ok

rocky sedge Aug 22, 2025, 1:05 PM

#

steel oar Other providers seem to output nonsense at 2 (seems to be probabilistic, could b...

Exactly

#

So we can assume they still translate raw temperature values into their internal formula

outer rover Aug 22, 2025, 3:11 PM

#

Did any provider by any chance announce a plan for free version yet?

#

Got no idea how long that usually takes from model release

covert gust Aug 22, 2025, 4:37 PM

#

Tested DeepSeek V3.1:
Hybrid model, that supports light thinking

Non-Thinking:
Same verbosity as V3 0324
Comparatively, smarter overall, but performed noticeably weaker in coding tasks

Thinking:
+125% token use. 64% of tokens were spent on reasoning.
This is very light reasoning, ~45% less verbosity than R1 0528
Compared to non-thinking, the thinking did very little if anything to improve final response quality. In fact, it was mostly even or slightly worse on some tasks.
During evaluation, it reminded me a lot of Sonnet 4 thinking in terms of reasoning token benefits.
Thus, enabling thinking proved highly ineffective in the totality of my testing.

**Chess** performance remained poor (~650 starting Elo), around V3 level.

Overall, compared to V3 0324 this is a small upgrade, except for (non-tool) coding where it's a noticeable downgrade imo. (example demo pages available)
Compared to R1 0528, the model lags behind severely in general intelligence and is not a replacement.

Imo, for general use case, nonthinking DeepSeek V3.1 is a good option.
Overall, I was rather disappointed with the hybrid performance, so I'm not sure it's the right approach - but YMMV
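The two token figures in the report above can be combined into a rough split. This is a sketch under one reading of the numbers, which is an assumption: "+125% token use" is taken to mean total output is 2.25 times the non-thinking baseline, and "64%" is taken as the reasoning share of that total. The function name is hypothetical.

```typescript
// Assumption: thinking-mode total = 2.25x the non-thinking baseline, and
// 64% of that total is reasoning. Both readings are interpretations.
function splitThinkingTokens(
  baselineTokens: number,
): { total: number; reasoning: number; answer: number } {
  const total = baselineTokens * 2.25;
  const reasoning = total * 0.64;
  const answer = total - reasoning;
  return { total, reasoning, answer };
}

const split = splitThinkingTokens(1000);
console.log(split.total, Math.round(split.reasoning), Math.round(split.answer));
```

Under that reading, a 1000-token non-thinking answer corresponds to about 1440 reasoning tokens plus an 810-token visible answer, so the visible answer itself would be somewhat shorter than the non-thinking one.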

rocky sedge Aug 22, 2025, 4:39 PM

#

I never saw a reasoning model doing worse than non-reasoning

covert gust Aug 22, 2025, 4:41 PM

#

rocky sedge I never saw a reasoning model doing worse than non-reasoning

happens sometimes. i saw it a few times. more is not always equal to better. you got context poisoning, overthinking, policy considerations, etc.

rocky sedge Aug 22, 2025, 4:44 PM

#

covert gust happens sometimes. i saw it a few times. more is not always equal to better. you...

I get reasoning might be bad for refusals, but context poisoning? Never happened to me, as <think> tags are omitted from input context.
But there might be a catch - Fiction Livebench scores DeepSeek 3.1 Thinking with ~20% more score/100 compared to Non-Thinking

covert gust Aug 22, 2025, 4:45 PM

#

well thats fine but I don't parrot other scores, I simply share my results. if you don't find them helpful to you, you can simply ignore my own testing and see what fits your own usecase

rocky sedge Aug 22, 2025, 4:45 PM

#

At 120k, it's 62 vs 53, but I don't know how many passes they do

rocky sedge Aug 22, 2025, 4:46 PM

#

covert gust well thats fine but I don't parrot other scores, I simply share my results. if y...

Not a critique, your leaderboard is very helpful

covert gust Aug 22, 2025, 4:49 PM

#

yea, also I can see that livecodebench improved scores (it's in their marketing). However, marketing has zero influence on what I report, which will always be the results achieved regardless of anything else (whether everyone agrees or everyone disagrees, the results would be identical in either case)

rocky sedge Aug 22, 2025, 4:50 PM

#

I was already suspicious about some results from big sites being 'gently nudged' from companies or models just being benchmaxxed

#

A chance of smaller enthusiast / indie author being biased is very low

covert gust Aug 22, 2025, 4:52 PM

#

if anything I would be biased towards positivity. I like deepseek models a lot and we need more open models. this one just didn't do well unfortunately for me

#

maybe R2 will knock it out of the park, who knows

rocky sedge Aug 22, 2025, 5:00 PM

#

Well, that's IF they don't ditch R2 in favor of all-in-one hybrids

covert gust Aug 22, 2025, 5:02 PM

#

rocky sedge Well, that's IF they won't ditch R2 in a favor of all-in-one hybrids

well there seem to be some conflicting approaches. qwen3 went the reverse route, claiming hybrid hinders performance. Personally i prefer individual models for each mode, but if a hybrid is good on its own merit I'll take it as well

rocky sedge Aug 22, 2025, 5:04 PM

#

They probably went with V3.1 Hybrid route before Qwen3 admitted it being a wrong path to develop a model

snow ingot Aug 22, 2025, 5:10 PM

#

Hello
Is it just me, or is reasoning now disabled on 3.1 in OpenRouter?
I see that you can specifically pick deepseek-reasoner only from Deepseek API
but for some reason using 3.1 on OpenRouter no longer returns reasoning for me
pardon if my question is ignorant

covert gust Aug 22, 2025, 5:11 PM

#

snow ingot Hello Is it just me, or is reasoning now disabled on 3.1 in OpenRouter? I see th...

which Provider? works fine on everything default

rancid geode Aug 22, 2025, 5:11 PM

#

snow ingot Hello Is it just me, or is reasoning now disabled on 3.1 in OpenRouter? I see th...

They removed the :think version, you need to enable thinking as an API parameter

jolly knot Aug 22, 2025, 5:20 PM

#

vapid torrent Ahh so it’s a piece-wise linear?

rocky sedge Aug 22, 2025, 5:21 PM

#

2 is maximum iirc

proud comet Aug 22, 2025, 10:52 PM

#

rancid geode They removed the :think version, you need to enable thinking as an API parameter

Is it this?
https://openrouter.ai/docs/api-reference/completion#request.body.reasoning
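For reference, a request body with that parameter might look like the sketch below. The model slug comes from this thread and the `reasoning` object follows OpenRouter's reasoning-tokens docs; the helper name and type names are hypothetical, and whether every provider honours the flag for V3.1 is not confirmed here.

```typescript
// Request-body sketch for OpenRouter's chat completions endpoint.
// Assumption: `reasoning.enabled` is the switch that replaces the old
// ":think" model variant.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  model: string;
  messages: ChatMessage[];
  reasoning: { enabled: boolean };
}

function buildChatRequest(prompt: string, thinking: boolean): ChatRequest {
  return {
    model: "deepseek/deepseek-chat-v3.1",
    messages: [{ role: "user", content: prompt }],
    reasoning: { enabled: thinking },
  };
}

// The body would be POSTed as JSON to
// https://openrouter.ai/api/v1/chat/completions with a Bearer API key.
console.log(JSON.stringify(buildChatRequest("Hello", true), null, 2));
```

Earlier in the thread one user reported that `enabled: true` alone had no effect, and a SillyTavern user only got reasoning after switching effort from "Auto" to "Medium", so setting an `effort` value in the same object may be worth trying if the flag seems to be ignored.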

wintry jolt Aug 23, 2025, 3:18 AM

#

My convo somehow completely broke deepseek on the third message ._.

spice marsh Aug 23, 2025, 3:19 AM

#

Speechless

polar locust Aug 23, 2025, 4:04 AM

#

Baffled

twin cape Aug 23, 2025, 4:29 AM

#

covert gust well thats fine but I don't parrot other scores, I simply share my results. if y...

It's normal to see conflicting results, because top models are all generally very good and close in overall performance. So which model does better can depend on the specific task, prompts, eval setup, etc. Which is again the purpose of the specific evals anyway.

Long gone are the days when Sonnet 3.5 Pareto-dominated all other models.

vapid torrent Aug 23, 2025, 5:02 AM

#

wintry jolt My convo somehow completely broke deepseek on the third message ._.

Reasoning or non reasoning

vapid torrent Aug 23, 2025, 7:33 AM

#

how did you re-enable reasoning?

vapid torrent Aug 23, 2025, 8:48 AM

#

As in what did you try to do to enable reasoning

modern rune Aug 23, 2025, 9:11 AM

#

If you send the messages back to it without adding a new message of your own - or send an empty one, regardless of whatever is in the message chain - it absolutely goes nuts and dumps out huge amounts of random crap. Happens in both reasoning and non-reasoning. Didn't used to happen in previous versions.

If you want to easily replicate this for yourself, you can do it in Openrouter Chat. Exchange a normal message and get a reply, then send a follow up message and get a reply. Edit your follow up message to be empty and regenerate the second response. It's... quite something.
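The message shape from the steps above, plus a client-side guard against sending it, can be sketched as follows. The type and function names are hypothetical and the example message texts are made up for illustration; this only checks the conversation shape and does not call any API.

```typescript
interface Turn {
  role: "system" | "user" | "assistant";
  content: string;
}

// True only when the conversation ends with a non-empty user message.
function endsWithUserMessage(turns: Turn[]): boolean {
  const last = turns[turns.length - 1];
  if (last === undefined) return false;
  return last.role === "user" && last.content.trim().length > 0;
}

// Shape of the reproduction: a normal exchange, then an emptied follow-up.
const reproCase: Turn[] = [
  { role: "user", content: "Hi there" },
  { role: "assistant", content: "Hello! How can I help?" },
  { role: "user", content: "" }, // follow-up edited to be empty, then regenerated
];

console.log(endsWithUserMessage(reproCase));
```

A frontend could run a check like this before each request and refuse to send (or ask the user for input) when it returns false.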

steel oar Aug 23, 2025, 9:20 AM

#

just tried, indeed get nonsense, though deepseek provider gave an empty response instead

twin cape Aug 23, 2025, 10:03 AM

#

modern rune If you send the messages back to it without adding a new message of your own - o...

I think that's expected based on how these models are instruction tuned. It's not a normal pattern of tokens so the model does not generate sensible output.

#

Or it's just a bug

dusk rover Aug 23, 2025, 10:50 AM

#

twin cape It's normal to see conflicting results, because top models are all generally ver...

i agree, I think we're in an era where there isn't much gap between any top large model now.

#

in certain very niche/technical tasks differences may emerge, though I don't think those differences are uniform.

covert violet Aug 23, 2025, 11:49 AM

#

which providers do you guys recommend?

wintry jolt Aug 23, 2025, 3:30 PM

#

vapid torrent Reasoning or non reasoning

It was non-reasoning.

silver nova Aug 23, 2025, 3:33 PM

#

There’s a software on GitHub called Cherry Studio, which lists many providers. At the moment, I think OpenRouter is quite good, while the others are hard to describe.

vapid torrent Aug 23, 2025, 3:39 PM

#

deepseek v3.1's thinking is very strange. it often calls me the assistant

silver nova Aug 23, 2025, 3:41 PM

#

Yes, compared to r1, its programming ability seems improved, but everyday conversations are always quite strange.

twin cape Aug 23, 2025, 4:23 PM

#

covert violet which providers do you guys recommend?

Give it a few days for the providers to fix their initial broken implementations as usual. I'll be testing the providers next week.

twin cape Aug 23, 2025, 4:24 PM

#

silver nova There’s a software on GitHub called Cherry Studio, which lists many providers. A...

Openrouter is just a proxy. We are talking about the actual inference providers behind OpenRouter

https://openrouter.ai/deepseek/deepseek-chat-v3.1

DeepSeek V3.1 - API, Providers, Stats

DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes via prompt templates. Run DeepSeek V3.1 with API

silver nova Aug 23, 2025, 4:30 PM

#

twin cape Openrouter is just a proxy. We are talking about the actual inference providers ...

i see

rancid geode Aug 23, 2025, 6:20 PM

#

vapid torrent deepseek v3.1's thinking is very strange. it often calls me the assistant

What do you use it in when it calls you the assistant? Also btw the proxy you gave me has been soooo nice! : )

hollow steppe Aug 23, 2025, 9:48 PM

#

IRT the hybrid stuff, I find it hard to believe Qwen was wrong here. Surely having reasoning training allows the model to off-load certain task performance structures to thinking tokens. Like if you asked me to solve long division in one-shot vs writing it out, I'm not going to use the same technique.

Yes, I know they don't actually reason, but that's irrelevant when I'm talking about off-loading.

vapid torrent Aug 23, 2025, 11:44 PM

#

rancid geode What do you use it in when it calls you the assistant? Also btw the proxy you ga...

Open webui

#

Official DeepSeek reasoner endpoint

rancid geode Aug 24, 2025, 12:33 AM

#

vapid torrent Open webui

Interesting, that’s weird behavior

hushed hedge Aug 24, 2025, 11:17 AM

#

this model is like so good, it talks like gpt, codes like deepseek

#

all the major models got negative growth after it was released

#

although it's a little expensive to use

vapid torrent Aug 24, 2025, 11:42 AM

#

The model acts very strangely though

#

This is Not the thinking mode it just randomly invented a thinking tag

#

This kind of shenanigans is probably why qwen dropped hybrid models, no other model does this

#

And it’s not a “provider issue” this is official DeepSeek api endpoint

#

It’s also quite sycophantic

#

Okay and now it’s just thinking.

hushed hedge Aug 24, 2025, 11:51 AM

#

vapid torrent This kind of shenanigans is probably why qwen dropped hybrid models, no other mo...

I hated it too, but without thinking mode it kinda acts more like gpt, whereas the last deepseek was more robotic

#

use deepseek v3.1 from open router , this looks different

vapid torrent Aug 24, 2025, 11:53 AM

#

hushed hedge use deepseek v3.1 from open router , this looks different

The front end is open webui if that’s what you mean

#

or do you mean the provider

hushed hedge Aug 24, 2025, 11:54 AM

#

the model name

#

what is that?

vapid torrent Aug 24, 2025, 11:54 AM

#

hushed hedge the model name

DeepSeek official API

#

i renamed it on the front end

#

it points to deepseek-chat on deepseek’s official api

#

Which is v3.1 non thinking

#

I don’t use openrouter on DeepSeek as I want the context caching

hushed hedge Aug 24, 2025, 11:55 AM

#

why you not using chutes?

#

oh

#

ok

#

i am using chutes, like it rarely hallucinates, most of the time it does no thinking at all

#

but even without thinking it can write my programs

vapid torrent Aug 24, 2025, 11:56 AM

#

hushed hedge why you not using chutes?

isn’t chutes the weird free provider?

hushed hedge Aug 24, 2025, 11:56 AM

#

vapid torrent isn’t chutes the weird free provider?

here its not free but cheaper than deepseek

covert gust Aug 24, 2025, 12:00 PM

#

hushed hedge here its not free but cheaper than deepseek

for me DeepSeek API is cheaper, as I mostly use stuff in the evenings, which aligns with the UTC 16:30-00:30 discounted price, which is half price and cheaper than chutes.
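A quick way to check whether a request falls inside that discount window is sketched below. The window (16:30 to 00:30 UTC) is taken from the message above; the function name is hypothetical, treating the end boundary as exclusive is an assumption, and the check only makes sense while the discount is still offered.

```typescript
// Window from the message above: 16:30 to 00:30 UTC, crossing midnight.
const WINDOW_START_MIN = 16 * 60 + 30; // 16:30 UTC in minutes since midnight
const WINDOW_END_MIN = 30; // 00:30 UTC, treated as exclusive (assumption)

function isInDiscountWindow(when: Date): boolean {
  const minutes = when.getUTCHours() * 60 + when.getUTCMinutes();
  // The window wraps past midnight, so it is the union of two ranges.
  return minutes >= WINDOW_START_MIN || minutes < WINDOW_END_MIN;
}

console.log(isInDiscountWindow(new Date()));
```

Because the window wraps around midnight, the test is an OR of "late evening" and "just after midnight" rather than a single between-check.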

vapid torrent Aug 24, 2025, 12:02 PM

#

covert gust for me DeepSeek API is cheaper, as I mostly use stuff in the evenings, which ali...

Although that is leaving soon

covert gust Aug 24, 2025, 12:02 PM

#

any pricing goes up and down. currently that's the case. I also don't know chutes price in 2 weeks

rocky sedge Aug 24, 2025, 12:09 PM

#

covert gust for me DeepSeek API is cheaper, as I mostly use stuff in the evenings, which ali...

Do you use caching a lot?

#

Like retrying same long context benchmark for 5 passes, or tool use with 32k+ input

vapid torrent Aug 24, 2025, 12:10 PM

#

Long chats also benefit from caching

#

like a 40k+ chat often has 90%+ caching
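To see why a high cache-hit share matters, here is a rough input-cost sketch. The function and parameter names are hypothetical, and the prices in the example call are made-up round numbers chosen for easy arithmetic, not anyone's real price list.

```typescript
// All prices are caller-supplied; nothing here is a real price list.
function inputCost(
  promptTokens: number,
  cacheHitRatio: number, // 0..1, share of prompt tokens served from cache
  pricePerMTokMiss: number, // price per million uncached input tokens
  pricePerMTokHit: number, // price per million cached input tokens
): number {
  const hitTokens = promptTokens * cacheHitRatio;
  const missTokens = promptTokens - hitTokens;
  return (hitTokens * pricePerMTokHit + missTokens * pricePerMTokMiss) / 1e6;
}

// Hypothetical prices: 1.00 per million uncached, 0.10 per million cached.
const uncachedCost = inputCost(40000, 0, 1.0, 0.1);
const cachedCost = inputCost(40000, 0.9, 1.0, 0.1);
console.log(uncachedCost.toFixed(4), cachedCost.toFixed(4));
```

With those placeholder prices, a 40k-token prompt at a 90% hit rate costs roughly a fifth of the uncached input price per turn; the actual ratio depends on the provider's real cached and uncached rates.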

glad basin Aug 24, 2025, 12:29 PM

#

vapid torrent Long chats also benefit from caching

and coding tools benefit a heck ton

hushed hedge Aug 24, 2025, 12:32 PM

#

i kinda reset chat too often, usually when the work is done or it starts hallucinating

buoyant willow Aug 24, 2025, 6:28 PM

#

waiting for v3.1 free

rocky wagon Aug 25, 2025, 8:56 AM

#

Wow, it’s not free yet? Those providers are lazy I see

gaunt edge Aug 25, 2025, 9:17 AM

#

rocky wagon Wow, it’s not free yet? Those providers are lazy I see

i think its surprising there are still big free models over the api, considering how not free they are to run

thorn sky Aug 25, 2025, 9:46 AM

#

I would prefer a provider with caching rather than a free model; the one DeepSeek officially offers is too slow

hushed hedge Aug 25, 2025, 2:41 PM

#

gaunt edge i think its surprising there are still big free models over the api, considering...

I am sure they are paying millions out of their pockets at this point

polar locust Aug 25, 2025, 3:08 PM

#

hushed hedge I am sure they are paying millions out of their pockets at this point

What TAO does to people with gpus.....

hushed hedge Aug 25, 2025, 3:14 PM

#

polar locust What TAO does to people with gpus.....

tao?

polar locust Aug 25, 2025, 3:17 PM

#

hushed hedge tao?

Basically, Chutes doesn't own the GPUs (they do own some, but not those 8k or whatever GPUs at once). People lend them their GPUs, and Chutes pays the real owners in TAO (crypto)

#

That's where they get the “Decentralised” stuff

hushed hedge Aug 25, 2025, 3:22 PM

#

so they are renting people's gpus with crypto in return

polar locust Aug 25, 2025, 3:23 PM

#

Ye

#

Cool stuff

#

Especially with how they connect those GPUs from around the world

hushed hedge Aug 25, 2025, 3:24 PM

#

its surprisingly fast considering so many different pc it had to use

rancid geode Aug 25, 2025, 4:16 PM

#

thorn sky I would prefer a provider with caching rather than a free model, which DeepSeek ...

I still don’t quite understand the motive

rancid geode Aug 25, 2025, 4:19 PM

#

polar locust Basically, Chutes doesn't own GPUs(They do own some, but not those 8k or whateve...

Interesting, So is it like the first crypto thing to actually work and be useful at scale?

polar locust Aug 25, 2025, 4:20 PM

#

rancid geode Interesting, So is it like the first crypto thing to actually work and be useful...

Ya, it's a huge mining inference stuff

#

That's why they didn't lose a lot when they introduced the unlimited-reqs a few months back

rancid geode Aug 25, 2025, 4:22 PM

#

That makes more sense then

lost carbon Aug 25, 2025, 4:46 PM

#

rancid geode I still don’t quite understand the motive

Every free provider logs your data

#

Also , chutes has no clue if a particular gpu provider is actually logging your data or not.

#

I assume they must be logging a ton of gooner chats

hushed hedge Aug 25, 2025, 5:20 PM

#

lost carbon I assume they must be logging a ton of gooner chats

you can't complain if it's free

lost carbon Aug 25, 2025, 5:28 PM

#

I don't use them

near sparrow Aug 25, 2025, 5:31 PM

#

Chutes is so slow I rather just pay up atp

#

Especially since the rate limit a while ago

rancid geode Aug 25, 2025, 7:31 PM

#

lost carbon Every free provider logs your data

Thanks, yeah I understood that motive, but with large models in the 600b-1T range I had a hard time understanding how that is worth it to them

#

I personally don’t use paid providers as well, and for the paid ones I do use I try to understand their respective privacy policy/ToS, but I was still wondering a bit. The ones that make the most sense to me are what Google, Alibaba, etc. give free access to in their tools: I know it’s temporary and will be used as training data for future models. It was just the other providers that I did not understand, because as far as I’m aware they are not training models

#

And can just sell the data to advertisers and etc. but I guess it depends on how personal the data that people put in the ai is, maybe it’s worth a lot, haha

abstract igloo Aug 25, 2025, 11:48 PM

#

rocky wagon Aug 26, 2025, 6:37 AM

#

lost carbon Every free provider logs your data

Who cares to be honest, what are they going to do with those chats? Blackmail us? xD

#

I'd prefer it to be free even if a provider logs my data. I didn't pay 10 dollars to OR for nothing (I want those 1000 messages daily for my free models)

#

If a provider's employee wants to read my logs about femboy breeding, they can be my guest I guess

storm torrent Aug 26, 2025, 7:04 AM

#

rocky wagon Who cares to be honest, what are they going to do with those chats? Blackmail us...

maybe, but in the same way that your browsing data, or whatever facebook/google has on you is valuable, could be mined out of your conversations

outer rover Aug 26, 2025, 11:31 AM

#

I didn't do a lot of testing with 3.1 cause I'm poor af. Is it a meaningful enough upgrade from 0528 in RP?

modest smelt Aug 26, 2025, 11:47 AM

#

outer rover I didn't do a lot of testing with 3.1 cause I'm poor af. Is it a meaningful enoug...

to me yes, it writes less which means less bloated paragraph.

hushed hedge Aug 26, 2025, 12:38 PM

#

outer rover I didn't do a lot of testing with 3.1 cause I'm poor af. Is it a meaningful enoug...

i used to be like that but then my grandma gave me 5 $

outer rover Aug 26, 2025, 1:30 PM

#

hushed hedge i used to be like that but than my grandma gave me 5 $

With all my medication I have to be very strict with unnecessary expenses.

marble panther Aug 26, 2025, 4:54 PM

#

outer rover I didn't do a lot of testing with 3.1 cause I'm poor af. Is it a meaningful enoug...

mixed opinions, most seem to say so, but it depends on what you liked about r1

#

if you liked its schizo side you'll miss that in 3.1

#

3.1 is more like gemini than the old deepseek r1 style

outer rover Aug 26, 2025, 5:07 PM

#

marble panther

That actually sounds pretty good. Thanks for taking the time to answer. I have only been testing it in small doses to keep expenses down, but overall I do prefer it so far.

pearl magnet Aug 26, 2025, 5:07 PM

#

What's going on with providers for this model? Chutes and Deepseek itself have disappeared for me. Maybe temporary maintenance or something? Is that how it works for openrouter, they just disappear from the page if they are down or whatever?

Nvm ignore me, seems it changed to hide ignored providers and I had logging disabled

marble panther Aug 26, 2025, 5:08 PM

#

pearl magnet ~~What's going on with providers for this model? Chutes and Deepseek itself have...

did you change your privacy settings?

pearl magnet Aug 26, 2025, 5:09 PM

#

marble panther did you change your privacy settings?

I didn't think so, but I guess it could've bugged or I did it accidentally

#

because yeah, its providers with logging being hidden. I was using chutes literally a few hours ago and haven't touched my settings since then. Deepseek was visible on there too

#

my bad I guess?

marble panther Aug 26, 2025, 5:09 PM

#

"SubModel provides DSv3.1 with fp8 and lower price - $0.2/0.8"

pearl magnet Aug 26, 2025, 5:11 PM

#

Ok its an openrouter issue. I have paid endpoints that may train on inputs enabled but its ignoring them

jolly lintel Aug 26, 2025, 5:11 PM

#

pearl magnet Ok its an openrouter issue. I have paid endpoints that may train on inputs enabl...

there may be some caching, wait a few minutes

pearl magnet Aug 26, 2025, 5:12 PM

#

jolly lintel there may be some caching, wait a few minutes

But I haven't changed the setting. I didn't have this issue about 8 hours ago

Edit: Issue randomly fixed itself without me touching anything other than removing and re-adding an ignored provider to test, as adding that is the only change I made today. Very strange 🤔

marble panther Aug 26, 2025, 5:12 PM

#

@nocturne kelp any chance you can reach out to submodel to see if they want to be a provider?
https://submodel.gitbook.io/docs/instagen/overview-1/available-models
really good prices, no prompt logging

worthy charm Aug 26, 2025, 11:37 PM

#

Hey guys, can someone help me?

I'm using AI SDK with OpenRouter ai-sdk-provider and DeepSeek V3.1.

My issue is: the returned reasoning is empty, and the reasoning text is added to the final message instead. Does anyone know why, or how to fix it?

This is my configuration:

'chat-model': wrapLanguageModel({
  model: openrouter('deepseek/deepseek-chat-v3.1'),
  middleware: extractReasoningMiddleware({ tagName: 'think' }),
}),

lost carbon Aug 26, 2025, 11:47 PM

#

worthy charm Hey guys, can someone help me? I'm using AI SDK with OpenRouter ai-sdk-provider...

are you using thinking model? also , https://openrouter.ai/docs/use-cases/reasoning-tokens

OpenRouter Documentation

Reasoning Tokens - Improve AI Model Decision Making

Learn how to use reasoning tokens to enhance AI model outputs. Implement step-by-step reasoning traces for better decision making and transparency.

worthy charm Aug 26, 2025, 11:54 PM

#

I think that by default Deepseek 3.1 uses thinking mode

worthy charm Aug 26, 2025, 11:55 PM

#

lost carbon are you using thinking model? also , https://openrouter.ai/docs/use-cases/reason...

this is the full API response:

Notice that the reasoning is empty

[
    {
        "id": "5192b330-8478-4cd2-a347-e38f5a7a4bec",
        "chatId": "2cbe9177-020a-4ffe-8f15-3308c1fdd165",
        "role": "user",
        "parts": [
            {
                "text": "Hello, what is your name?",
                "type": "text"
            }
        ],
        "attachments": [],
        "createdAt": "2025-08-24T01:21:20.879Z"
    },
    {
        "id": "e6cb68a4-c762-4fb3-aaf6-c1c93e7b6902",
        "chatId": "2cbe9177-020a-4ffe-8f15-3308c1fdd165",
        "role": "assistant",
        "parts": [
            {
                "type": "step-start"
            },
            {
                "type": "reasoning",
                "reasoning": "",
                "details": [
                    {
                        "type": "text",
                        "text": ""
                    }
                ]
            },
            {
                "type": "text",
                "text": "Hmm, the user is asking for my name. This is a simple introductory question. I should respond with my name and a friendly greeting to start the conversation. \n\nI'll keep it warm and inviting by introducing myself as \"Weather Assistant\" since I have weather-related capabilities, and then ask how I can help them today. This sets a positive tone for the interaction. \n\nThe response should be concise but welcoming - no need for lengthy explanations since this is just an introduction.Hello! I'm Weather Assistant. How can I help you today?"
            }
        ],
        "attachments": [],
        "createdAt": "2025-08-24T01:21:23.387Z"
    }
]
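In the response above, the reasoning prose sits inside the `text` part with no `<think>` tags around it, which is one plausible reason a tag-based extractor would return an empty `reasoning` field. The sketch below is a simplified stand-in for that kind of extraction, not the AI SDK's actual implementation; the function and type names are hypothetical.

```typescript
interface SplitResult {
  reasoning: string;
  text: string;
}

// Simplified stand-in for tag-based reasoning extraction.
function splitThinkTags(raw: string, tagName: string = "think"): SplitResult {
  const openTag = `<${tagName}>`;
  const closeTag = `</${tagName}>`;
  const start = raw.indexOf(openTag);
  const end = raw.indexOf(closeTag);
  if (start === -1 || end === -1 || end < start) {
    // No tag pair found: nothing can be classified as reasoning.
    return { reasoning: "", text: raw };
  }
  const reasoning = raw.slice(start + openTag.length, end).trim();
  const text = (raw.slice(0, start) + raw.slice(end + closeTag.length)).trim();
  return { reasoning, text };
}

console.log(splitThinkTags("<think>Greet the user.</think>Hello!"));
console.log(splitThinkTags("Hmm, the user is asking for my name.Hello!"));
```

The second call mirrors the situation in the log: without tags, everything lands in `text`. That would be consistent with the suggestion in this thread to request reasoning through OpenRouter's reasoning parameter rather than relying on tags in the message body.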

lost carbon Aug 26, 2025, 11:56 PM

#

worthy charm I think that by default Deepseek 3.1 uses thinking mode

did you read the doc I gave ?

worthy charm Aug 26, 2025, 11:57 PM

#

lost carbon did you read the doc I gave ?

I'm taking a look at it, is there a specific section I should check?

lost carbon Aug 26, 2025, 11:58 PM

#

worthy charm I'm taking a look at it, is there a specific section I should check?

https://openrouter.ai/docs/use-cases/reasoning-tokens#enable-reasoning-with-default-config

worthy charm Aug 27, 2025, 12:00 AM

#

lost carbon https://openrouter.ai/docs/use-cases/reasoning-tokens#enable-reasoning-with-defa...

Got it, thank you

rocky sedge Aug 27, 2025, 12:26 PM

#

https://eqbench.com/creative_writing.html

V3.1 stands between R1 and V3, but I don't understand how is this possible
Especially considering GLM4.5 is below all of them in rankings, but better subjectively

viscid sigil Aug 27, 2025, 1:51 PM

#

glm 4.5 deserves higher ranking...its just so good

#

Even hermes 4 better than GLM?

#

Hermes 4 was so bad when i used it for roleplays

serene idol Aug 27, 2025, 2:48 PM

#

viscid sigil glm 4.5 deserves higher ranking...its just so good

What settings u got for glm 4.5? The writing style and prose were annoying but I got good dialogue

marble panther Aug 27, 2025, 5:47 PM

#

viscid sigil glm 4.5 deserves higher ranking...its just so good

eh if you read the writing samples it's not that great

#

this is Kimi K2
```
Chapter 6 – A Quiet Knife of Diplomacy  
Aboard transfer-cutter Hekate, descent vector 14° off the ecliptic

Arthur Neegan sat on a crash-couch designed for a man two centimeters shorter and a great deal more resigned. The cabin had no windows—unnecessary on a thirty-minute ballistic hop—but a small display screen on the wall cycled through system diagnostics in green text. Green was supposed to be soothing. It merely reminded him of Martian skies at dawn, before the first strip-miners had stained them sienna.

He flexed his wrists against the soft cuffs that were not quite restraints. The polite Antares lieutenant had called them “safety belts,” yet the magnetic locks clicked shut with the same finality as any security bracer. Politeness, Arthur decided, was simply violence wearing velvet gloves.
```

this is GLM 4.5
```
Model Output:
The low thrum of the transfer vessel’s drive was a constant vibration in Arthur Neegan’s bones, a counterpoint to the frantic pulse in his throat. He sat rigidly in the utilitarian passenger couch, its worn synth-leather cool against his palms. Outside the thick viewport, the ochre deserts of Mars shrank below, soon replaced by the infinite, star-dusted black of interplanetary space. A small display screen on the wall cycled through various system diagnostics in glowing green text: ‘Drive Stability: Nominal’, ‘Life Support: Optimal’, ‘Nav Lock: Antares Orbital’. The last line felt like a brand.

*What have I gotten involved in?* The question echoed, louder than the drive’s hum. Just forty-eight hours ago, his biggest concern was calibrating the sonic resonator on his small, independent claim near Syrtis Major. Then the deep-core survey ping had screamed – not the usual trace elements, but a vein of Azra so dense, so pure, it defied comprehension. His unassuming plot, inherited from his father, was suddenly the most valuable real estate in the solar system.
```

sharp quest Aug 27, 2025, 5:51 PM

#

My main complaint with kimi is that it insists on making it as short as it can, V3.1 is at least decent at that

rocky sedge Aug 27, 2025, 5:52 PM

#

Kimi K2's writing of SCP time travelling letters from bench was good

marble panther Aug 27, 2025, 5:52 PM

#

I will say GLM 4.5 is competent

#

its prose is nothing fancy, it writes at like a grade 8 level

#

but it's still enjoyable, just not "painterly" like kimi which reads like a real novel

rocky sedge Aug 27, 2025, 5:53 PM

#

Kimi K2 writing is good, but sometimes loses plot points like relation status, name drops, and probably hurts at long context
GLM 4.5 with reasoning should handle it better, like double pass the same data

marble panther Aug 27, 2025, 5:54 PM

#

rocky sedge Kimi K2 writing is good, but sometimes loses plot points like relation status, n...

yeah kimi k2 is known to not have the coherency or long-context ability that, say, r1 or gemini 2.5 has

#

but it's also a base model

#

a thinking version will come soon

sharp quest Aug 27, 2025, 5:54 PM

#

Also, in my experience its just incapable of doing long outputs

marble panther Aug 27, 2025, 5:54 PM

#

what do you mean by long?

sharp quest Aug 27, 2025, 5:54 PM

#

its V3's problem compounded

rocky sedge Aug 27, 2025, 5:54 PM

#

The longer the prompt context, the more beneficial reasoning is. Plus, of course, Kimi K2 is censored

#

Uncensored Kimi K2 with reasoning and 256k context would be game changing

marble panther Aug 27, 2025, 5:55 PM

#

kimi k2 seems to have no problem with long outputs at least in the benchmarks

#

at least in raw "length" but it does not fare well in the longform writing bench

sharp quest Aug 27, 2025, 5:56 PM

#

thats the thing, i use my models for writing

marble panther Aug 27, 2025, 5:57 PM

#

sharp quest thats the thing, i use my models for writing

tough! I'd say use 3.1 (according to longform bench) for the general structure then kimi k2 to rewrite drafts

rocky sedge Aug 27, 2025, 5:58 PM

#

marble panther

18% degradation is big

marble panther Aug 27, 2025, 5:59 PM

#

rocky sedge 18% degradation is big

yeah huge. but it's also one of the best non thinking models on the list

#

so here's hoping their thinking version has better long context

rocky sedge Aug 27, 2025, 6:00 PM

#

You can start with Kimi until 32k context then switch to thinking models. Or try 2-1-2-1-2-1 switch between non-think and think

#

Mixing styles of both to avoid repetition but extract details from time to time

marble panther Aug 27, 2025, 6:02 PM

#

I like the idea of structure using good long context models then kimi to write the prose for each smaller segment

#

just feed it character information, plot points, summary of story thus far

#

and let it paint over the rough edges

twin cape Aug 27, 2025, 6:50 PM

#

I hate this model.

#

i spent so much time fixing bugs due to DeepSeek not changing the model id in API. and then this model performs worst in my coding eval. makes me question reality.

rancid geode Aug 27, 2025, 8:49 PM

#

We know some of the main drawbacks of hybrid reasoning models, I would guess the main benefit is prompt caching, prob reduces cost when the same model is used for both reasoning and non-reasoning. Outside of that I’m not sure why it’s beneficial

#

From deepseek’s perspective I’m sure it’s tempting because they only need to host one model weights and with how it’s harder for them to get GPU’s that would likely be beneficial

rocky sedge Aug 27, 2025, 8:52 PM

#

Prompt caching is provider side, I think they can do shared prompt cache for all their models. I hadn't tried DeepSeek caching during separate models era

rancid geode Aug 27, 2025, 8:53 PM

#

rocky sedge Prompt caching is provider side, I think they can do shared prompt cache for all...

Oh, did not know that was a thing

rocky sedge Aug 27, 2025, 8:53 PM

#

I am not 100% sure if that's how it works

rancid geode Aug 27, 2025, 8:55 PM

#

Might have to look into it to see if any providers do that

#

The path I thought companies would take with hybrid reasoning models is where it would not think at all when it’s an easy question and think for 5+ min when it’s really really hard (I know time is a horrible way to describe this because it’s dependent on compute, but you get the point, haha), but I am assuming that’s easier said than done. It seems that models can’t comprehend where they may screw up; how many r’s in the word strawberry was the most famous example of this

twin cape Aug 28, 2025, 5:00 AM

#

rocky sedge Prompt caching is provider side, I think they can do shared prompt cache for all...

You mean kv cache right? I think it depends on embedding. Not exactly sure how it works, but I'm planning to do a livestream to understand how k, v, q works for attention mechanism. Will share after I get a good understanding.

rancid geode Aug 28, 2025, 5:28 AM

#

This is just vibe based, but I’m liking deepseek v3.1 with no thinking more than with thinking, it def feels like an improvement over 0324

#

I think it will be my go to moving forward, with GLM 4.5 subbing in at times, going to try to live without sonnet 4 for a bit

tranquil hornet Aug 28, 2025, 8:36 AM

#

how were the responses of v3.1 on crypto chart and raw trading data?

lost carbon Aug 28, 2025, 2:01 PM

#

buoyant willow waiting for v3.1 free

Me too.

outer rover Aug 28, 2025, 4:47 PM

#

Got really excited to see the free version, then found out it's censored to hell. Brilliant

rocky sedge Aug 28, 2025, 5:04 PM

#

How is it even possible

outer rover Aug 28, 2025, 5:07 PM

#

Even in the way it writes it seems very different to the paid provider version. It's kinda bizarre and I hate it.

#

I am willing to give some level of new provider issues benefit of doubt, but its clearly more than that.

#

It also seems totally resistant to my jailbreaking setup. My setup makes even Kimi K2 fully uncensored, so this is interesting in a really annoying way.

sharp quest Aug 28, 2025, 5:14 PM

#

openinfra being openinfra

wintry jolt Aug 28, 2025, 5:14 PM

#

I jbd it on the official website by accident

cerulean grove Aug 28, 2025, 7:00 PM

#

i dont understand how to get the thinking v3.1 using typingmind's openrouter connector

vapid torrent Aug 29, 2025, 2:12 AM

#

outer rover Got really excited to see the free version, then found out its censored to hell....

provider issue or model training issue?

steel oar Aug 29, 2025, 4:32 AM

#

cerulean grove i dont understand how to get the thinking v3.1 using typingmind's openrouter con...

Add a custom body params like this:
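
The attached parameters are not preserved in this log. As a rough sketch only, and not necessarily what was posted here: OpenRouter's chat completions API accepts a `reasoning` object in the request body, and a client's custom body params would be merged into a body like the one below. The helper name `build_body` is hypothetical, and whether a given provider honors the flag for V3.1 is an assumption.

```python
import json


def build_body(prompt: str, thinking: bool) -> dict:
    """Build a chat completions request body with reasoning toggled.

    The model slug is the one mentioned in this thread; `reasoning.enabled`
    is assumed to switch the hybrid model between thinking and non-thinking.
    """
    return {
        "model": "deepseek/deepseek-chat-v3.1",
        "messages": [{"role": "user", "content": prompt}],
        "reasoning": {"enabled": thinking},
    }


if __name__ == "__main__":
    # In a client with a "custom body params" field, only the
    # "reasoning" key would need to be added.
    print(json.dumps(build_body("Hello", thinking=True), indent=2))
```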

hushed hedge Aug 29, 2025, 8:11 AM

#

lets go its free nowwwwww

#

altho ugh its trash

outer rover Aug 29, 2025, 9:15 AM

#

vapid torrent provider issue or model training issue?

Provider. The paid version is uncensored.

lost carbon Aug 29, 2025, 9:25 AM

#

vapid torrent provider issue or model training issue?

Is it gonna stop being a training model at some point?

vapid torrent Aug 29, 2025, 9:35 AM

#

lost carbon Is it gonna stop being a training model at some point?

Well you can always use other providers that aren’t training

lost carbon Aug 29, 2025, 9:36 AM

#

vapid torrent Well you can always use other providers that aren’t training

What are the free providers?

vapid torrent Aug 29, 2025, 9:36 AM

#

lost carbon What are the free providers?

Ahh you mean free providers

#

They make their money from collecting : selling data

lost carbon Aug 29, 2025, 9:36 AM

#

vapid torrent Ahh you mean free providers

Yeah.

vapid torrent Aug 29, 2025, 9:36 AM

#

so it’s pretty much impossible

lost carbon Aug 29, 2025, 9:36 AM

#

Damn.

#

Ok.

twin cape Aug 29, 2025, 9:46 AM

#

DeepSeek-V3.1 coding performance evaluation on my coding evaluation set:

Mixed performance with concerning regressions - DeepSeek-V3.1 achieved an average rating of 5.68, significantly underperforming compared to top models and even showing regression from its predecessor on some tasks.

Performance Comparisons

vs. Top Models: Performed worse than Claude Opus 4, Claude Sonnet 4, Grok 4, and GPT-4.1
vs. Open-Source Models: Also lagged behind gpt-oss-120b, Qwen3 Coder, and Kimi K2
vs. Predecessor: Mixed results compared to DeepSeek-V3 (New), with some improvements but notable regressions

Notable Issues

Instruction adherence problems - stubbornly ignored specific formatting requests
Gap in advanced programming patterns - struggled with uncommon or tricky scenarios
Visualization - produced charts remarkably similar to Horizon Alpha

Full blog post: https://eval.16x.engineer/blog/deepseek-v3-1-coding-performance-evaluation

hushed hedge Aug 29, 2025, 9:54 AM

#

wish they made claude sonnet 4 cheaper

sharp quest Aug 29, 2025, 10:03 AM

#

that would require anthropic to make their models more efficient

hushed hedge Aug 29, 2025, 10:31 AM

#

sharp quest that would require anthropic to make their models more efficient

I am sure it's pretty efficient, but it's proprietary, and since they are used by almost every "no code" service they can charge any price

hushed hedge Aug 29, 2025, 10:33 AM

#

twin cape DeepSeek-V3.1 coding performance evaluation on my coding evaluation set: **Mixe...

how are these benchmarks done? often one company's model can do what another's can't. are they given a simple coding task and scored on whether they complete it, or on whether they do it more efficiently? coding itself is highly subjective, so it should be tested across different fields of coding

twin cape Aug 29, 2025, 10:34 AM

#

hushed hedge how are these benchmarks done , like many times one company could not do other ...

I livestreamed the eval process on YouTube. You can see exactly how it's done:
https://youtube.com/live/jxQOQrHVy5I?feature=share

sharp quest Aug 29, 2025, 10:34 AM

#

there are quite many reports that the biggest AI companies are usually running at a loss, and subsidized, openai's own revenue data confirms this

hushed hedge Aug 29, 2025, 10:35 AM

#

twin cape I livestreamed the eval process on YouTube. You can see exactly how it's done: h...

nice

hushed hedge Aug 29, 2025, 10:37 AM

#

sharp quest there are quite many reports that the biggest AI companies are usually running a...

cuz ai doesn't give profits instantly, it's a long term thing. after spending billions to create a model they can spend a fraction of that to make a better or more efficient version while api costs remain the same

pearl magnet Aug 30, 2025, 2:31 AM

#

Does anyone know if chutes is running this model at fp4? Based on the low price I would assume so

polar locust Aug 30, 2025, 3:34 AM

#

pearl magnet Does anyone know if chutes is running this model at fp4? Based on the low price ...

I thought OR shows the quant on the model page?

pearl magnet Aug 30, 2025, 3:37 AM

#

polar locust I thought OR shows the quant on the model page?

Not all of them display a quant for me. Unless that is supposed to mean its not quantised at all? I assumed it wasn't that because some of them are so cheap/fast and don't show any quant

#

And also the fact that you can't filter by "no quant" only "all quantizations"

storm torrent Aug 30, 2025, 3:39 AM

#

its in the row of icons under the provider name

storm torrent Aug 30, 2025, 3:39 AM

#

chutes doesn't have one listed, so. it's "unknown"

pearl magnet Aug 30, 2025, 3:39 AM

#

yeah, so I was wondering if anyone knew what it actually is

storm torrent Aug 30, 2025, 3:39 AM

#

yeah if it's not there its not disclosed

vapid torrent Aug 30, 2025, 8:03 AM

#

i think the thing with chutes is its distributed so there is no specific quant, every node might use different config

cerulean grove Aug 30, 2025, 10:37 AM

#

steel oar Add a custom body params like this:

thank you so much!

hushed hedge Aug 30, 2025, 2:48 PM

#

finally a good provider hosting deepseek v3.1 for free

lost carbon Aug 30, 2025, 4:48 PM

#

hushed hedge finally a good provider hosting deepseek v3.1 for free

Who?

sharp quest Aug 30, 2025, 6:18 PM

#

lost carbon Who?

deepinfra updated

#

its fp4 though

outer rover Aug 31, 2025, 9:13 AM

#

Anyone know why Silly Tavern still shows only the 64K OpenInference and not 164K DeepInfra?

mortal igloo Aug 31, 2025, 2:18 PM

#

SillyTavern uses the context listed from the /models endpoint, the one appearing in ST's command prompt console, when you click on "Connect" so it uses that

#

Just choose DeepInfra provider and set context to unlocked/unlimited

untold badger Aug 31, 2025, 3:14 PM

#

outer rover Anyone know why Silly Tavern still shows only the 64K OpenInference and not 164K...

That's what the API response shows from https://openrouter.ai/api/v1/models

#

Like he said, you can select the provider. Otherwise, don't quote me on this, presumably if your context is too big, OR would route it to whatever supports it (if not, it would error), assuming you keep Middle-Out Transform disabled.
And "unlocked" context slider will prevent auto adjusting the max.
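
To make the point about the /models endpoint concrete, here is a minimal sketch of reading a model's listed context length from a response shaped like the one at https://openrouter.ai/api/v1/models. The sample payload and its numbers are illustrative assumptions, not live data, and `context_length_for` is a hypothetical helper.

```python
from typing import Optional

# Hypothetical sample shaped like the /api/v1/models response.
# The context figures are placeholders, not values fetched from the API.
SAMPLE_MODELS = {
    "data": [
        {"id": "deepseek/deepseek-chat-v3.1:free", "context_length": 64000},
        {"id": "deepseek/deepseek-chat-v3.1", "context_length": 163840},
    ]
}


def context_length_for(models: dict, model_id: str) -> Optional[int]:
    """Return the context length listed for model_id, or None if absent."""
    for entry in models.get("data", []):
        if entry.get("id") == model_id:
            return entry.get("context_length")
    return None


if __name__ == "__main__":
    print(context_length_for(SAMPLE_MODELS, "deepseek/deepseek-chat-v3.1:free"))
```

A frontend that trusts this single listed number will cap its slider there, which is why unlocking the slider and pinning a provider is the workaround described above.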

outer rover Aug 31, 2025, 3:31 PM

#

Thanks guys!

steel oar Sep 1, 2025, 10:18 AM

#

pearl magnet Does anyone know if chutes is running this model at fp4? Based on the low price ...

I asked in chutes' discord, and the admin says it's fp8

hushed hedge Sep 1, 2025, 2:47 PM

#

i was wondering why deepseek is hallucinating, then I found out about the fp things

lucid wave Sep 3, 2025, 6:06 PM

#

What precision is deepseek trained on?

eager lava Sep 3, 2025, 6:16 PM

#

lucid wave What precision is deepseek trained on?

fp8

rancid geode Sep 3, 2025, 6:27 PM

#

I kinda just swapped to using deepseek as the provider, consistent quality and much cheaper

#

But I guess now there are more options that support caching so it’s prob no longer the cheaper option

glacial reef Sep 3, 2025, 7:05 PM

#

rancid geode But I guess now there are more options that support chaching so it’s prob no lon...

which providers do u know support cacheing?

rancid geode Sep 3, 2025, 7:07 PM

#

glacial reef which providers do u know support cacheing?

my bad, i was on my phone when i quickly checked, I guess its still only deepseek that supports prompt caching

glacial reef Sep 3, 2025, 7:08 PM

#

yeee

rancid geode Sep 3, 2025, 7:20 PM

#

bummer, its not worth using models that dont support prompt caching, at least if you use it for a lot of code

vapid torrent Sep 7, 2025, 3:15 AM

#

rancid geode bummer, its not worth using models that dont support prompt caching, at least if...

Or long chats

hushed hedge Sep 11, 2025, 6:23 AM

#

why deepseek is so slow

#

deepseek provider

#

and why is pricing for v3.1 different? on open router, even from the deepseek provider, it's cheaper than on the deepseek platform

storm seal Sep 11, 2025, 10:53 AM

#

hushed hedge and why pricing for v3.1 is different like on open router even from deepseek pro...

Other providers offer DeepSeek v3.1 for cheaper

hushed hedge Sep 11, 2025, 12:48 PM

#

storm seal Other providers offer DeepSeek v3.1 for cheaper

no the provider name is deepseek

#

and its giving cheaper than deepseek platform

#

check in open router

storm seal Sep 11, 2025, 12:58 PM

#

hushed hedge and its giving cheaper than deepseek platform

Hey, you're right!

#

@nocturne kelp there's a pricing discrepancy for the DeepSeek provider

#

Its 27 cents in and $1.1 out but DeepSeek is charging a different amount

lament geode Sep 11, 2025, 4:10 PM

#

With 3.1 do you guys usually put supplemental tool info in your system prompt? I've been getting weird tool behavior from 3.1. Also, how thorough are you with your tool and argument descriptions. I didn't see this until I switched from openai models to this model and I'm sure the problem is my lack of experience. Openrouter published an article on tool call accuracy but it seemed to suggest proprietary models excel at tool calling accuracy but didn't tell us what to do about it on open models. link: https://openrouter.ai/announcements/tool-calling-accuracy

native shale Sep 11, 2025, 5:32 PM

#

lament geode With 3.1 do you guys usually put supplemental tool info in your system prompt? ...

Claude models are top tier in tool calling, gpt 5 isn't bad either, most open source models are complete slop at tool calling

#

With the exception being glm 4.5 which is also really good at tool calls

lament geode Sep 11, 2025, 5:42 PM

#

native shale With the exception being glm 4.5 which is also really good at tool calls

Awesome thanks. Oh wow just found this which very much supports your claims! https://gorilla.cs.berkeley.edu/leaderboard.html

Berkeley Function Calling Leaderboard (BFCL) V4

Explore The Berkeley Function Calling Leaderboard (also called The Berkeley Tool Calling Leaderboard) to see the LLM's ability to call functions (aka tools) accurately.

sweet zodiac Sep 12, 2025, 8:13 AM

#

May i know the rate limit of this model ?

gaunt edge Sep 12, 2025, 8:59 AM

#

native shale With the exception being glm 4.5 which is also really good at tool calls

bbbbut what about kimi :(

native shale Sep 12, 2025, 9:02 AM

#

gaunt edge bbbbut what about kimi :(

its not really that good at tool calls from my experience

#

maybe cuz i was served fp4

gaunt edge Sep 12, 2025, 9:03 AM

#

yea i think theo t3 person and maybe some others have some videos about kimi and tool calls, it is pretty provider dependent

#

apparently, from memory

native shale Sep 12, 2025, 9:08 AM

#

gaunt edge yea i think theo t3 person and maybe some others have some videos about kimi and...

theo is a larper dont listen to him 😭

storm seal Sep 12, 2025, 10:49 AM

#

sweet zodiac May i know the rate limit of this model ?

There's no rate limits for paid models

hushed hedge Sep 14, 2025, 3:20 PM

#

any alternatives to deepseek that are better and open source?

#

for making notes

modest smelt Sep 14, 2025, 3:25 PM

#

hushed hedge any better alternatives to deepseek that isbetter and open source?

you can try kimik2

fleet lake Sep 14, 2025, 3:25 PM

#

how to remove a car?

spice marsh Sep 14, 2025, 3:26 PM

#

?

rocky sedge Sep 14, 2025, 3:37 PM

#

https://tenor.com/view/tension-force-car-tow-towing-physics-gif-20641635

edgy pivot Sep 14, 2025, 4:31 PM

#

hushed hedge any better alternatives to deepseek that isbetter and open source?

Try glm 4.5

#

Plus Kimi k2 0905 is really cute

rocky wagon Sep 15, 2025, 7:17 PM

#

Jeeesus, context of the free model got downgraded to 30k from 60k. The paid model has 163k context. It's unusable at this point. Any better free models with more context?

steel oar Sep 16, 2025, 4:55 AM

#

rocky wagon Jeeesus, context of free model got downgraded to 30k from 60k. Paid model' 163k ...

It seems the free version hosted by DeepInfra still has 163k context

hushed hedge Sep 16, 2025, 7:02 AM

#

edgy pivot Plus Kimi k2 0905 is really cute

is it better than glm?

gaunt edge Sep 16, 2025, 7:07 AM

#

hushed hedge is it better tha glm?

people generally like kimi's style more i think, but its not necessarily 'better' than glm. glm is hybrid (toggleable) reasoning, kimi doesn't have 'reasoning'

#

kimi is also quite concise

edgy pivot Sep 16, 2025, 7:43 AM

#

hushed hedge is it better tha glm?

Depends on where you actually use it and your own preferences

hushed hedge Sep 16, 2025, 9:23 AM

#

gaunt edge kimi is also quite concise

it has like trillion parameters

gaunt edge Sep 16, 2025, 9:33 AM

#

hushed hedge it has like trillion parameters

Yea Kimi is big but what I meant is that its outputs are short

hushed hedge Sep 16, 2025, 10:12 AM

#

gaunt edge Yea Kimi is big but what I meant is that its outputs are short

i see, i was kinda looking for long detailed output like gemini gives, I need it for making notes

coarse fable Sep 19, 2025, 2:26 PM

#

I got "(DeepInfra) Provider returned error: deepseek/deepseek-chat-v3.1:free is temporarily rate-limited upstream. Please retry shortly, or add your own key to accumulate your rate limits: https://openrouter.ai/settings/integrations" For Deepseek v3.1 free using DeepInfra, anyone knows how long it will last?
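
For upstream rate limits like this one, a common client-side approach is to retry with exponential backoff. The sketch below is generic and not tied to any OpenRouter SDK: `send` is a hypothetical callable that performs the request and returns an HTTP status and body, and the delays are arbitrary assumptions.

```python
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send(), retrying on HTTP 429 with doubling delays.

    Returns the last (status, body) pair, whether or not it succeeded.
    """
    status, body = send()
    for attempt in range(1, max_attempts):
        if status != 429:
            break
        # Wait 1x, 2x, 4x, ... the base delay before each retry.
        sleep(base_delay * 2 ** (attempt - 1))
        status, body = send()
    return status, body
```

This does not answer how long the upstream limit lasts; it only keeps a client from hammering the endpoint while it does.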

pure path Sep 20, 2025, 10:43 AM

#

It feels like prefills still haven't been sorted out on V3.1. All of the providers returned a completely unrelated response in Mandarin

blazing folio Sep 21, 2025, 9:16 AM

#

pure path It feels like prefills still haven't been sorted out on V3.1. All of the provide...

sorry can you clarify? and might this be why im getting responses talking about python coding or nutritional papers instead of responses?
sorry for clarity the key works on the site i am using when I click the test button and the deepseekv3.1 is set at the model, but the responses are garbled and i dont have any blocked providers.

pure path Sep 21, 2025, 9:34 AM

#

blazing folio sorry can you clarify? and might this be why im getting responses talking about ...

Using sillytavern, when I run chat completion and ask it to continue the assistant message, it generates the chinese text below

#

I heard only deepseek's provider does working prefills, but havent tried it

blazing folio Sep 21, 2025, 9:41 AM

#

pure path Using sillytavern, when I run chat completion and ask it to continue the assista...

yeah this is what i get

#

i have no idea

#

it auto picks OpenInference and DeepInfra as the common providers for me

lament geode Sep 22, 2025, 5:40 AM

#

deepseek/deepseek-chat-v3.1 is a constant battle now. We don't use free models. Account is always topped up. Suddenly: No endpoints found that support tool use. Is this no longer supporting tool use?

digital snow Sep 22, 2025, 12:58 PM

#

@nocturne kelp https://api-docs.deepseek.com/updates#deepseek-v31-terminus

Change Log | DeepSeek API Docs

Date: 2025-09-22

#

💆‍♂️

vapid torrent Sep 22, 2025, 1:00 PM

#

Hopefully it actually fixed the language mixing nonsense

nocturne kelp Sep 22, 2025, 1:20 PM

#

digital snow @nocturne kelp https://api-docs.deepseek.com/updates#deepseek-v31-terminu...

wwtf

jolly lintel Sep 22, 2025, 1:28 PM

#

#1419676592082518039

lament geode Sep 22, 2025, 7:47 PM

#

I still get a 404 from deepseek/deepseek-chat-v3.1 at OR when my payload defines tools: regardless of provider. What did I miss or is this terminus unrelated?

#

The models endpoint reports that it supports tools.

vapid iris Sep 28, 2025, 9:30 PM

#

would like to report that V3.1 (free) from DeepInfra appears to have a (temporarily?) reduced maximum context size to 49299 tokens (it appears to not be a consistent limit, 48k was a good buffer), anything above that results in a 404 'no allowed providers are available' error. Have ignored OpenInference to check.

Would it be possible to check with DeepInfra if it's a temporary or permanent change, and if it's permanent, be reflected in the provider list? Thank you!

p.s. apologies for the ping, just noticed your status of being on vacation, my bad!

runic pike Sep 29, 2025, 10:22 AM

#

https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Exp

deepseek-ai/DeepSeek-V3.2-Exp · Hugging Face

vapid torrent Sep 29, 2025, 10:23 AM

#

runic pike https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Exp

https://discord.com/channels/1091220969173028894/1422148517081649202

languid vigil Sep 29, 2025, 6:11 PM

#

love it that deepinfra did something on their back end to cause the quality of v3.1 to nosedive and struggle with endless repetition

vernal wadi Oct 6, 2025, 11:10 PM

#

Why does v3.1 do this sometimes? It's so strange. I didn't change anything.

#

What is this shit?

#

<@&1384697330254610442> can you help?

#

It's not too much tokens

#

I don't understand what is happening

#

@storm seal any idea? My roleplay worked fine until now 🙁

storm seal Oct 6, 2025, 11:28 PM

#

vernal wadi @storm seal any idea? My roleplay worked fine until now 🙁

Hmm

#

Try Chutes as the provider

vernal wadi Oct 6, 2025, 11:31 PM

#

storm seal Try Chutes as the provider

How can I switch it?

#

I use my OR key

#

https://openrouter.ai/deepseek/deepseek-chat-v3.1:free

DeepSeek V3.1 (free) - API, Providers, Stats

DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes via prompt templates. Run DeepSeek V3.1 (free) with API

#

Only deep infra offers free

#

Then I will switch to 3.2, it's cheaper

#

The same issue O_O

#

Found the issue it's my prefill

storm seal Oct 6, 2025, 11:38 PM

#

vernal wadi Found the issue it's my prefill

Ah, yayyyy

vernal wadi Oct 6, 2025, 11:45 PM

#

storm seal Ah, yayyyy

#

But strange, it didn't happen earlier.

storm seal Oct 6, 2025, 11:46 PM

#

vernal wadi But strange, it didn't happened earlier.

strug

void brook Oct 7, 2025, 12:29 AM

#

vernal wadi Why does v3.1 this sometimes? It's so strange. I didn't changed anything.

Some providers use a different quant and also a different stack
Mostly it directs you to a provider that has a problematic stack and weights

vernal wadi Oct 7, 2025, 12:44 AM

#

void brook Some provider use different quant and also different stack Mostly it direct you ...

What is a stack?

storm seal Oct 7, 2025, 1:11 PM

#

vernal wadi What is a stack?

The software they use to provide the service

#

Some providers might use vLLM while others use Llama.cpp or their own custom vLLM version

vernal wadi Oct 7, 2025, 1:33 PM

#

storm seal Some providers might use vLLM while others use Llama.cpp or their own custom vLL...

Alright, that's out of my universe

analog violet Oct 7, 2025, 10:44 PM

#

DeepInfra is permanently removing deepseek v3.1 or its temporary?

storm seal Oct 7, 2025, 10:48 PM

#

analog violet DeepInfra is permanently removing deepseek v3.1 or its temporary?

Permanent

analog violet Oct 7, 2025, 10:48 PM

#

Amazing

languid vigil Oct 7, 2025, 10:52 PM

#

thanks janitorai

vernal wadi Oct 7, 2025, 10:52 PM

#

Looks like v3.1:free now only works if you let them use your prompts :/

#

3.1 still works but only when selling your soul

vernal wadi Oct 7, 2025, 10:55 PM

#

languid vigil thanks janitorai

Yeah janitor had a huge impact on Open router, deepseek and chutes.
It's crazy how huge the community got and how satisfied they all are with deepseek.
But I also don't know any alternative LLM which writes as good (and uncensored!) like deepseek.

#

And their built-in premium version of deepseek is too expensive at $20 per month.

#

I also use chub 🫣😅

#

V3 is at <10% uptime. And v3.1 gets dead now

marsh pulsar Oct 8, 2025, 12:25 AM

#

fedup

storm seal Oct 8, 2025, 12:26 AM

#

Oh what the hell

marsh pulsar Oct 8, 2025, 12:28 AM

#

yeah a 100+ billion difference in tokens is absolutely diabolical

mild eagle Oct 8, 2025, 12:31 AM

#

Holy

analog violet Oct 8, 2025, 12:55 AM

#

Its over guys...Deepseek free 3.1 is officially dead

#

No more free gooning

#

Only premium gooning for now

coarse fable Oct 8, 2025, 1:02 AM

#

Since it died tragically, I have to admit I have never experienced so coherent and complex RP capacities with a 4-compressed model before 🥲

surreal parcel Oct 8, 2025, 1:09 AM

#

So the free one is dead for good?

storm seal Oct 8, 2025, 1:10 AM

#

Ooooh, it went to zero!

vapid torrent Oct 8, 2025, 1:20 AM

#

yep

vapid torrent Oct 8, 2025, 1:22 AM

#

analog violet Only premium gooning for now

also openinference doesn't allow gooning requests so no gooning to that so yeah

coarse fable Oct 8, 2025, 2:33 AM

#

I got an idea that could kinda work for the free Provider in a "never happening hypothetical return(?)", Rate limits at specific services. The idea is simple, Free Janitor Users using a free model that got too many requests, they all get rate limit, like a single request per minute? Or two? Just affecting a single service instead of they all. That could not directly affect the other free users that used DeepSeek V3.1 for any other reasons, from helper to coding. It's kinda unfair that they all get affected because of Janitor users.

gaunt edge Oct 8, 2025, 2:42 AM

#

coarse fable I got an idea that could kinda work for the free Provider in a "never happening ...

Wouldn’t work - not good idea to punish third party apps for putting the app name in the api call. The obvious solution would be to just update it to not mention that it’s janitor ai, and then the situation is just worse

coarse fable Oct 8, 2025, 2:49 AM

#

gaunt edge Wouldn’t work - not good idea to punish third party apps for putting the app nam...

The hypothetical janitor ai updates are too smart—Anyways, seriously, Janitor would be capable of making its name incognito in api requests? Have they done that before or...? Damn, now that I think about it Openrouter can even get request from terminals, so they could not even do anything about a whole site going incognito.

vapid iris Oct 8, 2025, 2:50 AM

#

coarse fable The hypothetical janitor ai updates are too smart—Anyways, seriously, Janitor wo...

they have to specifically put the name in the header to be identified in the first place

#

thats why janitor wasn't on any of the charts before late september

#

even though their usage number was already insane

coarse fable Oct 8, 2025, 2:57 AM

#

vapid iris even though their usage number was already insane

I think the problem here sits somewhere between Janitor being hidden from view until last month and Openrouter, well, just being an open router. It ends up affecting providers and users with situations like this. Despite not being paid, the provider knew they wouldn't get anything, not even training data, from "charity work". But the main problem is a whole provider being overwhelmed by a single site... welp, there is nothing users can do now, except either paying for requests... or simply getting subscriptions from other sites since Openrouter does not have subscriptions.

Honestly, Openrouter having subscriptions for certain models and providers would be an idea? like, certain amounts of dollars for a certain amount of daily requests so it isn't an abuse(?)

vapid iris Oct 8, 2025, 2:58 AM

#

coarse fable I think here the problem is a middle between Janitor being hidden from view unti...

the only problem is with free models

#

since if they're paying, that's their own money to waste lol

vapid torrent Oct 8, 2025, 2:59 AM

#

coarse fable I think here the problem is a middle between Janitor being hidden from view unti...

why did it be hidden? did they only recently add the title headers?

vapid iris Oct 8, 2025, 2:59 AM

#

vapid torrent why did it be hidden? did they only recently add the title headers?

it was very recent, let me check when

#

sept 23/24

coarse fable Oct 8, 2025, 2:59 AM

#

vapid torrent why did it be hidden? did they only recently add the title headers?

Yeah, why was it hidden?

#

So Openrouter can't track it well or simply they were dumb?

vapid iris Oct 8, 2025, 3:00 AM

#

coarse fable Yeah, why was it hidden?

they didn't include it earlier, you have to specifically go out of your way to label it

#

it's opt-in not opt-out lol

#

https://openrouter.ai/docs/app-attribution

OpenRouter Documentation

App Attribution

Learn how to attribute your API usage to your app and appear in OpenRouter's app rankings and model analytics.

#

By including simple headers in your requests, your app can appear in our leaderboards and gain insights into your model usage patterns.
but it's more infamous than anything else 👀
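
For reference, the opt-in attribution described in that doc comes down to two optional request headers, `HTTP-Referer` and `X-Title`. A minimal sketch follows, with a placeholder key, URL, and app name; `attribution_headers` is a hypothetical helper, not part of any SDK.

```python
from typing import Dict, Optional


def attribution_headers(
    api_key: str,
    site_url: Optional[str] = None,
    app_title: Optional[str] = None,
) -> Dict[str, str]:
    """Build request headers, adding attribution only when provided.

    Leaving site_url and app_title unset means the usage is not
    attributed to a named app.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if site_url:
        headers["HTTP-Referer"] = site_url
    if app_title:
        headers["X-Title"] = app_title
    return headers


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(attribution_headers("sk-placeholder", "https://example.com", "Example App"))
```

Because the headers are set by the app itself, an app can stop sending them at any time, which is the point made earlier about per-app rate limits being easy to sidestep.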

coarse fable Oct 8, 2025, 3:02 AM

#

Yeahh, anyways. If anyone here that loved to use "gooner rp free models" I think the Z-Ai is good, if you use a simple yet good system prompt with no jailbreak(because it makes reasoning too bulky)

analog violet Oct 8, 2025, 3:29 AM

#

Off topic but using any free models with chutes as a provider feels like a gambling addiction

coarse fable Oct 8, 2025, 4:05 AM

#

analog violet Off topic but using any free models with chutes as a provider feels like a gambl...

Yeahh, also feel it, but since Chutes added compression to their models they started to get quirky

tough elk Oct 8, 2025, 11:08 AM

#

Won’t really use Z-AI, I’m more of a prompt person which is why i used deepseek and now it’s gone, forever probably

coarse fable Oct 8, 2025, 2:30 PM

#

tough elk Won’t really use Z-AI, I’m more of a prompt person which is why i used deepseek ...

The big reasoning Qwen of Vecine gets a normal prompt mode if you put /no_think in the post history instructions, and with the recommended parameters of Temperature=0.7, TopP=0.8, TopK=20, and MinP=0. It seems to work

#

With a 10K token RP chat it provides good replies, you just need to specify reply lengths or else it gets messy long. And the only problem is the too many requests error
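
Those settings could be expressed as a request body along these lines. This is a sketch under assumptions: the model slug is passed in rather than guessed, the post-history instruction is modeled as a trailing system message, and support for `top_k` and `min_p` depends on the provider.

```python
from typing import Dict, List


def build_no_think_body(
    model: str,
    history: List[Dict[str, str]],
    post_history: str = "Keep replies to two or three paragraphs.",
) -> dict:
    """Build a body with the sampling values quoted above and /no_think.

    The default post_history text is a hypothetical length instruction.
    """
    messages = list(history)
    # Post-history instructions go after the chat, ending with the soft switch.
    messages.append({"role": "system", "content": f"{post_history} /no_think"})
    return {
        "model": model,
        "messages": messages,
        "temperature": 0.7,
        "top_p": 0.8,
        "top_k": 20,
        "min_p": 0,
    }
```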

strong moss Oct 8, 2025, 2:44 PM

#

Hey y'all, is the other v3.1 provider safe to use? Read here that they use your data to train their models. But isn't it the same for deepseek official?

#

What's

#

The best course of action here if you want to continue roleplaying with deepseek

tough elk Oct 8, 2025, 6:07 PM

#

coarse fable The big reasinoning Qwen of Vecine got a normal prompt mode if you put /no_think...

Thanks, and I also tried out GLM again and it’s the wildest model I’ve ever seen. It gets straight into the not safe for work stuff and it doesn’t really matter what kind of chatbot you’re talking to. idk if it’s the model or if it’s my prompts since I have quite a list of prompts that are for the AI and definitely aren’t SFW so maybe it’s the prompt, I’m not sure.

tough elk Oct 8, 2025, 6:09 PM

#

strong moss Hey y'all, is the other v3.1 provider safe to use? Read here that they use your ...

Unfortunately there is no way of role playing with the unfiltered version of Deepseek v3.1. Open inference does offer free Deepseek v3.1 but it’s filtered and you’ll get a lot of 429 errors similar to the other free Deepseek models on openrouter. There is also this one provider that I know but it’s also filtered unfortunately.

coarse fable Oct 8, 2025, 6:10 PM

#

tough elk Thank and I also tried out GLM again and it’s the wildest model I’ve ever seen b...

I used the GLM but honestly gets so many format errors that I genuinely give up xd

signal sinew Oct 8, 2025, 6:11 PM

#

tough elk Thank and I also tried out GLM again and it’s the wildest model I’ve ever seen b...

it's the model, even as an AI assistant it knowingly flouts ethical guidelines with minimal prompting

lost carbon Oct 8, 2025, 7:07 PM

#

coarse fable I used the GLM but honestly gets so many format errors that I genuinely give up ...

I use longcat.

#

😭

opaque shore Oct 9, 2025, 8:49 AM

#

Wait can’t we just use deepseek v3.1 (not free) from openrouter? If we’re paying there shouldn’t be an issue right?

mortal igloo Oct 9, 2025, 8:52 AM

#

Paid DeepSeek is better in terms of availability (100% uptime)

#

Yeah

opaque shore Oct 9, 2025, 8:52 AM

#

Oh okay, I thought that was down too🥹

mortal igloo Oct 9, 2025, 8:53 AM

#

I use directly from DeepSeek (no longer v3.1 but v3.2 Exp) and been having a blast since. Topped up $15 in March and still have $4 now

opaque shore Oct 9, 2025, 8:54 AM

#

Is it less expensive to get it directly from deepseek?

mortal igloo Oct 9, 2025, 8:54 AM

#

Low context (32k) and cache hits are a blessing

glad basin Oct 9, 2025, 8:54 AM

#

deepseek is cheapest provider especially due to caching

mortal igloo Oct 9, 2025, 8:54 AM

#

opaque shore Is it less expensive to get it directly from deepseek?

^In my opinion yeah because of caching

glad basin Oct 9, 2025, 8:54 AM

#

you can enforce routing to deepseek on openrouter

wild sparrow Oct 9, 2025, 9:30 AM

#

mortal igloo I use directly from DeepSeek (no longer v3.1 but v3.2 Exp) and been having a bla...

would you be willing to share your settings so i dont waste cash?

mortal igloo Oct 9, 2025, 9:53 AM

#

wild sparrow would you be willing to share your settings so i dont waste cash?

I use SillyTavern (though all platforms can too), set context limit to 32k, and set Temperature to 1.2 (DeepSeek official API/OpenRouter provider converts this by deducting 0.7 so it's actually 0.5 temperature)
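The 0.7 deduction lines up with the temperature mapping DeepSeek published in the DeepSeek-V3-0324 model card, where the subtraction only applies above 1.0 and lower values are scaled instead. Whether that same mapping is applied to V3.1 / V3.2 Exp, or by every OpenRouter provider, is an assumption here. A minimal sketch of that mapping (the function name is made up for illustration):

```python
def effective_temperature(api_temperature: float) -> float:
    """Map the temperature sent to the API to the one used for sampling.

    Assumed mapping (from the DeepSeek-V3-0324 model card):
      0 <= T_api <= 1  ->  T_model = T_api * 0.3
      1 <  T_api <= 2  ->  T_model = T_api - 0.7
    """
    if not 0.0 <= api_temperature <= 2.0:
        raise ValueError("temperature must be within [0, 2]")
    if api_temperature <= 1.0:
        return api_temperature * 0.3
    return api_temperature - 0.7


# The 1.2 setting from the message above ends up at roughly 0.5
print(effective_temperature(1.2))
```

Under this mapping an API temperature of 1.0 corresponds to 0.3, so "deduct 0.7" is only a safe rule of thumb for settings above 1.0.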

#

Now about caching, it saves cash because your previous messages are kept in their system (for hours in my experience). But if your earlier messages or system prompt change, or a world info/lorebook entry activates (which inserts a new system message before the chat or within the chat), then the cache busts

#

If the earlier context is kept and nothing changes from the very beginning to the latest message, then the cache stays.

#

(Except once you reach 32k of context: older messages are removed for new messages to be inserted, thus no cache and you get billed normally)

#

I don't know if I can explain better than this jumbled mess lol

strong moss Oct 9, 2025, 10:18 AM

#

mortal igloo I use directly from DeepSeek (no longer v3.1 but v3.2 Exp) and been having a bla...

Would you recommend topping up to deepseek directly or on open router instead? I heard deepseek official is also a provider there so it works too right?

mortal igloo Oct 9, 2025, 10:22 AM

#

strong moss Would you recommend topping up to deepseek directly or on open router instead? I...

Yep. You can just use the DeepSeek provider in OpenRouter
Though they only serve their latest model, which is DeepSeek V3.2 Exp

#

And the caching works too

#

I use direct Deepseek because Openrouter's deepseek account balance can be zero (no auto top-up) so errors haha

brazen gale Oct 9, 2025, 10:24 AM

#

how do u do it??

mortal igloo Oct 9, 2025, 10:30 AM

#

brazen gale how do u do it??

Choosing the official Deepseek provider, you mean? In SillyTavern you can choose a provider, but for other platforms that do not have a provider selection tab, I think you can just set "Allowed Providers" in your OpenRouter account settings to DeepSeek only (use the model DeepSeek v3.2 Exp, they only serve that currently and paid)
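For anyone calling the API directly, the same restriction can be sketched per request instead of in account settings. This assumes OpenRouter's `provider` routing object with an `only` list, the provider slug `deepseek`, and the model slug `deepseek/deepseek-v3.2-exp`; none of these are spelled out in the thread, so check them against the provider routing docs before relying on them. The helper name is hypothetical.

```python
import json
from typing import Dict, List


def deepseek_only_payload(messages: List[Dict[str, str]],
                          model: str = "deepseek/deepseek-v3.2-exp") -> dict:
    """Build a chat completion body that is only allowed to route to one provider."""
    return {
        "model": model,          # assumed model slug
        "messages": messages,
        # Assumed routing field: restrict this request to the listed provider slugs
        "provider": {"only": ["deepseek"]},
    }


payload = deepseek_only_payload([{"role": "user", "content": "Hello"}])
print(json.dumps(payload, indent=2))
```

The resulting dictionary is what would be sent as the JSON body to the chat completions endpoint; if the provider is also blocked in account settings, the request would presumably fail with a "no endpoints" style error rather than fall back.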

vapid torrent Oct 9, 2025, 10:31 AM

#

You can also make a preset in openrouter

brazen gale Oct 9, 2025, 10:31 AM

#

dang im all so new to ts

wild sparrow Oct 9, 2025, 10:37 AM

#

mortal igloo Now about caching, it saves cash because your previous messages are kept in thei...

how do you cache?

vapid torrent Oct 9, 2025, 10:38 AM

#

wild sparrow how do you cache?

Automatic

#

if you use DeepSeek provider

#

or DeepSeek directly

wild sparrow Oct 9, 2025, 10:38 AM

#

like this?

mortal igloo Oct 9, 2025, 10:39 AM

#

wild sparrow like this?

Yes

#

The caching process is automatic

wild sparrow Oct 9, 2025, 10:39 AM

#

32k is a good amount for context as well

#

so that should last a while

vapid torrent Oct 9, 2025, 10:40 AM

#

Be warned caching goes out the window after you hit ctx limit

wild sparrow Oct 9, 2025, 10:40 AM

#

il drop 5$ in and see how it goes

vapid torrent Oct 9, 2025, 10:40 AM

#

as the messages leave FIFO style

wild sparrow Oct 9, 2025, 10:40 AM

#

how will i know?

vapid torrent Oct 9, 2025, 10:40 AM

#

it’s cheaper to have long ctx compared to small ctx with DeepSeek’s 10x cache discount

wild sparrow Oct 9, 2025, 10:41 AM

#

could you show your preset settings?

vapid torrent Oct 9, 2025, 10:41 AM

#

I don’t use

#

i just set to maximum

#

so it never evicts from ctx window

wild sparrow Oct 9, 2025, 10:41 AM

#

#

like that?

#

or whatever the max is for the model?

#

in this case 163840

mortal igloo Oct 9, 2025, 10:57 AM

#

Yeah if you want full context, Deepseek official provider only supports 128k

wild sparrow Oct 9, 2025, 10:58 AM

#

ive set it to 32k

#

so it saves money in the long term since i hardly go over 30 msgs

vapid torrent Oct 9, 2025, 11:15 AM

#

My longest chat was 47k tokens iirc

strong moss Oct 9, 2025, 12:47 PM

#

How does caching work? Does cache stay or something, considering that it's basically giving the bot an idea of the past events/roleplay?

#

It would work even if I use deepseek official through openrouter right

mortal igloo Oct 9, 2025, 12:56 PM

#

strong moss How does caching work? Does cache stay or something, considering that it's basic...

DeepSeek keeps caches for hours / they don't say it in their docs. And it's not exactly past events per se; the cache keeps whatever older messages sit at the front of the context you've set. The bot will only know the messages that are in the context, not the entire chat history

mortal igloo Oct 9, 2025, 12:56 PM

#

strong moss It would work even if I use deepseek official through openrouter right

yes

vapid iris Oct 9, 2025, 12:57 PM

#

according to their docs at https://api-docs.deepseek.com/guides/kv_cache,

Cache construction takes seconds. Once the cache is no longer in use, it will be automatically cleared, usually within a few hours to a few days.
so might be up to days, if their servers aren't under heavy usage.

strong moss Oct 9, 2025, 12:57 PM

#

That sounds neat, but I wish it lasted longer since my writing could take hours too lmao

vapid iris Oct 9, 2025, 12:58 PM

#

I saw people mentioning they had cache hit at 25 hours

strong moss Oct 9, 2025, 12:59 PM

#

ah neat

vapid iris Oct 9, 2025, 12:59 PM

#

but if you're doing a chat and your old messages get pushed out

#

might not be a cache hit anymore

strong moss Oct 9, 2025, 1:03 PM

#

Since I roleplay using proxy, I often just use the same character (with the same definition) so the character definition would be consistent, would that mean it's always a cache hit even if it's a different chat but the same character?

#

heck, would it be possible to maintain cache for weeks, maybe even months?

mortal igloo Oct 9, 2025, 1:05 PM

#

vapid iris according to their docs at https://api-docs.deepseek.com/guides/kv_cache, > Cach...

ohh never knew that. didn't read enough myself...

vapid iris Oct 9, 2025, 1:06 PM

#

strong moss heck, would it be possible to maintain cache for weeks, maybe even months?

zero chance 😂

storm seal Oct 9, 2025, 1:06 PM

#

strong moss heck, would it be possible to maintain cache for weeks, maybe even months?

You'd have to send 1 message every so often depending on the cache time

mortal igloo Oct 9, 2025, 1:07 PM

#

strong moss Since I roleplay using proxy, I often just use the same character (with the same...

yeah. but no cache once you've gone past your context limit: deepseek will re-cache everything, since the message right after the character definition (which would be the bot's first message) is replaced with whatever the next message is to fill the front of the context

strong moss Oct 9, 2025, 1:10 PM

#

OHHH so the cache would only hit until something like... 16k if I set it so?

#

Would setting the context window to 32k/16k be cheaper than the 64k/128k context with cache

mortal igloo Oct 9, 2025, 1:28 PM

#

Yep, if you set context to 16k, it'd cache the 16k like:

System prompt [500 tokens]
Bot definition [1000 tokens]
Persona definition [100 tokens]
Chat history [1st bot "greeting" message, 2nd message your reply, 3rd bot reply, 4th your reply, etc. totaling 14400 tokens]

This is what the 16k cached would be.

Then if you send your reply to the bot once the context is full, the bot's 1st "greeting" message would be dropped and the chat history would start from your 2nd message, like:

System prompt [500 tokens] <-- this is cached since nothing changed
Bot definition [1000 tokens] <-- cached
Persona definition [100 tokens] <-- cached
Chat history [2nd message your reply, 3rd bot reply, 4th your reply, 5th bot message, 6th your reply, etc. totaling 14400 tokens] <-- no cache since the bot's greeting message is gone and everything after it has shifted // there is a change. DeepSeek re-caches this on every message if you still use 16k context
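The effect described above can be sketched as a small simulation. It assumes the cache only reuses an identical leading run of the request (a prefix cache) and that the frontend evicts the oldest chat messages first; the token counts are illustrative and chosen so the blocks sum to exactly 16,000, and both helper names are made up.

```python
from typing import List, Tuple

Message = Tuple[str, int]  # (label, token count); counts are made up for illustration


def build_context(fixed: List[Message], history: List[Message], limit: int) -> List[Message]:
    """Keep the fixed blocks, then drop the oldest chat messages until the total fits."""
    budget = limit - sum(tokens for _, tokens in fixed)
    kept = list(history)
    while kept and sum(tokens for _, tokens in kept) > budget:
        kept.pop(0)  # FIFO: the oldest message leaves first
    return fixed + kept


def shared_prefix_tokens(previous: List[Message], current: List[Message]) -> int:
    """Tokens in the identical leading run of two requests (what a prefix cache could reuse)."""
    total = 0
    for old, new in zip(previous, current):
        if old != new:
            break
        total += new[1]
    return total


fixed = [("system prompt", 500), ("bot definition", 1000), ("persona", 100)]
history = [(f"message {i}", 600) for i in range(1, 25)]  # 24 * 600 = 14400 tokens

# While the chat is still growing, the whole previous request is a reusable prefix
early = build_context(fixed, history[:10], limit=16000)
later = build_context(fixed, history[:12], limit=16000)
print(shared_prefix_tokens(early, later))

# Once the window is full, adding a message evicts message 1 and shifts everything
full = build_context(fixed, history, limit=16000)
grown = build_context(fixed, history + [("message 25", 600)], limit=16000)
print(shared_prefix_tokens(full, grown))
```

In this model only the 1,600 tokens of fixed blocks stay reusable after the window fills, which is the "re-caches repeatedly" behaviour described above.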

#

So if you decide to use 128k, the cache would prove useful until you hit 128k. Beyond that, it re-caches the whole ~126.4k tokens of chat history

#

I use 32k context because when it re-caches, deepseek bills me $0.01 instead of $0.03 if 128k

#

cheap
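The arithmetic behind these figures can be sketched as follows. The $0.28 per million input tokens rate is an assumption back-derived from the roughly $0.009 per 32k quoted in this thread, and the 10x cache discount is the figure mentioned earlier in the chat; check the current price list before relying on either.

```python
MISS_PRICE_PER_M = 0.28  # USD per 1M uncached input tokens (assumption, see above)
CACHE_DISCOUNT = 10      # cached tokens assumed to cost 1/10 of the normal price


def input_cost(total_tokens: int, cached_tokens: int = 0) -> float:
    """Input cost in USD for one request, given how many tokens hit the cache."""
    if not 0 <= cached_tokens <= total_tokens:
        raise ValueError("cached_tokens must be between 0 and total_tokens")
    hit_price = MISS_PRICE_PER_M / CACHE_DISCOUNT
    uncached = total_tokens - cached_tokens
    return (uncached * MISS_PRICE_PER_M + cached_tokens * hit_price) / 1_000_000


for ctx in (32_000, 128_000):
    no_cache = input_cost(ctx)
    mostly_cached = input_cost(ctx, cached_tokens=ctx - 1_000)
    print(ctx, round(no_cache, 4), round(mostly_cached, 4))
```

Under these assumptions a full 32k request with no cache hit costs a little under one cent and a full 128k request a little under four cents, while a mostly cached 128k request comes out cheaper than an uncached 32k one. That is the trade-off between a small window that re-caches cheaply and a large window that rarely evicts.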

crimson egret Oct 9, 2025, 2:21 PM

#

Okay, guys. Can someone explain to me whether V3.1 is still working or if it’s completely no longer free? Some people say it can still be used, but one of the endpoints was removed, while others say it no longer works and has been shut down.

vapid iris Oct 9, 2025, 2:22 PM

#

crimson egret Okay, guys. Can someone explain to me whether V3.1 is still working or if it’s c...

paid v3.1 is working just fine

#

v3.1 free had two providers, DeepInfra and OpenInference

#

DeepInfra no longer provides free v3.1 due to the massive human horde from janitor

#

and OpenInference is a far smaller provider

#

so you're just going to get 429 errors

crimson egret Oct 9, 2025, 2:26 PM

#

So now it can’t really work properly on the free model?

vapid iris Oct 9, 2025, 2:28 PM

#

if you thanos snap 95% of the janitor horde, it'll work. but currently you're competing with way too many people for the available capacity.

crimson egret Oct 9, 2025, 2:31 PM

#

Thanks for the reply.

paper magnet Oct 9, 2025, 4:43 PM

#

vapid iris v3.1 free had two providers, DeepInfra and OpenInference

rlly? I was under the impression 3.1 free got axed then taken out the back and executed because of the graph shown earlier that it was over. I got 3.1 free from OpenRouter and I keep getting errors whenever I try to use it, and when I go on the error link, it says its no longer available

vapid iris Oct 9, 2025, 4:45 PM

#

paper magnet rlly? I was under the impression 3.1 free got axed then taken out the back and e...

It's effectively dead because of the number of users

willow junco Oct 10, 2025, 1:01 AM

#

mortal igloo I use 32k context because when it re-caches, deepseek bills me $0.01 instead of ...

If my context size is at 32k now, does that mean once I hit 32k context, it'll be cheaper?

mortal igloo Oct 10, 2025, 1:07 AM

#

willow junco If my context size is at 32k now, does that mean once I hit 32k context, it'll b...

there'll be caching when you build up the context/the chat gets longer until 32k, and as I mentioned normal 32k input price is $0.009 or 1 cent
so yes, cheap

willow junco Oct 10, 2025, 1:10 AM

#

mortal igloo there'll be caching when you build up the context/the chat gets longer until 32k...

🤔 I see. Thanks! I was a little confused since it costs $0.009 for me. ^^

mortal igloo Oct 10, 2025, 1:12 AM

#

oh wait

#

my calculation's wrong, sorry. It's $0.009 or 1 cent...

#

damn

chrome rune Oct 10, 2025, 2:19 AM

#

Hello, I need help. I get this message when I try to send a message in any chat on “Janitor AI.”

Screenshot_2025-10-09-20-18-47-537_com.android.chrome.png

rocky sedge Oct 10, 2025, 2:20 AM

#

3.1 is dead

chrome rune Oct 10, 2025, 2:21 AM

#

Shit

hushed hedge Oct 10, 2025, 11:42 AM

#

you can still use it from platform deepseek

abstract igloo Oct 10, 2025, 3:50 PM

#

you can still use it in OpenRouter, it's just the free endpoint that's gone

hidden plinth Oct 10, 2025, 5:50 PM

#

Does anyone else have a problem with DeepSeek V3.1 putting non-english characters in tool calls? It consistently does it for me, and causes tool calls to fail.

tropic kettle Oct 11, 2025, 5:41 PM

#

abstract igloo you can still use it in OpenRouter, it's just the free endpoint that's gone

do you mean https://openrouter.ai/api/v1/chat/completions is gone and that is the reason for this response?

{"error":{"message":"Not Found","code":404}}

abstract igloo Oct 11, 2025, 5:41 PM

#

no, i meant there isn't a free DeepSeek v3.1 anymore

abstract igloo Oct 11, 2025, 5:42 PM

#

tropic kettle do you mean https://openrouter.ai/api/v1/chat/completions is gone and that is th...

also, depending on what you're using, you should only use the https://openrouter.ai/api/v1 part
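The likely reasoning behind this advice: many OpenAI-compatible clients ask for a base URL and append `/chat/completions` themselves, so pasting the full endpoint there can produce a doubled path and a 404, while a raw HTTP request needs the full endpoint. A minimal sketch of a helper that accepts either form (the function name is hypothetical):

```python
def chat_completions_url(base_or_full: str) -> str:
    """Return the full chat completions endpoint for either a base URL or a full URL."""
    url = base_or_full.rstrip("/")
    suffix = "/chat/completions"
    # Avoid appending the path twice if the caller already passed the full endpoint
    return url if url.endswith(suffix) else url + suffix


print(chat_completions_url("https://openrouter.ai/api/v1"))
print(chat_completions_url("https://openrouter.ai/api/v1/chat/completions"))
```

Both calls print the same endpoint, so the helper is safe to use whichever form a settings field happens to contain.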

tropic kettle Oct 11, 2025, 5:46 PM

#

abstract igloo also, depending on what you're using, you should only use the https://openrouter...

i am just trying to test the endpoint just as it is in the documentation and i am receiving 404

const url = "https://openrouter.ai/api/v1/chat/completions";
const headers = {
  // API key is read from the environment
  "Authorization": `Bearer ${process.env.OPENROUTER_API_KEY}`,
  "Content-Type": "application/json"
};
const payload = {
  "model": "deepseek/deepseek-chat-v3.1",
  "messages": [
    {
      "role": "system",
      "content": "hello"
    },
    {
      "role": "user",
      "content": "If you built the world's tallest skyscraper, what would you name it?"
    }
  ],
  "temperature": 0.7
};

// Top-level await: run this as an ES module
const response = await fetch(url, {
  method: "POST",
  headers,
  body: JSON.stringify(payload)
});

const data = await response.json();
console.log(data);

#

any response?

abstract igloo Oct 11, 2025, 6:01 PM

#

idk, it worked here

#

try seeing if you have any provider preferences in your settings

jovial idol Oct 11, 2025, 8:47 PM

#

how do i see the reasoning tokens for v3.1 and v3.1-terminus

here is my code. i tried a lot but cannot see the reasoning for different kind of user messages.

Any help would be much appreciated.

client.chat.completions.create(
    model="deepseek/deepseek-v3.1-terminus",
    messages=[{"role":"system","content":system_prompt},{"role":"user","content":user_message}],
    temperature=0.1,
    top_p=0.1,
    stream=True,
    max_tokens=2500,
    extra_body={
        "provider": {
            "sort": "throughput"
        },
        "enable_thinking": True,
        "reasoning": {
            "effort": "high",
            "exclude": False
        }
    }
)

storm seal Oct 11, 2025, 8:59 PM

#

jovial idol how do i see the reasoning tokens for v3.1 and v3.1-terminus here is my code. ...

can you post the raw, full response here?

abstract igloo Oct 12, 2025, 12:58 AM

#

i think it needs to be
"reasoning": {
"enabled": true
}

lost carbon Oct 12, 2025, 7:57 AM

#

abstract igloo i think it needs to be "reasoning": { "enabled": true ...

if you include the effort param , reasoning is enabled by default
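A sketch of how the pieces discussed here fit together when streaming. It assumes OpenRouter's unified `reasoning` request object (with `enabled` or `effort`) and that streamed chunks carry thinking text in a separate `reasoning` field on each delta, next to `content`; the sample chunks below are invented, not a real response, and the helper name is hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple

# Request body fragment; "enabled" is the assumed flag name, "effort" would also turn reasoning on
request_extra = {"reasoning": {"enabled": True}}


def split_stream(deltas: Iterable[Dict[str, Optional[str]]]) -> Tuple[str, str]:
    """Collect reasoning text and answer text from a sequence of streamed delta objects."""
    reasoning_parts, content_parts = [], []
    for delta in deltas:
        if delta.get("reasoning"):   # assumed field name for thinking tokens
            reasoning_parts.append(delta["reasoning"])
        if delta.get("content"):     # regular answer tokens
            content_parts.append(delta["content"])
    return "".join(reasoning_parts), "".join(content_parts)


# Invented example chunks, shaped like the delta objects of a streamed response
sample = [
    {"reasoning": "The user greets me. ", "content": None},
    {"reasoning": "A short reply is fine.", "content": None},
    {"content": "Hello"},
    {"content": " there!"},
]
reasoning, answer = split_stream(sample)
print(reasoning)
print(answer)
```

If a loop only reads `content` from each delta, any reasoning text in a separate field is silently skipped, which would look like "no reasoning" even when the model is thinking.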

sweet zodiac Oct 14, 2025, 1:10 AM

#

i can't find the good things from v3.1,
i used R1 and V3 in the past and the prompt(s) fit them.
I believed the DS family was trained on similar datasets and would behave the same.
however when i change the model from R1 to V3.1, the results are ruined.... 😂

spice marsh Oct 14, 2025, 1:11 AM

#

Different models need different prompts, even in the same lineup

lost carbon Oct 14, 2025, 2:59 AM

#

spice marsh Different models need different prompts, even in the same lineup

biggest learning tbh , hence you have to be mindful when you build agents.

#

(DSPy fixes this)

dense wolf Oct 15, 2025, 3:58 PM

#

mortal igloo there'll be caching when you build up the context/the chat gets longer until 32k...

I dont see Deepseek as a provider for 3.2 exp though

#

And setting all other providers for the model to ignored just gives me a no endpoints error

abstract igloo Oct 15, 2025, 4:00 PM

#

dense wolf I dont see Deepseek as a provider for 3.2 exp though

you probably have to enable this in your settings

#

DeepSeek

dense wolf Oct 15, 2025, 4:01 PM

#

I did enable it just now and still got the error

#

Let me reopen my app though

#

Just in case

#

Seems to have worked

atomic quartz Oct 15, 2025, 7:03 PM

#

Hey folks, does anyone know why OpenRouter is throwing an error?

storm seal Oct 15, 2025, 7:44 PM

#

atomic quartz Hey folks, does anyone know why OpenRouter is throwing an error?

You disabled allowing providers to log/train on models

humble echo Oct 16, 2025, 12:22 AM

#

in a chubapp i see deepseek 3.1 base, 3.1 terminus and 3.1 chat

What is the difference (?)

spice marsh Oct 16, 2025, 12:25 AM

#

3.1 Base -> Won't chat with you, it will complete your sentences
3.1 Chat -> It'll be able to chat
3.1 Terminus -> This is 3.1 Chat with minor bug fixes (some repetition issues and mixing english and chinese)

humble echo Oct 16, 2025, 2:03 AM

#

spice marsh 3.1 Base -> Won't chat with you, it will complete your sentences 3.1 Chat -> It'...

thanks bro!!

atomic quartz Oct 16, 2025, 8:22 PM

#

storm seal You disabled allowing providers to log/train on models

Hmm, where and what should I turn on?

storm seal Oct 16, 2025, 8:22 PM

#

atomic quartz Hmm, where and what should I turn on?

turn on the first 3 options in the privacy section of your account settings

atomic quartz Oct 16, 2025, 8:32 PM

#

storm seal turn on the first 3 options in the privacy section of your account settings

Oh, thanks. But after doing this, the chat shows an error: "No allowed providers are available for the selected model." Thanks for helping

storm seal Oct 16, 2025, 8:33 PM

#

atomic quartz Oh, thanks. But after doing this, the chat shows an error: "No allowed providers...

Did you black list Chutes?

atomic quartz Oct 16, 2025, 8:36 PM

#

storm seal Did you black list Chutes?

No, I don't even know that there is some kind of list )

vapid iris Oct 16, 2025, 8:37 PM

#

atomic quartz Oh, thanks. But after doing this, the chat shows an error: "No allowed providers...

Do you have option 5 (ZDR) activated?
It should be turned off.

#

atomic quartz Oct 16, 2025, 8:39 PM

#

I didn't enable this option.

storm seal Oct 16, 2025, 8:39 PM

#

atomic quartz No, I don't even know that there is some kind of list )

I'm not sure why you can't use DeepSeek v3.1 (free)

vapid iris Oct 16, 2025, 8:40 PM

#

atomic quartz I didn't enable this option.

can you show a picture of your allowed/ignored providers section at https://openrouter.ai/settings/preferences ?

atomic quartz Oct 16, 2025, 8:46 PM

#

Is this the providers section at ?

openrouter.ai_settings_preferencesiPhone_XR.png

vapid iris Oct 16, 2025, 8:47 PM

#

atomic quartz Is this the providers section at ?

yes, can you show the top too?

atomic quartz Oct 16, 2025, 8:50 PM

#

Oh, sorry, the top point didn't make it into the screenshot above.

vapid iris Oct 16, 2025, 8:52 PM

#

Hmm, this is weird. All your settings are correct, V3.1 (free) should be working fine for you, hmm.

atomic quartz Oct 16, 2025, 8:54 PM

#

The account isn't banned. ) Z.AI: GLM 4.5 Air (free) also doesn't work

Could the models not working be related to geolocation?
Is there technical support for Openrouter?

vapid iris Oct 16, 2025, 8:57 PM

#

I'm not sure on that front, sorry. Maybe someone else will know, but I'm not aware of OpenInference geo-blocking.

atomic quartz Oct 16, 2025, 9:13 PM

#

Is this a problem?

abstract igloo Oct 16, 2025, 9:16 PM

#

i don't think it should be

buoyant willow Oct 18, 2025, 12:51 PM

#

anyone else's v3.1 free model just being weird

#

it js gives "1.1..1.1.1..1.11" or similar gibberish

storm seal Oct 18, 2025, 1:00 PM

#

buoyant willow it js gives "1.1..1.1.1..1.11" or similar gibberish

Try another provider

If using a different provider stops that, then you know it's that particular provider's fault

#

And you all can report that provider here

buoyant willow Oct 18, 2025, 1:04 PM

#

storm seal Try another provider If using a different provider stops that, then you know it...

There's only 1 provider for free tho ;-;

#

OpenInference

#

I'll see

storm seal Oct 18, 2025, 1:05 PM

#

Ah

#

Oops, I didn't know that

buoyant willow Oct 18, 2025, 1:06 PM

#

I tried both yesterday and the day before, im not sure whats the issue

#

Via openrouter chatroom

#

and aider

storm seal Oct 18, 2025, 1:06 PM

#

buoyant willow it js gives "1.1..1.1.1..1.11" or similar gibberish

Set your temp to 0.3, and if it still happens then it's confirmed its an Open inference issue

#

See if you can reproduce it in a new chat if it's an Open inference issue, and Toven will yell at OpenInference to fix it

buoyant willow Oct 18, 2025, 1:07 PM

#

storm seal Set your temp to 0.3, and if it still happens then it's confirmed its an Open in...

Ah so it seems it defaulted to 1 for chatroom and 0 for aider, I'll try with 0.3

#

One sec js gotta get back home

buoyant willow Oct 18, 2025, 1:20 PM

#

storm seal Set your temp to 0.3, and if it still happens then it's confirmed its an Open in...

Appears 0.3 works but for some reason it gave me an end of sentence token at the end which appeared on the chat message? https://openrouter.ai/chat?room=orc-1760793523-IhYneTn3lo9GCKT4SW3N

But aside from that it works 👍

storm seal Oct 18, 2025, 1:20 PM

#

buoyant willow Appears 0.3 works but for some reason it gave me an end of sentence token at the...

I can't see the chat btw

#

Chats are stored locally, not on server

buoyant willow Oct 18, 2025, 1:22 PM

#

Oh wait hih

#

huh

#

Screenshot_2025-10-18-16-23-18-528_com.android.chrome.png

#

Nvm its spewing nonsense again

#

It may be a specific token count cuz it only happened when i pasted a large block of text

storm seal Oct 18, 2025, 1:45 PM

#

buoyant willow It may be a specific token count cuz it only happened when i pasted a large bloc...

Can you post that here?

void brook Oct 18, 2025, 9:27 PM

#

buoyant willow It may be a specific token count cuz it only happened when i pasted a large bloc...

Based on my experience, if we use the non-free providers and choose a good provider, I don't get that even with a large block of context

#

It's mostly because of the provider of the free models
Which is understandable for free models, but I guess if they can fix it then it's amazing of them as a provider

mortal igloo Oct 19, 2025, 1:31 AM

#

Perhaps because of the quantization? OpenInference provides free DS v3.1 at int8 precision which is lower than fp4 I think

gaunt edge Oct 19, 2025, 1:47 AM

#

int8 is less quantised than fp4 (so, int8 is higher quality). for some reason i thought openinference v3.1 free said fp4, but i checked and it does say int8

void brook Oct 19, 2025, 9:29 AM

#

gaunt edge int8 is less quantised than fp4 (so, int8 is higher quality). for some reason i ...

Agree on this, but I don't know about the new NVFP4; I heard that if a model goes through the pre-training, SFT, then RL phases in NVFP4 format, it will be comparable to BF16.

I guess that makes sense, because when DeepSeek did the original training in FP8, the model actually ended up closer to BF16 performance than if it had been trained in BF16 and then quantized to FP8.

The key point is what format the model was originally trained in.

unkempt stratus Oct 23, 2025, 1:07 AM

#

Yall I'm using the og 3.1 chat

#

Not the terminus

#

And it's lagging so hard, I'm using paid

#

What's going on

full relic Oct 30, 2025, 7:03 AM

#

i am trying to make a local language ai chat bot using the deepseek v3.1 from openrouter, but the url for the request seems to be broken. i am using https://openrouter.ai/api/v1/chat/completions for the api, but the chatbot shows "url not found". what could be the possible cause and a solution to this?

full relic Oct 30, 2025, 9:41 AM

#

Seems to work now

unkempt gorge Nov 19, 2025, 10:17 PM

#

response tokens are coming through empty and everything is ending up in the reasoning field today for some reason? tried across several providers, all of which were fine before

vernal wadi Nov 28, 2025, 8:29 PM

#

It loaded endlessly long, only to finally respond with a very short message. But it consumed 3x the amount of tokens that my input was?
I assume it has "invisible" thinking?

#

Not sure if that is good or bad...

abstract igloo Nov 28, 2025, 8:31 PM

#

if you click to see more info you can see how many of them were reasoning tokens

#DeepSeek V3.1