Synthetic.new | Kilo | Page 1

shell lava Oct 9, 2025, 2:10 AM

#

https://synthetic.new/landing/home

$20 / mo for 135 messages every five hours
$60 / mo for 1,350 messages every five hours

Many of the popular models (a sample):

deepseek-ai/DeepSeek-V3.1
deepseek-ai/DeepSeek-V3.1-Terminus
moonshotai/Kimi-K2-Instruct-0905
openai/gpt-oss-120b
Qwen/Qwen3-235B-A22B-Instruct-2507
Qwen/Qwen3-235B-A22B-Thinking-2507
Qwen/Qwen3-Coder-480B-A35B-Instruct
zai-org/GLM-4.5
zai-org/GLM-4.6

Embeddings are included for free.
Custom models are also supported, at per-minute rates.

quiet hornet Oct 9, 2025, 2:14 AM

#

Mandatory mention: $0.01/Mtoken is the pay-as-you-go price for the embedding model
For people who only want a super cheap embedding API

shell lava Oct 14, 2025, 3:21 PM

#

@wheat thunder

I’m pretty pleased with Synthetic. Super responsive team. Wide variety of models for $20/mo. Their GLM 4.6 is quick, around 50-90 TPS. I also use Terminus and Qwen3-Coder.

wheat thunder Oct 14, 2025, 3:23 PM

#

shell lava I’m pretty pleased with Synthetic. Super responsive tea...

Yeah, it's a shame they don't have the new DeepSeek V3.2-Exp. Though my understanding is that it's not much of an improvement in quality, just in compute cost and speed.

#

How fast are their Terminus and Kimi K2?

#

I currently use NanoGPT as my other provider, and they are okay but kind of slow, to be honest. They have broader model support than Synthetic, so that's something.

fringe goblet Oct 14, 2025, 3:25 PM

#

this looks nice 🤔

shell lava Oct 14, 2025, 3:28 PM

#

Terminus is solid for speed. It's not 1000 tok/s or anything, but I’m never waiting for it. I don’t use Kimi at all.

NanoGPT definitely has broader model support, but it also uses Chutes as the backend for a lot of models, so it’s spotty and the tool calling isn’t great.

Synthetic really spends a ton of time getting a few models right.

#

I’m a bit unique in that I still have more Kilo credits than I know what to do with, but once those run out I think the majority of my spend will be with Synthetic on GLM and Terminus, with occasional spend on Requesty or OpenRouter for the larger SOTA models like Codex or Claude.

#

They also have a super active Discord with the founders.

raw pawn Oct 15, 2025, 11:48 AM

#

shell lava I’m pretty pleased with Synthetic. Super responsive tea...

Where does Terminus fit into your workflow? Docs?

shell lava Oct 15, 2025, 2:38 PM

#

raw pawn Where does Terminus fit into your workflow? Docs?

Code generation.

raw pawn Oct 15, 2025, 3:06 PM

#

shell lava Code generation.

Oh interesting, pretty good code gen model? I've yet to use it. Need to do some testing on things other than Qwen, maybe I'll check out Terminus.

shell lava Oct 15, 2025, 3:10 PM

#

It’s pretty solid. I want to do some work with 3.2 next

shell lava Nov 12, 2025, 1:31 AM

#

Good news! The Synthetic team just pushed a fix to help them recognize Kilo tool calls the same as they do for Claude Code.

So now, if you have JSON tool calling enabled with Synthetic, the request you type in a sequence counts as 1 message, but each tool call initiated by Kilo counts as 0.05 (like they do for Claude Code).

If you have good, cohesive prompts that allow the model to work independently, this change will likely make your subscription go 5-10x further for the same workload.
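
To make the arithmetic concrete, here is a rough sketch of how that weighting plays out. It assumes that before the fix each Kilo tool call counted as a full message (that baseline is an assumption, not something confirmed by Synthetic), and the function names are made up for illustration.

```python
# Rough sketch of the message accounting described above.
# Assumption: before the fix, every Kilo tool call counted as 1 full message.
# The 0.05 weight for tool calls is the figure quoted in this thread.

def messages_used(prompts: int, tool_calls_per_prompt: int, tool_call_weight: float) -> float:
    """Quota consumed: each typed prompt counts 1, each tool call counts tool_call_weight."""
    return prompts * (1 + tool_calls_per_prompt * tool_call_weight)


def stretch_factor(tool_calls_per_prompt: int,
                   discounted_weight: float = 0.05,
                   full_weight: float = 1.0) -> float:
    """How many times further the same quota goes once tool calls are discounted."""
    before = messages_used(1, tool_calls_per_prompt, full_weight)
    after = messages_used(1, tool_calls_per_prompt, discounted_weight)
    return before / after


if __name__ == "__main__":
    for calls in (5, 10, 20):
        print(f"{calls} tool calls per prompt: {stretch_factor(calls):.1f}x")
```

Under that assumption, 5 tool calls per prompt works out to about 4.8x, 10 to about 7.3x, and 20 to 10.5x, which lines up with the 5-10x estimate for prompts that let the model work on its own for a while.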

golden harness Nov 13, 2025, 4:04 AM

#

shell lava Good news! The Synthetic team just pushed a fix to help them recognize Kilo tool...

that's pretty sweet. i wanted a few more requests than synthetic offered and was getting tired of chutes' inconsistency, so this may give me the cushion i need to avoid needing more individual requests.

shell lava Nov 13, 2025, 4:06 AM

#

I implemented this today: https://github.com/mcowger/priceServer/blob/main/prices.py

was ~30 requests in total.

golden harness Nov 13, 2025, 4:14 AM

#

shell lava I implemented this today: https://github.com/mcowger/priceServer/blob/main/price...

sick. so, forgive the ignorance/lack of advanced research... my experience with chutes and hamfisted solutions the past few months has me a little jaded. where does synthetic sit in the grand scheme of things as far as the individual models and their ability to perform as expected?

do models such as kimi k2 thinking, glm 4.6, etc think/reason as expected with the provider as is? i understand any of this could be both kilo or provider limitations, so just wanted to mostly know what to expect.

shell lava Nov 13, 2025, 4:18 AM

#

They don’t offer all that many self-hosted models, not nearly as many as Chutes does. I think currently it’s M2, GLM, K2-Thinking?

But the self-hosted ones are done really, really well. The 2 founders are super responsive to implementation bugs, usually fixing stuff within an hour or two. Super responsive on Discord.

In at least 1 case, they are better than even the model lab itself (GLM / Z.ai).

When I develop the JSON tooling stuff for Kilo, there are only 2 providers I trust to have the right implementation, and those are Synthetic and DeepInfra.
