#Qwen3-Coder-480B-A35B-Instruct

369 messages · Page 1 of 1 (latest)

iron estuary
#

New Qwen coding & agent specific model with 1M token context length, now available on https://chat.qwen.ai and https://app.hyperbolic.ai/models/qwen3-coder-480b-a35b-instruct.

My initial testing indicate performance close to Kimi-K2, impressive for 1/2 the total parameters.

iron estuary
#

I used Qwen3-Coder-480B-A35B-Instruct to generate a procedural 3D planet preview and editor.

Very strong results! Comparable to Kimi-K2-Instruct, maybe a tad bit behind, but still impressive for under 50% the parameter count.

Creds The Feature Crew for the original idea.

mossy pawn
#

oh shoot

hoary prairie
#

impressive for 1/2 the active parameters
it has 1/2 the total parameters and 3b more active parameters

winter shore
hoary prairie
#

Qwen3-Coder is here! ✅

We’re releasing Qwen3-Coder-480B-A35B-Instruct, our most powerful open agentic code model to date. This 480B-parameter Mixture-of-Experts model (35B active) natively supports 256K context and scales to 1M context with extrapolation. It achieves

torpid edge
#

Don't care, give me the official API, I am feeling like just donating them money 😛

#

I guess tiered pricing is all the rage in China, I think bytedance doubao does this shit too

#

Must suck to implement this for you though

winter shore
#

yeah i was ready to take it live when they announced… i should’ve asked about pricing way earlier

#

i was not expecting 4!! tiers

#

we can do the two tiers now since gemini etc do it..

magic linden
#

significantly more expensive than Gemini 2.5 Pro at full context.... poof

cedar remnant
#

is dis model on some sort of api anywher

torpid edge
#

Hyperbolic

cedar remnant
#

is it on openrouter yet

winter shore
#

soon

torpid edge
#

hmm Hyperbolic doesn't even have prices for this yet lol

cedar remnant
brisk glacier
torpid edge
#

tbh tiered pricing is closer to reflecting their actual computational cost

covert schooner
#

thats like double deepseek's, while being a smaller model
lets see what other providers will cook up

silk flame
#

They're probably pricing it based on the benchmarks. It's supposed to be just a little behind Claude Sonnet 4, but better than Gemini or GPT 4.1.

torpid edge
#

Good chance we may see some sub 0.5/M for input

covert schooner
#

yeah, seems unlikely to that for up to 128k, prices from other providers wont be competitive with deepseek

#

higher than that, probably not

torpid edge
#

Problem is Claude does implicit context caching which becomes even cheaper at 0.3/M

#

With open-weight agentic models catching up, the ball is in the court for some of the providers to actually start implementing context catching

covert schooner
#

did any even implement one like claude's?

#

because if no, they'll seriously have to work on that

torpid edge
#

No, I mean DeepSeek and Kimi does, but none of these Western inference providers to my knowledge

trail glen
fallow nest
snow coyote
snow coyote
#

can we get better prices with other providers?

hoary prairie
#

Most likely yes

fallow nest
winter shore
torpid edge
#

Btw, recommended params:

temperature=0.7, top_p=0.8, top_k=20, repetition_penalty=1.05

gaunt ore
open lotus
#

qwen3 is really killing it now

#

newest coder is amazing

mossy pawn
open lotus
#

its on hyperbolic

mossy pawn
#

i mean cli

#

qwen code cli

open lotus
#

ah no

snow coyote
#

is it really in same quality as claude 4?

open lotus
#

its really good

#

though they sure know it putting the price like it is in comparison to deeepseek

#

1 / 5

#

Hopefully another provider offers it at a better price, it would be cheaper to run that deepseek

neat jasper
#

this model so benchmaxxed 😪

open lotus
#

? my usual tests worked really well with it

snow coyote
open lotus
#

code of my own from some stuff that all but gemini 2.5 pro failed at

soft birch
#

They say it outperforms kimi k2 in benchmarks but I personally don't experience that at all

hoary prairie
soft birch
#

It agrees way easier and makes more often than not broken code that requires additional tokens to be fixed 💀

hoary prairie
#

I really hope they add it soon

torpid edge
#

oh right yeah

hoary prairie
#

oh replied to wrong msg

#

sry

torpid edge
#

Unfortunately at this point I have basically implemented a lot of openrouter myself somehow

winter shore
#

hmm? what parts?

hoary prairie
torpid edge
#

Automatic sampling parameter adding, the annoying /think and /no_think stuff for qwen 3 etc

hoary prairie
#

Good thing that they are leaving the /think stuff behind

#

Much simpler

soft birch
#

agreed

snow coyote
#

is it worth using it instead of copilot(claude 4, gpt 4.1)?

#

with cline

hoary prairie
#

too early to say for sure

soft birch
snow coyote
#

but price is good

soft birch
#

price for both is amazing compared to claude opus 😂

#

not referring to highest tier of qwen3

torpid edge
#

I hope DeepSeek v4 chills out and takes a month or two more, I am fatigued

snow coyote
soft birch
#

and usually agentic can condense context

snow coyote
snow coyote
soft birch
#

it will then update accordingly whether that provider supports prompt caching or not etc.

snow coyote
#

nice

#

this is so cool

brittle adder
#

How does Qwen3 stack against Opus?

soft birch
#

kimi k2 better imo

torpid folio
#

Adam Holter's benchmark

soft birch
torpid folio
#

On this benchmark, yes

soft birch
#

surprising

torpid folio
#

(This is mostly a coding benchmark)

solar token
#

add. caching. now.

silk flame
#

Do you have a link to the benchmark?

torpid edge
#

They haven't really managed to purge out the thinking tendencies:

evaluate_expr_tail([], Acc, Acc).
evaluate_expr_tail([add|Ops], Acc, Result) :-
    term(Term, _), % This is a bit tricky - we need to extract the next term
    % Let me rewrite this more cleanly
    fail.

% Let me rewrite the parser to make evaluation easier
% Simpler approach: build binary tree directly
solar token
#

Qwen coder seems like a model that clears kimi k2 and matches sonnet 4 - I left a showcase of what I built with it in #app-showcase 🐐

glad reef
#

How is it at non coding tasks

dull sundial
#

Qwen3-Coder is available in multiple sizes
I suggest using a more specific model slug other than qwen3-coder since other sizes will follow.
qwen3-coder-480b-a35b would allow for other sized qwen3-coder slugs

somber spindle
#

what’s the point if the tiered pricing is as it is right now

winter shore
#

once we get the other models will probably make the change

#

you can call the model rn with qwen/qwen3-coder-480b-a35b-07-25

hushed spear
# torpid folio https://docs.google.com/spreadsheets/d/1FKO1i63BO-8353_wP7iAlowrSxdcT3ZAIZZbzARu...

I tried the first prompt which is labeled as fail for qwen3-coder and this is what I get at one-shot

https://chat.qwen.ai/s/deploy/8a0db5e0-9fb0-4f5a-9d12-6d9c5a2c0d86

Some small issues like I can't select a tower and upgrade or sell it. But it's definately playable

boreal pasture
#

I think it did better than kimi k2 at my chess game prompt

#

Used hyperbolic provider with unsloth recommended settings

hushed spear
#

It seems currently all 3rdparty providers are all using fp8, while the model itself is bf16 originally.

astral ingot
iron estuary
#

I'm using the model in bf16 from the 1st party provider.

Incredibly strong showing creating a web OS! Probably the best result I have ever seen, period.

Notice the window reordering, resizing and minimizing logic at the end, it’s the first model to ever get this right. Also allows multiple instances of a given app with separate state.

This was created one-shot using the prompt:

Using Python, create a website that emulates an operating system.

Upon opening this website, a user should be able to view a desktop environment and use simple but WORKING applications such as a file browser, a text editor (which can edit and save text files), a web browser, a terminal and a calculator.

All of these applications should have at least all basic functionality working.

The operating system should have GUI elements, and have the general asthetic and feel of an OS.

Implement all the features listed above.
astral ingot
boreal pasture
#

Alibaba provider somehow did worse on my test than hyperbolic

#

Ill try again

#

Better now

#

Rather inconsistent

solar token
#

@winter shore can we expect caching for qwen 3 coder to be supported on Open Router?

inner patio
#

Don't mind me im just posting so the channel stays in my list 😄

dull sundial
#

Tested Qwen3-Coder-480B-A35B:

As expected from a coding focused model - most concise Qwen3 model

  • 46 % less tokens than DeepSeek V3 0324
    While competent for general use, too, performed best in STEM (math) and coding obviously.

During creation of demo pages and further probing, it showcased several obvious weaknesses such as producing buggy collision, glaring UI oversights in multiple projects, and in general required error correction that was not necessary on models such as DeepSeek V3 0324.

For a massive, coding specialized model I personally was not convinced by its coding results, combined with the quite poor price/performance on current API offerings.

However, as always - YMMV!

neat jasper
rich field
#

It's also on Chutes now

tight bane
#

For those providers using vLLM, the best way to do caching is with LMCache.

atomic rune
#

(assuming you're not running it yourself)

formal elk
#

does this model have a cache? if not, if this is not > sonnet and 2.5 pro, its Overprice

solar token
atomic rune
#

it's the providers

#

if the provider you're using supports caching, then openrouter will support it too

solar token
# atomic rune if the provider you're using supports caching, then openrouter will support it t...
atomic rune
#

or if they told you it's not supported yet

#

might be the weird pricing model that's making it hard

solar token
atomic rune
#

gotcha

winter shore
atomic rune
#

ah ok

winter shore
#

pick the caching token values from the upstream usage and use it correctly + add the pricing

#

haven't gotten to it yet

#

for existing providers it's just do the thing

#

but for moonshot no

#

lemme look at alibaba tho

atomic rune
#

huh, that should really be something that's part of the completions spec

#

though really I wish the dollar value was part of the completions spec..

formal elk
solar token
#

Toven confirmed they dont have cache support rn for alibaba

winter shore
#

I am testing it now and can't get it to trigger

#

on their end

torpid edge
#

The unit price of cached_token is 40% of the unit price of input_token

not exactly great

tacit tangle
dull sundial
dull sundial
#

would love to know someones experience who uses coding agents on a daily basis, since #1 my bench isn't a coding specific bench (very little tasks in fact) and #2 doesn't use agentic workflows at all (also I don't plan to add it since I only want to test/evaluate what I am knowledgeable in).

broken crane
astral ingot
broken crane
mystic dome
dull sundial
brittle adder
dull sundial
dull sundial
#

opus did my entire chess replay system in 10 minutes work. its beautiful

#

3 prompts and 2 minor ui fixes. wouldnt have been possible 1 year ago without major headache

brittle adder
#

I’m just so glad to offload repetitive tasks anymore. Always check the code (even if it’s something it’s done hundreds of times), but holy crap have I saved so much dev time to focus on the work that I actually need/want to do it’s fantastic. Helluva time to be a software engineer!

broken crane
#

Just finished testing Qwen3 Coder on my coding eval set, worse than Kimi K2. Will post full results soon.

frozen relic
#

"qwen3-coder-plus" and "qwen3-coder-480b-a35b-instruct" seem to have different prices on Alibaba. Are they actually the same model?

tacit tangle
broken crane
#

Yeah my evals don't have tool call yet (it just tests that model supports tool call). I'm thinking of how to design an eval for tool call. Likely need to involve multi-round message and loops.

#

Some models can do multiple parallel calls in one response, which speeds up things.

#

I think for starter I'll test parallel tool call support and how many rounds it takes to get the result.

broken crane
#

A bit late, but finished testing Qwen3 Coder on my personal evals. Strong second in the open-source field. Qwen3 Coder outperforms DeepSeek V3 (New), but remains a step behind Kimi K2.

Observations:

  • On standard medium-level tasks, Qwen3 Coder is among the best. It matches premium models in producing correct, concise code for markdown cleaning task.
  • For more complex or formatting-sensitive challenges, like benchmark visualizations, it can lag behind due to rigid output formats or missing polish in its visual code.
  • The main shortfall is in logical reasoning for uncommon programming patterns, such as advanced TypeScript narrowing, where it falls short alongside almost all open LLMs.
  • Instruction-following is not particularly good, Qwen3 Coder tends to output more verbose blocks for "output only diff" tasks, similar to Kimi K2.

Overall better than DeepSeek V3 (New), but worse than Kimi K2 and Gemini 2.5 Pro. Not close to top model performance, such as Claude 4 models and GPT-4.1

Full evaluation blog post: https://eval.16x.engineer/blog/qwen3-coder-evaluation-results

unkempt mist
#

Looks like Cerebras is prepping a launch of qwen coder on their platform

granite temple
#

How does the pricing work here for the Alibaba providers?
Input $1.50 to $4.50
Output $7.50 to $22.50

What is the criteria for the price spike?

rotund python
#

Toven know you're busy but there's no way to see the tiered pricing on mobile, and tap on android/brave does not bring up a tooltip

winter shore
#

🫠

granite temple
inner patio
#

qwen coder seems to have inherited the worse traits of both claude & gemini.
the claude yes man behaviour combined with the gemini2.5-06 "the user is wrong I will fight the user" behaviour. if qwen thinks something must be done it will do it no matter what. instructions and rules be damned. "you are asking this but I think you are asking me something else" No. im not. do what I asked. I understand this file is important to you, but theres a made up error I hallucinated so im going to remove it. I know you told me not to. great model overall but holy fuck is this kind of behaviour infuriating

brisk glacier
#

Why's that ?

astral ingot
#

they probably havent added it yet

brisk glacier
#

2k tok/s for $2/$2

#

Not too shabby

astral ingot
#

been loving kimi with groq, this might be a good alternative

brisk glacier
solar token
#

Is cerebras added for qwen 3 coder?

#

@winter shore can you add it? Seems like they added it on their site

winter shore
#

i am out today but team is aware

solar token
#

Bet tysm

winter shore
#

👀

solar token
#

tysm brother

#

@winter shore so just tested it, seems like it return error that toolcall failed, idk if something is miss configurated but yeah just letting you know.

brisk glacier
#

does the API take a moment to refresh?
I limited the provider to just cerebras and I'm getting this error: "No endpoints found for qwen/qwen3-coder."

pulsar laurel
#

I hold out hope they'll answer questions about quantisation, but I expect they won't - they've ignored everyone asking so far

rotund python
pulsar laurel
brisk glacier
formal elk
dull sundial
# brisk glacier Yeah 7.5 M tokens / day is pretty disappointing

how so? Cerebras api is $2 mtok, so 7.8 million tok would be $15 (per day), on a $50 monthly plan. if we adjust for profit margin, maybe that's half to break even, still uses up the monthly fee in less than a week on a power user. sounds more like a hyper-inefficiency problem to me, you really don't need to feed 50k context on every edit.

brisk glacier
dull sundial
#

never used claude max, so wouldn't know. from a pure numbers standpoint, the cap seems quite generous, but I don't know the competing offers because I don't code like this.

brisk glacier
#

to be clear, I don't "vibe code" and send abnormal amounts of context for simple stuff. But given the "agentic tools" (e.g: opencode) we have today, those limits will run out fast.

broken crane
astral ingot
#

theres no tool support on cerebras via openrtouer?

fickle mantle
#

best free model to use in opencode? kimi k2 small context but better than qwen 3 coder maybe. any others

rocky blaze
fair breach
#

They changed it to 24 million so apparently their profit margin is pretty healthy serving this…

fickle mantle
alpine flax
#

Qwen3 Coder was the only open source model I tested that succeeded in implementing playable Monopoly.

uneven flint
#

Qwen3 is not bad in opencode for me.

slim vigil
#

How come the free version has been removed today from OR? I had been using for a few days and was very happy

frigid ginkgo
#

Is it really gone? 😭

slim vigil
# frigid ginkgo Is it really gone? 😭

I found out when trying today in Cline. Went and checked the website and indeed it's not there anymore. :S Is there somewhere they inform about additions/removals?

lethal yacht
#

its like 20 cent input and 80 cent output for 1M tokens

#

just pay

slim vigil
#

I'm just asking about the info

lethal yacht
slim vigil
#

thanks, so helpful

plush badge
#

atleast for additions you can check #new-models . I dont know for deletions.

slim vigil
#

do models get pulled from OR without notice then? Paid as well I guess.

hoary prairie
fickle mantle
#

at the least there are free alternatives to do all the tasks with tools

slim vigil
#

free version is back!! :DDD

#

oh but only 32K context lol. Pretty unusable

fresh onyx
#

I think glm4.5 is better this time in my setting

earnest dagger
torpid edge
#

But that's not for international users, no?

earnest dagger
#

Based on the replies, I'm guessing it's outside of China. Hopefully 🤞

winter shore
#

Quick note: Free Qwen3 Coder will be coming back today (with rate limits) thanks to Jon and Chutes. Say thanks in #1364073067713925161 !

gritty whale
#

Yayy!

rocky blaze
#

thanks jon and chutes!

gritty whale
#

Toven, can you look into this?

#

This version of the model is wayyy cheaper than what's already on OpenRouter, and it supposedly is the exact same version as the model on OpenRouter right now

winter shore
#

and it's not active / being used

#

idk how that works. but not gonna add it

#

look - 2 runs in 7 days, cold, and "Ronx123"

gritty whale
#

Mhm

winter shore
#

yeah idk what this is hahah

gritty whale
#

So only things made by Chutes are official/will be added on OpenRouter?

winter shore
#

the ones that are "Chutes" in the top there are the ones we'll add

#

I don't know what this other stuff is

gritty whale
#

👍

winter shore
#

It looks like other people can deploy models

#

but at our scale, we can't really add them

calm kite
winter shore
#

free models have no uptime guarantees, if there’s too much demand yes the uptime will suffer

#

it’s doing >10B free tokens per day

#

that’s ~$3,000+ dollars a day for free

calm kite
#

I see, but it was doing well before. like almost 100% uptime. that is why i was wondering if something was wrong with it.

earnest dagger
hollow delta
earnest dagger
# hollow delta is qwen3 coder that good or is it the free variant just making it really attract...

Good question. Idk how to see how many tokens are processed by each providor though :/

https://openrouter.ai/qwen/qwen3-coder/activity
anyone?

See recent activity and usage statistics for Qwen: Qwen3 Coder - Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is optimized for agentic coding tasks such as function calling, tool use, and long-context reasoning over repositories. The model features 480 billion total parameters...

earnest dagger
#

So the paid is still at higher usage, just given today.

dusty tiger
#

Have you seen this before? What can I do about it? I'm paying for the service, so I've chosen the paid option.

gritty whale
#

use a different provider

dusty tiger
gritty whale
dusty tiger
#

ok, thanks

hollow delta
#

you could also provide a list of providers as fallback if you're using the api

#

afaik

dusty tiger
#

I have strictly regulated which providers I use, as I've had disappointing experiences with some providers—for example, insufficient quantization or excessively high costs. That's why I only maintain a positive list of providers and have now expanded it with an additional provider. I hope this will now work.

astral ingot
#

i think someone already mentioned this but i got an empty response from gmicloud

hard bramble
pliant quiver
#

@dusty tiger I'm interested as well, which one do you use and do you use it with tools?

dusty tiger
#

@hard bramble @pliant quiver

hard bramble
# dusty tiger

Nice, thanks! You have a good experience with Chutes? I've found it to be hit-or-miss.

dusty tiger
thick jacinth
#

when will its api come out?

hoary kayak
#

When will it appear on openrouter?

gritty whale
hoary kayak
thick jacinth
gritty whale
thick jacinth
#

thanks I mixed it up

half agate
#

Hi!
Getting some strange results in Chat using specific providers:
Deepinfra - No endpoints found for qwen/qwen3-coder.
Together - No endpoints found for qwen/qwen3-coder.
Baseten - error
Parasail - working
Fireworks - working
NovitaAI - No endpoints found for qwen/qwen3-coder
AtlasCloud - working
Phala - working
etc

Am I doing something wrong or some providers changed endpoint names?

modest stream
dusty tiger
broken crane
#

is alibaba/opensource the same as old alibaba/plus?

broken crane
#

With over 200 million tokens over the last few days put into Qwen 3 Coder provider testing. I finally have results I can share.

1. I think we need to be able to multi-select providers in any ai coding tool.

2. API reliability was a lot bigger of a problem with Qwen 3 Coder than

ivory tangle
#

Are there no providers with prompt caching?

winter shore
broken crane
winter shore
#

it will get it's own model page soon

#

it's the proprietary closed source weights of the qwen3 coder weights

broken crane
#

Got it

drifting aurora
#

@dusty tiger idk if you are still using it, but I really like using Cerebras for qwen 3

dusty tiger
drifting aurora
#

But for you use, it's prolly worth to wait for their code pro/max plans to restock, they work with cline

dusty tiger
#

yeah, I will wait

glacial dragon
#

Алло нахуй

dusty tiger
#

English please

glacial dragon
#

Where can I use the LLaMA2 uncensored model?

drifting aurora
#

Not on here

glacial dragon
#

Нихуя не понял

drifting aurora
#

Stop fucking speaking russian

glacial dragon
#

🇷🇺

drifting aurora
#

fucking retard

glacial dragon
#

💩

gritty whale
valid bluff
ivory tangle
#

Anyone know a provider that supports prompt caching?

thick jacinth
#

you can try deepseek v3.1 from deepseek directly

ivory tangle
#

Guess it’s the only one not prompt caching

thick jacinth
ivory tangle
winter shore
#

we'll add coder-plus

#

which is the cache enabled model

#

it's not the same model as the other qwen3 coder though

winter shore
ivory tangle
#

And yeah I know they are dif models, I think even tho it’s more expensive on a per M token bases, i think it will be cheaper bc of caching, and I will cap the context at 128k or something, once you guys have it added I will use it here, much prefer having everything via one balance

glad reef
#

/ How would I access it?

ivory tangle
#

I wonder if its taking a while because they have price tiers instead of a simple input and output price

unique musk
#

hi

hoary raven
#

By the way guys, if anything happened with Qwen again, try asking in the Alibaba Cloud Developer Community Discord. Lara’s great at bringing the Qwen team in for quick answers.

gleaming fulcrum
#

@winter shore just wondering about Alibaba provider not being in the "exacto" endpoints? Surely the Alibaba API is a reference quality implementation?

#

fwiw I get better and more consistent results from qwen/qwen3-coder-plus (which only has Alibaba as a provider for some reason), than other qwen3 coder endpoints.

gleaming fulcrum
#

in general though exacto is a massive step in the right direction

viscid flint
#

@gleaming fulcrum the plus model is a different proprietary model and likely uses different weights from the open source ones. That’s probably why it’s better.

winter shore
#

in our real world data and in benchmarks

#

the decisions for exacto are based on data, not on wheter it would make sense to have it

#

the other models have the model author endpoints in exacto because they're good endpoints

gleaming fulcrum
#

fair enough. I didn't realize that qwen3-coder-plus and open source endpoints were different.

rich field
gritty whale
#

Why is Cerebras taking these models?

#

Are they strapped for GPUs or something?

rich field
#

im not sure. they could keep both but for a reason they need to remove it

#

really weird

#

it was nice with the speed

gritty whale
#

Mmm

neat jasper