# How to serve unsloth/gpt-oss-120b-unsloth-bnb-4bit on vLLM?


blazing light

How do I serve the unsloth/gpt-oss-120b-unsloth-bnb-4bit model with vLLM?
I have set the '--quantization bitsandbytes --load-format bitsandbytes' args, but the following error occurs:

ValueError: Following weights were not initialized from checkpoint: {'model.layers.19.mlp.router.bias', 'model.layers.24.mlp.router.weight', 'model.layers.18.mlp.router.bias', 'model.layers.15.mlp.router.weight', 'model.layers.30.mlp.router.weight', 'model.layers.23.mlp.router.weight', 'model.layers.17.mlp.experts.w13_weight', 'model.layers.0.mlp.router.weight', 'model.layers.15.mlp.experts.w2_weight', 'model.layers.10.mlp.experts.w2_weight', 'model.layers.31.mlp.experts.w13_weight', 'model.layers.0.mlp.router.bia
plain lake

Oh, use unsloth/gpt-oss-120b instead. It is in MXFP4, so it is also faster.
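
A rough sketch of what that switch could look like, assuming a vLLM build that supports gpt-oss and reads the quantization from the checkpoint's own config, so the bitsandbytes flags are simply dropped. The helper name `build_serve_command` and the port value are made up for illustration; `vllm serve` and `--port` are the only real CLI pieces used here.

```python
import shlex


def build_serve_command(model: str = "unsloth/gpt-oss-120b", port: int = 8000) -> list[str]:
    """Build the argv for serving the MXFP4 checkpoint with the vLLM CLI.

    Assumption: vLLM detects the quantization from the checkpoint config,
    so no --quantization / --load-format flags are passed.
    """
    return ["vllm", "serve", model, "--port", str(port)]


if __name__ == "__main__":
    # Print the command so it can be copied into a shell.
    print(shlex.join(build_serve_command()))
```

If the server still fails to load the weights, the vLLM version is the first thing worth checking, since gpt-oss support depends on it.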