Not able to run unsloth gemma 3 bnb on with dynamic lora on vllm? | Unsloth AI | Page 1

dire hawk Sep 30, 2025, 8:29 AM

#

It gives error.

polar pastureBOT Sep 30, 2025, 8:29 AM

#

Dynamic LoRA with vLLM is not compatible with Unsloth Gemma 3 BnB-4bit models. Attempting to use dynamic LoRA (e.g., hot-swapping adapters) on these models in vLLM results in errors, and this is a known limitation. You can switch LoRA adapters on a regular (non-quantized) base model, but not with BnB-4bit quantized models. For dynamic LoRA, use non-quantized models or check for future updates from Unsloth and vLLM teams regarding support for this feature. See discussion and workarounds in the linked issues for more details.

Would you like a step-by-step explanation or code examples for alternative approaches?

Sources:

Unsloth Issue #08cd6140

dire hawk Sep 30, 2025, 8:44 AM

#

so you are saying if i use regular unslot gemma model it will work

#

but that model will be twice the size and do i use on the fly quantizationn for it to work. Qwen works fine but this is giving issues.

main river Oct 8, 2025, 4:11 AM

#

It seems Gemma 3 BnB-4bit models are not compatible with hot-swapping adapters in vLLM

#

you could make an issue for vllm on their github, or see if they have any support in the discord

dire hawk Oct 8, 2025, 4:28 AM

#

ok thanks! If you don't mind I think if you also start pushing for VLLm steps in your notebooks it will be great or your own inference engine on Docker compose if you can make it comparable with hot swappable, I will be the first to use. Qwen works phenomenal your bnb model. Right now yes we can do in python and all but you will get easy deployment for use cases for scale. I am sure there is a lot going on, so just whenever you have space.

#

So I am thinking for Gemma if I use regular model vllm has auto quantize will it work?

main river Oct 8, 2025, 4:34 AM

#

dire hawk So I am thinking for Gemma if I use regular model vllm has auto quantize will it...

it's worth trying

main river Oct 8, 2025, 4:35 AM

#

dire hawk ok thanks! If you don't mind I think if you also start pushing for VLLm steps in...

I agree that would be great if we can get that working, it is tricky because we would have to maintain it, thank you for the feedback

#Not able to run unsloth gemma 3 bnb on with dynamic lora on vllm?