Batch inference broken with Gemma2 | Unsloth AI | Page 1

low tendonBOT Jul 15, 2025, 4:24 PM

#

The issue at https://github.com/unslothai/unsloth/issues/2939 describes a bug with batch inference for Gemma-2 models in Unsloth: when using a batch size greater than 1, padding causes the model to generate empty or incorrect outputs, while single-sample inference works as expected. The problem appears to be related to how padding tokens and attention masks are handled in the Gemma2 implementation. As a workaround, batching prompts of the same length (thus avoiding padding) yields correct results. The Unsloth team has acknowledged the issue, and further investigation is ongoing, but there is no official fix yet.
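The same-length workaround mentioned above can be sketched roughly as follows: group prompts by token count so that no batch ever needs padding. This is only an illustration. `bucket_by_token_length` is a hypothetical helper, not part of Unsloth, and the whitespace tokenizer in the demo is a stand-in; in practice one would presumably pass something like `lambda p: tokenizer(p)["input_ids"]` using the model's own tokenizer.

```python
from collections import defaultdict
from typing import Callable, List, Sequence, Tuple


def bucket_by_token_length(
    prompts: Sequence[str],
    tokenize: Callable[[str], Sequence],
    batch_size: int,
) -> List[List[Tuple[int, str]]]:
    """Group prompts into batches whose members all have the same token count.

    Hypothetical helper (not an Unsloth API). Each batch is a list of
    (original_index, prompt) pairs so outputs can be put back in input order.
    """
    # Map token count -> prompts with exactly that many tokens.
    buckets = defaultdict(list)
    for idx, prompt in enumerate(prompts):
        buckets[len(tokenize(prompt))].append((idx, prompt))

    # Split every bucket into chunks of at most batch_size prompts.
    batches = []
    for length in sorted(buckets):
        items = buckets[length]
        for start in range(0, len(items), batch_size):
            batches.append(items[start:start + batch_size])
    return batches


if __name__ == "__main__":
    # Assumption: str.split stands in for the real tokenizer.
    demo = ["a b", "c d", "e", "f g", "h"]
    for batch in bucket_by_token_length(demo, str.split, batch_size=2):
        print(batch)
```

Each resulting batch can be tokenized and generated without padding. The trade-off is that prompts with unusual lengths end up in small batches, so throughput may be lower than with padded batching.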

For more details and code samples, see the full discussion at Unsloth GitHub Issue #2939. Would you like a step-by-step breakdown or guidance on possible workarounds?

Sources:

Unsloth GitHub Issue #2939

acoustic kelp Jul 15, 2025, 4:24 PM

#

See this issue:

https://github.com/unslothai/unsloth/issues/2939

GitHub

[Bug] Batch inference broken? Gemma-2 · Issue #2939 · unslothai/u...

Hi all! I cannot get batched generation right with the Gemma2-9b-it model. With a batch size of 1, everything works correctly. However, once I go beyond that, the padding seems to ruin things. Here...

low tendonBOT Jul 15, 2025, 4:24 PM

#

Sure thing! I'm looking through the available data to answer your question. Hold on a second... :hourglass_flowing_sand:

acoustic kelp Jul 15, 2025, 9:45 PM

#

How can I reach someone regarding this?
Gemma2 might seem a little old, but I am trying to finetune TxGemma, which is a finetune of Gemma2, so this is very relevant to me.

spice rivet Jul 16, 2025, 5:48 AM

#

Will check!

acoustic kelp Jul 23, 2025, 8:11 AM

#

spice rivet Will check!

Is there any way this can be fixed?
We are currently evaluating Unsloth for enterprise purposes, and it's really discouraging that something as basic as batch inference doesn't seem to work. :/

spice rivet Jul 23, 2025, 12:14 PM

#

So sorry for the delay - been inundated with model releases

#

I'll take a look at it today and tomorrow

acoustic kelp Jul 23, 2025, 12:19 PM

#

Thanks a lot for all your work, Daniel!

spice rivet Jul 25, 2025, 11:54 AM

#

as an update, the main branch of unsloth has a temporary fix

#

so if you do

#

pip install --upgrade --force-reinstall --no-deps git+https://github.com/unslothai/unsloth.git

#

it should work

#

for now

#

if it doesn't work, I'll see what else I can do
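
After running the reinstall command above, it can help to confirm that the environment really picked up the build from the GitHub main branch. The sketch below is one possible check, assuming pip recorded a `direct_url.json` file for the install (as it normally does for `git+https://` installs per PEP 610). `describe_install` is a hypothetical helper name, not something provided by Unsloth.

```python
import json
from importlib import metadata
from typing import Optional


def describe_install(package: str) -> Optional[dict]:
    """Return version, source URL and git commit for an installed package.

    Hypothetical helper. Returns None if the package is not installed.
    "url" and "commit" stay None when pip recorded no direct-URL metadata
    (for example, for a regular install from PyPI).
    """
    try:
        dist = metadata.distribution(package)
    except metadata.PackageNotFoundError:
        return None

    info = {"version": dist.version, "url": None, "commit": None}
    # direct_url.json is written by pip for URL/VCS installs (PEP 610).
    raw = dist.read_text("direct_url.json")
    if raw:
        direct = json.loads(raw)
        info["url"] = direct.get("url")
        info["commit"] = direct.get("vcs_info", {}).get("commit_id")
    return info


if __name__ == "__main__":
    print(describe_install("unsloth"))
```

If `commit` comes back as `None`, the package was probably installed from PyPI rather than from the repository, and the temporary fix may not be present.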
