#grad_norm: 0 and LR: 0

storm moat
#

I am trying to fine-tune Qwen3-4b-instruct using a novel GRPO-like RL approach, but after the 1st step I am getting the following error. It would be great if anyone could help me!

fathom prairie
#

can't say anything about the rest, but LR=0 on step 1 is normal since you're using warmup
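
A minimal sketch of why warmup gives LR=0 on the first step, assuming a linear warmup that starts from zero. The function name `linear_warmup_lr` and the values of `base_lr` and `warmup_steps` are hypothetical, chosen only for illustration; they are not taken from the run in this thread.

```python
def linear_warmup_lr(step: int, base_lr: float, warmup_steps: int) -> float:
    """Learning rate at optimizer step `step` (0-indexed) under linear warmup from zero."""
    if warmup_steps <= 0:
        return base_lr  # no warmup configured
    # Ramp linearly from 0 to base_lr, then hold at base_lr.
    return base_lr * min(1.0, step / warmup_steps)


# Hypothetical settings: base LR 5e-6, 10 warmup steps.
for step in range(4):
    print(step, linear_warmup_lr(step, base_lr=5e-6, warmup_steps=10))
```

The very first step gets a multiplier of 0, so a logged LR of 0 there is expected under this kind of schedule.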

storm moat
#

I believe the cause of the CUDA error is LR=0 and grad_norm: NaN

fathom prairie
#

LR=0 is fine

#

it'll just zero out the gradients when applying them

#

grad norm NaN isn't normal
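
A toy sketch of the two points above, using plain Python floats and a bare SGD update rather than the real optimizer from the run (both helper functions are hypothetical). It shows that LR=0 leaves a parameter untouched when the gradient is finite, and that a single NaN gradient makes the whole norm NaN.

```python
import math


def grad_norm(grads):
    """L2 norm over a flat list of gradient values."""
    return math.sqrt(sum(g * g for g in grads))


def sgd_update(param, grad, lr):
    """One plain SGD step: param - lr * grad."""
    return param - lr * grad


healthy = [3.0, 4.0]
broken = [3.0, float("nan")]

print(grad_norm(healthy))              # finite norm
print(math.isnan(grad_norm(broken)))   # one NaN poisons the norm
print(sgd_update(1.0, 2.0, lr=0.0))    # finite grad, lr=0: parameter unchanged
print(math.isnan(sgd_update(1.0, float("nan"), lr=0.0)))  # 0 * NaN is still NaN
```

In this toy setting a zero learning rate does not hide a NaN gradient, since `0.0 * nan` is `nan` in floating-point arithmetic, which is one more reason to treat the NaN as the real problem.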

#

also

CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
#

it says right there in the error message how to get a more sensible stack trace
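
One way to apply the hint from the error message is to set the variable from Python at the very top of the training script, before anything initializes CUDA. This is a sketch only; the commented-out import stands in for whatever training stack is actually used.

```python
import os

# Set this before the first CUDA call; the top of the script,
# ahead of `import torch`, is the safest place.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# import torch  # (assumed) import the training stack only after the variable is set

print(os.environ["CUDA_LAUNCH_BLOCKING"])
```

With blocking launches, each kernel is reported synchronously, so the stack trace should point at the call that actually failed. If the trace does not change, the variable may have been set too late in the process.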

storm moat
#

yep I did it, nothing changed, still the same log lol

fathom prairie
#

same stack trace?

storm moat
#

yea

spice gust
# storm moat yea

Before any further steps, did you make sure to try out the official notebooks by Unsloth and also read the docs?

Start by running the Unsloth notebook without any changes. And if that still throws an error, then there's an environment issue. If not, then you add your changes one by one and see where the crash happens.

storm moat
#

the error has been resolved!

#

I just set both bf16 and fp16 to False, and that fixed it
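
For reference, a sketch of what that change might look like, assuming a Hugging Face `TrainingArguments`-style config, which exposes `bf16` and `fp16` boolean fields. The dict name `precision_kwargs` is hypothetical, and the thread does not show the actual config class or its other arguments.

```python
# Only the two flags below come from the thread; everything else is illustrative.
precision_kwargs = {
    "bf16": False,  # turn off bfloat16 mixed precision
    "fp16": False,  # turn off float16 mixed precision
}

# Assumed usage, not run here: pass these alongside the other arguments, e.g.
# config = SomeTrainerConfig(**precision_kwargs, ...)

print(precision_kwargs)
```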