#grad_norm: 0 and LR: 0
I am trying to fine-tune Qwen3-4b-instruct using a novel GRPO-like RL approach, but after the 1st step I am getting the following error. It would be great if anyone could help me!
can't say anything about the rest, but LR=0 on step 1 is normal since you're using warmup
I believe the cause of the CUDA error is LR=0 and grad_norm: NaN
LR=0 is fine
it'll just zero out the update when the gradients are applied
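To make the warmup point concrete, here is a minimal sketch in plain Python. It assumes a linear warmup schedule and a plain SGD update; `linear_warmup_lr`, `sgd_update`, and the numbers are hypothetical and are not taken from the poster's config or from any library.

```python
def linear_warmup_lr(step: int, base_lr: float, warmup_steps: int) -> float:
    """Learning rate at `step` under linear warmup (hypothetical helper)."""
    if warmup_steps <= 0:
        return base_lr
    # Ramp linearly from 0 at step 0 up to base_lr at warmup_steps.
    return base_lr * min(step, warmup_steps) / warmup_steps


def sgd_update(param: float, grad: float, lr: float) -> float:
    """Plain SGD step: param <- param - lr * grad."""
    return param - lr * grad


base_lr = 5e-6      # assumed value, for illustration only
warmup_steps = 10   # assumed value, for illustration only

lrs = [linear_warmup_lr(s, base_lr, warmup_steps) for s in range(3)]
print(lrs)  # first entry is 0.0, later entries grow toward base_lr

# With LR=0 and a finite gradient, the parameter does not move.
unchanged = sgd_update(1.0, 0.5, lrs[0])
print(unchanged)

# With LR=0 and a NaN gradient, the result is still NaN, because 0 * NaN is NaN.
nan_update = sgd_update(1.0, float("nan"), lrs[0])
print(nan_update)
```

So a logged LR of 0 on the first step is what a warmup schedule like this produces, and on its own it is harmless. The last case shows why the NaN is the part worth chasing: in this simplified update, a zero learning rate does not mask a NaN gradient.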
grad norm NaN isn't normal
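One way to narrow down a NaN grad norm is to check which parameters carry non-finite gradients. The sketch below uses plain Python lists in place of real tensors so it runs anywhere; `find_nonfinite`, `global_grad_norm`, and the parameter names are hypothetical. In an actual PyTorch run the same idea would be applied to each parameter's `.grad` while iterating over `model.named_parameters()`.

```python
import math
from typing import Dict, List


def find_nonfinite(grads: Dict[str, List[float]]) -> List[str]:
    """Return names of parameters whose gradient contains NaN or inf."""
    return [
        name
        for name, values in grads.items()
        if not all(math.isfinite(v) for v in values)
    ]


def global_grad_norm(grads: Dict[str, List[float]]) -> float:
    """L2 norm over all gradient entries; a single NaN makes the result NaN."""
    return math.sqrt(sum(v * v for values in grads.values() for v in values))


# Toy gradients standing in for real tensors (made-up values).
grads = {
    "layer0.weight": [0.1, -0.2],
    "layer1.weight": [0.3, float("nan")],
}

bad = find_nonfinite(grads)
norm = global_grad_norm(grads)
print(bad)
print(math.isnan(norm))
```

A single bad entry is enough to turn the whole norm into NaN, so the list of offending parameter names is usually more informative than the norm itself.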
also
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
it says right there in the error message how to get a more sensible stack trace
yep I did it, nothing changed still the same log lol
same stack trace?
yea
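If the stack trace does not change, one thing worth ruling out is that the variable was set too late. As far as I know, `CUDA_LAUNCH_BLOCKING` is read when CUDA is initialized, so setting it after the first GPU call has no effect. A minimal sketch, assuming the variable is set from inside the training script or notebook rather than from the shell:

```python
import os

# Set this at the very top of the script or notebook, before importing
# torch or any library that may initialize CUDA.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# Only after this point should the training imports and code run.
print(os.environ["CUDA_LAUNCH_BLOCKING"])
```

In a notebook this usually means restarting the kernel and running this cell first. From a shell, the equivalent is prefixing the launch command, e.g. `CUDA_LAUNCH_BLOCKING=1 python train.py`, where `train.py` is a placeholder for the actual script.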
Before any further steps, did you try out the official notebooks by Unsloth and also read the docs?
Start by running the Unsloth notebook without any changes. And if that still throws an error, then there's an environment issue. If not, then you add your changes one by one and see where the crash happens.