#gpt-oss RL takes 41H in DGX rather than 4H as given in docs


solemn rapids Feb 4, 2026, 9:52 AM

https://unsloth.ai/docs/basics/fine-tuning-llms-with-nvidia-dgx-spark-and-unsloth

I'm running this same notebook on my DGX, but it estimates 42 hours to finish the training, while the docs say it should finish in 4 hours.

Can anyone help me tune the parameters to speed up the training? Also, only 28GB of VRAM is being used while I have 128GB available.

fiery hornet Mar 9, 2026, 8:33 PM

Hey @solemn rapids,
Here’s what I would change first, in order:

from trl import GRPOConfig

# max_prompt_length is assumed to be defined earlier in the notebook.
training_args = GRPOConfig(
    temperature = 1.0,
    learning_rate = 5e-5,
    weight_decay = 0.001,
    warmup_ratio = 0.1,
    lr_scheduler_type = "linear",
    optim = "adamw_8bit",
    logging_steps = 10,                   # less logging overhead
    per_device_train_batch_size = 2,      # matches num_generations below
    gradient_accumulation_steps = 1,
    num_generations = 2,                  # GRPO needs at least 2 completions per prompt; keep it at the minimum for speed
    max_prompt_length = max_prompt_length,
    max_completion_length = 256,          # or 384; caps generation length
    max_steps = 200,                      # test speed first before the full 1000
    save_steps = 100,
    report_to = "none",
    output_dir = "outputs",
)
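To judge whether a shorter test run is actually faster, it can help to turn the numbers in this thread into a per-step budget. The snippet below is only a back-of-the-envelope sketch: the helper names are made up for illustration, and it assumes the 42 hour estimate and the 4 hour figure from the docs both refer to a 1000-step run, which the thread does not confirm.

```python
def seconds_per_step(total_hours: float, total_steps: int) -> float:
    """Average seconds per training step for a run of the given length."""
    return total_hours * 3600.0 / total_steps


def estimate_hours(step_seconds: float, max_steps: int) -> float:
    """Projected wall-clock hours, assuming the step time stays roughly constant."""
    return step_seconds * max_steps / 3600.0


# Assumption: both figures refer to 1000 training steps.
reported = seconds_per_step(42, 1000)   # pace implied by the 42 hour estimate
target = seconds_per_step(4, 1000)      # pace implied by the 4 hours in the docs

print(f"reported pace: {reported:.1f} s/step")
print(f"target pace:   {target:.1f} s/step")
print(f"200 steps at the reported pace: {estimate_hours(reported, 200):.1f} h")
```

If the step time shown in the progress bar after the first few steps of the 200-step test is still far above the target pace, the remaining gap is in per-step cost (mostly generation), not in the step count.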