Hi everyone, I’m looking for some help.
I’m generating videos with InfiniteTalk on an RTX 5090 / RTX 4090, running the ComfyUI SageAttention (CUDA 12.8) template.
On 5090 / 4090 everything works perfectly.
I sometimes use a Network Volume, and it works reliably there as well.
However, when I run the exact same setup on an H200 SXM, I hit a serious issue:
• the output video contains only ~1 second of valid frames (or even just 1 frame),
• and the rest of the video is a black screen.
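To make the symptom easier to report, here is a minimal sketch of how the exact frame where the output turns black could be located. It is only an illustration: `first_black_frame`, `mean_brightness`, and the `threshold` value are hypothetical names and choices, and it assumes the frames have already been decoded (with whatever library you prefer) into flat sequences of 0-255 pixel values.

```python
from typing import Iterable, Optional, Sequence


def mean_brightness(frame: Sequence[float]) -> float:
    """Average pixel value of one frame (flat sequence of 0-255 values)."""
    values = list(frame)
    return sum(values) / len(values) if values else 0.0


def first_black_frame(
    frames: Iterable[Sequence[float]], threshold: float = 2.0
) -> Optional[int]:
    """Return the index where the trailing run of black frames starts.

    A frame counts as black when its mean brightness is <= threshold
    (an assumed cutoff, tune it for your footage). Returns None if the
    video does not end in black frames.
    """
    start = None
    for index, frame in enumerate(frames):
        if mean_brightness(frame) <= threshold:
            if start is None:
                start = index  # a black run begins here
        else:
            start = None  # a visible frame resets the run
    return start


if __name__ == "__main__":
    # Toy data: one visible frame followed by two black frames.
    demo = [[120, 130, 125], [0, 0, 0], [0, 1, 0]]
    print(first_black_frame(demo))
```

Dividing the returned index by the frame rate gives the timestamp where the output breaks, which is useful to compare between runs with and without SageAttention.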
What I’ve already verified:
• I tried both the SageAttention template and a setup without it
• I clone the same GitHub repositories
• I use the exact same ComfyUI workflow
• Same models, same settings
• Same input image and the same MP3 audio file
Everything is literally 1:1 identical, except for the GPU.
On H200 there is no crash: generation finishes without errors, but the output video is broken.
Could this be related to:
• H200 / Hopper-specific behavior
• SageAttention or attention backend compatibility
• FP8 / precision differences
• CUDA 12.8 issues on Hopper GPUs?
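Since the GPU is the only known difference, it may help to dump the software and hardware environment on both machines and diff the output. The sketch below is one possible way to do that, assuming PyTorch is the framework in use (ComfyUI runs on it); `environment_report` is a hypothetical helper name. For reference, the RTX 4090 reports compute capability 8.9 and the H200 reports 9.0, so kernels compiled or selected per architecture can differ between the two.

```python
import platform


def environment_report() -> dict:
    """Collect version and GPU details that are worth comparing across machines."""
    report = {
        "python": platform.python_version(),
        "torch": None,
        "cuda_runtime": None,
        "gpu": None,
        "compute_capability": None,
    }
    try:
        import torch  # optional: the report degrades gracefully without it
    except ImportError:
        return report

    report["torch"] = torch.__version__
    report["cuda_runtime"] = torch.version.cuda  # CUDA version PyTorch was built with
    if torch.cuda.is_available():
        report["gpu"] = torch.cuda.get_device_name(0)
        major, minor = torch.cuda.get_device_capability(0)
        report["compute_capability"] = f"{major}.{minor}"
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Running this inside the same Python environment that ComfyUI uses, once on the 4090/5090 pod and once on the H200 pod, would show whether the PyTorch build or CUDA runtime also differs, not just the GPU.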
I’d really appreciate any ideas or similar experiences.
Thanks in advance 🙏