https://github.com/AUTOMATIC1111/stable-diffusion-webui-tensorrt
Read the readme carefully before attempting.
You might need to install the CUDA Toolkit, create an NVIDIA account to download TensorRT, and install the Microsoft C++ Build Tools.
What this is
This extension gives you a fairly straightforward way to convert the unet of your usual ckpt/safetensors models into TensorRT (TRT) format.
Once you properly compile the model, which takes anywhere from 15 minutes to an hour or so, you may get massive speed boosts, with several caveats.
Caveats
Firstly, hires fix won't work out of the box.
The TRT unet has a specific max shape (which factors in max token count, resolution and batch size), and the biggest you can get out of it is 512x1024, 1024x512 or 768x768. These are still usable in img2img, however: the first two sizes work with the ultimate SD upscaler script at tile size 512, and 768x768 works with multidiffusion (tile width/height 96/96).
Secondly, you need to compile the model (once), which takes some time, and each compile is tied to specific shape settings (resolution, batch size, token count). I could see potentially having 3 TRT models per main model.
Thirdly, the TRT unet doesn't support LoRA or ControlNet.
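The 96/96 tile size quoted for 768x768 can be checked with a little arithmetic. This is only a sketch, assuming multidiffusion tile width/height are given in latent units and using the fact that Stable Diffusion's latent space is 8x smaller than the image on each side. The function name `latent_tile` is made up for illustration and is not part of any extension.

```python
# Sketch only: latent_tile is a hypothetical helper, not an extension API.
LATENT_SCALE = 8  # Stable Diffusion's VAE downsamples each side by a factor of 8


def latent_tile(width_px, height_px):
    """Convert a pixel resolution to the matching latent width/height."""
    if width_px % LATENT_SCALE or height_px % LATENT_SCALE:
        raise ValueError("width and height must be multiples of 8")
    return width_px // LATENT_SCALE, height_px // LATENT_SCALE


if __name__ == "__main__":
    # The three maximum shapes mentioned in the caveats.
    for w, h in [(512, 1024), (1024, 512), (768, 768)]:
        print(f"{w}x{h} px -> latent {latent_tile(w, h)}")
```

So a 768x768 TRT unet lines up with a 96/96 latent tile, which is why that pairing works for multidiffusion.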
Who is this for then
If you generate a lot of smaller images and then choose one to upscale, this will be a speed boost.
If you use multidiffusion, this should also be a speed boost.
You can easily activate or deactivate the TRT unet, especially if you put sd_unet in the quicksettings. Swapping it on or off from there takes a few seconds for me, and I'm definitely glad to have it as an option.
My settings
- Two different settings that I've tried so far (I will do a landscape preset soon):
- Setting for img2img with multidiff: 512x768 width, 512x768 height, batch size 1, token count 75 (a small prompt is fine for multidiff).
- Setting for 512x768: 512x576 width, 512x768 height, batch size 2, token count 300.
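To decide which compiled unet to switch to for a given job, the two settings above can be written down as data and checked. This is a rough sketch that assumes "512x768 width" means a min-to-max range of 512 to 768 (and likewise for height), that batch size and token count are maximums, and that the minimum batch size is 1. `TrtProfile` and `fits` are hypothetical names for illustration and have nothing to do with the extension's own code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrtProfile:
    """Hypothetical record of one compiled TRT unet's limits."""
    min_width: int
    max_width: int
    min_height: int
    max_height: int
    max_batch: int
    max_tokens: int

    def fits(self, width, height, batch=1, tokens=75):
        """Return True if the request stays inside this profile's ranges."""
        return (
            self.min_width <= width <= self.max_width
            and self.min_height <= height <= self.max_height
            and 1 <= batch <= self.max_batch  # assumes min batch size of 1
            and 0 < tokens <= self.max_tokens
        )


# Assumed reading of the two settings listed above (min x max ranges).
MULTIDIFF = TrtProfile(512, 768, 512, 768, max_batch=1, max_tokens=75)
PORTRAIT = TrtProfile(512, 576, 512, 768, max_batch=2, max_tokens=300)

if __name__ == "__main__":
    print(MULTIDIFF.fits(768, 768))                      # square multidiff tile
    print(PORTRAIT.fits(512, 768, batch=2, tokens=150))  # portrait txt2img
    print(PORTRAIT.fits(768, 512))                       # landscape request
```

Under that reading, a landscape request falls outside both ranges, which is why a separate landscape preset would be needed.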





