Efficient DoubleStreamBlock kernel | GPU MODE | Page 1

minor palm Sep 17, 2024, 2:13 AM

DoubleStreamBlock is a Transformer block used in modern diffusion models, in which multiple sequences of tokens (for example, image and text tokens) are processed side by side. The only shared step is a self-attention over the concatenated sequences. Everything else is independent per sequence, so those parts could potentially be run concurrently for better efficiency (given enough compute).
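
To make the data flow concrete, here is a minimal NumPy sketch of the idea, not the actual FLUX code. It assumes each stream has its own weights, uses a single attention head, and leaves out normalization, modulation, and the MLPs for brevity. The names `init_stream`, `double_stream_block`, `w_qkv`, and `w_out` are hypothetical and chosen only for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def init_stream(rng, dim):
    # Hypothetical per-stream weights: one fused QKV projection, one output projection.
    return {
        "w_qkv": rng.standard_normal((dim, 3 * dim)) / np.sqrt(dim),
        "w_out": rng.standard_normal((dim, dim)) / np.sqrt(dim),
    }


def double_stream_block(img, txt, p_img, p_txt):
    # Per-stream step: the two projections share no data,
    # so they are candidates for running concurrently.
    q_i, k_i, v_i = np.split(img @ p_img["w_qkv"], 3, axis=-1)
    q_t, k_t, v_t = np.split(txt @ p_txt["w_qkv"], 3, axis=-1)

    # Shared step: one self-attention over the concatenated sequence
    # (text tokens first, then image tokens; the order is an assumption).
    q = np.concatenate([q_t, q_i], axis=0)
    k = np.concatenate([k_t, k_i], axis=0)
    v = np.concatenate([v_t, v_i], axis=0)
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1])) @ v

    # Split the attention output back into the two streams.
    n_txt = txt.shape[0]
    attn_t, attn_i = attn[:n_txt], attn[n_txt:]

    # Per-stream step again: independent output projections with residuals.
    img_out = img + attn_i @ p_img["w_out"]
    txt_out = txt + attn_t @ p_txt["w_out"]
    return img_out, txt_out


rng = np.random.default_rng(0)
dim = 8
img = rng.standard_normal((16, dim))  # 16 image tokens
txt = rng.standard_normal((4, dim))   # 4 text tokens
p_img, p_txt = init_stream(rng, dim), init_stream(rng, dim)
img_out, txt_out = double_stream_block(img, txt, p_img, p_txt)
print(img_out.shape, txt_out.shape)
```

In this sketch the two sequences have different lengths, so the per-stream matmuls cannot simply be stacked into one batched call. That mismatch is what a dedicated kernel (or separate GPU streams) would need to handle.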

vast robin Sep 18, 2024, 2:24 PM

Hey @minor palm, do you recommend any resources on this that I could look into?

vast robin Sep 19, 2024, 2:45 PM

nevermind

GitHub

flux/src/flux/modules/layers.py at 87f6fff727a377ea1c378af692afb41a...

Official inference repo for FLUX.1 models. Contribute to black-forest-labs/flux development by creating an account on GitHub.

mint hollow Sep 20, 2024, 8:38 AM

is there already a team on this? would love to join efforts
