#Efficient DoubleStreamBlock kernel

minor palm
#

DoubleStreamBlock is a Transformer block used in modern diffusion models, in which multiple sequences of tokens (for example, image and text tokens) are processed side by side. The only step the sequences share is a self-attention over their concatenation; every other step is independent per sequence, so those steps could potentially run concurrently rather than one after another (given enough compute).
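
To make the structure concrete, here is a minimal NumPy sketch of the idea, not any model's actual implementation. It assumes single-head attention, no normalization or modulation, a ReLU MLP, and text-then-image concatenation order. The names `make_stream_params` and `double_stream_block` are made up for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def make_stream_params(rng, dim):
    # Hypothetical per-stream weights: each stream gets its own copies.
    s = 1.0 / np.sqrt(dim)
    return {
        "qkv": rng.standard_normal((dim, 3 * dim)) * s,
        "proj": rng.standard_normal((dim, dim)) * s,
        "mlp_in": rng.standard_normal((dim, 4 * dim)) * s,
        "mlp_out": rng.standard_normal((4 * dim, dim)) / np.sqrt(4 * dim),
    }

def double_stream_block(img, txt, p_img, p_txt):
    # 1) Per-stream QKV projections. These two lines do not depend on
    #    each other, so they are candidates for concurrent execution.
    q_i, k_i, v_i = np.split(img @ p_img["qkv"], 3, axis=-1)
    q_t, k_t, v_t = np.split(txt @ p_txt["qkv"], 3, axis=-1)

    # 2) The shared step: one attention over the concatenated sequences
    #    (assumed order: text tokens first, then image tokens).
    q = np.concatenate([q_t, q_i], axis=0)
    k = np.concatenate([k_t, k_i], axis=0)
    v = np.concatenate([v_t, v_i], axis=0)
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1])) @ v

    # 3) Split the attention output back into the two streams.
    n_txt = txt.shape[0]
    a_t, a_i = attn[:n_txt], attn[n_txt:]

    # 4) Per-stream output projection and MLP with residuals.
    #    Again independent between streams.
    def tail(x, a, p):
        x = x + a @ p["proj"]
        h = np.maximum(x @ p["mlp_in"], 0.0)  # ReLU as a stand-in activation
        return x + h @ p["mlp_out"]

    return tail(img, a_i, p_img), tail(txt, a_t, p_txt)

rng = np.random.default_rng(0)
dim = 8
img = rng.standard_normal((6, dim))  # 6 image tokens
txt = rng.standard_normal((3, dim))  # 3 text tokens
p_img, p_txt = make_stream_params(rng, dim), make_stream_params(rng, dim)
img_out, txt_out = double_stream_block(img, txt, p_img, p_txt)
print(img_out.shape, txt_out.shape)
```

Steps 1 and 4 are where the per-stream work lives; step 2 is the only point where the streams have to synchronize. An efficient kernel would presumably overlap or batch the independent parts around that one sync point.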

vast robin
#

Hey @minor palm, do you recommend any resources on this that I could look into?

vast robin
mint hollow
#

is there already a team on this? would love to join efforts