I'd love to tinker with manually editing the initial noise image. Something like 'image to image' at full strength, but without adding noise to the source image (I assume i2i works by adding a defined amount of noise to the image and diffusing from there). In that case it would also be useful to allow zero-step generation, which would just output the initial noise. Just a thought...
#Diffuse from custom image?
the noise is in the latent space of the VAE, not in visual pixels. your best bet for noise shaping is therefore going to be intercepting the latent tensor before it enters the diffusion process - but don’t expect to be able to run e.g. photoshop or colour grading filters on it
@dim hearth has done some experiments in this area and can probably advise
My advice: it doesn't work.
The model was trained on pure Gaussian noise in 4 channels (the latent is a [1, 4, height/8, width/8] tensor) and really expects that distribution and all of its characteristics.
If you try to shift it, your image tends to be colored strangely and ultimately loses all detail.
I haven't experimented with shifting one channel because it's unclear what that really means. The decoding and encoding process from an image to latents isn't a simple calculation.
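To make the shape and the "expects that distribution" point concrete, here is a minimal stdlib-only sketch. It assumes a 512x512 image and uses plain Python lists in place of a real tensor; `latent_shape`, `sample_latent_noise` and `describe` are hypothetical helper names, not part of any library. It only shows that a constant shift moves the mean away from the zero-mean, unit-variance noise the model was trained on. It does not run any diffusion.

```python
import random
import statistics


def latent_shape(height, width, batch=1, channels=4, downscale=8):
    """Latent tensor shape for an image size: [batch, 4, height/8, width/8]."""
    return (batch, channels, height // downscale, width // downscale)


def sample_latent_noise(shape, seed=0):
    """Flat list of N(0, 1) samples, one per latent element (stand-in for a tensor)."""
    rng = random.Random(seed)
    count = 1
    for dim in shape:
        count *= dim
    return [rng.gauss(0.0, 1.0) for _ in range(count)]


def describe(values):
    """Return (mean, standard deviation) of the values."""
    return statistics.fmean(values), statistics.pstdev(values)


shape = latent_shape(512, 512)       # assumed image size
noise = sample_latent_noise(shape)   # what the model expects: mean ~0, std ~1
shifted = [v + 0.5 for v in noise]   # a naive "brighten" attempt on the latent

print("shape:", shape)
print("noise   mean/std:", describe(noise))
print("shifted mean/std:", describe(shifted))
```

The shifted latent keeps its spread but is no longer centred on zero, which is one simple way a hand-edited latent can fall outside what the model saw in training.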
i2i uses the scheduler associated with your sampler to determine how much noise to add, based on the number of steps you want. The scheduler, which is part of diffusers, then adds that noise to the latents encoded from your initial image:
latents = self.scheduler.add_noise(latents, noise, batched_t)
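As a rough illustration of what that call does for DDPM-style schedulers, the sketch below mixes the image latents with noise as sqrt(a_bar_t) * latents + sqrt(1 - a_bar_t) * noise. It is a stdlib-only approximation, not the diffusers source: the "scaled linear" beta schedule values (0.00085 to 0.012 over 1000 steps) are an assumption about a typical Stable Diffusion config, `start_timestep` is a hypothetical helper for mapping i2i strength to a timestep, and sigma-based samplers may use a different formula.

```python
import math
import random


def make_alphas_cumprod(num_train_steps=1000, beta_start=0.00085, beta_end=0.012):
    """Cumulative product of (1 - beta_t) for an assumed 'scaled linear' schedule."""
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    alphas_cumprod, running = [], 1.0
    for i in range(num_train_steps):
        beta = (lo + (hi - lo) * i / (num_train_steps - 1)) ** 2
        running *= 1.0 - beta
        alphas_cumprod.append(running)
    return alphas_cumprod


def add_noise(latents, noise, t, alphas_cumprod):
    """DDPM-style forward noising of a flat list of latent values at timestep t."""
    a_bar = alphas_cumprod[t]
    keep, mix = math.sqrt(a_bar), math.sqrt(1.0 - a_bar)
    return [keep * x + mix * n for x, n in zip(latents, noise)]


def start_timestep(strength, num_train_steps=1000):
    """Hypothetical helper: map an i2i strength in (0, 1] to a starting timestep."""
    return max(min(int(strength * num_train_steps), num_train_steps) - 1, 0)


alphas_cumprod = make_alphas_cumprod()
rng = random.Random(0)
latents = [1.0] * 8                          # stand-in for encoded image latents
noise = [rng.gauss(0.0, 1.0) for _ in latents]

for strength in (0.25, 0.5, 1.0):
    t = start_timestep(strength)
    noised = add_noise(latents, noise, t, alphas_cumprod)
    print(strength, t, round(math.sqrt(alphas_cumprod[t]), 4), noised[:2])
```

Under these assumed schedule values, the weight on the source latents at full strength is small (roughly 0.07) but not exactly zero, so lower strengths are where the source composition has a real influence.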
Thank you. I see it's not simply a dead-end idea; there is no road there whatsoever. I was thinking of darkening/lightening areas to impose the final composition, which can be done to some extent in i2i. I was wondering if one could take it further.