#🧨 diffusers


rugged moth
#

its only a diffusers issue

#

i just checked with a normal ckpt ... works fine

#

thanks for lookin into it if you get the chance

clear hinge
#

yah i don't have a diffusers setup yet... or anything other than a pretty old environment x.x

#

I imagine getting installed in a way that won't conflict with my current environment is going to be painful

#

is the dev setup still create pew environment, pip install, run... some setup script?

#

scripts/configure_invokeai.py

dire gazelle
#

the happy path should work great

#

yeah configure_invokeai.py sets everything up

clear hinge
#

ah I'll try that if this doesn't work

#

(does that do a code setup for dev purposes?)

#

hrm all black images output on the 768 model =/

#

512 model seems to work though

#

@rugged moth do you have a good sample prompt/settings that reproduce it? On stable-diffusion-2.1-base I'm getting decent results (I assume this is using diffusers).

#

The seams are a little stronger than I remember them being, but they're tending to look decent

rugged moth
#

diffusers model? 512?

clear hinge
#

the "scale before processing" settings could maybe be something to look at? Doesn't seem related though

rugged moth
#

@clear hinge a fantasy landscape

#

on the 2.1 base model

clear hinge
#

Just got this, but not terrible:

#

sampler/steps/etc.?

rugged moth
#

20 euler_a

#

its pretty bad compared to what we had

#

its definitely creating an entirely unrelated image there

clear hinge
#

50 keuler for me:

rugged moth
#

and in a lot of cases where the detail is not clear, it leads to a messy blob of low jpeg artifacts

#

see there's no match

#

between the original image and the infilled area on the seam

#

it gets more evident in other examples

#

its clearly not respecting the original image at all

clear hinge
#

left is outpainting (the seam), right is an inpaint I did at the seam paint settings (0.7-ish strength)

#

seems like inpainting is just generally a lot stronger with this model?

#

Not as strong, but you can still see the seam if you go looking =/

#

Similar issue if I just inpaint though

#

I wonder if this model just doesn't weight the original RGB as heavily when inpainting

heavy glacier
#

its not just that model

#

this is on my 1.5 hipsterblend

clear hinge
#

That seems to be trying to introduce mountains in the clouds though

#

What if you turn seam strength down a lot?

#

It's super artifacty though 😣

#

What if you inpaint using similar settings?

heavy glacier
#

inpainting seems to work fine

#

will try seam strength

#

seam strength at .47

clear hinge
#

lol keuler won't work with 1.5 for me x.x

#

AttributeError: 'EulerDiscreteScheduler' object has no attribute 'make_schedule'

#

actually every sampler gives me that

heavy glacier
#

strength .12 is too low 😅

clear hinge
#

seams!

#

okay well 1.5 and 2.1-768 both don't work for me for different reasons =/

heavy glacier
clear hinge
#

wow uh

#

"a photograph of a landscape, sunset, clouds"

heavy glacier
#

SEAM. SEAM. I SEE A SEAM

clear hinge
#

uh.... try bumping up the steps on seam painting

heavy glacier
#

at what strength?

clear hinge
#

0.7

#

jeez these don't match at all:

#

So the quality of the outpaint seems similar to the quality of the seam paint for me

#

it's like sending it through inpainting bumps up the contrast and saturation:

heavy glacier
#

this is outpainting to the left

#

not so much having bad quality but def seeing the seam

#

(even at 20 steps)

west nebula
#

Can somebody tell me if this embedding works for them? I cannot get it working at all, just trained it with TI. Trigger is <lfb>.

clear hinge
#

left is outpaint, right is just pure inpainting

#

similar issues

west nebula
#

@rose sentinel I'm getting debug messages that the embedding is being used but it's not doing what it should...

heavy glacier
#

are you on 2.1?

west nebula
#

I trained it for 4K steps.

#

This is a 1.5 embedding.

#

Diffusers branch.

heavy glacier
#

question was for kyle, sorry 🙂

clear hinge
#

yah 2.1

#

(oh should I be on diffusers branch? I thought it was merged into main and I should be testing main)

west nebula
#

Sorry, main now. 🙂

heavy glacier
#

i.e., are you outpainting something that was generated in a normal model

#

b/c i've found 2.1 generating weird shit generally without a lot of fine-tuning the prompt

clear hinge
#

everything is 2.1

heavy glacier
#

hmm. ill do some testing on 2.1

clear hinge
#

1.5 throws errors for me, and 2.1-768 just generates black images with an exclamation overlay

rugged moth
#

diffusers uses a different make_noise function .. reckon the issue is there?

clear hinge
#

well, it definitely seems like inpainting is the issue

#

everything else in the flow looks correct to me

rugged moth
#

get_noise_like

west nebula
#

Would this make_noise be the reason things look different every diffusers run vs. non-diffusers being identical?

rugged moth
#

have a look at that function

clear hinge
#

how different is this from what was there before?

#

isn't this what inpaint_replace did in the past?

heavy glacier
#

i mean, doing inpaint replace is effectively doing strength = 1

#

2.1 still has some wonky seams but definitely not seeing the vast quality drop off you are

clear hinge
#

how about trying an img2img on the whole thing?

#

img2img isn't as drastic:

#

so... maybe the latents. But I'd look more into how the masking does stuff

#

(I'm very unfamiliar with all of this code, both new and old x.x)

#

kids are home, gtg for now x.x

heavy glacier
#

not seeing the same degradation you are

heavy glacier
#

(seams are still bad though)

clear hinge
heavy glacier
#

img2img

#

at least i cant notice it

clear hinge
#

Ah I don't think there's actually degradation in mine. Just the prompt, strength, etc. doing its thing

heavy glacier
#

ah got it.

clear hinge
#

But img2img is roughly inpaint without masking, so helps narrow the problem down

#

My guess is something to do with the masking. But I don't know how that all works 😣

#

Maybe that'll give @worldly cloak an idea though.

heavy glacier
#

inpainting seems to be working well though

west nebula
#

So who is using xformers at this point?

heavy glacier
#

i am

#

OK - and inpainting does have degradation

#

after testing on more sensitive gradients/areas

#

so its definitely anything w/ masks

rose sentinel
rose sentinel
rose sentinel
heavy glacier
#

hm... maybe i just had a bad inpaint then

clear hinge
rose sentinel
#

I meant a side-by-side of outpainting with a checkpoint model vs outpainting with the equivalent diffusers model.

west nebula
# heavy glacier i am

Are you seeing the same problems that @rose sentinel and I saw with the same txt2img parameters yielding different results? The effects were multiplied for me if I did high-res optimization and a larger image size.

#

This is very helpful!

rose sentinel
clear hinge
#

Fresh model downloads too

west nebula
#

@rose sentinel It appears from the debugging output that my TI is loaded, but it also seems to have minimal effect. When I trained on 2.2.5, it definitely worked. And another TI that I trained with the diffusers version looks fine, if not a bit overtrained. So... hm.

rose sentinel
# west nebula This is very helpful!

I've only done training once, a few days ago when I wrote the front end, and it definitely worked. I trained on pictures of "jello", and when I give the token "<jello>" (with the angle brackets) I get jello and nothing else. So maybe way overtrained.

west nebula
#

Want to try with my images?

rose sentinel
#

Sure! Put them somewhere that I can retrieve them and I'll give it a go. I'm doing a second training now, but can start yours tomorrow morning.

west nebula
#

Will do. What params are you using for your training?

rose sentinel
#

Pretty much the defaults: learning rate, gradient, batches, etc. Just now I was trying to get training going on a multi-GPU system, but I can only get one GPU to work.

#

Ok. I'll start the training tomorrow. How long does it take for you? On my system it is a couple of hours.

west nebula
#

I have no idea why one training would be perfect and the other almost unnoticeable with the same training parameters.

#

I did 2000 steps and it took maybe an hour... I tend to set it off before leaving for a bit.

#

I did 4000 steps also and it still wasn't very noticeable.

rose sentinel
#

You want to train on a style, is that right?

west nebula
#

Correct.

#

I've trained based on 1.5 and 1.5 non-EMA, no difference.

rose sentinel
#

Advice on the parameters? I have no idea how they interact. Also I've heard that providing too many images will make things worse, which seems counterintuitive to me.

west nebula
#

What I'm getting vs. what I'm looking for (albeit exaggerated).

west nebula
#

Just generate the same image multiple times and take a difference in Photoshop/GIMP.

rose sentinel
#

Happens with both. It is odd because if I run the same parameters 10 times I don't get 10 different variants, but two variants that alternate, more or less.

west nebula
west nebula
rose sentinel
#

I just use the WebGUI, hit the Invoke button multiple times (with the seed fixed) and then arrow key through the gallery. Very easy to see the changes.

rose sentinel
rose sentinel
heavy glacier
#

no variations on 1.5 for me (xformers)

rose sentinel
#

It's going to drive some people crazy. I added a switch to disable xformers today.

heavy glacier
#

oh wait

#

I think 1.5 isn't using xformers for me

west nebula
# rose sentinel Same as what I tried, except I did 3000 steps.
resolution: 512
lr_scheduler: constant
mixed_precision: fp16
learnable_property: style
initializer_token: ★
placeholder_token: <lfb>
train_data_dir: !!python/object/apply:pathlib.PosixPath
- /
- home
- jovyan
- work
- InvokeAI
- training-data
- lfb
output_dir: !!python/object/apply:pathlib.PosixPath
- /
- home
- jovyan
- work
- InvokeAI
- text-inversion-training
- lfb
scale_lr: true
center_crop: false
enable_xformers_memory_efficient_attention: true
train_batch_size: 2
gradient_accumulation_steps: 4
max_train_steps: 2000
lr_warmup_steps: 0
learning_rate: 0.0005
only_save_embeds: true
rose sentinel
#

Hah! You found the preferences file. I thought it might come in useful.

west nebula
heavy glacier
#

yep - variations w/ xformers.

rose sentinel
#

For what it's worth, I'm not using xformers for my training. Have you tried without?

west nebula
rose sentinel
west nebula
#

I'll set off another run before I go to bed but I won't be able to check on it until I get to PDX.

#

Playing with Invoke while I'm trying to get out of town is definitely not a good idea for my stress level.

heavy glacier
#

this appears to be how xformers work from looking around

west nebula
#

The old TI script I think defaulted to 5E-3 rather than 5E-4... maybe there's something to that?

rose sentinel
# heavy glacier yep - variations w/ xformers.

It's odd. There shouldn't be any stochastic behavior at all. Could you see if there is a periodicity to the variations? There could be a variable being incremented somewhere that changes the behavior between generations.

rose sentinel
heavy glacier
west nebula
#

Sounds like an implementation error to me. It shouldn't be pulling entropy from anywhere.

rose sentinel
#

In fact, the generated images should be binary identical.

#

(Unless the metadata has a date in it? I don't recall)
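One way to look for the periodicity asked about above, as a minimal sketch: hash each output and label the distinct variants in the order they first appear. The function names here are hypothetical, and the approach assumes the files carry no per-run metadata (such as a date); if they do, the decoded pixels would have to be compared rather than the raw bytes.

```python
import hashlib
from pathlib import Path


def variant_sequence(blobs):
    """Label each blob by the first-seen order of its SHA-256 digest."""
    labels = {}    # digest -> small integer label
    sequence = []  # one label per input, in input order
    for blob in blobs:
        digest = hashlib.sha256(blob).hexdigest()
        if digest not in labels:
            labels[digest] = len(labels)
        sequence.append(labels[digest])
    return sequence


def variant_sequence_for_files(paths):
    """Same as variant_sequence, but reads the bytes from image files."""
    return variant_sequence([Path(p).read_bytes() for p in paths])
```

A fully deterministic run would give all zeros; outputs that alternate between two variants would show up as a repeating pattern of two labels.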

heavy glacier
#

interesting... so bug exists in auto as well then?

rose sentinel
#

AUTO1111?

heavy glacier
#

yes

#

thats where I'm corroborating the "SD + xformers = no longer deterministic w/ same seed" reports

rose sentinel
#

If people are complaining that xformers is giving varied images on the auto distribution, then it is an upstream error, not ours.

#

Good corroborating evidence.

#

xformers is one complex beast. I'm glad I'm not the one who has to track down the source of the non deterministic behavior.

heavy glacier
#

Unrelated - @rose sentinel i seem to have converted a model to diffusers with a 'success' report, and now have issues generating due to an error
RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'

rose sentinel
#

Is that with the current main? This is a symptom of the autocast issues. The fix was only merged in recently.

heavy glacier
#

ah

#

ok

#

i may need to pull

rose sentinel
#

No, actually the merge was 2 days ago. You probably have it.

west nebula
#

So is the plan for SD2.1 to work w/o diffusers as well as with?

rose sentinel
heavy glacier
rose sentinel
rose sentinel
west nebula
#

Ack, sorry, meant xformers.

rose sentinel
# west nebula Ack, sorry, meant xformers.

Yes. I don't want to release until we've got SD2.1 working without the xformer install. I actually have a fully functioning workaround in a PR waiting in the wings in case we can't get to the root cause of the issue (using "we" rather loosely since it's actually only Damian and Keturn who understand what's going on).

#

The workaround puts generation into float32 mode when the SD2.1 model is loaded.

west nebula
#

It's stupid, but I like reproducibility a lot. Just how I'm wired.

rose sentinel
#

Also, unfortunately, we can't put xformers into requirements-base.txt as I'd hoped we could. It will have to be an optional thing that people install.

#

Or maybe what we do is to install xformers but default to --no-xformers, and let people turn it on with the command line switch. (or put into invokeai.init)
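The "installed but off by default" idea could look roughly like this with the standard library's `argparse`. This is only a sketch: the parser, the `use_xformers` helper, and the `--xformers` spelling of the positive flag are assumptions, not InvokeAI's actual argument handling.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="invoke-sketch")
    # BooleanOptionalAction (Python 3.9+) creates both --xformers and --no-xformers.
    parser.add_argument(
        "--xformers",
        action=argparse.BooleanOptionalAction,
        default=False,  # off unless the user opts in
        help="enable xformers memory-efficient attention",
    )
    return parser


def use_xformers(argv):
    """Return True only when the user explicitly asked for xformers."""
    return build_parser().parse_args(argv).xformers
```

With this shape, reproducibility is the default and the speed-up is an explicit opt-in.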

heavy glacier
#

i think that's the easiest.

west nebula
#

That's if we can find xformers that doesn't require multiple hours of compilation time?

heavy glacier
#

there's a pip install

#

that works on win and linux

west nebula
#

For the release candidate?

heavy glacier
#

I think we can tell people to manually install xformers using pip - we just can't put it in reqs, per lstein

west nebula
#

xformers==0.0.16rc390 is what I used for another project but I compiled for Invoke from scratch because I was just following instructions.

heavy glacier
#

no?

rugged moth
#

unless you mean using the wheel on windows

heavy glacier
#

worked fine for me

rugged moth
#

pip install xformers?

heavy glacier
#

yeah

rugged moth
#

are you on WSL?

heavy glacier
#

i referenced above

rugged moth
#

coz on native windows, it does not work 100%

#

as you can see, only a linux wheel

heavy glacier
#

ah yeah, I installed w/ WSL

rugged moth
#

ye that'd work coz it treats that as linux

heavy glacier
#

yeah

rugged moth
#

but native windows, there's no xformers wheel on pypi

#

you need to get the wheel off the xformers repo

#

basically they dont do them formally but their actions build wheels for windows too

#

at some point we need to download one and provide it ourselves

#

coz the wheels keep updating to future version as they update their repo

#

also on native windows theres no pip install triton either

#

and i havent found a way to get triton working on windows yet .

clear hinge
#

hrm... yah I can't make sense of the under-the-hood code for inpaint enough to tell if there's anything wrong. Might be helpful to output per-step results somewhere and then compose them into a single image... I guess the callback could handle that to some extent...

worldly cloak
#

making a separate thread for seams so they don't get lost in the xformers and inversions and whatnots: #1065885676161212506

sour sun
#

Is it possible to load a local diffusers model? I've only been able to figure out how to do it by repo ID.

west nebula
grave lion
#

I don't know where the problem is, but seems like I get almost the opposite quality.

#

maybe because of mps?

#

strange

#

can anyone with m1 try both 512 and 768?

heavy glacier
#

Back to insane model load times unfortunately 🤔 Model loaded in 122.51s (Although on subsequent starts... 2s - inconsistent)

#

Only on starting though, which is odd.

#

Diffusers conversion/optimization bugs:

  • Converted file has a description of Optimized version of {model_name} - not the actual model name.
  • Model location is None in WebUI - Might also be related to people saying they can't use local path to load up models?
  • Having a difficult time adding VAE manually to the file, blocked doing it through WebUI :
#

WebUI Bug -

  • AssertionError: missing required field "format" on trying to update, likely should be passing in the format: diffusers by default when editing a diffusers model
west nebula
rose sentinel
west nebula
#

Cool. Did you use xformers?

worldly cloak
rose sentinel
#

Great news!

forest spade
#

Wow the activity on this channel is impressive 😅 Hard to read all messages. Just some quick updates:

GitHub

This PR adds a StableDiffusionInstructPix2PixPipeline for the InstructPix2Pix: Learning to Follow Image Editing Instructions, a stable diffusion fine-tuned model which allows editing images using l...

rose sentinel
#

I see that @forest spade is online. Can you provide me with some advice on how to get the resume from checkpoint function to work in the diffusers textual inversion training script? I have tried killing the training process halfway through and then relaunching with the --resume_from_checkpoint argument set, but each time the system calculates a start step that is higher than the end step and refuses to do further training. Is this a known problem?

forest spade
#

Also regarding the release I still have two big things I'd like to get merged so that they are in the release:

GitHub

I could not find anything for diffusers and unfortunately I'm not on the Level yet where I can implement it myself. :) It would be amazing to be able to weight prompts like "a dog ...

heavy glacier
rugged moth
rose sentinel
rose sentinel
#

Here are the settings I used:

resolution: 512
lr_scheduler: constant
mixed_precision: fp16
learnable_property: style
initializer_token: ★
placeholder_token: <lfb>
train_data_dir: !!python/object/apply:pathlib.PosixPath
- /
- home
- lstein
- invokeai
- training-data
- lfb
output_dir: !!python/object/apply:pathlib.PosixPath
- /
- home
- lstein
- invokeai
- text-inversion-training
- lfb
scale_lr: true
center_crop: false
enable_xformers_memory_efficient_attention: false
train_batch_size: 10
gradient_accumulation_steps: 4
max_train_steps: 3000
lr_warmup_steps: 0
learning_rate: 0.0005
only_save_embeds: true
#

Here's portrait of beyonce in <lfb> style

heavy glacier
#

If a user is on safetensors on 2.3, is our instruction to convert to diffusers?

worldly cloak
#

pickle/safetensors is one axis, CompVis/diffusers is another axis

safetensors & diffusers is always best

heavy glacier
#

Aye.

#

Would we expect diffusers to take up more memory during generation?

worldly cloak
#

not particularly

west nebula
#

@rose sentinel Thanks for testing!

#

Just getting off of a plane now.

heavy glacier
#

Hmmm. Getting ValueError: token_ids has shape torch.Size([79]) - expected [77] still, generating for promptcraft this week

#

I thought we had fixed that but maybe I'm wrong

#

suppose i just need to get an older version of invoke up to get through promptcraft

#

darn

heavy glacier
#

Someone test my assumptions. For all places where hardcoded 512x512 w/h are set in the code (a LOT of places), where would we not want that to just be updated to a {model_width} and {model_height} variable?

worldly cloak
heavy glacier
#

fix-padding

#

padded is more gooder.

#

blending is tough

#

are VAEs for diffusers models supposed to be configured through YAML or dropped into the folder?

worldly cloak
#

I expect it's more noticeable on shorter prompts (where more of the 77 tokens go to padding.)
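To make the fixed 77-token window concrete, here is a rough sketch of fitting a prompt's token ids into it. The function names are hypothetical, and the start/end ids (49406/49407) and the choice to pad with the end token are assumptions about the CLIP tokenizer, not something confirmed in this thread.

```python
def fit_to_context(token_ids, max_length=77, bos_id=49406, eos_id=49407, pad_id=49407):
    """Truncate or pad prompt token ids to exactly max_length entries."""
    body = list(token_ids)[: max_length - 2]   # leave room for start and end markers
    ids = [bos_id] + body + [eos_id]
    ids += [pad_id] * (max_length - len(ids))  # short prompts are mostly padding
    return ids


def padding_fraction(prompt_token_count, max_length=77):
    """Share of the window taken up by padding for a prompt of the given length."""
    used = min(prompt_token_count, max_length - 2) + 2
    return (max_length - used) / max_length
```

Under these assumptions a 5-token prompt leaves 70 of the 77 slots as padding, which is why padding behavior would matter most on short prompts. A sequence longer than the window (like the 79 in the error above) has to be truncated or split before it reaches the encoder.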

heavy glacier
#

that would make sense

worldly cloak
#

for VAEs, either/or. If you always want to use that VAE with that model, might as well distribute it along with the model.

but if someone releases a new VAE (like the MSE-finetuned one for 1.x), easier to point to that than repackage everything.

heavy glacier
#

k - i'm just getting some issues using a VAE key/value pair for this diffusers model so wasnt sure

#

will just poke and see if im messing up a path or something lol

rose sentinel
heavy glacier
rose sentinel
rose sentinel
rose sentinel
heavy glacier
#

ill ping you the prompt that was giving me fits

#

well... I'm not getting it now. Disregard I guess.

#

wait... let me test something.

#

thats it

#

works on diffusers model, does NOT work on ckpt

tardy sparrow
tardy sparrow
west nebula
heavy glacier
#

Can confirm inpainting has quality degradation as well

#

Can see it around the second head (inpainted) of this img #🎨outdir message

tardy sparrow
#

i have a cross-attention control implementation working against current diffusers main (0.12.0-dev). couple of caveats - it's non-sliced and it currently only works on CPU on macOS because of an upstream torch bug. top is "a cat playing with a ball in the forest" -W512 -H512 -s15 -S123 , bottom is "a cat.swap(dog, s_end=0, t_start=0.2) playing with a ball in the forest" -W512 -H512 -s15 -S123

#

PR is ^ if anyone would like to try it out on linux/win

#

this is with no monkey patching for a happy @worldly cloak

rose sentinel
#

(I will update the table for PR 2385 after doing the testing.)

rose sentinel
#

@tardy sparrow I just tested PR 2385 on a Linux/NVIDIA system using diffusers 0.12 (pulled today). Good news is that I can generate SD-2.1 images without xformers in an OK memory footprint (8.75 GB RAM). Bad news is that swap() produced the dreaded RuntimeError: expected scalar type Half but found Float. Is this the upstream torch bug you mentioned? Here's the stack trace:

tardy sparrow
rose sentinel
#

@forest spade I'm trying to integrate the checkpoint merger functionality into InvokeAI, but I discovered that passing local diffusers model paths to the merge() function doesn't work. I've submitted a PR that might fix the problem: https://github.com/huggingface/diffusers/pull/2060

GitHub

The documentation for the community pipeline checkpoint_merger.py states that you can merge a model either using its repo_id or a path to the model directory on local disk. However, the latter does...

#

However, I couldn't easily figure out how to test the proposed fix, since the checkpoint_merger.py file is downloaded fresh from GitHub main each time. Is there a way to tell the code to look in my local repo for community pipelines?

#

(I can change COMMUNITY_PIPELINES_URL in dynamic_modules_utils.py, but this seems like a hack?)

tardy sparrow
#

@west nebula could you give this a spin? i think s_start/s_end is doing nothing now, but i'm not sure that that matters

heavy glacier
#

Am assuming we still don’t have any leads on inpainting/outpainting issues?

#

Is there any way I could do some testing/debugging to figure out where it’s going sideways? I think there’s a image debugger in the webUI - maybe ought to try that

heavy glacier
#

User confirmed the VAE issue I reported a few days ago is happening to them as well
#🌏invoke-chat message

worldly cloak
heavy glacier
#

Ok - so what are instructions for users using custom VAEs (e.g., anythingv3)

worldly cloak
#

if you want to use Anything v3, use https://huggingface.co/Linaqruf/anything-v3-better-vae

if you wanted to use only the VAE from that on some other model, I guess that's where you'd do something like

blah mix:
    format: diffusers
    path: /blah/blah/blah
    vae:
        repo_id: Linaqruf/anything-v3-better-vae
        subfolder: vae
heavy glacier
#

good details for the docs! will share w/ the user that was trying to use it

#

👍

heavy glacier
#

Ok so - aside from outpainting/inpainting issues, I've also noticed some odd artifacts cropping up recently - specifically when it seems i'm using weights

#

(txt2img)

#

for some reason, this prompt seems to have this happen even when I've not been having this happen elsewhere today - it might just be highlighting some wacky behavior.

#

i'll also note i've not pulled today, so if there's been anything fixed recently I can try doing that

#

Do we have a running tracker going for open bugs that I should be adding to?

#

im also not going to say the above artifacts couldnt just be a model issue - ive still got the deliberate model loaded from troubleshooting people's issues earlier. swapping to other models don't immediately have those same artifacts crop up

heavy glacier
#

I've also dug into the image debugging for outpainting - It definitely seems like its happening on the inpainting seam step. Unclear whats happening here.

rose sentinel
#

@heavy glacier Digging into the outpainting issue is coming up second on my list of things to look into. I'm first going to look at the new installer combo that @gusty hound and @dire gazelle have put together. I do see a similar artifact issue as you do, but only when I crank up the weight of an element a lot: banana+++++. I think it's been like that for a while, but I'll check.

clear hinge
#

I dug into the seam painting far enough to identify that it seems likely to be something to do with inpainting in general, or a small change I'm not seeing in how things are being encoded to do the seam paint. Everything seems to be constructed correctly.

rose sentinel
#

@worldly cloak, @heavy glacier, @rugged moth There are a bunch of PRs from me that could do with a little attention. They are:

  • 2353 - "import .safetensors ckpt files directly" -- This allows us to import ckpt files in the safetensors format
  • 2333 - "improve UI of textual inversion frontend" -- Improvements to the console-based textual inversion front end.
  • 2369 - "Allow user to specify VAE with !import_model" -- Provide a user interface for adding and changing the VAE assigned to a model without editing models.yaml
  • 2395 - "ckpt conversion script respects cache in ~/invokeai/models" -- Prevent the ckpt->diffusers module from re-downloading CLIP, safety checker, and other ancillary models if they are already located in the root directory.
  • 2388 - "add interactive diffusers model merger" -- Scripts to merge two or more diffusers models. According to legend you can use this code to convert regular models into inpainting models, but I haven't confirmed yet.
  • 2372 - "Better status reporting when loading embeds and concepts" -- This provides more informative messages about when a token has been recognized as the trigger for an embedding.
rose sentinel
clear hinge
worldly cloak
#

the only references to make_schedule I see are in ckpt_generator, which shouldn't be able to have diffusers pipeline passed to it, and omnibus, which I thought we determined isn't used by diffusers pipeline either.

clear hinge
#

No idea. New venv, pip install from the windows requirements, ran the script to download models (got 2.1 and 1.5), then only the 512x512 2.1 model works. Errors like above for 1.5, and just black images with exclamation points for 2.1-768

worldly cloak
#

put the full traceback and your models.yaml in a github issue? cuz that sounds very wrong

clear hinge
#

What I'd need to probably do is set it up on a brand new machine and see if I could repro it. Haven't had a chance to though >.<. Figured it's either something where environments are fighting each other (I'm guessing by sharing the config directory?) or there might be a bug.

worldly cloak
#

yeah, could be the config directory if your models.yaml got mixed up.

could try passing a different --root_dir to invoke

clear hinge
#

same thing with the 768 model (just black output, but I have the nsfw checker off)

#

2.1 512 works just fine

#

1.5 model seems to work now though

#

so I guess upgrades might be a pain =/

#

let me know if I can check anything on the 768 model. Won't be able to do it until the morning though (it's late here and I've got a morning meeting, need to get some sleep x.x)

rose sentinel
#

By the way, @sour sun has suggested that we enhance the model merging so that the merged model is kept in memory and immediately available rather than writing to disk first, along the lines of what @tardy sparrow did. This is straightforward to code, and the syntax to do this on the CLI would look like this:

invoke> !merge_models model_a model_b model_c --alpha=0.5 --interp=weighted_sum --dest=merged_model

From then on, the merged model will appear on the models list, but will not be written to disk until the user asks for that.
Should this wait until 2.4 (or until after nodes?)
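For anyone unfamiliar with what `--interp=weighted_sum --alpha=0.5` would do, here is a toy sketch for two models. It assumes weighted sum means plain linear interpolation of matching parameters, and it uses lists of floats as stand-ins for tensors; how three models would be combined is not specified here.

```python
def weighted_sum(state_a, state_b, alpha=0.5):
    """Blend two parameter dicts: (1 - alpha) * a + alpha * b for every weight."""
    if state_a.keys() != state_b.keys():
        raise ValueError("models must share the same parameter names")
    return {
        name: [(1 - alpha) * a + alpha * b
               for a, b in zip(state_a[name], state_b[name])]
        for name in state_a
    }
```

With `alpha=0` the result is the first model unchanged, with `alpha=1` it is the second, and `alpha=0.5` is an even blend.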

west nebula
heavy glacier
west nebula
west nebula
#

@tardy sparrow ModuleNotFoundError: No module named 'diffusers.models.cross_attention'

heavy glacier
#

did you update to .12?

west nebula
#

Absolutely not.

#

Just following requirements.txt here, so that should be updated if 0.12 is needed.

heavy glacier
#

It was mentioned in the PR

#

But agree, before it gets merged, would need to be updated as a req

west nebula
#

Docs? Who has time to read those?

#

Anyway, as written I cannot install diffusers 0.12:

Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting https://github.com/huggingface/diffusers
  Downloading https://github.com/huggingface/diffusers
     | 324.0 kB 1.8 MB/s 0:00:00
  ERROR: Cannot unpack file /tmp/pip-unpack-q77ejh56/diffusers (downloaded from /tmp/pip-req-build-4da97prh, content-type: text/html; charset=utf-8); cannot detect archive format
ERROR: Cannot determine archive format of /tmp/pip-req-build-4da97prh
#

So that means I'm at a dead end for now and switching back to main

tardy sparrow
tardy sparrow
west nebula
#

I can give that a shot, hang on.

#

And you want me to test on the RTX end, correct?

#

So I'm not using xformers right now, and I am getting this: Error! CrossAttentionControl found an unexpected number of <class 'ldm.models.diffusion.cross_attention_control.InvokeAIDiffusersCrossAttention'> modules in the model (expected 16, found 0). Either monkey-patching failed or some assumption has changed about the structure of the model itself. Please fix the monkey-patching, and/or update the 16 above to an appropriate number, and/or find and inform someone who knows what it means. This error is non-fatal, but it is likely that .swap() and attention map display will not work properly until it is fixed.

#

However, swap does appear to work with s_start=0 and s_end=0.

#

Same when using xformers. Works but I get those errors.

#

Tested with SD1.5 non-ema.

west nebula
#

@tardy sparrow Also getting OOM errors making a 1024x768 image with +++ syntax.

tardy sparrow
#

functionally, though - error messages and OOM aside - do you feel like this is doing everything that you need?

unkempt timber
#

Might be interesting to look at

west nebula
tardy sparrow
west nebula
#

Oh so we're done with them and shape_freedom?

tardy sparrow
#

yeah, i think so - with the next diffusers implementation the s_ switches do nothing (the method doesn't support what they used to do), and i think with only two params it's more clear what is happening

#

probably that can come down to one param, t_start - do you have much use for tweaking t_end?

west nebula
#

Haven't yet, but maybe others have.

#

So then the new version is space-only replacement and you get to decide when that kicks in?

tardy sparrow
#

token-only replacement, yeah. after thinking on it a wee bit i think i'll leave both params in - no reason to remove t_end really

rose sentinel
rugged moth
#

there was no patch done to fix the seam issue right? coz i think the issue is actually with seam steps being too low. Increasing them is consistently improving the results

#

and im guessing the 2.1 depth model and the x4 enlarge model dont work yet?

#

@clear hinge I noticed in inpaint.py that seam_paint was receiving the noise as x_T but never being used. Instead the seam_noise was being generated with self.get_noise(im.width, im.height) .. When I replaced the passed noise with this, I seem to be getting better results.

#

with diffusers that is

clear hinge
rugged moth
#

let me try multiple iterations and check

#

even with x_T, I'm getting different results if i do multiple iterations

tardy sparrow
heavy glacier
sour sun
heavy glacier
#

Lol - not trying to imply it “ought” to be !commit - just that it is.

!optimize is for passing a ckpt and converting to a diffusers model - how do you see using it in the context of offloading a merged model from memory?

west nebula
#

Prior to diffusers, I could do 1536x1536 without a problem. Now I can do at most 768x768 (or so).

tardy sparrow
#

yes, almost certainly

#

for parity with the old codebase the attention slice would need to be computed per attention head, per step

#

this is doable, it's just currently not being done. i'll take a look at it after i figure out the sliced .swap()

forest spade
rose sentinel
#

I should have thought of that! It would have made testing a lot easier. Was this something I missed in the documentation?

heavy glacier
#

@rose sentinel - FYI; I started a TI training and got ~5% of the way through training, and closed. Doesn't seem like it saves any .bin unless it gets through the whole thing?

rose sentinel
#

You will only get .bin files every 500 steps.

#

It saves a checkpoint at 500, 1000 and so forth. There is a parameter for adjusting this that I haven't exported to the frontend.

#

Once you have a checkpoint, you can resume from there.

heavy glacier
#

Got it 👍

west nebula
#

Taking a look at txt2img2img, this math seems off:

```
        scale = 512 / scale_dim

        init_width, init_height = trim_to_multiple_of(scale * width, scale * height)
```
#

If I want a resulting image that's 1024x512, that means that the initial image generated is going to be 1024x512. That explains the duplication I've been seeing. Anyone object to me altering this?

#

It's been this way for a long time, too, so it is not an insignificant change.

rose sentinel
#

It is definitely off. The code should be finding the larger dimension and then scaling it to 512 (or 768 for the SD-2.1 model). The smaller dimension should be scaled the same.
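A minimal sketch of the rule described in this message, not the actual InvokeAI code. `trim_to_multiple_of` here is a hypothetical stand-in that rounds each dimension down to a multiple of 8, and `model_dim` is an assumed parameter (512, or 768 for SD-2.1):

```python
def trim_to_multiple_of(*dims, multiple_of=8):
    # Hypothetical stand-in: round each dimension down to a multiple of 8.
    return tuple(int(d) // multiple_of * multiple_of for d in dims)


def initial_size(width, height, model_dim=512):
    # Scale so the LARGER side equals the model's native dimension;
    # the smaller side shrinks by the same factor.
    scale = model_dim / max(width, height)
    return trim_to_multiple_of(scale * width, scale * height)


print(initial_size(1024, 512))  # (512, 256)
```

Under this rule a 2048x512 request starts from 512x128, which is the lower-bound concern raised in the replies that follow.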

heavy glacier
#

Well I think the question is - What is the lower bound of the smaller dimension, and should it be 512 or lower

#

I don't think you'd always want the larger dimension to be scaled to 512 - There's probably a happy medium that keeps the smaller dimension from going so far below 512 that it generates garbage. But @west nebula you probably have used it more and know what the balance is

west nebula
#

Is there a way to find the model's default resolution inside of class Txt2Img2Img during construction or get_make_image?

heavy glacier
#

It should be available in the model information

#

That's stored/pulled from the model config file (W/H)

#

and gets loaded into the model list on initialization

west nebula
#

Time to dig!

#

Here's where I'm at right now:

```
        # Make their area equivalent to the model's resolution area (e.g. 512*512 = 262144),
        # while keeping the minimum dimension at least 384
        aspect = width / height
        model_area = 262144 # hardcoded for now

        if aspect > 1.0:
            init_height = max(384, math.sqrt(model_area / aspect))
            init_width = init_height * aspect
        else:
            init_width = max(384, math.sqrt(model_area * aspect))
            init_height = init_width / aspect

        init_width, init_height = trim_to_multiple_of(math.floor(init_width), math.floor(init_height))
        print(f"\nUsing initial resolution of {init_width}x{init_height}\n")
```
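For anyone who wants to try the snippet above outside the class, here is a self-contained version of the same arithmetic. The function name `area_matched_size` and the `trim_to_multiple_of` stand-in (round down to a multiple of 8) are assumptions for illustration only:

```python
import math


def trim_to_multiple_of(*dims, multiple_of=8):
    # Hypothetical stand-in: round each dimension down to a multiple of 8.
    return tuple(int(d) // multiple_of * multiple_of for d in dims)


def area_matched_size(width, height, model_area=512 * 512, min_dim=384):
    # Match the model's native pixel area while keeping the requested
    # aspect ratio, but never let the shorter side drop below min_dim.
    aspect = width / height
    if aspect > 1.0:
        init_height = max(min_dim, math.sqrt(model_area / aspect))
        init_width = init_height * aspect
    else:
        init_width = max(min_dim, math.sqrt(model_area * aspect))
        init_height = init_width / aspect
    return trim_to_multiple_of(math.floor(init_width), math.floor(init_height))


print(area_matched_size(1024, 512))   # (768, 384)
print(area_matched_size(1024, 1024))  # (512, 512)
```

Note that once the 384 floor kicks in (as in the 1024x512 case), the initial area ends up larger than the model's native area.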
west nebula
heavy glacier
#

it probably can be, but I'm not as familiar with hacking in the python

west nebula
#

Likewise.

#

This new code seems to generate good results, but they're definitely different than what we were getting before. So if consistency is what we're after between versions, we should stick with what's there. If not, I like this better...

heavy glacier
#

care to share any examples of before/after?

#

I think we've aligned on better over reproducible into eternity

west nebula
#

One sec.

#

Oddly, after all of that, I don't see any visible difference in the output.

#

That points to all of that code being ignored.

#

And that is precisely what's happening.

#

Generator's generate method calls the subclass's get_make_image, which ultimately uses the tensor passed down to it from Generator - whose size is the width and height passed in directly.

#

This all changed from 2.2.5 where we pass a shape (without checking that the tensor is large enough) and x_T as passed in.

#

At least that's my read of it. I'd love another pair of non-tired eyes.

heavy glacier
#

im not surprised that this changed w/ diffusers

#

frankly we needed to update it anyways since we're doing non-512x512 images! 😛

rose sentinel
#

There are two code paths: one for ckpt models and the other for diffusers models.

#

The first path goes through ldm/invoke/ckpt_generator, and the second path goes through ldm/invoke/generator.

#

We will deprecate ckpt files and eventually discontinue support for them, but right now dropping ckpt support (or forcing everyone to convert) would be very unpopular.

rose sentinel
#

This is a bit ugly, but you can do this:
dimension = model.unet.config.sample_size * model.vae_scale_factor . You'll want to put a try around it, because it only works with diffusers models and may not be a stable API.
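A sketch of what the "put a try around it" advice could look like. The attribute chain is the one quoted above; the function name, the fallback of 512, and the `SimpleNamespace` stand-ins (used so the example runs without loading a real model) are assumptions:

```python
from types import SimpleNamespace


def model_native_dimension(model, default=512):
    # Only diffusers-style models expose these attributes, so fall back
    # to a default when the lookup fails (e.g. for ckpt models).
    try:
        return model.unet.config.sample_size * model.vae_scale_factor
    except (AttributeError, TypeError):
        return default


# Stand-in objects that mimic the attribute layout, not real pipelines.
fake_diffusers = SimpleNamespace(
    unet=SimpleNamespace(config=SimpleNamespace(sample_size=64)),
    vae_scale_factor=8,
)
fake_ckpt = SimpleNamespace()

print(model_native_dimension(fake_diffusers))  # 512
print(model_native_dimension(fake_ckpt))       # 512 (fallback)
```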

rugged moth
#

once all of the work with diffusers is done with the model management, please brief me on the requirements for the yaml and ill update the model manager to support it on the frontend

rose sentinel
#

I will write up a document. I'm not sure there are any changes pending. I'm resisting the temptation to create a new syntax for on-the-fly merged models.

rugged moth
#

ill keep it as the last thing to do

#

most of the UI is already there.. just need to update the writing logic

west nebula
rose sentinel
#

H'mmmm. Stable-diffusion-1.5, "50s housewife", 832x832:

#

Same thing, but with --hires_fix:

sour sun
# heavy glacier Lol - not trying to imply it “ought” to be !commit - just that it is. !optimize...

!optimize takes an existing model from your models.yaml file (currently assumed to be in ckpt format), and converts it to a diffusers model. If there is going to be a new "format" for a model in the models.yaml file that is a recipe for merging several other models, and you need a command to convert THAT entry to a self-contained diffusers model, it makes sense that you would use the same command for that.

rose sentinel
#

Unfortunately, even if I use the same seed for both images, I won't get the same thing because the hiresfix version starts with a different dimension.

heavy glacier
#

The command in question would instruct the system to dump from vram into a permanent entry on the yaml, and create the model files/configs/etc. - Which is related to !optimize, but functionally different

#

(This is my understanding, @rose sentinel is the keeper of truth here)

rose sentinel
#

I've got several conflicting ideas in mind. One is a workflow in which the user selects two or more models, merges them in memory, tries the merged model out, and if they like it they commit it to disk with a whole new name.

heavy glacier
#

fwiw, i like that idea/workflow

sour sun
tardy sparrow
heavy glacier
#

While it doesnt need to mean deprecating, CKPT files are being deprecated because we're currently maintaining 2 code paths to support both model formats

rose sentinel
#

The second conflicting idea is to extend the models.yaml format to do merging as needed. A stanza would look like this:

my-merged-model:
   source_models: [model_a, model_b, model_c]
   merge_alpha: 0.8
heavy glacier
#

The only reason we didn't remove CKPT support in the interim is because we wanted to have a grace period

heavy glacier
#

(And also like to do multiple chained merges)

west nebula
sour sun
tardy sparrow
#

ckpt support doesn’t need to be removed. my adapted converter code takes a ckpt and a yaml and some vars we already save in models.yaml and returns a stable diffusion pipeline

heavy glacier
#

ah i wasnt aware that existed!

#

that's nifty

rose sentinel
#

The safest thing right now is to load the ckpt using the old code.

tardy sparrow
#

in the meantime invoke’s converter is probably drifting further and further from the one that has already been merged into diffusers, yes?

#

seems like this is just going to cause more pain down the line.

#

what is the user base % that is using models that don’t convert ?

rose sentinel
#

Nope, converters are in sync

tardy sparrow
#

i guess i don’t understand why any new features are being developed for the ckpt code path

#

surely the better thing to do if merging is being implemented is to expressly only support merging diffusers compatible models

#

problem solved

west nebula
#

Just a wee bit kludgy!

rose sentinel
#

Tell you what. As soon as we've got inpainting and outpainting working on the diffusers models as well as they are working on ckpts, I'll drop all direct ckpt loading and do the in-memory loading of ckpt->diffusers.

#

No features are being added to the ckpt code path. For example, @west nebula is fixing txt2img2img on the diffusers code path.

#

@tardy sparrow There is one set of changes to the load_pipeline_from_original_stable_diffusion_ckpt call that would be very helpful, and would let me import from diffusers rather than a forked copy of it. That would be to pass through the cache_dir parameter to all the from_pretrained() calls. This would allow reuse of the openai/clip and nsfw models if they are stored in invokeai's root directory.

west nebula
#

old vs. new initial image sizing - totally different but unsure of which is better.

west nebula
#

I'd also like to see strength (CLI -f) represented on the UI alongside high-res optimization as it does pass through and can be really useful.

tardy sparrow
tardy sparrow
rose sentinel
#

The diffusers library seems to initialize from the environment variable at code load time, so it's very hard to control and leads to race conditions. In some of the pipelines (such as merge), there is a **kwargs argument that gets passed through to from_pretrained(), and this seems to work. In addition to use_cache_dir there is only_local and other useful options.

rose sentinel
west nebula
#

The point of the PR is to reduce duplication in images, and I don't know if there's a great way to objectively test that.

west nebula
#

Random seed, old/new

grave lion
west nebula
#

Yes, you can manually generate to a suboptimal resolution and use img2img to upscale to your desired resolution.

#

It's not going to look identical, but you then have total control.

grave lion
#

I prefer the new one cause it's obviously less duplicated (4 ears)

west nebula
#

And it has fewer limbs as an added bonus.

grave lion
#

yeah

west nebula
#

That's the extreme case, something so stretched out. I'd use outpainting if I were producing something for real with it and not a contrived test case.

#

I keep preferring the latter ones with the new scaling but maybe that's just me.

grave lion
#

but they may be preferable in some cases

west nebula
#

Nothing precludes somebody from doing img2img on a generation they like of a smaller image. That's how I mostly use InvokeAI. But I'd rather give better images out of the box for --hires_fix if we can.

#

It all needed to be fixed for 768x768 models anyway

worldly cloak
rugged moth
#

some issue with the actions

tardy sparrow
#

@worldly cloak i'm looking to enable sliced attention as part of my cross attention work - am i correct in assuming that the correct place to do that would be in StableDiffusionGeneratorPipeline.__init__(), something like this:

    if is_xformers_available() and not Globals.disable_xformers:
        self.enable_xformers_memory_efficient_attention()
    else:
        slice_size = 2 # or 4 or 8 i guess
        self.enable_attention_slicing(slice_size) 
forest spade
#

We just released 0.12.0 - hopefully that helps a bit regarding stability. We've included a section about important bug fixes which is probably interesting/important for you: https://github.com/huggingface/diffusers/releases/tag/v0.12.0

GitHub

🪄 Instruct-Pix2Pix
Instruct-Pix2Pix is a Stable Diffusion model fine-tuned for editing images from human instructions. Given an input image and a written instruction that tells the model what to d...

heavy glacier
#

@rugged moth LoRA!

#

Thanks for sharing @forest spade

rugged moth
tardy sparrow
#

sliced support is in - @west nebula can you give it a spin and see if your memory usage improves? you can edit the slice size at the bottom of StableDiffusionGeneratorPipeline.__init__() (line 310) - smaller numbers should mean less memory usage at the expense of performance

west nebula
#

Will that be a configurable setting?

#

And is this xformers or non-xformers or both?

tardy sparrow
#

non-xformers

#

i don't think xformers + swap will be a thing

west nebula
#

Fine by me, I refuse to use xformers until it gets fixed.

tardy sparrow
#

basically it temporarily disables xformers when you do .swap()

#

i mean, that's what it does in theory anyway.

west nebula
#

Are your changes in main or elsewhere?

tardy sparrow
#

diffusers 0.12 is now released so you can just do pip install diffusers==0.12

west nebula
#

and also transformers==whatever

tardy sparrow
west nebula
#

I'll track it down.

#

So the sliced support should help with all large image generation, correct?

tardy sparrow
#

should do yes

west nebula
#

Stand by!

sour sun
#

Maybe there should be a configuration that lets you choose slice sizes proportional to the image size, so that you're not leaving performance on the table when generating small images, but you can keep the memory requirements lower when you need to, for generating larger images?
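One way the size-proportional idea could be sketched. The thresholds are invented for illustration and would need tuning against real VRAM measurements; the return values are the kinds of argument diffusers' `enable_attention_slicing()` accepts ("auto", "max", or an integer), with `None` meaning "leave slicing off":

```python
def choose_slice_size(width, height, small_area=512 * 512, large_area=1024 * 1024):
    # Hypothetical policy: no slicing for small images, "auto" for
    # medium ones, "max" (slowest, least memory) for large ones.
    area = width * height
    if area <= small_area:
        return None
    if area <= large_area:
        return "auto"
    return "max"


print(choose_slice_size(512, 512))    # None
print(choose_slice_size(1024, 768))   # auto
print(choose_slice_size(1536, 1536))  # max
```

A caller would pass a non-None result to `pipeline.enable_attention_slicing(...)` and call `pipeline.disable_attention_slicing()` otherwise.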

west nebula
#

If you're doing that, the approach should take VRAM into account as well if we have that information. But for now I'll settle for "make big images" even if I have to configure something manually.

#

Before I play with 8, 4, 2 as options, I'd like to know what the number influences.

#

Mostly if I set it wrong, will it crash? Or is there a way for the number to adapt to an OOM error? Or...

west nebula
#

I think VRAM's not getting freed up somewhere since it did the initial outpainting 1024x1024 generation without a problem.

sour sun
tardy sparrow
west nebula
tardy sparrow
west nebula
#

In diffusers?

#

NVM, trying it now.

west nebula
#

Maybe this is something in inpainting not freeing VRAM?

#

This happens just after starting the second inpainting (outpainting) pass when it blends the seams.

#

I don't think it has anything to do with the attention slicing code directly.

tardy sparrow
#

right. yeah idk.

west nebula
#

Hmm.

#

So your code works and lets me do txt2img with much larger sizes than I can do without it. If that's the PR, then success!

tardy sparrow
#

i think this is done https://github.com/invoke-ai/InvokeAI/pull/2385. can i please get some more testing support - need to confirm it works on Linux and Windows, and also need to confirm that xformers is successfully re-enabled when doing a non-.swap() generation after doing one with .swap()

tardy sparrow
west nebula
#

I'm verifying some more now.

#

I think that something needs to free the GPU memory somewhere in the rendering pipeline. Diffusers is not good about returning memory.

#

Just had another crash trying to do 1536x640 - I was always able to do this pre-diffusers.

#

Where's a good place to throw some debugging output to see that attention slicing is doing its thing?

#

I take back my prior statements about this working. 🙂

worldly cloak
west nebula
#

This run, I could do a first generation of something large but I cannot do a second.

#

```
>> Could not generate image.
>> Usage stats:
>>   0 image(s) generated in 14.91s
>>   Max VRAM used for this generation: 9.86G. Current VRAM utilization: 2.17G
>>   Max VRAM used since script start:  9.86G
```
#

And VRAM is maxed out according to nvidia-smi:

```
Wed Jan 25 17:03:29 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.65       Driver Version: 527.56       CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0  On |                  N/A |
|  0%   44C    P8    11W / 170W |  11517MiB / 12288MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     28336      C   /python3.10                     N/A      |
+-----------------------------------------------------------------------------+
```
worldly cloak
#

what was max & current VRAM for the prior run that worked?

west nebula
#

Just on startup... hang on.

#

```
>>   1 image(s) generated in 71.28s
>>   Max VRAM used for this generation: 6.19G. Current VRAM utilization: 2.17G
>>   Max VRAM used since script start:  6.19G
```
worldly cloak
#

so it previously worked in 6.19G, but the next run it's trying to allocate an additional 7 GB on top of 9 GB?

😖 those numbers extremely do not add up

west nebula
#

Makes me think there's a leak somewhere important.

#

I can generate small things after this point but anything large throws up.

worldly cloak
#

yet the first one ends at 2.17 G, which is the same as the initial "model loaded" number. So it doesn't look like it's leaking there.

west nebula
#

VRAM also gets freed up after a small image generation but clearly something has changed since I can't go big again until restarting Invoke.

#

I also never see cross_attention_control.py's Context created, so it doesn't look like it's used for saving slices. (Is this right for diffusers?)

worldly cloak
#

I'm not up to speed on that cross-attention PR yet

#

hmm, alternate hypothesis: it's not a leak, it's some configuration that's being reset incorrectly.

e.g. maybe the attention slicing was on for the first run but then got turned off, or a different slice size

west nebula
#

I figured I'd throw some debugging output into save_slice but I never see it.

#

Not even the first run.

worldly cloak
#

if you're not using swap, I wouldn't expect Invoke's custom cross-attention code to be involved.

west nebula
#

True... so then the only slicing code that's happening is built into diffusers with the one call: enable_attention_slicing(slice_size=slice_size)

#

And then either things aren't getting reset properly (as you said) or there's a VRAM leak somewhere.

tardy sparrow
west nebula
#

I added ("").swap("") to the end of my prompt and I can render over and over again. Does that help at all?

tardy sparrow
#

nope haha

#

i tried to clean up the cleanup by putting it in a @contextmanager function - custom_attention_context on InvokeAIDiffuserComponent. it's possible i messed that up.

#

because i haven't written a @contextmanager before.
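For reference, the usual shape of a `@contextmanager` that swaps something in temporarily and puts the previous value back afterwards. This is a generic sketch, not the code in the PR: `FakeModel`, `attn_processor`, and `temporary_attention` are hypothetical names. The key detail is remembering what was there before and restoring that in `finally`, rather than resetting to a fixed default:

```python
from contextlib import contextmanager


class FakeModel:
    # Hypothetical stand-in for a model with a swappable attention processor.
    def __init__(self):
        self.attn_processor = "sliced"


@contextmanager
def temporary_attention(model, temporary_processor):
    previous = model.attn_processor          # remember the current setting
    model.attn_processor = temporary_processor
    try:
        yield model
    finally:
        # Runs on normal exit and on exceptions alike.
        model.attn_processor = previous      # restore, don't reset to None


model = FakeModel()
with temporary_attention(model, "swap"):
    print(model.attn_processor)  # swap
print(model.attn_processor)      # sliced
```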

west nebula
tardy sparrow
#

it should trigger for all .swap() prompts

west nebula
#

Right.

tardy sparrow
#

my feeling reading all of this is that the problem probably doesn't lie with the cross-attention stuff at all, because mathematically ("").swap("") is still swapping two 77x768 embedding tensors

west nebula
#

But I get OOM if I don't use .swap()

#

That's what I'm saying.

tardy sparrow
#

if you make the slice size smaller, does that help? what if you comment out the enable_sliced_attention?

west nebula
#

You're relying on the diffusers attention slicing there, right?

tardy sparrow
#

yeah

west nebula
#

I'll modify and restart Invoke, but I have to go eat dinner as well so this may be a bit.

west nebula
#

And after an OOM, I can still do large generations with .swap() whether the line is commented out or not. (I imagine this is expected.)

#

Does something get reset after a generation and we need to issue enable_sliced_attention again at the beginning of each generation?

sour sun
west nebula
#

@tardy sparrow Putting this.enable_sliced_attention('max') at the beginning of latents_from_embeddings does the trick. Can you move the attention-setting code to a new function that's called from there?

#

The evolution of this would be to see how much VRAM is required and if attention slicing is necessary.

rose sentinel
west nebula
#

@rose sentinel Are you using xformers? If so, can you test @tardy sparrow's branch with my patch above generating a larger-than-will-fit-in-VRAM image?

rose sentinel
heavy glacier
#

any luck on the inpainting issue? i haven't been able to keep up today

tardy sparrow
tardy sparrow
#

so with diffusers 0.12 and my new Compel library i've rolled out of the InvokeAI prompting code, custom prompt syntax with diffusers looks like this:

from compel import Compel
from diffusers import StableDiffusionPipeline

pipeline = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
compel = Compel(tokenizer=pipeline.tokenizer, text_encoder=pipeline.text_encoder)

# upweight "ball"
prompt = "a cat playing with a ball++ in the forest"
embeds = compel.build_conditioning_tensor(prompt)
image = pipeline(prompt_embeds=embeds).images[0]
#

what it doesn't support for now is the TextualInversionManager. i'm going to make Compel and textual inversion code as decoupled as possible - once that's done it should be possible to replace all of invoke's prompting and conditioning wrangling with import compel

west nebula
#

'auto' doesn't cut it (pun intended) for large images.

tardy sparrow
#

thanks for trying it out

#

can you open a PR against my branch?

#

then i can just fold it in.

#

or.. actually reading the patch now. so you've refactored the attention type setup to its own function, and then that gets called prior to every generation?

#

are there other places than image_from_embeddings() where this needs to go? i suspect image_from_embeddings() isn't the only codepath to generating images

#

might be wrong though

west nebula
#

Yeah, I'm not sure. It is the path for txt2img2img, img2img, and txt2img. Unsure if there are other places where it's called but that's a start.

#

And yes, I refactored it because we may want to have some sort of behavior there to determine whether to use no attention slicing, 'auto', 'max', or a number. No idea yet but it seemed prudent to pull it out.

#

If you still want me to open a PR, I can do it much later today... otherwise feel free to apply that patch yourself.

tardy sparrow
#

i have some comments from @worldly cloak (thanks for the review!) to address too

rose sentinel
#

I forget who asked for this originally, but I have posted a PR that will remove the dependency on the original implementation of clipseg (for text-based masking) and use the huggingface transformers version instead. In addition to reducing code complexity, this removes the last source-code module dependency that was preventing us from posting InvokeAI to PyPi for hands-free installs. @gusty hound

west nebula
#

Is there any documentation about the cache location for models and how to set it?

tardy sparrow
west nebula
#

Did Invoke download HF models originally to a different location?

tardy sparrow
#

diffusers by default puts them in ~/.cache/huggingface

#

at some point either lstein or keturn explicitly told it to put them under your invokeai folder

west nebula
#

I guess that point has passed and I can purge things from there.

tardy sparrow
#

yeah, unless you find yourself doing stuff with diffusers standalone you don't need them

west nebula
#

That's 79 gig that I don't need filled up.

tardy sparrow
#

noice

west nebula
#

I miscalculated and it's not that much, but still. No need for duplication anywhere.

#

Wishing ext4 had deduplication right now.

tardy sparrow
#

yeah there's definitely some "hardware imperialism" going on with the assumptions HF are making about the kind of systems ML users have

west nebula
#

I run all of my AI stuff on rusty platters because things are so large and there's a lot of churn, but once it stabilizes I'll likely get a smallish SSD.

tardy sparrow
#

your patch seems to be working fine on my vast.ai instance

#

@here

#

can i get a windows test

#

@west nebula i think i found why your patch fixed things - can you try again?

west nebula
#

Fill me in before I pull?

tardy sparrow
#

ah, i was restoring None to the cross attention processors for every non-swap

west nebula
#

^_^

tardy sparrow
#

should not have been doing that

west nebula
#

I'll pull and test now.

#

Wait, it won't work.

#

'auto' doesn't cut it, I need to use 'max' on my setup.

#

Though as we discussed there should be a strategy for this.

tardy sparrow
#

yeah, 'max' has performance implications for smaller images

west nebula
#

The flip side is that I can't generate anything large without it on my GPU with 12GB VRAM. 1536x1536 crashes.

#

I was able to do medium-sized things, so that's good.

#

I think it should be a per-generation choice based on available memory. We should be able to estimate how much is needed, or try generating and switch for the render if we get an OOM...
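The "try generating and switch if we get an OOM" half of this could look roughly like the sketch below. Everything here is hypothetical: `OutOfMemory` stands in for whatever out-of-memory exception the GPU backend raises, and `generate` is any callable that takes a slicing option and returns an image:

```python
class OutOfMemory(RuntimeError):
    """Hypothetical stand-in for the backend's out-of-memory error."""


def generate_with_fallback(generate, slicing_options=(None, "auto", "max")):
    # Try the fastest setting first and fall back to more aggressive
    # attention slicing only when the previous attempt ran out of memory.
    last_error = None
    for option in slicing_options:
        try:
            return generate(option), option
        except OutOfMemory as err:
            last_error = err
    raise last_error


def fake_generate(option):
    # Pretend this image only fits in VRAM with "max" slicing.
    if option != "max":
        raise OutOfMemory(f"did not fit with slicing={option!r}")
    return "image"


print(generate_with_fallback(fake_generate))  # ('image', 'max')
```

The cost of this approach is the wasted time on failed attempts, which is why an up-front estimate is attractive if one can be worked out.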

tardy sparrow
#

i agree, i just think this should be a different PR

#

because it touches more stuff including UI

#

(potentially)

west nebula
#

So in the meantime we just can't generate large things, and this still needs to be tested with xformers (@rose sentinel)?

west nebula
#

And is there a way to estimate how much VRAM a generation will require?

worldly cloak
#

There must be, because it's deterministic, but there are enough layers and variables that I've yet to see anyone work out the precise equation.

But that's what the equations like this in the old attention code were trying to get at:

max_tensor_mb = mem_free_total / 3.3 / (1 << 20)
size_mb = q.shape[0] * q.shape[1] * k.shape[1] * q.element_size() // (1 << 20)
if size_mb <= max_tensor_mb:
    …
west nebula
#

Yeah, hmm.

#

I think it should be trivial if we know the precision and base tensor sizes for 512x512.

#

We know the images are 1x4x64x64, for example.
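A back-of-the-envelope sketch of why large images blow up, using the same product as the `size_mb` line quoted above (`q.shape[0] * q.shape[1] * k.shape[1] * element_size`). It only estimates the attention score tensor for self-attention at the full latent resolution, and the defaults are assumptions: 16 for batch times heads (2 x 8), 2 bytes per element (fp16), and a VAE downscale factor of 8:

```python
def attention_scores_mib(width, height, batch_heads=16, bytes_per_element=2, vae_scale=8):
    # One token per latent pixel; self-attention compares every token
    # with every other token, so the score tensor grows with tokens**2.
    tokens = (width // vae_scale) * (height // vae_scale)
    size_bytes = batch_heads * tokens * tokens * bytes_per_element
    return size_bytes // (1 << 20)


print(attention_scores_mib(512, 512))    # 512
print(attention_scores_mib(1536, 1536))  # 41472
```

Under these assumptions the unsliced score tensor goes from 512 MiB at 512x512 to roughly 40 GiB at 1536x1536, since nine times the pixels means 81 times the memory. This ignores every other allocation, so it is a lower bound rather than a full estimate.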

worldly cloak
#

A few things to look at before investing too much time in optimizing the current implementation for biggest-possible-image:
• Birch-san has a non-xformers implementation of memory-efficient attention, which could be included upstream. This sounds better than the current (non-xformers) attention slicing implementation: https://github.com/huggingface/diffusers/issues/1892. Useful for non-CUDA platforms. No idea whether it's useful for improving the cross-attention-control code.

• If your platform is supported by xformers but you've been avoiding it due to reproducibility issues, this change (which did just make it in to diffusers 0.12!) allows better control over some of those tradeoffs: https://github.com/huggingface/diffusers/pull/2049

west nebula
#

Better ML folks should decide the implementation details, in other words.

#

I remember reading Birch-san's paper, so I'm definitely curious to see it in action.

rugged moth
#

anyone successfully managed to install triton on windows 11 yet?

forest spade
#

Hey 🙂

Very cool that you've merged the big diffusers PR!
Would it make sense if we maybe open a new channel InvokeAI<>Diffusers (maybe here) where I can also try to invite some other diffusers contributors and where you could quickly ping us in case you have more in-detail questions? It's a bit difficult to follow all the discussions here 😅

We mainly use Slack with the diffusers team, so we could also use a Slack channel (there we'll surely be super reactive), but Discord is very good as well. I'll try to check at least every 2nd day.

rugged moth
heavy glacier
#

Yep - Will get it set up

#

#🧨diffusers is born

#

I've also added the <@&1068278564010606611> role to highlight the team

rose sentinel
#

Where is @tardy sparrow 's new memory-efficient slicing code? I can't find its PR.

rose sentinel
west nebula
west nebula
rose sentinel
# west nebula Is there any documentation about the cache location for models and how to set it...

Documented in the CHANGELOG. Should be moved into main documentation as well:

2. The format of the models directory has changed to mimic the
   HuggingFace cache directory. By default, diffusers models are
   now automatically downloaded and retrieved from the directory
   `ROOTDIR/models/diffusers`, while other models are stored in
   the directory `ROOTDIR/models/hub`. This organization is the
   same as that used by HuggingFace for its cache management.

   This allows you to share diffusers and ckpt model files easily with
   other machine learning applications that use the HuggingFace
   libraries. To do this, set the environment variable HF_HOME
   before starting up InvokeAI to tell it what directory to
   cache models in. To tell InvokeAI to use the standard HuggingFace
   cache directory, you would set HF_HOME like this (Linux/Mac):

   `export HF_HOME=~/.cache/hugging_face`
west nebula
#

The default HF location on my system is ~/.cache/huggingface rather than ~/.cache/hugging_face, and it seems that the latest main is putting things there rather than in ROOTDIR/models/diffusers. I do get a message about fetching files whenever I switch models.

#

I don't have HF_HOME set.

#

So if the CHANGELOG is correct, there's a bug.

rose sentinel
#

More seriously, though, the code ought to be downloading into ROOTDIR/models/diffusers. I am not seeing this behavior on current main. Models are getting downloaded into ROOTDIR as wished for.

#

Diffusers has a slightly annoying property of printing "Fetching XX files" and displaying a progress bar even when the files are already cached to disk and available. I wonder if there is a way to suppress this without suppressing the bona fide progress bar that is displayed when the model is actually being downloaded? @forest spade

rose sentinel
rugged moth
#

@rose sentinel Is this the final format of the diffusers model in the model.yaml

```
--- description: 
--- path:
--- repo_id:
--- format:
--- default:
--- vae:
------ path:
------ repo_id:
```
gusty hound
#

is there a reason that the yaml uses 3 spaces for indentation instead of just two?

west nebula
rugged moth
gusty hound
rugged moth
#

that is true

#

isnt this auto formatted?

gusty hound
#

So I wondered if there is a reason for this 🙈

gusty hound
rugged moth
#

ive fixed up the issues in the model manager webui pertaining to the diffusers models. with the above format. I'm not sure if the precision entry is still a thing. Can add it up if it is.

I've also changed the seam steps default to 30 because it seems to solve a lot of low quality issues with the 1.5 and 2.1 models. Inpainting still seems to have minor issues but its a start.

heavy glacier
#

ill give the latest a try

dire gazelle
dire gazelle
west nebula
#

ldm/invoke/globals.py - home = os.getenv('XDG_CACHE_HOME')

dire gazelle
# west nebula `ldm/invoke/globals.py` - `home = os.getenv('XDG_CACHE_HOME')`

yea. i stepped away for a bit, sorry. IMO, we shouldn't be relying on that env var: 1) it only applies to Linux, and 2) very inconsistently even at that (it exists in some GUI sessions on some distros). So, we should just look for HF_HOME (which is also respected by HF libraries), and if not found, default to our good ol' INVOKEAI_ROOT/models. IF the user explicitly wants to use the shared Huggingface cache, they can set HF_HOME.
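
A minimal sketch of the lookup order proposed above, assuming a hypothetical helper named `resolve_model_cache_dir` (not an existing InvokeAI function) and assuming the fallback directory is simply `models` under the InvokeAI root:

```python
import os
from pathlib import Path


def resolve_model_cache_dir(invokeai_root, environ=None):
    """Return HF_HOME if the user set it, otherwise <invokeai_root>/models.

    Hypothetical helper for illustration only; XDG_CACHE_HOME is
    deliberately ignored, as suggested in the message above.
    """
    environ = os.environ if environ is None else environ
    hf_home = environ.get("HF_HOME")
    if hf_home:
        # The user explicitly opted in to the shared Hugging Face cache.
        return Path(hf_home).expanduser()
    # Default: keep downloaded models inside the InvokeAI root directory.
    return Path(invokeai_root).expanduser() / "models"
```

With this order, a user who never sets `HF_HOME` gets everything under the InvokeAI root regardless of what `XDG_CACHE_HOME` contains.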

west nebula
#

I like that approach. There's been a lot of churn in this area.

rugged moth
#

pushing the pr

rose sentinel
west nebula
rugged moth
#

at some point in the future when most of the users have moved to diffusers, i think we can automate the process a bit .

#

tested most edge cases ..seems to work as intended .

rose sentinel
#

I'm in the mood to remove support for XDG_CACHE_HOME if it is a Linux-only thing that is inconsistently used.

west nebula
#

XDG_CACHE_HOME defines the base directory relative to which user-specific non-essential data files should be stored.

#

That doesn't sound like a good fit to me.

rose sentinel
#

@worldly cloak Have you tried to generate images with the diffusers version of the runway inpainting-1.5 model? I'm now getting this type of error when I do a txt2img (not inpainting):

```
Traceback (most recent call last):
  File "/home/lstein/Projects/InvokeAI/ldm/generate.py", line 506, in prompt2image
    results = generator.generate(
  File "/home/lstein/Projects/InvokeAI/ldm/invoke/generator/base.py", line 109, in generate
    image = make_image(x_T)
  File "/home/lstein/Projects/InvokeAI/ldm/invoke/generator/txt2img2img.py", line 48, in make_image
    first_pass_latent_output, _ = pipeline.latents_from_embeddings(
  File "/home/lstein/Projects/InvokeAI/ldm/invoke/generator/diffusers_pipeline.py", line 357, in latents_from_embeddings
    result: PipelineIntermediateState = infer_latents_from_embeddings(
  File "/home/lstein/Projects/InvokeAI/ldm/invoke/generator/diffusers_pipeline.py", line 189, in __call__
    callback(result)
  File "/home/lstein/Projects/InvokeAI/backend/invoke_ai_web_server.py", line 1212, in diffusers_step_callback_adapter
    return image_progress(progress_state.latents, progress_state.step)
  File "/home/lstein/Projects/InvokeAI/backend/invoke_ai_web_server.py", line 997, in image_progress
    image = self.generate.sample_to_lowres_estimated_image(sample)
  File "/home/lstein/Projects/InvokeAI/ldm/generate.py", line 983, in sample_to_lowres_estimated_image
    return self._make_base().sample_to_lowres_estimated_image(samples)
  File "/home/lstein/Projects/InvokeAI/ldm/invoke/generator/base.py", line 204, in sample_to_lowres_estimated_image
    latent_image = samples[0].permute(1, 2, 0) @ v1_5_latent_rgb_factors
RuntimeError: mat1 and mat2 shapes cannot be multiplied (4096x9 and 4x3)

>> Could not generate image.
```
#

My error - this is with txt2img2img.
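
The shape mismatch in that traceback (9 channels multiplied against a 4x3 matrix) suggests the preview step is receiving the 9-channel inpainting input rather than the 4-channel latents. Below is a dependency-free sketch of one possible guard, assuming the first four channels are the image latents. The function name and the factor values in the usage note are placeholders, not InvokeAI's real `v1_5_latent_rgb_factors`.

```python
def latents_to_rgb(latents, rgb_factors):
    """Project channels-first latents to RGB using only the leading channels.

    latents: list of C channels, each an H x W grid of numbers.
    rgb_factors: N rows x 3 columns (N = 4 for Stable Diffusion latents).
    Assumption: any channels beyond the first N (mask, masked-image
    latents) are not part of the image and can be dropped for a preview.
    """
    n = len(rgb_factors)
    if len(latents) < n:
        raise ValueError(f"expected at least {n} channels, got {len(latents)}")
    channels = latents[:n]  # drop the extra inpainting channels, if any
    height, width = len(channels[0]), len(channels[0][0])
    return [
        [
            [sum(channels[c][y][x] * rgb_factors[c][k] for c in range(n))
             for k in range(3)]
            for x in range(width)
        ]
        for y in range(height)
    ]
```

For example, with nine 2x2 channels and placeholder factors `[[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0]]`, the call succeeds where the unsliced multiply would fail.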

#

With the inpainting model, txt2img is working, txt2img2img is crashing, inpainting is producing wacky results as demonstrated in my earlier message, and img2img is producing fuzzy images like this:

west nebula
#

I'm getting this in my output:
DEBUG: choose_autocast() called

#

Is that expected at this point?

rose sentinel
#

Sorry, I've been meaning to remove that. I'll find and remove it at next opportunity.

west nebula
#

np

#

@rose sentinel Did you merge the TI front end into the other TI script? I'm getting this when I try to run it.

rose sentinel
#

To get the front end you now need to pass --gui, as in textual_inversion --gui

west nebula
#

python scripts/textual_inversion.py --gui

#

That's the output from that command - see the 1st line.

rose sentinel
#

Sorry, that script's defunct. I should have removed it.

#

The command is textual_inversion without the .py.

#

Should be on your path after doing a pip install -e .

#

Just made a PR to fix this. Check the updated documentation. I believe I brought it up to date.

#

I also moved the merge script.

west nebula
#

Ah, I launch everything from scripts. I'll have to find textual_inversion.

#

Haven't done the pip install -e . because I'm switching between branches constantly.

rose sentinel
#

See if it is on your path. Supposedly this is the more pythonic way to do it.

#

You can run directly by calling python ldm/invoke/textual_inversion.py

west nebula
#

Nope, not on my path. I'll call it that way and eventually do the `pip install -e .`.

rose sentinel
#

It won't be on your path unless you do the pip install.

#

Let me know if it works when you call it directly, and apologies for leaving the old nonfunctional version in the scripts directory.

west nebula
#

Got it. I'll call ldm/...

#

Doing it the pythonic way makes sense if you're a python person. 🙂

#

Me? I like python scripts/....

west nebula
sour sun
#

Since diffusers now contains @tardy sparrow's module-ised conversion script that lets you import a ckpt file and use it with diffusers, without saving the converted file first, I was thinking it would make sense to add a --force_diffusers option, which would cause non-diffusers models to be loaded through that converter, instead of using the old code. It would make a lot of testing easier. Then when you're ready to get rid of the old code, you could make that setting always-on, so that users can keep using the .ckpt models, and not need to convert them all.
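
A rough sketch of how such a flag could be wired up, assuming plain `argparse`. The `--force_diffusers` flag is only the proposal above, and the loader names returned by `choose_loader` are hypothetical labels, not real InvokeAI or diffusers functions.

```python
import argparse


def build_parser():
    """Build a parser with the proposed (not yet existing) flag."""
    parser = argparse.ArgumentParser(prog="invoke-sketch")
    parser.add_argument(
        "--force_diffusers",
        action="store_true",
        help="load .ckpt models through the in-memory diffusers converter",
    )
    return parser


def choose_loader(model_format, force_diffusers):
    """Return a label for the code path that would load the model."""
    if model_format == "diffusers":
        return "diffusers"
    if force_diffusers:
        # ckpt file converted on the fly, nothing saved to disk
        return "ckpt_via_diffusers_converter"
    return "legacy_ckpt"
```

Making the option always-on later would amount to removing the last branch, so `.ckpt` users keep working without converting anything by hand.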

tardy sparrow
heavy glacier
#

ok

#

so I think the issue in general w/ inpainting (and maybe img2img?) is low img2img strength/steps

west nebula
#

With the inpainting model?

heavy glacier
#

no

#

just in general

west nebula
#

I'd like to try to reproduce the inpainting issues with main. Is there a procedure?

heavy glacier
#
  1. generate a good image
  2. inpaint an area with a low strength (.15, for example)
  3. observe quality
#

I have yet to test the latest main w/ outpainting

#

im going to do a direct workflow comparison/seed/etc. on 2.2.5 and 2.3

west nebula
#

Yeah, I would never inpaint with .15... that's going to generate garbage.

heavy glacier
#

no its not?

#

i do it all the time in 2.2.5

west nebula
#

To be clear, you mean erasing the region?

#

Or masking?

heavy glacier
#

no.

#

Masking.

west nebula
#

So in the past when I've used masking I always turn the inpainting strength to at least 0.3 otherwise I get basically the source image out.

heavy glacier
#

Yes

#

I typically do .15 only when trying to get minuscule variations of an image

#

I was surprised to see the deterioration of quality when doing that in 2.3, which leads me to believe that may be the source/related to the quality issues elsewhere

west nebula
#

Did you try a high-strength (say 0.6) comparison?

heavy glacier
#

yeah i was working w/ higher strength without noticeable issues

#

doing a 1:1 comparison of workflow for both outpainting and inpainting now

#

and will share outputs

west nebula
#

...and settings, please.

heavy glacier
#

yep

#

Hm.

#

Same seed, model on 2.3. wildly different outputs

#

(base generation, not even talking about inpainting)

#

maybe i need to update diffusers

#

to get xformers fix

#

probs. 😛

west nebula
#

Or the boss move of --no-xformers.

#

As soon as I started getting wildly different results, I disabled it.

heavy glacier
#

i need the speed

#

ok - so maybe something got changed alongside the diffusers .12 update - not able to generate the same image (unsure if xformers or diffusers) but inpainting and outpainting both seem dramatically improved.

#

GREAT NEWS

#

im thinking we're nearing RC @rose sentinel !

rose sentinel
#

I hope so. I'm hoping for the installer work to be done by the end of this weekend (@dire gazelle and @gusty hound ), but am unsure of the status of the inpainting/outpainting work. There seem to be issues with the inpainting-1.5 model. The ckpt version is still working, but (1) it doesn't convert cleanly to a working diffusers model, and (2) the huggingface diffusers version has multiple issues described last night. One option is to punt and to advise people to use the ckpt model exclusively.

west nebula
#

So when are we going to attempt to fix the issue with large images and VRAM OOM?

rose sentinel
#

There is a trick for merging the inpainting model with a second model to generate an inpainting version of the second model. However, we need to support the diffusers version of 9-channel models for this to work.

rose sentinel
west nebula
heavy glacier
#

I think we could implement a diffusers inpainting version as a 2.3.1 fix

west nebula
#

The code in @tardy sparrow's PR gets us somewhat there but not entirely - I can't generate the same sizes I did with 2.2.5.

heavy glacier
#

I also think larger image generations can be investigated in 2.3.1

#

Unless we're able to beat installer on fixes

#

I will note, I think I am still seeing some quality quirkiness on 2.3 w/ inpainting

#

but it might just be non SD1.5 models...

west nebula
#

So the errata/caveats on 2.3.0 are somewhat more than past releases.

heavy glacier
#

definitely is a big 'migration' type release

west nebula
#

Sure.

#

But there are a lot of minor issues that we'll have to address so people may not want to make the switch right away.

heavy glacier
#

i think we're better off releasing vs. holding off, but thats not a strongly held opinion

heavy glacier
#

RC doesnt mean "no fixes left to do" as we've all experienced 😅

#

this is after a couple passes with a non SD 1.5 model

#

using SD 1.5 afterwards, seems to actually do inpainting fine... 🤔

#

OK **this** is an interesting discovery.

#

my SD 1.5 is on ckpt generation path

#

im going to try another ckpt model, then switch to diffusers

#

After 10 generations at .15 on a ckpt model.

#

another 10 at .6 (lets not focus on the perfect_hands itself)

#

diffusers model at .15

west nebula
#

Lots of artifacts in the surrounding region.

heavy glacier
#

yep

west nebula
#

Can you show what exactly you have masked?

heavy glacier
#

pretty much the area w/ artifacts

#

(all artifacts are inside mask)

west nebula
#

Trying to see if the artifacts are on the border or inside mostly

heavy glacier
#

one sec ill do a direct comparison w/ mask

#

full image before

#

masked

#

its going outside the mask w/ seam paint

west nebula
#

So that's similar to what I was seeing yesterday with my converted 1.5 non-ema model.

heavy glacier
#

loopbacked img2img at .15

#

issue is in img2img

#

a .75 img2img

#

low strength img2img 🧠

#

WE FOUND IT

west nebula
#

canvas or non-canvas img2img?

heavy glacier
#

non-canvas

west nebula
#

So low-strength only yields problems, or at least ones that are noticeable?

heavy glacier
#

canvas img2img

#

at .15

#

poor guy

west nebula
#

He deserved it, he's too corporate.

#

What sampler?

heavy glacier
#

k_dpmpp_2

west nebula
#

Can you try with ddim?

heavy glacier
#

ddim

#

at .15

#

its the karras

west nebula
#

This is why I never noticed problems to this extreme degree - I almost always use ddim.

heavy glacier
#

LOW STRENGTH

#

KARRAS

#

breakthroughs

west nebula
#

steps?

#

I found one seed in particular that screwed up my image with k_dpmpp_2 @ 30 steps, 0.15 i2i strength - 1304976924

heavy glacier
#

30 steps

#

ill try 60

#

its still a bit borky

#

but better than 30 steps + .15

rose sentinel
# heavy glacier I will note, I think I **am** still seeing some quality quirkiness on 2.3 w/ inp...

The inpainting and outpainting on the non-inpainting models has definitely improved considerably. I haven't done extensive comparisons but it feels like the "success" rate between the ckpt and diffusers models is about the same, given the variability of results. However, I'm a big fan of the inpainting model, which produces consistently good results, so I'm hoping the problem(s) with the diffusers version can be tracked down and fixed. I'm reluctant to dig into it myself, as I'm not very familiar with the internals of the diffusers code, but I could try...

heavy glacier
#

I guess that really just means low step counts generally on karras though, given how strength works
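
To put numbers on that: a small sketch, assuming the count of denoising steps actually run in img2img is roughly `floor(steps * strength)`. That is how the diffusers img2img pipeline derives its starting timestep as far as I know, but the exact rounding in InvokeAI is an assumption here.

```python
def effective_steps(steps, strength):
    """Approximate number of denoising steps actually executed in img2img.

    Assumes steps_run = floor(steps * strength); the tiny epsilon only
    guards against floating-point products landing just under an integer.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return int(steps * strength + 1e-9)
```

Under that assumption, 30 steps at .15 runs only 4 real steps and 60 steps at .15 runs 9, which would fit the "60 is better but still borky" observation above.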

west nebula
#

I wonder if two seeds after one another can do this. 2364683628 followed by 1304976924

#

Can somebody try that with k_dpmpp_2 @ 50 steps, loopback at 0.6?

rose sentinel
west nebula
#

Also, loopback doesn't seem to change the SD metadata to refer to the previously-generated image, is that a bug?

west nebula
#

@heavy glacier Try loopback with your starting seed at 3905153771, 10 iterations, strength something reasonable like 0.6

rose sentinel
rose sentinel
# heavy glacier a .75 img2img

Just for comparison, here's the ckpt version of inpainting-1.5 in which I masked the artist's hand and outpainted downward by 128 pixels. It does pretty well on the outpainting. Not so thrilled with the hand.

heavy glacier
west nebula
#

If you use loopback and generate >1, it'll use a PRNG random walk starting at the seed you put in.
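
A sketch of what "a PRNG random walk starting at the seed" could look like. It assumes each later seed is drawn from a generator seeded with the starting seed; the function name is hypothetical and InvokeAI's actual derivation may differ.

```python
import random


def loopback_seeds(start_seed, count):
    """Return `count` seeds: the starting seed, then PRNG-derived ones.

    Illustration only. The point is that images=10 uses derived seeds,
    so it is not the same as re-running ten times with one fixed seed.
    """
    rng = random.Random(start_seed)
    seeds = [start_seed]
    for _ in range(count - 1):
        seeds.append(rng.randrange(0, 2**32))
    return seeds
```

The sequence is reproducible for a given starting seed, which is why sharing the first seed and the image count is enough to compare runs.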

heavy glacier
#

thats why i was asking

west nebula
#

(Things I learned yesterday.)

heavy glacier
#

I could do 10 manually

#

or do images = 10

#

and would be different results

west nebula
#

images=10

heavy glacier
#

so wanted to confirm

#

kk

#

w/ SD ckpt

#

w/ diffusers

rose sentinel
#

Which sampler?

west nebula
#

Do you notice a degradation between the last two?

#

k_dpmpp_2

heavy glacier
#

Nah it’s pretty much after the first few

west nebula
#

@rose sentinel I'm getting the diffusers inpainting model now, will try soon.

rose sentinel
#

Every so often when using k_dpmpp_2_a I get a patch of artifact. I don't use the non-ancestral one much, but maybe it has a small probability of producing artifacts which add up over multiple loopbacks.

rose sentinel
west nebula
#

Part of me wonders if this is something with how the k_dpmpp_2 sampler works and it amplifies artifacts in the source image and puts traces of them in the destination - so you get a compounding error sort of effect.

#

But then we have the ckpt files that work well, so... hm.

#

Perhaps it's time to call the diffusers folks.

#

@rose sentinel Can you share your parameters and such so I can best reproduce the problem?

rose sentinel
west nebula
#

i2i strength?

#

Patchmatch or tiling? Inpaint replace (if so, what strength)?

#

This is why storing all of the canvas settings in metadata would be super helpful.

#

OK. I see what you mean. Inpainting model is not working well at all.

west nebula
#

Do we have a separate channel for this discussion?

#

In generate.py: inpainting_model_in_use = self.sampler.uses_inpainting_model()

#

This is true for ckpt inpainting models and false for the diffusers inpainting model.

#

I assume that's desirable as that leads to the omnibus generator for the former and the inpainting generator for the latter.

rose sentinel
west nebula
#

I set it to True. It doesn't work. 🙂

#

Also inpaint_replace doesn't work in diffusers inpaint.py.

#

I want to see the canvas debugging output but it doesn't seem to work on my headless box. Any pointers?

worldly cloak
#

I don't think diffusers has an ancestral version of DPM Solver++

#

When troubleshooting things, I strongly recommend starting with a nice simple single-order scheduler like DDIM.

west nebula
#

Omnibus for the old ckpt inpainting model isn't working right, either...

west nebula
#
```
            # TODO: we should probably pass this in so we don't have to try/finally around setting it.
            self.invokeai_diffuser.model_forward_callback = \
                AddsMaskLatents(self._unet_forward, mask, init_image_latents)
        else:
            guidance.append(AddsMaskGuidance(mask, init_image_latents, self.scheduler, noise))
```
#

If I make that change (False and) in diffusers_pipeline.py, the inpainting model does something.

#

Why is that?

west nebula
worldly cloak
#

wait, it works at all if you skip AddsMaskLatents? Oh, yes, it does, because the chunk that makes non-masked operations work with the inpainting model is further down

#

so maybe I messed up the order of channels in AddsMaskLatents

west nebula
#

Well AddsMaskLatents doesn't seem to do much at all. :/

#

I can throw some debugging output in there and see if it gets called.

#

I see __call__ followed by add_mask_channels, so it's working... just not doing what it needs to.

#

That happens every step.

west nebula
worldly cloak
#

the inpainting model needs those extra latent channels passed to it, yes. otherwise it's not an inpainting model.
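
For anyone following along, a dependency-free sketch of what "extra latent channels" means. It assumes the usual 9-channel layout for the runway inpainting UNet (4 noised latents, then 1 mask, then 4 masked-image latents, the order the diffusers inpaint pipeline concatenates them in as far as I know). This stand-in works on plain lists of channels and is not the real `AddsMaskLatents.add_mask_channels`.

```python
def add_mask_channels(latents, mask, masked_image_latents):
    """Concatenate per-sample channels into the 9-channel UNet input.

    Each argument is a channels-first list (one entry per channel).
    Assumed order: latents (4) + mask (1) + masked-image latents (4).
    """
    if len(latents) != 4:
        raise ValueError(f"expected 4 latent channels, got {len(latents)}")
    if len(mask) != 1:
        raise ValueError(f"expected 1 mask channel, got {len(mask)}")
    if len(masked_image_latents) != 4:
        raise ValueError("expected 4 masked-image latent channels")
    return list(latents) + list(mask) + list(masked_image_latents)
```

If the order were wrong, the model would read the mask as image data or the reverse, which is the kind of flub being checked for here.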

west nebula
#

So it was working without AddsMaskLatents but not doing what it should have.

worldly cloak
#

writing a test for it now to see if I flubbed something obvious

sour sun
west nebula
worldly cloak
#

hmm, test case checked out, it seems I did not flub the order of the channels

west nebula
#

So does the function not return what it's supposed to?

worldly cloak
#

it's working as intended. The AddsMaskLatents.add_mask_channels function, that is.

west nebula
#

Is the appropriate amount of noise getting added in?

#

The only other time I've seen weirdness like this is with img2img when there's transparency and there shouldn't be.

worldly cloak
#

which "like this"?

different people have posted a lot of different things about different aspects of inpainting in the last 24 hours and I have a great deal of difficulty following who is talking about what

west nebula
#

See the video I shared above.

#

The first part is what I get if I don't bypass AddsMaskLatents no matter how many steps or which sampler I use.

#

It looks almost like the original tiling or patchmatch.

#

Change the strength to 0.99 and I get the same thing.

#

Original vs. outpainted

#

(Tiles of size 16)

dire gazelle
west nebula
#
```
        """predict the noise residual"""
        if is_inpainting_model(self.unet) and latents.size(1) == 4:
            # Pad out normal non-inpainting inputs for an inpainting model.
            # FIXME: There are too many layers of functions and we have too many different ways of
            #     overriding things! This should get handled in a way more consistent with the other
            #     use of AddsMaskLatents.
            latents = AddsMaskLatents(
                self._unet_forward,
                mask=torch.ones_like(latents[:1, :1], device=latents.device, dtype=latents.dtype),
                initial_image_latents=torch.zeros_like(latents[:1], device=latents.device, dtype=latents.dtype)
            ).add_mask_channels(latents)
```
#

That if block never executes because AddsMaskLatents has already been hooked up by this point.

#

Not sure if that's a problem or not - just an observation.

#

So when I changed the code before to False and..., this code did get executed on each step.

west nebula
worldly cloak
west nebula
#

But (not looking at the code yet) that implies that the infill method is the issue. And I have strength up to 0.99 - almost a complete replacement - and it's not working.

#

I just pulled it and I'll give it a shot. Stand by.

#

Can I choose the new methods in the GUI?

worldly cloak
west nebula
#

That's cool.

#

Loading things up now. I'll try to outpaint Patrick Stewart.

#

Are you using xformers?

#

Weird, that worked.

#

No, no xformers here.

#

So I have an idea. Hang on a sec.

#

Nope. Why is this working at all?

#

I do get a line with blur

#

That's consistent across seeds, too.

#

Could be my prompt, who knows.

worldly cloak
#

yeah, I think I goofed something up with blur. tinkering with that a little.

west nebula
#

Yeah, maybe my seed.

#

Anyway, why do those two methods work while tile and patchmatch do not?

west nebula
worldly cloak
#

huh I thought "blur" was a fairly simple idea but the results are much weirder than I was anticipating. even after fixing my thinko with the order of the layers.

west nebula
#

Also I think the seam painting isn't doing much at all, probably for the same reason that working with tile or patchmatch isn't doing much...

worldly cloak
#

hmm, I wonder if the way the inpainting model was trained, it learned to expect the masked area to be zeroed out like that. That seems plausible.

west nebula
#

Didn't it work with 2.2.5?

#

I never used it but I think a lot of people did.

#

And if it is expecting gray (0x7f? 0x80?), then the seam painting step is a waste of time.

#

And if all of this is the case, we should just use pure gaussian noise at 100% strength for inpainted regions for the inpainting model.

#

I just did some quick testing and mid-grays seem to be acceptable for it to figure things out. Too light or too dark and it does a bad job, same with too colorful.

worldly cloak
#

yeah, that's the conclusion I'm coming to as well. Inpainting model should always be run at full strength with the masked region blanked.

okay, that's a fair reason for it to take a different code path... we just have to make the reasons why a little clearer in the code than naming it "omnibus"
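
A tiny sketch of the "blank the masked region" step, assuming mid-gray `0x80` as the fill (0x7f vs 0x80 was left open above) and an image held as a plain grid of RGB tuples. The helper name is hypothetical.

```python
MID_GRAY = 0x80  # assumed fill value; the thread did not settle 0x7f vs 0x80


def blank_masked_region(pixels, mask, gray=MID_GRAY):
    """Return a copy of `pixels` with masked positions set to flat gray.

    pixels: H x W grid of (r, g, b) tuples.
    mask:   H x W grid of booleans, True where the model should repaint.
    """
    return [
        [(gray, gray, gray) if mask[y][x] else pixels[y][x]
         for x in range(len(row))]
        for y, row in enumerate(pixels)
    ]
```

The blanked image would then go to the inpainting model at full strength, with no tile or patchmatch infill in the masked area.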

west nebula
#

And if inpainting's picked, strength doesn't matter for transparent/erased regions... but it does for img2img, which works? in this model still.

worldly cloak
#

but I am not sure how to communicate this UX-wise.

west nebula
worldly cloak
#

I'm removing the "blur" mode from that PR, as it doesn't seem to be useful.

west nebula
#

So what if we took results from the other methods but scaled them into 0x40-0xbf? Would that yield more texture and be helpful?
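
That rescale would be a simple linear map from 0..255 into 0x40..0xbf. A sketch with a hypothetical helper name, per 8-bit channel value:

```python
def compress_to_midrange(value, low=0x40, high=0xBF):
    """Linearly map an 8-bit value from 0..255 into low..high."""
    if not 0 <= value <= 255:
        raise ValueError("value must be an 8-bit channel value")
    return low + round(value * (high - low) / 255)
```

Black lands on 0x40, white on 0xbf, and mid-gray stays at 0x80, so the infill keeps its texture at roughly half the original contrast. Whether the inpainting model tolerates that much texture is exactly the open question.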

worldly cloak
#

ooooh, hmm, I just realized something.

inpainting model gets those two inputs, the noised latents we're working with and the latents of the original image.

and we're learning that it's important for one of those to have just flat gray in the masked area, but that might not be equally true for both.

west nebula
#

Hmm....

#

So you think the noised latents need gray? Or the other way around?

west nebula
#

And a follow-up: What happens to masked areas with the inpainting model?

#

I think masked areas with inpainting have to be treated the same as erased areas.

west nebula
#

This technique works perfectly for inpainting only with regular SD1.5 as well - fill with gray and set strength to 0.99 for filling erased regions.