#Using -I with a PNG always gives an essentially identical image

upper estuary
#

I've tried many values for --strength and -C and these values have great effect when using JPG images, but no luck with PNG images. Most of the PNGs are exports from either Preview (macOS) or Pixelmator Pro. What could be going wrong?

latent pebble
#

Can you share a command line that causes this problem?

#

And full output?

plucky coral
#

This is really interesting. Can you upload some of the original images that are affected? I’d like to have a look.

#

I wonder if the color mode has something to do with it

#

We have had a few other reports of images not changing much if at all with img2img. Need to figure out why.

latent pebble
#

It sounds like bit depth. I could try to reproduce here.

#

Nope, works fine for me with 16bpp.

loud kraken
#

RGB vs RGBA maybe ?

latent pebble
#

I'll test in a few.

#

It would be nice to have images and parameters from @upper estuary to test elsewhere.

#

@loud kraken That was a great idea. I used an image with transparency (254) and I got the same input as output!

#

Opaque regions don't seem to get drawn on. It looks like it might be a gradient. @plucky coral Is this the new inpainting at work?

latent pebble
#

Here's my init image.

#

"a boat"

plucky coral
#

@hot pelican we may need to rethink implicitly masking based on alpha channel…

#

@upper estuary Can you please share a couple of the initial images you used which had this issue?

#

And the terminal output, including the command

upper estuary
#

invoke> high resolution digital photograph of a phone booth inside a subway station, -W 512 -H 512 --fit -I ./testimage.png -n 9 --grid --strength 0.99 -C 5

#

Sorry was afk for a while but back now... uploaded a sample input, sample output, and console output with prompt.

#

$ identify testimage.png
testimage.png PNG 512x512 512x512+0+0 8-bit sRGB 100757B 0.000u 0:00.001

#

is 8-bit ok? and sRGB?

plucky coral
#

>> Initial image has transparent areas. Will inpaint in these regions.

#

are you comfortable doing some quick python REPL-ing?

upper estuary
#

I'm a n00b with sd / invokeai but I think the issue is not just with inpainting, although not being able to do inpainting is a natural fallout from using jpg as a workaround.

#

sure

#

if you tell me what to do 😉

plucky coral
#

ok just a sec

upper estuary
#

not super up on python but we'll try

plucky coral
#

no worries

#

save this script

#

sorry - first - the goal here is to check the transparency of the image

#

bc if the image has transparency, it will not process correctly

#

this script just counts the pixels with transparency

#

so save it in the same folder as one of the init images

#

edit the 'test.png' to have the name of the init image file

#

activate the conda environment then run python check_transparency.py - it will tell you the # of transparent pixels
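
(The script itself is not preserved in this log. Below is a minimal sketch of what such a check could look like, assuming Pillow is installed. The function name `count_transparent_pixels` is made up for illustration, and this version takes the file name as a command-line argument instead of a hard-coded 'test.png'.)

```python
import sys

from PIL import Image


def count_transparent_pixels(img):
    """Return the number of pixels whose alpha is below 255."""
    # Palette or color-keyed images carry transparency in img.info;
    # converting to RGBA turns it into a real alpha channel.
    if "transparency" in img.info:
        img = img.convert("RGBA")
    if "A" not in img.getbands():
        return 0  # no alpha channel, so nothing can be transparent
    # An 8-bit alpha band has a 256-bin histogram; bin 255 is fully opaque.
    hist = img.getchannel("A").histogram()
    return sum(hist[:255])


# Only run the command-line part when a file name was actually given.
if __name__ == "__main__" and len(sys.argv) > 1:
    with Image.open(sys.argv[1]) as image:
        n = count_transparent_pixels(image)
    print(f"{sys.argv[1]}: {n} pixel(s) with some transparency")
```

Run it as `python check_transparency.py testimage.png`. Any count above zero means the image has at least partially transparent pixels.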

upper estuary
plucky coral
#

ok

upper estuary
#

oh that was without conda environment, if it matters. I had to install pillow which I didn't have.

plucky coral
#

all good

#

so what's happening is img2img is seeing that very small transparent area (12 pixels) and treating it as a mask

#

it is running img2img but ONLY on those 12 pixels

#

so of course there is no change in the result (technically there should be about 12 pixels different)
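
(To illustrate the effect described above: a rough sketch of deriving an inpainting mask from the alpha channel, assuming Pillow. This is not InvokeAI's actual code. The helper name and the "anything below 255 counts as transparent" threshold are assumptions for the example.)

```python
from PIL import Image


def inpaint_mask_from_alpha(img):
    """White (255) where the image is not fully opaque, black (0) elsewhere."""
    alpha = img.convert("RGBA").getchannel("A")
    return alpha.point(lambda a: 255 if a < 255 else 0)


# A 512x512 opaque image with 12 transparent pixels in the top row.
init = Image.new("RGBA", (512, 512), (200, 30, 30, 255))
for x in range(12):
    init.putpixel((x, 0), (200, 30, 30, 0))

mask = inpaint_mask_from_alpha(init)
repainted = mask.histogram()[255]  # number of white (masked) pixels
print(repainted, "of", 512 * 512, "pixels would be repainted")
```

With a mask this small, everything outside it passes through untouched, which is why the output looks identical to the input.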

upper estuary
#

hah

plucky coral
#

so to clarify - the issue is that certain images don't change when you do img2img, right?

upper estuary
#

yes

latent pebble
#

I think this calls for a better error message rather than a change in InvokeAI behavior.

upper estuary
#

but just to verify my assumption -- using the invokeai > prompt with -I means I am doing img2img correct?

plucky coral
#

built-in to img2img is inpainting over transparent areas.

latent pebble
#

"Your image has transparent pixels, so inpainting will be used."

plucky coral
#

>> Initial image has transparent areas. Will inpaint in these regions.

upper estuary
#

I want to move to strictly command line (as in, unix shell command line) but not quite there yet.

latent pebble
#

Yes... maybe something like inpainting only?

plucky coral
#

this message is already in the output, but subtle

plucky coral
#

unfortunately

latent pebble
#

@upper estuary Is there an error message that would have made you immediately stop and understand what was going on?

upper estuary
#

The messages mostly focus on inpainting, and don't give any hint (or I missed it) that the non-inpainting painting will also be, understatement, suboptimal 😄

#

WARNING: Colors underneath the transparent region seem to have been erased.
Inpainting will be suboptimal. Please preserve the colors when making
a transparency mask, or provide mask explicitly using --init_mask (-M).

latent pebble
#

So in an ideal world, what's the error message that would have made you check your image for transparency first?

upper estuary
#

btw while trying to make a simpler test image I found a way to make an image that crashes the process.

plucky coral
#

I don't think there is an easy way around this, unless we drop support for inpainting with a single init image based on transparency and instead always require a mask image for inpainting.

latent pebble
#

NOTE: Your initial image has transparency, and those transparent regions will be inpainted. If inpainting isn't your intent, please make sure you use an image without transparency.

#

Then the WARNING that's there now...?

upper estuary
#

as a user (of image manipulation programs in general) I've always found masks super confusing fwiw. (probably because I haven't worked with them enough) But maybe that's just me. There's an inversion of perspective problem when communicating about them (what is the part you want to mask versus the part you want to mask out versus the part you want to mask in, to throw around some layman's terms to characterize the confusion). Documentation often does not make clear what perspective it's taking. So one has to read the docs with a superposition of both meanings and try to glean from context what is meant. Although from my skimming of the invokeai documentation it looks like it was clearer than most.

#

I'll get to your question about messages, let me think about that.

plucky coral
#

@brittle wyvern

upper estuary
#

"Because your input image contains some transparent pixels, all non-transparent pixels will be passed through unchanged to your output images, remaining identical to the input image."

#

^^ that would make it clear

plucky coral
#

summary:

  • Our img2img code checks for transparency in the init image and if it finds some, it does inpainting on those areas. In this situation, 12 transparent pixels (not detectable to the user when viewing the image) triggered inpainting on those 12 pixels, but img2img was expected on the whole image. The result is an img2img result with no visible changes.
  • Our CLI warns the user but it's pretty subtle and easy to miss.
  • The user provides an init image and expects only img2img. There is no clear indication that inpainting is going to occur, because our code tries to infer what to do based on the init image.

How can we fix this UX? One option is to simply add an additional flag "--inpaint" which is needed to do any inpainting operation.
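
(A rough sketch of what the proposed behaviour could look like, assuming Pillow. The `--inpaint` flag is only a proposal in this thread, and `has_transparency` and `choose_operation` are hypothetical names for illustration, not InvokeAI functions.)

```python
from PIL import Image


def has_transparency(img):
    """True if the image has an alpha channel with any value below 255."""
    if "A" not in img.getbands():
        return False
    lowest_alpha, _ = img.getchannel("A").getextrema()
    return lowest_alpha < 255


def choose_operation(img, inpaint_flag):
    """Pick the operation from the explicit flag, never from the image alone."""
    if inpaint_flag:
        if not has_transparency(img):
            raise ValueError(
                "--inpaint was requested but the init image has no transparent pixels"
            )
        return "inpaint"
    # Without the flag, stray transparent pixels no longer change the mode.
    return "img2img"
```

The point of the design is that a stray transparent pixel can no longer switch modes silently. The image is only inspected after the user has asked for inpainting.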

#

I've tagged hipsterusername for his UX perspective here

#

@glass dawn also could use your feedback - please read the above summary for the situation and advise how we can provide a better experience

upper estuary
#

or... "Transparent pixels found in input. These will be inpainted and no other pixels will be changed."

plucky coral
#

Side-note: On the UI, we will be changing the behaviour so that if you give it an initial image, unless you are in the inpainting tab, it never inpaints

latent pebble
#

I like requiring --inpaint to force it.

plucky coral
latent pebble
#

Yes.

upper estuary
#

What I brought to this was the misimpression that painting and inpainting could be done at the same time. I admit a closer reading of the docs probably would have fixed this.

#

I've avoided the UI. I have a headless setup and no ability to look at the UI.

plucky coral
#

It's web-based, you can access it from any machine with a web browser on your local network

#

But no worries if you wanna stick to CLI - we support that fully also

upper estuary
#

yes ok I could do that.. I want to use CLI (eventually break out of the invokeai> and down to shell) so I can script things.

plucky coral
#

yeah, you'll need to do so to do serious work

latent pebble
#

That'll probably be much easier when the node-based stuff is done.

upper estuary
#

I think adding --inpaint would not help break through my initial misimpression about the feature. I am guessing only from context here that existing -I does not have '--inpaint' as its long form equivalent.

plucky coral
#

Yeah, once we have the node stuff in, you can write a graph in JSON/python dict and many of the complicated workflows are suddenly automatable

upper estuary
#

-I must be --image I guess

#

or --input

plucky coral
#

-I is initial image

#

yep

upper estuary
#

ok off topic but that crasher

#

$ convert xc:none -size 512x512 -geometry 512x512 -fill red -draw 'rectangle 128,128,384,384' +repage red-384-box-512-imagemagick-repage.png

plucky coral
#

yes please haha

upper estuary
#

that was my second attempt, first one did not have the +repage

#

convert xc:none -size 512x512 -geometry 512x512 -fill red -draw 'rectangle 128,128,384,384' red-384-box-512-imagemagick.png

#

let's just not mention +repage that was a random option I tried.

#

above command creates this image which causes a crash

latent pebble
#

So what color mode is that?

upper estuary
#

imagemagick identify says sRGB

#

$ identify red-384-box-512-imagemagick.png
red-384-box-512-imagemagick.png PNG 512x512 512x512+0+0 8-bit sRGB 527B 0.000u 0:00.001

#

I'm not a graphics expert

latent pebble
#

That's indexed color.

#

Images have to be RGB or RGBA for InvokeAI.

#

That one is 3 colors - transparent, red, and white.
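
(A possible workaround on the user side, sketched with Pillow: convert anything that is not already RGB or RGBA before using it as an init image. The helper name `to_rgb_or_rgba` is made up for this example.)

```python
from PIL import Image


def to_rgb_or_rgba(img):
    """Return the image in RGB or RGBA mode, converting indexed/grayscale input."""
    if img.mode in ("RGB", "RGBA"):
        return img
    # Keep transparency if the source has an alpha band or a transparency entry.
    keeps_alpha = "A" in img.getbands() or "transparency" in img.info
    return img.convert("RGBA" if keeps_alpha else "RGB")
```

For example, `to_rgb_or_rgba(Image.open("red-384-box-512-imagemagick.png")).save("converted.png")` would write an RGB or RGBA copy. Note that if the result is RGBA and has transparent pixels, the inpainting behaviour discussed above still applies.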

upper estuary
#

oooh, ouch

#

good to know

plucky coral
#

We should be able to handle that image just fine.

#

Or at least not catastrophically fail

upper estuary
#

So I think I know what I need to know now! Thank you!

#

looking at that page..

#

quick writeup!

#

terminal output is already pretty crowded and you may still miss it

#

this yes

#

but... given the results were not as expected, I did go back and scour the terminal output. So having a bit more info (or some info more targeted at this failure mode) would have helped despite the crowded terminal output.

plucky coral
#

Still, I think it brings up a good issue and we can do better

brittle wyvern
#

I think the simplest is as has been suggested above

#

Make it explicit: --inpaint for inpainting

#

Rather than bundling functionality under the same arg

plucky coral
upper estuary
#

Doing better, I don't know. You might give the world an overdose of awesome.

#

on second thought yeah if it was a clearly different mode, yes that would break through my initial confusion. I didn't know it was either-or. I thought you could enhance/alter the given starting content (resulting in changes to that content, the non-transparent pixels) while simultaneously inpainting in one operation.

plucky coral
#

understandable

#

in order to do that, the code would probably need to be a lot more complicated 🙂

upper estuary
#

or have a hack that adds some random noise pattern replacing transparency

#

maybe... if that would work

#

maybe the existing algorithm will just have a free hand with random patterns, to act as in inpainting, while being relatively more restricted by the less-random pixel regions, resulting in a hybrid.

#

but I'm out of my depth
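
(For anyone who wants to experiment with this idea outside InvokeAI: a small sketch, assuming Pillow, that replaces transparent areas with random noise and returns an opaque RGB image. Whether img2img then behaves like the hybrid described above is untested. The function name is hypothetical.)

```python
import os

from PIL import Image


def fill_transparency_with_noise(img):
    """Return an RGB copy where transparent areas are replaced by random noise."""
    rgba = img.convert("RGBA")
    # One random byte per channel per pixel.
    noise = Image.frombytes(
        "RGB", rgba.size, os.urandom(rgba.width * rgba.height * 3)
    )
    # composite() keeps the first image where the mask is 255, the second
    # where it is 0, and blends the two for values in between.
    return Image.composite(rgba.convert("RGB"), noise, rgba.getchannel("A"))
```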

#

ok thanks again! I need to create a github account for this.

hot pelican
upper estuary
#

Thanks, I respect making the smallest change that could possibly work.

brittle wyvern
#

My suggestion would be to have users receive a descriptive error when passing a transparent image to img2img, and point them to the new command. It may be friction for older users, but I imagine we will have far more new users than old.

plucky coral
#

There was another user who had the same issue but at the time I didn't recognize it.

#

@inner veldt had this same problem on one of her images

#

we couldn't figure it out and chalked it up to SD being weird but I'm 99% sure this was the cause

#

(emma sorry to tag you - we are talking about when you had that image that didn't change when you did img2img)

glass dawn
#

What if you just fill the transparent area when it is obviously small, i.e. remove the transparency? I had problems with img2img when the images had transparency by mistake and only one row of transparent pixels.

plucky coral
#

if we cater to enthusiasts, better to do exactly what they want

#

so explicitly requiring a flag to enable inpainting feels good for this

hot pelican
#

My approach would be to print a warning message about inpainting occurring, and inform the user that they can ignore the transparent pixels using a --no-inpaint option. Otherwise we change well established behavior and documentation.

plucky coral
#

Specifying an init image clearly means we want img2img, but it's not clear that we want inpaint. In fact we may not want inpaint, but we get it anyways, because there is a single transparent pixel in the init image that we didn't know about.

Changing the documentation and behaviour isn't a big deal if it leads to an improved user experience (we aren't trying to satisfy the documentation or necessarily retain a "tradition" of behaviour, we are trying to provide a useful tool). With the vast and complicated system we offer, I think it makes sense to require the user to explicitly request what they want done.

This is certainly the approach the UI will take; that is, init images will be stripped of the alpha channel before being sent to be img2img'd or inpainted, to ensure the user gets what they ask for.
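
(Until something like that lands in the CLI, the alpha channel can be removed by hand before passing the image to -I. A minimal sketch, assuming Pillow. The function name and the white default background are choices made for this example, not what the UI does internally.)

```python
from PIL import Image


def strip_alpha(img, background=(255, 255, 255)):
    """Flatten an image onto a solid background and return an opaque RGB copy."""
    rgba = img.convert("RGBA")
    flat = Image.new("RGB", rgba.size, background)
    # Using the alpha band as the paste mask blends semi-transparent pixels
    # with the background instead of exposing whatever color lies under them.
    flat.paste(rgba.convert("RGB"), mask=rgba.getchannel("A"))
    return flat
```

For example, `strip_alpha(Image.open("testimage.png")).save("testimage-opaque.png")` writes a copy with no transparency, so img2img would run on the whole image.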

hot pelican
#

The node-based CLI is going to have a new command syntax which explicitly invokes txt2img, face restoration, etc. I will defer changing the existing CLI's behavior for now.