#color spaces


night coral
#

[to maybe extract some brain thinking from #🪣garbage-bin to someplace more discoverable]

My first assumption is that it would make sense to convert all your inputs into a single color space when training. I think I'd say pick the one with the widest color gamut. If that doesn't match your target platform for what you're generating, you could use existing color space conversion tools to deal with that, instead of making that the model's problem.

But on second thought, I suppose that in the same sort of way a wardrobe designer or makeup artist might say "don't wear that, it'll look shitty on stage / print / screen" or whatever, you might want to be able to condition the model to never generate out-of-gamut stuff in the first place. And be versatile enough to know how to do that for different gamuts.

Sounds theoretically doable -- maybe even with an adapter on top of a naive foundation model -- but kind of a pain. That would be an interesting training set to see!

west comet
#

How does the VAE play into this?

#

Could you train the VAE to only decode into valid color space?

midnight wind
#

Or train a VAE to spit out things in the XYZ color space.

#

Then we can convert to any other color space from that.

#

Let the converter determine how best to do the conversion into the correct gamut.

simple raft
#

If you are talking about the widest colour space, then arguably ProPhoto is the best for this media?

#

Downside of ProPhoto is that it is a 16bit colour space, but it does give wiggle room if you're doing swings and gains in areas.

midnight wind
#

We tossed around (a while ago) using 16bpp (or higher) outputs from the VAE and gave up because none of the python libraries we examined truly supported 16bpp RGB PNG files.

#

But if you're doing this professionally, why not DNG?

simple raft
#

DNG uses a variant of ProPhoto that Adobe tweaked

#

Technically it is colour space agnostic, but it uses the ProPhoto space to generate the final image

midnight wind
#

DNG doesn't have to use a color profile - files converted from RAW files don't.

#

Just pure sensor data.

#

So I think DNG is a reasonable choice IF we go down this road.

simple raft
#

Yes, but you need to push the data into a colour space to visualise that data

midnight wind
#

Well sure, it's sRGB coming out of the current VAE so why not shoot for 16bpp sRGB?

simple raft
#

That is why Adobe uses ProPhoto internally at least in Camera RAW and Lr

#

sRGB is 8bit only?

midnight wind
#

I think it's a colorspace and doesn't specify bit depth.

simple raft
#

Been a while so my memory is hazy on that

#

But if you're manipulating colour then you should do it in ProPhoto regardless of what your final output is otherwise you'll likely get space clipping.

#

Just a pain. It sounds like there is no true colour space transform library, then?

midnight wind
#

Well not quite.

#

ProPhoto is a color profile that restricts the gamut.

#

We want an unrestricted gamut to manipulate colors.

#

I'm going to page @rugged vector since he's done a ton of research on this... but there's one put out by CIE that replaces CIELAB that allows for all sorts of adjustments and has a direct conversion to/from XYZ.

simple raft
#

OK, ProPhoto covers something like 90% of LAB space?

midnight wind
#

I'm sure there are colorspace transformation libraries but the problem was getting PNG output >8bit.

rugged vector
#

I think XYZ would be maybe the way to go since it converts into the others pretty readily

simple raft
#

How I was taught: PC = sRGB, print = Adobe RGB (CMYK gamut), and photographic/negative scanning was to be done in ProPhoto

rugged vector
#

But yeah there's no point since currently vae output is just srgb pixels

#

just converting it wouldn't gain anything

#

but if the model, or vae, could output xyz it could cover a wider gamut than srgb

simple raft
#

It comes down to final intent then?

#

VAE/Model assume that images are not to be printed?

rugged vector
#

That's my impression/understanding, anyway

simple raft
#

I'll shut up, probably not contributing much to the convo 😄

midnight wind
#

VAE is just what it was trained on.

#

No, no, we're all learning

#

VAE was just trained on RGB images, I don't think anybody even thought about color spaces.

#

Just... yay, we're ML researchers! We made a hongus!

simple raft
#

To be fair, the people that do probably ran for the hills

#

People who truly understand colour theory are few and far between

night coral
# midnight wind Or train a VAE to spit out things in the XYZ color space.

I agree that it seems like it would make sense to design a VAE with XYZ inputs instead of RGB.

Although after thinking about it for a minute, I realized that since the conversion between XYZ and sRGB is so arithmetically simple, it might as well be doing some such transformation in the most superficial layer of its computations.

I guess the difference might be in how it learns where the boundaries are. cuz RGB all three components have equal range and they get clamped pretty hard, whereas XYZ is shaped pretty differently if I recall correctly.
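
To make the "arithmetically simple" point concrete: once the sRGB gamma curve has been removed, going to XYZ is a single 3x3 matrix multiply. The following is only a minimal sketch, assuming the commonly published sRGB/D65 coefficients; the function name is made up for illustration, and plain tuples stand in for image tensors.

```python
# Linear-light sRGB -> CIE XYZ (D65) is one 3x3 matrix multiply.
# Coefficients are the commonly published sRGB/D65 values (an assumption here,
# not something quoted in the thread).
SRGB_TO_XYZ = [
    [0.4124564, 0.3575761, 0.1804375],
    [0.2126729, 0.7151522, 0.0721750],
    [0.0193339, 0.1191920, 0.9503041],
]

def xyz_from_linear_srgb(rgb):
    """Convert one linear-light (gamma already removed) sRGB triple to XYZ."""
    return tuple(sum(m * c for m, c in zip(row, rgb)) for row in SRGB_TO_XYZ)

white = xyz_from_linear_srgb((1.0, 1.0, 1.0))   # the D65 white point
red = xyz_from_linear_srgb((1.0, 0.0, 0.0))     # equals the first matrix column
```

The white point comes out with X, Y and Z at different maxima, which is one way to see the "shaped pretty differently" remark: the 0..1 RGB cube becomes a skewed box in XYZ, so the three components no longer share a range.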

midnight wind
#

Yeah, XYZ is not at all like RGB.

simple raft
#

Yeah, RGB clips hard and that is why you often get banding in blue skies

rugged vector
#

it really is the worst of all choices

#

lol

#

well other than being very convenient and standardized

midnight wind
#

Quantization

simple raft
#

You'll see it more converting from a wide gamut

rugged vector
#

Yeah there are a lot of colors that the wide gamut can support that get clipped off into srgb

midnight wind
#

And people ask why I don't purge my RAW files...

rugged vector
#

pretty much every other color space is wider than srgb lol

#

other than hsv/hsl which are just cylindrical transformations of rgb

simple raft
#

Precisely, and when you work with medium format RAW and try to convert to sRGB you'll weep blood in pain

midnight wind
simple raft
#

To be fair, sRGB was designed for a much simpler time in the mid-90s to standardise PC display output which was CRTs

midnight wind
#

That's a dual-ISO DNG file.

#

So a huge range, more than the camera can natively do in a single exposure.

simple raft
#

Adobe came along and created Adobe RGB for printing, then Kodak came up with ProPhoto which has remained the de facto wide gamut for many applications

night coral
#

...they make dual-ISO files? 🤯

simple raft
#

Yes

#

Essentially a stacked DNG

#

and a DNG is a modified TIFF file

midnight wind
simple raft
#

God I am dragging up some ancient archaic stuff from my ACE days 😂

night coral
#

my knowledge of camera capture formats is becoming rapidly obsolete as the world moves away from "this is a capture of a single instant of sensor data" to the wonderful world of computational photography.

midnight wind
#

So this one isn't actually stacked, it's a lot of bits.

simple raft
#

Which Canon is that? As when Canon released dual ISO images they captured two images and blended the lower and upper 10% of the clipping range from a lower ISO?

night coral
simple raft
#

Adobe Certified Expert, Photoshop and InDesign

midnight wind
simple raft
#

OK, ML did things differently compared to the official Canon method

#

Canon saves 10% of the shadows and 10% of the highlights from a stop lower ISO, and then blends that data to form a RAW file.

midnight wind
#

Planar Configuration : Chunky

simple raft
#

That's getting off topic

midnight wind
#

Anyway the file has a lot of bits.

#

And its gamut is wider than sRGB, even wider than my normal RAW files (>8bpp)

#

So I don't even know how one would go about training a VAE to do this task.

#

The UNET works with its own view of the world that started out as sRGB.

simple raft
#

This is beyond what I understand, I barely understand how things string together

midnight wind
#

I think feeding out-of-gamut RGB inputs into a UNET may not work as intended.

#

Or it may!

simple raft
#

Wing it and see?

night coral
#

all the Stable Diffusion computations (unet + vae) run in 16-bit (unless/until some quantization methods become popular), so I don't think we'd be starved for precision.

I think it's largely that the current implementation clamps the range indiscriminately, both on the way into VAE decoding and when converting the decoded RGB to the file format for saving.
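
A rough way to check the precision claim, assuming "16-bit" here means IEEE half precision (float16): count how many distinct float16 values fit in 0..1 and compare with the 256 levels of an 8-bit channel. The helper name is made up for this sketch; it only uses the standard library's `struct` half-float format.

```python
import struct

def half_from_bits(bits):
    """Interpret a 16-bit pattern as an IEEE 754 half-precision float."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]

# Non-negative halves increase monotonically with their bit pattern,
# and 1.0 is 0x3C00, so patterns 0x0000..0x3C00 cover exactly [0, 1].
ONE = 0x3C00

levels_fp16 = ONE + 1            # distinct float16 values in [0, 1]
levels_8bit = 256                # distinct values in an 8-bit channel

# Spacing just below 1.0, where float16 is at its coarsest within [0, 1].
step_fp16 = 1.0 - half_from_bits(ONE - 1)
step_8bit = 1.0 / 255
```

Even at its coarsest, the float16 step is several times finer than an 8-bit step, so the quantization happens when saving rather than in the network. This does not carry over to bfloat16, which keeps far fewer mantissa bits.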

simple raft
#

Another way of looking at it, how do you train a VAE on RAW data images?

midnight wind
#

We've seen what happens when it's fed noise that isn't gaussian or is skewed... no bueno

rugged vector
#

well it's in the latent space at that point though, so i dunno how fairly it translates exactly. I mean.. yeah.. it was trained on latents that came from srgb images currently. so that'd be a limitation in that it'd only output srgb latents. but if a vae could take a higher color space image, and translate that to srgb-ish latents, then take srgb latents, back to the higher color space

#

it would be the vae "faking it"

#

but it might work?

#

ideally i think i agree that one would train all of sd on latents that came from higher color space images entirely from the ground up not using srgb at all

midnight wind
#

This is what the UNET looks like when you've fed it things it doesn't like.

night coral
simple raft
#

Well if all you are doing is a transform on the colour data then it shouldn't be an issue?

rugged vector
#

yeah it's just seeing -1 to 1 values

#

it doesn't care if they're rgb, xyz, lab, maybe even lch?

#

but it'd have to train on them

simple raft
#

The other question is how much resources will all this ask of the model or would it be negligible?

night coral
rugged vector
#

yeah, then the latents after the vae would look very similar i'd think

#

the trick is getting a vae to take us from latents to and from the other color space i guess

night coral
rugged vector
#

then people could probably fine tune the underlying model very quickly with latents derived from the higher-color-space supporting vae

simple raft
#

Or spit out 16bit TIFF files or would that be another project?

midnight wind
#

That would be a different node after the Latents to XYZ node.

#

At that point we could have dozens of output nodes that understand different formats and colorspaces.

night coral
# simple raft Or spit out 16bit TIFF files or would that be another project?

that's easy from a computational perspective. the only "hard" part is that when we last looked, we found Python library support for high-bit-depth images is inexplicably crappy. somewhere around here there's a link to a PIL issue that's like ten years old.

though I guess if TIFF is the target, that format is simple enough to write or wrap an existing C API.

midnight wind
#

Curious: What happens if we give a current VAE out-of-range data?

#

i.e. What if we feed it non-clipped converted-to-sRGB?

rugged vector
#

I was wondering something similar, if you just fine tuned a model feeding it whatever color space straight through the existing vae

#

and just hope for the best to come out the vae at the other end

midnight wind
#

Well not even fine-tuning.

rugged vector
#

lol

night coral
#

worst case, it craps out with NaN like the SDXL VAE does sometimes.

midnight wind
#

What if we give it values outside of the expected range of inputs.

rugged vector
#

hmmmm

midnight wind
#

And accept values outside of the expected range of outputs.

#

There's nothing stopping us from taking VAE output as it exists today and scaling it to see if the stuff that gets clipped is worth anything.

night coral
#

Yep, that's where I'd start.

rugged vector
#

I'm not sure what you mean about being clipped though

#

the vae output currently is values between -1 and 1

#

those are rescaled to 0-1 which goes directly into a pil image as srgb values

midnight wind
#
```python
# copied from diffusers pipeline
# (assumes `vae` and `latents` already exist in the surrounding pipeline code)
from diffusers.image_processor import VaeImageProcessor

latents = latents / vae.config.scaling_factor
image = vae.decode(latents, return_dict=False)[0]
image = (image / 2 + 0.5).clamp(0, 1)  # denormalize -1..1 to 0..1; anything outside is clipped here
# we always cast to float32 as this does not cause significant overhead and is compatible with bfloat16
np_image = image.cpu().permute(0, 2, 3, 1).float().numpy()

image = VaeImageProcessor.numpy_to_pil(np_image)[0]
```
rugged vector
#

ah the clamp

midnight wind
#

Right there they're clipped

#

Yeah

rugged vector
#

huh

midnight wind
#

So instead of that, shrink the range down a lot to see what's in there.

#

image / 8

rugged vector
#

yeah that's ... interesting heh

midnight wind
#

for example

#

Trying it out now

#

AHHHH my workflow has multiple l2i, so that's going to ruin everything

rugged vector
#

if it turns out there is actually stuff in there, there are yet still ways to compress the gamut just in srgb space

midnight wind
rugged vector
#

other than just clipping it, to choose preserving chroma vs lightness

#

hmmm

night coral
#

yep. I think that's a very approachable place to start and it's kinda been on my long list of stuff-to-mess-with-someday.

midnight wind
#

So that image above has things out of the normal range, but it's clearly compressed around 0.

rugged vector
#

indeed

midnight wind
rugged vector
midnight wind
#

There's not a lot there but I think that the information is decent.

rugged vector
#

there are a lot of other ways to do it

midnight wind
#

Well right now I just want to see what's there.

#

It's basically the lightning and some meatball shadows.

rugged vector
#

yeah

midnight wind
#

We need a good prompt to generate some crazy bright stuff.

rugged vector
#

the difference would be basically, we could preserve some color in some of the parts that are currently now becoming pure white and pure black

#

and instead make them almost-white and almost-black but the correct hue
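
The idea of keeping the hue instead of letting a channel slam into the limit can be sketched very simply. This is not the function linked in the thread (its internals are not shown here); it is a much cruder stand-in that pulls an out-of-range linear-light RGB value toward gray of the same luminance until it fits. The Rec. 709 luminance weights and all names are assumptions for illustration.

```python
# Rec. 709 / sRGB luminance weights for linear-light RGB (they sum to 1).
LUMA = (0.2126, 0.7152, 0.0722)

def clamp01(x):
    return min(1.0, max(0.0, x))

def desaturate_into_gamut(rgb):
    """Pull an out-of-range linear RGB triple toward gray until it fits 0..1."""
    if all(0.0 <= c <= 1.0 for c in rgb):
        return tuple(rgb)                      # already in gamut, untouched
    y = clamp01(sum(w * c for w, c in zip(LUMA, rgb)))
    t = 1.0                                    # fraction of chroma to keep
    for c in rgb:
        if c > 1.0:
            t = min(t, (1.0 - y) / (c - y))
        elif c < 0.0:
            t = min(t, y / (y - c))
    # Blend between gray (t=0) and the original colour (t=1).
    return tuple(clamp01(y + t * (c - y)) for c in rgb)

hot = (1.3, 0.2, 0.1)                          # red channel overshoots
clipped = tuple(clamp01(c) for c in hot)       # naive per-channel clamp
mapped = desaturate_into_gamut(hot)
```

The per-channel clamp simply discards the overshoot, while the blended version keeps the luminance and the ordering of the channels and gives up some saturation instead. Which trade-off looks better is exactly the chroma-versus-lightness choice mentioned above.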

night coral
#

I bet you could generate some out-of-gamut magentas too

midnight wind
#

full range fit to 0..255 vs. standard output

rugged vector
#

what did you use to get it into 0..255? just linear scaling?

midnight wind
#

Found the lowest value for all RGB (not per-channel) and the highest, then linear scaling.
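
The same min/max fit is easy to express in code. A minimal sketch, assuming the decoded image is available as a list of float RGB triples; the function name and the sample values are made up.

```python
def fit_to_8bit(pixels):
    """Linearly map all values so the global min -> 0 and global max -> 255.

    `pixels` is a list of (r, g, b) float triples. One min and one max are
    shared by all three channels, so the colour balance is not shifted.
    """
    flat = [c for px in pixels for c in px]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0                    # avoid dividing by zero on flat images
    return [tuple(round((c - lo) / span * 255) for c in px) for px in pixels]

# Made-up decoder output in the nominal -1..1 range, with some overshoot.
decoded = [(-1.3, -1.0, 0.0), (0.0, 1.0, 1.3)]
scaled = fit_to_8bit(decoded)
```

Because nothing is clamped, the overshoot survives, at the cost of lower contrast for everything that was already in range.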

rugged vector
#

i wanna try that function i linked

midnight wind
#

I didn't code it, this is all Photoshop

#

Just experimenting

rugged vector
#

haha

rugged vector
midnight wind
#

latent.py

#

Line 689

#

Maybe outrun aesthetic will yield some out of range stuff

night coral
#

you could also mess with that vae.config.scaling_factor just for fun.

midnight wind
#

This is / 2.25 instead of / 2

rugged vector
#

agh i have to figure out this torch device stuff now to use my function there haha

#

it doesn't like me multiplying a torch tensor on gpu with a matrix that's on the cpu

midnight wind
#

This should be the transform that happens inside of the original function, so you can see what was lost.

rugged vector
#

hm mostly some highlight details i guess

midnight wind
#

That.

#

Those red pixels right in the middle!

#

So I think we need some prompt that can do it.

rugged vector
#

jeez if i lose that red hue from those pixels

#

the whole vibe of the image is ruined man

night coral
#

the seven pixels that really tie the whole thing together

midnight wind
#

So I think this is my concern - a lot of work for no perceptible change.

#

I got an image with a few more...

night coral
#

how about those cases that people were clamoring for per-step thresholding for?

midnight wind
#

Well that's latent thresholding

#

Totally different beast

night coral
#

partly? I assumed the out-of-threshold latents were a problem because they were producing out-of-gamut pixels

midnight wind
#

Nope, it was a technique to bring out more detailing.

#

Replacing out of range lixels with random values

rugged vector
#

stock vs with adaptive gamut clipping

#

oh yeah, worth all those cycles

night coral
#

sure, but that was almost entirely in-gamut already, right?

rugged vector
#

hahah yeah i realize that's about the literal worst possible test image

#

i'm working on it

night coral
#

multiply your CFG by 4, that usually manages to mess things up for me 😉

rugged vector
#

hey alright here we go

{EDIT: this is wrong because I didn't remove/reapply srgb gamma correction, the actual results are minimally different: #1155186572141023315 message}

#

clipped, adaptive gamut, difference calculated between the two

#

i can tell.. absolutely no difference. but there is mathematically one i guess

#

in those imperceptible shades of blue heh

#

oh i can see it flipping back and forth in discord now i guess

#

with adaptive gamut, the fringes of the blue lightning bolt on the left side are less washed out. slightly.

midnight wind
#

So I think a 16bpp target file format may be more useful than eliminating clipping...

rugged vector
#

this one uses a higher "alpha" parameter in the gamut compression algorithm and is slightly even more chroma-preserving, along with its difference between the above b.png

{EDIT: this is wrong because I didn't remove/reapply srgb gamma correction, the actual results are minimally different: #1155186572141023315 message}

#

but yeah it'd be good to save it in a nicer format

#

currently i think it also uses png compression at 0.6?

midnight wind
#

Now is the VAE spinning out nearly quantized data? Next investigation.

rugged vector
#

ah ok

#

but anyway that lightning image is an extreme case and really did have quite a few pixels pushed farther into srgb space

#

but.. i still can't really tell a whole lot looking at it even then lol. not a huge diff

midnight wind
#

So now instead of dividing by two, I'll multiply by two to see if 16bpp would even help or if it's coming out of the VAE already quantized (or close)

#

Yeah, there's data there... so 16bpp may be useful.

rugged vector
#

another thing PIL doesn't support

#

haha

midnight wind
#

Right, we may just have to not use PIL

#

DNG is really not a bad choice.

rugged vector
#

what i like about pil: it can load color profiles thanks to the Little CMS 2 library that it includes as PIL.ImageCms

midnight wind
#

Another experiment... what if we apply a log space to the output

#

Canon Log comes to mind

#

log-adjusted that becomes...

#

For reference, here's the original...

#

Here's the difference...

midnight wind
#

That's what you get if you look at the differences amplified A LOT

#

Well this has been fun. There's something to be gained in the details by going to 16bpp but it doesn't seem the current end-to-end is going to benefit from a wider gamut.

rugged vector
#

not much anyway lol

#

who knows why it's even outside -1..1 in the first place, that's an interesting question i think

midnight wind
#

neural nets don't clip outputs

#

and the image is going through >1 network

#

The VAE is one, the UNET is another.

rugged vector
#

i guess it just "is" heh

#

i really don't know what the best thing to do is or if clamp is just good enough, since i can't really ascribe any meaning to it beyond just being out of bounds

#

if it was trained on images that were values between -1 and 1 exclusively

midnight wind
#

I think the things out of bounds are valid but it really is trying hard to not give things out of bounds.

rugged vector
#

hmm

midnight wind
#

So there's not going to be an advantage moving to another color space IMO - just getting the full bit depth out.

rugged vector
#

oh incidentally my above adaptive gamut stuff was not quite right since i forgot to remove and then reapply the srgb gamma correction before doing it. and after doing it the correct way.. the differences are truly imperceptibly small

midnight wind
#

Yeah

rugged vector
#

i told myself yesterday to rename that function so i wouldn't forget it has to take a linear srgb argument, but I didn't and then screwed it up again.

#

There is the true difference comparison between a: clipped to 0, 1, and d: using adaptive gamut compression, with image.png showing the difference

midnight wind
#

But is it perceptible?

rugged vector
#

not on my monitor 😂

midnight wind
#

Not on my phone

simple raft
#

You would need something with a wider gamut?

#

sorry, luminance

rugged vector
#

maybe better retinas

#

heh

#

i dunno if they are really perceptible differences to the average human eyeball

midnight wind
#

But the stock UNET inside of models isn't really spitting out anything that would benefit from a color model other than sRGB.

rugged vector
#

also that yeah, heh

#

the differences may as well be just noise i think

midnight wind
#

If we fine-tuned on out of range RGB input then maybe...

rugged vector
#

lol i say go all the way and fine tune on xyz or cielab or something

simple raft
#

Even if it does not use the space now, it does give it something to grow into?

midnight wind
#

Still need to get that into a post-VAE representation that's meaningful.

rugged vector
#

yeah it's too radically different from rgb haha

#

i don't think a fine tune would cut it

#

so we want the vae to take xyz or lab or whatever color space of the day, and then put it into out-of-range srgb , and encode those to latents

#

out of range rgb, that just sounds undefined and awful, so i'll pass heh

midnight wind
#

Yeah but

#

The transforms are valid.

#

Even if they don't make sense

rugged vector
#

it is an intriguing idea

midnight wind
#

Could try with i2i of some out of gamut image.

rugged vector
#

hah i don't currently have a way to get one, and have been lazily relying on PIL to convert to and from LAB space. so I'm not sure if PIL will let me take a LAB image and put it into out-of-range srgb or not. I don't have time today to test it out manually lol

midnight wind
rugged vector
#

First image did the .clamp(0., 1.) on the vae output. the second one was using adaptive gamut compression. not really a direct comparison anymore because of the multi-stage upscaling process but there appear to be more details in the shadows and highlights on the second one.

#

no upscaling, raw output of one stage. clamped vs adaptive. in this case clamped looks better. i think it's not meaningful haha. clamp(0, 1) seems fine. Or the alternative would be to make it an advanced option to clamp or use a gamut compression technique. Something like 0=clamp, above zero applies the compression. But it may be meaningless to do so.

midnight wind
#

First looks more orange

rugged vector
#

Yeah it's better those colors just get right up against the edge of the gamut in that case haha

#

let them clip

midnight wind
#

I think I have something, hang on

#

Changed the scaling to 2.5 from 2

rugged vector
#

i don't think that's gonna help haha

midnight wind
#

Well take a look

#

Here's the original

rugged vector
#

the values start out from 0..1 and originally get flipped to -1..1

#

and the factor of 2 is used to flip them back from -1..1 back up to just 0..1

midnight wind
#

Nono, I get it

#

But it also scales the output in this case, so you can see what's out of range

#

prior to clamping

#

This is what I've been adjusting all along
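
What changing the divisor does can be shown with a few numbers. A minimal sketch with made-up sample values and hypothetical helper names: dividing by 2 maps -1..1 onto 0..1, and a larger divisor squeezes a wider input range into 0..1 before the clamp.

```python
def denormalize(v, divisor=2.0):
    """Map decoder output to display range; divisor=2 sends -1..1 to 0..1."""
    return v / divisor + 0.5

def clipped_fraction(values, divisor=2.0):
    """Share of values that a following clamp(0, 1) would alter."""
    out = [denormalize(v, divisor) for v in values]
    return sum(1 for x in out if x < 0.0 or x > 1.0) / len(out)

# Made-up samples spanning roughly -1.3..1.3.
samples = [-1.3, -1.1, -0.5, 0.0, 0.5, 1.1, 1.3]

stock = clipped_fraction(samples, 2.0)    # 4 of 7 values fall outside 0..1
wider = clipped_fraction(samples, 2.6)    # the whole -1.3..1.3 range fits
```

A divisor of `d` keeps inputs in `-d/2..d/2`, so 2.25 or 2.5 only reveals part of the overshoot, and the in-range image gets flatter as the divisor grows.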

rugged vector
#

hmmm

midnight wind
#

So in this case, there's quite a bit out of range.

rugged vector
#

seems like there are multiple options, all of which have drawbacks. or we could just save directly to a format with higher gamut support by converting the out of range srgb into something else and then saving it

#

then someone with a better monitor than mine can see if there's a difference

midnight wind
#

Sure.

#

Could force that for a one-off test right there in latent.py

rugged vector
#

all glory to nodes

#

but unfortunately i'm dipping out for the rest of the day and i'm not really sure if PIL will do that conversion without complaining, so it might take entering the formulas manually to do so. still something to investigate

night coral
#

y'all just need to get more aggressive with coloring. there are definitely details in those oranges that are being crushed out. at the extremes there are values at ±1.3

night coral
#

here we go, using your gamut_clip_tensor function instead of the linear scaling.

so it's definitely generating some detail in out-of-gamut areas.

rugged vector
#

Yeah I do think it is, but I'll just point out that the function expects RGB without the gamma correction, or "linear-light RGB", which you can get with linear_srgb_from_srgb and convert back with srgb_from_linear_srgb. Without that I think it's not quite accurate
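
For reference, the gamma removal and reapplication can be sketched with the standard sRGB transfer function. The two function names come from the message above, but the bodies here are an assumption, not the code used in the thread. Mirroring the curve for negative inputs is also an assumption, made so that out-of-range values survive the round trip; `apply_in_linear_light` is a hypothetical wrapper.

```python
def linear_srgb_from_srgb(v):
    """Remove the sRGB gamma curve from one channel value."""
    a = abs(v)
    lin = a / 12.92 if a <= 0.04045 else ((a + 0.055) / 1.055) ** 2.4
    return lin if v >= 0 else -lin        # assumption: mirror for negatives

def srgb_from_linear_srgb(v):
    """Reapply the sRGB gamma curve to one linear-light channel value."""
    a = abs(v)
    enc = a * 12.92 if a <= 0.0031308 else 1.055 * a ** (1 / 2.4) - 0.055
    return enc if v >= 0 else -enc

def apply_in_linear_light(pixel, fn):
    """Run `fn` (for example a gamut-mapping step) on the linear-light triple."""
    linear = tuple(linear_srgb_from_srgb(c) for c in pixel)
    return tuple(srgb_from_linear_srgb(c) for c in fn(linear))
```

Wrapping the gamut step this way makes it harder to forget the conversion: the caller passes gamma-encoded values in and gets gamma-encoded values back.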

night coral
#

oh that does make a difference, ty

night coral
#

I think my takeaways so far are:
• The current U-net and VAE can generate things out-of-gamut — though they do mostly stay in their normal range even when CFG is pushed.
• That's evidence in favor of the idea of fine-tuning for a wider gamut.
• A gamut-mapping function like gamut_clip_tensor does produce good results on saturated highlights. It can be subtle and maybe it rarely kicks in — but it's also so cheap in comparison to the overall cost of VAE decode that I'd use a latents-decode with gamut-mapping enabled, and probably even leave it on by default.

midnight wind
#

I wonder if Invoke should be saving out TIFF files since those allow for 16 bit precision as well as different compression types (through PIL's libtiff support nowadays).

#

TIFF support also includes ICC profiles.

rugged vector
#

That's not a bad idea, even if stashed away as an option to enable somewhere

midnight wind
#

I think that PIL supports LAB images with TIFF as well.

#

We need to test these things obvs.

rugged vector
#

@midnight wind here's a tiff i saved from a LAB mode image

#

i dunno if it did it right, maybe if you have photoshop if you could check haha

#

gimp complains about it and then opens it in rgb but i think gimp may not let me edit in lab

midnight wind
#

Yeah I can open in PS

rugged vector
#

does it seem to be in LAB mode?

midnight wind
#

Gotta start PS first. 🙂

rugged vector
#

haha

midnight wind
#

Yeah, no warnings or anything, LGTM.

#

BUT

#

7,373,524 bytes

#

Can you try again with compression=something?

rugged vector
#

haha

#

sure

midnight wind
#

Not sure which is best here, there are a lot of options.

#

I'm assuming you have libtiff installed.

rugged vector
#

I dunno, i haven't installed anything that invoke didn't install

#

It doesn't seem to be any smaller when i put compression="tiff_lzw"

#

also only 8 bit makes me sad since it's twice the size of the 8 bit png heh

midnight wind
#

Try zstd

rugged vector
#

that gave me an encoder error when saving

midnight wind
#

libzstd needed!

#

Of course

rugged vector
#

haha

#

well i can't figure out how to make the tiffs 16 bit either

#

reading the pil docs i don't see mention of it in the TIFF section

midnight wind
#

OK what about... tiff_adobe_deflate

#

I dunno, I'm beginning to think that PIL is just sloppy

rugged vector
#

5.7mb

midnight wind
#

You know we're going down this road...

rugged vector
#

Yeah.. Well let me see if I can convert one of those out of range RGB images to lab using PIL then save it as a TIFF in lab mode

#

If that won't work we can always try to just make the lab conversion outside of pil and then use pil to write it

midnight wind
#

PIL just doesn't support 16bpp.

rugged vector
#

i feared as much

#

i knew it was the case for png at least

midnight wind
#

@night coral Shared some stuff a while back about it.

#

So there are other libraries to deal with png.

rugged vector
#

well gimme one sec to try this conversion

midnight wind
#

So another simple way to solve this problem is to split the data up so the low-order byte of RGBA goes in one file and the high-order byte goes in another and then we treat the two files as a pair that must stick together.

#

Load them up as two 8-bit images, convert to arrays, and then multiply the high-order by 256 and add the low order.
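
The split-and-recombine arithmetic is straightforward. A minimal sketch on plain integer lists, with hypothetical function names; a real version would operate on image arrays and write each byte plane to its own 8-bit file.

```python
def split_16bit(values):
    """Split 16-bit samples into (high_bytes, low_bytes), each 0..255."""
    high = [v >> 8 for v in values]
    low = [v & 0xFF for v in values]
    return high, low

def join_16bit(high, low):
    """Rebuild 16-bit samples: multiply the high byte by 256, add the low."""
    return [h * 256 + l for h, l in zip(high, low)]

samples = [0, 255, 256, 4660, 65535]     # 4660 == 0x1234
high, low = split_16bit(samples)
restored = join_16bit(high, low)
```

The high-byte file on its own is a viewable 8-bit image; the low-byte file looks like noise but carries the extra precision.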

#

Of course we lose all internal PIL support for nodes doing this.

rugged vector
#

well PIL did save the LAB image that i converted from the out-of-range rgb

#

i didn't see any warnings

#

let me give you a clamped one in png vs a lab tiff

midnight wind
#

Oh! Let's see it!

rugged vector
#

@midnight wind there you are

midnight wind
#

That uh

#

I don't think that's right.

rugged vector
#

whoops

#

let me see haha

midnight wind
#

Looks like it wasn't clamped in LAB space?

rugged vector
#

ahh

#

haha

#

well

#

a and b aren't bounded

#

the L maybe was out of bounds

#

lemme see

#

uhh

midnight wind
#

They are all OOB

rugged vector
#

yeah haha

#

well anyway, the a and b are not restricted to the range -1, 1

#

in lab

midnight wind
#

Right, but in PIL they still have to occupy 8 bits.

rugged vector
#

haha yeah, i don't know what's happening there

#

i guess the conversion from out of range rgb into lab isn't working right

#

unsurprisingly

midnight wind
#

PIL was an unfortunate design choice, in retrospect.

rugged vector
#
```python
pilimage = image.squeeze().float().to('cpu')
pilimage = pil_image_from_tensor(pilimage, mode="RGB")
labimage = pilimage.convert("LAB")
labimage.save("C:/dwr/testlab.tiff")
```
#

PIL is always painful but it does a lot of nice things... it's basically the de facto standard for python i'd say

midnight wind
#

Yes.

rugged vector
#

i wish it had a lot more features like... yeah 16 bits

midnight wind
#

It's literally hardcoded into Invoke, unfortunately.

#

It's like early Shareware

#

Great at some things but I'm not paying for this less-than-fully-featured thing.

rugged vector
#

Last night I determined that it is possible to get the raw data out of a PIL LAB image, then load that data back into a pil lab image with Image.frombytes(), and it still looks to be right. But I have not yet figured out the exact way PIL expects those LAB bytes to be arranged, as when I use the functions from RGB->XYZ->LAB that I coded up based on Wikipedia (heh, yes, plenty of things to go wrong here in any case), the colors still aren't right in the image that PIL saves. Still...WIP

#

I think cv2 has functionality for this, gonna try that later

rugged vector
#

@midnight wind You had mentioned that Invoke saves uncompressed png's. I can't confirm that digging through the source. It appears to just call image.save(image_path, "PNG", pnginfo=pnginfo), where pnginfo is created with PIL.PngImagePlugin.PngInfo, and does not set the compression level.

PIL's docs say:

compress_level
ZLIB compression level, a number between 0 and 9: 1 gives best speed, 9 gives best compression, 0 gives no compression at all. Default is 6

From this I conclude that Invoke is using compression level 6.
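
PNG compresses its pixel data with zlib (DEFLATE), so the compression level only trades file size against time and never changes the decoded bytes. A minimal sketch using the standard library's `zlib` directly on a made-up byte pattern, as a stand-in for what happens inside a PNG encoder:

```python
import zlib

# Stand-in for raw pixel rows: a repetitive byte pattern (made up).
raw = bytes(range(256)) * 64

stored = zlib.compress(raw, 0)     # level 0: no compression at all
default = zlib.compress(raw, 6)    # level 6: the documented PNG default quoted above
best = zlib.compress(raw, 9)       # level 9: best compression, slowest

# Every level decompresses to exactly the same bytes.
round_trips = [zlib.decompress(blob) == raw for blob in (stored, default, best)]
```

Level 0 output is slightly larger than the input because of container overhead, while levels 6 and 9 shrink it; all three round-trip bit for bit.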

midnight wind
#

No, they're lossless compression, sorry.

rugged vector
#

But I'm confused how, if they're saved with image.save() and it doesn't explicitly set compress_level to 0

midnight wind
#

Yeah I dunno!

rugged vector
#

Also when I resave them in gimp losslessly, the files are larger

#

Can you be sure they're lossless? When I get on my lunch I can try to do an A/B test

#

by adding that kwarg

midnight wind
#

Pretty sure they're lossless but yeah, go investigate!

rugged vector
#

hehe gotta keep life interesting

rugged vector
#

@midnight wind In case you're interested, I did the test.

Here are four image files. The first is straight out of invoke image.save(image_path, "PNG", pnginfo=pnginfo).

The second (default.png) is saved with image.save("C:/dwr/default.png", "PNG", pnginfo=pnginfo)

The third (lossless.png), with image.save("C:/dwr/lossless.png", "PNG", compress_level=0, pnginfo=pnginfo)

And, the fourth (lossless.TIFF) with image.save('c:/dwr/lossless.TIFF', "TIFF", compression="tiff_adobe_deflate")

Bonus 5th image (image.png) of the difference shown in GIMP. Can't tell much to look at it but the histogram is all over the place.

From this I must conclude that Invoke is using the default PNG compression level of 6 for all images saved.

[EDIT: The difference image is truly nil]

#

interestingly the losslessly compressed tiff file also is slightly smaller than the lossless png, heh [but i guess also missing the metadata which would be part of it at least]

#

So, before worrying about 16 bit precision, perhaps we should allow the user to enable lossless saving at least. Maybe a TIFF option would be a great way to go about providing it also?

rugged vector
#

@midnight wind Apologies for another ping but I guess I should have done more research since everything I read does indicate PNG is always lossless compression. But GIMP did calculate a difference for whatever reason between the images... so I dunno what is going on really heh [EDIT: GIMP did not calculate a difference, it was entirely my mistake. Sorry everyone!]
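
For anyone who wants to sanity-check the "PNG is always lossless" part without GIMP: PNG compresses its pixel data with zlib (DEFLATE), and compress_level only changes how hard zlib works, not what comes back out. A minimal sketch using only the standard library, with a made-up gradient standing in for image bytes (this is an illustration, not Invoke's or PIL's actual code path):

```python
import zlib

# Hypothetical stand-in for raw pixel bytes: a 64x64 single-channel gradient.
raw = bytes((x + y) % 256 for y in range(64) for x in range(64))

sizes = {}
for level in (0, 6, 9):
    packed = zlib.compress(raw, level)
    # Every level must decompress back to exactly the same bytes.
    assert zlib.decompress(packed) == raw
    sizes[level] = len(packed)

print(len(raw), sizes)  # level 0 only stores the data, so it ends up larger than the input
```

So level 0 means "stored, not squeezed", while 6 and 9 trade time for size; none of them throw pixel data away.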

midnight wind
#

Cool! So what is the filesize of the others vs. TIFF?

rugged vector
#

All smaller than TIFF except the totally uncompressed PNG, heh. I will test a bit later to see if there's any way I can squeeze more images on my hard drive if I use compress_level=9

#

In between looking into trying to write these oversized rgb values as lab somehow. Though I suspect they might be out of gamut even in Lab space

rugged vector
#

Okay, I took the raw out-of-range rgb pixel values straight from the vae decoder and converted them into LAB manually. At which point the lightness for some of them was still below 0 or beyond 100 as defined as pure black or pure white for the standard D65 illuminant and observer. So at that point I still had to clamp the L* values in the range 0..100. But I did no other clamping, and then loaded the Lab numpy arrays into PIL (with fromarray) and saved them as LAB mode TIFF's.
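
Roughly what that manual conversion looks like, as a pure-Python sketch for a single pixel. Assumptions on my part: the usual sRGB-to-XYZ matrix and D65 white point for the 2 degree observer, the sRGB curve mirrored around zero so negative inputs stay finite, and a hypothetical function name `srgb_to_lab`. This is not the actual node code, just the shape of it:

```python
import math

D65_WHITE = (0.95047, 1.0, 1.08883)  # reference white, 2 degree observer
SRGB_TO_XYZ = (
    (0.4124564, 0.3575761, 0.1804375),
    (0.2126729, 0.7151522, 0.0721750),
    (0.0193339, 0.1191920, 0.9503041),
)


def srgb_to_linear(v):
    """sRGB decode, mirrored around zero so out-of-range inputs stay finite."""
    mag = abs(v)
    lin = mag / 12.92 if mag <= 0.0404482362771082 else ((mag + 0.055) / 1.055) ** 2.4
    return math.copysign(lin, v)


def lab_f(t):
    """The CIELAB companding function, including its linear toe."""
    delta = 6 / 29
    if t > delta ** 3:
        return t ** (1 / 3)
    return t / (3 * delta ** 2) + 4 / 29


def srgb_to_lab(rgb, clamp_lightness=True):
    """Convert one (possibly out-of-range) sRGB triple to L*, a*, b*."""
    r, g, b = (srgb_to_linear(c) for c in rgb)
    x, y, z = (m[0] * r + m[1] * g + m[2] * b for m in SRGB_TO_XYZ)
    fx, fy, fz = (lab_f(c / w) for c, w in zip((x, y, z), D65_WHITE))
    lightness = 116 * fy - 16
    if clamp_lightness:
        # Only L* is clamped; a* and b* are left alone, as described above.
        lightness = min(100.0, max(0.0, lightness))
    return lightness, 500 * (fx - fy), 200 * (fy - fz)


print(srgb_to_lab((1.0, 1.0, 1.0)))                         # white: L* near 100, a* and b* near 0
print(srgb_to_lab((1.2, 1.2, 1.2), clamp_lightness=False))  # L* lands above 100
```

The second call is the case from the decoder: an RGB value past 1.0 gives a lightness past 100, which is why the L* clamp was still needed.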

Here are three images saved in such a way, but I don't presently have any reliable way to open them properly in LAB mode and check to see if they indeed contain meaningful color outside the srgb gamut. @midnight wind if you're not totally sick of this topic and might have a moment to check in Photoshop? 🙏

midnight wind
#

I can check, sure!

#

What am I checking for? 😂

rugged vector
#

haha

#

well... is there something where it shows loss if converting to rgb

#

I was under the impression in certain circumstances there's a view that shows what pixels are getting clipped but IDK really haha

midnight wind
#

Yeah

#

I can see what's out of sRGB gamut, one sec.

#

It's a rough paste, apologies.

rugged vector
#

wow quite a lot

#

I wonder if it actually makes a meaningful difference on a fancy monitor that does wider than srgb hehe

midnight wind
#

I'm sure, assuming it can display things in LAB color

#

This is the CMYK gamut warning

rugged vector
#

haha

midnight wind
#

Color blindness (protanopia-type)

#

I would imagine sunset, appalachian trail would give some extreme out of gamut stuff.

rugged vector
#

hmm yeah, i just threw those three together on my lunch break earlier but i can try and make some more examples later. maybe we can get gogurt enjoyer to look at them on the fancy macbook display

#

Just what I don't need, some reason to actually want a better monitor, heh

midnight wind
#

ProPhoto RGB...

#

Just a little bit in the darkest areas

rugged vector
#

Interesting how much SD spits out that's just way outside of allowable limits

#

I had to clip a decent bit of lightness even with these

#

though a lot less than if they're in srgb clearly lol

midnight wind
#

So the way to test whether or not it's useful information is to lighten/darken until those areas are in gamut.

rugged vector
#

well that's what the gamut mapping gamut_clip_tensor function thing I borrowed from the Oklab blog does.

#

And it stops the colors from blowing out as fully saturated, in exchange for preserving some details in the highlights/shadows, on my srgb monitor

#

so the most extreme colors end up looking a bit less extreme, but there is more lightness detail

midnight wind
#

So how do the oranges look when processed with that?

rugged vector
#

I think i have that one one sec

rugged vector
#

ahhh

#

that one has film grain on it

#

Well.. do they both I guess? IDK haha

#

I guess that's it, anyway. just different really [I think this one really might have more filtering done so I need to get a better comparison later when I'm at my pc]

midnight wind
#

The difference must be there but I cannot perceive it on my sRGB monitor

#

And this is the crux of the problem

rugged vector
#

Yeah it's really minimal lol

midnight wind
#

Well I don't know if it's minimal because I cannot see the LAB image!

#

I see an sRGB conversion of it.

rugged vector
#

Yeah haha

#

Plus there are so many little variables even in how the standards are coded up with various software

#

from what I can tell

midnight wind
#

Yeah

#

So we need the delta between that image converted to LAB vs. the source LAB

#

That'll tell us how off they are from a numerical perspective
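
One simple way to put a number on that delta is CIE76 delta E, which is just the straight-line distance between two Lab triples. A minimal sketch with made-up pixel values (the helper names are hypothetical, and fancier formulas like CIEDE2000 exist if this turns out too crude):

```python
import math


def delta_e76(lab1, lab2):
    """CIE76 colour difference: Euclidean distance between two Lab triples."""
    return math.dist(lab1, lab2)


def mean_delta_e(source_pixels, roundtrip_pixels):
    """Average delta E over two equally sized sequences of Lab pixels."""
    deltas = [delta_e76(p, q) for p, q in zip(source_pixels, roundtrip_pixels)]
    return sum(deltas) / len(deltas)


# Made-up values: the first pixel survives the round trip, the second loses chroma.
source = [(50.0, 10.0, 10.0), (60.0, 80.0, 70.0)]
roundtrip = [(50.0, 10.0, 10.0), (60.0, 77.0, 66.0)]
print(mean_delta_e(source, roundtrip))  # 2.5
```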

#

Anyway, we're fairly confident at this point that SD is spitting out data that's outside of the sRGB range and that it's decent data.

night coral
#

In the samples I tried earlier, the gamut-clip-tensor results were most noticeable on the highlights. like there would be part that just looked flat, and the gamut clipping would say "we don't have an orange this bright, so let's de-saturate it to a higher-lightness color" and so you'd see lighter less-saturated curves in those formerly flat places.

rugged vector
#

That was my experience as well really

midnight wind
#

I think for a single generated image it doesn't matter so much, but when you're throwing things into an i2i pipeline (or 4, as I am), the losses really have an impact.

rugged vector
#

Yeah that's where I start to care as well, and what motivates me, the thought of all those stages lol

midnight wind
#

I would like to see InvokeAI internally have a different representation of images than sRGB and then the Save Image node could do the conversion from that to a final form in a number of different ways.

#

HOWEVER all of the image processing stuff uses PIL.

#

At 8 bits, because PIL.

night coral
rugged vector
#

I got the Okhsv and Okhsl implemented in torch as part of my image layer blend and image hue adjustment nodes code

midnight wind
#

My head hurts

#

I wonder if there's a way to adapt cv2 or some other imaging library to have a PIL-like interface for the purposes of Invoke and user nodes.

rugged vector
#

cv2 does have some image conversion and saving features but I don't think they are better than what PIL has

midnight wind
#

"some other imaging library" then? 😄

#

Even just arrays of RGB that aren't 0..255

rugged vector
#

I really dunno the way, why can't PIL just get with 16 bit

#

haha

midnight wind
#

@night coral Where's that PIL 16 bit link you shared?

night coral
midnight wind
#

Yeah, so that's been a thing for a while. We should not internally be using this library if we want to do things in higher bit depth non-sRGB space.

rugged vector
#

It is def not reassuring for years of "well.. nothing new really" and the most recent update just being someone mentioning "yeah we're purging PIL everywhere from our professional workflow" heh

midnight wind
#

In favor of arrays and cv2

#

So one step is to make the image storage service store a number of different types of images.

rugged vector
#

Yeah that could work, we could still have PIL around for color management into any 8 bit output formats and then introduce something beyond that

#

Pretty serious change to consider though lol

midnight wind
#

Well adding support for other things wouldn't necessarily be a breaking change.

midnight wind
#

Average red (255,0,0) and green (0,255,0). What's the resulting color?

#

Well, if you do linear math, you get (127,127,0). But if you account for a gamma of 2.2, then you get (186,186,0).

rugged vector
#

gamma 2.2 is the simplified way/approximation which shows up many places but the standard function as far as i can tell is 2.4 if the channel value (0..1 normalized) is greater than 0.0404482362771082, and value/12.92 otherwise

midnight wind
#

Well we can get nitpicky but the point is that everyone is doing it wrong. 😄

rugged vector
#

haha

midnight wind
#

So if I'm trying to determine a mean color of a region of an image, I can't just take the average.

#

I have to take ((a^gamma + b^gamma + c^gamma + ...) / num_samples)^(1/gamma)
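
As a quick sketch of that formula, assuming a plain power-law gamma of 2.2 and 8-bit channel values (the function name is just for illustration):

```python
def gamma_mean(values, gamma=2.2):
    """Mean of 8-bit channel values, computed in approximately linear light."""
    linear = [(v / 255) ** gamma for v in values]
    mean_linear = sum(linear) / len(linear)
    return 255 * mean_linear ** (1 / gamma)


red, green = (255, 0, 0), (0, 255, 0)
# Pair up the channels: (R, R), (G, G), (B, B).
mixed = tuple(round(gamma_mean(pair)) for pair in zip(red, green))
print(mixed)  # (186, 186, 0)
```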

rugged vector
#

well.. you just wanna convert the whole rgb to the de-gamma-fied version

#

and then convert it back when you're done, is the way i'd go
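
Something like this, as a pure-Python sketch of "decode, average, re-encode" using the piecewise sRGB curve rather than a flat 2.2. The encode-side breakpoint 0.0031308 is the usual published value and the function names are made up for the example:

```python
def srgb_decode(v):
    """Gamma-encoded sRGB in 0..1 -> linear light."""
    return v / 12.92 if v <= 0.0404482362771082 else ((v + 0.055) / 1.055) ** 2.4


def srgb_encode(lin):
    """Linear light in 0..1 -> gamma-encoded sRGB."""
    return lin * 12.92 if lin <= 0.0031308 else 1.055 * lin ** (1 / 2.4) - 0.055


def linear_light_mean(pixels):
    """Average 8-bit sRGB pixels channel by channel in linear light."""
    result = []
    for channel in zip(*pixels):
        mean_linear = sum(srgb_decode(c / 255) for c in channel) / len(channel)
        result.append(255 * srgb_encode(mean_linear))
    return tuple(result)


print(linear_light_mean([(255, 0, 0), (0, 255, 0)]))  # roughly (187.5, 187.5, 0.0)
```

So for the red/green example the piecewise curve lands a step or two above the 186 you get from a straight 2.2 power, which gives a feel for how far apart the two conventions are.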

midnight wind
#

OK that's another way 😄

rugged vector
#

I dunno, i like converting to something like HSL though and using the circular hue coordinate for finding an average color

midnight wind
#

I wonder how that would affect my CMYK halftone node.

rugged vector
#

like, just find the angle in between
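
A rough sketch of what I mean, using the stdlib colorsys HLS conversion and a circular mean so that hues on either side of 0/360 don't average out to cyan. The function name is hypothetical, and it ignores saturation and lightness entirely, so greys (which have no meaningful hue) would skew it:

```python
import colorsys
import math


def mean_hue(rgb_pixels):
    """Circular mean of the HLS hue of 0..1 RGB pixels, returned in degrees."""
    sin_sum = cos_sum = 0.0
    for r, g, b in rgb_pixels:
        hue, _lightness, _saturation = colorsys.rgb_to_hls(r, g, b)
        angle = 2 * math.pi * hue
        sin_sum += math.sin(angle)
        cos_sum += math.cos(angle)
    # atan2 of the summed unit vectors gives the direction "in between".
    return math.degrees(math.atan2(sin_sum, cos_sum)) % 360


print(mean_hue([(1, 0, 0), (0, 1, 0)]))  # about 60 degrees, i.e. yellow
```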

midnight wind
#

You mean not HSL but something more color-accurate?

rugged vector
#

Hah well, HSL maps with RGB cleanly

#

but.. yeah

midnight wind
#

Yeah, we just... need to build our own imaging library.

rugged vector
#

i hacked up your cmyk halftone node to use the "C:\Windows\System32\spool\drivers\color\RSWOP.icm" profile for CMYK

#

I think (?) i'm getting less of the "pure vivid 255 green" that i was seeing from the way PIL does it by default

midnight wind
#

What do Linux users do??

rugged vector
#

haha

midnight wind
#

I am going to try the gamma correction, since most monitors are at 2.2 I'll just leave it at that.

#

My monitor is calibrated to 2.2 even!

#

That's what everything PC out there expects.

rugged vector
#

well .. idk haha

midnight wind
#

Even if it is closer to 2.4

rugged vector
#

i think the format is basically made using the formula i listed

#

and then the monitors just display it wrong at 2.2

#

but who knows what else happens under the hood in modern os/hardware/monitor/etc

midnight wind
#

de facto standards and all

#

What happens? As little as possible.

#

Things just assume 2.2

#

Except for Macs which are... 1.8? IIRC

rugged vector
#

I just think the srgb color space is encoded with "roughly" 2.2 gamma but if you're doing manipulations on the non-gamma-corrected data you would use the srgb function and not the monitor's approximation

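#

```python
import torch


def srgb_to_linear(srgb_tensor):  # name assumed; the original def line did not survive the paste
    """Get linear-light sRGB from a standard gamma-corrected sRGB image tensor"""

    # Power-law segment, used above the breakpoint
    linear_srgb_tensor = torch.pow(torch.div(torch.add(srgb_tensor, 0.055), 1.055), 2.4)
    # Linear segment, used at or below the breakpoint
    linear_srgb_tensor_1 = torch.div(srgb_tensor, 12.92)
    mask = torch.le(srgb_tensor, 0.0404482362771082)
    linear_srgb_tensor[mask] = linear_srgb_tensor_1[mask]

    return linear_srgb_tensor
```
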
#

idk what will happen as a result of using one vs the other in practice or which one is more correct after everything is said and done on the calibrated monitor

midnight wind
#

It's all perceptual, too.

rugged vector
#

haha that's the funniest thing about all these standards

#

What a mess.

midnight wind
#

Yup.

#

You think we'd be past this at this point.

#

You're right that we need to use a color profile in the CMYK conversion but I hesitate to include the windows ICC profile with my node.

rugged vector
#

yeah haha

#

I was looking at the ICC website which has a bunch of profiles but I don't see anything that looks suitable for this. There's a link to adobe's page which i'm sure has an unfortunate license on it

#

maybe something there would work though

midnight wind
#

My head hurts... just... no

#

I cannot.

#

My halftoning node will remain shit.

rugged vector
#

maybe one day a solution will present itself

midnight wind
#

I get better results applying a gamma adjustment of uh 1.4? As in np.pow(image, 1/1.4).

#

But it's still messed up. Need to be per-channel honestly

#

I could always make my own lookup function by reverse engineering a photoshop RGB->CMYK conversion.

#

I'm sure each channel has its own dot gain, etc.

midnight wind
#

So now we "just" need to interpolate

#

I may have screwed up somewhere.

rugged vector
#

it seems to work, i need to test some images with some greens and see how it handles them

#

Where it says

separate from RGB to PRMG using Perceptual or Relative Colorimetric with BlackPoint Compensation.

I believe that is basically done with PIL.ImageCms like:

        cms_profile_srgb = PIL.ImageCms.createProfile("sRGB")
        cms_xform = PIL.ImageCms.buildTransformFromOpenProfiles(
            cms_profile_srgb, cms_profile_cmyk, "RGB", "CMYK", renderingIntent=1, flags=0x2400
        )
        image = PIL.ImageCms.applyTransform(image_rgb, cms_xform)
#

renderingIntent=1 is relative colorimetric, and flags=0x2400 turns on blackpoint compensation (0x2000) along with high-resolution precalculation (0x0400)

#

There's one I made using your halftone node and that prmg icc profile and the code above. Seems safe from the overly-bright greens problem that PIL's roundtrip CMYK conversion has

midnight wind
#

So as soon as nodes get resource storage... ... ...

rugged vector
#

haha yeah i just had to put it in my invoke root

midnight wind
#

Here's a potential other way to do things. I made an RGB ramp.

#

Then I used PS to convert it to CMYK and I have component images for C, M, Y, K... and now I "just" need to fit a function

#

for each channel...

#

UGH

#

Any ideas?

#

each output channel is some relationship of input channels that can be approximated using a Runge–Kutta method, if I remember my college math right...

rugged vector
#

I wish I could help here because it seems fascinating but.. slightly unable to keep up with your approach lol. It sounds like you are on a good track though with that approach?

midnight wind
#

C = w1r + w2r^2 + w3r^3 + ... + x1g + x2g^2 + x3g^3 + ... + y1b + y2b^2 + y3b^3 + ... + z

rugged vector
#

I guess those four images are in terms of K from left to right, and each channel top to bottom?

midnight wind
#

Where w1 w2 x1 y1 z etc are constants

#

Images are C, M, Y, K conversions of the above RGB image.

rugged vector
#

ohhhh I had to actually open them

#

the preview made them look like single 2d gradients haha

midnight wind
#

Anyway we should be able to resolve all of those constants somehow, but the math is a little out of my league.

rugged vector
#

Yeah I hate to say it, but me too lol

midnight wind
#

This ends up being a solution to way too many equations simultaneously, so perhaps interpolating between things in this table is the way to go.

#

140608 numbers per channel.
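
Worth noting that 140608 is 52**3, so presumably that's a 52-step grid along each of R, G and B. Interpolating inside a grid like that is usually done trilinearly. A small pure-Python sketch, where `fake_cyan` is a hypothetical stand-in for one measured channel (NOT Photoshop's actual conversion) and the table is only 5 steps per axis to keep it readable:

```python
def build_lut(func, n):
    """Sample func(r, g, b) on an n x n x n grid over the unit cube."""
    step = 1 / (n - 1)
    return [[[func(i * step, j * step, k * step) for k in range(n)]
             for j in range(n)] for i in range(n)]


def lerp(a, b, t):
    return a + (b - a) * t


def trilinear(lut, r, g, b):
    """Interpolate the table at an arbitrary point with r, g, b in 0..1."""
    n = len(lut)

    def locate(v):
        # Grid cell index and fractional position inside that cell.
        pos = min(max(v, 0.0), 1.0) * (n - 1)
        lo = min(int(pos), n - 2)
        return lo, pos - lo

    (i, tr), (j, tg), (k, tb) = locate(r), locate(g), locate(b)
    # Collapse the 8 surrounding samples along b, then g, then r.
    c00 = lerp(lut[i][j][k], lut[i][j][k + 1], tb)
    c01 = lerp(lut[i][j + 1][k], lut[i][j + 1][k + 1], tb)
    c10 = lerp(lut[i + 1][j][k], lut[i + 1][j][k + 1], tb)
    c11 = lerp(lut[i + 1][j + 1][k], lut[i + 1][j + 1][k + 1], tb)
    return lerp(lerp(c00, c01, tg), lerp(c10, c11, tg), tr)


def fake_cyan(r, g, b):
    return 1.0 - r  # placeholder relationship, for illustration only


cyan_lut = build_lut(fake_cyan, 5)
print(trilinear(cyan_lut, 0.3, 0.9, 0.1))  # close to 0.7
```

With the real data you'd fill one table per output channel from the C, M, Y, K images instead of calling a function, and no constants need to be solved for.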

frigid yoke
#

hi people. What do you want to do with color spaces in the end? Cause most of the images from the web are sRGB already.

midnight wind
#

I don't think we're at a decision point yet.

#

I think ultimately storing images as 8-bit PNGs is fine as long as we also store them as another format that has more bit depth and accuracy, and can load that other format back in to latents or do transforms on it, etc.

rugged vector
# frigid yoke hi people. What do you want to do with color spaces in the end? Cause most of th...

I guess it's fair to say we're kind of examining two separate things here. One being color spaces, motivated by the exploration of srgb values outside the range 0..1 in the results of a vae.decode(), which are currently just clipped off. But it is at least possible to either map some of the values back into gamut first, or to go into another color space and utilize more of those values (such as in a CIELAB mode image), with some examples above in this thread.

The other issue under discussion is the fact that PIL does not support images with channels in a precision higher than 8-bit. On the github issues page for pillow (PIL) about adding support for channel depth beyond 8-bit, one of the most recent comments is:

Just here to echo @KeygenLLC 's comment above, RGB/RGBA images in 16 or 32-bit float is nearly mandatory for VFX, and we've been removing any usage of PIL we can find in tools we bring in, in favor of dealing directly with np.arrays and cv2 functions for manipulating the data as images. It's not as convenient as what PIL offers but 8bit is a deal breaker.

(We're also working with the EXR file format)
However there doesn't seem to be any promising sign that such support is coming to PIL in the near future.

midnight wind
#

I believe a doable option is to store image dimensions with numpy arrays of CIELAB data as well as PIL images, and to provide new interface to allow for getting/setting/converting that image data to sRGB/HSV/etc. so nodes can ultimately migrate to that. Importing images should respect the color space in the image (before converting to CIELAB and sRGB) or assume sRGB if no color space is provided. Saving to PNG is never going to be great but it has huge support and we shouldn't drop that.

#

My $0.02

rugged vector
#

Well cielab is also based on an observer angle which is I think almost always 2° for these things, and an illuminant/color temp for the white point which for screen I think is usually D65/~6504 K and for print is D50/~5000 K. Whereas if we used XYZ arrays they are independent of that

midnight wind
#

Yeah. Could use XYZ! The point is that we should still provide interfaces for PIL Images since a very small subset of people actually care about this stuff or care that images are right.

#

"I made a picture of a pooping monkey!" <== Doesn't care about color rendition

rugged vector
#

Haha oh absolutely. I mean that's ... me. I have no personal use for really anything else.

night coral
#

there's another (poor) option: PIL does support single-channel 32-bit images (modes I for int32 or F for float32), so it's plausible to find out which PIL file format handlers can cope with those, and then save each channel to its own file.

Advantages:

  • can still do things within PIL on the Invoke side.
  • sort-of straightforward to load in whatever image manipulation step comes next.
  • can probably use a standard image viewer to tell at least kinda what you're looking at (even if only a single channel)
  • probably get at least some compression out of it

Disadvantages:

  • multiple files per result, ugh.
    • need application-specific code to split them out and compose them back together again.
    • harder to do file management when this group of files needs to all be moved together.
  • will miss some opportunities for compression that come from information that's redundant between channels.

so... possible, but not necessarily better than an npz file or safetensor.

night coral
#

really, the thing to do is to hire the Pillow maintainer (or me) to add high-bit-depth support to PIL.

It's one of those things that would probably even be quite affordable if spread over all the people who have wanted that to happen in the last ten years. But funding open source is hard. 😞

simple raft
night coral
#

idk how money works, and I haven't dug in to the PIL internals at all, but I'd spitball more than $1k but not more than $10k USD. i.e. $0.01 million. Or $0.00001 billion.

simple raft
#

cheers was trying to figure out how these things work

midnight wind
#

PIL needs even more help. TIL their RGB->CMYK conversion is terribly broken

rugged vector
#

Hmm... somehow I had been wrongly thinking that CMYK fits within sRGB but that is indeed not the case. There should be colors in the range of cyan/green which are printable in CMYK but not displayable in sRGB, so it may be that printed results could be improved by starting from one of these LAB mode images. (?)

midnight wind
#

This, right?

rugged vector
#

when I looked at one of those charts before it was one on wikipedia that had like 5 gamuts superimposed and I was reading it wrong. but yeah, precisely that

simple raft
#

In order of gamut (and age): sRGB -> CMYK -> ProPhoto (16bit)

#

This is why ProPhoto is preferred by photographers, and AdobeRGB/CMYK is preferred by designers

midnight wind
#

It seems that ProPhoto is a logical choice given those other two.

#

As long as you're working in a space that makes sense given the task, everything should ideally be encoded in a wider gamut than sRGB - preferably XYZ or CIELCh with floating point numbers or a sufficient bit depth.

#

(or some other one...)

#

But encoding what we see as three 8-bit numbers that don't cover the full gamut is insufficient.

rugged vector
#

CIELCh is just the polar coordinate representation of the same thing in CIELab, and I believe it's almost always stored as L* and a*, b* channels
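
To make the "same thing, polar coordinates" point concrete, a tiny sketch (function names are mine): chroma is the length of the (a*, b*) vector and hue is its angle, with L* passed through untouched.

```python
import math


def lab_to_lch(lightness, a, b):
    """Lab -> LCh: chroma is the radius, hue the angle in degrees."""
    chroma = math.hypot(a, b)
    hue = math.degrees(math.atan2(b, a)) % 360
    return lightness, chroma, hue


def lch_to_lab(lightness, chroma, hue):
    """LCh -> Lab: back from polar to rectangular coordinates."""
    angle = math.radians(hue)
    return lightness, chroma * math.cos(angle), chroma * math.sin(angle)


print(lab_to_lch(50.0, 30.0, 40.0))  # chroma 50.0, hue about 53.13 degrees
```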

#

So we could use LAB mode that pil does support but it is still indeed limited to 8 bit per channel

midnight wind
#

Yeah

#

The point is that it doesn't matter how it's stored in a file/db as that representation truly should be treated as an intermediate.

rugged vector
#

Not sure how feasible prophoto support would be to implement for many reasons haha

#

True... XYZ might be the most independent way to do it

#

Seems rather nonstandard however

midnight wind
#

When you ask for an image from the InvokeAI core, it should be an interpretation in the space you want.

#

If that happens to lose data, OK, your problem.

#

"Reader Makes Right"

#

Where right in this case means mangling your data.

#

But that approach would mean that the existing API is fine and core nodes can update over time.

#

Especially true given the changes that are going in so there's no more writing directly, you just say "save"

#

That means the contract between nodes and the service can be changed much easier.

rugged vector
#

Ahh nice

midnight wind
#

I just dropped a comment.

night coral
midnight wind
#

I wonder if those techniques apply to SD1.5.