#✨|sdxl

uneven dove
#

winston churchy in the forest performing illegal black magic

grizzled warren
#

it also adds like, 15 years to female faces

uneven dove
upbeat summit
grizzled warren
#

well... Civitai would love it, it makes NSFW legal no matter the prompt ;D

prisma peak
uneven dove
#

i have a cursed seed for sale

elfin cobalt
#

Tried using dpm_adaptive as a third pass, and this happened.

#

Which is a total fluke of course.

uneven dove
#

repaste for Baughn

elfin cobalt
#

Thanks. XD

uneven dove
#

boomer 🧓🏽

#

😛

elfin cobalt
#

Pasting those links used to work!

uneven dove
#

they disabled embeds to catch you up like that

#

they were like "tell him we thought it'd be funny"

uneven dove
eternal fog
uneven dove
#

misaligned timesteps in DDIM between base/refiner introduces way more detail than the refiner knows what to do with given its timestep spacing

#

those people in the background should not be there lmao

#

they disappear when you realign them

elfin cobalt
#

That seems exploitable.

urban fjord
#

I guess you should make it into a feature then.

inner ruin
#

pseudo do you have tips for good negative prompts?

eternal fog
#

holy shit cursed image

prisma peak
#

A bit derpy, but Winston Churchill giving a speech at Hogwarts

grizzled warren
#

honestly... it's much different from 1.5, I'd only use it for specific images and certain styles. the old mandatory negative presets aren't necessary anymore

uneven dove
#

Jesus is back... and he means business

grizzled warren
#

other than maybe long neck at portrait orientation

uneven dove
grizzled warren
uneven dove
#

long negative prompts tend to lock the thing into a certain style and i like to screw around with a lot of ideas so i just tend to throw more into the positive prompt to get what i want

#

i think even 1.5 and 2.x are better with negative embeds trained on the model that you're using

urban fjord
uneven dove
#

omg jesus vs churchy

eternal fog
uneven dove
#

@urban fjord Jesus is ready and waiting in London. Where is Winston?

visual glade
#

why do you want to see img2img?

uneven dove
#

it can't work with 0.9 and excessively smooths the image out, for the base model that is

#

makes it look kind of like airbrushed vector art

#

it really works well for some art styles but not for photoreal eg. Hires. fix

grizzled warren
#

I mean, it works for photorealism, but limits it to shallow depth of field, only that way it looks accurate

#

and not all photography is like that

#

the obsession with bokeh is a fairly modern phenomenon

uneven dove
urban fjord
#

This is img2img with 0.9 base and no refiner and it looks like a pretty normal base output. Maybe better result with better prompt.
I feel something is just wrong with your img2img code pseudo

soft bone
#

0.9 loves watermarks eh

uneven dove
#

mine uses the refiner on the img2img output

urban fjord
#

I'll try to hook up the refiner too.

sharp robin
uneven dove
sharp robin
#

Like as a whole those are two completely different images

uneven dove
#

well yeah he used .9

#

oh wait

#

0.9 nvm

#

i'm dumb. i don't know what strength that is

sharp robin
eternal fog
#

It will be high denoise

uneven dove
eternal fog
#

Comfy said they were both fine when I asked @brave halo

uneven dove
#

it just changes it too minimally

urban fjord
#

If you want an upscaler you should say that and not just img2img.

inner ruin
#

so my LoRA works now, but it looks like crap without the refiner

sharp robin
#

The issue is in the i2i

#

Not the upscale part

eternal fog
#

Apparently GPU is quicker on some systems, but less deterministic

#

However on my machine they are the exact same speed

#

I guess if you have a weak CPU but a good GPU then use GPU, otherwise use the other one

upbeat summit
uneven dove
#

is that Starburns?

#

cursed seed or forbidden token?

grizzled warren
#

lightning serves as a reminder: god of thunder wields the banhammer

soft bone
patent moon
#

Have there been any announcements regarding the time SDXL will be released? Any launch events planned?

soft bone
#

I hope people realize how many lightyears beyond base 1.5 this is

#

very nice work

upbeat summit
# soft bone have you tried for analog/film photos yet?

I mostly use this prompt that I've built for SD 2.1 for film / analog portraits. It also works great for SDXL. you just need to populate the [PLACEHOLDERS] with your own values.
cinematic movie extreme close-up still of an epic scene of a [ETHNICITY] [OCCUPATION] in the [SEASON] at [DAYTIME], centered, looking into the camera, fog atmosphere, volumetrics, photorealistic, from a western movie, analog, very grainy, film still, kodak ektar, fujifilm fuji, kodak gold, cinestill 800t, kodak portra, photo taken by thomas hoepker
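
A minimal sketch of how the [PLACEHOLDERS] in a template like this could be filled in from Python. The helper name `fill_prompt`, the shortened template, and the example values are assumptions for illustration, not part of the original prompt workflow.

```python
import re

# Shortened copy of the template above, for illustration only.
TEMPLATE = (
    "cinematic movie extreme close-up still of an epic scene of a "
    "[ETHNICITY] [OCCUPATION] in the [SEASON] at [DAYTIME], centered, "
    "looking into the camera"
)


def fill_prompt(template: str, values: dict) -> str:
    """Replace every [UPPERCASE] slot with the matching entry in `values`."""

    def substitute(match):
        key = match.group(1)
        if key not in values:
            # Fail loudly rather than sending a half-filled prompt to the model.
            raise KeyError(f"no value supplied for [{key}]")
        return values[key]

    return re.sub(r"\[([A-Z]+)\]", substitute, template)


# Hypothetical example values, not taken from the chat.
prompt = fill_prompt(
    TEMPLATE,
    {
        "ETHNICITY": "Nordic",
        "OCCUPATION": "blacksmith",
        "SEASON": "winter",
        "DAYTIME": "dusk",
    },
)
print(prompt)
```

Raising on a missing key is a deliberate choice here: a leftover `[SEASON]` in the prompt would otherwise be passed to the text encoder as literal text.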

uneven dove
#

this is your grandmother's bake sale

upbeat summit
#

fractal bake

elfin cobalt
#

Tried for "Fractal bakery".

#

Got this...

upbeat summit
#

nice!

elfin cobalt
#

But wow, this does Escher much better than 1.5 ever did.

uneven dove
#

i love the deep fried art lmfao

upbeat summit
#

yeah I had some really interesting mash-ups

uneven dove
urban fjord
#

Churchill is really not playing around anymore.

elfin cobalt
#

Or rather

#

Abstract renderings of fractal-infused lawn gnomes --style abstract, fractal, surreal, M.C. Escher -

uneven dove
elfin cobalt
#

Because if I'm throwing two-word prompts at it, I'll get an AI to fix them for me.

#

Oohkay

#

That's terrifying!

soft bone
#

anyone use fitCorders comfy configs?

uneven dove
#

stunning photo of mandlebrot fractal version of lawn gnomes

#

crochet sus

upbeat summit
#

fractal crochet - the best

elfin cobalt
#

Just don't do hyperbolic crochet.

uneven dove
#

needs a LoRA for people falling down stairs

low python
#

just tell it to do a cartwheel down the stairs

soft bone
#

tldr on "A score" effect on prompt?

uneven dove
#

crochet

#

grandma what amazing crochet skills you have

elfin cobalt
#

That is not good for people with trypophobia.

west breach
#

I'm okay with it. More of an issue if they look like lots of round holes

uneven dove
#

bird law documents

hard fractal
#

Is there a Discord silver? Like on Reddit

uneven dove
#

crocheted danny devito from @urban fjord

hard fractal
uneven dove
#

well i was confused at what the person was requesting 😛

hard fractal
#

Haha

#

Not saying you say that, I'm saying we've gotten some of that before

uneven dove
#

so it can do pretty good style conversion but you have to use pretty high denoising and it's kinda limited in which styles it'd do

#

like it can make people cel-shaded versions of themselves

#

it's super fast, too. it'd be interesting to optimise it for certain styles

dense chasm
uneven dove
autumn forum
#

so i just setup sdxlmixsampler custom node and a regular setup of base and refiner, same seed, same sampler, same steps base+refiner. but they somehow end up different and im trying to see if theres a benefit from one to the other. Left is custom node. right is base+refiner

sharp robin
# hard fractal "Why is the refiner just refining?"

In regards to that comment: the concern is that neither the base nor the refiner can do image2image in a way that adds details rather than subtracting them. Both collapse details and ‘un-refine’ the image when it is run through the samplers again. I would assume they are unable to add the necessary noise.
For example, try genning at 1024, upscaling it in any way, and then running it through a sampler to add details and sharpen it up. You can play with denoise, cfg, steps, et al. and nothing really adds. Setting denoise high does give similar results, but far from what the initial base+refiner does.

urban fjord
#

If you just want to change a small portion you can try inpainting.

sharp robin
#

Maybe we are just asking too much. copium thinking

urban fjord
#

If you want to upscale then img2img isn't the right tool, at least not without controlnet.

uneven dove
# dense chasm yes, i'm familiar with python

ok you can use GPT 3.5-Turbo or local LLM to generate test prompts based on certain concepts. you can also have it generate test concepts, which you ask it to generate test prompts for

sharp robin
urban fjord
#

Controlnet will come, either from SAI or from someone else.

dense chasm
eternal fog
#

Yes

urban fjord
#

But yeah, I still recommend using inpainting on the parts you want to enhance that way it uses the remaining bits as reference.

uneven dove
glass acorn
#

Where can I learn exactly how Stable Diffusion works? An overview of the different pieces of it and how they fit together. For example, SDXL has something called CLIP Text Encode which takes in something called a Clip and two strings and then outputs something called conditioning. I want to know what that CLIPTextEncode does and why it has two string inputs instead of just one.

https://cdn.discordapp.com/attachments/803727923336445963/1129236647163207710/image.png

uneven dove
urban fjord
#

photo of the emoji 😆 shot on 70mm film
Wait is this actually working or a coincidence...

agile bramble
#

really sdxl 1.0?

uneven dove
urban fjord
uneven dove
#

everyone in the chat when @hard fractal shows up

urban fjord
#

Alright, it looks like there is a weak connection between emoji and the concept. That above was me testing with 🚒

#

photo of ❤️ shot on 70mm film
Definitely, I just need to remove emoji from the prompt.

high skiff
#

What's up goobers

autumn forum
high skiff
#

My tiredness after taking a nap

urban fjord
#

🖼️ 🎄 🤶 📸
It forgot Santa, but yeah that is a photo of christmas shot on some kind of Camera.

high skiff
vale eagle
glass acorn
uneven dove
#

gpt asked for that

#

i have no involvement with this prompt =_=

autumn forum
uneven dove
#

OpenAI: deploy carefully, don't do X or Y
SAI: Don't do this, that, or plug it into other AIs

me: sunglas yes

#

i think my dream pipeline would have like 20 different models in it

soft bone
#

do you do any finetuning?

uneven dove
#

like, on SDXL?

soft bone
#

no in general

#

make models?

uneven dove
#

yes

soft bone
#

civit?

uneven dove
autumn forum
#

i put in hyper realism as a prompt and sdxl i feel like just straight ripped this off someone., that signature is tooo good lol

high skiff
#

I do no

#

Not even a bit

#

I feel 10x worse lmao

#

Gotta love having a ton of sleep disorders

uneven dove
urban fjord
#

Prompt: 🛢️ 🖌️ 🎨 🎄 🤶 📸
Do we even need to prompt with words anymore?
This is perfectly clear. Oil painting of Christmas shot on some kind of camera. Though I don't think the painting looks too oily, more like a drawing

uneven dove
#

then be so tired that you feel refreshed

soft bone
high skiff
#

Oh hey, would you look at that, my messages made a logarithmic curve lol

high skiff
uneven dove
# high skiff

the trajectory of your sleep debt if you follow my 3 step simple plan

high skiff
#

Look at me, a statistical prodigy

uneven dove
#

that's just more proof you're a markov chain

#

@smoky patrol 😁

high skiff
#

Side note, cause I am infatuated

I kinda think I need these in my life

uneven dove
#

Pixie-esque time travel++;

#

hmm

#

🕒⚡ Pixie-esque time travel++;

#

that's the full prompt for that img and it seems to make one single character

soft bone
#

everyone slept on it

west breach
uneven dove
soft bone
#

i tried a couple large scale analog photography portrait tunes but no dice and it wasnt worth my time to fuck with anymore.

what grinds my gears is that a bunch of awesome 2.1 tunes and merges just started coming out on civit now that I'm gonna phase it out

high skiff
#

I love her

west breach
high skiff
#

shes just an amazing person

west breach
steady grove
#

i love sdxl. look at that pic. look at it!

lament rune
#

the velvet!

trail bay
#

is that some Lora?

west breach
#

no, just a prompt

trail bay
#

ahhh

uneven dove
#

lmao

#

you guys thinkin' everything is a lora

#

winston churchill

urban fjord
#

Winston Churchill as a manic pixie girl.

uneven dove
#

that part is unimportant KEK

#

IKEA shoppers looking at AI Cryptid monsters for sale

soft bone
#

wtf is A score

#

aesthetic score? for prompt? what does that meannnn

west breach
#

the training data had an aesthetic score for each image

wicked frigate
sharp robin
#

when punsafe?

#

punscore catlurk

uneven dove
#

ew

high skiff
#

The new drivers from NVIDIA are interesting

#

I generate a decent bit faster now, but I also am proportionally slower at VAE decode as well

#

so I went from like 15 seconds to 13 seconds for gen, but now it takes like 4 seconds instead of 2 for decode

#

maybe manually forcing tiled will help

#

it seems like tiled might even be slower

#

hmmm

soft bone
#

@high skiff are you familiar with fitCorder's configs?

high skiff
#

I am not

#

yeah, ok

#

there is something up with the VAE for sure

#

base diffusion took 19 seconds for 4 images, refiner diffusion took 11, and VAE decode took 21 seconds

high skiff
#

ok, so tiled VAE is even slower here

#

by nearly 2x

#

with non tiled its 19.5 seconds/11.5 seconds/12.5 seconds

With tiled its 19.2/11.3/21.1 seconds

#

meaning a solid chunk of the image gen time is just VAE
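
To put a number on "a solid chunk", here is a quick sketch that computes the VAE share of total time from the three timings quoted above (base / refiner / VAE decode). The function name `vae_share` is made up for this example.

```python
def vae_share(base_s: float, refiner_s: float, vae_s: float) -> float:
    """Fraction of total generation time spent in VAE decode."""
    total = base_s + refiner_s + vae_s
    return vae_s / total


# Timings in seconds, as reported in the messages above.
non_tiled = vae_share(19.5, 11.5, 12.5)
tiled = vae_share(19.2, 11.3, 21.1)

print(f"non-tiled VAE share: {non_tiled:.0%}")  # 29%
print(f"tiled VAE share:     {tiled:.0%}")      # 41%
```

So on these numbers the decode is a bit under a third of the run without tiling and around two fifths with it.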

civic sigil
#

I believe tiled has always been slower

proud root
#

you mean 536.40 for new nvidia drivers?

#

@high skiff that is

#

or is there a beta driver im not seeing

high skiff
#

that is the driver I am on

#

I know it has issues

#

its weird

#

made my gen speed faster, but slowed down my VAE decode speed considerably

proud root
#

gonna need to be a lot of optimizations in the coming weeks/months from everyone involved I think, including A1111/comfy

high skiff
#

comfy is pretty optimized for all of this, thats the nature of how it runs, but A1111 is a dumpster fire right now for SDXL

proud root
#

yeah it's rough atm, tho it does "work"

#

with extreme gen times

high skiff
#

when it isn't crashing, or you don't wanna use half of the samplers, or high res fix, or img2img, or controlnet, or... anything lol

not to mention people getting blue screens

proud root
#

no idea why, but it eats up every last shred of my 32 gb of ram in the process

#

causing even my razer keyboard lights to start lagging lmao

high skiff
#

its the size of the models

#

I am upgrading to 64GB system RAM because of it

proud root
#

was using the pruned model, still the same thing shrug

#

0.9 pruned that is

high skiff
#

0.9 pruned is actually bigger, its a very weird phenomenon

#

they didn't prune it right

#

training LoRA's for it uses way more VRAM too

#

like 30% more from what friends have shared

proud root
#

yikes

#

yeah needs to be redone then lmao

high skiff
#

Yeah, likely people not knowing how to prune SDXL and assuming its the same as 1.5

proud root
#

aand just deleted it

high skiff
#

yeah, I recommend using the full model

#

but yes, both will hit your system RAM hard

#

seems like SDXL has a 32GB system RAM minimum

proud root
#

a very serious minimum, considering the pc's general unusability in the last steps before image gen

high skiff
#

which I mean, I feel like 32 GB should be the standard now

high skiff
#

luckily RAM is very affordable now

proud root
#

true ddr4 is virtually free

#

and 5 is even cheap

high skiff
#

hell, even my fast 64GB DDR5 for my new PC was about $140

#

its a very reasonable price

proud root
#

wish nvidia gpu's were cheap

#

lol

high skiff
#

yeah lol

#

gotta buy used

#

I just scored a steal of a deal on a 3090 today

proud root
#

nic

#

e

thin nova
#

what price?

high skiff
#

$600 for a Zotac 3090 with shipping, box, and tax included

#

at a price like that, how could I say no lol

thin nova
#

that's fantastic. nice find.

proud root
#

i also wish sdxl could gen 512x512 images like 1.5, to then be upscaled, but it produces gibberish

#

would help

high skiff
#

yeah, agreed

proud root
#

that might be an improvement that comes later, dunno

high skiff
#

I still much prefer my way of doing things with 1.5, where I gen a grid of like 16 or 32 images, pick the one I like and high res fix it

So much faster and more efficient

proud root
#

yeah hi res fixing is superior

civic sigil
#

I prefer native 1024 so far

high skiff
#

but at the same time, once we have semi decent finetunes of SDXL, high res fixing to 2048 should be just as good if not better

proud root
#

yeah, lots of improvements came out post 1.5, hopefully similar happens here

civic sigil
#

I wonder how good they work

high skiff
#

I have seen enough behind the scenes to know that training SDXL for LoRA's and hopefully TI's will be super viable and powerful

proud root
#

yeah if you drop it to 512x512 it outputs crazy images

high skiff
civic sigil
#

Thats a shame

high skiff
#

the latent res sliders do very very little

proud root
#

whereas 1.5 you could drop to virtually anything

high skiff
#

about as much as Xformers noise

#

just slight changes

#

tho, I have set mine to 4096x4096 which is what SAI recommended

#

I have found better alternatives to a lot of their recommendations, but not that one

civic sigil
high skiff
#

latent size and latent target, both seem to be pretty negligible

#

Also, new release of my UI is inbound

#

this time with ultimate upscale

civic sigil
#

I would be excited but Im a no refiner guy sadly lol

high skiff
#

really?

#

why is that?

#

it is destructive for non realism, i will give that

#

or well, traditional art, the refiner does hurt there

civic sigil
#

So I can train loras on the base model and also its unnecessary for animation to be a perfectionist

#

Dont want to have to train both models even if I could

civic sigil
#

Since Im gonna be making tons of loras

high skiff
#

fair enough

#

yeah, it is a shame

#

tho from what Pseudo said, SAI seem to be killing the refiner anyways

#

which is a monumental misstep IMO

civic sigil
#

No refiner for 1.0? Or killing it after that

high skiff
#

seems like no refiner for 1.0

civic sigil
#

I bet the 0.9 refiner will work fine tho

high skiff
#

which means they better get hard at work on 1.0 base, cause it needs to improve a lotttt for realism lmao

civic sigil
#

But that license is sus

#

From what I heard

proud root
#

yeah i get very random results with 1.0 realism

#

sometimes great

high skiff
#

messing a bit with fantasy realism

civic sigil
proud root
#

lmao

high skiff
#

OMG, didn't even notice that lmao

proud root
#

hahaha

#

yeah there they are

high skiff
#

real shame

proud root
#

now, do you inpaint it away or leave it cause it's hilarious lmao

civic sigil
#

Whaat are they making it all midjourney looking

#

Hopefully if we lower aesthetic score it wont be so bad

proud root
high skiff
proud root
#

very randomly good/bad with the same prompt

high skiff
#

5 is about the highest I typically recommend for realism

civic sigil
#

I really hope SAI is just gonna drop an anime and photorealism finetune alongside the base 1.0 model

high skiff
#

maybe 6 if you want some artsy realism

proud root
#

i do like how sdxl does expressions better, had to use a lora before

high skiff
#

but yeah, my realism quality is from a mixture of proper prompting, split text encoders, split diffusion, and precise use of the a scores and such

#

I will be attaching a more advanced prompt for my v0.6 release of my workflow

#

time to retire my corgi prompt haha

proud root
#

aw liked the corgi one

sharp robin
#

as long as we get a way to high res/img2img ill be happy refiner or no refiner

high skiff
high skiff
#

probably 100's of copies of him all across the world at this point lol

#

pretty kitty

civic sigil
#

I bet I could come up with an upscaling solution using XL tbh

proud root
#

yours was the first comfyui workflow i found when I started monkeying around with it yesterday

high skiff
#

hope you like it!

#

I have more plans for it

proud root
#

nice

high skiff
#

my split diffusion seems to help results from SDXL considerably, which is why the dev for diffusers implemented it as their new default workflow

#

having a large amount of confusion at the moment

#

hmmm

sharp robin
high skiff
#

@visual glade Could I get some quick help on something? I am quite confused at the moment

#

nevermind!

#

sorry

#

I wrote down a value incorrectly lol

#

😅

dense chasm
#

what's the token limit on the positive and negative prompts? 77 tokens maximum each?

high skiff
#

I have seen some people say that, but I have used a 156 token negative several times
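
For context: a CLIP text encoder takes 77 positions including the start and end markers, which leaves 75 usable tokens per pass. Some UIs get around that by splitting a long prompt into 75-token chunks and encoding each chunk separately. The sketch below only shows the chunking arithmetic; it is an assumption about how a 156-token negative could be handled, not a description of any specific UI's code, and the token ids are stand-ins.

```python
def chunk_tokens(token_ids, chunk_size=75):
    """Split a flat list of token ids into consecutive chunks of at most chunk_size."""
    return [
        token_ids[start:start + chunk_size]
        for start in range(0, len(token_ids), chunk_size)
    ]


# Stand-in ids for a 156-token negative prompt (real ids would come from a tokenizer).
fake_token_ids = list(range(156))
chunks = chunk_tokens(fake_token_ids)
print([len(chunk) for chunk in chunks])  # [75, 75, 6]
```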

dusk mica
#

156 is more than previous sd models right?

dense chasm
#

@upbeat summit your film/analog portrait prompts work well, so real

high skiff
#

I should give some analog realism a try

soft zealot
dense chasm
#

sharp texture and the skin is smooth

inner ruin
#

I'm trying to train face LoRAs and still not getting the right parameters for SDXL smh D:

high skiff
#

@eternal fog hey man, are you here by chance?

sharp robin
#

i2i is not possible yet, least w good results

high skiff
#

@eternal fog if you happen to come on, I would greatly appreciate some info for how you suggest training SDXL LoRA's using kohya, pretty please!

#

no idea at this moment honestly

#

Also, I get my 3090 on the release day lol

quartz sequoia
#

@visual glade is there a way to keep the path of load LoRA node, but mute its activity? like it passes through it, but i just want to disable actually using the lora

#

rather than disconnecting and reconnecting

visual glade
#

you can set the strength to zero
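
Why that works, as a toy sketch: in the usual LoRA formulation the patched weight is the base weight plus a scaled delta, so a strength of zero leaves the base model untouched. The function `apply_lora` and the numbers are illustrative assumptions, not ComfyUI internals.

```python
def apply_lora(base_weights, lora_delta, strength):
    """Return base + strength * delta, element by element."""
    return [w + strength * d for w, d in zip(base_weights, lora_delta)]


base = [0.5, -1.0, 2.0]    # toy base-model weights
delta = [0.1, 0.2, -0.3]   # toy LoRA update (already the product of the low-rank factors)

# With strength 0 the node is effectively a pass-through.
passthrough = apply_lora(base, delta, 0.0)
print(passthrough == base)  # True
```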

quartz sequoia
#

oh..true

#

why didnt my genius brain think of that

amber fulcrum
sharp robin
#

may have figured it out

#

i go 😴 🛌 now

upbeat summit
plain flame
sharp robin
#

PS: the refiner does good. Still, if SAI can gen that quality built in, great; if not, it’s a neat thing.

#

💤

peak dove
inner ruin
upbeat summit
dense chasm
#

does anyone know why, when i add negative prompts, the man eventually changes to a woman with the same image seed

#

prompt:A man walking around her neighborhood, highlight hair, detailed eyes, sharp focus, young face, perfect symmetric face, pupil reflecting surroundings, realistic skin, soft healthy skin

#

negative prompt:ugly, tiling, poorly drawn hands, poorly drawn feet, poorly drawn face, out of frame, extra limbs, disfigured, deformed, body out of frame, bad anatomy, watermark, signature, cut off, low contrast, underexposed, overexposed, bad art, beginner, amateur, distorted face

upbeat summit
#

yeah I noticed this too. but I have not yet identified the corresponding token

#

remove ugly from negative 😉

dense chasm
#

oh, "walking around her": maybe the embedding goes to the nearest token, and "her" is a female subject

upbeat summit
#

try emphasizing man in your prompt with (man:1.2)

slender coral
#

For Comfy is there a node that will randomize text as well? With any type of parameters?

upbeat summit
slender coral
upbeat summit
slender coral
#

Nice, python?

upbeat summit
#

yeah

dense chasm
slender coral
#

for word in words: {s} something

upbeat summit
#

I haven't created my own node yet, but it is pretty straight forward.

#

it's just python
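
A rough sketch of the text-randomizing logic such a node could wrap. This is plain Python only; it does not use or claim anything about the ComfyUI custom node API, and the `{slot}` syntax, function name, and word lists are made up for the example.

```python
import random


def randomize_prompt(template: str, options: dict, seed: int) -> str:
    """Fill each {slot} in the template with a seeded random pick from options[slot]."""
    rng = random.Random(seed)  # local RNG so the same seed always gives the same prompt
    result = template
    for slot, choices in options.items():
        result = result.replace("{" + slot + "}", rng.choice(choices))
    return result


# Hypothetical word lists.
options = {
    "subject": ["lawn gnome", "corgi", "frog"],
    "style": ["crochet", "fractal", "oil painting"],
}
prompt = randomize_prompt("photo of a {style} {subject}", options, seed=42)
print(prompt)
```

Seeding the picker separately from the image seed keeps the prompt reproducible when a generation needs to be re-run.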

upbeat summit
dense chasm
#

the SRAM usage bottleneck

vale eagle
#

I had a question about comfyui. Could I have a node to wait for user input while running the whole process? For example, I generated two images with the base model and I want to select one of them for the refiner steps. Could comfyui handle this case?

royal fern
#

SD: XL vs 2.1 vs 1.5. optimized prompts, normalized resolution

lilac bough
#

/prompt RAW photo, B&W photo, (detailed face)+, portrait of a beautiful woman posing for a picture, canon 85mm F1.2, (soft fill light), f22, dramatic lighting, trending on ArtStation Pixiv, high detail, sharp focus, aesthetic, 8k uhd, DSLR, intricate details, soft lighting, high quality,

Negative prompt: blurry, out of focus, (deformed iris, deformed pupils, semi-realistic, cgi, 3d, render, sketch, cartoon, drawing, anime)++, text, close up, cropped, out of frame, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck

high skiff
timid sonnet
high skiff
#

PHROUGHE

timid sonnet
#

The last generation I did looks so real

#

Literally a macro shot of a frog at a reptile store being misted with the light above the enclosure 😛

west breach
#

looks awesome

high skiff
#

one of my favorite new realism gens

west breach
#

i've been trying to get a good 2k workflow

high skiff
#

also impressed with how well SDXL can do specific cars, like this 918 Spyder

#

it can do a lot of cars better than a lot of specific car LoRA's out there

west breach
timid sonnet
#

that is an impressive spyder

#

@west breach That third image is super simple but extremely expressive that's cool

upbeat summit
#

nice fidelity, @west breach

west breach
#

thanks. just tweaking it still trying to remove the cloudy effect i'm seeing in some images

high skiff
#

great

#

so removing and relocating python killed all of my venvs in everything

timid sonnet
#

WOAAAAAAAAAAAAAH

high skiff
#

ruined my Oobabooga, both of my Kohya's, my old Comfy UI, and my A1111 install

#

all of my shit is gone

#

sighhhh

timid sonnet
high skiff
#

I have to reinstall everything again from scratch

#

kill me now FFS

timid sonnet
#

@high skiff Your workflow + textual guide is amazing

high skiff
#

I am glad you like it

dusk mica
#

so i heard refiner will be gone? so will it be baked in for 1.0?

high skiff
#

Not sure if it will be gone yet, if it is, then it will not be in 1.0

#

1.0 would just be the base

west breach
#

i think there are some devs who want the refiner gone

high skiff
#

I understand why they want it gone, and it makes total sense

But it will also lower SDXL's performance considerably, unless they get a huge improvement to 1.0 between now and Tuesday

timid sonnet
#

Why remove the refiner? To make SDXL images less detailed so that we can put more focus into detailers or upscalers ?

high skiff
#

biggest reason seems to be cause you would have to train LoRA's and stuff for both the base and the refiner

dusk mica
#

super stupid question: if the refiner improves quality overall, why isnt it baked into the base model? are there technical reasons?

timid sonnet
#

Two years forward, all of this will have seemed like nonsense and a waste of time. The tech will probably have advanced so much and it probably will all be because of some simple little new invention made by nvidia

timid sonnet
#

Polytethering Neurodivergent Pipelines Through Infinite Symmetry VR Universes

high skiff
#

from my research, it seems that the base model is made to understand composition and framing, while the refiner is made to understand fine details

the base does the image shapes and composition, the refiner cleans up the bases shortcomings
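
One way to picture that hand-off, as a sketch under assumptions: both models share a single step schedule, the base runs the early (high-noise) steps and the refiner picks up at the exact step where the base stopped. The 30 steps and the 0.8 split below are illustrative values, not numbers from this chat.

```python
def split_steps(total_steps: int, base_fraction: float):
    """Split one shared schedule into base steps and refiner steps with no gap or overlap."""
    if not 0.0 <= base_fraction <= 1.0:
        raise ValueError("base_fraction must be between 0 and 1")
    handoff = round(total_steps * base_fraction)
    base_steps = list(range(0, handoff))
    refiner_steps = list(range(handoff, total_steps))
    return base_steps, refiner_steps


base_steps, refiner_steps = split_steps(total_steps=30, base_fraction=0.8)
print(len(base_steps), len(refiner_steps))  # 24 6
```

Keeping both lists on the same schedule is the point: the refiner starts from the noise level the base actually ended at, which is the kind of alignment the earlier DDIM timestep discussion was about.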

dusk mica
#

i see

#

thanks

delicate grotto
#

How much vram is needed for sdxl 0.9?

high skiff
#

some people have gotten it to work on 6GB

delicate grotto
west breach
#

yep works on my 3060 6gb

high skiff
delicate grotto
#

Ah great

vocal stream
#

I thought at 8gb you run it with the practically slightly worse VAE for 1024 though

delicate grotto
high skiff
#

I genuinely have no idea how the hell people are saying they are training LoRA's for SDXL on 8GB GPU's

I am using BS 1, and its using 18GB VRAM

west breach
delicate grotto
#

Then the model is extremely optimized

vocal stream
#

he said 768 not 1024 for the 6gb

high skiff
#

@boreal bough @golden quarry @eternal fog Sorry to ping you all, just wanted to ask if you guys have any advice, cause the fact that BS 1 is taking 17-18GB VRAM in both Kohya and Derrian's UI is insane, and makes 0 sense

#

6GB can likely run 1024x

#

it will just pool and be slow

vocal stream
#

and again, I think they use the tiled VAE under 11gb, so the results aren't the same

halcyon tusk
high skiff
#

I run 8x 1024x without having tiled VAE on a 10 GB 3080

golden quarry
west breach
high skiff
#

this is the lowest I have seen at the moment

delicate grotto
#

Wtf is a tiled vram? Is it new?

high skiff
#

and at this speed, training LoRA's is impossible

delicate grotto
high skiff
#

I can be waiting 9.5 hours for a 4 epoch LoRA

vocal stream
#

Even if it takes days, that's not really 'impossible'

golden quarry
#

Yeah that looks like nvidias good old fuck you that happened when they released drivers for the 4060

high skiff
#

it is when settings have yet to be found, and 4 epochs is about 5% of a real LoRA, which means a single one would take weeks
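
The arithmetic behind that estimate, using only the figures quoted above (9.5 hours for 4 epochs, with 4 epochs being about 5% of a full run):

```python
hours_per_run = 9.5   # reported time for a 4-epoch LoRA
epochs_per_run = 4
percent_of_full = 5   # "4 epochs is about 5% of a real LoRA"

full_epochs = epochs_per_run * 100 / percent_of_full   # 80 epochs
full_hours = hours_per_run * 100 / percent_of_full     # 190 hours
full_days = full_hours / 24

print(f"{full_epochs:.0f} epochs, {full_hours:.0f} hours, about {full_days:.1f} days")
```

That is roughly eight days of continuous training for one full run at this speed, before any repeats to search for working settings.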

halcyon tusk
halcyon tusk
delicate grotto
high skiff
#

@golden quarry I think something is actually wrong with my drivers, so I will be doing a full rip out and reinstall with DDU

#

because I noticed today that my GPU idles at max clocks as well

#

110 watts idle

west breach
#

where can you see the gpu watts?

delicate grotto
high skiff
#

I use HW info 64

high skiff
delicate grotto
#

It should show average gpu wattage per ~5 sec (default value)

delicate grotto
#

And yeah it isnt accurate

rustic garnet
#

so kohya_ss seemed to be extremely memory efficient for me

high skiff
#

yeah, and now I am trying to figure out how to also run a LoRA on 10GB VRAM

#

cause people here have said they have done it

#

but here Kohya is, using 17 GB VRAM for BS 1

rustic garnet
#

I would have to check how much vram it needed for me

high skiff
#

not 17GB for BS one, I would assume lol

rustic garnet
#

but I cannot imagine its 17GB for BS1 if its still < 24GB for BS 12

delicate grotto
#

~12 for 1024 training on 2.1, not sure on 0.9

high skiff
#

I have talked to multiple people on here saying they have done SDXL LoRA's on less than 10GB VRAM

rustic garnet
#

yeah, I totally believe that

high skiff
#

but nobody seems to know why thats not the case for me

rustic garnet
#

let me check in a few minutes how much vram it needs for me

high skiff
#

My new GPU will be here on Tuesday, then I won't have to worry

delicate grotto
high skiff
#

but I would still like to play around now

delicate grotto
#

Also use xformers

vocal stream
#

I'd be worried if my setup uses 2x the Vram it does for others even with a bigger gpu

rustic garnet
high skiff
#

nothing I would be running would be using like 9 extra GB VRAM

delicate grotto
high skiff
#

no, I am not

#

I am using Kohya

delicate grotto
#

Aaa that might be it

rustic garnet
#

no, kohya is super efficient

delicate grotto
#

Mmm maybe not in loras idk

rustic garnet
#

it is

#

try to enable gradient checkpointing, bf16 training, xformers

high skiff
#

it is

delicate grotto
#

Try to run in a1111 too

high skiff
#

its what everybody has done who said they got it for less than 10GB VRAM

delicate grotto
#

If it works then cheers, if not then that just sux

high skiff
delicate grotto
high skiff
#

alright, just nuked my kohya training

vocal stream
#

do you have the same torch etc. version as them? iirc there is a difference in efficiency with current implementation

west breach
#

is that it? @high skiff

high skiff
delicate grotto
#

Ah wait, are you perhaps using 4xxx gpu? I remember they had a problem with extreme vram usage relative to 3xxx and lower

rustic garnet
#

really? X_x

high skiff
#

nope

#

3080

delicate grotto
#

Then im out of ideas, sry

vocal stream
eternal fog
high skiff
#

last time I did that, I lost over 80GB of files in a failed training when it nuked a file structure

delicate grotto
vocal stream
#

how is it hell? It's easier and cleaner to setup than the mess of .bat files for Windows

dusk mica
#

is there any diff in ram/speed when using linux vs windows for sdxl

west breach
#

thirsty when it's running 😄

high skiff
#

nevermind then I guess, I'll just go back to 1.5

eternal fog
high skiff
vocal stream
#

and some stuff are just optimized better in Linux which you get some of in WSL

#

at least for 1.5/2.1 though, I only assume it's the same for sdxl but there's no reason it shouldnt be

delicate grotto
delicate grotto
rustic garnet
#

@high skiff Okay, you are right. It takes 16GB for batch size 1 and 22 GB for batch size 12

#

wild ^^

high skiff
#

yeah, I can only assume thats a bug

rustic garnet
#

dunno. When I implemented training myself I always got OOM

#

you have to do a lot of tricks like gradient checkpointing, mixed precision training and so on to get it low enough

#

like even if you just train a lora it has to store all intermediate activations for the backward pass or do excessive gradient checkpointing. So memory usage is not automatically much smaller than when you do full finetune
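
A rough way to read the two numbers reported above (16 GB at batch size 1, 22 GB at batch size 12) is to fit a straight line through them. This is only a sketch: it assumes VRAM grows linearly with batch size, which is a simplification, and the function name is made up for illustration.

```python
# Two VRAM readings reported in the chat: (batch size, GB used).
reading_small = (1, 16.0)
reading_large = (12, 22.0)


def fit_linear(p, q):
    """Return (fixed_gb, per_sample_gb) for a straight line through two readings."""
    (b1, g1), (b2, g2) = p, q
    per_sample = (g2 - g1) / (b2 - b1)  # extra GB per additional sample
    fixed = g1 - per_sample * b1        # GB that would remain at batch size 0
    return fixed, per_sample


fixed_gb, per_sample_gb = fit_linear(reading_small, reading_large)
print(f"fixed ~{fixed_gb:.2f} GB, per sample ~{per_sample_gb:.2f} GB")
```

Under that assumption most of the usage is a fixed cost that does not shrink with the batch size, which fits the point that dropping to batch size 1 alone does not get a LoRA run under 12 GB.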

high skiff
#

just makes no sense that suddenly now it needs 17 GB VRAM

#

when it didn't before

eternal fog
#

I'll be trying again tonight. I just don't understand how it runs 1 day at batch 2 1024*1024. And then the next day it OOM.

high skiff
#

hell even Kohya says SDXL should LoRA on 12GB VRAM just fine

delicate grotto
elfin cobalt
#

If you made a model twice the size of SDXL, I'm sure you'd get the same picture quality without the refiner.

rustic garnet
#

hm, maybe its because they add text encoder lora training?

elfin cobalt
#

But then it wouldn't fit in 8GB.

high skiff
delicate grotto
#

Or picture weight

eternal fog
rustic garnet
#

they just remove the loras later. It's a bit weirdly implemented

eternal fog
#

I have a theory that the trainer is fucking up the settings input into the script and it's not configuring properly.

high skiff
#

thats my assumption as well

eternal fog
#

I noticed the other day I asked for 100 epochs and it gave me 80

#

It had randomly added a max steps entry in I didn't ask for

rustic garnet
#

do you cache text encoder outputs?

eternal fog
rustic garnet
#

maybe they forgot to move the TE back to cpu

high skiff
#

ok, and now training 1.5 is telling me I have no dataset

#

pretty sure that kohya is just in flames and all fucked up right now

eternal fog
jolly bear
#

what are the advantages of that model?

eternal fog
#

I need to look at it after work and see if I can figure it out. I'm using the GUI so I might just manually use the scripts instead.

wicked frigate
#

my own windows install idles between 0.5 and 1 GiB idle VRAM usage depending on the day, lot higher as soon as i open anything

eternal fog
#

I'll try it again tonight, but I'll close everything else down and see if it makes any difference.

high skiff
#

my idle usage on my GPU is like 0.2-0.4GB

ionic dragon
#

@high skiff i have a workflow, can you please tell if the refiner is being used as a base?

west breach
eternal fog
eternal fog
#

@ionic dragon share a screenshot of your sampling settings.

ionic dragon
ionic dragon
eternal fog
#

I'm not on my pc so I can't lol

ionic dragon
#

oh wait

ionic dragon
eternal fog
#

There's your problem

ionic dragon
#

i am doing a comparison

eternal fog
#

That sampler needs karras

ionic dragon
#

so just trying out all the combinations

eternal fog
#

If you use normal it does what yours is doing.

west breach
#

anything but normal for the first stage

ionic dragon
#

so there's nothing wrong with my workflow?

eternal fog
#

Some samplers are fine with normal. The sde ones don't like it though.

ionic dragon
#

ok

eternal fog
ionic dragon
#

i am just doing a comparison

#

so i wanted to verify if everything was right

#

before i can share all the outputs

eternal fog
#

Change it back to karras and it should be fine.

ionic dragon
#

@eternal fog do you know any tool which can stitch images?
so i can make a grid like the one in a1111?
but manually?

eternal fog
west breach
#

here is my 'economy' workflow

high skiff
#

my prompt :>

#

my corgi lives on

west breach
#

@high skiff economy dog

high skiff
#

lololol

#

alright

#

so it looks like my new dataset for my Avatar LoRA is a huge success

#

results from just a 15 minute training on 1.5

#

so when I can get SDXL into action, this should be amazing

west breach
light pawn
#

can SDXL run on 8gb in comfyui?

high skiff
#

yes

#

so far I consider this a huge success

west breach
light pawn
west breach
#

it doesn't need to offload for me

#

no special config

#

if you are using comfy, make sure to update it

#

there was a fix that was implemented a few days ago to reduce vram usage

stone fossil
ionic dragon
high skiff
#

yeah, testing my dataset for SDXL

ionic dragon
light pawn
west breach
molten gull
upbeat summit
west breach
#

the 24k gold workflow froggysmug

molten gull
#

thx 🙂

dapper dragon
#

.

ionic dragon
molten gull
#

octuple-peace-boy !

delicate grotto
west breach
molten gull
delicate grotto
#

@molten gull you're bringing bangers again

molten gull
#

i m wildly experimenting again 🙂 back at stable diffusion, my midjourney experiment is over (for now)

#

it's not really doing what i want it to do yet though 🙂 stubborn little program 🙂

upbeat summit
#

really great images anyway 🙂

golden quarry
rustic garnet
#

SDXL trains extremely fast. Like a few epochs and it's done. So increasing batch size doesn't cost much but hopefully makes training more stable

molten gull
#

i m not sure it trains fast

rustic garnet
#

but I haven't found time to experiment a lot. My current workflow works perfectly.
1.) Clip Interrogator to find best fitting token
2.) Finetuning of OpenCLIP embedding with Textual Inversion
3.) LORA for OpenCLIP text encoder
4.) LORA for unet, everything else frozen

golden quarry
#

Do you use kohya?

rustic garnet
#

the only step that takes long time is the textual inversion. But you can probably skip the step. I just like to have a simple single trigger word as embedding

#

Lora

silent imp
#

I'm curious, does the voting we do also improve prompt understanding?

molten gull
#

not sure at all even ... it's all very frustrating still, i did a LORA with 400epochs and 100 pictures (ran the whole night) and it seems to be better than a small trained one

rustic garnet
#

and yes, the implementation by kohya

silent imp
#

i'm looking at some images at #1100484581037195384 and they seem to follow the prompt far far better than they did in 0.9

golden quarry
rustic garnet
#

might be, dunno. I changed the code a lot

#

cause it didn't support many things I wanted to have, like training TI + TE, training TE and Unet separately, using different ranks for TE and Unet and so on

golden quarry
molten gull
golden quarry
molten gull
gentle spire
#

is it possible to mix sdxl with normal models?

#

like, start an image with a 1.5 model and finish it with sdxl refiner

rustic garnet
molten gull
golden quarry
molten gull
#

F, i like this style 🙂

golden quarry
rustic garnet
#

I just used the "set dim from weight file" option and initialized an empty lora myself

#

the whole workflow was very hacky ^^° Maybe I find time and motivation to make everything clean and reusable and then make a pull request for kohya

golden quarry
#

Fair

molten gull
golden quarry
molten gull
golden quarry
#

So I don't usually implement things that require editing sd-scripts

rustic garnet
#

yeah, I think it's always cool having both. An intuitive UI and a flexible customizable API

molten gull
#

what did you make @golden quarry ?

high skiff
#

dude, this LoRA worked so good on 1.5, I can only imagine how good it will be when I can play with SDXL

#

I can also get way more images in my dataset

molten gull
golden quarry
molten gull
#

ah, neato 🙂 i m using this already

#

great job man, i really like it

golden quarry
golden quarry
molten gull
#

say, what do you suggest for learning rate, unet learning rate and TE learning rate ? and why ? (and how do you know?)

rustic garnet
golden quarry
rustic garnet
molten gull
#

can i stop and continue a training somehow ?

golden quarry
rustic garnet
#

yeah, but I never did that. Actually, I just train a bit longer until it clearly overfits, then I choose the best checkpoint

golden quarry
rustic garnet
#

loras are so small you can just save every epoch

molten gull
#

i m using 2e-4 at the moment, doesnt seem to be too bad

rustic garnet
#

oops, sorry, I meant 5e-4

west breach
molten gull
#

can i train something to 200 epochs today, and continue to 400 epochs some other day ? can those be continued ?

golden quarry
golden quarry
#

Just keep in mind the folders that get created are not small

molten gull
#

save state is something different than save freq and save every 10 epochs ?

rustic garnet
#

save state is also saving the optimizers state

#

the other is saving the model

molten gull
#

so i need both, yes ?

rustic garnet
#

I think just save_state

molten gull
#

i want a .safetensors every 10 epochs

rustic garnet
#

you can also continue training given an existing model. Thats totally possible. It's just that the optimizer has to adapt its learning rates and momentum and stuff

#

so training might get worse for the first few steps

molten gull
#

do you set "save last state", too ? i m not really getting what that one does different

rustic garnet
#

save_state is storing the "training state" such that you can continue training another day as if nothing happened

molten gull
#

and how would i continue on an existing model ?

golden quarry
#

Save last state is so that you don't have 400 save state folders lying around

lilac wren
golden quarry
#

It will save the last x states

molten gull
#

thats good, so i do save state, save last state and there epochs: 1

#

i only want the last of whatever is trained so far

rustic garnet
#

the other is storing the model itself. It does not contain the training state. So if you continue training from a model then the optimizer will have a noisy gradient for the first few steps and training gets worse for a moment. But it will stabilize after an epoch
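
A toy illustration of that difference, as a sketch only: plain SGD with momentum on a one-variable loss, not the optimizer or code any trainer actually uses. The `train` helper and its parameters are made up for the example.

```python
def train(x, velocity, steps, lr=0.1, momentum=0.9):
    """SGD with momentum on the toy loss f(x) = x**2, whose gradient is 2*x."""
    for _ in range(steps):
        grad = 2.0 * x
        velocity = momentum * velocity - lr * grad
        x = x + velocity
    return x, velocity


# Uninterrupted run: 20 steps in one go.
x_full, v_full = train(5.0, 0.0, 20)

# Stop after 10 steps, keeping both the weight and the optimizer state.
x_mid, v_mid = train(5.0, 0.0, 10)

# Resume with the saved optimizer state (the save_state case).
x_resumed, v_resumed = train(x_mid, v_mid, 10)

# Continue from the saved weight only: momentum restarts from zero.
x_weights_only, _ = train(x_mid, 0.0, 10)

print("uninterrupted:", x_full)
print("resumed with state:", x_resumed)
print("resumed from weights only:", x_weights_only)
```

Resuming with the state lands on the same value as the uninterrupted run, while restarting from the weight alone follows a different path until the momentum builds up again.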

molten gull
#

how exactly do i load it again?

golden quarry
#

The resume variable

#

Also within the save args

#

Assuming you are using save states

molten gull
#

i will use save states, yes

#

i cant see a resume though

#

ah, it's in "saving args" 🙂

#

shouldnt it rather be in "Loading Args" ?

#

say, when i do the resume thing, can i change the learning rates before i "start training" again ? or does it have to be the same as before ?

golden quarry
#

Well, resume is the only loading arg

rustic garnet
golden quarry
molten gull
#

can you explain how the "sample args" section works please?

golden quarry
#

It's annoying to say the least

#

Sd-scripts should have a section on how to format the txt file that gets passed in

molten gull
#

will it allow me to make some sample images of what the state of the lora is, given some prompts to it ? do you have an example text file by chance ?

golden quarry
molten gull
#

so i do a file like "lora_training.txt" and put something in there like:

"portrait of an old woman in style of malicor_old_woman --n boring image --w 1024 --h 1024 --d 3245098"

and load this to my training of my lora "malicor_old_woman" ? and get some examples that way ?
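
For reference, a line in that format can be split into its parts like this. It is a sketch that only handles the four flags used in the example line (`--n`, `--w`, `--h`, `--d`); it is not the parser sd-scripts uses, and the function names are made up.

```python
import re


def parse_sample_line(line):
    """Split one prompt line into the prompt text and its --flag values."""
    parts = re.split(r"\s+--(\w+)\s+", line.strip())
    sample = {"prompt": parts[0].strip()}
    # After the prompt, parts alternates: flag name, flag value, flag name, ...
    for flag, value in zip(parts[1::2], parts[2::2]):
        value = value.strip()
        # Width, height and seed are numbers; the negative prompt stays text.
        sample[flag] = int(value) if flag in ("w", "h", "d") else value
    return sample


def parse_sample_file(text):
    """One sample per non-empty line, so several prompts can share a file."""
    return [parse_sample_line(ln) for ln in text.splitlines() if ln.strip()]


example = ("portrait of an old woman in style of malicor_old_woman "
           "--n boring image --w 1024 --h 1024 --d 3245098")
print(parse_sample_line(example))
```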

golden quarry
#

Pretty much

molten gull
#

could i put in multiple lines in that text file ? and have more than 1 sample per epoch ?

rustic garnet
#

yes

#

if you use the kohya_ss gui, though, there is a bug in the sampling. It's fixed in the kohya script

urban fjord
#

Note that if you edit the file during training it will register changes, at least with Kohya

molten gull
#

i m not using kohya_ss, i m using the one from @golden quarry

urban fjord
#

It is useful if you notice you forgot things you wanted sampled during training, and I guess you can also remove prompts that's just taking time and seems to not be useful anymore.

rustic garnet
molten gull
#

amazing piece of software, actually

#

@golden quarry where does it save the samples to ?

rustic garnet
#

just make sure that your kohya sd-scripts is up to date. The SDXL branch is still being worked on and they do fixes every day

stone fossil
urban fjord
molten gull
#

this sounds bad ? what might i have done wrong ?

grizzled warren
# gentle spire is it possible to mix sdxl with normal models?

Yes, but that's a gimmick, because 1.5 and XL latent images aren't compatible and the Refiner has similar dataset limitations as the SDXL Base.

In order to do that, you need to generate an image with a 1.5 model, upscale it to the SDXL resolution with an upscaler or highres fix, send the output to a VAE Encoder, send the latent to the SDXL Refiner, then use a VAE Decoder to get your output.
That extra encoding and decoding takes a lot of time, as well as SDXL refiner loading time, and the SDXL's Base does generate a 1024x1024 image faster than 1.5 would.

Also, the Refiner can easily ruin a moderately NSFW generation, if that's what you want to use your Refiner on, because it doesn't have anything too spicy in the training dataset. Only the mild stuff. You can use masks and whatnot, but that overcomplicates your workflow, tremendously and unnecessarily. Even if you don't care about that, if your 1.5 model is heavily biased towards a certain style, chances are it will generate the details of this style better than the Refiner would, because it was fine-tuned to do just that. Unlike SDXL, which is general purpose model. Undoubtedly superior to 1.5 overall, but not specialized enough to compete with some 1.5 fine-tunes yet.

So any solid reason to use 1.5 as your base kinda disqualifies the Refiner, and any good reason to use the Refiner strongly suggests you to use it in combination with SDXL Base, which is a very good model, after all.

molten gull
#

the thing crashed :/

golden quarry
rustic garnet
rustic garnet
golden quarry
#

And I do try and keep it up to date

#

Pretty much every day after I get back from work I've checked to make sure there were no updates on the SDXL branch, mainly because I want things to be easy to do!

molten gull
#

can you have a look at that error message i got by chance ? @golden quarry ?

urban fjord
golden quarry
urban fjord
#

I'm not using yours but I guess I should move to yours.

rustic garnet
#

should work for the other, too

clear moth
#

I'm dying to see deliberate get upgraded to sdxl

uneven dove
#

tbh you're just doing it wrong, the refiner pass should happen before 1.5 touches it

#

i also pass the latents directly from 2.1 to sdxl without the vae encode decode steps you described.

#

basically mostly incorrect except about the nsfw part

uneven dove
clear moth
lilac wren
uneven dove
clear moth
uneven dove
#

you have to prompt the base model very precisely to get full access to the parts of its data distribution that are "less obvious"

#

uhm yeah 🙂 i even ported it to the Diffusers style a while back

#

Deliberate is a very old model, uses very old training techniques

#

try epicRealism if you want something better, as it doesn't need negative prompts

clear moth
#

I mostly used deliberate with artists to get art styles

#

It worked well with producing very good results for 1.5

uneven dove
#

i see

#

christ, why does sdxl take any word like chubby or thicc and make them 400lbs

#

helluva bias

molten gull
#

the restore doesn't reaaaaaaaally work yet @golden quarry , it says it continues at state50, but it starts at step1 again, not sure if it just names it wrong, could be the case

uneven dove
#

it may just not progress the update bar to the right pct

urban fjord
#

Something in SDXL seems to pick up on things and exaggerates it, like my misaligned eyes issue.

uneven dove
#

mine does and it is a pain in the ass. i dont blame him for avoiding progress bar shenanigans

uneven dove
#

when you provide those features it suddenly knows how to express them in the u-net

uneven dove
#

quality of data has always mattered

#

idk why people assume it wont with sdxl

#

probably because of mcmonkey...

urban fjord
#

Definitely, I just found it strange that it took it a lot further than both my training images and the features already present in the base-model

clear moth
#

I feel like prompt weights and negative prompts are what will fix these shortcomings

uneven dove
uneven dove
clear moth
uneven dove
#

Empty negative prompt means, during sampling, on each sampling step, create two denoised pics - one using the prompt, one using the empty prompt, and combine them into one, adding first and subtracting the second.

When the Negative Guidance setting kicks in during sampling, you stop producing two denoised images (from prompt and from neg prompt), you stop combining them into one, and instead you produce just one denoised image, using the prompt.

That is the reason why results are different.
@visual glade what the heck, is that how negative guidance works in ComfyUI too?
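
The behaviour described in that quote can be sketched in a few lines. This uses the standard classifier-free guidance formula on plain lists, with a stand-in `denoise` callable; it is not the A1111 or ComfyUI code, and `skip_cfg_from` is a hypothetical parameter name for the step at which the second model call is dropped.

```python
def cfg_combine(cond, uncond, scale):
    """Classifier-free guidance: start at uncond and move toward cond by `scale`."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


def guided_prediction(denoise, latent, prompt, negative, scale, step,
                      skip_cfg_from=None):
    """Return (prediction, number of model calls) for one sampling step."""
    cond = denoise(latent, prompt)
    if skip_cfg_from is not None and step >= skip_cfg_from:
        # Guidance switched off: only the prompt prediction is used.
        return cond, 1
    uncond = denoise(latent, negative)
    return cfg_combine(cond, uncond, scale), 2


def toy_denoise(latent, text):
    """Stand-in for the u-net: shifts the latent by the text length."""
    return [x + len(text) for x in latent]


early = guided_prediction(toy_denoise, [0.0, 1.0], "cat", "", 7.0, step=2,
                          skip_cfg_from=10)
late = guided_prediction(toy_denoise, [0.0, 1.0], "cat", "", 7.0, step=12,
                         skip_cfg_from=10)
print(early, late)
```

Before the cutoff each step costs two model calls, after it only one, which is where both the speedup and the change in results come from.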

uneven dove
ionic dragon
#

can we use prompt as a filename?

clear moth
elfin cobalt
elfin cobalt
#

On windows? Mostly no, the list of disallowed characters is huge.
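
If you still want prompt-based names, a small sanitizer gets most of the way there. This is a generic sketch, not something any UI ships: the function name, the `_` replacement and the 100-character cap are arbitrary choices.

```python
import re

# Characters Windows rejects in file names, plus ASCII control characters.
_INVALID = re.compile(r'[<>:"/\\|?*\x00-\x1f]')

# Device names Windows reserves regardless of extension.
_RESERVED = {
    "CON", "PRN", "AUX", "NUL",
    *(f"COM{i}" for i in range(1, 10)),
    *(f"LPT{i}" for i in range(1, 10)),
}


def prompt_to_filename(prompt, max_len=100, ext=".png"):
    """Turn a prompt into a name that is safe to use as a Windows file name."""
    name = _INVALID.sub("_", prompt)
    name = re.sub(r"\s+", " ", name).strip(" .")
    # Windows also dislikes names that end in a space or a dot.
    name = name[:max_len].rstrip(" .")
    if not name or name.split(".")[0].upper() in _RESERVED:
        name = "_" + name
    return name + ext


print(prompt_to_filename('astronaut in a teacup: "8k", best/quality?'))
```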

uneven dove
ionic dragon
uneven dove
#

oh

#

for output. i see

ionic dragon
#

yep

#

we can use samplers element as the filename

uneven dove
ionic dragon
clear moth
uneven dove
#

"thicc"

clear moth
#

or whatever you use

uneven dove
# visual glade of course not

yeah it seems incredibly inefficient and i'd never ever heard of it being done that way. i had to look at the code to verify. he is correct. but i have, no idea why he does that.

clear moth
#

it just needs to be pushed back a bit

uneven dove
#

ok, i think you need to simply run a local copy of SDXL with prompt weighting so you can be assured that it will not help

clear moth
#

I'm just using stable horde via artbot

uneven dove
#

when you put a weighted term into the positive and negative

sudden cliff
uneven dove
#

AI: "you're on your damn own"

uneven dove
sudden cliff
#

Or is it your own impl wip?

#

ah

#

(side note, did anyone ever add it to IF? I tried to do it in Compel but got too busy)

uneven dove
#

DeepFloyd tokenizes by letter, i have no idea how that impacts Compel.

visual glade
azure oxide
uneven dove
azure oxide
uneven dove
#

here are more for you @azure oxide

azure oxide
#

oh my god i am grooving now

uneven dove
#

prompted for an astronaut in a teacup

#

you can kind of see where it was... hoping to go with that

dusk mica
#

i always wondered how people could make great models with base 1.5. Like if base 1.5 can make such great models, sdxl will be insane

sudden cliff
uneven dove
#

i think it's a logical fallacy to assume that just because an older model had good fine-tunes that SDXL ones will be better

#

people didn't pick up fine-tuning 2.x because it was very hard, and everyone blamed OpenCLIP. and guess what? SDXL has an even bigger OpenCLIP, plus the original CLIP-L from 1.5

#

so far, a lot of the successful trainers on 1.5 can't even load into SDXL training without first upgrading their equipment

uneven dove
steady chasm
uneven dove
#

what's funny is, if you say nothing at all to the prompts, it behaves nicer

sudden cliff
#

side thing, in comfy --gpu-only is a good bit faster than the high vram line

sudden cliff
#

ah

uneven dove
#

which CPU do you have?

sudden cliff
#

i9-9900k

#

Oh so it's moving over the calculations too

uneven dove
#

i wonder if it's the lack of AVX512

sudden cliff
#

I thought gpu-only was memory related only

rustic garnet
sudden cliff
#

as a line

dusk mica
#

does avx512 help in diffusion?

rustic garnet
#

but you call the unet twice, yes

visual glade
#

--gpu-only forces everything on the GPU and disables shifting of stuff between cpu and gpu

uneven dove
#

@rustic garnet i'm repeating AUTOMATIC1111's words about how his negative guidance implementation works, which is that

visual glade
#

normally if you have a decent CPU the text encoders are run on the CPU because that's actually faster than shifting it to the GPU, running it and shifting it back

vale eagle
#

Share this interesting transformation after looping multiple time of the 20 steps base+refiner process. (20,40,60,80,200)

sudden cliff
visual glade
#

no, gpu-only only uses the gpu

sudden cliff
#

were you saying 'that's actually faster DESPITE shifting it to the GPU and shifting it back'?

visual glade
#

the shifting back only happens in the other vram modes

steady chasm
rustic garnet
sudden cliff
uneven dove
#

where has he been? doesn't know anything about the code he copy-pasted from SGM @visual glade

visual glade
#

I just wish people would stop using that ui as a reference, there's issues in the sampling code and everywhere else

rustic garnet
#

it seems to be a trick to improve performance by disabling cfg for certain timesteps

rustic garnet
#

negative prompts still work the same way as before

steady chasm
#

Is there a superior local UI to use?

boreal bough
grizzled warren
# uneven dove tbh you're just doing it wrong, the refiner pass should happen before 1.5 touche...

That's not what the person asked, though. They asked, "can I start an image with a 1.5 model and finish it with a refiner". I answered to that. Technically, we can. But I don't see a reason why we should. Other than maaaaybe using a LoRA, but chances are that would most likely get trashed by the refiner. Even you admit that we shouldn't start with 1.5 here.

Using the 1.5 derivatives for "refining" the SDXL's output in a certain way is an entirely different process, which may indeed have use cases before the SDXL ControlNet fine-tunes become available. But that would be complicated, because it's hard for 1.5 to operate at such a high resolution. And any useful scenario I can think of implies downscaling the image, using 1.5 for Ultimate upscale, or some use of ControlNet. That's either rather dumb, or very advanced, and there should be a very strong reason to necessitate that kind of workflow. It's very heavy...

visual glade
#

anything that uses diffusers is superior and also ComfyUI

uneven dove
steady chasm
#

I figured diffusers were model dependent ? I'll have to check out comfyui, I'm assuming there's a native Linux version

uneven dove
steady chasm
#

O I didn't notice their nickname, Lol

#

You ui

rustic garnet
uneven dove
rustic garnet
#

but how should it work technically?

uneven dove
#

it Just Does because the way SDXL was trained favoured OpenCLIP a bit

rustic garnet
#

if you make both prompts different then the tokens don't align

sudden cliff
#

BTW, I saw a discussion here one time about comfy UI gpu-only requiring a large amount of vram, but I can do batchsize 8 with --gpu-only and it stays under 20GB VRAM

uneven dove
#

@rustic garnet not everything has to be ideal and perfect. we're denoising imaginary data samples and making something from nothing. if you misalign stuff, that's often where new details can be introduced

rustic garnet
#

it just doesn't make sense

uneven dove
#

if you have two different prompts in the two CLIPs, you are inherently accessing a different part of the data distribution than you would if you used the same prompt in both

rustic garnet
#

I tested it once and it made things worse as expected. Haven't looked into it since then

uneven dove
#

they're not "better" results, they are "different", which in some cases does overlap with "better"

grizzled warren
rustic garnet
#

the point is that your tokens are misaligned afterwards. If you use a prompt like
"a dog" for CLIP-G and "national geographic" for CLIP_L then you merge the tokens "a" with "national" and "dog" with "geographic"
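
A toy picture of that pairing, under heavy simplification: in SDXL the per-token outputs of the two text encoders (768 features from CLIP-L, 1280 from OpenCLIP bigG) are joined along the feature axis, position by position. The sketch below uses tiny labelled vectors instead of real embeddings and ignores start, end and padding tokens as well as the fact that real encoder outputs mix context across positions. `toy_embed` is a made-up stand-in, not a real encoder.

```python
def toy_embed(words, dim, tag):
    """Stand-in for a text encoder: one labelled vector per word (no real model)."""
    return [[f"{tag}:{w}"] * dim for w in words]


def concat_per_position(clip_l_tokens, clip_g_tokens):
    """Join two per-token embedding sequences along the feature axis."""
    if len(clip_l_tokens) != len(clip_g_tokens):
        raise ValueError("both encoders must emit the same number of positions")
    return [l + g for l, g in zip(clip_l_tokens, clip_g_tokens)]


clip_l = toy_embed(["national", "geographic"], 2, "L")
clip_g = toy_embed(["a", "dog"], 3, "G")
merged = concat_per_position(clip_l, clip_g)

for pos, vec in enumerate(merged):
    print(pos, vec)
```

Position 0 ends up carrying "national" next to "a" and position 1 carries "geographic" next to "dog", which is the misalignment being described.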

sudden cliff
rustic garnet
#

or are the prompts not concatenated afterwards? I could swear they are

#

*the embeddings

uneven dove
steady chasm
rustic garnet
#

the pooled embed is only using CLIP-G anyways

uneven dove
#

yep that's how SDXL favours it, aiui

steady chasm
#

Joining this server and observing the discussion on some of these channels has me realizing I'm fairly lacking in knowledge of generative ai

visual glade
uneven dove
#

if people gravitate toward something, that's their own fault.

#

i think they end up appreciating the software and its capabilities more

#

i used to try and convince Windows users to switch to Linux 😁

#

learnt me lesson

rustic garnet
#

nope, I'm right, they are concatenated

steady chasm
#

You've fallen, soldier. Keep fighting

rustic garnet
#

so it's weird that it works to use two different prompts

uneven dove
#

instead of convincing users, i just open bug reports and fix their issues now.

azure oxide
elfin cobalt
#

Speaking of your software.

uneven dove
#

it's so much easier than trying to convince them to use workarounds

grizzled warren
rustic garnet
#

you could fill up the first prompt with blank tokens and then use different prompts - would make more sense

uneven dove
rustic garnet
#

anyways, for me this "linguistic" and "style" prompt thing is still super experimental and a weird hack. I wouldn't claim its the way to do things. Maybe in 2 weeks there is a totally different way of doing it

sudden cliff
uneven dove
#

i don't think anyone has claimed it's THE way to do things? it's just Sytan's workflow

azure oxide
#

sytan's our jesus now, he can do no wrong

uneven dove
#

we saw jesus kick winston churchill's ass yesterday

#

"lets get you to bed, grandpa"