#✨｜sdxl | Stable Diffusion | Page 8

azure oxide Jul 14, 2023, 3:06 PM

#

i feel like comfy was saying the same thing as i just did about a1 with his comment above 😂

rustic garnet Jul 14, 2023, 3:07 PM

#

you have 77 tokens, each token consists of the encoding from CLIP-L and CLIP-G. So if your prompts are "a dog" and "national geographics" then you get two tokens, one "a"+"national" and one "dog+geographics". These tokens are then used in cross attention

sudden cliff Jul 14, 2023, 3:07 PM

#

rustic garnet you have 77 tokens, each token consists of the encoding from CLIP-L and CLIP-G. ...

tokens or embeddings?

elfin cobalt Jul 14, 2023, 3:07 PM

#

@visual glade I've been sketching out an easier-to-use API for comfyui. Something that's like the prompt parameter, but where nodes can be named instead of numbered, and where node inputs are named instead of indexed. I was going to just put that in my own code and use it myself (all Rust), but is this something you'd be interested in getting a PR for?

uneven dove Jul 14, 2023, 3:07 PM

#

i doubt national is one token 😛

visual glade Jul 14, 2023, 3:07 PM

#

elfin cobalt @visual glade I've been sketching out an easier-to-use API for comfyui....

node inputs are named in the api though?

rustic garnet Jul 14, 2023, 3:08 PM

#

just to give an example: if you make one prompt as "a dog" and another is "a cat" then you have one token that is "cat+dog". So each pixel in the image has to be assigned to "cat+dog", it cannot be assigned ONLY to dog or ONLY to cat

elfin cobalt Jul 14, 2023, 3:08 PM

#

Hm? Hang on.

rustic garnet Jul 14, 2023, 3:08 PM

#

sudden cliff tokens or embeddings?

however you want to call it. I talk about the set of vectors that are used in cross attention

uneven dove Jul 14, 2023, 3:08 PM

#

chibiYell is overwhelmed by all the conversations at once

visual glade Jul 14, 2023, 3:08 PM

#

I added an option to export in api format in the latest version, to see it enable the dev options in the settings

steady chasm Jul 14, 2023, 3:08 PM

#

rustic garnet just to give an example: if you make one prompt as "a dog" and another is "a ca...

https://tenor.com/view/cat-dog-cartoon-gif-21973182

elfin cobalt Jul 14, 2023, 3:08 PM

#

"22": {
    "inputs": {
      "add_noise": "enable",
      "noise_seed": __SEED__,
      "steps": __STEPS_TOTAL__,
      "cfg": __BASE_CFG__,
      "sampler_name": "dpmpp_2m_sde_gpu",
      "scheduler": "karras",
      "start_at_step": 0,
      "end_at_step": __FIRST_PASS_END_AT_STEP__,
      "return_with_leftover_noise": "enable",
      "model": [
        "10",
        0
      ],
      "positive": [
        "75",
        0
      ],
      "negative": [
        "82",
        0
      ],

#

What is the "0" in positive?

sudden cliff Jul 14, 2023, 3:09 PM

#

rustic garnet just to give an example: if you make one prompt as "a dog" and another is "a ca...

they're not even doing pixels, it's latent space. I believe that what is happening is the multiple tokens are turned into encodings with cross attention before they are combined for the image guidance

rustic garnet Jul 14, 2023, 3:09 PM

#

sudden cliff they're not even doing pixels, it's latent space. I believe that what is happeni...

nope. And yes, not pixels but latent pixels, which is almost the same just bigger

civic sigil Jul 14, 2023, 3:09 PM

#

Does anyone have an idea why some ints would be incompatible with each other in comfyui?

steady chasm Jul 14, 2023, 3:09 PM

#

There's so many... Terms being stated.
Anyone happen to know a guide on the technology of diffusion, specifically written for dunces such as i

sudden cliff Jul 14, 2023, 3:10 PM

#

sudden cliff they're not even doing pixels, it's latent space. I believe that what is happeni...

So that is to say, your CLIP-L and CLIP-G prompts both have full encodings and at that point they can simply be added together to form a final vector

#

or vectors

grizzled warren Jul 14, 2023, 3:10 PM

#

There's no such thing as "pushing people" here. There are just different tools, some fit certain jobs better than others. Sometimes fundamentally, sometimes because it's more developed or mature. People are free to choose though.

The only kind of pressure you can possibly apply involves A1111 and Vlad. The better your software gets, the more important it is for them to stay on your level. But that's a very good kind of pressure, if you ask me! The community surely benefits from that.

rustic garnet Jul 14, 2023, 3:10 PM

#

sudden cliff So that is to say, your CLIP-L and CLIP-G prompts both have full encodings and a...

yes. I mean, concatenated, not added.

visual glade Jul 14, 2023, 3:10 PM

#

civic sigil Does anyone have an idea why some ints would be incompatible with eachother in c...

primitive nodes need improvement

uneven dove Jul 14, 2023, 3:10 PM

#

grizzled warren There's no such thing as "pushing people" here. There are just different tools, ...

no matter how many disclaimers i put on my code repos that it's highly tuned for my specific needs, people insist on trying to use it

rustic garnet Jul 14, 2023, 3:11 PM

#

so each token in your input prompt becomes a vector. The k-th token in the CLIP-L prompt is concatenated with the k-th token in the CLIP-G prompt

civic sigil Jul 14, 2023, 3:11 PM

#

visual glade primitive nodes need improvement

Do you know any tricks to solve it? Like editing the json file maybe

sudden cliff Jul 14, 2023, 3:11 PM

#

rustic garnet so each token in your input prompt becomes a vector. The k-th token in the CLIP-...

each token in a prompt does definitely NOT become a vector

rustic garnet Jul 14, 2023, 3:11 PM

#

(and with the pooled prompt from CLIP-G)

sudden cliff Jul 14, 2023, 3:11 PM

#

or these AIs wouldn't work at all

rustic garnet Jul 14, 2023, 3:11 PM

#

it does

#

it's even worse 😉

sudden cliff Jul 14, 2023, 3:11 PM

#

no

rustic garnet Jul 14, 2023, 3:11 PM

#

because it always uses 77 tokens, so they are filled with blanks

visual glade Jul 14, 2023, 3:11 PM

#

elfin cobalt What is the "0" in positive?

that's the index of the output on the previous node, that's defined as a tuple in the node def so it should be fine
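
A rough sketch of how those `["75", 0]` pairs can be read, assuming the exported API JSON is a dict of node ids whose `inputs` hold either plain values or `[source node id, source output index]` links, as in the fragment pasted above. The node graph, the `class_type` values, and the `text` input here are illustrative, and `is_link` / `list_links` are hypothetical helpers, not part of ComfyUI.

```python
# Minimal, hypothetical graph in the same shape as the exported API JSON.
# Links look like ["<node id>", <output index>], as in the pasted fragment.
prompt = {
    "75": {"class_type": "CLIPTextEncode", "inputs": {"text": "a dog"}},
    "82": {"class_type": "CLIPTextEncode", "inputs": {"text": "blurry"}},
    "22": {
        "class_type": "KSamplerAdvanced",
        "inputs": {
            "steps": 20,                 # plain value, not a link
            "positive": ["75", 0],       # output 0 of node "75"
            "negative": ["82", 0],       # output 0 of node "82"
        },
    },
}


def is_link(value):
    """A link is a two-item list: [source node id, source output index]."""
    return (
        isinstance(value, list)
        and len(value) == 2
        and isinstance(value[0], str)
        and isinstance(value[1], int)
    )


def list_links(prompt, node_id):
    """Return {input name: (source node id, source output index)} for one node."""
    inputs = prompt[node_id]["inputs"]
    return {name: (v[0], v[1]) for name, v in inputs.items() if is_link(v)}


links = list_links(prompt, "22")
print(links)  # {'positive': ('75', 0), 'negative': ('82', 0)}
```

The inputs are already keyed by name; only the source output is addressed by position, which is the part a named-output layer would have to map.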

elfin cobalt Jul 14, 2023, 3:12 PM

#

visual glade that's the index of the output on the previous node, that's defined as a tuple i...

I expect it won't change (often), but that's what I meant I wanted to name.

sudden cliff Jul 14, 2023, 3:12 PM

#

maybe some of the vectors become 0 after cross attention?

elfin cobalt Jul 14, 2023, 3:12 PM

#

It's named on the UI, after all.

uneven dove Jul 14, 2023, 3:12 PM

#

rustic garnet because it always uses 77 tokens, so they are filled with blanks

you only have to pad if they're different lengths

rustic garnet Jul 14, 2023, 3:12 PM

#

sudden cliff maybe some of the vectors become 0 after cross attention?

nope

sudden cliff Jul 14, 2023, 3:12 PM

#

at least they can't be vectors with any strength

rustic garnet Jul 14, 2023, 3:12 PM

#

yes, their attention can be 0

visual glade Jul 14, 2023, 3:12 PM

#

elfin cobalt It's named on the UI, after all.

it's named but some nodes might have the same name for multiple outputs

rustic garnet Jul 14, 2023, 3:13 PM

#

but I'm sure the "blank" tokens are still used somehow

rustic garnet Jul 14, 2023, 3:13 PM

#

uneven dove you only have to pad if they're different lengths

you could. But it's not done

azure oxide Jul 14, 2023, 3:13 PM

#

ive always wondered about token concatenation, how does it work in regards to not using concatenation? say if i prompt a scene with 75 tokens. then i prompt with those same tokens but add 100 more to describe it more finely. Does the second one have less weight on the original 75 tokens or something? or does concatenation work flawlessly and can just simply overcome the 75 token limit and also add the extra tokens?

rustic garnet Jul 14, 2023, 3:13 PM

#

you will get different results when using no padding. However, it would be more performance efficient without padding for sure
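
A minimal sketch of the fixed-length padding being discussed, assuming a prompt is wrapped in start/end markers and then filled with a pad id up to 77 positions. `pad_to_context` is a hypothetical helper, and the ids in the example call are placeholders, not real vocabulary ids; the actual pad id depends on the tokenizer.

```python
def pad_to_context(token_ids, bos_id, eos_id, pad_id, context_len=77):
    """Wrap a prompt in BOS/EOS and pad it to a fixed context length.

    All ids are parameters because they depend on the tokenizer in use.
    """
    body = token_ids[: context_len - 2]      # leave room for BOS and EOS
    ids = [bos_id] + body + [eos_id]
    return ids + [pad_id] * (context_len - len(ids))


# Two prompt tokens (placeholder ids 11 and 12) still occupy 77 positions.
ids = pad_to_context([11, 12], bos_id=1, eos_id=2, pad_id=0)
print(len(ids), ids[:6])  # 77 [1, 11, 12, 2, 0, 0]
```

Dropping the padding would shrink the sequence the cross attention sees, which is why results change even though the prompt text is the same.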

sudden cliff Jul 14, 2023, 3:13 PM

#

LLMs don't work in such a way that they consider a catalog to be a cat and a log

elfin cobalt Jul 14, 2023, 3:14 PM

#

visual glade it's named but some nodes might have the same name for multiple outputs

Honestly I feel like that shouldn't be allowed...

rustic garnet Jul 14, 2023, 3:14 PM

#

sudden cliff LLMs don't work in such a way that they consider a catalog to be a cat and a log

hm? I never claimed that

strange mist Jul 14, 2023, 3:14 PM

#

visual glade it's named but some nodes might have the same name for multiple outputs

hi dude, I was just messing around with comfyui, it's really powerful ✨
do you have a discord / community somewhere?

visual glade Jul 14, 2023, 3:15 PM

#

strange mist hi dude, I was just messing around with comfyui, it's really powerful ✨ do you ...

yes it's on matrix: https://app.element.io/#/room/%23comfyui_space%3Amatrix.org

rustic garnet Jul 14, 2023, 3:15 PM

#

azure oxide ive always wondered about token concatenation, how does it work in regards to no...

actually I have no clue how the >75 token prompt extension in auto111 works. I don't like to look into their messy code

strange mist Jul 14, 2023, 3:15 PM

#

visual glade yes it's on matrix: https://app.element.io/#/room/%23comfyui_space%3Amatrix.org

thanks 👍

visual glade Jul 14, 2023, 3:15 PM

#

elfin cobalt Honestly I feel like that shouldn't be allowed...

there's some cases where a node can return 2 of the same types of objects in that case it makes sense for the name to be the same

elfin cobalt Jul 14, 2023, 3:16 PM

#

Only if they're completely equivalent, I think.

uneven dove Jul 14, 2023, 3:17 PM

#

rustic garnet actually I have no clue how the >75 token prompt extension in auto111 works. I d...

https://github.com/BlenderNeko/ComfyUI_ADV_CLIP_emb

sudden cliff Jul 14, 2023, 3:17 PM

#

rustic garnet hm? I never claimed that

Is the core idea you're trying to convey that you basically have a prompt 'a cat licking a dishwasher' and the CLIP-L and G are creating encodings like 'a a cat cat licking licking a a dishwasher dishwasher' and cross attention is applied afterwards?

rustic garnet Jul 14, 2023, 3:18 PM

#

sudden cliff Is the core idea you're trying to convey that you basically have a prompt 'a cat...

no

#

your prompts are

[a, cat, is, licking, a, dishwasher] CLIP-G
[a, cat, is, licking, a, dishwasher] CLIP-L

and then your vectors are [a++a], [cat++cat], [licking++licking], and so on

#

but this does not mean you concatenate words or something

#

you just concatenate their vector embeddings

#

basically for SDXL they couldn't decide which text embedding works best, CLIP-L or CLIP-G, so they just used both
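
A shape-only sketch of that per-position concatenation, assuming hidden sizes of 768 for CLIP-L and 1280 for CLIP-G and 77 positions each. The encoders are replaced by random vectors (`fake_encoder` is a stand-in, not a real API), so only the bookkeeping is shown.

```python
import random

CONTEXT_LEN = 77   # fixed number of token positions
DIM_L = 768        # assumed CLIP-L hidden size
DIM_G = 1280       # assumed CLIP-G hidden size

rng = random.Random(0)


def fake_encoder(dim):
    """Stand-in for a text encoder: one random vector per token position."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(CONTEXT_LEN)]


emb_l = fake_encoder(DIM_L)   # 77 x 768
emb_g = fake_encoder(DIM_G)   # 77 x 1280

# The k-th CLIP-L vector is concatenated with the k-th CLIP-G vector.
context = [vec_l + vec_g for vec_l, vec_g in zip(emb_l, emb_g)]

print(len(context), len(context[0]))  # 77 2048
```

Nothing here checks that position k means the same word in both prompts, which is the point made above about feeding each encoder a different prompt.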

naive inlet Jul 14, 2023, 3:20 PM

#

can anyone tell me what ToBasicPipe is? i imported a workflow and it showed me that node is missing and i can't seem to find where to install it

elfin cobalt Jul 14, 2023, 3:20 PM

#

For prompt extensions, would it make any sense to just... average them?

visual glade Jul 14, 2023, 3:20 PM

#

except the output vector embedding don't match that much with the words

rustic garnet Jul 14, 2023, 3:20 PM

#

the cross attention gets one token where "cat" is encoded once by CLIP-L and once by CLIP-G

rustic garnet Jul 14, 2023, 3:20 PM

#

visual glade except the output vector embedding don't match that much with the words

yes, probably

sudden cliff Jul 14, 2023, 3:20 PM

#

that's why it doesn't matter

azure oxide Jul 14, 2023, 3:21 PM

#

naive inlet can anyone tell me what ToBasicPipe is? i imported a workflow and it showed me t...

if noone here can then i'd ask whoever you got the workflow from

uneven dove Jul 14, 2023, 3:21 PM

#

encoder1: "a dog"  --> [1, 2, 3]
encoder2: "a dog"  --> [4, 5, 6]
encoder1: "a cat"  --> [7, 8, 9]
encoder2: "a cat"  --> [10, 11, 12]

[
    [[1, 2, 3], [7, 8, 9]],  # embeddings from encoder 1
    [[4, 5, 6], [10, 11, 12]],  # embeddings from encoder 2
]

[
    [1, 2, 3, 4, 5, 6],  # embeddings for "a dog"
    [7, 8, 9, 10, 11, 12],  # embeddings for "a cat"
]

an example of how it looks.

rustic garnet Jul 14, 2023, 3:21 PM

#

sudden cliff that's why it doesn't matter

Then you would not need the tokens at all

eternal fog Jul 14, 2023, 3:21 PM

#

uneven dove no matter how many disclaimers i put on my code repos that it's highly tuned for...

Bug report: it doesn't work.

Bet you get those too.

uneven dove Jul 14, 2023, 3:21 PM

#

eternal fog Bug report: it doesn't work. Bet you get those too.

no because i'm a genius that disables the issue tracker

sudden cliff Jul 14, 2023, 3:21 PM

#

rustic garnet Then you would not need the tokens at all

you only need the tokens because of language

eternal fog Jul 14, 2023, 3:21 PM

#

uneven dove no because i'm a genius that disables the issue tracker

Lmao

sudden cliff Jul 14, 2023, 3:21 PM

#

human language

rustic garnet Jul 14, 2023, 3:21 PM

#

I doubt so

#

if you trace back which pixels are attended to which tokens you clearly see that the tokens still keep their meaning

#

in the sentence "a cat is licking a dishwasher" you will still see that the latent pixels in the image that belong to a cat are more strongly associated with the word "cat"
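
A toy sketch of that kind of tracing, assuming plain single-head scaled dot-product cross attention where each latent position is a query and each text vector is a key. The numbers are made up for illustration and are not real model activations.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_weights(queries, keys):
    """weights[i][j] = how strongly latent position i attends to text vector j."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    return [
        softmax([scale * sum(q * k for q, k in zip(query, key)) for key in keys])
        for query in queries
    ]


tokens = ["a", "cat", "dishwasher"]
keys = [[0.1, 0.1], [4.0, 0.0], [0.0, 4.0]]   # one key per text vector
queries = [[3.0, 0.0], [0.0, 3.0]]            # two latent positions

weights = cross_attention_weights(queries, keys)
for i, row in enumerate(weights):
    print(i, tokens[row.index(max(row))])
# prints:
# 0 cat
# 1 dishwasher
```

Each row of `weights` sums to 1, so a latent position splits its attention across all text vectors, padding included.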

sudden cliff Jul 14, 2023, 3:23 PM

#

you could connect the latents to the tokens in a way, but the tokens themselves aren't relevant

#

a rose is still a rose

rustic garnet Jul 14, 2023, 3:23 PM

#

maybe we just talk about different things, cause I don't use the word token precisely

#

I talk about the vectors in the cross attention

#

each vector is connected to a token in the original sentence

uneven dove Jul 14, 2023, 3:24 PM

#

just because the vectors are concatenated doesn't mean they lose their meaning, i think is what kai is trying to say. but that's precisely why it works to have "a dog" in one prompt and "national geographic" in the other.

rustic garnet Jul 14, 2023, 3:24 PM

#

and of course, some words will consist of more than one token, then their vectors are probably very similar to each other

rustic garnet Jul 14, 2023, 3:24 PM

#

uneven dove just because the vectors are concatenated doesn't mean they lose their meaning, ...

yes, exactly

#

I think it works because you can mess up with SDXL in soo many ways and it still works

#

and yes, sometimes messing up makes it even better

#

I just say that it is strange because you technically align both prompts with each other token by token, although this alignment has no meaning

sudden cliff Jul 14, 2023, 3:26 PM

#

I guess I'm trying to understand what kai is saying means wrt the 2 clips

uneven dove Jul 14, 2023, 3:26 PM

#

yeah that's why i said it's not THE way, it's just Different, and that allows you to access a wider subset of the data distribution than just doing it the same way every time you prompt the models.

hushed bobcat Jul 14, 2023, 3:26 PM

#

my refiner doesnt run automatically in comfyui, is there a button I am missing?

rustic garnet Jul 14, 2023, 3:26 PM

#

for example I wonder if the following works similarly well:

sudden cliff Jul 14, 2023, 3:27 PM

#

uneven dove yeah that's why i said it's not THE way, it's just Different, and that allows yo...

That is a good explanation, I thought maybe they kept CLIP-L because of the knowledge loss witnessed with OpenCLIP in 2.x

boreal bough Jul 14, 2023, 3:27 PM

#

hushed bobcat my refiner doesnt run automatically in comfyui, is there a button I am missing?

https://github.com/SytanSD/Sytan-SDXL-ComfyUI
use this for reference, to diagnose any problems

uneven dove Jul 14, 2023, 3:27 PM

#

OpenCLIP didn't have much knowledge loss, if anything it knows more and is more precise than CLIP-L

rustic garnet Jul 14, 2023, 3:27 PM

#

CLIP-L: "a cat is licking a dishwasher BLANK BLANK"
CLIP-G: "BLANK BLANK BLANK BLANK BLANK BLANK national geographics"

#

anyways, I guess there is still a lot to experiment

sudden cliff Jul 14, 2023, 3:28 PM

#

rustic garnet CLIP-L: "a cat is licking a ...

Even if it's vectors and not words, do they really empty fill the vector space like that?

rustic garnet Jul 14, 2023, 3:28 PM

#

I just wanted to say that it's not so obvious that you can use two different prompts and, therefore, I don't find it shocking that auto1111 has not implemented that yet

rustic garnet Jul 14, 2023, 3:28 PM

#

sudden cliff Even if it's vectors and not words, do they really empty fill the vector space l...

yes

sudden cliff Jul 14, 2023, 3:29 PM

#

rustic garnet yes

OK that IS a bit funny then

#

I get what you're saying

rustic garnet Jul 14, 2023, 3:29 PM

#

it's extremely inefficient, but I think you cannot simply drop that

uneven dove Jul 14, 2023, 3:29 PM

#

sudden cliff Even if it's vectors and not words, do they really empty fill the vector space l...

that's how unconditional guidance works

#

it ensures you still get unseen prompt features aiui

stray mantle Jul 14, 2023, 3:29 PM

#

ComfyUI SDXL 0.9

rustic garnet Jul 14, 2023, 3:29 PM

#

cause due to the transformer layers you not only change the text tokens but also the blank tokens

uneven dove Jul 14, 2023, 3:30 PM

#

stray mantle ComfyUI SDXL 0.9

SDXL loves forehead adornments

rustic garnet Jul 14, 2023, 3:30 PM

#

like the embeddings for BLANK might still contain knowledge about "a cat is licking a dishwasher"

uneven dove Jul 14, 2023, 3:30 PM

#

when you misalign the timesteps of the two models, it does that reliably

stray mantle Jul 14, 2023, 3:30 PM

#

uneven dove SDXL loves forehead adornments

I asked for it though lol

uneven dove Jul 14, 2023, 3:30 PM

#

oh, i was just stating something randomly

rustic garnet Jul 14, 2023, 3:31 PM

#

so I guess when you would remove all blank tokens, the model loses expressive power

boreal bough Jul 14, 2023, 3:31 PM

#

uneven dove SDXL loves forehead adornments

™ terms and conditions apply

uneven dove Jul 14, 2023, 3:31 PM

#

forehead jewels

#

misaligned timesteps are badass

#

it kind of "cracks" the image apart

#

you can see these thick black sharp lines form on faces quite often

uneven dove Jul 14, 2023, 3:33 PM

#

@visual glade one of the best aspects of AUTOMATIC1111 getting SDXL support is that it seems to be putting a rush on resolving issues that cropped up with SD 2.0 support there, and were never resolved

#

like models not loading the correct VAE or having hidden errors that just silently fail to load the model, fallback to the prev model etc

#

idk why --no-half-vae isn't the default now, the only model that works with a half-precision VAE is 1.5

visual glade Jul 14, 2023, 3:35 PM

#

probably because that's their solution to the high vram usage of the vae

sudden cliff Jul 14, 2023, 3:36 PM

#

uneven dove it ensures you still get unseen prompt features aiui

So is that to say if you could diffuse on just the tokens inputted you'd basically get cat, licking, dishwasher related output only, twisted and stretched to fill every latent?

uneven dove Jul 14, 2023, 3:36 PM

#

visual glade probably because that's their solution to the high vram usage of the vae

it has VAE tiling and slicing but they're not enabled by default either kek

boreal bough Jul 14, 2023, 3:36 PM

#

sudden cliff So is that to say if you could diffuse on just the tokens inputted you'd basical...

...tempting

visual glade Jul 14, 2023, 3:37 PM

#

I'm pretty sure it doesn't actually and it's an extension

uneven dove Jul 14, 2023, 3:37 PM

#

sudden cliff So is that to say if you could diffuse on just the tokens inputted you'd basical...

the fewer tokens you have, the less there is to "pay attention to" and Weird Stuff Happens

sudden cliff Jul 14, 2023, 3:37 PM

#

rustic garnet so I guess when you would remove all blank tokens, the model loses expressive po...

Could it be since the models are trained with tons of empty tokens, that 'bulk knowledge' goes into 'empty token space'

uneven dove Jul 14, 2023, 3:38 PM

#

sudden cliff Could it be since the models are trained with tons of empty tokens, that 'bulk k...

that is what caption dropout aims to do

#

the zero or one vectors are tweaked by the inputs that have no captions to them - the empty caption is replaced with all zeroes or all ones, depending on the encoder, since they use different tokenizers

#

this i think is how a lot of models end up improving their negative latent space so that you don't need negative prompts

#

you can run the text encoder on very high quality data with caption dropout around 5-10% and it definitely stops needing negative prompts. but without a very diverse set of captioned images, you will start losing knowledge that exists in this 'empty space'
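
A minimal sketch of caption dropout as a data-preparation step, assuming it simply means replacing a caption with the empty string at some rate before training. `apply_caption_dropout` is a hypothetical helper and the 10% rate is taken from the upper end of the range mentioned above.

```python
import random


def apply_caption_dropout(captions, rate, seed=0):
    """Replace each caption with an empty string with probability `rate`."""
    rng = random.Random(seed)
    return ["" if rng.random() < rate else c for c in captions]


captions = [f"caption {i}" for i in range(1000)]
dropped = apply_caption_dropout(captions, rate=0.1)

# Fraction of samples that will be trained as "unconditional" examples.
empty_fraction = sum(c == "" for c in dropped) / len(dropped)
print(empty_fraction)
```

The samples that end up with an empty caption are the ones that shape the model's unconditional behaviour, which is the 'empty space' being talked about.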

boreal bough Jul 14, 2023, 3:40 PM

#

sudden cliff So is that to say if you could diffuse on just the tokens inputted you'd basical...

the answer is yes XD

uneven dove Jul 14, 2023, 3:40 PM

#

yep

#

that's the Weird Stuff i alluded to

boreal bough Jul 14, 2023, 3:40 PM

#

there's even a second dishwasher there!XD

sudden cliff Jul 14, 2023, 3:40 PM

#

boreal bough the answer is yes XD

Glad to see there's no non-cat non-dishwasher in the latent space

uneven dove Jul 14, 2023, 3:41 PM

#

it's just that they didn't ask for a coherent cat, so, that didn't happen

#

when you don't use classifier free guidance, your output is pretty well-pinned to the prompt

#

if you don't ask for something, it doesn't happen
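
For reference, the usual classifier-free guidance combination, sketched on plain lists and assuming the standard form `uncond + scale * (cond - uncond)`. The vectors are toy stand-ins for the two noise predictions, and `cfg_combine` is a hypothetical helper name.

```python
def cfg_combine(uncond, cond, scale):
    """Classifier-free guidance: move from the unconditional prediction
    toward (and past) the conditional one by a factor of `scale`."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


uncond = [0.0, 1.0, 2.0]   # prediction for the empty prompt
cond = [1.0, 1.0, 0.0]     # prediction for the actual prompt

no_guidance = cfg_combine(uncond, cond, 1.0)
guided = cfg_combine(uncond, cond, 7.0)
print(no_guidance)  # [1.0, 1.0, 0.0]
print(guided)       # [7.0, 1.0, -12.0]
```

With a scale of 1 the unconditional term cancels out and only the prompt-conditioned prediction is left, which matches the "if you don't ask for something, it doesn't happen" behaviour.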

#

unseen prompt features? fuck 'em. never needed em 😄

autumn forum Jul 14, 2023, 3:42 PM

#

boreal bough the answer is yes XD

Bruh that’s a washing machine not a dishwasher😂

coral orchid Jul 14, 2023, 3:42 PM

#

dish, washer

uneven dove Jul 14, 2023, 3:42 PM

#

autumn forum Bruh that’s a washing machine not a dishwasher😂

that's a hybrid clothes dishwasher

sudden cliff Jul 14, 2023, 3:42 PM

#

coral orchid dish, washer

Is that really what happened here?

uneven dove Jul 14, 2023, 3:43 PM

#

have you not seen the way my family eats? a hybrid dish/clothes-washer would be awesome

coral orchid Jul 14, 2023, 3:43 PM

#

sudden cliff Is that really what happened here?

I don't know the exact tokenization but it wouldn't surprise me if it were something like that; lots of washer associations out there that don't have to do with dishwashers...

sudden cliff Jul 14, 2023, 3:43 PM

#

and is that because of insufficient self-attention?

uneven dove Jul 14, 2023, 3:44 PM

#

coral orchid I don't know the exact tokenization but it wouldn't surprise me if it were somet...

trying to get a powerful dishwasher and the outputs now make a lot more sense to me. it always seems to be a powerwasher version of a dishwasher

grizzled warren Jul 14, 2023, 3:44 PM

#

Incoherent cat would suffice!
Cross Attention, probably 🤣

uneven dove Jul 14, 2023, 3:44 PM

#

sudden cliff and is that because of insufficient self-attention?

oh my god lmao please try and get the self-attention guidance pipeline working with SDXL

coral orchid Jul 14, 2023, 3:44 PM

#

what I like to do for fun with negative prompts... generate with no prompt, and then add a negative based on what the unconditional output looks like

uneven dove Jul 14, 2023, 3:44 PM

#

SAG scaling never worked for 1.5 or 2.0 or 2.1 but DAMN IT things can be different

sudden cliff Jul 14, 2023, 3:44 PM

#

uneven dove oh my god lmao please try and get the self-attention guidance pipeline working w...

oh so is clip not doing that at all when encoding?

uneven dove Jul 14, 2023, 3:45 PM

#

read the SAG paper

sudden cliff Jul 14, 2023, 3:45 PM

#

screen actors guild?

uneven dove Jul 14, 2023, 3:45 PM

#

https://arxiv.org/abs/2210.00939

boreal bough Jul 14, 2023, 3:45 PM

#

autumn forum Bruh that’s a washing machine not a dishwasher😂

doth thee question that ai gods understanding of the dishwasher?

green python Jul 14, 2023, 3:45 PM

#

yeeeesss, i managed to do this after 20 attempts lol, someone on reddit said it was impossible

3_elephants_stacked_standing_on_top_of_eachother_standing_ON_TOP_of_eachother_in_the_style_of_midjourney_extremly_detailed_digital_art_ci_steps-43_width-768_height-1344_seed-0ts-1689349421_idx-0.png

coral orchid Jul 14, 2023, 3:46 PM

#

uneven dove https://arxiv.org/abs/2210.00939

https://github.com/KU-CVLAB/Self-Attention-Guidance

sudden cliff Jul 14, 2023, 3:46 PM

#

uneven dove https://arxiv.org/abs/2210.00939

thanks ok

uneven dove Jul 14, 2023, 3:46 PM

#

green python yeeeesss, i managed to do this after 20 attempts lol, someone on reddit said it ...

impossible in a reasonable time limit on a 1660 maybe

#

each image takes 4 minutes, so that's about an hour and 20 minutes for 20 attempts

uneven dove Jul 14, 2023, 3:47 PM

#

coral orchid https://github.com/KU-CVLAB/Self-Attention-Guidance

haha. they are so non-responsive to issues. i have already opened some. that code, does not work.

#

https://huggingface.co/spaces/susunghong/Self-Attention-Guidance/discussions/2

lament rune Jul 14, 2023, 3:47 PM

#

uneven dove haha. they are so non-responsive to issues. i have already opened some. that cod...

SAG? More like SAD!

sudden cliff Jul 14, 2023, 3:47 PM

#

I had no idea that there wasn't a self-attention component to all these diffusion impl. I realize that self-attention wouldn't be perfect, but thought it was doing some.

lament rune Jul 14, 2023, 3:48 PM

#

It was really cool when it worked

uneven dove Jul 14, 2023, 3:48 PM

#

SAG only works on v1.4 with non-square resolutions, on v1.5 with square resolutions, and nowhere else

green python Jul 14, 2023, 3:48 PM

#

uneven dove impossible in a reasonable time limit on a 1660 maybe

this image was created with the bot, but i have a 2070 super 8gb vram and with 5 steps base and 15 on refiner it's 15 sec

uneven dove Jul 14, 2023, 3:48 PM

#

i got tired of keeping track of which models do/don't work with it.

green python Jul 14, 2023, 3:48 PM

#

the guy on reddit said that sdxl wasn't capable of creating that image

uneven dove Jul 14, 2023, 3:49 PM

#

green python this image was created with the bot, but i have a 2070 super 8gb vram and with 5...

the 1660 has no Fp16 support so it runs in single-precision mode (fp32) and uses more VRAM, which hits the 6GB VRAM limit sooner, which puts more burden on memory bandwidth/latency due to the extreme number of page transfers
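
A back-of-the-envelope sketch of why precision matters for VRAM, counting only the weights at 2 bytes per parameter for fp16 and 4 for fp32. The parameter count is an assumption (roughly 2.6 billion for the base UNet), and activations, the VAE, and the text encoders are ignored.

```python
def weights_gib(num_params, bytes_per_param):
    """Memory needed just to hold the weights, in GiB."""
    return num_params * bytes_per_param / 2**30


NUM_PARAMS = 2_600_000_000   # assumed parameter count, for illustration only

fp16_gib = weights_gib(NUM_PARAMS, 2)   # half precision: 2 bytes per weight
fp32_gib = weights_gib(NUM_PARAMS, 4)   # single precision: 4 bytes per weight

print(f"fp16: {fp16_gib:.2f} GiB, fp32: {fp32_gib:.2f} GiB")
```

Under these assumptions the fp32 weights alone are larger than a 6GB card, so parts of the model have to be moved in and out of VRAM during sampling.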

green python Jul 14, 2023, 3:49 PM

#

uneven dove the 1660 has no Fp16 support so it runs in single-precision mode (fp32) and ...

that sucks

uneven dove Jul 14, 2023, 3:49 PM

#

so a single image takes 4 minutes on a 1660

#

yeah it sucks but it's better than what LLaMA would do on that hardware, which is, nothing.

sudden cliff Jul 14, 2023, 3:49 PM

#

Does anyone know, does even encoding do self-attention?

#

encoding ONLY

#

in these impl

boreal bough Jul 14, 2023, 3:50 PM

#

"sir mitten, a 1 year old kitten, taking on the adventure of prevailing over its archnemesis the dishwasher "sir brrrrrrs a lot""

uneven dove Jul 14, 2023, 3:50 PM

#

Self-Attention Guidance (SAG) is an advanced method that uses a model's own attention maps to improve the generated images. You can think of an attention map as a heat map that shows the parts of the image that the model is currently focusing on. By blurring only these parts of the image, the model is better able to focus on the most important features of the image, which leads to better results.

green python Jul 14, 2023, 3:50 PM

#

fine-tuned sdxl will be the same as midjourney 5.2 and sometimes maybe a bit better

urban fjord Jul 14, 2023, 3:52 PM

#

SDXL, fine-tuned or not, isn't dependent on the developers to develop new features like with MidJourney. People seem to celebrate new features in MidJourney that have been in SD for a year.

boreal bough Jul 14, 2023, 3:52 PM

#

green python fine-tuned sdxl will be the same as midjourney 5.2 and sometimes maybe a bit be...

finetuned sdxl will be expensive as hell, as you prob dont want to run it 24gb vram - essentially locking you out of prosumer options, now that the A6000 is also more expensive...

sudden cliff Jul 14, 2023, 3:53 PM

#

OK so what I'm asking is this, if I typed 'a cat licking a dishwasher' I could maybe type 'ing cat a washer dish lick a' and the only difference would be the vectors having a different order, there's no attention applied prior to sampling?

uneven dove Jul 14, 2023, 3:53 PM

#

urban fjord SDXL fine-tuned or not isn't dependent on the developers to develop new features...

SD is more esoteric

sudden cliff Jul 14, 2023, 3:53 PM

#

sudden cliff OK so what I'm asking is this, if I typed 'a cat licking a dishwasher' I could m...

(I understand it's not equivalent)

green python Jul 14, 2023, 3:53 PM

#

will i be able to run finetuned sdxl in 8gb vram?

uneven dove Jul 14, 2023, 3:54 PM

#

green python will i be able to run finetuned sdxl in 8gb vram?

for inference? with the refiner and the base? maybe. with just the base? absolutely

#

some huge LoRAs might end up increasing VRAM requirements depending on how they're handled during runtime

green python Jul 14, 2023, 3:54 PM

#

uneven dove for inference? with the refiner and the base? maybe. with just the base? absolut...

🙏

boreal bough Jul 14, 2023, 3:55 PM

#

sudden cliff OK so what I'm asking is this, if I typed 'a cat licking a dishwasher' I could m...

instructions unclear. wet cat stuck in dish

sudden cliff Jul 14, 2023, 3:55 PM

#

boreal bough instructions unclear. wet cat stuck in dish

Yeah but that's because the model was trained with the tokens in a non-fucked order 🙂

#

So that's where the non-equivalence comes in

green python Jul 14, 2023, 3:55 PM

#

stable doodle is really good

sudden cliff Jul 14, 2023, 3:55 PM

#

I always assumed the encodings would be wholly different

#

because of LLM self-attention

#

but now I see it's not used that way

#

I guess

#

it's raw encoding only...

boreal bough Jul 14, 2023, 3:56 PM

#

sudden cliff Yeah but that's because the model was trained with the tokens in a non-fucked or...

at least I can confirm that teaching a lora with a lack of order in words works - under the conditions that you add commas (and turn on shuffle!)

uneven dove Jul 14, 2023, 3:56 PM

#

transformer models are just different

#

almost nothing transfers over

sudden cliff Jul 14, 2023, 3:56 PM

#

OK whelp

inner ruin Jul 14, 2023, 3:56 PM

#

ahhh finally got a LoRA to kind of work, maybe I overbaked it? 1e-5 LR and 90 epochs

uneven dove Jul 14, 2023, 3:57 PM

#

oh hell yeah you overbaked the living fuck out of that

inner ruin Jul 14, 2023, 3:57 PM

#

but it wouldn't capture the face otherwise

#

I tried left and right

uneven dove Jul 14, 2023, 3:57 PM

#

that's ... surprising lol

sudden cliff Jul 14, 2023, 3:57 PM

#

OK well now I'm sad we don't have self-attention at any part

inner ruin Jul 14, 2023, 3:57 PM

#

uneven dove that's ... surprising lol

what do you suggest?

boreal bough Jul 14, 2023, 3:57 PM

#

inner ruin ahhh finally got a LoRA to kind of work, maybe I overbaked it? 1e-5 LR and 90 ep...

pic?

uneven dove Jul 14, 2023, 3:57 PM

#

@sudden cliff same lol it was pretty good for 1.4

glossy tusk Jul 14, 2023, 3:57 PM

#

hi

uneven dove Jul 14, 2023, 3:57 PM

#

@inner ruin i was going to suggest you ping Caith 😛

sudden cliff Jul 14, 2023, 3:58 PM

#

OK now I'm really wondering how DeepFloyd IF did as well as it did. Did that have SAG?

sudden cliff Jul 14, 2023, 3:59 PM

#

sudden cliff OK now I'm really wondering how DeepFloyd IF did as well as it did. Did that hav...

Because if DF IF / imagen work as well as they do without SAG, that is very interesting

inner ruin Jul 14, 2023, 3:59 PM

#

boreal bough pic?

it only works kind of though. The moment I change the prompt too much it just defaults back to random girl

boreal bough Jul 14, 2023, 3:59 PM

#

ah... yeah. face

#

easiest solution is overfitting, for now

inner ruin Jul 14, 2023, 4:00 PM

#

boreal bough ah... yeah. face

it's weird cause I used to get faces really well in 2.1 and 1.5

boreal bough Jul 14, 2023, 4:00 PM

#

while there are better options, overfitting is a lot easier for now

inner ruin Jul 14, 2023, 4:00 PM

#

boreal bough easiest solution is overfitting, for now

so literally do what I'm doing? 1e-5 and 90 epochs? lol

sudden cliff Jul 14, 2023, 4:00 PM

#

https://www.assemblyai.com/blog/how-imagen-actually-works/

#

wonder if accurate...

boreal bough Jul 14, 2023, 4:01 PM

#

inner ruin it's weird cause I used to get faces really well in 2.1 and 1.5

base -> undertrained on faces
refiner -> hella overtrained on faces
it averages out. but you want to plug your lora into the middle of that ideally, which doesnt exist

v1.0 prob fixes that, and then training faces should become easy

inner ruin Jul 14, 2023, 4:01 PM

#

boreal bough base -> undertrained on faces refiner -> hella overtrained on faces it averages ...

yeah I can't train on the refiner right?

sudden cliff Jul 14, 2023, 4:01 PM

#

inner ruin Jul 14, 2023, 4:01 PM

#

boreal bough base -> undertrained on faces refiner -> hella overtrained on faces it averages ...

is your suggestion to just... wait? lol

sudden cliff Jul 14, 2023, 4:01 PM

#

ok DF IF DOES use SAG

grizzled warren Jul 14, 2023, 4:02 PM

#

urban fjord SDXL fine-tuned or not isn't dependent on the developers to develop new features...

The problems with MJ don't even start with advanced tools. It's mediocre when it comes to following the prompt, either due to the models they run or due to all the sneaky additions they make under the hood. SD doesn't do that.

sudden cliff Jul 14, 2023, 4:02 PM

#

https://tenor.com/view/pop-team-epic-i-dont-know-i-dont-understand-i-dont-get-it-nod-gif-25890418

inner ruin Jul 14, 2023, 4:02 PM

#

grizzled warren The problem of MJ don't even start with advanced tools. It's mediocre when it co...

Midjourney has an LLM prompt expander running in the background

boreal bough Jul 14, 2023, 4:02 PM

#

nop.
so easiest solution I found for now, is overtrain the face only - base model wont break for A LONG time, so you shouldn't have issues there, then around 600% it makes nice faces

grizzled warren Jul 14, 2023, 4:03 PM

#

inner ruin Midjourney has an LLM prompt expander running in the background

SD can have it too. The difference is, you can control that here, but can't control it there.

inner ruin Jul 14, 2023, 4:03 PM

#

boreal bough nop. so easiest solution I found for now, is overtrain the face only - base mode...

oh ok, which parameters did you find work best? and how many training images?

boreal bough Jul 14, 2023, 4:03 PM

#

inner ruin yeah I can't train on the refiner right?

nop to refiner
yes to wait - not worth producing a good workflow, for something that fixes itself in 4 days

inner ruin Jul 14, 2023, 4:03 PM

#

grizzled warren SD can have it too. The difference is, you can control that here, but can't cont...

yeah exactly. They were very smart though because 1.5 was so hard to prompt, so the LLM layer made their product more accessible

boreal bough Jul 14, 2023, 4:03 PM

#

inner ruin oh ok, which parameters did you find work best? and how many training images?

1e-3
8/1 rank/alpha
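A rough sketch of what "8/1 rank/alpha" means numerically, assuming the common LoRA convention where the low-rank update is scaled by `alpha / rank`. The layer sizes and variable names are made up for illustration.

```python
import numpy as np


def lora_delta(down, up, alpha):
    """Weight update contributed by a LoRA pair, scaled by alpha / rank."""
    rank = down.shape[0]
    return (alpha / rank) * (up @ down)


rng = np.random.default_rng(0)
d_out, d_in, rank, alpha = 64, 32, 8, 1.0   # hypothetical layer size

down = rng.normal(size=(rank, d_in))        # "A" matrix, randomly initialised
up = np.zeros((d_out, rank))                # "B" matrix, zero-initialised so training starts at delta = 0

scale = alpha / rank                        # 1 / 8 = 0.125
lora_params = rank * (d_in + d_out)         # trainable values in the adapter
full_params = d_out * d_in                  # trainable values in the full layer
```

With rank 8 and alpha 1 the raw product is multiplied by 0.125, which is one plausible reason a learning rate as high as 1e-3 stays usable with these settings.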

#

training images - 30 to eliminate any possible problems. more images = better results, up to around ~150 at which point it just takes longer
if you have fewer than 30, it can work, just experiment a bit, and see if you can avoid training the background as well XD

#

under 10, make sure your captions are good, and always caption the background!

sudden cliff Jul 14, 2023, 4:05 PM

#

Final actual question for anyone that knows:
Is the lack of SAG why MJ, SD, SDXL, Dall-E2 cannot do 'a boy with red hair and a girl with blue hair' (extrapolate with various associations) reliably (and DF IF can)?

inner ruin Jul 14, 2023, 4:05 PM

#

that makes sense, like regular 1.5 LoRAs

inner ruin Jul 14, 2023, 4:06 PM

#

boreal bough 1e-3 8/1 rank/alpha

I find this surprising though, if the goal is to overbake. Do you just run it for like 200 epochs or something?

rustic garnet Jul 14, 2023, 4:07 PM

#

sudden cliff OK so what I'm asking is this, if I typed 'a cat licking a dishwasher' I could m...

there is self attention. In the text encoder and also in the unet

boreal bough Jul 14, 2023, 4:08 PM

#

yep.
my 2B lora, where I trained a face as well for it - should have been done around 50 epochs, but I let it run to 350 to 'fix' the face. remaining model didn't suffer any damage though, clothing even improved and didn't get overbaked

sudden cliff Jul 14, 2023, 4:08 PM

#

rustic garnet there is self attention. In the text encoder and also in the unet

OK well everyone just got done telling me there is not

uneven dove Jul 14, 2023, 4:08 PM

#

sudden cliff Final actual question for anyone that knows: Is the lack of SAG why MJ, SD, SDXL...

In a transformer text encoder (as in GPT models), self-attention is used to capture dependencies between all words in a given text regardless of their position. For each word, it computes an attention score for all other words to determine their relative importance. The word embeddings are then weighted according to these scores to produce the final output. This mechanism allows the model to focus on relevant parts of the input sequence when generating each word in the output sequence.

The U-Net architecture is typically used in tasks such as image segmentation, where the model needs to output a pixel-wise classification of an input image. The self-attention guidance in a U-Net-like model isn't used in the same way as in a transformer model. Instead, it is used to better incorporate global context and guide the generation process in diffusion models. This guidance helps to improve the image generation quality by allowing the model to attend to different parts of the image at different stages of the generation process.
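For reference, a minimal sketch of plain scaled dot-product self-attention as described in the first paragraph above (this is ordinary self-attention, not Self-Attention Guidance). It assumes a single head and random projection matrices; all names and sizes are illustrative.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention over a (tokens, dim) matrix."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # One score per (query token, key token) pair.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)
    # Each output row is a weighted mix of the value vectors.
    return weights @ v, weights


rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))                       # 3 tokens, 4 dims
w_q, w_k, w_v = (rng.normal(size=(4, 4)) for _ in range(3))
out, weights = self_attention(x, w_q, w_k, w_v)
```

Each row of `weights` sums to 1, so every output token is a blend of all tokens, which is the "mixing" the rest of this discussion keeps coming back to.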

#

they're different and not the same form of SAG.

sudden cliff Jul 14, 2023, 4:08 PM

#

uneven dove In a transformer text encoder (as in GPT models), self-attention is used to capt...

OK this is what I was asking

rustic garnet Jul 14, 2023, 4:08 PM

#

sudden cliff OK well everyone just got done telling me there is not

terminalx talked about SAG (Self Attention Guidance) which has nothing to do with self attention

sudden cliff Jul 14, 2023, 4:08 PM

#

That's why I kept saying self-attention not SAG (which i didn't even know)

uneven dove Jul 14, 2023, 4:08 PM

#

i'm stupid and use the wrong words sometimes

sudden cliff Jul 14, 2023, 4:09 PM

#

OK so the encoding does use self-attention, I am satisfied then

#

I was horrified about that mainly

#

(I was thinking that other than how it was trained, token order didn't matter AT ALL)

boreal bough Jul 14, 2023, 4:10 PM

#

inner ruin that makes sense, like regular 1.5 LoRAs

basically anything that isn't • face, body, anatomy, eyewear - is super easy to train. but those few things require workarounds to get working. at least faces will change to easy to train as well on v1.0

inner ruin Jul 14, 2023, 4:10 PM

#

boreal bough basically anything that isn't • face, body, anatomy, eyewear - is super easy to ...

will it change because they're merging the models / ditching the refiner?

uneven dove Jul 14, 2023, 4:10 PM

#

sudden cliff (I was thinking that other than how it was trained, token order didn't matter AT...

but the transformer layer does calculate without focusing on the order of the tokens. it's just that when words are in different order they can tokenize differently.
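A small sketch of the order question, under simplifying assumptions (identity projections, one head, no causal mask). Bare attention only reorders its outputs when the tokens are reordered, but once position vectors are added to the tokens, as CLIP's text encoder does with learned position embeddings, the result depends on order as well. The random `positions` array stands in for those embeddings.

```python
import numpy as np


def attend(x):
    """Self-attention with identity projections, enough to show the symmetry."""
    scores = x @ x.T / np.sqrt(x.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)
    return w @ x


rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))
positions = rng.normal(size=(4, 8))   # stand-in for learned position embeddings
perm = np.array([2, 0, 3, 1])

# Without positions: shuffling the input just shuffles the output rows.
no_pos_same = np.allclose(attend(tokens)[perm], attend(tokens[perm]))

# With positions: the shuffled prompt produces genuinely different vectors.
with_pos_same = np.allclose(
    attend(tokens + positions)[perm], attend(tokens[perm] + positions)
)
```

So in this sketch order reaches the model through the position vectors, in addition to any difference in how the words tokenize.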

sudden cliff Jul 14, 2023, 4:10 PM

#

uneven dove but the transformer layer does calculate without focusing on the order of the to...

Yep, that's all that matters

#

well see that's why I'm surprised that dishwasher and dish, washer wouldn't be more different

rustic garnet Jul 14, 2023, 4:11 PM

#

sudden cliff Final actual question for anyone that knows: Is the lack of SAG why MJ, SD, SDXL...

I think the text encoders were never trained for such subtle things

sudden cliff Jul 14, 2023, 4:11 PM

#

maybe it's because the CLIP isn't that huge

uneven dove Jul 14, 2023, 4:11 PM

#

i still want self-attention guidance for SDXL KEK

rustic garnet Jul 14, 2023, 4:11 PM

#

like image captions are usually quite bad and general

uneven dove Jul 14, 2023, 4:11 PM

#

sudden cliff maybe it's because the CLIP isn't that huge

dude the CLIP is enormous KEKL what do you want from it LOL

grizzled warren Jul 14, 2023, 4:11 PM

#

inner ruin yeah exactly. They were very smart though because 1.5 was so hard to prompt, so ...

But if you had something specific in mind, you had to dilute the prompt with tons of synonyms, weak supporting tokens or outright bogus input so the LLM doesn't add too much on top of the meaningful prompt. It was a good solution for inexperienced users, not so much for someone who can actually prompt 1.5 well enough. And it seems they moved in the same general direction SDXL is moving, because I heard current versions also benefit from natural prompting which is closer to an actual sentence instead of 1.5 notation.

boreal bough Jul 14, 2023, 4:11 PM

#

inner ruin will it change because they're merging the models / ditching the refiner?

yes (faces on base wont be undertrained anymore)
eyewear remains to be seen... might stay problematic, might be fixed as well
anatomy/body parts will stay hard

rustic garnet Jul 14, 2023, 4:11 PM

#

you rarely have captions like "a photo of a girl with blond hair and a boy with brown hair"

sudden cliff Jul 14, 2023, 4:11 PM

#

Well why is dish and washer not encoded more differently from dishwasher

inner ruin Jul 14, 2023, 4:11 PM

#

boreal bough yes (faces on base wont be undertrained anymore) eyewear remains to be seen... m...

interesting! How do you know this, btw?

uneven dove Jul 14, 2023, 4:11 PM

#

rustic garnet you rarely have captions like "a photo of a girl with blond hair and a boy with ...

if you caption with T5-Flan, you do 😄

#

not sure how accurate they are but it was Good Enough for Me

rustic garnet Jul 14, 2023, 4:12 PM

#

also, CLIP is trained to create a pooled embedding. You don't care about the single words in the caption, but you want to compare a complete image against a complete caption
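A hedged sketch of the kind of objective being described: a symmetric contrastive loss over pooled image and caption embeddings, where only whole-image versus whole-caption similarity is scored. The temperature value and the function names are assumptions for illustration.

```python
import numpy as np


def normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def clip_style_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric cross-entropy over a batch of pooled image/text embeddings."""
    img, txt = normalize(image_emb), normalize(text_emb)
    # logits[i, j] = similarity of image i and caption j.
    logits = img @ txt.T / temperature
    idx = np.arange(len(logits))

    def cross_entropy(l):
        l = l - l.max(axis=1, keepdims=True)
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        # The matching pair sits on the diagonal.
        return -log_probs[idx, idx].mean()

    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))


rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 16))
matched_loss = clip_style_loss(emb, emb)            # captions line up with images
mismatched_loss = clip_style_loss(emb, emb[::-1])   # captions paired with the wrong images
```

Nothing in this loss looks at per-word vectors, which fits the point that the training signal never forces individual tokens to keep a clean, separate meaning.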

urban fjord Jul 14, 2023, 4:12 PM

#

boreal bough yep. my 2B lora, where I trained a face as well for it - should have been done a...

For a moment I thought you said your LoRA model was 2GB 😄

boreal bough Jul 14, 2023, 4:12 PM

#

inner ruin interesting! How do you know this, btw?

bot is running v1.0 base only. faces are no longer under/overtrained on it. just ideal.
eyewear is still biased towards glasses in bot XD (meaning no blindfolds, or cosplay accessories)

jaunty adder Jul 14, 2023, 4:12 PM

#

I read some youtube comment about how people aren't appreciating what is possible with SDXL and something about you'll be able to make cars for shoes. And I thought really? Oh yes really.

rustic garnet Jul 14, 2023, 4:13 PM

#

so it is very likely that the transformed word embeddings in CLIP in the last layers carry a lot of information about the complete image. That's why clip skip worked so well in SD

sudden cliff Jul 14, 2023, 4:13 PM

#

rustic garnet also, CLIP is trained to create a pooled embedding. You don't care about the sin...

still doesn't explain it tho. Because if CLIP can differentiate between 'dishwasher' and 'washer' and 'dish' in encoding, then there shouldn't have been a cross-over in training nor diffusion

urban fjord Jul 14, 2023, 4:14 PM

#

boreal bough bot is running v1.0 base only. faces are no longer under/overtrained on it. just...

I hope for a "Make stuff in the face work correctly" finetune.

boreal bough Jul 14, 2023, 4:14 PM

#

urban fjord For a moment I thought you said your LoRA model was 2GB 😄

43mb 🦾

sudden cliff Jul 14, 2023, 4:14 PM

#

sudden cliff still doesn't explain it tho. Because if CLIP can differentiate between 'dishwas...

I can understand if the self-attention isn't absolute, like 90%

#

If that's the only reason, then I can remain sane

inner ruin Jul 14, 2023, 4:15 PM

#

boreal bough yes (faces on base wont be undertrained anymore) eyewear remains to be seen... m...

ok so it's just based on the empirical observation of bot 1... let's see! IDK how they're gonna release the model on Tuesday!

boreal bough Jul 14, 2023, 4:15 PM

#

urban fjord I hope for a "Make stuff in the face work correctly" finetune.

just gonna be painful to dataset - but easy to finetune
all you need is 200 images per concept they didn't add... which arent that many tbh. I just got unlucky with 2B

sudden cliff Jul 14, 2023, 4:15 PM

#

dish<-washer(.9) => encoding that is a little bit 'washer' and more the concept of 'dishwasher'

rustic garnet Jul 14, 2023, 4:15 PM

#

don't get that. The last layer of CLIP contains the pooled embedding. However, nothing stops the model from having the layer before the last layer already contain pooled embeddings

sudden cliff Jul 14, 2023, 4:16 PM

#

sudden cliff dish<-washer(.9) => encoding that is a little bit 'washer' and more the concept ...

if confirmed true I can shut up

rustic garnet Jul 14, 2023, 4:16 PM

#

for the loss function clip is trained on it would be totally fine if in the last layers all words have exactly the same embedding

#

it would be just a waste of parameters

#

but it would mean that if you use the embedding from these layers you lose the individual meaning of your words

sudden cliff Jul 14, 2023, 4:18 PM

#

rustic garnet but it would mean that if you use the embedding from these layers you loose the ...

Are you more saying that 'washer' can be carried forward with full strength sometimes or is always carried forward with full strength despite 'dishwasher' also shaping its own strength?

rustic garnet Jul 14, 2023, 4:18 PM

#

of course this is not the case. As said, if you look at self attention maps in SD you see that it can differentiate between different words in the sentence. It's still the case that sometimes words get mixed up a bit, and in "woman with blond hair and a boy with brown hair" the vector for "woman" contains both blond and brown hair information

inner ruin Jul 14, 2023, 4:18 PM

#

urban fjord I hope for a "Make stuff in the face work correctly" finetune.

I guess you could do this right now in 0.9 -> fine-tune with 200 or so faces and then make LoRAs on that model

boreal bough Jul 14, 2023, 4:19 PM

#

sudden cliff dish<-washer(.9) => encoding that is a little bit 'washer' and more the concept ...

offtopic - but you can also throw a photo of a dishwasher into a Vit-H model, and get back prompts biased towards it. Often there are supporting words that eliminate all false bias.

sudden cliff Jul 14, 2023, 4:19 PM

#

rustic garnet of course this is not the case. As said, if you look at self attention maps in S...

so in this case, what is the 'words get mixed up a bit'?

boreal bough Jul 14, 2023, 4:20 PM

#

its a dumb solution, but it works painfully well :/

sudden cliff Jul 14, 2023, 4:20 PM

#

boreal bough offtopic - but you can also throw a photo of a dishwasher into a Vit-H model, an...

interesting

#

that makes sense tho

rustic garnet Jul 14, 2023, 4:20 PM

#

sudden cliff Are you more saying that 'washer' can be carried forward with full strength some...

I'm not sure what you mean. CLIP consists of many layers. In each layer you have attention where you mix in information from other words. In the last layer your end-of-text token (the one CLIP pools from) has to contain the information about the complete sentence. Depending on which layer in CLIP you look at, your words might contain more or less context

urban fjord Jul 14, 2023, 4:21 PM

#

inner ruin I guess you could do this right now in 0.9 -> fine-tune with 200 or so faces and...

Making stuff look right that the base failed at seems to be a lot more complicated than just new concepts.

rustic garnet Jul 14, 2023, 4:21 PM

#

like the sentence "girl with blond hair and boy with brown hair". In the first layer each word is isolated from the others. The more layers you go forward, the more context is transferred to the words, such that "girl" is associated with "blond" and with "hair". In the last layer, the complete sentence has to be associated. So this means that it's very likely that in the last layers every word is associated with every word in some way

sudden cliff Jul 14, 2023, 4:21 PM

#

rustic garnet I'm not sure what you mean. CLIP consists of many layers. In each layer you have...

I think that's basically 'confirming' what I was saying about the attention. It's as you say that the 'dishwasher' concept isn't absolute in the coding, and that attention between say the word 'cat' and 'washer' also has some strength and exists as context

rustic garnet Jul 14, 2023, 4:22 PM

#

yes

#

because the output of CLIP is just a single word

sudden cliff Jul 14, 2023, 4:22 PM

#

OK yep I'm understanding then

rustic garnet Jul 14, 2023, 4:22 PM

#

which contains information about the complete sentence

#

in SD the last layer is removed and the layer before is used, where HOPEFULLY the words still have their individual meaning
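A toy sketch of the layer argument, not a real CLIP: each "layer" here simply blends every token with the average of all tokens, so deeper states carry more sentence-level context and less individual meaning. The mixing rule, sizes and names are all assumptions chosen to make the trend visible.

```python
import numpy as np


def toy_encoder(tokens, num_layers=4, mix=0.5):
    """Return the hidden state after every layer, input included."""
    hidden_states = [tokens]
    x = tokens
    for _ in range(num_layers):
        # Blend each token with the sentence average (a crude stand-in for attention).
        x = (1 - mix) * x + mix * x.mean(axis=0, keepdims=True)
        hidden_states.append(x)
    return hidden_states


def token_spread(x):
    """How far the token vectors are from their common mean."""
    return float(np.linalg.norm(x - x.mean(axis=0, keepdims=True)))


rng = np.random.default_rng(0)
states = toy_encoder(rng.normal(size=(5, 8)))
spreads = [token_spread(h) for h in states]

# "Use the layer before the last one": tokens there are still more distinct.
penultimate = states[-2]
```

With a real model loaded through Hugging Face `transformers`, the equivalent selection is usually done by requesting `output_hidden_states=True` and indexing `hidden_states[-2]`.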

sudden cliff Jul 14, 2023, 4:22 PM

#

OK tbh this WHOLE conversation though kai, I thought you were saying that there are NO associations in ANY layer

rustic garnet Jul 14, 2023, 4:22 PM

#

but it is still very likely that a little bit of attention is leaked in each word

#

sorry, I'm probably bad at explaining 😅

sudden cliff Jul 14, 2023, 4:23 PM

#

It's fine because I am just more familiar with the LLM stage

boreal bough Jul 14, 2023, 4:23 PM

#

sudden cliff interesting

forgot what it was, but essentially I wanted a very specific chinese flower dress, and obviously it couldn't make even remotely close. was gonna train a lora. then I threw it into vit-h, it gave me back an artist name XD put the artist name into the prompt as well. works 100% how i wanted it, and only produces the right dress. wtf right?
turns out there's a photographer who does nothing but photograph people in that type of dress. the weight on his name is stronger than the real name of the dress XD

sudden cliff Jul 14, 2023, 4:24 PM

#

boreal bough forgot what it was, but essentially I wanted a very specific chinese flower dres...

that's very cool and a very good idea. I have the same exact issue with not knowing the words for a specific style of fictional plane

grizzled warren Jul 14, 2023, 4:24 PM

#

sudden cliff still doesn't explain it tho. Because if CLIP can differentiate between 'dishwas...

Fun fact: the bot censors the prompt, so when you ask SDXL to generate "cucumbers on a dish", it generates this #1100170365604483202 message

rustic garnet Jul 14, 2023, 4:24 PM

#

there is CLIP Interrogator

sudden cliff Jul 14, 2023, 4:24 PM

#

grizzled warren Fun fact: the bot censors the prompt, so when you ask SDXL to generate "cucumber...

Yeah I've seen that sort of thing being discussed, but I'm mostly running locally

rustic garnet Jul 14, 2023, 4:24 PM

#

it works really nice for these cases

sudden cliff Jul 14, 2023, 4:24 PM

#

So for my job I actually created a multimodal captioning software

#

that is focused on accuracy

#

it way outperforms even KOSMOS-2 etc

grizzled warren Jul 14, 2023, 4:25 PM

#

sudden cliff Yeah I've seen that sort of thing being discussed, but I'm mostly running locall...

I do both

boreal bough Jul 14, 2023, 4:27 PM

#

rustic garnet there is CLIP Interrogator

ViT-L does not do sdxl justice, just a heads up. ViT-bigG, or ViT-H if you can

sudden cliff Jul 14, 2023, 4:27 PM

#

CLIP interrogator has a lot of different models you can run

rustic garnet Jul 14, 2023, 4:27 PM

#

I mean, diffusers has them all ;D

boreal bough Jul 14, 2023, 4:27 PM

#

rustic garnet I mean, diffusers has them all ;D

jealousy intensifies

sudden cliff Jul 14, 2023, 4:28 PM

#

Thanks all for the discussion, confirmed the bits I suspected and did actually understand but also learned a lot of things that I didn't know about at all or didn't understand

#

And I'm grateful that my whole world isn't shattered

inner ruin Jul 14, 2023, 4:29 PM

#

boreal bough *jealousy intensifies*

fwiw I get better LoRAs with kohya than with diffusers lol

idle pasture Jul 14, 2023, 4:54 PM

#

where can i prompt the sdxl 1.0?

cursive saddle Jul 14, 2023, 4:56 PM

#

"Today, with collaborators at @Google , we're excited to announce 🥳🥳HyperDreamBooth🥳 🥳! It's like DreamBooth, but smaller, faster and better. 25x faster. Think of 30 minutes vs. 14 hours for 100 models. And works on a single image!
(Thread 👇)
webpage: hyperdreambooth.github.io"

Seen on twitter

#

A new dreambooth

grizzled warren Jul 14, 2023, 4:56 PM

#

idle pasture where can i prompt the sdxl 1.0?

Locally, in a week 😜
Or in the #1100170312106127410 through #1101178553900478464 channels, any of those, SDXL Beta Bot section.

cursive saddle Jul 14, 2023, 4:58 PM

#

https://twitter.com/natanielruizg/status/1679893292618752000?t=jodO1QlFfupH6koLlAAObg&s=19

civic sigil Jul 14, 2023, 4:59 PM

#

So is it a lora or a hypernetwork

#

Results dont look that amazing, I wonder what model they used

#

I guess Im curious what the difference is between that and traditional hypernetworks

#

Not sure why they glossed over it

stray mantle Jul 14, 2023, 5:06 PM

#

grizzled warren Locally, in a week 😜 Or in the <#1100170312106127410> through <#11011785539004...

Technically 4 days 😁

rustic garnet Jul 14, 2023, 5:07 PM

#

civic sigil Results dont look that amazing, I wonder what model they used

its not about the model but about the technique

boreal bough Jul 14, 2023, 5:08 PM

#

but... 1.5/2 only, right?

#

since the old techniques no longer apply to sdxl

patent badger Jul 14, 2023, 5:09 PM

#

hey guys, I know it doesn't 100% belong here but I guess it could be related to SDXL as well,
if I'm training a lora for the openjourney v4 model, should i train the lora on the model itself or on the 1.5 base model?

rustic garnet Jul 14, 2023, 5:11 PM

#

I don't think its about SD at all

boreal bough Jul 14, 2023, 5:11 PM

#

patent badger hey guys, I know it doesn't 100% belong here but I guess it could be related to ...

the only right answer here is "depends", and there are no one line answers for either way

lusty raptor Jul 14, 2023, 5:11 PM

#

in most cases, you're better off training on the model you intend to use the lora with

#

no hard rules, of course

rustic garnet Jul 14, 2023, 5:11 PM

#

oh, I'm wrong, they applied it to SD

#

anyways. I don't think that it is so interesting either. It is very similar to an older paper by google which was doing the same just with "rank-1 lora" instead of what they call "lightweight dreambooth"

#

it might be interesting for applications and cloud services that want to create personalized images on the fly for their users

boreal bough Jul 14, 2023, 5:13 PM

#

porting it to sdxl is not the issue - rather the theory behind its speed up is probably no longer applicable to sdxl. due to the larger model, we no longer have to worry about so many of the issues of training on 1.5.
hell, I trained the same dataset on sdxl in 6 different ways to see which work, some completely wrong for the hell of it. and they all worked

rustic garnet Jul 14, 2023, 5:13 PM

#

but anyone here could just wait a few minutes longer and train a, probably much better, model using Lora

rustic garnet Jul 14, 2023, 5:14 PM

#

boreal bough porting it to sdxl is not the issue - rather the theory behind its speed up is p...

yes, I also have the feeling SDXL is easier to train than the previous versions 😄

boreal bough Jul 14, 2023, 5:14 PM

#

if it's speed you want, 2e-3 is the fastest you can go to achieve good results. While it can't be overfitted too much, that is rarely what you want to do to begin with - and then training is a speedrun

grizzled warren Jul 14, 2023, 5:15 PM

#

They said they used Stable Diffusion, but they didn't specify the version. Chances are it's either 1.5 or 2.1.

civic sigil Jul 14, 2023, 5:16 PM

#

rustic garnet yes, I also have the feeling SDXL is easier to train than the previous versions ...

Cause it already has so much knowledge to begin with probably

rustic garnet Jul 14, 2023, 5:17 PM

#

in the end their model is similar to controlnet in the sense that it uses a pre-trained network for faces. It's not exactly like controlnet, and I guess it's because the results with a controlnet were not good enough. But the point is you have to train a model that is able to finetune a model for face images

#

which means it works ONLY for faces which makes it kinda boring imo

civic sigil Jul 14, 2023, 5:17 PM

#

Ohh yeah maybe that will be useful for like phone apps to personalize AI filters and stuff

rustic garnet Jul 14, 2023, 5:17 PM

#

yes

#

I guess thats the point

civic sigil Jul 14, 2023, 5:18 PM

#

Probably exciting for some startup out there lol

grizzled warren Jul 14, 2023, 5:18 PM

#

rustic garnet yes, I also have the feeling SDXL is easier to train than the previous versions ...

It might be easier in terms of know how (idk tho, didn't even try), but it should be harder for the hardware since the model is much bigger, and possibly might need further tuning for the refiner.

civic sigil Jul 14, 2023, 5:18 PM

#

But not for me

rustic garnet Jul 14, 2023, 5:18 PM

#

maybe also game development where you get a personal avatar based on a photo and stuff like that

civic sigil Jul 14, 2023, 5:18 PM

#

Ohh yeah like you can put your pic in and it will generate a bunch of images personalized for you

rustic garnet Jul 14, 2023, 5:18 PM

#

grizzled warren It might be easier in terms of know how (idk tho, didn't even try), but it shoul...

yes and no. Its bigger, but that also means it learns faster.

civic sigil Jul 14, 2023, 5:20 PM

#

Yeah my Loras learned super quick but it takes a lot of vram

grizzled warren Jul 14, 2023, 5:20 PM

#

rustic garnet yes and no. Its bigger, but that also means it learns faster.

It might take less compute, but will require more VRAM. That's harder to achieve with consumer grade hardware.

boreal bough Jul 14, 2023, 5:22 PM

#

civic sigil Ohh yeah maybe that will be useful for like phone apps to personalize AI filters...

in that case I'd steal the microsoft solution, of applying it to a 3d face. while sounding barbaric, the results are pretty damn good, now that a bit of ai optimization was added

uneven dove Jul 14, 2023, 5:22 PM

#

jim carrey as shrek?

1689355309.1955061c700b6d3add1c78fa044c48e3bd0083.png

1689355311.4905875a3f0c05df9196a38e26ee42f78b09303.png

lusty raptor Jul 14, 2023, 5:22 PM

#

is it just me or is the comparison image in that hyperdreambooth a bit misleading?

uneven dove Jul 14, 2023, 5:23 PM

#

cursive saddle Today, with collaborators at @Google , we're excited to announce 🥳🥳HyperDreamB...

Google's research hardly ever goes anywhere until someone else picks it up, and their original ideas for dreambooth we now understand are pretty destructive and shitty

urban fjord Jul 14, 2023, 5:24 PM

#

I guess it is all single-image datasets comparisons. Any comparison made behind closed doors will always be misleading.

uneven dove Jul 14, 2023, 5:24 PM

#

it's not hard to improve on their original research paper

boreal bough Jul 14, 2023, 5:24 PM

#

I take it back. if you're ok with this, then 4e-3 is your limit XD

urban fjord Jul 14, 2023, 5:24 PM

#

But yeah I don't feel like the outputs are that good.

#

But it might have some uses still.

rocky geode Jul 14, 2023, 5:28 PM

#

boreal bough I take it back. if you're ok with this, then 4e-3 is your limit XD

that's you

#

face reveal

boreal bough Jul 14, 2023, 5:29 PM

#

casually pretends lora doesn't exist
though it definitely works a lot better in niche applications. just not generalized

though I also question their prompts, since you can't just compare "A Pixar character of a [V] face" when that prompt was never intended to work on the default model... while there IS a prompt that does work.

uneven dove Jul 14, 2023, 5:32 PM

#

i don't understand that test grid at all

#

that thing belongs in the Facebook group of scientific charts that look like shitposts

urban fjord Jul 14, 2023, 5:33 PM

#

Yeah without prompts it is kind of worthless, and where is normal LoRA...

boreal bough Jul 14, 2023, 5:33 PM

#

just finished reading. I feel bamboozled. They just made a new variant LoRA and gave it a fancier name...

uneven dove Jul 14, 2023, 5:33 PM

#

i think they want a line down the middle separating the men output from women output? looks like a god damn continuum where they gradually shift the weights

urban fjord Jul 14, 2023, 5:34 PM

#

I didn't read too much of the paper as I don't understand the fine-details too well. But if you're improving on LoRA you should compare this to the other LoRA variants.

boreal bough Jul 14, 2023, 5:35 PM

#

basically it's a 1/0.5 lora XD

urban fjord Jul 14, 2023, 5:35 PM

#

It is like me developing a new screw and making comparisons to bolts and nails but not other screws.

uneven dove Jul 14, 2023, 5:35 PM

#

welcome to the wild world of machine learning research where the comparisons don't mean anything and the demonstrations don't matter

rustic garnet Jul 14, 2023, 5:35 PM

#

boreal bough just finished reading. I feel bamboozled. They just made a new variant LoRA and ...

that was the previous paper

uneven dove Jul 14, 2023, 5:36 PM

#

@visual glade how do these work? https://github.com/AUTOMATIC1111/stable-diffusion-webui/commit/ac2d47ff4c00b041cae3d882c2832662c2c64935

rustic garnet Jul 14, 2023, 5:36 PM

#

in this paper they say directly that they do lora, but they use a hypernetwork to predict initial weights

boreal bough Jul 14, 2023, 5:36 PM

#

rustic garnet that was the previous paper

which they doubled down on...

uneven dove Jul 14, 2023, 5:36 PM

#

also, how the fuck does midjourney's bot send its partially denoised outputs to the discord message

#

😭

rustic garnet Jul 14, 2023, 5:36 PM

#

also they use a random vector factorization before doing the lora to further shrink down the number of parameters
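A speculative sketch of what such a factorization could look like, based only on the description above: frozen random matrices on the outside, a tiny trainable rank-1 pair on the inside. The dimensions `a` and `b`, the matrix names, and the initialisation are hypothetical and not taken from the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in = 320, 768      # hypothetical layer shape
a, b = 100, 50              # hypothetical inner dimensions

# Frozen random projections: generated once, never trained.
A_aux = rng.normal(size=(d_out, a)) / np.sqrt(a)
B_aux = rng.normal(size=(b, d_in)) / np.sqrt(b)

# The only trainable pieces: a rank-1 pair living in the small inner space.
a_train = rng.normal(size=(a, 1))
b_train = rng.normal(size=(1, b))

# The full-size weight update is still rank 1.
delta_w = A_aux @ a_train @ b_train @ B_aux

plain_rank1_params = d_out + d_in   # ordinary rank-1 LoRA on this layer
factorized_params = a + b           # trainable values after the random factorization
```

Under these made-up sizes the trainable count per layer drops from 1088 to 150, which is the sort of shrinkage that would explain a file in the 120 KB range.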

uneven dove Jul 14, 2023, 5:37 PM

#

i can't update an old message with a new embed

boreal bough Jul 14, 2023, 5:37 PM

#

but end result is still a 120 KB lora, right?

rustic garnet Jul 14, 2023, 5:37 PM

#

yes

#

but from this view you could say Lora is the same as dreambooth

#

when you add the lora to a model you get a normal model back
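A minimal sketch of that point, assuming the usual `alpha / rank` scaling: folding the low-rank update into the base weight gives a plain matrix of the original shape that produces the same outputs as running base and adapter side by side. Sizes and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, rank, alpha = 16, 12, 4, 4.0

W = rng.normal(size=(d_out, d_in))      # base model weight
down = rng.normal(size=(rank, d_in))    # LoRA "A"
up = rng.normal(size=(d_out, rank))     # LoRA "B"
x = rng.normal(size=(d_in,))

# Adapter kept separate: base path plus scaled low-rank path.
adapter_out = W @ x + (alpha / rank) * (up @ (down @ x))

# Adapter merged: one ordinary weight matrix, no extra parameters at inference.
W_merged = W + (alpha / rank) * (up @ down)
merged_out = W_merged @ x
```

After merging there is nothing left to mark the result as a LoRA, which is why the line between "a LoRA" and "a fine-tuned model" gets blurry from this point of view.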

boreal bough Jul 14, 2023, 5:37 PM

#

LiDBx2 would be the proper name XD

eternal fog Jul 14, 2023, 5:38 PM

#

lmao, my training has gone back to not working again. I don't understand lmao

paper phoenix Jul 14, 2023, 5:38 PM

#

uneven dove Google's research hardly ever goes anywhere until someone else picks it up, and ...

if you read the "eating our lunch" memo the person goes some way to explaining why that's the case.

uneven dove Jul 14, 2023, 5:38 PM

#

paper phoenix if you read the "eating our lunch" memo the person goes a way to explaining why ...

oh buddy i love that memo lmao

#

yeah i'm well aware of how groups like SAI and Google all burn capital just because they have it. i have always done everything i do, with much less than these groups spend. but i'm not working at their level i'm sure 😁

boreal bough Jul 14, 2023, 5:39 PM

#

uneven dove yeah i'm well aware of how groups like SAI and Google all burn capital just beca...

btw, i found out where all the stability cluster is going

uneven dove Jul 14, 2023, 5:40 PM

#

LoRA is probably the best thing Microsoft ever did

boreal bough Jul 14, 2023, 5:40 PM

#

Vit models go brrrrrrr on Stability Cluster

uneven dove Jul 14, 2023, 5:40 PM

#

oh, looked at wandb logs?

paper phoenix Jul 14, 2023, 5:40 PM

#

but it must utterly suck to be wrapped up in your own red tape and having to form an orderly queue on an idea while people outside your window are all running at it from every angle like wacky races.

uneven dove Jul 14, 2023, 5:40 PM

#

ahahahaha

#

@paper phoenix the caucus race scene from Alice in Wonderland

#

https://www.youtube.com/watch?v=_OcaJOAQVUQ

paper phoenix Jul 14, 2023, 5:41 PM

#

it's kind of NASA versus the Red Bull Flugtag and inexplicably the Flugtag is competing!

uneven dove Jul 14, 2023, 5:41 PM

#

this scene is an incredible metaphor for so much time wasting resource expenditure we have in life

shy kelp Jul 14, 2023, 5:41 PM

#

why doesnt google release an image gen I wonder

uneven dove Jul 14, 2023, 5:42 PM

#

boreal bough Vit models go brrrrrrr on Stability Cluster

https://ai.googleblog.com/2023/03/scaling-vision-transformers-to-22.html

probably copy-pasting some more Google research

paper phoenix Jul 14, 2023, 5:42 PM

#

shy kelp why doesnt google release an image gen I wonder

seem happy enough for adobe to work on that stuff honestly.

urban fjord Jul 14, 2023, 5:43 PM

#

Why does Google even research these things if no one gets to do anything with them?

visual glade Jul 14, 2023, 5:43 PM

#

uneven dove @visual glade how do these work? https://github.com/AUTOMATIC1111/stabl...

it's a fast and shitty way of converting latents to RGB

uneven dove Jul 14, 2023, 5:43 PM

#

nice

urban fjord Jul 14, 2023, 5:44 PM

#

Does ComfyUI have support for that?

visual glade Jul 14, 2023, 5:44 PM

#

https://github.com/comfyanonymous/ComfyUI/blob/master/comfy/latent_formats.py

#

yes
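
For context, a cheap latent preview of this kind can be done as a per-pixel linear projection from the 4 latent channels to RGB, skipping the VAE decoder entirely. The sketch below assumes that approach; the coefficients in `LATENT_TO_RGB` are placeholder values for illustration, not the ones shipped in A1111 or ComfyUI, and `approx_latent_to_rgb` is a hypothetical name.

```python
# Hypothetical 4x3 projection: one row per latent channel, columns are R, G, B weights.
LATENT_TO_RGB = [
    [0.35, 0.23, 0.32],
    [0.33, 0.50, 0.24],
    [-0.28, 0.18, 0.27],
    [-0.21, -0.26, -0.72],
]


def approx_latent_to_rgb(latent):
    """Map a latent (4 channels, each H rows of W floats) to an H x W grid of (r, g, b) ints."""
    height, width = len(latent[0]), len(latent[0][0])
    image = []
    for y in range(height):
        row = []
        for x in range(width):
            pixel = []
            for c in range(3):
                # Weighted sum of the 4 latent channels for this colour channel.
                value = sum(latent[ch][y][x] * LATENT_TO_RGB[ch][c] for ch in range(4))
                # Assume the projection lands roughly in [-1, 1]; rescale to [0, 1] and clamp.
                value = min(max((value + 1.0) / 2.0, 0.0), 1.0)
                pixel.append(int(round(value * 255)))
            row.append(tuple(pixel))
        image.append(row)
    return image


# A 1x2 all-zero latent, just to show the call shape.
preview = approx_latent_to_rgb([[[0.0, 0.0]] for _ in range(4)])
```

The result is at latent resolution (1/8 of the final image per side for SD-style VAEs), so it is only good for a rough preview.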

uneven dove Jul 14, 2023, 5:44 PM

#

yeah i wanna do what MJ did for image gen and show the images as they generate

#

i have a progress bar but that's boring

urban fjord Jul 14, 2023, 5:45 PM

#

Look at how Automatic1111 is doing live-preview.

paper phoenix Jul 14, 2023, 5:45 PM

#

do you mean preview every x steps?

eternal fog Jul 14, 2023, 5:45 PM

#

visual glade it's a fast and shitty way of converting latents to RGB

Fast and shitty, that should be Auto1111's motto

#

Although it's not that fast either

visual glade Jul 14, 2023, 5:45 PM

#

if you want preview in comfy it's: `--preview-method auto`

uneven dove Jul 14, 2023, 5:45 PM

#

eternal fog Fast and shitty, that should be Auto1111's motto

you should leak some documents with that idea in it and he'd probably implement your motto then

visual glade Jul 14, 2023, 5:46 PM

#

except a1111 is not fast, both diffusers and comfyui beat it in speed

#

so it's actually the slowest

uneven dove Jul 14, 2023, 5:46 PM

#

it's fast AND shitty which means it's the best at being the worst. it's like how 1+1=3.. for large values of 1

eternal fog Jul 14, 2023, 5:46 PM

#

I'm going insane wtf

#

I can't even train 512,512 anymore

paper phoenix Jul 14, 2023, 5:46 PM

#

this is a bit.... weird.

eternal fog Jul 14, 2023, 5:46 PM

#

What is happening

uneven dove Jul 14, 2023, 5:46 PM

#

eternal fog I can't even train 512,512 anymore

pepethicc

eternal fog Jul 14, 2023, 5:46 PM

#

It just does this

#

And OOM

#

But it worked 2 days ago on 1024x1024 at batch 2

uneven dove Jul 14, 2023, 5:47 PM

#

wow

urban fjord Jul 14, 2023, 5:47 PM

#

Ah, I thought there was a node for it, as I want to replace the normal conversion with the faster one since I've a bit of an issue with the VAE atm.

uneven dove Jul 14, 2023, 5:47 PM

#

it even dips into the iGPU at the end

#

lmfao

eternal fog Jul 14, 2023, 5:47 PM

#

uneven dove it even dips into the iGPU at the end

I don't have an iGPU so it doesn't, it just starts trying to use a little System Memory

uneven dove Jul 14, 2023, 5:47 PM

#

oh

#

what the hell is that technique called KEKL

eternal fog Jul 14, 2023, 5:47 PM

#

But I don't understand why it's suddenly started doing this, when I trained one the other day and it did work.

#

So why now is it fucked

uneven dove Jul 14, 2023, 5:48 PM

#

did you update?

eternal fog Jul 14, 2023, 5:48 PM

#

updated, didn't work, downgraded worked. Went to sleep, ran again, didn't work.

uneven dove Jul 14, 2023, 5:48 PM

#

if you haven't updated, my guess is there's some state file it's picking up on from the last run

eternal fog Jul 14, 2023, 5:48 PM

#

I don't save states

#

And the commands look the exact same

uneven dove Jul 14, 2023, 5:49 PM

#

well upgrading and downgrading is ... uh... well, did you look at the code changes before doing it to verify it'd be okay?

eternal fog Jul 14, 2023, 5:49 PM

#

When I say downgrade, I checked out an old commit

visual glade Jul 14, 2023, 5:49 PM

#

nvidia driver update?

uneven dove Jul 14, 2023, 5:49 PM

#

shrug

#

i've heard a lot of this "works one day, does not the next"

eternal fog Jul 14, 2023, 5:49 PM

#

visual glade nvidia driver update?

Nah on the old driver still. I tested updating the driver again and that just eats fuck loads of RAM and tries to crash my PC

uneven dove Jul 14, 2023, 5:49 PM

#

have you tried turning it off and on again?

eternal fog Jul 14, 2023, 5:50 PM

#

Many times

uneven dove Jul 14, 2023, 5:50 PM

#

hm

#

well, something changed

eternal fog Jul 14, 2023, 5:50 PM

#

Let me check to make sure it's not some corrupt cached latent or something, I'm going to clear the whole training folder.

uneven dove Jul 14, 2023, 5:50 PM

#

that's what i meant when i said some saved state

eternal fog Jul 14, 2023, 5:50 PM

#

uneven dove that's what i meant when i said some saved state

I thought you meant model state

uneven dove Jul 14, 2023, 5:50 PM

#

cached latents or aspect buckets can i guess do that too

eternal fog Jul 14, 2023, 5:50 PM

#

I'll see if this works

uneven dove Jul 14, 2023, 5:51 PM

#

well it sounds truly frustrating, i hope you figure it out, because maybe it's the same issue for all

rustic garnet Jul 14, 2023, 5:51 PM

#

eternal fog Although it's not that fast either

what should be faster than that?

eternal fog Jul 14, 2023, 5:51 PM

#

It's just bizarre

#

Goes from 1024x1024 batch 2 working

#

to 512x512 batch 1 not lmao

#

I wonder if it's doing something fucky with some cached latents and trying to load them multiple times or something

eternal fog Jul 14, 2023, 5:52 PM

#

rustic garnet what should be faster than that?

I don't understand what this is in reference to

trail bay Jul 14, 2023, 5:52 PM

#

https://ai.meta.com/blog/generative-ai-text-images-cm3leon/ just saw this pop up

rustic garnet Jul 14, 2023, 5:52 PM

#

the VAE approximation in auto1111 is super fast and a great idea

shy kelp Jul 14, 2023, 5:53 PM

#

trail bay https://ai.meta.com/blog/generative-ai-text-images-cm3leon/ just saw this pop up

noice

#

I want to see a model or method where you can get multiple subjects in on the first go with no merging

#

thats my challenge to all the eggheads

boreal bough Jul 14, 2023, 5:57 PM

#

eternal fog It just does this

Does not look healthy

eternal fog Jul 14, 2023, 5:57 PM

#

boreal bough Does not look healthy

Well when it was working it was getting to 9.8GB out of 10GB and then would stay there and be fine

lilac wren Jul 14, 2023, 5:58 PM

#

civic sigil Jul 14, 2023, 5:58 PM

#

eternal fog Well when it was working it was getting to 9.8GB out of 10GB and then would stay...

You were getting 1024x batch size 2 on 10GB?

eternal fog Jul 14, 2023, 5:59 PM

#

civic sigil You were getting 1024x batch size 2 on 10GB?

I was yes, dunno how but I was.

#

I suspect somewhere some settings on the Kohya scripts are getting fucked up

#

And it's not doing what it's telling me it's doing

civic sigil Jul 14, 2023, 5:59 PM

#

Dang I need to switch to linux, I can't get anywhere near that

eternal fog Jul 14, 2023, 5:59 PM

#

I mean it doesn't work at all anymore

#

I would OOM with a resolution of 1x1 lmao

#

I think I'm going to delete this whole thing and start again

#

Because something is obviously broken

civic sigil Jul 14, 2023, 6:00 PM

#

Been there done that

#

Derrian's repo worked for me out of the box tho

eternal fog Jul 14, 2023, 6:00 PM

#

How much VRAM?

civic sigil Jul 14, 2023, 6:00 PM

#

12GB

eternal fog Jul 14, 2023, 6:01 PM

#

Hmm, maybe I'll try that

#

https://github.com/derrian-distro/LoRA_Easy_Training_Scripts/tree/SDXL

It was this one yeah?

civic sigil Jul 14, 2023, 6:01 PM

#

Yeah it's based on Kohya, just that it installed correctly for me, unlike the actual kohya

#

It also has some nice QoL features

eternal fog Jul 14, 2023, 6:03 PM

#

I hate windows sometimes

#

It won't let me delete an empty folder because it's "In Use"

#

how... by what lmao

civic sigil Jul 14, 2023, 6:04 PM

#

What you're on windows?

eternal fog Jul 14, 2023, 6:04 PM

#

yes

civic sigil Jul 14, 2023, 6:04 PM

#

How did you get such good results

eternal fog Jul 14, 2023, 6:04 PM

#

¯_(ツ)_/¯

#

Thats what I'm trying to work out

thin nova Jul 14, 2023, 6:05 PM

#

eternal fog It won't let me delete an empty folder because it's "In Use"

an open folder or background process could be using it. i usually just reboot 🙃

civic sigil Jul 14, 2023, 6:05 PM

#

I would love to train at full res

eternal fog Jul 14, 2023, 6:05 PM

#

How would an open folder be using an open folder XD

lusty raptor Jul 14, 2023, 6:06 PM

#

eternal fog Thats what I'm trying to work out

forget the exact thing i use since it's been a while, but googling "windows unlock in use" should get you some tools for it

eternal fog Jul 14, 2023, 6:06 PM

#

I have a tool, but it does it on files and not folders

#

Because an empty folder shouldn't ever be locked lol

thin nova Jul 14, 2023, 6:06 PM

#

if you have the empty folder open, viewing the empty contents

sage basin Jul 14, 2023, 6:06 PM

#

Do you have a command line open in that folder?

thin nova Jul 14, 2023, 6:06 PM

#

or if you are inside the folder in a terminal

eternal fog Jul 14, 2023, 6:06 PM

#

It's fine, I'm just getting irritated because this thing randomly stopped working

civic sigil Jul 14, 2023, 6:07 PM

#

Are you sure it was working in the first place?

eternal fog Jul 14, 2023, 6:07 PM

#

100%, it made a working LoRA

civic sigil Jul 14, 2023, 6:07 PM

#

What if you were accidentally resizing to 512 or something

eternal fog Jul 14, 2023, 6:08 PM

#

Well it OOMs at 512 so...

#

But my tests worked, not brilliantly, but they absolutely worked.

civic sigil Jul 14, 2023, 6:09 PM

#

Hmm

eternal fog Jul 14, 2023, 6:09 PM

#

And that's bang on the character style

#

It's just overfit to all hell

civic sigil Jul 14, 2023, 6:09 PM

#

Im interested now I want to get it working on my machine too lol

eternal fog Jul 14, 2023, 6:09 PM

#

Well I'm going to try re-install it all

#

Lets see if that works again

civic sigil Jul 14, 2023, 6:10 PM

#

Good luck

delicate grotto Jul 14, 2023, 6:16 PM

#

eternal fog But my tests worked, not brilliantly, but they absolutely worked.

i think i saw this char somewhere

#

anime with bikes

lilac wren Jul 14, 2023, 6:19 PM

#

eternal fog Jul 14, 2023, 6:20 PM

#

delicate grotto i think i saw this char somewhere

I dunno what the character is, it was a style lora, as you can see from the watermarks it baked in lmao

gentle mirage Jul 14, 2023, 6:21 PM

#

eternal fog But my tests worked, not brilliantly, but they absolutely worked.

im also training a lora on wdxl rn but for some reason it wasnt learning anything

#

could you yoink me your parameters?

eternal fog Jul 14, 2023, 6:21 PM

#

gentle mirage could you yoink me your parameters?

I mean mines not working at the moment, so don't think you want them lol

gentle mirage Jul 14, 2023, 6:21 PM

#

oh? i thought it was from that pic

#

all g then

eternal fog Jul 14, 2023, 6:22 PM

#

It was

#

It was working

#

Now it doesn't

#

¯_(ツ)_/¯

gentle mirage Jul 14, 2023, 6:22 PM

#

😂

eternal fog Jul 14, 2023, 6:22 PM

#

Just re-installed all Kohyas scripts and still doesn't work

#

Exact same training settings

#

So dumb

lilac wren Jul 14, 2023, 6:22 PM

#

gentle mirage Jul 14, 2023, 6:23 PM

#

eternal fog It's just overfit to all hell

how many steps and images btw

eternal fog Jul 14, 2023, 6:24 PM

#

I don't remember tbh

#

It worked but wasn't good enough so I deleted it

green python Jul 14, 2023, 6:24 PM

#

parti vs sdxl (no cherrypicking)

eternal fog Jul 14, 2023, 6:24 PM

#

{\"resolution\": [768, 1152], \"count\": 2}, \"1\": {\"resolution\": [768, 1216], \"count\": 1}, \"2\": {\"resolution\": [832, 1088], \"count\": 2}, \"3\": {\"resolution\": [832, 1152], \"count\": 4}, \"4\": {\"resolution\": [832, 1216], \"count\": 4}, \"5\": {\"resolution\": [896, 1024], \"count\": 1}, \"6\": {\"resolution\": [1024, 896], \"count\": 1}}, \"mean_img_ar_error\": 0.010579165855970867}"

This is the info from another test one I trained, so it was using the correct 1024x1024 with bucketing

#

Thats from inside the safetensors file
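
A minimal sketch of pulling that metadata out of a LoRA file with only the standard library, assuming the usual safetensors layout (an 8-byte little-endian header length, then a JSON header with an optional `__metadata__` table). The `ss_bucket_info` key is an assumption about how the Kohya scripts name the bucket entry; adjust it to whatever key your file actually contains.

```python
import json
import struct


def read_safetensors_metadata(path):
    """Return the free-form string metadata stored in a .safetensors header."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned little-endian length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    return header.get("__metadata__", {})


def bucket_info(path, key="ss_bucket_info"):
    """Decode the bucket summary, which is itself stored as a JSON string (assumed key)."""
    raw = read_safetensors_metadata(path).get(key)
    return json.loads(raw) if raw else None
```

Only the header is read, so this works on large files without loading any tensors.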

#

So I'm so confused as to how that somehow took less than 10GB VRAM but now it takes more than 10GB VRAM to try with 512x512

inner ruin Jul 14, 2023, 6:26 PM

#

eternal fog https://github.com/derrian-distro/LoRA_Easy_Training_Scripts/tree/SDXL It was t...

is that what you're using? How did you install?

eternal fog Jul 14, 2023, 6:26 PM

#

inner ruin is that what you're using? How did you install?

No I'm using https://github.com/bmaltais/kohya_ss

But I might try that one next and see if it's any better.

inner ruin Jul 14, 2023, 6:27 PM

#

eternal fog No I'm using https://github.com/bmaltais/kohya_ss But I might try that one next...

oh nice, I'm using the kohya trainer colab https://github.com/Linaqruf/kohya-trainer/blob/main/kohya-LoRA-trainer-XL.ipynb

#

should be close enough to kohya ss

eternal fog Jul 14, 2023, 6:27 PM

#

If the Colab works I might just use that. Although it's annoying having to upload all the images

lusty raptor Jul 14, 2023, 6:28 PM

#

lolwut
https://github.com/ModelSurge/sd-webui-comfyui

inner ruin Jul 14, 2023, 6:29 PM

#

eternal fog If the Colab works I might just use that. Although it's annoying having to uplo...

I mean it runs, I get a good lora if I prompt the exact thing I wrote on the caption

#

but if I change anything it just forgets the face

#

it's dumb af

eternal fog Jul 14, 2023, 6:30 PM

#

civic sigil Derrian's repo worked for me out of the box tho

Do you know why it wants admin to do the install process?

I see no reason for it to require it

uneven dove Jul 14, 2023, 6:31 PM

#

reimagining Dragon Fruit

1689355360.5075364a3d8467b561f02556b5fec220f1816c.png

lusty raptor Jul 14, 2023, 6:31 PM

#

i tried derrian's for the first time recently and it didn't need admin

eternal fog Jul 14, 2023, 6:31 PM

#

I just ran it, it tries to change the PowerShell execution policy and then pops a UAC prompt for admin

uneven dove Jul 14, 2023, 6:32 PM

#

lusty raptor lolwut https://github.com/ModelSurge/sd-webui-comfyui

@visual glade oh good it's MIT licensed so you can take it over KEK

eternal fog Jul 14, 2023, 6:32 PM

#

Call PowerShell -NoProfile -ExecutionPolicy Bypass -Command "& {Start-Process PowerShell -ArgumentList 'Set-ExecutionPolicy Unrestricted -Force' -Verb RunAs}"

It doesn't need to do this

lusty raptor Jul 14, 2023, 6:33 PM

#

eternal fog I just ran it, it tries to change the powershell restriction policy and then a U...

i used the bat, hrmm

eternal fog Jul 14, 2023, 6:33 PM

#

Yeah so did I

#

the Bat loads up that

#

The bat triggers Installer.py

#

Which does this

#

```python
try:
    subprocess.check_call(f"{os.path.join('installables', 'change_execution_policy.bat')}")
except subprocess.SubprocessError:
    try:
        subprocess.check_call(f"{os.path.join('installables', 'change_execution_policy_backup.bat')}")
    except subprocess.SubprocessError as e:
        print(f"Failed to change the execution policy with error:\n {e}")
```

#

Which then runs another BAT that runs that code to change the policy

lusty raptor Jul 14, 2023, 6:34 PM

#

so why didn't i get a UAC prompt?

eternal fog Jul 14, 2023, 6:34 PM

#

Have you disabled it?

lusty raptor Jul 14, 2023, 6:34 PM

#

no

eternal fog Jul 14, 2023, 6:34 PM

#

Or were you already running the bat in an admin command window

lusty raptor Jul 14, 2023, 6:34 PM

#

also no

eternal fog Jul 14, 2023, 6:34 PM

#

Not sure, but it did it for me

#

And I'm not installing stuff that asks for admin for no reason

molten gull Jul 14, 2023, 6:35 PM

#

eternal fog Jul 14, 2023, 6:35 PM

#

lusty raptor also no

Actually it might be because your powershell policy is already on bypass, so it didn't need to change it.

molten gull Jul 14, 2023, 6:35 PM

#

that's weird pencils 🙂 the one in the middle is blue and black 🙂

lusty raptor Jul 14, 2023, 6:36 PM

#

oh good call

eternal fog Jul 14, 2023, 6:36 PM

#

But I don't see any powershell scripts that would need that changing to run

#

Think I'll just wait for 1.0 and improved tools before I train anything

#

I need food

uneven dove Jul 14, 2023, 6:37 PM

#

molten gull that's weird pencils 🙂 the one in the middle is blue and black 🙂

not so unusual tbh.

molten gull Jul 14, 2023, 6:37 PM

#

🙂

rustic garnet Jul 14, 2023, 6:37 PM

#

inner ruin but if I change anything it just forgets the face

I mean, that's a good thing, the model is not forgetting everything. I think you have to train on trigger words and ideally also train the text encoder

uneven dove Jul 14, 2023, 6:37 PM

#

for dreambooth, yeah training the text encoder is pretty strong, have to do it very carefully

lusty raptor Jul 14, 2023, 6:38 PM

#

eternal fog I need food

i need brain. i'm using bmaltais's repo. was wondering why i couldn't find an installer.py, except one in the venv

inner ruin Jul 14, 2023, 6:38 PM

#

rustic garnet I mean, that's a good thing, the model is not forgetting everything. I think yo...

I mean I did train on a trigger word and I leave the trigger but add "dancing" and then the whole thing doesn't work

rustic garnet Jul 14, 2023, 6:39 PM

#

hm, that's weird. I had no problems training on a subject.

#

I did first train text encoder for a few epochs and then, much longer, the unet

eternal fog Jul 14, 2023, 6:40 PM

#

lusty raptor i need brain. i'm using bmaltai's repo. was wondering why i couldn't find an ins...

Yeah that one's fine and doesn't do that.

rustic garnet Jul 14, 2023, 6:40 PM

#

I did textual inversion first, but I'm pretty sure you can skip that if you have a good trigger word

inner ruin Jul 14, 2023, 6:40 PM

#

rustic garnet I did first train text encoder for a few epochs and then, much longer, the unet

what was the script/LR/epochs/# of input images?

#

it

rustic garnet Jul 14, 2023, 6:40 PM

#

LR 5e-4

inner ruin Jul 14, 2023, 6:41 PM

#

it's funny because it still gets the demographic data (like white man with long hair) but it loses face details

lusty raptor Jul 14, 2023, 6:41 PM

#

eternal fog Yeah that one's fine and doesn't do that.

i do have derrian's installed and used it previously, but that was several months ago, and i vaguely remember admin issues. anyway, doesn't matter now i guess

rustic garnet Jul 14, 2023, 6:41 PM

#

Steps in total? I don't know anymore. Guess ~400 for text encoder and ~800 for unet. Actually, I just train until I see severe overfitting. Then I use the last model and a model a few epochs before

#

I would try with training text encoder. This is really powerful

inner ruin Jul 14, 2023, 6:42 PM

#

rustic garnet Steps in total? I don't know anymore. Guess ~400 for text encoder and ~800 for u...

gotcha! I'll try doing that in diffusers cause it lets me customize text encoder and unet

true hazel Jul 14, 2023, 6:42 PM

#

anatomy why...

uneven dove Jul 14, 2023, 6:42 PM

#

cursed seed, forbidden token, or both

rustic garnet Jul 14, 2023, 6:44 PM

#

inner ruin gotcha! I'll try doing that in diffusers cause it lets me customize text encoder...

in this case: try text encoder first and only train the OpenCLIP - it is totally sufficient. Then train unet afterwards. That was how I did it. However, I haven't done experiments with other settings yet
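
The two-phase recipe described here (text encoder first, then the UNet for longer) could be sketched as a simple step schedule. Everything below is hypothetical scaffolding: the `Phase` class, the component names, and the step counts, which just reuse the rough ~400/~800 guesses from the earlier message.

```python
from dataclasses import dataclass


@dataclass
class Phase:
    name: str
    steps: int
    trainable: tuple  # names of the components unfrozen during this phase


# Step counts are rough guesses from the chat, not tuned values.
SCHEDULE = [
    Phase("text_encoder", 400, ("open_clip_text_encoder",)),
    Phase("unet", 800, ("unet",)),
]


def trainable_at(step, schedule=SCHEDULE):
    """Return which components should receive gradients at a given global step."""
    start = 0
    for phase in schedule:
        if step < start + phase.steps:
            return phase.trainable
        start += phase.steps
    return ()  # past the end of the schedule: nothing left to train
```

A training loop would call `trainable_at(step)` each step (or at each phase boundary) and freeze everything not listed.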

true hazel Jul 14, 2023, 6:45 PM

#

leg status?

rustic garnet Jul 14, 2023, 6:45 PM

#

I have to say, SDXL often does anatomy surprisingly well. I very rarely see the wrong number of fingers

uneven dove Jul 14, 2023, 6:46 PM

#

diffusers > *

rustic garnet Jul 14, 2023, 6:46 PM

#

it still sometimes messes up composition. But it feels like they trained it well on legs and fingers

uneven dove Jul 14, 2023, 6:46 PM

#

brb reducing RAM use in my bot by 3GB 😄

normal frost Jul 14, 2023, 6:47 PM

#

Which do you guys think is better? I like V1 more.

civic sigil Jul 14, 2023, 6:48 PM

#

normal frost Which do you guys think is better? I like V1 more.

I like the second pic

lilac wren Jul 14, 2023, 6:55 PM

#

normal frost Which do you guys think is better? I like V1 more.

my prompt?

true hazel Jul 14, 2023, 6:55 PM

#

lol

#

amazing it got the shadows right

clever verge Jul 14, 2023, 7:07 PM

#

Fingers seem to be very hard to get even close to looking decent. Will 1.0 be better?

#

Sometimes there's the correct number of fingers (five) but there's a normal finger instead of a thumb. 😄

rustic garnet Jul 14, 2023, 7:10 PM

#

clever verge Sometimes there's the correct number of fingers (five) but there's a normal finger...

yes that still happens a lot. But I'm happy that the number is correct lol

quasi remnant Jul 14, 2023, 7:10 PM

#

a little but not super massive, 0.9 is pretty indicative of what you're getting

#

there is a decent quality bump in 1.0 though

#

there are specific models for inpainting hands afaik

#

also negative textual inversion embeddings you can use to reduce extra fingies & stuff, nothing's really perfect though

primal hatch Jul 14, 2023, 7:14 PM

#

clever verge Jul 14, 2023, 7:16 PM

#

Eyes from steep angle (profile shot) is hard too.

#

But it's about the same level as some good 1.5 models, I feel.

sudden cliff Jul 14, 2023, 7:17 PM

#

In Comfy, I guess if you batch it doesn't save the seed for each image in the batch generated?

clever verge Jul 14, 2023, 7:17 PM

#

I understand that it's pretty hard for a model to predict how hands can be shaped, they are very handy tools after all.

clever verge Jul 14, 2023, 7:18 PM

#

sudden cliff In Comfy, I guess if you batch it doesn't save the seed for each image in the ba...

Should be saved as noise_seed, atleast my batches are.

lilac wren Jul 14, 2023, 7:20 PM

#

glad fulcrum Jul 14, 2023, 7:20 PM

#

I was told comfyUI was faster, but its very slow.

#

#

When I press on queue... It takes a lot of time before it starts to generate

#

I have 16GB RAM and 12GB VRAM

sudden cliff Jul 14, 2023, 7:21 PM

#

clever verge Should be saved as noise_seed, atleast my batches are.

Definitely doesn't in my pipeline anyway, identical information in each image in the batch

clever verge Jul 14, 2023, 7:22 PM

#

sudden cliff Definitely doesn't in my pipeline anyway, identical information in each image in...

I have it in all modules that use the seed in the flow.

#

If you clear the UI and drag the image back does it show the seed?

#

Make sure you saved the UI first.
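
One way to check what actually got saved is to read the PNG text chunks directly. The sketch below assumes ComfyUI embeds the API-format graph as JSON in a `tEXt` chunk named `prompt`, with sampler inputs called `seed` or `noise_seed`; `read_png_text_chunks` and `find_seeds` are hypothetical helper names.

```python
import json
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data):
    """Return {keyword: text} for every tEXt chunk in raw PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, body, 4-byte CRC.
        length, kind = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if kind == b"tEXt":
            keyword, _, text = body.partition(b"\x00")
            chunks[keyword.decode("latin-1")] = text.decode("latin-1")
        pos += 12 + length
    return chunks


def find_seeds(png_bytes):
    """List (node_id, input_name, value) for inputs named seed or noise_seed (assumed names)."""
    prompt = json.loads(read_png_text_chunks(png_bytes).get("prompt", "{}"))
    return [
        (node_id, name, value)
        for node_id, node in prompt.items()
        for name, value in node.get("inputs", {}).items()
        if name in ("seed", "noise_seed")
    ]
```

Running `find_seeds(open("image.png", "rb").read())` on two images from the same batch (the filename is a placeholder) shows whether their embedded seeds differ.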

shy kelp Jul 14, 2023, 7:24 PM

#

glad fulcrum When I press on queue... It takes a lot of time before it starts to generate

did you try to make more than 1 image? 😂 it has to load the model before making the first image m8

sudden cliff Jul 14, 2023, 7:24 PM

#

I mean looking at the raw data in the output it's identical so there's no way

#

wonder how I could fix that

clever verge Jul 14, 2023, 7:27 PM

#

glad fulcrum I was told comfyUI was faster, but its very slow.

Compared to what? SD 1.5 on A1111?

vast narwhal Jul 14, 2023, 7:28 PM

#

What you think about my upscale result? Too sharp?

glad fulcrum Jul 14, 2023, 7:28 PM

#

shy kelp did you try to make more than 1 image? 😂 it has to load the model before makin...

yeah, this workflow is very slow.

#

Like you have to load the model for every image...

#

And it takes like forever

#

in fact, you have to load 2 models for every image.

clever verge Jul 14, 2023, 7:30 PM

#

Yep, load it twice for every image.

#

Other model, possibly higher resolution.

civic sigil Jul 14, 2023, 7:32 PM

#

vast narwhal What you think about my upscale result? Too sharp?

Looks great, how did you do it?

clever verge Jul 14, 2023, 7:32 PM

#

It's much faster if you run ddim as Sytan's flow has.

lilac wren Jul 14, 2023, 7:35 PM

#

Yeah @vast narwhal, looks terrible, could you share your workflow?

vast narwhal Jul 14, 2023, 7:40 PM

#

civic sigil Looks great, how did you do it?

this is three steps: hires fix + 2x ultimate upscale using juggernaut with controlnet tile, but i need more tests

eternal fog Jul 14, 2023, 7:40 PM

#

Not all SDXL 😢

#

I wonder if they'll give us a controlnet tile for SDXL

#

It might fix the detail loss

vast narwhal Jul 14, 2023, 7:42 PM

#

lilac wren Yeah @vast narwhal, looks terrible, could you share your workflow?

I will do. First i need a good realistic image workflow to test with a different thing beside the dog

shy kelp Jul 14, 2023, 7:50 PM

#

glad fulcrum Like you have to load the model for every image...

no you dont. you load it before the first image and then its loaded and wont need to load again unless you change the model. there's something wrong if it keeps reloading the same model.

steady chasm Jul 14, 2023, 7:50 PM

#

vast narwhal What you think about my upscale result? Too sharp?

Wow. This is fantastic. What was your process?

shy kelp Jul 14, 2023, 7:51 PM

#

vast narwhal What you think about my upscale result? Too sharp?

id like to know how you did that aswell if thats sdxl

#

im assuming you used a 1.5 model for upscaling

#

or 2.1 or whatever

eternal fog Jul 14, 2023, 7:52 PM

#

He did, he said he used Juggernaut

shy kelp Jul 14, 2023, 7:52 PM

#

ahh okay. i need to try that out.

boreal bough Jul 14, 2023, 7:58 PM

#

glad fulcrum Like you have to load the model for every image...

Do you have the 12gb vram available? Or is stuff in background using vram, and comfy needs to unload every time?

autumn forum Jul 14, 2023, 7:58 PM

#

Juggernaut xl will be in the works soon and oml will that be good

boreal bough Jul 14, 2023, 8:01 PM

#

I've basically finished all my lora tests. if anyone has a dataset they want me to test, feel free to ping me

uneven dove Jul 14, 2023, 8:02 PM

#

a stunning portrait of StabilityAI deepfrying the VAE
"yep, looks done?"
"better give it another hour"

1689364590.4766481007a1bd750869b63f6d47353cc9ac30.png

boreal bough Jul 14, 2023, 8:08 PM

#

@uneven dove
my favorite response to the comfy inside A1111 extension XD

elder rose Jul 14, 2023, 8:14 PM

#

Is this just control net for SDXL? #📣｜announcements message

eternal fog Jul 14, 2023, 8:15 PM

#

If you read the blog it tells you it's a T2I-Adapter

steady chasm Jul 14, 2023, 8:19 PM

#

uneven dove a stunning portrait of StabilityAI deepfrying the VAE "yep, looks done?" "better...

This guy is just pouring lava from his hands, ouch

visual glade Jul 14, 2023, 8:20 PM

#

uneven dove diffusers > *

or you can just use bf16

uneven dove Jul 14, 2023, 8:21 PM

#

that's not as much vmem savings

visual glade Jul 14, 2023, 8:21 PM

#

both are 16 bit though so it should be the same?

uneven dove Jul 14, 2023, 8:21 PM

#

it's not, the dynamic range is higher in bf16

#

https://moocaholic.medium.com/fp64-fp32-fp16-bfloat16-tf32-and-other-members-of-the-zoo-a1ca7897d407

#

also slower

#

bf16 is great but only because it is convenient in terms of development costs, same as tf32 for fp32-sensitive applications

elder rose Jul 14, 2023, 8:24 PM

#

eternal fog If you read the blog it tells you it's a T2I-Adapter

I clearly did not 😉

#

Thx

vernal cloak Jul 14, 2023, 8:37 PM

#

How long do you lot think we will have to wait until we get an anime finetuned model of SDXL, upon release?
Really looking forward to exploring XL's better understanding of context in prompts tho.

eternal fog Jul 14, 2023, 8:39 PM

#

vernal cloak How long do you lot think we will have to wait until we get a anime finetuned mo...

https://huggingface.co/hakurei/waifu-diffusion-xl

boreal bough Jul 14, 2023, 8:39 PM

#

vernal cloak How long do you lot think we will have to wait until we get a anime finetuned mo...

use Vit-H on an anime image of your choice, take that prompt, feed into sdxl, get new anime images. profit.

#

but also waifuxl for easy use

vernal cloak Jul 14, 2023, 8:42 PM

#

boreal bough use Vit-H on an anime image of your choice, take that prompt, feed into sdxl, ge...

Vit-H? Not heard of that before. Is it short for a tool or something?

boreal bough Jul 14, 2023, 8:43 PM

#

Interrogator -> Vit-L model

vernal cloak Jul 14, 2023, 8:43 PM

#

eternal fog https://huggingface.co/hakurei/waifu-diffusion-xl

And this is awesome! Fantastic news to hear that we will be getting an XL version

#

Oh right! Ok. Nice! Thanks. Haha

glad fulcrum Jul 14, 2023, 8:44 PM

#

shy kelp no you dont. you load it before the first image and then its loaded and wont nee...

But every image needs to load both models, so how can there be something wrong if it tries to load a model?

boreal bough Jul 14, 2023, 8:44 PM

#

glad fulcrum But every image needs to load both models, so how can there be something wrong i...

its supposed to stay in vram, not be removed again

#

removal should only happen if you lack enough vram
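
A toy illustration of why a two-model workflow reloads on every image when only one model fits in memory. This is a generic least-recently-used cache, not ComfyUI's actual memory management; `ModelCache`, `count_loads`, and the capacity numbers are assumptions for the example.

```python
from collections import OrderedDict


class ModelCache:
    """Keep up to `capacity` models loaded, evicting the least recently used."""

    def __init__(self, loader, capacity):
        self._loader = loader
        self._capacity = capacity
        self._models = OrderedDict()
        self.loads = 0  # how many times we had to hit the (slow) loader

    def get(self, name):
        if name in self._models:
            self._models.move_to_end(name)  # mark as most recently used
            return self._models[name]
        model = self._loader(name)
        self.loads += 1
        self._models[name] = model
        if len(self._models) > self._capacity:
            self._models.popitem(last=False)  # evict the oldest entry
        return model


def count_loads(capacity, images=3):
    """Simulate generating `images` pictures that each need base then refiner."""
    cache = ModelCache(loader=lambda name: object(), capacity=capacity)
    for _ in range(images):
        cache.get("base")
        cache.get("refiner")
    return cache.loads
```

With room for both models, `count_loads(2)` loads each model once; with room for only one, `count_loads(1)` reloads on every switch between base and refiner.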

glad fulcrum Jul 14, 2023, 8:44 PM

#

oh so 12 must not be enough I guess

boreal bough Jul 14, 2023, 8:44 PM

#

it is, based on people here

#

however stuff in background may be taking some of it

#

photoshop or similar apps

#

or other uis

thin nova Jul 14, 2023, 8:45 PM

#

i just use `--highvram`

boreal bough Jul 14, 2023, 8:45 PM

#

thin nova i just use `--highvram`

does this force it? in that case nice! 👍

glad fulcrum Jul 14, 2023, 8:45 PM

#

glad fulcrum

as you see here, I am not using any vram demanding app

glad fulcrum Jul 14, 2023, 8:46 PM

#

thin nova i just use `--highvram`

for what?

thin nova Jul 14, 2023, 8:46 PM

#

keeps the models loaded on the gpu so it doesn't load them every time

visual glade Jul 14, 2023, 8:46 PM

#

uneven dove https://moocaholic.medium.com/fp64-fp32-fp16-bfloat16-tf32-and-other-members-of-...

yeah but it's still 16 bits because they only put 7 bits in the fraction part
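
The trade-off being argued here falls out of the bit layouts: both formats use 16 bits, but fp16 splits them as 1 sign + 5 exponent + 10 fraction while bf16 uses 1 + 8 + 7. A small sketch computing the range and precision from those widths (the helper name `float_format_stats` is made up for this example):

```python
def float_format_stats(exponent_bits, fraction_bits):
    """Largest finite value and the spacing just above 1.0 for an IEEE-style binary format."""
    bias = 2 ** (exponent_bits - 1) - 1
    # Largest normal number: all-ones fraction at the largest non-reserved exponent.
    max_finite = (2 - 2 ** -fraction_bits) * 2.0 ** bias
    # Gap between 1.0 and the next representable value.
    epsilon = 2.0 ** -fraction_bits
    return max_finite, epsilon


fp16 = float_format_stats(exponent_bits=5, fraction_bits=10)
bf16 = float_format_stats(exponent_bits=8, fraction_bits=7)
```

fp16 tops out at 65504 with finer steps near 1.0; bf16 reaches roughly 3.4e38 with coarser steps, which is the "higher dynamic range" point. Memory per value is identical either way.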

glad fulcrum Jul 14, 2023, 8:46 PM

#

thin nova i just use `--highvram`

Where should I add that?

visual glade Jul 14, 2023, 8:46 PM

#

as long as the hardware supports it speed should be the same as fp16

glad fulcrum Jul 14, 2023, 8:47 PM

#

here?

thin nova Jul 14, 2023, 8:47 PM

#

yeah

glad fulcrum Jul 14, 2023, 8:47 PM

#

ok will try

eternal fog Jul 14, 2023, 8:49 PM

#

lmao, I tried out a tiled sampler node

#

It did not go well

uneven dove Jul 14, 2023, 8:49 PM

#

visual glade as long as the hardware supports it speed should be the same as fp16

it ain't

#

not on a 4090, an A100, or an A6000

#

maybe on a TPU it works better... hm

eternal fog Jul 14, 2023, 8:52 PM

#

ok this time it worked, as in it made something, but it does not keep it coherent at all

stray mantle Jul 14, 2023, 8:53 PM

#

glad fulcrum When I press on queue... It takes a lot of time before it starts to generate

ComfyUI loads the models at the 1st generation, while A1111 loads them at app startup, so ComfyUI is faster to start but the 1st gen is slower.

eternal fog Jul 14, 2023, 8:54 PM

#

I think it's loading in the Text Encoders that takes time

#

As it takes that extra time every time you change the prompt

static prawn Jul 14, 2023, 8:55 PM

#

feel like prompting on sdxl is pretty hard, i only get medium good results always, sometimes blurry, grainy, missing details

sage basin Jul 14, 2023, 8:56 PM

#

I've noticed it takes an extra 30~40 seconds to start generating when using a lora as well, even on subsequent generations

visual glade Jul 14, 2023, 9:00 PM

#

yeah loras are not currently handled in a very memory efficient way so if you only have 16GB ram it's going to be slow but I'm fixing that

static prawn Jul 14, 2023, 9:02 PM

#

dont understand why my results always get so blurry

green python Jul 14, 2023, 9:02 PM

#

#

Congratulations to the stability.ai team, you have done a very good job with this model

#

it's better than google's models

#

urban fjord Jul 14, 2023, 9:07 PM

#

visual glade yeah loras are not currently handled in a very memory efficient way so if you on...

Nice 😄

tight fjord Jul 14, 2023, 9:08 PM

#

hey guys, did we ever get official information regarding 1 vs 2 positive prompts and clip_g clip_l?

#

cause i tested both ways and i'm still not sure what is best

#

and similarly, ascore seems to have 0 effect at all if I change the int value of both positive and negative

vast narwhal Jul 14, 2023, 9:09 PM

#

eternal fog He did, he said he used Juggernaut

This is the full SDXL result. I just don't think it's better

rustic garnet Jul 14, 2023, 9:12 PM

#

tight fjord and similarly, ascore seems to have 0 effect at all if I change the int value of...

ascore is just for refiner

eternal fog Jul 14, 2023, 9:12 PM

#

Which of these looks better?

upbeat summit Jul 14, 2023, 9:12 PM

#

tight fjord hey guys, did we ever get official information regarding 1 vs 2 positive prompts...

I've done some tests but not enough. Right now I'm using the same prompt for CLIP_G and CLIP_L - that gives the result most coherent with what the prompt says. But I've started to try different concepts: main prompt part in CLIP_G and style words in CLIP_L. But I'm not really sure.

tight fjord Jul 14, 2023, 9:13 PM

#

something in between these 2 @eternal fog , either blurry or overly sharp to me

green python Jul 14, 2023, 9:13 PM

#

vast narwhal This is the full SDXL result. I just don't think it's better

but it's really detailed tho

sharp robin Jul 14, 2023, 9:13 PM

#

eternal fog Which of these looks better?

1

rustic garnet Jul 14, 2023, 9:13 PM

#

tight fjord cause i tested both ways and i'm still not sure what is best

me neither. It definitely changes the style of the image, but using different prompts for clip l and g made it worse in most of my cases, while in sytan's workflow it's supposed to improve quality on photorealism 🤷‍♂️

green python Jul 14, 2023, 9:13 PM

#

eternal fog Jul 14, 2023, 9:13 PM

#

sharp robin 1

Why do you think 1?

tight fjord Jul 14, 2023, 9:13 PM

#

ok thanks guys, so its still mostly speculation with no real consensus

green python Jul 14, 2023, 9:14 PM

#

eternal fog Which of these looks better?

2, it has more details in the clothes and face, 1 looks too soft

rustic garnet Jul 14, 2023, 9:14 PM

#

yes, and as we had a very long discussion a few hours ago: from a theoretical standpoint it's awkward using different prompts
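
A rough sketch of why separate prompts are awkward in theory, based on the earlier explanation that each of the 77 slots is the CLIP-L and CLIP-G encodings joined together. The hidden sizes (768 and 1280) are the commonly cited ones and are treated as assumptions here. `fake_encode` is a toy stand-in, not a real text encoder, and splitting on whitespace ignores real BPE tokenization and start/end tokens.

```python
# Assumed sizes: 77 token slots, 768 features (CLIP-L), 1280 features (CLIP-G).
SEQ_LEN, DIM_L, DIM_G = 77, 768, 1280

def fake_encode(prompt: str, dim: int) -> list[list[float]]:
    """Toy encoder: one constant vector per slot, derived from the word length."""
    words = prompt.split()
    rows = []
    for i in range(SEQ_LEN):
        fill = float(len(words[i])) if i < len(words) else 0.0
        rows.append([fill] * dim)
    return rows

def combine(prompt_l: str, prompt_g: str) -> list[list[float]]:
    """Join the two encodings slot by slot along the feature axis."""
    enc_l = fake_encode(prompt_l, DIM_L)
    enc_g = fake_encode(prompt_g, DIM_G)
    return [row_l + row_g for row_l, row_g in zip(enc_l, enc_g)]

cond = combine("a dog", "national geographics")
# Slot 1 now carries "dog" in its first 768 features and "geographics" in the
# remaining 1280; cross attention sees them as a single vector.
```

With different prompts, slot i mixes unrelated words from the two encoders, which is the "cat+dog" situation described earlier in the channel.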

tight fjord Jul 14, 2023, 9:14 PM

#

yeah, i feel UIs won't adapt to have 2 prompts just for sdxl

elfin cobalt Jul 14, 2023, 9:14 PM

#

uneven dove maybe on a TPU it works better... hm

TPUs have limits to memory bandwidth, too. I haven't tried it on any of ours, but I would be surprised if it was any different.

rustic garnet Jul 14, 2023, 9:15 PM

#

tight fjord yeah, i feel UIs won't adapt to have 2 prompts just for sdxl

I think that's not an issue. I'm pretty sure they would if it really helps

uneven dove Jul 14, 2023, 9:15 PM

#

eternal fog Jul 14, 2023, 9:16 PM

#

ok what about these two?

sharp robin Jul 14, 2023, 9:16 PM

#

textures look more real, it doesnt feel "fake" or plasticky. if you look around the area of the breasts it starts looking artifacted and burned

gilded plinth Jul 14, 2023, 9:16 PM

#

what is the impact of ascore?

azure oxide Jul 14, 2023, 9:16 PM

#

tight fjord yeah, i feel UIs won't adapt to have 2 prompts just for sdxl

why not? a1111 already added another prompt box for hi-res fix alone, not hard to do the same for sdxl

#

doesnt comfy already have it

eternal fog Jul 14, 2023, 9:16 PM

#

sharp robin textures look more real, it doesnt feel "fake" or plasticky if you look around t...

Yeah I think I've gone a bit hard on the sharpness, but I'm trying to remove that soft look you get from doing an img2img upscale.

uneven dove Jul 14, 2023, 9:16 PM

#

eternal fog ok what about these two?

did you ASK for the facial lines?

#

it looks like misaligned timesteps

eternal fog Jul 14, 2023, 9:17 PM

#

uneven dove did you ASK for the facial lines?

She's a "Demon", so that's why it's done that

uneven dove Jul 14, 2023, 9:17 PM

#

hmm

#

fair enough but when there's more noise than it knows what to do with, it does that kind of facial lining

eternal fog Jul 14, 2023, 9:17 PM

#

I'll experiment a bit more I think I'm getting somewhere though

rustic garnet Jul 14, 2023, 9:18 PM

#

gilded plinth what is the impact of ascore?

it's only for the refiner and is supposed to make the image more aesthetically pleasing while following the prompt less closely

tight fjord Jul 14, 2023, 9:18 PM

#

try 'hairless demon' 😄

uneven dove Jul 14, 2023, 9:18 PM

#

a more severe example

eternal fog Jul 14, 2023, 9:18 PM

#

Instead of generating, then upscaling, then doing the img2img pass with the refiner, I'm generating, then going straight to the img2img refiner pass, THEN upscaling, and then doing another img2img refiner pass. It seems to keep detail a lot better and only takes a few seconds longer.

sharp robin Jul 14, 2023, 9:18 PM

#

eternal fog ok what about these two?

2

eternal fog Jul 14, 2023, 9:19 PM

#

Let me try 2 more without facial lines this time.

uneven dove Jul 14, 2023, 9:19 PM

#

less misaligned makes it into some kind of excusable crayon lines. after all, he is a jester. but it looks odd

boreal bough Jul 14, 2023, 9:19 PM

#

eternal fog ok what about these two?

its the same question as ai auto processing of mobile phone photos.
more real, but worse / definitely fake, but fits aesthetics more

uneven dove Jul 14, 2023, 9:20 PM

#

here's the effect you get when you randomly add noise during denoising

eternal fog Jul 14, 2023, 9:20 PM

#

boreal bough its the same question as ai auto processing of mobile phone photos. more real, b...

Yeah I want a balance in between, the left is too soft in my opinion, but the right is too sharp.

gilded plinth Jul 14, 2023, 9:20 PM

#

rustic garnet it's only for the refiner and is supposed to make the image more aesthetically p...

ok ok, thanks

molten gull Jul 14, 2023, 9:20 PM

#

uneven dove Jul 14, 2023, 9:21 PM

#

in fact i think the random noise added during inference is possibly the best example of the face cracking in an 'artistically acceptable way'

molten gull Jul 14, 2023, 9:21 PM

#

realistic ? i'm okay with it ... hands? no freaking way 🙂

uneven dove Jul 14, 2023, 9:21 PM

#

uneven dove here's the effect you get when you randomly add noise during denoising

i am talking about this

tight fjord Jul 14, 2023, 9:21 PM

#

@eternal fog just merge both 😄

gilded plinth Jul 14, 2023, 9:21 PM

#

what is the standard ascore for positive and negative?

uneven dove Jul 14, 2023, 9:22 PM

#

5/1

rustic garnet Jul 14, 2023, 9:22 PM

#

I think 6 and 2

uneven dove Jul 14, 2023, 9:22 PM

#

its definitely 5 and 1

sharp robin Jul 14, 2023, 9:22 PM

#

static prawn dont understand why my results always get so blurry

refiner steps too high

uneven dove Jul 14, 2023, 9:22 PM

#

they ain't great values

eternal fog Jul 14, 2023, 9:22 PM

#

These two are a bit closer, although it's buggered up the eyes on one of them

azure oxide Jul 14, 2023, 9:22 PM

#

uneven dove in fact i think the random noise added during inference is possibly the best exa...

tbh this effect kinda reminds me of the various customizable masks from payday2. google them if you havent played. the artist for that game would probably love it lol

sharp robin Jul 14, 2023, 9:23 PM

#

eternal fog These two are a bit closer, although it's buggered up the eyes on one of them

1 by a long shot

glad fulcrum Jul 14, 2023, 9:23 PM

#

do sdxl prompts work different?

#

and if so, in what way, do you have examples

boreal bough Jul 14, 2023, 9:23 PM

#

glad fulcrum do sdxl prompts work different?

yep. you can even write it a story and get back a good result

uneven dove Jul 14, 2023, 9:23 PM

#

glad fulcrum and if so, in what way, do you have examples

simplify them

eternal fog Jul 14, 2023, 9:23 PM

#

sharp robin 1 by a long shot

I think 2 pops out a lot more, it's not as smooth. But I do think it's too sharp. Time to play with more values.

molten gull Jul 14, 2023, 9:23 PM

#

those two look like skyrim 6 🙂

glad fulcrum Jul 14, 2023, 9:23 PM

#

boreal bough yep. you can even write it a story and get back a good result

is there a new token limit?

rustic garnet Jul 14, 2023, 9:23 PM

#

but you can also just copy an old prompt and it will usually work

shy kelp Jul 14, 2023, 9:24 PM

#

boreal bough removal should only happen if you lack enough vram

no the model is loaded into ram. its only taking up vram during generation

rustic garnet Jul 14, 2023, 9:24 PM

#

nope

uneven dove Jul 14, 2023, 9:24 PM

#

not the overly stupid prompts kai lmao

#

'realistic' makes it look plastic

sharp robin Jul 14, 2023, 9:24 PM

#

eternal fog I think 2 pops out a lot more, it's not as smooth. But I do think it's too sharp...

oversaturation probably

uneven dove Jul 14, 2023, 9:24 PM

#

masterpiece, trending on artstation, they make real people look like vector graphics

molten gull Jul 14, 2023, 9:24 PM

#

uneven dove Jul 14, 2023, 9:24 PM

#

you need to remove a lot of that crap

glad fulcrum Jul 14, 2023, 9:24 PM

#

for realistic, what do you use?

uneven dove Jul 14, 2023, 9:24 PM

#

nothing

glad fulcrum Jul 14, 2023, 9:24 PM

#

please share some prompts

uneven dove Jul 14, 2023, 9:25 PM

#

just say what you want

molten gull Jul 14, 2023, 9:25 PM

#

pure luck, i would say 🙂

uneven dove Jul 14, 2023, 9:25 PM

#

a stunning portrait of a 1985 adult in leggings

green python Jul 14, 2023, 9:25 PM

#

uneven dove Jul 14, 2023, 9:25 PM

#

it'll do fine

#

you don't NEED too much more

tight fjord Jul 14, 2023, 9:25 PM

#

be careful with negative prompts that look innocuous, i just figured out that 'blurry' was turning my photos into paintings, then i added 'painting' and now everyone looks wrinkled

molten gull Jul 14, 2023, 9:26 PM

#

then add "wrinkled", too 🙂

#

and see if that makes it blurry again 🙂

tight fjord Jul 14, 2023, 9:26 PM

#

did that, all im saying is without any of those, it looks better 🙂

#

negative prompting can have a lot of strong effects that are hard to predict

rustic garnet Jul 14, 2023, 9:26 PM

#

uneven dove its definitely 5 and 1

I checked and it's 2.5 and 6 😝
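
A small sketch of how those values might appear in an API-format workflow, reading the message above as 6 for the positive and 2.5 for the negative refiner conditioning (the order is an assumption). The node name `CLIPTextEncodeSDXLRefiner`, its input names, and the `refiner_loader` reference are assumptions about ComfyUI's export format, not something confirmed in this chat; check an actual API export before relying on them.

```python
def refiner_conditioning(text: str, ascore: float, clip_ref: list) -> dict:
    """Build one refiner text-encode node entry (assumed schema)."""
    return {
        "class_type": "CLIPTextEncodeSDXLRefiner",  # assumed node name
        "inputs": {
            "ascore": ascore,    # aesthetic score, only used by the refiner
            "width": 1024,
            "height": 1024,
            "text": text,
            "clip": clip_ref,    # [node id, output slot] of the CLIP source
        },
    }

CLIP_REF = ["refiner_loader", 1]  # hypothetical node id and output slot
positive = refiner_conditioning("a photo of jim the plumber", 6.0, CLIP_REF)
negative = refiner_conditioning("", 2.5, CLIP_REF)
```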

uneven dove Jul 14, 2023, 9:27 PM

#

ಠ_ಠ

boreal bough Jul 14, 2023, 9:27 PM

#

"a photo of jim the plumber working hard on pipes as he ponders the world and its meaning"
using bot v1.0

a_photo_of_jim_the_plumber_working_hard_on_pipes_as_he_ponders_the_world_and_its_meaning_seed-0ts-1689369955_idx-0.png

molten gull Jul 14, 2023, 9:27 PM

#

EVERYTHING is hard to predict 🙂

#

yeah, that's definitely jim 🙂

shy kelp Jul 14, 2023, 9:27 PM

#

vast narwhal This is the full SDXL result. I just don't think it's better

how did you upscale using sdxl?

tight fjord Jul 14, 2023, 9:27 PM

#

wow, 100% believably jim

rustic garnet Jul 14, 2023, 9:27 PM

#

yeah, you usually don't need negative prompts. Avoid them

#

use them only if you really need them

uneven dove Jul 14, 2023, 9:28 PM

#

usually for excluding overfitted subjects

rustic garnet Jul 14, 2023, 9:28 PM

#

not like 2.1 where we by default used complex negative prompts

elfin cobalt Jul 14, 2023, 9:28 PM

#

I think I might need negative prompts.

uneven dove Jul 14, 2023, 9:28 PM

#

2.1 just needs like one neg embed lol

soft bone Jul 14, 2023, 9:28 PM

#

i have massive success with tiny negatives on 2.1

boreal bough Jul 14, 2023, 9:28 PM

#

"a photo of jim the plumber working hard on pipes as he ponders the world and its meaning"
using mimizukari setup. no style/no negative

uneven dove Jul 14, 2023, 9:29 PM

#

soft bone i have massive success with tiny negatives on 2.1

same, just needs a bit more guidance through positive prompts

rustic garnet Jul 14, 2023, 9:29 PM

#

elfin cobalt I think I might need negative prompts.

😂
but if you ask for certain anime stuff you sometimes just get crap back that looks like fan art

#

I tried some Dragon Ball Z prompts and got crappy images back. In these cases you have to improve your positive prompt, not the negative one

#

e.g. add artist names that describe the image style you want

sharp robin Jul 14, 2023, 9:31 PM

#

eternal fog I think 2 pops out a lot more, it's not as smooth. But I do think it's too sharp...

ran 1 w some color corrections... idk u tell me

sage basin Jul 14, 2023, 9:32 PM

#

https://www.midlibrary.io/ has a tonne of useful artist names that work with SDXL as well. Using photographers will usually give you decent quality photos

molten gull Jul 14, 2023, 9:33 PM

#

elfin cobalt Jul 14, 2023, 9:33 PM

#

I've had a great deal of luck using GPT-4 for first pass prompt engineering; 80% of the time it produces great pictures, although not always what I want.

#

Mind you, 10% of the time it outputs what I posted above.

molten gull Jul 14, 2023, 9:34 PM

#

question: in comfyUI there's a KSampler (Advanced) node, that has a start_at_step and end_at_step ... what are those for ?

rustic garnet Jul 14, 2023, 9:36 PM

#

if you want to change the model during sampling for example

#

or other situations where you want to stop the denoising process, do something else with the latents, and continue

molten gull Jul 14, 2023, 9:36 PM

#

that's freaking crazy 🙂

rustic garnet Jul 14, 2023, 9:36 PM

#

e.g. stop in between and continue with the refiner model

#

or change the prompt or model in between

eternal fog Jul 14, 2023, 9:37 PM

#

You guys seen this? Personally I've not had issues but apparently this saves a little VRAM
https://huggingface.co/madebyollin/sdxl-vae-fp16-fix

molten gull Jul 14, 2023, 9:37 PM

#

elfin cobalt Jul 14, 2023, 9:38 PM

#

...now if you give the right prompt to GPT-4, it produces this.

vast narwhal Jul 14, 2023, 9:38 PM

#

shy kelp how did you upscale using sdxl?

Depends on the node you're using. but everything needs to be from XL, including prompt box and the cliptextencoder

molten gull Jul 14, 2023, 9:40 PM

#

autumn forum Jul 14, 2023, 9:41 PM

#

rustic garnet if you want to change the model during sampling for example

So if I had 20 steps, but the end at step is 12, does it stop at 12 when it was scheduled for 20? If so, what's the difference between setting the steps to 12 vs 20?

visual glade Jul 14, 2023, 9:41 PM

#

eternal fog You guys seen this? Personally I've not had issues but apparently this saves a l...

if you want comfy to use the VAEs in fp16 mode use: --fp16-vae or --bf16-vae for bf16 mode

rustic garnet Jul 14, 2023, 9:42 PM

#

autumn forum So if I had 20 steps, but the end at step is 12, does it stop at 12 when it was...

I think so. If you say 20 steps and stop at 10 then it stops at 50% denoising
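
To make the `start_at_step` / `end_at_step` answer concrete, here is a minimal sketch of splitting one schedule between two samplers, for example base then refiner. `split_steps` is a hypothetical helper, and the returned dictionaries only mirror the parameter names mentioned in the question; they are not a real node definition.

```python
def split_steps(total_steps: int, handoff_step: int) -> dict:
    """Split one noise schedule between two samplers at handoff_step."""
    if not 0 <= handoff_step <= total_steps:
        raise ValueError("handoff_step must lie within the schedule")
    return {
        # First sampler stops early and hands over a still-noisy latent.
        "base": {"steps": total_steps, "start_at_step": 0,
                 "end_at_step": handoff_step},
        # Second sampler continues the same schedule to the end.
        "refiner": {"steps": total_steps, "start_at_step": handoff_step,
                    "end_at_step": total_steps},
        "fraction_done_by_base": handoff_step / total_steps,
    }

plan = split_steps(20, 10)
```

On the 12 vs 20 question: with `steps=12` the whole schedule is squeezed into 12 steps and the result is fully denoised, while `steps=20` with `end_at_step=12` runs only the first 12 steps of a 20-step schedule and leaves noise for whatever comes next. The fraction here counts steps; the amount of noise actually removed depends on the scheduler.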

clever verge Jul 14, 2023, 9:43 PM

#

visual glade if you want comfy to use the VAEs in fp16 mode use: --fp16-vae or --bf16-vae for...

What other command line options can you pass to main.py?

visual glade Jul 14, 2023, 9:43 PM

#

--help will show all of them

autumn forum Jul 14, 2023, 9:43 PM

#

rustic garnet I think so. If you say 20 steps and stop at 10 then it stops at 50% denoising

Gotcha ya that makes sense

shy kelp Jul 14, 2023, 9:43 PM

#

vast narwhal Depends on the node you're using. but everything needs to be from XL, including p...

nope, works with the regular text encoder as well

clever verge Jul 14, 2023, 9:43 PM

#

visual glade --help will show all of them

Too simple!

eternal fog Jul 14, 2023, 9:43 PM

#

visual glade if you want comfy to use the VAEs in fp16 mode use: --fp16-vae or --bf16-vae for...

ok, I'll take a look. See if it's any faster, or less memory intensive

eternal fog Jul 14, 2023, 9:50 PM

#

visual glade if you want comfy to use the VAEs in fp16 mode use: --fp16-vae or --bf16-vae for...

Well it didn't seem to make too much difference with that vae. If anything I think it made it use more VRAM.

#

It does in fact fix black images, though, when you run it at fp16

mystic cosmos Jul 14, 2023, 9:52 PM

#

junkyard_on_futuristic_city_outskirts_steps-56_style-Watercolor_width-1344_height-768_seed-0ts-1689111268_idx-0.png

uneven dove Jul 14, 2023, 10:02 PM

#

prompt: i installed Clippy today to show my grandkids how we used to talk to a paperclip and they said "grandpa's sunsetting again" and i was rushed to the doctors who adjusted my medication and insist i don't have grandchildren and that i need to stop going off my meds

1689372136.39827235aa419d99c6516a805a1d39ff264d189.png

timid sonnet Jul 14, 2023, 10:04 PM

#

lament rune Jul 14, 2023, 10:06 PM

#

uneven dove prompt: `i installed Clippy today to show my grandkids how we used to talk to a ...

the narrative here, it is exquisite

glad fulcrum Jul 14, 2023, 10:16 PM

#

I tried the --highvram but now...

#

So without it every image needs to load both models constantly

#

and with it it runs out of memory

#

on 12gbVram ?

thin nova Jul 14, 2023, 10:17 PM

#

each model is pretty big

amber fulcrum Jul 14, 2023, 10:18 PM

#

elfin cobalt ...now if you give the *right* prompt to GPT-4, it produces this.

whats the prompt if I may ask? Looks amazing

urban fjord Jul 14, 2023, 10:19 PM

#

With only 12 GB VRAM you shouldn't run refiner and base with highvram

visual glade Jul 14, 2023, 10:20 PM

#

yeah don't do highvram with SDXL if you only have 12GB, both unets on the gpu take that amount of memory
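
The advice above can be sketched as a launch-command helper. The flags `--highvram` and `--fp16-vae` and the `main.py` entry point are the ones mentioned in this channel; the VRAM cutoffs are only a rule of thumb read out of the conversation (12 GB fits the base alone, not base plus refiner) and are assumptions, not ComfyUI recommendations. `comfy_command` is a hypothetical name.

```python
import sys

def comfy_command(vram_gb: float, use_refiner: bool,
                  fp16_vae: bool = False) -> list[str]:
    """Build a ComfyUI launch command from a rough VRAM rule of thumb."""
    cmd = [sys.executable, "main.py"]
    if use_refiner:
        pin_models = vram_gb > 12    # both UNets reportedly fill about 12 GB
    else:
        pin_models = vram_gb >= 12   # hypothetical cutoff for base only
    if pin_models:
        cmd.append("--highvram")     # keep models resident on the GPU
    if fp16_vae:
        cmd.append("--fp16-vae")     # run the VAE in fp16
    return cmd

# The list can be passed to subprocess.run(cmd, cwd=<ComfyUI checkout>).
```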

elfin cobalt Jul 14, 2023, 10:21 PM

#

amber fulcrum whats the prompt if I may ask? Looks amazing

Still experimenting. At the moment...

Given input such as "A picture of a boat", generate a creative description such as "Digital painting of a boat on the stormy ocean", deferring to user input when convenient. Also output a style, selecting relevant artists and stylistic choices that go well with the prompt. Series/character names don't work, so describe the scene or character instead. Always include artists. While the prompt should be regular english, the style should be comma-separated keywords.

Respond using JSON, in the format {"prompt": "{prompt}", "style": "{style}", "aspect_ratio": "{e.g. 4:3}"}
Which, with this request:
Machikado Mazoku.
Produced this:
Anime-style digital artwork depicting a young girl with horns and a spiky tail, surrounded by a mysterious aura in suburban scenery --style Contemporary, Manga, Modern, Magic Realism, Hayao Miyazaki, Yoshitoshi Abe --ar 16:9
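
A minimal sketch of the glue code this implies: parse the model's JSON reply and flatten it into the `--style` / `--ar` form shown above. `to_sdxl_prompt` is a hypothetical helper, and `SAMPLE_REPLY` is a shortened JSON reconstruction of the output quoted in the message, not a captured GPT-4 response.

```python
import json

# Shortened, hand-written stand-in for a reply in the requested format.
SAMPLE_REPLY = (
    '{"prompt": "Anime-style digital artwork depicting a young girl with '
    'horns and a spiky tail", "style": "Contemporary, Manga", '
    '"aspect_ratio": "16:9"}'
)

def to_sdxl_prompt(reply: str) -> str:
    """Turn the JSON reply into 'prompt --style ... --ar ...' text."""
    data = json.loads(reply)  # raises ValueError on malformed JSON
    missing = {"prompt", "style", "aspect_ratio"} - data.keys()
    if missing:
        raise KeyError(f"reply is missing fields: {sorted(missing)}")
    return (f'{data["prompt"]} --style {data["style"]} '
            f'--ar {data["aspect_ratio"]}')

line = to_sdxl_prompt(SAMPLE_REPLY)
```

Validating the keys is worth the few lines, since language models do not always stick to the requested format.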

urban fjord Jul 14, 2023, 10:21 PM

#

Highvram is useful if you're only doing the base, but not with both.

glad fulcrum Jul 14, 2023, 10:21 PM

#

visual glade yeah don't do highvram with SDXL if you only have 12GB, both unets on the gpu ta...

With 12GBvram I need to wait for each model to load (twice an image) ?

urban fjord Jul 14, 2023, 10:22 PM

#

It shouldn't be that slow really to load both.

glad fulcrum Jul 14, 2023, 10:22 PM

#

last year they were talking about 24 fps and now 6 months later we need to wait 5 minutes for an image.

#

yeah it takes like 500 seconds per image

#

diffusion is fast but the model takes too long to load

urban fjord Jul 14, 2023, 10:22 PM

#

And if you create a batch of latent images then you've got fewer model switches to worry about
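
A back-of-the-envelope sketch of the batching point, assuming base and refiner cannot both stay in VRAM so each batch costs one base load and one refiner load. `model_switches` is a hypothetical helper and ignores any caching the UI may do.

```python
def model_switches(num_images: int, batch_size: int) -> int:
    """Count model loads when each batch needs base, then refiner."""
    if num_images < 0 or batch_size < 1:
        raise ValueError("need num_images >= 0 and batch_size >= 1")
    batches = -(-num_images // batch_size)  # ceiling division
    return 2 * batches

one_by_one = model_switches(8, 1)   # 8 batches of 1 image
batched = model_switches(8, 4)      # 2 batches of 4 images
```

Larger batches trade load time for VRAM, so the usable batch size still depends on the card.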

soft zealot Jul 14, 2023, 10:24 PM

#

I use a 1080 Ti 11GB. The only time I notice the SDXL models being slow(ish) to load is when I initially start the server and load for the first time; after that it's almost instant even if I switch models and use a 1.5/2.1 workflow

shy kelp Jul 14, 2023, 10:24 PM

#

man its so funny seeing people claiming their upscaling setup works great then showing absolutely horrendous results

soft zealot Jul 14, 2023, 10:26 PM

#

shy kelp man its so funny seeing people claiming their upscaling setup works great then s...

TBH & IMHO upscaling is overrated unless you're going for commercial contracts or HQ print outputs.

If like me you're simply generating for pleasure or simply to create wallpapers then even just a basic upscale is "just fine" IMHO

shy kelp Jul 14, 2023, 10:27 PM

#

soft zealot TBH & IMHO upscaling is overrated unless you're going for commercial contracts o...

nah man some people just like generating full body and semi-full body images with clear faces

#

nothing more to it

elfin cobalt Jul 14, 2023, 10:28 PM

#

A well-matched 1.5 model works fine for upscaling. But SDXL is so flexible, there isn't any single model that'll work.

shy kelp Jul 14, 2023, 10:28 PM

#

also idk why you wouldnt just like to look at an image with more details than 1024... your take doesnt really make sense to me at all actually

uneven dove Jul 14, 2023, 10:29 PM

#

a lot of the quality improvements are just people showing off that they CAN do it

#

doesn't need prompt comprehension worth a damn

quasi remnant Jul 14, 2023, 10:29 PM

#

glad fulcrum on 12gbVram ?

12GB isn't high vram; i think you'd run out of memory trying to run even batch size 1 with both models loaded

shy kelp Jul 14, 2023, 10:29 PM

#

elfin cobalt A well-matched 1.5 model works fine for upscaling. But SDXL is so flexible, ther...

yeah sure, im just talking about the ones saying the refiner works great for upscaling 💩

elfin cobalt Jul 14, 2023, 10:29 PM

#

shy kelp yeah sure, im just talking about the ones saying the refiner works great for ups...

It does wat mate?

amber fulcrum Jul 14, 2023, 10:30 PM

#

glad fulcrum last year they were talking about 24 fps and now 6 months later we need to wait ...

my 3070 8gb takes 2.5 minutes to generate 4 images with refiner - something in your setup is wrong in my opinion. Btw: what drivers do you have? I had huge issues with speed in SD with drivers I updated a couple of weeks ago - needed to revert

shy kelp Jul 14, 2023, 10:31 PM

#

elfin cobalt It does wat mate?

wat?

glad fulcrum Jul 14, 2023, 10:31 PM

#

eternal fog You guys seen this? Personally I've not had issues but apparently this saves a l...

does this VAE go into the refiner too?

elfin cobalt Jul 14, 2023, 10:31 PM

#

"What", but flatter.

shy kelp Jul 14, 2023, 10:31 PM

#

it dosent work

#

da ting dont do wat da ting should do

quasi remnant Jul 14, 2023, 10:32 PM

#

glad fulcrum last year they were talking about 24 fps and now 6 months later we need to wait ...

3090 gens an image in 17 or so seconds; it's dependent on your hardware

glad fulcrum Jul 14, 2023, 10:32 PM

#

amber fulcrum my 3070 8gb takes 2.5 minutes to generate 4 images with refiner - something in yo...

glad fulcrum Jul 14, 2023, 10:32 PM

#

quasi remnant 12GB isn't high vram; i think you'd run out of memory trying to run even batch size ...

I am running a batch size of 1

#

I removed the highVram already

boreal bough Jul 14, 2023, 10:32 PM

#

Ufff. No bf16 support, right?

shy kelp Jul 14, 2023, 10:33 PM

#

if your 7x upscaling workflow doesnt take at least 7 minutes to complete a single image, you're not taking your image generation seriously and should probably just give up

glad fulcrum Jul 14, 2023, 10:33 PM

#

I think it is working faster now

#

It seems it loaded both models now

#

maybe the fp16 vae helped

amber fulcrum Jul 14, 2023, 10:33 PM

#

glad fulcrum

try 536.40 - reverting helped me, as one image was taking 2-3 minutes with the updated driver instead of 15 secs

glad fulcrum Jul 14, 2023, 10:34 PM

#

amber fulcrum try 536.40 - revert helped me as I was generating one image for 2-3 minutes with...

maybe. Its working well now

amber fulcrum Jul 14, 2023, 10:35 PM

#

Cool, Maybe they fixed the bug with latest update ^^

glad fulcrum Jul 14, 2023, 10:36 PM

#

#

I mean it's not 17 seconds for me

#

but it doesn't load the model every time now

urban fjord Jul 14, 2023, 10:40 PM

#

Is that with base + refiner?

soft zealot Jul 14, 2023, 10:40 PM

#

shy kelp nah man some people just like generating full body and semi-full body images wit...

an OG image and a crude 4x upscale from it

autumn forum Jul 14, 2023, 10:40 PM

#

soft zealot TBH & IMHO upscaling is overrated unless you're going for commercial contracts o...

Some people like to use ai image generation for fun to see what it spits out. It’s fun. Upscaling to get the best quality is fun too. I find this technology amazing.

soft zealot Jul 14, 2023, 10:41 PM

#

autumn forum Some people like to use ai image generation for fun to see what it spits out. It...

I get that. I, for example, prefer to play with this than do a crossword or knitting but.................... 🙂

autumn forum Jul 14, 2023, 10:42 PM

#

soft zealot I get that. I, for example, prefer to play with this than do a crossword or knitting...

Crossword and knitting sound boring lol.

eternal fog Jul 14, 2023, 10:42 PM

#

glad fulcrum does this VAE go into the refiner too?

Pretty sure they use the same vaes, so yes you could use it with the refiner.

boreal bough Jul 14, 2023, 10:43 PM

#

soft zealot TBH & IMHO upscaling is overrated unless you're going for commercial contracts o...

I use ai for all my dnd stuff XD when I do prints I need it to be crisp

#

to not break immersion

soft zealot Jul 14, 2023, 10:43 PM

#

autumn forum Crossword and knitting sound boring lol.

trying to do both at the same time can be fun though lol

urban fjord Jul 14, 2023, 10:43 PM

#

You can always just upscale in batch over night or something.

sharp robin Jul 14, 2023, 10:47 PM

#

autumn forum Jul 14, 2023, 10:48 PM

#

I wish I had a better use case for ai art, I know some people would pay for a high quality image of something specific but I wouldn’t even know how to sell that service😅

boreal bough Jul 14, 2023, 10:49 PM

#

autumn forum I wish I had a better use case for ai art, I know some people would pay for a hi...

you could do prints for yourself - for the wall. your top 5 gens get a proper A3 art print at a photo shop
might not make money - but at least rewards your hobby c:

shy kelp Jul 14, 2023, 10:49 PM

#

soft zealot an OG image and a crude 4x upscale from it

thats cool but thats a x4 upscale model and not sdxl right

static prawn Jul 14, 2023, 10:50 PM

#

kinda wish i would get some consistent results, i always fail with sdxl

shy kelp Jul 14, 2023, 10:50 PM

#

is sdxl free?

boreal bough Jul 14, 2023, 10:50 PM

#

shy kelp is sdxl free?

#1101178378452750447

#

or wait 3 more days for local full release

soft zealot Jul 14, 2023, 10:51 PM

#

shy kelp thats cool but thats a x4 upscale model and not sdxl right

It’s just feeding the sdxl image into a simple 4x pixel based upscale

#

Workflow is left in there

shy kelp Jul 14, 2023, 10:51 PM

#

thanks @boreal bough

glad fulcrum Jul 14, 2023, 11:00 PM

#

It wrote my friend's name pretty well

#

and the image is nice as well with very simple prompt

static prawn Jul 14, 2023, 11:00 PM

#

u know where i can find prompts from sdxl?

glad fulcrum Jul 14, 2023, 11:00 PM

#

hmmm I'm starting to believe this is a good model

static prawn Jul 14, 2023, 11:00 PM

#

especially with neg prompts?

elfin cobalt Jul 14, 2023, 11:01 PM

#

Negatives aren't needed, really.

#

For positives... it understands English much better than 1.5. Stick to simple language without prepositions, and it'll work fine.

#

Well, pronouns. Prepositions you can try to use.

static prawn Jul 14, 2023, 11:04 PM

#

mh ok, dunno i always tend to get blurry or extremely grainy results

eternal fog Jul 14, 2023, 11:05 PM

#

Negatives do some things. I've been putting a few in, like deformed and blurry, and this is the sort of difference you can get between using negatives and none

#

shy kelp Jul 14, 2023, 11:05 PM

#

Friday night hype leggo

static prawn Jul 14, 2023, 11:07 PM

#

i dunno my results are just always blurry or extremely grainy

#

i cant get result like yours

eternal fog Jul 14, 2023, 11:08 PM

#

What sort of steps, samplers and cfg are you using

static prawn Jul 14, 2023, 11:09 PM

#

eternal fog Jul 14, 2023, 11:09 PM

#

static prawn

Don't change samplers and noise schedules between the base and refiner

#

And DDIM should be using DDIM Uniform

static prawn Jul 14, 2023, 11:10 PM

#

oh ok someone suggested its way better using diff samplers on base and refiner

sharp robin Jul 14, 2023, 11:10 PM

#

eternal fog And DDIM should be using DDIM Uniform

Why?

boreal bough Jul 14, 2023, 11:10 PM

#

static prawn u know where i can find prompts from sdxl?

A. get a good setup
B. either prompt sdxl properly (trial and error) / or write a sentence in natural language that is around 10~15 words long, no commas / or use Interrogator with ViT-H to get prompts from existing images
C. no negatives, unless you know what you want them to do
D. generate an image to make sure you didn't include a word that messes everything up (rare, but can happen)
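
Step B's "natural language, around 10~15 words, no commas" guideline can be turned into a quick sanity check. This is only a sketch of that rule of thumb; `check_prompt` and its default limits are hypothetical, and a prompt that fails the check can still work fine.

```python
def check_prompt(prompt: str, min_words: int = 10,
                 max_words: int = 15) -> list[str]:
    """Return a list of guideline violations (empty means it passes)."""
    issues = []
    word_count = len(prompt.split())
    if "," in prompt:
        issues.append("contains commas (keyword-list style)")
    if not min_words <= word_count <= max_words:
        issues.append(
            f"{word_count} words, outside {min_words}-{max_words}")
    return issues

ok = check_prompt(
    "a tired plumber fixing copper pipes in a dim basement at night")
tagged = check_prompt("masterpiece, trending on artstation")
```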

eternal fog Jul 14, 2023, 11:10 PM

#

sharp robin Why?

When I've tried to use it with others it's done strange things

autumn forum Jul 14, 2023, 11:10 PM

#

static prawn

I feel like that’s way more steps than needed lol

amber fulcrum Jul 14, 2023, 11:12 PM

#

I find prompting in SDXL better and worse than in 1.5 at the same time XD
Better is consistency - worse is consistency.
I mean - I would like more randomness in the output - often the same prompt gives 90%+ the same results.
Which is a plus, as it seems to be working as intended, but it's also a negative if you've found a good style but need to experiment with each prompt just to get an image different from the previous one.
They are very similar.
Another issue is color - if I add "white background" it usually adds a dramatic amount of white to the whole scene/artwork.
What issues do you have and how do you overcome them?

sharp robin Jul 14, 2023, 11:13 PM

#

amber fulcrum I find prompting in SDXL better and worse than in 1.5 the same time XD Better i...

Overfitting/training is a known issue, hope 1.x fixes it

urban fjord Jul 14, 2023, 11:13 PM

#

If you want consistency then you can make a LoRA out of it.

boreal bough Jul 14, 2023, 11:13 PM

#

amber fulcrum I find prompting in SDXL better and worse than in 1.5 the same time XD Better i...

longer prompts often solve this. the shorter, the less variation on many heavily weighted words

static prawn Jul 14, 2023, 11:14 PM

#

eternal fog

im probably not happy with the overall result of sdxl, i think urs is super blurry too, 1024x1024 on 1.5 look so much cleaner, and crisp imo

urban fjord Jul 14, 2023, 11:15 PM

#

People will fine-tune SDXL so results will improve.

eternal fog Jul 14, 2023, 11:15 PM

#

static prawn im probably not happy with the overall result of sdxl, i think urs is super blur...

That wasn't an attempt at anything good, just an example of what negative prompts can do

urban fjord Jul 14, 2023, 11:15 PM

#

Remember how poor 1.5 was and how well fine-tunes work now..

static prawn Jul 14, 2023, 11:16 PM

#

i dunno everything i tried got kinda messed up with a lot of grain or blurry

boreal bough Jul 14, 2023, 11:17 PM

#

static prawn im probably not happy with the overall result of sdxl, i think urs is super blur...

"a man named void, 30 years old, is unhappy with the results of sdxl as he sits in front of his computer"
no style, no negative. first attempt.

static prawn Jul 14, 2023, 11:17 PM

#

😂

paper phoenix Jul 14, 2023, 11:17 PM

#

No negative, just his posts 😉

boreal bough Jul 14, 2023, 11:17 PM

#

second image generated, seed+1

sharp robin Jul 14, 2023, 11:17 PM

#

boreal bough "a man named void, 30 years old, is unhappy with the results of sdxl as he sits ...

Edward Snowden?

static prawn Jul 14, 2023, 11:17 PM

#

trust me i look that depressed every day haha

paper phoenix Jul 14, 2023, 11:18 PM

#

even if SDXL was absolutely irreproachably perfect in every way. void: "oh well, I bet ill go blind soon and not be able to see it."

boreal bough Jul 14, 2023, 11:19 PM

#

use this setup (from the lovely mimizukari), or the one from sytan
https://github.com/SytanSD/Sytan-SDXL-ComfyUI

📎 mimi_-_sdxl.json

nimble heart Jul 14, 2023, 11:19 PM

#

autumn forum I wish I had a better use case for ai art, I know some people would pay for a hi...

replace your phone wallpaper every 4 days

amber fulcrum Jul 14, 2023, 11:19 PM

#

SDXL improved a lot with fantasy but on the other hand - in some areas is overfitted as hell. Still - lot of improvement overall

urban fjord Jul 14, 2023, 11:20 PM

#

boreal bough "a man named void, 30 years old, is unhappy with the results of sdxl as he sits ...

Thanks for the advice, it really helped my prompt.

no style, no negative. first attempt. Cat sitting in a kitchen sink

paper phoenix Jul 14, 2023, 11:20 PM

#

nimble heart replace your phone wallpaper every 4 days

replace your phone wallpaper every few hours, but its the same wallpaper but the person has a subtly different facial expression.

boreal bough Jul 14, 2023, 11:20 PM

#

hahahaha

paper phoenix Jul 14, 2023, 11:21 PM

#

urban fjord Thanks for the advice, it really helped my prompt. > no style, no negative. firs...

you can tell that's an AI generation though. you hid the cat's hands so people couldnt tell.

#

oldest trick in the book. (the book being around 18 months old)

urban fjord Jul 14, 2023, 11:21 PM

#

Yeah I hid that it didn't have 5 fingers.