sonic narwhal Jul 31, 2023, 3:10 PM

#

39 repeats * 39 images, batch size 8, 20 epochs, total steps 3803, estimated training time 116 hours on an RTX 3090.
Why is my estimated training time for SDXL 1000x that of training 1.5 LoRAs?

📎 SDXL_1.1.json

young crater Jul 31, 2023, 3:16 PM

#

sonic narwhal 39 repeats * 39 images, batch size 8, 20 epochs total steps 3803, estimated trai...

Your total steps still feel way too high

(1 * 39 * 20)/8 = 97 total steps
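For reference, the arithmetic above can be sketched in Python. This is only a rough estimate: the function name is made up for illustration, and it rounds up, so it gives 98 rather than the 97 quoted above. The exact count a trainer reports can also differ with bucketing or gradient accumulation.

```python
import math

def estimate_total_steps(images, repeats, epochs, batch_size):
    """Rough step estimate: every image is seen repeats * epochs times."""
    samples_seen = images * repeats * epochs
    return math.ceil(samples_seen / batch_size)

# The original setup: 39 images, 39 repeats, 20 epochs, batch size 8
print(estimate_total_steps(39, 39, 20, 8))
# The suggested setup: same images, 1 repeat
print(estimate_total_steps(39, 1, 20, 8))
```

With 39 repeats the estimate matches the 3803 steps reported in the first message, which suggests the repeat count is what inflates the total.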

#

This is what the progress should look like

#

on a 3090

sonic narwhal Jul 31, 2023, 3:21 PM

#

Why such a small number of steps, when training 5k+ steps on SD1.5 is no problem?

young crater Jul 31, 2023, 3:22 PM

#

Different model, different lora training styles

lucid ice Jul 31, 2023, 3:23 PM

#

young crater Your total steps still feel way too high (1 * 39 * 20)/8 = 97 total steps

having the same problem, any idea why?

young crater Jul 31, 2023, 3:23 PM

#

lucid ice having the same problem, any idea why?

what is your training image folder name?

sonic narwhal Jul 31, 2023, 3:23 PM

#

lucid ice Jul 31, 2023, 3:23 PM

#

40_lora

sonic narwhal Jul 31, 2023, 3:23 PM

#

3 hours for 100 steps

young crater Jul 31, 2023, 3:24 PM

#

sonic narwhal

113s/step is way too high for a 3090. There is something else going on here.
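As a sanity check on the numbers in this exchange, here is a small Python sketch that converts between step counts, seconds per step, and estimated hours. The helper names are hypothetical and the calculation ignores startup and caching time.

```python
def eta_hours(total_steps, seconds_per_step):
    """Estimated wall-clock training time in hours."""
    return total_steps * seconds_per_step / 3600

def seconds_per_step(total_hours, total_steps):
    """Average step time implied by an overall time estimate."""
    return total_hours * 3600 / total_steps

# "3 hours for 100 steps" at 113 s/step
print(round(eta_hours(100, 113), 1))
# 116 hours for 3803 steps implies roughly the same step time
print(round(seconds_per_step(116, 3803), 1))
```

Both estimates point at a step time above 100 seconds, so the slow step is the problem rather than the step count alone.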

young crater Jul 31, 2023, 3:24 PM

#

lucid ice 40_lora

should be 1_lora

lucid ice Jul 31, 2023, 3:24 PM

#

1? what does that number even do?

#

someone else told me between 20 and 40

sonic narwhal Jul 31, 2023, 3:25 PM

#

young crater 113s/step is waay too high for a 3090. There is something else up here..

Where can I change that?

young crater Jul 31, 2023, 3:25 PM

#

lucid ice 1? what does that number even do?

it's your repeats

young crater Jul 31, 2023, 3:25 PM

#

sonic narwhal Where can I change that?

I have no clue, I am sorry.
Are you changing any settings from Caith's?

young crater Jul 31, 2023, 3:26 PM

#

lucid ice someone else told me between 20 and 40

In Caith's message, they recommend 1 image repeat, as you are using epochs instead of repeats to change your total steps.

young crater Jul 31, 2023, 3:28 PM

#

lucid ice 40_lora

if your SD1.5 LORA settings were:

40 images
4 Batch
30 repeats
1 Epoch

it is now:

40 images
8 Batch
1 repeat
30 Epochs

(estimated)
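A minimal Python sketch of the idea above, assuming the folder name follows the `<repeats>_<name>` pattern discussed in this thread. Both helper names are hypothetical; the point is only that the number of times each image is seen stays the same when repeats are traded for epochs.

```python
def parse_dataset_folder(name):
    """Split a '<repeats>_<name>' folder name into (repeats, name)."""
    repeats, _, concept = name.partition("_")
    if not repeats.isdigit() or not concept:
        raise ValueError(f"expected '<repeats>_<name>', got {name!r}")
    return int(repeats), concept

def passes_per_image(repeats, epochs):
    """How many times each image is seen over the whole run."""
    return repeats * epochs

print(parse_dataset_folder("40_lora"))  # the folder name from the chat
print(parse_dataset_folder("1_lora"))   # the suggested folder name
# 30 repeats x 1 epoch and 1 repeat x 30 epochs see each image equally often
print(passes_per_image(30, 1), passes_per_image(1, 30))
```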

lucid ice Jul 31, 2023, 3:28 PM

#

young crater in Caith's message, they recommend 1 image repeat. As you are using Epoch instea...

ah so this message referred to something else? in sd 1.5 it was recommended to name your folder 100_lora, so i was a little confused here

young crater Jul 31, 2023, 3:28 PM

#

lucid ice ah so this message referred to something else? in sd 1.5 it was recommended to n...

ah! Yes, when I asked Caith a while ago, they said you now use epochs as repeats

lucid ice Jul 31, 2023, 3:29 PM

#

young crater ah! Yes, when I asked Caith a while ago, they said you now use epochs as repeats

this is starting to make an awful lot of sense now

lucid ice Jul 31, 2023, 3:30 PM

#

young crater ah! Yes, when I asked Caith a while ago, they said you now use epochs as repeats

so like this then

young crater Jul 31, 2023, 3:30 PM

#

lucid ice so like this then

yes

lucid ice Jul 31, 2023, 3:32 PM

#

young crater yes

yeah buddy, that's more like it. Thanks for the help

young crater Jul 31, 2023, 3:32 PM

#

lucid ice yeah buddy, thats more like it. Thanks for the help

for sure!

orchid yoke Jul 31, 2023, 3:33 PM

#

young crater should be 1_lora

Dang it, I've been sitting here for hours trying to figure out why everything Caith said was gold but not working for me. Thank you

sonic narwhal Jul 31, 2023, 3:33 PM

#

young crater I have no clue, I am sorry.. are you changing any setting from caith?

Running with Caith's exact settings, the estimated time is 9 hours

#

225s/it

lucid ice Jul 31, 2023, 3:34 PM

#

sonic narwhal Running with caiths exact settings the estimated time is 9 hours

custom SDXL model set?

sonic narwhal Jul 31, 2023, 3:35 PM

#

yes

lucid ice Jul 31, 2023, 3:35 PM

#

sonic narwhal yes

this one?

sonic narwhal Jul 31, 2023, 3:36 PM

#

yup

lucid ice Jul 31, 2023, 3:36 PM

#

https://tenor.com/view/stickergiant-restart-turn-it-off-turn-it-on-it-gif-24960297

sonic narwhal Jul 31, 2023, 3:36 PM

#

sonic narwhal Jul 31, 2023, 3:37 PM

#

lucid ice https://tenor.com/view/stickergiant-restart-turn-it-off-turn-it-on-it-gif-249602...

Might be it 😄

young crater Jul 31, 2023, 3:37 PM

#

sonic narwhal

Try running it with half the batch size, it could be a vram issue

sonic narwhal Jul 31, 2023, 3:41 PM

#

young crater Try running it with half the batch size, it could be a vram issue

4 hours estimate, 74s/it

#

batch size 4, epoch 20

young crater Jul 31, 2023, 3:41 PM

#

are your images scaled to 1024x1024?

sonic narwhal Jul 31, 2023, 3:41 PM

#

no

lucid ice Jul 31, 2023, 3:42 PM

#

that might be it

sonic narwhal Jul 31, 2023, 3:42 PM

#

ahh okay xD

#

thought the bucket would take care of it

#

very well then, saving that for tomorrow

#

thanks for the help

warm fog Jul 31, 2023, 3:43 PM

#

this looks exciting: https://github.com/KohakuBlueleaf/HyperKohaku

KohakuBlueleaf/HyperKohaku: A diffusers based implementation of HyperDreamBooth

spring sun Jul 31, 2023, 6:17 PM

#

If I remember correctly, in ML the loss should be decreasing constantly over iterations. Is this true for diffusion models? Is it OK if my loss goes a little bit up and then down through the iterations, or should it always decrease?

When you are finetuning, is the loss always going down if the lr is right?

open merlin Jul 31, 2023, 7:09 PM

#

Depends. The loss landscape is very large; sometimes, to get to a global optimum, you have to go through suboptimal solutions. I do not know exactly how this applies to SDXL, though. This is what my current model in training looks like, using cosine with restarts, hopefully it will work:

#

https://towardsdatascience.com/https-medium-com-reina-wang-tw-stochastic-gradient-descent-with-restarts-5f511975163

A Newbie's Guide to Stochastic Gradient Descent With Restarts

An additional method that makes gradient descent smoother and faster, and minimizes the loss of a neural network more accurately.

spring sun Jul 31, 2023, 7:18 PM

#

@open merlin thanks, it helps

distant halo Jul 31, 2023, 10:17 PM

#

Has anyone had issues with their LoRAs doing very well on some prompts (simpler ones, typically), but struggling to reproduce the training subject accurately on others (usually larger and more complex ones)?

spring sun Jul 31, 2023, 10:18 PM

#

Training with a transparent vs white background: does anyone have an idea which is better? I trained with transparent and it's giving me a lot of colorful backgrounds. I don't know if it had an influence.

hollow spruce Jul 31, 2023, 10:20 PM

#

distant halo Has anyone had issues with their LoRAs doing very well on some prompts (simpler ...

I can tell you that my properly captioned datasets, trained with my settings, have no issues.
But if you:
• Trained with CLIP on
• Used bad caption practices
• Used dim settings that are too big
then yeah, the LoRA may have damaged the core SDXL model enough that it's no longer functioning right

For my settings -> #🔧｜finetune message

hollow spruce Jul 31, 2023, 10:21 PM

#

spring sun Train with transparent vs White background, anyone have an idea about which is b...

Does kohya support alpha channels now? O:
Or is it just turning the background white during import?
Because when I last checked up on that, it was still the latter. It would be really cool to have transparency support, though.

distant halo Jul 31, 2023, 10:21 PM

#

hollow spruce I can tell you that my properly captioned datasets, trained with my settings hav...

Thanks very much for your thoughtful response! Here's one example of the caption I created for one of the 40 images in the training set. Any critique on its style/format?

#

"amcm, a woman wearing a black dress, smiling at the camera with a white curtain behind her, head shot"

spring sun Jul 31, 2023, 10:22 PM

#

hollow spruce did kohya now support alpha channels? O: or is it just turning it white in the b...

I don't know. It does accept the images, but I have no idea how it's interpreting them. Kinda what I was asking 😦

hollow spruce Jul 31, 2023, 10:24 PM

#

distant halo "amcm, a woman wearing a black dress, smiling at the camera with a white curtain...

Changing your trigger word to something that the model already knows and that is close to what you're trying to make will significantly improve your experience. But this shouldn't really be causing any issue other than longer training time. (In the past, this was addressed by CLIP training, but since we're doing unet only right now, using a properly fitting word as the trigger word saves you time.)
Other than that, all good. "with a white curtain behind her" is good! Always tag the background 🙂

distant halo Jul 31, 2023, 10:26 PM

#

hollow spruce changing your trigger word, to something that the model already knows exists and...

What kind of trigger word would work best for a woman? I assume "woman" isn't a good one, right?

spring sun Jul 31, 2023, 10:26 PM

#

Btw in kohya, should I add the trigger and class words in the start of all captions?

hollow spruce Jul 31, 2023, 10:37 PM

#

distant halo What kind of trigger word would work best for a woman? I assume "woman" isn't a ...

ummm... woman is a bit of a special situation, since 'woman' contains all the information for all women.
you could pick the name of any famous woman that looks even remotely close to your subject

If you need a 1 size fits all approach, then 'photoshoot model', or something on a similar level will work as well.

hollow spruce Jul 31, 2023, 10:38 PM

#

spring sun Btw in kohya, should I add the trigger and class words in the start of all capti...

depends on training style.
if you want to do it properly, then this would be the way to go:

<trigger word>, caption, caption, caption, caption, caption, caption, <background description>
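The caption layout above could be assembled with a small Python helper like the one below. This is a sketch only: `build_caption` is a hypothetical name, and the example reuses the caption posted earlier in the thread, rearranged so the background description comes last.

```python
def build_caption(trigger, tags, background):
    """Join trigger word, descriptive tags, and background into one caption."""
    parts = [trigger, *tags, background]
    # Drop empty pieces and stray whitespace before joining
    return ", ".join(p.strip() for p in parts if p and p.strip())

caption = build_caption(
    "amcm",
    ["a woman wearing a black dress", "smiling at the camera", "head shot"],
    "a white curtain behind her",
)
print(caption)
```

Writing the result to a `.txt` file next to each image is left out here, since the file layout depends on the trainer.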

spring sun Jul 31, 2023, 10:39 PM

#

Oh thank you! It was not clear if I should add the trigger or it is already doing that behind the scenes.

junior owl Jul 31, 2023, 11:07 PM

#

anyone had any luck with SDXL textual inversion?

spring sun Jul 31, 2023, 11:09 PM

#

junior owl anyone had any luck with SDXL textual inversion?

Don't know if it helps, but I used to work with TI in the past; now, with the new implementations of LoRA, it's doing pretty well on XL

junior owl Jul 31, 2023, 11:11 PM

#

spring sun Dont know if it helps but, I used to work with TI in the past, now with the new ...

i'd like to try a TI + fine tune unet/lora but the TI alone right now is not working right

hollow spruce Jul 31, 2023, 11:12 PM

#

junior owl anyone had any luck with SDXL textual inversion?

It's currently working, but the method to go about it is different, so don't expect old tutorials to work.
Expect a significant amount of experimentation plus longer training time, as the CLIP first breaks before it rebuilds itself.

junior owl Jul 31, 2023, 11:12 PM

#

hollow spruce it's currently working, but the method to go about it is different, so dont expe...

ah ok good to know. i'll just keep playing with it

hollow spruce Jul 31, 2023, 11:14 PM

#

junior owl ah ok good to know. i'll just keep playing with it

to get a bit of understanding, I'd recommend to do TI for something that isn't faces/anatomy. Then it's a lot easier to understand how it works

analog sinew Aug 1, 2023, 1:07 AM

#

For some reason when setting --num_cpu_threads_per_process=2 with sd-scripts, accelerate deadlocks. Very odd. Anyone see this before?

#

I think i found the issue

#

max_data_loader_n_workers

#

Yup, don't use both. https://github.com/pytorch/pytorch/issues/75147
I noticed that with --num_cpu_threads_per_process=1 and max_data_loader_n_workers=4 I had a single CPU core pinned, so hopefully swapping those numbers gives better perf.

primal isle Aug 1, 2023, 3:44 AM

#

@hollow spruce do u still recommend vit-h for auto tagging datasets for sdxl lora training? i want to experiment with ppl/faces. i will check captions manually, but i want to auto tag the dataset first as a base

hollow spruce Aug 1, 2023, 3:46 AM

#

primal isle @hollow spruce do u still recommend vit-h for auto tagging datasets for s...

Vit-H is the second best option that exists, but by far the easiest. (vit-big-g is better, but good luck on getting it to run)

primal isle Aug 1, 2023, 3:49 AM

#

hollow spruce Vit-H is the second best option that exists, but by far the easiest. (vit-big-g ...

ok. because i tried vit-h yesterday the first time, and got some weird captions on nearly every image. like a lot of artist names and words i never heard of 😂. i removed them by hand, but was wondering if i chose the right model

hollow spruce Aug 1, 2023, 3:53 AM

#

primal isle ok. because i tried vit-h yesterday the first time, and got some weird captions ...

that's the flavor chain. you do 'caption' only, to get only the first sentence, which is a neat descriptor

pliant drift Aug 1, 2023, 4:47 AM

#

can anyone tell if this is something we can use in diffusion land? https://research.nvidia.com/labs/par/Perfusion/

Key-Locked Rank One Editing for Text-to-Image Personalization

sonic narwhal Aug 1, 2023, 6:41 AM

#

What is a good upscaler for an image like this that needs to go from 512 to 1024?

warm agate Aug 1, 2023, 6:42 AM

#

sonic narwhal What is a good upscaler for a image like this that needs to go from 512 to 1024?

try 4x-Ultrasharp

sonic narwhal Aug 1, 2023, 6:46 AM

#

warm agate try 4x-Ultrasharp

In which folder does the pth file go?

warm agate Aug 1, 2023, 6:47 AM

#

sonic narwhal In which folder does the pth file go?

are you using a1111 or comfy?

sonic narwhal Aug 1, 2023, 6:47 AM

#

a1111

warm agate Aug 1, 2023, 6:48 AM

#

sonic narwhal a1111

models\ESRGAN

sonic narwhal Aug 1, 2023, 6:48 AM

#

thank you

meager spade Aug 1, 2023, 7:35 AM

#

CUDA Unified Memory is saving the day for me with LoRA training on a 3070, it OoMs without it

sonic narwhal Aug 1, 2023, 7:42 AM

#

young crater are your images scaled to 1024x1024?

Do they have to be 1024x1024, or is it good enough that one of the axes is 1024 (1024 x xxxx)?

young crater Aug 1, 2023, 7:51 AM

#

sonic narwhal Do they have to be 1024x1024 or is it good enough that one of the axis is 1024x(...

I believe they need to be one megapixel. How you spread out the megapixel is up to you (1024x1024 or 2048x512 or anything else that equals 1mp)
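To make the "one megapixel, any shape" idea concrete, here is a Python sketch that picks a width and height near one megapixel for a given aspect ratio. The function name and the snapping to multiples of 64 are assumptions for illustration, not something stated in this chat.

```python
import math

def size_for_aspect(ratio_w, ratio_h, target_pixels=1024 * 1024, multiple=64):
    """Width and height close to target_pixels for the given aspect ratio."""
    height = math.sqrt(target_pixels * ratio_h / ratio_w)
    width = height * ratio_w / ratio_h

    def snap(value):
        # Round to the nearest multiple, never below one multiple
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)

print(size_for_aspect(1, 1))   # square
print(size_for_aspect(4, 1))   # very wide, same pixel budget
print(size_for_aspect(16, 9))  # widescreen
```

The square and 4:1 cases reproduce the 1024x1024 and 2048x512 examples from the message above.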

sonic narwhal Aug 1, 2023, 7:51 AM

#

can they be more than one megapixel?

young crater Aug 1, 2023, 7:52 AM

#

sonic narwhal can they be more than one megapixel?

Yes, but the model is trained at one megapixel and your memory usage may take a hit

#

But I have had a 100% success rate with 1.5 LoRAs and a 0% success rate with SDXL, so I may not be the best source of information

signal nimbus Aug 1, 2023, 8:15 AM

#

I have a question: when training an SDXL LoRA, why do some people set the bucket max size over 1024px?

young crater Aug 1, 2023, 8:21 AM

#

signal nimbus have a question, when training an sdxl lora, why some people put the bucket max ...

While I do not know, my guess would be that it's because SDXL is trained at 1mp, not at 1024x1024 specifically. So a 4:1 aspect ratio SDXL render should be 2048x512, which would be past 1024 on one axis.

Or it's because people like to train above the base resolution

signal nimbus Aug 1, 2023, 8:31 AM

#

yeah i guess

hollow spruce Aug 1, 2023, 8:34 AM

#

sonic narwhal can they be more than one megapixel?

With buckets turned on and resolution set to 1024,1024, everything that is too large gets scaled down to the best size for SDXL.
All aspect ratios work (but you'll save VRAM by not having too many of them).

Also, although your images can be bigger, don't go complete overkill. If you have multiple 4000x7000 pictures, you'll get weird issues while the script is starting, and it may run out of RAM or just run super slow. Keep the size at a sane level, like under 4000px in the largest dimension.
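Roughly what "scaled down to the best size" could mean, as a Python sketch. This is not kohya's actual bucketing code: the function name, the rounding down to multiples of 64, and the rule of never upscaling are all assumptions made for illustration.

```python
import math

def downscale_to_area(width, height, max_area=1024 * 1024, multiple=64):
    """Shrink (never enlarge) a size to fit max_area, keeping the aspect ratio."""
    scale = min(1.0, math.sqrt(max_area / (width * height)))

    def snap(value):
        # Round down to a multiple so the result never exceeds the area budget
        return max(multiple, int(value * scale) // multiple * multiple)

    return snap(width), snap(height)

print(downscale_to_area(4000, 7000))  # an oversized source image
print(downscale_to_area(1024, 1024))  # already at the target size
```

Pre-shrinking very large images to something near this size before training is one way to avoid the slow startup described above.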

signal nimbus Aug 1, 2023, 9:18 AM

#

is this considered fast? 🤔

hollow spruce Aug 1, 2023, 9:20 AM

#

signal nimbus is this considered fast? 🤔

2000+ images, or did you set your repeat that high?

signal nimbus Aug 1, 2023, 9:20 AM

#

172 img, 12 repeats

hollow spruce Aug 1, 2023, 9:20 AM

#

at 2k images, I'd say you're at about average speed

signal nimbus Aug 1, 2023, 9:21 AM

#

cooking the new Blame lora :D

hollow spruce Aug 1, 2023, 9:21 AM

#

'cooking' well defined XD hope you dont burn it

signal nimbus Aug 1, 2023, 9:21 AM

#

improved the dataset, all hand picked and edited + processed

#

gonna look better than before for sure

hollow spruce Aug 1, 2023, 9:22 AM

#

you better be saving every epoch with that high repeat rate!

signal nimbus Aug 1, 2023, 9:22 AM

#

i do!

#

previous lora was just 3 epoch, i'll see what it does up to 5

open merlin Aug 1, 2023, 9:51 AM

#

Would it make sense to use an llm like llama2 to adjust the automatically generated prompts in the correct format? It might be able to distinguish background from main character.
Then you can just write a python script that goes over the folder with automatically labelled images and cleans it up. Then you can train the llama with the cleaned up prompts for a copilot. Is this project feasible?

open merlin Aug 1, 2023, 10:10 AM

#

@hollow spruce How do you make sure your training prompts are in the correct format? Do you really go through all images every time?

hollow spruce Aug 1, 2023, 10:17 AM

#

open merlin @hollow spruce How do you make sure your training prompts are in the corr...

normally? yeah.
but right now I'm doing fun and lazy loras, where I do everything with just the trigger word

open merlin Aug 1, 2023, 10:18 AM

#

hmm, thanks. Do you think using open source LLM's could improve captioning?

hollow spruce Aug 1, 2023, 10:18 AM

#

https://discord.com/channels/1002292111942635562/1089974139927920741
is full of examples of my sketch lora

hollow spruce Aug 1, 2023, 10:19 AM

#

open merlin hmm, thanks. Do you think using open source LLM's could improve captioning?

nop. peak efficiency is reached when you use Vit-big-G

covert pagoda Aug 1, 2023, 11:03 AM

#

hollow spruce nop. peak efficiency is reached when you use Vit-big-G

How would one use it?

sonic narwhal Aug 1, 2023, 11:29 AM

#

hollow spruce https://discord.com/channels/1002292111942635562/1089974139927920741 is full wit...

Any special tricks when captioning sketch lora? I am also making a sketch lora atm 🙂

#

my dataset is 50 images so far, no background in any of them only solid white color

#

#

All 50 images are similar to this

signal nimbus Aug 1, 2023, 11:42 AM

#

results of two and a half hours of training my Blame! lora. Not bad!

#

"ryan gosling by nihei tsutomu"

ashen field Aug 1, 2023, 11:58 AM

#

@hollow spruce your finetune config works with bitsandbytes 0.35, but once upgraded to 0.41 (when using the dev2 branch of kohya) the loss would diverge after a few epochs. It's an upstream bug, but I think you should be aware

hollow spruce Aug 1, 2023, 12:03 PM

#

ashen field @hollow spruce your finetune config works with bitsandbytes 0.35, once up...

well that explains a lot. noticed that on one install it wouldn't work correctly - but on the other it did. couldn't figure out why though.
thanks for letting me know ❤️ helps a lot

ashen field Aug 1, 2023, 12:05 PM

#

hollow spruce well that explains a lot. noticed that on one install it wouldn't work correctly...

Thanks for sharing the config. Any thoughts on using prodigy instead of adamw8bit? Assuming the optimal learning rate changed due to the bug, would adaptive method like prodigy help? Also, if I use a different rank (say 16 or 64), should I modify the learning rate?

hollow spruce Aug 1, 2023, 12:08 PM

#

ashen field Thanks for sharing the config. Any thoughts on using prodigy instead of adamw8bi...

16~32 should be fine. The 8/1 ratio for dim/alpha should be kept though (so 16/2) - significantly increases the learning time. From my tests I can say it works great - but at the same time it's not like I've run into any issues on 8/1 that weren't dataset or captioning related.
But at 64~256 you should take care not to accidentally overwrite the common knowledge of the SDXL model. Basically everything starts getting a bit worse if you use those sizes and don't take a lot of preventative measures.

ashen field Aug 1, 2023, 12:10 PM

#

hollow spruce 16~32 should be fine. 8/1 ratio for dim/alpha should be kept though (so 16/2) - ...

I was using rank 8 with 500 images, but after about 40 epochs it starts to forget some likeness of already learnt stuff, feels like it's trying to jam too much into the 40mb file. So considering upsizing to rank 16 and resume training

#

And how about conv dim? Keep at 4?

hollow spruce Aug 1, 2023, 12:12 PM

#

apart from adamw8bit, I've only tested adafactor - where I did 4 loras, once with adam, once with adaf, same dataset. adamw8bit turned out better/faster every time. But I'm assuming it will arrive at the same detail, just slower.
will be testing prodigy soon - as it sounds promising. just haven't found the time for it yet

hollow spruce Aug 1, 2023, 12:21 PM

#

ashen field And how about conv dim? Keep at 4?

conv dim?
I could swear that was lycoris specific

sour eagle Aug 1, 2023, 12:28 PM

#

I tried adamw8bit yesterday and it said it was gunna take 3 hours lmao so I switched to adamw and it only took 1 hour. Is there a reason for that?

hollow spruce Aug 1, 2023, 12:29 PM

#

ashen field I was using rank 8 with 500 images, but after about 40 epochs it starts to forge...

long captions?
the most I've tried to teach it was 100 concepts with a 5k dataset - which worked fine. But if your captions are good enough, and basically every tag gets treated like a concept, then I can see that happening much sooner.

hollow spruce Aug 1, 2023, 12:30 PM

#

sour eagle I tried adamw8bit yesterday and it said it was gunna take 3 hours lmao so I swit...

with your 3090? nop not really :/

#

how many steps was it in total?

sour eagle Aug 1, 2023, 12:31 PM

#

hollow spruce with your 3090? nop not really :/

Yeah, and it was 1180 steps I believe. I used a lower batch size because I can only do 3 max; 4 causes too much VRAM usage for me for some reason

#

Well I could probably push 4. I think 3 was like 15.8G of vram

ashen field Aug 1, 2023, 12:49 PM

#

hollow spruce long captions? the most I've tried to teach it was 100 concepts with a 5k datase...

Yeah it’s pretty long caption with tags, this could be why rank 8 gets filled quickly

hollow spruce Aug 1, 2023, 12:55 PM

#

ashen field Yeah it’s pretty long caption with tags, this could be why rank 8 gets filled qu...

also, at that point you've prob reached the limit of what constant scheduler will provide. best to move over to cosine with restarts

ashen field Aug 1, 2023, 12:58 PM

#

hollow spruce also, at that point you've prob reached the limit of what constant scheduler wil...

I found cosine hard to work with because when I use resume using last training state it just keeps decreasing lr instead of starting over

#

So the learning rate is stuck at 0

hollow spruce Aug 1, 2023, 12:59 PM

#

ashen field I found cosine hard to work with because when I use resume using last training s...

exactly. cosine with restarts is great when you know exactly where you want your training to end

ashen field Aug 1, 2023, 1:00 PM

#

hollow spruce exactly. cosine with restarts is great when you know exactly where you want your...

Yeah, but I train 10 epochs at a time and resume if I feel it needs more. How would I resume the training state but reset cosine if it already ended its cycle in the previous 10 epochs?

hollow spruce Aug 1, 2023, 1:02 PM

#

ashen field Yeah but I train 10 epochs at a time and resumes if I feel need more. How would ...

save every 10 epochs. restart every 10 epochs. max epochs of 500. then go on vacation for the weekend XD
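A Python sketch of what "restart every 10 epochs" looks like for the learning rate, assuming a cosine schedule with hard restarts and no warmup. The function name and the base learning rate of 1e-4 are placeholders, and treating the trainer's cycle count as max epochs divided by the restart interval is an assumption, not something confirmed in this chat.

```python
import math

def cosine_restart_lr(epoch, max_epochs, restart_every, base_lr):
    """Learning rate at the start of `epoch` for cosine with hard restarts."""
    if epoch >= max_epochs:
        return 0.0
    # Position inside the current cycle, from 0.0 (just restarted) towards 1.0
    cycle_progress = (epoch % restart_every) / restart_every
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * cycle_progress))

max_epochs, restart_every = 500, 10
num_cycles = max_epochs // restart_every  # cycles needed under this assumption
print(num_cycles)

for epoch in (0, 5, 9, 10):
    print(epoch, cosine_restart_lr(epoch, max_epochs, restart_every, 1e-4))
```

Under these assumptions the rate is back at its full value at epoch 10, which lines up with saving a checkpoint at the end of every cycle.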

ashen field Aug 1, 2023, 1:05 PM

#

hollow spruce save every 10 epochs. restart every 10 epochs. max epochs of 500. then go on vac...

Sorry for the dumb question but how to set the restart to be 10 epochs? I see there’s “lr number of cycles” setting, so in this case I would set max epoch to 500 and “lr number of cycles” to 50? Also what’s the difference between “epoch” and “max train epoch”?

hazy elbow Aug 1, 2023, 2:53 PM

#

signal nimbus results of two and a half hours of training my Blame! lora. Not bad!

Looks Superb!. Please if possible upload it on CivitAI.

sacred grail Aug 1, 2023, 6:08 PM

#

ur not putting --w 1024 and --h 1024 in the sample prompts right? that might be the problem

ivory yew Aug 1, 2023, 6:15 PM

#

has anyone tried finetuning the XL unet instead of finetuning a LoRA?

analog sinew Aug 1, 2023, 6:25 PM

#

waifu diffusion xl

young crater Aug 1, 2023, 6:31 PM

#

ivory yew has anyone tried finetuning the XL unet instead of finetuning a LoRA?

Isn't a LoRA a finetune of the XL unet?

analog sinew Aug 1, 2023, 6:43 PM

#

now this is overfitting 😎

sour eagle Aug 1, 2023, 7:32 PM

#

Is there a way to get a fancy graph of my loss? I’m using kohya gui.

analog sinew Aug 1, 2023, 7:34 PM

#

i think you can configure wandb

young crater Aug 1, 2023, 8:25 PM

#

sour eagle Is there a way to get a fancy graph of my loss? I’m using kohya gui.

Start the tensor board thing

sour eagle Aug 1, 2023, 8:34 PM

#

young crater Start the tensor board thing

Idek how to do that😂

pliant drift Aug 1, 2023, 8:43 PM

#

in the gui

analog sinew Aug 1, 2023, 8:57 PM

#

got masked lora training working on kohya-ss/sd-scripts sdxl branch

#

(mask covers the text/drawing in the training data)

#

anyone have a masked hands dataset? 😄

young crater Aug 1, 2023, 9:45 PM

#

young crater Aug 1, 2023, 9:45 PM

#

sour eagle Idek how to do that😂

I sent an image of it

sour eagle Aug 1, 2023, 9:48 PM

#

Ohhhh okay thanks

#

Sorry didn’t see that as I’m not home😂

sinful rune Aug 2, 2023, 3:46 AM

#

Hi, does anyone know how to train on multi gpu devices with everydream2?

hollow spruce Aug 2, 2023, 5:56 AM

#

#✨｜sdxl or #🤝｜tech-support - but this channel is for training the sdxl model only

sonic narwhal Aug 2, 2023, 6:20 AM

#

#

Sample images of sdxl sketch style training looking good 😆

hollow spruce Aug 2, 2023, 6:30 AM

#

sonic narwhal

sonic narwhal Aug 2, 2023, 9:27 AM

#

#

what does training comment do?

#

Have you fixed this?

hollow spruce Aug 2, 2023, 10:09 AM

#

sonic narwhal

nothing functionally. some UIs let you see the comment if one was attached

sonic narwhal Aug 2, 2023, 10:52 AM

#

#

cursed sample images

dull bramble Aug 2, 2023, 12:20 PM

#

sonic narwhal Have you fixed this?

No, I haven't tried yet

restive bridge Aug 2, 2023, 8:50 PM

#

why does training pull 450w from my gpu 🙃

young crater Aug 2, 2023, 10:10 PM

#

restive bridge why does training pull 450w from my gpu 🙃

What gpu do you have?

restive bridge Aug 2, 2023, 10:10 PM

#

young crater What gpu do you have?

3090

#

ftw3

young crater Aug 2, 2023, 10:11 PM

#

restive bridge ftw3

Your gpu has a 450w tdp

#

Some gpus have a way to change that with a switch on the top of the gpu, but I’m not sure about the ftw3

restive bridge Aug 2, 2023, 10:14 PM

#

young crater Your gpu has a 450w tdp

It's not OC'd but pulls 420+. The problem is it's overdrawing and shuts my PC off every time I train.

#

my sensor logs say it's pulling 110% TDP

young crater Aug 2, 2023, 10:14 PM

#

restive bridge its not oc but pulls 420+. the problem is it's overdrawing and shuts my pc off ...

What PSU and CPU do you have?

restive bridge Aug 2, 2023, 10:15 PM

#

young crater What PSU and CPU do you have?

r7 5800x, corsair 850w 80+ gold

young crater Aug 2, 2023, 10:18 PM

#

550w between the two, it’s possible that you are getting a spike in your power draw causing it to shut down.

Assuming you are using caith’s settings, if you lower batch size from 8 to 4, you will lower your memory usage quite a bit and, in turn, wattage.

restive bridge Aug 2, 2023, 10:20 PM

#

young crater 550w between the two, it’s possible that you are getting a spike in your power d...

yeah i can always lower batch size but how are they doing batch 8 on 2070s with the exact same config?

young crater Aug 2, 2023, 10:20 PM

#

By running way slower and offloading to system memory

#

The extra performance (by a factor of like 5-10x in this case) is what is drawing so much power

#

And as Caith notes, running batch 4 means you can check your training in ComfyUI while you're training rather than relying on the not-great sample output

#

Which means you’d technically be running faster as you can check your work sooner

restive bridge Aug 2, 2023, 10:24 PM

#

something else is off. i'm using caith's config file with no changes and getting crashes even on batch 6, while thousands of others are running it fine on worse gpu's and higher batches, and if anything they only get oom.

#

if i got oom i wouldnt mind

young crater Aug 2, 2023, 10:25 PM

#

🤔

In that case, it sounds like there’s something up with the gpu specifically.. have you tried running stress tests on it recently?

restive bridge Aug 2, 2023, 10:27 PM

#

yes a few today. I also use vr heavily and have never had problems. temps are good

ocean dune Aug 2, 2023, 10:43 PM

#

Hoi, for training, do the images have to be 1:1 ratio? Like 1024x1024? And if so, how would I, say, train one for a game character? Most of the images of the ingame character are in 21:9 ratio or around there. Just photoshop the images to remove the backgrounds, make each image as wide as it is tall, and save them as transparent PNGs? :P

young crater Aug 2, 2023, 11:15 PM

#

ocean dune Hoi, for training, do the images have to be 1:1 ratio? Like 1024x1024? And if so...

You can train at different aspect ratios, just make sure of a few things:

1. The image size should be one megapixel in total (1024x1024, 2048x512, etc). Here is a calculator for this purpose: https://www.scantips.com/mpixels.html (there are others as well, it's just a hard thing to find). You can use Presize.io to crop your whole dataset at once
2. Set max image size in the settings to your source image resolution
3. When testing in ComfyUI, make sure to set your CLIP resolution to the same aspect ratio as your generated image or else SAI staff may post your lora in the sdxl chat and call you out for claiming to fix double characters

dawn pawn Aug 2, 2023, 11:34 PM

#

restive bridge why does training pull 450w from my gpu 🙃

You could power limit the GPU. The command is: nvidia-smi -pl 350 if you wanted to limit it to 350W.

I run my 3090 at 300W and lose about 3% performance in training compared to the normal power and it's much quieter as well (i.e. it's very worth it).
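If you want to script the power limit, a minimal Python sketch could look like this. The helper names are hypothetical; the only real command involved is `nvidia-smi -pl`, as given above. Running it typically needs administrator or root rights, and the allowed wattage range depends on the card.

```python
import subprocess

def power_limit_command(watts):
    """Build the nvidia-smi invocation that caps GPU power at `watts`."""
    if watts <= 0:
        raise ValueError("power limit must be a positive number of watts")
    return ["nvidia-smi", "-pl", str(watts)]

def apply_power_limit(watts):
    """Run the command; raises CalledProcessError if nvidia-smi refuses it."""
    subprocess.run(power_limit_command(watts), check=True)

# Only print the command here; call apply_power_limit(350) to actually set it
print(" ".join(power_limit_command(350)))
```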

young crater Aug 2, 2023, 11:46 PM

#

dawn pawn You could power limit the GPU. The command is: nvidia-smi -pl 350 if you wanted ...

are there any drawbacks to this in terms of stability?

dawn pawn Aug 2, 2023, 11:48 PM

#

young crater is there any drawbacks to this in terms of stability?

No. It's perfectly stable and I have done this for months.

ocean dune Aug 3, 2023, 12:10 AM

#

young crater You can train at different aspect ratios, just make sure of a few things: **1....

Ah nice, so as long as pixel count is the same as model trained, then it works?

#

There's no clip res for loras as far as i can see. Used this guide https://www.youtube.com/watch?v=AY6DMBCIZ3A

SECourses: First Ever SDXL Training With Kohya LoRA - Stable Diffusion XL Trai...

#

Also, is it normal for when training a lora for it to use all 24GB and then some? Thonk Training for SDXL

sour eagle Aug 3, 2023, 12:17 AM

#

i dont trust that guide lmao

young crater Aug 3, 2023, 12:21 AM

#

ocean dune There's no clip res for loras as far as i can see. Used this guide https://www....

As Via noted, please use Caith's guide if you want to follow one; it is based on what SDXL needs. The guide you linked is an SD1.5 preset slightly modified to work in XL.

TLDR: The issue is LORAs are relative and training 10% of SD1.5 is a very different thing than 10% of SDXL.

I'll grab a link to Caith's guide

#

#🔧｜finetune message

ocean dune Aug 3, 2023, 12:44 AM

#

Thanks :)

#

So how much video memory should a lora training consume? Like all of it?

young crater Aug 3, 2023, 1:11 AM

#

Batch 4: 10gb
Batch 8: 16gb
Batch 10: 24gb

or something along those lines

ocean dune Aug 3, 2023, 1:15 AM

#

young crater https://discord.com/channels/1002292111942635562/1026382406279770152/11354808097...

Thanks! Doing 8 batches now and that uses 18GB video memory :P

#

Will take 19 sec per iteration, so 15 hours for this quick test lol.

young crater Aug 3, 2023, 1:22 AM

#

whats your repeats at o.o ?

#

A SDXL lora of any reasonable size on a 3090 should be 15-50 minutes

#

should be around 80 Epochs, 1 repeats and 8 batch

ocean dune Aug 3, 2023, 1:23 AM

#

I used the config in the post you linked to. And for some reason, training just ceased

young crater Aug 3, 2023, 1:24 AM

#

press space on the command prompt, but you have way too many repeats

ocean dune Aug 3, 2023, 1:25 AM

#

young crater A SDXL lora of any reasonable size on a 3090 should be 15-50 minutes

The heck? Then why did it say 48k steps Thonk And 2 sec per iteration

young crater Aug 3, 2023, 1:26 AM

#

repeats at one, 35 images, epoch 100, batch 8 takes about 2 hours

ocean dune Aug 3, 2023, 1:26 AM

#

Ah, seems like i forgot this one. Gonna do a thorough folder structure tomorrow and do the deprecated folder part as well :P

ocean dune Aug 3, 2023, 1:26 AM

#

young crater repeats at one, 35 images, epoch 100, batch 8 takes about 2 hours

Do you do 1024 or 512 training?

young crater Aug 3, 2023, 1:27 AM

#

ocean dune Do you do 1024 or 512 training?

1mp @ 21:9

ocean dune Aug 3, 2023, 1:27 AM

#

Huh, odd. And yeah, something is amiss lol

young crater Aug 3, 2023, 1:27 AM

#

whats your img folder name?

#

(You can censor the prompt name if you want, just want to know the first number)

ocean dune Aug 3, 2023, 1:28 AM

#

100_link. Testing making a game character lora.

young crater Aug 3, 2023, 1:28 AM

#

ocean dune 100_link. Testing making a game character lora.

rename to 1_link

#

you are using Epochs as steps per image with Caith's workflow
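
As a sketch of the folder-name convention discussed here (the leading number is the repeat count, the rest is the name), the helper below splits a folder name into its two parts. The function name is hypothetical and this is not the trainer's own parsing code.

```python
def split_dataset_folder(name):
    """Split '<repeats>_<concept>' into (repeats, concept)."""
    prefix, separator, concept = name.partition("_")
    if not separator or not prefix.isdigit():
        raise ValueError(f"expected '<repeats>_<name>', got {name!r}")
    return int(prefix), concept


print(split_dataset_folder("100_link"))  # (100, 'link')
print(split_dataset_folder("1_link"))    # (1, 'link')
```

With `100_link`, every image counts 100 times per epoch, which is why the step count balloons once epochs are also set high.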

ocean dune Aug 3, 2023, 1:28 AM

#

Ah, thought number indicated steps per image

ocean dune Aug 3, 2023, 1:29 AM

#

young crater you are using Epochs as steps per image with Caith's workflow

Gotchu. Is there a tutorial video that explains what each of that stuff means? Like "epochs" :P

#

Not too good with text sadly, though the guide you linked to was fairly easy, just some terms i wanna dig deeper into :P

young crater Aug 3, 2023, 1:31 AM

#

Steps = (Repeats x Img Count x Epochs) / Batch Size

Aim for 150-200 steps with Caith's workflow

I honestly don't know what they mean either, sadly. Hopefully one day there will be a solid video tutorial, but most today are either bloated or confusing..
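
The steps formula above, written out as a quick calculator. The function names are illustrative, and the count a trainer reports may differ slightly from this plain division.

```python
import math


def total_steps(repeats, image_count, epochs, batch_size):
    """Steps = (Repeats x Img Count x Epochs) / Batch Size, rounded down."""
    return (repeats * image_count * epochs) // batch_size


def epochs_for_target(target_steps, repeats, image_count, batch_size):
    """Smallest epoch count that reaches at least target_steps."""
    return math.ceil(target_steps * batch_size / (repeats * image_count))


print(total_steps(repeats=1, image_count=39, epochs=20, batch_size=8))   # 97
print(total_steps(repeats=39, image_count=39, epochs=20, batch_size=8))  # 3802
print(epochs_for_target(150, repeats=1, image_count=35, batch_size=8))   # 35
```

The second call shows how leaving a large repeat count in place multiplies the total once epochs are used to control training length.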

hollow spruce Aug 3, 2023, 2:15 AM

#

young crater Batch 4: 10gb Batch 8: 16gb Batch 10: 24gb or something along those lines

changes depending on bf16 support and multiple other factors.
most 12gb vram gpus can run batch 2. Some are just over the limit.

Enabling full bf16 training should fix this, and allow for much higher batches - but I still haven't tried that setting as there's not much point on my rtx4090 - since lower end cards may behave differently

#

but I've seen the issue people run into with vram, so I'll be adding presets for 12gb vram, 16gb vram, 24gb vram in the next few days - should solve the first issue people usually encounter

sour eagle Aug 3, 2023, 2:28 AM

#

added a lora for chappie:P works surprisingly well. probably would pair good with other mech loras in the future https://civitai.com/models/121549

hollow spruce Aug 3, 2023, 2:32 AM

#

sour eagle added a lora for chappie:P works surprisingly well. probably would pair good wit...

oh damn those images look legit O:

sour eagle Aug 3, 2023, 2:33 AM

#

hollow spruce oh damn those images look legit O:

thanks!

#

i just realized all of them are in the same pose. lmao

#

except one

restive bridge Aug 3, 2023, 4:06 AM

#

why does training with regularization double the training steps even though reg repeats is 1 and img repeats is 20?

hollow spruce Aug 3, 2023, 4:26 AM

#

1 reg image per dataset image is the general rule of thumb

#

I think kohya automates that

restive bridge Aug 3, 2023, 4:33 AM

#

hollow spruce I think kohya automates that

so reg repeats is ignored? strange cuz in the kohya dataset preparation tool it asks for reg repeats.

young crater Aug 3, 2023, 4:34 AM

#

restive bridge so reg repeats is ignored? strange cuz in the kohya dataset preparation tool it ...

I haven't tested it, but I suspect that reg repeats is repeats per data set image. So if you set it to, say, 2, it will do 2 reg images per data set image

restive bridge Aug 3, 2023, 4:34 AM

#

ohh

signal nimbus Aug 3, 2023, 7:20 AM

#

guys what's the max batch size you usually put on a 24gb GPU when training SDXL LoRa?

ruby pond Aug 3, 2023, 9:12 AM

#

signal nimbus guys what's the max batch size you usually put on a 24gb GPU when training SDXL ...

8

signal nimbus Aug 3, 2023, 9:55 AM

#

damn stupid me using batch 2 lol

hollow spruce Aug 3, 2023, 10:15 AM

#

signal nimbus guys what's the max batch size you usually put on a 24gb GPU when training SDXL ...

8 if I'm afk. 3 if I'm using the pc - so that I can actively continue using comfy & test my checkpoints while its training

orchid yoke Aug 3, 2023, 11:00 AM

#

hollow spruce 8 if I'm afk. 3 if I'm using the pc - so that I can actively continue using comf...

Thanks so much for everything youve said on the topic, ive been stalking. (sorry) - Do you have any tips or a comfy setup you recommend, i ended up switching back to Auto but purely because my Loras are being mutilated on Comfy but not there, and i havent narrowed down why yet.

hollow spruce Aug 3, 2023, 11:05 AM

#

orchid yoke Thanks so much for everything youve said on the topic, ive been stalking. (sorry...

I made sure to include some notes.
This is the easiest way to test your lora in comfy. runs quick and easy using base only.

📎 simple_lora.json

#

for a more "full" workflow, I can recommend sytans highres fix (since that uses the base again at the end - therefore more lora details
but for testing the capabilities of your lora - this is the most efficient way

proud robin Aug 3, 2023, 11:07 AM

#

whats the easiest/ best way to generated images of myself in any style?

orchid yoke Aug 3, 2023, 11:15 AM

#

hollow spruce I made sure to include some notes. This is the easiest way to test your lora in ...

thanks so much for this and all your comments, helpful notes, thanks for you.. basically.

sour eagle Aug 3, 2023, 4:19 PM

#

I wonder when any actual good fine tunes will come about for sdxl. Dreamshaper is meh and all the anime ones are meh as of right now.

tall condor Aug 3, 2023, 6:39 PM

#

does anybody have a proper description on how exactly finetuning in kohyass works?

#

also is it possible to merge 2 lora models to 1 model?

restive bridge Aug 3, 2023, 8:13 PM

#

hollow spruce for a more "full" workflow, I can recommend sytans highres fix (since that uses ...

I was very excited with sytans 3rd pass to bring back lora detail but it made details a bit too perfect. for my use case if someone is ugly irl they need to stay ugly in the output lol.

So i discovered that if a lora is sufficiently trained enough, the details of a face make it into the first base pass and arent removed for refiner pass. refiner does its job on details but doesnt remove any likeness. It's quite nice

young crater Aug 3, 2023, 8:15 PM

#

restive bridge I was very excited with sytans 3rd pass to bring back lora detail but it made de...

What’s your refiner pass set to?

restive bridge Aug 3, 2023, 8:21 PM

#

young crater What’s your refiner pass set to?

5 steps on 2m karras

young crater Aug 3, 2023, 8:23 PM

#

restive bridge 5 steps on 2m karras

DDIM for the base/upscale?

restive bridge Aug 3, 2023, 8:24 PM

#

young crater DDIM for the base/upscale?

also 2m karras, no upscale. this is just 2 stage workflow not sytans

young crater Aug 3, 2023, 8:25 PM

#

Ahh, I misread the post, tyty

tall condor Aug 3, 2023, 11:24 PM

#

does it make sense to train a 1.5 model with max size 768 by 768? i read that it can pick up more details this way? is it true?

queen matrix Aug 3, 2023, 11:44 PM

#

I have usually trained 1.5 at 768x768 and everything always worked out fine. But I never compared to training at 512x512 with the same settings.

young crater Aug 3, 2023, 11:47 PM

#

I mainly trained at 512x512 for faces, with a good starting checkpoint, the faces looked great.

I think 768x768 could help with more complex trainings?

hollow spruce Aug 4, 2023, 3:46 AM

#

There's barely any documentation on this if you google, so I thought I'd post it here. Advanced settings in kohya

Dropout caption every n epochs
Usually, images and captions are learned as a pair, but it's possible to train just on "images without captions" every certain number of epochs.

This option allows you to specify "drop out captions every ○ epochs."

For instance, if you set this to 2, you will conduct image training without captions every 2 epochs (2nd epoch, 4th epoch, 6th epoch...).

By training on images without captions, it is expected that your LoRA will learn a more comprehensive feature set from the images. It can also help prevent the image features from being tied too closely to specific words. However, if you use captions too sparingly, your LoRA could become ineffective at prompts, so be cautious.

The default is 0, and in the case of 0, caption dropout is not performed.

Rate of caption dropout
This is similar to the "Dropout caption every n epochs" mentioned above, but during the entire learning process, you can train on "images without captions" for a certain proportion of the time.

Here, you can set the proportion of images without captions. 0 means "always use captions during training," and 1 means "never use captions during training."

Which images will be trained as "images without captions" is determined randomly.

For example, if you train LoRA with 20 images, reading each image 50 times for just 1 epoch, the total number of image learnings is 20 images x 50 times x 1 epoch = 1000 times. If you set the rate of caption dropout to 0.1, 1000 times x 0.1 = 100 times, you will train on "images without captions."

The default is 0, and all images are learned with captions
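
The two dropout settings described above can be sanity-checked with a few lines of arithmetic. This only mirrors the description in this message; it is not taken from kohya's code, and the function names are made up for illustration.

```python
def captionless_epochs(total_epochs, every_n):
    """Epochs trained without captions when 'dropout caption every n epochs' is every_n."""
    if every_n <= 0:
        return []  # 0 means caption dropout is not performed
    return [epoch for epoch in range(1, total_epochs + 1) if epoch % every_n == 0]


def expected_captionless_samples(image_count, repeats, epochs, dropout_rate):
    """Expected number of image learnings that happen without a caption."""
    return round(image_count * repeats * epochs * dropout_rate)


print(captionless_epochs(total_epochs=6, every_n=2))   # [2, 4, 6]
print(expected_captionless_samples(20, 50, 1, 0.1))    # 100
```

Since the captionless images are picked randomly, the second number is an expectation and an actual run will vary around it.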

opal jacinth Aug 4, 2023, 8:07 AM

#

hollow spruce Barely docu on this if you google, so I thought I'd post it here. Advanced setti...

thx for the guide above and your commitment. First time I ever played around with SD in general and still got no clue about all the terms, but was already able to do my own training and create some nice images afterwards with your basic workflow

compact trellis Aug 4, 2023, 10:40 AM

#

has anyone managed to get the refiner to train?

hollow spruce Aug 4, 2023, 11:37 AM

#

compact trellis has anyone managed to get the refiner to train?

you can stalk the dev branch if you want, but so far, casual refiner training is not a thing yet
https://github.com/kohya-ss/sd-scripts/commits/dev

#

once it's supported, you'll see it via commit there first

stone garden Aug 4, 2023, 12:03 PM

#

Question, is there a nice automated way to generate captions, and a UI to edit them? I want something to work with initially, and then edit the captions myself.

#

Also to curate the dataset, delete some images, etc.

narrow kraken Aug 4, 2023, 12:05 PM

#

Question, is it possible to merge xl1.0 base with the refiner ?

#

@stone garden lol, sup

hollow spruce Aug 4, 2023, 12:06 PM

#

stone garden Question, is there a nice automated way to generate captions, and a UI to edit t...

Interrogator on webui to auto-tag using the ViT-H or ViT-bigG model. (ViT-bigG is better, but requires more resources to run)
curation, I recommend adobe bridge for the initial "delete, rate, move" part.
resize all images above 4000px to be less (can be automated in many apps)
then move to hydrus network, where you can import all the tags, and efficiently edit them

I'll have a comprehensive guide up eventually, on how to do it. but this is it in a nutshell

stone garden Aug 4, 2023, 12:07 PM

#

How much do all these cost lol?

hollow spruce Aug 4, 2023, 12:07 PM

#

those are all free ^^

#

hydrus network is somewhat complicated to learn though. so if your dataset is like around 50~100 images, you could use this app instead (not as good, but much much easier to use)
https://github.com/lukemoore66/FastCaption

hollow spruce Aug 4, 2023, 12:11 PM

#

stone garden Question, is there a nice automated way to generate captions, and a UI to edit t...

for the actual training, these settings are still valid if you're just starting out with sdxl
#🔧｜finetune message

stone garden Aug 4, 2023, 12:14 PM

#

hollow spruce hydrus network is somewhat complicated to learn though. so if your dataset is li...

It’s 1000s, but of course many will be pruned for lighting, etc.

analog sinew Aug 4, 2023, 3:20 PM

#

hollow spruce hydrus network is somewhat complicated to learn though. so if your dataset is li...

i've also just open sourced a tool I developed for myself for combined manual captioning, tagging, and masking. It also has a few handy scripts for moving/filtering datasets and scraping booru tags.
https://github.com/briansemrau/image-data-tool

sour eagle Aug 4, 2023, 3:22 PM

#

Does anyone know if it’s possible to train to a negative amount instead of 1? Like, if I train a Lora I want to train the negative variant so the positive does the opposite? If that makes sense.

tepid sundial Aug 4, 2023, 3:23 PM

#

Haven't experimented much with this yet, but super interested in its impact (also the impact of training with a different scheduler). If you give it a try I'd love to hear what you find.

hollow spruce Aug 4, 2023, 3:23 PM

#

sour eagle Does anyone know if it’s possible to train to a negative amount instead of 1? Li...

yes/no?
basically you can train the lora normally, but then use it with negative values

#

tried it with my sketch lora and got pretty weird (but not bad) results

sour eagle Aug 4, 2023, 3:25 PM

#

hollow spruce yes/no? basically you can train the lora normally, but then use it with negative...

Yeah I guess that would be the only way, but I think that would be an interesting concept to somehow add if it’s even possible.

hollow spruce Aug 4, 2023, 3:26 PM

#

sour eagle Yeah I guess that would be the only way, but I think that would be an interestin...

you should be able to flip the weights after you're done. not sure about how to do that myself, but basically you invert the weights on the tokens, and then -1 = 1. so it should do what you want

tepid sundial Aug 4, 2023, 3:27 PM

#

tall condor does anybody have a propper description on how exactly finetuning in kohyass wor...

The code in sd-scripts is fairly straightforward and makes sense in terms of just adopting the minimal required changes from 1.5. There aren't many significant differences compared to, for example, the training setup in the diffusers training for LoRA, and there's an open fine-tune PR for diffusers right now that is also very similar in approach (some things differ, like the inclusion of snr for example). Diffusers PR here https://github.com/huggingface/diffusers/pull/4401

What's still quite unclear to me is what the impact of all these slight differences is; there's not much data available. For example, how good the results are that people get with float16, bfloat16 vs float32 precision, snr or not, batch sizes, learning rates, etc. There's simply quite a lack of benchmarks and data still. Model trainers are notoriously BAD at sharing details of their findings as well.


sour eagle Aug 4, 2023, 3:28 PM

#

hollow spruce you should be able to flip the weights after you're done. not sure about how to ...

Hmm yeah I’ll look into that

hollow spruce Aug 4, 2023, 3:30 PM

#

sour eagle Does anyone know if it’s possible to train to a negative amount instead of 1? Li...

oh shit. i REALLY need to load up my accidental youtube artifact compression LoRA

sour eagle Aug 4, 2023, 3:31 PM

#

hollow spruce oh shit. i REALLY need to load up my accidental youtube artifact compression LoR...

And do what with it? Invert the weights to see if it adds detail? Lmao

hollow spruce Aug 4, 2023, 3:31 PM

#

yes! XD

sour eagle Aug 4, 2023, 3:37 PM

#

hollow spruce yes! XD

Do it! Lmk how you did it if you figure it out

analog sinew Aug 4, 2023, 3:40 PM

#

I made an inverse aesthetic lora, but the results weren't obvious

sour eagle Aug 4, 2023, 3:41 PM

#

analog sinew I made an inverse aesthetic lora, but the results weren't obvious

Did you invert the weights? If so how did you?

hollow spruce Aug 4, 2023, 3:43 PM

#

you can just do weight = -1 to quick test it. results are the same as if you inverted it
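
A toy check of why applying a LoRA at strength -1 behaves like flipping its weights. It assumes the usual LoRA formulation, where the change to a layer is strength * (up @ down); the tiny matrices and function names are made up for illustration, and real LoRA files store many such pairs.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_delta(up, down, strength):
    """Weight change contributed by one LoRA pair at the given strength."""
    return [[strength * value for value in row] for row in matmul(up, down)]


def negate(matrix):
    return [[-value for value in row] for row in matrix]


up = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]   # 3x2, toy values
down = [[0.5, 1.0, -2.0], [1.5, 0.0, 4.0]]   # 2x3, toy values

applied_at_minus_one = lora_delta(up, down, -1.0)
flipped_then_applied = lora_delta(negate(up), down, 1.0)
print(applied_at_minus_one == flipped_then_applied)  # True
```

Only one of the two matrices is negated; negating both would cancel out and give back the original effect.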

#

and hot damn its working

#

my dumb accidental youtube artifact lora is now high detail lora

#

#

right is with lora applied with -1

sour eagle Aug 4, 2023, 4:14 PM

#

Is that the same res? If so I’d say that did add some detail. Interesting

hollow spruce Aug 4, 2023, 4:30 PM

#

big image for the brave!
Artifact lora is using weight -1
BASE ONLY | Artifact Lora | Face Lora | Artifact + Face Lora

#

also some cherry picked results:

#

first of all, this makes using 2 loras at the same time a damn lot easier.
and the negative lora helps a lot with following the prompt XD which... I can't really explain

either way, this opens up a whole new box of stuff to research

young crater Aug 4, 2023, 4:50 PM

#

hollow spruce Interrogator on webui to auto-tag using the Vit-H or Bit-big-G model. (vit-big-g...

For the vit-h captioning, should we use the caption preset on A111?

I feel like it always gives very simple captions to the images, but idk what an ideal caption length would be

analog sinew Aug 4, 2023, 5:02 PM

#

has there been any proper research on negative prompting?

stone garden Aug 4, 2023, 5:20 PM

#

hollow spruce Interrogator on webui to auto-tag using the Vit-H or Bit-big-G model. (vit-big-g...

I know you said "More resources" but this is getting ridiculous. Am I doing something wrong?

Bildschirmfoto_2023-08-04_um_19.19.18.png

fringe ravine Aug 4, 2023, 5:27 PM

#

restive bridge I was very excited with sytans 3rd pass to bring back lora detail but it made de...

neck is sus

young crater Aug 4, 2023, 5:29 PM

#

stone garden I know you said "More resources" but this is getting ridiculous. Am I doing some...

in other messages Caith did (more or less) say "good luck running it lmao" so it may be very difficult to run

restive bridge Aug 4, 2023, 10:31 PM

#

anyone got foolproof background removal i can run locally?

hollow spruce Aug 5, 2023, 12:33 AM

#

stone garden I know you said "More resources" but this is getting ridiculous. Am I doing some...

if you're running it yourself, you need the 8 bit version.
if you're doing it via interrogator app, then once the initialization is done, it should work even on a 4090 if you pick fast/caption only

#

but yeah, for properly running the flavor chain each time you'll be needing 48/80gb vram depending on 8bit or 16bit

#

it's not really a local solution, as the 3090/4090 can barely run it.
most that I'm aware of, run it via an A100 runpod or other hosted solution

stone garden Aug 5, 2023, 3:09 AM

#

hollow spruce it's not really a local solution, as the 3090/4090 can barely run it. most that ...

Thanks for answering, it ran, just didn’t output anything to the console.

sour eagle Aug 5, 2023, 3:13 AM

#

anyone messed with specified down and up weights when training lora? i get an error saying no parameters specified

hollow spruce Aug 5, 2023, 3:32 AM

#

sour eagle anyone messed with specified down and up weights when training lora? i get an er...

prob worth checking if the number of blocks changed in sdxl - cause if yes, then you'd need to supply more parameters to satisfy all blocks

sour eagle Aug 5, 2023, 3:33 AM

#

i looked it up, still 12, at least it should be. may have some hidden parameters that are not in the gui that are missing for me idk

rain scarab Aug 5, 2023, 5:16 AM

#

restive bridge anyone got foolproof background removal i can run locally?

Inpaint?

restive bridge Aug 5, 2023, 5:19 AM

#

rain scarab Inpaint?

like automated tho

rain scarab Aug 5, 2023, 5:27 AM

#

restive bridge like automated tho

https://www.lightxeditor.com I use this. The AI background removal. pretty easy after a bit of fumbling. i use it a good bit now. If you find a better one let me know.


stone garden Aug 5, 2023, 1:43 PM

#

How are regularization images supposed to be created ideally???

#

If I just generate regularization images using "photo of a woman", I never get full body shots, face closeups, sitting or lying poses, or pictures from behind, while the training images include those shots.

#

On https://rentry.org/59xed3#hard-route I read:
"regularization images are reduced to latents and then trained on how to produce them back, using DDIM as sampler"
"You will want to generate an AI reg image for every training image you have. The names will have to match. So every training image will have a matching regularization image."
"Same prompt as the caption for the training image."
"DDIM sampler, resolution equal to your training resolution (not the same as the training image!), seed equal to your training seed (420 if you didn't touch it in the scripts below)."


#

I would imagine the AI can learn the difference between training and regularization images best if the prompt is the same for both images, just with the trigger word in the training image!?

warm fog Aug 5, 2023, 2:06 PM

#

I am renting a 1 x A100 SXM 80GB on runpod. What are some good initial settings for dreambooth sdxl training that optimize for performance? I am using the huggingface/diffusers library with latest pytorch, latest diffusers lib, and the training pipeline from examples/dreambooth. I guess fp16 precision is a good choice for performance on an A100? I get something like 3.44s/it during training.

accelerate launch train_dreambooth_lora_sdxl.py \
  --pretrained_model_name_or_path=$MODEL_NAME  \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --mixed_precision="fp16" \
  --instance_prompt="a photo of sks dog" \
  --resolution=1024 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --learning_rate=1e-4 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=500 \
  --validation_prompt="A photo of sks dog in a bucket" \
  --validation_epochs=25 \
  --seed="0"

and ~/.cache/huggingface/accelerate/default_config.yaml looks like this:

compute_environment: LOCAL_MACHINE
distributed_type: 'NO'
downcast_bf16: 'no'
gpu_ids: all
machine_rank: 0
main_training_function: main
mixed_precision: fp16
num_machines: 1
num_processes: 1
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
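
For reference, the batch-related flags in the command above combine as follows. This sketch assumes `--max_train_steps` counts optimizer updates (one update per `gradient_accumulation_steps` forward passes); the function names are illustrative and not part of diffusers or accelerate.

```python
def effective_batch_size(train_batch_size, gradient_accumulation_steps, num_processes=1):
    """Images contributing to each optimizer update."""
    return train_batch_size * gradient_accumulation_steps * num_processes


def samples_seen(max_train_steps, effective_batch):
    """Total image presentations over the run, under the assumption stated above."""
    return max_train_steps * effective_batch


# Values taken from the command and accelerate config in this message.
batch = effective_batch_size(train_batch_size=1, gradient_accumulation_steps=4, num_processes=1)
print(batch)                     # 4
print(samples_seen(500, batch))  # 2000
```

On an 80GB card there is probably room to raise `--train_batch_size` and lower the accumulation steps while keeping the same effective batch, which is one thing to try before reaching for other optimizations.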

warm fog Aug 5, 2023, 2:16 PM

#

warm fog I am renting a 1 x A100 SXM 80GB on runpod. What are some good initial settings ...

I guess I should try torch.compile

tidal silo Aug 5, 2023, 7:40 PM

#

anyone have any good recommendations for a good tool to manage / tag / edit datasets?

warm agate Aug 5, 2023, 9:09 PM

#

tidal silo anyone have any good recomendations for a good tool to manage / tag / edit datas...

use blip captions in a1111

orchid dew Aug 5, 2023, 9:10 PM

#

seems like a good place to ask, but how can someone make a sdxl checkpoint? I can't seem to figure out... anything

tall condor Aug 6, 2023, 12:58 AM

#

quick question, what is "CV"?

#

"8bit Adam is slower but saves memory and results in a higher CV" - what does CV stand for?

jade hornet Aug 6, 2023, 1:02 AM

#

orchid dew seems like a good place to ask, but how can someone make a sdxl checkpoint? I ca...

I think it's limited to training loras until someone figures it out

orchid dew Aug 6, 2023, 1:24 AM

#

jade hornet I think it's limited to training loras until someone figures it out

how are people on civitai making checkpoints though?

ashen field Aug 6, 2023, 1:27 AM

#

tall condor "8bit Adam is slower but saves memory and results in a higher CV" - what does CV...

Cross validation (score)

tall condor Aug 6, 2023, 1:27 AM

#

and what does that mean?

primal isle Aug 6, 2023, 1:29 AM

#

orchid dew how are people on civitai making checkpoints though?

thats what i have been wondering for over a week now 😂 cant find any infos about how to finetune sdxl checkpoint (not lora), but obviously there has to be way, as there are already some fine tuned checkpoints on civit. have to try harder gathering i guess

ashen field Aug 6, 2023, 1:30 AM

#

tall condor and what does that mean?

It's a way to validate model performance for kaggle competition, nothing we should be concerned about for stable diffusion

tall condor Aug 6, 2023, 1:30 AM

#

ok, then i guess my question is: shall i use adamw or adamw8bit? i can't find too much about the difference

ashen field Aug 6, 2023, 1:32 AM

#

tall condor ok, than i gues smy question is shall i use adamw oder admaw8 bit? i cant find t...

The point of adamw8bit is to save vram, if you are using a consumer GPU you should prefer the 8bit

tall condor Aug 6, 2023, 1:33 AM

#

i am wondering on what is the difference in the model at the end of the day

#

so far i allways used adamw

#

i am wondering whether i can expect the model to get worse with adamw8bit

ashen field Aug 6, 2023, 1:33 AM

#

tall condor i am wondering on what is the difference in the model at the end of the day

No hard rule to tell, they will have different optimal parameters which you can only find by experimentation

tall condor Aug 6, 2023, 1:34 AM

#

took me 2 months to find mine with adamw so better not touch that lol

ashen field Aug 6, 2023, 1:34 AM

#

Adam8bit will allow higher batch size which can train faster

tall condor Aug 6, 2023, 1:58 AM

#

does it make sense to train a 1.5 model with 768x768 max size? i read online that it can pick up more details this way

bronze oyster Aug 6, 2023, 3:23 AM

#

Hi,
How deep should captioning go when training a person/character lora?
I have seen conflicting information on this and was wondering what results people here have had.
At the moment I am not captioning very deep.
Eg:

Uniquetrigger, man wearing tank top, close-up, from side
uniquetrigger, man wearing jacket and pants

Should I be captioning even less?
Eg:

Uniquetrigger, man, from side, close-up

Or more

Uniquetrigger, man with a beard standing in room wearing blue tank top, from side, close-up
Uniquetrigger, man with a beard standing on a balcony wearing blue tank top, from side, close-up

or excessive

Uniquetrigger, man with a beard standing in room wearing blue tank top, from side, close-up, white wall, cabinet, brown eyes, photo
Uniquetrigger, man with a beard standing on a balcony wearing green jacket with black sleeves, from side, close-up, trees, ocean, islands, houses, photo

latent charm Aug 6, 2023, 5:12 AM

#

A tag or prompt describes your image. In training, your lora learns to associate the caption with its paired training image. With fewer tags, you would be able to produce a similar image from a shorter caption. Anything you don't mention in the caption gets learned into the tags you did mention.

#

Correct me if I am wrong

jade hornet Aug 6, 2023, 6:05 AM

#

The guideline I follow is to describe everything but the character. You can describe their clothing if you don't want the AI to associate the clothing in the training images with the character. I'd avoid things like hair, eye color, etc unless you want to be able to change those things in your inference

pliant drift Aug 6, 2023, 3:44 PM

#

tall condor doe sit make sense to train an 1.5 model with 768x768 max size? i read online th...

I've done lots of 1.5 training with higher resolution loras. it does help a lot. i think i even took it to 896 a couple of times.

tall condor Aug 6, 2023, 3:44 PM

#

which resolution did you use?

#

and what batch size?

pliant drift Aug 6, 2023, 3:48 PM

#

bronze oyster Hi, How deep should captioning go when training a person/character lora? I have ...

my usual captioning strategy. i collect a bunch of tags i want to consistently use. like "muscles, shades, looking cool" whatever. Those might not apply to all your images but you want them to be consistently tagged where they are used. so you form a consistent tag set. thats step 1.

step 2. captioning each image with a template pattern. "token class, tags, clothes, background" so it could be "jack man, muscles, shades, looking cool, lumberjack flannel jacket, heavy duty jeans, in the woods"

step 3. there is no step 3. ez pz.
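
The "token class, tags, clothes, background" template from step 2, sketched as a helper that joins the parts into one comma-separated caption. The function and parameter names are made up for illustration.

```python
def build_caption(token, class_word, tags=(), clothes=(), background=()):
    """Join the caption parts in a fixed order, skipping empty entries."""
    parts = [f"{token} {class_word}", *tags, *clothes, *background]
    return ", ".join(part.strip() for part in parts if part.strip())


caption = build_caption(
    "jack",
    "man",
    tags=["muscles", "shades", "looking cool"],
    clothes=["lumberjack flannel jacket", "heavy duty jeans"],
    background=["in the woods"],
)
print(caption)
# jack man, muscles, shades, looking cool, lumberjack flannel jacket, heavy duty jeans, in the woods
```

Keeping the tag list in one place like this makes it easier to apply the same wording to every image where a tag applies.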

#

end of the day, just go for it. captioning is just voodoo imo

#

whenever i try using clip and blip to caption, it's always some junk like "man holding a beer and sitting holding a beer with a beer in his hand holding a beer sitting at a beer table with beer"

#

Thing i don't like about those is that there are no commas, so they don't play into the caption shuffle or dropout mechanisms that kohya uses

tall condor Aug 6, 2023, 3:53 PM

#

what i do for captioning is, as @pliant drift mentioned, create my base captions for the concepts i want to train, like "standing, sitting, whatever", and tag all my images

#

then i use an automated tagger and add all the useful tokens i get from it

#

and then i prefix my own tokens

#

for me DeepDanbooru captioner worked quite well, however you need to get rid of the nsfw tags afterwards

#

@pliant drift when you say you trained with higher resolution do you mean to change the max resolution settings higher or just use high res input images?

tall condor Aug 6, 2023, 4:23 PM

#

flowwolf you should try DeepDanbooru captioner, you get a comma separated list of tags, and if you enhance that with your own, the results are much more flexible in my tests

pliant drift Aug 6, 2023, 4:36 PM

#

tall condor flowwolf you should try DeepDanbooru captioner, you get a comma seperated list o...

i've never considered that app because i don't do anime or anything like what danbooru hosts. LOL only reason i know about danbooru is because i was wondering what all this talk about booru tags in anything v3 were

would an app like that matter to me? like, i'm training generalized styles for non anime models. I feel like the anime guys just got way better tooling. Booru tagging culture is actually a huge boon to this new field

tall condor Aug 6, 2023, 4:39 PM

#

i dont do anime either

#

but i have tried like 10 different taggers and overall this one worked best for me for lora on ppl

#

i even did testing for landscape and it did a good job

#

you can test it here

#

http://dev.kanotype.net:8003/deepdanbooru/

#

what i found most important is that you still create your concepts and tag the primary tags you want by yourself

#

but once you have that sorted you can just use the tagger for the details

#

@flowwolf regarding the sizes may i ask for more details?

hollow spruce Aug 6, 2023, 5:25 PM

#

pliant drift I've done lots of 1.5 training with higher resolution loras. it does help a lot...

can confirm. 896 REALLY paid off. the last lycoris model I did for 1.5 basically upped the resolution you can generate to 896, and gave consistent results more than 80% of all generated images -which is usually my target line

#

the amount of images I manually tagged in the last two days.
only 25hours of manual reviewing/editing/tagging
roughly 20k images were reduced to that
100% watermark free

#

warm agate Aug 6, 2023, 5:35 PM

#

@hollow spruce i am training a lora and i forgot to add these args to one of the sample prompts: `--w 1024 --h 1024 --d 1 --l 7.5 --s 35 --n blurry,text,watermark` can I edit them in the prompts.txt file from the samples folder?
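
For reference, in kohya's sample prompt syntax `--w`/`--h` are width and height, `--d` the seed, `--l` the CFG scale, `--s` the step count and `--n` the negative prompt. The parser below is only an illustrative sketch of how such a line breaks down (it is not kohya's parser, and the prompt text is made up):

```python
import re

OPTION_NAMES = {
    "w": "width", "h": "height", "d": "seed",
    "l": "cfg_scale", "s": "steps", "n": "negative_prompt",
}
INT_OPTIONS = {"width", "height", "seed", "steps"}


def parse_sample_prompt(line):
    """Split 'prompt --w 1024 --n foo' into a dict of prompt and options."""
    # The leading space lets a line that starts with an option still match.
    parts = re.split(r"\s--([a-z]+)\s+", " " + line.strip())
    result = {"prompt": parts[0].strip()}
    for key, value in zip(parts[1::2], parts[2::2]):
        name = OPTION_NAMES.get(key, key)
        value = value.strip()
        if name in INT_OPTIONS:
            result[name] = int(value)
        elif name == "cfg_scale":
            result[name] = float(value)
        else:
            result[name] = value
    return result


line = "portrait of jack man --w 1024 --h 1024 --d 1 --l 7.5 --s 35 --n blurry,text,watermark"
print(parse_sample_prompt(line))
```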

hollow spruce Aug 6, 2023, 5:36 PM

#

warm agate <@211089689652887552> i am train a lora and i forgot to add these args to one of...

yep

warm agate Aug 6, 2023, 5:36 PM

#

hollow spruce yep

ok thank you

#

if I didnt have those, it would have run into an error

hollow spruce Aug 6, 2023, 5:37 PM

#

once the next image generates, it always checks the .txt file for what it has to generate

warm agate Aug 6, 2023, 5:37 PM

#

and would terminate the training

#

experienced it yesterday

hollow spruce Aug 6, 2023, 5:37 PM

#

so you can change it like every epoch if you want XD even change prompt entirely

warm agate Aug 6, 2023, 5:37 PM

#

hollow spruce so you can change it like every epoch if you want XD even change prompt entirely

oh ok

#

but my main prob was that its just 600 steps away from generating samples

hollow spruce Aug 6, 2023, 5:37 PM

#

I've done that before when I had an unlucky seed for what I was training

warm agate Aug 6, 2023, 5:37 PM

#

so i was kinda worried

#

as i have to sleep

#

and gonna leave my pc up for training

warm agate Aug 6, 2023, 5:38 PM

#

hollow spruce I've done that before when I had an unlucky seed for what I was training

i dont really trust sample images

hollow spruce Aug 6, 2023, 5:39 PM

#

warm agate i dont really trust sample images

same. which is why I've stopped them completely in sdxl. on 4090 I have enough vram left to run comfy on the side (normally at least)

#

but now I'll just wait it out

warm agate Aug 6, 2023, 5:39 PM

#

hollow spruce same. which is why I've stopped them completely in sdxl. on 4090 I have enough v...

oh cool

hollow spruce Aug 6, 2023, 5:40 PM

#

after waking up from sleep I'll just test everything. if you're not in a rush, you can just load up comfy anyway, let it overflow into ram, and sure it takes like 10 minutes to generate 1 image. but if you're not using the pc that's little time to wait

warm agate Aug 6, 2023, 5:40 PM

#

i cant run comfy simultaneously

#

as i only have 16gb vram

grizzled agate Aug 6, 2023, 5:41 PM

#

waow

warm agate Aug 6, 2023, 5:41 PM

#

@hollow spruce when using Kohya, does the training continue after waking the pc after sleep?

#

or will it terminate if we put the pc on sleep?

hollow spruce Aug 6, 2023, 5:41 PM

#

ah. may wanna try out the cpu only method. see if that works in the future. (not now though, when you just try if it works)

hollow spruce Aug 6, 2023, 5:41 PM

#

warm agate or will it terminate if we put the pc on sleep?

that's terrifying XD no idea

warm agate Aug 6, 2023, 5:41 PM

#

oh ok

hollow spruce Aug 6, 2023, 5:41 PM

#

I'd assume it crashes since it uses so many systems

#

not many apps are optimized for sleep

pliant drift Aug 6, 2023, 6:07 PM

#

@tall condor sorry i've been wrestling with configurations . when i say using 896 images, i mean i'm bucketing and setting the resolution in kohya to 896,896. the training sets i have i try to do at the highest quality, and i let kohya downscale it to appropriate buckets. Common crops help here a lot
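
A rough sketch of what bucketing at 896,896 means, assuming buckets are multiples of 64 whose area does not exceed 896 x 896 and that each image goes to the bucket with the closest aspect ratio. This is a simplification written for illustration, not kohya's implementation; the size limits are assumed defaults:

```python
import math


def make_buckets(max_reso=(896, 896), min_size=256, max_size=1536, step=64):
    """Enumerate (width, height) buckets with area <= max_reso area."""
    max_area = max_reso[0] * max_reso[1]
    side = math.isqrt(max_area) // step * step
    buckets = {(side, side)}
    width = min_size
    while width <= max_size:
        # Largest height (rounded down to a multiple of step) that fits the area.
        height = min(max_size, (max_area // width) // step * step)
        if height >= min_size:
            buckets.add((width, height))
            buckets.add((height, width))
        width += step
    return sorted(buckets)


def pick_bucket(image_size, buckets):
    """Choose the bucket whose aspect ratio is closest to the image's."""
    ratio = image_size[0] / image_size[1]
    return min(buckets, key=lambda b: abs(b[0] / b[1] - ratio))


buckets = make_buckets()
print(pick_bucket((3000, 3000), buckets))  # square source image
print(pick_bucket((2000, 3000), buckets))  # portrait source image
```

The source image is then downscaled and cropped to the chosen bucket, which is why starting from high quality originals helps.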

#

where appropriate i'll even double up the training data a bit, for people especially, i'll do high quality square crops of faces. If the closeup crop isn't decent quality at the training resolution, then i toss it. quality training data is paramount

#

high resolution alone isn't a magic bullet. there should be high quality imagery too and thats up to you to eyeball

tall condor Aug 6, 2023, 6:10 PM

#

and how many batches can you do with that? like 2?

pliant drift Aug 6, 2023, 6:13 PM

#

i have 16gb. i guess maybe 3-4. gradient checkpoints seem to be a magical way of getting higher batches at the cost of speed. I use them to huge success.

tall condor Aug 6, 2023, 6:13 PM

#

thanks for the input i will try that

pliant drift Aug 6, 2023, 6:16 PM

#

i tried deepdanbooru on a few typical photos of the sort i'd like to deal with. i guess the demo doesn't show the tag editing capabilities. it kind of sucks for photographic purposes and the only thing it does see accurately is the text. i hope the editing abilities are better

#

i use a janky script i milked out of chatgpt right now

tall condor Aug 6, 2023, 6:32 PM

#

i have something for that sec

#

its quite fancy

#

https://github.com/pipinstallyp/minigpt4-batch

GitHub

GitHub - pipinstallyp/minigpt4-batch: Use miniGPT-4 batch to genera...

Use miniGPT-4 batch to generate captions for a lot of images! You should be able to create the best captions you always wanted!

#

it works quite well but does not produce tags, its more like the clip

hollow spruce Aug 7, 2023, 4:10 AM

#

will post more info on this eventually,
but if you're training a single face, then use DIM 24/alpha 1
that is big enough that no detail should be lost at all
emphasis on face only.
for essentially all other loras, dim 8/alpha 1 is still the way to go.
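
Some arithmetic behind dim and alpha, assuming the usual LoRA convention where the low-rank update is multiplied by alpha / dim and each adapted layer gets a down matrix (dim x in) and an up matrix (out x dim). The 1280 x 1280 layer is a made-up size for illustration only:

```python
def lora_scale(dim, alpha):
    """Multiplier applied to the low-rank update: alpha / dim."""
    return alpha / dim


def lora_param_count(dim, in_features, out_features):
    """Trainable weights in one LoRA pair: down (dim x in) plus up (out x dim)."""
    return dim * (in_features + out_features)


# Hypothetical 1280 x 1280 linear layer, chosen only for illustration.
for dim in (8, 24):
    print(dim, lora_scale(dim, alpha=1), lora_param_count(dim, 1280, 1280))
```

With alpha fixed at 1, going from dim 8 to dim 24 triples the capacity per layer while the update multiplier drops from 1/8 to 1/24.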

young crater Aug 7, 2023, 4:59 AM

#

hollow spruce will post more info on this eventually, but if you're **training a single face, ...

For face training, is it still ideal to crop as close to the face as possible? (Or is the 24 dim also for like upper body + face)

sour eagle Aug 7, 2023, 5:57 AM

#

getting this error when trying to merge loras merge_lora.py", line 6, in <module>
import library.model_util as model_util
ModuleNotFoundError: No module named 'library

tall condor Aug 7, 2023, 8:18 AM

#

@pliant drift when i increase the max size to 768 my model turns into maximum crap

tall condor Aug 7, 2023, 8:54 AM

#

any suggestion on the settings?

orchid yoke Aug 7, 2023, 10:12 AM

#

hollow spruce will post more info on this eventually, but if you're **training a single face, ...

Woah, a quick test (because your workflow allows for such quick tests) - and.. it seems to have made a lot of difference. 🙂... Im particularly honing in on quick loras as i use them for reference so usually have about 10 face shots (which like, 100 odd repeats on each image for 1.5 and random crap tagging was fine). Looking forward to your detailed guide.. if you have a buymeacoffee or somesuch, let me know. Keenly following as you've been a massive help.

latent charm Aug 7, 2023, 11:50 AM

#

@hollow spruce Sorry to bother you, if we are not training the text encoder on sdxl, why would the trigger words work with the lora? As I understand it, the trigger was learned into the text encoder in previous loras. When we load the lora with a model, we use the trigger words to trigger the related trained features. In my sdxl lora testing, I am still doing the same thing without training the text encoder. I dont understand how the model+lora could know my trigger word and trigger the related features.

hollow spruce Aug 7, 2023, 11:56 AM

#

latent charm <@211089689652887552> Sorry to bother you, if we are not training the text encod...

the clip is like a translator. it takes your prompt, and translates it into ai language.
if you are training a specific car, and give it the caption "car", then clip will take the image, and do its best to translate it from a car image, into car data.
if you give it an image of a monkey, and call it "car", then the clip will still translate it into car data, and the unet layer will just stare at the clip in a very confused way and mumble: "well if you say so...". causing all future cars to very very slowly converge into monkeys.
but if you used the caption "monkey", then it would make a lot more sense, and the unet will learn it much faster, since its data that already makes sense when the unet looks at the converted word

#

if you actually train the clip

then in the first instance, it will look at your car, and make the ai word for car also sound a small bit more like your english word for "car". It makes the whole translation process smoother and easier.
In the second example of telling it that monkeys are cars... its gonna REALLY mess everything up about both cars and monkeys XD but after enough epochs, and after having seen enough monkeys that are all called "car", then essentially the ai will learn the new meaning for the word, and translation will be smoother.

so why don't we do clip training for everything?
because just telling it that its translation is wrong, often does more harm than good. You only give it a word, with very little context. meanwhile the original meaning the ai had before you, was incredibly complex and linked to many things.

#

for cars, this isn't much of an issue, as it will quickly learn it. but if you dare to teach it eyes, neck, hands, feet or any such words, which are extremely complex when viewed in context of the bigger picture, then unless you're willing to provide it with around 5k images, you'll only teach it wrong things.

#

clip training is great. half assed clip training is bad.

#

here'

#

here's an example. I'm currently retraining "girl" since the clip understanding of that word is pretty wrong.

#

vertical is clip training, horizontal is unet training

#

basically, once my training is done, I'll be able to specify age groups, without bias from the words "girl" or "woman", since I'm completely retraining the meaning of those words in both clip & unet

#

but it comes at the cost of 4200 manually tagged images 🤣

#

still not nearly enough, as I'll need about 3000 images per ethnicity

latent charm Aug 7, 2023, 12:11 PM

#

I think I get the answer after seeing your explanation. For example, my trigger word is multiple characters, xeanlwan, and the text encoder is frozen. The trigger word would be split into something like xe, an, nl, w, and the unet got trained on those tokens as learned from the training images. When I load the lora, the text encoder splits my trigger word into the same combination, and because of that the unet can load the trained features.

#

thanks for sharing

hollow spruce Aug 7, 2023, 12:22 PM

#

its usually best to change your trigger word to one that matches your content.
if you can't change it (training many many concepts, and you don't want to write a guide for how to use your lora)
then you have no option but to train the text encoder, to make all your trigger words work

latent charm Aug 7, 2023, 12:27 PM

#

Does it mean that if I change my trigger word to the closest word in the vocab, the training would be much faster?

#

For example, when I train an anime character called jianguo, rather than using his name as the trigger word, I should use male anime character to trigger the lora.

hollow spruce Aug 7, 2023, 1:06 PM

#

latent charm Does it means if I change my trigger word to the closest word in the vocab would...

yes. absolutely

#

it's literally like a multiplier

#

if you're training a face for example, you should find out which celebrity looks closest to you, then use their name to train your own face XD

latent charm Aug 7, 2023, 1:07 PM

#

but does it mean I need to input his name to trigger the lora?

#

in the celebrity example

#

Thank you for sharing this useful information

supple lynx Aug 7, 2023, 1:23 PM

#

did anyone try textual inversion with xl model? im getting mixed results

hollow spruce Aug 7, 2023, 1:24 PM

#

latent charm but does it means I need to input his name to trigger the lora?

if you're doing captions, then you want the first word in your caption file to be that name:
lin sheng, a man wearing swimming trunks holding up a gold medal, olympic stand, pool
and then have your settings like shown in the screenshot

if you have no captions, then you'd just have your folder name like: 1_lin sheng

latent charm Aug 7, 2023, 1:27 PM

#

hollow spruce if you're doing captions, then you want the **first word** in your caption file ...

I usually do that, but if the text encoder is not trained, how would lin sheng represent the person I want to train? This is my question

hollow spruce Aug 7, 2023, 1:28 PM

#

latent charm I usually do that but if the text encoder is not trained. How would lin sheng re...

ah, this was referring to if you know what celebrity looks like you

#

and I was hinting at gold medalist lin sheng

latent charm Aug 7, 2023, 1:29 PM

#

Oh, I didnt realize the name is a celebrity. Lol

hollow spruce Aug 7, 2023, 1:29 PM

#

you dont actually use your own name. you use the name of a famous person that looks like you! XD

latent charm Aug 7, 2023, 1:29 PM

#

Now I get that

pliant drift Aug 7, 2023, 3:52 PM

#

tall condor any suggestion on the settings?

adaptive optimizers. dadapt. adafactor. prodigy. use the learn rate of 1 and let it take the wheel.

tall condor Aug 7, 2023, 4:12 PM

#

which one worked best for you?

slim thistle Aug 7, 2023, 4:53 PM

#

anyone know if you can get the entire codeformer algorithm into stable diffusion? THere is an extension but it seems it doesnt come with uber face upscale and enhancement

covert pagoda Aug 7, 2023, 7:06 PM

#

pliant drift adaptive optimziers. dadapt. adafactor. prodigy. use the learn rate of 1 and l...

So LR 1, but what about Unet and TE LR's? Make LR=1 too for these? What if we want to turn down the TE? Does it affect the TE LR to lower it at all, or is adaptive completely automated on the unet/TE as well?

sacred grail Aug 7, 2023, 7:08 PM

#

supple lynx did anyone tried textual inversion with xl model? im getting mixed results

How are you training it?

pliant drift Aug 7, 2023, 7:09 PM

#

Lora's have an essential requirement according to the kohya documentation. --network_train_unet_only (i think that's correct) so keep that in mind.

now those other learn rates are optional. if they're both 0, then they both use the learn rate. But since you're only training the unet, setting the TE here doesn't matter anyways. Understanding what the settings are helps a lot. They're optional settings if you don't want the LR for both the TE and Unet to be the same.

As usual i'm always coming back to this. RTFM. I could've also mentioned --network_train_unet_only at the beginning too, but i thought it was implied and known already. I tend to trust people have at least read the manpages when they're looking for help.
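
The fallback described above could be sketched as follows. The argument names mirror kohya's `--learning_rate`, `--unet_lr`, `--text_encoder_lr` and `--network_train_unet_only` flags, but the function itself is hypothetical and the "unset or 0 means use the main rate" behaviour is taken from the message above rather than checked against the code:

```python
def effective_learning_rates(learning_rate, unet_lr=None, text_encoder_lr=None,
                             train_unet_only=False):
    """Resolve which learning rate each part of the network would train with."""
    def pick(override):
        # Treat None or 0 as "not set" and fall back to the main learning rate.
        return override if override else learning_rate

    rates = {"unet": pick(unet_lr)}
    if not train_unet_only:
        rates["text_encoder"] = pick(text_encoder_lr)
    return rates


# Adaptive optimizer setup from the discussion: LR 1, unet only.
print(effective_learning_rates(1.0, unet_lr=0, text_encoder_lr=0, train_unet_only=True))
# Separate, lower text encoder rate when both parts are trained.
print(effective_learning_rates(1e-4, text_encoder_lr=5e-5))
```

When only the unet is trained, whatever is entered for the text encoder rate never gets used.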

latent charm Aug 7, 2023, 7:19 PM

#

@hollow spruce How would you find the closest celebrity for lora training? I have tried one but I think the celebrity I had chosen wasn't the closest one and it did affect the likeness of the lora.

supple lynx Aug 7, 2023, 7:22 PM

#

sacred grail How are you training it?

almost like standard script
i changed to tune both text encoders embeddings and smaller size images (will try later on 512x512)

sacred grail Aug 7, 2023, 7:23 PM

#

supple lynx almost like standard script i changed to tune both text encoders embeddings and ...

Which script

supple lynx Aug 7, 2023, 7:25 PM

#

originally from diffusers, with lots of edits from myself

supple lynx Aug 7, 2023, 7:42 PM

#

🤷‍♀️ i can share if you interested

covert pagoda Aug 7, 2023, 7:48 PM

#

pliant drift Lora's have an essential requirement according to the kohya documentation. `--n...

Thanks dude. Yea, basic questions

hollow spruce Aug 7, 2023, 8:07 PM

#

latent charm <@211089689652887552> How would you find the closest celebrity for lora training...

• quick google
• use Interrogator + Vit-H on an image of yourself. see what words it says look like you. google a few of them, then be amazed by the fact that there is a famous doppelganger of yourself walking somewhere in the world! XD
• use a word like "model photoshoot" <- much less ideal, but avoids the research on yourself XD

latent charm Aug 7, 2023, 8:08 PM

#

hollow spruce • quick google • use Interrogator + Vit-H on an image of yourself. see what word...

Thanks. Where can I use Vit-H standalone?

hollow spruce Aug 7, 2023, 8:10 PM

#

latent charm Thanks. Where can I use Vit-H standalone?

A1111 webui -> addon Interrogator
you can also load it standalone via the github page, but that comes without the ability to batch interrogate an entire folder all at once

latent charm Aug 7, 2023, 8:11 PM

#

hollow spruce A1111 webui -> addon Interrogator you can also load it standalone via the github...

Thanks. I just uninstalled the webui. I will try the github.

sacred grail Aug 7, 2023, 11:18 PM

#

I have a question, is Micro-Conditioning from the paper also related to training? or is it only related to generating?
If I select don't upscale buckets in training a Lora (train on the original resolution of the images) and then do Micro-Conditioning in generating in comfy does it know that it should use the quality of the images above 1024 px but still use the knowledge from the images under 1024px?

stone garden Aug 8, 2023, 6:28 AM

#

pliant drift Lora's have an essential requirement according to the kohya documentation. `--n...

Do you know if it was ever working or if it broke? Do you know the commit that broke it?

pliant drift Aug 8, 2023, 7:32 AM

#

stone garden Do you know if it was ever working or if it broke? Do you know the commit that b...

it wasn't ever working as intended. i think it works it's just a little bit haywire until the devs sort things out and optimize things. sdxl has two text encoders so training them isn't as straight forward

hollow spruce Aug 8, 2023, 8:57 AM

#

stone garden Do you know if it was ever working or if it broke? Do you know the commit that b...

to be exact, text encoder support was added just before 1.0 release.
but it's not about the training being supported - it's about it being nothing like training 1.5. which is the first thing people do - just pretend that sdxl is 1.6, which it is not

I talked a bit about training clip over here: #🔧｜finetune message

latent charm Aug 8, 2023, 9:15 AM

#

hollow spruce to be exact, text encoder support was added just before 1.0 release. but it's no...

I have tried to train with a celebrity name but I didn't get enough likeness to replicate the original image using the training prompt. Does it mean it is undertrained? I'm usually able to replicate the image around 80~90% using the training prompt with unique token training.

hollow spruce Aug 8, 2023, 9:17 AM

#

latent charm I have tried to train with celebrity but I didn't get the likeness to replicate ...

probably. more closeup photos usually help speed this up by a lot

#

the more uncropped they are, the longer it takes, and the more images you need

latent charm Aug 8, 2023, 9:19 AM

#

My dataset is 15 images, most are upper body.

#

I am trying 400+ images now with celebrity training.

rough crypt Aug 8, 2023, 9:56 AM

#

I now see many fine-tuned anime xl models on Civitai, but all of them perform badly, even worse than 1.5. I think training sdxl is not just about training the base, the refiner model is even more important. That's why there is still no good anime xl model.
SDXL is good 50% because of the strong clip and 50% because of the refiner model. Without the refiner model, the result of sdxl is not surprising.
It seems that most people still don't realize how important the refiner model is.

#

If there is a good anime sdxl model, it must have a related anime refiner model.

sacred grail Aug 8, 2023, 12:26 PM

#

rough crypt I see now many fine-tune anime xl mode in civital , but all of them had a bad pe...

maybe try extracting a lora from the finetuned base model and plug it into the refiner? not sure if this will work but it might?

#

or the other way around maybe

latent charm Aug 8, 2023, 12:56 PM

#

rough crypt I see now many fine-tune anime xl mode in civital , but all of them had a bad pe...

refiner is not that much.

#

I do agree a good fine tune should have its paired refiner model, especially an anime model, which is a totally different thing from the current model.

ashen field Aug 8, 2023, 1:10 PM

#

SAI should provide insights on how to train properly or at least how they did it, it's kind of weird they did beta testing with finetuners yet no best practice being shared.

#

For example to this date we don't know if 1.0 uses offset noise or zero terminal snr.

latent charm Aug 8, 2023, 1:12 PM

#

ashen field SAI should provide insights on how to train properly or at least how they did it...

They said they trained with huge dataset which is not compatible with consumer level equipment.

#

And the training method is also different

ashen field Aug 8, 2023, 1:13 PM

#

latent charm They said they trained with huge dataset which is not compatible with consumer l...

That's not the point, things like if/how much offset noise, or whether fundamental things like zero terminal snr should be used, are invariant to the batch size

sacred grail Aug 8, 2023, 1:13 PM

#

ashen field For example to this date we don't know if 1.0 uses offset noise or zero terminal...

offset noise for SDXL is 0.0357
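
For context on what that number controls: offset noise, as commonly implemented, adds one random shift per sample and channel on top of the usual per-pixel noise. Below is a dependency-free sketch of the idea for a single sample, using the 0.0357 value quoted above as the default. Real trainers do this on tensors, and the function name is hypothetical:

```python
import random


def offset_noise(channels, height, width, noise_offset=0.0357, rng=None):
    """Gaussian noise plus one constant random shift per channel."""
    rng = rng or random.Random()
    noise = []
    for _ in range(channels):
        # One shift for the whole channel, scaled by noise_offset.
        shift = noise_offset * rng.gauss(0.0, 1.0)
        noise.append([[rng.gauss(0.0, 1.0) + shift for _ in range(width)]
                      for _ in range(height)])
    return noise


sample = offset_noise(channels=4, height=2, width=2, rng=random.Random(0))
print(len(sample), len(sample[0]), len(sample[0][0]))
```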

ashen field Aug 8, 2023, 1:13 PM

#

sacred grail offset noise for SDXL is 0.0357

Maybe true for 0.9, probably not for 1.0, the two models look a lot different

latent charm Aug 8, 2023, 1:14 PM

#

#✨｜sdxl message

ashen field Aug 8, 2023, 1:14 PM

#

To my best knowledge 1.0 doesn't use offset noise, but we shouldn't be the ones to make guesses when such things can be easily explained by official SAI

ashen field Aug 8, 2023, 1:16 PM

#

latent charm https://discord.com/channels/1002292111942635562/1089974139927920741/11381728618...

You prove my point, the community shouldn't be digging for pieces here and there when the team can easily put together a coherent piece explaining things

latent charm Aug 8, 2023, 1:18 PM

#

I do agree SDXL is kind of lacking good documentation. Many things need to be found by experiment. It would be nicer if the dev team could provide more information.

ashen field Aug 8, 2023, 1:19 PM

#

Release shouldn't be "let's throw it out there and let people solve puzzles", documentation is important

#

Especially given the fact that there was a beta test

latent charm Aug 8, 2023, 1:22 PM

#

I could imagine 0.9 is rougher than 1.0 and they couldn't postpone 1.0 again

tall condor Aug 8, 2023, 2:39 PM

#

prodigy and adafactor for me use a way too high learning rate. is there any way to control that?

latent charm Aug 8, 2023, 2:52 PM

#

both of them should be auto adjusted in the training

ashen field Aug 8, 2023, 3:06 PM

#

Lora does tend to use high LR rate, nothing inherently bad with that

stone garden Aug 8, 2023, 3:11 PM

#

Could someone edit one caption for me so it’s in the proper format (I could work on editing the rest myself):

The image shows a woman with long red hair wearing a black top and looking up at the sky with a pensive expression on her face. The background is a cityscape with skyscrapers and other buildings visible in the distance. The image is well lit, with the sun shining down on the woman's face and casting shadows on her body. The overall mood of the image is contemplative and introspective.

pliant drift Aug 8, 2023, 3:42 PM

#

ashen field Release shouldn't be "let's throw it out there and let people solve puzzles", do...

what do you want documented?

#

it's a machine learned neural network. releasing it with documentation would mean 3 years of research into figuring out how to document it

#

these tend to be black box systems. even by the people who make them. it's sort of a key aspect of the entire field of Machine Learning.

ashen field Aug 8, 2023, 3:44 PM

#

pliant drift what do you want documented?

Settings and processes that they used to train it. Best practices during their training and beta testing. None require 3 years of research

pliant drift Aug 8, 2023, 3:44 PM

#

also, due to the nature of open sourced fields, most of the components are all made by different people. the memory optimizers for instance. want to learn about dadaptation? the documentation is in the project. https://github.com/facebookresearch/dadaptation

pliant drift Aug 8, 2023, 3:45 PM

#

ashen field Setting and processes that they used to trained it. Best practices during their ...

how are they to determine best practices?

ashen field Aug 8, 2023, 3:46 PM

#

pliant drift how are they to determine best practices?

You think they didn’t do any research during training and just hit run and released one model they got? There’s a lot of trials and errors during fitting, a lot of learning during beta testing, not much of those shared yet

pliant drift Aug 8, 2023, 3:46 PM

#

these complaints got strong armchair expert energy to them. you're coming in here demanding help and blaming the world why you haven't found it yet. lack of documentation on cutting edge software is kind of what's expected and always has been. There is so much information for you to search out and dig your teeth into, but you're instead coming here and wasting energy blaming stability for not releasing documentation about a black box system

#

have you read the paper they released for sdxl on arxiv? they published

#

complaining about documentation lacking when you've not even studied the paper or put any notes on it

#

https://arxiv.org/abs/2307.01952

arXiv.org

SDXL: Improving Latent Diffusion Models for High-Resolution Image S...

We present SDXL, a latent diffusion model for text-to-image synthesis.
Compared to previous versions of Stable Diffusion, SDXL leverages a three times
larger UNet backbone: The increase of model parameters is mainly due to more
attention blocks and a larger cross-attention context as SDXL uses a second
text encoder. We design multiple novel cond...

ashen field Aug 8, 2023, 3:48 PM

#

pliant drift have you read the paper they released for sdxl on arxis? they published

Yes I did, but you can’t say docs are sufficient as of now

pliant drift Aug 8, 2023, 3:48 PM

#

ashen field Yes I did, but you can’t say docs are sufficient as of now

then why are you complaining about not knowing training settings?

ashen field Aug 8, 2023, 3:48 PM

#

pliant drift then why are you complaining about not knowing training settings?

Because a lot of those were valid for 0.9 not 1.0

#

Such as offset noise

pliant drift Aug 8, 2023, 3:49 PM

#

I can say that because i've managed to find my way around just fine. i've not relied on any youtube tutorials at all since after the 2.1 release and i figured out that none of them really knew what they were talking about

pliant drift Aug 8, 2023, 3:49 PM

#

ashen field Because a lot of those were valid for 0.9 not 1.0

the one i linked is for 1.0. hmmmmmmm

#

1.0 is a progressed version of 0.9 too so all the information in the former version of the paper would matter

ashen field Aug 8, 2023, 3:50 PM

#

pliant drift I can say that because i've managed to find my way around just fine. i've not r...

Then good for you, you are talented. Why get so worked up about a user wishing such info are more organized and centralized?

#

It’s not like they don’t have the info, but not everyone is as talented as you or as patient to dig stuff out

pliant drift Aug 8, 2023, 3:50 PM

#

ashen field Then good for you, you are talented. Why get so worked up about a user wishing s...

i'm not very talented. i just don't have an aversion to combing through open source information

#

have you even read kohya's training manual? all of that information still applies too

ashen field Aug 8, 2023, 3:52 PM

#

pliant drift have you even read kohya's training manual? all of that informaiton still appli...

I use kohya and I train fine, what’s wrong with demanding such things are more accessible to everyone? I don’t get why you are so angry about this

pliant drift Aug 8, 2023, 3:52 PM

#

because you're making unreasonable demands when we break it down. like i have.

#

believe me when i say i'm not angry about this.

ashen field Aug 8, 2023, 3:53 PM

#

pliant drift believe me when i say i'm not angry about this.

I believe you.

pliant drift Aug 8, 2023, 3:55 PM

#

"the community shouldn't be digging for information" feels really lazy and more about "i don't want to do legwork" since nobdoy researching any field will find all the information in one place. Research and learning will always require leg work. I don't understand people's aversion to research. learning is awesome. Spoon fed learning is just doing what you're told. Going out and discovering exactly the knowledge you need is what being human is all about.

ashen field Aug 8, 2023, 3:58 PM

#

pliant drift "the community shouldn't be digging for information" feels really lazy and more ...

Yes I’m advocating spoonfeeding because for good documentation you should always assume users are dumb and lazy. I’m advocating for those majority of users out there, nothing wrong about that.

pliant drift Aug 8, 2023, 3:59 PM

#

users won't read the manual. always assume that. the lazy people will always want someone else to do it for them. that'll never end regardless of all of stability's efforts. we are in the Eternal September

#

#oldmemes

#

example. Linux is arguably more documented than windows. windows costs money while linux is free. people still use windows because they don't want to read manuals, ever

#

i really don't buy the altruism schtick if i wasn't being clear

tall condor Aug 8, 2023, 4:23 PM

#

i understand that lora is using high LR but the models are so dominant

ashen field Aug 8, 2023, 4:23 PM

#

tall condor i understand that lora is using high LR but the models are so dominant

What do you mean by dominant?

tall condor Aug 8, 2023, 6:00 PM

#

i mean that as soon as i add it, it becomes very dominant in the results, i need to turn it down to 0.4 or less to generate regular concepts - but if i turn it down so far it barely generates the face right

#

if this was a regular model i would say its very overfit

#

i have tried about 10 different settings with the lora but apparently its either too strong or too weak - it appears to be much harder to get it right than creating actual models

#

also after around 50 epochs the loss moves to <0.02 and so on

#

i think its just very fast overfitting

#

how many epochs do you guys run? usually

latent charm Aug 8, 2023, 6:13 PM

#

You could just save the lora more frequently and test which one is the best

tall condor Aug 8, 2023, 6:56 PM

#

yes obviously i do that but i find it quite hard to find proper settings

latent charm Aug 8, 2023, 7:00 PM

#

If your loss gets to <0.02 by 50 epochs, you might test the 10, 20, 30, 40 and 50 epoch checkpoints and find out which one is the first that's overcooked. Once you find it, move back to the previous one and test that.

#

repeat the process you might find the perfect one
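The checkpoint search described above could be sketched roughly like this. It is only an illustration: `is_overcooked` is a hypothetical stand-in for whatever manual test you run on a saved checkpoint (for example, generating a few images and judging them by eye), not something the trainer provides.

```python
from typing import Callable, Optional, Sequence


def last_good_checkpoint(
    epochs: Sequence[int],
    is_overcooked: Callable[[int], bool],
) -> Optional[int]:
    """Walk the saved epochs in order and return the last one before overcooking.

    Returns None if even the earliest checkpoint is already overcooked.
    """
    previous = None
    for epoch in sorted(epochs):
        if is_overcooked(epoch):
            return previous
        previous = epoch
    return previous  # nothing was overcooked, so the latest one wins


# Hypothetical judgement: pretend everything from epoch 40 on looks overfit.
print(last_good_checkpoint([10, 20, 30, 40, 50], lambda e: e >= 40))  # 30
```

Once you know the bracket (here 30 to 40), you can rerun with a smaller save interval inside it and repeat.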

tall condor Aug 8, 2023, 7:01 PM

#

but even at 50 epochs some concepts tend to underperform so i can't just reduce epochs

#

what learning rate are you guys using, how many epochs, and how many repeats in each epoch?

latent charm Aug 8, 2023, 7:04 PM

#

if your loss is less than 0.02, it should mean you can reproduce the training image from the training caption at a very high level of similarity, let's say 98%.

tall condor Aug 8, 2023, 7:05 PM

#

yes but it usually also means that it an not create anything else

#

*can not

latent charm Aug 8, 2023, 7:05 PM

#

yes, how come some concepts underperformed?

tall condor Aug 8, 2023, 7:06 PM

#

well some concepts are just harder to learn and take more epochs

#

i try to balance that out with higher repeat count but you can also only push that so far because otherwise it will break the model

latent charm Aug 8, 2023, 7:09 PM

#

Do the concepts come out correctly in the 50 epoch result? If yes, you might try the 40 epoch one. If 40 is great, you could use 40 epochs next time, or rerun the same dataset with 40 epochs and see whether it's ok.

tall condor Aug 8, 2023, 7:12 PM

#

with adafactor the resulting model produces almost only the model already at 50 epochs

#

i need to do some research on the behavior i guess

#

prodigy is even worse

latent charm Aug 8, 2023, 7:14 PM

#

And you don't need to run the full training if you could see the result is enough

tall condor Aug 8, 2023, 8:01 PM

#

what dim/alpha ratio are you guys using?

small eagle Aug 8, 2023, 8:19 PM

#

possible for a mod to pin this?

sacred grail Aug 8, 2023, 8:42 PM

#

ashen field Maybe true for 0.9, probably not for 1.0, the two models look a lot different

When training and I dont put offset noise on that number I get a warning that it was trained on that number so I guess I was

tall condor Aug 9, 2023, 12:50 AM

#

so for me it is very unexpected that a rank 8 lora has almost the same effect as a rank 256 - why is that

restive bridge Aug 9, 2023, 1:10 AM

#

I need clarification on something. Online guides are very conflicted on this.

As I understand it: If my 20 training images are in the training img folder, "20_ronald", thats 400 after repeats. now if my reg folder, "1_man", also has 20 images in it, then there's one reg image for every training image.

BUT if I put way more images into the reg folder and keep "1_man" title, training still goes for the same amount of steps. So theoretically if I put 400 images into reg rather than 20, there would be one unique reg image per training image repeat, rather than 20 reg images repeated 20 times.

Correct? Why does this not lengthen training time? Why are people being told to use the same # of images in both? Is there even a benefit to putting more images in "1_reg" than in "20_img"?
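For reference, the folder-name arithmetic in the question above can be written out as a small sketch. The helper names are made up for illustration; the only assumption taken from the thread is that the number before the underscore is the repeat count.

```python
import re


def parse_folder(name: str) -> tuple[int, str]:
    """Split a '<repeats>_<token>' folder name into (repeats, token)."""
    match = re.fullmatch(r"(\d+)_(.+)", name)
    if match is None:
        raise ValueError(f"expected '<repeats>_<name>', got {name!r}")
    return int(match.group(1)), match.group(2)


def samples_per_epoch(folder_name: str, image_count: int) -> int:
    """Images seen per epoch from one folder: repeats x image count."""
    repeats, _ = parse_folder(folder_name)
    return repeats * image_count


print(samples_per_epoch("20_ronald", 20))  # 400
print(samples_per_epoch("1_man", 20))      # 20
```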

jade hornet Aug 9, 2023, 2:40 AM

#

restive bridge I need clarification on something. Online guides are very conflicted on this. ...

My best guess is that it just finishes the epoch as if you had no reg images... Having 1 for every batch ensures a consistent benchmark for every image. If you put more, my guess is they get skipped

#

Unless you get the guy that wrote kohya_ss to comment, or someone that actually understands the code, it's just guess work

restive bridge Aug 9, 2023, 2:43 AM

#

@safe pecan 🤔

jade hornet Aug 9, 2023, 2:47 AM

#

The way I try to think of it, every comparison of your training image to the class results in a score, that gives you the loss. But the class itself changes as you fine tune the model. The reg images would serve as a stable unchanging representation

#

That's why people recommend 1 for 1

rough crypt Aug 9, 2023, 3:21 AM

#

anyone know how to train a sdxl refiner model using custom data?

hollow spruce Aug 9, 2023, 4:03 AM

#

restive bridge I need clarification on something. Online guides are very conflicted on this. ...

https://github.com/kohya-ss/sd-scripts/issues/640
explains the nuance of it a bit more

hollow spruce Aug 9, 2023, 4:04 AM

#

rough crypt anyone know how to train a sdxl refiner model using custom data?

full finetuning - which isn't really a 1 man job anymore, nor are there any examples of other people who did to base your work off.
other than that we currently have no way to train the refiner.

restive bridge Aug 9, 2023, 4:06 AM

#

hollow spruce https://github.com/kohya-ss/sd-scripts/issues/640 explains the nuance of it a bi...

yes i spoke with him (SEcourses) about it a lot today in sdxl but he still couldnt confirm. He's been using hundreds of reg and didn't know people have been using matching numbers. i just don't know if that's actually working for him or if 380 of the reg images get ignored

hollow spruce Aug 9, 2023, 4:10 AM

#

restive bridge I need clarification on something. Online guides are very conflicted on this. ...

in short - your assumption is completely true.
the reason it doesn't take longer, despite all the uniques, is that only latent caching takes a bit longer. but if there are only 20 images, then they get auto-repeated to match the total steps of your training folder

#

essentially, the moment you declare a regularization folder, you multiply your steps by a total of 2. Images from the regularization folder are taken until they match this number, or multiplied until they match it
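A rough sketch of the behaviour as described in this message (not checked against the sd-scripts source, so treat it as an assumption): the reg folder contributes as many samples per epoch as the training folder does, either by taking that many unique images or by repeating the ones it has.

```python
import math


def reg_usage(train_images: int, train_repeats: int, reg_images: int) -> dict:
    """Model one epoch: reg samples are matched 1:1 to training samples."""
    train_samples = train_images * train_repeats
    return {
        "train_samples": train_samples,
        "reg_samples": train_samples,         # matched to the training side
        "total_samples": 2 * train_samples,   # declaring a reg folder doubles it
        "unique_reg_used": min(reg_images, train_samples),
        "max_uses_per_reg_image": math.ceil(train_samples / reg_images),
        "reg_unused_this_epoch": max(0, reg_images - train_samples),
    }


# 20 training images x 20 repeats, with 20, 400 and 1000 reg images.
for reg_count in (20, 400, 1000):
    print(reg_count, reg_usage(20, 20, reg_count))
```

With 20 reg images each one is reused 20 times per epoch; with 400 each is used once; with 1000, 600 are left out of any single epoch under this model.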

restive bridge Aug 9, 2023, 4:15 AM

#

hollow spruce essentially, the moment you declare a regularization folder, you multiply your s...

Thank you for the clarity, been looking for that answer all day. So there's no reason that people should be matching their image counts if theyre using repeats, because they'll get better preservation with less repeated reg images (more images in reg folder). In that case I'll just use a permanent 1000 img reg folder to ensure all uniques.

hollow spruce Aug 9, 2023, 4:19 AM

#

restive bridge Thank you for the clarity, been looking for that answer all day. So there's no r...

yep. it's why in the past people often had folders with 10k images. Because while having more often doesn't make a difference, if you're in a situation where you don't have enough, then thats actually a bad thing.

#

I've currently sidestepped this by using no regularization folder at all, and instead including it as an additional training folder. While I can confirm that this works for big and more complex loras, trained on datasets of over 1k images (for both the training images AND reg images), I haven't tested it with smaller datasets yet

restive bridge Aug 9, 2023, 4:22 AM

#

hollow spruce I've currently sidestepped this by using no regularization folder at all, and in...

what token should class images use if training a face? many in sdxl channel said they use the same instance token on both training and reg.. i cant see how that would be a good idea

hollow spruce Aug 9, 2023, 4:23 AM

#

restive bridge what token should class images use if training a face? many in sdxl channel said...

same token defeats the purpose XD

#

I don't train faces enough, to have tested this properly. But I can tell you the 2 ideal theoretical ways.

Training token: <celebrity name>
Regularization: man (or woman)

option 2:
<celebrity name> man
man

option 2 follows the original dreambooth intention a bit more. can't confirm if it is better or not though

latent charm Aug 9, 2023, 4:44 AM

#

@hollow spruce I tried the celebrity method and I think it gives less likeness than the unique token method. Some of the celebrity's features remain in the result, compared to the same training with a unique token.

#

I also tried to use your reg files. It kinda prevents my lora from learning the features of the training images, compared with no reg.

hollow spruce Aug 9, 2023, 4:48 AM

#

oh no D:
thanks for letting me know. I'll mess around with it tonight, and see what changes work best

restive bridge Aug 9, 2023, 4:51 AM

#

latent charm @hollow spruce I tried the celebrity method and I think it is less likene...

In my recent experience, both of these things you just mentioned required me to extend my epochs but boosted quality and likeness in the end beyond what 'no name' or 'no reg' were capable of. Felt like lora vs. dreambooth.

#

but tbf my LR is .0003 so less relevant to caiths params

hollow spruce Aug 9, 2023, 4:54 AM

#

restive bridge In my recent experience, both of these things you just mentioned required me to ...

which makes sense
learning rate might be adjustable, so I'll try it from 5e-3 through 1e-4, and record how many epochs (steps) it took to learn all the features, as well as the quality thereof.

latent charm Aug 9, 2023, 4:54 AM

#

I was training for 400 epochs and the result was still the same as at 20-40 epochs with celebrity+reg

hollow spruce Aug 9, 2023, 4:55 AM

#

latent charm I was training on 400 epoch and the result still the same as 20-40 epoch with ce...

my regularization data is still biased towards standard caucasian, so maybe the reg data is having an adverse effect to that

latent charm Aug 9, 2023, 4:56 AM

#

Yeah, my training data is Asian, which seems to conflict with your reg set.

restive bridge Aug 9, 2023, 4:56 AM

#

hollow spruce which makes sense learning rate might be adjustable, so I'll try it from 5e-3 th...

I'd love to see the results of that. you should post findings here afterwards🙏

latent charm Aug 9, 2023, 4:57 AM

#

also some unusual features

hollow spruce Aug 9, 2023, 4:57 AM

#

latent charm Yeah, my training data is Asian which seems conflicted with your reg set.

Yeah. I realized that 'asian' isn't even good enough. I'll have to split that tag into chinese/vietnamese/filipino and so on

#

I finally got all the images though, so now its just a matter of filtering and tagging ^^

covert pagoda Aug 9, 2023, 10:48 AM

#

restive bridge yes i spoke with him (SEcourses) about it a lot today in sdxl but he still could...

you do realise reg images get shuffled every epoch. In that Kohya issue about repeats, he asks Bmaltais if each epoch remembers what class images have been used in the previous epochs, so as to run the whole diversity of images. The problem is there is no way to spread all class images across epochs. It’s an open issue unfortunately

restive bridge Aug 9, 2023, 11:01 AM

#

covert pagoda you do realise reg images get shuffled every epoch. In that Kohya issue about re...

yes you can't spread across epochs but you can spread across repeats. if a 10 img training folder is getting repeated 20 times per epoch then there's 200 steps, which could be paired with 200 unique reg imgs rather than 10 of them repeated 20 times.

covert pagoda Aug 9, 2023, 12:03 PM

#

restive bridge yes you can't spread across epochs but you can spread across repeats. if a 10 im...

Thanks for simplifying and clarifying. So, if we can only randomize within an epoch's repeats, it stands to reason to only use as many class images as there are in an epoch, in the case above, 200. Furkan from SECourses suggested to Bernard to implement a way to link class image randomization across epochs, but he said he didn't have the time to do it unfortunately. I think the dreambooth extension solves this by using global repeats for randomization of class images

versed crescent Aug 9, 2023, 12:49 PM

#

Can I double check something with folks here? I've read a reddit post on SDXL lora training, and in the example they have 14 training images repeated 7 times, and 200 regularisation images, with a training step count of 3000. In their post they claim it takes 30 minutes on a 4090.

Now I'm trying a broadly similar setup using the same params (as far as I can tell) and it's estimated to be about 20 hours on my 4080. Is there something I've likely got wildly wrong? I can't imagine estimated training time is non-linear. There aren't so many posts on using sdxl_train_network.py so I'm having trouble cross-checking this with other sources.
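To put numbers on that gap, here is a quick back-of-the-envelope sketch. It assumes the estimate really is linear in steps and that both runs are 3000 steps, which is only what the post and the "broadly similar setup" imply.

```python
def seconds_per_step(total_minutes: float, steps: int) -> float:
    """Average time per optimizer step for a run of known length."""
    return total_minutes * 60 / steps


def estimated_hours(steps: int, sec_per_step: float) -> float:
    """Linear estimate: total time is just steps x seconds per step."""
    return steps * sec_per_step / 3600


reddit_rate = seconds_per_step(30, 3000)       # 30 minutes on the 4090
my_rate = seconds_per_step(20 * 60, 3000)      # 20 hours on the 4080

print(f"{reddit_rate:.1f} s/step")             # 0.6 s/step
print(f"{my_rate:.1f} s/step")                 # 24.0 s/step
print(f"{my_rate / reddit_rate:.0f}x slower")  # 40x slower
```

A 40x difference per step is far more than the gap between those two cards would explain, so comparing the seconds-per-step figure in the console is usually a quicker check than comparing total time.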

rancid tartan Aug 9, 2023, 1:13 PM

#

How do I train a character LoRA? I tried making a character and style LoRA with 10 images and I got the style but failed on the character
By some miracle, I have access to my desktop again, so I should be able to train on a reasonable time

versed crescent Aug 9, 2023, 2:35 PM

#

Do LoRA dimensions always need to be powers of 2?

hollow spruce Aug 9, 2023, 2:37 PM

#

versed crescent Do LoRA dimensions always need to be powers of 2?

no.
since 24 is the current ideal for faces.
can't talk about the theory behind it though

versed crescent Aug 9, 2023, 2:38 PM

#

hollow spruce no. since 24 is the current ideal for faces. can't talk about the theory behind ...

Thanks. I'm just playing around with values, as going from 256 down to 128 gives me an absolutely incredible increase to speed.

orchid yoke Aug 9, 2023, 3:02 PM

#

Caith - Kinda curious on why your posts arent pinned , .. its like, anything you say should be revered (IMO). But anyone poking about on the things youve said already would.. learn alot. Felt impolite to @ you welcome to ignore it or not even see it 😄

hollow spruce Aug 9, 2023, 3:12 PM

#

orchid yoke Caith - Kinda curious on why your posts arent pinned , .. its like, anything you...

❤️
joe said he'd have the team review it. but I'm pretty sure I was forgotten 🤣

#

it's fine though. some day once my guide is up, I'll get them to link that!

stone garden Aug 9, 2023, 3:17 PM

#

Question, what are people's views on reg images? Also, in my LoRA, I can't get people's teeth to come out right. Any tips?

orchid yoke Aug 9, 2023, 3:18 PM

#

hollow spruce ❤️ joe said he'd have the team review it. but I'm pretty sure I was forgotten 🤣

Ive yet to see anyone else produce something that could be proven to be gold (and at the least a high level base with you know, tweaking) in like, 14 minutes or less.. but.. really looking forward to your take on a guide - thanks again

hollow spruce Aug 9, 2023, 3:19 PM

#

stone garden Question, what are people's views on reg images? Also, in my LoRA, I can't get p...

yep
teeth need about 5 times as long to train as faces
still haven't found a quick way to solve it other than either bigger datasets or accept more overfitting :/

stone garden Aug 9, 2023, 3:20 PM

#

I mean, I'm not picky whose teeth they are, they should look like teeth and not one big blob.

hollow spruce Aug 9, 2023, 3:20 PM

#

I didn't solve it for my mega lora until I hit 4k images in my dataset 🥲 (essentially a finetune, but in a lora)

sacred grail Aug 9, 2023, 3:20 PM

#

stone garden Question, what are people's views on reg images? Also, in my LoRA, I can't get p...

I had this theory for why reg images are not working properly, I personally think you'd have to generate images on the same seed you put your training to and the same prompts your captions of your training images are, not sure how accurate this is , didn't have time to properly test it yet..
reg images in dreambooth needed the activation word of your model (to target the right part of the model) so that might be the same if you use captioned images in lora

hollow spruce Aug 9, 2023, 3:22 PM

#

stone garden Question, what are people's views on reg images? Also, in my LoRA, I can't get p...

regularization is great for faces - but other than that, for literally any lora, I've not yet needed them

stone garden Aug 9, 2023, 3:22 PM

#

Ty so much. 😄

warm agate Aug 9, 2023, 3:33 PM

#

@hollow spruce i trained the lora

#

but i dont know if its really good

#

do you have any workflow and sample prompts to test?

hollow spruce Aug 9, 2023, 3:34 PM

#

warm agate do you have any workflow and sample prompts to test?

depends on what you want to achieve?
if you wanna be genuinely impartial, try random prompts from other people, which roughly include your lora subject

#

I usually mess with random civitai prompts, and generate an image without and with my lora

#

#✨｜sdxl message
#✨｜sdxl message

warm agate Aug 9, 2023, 3:36 PM

#

hollow spruce I usually mess with random civitai prompts, and generate an image without and wi...

ok thats better

#

@hollow spruce can you share a workflow for testing lora, as i dont have any

versed crescent Aug 9, 2023, 3:36 PM

#

Can you get over-training with a lora ? I remember over-training in SD1.5 Dreambooth and the subject's likeness was present in every single face that was generated

hollow spruce Aug 9, 2023, 3:37 PM

#

versed crescent Can you get over-training with a lora ? I remember over-training in SD1.5 Dreamb...

yep. that is called overfitting. and it is something you both do and don't want at the same time

#

up to a certain point it's usually good, and after a certain time it gets bad

#

especially bad if you accidentally also trained things like low quality noise/backgrounds/a watermark present in all training images

versed crescent Aug 9, 2023, 3:38 PM

#

Ok, and I guess that's why you have checkpoints generated every n epochs, and you manually test them to find the right balance

hollow spruce Aug 9, 2023, 3:38 PM

#

versed crescent Ok, and I guess that's why you have checkpoints generated every n epochs, and yo...

exactly! ^^

versed crescent Aug 9, 2023, 3:39 PM

#

I haven't quite figured out the maths for a good epoch balance, so my current run has generated 18, which is probably a bit excessive 😄

leaden patio Aug 9, 2023, 4:30 PM

#

does dreambooth work if you have jpgs, pngs, and webps?

#

i mean in the same dataset

hollow spruce Aug 9, 2023, 4:43 PM

#

leaden patio does dreambooth work if you have jpgs, pngs, and webps?

yes and no

#

it will work, but if you have sidecar caption files, with the same name
then obviously "image.jpg" and "image.png" can't both have their own "image.txt"

#

that can lead to serious issues - but other than that, you're good to go ^^
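A minimal sketch for spotting that situation before training, using only the standard library. The function name and the extension list are illustrative assumptions, not part of any trainer.

```python
from collections import defaultdict
from pathlib import Path

# Assumed set of image extensions; adjust to whatever your dataset contains.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp"}


def find_stem_collisions(folder: str) -> dict[str, list[str]]:
    """Return {stem: [file names]} for images that would share one caption file."""
    by_stem = defaultdict(list)
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_EXTS:
            by_stem[path.stem].append(path.name)
    return {stem: names for stem, names in by_stem.items() if len(names) > 1}


# Usage (hypothetical path):
# print(find_stem_collisions("train/1_girl"))
```

An empty result means every image can have its own `<name>.txt`.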

versed crescent Aug 9, 2023, 5:22 PM

#

Are the sidecar caption files suffixed with .npz ?

#

Or is that an intermediate cache of the latent values? (maybe this depends on your workflow. I'm using kohya-ss/sd-scripts)

restive bridge Aug 9, 2023, 5:57 PM

#

versed crescent Or is that an intermediate cache of the latent values? (maybe this depends on yo...

yes those are latents

shadow stream Aug 9, 2023, 6:02 PM

#

When I train my Lora after a certain number of epochs I just get black output. Is that overtraining or what is that? Any idea of what I am doing wrong?

stone garden Aug 9, 2023, 7:35 PM

#

shadow stream When I train my Lora after a certain number of epochs I just get black output. I...

I think you need to reduce the epochs or perhaps increase the dampening.

#

Is the loss going up a bad thing, and what can I do to control it? It goes up, but only slightly. I'm already using prodigy with ["decouple=True", "weight_decay=0.01", "d_coef=2.0", "use_bias_correction=True", "safeguard_warmup=False", "betas=0.9,0.99"], I've noticed it happens when the d*lr/d jumps.

shadow stream Aug 9, 2023, 8:39 PM

#

stone garden I think you need to reduce the epochs or perhaps increase the dampening.

What does increase dampening mean? I am using AdamW8Bit constant_with_warmup (5%) and a learning rate of 0.001

#

I'm getting it with other samplers too

stone garden Aug 9, 2023, 8:50 PM

#

You probably need a lower learning rate and a different sampler.

shadow stream Aug 9, 2023, 9:05 PM

#

stone garden You probably need a lower learning rate and a different sampler.

What's the recomended learning rate and sampler for SDXL Loras? (For a character)

stone garden Aug 9, 2023, 9:28 PM

#

I don't know, I'm still doing SD1.5 first.

restive bridge Aug 9, 2023, 11:00 PM

#

shadow stream What's the recomended learning rate and sampler for SDXL Loras? (For a character...

the ones you're using are the only ones I've seen "recommended". There just havent been enough experiments yet. I'm using Adafactor and 3e-4, someone else used 2e-4, and SEcourses uses 4e-4. training times for those rates can go up to 2-3 hours. yet caith gets good stuff with 1e-3 in 15 minutes, but we're all doing very different things.

shadow stream Aug 9, 2023, 11:02 PM

#

restive bridge the ones you're using are the only ones I've seen "recommended". There just have...

that's very interesting. Whats the theory behind how @hollow spruce can get good stuff so quickly?

restive bridge Aug 9, 2023, 11:05 PM

#

shadow stream that's very interesting. Whats the theory behind how @hollow spruce can g...

i don't think he's focused on photorealistic faces so it can be more quick and dirty. those of us trying to make HD photos of real people all seem to be using 256 dim still, and 0.0005-0.0001. it does a deeper, more dreambooth-like style of training

#

styles, concepts, objects, animals, mostly anything can be trained much faster than faces if you're going for flexibility + likeness + photoreal

shadow stream Aug 9, 2023, 11:08 PM

#

Thank you for the explanation. What's the best resources for best practice for captioning, regularization and parameters when it comes to faces? Any up to date guides that works for SDXL

restive bridge Aug 9, 2023, 11:13 PM

#

shadow stream Thank you for the explanation. What's the best resources for best practice for c...

I would try SEcourses youtube channel for a visual breakdown or Caith's message here. I haven't seen any other well-informed "guides" #🔧｜finetune message

stone garden Aug 10, 2023, 1:55 AM

#

restive bridge I would try SEcourses youtube channel for a visual breakdown or Caith's message ...

What would you recommend for captions for such a Lora? Prose or prompt?

Also what about the dim/alpha for 1.5?

restive bridge Aug 10, 2023, 2:05 AM

#

stone garden What would you recommend for captions for such a Lora? Prose or prompt? Also wh...

i'm not even using captions anymore for training faces and it's perfectly fine without.

i've never made 1.5 loras, i was on 2.1 when they came out so idk

covert pagoda Aug 10, 2023, 2:48 AM

#

stone garden Is the loss going up a bad thing, and what can I do to control it? It goes up, b...

About loss going up? Do you check that in a TensorBoard graph or in the training terminal output?

narrow kraken Aug 10, 2023, 2:57 AM

#

hello everyone

#

i just cant get this to run on google colab

#

no matter what i do, it throws the following error

#

stone garden Aug 10, 2023, 4:30 AM

#

covert pagoda About loss going up? Do you check that in a tensorflow graph or in the training ...

Both

glossy valve Aug 10, 2023, 7:31 AM

#

Has anyone tried to finetune XL 1.0 refiner for anime style?

hollow spruce Aug 10, 2023, 8:10 AM

#

glossy valve Has anyone tried to finetune XL 1.0 refiner for anime style?

If you really mean "finetune", then no, since even the base has only been finetuned 2 times so far:
https://huggingface.co/hakurei/waifu-diffusion-xl
https://huggingface.co/Linaqruf/animagine-xl

If you were referring to LoRAs, then no. Since we currently can't lora train the refiner.

sonic narwhal Aug 10, 2023, 8:54 AM

#

How did those two guys finetune the base?

stone garden Aug 10, 2023, 9:31 AM

#

How is ANYONE training SDXL LoRAs? Which commit of kohya-ss do you guys use, the dev and sdxl branches are broken.

hollow spruce Aug 10, 2023, 9:32 AM

#

stone garden How is ANYONE training SDXL LoRAs? Which commit of `kohya-ss` do you guys use, t...

kohya-gui main branch - it gets updates from the dev branches, so you dont need to do anything complicated

ruby pond Aug 10, 2023, 10:02 AM

#

anyone know how to find and remove unicode characters from the caption files?
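One possible approach, sketched with the standard library only. It assumes the captions are UTF-8 `.txt` files and that "remove unicode" means "reduce to plain ASCII"; the function names are made up. Accented letters are folded to their base letter, and anything else outside ASCII (curly quotes, emoji) is dropped, so it defaults to a dry run that only lists affected files.

```python
import unicodedata
from pathlib import Path


def to_ascii(text: str) -> str:
    """Fold accents (é -> e) and drop anything that still isn't ASCII."""
    normalized = unicodedata.normalize("NFKD", text)
    return normalized.encode("ascii", "ignore").decode("ascii")


def clean_captions(folder: str, dry_run: bool = True) -> list[str]:
    """Return the caption files containing non-ASCII; rewrite them if not dry_run."""
    changed = []
    for path in sorted(Path(folder).glob("*.txt")):
        original = path.read_text(encoding="utf-8")
        cleaned = to_ascii(original)
        if cleaned != original:
            changed.append(path.name)
            if not dry_run:
                path.write_text(cleaned, encoding="utf-8")
    return changed


print(to_ascii("café, girl"))  # cafe, girl
```

Run it once with the default `dry_run=True` to see which files would change, then again with `dry_run=False` after backing up the folder.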

hollow spruce Aug 10, 2023, 10:05 AM

#

Speed training for faces! it's time

Lora Training Settings - speed training faces edition

(24gb vram version for 3090/4090 or datacenter cards) - no regularization images - trains relatively fast

Use exactly 40 or 45 or 50 or 55 or 60 images (multiple of 5, and as close to 50 as possible)
Do we need captions for images? Yes! Because this is training the clip - hence the instructions are a bit more important to make this exact setup work
What captions?

A trigger word (caith, sdxl_token, george, sara, ohwx, shirogane-sama <- can be anything, as we're clip training. no need for celebrity names. Just please dont use "coffee shop" or "toyota". Any normal names, or completely made up names will work though)
Your class token (boy/man/girl/woman)
any features that aren't present in all images (glasses/sweater/suit/outdoor/indoor/shower/red lipstick/black lipstick)

A few example captions for images from my dataset:

girl, glasses, indoor, shirogane-sama
cindy aurum cosplay, girl, shirogane-sama
asuka cosplay, girl, indoor, shirogane-sama

(order doesn't matter - since we use shuffle captions!)
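To illustrate why the order is irrelevant, here is a small sketch of what shuffling a comma-separated caption amounts to. It is a simplified stand-in, not the trainer's own code, and `shuffle_caption` is a made-up name.

```python
import random


def shuffle_caption(caption: str, rng: random.Random) -> str:
    """Split on commas, shuffle the tags, and join them back together."""
    tags = [tag.strip() for tag in caption.split(",") if tag.strip()]
    rng.shuffle(tags)
    return ", ".join(tags)


rng = random.Random(0)  # seeded only so the demo is repeatable
for _ in range(3):
    print(shuffle_caption("girl, glasses, indoor, shirogane-sama", rng))
```

Each pass sees the same set of tags in a different order, so no tag is favoured by always sitting first.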

Training Images setup:

they don't need to all be 1024x1024 - its fine to have some lower quality ones, and it's fine to have like 2048x2048 images
they do need to be perfect squares 1:1, as we won't be using buckets, to reduce the amount of things that can go wrong. (buckets work just fine, we don't use them to keep this as simple to follow as possible)
images should be zoomed in on faces, similar to portrait shots. I'll include 4 sample images of my dataset, so you can see what level of zoom is recommended
folder name should be your class token that represents your images. Choose 1 from these 4: boy/man/girl/woman
(my folder name was 1_girl in this case)
No need for regularization photos
Repeat must be set to 1
Caption files need to have the same name as the images:
1bec16d.jpg
1bec16d.txt
jpg/png/webp all work just fine - but obviously make sure they all have unique names
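A quick pre-flight check for the naming rule above could look like this. It is a sketch with assumed names and an assumed extension list; it only reports images that have no matching `.txt` next to them.

```python
from pathlib import Path

# Assumed set of image extensions for the check.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp"}


def images_missing_captions(folder: str) -> list[str]:
    """List image files that have no sidecar '<same name>.txt' caption."""
    return [
        path.name
        for path in sorted(Path(folder).iterdir())
        if path.is_file()
        and path.suffix.lower() in IMAGE_EXTS
        and not path.with_suffix(".txt").exists()
    ]


# Usage (hypothetical path):
# print(images_missing_captions("train/1_girl"))
```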

Settings:

Make sure you're actually on the LoRA tab
Change Source model path to your own
Change folders to your own
Under Parameters -> There's a VAE option. Link it to the 0.9 VAE (then samples are kind of working)
Sample Prompt needs to update to match your own:

<trigger word>, <class token>, flavor text --w 1024 --h 1024 --d 2 --l 7 --s 30
shirogane-sama, girl, indoor, glasses --w 1024 --h 1024 --d 2 --l 7 --s 30
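If you want to generate that line rather than type it, a small sketch follows. The flag meanings are my reading of the kohya sample-prompt options (`--w` width, `--h` height, `--d` seed, `--l` CFG scale, `--s` steps); the function itself is hypothetical.

```python
def sample_prompt_line(
    trigger: str,
    class_token: str,
    extra_tags=(),
    width: int = 1024,
    height: int = 1024,
    seed: int = 2,
    cfg_scale: float = 7,
    steps: int = 30,
) -> str:
    """Build one sample-prompt line in the '<tags> --w ... --s ...' format."""
    tags = ", ".join([trigger, class_token, *extra_tags])
    return f"{tags} --w {width} --h {height} --d {seed} --l {cfg_scale} --s {steps}"


print(sample_prompt_line("shirogane-sama", "girl", ["indoor", "glasses"]))
# shirogane-sama, girl, indoor, glasses --w 1024 --h 1024 --d 2 --l 7 --s 30
```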

#

Expectations:

This should work straight on the first attempt, as long as you follow the guidelines.
Epoch 60 should be perfectly cooked. (but do try a few below and above that - just to be sure)
Training time: 8min per 10 epochs with no samples. (so about 80 minutes for 100 epochs)

Explanations of Parameters:

Consider this training to be a bit more... aggressive... to put it mildly.
Essentially we're using Dropout caption every n epochs to literally nuke the model with our training info.
We are using Text Encoder training - hence what captions are used is important, as some captions will break the model quickly. Start out with my recommendations, and then slowly expand from there.
All learning rates are set to 0.0005. This slows down training a bit - but that's needed due to what we're doing to the poor sdxl model with Dropout caption every n epochs
Network Rank (Dimension) is set to 24. This is the highest you should need to go - while higher may give better results - don't mistake this for your lora getting better vs the whole sdxl model getting worse. Essentially our dropout setting should emulate the effect that that higher dim setting was giving. It's not necessarily the best option that exists - but it's certainly not more destructive than using dim 128 or 256.

F.A.Q.:

Q: Can I enable Buckets?
A: Yes! Just make sure all the buckets have images that are a multiple of 5.

Q: Can I use more images? (Like 100!)
A: Yes! It will most likely increase quality, but the epochs that this needs to run will change. Basically just try out your various checkpoints afterwards, and let it run for longer.

Q: Will this run on 16gb vram?
A: Yes! Batch size & caption dropout will need to be adjusted. To what? That requires testing - feel free to try out various combinations and report back.

Q: Will this run on 12gb vram?
A: Most likely not. Text Encoder requires some vram as well - and that will probably push you above 12gb vram 😦
(But you can still train with other settings that don't use clip training)

Q: Why is it taking longer to train?
A: Cause we're generating samples. Feel free to turn them off for an almost 100% speed boost.

Q: For captions, can I write "a photo of a man standing inside a room"
A: No. Captions need to be simple words separated by commas. Simple but effective.

Q: Are more captions better?
A: Usually not. There are a lot of words we really don't want to train, so we're keeping it super simple on purpose.

Q: What if I want to train without captions?
A: Then this is the wrong setup - there are many other ways, just that this one relies on a few captions per image

Q: Should I save the training state?
A: Yes! It will let you pick up right where you left off. Meaning you can set training to 60 epochs, finish it in 48mins, and if you're unhappy with your checkpoint, you can just resume training again.

Q: Should I change the Save every N epochs setting?
A: You can change it to like 10 if you want. But keep in mind that every 5th epoch is a 'big one', since that's the one that runs with dropout.

Q: Why is this using offset noise of 0, instead of 0.0357?
A: This... is a lot more complicated to answer. But in a nutshell, it will make our images less grey in the end.

#

4 training images for context (I used a total of 45, in random cosplays, random positions, random outfits & hair colors, random backgrounds)
I trained her face, and lightly the aesthetic of her images

4307d09f0e6888ee9e266475526c4f8ba3357bc2536a91fe3d26d0857b8353fc.png

fd3ee01b67c8991a5720db1aab0ed46e42b707bf5b46e30d1d18713e350c66ea.png

1bec16d1f1d4477e21b17a473c234b848c93e0e06a1199928175635d51a93173.png

df43c1fcddf470d526947d0b8ff637e5c2eb357ef2777f43f796d4725fabb34f.png

#

without/with lora (base only - 1 sampler node only)

shirogane-sama20a20portrait20of20a20girl20in-sd_xl_base_10safetensors-2.png

shirogane-sama20a20portrait20of20a20girl20in-sd_xl_base_10safetensors-2-1.png

#

abstract20expressionism20artskinnygirl-sd_xl_base_10safetensors-356918025-1.png

abstract20expressionism20artskinnygirl-sd_xl_base_10safetensors-356918025.png

#

shirogane-sama20a2070s20grainy20photograph20-sd_xl_base_10safetensors-2.png

shirogane-sama20a2070s20grainy20photograph20-sd_xl_base_10safetensors-2-1.png

#

(images were made using random civitai prompt - so I can be impartial in how well the lora works)

hollow spruce Aug 10, 2023, 10:33 AM

#

And the json!

📎 kohya_facePreset_AdamW-fullbf16.json

ruby pond Aug 10, 2023, 10:44 AM

#

Amazing! Thanks @hollow spruce w00t

latent charm Aug 10, 2023, 11:11 AM

#

Can't wait to try it

ancient pier Aug 10, 2023, 11:16 AM

#

@hollow spruce wrote "images were made using random civitai prompt "

Or using random prompts supplied/inspired by the audience lol ( I recognise the 70's feeling)

hollow spruce Aug 10, 2023, 11:44 AM

#

also, in case somebody is wondering, this is what dropout even means

Dropout caption every n epochs
Usually, images and captions are learned as a pair, but it's possible to train just on "images without captions" every certain number of epochs.

This option allows you to specify "drop out captions every ○ epochs."

For instance, if you set this to 2, you will conduct image training without captions every 2 epochs (2nd epoch, 4th epoch, 6th epoch...).

By training on images without captions, it is expected that your LoRA will learn a more comprehensive feature set from the images. It can also help prevent the image features from being tied too closely to specific words. However, if you use captions too sparingly, your LoRA could become ineffective at prompts, so be cautious.

The default is 0, and in the case of 0, caption dropout is not performed.
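As a tiny illustration of the schedule described above (a sketch of the description, not of the trainer's internals), this lists which epochs would run without captions for a given setting:

```python
def caption_dropout_epochs(total_epochs: int, every_n: int) -> list[int]:
    """Epochs (1-based) that train without captions; 0 disables dropout."""
    if every_n <= 0:
        return []
    return [epoch for epoch in range(1, total_epochs + 1) if epoch % every_n == 0]


print(caption_dropout_epochs(6, 2))   # [2, 4, 6]
print(caption_dropout_epochs(20, 5))  # [5, 10, 15, 20]
print(caption_dropout_epochs(20, 0))  # []
```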

Rate of caption dropout
This is similar to the "Dropout caption every n epochs" mentioned above, but during the entire learning process, you can train on "images without captions" for a certain proportion of the time.

Here, you can set the proportion of images without captions. 0 means "always use captions during training," and 1 means "never use captions during training."

Which images will be trained as "images without captions" is determined randomly.

For example, if you train LoRA with 20 images, reading each image 50 times for just 1 epoch, the total number of image learnings is 20 images x 50 times x 1 epoch = 1000 times. If you set the rate of caption dropout to 0.1, 1000 times x 0.1 = 100 times, you will train on "images without captions."

The default is 0, and all images are learned with captions

Consider this option going nuclear. Might be great for style loras, but anything else is technically seen as a wrong application. It's working here since we're only training for a few epochs - and are fine with the little damage that it does do. It's still a lot less damage than using network rank 256
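
For anyone who prefers code to prose, here is a rough sketch of how the two dropout options described above could be applied per training sample. This is not kohya's actual implementation: the function name `pick_caption` and its parameters are made up for illustration, and it assumes epochs are counted from 1.

```python
import random

def pick_caption(caption, epoch, every_n_epochs=0, dropout_rate=0.0, rng=random):
    """Return the caption to train with, or "" when the caption is dropped.

    Hypothetical helper for illustration only; epoch is 1-based.
    """
    # "Dropout caption every n epochs": drop on epochs n, 2n, 3n, ...
    if every_n_epochs > 0 and epoch % every_n_epochs == 0:
        return ""
    # "Rate of caption dropout": drop a random share of all samples
    if dropout_rate > 0 and rng.random() < dropout_rate:
        return ""
    return caption

# every_n_epochs=2: captions are dropped on the 2nd, 4th and 6th epoch
dropped_epochs = [e for e in range(1, 7)
                  if pick_caption("photo of a girl", e, every_n_epochs=2) == ""]
print(dropped_epochs)

# 20 images x 50 repeats x 1 epoch = 1000 samples; a rate of 0.1
# trains roughly 100 of them without captions (chosen at random)
rng = random.Random(0)
n_dropped = sum(pick_caption("photo of a girl", 1, dropout_rate=0.1, rng=rng) == ""
                for _ in range(20 * 50 * 1))
print(n_dropped)
```

With both options left at 0 the caption is always kept, which matches the defaults described above.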

covert pagoda Aug 10, 2023, 1:38 PM

#

@hollow spruce I saw BMaltais write a review of iA3 LyCORIS where he had two dataset image folders: one regular 10_busterkeaton man and 10_buster Keaton hat. Strangely, no class name it seems. But I'd be interested to know if dividing central concepts in dataset preparation benefits training with decisive separation of the concepts. https://www.reddit.com/r/StableDiffusion/comments/14low8y/lora_lycoris_ia3_is_amazing_info_in_1st_comment/

stone garden Aug 10, 2023, 1:40 PM

#

covert pagoda @hollow spruce I saw BMaltais write a review of iA3 LyCORIS where he had ...

I've looked at the code. If the repeats are the same and every file has captions, it should in theory make zero difference.

covert pagoda Aug 10, 2023, 1:43 PM

#

stone garden I've looked at the code. If the repeats are the same and every file has captions...

Ok so, it would be like having them all together in the same folder. With the added benefit of giving you a simpler organization by image type, say for instance to easily separate crop types (closeup, full and half body)... if you're dealing with a 100-image dataset it can be a benefit

#

Thanks for your response

hollow spruce Aug 10, 2023, 1:47 PM

#

covert pagoda @hollow spruce I saw BMaltais write a review of iA3 LyCORIS where he had ...

I'm loving what I read about iA3 LyCORIS. will definitely try it out ~~once it leaves the dev branch~~
will be trying it out, since it seems to be working for sdxl

hollow spruce Aug 10, 2023, 1:49 PM

#

covert pagoda @hollow spruce I saw BMaltais write a review of iA3 LyCORIS where he had ...

as for the separation of concepts - yes it works, but it was already hard to use correctly in 1.5
and even harder now in sdxl

#

I've done it a few times so far

#

especially for my big datasets, where I did this

#

it's also how I include regularization images, when I misuse them as training data rather than regularization data

orchid yoke Aug 10, 2023, 1:58 PM

#

hollow spruce especially for my big datasets, where I did this

Curious what your take on a small dataset is, like, on the lower end of 10 - 20 images. It's pretty easy to get just the photo style of the subject. You can pretty much do that with 1 image, or one image with a couple of different crops. So I was about to try your last method, with the real images I can collect, plus generated photo images of that subject, which I guess isn't ideal, but.. I was trying to think of the best you can do with the least.

hollow spruce Aug 10, 2023, 2:00 PM

#

orchid yoke Curious what your take on a small dataset is, like, on the lower end of 10 - 20 ...

1-10 photos is still doable - but it's not what I'd call beginner friendly - nor will the final lora achieve an 80% hitrate of 'good' images.
10-20 images is just about good enough, settings will be harder to get right

#

also depends on quality of those images

orchid yoke Aug 10, 2023, 2:00 PM

#

down to that snappy tagging? (+ good quality images)

hollow spruce Aug 10, 2023, 2:01 PM

#

keep in mind that with small datasets, you might accidentally end up training something like a jpg-compression artifact lora 🤣 happened to me once when I trained on not-so-high quality images, and since there weren't enough, I accidentally made a lora that added jpg compression 🥲

#

but yeah. 2 high quality images is the lowest that really "works"

orchid yoke Aug 10, 2023, 2:03 PM

#

Yeah i appreciate everything, Loras were always a side thing, i could just throw in before on 1.5, so having to learn a bit just to do what was quick mockups, but im in love with SDXL and i dont want to go back 😄

hollow spruce Aug 10, 2023, 2:03 PM

#

at 1 image, you're just building a weird controlnet lora to hopefully reproduce the right thing.
can only recommend to get it good enough to produce a few good images - then reuse those to train a good lora

hollow spruce Aug 10, 2023, 2:04 PM

#

orchid yoke Curious what your take on a small dataset is, like, on the lower end of 10 - 20 ...

fyi, generated photos work just fine.
sure you'll probably reduce skin details - but that's rarely the make or break point for if a lora is good

#

and can be fixed with a bit of prompting

#

not my thing, since my long term goal is to make a finetune that is trained on less than 50% professional photography
but I can vouch for a lot of loras trained on nothing but synthetic images, and the lora is 😘 chef's kiss!

#

(especially style loras suffer essentially 0 quality loss for synthetic training)

covert pagoda Aug 10, 2023, 2:20 PM

#

I'm about to move into style training and my main use case, as I am a photographer by trade, is to train for specific photographic styles which include studio flash or very natural soft light, or composition and film look. Usually this is done by using "in style of so and so". But as I want to refine the styles into something much more focused and specific, I wondered if you had some basic art style tenets. I suppose image-to-image cohesion is the most important thing, so the end result in the Lora is snappy. I wonder if art/fashion photographic styles as concepts are difficult. I shall know soon. Will test with standard Lora and prodigy Lokr @hollow spruce

#

Have had some excellent results with loha/prodigy for character...

onyx thicket Aug 10, 2023, 2:21 PM

#

Has anyone tested the difference between training a lora on the base model vs a different model whose images you like better? Like dreamshaperXL. Which is a better base for a lora?

covert pagoda Aug 10, 2023, 2:21 PM

#

Here's an example of character LoHA. Dataset of 20

#

#

00456-seeds-_lora_s0r4_runp_prod_v3_40rep-000006_1__photo_of_s0r4_woman_miniskirt_strapless__bandeau_tube_top_ribs_mugler_designer_1.png

#

And gen

#

#

40 repeats, batch 8, 6 epochs

#

And yes, full-body images are hit and miss as far as face accuracy goes, though I suspect adetailer and inpainting should be standard for full body

hollow spruce Aug 10, 2023, 2:38 PM

#

covert pagoda

uff that looks cool and good

#

what network + Convolution ranks + alphas were you running on?

covert pagoda Aug 10, 2023, 2:52 PM

#

Pretty much all default on preset loha/prodigy. Except precision at bf/bf. Dim alpha 32/16

#

Nothing really changed but I did curate my dataset a ton

#

For the photo styles I guess I could organise image folders under film look, light, edgy (for stuff that is more complex to categorize). The first two are pretty standard and not much captioning to do. But the third would be more of a flavour thing. I wish it was possible to use yaml files to increase repeats on specific important subfolders recursively like in Everydream

hollow spruce Aug 10, 2023, 2:58 PM

#

covert pagoda Pretty much all default on preset loha/prodigy. Except precision at bf/bf. Dim a...

at least in theory, 32/1 should give better results.
but in practice, I've not nearly spent enough time with LoHA

covert pagoda Aug 10, 2023, 3:00 PM

#

The Lycoris discord guys discuss it a little, though they are not always especially forthcoming with config settings. They talk math more often than not lol

covert pagoda Aug 10, 2023, 3:07 PM

#

hollow spruce at least in theory, 32/1 should give better results. but in practice, I've not n...

Im just surprised how much more effective prodigy is than adafactor. At batch 6 I cooked the model much faster and nicely than with ada in batch 1 over a much longer period of time

hollow spruce Aug 10, 2023, 3:16 PM

#

covert pagoda Im just surprised how much more effective prodigy is than adafactor. At batch 6 ...

yeah. prodigy for automated or adamW if you're confident about your training rate

warm agate Aug 10, 2023, 4:46 PM

#

how to get prodigy for gui?

covert pagoda Aug 10, 2023, 4:58 PM

#

warm agate how to get prodigy for gui?

Preset on Kohya

warm agate Aug 10, 2023, 5:00 PM

#

oh ok tq

warm agate Aug 10, 2023, 5:01 PM

#

covert pagoda Preset on Kohya

i dont have 24gb vram, it uses 24gb vram, i have 16gb vram, what batch size do you suggest?

covert pagoda Aug 10, 2023, 5:02 PM

#

hollow spruce yeah. prodigy for automated or adamW if you're confident about your training rat...

For outfits training any recommendations? How many per outfit?

#

But if I want to generalize from a variety of similar outfits from a same collection, should I fine tune a model rather than do a loha with several outfits and mix them up with a multiplier?

warm agate Aug 10, 2023, 5:15 PM

#

@hollow spruce what does this mean?

covert pagoda Aug 10, 2023, 5:19 PM

#

warm agate i dont have 24gb vram, it uses 24gb vram, i have 16gb vram, what batch size do y...

Then try batch 1. It's very ram consuming. I use an a100 with 80gb ram and it's like 3 hours

warm agate Aug 10, 2023, 5:20 PM

#

oh ok

covert pagoda Aug 10, 2023, 5:21 PM

#

warm agate @hollow spruce what does this mean?

Use bf instead of fp precision. And start with batch 1 and very small dataset. It might not generalize but at least you can check how much time it takes to learn the face

warm agate Aug 10, 2023, 5:21 PM

#

covert pagoda Use bf instead of fp precision. And start with batch 1 and very small dataset. I...

ok

hollow spruce Aug 10, 2023, 5:37 PM

#

warm agate @hollow spruce what does this mean?

did you enable fp16, anywhere other than for saving the lora?

warm agate Aug 10, 2023, 5:47 PM

#

hollow spruce did you enable fp16, anywhere other than for saving the lora?

yeah

#

i disable it rn

#

@hollow spruce

📎 sg_v4.json

#

idk why even with basic settings the vram consumption is high

hollow spruce Aug 10, 2023, 5:49 PM

#

warm agate @hollow spruce

loha + prodigy + TE?

warm agate Aug 10, 2023, 5:50 PM

#

hollow spruce loha + prodigy + TE?

whats te?

#

which is better adamw or prodigy?

simple ivy Aug 10, 2023, 5:51 PM

#

Can anyone offer assistance on the relationship between training images and epochs and steps? In Kohya LoRA training I have 33 images, set to train with batch size 1, no bucketing, learning rate 1 ... I'd expect epoch amounts to 33 steps, right? One step for each image? But some other setting is making that 1 epoch amount to 5445 steps. How do I make sense of this?

warm agate Aug 10, 2023, 5:54 PM

#

simple ivy Can anyone offer assistance on the relationship between training images and epoc...

steps = (images x repeats) / batch_size x epochs x 2(only if reg images are used)
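
A quick way to sanity-check a step count is to run the formula above through a few lines of Python. This is only a sketch: the function name `total_steps` is made up, and rounding the result up to a whole step is an assumption about how the trainer handles a partial last batch.

```python
import math

def total_steps(images, repeats, batch_size, epochs, reg_images=False):
    """steps = (images x repeats) / batch_size x epochs, doubled with reg images.

    Hypothetical helper; rounding up is an assumption, not confirmed behaviour.
    """
    factor = 2 if reg_images else 1
    return math.ceil(images * repeats * epochs * factor / batch_size)

# 33 images, 1 repeat, batch size 1, 1 epoch, no reg images
print(total_steps(images=33, repeats=1, batch_size=1, epochs=1))

# The same run with regularization images doubles the step count
print(total_steps(images=33, repeats=1, batch_size=1, epochs=1, reg_images=True))

# 20 images read 50 times each for 1 epoch
print(total_steps(images=20, repeats=50, batch_size=1, epochs=1))
```

If the number the trainer reports is far larger than what this gives for your settings, the usual suspect is a repeats value you did not intend.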

simple ivy Aug 10, 2023, 5:55 PM

#

warm agate steps = (images x repeats) / batch_size x epochs x 2(only if reg images are use...

repeats? I don't see such a setting. And I am not using reg images right now so that should mean no doubling. I have all settings I can find to just x1 so I expect 1 epoch should equal my training set. 33. 5,445 is so weirdly huge I can't figure out what is going awry

-edit: I think I found it - just needed to type some things out and then do some arithmetic. Conflict arising between trying to piece together incomplete direction from multiple sources.

hollow spruce Aug 10, 2023, 5:59 PM

#

warm agate which is better adamw or prodigy?

prodigy will always get you 80% of the way if configured right.
adamW can get you to 100%, but that doesn't mean that it will.

warm agate Aug 10, 2023, 6:02 PM

#

simple ivy repeats? I don't see such a setting. And I am not using reg images right now so ...

its in tools>deprecated

warm agate Aug 10, 2023, 6:02 PM

#

hollow spruce prodigy will always get you 80% of the way if configured right. adamW can get you...

can you share adamw

#

i'll train it once this gets completed

hollow spruce Aug 10, 2023, 6:12 PM

#

warm agate can you share adamw

for faces:
#🔧｜finetune message

for others:
#🔧｜finetune message

warm agate Aug 10, 2023, 6:13 PM

#

tq

warm agate Aug 10, 2023, 6:44 PM

#

@hollow spruce do you suggest loha or lora for characters?

warm agate Aug 10, 2023, 6:45 PM

#

hollow spruce And the json!

the epochs is 100, i have 160 images
what should I set the repeats and epochs to?

hollow spruce Aug 10, 2023, 6:45 PM

#

warm agate @hollow spruce do you suggest loha or lora for characters?

I can only say that my faces preset works. Haven't tested loha with sdxl yet

warm agate Aug 10, 2023, 6:46 PM

#

hollow spruce I can only say that my faces preset works. Haven't tested loha with sdxl yet

ok, prodigy uses loha

hollow spruce Aug 10, 2023, 6:46 PM

#

warm agate the epochs is 100, i have 160 images what should I set the repeats and epochs to...

repeat = 1
if you're doing the faces one - then epochs 60 will be finished.
I usually let it run to 100 so I can test various checkpoints - to see what heavy overfitting did

warm agate Aug 10, 2023, 6:47 PM

#

hollow spruce repeat = 1 if you're doing the faces one - then epochs 60 will be finished. I us...

oh ok
but 100x160

warm agate Aug 10, 2023, 6:47 PM

#

hollow spruce repeat = 1 if you're doing the faces one - then epochs 60 will be finished. I us...

i use batch_size of 1, so should I also use 100
the steps change, so asking

hollow spruce Aug 10, 2023, 6:47 PM

#

warm agate oh ok but 100x160

🤷‍♂️ might take less epochs, might take more. My preset was designed for 50 images. You'll have to try and find out

warm agate Aug 10, 2023, 6:48 PM

#

oh ok

#

will use 100 and try

cyan harbor Aug 10, 2023, 7:11 PM

#

Hi guys. I was working with stable diffusion 1111 without a problem today, but since I copied in the 7 GB version of the ckpt file, it has stopped working with the SDXL refiner. Even after I deleted the file it doesn't work: when I push the generate button it gives me a black image 😔

tidal silo Aug 10, 2023, 7:56 PM

#

Does anyone have any good information or resources on instance & class tokens for training SDXL? I have seen things such as "ohwx" recommended because it's unique & 1 token, but I've also seen people say that with SDXL you should just use the name of the person or something else. Really trying to understand this better.

hollow spruce Aug 10, 2023, 7:57 PM

#

tidal silo Does anyone have any good information or resources on instance & class tokens fo...

#🔧｜finetune message

tidal silo Aug 10, 2023, 9:18 PM

#

hollow spruce https://discord.com/channels/1002292111942635562/1026382406279770152/11391375341...

just read through so much of this stuff above, thanks so much @hollow spruce for all the effort you are taking to help people, so useful 🙌 . Still have much to learn. Two questions if you don't mind - Are your guides hosted anywhere else except buried in the discord? 🥲 & any resources or further reading for going beyond your guide? e.g. training styles / multiple subjects / larger checkpoints?

stone garden Aug 11, 2023, 3:15 AM

#

covert pagoda @hollow spruce I saw BMaltais write a review of iA3 LyCORIS where he had ...

One point it’d help is if you had no captions. Then you could use a different class token and template for each subset.

#

It helps if you want to re-use the images later.

stone garden Aug 11, 2023, 3:19 AM

#

warm agate which is better adamw or prodigy?

I’ve found that with Prodigy min_snr_gamma has a significant effect. Set it low for simple Loras (like characters) and high for complicated Loras (style, for example). But outside of that, it’s really been the only parameter I needed to adjust for good results.

#

Don’t know about 100% of the way though. Never went that far.

#

And my dataset is pretty shit to say the least. The captions are good, the images are not edited in any way. Just selected/pruned.

latent charm Aug 11, 2023, 4:11 AM

#

@hollow spruce

#

I also get this effect using your preset. Is it related to the dataset? How do you resolve it?

hollow spruce Aug 11, 2023, 4:14 AM

#

latent charm @hollow spruce

the samples are always half-working. same for me.
it's good enough to roughly know whats going on - but actually testing the checkpoints will give proper results

latent charm Aug 11, 2023, 4:16 AM

#

It has the effect in real run.🤣

hollow spruce Aug 11, 2023, 4:50 AM

#

latent charm It has the effect in real run.🤣

wait what? XD that's not normal!

signal warren Aug 11, 2023, 5:22 AM

#

Maybe you have that setting which ends the steps early?

placid stag Aug 11, 2023, 5:23 AM

#

Are there no Chinese people here?

#

catwhaaa

south jungle Aug 11, 2023, 6:38 AM

#

There are Americans.

signal warren Aug 11, 2023, 6:42 AM

#

If someone wanted to make a SDXL finetune with many concepts and ideas all in one, but can't because they only have 12gb vram, would making a huge lora or multiple loras then merging it with the SDXL model be any good?

stone garden Aug 11, 2023, 6:55 AM

#

signal warren If someone wanted to make a SDXL finetune with many concepts and ideas all in o...

I don’t know but you can test it out. A couple of tips: Use scaled weights to make sure the LoRAs are compatible. Test different orders as order matters. And finally, merge in the same order you test.

restive bridge Aug 11, 2023, 7:00 AM

#

just realized SEcourses is training the text encoder with only an instance token and no captions in his video. and results look good🤔 trying it for myself, except his params for 16 imgs is 3 fkn hours

hollow spruce Aug 11, 2023, 7:05 AM

#

restive bridge just realized SEcourses is training the text encoder with only an instance token...

and results look good
do note what happens to the backgrounds in the images he generates

#

text encoder training for single words works just fine - but if you pick the wrong one, uff does it go bad quickly XD

#

so use it with words you've tested and know are ok

restive bridge Aug 11, 2023, 7:06 AM

#

random string?

#

i.e. "2jF7DT98L"

hollow spruce Aug 11, 2023, 7:07 AM

#

restive bridge i.e. "2jF7DT98L"

'2jF7' <- 4 letters only. might be worth trying

#

in your particular case, I'd rather figure out the perfect parameters for training on the 4 main class words for people though

#

boy/man/girl/woman

#

cause once you got those down, you can train them much easier

restive bridge Aug 11, 2023, 7:08 AM

#

hollow spruce '2jF7' <- 4 letters only. might be worth trying

as in use "man" as instance token? I'm still using "man" reg

hollow spruce Aug 11, 2023, 7:09 AM

#

restive bridge as in use "man" as instance token? I'm still using "man" reg

yeah. I'd try with a small amount of dropout + clip training on only the word 'man'. no regularization images

#

did a lot of tests the last two days, and it should work

#

I'm starting to get good results with less and less images

#

(much trial and error though)

restive bridge Aug 11, 2023, 7:10 AM

#

hollow spruce I'm starting to get good results with less and less images

network or caption dropout? and what value are you using

hollow spruce Aug 11, 2023, 7:10 AM

#

restive bridge network or caption dropout? and what value are you using

dropout every 5 epochs

#

that will work if you can get your training to finish in under 100 epochs (1 repeat)

#

so basically play with training rate, until 50 epochs looks 'perfect'

#

that way you also train much faster obviously

#

I've even pulled off a somewhat successful training on a single image - using that method

restive bridge Aug 11, 2023, 7:12 AM

#

hollow spruce so basically play with training rate, until 50 epochs looks 'perfect'

do u mean learning rate?

hollow spruce Aug 11, 2023, 7:12 AM

#

restive bridge do u mean learning rate?

yes

restive bridge Aug 11, 2023, 7:13 AM

#

ok I'll give it a shot thank you

signal warren Aug 11, 2023, 7:17 AM

#

stone garden I don’t know but you can test it out. A couple of tips: Use scaled weights to ma...

cheers love

fervent sandal Aug 11, 2023, 7:19 AM

#

Hey all. I've been interested to train a lora. But my set-up of using Linux means I get a crash with the kohya_ss gui. Can anyone say if they have had success in just using the sd-scripts alone (on Linux)?

hollow spruce Aug 11, 2023, 7:23 AM

#

fervent sandal Hey all. I've been interested to train a lora. But my set-up of using Linux mean...

weird. kohya should run better on linux O_O

latent charm Aug 11, 2023, 7:23 AM

#

@hollow spruce Your preset is good but I think the likeness is not enough. My captions use natural language like this: {rare_token}, a {subject}..... Should I reduce the tokens in the caption? I think the result of this attempt is like 70-80% of the original using the default epochs with 60 images. If I extend the epochs, might that increase the likeness?

hollow spruce Aug 11, 2023, 7:23 AM

#

fervent sandal Hey all. I've been interested to train a lora. But my set-up of using Linux mean...

https://github.com/bmaltais/kohya_ss
^ you were trying to install this, right?

fervent sandal Aug 11, 2023, 7:24 AM

#

hollow spruce weird. kohya should run *better* on linux O_O

I get a crash with python - the Tk extension is not present. The other factor is that this is a headless EC2 with attached GPU. I have no idea why something with a web interface requires Tk!

hollow spruce Aug 11, 2023, 7:24 AM

#

latent charm @hollow spruce Your preset is good but I think the likeness is not enough...

yeah. more epochs should increase likeness.
'a' is wrong though. just {rare_token}, {subject}, {clothing}, {background}

#

and you're using 50 images? or less?

#

I should probably stop using epochs as a reference. essentially 3000 steps (600 steps at batch 5) is where my loras turn out ideal.
but the full range is 1500 steps to 4500 steps (so that would be epoch 30~90)
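
To translate those step counts to a dataset of a different size, a small conversion sketch may help. It assumes that "steps" here means images seen (so 3000 steps is 600 optimizer steps at batch 5) and that repeats is 1; both function names are made up for illustration.

```python
def epochs_for(images_seen, images, repeats=1):
    """Epochs needed to show the trainer `images_seen` samples (hypothetical helper)."""
    return images_seen / (images * repeats)

def optimizer_steps(images_seen, batch_size):
    """Optimizer steps for the same number of samples at a given batch size."""
    return images_seen / batch_size

# 50 images: the 1500 / 3000 / 4500 range maps to epochs 30 / 60 / 90
print([epochs_for(n, images=50) for n in (1500, 3000, 4500)])

# 3000 images seen at batch size 5 is 600 optimizer steps
print(optimizer_steps(3000, batch_size=5))

# With 60 images the same targets land at different epochs
print([epochs_for(n, images=60) for n in (1500, 3000, 4500)])
```

Treat the output as a starting point for picking which checkpoints to test, not as a rule; the thread makes clear that the right stopping point still has to be found by trying.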

latent charm Aug 11, 2023, 7:29 AM

#

using 60 images

hollow spruce Aug 11, 2023, 7:29 AM

#

it's why I initially suggested 100 epochs - just to be sure

hollow spruce Aug 11, 2023, 7:30 AM

#

latent charm using 60 images

if you saved the training state, then let it run for up to 100 epochs

latent charm Aug 11, 2023, 7:30 AM

#

I would try the 4 token caption way

hollow spruce Aug 11, 2023, 7:30 AM

#

latent charm I would try the 4 token caption way

clothing/background only need to be mentioned if they aren't the same in all images

#

basically you write a word for everything that actually changes between the images in your dataset

latent charm Aug 11, 2023, 7:31 AM

#

how detailed should these 4 parts be?

hollow spruce Aug 11, 2023, 7:31 AM

#

"glasses" aren't needed if the subject wears glasses in all images.
but if they only wear them in half the images, then it's important to tag it
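
The "only tag what changes" rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not part of any trainer: `build_captions` is a made-up helper, and the attribute dictionaries are invented example data.

```python
def build_captions(token, subject, images):
    """Build '{rare_token}, {subject}, ...' captions, tagging only what varies.

    images: list of dicts mapping an attribute name to its value for that
    image (a missing key means the attribute is absent). Hypothetical helper.
    """
    # All attribute names, in first-seen order
    keys = list(dict.fromkeys(k for img in images for k in img))
    # Keep only attributes that are not identical across the whole dataset
    varying = [k for k in keys if len({img.get(k, "") for img in images}) > 1]
    captions = []
    for img in images:
        parts = [token, subject] + [img[k] for k in varying if img.get(k)]
        captions.append(", ".join(parts))
    return captions

# Glasses in every image: not tagged. Clothing differs: tagged.
always_glasses = [
    {"eyewear": "glasses", "clothing": "fencing suit"},
    {"eyewear": "glasses", "clothing": "blue track outfit"},
]
print(build_captions("2jF7", "woman", always_glasses))

# Glasses in only half the images: now they need a tag.
sometimes_glasses = [{"eyewear": "glasses"}, {}]
print(build_captions("2jF7", "woman", sometimes_glasses))
```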

latent charm Aug 11, 2023, 7:32 AM

#

like clothing, white t-shirt and blue jeans or t-shirt and jeans?

hollow spruce Aug 11, 2023, 7:34 AM

#

latent charm how detailed should these 4 parts be?

lin sheng, woman, fencing suit
lin sheng, woman, blue track outfit
lin sheng, woman, studio photograph

chiba-port-arena-chiba-japan-14th-june-2019-lin-sheng-chn-june-14-2019-fencing-2019-asian-fencing-championships-womens-individual-epee-award-ceremony-at-chiba-port-arena-chiba-japan-credit-yohei-osadaaflo-sportalamy-live-news-TDEEH6.png

#

to give an example

hollow spruce Aug 11, 2023, 7:35 AM

#

latent charm like clothing, white t-shirt and blue jeans or t-shirt and jeans?

jeans shouldn't be visible
cropping is important!
#🔧｜finetune message

check there for crop examples

latent charm Aug 11, 2023, 7:39 AM

#

I cropped all as 1024 for testing

ancient pier Aug 11, 2023, 7:43 AM

#

@hollow spruce L to R
No LORA
SDXL Offset Example at 0.65
Your LORA at 0.65

Nice 🙂

sonic narwhal Aug 11, 2023, 7:45 AM

#

simple ivy repeats? I don't see such a setting. And I am not using reg images right now so ...

Repeats is the number at the start of your image folder name. For example, with 20_Subject, the repeats would be 20
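
A tiny sketch of that naming convention, in case it helps to check folders before starting a run. The helper name `parse_image_folder` is made up, and the regex only reflects the `<number>_<name>` pattern discussed in this thread.

```python
import re

def parse_image_folder(name):
    """Split a folder name like '20_Subject' into (repeats, rest).

    Hypothetical helper; raises ValueError when the numeric prefix is missing.
    """
    match = re.match(r"^(\d+)_(.+)$", name)
    if match is None:
        raise ValueError(f"folder name {name!r} has no '<repeats>_' prefix")
    return int(match.group(1)), match.group(2)

print(parse_image_folder("20_Subject"))  # repeats = 20
print(parse_image_folder("1_lora"))      # repeats = 1
print(parse_image_folder("40_lora"))     # repeats = 40
```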

ancient pier Aug 11, 2023, 7:46 AM

#

ancient pier @hollow spruce L to R No LORA SDXL Offset Example at 0.65 Your LORA at 0...

Same 3 Images after Upscaling/HRF & Face Detailing

restive bridge Aug 11, 2023, 7:49 AM

#

@hollow spruce would i not see any benefit from adding regularization to your method? considering I don't have captions, and many images have matching clothing, and training set is usually 12-25 imgs, and backgrounds are all white

#

if i can afford the time

hollow spruce Aug 11, 2023, 7:54 AM

#

restive bridge @hollow spruce would i not see any benefit from adding regularization to ...

I've not tried it yet.
If you do add reg images, then definitely double the dropout interval, to once every 10 epochs
Not sure if that will mean it runs the same, less, or more epochs though. will be interesting to see

restive bridge Aug 11, 2023, 8:00 AM

#

hollow spruce I've not tried it yet. If you do add reg images, then definitely double dropout ...

have you experimented with batch 1? I keep hearing quality is best one image at a time

hollow spruce Aug 11, 2023, 8:09 AM

#

restive bridge have you experimented with batch 1? I keep hearing quality is best one image at ...

batch size impacts learning rate & in case of clip training, how the actual clip training works
so it's more of a case of people using presets designed for batch 1 - then having worse experiences after using a higher batch size - which makes total sense.

higher batch sizes can deliver better results, but that doesn't mean that training gets automatically better with high batch sizes.
Basically, one setting will never fit all vram options.
(but you can use batch 1 + Gradient accumulation 5, to get extremely close to batch 5) <- so that helps when you already have a good workflow that is designed for a higher end card
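
Why batch 1 with gradient accumulation 5 lands so close to batch 5 can be shown with a toy example in plain Python. This is not trainer code: it uses a made-up one-weight model and invented data, and it assumes the accumulated gradients are averaged over the accumulation steps.

```python
def grad_mse(w, batch):
    """Gradient of mean((w*x - y)^2) with respect to w over a batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

# Invented toy data: (x, y) pairs
data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2), (5.0, 9.8)]
w = 0.5

# One optimizer step at batch size 5
g_batch5 = grad_mse(w, data)

# Batch size 1 with gradient accumulation 5: add up five single-sample
# gradients (each scaled by 1/5), then take one optimizer step
g_accum = 0.0
for sample in data:
    g_accum += grad_mse(w, [sample]) / len(data)

print(abs(g_batch5 - g_accum) < 1e-9)
```

In a real run the two are not guaranteed to be bit-identical (sample order and anything computed per batch can differ), which fits the "extremely close" wording above. The cost is time, since the five small passes run one after another.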

restive bridge Aug 11, 2023, 8:34 AM

#

@hollow spruce "man" token worked surprisingly well but not even 100 epochs was enough so I may extend it. artifacts on clothing is WAY less than before with celeb and reg, but detail is lower for sure. Could I raise dim just a bit?

hollow spruce Aug 11, 2023, 8:44 AM

#

restive bridge @hollow spruce "man" token worked surprisingly well but not even 100 epoc...

are you too zoomed out on your images?
but yeah. either increase learning rate, or try using dropout every 5 epochs after all

restive bridge Aug 11, 2023, 8:46 AM

#

hollow spruce are you too zoomed out on your images? but yeah. either increase learning rate, ...

that was with dropout every 5. theres a couple waist-up but mostly closeup

hollow spruce Aug 11, 2023, 8:51 AM

#

restive bridge that was with dropout every 5. theres a couple waist-up but mostly closeup

interesting. then it's probably that you have 25 images, instead of my usually tested 50.
yeah, just increase learning rate until epoch 50~100 is well cooked

covert pagoda Aug 11, 2023, 9:06 AM

#

hollow spruce interesting. then it's probably that you have 25 images, instead of my usually t...

What about gradient accumulation on batch 10 on a 80gig A100 card. Is there any benefit to update weights less frequently if using a high batch size with a larger gpu? I'm running right now at batch 4 with excellent results but at 3 hours training. I'd love to bring time down. Would higher batch size plus grad accumulation combo = faster iterations (higher batch) + slower updates (less frequent updates)? What is the correct calculation to predict gains?

hollow spruce Aug 11, 2023, 9:31 AM

#

covert pagoda What about gradient accumulation on batch 10 on a 80gig A100 card. Is there any ...

with an A100 you can do cool things.
I'd probably try a dataset of 50~100 images, and fitting the whole thing into 1 single checkpoint
(so like batch 50 with GA of 2, for exactly 100 images) <- will take a while to get all the settings right, but then you can pull off really cool things

also, you can train at 1536x1536 <- captures more detail, but obviously your images need to be high quality enough, to not accidentally train image noise found in low to mid quality jpgs
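
The batch/GA arithmetic behind that suggestion, as a small sketch. Both function names are made up for illustration, and rounding up the last partial update is an assumption.

```python
import math

def effective_batch(batch_size, grad_accum_steps):
    """Images contributing to each weight update (hypothetical helper)."""
    return batch_size * grad_accum_steps

def updates_per_epoch(images, batch_size, grad_accum_steps, repeats=1):
    """Weight updates per epoch, assuming a partial last update is rounded up."""
    return math.ceil(images * repeats / effective_batch(batch_size, grad_accum_steps))

# Batch 50 with GA 2 on exactly 100 images: one update sees the whole dataset
print(effective_batch(50, 2), updates_per_epoch(100, 50, 2))

# Batch 1 with GA 5 on 50 images: effective batch 5, ten updates per epoch
print(effective_batch(1, 5), updates_per_epoch(50, 1, 5))
```

Fewer updates per epoch means each update is smoother but there are fewer of them, which is one reason the learning rate usually has to be retuned when these numbers change.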

versed crescent Aug 11, 2023, 9:35 AM

#

How does a model like SDXL cope with mixed resolutions? Could a LoRA encode higher resolution information than the base model?

covert pagoda Aug 11, 2023, 9:57 AM

#

hollow spruce with an A100 you can do cool things. I'd probably try a dataset of 50~100 images...

That sounds like a very cool experiment. Is that a Dreambooth Lora type training or we talking a full fine tune checkpoint model training?

hollow spruce Aug 11, 2023, 10:09 AM

#

still a lora.
also, you can do finetune style loras with that gpu (training on 30k images)
which brings similar results to full finetune on SD1.5
^ that's what I'm currently doing, where my rtx4090 can barely keep up, by running for 20~30 hours

covert pagoda Aug 11, 2023, 10:15 AM

#

hollow spruce still a lora. also, you can do finetune style loras with that gpu (training on 3...

awesome. will do a test now

restive bridge Aug 11, 2023, 10:53 AM

#

why would extending epochs make likeness drift further and further 🤔 i would expect it to start overfitting but it's more like it starts un-learning entirely.

signal warren Aug 11, 2023, 11:04 AM

#

hollow spruce still a lora. also, you can do finetune style loras with that gpu (training on 3...

Does this mean more images take more Vram? Can 30k images be done on 12gb vram?

stone garden Aug 11, 2023, 11:23 AM

#

fervent sandal I get a crash with python - the Tk extension is not present. The other factor is...

You need to install it using the system package manager but it isn’t needed to train. What OS and version?

stone garden Aug 11, 2023, 11:25 AM

#

hollow spruce with an A100 you can do cool things. I'd probably try a dataset of 50~100 images...

You can use gradient_accumulation_steps for the same effect on lower end GPUs, it has exactly the same effect, just slower.

versed crescent Aug 11, 2023, 11:32 AM

#

Ok so to train within 16GB, one can follow Caith's writeup, drop Batch Size down to 1, and raise Gradient accumulate steps up to 5 under Advanced Config? Would there be any mileage in changing the Optimiser too? It's currently set to AdamW and am I right in thinking that AdamW does nothing fancy in terms of memory or lower bit quantisation ?

stone garden Aug 11, 2023, 11:53 AM

#

versed crescent Ok so to train within 16GB, one can follow Caith's writeup, drop *Batch Size* do...

A higher gradient_accumulation_steps can usually get away with a higher learning rate, but not always.

sonic narwhal Aug 11, 2023, 12:40 PM

#

Anyone has a dataset of regularisation images for faces that they can share?

sonic narwhal Aug 11, 2023, 1:17 PM

#

Why does Caith not use bf 16 for both mixed and saved precision?

fervent sandal Aug 11, 2023, 1:38 PM

#

stone garden You need to install it using the system package manager but it isn’t needed to t...

I am using an Amazon "Deep Learning" AMI. I can see the system installed Python, but they appear to have shipped a custom distribution in /opt/tensorflow, thus my installation of python3-tk did not help. Given Tk is a GUI framework I'm not entirely sure how this works across a web interface either, but I cannot get that far. FWIW A1111 and ComfyUI both work fine but lack lora training options.

stone garden Aug 11, 2023, 1:39 PM

#

fervent sandal I am using an Amazon "Deep Learning" AMI. I can see the system installed Python,...

Which python version are you using? Try `apt install python3.10-tk` (adjust the version)

#

(For me, system Python was 3.10 but python3-tk installed a version for 3.8 for some reason.)
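
A quick way to check the mismatch described above is to ask the interpreter itself. This is a minimal sketch using only the standard library; run it with the same python the venv points at. The function name `tk_status` is made up for the example.

```python
import importlib.util
import sys


def tk_status():
    """Report which interpreter is running and whether it can see Tk."""
    return {
        "executable": sys.executable,                # the python binary actually in use
        "version": "%d.%d" % sys.version_info[:2],   # e.g. 3.10
        # _tkinter is the compiled extension that the tkinter package wraps
        "has_tkinter": importlib.util.find_spec("_tkinter") is not None,
    }


status = tk_status()
for key, value in status.items():
    print(f"{key}: {value}")
```

If `has_tkinter` is False for the interpreter the venv uses, installing a tk package for a different Python version will not change that.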

fervent sandal Aug 11, 2023, 1:45 PM

#

stone garden Which python version are you using? Try installing `apt install python3.10-tk` (...

Looks like others are having the same issue (https://github.com/bmaltais/kohya_ss/issues/873 and others). Using a python that was not installed by the system package manager is going to cause all sorts of issues to be raised.

GitHub

No module named '_tkinter' · Issue #873 · bmaltais/kohya_ss

Apologies, I don't have a particularly masterful understanding of python environmental settings. I'm on Ubuntu 22.04. I've cloned the kohya_ss repo, entered the newly created directory ...

hollow spruce Aug 11, 2023, 1:48 PM

#

sonic narwhal Why does Caith not use bf 16 for both mixed and saved precision?

to be honest, for that setting I'm just following the tips of those who actively study machine learning. Never questioned it.

#

(also, when I tried it - I saw no difference that I could actually notice, so I just left it on what was recommended to me)

#

might have been a fringe case though - so if you notice a difference, please do share ❤️

stone garden Aug 11, 2023, 1:54 PM

#

fervent sandal Looks like others are having the same issue (https://github.com/bmaltais/kohya_s...

I'm saying something different: that we should install `tk` that matches system Python, because apt didn't install the version matching my installation for some reason.

hollow spruce Aug 11, 2023, 1:59 PM

#

stone garden You can use gradient_accumulation_steps for the same effect on lower end GPUs, i...

yep. but oh god it takes long XD
if my rtx4090 runs would go for 1 week straight - then how long would a 12 or 16gb vram card run for? 🥲

stone garden Aug 11, 2023, 2:01 PM

#

No disagreement there. 🙂 Just saying for those who really want to.

fervent sandal Aug 11, 2023, 2:07 PM

#

stone garden I'm saying something different: That we should install `tk` that matches system ...

So the question I have is, how? Within `kohya_ss` I see:
`lrwxrwxrwx 1 ec2-user ec2-user 30 Aug 7 21:28 /mnt/sd1/kohya_ss/venv/bin/python3.10 -> /opt/tensorflow/bin/python3.10`
Which is clearly a different installation to the one `yum` is installing packages for.

stone garden Aug 11, 2023, 2:09 PM

#

fervent sandal So the question I have is, how? Within `koyha_ss` I see: `lrwxrwxrwx 1 ec2-user ...

I used `apt install python3.10-tk`, yum might (or might not) have something similar.

fervent sandal Aug 11, 2023, 2:11 PM

#

stone garden I used `apt install python3.10-tk`, yum might (or might not) have something simi...

apt will only provide for system-supplied installs. Hence mine installs but has zero effect.

stone garden Aug 11, 2023, 2:12 PM

#

Wait, you're mixing `apt` and `yum` on the same distro? One comes from Fedora/RHEL and the other from Ubuntu. How's that even possible?

fervent sandal Aug 11, 2023, 2:12 PM

#

stone garden Wait, you're mixing `apt` and `yum` on the same distro? One comes from Fedora/RH...

No.
You have `apt`. I have `yum`. I have no `apt`.

stone garden Aug 11, 2023, 2:14 PM

#

fervent sandal No. You have `apt`. I have `yum`. I have no `apt`.

Did you try `yum install python3.10-tk` (instead of `python3-tk`) though? Worst case, you could use a conda environment and build a venv on top, I've done it before.

#

Oh -- forgot to say. You need both system and venv installations for tk to work.

fervent sandal Aug 11, 2023, 2:16 PM

#

stone garden Did you try `yum install python3.10-tk` (instead of `python3-tk`) though? Worst ...

There isn't one. There is a `yum` installed python-2, and python-3.7.
Inside /opt/tensorflow there is also python-3.10, which is clearly being used for everything. When I installed python3-tk from yum it was version 3.7.xxxx.
Trying to figure out how to extend the tensorflow installation is the tricky bit.

stone garden Aug 11, 2023, 2:18 PM

#

fervent sandal There isn't one. There is a `yum` installed python-2, and python-3.7. Inside /op...

Try pip install tk-tools inside the venv.

stone garden Aug 11, 2023, 2:20 PM

#

fervent sandal There isn't one. There is a `yum` installed python-2, and python-3.7. Inside /op...

Do you have the option to use a different AMI? That one seems effed up.

fervent sandal Aug 11, 2023, 2:21 PM

#

stone garden Do you have the option to use a different AMI? That one seems effed up.

Well this is the one Amazon have optimised for GPUs with python and the frameworks. But I'm guessing as it is a headless environment they won't have bothered with a GUI library like Tk. Docs: https://docs.aws.amazon.com/dlami/latest/devguide/what-is-dlami.html

What Is the AWS Deep Learning AMI? - Deep Learning AMI

Launch a AWS Deep Learning AMI (DLAMI) for deep learning with NVIDIA CUDA and NVIDIA cuDNN as well as the latest releases of the most popular deep learning frameworks.

stone garden Aug 11, 2023, 2:23 PM

#

fervent sandal Well this is the one Amazon have optimised for GPUs with python and the framewor...

I've usually had success just launching vanilla Ubuntu and installing CUDA toolkit inside. Optimization doesn't mean 💩 when you can't get it to work.

covert pagoda Aug 11, 2023, 2:41 PM

#

does anybody know if it's possible to resume training on kohya lora even if i didn't tick the save training state on the last run?

#

also, what is this path for, "LoRA network weights"?

digital dew Aug 11, 2023, 2:43 PM

#

Hello, I've been wondering if I could replicate something like this for other architectural design styles? How should I go about making a dataset that would help me train a lora such as this one? https://civitai.com/models/25384/xsarchitectural-8japanwabisabi

XSarchitectural-8japanwabisabi - xsarchitectural-8japanwabisabi.saf...

Japanese wabi Sabi style Test phase It is necessary to cooperate with ckpt and VAE If you have any questions, please contact me at xsarchitecturals...

hollow spruce Aug 11, 2023, 2:55 PM

#

digital dew Hello, I've been wondering if I could replicate something like this for other ar...

the proper way?
get enough dataset images with a high enough quality (100~500)

the easy way?
Use that lora (or any other lora) + highres fix to generate around 100 images. Then mess with sdxl until you also get about 100 good enough images.
then train on those 200 images. (this is called synthetic training)

the long way?
find whatever high quality images you can on the internet
then use those to train a v1 of your lora.
Then generate 1000 images using that lora, pick your 100~200 favorites.
Combine those with the original dataset -> then train the mix of original + synthetic images to make a really good v2 of your lora.

digital dew Aug 11, 2023, 3:01 PM

#

When you say high quality, you mean high resolution?

#

I'm thinking proper captioning is also critical? What should I be thinking about when I caption images?

hollow spruce Aug 11, 2023, 3:09 PM

#

digital dew When you say high quality, you mean high resolution?

high quality = 1216x832px or higher (with no jpg noise or typical phone-quality compression)

hollow spruce Aug 11, 2023, 3:10 PM

#

digital dew I'm thinking proper captioning is also critical? What should I be thinking about...

Once you have the dataset - you could try running it without captions, using only the class token of "room photography"

#

if that doesn't work - then you should worry about captioning

digital dew Aug 11, 2023, 3:19 PM

#

hollow spruce Once you have the dataset - you could try running it without captions, using onl...

So the .txt files should only contain "room photography"?

hollow spruce Aug 11, 2023, 3:21 PM

#

digital dew So the .txt files should only contain "room photography"?

only folder needs this name: 1_room photography

#

that gets loaded into kohya automatically
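
The folder naming convention used here (and with `40_lora` earlier in the channel) is `<repeats>_<class tokens>`. A small sketch of how such a name splits apart; the helper `parse_dataset_folder` is hypothetical and only mirrors the convention as described in this chat, it is not kohya's own parser.

```python
def parse_dataset_folder(name):
    """Split '<repeats>_<class tokens>' into (repeats, class_tokens)."""
    prefix, sep, tokens = name.partition("_")
    if not sep or not prefix.isdigit():
        raise ValueError(f"expected '<repeats>_<name>', got {name!r}")
    return int(prefix), tokens


print(parse_dataset_folder("1_room photography"))  # (1, 'room photography')
print(parse_dataset_folder("40_lora"))             # (40, 'lora')
```

So `1_room photography` means 1 repeat per image with "room photography" as the class token, and no .txt caption files are involved.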

digital dew Aug 11, 2023, 3:23 PM

#

Ah, so I can train with the images alone? I was thinking captions are also required.

#

Sorry, I'm quite the noob at this. 😅

hollow spruce Aug 11, 2023, 3:46 PM

#

digital dew Sorry, I'm quite the noob at this. 😅

no problem ❤️ we were all there once XD

hollow spruce Aug 11, 2023, 3:48 PM

#

digital dew Ah, so I can train with the images alone? I was thinking captions are also requi...

nope. while captions help with specific goals in mind, they aren't always needed. especially for your kind of lora
also actually training it will be relatively quick and easy - so just trying it, before doing the work of adding captions is worth it

warm agate Aug 11, 2023, 4:05 PM

#

@hollow spruce trained my lora

#

testing it

wind compass Aug 11, 2023, 7:17 PM

#

Hey everyone, I want to train an inpainting model on traditional dress. I'm currently scraping the dataset and hoping to use 30k-50k images. Which model should I use? And most importantly, how should I get captions for each image (CLIP, BLIP, or manual) so that training gives good results? Last time I tried training a simple model on SD 2.1 using manual captions on 30k images with plain backgrounds, and the trained model was not good. Can anyone help?

versed crescent Aug 11, 2023, 8:56 PM

#

I wonder if flipping face images horizontally would be good for increasing the size of a small dataset

restive bridge Aug 11, 2023, 10:21 PM

#

versed crescent I wonder if flipping face images horizontally would be good for increasing the s...

no good cuz if the person's face isn't symmetrical it will ruin likeness

latent charm Aug 12, 2023, 5:10 AM

#

@hollow spruce Using your preset, same scheduler, same learning rate, same optimizer. 100 epoch, 200 epoch, 250 epoch, 275 epoch, 300 epoch.

#

latent charm Aug 12, 2023, 5:39 AM

#

I think around 275 is the sweet spot for this type of training. 300 is a little bit overfit.

slim plaza Aug 12, 2023, 7:15 AM

#

Hello, how is the stable-diffusion-inpainting model trained? And why do we need a separate checkpoint if an SDEdit-based method (which does not require training) can be used?

hollow spruce Aug 12, 2023, 7:29 AM

#

latent charm

oh wow O: you really let that run for a while

#

good results! ❤️

latent charm Aug 12, 2023, 7:32 AM

#

hollow spruce oh wow O: you really let that run for a while

Around 4 hours on a 3090. It could be the baseline. I will test prodigy and see if it speeds up the process.

hollow spruce Aug 12, 2023, 7:34 AM

#

latent charm Around 4 hours on 3090. It could be the baseline. I would test prodigy and see i...

keep in mind, prodigy changes the learning rate - so before that, it might be worth trying by just raising the learning rate yourself

latent charm Aug 12, 2023, 7:35 AM

#

Loss was around 0.15-0.13 within a few epochs. After that it was mostly around 0.13-0.1. At the very end, it became 0.11-0.08.

latent charm Aug 12, 2023, 7:43 AM

#

hollow spruce keep in mind, prodigy changes the learning rate - so before that, it might be wo...

I heard prodigy has an issue where it only starts to learn very late.

hollow spruce Aug 12, 2023, 7:46 AM

#

latent charm I heard prodigy has an issue which always started to learn in very late.

? it shouldn't.
basically put all learning rates to "1", and make sure warmup is around 10% (or lower for a more aggressive start)

latent charm Aug 12, 2023, 7:47 AM

#

Thanks for the advice

latent charm Aug 12, 2023, 8:03 AM

#

hollow spruce ? it shouldn't. basically put all learning rates to "1", and make sure warmup is...

https://github.com/konstmish/prodigy/issues/3 Refer to this issue. Seems prodigy has an issue with cosine restarts

GitHub

Question · Issue #3 · konstmish/prodigy

I am noticing a trend that has happened to my friend, and I, as we use CosinewithAnnealingLR and Prodigy, but it doesn't begin to really learn until it is almost done. I am wondering what is go...

hollow spruce Aug 12, 2023, 8:09 AM

#

latent charm https://github.com/konstmish/prodigy/issues/3 refer to this issue. Seems prodigy...

I see it. but it seems they already have an official fix for it

latent charm Aug 12, 2023, 8:10 AM

#

Ok. 👌

hollow spruce Aug 12, 2023, 8:10 AM

#

but that is specifically for annealing

#

@latent charm just did a quick check - but the preset for prodigy seems to have everything set correctly - including the special optimizer settings, so probably use that if you really want to use prodigy

#

(it was updated a few days ago - so make sure your install is up to date)

latent charm Aug 12, 2023, 8:14 AM

#

Just updated yesterday. It should be fine.

versed crescent Aug 12, 2023, 9:30 AM

#

latent charm loss around 0.15-0.13 in a few epoch. After that mostly around 0.13-0.1. In very...

Are those loss changes broadly what anybody should see while training?

latent charm Aug 12, 2023, 9:31 AM

#

I'm just sharing what I saw in my training using @hollow spruce's face preset over 300 epochs with 40 images

versed crescent Aug 12, 2023, 9:31 AM

#

I'm in epoch 194 and loss is around 0.12, so it looks similar (but I'm using Caith's original settings with Adafactor)

versed crescent Aug 12, 2023, 9:32 AM

#

latent charm I just share what I saw in my training using @hollow spruce facr preset i...

Ok thanks, I have not done much training so I don't know what to expect 😄

latent charm Aug 12, 2023, 9:33 AM

#

Even though the loss didn't come down, the model still learned.

#

You can see the difference in the conversation above

versed crescent Aug 12, 2023, 10:21 AM

#

@hollow spruce Regarding regularisation images, are there basically two choices in training, captions or regularisation ? I saw your comment somewhere above that the current LoRA tutorials on youtube which use regularisation images have a weaker ability to create detailed backgrounds. Is that the weakness behind the regularisation approach?

hollow spruce Aug 12, 2023, 10:22 AM

#

versed crescent @hollow spruce Regarding regularisation images, are there basically two c...

ah no, not at all. reg images are amazing.
the youtube tutorial uses dim setting of 128 or 256
also clip training can be destructive if done incorrectly

#

but reg images are always good for face training - it just changes your learning rate & the length of your training to some extent

#

but quality should be better - at the expense of training longer

versed crescent Aug 12, 2023, 10:33 AM

#

Aaah ok I misunderstood

#

I'm trying your above approach, but I had to reduce the batch size to 1 to fit my 16GB card. I've also changed to Adafactor for the optimiser. I think it's working, but my training images aren't as varied as yours, so many of my captions are the same, so I'm unsure of how successful it will be. All part of learning 😄

hollow spruce Aug 12, 2023, 10:59 AM

#

versed crescent I'm trying your above approach, but I had to reduce the batch size to 1 to fit m...

as long as they're not all copies of one another, you should be good ^^

versed crescent Aug 12, 2023, 11:06 AM

#

Yeah but I don't cosplay enough, clearly. Need some more variety in my life 😄

versed crescent Aug 12, 2023, 11:28 AM

#

Interesting. The sample prompt I'm using for every 5 epochs has created an image that's much more like the training set, rather than a slowly morphing face. I wonder if this is the start of overfitting

lethal oracle Aug 12, 2023, 12:04 PM

#

i want to make ai images from the face of my friend. which method should i use: textual-inversion, lora, dreambooth, kohya_ss?

stone garden Aug 12, 2023, 12:06 PM

#

hollow spruce as long as they're not all copies of one another, you should be good ^^

What kind of regularisation images would you recommend for a face/anatomy LoRA? Just other random faces? What about captions for reg images?

hollow spruce Aug 12, 2023, 12:11 PM

#

@urban halo

stiff dust Aug 12, 2023, 12:22 PM

#

lethal oracle i want to make ai images from face of my friend which method should i use textua...

kohya_ss lora with unet only works best in my opinion

hardy storm Aug 12, 2023, 3:09 PM

#

I have a question about the concept of using a celebrity's name for training instead of the generic "ukj", etc...
I recently tried doing this with a friend of mine. I used "Tom Hanks" b/c he kind of, sort of, a little bit resembles him. I trained both a Checkpoint and a Lora. It worked pretty well for the Checkpoint ... however, with the Lora (and I assumed this would happen), when I use the Lora with a different checkpoint (say, Photon), it creates a person who looks like a radiated blend of my friend and the actual Tom Hanks.
Am I correct in assuming that this concept does not work for Lora's?

covert pagoda Aug 12, 2023, 3:20 PM

#

small hiccup with training, i am having trouble resuming from the last saved State. The error message is that the Kohya script could not locate the State folder.... any ideas what could be causing this minor error? I am using the / in the path.

covert pagoda Aug 12, 2023, 3:57 PM

#

sour eagle Aug 12, 2023, 4:39 PM

#

is it possible to merge checkpoints in kohya or is a1111 the only way?

latent charm Aug 12, 2023, 6:22 PM

#

@hollow spruce After 5 hours with 300 epochs, it didn't learn enough.

#

Always bouncing between 1.0~1.4, sometimes hitting 1.6~1.7

hollow spruce Aug 12, 2023, 6:28 PM

#

latent charm @hollow spruce After 5 hours with 300 epoch, didn't learn enough.

cosine forever decreases - which means you can leave it running forever - but at the same time you won't be seeing significant improvement after a while.
You'd need either cosine with restarts or cosine with annealing restarts to make that work
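
To make the difference concrete, here is a standalone sketch of the two schedule shapes: plain cosine falls once over the whole run, while cosine with hard restarts jumps back to the starting rate at the beginning of each cycle. The function names and the example numbers are assumptions for illustration; kohya's own scheduler code may differ in details such as warmup.

```python
import math


def cosine_lr(step, total_steps, base_lr):
    """Plain cosine: falls from base_lr to 0 once over the whole run."""
    progress = step / total_steps
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


def cosine_with_restarts_lr(step, total_steps, base_lr, num_cycles):
    """Cosine with hard restarts: the same curve repeated num_cycles times."""
    progress = step / total_steps
    if progress >= 1.0:
        return 0.0
    cycle_progress = (num_cycles * progress) % 1.0
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * cycle_progress))


total = 100
for step in (0, 25, 49, 50, 75, 99):
    plain = cosine_lr(step, total, 1e-3)
    restarts = cosine_with_restarts_lr(step, total, 1e-3, num_cycles=2)
    print(step, f"{plain:.6f}", f"{restarts:.6f}")
```

With plain cosine the late part of a long run happens at a very small learning rate, which matches the "no significant improvement after a while" observation.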

#

but really really odd :/

#

cause it should learn everything, even with standard cosine

#

(even if not as aggressive)

latent charm Aug 12, 2023, 6:31 PM

#

300 epoch. Left is cosine prodigy. Right is constant with warmup adam.

#

275 epoch.

#

100 epoch

#

prodigy with cosine didn't learn much at all.

#

No lora

#

It might be because I didn't set the Optimizer extra arguments "decouple=True weight_decay=0.5 betas=0.9,0.99 use_bias_correction=False"

livid rapids Aug 12, 2023, 7:32 PM

#

I'm just getting started training loras on 1.5, using kohya_ss. I looked around for guides but they all seem to be "just copy my example for this 16 image sample of a character! quick and easy!". Is there a more in depth guide that explains all the options/parameters and what they actually do? Like, what should I be thinking about when setting a learning rate? What are the 20 different optimizers and why should I use one over the other? etc.

Also training for a concept rather than character with around 200 images, so the endless tutorials with 16-20 images probably use settings that I should change. I just don't have the info on what and why to change, though.

crimson tundra Aug 12, 2023, 8:27 PM

#

Is there a way to change a body part shape and size using inpaint?

hollow spruce Aug 12, 2023, 9:31 PM

#

livid rapids I'm just getting started training loras on 1.5, using kohya_ss. I looked around ...

for a bit of a more complex explanation:
https://rentry.org/59xed3
^ goes into a fair amount of detail

https://hoshikat.hatenablog.com/entry/2023/05/26/223229#Rate-of-caption-dropout
^ use bing translate on that site, it explains all the settings you can find in kohya, without too much complexity

hollow spruce Aug 12, 2023, 9:33 PM

#

latent charm Might be it is due to I didn't set the Optimizer extra arguments "decouple=True ...

prodigy needs the special parameters to work correctly
https://rentry.org/59xed3#prodigy
^ this entry goes into a bit more detail on prodigy - and what the special parameters do and why they're important, if you're up for a bit of reading

livid rapids Aug 12, 2023, 10:10 PM

#

hollow spruce for a bit of a more complex explanation: https://rentry.org/59xed3 ^ goes into a...

Thank you

pliant drift Aug 12, 2023, 10:56 PM

#

latent charm https://github.com/konstmish/prodigy/issues/3 refer to this issue. Seems prodigy...

yeh. while i've been getting good results with prodigy, i've noticed it could be better when i look at lr graphs. I found this https://github.com/kohya-ss/sd-scripts/pull/271

GitHub

Add '--lr_scheduler_type' and '--lr_scheduler_args' argument by Iso...

Add argument --lr_scheduler_type and --lr_scheduler_args to use lr_scheduler from another library

For example, to use torch.optim.lr_scheduler.CosineAnnealingLR with T_max=100 as lr_scheduler, we ...

pliant drift Aug 12, 2023, 10:56 PM

#

hollow spruce prodigy needs the special parameters to work correctly https://rentry.org/59xed3...

of course caith already on it too

#

these options just showed up in the gui recently. we can add cosine annealing now

#

hmm nevermind maybe not. the annealing is separate and can't be used. i'll have to craft a command

#

seems like a pita. i'm going to go back to adamw

livid rapids Aug 12, 2023, 11:46 PM

#

  num train images * repeats / 学習画像の数×繰り返し回数: 2770
  num reg images / 正則化画像の数: 0
  num batches per epoch / 1epochのバッチ数: 1385
  num epochs / epoch数: 4
  batch size per device / バッチサイズ: 2
  gradient accumulation steps / 勾配を合計するステップ数 = 1
  total optimization steps / 学習ステップ数: 5540

It's estimating 4 hours at 2.8 it/s for this lora. 270 images at 10 repeats each, 4 epochs, batch size of 2. I'm running on a 12gb VRAM RTX 3060. Are these normal speeds?
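
The numbers in the log above can be checked by hand: batches per epoch is images × repeats divided by batch size (rounded up), and total steps is that times the number of epochs. The sketch below redoes that arithmetic and then converts the step count to hours under both readings of "2.8", since the 4-hour estimate only fits one of them. The function name is made up for the example.

```python
import math


def total_optimization_steps(images_times_repeats, batch_size, epochs):
    """Batches per epoch (rounded up) times the number of epochs."""
    batches_per_epoch = math.ceil(images_times_repeats / batch_size)
    return batches_per_epoch * epochs


steps = total_optimization_steps(2770, batch_size=2, epochs=4)

# The same step count under the two possible readings of "2.8".
hours_if_steps_per_second = steps / 2.8 / 3600
hours_if_seconds_per_step = steps * 2.8 / 3600

print(steps)
print(round(hours_if_steps_per_second, 2), round(hours_if_seconds_per_step, 2))
```

This gives 5540 steps, matching the log. At 2.8 steps per second that would be roughly half an hour; at 2.8 seconds per step it is about 4.3 hours, so the 4-hour estimate looks like s/it rather than it/s.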

restive bridge Aug 13, 2023, 12:19 AM

#

dim 24 vs. dim 256, from aitrepreneurs new video

stone garden Aug 13, 2023, 4:57 AM

#

restive bridge dim 24 vs. dim 256, from aitrepreneurs new video

You need a huge dataset and probably even reg images and captions for dim = 256.

tribal frigate Aug 13, 2023, 7:55 AM

#

Guys, do you have tips for tutorials on how to train your own embeddings for SDXL using google colab?

covert pagoda Aug 13, 2023, 8:12 AM

#

covert pagoda small hickup with training, i am having trouble resuming from last saved State. ...

Well looks like resume was broken for Lycoris training but the latest Lycoris dev module is now working for resuming fyi "pip install lycoris_lora==1.9.0.dev7"

#

If anyone interested

tribal frigate Aug 13, 2023, 8:15 AM

#

Also... is LORA tied to a specific model or is it like an embedding that you can use on top of a model?

covert pagoda Aug 13, 2023, 9:11 AM

#

pliant drift hmm nevermind maybe not. the annealing is seperate and can't be used. i'll have ...

Just tried cosine annealing with T_max equal to max steps. Not good results. Vanilla prodigy was better

stone garden Aug 13, 2023, 9:43 AM

#

tribal frigate Also... is LORA tied to a specific model or is it like an embedding that you can...

LoRAs can be biased towards a model, but if they are created correctly they will work well with most models.

open merlin Aug 13, 2023, 10:16 AM

#

How can I make the learning rate decay over the repeats when using cosine with restarts? Every repeat returns to the initial learning rate though it would be better if every repeat is a little less than the one before.
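
The schedule being asked about can be pictured as cosine with restarts where each cycle's peak is scaled down by a fixed factor. This is a standalone sketch of that shape only; the function name and the `decay=0.8` factor are assumptions, and whether the kohya GUI offers such an option is not established in this thread.

```python
import math


def decaying_restarts_lr(step, total_steps, base_lr, num_cycles, decay=0.8):
    """Cosine with restarts where each cycle peaks at decay times the previous peak."""
    steps_per_cycle = total_steps // num_cycles
    cycle_index = min(step // steps_per_cycle, num_cycles - 1)
    cycle_progress = min((step - cycle_index * steps_per_cycle) / steps_per_cycle, 1.0)
    peak = base_lr * decay ** cycle_index
    return peak * 0.5 * (1.0 + math.cos(math.pi * cycle_progress))


for step in (0, 50, 100, 150, 200, 250, 300):
    print(step, f"{decaying_restarts_lr(step, 300, 1e-3, num_cycles=3):.6f}")
```

With three cycles over 300 steps, the peaks at steps 0, 100 and 200 come out as 1e-3, 8e-4 and 6.4e-4.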

hollow spruce Aug 13, 2023, 10:24 AM

#

covert pagoda Just tried cosine annealing with T_max = to max steps. Not good results. Vanilla...

unless I misunderstood it - constant should get you close to perfect results using prodigy?
since it already auto adapts learning rate. no need to double adapt it, right?

latent charm Aug 13, 2023, 10:47 AM

#

@hollow spruce Tried prodigy with the extra parameters suggested at https://rentry.org/59xed3. It is better than the default parameters in the gui preset.

THE OTHER LoRA TRAINING RENTRY

Stable Diffusion LoRA training science and notes
By yours truly, The Other LoRA Rentry Guy.
This is not a how to install guide, it is a guide about how to improve your results, describe what options do, and hints on how to train characters using bad or few images.
Due to the higher prevalence of...

#

Prodigy 300 epoch cosine prodigy
decouple=True weight_decay=0.5 betas=0.9,0.99 use_bias_correction=False
Epoch: 100, 200, 250, 275, 300

#

Prodigy 300 epoch constant prodigy
decouple=True weight_decay=0.01 d_coef=2 use_bias_correction=True safeguard_warmup=False betas=0.9,0.99
Epoch: 100, 200, 250, 275, 300
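
The two optimizer argument strings above differ in more than the scheduler. A small sketch that parses both strings into dictionaries and lists the keys that differ; the parser is a hypothetical helper written for this comparison, not the way kohya reads the field.

```python
import ast


def parse_optimizer_args(text):
    """Turn 'key=value key=value' into a dict of Python values."""
    kwargs = {}
    for item in text.split():
        key, _, raw = item.partition("=")
        try:
            kwargs[key] = ast.literal_eval(raw)   # True, 0.5, (0.9, 0.99)
        except (ValueError, SyntaxError):
            kwargs[key] = raw                     # leave anything else as a string
    return kwargs


cosine_run = parse_optimizer_args(
    "decouple=True weight_decay=0.5 betas=0.9,0.99 use_bias_correction=False"
)
constant_run = parse_optimizer_args(
    "decouple=True weight_decay=0.01 d_coef=2 use_bias_correction=True "
    "safeguard_warmup=False betas=0.9,0.99"
)

# Keys whose value differs between the runs, or that only one run sets.
differing = sorted(
    key for key in set(cosine_run) | set(constant_run)
    if cosine_run.get(key) != constant_run.get(key)
)
for key in differing:
    print(key, cosine_run.get(key), constant_run.get(key))
```

Besides the scheduler, the runs differ in `weight_decay`, `use_bias_correction`, and in `d_coef` and `safeguard_warmup` being set only in the constant run, so the comparison is not scheduler-only.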

#

But overall, I still think constant adam produces the best result over 300 epochs.

covert pagoda Aug 13, 2023, 10:51 AM

#

hollow spruce unless I misunderstood it - constant should get you close to perfect results usi...

Yea this seems to be the case. Playing with the opt args only makes the LR go wild so far

covert pagoda Aug 13, 2023, 10:51 AM

#

latent charm But overall, I still think constant adam produce the best result over 300 epoche...

What was the difference in training time? Any gains?

latent charm Aug 13, 2023, 10:53 AM

#

covert pagoda What was the difference in training time? Any gains?

constant prodigy seems to be 3.x hours and constant adam seems to take longer, 4.x hours

hollow spruce Aug 13, 2023, 11:03 AM

#

latent charm But overall, I still think constant adam produce the best result over 300 epoche...

cause we're running adam at the lowest rate. adam with all learning rates at 0.001 should give similar results, at much much faster times

covert pagoda Aug 13, 2023, 11:04 AM

#

hollow spruce cause we're running adam at the lowest rate. adam with all learning rates at 0.0...

Oh really. I'll give that try

#

Worth turning off TE in arg?

covert pagoda Aug 13, 2023, 11:05 AM

#

latent charm constant prodigy seems to be 3.x hours and constant adam seems to take longer, 4...

How did your loss graphs compare for your two prodigy runs? Can you paste here?

hollow spruce Aug 13, 2023, 11:06 AM

#

1e-3 is kinda the best 'value' of learning rates that I've found. best performance to training time ratio
5e-4 is the highest quality I achieved, but yeah, for some subjects it needs to run til like epoch 250~350. :/

my next test will be to do a small full finetune, and see if training speeds change on that finetuned sdxl model - where only the faces are easier to train

#

not sure if that will work out, but we'll see

covert pagoda Aug 13, 2023, 11:07 AM

#

hollow spruce 1e-3 is kinda the best 'value' of learning rates that I've found. best performan...

Sorry, how many repeats on this range 250-350?

hollow spruce Aug 13, 2023, 11:08 AM

#

covert pagoda Sorry, how many repeats on this range 250-350?

everything with repeat 1 - for simplicity's sake

covert pagoda Aug 13, 2023, 11:08 AM

#

Batch 6?

latent charm Aug 13, 2023, 11:08 AM

#

I used batch 5

hollow spruce Aug 13, 2023, 11:09 AM

#

repeats are only relevant if you do unevenly weighted datasets - or are training a smaller amount of images on a specific face, and need a lot of regularization images to run with it

#

but for all other situations, repeat 1 is always the go to method, since it gives you more control over everything

latent charm Aug 13, 2023, 11:11 AM

#

constant adam

covert pagoda Aug 13, 2023, 11:11 AM

#

here are mine. prodigy constant, then with annealing: `--network_train_unet_only --lr_scheduler_type "CosineAnnealingLR" --lr_scheduler_args "T_max=25"` and `weight_decay=0.5 d_coef=2 use_bias_correction=True`

#

a little higher on the weight decay

#

constant was good, green. the adapt scheduler went a bit bonkers and didn't learn. Probably because my weight decay was too high
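
For reference, PyTorch documents `CosineAnnealingLR` with the closed form `eta_min + 0.5 * (base_lr - eta_min) * (1 + cos(pi * t / T_max))` (the parameter is spelled `T_max`). The sketch below just evaluates that formula in plain Python with `T_max=25` and a base rate of 1, as used with prodigy in this thread. How often sd-scripts steps the scheduler is not stated here, so `t` is simply "scheduler steps".

```python
import math


def cosine_annealing_lr(t, base_lr, t_max, eta_min=0.0):
    """Closed-form cosine annealing value after t scheduler steps."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + math.cos(math.pi * t / t_max))


for t in (0, 12, 25, 37, 50, 75, 100):
    print(t, round(cosine_annealing_lr(t, base_lr=1.0, t_max=25), 4))
```

Following the formula, the rate reaches its minimum at t = 25 and is back at the full base rate by t = 50, so a small `T_max` on a long run makes the learning rate swing up and down repeatedly rather than decay once.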

covert pagoda Aug 13, 2023, 11:13 AM

#

hollow spruce but for all other situations, repeat 1 is always the go to method, since it give...

ok, i didnt know that.

#

but in my case, i am working on a character. 23 images at the moment (might go up to 50)

#

10 faces, 10 midshots, 3 full bodies.

latent charm Aug 13, 2023, 11:14 AM

#

covert pagoda How did your loss graphs compare for your two prodigy years? Can you paste here?

constant prodigy (should be this one)

covert pagoda Aug 13, 2023, 11:15 AM

#

and i use reg with photographs, to imbibe photo style... prodigy constant

#

dataset:

covert pagoda Aug 13, 2023, 11:16 AM

#

latent charm constant prodigy (should be this one)

@hollow spruce shouldn't we use the max_norm graph mostly to see the loss?

hollow spruce Aug 13, 2023, 11:17 AM

#

covert pagoda ok, i didnt know that.

basically there's no downside to just increasing epochs. but you have the advantage of using settings like 'dropout every n epochs' or cosine with restarts with a lot more control

latent charm Aug 13, 2023, 11:18 AM

#

covert pagoda @hollow spruce shouldnt we use the max_norm graph mostly to see the loss ...

Not familiar with that. How to turn it on?

hollow spruce Aug 13, 2023, 11:18 AM

#

covert pagoda @hollow spruce shouldnt we use the max_norm graph mostly to see the loss ...

I'm the wrong person to ask for that - I work entirely without graphs since I almost always train multiple concepts at the same time - and graphs won't help much, if 3/5 concepts are trained perfectly, 1 does nothing, and 1 just crashes sdxl XD

covert pagoda Aug 13, 2023, 11:19 AM

#

latent charm Not familiar with that. How to turn it on?

it's one of the graphs on wandb. see bottom left

latent charm Aug 13, 2023, 11:19 AM

#

I didn't use wandb on the testing. Although I have the account, a little bit lazy to set it up.

covert pagoda Aug 13, 2023, 11:19 AM

#

hollow spruce I'm the wrong person to ask for that - I work entirely without graphs since I al...

oh. ok. i will try 1 repeat epochs then

#

Can you give an example set of concepts in one of your trainings?

#

I wonder if for instance for the above asian model, concepts could be:

#

a certain hair style

#🔧｜finetune

Lora Training Settings - speed training faces edition

I can tell you that my properly captioned datasets, trained with my settings, have no issues.
but if you:
• Trained with clip on
• Used bad caption practices
• Used too big dim settings
then yeah, the LoRA may have damaged the core sdxl model enough that it's no longer functioning right