#LoRA_Easy_Training_Scripts

normal charm
#

wait nvm, false alarm, it works now. dunno what happened tho

vivid python
normal charm
#

Dyfuckingwhat

vivid python
#

Dylora

vivid python
#

From what I know, it's a way to make low dim lora work better? I haven't thoroughly tested it

quiet notch
#

They state it can train x7 faster than lora...?!

bleak minnow
#

👀

quiet notch
#

without compromising performance

bleak minnow
#

new toy to play with

#

nice

quiet notch
worn locust
worn locust
#

I tried reading the paper but it sounds like dylora is gonna be useless

#

If it was 7x faster that would be epic but it wasn't when derrian tested it

shut siren
#

are the results any different for dylora

#

or is it another dejj like ia3 and lokr

normal charm
#

I do prefer a good speed

vivid python
#

though, I did find that it trains about as fast in terms of iteration speed, just ran out of vram that first time

#

so it didn't count

normal charm
#

So the jury is still out

worn locust
shut siren
#

what kind of settings were tested for dylora?

#

based on the paper, it seems like the purpose of dylora is that you can do inference at different ranks

#

my dylora seems super undertrained for the same settings as locon

vivid python
#

Which version of dylora did you use? Kohaku's is different from kohya's

#

And because of that dylora is not going to be able to be used depending on the mode

#

If kohya's, then you have to use additional networks

quiet notch
quiet notch
#

oh man... so many new implementations of lora training while i was busy

#

well, mainly ia3 and lokr and dylora

#

and then this block weight training thing

#

i haven't even looked into what optimizers to use, besides that adam8 is the "best"

worn locust
worn locust
quiet notch
#

i'm doing that right now cirnoSugoiWow

#

setting up a json, but i won't be able to train until a bit later

worn locust
shut siren
#

i tried kohaku's dylora

shut siren
#

4e-4 unet with 5e-5 text encoder learned like basically nothing

#

at ~900 steps

#

what dims were ppl testing on dylora

quiet notch
#

batch size?

shut siren
#

supposedly the idea is that you can do inference at a diff rank than its trained at based on the paper?

#

im always batch 1

#

basically stochastic lol

quiet notch
#

👌

#

i missed the "inference" part on the paper

#

wait, what do you mean by inference?

shut siren
#

like, generating images

quiet notch
#

oh, that's interesting... i'm reading now that it's adaptive at inference time

#

i was under the impression that training is adaptive in determining rank (dim)

#

hence, was confused why you would want to pick a dim, since dylora would optimize the dim anyways

shut siren
#

from my understanding of the paper, the supposed benefit of dylora is to avoid having to do multiple training runs at different rank

#

to find the optimal rank

#

since you can just select the rank used at inference

#

now, im not sure what the dim settings on dylora do

#

maybe its the maximum rank?

quiet notch
#

it might not be used at all?

#

i'll have to "read" the paper again

shut siren
#

rip neither kohya's nor kohaku's repo has english documentation for how to use kek

#

ok im just gonna run kohya's documentation through deepL lol

#

"Features of DyLoRA in this Repository
After training, DyLoRA model files are compatible with LoRA. LoRAs of multiple dims below a specified dim(rank) can be extracted from the model file."

#

so i think the rank specified for dylora is like the max rank

#

it will simultaneously train for all ranks below that

#

"According to the paper, higher ranks of LoRA are not necessarily better, but it is necessary to find the appropriate rank depending on the model, dataset, task, etc. Using DyLoRA, LoRA is trained simultaneously at various ranks below a specified dim(rank). This saves time in learning and searching for the optimal rank for each."

#

"Also, specify a unit for --network_args, for example --network_args "unit=4", where unit is a unit to divide ranks. For example, --network_dim=16 --network_args "unit=4" where unit is a divisible value of network_dim (network_dim is a multiple of unit)."

#

so you can specify how to divide them

#

based on this i think you can do like dim16 with training also at dim12/8/4 if you set unit=4

#

in kohaku's its called block size iirc

#

"For example, training with dim=16 and unit=4 (see below) will train and extract LoRA for 4, 8, 12, and 16 ranks. By generating images with each of the extracted models and comparing them, the LoRA with the best rank can be selected."

#

basically dylora is to avoid having to retrain multiple times to find the ideal dim size
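
Going by the translated docs quoted above, the ranks you can extract look like the multiples of `unit` up to `network_dim`. A minimal sketch of that arithmetic, assuming that reading is right; `dylora_ranks` is a hypothetical helper for illustration, not a function from kohya's or kohaku's repos:

```python
def dylora_ranks(network_dim: int, unit: int) -> list[int]:
    """List the ranks a DyLoRA run would cover.

    Assumption: the ranks are the multiples of `unit` up to `network_dim`,
    as described in the translated documentation quoted in this thread.
    """
    if network_dim <= 0 or unit <= 0 or network_dim % unit != 0:
        raise ValueError("network_dim must be a positive multiple of unit")
    return list(range(unit, network_dim + 1, unit))


print(dylora_ranks(16, 4))  # [4, 8, 12, 16]
print(dylora_ranks(8, 4))   # [4, 8]
```

So a dim 16 run with unit 4 would give four candidate ranks to compare, while a dim 8 run with the same unit would only give two.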

quiet notch
#

interesting... cirnoThinkHmm

#

i wonder how increasing dim affects training time

#

cause lately i've been trying to train style at 8 dim, and with default unit 4, then dylora would only train 4, 8, which doesn't seem like it would be an improvement

#

i haven't experimented with dim/alpha at all tbh, so i don't know too much about how they affect results

#

but i guess with dylora and extraction, it would be easier to extract lower rank dims and compare them

#

i know there's already comparison grids of dim/alpha, but it's a different kind of learning if you do it yourself with your own dataset that you're familiar with

shut siren
#

not sure why with kohaku's it seems to need either a higher LR or more steps

#

than locon

quiet notch
#

currently training dylora, and the samples per epoch look terrible

shut siren
#

i didnt have any turn out well

quiet notch
#

i'm thinking about trying this dylora with dadaptation

quiet notch
vivid python
#

oof

bleak minnow
#

good

errant wraith
#

any guide on how to use these scripts?

#

or even link a message in a convo of someone explaining it

#

feeling really stupid rn

vivid python
#

you just need to follow the popups

#

once they are installed using the installer

#

you can run them by running the run_popup.bat

#

once loaded it will ask you a bunch of questions sequentially

#

if you know what settings you want, it's pretty quick

#

if not, then it might be a bit confusing

#

I'm working on an overhaul of the UI right now, as in, I'm making a whole UI right now

#

what are you having an issue with in particular?

quiet notch
#

Honestly, using the json file with notepad is my UI and honestly that's all I need.

#

The arglist.py (i think) is a good reference as well, albeit a bit hidden.

marsh basin
#

Hey, could someone help me out with the following or give me tips how I can successfully make a lora out of these images:

#

I know these are quite limited, but I cant figure out how to do this properly. I am getting mixed results with Kohya, would your easy training script help? Like normalizing.

#

Going to check it out rn though

vivid python
#

the easy training scripts also uses kohya on the back end

#

so it's likely you won't get better results if you were getting bad results before

#

that being said, that is far outside of what I normally train, so I can't really help you

quiet notch
#

just some quality of life features i'd like to see implemented.

  • if the output folder doesn't exist, just create it
  • allow the provided json name to be used as the name of the output folder, log prefix, and output name (togglable functionality)
  • maybe have the same functionality with im/reg folder path, but enforce suffixes to keep naming ordering consistent (togglable functionality)
  • let custom schedulers take the "num_warmup_steps" and "num_training_steps" as arguments for kwargs (my custom schedulers are a similar implementation of built-in schedulers)
#

my workflow is currently as follows

  1. generate a template json config script through json
  2. edit the json config. i find myself redundantly editing the output folder, output name, and log prefix to the same name
  3. copy json file to create variants, usually adjusting one hyperparameter, but also changing the output folder, output name, and log prefix
  4. create output folders
  5. run multiple json training
  6. go away for a long time and hope training didn't stop because i made a typo or forget to create a folder or something
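
Steps 2 to 4 of the workflow above could be scripted outside the training scripts. A rough sketch, assuming the config is a flat json object; the key names `output_folder`, `output_name` and `log_prefix` are placeholders for illustration and may not match what the easy training scripts actually expect:

```python
import copy
import json
from pathlib import Path


def write_variants(template: dict, param: str, values: list,
                   config_dir: Path, output_root: Path) -> list[Path]:
    """Write one json config per value of `param` and create its output folder."""
    config_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for value in values:
        name = f"{param}_{value}"
        config = copy.deepcopy(template)
        config[param] = value
        out_dir = output_root / name
        # Hypothetical key names: reuse one name for folder, output and log prefix.
        config["output_folder"] = str(out_dir)
        config["output_name"] = name
        config["log_prefix"] = name
        # Create the output folder up front so a missing folder cannot stop a run.
        out_dir.mkdir(parents=True, exist_ok=True)
        path = config_dir / f"{name}.json"
        path.write_text(json.dumps(config, indent=2), encoding="utf-8")
        written.append(path)
    return written
```

Calling it with a loaded template, a hyperparameter name and a list of values would leave one json file per variant in `config_dir`, ready to be queued.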
#

it's very fiddly but very powerful, i like it

#

it's just... after doing 50+ trainings... it kinda gets to you

#

this is probably super extra and probably not needed on main branch, but if in the json file name, i put something like e12, it would know that this json is meant to be run for 12 epochs, and will run for 12 epochs regardless of what's in the file itself

#

that's probably something i would have to do for my own personal workflow, but just something cool to bring up, i guess

#

other examples could be Ux# to multiply lr_unet by #, or Tx# to multiply lr_textencoder. all separated by spaces
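
As a personal-workflow sketch of the file name idea, something like the following could sit in front of the training call. It assumes space-separated tokens in the file name as described; `apply_filename_overrides` is a hypothetical helper, and `max_train_epochs` is an assumed key name, with `unet_lr` and `text_encoder_lr` assumed to already be present in the config:

```python
import re
from pathlib import Path


def apply_filename_overrides(config: dict, json_path: str) -> dict:
    """Return a copy of config with overrides parsed from the json file name.

    Recognized tokens (space separated): e<N> sets the epoch count,
    Ux<F> multiplies unet_lr, Tx<F> multiplies text_encoder_lr.
    Unrecognized tokens are ignored.
    """
    result = dict(config)
    number = r"(\d+(?:\.\d+)?)"
    for token in Path(json_path).stem.split():
        if m := re.fullmatch(r"e(\d+)", token):
            result["max_train_epochs"] = int(m.group(1))
        elif m := re.fullmatch("Ux" + number, token):
            result["unet_lr"] = config["unet_lr"] * float(m.group(1))
        elif m := re.fullmatch("Tx" + number, token):
            result["text_encoder_lr"] = config["text_encoder_lr"] * float(m.group(1))
    return result
```

With a file named `style e12 Ux2 Tx0.5.json`, this would set 12 epochs, double the unet learning rate and halve the text encoder one, whatever the file itself says.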

errant wraith
vivid python
vivid python
# quiet notch just some quality of life features i'd like to see implemented. - if the output ...

Creating an output folder is doable. I'd have to rewrite some of my json code, but allowing the use of the json name is possible. Not entirely sure what you mean by enforcing suffix. Pretty sure custom schedulers already have that ability in the scripts, but I don't use custom schedulers, nor does anybody else I've talked to, so I didn't feel the need to care about its implementation. Either way, that one would probably be really annoying to account for. To be entirely honest, seems like you just kinda go... way too far per bake?

#

That being said, development time is currently being spent on making a UI

#

Oh, and about the name being used for arguments, that kinda defeats the purpose of the json files in the first place. And it would be a lot of work making a parser for a system very few would use

flat coral
#

pressing enter helps sometimes. it could have just paused on its own. happened a few times for me

vivid python
#

Oh very true, that's unfortunately a quirk of command line

flat coral
#

well somehow my output with easy training seems a bit different from just doing it through powershell. probably just me

#

i do like the easy features and .json

#

a custom ui/gui would be ideal and great

vivid python
vivid python
errant wraith
#

ty

vivid python
#

Yeah, quirk of tkinter. Was hoping to not be using it anymore at this point but that's not how it panned out. Good that it's working for you now though

vivid python
#

It basically is the rate at which a model forgets something, so a higher weight decay can help in reducing bad training early on

#

Or, If too high, it could completely not learn anything

#

Granted, 0.1 is give or take a good spot to be in

#

If you have a really high lr, the weight decay can actually fix some of the issues that come with that

#

Usually
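
To make the "forgetting" picture concrete, here is a toy sketch of a single update with decoupled weight decay. It is a simplified SGD-style step on one number, assumed purely for illustration, and not what any particular optimizer in the scripts does internally:

```python
def step_with_decay(weight: float, grad: float, lr: float, weight_decay: float) -> float:
    """One simplified update: shrink the weight toward zero, then follow the gradient."""
    decayed = weight - lr * weight_decay * weight  # the "forgetting" part
    return decayed - lr * grad                     # the usual gradient step


# With no gradient signal at all, the weight just fades toward zero.
w = 1.0
for _ in range(100):
    w = step_with_decay(w, grad=0.0, lr=0.1, weight_decay=0.1)
print(w)
```

A larger `weight_decay` makes the fade faster, which matches the point above that too high a value can stop the model from holding on to anything.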

flat coral
#

0.1 does sound a bit too much in some cases

worn locust
#

I'm trying to add dadaptation as an option to my colab but I got this error

Setting different lr values in different parameter groups is only supported for values of 0
I have these settings
```toml
[additional_network_arguments]
unet_lr = 1.0
text_encoder_lr = 0.5
network_dim = 16
network_alpha = 16
network_module = "networks.lora"

[optimizer_arguments]
learning_rate = 1.0
lr_scheduler = "constant_with_warmup"
lr_warmup_steps = 35
optimizer_type = "DAdaptation"
optimizer_args = [ "decouple=True", "weight_decay=0.02",]
```

#

I don't know what a parameter group is in this context

#

I think the parameters groups in question are unet and text encoder
But other people can set them differently just fine

#

I had to pip install dadaptation manually, maybe that's why? But what else can I do

vivid python
#

D-adapt can't have separate unet and te

#

It's based on adam not adamw, so it's all on one lr
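
Reading the error message literally, every parameter group has to share one lr, with 0 as the only allowed exception. A small pre-flight check along those lines could catch this before training starts. This is a sketch of that rule as worded in the error, not the dadaptation library's own code, and `lrs_compatible` is a hypothetical helper:

```python
def lrs_compatible(*group_lrs: float) -> bool:
    """True if all non-zero learning rates are identical.

    Assumption: this mirrors the rule stated in the error message,
    where differing values are only tolerated when they are 0.
    """
    nonzero = {lr for lr in group_lrs if lr != 0}
    return len(nonzero) <= 1


print(lrs_compatible(1.0, 0.5))  # False: the settings posted above
print(lrs_compatible(1.0, 1.0))  # True
print(lrs_compatible(1.0, 0.0))  # True
```

Under that reading, setting `unet_lr` and `text_encoder_lr` to the same value should satisfy the check.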

vivid python
worn locust
vivid python
#

that's it

worn locust
#

True

normal charm
#

Should i run the update again

#

Also is there anywhere i can read about the schedulers?

errant wraith
#

so i want to train only the out/up layers, but when i put a weight of 0 for the middle layer, it just asks me to cancel, .1 says not an integer, it only accepted 1

errant wraith
#

oh i should try updating..

merry hearth
#

I have a question. I train lora on colab and the result is very good (probably), but the lora file is only about 10-20 mb (or maybe I trained on too few images). Is there a way anyone can help improve it? When I export about 10 lora files, only 1 seems to be fine

vivid python
#

Ok yep, that was my mistake, I forgot to set its mode to float. I'll fix it soon

merry hearth
# vivid python What exactly do you mean?

i don't know if my lora is really ok. i also tried many versions and the results are very different, sometimes with errors, and my model file seems very small compared to some other lora (the lora which are 80-150 mb in size)

vivid python
#

All of my lora are either 16-ish mb or 30-40mb depending on if I'm using a lora or locon

#

So it's not an issue, just means you are using a smaller dim size

#

It might just be your training parameters causing problems here.

#

Because a small dim size won't break them

merry hearth
#

you use colab to train, no?

vivid python
#

I don't use colab to train, no. I'm the one who makes the easy training scripts

#

@errant wraith I updated the scripts, it should be fine now

normal charm
#

I havent run the update bat in so long

#

And im still scared to

fiery tendon
#

Wouldn't #1092821901430227085 be a better channel for this?

normal charm
#

It was maybe made before that

#

If u mean this entire thread

fiery tendon
#

Like why not move there?

vivid python
# fiery tendon Like why not move there?

this is among the oldest posts probably, the "guides and resources" section didn't exist when this was created. I saw no reason to move over as usually there isn't much talking that happens here, that being said once the UI is done I'll probably create a new thread over there

magic magnet
#

Discord doesn't have a 'move thread to another forum' functionality sadly

#

so yeah right now moving this thread would entail shenanigans and cause more confusion probably

vivid python
#

alright, I'm gonna be making a new thread over in guides and resources. because I just finished a complete rewrite

#

I added a UI

#

#1110094921316171816