#🔧｜finetune | Stable Diffusion | Page 13

stiff dust Apr 22, 2023, 12:19 PM

#

if you show it images of your car you narrow this view and the model will no longer be able to create cars that don't look like yours.

tall condor Apr 22, 2023, 12:20 PM

#

so what i need to do is to specify my car better and more clearly, right?

stiff dust Apr 22, 2023, 12:20 PM

#

using something like "sks car" you basically tell the model that this is just one specific car

#

and the model will still be able to draw, let's say, comic images of your car

tall condor Apr 22, 2023, 12:20 PM

#

but in theory if i leave the instance prompt empty i can retrain car like that, right?

#

what is confusing me tho is the link between instance and model. lets say my instance prompt is xyz and what i want to train is "red car", so i build the advanced captions like "xyz red car driving on the street" and "image of xyz red car on the motorway"

#

and if i dont use xyz in the advanced captions will the model still be able to match my car?

stiff dust Apr 22, 2023, 12:23 PM

#

no, but that's a good thing

#

you don't want the model to forget what a usual car is

tall condor Apr 22, 2023, 12:24 PM

#

where does my own instance stop? if i have xyz red car and xyz headlights of car on the motorway, how does the system know my model is xyz red car and not xyz headlights or just xyz red

stiff dust Apr 22, 2023, 12:25 PM

#

it's magic 🤷‍♂️
very likely it will associate the words xyz and car together in a sentence

tall condor Apr 22, 2023, 12:26 PM

#

yea i figured there is a lot of magic here 🙂

stiff dust Apr 22, 2023, 12:26 PM

#

but that depends a bit on your training captions. If you always write "xyz car" then it will more likely get stuck to that

#

that's why I prefer to randomize training captions as much as possible

tall condor Apr 22, 2023, 12:27 PM

#

so in reality if i want to create my xyz car model i need to have a base model that really is the type of car i want as xyz car and then specialize it with advanced captions but always pointing back to xyz car?

#

there is a checkbox in kohya_ss "randomize captions" maybe i try that

#

when you say randomize you mean like really randomize or just redefining so that it still makes sense

stiff dust Apr 22, 2023, 12:28 PM

#

it will probably split the sentence by "," and shuffle

tall condor Apr 22, 2023, 12:28 PM

#

what is the "," doing when it is tokenized

stiff dust Apr 22, 2023, 12:28 PM

#

by randomize I mean avoiding fixed patterns

tall condor Apr 22, 2023, 12:28 PM

#

is it like a tag separator?

stiff dust Apr 22, 2023, 12:29 PM

#

like not caption everything with "xyz car, from front, on street". Also sometimes use "front view of car xyz on street" and so on
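
A minimal sketch of the "split by comma and shuffle" idea mentioned above. This is not kohya_ss code; the function name `shuffle_caption` and the `keep_first` parameter are made up for illustration, and the assumption is that the first tag carries the instance name and should stay in place.

```python
import random
from typing import Optional


def shuffle_caption(caption: str, keep_first: int = 1,
                    rng: Optional[random.Random] = None) -> str:
    """Split a comma-separated caption into tags and shuffle all but the first few."""
    rng = rng or random.Random()
    # Break the caption into trimmed tags, dropping empty pieces.
    tags = [t.strip() for t in caption.split(",") if t.strip()]
    # Keep the leading tags (assumed to hold the instance name) fixed.
    head, tail = tags[:keep_first], tags[keep_first:]
    rng.shuffle(tail)
    return ", ".join(head + tail)


if __name__ == "__main__":
    print(shuffle_caption("xyz car, from front, on street, daytime"))
```

Shuffling only reorders tags. Rewording the caption ("front view of car xyz on street") still has to be done by hand or with a separate set of templates.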

stiff dust Apr 22, 2023, 12:30 PM

#

tall condor so in reality if i want to create my xyz car model i need to have a base model t...

I'm not sure if I understand you correctly. Think of xyz as a name. Like you would want SD to draw "batmans car"

#

and don't overthink it 🤷‍♂️ just try and learn from errors xD in the end every training dataset behaves a bit different anyways

tall condor Apr 22, 2023, 12:31 PM

#

yea but each cycle takes like 24 hours of training so trial and error is really not that funny xD

#

so the more i understand the better i can prepare xD

stiff dust Apr 22, 2023, 12:32 PM

#

then first try to train for a few hours only and check how far you get

tall condor Apr 22, 2023, 12:32 PM

#

i already did that and this is why those questions come up 🙂

#

training each image 10 steps in 10 epochs is the same result as training each image 100 steps in 1 epoch?

stiff dust Apr 22, 2023, 12:32 PM

#

it didn't work out?

tall condor Apr 22, 2023, 12:33 PM

#

its going there but i still have issues with overfitting or underfitting

stiff dust Apr 22, 2023, 12:33 PM

#

oh, and one very important thing: use a good prompt for testing

tall condor Apr 22, 2023, 12:34 PM

#

i used a learning rate of 5e-7 and for some cases its still overfitting and some are underfitting and that is kind of frustrating

stiff dust Apr 22, 2023, 12:34 PM

#

like I found that using "xyz car" as prompt I get shitty images all the time, and as soon as I get good images it already overfits

tall condor Apr 22, 2023, 12:34 PM

#

yes i kind of have the same issue, how did you solve that?

stiff dust Apr 22, 2023, 12:35 PM

#

so first experiment what is a good prompt that gives you a nice picture of any car and then use the same prompt for your car

tall condor Apr 22, 2023, 12:35 PM

#

ah ok i see

stiff dust Apr 22, 2023, 12:35 PM

#

like these crazy "photography of a red car, masterpiece, perfect angle, blablablabla"

tall condor Apr 22, 2023, 12:35 PM

#

i was hoping i can make a very robust model that is not that specific and thus i used 5e-7

#

so that it is slowly learning the concept

stiff dust Apr 22, 2023, 12:36 PM

#

and then replace "red car" by "xyz car"
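
A small sketch of the testing workflow described here: first find a prompt that works for any car, then swap in the instance name. The prompt text is shortened from the example above, and the helper name `swap_subject` is hypothetical.

```python
# Prompt that is assumed to give good pictures of a generic car on the base model.
BASE_PROMPT = "photography of a red car, masterpiece, perfect angle"


def swap_subject(prompt: str, generic: str, instance: str) -> str:
    """Replace the generic subject with the trained instance name, once."""
    if generic not in prompt:
        raise ValueError(f"{generic!r} not found in prompt")
    return prompt.replace(generic, instance, 1)


if __name__ == "__main__":
    # Same prompt, same settings; only the subject changes.
    print(swap_subject(BASE_PROMPT, "red car", "xyz car"))
```

Keeping everything else identical makes it easier to tell whether a bad picture comes from the training or from the prompt.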

tall condor Apr 22, 2023, 12:36 PM

#

is the target to have as many different input images with an as detailed caption as possible or can i expect that at some point the system just learns my car and can map it on other cars?

#

and how can i tackle the issue that mainly the model is overfitting but yet some specifics that i have added pictures with captions for are underfitting in the same model

#

🙂 so many questions - i really like that stuff

stiff dust Apr 22, 2023, 12:38 PM

#

it should... I mean, sometimes it can do that with 5 images already 🤷‍♂️ So I think 1000 images are more than enough

tall condor Apr 22, 2023, 12:39 PM

#

for the overfitting and underfitting - shall i just increase the number of steps for the concepts that are underfitted or shall i just keep training more with lower learning rate?

stiff dust Apr 22, 2023, 12:39 PM

#

I think your learning rate is as low as possible 😅

tall condor Apr 22, 2023, 12:40 PM

#

so if it is now overfitting i use too many steps of a particular concept?

stiff dust Apr 22, 2023, 12:40 PM

#

but yes, you might try to add more examples of the images which still fail to the training data

tall condor Apr 22, 2023, 12:41 PM

#

i was thinking of reducing the per image step sizes by /10 and then train 10 epochs while saving each epoch to see what is going on but im not sure if training 100 steps in the same epoch is the same as 10x10 epochs

stiff dust Apr 22, 2023, 12:42 PM

#

and experiment with captions and prompts. Like it's super hard to get the model to the point where it does everything right just with "xyz car" as prompt. If something doesn't work, try to describe it in the prompt, maybe that helps already

stiff dust Apr 22, 2023, 12:43 PM

#

tall condor i was thinking of reducing the per image step sizes by /10 and then train 10 epo...

I don't know - depends on the implementation. Maybe just specify the epoch value and use a low one like 2-3

tall condor Apr 22, 2023, 12:43 PM

#

so basically what you are saying is that the input prompts with proper image selection is the key, right?

stiff dust Apr 22, 2023, 12:44 PM

#

not only for training, also for inference

tall condor Apr 22, 2023, 12:44 PM

#

so what kohya is doing is that it allows for each folder to define how many times each image in that folder is trained

stiff dust Apr 22, 2023, 12:44 PM

#

if you try if your model works and it gives you bad results, try to improve the prompt

tall condor Apr 22, 2023, 12:44 PM

#

and currently i train each image in each folder between 100 and 300 times or so

#

i was wondering if i can reduce that to 10 to 30 and just train 10 epochs

stiff dust Apr 22, 2023, 12:44 PM

#

tall condor so what kohya is doing is that it allows for each folder to define how many time...

that's smart

tall condor Apr 22, 2023, 12:45 PM

#

so i would have 30*1000 steps per epoch
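
The step counts being discussed can be checked with a few lines of arithmetic. This sketch assumes a batch size of 1 and that every image uses the same repeat count; the function name `total_steps` is made up for illustration.

```python
import math


def total_steps(num_images: int, repeats: int, epochs: int, batch_size: int = 1):
    """Return (steps per epoch, total steps) for a simple repeat-based schedule."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


if __name__ == "__main__":
    # 1000 images, 30 repeats, 10 epochs: 30*1000 steps per epoch.
    print(total_steps(1000, 30, 10))
    # 1000 images, 300 repeats, 1 epoch: same total, but only one epoch boundary.
    print(total_steps(1000, 300, 1))
```

Both setups give the same total number of steps. Whether the results are identical depends on the implementation, for example on how the learning-rate schedule and shuffling behave across epochs; the practical difference is that ten epochs allow saving ten intermediate checkpoints.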

stiff dust Apr 22, 2023, 12:45 PM

#

just use one epoch and train each directory, e.g. 5 times

tall condor Apr 22, 2023, 12:45 PM

#

if i train only 5 times each image the result is absolute crap xD

#

i found it to start working from 40 times +

stiff dust Apr 22, 2023, 12:46 PM

#

40 x 1000 images? Really?

#

okay, you use an extremely low learning rate, maybe that's the reason

#

I think you should use larger learning rate and experiment a bit with less steps

tall condor Apr 22, 2023, 12:46 PM

#

i found if i go to 1e-6 it is way too overfitting even at 40 times per image

stiff dust Apr 22, 2023, 12:47 PM

#

yes, so use less steps;)

tall condor Apr 22, 2023, 12:48 PM

#

but less steps the images get really blurry and stuff

#

well anyways you gave me a whole lot of input thank you so much!

stiff dust Apr 22, 2023, 12:49 PM

#

hm, if your caption is good it should produce good images from the start that more and more look like your training images

#

yeah, I would say try first with higher learning rates and less steps to experiment a bit and find a good training setup

tall condor Apr 22, 2023, 12:49 PM

#

i have a very wild mix of captions, also with alot of details, like headlights, spoilers and so on

#

and i want the model to be able to also understand those parts

#

thank you very much for your time sir

#

i will do some tests and report back with the results! 🙂

stiff dust Apr 22, 2023, 12:50 PM

#

sure, good luck

tall condor Apr 22, 2023, 12:50 PM

#

have a nice day

pure plume Apr 22, 2023, 1:04 PM

#

I wish to train for a style for the first time, where should i start?
I mean i know i should build a dataset, but is it like 20 pics, 50, 100, 200?
how do I tag such a thing?
in the style of <whatever>?
how to decide if it's TI, Lora, a model?

stiff dust Apr 22, 2023, 1:28 PM

#

try TI first, then do lora, use detailed captions. "in the style of <whatever>" is fine, but not important, you can also go for "by <whatever>".

tall condor Apr 22, 2023, 11:20 PM

#

anybody has a link describing how the text tokenizer works for training

#

also is there a way to extend a model rather than retraining it for something specific?

#

and if i finetune what is the best model to base on? currently im working with 1.5 and some i found on the internet that have 50% ema mixed in

spark comet Apr 22, 2023, 11:28 PM

#

hi

rigid starBOT Apr 22, 2023, 11:28 PM

#

FAQ: What is Stability AI?

Our vibrant communities consist of experts, leaders and partners across the globe. They are developing cutting-edge open AI models for Image, Language, Audio, Video, 3D and Biology.. AI by the people, for the people. Learn more here stability

tall condor Apr 22, 2023, 11:33 PM

#

hi 🙂

unique cloak Apr 23, 2023, 8:56 AM

#

you won't be able to train SDXL yet, as it is not released for download.
As for "can you", to be honest, almost no chance SDXL could be trained on this. You can't train any model I think on 4GB VRAM currently, the minimum I see is 8GB for LoRAs training

#

even just running SDXL has a big chance to require a super fancy GPU

#

you can train some models in google colab though, using notebooks

unique cloak Apr 23, 2023, 9:17 AM

#

it's not just taking long. it's another process and tool to train, and won't run at all on CPU or on 4GB VRAM 😢
with that card, your best bet is to use google colab for it from time to time, where a small training like 15 pics will be done in 20 minutes

tall condor Apr 23, 2023, 11:46 AM

#

loui i will remind you when you make a model with 1000+ images to train on sdxl xDD

#

and it will take half a year lol

stiff dust Apr 23, 2023, 3:04 PM

#

In my opinion all models have big issues with vikings. It seems that there is just no good training data for vikings. Like they all tend to give them weird horns and stuff

#

you might try different words or cultures that "feel similar", like "celtic" instead of "viking"

tall condor Apr 23, 2023, 7:50 PM

#

if i train a model what is the best model to base on?

#

also is it safe to use a pruned model?

#

also shall i use ema or nonema for training

#

v2-1_768-ema-pruned.safetensors or v2-1_768-nonema-pruned.safetensors

#

or v1-5-pruned-emaonly.ckpt / v1-5-pruned.ckpt

stiff dust Apr 23, 2023, 10:04 PM

#

always use ema

#

and yes, you can use pruned models. They just have some parts removed you don't need anyways

cold wyvern Apr 24, 2023, 9:40 AM

#

Is loss=NaN an issue when training a SD2.1 model??

#

unique cloak Apr 24, 2023, 9:42 AM

#

usually, loss=NaN means it's going badly yes. but it can also just bug out. this is very early in your training. unless you used a very high learning rate, this seems like a bug in display.
I would stop it and test though personally

cold wyvern Apr 24, 2023, 9:46 AM

#

Any suggested rates for 2.1??

unique cloak Apr 24, 2023, 9:48 AM

#

Dreambooth ? I use 6e-7 personally, with a polynomial scheduler

#

a lot lower than yours there

cold wyvern Apr 24, 2023, 9:48 AM

#

kohya_ss gui

unique cloak Apr 24, 2023, 9:48 AM

#

not sure then, especially if LoRA

cold wyvern Apr 24, 2023, 9:48 AM

#

based on dreambooth yeah

#

unique cloak Apr 24, 2023, 9:49 AM

#

yeah, I'm not sure how they implement it. I imagine those were the default values ?

cold wyvern Apr 24, 2023, 9:49 AM

#

0.0001 was default

unique cloak Apr 24, 2023, 9:49 AM

#

so 1e-4

cold wyvern Apr 24, 2023, 9:49 AM

#

I'll try yours and up the epoch count

unique cloak Apr 24, 2023, 9:50 AM

#

worth a try on 1e-4 already

#

but I do recommend using polynomial LR yeah

#

it makes LR reduce slowly over time

#

this is subjective, but from the tests I did on it, I find it gives better quality in smaller details
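
A minimal sketch of what a polynomial learning-rate schedule does: the rate starts at the initial value and decays toward an end value over the run. The formula below is the common polynomial-decay form; the parameter names and defaults are assumptions for illustration, not the settings of any particular trainer.

```python
def polynomial_lr(step: int, total_steps: int, lr_start: float,
                  lr_end: float = 0.0, power: float = 1.0) -> float:
    """Polynomial decay from lr_start to lr_end over total_steps."""
    # Clamp so the rate stays at lr_end once training passes total_steps.
    progress = min(step, total_steps) / total_steps
    return (lr_start - lr_end) * (1.0 - progress) ** power + lr_end


if __name__ == "__main__":
    for s in (0, 250, 500, 750, 1000):
        print(s, polynomial_lr(s, 1000, 6e-7))
```

With `power=1.0` this is a straight linear decay; larger powers drop faster early and flatten out toward the end.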

stiff dust Apr 24, 2023, 10:04 AM

#

NaN is usually a problem of your precision

#

if you train with fp16 you need special techniques like mixed precision and gradient scaling
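
To illustrate why fp16 needs gradient scaling, here is a standard-library sketch that round-trips a value through half precision. A very small gradient underflows to zero in fp16, while the same value multiplied by a loss scale survives and can be divided back afterwards. The gradient value and the scale of 2**16 are arbitrary examples; real trainers do this inside the framework, not with `struct`.

```python
import struct


def to_fp16(x: float) -> float:
    """Round-trip a Python float through IEEE half precision."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


tiny_grad = 1e-8        # example gradient, smaller than fp16 can represent
loss_scale = 65536.0    # 2**16, an arbitrary example scale

unscaled = to_fp16(tiny_grad)               # underflows to 0.0
scaled = to_fp16(tiny_grad * loss_scale)    # representable in fp16
recovered = scaled / loss_scale             # unscale in full precision

if __name__ == "__main__":
    print(unscaled, scaled, recovered)
```

Underflow like this silently loses gradient information; NaN losses more typically come from the opposite problem, values too large for fp16, which is one reason switching to bf16 is worth trying.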

cold wyvern Apr 24, 2023, 10:08 AM

#

had mixed precision = "fp16" in there already

#

and gradient checkpointing

#

am trying again with bf16 precision now

#

Thank you @stiff dust !

#

That's a batch size of 4 there at 768x768 on an A4000

unique cloak Apr 24, 2023, 10:10 AM

#

just under a picture per second. not that bad !

cold wyvern Apr 24, 2023, 10:11 AM

#

indeed, and it'll get a bit quicker as it goes on, not that it bothers me, I'm heading to bed and will collect the results tomorrow 🙂

tall condor Apr 24, 2023, 9:17 PM

#

hi guys, im finetuning with dreambooth on RTX4090 but for some reason i can only do batch of 1

#

any idea what im doing wrong

#

as soon as i use batch of 2 my vram runs out

#

ResidentChiefNZ: i had the same issue yesterday

#

may i ask how you solve it? for me it worked switching to batch of 1 and no memory efficient attention

#

but i would really like to use batches of 2

#

does switching from fp16 to bf16 help at all on the memory?

unique cloak Apr 24, 2023, 10:21 PM

#

you can't train XL for now, even on DS. 2.1 can be taught specific things if you need to.
For a professional pipeline, this seems like a completely valid way to go still, and for quite some time
Same for SD 1.5, it has such a big user base, there are very specialized models doing wonders on those subjects.

I love SD XL, and I really like that you spread that love too, but let's not antagonize other users either on what they use if they want to use it.
1.5 is far from an abandonware, 2.1 is not for old fart with the results it can give, let's water it down and enjoy what we each like, if it's cool with you

#

Using XL doesn't train it, no. it does help for sure on giving feedback on different things, like the tokens you use the most, and stats of use or even just all the great art you've been sharing since you came around, but other ways like pickApick, where human feedback is given back to the machine, are also great ways to help on it

The comparison you are proposing is not directly possible though. A prompt is tuned for the model it's targeted on, using the tokens that "resonate" the most with it. The good question there would be, for a given wanted result, what quality can you achieve by tinkering with the prompt. But this becomes an unfair question, putting XL at a disadvantage since it stays generalistic, and not specialized on a given result

Here is an example where we compared this with TwoDukes, one of those currently finetuning SDXL, if you really want :
SD 1.X results : #🍥｜anime message
SDXL results : #🍥｜anime message
we'll get more "on point" results from the models trained on the specific prompt, even if SDXL does really really great on it too, but importing more from styles that weren't prompted (like 3D in this example)

It's gotta be kept in mind that SDXL isn't out yet, it's just the beta, and that beta ROCKS, I'm all with you, and I have great hopes for it

unique cloak Apr 24, 2023, 11:02 PM

#

but to give another comparison I just did

#

prompt is "a realistic picture, professional portrait of a cat octopus creature wearing a suit, unmythical creatures"

#

and this second example is on a 1.5 trained model

median flax Apr 25, 2023, 12:09 AM

#

The checkpoints on Civitai based off 1.5 beat pretty much everything I've ever seen

cold wyvern Apr 25, 2023, 12:12 AM

#

As in the community trained/merged checkpoints found on CivitAI.com

median flax Apr 25, 2023, 12:12 AM

#

You need to get the right one for the right job though

tall condor Apr 25, 2023, 1:53 AM

#

FYI if i switch to bf16 instead of fp16 i can use batches of 2

#

can i expect changes in the result based on this change?

stiff dust Apr 25, 2023, 6:01 AM

#

sorry dude, but you seem to have no understanding how ai works 😅

cold wyvern Apr 25, 2023, 6:16 AM

#

stiff dust sorry dude, but you seem to have no understanding how ai works 😅

the difference between bf16 and fp16 is small in practice: fp16 has more bits dedicated to the fraction, so it is likely to be more precise than bf16, but the difference is going to be so negligible that it won't be noticed in the end result
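
The trade-off between the two formats can be worked out from their bit layouts. Both are 16 bits wide: fp16 uses 5 exponent bits and 10 fraction bits, bf16 uses 8 exponent bits and 7 fraction bits. This sketch computes the largest finite value and the relative spacing for each; the function name `float_format_stats` is made up for illustration.

```python
def float_format_stats(exponent_bits: int, mantissa_bits: int):
    """Return (largest finite value, machine epsilon) for an IEEE-style format."""
    bias = 2 ** (exponent_bits - 1) - 1
    # The largest usable exponent equals the bias (the top code is reserved for inf/NaN).
    max_finite = (2 - 2 ** -mantissa_bits) * 2.0 ** bias
    epsilon = 2.0 ** -mantissa_bits
    return max_finite, epsilon


fp16_max, fp16_eps = float_format_stats(exponent_bits=5, mantissa_bits=10)
bf16_max, bf16_eps = float_format_stats(exponent_bits=8, mantissa_bits=7)

if __name__ == "__main__":
    print("fp16:", fp16_max, fp16_eps)
    print("bf16:", bf16_max, bf16_eps)
```

So fp16 is finer-grained, but bf16 covers a far larger range, which makes it less prone to overflow during training.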

stiff dust Apr 25, 2023, 6:24 AM

#

oh sorry, I was referring to louis strange monologue above

stiff dust Apr 25, 2023, 7:47 AM

#

dude, your comparison means nothing. You can make photos with 1.5 with much better quality than vanilla SDXL, just use the right model and the right prompt

#

just look at example images from:
https://civitai.com/models/3666/protogen-x34-photorealism-official-release
https://civitai.com/models/1116/rpg
https://civitai.com/models/4823/deliberate
It's all 1.5


cold wyvern Apr 25, 2023, 7:56 AM

#

stiff dust dude, your comparison mean nothing. You can make photos with 1.5 with much bette...

Don't bite.. he'll keep posting for next 3 days....

stiff dust Apr 25, 2023, 7:57 AM

#

haha, right. Have fun with him ;D

cold wyvern Apr 25, 2023, 7:58 AM

#

stiff dust haha, right. Have fun with him ;D

nah, I gave up 😛

#

I'm just warning you before you make the same mistake I made!

unique cloak Apr 25, 2023, 8:24 AM

#

a lot of people went on that argumentative road already.
I think XL is promising, but not the joker that beats everything either.
But let's just agree to disagree and all enjoy our tools, is what I landed on there :p

tall condor Apr 25, 2023, 8:33 AM

#

why ppl compare sdxl to 1.5 and not 2.1?

#

is 2.1 really that much worse than 1.5?

stiff dust Apr 25, 2023, 8:34 AM

#

depends. Many people like 1.5 more

unborn rain Apr 25, 2023, 8:34 AM

#

well

stiff dust Apr 25, 2023, 8:35 AM

#

I would say 2.1 often makes more and better details. However, if you use a good custom model that doesn't matter so much

#

1.5 on the other hand is better in drawing humans, however, 2.1 is not as bad in that as many people say. It just cannot draw nude people

tall condor Apr 25, 2023, 8:36 AM

#

ah i see

stiff dust Apr 25, 2023, 8:36 AM

#

some people say 2.1 is a bit more overfitted and less versatile. Hard to say if that is true, though

tall condor Apr 25, 2023, 8:36 AM

#

from my tests 2.1 draws faces much better than 1.5

#

and with much more detail

stiff dust Apr 25, 2023, 8:36 AM

#

but I would say that, indeed, 2.1 gives me more of the same look (e.g. its hard drawing people full body, it always wants them close-up, while 1.5 seems a little bit more versatile)

#

yeah, as said, in the end nobody really uses the base models, but custom models like dreamshaper, deliberate and so on. They are also much better in details and faces

tall condor Apr 25, 2023, 8:37 AM

#

what model do you recommend to finetune on?

#

dreamshaper looks interesting

#

when i finetune do i use the vae version?

#

i still dont really understand that part

cold wyvern Apr 25, 2023, 8:46 AM

#

I finetune on the base models, as I know they still have their ema data, whereas most of the models have been pruned and may not have the training data

hot crow Apr 25, 2023, 9:09 AM

#

#1072220168534642768

cold wyvern Apr 25, 2023, 9:41 AM

#

Fools errand - a) quality is subjective and subject to bias; b) some of the fun of AI art is the "pull the handle" pokie machine roll it and see if you get a good one or not; and c) SD2.1 has well past moved on being just one model - there are dozens of finetunes out there (thousands for 1.5 models)

That said - I pressed generate once - got this

#

We aren't saying that SD XL is bad; nor that one should not support Stability as they are awesome - we are just saying that there is far more to AI art, and we need the resources of all involved to make this the best it can be...

#

SD2.1 was trained on 768x768...

#

And besides.. that wasn't what you asked for...

#

well each of the individual images in that plot is 640x960

#

dude.. I'm going to politely tell you where you can take your rule...

#

you've been beating this drum for 4 days! it's time to let it go

#

https://tenor.com/view/facepalm-really-stressed-mad-angry-gif-16109475

unique cloak Apr 25, 2023, 10:16 AM

#

I think they last said they feel it's time to let it go, not sure they are looking for more comparison, nor that you are stupid there.
From what I read, it feels like you are both of a different opinion and failed to convince the other or hear them out.

#

I do feel, like I explained yesterday, that your rules don't make sense.
Neither does trying to see what's the best model, for that matter, since the possibilities of training are different, and they can't be integrated in the same pipelines because of it.

#

yes seriously. You start from random noise, so to be fair on the number of pictures, 1 pic only doesn't make sense. it's the average aesthetic score of hundreds of gens that you would need to compare. but it's far from the only point that doesn't make sense in comparing them: parameter counts, different text encoders, finetunes to prompts, loras, hypernetworks, ...
There are numerous difference that make comparing them a statistical nightmare if you want to be fair
If you want to be practical, each has a different target and use on the market currently, the use cases they are intended for, or at least used for, diverge completely
Because of it, each will be the best in a different category

To keep your analogy, you are comparing 2 different sports golden medals at the Olympics

#

try to use reasoning maybe to answer me if you want to debate. this isn't far fetched, it's the practical world people coming for help on this server describe. I'm not attacking you there, you don't need to "screw" everything.
Anyway, since XL can't be finetuned at all currently, let's at least diverge this to #🏞｜general-with-images and not clog this channel.

#

finetuning is not modifying the prompt.
#🔧｜finetune is the channel for dreambooth, lora, textual inversion, controlnet, and other kinds of finetuning techniques
Prompting work is essential on any model, but you are mixing up the terminology. we mostly work on the good prompting techniques, and ways to build coherent prompts, in #📝｜prompting-help

unique cloak Apr 25, 2023, 11:27 AM

#

still not finetuning, please stop this contest as you've been asked multiple times now. nobody is asking for fair rules in this, you can keep having fun with it if you want, but move this either to #✨｜sdxl or to #🏞｜general-with-images .
This channel still isn't the place for what you are doing.

#

I think it's still not cool to keep using the wrong channel intentionally there, no

#

I didn't subscribe to the rules of your comparison
you did accept the rules of this server though. I'm explaining to you that you are in the wrong channel and need to change channel to respect other users, as I'm here right now to help those rules get applied to everyone.
So keep ignoring me and posting in finetune to prove your point and get timed out, I'm not sure what you are looking for there

unique cloak Apr 25, 2023, 11:55 AM

#

I have 10k credits currently, and used around 2k credits on SDXL currently.
I know what I'm talking about also on finetuning, having been on this server for months now, and being a moderator on it for around 6 now. So, the theoretical place to talk about a given subject, and find people willing to talk about that subject, I know quite well. And I'm telling you, this is not fine tuning you are doing.
It's prompting, and yes it's essential. As well as working on your sampler, scheduler, steps and all other available settings there.

#

I'm saying this one last time. stop sharing non fine tuning on this channel. I'll time you out, I can only warn you. I am giving you every chances to please not come to this end and move to #✨｜sdxl or #🏞｜general-with-images

#

it's tuning the output, not finetuning the model, as described by just the terminology itself in the diffusers library that is the base of everything SD related
https://huggingface.co/docs/transformers/training

pearl cypress Apr 25, 2023, 2:36 PM

#

unique cloak it's tuning the output, note finetuning the model, as described by just the term...

I admire your patience 🕊

unique cloak Apr 25, 2023, 2:37 PM

#

but thanks

paper knot Apr 25, 2023, 5:59 PM

#

I was going to ask something related to LoRA but I got hooked by this surreal conversation, I even forgot what I was going to say

unique cloak Apr 25, 2023, 7:17 PM

#

nope, don't start this again lol

#

still not the good channel, and still not a drama server

cold wyvern Apr 25, 2023, 7:33 PM

#

Did anyone have a working script for merging sd2.1 lora INTO a model?

prime briar Apr 26, 2023, 2:50 AM

#

left is the training data, right is what i get after training a lora from it. are there any lora parameters i should adjust to get more accurate results? i already tried the basic steps, cfg, and tried making the aspect ratios match
idk if i would have to use one of those img2img tools or inpainting tools, or switch to something like textual inversion?

#

also could it be that AI just struggles on guns like how it generates hands? idk anymore

stone garden Apr 26, 2023, 2:54 AM

#

Oi

olive vapor Apr 26, 2023, 9:48 AM

#

I'm training dreambooth on generating a character, though the results come out really, uhh, weird

#

I've got a dataset of about 70 images

#

I'm not sure if my config is correct but my settings go like this
instance token: a photo of (name) person
class token: a photo of a beautiful woman

instance prompt: a photo of (name), and then a whole bunch of variables like high quality and all (which give me photorealistic results when I use it in txt2img)
class prompt: a photo of a beautiful woman, then the same parameters
classification image negative prompt is regular stuff like bad hands, bad quality, whatnot (again this stuff gives me photorealistic results in txt2img)

sample prompts is a txt file with prompts corresponding to my dataset images
sample image prompt is blank (not sure what to fill in there)
sample negative prompt is the same negative prompts like before

#

I did set up my class images in this way: each image is named like 001 002 003 etc, and in a separate folder I have a bunch of txt files also named that, so for example txt file 001 has the prompt corresponding to image 001
But I'm not sure how to make dreambooth read in the correct txt file, so what I did was put all those prompts into one txt file which I now read in alone

unique cloak Apr 26, 2023, 9:56 AM

#

hello there JoJoCa 🙂 Happy to see you post your problem.
You are doing a lot of things right here, but some are double work for nothing, and some small errors that hammer your quality

olive vapor Apr 26, 2023, 9:57 AM

#

Thanks for the reply, yeah I probably did weird stuff, pretty new to SD

unique cloak Apr 26, 2023, 9:58 AM

#

first of all, if you are training on a single character, usually, 10 to 20 pictures is a good target. Anything above is a danger getting bigger and bigger, because things start to repeat in the pictures and you don't want that : you want variety
So first step I would take, is select the 15 best pictures of that dataset, with varied clothes, lighting, background, pose, and framing (close up, full body, ...)

#

then, if you are using instance prompt and class prompt like you are, then there is no need for a caption file next to the pictures, this is double the work for nothing.
It looks inside the files when you check the option to do so. if not, it takes the class/instance prompt to train

#

next, the "instance token" is not what you are using, it's just name

#

as for class token, in this case, I would use woman

#

it's a "single token" you want in those

#

the prompts in class and instance prompt will help build on it, but those are the main concepts you are targeting : your new token name and the class token woman

#

sample prompt seems good, but it's just a control measure anyway, it's something that shows you during training, what would the result be currently if you were to run that "sample prompt" on the model

#

so it lets you test how good the model performs

#

usually, only 2 or 3 prompts are enough

#

like

portrait picture of name
drawing of name, very detailed, half body shot
full body shot of name

#

(ask if I said something that is not understandable)

olive vapor Apr 26, 2023, 10:05 AM

#

That makes sense thank you, I dont really understand the part where you said it looks inside the files

unique cloak Apr 26, 2023, 10:15 AM

#

the captions files you created, the .txt files. Those aren't used at all if you don't check the corresponding checkbox in the UI (not sure of its name)
If you do then, instance prompt becomes unused, it instead takes what's in the txt file linked to each picture
There is a last method available, it's using [filewords] as instance prompt. this will make it so the name of your picture will be used. I personally use that : I write the caption of each picture directly in the filename, like "painting of a cat.png"

The whole goal of those 3 things (caption files, instance prompt, and [filewords]=>filenames) is the same : provide a prompt to train each picture on
The instance prompt is intended to be used when you want the same prompt for each picture. In your case, it's what I would recommend, and I would use a very simple instance prompt : name, nothing else. (with your name of course 😉 )
The caption file and the [filewords] method have the same goal : letting you have a different caption per picture. This can be very potent, especially on bigger trainings, but I found it to be overhyped and more complicated to use correctly, so not the best for single subject training
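
The three caption sources described above can be summarized as a small lookup. This is an illustrative sketch only, not the Dreambooth extension's actual code; the function name `resolve_caption` and the `use_caption_files` flag are hypothetical stand-ins for the checkbox mentioned earlier.

```python
from pathlib import Path


def resolve_caption(image_path: Path, instance_prompt: str,
                    use_caption_files: bool = False) -> str:
    """Pick the training prompt for one image from the three possible sources."""
    if use_caption_files:
        # Caption file: a .txt with the same base name next to the image.
        txt = image_path.with_suffix(".txt")
        if txt.is_file():
            return txt.read_text(encoding="utf-8").strip()
    if instance_prompt.strip() == "[filewords]":
        # [filewords]: the image filename (without extension) is the caption.
        return image_path.stem
    # Otherwise every image is trained on the same instance prompt.
    return instance_prompt


if __name__ == "__main__":
    print(resolve_caption(Path("painting of a cat.png"), "[filewords]"))
    print(resolve_caption(Path("001.png"), "name"))
```

For a single subject, the last branch (one simple instance prompt for every picture) is the setup recommended in the message above.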

olive vapor Apr 26, 2023, 10:17 AM

#

I see, thanks, but if I make my images a simple caption, how do I make it still use the other part (with all the variables like high quality, 4k, ultra res, ....)

unique cloak Apr 26, 2023, 10:20 AM

#

well, the good thing is : the base model you are training already knows those parts, you don't need to train those

by providing 15 very varied photos of you, and just saying to Dreambooth "learn this : it's 'JoJoCa'", dreambooth will learn each pic as you, and try to find the common part, the part that is in each and that could fit into that "JoJoCa" token.
It will automatically discard what changes between pictures: it won't learn the wall behind you if the background is different each time, for example, and the same goes for your clothes if you wear different ones.
And it will put everything it find common in those pictures, inside that single token, JoJoCa.

So when you later use that model, you can prompt "JoJoCa", and get already a "mostly valid but bad" picture of yourself
Then you add to your prompt all the other tokens, the 4k, the realistic, ... and you get the results you wanted

olive vapor Apr 26, 2023, 10:25 AM

#

I see, thanks a lot, that makes it clear

#

Last question (for now 🗿 ), whats the difference between instance prompt and class prompt (I'm now using the image caption for instance)

cold wyvern Apr 26, 2023, 10:29 AM

#

instance prompt is the token it's going to use, and the class is the token it would replace, is how I understood it

#

i.e instance of Emma Watson and class of woman -> the prompt "masterpiece, a woman" would be trained to be equal to "masterpiece, Emma Watson"

unique cloak Apr 26, 2023, 10:30 AM

#

class is supposed to represent something larger than your instance. just "woman" in your case, or even "person"
this is called "prior preservation" or "regularization data", the class itself, and is completely optional by the way
It's what helps the model remember what a random woman is, and not replace every woman with your face too fast in the model
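
A minimal sketch of how prior preservation enters the training objective, assuming the common formulation where the class (regularization) loss is added to the instance loss with a weight. The function names and the toy number lists are illustrative only; a real trainer computes these losses on noise predictions.

```python
def mse(pred, target):
    """Mean squared error between two equally long lists of numbers."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def dreambooth_loss(instance_pred, instance_target,
                    class_pred, class_target, prior_loss_weight=1.0):
    # Loss on your own pictures (the instance, e.g. "JoJoCa").
    instance_loss = mse(instance_pred, instance_target)
    # Loss on the class pictures (e.g. generic "woman" images).
    prior_loss = mse(class_pred, class_target)
    # A weight of 0 corresponds to training without regularization data.
    return instance_loss + prior_loss_weight * prior_loss


print(dreambooth_loss([1.0, 2.0], [1.0, 4.0], [0.0], [1.0], prior_loss_weight=0.5))  # 2.5
```

Because the class term is part of the same loss, the class images are trained alongside the instance images.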

cold wyvern Apr 26, 2023, 10:31 AM

#

wonder where I got my info from then :S

unique cloak Apr 26, 2023, 10:32 AM

#

you are also right

#

the regularisation data is trained too

#

it's a "second concept"

#

trained at the same time as your main concept

cold wyvern Apr 26, 2023, 10:32 AM

#

oh sweet lol, last thing I want to do is spread bad info!

unique cloak Apr 26, 2023, 10:33 AM

#

it's complicated, but yeah, it was all right from what I understand of it too

olive vapor Apr 26, 2023, 10:35 AM

#

Thanks a lot, lets hope it works well now 🙂

olive vapor Apr 26, 2023, 11:03 AM

#

Hmm weird thing, I have class images per instance image set to 5 but its only generating 45 images (I have 25 instance images)

unique cloak Apr 26, 2023, 11:06 AM

#

you already had some of it generated in a previous try maybe ? not sure, not the tool I use
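
One way to read those numbers, assuming the tool only generates the class images that are still missing: 25 instance images at 5 class images each asks for 125, so generating 45 would mean about 80 were already on disk. The 80 is inferred from the arithmetic, not stated in the chat, and `class_images_to_generate` is a hypothetical helper.

```python
def class_images_to_generate(instance_images, class_per_instance, already_on_disk=0):
    """How many class images are still missing (never negative)."""
    wanted = instance_images * class_per_instance
    return max(0, wanted - already_on_disk)


print(class_images_to_generate(25, 5))                      # 125
print(class_images_to_generate(25, 5, already_on_disk=80))  # 45
```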

olive vapor Apr 26, 2023, 11:24 AM

#

oh yeah I did, my bad, well any way to make it go faster since its doing 8700 images and the eta is 17 hours 🗿

#

16gb vram btw

finite creek Apr 26, 2023, 11:25 AM

#

Hello, interesting conversation. Can I ask you guys a quick question. Do the classification images need to be square?

unique cloak Apr 26, 2023, 11:28 AM

#

olive vapor oh yeah I did, my bad, well any way to make it go faster since its doing 8700 im...

8700 pics Oo for regularization on 25 base pics ? That's insanely high ! I would do maybe 1000, tops

unique cloak Apr 26, 2023, 11:28 AM

#

finite creek Hello, interesting conversation. Can I ask you guys a quick question. Do the cla...

it depends on the tool you use for training. some do accept multi ratio, some still require square pictures

#

it's in the wiki usually to be sure

olive vapor Apr 26, 2023, 11:29 AM

#

unique cloak 8700 pics Oo for regularization on 25 base pics ? That's insanely high ! I would...

I honestly have no clue how to set that so its default

unique cloak Apr 26, 2023, 11:29 AM

#

olive vapor I honestly have no clue how to set that so its default

it's the dreambooth extension for Automatic you are using, right ?

olive vapor Apr 26, 2023, 11:30 AM

#

yeah

#

yesterday when I still tried with 60 images it put it on like 20k so now its resuming from that I suppose

finite creek Apr 26, 2023, 11:30 AM

#

unique cloak it depends on the tool you use for training. some do accept multi ratio, some st...

Thanks you Guizmus! It’s with dreambooth with A1111

unique cloak Apr 26, 2023, 11:31 AM

#

olive vapor yesterday when I still tried with 60 images it put it on like 20k so now its res...

given your context, go into the Settings and set "Total Number of Class/Reg Images" to 0

olive vapor Apr 26, 2023, 11:32 AM

#

aight thanks, do I need to restart for that or does it update automatically

unique cloak Apr 26, 2023, 11:33 AM

#

olive vapor aight thanks, do I need to restart for that or does it update automatically

I think you'll need to stop it and just restart it, no need to delete/recreate the model though, it hasn't started training

olive vapor Apr 26, 2023, 11:33 AM

#

oh its training, class images are finished

#

unique cloak Apr 26, 2023, 11:34 AM

#

finite creek Thanks you Guizmus! It’s with dreambooth with A1111

I'm looking at the doc of it right now (https://github.com/d8ahazard/sd_dreambooth_extension) but I don't see any mention of ratio, it seems it requires square pics ? Not 100% sure but it seems so

unique cloak Apr 26, 2023, 11:34 AM

#

olive vapor

ok I misunderstood

#

what are those 8700 pictures you were talking about ?

#

total steps ?

#

yeah that is far too many in my opinion

#

how much Batch Size are you using ?

#

2500 steps on batch size 1 should be enough for the settings I can see in your screenshot

olive vapor Apr 26, 2023, 11:36 AM

#

batch size 29
I used the performance wizard and it put it on that automatically

unique cloak Apr 26, 2023, 11:36 AM

#

ho ok lol so then 8700 is just insanely high

#

batch size 29 means your total dataset is trained each step

#

my recommendation here would be to run 100 step

#

max

#

to me, the 500 you already did is too many

#

such dataset is trained in 20 minutes or so

olive vapor Apr 26, 2023, 11:38 AM

#

Yeah I thought it was a bit much having 18 hours for a dataset, but I cant find where to put the max steps

#

oh is it the training steps per image?

unique cloak Apr 26, 2023, 11:38 AM

#

18 hours is 50% more time than I took to train my mega model on 750 pictures

olive vapor Apr 26, 2023, 11:38 AM

#

yeah definitely bad settings on my end then 🗿

unique cloak Apr 26, 2023, 11:39 AM

#

the base recommendation, but it depends on so many things, like the batch size, the gradient accumulation step, the learning rate, ... is to train around 100 times each picture
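
The "around 100 times each picture" rule of thumb can be turned into a step count. This is a rough sketch that assumes one step consumes `batch_size * grad_accum` pictures and ignores learning rate and class images; the function name is made up for illustration.

```python
import math


def steps_for_repeats(num_images, repeats_per_image=100, batch_size=1, grad_accum=1):
    """Optimizer steps needed so each picture is seen about `repeats_per_image` times."""
    samples = num_images * repeats_per_image
    return math.ceil(samples / (batch_size * grad_accum))


print(steps_for_repeats(25))                 # 2500
print(steps_for_repeats(25, batch_size=29))  # 87
```

With 25 pictures this reproduces both figures from the conversation: about 2500 steps at batch size 1, and on the order of 100 steps at batch size 29.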

#

I have a guide that comments on lots of things like this

#

https://github.com/Guizmus/sd-training-intro

#

it's not specific to Automatic1111 dreambooth though, all training methods are included, it focuses more on the theory behind it, and helps you fix your errors by understanding how it works

olive vapor Apr 26, 2023, 11:41 AM

#

Damn that looks really interesting, thanks for making that

#

lol the entire character looks perfect aside from the face so far

unique cloak Apr 26, 2023, 11:44 AM

#

you'll have "fix faces" to help a little when prompting for real

olive vapor Apr 26, 2023, 11:44 AM

#

true, I'll let it do its thing

finite creek Apr 26, 2023, 12:15 PM

#

unique cloak I'm looking at the doc of it right now (<https://github.com/d8ahazard/sd_dreambo...

Thank you Guizmus !! Will check

olive vapor Apr 26, 2023, 2:05 PM

#

alright 68% and it says training finished, I'll test how it looks with a prompt now

pearl swan Apr 26, 2023, 2:44 PM

#

Hey not sure if I can post about this on here but I’m keen to pay $500 for someone who will spend say 3 hours with me walking me through fine tuning with Everydream2 — kinda like a learning session

Am doing a lot of things trial and error so could do with some accelerated learning session

olive vapor Apr 26, 2023, 5:25 PM

#

unique cloak you'll have "fix faces" to help a little when prompting for real

doesn't seem to fix it, but I realised a big part of the problem is probably something I'm doing wrong on my end: I copied a prompt from the civitai website, and what gives them this result gives me the second

cold wyvern Apr 26, 2023, 7:01 PM

#

You need to pick up a vae

olive vapor Apr 26, 2023, 7:29 PM

#

Yeah I realised as well, used a VAE and now it looks perfect

unique cloak Apr 27, 2023, 8:18 AM

#

Just found the multiply.txt feature in everydream, simulating bigger or smaller datasets :
Adding a multiply.txt applies the factor inside it to the current folder
That means that, if I put 0.25 inside a 200 pictures folder, 50 pictures would be selected at random in there each epoch.
the thing works also with numbers higher than one if needed be, but this is just what I needed to continue working on my mega model without retraining the whole 750 first pictures each time : I can just use this old dataset as regularisation, adding the good multiply.txt to balance it in size with the new dataset
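
A sketch of the selection behaviour described above, not EveryDream's actual implementation: a factor below one picks a random subset each epoch, and a factor above one repeats the folder. How the fractional part is rounded is an assumption here.

```python
import random


def pick_for_epoch(files, multiplier, rng):
    """Return the files used this epoch for a folder with the given multiply factor."""
    files = list(files)
    whole, fraction = divmod(multiplier, 1.0)
    picked = files * int(whole)                # full repeats for factors >= 1
    extra = round(len(files) * fraction)       # random share for the fractional part
    picked += rng.sample(files, extra)
    return picked


folder = [f"img_{i:03d}.png" for i in range(200)]
print(len(pick_for_epoch(folder, 0.25, random.Random(0))))  # 50
print(len(pick_for_epoch(folder, 2.5, random.Random(0))))   # 500
```

Because the subset is drawn again each epoch, the whole old dataset still gets seen over time while staying balanced in size with the new one.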

finite creek Apr 27, 2023, 10:36 AM

#

Anybody had this problem? When training in dreambooth A1111, it generates classification images even though I have a folder with classification images in it.

tall condor Apr 27, 2023, 11:17 AM

#

for some reason my trained model is producing tints and sometimes super high contrast

#

is there a way i can avoid that?

tall condor Apr 28, 2023, 3:07 AM

#

anyone can help with the tint issue?

#

am i overfitting?

hot breach Apr 28, 2023, 3:47 AM

#

unique cloak Just found the multiply.txt feature in everydream, simulating bigger or smaller ...

yea this is very useful for regularization images, you can toss 10k images in and use like, multiply.txt with 0.03 or something to pick just a few each epoch

#

tweak to avoid catastrophic forgetting, etc

hot breach Apr 28, 2023, 4:01 AM

#

tall condor for some reason my trained model is producing tints and sometimes super high con...

if it uses offset noise it probably is using too much, might see if it can be adjusted

#

i think a lot of the offset noise has been implemented as 10% because the original blog post that created the idea used 10% but its too much sometimes
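
For context, offset noise adds one extra random value per channel on top of the usual per-pixel noise, scaled by a strength (the 10% mentioned above is a strength of 0.1). Below is a pure-Python sketch of that idea on nested lists; real trainers do this on tensors, and the function name and signature here are hypothetical.

```python
import random


def offset_noise(channels, height, width, strength=0.1, rng=None):
    """Per-pixel Gaussian noise plus one shared Gaussian shift per channel."""
    rng = rng or random.Random()
    noise = []
    for _ in range(channels):
        # One value for the whole channel: this is what lets the model learn
        # overall brightness/tint changes, and what can overshoot when too strong.
        shift = strength * rng.gauss(0.0, 1.0)
        noise.append([[rng.gauss(0.0, 1.0) + shift for _ in range(width)]
                      for _ in range(height)])
    return noise


sample = offset_noise(channels=4, height=2, width=2, strength=0.1, rng=random.Random(0))
print(len(sample), len(sample[0]), len(sample[0][0]))  # 4 2 2
```

Setting `strength=0.0` gives plain Gaussian noise, so lowering the value is the adjustment suggested above for tinted or high-contrast outputs.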

tall condor Apr 28, 2023, 12:36 PM

#

what would be the setting for the noise?

royal island Apr 28, 2023, 5:48 PM

#

Can anyone point me in a direction for gaining a deeper understanding of how to train for concepts? I've only had success doing faces via textual inversion but I want to go way deeper than that. I want to be able to prompt images of characters at a party actually interacting with each other, or working on a car/skydiving/skateboarding/anything. I don't quite grasp why I can generate a million images of a trippy giant mushroom and have them all come out different from each other and look amazing, but it's impossible to make even one good image a crowd of people on a dance floor. What would go into making such a thing possible? Is it something viable for one person with a 4090 to accomplish or is this the sort of thing that would require hundreds of thousands of images and a mining rig running nonstop for an unholy amount of time?

tall condor Apr 28, 2023, 11:42 PM

#

so i spend the last 5 days capturing 6k images

#

if the result sucks its all your fault xDDD

tall condor Apr 29, 2023, 12:06 AM

#

*captioning

tall condor Apr 29, 2023, 12:25 AM

#

does SD understand perspective? like angles on an object? when training is there a proper way to specify angles of view? like looking from top down, looking from front?

tall condor Apr 29, 2023, 12:54 AM

#

also shall i use fp16 or bf16?

cold wyvern Apr 29, 2023, 1:57 AM

#

tall condor also shall i use fp16 or bf16?

Fp unless it doesnt work iirc

serene flicker Apr 29, 2023, 2:50 AM

#

Hey, someone mentioned it's possible to train on top of a model with images in the dataset from the first training, plus and minus a bunch of images, without overtraining on the consistent images. Is there such a thing?

minor jay Apr 29, 2023, 7:27 AM

#

Anyone trying to finetune deepfloyd if

cold wyvern Apr 29, 2023, 8:43 AM

#

minor jay Anyone trying to finetune deepfloyd if

Not sure theres much point - its a 64x64 model with 2 upscales.. more excited for sdxl release

minor jay Apr 29, 2023, 8:44 AM

#

cold wyvern Not sure theres much point - its a 64x64 model with 2 upscales.. more excited f...

So what, its still state of the art for realistic, so if you train it to be anime style it should be able to do that too

serene flicker Apr 29, 2023, 9:25 PM

#

Ok, I need some help. I am training a new version of a model on top of the original. I can tell that it is training based off the sample images. But when I try to test these models, the outputs are basically the exact same thing as the base model. Does anyone know what is happening and how I could salvage this? It's the third time I have tried training and I really don't want to go again since it takes like 4-5 hours each.

#

#

There is an example

#

Btw this is a 2.1 768 model

#

I'm going to try one more thing.

oak yew Apr 30, 2023, 12:05 AM

#

before (base SD 1.5, a painting of flowers in a vase on a table):

#

after:

#

My lora REALLY likes tulips

#

I finally got kohya to work after much reinstalling everything and this is the first lora file I've made that doesn't instantly make things MUCH worse, lol

dull snow Apr 30, 2023, 5:28 PM

#

any tips how to make the black lines smoother?

visual needle Apr 30, 2023, 5:43 PM

#

Message🍴

dull snow Apr 30, 2023, 8:10 PM

#

#

tall condor Apr 30, 2023, 8:31 PM

#

hi guys, im having massive issues with yellow tints in my finetuned model, anyone know how to tackle it?

tall condor Apr 30, 2023, 10:44 PM

#

also i have another question: i have a model that has like 100 different concepts in it, some concepts have 5 images and some have 500 images. now what is happening in my model is that the ones that have like 500 images perform well but its almost impossible to generate the concepts that have only 5 images, i ran it with 50 epochs with multiples of 10 (so every image is learned at least 500 times) - what is the best way to properly weight those concepts?

fickle haven May 1, 2023, 12:57 PM

#

so i need someone to tell me how many u_net steps i need to train a character with 95 images in dreambooth and also text encoder steps.. . by default for 10 images it was 1500 unet and 350 text encoder so i multiplied x 9 since i have 90 images...

dull snow May 1, 2023, 4:33 PM

#

everyone is asking

#

nobody is answering

fickle haven May 1, 2023, 4:35 PM

#

Can we tag support?

#

They are ignoring us

cold wyvern May 1, 2023, 7:05 PM

#

Its a community support forum - there is no “@ Support”, you are relying on others passing their knowledge on

#

As for the step count - as with a lot of things with SD its trial and error - you may find with one dataset 1500 steps is plenty even for 90 images, and another might need 25000 steps depending on the look you want to go for…

tall condor May 1, 2023, 8:07 PM

#

its not like you can expect ppl to wait here and answer you pal. just be happy if you get a reply once in a while!

fickle haven May 2, 2023, 8:52 AM

#

tall condor its not like you can expect ppl to wait here and answer you pal. just be happy i...

aww thats so sad bc the ones in reddit just dont answer :C

fickle haven May 2, 2023, 8:52 AM

#

cold wyvern As for the step count - as with a lot of things with SD its trial and error - yo...

well i tried with 1500 and i got poop. 3000 , poop. 9000 looking better

#

i will try with 15000

tall condor May 2, 2023, 1:06 PM

#

i have a model that has like 100 different concepts in it, some concepts have 5 images and some have 500 images. now what is happening in my model is that the ones that have like 500 images perform well but its almost impossible to generate the concepts that have only 5 images, i ran it with 50 epochs with multiples of 10 (so every image is learned at least 500 times) - what is the best way to properly weight those concepts?

fiery rampart May 2, 2023, 4:47 PM

#

👀

tall condor May 2, 2023, 9:33 PM

#

ended up writing a program that helps with the weighting of concepts 🙂

unborn wind May 3, 2023, 1:53 AM

#

cold wyvern Did anyone have a working script for merging sd2.1 lora INTO a model?

I dunno if anyone's answered this question for you, but I managed to successfully merge a 2.1 lora into a 2.1 model. In Kohya you can go to the "Merge Lycoris" tab and plug in the lora as if it were a lycoris.

#

keywords work without calling the lora

xyz_grid-0000-640127425-rz88mkultr4_Portrait_shot_of_a_steampunk_robot_ladybug_unreal_engine_realistic_render_8k_micro_detail_elegant_highly_detai.png

oak yew May 5, 2023, 1:27 AM

#

tall condor also i have another question: i have a model that has like 100 different concept...

The newest version of my Lora is doing great because this time instead of repeating all training images the same amount of times, I have a "tier system" where the best images have duplicates. I only trained for 4 epochs and I'm getting good results because I'm training on the good images 100 times and a bunch of "meh" ones 5-20 times, which is weighting the concept higher
This is also working for concepts I don't have a lot of images of but I want to be strong, so I put them in the high repeat folder
I don't really know what I'm doing yet, but I'm using Kohya as a front end to train LORA and in the images subfolders you name things number_foldername like so and number is how many times your script repeats the folder
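
The `number_foldername` convention described above can be sketched as a small parser that shows how many samples each folder contributes per epoch. The folder names and image counts below are made-up examples, and the helpers are hypothetical, not part of Kohya.

```python
def parse_repeat_folder(folder_name):
    """Split '100_best' into (100, 'best'); the number is the repeat count."""
    prefix, _, concept = folder_name.partition("_")
    if not prefix.isdigit() or not concept:
        raise ValueError(f"expected '<repeats>_<name>', got {folder_name!r}")
    return int(prefix), concept


def samples_per_epoch(folders):
    """`folders` maps folder name -> number of images inside it."""
    result = {}
    for name, image_count in folders.items():
        repeats, concept = parse_repeat_folder(name)
        result[concept] = repeats * image_count
    return result


print(samples_per_epoch({"100_best": 5, "5_meh": 40}))
# {'best': 500, 'meh': 200}
```

In this example the 5 "best" images outweigh the 40 "meh" ones, which is the tiering effect described above and also one way to strengthen a concept with few images.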

#

I learned this on this sketchy 4chan lora training guide under "How to set up the directory" or something like that
https://rentry.org/lora_train

#

There's no inappropriate images in that document but 4channy things make me nervous so watch out

oak yew May 5, 2023, 2:42 AM

#

tried making my flower lora better today
turning it up to see which colors really come through, I think I'm gonna try looking for more cool tones to add "high" in the data set

#

#

I'm collecting before and after every time I add to this LORA so that I know what's working and what isn't

tall condor May 5, 2023, 2:42 PM

#

i have like 250 concepts that i want to train in a model. can i do that using lora or should i just use dreambooth? all my concepts are captioned in folders atm

cold wyvern May 5, 2023, 9:24 PM

#

Lora for each and merge them together?

tall condor May 6, 2023, 3:27 AM

#

im going for dreambooth not for testing, 112hours 1.1mil steps

#

if it doesnt get me where i want to get ill go to lora next or try to fix the weights

#

i still dont understand the difference between lora and dreambooth tbh, with kohya ss as far as i understand the text encoder is also trained, so what does lora do differently other than the output being kind of a diff

cold wyvern May 6, 2023, 3:50 AM

#

tall condor i still dont understand the difference between lora and dreambooth tbh, with koh...

The basic summary is A lora is a set of instructions on how to edit a base model - dreambooth is a whole model

oak yew May 6, 2023, 4:22 AM

#

im on 112 of about 700 hand editing, annotating, and sorting my dataset

stiff dust May 6, 2023, 8:09 AM

#

there is basically no difference, just a different way of storing the model. The only limitation of lora is that you set a rank beforehand that limits how fine your model can differ from the base model

cold wyvern May 6, 2023, 8:54 AM

#

stiff dust there is basically no difference, just a different way of storing the model. The...

except the lora is sub 200mb, and the full model is 10x that minimum
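
A small sketch of why a LoRA file is so much smaller and why the rank limits it, assuming the usual low-rank form where a weight matrix of shape `d_out x d_in` gets an update `B @ A` with `B` of shape `d_out x rank` and `A` of shape `rank x d_in`. The layer size 768 is only an example value.

```python
def full_update_params(d_out, d_in):
    """Numbers stored when the whole weight matrix is fine-tuned."""
    return d_out * d_in


def lora_update_params(d_out, d_in, rank):
    """Numbers stored for the two low-rank factors B and A."""
    return rank * (d_out + d_in)


def lora_delta(B, A):
    """Plain-Python matrix product B @ A: the change applied to the base weights."""
    return [[sum(b * a for b, a in zip(row, col)) for col in zip(*A)] for row in B]


print(full_update_params(768, 768))          # 589824
print(lora_update_params(768, 768, rank=8))  # 12288
print(lora_delta([[1], [2]], [[3, 4]]))      # [[3, 4], [6, 8]]
```

The rank-1 example at the end shows the limitation: every row of the delta is a multiple of the same vector, so a low rank can only express a restricted set of changes to the base model.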

tall condor May 6, 2023, 11:39 AM

#

what im concerned with the lora is that my input dataset is cress dependent

#

*cross

#

meaning that the content is linked via tags, if i seperate the concepts and make each of them a different lora im not sure if they can cross match at the end

tall condor May 6, 2023, 11:41 AM

#

oak yew im on 112 of about 700 hand editing, annotating, and sorting my dataset

how are you sorting and annotating? in folders?

warm ginkgo May 6, 2023, 3:30 PM

#

any suggestions? this is for dreambooth on base v2.1

#

tall condor May 7, 2023, 3:36 AM

#

how big is the difference if i use something like wd14 on my images and have much more detailed tags rather than concepts?

cold wyvern May 7, 2023, 4:10 AM

#

I know there is a big difference between captioned and uncaptioned.. have only ever hand captioned stuff myself though

stiff dust May 7, 2023, 5:17 AM

#

tall condor meaning that the content is linked via tags, if i seperate the concepts and make...

you don't necessarily have to separate them in different loras

#

just train them all together

sonic narwhal May 7, 2023, 11:53 AM

#

any difference between everydream and stabletuner?

#

Trying to pick one for full fine tuning

tall condor May 7, 2023, 12:06 PM

#

cold wyvern I know there is a big difference between captioned and uncaptioned.. have only e...

if you hand caption how detailed did you make your captions?

sonic narwhal May 7, 2023, 12:39 PM

#

Also wondering should you add miniconda to PATH in order to get stabletuner?

hot breach May 7, 2023, 4:12 PM

#

sonic narwhal any difference between everydream and stabletuner?

ed2 is pretty much the only one that is still actively developed and enhanced

nocturne vale May 7, 2023, 4:26 PM

#

is there a way to get the Triggerwords of a lora (safetensor) file? Cause over time I accumulated quite a lot of loras and either the site have been deleted or i can't find it anymore...

hot breach May 7, 2023, 4:57 PM

#

this is the issue with using weird keywords for everything, you have hundreds of files and can't use them without a magic dictionary

nocturne vale May 7, 2023, 4:58 PM

#

https://tenor.com/view/spiderman-dancing-gif-24233159

#

aight, imma start making myself a dictionary for my collection then

oak yew May 7, 2023, 10:00 PM

#

tall condor how are you sorting and annotating? in folders?

I'm using BLIP captioning to generate text files for each image, but then editing all these txt files in vscode because I can pull up the image easy and also crop it and stuff with an image editing plugin

then I'm just throwing them all into folders that are how many repeats each image gets during training

tall condor May 8, 2023, 12:36 AM

#

may i ask some examples i just want to understand how extensive your captions are

oak yew May 8, 2023, 1:25 AM

#

tall condor may i ask some examples i just want to understand how extensive your captions a...

Keep in mind that I don't know what I'm doing 😅 , but I try to keep them as simple as possible. When I was detailed and had several commas and tried to describe the whole image well, my LORA made the outputs worse.
a man with a flower crown on his head
It doesn't matter if the man has a suit on or a t shirt or he's shirtless, or what he looks like, I just try to keep it simple and pertaining to the subject of my LORA, which is decorating things with flowers. The man is interchangeable.
a painting of a black cat surrounded by flowers
a woman wearing a dress covered in flowers in a garden, depth of field
many flowers in a garden
floral illustration of roses
Less seems to be more

tired plank May 8, 2023, 5:08 AM

#

Has anyone had success training a lora for an action as opposed to an object? Like a jump or a punch

cold wyvern May 8, 2023, 6:56 AM

#

Im guessing by some of the specific loras on civitai thats absolutely possible

warm agate May 8, 2023, 12:46 PM

#

Hello guys,
I want to train a model on landscape photography with a dataset over 100k images

wintry carbon May 8, 2023, 2:57 PM

#

can anyone please help me with training embeddings ive tried 3 times now and every time it looks nothing like the original reference images not even close

im trying to make an embedding of an original anime character and its not getting anything right, not even the color of the hair. the part i am really confused about is the prompts: I did process the images to generate the prompts for me, but the tutorial i was watching said to only keep the prompts that arent integral to the character, so if it always has long hair remove the long hair part, and if its wearing a helmet remove the helmet part from the prompt, which i did. settings are:
Embedding learning rate: 0.05:10, 0.02:20, 0.01:60, 0.005:200, 0.002:500, 0.001:3000, 0.0005
prompt template: custom_subject_filewords.txt
max steps: 3000
save image to log directory N steps: 50
save a copy of embedding to log directory every N steps: 50
save images with embedding in PNG chunks: checked
shuffle tags by ',' when creating prompts: checked
drop out tags when creating prompts: 0.1
sampling method: deterministic

model im using isnt the 1.5 but ive tried using that one and it got even worse results
i can dm the results and reference images
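
To make the stepped learning rate string in those settings easier to read, here is a sketch of a parser, assuming each `rate:step` pair means "use this rate up to and including that step" and a trailing bare rate applies until the end. The exact boundary handling is an assumption, and the function names are hypothetical.

```python
def parse_lr_schedule(text):
    """Turn '0.05:10, 0.02:20, 0.0005' into [(0.05, 10), (0.02, 20), (0.0005, None)]."""
    schedule = []
    for part in text.split(","):
        rate, sep, until = part.strip().partition(":")
        schedule.append((float(rate), int(until) if sep else None))
    return schedule


def lr_at(schedule, step):
    """Learning rate in effect at a given training step."""
    for rate, until in schedule:
        if until is None or step <= until:
            return rate
    return schedule[-1][0]


schedule = parse_lr_schedule(
    "0.05:10, 0.02:20, 0.01:60, 0.005:200, 0.002:500, 0.001:3000, 0.0005"
)
for step in (5, 15, 100, 3000):
    print(step, lr_at(schedule, step))
```

Under this reading, with max steps set to 3000 the final `0.0005` entry would never be reached.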

unreal linden May 8, 2023, 3:01 PM

#

hello, I know how to drop a training image into the SD bot, but I do not know what prompt to use to attach it so I can then do image2image

quiet eagle May 8, 2023, 3:11 PM

#

how much RAM do you typically need for Lora training. I see mentions of needing just 6gb on a 8gb gpus from some and using ~30gb on Colab e.g. in this recent thread https://www.reddit.com/r/Oobabooga/comments/13ait7z/lora_training_runs_out_of_memory_on_saving/

#

I do have an 8gb gpu, and i do have colab pro (tho it seems pretty nerfed now and even the A100 shows like 16gb of vram) just curious if it's worth it even trying to set it up on my 8gb gpu or if it will be subpar

cold wyvern May 8, 2023, 8:02 PM

#

quiet eagle how much RAM do you typically need for Lora training. I see mentions of needing ...

6gb for image training works ok, but oobabooga is a text2text platform and the requirements might be very different as those models are much larger

#

The bmaltais/kohya_ss fork works fine on a 1060, if a tad slow

cold wyvern May 8, 2023, 8:06 PM

#

wintry carbon can anyone please help me with training embeddings ive tried 3 times now and eve...

Have u tried a lora instead?

wintry carbon May 8, 2023, 8:12 PM

#

cold wyvern Have u tried a lora instead?

well idk if making a lora is good for specific characters

tall condor May 8, 2023, 9:04 PM

#

warm agate Hello guys, I want to train a model on landscape photography with a dataset over...

that does sound like something super cool, however labeling the image i imagine to be a nightmare

tall condor May 8, 2023, 9:05 PM

#

wintry carbon can anyone please help me with training embeddings ive tried 3 times now and eve...

look at kohya_ss, there are even youtube videos on lora and dreambooth with it. i found it quite easy to get started with

wintry carbon May 8, 2023, 9:06 PM

#

i dont wanna use dreambooth because thats only for 1 model

tall condor May 8, 2023, 9:06 PM

#

kohya supports lora also, also it has a nice web ui and helps creating the folder structure and so on

#

its probably your most easy starting point

#

also it takes care of the dependencies and so on by itself

tall condor May 8, 2023, 9:08 PM

#

oak yew Keep in mind that I don't know what I'm doing 😅 , but I try to keep them as sim...

are you using caption shuffling?

wintry carbon May 8, 2023, 9:23 PM

#

tall condor kohya supports lora also, also it has a nice web ui and helps creating the folder...

i wanted to train embeddings

gentle osprey May 8, 2023, 10:06 PM

#

wintry carbon i dont wanna use dreambooth because thats only for 1 model

you can make a dreambooth then extract the LoRA

#

i've had a lot more success making a dreambooth first since it doesn't require captioning

wintry carbon May 8, 2023, 10:10 PM

#

yeah but i cant use infinite loras when creating prompts

#

im pretty sure

gentle osprey May 8, 2023, 10:10 PM

#

what do you mean by infinite loras?

wintry carbon May 8, 2023, 10:39 PM

#

gentle osprey what do you mean by infinite loras?

well how many loras can you use for a single image ?

#

they all together have to = 1 in value no?

gentle osprey May 8, 2023, 10:41 PM

#

don't think there's a hard limit, just that they can interact in weird ways

#

and no, don't need to add up to 1

wintry carbon May 8, 2023, 10:45 PM

#

well what if i want multiple people i cant put 2 loras with different people

gentle osprey May 8, 2023, 10:45 PM

#

i'd just use inpainting for that

wintry carbon May 8, 2023, 10:46 PM

#

seems way too much work to do for every single image until i get something that looks good also inpainting never looks good at least not when ive tried

#

all i want is some help with training TI embeddings .-.

gentle osprey May 8, 2023, 10:48 PM

#

looked at your original statement, how are you doing your captions

wintry carbon May 8, 2023, 10:51 PM

#

if you mean the txt file captions for the images then i generate them with the processing tool and then i remove everything that is integral to the characters design

#

from the captions thats what ive been told to do

gentle osprey May 8, 2023, 10:52 PM

#

yeah, that's basically the gist of it, thought you might be over describing the object you're trying to embed

wintry carbon May 8, 2023, 10:53 PM

#

all i describe is the background sometimes pose and clothes

#

and when i train it even after 3000 steps its not even close not even the color of the hair is right

gentle osprey May 8, 2023, 11:00 PM

#

mind posting a training image and your caption?

wintry carbon May 8, 2023, 11:00 PM

#

ye sure should i send you it in the dms ?

gentle osprey May 8, 2023, 11:00 PM

#

sure

tall condor May 8, 2023, 11:22 PM

#

is there any way to interpret loss rate in correlation with learning rate

#

im using a cosine with warmup and my loss rate is stabilizing to a constant after 20 epochs even tho the learning rate keeps decreasing. is there a way to interpret this?

#

also for some reason my speed went from 2.6 It/s to 3.1 it/s why is that?

#

gpu works faster when tortured hard over some time? xDDD

warm agate May 9, 2023, 3:33 AM

#

tall condor that does sound like something super cool, however labeling the image i imagine ...

Yeah it's gonna take a bit of time, but it can be a perfectly trained model.
Can I DM?

stiff dust May 9, 2023, 4:14 AM

#

tall condor is there any way to interpret loss rate in correlation with learning rate

there is no correlation. Sure, the final gradient is proportional to loss and learning rate, but both numbers don't have anything to do with each other.
Learning rate is determined by a learning rate scheduler. It decreases if you want it to decrease
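
To illustrate that the learning rate follows the scheduler and nothing else, here is a sketch of a cosine schedule with linear warmup. This is the usual textbook shape, not necessarily the exact formula of the trainer being discussed; the step counts and base rate are example values.

```python
import math


def cosine_with_warmup(step, total_steps, warmup_steps, base_lr):
    """Linear warmup to base_lr, then cosine decay towards zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 50, 100, 550, 1000):
    print(step, cosine_with_warmup(step, total_steps=1000, warmup_steps=100, base_lr=1e-4))
```

The function takes only the step number as input: the loss never appears in it, so the rate keeps decreasing whether or not the loss moves.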

#

loss in SD does not tell you much

#

it mostly depends on the sampled noise and time step, so you have to average loss over many steps or large batch sizes to see anything and even then the change in image quality is not necessarily going with a decrease of loss
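
Since the per-step loss is dominated by the sampled noise and timestep, one simple way to see any trend is a moving average over many steps. A minimal sketch follows; the window size is something to choose yourself, and the function is not tied to any particular trainer.

```python
def moving_average(values, window):
    """Average of the last `window` values at each position (shorter at the start)."""
    if window < 1:
        raise ValueError("window must be at least 1")
    averaged = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]   # drop the value that left the window
        averaged.append(running / min(i + 1, window))
    return averaged


print(moving_average([1, 2, 3, 4], window=2))  # [1.0, 1.5, 2.5, 3.5]
```

Even a smoothed curve is only a rough signal here, since image quality does not necessarily follow the loss.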

cold wyvern May 9, 2023, 5:23 AM

#

Think of the model as a giant music studio with 895 million channels, training the model is just moving the sliders to get the desired result, learning rate tells you how far you can adjust each slider at each step

warm agate May 9, 2023, 6:01 AM

#

Deliberate is around 2 gb, so with how many images did they train the model?

cold wyvern May 9, 2023, 6:20 AM

#

warm agate Deliberate is around 2 gb, so with how many images did they train the model?

The size of the model does not correlate to the amount of images it was trained on

warm agate May 9, 2023, 6:21 AM

#

cold wyvern The size of the model does not correlate to the amount of images it was trained ...

I want to train a model with 100k images, how big would it be?

cold wyvern May 9, 2023, 6:23 AM

#

Theres still 895 million parameters, the only difference will be if there is additional information, which there shouldnt be - the trained weights will be there, but depending on whether its saved in fp32 or fp16/bf16 it will come out to either 3.95gb or 1.99gb for a sd1.5 based model, or 2.43gb for sd2.1
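
The point that file size follows parameter count and precision, not dataset size, can be sketched as simple arithmetic. This assumes 4 bytes per fp32 value and 2 bytes per fp16 or bf16 value; the helper name `checkpoint_size_gb` is made up for illustration.

```python
# Bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2}


def checkpoint_size_gb(num_params, dtype):
    """Rough size in GB (10**9 bytes) of the raw weights alone."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Using the parameter count quoted in the chat as the input.
half = checkpoint_size_gb(895_000_000, "fp16")
full = checkpoint_size_gb(895_000_000, "fp32")
```

The number of training images never enters the calculation. A real checkpoint file can hold more tensors than a quoted parameter count covers, so actual files may be somewhat larger than this estimate.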

warm agate May 9, 2023, 6:24 AM

#

cold wyvern Theres still 895 million parameters, the only difference will be if there is add...

Oh, so even the captions play an important role?

#

@cold wyvern Can you also collaborate with us on the project?

#

I want to train a model on landscape photography with a dataset over 100k images

cold wyvern May 9, 2023, 6:29 AM

#

warm agate Oh, so even the captions play an important role?

Going back to the analogy above, the captions help the machine determine which sliders need to be moved and how far

warm agate May 9, 2023, 6:33 AM

#

cold wyvern Going back to the analogy above, the captions help the machine determine which s...

Oh ok

cold wyvern May 9, 2023, 7:01 AM

#

The cliffs notes??

#

0 is usually fine - I think 0.05 is what the original scientific paper recommended when they introduced it, most people go for 0.1 or 0 I think

wintry carbon May 9, 2023, 12:54 PM

#

so i am trying to make a lora and its stuck at 0% and wont move
anyone know whats up?

tired plank May 9, 2023, 1:42 PM

#

Question, say I generate an image of a person and they are wearing a shirt I really like.
However, the shirt is patterned (floral, Hawaiian) and is at an off-center angle. It’s also drawn in a particular style and not photorealistic.

How would you generate a bunch of images of this shirt, from different angles and in close-up, given only this one image?

#

That way I'd have a good training set for a LoRA. I was wondering if I had to crop it out and put it in image to image with low denoising, but I feel like the pattern will change even with low denoise

stiff dust May 9, 2023, 1:55 PM

#

actually what I really like is to use the alpha layer to mask out the part of the image that is irrelevant (e.g. just keep the shirt). Then you might train with a single image. Textual Inversion in auto1111 supports this, for example

hot breach May 9, 2023, 2:55 PM

#

the longer you train the less you need

gentle osprey May 9, 2023, 3:01 PM

#

for LoRA extraction with Kohya SS what should the network dimension (rank) and conv dimension (rank) be set to?

slate ledge May 9, 2023, 4:00 PM

#

There's probably a wiki somewhere that says something along the lines of

Don't ask how many (prompt, image) pairs are required for fine-tuning a stable diffusion model, because every fine-tuning task is a bit different.

On that note, has anybody fine-tuned a stable diffusion model to generate photorealistic images within a specific domain using a custom dataset? What was the size of your dataset? I need to generate a training set of (image, text) pairs for fine-tuning a CLIP model, where each image is a photorealistic scene and each text snippet is a description of what's in the scene (types of objects, and their spatial relation to one another).

#

I know "as many training samples as possible" is a totally valid answer, but I've only got enough motivation to manually label one reasonably small training set. That being said, I suspect the CLIP model will require significantly more samples than the SD model, and so I'm tempted to use a fine-tuned SD model to bootstrap a training set for fine-tuning my CLIP model. Has anyone done this before?

warm agate May 9, 2023, 6:56 PM

#

I am planning to train a model.
What do you guys suggest?

#

Lora or Dreambooth?

#

What's the difference?

cold wyvern May 9, 2023, 6:58 PM

#

@warm agate - I go for Lora as I wanted to be able to take that info and merge it into different models and see what came out. If you are trying to get a single model out, then dreambooth I think

#

I dont know if theres a technical reason to choose one over the other

warm agate May 9, 2023, 6:59 PM

#

cold wyvern I dont know if theres a technical reason to choose one over the other

Why does dreambooth take more VRAM?

#

Because it goes through the images more times?

cold wyvern May 9, 2023, 7:01 PM

#

Same reason a lora is smaller than a whole model i think - one is generating a set of instructions on how to edit the model, the other is editing the full model on the fly

warm agate May 9, 2023, 7:02 PM

#

cold wyvern Same reason a lora is smaller than a whole model i think - one is generating a s...

Oh so Lora is used for partial training like, Face, drawing style etc
Whereas, dreambooth is used to train image style, costumes?

#

Am I right?

cold wyvern May 9, 2023, 7:03 PM

#

warm agate Oh so Lora is used for partial training like, Face, drawing style etc Whereas, ...

No, you can do style loras

#

Possibly, they did change the version of torch

warm agate May 9, 2023, 7:04 PM

#

cold wyvern No,you can do style loras

Oh ok.
So in my case should I choose Dreambooth?

cold wyvern May 9, 2023, 7:06 PM

#

warm agate Oh ok. So in my case should I choose Dreambooth?

Try both tbh

warm agate May 9, 2023, 7:08 PM

#

cold wyvern Try both tbh

I think it would take a lot of time

#

@cold wyvern Can you help me with image scraping from r/EarthPorn?

cold wyvern May 9, 2023, 7:16 PM

#

warm agate @cold wyvern Can you help me with image scraping from r/EarthPorn?

Cant sorry, but I do like that idea and wish you luck!

warm agate May 9, 2023, 7:17 PM

#

cold wyvern Cant sorry, but I do like that idea and wish you luck!

👍

#

@cold wyvern Do we even need to add a description for each image if we want to train?
For example, if a man is wearing a red dress, do we add a description like "Man wearing a red dress on a sunny day"?

cold wyvern May 9, 2023, 7:27 PM

#

Again, as with most things stable diffusion related, it's “try it and see”. I have had some success with uncaptioned images, and some cases where the captions were absolutely necessary

warm agate May 9, 2023, 7:29 PM

#

cold wyvern Again, with most things stable diffusion related is “try it and see”. I have ha...

Oh ok, will check

tall condor May 10, 2023, 1:44 AM

#

hi guys, in my model some concepts are overfitting while some concepts are almost impossible to create. is there any proper way of handling that other than trial and error?

stiff dust May 10, 2023, 1:50 AM

#

you can try alpha channel masking to weight down regions in the image it should not put too much effort on. I don't know, though, which scripts support that.

tall condor May 10, 2023, 1:50 AM

#

what is alpha channel masking?

#

also note that i have a set of around 7000 images

#

also another question i have is: for some reason my model is mixing up concepts that should not be mixed. its like a car and a train, when i create something it creates like a cartrain. how can i stop that? also is it possible to have a car and a train generated in one image somehow?

cold wyvern May 10, 2023, 2:42 AM

#

tall condor also another question i have is: for some reason my model is mixing up concepts ...

I would have thought captioning would have fixed that?

stiff dust May 10, 2023, 3:38 AM

#

tall condor what is alpha channel masking?

you can open your images in a graphic editor and make parts of the images transparent.
SD ignores transparency, but some training methods use the transparency as weighting to emphasize which part of the image the model should train on.
Of course, for thousands of images you don't want to do that manually
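
A rough sketch of what "use the transparency as weighting" could look like, assuming the trainer computes a per-pixel squared error and that alpha values are scaled to the range 0 to 1. The function `masked_mse` is hypothetical; the chat does not say how any specific script implements this.

```python
def masked_mse(pred, target, alpha):
    """Mean squared error where each pixel is weighted by its alpha value.

    alpha = 1.0 means the pixel counts fully, alpha = 0.0 means the pixel
    is ignored (fully transparent in the training image).
    """
    if not (len(pred) == len(target) == len(alpha)):
        raise ValueError("pred, target and alpha must have the same length")
    total_weight = sum(alpha)
    if total_weight == 0:
        return 0.0  # nothing visible, nothing to learn from
    weighted = sum(a * (p - t) ** 2 for p, t, a in zip(pred, target, alpha))
    return weighted / total_weight
```

With this weighting, errors in the transparent background contribute nothing to the update, so the model is only pushed towards the visible region.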

stiff dust May 10, 2023, 3:39 AM

#

tall condor also another question i have is: for some reason my model is mixing up concepts ...

that's, unfortunately, a known limitation of SD. It has problems generating multiple concepts in one image. There are multiple ways to work around that

#

you can create two images separately, copy paste them into one and run img2img

#

you can use composable diffusion - it sometimes gets these things right

#

I think there is even a plugin for composable diffusion in auto1111 that allows you to mark regions in your image with separate prompts

gentle osprey May 10, 2023, 3:41 AM

#

Regularization images should help prevent classes from merging

stray kindle May 10, 2023, 6:07 AM

#

Any Lora colabs that still work? Please @ me if you know.

tall condor May 10, 2023, 11:15 AM

#

composable diffusion - any more info on that?

tall condor May 11, 2023, 12:09 AM

#

gentle osprey Regularization images should help prevent classes from merging

i have a model im training and as far as i understand regularization images only work if you train 1 concept for 1 model, i have like 600 xD

gentle osprey May 11, 2023, 1:25 AM

#

Are you fine-tuning existing classes? If so regularization should work. Like teaching it 50 different cars and 50 different planes and 50 different ducks could all be done with their own regularization images.

stiff dust May 11, 2023, 2:17 AM

#

tall condor i have i model im training and as far as i understand regularization images only...

no, you can use regularization images also for multiple concepts

serene flicker May 11, 2023, 3:33 AM

#

Honestly what even are reg images, or class images? I never understood what to use those for

#

Though I have never trained subjects so I guess they don't matter for me?

stiff dust May 11, 2023, 4:24 AM

#

just for preventing overfitting. If you show images of rabbits drawn in a particular style, you also show, for example, photographs of rabbits so that the model does not forget what rabbits without that style look like

gentle osprey May 11, 2023, 4:48 PM

#

if i'm training a face: any reason to use a learning rate other than 0.000001 and a learning rate scheduler of constant/linear?

sonic narwhal May 11, 2023, 8:02 PM

#

is there a way to scrape instagram accounts for images?

gentle osprey May 11, 2023, 8:52 PM

#

jdownloader probably does that

brazen oriole May 11, 2023, 9:15 PM

#

simple question: will an embedding trained using textual inversion on sd 1.5 work with any other sd 1.5 based models, for example the base model of Deliberate https://civitai.com/models/4823/deliberate is 1.5, so should/would my embedding work with Deliberate?

cold wyvern May 11, 2023, 10:22 PM

#

a textual inversion will "work" with any other model made with the same base; but results may vary 🙂

#

@brazen oriole

brazen oriole May 11, 2023, 10:23 PM

#

gotcha, thanks. just kinda wondering if i should be training with the standard model or the variant mostly, since id like to be able to use the embedding with different base models (of the same version)

stiff dust May 12, 2023, 1:40 AM

#

use the model you will use the embedding in for best results

chrome breach May 12, 2023, 7:47 AM

#

I have tried fine-tuning with dreambooth using 100 images with 1500 regularization images for 10K steps... Interestingly the output I get from it on 5 inference prompts is exactly the same as what base SD1.5 gives on those prompts...

#

It is as if fine-tuning never happened... Anyone faced same issue??

stiff dust May 12, 2023, 8:00 AM

#

guess there is something going wrong. The output is never the same, even after very few steps of training....
You should track progress with some validation images

chrome breach May 12, 2023, 8:06 AM

#

Yeah cool I'll have some validation images and see

#

Also, I am not sure if the data has to be specifically of one subject only or we can keep it like a mixture?

carmine pilot May 12, 2023, 10:58 AM

#

Hello fellow humans,
I was trying to use Dreambooth to create a lora from photos of a specific girl with glasses.
And I am getting mixed results and idk if it is just bad training or if the glasses are the problem.

Is it generally better to train on photos without them, or can SD handle that just fine?

Thanks in advance,
feel free to DM (or just @ me, but i dont want to spam here) if u are experienced in person training. Ive got a few more simple questions. Any help is much appreciated.

waxen grove May 12, 2023, 12:11 PM

#

Hi, I'm running a startup that heavily uses SD. We are looking to expand the team with a freelancer or 2. Searching for somebody that has tons of experience in finetuning SD 1.5 & 2.1. Shoot me a DM if you think that could be you. Happy to explain more there

gentle osprey May 12, 2023, 1:55 PM

#

carmine pilot Hello fellow humans, I was trying to use Dreambooth to create lora from photos o...

It can handle it just fine. I'd train a Dreambooth first and extract the LoRA later.

tall condor May 13, 2023, 1:17 AM

#

nonfp16?

#

you mean bf16?

#

or you mean fp32 models?

tall condor May 13, 2023, 1:19 AM

#

carmine pilot Hello fellow humans, I was trying to use Dreambooth to create lora from photos o...

sd does well with glasses from what i saw, its probably an issue with over / underfitting

#

may i ask your workflow and maybe a sample image of what is not working

tall condor May 13, 2023, 11:50 AM

#

can someone explain how to make regularisation images if i finetune with multiple concepts?

#

also for regularisation images, what should be in them? the current model is not able to create what i finetune for. so how would i create those images?

gentle osprey May 13, 2023, 1:13 PM

#

tall condor can someone explain how to make regularisation images if i finetune with mutlipl...

The way I did it: have your base model spit out like 1500 images of the class you're fine-tuning

#

So like if you're training a woman's face and your instance prompt is "a photo of kdbwh85 woman face" then you have your base model spit out 1500 images with the prompt "a photo of a woman face"
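
The prompt bookkeeping described above can be sketched as follows. This only builds the class prompts and counts; generating the images is left to whatever tool you use. The helper names `class_prompt` and `regularization_plan` are made up for illustration, and replacing the rare token with the article "a" simply mirrors the example in the message.

```python
def class_prompt(instance_prompt, token, article="a"):
    """Swap the rare instance token for a plain article to get the class prompt."""
    return " ".join(article if word == token else word
                    for word in instance_prompt.split())


def regularization_plan(concepts, images_per_class=1500):
    """Map each distinct class prompt to the number of images to generate.

    concepts is a list of (instance_prompt, token) pairs. Concepts that
    share a class end up sharing one set of regularization images.
    """
    plan = {}
    for instance_prompt, token in concepts:
        plan[class_prompt(instance_prompt, token)] = images_per_class
    return plan


plan = regularization_plan([("a photo of kdbwh85 woman face", "kdbwh85")])
```

Each key in `plan` is a prompt to run through the base model, and the value is how many images to keep for that class.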

tall condor May 13, 2023, 1:33 PM

#

but i dont fine tune a single class

#

i have like 300-600 classes

#

thats why i am asking how i would take care of the regularisation image. i understand how it works for 1 single class

gentle osprey May 13, 2023, 1:41 PM

#

You would need regularization images for each class

#

Same process I'd imagine

tall condor May 13, 2023, 1:42 PM

#

but as far as i understand dreambooth can only have 1 regularization folder

#

also my issue is that the class is new so i wouldnt know how to create regularization images

#

its just something the current model cant create

gentle osprey May 13, 2023, 1:43 PM

#

Not sure what your set up looks like, but for each concept I'm trying to train I can specify a different regularization folder

tall condor May 13, 2023, 1:46 PM

#

are you working with kohyass?

gentle osprey May 13, 2023, 1:47 PM

#

Nah, Dreambooth extension for A1111

#

But also not trying to train 300-600 concepts at a time lol

tall condor May 13, 2023, 1:48 PM

#

concepts is like 900 xD

gentle osprey May 13, 2023, 1:49 PM

#

The concepts that are bleeding together, are those the new concepts?

tall condor May 13, 2023, 1:49 PM

#

but i can probably cut the regularization down to 300 to 600 different classes

gentle osprey May 13, 2023, 1:49 PM

#

And is it all of them? Or just certain ones.

tall condor May 13, 2023, 1:50 PM

#

what do you mean with certain one or all together

gentle osprey May 13, 2023, 1:50 PM

#

You said that classes were kind of merging together

tall condor May 13, 2023, 1:50 PM

#

yes

#

its like a cross dependent model

gentle osprey May 13, 2023, 1:50 PM

#

Is it all the classes you're training or just some of them

tall condor May 13, 2023, 1:51 PM

#

all of them

#

i mean the results im getting without regularisation arent too bad, its just that some concepts are overfitting and some are hard to create at all

#

which is because for some concepts i have 100 images and for some i have 3

#

what im doing now is weighting them so that the concept with 100 images gets as many runs as the concept with 3

#

but it seems that this is what is causing the overfitting

gentle osprey May 13, 2023, 1:52 PM

#

Maybe separate them into different batches

tall condor May 13, 2023, 1:52 PM

#

because those concepts with 3 images are run 30 times in 1 epoch while the others are run once
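
The balancing described above can be sketched as a repeat count per concept. The function `balanced_repeats` and its `max_repeats` cap are illustrative assumptions; the chat does not confirm that capping fixes the overfitting, it is just one knob to experiment with.

```python
def balanced_repeats(image_counts, max_repeats=None):
    """Compute how often each concept's images are repeated per epoch.

    Without a cap, every concept gets roughly as many runs as the largest
    one. With a cap, small concepts are repeated at most max_repeats
    times, trading balance for less repetition of the same few images.
    """
    largest = max(image_counts.values())
    repeats = {}
    for concept, count in image_counts.items():
        r = max(1, round(largest / count))
        if max_repeats is not None:
            r = min(r, max_repeats)
        repeats[concept] = r
    return repeats


full_balance = balanced_repeats({"big": 100, "small": 3})
capped = balanced_repeats({"big": 100, "small": 3}, max_repeats=10)
```

Comparing a fully balanced run against a capped one is a cheap way to test whether the heavy repetition of the 3-image concepts is what drives the overfitting.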

gentle osprey May 13, 2023, 1:53 PM

#

Like train all the high image concepts together

tall condor May 13, 2023, 1:53 PM

#

what do you mean by batches?

gentle osprey May 13, 2023, 1:53 PM

#

Then do a separate training for the low image concepts

#

Have no idea if that would work, but might be worth testing

tall condor May 13, 2023, 1:53 PM

#

you mean like training each individually and then merging them?

gentle osprey May 13, 2023, 1:54 PM

#

Yeah

tall condor May 13, 2023, 1:54 PM

#

tbh the level of work required with this is really not an option

gentle osprey May 13, 2023, 1:54 PM

#

Or train the model in steps

tall condor May 13, 2023, 1:54 PM

#

its more feasible to define a regularization for each "base concept"

#

what do you mean by train the model in steps?

gentle osprey May 13, 2023, 1:54 PM

#

Like train it with all the high image concepts

#

Then train the fine tuned output with the low image concepts

tall condor May 13, 2023, 1:55 PM

#

i see

#

i have like 1300 class folders

#

and like 7000 images

#

so my average is like 5 images

#

however maybe what i can do is reduce the classes down to like 30 or 40

gentle osprey May 13, 2023, 1:56 PM

#

Hahaha damn

tall condor May 13, 2023, 1:56 PM

#

with the same images

#

and train a base model with them

#

and then specialize it with the full set

#

so basically create 30 classes with 7000 images

#

train the model as base

gentle osprey May 13, 2023, 1:57 PM

#

Yeah, I'd definitely experiment with how much you're feeding it at any given time

tall condor May 13, 2023, 1:57 PM

#

and then run the actual training

#

i was just hoping that weighting alone will help

#

also anyone know if shuffling captions makes a big difference?

#

and also what LR Scheduler are you guys using?

gentle osprey May 13, 2023, 2:03 PM

#

I use 0.000001, constant but I had the same question earlier

tall condor May 13, 2023, 2:03 PM

#

1e-6 right?

#

for me im using cosine with warmup

#

it appears the warmup is quite important

#

i have tested with constant before without warmup and it really didnt turn out too good

#

it appears everybody is using a scheduler that reduces the LR at the end but i dont really understand the benefit yet
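
For reference, a minimal sketch of a cosine schedule with linear warmup, written from the usual textbook definition. The function name and parameters are illustrative and not taken from any specific trainer, which may differ in details such as restarts or a minimum LR.

```python
import math


def cosine_with_warmup(step, total_steps, warmup_steps, peak_lr, min_lr=0.0):
    """Learning rate at a given step (steps are counted from 0)."""
    if step < warmup_steps:
        # Linear ramp from a small value up to peak_lr.
        return peak_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase that has passed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (peak_lr - min_lr) * cosine
```

The warmup keeps the first updates small while the optimizer statistics are still unreliable. The decay makes the last updates small, which is commonly explained as letting the weights settle instead of bouncing around with the noisy gradient.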

visual horizon May 13, 2023, 5:45 PM

#

I'm new to AI and I'd like to do a LoRA character training. I found this tutorial that seems super nice and easy https://imgur.com/a/mrTteIt#TjsDxqp but it's using Google Colab and I don't, so I struggle to follow the actual training part

Could anyone explain it to me or have a good link to what I'm trying to learn please? This is the only tutorial that seemed easy and I struggle with whatever else I found

gentle osprey May 13, 2023, 8:15 PM

#

visual horizon I'm new to AI and I'd like to do a LoRa character training. I found this tutoria...

https://rentry.org/2chAI_LoRA_Dreambooth_guide_english

LoRA guide by anons (last update: 02.02.2023 UTC+3; Russian version available)
Contents: What is LoRA?; Usage; Method 1 – using in WebUI's prompt; Method 2 – using sd-webui-additional-networks extension; Method 3 – merge with checkpoint; Preparing a dataset; General advice; For kohya's script; For sd_dreambooth_extension; Training; kohya's...

visual horizon May 13, 2023, 9:35 PM

#

gentle osprey https://rentry.org/2chAI_LoRA_Dreambooth_guide_english

Thank youu I'll check it out habby

#

I hope its not too hard to understand

turbid shuttle May 14, 2023, 1:07 AM

#

Not sure where to ask, does the WebUI have a function similar to Remix (MidJourney)?

cold wyvern May 14, 2023, 3:02 AM

#

Webui doesnt but Nerdy Rodent had a video a while back on a SD based image merge

tall condor May 14, 2023, 10:18 AM

#

what are you guys using to latent caption images? DeepDanbooru works well for some but not so well for other images. anyone used WD14 before? is it any better

tired plank May 14, 2023, 3:44 PM

#

I figured you would use deepbooru for stuff trained on danbooru(anythingv3/novelai) , but I've only ever tried to make two loras so I don't have the testing data to say

#

Related to that...I just made a lora for a character and while they got the hair and face kinda right, they completely got the skin tone wrong despite me not including the skin tone in the captioning. Character is much lighter skin than I wanted.

You are supposed to not include attributes in the captioning text files that are attributed to your character right?

#

Also if you are training on a character, is it better to leave the 1girl/1guy tags in or remove them from your captioning files? I figured you would remove those, since being 1girl/1guy is an attribute of any character

gloomy stag May 14, 2023, 7:29 PM

#

I have a collection of faces of different characters I'm trying to train on. Would it make sense to, say, add woman and man to the description? For example, should I say Ryu man and Chun li woman? Will that give better results in general?

tall condor May 14, 2023, 10:54 PM

#

from my experience the better you caption your pictures the better your result is

#

i suggest you run something like DeepDanbooru on your images and then kind of optimize the captions.

#

i have just tested a model without detailed tags (just the concept) and when i added the detailed tags the result was much better than before

wise locust May 15, 2023, 4:38 AM

#

does anyone know if training a dreambooth on a model that's already dreamboothed would affect quality in any way? Say if I want to train a new subject on a model where I already added subjects

cold wyvern May 15, 2023, 4:50 AM

#

shouldn't affect quality, but as always, the proof is in the pudding 🙂

wise locust May 15, 2023, 5:16 AM

#

cuz I do know that the reg images affect the entire model

#

so maybe if you train it too much it will start overfitting into ur reggies? so many questions, so little time to test them all

warm agate May 15, 2023, 5:54 AM

#

@cold wyvern Is it important to caption images before training model?

cold wyvern May 15, 2023, 6:16 AM

#

warm agate @cold wyvern Is it important to caption images before training model?

Its not necessary but highly recommended

warm agate May 15, 2023, 6:17 AM

#

cold wyvern Its not necessary but highly recommended

do you know any tool which can auto-caption images, as I am trying to train a large dataset? It would be difficult to caption each image.

#

I dont find Deepdanbooru for landscape photography

cold wyvern May 15, 2023, 6:21 AM

#

warm agate do you know any tool which can auto-caption images, as I am trying to train a la...

Both the dreambooth extension and kohya have tools for captioning

warm agate May 15, 2023, 6:21 AM

#

cold wyvern Both the dreambooth extension and kohya have tools for captioning

Dreambooth has inbuilt captioning and will auto-caption when we input our images?

cold wyvern May 15, 2023, 6:23 AM

#

warm agate Dreambooth has inbuilt captioning and will auto-caption when we input our images...

Nope - theres a tools tab and you can select the image folder in there and it will caption the images in the specified folder

warm agate May 15, 2023, 6:33 AM

#

cold wyvern Nope - theres a tools tab and you can select the image folder in there and it wi...

Ok

warm agate May 15, 2023, 6:33 AM

#

cold wyvern Both the dreambooth extension and kohya have tools for captioning

What's Kohya?

#

does dreambooth have a gui?

cold wyvern May 15, 2023, 6:45 AM

#

warm agate does dreambooth have a gui?

dreambooth extension is for a1111 - kohya is https://github.com/bmaltais/kohya_ss

sonic narwhal May 15, 2023, 7:05 AM

#

The blip captioning in kohya is pretty shit though

#

Is there any other that works better?

short python May 15, 2023, 8:01 AM

#

i’ve been getting some good results finetuning SD2.1 by freezing all layers of the text encoder except for the last handful (2-6 out of 24). it seems to do a great job of preventing overfitting and catastrophic forgetting even at relatively high learning rates. doing this makes SD2.1 training feel about as “easy” as SD1.5 training (it’s still tricky, it’s just no longer a nightmare). there’s a branch on EveryDream2trainer if anyone is interested in trying it out.

vast dome May 15, 2023, 9:14 AM

#

guys what does "batch count" do? I don't think it increases the speed of training because its/s remains the same regardless of the batch count during Dreambooth

#

it is number of images that it processes before it updates the main model right?

#

wouldn't 1 batch count be better in terms of quality because it updates itself with every new sample? Though it might increase the training time because GPU-CPU transfer

cold wyvern May 15, 2023, 9:15 AM

#

its the number of images that it processes simultaneously, decreases the number of steps and has a slight performance increase

vast dome May 15, 2023, 9:16 AM

#

performance increase is speed it takes to complete the whole training right?

#

what I want to know is its impact on the final quality of the model

#

I get it that it reduces time it takes to train because it reduces the number of bus transfer between cpu-gpu

#

however i don't understand whether it can improve/deform quality of the final model

stiff dust May 15, 2023, 11:42 AM

#

during training the point is rather that multiple images are trained together in one update step. So your gradient update is an average across the images in one batch

#

this makes the gradient less noisy and unstable (in SD the gradient is often extremely noisy due to the stochastic nature of the noise sampling)

#

as far as I know, when training diffusion models people tend to use extremely large batch sizes to make training more stable. However, on consumer hardware you cannot do this, so people tend to use small batch size and extremely low learning rate instead

#

so yeah, the quality of the model is probably better with a larger batch size for the same number of steps. But of course, training time also increases a lot.
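
A toy simulation of why averaging over a batch makes the update less noisy. It assumes each image contributes a "gradient" that is a true value of 1.0 plus independent Gaussian noise, which is a simplification of real training. The function name `gradient_spread` and all numbers are made up for illustration.

```python
import random
import statistics


def gradient_spread(batch_size, trials=2000, seed=0):
    """Standard deviation of the batch-averaged toy gradient over many batches."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        # One noisy per-image gradient: true value 1.0, noise std 5.0.
        batch = [rng.gauss(1.0, 5.0) for _ in range(batch_size)]
        means.append(sum(batch) / batch_size)
    return statistics.pstdev(means)


spread_by_batch_size = {b: gradient_spread(b) for b in (1, 4, 16)}
```

For independent noise, the spread shrinks roughly with the square root of the batch size, so going from batch size 1 to 16 cuts the noise to about a quarter.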

tall condor May 15, 2023, 12:08 PM

#

warm agate do you know any tool which can auto-caption images, as I am trying to train a la...

its very simple, just use DeepDanbooru

#

it will create tags for the images as txt file

warm agate May 15, 2023, 12:09 PM

#

But the results aren't that reliable for landscape photography

tall condor May 15, 2023, 12:09 PM

#

i did some tests the other day and it wasnt too bad

#

its better than nothing

tall condor May 15, 2023, 12:26 PM

#

stiff dust during training the point is rather that multiple images are trained together in...

also the info i read so far is basically the opposite, ppl recommend batch sizes of 2 or less.

stiff dust May 15, 2023, 12:27 PM

#

every paper I read used batch size of 20 or more

#

as said: on consumer hardware you cannot use large batch size, so people have to use super small ones

tall condor May 15, 2023, 12:28 PM

#

i think it also matters much if you have alot of images of the same type or very few images

stiff dust May 15, 2023, 12:28 PM

#

tall condor its very simple, just use DeepDanbooru

this is only for anime, right? For everything else you would have to use a CLIP interrogator or a BLIP model

tall condor May 15, 2023, 12:28 PM

#

if you have only a few images per concept then i think mixing them up will really give bad results. but if you have 100 of one it will not matter

tall condor May 15, 2023, 12:28 PM

#

stiff dust this is only for anime, right? For everything else you would have to use a CLIP ...

i use it for everything. the results are not bad

stiff dust May 15, 2023, 12:28 PM

#

tall condor i think it also matters much if you have alot of images of the same type or very...

not really. Even if you have only one image using batching makes sense, because it stabilizes the noise sampling

tall condor May 15, 2023, 12:29 PM

#

like this example of querty

#

i find the result pretty fair

#

never tested it on anime really. i just didnt find anything better yet

stiff dust May 15, 2023, 12:30 PM

#

I would say CLIP interrogator is WAY better

tall condor May 15, 2023, 12:30 PM

#

do you have a link?

stiff dust May 15, 2023, 12:30 PM

#

I just used the webui by vladic. It has it builtin

#

but I guess you can also download it as a separate extension for auto1111

tall condor May 15, 2023, 12:31 PM

#

thanks i will test that

#

i saw huge improvements in the model when adding more detailed tagging so i guess the better the tags are the better

tall condor May 15, 2023, 12:34 PM

#

stiff dust not really. Even if you have only one image using batching makes sense, because ...

so basically youre saying to use as high a batch count as possible? but if the change in the model is the average over the batch and i only have 1 image of a type, doesnt that kind of kill the idea unless you run like 1000 epochs with a very low learning rate?

stiff dust May 15, 2023, 12:35 PM

#

no, because you always sample a timestep. Let's say you sample a timestep at 1% then the image is completely noise and the model only learns rough shapes. If you sample at 99% then the image is almost perfect and the model just learns the fine details and textures

#

thus, what the model learns is completely different

tall condor May 15, 2023, 12:36 PM

#

what do you mean with sample at timestep?

stiff dust May 15, 2023, 12:37 PM

#

when you generate an image you create a random noise image and then step by step denoise it. You can watch this process in the webui

#

when you do training, you do not start from pure noise. Instead you draw a random time step. Lets say you draw the time step at 50%, then you add as much noise to the image as it would look like after doing 50% of the steps. The model then only removes as much noise as a single step
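
The training step described above can be sketched like this. It assumes the standard DDPM forward process with a plain linear beta schedule (1000 steps, 1e-4 to 0.02), which is a textbook choice and not necessarily the exact schedule SD uses. A flat list of floats stands in for the image, and all function names are illustrative.

```python
import math
import random


def make_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal fraction for each timestep (near 1.0 = almost clean)."""
    alpha_bars = []
    product = 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        product *= 1.0 - beta
        alpha_bars.append(product)
    return alpha_bars


def noisy_training_example(pixels, alpha_bars, rng):
    """Draw a random timestep and noise the image to that level in one go."""
    t = rng.randrange(len(alpha_bars))
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    keep = math.sqrt(alpha_bars[t])       # how much of the image survives
    add = math.sqrt(1.0 - alpha_bars[t])  # how much noise is mixed in
    noisy = [keep * x + add * n for x, n in zip(pixels, noise)]
    # The model is shown (noisy, t) and trained to predict `noise`.
    return t, noisy, noise
```

In this indexing a high `t` means more noise, which corresponds to an early point in generation. Because `t` is drawn fresh every time, the same image yields a different training target on each visit, which is why batching helps even with a single image.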

tall condor May 15, 2023, 12:39 PM

#

i just dont understand what this has to do with the batch size

#

if you mix up the training result of lets say 6 images that are all completely different rather than doing one at a time, i would expect the result to be completely different

#

isnt it?

stiff dust May 15, 2023, 12:42 PM

#

no, it makes it better

#

if you train 6 different images at the same time, each of them will have a very different gradient

#

averaging them is good, cause it makes the gradient more stable

#

in the end you want the model to train on ALL concepts, thus, you do not want it to overfit on a single image

tall condor May 15, 2023, 12:44 PM

#

it is only good if you train for one single concept

#

but if you would want to train a frog and a cat in one concept

#

i dont see how that would be beneficial

#

unless you want a catfrog as result

#

no?

stiff dust May 15, 2023, 12:45 PM

#

the opposite is the case

#

if you train ONLY on a cat, then this will override the model's ability to do something else. Like it will forget how to draw anything except a cat

#

if you train a model for hundreds of thousands of steps only on cat images, it will then only be able to generate cat images

#

the same happens with frog images

#

so training on 100,000 cat images basically destroys the model.

#

but lets say you train on 100,000 cat and 100,000 frog images. Now, order is important. If you train first on the cat images the model is already destroyed

#

but if you train on one cat then a frog then a cat and so on, then the model will never forget what a frog and what a cat is, because you "remind it every step"

#

that's also the reason why we use regularizer images when training

#

this example is not directly about batch size. I just want to demonstrate why it is a good thing to have variety during training

#

batch size increases variety
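
The cat/frog ordering argument can be sketched with plain lists. This is only an illustration of the two orderings, with made-up helper names; `longest_run` measures how long the model would see a single concept without being reminded of the other:

```python
from itertools import chain, zip_longest


def sequential(cats, frogs):
    """All cats first, then all frogs (the ordering that forgets)."""
    return list(cats) + list(frogs)


def interleaved(cats, frogs):
    """Alternate cat, frog, cat, frog, ... keeping leftovers at the end."""
    skip = object()
    pairs = zip_longest(cats, frogs, fillvalue=skip)
    return [x for x in chain.from_iterable(pairs) if x is not skip]


def longest_run(labels):
    """Length of the longest stretch of identical consecutive labels."""
    best = run = 0
    prev = object()
    for label in labels:
        run = run + 1 if label == prev else 1
        best = max(best, run)
        prev = label
    return best
```

With three cats and three frogs, the sequential order has runs of three identical labels while the interleaved order never repeats a label twice in a row.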

tall condor May 15, 2023, 12:51 PM

#

what you are saying makes a lot of sense. but i still think you need to compensate for the bigger batch size with a lower learning rate and a higher number of runs if you are mixing concepts

stiff dust May 15, 2023, 12:51 PM

#

yes, higher batch size means you need more epochs

#

learning rate, however, can rather be increased

tall condor May 15, 2023, 12:52 PM

#

ah yes because you are learning a mix

#

which is less critical than learning too much of 1 image

#

makes sense

stiff dust May 15, 2023, 12:53 PM

#

I mean, in general you could think of batch size as a purely performance thing. Increasing batch size by 10 times means you can increase learning rate by 10 times. Training is faster because it can be parallelized more easily

#

but in reality, batch size also stabilizes training. Too high a batch size makes the gradient too stable, too low a batch size makes the gradient too unstable. Somewhere in the middle is the sweet spot
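
The "10 times the batch size, 10 times the learning rate" idea is the linear scaling heuristic. A small sketch, with a square-root variant added as a more conservative alternative; both are rules of thumb, and the function names are illustrative:

```python
import math


def linear_scaled_lr(base_lr, base_batch, new_batch):
    """Scale the learning rate in proportion to the batch size."""
    return base_lr * new_batch / base_batch


def sqrt_scaled_lr(base_lr, base_batch, new_batch):
    """More cautious variant: scale with the square root of the ratio."""
    return base_lr * math.sqrt(new_batch / base_batch)
```

For example, going from batch size 1 at 1e-6 to batch size 10 gives 1e-5 under the linear rule and roughly 3.2e-6 under the square-root rule. Treat either value as a starting point to test, not a guarantee.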

tall condor May 15, 2023, 12:56 PM

#

what batch size do you recommend for "normal ppl"? you think 6 is too high?

stiff dust May 15, 2023, 12:56 PM

#

but in deep learning on images, memory requirements are so insanely high that we never even reach this sweet spot. We can only use very low batch sizes. So I would say, take the largest batch size that fits in your memory

tall condor May 15, 2023, 12:58 PM

#

as for the regularization images, maybe you can help me with that. if i want to finetune a concept for, let's say, a swimming cat, can i still use a regular cat as regularization?

#

or do i need to have regularization images of swimming cats

#

and then if i want to train for a swimming cat with black fur, do i need cats with black fur as regularization or can i still use the "regular cat"

stiff dust May 15, 2023, 1:02 PM

#

I mean, you can do all these things. If all your images show swimming cats then also showing regular cats during training is a good idea to prevent overfitting

#

but concepts like "swimming cat" can usually be trained purely on textual inversion

#

or just by finetuning the text encoder

#

then you don't need regularization images at all I would say

tall condor May 15, 2023, 1:03 PM

#

i am getting quite good results for most concepts, but i do see quite some overfitting for other concepts, especially the ones that have very few images

gentle osprey May 15, 2023, 3:17 PM

#

stiff dust or just by finetuning the text encoder

Any good guides on this

weary locust May 15, 2023, 3:23 PM

#

If I train a textual inversion embedding on a new concept, is it possible to train a better LORA using that specific embedding? Can custom embeddings in an interface like automatic1111 affect the training process or are they ignored?

gentle osprey May 15, 2023, 4:14 PM

#

weary locust If I train a textual inversion embedding on a new concept, is it possible to tra...

an embedding isn't part of a model, it's a matrix that you feed a model, so no, they wouldn't have an effect on training a model unless you used images containing that embedding

#

when generating regularization images, do i want a wide distribution of sampling steps?

#

same question for CFG scale

weary locust May 15, 2023, 4:52 PM

#

gentle osprey an embedding isn't part of a model, it's a matrix that you feed a model, so no t...

Thanks for the clarification. One more question? So hypothetically if the images contained the custom embedding, and the embedding is loaded into the embeddings folder (within Automatic1111 or comparable webui), would it append the custom vector representation to the corpus of tokens when training a LoRA for an additional boost to fine tuning? Or would there need to be missing functionality added on to achieve this in the form of an extension?

gentle osprey May 15, 2023, 5:00 PM

#

weary locust Thanks for the clarification. One more question? So hypothetically if the images...

It would treat it like any other image you used to train LoRA.

weary locust May 15, 2023, 5:29 PM

#

gentle osprey It would treat it like any other image you used to train LoRA.

Meaning the token would get preprocessed as one of the input vectors or dropped while retaining only the tokens that are part of the base model?

gentle osprey May 15, 2023, 5:58 PM

#

There'd be no token. If you're feeding it images created with an embedding it will treat it like any other image and have no knowledge of the token. At that point it's just pixels. Like the token isn't something the model knows so there's no association between what you feed it and the token.

#

You could specify that token if you wanted the model to associate those images with that token, but it would be an approximation of the original.

sonic narwhal May 15, 2023, 8:47 PM

#

I downloaded instaloader using these 3
-m pip install instaloader
pip3 install instaloader
pip install instaloader

But when I run a script that has
import instaloader

it returns with "ModuleNotFoundError: No module named 'instaloader'" Why is this and how to fix?

chrome breach May 16, 2023, 6:43 AM

#

sonic narwhal I downloaded instaloader using these 3 -m pip install instaloader pip3 install i...

Did you do a runtime restart after the pip install?

sonic narwhal May 16, 2023, 7:21 AM

#

"python -m pip install instaloader" I did this today this morning and it fixed it

stiff dust May 16, 2023, 7:34 AM

#

@weary locust In principle you can use embeddings for LoRA and it totally makes sense to do so. Whether the input embedding is used within the LoRA training depends on the implementation you use - would have to check the source code.

#

note that every good implementation of LoRA or dreambooth does textual inversion as a first step anyway. So before even starting to train a LoRA, a textual inversion is usually trained first

warm agate May 16, 2023, 9:34 AM

#

stiff dust but guess you can also download it as separate extension for auto111

Can you please share the link

sonic narwhal May 16, 2023, 9:52 PM

#

sonic narwhal "python -m pip install instaloader" I did this today this morning and it fixed i...

actually this only fixed it on the laptop but problem remains on pc

sonic narwhal May 16, 2023, 9:53 PM

#

chrome breach Did you do a runtime restart after the pip install?

How would I do this?

import os
os.kill(os.getpid(), 9)

Input something like that into the script I want to run?

tall condor May 16, 2023, 11:14 PM

#

can anyone recommend an interrogator like CLIP that works standalone with scripts

#

rather than a python module

#

DeepDanbooru works ok but the tags are sometimes really far off

idle crown May 17, 2023, 1:22 AM

#

Hey! I have a total noob question. I have a bunch of custom shapes - I want to have a finetuned SD model that will understand the design style of the shapes and generate a custom shape for any scene I describe - for example : if i give a prompt - chilling at a beach - the model should maybe output a custom shape created out of a picture of sand or a custom shape created out of a picture of sea etc.

#

can anyone please guide me on how I can create a model like this ?

surreal lagoon May 17, 2023, 2:52 AM

#

is there a script that uses captions instead of just a single instance prompt?

#

i can load BLIP and interrogate, as i've got 80G

chrome breach May 17, 2023, 4:47 AM

#

sonic narwhal How would I do this? import os os.kill(os.getpid(), 9) Input something like th...

No, restarting a kernel is done when you use ipynb (either on local system or colab).
When you run a script, that can't happen.

#

Is it that you are using the library in a virtual env, but installed the library outside of that env?
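
A common cause of this error is that `pip` installed the package into a different Python than the one running the script. A small check using only the standard library; the helper names are made up for illustration:

```python
import importlib.util
import sys


def describe_module(name):
    """Report which interpreter is running and whether it can see `name`."""
    spec = importlib.util.find_spec(name)
    return {
        "interpreter": sys.executable,
        "module": name,
        "found": spec is not None,
        "location": getattr(spec, "origin", None),
    }


def pip_command(package):
    """Build a pip call bound to the interpreter that runs this script."""
    return [sys.executable, "-m", "pip", "install", package]


if __name__ == "__main__":
    print(describe_module("instaloader"))
    print(" ".join(pip_command("instaloader")))
```

Running this with the same command used to start the failing script shows which interpreter is active. If `found` is false, installing with the printed command targets exactly that interpreter, which matches the `python -m pip install` form that fixed it on the laptop.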

indigo orbit May 17, 2023, 8:44 AM

#

Hello: Lost my upscalers recently could not find upscaler named R-ESRGAN 4x+, using None as a fallback

gentle osprey May 17, 2023, 2:11 PM

#

idle crown can anyone please guide me on how I can create a model like this ?

Look up how to create a style LoRA.

sonic narwhal May 17, 2023, 2:57 PM

#

chrome breach Is it that you are using the library in a virtual env?? but installed the librar...

no, i'm not using a virtual environment but i'll try that when I get home.

#

Are you guys using any tools for text file batch editing and management?

E.g. now, after scraping instagram, I want to batch edit all text files: remove certain parts of the text, replace hashtags with commas etc.
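
For this kind of caption cleanup a short script is often enough. A sketch under the assumption that captions live in `.txt` files next to the images; `clean_caption`, `clean_folder` and the phrases to remove are hypothetical and need adapting:

```python
import re
from pathlib import Path


def clean_caption(text, remove=()):
    """Drop unwanted phrases and turn '#tag' into comma-separated tags."""
    for phrase in remove:
        text = text.replace(phrase, "")
    # "photo #sea #sun" becomes "photo, sea, sun"
    text = re.sub(r"\s*#(\w+)", r", \1", text)
    parts = [part.strip() for part in text.split(",")]
    return ", ".join(part for part in parts if part)


def clean_folder(folder, remove=()):
    """Rewrite every .txt caption in `folder` in place."""
    for path in Path(folder).glob("*.txt"):
        original = path.read_text(encoding="utf-8")
        path.write_text(clean_caption(original, remove), encoding="utf-8")
```

Since `clean_folder` overwrites files, it is safer to try it on a copy of the dataset first.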

echo shuttle May 17, 2023, 3:19 PM

#

hey, my LoRAs are all a bit too strong and look baked at strength 1, but usually work perfectly at around 0.6 to 0.8. Are there parameters I can adjust to make them a bit weaker so they work at 1.0? (I crop them, preprocess them with booru tags, remove tags i want my model to be associated with, add a trigger word to every file and then train it with usually 100 iterations each)

sonic narwhal May 17, 2023, 3:21 PM

#

U can make it output multiple versions at different points of training. Then do a test gen with each model and select the one that is not overtrained

echo shuttle May 17, 2023, 3:22 PM

#

hmm lemme find that setting

#

is it save every N steps?

sonic narwhal May 17, 2023, 3:23 PM

#

Save every N epochs

#

if ur using kohyass

echo shuttle May 17, 2023, 3:24 PM

#

yes i am

#

does 1 mean its only one model at the end cuz thats default?

#

these are my usual settings

sonic narwhal May 17, 2023, 3:38 PM

#

ah yea then it will be only one model. Maybe it's possible to set the save every N to a number below 1?

echo shuttle May 17, 2023, 3:39 PM

#

well it lets me input 0.1, lemme run it and see

#

while i wait, will my models improve if i use lycoris instead of lora on the same settings?

#

i shouldn't have picked one with 2.5k steps

echo shuttle May 17, 2023, 4:09 PM

#

ok 0.1 also only gave me 1 model

sonic narwhal May 17, 2023, 4:11 PM

#

hm ok

echo shuttle May 17, 2023, 4:12 PM

#

trying 10 now

echo shuttle May 17, 2023, 4:39 PM

#

10 gave me 1 model

#

what is your purpose slider

#

aaaaaa

honest nexus May 17, 2023, 4:39 PM

#

Somebody having problems with kohya dreambooth google colab notebook?

#

seems not working anymore

echo shuttle May 17, 2023, 4:39 PM

#

did you use it a lot recently?

#

i dont use colab but ive read that it starts to lock you out for a certain time if you use it too much, and that time gets longer the more you use it

honest nexus May 17, 2023, 4:48 PM

#

I use multiple google accounts

#

looks like a huggingface problem

echo shuttle May 17, 2023, 4:49 PM

#

did you try a different colab?

honest nexus May 17, 2023, 4:50 PM

#

i have used this https://colab.research.google.com/github/Linaqruf/kohya-trainer/blob/main/kohya-LoRA-dreambooth.ipynb#scrollTo=-Z4w3lfFKLjr

swift turret May 17, 2023, 10:16 PM

#

Hey, don't know if this is the right channel to ask, but do you usually prefer to use batch count or seed variation strength?

hexed cypress May 18, 2023, 8:41 AM

#

Does anyone have any advice in terms of settings? I am honestly looking to fine tune it a bit as I feel like things are a bit too deformed sometimes

dark gull May 18, 2023, 12:45 PM

#

swift turret Hey dont know if this the right channel to ask but do you usually prefer to use ...

I can give no good assistance on this but I have a question regarding these elements. If you assign a seed number to a batch of images, why do they still generate differently? Aren’t they sharing the same seed data?

stiff dust May 18, 2023, 1:12 PM

#

seed is just for initializing the random generator

#

the noise in each image is still different, the same way that not every pixel in the noise image is the same
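
A generic sketch of that point with Python's own random generator. This is not how any particular web UI handles seeds (some derive a separate seed per image); it only shows that one seed fixes the whole stream of random numbers, not each image's content:

```python
import random


def batch_noise(seed, batch_size, pixels_per_image):
    """Draw noise for a whole batch from one seeded generator."""
    rng = random.Random(seed)
    return [
        [rng.gauss(0.0, 1.0) for _ in range(pixels_per_image)]
        for _ in range(batch_size)
    ]
```

Calling `batch_noise(42, 2, 4)` twice returns the same batch both times, yet the two images inside the batch differ from each other because they consume different parts of the random stream.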

gentle osprey May 18, 2023, 2:47 PM

#

for dreambooth training, what batch size can i use with a 4090

turbid shuttle May 18, 2023, 3:07 PM

#

New to Training, in Dreambooth do I need a text file corresponding to each image?

#

Or is there a better training tool?

gentle osprey May 18, 2023, 3:27 PM

#

turbid shuttle New to Training, in Dreambooth do I need a text file corresponding to each image...

only if you're using LoRA

gentle osprey May 18, 2023, 3:27 PM

#

turbid shuttle Or is there a better training tool?

every training tool has its quirks

#

so better is hard to quantify

turbid shuttle May 18, 2023, 3:28 PM

#

gentle osprey every training tool has its quirks

Can I use Dreambooth to create a model from scratch? If so do I need the txt files for that?

#

For context, my first attempt used a source model; it's hard to tell the difference in the end results

gentle osprey May 18, 2023, 3:29 PM

#

like with no base model? i'd imagine that's something you could only do if you had a server farm at your disposal

#

the base models are trained on absolutely massive data sets

turbid shuttle May 18, 2023, 3:30 PM

#

gentle osprey the base models are trained on absolutely massive data sets

So if I understand correctly, when I use a base Model I'm essentially creating a LoRA?

gentle osprey May 18, 2023, 3:31 PM

#

that's how each training approach works

turbid shuttle May 18, 2023, 3:34 PM

#

gentle osprey

Wow, I've got a lot of learning to do 😲

gentle osprey May 18, 2023, 3:34 PM

#

hahaha, oh yeah

#

there's a masssssssssssssiiiiiiiiivveee learning curve

#

you can get started relatively quickly

#

like took me about two weeks of tinkering to get decent results

#

but the ceiling of what you can learn is really high and always going up

turbid shuttle May 18, 2023, 3:35 PM

#

Unfortunately I started on MidJourney, now I am trying to get the same quality of results

gentle osprey May 18, 2023, 3:37 PM

#

midjourney is definitely a more user friendly product, but less versatile/customizable

turbid shuttle May 18, 2023, 3:37 PM

#

gentle osprey midjourney is definitely a more user friendly product, but less versatile/custom...

and $$$

gentle osprey May 18, 2023, 3:37 PM

#

hahah that too

turbid shuttle May 18, 2023, 3:37 PM

#

Plus the NSFW filters, I type in something harmless and end up having to appeal it 🙄

#

Okay, Ill keep reading, thank you!

tall condor May 18, 2023, 4:15 PM

#

is there any way to find out what is causing my model to overfit?

#

what is happening is that my model is learning strange patterns and applying them

#

is there any mechanism to avoid those patterns?

gentle osprey May 18, 2023, 4:16 PM

#

tall condor is there any mechanism to avoid those patterns?

i wonder if you could use those weird patterns to create a negative embedding

#

or a LoRA and then add a negative weight to it

tall condor May 18, 2023, 4:17 PM

#

as for regularisation images: what happens if they are from a completely different base model like DreamShaper instead of 1.5? does it matter? also how closely do regularisation images have to fit the actual concept i want to train? how far off can they be?

#

how would i filter that pattern and create a negative embedding? also i would prefer to avoid them in the first place

gentle osprey May 18, 2023, 4:17 PM

#

tall condor as for regularisation images: what happens if they are from a completely differe...

pretty sure you want to use regularization images generated by the model you're training on

tall condor May 18, 2023, 4:18 PM

#

what happens if i don't? 1.5 is not even getting close to the concepts i want to train on so it's gonna be hard to create regularization images

gentle osprey May 18, 2023, 4:18 PM

#

tall condor how would i filter that pattern and create a negative embedding? also i would pr...

you'd take a bunch of images with that pattern and create an embedding then add that embedding as a negative prompt

#

it's definitely a kludgy solution though

tall condor May 18, 2023, 4:19 PM

#

i'm using kohya ss, which runs images multiple times based on the weights i apply

#

and i guess that i just run certain images too often, but the pattern is so unclear that it's hard for me to identify which images are causing it

gentle osprey May 18, 2023, 4:21 PM

#

tall condor what happens if i dont? 1.5 is not even getting close to the concepts i want to ...

yeah i'm honestly not sure how you'd approach that

tall condor May 18, 2023, 4:21 PM

#

what exactly does dreamstudio do with the regularisation images?

#

is it like a filter or is it like a correction?

gentle osprey May 18, 2023, 4:22 PM

#

dreamstudio or dreambooth?

tall condor May 18, 2023, 4:23 PM

#

dreambooth sorry

gentle osprey May 18, 2023, 4:24 PM

#

regularization images are there to keep your training data from having too much of an effect on the base classes. like if you're trying to teach your model a specific instance of a car, regularization will ensure that all cars don't start looking like your car

gentle osprey May 18, 2023, 6:35 PM

#

do you guys use classification image negative prompts for dreambooth training?

tall condor May 18, 2023, 6:52 PM

#

anybody used regularization images from different models before?

tall condor May 18, 2023, 7:51 PM

#

anyone know how to avoid strange patterns when finetuning? it's like mixing multiple concepts, so that the result is neither one nor the other

#

also is there any interpretation of loss?

echo shuttle May 18, 2023, 10:19 PM

#

is there a setting for kohya ss lora training to give me a model after every n% of being done so that i can pick the one that is not overtrained?

tall condor May 18, 2023, 11:07 PM

#

i think you can set that in advanced options

#

i know there is for dream and as far as i saw all the settings are identical so there must be

#

hi guys. what is the point of reducing the learning rate in the later stage of training, with the cosine LR scheduler for example? i see that it's pretty standard but i don't understand what the benefit is

#

does it like allow to run longer and pick up more details or so?

#

or can i just use constant with warmup and train until the model performs best
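
The two schedules being compared can be written out as plain functions. This is a sketch of the usual shapes (linear warmup, then either a flat rate or a half-cosine decay to zero); trainers differ in details, so the exact formulas here are assumptions:

```python
import math


def constant_with_warmup(step, base_lr, warmup_steps):
    """Ramp up linearly, then hold the learning rate flat."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    return base_lr


def cosine_with_warmup(step, base_lr, warmup_steps, total_steps):
    """Ramp up linearly, then decay along a half cosine down to zero."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))
```

With cosine, late steps make smaller and smaller updates, which tends to settle the weights. With a constant rate every step keeps the same size, so picking the best intermediate checkpoint matters more.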

sonic narwhal May 19, 2023, 6:27 AM

#

gentle osprey pretty sure you want to use regularization images generated by the model you're ...

I've heard you want to use the best possible images, so it doesn't matter where they are from

chrome breach May 19, 2023, 9:09 AM

#

tall condor as for regularisation images: what happens if they are from a completely differe...

Had a similar doubt... Does anyone have an idea how real class images would work (not generated from SD, but actual real pics picked manually)?

stiff dust May 19, 2023, 10:01 AM

#

actually, I would use real class images if available

#

in particular for people, as they don't have strange artefacts and deformed hands

#

I would say it's less important that the regularization images come from SD itself. It's more important that they have high diversity and that you use every reg image only once or a few times

chrome breach May 19, 2023, 11:23 AM

#

I see...

#

U got any ideas on how to NOT let the model overfit on the training data? (with regularization images made from SD itself in the usual way)

tall condor May 19, 2023, 12:16 PM

#

@stiff dust do you know how well the regularization images have to fit the concept i want to train? or can they be very general?

stiff dust May 19, 2023, 12:17 PM

#

I think the idea is to give them almost the same caption as your training images. However, the best is probably to use random LAION subsets - so they can also be very general

tall condor May 19, 2023, 12:18 PM

#

anyone know if kohya ss buckets regularisation images?

#

kaibioinfo do you have any comment on my question regarding learning rate?

tall condor May 19, 2023, 8:26 PM

#

so it appears that a lower learning rate (5e-7) in combination with a constant scheduler (with warmup) does much better than 1e-6 with cosine

#

may i ask what settings you guys are using?

gloomy pike May 19, 2023, 9:45 PM

#

in automatic1111's web ui does anyone know how to make batch with masks do "only mask" instead of "whole image"? is there an argument I could use to make only mask the default?

#

I guess I found a way, because txt2mask uses the inpaint settings instead of the batch settings when masking and generating from batch. I would like to use the masks generated with the batch though, so I could use the xyz script instead of txt2mask.

#

Idk, I guess the batch is using inpaint settings too. I just tested with a generated mask. Idk, the last time I tried, it would always use whole image regardless of the setting in the inpaint panel.

#

maybe it's because this time I selected it in inpaint upload instead of regular inpaint? Is that correct???

tall condor May 19, 2023, 10:10 PM

#

how many regularisation images should i provide per concept?

#

i read online it should be 200 per training image, is that correct?

devout dome May 19, 2023, 10:27 PM

#

gloomy pike maybe it's because this time I selected it in inpaint upload instead of regular...

When you use "only mask", what do you have the padding set at?

gloomy pike May 20, 2023, 1:04 AM

#

devout dome When you use "only mask", what do you have the padding set at?

default, I've checked, inpaint upload settings are used for batch stuff and txt2mask.

I've run into a new issue now though. I can't use only mask with batch because when I do it never changes the mask to the next one along with the corresponding images, I think the masks work normal in batch when it is generated based on whole image but when I use only mask it keeps the first mask and goes through all the different images with the same first mask. They are all named the same as their corresponding images.

finite creek May 20, 2023, 7:04 AM

#

Hello everyone, after training on 1.5 for a while I’m trying 2.1. Does anybody know if there are specific parameters to take into account that would be different from 1.5?

stiff dust May 20, 2023, 7:29 AM

#

tall condor i read online it should be 200 per training image, is that correct?

ideally one per epoch

#

one per epoch per training image

#

but if you have fewer it's not that bad

tall condor May 20, 2023, 10:09 AM

#

ok so if i run 100 epochs i should have 100 * number of images in concept * number of concepts, right

#

so 70k images xDD

#

700k sorry xD
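
The "one per epoch per training image" rule is simple arithmetic. A sketch with a hypothetical `max_reuse` knob for showing each regularization image more than once; the 7,000 training images below are only an assumed figure that happens to reproduce the 700k total:

```python
import math


def reg_images_needed(epochs, training_images, max_reuse=1):
    """Regularization images needed if each one is shown at most max_reuse times."""
    return math.ceil(epochs * training_images / max_reuse)
```

`reg_images_needed(100, 7000)` gives 700000, and allowing each image to be reused four times brings that down to 175000.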

turbid shuttle May 20, 2023, 12:25 PM

#

Inpaint/Sketch: I've noticed the masked area doesn't reset when I add a new image. The paint color is removed but the mask is still there. Anyone else have this issue or a different inpaint/sketch tool?

#

Also after a few uses during the session it begins to alter areas never masked

serene flicker May 20, 2023, 3:34 PM

#

finite creek Hello everyone, after training on 1.5 for a while I’m trying 2.1. Does anybody k...

2.1 trains faster than 1.5, so probably a lower learning rate, train the text encoder completely just to improve that, and save the model more often for testing when its done to minimize the risk of overtraining and having to restart.

surreal lagoon May 20, 2023, 9:15 PM

#

has anyone tried altering the sort of the images pulled in by the dreambooth script, to sort by atime in reverse order eg. the oldest images / least-touched images go first?

hot breach May 21, 2023, 4:05 AM

#

I randomly shuffle the entire data set every epoch in ED2

#

the main gain may just be putting different images together in batches every epoch, more than the order itself

#

the original dreambooth repos based on xavier's repo only have batch size of 1 or 2 maybe on 24gb so it may be a bit moot

stiff dust May 21, 2023, 8:38 AM

#

tall condor 700k sorry xD

if you have so many images and concepts you probably don't need regularization images.

surreal lagoon May 21, 2023, 9:22 AM

#

^ this

#

i'm using about 122,000 images right now and it's just... amazing. they're well-tagged and varied

warm agate May 21, 2023, 10:27 AM

#

stiff dust if you have so many images and concepts you probably don't need regularization i...

What are regularization images?

surreal lagoon May 21, 2023, 6:24 PM

#

they are also called 'class images'

hot breach May 21, 2023, 7:42 PM

#

warm agate What are regularization images?

two techniques: dreambooth uses SD-generated images mixed in with training, or if fine tuning you can mix in your own. you can use multiply.txt in ED2 to load those images less frequently compared to training images

#

the purpose either way is to avoid overfitting to your training data, i.e. "remind the model what it already knows" so it doesn't forget, in very hand-wavy terms

warm agate May 21, 2023, 7:47 PM

#

hot breach two techniques, dreambooth uses SD generated images mixed in with training, or i...

So multiply.txt in ED2 would use our selected images in between some random images to train it better so that the model is not just limited to the trained images?

warm agate May 21, 2023, 7:47 PM

#

warm agate So multiply.txt in ED2 would use our selected images in between some random imag...

Its the same in DB?

hot breach May 21, 2023, 7:48 PM

#

dreambooth has a particular way it pairs the training images and regularization images, it pairs them up every step

#

ED2 does not, it's a random shuffle of all the data; no distinction is made inside the software on what "regularization" even means

warm agate May 21, 2023, 7:49 PM

#

hot breach dreambooth has a particular way it pairs the training images and regularization ...

Oh if the batch size is 10, then it uses 5 desired images and 5 regular images?

hot breach May 21, 2023, 7:49 PM

#

in dreambooth, batch size 1 would mean 1 training image and 1 regularization image are in 1 batch/step
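
The pairing can be sketched as follows. This is an illustration of the idea, not code from any DreamBooth repo: each step takes `batch_size` training items plus the same number of regularization items, and the two losses are summed with a weight, named `prior_loss_weight` here as an assumption:

```python
from itertools import cycle


def paired_batches(instance_items, class_items, batch_size=1):
    """Yield (training items, regularization items) pairs, one per step."""
    class_iter = cycle(class_items)  # reuse class images if they run out
    for start in range(0, len(instance_items), batch_size):
        instance_batch = instance_items[start:start + batch_size]
        class_batch = [next(class_iter) for _ in instance_batch]
        yield instance_batch, class_batch


def combined_loss(instance_loss, prior_loss, prior_loss_weight=1.0):
    """Total loss for a step: subject loss plus weighted regularization loss."""
    return instance_loss + prior_loss_weight * prior_loss
```

So a nominal batch size of 1 really processes two images per step, which is also why regularization roughly doubles the memory and time per step.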

warm agate May 21, 2023, 7:49 PM

#

hot breach in dreambooth, batch size 1 would mean 1 training image and 1 regularization ima...

Oh ok makes sense

warm agate May 21, 2023, 7:50 PM

#

hot breach ED2 does not, its random shuffle of all the data, no distinction is made inside ...

Thats the only diff b/w ED2 and DB?

hot breach May 21, 2023, 7:50 PM

#

in ED2 it's random selection, everything is just shuffled together, you can just sort of simulate the dreambooth thing though

surreal lagoon May 21, 2023, 7:50 PM

#

the typical setup for DB has a lot of 'repeats' on your training data and fewer repeats on the class data

warm agate May 21, 2023, 7:50 PM

#

hot breach in ED2 its random selection, everything is just shuffled together, you can just ...

Where are regularization images extracted from?

hot breach May 21, 2023, 7:50 PM

#

ED2 was built for general fine tuning, dreambooth is a specific technique, so ED2 is more general training and fine tuning

warm agate May 21, 2023, 7:51 PM

#

surreal lagoon the typical setup for DB has a lot of 'repeats' on your training data and fewer ...

What is class data?

surreal lagoon May 21, 2023, 7:51 PM

#

they're generated by the checkpoint you're training from.

hot breach May 21, 2023, 7:51 PM

#

in dreambooth the typical technique is regularization images are generated from SD itself, they are inference outputs

warm agate May 21, 2023, 7:51 PM

#

Oh ok

surreal lagoon May 21, 2023, 7:51 PM

#

they usually look like total garbage but it depends on the checkpoint

warm agate May 21, 2023, 7:51 PM

#

So they use the base SD's images as their input for regularization images

hot breach May 21, 2023, 7:51 PM

#

I'm not a fan myself, when you can source LAION or FFHQ or COCO and use those instead, if you even need regularization at all

surreal lagoon May 21, 2023, 7:52 PM

#

that's highly likely to burn the model if you don't use reggies from the checkpoint, btw

#

at least this is the case for 2.1

#

it possibly isn't for 1.5. i haven't checked

hot breach May 21, 2023, 7:52 PM

#

not much happens if you just train for one character

warm agate May 21, 2023, 7:52 PM

#

hot breach ED2 was built for general fine tuning, dreambooth is a specific technique, so ED...

In my case, I want to train on Landscape photography, so which one do you suggest?

surreal lagoon May 21, 2023, 7:53 PM

#

ah see i've been trying to do generalised fine-tunes and so class data tends to harm my results

hot breach May 21, 2023, 7:53 PM

#

partial freezing of text encoder and using a separate lower LR seem to help with training sd2.1 a lot

warm agate May 21, 2023, 7:53 PM

#

hot breach in dreambooth the typical technique is regularization images are generated from ...

whats the diff b/w training and fine-tuning?

surreal lagoon May 21, 2023, 7:53 PM

#

i just use polynomial learning rate and a high warmup run

#

training and fine-tuning are the same but generally when a distinction IS made, training is understood to be from scratch and fine-tuning is providing specific concepts to a pretrained model to bring those weights up and make it more likely that type of output is produced.

hot breach May 21, 2023, 7:54 PM

#

warm agate whats the diff b/w training and fine-tuning?

in this context nothing. fine tuning usually means training after an initial training session though, and since we're starting with the supplied checkpoints from SAI etc., it's all fine tuning

warm agate May 21, 2023, 7:54 PM

#

hot breach in this context nothing, fine tuning usually means training after an initial tr...

What's SAI?

surreal lagoon May 21, 2023, 7:54 PM

#

StabilityAI

hot breach May 21, 2023, 7:54 PM

#

stability ai

surreal lagoon May 21, 2023, 7:54 PM

#

jinx

warm agate May 21, 2023, 7:55 PM

#

hot breach in this context nothing, fine tuning usually means training after an initial tr...

oh ok, so as we are training after SD's initial data, everything is considered as fine-tuning?

#

Ok

surreal lagoon May 21, 2023, 7:56 PM

#

there's a group that re-trained SD 1.5 on more than 2.9 million images with thoroughly tagged captions and i personally have trouble declaring that 'fine-tuning' considering the extent of catastrophic loss from the original SD 1.5 model but that IS fine-tuning.

hot breach May 21, 2023, 7:56 PM

#

yeah fine tuning is a pretty generic term, I would say dreambooth is a specific technique inside fine tuning for example

surreal lagoon May 21, 2023, 7:56 PM

#

yeah. dreambooth is a subset

hot breach May 21, 2023, 7:56 PM

#

surreal lagoon there's a group that re-trained SD 1.5 on more than 2.9 million images with thor...

"fine tuning" doesn't mean you didn't do a bad job though lol

surreal lagoon May 21, 2023, 7:56 PM

#

for sure, yes

#

and LoRAs generally are a world of their own, with a lot of similarity to Dreamboothing but different training data setup, different learning rate, different impact on each "delta from zero" for each hyperparameter you change

hot breach May 21, 2023, 7:57 PM

#

you can do some pretty amazing things with just a few thousand images though, train entire fictional worlds worth of characters and scenery and stuff, people underestimate how much "room" is in the model to learn

surreal lagoon May 21, 2023, 7:58 PM

#

well the model has a lot of garbage connections

warm agate May 21, 2023, 7:58 PM

#

hot breach yeah fine tuning is a pretty generic term, I would say dreambooth is a specific ...

Can you explain with an example?

hot breach May 21, 2023, 7:58 PM

#

example of dreambooth or of fine tuning?

warm agate May 21, 2023, 7:58 PM

#

hot breach example of dreambooth or of fine tuning?

Well, I would prefer both

surreal lagoon May 21, 2023, 7:59 PM

#

@warm agate a general fine tune will have thousands and thousands of ideally, well-captioned data. this results in a "generalization" of your improvements across all of the tags you had in your training data. this can be MONUMENTAL.

dreambooth is trying to insert a single subject into a model so it can be referenced by a single keyword. in other words, add yourself into your favourite model so you can become a subject in its fantasies.

hot breach May 21, 2023, 8:00 PM

#

"fine tuning" is training an already trained/started model with labeled data (i.e. captioned images), that's the most generic version

#

dreambooth does the same thing, but generally the labels are a fixed word like "xyzbob" or "xyzbob person" and regularization images are also mixed in with just some generic label like "person"
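
A minimal sketch of the caption setup described above, not taken from any particular trainer. The file names and the helper `build_dreambooth_samples` are hypothetical; the only idea shown is that instance images all share one fixed label while regularization images share the generic class label, and both sets are mixed together.

```python
import random

def build_dreambooth_samples(instance_files, class_files,
                             instance_caption="xyzbob person",
                             class_caption="person", seed=0):
    """Pair every file with its fixed caption and shuffle both sets together."""
    samples = [(path, instance_caption) for path in instance_files]
    samples += [(path, class_caption) for path in class_files]
    random.Random(seed).shuffle(samples)  # seeded so the order is reproducible
    return samples

# Hypothetical file names, for illustration only.
samples = build_dreambooth_samples(
    ["bob_01.jpg", "bob_02.jpg"],
    ["reg_001.jpg", "reg_002.jpg", "reg_003.jpg"],
)
for path, caption in samples:
    print(path, "->", caption)
```

A general fine-tune would instead carry a different caption per image.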

#

the point of either is to make the model learn something it doesn't know, can be anything that relates text to a 2D image really, like styles, camera angles, characters, etc

warm agate May 21, 2023, 8:02 PM

#

surreal lagoon @warm agate a general fine tune will have thousands and thousands of i...

so isn't dreambooth like LORA?
Does fine-tuning also extract the elements of the training images and combine an element from each image into a single image?

hot breach May 21, 2023, 8:03 PM

#

LORA is its own thing, it's a trick to try to make training more efficient by training and patching a much smaller submodule, but it isn't actually updating the core model weights at all
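
A toy sketch of the "patching" idea, assuming the usual LoRA formulation where a frozen weight matrix W is combined with a trained low-rank product B·A. The matrices and the helper names below are made up for illustration and use plain Python lists instead of a real model.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_effective_weight(w, b, a, scale=1.0):
    """Return W + scale * (B @ A) as a new matrix; W itself is left untouched."""
    delta = matmul(b, a)
    return [[wij + scale * dij for wij, dij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

# Frozen 3x3 base weight and a rank-1 patch (3x1 times 1x3).
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
B = [[1.0], [0.0], [2.0]]
A = [[0.5, 0.0, 1.0]]

W_eff = lora_effective_weight(W, B, A)
print(W_eff)  # [[1.5, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 3.0]]
print(W[0])   # [1.0, 0.0, 0.0], the base weights are unchanged
```

Only B and A (6 numbers here) would be trained, while the 9 numbers in W stay as they were.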

warm agate May 21, 2023, 8:03 PM

#

hot breach "fine tuning" is training an already trained/started model with labeled data (i....

So finetune allows SD to better compile the available resources?

surreal lagoon May 21, 2023, 8:03 PM

#

yeah, expanding on that last point, you can use Dreambooth to "fix" the model's understanding of a "concept". example: SD 2.1 cannot make aliens.

solution: provide Dreambooth,

the instance prompt "aliens" and class prompt "person"
about 500-3000 training images of different aliens
about 15,000 class images
use a VERY low learning rate, and a LARGE number of steps

and that will overload the 'aliens' keyword with your concept from the training data, usually replacing the astronaut it places under 'alien' by default.

warm agate May 21, 2023, 8:04 PM

#

What are class images?

hot breach May 21, 2023, 8:04 PM

#

warm agate So finetune allows SD to better compile the available resources?

I think fine tuning scales better, and using all real images produces better results as well

surreal lagoon May 21, 2023, 8:04 PM

#

we're going in circles

#

you already asked that

surreal lagoon May 21, 2023, 8:05 PM

#

warm agate So they use the base SD's images as their input for regularization images

here

#

class images = regularization data

#

you have a 'subject' and a 'class' in Dreambooth training. if your subject is Lara Croft, your class is woman

hot breach May 21, 2023, 8:06 PM

#

you pick your class, you could use "person" too but yeah, the idea is the class is some sort of super-class of your trained thing

surreal lagoon May 21, 2023, 8:06 PM

#

if you're trying to improve the anatomy of humans, providing 'hands' as your subject, your class would be 'human anatomy'

#

and good luck with that

hot breach May 21, 2023, 8:06 PM

#

if you are training your pet dog Chewy, your class would probably be "dog" etc

warm agate May 21, 2023, 8:07 PM

#

For example, I input 100 images of 'Forests', 100 images of 'beaches', 100 images of 'sunset' and 100 images of 'camels'.
So with a prompt like An aerial view of a beach during sunset with a dense forest located near the beach, camels approaching the beach through the forest

warm agate May 21, 2023, 8:08 PM

#

warm agate For example, I input 100 images of 'Forests', 100 images of 'beaches', 100 image...

would Finetuning be better at these kinds of tasks?

#

@hot breach @surreal lagoon

surreal lagoon May 21, 2023, 8:11 PM

#

yeah you'd want to provide all of that in a single dataset and put what you would prompt for each image as its caption

#

there's many different training tools and so i can't really provide guidance on how you'd use those captions for training, but i name my files by their prompts, with _ in place of spaces and then in my training code, i replace those with spaces and do a bit more cleanup on it
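
A rough sketch of the filename-as-caption approach described above. The exact cleanup used in that training code isn't shown in the thread, so the steps here (dropping a trailing counter such as `_02`, collapsing repeated spaces) are assumptions, and `caption_from_filename` is a hypothetical helper name.

```python
import re
from pathlib import Path

def caption_from_filename(path):
    """Turn 'a_camel_on_a_beach_02.png' into 'a camel on a beach'."""
    text = Path(path).stem.replace("_", " ")   # underscores stand in for spaces
    text = re.sub(r"\s+\d+$", "", text)        # assumed cleanup: drop a trailing counter
    return re.sub(r"\s+", " ", text).strip()   # collapse any leftover whitespace

print(caption_from_filename("a_camel_walking_on_a_beach_at_sunset_02.png"))
print(caption_from_filename("dense__forest_.jpg"))
```

Note that the counter rule would also strip a real number at the end of a caption, so it only suits datasets where captions never end in digits.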

hot breach May 21, 2023, 8:13 PM

#

dreambooth works fine for training in one person with like 10-30 images, but if you want to train 8 characters and a bunch of screenshots from your favorite TV series all at once, or reform the entire model to be some special style, I don't think dreambooth technique is very helpful

surreal lagoon May 21, 2023, 8:14 PM

#

nah if you do it right you can train a movie into a single keyword and avoid style bleed that you'd see with general fine-tune. it all depends what you're after.

#

i did this with The Hobbit, and "lotr style" would make everything look very, uh, Peter Jacksony

#

but that's when i realised The LOTR movies actually have a terrible style to them

#

🤣

#

i thought the training just didn't work but i went back and looked at the movie and was like holy cow, it really does look like that

#

maybe i can pre-process them with img2img to make them brighter and more vibrant but the movie is dull and grainy and even just straight-up blurry and it makes all of the images you apply the LOTR Style keyword to, appear "decayed"

#

city skyline = vibrant, colourful, alive
city skyline lotr style = Aleppo, Syria

#

i want to try A Scanner Darkly next

warm agate May 22, 2023, 7:29 AM

#

If we train a model with a dataset of 1 million or maybe more as it's easy to get the dataset images of humans, will the faces become way better?

stiff dust May 22, 2023, 8:30 AM

#

yes and you don't need millions. I think most models out there are trained on very few images. You see this, because they tend to generate the same faces over and over

#

so more is better, but a thousand is probably more than enough

#

also there is a limitation. The problem SD has with generating faces of people farther away in the picture (e.g. full body shots) is a limitation of the model rendering in low resolution. You probably won't be able to fix that (except if you train on very high resolution, which takes an insane amount of time and memory)

warm agate May 22, 2023, 8:52 AM

#

stiff dust also there is a limitation. The problem SD has with generating faces of people m...

I am looking for closeups mostly, but I don't think it's hard to find full body shots. I think there are far more selfies than full body images.

warm agate May 22, 2023, 8:52 AM

#

stiff dust also there is a limitation. The problem SD has with generating faces of people m...

Why only high res for full body?

warm agate May 22, 2023, 8:52 AM

#

stiff dust also there is a limitation. The problem SD has with generating faces of people m...

Which method do you suggest to train numerous faces?

stiff dust May 22, 2023, 8:54 AM

#

just check models like icantbelieveitsnottrue

#

they achieve really good faces

stiff dust May 22, 2023, 8:56 AM

#

warm agate Why only high res for full body?

because SD computes in 8 times lower resolution. So if you compute a 512x512 picture the internal latent resolution is 64x64. Now if the face in the original image was 64x64 pixels in size, its internal size is 8x8. This is too few pixels to get the details right
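
The arithmetic above, written out as a small sketch. The factor of 8 is the one quoted in the message; the function names and the 4x upscale example are only illustrative.

```python
VAE_DOWNSCALE = 8  # pixels per latent cell along one side, as quoted above

def latent_side(pixels, factor=VAE_DOWNSCALE):
    """Side length in latent cells for a side length in pixels."""
    return pixels // factor

def face_latent_side(face_pixels, upscale=1, factor=VAE_DOWNSCALE):
    """Latent cells covering a face after the image is enlarged `upscale` times."""
    return (face_pixels * upscale) // factor

print(latent_side(512))                 # 64
print(face_latent_side(64))             # 8
print(face_latent_side(64, upscale=4))  # 32
```

So a face that gets only 8x8 latent cells in a 512x512 render would get 32x32 cells if the same picture were processed at four times the size.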

warm agate May 22, 2023, 8:59 AM

#

Oh, so we have to try like 4096x4096 images?

#

But I don't think we can get such high res images

stiff dust May 22, 2023, 10:09 AM

#

no, you can't. Use upscaling and inpaint, or tiled diffusion. There are many techniques to get higher details and fix artefacts in images

warm agate May 22, 2023, 11:14 AM

#

stiff dust no, you can't. Use upscaling and inpaint, or tiled diffusion. There are many tec...

So upscale all the dataset images to 4096x4096 and then train?

stiff dust May 22, 2023, 11:15 AM

#

no, train normally on the native resolution or close to native

#

when you generate images you can use upscaling to make images larger, then img2img to add details.

warm agate May 22, 2023, 11:17 AM

#

stiff dust when you generate images you can use upscaling to make images larger, then img2i...

Oh ok, so inpainting will fix the miniature faces using the model?

orchid jay May 22, 2023, 6:28 PM

#

weird discovery, I dreamboothed my favorite model and the whole model ended up looking even better....

#

even when prompting stuff that was not dreamboothed. I think it's cuz of my reg images?

stiff dust May 22, 2023, 8:50 PM

#

I would say that's normal if the images you use are aesthetically better than the random stuff the model generates otherwise

tall condor May 22, 2023, 9:53 PM

#

hi guys, how many times shall i run regularisation images? my training images run like 30 times per epoch per image

#

is 1 run gonna be too little?

orchid jay May 22, 2023, 10:54 PM

#

I use 1500 total reggies and split them up by my instance img ct

#

so if I have 100 instance images, thats 15 reggies a pop

stiff dust May 22, 2023, 11:20 PM

#

usually 1 time is enough 😳
ideally, you do not train more than one or few times on the same regularization image
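
A small sketch of the two rules of thumb above. The first function is the "1500 reggies split by instance count" arithmetic. The second assumes one regularization image is consumed per instance-image step, which is a common setup but not something the thread confirms, and both function names are hypothetical.

```python
import math

def reg_images_per_instance(total_reg, num_instance):
    """Split a fixed pool of regularization images evenly over instance images."""
    return total_reg // num_instance

def reg_images_needed(num_instance, repeats, epochs=1, max_reuse=1):
    """Pool size so that no regularization image is seen more than `max_reuse` times.

    Assumes one regularization image is paired with every instance-image step.
    """
    steps = num_instance * repeats * epochs
    return math.ceil(steps / max_reuse)

print(reg_images_per_instance(1500, 100))         # 15
print(reg_images_needed(100, 30))                 # 3000
print(reg_images_needed(100, 30, max_reuse=2))    # 1500
```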

gloomy stag May 23, 2023, 12:02 AM

#

Has anyone had great success with blip2 or clip captioning? I am trying to find a project preferably with a guide to run either of these for human pictures.

surreal lagoon May 23, 2023, 5:01 AM

#

stiff dust I would say that's normal if the images you use are aesthetically better than th...

i did The Hobbit movie and everything looked broken, old, rotting, decayed 🤣

#

didn't expect that. i was careful with the input data

surreal lagoon May 23, 2023, 5:05 AM

#

stiff dust usually 1 time is enough 😳 ideally, you do not train more than one or few times...

my current training is generating about 224,000 regularization images 😩

#

about 140,000 remain

#

god bless the A6000

warm agate May 23, 2023, 5:09 AM

#

What does sub 60 images mean?

stiff dust May 23, 2023, 7:46 AM

#

gloomy stag Has anyone had great success with blip2 or clip captioning? I am trying to find ...

I use the 6.7b version of blip2 and the results are as mediocre as the blip1 version.

gloomy stag May 23, 2023, 7:46 AM

#

stiff dust I use the 6.7b version of blip2 and the results are as mediocre as the blip1 ver...

What do you use instead?

stiff dust May 23, 2023, 7:47 AM

#

gloomy stag What do you use instead?

I use blip2. Don't know any better that is as practical to use

gloomy stag May 23, 2023, 7:48 AM

#

stiff dust I use blip2. Don't know any better that is as practical to use

Oh, so it's the best but not very good.

#

I'm having trouble finding a blip2 project that works on runpod

stiff dust May 23, 2023, 8:01 AM

#

surreal lagoon i did The Hobbit movie and everything looked broken, old, rotting, decayed 🤣

sounds like the text encoder is overfitting on the data

warm agate May 23, 2023, 9:24 AM

#

gloomy stag Has anyone had great success with blip2 or clip captioning? I am trying to find ...

Use captionr

gloomy stag May 23, 2023, 9:29 AM

#

warm agate Use captionr

Can you give me more information?

warm agate May 23, 2023, 9:39 AM

#

gloomy stag Can you give me more information?

https://github.com/theovercomer8/captionr

GitHub

GitHub - theovercomer8/captionr: GIT/BLIP/CLIP Caption tool

GIT/BLIP/CLIP Caption tool. Contribute to theovercomer8/captionr development by creating an account on GitHub.

sonic narwhal May 23, 2023, 12:16 PM

#

stiff dust I use blip2. Don't know any better that is as practical to use

Is it possible to run blip2 on something like vast.ai or google colab? I dont have the pc for blip 2 locally

ancient mural May 23, 2023, 12:51 PM

#

I made a LoRA with Dreambooth but every image it generates is miscolored or has a blue tint to everything. Any ideas?

stiff dust May 23, 2023, 1:32 PM

#

I observed such artefacts when you train the unet with too low a rank

#

default for LORA is 4 which is fine for the text encoder but too low for the unet

#

use larger rank (e.g. 16 or higher)
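
To give a feel for what the rank setting changes, here is a sketch of how many trainable numbers a LoRA patch adds to a single linear layer. The formula rank × (inputs + outputs) follows from the two low-rank matrices; the 1280-wide layer is just an illustrative size, not a claim about which layers any trainer patches.

```python
def lora_param_count(d_in, d_out, rank):
    """Trainable values in the two low-rank matrices (rank x d_in and d_out x rank)."""
    return rank * (d_in + d_out)

d = 1280  # illustrative layer width
full = d * d
for rank in (4, 16):
    n = lora_param_count(d, d, rank)
    print(f"rank {rank}: {n} trainable values ({n / full:.2%} of the full {full})")
```

Going from rank 4 to rank 16 quadruples the capacity of the patch while still staying far below the size of the full layer.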

gentle osprey May 23, 2023, 3:34 PM

#

surreal lagoon god bless the A6000

Damn baller

gloomy stag May 23, 2023, 3:41 PM

#

sonic narwhal Is it possible to run blip2 on something like vast.ai or google colab? I dont ha...

Did you try captionr?

surreal lagoon May 23, 2023, 3:43 PM

#

gentle osprey Damn baller

it has generated 14,000 reg images so far

#

141,212 to go

sonic narwhal May 23, 2023, 4:27 PM

#

gloomy stag Did you try captionr?

No, if im not mistaken captionr = blip2

gloomy stag May 23, 2023, 4:28 PM

#

sonic narwhal No, if im not mistaken captionr = blip2

You asked about blip2

surreal lagoon May 23, 2023, 4:41 PM

#

https://laion.ai/blog/large-openclip/

@serene flicker LAION says that bfloat16 helps to train the text encoder better

Large scale openCLIP: L/14, H/14 and g/14 trained on LAION-2B | LAION

We trained three large CLIP models with OpenCLIP: ViT-L/14, ViT-H/14 and ViT-g/14 (ViT-g/14 was t...

serene flicker May 23, 2023, 4:41 PM

#

Oh

surreal lagoon May 23, 2023, 4:42 PM

#

mixed-precision training

stiff dust May 23, 2023, 4:48 PM

#

as long as the dataset is not too huge, I prefer training on float 32 to avoid these issues at all

surreal lagoon May 23, 2023, 4:48 PM

#

100% i agree

#

what qualifies as too large in your eyes? i am using a6000, 4090, and a100 80G cards for training

#

i assume each has a different threshold

sonic narwhal May 23, 2023, 5:00 PM

#

gloomy stag You asked about blip2

Is it not the same thing in essence?

stiff dust May 23, 2023, 8:24 PM

#

surreal lagoon what qualifies as too large in your eyes? i am using a6000, 4090, and a100 80G c...

I train on my own consumer gpu which is a 3090ti with 24gb. If training finishes over night or during a work day, it doesn't matter so much if it takes 5 hours or 10 hours. So I prefer training with 32bit instead of training over night with 16bit and then seeing that something didn't work

#

for textual inversion I use 16bit, though, as it is really slow otherwise

#

lora training, in contrast, is often surprisingly fast even with 32bit

surreal lagoon May 23, 2023, 8:30 PM

#

yeah i saw that when helping Sytan figure his overly baked output out

regal trail May 23, 2023, 9:21 PM

#

Hey guys what are you using to train dreambooth? I used StableTuner because i could use the shuffle after epoch on windows but the install is broken. Does anyone have a better alternative? I have a 3090 and a 4090 and have quite a decent experience training models from 20-50k images. I'm really looking for dreambooth not fine tuning as EDT is good enough for that

hot breach May 23, 2023, 10:56 PM

#

you can simulate dreambooth in everydream2, the bonus is it is actually maintained

regal trail May 23, 2023, 11:22 PM

#

Thank you

surreal lagoon May 23, 2023, 11:30 PM

#

has anyone tried to keep a translation list of common terms and swap the words out randomly when training english datasets so that the encoder is introduced to new languages? eg. say you have a various dataset of landscapes, subjects, objects, and you want cats to also be gatto, you could change out cat for gatto randomly when you encounter it
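
A minimal sketch of the random swap idea, assuming captions are plain English strings. The translation table, the swap probability, and the helper `swap_terms` are all illustrative; whether this actually teaches the text encoder the new words is the open question being asked.

```python
import random
import re

# Illustrative table: English term -> possible replacements.
TRANSLATIONS = {"cat": ["gatto"], "dog": ["cane"]}

def swap_terms(caption, translations, prob=0.5, rng=random):
    """Replace known whole words with a translation, each with probability `prob`."""
    def replace(match):
        word = match.group(0)
        options = translations.get(word.lower())
        if options and rng.random() < prob:
            return rng.choice(options)
        return word
    return re.sub(r"[A-Za-z]+", replace, caption)

rng = random.Random(0)  # seeded so runs are repeatable
for _ in range(3):
    print(swap_terms("a cat sleeping next to a dog", TRANSLATIONS, rng=rng))
```

Only exact whole-word matches are swapped, so plurals such as "cats" would need their own entries in the table.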

stiff dust May 24, 2023, 7:30 AM

#

wouldn't it be easier to make a new token with the same embedding as e.g. cats
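
A toy sketch of that alternative, using a plain dictionary and list in place of a real tokenizer and embedding table. A real text encoder may split a new word into several tokens and has its own API for adding tokens, so this only shows the idea of copying a row; `add_alias_token` is a hypothetical helper.

```python
def add_alias_token(vocab, embeddings, new_token, source_token):
    """Append a new token whose embedding starts as a copy of an existing one."""
    if new_token in vocab:
        raise ValueError(f"{new_token!r} is already in the vocabulary")
    new_id = len(embeddings)                       # index of the row about to be added
    embeddings.append(list(embeddings[vocab[source_token]]))  # independent copy
    vocab[new_token] = new_id
    return new_id

# Tiny made-up vocabulary with 3-dimensional embeddings.
vocab = {"cat": 0, "car": 1}
embeddings = [[0.1, 0.2, 0.3], [0.9, 0.8, 0.7]]

gatto_id = add_alias_token(vocab, embeddings, "gatto", "cat")
print(gatto_id, embeddings[gatto_id])
```

Because the row is copied rather than shared, later training could move "gatto" and "cat" apart if the data calls for it.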

warm agate May 24, 2023, 12:13 PM

#

@stiff dust is ED2 better than Dreambooth for landscape photography?

#

What does batch size mean in training?

stiff dust May 24, 2023, 12:14 PM

#

it's all just different scripts implementing fine-tuning

#

but I would say ED2 is most sophisticated

#

batch size should be set as high as possible without getting out of memory errors
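
For the "what does batch size mean" part: it is the number of images processed together in one training step. The sketch below shows how it changes the number of steps per epoch, plus one simple way to probe for the largest size that fits. The `fits` callback is a stand-in for "run one training step and see whether it runs out of memory" and is purely hypothetical.

```python
import math

def steps_per_epoch(num_images, batch_size):
    """Training steps needed to show every image once."""
    return math.ceil(num_images / batch_size)

def largest_fitting_batch(fits, start=1, limit=64):
    """Keep doubling the batch size while the hypothetical `fits(batch)` succeeds."""
    best = 0
    batch = start
    while batch <= limit and fits(batch):
        best = batch
        batch *= 2
    return best

print(steps_per_epoch(400, 1))   # 400
print(steps_per_epoch(400, 8))   # 50
# Pretend that anything up to 12 images fits in memory.
print(largest_fitting_batch(lambda b: b <= 12))  # 8
```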

warm agate May 24, 2023, 12:14 PM

#

Which one do you suggest for Landscape photography training?