#🔧｜finetune | Stable Diffusion | Page 14

warm agate May 24, 2023, 12:15 PM

#

ok

stiff dust May 24, 2023, 12:15 PM

#

usually you can't use higher values than, say, 4

warm agate May 24, 2023, 12:15 PM

#

How to check the highest possible, just through trial and error?

stiff dust May 24, 2023, 12:15 PM

#

yes

warm agate May 24, 2023, 12:15 PM

#

ok

#

Which one do you suggest for Landscape photography training?

stiff dust May 24, 2023, 12:16 PM

#

just use ED2

#

don't overthink it. In the end, every training dataset behaves a bit differently anyway

warm agate May 24, 2023, 12:17 PM

#

stiff dust just use ED2

ok

#

@stiff dust How to check VRAM usage?

#

Can we upload images of different resolutions?

surreal lagoon May 24, 2023, 1:32 PM

#

well on my multi a100 system i can really only tolerate bs 7 and 64 gradient accumulations

#

it gets too slow and training takes 10x longer

stiff dust May 24, 2023, 2:36 PM

#

I don't believe in gradient accumulations. Why should it help?

surreal lagoon May 24, 2023, 5:11 PM

#

i was just trying different things to get a more gradual and cohesive training session

#

i have been observing pretty great results now that i've frozen the text encoder by decaying its learning rate to zero as the unet learning rate rises

#

4k steps to train the text encoder and then another 6.5k are nearly done now, with just the unet

#

do you think i'll have good luck if i do one epoch of final training with the TE unfrozen after the unet training completes?

mild mortar May 24, 2023, 6:19 PM

#

Hi everyone, I was just wondering what the most recommended way is to upscale images 4x while retaining realistic texture on the skin and face, so the result stays crisp and not too smooth. Thank you 😇

stiff dust May 24, 2023, 6:26 PM

#

surreal lagoon i have been observing pretty great results now that i've frozen the text encoder...

I had the opposite: freezing the unet (or training it with a very low learning rate) improved image quality

surreal lagoon May 24, 2023, 6:33 PM

#

odd

#

i do use a super low learning rate. it starts at 1e-10 and rises to 9e-8
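
A minimal sketch of the schedule described here, assuming linear ramps (the actual curve isn't stated; a polynomial scheduler comes up later in the thread). Only the 1e-10 and 9e-8 endpoints come from the message. The function names, the 4,000-step handoff, and the text encoder's starting rate are placeholders.

```python
def unet_lr(step: int, ramp_steps: int, start: float = 1e-10, peak: float = 9e-8) -> float:
    """Linearly raise the unet learning rate from `start` to `peak` (assumed shape)."""
    frac = min(max(step / ramp_steps, 0.0), 1.0)
    return start + (peak - start) * frac


def text_encoder_lr(step: int, ramp_steps: int, start: float = 1e-8) -> float:
    """Linearly decay the text encoder rate to zero, which effectively freezes it."""
    frac = min(max(step / ramp_steps, 0.0), 1.0)
    return start * (1.0 - frac)


if __name__ == "__main__":
    # Hypothetical handoff over 4,000 steps; past that point the TE rate stays at zero.
    for step in (0, 2000, 4000, 6000):
        print(step, unet_lr(step, 4000), text_encoder_lr(step, 4000))
```

Once the text encoder's rate reaches zero its weights stop changing, so the remaining steps only update the unet.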

#

my loss averages 0.244 with my current training method, likely due to ADAM 8bit optimizer spikes, but this is so much lower than the .546 avg i could get with the 'official guidance' for fine-tuning 2.1

#

i was wondering how i was breaking everything for a few days before i realised that almost not training it at all on each iteration has the best results

#

freezing the text encoder fixed the 'memory loss' problem i was having where for instance it would forget how to make a gecko since geckos weren't in my training data and i couldn't figure out a way to generate reg images that would represent them and also everything else i wanted to preserve

#

example: the term leopard gecko started out lizard-like, pretty close to a real one. but then it starts to add whiskers and fur and then an actual leopard's head on a gecko body before finally it's just a leopard. the next checkpoint, the leopard was gone, and it was a "leopard tank", eg. a vehicle used in war, with little soldiers standing around, and smoke in the background

#

my model i published was from about 2,000 steps before this loss became so noticeable in my test matrix of prompts. but for stuff outside my test matrix, it's apparent there was still some loss. it's just an acceptable amount, and things i generally didn't care to preserve eg. celebrities

hot breach May 24, 2023, 7:34 PM

#

from another conversation, qwerty only has 74 images and was trying to set batch size to 35, this will likely cause issues, wouldn't suggest setting batch size more than like 10% of your total image count

stiff dust May 24, 2023, 7:35 PM

#

a high batch size is good even if you have only a single image

#

as you sample the image at different noise time steps
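
A small sketch of why this can work, assuming the usual diffusion training loop where every batch item draws its own timestep (and its own noise). `sample_timesteps` and the 1000-step default are illustrative choices, not taken from any particular trainer.

```python
import random


def sample_timesteps(batch_size: int, num_train_timesteps: int = 1000, seed=None) -> list[int]:
    """Draw one independent noise timestep per batch item."""
    rng = random.Random(seed)
    return [rng.randrange(num_train_timesteps) for _ in range(batch_size)]


if __name__ == "__main__":
    # Eight copies of the same image still see eight (usually different) noise levels.
    print(sample_timesteps(8, seed=0))
```

Each copy of the image is noised to a different level, so the batch averages the gradient over several denoising tasks rather than repeating one.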

hot breach May 24, 2023, 7:35 PM

#

the issue comes with runt batches and aspect bucketing

#

if you have 30 images and a batch size of 25, you end up with two steps, one of 25 images and another 5, these remainders become an issue
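
The arithmetic above can be sketched as follows. The bucket labels and helper names are made up for illustration, and real trainers may handle the remainder differently (dropping it, padding it, or merging buckets).

```python
from collections import Counter


def batch_sizes(num_images: int, batch_size: int) -> list[int]:
    """Sizes of the batches produced when the last partial batch is kept."""
    full, remainder = divmod(num_images, batch_size)
    return [batch_size] * full + ([remainder] if remainder else [])


def bucketed_batch_sizes(aspects: list[str], batch_size: int) -> dict[str, list[int]]:
    """With aspect bucketing, each bucket is batched separately and has its own remainder."""
    return {bucket: batch_sizes(n, batch_size) for bucket, n in Counter(aspects).items()}


if __name__ == "__main__":
    print(batch_sizes(30, 25))
    print(bucketed_batch_sizes(["1:1"] * 20 + ["3:2"] * 10, 25))
```

With bucketing, the same 30 images can end up as two undersized batches (one per aspect ratio) and no full batch at all.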

stiff dust May 24, 2023, 7:44 PM

#

but this is just an implementation issue

#

depends how you implemented the bucketing

tall condor May 24, 2023, 9:25 PM

#

so my first try with regularisation images was a complete fail

#

all my model is producing is regularisation images

#

and there is basically nothing left from any of my concepts, what did i do wrong?

#

i have 240 regularisation images per concept that has like 10 images and i run those 10 images like 20-30 times per epoch, so in theory i have around 1 regularisation image per image i run per epoch

#

any recommendation?

stiff dust May 24, 2023, 10:08 PM

#

surreal lagoon example: the term `leopard gecko` started out lizard-like, pretty close to a rea...

do you try different seeds?
I found that even with super small learning rate the image can swap between different outcomes very easily, but this has nothing to do with the model itself. If you repeat the same prompt with different seeds you might see different outcomes

#

like one seed produces a gecko, another more of a leopard, and a third one a tank. When training your model, it might switch between these three interpretations of the image, but it would also switch if you just use a different seed

#

but in general the text encoder is surprisingly the workhorse. Training it mostly determines the outcome of the image, while the unet takes way more time to train and easily overfits on texture instead of the structure or shape

tall condor May 24, 2023, 10:37 PM

#

well without regularisation images all works kinda ok except the overfitting

surreal lagoon May 25, 2023, 12:13 AM

#

stiff dust do you try different seeds? I found that even with super small learning rate the...

same seed

#

the gecko never came back at any seed and eventually that model stopped working very well

hot breach May 25, 2023, 2:57 AM

#

tall condor well without regularisation images all works kinda ok except the overfitting

running text encoder training at a lower learning rate helps with overfitting on the subject substantially, you can get away with quite a bit without doing dreambooth style regularization

#

damian did a bunch of testing and put a PR in so you can even just choose to train the final X layers of the text encoder, which seems to be really good for SD2.x with the newer 24 layer encoder, makes it train more like SD1.x

stiff dust May 25, 2023, 7:10 AM

#

yes, I also always freeze the first 16 layers of text encoder

#

also I train LoRA at low rank for the text encoder

surreal lagoon May 25, 2023, 2:43 PM

#

hot breach damian did a bunch of testing and put a PR in so you can even just choose to tra...

PR for what?

#

i'm just using the train_dreambooth.py script with slight modifications. can those improvements be backported to that script?

finite creek May 25, 2023, 3:07 PM

#

Thank you! Appreciate the info 👍🏻👍🏻

surreal lagoon May 25, 2023, 3:17 PM

#

@finite creek see above about freezing the layers of 2.1's text encoder to better train it without catastrophic loss in lower layers

#

my understanding is that when OpenCLIP was trained, LAION went stage by stage gradually freezing subsequent layers of the text encoder in order to preserve the foundational features it learnt

#

so we kind of have to do the same thing to avoid disrupting the connections and structure of those layers too

normal pike May 25, 2023, 3:29 PM

#

Hey there. So, I've got an issue. The issue is that no matter how I train the LoRA, it does everything pretty decently, BUT the eyes of the character... No matter how much I try. What could I do?

#

(my dataset has 34 images both close-up of the character's face, body and eyes too, unet is 1e-4 and text LR is 5e-5)

finite creek May 25, 2023, 5:31 PM

#

surreal lagoon my understanding is that when OpenCLIP was trained, LAION went stage by stage gr...

Thanks a lot! Very interesting. How do you freeze the layers? I see through your previous posts that you train the text encoder and Unet separately. Could you tell me how?

surreal lagoon May 25, 2023, 5:50 PM

#

certainly. i'm using the normal train_dreambooth.py script, modified to use the filename of an image as its prompt (with some added cleanup etc)

that script has the option to train (or not) the text encoder. the process goes something like:

analyze training data, retrieving all keywords and count their frequency
prepare a subset of training data that contains the least commonly used keywords, as they're least likely to be known by the encoder
at this time you can also remove any outlier data or look at the most commonly used keywords and either modify or remove them to ensure you're populating the segments you actually want to

once your data is ready,

an initial training run at a supremely low learning rate using polynomial learning rate scheduler on the text encoder and unet, simultaneously, for a certain number of steps. you likely want to make a ckpt every 50-100 steps when you're in the "toy model" phase just to see how your test prompt output is changing.

i'm not sure whether prior preservation is useful here. if you're doing Dreambooth for a single subject, probably it is mandatory. if you're doing a general fine-tune, it seems to be incredibly harmful.

select/cherrypick from your checkpoints for the one that has the most pleasing results. this can be quite subjective. it helps to have a wide array of prompts generated from each checkpoint in a way that you can compare them easily. you want to select a checkpoint that didn't change the output much, but be sure to check the results of prompts containing your pre-training keywords, so that you can more easily see the early changes that training is applying.

Honestly if it's changing too much between each ckpt, your learning rate might be too high.

Once you've got the text encoder trained,

use save_pretrained on the pipeline for that checkpoint in an inference script, to save it as a complete model
begin another training run, this time on your full subset of data, and your full step count, and no --train-text-encoder option
this can have a much higher learning rate, but since i had a large number of images to process, i kept it low
you can use save_pretrained on the checkpoint that is most appealing to you, from these results. I save checkpoint every 1000 steps when training the unet alone, but if your LR is higher than mine, you might need every 500.
once you have that complete model saved again, you can go back to the text encoder training step, this time, on your full subset of data.

disclaimer: this is my process i'm doing currently and not what i think a lot of other people are doing. if you can, at all, use the new dreambooth code instead, that will use separate learning rates for TE vs unet, since they benefit from that. additionally, the new code has the ability to actually freeze the more important layers of the text encoder so that it is harder to damage.
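
The keyword analysis at the start of the process above could look roughly like this. It is a sketch under assumptions: captions are split on whitespace, each keyword is counted once per caption, and `max_count` is a hypothetical threshold for "least commonly used". None of this is the poster's actual code.

```python
from collections import Counter


def keyword_counts(captions: list[str]) -> Counter:
    """Count in how many captions each keyword appears."""
    counts = Counter()
    for caption in captions:
        counts.update(set(caption.lower().split()))
    return counts


def rarest_keyword_subset(captions: list[str], max_count: int) -> list[str]:
    """Keep captions that contain at least one keyword seen in <= max_count captions."""
    counts = keyword_counts(captions)
    return [
        caption
        for caption in captions
        if any(counts[word] <= max_count for word in set(caption.lower().split()))
    ]


if __name__ == "__main__":
    data = ["leopard gecko on a rock", "a cat on a sofa", "a cat on a rock"]
    print(keyword_counts(data).most_common(3))
    print(rarest_keyword_subset(data, max_count=1))
```

The `most_common` view is also a quick way to spot over-represented keywords before deciding what to modify or remove.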

tall condor May 25, 2023, 11:23 PM

#

hi guys, im still having issues with the number of regularisation images, how many regularisation images per concept should i have?

chrome breach May 26, 2023, 6:24 AM

#

tall condor hi guys, im still having issues with the number of regularisation images, how ma...

Ig around 10-15 should work... Have only read this online, not verified myself which number should work best

#

Ig trying out 5, 10, and 15 and then evaluating each of the resulting models will be helpful

tall condor May 26, 2023, 9:37 AM

#

5-15 per image?

chrome breach May 26, 2023, 10:30 AM

#

Yes

tall condor May 26, 2023, 12:02 PM

#

im using kohya ss, so my images are run between 10 and 40 times per epoch

#

does that mean that i also need to get 5-15*10-40 reg images?

#

or shall i still stick with 5-15 per image?

surreal lagoon May 26, 2023, 5:20 PM

#

                if args.with_prior_preservation:
                    # Chunk the noise and model_pred into two parts and compute the loss on each part separately.
                    model_pred, model_pred_prior = torch.chunk(model_pred, 2, dim=0)
                    target, target_prior = torch.chunk(target, 2, dim=0)

                    # Compute instance loss
                    loss = F.mse_loss(
                        model_pred.float(), target.float(), reduction="mean"
                    )

                    # Compute prior loss
                    prior_loss = F.mse_loss(
                        model_pred_prior.float(), target_prior.float(), reduction="mean"
                    )

                    # Add the prior loss to the instance loss.
                    loss = loss + args.prior_loss_weight * prior_loss
                else:
                    loss = F.mse_loss(
                        model_pred.float(), target.float(), reduction="mean"
                    )

#

so this is code for prior preservation and i kind of see what it's doing, but, why is it doing that?

it makes the loss value appear much higher than it is without prior preservation, and i see now how the weight is applied to the prior loss and explains why the loss is lower with it being less taken into consideration.

but how does this actually direct the process or change its result?

stiff dust May 26, 2023, 5:54 PM

#

it's just training on the regularization images and the training images

#

you don't need any special loss for that. You could also just put the regularization images to your training data.

#

however, the idea of regularization images is that they are only seen once in training (ideally). So you cannot overfit on reg images as they are trained only for one epoch
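
To make the pasted loss code concrete, here is a framework-free sketch of the same arithmetic. It assumes the batch is laid out with the instance samples first and the regularization (prior) samples second, which is what splitting the batch in half relies on. Plain lists stand in for tensors.

```python
def mse(pred: list[list[float]], target: list[list[float]]) -> float:
    """Mean squared error over every element of a batch."""
    diffs = [(p - t) ** 2 for ps, ts in zip(pred, target) for p, t in zip(ps, ts)]
    return sum(diffs) / len(diffs)


def prior_preservation_loss(pred, target, prior_loss_weight: float = 1.0) -> float:
    """Split the batch in half, score each half separately, then add the weighted prior term."""
    half = len(pred) // 2
    instance_loss = mse(pred[:half], target[:half])  # training images
    prior_loss = mse(pred[half:], target[half:])     # regularization images
    return instance_loss + prior_loss_weight * prior_loss


if __name__ == "__main__":
    pred = [[1.0, 2.0], [0.0, 0.0]]
    target = [[0.0, 0.0], [1.0, 1.0]]
    print(prior_preservation_loss(pred, target, prior_loss_weight=1.0))
```

Because the two terms are summed, the reported loss is larger than either half alone, which matches the higher numbers seen with prior preservation turned on.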

tall condor May 26, 2023, 6:25 PM

#

so they are not applied every epoch?

surreal lagoon May 26, 2023, 7:55 PM

#

oh

#

so the backwards pass uses the loss value to determine how much error to resolve

#

so why does SD 2.1 just start out with insane loss values on the regularization data when i feed only those through?

tall condor May 26, 2023, 9:19 PM

#

hi guys, i see a major difference between running images more than once per epoch and running multiple epochs, anyone know why?

tall condor May 26, 2023, 10:42 PM

#

also anyone tried the difference between random crop and center crop?

surreal lagoon May 26, 2023, 11:42 PM

#

depends on your source material, how much source material, how it's tagged, etc.

#

i like the partly-frozen TE

tall condor May 26, 2023, 11:44 PM

#

my source material is all captioned, however i have a lot of concepts that are mixed

#

the source material is very sparse sometimes per concept, sometimes maybe 5 pics only

#

sometimes 100

surreal lagoon May 26, 2023, 11:45 PM

#

not sure, my best results were with about 3000 images so far

tall condor May 26, 2023, 11:45 PM

#

does it make sense to train half the time with random crop and half the time with center crop or so?

surreal lagoon May 26, 2023, 11:46 PM

#

are you just doing style transfer

tall condor May 26, 2023, 11:46 PM

#

no its not only style, its objects with details

surreal lagoon May 26, 2023, 11:46 PM

#

then you want to manually crop your images

#

that's not very many and it will be easy

tall condor May 26, 2023, 11:47 PM

#

but i have like 7k images xDD

surreal lagoon May 26, 2023, 11:47 PM

#

idk, i don't see the point of just trying to train like 5 images of something and hundreds of others. it likely won't learn the lesser-frequent concepts

tall condor May 26, 2023, 11:48 PM

#

well with kohya_ss what you can do is define per concept how often the images are repeated, however that also causes issues for me like creating strange patterns

#

and i still havent found a way to tackle this

hot breach May 27, 2023, 1:42 AM

#

surreal lagoon idk, i don't see the point of just trying to train like 5 images of something an...

I've experimented with duplication and it can help but obviously something with 100 images is going to come out better than the one with 5 just run 20 times more each

#

duplicating the rare examples can help a bit, just don't try to fully equalize, the one with all the duplicates will overfit
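
One way to express "help a bit, but don't fully equalize" is to damp the repeat count with an exponent. This is a hypothetical heuristic for illustration, not something any of the tools mentioned here are known to implement. `strength=1.0` would fully equalize and `strength=0.0` would not duplicate at all.

```python
def repeats_for(counts: dict[str, int], strength: float = 0.5) -> dict[str, int]:
    """Per-concept repeat counts that only partially close the gap to the largest concept."""
    largest = max(counts.values())
    return {
        name: max(1, round((largest / n) ** strength))
        for name, n in counts.items()
    }


if __name__ == "__main__":
    concepts = {"rare": 5, "common": 100}
    print(repeats_for(concepts))                # partial balancing
    print(repeats_for(concepts, strength=1.0))  # full equalization, likely to overfit the rare concept
```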

uncut vapor May 27, 2023, 3:07 AM

#

Hello, I'm wondering if anyone can point me in the right direction. I want to remove speech bubbles from images of comic panels without giving prompts. Dataset in the thousands. I think I can train something like meta's new SAM to segment, YOLO to ID, then SD to inpaint? Is that the SOTA? Can SD tools help in the ID part at all?

surreal lagoon May 27, 2023, 3:12 AM

#

finite creek Thanks a lot! Very interesting. How do you freeze the layers? I see through your...

i have figured out how to freeze certain layers of the text encoder and the results are superior to the approach i described before

#

    # Load scheduler and models
    noise_scheduler = DDPMScheduler.from_pretrained(
        args.pretrained_model_name_or_path, subfolder="scheduler"
    )
    text_encoder = text_encoder_cls.from_pretrained(
        args.pretrained_model_name_or_path,
        subfolder="text_encoder",
        revision=args.revision,
    )
    # Freeze encoder layers first_frozen_layer..20; higher-numbered layers stay trainable.
    first_frozen_layer = 0
    last_frozen_layer = 0
    total_count = 0
    for name, param in text_encoder.named_parameters():
        total_count += 1
        # Encoder block parameters are named like "text_model.encoder.layers.<index>...."
        pieces = name.split(".")
        if len(pieces) < 4 or pieces[1] != "encoder" or pieces[2] != "layers":
            # Embeddings, final layer norm, etc. are skipped here, not frozen.
            print(f"Ignoring non-encoder layer: {name}")
            continue
        print(f'Pieces: {pieces}')
        current_layer = int(pieces[3])
        if (
            current_layer >= first_frozen_layer and current_layer < 21
        ):  # choose whatever you like to freeze, here
            last_frozen_layer = current_layer
            if hasattr(param, 'requires_grad'):
                param.requires_grad = False
                print(f'Froze layer: {name}')
            else:
                print(f'Ignoring layer that does not mark as gradient capable: {name}')

#

this has me training just the last 2 layers

hot breach May 27, 2023, 4:01 AM

#

ED2 allows you to freeze first/last n layers in text encoder, it seems pretty good for SD2.x models with the newer openclip, though for SD1.5 I just set the learning rate of the text encoder lower than for unet, seems to work well, like 1/5th to 1/2 or so, or use cosine schedule on text encoder only

#

here only training last 6 layers, using different LR for text encoder
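
A small helper for working out which layer indices stay trainable under a "freeze first n / freeze last n" rule. It is a sketch only and does not reflect ED2's actual configuration format. The layer count of 23 in the usage lines is an assumption inferred from the earlier snippet, where a threshold of 21 leaves two layers training.

```python
def frozen_layers(num_layers: int, freeze_first: int = 0, freeze_last: int = 0) -> set[int]:
    """Indices frozen when the first and/or last n encoder layers are locked."""
    if freeze_first + freeze_last > num_layers:
        raise ValueError("cannot freeze more layers than the encoder has")
    first = set(range(freeze_first))
    last = set(range(num_layers - freeze_last, num_layers))
    return first | last


def trainable_layers(num_layers: int, freeze_first: int = 0, freeze_last: int = 0) -> list[int]:
    """Indices that still receive gradient updates."""
    frozen = frozen_layers(num_layers, freeze_first, freeze_last)
    return [i for i in range(num_layers) if i not in frozen]


if __name__ == "__main__":
    print(trainable_layers(23, freeze_first=17))                 # only the last 6 layers train
    print(trainable_layers(23, freeze_first=8, freeze_last=8))   # train only the middle layers
```

The resulting indices can be matched against the layer number parsed from each parameter name, as in the freezing loop posted earlier.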

serene widget May 27, 2023, 6:38 AM

#

Hi folks (again) ! I really need help here. I installed the plugin to my photoshop - but it says that "Failed to load stable art"
I did all that was in the instruction
I have few versions of photoshop. And the plugin doesn't work in any of them.

tall condor May 27, 2023, 11:08 AM

#

will a higher or a lower prior loss value cause more overfitting?

#

as far as i read a lower value will cause more overfitting but from the source code it appears a higher one will

#

can someone confirm that a higher prior loss value will cause the model to overfit less?

dull snow May 27, 2023, 12:11 PM

#

can anyone introduce me to the settings for training an embedding

dull snow May 27, 2023, 12:30 PM

#

forget it

stiff dust May 27, 2023, 1:39 PM

#

a higher prior loss will reduce the impact of the training data

#

just set it to 1:1
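
A rough way to see the effect of the weight is to look at how the total loss splits between the two terms. This sketch assumes both raw losses are of similar size, and the share of the loss value is only a proxy for how much each image set steers the gradients.

```python
def loss_shares(instance_loss: float, prior_loss: float, prior_loss_weight: float) -> tuple[float, float]:
    """Fractions of the combined loss coming from training images and from reg images."""
    weighted_prior = prior_loss_weight * prior_loss
    total = instance_loss + weighted_prior
    return instance_loss / total, weighted_prior / total


if __name__ == "__main__":
    for weight in (0.5, 1.0, 3.0):
        print(weight, loss_shares(1.0, 1.0, weight))
```

At a weight of 1.0 the two sets contribute equally, which is the 1:1 setting suggested above. Raising the weight shrinks the training images' share.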

untold moss May 27, 2023, 3:47 PM

#

hi people, how many max steps do you use for faces? is 3000 too much or too low??

#

it's taking me 4 or 5 hours for each embedding, it's actually maddening.

surreal lagoon May 27, 2023, 4:30 PM

#

3k steps for a face seems like a lot

#

i finally have enough sunlight for today (for now) to do some training on my 4090 here 😄

finite creek May 27, 2023, 7:07 PM

#

surreal lagoon i have figured out how to freeze certain layers of the text encoder and the resu...

Thank you, do you use ED2? Or some other Soft?

finite creek May 27, 2023, 7:13 PM

#

surreal lagoon certainly. i'm using the normal `train_dreambooth.py` script, modified to use th...

Thanks a lot for the detailed explanation. A lot to unpack and learn there!

finite creek May 27, 2023, 7:20 PM

#

hot breach ED2 allows you to freeze first/last n layers in text encoder, it seems pretty go...

Thanks for the info Freon 👍🏻

surreal lagoon May 27, 2023, 7:43 PM

#

finite creek Thank you, do you use ED2? Or some other Soft?

i have taken the Diffusers script and modified it a lot

finite creek May 27, 2023, 7:44 PM

#

surreal lagoon

Nice

finite creek May 27, 2023, 7:47 PM

#

surreal lagoon i have taken the Diffusers script and modified it a lot

Could you share the link to the Diffusers Script?

surreal lagoon May 27, 2023, 7:49 PM

#

mine, or theirs?

#

it'd be like github.com/huggingface/diffusers/tree/main/scripts

surreal lagoon May 27, 2023, 8:40 PM

#

hot breach ED2 allows you to freeze first/last n layers in text encoder, it seems pretty go...

i'm seeing that it might be best to freeze the outer layers eg. first n and last n, and train the middle for styles etc?

finite creek May 27, 2023, 8:48 PM

#

surreal lagoon it'd be like github.com/huggingface/diffusers/tree/main/scripts

Cheers 👍🏻

hot breach May 27, 2023, 9:02 PM

#

surreal lagoon i'm seeing that it might be best to freeze the outer layers eg. first n and last...

I think that follows for unet, people see that with unet merges with weighted layers, but not experimented with TE

surreal lagoon May 27, 2023, 9:05 PM

#

i should freeze some of the unet too?

hot breach May 27, 2023, 9:06 PM

#

I've not tried, im sure if you selectively froze stuff in general you may see interesting behaviors

#

kohya did experiments with merging two unets with different weights per layer between A and B models, it produced different results, but all his examples were anime and harder for me to judge, certainly interesting differences by using different layer weights

surreal lagoon May 27, 2023, 9:10 PM

#

i tried to merge models but those seem to be for 1.5 and the weights aren't named the same now

surreal lagoon May 27, 2023, 9:59 PM

#

https://openreview.net/forum?id=i9UlAr1T_xl

OpenReview

SmartFRZ: An Efficient Training Framework using Attention-Based...

ionic gulch May 27, 2023, 11:01 PM

#

hey
I'm trying to create a data set to train my lora which is pretty much just to create this animation

#

i have 2 options. one input all data sets i have from the internet, which is pretty much all cartoonish art.
two try to create it from WebUI and then use it as a data set
would doing the first option cause results to turn into cartoon art? as I want to create semirealistic art with it

wispy elbow May 28, 2023, 2:45 AM

#

Anyone know how I can get txt2img to generate red hair with blonde tips? I've tried all manner of prompts and weights but it always chooses one or the other. I've got multicolored hair before, but this is my first time wanting a specific coloration.

surreal lagoon May 28, 2023, 3:30 AM

#

so with Terminal SNR and training on 2.1 i can get loss down to 0.11

#

@hot breach have you experimented with the alternate noise schedules? or @stiff dust

hot breach May 28, 2023, 4:46 AM

#

I just added the zero terminal snr thing but need to run experiments, I've messed with offset noise quite a bit

surreal lagoon May 28, 2023, 4:51 AM

#

it isn't working very well for me

hot breach May 28, 2023, 4:52 AM

#

offset noise requires smaller amounts the longer you train, and it is not very stable since it has to be modified based on length of training, or turned on only for some portion of training

#

20k steps you can try 0.01 or 0.02, the original blog post suggested 0.10 but that only works well for training hundreds or a few thousand steps

surreal lagoon May 28, 2023, 4:56 AM

#

i mean terminal snr

#

it's just destroying the model immediately

hot breach May 28, 2023, 4:57 AM

#

ah

surreal lagoon May 28, 2023, 5:30 AM

#

hmm

#

it might come back

untold moss May 28, 2023, 8:22 AM

#

what is the min dataset size you recommend for 3000 steps? 20, 30 ?

tall condor May 28, 2023, 10:43 AM

#

pseudoterminalx: i am having similar issues, your learning rate is too high and you are using too many steps

#

try a learning rate of 5e-7 or 8e-7

dull snow May 28, 2023, 11:15 AM

#

every time i try using kohya ss i get exit code 0

#

😭

#

even if i do everything right

dull snow May 28, 2023, 11:35 AM

#

returned non-zero exit status 1.

#

i followed whole tutorial

untold moss May 28, 2023, 11:45 AM

#

what causes the error "AttributeError: 'FreeTypeFont' object has no attribute 'read'" in training each time it has to save a png?

surreal lagoon May 28, 2023, 12:31 PM

#

tall condor pseudoterminalx: i am having similar issues, your learing rate is too high and y...

i am at 1e-7

#

and 100 steps

#

how is that too much

#

4600 steps in...

#

should i just keep going?

#

i can kind of see the output improving. but also feels super broken

dull snow May 28, 2023, 12:40 PM

#

dying rn

surreal lagoon May 28, 2023, 12:40 PM

#

like i'm having to totally re-teach it how predictions work with a new algorithm

surreal lagoon May 28, 2023, 1:04 PM

#

seems to be figuring something out

tall condor May 28, 2023, 1:27 PM

#

pseudoterminalx can i see some of your input data

#

and how many images are you training on?

#

are you running images multiple times in an epoch?

surreal lagoon May 28, 2023, 1:40 PM

#

i have 22,000 images

#

and no

tall condor May 28, 2023, 1:41 PM

#

you should not get that result with 22k images

#

are they captioned?

surreal lagoon May 28, 2023, 1:41 PM

#

i'm switching the model to Terminal SNR, so

tall condor May 28, 2023, 1:41 PM

#

did you tag the images properly

surreal lagoon May 28, 2023, 1:41 PM

#

of course

tall condor May 28, 2023, 1:41 PM

#

can i see some input samples?

surreal lagoon May 28, 2023, 1:42 PM

#

i have good results without Terminal SNR, though they're not exactly what i want the model to do, usually..

#

the_night_king_from_game_of_thrones_watching_over_a_modern_call_center_full_of_sales_reps_photorealism_4k.png

pizza_au_fromage_et__la_confiture_confiture_dabricot_fromage_comt_la_pizza_est_saupoudre_de_persil_cinmatique_hyper_dtaille_dtai.png

inflatable_balloon_sculpture_in_the_shapes_of_the_corals_very_colorful_from_red_sea_on_the_beach_in_dubai_lots_of_arab_people_wa.png

dungeons_and_dragons_owlbear_attacking_a_kobold__upscaled_by_mrcus_fast.png

cute_sunflowers_hand_drawn_isolated_white_background__upscaled_by_zenofexsvk_fast.png

tall condor May 28, 2023, 1:43 PM

#

ok that's quite a big variation of input types, can i see some captions with the corresponding image?

surreal lagoon May 28, 2023, 1:43 PM

#

they're the filenames for each

tall condor May 28, 2023, 1:44 PM

#

pizza_au_fromage_et__la_confiture_confiture_dabricot_fromage_comt_la_pizza_est_saupoudre_de_persil_cinmatique_hyper_dtaille_dtai.png

#

are you captioning in non-english?

surreal lagoon May 28, 2023, 1:44 PM

#

it's french

#

it's a mix of languages

#

i think there's some Hindi, Russian, Japanese

tall condor May 28, 2023, 1:44 PM

#

i don't believe that it works if it's not english. also just to make sure, you have the caption in a .caption file and it is being used, right?

surreal lagoon May 28, 2023, 1:45 PM

#

i have a custom dreambooth script

tall condor May 28, 2023, 1:45 PM

#

which script are you training with?

#

ah ok

surreal lagoon May 28, 2023, 1:45 PM

#

the filename itself is the tag/caption
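
A minimal sketch of turning a filename like the samples above into a prompt. The thread only says the script uses the filename "with some added cleanup", so the specific cleanup here (dropping the extension, replacing underscores, collapsing whitespace) is an assumption.

```python
import re
from pathlib import Path


def caption_from_filename(filename: str) -> str:
    """Derive a caption from an image filename (assumed cleanup rules)."""
    stem = Path(filename).stem           # drop directory and extension
    text = stem.replace("_", " ")        # underscores stand in for spaces
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    print(caption_from_filename(
        "dungeons_and_dragons_owlbear_attacking_a_kobold__upscaled_by_mrcus_fast.png"
    ))
```

Suffixes such as "upscaled by ..." survive this cleanup. Whether the real script strips them is not stated.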

#

i like this approach better than what anyone else does

tall condor May 28, 2023, 1:45 PM

#

first of all i believe you need to normalize the data to be all the same language, english

surreal lagoon May 28, 2023, 1:45 PM

#

i don't believe that to be the case at all

tall condor May 28, 2023, 1:45 PM

#

i dont think you can train in non english

surreal lagoon May 28, 2023, 1:45 PM

#

like i said i have good results without Terminal SNR.

#

the OpenCLIP model already understands other languages, sir 😛

#

my efforts are to improve the model across the board incl more language comprehension

tall condor May 28, 2023, 1:46 PM

#

the problem is that the text model training will be screwed up if you don't, and as far as i understand it's the most important part

#

if the base model you train on is in english you are basically introducing new "words" to the model and it will probably only link those words to your images

#

so if your cheese is now "fromage" it will only know your one "fromage" image as "fromage" but not cheese

#

at least that's how i understand that

#

which means that it will not be able to recognize your "fromage" as cheese and also not recognize cheese as "fromage"

#

i believe what you want to do is to train in english and then convert your input prompt from whatever language you enter to english before generating an image

surreal lagoon May 28, 2023, 1:50 PM

#

man it's really improving still. i am going to leave it running

#

@tall condor i don't think you fully understand what's happening here

#

this is stabilityai/stable-diffusion-2-1 output with my current settings and that same prompt

#

i'm fine-tuning the model to use terminal SNR

tall condor May 28, 2023, 1:52 PM

#

sorry i don't know anything about that. i thought you were doing regular training

surreal lagoon May 28, 2023, 1:53 PM

#

it's okay 👍🏽 usually i am but last night i started down the rabbit hole of implementing a research paper

#

i wish there were more people that have done this specific transition

#

i have no idea if what i'm seeing is correct

tall condor May 28, 2023, 1:54 PM

#

what exactly is the difference

surreal lagoon May 28, 2023, 1:55 PM

#

it improves the contrast balance of the image

#

the typical noise schedule of SD means that the overall average colour grade of the image is gray

#

offset noise was a workaround to help with this issue but apparently it's a hackish fix and terminal SNR is "the right way"
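
For readers wondering what the terminal SNR change does to the schedule, here is a framework-free sketch of the beta rescaling as I understand it from the paper being implemented. It is not the poster's code. The scaled-linear defaults (0.00085 to 0.012 over 1000 steps) are assumed to match a typical Stable Diffusion scheduler config.

```python
import math


def scaled_linear_betas(num_steps: int = 1000, beta_start: float = 0.00085, beta_end: float = 0.012) -> list[float]:
    """Betas spaced linearly in sqrt-space (assumed SD-style default schedule)."""
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    return [(lo + (hi - lo) * i / (num_steps - 1)) ** 2 for i in range(num_steps)]


def rescale_zero_terminal_snr(betas: list[float]) -> list[float]:
    """Rescale betas so the final timestep carries no signal (SNR of zero)."""
    # sqrt of the cumulative product of alphas = signal scale at each timestep
    alphas_bar_sqrt = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alphas_bar_sqrt.append(math.sqrt(running))

    first, last = alphas_bar_sqrt[0], alphas_bar_sqrt[-1]
    # Shift so the last value is zero, then scale so the first value is unchanged.
    shifted = [(a - last) * first / (first - last) for a in alphas_bar_sqrt]

    # Convert the cumulative values back into per-step betas.
    alphas_bar = [a * a for a in shifted]
    alphas = [alphas_bar[0]] + [
        alphas_bar[i] / alphas_bar[i - 1] for i in range(1, len(alphas_bar))
    ]
    return [1.0 - a for a in alphas]


if __name__ == "__main__":
    original = scaled_linear_betas()
    rescaled = rescale_zero_terminal_snr(original)
    print(original[-1], rescaled[-1])
```

With the default schedule the last timestep still leaks a little of the image, which is one explanation given for the mid-gray average. After rescaling, the last step is pure noise, so an existing model has to be fine-tuned to the new schedule.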

#

4500 steps

tall condor May 28, 2023, 1:56 PM

#

so its basically a replacement for lowering the applied noise?

surreal lagoon May 28, 2023, 1:56 PM

#

5700 steps

#

i don't know if i'm doing this right but i'm okay with what's happening now at least.

tall condor May 28, 2023, 1:57 PM

#

would be interesting to see where it's going

surreal lagoon May 28, 2023, 1:57 PM

#

tall condor so its basically a replacement for lowering the applied noise?

as i understand it, this is a way of allowing the model to determine where it wants to end up before it gets there, so that it can remove noise more effectively. and my understanding is likely incorrect

tall condor May 28, 2023, 1:58 PM

#

the images still have very high contrast tho

#

i wonder if that will be lowered towards the end

surreal lagoon May 28, 2023, 1:58 PM

#

they were washed out before

#

the contrast is a fix in progress as far as i can tell

#

you know what's crazy, the researcher leading this team only graduated from university about a year ago

#

that said, he obtained a master's degree in CS so, a bit more than what i did KEK

tall condor May 28, 2023, 2:01 PM

#

cool

#

what i really hate on training is that there is so little description on impact of certain settings

#

for me it's mostly trial and error

surreal lagoon May 28, 2023, 2:02 PM

#

yeah people hold these cards too closely to their chest

surreal lagoon May 28, 2023, 2:23 PM

#

loss=0.0743

#

think

gentle osprey May 28, 2023, 2:34 PM

#

surreal lagoon yeah people hold these cards too closely to their chest

Big facts.

surreal lagoon May 28, 2023, 2:41 PM

#

on the off chance anyone else has done this in here, did you freeze the text encoder partly, fully, or not at all?

tall condor May 28, 2023, 2:46 PM

#

anyone played with color augmentation yet?

surreal lagoon May 28, 2023, 2:53 PM

#

well i'm going to restart this with a fix pulled from ED2

#

at the very least, i was definitely doing inference wrong

#

#

#

way better

#

tall condor May 28, 2023, 3:11 PM

#

is that from the same model as before, or did you restart?

surreal lagoon May 28, 2023, 3:12 PM

#

this is the restarted training

tall condor May 28, 2023, 3:12 PM

#

what did you change this time?

surreal lagoon May 28, 2023, 3:12 PM

#

scheduler and config for it

#

this time i kept the SD2.1 scheduler config and overloaded values into the betas

#

previously, i just used the default scheduler config with overloads

#

my understanding now is that SD2.1's config is pretty different from the way the schedulers are used out of the box

tall condor May 28, 2023, 3:13 PM

#

looks much better now but is it doing what you expect it to do?

surreal lagoon May 28, 2023, 3:13 PM

#

i didn't expect it to work this well, so i'm not sure how to answer that

#

it still has photoreal issues but i can fix those

tall condor May 28, 2023, 3:14 PM

#

maybe its not doing anything at all? xD

surreal lagoon May 28, 2023, 3:14 PM

#

nope, the image quality is +++ compared to baseline

tall condor May 28, 2023, 3:14 PM

#

cool

surreal lagoon May 28, 2023, 3:16 PM

#

the contrast is changing a lot

surreal lagoon May 28, 2023, 3:49 PM

#

#

ok so bikes look super good

#

i'm going to nuke my learning rate because it's too high

untold moss May 28, 2023, 5:25 PM

#

any guide for fine tuning or faq or pro tips?

summer rose May 28, 2023, 5:58 PM

#

surreal lagoon i have taken the Diffusers script and modified it a lot

Could you please share the script if it is fine with you?

surreal lagoon May 28, 2023, 5:59 PM

#

it's in quite a state of disarray

#

are you a developer?

summer rose May 28, 2023, 6:00 PM

#

surreal lagoon are you a developer?

Yes I am

surreal lagoon May 28, 2023, 6:02 PM

#

https://github.com/bghira/SimpleTuner/blob/main/train_dreambooth.py

GitHub

SimpleTuner/train_dreambooth.py at main · bghira/SimpleTuner

A basic Dreambooth implementation geared toward Stable Diffusion 2.1. - SimpleTuner/train_dreambooth.py at main · bghira/SimpleTuner

stiff dust May 28, 2023, 6:05 PM

#

interesting. Should try the scheduler fix, too

surreal lagoon May 28, 2023, 6:05 PM

#

scheduler fix?

#

i'm so overwhelmed, lol

stiff dust May 28, 2023, 6:05 PM

#

surreal lagoon my understanding now is that SD2.1's config is pretty different from the way the...

SD 2.1 uses v-prediction, which is very different from the epsilon (noise) prediction setup used by 1.5

surreal lagoon May 28, 2023, 6:06 PM

#

yeah i know that now because i've fixed the scheduler config and now terminal SNR works out of the box

#

this is baseline without any fine-tuning

#

nothing special about the prompt and there's no negatives

#

btw, ignore the nsfw prompts in my inference script, they're only there so i can stop training if i begin to introduce any by accident

#

they do really weird stuff LUL like paint the swiss alps, a cabin in the woods. because there's zero concept of those in 2.1

surreal lagoon May 28, 2023, 9:14 PM

#

i'm using photos as class data but i've got it feeding the class data's filename in as its prompt instead of a single token, and i've used BLIP to label them
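As a rough illustration of the "filename as prompt" idea above, here is a small sketch that turns an image filename into a caption string. The naming scheme (words separated by underscores or hyphens, optional trailing number) and the helper name are assumptions for the example, not how the actual script does it.

```python
import re
from pathlib import Path


def caption_from_filename(path):
    """Derive a caption from a filename such as 'a_man_playing_guitar-0012.png'."""
    stem = Path(path).stem
    # Drop a trailing counter like '-0012' or '_3' (assumed naming scheme).
    stem = re.sub(r"[-_ ]\d+$", "", stem)
    # Treat underscores and hyphens as word separators and collapse whitespace.
    return " ".join(re.sub(r"[_-]+", " ", stem).split())


example_caption = caption_from_filename("class/a_man_playing_guitar-0012.png")
```

With BLIP-generated filenames, each class image then gets its own descriptive prompt instead of one shared class token.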

tall condor May 28, 2023, 9:45 PM

#

surreal lagoon scheduler fix?

your results are really interesting. it would be really cool if you could contribute them to the community

tall condor May 28, 2023, 9:46 PM

#

untold moss any guide for fine tuning or faq or pro tips?

i recommend starting with kohya_ss, it has a web UI that is a really good starting point

#

@untold moss https://www.google.com/search?client=firefox-b-d&q=kohyass

surreal lagoon May 28, 2023, 9:48 PM

#

tall condor your results are really interesting. it would be really cool if you could contr...

mate, i publish everything on github

#

i'm not going to go rush to implement stuff in applications i never use, that's up to you

tall condor May 28, 2023, 9:51 PM

#

i saw. thanks!

tall condor May 29, 2023, 12:27 PM

#

so i have the same image twice, once with an orange tint and once with a blue tint. i would expect sd to average them out to a neutral tint, but for some reason the blue one dominates very strongly. why is that?

surreal lagoon May 29, 2023, 2:12 PM

#

likely because of how the denoising process works

#

but i don't know if anyone knows exactly why that would happen, are you using any special techniques like offset noise or SNR fixes?

restive plank May 29, 2023, 3:02 PM

#

tall condor so i have 2 same images, one time with orange one time with blue tint, i would e...

Could be how the colors are coded in the first place: in RGB, blue might be around 0,0,255 while orange might be around 125,125,0, so “put together” blue is stronger than the red and yellow

main blade May 29, 2023, 5:29 PM

#

Hello. I'm trying to train sd with a lora with this art style, but so far I'm not getting good results. I think it's partly because my English is not very good and I'm not describing the image properly in the text file. Can you help me with the caption text of this image for example? It's a bullfighter's jacket, I guess I should describe as well as possible the ornaments, the embroidery, the brooches...

surreal lagoon May 29, 2023, 7:19 PM

#

watercolor

surreal lagoon May 29, 2023, 7:50 PM

#

well, training has completed Blurry_eyes

#

first test prompt outside of my sanity check prompt list

lilac topaz May 30, 2023, 3:03 AM

#

I am trying to train a model to use this specific mask. If you pause at any given frame most of them look like the mask I'm trying to fine tune but each frame scrambles the colors/features. I tried retraining with images of a single mask but it still scrambles the color... any thoughts on getting consistency? Not 100% sure if it's a training issue or a settings issue

surreal lagoon May 30, 2023, 3:21 AM

#

can you share more details about your training setup?

lilac topaz May 30, 2023, 3:28 AM

#

this dataset you see here was trained on about 25 pictures of a variety of these masks (no 2 are exactly the same). I can't remember the exact settings but it was a standard how to vid on youtube, I believe training steps were 1600 at 2e-6.

I then trained it on a single mask but used 50 photos same settings, this one was muddled and looked too much like a generic mask.

I then used a dataset of a single mask at 25 photos which seems slightly more consistent however it requires a lot more fine tuning in the settings.

My approach here was to treat it like a face and give it a modest dataset at 1600steps, 2e-6 which is what most people say faces should be trained at

surreal lagoon May 30, 2023, 4:08 AM

#

you're doing Lora?

#

or Dreamboothing a keyword?

lilac topaz May 30, 2023, 4:18 AM

#

I'm using dreambooth on Google Colab. My GPU is too slow to use it locally

surreal lagoon May 30, 2023, 4:32 AM

#

i would suggest training a Lora

#

dreambooth is kind of iffy and is a heavier process, and a Lora will sit on top of any compatible model

lilac topaz May 30, 2023, 4:39 AM

#

thank you! I will try Lora

warm agate May 30, 2023, 9:24 AM

#

@surreal lagoon to caption a face which all traits/features should we mention?

stiff dust May 30, 2023, 10:16 AM

#

I would always first do textual inversion to learn the facial features / assign them a token. Then you can use that token for captioning

#

caption can be "photo of <mytoken>". Better is to randomize it a bit, "<mytoken>, photography", "a photo of <mytoken> made by smartphone camera" and so on
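The caption randomization described above could look roughly like this. The template list just reuses the examples from the message, and the function name is made up for the sketch.

```python
import random

# Templates taken from the examples in the message above.
CAPTION_TEMPLATES = [
    "photo of {token}",
    "{token}, photography",
    "a photo of {token} made by smartphone camera",
]


def random_caption(token, rng=random):
    """Pick one template at random and fill in the learned token."""
    return rng.choice(CAPTION_TEMPLATES).format(token=token)


caption = random_caption("<mytoken>")
```

Passing a seeded `random.Random` as `rng` makes the captions reproducible between runs.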

warm agate May 30, 2023, 4:28 PM

#

stiff dust I would always first do textual inversion to learn the facial features / assign ...

Textual inversion of the same character or just random faces?

stiff dust May 30, 2023, 4:28 PM

#

same character. I thought that was the question

warm agate May 30, 2023, 4:29 PM

#

stiff dust same character. I thought that was the question

No I just wanted to train a model with multiple faces so we can improve the face textures and make them a bit realistic

surreal lagoon May 30, 2023, 5:43 PM

#

warm agate @surreal lagoon to caption a face which all traits/features should we ment...

i allowed BLIP to do all of mine

#

to me that led to improvement across general keyword use, eg. a woman or a man will come back by default as more complete-looking

#

i like using BLIP for training captions because it essentially has the text encoder tell me where it wants those images to be

#

like "oh, i recognise those features! we have these keywords for them." and then i provide high quality images for those keywords and it learns to do them better

hot ether May 30, 2023, 5:47 PM

#

Heya, i have 1.3k images and I want to train a LoRA or a model (don't know yet which one is more suitable), looking for some advice and guides on where to start :)

stiff dust May 30, 2023, 8:31 PM

#

I use Blip, too, but still find its captions way too short.

surreal lagoon May 30, 2023, 8:57 PM

#

you can do it with a higher temperature and return multiple possibilities and merge them
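The message does not say how the candidates get merged, so the following is only one possible reading: collect several sampled captions, drop near-duplicates, and join the rest. Generating the candidates with BLIP is not shown here, and `merge_captions` is a hypothetical helper.

```python
def merge_captions(candidates):
    """Join unique candidate captions into one longer caption."""
    seen = set()
    merged = []
    for caption in candidates:
        # Normalise case and whitespace so trivial variants count as duplicates.
        key = " ".join(caption.lower().split())
        if key and key not in seen:
            seen.add(key)
            merged.append(" ".join(caption.split()))
    return ", ".join(merged)


merged_caption = merge_captions(
    ["a lake at sunset", "A lake at  sunset", "mountains behind a lake"]
)
```

A smarter merge would probably work on phrases rather than whole captions, but this is enough to get longer captions out of several short samples.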

#

🙂

surreal lagoon May 31, 2023, 12:05 AM

#

is the vae fine-tuned in the script from diffusers?

tall condor May 31, 2023, 12:16 AM

#

hot ether Heya, i have 1,3k images and I want to train Lora/Model (dont know yet) which on...

all of the images are for the same concept or is it a mix of concepts

warm agate May 31, 2023, 3:22 AM

#

surreal lagoon like "oh, i recognise those features! we have these keywords for them." and then...

Oh kk

warm agate May 31, 2023, 3:23 AM

#

stiff dust I use Blip, too, but still find its captions way too short.

Yeah they are too short, it's not the best to describe a scene

gloomy stag May 31, 2023, 5:43 AM

#

Anyone tried this out? https://github.com/tjennings/Coreco_LLaVA

GitHub

GitHub - tjennings/Coreco_LLaVA: Large Language-and-Vision Assistan...

Large Language-and-Vision Assistant built towards multimodal GPT-4 level capabilities. - GitHub - tjennings/Coreco_LLaVA: Large Language-and-Vision Assistant built towards multimodal GPT-4 level ca...

hot ether May 31, 2023, 7:57 AM

#

tall condor all of the images are for the same concept or is it a mix of concepts

it’s a collection of random items, which share the same cartoon draw style

hot ether May 31, 2023, 5:36 PM

#

is there a way to run dreambooth fully locally?

stiff dust May 31, 2023, 5:55 PM

#

yeah, of course, you should have 8gb vram at least, better more

hot ether May 31, 2023, 5:56 PM

#

i have 12

#

tried to use colab but it throws disconnects from time to time, ruining everything

finite echo May 31, 2023, 6:10 PM

#

Im looking at making a "pixel perfect" 32x32 pixelart model, do I need to finetune everything into it (knights, dragons, zombies, swords, skeletons, trees, bushes, etc.) or is there a way to do this in 1 go or does it need to be done 1 at a time?

surreal lagoon May 31, 2023, 6:45 PM

#

hot ether tried to use colab but it throws disconnects from time to time ruining everythi...

this is easier to handle if you have checkpoints being written every N steps and keep only X number of checkpoints
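A sketch of the "save every N steps, keep only X checkpoints" idea, using only the standard library. The `checkpoint-<step>` directory naming is an assumption for the example, and both function names are hypothetical. Check what your training script actually writes before pointing something like this at a real output folder.

```python
import re
import shutil
from pathlib import Path


def should_save(step, every_n):
    """True on every N-th training step (but not on step 0)."""
    return step > 0 and step % every_n == 0


def prune_checkpoints(output_dir, keep_last):
    """Delete all but the newest `keep_last` 'checkpoint-<step>' directories."""
    pattern = re.compile(r"^checkpoint-(\d+)$")
    found = []
    for child in Path(output_dir).iterdir():
        match = pattern.match(child.name)
        if child.is_dir() and match:
            found.append((int(match.group(1)), child))
    found.sort()  # oldest step first
    stale = found[:-keep_last] if keep_last > 0 else found
    for _, path in stale:
        shutil.rmtree(path)
    # Return the names of the checkpoints that were kept.
    return [path.name for _, path in found[len(stale):]]
```

After a Colab disconnect you would resume from the highest-numbered directory that is left, so at most N steps of work are lost.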

hot ether May 31, 2023, 7:03 PM

#

surreal lagoon this is easier to handle if you have checkpoints being written every N steps and...

tbh i am not sure how to use that system, i just put in the settings and press "play". i want to set it up locally for my own grind :D, but not sure how

surreal lagoon May 31, 2023, 7:18 PM

#

if you clone the diffusers repo into your google drive on colab (GPT can help you with that), you'll find an examples directory containing dreambooth, fine-tuning and other scripts. pick whichever you want and follow the directions in the huggingface hub tutorials for it. that's how i've done it anyway, i'm sure there are different approaches

trim portal May 31, 2023, 8:24 PM

#

surreal lagoon if you clone the `diffusers` repo into your google drive on colab (GPT can help ...

How good are your models with this approach

surreal lagoon May 31, 2023, 8:25 PM

#

like this

#

i fine-tune on 2.1 only because i love its compatibility with higher resolutions and SNR fixes, as well as the penultimate clip handling

#

it's not as "easy" though, it takes a fair bit more understanding of what you're trying to accomplish

hot ether May 31, 2023, 8:31 PM

#

i just want to give it a word like " mouse" and get a mouse in that image style:D

surreal lagoon May 31, 2023, 8:32 PM

#

that sounds like you want a LoRA

hot ether May 31, 2023, 8:32 PM

#

maybe

#

i found it out just today, didnt manage to get into it tho

surreal lagoon May 31, 2023, 8:32 PM

#

or textual inversion

hot ether May 31, 2023, 8:32 PM

#

idk what this is :D

surreal lagoon May 31, 2023, 8:33 PM

#

like a positive/negative prompt, on steroids

hot ether May 31, 2023, 8:33 PM

#

can i feed it with my own images to make it "understand" what i want

trim portal May 31, 2023, 8:40 PM

#

What do you think i shud use for this use case: i have 1000+ images of one specific meme in diff styles. I want to create a model that lets u input any text and get that version of the meme (a batman version, spongebob version etc.)

surreal lagoon May 31, 2023, 9:27 PM

#

a meme in different styles?

#

you mean like pepe with his identity crisis?

trim portal May 31, 2023, 9:44 PM

#

Exactly yea

#

Like this would be trump pepe and the other drooling pepe i alr have captions but have been struggling with getting good results finetuning

#

Obv my dataset is not pepe but a diff meme but similar idea

surreal lagoon May 31, 2023, 10:25 PM

#

i think you want a LoRA

tall condor Jun 1, 2023, 12:32 AM

#

does anyone know if there is a way in auto 1111 to download all dependencies at once?

tall condor Jun 1, 2023, 12:50 AM

#

@hot ether you may want to take a look at kohya_ss

trim portal Jun 1, 2023, 1:27 AM

#

surreal lagoon i think you want a LoRA

Ok thats what i was thinking thx

warm agate Jun 1, 2023, 5:04 AM

#

@surreal lagoon can I DM?

#

I want to discuss it in DMs rather than here

surreal lagoon Jun 1, 2023, 5:07 AM

#

i'm about to go to sleep but ok

tall condor Jun 1, 2023, 12:04 PM

#

what is the minimum resolution you guys recommend for training?

#

is 1000x1000 ok?

surreal lagoon Jun 1, 2023, 12:31 PM

#

it only works at 512 or 768 square

#

the aspect bucket stuff is really poorly implemented, manually crop and centre everything

#

downsampling high res images to the right size will introduce artifacts too

#

better to crop instead
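For the "crop instead of downsample" advice, the crop box for a centred square is simple arithmetic. A minimal sketch with a hypothetical helper name follows. The returned tuple uses the (left, upper, right, lower) order that Pillow's `Image.crop` expects, although no image library is needed to run it.

```python
def center_crop_box(width, height, target=768):
    """Return (left, top, right, bottom) for a centred target x target crop."""
    if width < target or height < target:
        raise ValueError("image is smaller than the target crop size")
    left = (width - target) // 2
    top = (height - target) // 2
    return (left, top, left + target, top + target)


box = center_crop_box(1000, 1000, target=768)
```

For the 1000x1000 case asked about above, this keeps the middle 768x768 region and throws away a 116 pixel border on each side. Centring automatically is an assumption here: for off-centre subjects you would still pick the region by hand.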

warm agate Jun 1, 2023, 1:42 PM

#

@surreal lagoon I have DM'ed

#

Kindly please check your inbox

warm agate Jun 1, 2023, 1:42 PM

#

surreal lagoon better to crop instead

Isn't it impossible to manually crop like 2k images?

surreal lagoon Jun 1, 2023, 1:43 PM

#

nope not impossible

#

i will answer here. i think using an llm to caption images is a waste of resources and money

#

you already have a Clip encoder that can caption, and it can be fine tuned on new captions

#

unless you are training an encoder from scratch, i see no point

warm agate Jun 1, 2023, 1:46 PM

#

surreal lagoon you already have a Clip encoder that can caption, and it can be fine tuned on ne...

What does finetuning captions mean?

surreal lagoon Jun 1, 2023, 1:47 PM

#

it means you are making BLIP work better

#

openclip was trained on like 2 or 3 billion image caption pairs

#

i doubt there's a lot that can be improved on that with an llm

warm agate Jun 1, 2023, 1:48 PM

#

surreal lagoon it means you are making BLIP work better

Ok, so how can we do that.
B'cuz currently BLIP just generates short captions, which I don't think can add a lot of detail

surreal lagoon Jun 1, 2023, 1:49 PM

#

first, demonstrate to me that small captions dont work well

#

dont simply assume

warm agate Jun 1, 2023, 1:49 PM

#

surreal lagoon first, demonstrate to me that small captions dont work well

I want to train a model based on landscape photography, so it generates mini captions.
Many elements are skipped

#

Lemme show you the output

surreal lagoon Jun 1, 2023, 1:50 PM

#

why do you need the details captioned at all

#

you are tuning its current vocabulary

#

it will be fine and you can use shorter prompts to get good results then

#

if you want new keywords added then you will have to add them to each caption yourself

#

its tedious but not the end of the world. you arent training on 10k or 100k images, so

warm agate Jun 1, 2023, 1:53 PM

#

surreal lagoon its tedious but not the end of the world. you arent training on 10k or 100k imag...

I have a dataset of 100k images

#

I am away from my pc

#

I'll share the images and their captions generated by ED2 captioning(blip2)

surreal lagoon Jun 1, 2023, 1:58 PM

#

ok you seem to be taking on a project you arent prepared for at all

warm agate Jun 1, 2023, 2:00 PM

#

surreal lagoon ok you seem to be taking on a project you arent prepared for at all

I have already generated the captions

warm agate Jun 1, 2023, 2:00 PM

#

surreal lagoon ok you seem to be taking on a project you arent prepared for at all

I haven't trained any model before

#

What are the things that I have to check for?

surreal lagoon Jun 1, 2023, 2:01 PM

#

start small

#

observe what different parameters change for an end result

#

your 100k images will take more than a week on a single gpu to train effectively

#

imagine finishing that and realising it wont work

#

i started with 300 images, which weren't enough. 3k images were okay, but training starts to take long enough that the text encoder can be damaged. the most recent try was 30k images and it took a lot of work to figure out

#

100k images to me would require a much smarter approach to training the text encoder

#

smarter doesn't mean faster per step: it spends some extra compute to optimize training, but layers can converge sooner

#

so in a roundabout way, it is faster overall, just with fewer iterations per second

warm agate Jun 1, 2023, 2:24 PM

#

surreal lagoon i started with 300 images which werent enough and 3k images were okay but starts...

Do you mean text encoder as model?

surreal lagoon Jun 1, 2023, 2:25 PM

#

i mean the text encoder

warm agate Jun 1, 2023, 2:28 PM

#

surreal lagoon i mean the text encoder

What's a text encoder?

surreal lagoon Jun 1, 2023, 2:32 PM

#

time to google

warm agate Jun 1, 2023, 2:36 PM

#

surreal lagoon time to google

Oh ok,
Got it, it converts text into a vector

surreal lagoon Jun 1, 2023, 2:37 PM

#

it does most of the work

warm agate Jun 1, 2023, 2:39 PM

#

surreal lagoon i started with 300 images which werent enough and 3k images were okay but starts...

How does it get damaged?
It fails to convert unrecognised text?

surreal lagoon Jun 1, 2023, 2:43 PM

#

and more

#

do it and find out lol

warm agate Jun 1, 2023, 2:46 PM

#

surreal lagoon do it and find out lol

Ok

#

@surreal lagoon which training method is best suited for landscape photography?

#

And which one for faces?

surreal lagoon Jun 1, 2023, 2:48 PM

#

these all feel like questions that google can easily answer and im not trying to be rude but it feels slightly rude to keep asking someone stuff that is so easily discovered. like, i am not a search engine, ya know?

#

i will say good luck training on the word city or downtown because new york times square is overfitted something fierce and likely cant be fixed

warm agate Jun 1, 2023, 2:55 PM

#

Np

#

Will check through Google

rose bridge Jun 1, 2023, 6:50 PM

#

Is it possible to train LoRAs on an Apple M1? I tried googling, but it doesn't seem to be supported yet by most of the GUIs I've seen.

tall condor Jun 1, 2023, 11:24 PM

#

@surreal lagoon my model is producing strange patterns after a while of training, while other things keep training well. is there a way to find out what causes those patterns? for example, it looks like skin is being ripped off and stuff. it feels like a single image is becoming too dominant or so. any idea how to tackle that?

surreal lagoon Jun 1, 2023, 11:25 PM

#

are you training the text encoder? if so, are you freezing any of it?

#

shapes, textures and patterns are pretty strong features in the lower-to-mid layers of the text encoder

tall condor Jun 2, 2023, 12:03 AM

#

what do you mean when you say freezing?

#

do you think it makes sense to stop the text encoder training at the point where i start seeing patterns?

tall vault Jun 2, 2023, 1:31 AM

#

How can I create images with a specific person's face in it? I know I can try to get the description of the image. But can I train the model to know a person's face?

#

ok so I guess dreambooth can do this but I only have 6GB of vram

#

so, not possible?

valid coral Jun 2, 2023, 4:28 AM

#

Heyyyy

So I did this tute:
https://youtu.be/3uzCNrQao3o

Took me almost 6 hours to get through it

In the end, results were AWFUL

YouTube

SECourses

How To Install And Use Kohya LoRA GUI / Web UI on RunPod IO With St...

How to install famous Kohya SS LoRA GUI on RunPod IO pods and do training on cloud seamlessly as in your PC. Then use Automatic1111 Web UI to generate images with your trained LoRA files. Everything is explained step by step and amazing resource GitHub file is provided with necessary commands. If you want to use Kohya's Stable Diffusion trainers...

#

Is this still the best or is there a better one? I've heard aitrepreneur makes good ones?

#

Also wondering if I had the wrong python version when I did training ......

surreal lagoon Jun 2, 2023, 4:31 AM

#

it really might depend on your images that you're using

valid coral Jun 2, 2023, 4:31 AM

#

And then wondering if that wrong python version is resulting in the loras not getting loaded properly in a1111

surreal lagoon Jun 2, 2023, 4:31 AM

#

try with like 15 images and no class data and about 4000 steps at a learning rate of 1e-4

valid coral Jun 2, 2023, 4:32 AM

#

I have previously used the exact same images (13 of them) with Shivam's Dreambooth colab thingy, got great results

surreal lagoon Jun 2, 2023, 4:32 AM

#

oh

#

well dreambooth captions and lora stuff are different

valid coral Jun 2, 2023, 4:32 AM

#

Yeahh ... .. I dunno anything now 😄 my info is all from November, forever in the AI world

surreal lagoon Jun 2, 2023, 4:32 AM

#

i have never actually messed with lora

#

sounds messy

valid coral Jun 2, 2023, 4:34 AM

#

ahh? yeah see the terms are all new and I really don't know what is what

#

dbooth was destructive etc and lora is supposedly not?

#

But how do people make their civitai stuff? 😛

#

So basically I wanna add some faces to a civitai model ......

#

What's your personal favorite way of doing that?

surreal lagoon Jun 2, 2023, 4:36 AM

#

dreambooth doesn't have to be destructive

#

freeze about half the text encoder

#

do as few steps as it requires to actually get the results you seek

#

you want to have a few validation prompts like different celebs or 'a random european man' kind of thing to ensure you are not breaking it

#

i usually check these:

    # variable name is illustrative; the original message only listed the entries
    validation_prompts = {
        "woman": "a woman, hanging out on the beach",
        "man": "a man playing guitar in a park",
        "child": "a child flying a kite on a sunny day",
        "alien": "an alien exploring the Mars surface",
        "robot": "a robot serving coffee in a cafe",
        "knight": "a knight protecting a castle",
        "menn": "a group of men",
        "bicycle": "a bicycle, on a mountainside, on a sunny day",
        "cosmic": "cosmic entity, sitting in an impossible position, quantum reality, colours",
        "wizard": "a mage wizard, bearded and gray hair, blue star hat with wand and mystical haze",
        "wizarddd": "digital art, fantasy, portrait of an old wizard, detailed",
        "macro": "a dramatic city-scape at sunset or sunrise",
        "micro": "RNA and other molecular machinery of life",
        "gecko": "a leopard gecko stalking a cricket",
    }

valid coral Jun 2, 2023, 4:38 AM

#

And you enter these into what or where 😄 an a1111 extension? A separate thing?

surreal lagoon Jun 2, 2023, 4:39 AM

#

the inference.py script in my simpletuner repo

#

thomas

#

in fact my dreambooth script there is the least destructive i know of but it's more like a fine-tuning script now...

valid coral Jun 2, 2023, 4:41 AM

#

OK ........ that's not as straightforward/obvious as I was hoping 😄

#

But perhaps an avenue I shall explore

#

For now, I am neglecting my child and must get back to daddy duty 😛

valid coral Jun 2, 2023, 4:46 AM

#

surreal lagoon in fact my dreambooth script there is the least destructive i know of but it's m...

Hi, I'm neglecting again ... "my"? On your githubby thing? 😉 Can you share the link?

surreal lagoon Jun 2, 2023, 4:52 AM

#

https://github.com/bghira/SimpleTuner/tree/main

GitHub

GitHub - bghira/SimpleTuner: A basic Dreambooth implementation gear...

A basic Dreambooth implementation geared toward Stable Diffusion 2.1. - GitHub - bghira/SimpleTuner: A basic Dreambooth implementation geared toward Stable Diffusion 2.1.

untold moss Jun 2, 2023, 9:26 AM

#

what is the recommended resolution in ppi for training? is 72 enough?

stiff dust Jun 2, 2023, 10:59 AM

#

the cool thing about LoRA is: you can train it and then afterwards disable certain layers to check what happened

#

for example, I found that when training on my face it was the unet that made trouble (my photos are rather low quality android photos and the unet very quickly learned the graininess of the photos). So scaling down the unet and relying more on the text encoder fixed that for me

#

similarly, you can disable the first k layers of the text encoder and check how that affects your results

#

you can do all that afterwards and find out what went wrong in your training. Next, you retrain the lora or dreambooth, but this time excluding the layers that caused harm

#

@valid coral if you don't want to write your own code, there is a tool called EveryDream2 that can do DreamBooth and that comes with tuned default parameters

#

e.g. it already freezes the first 16 layers of the text encoder for SD 2.1, and so on
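Since "freezing" came up a few times without being spelled out: it means excluding some parameters from training so the optimizer leaves them at their pretrained values. Below is a minimal sketch of picking the first k text encoder layers by parameter name. It assumes Hugging Face style names such as `text_model.encoder.layers.<i>...`, and it also freezes the embeddings, which is a choice made for this example. `should_freeze` is a hypothetical helper, not something from EveryDream2 or SimpleTuner.

```python
import re

LAYER_RE = re.compile(r"encoder\.layers\.(\d+)\.")


def should_freeze(param_name, first_k):
    """True for embeddings and for transformer layers with index < first_k."""
    match = LAYER_RE.search(param_name)
    if match:
        return int(match.group(1)) < first_k
    # Not inside a numbered layer: freeze embeddings, train everything else.
    return "embeddings" in param_name


# Illustrative parameter names (assumed naming, not read from a real model).
names = [
    "text_model.embeddings.token_embedding.weight",
    "text_model.encoder.layers.0.self_attn.q_proj.weight",
    "text_model.encoder.layers.15.mlp.fc1.weight",
    "text_model.encoder.layers.16.mlp.fc1.weight",
    "text_model.final_layer_norm.weight",
]
frozen = [name for name in names if should_freeze(name, first_k=16)]

# With a real model (assumes PyTorch and transformers are installed):
# for name, param in text_encoder.named_parameters():
#     param.requires_grad_(not should_freeze(name, first_k=16))
```

"Freeze about half the text encoder" would then just be a different value of `first_k`.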

valid coral Jun 2, 2023, 11:05 AM

#

stiff dust @valid coral if you don't want to write your own code, there is a tool ...

I have that ... .. installed ... ... I've been slowly going through the extensive how-to documentation .... then I get distracted by other sparkly objects 😆
Now I'm trying to fix a broken venv in my a1111
and ...
yeah

#

venv fixed. Seems the dbooth extension for a1111 has been broken for a while. Saw a couple fixes on various forums that didn't seem to solve my issue. SO FORGET THAT IDEA

Now back to seeing if I can get dbooth to train in kohya_ss.

If not ... EveryDream Trainer 2.0 💪

valid coral Jun 2, 2023, 11:40 AM

#

OK! I finally got all my ducks in a row and hit the Train button for Dreambooth in kohya_ss.

Shortly after, I got a memory error 😄
(I have 12 GB)
So that's an official "no" to running Dbooth on my machine? There's no config to edit or slower way to do it?

"try setting max_split_size_mb to avoid fragmentation" -- not an option?

🙏
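On the `max_split_size_mb` hint from the error message: it is an option of PyTorch's CUDA allocator and is passed through the `PYTORCH_CUDA_ALLOC_CONF` environment variable. A minimal sketch of setting it from Python follows. The value 128 is only an illustrative starting point, and where exactly to put this in kohya_ss is not something the thread covers.

```python
import os

# Must be set before the first CUDA allocation, so do it before importing torch.
# 128 is an illustrative value, not a recommendation from this thread.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"
```

This can help when the out-of-memory error is caused by fragmentation, but it does not reduce how much memory training actually needs, so a smaller batch size or a lighter training method may still be necessary on 12 GB.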

warm agate Jun 2, 2023, 11:41 AM

#

I got this error, where should I change the padding_side parameter?
https://github.com/Vision-CAIR/MiniGPT-4

valid coral Jun 2, 2023, 11:42 AM

#

This is a heck of a place to ask that 😄

warm agate Jun 2, 2023, 11:55 AM

#

valid coral This is a heck of a place to ask that 😄

ik, but there isn't a server for them so asked here

valid coral Jun 2, 2023, 11:55 AM

#

Not the worst place. There are people here that might know. But are they awake right now ....

warm agate Jun 2, 2023, 12:27 PM

#

valid coral Not the worst place. There are people here that might know. But are they a...

Not sure

valid coral Jun 2, 2023, 12:28 PM

#

warm agate Not sure

Are you familiar with Text Generation WebUI? There's a discord, I've gotten some great support with related stuff there...

#

I just tried sharing the link but it was blocked 😛

#

Easy enough to google it

warm agate Jun 2, 2023, 12:33 PM

#

valid coral Are you familiar with Text Generation WebUI? There's a discord, I've gotten some...

This one
https://github.com/oobabooga/text-generation-webui

valid coral Jun 2, 2023, 12:45 PM

#

Indeed!

valid coral Jun 2, 2023, 12:54 PM

#

valid coral venv fixed. Seems the dbooth extension for a1111 has been broken for a while. Sa...

HUR HURRRR, EveryDream2 can't use safetensors models ... .. . .. . .. . .. . ..

so I guess that's the end of that one

#

...unless I use THIS perhaps??
https://github.com/diStyApps/Safe-and-Stable-Ckpt2Safetensors-Conversion-Tool-GUI

warm agate Jun 2, 2023, 1:04 PM

#

@valid coral

#

What might be the error?

valid coral Jun 2, 2023, 1:05 PM

#

Just looks like a timeout error .... tried installing too many github things? Happened to me once. Go to the main github website in a browser and see if it has you on timeout...

#

also not sure what the "jllllll" thing is about

warm agate Jun 2, 2023, 1:06 PM

#

valid coral Just looks like a timeout error .... tried installing too many github things? Ha...

maybe, because I got timeout even on gallery-dl

warm agate Jun 2, 2023, 1:31 PM

#

valid coral Just looks like a timeout error .... tried installing too many github things? Ha...

How can we fix it?

#

valid coral Jun 2, 2023, 1:32 PM

#

When you go here, no errors?
https://github.com/oobabooga/text-generation-webui

#

And you're using the one-click installer?

rare niche Jun 2, 2023, 2:37 PM

#

Is it better to train on a celebrity with a smaller set of images (40) or larger set of images (160)? How do you decide how much is too much? I heard that 100 steps per image is correct. Is that true?

warm agate Jun 2, 2023, 3:09 PM

#

valid coral And you're using the one-click installer?

Yep

tall vault Jun 2, 2023, 3:11 PM

#

I'm trying to figure out how to train for specific faces. My graphics card only has 6GB vram so I can't use Dreambooth so I am trying to use Kohya to create a lora but everything I try with Kohya causes errors and nothing works.

What should I do?

warm agate Jun 2, 2023, 3:34 PM

#

@valid coral Same error when I ran pip install -r requirements.txt through the normal installation meth

jaunty grove Jun 2, 2023, 3:43 PM

#

Hi All,

Noob to AI Art here, having started generating with early access to the Leonardo AI cloud platform 2 weeks ago, where I generated a fair few images using different models, and even trained a couple of models as well (albeit through a very simple, user-friendly GUI).

I swiftly moved on to installing Automatic1111 and SD, and have been doing some great local generations and upscales using different models and LoRAs from Civitai etc.

Today I installed Kohya_SS GUI, after much frustration as the install doesn't "just work" when running Setup.bat, or at least didn't for me.

I set about doing my first training in the Kohya GUI, and provided a set of 130 images, and used WD1.4 to tag them. Tags look good enough to me, so I went ahead with the training, and got my first character/person LoRA out of it. It kind of worked, and it's had an effect, but it's not strong enough.

I just don't know what I'm doing with the Repeat, Batch Size, and Epoch settings 😦 I ran that first Lora on defaults:

Repeats: 40
Batch Size: 1
Epoch: 1
Optimizer: AdamW8bit
Text Encoder Learning Rate: 5e-5
UNet Learning Rate: 5e-5

I have no idea how this correlates to the number of steps, how often I should output a sample image etc

I understand an epoch is a complete pass over the data set, where each image is trained "repeat" times as part of the epoch. What's a good number of steps to aim for to train a model? How does the number of repeats / epochs affect things?

Any advice appreciated, or pointers to some good resources.

Thx

valid coral Jun 2, 2023, 3:44 PM

#

warm agate <@543489650241568780> Same error when I ran `pip install -r requirements.txt` th...

You'll be much better off asking at their Discord, friend:
https://github.com/oobabooga/text-generation-webui/discussions/600

warm agate Jun 2, 2023, 3:46 PM

#

valid coral You'll be much better off asking at their Discord, friend: https://github.com/oo...

Sure definitely

jaunty grove Jun 2, 2023, 3:53 PM

#

I think I just answered some of my own questions lol:

I wanted to know how image repeat, epochs etc related to steps, and then saw this in the Kohya output:

#

That tells me what I want to know, but what I don't know is, is 26K steps good? Is there any guidance around this, and also the batch size.

I know what batch size is when doing generations. Is it the same here? So if I set batch size to 4, I'd end up with 26K x 4 steps, and the idea being that each pass over an image, per epoch, will generate 4 images to train on
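
On the batch size question: in training, a larger batch lowers the step count, because each optimizer step processes several images at once; it does not generate extra images. A minimal sketch of the usual arithmetic follows. The helper name `total_steps` is hypothetical, and the exact count Kohya reports can differ slightly, for example when aspect-ratio buckets leave partially filled batches.

```python
import math

def total_steps(num_images: int, repeats: int, epochs: int, batch_size: int) -> int:
    """Approximate optimizer steps: each epoch sees every image `repeats` times,
    and one step consumes `batch_size` of those samples."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs

# First run described above: 130 images, 40 repeats, 1 epoch, batch size 1.
print(total_steps(130, 40, 1, 1))   # 5200
# Raising the batch size to 4 divides the step count instead of multiplying it.
print(total_steps(130, 40, 1, 4))   # 1300
```

The same function can be inverted by hand to pick repeats or epochs for a target step count.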

#

I'm running a training now, and have set it to output an image every 100 steps. That's gonna be a crap ton of sample images as it trains

tall vault Jun 2, 2023, 5:43 PM

#

@valid coral did you have any luck with training? I'm also trying to figure out how to do it

jaunty grove Jun 2, 2023, 6:14 PM

#

tall vault <@543489650241568780> did you have any luck with training? I'm also trying to fi...

Well, I took out a few images that I thought would pollute the data set because in one the person was wearing a hat, and a couple of others they had weird colour contacts in.

I tried with batch size 4, epoch 10, repeats 30.

Took an hour, and I could see it was getting closer to representing the person, but wasn't quite there. I was watching the sample pics being generated.

I'm doing another run, this time batch 4, epoch 20, repeats 40 which is likely to take 3 hours.

My learning rate for both is 5e-4.

I'm new to it all, both AI art with SD, and training. Fingers crossed I'll get something good this time.

I'm doing lots of reading to try and understand everything

valid coral Jun 2, 2023, 6:29 PM

#

tall vault <@543489650241568780> did you have any luck with training? I'm also trying to fi...

Still making my way through the EveryDream2 tutorial...
https://github.com/victorchall/EveryDream2trainer
It won't allow me to train a .safetensors model from civitai, but at this point I'm desperate enough to get ANY kind of result 😉

#

The Dreambooth extension is broken with a recent update of A1111 ... and the lora I trained with kohya_ss was also totally unusable (generated errors).

tall vault Jun 2, 2023, 6:37 PM

#

mama mia, has anyone been able to get this to work?

valid coral Jun 2, 2023, 6:48 PM

#

Of course! Just nobody that's here right now 😄

#

And I just hit the part in the tutorial video where the guy shows it running -- it's using 22 GB of VRAM ... so that's the end of EveryDream2trainer for meeee!

valid coral Jun 2, 2023, 6:49 PM

#

tall vault mama mia, has anyone been able to get this to work?

For the record, I did TONS of training, but it was last November ... and the whole world has changed since then. Now I'm trying with 2.1 models

tall vault Jun 2, 2023, 7:02 PM

#

Ya I think I might need to do the training on something that's cloud hosted. Then once I have the lora I'll be good?

#

if I'm doing cloud hosting maybe I should use dreambooth? does dreambooth produce lora's too?

jaunty grove Jun 2, 2023, 7:08 PM

#

valid coral The Dreambooth extension is broken with a recent update of A1111 ... and the lor...

My first lora generated from kohya today worked fine in the Automatic1111 gui, it just wasn't close enough to the look I wanted.

I didn't do anything special first time around, except to select the rpgv4 checkpoint as my base model for training.

The rest of the training settings were default mostly

jaunty grove Jun 2, 2023, 7:10 PM

#

valid coral And I just hit the part in the tutorial video where the guy shows it running -- ...

I'm on kohya, running on a 4080 with 16GB vram.

No idea what my training run in kohya used, but I had a batch size of 4, and a training res of 512 * 768.

Took an hour to do around 9300 steps, and produced a working lora
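
For the open question of how much VRAM a run actually uses, one option is to poll `nvidia-smi` while training is in progress. The sketch below assumes an NVIDIA driver is installed and `nvidia-smi` is on the PATH; the helper names `parse_memory_csv` and `query_vram` are made up for illustration.

```python
import subprocess

def parse_memory_csv(text: str) -> list[tuple[int, int]]:
    """Parse 'used, total' lines (in MiB) as printed by nvidia-smi in CSV mode."""
    rows = []
    for line in text.strip().splitlines():
        used, total = (int(part) for part in line.split(","))
        rows.append((used, total))
    return rows

def query_vram() -> list[tuple[int, int]]:
    """Ask nvidia-smi for per-GPU memory use (one tuple per GPU)."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_memory_csv(out)

# Parsing a sample line, in the shape nvidia-smi prints for one GPU:
print(parse_memory_csv("9423, 16376\n"))  # [(9423, 16376)]
```

Calling `query_vram()` from a second terminal during training gives the live figure; the peak is usually reached within the first few steps.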

valid coral Jun 2, 2023, 7:34 PM

#

Yeah I'm gonna rewatch the kohya_ss tutorial video and do exactly what the guy did, instead of branching off with my own model...

valid coral Jun 2, 2023, 7:35 PM

#

tall vault Ya I think I might need to do the training on something that's cloud hosted. The...

If you're only doing lora, that's much more VRAM friendly. You can use kohya_ss
I think for me it never ran over 6 GB...

tall vault Jun 2, 2023, 7:43 PM

#

valid coral If you're only doing lora, that's much more VRAM friendly. You can use kohya_ss ...

it won't work for me, whenever I try to do anything I get errors. When I try to caption images it says I'm missing "cudart64_110.dll"

#

then it says " Ignore above cudart dlerror if you do not have a GPU set up on your machine."

#

but I do have a GPU

#

I have a laptop with integrated graphics and an RTX 2060. So maybe it's accidentally using the integrated one?
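
One way to rule out the integrated-graphics theory is to ask PyTorch which CUDA devices it can see, from inside the same virtual environment the trainer uses. Integrated Intel/AMD graphics never show up as CUDA devices, so an empty list points to a CPU-only PyTorch build or a driver problem instead. This is a small diagnostic sketch; the function name `visible_cuda_devices` is hypothetical.

```python
def visible_cuda_devices() -> list[str]:
    """Return the names of the CUDA devices PyTorch can use, or [] if none."""
    try:
        import torch  # must be the torch installed in the trainer's venv
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    return [torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())]

print(visible_cuda_devices())
```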

tall condor Jun 2, 2023, 9:44 PM

#

anyone know if it's a problem that i have buckets with only 1 image?

#

also if i increase the bucket size does that mean that my images are also handled in bigger chunks or is the only difference the grouping?

jaunty grove Jun 2, 2023, 9:50 PM

#

tall vault then it says " Ignore above cudart dlerror if you do not have a GPU set up on yo...

I have the solution for that, as I was installing kohya for the first time today and had that error.

What I did was copy that file from the Automatic1111 directory to the root folder of kohya.

I then got errors about my xformers version and PyTorch version.

To fix this, from a command prompt in the kohya venv root dir, I ran the following to install the latest xformers and PyTorch+cu118 (kohya wasn't running at this time):

pip install -U xformers

pip install -U torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

After that, it amazingly works.

Why the setup.bat of the kohya GUI just doesn't work is a mystery.

Took me a while to figure out, as whilst I am a dev, I know nothing of Python.

If you need the cudart DLL let me know

tall condor Jun 2, 2023, 10:03 PM

#

found this, IMO it's quite useful

#

https://hoshikat-hatenablog-com.translate.goog/?_x_tr_sl=auto&_x_tr_tl=en&_x_tr_hl=de&_x_tr_pto=wapp#Dont-upscale-bucket-resolution

人工知能と親しくなるブログ

A blog that covers topics related to artificial intelligence

#

it describes the kohya_ss parameters quite well

tall condor Jun 2, 2023, 10:30 PM

#

if i dont "upscale bucket resolution" (dont downscale my images to fit the max size) in combination with random crop, does that mean that each time the image is picked for learning, a random 512x512 section of the image is selected, but without scaling the image down?
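
The thread does not settle whether kohya_ss actually behaves this way. Purely to make the behaviour in the question concrete, here is a sketch of "random crop at native resolution": pick a random 512x512 window from the full-size image each time it is sampled. The function `random_crop_box` is hypothetical and is not taken from kohya_ss.

```python
import random

def random_crop_box(width: int, height: int, crop: int = 512, rng=random):
    """Return (left, top, right, bottom) of a random crop x crop window
    inside a width x height image, without resizing the image first."""
    if width < crop or height < crop:
        raise ValueError("image is smaller than the crop size")
    left = rng.randint(0, width - crop)
    top = rng.randint(0, height - crop)
    return (left, top, left + crop, top + crop)

# A 2048x1536 photo yields a different 512x512 region on each draw.
rng = random.Random(0)
for _ in range(3):
    print(random_crop_box(2048, 1536, rng=rng))
```

With large images each window covers only a small part of the subject, so the caption may no longer describe what the crop shows.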

tall vault Jun 2, 2023, 10:30 PM

#

jaunty grove I have the solution for that, as I was installing koyah for the first time today...

THAT WORKED

surreal lagoon Jun 2, 2023, 10:31 PM

#

400 steps into trying to burn a model lol

tall condor Jun 2, 2023, 10:31 PM

#

if its only 400 steps xD

surreal lagoon Jun 2, 2023, 10:32 PM

#

2.1 can burn like ice

tall condor Jun 2, 2023, 10:32 PM

#

all my 2.1 trials failed hard so i stick with 1.5 for now xD

surreal lagoon Jun 2, 2023, 10:32 PM

#

Sad

jaunty grove Jun 2, 2023, 10:32 PM

#

tall vault THAT WORKED

Glad to help out. I really struggled this morning to get the Kohya GUI to work, had too many issues, but glad I got it sorted eventually, and also that I could spread word of how to get it working for others 🙂

My own lora works nicely (though I need to re-train to get a closer look to the person I trained on), but what I'm finding is that when combining with other lora's, the prompt weights seem off. it's like my lora's weight is too heavy, and I have to really up the prompt weight of the other lora

tall condor Jun 2, 2023, 10:33 PM

#

maybe your model is overfitting too much

surreal lagoon Jun 2, 2023, 10:33 PM

#

the bicycle with lord of the rings i am trying to get transferred into the model

#

KEK

tall condor Jun 2, 2023, 10:33 PM

#

yea i see no ring xD

surreal lagoon Jun 2, 2023, 10:34 PM

#

the seat is beige

#

very hobbitsy, you see

tall condor Jun 2, 2023, 10:34 PM

#

xD

surreal lagoon Jun 2, 2023, 10:34 PM

#

look at the blurry, soft focus

tall condor Jun 2, 2023, 10:34 PM

#

come back when it can ride in mordor - then we talk!

surreal lagoon Jun 2, 2023, 10:34 PM

#

look at its gandalf

#

hes so smol

tall condor Jun 2, 2023, 10:35 PM

#

please have gandalf ride that thingy

surreal lagoon Jun 2, 2023, 10:35 PM

#

the ballrog?

tall condor Jun 2, 2023, 10:35 PM

#

no the bicycle

surreal lagoon Jun 2, 2023, 10:36 PM

#

tall condor Jun 2, 2023, 10:36 PM

#

@surreal lagoon can you answer this: if i dont "upscale bucket resolution" (dont downscale my images to fit the max size) in combination with random crop, does that mean that each time the image is picked for learning, a random 512x512 section of the image is selected, but without scaling the image down?

surreal lagoon Jun 2, 2023, 10:36 PM

#

omg it doesn't do it

#

oh i typed basketball rog

tall condor Jun 2, 2023, 10:37 PM

#

that gandog?

surreal lagoon Jun 2, 2023, 10:37 PM

#

gandog the gray

surreal lagoon Jun 2, 2023, 10:37 PM

#

tall condor <@636706883859906562> can you answer this: if i dont "upscale bucket resulution"...

i'd have to go look at the source code

#

you know my feelings on this bucketing nonsense

tall condor Jun 2, 2023, 10:38 PM

#

yea i know but if the behavior is as i describe i can actually even see a benefit

#

because i would learn way more details on a model this way

surreal lagoon Jun 2, 2023, 10:38 PM

#

it's possible

#

it's also possible it'll break it

tall condor Jun 2, 2023, 10:39 PM

#

btw my patterns are gone after latent upscaling all the images

#

i guess it can only break it if the images are too big and the sections are too small

#

but i see your point

#

probably creating those details beforehand and then captioning them properly would be the best option

surreal lagoon Jun 2, 2023, 10:40 PM

#

baseline's understanding of a leopard gecko man

tall condor Jun 2, 2023, 10:40 PM

#

but i just have too many images for that

#

lizmouse?

surreal lagoon Jun 2, 2023, 10:40 PM

#

leopard gecko

tall condor Jun 2, 2023, 10:40 PM

#

make a cacco? xD cat gecko?

surreal lagoon Jun 2, 2023, 10:40 PM

#

chibiYell

tall condor Jun 2, 2023, 10:41 PM

#

also its way more gecko than man

surreal lagoon Jun 2, 2023, 10:41 PM

#

man isn't in the prompt!!!

#

you

#

you monster.

tall condor Jun 2, 2023, 10:42 PM

#

but still nice result

surreal lagoon Jun 2, 2023, 10:42 PM

#

        "woman": "a woman, hanging out on the beach",
        "man": "a man playing guitar in a park",
        "child": "a child flying a kite on a sunny day",
        "alien": "an alien exploring the Mars surface",
        "robot": "a robot serving coffee in a cafe",
        "knight": "a knight protecting a castle",
        "menn": "a group of men",
        "bicycle": "a bicycle, on a mountainside, on a sunny day",
        "cosmic": "cosmic entity, sitting in an impossible position, quantum reality, colours",
        "wizard": "a mage wizard, bearded and gray hair, blue  star hat with wand and mystical haze",
        "wizarddd": "digital art, fantasy, portrait of an old wizard, detailed",
        "macro": "a dramatic city-scape at sunset or sunrise",
        "micro": "RNA and other molecular machinery of life",
        "gecko": "a leopard gecko stalking a cricket"

tall condor Jun 2, 2023, 10:42 PM

#

very much and clear details

surreal lagoon Jun 2, 2023, 10:42 PM

#

it's the gecko prompt

#

yeah when you train 2.1 on terminal SNR it goes amazingly clear and crisp

#

SAI needs to update it with this baked in

tall condor Jun 2, 2023, 10:43 PM

#

my main issue with my models atm is that the training sometimes picks up the colors and patterns more than the objects. is there a way to tackle that?

surreal lagoon Jun 2, 2023, 10:43 PM

#

yeah freeze the text encoder more

tall condor Jun 2, 2023, 10:43 PM

#

also can i see your "cosmic"

#

my next try will be to stop the text encoder at 50% and see where it takes me

surreal lagoon Jun 2, 2023, 10:44 PM

#

100->400

tall condor Jun 2, 2023, 10:44 PM

#

looks cool

surreal lagoon Jun 2, 2023, 10:44 PM

#

pretty cool eh

#

way better than 2.1 without SNR fixes

tall condor Jun 2, 2023, 10:45 PM

#

and lol for "menn"

surreal lagoon Jun 2, 2023, 10:45 PM

#

surreal lagoon Jun 2, 2023, 10:45 PM

#

tall condor my next try will be to stop the text encoder at 50% and see where it takes me

i stop mine at 25pct
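
The "stop the text encoder at N% of training" idea can be expressed as a per-step learning rate. The sketch below decays the text encoder rate linearly to zero at the cut-off; whether the cut-off should be gradual or abrupt is an assumption here, and `text_encoder_lr` is a hypothetical helper, not part of any trainer mentioned in this thread.

```python
def text_encoder_lr(step: int, total_steps: int, base_lr: float,
                    stop_fraction: float = 0.25) -> float:
    """Learning rate for the text encoder at `step`.

    Decays linearly from base_lr to 0 over the first `stop_fraction`
    of training, then stays at 0 (encoder effectively frozen)."""
    stop_step = int(total_steps * stop_fraction)
    if stop_step <= 0 or step >= stop_step:
        return 0.0
    return base_lr * (1 - step / stop_step)

# With 10,000 total steps the text encoder stops learning at step 2,500.
for step in (0, 1250, 2500, 5000):
    print(step, text_encoder_lr(step, 10_000, 1e-6))
```

Setting `stop_fraction=0.5` gives the 50% variant proposed above.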

tall condor Jun 2, 2023, 10:46 PM

#

ill copy that xD

#

it was my idea

#

and if it sucks it was yours

#

some of my concepts have very few pics (maybe 2-3) - do you think the batch size of 6 will kill those concepts?

#

im quite happy that i could get rid of those patterns as they were a real PITA

surreal lagoon Jun 2, 2023, 11:10 PM

#

i broke it. my cosmic prompt

tall condor Jun 2, 2023, 11:11 PM

#

how?

surreal lagoon Jun 2, 2023, 11:11 PM

#

successfully regressed Stable Diffusion's capabilities by like 3 years KEK

surreal lagoon Jun 2, 2023, 11:11 PM

#

tall condor how?

i'm testing training earlier layers for style transfer

tall condor Jun 2, 2023, 11:15 PM

#

some of my concepts have very few pics (maybe 2-3) - do you think the batch size of 6 will kill those concepts?

surreal lagoon Jun 2, 2023, 11:24 PM

#

no idea

tall vault Jun 2, 2023, 11:38 PM

#

I was trying to see if I could successfully create a lora with kohya so I just made one with 2 images. It's still taking a while, downloading a bunch of stuff it seems, and I gotta take my laptop with me to the airport in like 10 min. If I cancel the operation will it screw up my kohya installation?

tall condor Jun 2, 2023, 11:40 PM

#

likely yes

#

pseudoterminalx: does it make sense to train a 1.5 model with images greater than 512px?

jaunty grove Jun 2, 2023, 11:52 PM

#

tall vault I was trying to see if I could successfully create a lora with kohya so I just m...

Did you choose the right stuff in kohya?

When training a lora I didn't, and don't, have it do downloads of any of that stuff.

Once setup, it just worked. Are you using the dreambooth lora tab?

surreal lagoon Jun 2, 2023, 11:55 PM

#

tall condor pseudoterminalx: does it makes sense to train a 1.5 model with images greater 51...

probably not

#

broke it again, this is supposed to be a wizard

tall condor Jun 2, 2023, 11:56 PM

#

i mean, kinda looks like a wizard xD

tall vault Jun 2, 2023, 11:56 PM

#

jaunty grove Did you choose the right stuff in koyah? When training a lora I didn't, and don...

I didn’t change any of the defaults. Maybe it downloaded stuff because it was the first time. And ya the dream booth Lora tab

surreal lagoon Jun 2, 2023, 11:56 PM

#

100 steps earlier tho

tall vault Jun 2, 2023, 11:57 PM

#

What are you trying to make?

surreal lagoon Jun 2, 2023, 11:58 PM

#

i've extracted frames from The Hobbit and i'm transferring its style

#

experimenting with different text encoder layers training

#

so far it seems like the 16th layer is too early still

#

too many fundamentals you can break
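
To illustrate what "training only later text encoder layers" can look like, the sketch below decides trainability from a parameter's name. It assumes names of the form `text_model.encoder.layers.<index>...`, as used by Hugging Face `transformers` CLIP text models, and zero-based layer indices; the actual script may select layers differently, and `is_trainable` is a hypothetical helper.

```python
import re

_LAYER_RE = re.compile(r"encoder\.layers\.(\d+)\.")

def is_trainable(param_name: str, first_trainable_layer: int) -> bool:
    """True if the parameter belongs to an encoder layer at or after
    `first_trainable_layer`; everything else stays frozen in this sketch."""
    match = _LAYER_RE.search(param_name)
    if match is None:
        return False  # embeddings, final layer norm, etc.
    return int(match.group(1)) >= first_trainable_layer

# In a PyTorch training script this would be applied roughly as:
#   for name, param in text_encoder.named_parameters():
#       param.requires_grad = is_trainable(name, first_trainable_layer=17)
print(is_trainable("text_model.encoder.layers.16.mlp.fc1.weight", 17))  # False
print(is_trainable("text_model.encoder.layers.20.mlp.fc1.weight", 17))  # True
```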

tall vault Jun 2, 2023, 11:59 PM

#

Very interesting. Where did you learn about layers?

surreal lagoon Jun 3, 2023, 12:01 AM

#

by breaking them

jaunty grove Jun 3, 2023, 12:02 AM

#

tall vault I didn’t change any of the defaults. Maybe it downloaded stuff because it was th...

I've a 900Mbps connection, so maybe I didn't notice the downloads lol, as I was doing other stuff at the time as well.

Hope you get a lora created. Any problems let me know, and I'll see if I can help.

I'm new to kohya as well, as of this morning. I'm on my second training now.

It's going to take 3 hrs total on my 4080. Around 24K steps to train in total.

Same one I did earlier, that came out pretty good. But it wasn't quite close enough, so I'm re-running with more epochs and repeats

#

@surreal lagoon Are you using kohya?

How do you go about selecting the layer to train?

surreal lagoon Jun 3, 2023, 12:05 AM

#

i'm using a script i'm developing as i go

#

https://github.com/bghira/SimpleTuner

#

i love this one

#

looks so damn bootleg

tall vault Jun 3, 2023, 12:23 AM

#

Thanks @jaunty grove I’ll definitely give you a ping if I run into more trouble. I’m trying to figure out if I can create a Lora with someone’s face and use it in combination with the studio ghibli Lora to make ghibli themed profile pics

tall vault Jun 3, 2023, 1:06 AM

#

@surreal lagoon do you have a background in data science?

surreal lagoon Jun 3, 2023, 1:16 AM

#

lol i just keep breaking it god damn that's supposed to be a bicycle

#

KEKL

#

same ckpt

#

sigh

#

good

#

valid coral Jun 3, 2023, 6:43 AM

#

surreal lagoon

Looks solid to me!

valid coral Jun 3, 2023, 6:45 AM

#

surreal lagoon broke it again, this is supposed to be a wizard

I actually love this, I would make it an album cover

surreal lagoon Jun 3, 2023, 6:50 AM

#

#

this training session is going much better.

#

i've obviously bumped the weights of "people" up immensely. a scene that just describes a robot serving coffee, now serves it to a person

#

this one, in other training sessions with different dataset, would result in an animated wizard

warm agate Jun 3, 2023, 6:51 AM

#

are vast.ai gpu prices cheaper than runpod?

#

#

As it's just 1 cent/h and on Runpod it's $1.4/h

surreal lagoon Jun 3, 2023, 6:52 AM

#

that's so cheap. why

warm agate Jun 3, 2023, 6:52 AM

#

surreal lagoon that's so cheap. why

you should ask vast.ai

surreal lagoon Jun 3, 2023, 6:52 AM

#

i would honestly do performance tests. it might be overshared

warm agate Jun 3, 2023, 6:54 AM

#

surreal lagoon i would honestly do performance tests. it might be overshared

might be

#

Its 4x3090

#

It shows VRAM of 24GB

#

so does it have 24gb vram per gpu or collectively 24gb vram?

surreal lagoon Jun 3, 2023, 6:55 AM

#

hmm

#

24 per

#

that's priced linearly above the 48gb system too

#

damn so i can run like a cluster over there for what i pay now KEK

warm agate Jun 3, 2023, 6:56 AM

#

so we get 96gb vram

surreal lagoon Jun 3, 2023, 6:56 AM

#

seems like it

warm agate Jun 3, 2023, 6:56 AM

#

surreal lagoon Jun 3, 2023, 6:56 AM

#

yeah

warm agate Jun 3, 2023, 6:56 AM

#

Runpod is exactly twice the price of vast.ai

surreal lagoon Jun 3, 2023, 6:57 AM

#

that's a more common price

#

still really good price

warm agate Jun 3, 2023, 6:57 AM

#

Is vast.ai displaying low prices for new users

surreal lagoon Jun 3, 2023, 6:57 AM

#

good question

warm agate Jun 3, 2023, 6:57 AM

#

SO they can acquire the customers?

surreal lagoon Jun 3, 2023, 6:57 AM

#

well let me know if you find out lmao

warm agate Jun 3, 2023, 6:57 AM

#

Ok

#

I am contacting the customer support

#

see this

surreal lagoon Jun 3, 2023, 6:59 AM

#

the woman hanging out at the beach has always been a difficult prompt for 2.1 for some reason but now she even has all the right number of fingers. they're just the wrong ones

warm agate Jun 3, 2023, 6:59 AM

#

warm agate see this

a40 for just 0.165

surreal lagoon Jun 3, 2023, 6:59 AM

#

hm that shoulder tho

warm agate Jun 3, 2023, 7:00 AM

#

surreal lagoon the woman hanging out at the beach has always been a difficult prompt for 2.1 fo...

the wrist is pulled towards her, can we fix that with inpaint?

surreal lagoon Jun 3, 2023, 7:00 AM

#

i'm just continuing training it lmao

#

you guys inpaint whatever ya want

#

hmm

surreal lagoon Jun 3, 2023, 7:10 AM

#

warm agate the wrist is pulled towards her, can we fix that with inpaint?

i think it's not getting better

#

not sure

surreal lagoon Jun 3, 2023, 7:38 AM

#

@warm agate the price increases when you allocate more than 10gb of storage

warm agate Jun 3, 2023, 7:39 AM

#

surreal lagoon <@828208105631383572> the price increases when you allocate more than 10gb of st...

oh, then it would definitely increase as every AI model is something like 20gb+

surreal lagoon Jun 3, 2023, 7:39 AM

#

and also it's just random people's computers i think

#

warm agate Jun 3, 2023, 7:40 AM

#

surreal lagoon and also it's just random people's computers i think

yep it is, as i got an option to host

surreal lagoon Jun 3, 2023, 7:40 AM

#

so the datacenter prices are still good

#

makes longterm prices much higher tho, about 2x the price for 60gb of space

warm agate Jun 3, 2023, 7:41 AM

#

surreal lagoon

How to filter this?

#

We have a reliability score, so we can easily estimate

#

@surreal lagoon do you know how we can add minigpt4 into text-generation-webui?

surreal lagoon Jun 3, 2023, 7:42 AM

#

nope.

warm agate Jun 3, 2023, 7:44 AM

#

@valid coral how to add minigpt4 into text-generation-webui, I have asked in their discord, but they seem unresponsive.
i have added minigpt into the pipeline, but how to work with the model?

surreal lagoon Jun 3, 2023, 7:46 AM

#

not really the right channel for any of that

warm agate Jun 3, 2023, 7:51 AM

#

surreal lagoon not really the right channel for any of that

hmm ok

valid coral Jun 3, 2023, 9:49 AM

#

warm agate <@543489650241568780> how to add minigpt4 into text-generation-webui, I have ask...

Indeed, this is an image generation place.

Though Stability did produce an LLM of their own... why isn't there a channel for it?

🤷‍♂️

I've only played with the text generation webui for maybe 20 minutes myself, I'm not a good person to ask for support unfortunately. I was only suggesting their Discord as a potential place to get support for playing with LLMs.

warm agate Jun 3, 2023, 9:50 AM

#

valid coral Indeed, this is an image generation place. Though Stability did produce an LLM ...

Hmm ok

#

@valid coral Can you debug this

valid coral Jun 3, 2023, 9:53 AM

#

Debug? It looks fine to me...

But no, not really, I have probably 3% knowledge when it comes to Python.

#

I see you were getting a bunch of replies on the other Discord, but you were in the "dev" channel and not a support channel.

#

Asking all over the Interwebs is only gonna get you banz0red 🙂

So .... yes it's frustrating and confusing, but that's where we're at, this stuff is very much in its infancy.

warm agate Jun 3, 2023, 10:22 AM

#

valid coral Debug? It looks fine to me... But no, not really, I have probably 3% knowledge ...

Cuda extension is not installed. oh ok

warm agate Jun 3, 2023, 10:22 AM

#

valid coral Asking all over the Interwebs is only gonna get you banz0red 🙂 So .... yes it'...

ok

tall condor Jun 3, 2023, 10:43 AM

#

@surreal lagoon: what can i expect from the UNet training after freezing the TE? what shall i focus on to see if training is still improving?

surreal lagoon Jun 3, 2023, 3:54 PM

#

i would expect it to start taking on the textures of your images more than their contents

#

at least for sd2.1 it can kind of improve the model to keep training the unet but if your source material isn't truly high quality it just ruins it

#

the text encoder is most worthwhile to train

#

it's also the hardest to 😦

#

currently i'm testing offset noise for the first time and i'm just not expecting to see stuff like this in my results. is that what it does at first before swinging back and making more sense?

surreal lagoon Jun 3, 2023, 4:47 PM

#

well

#

tuning with multiple GPUs apparently you can't freeze the text encoder during training, at least, not the way i've done it

tall condor Jun 3, 2023, 5:15 PM

#

for me when i reduce the noise offset to 0.02 the results are getting way better

#

also it got rid of a lot of weird stuff for me

#

what value did you pick?

surreal lagoon Jun 3, 2023, 5:18 PM

#

GPT4 is telling me i can't have offset noise trained in and also freeze the text encoder so early 😄

#

it says it is not going to work

surreal lagoon Jun 3, 2023, 5:20 PM

#

tall condor what value did you pick?

0.1, is that too high 😄

#

oh whoa, GPT4 was right LMAO

#

god damn it

#

i hate it when the robot is right

tall condor Jun 3, 2023, 5:37 PM

#

0.02

#

try that

#

works quite well even for dark pics

#

also you need to increase your epochs with low noise
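
For context on what the 0.02 vs 0.1 value controls: offset noise, as it is commonly described, adds one extra random shift per image and channel on top of the usual per-pixel Gaussian noise, scaled by the offset value. Below is a dependency-free sketch using nested lists; in a real trainer this is a one-line tensor operation, and `offset_noise` is a hypothetical name.

```python
import random

def offset_noise(batch, channels, height, width, offset, rng):
    """Per-pixel N(0, 1) noise plus a per-(image, channel) constant shift
    drawn from N(0, 1) and scaled by `offset`."""
    base = [[[[rng.gauss(0.0, 1.0) for _ in range(width)]
              for _ in range(height)]
             for _ in range(channels)]
            for _ in range(batch)]
    shifts = [[rng.gauss(0.0, 1.0) for _ in range(channels)]
              for _ in range(batch)]
    return [[[[value + offset * shifts[b][c] for value in row]
              for row in base[b][c]]
             for c in range(channels)]
            for b in range(batch)]

# Latent-sized noise for one image: 4 channels of 64x64.
noise = offset_noise(1, 4, 64, 64, offset=0.02, rng=random.Random(0))
print(len(noise), len(noise[0]), len(noise[0][0]), len(noise[0][0][0]))  # 1 4 64 64
```

A larger offset moves the mean brightness of each channel further per step, which is consistent with 0.1 being much more disruptive than 0.02 in the reports above.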

surreal lagoon Jun 3, 2023, 5:42 PM

#

thank you

#

🦎

surreal lagoon Jun 3, 2023, 6:03 PM

#

love how thorough the notes from LAION are on OpenCLIP

#

H/14 with big batch size works well, but unstable, and very hard to recover

Planned for 256 * 135M epochs from 2B-en
Spike at epoch 122. Tried a lot of stuff to recover
Only one thing worked: decreasing lr fast for 8 epoch, got 74% that way
Batch size 79k, starting lr 5e-4
1 week to train + many days to try to figure it out
800gpus
Doing 8 epoch with batch size 158k gave 75.4%

Finished up to 256 at batch size 79k in bfloat16 and reached 78.0%

#

so trying to fine-tune 2.1 on a huge cluster of GPUs is harder than training on a small group

#

i hadn't seen this page before now but it's fun how my results mesh with theirs

#

kinda wish they'd started from scratch once they figured it out again

surreal lagoon Jun 3, 2023, 6:07 PM

#

tall condor try that

seems to be doing a lot better with that low of a value

tall condor Jun 3, 2023, 6:07 PM

#

i had the same experience

#

especially for very dark and very bright. and for me after adding that low noise value the contrast got much better in general

surreal lagoon Jun 3, 2023, 6:09 PM

#

https://github.com/huggingface/diffusers/pull/3664

GitHub

[WIP] Fix schedulers zero SNR and rescale classifier free guidance ...

This PR implements two of the three suggestions from the Common Diffusion Noise Schedules and Sample Steps are Flawed paper, which is discussed in #3475:

Rescale Betas to have zero SNR, which enab...

#

omg omg omg

#

it's happeeennnningggg

#

ALL schedulers in diffusers will do zero terminal SNR now
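
The "rescale betas to have zero SNR" step mentioned in the PR summary can be written out in a few lines. This is a plain-Python sketch of the rescaling described in the Common Diffusion Noise Schedules and Sample Steps are Flawed paper, not the code from the PR itself; the `scaled_linear_betas` defaults are the Stable Diffusion schedule values as far as I know, and both function names are illustrative.

```python
import math

def rescale_zero_terminal_snr(betas):
    """Shift and scale sqrt(alpha_bar) so the last timestep has zero SNR
    while the first timestep keeps its original value."""
    alphas_bar_sqrt = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alphas_bar_sqrt.append(math.sqrt(running))
    first, last = alphas_bar_sqrt[0], alphas_bar_sqrt[-1]
    scale = first / (first - last)
    shifted = [(value - last) * scale for value in alphas_bar_sqrt]
    alphas_bar = [value ** 2 for value in shifted]
    # Recover per-step alphas from the cumulative product, then betas.
    alphas = [alphas_bar[0]] + [
        alphas_bar[i] / alphas_bar[i - 1] for i in range(1, len(alphas_bar))
    ]
    return [1.0 - alpha for alpha in alphas]

def scaled_linear_betas(start=0.00085, end=0.012, steps=1000):
    """Betas that are linear in sqrt space (assumed SD-style schedule)."""
    lo, hi = math.sqrt(start), math.sqrt(end)
    return [(lo + (hi - lo) * i / (steps - 1)) ** 2 for i in range(steps)]

betas = rescale_zero_terminal_snr(scaled_linear_betas())
print(betas[-1])  # 1.0
```

A final beta of 1 means the last timestep is pure noise, which is the property the original schedule lacks.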

tall condor Jun 3, 2023, 6:13 PM

#

great work man

#

cant wait to see this feature in kohya

surreal lagoon Jun 3, 2023, 6:18 PM

#

oh Max is the one who did all the work, i just trained a model that allowed them to test it

tall condor Jun 3, 2023, 6:19 PM

#

@surreal lagoon can you friend/pm me? i have 1 question regarding hardware

surreal lagoon Jun 3, 2023, 6:19 PM

#

i hate doin that tbh i get a huge friends list full of people i don't know lmao

tall condor Jun 3, 2023, 6:20 PM

#

np. can you recommend any AI workstation with 2-4 4090?

#

what hardware setup are you on?

surreal lagoon Jun 3, 2023, 6:21 PM

#

2 to 4 of them? 😮

#

threadripper, to start with

#

dual power supplies.. it's a lot

#

like, my 5800X3D and the ASUS X570p (AM4) are capable of having two GPUs but the 4090 uses three slots

tall condor Jun 3, 2023, 6:33 PM

#

things i have seen so far have capabilities for 2x 4090

surreal lagoon Jun 3, 2023, 6:34 PM

#

so get one 😛

tall condor Jun 3, 2023, 6:35 PM

#

i already have 2 workstations with 1 each but it's soooo darn slow

surreal lagoon Jun 3, 2023, 6:36 PM

#

it kind of doesn't go faster with two

#

you just get higher batch size

tall condor Jun 3, 2023, 6:36 PM

#

i am a bit concerned that high batch sizes mess up my concepts

surreal lagoon Jun 3, 2023, 6:37 PM

#

you want your entire dataset absorbed in a single shot if possible

#

that is the best, but, no one can do that

#

that's the only reason we batch stuff

#

so, the higher the better

tall condor Jun 3, 2023, 6:39 PM

#

i see

#

it just still doesnt make sense to me if i mix up the learning result of 2 different things in one update that it can still learn both the things you know

#

and mixing 6 of them IMO cant make it any better

#

just makes no sense in my head
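
On why one update can serve several samples at once: with a batch, the gradient applied is the average of the per-sample gradients, so every sample in the batch contributes its own pull. The toy model below has two weights and two "concepts" that each touch only one weight. It is a deliberately simplified illustration; in a real diffusion model concepts share weights and can interfere with each other.

```python
def sample_gradient(weights, features, target):
    """Gradient of the squared error (prediction - target)**2 for one sample."""
    prediction = sum(w * f for w, f in zip(weights, features))
    error = prediction - target
    return [2.0 * error * f for f in features]

def batch_gradient(weights, batch):
    """Average of the per-sample gradients, one entry per weight."""
    grads = [sample_gradient(weights, f, t) for f, t in batch]
    return [sum(column) / len(batch) for column in zip(*grads)]

weights = [0.0, 0.0]
batch = [([1.0, 0.0], 3.0),    # "concept A" only touches weight 0
         ([0.0, 1.0], -1.0)]   # "concept B" only touches weight 1
for _ in range(200):
    grad = batch_gradient(weights, batch)
    weights = [w - 0.1 * g for w, g in zip(weights, grad)]
print([round(w, 3) for w in weights])  # [3.0, -1.0]
```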

surreal lagoon Jun 3, 2023, 6:41 PM

#

#

i just do what the AI tells me

surreal lagoon Jun 3, 2023, 10:04 PM

#

i now have a pretty good National Geographic dataset

tall condor Jun 3, 2023, 10:37 PM

#

cool

#

i found a way to add another rtx4090 to one of the workstations xD

#

@surreal lagoon do you have any other useful tips for training larger datasets? i still have some issues where the details in the pictures are not picked up very well

surreal lagoon Jun 3, 2023, 10:45 PM

#

nay

tall condor Jun 3, 2023, 10:45 PM

#

also any tips on how i can prolong the training and improve the dataset in the long run?

#

so far 100-200 epochs are getting me somewhere but some concepts are still generating very badly

surreal lagoon Jun 3, 2023, 11:19 PM

#

tall condor also any tips on how i can prolong the training and improoving the dataset on th...

slower LR

#

well

#

you might need to up the learning rate and freeze more layers for 1 epoch

#

and then, go back to old settings

#

or unfreeze more layers

#

it's a game about tricking the model into a new space that is in the direction you want and then slowing down and refining it

tall condor Jun 3, 2023, 11:20 PM

#

if i reduce the LR the concepts mostly wont get created at all

surreal lagoon Jun 3, 2023, 11:20 PM

#

that's why the polynomial learning rate has a really high learning rate for a number of warm up steps

#

so it can move the model into a new zone that it needs to clean up the output of

#

then the learning rate decays and slows down
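
For reference, a polynomial schedule with warm-up is usually implemented as a linear ramp from zero up to the peak rate during the warm-up steps, followed by a polynomial decay toward an end value. The sketch below follows that common shape; the function name and default values are assumptions, not settings taken from this thread.

```python
def polynomial_lr(step: int, peak_lr: float, warmup_steps: int,
                  total_steps: int, end_lr: float = 0.0,
                  power: float = 1.0) -> float:
    """Linear warm-up to peak_lr, then polynomial decay to end_lr."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    if step >= total_steps:
        return end_lr
    remaining = 1.0 - (step - warmup_steps) / (total_steps - warmup_steps)
    return (peak_lr - end_lr) * remaining ** power + end_lr

# Peak is reached at the end of warm-up, then the rate falls away.
for step in (0, 250, 500, 5000, 10_000):
    print(step, polynomial_lr(step, 1e-5, 500, 10_000, end_lr=1e-7))
```

A constant scheduler skips both phases, which may be why lowering the rate there stalls new concepts entirely.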

tall condor Jun 3, 2023, 11:20 PM

#

im currently using constant scheduler

#

but i tried cosine and tbh i didnt see much difference

dark gale Jun 4, 2023, 12:33 AM

#

Hello! anyone using Runpod? I want to know if it worth the use

surreal lagoon Jun 4, 2023, 1:53 AM

#

@tall condor ever seen this pixelation?

surreal lagoon Jun 4, 2023, 3:35 AM

#

hm it's not always there

stone garden Jun 4, 2023, 10:32 AM

#

/imagine, living room, parquet flooring, large French window, sunset, country-style furnishings, sofa, fireplace

tall condor Jun 4, 2023, 10:38 AM

#

looks like it failed to create the depth of field

#

which is ok IMO

surreal lagoon Jun 4, 2023, 2:23 PM

#

surreal lagoon Jun 4, 2023, 5:05 PM

#

😄

#

progress

surreal lagoon Jun 4, 2023, 5:23 PM

#

Meta key: Title values: ["RHODES"]
Meta key: Creator values: ["Cushman, Charles W., 1896-1972"]
Meta key: Date modified values: ["02\/03\/2022"]
Meta key: Subject values: ["Towers","Spires","Seas","Forts & fortifications","Coastlines","Buildings","Clouds","Waterfronts","Islands","Rhodes (Greece : Island)"]
Meta key: Roll Number values: ["4-65"]
Meta key: Date Created values: ["1965-04-04"]
Meta key: Source values: ["P13980"]
Meta key: Holding Location values: ["Bloomington - University Archives<br \/>Wells Library E460<br \/>1320 E 10th St.<br \/>Bloomington, IN 47405<br \/>Contact at <a href=\"mailto:archives@indiana.edu\">archives@indiana.edu<\/a>, <a href=\"tel:812-855-1127\">812-855-1127<\/a>"]
Meta key: Alternate ID values: ["465.37"]
Meta key: Campus values: ["IU Bloomington"]
Meta key: City values: ["Rhodes"]
Meta key: State/Province values: ["Aegean Islands"]
Meta key: Country values: ["Greece"]
Meta key: Genre values: ["Seascapes","Cityscape photographs"]
Meta key: Call Number values: ["P13980"]
Meta key: Frame Number values: ["37"]
Meta key: County values: ["Sporades"]
Meta key: Persistent URL values: ["http:\/\/purl.dlib.indiana.edu\/iudl\/archives\/cushman\/P13980"]
Meta key: Cushman Identifier values: ["P13980"]

generates the caption:

Generated title for image: rhodes towers spires seas forts fortifications coastlines buildings clouds waterfronts islands (greece island) aegean seascapes cityscape photographs sporades county
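
A rough sketch of how archive metadata like the above could be flattened into a caption: pick a few fields, lowercase them, drop punctuation, and remove repeated words. This is an illustrative guess, not the script that produced the caption above (which, for instance, keeps the parentheses), and `build_caption` is a hypothetical name.

```python
import re

def build_caption(meta: dict[str, list[str]], keys: list[str]) -> str:
    """Join the selected metadata values into one lowercase caption,
    stripping punctuation and skipping words that already appeared."""
    seen = set()
    words = []
    for key in keys:
        for value in meta.get(key, []):
            cleaned = re.sub(r"[^a-z0-9\s]", " ", value.lower())
            for word in cleaned.split():
                if word not in seen:
                    seen.add(word)
                    words.append(word)
    return " ".join(words)

meta = {
    "Title": ["RHODES"],
    "Subject": ["Towers", "Forts & fortifications", "Rhodes (Greece : Island)"],
    "Genre": ["Seascapes", "Cityscape photographs"],
}
print(build_caption(meta, ["Title", "Subject", "Genre"]))
# rhodes towers forts fortifications greece island seascapes cityscape photographs
```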

surreal lagoon Jun 4, 2023, 6:43 PM

#

NPCDryadSmug

#

this is amazing

sturdy dagger Jun 4, 2023, 9:00 PM

#

I would like to run some dreambooth training with ShivamShrirao repo, two questions:

Is it mandatory to provide a class_data_dir folder?
Is there somewhere I could find good quality regularization photos (for men and women)?

tall condor Jun 4, 2023, 11:56 PM

#

gabinino: i recommend you start without regularisation

#

and see where it takes you. as far as i understand you cannot just take any regularisation images, they need to be made with the model you train on

jade hinge Jun 5, 2023, 12:36 AM

#

https://www.youtube.com/watch?v=QTYX0tgA5ho

YouTube

SECourses

Human Cropping Script & 4K+ Resolution Class / Reg Images For Stabl...

I have discovered a workflow that has never been explored before, which allows for studio-quality realism beyond expectations using Stable Diffusion DreamBooth / LoRA training. To achieve this workflow, it required an exceptionally high-quality dataset of classification / regularization images. Additionally, I developed a script capable of autom...

#

can be used for fine tuning

sturdy dagger Jun 5, 2023, 8:57 AM

#

jade hinge https://www.youtube.com/watch?v=QTYX0tgA5ho

Thanks!

jaunty grove Jun 5, 2023, 3:19 PM

#

tall vault Thanks @jaunty grove I’ll definitely give you a ping if I run into more ...

Hey Snubber,

I now know why you were getting all those downloads in Kohya when you kicked off training that first time. It's because you had the default model path in "Source model", that causes Kohya to go off and download all the checkpoints from the runwayML git.

Probably a bit late now, but what you need to do, and what I did, is press the paper icon button and open an existing checkpoint/safetensor model e.g. SD1.5, or RPGv4, or any other model that you likely have got installed into Automatic1111

I just set the path to a model that is in my Automatic1111 directory, and then Kohya just uses that and doesn't download anything

#

tall vault Jun 5, 2023, 3:21 PM

#

Okay cool! Good to know thanks for the info @jaunty grove !!

jaunty grove Jun 5, 2023, 3:25 PM

#

tall vault Okay cool! Good to know thanks for the info @jaunty grove !!

Did you get any Loras trained in the end? I've done two Loras for two different people. I had to try a few times with different learning rates, epochs, repeats etc, as I've read that for a person around 1500-3000 steps are enough.

I found with too low a learning rate for the unet, and main learning rate, it just didn't take. Too high, and it ended up just looking really bad. I'm still trying to figure it all out. Through trial and error I got my two people Loras to work, but the weighting seems really "heavy". By that I mean my Loras' default weight compared to other keywords is too high. I've no idea how to change the weightings that get baked into the Lora as part of the learning 😦

tall vault Jun 5, 2023, 3:27 PM

#

jaunty grove Did you get any Lora's trained in the end? I've done two Lora's for two differen...

I haven't tried, been busy, but hopefully will get around to it tonight. That's good to know. Also I'm pretty sure you can put weights on your loras. At least I know that you can do it in comparison to the weights of other loras

#

civitai isn't loading for me but if you check out the example in here https://civitai.com/models/6526/studio-ghibli-style-lora
In one example they use a zelda lora and the studio ghibli lora to make an image of zelda in ghibli style. And you can see they have different weights for the different loras

#

https://civitai.com/images/71819?modelVersionId=7657&prioritizedUserIds=53515&period=AllTime&sort=Most+Reactions&limit=20

#

here it finally loaded

jaunty grove Jun 5, 2023, 3:29 PM

#

Yeah I can put weights on when I use it, but the default weighting feels too heavy, so in use I end up putting a low weight on it to bring it down a bit e.g. <lora:mylora:0.6>

tall vault Jun 5, 2023, 3:29 PM

#

ohh ok, interesting

jaunty grove Jun 5, 2023, 3:29 PM

#

I'm sure it's me missing something. Let me know if you find the same when you get a chance to try 🙂

tall vault Jun 5, 2023, 3:29 PM

#

will do

hot ether Jun 5, 2023, 4:46 PM

#

which option to choose guys?

#

3060ti

stiff dust Jun 5, 2023, 5:14 PM

#

honestly, you don't need accelerator if you have only a single gpu

stiff dust Jun 5, 2023, 5:18 PM

#

jaunty grove Yeah I can put weights on when I use it, but the default weighting feel too heav...

I observed that, too. I guess it's the same reason why people merge models and get better results. Setting your lora to 0.6 weight is essentially merging your lora with the base model in a 6:4 ratio
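The 6:4 remark can be checked with a few lines of arithmetic. This is a toy sketch on flat lists of numbers, assuming the LoRA weight simply scales the learned weight delta before it is added to the base weights; the values and function names are made up for illustration.

```python
def apply_lora(base, delta, weight):
    """Base weights plus the LoRA delta scaled by the user-chosen weight."""
    return [b + weight * d for b, d in zip(base, delta)]

def interpolate(model_a, model_b, ratio_a):
    """Weighted average of two models: ratio_a of A, the rest of B."""
    return [ratio_a * a + (1.0 - ratio_a) * b for a, b in zip(model_a, model_b)]

base = [0.10, -0.20, 0.30]    # toy base-model weights
delta = [0.05, 0.02, -0.04]   # toy LoRA update

scaled = apply_lora(base, delta, 0.6)   # LoRA used at weight 0.6
full = apply_lora(base, delta, 1.0)     # LoRA fully applied
blended = interpolate(full, base, 0.6)  # 6:4 merge of "full" and base

same = all(abs(s - m) < 1e-12 for s, m in zip(scaled, blended))
```

Under that assumption the two results coincide, because base + 0.6 * delta equals 0.6 * (base + delta) + 0.4 * base.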

#

can you share your learning rate, number of steps, and number of images, as well as the rank?

tall condor Jun 5, 2023, 5:21 PM

#

so it appears running with 2 GPUs is not halving the time for training, anyone know why?

#

also it shows like it is running for 400 epochs even though i specify 2

#

anyone know why

surreal lagoon Jun 5, 2023, 10:13 PM

#

stiff dust honestly, you don't need accelerator if you have only a single gpu

it's great though, it handles compiling the unet.

surreal lagoon Jun 5, 2023, 10:13 PM

#

tall condor so it appears running with 2 GPUs is not halving the time for training, anyone kn...

that's just not how it works

#

you just get to run a larger batch size but everything is limited by the main system doing the training

tall condor Jun 5, 2023, 10:15 PM

#

actually i think that the time is halving but for some reason if i specify 200 epochs it runs 200 epochs per gpu

#

so it will run 400 epochs

#

same for batch. i specify 6 batch but it does 12

#

however 1 epoch does run much faster

#

but what i'm not sure of is if i can just say 100 epochs if i want 200
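The batch-size observation is what plain data parallelism would produce. The sketch below assumes the configured batch size is per GPU and that each GPU processes its own shard of the data; the function names and the 1200-sample dataset are hypothetical, and it does not explain how any particular trainer counts or displays epochs.

```python
import math

def effective_batch_size(per_gpu_batch, num_gpus, grad_accum_steps=1):
    """Samples that contribute to one optimizer step."""
    return per_gpu_batch * num_gpus * grad_accum_steps

def optimizer_steps_per_epoch(num_samples, per_gpu_batch, num_gpus, grad_accum_steps=1):
    """Optimizer steps needed to see every sample once."""
    batch = effective_batch_size(per_gpu_batch, num_gpus, grad_accum_steps)
    return math.ceil(num_samples / batch)

two_gpu_batch = effective_batch_size(6, 2)                # batch 6 per GPU on 2 GPUs
steps_one_gpu = optimizer_steps_per_epoch(1200, 6, 1)
steps_two_gpus = optimizer_steps_per_epoch(1200, 6, 2)
```

With the same per-GPU batch, an epoch takes half as many optimizer steps on two GPUs, and each step averages over twice as many samples.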

surreal lagoon Jun 5, 2023, 10:20 PM

#

yeah it'll scale learning rate too

#

so it'll destroy model faster

tall condor Jun 5, 2023, 10:34 PM

#

can you elaborate?

#

not sure if i understand

surreal lagoon Jun 5, 2023, 10:38 PM

#

?

#

the learning rate is impacted by batch size

#

rescaled CFG is scary

tall condor Jun 5, 2023, 10:44 PM

#

but i thought higher batch size means more stable learning?

#

thus i could even increase the learning rate

#

or did i misunderstand that part

stiff dust Jun 5, 2023, 10:53 PM

#

no, thats correct

tall condor Jun 5, 2023, 10:54 PM

#

i wonder what parallel mechanism is used by dreambooth

stiff dust Jun 5, 2023, 10:54 PM

#

surreal lagoon it's great though, it handles compiling the unet.

but thats a one liner xD

tall condor Jun 5, 2023, 10:59 PM

#

if i merge 2 models somehow the result is different if i merge A+B and B+A - size of the first is 5.8 GB, size of the second is 7.8 GB

#

anyone know why?

#

is the merge not combining both models into one and applying a weight if they both have the same key?

surreal lagoon Jun 5, 2023, 11:00 PM

#

stiff dust but thats a one liner xD

not during training

stiff dust Jun 5, 2023, 11:01 PM

#

dunno, I thought it's also just unet.compile(). But to be honest: I didn't notice any performance improvement from compiling (besides waiting minutes until the compiler's done)

surreal lagoon Jun 5, 2023, 11:12 PM

#

when you compile it and try and do certain operations it has to be recompiled and it breaks if you try and recompile it when it's already done

sturdy falcon Jun 6, 2023, 12:50 AM

#

hey so I'm not sure if this belongs here or in #📝｜prompting-help let me know if this isn't the right channel for this please :)

I'm trying to take a seamless tiling image and upscale it with Controlnet 1.1 tiling resampler, following this guide: https://stable-diffusion-art.com/controlnet-upscale/

Environment Info:

A1111 webui
dreamshaper model
Controlnet v1.1 sd15_tile model
Ultimate SD Upscale script

I'm getting it to generate nice upscale details, but it's not seamless at the edges of the image. It's generating abrupt lines where it repeats even though I am selecting the tiling setting under the img2img settings at the top.

**Pictures of settings attached:**

img2img settings: https://cdn.discordapp.com/attachments/273241020077047810/1115441000010240070/firefox_2023-06-05_18-26-34.png
Controlnet settings: https://cdn.discordapp.com/attachments/273241020077047810/1115441000463204433/firefox_2023-06-05_18-27-03.png
Ultimate SD Upscaler script settings: https://cdn.discordapp.com/attachments/273241020077047810/1115441000861679656/firefox_2023-06-05_18-28-36.png

I'm using the website https://www.pycheung.com/checker/ to check that the images tile seamlessly, here's a comparison of the input and output images:

Images in the tiling checker

Example 1: input image tiling: https://cdn.discordapp.com/attachments/273241020077047810/1115441002388410448/image.png
Example 1: output image doesn't tile: https://cdn.discordapp.com/attachments/273241020077047810/1115441001721511986/image.png
Example 2: input image tiling: https://cdn.discordapp.com/attachments/273241020077047810/1115441001113341962/firefox_2023-06-05_18-37-56.jpg
Example 2: output image doesn't tile: https://cdn.discordapp.com/attachments/273241020077047810/1115441001369174046/firefox_2023-06-05_18-37-29.jpg

Does anyone know how to fix this?

#

I'm gonna post example 1 again so it embeds these two but not everything else, I don't want to clog up the channel lol

visual lichen Jun 6, 2023, 1:22 AM

#

Sorry if this is the wrong place if so I'll delete:
Looking for guidance on where to go to learn to train "styles" as embeddings or loras or whatever where it won't affect the content much but will affect colors and lighting to make it match a style independent of content.

Also want to learn to train specific concepts better. I have been trying to make monuments into enormous fishtanks with mixed results. Is there a best practices here? I've trained hypernetworks and textual inversions with mixed success

#

Currently I am training by masking out the alpha on everything except my subject. That works ok but I don't have enough control. And I can't make the monuments (such as the leaning tower of pisa) into fishtanks well.

surreal lagoon Jun 6, 2023, 2:06 AM

#

surreal lagoon Jun 6, 2023, 2:06 AM

#

visual lichen Sorry if this is the wrong place if so I'll delete: Looking for guidance on wher...

hello! checkout nitrosocke's github

surreal lagoon Jun 6, 2023, 2:15 AM

#

sturdy falcon hey so I'm not sure if this belongs here or in #📝｜prompting-help let me kn...

im way too high but i want to help, can you distil it for me

#

you're saying the seams are always detected?

sturdy falcon Jun 6, 2023, 2:27 AM

#

surreal lagoon im way too high but i want to help, can you distil it for me

lol thanks, same tbh

So I made a seamlessly tiling image in txt2img. I'm trying to upscale that and keep the tiling.

It's always generating a bit at the edges and doesn't tile seamlessly after the upscale

#

you can see what I mean by using these two images in https://www.pycheung.com/checker/

surreal lagoon Jun 6, 2023, 2:28 AM

#

you can try inpainting the seams maybe

#

unfortunately i don't think they will ever be truly invisible. the way it works is by inpainting them already

sturdy falcon Jun 6, 2023, 2:30 AM

#

hmm I can try inpainting but I doubt it'll work, when I was inpainting images before without upscaling it was messing up the seams

#

I've gotten Topaz Photo AI to upscale them and keep the seamless tiling effect, but it doesn't keep generating the image with fine details like SD Controlnet does, it just kind of makes the low res image sharper and smoother looking

#

yeah, unfortunately inpainting the seams made it worse

#

it's like the controlnet is ignoring the tiling setting at the top

surreal lagoon Jun 6, 2023, 2:46 AM

#

i've managed to reintroduce the idea of smoking 😄

#

they're super smoky smokers now

#

i've upped the resolution of all my validations to see if i've managed to train out the model's tendency toward duplicate subjects, and voila, 1152x768

wispy elbow Jun 6, 2023, 3:13 AM

#

When running hires fix under latent, it always ruins hands that were previously perfect for me, I've been trying different settings but can't quite seem to get it. Anyone know the issue? Also, anyone able to explain the difference between the different Latent upscalers like Latent Nearest and Latent Nearest Exact? I've tried googling all these things to no avail, so I resort to bugging people here.

surreal lagoon Jun 6, 2023, 3:17 AM

#

sturdy falcon Jun 6, 2023, 3:35 AM

#

So it looks like the Stable Diffusion upscaler doesn't actually support seamless tiling images, but I might have found a workaround

generate your texture with the tileable setting turned on (result: 512x512 image)

tile the resulting image 2 by 2; meaning 2 tiles in X and Y direction = 4 tiles in total (result: 1024x1024 image)

upscale the 2-by-2 tiled image as much as you like (result: for example 4096x4096 for 4x upscaling)

crop the center part of the upscaled image (result in this case a 2048x2048 image)

check that the center crop is seamlessly tileable - which it usually is...

https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/3590#issuecomment-1336280476

GitHub

[Feature Request]: Seamless texture SD upscale · Issue #3590 · AUTO...

Is there an existing issue for this? I have searched the existing issues and checked the recent builds/commits What would your feature do ? I wish to be able to use SD upscaling on seamless texture...
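The size bookkeeping for that workaround can be sketched as follows. It only computes the crop box for a square tile; the function name is hypothetical, and the tiling, upscaling and cropping themselves would be done in whatever image tool is at hand.

```python
def center_crop_box(tile_size, upscale):
    """(left, top, right, bottom) of the centre region to keep after
    upscaling a 2x2 tiling of a square tile by an integer factor."""
    tiled = 2 * tile_size            # 2 tiles along each axis
    upscaled = tiled * upscale       # size after upscaling
    crop = tile_size * upscale       # one upscaled tile
    offset = (upscaled - crop) // 2  # margin removed on each side
    return (offset, offset, offset + crop, offset + crop)

box = center_crop_box(512, 4)  # 512 tile -> 1024 tiled -> 4096 upscaled
```

For a 512x512 tile and 4x upscaling the box is 2048 pixels wide, matching the numbers in the steps above. Half a tile is discarded on every side, which are the regions where the upscaler had no wrapped context to work with.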

surreal lagoon Jun 6, 2023, 3:45 AM

#

have you tried mixture diffusers

wispy elbow Jun 6, 2023, 3:57 AM

#

Not even sure what that is, so no.

surreal lagoon Jun 6, 2023, 4:12 AM

#

seamless tiling

#

it's in the multidiffusion extension

visual lichen Jun 6, 2023, 4:41 AM

#

surreal lagoon im way too high but i want to help, can you distil it for me

Yeah I have it trained on the Arc de Triomphe and on fishtanks with textual inversions but how do I make it do the Arc de Triomphe made out of fishtank?

surreal lagoon Jun 6, 2023, 4:43 AM

#

prompt alternating?

#

maybe

tall vault Jun 6, 2023, 8:01 AM

#

@jaunty grove first attempt at creating a lora based on me. It's pretty cursed. I did 1500 steps. I can increase the steps but are there any other settings I should try changing? (some people mentioned epochs, what does that do?)

stray pecan Jun 6, 2023, 10:59 AM

#

When using the Tiling option (in auto1111), does it change the Lora layers too, or only the base model's?

#

I want to train a Lora that is optimized for Tiling and I don't know if I should enable Tiling (changing Conv2d layers' padding to circular) just for the base model or also for the lora network during training

jaunty grove Jun 6, 2023, 11:46 AM

#

tall vault @jaunty grove first attempt at creating a lora based on me. It's pretty ...

Good first attempt. Can't say mine was much better lol.

So from all the reading I've been doing around 1500 - 3000 steps total is good for training on a person. My two that I've done that have worked out ok, and present the person in the AI art accurately were around 1900 steps total I think.

So there are a few elements that you need to consider when working out the total steps:

Total Steps = (Number of Images * Image Repeats * Epochs) / Batch Size

Number of images = Self explanatory
Image Repeats = Number of times each source image is shown to the model per epoch
Batch Size = Number of source images taken as a batch, which are presented to the model for training
Epoch = One full training pass over the dataset (Number of Images * Image Repeats, taken in batches of Batch Size)

Think of each epoch as a training run, the more epochs you do, you're kind of reinforcing the model by going over it all again and again. I've seen recommendations that around 10 epochs is good for training.

For learning rates, there are 3 of them: Learning Rate, Text Encoder Learning Rate, and UNet Learning Rate. I've literally no idea of the specifics of these, but I do know that the larger the value the faster it learns (which isn't good), and the lower the value the slower it learns, but make it too slow and it doesn't learn a lot. It's a huge balancing act, and I'm doing it through trial and error.

#

@tall vault I set all 3 learning rates to the same value. If you put in something like 0.005 that will be too big a number, the learning rate is too fast, and it won't learn. I used 0.000005 at one point, and that was crap, it didn't learn a lot. Eventually I used 0.00005, and that worked out great, but I do feel it will depend on your images.

So for my first character that I trained I had 122 images. (I had more, but dropped them because the person had face paint on, and weird contacts, which was messing it up.) So 122 images total

To get around 1500-3000 steps total, I used the following:

(122 images * 15 repeats * 10 epochs) / batch size 6 = 3050 steps total

3050 was near enough for me. I was happy with that so then I set about training with the learning rates mentioned above. Once I tried 0.00005 it was good for me, for that particular training data set.
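The step arithmetic above, written out as a small helper. The function name is made up for illustration; it just evaluates the formula from this message.

```python
def total_steps(num_images, repeats, epochs, batch_size):
    """Total Steps = (Number of Images * Image Repeats * Epochs) / Batch Size."""
    return (num_images * repeats * epochs) // batch_size

steps = total_steps(num_images=122, repeats=15, epochs=10, batch_size=6)
in_target_range = 1500 <= steps <= 3000
```

The result lands just above the 1500-3000 range, which is why it is described as "near enough". Dropping to 14 repeats would bring it to 2846.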

Another tip for person training is to make sure you have images of the person from different angles, wearing different clothes, different expressions etc, or the model will get overfitted (basically baked in), and rendering with your lora will mean your AI person always has the same kind of pose etc. Underfitting is where it hardly looks like the character you trained.

Apparently you can get away with 10 - 15 images to train on a person. I used 122 for that first person. Second person I used 19 images, and it worked ok (though not as good as the first one)

tall vault Jun 6, 2023, 12:18 PM

#

Okay awesome! thank you so much for all the info! I am going to give it another shot tonight

jaunty grove Jun 6, 2023, 12:19 PM

#

Also I've been reading up on, and watching lots of videos on AI upscaling techniques. The best one yet, uses a combo of the ControlNet, and Ultimate SD Upscaler extensions to Automatic1111

Here's some examples where I took an original 512 x 768 render, fixed up the eyes via inpainting, and took the image in 2x scaling increments to a whopping 8192 x 12288. So four lots of 2x upscalings to get there :-).

The small pic, and the first pixelated face are the original 512 x 768 image (eyes fixed up via inpainting), followed by ones with increasing detail.

tall condor Jun 6, 2023, 12:30 PM

#

i recommend to add a noise offset of 0.02 to your lora and use a LR Scheduler with 10% warmup (constant with warmup or cosine)
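What "10% warmup" means for the learning rate can be sketched like this. It is a simplified stand-alone version of "constant with warmup" and "cosine with warmup", not the code of any particular trainer; the function name is hypothetical, and the 3050 steps and 5e-5 are just the figures mentioned earlier in the chat. The noise offset is a separate setting and is not covered here.

```python
import math

def lr_at_step(step, total_steps, base_lr, warmup_fraction=0.1, cosine=False):
    """Learning rate at a given step: linear warmup, then constant or cosine decay."""
    warmup_steps = int(total_steps * warmup_fraction)
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * step / warmup_steps  # ramp up from 0 to base_lr
    if not cosine:
        return base_lr                        # constant after warmup
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

start = lr_at_step(0, 3050, 5e-5)
after_warmup = lr_at_step(305, 3050, 5e-5)
cosine_end = lr_at_step(3050, 3050, 5e-5, cosine=True)
```

The warmup keeps the first few hundred updates small, so the model is not hit with the full learning rate straight away.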

stiff dust Jun 6, 2023, 12:35 PM

#

sturdy falcon hmm I can try inpainting but I doubt it'll work, when I was inpainting images be...

hi, do you use the inpainting control net? It makes a HUGE difference, in particular, if you inpaint with high noise strength

stiff dust Jun 6, 2023, 12:36 PM

#

tall vault @jaunty grove first attempt at creating a lora based on me. It's pretty ...

I had trouble for a long time, too. I found that the most important thing is to use a very low CFG when you want photorealism

#

like it is super easy to train on your face and generate nice anime images of you. But making it photorealistic is difficult. use a CFG of 3 or 4
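For context, a minimal sketch of what the CFG scale does at each sampling step, written on plain lists instead of tensors. The numbers are toy values and the function name is hypothetical; the formula is the standard classifier-free guidance combination.

```python
def cfg_combine(uncond, cond, scale):
    """Move from the unconditional prediction towards (and past) the
    prompt-conditioned one by `scale` times their difference."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]

uncond = [0.0, 1.0]  # toy noise prediction without the prompt
cond = [1.0, 3.0]    # toy noise prediction with the prompt

low = cfg_combine(uncond, cond, 3.0)
high = cfg_combine(uncond, cond, 7.5)
```

A scale of 1 returns the conditioned prediction unchanged, and larger values extrapolate further beyond it, so 3 or 4 stays much closer to what the model predicts than the commonly used 7 or higher.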

tall vault Jun 6, 2023, 12:37 PM

#

stiff dust I had trouble for a long time, too. I found that the most important thing is to ...

I'm trying to use a studio ghibli style LoRa to create stylized images of me or anyone

#

do you think this is even possible?

stiff dust Jun 6, 2023, 12:39 PM

#

yes, I found that works straight away

#

even textual inversion is often good enough for that

#

the funny thing is that I also got extremely photorealism anime portraits of myself without problems (like they fit my face in super high details) but as soon as I want photorealism things get hard

#

reducing CFG to an extremely low value helps a lot, though. Like I got almost good photorealism with that

tall vault Jun 6, 2023, 12:41 PM

#

how did you get it get your face details? training a lora?

#

or just textual inversion?

stiff dust Jun 6, 2023, 12:42 PM

#

for best details you need a lora

#

but for an anime character (where you don't need all wrinkles and other details ;)) a textual inversion is enough

#

I have to say I found it easier to train on SD 2.1 than on SD 1.5 (in contrast to what most people say)

tall vault Jun 6, 2023, 12:43 PM

#

ok cool, any tips on settings/steps for anime?

surreal lagoon Jun 6, 2023, 2:46 PM

#

hmm so at batch size 18 i'm seeing 2.23 seconds per iter and at batch size 6 i'm seeing 1.2 seconds per iter. how much faster is BS=18?

#

a train leaves los angeles at <x> miles per hour, ...

#

using a batch size of 18 processes samples approximately 1.614 times faster than using a batch size of 6 (8.07 divided by 5 equals 1.614)

#

For batch size 18: 2.41 hours * $3.18/hour = approximately $7.67

For batch size 6: 3.89 hours * $3.18/hour = approximately $12.37
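The throughput comparison comes down to samples per second. A small sketch using the timings quoted above (the function name is made up):

```python
def samples_per_second(batch_size, seconds_per_iter):
    """Throughput in training samples per second."""
    return batch_size / seconds_per_iter

fast = samples_per_second(18, 2.23)  # batch size 18
slow = samples_per_second(6, 1.2)    # batch size 6
speedup = fast / slow
```

That gives about 8.07 versus 5.0 samples per second, a ratio of about 1.614. For a fixed number of samples, wall-clock time and rental cost shrink by the same factor.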

surreal lagoon Jun 6, 2023, 4:26 PM

#

still waiting for the model to be able to make a white background

hot breach Jun 6, 2023, 4:35 PM

#

some random samples from a training I ran last night using zero terminal SNR, it gets very close to white/black backgrounds

#

black and white backgrounds

#

settings, this was just sort of a big random dump of training data I have lying around, 13k images

surreal lagoon Jun 6, 2023, 5:03 PM

#

doesn't help me much because i'm using diffusers implementation 😛

hot breach Jun 6, 2023, 5:29 PM

#

this is diffusers

#

I think there are some things that still need to happen in auto1111 or whatever since it mostly uses ldm code and patches, some issues with getting the extensions that supposedly do the CFG rescaling to look right

surreal lagoon Jun 6, 2023, 6:28 PM

#

well you're only training the last 2 layers of the TE

#

damn, you ran 20 epochs of training? how many samples?

#

you're using EveryDream2, not diffusers

#

https://github.com/victorchall/EveryDream2trainer/commit/81b7b00df736894be0cd8a053656e062690a7cde

odd change they made 3 days ago with no comment why

surreal lagoon Jun 6, 2023, 7:17 PM

#

i wish it would be better at faces already, how many do i have to show it

#

definitely understands the overall concept

surreal lagoon Jun 6, 2023, 7:40 PM

#

hot breach some random samples from a training I ran last night using zero terminal SNR, it...

i guess it's actually doing much better than in the image i showed. i'm using offset noise as well as terminal SNR

#

this is about 5000 steps earlier in training. it did not want to do a white background at all

#

i'm assuming that it's going to take a while to fully train all of this new noise schedule i'm applying

warm agate Jun 6, 2023, 7:43 PM

#

@surreal lagoon can you please explain what generated images mean from this?
does it mean they were artificially generated using the training images fed into the algo?
they are completely different from the ones that were used to train it?
https://github.com/NVlabs/stylegan

surreal lagoon Jun 6, 2023, 7:46 PM

#

i don't know

warm agate Jun 6, 2023, 7:52 PM

#

surreal lagoon i don't know

ok

hot breach Jun 6, 2023, 8:14 PM

#

surreal lagoon you're using EveryDream2, not diffusers

it's still diffusers, it's a bunch of augmentation on top to allow multiple optimizers to be used, layer freezing, multiaspect, etc

#

given what I'm seeing here I think offset is not required with zero terminal snr, I think that was one of the points of the paper as well, offset noise is not very stable over time

#

that training was on 13k images, fairly random assortment of things

surreal lagoon Jun 6, 2023, 8:22 PM

#

offset noise helps it converge more quickly

#

might have to stop applying it at some point but it does help still, even with trailing and rescaled zero SNR betas

hot breach Jun 6, 2023, 8:24 PM

#

at least to me, it looks like a stable version of offset noise. with offset noise you need different amounts of it based on how long you train, like offsetnoise*0.1, the 0.1 is too much if you train more than a few thousand steps and the model will turn into splotchy figures on black etc

#

if you hand tune offset noise down it's more stable for longer periods, but not entirely stable
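A toy sketch of what the "0.1" in offset noise scales, using nested lists instead of tensors. The function name and seed handling are assumptions for illustration, and real implementations normally draw the shift per sample and per channel on latent tensors.

```python
import random

def offset_noise(channels, height, width, offset=0.1, seed=None):
    """Gaussian noise plus one random constant per channel, scaled by `offset`."""
    rng = random.Random(seed)
    noise = []
    for _ in range(channels):
        shift = offset * rng.gauss(0.0, 1.0)  # same shift for every pixel in the channel
        plane = [[rng.gauss(0.0, 1.0) + shift for _ in range(width)]
                 for _ in range(height)]
        noise.append(plane)
    return noise

plain = offset_noise(4, 8, 8, offset=0.0, seed=0)
shifted = offset_noise(4, 8, 8, offset=0.1, seed=0)
```

With the same seed, the two results differ only by a constant per channel. That constant is what lets the model learn to move overall brightness, and its size is the knob being hand-tuned in the discussion above.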

surreal lagoon Jun 6, 2023, 8:26 PM

#

well i trained at 30k steps without offset noise and the terminal SNR stuff didn't help anywhere near as much as both together did

hot breach Jun 6, 2023, 8:26 PM

#

surreal lagoon https://github.com/victorchall/EveryDream2trainer/commit/81b7b00df736894be0cd8a0...

the trained_betas is easier to pass into the from_pretrained and takes care of the alphas and betas

#

diffusers handles it more elegantly than changing both betas and alphas manually, just easier, but you need a schedule/timestep curve to run through the code snippet from the paper to "correct" it, so step 1 is load normally, then load again with the corrected schedule and discard the temporary scheduler instance
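The "correction" step can be sketched in plain Python. This follows the rescaling idea from the zero terminal SNR paper as I understand it (shift and rescale the square root of the cumulative alphas so the last one becomes zero), written with lists instead of tensors. The starting schedule assumes the usual Stable Diffusion "scaled linear" betas of 0.00085 to 0.012 over 1000 steps, and both function names are illustrative rather than taken from diffusers or EveryDream2.

```python
import math

def scaled_linear_betas(num_steps=1000, beta_start=0.00085, beta_end=0.012):
    """Betas that are linear in sqrt(beta); assumed starting schedule."""
    s, e = math.sqrt(beta_start), math.sqrt(beta_end)
    return [(s + (e - s) * i / (num_steps - 1)) ** 2 for i in range(num_steps)]

def enforce_zero_terminal_snr(betas):
    """Rescale betas so the final timestep carries no signal (SNR = 0)."""
    alphas_bar_sqrt, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta                  # cumulative product of alphas
        alphas_bar_sqrt.append(math.sqrt(running))
    first, last = alphas_bar_sqrt[0], alphas_bar_sqrt[-1]
    scale = first / (first - last)
    # Shift so the last value is 0, rescale so the first value is unchanged.
    shifted = [(v - last) * scale for v in alphas_bar_sqrt]
    alphas_bar = [v * v for v in shifted]
    alphas = [alphas_bar[0]] + [alphas_bar[i] / alphas_bar[i - 1]
                                for i in range(1, len(alphas_bar))]
    return [1.0 - a for a in alphas]

original = scaled_linear_betas()
corrected = enforce_zero_terminal_snr(original)
```

The corrected list is the kind of thing that would then be handed to the scheduler as its trained betas, as described above.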

surreal lagoon Jun 6, 2023, 8:29 PM

#

        pipeline.scheduler = DDIMScheduler.from_pretrained(
            model_id,
            subfolder="scheduler",
            rescale_betas_zero_snr=True,
            guidance_rescale=0.3,
            timestep_scaling="trailing"
        )

hot breach Jun 6, 2023, 8:29 PM

#

I'm a few subversions behind, not sure that was in

#

there are a few of us hacking on it regardless, its sort of a backdoor feature until we sort it all out and document

surreal lagoon Jun 6, 2023, 8:30 PM

#

it's still a WIP pull request i've merged on my fork and have been testing

hot breach Jun 6, 2023, 8:30 PM

#

ah cool

#

this guy is also working on it: https://github.com/huggingface/diffusers/compare/main...AMorporkian:diffusers:main

surreal lagoon Jun 6, 2023, 8:33 PM

#

my knight is becoming more handsome either way

#

how it starts, roughly

hot breach Jun 6, 2023, 8:36 PM

#

at least so far I think zero term works, several of us getting great samples but I think there may be issues in the auto1111 whatevers that handle inference side, but works like a charm on diffusers since the trained_betas actually get saved right in the scheduler_config.json, so works on invoke, sdgrate, samples from actual trainer, etc

surreal lagoon Jun 6, 2023, 8:36 PM

#

perfection

hot breach Jun 6, 2023, 8:36 PM

#

d-adaptation adam also seems to be working well for some people but unfortunately not very efficient right now

#

should have AdamA in soon

surreal lagoon Jun 6, 2023, 8:39 PM

#

well the offset noise has done what i've wanted so now i've removed it, at 12k steps

#

let's hope he comes down to earth and improves on the next ckpt

tall condor Jun 6, 2023, 9:11 PM

#

as for faces it needs to see the same face at least 500-1000 times for it to reproduce it

#

my concepts that run very few times suck very hard with the faces

sturdy falcon Jun 6, 2023, 9:25 PM

#

stiff dust hi, do you use the inpainting control net? It makes a HUGE difference, in partic...

I did try that as well, it still showed the seams very clearly. Thanks for the suggestion though

tall condor Jun 6, 2023, 9:57 PM

#

is there any tool that can convert regular text captions into tokens/tags?

hot breach Jun 6, 2023, 10:06 PM

#

tokens and tags are different things, what is it you're actually trying to accomplish?

tall condor Jun 6, 2023, 10:09 PM

#

well if i use wd14 or any other clip captioner i do not get tokens

#

like car, red, open window

#

its more like "a red car with open windows"

#

and i am wondering if there is a tool that can convert that into tokens

surreal lagoon Jun 6, 2023, 11:53 PM

#

those are tokens

#

people get caught up on this stuff and think that something little is going to solve their problem when it's not even close to being the issue

#

like, what problem are you trying to solve with that

ancient mural Jun 7, 2023, 12:43 AM

#

How much quality/accuracy is lost if you merge multiple models together?

hot breach Jun 7, 2023, 1:03 AM

#

tall condor its more like "a red car with open windows"

textcap models like blip write sentences and phrases, but some caption utilities do something they call "clip flavors" which is just trying to figure out if your image is visually close to a bunch of words in a dictionary, i.e. tags

#

some caption utilities do both: use blip to create "a man standing in a park" then clip flavors would add something like "claude monet, daytime, oil on canvas, outdoor"

surreal lagoon Jun 7, 2023, 1:48 AM

#

surreal lagoon Jun 7, 2023, 3:03 PM

#

@hot breach ok so offset noise breaks things a lot more now that terminal SNR is in there

#

without it, training goes better

hot breach Jun 7, 2023, 3:14 PM

#

offset noise is unstable, ztnr should be stable

#

getting things to play nice in auto1111 may be a challenge, diffusers handles sharing the data about the updated beta schedule better since it can be explicitly shared in the scheduler_config.json

#

that's ztnr only, no offset noise

gs05999-0-dimly_lit_office_with_a_huge_window_overlooking_a_city_at_night_from_mass_effect.jpg

#

gs09199-1-the_silhouette_of_a_man_in_an_orange_haze_film_still_from_bladerunner_2049.jpg

surreal lagoon Jun 7, 2023, 3:24 PM

#

i don't use automatic

#

once i hit 10k steps of training though, the thing starts screwing up

#

10k -> 12k -> 14k

#

hot breach Jun 7, 2023, 3:26 PM

#

above is 30k or so steps at batch 15

surreal lagoon Jun 7, 2023, 3:27 PM

#

i'm at batch 12

hot breach Jun 7, 2023, 3:27 PM

#

also grad accum 6 so effective batch size close to 100

surreal lagoon Jun 7, 2023, 3:27 PM

#

oh wow

#

that's a lot higher than mine, i have zero gradient accumulations in use

#

i'll restart from 10k steps with a higher batch and gradient size since i'm not interested in speeding through this training

hot breach Jun 7, 2023, 3:30 PM

#

unet LR 3.5e-6 constant, TE only unfreezing last 2 layers with lr 2e-6 cosine schedule

#

some of those settings are somewhat haphazard as I experiment but they're not far off

surreal lagoon Jun 7, 2023, 3:33 PM

#

i accidentally unfroze my whole text encoder for a few hundred steps once

#

it was about halfway into 30k steps

hot breach Jun 7, 2023, 3:35 PM

#

works with brighter stuff too

gs09599-1-a_blue_volvo_wagon_made_of_legos_minature_lego_model_of_a_volvo_wagon.png

surreal lagoon Jun 7, 2023, 3:36 PM

#

i've unfrozen a couple more layers of the TE to see whether this helps bring the weights up or whether it makes it worse shrug

#

we'll see, i guess

#

my assumption is that it makes it worse

jaunty wadi Jun 7, 2023, 3:58 PM

#

Okay maybe not the greatest example, but it should suffice. This is what I was referring to with these. (excuse the cat example, just wanted a really obvious choice its not trained on cats) These are tested with ((masterpiece)), outline, cel_shading cat, <lyco:CelShading-000001:0.75>,1 and ((masterpiece)), cat, <lyco:CelShading-000001:0.75> (with xyz plot on epochs/weight), for style loras, I've always heard it should be avoided if possible to make sure that you don't have to enter anything into the prompt style-wise. How could I avoid this? Would I prune all variants of "outline" or "cel_shading" from my training data?

surreal lagoon Jun 8, 2023, 3:20 AM

#

so damn close

woeful goblet Jun 8, 2023, 11:12 AM

#

How do i inpaint a fist in this pose, with the palm facing towards the viewer?
https://i.imgur.com/60pjsKM.png

i cannot seem to do it, i keep getting fists pointed the opposite way, with the back of the hand facing the viewer, even when inpainting over an image like this.
i've tried using variations on "palm facing viewer" and having knuckles in the negative prompt. But all i'm getting is either high quality inverted fists, or garbled flesh spaghetti

i have even tried inverting the colors of the fist in photoshop, and it made no difference

surreal lagoon Jun 8, 2023, 1:31 PM

#

wrong channel @woeful goblet

woeful goblet Jun 8, 2023, 1:32 PM

#

surreal lagoon wrong channel @woeful goblet

how the heck did you get the discord number 0001

chrome breach Jun 8, 2023, 1:33 PM

#

surreal lagoon same ckpt

Wow dude... looks amazing ✌️✨

surreal lagoon Jun 8, 2023, 1:38 PM

#

chrome breach Wow dude... looks amazing ✌️✨

thanks, 2.1 is a workhorse

chrome breach Jun 8, 2023, 1:43 PM

#

Mind sharing more details about how u did this fine-tuning??

#

I am currently trying to get my fine tuned model to give at least some level of realistic faces... but well, uk 1.5😂

#

Ig i'll try those configuration settings on 1.5... see how things go with that

#

Try using a lower cfg

#

Damn

worthy orchid Jun 8, 2023, 2:25 PM

#

does LORA add new info, or does it just tune your prompt to get the best result, like embeddings do

stiff dust Jun 8, 2023, 2:29 PM

#

LORA is more or less same as dreambooth. However, it depends a bit on the implementation you use

worthy orchid Jun 8, 2023, 3:38 PM

#

is dreambooth the same as embeddings 😛

stiff dust Jun 8, 2023, 3:52 PM

#

okay 😜
Dreambooth finetunes the complete model.
Lora finetunes large parts of the model, depending on the used implementation.

#

so it's not just the embedding but the text encoder and the unet

hot breach Jun 8, 2023, 7:58 PM

#

hella contrast with zero terminal snr only, no offset noise

gs04799-3-a_spiral_staircase_in_hogwards_film_still_from_harry_potter_movie.png

#

surreal lagoon Jun 8, 2023, 8:34 PM

#

fine-tuning progress going super well this time

tall condor Jun 8, 2023, 8:40 PM

#

still quite a lot of contrast tho

surreal lagoon Jun 8, 2023, 8:45 PM

#

that's my prompt asking for it

#

#

for the end of the training run i've added more faces to the dataset. that collection worked well before but if i used it for too long it started picking up watermarks

#

hoping that resolves this munchkin face issue

#

it did on a separate training run 🤞🏽

stiff dust Jun 8, 2023, 9:07 PM

#

can we expect one of these models, that you invest so many gpu hours in, online and downloadable soon? ;D