restive ridge Oct 21, 2022, 8:33 PM

#

Same thing I'm seeing with my samdoesarts model, which is supposed to give digital painting results. Going to have to rethink my prompts, not getting the same style. The same prompt is looking more realistic / CGI. Might need to be more careful now about using "realism", "ray tracing", etc.

modern lintel Oct 21, 2022, 8:40 PM

#

hey, I'm following Arki's Dreambooth guide (https://stablediffusionguides.carrd.co/#training-p) and managed to train the model on runpod.io, but now I have to download it and the download speeds are abysmal, like 200kbps. Does anyone know of a solution to that?

eternal radish Oct 21, 2022, 8:57 PM

#

is there any option to make a small specialized SD for img2img that would be like 100 times faster and use less vram, with like only one hardcoded prompt?

hot breach Oct 21, 2022, 9:00 PM

#

not sure thats immediately possible but you can reduce steps and/or denoising strength

restive ridge Oct 21, 2022, 9:01 PM

#

Using a v1.2 model or earlier might be lighter

hot breach Oct 21, 2022, 9:01 PM

#

they're all the same model, just more or less training

eternal radish Oct 21, 2022, 9:01 PM

#

hot breach not sure thats immediately possible but you can reduce steps and/or denoising st...

the minimum steps that give at least something are still too slow for me

#

i'm thinking more about a separate model i can create and train

#

just like "cat into dog" dataset, or single style

#

not sure if a smaller UNet or something would still be able to give good images

viral jay Oct 21, 2022, 9:17 PM

#

Guys I'm trying to train it on a logitech G935 headset, but even after 20k steps it's still not able to really capture it, any tips?

hot breach Oct 21, 2022, 9:18 PM

#

those may be too close up

#

and cropped

viral jay Oct 21, 2022, 9:18 PM

#

it has around 40 photos for learning, plus BLIP captions, so I'm training the embedding with subject_filewords too

hot breach Oct 21, 2022, 9:19 PM

#

screenshot of training images?

viral jay Oct 21, 2022, 9:19 PM

#

hot breach Oct 21, 2022, 9:20 PM

#

some of the training images look a bit too close up and may confuse it

viral jay Oct 21, 2022, 9:21 PM

#

do you think selecting images that are more "clean" would help? or are those with text around them, etc. not such a problem?

hot breach Oct 21, 2022, 9:21 PM

#

ideally you want your training images to be very well cropped, I would remove the images where the entire headset is not visible

#

like, a small portion close up might be ok, but.. I see quite a few there that are cropped weirdly

viral jay Oct 21, 2022, 9:22 PM

#

🤔 hmm I will try to use just good ones then and see if I get better results, I thought closeup ones would help to improve details

hot breach Oct 21, 2022, 9:22 PM

#

it can, but they probably need to be a much smaller % of the total training set

#

you can see from your outputs you get heavily cropped images, too

viral jay Oct 21, 2022, 9:23 PM

#

that's true I haven't paid attention to it

hot breach Oct 21, 2022, 9:23 PM

#

I would say keep the extreme close ups like 5% max of the total set, just as a rough idea

#

so maybe 1-2 images out of 40, or maybe try once with none of them cropped, only full photos
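The 5% rule of thumb above can be turned into a quick calculation. This is only a sketch of the arithmetic; `max_close_ups` is a hypothetical helper name, and the 5% cap is the rough figure suggested in the chat, not a tested threshold.

```python
def max_close_ups(total_images: int, max_percent: int = 5) -> int:
    """Largest number of extreme close-ups that stays within max_percent of the set.

    Integer arithmetic is used so the result never suffers from float rounding.
    """
    return total_images * max_percent // 100


# With the 40-image set discussed above, the cap works out to 2 close-ups.
print(max_close_ups(40))
```

Anything above that count would be a candidate for removal or for replacing with a full-object photo.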

viral jay Oct 21, 2022, 9:24 PM

#

I will take pics of my real one so I can get it more controlled, the main problem is that I probably will be holding it with my hands so not quite sure if that's a good idea lol

hot breach Oct 21, 2022, 9:25 PM

#

maybe get a string or coat hanger or something to hang them from

#

take some photos outside, too

#

or on your own head

#

then caption "a man wearing logitech headset" or whatever

viral jay Oct 21, 2022, 9:25 PM

#

good idea, thanks for the advice, let's see if I can get this thing good

#

do you think the filewords are relevant at all?

#

for embedding?

hot breach Oct 21, 2022, 9:26 PM

#

what sort of training are you doing? TI? dreambooth?

viral jay Oct 21, 2022, 9:26 PM

#

I'm doing TI, I've also tried HN

hot breach Oct 21, 2022, 9:27 PM

#

I'm not sure how it works with TI

#

I don't know if captions count for that

viral jay Oct 21, 2022, 9:28 PM

#

well yeah for faces TI didn't do as well as HN, maybe my images aren't helping it so I will try to take more consistent photos and see which one can capture it better

hot breach Oct 21, 2022, 9:28 PM

#

no real idea on HN but I got the impression it is something more useful for styles than objects/subjects?

viral jay Oct 21, 2022, 9:29 PM

#

for faces it does a great job

#

not sure if I'm right but I think faces are mainly a subject type?

#

I'm going to try with something else first: I took photos of a thermometer. Once I manage to get that replicated I'll get back to the headset, my headset is too dusty for pictures 😅

green flax Oct 22, 2022, 12:08 AM

#

if i train an embedding with a hypernetwork selected in settings will it use the hypernetwork while training

tribal rapids Oct 22, 2022, 12:42 AM

#

viral jay Guys I'm trying to train it with a logitech G935 headphone, but even after 20k s...

@viral jay talking about product objects did you see this by the way? https://mobile.twitter.com/StrangeNative/status/1579848925993136128

Russ Maschmeyer (@StrangeNative)

Products with simple forms and organic lines reproduce exceptionally well. We took this trainer on adventures across unfamiliar terrain. 👟🏔 [2/10]

viral jay Oct 22, 2022, 1:06 AM

#

tribal rapids @viral jay talking about product objects did you see this by the way?...

that looks damn cool

gilded crater Oct 22, 2022, 3:15 AM

#

I got it from huggingface. Google search reddit and huggingface

maiden grail Oct 22, 2022, 3:27 AM

#

What kind of time averages are people getting for training an embedding? It takes like a half a second per step, and not sure how many steps are recommended.

Should I just reduce the max step field? Or is that going to significantly hurt the model?
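For a rough sense of what the max step field costs in wall-clock time, the half-second-per-step figure above can be multiplied out. A minimal sketch; `training_time_hours` is a hypothetical helper, and the default of 0.5 s/step is just the number quoted in the message, so real timings will vary with GPU and settings.

```python
def training_time_hours(steps: int, seconds_per_step: float = 0.5) -> float:
    """Estimated wall-clock training time in hours for a given step count."""
    return steps * seconds_per_step / 3600


# Compare a few candidate values for the max step field.
for steps in (1000, 5000, 20000):
    print(f"{steps} steps -> {training_time_hours(steps):.2f} h")
```

This says nothing about how many steps are enough for quality; it only shows how the time budget scales linearly with the step count.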

#

And also, trying to track an embedding's "progress". It generates an image every 500 steps? But what even is that image representing? Just an amalgam of all the images so far?

#

Or is it just a random test image?

hot breach Oct 22, 2022, 4:10 AM

#

it tries to generate an image of the last class or caption trained I believe

radiant rose Oct 22, 2022, 4:59 AM

#

gilded crater I got it from huggingface. Google search reddit and huggingface

ok cool thanks, I did a google search first but nothing came up. but ill go and dig at huggingface 👍

gilded crater Oct 22, 2022, 5:03 AM

#

I can find in an hour or so. Gotta feed the kids first.

hot breach Oct 22, 2022, 6:19 AM

#

ripped out regularization from dreambooth as I reform code to just do full fine tuning and seeing quite a drop in vram use, good sign

#

also seems existing codebases (xavier, etc) are using lightning 1.5.9 and there may be improvements from moving to 1.6.0+

frozen forum Oct 22, 2022, 9:33 AM

#

Did an interesting comparison between mixing model files
https://www.reddit.com/r/StableDiffusion/comments/yaiyw6/comparing_results_from_model_mixing_checkpoint/

gilded crater Oct 22, 2022, 10:18 AM

#

radiant rose ok cool thanks, I did a google search first but nothing came up. but ill go and ...

https://huggingface.co/jinofcoolnes/VImodel/tree/main

ivory veldt Oct 22, 2022, 12:48 PM

#

Do people know good resources for custom models? I need to generate 2d dashboards, web UI, app UI and logos. The current 1.5 model is focused on art and photography. There might be models out there that have been trained specifically for 2d designers

ivory veldt Oct 22, 2022, 4:25 PM

#

Btw, the legendary works of Neville Brody, the Designers Republic and Alex Trochut seem to be completely missing from the datasets. All from the 2000s, so internet era.

fast current Oct 22, 2022, 4:30 PM

#

ivory veldt Do people know good resources for custom models? I need to generate 2d dashboard...

Wondering this myself

sand pine Oct 22, 2022, 4:38 PM

#

tribal rapids @viral jay talking about product objects did you see this by the way?...

How do they deal with not owning the image?

hot breach Oct 22, 2022, 5:57 PM

#

up to batch size 6 now on local fine tuning

spare magnet Oct 22, 2022, 7:57 PM

#

I don't have anywhere near the resources required, but I kind of want to leave this link here in case anyone wants to pick it up: https://github.com/hukenovs/hagrid

#

It's a dataset of 500k images of people with their hands making different gestures, like so: https://github.com/hukenovs/hagrid/blob/master/images/example.jpeg?raw=true

frail thunder Oct 22, 2022, 9:27 PM

#

Hello colleagues, could somebody point me to information on how to fine-tune SD to introduce a new category to the model? Not the dreambooth way

frail thunder Oct 22, 2022, 9:27 PM

#

frail thunder Hello colleagues, could somebody point out the information on how to fine-tune S...

I mean vanilla SD, that does text 2 image

icy olive Oct 22, 2022, 9:49 PM

#

fast current Wondering this myself

/r/StableDiffusion had an app-icon-generating model posted. Generally you can use Dreambooth to finetune, so gather up and describe samples of icons, UIs, logos, etc. and feed 'em to the machine.

next nimbus Oct 22, 2022, 10:23 PM

#

hey guys, so I'm following Nerdy Rodent tut on dreambooth, but when I run the last command (./my_training.sh), I get this error, how should I fix it, any idea?

train_dreambooth.py: error: the following arguments are required: --pretrained_model_name_or_path, --instance_data_dir Traceback (most recent call last):

eternal radish Oct 22, 2022, 10:34 PM

#

next nimbus hey guys, so I'm following Nerdy Rodent tut on dreambooth, but when I run the la...

maybe it means you didn't add these arguments to the command line when starting the script

#

the python script just isn't getting them
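The "the following arguments are required" message above is what Python's `argparse` prints when flags marked as required never reach the script. The sketch below reproduces that behaviour with a stand-in parser; it only assumes the two flag names from the error message and is not the actual contents of `train_dreambooth.py`. One common cause in a multi-line shell script is whitespace after a line-continuation backslash, which ends the command early so the later flags are never passed.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Stand-in parser: only the two flags named in the error message above.
    parser = argparse.ArgumentParser(prog="train_dreambooth.py")
    parser.add_argument("--pretrained_model_name_or_path", required=True)
    parser.add_argument("--instance_data_dir", required=True)
    return parser


parser = build_parser()

# Both flags present: parsing succeeds. The values here are placeholders.
args = parser.parse_args([
    "--pretrained_model_name_or_path", "some/model",
    "--instance_data_dir", "./instance_images",
])
print(args.instance_data_dir)

# No flags at all (as if the shell dropped them): argparse prints the
# usage/error text to stderr and exits with status 2.
try:
    parser.parse_args([])
except SystemExit as exc:
    print("argparse exited with status", exc.code)
```

Printing the final command (for example with `echo` or `set -x` in the shell script) is a quick way to confirm which flags actually arrive.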

next nimbus Oct 22, 2022, 10:55 PM

#

eternal radish just python script do not get it

thanks for the help, looks like it was due to an extra space during copy/paste. but I'm stuck again with this new error : /

accelerator.py", line 286, in __init__ raise ValueError(err.format(mode="fp16", requirement="a GPU"))

I own a GTX 1070 so I tried with and without deepspeed, but I get the same error even when I choose CPU.

#

"Unable to proceed, no GPU resources available", not sure why.

#

i just ran the command torch.cuda.is_available() and the result was false : (

edit: nvm, probably need to update my windows to allow cuda pass through.
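The check mentioned above can be wrapped in a small script that also reports whether PyTorch is installed at all. This is a hedged sketch: `cuda_report` is a hypothetical helper name, and it only uses `torch.cuda.is_available()` and `torch.cuda.get_device_name()`.

```python
def cuda_report() -> dict:
    """Summarise whether PyTorch is importable and whether it can see a CUDA GPU."""
    report = {"torch_installed": False, "cuda_available": False, "device_name": None}
    try:
        import torch
    except ImportError:
        # PyTorch is missing from this environment, so CUDA cannot be used here.
        return report

    report["torch_installed"] = True
    report["cuda_available"] = bool(torch.cuda.is_available())
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


print(cuda_report())
```

If `cuda_available` comes back `False` on a machine with an NVIDIA card, the usual suspects are a CPU-only PyTorch build or a driver/passthrough problem, as guessed in the edit above.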

rugged wolf Oct 23, 2022, 12:00 AM

#

dumb question, what's an epoch? some models say they were trained by X number of steps and others talk about epochs

tribal rapids Oct 23, 2022, 12:06 AM

#

"An epoch means training the neural network with all the training data for one cycle."

hot breach Oct 23, 2022, 12:14 AM

#

generally means one look at every training sample, but a lot of the code people are running has repeats so its actually a lot more than 1 look per sample
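To make the steps-versus-epochs relationship concrete, here is a small sketch of the arithmetic. The function names and the `repeats` parameter are illustrative assumptions; trainers expose this under different names, and some count steps differently.

```python
import math


def steps_per_epoch(num_images: int, batch_size: int = 1, repeats: int = 1) -> int:
    """Optimizer steps needed to show every image `repeats` times."""
    return math.ceil(num_images * repeats / batch_size)


def looks_per_image(total_steps: int, num_images: int, batch_size: int = 1) -> float:
    """Average number of times each image is seen over a whole run."""
    return total_steps * batch_size / num_images


# 40 images at batch size 4 with no repeats: one epoch is 10 steps.
print(steps_per_epoch(40, batch_size=4))

# The same 40 images with 100 repeats at batch size 1: one "epoch" is 4000 steps.
print(steps_per_epoch(40, batch_size=1, repeats=100))
```

So a model described as "trained for N steps" and one described as "trained for M epochs" can only be compared once the dataset size, batch size, and repeat count are known.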

mellow dock Oct 23, 2022, 2:11 AM

#

what resolution was novel ai trained on??

raven pecan Oct 23, 2022, 4:01 AM

#

god im so lost

#

all i want is to make my midjourney images cleaner and nicer and enlarge

#

I dont know what Im doing. I managed to figure out how to run the notebooks though lol

#

I tried to research but now i know 100 new things that have nothing to do with what im trying to actually do 🤣

gloomy belfry Oct 23, 2022, 6:17 AM

#

raven pecan all i want is to make my midjourney images cleaner and nicer and enlarge

do yourself a favor, just buy topaz gigapixel or something

gilded crater Oct 23, 2022, 8:23 AM

#

ivory veldt Do people know good resources for custom models? I need to generate 2d dashboard...

the current model is completely capable of generating all those things. i was able to gen game icons and logos with 1.4

ivory veldt Oct 23, 2022, 9:42 AM

#

icy olive /r/StableDiffusion had an app-icon-generating model posted. Generally you can us...

https://www.reddit.com/r/StableDiffusion/comments/y5rfff/app_icon_generator_v1_people_liked_the_pixel_art/

fallow valley Oct 23, 2022, 10:05 AM

#

spare magnet I don't have anywhere near the resources required, but I kind of want to leave t...

should be able to use a dataset like that to train the improved kl-f8 autoencoder that stabilityai recently released, and then bolt it onto any model you have to reap the benefits

#

unfortunately there is no guide out there as far as I know on how to further finetune the autoencoder.. if anyone knows of one let me know, I am willing to use some of my cloud gpu credits to try and finetune

#

I only know how to use it with the popular gui, it was recently updated to import the weights from the vae file into the model you are running.

woeful goblet Oct 23, 2022, 10:19 AM

#

How am i supposed to make a mask for the mask upload feature in some guis? (using automatic)
If i paint in the area i want to redo in black, its treated as a negative mask when generating, i have to select "Inpaint not masked" for it to work
So then it surely makes sense that white would be positive. But a white area does nothing at all, it just generates the whole image

foggy hinge Oct 23, 2022, 10:59 AM

#

Hey what workflow or colab with SD 1.5 are you using to fine-tune with only one to a few base images?

glossy rune Oct 23, 2022, 11:12 AM

#

Same as 1.4

ivory veldt Oct 23, 2022, 1:00 PM

#

finally managed to create a custom model.ckpt with dreambooth colab. The difference is astonishing. See the default SD 1.5 model image vs the newly trained model. I trained the model on about 10 images of Hide the Pain Harold. https://huggingface.co/bencser/harold

00453-1519822628-pencil_drawing_of_hide_the_pain_harold.png

next nimbus Oct 23, 2022, 1:28 PM

#

Do you guys know if it's possible to train/dreambooth a ckpt file directly instead of diffusers, locally? If so, which guide should I follow?

rugged wolf Oct 23, 2022, 1:33 PM

#

hot breach generally means one look at every training sample, but a lot of the code people ...

thanks that explained it

rugged wolf Oct 23, 2022, 1:35 PM

#

next nimbus Do you guys know if its possible to train/dreambooth ckpt file instead of diffus...

the webui has a very simple interface which I assume "just works" as long as you have enough VRAM

next nimbus Oct 23, 2022, 1:36 PM

#

rugged wolf the webui has a very simple interface which I assume "just works" as long as you...

can you share a link to the one you are referring to? does it train the ckpt file directly?

rugged wolf Oct 23, 2022, 1:37 PM

#

next nimbus can you share for me a link to the one you are referring? does it train ckpt fil...

https://github.com/AUTOMATIC1111/stable-diffusion-webui/

#

it looks fairly featured

next nimbus Oct 23, 2022, 1:39 PM

#

rugged wolf it looks fairly featured

Thanks, but that's different, dreambooth works quite differently and gives better results.

rugged wolf Oct 23, 2022, 1:40 PM

#

oh I didn't know, so dreambooth is not textual inversion?

ivory veldt Oct 23, 2022, 1:40 PM

#

next nimbus Do you guys know if its possible to train/dreambooth ckpt file instead of diffus...

this colab has a "convert weights to ckpt" option https://colab.research.google.com/github/ShivamShrirao/diffusers/blob/main/examples/dreambooth/DreamBooth_Stable_Diffusion.ipynb

next nimbus Oct 23, 2022, 1:41 PM

#

rugged wolf oh I didn't know, so dreambooth is not textual inversion?

nope.

next nimbus Oct 23, 2022, 1:41 PM

#

ivory veldt this colab has a "convert weights to ckpt" option https://colab.research.google....

thanks, but I wonder if there is one that trains the ckpt file directly.

"This code repository is based on that of Textual Inversion. Note that Textual Inversion only optimizes word embedding, while dreambooth fine-tunes the whole diffusion model. "

#

I think I found it.
https://github.com/XavierXiao/Dreambooth-Stable-Diffusion

ivory veldt Oct 23, 2022, 1:44 PM

#

nice

next nimbus Oct 23, 2022, 2:08 PM

#

tut https://www.youtube.com/watch?v=xSkyLuRnt4g&list=PLaOL9kx0jIPDZTr8AG5HzMleK-JOwWdpp&index=12&t=1s

YouTube

TingTingin

DREAMBOOTH! Stable Diffusion Add Yourself or Anything To Stable Dif...

Commands

18:02 git clone https://github.com/XavierXiao/Dreambooth-Stable-Diffusion.git
18:18 cd Dreambooth-Stable-Diffusion
18:34 conda env create -f environment.yaml
19:08 wget --http-user=USERNAME --http-password=PASSWORD https://huggingface.co/CompVis/stable-diffusion-v-1-4-original/resolve/main/sd-v1-4-full-ema.ckpt
20:43 import zipfile as...

foggy hinge Oct 23, 2022, 2:23 PM

#

What to do if we only have one picture?

stone garden Oct 23, 2022, 3:26 PM

#

Hello I wanna do some dreambooth training

#

what is the best repo for doing that as of today? I have a GPU with 23GB VRAM

#

tried to do it with a repo about 1 or 2 weeks ago but I was getting errors because of lack of VRAM

glossy rune Oct 23, 2022, 3:34 PM

#

foggy hinge What to do if we only have one picture?

You can try augmentation, but with little/bad raw data you can only ever go so far… see here for example http://d2l.ai/chapter_computer-vision/image-augmentation.html

#

But consider that for Image Generation you are much less flexible in augmentations e.g. of colors vs for the usual classification that most augmentation guides will probably aim at
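As a minimal illustration of a geometry-only augmentation (one that leaves colors untouched), here is a horizontal flip on an image represented as plain nested lists. This is only a sketch in pure Python: `hflip` and `augment_with_flips` are hypothetical helper names, a real pipeline would use an image library, and flips are a poor fit for subjects with text, logos, or other asymmetric features.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]  # rows of RGB pixels


def hflip(image: Image) -> Image:
    """Mirror the image left-to-right without changing any pixel values."""
    return [list(reversed(row)) for row in image]


def augment_with_flips(images: List[Image]) -> List[Image]:
    """Return each original image followed by its mirrored copy."""
    augmented: List[Image] = []
    for img in images:
        augmented.append(img)
        augmented.append(hflip(img))
    return augmented


tiny = [[(255, 0, 0), (0, 255, 0)],
        [(0, 0, 255), (9, 9, 9)]]
print(len(augment_with_flips([tiny])))
```

Even with flips, a single source picture still yields only two closely related training images, which is the limitation described above.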

foggy hinge Oct 23, 2022, 3:39 PM

#

Hmm right.

#

Has anyone tried Stable Diffusion fine-tuning with WebAssembly (with Rust bindings or just Python) so that the client runs the training part?

unborn fulcrum Oct 23, 2022, 3:46 PM

#

greetings, does anyone know about this error when training HN?

AssertionError: no gradient found for the trained weight after backward() for 10 steps in a row; this is a bug; training cannot continue

hot breach Oct 23, 2022, 3:49 PM

#

next nimbus I think I found it. https://github.com/XavierXiao/Dreambooth-Stable-Diffusion

yes this is the original dreambooth repo, several forks out there now but this unfreezes the unet and vae and requires 24gb

#

original for SD at least, there are also diffusers versions, some* diffusers don't unfreeze everything though, there's a trade off on VRAM use and how much of the model is unfrozen, and results are different if they don't unfreeze everything

next nimbus Oct 23, 2022, 3:53 PM

#

hot breach original for SD at least, there are also diffusers versions, some* diffusers don...

so diffuser gives a better result? or it's not related?

hot breach Oct 23, 2022, 3:53 PM

#

some of the early diffusers ones were only unfreezing unet I think, and it doesn't work as well, but I haven't messed with diffusers myself so take that with a grain of salt

#

if diffusers unfreezes the same parts of the model I imagine they're close either way, but all the 8gb/10gb/16gb stuff I believe is simply not unfreezing the entire model and that probably leads to worse results, but there's also the concept of "enough" to get your project done

#

there are differences in optimizers in diffusers and such, too

#

people get either to "work" but most people are also doing fairly small scale projects, like 10-50 images of their dog, or their own face, etc

#

my take is if you have higher ambitions, unfreezing the entire (latent diffusion*)model, caption training, etc is far more capable and moves you away from TI/dreambooth towards more general fine tuning, depends on your goals, your hardware capabilities or what you want to rent, etc

next nimbus Oct 23, 2022, 3:59 PM

#

I see, thanks for the info.
we got a lot of stuff out there in a very short time, I can't keep up with it all.

I want to do a test similar to the pokemon model out there, but instead of generating pokemon I would like to use naruto anime, so I figured I'd train the model with a bunch of naruto pictures. you think this can do that?

#

https://colab.research.google.com/github/TheLastBen/fast-stable-diffusion/blob/main/fast-DreamBooth.ipynb

#

it's TheLastBen's dreambooth repo.

hot breach Oct 23, 2022, 4:01 PM

#

don't know, I think most people have to work on their projects a bit to get them to work and gain experience, try different things

#

anime might be harder than more photo real type characters from what I've seen of other people's experiences, but I don't do anime stuff myself, nor do I use diffusers

#

I'm trying to update the lightning trainer (local/xavier/compvis based) to see what can be done about getting it more up to speed, it's still running on some old libraries

cobalt sorrel Oct 23, 2022, 6:13 PM

#

Hypernetworks don't work with 1.5 yet?
I tried to train a hypernetwork with the 1.5 model but I get an error message.

tribal rapids Oct 23, 2022, 7:31 PM

#

could someone summarize the clear symptoms of overfitting please? also how CFG would be affected (e.g. it'd have to go lower and lower etc)

hot breach Oct 23, 2022, 7:36 PM

#

at least sometimes, faces look sunburnt or high contrast at standard cfg

tribal rapids Oct 23, 2022, 7:36 PM

#

for instance i trained 36 images 3600 steps with 1500 regs, then another 900 steps to make it 4500. i'm pretty sure at 4500 the results are starting to look both less like the subject (a bad thing), but also just look less like the training images (eg more variety in the outputs so a good thing). i'd expected if i overfitted then it would all look like my training images with not much variety? i need to compare exact sampler parameters between the 2 models though

#

what are we calling standard CFG? 7?

hot breach Oct 23, 2022, 7:36 PM

#

yeah close enough

#

7.5 was default on compvis launch, auto defaults to 7 i think, either way close enough

tribal rapids Oct 23, 2022, 7:38 PM

#

what would you say is a good sampler/steps for just a basic unstyled output eg (photo of jmp909 man) as a baseline test?

sonic bobcat Oct 23, 2022, 7:39 PM

#

normal steps should be fine

tribal rapids Oct 23, 2022, 7:40 PM

#

you mean 20 steps of Euler a?

sonic bobcat Oct 23, 2022, 7:40 PM

#

30~ i don't use euler a

#

the big difference is between dpm2 a / euler a and the rest

tribal rapids Oct 23, 2022, 7:41 PM

#

euler a is the default selected in a1111, that's why i was wondering what sampler is a good baseline default for unstyled output

sonic bobcat Oct 23, 2022, 7:42 PM

#

you can make it render more than 1 sampler

tribal rapids Oct 23, 2022, 7:43 PM

#

sure i know you can do that in extras, i was just wondering opinion on sampler/steps for outputting a baseline unstyled result to compare 2 sets of training on a photo likeness

#

i know there are so many variables to it all that it's tricky to generalize

#

like at 3600 steps i could get a good likeness with DPM adaptive, 20 steps, cfg 8..... now at 4500 steps if i don't lower the CFG to about 3.5 there, i get somebody else's face that has similar features but does not look like the subject

#

with photo of (jmp909 man) wearing yellow hat .. obviously the emphasis has some effect there on both tho

#

actually i think with 4500 i can take the emphasis out and get a better result. what i cant work out is how far to take the training to actually improve things

#

being able to take the emphasis out for the same CFG suggests the model is doing better?

sonic bobcat Oct 23, 2022, 7:59 PM

#

not necessarily ? try other steps

tribal rapids Oct 23, 2022, 8:02 PM

#

yeah doing an X/Y on CFG vs steps now thanks
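An X/Y comparison like this is just the cross product of the two parameter lists. The sketch below only enumerates the combinations; `xy_grid` is a hypothetical helper, the dictionary keys are illustrative names rather than any UI's actual field names, and the values are the CFG scales and step counts mentioned in the conversation.

```python
from itertools import product
from typing import Dict, List, Sequence, Union

Number = Union[int, float]


def xy_grid(cfg_scales: Sequence[Number], step_counts: Sequence[int]) -> List[Dict[str, Number]]:
    """Every (CFG scale, steps) combination, row by row, for a fixed prompt and seed."""
    return [{"cfg_scale": cfg, "steps": steps}
            for cfg, steps in product(cfg_scales, step_counts)]


grid = xy_grid(cfg_scales=[3.5, 7, 8], step_counts=[20, 30])
for cell in grid:
    print(cell)
```

Keeping the prompt, seed, and sampler fixed across the grid, and running the same grid against both checkpoints, is what makes the cells comparable.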

last delta Oct 23, 2022, 8:52 PM

#

Hi guys, I want to train DreamBooth for a style. I did it a few times for faces, no problem.
Is it the same process for styles?

Setting it up with prior preservation, what is the class of a style?
For example if I want to train on watercolor images, what would the class be, and what kind of images do I have to supply? just random other styles for comparison?

icy olive Oct 23, 2022, 8:56 PM

#

The regularization images would be whatever images the model would generate previously with the class keyword:

Prompt: artstyle -> your regularization images

stone garden Oct 23, 2022, 8:57 PM

#

oops sorry, sent wrong place

last delta Oct 23, 2022, 9:03 PM

#

icy olive The regularization images would be whatever images the model would generate prev...

ok thanks, I think I was too general, I thought my class was "style" but I guess that is stupid

restive ridge Oct 23, 2022, 9:10 PM

#

last delta ok thanks I think I was to general, I though my class is "style" but I guess tha...

I tried "style" for a specific artist, results did not come out well. But, maybe I configured it wrong, it was on v1.4 too, v1.5 might like "style" better. When I switched to "painting", I got better (but not amazing) results. Seems like DreamBooth is not the best for style. Might want to also try image embedding for style as well, those are supposed to work well for style.

hot breach Oct 23, 2022, 9:17 PM

#

I imagine it should work if you use captioned training and describe the art such as "a wolf standing in a forest by Biff Artistman" or whatever for each image, haven't messed too much with outright styles though

last delta Oct 23, 2022, 9:24 PM

#

hot breach I imagine it should work if you use captioned training and describe the art such...

oh interesting, most of the images I used are captioned, how do I do captioned training? Haven't seen the option in the dreambooth colab I am using

hot breach Oct 23, 2022, 9:24 PM

#

mrwho added it to joepenna repo

#

there's some setup to name the files and organize them into subfolders

#

works in kanewallman's repo just based on filename but the notebook is probably not maintained

past parrot Oct 23, 2022, 9:30 PM

#

hey now

icy olive Oct 23, 2022, 10:30 PM

#

restive ridge I tried "style" for a specific artist, results did not come out well. But, maybe...

Works fine for other people: https://old.reddit.com/r/StableDiffusion/comments/yaquby/2d_illustration_styles_are_scarce_on_stable/

stray kindle Oct 23, 2022, 11:06 PM

#

Yo, looking for advice. If I wanted to train this shit on different species of cavemen, what would be the best option for that? I assume I'd have to do one at a time, yes?

#

Well, more importantly, would it have to be a particular individual, or would different ones of the same category work well together?

icy olive Oct 24, 2022, 12:15 AM

#

Ok, it's time to step up.

I know dreambooth can be used to train one particular style or subject in, but how would I train the model in general -- as in pick up training of SD 1.5 with my own dataset of various different things?

hot breach Oct 24, 2022, 12:22 AM

#

you can caption all your images and train as much stuff as you want

#

you'd have to check the diffusers stuff though on 16GB, the lightning/xavier repos take 24gb because they unfreeze the entire model, or rent runpod/vast or colab pro I guess

icy olive Oct 24, 2022, 12:25 AM

#

I'm already renting on vast

icy olive Oct 24, 2022, 12:28 AM

#

hot breach you'd have to check the diffusers stuff though on 16GB, the lightning/xavier rep...

How do I train multiple things with JoePenna's repo, or should I check another one?

hot breach Oct 24, 2022, 12:37 AM

#

there's some stuff with how you put things in subfolders, unfortunately they hadn't put any documentation on it last I checked

#

there are other options locally, or if you don't mind doing everything from the CLI, but I guess everyone doing remote runtimes is likely using a notebook

#

technically all you need is a terminal on the runtime even on a runpod or vast but you need to be kinda familiar with linux command prompt instead of just clicking the play button on a notebook...

#

im working on a general fine tuning trainer, ill see if I can make a notebook for it at some point...

icy olive Oct 24, 2022, 12:44 AM

#

hot breach there are other options local or if you dont might doing everything from the CLI...

I don't mind the CLI. I use Arch btw.

hot breach Oct 24, 2022, 12:45 AM

#

do you know how to push your own files into the rte and move them around in folders? i.e. your training files

icy olive Oct 24, 2022, 12:46 AM

#

yes

#

already did that for the previous model I trained

hot breach Oct 24, 2022, 12:48 AM

#

https://github.com/victorchall/everydream-trainer you can try mine, haven't run it on a remote rte but should work just fine, or you can use kane's https://github.com/kanewallmann/Dreambooth-Stable-Diffusion they're fairly similar at this point

#

readme on both should explain how to organize files, mine doesn't use regularization per se, but you can put regularization images in the training folder if you want anyway

#

how big is your training set?

#

tbh these work better with larger sets, I used kane's with 600, 900, and 1400 and then forked mine, the last one I did was with 1600

icy olive Oct 24, 2022, 12:51 AM

#

I have around 400 images (half of which I'm still tagging)

hot breach Oct 24, 2022, 12:51 AM

#

ah thats probably good enough then

#

im working on some stuff to autoclip tag stuff

icy olive Oct 24, 2022, 12:52 AM

#

I'm gonna make a tool to help me tag stuff

hot breach Oct 24, 2022, 12:52 AM

#

both mine and kane wallmann's use the same naming convention of "your caption goes here_n.ext"
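A sketch of how a caption could be recovered from that "your caption goes here_n.ext" naming convention. It assumes `n` is a plain integer index after the last underscore; `caption_from_filename` is a hypothetical helper and is not taken from either repo, so check the READMEs for the exact rules.

```python
from pathlib import Path


def caption_from_filename(filename: str) -> str:
    """Strip the extension and a trailing "_<number>" index, leaving the caption."""
    stem = Path(filename).stem
    head, sep, tail = stem.rpartition("_")
    if sep and tail.isdigit():
        return head
    # No numeric suffix: treat the whole stem as the caption.
    return stem


print(caption_from_filename("a man wearing logitech headset_3.jpg"))
```

The numeric suffix exists so that several images can share the same caption without their filenames colliding.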

icy olive Oct 24, 2022, 12:52 AM

#

"Show image, input tags in text box, click next"

hot breach Oct 24, 2022, 12:53 AM

#

yeah you can do it in automatic but its not batch, you can probably script calling interrogate.py but I want to build something in myself

icy olive Oct 24, 2022, 12:53 AM

#

tbh, for datasets less than 1000, I think fully manual tagging is king

#

getting it exactly right seems really important when you have fewer images to train on

hot breach Oct 24, 2022, 12:54 AM

#

you can batch name them with clip then just replace "a man" with "john cena" or whatever
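The replace step described above can be sketched as a whole-word substitution over auto-generated captions. `personalize_caption` is a hypothetical helper; the word-boundary match is a design choice made here so that "a man" does not also hit strings like "a mango".

```python
import re


def personalize_caption(caption: str, generic: str = "a man", name: str = "john cena") -> str:
    """Swap a generic subject phrase for a specific name, matching whole words only."""
    pattern = r"\b" + re.escape(generic) + r"\b"
    # A function replacement keeps any backslashes in `name` literal.
    return re.sub(pattern, lambda _match: name, caption)


print(personalize_caption("a man standing in a forest"))
print(personalize_caption("a mango on a table"))
```

The surrounding context from the generated caption ("standing in a forest") is kept, which is the part reported above as helping results.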

#

yeah my results have improved greatly, first 4 character one I did was just all "name of character_n.ext" then I ran them through clip to get the surrounding context and it helped quite a bit

#

slowly building a training set with a mix of other ground truth data...

restive ridge Oct 24, 2022, 1:26 AM

#

icy olive Works fine for other people: https://old.reddit.com/r/StableDiffusion/comments/y...

thanks, probably something wrong with my setup. I do see they used "artstyle" instead of "style", I'll have to try that one

Class word was artstyle. Token was holliemengert. Prompt for instance would be holliemengert artstyle

#

Any opinions on which DreamBooth implementation works the best?

JoePenna - https://github.com/JoePenna/Dreambooth-Stable-Diffusion
XavierXiao - https://github.com/XavierXiao/Dreambooth-Stable-Diffusion
ShivamShrirao - https://github.com/ShivamShrirao/diffusers

icy olive Oct 24, 2022, 1:34 AM

#

I just trained a model with JoePenna's implementation

#

It seems to work pretty well

gloomy belfry Oct 24, 2022, 1:35 AM

#

Shivam is good as well

next nimbus Oct 24, 2022, 2:01 AM

#

What makes the ckpt 2GB? I mean if we remove the full float and unnecessary data?
How can we make sure to add to the ckpt instead of overwriting it?
Or are the answers to these questions still not out there?
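On the size question, a back-of-the-envelope estimate is parameter count times bytes per parameter. The sketch below assumes roughly 1.07 billion parameters for an SD v1.x checkpoint (UNet, text encoder and VAE combined); that figure is an approximation used for illustration, not something stated in this chat, and `checkpoint_size_gb` is a hypothetical helper.

```python
def checkpoint_size_gb(num_params: int, bytes_per_param: int) -> float:
    """Approximate size of the raw weights in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Assumption: approximate total parameter count for SD v1.x.
APPROX_SD_V1_PARAMS = 1_070_000_000

print(f"fp32: {checkpoint_size_gb(APPROX_SD_V1_PARAMS, 4):.2f} GB")
print(f"fp16: {checkpoint_size_gb(APPROX_SD_V1_PARAMS, 2):.2f} GB")
```

Under that assumption, half-precision weights alone land near the 2GB mark, while larger files typically carry full-precision weights plus extras such as EMA copies or optimizer state.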

fast current Oct 24, 2022, 6:21 AM

#

Only just reading up on Aesthetic Gradients. Good explanation: https://metaphysic.ai/custom-styles-in-stable-diffusion-without-retraining-or-high-computing-resources/

Metaphysic.ai

Martin Anderson

Custom Styles in Stable Diffusion, Without Retraining or High Compu...

A researcher from Spain has developed a new method for users to generate their own styles in Stable Diffusion (or any other latent diffusion model that is publicly accessible) without fine-tuning the trained model or needing to gain access to exorbitant computing resources, as is currently the case with Google's DreamBooth and with Textual Inver...

#

Definitely gonna be the new way

lilac helm Oct 24, 2022, 7:02 AM

#

fast current Only just reading up on Aesthetics Gradients. Good explanation: https://metaphys...

That's a really good read!

fast current Oct 24, 2022, 7:04 AM

#

Mmm, makes me wanna make an embedding for Attack on Titan scenery

stone garden Oct 24, 2022, 8:10 AM

#

where can I find thousands of regularization images of woman

ivory veldt Oct 24, 2022, 9:56 AM

#

using custom models = a new era for humanity,

viral jay Oct 24, 2022, 10:00 AM

#

man dreambooth is so much better compared to TI and HN, shame I'm not able to run it locally

#

dreambooth with 1200 steps, it's a fiat uno, the damn thing has learned it quite well

#

this is TI after 5k steps

viral jay Oct 24, 2022, 10:17 AM

#

what I'm finding cool about it is that I'm still able to edit the results. For example, the first image has the car on snow, even though none of the pictures it was trained on have snow. The bottom one is also supposed to be on snow, but of the 10 images I generated only one had anything leaning toward snow, and it wasn't correct at all

#

any tips on how to achieve the same with TI or HN?

tough gazelle Oct 24, 2022, 10:27 AM

#

fast current Only just reading up on Aesthetics Gradients. Good explanation: https://metaphys...

Would be interested to know if anyone's figured out how to get this to consistently work. I had a play with it last night and I can't figure out the relationship between the training images and the results.
You also get wildly different results from just changing the gradient steps by 1, or even changing the learning rate by 0.0001

fast current Oct 24, 2022, 10:28 AM

#

Yeah it sounds tricky for sure. A lot of it probably comes down to consistency in the training images at a guess

tough gazelle Oct 24, 2022, 11:03 AM

#

fast current Yeah it sounds tricky for sure. A lot of it probably comes down to consistency i...

I can sometimes see hints of the images I've put in, but other times it just does completely unrelated things.

fast current Oct 24, 2022, 11:05 AM

#

I think thats sort of the idea though? From what im reading it's more of a method of nudging art style in the right direction, rather than adding specific elements

tough gazelle Oct 24, 2022, 11:05 AM

#

Or you get the opposite. I trained one on vapourwave style images and all it took from them was the colours and it turned everything into a purple/yellow blob

tough gazelle Oct 24, 2022, 11:07 AM

#

fast current I think thats sort of the idea though? From what im reading it's more of a metho...

Yeah it sometimes does stuff that looks kinda related to the images. But it's no different to what you'd get by just using prompts.

I think it needs to work with prompt engineering. So you get something close to what you want and then that can maybe just push it over the edge to exactly what you want.

fast current Oct 24, 2022, 11:08 AM

#

Yeah that sounds about right. Time shall tell more i suppose

steel ocean Oct 24, 2022, 12:49 PM

#

hey guys , I am using this repo of optimised dreambooth https://github.com/gammagec/Dreambooth-SD-optimized
the problem is my training always stops after around 19 minutes, like there is a timer or something... how do I edit the code to make it run until it finishes a certain number of steps?

GitHub

GitHub - gammagec/Dreambooth-SD-optimized: Implementation of Dreamb...

Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion - GitHub - gammagec/Dreambooth-SD-optimized: Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) wi...

trail flower Oct 24, 2022, 12:52 PM

#

Has anyone merged with the inpainting model? I'm wondering if it works and retains its inpainting abilities with the trained elements of a secondary model

tough gazelle Oct 24, 2022, 1:10 PM

#

steel ocean hey guys , I am using this repo of optimised dreambooth https://github.com/gamma...

Are you changing the maximum steps setting?

tough gazelle Oct 24, 2022, 1:10 PM

#

trail flower Has anyone merged with the inpainting model? I'm wondering if it works and retai...

Don't believe you can. It has extra layers compared with normal models.

steel ocean Oct 24, 2022, 1:11 PM

#

tough gazelle Are you changing the maximum steps setting?

where can I find that variable

tough gazelle Oct 24, 2022, 1:13 PM

#

Not at my computer so can't 100% remember but I believe it's in the bottom of this file

configs/stable-diffusion/v1-finetune_unfrozen.yaml

steel ocean Oct 24, 2022, 1:18 PM

#

tough gazelle Not at my computer so can't 100% remember but I believe it's in the bottom of th...

this ?

#

does the algorithm choose how many steps it needs?

#

like did the algorithm choose 2264 steps, or is there a variable?

tough gazelle Oct 24, 2022, 1:28 PM

#

Ignore that. It's based on the amount of training data.

#

It will stop at whatever step count you set.

tough gazelle Oct 24, 2022, 1:28 PM

#

steel ocean this ?

And yes, the max steps

steel ocean Oct 24, 2022, 1:28 PM

#

I made 8000

tough gazelle Oct 24, 2022, 1:28 PM

#

I found the best for speed / decent accuracy was between 4-6k but your mileage may vary.

#

8k should be fine too

#

You don't usually start overtraining until a decent bit over

#

Keep an eye on the renders it makes to track how well it's doing.

steel ocean Oct 24, 2022, 1:30 PM

#

so it will go beyond 2k since I set the max to 8k

steel ocean Oct 24, 2022, 1:31 PM

#

tough gazelle Keep an eye on the renders it makes to track how well it's doing.

can I train a model, then generate 2k pictures with it, and then choose 200 of the generated pictures as regularisation?

tough gazelle Oct 24, 2022, 1:33 PM

#

Yeah it will move to the next epoch

tough gazelle Oct 24, 2022, 1:33 PM

#

steel ocean can I traing model then put generate 2k pictures and then choose 200 pictures fr...

When you resume training like that it's almost like merging models. For best results you need to do it all in 1 go.

steel ocean Oct 24, 2022, 1:35 PM

#

tough gazelle When you resume training like that it's almost like merging models. For best res...

can you please explain , what do you mean by doing it all in 1 go

tough gazelle Oct 24, 2022, 1:36 PM

#

You made it sound like you were planning on stopping, changing the data, and resuming.

#

You can't do that and have good results in my experience. You need all your training and regularisation data set from the beginning. And then set your steps and let it complete the whole process.

steel ocean Oct 24, 2022, 1:38 PM

#

ahh, can't I do something to pick good pictures for the AI?

tough gazelle Oct 24, 2022, 1:43 PM

#

Well you need to pick your own training images for what to train. For regularisation images what I did was used stable diffusion and created a few hundred images using my chosen class name as the prompt.

steel ocean Oct 24, 2022, 1:46 PM

#

that's exactly what I did

ivory veldt Oct 24, 2022, 3:07 PM

#

where do people download image datasets that are great for training? For example I need open source images of everyday women and men, same folks, with different settings, emotions, clothes etc. There are a lot to be found by celebrities, but not from open sourced people.

icy olive Oct 24, 2022, 3:21 PM

#

LAION, or have the model generate them

#

The latter is mostly acceptable, since the point of the images is to make sure the model can still generate the same stuff it could before finetuning

storm linden Oct 24, 2022, 3:43 PM

#

If I want to train text encoders with Dreambooth, do I need to change any .json files before running the training script? I'm using gallery-dl to get my images and I have the metadata, I just don't know if or where I should be putting the file

storm linden Oct 24, 2022, 3:49 PM

#

ivory veldt where do people download image datasets that are great for training? For example...

Training images are best generated by your model. If you’re training on an anime-based model but have regularization images with real people, you’re training the AI on both your reg images and the instance prompt. Just use your model “a picture of xxx” and make a bunch of images

icy olive Oct 24, 2022, 3:59 PM

#

Update: now using ShivamShrirao's repo locally; even faster than JoePenna's repo on the cloud GPU

hot breach Oct 24, 2022, 4:15 PM

#

steel ocean this ?

keep in mind there may be several flags strewn about that limit training, might be settings in the yaml and another setting on the CLI args, make sure to double check, tbh hard to keep track of all the repos args but some have multiple limits

#

might be useful to just delete or set the default on cli args to 99999 and just use the yaml to keep it all in one place if you find other limits so you're not chasing your tail constantly
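
To make the "multiple limits" point concrete: assuming a repo stops at whichever configured limit is reached first (a yaml max-steps value, a CLI default, and so on), the effective limit is simply the smallest one that is set. The function name and the numbers below are hypothetical and only illustrate that reasoning.

```python
def effective_max_steps(*limits):
    """Return the step count at which training would actually stop.

    Each argument is one configured limit (yaml value, CLI flag, ...);
    None means that particular limit is not set.
    """
    active = [limit for limit in limits if limit is not None]
    return min(active) if active else None


# Hypothetical case: the yaml says 8000 but a CLI default still says 2000.
print(effective_max_steps(8000, 2000))
# Raising the CLI default to 99999 leaves the yaml value in charge.
print(effective_max_steps(8000, 99999))
```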

upper prism Oct 24, 2022, 5:00 PM

#

icy olive Update: now using ShivamShirao's repo locally; even faster than JoePenna's repo ...

where does the speed increase come from? I updated but didnt see a difference yet

hot breach Oct 24, 2022, 5:04 PM

#

from my understanding some of the diffusers stuff isn't unfreezing the entire latent diffusion model like the xavierxiao based repos (joe/kane/gamma) do, so be wary of comparisons because it may not be apples to apples

hot breach Oct 24, 2022, 5:07 PM

#

icy olive The latter is mostly acceptable, since the point of the images is to make sure t...

fwiw my last model only used laion data and turned out well, better than previous model that used SD-generated regularization images IMO, but I'm also training much larger/longer than others

frail thunder Oct 24, 2022, 5:42 PM

#

next nimbus I see, thanks for the info. we got alot of stuff out there in a very short time...

I would propose not using dreambooth in your case, but instead the finetuning approach that was used for the pokemon model directly. caption your naruto images with BLIP then train on these text-image pairs

next nimbus Oct 24, 2022, 5:55 PM

#

frail thunder I would propose not to use dreambooth in your case but use approach of finetunin...

correct me if I'm wrong, so finetune uses ckpt but dreambooth will ask for diffuser right?

#

And when would it be a better choice to finetune, and when would it be best to train using dreambooth?

hybrid pilot Oct 24, 2022, 5:59 PM

#

next nimbus correct me if I'm wrong, so finetune uses ckpt but dreambooth will ask for diffu...

fine tuning and dreambooth are similar. dreambooth keeps the text to image portion frozen, whereas fine tuning will train all of them together.

next nimbus Oct 24, 2022, 6:01 PM

#

hybrid pilot fine tuning and dreambooth are similar. dreambooth keeps the text to image port...

I see, thanks!

#

@hybrid pilot If I want to train the model with a new art style, what would be the best choice?

hybrid pilot Oct 24, 2022, 6:09 PM

#

depends really on how you want to use it. I'm still sort of figuring things out, but I will say that dreambooth does great at adding a single specific thing or a very apparent art style. fine tuning is the "best" way, but has the drawback that you really need to know what you're doing to not mess up weights.

I would say textual inversion actually. the downside being it will take up some context space, but for a quick style, it does the job really well

#

or the new aesthetic diffusion thing I've seen mentioned. that looks like it has room to really set a style

#

hypernetworks are too heavy for just a single style

#

disclaimer: I have no idea what I'm doing and this is purely just things I've read/observed, I could be 100% wrong

icy olive Oct 24, 2022, 6:45 PM

#

I've used all 3 -- dreambooth gives the best results
for styles, TI is pretty good most of the time
Hypernetworks are hit and miss

icy olive Oct 24, 2022, 6:58 PM

#

hybrid pilot fine tuning and dreambooth are similar. dreambooth keeps the text to image port...

that's not quite true anymore; for example, ShivamShrirao's version allows you to train the text encoder at the same time

next nimbus Oct 24, 2022, 7:08 PM

#

icy olive I've used all 3 -- dreambooth gives the best results for styles, TI is pretty go...

we can do TI using auto out of the box? how much vram it would need?

icy olive Oct 24, 2022, 7:08 PM

#

next nimbus we can do TI using auto out of the box? how much vram it would need?

automatic1111's web UI includes it in the "Train" tab. I think it works with 8 GB VRAM there.

next nimbus Oct 24, 2022, 7:09 PM

#

oh, thats cool.

#

man I wish I had these tools when I was a kid.

hybrid pilot Oct 24, 2022, 7:29 PM

#

If this was around when I was a kid, I'd be a data scientist instead of a sysadmin lol

next nimbus Oct 24, 2022, 7:56 PM

#

I just finished training and testing my first Dreambooth using TheLastBen, with only 800 steps and 24 pictures, I feel like the source images I feed the model leaked all over the ckpt, or am I hallucinating?

upper prism Oct 24, 2022, 8:16 PM

#

next nimbus I just finished training and testing my first Dreambooth using TheLastBen, with ...

that can happen when not using regularization images and the prior preservation loss flag

next nimbus Oct 24, 2022, 8:20 PM

#

upper prism that can happen when not using regularization images and the prior preservation ...

"With_Prior_Preservation" is checked and the Dreambooth says it will auto generate 200 pictures for the class.

upper prism Oct 24, 2022, 8:21 PM

#

next nimbus "With_Prior_Preservation" is checked and the Dreambooth says it will auto genera...

that's strange then, since that should prevent just that.

#

can you see the 200 created images in the folder structure of the colab?

next nimbus Oct 24, 2022, 8:22 PM

#

upper prism can you see the 200 created images in the folder structure of the colab?

yes.

tawny inlet Oct 24, 2022, 8:23 PM

#

Hey guys, can someone please tell me what's the best way to Fine Tune Stable diffusion for Set of Characters from Anime?

upper prism Oct 24, 2022, 8:25 PM

#

next nimbus yes.

I'm lost then, sorry
It's not happening with the other repos like that and I dont know if its missing a configuration or if it doesn't do it correctly

covert gazelle Oct 24, 2022, 8:26 PM

#

Has anyone converted this into a ckpt yet?
https://huggingface.co/sd-dreambooth-library/homelander

icy olive Oct 24, 2022, 8:38 PM

#

I'm not sure, but it's simple to do it yourself if you have python and pytorch installed

hot breach Oct 24, 2022, 8:41 PM

#

tawny inlet Hey guys, can someone please tell me what's the best way to Fine Tune Stable dif...

if you want a set of characters all in one model you'll need a caption-enabled repo so you can name or otherwise somehow mark your individual training images for each character

covert gazelle Oct 24, 2022, 8:45 PM

#

icy olive I'm not sure, but it's simple to do it yourself if you have python and pytorch i...

uh.... How much Vram? 😅

icy olive Oct 24, 2022, 8:46 PM

#

Converting models doesn't require any VRAM. It occurs on the CPU.

hot breach Oct 24, 2022, 8:46 PM

#

I don't think ckpt convertors touch the gpu at all?

#

ckpt files are just zip archives (that's what PyTorch's save format produces), you can even open them in 7zip and look around
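
Checkpoints written by `torch.save` in PyTorch 1.6 and later are zip archives, which is why 7-Zip can open them. A small sketch for peeking inside one on the CPU using only the standard library; `list_checkpoint_entries` is a hypothetical helper name, and nothing here loads the weights.

```python
import zipfile


def list_checkpoint_entries(path):
    """Return the member names of a zip-format checkpoint, or None if it is not a zip."""
    if not zipfile.is_zipfile(path):
        return None  # e.g. a file saved with the legacy (pre-1.6) format
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()
```

For a typical `.ckpt` this should list a pickle file plus a set of tensor data entries. Listing names is safe, but actually unpickling a checkpoint from an untrusted source is not.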

covert gazelle Oct 24, 2022, 9:14 PM

#

Surprisingly fast!

#

Or maybe I just didn't do research about the conversion process

#

great! 👍

#

actually, maybe not that great, but I did only make a 2 GB model

#

It's not doing so well when it comes to placing the said person in different locations, probably because it was trained in colab???

raven pecan Oct 24, 2022, 9:35 PM

#

gloomy belfry do yourself a favor, just buy topaz gigapixel or something

Hmmmm 👀 I’ll look

covert gazelle Oct 24, 2022, 9:51 PM

#

but what about SD upscale catlurk

stone garden Oct 24, 2022, 9:55 PM

#

covert gazelle great! 👍

Oh my god

wild totem Oct 24, 2022, 10:19 PM

#

covert gazelle Has anyone converted this into a ckpt yet? https://huggingface.co/sd-dreambooth-...

lol I saw that homelander model, didn't like it, made my own

tribal rapids Oct 24, 2022, 10:39 PM

#

Nice can I ask how you’re training please? Mind you.. he’d be in the model already

#

Also what % of generations are actually decent?

stray kindle Oct 24, 2022, 10:42 PM

#

If I wanted to train a bunch of caveman species, what should I do?

icy olive Oct 25, 2022, 12:53 AM

#

Dataset protip: Install gallery-dl, and scrape dozens of images at a time

#

e.g. gallery-dl -D mydataset --range 1-25 <danbooru search results link>

wild totem Oct 25, 2022, 12:57 AM

#

tribal rapids Nice can I ask how you’re training please? Mind you.. he’d be in the model alrea...

This was the first person I trained, did over 20 models using colab and they were all bad, like 10% success rate on prompting

sterile pivot Oct 25, 2022, 12:58 AM

#

cooolll

wild totem Oct 25, 2022, 12:58 AM

#

but the colab notebooks have improved since then

stray kindle Oct 25, 2022, 1:03 AM

#

No advice?

tribal rapids Oct 25, 2022, 1:03 AM

#

thanks

#

has anybody got a technical explanation why, if I train `photo of jmp90 man`, then `photo of jmp91 man` will also give close results? is it just converting jmp90 to a number internally that jmp91 is also close to, so they essentially cross over? not sure how far apart they need to be, e.g. jmp9, jmp etc

next nimbus Oct 25, 2022, 1:22 AM

#

Some fine-tunes mention "This version uses the new train-text-encoder", can anyone explain this to us? and how can we use it to train/finetune our model?

icy olive Oct 25, 2022, 1:24 AM

#

tribal rapids has anybody got a technical explanation why if i train `photo of jmp90 man` the...

probably has to do with how the prompt is tokenized

#

if it's split as ('jmp', '9', '0', 'man') then ('jmp', '9', '1', 'man') will be similar
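
A toy illustration of that hypothesis. This is not CLIP's real tokenizer (which uses BPE and may split these strings differently); it is a greedy longest-match splitter over a made-up vocabulary, just to show how two rare identifiers can end up sharing almost all of their tokens.

```python
def toy_tokenize(text, vocab):
    """Greedy longest-match split of each word into known pieces (toy, not CLIP's BPE)."""
    tokens = []
    for word in text.split():
        start = 0
        while start < len(word):
            # Try the longest remaining piece first; fall back to a single character.
            for end in range(len(word), start, -1):
                piece = word[start:end]
                if piece in vocab or end == start + 1:
                    tokens.append(piece)
                    start = end
                    break
    return tokens


# Hypothetical vocabulary: 'jmp' and single digits are known, 'jmp90' is not.
vocab = {"photo", "of", "man", "jmp"} | set("0123456789")
a = toy_tokenize("photo of jmp90 man", vocab)
b = toy_tokenize("photo of jmp91 man", vocab)
shared = sum(x == y for x, y in zip(a, b))
print(a)
print(b)
print(f"{shared} of {len(a)} tokens are identical")
```

If the real tokenizer behaves anything like this, the two prompts differ in a single token, which would explain why they land close together.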

tribal rapids Oct 25, 2022, 1:33 AM

#

you think i should just use jmp?

#

sorry jmp man

vale egret Oct 25, 2022, 3:48 AM

#

use 🏃‍♂️

tawny inlet Oct 25, 2022, 10:38 AM

#

hot breach if you want a set of characters all in one model you'll need a caption-enabled r...

what do you mean Caption-enabled repo?

#

and which type of training???

ivory veldt Oct 25, 2022, 10:58 AM

#

FYI: allegedly runpod.io machines running dreambooth create better ckpt, nicer images: https://www.youtube.com/watch?v=mVOfSuUTbSg

YouTube

Aitrepreneur

DREAMBOOTH Free CKPT File With Google Colab BUT Is it Worth it? Com...

Dreambooth, Google's new AI, allows you to train a stable diffusion model with your own images, producing better results than textual inversion. Dreambooth was built on the Imagen text-to-image model, which allows you to insert any character (yourself, friends, family), object, or animal you want into a stable diffusion model with just a few ima...

dire heath Oct 25, 2022, 1:34 PM

#

https://twitter.com/rainisto/status/1584879548529020929?s=46&t=oU03hLY_JikoyUsmeVuZyg posted this

Roope Rainisto (@rainisto)

Dreambooth training for Stable Diffusion is extremely powerful. You can train a new token to the "person" class to create very convincing looking images of them. I've posted some examples in the past days.

But it's not the coolest thing you can do...

ivory veldt Oct 25, 2022, 1:58 PM

#

I was unlucky with google colab today. I even paid $9.00 to get an undisturbed training. 1 out of 6 training processes went through, others just stalled, stopped, "something went wrong" after an hour wait. I think it's a bad quality product from google, the ux is also terrible. Now trying runpod

hot breach Oct 25, 2022, 2:08 PM

#

ivory veldt FYI: allegedly runpod.io machines running dreambooth create better ckpt, nicer i...

I think the issue was the diffusers repos were not unfreezing the entire model, it made it work on less VRAM but doesn't produce as good results

hot breach Oct 25, 2022, 2:09 PM

#

tawny inlet what do you mean Caption-enabled repo?

fine tuners or dreambooth repos that let you have a caption on each training image instead of just a "token" or "classword"

icy olive Oct 25, 2022, 2:24 PM

#

ivory veldt I was unlucky with google colab today. I even paid $9.00 to get an undisturbed t...

You could've gotten an instance on vast.ai for $0.40/hr. It'd be more reliable than colab

ivory veldt Oct 25, 2022, 3:17 PM

#

yes. runpod is unstable - it's a joke. I want a reliable training server, not a shared machine

ivory veldt Oct 25, 2022, 3:33 PM

#

who has a web service where I can upload images and I get a ckpt file back. Without the painful interruptions, errors and 100s of log files?

limber peak Oct 25, 2022, 3:41 PM

#

runpod works pretty well

ivory veldt Oct 25, 2022, 3:45 PM

#

Not for me. Training stops every time

tribal rapids Oct 25, 2022, 3:45 PM

#

hmm..
(jmp person:1.0) and jmp person are not the same it seems... if you have Use old emphasis implementation ticked in a1111
(jmp person:1.1) and (jmp person) are also not the same in that scenario

I had assumed each bracket was 1.1, with ie 2 brackets being 1.21
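
The assumption in that last line, written out: if each pair of parentheses multiplies attention by 1.1, nesting compounds. This matches how the web UI documents its `(word)` syntax, but it says nothing about how the old emphasis implementation treats explicit `:1.0` weights; `paren_weight` is just an illustrative helper.

```python
def paren_weight(depth, base=1.1):
    """Attention multiplier for a term wrapped in `depth` pairs of parentheses."""
    return base ** depth


for depth in range(4):
    print(depth, round(paren_weight(depth), 4))
```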

solar shale Oct 25, 2022, 5:30 PM

#

Hi, does anyone know how can I implement stable diffusion on my own macbook?

woeful sphinx Oct 25, 2022, 6:06 PM

#

When it comes to improving the quality of the faces around a specific model for 1.4, what comes to mind?

#

We are not happy with the results

#

Should we try training the model longer?

tame aurora Oct 25, 2022, 6:23 PM

#

hot breach my take is if you have higher ambitions, unfreezing the entire (latent diffusion...

What do you mean by “entire”?
Not just the Unet but also the ClipEmbedder and the Autoencoder?
Aren’t they supposed to be generic and good enough in their pre-trained variant or you think there would be benefit if my data is kind-of more specialized than laion?

hot breach Oct 25, 2022, 6:24 PM

#

clip remains frozen

#

unet and autoencoder for the local repos, and I think you can unfreeze them both in diffusers with the right setup if you choose but I'm a bit behind on that

#

at least the how part

#

but I believe it is an option

silk marsh Oct 25, 2022, 7:18 PM

#

Hey, I've noticed after training an embedding of a character with Auto1111's textual inversion interface, that after a certain point, using the embedding tends to make duplicates of the character. Can it be because of too many steps of training? Is there a way to prevent that? Have any of you had the same thing happen?

tame aurora Oct 25, 2022, 8:21 PM

#

hot breach unet and autoencoder for the local repos, and I think you can unfreeze them both...

I’m using the SD repo if that’s what you mean. Thanks, you are giving me options to explore further

hot breach Oct 25, 2022, 8:21 PM

#

if you have the vram I think unfreezing the whole model is worth it from what I've seen

#

maybe for some projects its not terribly important

tame aurora Oct 25, 2022, 8:30 PM

#

I’ll add that to the list of experiments, thanks

hot breach Oct 25, 2022, 8:31 PM

#

yeah the backlog problem is real 😆

woeful goblet Oct 25, 2022, 10:44 PM

#

how do i remove colors with inpainting? I'm making an image of a wasteland, and it has lots of glowy red spots on the ground that i don't want, i'm trying to just replace them with boring, cracked black dirt. But it keeps maintaining the colors and just painting in red cracks. I've even got black in the prompt, and red in the negative prompt

#

https://i.imgur.com/GN1OBGB.png

#

Example: I selected all the red bits that aren't on the pillar, used prompt cracked black ashen ground and negative prompt red pink color
Still red everywhere 😐
using automatic ui

#

i only want red on the pillar not the ground, any thoughts?

tribal rapids Oct 25, 2022, 10:55 PM

#

What about something like (red_pillar:1.2) in prompt and (red_ground:2) in negative ?

#

Keep the underscores see if it works

#

Play with CFG level

#

Dunno

woeful goblet Oct 25, 2022, 11:13 PM

#

tribal rapids What about something like (red_pillar:1.2) in prompt and (red_ground:2) in nega...

i dont need to do anything with the pillar, i'm happy with this overall composition, just want to finetune details, hence using this channel 😛

tribal rapids Oct 25, 2022, 11:14 PM

#

Inpaint after?

azure crane Oct 26, 2022, 5:17 AM

#

I read maybe a few weeks ago that someone had made a SD version that was trained STRICTLY for HANDS (and maybe feet too), but I can't find it anymore... Anyone know which one it is and if it's any good?

woeful goblet Oct 26, 2022, 9:47 AM

#

azure crane I read maybe a few weeks ago that someone had made a SD version that was trained...

Please ping me if you find out, i desire this too

upper prism Oct 26, 2022, 10:01 AM

#

azure crane I read maybe a few weeks ago that someone had made a SD version that was trained...

I think I saw @north ledge mention it but I don't know if he made it. Might know where it is though

tribal rapids Oct 26, 2022, 12:46 PM

#

If there’s multiple people in a photo and you just want to train one of them what’s the best way to exclude the rest? Just mask them out with noise?

upper prism Oct 26, 2022, 1:30 PM

#

tribal rapids If there’s multiple people in a photo and you just want to train one of them wha...

You could crop the image or mask it
But don't use noise. I think white or black works best

tawny inlet Oct 26, 2022, 2:49 PM

#

Hey, can somebody help me with Dreambooth?

#

I'm trying to train a Person Face, should I leave these two values as they are?

#

upper prism Oct 26, 2022, 3:53 PM

#

tawny inlet I'm trying to train a Person Face, should I leave these two values as they are?

For making it easy, yes
You could also change them if you know what you are doing.
What reg images are you using?

tribal rapids Oct 26, 2022, 4:04 PM

#

upper prism You could crop the image or mask it But don't use noise. I think white or black ...

thanks

tawny inlet Oct 26, 2022, 4:34 PM

#

upper prism For making it easy, yes You could also change them if you know what you are doin...

reg images???

upper prism Oct 26, 2022, 4:35 PM

#

tawny inlet reg images???

Or is this the new Lastben repo without them? Sometimes called class images?

tawny inlet Oct 26, 2022, 4:35 PM

#

https://github.com/ShivamShrirao/diffusers/tree/main/examples/dreambooth

GitHub

diffusers/examples/dreambooth at main · ShivamShrirao/diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch - diffusers/examples/dreambooth at main · ShivamShrirao/diffusers

#

Shivam's Repo.

upper prism Oct 26, 2022, 4:36 PM

#

Ah yeah, that's the one I'm using.

tawny inlet Oct 26, 2022, 4:37 PM

#

I don't understand too much about AI and all this stuff, so yeah, please don't ask too complex questions. I would prefer normal english.😅

#

I still didn't understand what you meant by class images.

#

I just added face images on training folder...

#

Tutorial I was following said that Class images will be generated automatically. I guess...

upper prism Oct 26, 2022, 4:39 PM

#

To make it as easy as possible I'd leave the settings as is and try it without any modifications.
If that test runs well and gives you good results you could come back to that ☺️

tawny inlet Oct 26, 2022, 4:43 PM

#

upper prism To make it as easy as possible I'd leave the settings as is and try it without a...

btw, what's your rig specs?

upper prism Oct 26, 2022, 4:56 PM

#

tawny inlet btw, what's your rig specs?

3090, Ryzen 9, 32GB RAM and a normal HDD

midnight owl Oct 26, 2022, 5:49 PM

#

Anyone have any suggestions for working with different ckpt / weights quickly? I'm using the AUTOMATIC1111 SD web UI. For me it takes up to a minute or so.

hot breach Oct 26, 2022, 5:52 PM

#

faster hard drive will help, its fast to load on an NVMe drive

#

loading a 4gb file from spinning rust will be slow, but it only takes 2-3 seconds on a fast NVMe drive

midnight owl Oct 26, 2022, 5:53 PM

#

Ok so those YT vids I've seen where folks are switching 'instantly' are time snipped I guess

hot breach Oct 26, 2022, 5:53 PM

#

1-2GB/sec read on NVMe drive means 4gb is just a couple seconds

midnight owl Oct 26, 2022, 5:54 PM

#

Oh really OK, 2-3 seconds, wow then I'm doing it wrong, thanks

hot breach Oct 26, 2022, 5:54 PM

#

I initially started using SD off my NAS, it's reasonably fast NAS but it would still take 20 seconds or so to load, on NVMe drive its just a few seconds

#

SATA SSD is like 6-10 seconds, etc
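
The arithmetic behind those numbers, as a rough lower bound: file size divided by sequential read speed. Only the 1-2 GB/s NVMe figure comes from the messages above; the SATA and hard-drive throughputs are ballpark assumptions, and real load times also include deserializing the checkpoint and moving it to the GPU.

```python
def load_seconds(size_gb, read_gb_per_s):
    """Lower-bound time to read a file of `size_gb` at a sustained read speed."""
    return size_gb / read_gb_per_s


# Assumed sequential read speeds in GB/s (ballpark, not measured).
drives = {"NVMe": 2.0, "SATA SSD": 0.5, "hard drive": 0.15}
for name, speed in drives.items():
    print(f"{name}: about {load_seconds(4.0, speed):.0f} s for a 4 GB ckpt")
```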

midnight owl Oct 26, 2022, 5:55 PM

#

I symlinked my models from my nvme drive - some of them anyway. I seem to remember it didn't help initial checkpoint weights loading, so never pursued it

#

I very superficially looked at how to point the SD webui at an external location for the models

restive ridge Oct 26, 2022, 6:35 PM

#

Yeah on my M.2 SSD it's a few seconds. It's also worth trying out embeddings if you haven't. Much lighter than checkpoints and ideal for styles.

grand jay Oct 26, 2022, 7:19 PM

#

Anyone got a good tutorial link on how to fine tune the model? with 10k+ images. lambdalab's pokemon demo looks horrendous...

azure crane Oct 26, 2022, 8:02 PM

#

upper prism I think I saw @north ledge mention it but I don't know if he made it. M...

Have you asked him?

tribal rapids Oct 26, 2022, 8:48 PM

#

on shivam's, for a person what do you currently recommend steps-wise for 12 images @ 1e-6 with 300 regs? i'm going up in steps of 800, up to 4800... i'm not sure 100 per image does it, or maybe 800 was too low and 1600 too much, but the face isn't settling

2e-6 converged (well a bit) a lot quicker

#

it was defaulted at 5e-6 at one point, wasn't it? i've seen someone here training 20 images for 4040 steps so dunno

#

there's some unnatural data in these photos tho this time, eg i've moved a person from the edge of a photo to the middle, cropped anybody else off and filled with black.. it's probably problematic

delicate stream Oct 27, 2022, 12:13 AM

#

Out of all the 28 or 29 activation functions of the hypernetwork on Automatic1111, and the Layer weights initialization. What is the best/recommended options i should choose? I AM SO CONFUSED as to which i should go with.

#

i used to do linear, but

#

Linear is not there anymore, and they added layer weights initialization

delicate stream Oct 27, 2022, 12:29 AM

#

Also this Select Layer weights initialization. relu-like - Kaiming, sigmoid-like - Xavier is recommended

#

is very vague

#

what does that mean? i should mix Relu with Kaiming Normal and always use Sigmoid with Xavier?

#

That is so vague and confusing
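
One reading of that hint is a pairing: ReLU-like activations with Kaiming initialization, sigmoid- or tanh-like activations with Xavier. The standard formulas for the normal-distribution variants are sketched below; how the web UI applies them internally is not verified here, and the 768-wide layer is only an example size.

```python
import math


def kaiming_normal_std(fan_in, gain=math.sqrt(2.0)):
    """Std of Kaiming (He) normal init; the default gain is the usual one for ReLU."""
    return gain / math.sqrt(fan_in)


def xavier_normal_std(fan_in, fan_out, gain=1.0):
    """Std of Xavier (Glorot) normal init."""
    return gain * math.sqrt(2.0 / (fan_in + fan_out))


# Example: a square 768 -> 768 layer.
print("kaiming:", kaiming_normal_std(768))
print("xavier: ", xavier_normal_std(768, 768))
```

For a square layer the Kaiming weights start out larger by a factor of sqrt(2), which compensates for ReLU zeroing roughly half of the activations.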

alpine blade Oct 27, 2022, 12:37 AM

#

I made a bunch of interesting photos to say the least

hot breach Oct 27, 2022, 12:38 AM

#

it's so new maybe people don't know yet

delicate stream Oct 27, 2022, 12:38 AM

#

Guess i'll have to find out myself.

hot breach Oct 27, 2022, 12:40 AM

#

try training with the same stuff and just change the init setting and compare after I guess, it's pretty experimental stuff

alpine blade Oct 27, 2022, 12:40 AM

#

this is a bunch of images I generated

📎 animegirloutput.zip

delicate stream Oct 27, 2022, 12:41 AM

#

i tried earlier and i'm still a bit confused, either way i'll keep experimenting with it.

alpine blade Oct 27, 2022, 12:41 AM

#

but a common theme between all of them is that there is something inherently wrong with their limbs and faces

#

does anyone know how to fix that?

delicate stream Oct 27, 2022, 12:42 AM

#

alpine blade does anyone know how to fix that?

Potentially Dangerous file That's not good....

alpine blade Oct 27, 2022, 12:42 AM

#

uh what

#

its all images

delicate stream Oct 27, 2022, 12:43 AM

#

Yeah but Discord do be like that

alpine blade Oct 27, 2022, 12:43 AM

#

47 images in there

#

maybe just because its zipped

delicate stream Oct 27, 2022, 12:43 AM

#

Those faces look....weird, i see

#

Have you tried using mutated, disfigured in the negative prompt?

#

or any other limb correcting phrase?

alpine blade Oct 27, 2022, 12:48 AM

#

no

#

I'm new to this

delicate stream Oct 27, 2022, 12:48 AM

#

Well that certainly can help

alpine blade Oct 27, 2022, 12:49 AM

#

this was just an initial test of mine where I put in a series of photos with the prompt "anime girl"

delicate stream Oct 27, 2022, 12:49 AM

#

Negative prompt basically is what you don't want in the image

alpine blade Oct 27, 2022, 12:49 AM

#

alright

#

any other tips?

delicate stream Oct 27, 2022, 12:50 AM

#

Well what UI are you using?

#

Automatic1111, CMDR2 or Grisk, etc

alpine blade Oct 27, 2022, 12:55 AM

#

the webui?

delicate stream Oct 27, 2022, 12:55 AM

#

like where did you download it from?

alpine blade Oct 27, 2022, 12:56 AM

#

https://github.com/AUTOMATIC1111/stable-diffusion-webui#automatic-installation-on-windows

GitHub

GitHub - AUTOMATIC1111/stable-diffusion-webui: Stable Diffusion web UI

Stable Diffusion web UI. Contribute to AUTOMATIC1111/stable-diffusion-webui development by creating an account on GitHub.

delicate stream Oct 27, 2022, 12:56 AM

#

Then Automatic1111

alpine blade Oct 27, 2022, 12:56 AM

#

ahh

#

I see

delicate stream Oct 27, 2022, 12:57 AM

#

well there's LOTS of things to do on it so i cant name all of them, i suggest taking a look at this.

#

https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Features

GitHub

Features · AUTOMATIC1111/stable-diffusion-webui Wiki

#

the features mainly

#

but tips specifically

#

well:

#

You can use Loopback on img2img to make the image sort of better over time

#

to upscale an image you can take a 512x512 image and pass it to img2img with denoising strength at 0.50-0.60, or use ESRGAN

#

20-50 steps is more than enough, unless you are doing img2img, outpainting or inpainting.

#

The recommended sampling methods are usually Euler_a, Euler or DDIM. You can use the others, but those are the ones I and most other people mainly use.

#

using [] or () with a specific word inside like:

#

[dark alley] with a (red light) on the ceiling.

#

[] = Doesn't pay much attention to what's inside but its still there

#

() = Pays more attention

#

the more (()) the more attention as well as [[]] for less attention
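
A rough way to picture the nesting described above: in the Automatic1111 web UI each pair of () multiplies attention by 1.1 and each pair of [] divides it by 1.1. The helper below is only an illustrative sketch that counts wrapping brackets on a single phrase. It is not the UI's real prompt parser, and other UIs use different factors.

```python
def emphasis_weight(phrase: str) -> float:
    """Attention multiplier implied by brackets wrapped around a phrase.

    Assumes the Automatic1111 convention: () -> x1.1, [] -> /1.1 per level.
    """
    weight = 1.0
    while len(phrase) >= 2:
        if phrase[0] == "(" and phrase[-1] == ")":
            weight *= 1.1
        elif phrase[0] == "[" and phrase[-1] == "]":
            weight /= 1.1
        else:
            break
        phrase = phrase[1:-1]  # peel off one bracket level
    return round(weight, 4)

for example in ("red light", "(red light)", "((red light))", "[dark alley]", "[[dark alley]]"):
    print(f"{example!r:20} -> {emphasis_weight(example)}")
```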

#

A more complex prompt is sometimes better, but sometimes cutting back helps, so don't go crazy with the prompt.

#

Something like "an anime girl is running down the road" works much better written like this: "1girl, anime, running, street"

#

But most of these you can find as well on the Automatic1111 GitHub, just read around and you'll see it.

alpine blade Oct 27, 2022, 1:08 AM

#

tysm

delicate stream Oct 27, 2022, 1:09 AM

#

No problem

tribal hearth Oct 27, 2022, 1:20 AM

#

any tips for drastic inpainting? it works fine when I mask an object and try to replace it with something similar, but if (for example) I mask a wall and prompt "hole in the wall" often very little to nothing will change, irrespective of cfg/steps/sampler/etc

pearl merlin Oct 27, 2022, 1:44 AM

#

My dune SD fine-tune https://huggingface.co/nagolinc/sd-dune

nagolinc/sd-dune · Hugging Face

icy olive Oct 27, 2022, 2:20 AM

#

The slow speed in loading weights seems to have to do with Python's relatively slow pickle deserialization (I should try to profile it). I'm stuck waiting for 10-20 seconds even with an NVMe SSD and 32 GB of RAM.
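
A quick way to check that hunch without the webui in the loop is to time a bare unpickle of a comparably large object. This sketch uses only the standard library, and the "weights" are a made-up dict of float lists rather than a real checkpoint (real .ckpt files go through torch.load, which stores tensors differently), so treat the numbers as a rough indication only.

```python
import io
import pickle
import time

# Stand-in for checkpoint weights (hypothetical data, not a real model).
fake_weights = {f"layer_{i}": [float(j) for j in range(50_000)] for i in range(20)}

# Serialize once into memory so disk speed is out of the picture.
buffer = io.BytesIO()
pickle.dump(fake_weights, buffer, protocol=pickle.HIGHEST_PROTOCOL)
raw = buffer.getvalue()

# Time only the deserialization step.
start = time.perf_counter()
restored = pickle.loads(raw)
elapsed = time.perf_counter() - start

print(f"{len(raw) / 1e6:.1f} MB unpickled in {elapsed * 1000:.1f} ms")
```

If the in-memory unpickle is already slow, a faster SSD will not help much. For a per-function breakdown, the same call can be wrapped in cProfile.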

delicate stream Oct 27, 2022, 3:51 AM

#

Found a nice guide for the Hypernetwork thing (activation function of hypernetwork)

#

https://indiantechwarrior.com/7-types-of-neural-network-activation-functions-how-to-choose/

IndianTechWarrior

indiantechwarrior author

7 Types of Neural Network Activation Functions: How to Choose?

Neural network activation functions are a crucial component of deep learning. Activation functions determine the output of a deep learning

#

basically explains what they look like

crimson wasp Oct 27, 2022, 3:57 AM

#

icy olive The slow speed in loading weights seems to have to do with Python's relatively s...

Hrm I don't think it takes more than a few seconds to switch model for me, running off a Samsung 980 pro and loading onto an rtx 3060, with the i5 12400 maybe doing some decompressing. It could be something else

delicate stream Oct 27, 2022, 3:59 AM

#

icy olive The slow speed in loading weights seems to have to do with Python's relatively s...

You can always load them into ram using Checkpoints to cache in RAM

#

under stable diffusion in settings

#

https://paperswithcode.com/methods/category/activation-functions

Papers with Code - An Overview of Activation Functions

Activation functions are functions that we apply in neural networks after (typically) applying an affine transformation combining weights and input features. They are typically non-linear functions. The rectified linear unit, or ReLU, has been the most popular in the past decade, although the choice is architecture dependent and many alternative...

#

as far as i can see (currently training) elu is good

#

The best seem to be, relu, rrelu, swish, sigmoid (hard to say), leakyrelu, tanh.

delicate stream Oct 27, 2022, 10:10 AM

#

you can add selu to the list, it's also good.

dawn trellis Oct 27, 2022, 2:31 PM

#

Hrm.... I'm playing with textual inversion, and I'm getting results that appear to be overfitting? i.e. if you put * in as a prompt, it tries to literally regenerate one of the sample photos. Things like 'a * themed lunchbox' just don't work at all.

#

I've been reading all the issues on the original textual inversion github... but there's just heaps of people trying random stuff as far as I can tell.

#

eg. 'set num_vectors_per_token to 60'

#

Even though the original work didn't do anything remotely like that.

#

Does anyone know of a good guide for the right settings to use?

crimson wasp Oct 27, 2022, 3:34 PM

#

dawn trellis Does anyone know of a good guide for the right settings to use?

number of vectors is roughly the equivalent of number of words if you were trying to type out a prompt to do it, and SD only takes up to 75 vectors, so a high number will overwhelm anything else
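
To put numbers on that: if the embedding eats N of the 75 available token slots, only 75 - N are left for the rest of the prompt. A tiny sketch of the arithmetic, where the 75 limit is taken from the message above and the function name is made up for illustration:

```python
PROMPT_TOKEN_LIMIT = 75  # limit quoted in the message above

def tokens_left_for_prompt(num_vectors_per_token: int, uses_of_embedding: int = 1) -> int:
    """Token slots remaining after the embedding's vectors are counted."""
    used = num_vectors_per_token * uses_of_embedding
    return max(PROMPT_TOKEN_LIMIT - used, 0)

for vectors in (1, 6, 8, 60):
    print(f"{vectors:2d} vectors -> {tokens_left_for_prompt(vectors)} slots left for the prompt")
```

With 60 vectors, four fifths of the prompt is the embedding itself, which fits the "overwhelm anything else" description.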

dawn trellis Oct 27, 2022, 3:35 PM

#

Yes, so I fail to see how that’s useful despite the advice in various places to do so, and simply set the inference number to some arbitrary lower value like 8

#

…but practically the number of vectors of 6 (used in the original repo) seems to give results that don’t do anything remotely like what they describe.

#

“Banksy art of *” gives me a photo of *

#

Is there a trick to making it do something meaningful, instead of just spitting out the training images?

#

(I’m reading, for example, https://github.com/rinongal/textual_inversion/issues/35 but people just seem to be posting pictures and not actually bothering to say how they got them)

GitHub

Got weird results, not sure if I missed a step? · Issue #35 · rino...

Hey @rinongal thank you so much for this amazing repo. I trained with over 10K steps I believe, and around 7 images. (Trained on my face) Using this colab I then used those pt files in running the ...

#

Eh. https://github.com/rinongal/textual_inversion/issues/92 I have the same results as this guy trying to repeat the original results in the paper.

GitHub

Reproducing the results in the paper · Issue #92 · rinongal/textual...

Hi, I'm trying to reproduce the paper results, especially the 'cat_statue' concept below in Figure 1, but it seems like extra information is needed to do that. Inversion and...

#

TLDR; he can’t.

#

Meh, I think this may actually just be broken with LD.

oak ether Oct 27, 2022, 5:00 PM

#

hello, how do I add a new style in SD? to be clear, I don't have it running on my PC, I use it online

tribal rapids Oct 27, 2022, 6:47 PM

#

to increase negative emphasis in the negative prompt (ie I want it even less like the word) do I use (word) or [word]?

#

like instead of red in the negative prompt, i want to say I really really dont want red.. is that ((red)) or [[red]] ?

#

not sure because it's negative

crimson wasp Oct 27, 2022, 7:09 PM

#

tribal rapids to increase *negative* emphasis in the negative prompt (ie I want it even less l...

it depends on the repo, it's a hack added on top of base SD and each one does it differently. I'm guessing you should use (word)

crimson wasp Oct 27, 2022, 7:09 PM

#

dawn trellis Meh, I think this may actually just be broken with LD.

I've had some success with TI but am too busy to go through it all in detail right now. I'm pretty sure Automatic's implementation is broken though, if that's what you're using

icy olive Oct 27, 2022, 7:16 PM

#

crimson wasp I've had some success with TI but am too busy to go through it all in detail rig...

Wasn't always broken, at least

crimson wasp Oct 27, 2022, 8:05 PM

#

icy olive Wasn't always broken, at least

Yeah I've had some success with it, but can't get anything good out of it now. I have some very minor modifications, but they're the same as I have in other repos (reading prompts directly from filenames)

delicate stream Oct 27, 2022, 8:36 PM

#

They are bringing Linear back to Hypernetworks... nice

#

Pull request
https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/3717

GitHub

Add missing support for linear activation in hypernetwork by benkyo...

In hypernetwork, the linear activation_func that the old implementation used is missing from ui.
Add the linear activation in hypernetwork back to ui.

native bison Oct 27, 2022, 9:20 PM

#

just uploaded my cyberpunk dreambooth model to huggingface (its overtrained and I LOVE it) https://huggingface.co/Phantasion/phaninc

Phantasion/phaninc · Hugging Face

young quail Oct 27, 2022, 9:38 PM

#

Has anyone had any luck determining a more optimal prompt for multiple subjects without distortions or merging taking place?

hot breach Oct 27, 2022, 10:01 PM

#

is that a fine tuning question or a prompting question?

#

because putting group photos in your training set helps

#

that's a render of a trained model that includes group photos

tifa_lockhart_cloud_strife_aerith_gainsborough_and_barret_wallace.png

#

getting outfits on specific characters in group photos is still elusive

young quail Oct 27, 2022, 10:08 PM

#

hot breach is that a fine tuning question or a prompting question?

oh my bad yeah prompting question i missed the channel for that xD

somber roost Oct 27, 2022, 11:49 PM

#

What is the best way to train an art style in dreambooth? I picked 150 samples and trained with 2000 steps, the results are impressive but seems the model can't understand some concepts that appeared in the dataset

#

Maybe bcs there are few references to those concepts

indigo siren Oct 28, 2022, 12:59 AM

#

I’m a newb hoping to make a Transformers model, and I have two questions:

1. If I want to teach the AI to recognize details like a ‘red helm’ (specifically the blocky face-framing element some Transformers have) or a pair of ‘blue pedes’ (the nonhuman feet of a Transformer, trying to avoid toes), is it better to say “red helm/blue pedes”, “red-helm/blue-pedes”, or “redhelm/bluepedes” when teaching? Or something else entirely?
2. Is it possible to teach the AI multiple tags at the same time? I found SD through NovelAI, which made a model that recognizes most danbooru tags, and I would really like to be able to do something similar. E.g. “whirl-idw1, blue-long-empuratee-helm, yellow-long-empuratee-optic, neck-up, from-side, suspicious” to describe a picture in the dataset, and “swindle-g1-cartoon, black-helm, gray-face, purple-optics, black-neck, yellow-pauldrons, yellow-chest, glass-windshield, purple-torso, waist-up, from-front, fake-happy” to describe another. Is there a way to do this efficiently?

I apologize if these are ridiculous questions with obvious answers. I’m new to this.

delicate stream Oct 28, 2022, 1:12 AM

#

I have spent...hours trying to determine which Hypernetwork activation function to use. I have determined......SCREW IT!

#

It's math: they're all different ways to plot a graph, and based on that graph your training will go differently. My advice...... There's no best method, they all just produce different things. There, saved you the trouble: just stay on Linear, relu, selu, elu and leakyrelu. Those seem to be stable......for the love of god....don't change the Layer weights initialization from Normal. Just leave it there, DON'T touch it. Also....DON'T use SIGMOID, it's just a mess......if anyone wants to keep trying, go for it. But for me? I'm good with Linear.

#

as for Dropout....well be careful with that. i could do some perfect stuff without it before, but it will drastically change your outputs as well.

#

Maybe for the good or bad.

delicate stream Oct 28, 2022, 2:50 AM

#

Here are some examples of what the graphs look like.

#

#

Do with that what you will.

#

alpine rose Oct 28, 2022, 8:54 AM

#

@hot breach what ratio have you been using between training images and reg images?

#

1:100 ?

#

i have 400 training images for a model im trying to make, kinda lame to generate 40000 reg images

ivory veldt Oct 28, 2022, 9:41 AM

#

I installed dreambooth locally for local training, but getting a cuda memory error. RTX3060 6GB VRAM Anyone successfully trained locally?

alpine rose Oct 28, 2022, 11:32 AM

#

at some point I read you needed 24GB VRAM to train, maybe that changed

half folio Oct 28, 2022, 11:51 AM

#

ivory veldt I installed dreambooth locally for local training, but getting a cuda memory err...

6GBs? No, it can't be done.

ivory veldt Oct 28, 2022, 11:51 AM

#

half folio 6GBs? No, it can't be done.

maybe in a few months...

half folio Oct 28, 2022, 12:02 PM

#

the model is simply too big to be loaded in 6GBs of VRAM

hot breach Oct 28, 2022, 3:43 PM

#

alpine rose @hot breach what ratio have you been using between training images and...

last model didn't have reg images, it was trained on 1:1 training images and laion image scrape, no difference in repeats

viral jay Oct 28, 2022, 5:43 PM

#

guys I've installed dreambooth on WSL, but when I convert to ckpt and load it on automatic webui I'm getting this error, if I copy the whole folder and execute same command but on windows directly it works, any idea?

hot breach Oct 28, 2022, 6:01 PM

#

auto has a safety checker on the ckpt that rejects it if there are unexpected things in it, so people don't end up getting malware

#

it may be the converter being used on the repo you have from diffusers->ckpt is doing something unexpected in the ckpt pickle file

#

his safety checker is probably far from perfect, but better than nothing
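
For context on how a checker like that can work in principle: Python's pickle docs describe restricting which globals an unpickler may resolve by overriding `find_class`. The sketch below follows that documented pattern with a made-up allow-list. It is not Automatic1111's actual checker, and a real .ckpt would need tensor-related entries allowed as well.

```python
import io
import os
import pickle
from collections import OrderedDict

# Hypothetical allow-list; a real checker would list the torch/numpy names it trusts.
ALLOWED = {("collections", "OrderedDict")}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream references a global (class or function).
        if (module, name) not in ALLOWED:
            raise pickle.UnpicklingError(f"blocked global: {module}.{name}")
        return super().find_class(module, name)

def safe_loads(data: bytes):
    """Unpickle data, refusing any global outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()

# A harmless payload loads fine.
ok = safe_loads(pickle.dumps(OrderedDict(a=1)))

# A payload that references os.system is rejected before anything runs.
try:
    safe_loads(pickle.dumps(os.system))
    blocked = False
except pickle.UnpicklingError as err:
    blocked = True
    print(err)
```

A checker of this kind rejects anything it does not recognize, so a converter that writes an unfamiliar global into the pickle could trip it even when the file is harmless.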

viral jay Oct 28, 2022, 6:10 PM

#

yeah disabling the checker makes it work, the weird thing is that same script on windows creates a working ckpt but on debian produces that message

hot breach Oct 28, 2022, 6:23 PM

#

that's certainly an interesting data point

fair perch Oct 28, 2022, 6:26 PM

#

hello everyone. I was thinking about a way to produce datasets of generated characters for fine tuning, and today I found a simple way to do it.
I did just a few tests using img2img; the idea is to crop every image at each angle and use it in dreambooth

icy olive Oct 28, 2022, 10:51 PM

#

viral jay yeah disabling the checker makes it work, the weird thing is that same script on ...

Probably has to do with python/pytorch version

hot breach Oct 29, 2022, 3:34 AM

#

added micro model training to my trainer since people seem interested in that, will damage the model if you do too much but worked ok for training Ted from the Seth MacFarlane movie of the same name https://github.com/victorchall/EveryDream-trainer/blob/main/MICROMODELS.MD sorry no notebooks yet, just local on 24GB gpus

#

not bad for 13 minutes of training

delicate stream Oct 29, 2022, 3:39 AM

#

😂

#

Never thought i'd see Ted as superman.

dawn trellis Oct 29, 2022, 10:13 AM

#

I've got to say I'm struggling to see what really changed other than the config in https://github.com/kanewallmann/Dreambooth-Stable-Diffusion/compare/main...victorchall:EveryDream-trainer:main

GitHub

Comparing kanewallmann:main...victorchall:main · kanewallmann/Dream...

Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion (tweaks focused on training faces) - Comparing kanewallmann:main...victorchall:main · kanewallmann/Dreambooth-S...

#

Drops the reg as part of the loss in the unfrozen finetuning?

dawn trellis Oct 29, 2022, 10:15 AM

#

hot breach added micro model training to my trainer since people seem interested in that, w...

Prompt is simply "ted bennett" <-- why no complex prompt?

mighty igloo Oct 29, 2022, 10:16 AM

#

I have a dataset that I created by parsing one site, image tags are inserted in the file names, and I want to finetune SD so that it understands the tags of this site. What is the best way to do this?

delicate stream Oct 29, 2022, 11:23 AM

#

Yo quick question, if im training a hypernetwork on a specific anime character. Is it better if i do this for the textual inversion template?

#

Ichigo, [filewords] ? or [filewords] since i'm thinking adding ichigo will make it more pronounced that the anime character i'm training it on is called Ichigo, so every time i say 1boy, orange hair it won't just generate a random dude with orange hair instead. So that's why i'm thinking if i do Ichigo, 1boy, orange hair it will consistently make it like the guy i trained it on. What do you guys think?

hot breach Oct 29, 2022, 12:40 PM

#

dawn trellis `Prompt is simply "ted bennett"` <-- why no complex prompt?

there's not much need for it, the text encoding is quite smart from what I can tell and it's too painful at inference to remember magic prompts and magic tokens, I don't think the magic tokens really have legs, no one is going to want to have to read a giant prompt guide especially as mega models start flowing in place of having a drive full of a hundred 2GB dreambooth trainings

hot breach Oct 29, 2022, 12:42 PM

#

dawn trellis I've got to say I'm struggling to see what really changed other than the config ...

regularization pairing is removed, kane is off doing his phd so I forked it to keep moving forward, I'm up to batch size 6, which is like running the other xavier forks with batch size 3 with the train/reg pairing, and that wasn't possible for one thing, they max out at 2 (or equivalent of 4 on my fork)

#

the data management to change the ratio of preservation to training data is obtuse if you use kane's, it was sort of fixed at 1:1 without a complex explanation of moving data between "reg" and train folders

#

kane's was essentially just passing pairs of train/reg and training them equally, which actually worked very well for preservation using laion data in place of "regularization" in dreambooth paper terms, captioning already removed token/class so at this point it back to a more general case fine tuner and there's nothing left of the dreambooth paper in there really, and I wanted to be able to more easily manage the ratios of new and preservation data

#

I'm actually fairly pessimistic about "fast" "dreambooth" (no regularization) long term, but people seemed hyped about it so I did a POC for it above and it only took me like 30 minutes of actual work to run a test and write up the readme for ted bennett to show it works

naive wharf Oct 29, 2022, 1:47 PM

#

getting super off results when training hypernetwork with automatic1111, running 2000 steps and the outputs are nothing that matches the prompt... any suggestions on how to direct the model closer to the prompt?

delicate stream Oct 29, 2022, 1:49 PM

#

What are your settings? and are you trying to train a person, object or a style?

#

by settings i mean, activation method, layer structure, learning rate, etc.

naive wharf Oct 29, 2022, 1:58 PM

#

delicate stream by settings i mean, activation method, layer structure, learning rate, etc.

linear, (1, 2, 1) and 0.00005

#

i'm training purple boots / shoes

#

so moreso object

delicate stream Oct 29, 2022, 2:00 PM

#

By getting super off results do you mean the images degrade (deep-fry) over time or do they just not look as what you are training?

naive wharf Oct 29, 2022, 2:01 PM

#

delicate stream By `getting super off results` do you mean the images degrade (deep-fry) over ti...

both, they start to fry around 1700 steps, but all images prior to that are just super off , for example a random car showed up and then what seemed like a japanese zen garden ... so totally all over the place

delicate stream Oct 29, 2022, 2:03 PM

#

i recommend first using this method: but also make sure your images are of your subject and nothing else just to be safe, "Quality over quantity" as the guide says.

naive wharf Oct 29, 2022, 2:05 PM

#

delicate stream i recommend first using this method: but also make sure your images are of your ...

yeah, i only have 10 images, all just of the target, very clean/focused on the target too

delicate stream Oct 29, 2022, 2:08 PM

#

Enter hypernetwork layer structure: 1, 1 (linear)
Select Layer weights initialization Normal
Use Dropout: Enabled

make sure they are 512x512 or resize them.

Hypernetwork Learning rate: 5e-5:100, 1e-5:1000, 5e-6

also make a custom prompt template with something like
Shoes.txt inside> a picture of purple shoes or Purple shoes

if that doesn't work try using:
Select activation function of hypernetwork: relu, rrelu, elu, swish, leakyrelu
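
For reference, a learning-rate string like the one above is a list of `rate:until_step` pairs, with the last rate applying to all remaining steps. The sketch below shows how such a string can be read. The function names are made up, the handling of the exact boundary step is an assumption, and it is not the webui's own parser.

```python
def parse_schedule(text: str):
    """Turn 'rate:until_step, ..., rate' into a list of (rate, until_step) pairs."""
    pairs = []
    for chunk in text.split(","):
        rate, _, until = chunk.strip().partition(":")
        # A chunk without ':' has no end step and applies to all remaining steps.
        pairs.append((float(rate), int(until) if until else None))
    return pairs

def rate_at(step: int, schedule) -> float:
    """Learning rate in effect at a given step (boundary treated as inclusive)."""
    for rate, until in schedule:
        if until is None or step <= until:
            return rate
    return schedule[-1][0]

schedule = parse_schedule("5e-5:100, 1e-5:1000, 5e-6")
for step in (50, 500, 1500):
    print(f"step {step:4d}: lr = {rate_at(step, schedule)}")
```

So the run starts at 5e-5 for the first 100 steps, drops to 1e-5 until step 1000, and finishes at 5e-6.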

#

usually depending on what i am training i get good results at 2k

#

also make sure you have in settings Move VAE and CLIP to RAM when training hypernetwork. Saves VRAM. Enabled

naive wharf Oct 29, 2022, 2:10 PM

#

ok awesome will try this now

delicate stream Oct 29, 2022, 2:10 PM

#

and

#

Stop At last layers of CLIP model: 1

#

your results will also depend on what model you are using, SD 1.4~5 is mostly real stuff and Waifu diffusion for anime training

#

and also

#

make sure while training a hypernetwork you don't accidentally have another hypernetwork enabled

#

That's all i can say, it should be pretty straightforward. if even after everything, it still fails. Well....idk

naive wharf Oct 29, 2022, 2:15 PM

#

delicate stream That's all i can say, it should be pretty straightforward. if even after everyth...

it's worked well for me for textual inversion training with diffusers a few weeks back so i hope to get this one working too but sometimes it's just 🤦‍♀️ 🥲

delicate stream Oct 29, 2022, 2:15 PM

#

Yeah they changed a few things and i was mad because i had to re-learn hypernetworks again.

#

to help you a bit...

#

#

These are the activation functions

#

basically they shape how each layer's output gets transformed while learning

#

something like sigmoid squashes everything into the 0 to 1 range (it's already at 0.5 for an input of 0), therefore you might get weird noise and stuff. Linear is straightforward and passes everything through unchanged, negative values included; relu is basically linear with the negative side clipped to zero.

#

Hope that helps in visualizing how your hypernetwork might change overtime
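
For anyone who can't see the graphs, here is a small sketch that prints what a few of these functions return for a negative, zero and positive input. These are the textbook definitions written in plain Python, not the webui's code, and the slope and alpha defaults are just the commonly used values.

```python
import math

def linear(x):
    return x  # identity: negative values pass straight through

def relu(x):
    return max(0.0, x)  # negatives are clipped to zero

def leaky_relu(x, slope=0.01):
    return x if x >= 0 else slope * x  # small slope on the negative side

def elu(x, alpha=1.0):
    return x if x > 0 else alpha * (math.exp(x) - 1.0)  # smooth negative tail

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))  # squashed into (0, 1), 0.5 at x = 0

functions = (linear, relu, leaky_relu, elu, sigmoid)
for x in (-2.0, 0.0, 2.0):
    row = ", ".join(f"{f.__name__}={f(x):.3f}" for f in functions)
    print(f"x={x:+.1f}: {row}")
```

The main difference between them is what happens to negative inputs.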

naive wharf Oct 29, 2022, 2:19 PM

#

yeah, that's great, thank you

delicate stream Oct 29, 2022, 2:20 PM

#

No prob, dont be afraid to change the learning rate btw just dont go over 0.00005 (5e-5) or your network will die quickly at 1000 or more

#

so in short
1e-4 = no
5e-5, 1e-5, 5e-6, 1e-6, 5e-7, etc. = good

naive wharf Oct 29, 2022, 2:22 PM

#

and, for the sizing, keep it at 512 x 512 ? would for example 768 x 512 inject extra time/error ?

delicate stream Oct 29, 2022, 2:23 PM

#

well, from what i understand it's better to play it safe at 512x512 but you can do 768x768 but not uneven numbers, they have to have a 1:1 ratio.

#

or else you might get weird stuff

naive wharf Oct 29, 2022, 2:26 PM

#

ok great, implementing now the updates 🤞

delicate stream Oct 29, 2022, 2:26 PM

#

Good luck! Make sure to experiment.

viral jay Oct 29, 2022, 3:37 PM

#

any luck with anything except linear? if I use sigmoid or anything else I just get noise

tribal rapids Oct 29, 2022, 5:09 PM

#

Anyone trained 30 images or so of a person on shiv’s? And if so what was your LR and steps? Thanks

delicate stream Oct 29, 2022, 5:15 PM

#

viral jay any luck with anything except linear? if I use sigmoid or anything else I just g...

Yeah sigmoid is a hard skip for me, i just use relu, swish, elu, selu, linear or rrelu. Swish is slow to learn but it's safer from what i've heard. Sometimes selu gives good stuff. I've done A LOT of experimenting and i'm still not settled on one activation method, but i do use Linear more often.

delicate stream Oct 29, 2022, 5:20 PM

#

viral jay any luck with anything except linear? if I use sigmoid or anything else I just g...

Im training Elastigirl right now using selu

#

step 50

#

step 900

#

im not using Linear right now so i can't give an example but as you can see selu gives good stuff, obviously this is still training so it's bad rn

#

step 1150

icy olive Oct 30, 2022, 12:16 AM

#

where is selu?

fathom crane Oct 30, 2022, 1:04 AM

#

Hi all, is there any relation between the layer structure and learning rate? Should I use a learning rate smaller than 5e-7 when the layer structure is bigger than [1, 2, 1] ?

hot breach Oct 30, 2022, 1:36 AM

#

https://github.com/victorchall/EveryDream updated Laion scraper for better capture rate, fixed quite a few issues with junk in TEXT/caption fields

alpine blade Oct 30, 2022, 1:40 AM

#

anyone running into an issue where the ai understands who some characters are, but not others?

hot breach Oct 30, 2022, 1:43 AM

#

might need to be more specific on what you're training, posting examples would help

alpine blade Oct 30, 2022, 1:45 AM

#

hot breach might need to be more specific on what you're training, posting examples would h...

im not training it

tough gazelle Oct 30, 2022, 2:59 AM

#

alpine blade im not training it

Then the model likely wasn't fed many examples of that particular character in its training data

sage creek Oct 30, 2022, 3:28 AM

#

Can anyone link a decent tutorial on how to navigate the embedding/textual inversion procedure? Thanks and I am sorry if not the correct place to ask.

hot breach Oct 30, 2022, 3:36 AM

#

someone just linked some guides: #💬｜general-chat message

#

there are some basics on automatic1111's wiki on his github too

#

this is the right place

sage creek Oct 30, 2022, 3:39 AM

#

Ty. I appreciate the help

edgy raptor Oct 30, 2022, 9:44 AM

#

How many steps should a dreambooth model be trained on? E.g how many steps should you use relative to the amount of sample images

#

Currently, I'm using a 1:100 ratio. I used 40 images previously, so that's 4000 steps. However, it seemed to have overfitted the model. Thoughts?

alpine rose Oct 30, 2022, 11:03 AM

#

to avoid overfitting, you could use this repo, you'll need to caption your images https://github.com/kanewallmann/Dreambooth-Stable-Diffusion/

#

depending on the investment you want to put in, you could then use this one : https://github.com/victorchall/EveryDream-trainer

stone garden Oct 30, 2022, 11:05 AM

#

https://huggingface.co/hlky/xynthii-diffusion
dreambooth model of Xynthii (cyclops monster girls)
1000 steps, will be testing different amounts, results from num_train_epochs=24 (1920 steps) are good
the same images (and prompt) were used for both instance and class

alpine rose Oct 30, 2022, 11:10 AM

#

by using the technique used in the repos above, the model is able to understand the concepts much better
for example you can train painting styles, and get results like this :

#

here the subject was already known by the model and not in the training images, but it "understood" what it meant to represent subject in the style i was training

viral jay Oct 30, 2022, 11:15 AM

#

playing with dreambooth, any ideas on how to avoid text on the images? the images used for training have no text, but I've used microsphere worlds as prompt so I think its the root of text on the image, maybe using a random word would help?

alpine rose Oct 30, 2022, 11:34 AM

#

https://miro.com/app/board/uXjVPI19lP8=/

storm linden Oct 30, 2022, 2:14 PM

#

Did Shivam’s dreambooth update break training for anyone? I’m having issues after updating it recently. My training setup is:
3080TI
Windows using Ubuntu
Shivam’s DB
CUDA v 11.6
Python 3.9
Everything else basically set up with nerdy rodent’s 10GB dreambooth video

I’m using 300 reg images with 20 training images. I’ve tried 800 steps at 5e-6 and 2000 at 1e-6 but the ckpt ends up giving either all black images or the colored images. What’s frustrating is that the loss rate is really inconsistent: sometimes it stays at 0.18 and other times it goes to nan by 50%

storm linden Oct 30, 2022, 3:50 PM

#

I’m trying to do the training locally, not on Colab

prime rivet Oct 30, 2022, 4:50 PM

#

How knowledgeable are you lot with Textual Inversion embeddings? I can't seem to get anything but guesses about the actual parameters. What I'm struggling with is that the embedding is too dominant: it overtakes everything, even with a token range of two. This seems to be the case regardless of the learning rate. I have understood that adjusting the learning rate can be used to influence the dominance. Should I try drastically lower rates? Noise loss doesn't seem to be tied to learning rate.

#

Also, do the initialisation term(s) act as if they were prompts? Should I give it one term or a broader range of terms? As in "Underwear" or "Underwear, briefs, pants", with or without the comma. There doesn't seem to be much useful information, and what there is seems to conflict. I even read the original paper on the topic.

#

However I think all current implementations are different and advanced compared to the original paper.

#

The primary struggle really is ensuring editability of the embedding in use. Currently they seem to work fine, even if dominant, for making generic basic SD outputs, but if you try to force a style it refuses to.

stone garden Oct 30, 2022, 5:57 PM

#

stone garden https://huggingface.co/hlky/xynthii-diffusion dreambooth model of Xynthii (cyclo...

Update from epoch48
a classical preraphaelite painting of a beautiful cyberpunk vaporwave xynthii by john william waterhouse and William-Adolphe Bouguereau
k_euler_a 69 steps cfg scale 7.5

vocal pawn Oct 30, 2022, 8:54 PM

#

Hallo all, I want to train a model on a subject but I've only got 20 good images, if I train a baby model, output 100s of pics until I get a new one that looks very very decent - can I do that until I've got 10 new ones and retrain with the new decent fake 10 to get a better 30 model? Thonk Sounds feasible to me but very new to this so not sure if there's some hidden pitfall

restive bridge Oct 30, 2022, 8:56 PM

#

vocal pawn Hallo all, I want to train a model on a subject but I've only got 20 good images...

sounds like a good idea to me. as long as those outputs are the style youre looking for. no harm in trying it!

vocal pawn Oct 30, 2022, 8:57 PM

#

Nice, I will try it and see then peepoBlush
Will have to be extremely selective and mind hands of the warp

#

I've also done some others - I trained 2 60 image models on 6000 steps earlier, and the 6000 ckpt seemed a bit wonkier/worse than the 4500ish one Thonk Is that something anyone else has found, that there are sweetspots with image numbers/steps?

tribal rapids Oct 30, 2022, 10:06 PM

#

I’ve seen n*80 suggested as a figure, but it’s anecdotal obviously

#

Since it depends on your data, I think you just have to hone in on it
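A rough illustration of that rule of thumb, reading `n` as the number of training images (an assumption, since the message doesn't spell it out). `suggested_steps` is a hypothetical helper for the arithmetic only, not part of any trainer:

```python
def suggested_steps(num_images: int, steps_per_image: int = 80) -> int:
    """Anecdotal n*80 heuristic: total training steps scale with image count."""
    if num_images <= 0:
        raise ValueError("num_images must be positive")
    return num_images * steps_per_image


# 60 images, as in the question above
print(suggested_steps(60))  # 4800
# 20 images, the smaller set mentioned earlier
print(suggested_steps(20))  # 1600
```

For the 60-image case this lands near the "4500ish" checkpoint that looked better than 6000, but it's only a starting point to hone in from.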

#

@vocal pawn which trainer?

vocal pawn Oct 30, 2022, 10:20 PM

#

Using the colab fast db atm

restive bridge Oct 30, 2022, 10:36 PM

#

vocal pawn I've also done some others - I trained 2 60 image models on 6000 steps earlier, ...

i think the reason that 30 images /3000 steps is suggested is cuz any higher than that can have diminishing returns on the training. but thats case by case

light beacon Oct 30, 2022, 10:46 PM

#

Which sampler should I use if I want very photorealistic results, like this sample?

#

#

this is dall-e but I am trying to get as close as possible with SD

#

and any other settings recommendations to get this?

#

the prompt is A photo portrait of a female supermodel, soft neutral expression, long blonde hair, symmetrical face, front facing, looking at camera, studio lighting, 8k. Dramatic, professional photography. UHD.

vocal pawn Oct 30, 2022, 10:54 PM

#

restive bridge i think the reason that 30 images /3000 steps is suggested is cuz any higher tha...

👍 I see, lots to test

#

Ty for that :>

restive bridge Oct 30, 2022, 10:57 PM

#

light beacon Which sampler should I use if I want very photorealistic results, like this samp...

here are some photo-realistics i've done with SD. i always use euler_a, but the results are more up to your prompts than sampler. try adding things like "award winning photo, 50mm, highly detailed". i dont like using things like "uhd, 8k, hyper realistic" because those terms are rarely associated with real photos which is what we want SD to draw from.

01329-3835147329-beautiful_young_woman_with_short_dirty_blond_hair_wearing_overalls_thin_wide_shot_award_winning_photo_sharp_focus_detailed-gigapixel-low_res-scale-2_00x.jpg

01469-1875560516-stunningly_beautiful_thin_hippie_girl_with_blond_hair_wearing_denim_shorts_full_body_portrait_sharp_focus_detailed_photogra-gigapixel-low_res-scale-2_00x.jpg

01433-4293303383-stunningly_beautiful_fit_woman_with_shorter_hair_wearing_knit_sweater_and_denim_pants_skinny_full_body_portrait_award_winnin-gigapixel-low_res-scale-2_00x.jpg

01309-4192447683-stunningly_beautiful_young_woman_with_shorter_wavy_blond_hair_thin_wide_shot_award_winning_photo_sharp_focus_detailed_phot-gigapixel-low_res-scale-2_00x.jpg

light beacon Oct 30, 2022, 10:58 PM

#

restive bridge here are some photo-realistics i've done with SD. i always use euler_a, but the ...

OMG those look amazing! Mind sharing the prompts for the last two?

restive bridge Oct 30, 2022, 11:00 PM

#

light beacon OMG those look amazing! Mind sharing the prompts for the last two?

sure. i'll have to dig for them, gimme a minute

tribal rapids Oct 30, 2022, 11:04 PM

#

restive bridge here are some photo-realistics i've done with SD. i always use euler_a, but the ...

are these trained subjects? If so could you tell us what you trained please? Steps, subject count rate etc thanks

restive bridge Oct 30, 2022, 11:05 PM

#

tribal rapids are these trained subjects? If so could you tell us what you trained please? Ste...

nope these are just random people that SD created

tribal rapids Oct 30, 2022, 11:06 PM

#

Ah right yeah I can get great results from the core model for sure. Thanks for prompts tho

light beacon Oct 30, 2022, 11:07 PM

#

restive bridge sure. i'll have to dig for them, gimme a minute

getting closer by putting those KW's in!

1759021674_A_photograph_of_a_female_supermodel_dua_lipa__award_winning_photo__50mm__highly_detailed__soft_neutr.png

547859527_A_photograph_of_a_female_supermodel_dua_lipa__award_winning_photo__50mm__highly_detailed__soft_neutr.png

restive bridge Oct 30, 2022, 11:08 PM

#

light beacon OMG those look amazing! Mind sharing the prompts for the last two?

for the sweater one: stunningly beautiful fit woman with shorter hair, wearing knit sweater and denim pants, skinny, full body portrait, award winning photo, sharp focus, detailed, photography, 50mm
Steps: 35, Sampler: Euler a, CFG scale: 8.5, Seed: 4293303383, Face restoration: GFPGAN, Size: 960x1344, Denoising strength: 0.32

and for the blonde one: stunningly beautiful young woman with shorter wavy blond hair, thin, wide shot, award winning photo, sharp focus, detailed, photography, 50mm
Steps: 35, Sampler: Euler a, CFG scale: 9.5, Seed: 4192447683, Face restoration: GFPGAN, Size: 896x1408, Denoising strength: 0.32

light beacon Oct 30, 2022, 11:08 PM

#

TY

lime anvil Oct 31, 2022, 1:40 AM

#

light beacon

yes, because its real

fast current Oct 31, 2022, 2:17 AM

#

restive bridge for the sweater one: stunningly beautiful fit woman with shorter hair, wearing k...

I get particularly confused about this style of prompting. Does SD really know what "stunningly beautiful" means as opposed to "stunning, beautiful"?

#

Mostly i just wonder what it does with the "ly" i guess

hot breach Oct 31, 2022, 2:28 AM

#

https://github.com/victorchall/EveryDream/blob/main/doc/AUTO_CAPTION.md auto captioning added to my tools repo for those using caption training, should work with kanewallmann or my fork, or using MrWho's schema of "00001@my caption.jpg" for use with joepenna fork

GitHub

EveryDream/AUTO_CAPTION.md at main · victorchall/EveryDream

Advanced fine tuning tools for vision models. Contribute to victorchall/EveryDream development by creating an account on GitHub.

hot breach Oct 31, 2022, 4:47 AM

#

Added a colab notebook for above, any scuffed nvidia GPU should work, just need maybe 4GB?

edgy raptor Oct 31, 2022, 4:54 AM

#

Made a script for making 512x512 dreambooth images, crops to a 1:1 image first and then resizes to 512x512 to preserve as much of the image as possible; for use with TheLastBen's fast-dreambooth

📎 renameFolder.py
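
The attachment itself isn't reproduced here, so this is only a minimal sketch of the crop-then-resize idea described above, not the attached script. The function names are made up for illustration, and `crop_and_resize` assumes Pillow is installed:

```python
def center_square_box(width: int, height: int) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def crop_and_resize(src_path: str, dst_path: str, size: int = 512) -> None:
    """Center-crop to 1:1, then resize to size x size (assumes Pillow)."""
    # Imported lazily so center_square_box works without Pillow installed.
    from PIL import Image

    with Image.open(src_path) as img:
        box = center_square_box(*img.size)
        img.crop(box).resize((size, size), Image.LANCZOS).save(dst_path)


print(center_square_box(1024, 768))  # (128, 0, 896, 768)
```

A fixed center crop can still cut off a subject that sits near the edge of the frame, so it's worth eyeballing the results.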

hot breach Oct 31, 2022, 5:19 AM

#

You can use birme and it lets you move the cropping window so you dont cut people's heads off: https://www.birme.net/?target_width=512&target_height=512&auto_focal=false&image_format=webp&quality_jpeg=95&quality_webp=99

#

edgy raptor Oct 31, 2022, 5:25 AM

#

Nice tool!

hot breach Oct 31, 2022, 5:28 AM

#

yeah its just js in the browser

#

sometimes chokes if you put too many huge images in but works 98%

edgy raptor Oct 31, 2022, 6:02 AM

#

Could've made something like that myself but I don't need to reinvent the wheel 😄

stone garden Oct 31, 2022, 6:59 AM

#

Dreambooth
prior-preservation loss
train text encoder
105 images
same images used for both instance and class
prompt of "taylor swift" for both instance and class
69 epochs/7245 steps lr 2e-6

prompt: taylor swift k_euler_a 69 steps

works quite well, another test of my idea to use the same images/prompt for instance and class
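
The epoch and step counts above line up if one epoch is a single pass over the 105 images at batch size 1 (the batch size is an assumption, it isn't stated). A small sketch of that arithmetic, with a hypothetical helper name:

```python
import math


def total_steps(num_images: int, epochs: int, batch_size: int = 1) -> int:
    """Optimizer steps for a given number of full passes over the dataset."""
    steps_per_epoch = math.ceil(num_images / batch_size)
    return steps_per_epoch * epochs


print(total_steps(105, 69))  # 7245
```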

alpine rose Oct 31, 2022, 9:09 AM

#

hot breach You can use birme and it lets you move the cropping window so you dont cut peopl...

Birme completely destroys image quality, you should use other tools

#

Does anyone have content to recommend on regularization images?
Some sort of theory guide ?

#

I know they're supposed to match what you are going for during inference, but there's probably more to it

trail rock Oct 31, 2022, 10:18 AM

#

I have a few dozen pictures of an art style I am trying to replicate, but what reg images should I use to train a style in Dreambooth, please?

#

can I use the person dataset?

#

I tried generating 200 images with the prompt "graphic style" and I will see how it goes 🙂

shrewd wedge Oct 31, 2022, 12:13 PM

#

how do I save the values used for training textual inversion?

novel trout Oct 31, 2022, 12:16 PM

#

Hello channel, any reference code to recommend please if I'd like to finetune SD-1-4 on a customer image-text dataset?

#

Am i supposed to resume from SD-1-4 or SD-1-4-ema?

#

Thanks!

next nimbus Oct 31, 2022, 12:56 PM

#

what samplers you guys use after training your model with dream booth?

#

I usually use lms but after training my model with dream booth, the only good sampler that gives a good image is euler a, not sure why.

#

my lms gives a really bad result.

stone garden Oct 31, 2022, 12:58 PM

#

novel trout Am i supposed to resume from SD-1-4 or SD-1-4-ema?

in my experience EMA is better at keeping the model's "world knowledge" and mixing it with your style, e.g. the fine-tuning dataset

tame aurora Oct 31, 2022, 1:49 PM

#

novel trout Hello channel, anything reference code to recommend please if I'd like to finetu...

Do you mean the original Stable Diffusion project? If so I found most helpful an example by LambdaLabs / Justin Pinkney:

https://github.com/LambdaLabsML/examples/tree/main/stable-diffusion-finetuning

GitHub

examples/stable-diffusion-finetuning at main · LambdaLabsML/examples

Deep Learning Examples. Contribute to LambdaLabsML/examples development by creating an account on GitHub.

novel trout Oct 31, 2022, 1:56 PM

#

tame aurora Do you mean the original Stable Diffusion project? If so I found most helpful an...

I see! thanks a lot for the pointer.

novel trout Oct 31, 2022, 2:10 PM

#

stone garden in my experience EMA is better at keeping the model's "world knowledge" and mixi...

Got it! I see the CompVis ema ckpt available here (https://huggingface.co/CompVis/stable-diffusion-v-1-4-original), but I don't see it in the HF diffuser format. Need to convert manually if want to use hf diffuser?

stone garden Oct 31, 2022, 2:13 PM

#

I think you should use the non-original ckpt with Diffusers:
https://huggingface.co/CompVis/stable-diffusion-v1-4/tree/main

CompVis/stable-diffusion-v1-4 at main

#

(original is for the CompVis repository and their scripts that don't use huggingface.diffusers)

novel trout Oct 31, 2022, 2:17 PM

#

I checked this before and it seems no ema weights are included. I might miss sth tho.

stone garden Oct 31, 2022, 2:24 PM

#

ah, sorry then, I haven't used the Diffusers library yet so I'm not familiar with what's available for it

#

btw is finetuning/training even possible with diffusers (the library)?

novel trout Oct 31, 2022, 2:29 PM

#

No worries! I got this (https://github.com/huggingface/diffusers/blob/main/examples/text_to_image/train_text_to_image.py) from their repo. But seems experimental atm.

GitHub

diffusers/train_text_to_image.py at main · huggingface/diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch - diffusers/train_text_to_image.py at main · huggingface/diffusers

#

Don't think they have a full training pipeline supported otherwise.

stone garden Oct 31, 2022, 3:53 PM

#

Yeah, last time I checked I couldn't find anything.
Thanks for sharing! It does seem fresh 🙂

hot breach Oct 31, 2022, 4:15 PM

#

alpine rose Birme completely destroys image quality, you should use other tools

if you set quality to 99 it does not, thus the link

alpine rose Oct 31, 2022, 4:48 PM

#

Ok guess I'm a retard

hot breach Oct 31, 2022, 4:52 PM

#

I guess png support would be nice but I think at 99 you're losing very little

viral jay Oct 31, 2022, 4:54 PM

#

dreambooth is pretty crazy, did a model to do emojis

chrome oxide Oct 31, 2022, 5:17 PM

#

is there a dummy guide to dreambooth for Colab?

#

I am asking, as I am having a hard time getting to understand some of the notebooks, and the performance is not as good as I expected.

fluid folio Oct 31, 2022, 7:05 PM

#

hi, how to train the style of sam yang https://www.instagram.com/samdoesarts/ in the dataset? 🙂

frozen ivy Oct 31, 2022, 7:12 PM

#

Hello everyone! newbie here.
I followed Arki's guide to install InvokeAI. I am now looking for how to train it with custom faces. Is there a guide somewhere by any chance?
Or do I have to use another version than InvokeAi?

light jetty Oct 31, 2022, 7:43 PM

#

chrome oxide is there a dummy guide to dreambooth for Colab?

nerdy rodent does good guides https://youtu.be/VgKDZqAii1I

YouTube

Nerdy Rodent

Create AI Art Using Your Face - Dreambooth Tutorial - Google Colab ...

Want to add things to your AI art but don't have a powerful Nvidia GPU at home? No worries - got you covered with this diffusers version of Dreambooth which can be run for FREE on Google Colab! Works GREAT on a T4 with just 15GB VRAM. No need to install anything - just run straight from your web browser. Even runs on a potato computer ;)

As a b...

▶ Play video

#

might be slightly different, as it was a month ago

#

which in AI time, is about a year

night hound Oct 31, 2022, 9:12 PM

#

viral jay dreambooth is pretty crazy, did a model to do emojis

How many imgs was your training set? What class did you use?

viral jay Oct 31, 2022, 9:25 PM

#

night hound How many imgs was your training set? What class did you use?

24 images, 1000 steps, I didn't use a class at all, I've generated my own images for class

#

a bunch of different images with modelling clay look

#

it produces good results

night hound Oct 31, 2022, 11:20 PM

#

viral jay 24 images, 1000 steps, I didn't used a class at all, I've generated my own image...

What class name did you use? Local GPU? I'm training a toy car, it looks acceptable but I will need to train the wheels separately

manic estuary Nov 1, 2022, 12:06 AM

#

Anyone tried doing (something like) Dreambooth on JUST the text encoder, while freezing the main part of the model?
I'm experimenting with this now but I'm not sure what to expect

novel trout Nov 1, 2022, 8:36 AM

#

viral jay a bunch of different images with modelling clay look

would you mind sharing a pointer to these images you use? Thanks a lot.

manic estuary Nov 1, 2022, 10:37 AM

#

manic estuary Anyone tried doing (something like) Dreambooth on JUST the text encoder, while f...

Following up on this: it worked reasonably well on pictures of my own face, although it didn't replicate my likeness quite as much as normal Dreambooth. I haven't tried textual inversion, but I'd expect that the results are roughly comparable given the similarity between the methods. It's possible that the results would be better if I used a lower learning rate than 5e-6; I've always used 1e-6 or lower for normal Dreambooth.

viral jay Nov 1, 2022, 10:44 AM

#

novel trout would you mind sharing a pointer to these images you use? Thanks a lot.

I'm using a paid set of emojis, but I got the ones with visual style I was looking for, then I created 50 class images on webui using different variations, like 5 images of emoticons with modelling clay looking, then another 5 images of cars with modelling clay looking, so I could add different types of classes to apply the style

#

I've now created a new model, but instead of 1000 steps 1e-6 I'm with 6000 steps 1e-7, it's a bit more free to create different stuff now, still looking for the right spot

#

with second model I can create logos and other stuff applying the same look which is very nice

dawn trellis Nov 1, 2022, 11:27 AM

#

frozen ivy Hello everyone! newbie here. I followed Arki's guide to install InvokeAI. I am n...

You can try the invokeai discord, but the tldr is, it only does textual inversion, not real fine tuning. You’ll have to use a different version.

#

(Difference; it’s quicker to do but the results are significantly worse)

frozen ivy Nov 1, 2022, 12:04 PM

#

dawn trellis You can try the invokeai discord, but the tldr is, it only does textual inversio...

Thank you 😍 I will start with textual inversion and work from there

novel trout Nov 1, 2022, 1:21 PM

#

viral jay I'm using a paid set of emojis, but I got the ones with visual style I was looki...

thanks a lot for the information. absolutely amazing to see these.

viral jay Nov 1, 2022, 1:25 PM

#

I just found that using TI + my emoji model I can then create emojis based on real people, my wife and me for example

abstract widget Nov 1, 2022, 2:13 PM

#

viral jay I just found that using TI + my emoji model I can then create emojis based on re...

Amazing! What do you mean by TI?

viral jay Nov 1, 2022, 2:28 PM

#

It's Textual Inversion, but I said it wrong: I'm actually using Hypernetwork + model. I think TI can also help to direct the image, but HN has given me better results on photos. We can use a custom model like disney or this emoji one with an HN or TI that was trained on another model, even if it's not recommended, as results may vary

alpine rose Nov 1, 2022, 3:51 PM

#

if anyone is interested, i made this script for automatic webui to generate regularization images for a set of training images, to then use with kanewallmann's repo
it's pretty ugly but seems to work
for each training image, it first creates a caption using BLIP, then generates X reg images out of it

#

#

#

can be used with txt2img or img2img, haven't really tested training yet so I can't tell what yields better results

#

#

i should probably expand it to automatize the captioning and renaming of training images as well, now that i think about it

summer oriole Nov 1, 2022, 4:44 PM

#

I am going to attempt to do some hypernetwork training. Do I need to use the bigger 1.5 checkpoint? And once I've done the training, can I use the hypernetwork .pt I create with the smaller 1.5 (emaonly, whatever that means) checkpoint file?

viral jay Nov 1, 2022, 5:09 PM

#

does anyone have examples or a better explanation of what the prior preservation does on dreambooth?

hot breach Nov 1, 2022, 6:24 PM

#

its there to keep knowledge in the model, beyond that you probably want to read the dreambooth paper and it gets math heavy fast

#

if you train without any effort to keep the model intact you'll cause "damage" to the model, things will start to look messed up. You will with dreambooth too but the regularization/prior preservation is there to try to slow that down
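
To make that a bit more concrete, here is a minimal sketch of the idea as the DreamBooth paper describes it: the usual denoising loss on the instance images plus a weighted copy of the same loss on the class (regularization) images. Plain Python lists stand in for tensors, and the names `mse`, `dreambooth_loss` and `prior_weight` are illustrative assumptions, not any repo's actual API:

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences of floats."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def dreambooth_loss(inst_pred, inst_target, prior_pred, prior_target,
                    prior_weight=1.0):
    """Instance loss plus a weighted prior-preservation term."""
    instance_loss = mse(inst_pred, inst_target)   # learn the new subject
    prior_loss = mse(prior_pred, prior_target)    # stay close to what the class looked like before
    return instance_loss + prior_weight * prior_loss


# Perfect fit on the instance image, but the class prediction has drifted:
print(dreambooth_loss([1.0, 2.0], [1.0, 2.0], [0.0, 0.0], [1.0, 1.0],
                      prior_weight=0.5))  # 0.5
```

The second term is what penalizes the model for drifting on the class images while it learns the subject, which is the "slow that down" part.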

neat oxide Nov 1, 2022, 6:27 PM

#

anyone interested in markiplier finetune

#

i made it

#

#

https://drive.google.com/file/d/12YTG_FIwhlEfr0XfqB6y7NDcpMFgkYPl/view?usp=share_link

stone garden Nov 1, 2022, 8:09 PM

#

stone garden Dreambooth prior-preservation loss train text encoder 105 images same images us...

Further development on this. Retrained using a larger more refined dataset (235 images total), still experimenting with the idea of using the same images/prompt for instance and class however this time some images were excluded from the instance set (205 remaining out of 235 total).
These results are 19 steps k_euler_a, 512x704, 7.5 cfg scale, gfpgan1.4 + RealESRGAN_x4plus, they are not cherry picked either, 8/10 results at 19 steps are good
prompt: a photograph of taylor swift, outdoors, shot on iphone 14 instagram 2022
Personally I haven't seen anyone else's results with dreambooth produce such an accurate likeness to the person trained.

Just to note: this is purely for research purposes, I have no intention of releasing these models. I do want to write up my findings along with my thoughts on the implications of models like this which can accurately reproduce a person.
Also if you're wondering about the choice of subject, I chose Taylor Swift for two reasons, mainly because she is a celebrity so everyone knows what she looks like, if I tested this method on myself I couldn't really ask people "does this look like me", then the deciding factor was just that I like the new album and I've been listening to it a lot.

crimson meteor Nov 1, 2022, 11:25 PM

#

hey guys, Python noob here and i'm trying to fine-tune my first custom ckpt model, Kinda like that robo diffusion model, would love it if you can provide me with any links to tutorials or resources to help me get started?

jovial ore Nov 1, 2022, 11:35 PM

#

Having a hard time making a D&D-style Kenku. Any suggestions for artists or any other prompts to add?

hardy storm Nov 1, 2022, 11:51 PM

#

viral jay I just found that using TI + my emoji model I can then create emojis based on re...

This is fantastic. Bravo! I've been trying to do this exact thing. And for your emoji model, you say you used a variety of class images made from "modeling clay"? That's interesting. How did you decide on that? It seems to have worked amazingly well

night hound Nov 2, 2022, 12:12 AM

#

alpine rose if anyone is interested, i made this script for automatic webui to generate regu...

How do you use captions in this case?
does it add captions to reg images as well as training images?
Can KaneWallmans repo be used on colab / runpod?
Wanted to know about captioning but dont understand yet how it works or why is it useful.

night hound Nov 2, 2022, 12:46 AM

#

chrome oxide is there a dummy guide to dreambooth for Colab?

LastBen is the easiest to use colab, maybe you can start there.

night hound Nov 2, 2022, 2:46 AM

#

fair perch hello everyone. I was thinking a way to produce datasets of generated characters...

So you just use the first imageSet as template, add a prompt according to what you need, can you provide any template to try img2img?

chrome oxide Nov 2, 2022, 4:28 AM

#

night hound LastBen is the easiest to use colab, maybe you can start there.

thanks buddy, this repo worked like magic, I want to look at ways to improve the quality of generation of these images, any suggestions? more varied data I guess right?

#

good lighting conditions with many facial expressions etc.

glossy rune Nov 2, 2022, 6:39 AM

#

stone garden Further development on this. Retrained using a larger more refined dataset (235 ...

how much do you think this (same tokens/images for instance and class) comes from the model already knowing about taylor swift as opposed to training it on e.g. yourself or someone the model doesn't yet know (some "sks person/face").

stone garden Nov 2, 2022, 6:53 AM

#

glossy rune how much do you think this (same tokens/images for instance and class) comes fro...

I'll need to do more testing however the same method worked very well for https://huggingface.co/hlky/xynthii-diffusion that's where I first got the idea because the design of the character is so unique that no class prompt would generate images anywhere close to the subject

abstract widget Nov 2, 2022, 8:59 AM

#

viral jay It's Textual Inversion, but I said it wrong: I'm actually using Hypernetwork ...

Thanks for clarifying! ^^

half spoke Nov 2, 2022, 10:37 AM

#

What is a good tool for finetuning with Dreambooth locally? I have a 3090, I've been using n00mkrad's text2image-gui. I am comfortable with the cli.

north stream Nov 2, 2022, 10:58 AM

#

Maybe this https://github.com/smy20011/dreambooth-gui

GitHub

GitHub - smy20011/dreambooth-gui

Contribute to smy20011/dreambooth-gui development by creating an account on GitHub.

#

https://github.com/ShivamShrirao/diffusers/tree/main/examples/dreambooth

GitHub

diffusers/examples/dreambooth at main · ShivamShrirao/diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch - diffusers/examples/dreambooth at main · ShivamShrirao/diffusers

alpine rose Nov 2, 2022, 11:08 AM

#

how do you guys measure model "corruption" when fine tuning, to tell if you are overfitting or not ?

alpine rose Nov 2, 2022, 11:15 AM

#

night hound How do you use captions in this case? does it add captions to reg images as wel...

for now it doesn't deal with training images as i already had them captioned, but yeah i think my script's target should be: give me a folder of training images and i will make a dataset to be used with kane's repo

#

https://miro.com/app/board/uXjVPI19lP8=/
made v3 with the dataset created using my script

fierce oar Nov 2, 2022, 3:13 PM

#

Hi guys, I wonder if any of you know the procedure for finetuning stable diffusion for the inpainting task only (the one that they described in v1.5). There seems to be a config file for inpainting inference, but I am not sure about training and how to run the script for inpainting. Hope that someone can help me out! Thank you in advance!

viral jay Nov 2, 2022, 3:57 PM

#

hardy storm This is fantastic. Bravo! I've been trying to do this exact thing. And for your ...

Yup, I've generated around 50 class images, but with several different types around 5 images per generation the idea was to try to not bias the style to something exactly, but I'm also playing more with it and I found that using no class images also brings good results, I'm kinda lost right now because I did so many tests lol but I will try to get something more "scientific" with proper results later so I can study and share better info about it.

#

guys, what does the dreambooth train_batch_size do exactly? It's a bit confusing because I thought it would train 2 images per iteration, but it seems to take almost the same time as using batch_size 1? Or does the speed stay the same but it does more? Like batch_size 1 = 500 it and batch_size 2 = 1000 it?

fair perch Nov 2, 2022, 4:06 PM

#

night hound So you just use the first imageSet as template, add a prompt according to what y...

yes, just that simple. I get the templates by googling "3d heads" or "face in multiple angles". This for example https://us.v-cdn.net/5021068/uploads/editor/49/c4hsv0sqapit.png

hardy storm Nov 2, 2022, 4:13 PM

#

viral jay Yup, I've generated around 50 class images, but with several different types aro...

Haha. Okay, glad to hear you're as lost as I am. I keep doing the same thing, running my own detailed experiments, thinking that I'm on to something, and then finding it all to be wrong and having to start all over. Either way, great work on your emojis. I'm currently trying to replicate your method

viral jay Nov 2, 2022, 4:14 PM

#

hardy storm Haha. Okay, glad to hear you're as lost as I am. I keep doing the same thing, ru...

one thing that I found important to get proper images with white background, use img2img 😄

#

img2img with a full white image produces an image with white background, use 1.0 for denoise strength

hardy storm Nov 2, 2022, 4:16 PM

#

viral jay img2img with a full white image produces an image with white background, use 1.0 ...

Ooo, good tip. Thank you. Do you mean for your training images or reg images? Or both?

viral jay Nov 2, 2022, 4:16 PM

#

hardy storm Ooo, good tip. Thank you. Do you mean for your training images or reg images? Or...

I mean for actually generating the emoji

hardy storm Nov 2, 2022, 4:17 PM

#

viral jay I mean for actually generating the emoji

Oh, gotcha. Gotcha. Roger that

viral jay Nov 2, 2022, 4:17 PM

#

or use an image like this

hardy storm Nov 2, 2022, 4:18 PM

#

viral jay or use an image like this

Interesting. That makes sense. Great tips! Thank you

viral jay Nov 2, 2022, 4:22 PM

#

it can get some good results, using 0.95 denoise strength, and we can easily change the background color that way, I've used a purple background with noise circle, so it generates the emoji on top of that

viral jay Nov 2, 2022, 4:55 PM

#

So here are probably the best results I achieved for emoji: 50 class images, 22 emoji images with the style I wanted to replicate (I selected fewer images than I had, reducing the number of faces), 1000 steps, LR=1e-6

#

dreambooth instance prompt was "dreamfoil emoji" and this is the result of "dreamfoil emoji, head girl with colorful hair"

#

here's more examples of results

hardy storm Nov 2, 2022, 6:21 PM

#

viral jay dreambooth instance prompt was "dreamfoil emoji" and this is the result of "dream...

shut ... the front ... door. these are awesome! holy crap

manic estuary Nov 2, 2022, 8:05 PM

#

This is kind of vague because I don't have time right now to write more about my experiences with this, but I tried running dreambooth but only optimizing the weights of the attention modules, i.e. 'CrossAttention', 'SpatialTransformer', 'SpatialSelfAttention', and 'LinearAttention', and my first impressions are that it seems to work BETTER than optimizing the entire model (with or without optimizing the cond stage). Better generalizability during inference and harder to overfit.
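
A hedged sketch of what "only optimizing the attention modules" could look like: freeze everything, then re-enable gradients for modules whose class name is in the list above. It assumes `model` behaves like a `torch.nn.Module` (it only relies on `parameters()` and `named_modules()`); `unfreeze_attention_only` is a hypothetical name, and this is not the code actually used for the experiment:

```python
ATTENTION_CLASSES = frozenset({
    "CrossAttention",
    "SpatialTransformer",
    "SpatialSelfAttention",
    "LinearAttention",
})


def unfreeze_attention_only(model, class_names=ATTENTION_CLASSES):
    """Freeze all parameters, then unfreeze those inside attention modules.

    Returns the list of trainable parameters, e.g. to hand to an optimizer.
    """
    for param in model.parameters():
        param.requires_grad = False

    trainable, seen = [], set()
    for _name, module in model.named_modules():
        if type(module).__name__ in class_names:
            for param in module.parameters():
                # Nested attention blocks share parameters with their parent,
                # so skip anything already collected.
                if id(param) not in seen:
                    seen.add(id(param))
                    param.requires_grad = True
                    trainable.append(param)
    return trainable
```

Matching on class names keeps the sketch independent of where those classes are imported from, at the cost of silently matching nothing if the names differ in a given codebase.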

dapper prism Nov 2, 2022, 8:24 PM

#

If anyone is interested, I've uploaded a large selection of regularization image datasets for DreamBooth training. Currently available datasets include "artwork style", "illustration style", "fighter jet", "person", "woman", "train", "supermodel", & "erotic photography": https://huggingface.co/datasets/ProGamerGov/StableDiffusion-v1-5-Regularization-Images

ProGamerGov/StableDiffusion-v1-5-Regularization-Images · Datasets a...

leaden patio Nov 2, 2022, 10:09 PM

#

Has anybody dreambooth'd it with the best midjourney images yet? You'd think so.

leaden patio Nov 2, 2022, 10:12 PM

#

viral jay with second model I can create logos and other stuff applying the same look whic...

Fakebook and copper? Oh I use those!

viral jay Nov 2, 2022, 10:48 PM

#

sphere worlds, 12 images, 1000 steps and no class images, loving dreambooth

hardy storm Nov 3, 2022, 1:39 AM

#

viral jay sphere worlds, 12 images, 1000 steps and no class images, loving dreambooth

dude. what the heck. you're killing it! that's amazing

night hound Nov 3, 2022, 2:02 AM

#

viral jay sphere worlds, 12 images, 1000 steps and no class images, loving dreambooth

Did you use lastBen when training?
Those were your training imgs?

fallen nova Nov 3, 2022, 3:19 AM

#

finetuned on 24 pieces by Yves Tanguy, 2400 steps

#

prob my fav result so far

01550-2950510147-Wooden_structures_painting_by_ytvn_surrealist.png

#

kay sage is next on the list

drowsy adder Nov 3, 2022, 3:56 AM

#

viral jay So here's probably the best results I achieved for emoji, 50 class images, 22 (I...

Woah so good! Will you make the model available somewhere?

vast aurora Nov 3, 2022, 4:55 AM

#

Do people recommend using female / woman as a training class vs person? what are the best practices?

viral jay Nov 3, 2022, 7:42 AM

#

night hound Did you use lastBen when training? Those were your training imgs?

lastBen? not sure what you mean? the training images used are from a Brazillian artist

north stream Nov 3, 2022, 7:44 AM

#

https://github.com/TheLastBen/fast-stable-diffusion

GitHub

GitHub - TheLastBen/fast-stable-diffusion: fast-stable-diffusion, +...

fast-stable-diffusion, +25-50% speed increase + memory efficient + DreamBooth - GitHub - TheLastBen/fast-stable-diffusion: fast-stable-diffusion, +25-50% speed increase + memory efficient + DreamBooth

#

This is what he meant

viral jay Nov 3, 2022, 7:44 AM

#

Ah, then no, I'm using shivam repo

#

running locally on 3080ti / WSL2

novel trout Nov 3, 2022, 8:27 AM

#

dapper prism If anyone is interested, I've uploaded a large selection of regularization image...

Amazing. Can I trouble you to elaborate a bit more on how these images are generated?

woeful goblet Nov 3, 2022, 11:36 AM

#

I'm trying to inpaint and something very odd is happening
https://i.imgur.com/3vnh2em.png

Imgur

#

Some of the black inpainting mask is appearing in the output

#

that dark part is there in all of them, its what i painted and not a generated result

#

why/how can this be happening?

runic hatch Nov 3, 2022, 11:53 AM

#

does anyone have a guide on how to train or further train a model hosting on paperspace

dapper prism Nov 3, 2022, 1:40 PM

#

novel trout Amazing. Can I trouble you elaborating a bit more details how these images are g...

For each class / prompt, I used Automatic's "Generate Forever" feature to create 2-4 thousand images. Every image in a particular dataset uses the exact same settings, with only the seed value being different

dapper prism Nov 3, 2022, 2:36 PM

#

You can use my regularization / class image datasets with https://github.com/ShivamShrirao/diffusers, https://github.com/JoePenna/Dreambooth-Stable-Diffusion, https://github.com/TheLastBen/fast-stable-diffusion, and any other DreamBooth repo with support for prior preservation loss. Normally generating the reg images for prior preservation loss is one of the most time consuming parts of DreamBooth model training, so using my datasets helps speed things up significantly!

#

With an Nvidia A100 40GB graphics card, I was able to produce 50 images every 1 minute and 22 seconds. The speed could potentially be improved if I could get xformers set up properly
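
For a sense of scale, a quick back-of-the-envelope sketch of what that rate means for the 2-4 thousand images per dataset mentioned above. It assumes the 50 images per 82 seconds rate holds steady for the whole run; `generation_minutes` is just a hypothetical helper:

```python
def generation_minutes(num_images: int, batch_images: int = 50,
                       batch_seconds: float = 82.0) -> float:
    """Wall-clock minutes to generate num_images at a fixed batch rate."""
    return num_images / batch_images * batch_seconds / 60


for n in (2000, 4000):
    print(n, round(generation_minutes(n), 1))
# 2000 54.7
# 4000 109.3
```

So roughly one to two hours of A100 time per dataset at that rate.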

oak ether Nov 3, 2022, 3:09 PM

#

a little research on the effect of small variations of prompt : https://openart.ai/@owl-deafening-97/study-small-change-of-prompt-a-duck-da-vinci-greg-rutkowsky-beeple

glossy rune Nov 3, 2022, 3:19 PM

#

has anyone tested the effect of including the updated stabilityai vaes into dreambooth training vs adding them after training?

wintry girder Nov 3, 2022, 5:44 PM

#

Can you use an embedding in the initialisation text for a new embedding?

night hound Nov 3, 2022, 7:05 PM

#

glossy rune has anyone tested the effect of including the updated stabilityai vaes into drea...

Can You add vae after training?

split acorn Nov 3, 2022, 7:07 PM

#

yeah

glossy rune Nov 3, 2022, 7:07 PM

#

night hound Can You add vae after training?

Just replace the model with your dreambooth result

wintry girder Nov 3, 2022, 11:19 PM

#

Wellllll?

glossy jasper Nov 3, 2022, 11:29 PM

#

Anyone with a good config to train Dreambooth on a 3090 on Runpod?

#

I get CUDA out of memory with lots of configs and it seems a bit weird

wintry girder Nov 3, 2022, 11:30 PM

#

Ok, on another topic, I hear that --medvram is crap for textual inversion, but if I disable it I get out of memory errors. I heard from the interwebs that I could edit "v1-finetune_lowmemory.yaml" to make num_workers = 4 (instead of 8), but I don't see that file in a1111. Please help?

stone garden Nov 4, 2022, 3:53 AM

#

glossy jasper Anyone with a good config to train Dreambooth on a 3090 on Runpod?

Which type of machine are you using? ([0] No distributed training, [1] multi-CPU, [2] multi-GPU, [3] TPU [4] MPS): 0
Do you want to run your training on CPU only (even if a GPU is available)? [yes/NO]:
Do you want to use DeepSpeed? [yes/NO]:
Do you wish to use FP16 or BF16 (mixed precision)? [NO/fp16/bf16]: fp16

or

In which compute environment are you running? ([0] This machine, [1] AWS (Amazon SageMaker)): 0
Which type of machine are you using? ([0] No distributed training, [1] multi-CPU, [2] multi-GPU, [3] TPU [4] MPS): 2
How many different machines will you use (use more than 1 for multi-node training)? [1]:
Do you want to use DeepSpeed? [yes/NO]:
Do you want to use FullyShardedDataParallel? [yes/NO]:
How many GPU(s) should be used for distributed training? [1]:2
Do you wish to use FP16 or BF16 (mixed precision)? [NO/fp16/bf16]: fp16

hot breach Nov 4, 2022, 5:45 AM

#

got variable aspect ratio training working, need a touch more work on the code before release but works surprisingly well, posted some tests in #1010579244188958901

#

the real gain is never having to crop or resize images again

glossy rune Nov 4, 2022, 6:10 AM

#

That’s a nice thought 😁 looking forward to seeing how you made it work

wintry girder Nov 4, 2022, 8:38 AM

#

wintry girder Ok, on another topic, I hear that ---medvram is crap for textual inversion, but ...

Anyone know how the hell to run TI properly on a 3060 using a1111?

north stream Nov 4, 2022, 10:18 AM

#

I don't think you can (it needs 8gb of vram)

glossy rune Nov 4, 2022, 10:19 AM

#

there's a 12gb 3060 version, but i have not tried TI yet

north stream Nov 4, 2022, 10:24 AM

#

With 12gb it should work just fine

wintry girder Nov 4, 2022, 1:07 PM

#

I guess i just need to not use the a1111 version. Shame.

gloomy belfry Nov 4, 2022, 1:41 PM

#

hot breach got variable aspect ratio training working, need a touch more work on the code b...

awesome, using Joe's or Diffusers?

#

let me know if you need a tester

hot breach Nov 4, 2022, 2:24 PM

#

gloomy belfry awesome, using Joe's or Diffusers?

its my own fork

gloomy belfry Nov 4, 2022, 2:25 PM

#

hot breach its my own fork

based on who?

hot breach Nov 4, 2022, 2:29 PM

#

I forked from kanewallmann, it's one of the xavier forks using lightning trainer, at this point the code is mostly xavier's original DB implementation and my changes

#

kane was the first to put in the ability to fully caption images so I started from his

old igloo Nov 4, 2022, 2:34 PM

#

I did my first couple of experiments training images with dreambooth. I copied the ckpt files into the models/stable-diffusion directory, and have been able to utilize them, with decent results so far. Probably need to do some retraining with better photos. But my question is this: Do I need to merge the ckpt file into the 1.5 ckpt file for best results?

hot breach Nov 4, 2022, 2:34 PM

#

there's been a LOT of hacking on the code, I think djbielejeski put in some meaningful changes as well

glossy rune Nov 4, 2022, 2:36 PM

#

old igloo I did my first couple of experiments training images with dreambooth. I copied t...

in my experience 1.4 is a bit easier for dreambooth than 1.5. for 1.5 you probably want to work with lower learning rates and fewer training steps (like 1e-6 and 2000 vs 2e-6 and 7000), but that also depends on the number of your training examples.

i have not merged dreambooth ckpts into original ckpts yet but i assume you get best results by proper training and just using the dreambooth ckpt

old igloo Nov 4, 2022, 2:38 PM

#

Thank you for that. I only did 25 images, and the only setting I adjusted was steps from 800 to 1000. Sounds like I could/should try much higher number of steps?

glossy rune Nov 4, 2022, 2:40 PM

#

my first tests were 4/5 images, sd 1.4 and 800 steps and those were pretty decent (photo realistic character). now scaling up from there with much more nuanced expectations...

tame sierra Nov 4, 2022, 3:09 PM

#

If I'm testing different checkpoints of my embeddings (.pt) in the embeddings folder of automatic's gui, do I have to restart the server to pick up the changes?

gloomy belfry Nov 4, 2022, 3:10 PM

#

hot breach I forked from kanewallmann, it's one of the xavier forks using lightning trainer...

cool cool, can I have a link to your repo?

vast crystal Nov 4, 2022, 3:12 PM

#

idk if this is the right server to ask my question but my 1050ti went from generating 1 iteration every 1.5seconds to 1 iteration every 6.5seconds and i changed nothing.

hot breach Nov 4, 2022, 3:12 PM

#

gloomy belfry cool cool, can I have a link to your repo?

https://github.com/victorchall/EveryDream-trainer I need a bit more work on the multi-aspect stuff before pushing but should be ready soon(tm)

gloomy belfry Nov 4, 2022, 3:12 PM

#

nice thx

hot breach Nov 4, 2022, 3:13 PM

#

automatic bulk image captioning script and laion data scraper in my tools repo here: https://github.com/victorchall/EveryDream

granite portal Nov 4, 2022, 4:01 PM

#

I'm trying to train a massive dataset (About 100k images or so) using this notebook https://colab.research.google.com/drive/1vrh_MUSaAMaC5tsLWDxkFILKJ790Z4Bl?usp=sharing&authuser=4#scrollTo=Um6kJUmIlDaC (It's the only one I know that allows image names as prompts) but whenever it tried to generate a sample, I get the error "KeyError: 'sample'". I tried changing stuff on my own based on errors I saw on github but it never worked. Is there a better colab for finetuning a network on longer prompt images?

tacit bronze Nov 4, 2022, 5:57 PM

#

dreambooth'd every king's quest 6 bg, 5000 steps. chef's kiss

crimson wasp Nov 4, 2022, 9:12 PM

#

tacit bronze dreambooth'd every king's quest 6 bg, 5000 steps. chef's kiss

Damn that could be an incredible tool for retro adventure game creation with far bigger maps than was previously possible. You could also put say all the King's Quest or Quest for Glory games in the same style, e.g. QFG2 which never got a higher quality remake like the first game did

old igloo Nov 4, 2022, 9:32 PM

#

glossy rune my first tests were 4/5 images, sd 1.4 and 800 steps and those were pretty decen...

I did a set today with 5000 steps and it seems like it produced lesser quality results than when I used 800 steps. It seems like with the set I did 5000 steps on, the images it produces are too similar to the original images, and nothing else in the prompt has the effect I'd normally expect it to.

#

But I'm very new to dreambooth, so I haven't figured out much yet.

#

When converting the weights to ckpt, what's the purpose of converting to fp16? Does that 50% reduction in size also reduce quality?
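
On the fp16 question: half precision stores each weight in 2 bytes instead of 4, which is where the size reduction comes from, at the cost of rounding each value to roughly three significant decimal digits. Whether that visibly changes generated images isn't settled in this thread. The sketch below only illustrates the storage and rounding effect using Python's standard `struct` module; the weight values are made up.

```python
import struct


def to_fp16_and_back(x):
    """Round-trip a Python float through IEEE 754 half precision."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Made-up example values standing in for model weights.
weights = [0.1234567, -1.5, 0.00031415, 3.0]

# Same values packed as 32-bit floats and as 16-bit floats.
fp32_bytes = len(struct.pack(f"<{len(weights)}f", *weights))
fp16_bytes = len(struct.pack(f"<{len(weights)}e", *weights))
print("fp32 bytes:", fp32_bytes, "fp16 bytes:", fp16_bytes)

# Rounding error introduced by the fp16 round trip.
for w in weights:
    r = to_fp16_and_back(w)
    print(w, "->", r, "relative error:", abs(r - w) / abs(w))
```

Values like -1.5 and 3.0 survive exactly; the others pick up a small relative error, which is the trade-off being asked about.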

tacit bronze Nov 4, 2022, 9:49 PM

#

@crimson wasp I think you gotta be careful, spread across too many games and you start to lose the style for sure

#

best option in my opinion would be either picking a single game's style for each gen just based on the intended scene (ie kq6 only), training together very similar artists (ie kq5, kq6, sq5 or so), or training only within a particular genre (ie "fantasy" with kq5, kq6, conquest of the longbow)

light jetty Nov 4, 2022, 10:48 PM

#

old igloo I did a set today with 5000 steps and it seems like it produced lesser quality r...

depending on what you're using to train, there is usually a way to produce sample images at an interval you set. so you can do a long training, and see it changing every so many steps, e.g. 250

old igloo Nov 5, 2022, 12:12 AM

#

light jetty depending on what youre using to train, there is usually a way to produce sample...

I'm using Dreambooth on Colab

tacit bronze Nov 5, 2022, 3:19 AM

#

merged can comprehend certain things better that are in the other games that arent in just kq6, though it tends to smooth out finer details and has more perspective (when the flatter scenes are likely more preferable for a 90s style graphic adventure game)

noble cairn Nov 5, 2022, 4:31 AM

#

Hello,
Anyone know how can I proceed to generate images like this?

#

tacit bronze Nov 5, 2022, 4:56 AM

#

hmm, after trying kq6+laurabow2 training, I gotta say that similar detail doesn't work quite as well, worse than similar genres (the fantasy merged one above).

I think composition and nostalgiawise, the best approach is a per-game 5000 step model

orchid imp Nov 5, 2022, 7:46 AM

#

Could someone point me in the right direction to understand how to train dreambooth with a 'style' rather than a 'model'?

regal harbor Nov 5, 2022, 8:04 AM

#

anything trained on faces? Especially diverse faces (not all good looking. maybe some ugly. different ages)

viral jay Nov 5, 2022, 10:06 AM

#

orchid imp Could someone point me in the right direction to understand how to train dreambo...

from my understanding to train a style you need different objects sharing a similar feature?

viral jay Nov 5, 2022, 12:44 PM

#

Guys, has anyone tried to train normal map images to see if we can get it as a style?

little parcel Nov 5, 2022, 2:09 PM

#

Hello

#

I have 20 images!

#

What are the correct settings in your opinion?

slow badger Nov 5, 2022, 2:41 PM

#

Dumb question but do I have to use the full-ema model to generate images if I have trained an embedding on it or can use the same model's pruned version and get the same results?

hot breach Nov 5, 2022, 3:11 PM

#

if you're just training an embedding I don't know if it will matter much

#

if you're unfreezing the model the "right" thing to do is train on the full file with both ema and nonema weights in it, and only prune to a 2GB ema only file when you're "done"

#

the lightning trainer will use nonema weights to train when present and fall back to ema weights if it can't find the nonema weights

#

again if you're just doing an embedding/TI I don't think it will matter a whole lot

slow badger Nov 5, 2022, 4:02 PM

#

Alright, I'm just creating embeddings, I don't have enough VRAM for the rest, thanks!

#

Another question: what's the difference between those VAE?
stabilityai/sd-vae-ft-mse
stabilityai/sd-vae-ft-ema
stabilityai/sd-vae-ft-ema-original
I can see the diff on the pictures here https://huggingface.co/stabilityai/sd-vae-ft-ema-original#visual but I don't understand how I should choose one above the others.

stabilityai/sd-vae-ft-ema-original · Hugging Face

silent spear Nov 5, 2022, 8:42 PM

#

Does anyone have a publicly-accessible trained model I can use as a test? Nothing sinister, I promise 🙂

glossy rune Nov 5, 2022, 9:41 PM

#

slow badger Another question: what's the difference between those VAE? stabilityai/sd-vae-ft...

I’d say the difference is close to negligible. I’ve played a lot with the -mse which supposedly gives smoother output. But depending on what you try to achieve, this can be very low on your priorities. I mostly go with the default original at the moment and don’t bother. I might look at the -mse when it’s just about a last bit of polish.

limber peak Nov 5, 2022, 9:57 PM

#

What is best dreambooth repo for my rtx 3080 10gb?

#

I see many of the top repo ask me to give them like 24GB or more

subtle moth Nov 6, 2022, 12:52 AM

#

So I'm trying to do fine-tuning on SD 1.5 with a large dataset (900k images). I have the training running on a 5-GPU A100 box with 90 CPUs and 470 GB RAM.

For some reason anytime I run training with multiple GPUs it runs slower than just a single GPU. I've been trying to figure this out for hours now but can't explain that yet.

Does anyone have a guess as to what I'm missing? Or an example of training on multi-GPU computers?

slow badger Nov 6, 2022, 1:04 AM

#

glossy rune I‘d say the difference is close to negligible. I‘ve played a lot with the -mse w...

Thank you, I'll use original for now, I didn't want to spend time comparing them right now.

wooden shuttle Nov 6, 2022, 2:18 AM

#

orchid imp Could someone point me in the right direction to understand how to train dreambo...

This video has the info you are looking for. Also @dapper prism put together some regularization images, including a "style" that you can use for your training.

https://youtu.be/7bVZDeGPv6I

YouTube

Nerdy Rodent

Super-Fast, Multi-Concept, Local Dreambooth for Stable Diffusion

Want to run Dreambooth for Stable Diffusion locally so you can train multiple concepts at once really quickly? Not a problem! Runs on Google Colab as well, so you don't actually need a modern computer to train.

Works on Microsoft Windows (partly), but for the lowest VRAM usage you'll need to use Linux (as with most AI stuff). Also remember to c...

#

https://huggingface.co/datasets/ProGamerGov/StableDiffusion-v1-5-Regularization-Images

ProGamerGov/StableDiffusion-v1-5-Regularization-Images · Datasets a...

orchid imp Nov 6, 2022, 2:43 AM

#

wooden shuttle This video has the info you are looking for. Also @dapper prism put toge...

Thanks!

dapper prism Nov 6, 2022, 3:05 AM

#

glossy rune I‘d say the difference is close to negligible. I‘ve played a lot with the -mse w...

I'd go with the MSE one as it's been trained the longest, and seems to perform the best according to the general consensus

#

Then if that doesn't work, I'd start looking at the other options

tacit bronze Nov 6, 2022, 5:48 AM

#

best jojo results so far, 900 step training on part 4 character portrait renders, cropped at the top square

chrome oxide Nov 6, 2022, 11:30 AM

#

are there any other image upscaling models? I tried to use ESRGAN to upscale the images generated by SD, the results are good, was easily able to get a 4x resolution bump, but I am seeing more artifacts in the images compared to the original outputs from SD. Any suggestions?

north stream Nov 6, 2022, 11:35 AM

#

SwinIR

opaque scroll Nov 6, 2022, 3:12 PM

#

tame sierra If I'm testing different checkpoints of my embeddings (.pt) in the embeddings fo...

Sorry to ping but I have been trying to use embeddings for a week now with no success.
I have downloaded them from huggingface but I don't feel they do anything and if I up the importance above 1.6 the generated image is chaos.
the files are about 40kb. Super small.

hot breach Nov 7, 2022, 4:12 AM

#

training on any aspect ratio/image size now available on 24GB here: https://github.com/victorchall/everydream-trainer

GitHub

GitHub - victorchall/EveryDream-trainer: General fine tuning for St...

General fine tuning for Stable Diffusion. Contribute to victorchall/EveryDream-trainer development by creating an account on GitHub.

#

no cropping or resizing required at all

#

going to work on updating some stuff so I can bring vram requirements down, then probably work on notebooks

#

#

example training set directly fed into trainer

swift terrace Nov 7, 2022, 4:44 AM

#

oh wow.

#

thank you, i'll give this a shot this week

gray phoenix Nov 7, 2022, 8:19 AM

#

I have a few questions about Dreambooth

So, I wanna give it a try training a model, but I have a question: do you train the model to the style that you want your outputs to follow (let’s say for example anime style), or do you train it using a subject that you want your outputs to be like (let’s say for example an actor)? Or you can do both? And if so, how do you train your model for each purpose?

half spoke Nov 7, 2022, 8:32 AM

#

gray phoenix I have a few questions about Dreambooth So, I wanna give it a try training a mo...

Both. You can do styles with Dreambooth, but its strength is definitely in subjects. I've heard hypernetworks are best for styles but maybe someone with more experience can add their input on that. If you're training an anime-ish subject then use a NAI/WaifuDiffusion model.

gilded crater Nov 7, 2022, 9:20 AM

#

UPDATED MY HUGGINGFACE PAGE WITH NEW INFO, DESCRIPTIONS, INVITE LINKS, FASTER INTERNET, BETTER FUEL ECONOMY, HIGH FPS, AND BIGGER TITS/COCKS/APPENDAGES:
https://huggingface.co/ShinCore/MMDv1-18

ShinCore/MMDv1-18 · Hugging Face

alpine rose Nov 7, 2022, 11:16 AM

#

gilded crater UPDATED MY HUGGINGFACE PAGE WITH NEW INFO, DESCRIPTIONS, INVITE LINKS, FASTER IN...

you should take a break you seem exhausted

tall badger Nov 7, 2022, 12:51 PM

#

edgy raptor Made a script for making 512x512 dreambooth images, crops to a 1:1 image first a...

Great script ! do you mind if I integrate it in the colab ?

tame aurora Nov 7, 2022, 3:30 PM

#

subtle moth So I'm trying to do fine-tuning on SD 1.5 with a large dataset (900k image). I ...

Are you using the original CompVis repo?
FYI I did some experiments with it with 2-4x A100 and noticed a proper speed-up.
I also drew some inspiration from https://github.com/LambdaLabsML/examples/tree/main/stable-diffusion-finetuning but I noticed they modified the source code... which I didn't for my experiments. I just compared my .yml config with their pokemon.yaml

tame aurora Nov 7, 2022, 3:32 PM

#

hot breach the lightning trainer will use nonema weights to train when present and fall bac...

do you have an example at hand that gives more details about non-ema vs ema in general? Is EMA "simply" the optimizer's parameters? If so, what's the reason to choose the variant without by default?

hot breach Nov 7, 2022, 3:33 PM

#

ema and nonema are standard ML terminology, lots of info on the web, but short version is EMA is intended for inference and is a way to keep the model from being biased to the most recent training samples it was trained on
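
As a rough illustration of that short version: an exponential moving average keeps a slowly updated "shadow" copy of each weight, so a sudden change from the latest batches only moves the EMA copy a little. This is a toy sketch with a single scalar weight; the decay of 0.9 is chosen only to make the effect visible (real trainers typically use values much closer to 1), and the function names are made up.

```python
def ema_update(ema_value, new_value, decay):
    """One EMA step: keep `decay` of the old average, blend in the rest."""
    return decay * ema_value + (1.0 - decay) * new_value


def track(weight_history, decay=0.9):
    """Return (latest raw weight, EMA of the weight) after a run of updates."""
    raw = weight_history[0]
    ema = weight_history[0]
    for value in weight_history[1:]:
        raw = value                      # non-EMA: simply the latest value
        ema = ema_update(ema, raw, decay)
    return raw, ema


# The weight sits at 0.0 for a while, then the last update jumps it to 1.0.
history = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]
raw, ema = track(history)
print("non-EMA weight:", raw)
print("EMA weight:    ", ema)
```

The raw weight follows the last update completely, while the EMA weight has only moved a tenth of the way, which is the "less biased to the most recent samples" behaviour described above.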

#

for fine tuning, it is generally preferred to use non-ema weights because the ema is going to bias the starting point of your fine tuning

#

I guess in practice with SD using EMA weights for fine tuning doesn't seem like some huge critical failure, but it's the general suggested practice

tame aurora Nov 7, 2022, 3:36 PM

#

but isn't ema supposed to keep more of the model's original "knowledge"? In that case it's actually good to "keep the model from being biased to the most recent training samples"
(I'm trying to understand it and make up my mind what's best in my case)

#

I'm not sure what you mean by "bias the starting point of your fine tuning". Maybe I should read more indeed 🙂

hot breach Nov 7, 2022, 3:38 PM

#

biases the start of fine tuning because when you fine tune it is creating nonema weights anyway, ema weights are a byproduct

#

the trainers will (or should) use nonema weights if present in the ckpt

#

if nonema weights are not present, it will copy paste the ema weights to nonema weights, then start training and be training on nonema weights, only producing ema weights as a byproduct
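
The fallback described here can be sketched roughly as follows. This is not the actual trainer code: the checkpoint is modelled as a plain dict with hypothetical "nonema" and "ema" entries, whereas real ckpt files use a different key layout.

```python
import copy


def init_trainable_weights(checkpoint):
    """Pick the weights training starts from.

    `checkpoint` is a hypothetical dict that may hold a "nonema" entry,
    an "ema" entry, or both. Prefer non-EMA weights; if they are missing,
    start from a copy of the EMA weights instead.
    """
    if checkpoint.get("nonema") is not None:
        return copy.deepcopy(checkpoint["nonema"]), "nonema"
    return copy.deepcopy(checkpoint["ema"]), "copied from ema"


# A "full" file with both sets, and a pruned EMA-only file (values made up).
full_ckpt = {"nonema": {"w": 0.52}, "ema": {"w": 0.50}}
pruned_ckpt = {"ema": {"w": 0.50}}

for name, ckpt in (("full", full_ckpt), ("pruned", pruned_ckpt)):
    weights, source = init_trainable_weights(ckpt)
    print(name, "->", weights, "(", source, ")")
```

Either way, the weights being optimized from that point on are the non-EMA set, and a fresh EMA copy is accumulated alongside them as training proceeds.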

#

it will probably make more sense if you do a bit of reading on ema

tame aurora Nov 7, 2022, 3:40 PM

#

Definitely! I'd do that and thank you for the effort

hot breach Nov 7, 2022, 3:41 PM

#

again, in practice, I'm not seeing that using normal 4gb/2gb files as a starting point screws stuff up a bunch, maybe its not a big deal for stable diffusion, but its just a best practice type thing

tame aurora Nov 7, 2022, 3:41 PM

#

btw my assumptions were based on the Catastrophic forgetting? section in this article. (Since I used it for ideas for most of my experiments)
https://www.justinpinkney.com/pokemon-generator/

Text to Pokemon Generator

Fine tuning Stable Diffusion to generate Pokemon

hot breach Nov 7, 2022, 3:43 PM

#

there are a lot of weird things people are doing that fall out of best practice, so just be careful of what the masses parrot around, I imagine SD is most people's first foray into training machine learning stuff and they usually are sort of "going through the motions" after watching a couple youtube videos, and the popular creators are often just as ignorant

tame aurora Nov 7, 2022, 3:44 PM

#

I agree, I'm trying to filter those out as well

hot breach Nov 7, 2022, 3:44 PM

#

I've had a lot of "wtf" moments watching what people do or tell others to do, or when they state opinions on what is best, etc

#

and back to ema/nonema, I'd like to A/B test that and compare, it's just an issue of finding the time to do it, but since I did most of my early stuff just off the 4gb ema-only 1.4 and it wasn't a disaster it's just been low priority

#

SD may be more resilient to ema vs nonema due to its architecture vs other ML models? just guessing, or the LR people are using for fine tuning is low enough the recency bias is not that large

tame aurora Nov 7, 2022, 3:54 PM

#

btw I had the chance to try non-ema before switching to with-ema and it seemed like the model was quicker to "forget" its general knowledge and start producing stuff similar to my training data (a very small dataset)
with EMA it takes noticeably more epochs before it shifts that way (without changing other hyperparameters)
so it's like striking a balance between what you need - more original model or more "your" model. And I believe there are multiple ways to strike this balance, considering other hyperparameters like learning rate

subtle moth Nov 7, 2022, 3:54 PM

#

tame aurora Are you using the original CompVis repo? FYI I did some experiments with it with...

Yeah I forked Justin Pinkney's repo. I suspect my issue is actually that I thought the progress bar was an aggregate of all GPUs but it might just be for a single GPU. How did you measure the speedup?

tame aurora Nov 7, 2022, 3:55 PM

#

there's a progress bar (tqdm?) for each epoch and it reports the time for that epoch.. I just noticed those times are shorter when on 4 gpus. Today I also noticed there's an "Average epoch time" in the logs as well so you could pay attention to it as well, I guess

subtle moth Nov 7, 2022, 3:56 PM

#

tame aurora there's a progress bar (tqdm?) for each epoch and it reports the time for that e...

makes sense. I haven't got to one epoch yet since that would take a long time with 900k images and I want to solve the performance issue before I do

tame aurora Nov 7, 2022, 3:57 PM

#

why not make a subset of 900 images and play with it to streamline the process first? 🙂

subtle moth Nov 7, 2022, 3:58 PM

#

ha yeah, good obvious idea that I hadn't thought of. to be fair, I didn't expect the metrics to be like this

#

I'm also planning to add some better multi-gpu metrics

tame aurora Nov 7, 2022, 3:59 PM

#

what kind of metrics do you mean?

subtle moth Nov 7, 2022, 4:03 PM

#

tame aurora what kind of metrics do you mean?

samples per second, # iterations per second, # batches per second.

it's my suspicion that the progress bar stats, other than epoch, are only coming from the first GPU

tame aurora Nov 7, 2022, 4:12 PM

#

actually, you may be right - I remember something like the # of steps in an epoch being less when I switched from 1 gpu to multiple
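
That observation fits how data-parallel training usually splits the work. A minimal sketch, assuming each GPU gets an equal shard of the dataset and processes one batch per step, and that the progress bar counts steps for a single process; the batch size of 4 and the it/s figures are made-up illustration values.

```python
import math


def steps_per_epoch(num_samples, batch_size, num_gpus):
    """Steps each process runs per epoch when the data is sharded evenly."""
    per_gpu_samples = math.ceil(num_samples / num_gpus)
    return math.ceil(per_gpu_samples / batch_size)


def samples_per_second(iterations_per_second, batch_size, num_gpus):
    """Aggregate throughput when every GPU processes its own batch per step."""
    return iterations_per_second * batch_size * num_gpus


# The 900-image subset suggested above, with an assumed batch size of 4.
for gpus in (1, 5):
    print(gpus, "GPU(s):", steps_per_epoch(900, 4, gpus), "steps per epoch")

# A per-process bar showing fewer it/s can still mean higher total throughput.
print("1 GPU  @ 1.0 it/s:", samples_per_second(1.0, 4, 1), "samples/s")
print("5 GPUs @ 0.8 it/s:", samples_per_second(0.8, 4, 5), "samples/s")
```

So a multi-GPU run can look slower per iteration on the bar while finishing each epoch in far fewer steps; comparing samples per second or epoch time is the fairer measurement.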

old igloo Nov 7, 2022, 5:45 PM

#

I'm new to Dreambooth but so far I've trained 5 or 6 models with photos of myself and family members. I've noticed that with some of the models, SD struggles to produce images that are significantly different from the input images. I am sure I've made some missteps in the training process, but I'm not sure where to start in terms of correcting those mistakes. For each set, I used 30 images, cropped to 512x512, with training steps of 3000, and 3e-6 learning rate (also did some with 1e-6). I used generic "man" and "woman" class names. I am able to get outputs that look like the people I trained in the models, but can't seem to get it to change their appearance much, such as with a prompt like "XYZ man as superman". Any advice on what I can adjust to correct this?

old igloo Nov 7, 2022, 5:54 PM

#

chrome oxide are there any other image up scaling models? I tried to use ESRGAN to upscale th...

SwinIR and SD upscale is my preferred way to upscale while also adding in additional details. It takes a lot of experimentation to find the right combination of settings to produce the best results, but once you home in on those settings, it can produce awesome results.

old igloo Nov 7, 2022, 6:12 PM

#

So, it seems like with a model I trained with Dreambooth and converted into a ckpt file for use with Automatic1111, I need to use a much lower CFG scale than I normally use in order for it to honor the prompt and not just give me recycled versions of the images I trained on. Is that related to the number of images I trained on, the number of epochs, and/or the learning rate?

hot breach Nov 7, 2022, 6:42 PM

#

struggling to generate something other than training images and having to lower cfg scale is a sign of overtraining

#

adding more images to train and/or decreasing steps might help

#

ideally, you are getting multiple checkpoints when you train at different step intervals and you can test several out and pick the best one, if you only get one ckpt then you're kinda stuck starting over to try fewer steps

old igloo Nov 7, 2022, 7:55 PM

#

Is 30 images not enough? Is 3000 steps too many for 30 images? I'm using the Dreambooth colab notebook, I don't know if I'm getting multiple checkpoints. My save interval was 4000 when I ran 3000 steps. Does that have anything to do with it?

stone garden Nov 7, 2022, 8:02 PM

#

your save interval should be lower than your steps. when you have 3000 steps and a save interval of 500 it creates a checkpoint every 500 steps. You would end up with 6 checkpoints (500, 1000, 1500, 2000, 2500 and 3000)
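
The arithmetic can be written out as a small sketch. It assumes a checkpoint is written at every multiple of the save interval up to the total step count; whether a final checkpoint is also saved regardless of the interval depends on the trainer, so that is left out here. The function name is made up.

```python
def checkpoint_steps(total_steps, save_interval):
    """Steps at which an interval-based checkpoint would be written."""
    return list(range(save_interval, total_steps + 1, save_interval))


print(checkpoint_steps(3000, 500))   # the example above
print(checkpoint_steps(3000, 4000))  # interval larger than the run
```

With 3000 steps and an interval of 4000, the list is empty: no intermediate checkpoints are produced, which leaves nothing earlier to fall back on if the final result is overtrained.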

old igloo Nov 7, 2022, 8:59 PM

#

Ok, that makes sense. And is that a recommended save interval for 3000 steps?

sharp solstice Nov 7, 2022, 9:17 PM

#

is there a comparison doing embedding vs hypernetwork in automatic1111? i understand the differences, but i don't think i've seen any comparisons online

#

i'm trying to train an embedding right now and so far I'm getting much better results than hypernetwork

stone garden Nov 7, 2022, 10:21 PM

#

old igloo Ok, that makes sense. And is that a recommended save interval for 3000 steps?

that's up to you how many check points you want. if you want 10 check points you go for 300 interval steps, if the end result is overtrained you can always fall back on an earlier check point.

old igloo Nov 7, 2022, 10:34 PM

#

I see. So the purpose of the checkpoints is to give me the option of choosing which checkpoint produces the best results?

tired wind Nov 7, 2022, 11:15 PM

#

yes, since overtraining is a common problem

hot breach Nov 7, 2022, 11:19 PM

#

yes there's a chronic problem because of how dreambooth has been popularized

#

people just try to guess how many steps they need and only generate one ckpt file, if you overtrained you're screwed and have to start over

#

if you use an online service you'll need more volume storage to store the files as you train, but its well worth the small extra cost for the volume storage so you don't have to keep renting the instance again to start over

half spoke Nov 8, 2022, 12:18 AM

#

sharp solstice i'm trying to train an embedding right now and so far I'm getting much better re...

My understanding is that hypernetworks are better for style, but I haven't experimented enough to know for sure

frozen bobcat Nov 8, 2022, 12:18 AM

#

I've had very good results training a custom character.
But is there a best practice for including certain poses and angles among the images used for training?

half spoke Nov 8, 2022, 12:19 AM

#

frozen bobcat I've had very good results training a custom character. But is there a best prac...

for Dreambooth?

half spoke Nov 8, 2022, 12:27 AM

#

frozen bobcat I've had very good results training a custom character. But is there a best prac...

in any case, with Dreambooth at least, I've noticed that when I added 4 images that have the character sitting to the dataset of 16, so 4/20, that I started getting more results of the character in that position. If you want a certain pose, have more of it in the data set. If you want more diversity include more poses/angles.

#

With embeddings/hypernetworks, I imagine you could just note the pose in the template [textfiles]

frozen bobcat Nov 8, 2022, 12:50 AM

#

half spoke for Dreambooth?

yes!

sharp solstice Nov 8, 2022, 12:58 AM

#

half spoke My understanding is that hypernetworks are better for style, but I haven't exper...

Yeah I'm getting the hang of it i think. The comparison you posted is also very useful.

So embedding is just like training a word to become like a very specific prompt. So for example if you're using a model trained on real people only, you wouldn't be able to train an embed to fit an anime character

Hypernetwork seems more like a continuation of the checkpoint, where the image data is stored onto the network given the prompts you use, which is closer to (or the same as?) how real training is done

sharp solstice Nov 8, 2022, 1:00 AM

#

half spoke My understanding is that hypernetworks are better for style, but I haven't exper...

how did you do the prompts for the hypernetwork in this case? did you write something like "an anime girl with grey hair and cat ears" or did you just write "an anime girl" ? or maybe the name of the character or something

fallen nova Nov 8, 2022, 2:15 AM

#

fallen nova kay sage is next on the list

upscaled results

#

#

#

normal 512 dimensioned stuff

01650-1091581809-_The_Following_Morning__by_kysg_Kay_Sage_oil_on_canvas_1945.png

#

01667-1246929828-A_red_cloth_draped_over_an_egg-shapedsphere_rolling_up_a_curved_staircase_on_the_edge_of_a_cliff_overlooking_a_barren_flatland_l.png

thorny sapphire Nov 8, 2022, 2:56 AM

#

Does anyone have any good tutorials for getting textual inversion/hypernetworks working? I tried myself with some that I downloaded off the Hugging face repository and I cannot seem to get them working. They always throw an error about things not being in the right memory space or something whenever I hit the train button.

#

I am wanting to train for my wife, step daughter, and pets so I can produce some art of them, but I can't seem to figure out what is going on.

#

Seems like a pytorch issue.

half spoke Nov 8, 2022, 7:15 AM

#

sharp solstice Yeah I'm getting the hang of it i think. The comparison you posted is also very ...

if you're doing anything anime related, you're going to want to use an anime model. No matter the training/tuning you do, the other data in the model will still affect the results. As for what you would do for prompts for a hypernetwork, I'm no expert. I'd just use the hypernetwork.txt file under the textual_inversion_templates folder from web ui. Yes, embeddings affect the results of a given prompt, while a hypernetwork can be loaded and have its effects increased/decreased. They're both detached from the actual model file, unlike dreambooth which will create a new checkpoint file

iron tundra Nov 8, 2022, 11:25 AM

#

half spoke My understanding is that hypernetworks are better for style, but I haven't exper...

I wish we could've seen a comparison using an original character (not in NAI)

tired wind Nov 8, 2022, 1:45 PM

#

sharp solstice Yeah I'm getting the hang of it i think. The comparison you posted is also very ...

You should be able to do textual inversion on an anime character. Then you could combine it with a hypernetwork of a style. I don't know about dreambooth. One important thing is you can generate an image with multiple textual inversion embeddings, so you can say [person1] and [person2] in X, whereas the hypernetwork is 1 thing right now.

I've been running tests of [person1] and [person2] in [style] with and without using a hypernetwork in addition to that. I don't have any conclusions yet other than it's probably worth training the same thing as textual inversion (do this first), then hypernetwork. Also I think hypernetworks should have more training samples, whereas with textual inversion embeddings you may get better results on a low number.

tired wind Nov 8, 2022, 1:50 PM

#

thorny sapphire Does anyone have any good tutorials for getting textual inversion/hypernetworks ...

The AUTOMATIC1111 webui should work out of the box. I was expecting it to be really complicated and it ran with zero problems on a Windows machine with a 3090ti. Running on Google Colab I had lots of problems. If you are getting memory errors your video card may not have enough VRAM. For textual inversion follow this https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Textual-Inversion (simplest, only 4-5 images should be enough). For hypernetworks follow this for settings - https://rentry.org/hypernetwork4dumdums

GitHub

Textual Inversion · AUTOMATIC1111/stable-diffusion-webui Wiki

Stable Diffusion web UI. Contribute to AUTOMATIC1111/stable-diffusion-webui development by creating an account on GitHub.

Hypernetwork training for dummies

Based on my knowledge on 10/31/2022, by ixynetworkanon
I'll comment on anything with parenthesis in italics for things I don't know and need confirmation on.
This guide assumes you have decent knowledge of webui and stablediffusion, RTFM.
PREWORD
What is a hypernetwork????
Hell if I know, ask an...

half spoke Nov 8, 2022, 5:11 PM

#

iron tundra I wish we could've seen a comparison using an original character (not in NAI)

not sure I understand, the character is right there on the bottom left. Or do you mean literally an Original Character that doesn't exist in the default model? I've had success training both characters whose prompt would otherwise only give really bad results that barely resemble the character, and characters that don't seem to be in the model at all.

iron tundra Nov 8, 2022, 5:21 PM

#

half spoke not sure I understand, the character is right there on the bottom left. Or do yo...

yes that's what I meant, a brand new character. like a one-eyed horned man with red hair or something

half spoke Nov 8, 2022, 5:46 PM

#

iron tundra yes that's what I meant, a brand new character. like a one-eyed horned man with ...

ah, I don't see any charts for comparison but it should be doable. Especially with Dreambooth, I've seen some complicated and unorthodox characters finetuned/trained on it. For example https://huggingface.co/hlky/xynthii-diffusion

hlky/xynthii-diffusion · Hugging Face

reef junco Nov 8, 2022, 8:27 PM

#

Trying to run hypernetwork training locally, getting a ValueError: the following 'model_kwargs' are not used by the model: ['encoder_hidden_states' , 'encoder_attention_mask']. Any idea what is causing this error? Can't find too much info about it online right now

#

this is during the preprocess step, seems to be an issue with BLIP as some extra info, PP runs fine if i turn off blip captioning

#

Can i just use an external clip interrogator and copy the prompts over for each image instead? Clip interrogator seems broken with the same error as well

glossy rune Nov 8, 2022, 8:57 PM

#

blip/clip-interrogator seem to depend on transformers==4.15.0 while auto1111 will bump transformers to something more recent (4.24.0 is current)

reef junco Nov 8, 2022, 8:59 PM

#

So in ubuntu, would i just remove the current transformers package and reinstall the older one?

glossy rune Nov 8, 2022, 9:00 PM

#

pip install transformers==4.15.0
then run clip-interrogator
when done run pip install -U transformers

reef junco Nov 8, 2022, 9:00 PM

#

I will try this out in a few minutes and report back if i have any more issues, thank you very much 🙂

#

I ran the hypernetwork training with no captions just to see what would happen while i waited so imma give it the last 10 minutes it needs to finish, this will at least be a good test as well of seeing what no captions does.

#

also just as a side question, does it need to have the same name on the photos after the captions as the hypernetwork name? like if i named my hypernetwork CBReal should all the photos have CBReal1..2...3 etc before i preprocess?

glossy rune Nov 8, 2022, 9:03 PM

#

never worked with hypernetworks, sorry

reef junco Nov 8, 2022, 9:03 PM

#

No worries! i appreciate your help at any rate, im starting with hypernetworks cuz the local dreambooth training is a bit above my head at the current moment

stone garden Nov 8, 2022, 9:13 PM

#

glossy rune blip/clip-interrogator seem to depend on transformers==4.15.0 while auto1111 wil...

blip works up to 4.21.3 afaik
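
A minimal sketch of checking a transformers version against the bound mentioned here. The 4.21.3 limit is only the anecdotal figure from this chat, not something taken from BLIP's documentation, and `is_blip_compatible` is a hypothetical helper name. The parsing assumes a plain `X.Y.Z` version string (no `rc` or `dev` suffix inside the first three parts).

```python
# Upper bound reported in this chat ("blip works up to 4.21.3 afaik").
# Treat it as an assumption, not an official compatibility limit.
MAX_BLIP_TRANSFORMERS = (4, 21, 3)


def parse_version(version: str) -> tuple:
    """Turn a plain 'X.Y.Z' string into a tuple of ints so versions compare numerically."""
    return tuple(int(part) for part in version.split(".")[:3])


def is_blip_compatible(version: str) -> bool:
    """Hypothetical helper: True if the version is at or below the reported bound."""
    return parse_version(version) <= MAX_BLIP_TRANSFORMERS


# The versions named in the conversation:
for candidate in ("4.15.0", "4.21.3", "4.24.0"):
    print(candidate, is_blip_compatible(candidate))
```

To check the version actually installed in the webui environment, the string returned by `importlib.metadata.version("transformers")` could be passed to the same helper.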

glossy rune Nov 8, 2022, 9:14 PM

#

ah nice, thanks

reef junco Nov 8, 2022, 9:15 PM

#

👌 Noted,

#

well fuck me, it's still erroring out, clip interrogate is showing TypeError: unsupported operand type(s) for +=: 'NoneType' and 'str'

#

Blip is still the same error as above

reef junco Nov 8, 2022, 9:51 PM

#

Yeah no luck with downgrading transformers, anyone got any other ideas, clip interrogate was working like a week ago idk what would make it stop

#

found the requirements file from blip seems a bunch of stuff was different version, gunna correct the versions it needs and see what happems

#

also fuck me i cant spell today

reef junco Nov 8, 2022, 10:25 PM

#

sadge, still not working

half spoke Nov 8, 2022, 10:49 PM

#

okay, noob/dumb question time. Using d8ahazard's extension would I use my folder with the regularization images for the Classification dataset directory?

#

and for "instance prompt" if I'm training a character on NAI would "masterpiece, best quality, artwork of 'X'" be ideal? In any case how does it know 'X' is the prompt I want? or would I be stuck with using "artwork of 'X'" for the prompt?

frozen bobcat Nov 9, 2022, 2:06 AM

#

Is it possible to feed images of a 3D created city block and have SD train on that location? Anyone tried it yet?

rustic lava Nov 9, 2022, 3:59 PM

#

Has anyone updated https://rentry.org/hypernetwork4dumdums or something similar to show the workflow with the new plugin for AUTOMATIC1111's repo?

Hypernetwork training for dummies

Based on my knowledge on 11/8/2022, by ixynetworkanon
I'll comment on anything with parenthesis in italics for things I don't know and need confirmation on.
This guide assumes you have decent knowledge of webui and stablediffusion, RTFM.
PREWORD
What is a hypernetwork????
Hell if I know, ask an a...

rustic lava Nov 9, 2022, 5:09 PM

#

nm I just found this: https://www.reddit.com/r/StableDiffusion/comments/yqm9xv/how_to_use_dreambooth_in_automatic1111_in_10/

r/StableDiffusion - [HOW TO] Use DreamBooth in Automatic1111 in 10 ...

old igloo Nov 9, 2022, 6:14 PM

#

stone garden that's up to you how many check points you want. if you want 10 check points you...

Ok, so I took your advice, and I retrained my model with 30 images up to 3000 steps with checkpoints every 500 steps. I'm finding that even on the 500 step and 1000 step checkpoints, my model is overtrained. Testing it with a 7.5 GS produces images that look too much like the original images and don't really honor the prompt, which as I understand it, means it's overtrained. Do any of these other settings look off to you?

stone garden Nov 9, 2022, 7:00 PM

#

So, what if you use like 200 images of a subject? is there a rule for training steps? or is it overkill?

#

dreamboothing of course

old igloo Nov 9, 2022, 7:54 PM

#

I read in a Dreambooth guide that you should use 200x as many class images as you have training images, and another that said they generally just use 1000 class images generated by the script. So I tried changing the num_class_images parameter to 1000, but as I am watching it run, it is only attempting to generate 238 class images. Any ideas?
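
For comparison, the two rules of thumb mentioned here can be put side by side. This is only a sketch of the arithmetic: the function name is hypothetical, the 200x multiplier and the flat 1000 come from the guides as described in the message, and the 30-image figure is just an example input.

```python
FLAT_CLASS_IMAGES = 1000  # the "just use 1000" rule from the second guide


def class_images_by_ratio(num_training_images: int, multiplier: int = 200) -> int:
    """Hypothetical helper: class image count under the '200x training images' rule."""
    return num_training_images * multiplier


example_training_images = 30  # example input only
print("ratio rule:", class_images_by_ratio(example_training_images))
print("flat rule: ", FLAT_CLASS_IMAGES)
```

The two rules diverge quickly as the training set grows, so it is worth deciding which one a given guide's other settings were tuned for.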

stone garden Nov 9, 2022, 8:32 PM

#

okey, pretty close to the art style of sleepy gimp. now I have to train trench coats I see 🤣 (trained on 13 images without class data and 2400 training steps)

stone garden Nov 9, 2022, 8:58 PM

#

okey, needs some fine tuning. lets lower the learning rate

old igloo Nov 9, 2022, 9:28 PM

#

stone garden okey, needs some fine tuning. lets lower the learning rate

What's a good low learning rate? 1e-6?

stone garden Nov 9, 2022, 9:34 PM

#

old igloo What's a good low learning rate? 1e-6?

Dunno, guess it's a good starting point. the above samples are 1e-6 i believe. I'm doing 5e-7 run now.

stone garden Nov 9, 2022, 10:06 PM

#

well there is a little bit of Michelle Pfeiffer in her... okey, let's do 1e-7 and reduce the steps

00672-3608164305-art_by_--sleepygimp_Michelle_Pfeiffer_head_visible.png

shrewd jewel Nov 9, 2022, 11:50 PM

#

I've been trying to train embeddings or hypernetwork (only 8GB of VRAM so no dreambooth 😭) and on a certain set of training images I ONLY get results like these: I've tried trimming down the image set (from like 50+ down to 20ish) with different lighting, clothing, a couple full body shots, a couple torso shots. I cropped and resized all the images myself. Went through and edited the pre-processed prompts BLIP spit out. I've tried both hypernetwork and embeddings (using auto1111). I've tried adjusting the learning rate up and down. I've tried various checkpoints from 1000 steps up to 5000 steps. I've tried changing up the keyword juuust in case it's colliding with something else in the dataset. What gives? These should be photos of a person btw and I've successfully done this with other people.

shrewd jewel Nov 10, 2022, 1:00 AM

#

rubber ducky you're the one. I think I figured it out, the vectors per token was set to 1 with this embedding and not the others. Raising it seems to have fixed this issue.

surreal mango Nov 10, 2022, 3:30 AM

#

so im using a dreambooth model of myself
last night it worked okay
but then SD updated and the layout is weird like this but not only that I get the same photo on every generation even with different seeds

slow badger Nov 10, 2022, 3:30 AM

#

surreal mango so im using a dreambooth model of myself last night it worked okay but then SD ...

Did you reload the page? Try Ctrl + F5

surreal mango Nov 10, 2022, 3:30 AM

#

yes

#

any suggestions?

#

also

#

#

what the hell keeps happening with the noise?

slow badger Nov 10, 2022, 3:38 AM

#

The last update to AUTOMATIC1111 was on the 8th at 7AM UTC

surreal mango Nov 10, 2022, 3:39 AM

#

I dont know then

#

it showed git pull commands working saying there were changes

glossy rune Nov 10, 2022, 12:40 PM

#

surreal mango what the hell keeps happening with the noise?

too many steps or too high learning rate

clear flume Nov 10, 2022, 1:09 PM

#

yo

#

I tried to install dreambooth locally

#

but it keeps giving me errors

#

can't import some stuff

#

maybe there is a way to uninstall it and reinstall it?

surreal mango Nov 10, 2022, 4:01 PM

#

glossy rune too many steps or too high learning rate

how many learning steps should I do for the dreambooth model with 14 images?

glossy rune Nov 10, 2022, 4:21 PM

#

Depends on other variables like training text-encoder or not, prior preservation and if yes number of class images. Generally speaking 1000 steps are a good start and for sd-1.4 as base model I’d recommend lr 2e-6. with 1.5 I prefer 1e-6
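
The starting points above could be written down as a small lookup. This is a sketch only: the values are the personal preferences quoted in the message (not official defaults), and the dictionary keys and function name are hypothetical, not the argument names of any particular training script.

```python
# Starting points quoted in the message above; personal preferences, not official defaults.
SUGGESTED_LEARNING_RATE = {"sd-1.4": 2e-6, "sd-1.5": 1e-6}
SUGGESTED_STEPS = 1000


def starting_config(base_model: str) -> dict:
    """Hypothetical helper: return a first-try config for the given base model."""
    if base_model not in SUGGESTED_LEARNING_RATE:
        raise ValueError("no suggestion recorded for " + base_model)
    return {
        "base_model": base_model,
        "learning_rate": SUGGESTED_LEARNING_RATE[base_model],
        "max_train_steps": SUGGESTED_STEPS,
    }


print(starting_config("sd-1.4"))
print(starting_config("sd-1.5"))
```

As the message says, the right values still depend on whether the text encoder is trained, whether prior preservation is on, and how many class images are used.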

maiden grail Nov 10, 2022, 5:40 PM

#

Is there a way to label images with certain words, for the model fine tuning?

For example, let's say I am making magic staffs.

I would train this on images of staffs, and let's say 1 of the staffs is called a "staff of power".

I would want my model to be able to generate a "staff of power" but I don't want to make models for EACH of these descriptors.

I want to have 1 model, that is "staff" model, but I also want to instill into it the concept of a staff "of power"

high venture Nov 10, 2022, 5:55 PM

#

Should i use a larger batch size when training the model with dreambooth? I am able to set a batch size of 4-6 on my rtx3060, and it runs significantly faster if you divide the iteration time by the batch size.
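
The comparison being described is per-sample throughput rather than raw iteration time. A minimal sketch, with made-up timings (the numbers below are illustrative assumptions, not measurements from an RTX 3060):

```python
def samples_per_second(batch_size: int, seconds_per_iteration: float) -> float:
    """Throughput in training samples per second for one configuration."""
    return batch_size / seconds_per_iteration


# Hypothetical timings: a larger batch makes each iteration slower,
# but each iteration also processes more samples.
small = samples_per_second(batch_size=1, seconds_per_iteration=1.0)
large = samples_per_second(batch_size=4, seconds_per_iteration=2.0)

print("batch 1:", small, "samples/s")
print("batch 4:", large, "samples/s")
```

When comparing runs, the step count also needs adjusting, since 1000 steps at batch size 4 sees four times as many samples as 1000 steps at batch size 1.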

glossy rune Nov 10, 2022, 6:11 PM

#

I'd probably first remove low vram limitations like fp16, training without text encoder etc, before increasing batch size. But for efficiency you want to use your vram as much as possible. I usually train with full features and bs 2 (on 24 gb vram)

hot breach Nov 10, 2022, 6:11 PM

#

larger batch size is probably better, computes gradient across the whole batch
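
To illustrate what "computes gradient across the whole batch" means, here is a toy sketch using a one-parameter model `y = w * x` with squared error. It is not how Stable Diffusion training is implemented, and the data points are made up. It only shows that the batch gradient is the average of the per-sample gradients.

```python
def sample_gradient(w: float, x: float, y: float) -> float:
    """Gradient of the squared error (w*x - y)**2 with respect to w, for one sample."""
    return 2.0 * (w * x - y) * x


def batch_gradient(w: float, batch: list) -> float:
    """Mean of the per-sample gradients, i.e. the gradient of the mean loss over the batch."""
    grads = [sample_gradient(w, x, y) for x, y in batch]
    return sum(grads) / len(grads)


# Made-up (x, y) pairs for illustration.
batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 7.0)]
w = 1.0

for x, y in batch:
    print("single-sample gradient:", sample_gradient(w, x, y))
print("batch gradient:", batch_gradient(w, batch))
```

Each single-sample gradient points in a somewhat different direction; averaging over a batch gives a less noisy update, which is the usual argument for larger batches when VRAM allows.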

surreal mango Nov 10, 2022, 6:11 PM

#

glossy rune Depends on other variables like training text-encoder or not, prior preservation...

how do I set the Ir steps? I dont see that option (I am using this notebook by the way: https://colab.research.google.com/github/TheLastBen/fast-stable-diffusion/blob/main/fast-DreamBooth.ipynb#scrollTo=1-9QbkfAVYYU)

Google Colaboratory

#

also I took 3 more photos

glossy rune Nov 10, 2022, 6:15 PM

#

surreal mango how do I set the Ir steps? I dont see that option (I am using this notebook by t...

The notebook seems to have a field for training steps (3000) but the learning rate is fixed in the code to 2e-6. you would need to modify that manually. I don’t train with notebooks, so can’t help much

old igloo Nov 10, 2022, 6:29 PM

#

For anyone running Dreambooth on colab, if you want to have all of your checkpoint weights to be converted to .ckpt files, you can modify the conversion cell to the following:

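import os
import subprocess

#@markdown  Whether to convert to fp16, takes half the space (2GB).
fp16 = True #@param {type: "boolean"}

# WEIGHTS_DIR is defined in an earlier cell of the notebook.
print("Converting all weights located within " + WEIGHTS_DIR)

for entry in os.scandir(WEIGHTS_DIR):
    # Each checkpoint is saved in its own subdirectory; skip plain files.
    if not entry.is_dir():
        continue
    modelpath = os.path.join(entry.path, "model.ckpt")
    print(entry.path, "->", modelpath)
    cmd = ["python", "convert_diffusers_to_original_stable_diffusion.py",
           "--model_path", entry.path, "--checkpoint_path", modelpath]
    if fp16:
        # Only add the flag when requested; otherwise an empty "" would be
        # passed to the script as a stray argument.
        cmd.append("--half")
    result = subprocess.run(cmd)
    # subprocess.run does not raise when the script fails, so check the exit code.
    if result.returncode != 0:
        print("Conversion failed for " + entry.path)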

old igloo Nov 10, 2022, 6:47 PM

#

maiden grail Is there a way to label images with certain words, for the model fine tuning? ...

I think this is a case for creating multiple models, then merging them together.

#🔧｜finetune

samples per second, # iterations per second, # batches per second.