#🔧|finetune

median sun
#

that would also cut in half things that are on the image

delicate stream
median sun
#

I'll just do both and compare with prompts, I'm just testing everything anyway atm

delicate stream
#

okay, but if it's just a style, i recommend 512x512 and cropping manually, that way you choose what to cut
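
For comparison with manual cropping, here is a minimal sketch of what an automatic center crop does to a non-square image. The function name is hypothetical and the code only computes the crop box; with a library such as Pillow the box could then be passed to `Image.crop` and the result resized to 512x512. It shows why anything near the left and right edges gets cut off.

```python
def center_square_box(width: int, height: int) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


# A 768x432 frame keeps only the middle 432x432 region;
# 168 pixels are discarded on each side.
print(center_square_box(768, 432))  # (168, 0, 600, 432)
```

Cropping by hand amounts to choosing `left` and `top` yourself instead of taking the centered values.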

#

sort of like this

#

if i wanted to have the bleach manga style

median sun
#

yea and it's a lot of work to crop even with birme

#

I'll do that later when I'm more familiar with TE

delicate stream
#

do you use the auto cropping in automatic?

median sun
#

didn't even know that existed

delicate stream
#

yup

#

in the train tab

median sun
#

thx gonna try that next then

delicate stream
#

no prob

median sun
#

I'll also have to redo the dreambooth training

#

what I trained was a complete mess but I used too many source images

#

I'll try separating backgrounds from characters and train 2 models later I think

delicate stream
#

remember do Quality over Quantity

#

dont do what i did

median sun
#

yea I already deleted most

delicate stream
#

Yhea dont do that

median sun
#

lol yep I did that too

delicate stream
#

do something like this

#

a bit more varied

#

Now that i think about it

median sun
#

what is the output like?

delicate stream
#

i should train another 90s model

delicate stream
#

let me see

median sun
#

add some armitage and ghost in the shell if you haven't already

delicate stream
#

these are old results, 1 sec

#

This is from Bleach Manga style

median sun
#

not bad

#

do you use negative prompts?

delicate stream
delicate stream
#

i think i deleted my older 90s results prob because im running out of space

median sun
#

lol yea sd sucks up space in no time

delicate stream
median sun
#

wow you got a small hdd

#

that is what I had like 10years ago

delicate stream
#

its an NVME but for some reason it came divided into two sections each 500gb when i bought it

#

so its 1TB

median sun
#

that is pretty low too

delicate stream
#

Yhea but....

#

10TB

median sun
#

ahh ok that looks more like it

#

XD

delicate stream
#

Mainly Anime

median sun
#

I assumed as much with your outputs

#

I do backgrounds and textures mostly with SD

#

I hope that I can train a model that will help me with gamedev

delicate stream
median sun
#

nice our tastes in anime are very similar

delicate stream
#

i think, Bleach, DB, One Piece and Pokemon take the most

#

yup

median sun
#

I only watched bleach season one after that it became really bad

#

pokemon I don't like that much

#

db same only season 1

delicate stream
#

got OST Manga, extra stuff too

median sun
#

So I assume you are rewatching bleach to create a model rn?

delicate stream
#

Not yet

#

im more interested in the manga style

#

but....

#

i currently have my hands full on a project

median sun
#

if you want to save time use blender in video editing mode and render at 1 fps, that gives about 1k pics for 20min

median sun
#

then you can just cherry pick pics
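
The frame count behind that estimate is simple arithmetic. A small sketch (the function name is made up for illustration):

```python
def frames_rendered(minutes: float, fps: float) -> int:
    """Number of still frames produced when rendering a clip at a given rate."""
    return int(minutes * 60 * fps)


# A 20-minute episode rendered at 1 frame per second
print(frames_rendered(20, 1))  # 1200
```

So a 20-minute clip gives 1200 candidate images, in line with the "about 1k" figure.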

#

what are you working on?

delicate stream
#

My current project

median sun
#

ahh sd project

delicate stream
#

yessir

median sun
#

lol you remind me of a friend

delicate stream
#

i have been working on her model for WEEKS!

#

yesterday i did a breakthrough

#

and im closer to my goal

#

to release the model publicly

#

and yes

#

it can do NSFW

median sun
#

lol

delicate stream
#

in what way?

median sun
#

he likes to use yessir too like that

delicate stream
#

😂

#

Well....its some quality shit right here

median sun
#

yep those pics are really good for sd

delicate stream
#

also you might notice she don't got a tail

#

its an old model

#

the new one has the tail

median sun
#

now that you say it yea but it's been ages since I've seen the movie so I didn't even notice before XD

#

do you also render out 3d models for poses for img2img?

delicate stream
median sun
#

my test a couple month back was not very good but with TE that should become a lot better

#

I think I still got the output

delicate stream
#

i also upscale with img2img

median sun
#

this is the original render from blender

#

I'll def test it again with a trained model

delicate stream
#

its an interesting concept, maybe i should do the same

median sun
#

with your skills it's prob gonna look a lot better

delicate stream
#

i doubt it, but it will probably depend on the prompt

median sun
#

if you don't know blender well just generate a model with makehuman

#

or download a free rigged one

#

then you just need to pose it add a camera in the right angle and render out an image

#

it's not too hard to learn posing a rig

delicate stream
#

i know blender, i just haven't gotten to use it in a while

#

i used to do a bunch of texture editing, renders and NSFW stuff

#

but now im distracted by SD

median sun
#

ic

#

y I'm distracted by SD too

#

gotta do more gamedev soon

#

but if I can get SD to generate textures then I'll save a lot of work in the future

delicate stream
#

This is the new model, the previous one was trained on WD this one is on Furry model (ye18)

delicate stream
#

im still working out some stuff

#

still in WIP

#

i also noticed sometimes the neg prompt can disturb what you want, so it's best to generate once without the negative, then again with the negative, and compare results

median sun
#

thx that is good to know

#

damn, now with all the different models appearing my HDD is gonna explode with model files

delicate stream
#

Yup

#

Im prob gonna spend another week on this model

median sun
#

wow that is a lot of time

delicate stream
#

bro

#

i have many

#

its a mess

median sun
#

they are prob. gonna be outdated in 3 month and you'll be able to delete them all XD

delicate stream
#

Maybe....

median sun
#

I luckily got a couple tb free space

#

so I won't run out on space soon

delicate stream
#

cries in GBs

#

i wish i had more TB's

#

its kinda sad people kicked Clip guided diffusion into the waste

median sun
#

you can get a classic 18TB HDD for like 300 now

#

what is Clip guided diffusion?

delicate stream
median sun
#

did that use videos for training?

#

those are very abstract

delicate stream
#

or in this case CLIP + VQGAN

delicate stream
median sun
#

I bet that will come back at some point when everyone is fed up with the perfect looking images

delicate stream
#

i think this one was "Beach in a dream"

median sun
#

that is some pretty good stuff

delicate stream
#

i guess this is more of what dreams look like instead of the realism that SD provides

median sun
#

so is clip gone forever or just not used much?

delicate stream
#

prob not used much

#

i still have the app

#

so i can use it

#

but i never do

median sun
#

maybe a SD model trained with abstract images would output similar images

delicate stream
#

the model is here too

median sun
#

here?

#

damn in ok another dl

delicate stream
#

i tried putting it in automatic1111 but it wouldn't run it

median sun
#

even tho it's a ckpt file?

delicate stream
#

yhea

#

let me show you what it says

#

ran the wrong bat lol 1 sec

#

that

stone garden
#

I'm working on a kind of expanded "prompt x/y" to compare multiple sets of diffusers on different configurations and a set of prompts. I'm still working on the UI, but the inner comparing function is mostly done, and I wanted to share 🙂

median sun
#

@delicate stream have you tried it in SD 1.5?

#

oha 2h for 10k steps

delicate stream
median sun
#

isn't sd 2.0 stable diffusion v 2?

#

why tf is my cpu at 64 degrees

#

cpu isn't even used 10%

#

guess the gpu is radiating that heat everywhere

#

guess I'll have to change the cooling if I do this a lot

weary knot
median sun
#

holy shit gpu waterblocks cost 200$

weary knot
#

excuse-me, what's waterblock?

median sun
#

it's a cooler used to connect watercooling to a gpu

weary knot
#

thank you

median sun
#

my old pc was water cooled and I thought about watercooling my current one but if I did that it would cost like 500$

#

can't afford that atm just for cooling

#

I hope the model training works then I'll hopefully finish a game sometime in 2023

weary knot
#

oh, you're making a game as well?

median sun
#

multiple

weary knot
#

also, I don't get why you need special cooling for generating

median sun
#

my pc runs too hot

weary knot
#

yeah haha but I mean, do you generate all the time? Like many batches per minute ?

median sun
#

cpu is at 60+ deg even tho it's the gpu that is working

#

I'm training models a lot lately

weary knot
#

oh I see

#

do you mind sharing a bit of your game projects?

median sun
#

but if I can get SD to generate nice pixelart textures I might pick it up in 2023/2024 again

#

I've switched to a really small project where I'm currently animating a worm:

#

that is what I should actually be doing but instead I'm making SD models ...

weary knot
#

huh

#

I was about to ask how SD fits into the worm

#

but it's unrelated, then

median sun
#

SD doesn't in that project

#

what are you working on?

#

I wonder if it would be possible to train SD to create diffuse textures for uv maps

weary knot
#

I don't know much about uv maps, but I've seen GANs doing that, so it should be possible with SD

#

unless you have some kind of hard requirement, it should be a matter of finetuning

#

do you see a reason it wouldn't work?

#

as for my project, hmmm

#

I'm working on my master's thesis, on a new idea that might become a paper

#

so I really shouldn't tell much of the details until I can publish or something

#

Already had to restart my research from scratch once, don't wanna do it again haha

#

First time didn't have to do with SD, it was a new segmentation technique that turned out to be crap in the end

#

ok, not crap, I'm exaggerating. But not good enough to publish

median sun
#

every uv map is differently shaped so I'm not sure if SD is gonna be able to fill them right

#

and generating an object with SD and cutting it out would only texture one side of an object

#

there is also a stretching effect in the diffuse of round objects: it looks stretched on the 2d image but normal on the 3d object

#

there is a lot of knowledge involved in creating a diffuse for a 3d object so I assume SD won't be able to do that

#

or maybe only for simple objects that have similar uv maps

weary knot
#

yeah, I've seen networks trained specifically for that and they can do pretty well

#

in the context of clothing for fashion

#

but maybe SD doesn't have the specific knowledge

#

yet, maybe it does. Who knows 😛

median sun
#

guess I'll have to try it sometime

#

what I guess could work is img2img for alternative diffuses

weary knot
#

yup. I've yet to see someone fine tune SD with image prompts instead of text captions

#

would be interesting

#

so you could finetune with img2img

median sun
#

first I need to train a good pixelart model

#

that is prob. gonna take a couple of weeks

#

if my pc didn't run so hot I could train SD and use blender at the same time

weary knot
#

sorry for delay

weary knot
#

so your focus is the pixelart model? You seem like the type that does a lot at once, honestly

#

which is always dangerous 😛

delicate stream
#

i accidentally ran img2img on an image with no prompt for upscale....and ngl it looks better than what i would have gotten with a prompt

#

original

#

no prompt^

#

for some reason no prompt just made it work

weary knot
#

hmmm

#

did you use the same seed?

delicate stream
#

no

weary knot
#

sometimes I have problems in img2img when I use the same seed as for the lowscale version

#

ok

#

huh interesting

delicate stream
#

i did an upscale with the same prompt as original and it looks worse

#

No prompt is better?

#

idk i just see the no prompt has no artifacts

weary knot
delicate stream
#

im not using 2.0

weary knot
#

ohhh

delicate stream
#

just a custom model and img2img

weary knot
#

yeah, I thought so because the 2.0 comes with an upscaler. Got it now.

delicate stream
#

yhea but the upscaler still doesn't work i think

median sun
#

@weary knot how is that dangerous? Anyone that doesn't learn how to use AI now will have to learn it later. My assumption is that in 5 to 10 years digital artists that can't use AI will be out of jobs.

weary knot
#

oh no, I mean doing a lot of things at once is dangerous

#

because you end up not finishing many of them

#

and getting distracted

#

not you specifically, anyone who does that

median sun
#

ahh yea that is true but the only distraction I have atm is SD so I'm not too worried

delicate stream
#

anyone else getting this when doing "send to img2img"?

dapper prism
dapper prism
#

As for why? The question you should really be asking is "why not"?

dapper prism
#

lol, sadly not yet. It doesn't work so well with taping really large objects to walls

delicate stream
#

im liking the new DPM++ SDE Karras sampler

weary knot
#

how did you do that, though?

#

your workflow

delicate stream
#

🍔

dapper prism
# weary knot how did you do that, though?

I acquired photos and paintings of objects like pizza, wine, bananas, cans, phones, cameras, bags, dildos, etc... taped to walls (22 images in total), and then trained the model on it

delicate stream
#

i think the hamburger is off

dapper prism
#

To increase the training dataset size, I'd need photos from people of them duct taping objects to walls as I've exhausted the search online for such content

weary knot
#

so it's like a style, in the end

delicate stream
#

tape them

dapper prism
delicate stream
#

yup

#

Nice burger

dapper prism
delicate stream
#

i have cement walls so i don't need to worry

#

I..... LOVE the freckles here

#

she looking like she has shark skin

weary knot
#

was it also a simple dreambooth training, or did you do anything special on this model?

delicate stream
#

tho im using the new sampler

#

DPM++ SDE Karras

delicate stream
weary knot
#

indeed

median sun
#

yea I think I should do 512x512 next

#

I did not expect to get abstract art when using 768x432 instead of 512x512

#

or was 10k steps for an embed too low?

delicate stream
#

besides SD is bad at lower res than 512

#

thats why people always do 512x512

median sun
#

how do I generate regularization images? I only downloaded 1k for persons but I have none for backgrounds.

weary knot
#

maybe if you search discord you'll find a set you can use

lusty dagger
#

I just forked a dreambooth notebook with Stable Diffusion V2 enabled and wrote a guide to go with it!! I hope this helps folks out 🙂

https://twitter.com/kaliyuga_ai/status/1596955332181655552?s=46&t=NFRtT8ATnPZRMb4dNQ8_NA

Ok! By popular demand I forked the DreamBooth notebook I use to have SD V2 enabled, and I wrote a guide to go along with it! I hope folks find this useful and fun!
Notebook: https://t.co/ymeiSpYzZk
Guide: https://t.co/HiIppaOiCj
:)
#stablediffusion2 #dreambooth #stablediffusion

median sun
#

what is a DreamBooth notebook?

median sun
weary knot
#

not sure, actually

frosty wave
#

Hmm, I have a new issue with the DB extension of a1111 since the last update, I didn't have it before:
Error completing request
Arguments: ('MyModel', '', 'D:\SD\SD-learn\input512', 'D:\SD\SD-learn\Regul', 'photo of a RobertLF person', 'photo of a person', '', '', 1, 7.5, 40, 2371, 512, False, True, 1, 1, 1, 3000, 1, True, 1e-06, False, 'constant', 0, True, 0.9, 0.999, 0.01, 1e-08, 1, 5000, 5000, 'fp16', True, '', False, True, '75', True, False) {}
Traceback (most recent call last):
File "P:\SD\stable-diffusion-webui\modules\ui.py", line 169, in f
res = list(func(*args, **kwargs))
File "P:\SD\stable-diffusion-webui\webui.py", line 58, in f
res = func(*args, **kwargs)
File "P:\SD\stable-diffusion-webui\extensions\sd_dreambooth_extension\dreambooth\dreambooth.py", line 331, in start_training
trained_steps = main(config)
File "P:\SD\stable-diffusion-webui\extensions\sd_dreambooth_extension\dreambooth\train_dreambooth.py", line 649, in main
unet = UNet2DConditionModel.from_pretrained(
File "P:\SD\stable-diffusion-webui\venv\lib\site-packages\diffusers\modeling_utils.py", line 471, in from_pretrained
for param_name, param in state_dict.items():
AttributeError: 'NoneType' object has no attribute 'items'

#

Is there a missing argument?

frosty wave
#

Oh well, there was another error above, it seems the model directory structure and files were not created correctly... These two subdirs were missing from the model directory:

#

so I copied them from a preceding model...

#

and this file was corrupted (considered as "non zip file" in an error)

#

So I copied it from a preceding model

#

and now it seems to work...

#

At least the log says it is training

royal kayak
#

Any v2 models out yet?

spice trench
honest snow
hot creek
#

what's the best learning rate for dreambooth?

#

4e-6?

median sun
#

@hot creek I had good results setting it to 0.000001 not sure how much 4e-6 is.

north stream
#

It's 0.000004
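
The conversion is easy to check in Python, which reads scientific notation directly. A small sketch (the helper name is hypothetical):

```python
def as_decimal(text: str, places: int = 6) -> str:
    """Render a learning rate written in scientific notation as a plain decimal."""
    return f"{float(text):.{places}f}"


for rate in ("1e-6", "4e-6"):
    print(rate, "=", as_decimal(rate))
# 1e-6 = 0.000001
# 4e-6 = 0.000004
```

So 4e-6 is four times the 0.000001 (1e-6) value mentioned above.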

stone garden
delicate stream
#

train with 512x512 or any 1:1 ratio

final matrix
#

HUGE update coming to my Legend of Korra model soonTM! Featuring training on 768x768 pixel size (on 1.5 SD!), ALL characters, experimental caption method, manually pruned dataset from 30000 images....

My currently released v1.0 Legend of Korra model (https://huggingface.co/ai-characters/4elements-diffusion) has some big issues with caption overload, inflexibility, overtraining, only Korra trained, etc... but I have recently come across the EverDream repo by Freon that allows for training of up to 768x768 pixel size, up from the standard 512x512, and which I want to use for the rework of my model.

So I took this opportunity to redownload the entire original 30000 image dataset and manually prune it this time around. It will have taken 5 days now to fully prune it down to 5000-10000 images.

I will also include more fanarts and cosplay photos.

I am also going to try out a new caption method where I caption only a small set of the images, but in great detail, and the rest of the images serve more as padding. E.g. those images could be captioned as, say, "screencap from token in the token aesthetic". My thinking here is that the model will be smart enough to link the captioned character images to the uncaptioned character images in the wider dataset, as it will know who is who from the captioned images. If this works it will massively reduce my workload while massively increasing likeness and flexibility by providing a ton more data of the characters and style for the AI without overloading the captions.

I will also now feature all major characters and some minor characters (including Naga and Pabu!) in all outfits, and no longer just Korra!

I will still train this new version on 1.5 SD, but I may train a 2.0 SD version too sooner or later.

The fact that there will be a ton of screencaps featuring hands likely means that hand generation with this model will be very good too!

However, as this manual processing takes a lot of my time and attention, I probably won't be able to start training for a few days. I also have final exams starting Thursday, which will slow things down further.

#

I will release the pruned but uncaptioned dataset later today
or tomorrow depending on how long the fanarts will take

#

will be in original aspect ratios and resolutions though as i intend to use this dataset for 768x768 training

but i put 5 days of pruning work into this. around 8 hours per day or so.

frosty wave
#

Hi! Trying to run Dreambooth on 2.0 using the "accelerate launch train_dreambooth.py" from Shivam Shrirao repository but I get this error. What did I miss?

tough gazelle
#

The URL it's trying to get to doesn't exist

frosty wave
#

Yeah.. but why does it try to get that URL? My only parameter is stabilityai/stable-diffusion-2 as the name of the source model, and it fills in the other parts of the URL by itself... So maybe I have a wrong version of some module somewhere?

tough gazelle
#

There were some issues on huggingface a bit earlier. Perhaps it's just busted.

frosty wave
half folio
#

does Shivam's fork support 2.0?

upper prism
half folio
#

ah

summer mural
#

im trying to make an embedding for a subject, does anyone know a tool to download all images from google images?

summer mural
#

is there a nice tool so that making a dataset is less of a pain?

vale egret
#

curl

summer mural
#

good lord

#

it takes 2 days to train an embedding on a 4090...

trail owl
#

"Error completing request
Arguments: ('harrisondreambooth3', False) {}
Traceback (most recent call last):
File "C:\AI\stable-diffusion-webui\modules\call_queue.py", line 45, in f
res = list(func(*args, **kwargs))
File "C:\AI\stable-diffusion-webui\modules\call_queue.py", line 28, in f
res = func(*args, **kwargs)
TypeError: start_training() takes 1 positional argument but 2 were given

Traceback (most recent call last):
File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\gradio\routes.py", line 284, in run_predict
output = await app.blocks.process_api(
File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 983, in process_api
data = self.postprocess_data(fn_index, result["prediction"], state)
File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 913, in postprocess_data
if predictions[i] is components._Keywords.FINISHED_ITERATING:
IndexError: tuple index out of range"

Does anyone have any ideas how to solve this error while training dreambooth?
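
For context on what the first traceback is saying: this kind of `TypeError` appears whenever a caller passes more arguments than the function accepts. A minimal sketch that reproduces the message; the function body and the caller here are hypothetical stand-ins, not the extension's real code. One possible cause, and this is only an assumption, is that the UI and the extension ended up on different versions after an update.

```python
def start_training(model_dir):
    """Stand-in with a one-argument signature (hypothetical body)."""
    return f"training {model_dir}"


def call_like_ui(func, args):
    """Call func with the arguments the UI collected; report a mismatch."""
    try:
        return func(*args)
    except TypeError as err:
        return f"TypeError: {err}"


# Two values are passed, as in the 'Arguments:' line of the log above
print(call_like_ui(start_training, ("harrisondreambooth3", False)))
# TypeError: start_training() takes 1 positional argument but 2 were given
```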

vale egret
summer mural
#

how many steps do you use

vale egret
#

6K

summer mural
#

okay

#

this might be why lol

#

im doing 100k

vale egret
#

That sounds like overkill

summer mural
#

i just left the default values

torpid sinew
summer mural
#

burgonet

fast crater
#

has anyone got dreambooth extension to automatic1111 to work with sd 2.0 768 ?

torpid sinew
#

2 Days still sounds like a lot

summer mural
#

do I need to care about the hypernetwork thing?

vale egret
#

100k steps sounds like it should take several hours

fast crater
#

(just to be clear I am just asking if its possible yet, not asking to be told how to do it.... i mean that would be nice...but if i know its not possible yet I will stop looking for how lol)

summer mural
#

yeah im trying at 5k

#

hope the results arent too shite

fast crater
summer mural
#

yeha

#

eh its getting there

#

kinda

fast crater
#

if you have the hard drive space go for 6500 steps but have it output a checkpoint every 300 steps

summer mural
#

think ill run it overnight at 10k steps or smth

fast crater
#

when its done, test each one.... sometimes i get good results at 1200 steps, sometimes at 6k
sometimes i get good results when the sample has been bad, other times the sample has looked good but the results turned out bad

summer mural
#

test each what?

fast crater
#

each checkpoint
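
To get a feel for how many checkpoints that leaves to test, here is a small sketch. It reads the suggestion above as roughly 6500 total steps with a save every 300 steps, which is an interpretation of the figures in the chat; the function name is made up.

```python
def checkpoint_steps(total_steps: int, save_every: int) -> list[int]:
    """Steps at which a checkpoint would be written during training."""
    return list(range(save_every, total_steps + 1, save_every))


steps = checkpoint_steps(6500, 300)
print(len(steps))            # 21
print(steps[0], steps[-1])   # 300 6300
```

Since each full checkpoint is a few GB, 21 of them is where the "if you have the hard drive space" caveat comes from.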

summer mural
#

ah i see

#

it doesnt save the best loss or something?

fast crater
#

no

summer mural
#

its getting there 😄

#

mmh

#

is it normal that my embeddings are only 25kb?
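
A rough size estimate suggests a file of that size is plausible. The sketch below assumes an SD 1.x embedding (768 values per vector), 32-bit floats, and 8 vectors per token; the vector count is a guess for illustration, and the real file also carries some metadata overhead.

```python
def embedding_bytes(vectors: int, dim: int = 768, bytes_per_value: int = 4) -> int:
    """Approximate raw size of a textual-inversion embedding, without file overhead."""
    return vectors * dim * bytes_per_value


size = embedding_bytes(8)      # assumed: 8 vectors per token
print(size, size / 1024)       # 24576 24.0
```

An embedding only stores a handful of vectors rather than model weights, which is why it is kilobytes instead of gigabytes.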

fast crater
#

and with the dreambooth extension on automatic1111 you can train more steps as needed

fast crater
summer mural
#

i heard you cant merge concepts with dreambooth

fast crater
#

i use a concept.json to train it on 3 people at once.... works rather well

#

the latest update to the one i use means i got to re-do the concept file though

summer mural
#

is there a tool to make annotations less of a pain?

fast crater
#

not that i know of, but its not that difficult

summer mural
#

its tedious

fast crater
#

shrug sometimes you got to put in the work to get the results

summer mural
#

yeah ik

#

but im a programmer, i dont wanna work 😂

fast crater
#

once you got a template for a concept...cut and paste is a go to.... just make sure to get notepad++ so you can check the json file is valid
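
If you would rather not rely on an editor, the same validity check can be done with Python's standard `json` module. A minimal sketch; the keys in the sample strings are hypothetical placeholders, not the extension's actual concept schema.

```python
import json


def check_json(text: str) -> tuple[bool, str]:
    """Return (True, 'ok') if text parses as JSON, else (False, where it failed)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"
    return True, "ok"


# Hypothetical concept entries; a trailing comma is a typical copy-paste mistake
good = '[{"instance_prompt": "photo of a person1", "class_prompt": "photo of a person"}]'
bad = '[{"instance_prompt": "photo of a person1",}]'

print(check_json(good))
print(check_json(bad))
```

This only confirms the file is well-formed JSON; it does not check that the keys are the ones the trainer expects.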

compact trellis
#

you make copilot work

vale egret
summer mural
#

maybe i wait a year or two, then all the smart and hard working people would have figured everything out

fast crater
fast crater
summer mural
#

if i know something is going to be tedious, i would rather waste 10 times as much work as it would take to do it manually so that i can have a bot do it for me 😂

fast crater
#

I have got great results from dreambooth after all...but never tried embeddings

vale egret
#

Dreambooth is more accurate at replicating something specific but it’s less flexible usually.

summer mural
#

its getting better and better, kinda want to try one right now

fast crater
#

the results i get tend to be either very very exact (spooky) or recognisable as me/mates..... the ai does like to pack a few pounds onto me...yes the ai.... totally the ai ......

vale egret
#

Embeddings don’t give that kind of result unless you crank the vector count up to absurd numbers, but then it starts requiring a lot more training data than dreambooth uses

fast crater
#

and for reference this is training image

fast crater
vale egret
#

I haven’t tried training a high precision embedding yet, so I couldn’t say

fast crater
#

so far I have trained me and mates on a lot of models, the openmidjourney being one of my favs... thats the main downside I seem to have... an embedding may take longer but if the result is the same then thats a one shot deal regardless of models

summer mural
#

its not half bad

vale egret
#

Embeddings only work on the model they were trained with

fast crater
#

I mean I am here to find out if there is a way to get dreambooth to work for 2.0, but that will mean a few hours training again lol

summer mural
#

considering it has never seen anyone wearing a burgonet

vale egret
#

Sometimes there’s a little cross-compatibility, but an embedding only usually works on the 1 model

fast crater
#

ah I have been using a couple on multiple models... but dont tend to use them often (and with mixed results)

#

then the only plus to embeddings would appear to be the hard drive space used?

#

(which tbh is starting to become an issue lol)

vale egret
fast crater
#

I may have to actually give embeddings a try.... compare the two methods (once its all working for 2.0/when i find how to)

summer mural
#

can you combine multiple dreambooth fine-tuned models?

fast crater
#

but you are essentially influencing the first model to a percentage of the second model

#

personally i find it better to just train the other model the same as you did the first

summer mural
#

yeah alright, I can't get to have the model do what I want

#

think ill just go learn how to draw or smth 😂

fast crater
#

its not quite that bad lol

#

I just use the model i like as the source model and train the same as i did the base model

vale egret
#

With weighted average, you dilute each input model. With add difference, you keep what is unique about both models, but you lose some of the original
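
The two merge modes can be sketched on plain lists of numbers standing in for model weights. This is an illustration of the arithmetic as commonly described for checkpoint mergers (weighted sum: A·(1−m) + B·m; add difference: A + (B − C)·m), not the webui's actual code; real checkpoints are dictionaries of tensors.

```python
def weighted_sum(a, b, m):
    """Blend two models: every weight moves a fraction m from A toward B."""
    return [(1 - m) * x + m * y for x, y in zip(a, b)]


def add_difference(a, b, c, m):
    """Add to A what B learned on top of its base C, scaled by m."""
    return [x + m * (y - z) for x, y, z in zip(a, b, c)]


model_a = [1.0, 2.0]   # toy weights for model A
model_b = [3.0, 6.0]   # toy weights for model B (fine-tuned from C)
base_c = [2.0, 5.0]    # toy weights for the base that B was trained from

print(weighted_sum(model_a, model_b, 0.5))            # [2.0, 4.0]
print(add_difference(model_a, model_b, base_c, 1.0))  # [2.0, 3.0]
```

In the weighted sum both inputs are diluted, while add difference leaves A's weights in place and shifts them by B's changes relative to C.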

fast crater
#

....there may be (probably are) better ways

#

Emad? think thats his name, has said better faster tools are coming to finetune 2.0

#

but coming is an annoying timeframe.... star citizen full release is coming.....
the eventual heat death of the earth is coming
(not sure which of those examples will be first, but you get my point lol)

summer mural
#

fucking finally

#

something good

fast crater
#

nice

#

if you try the same prompt with say the midjourney model, does it work?

#

(or any other model you may have)

#

curious how much the model an embed is trained on has an effect

summer mural
#

I had this with MJV4

#

to be honest

#

the amount of work

#

to have anything REMOTELY as good as midjourney

fast crater
#

no i meant the midjourney model for stable diffusion

summer mural
#

is probably worth more than the amount of $$ one would give to MJ

#

but unfortunately, it doesn't understand what a burgonet

#

is

fast crater
#

if memory serves its the model i trained my above image on, I know i have a few models based on the midjourney one

summer mural
#

ah havent tried with the midjourney finetuned model yet

fast crater
#

this i generated earlier on my trained version of midjourney v4

#

(for some reason it added a beard lol)

#

actually now not sure if that was midjourney one.... i have too many models lol

summer mural
#

that dinosaur thing does NOT have good intentions

#

anyway, still too much work, ill learn to draw instead xd

fast crater
#

lol, my artistic talents extends to "deformed stickman" so I am stuck with ai

summer mural
#

i drew when i was a kiddo

fast crater
#

right i need go do some work, nice chatting with you guys

summer mural
#

same

spring sun
#

Can I dreambooth the 768 model using 512 as the training res, or will the results be bad? (running out of memory with 768 res)

celest egret
fast crater
spring sun
fast crater
#

was hoping the 2.0 768 model would work ok locally on dreambooth

spring sun
weary knot
spring sun
#

16gb too

#

used ram goes to 11gb when using webui for hypernetworks

weary knot
#

ty!

shy wadi
#

Question for y'all - what would you say is a good minimal dataset size+step count for a quick-and-dirty dreambooth run? I've had good results with around 30 images/6k steps, but I'm wondering how much (if at all) I can safely trim that down for a quicker meme-ier take.

split acorn
#

12 images, 2k steps GoatUppies very minimal quick and dirty, worked well for 2 models I did (both were 12 images, 2020 steps)
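
One way to compare the two setups is steps per training image. This is just a convenience ratio for the numbers quoted above, not an established rule, and the function name is made up.

```python
def steps_per_image(total_steps: int, image_count: int) -> float:
    """Average number of training steps spent per instance image."""
    return total_steps / image_count


print(steps_per_image(6000, 30))            # 200.0
print(round(steps_per_image(2020, 12), 1))  # 168.3
```

Both runs land in the same ballpark, so the smaller dataset is being trained about as hard per image as the larger one.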

grave carbon
#

I tried dreambooth the 768 model but I get these:

karmic warren
#

can go as low as 500 steps with DB if you only focus on a close up and also stack the classifiers with pictures more specific than the class used

#

speaking of which the DB colab seems to be working for 768 just need a "premium GPU" (13 units per hours, is like 1.3$) and it takes 20 minutes to do 500 steps

#

so 10k steps, which is usually where it starts getting really good if you have say 150 instance pictures, will be like 400 minutes, which is almost 7 hours

#

which would cost 91 credits on colab ... a bit steep for now 😄
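
The estimate can be reproduced from the figures quoted above (20 minutes per 500 steps, 13 units per hour). A small sketch with a hypothetical function name; whether usage is actually billed in whole hours is an assumption used only to match the 91 figure.

```python
import math


def training_cost(steps: int, minutes_per_500: float = 20, units_per_hour: float = 13):
    """Return (minutes, exact units, units if rounded up to whole hours)."""
    minutes = steps / 500 * minutes_per_500
    hours = minutes / 60
    return minutes, hours * units_per_hour, math.ceil(hours) * units_per_hour


minutes, exact_units, rounded_units = training_cost(10_000)
print(minutes)                 # 400.0
print(round(exact_units, 1))   # 86.7
print(rounded_units)           # 91
```

So 10k steps is about 6.7 hours: roughly 87 units if counted exactly, or 91 when rounded up to 7 full hours.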

karmic warren
#

the first 2500 steps were a bit hopeless but it kinda got there after all

grave carbon
#

I did 4400 steps for 22 images

#

in the last ben

karmic warren
#

would say maybe try pushing it a bit further but seeing how much longer 2.0 seems to be needing compared to 1.5, and how expensive that can get ... try if you're really motivated or are running locally

fast crater
#

I am currently training 3 concepts on the 768 model in the dreambooth extension for auto1111. The concepts are my mates and me ...however the images are 512x512 so i do not expect it to work.... time will tell

#

currently it's using exactly 12gb vram

karmic warren
#

🤞

#

glad to hear it's working on local installs already

#

just gotta wait a bit more for colab and runpod

fast crater
#

first samples look terrible but that would be expected at 300 steps tbh

fast crater
#

none of these look like who they are meant to.....though i sort of see josh's beard in his...sorta

#

top one is meant to be my bald mate craig lol.... as i say though 300 steps...not gonna judge it on 300steps

karmic warren
#

in the grid posted above it didn't even start looking like anything before 2.5k steps

#

good luck 🤞

fast crater
#

yeah i may not have even got the json file right ( a lot changed)

#

trying the max steps thing to cut one concept off at 1200 steps (have less images for that one) and the rest to be the max amount of steps

shy wadi
fast crater
#

900 steps and they aint getting better lol

fast crater
#

not sure if its patience or 512 in 768 model

#

or a bad json

#

to compare that with old training samples:

#

hell it could be i have the sample prompt not right

#

and my class files were generated with 1.5 ...... i think I may have too many variables for this to be scientific in any way shape or form lol

karmic warren
#

😄

#

did upscale all my 512 pictures this morning

#

took a good 3 hours

fast crater
#

its using 12gb of ram..... thats scientifically proven 🙂

karmic warren
#

so if it doesn't pan out after say 5k steps

#

768 will be the ticket

fast crater
karmic warren
#

🤣

#

sing it a song like plants

fast crater
#

I'll sing it Daisy...and threaten to pull out the ram chips

karmic warren
#

😄 i'm afraid i can't do that dave

fast crater
#

lol (was worried that reference may show my age too much and not be recognised lol)

karmic warren
#

hehehe old geezers gotta stand together

fast crater
#

indeed!

karmic warren
#

🧓

fast crater
#

wonder if the Discovery 1 is in the dataset..... this may need to be explored when training is done

#

(and yeah i admit i had to look up the name, knew it was Discovery something but couldn't remember what)

karmic warren
#

yup still lots to be experimented with instance pictures their number and diversity and same goes with the class pictures

fast crater
#

these samples are disturbing

karmic warren
#

there's some funky funky results to be had by setting a couple variables in ways they probably weren't intended 😛

#

mhmm welcome to the uncanny valley, it gets better ... after a while

fast crater
#

o.O

#

must...not....close...eyes

karmic warren
#

would recommend grabbing a drink, maybe reading a paper

fast crater
#

tempted to load the console for a bit got an hour to go.
so far my mate josh is the only one that's consistent... it looks nothing like him but it's the same imposter in each image lol

#

/me afk for an hour

karmic warren
#

that'll happen more often with dpmadaptive but DB doesn't train with that one, wish you luck 🤞

fast crater
karmic warren
#

lots of knobs to explore 😄

fast crater
#

I think its my json file tbh...i should have done the first test with the webui and one concept...then expanded

#

doesnt help that dreambooth has had a complete ui makeover

karmic warren
#

there's a description of every variable on the github

fast crater
#

read the manual....first? thats crazy talk

#

manuals are for figuring out what went horribly wrong....not for preventing it lol

karmic warren
#

😄

fast crater
#

I think its calling me fat

#

My AI: one person...hmmm no, one person and a little person...hmmm no, one person he must be one person...no...OH he is a house! that explains the waist

fast crater
#

yeah it didnt go well

#

I have to say that has to be the most hilariously wrong a training has gone (that was 100% the first gen test of the model)

novel pike
#

hi friends, any advice on how to achieve style consistency when fine tuning models? i would like my output to look a lot more like my training images (side profile, no background, same angle etc) but cannot seem to figure out how to get there!

proven wave
#

Hey guys! I've just released a new photorealistic SD 1.5 model. It's available on dreamlike.art, in diffusers, and as .ckpt.
Trained on a large dataset of high quality images. Based on SD 1.5 with the new VAE.

Model Card: https://huggingface.co/dreamlike-art/dreamlike-photoreal-1.0
Link to .ckpt: https://huggingface.co/dreamlike-art/dreamlike-photoreal-1.0/resolve/main/dreamlike-photoreal-1.0.ckpt
Diffusers model id: dreamlike-art/dreamlike-photoreal-1.0

  • Use standard SD prompts, or add danbooru-like tags for characters (1girl, brown hair, etc.)
  • Supports any aspect ratios! No more double heads or repeating subjects. Non-square aspect ratios work better for some prompts. If you want a portrait photo, try using a 3:4 or a 9:16 aspect ratio. If you want a landscape photo, try using a 16:9 aspect ratio.
  • Trained on 640x640 images, so increase the resolution a bit
spring sun
#

anyone getting good results with 2.0 embeddings ?

lost mauve
#

im having problems training a personal model

#

any help? i keep getting images like this even after long hours of training

grave carbon
#

training the 768 model gives me these:

#

in TheLastBen's colab.

#

does anyone know why?

remote vapor
#

What are you training it with ..... i got the same

weary knot
sterile depot
shy cosmos
#

Hi #🔧|finetune! Our first forum is going live with the addition of #1047197565365538826!

This forum will be a supplement to #🔧|finetune where users can share their custom models with the community. Creators can add tags to their post indicating the category the model falls under, for example, an anime-centric model focused on a specific character would use the anime & character tag. Please read the pinned post for guidelines before submitting your model! ❤️

heavy lynx
#

How does 16gb of vram compare to 24gb for finetuning the model? Can I run out of memory during training?

lost mauve
flat herald
proven wave
flat herald
#

Oh alright, thank you, I thought of it after seeing that

proven wave
#

You get 100 free credits after signing up and 1 free credit per hour after that

flat herald
#

Thank you

#

I got it

west slate
# grave carbon

go to your v2-inference-v.yaml, ctrl-c, ctrl-v, name it the same as your model with .yaml
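If you'd rather script that copy-and-rename step, a minimal sketch could look like this. The checkpoint name `my-768-model.ckpt` and the helper `copy_config_for_checkpoint` are made up for illustration, and the demo runs on stand-in files in a temp folder rather than a real install.

```python
import shutil
import tempfile
from pathlib import Path


def copy_config_for_checkpoint(config_path, checkpoint_path):
    """Copy an inference YAML next to a checkpoint, named after the checkpoint."""
    config_path = Path(config_path)
    checkpoint_path = Path(checkpoint_path)
    target = checkpoint_path.with_suffix(".yaml")  # same name, .yaml extension
    shutil.copyfile(config_path, target)
    return target


# Demo in a throwaway directory with stand-in files (names are hypothetical).
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    config = tmp / "v2-inference-v.yaml"
    config.write_text("model: {}\n")          # placeholder content only
    checkpoint = tmp / "my-768-model.ckpt"
    checkpoint.write_bytes(b"")               # empty stand-in for a real checkpoint
    created = copy_config_for_checkpoint(config, checkpoint)
    print(created.name)
```

On a real setup you'd point the two paths at your actual config and model files instead.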

grave carbon
west slate
#

he put a post on reddit saying that it needs way more training steps than before

grave carbon
#

hmm thanks

#

I lowered my train encoder to 60%

#

it says to keep it 100%

grave carbon
#

So I put no in the contains_faces section...

#

I am doing 1500 steps, 2000, 2500, 3000 and 3500 checkpoints

#

I put captions to all my images

#

like

#

token_selfie_smiling_with_a_white_shirt.jpg

#

100% train text encoder

#

fp16

#

768 model and 768 images

vivid hinge
#

Guys I'm sorry if this is super basic but is there a way to set the "weight" of a negative prompt? Like.. I see what the model is doing, and it is removing too much of the features that I would like to see less of. I'm using the stable diffusion UI

#

I have tried it like "something:0.1" but the weight doesn't seem to matter at all. If I set it to "something:0.0000001" or "something:0.99" it seemingly results in roughly the same effect

vivid hinge
heavy lynx
#

How does 16gb of vram compare to 24gb for finetuning the model? Can I run out of memory during training?

weary knot
#

16gb requires a lot of tricks to run well

#

it can dreambooth and textual inversion, for sure, but I'm not sure personally if it can finetune the whole model

abstract latch
#

is it normal for the loss to go up and down from .2 to .1, currently at step 500/15000?

#

and which one is better? higher or lower loss?

split acorn
#

some training actually uses just above 16GB of VRAM, like dream artist will sometimes zap up just over 16GB of VRAM

#

and then various dreambooth options can take the full 24GB VRAM

abstract latch
#

well I'm using Automatic1111's webui to do SD stuff

tawny inlet
#

Hey guys, Does anyone have any idea on how to train Dreambooth on 8GB VRAM??

tough gazelle
#

Don't believe you can

#

I think the absolute minimum is 10GB on Linux

tawny inlet
#

🥲

#

then what's the point of SD2.0?? I guess people like me need to depend on someone unknown in this world to train a subject or style and upload it to the Internet.

stuck arrow
#

so in the next few weeks you'll see people that do have enough VRAM finetuning v2.0 instead of v1.5 and that ideally will produce better quality weights.

tawny inlet
#

okky

fast crater
#

With dreambooth on a local system (via automatic1111) is there any harm in just adding more and more concepts to a model?

so for example, I train 1.5 model a person (me david eastwick)... i get it to perfectly make me.
I then take that model (lets call it 1.5dave) and train it on my mate josh..
I then take that model (lets call it 1.5daveJosh) and I train it on my mate craig..
I then take that model (lets call it 1.5JoshDaveCraig) and train it on a specific art style.

Now would the result of that model (lets call it 1.5JoshDaveCraigSpeghetti-art) still know how to make images of david eastwick? would it still be able to make an image of the statue of liberty?

at what point does training/finetuning degrade a model?

#

can i just go on adding things to a model as i like ? i mean they are trained on billions ...but guessing not through dreambooth

fast crater
#

anyone know? (if anyone does, then feel free to ping me)

eternal hollow
#

Does dreambooth diffuse all images you give at once or does it diffuse them one after the other?

hot breach
#

it trains your batch_size number of images at one time, though if you use dreambooth it usually batches one training image and one regularization per batch_size as well

fast crater
#

but is there degradation the more you train it?

weary knot
fast crater
eternal hollow
#

Interested in stylized diffusion

fast crater
#

or using a style... for that huggingface will be your go to

#

gnollingcase is a good style to try

eternal hollow
weary knot
#

thanks

fast crater
cyan wedge
#

Hey guys, could I have someone who is really experienced with the software shoot me a message please, need some advice

fast crater
# weary knot Please post if it worked, I'm curious now.

my test just failed
I trained with just me until I was perfected
I then took that model and trained my mate craig until he was perfected

I then Tried to generate craig with the end result... perfect
I tried to generate me with that result I got craig... even though i was only using my token

zealous dragon
#

Do you think I could finetune Stable Diffusion to do something like beard/no beard? Or are GANs still the bee's knees for image-to-image translation? Like, I think I want to do Img2Img with a prompt like "Man without a beard" to remove beards.

Or is finetuning better just for styles or something? Any good guides you could point me towards?

delicate stream
#

but yes you could fine tune a model technically

#

as for guides... I'm not sure, it would depend on what you want to do and what you think is the easiest way to do it.

ancient gale
#

(Hope this is the right channel for this) This is an image2image with the prompt "girl, cat ears, Yoji Shinkawa" Everything about this is almost perfect but the hair on the left and right are supposed to be pig-tails. How do I go about fixing this with stablediff?

#

I thought all I had to do was just use the same seed and change/add a word but that changes the entire image still :C Like could I photoshop in some pigtails over and then run it again with the same seed or something? Sadly now whenever I try to do it with the same seed it just spits out a completely dif image idk what I did

ancient gale
#

I did but it more just kinda smears hair around, I didn't exactly know what I was doing ig

vivid hinge
#

It's basically the butterfly effect in practice

vivid hinge
#

Like, generate a couple pics and mash 'em together

ancient gale
#

either way sadly I lost the seed, for some reason it generates a completely dif image every time now so rip

hazy schooner
#

Do textual inversions work the same on v2 as they did on v1.x? Like, can I just use automatic1111 and train on a small library of 768x768 images with the training tab? Or do I need another repo/use another method to 'shape' SD 2

vivid mural
zealous dragon
# delicate stream In that case why not do `(((beard)))` in the negative prompt?

What I mean to ask is specifically do you think it could perform well (look like the original picture with only a removal of a beard)? I'm trying to understand if this task requires fine tuning or not. For example, I tried img2img, queried CLIP for a prompt, then removed "with a beard" from the original prompt and added ((beard)) in the negative prompt. It definitely gave me a man with no beard in the same spot with similar colors (sometimes his skin was black, sometimes not), but it was a DIFFERENT man. I can see you're getting amazing results, so that's why I'm asking you specifically because you understand what finetuning can and can't accomplish.

old igloo
tough gazelle
#

When you merge models it does exactly that, it merges them together. So the two distinct people wouldn't feature anymore you'd get a sort of hybrid of the two.

If you want to train multiple subjects you need to do them all together really. Unless you can find something that lets you actually resume training and not just stick the new stuff on top.

old igloo
tough gazelle
#

Some people like merging the styles to see what happens

old igloo
#

Also, the model obviously already has the full set below the added subject, so what happens to all of that data when you merge two models that share that base?

tough gazelle
#

You have some control over it, as you can weight it to either side. But it's not suitable if you want to stick two single-subject models together

old igloo
#

How so? Data from 2.3 billion images in both models. How does it merge all of that data together?

tough gazelle
#

It's taking weight from both models and balancing them out between the values you pick. So maybe 50% of the weights from each model.

old igloo
#

That sounds like a theory

tough gazelle
#

It's what it's doing. The code is open source you can go look at a dreambooth trainer.

#

The ckpt files contain a bunch of weights that tell it how to control the noise generation for certain items. When you train a model, you're telling it how to produce the right noise for your images.

When you have 2 different models and you merge them it gives a weighted average for the new model.

Why do you think the file size stays the same and doesn't double.
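Here is a tiny sketch of what a weighted-average merge means in practice. It uses plain lists as stand-ins for the tensors in a ckpt, and `merge_checkpoints` plus the layer names are invented for the example, so treat it as an illustration of the idea rather than any particular merger's code.

```python
def merge_checkpoints(weights_a, weights_b, alpha=0.5):
    """Return (1 - alpha) * A + alpha * B for every weight the two models share."""
    if weights_a.keys() != weights_b.keys():
        raise ValueError("models must share the same architecture (same keys)")
    merged = {}
    for name in weights_a:
        a, b = weights_a[name], weights_b[name]
        # Element-wise blend: the result has exactly as many numbers as each input.
        merged[name] = [(1 - alpha) * x + alpha * y for x, y in zip(a, b)]
    return merged


# Two toy "models" with identical layout but different values.
model_a = {"layer.weight": [0.0, 2.0, 4.0], "layer.bias": [1.0]}
model_b = {"layer.weight": [2.0, 2.0, 0.0], "layer.bias": [3.0]}

merged = merge_checkpoints(model_a, model_b, alpha=0.5)
print(merged)
```

Every weight gets blended in place, which is why the merged file is the same size as either input and why two separately trained subjects tend to smear into each other.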

old igloo
#

So then what is emad suggesting the community does to fix the borked 2.0 model? The implication is that we can retrain it on all of the missing subjects. What you are suggesting is that is impossible.

#

When I finetune a model on a single subject with token xyz, it adds just a tiny bit of data into the model, yes? And that tiny bit of data doesn't wreck the data from billions of images that was already in the model, correct?

#

So then why would taking that model that was trained on xyz lose any data about xyz if I did a subsequent finetuning against subject abc?

#

Is the data for xyz stored in a special compartment that gets shared with abc when I do a subsequent training?

#

And if that is indeed the case, then is it safe to assume that dreambooth training is not an exact reproduction of the process that the model underwent in its initial training?

tough gazelle
#

Dreambooth isn't the only way to do training. I suspect they'll release tools that help people train models from scratch.

old igloo
#

Fair enough.

ancient gale
#

what's the 7 part do

tough gazelle
#

I'm talking from experience with dreambooth models too and merging them caused the blending. But training them all in 1 session worked.

old igloo
#

Understood. Separate from that conversation, do you know the answer to my last question regarding dreambooth compared to the original training?

vivid mural
ancient gale
#

oh sick! I'll keep that in mind

tough gazelle
# old igloo Understood. Separate from that conversation, do you know the answer to my last q...

So when you train something brand new, and you give it a token name so SD understands what it is, you also include regularisation images, so it tells SD that this stuff should still be in the model and it won't overwrite it.

You'd still get different results from a base model, but it wouldn't completely wreck that data.

In theory if you had 1 model you trained and then you wanted to add something into it, if you trained a new model, but with regularisation data from the old model, you might be able to keep both that way, but I'm not sure how well that would work.
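A rough sketch of how regularisation images enter the training objective: the loss on your own images is added to a loss on the class images, so the model is penalised for drifting away from what it already knew. The function names, the `prior_weight` parameter and the toy numbers are all assumptions for illustration, not code from any specific trainer.

```python
def mse(pred, target):
    """Mean squared error between two equal-length lists of numbers."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def dreambooth_style_loss(instance_pred, instance_target,
                          class_pred, class_target, prior_weight=1.0):
    """Instance loss plus a weighted loss on regularisation (class) images."""
    instance_loss = mse(instance_pred, instance_target)  # learn the new subject
    prior_loss = mse(class_pred, class_target)           # keep the existing class
    return instance_loss + prior_weight * prior_loss


# Toy numbers standing in for model predictions and their targets.
with_prior = dreambooth_style_loss([0.0, 1.0], [1.0, 1.0], [2.0, 2.0], [2.0, 4.0])
without_prior = dreambooth_style_loss([0.0, 1.0], [1.0, 1.0], [2.0, 2.0], [2.0, 4.0],
                                      prior_weight=0.0)
print(with_prior, without_prior)
```

With `prior_weight=0.0` only the new subject counts, which is the setting where forgetting is most likely.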

old igloo
#

So Dreambooth is doing finetuning. Is that quite different from what is done when training the initial model on 2.3 billion images?

tough gazelle
#

Yeah

#

With Dreambooth you're supplementing what's already there

old igloo
fast crater
tough gazelle
#

I don't know the exact way they do training from scratch, I presume it would be fairly similar.

#

You've got your text encoder and you're telling the text encoder with images and captions what certain things are.

You do that both with dreambooth and from scratch

fast crater
tough gazelle
#

The Waifu Diffusion model for example, is created by resuming from models, it's not dreambooth.

Although I don't know exactly how it's done.

old igloo
#

There's a reason I'm asking. There's a guy on FB attempting to prove that the model does in fact contain complete copies of images. He used dreambooth to finetune on a single image with 1000 steps, so it's overfitted AF, and then he shows that now he can create a "perfect" duplicate of that image by prompting SD with the tag. I think this is an invalid experiment, manipulated to further the agenda of the AI-haters, and is obviously not reproducible with a dataset of 2.3 billion images.

fast crater
tough gazelle
old igloo
#

For what it's worth, I know that. I just wanted to better understand the difference between finetuning and initial training in that sense.

tough gazelle
#

I think the only real difference is with finetuning you're modifying the weights already there, and sort of moving them around so they create what you want them to.

#

From scratch there's no weights to begin with so you have to teach it from scratch

#

Like you can fine tune a particular type of clothing and because the main model knows people, it knows clothes and it knows people wear clothes, it can apply that type of clothing to people.

#

But if you trained a model from scratch with clothes, it would only do clothes, just on their own. Because it has no perspective of anything else.

#

So if this person is training an image over and over, it's not even just his image it's learning from, it's using all the other artwork that are in the model already.

rough marten
#

Hello, beginner here - what would be the most obvious use case for fine tuning vs other ways to make a model?

cunning night
old igloo
#

Such as my little brother as the Hulk...

stoic hedge
#

Do we have a tutorial on using Dreambooth?

#

Or Dreambooth on Automatic1111

#

?

split acorn
#

Yep!

#

One sec, I'll grab one

#

Dreambooth local training has finally been implemented into Automatic 1111's Stable Diffusion repository, meaning that you can now use this amazing Google’s AI technology to train a stable diffusion model with your own images. You can train a character, an object, a style, or anything you want! There is also a new option that allows you to use D...

#

I've been using that one, although the extension changed its UI as of late

#

so it's a little outdated, but the concepts remain the same

stoic hedge
#

Wow, awesomeee, thanks!!!

split acorn
#

that's for local using Automatic1111

#

np!

true shale
#

Did anyone manage to convert diffusers weights based on SD2.0 to a ckpt usable in auto's web-ui?
I used this script https://github.com/huggingface/diffusers/blob/main/scripts/convert_diffusers_to_original_stable_diffusion.py
but it gave me completely different results using the exact same parameters in the web-ui.


true shale
#

And is it normal that I never manage to finetune the SD 2.0 base 512 model under 0.25 or 0.24 loss value, no matter what I change (learning rate, steps, training images, class/regularization images, etc...). What am I doing wrong? I am training locally with the diffusers repo. Does the loss value actually matter for finetuning SD 2.0??

zealous dragon
#

Oh, when you guys say "finetune" you mean Dreambooth?

jaunty robin
#

Hey guys, if I feed an even mix of 3 different faces into a training model, can I expect the output to be a combination of all 3 faces?

#

Or, is there a better way to achieve that?

#

I'm trying to combine my face with my brothers, into one model

#

Would it be better to train 3 models individually, using the same token name, and then combining them with an even weight?

median sun
#

has someone exp. with everydream? Is it worth using to train models?

jaunty robin
tawny inlet
#

Hey guys, It is possible to fine tune Stable Diffusion using Dreambooth for Whole Category like for example Anime. Not just single Object or Style.

weary knot
#

not with dreambooth, I think

#

but there are ways to finetune the whole thing

#

also there are models for anime

tawny inlet
#

EveryDream won't run for me, Locally seems impossible, and not even for colab...I guess I need to stick with Dreambooth for now...

weary knot
#

=/ yeah those things can take a lot of compute

#

if you have the money, you can try an instance at e.g. lambdalabs

stoic hedge
uncut cypress
#

Hi guys! What do you recommend for fine tuning on interior design projects? To make something similar to interiorai.com

split acorn
limpid briar
#

I've been working on a training for a bit, and I'm running into an issue where my hypernetwork is doing a great job of getting the correct face shape/type fitted onto a main subject, but any medium-to-small faces don't get the same treatment. I'm curious if it is possible to include transparent images in a meaningful way in a hypernetwork so that it could have smaller faces without wiping out backgrounds because it's just a tiny floating face in empty space.

median sun
#

damn everydream needs 24gb....

limpid briar
median sun
#

do 32 and 48gb gpus work too?

#

like the AMD Radeon Pro W6800 has 32gb and doesn't cost that much more compared to a nvidia rtx 3090

weary garnet
#

Is there any api that I can train my face and it returns images? I want something official stable for non-financial illegal actions

jaunty robin
#

Hey guys, have you found in your testing for training that women seem to produce more "stable" models than men do? I'm not sure what training parameters are wrong but the women I've trained a model on come out looking very consistent

#

But then all the photos that I try to generate from men come out with weird, uncanny valley faces

#

does anyone have any advice on how to solve this, and get more consistent faces with male trained models?

patent aspen
tawny inlet
#

hey guys, does FP16 half precision training lower the quality?

tired nest
maiden grail
#

Does anyone know if there are any problems with merging together, like 100 models?

My plan is to create a BUNCH of finetuned dreambooth models, and then combine them together with merge model.

But I am wondering whether there are any issues with merge model at scale. Like, as long as I use a DIFFERENT keyword PER thing that I am fine tuning the model on, that should be fine, right?

I'm doing this because a DB model, is like 5GB, but it is still the same size if I combine them together.

weary knot
#

merging dreambooth in general doesn't work, as far as I know

#

I had a conversation in this chat a while back about this, hang on

maiden grail
#

Is it because the REST of the dreambooth model, gets copied/ doubled?

I didn't try too many merges, but I started to see some worse results... and wondering if other people saw this

weary knot
#

yeah, it's expected to get worse results

#

because dreambooth trains the weights of the model itself

#

so it changes/forgets something every time

maiden grail
#

Hmmm, there has got to be a solution for this.....

Like, let's say I trained ONE dreambooth model on ALL of my data, but I associated a DIFFERENT word per image type.

Basically, I want to say "MyModelSword" and "MyModelShield", and train both of these keywords at once.

I thought you had to merge models for this. Is there another way to train multiple words at once?

#

Right now, in the UI, you can only put in 1 word/phrase, right? And that word is constant?

heavy lynx
#

Any methods on how to not run out of vram for larger model training/finetuning?

weary knot
#

that I know of, there are no methods... you could try combining textual inversion embeddings for different things

#

would probably get mixed results, but it's the best bet

#

I'm sure there are ppl working on that problem, from the amount of times I hear this request xD but so far no tools I know of

#

emphasis on "that I know of"

maiden grail
#

Or, it has multiple concepts now? And thats different?

weary knot
maiden grail
maiden grail
#

I think I am just going to have to test this out.... this is really a needed feature.... like why not train something on multiple instance prompts?

weary knot
#

because it doesn't work 🙃
It's not just writing the program, it's a research effort

maiden grail
#

Why wouldn't it work? The entire point of these models are that they have multiple concepts already, right? Stable diffusion knows what a "man" and a "woman" is, and those are both different instance prompts.

Or is it something to do, with fine tuning? Fine tuning a model, on multiple instance prompts doesn't sound that different?

weary knot
#

even though the model has seen a lot of things, it has not seen everything. Like, it has not seen my dog. If I want a picture of my dog, I need to change the model so it adapts to that new concept. That's kinda the idea behind finetuning (dreambooth, for example, is a way to achieve that). The problem is that the model can only bend so far before it breaks

#

if you try too many things, the model will try to do them all at once and won't be precise enough in any of them

#

there may be ways of circumventing that, but then one needs to have a grasp on the inner mechanism to find exactly what needs to be changed

#

which is a big effort by itself

jaunty robin
#

Is it possible to tag images within this method? Or is there a better method that I can be using to train models?

#

I'm still figuring this tech out, I know I can tell it the instance and classification tokens, but I don't see anywhere to actually detail what the separate images are of

patent aspen
#

ahh yeah i'm not sure for dreambooth actually

#

it's an option in textual inversion / embeddings in automatic1111

jaunty robin
#

Is that something you'd recommend I research? I'm not really attached to one training method right now haha, I just want to make high quality models of my friends

#

I'm having a loooooot of fun taking them around the world on photoshoots haha

#

This picture is almost impossible to distinguish from reality, and it pisses me off because I didn't write down my training settings (I am now!) and I don't know how or why the model is so damn accurate

#

And I can't even get my own face to look that good

#

I even tried training 3 separate models on a ton of images of my face and combining them into one model but it still gives me caveman chin and forehead sometimes

weary knot
#

oh duh you're here already

jaunty robin
#

Sure am!

#

I'm about a week in to learning all of this stuff right now, and I've only been using the link I posted above for training as my own GPU (a 1660ti) can't handle training locally

weary knot
#

well, if you're up to read a lot, there's a big discussion thread with tips on textual inversion, it mentions things like taking care with shadows or you'll get a weird beard, etc.

jaunty robin
#

Haha yes please, I'm not here for handouts 😛 I am all about learning as much as I can

weary knot
#

also, if you want pictures of people, dreambooth is the best method, I think. Is that link enough to do dreambooth?

weary knot
jaunty robin
#

Ah, thank you 🙂

#

Yeah the link above does dreambooth, but I'm on the free version of Colab so I'm limited to 15gb of Vram

weary knot
#

by the way, are you trying to get more than one person in the same model? That's tricky, as consecutive dreambooth trainings tend to fail

jaunty robin
#

No just the one for now, I'll try two when I get comfortable with one, lol

weary knot
#

yeah sure xD

jaunty robin
#

I'm very confused about all the settings with dreambooth though. Batch size, gradient checkpointing, 8bit adam, they're all terms I'd like to understand so I know when to use them

weary knot
#

oh so

#

you're already using it, I think, as you mention 8 bit adam

jaunty robin
#

The Colab I linked above is using this implementation, yes

#

I tried to run it locally and CUDA runs out of memory every time, even in CPU mode with a 3800x, the ol' 1660 just can't handle it

weary knot
#

yeah. You're in a good path already

jaunty robin
#

Okay, that's good to know 🙂

#

So I guess I just need some better understanding on how to actually use this tool, and when to use which settings

weary knot
#

so, the problem is that your face still looks bad, huh...

#

yeah. You just need to tweak the parameters

jaunty robin
#

Women seem to come out consistently

#

Men seem to have slightly different looking faces every time

#

Like it generates a bunch of lookalikes XD

weary knot
#

are all of those generated? They don't look that bad to me, is the problem that they're in black-and-white?

jaunty robin
#

Oh yeah they're great! I love the quality, and I've been working on some scripting for random cameras and locations and stuff, that's not the issue, the issue is the consistency between faces

jaunty robin
jaunty robin
#

Okay, I'm understanding through typing this out, my issue isn't generating faces, it's getting the training to provide consistent faces, which means my issue is probably more with fine tuning than the general software

weary knot
#

yeah

#

Ctrl+F for a comment from Oct 8

#

"My most effective way to train textual-inversion to reproduce people faces is..."

#

even though textual inversion is another method, it is similar to dreambooth in some aspects

#

so it might help

jaunty robin
#

Thank you very much, that link has a ton of good data!

#

I thought, originally, that textual inversion was something you added on top of your already existing CKPT model file

#

And it kind of "duct tapes" the cracks in training through very strict descriptions of what the subject actually is

weary knot
#

well, not quite. I could go into the detail if you'd like. I like talking haha. Or you can read their project page/paper

jaunty robin
#

If you don't mind, please!

#

Let me steal as much of your knowledge as possible hahahaha

weary knot
#

hahaa

jaunty robin
#

God what a superpower that would be

weary knot
#

yeah xD

#

language is kinda that, right? The ability to transfer knowledge from one mind to the other. Anyway,

#

textual inversion

#

when you type in the words, they first get processed by something called CLIP (or OpenCLIP, for stable diffusion 2.0)

#

each word gets transformed into a code of sorts

#

which is a bunch of numbers (a vector of floats)

#

and THOSE numbers are what the visual module uses to know what to generate

jaunty robin
#

And that is what is being displayed here, correct?

#

Tokens that CLIP will use for generation?

weary knot
#

yeah, so. The words get split into tokens (that's a simple substitution, almost 1-to-1) and then the tokens go into CLIP to get replaced by embeds

#

I was talking about the embeds, which are codes generated by the CLIP neural network

#

tokens are integers, like 1958, 287, etc.

#

embeds are a sequence of floating-point numbers, one vector per token
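To picture the tokens-versus-embeds split, here is a toy sketch. The five-word vocabulary, the 4-number vectors and the random values are all made up; a real tokenizer splits text into sub-words and CLIP also runs a whole neural network over the vectors, so this only shows the lookup part.

```python
import random

# Toy vocabulary: word -> integer token id (real tokenizers split into sub-words).
VOCAB = {"a": 0, "photo": 1, "of": 2, "face": 3, "<unk>": 4}
EMBED_DIM = 4

random.seed(0)
# One vector of floats per token id; in a real model these values are learned.
EMBEDDING_TABLE = [
    [random.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in VOCAB
]


def tokenize(text):
    """Words -> integer ids; unknown words fall back to the <unk> id."""
    return [VOCAB.get(word, VOCAB["<unk>"]) for word in text.lower().split()]


def embed(token_ids):
    """Integer ids -> vectors, by looking each id up in the table."""
    return [EMBEDDING_TABLE[t] for t in token_ids]


tokens = tokenize("a photo of face")
vectors = embed(tokens)
print(tokens)
print(vectors[3])  # the vector that stands for "face"
```

Note how a word the vocabulary has never seen collapses to `<unk>`: there is no ready-made id for your face, which is the gap textual inversion works around.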

jaunty robin
#

Okay, with you so far

weary knot
#

it's simple to get a token, not that simple to get an embed. Calculating the embeds is CLIP's whole job, basically

#

and then the embeds turn into images. cool. SO,

#

the genius idea behind textual inversion is

#

there are more embeds than tokens

#

like, possible embed combinations

#

so textual inversion picks something like your face

#

and finds the correct embedding to generate your face, even though there is no token -- no "word" -- for your face that the model knows about

jaunty robin
#

Okay, so I type words into Stable Diffusion. Those are split into tokens and then each token is fed into CLIP to get an embed, and the image is the weighted difference of those embeds.

weary knot
#

hmmm

#

almost

#

you got it right until you said that the image is the weighted difference

jaunty robin
#

Yeah I knew that was an incorrect guess

#

I need to understand the different programming language terms better

weary knot
#

the actual process from embeds to image is very complex, and it's the whole work of making something like stable diffusion

jaunty robin
#

Haha right okay

weary knot
#

so just think of a black box for now haha

#

it's a neural network, in sum

jaunty robin
#

Okay, embed goes into magic box, magic box makes picture

weary knot
#

yeah

#

so what textual inversion does is

#

it starts with a general embed, like "face"

#

and makes a random-ish picture

#

and then it compares to your examples

#

it can calculate how far it is from matching your examples

#

so it uses that knowledge to adjust itself and try again

#

after many cycles, it finds the embed that generates your face (or any object)
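
A minimal sketch of that loop, under heavy assumptions: the "model" here is a tiny fixed linear map standing in for the frozen image generator, and the loss is a plain squared difference. Actual textual inversion optimizes against the diffusion model's training loss, so treat this only as an illustration of "adjust the embed, leave the model alone".

```python
# Toy stand-in for the frozen generator: 3 "pixels" from a 2-number embed.
FROZEN_WEIGHTS = [[1.0, 0.5], [-0.5, 1.0], [0.25, 0.25]]

def generate(embed):
    """Frozen model: embed -> 'image' (never modified during training)."""
    return [sum(w * e for w, e in zip(row, embed)) for row in FROZEN_WEIGHTS]

def loss_and_grad(embed, target):
    """Squared error against the example, and its gradient w.r.t. the embed."""
    diff = [o - t for o, t in zip(generate(embed), target)]
    loss = sum(d * d for d in diff)
    grad = [sum(2 * d * row[j] for d, row in zip(diff, FROZEN_WEIGHTS))
            for j in range(len(embed))]
    return loss, grad

def invert(target, start, lr=0.1, steps=200):
    """Repeatedly nudge the embed (and only the embed) toward the target."""
    embed = list(start)
    for _ in range(steps):
        _, grad = loss_and_grad(embed, target)
        embed = [e - lr * g for e, g in zip(embed, grad)]
    return embed

target_image = generate([0.8, -0.3])  # pretend this is your example photo
learned = invert(target_image, start=[0.0, 0.0])
final_loss, _ = loss_and_grad(learned, target_image)
print(learned, final_loss)
```

Dreambooth, by contrast, would also update the equivalent of `FROZEN_WEIGHTS`.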

jaunty robin
#

Okay, so kind of like anchor points?

weary knot
#

hmmm somewhat, yeah

jaunty robin
#

As in, generate a picture, look at the mouth, mouth doesn't match input from user, re-align to match

weary knot
#

oh

#

yes, but more automatic

#

since it works for a cup of tea, for instance

#

or any object

jaunty robin
#

Okay, this makes a lot of sense

weary knot
#

so that's why it's called textual inversion

jaunty robin
#

Or at least it's starting to haha

weary knot
#

because it starts from the output and makes its way back to deduce what is the (textual) embedding that would generate that output

#

dreambooth does something similar, but it also changes (trains) the model itself to get as close as possible

#

textual inversion is not well suited to reproducing a specific identity, for instance

#

dreambooth CAN do that, but it is not guaranteed I guess. From your results xD

#

though you may find ways to tweak the settings and make it better

jaunty robin
#

Okay, so going back to my duct tape metaphor earlier

weary knot
#

ok

jaunty robin
#

If TI is working backwards to deduce what the model should look like, and dreambooth can generate models that are somewhat close to accurate but not quite

#

Can they be used in conjunction so that TI has a more stable starting position?

weary knot
#

perhaps if you use TI first, and then dreambooth from the starting position of TI, it might be improved

#

but how to improve Dreambooth or TI are generally complicated matters, which is what I was trying to say to someone else earlier. It's certainly possible and certainly hard haha

weary knot
jaunty robin
#

I once spent 16 hours trying to get my 3D prints to look, at best, 5% better

#

Certainly possible but certainly hard is what I'm all about XD

weary knot
#

ha I like you

jaunty robin
#

Thank you 🙂

#

So, I've tried a very basic TI test before

weary knot
#

so, yeah, if you want to try that

#

ok go on

jaunty robin
#

And I've noticed while loading the webui, in the CLI

#

it says "loading embeds"

#

But those files are NOT ckpt files, which is my current workflow, correct?

#

They don't have to be selected as a style

weary knot
#

yup

#

they are a few KB files

jaunty robin
#

Just "my" version of 1111 now "knows" what that TI data is, and can implement it on any CKPT

weary knot
#

yes

jaunty robin
#

So if I load a disney style

#

And train a TI on Redw04

#

And then ask for Redw04 in Disney style

#

That all goes in the magic box and it spits out the expected output?

weary knot
#

in theory, yes

#

but so what, what's the idea?

weary knot
#

dreambooth does that by default

jaunty robin
#

Okay, well this is a TON of useful information, and I think there's really not much left to do but to try and load up a TI model of myself and see what happens!

#

I really appreciate you taking the time to help me out with this

weary knot
#

by the way, an example

#

Of that Disney thing

weary knot
jaunty robin
weary knot
#

Kinda xD

tawny inlet
#

Hey guys, do you guys know how to prevent Overfitting with dreambooth training?

maiden grail
tough gazelle
#

If you train everything in one go into 1 model and you give them different instance names then it could work

#

SD still has the general tendency to try merge multiple objects in a scene together

#

Especially with people.

maiden grail
tough gazelle
#

Yeah you can do that, I'm guessing you mean like if you put "A Man and a Woman" into a prompt to try to get both.

#

It sometimes works, but SD can be a little odd with it

hidden hatch
#

I haven't done much programming or development but would like to train SD 2 on an artist (from the 19th century). There is a lot of his work in the public domain. Any suggestions for web crawlers or ways to download training data?

tawny inlet
hidden hatch
#

thank you!

jaunty robin
#

How much of an effect does learning rate have on model training?

#

I just noticed the Colab I am using has a learning rate of 1e-6 whereas Auto1111's training uses a default of 5e-3

#

Is a higher or lower number what I want for more accurate models?
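
A toy sketch of what the learning rate does, assuming plain gradient descent on f(x) = x**2 rather than any real Stable Diffusion training. It only shows the general tradeoff: too low barely moves, too high overshoots. It does not say which value is right for a given trainer, and the two defaults mentioned above may not be directly comparable if they belong to different kinds of training.

```python
def descend(lr, steps=50, start=10.0):
    """Minimise f(x) = x**2 by gradient descent; return the final distance from 0."""
    x = start
    for _ in range(steps):
        x -= lr * 2 * x  # the gradient of x**2 is 2x
    return abs(x)

# Compare a very small, a small, a moderate and a too-large step size.
for lr in (1e-6, 5e-3, 0.4, 1.1):
    print(f"lr={lr:g}: distance from the minimum after 50 steps = {descend(lr):.4g}")
```

In this toy, neither "higher" nor "lower" is more accurate by itself; the rate has to suit the problem and the number of steps.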

fast crater
#

@fiery frigate this channel is the training one chap

fiery frigate
fast crater
#

these guys will know way more than i will ever know (generally speaking)

fiery frigate
#

i asked in tech support

#

will it be possible to train using the dreambooth extension on a 3070ti 8gb?

fiery frigate
#

no i mean

#

without the extension @weary knot

#

no sorry

#

with

weary knot
#

Hum okay a sec

fiery frigate
#

from the ui itself

weary knot
#

No wait I don't get it. The ui uses an extension to do dreambooth

fiery frigate
#

but

#

i always get an out of memory error

#

which fricks me out

weary knot
#

Do you want to do it without the ui, with raw code?

#

Hmm

fiery frigate
#

with the ui

#

but the problem is the out of memory

#

sorry for the confusion lol

jaunty robin
#

Make sure you're using 8bitadam mode, in the advanced settings of the extension

#

It should run on 8gb

fiery frigate
#

ye i checked that

#

but i still get the annoying error

jaunty robin
#

Did you try CPU mode?

fiery frigate
#

it's super slow

#

didn't even bother to check

jaunty robin
#

Yeah but it'll run

fiery frigate
#

i wanna run it on my gpu

jaunty robin
#

I believe cpu mode just offloads some of the processing, your card is still doing work

#

Typically it's designed to run on like 16gb of VRAM, so you have to make some tradeoffs with less vram
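
A rough back-of-the-envelope sketch of why 8-bit Adam helps on small cards. Assumptions: the trained network has roughly 860 million weights (an approximate figure for the SD v1 UNet), the optimizer keeps two values per weight, and small bookkeeping overheads are ignored. It counts optimizer state only, not weights, gradients, or activations.

```python
UNET_PARAMS = 860_000_000  # assumption: approximate size of the SD v1 UNet
GIB = 1024 ** 3

def adam_state_gib(num_params, bytes_per_value):
    """Adam keeps two running statistics per trained weight."""
    return 2 * num_params * bytes_per_value / GIB

adam_32bit = adam_state_gib(UNET_PARAMS, 4)  # regular Adam, 4 bytes per value
adam_8bit = adam_state_gib(UNET_PARAMS, 1)   # 8-bit Adam, 1 byte per value
print(f"32-bit Adam states: {adam_32bit:.1f} GiB")
print(f"8-bit Adam states:  {adam_8bit:.1f} GiB")
```

On an 8 GB card, that difference alone can decide whether training fits, which is why the other tradeoffs (CPU offloading, smaller batches) come on top of it.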

fiery frigate
#

but it'll take hours to run

weary knot
jaunty robin
#

Speaking of which @weary knot I can't run TI locally for the same reason

#

8gb minimum

fiery frigate
fiery frigate
#

ye i used that too

#

is there a way to change the scheduler?

jaunty robin
#

But I'm using 10000 steps and 100 images so it still takes about 2 hours to train

#

Not sure I haven't tried

#

I also want to experiment with loading in a 2.0 CKPT

jaunty robin
weary knot
#

2.0 uses a different CLIP. So its TI is not compatible with v1

#

That's also the reason prompts change from 1 to 2
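
A small sketch of the incompatibility, assuming the commonly cited text-embed sizes of 768 numbers for v1 models and 1024 for v2 models. The names `TEXT_EMBED_DIM` and `is_compatible` are made up for this example and are not part of any UI or library.

```python
# Assumed text-encoder vector lengths for each model family.
TEXT_EMBED_DIM = {"sd-v1": 768, "sd-v2": 1024}

def is_compatible(embedding_vector, model_family):
    """A learned embed can only plug into a model expecting the same vector length."""
    return len(embedding_vector) == TEXT_EMBED_DIM[model_family]

v1_embedding = [0.0] * 768  # stand-in for a TI embed trained on a v1 model
print(is_compatible(v1_embedding, "sd-v1"))
print(is_compatible(v1_embedding, "sd-v2"))
```

Matching length is necessary but not sufficient: the two encoders were trained separately, so the same numbers would mean different things in each.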

jaunty robin
#

Oh interesting

#

Is it more flexible in 2.0?

weary knot
jaunty robin
#

It sounded smarter pre-edit 😛

fiery frigate
#

which one trains better: dreambooth from the ui or dreambooth from colab?

weary knot