#🔧｜finetune | Stable Diffusion | Page 1

wild totem Oct 3, 2022, 4:16 PM

#

🔧

golden ruin Oct 3, 2022, 4:21 PM

#

https://tenor.com/view/sabadosad-life-adulting-vida-fine-tune-gif-13456879


uneven flame Oct 3, 2022, 7:41 PM

#

has anyone gotten fine tuning working on a 12gb card?

#

i can get it running, but i have to reduce the size to 256 😭

next bluff Oct 3, 2022, 8:35 PM

#

https://tenor.com/view/jo-crystal-crystal-sync-gif-24834952


covert dragon Oct 3, 2022, 8:45 PM

#

Is there any guidance on how to optimize the rate that stable diffusion generates images? I'd like to know what configuration options make the most difference.

#

After that I want to know what hardware setup is best for running it. I haven't found any articles on that, figured this might be a good place to ask that.

#

I'm pretty much a complete noob with it other than setting up the server on my computer and using it via a plugin in Krita

half folio Oct 3, 2022, 8:49 PM

#

so I came across this:
https://colab.research.google.com/drive/1jYZIbYSmEGWClWuVYFyUR8nUKwW-c91x


lilac helm Oct 4, 2022, 1:45 AM

#

Innnnnnnnteresting, haven't seen or heard of that notebook before

random ocean Oct 4, 2022, 2:14 AM

#

I'm trying to train a model to emulate my own illustration style. Textual inversion isn't really cutting it and dreambooth seems to be geared towards people and objects. I came across this pokemon model and I'm wondering if this method of fine-tuning is better to emulate styles. Thoughts? I'm a dummy when it comes to coding btw.

https://github.com/LambdaLabsML/examples/tree/main/stable-diffusion-finetuning


lilac helm Oct 4, 2022, 2:18 AM

#

random ocean I'm trying to train a model to emulate my own illustration style. Textual invers...

Hardware requirements on that are prohibitive unless you're willing to pay for 6+ hours of finetuning on 2xA6000s, based on the example in that repo

random ocean Oct 4, 2022, 2:21 AM

#

how much is that ballpark?

lilac helm Oct 4, 2022, 2:22 AM

#

You're probably gonna have to do the pricing research yourself since there are a ton of different hosted GPU services out there. But if you read the link you shared you might have an idea, since it does mention how much that 6 hours did cost the author.

sinful venture Oct 4, 2022, 2:52 AM

#

lilac helm Hardware requirements on that are prohibitive unless you're willing to pay for 6...

$10 to train a new model isn't a lot...although their method completely trashed the original model, I think. There's no reason more people couldn't do the same, if not for the technical knowledge barrier.

lilac helm Oct 4, 2022, 2:55 AM

#

Whether $10 is a lot honestly differs from person to person and we shouldn't assume, but I'd say if someone's not familiar with coding or willing to get in the guts to troubleshoot, that $10 is almost certainly going to be burned away, six hours of frustrating wait and problems to fix

tame aurora Oct 4, 2022, 12:25 PM

#

Hey, is anyone able to continue training a model with the original CompVis main.py script with the --resume parameter? For me it seems to go well at first - the proper model is loaded from the checkpoint and the configs are merged as expected but once it gets to trainer.fit() ... it just returns, doesn't train for a single step and the script ends

hollow salmon Oct 4, 2022, 1:41 PM

#

random ocean I'm trying to train a model to emulate my own illustration style. Textual invers...

is it different from dreambooth?

keen pasture Oct 4, 2022, 1:44 PM

#

random ocean I'm trying to train a model to emulate my own illustration style. Textual invers...

I think the so-called "Dreambooth" repos are still worth a try for style training. There are configuration files for that purpose. It doesn't look like many people have tried that though so you might have to do some experiments. But still easier to get going than the other finetuning repos.

random ocean Oct 4, 2022, 5:13 PM

#

keen pasture I think the so-called "Dreambooth" repos are still worth a try for style trainin...

Interesting. Gonna look into those configuration files.

hot breach Oct 4, 2022, 7:21 PM

#

there's a personalized_style.py that should be there, and some slightly different cli params. I think everyone is just doing objects instead of styles, so it's not as well tested

bright obsidian Oct 4, 2022, 7:40 PM

#

is there a website where i can just submit some images and get it finetuned w/ dreambooth?

hollow salmon Oct 4, 2022, 8:08 PM

#

dreambooth vs finetuning?

keen pasture Oct 4, 2022, 8:21 PM

#

bright obsidian is there a website where i can just submit some images and get it finetuned w/ d...

Not yet... But I think that would be a smart business idea for someone... It'll probably exist very soon! It could be done for like $10 or so I think.

somber estuary Oct 4, 2022, 8:35 PM

#

bright obsidian is there a website where i can just submit some images and get it finetuned w/ d...

You can do it for free on colabs already

#

And it's almost as easy as drag, drop, and fire if you rent a GPU from runpod or vast.ai to do it semi-locally.

somber estuary Oct 4, 2022, 8:38 PM

#

random ocean I'm trying to train a model to emulate my own illustration style. Textual invers...

It's true that I've mainly seen DB used for people and objects, but I wouldn't discount its ability to learn a style instead. If TI is iffy on people and objects and great at styles, and DB is great at people and objects, I would be surprised if it wasn't great at styles.

somber estuary Oct 4, 2022, 8:40 PM

#

keen pasture Not yet... But I think that would be a smart business idea for someone... It'll ...

$10 would be price gouging, IMO. It takes like an hour on a Colab using Tesla T4, which rent for $0.29/hr from Google.

#

A cloud service that offers "true" training, like how we got the anime, furry, and pony finetunes, would be very interesting, and could charge in the hundreds to thousands of dollars depending on dataset size...
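
The price figures quoted in this exchange can be checked with a few lines of arithmetic. This is a rough sketch only: the $0.29/hr T4 rate, the one-hour DreamBooth run, and the roughly $10 for 6 hours on 2xA6000s are the numbers mentioned in the chat, not verified prices, and `rental_cost` is a hypothetical helper.

```python
def rental_cost(hours, usd_per_gpu_hour, gpus=1):
    """Total rental cost in USD for a run of `hours` on `gpus` GPUs."""
    return hours * usd_per_gpu_hour * gpus


# DreamBooth on a single T4, using the rate quoted in the chat (assumption).
t4_run = rental_cost(hours=1, usd_per_gpu_hour=0.29)

# Per-GPU-hour rate implied by "about $10 for 6 hours on 2xA6000s" (assumption).
implied_a6000_rate = 10 / (6 * 2)

print(f"T4 DreamBooth run: ${t4_run:.2f}")
print(f"Implied A6000 rate: ${implied_a6000_rate:.2f} per GPU-hour")
```

On those numbers, a $10 hosted DreamBooth service would be charging many times the raw GPU cost, which is the point being made above.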

bright obsidian Oct 4, 2022, 10:05 PM

#

keen pasture Not yet... But I think that would be a smart business idea for someone... It'll ...

i sent a friend req, would love to chat more

jaunty spade Oct 4, 2022, 10:34 PM

#

anyone around to provide guidance on how to train a batch of pictures using the google colab portal? im getting an error message that says ## Convert weights to ckpt to use in web UIs like AUTOMATIC1111.

bright obsidian Oct 4, 2022, 10:39 PM

#

jaunty spade anyone around to provide guidance on how to train a batch of pictures using the ...

which colab are you using?

jaunty spade Oct 4, 2022, 10:39 PM

#

bright obsidian which colab are you using?

DreamBooth_Stable_Diffusion.ipynb

jaunty spade Oct 4, 2022, 10:41 PM

#

bright obsidian which colab are you using?

#

on the last step.. ahhh so close

bright obsidian Oct 4, 2022, 10:41 PM

#

can you link the colab?

jaunty spade Oct 4, 2022, 10:42 PM

#

bright obsidian can you link the colab?

https://colab.research.google.com/github/ShivamShrirao/diffusers/blob/main/examples/dreambooth/DreamBooth_Stable_Diffusion.ipynb#scrollTo=baL22PHzOLeP


bright obsidian Oct 4, 2022, 10:42 PM

#

did you run this cell?

#

#

this is where OUTPUT_DIR is defined

jaunty spade Oct 4, 2022, 10:44 PM

#

just noticed there wasn't a green tick beside that play button. just ran it

bright obsidian Oct 4, 2022, 10:44 PM

#

awesome

#

worst case you can just define OUTPUT_DIR= <put something here>

jaunty spade Oct 4, 2022, 10:45 PM

#

does the tick have to be green on all the play icons?

bright obsidian Oct 4, 2022, 10:45 PM

#

and then run the cell

jaunty spade Oct 4, 2022, 10:45 PM

#

#

should that be run too?

bright obsidian Oct 4, 2022, 10:46 PM

#

should be

jaunty spade Oct 4, 2022, 10:46 PM

#

ok... looks like i skipped a few should be steps

bright obsidian Oct 4, 2022, 10:46 PM

#

sent a friend req

jovial hemlock Oct 4, 2022, 10:50 PM

#

does anybody know if "a character" is a valid training subject?

bright obsidian Oct 4, 2022, 11:04 PM

#

if anybody is interested in using dreambooth to finetune people, i've found this notebook to work well

#

https://github.com/kanewallmann/Dreambooth-Stable-Diffusion


analog sinew Oct 4, 2022, 11:23 PM

#

Is anyone aware of a finetune on just 64x64 images? I'm looking to generate low resolution images, and in my experience going far below 512x512 yields awful results.

bright obsidian Oct 5, 2022, 12:44 AM

#

why do you want 64x64?

#

you can generate 512x then downsample
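
A minimal sketch of the "generate at 512, then downsample" suggestion, assuming a plain box filter (averaging each 8x8 block) on a grayscale image stored as nested lists. The synthetic gradient image stands in for a real generation, and `box_downsample` is a hypothetical helper; an image library's resize function would normally do this job.

```python
def box_downsample(pixels, factor):
    """Shrink a 2D grid of numbers by averaging each factor x factor block."""
    height, width = len(pixels), len(pixels[0])
    if height % factor or width % factor:
        raise ValueError("image dimensions must be divisible by factor")
    out = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            block = [
                pixels[y][x]
                for y in range(top, top + factor)
                for x in range(left, left + factor)
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


# Synthetic 512x512 gradient standing in for a generated image (assumption).
image_512 = [[(x + y) % 256 for x in range(512)] for y in range(512)]
image_64 = box_downsample(image_512, 8)
print(len(image_64), len(image_64[0]))
```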

analog sinew Oct 5, 2022, 12:55 AM

#

gonna attempt to implement dreamfusion3d

#

i imagine finetuning to lower resolutions shouldn't be too hard compared to other finetuning projects

vivid mural Oct 5, 2022, 2:24 AM

#

https://youtu.be/7Lxdk89W2K0


Nerdy Rodent

Textual Inversion - Add Anything To Stable Diffusion!

There's a new tab in the Automatic1111 WebUI for Textual Inversion! According to the documentation it needs 12+ GB VRAM, but it seemed to work in about 8 GB for me. Probably worth giving it a go and seeing what you can add to stable diffusion! Do let me know if it works on your 8GB card ;)


#

This experiment is very insightful

bright obsidian Oct 5, 2022, 2:32 AM

#

analog sinew i imagine finetuning to lower resolutions shouldn't be too hard compared to othe...

would love to chat more, have some relevant exp

#

send me a friend req?

tawdry forge Oct 5, 2022, 2:35 AM

#

before I dive in to setting up an environment I'm trying to understand if I should be trying to set up dream booth or textual inversion. my understanding is that textual inversion can use multiple trained concepts at once but dreambooth can't?

#

also I'm running local on a 3080 so not sure if I can run dreambooth at all

ashen perch Oct 5, 2022, 7:09 AM

#

I hope this is the right channel

#

I'm messing around with Textual Inversion, used a bunch of town art from Heroes of Might and Magic 3. First I used styles.txt as the prompt template, then a custom one (styles_filewords.txt, but with [name] and [filewords] swapped) and fewer steps.
I'm not sure what I'm doing wrong: my first try (with 3000 and 8500 steps, and homm3-style-old with 11000 steps) looks better but doesn't resemble the prompt. My second try looks worse, but the final embedding (3000 steps, homm3) is closer to the prompt.

the prompt was simple: london, 8k, highly detailed, homm3-500 with Prompt S/R replacing homm3-500 with homm3-500, homm3-1000, homm3, homm3-style-3000, homm3-style-8500, homm3-style-old
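
The Prompt S/R comparison described above can be sketched in a few lines. This only imitates the idea (the first entry is the search term, and every entry is substituted in turn); it is not the web UI's actual code, and `prompt_search_replace` is a hypothetical name.

```python
def prompt_search_replace(prompt, values):
    """Return one prompt per value, swapping the first value (the search term) for each entry."""
    search = values[0]
    if search not in prompt:
        raise ValueError(f"search term {search!r} not found in prompt")
    return [prompt.replace(search, value) for value in values]


prompt = "london, 8k, highly detailed, homm3-500"
embeddings = [
    "homm3-500",
    "homm3-1000",
    "homm3",
    "homm3-style-3000",
    "homm3-style-8500",
    "homm3-style-old",
]
variants = prompt_search_replace(prompt, embeddings)
for variant in variants:
    print(variant)
```

Each variant can then be rendered with the same seed and settings, so the only difference between the images is the embedding checkpoint.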

vivid mural Oct 5, 2022, 8:16 AM

#

ashen perch I'm messing around with Textual Inversion, used a bunch of town art from Heroes ...

I really wish to see more of this. Textual inversion is more accessible and I wish people would find more ways to make it succeed.

stone garden Oct 5, 2022, 9:42 AM

#

vivid mural I really wish to see more of this. Textual inversion is more accessible and I wi...

You are right, we need more of these, I'll try to do some too. I wanted to do a where is waldo one, and I'm looking for any other ideas too if you'd like !

vivid mural Oct 5, 2022, 9:52 AM

#

stone garden You are right, we need more of these, I'll try to do some too. I wanted to do a ...

Please 👍👍

Something like the effect of initialization text is something to be explored.

I've tried using one token and long initialization text. When I used it without training, the result of the token is very different than when I put the prompt alone.

stone garden Oct 5, 2022, 10:05 AM

#

vivid mural Please 👍👍 Something like the effect of initialization text is something to be...

Did you experiment a lot on the number of tokens ? I'm not sure where the ratio quality per token starts to be bad in it...
Initialization text, I haven't experimented with, no. It's still quite new to explore to me, I've just skimmed over it for now

vivid mural Oct 5, 2022, 10:10 AM

#

No not much. I've just experimented with cartoon characters, and the results were... As horrible as possible 🥲

mossy oriole Oct 5, 2022, 11:42 AM

#

stone garden Did you experiment a lot on the number of tokens ? I'm not sure where the ratio ...

I judge it like with java memory for minecraft. Half the amount of tokens.

Depending on the (aesthetic) score, the more you pile on, the less will be recognized.

Example 1:
Subject, effect, whatever else, Peter Max - PM is 1661933.16.00

Example 2:
Subject, effect, whatever else, Jordan grimmer - JG is 749135.43.00

It's a big pain in the peach, to balance out artists - I honestly have no idea if I should consider evening out the scores to match high-scored artists.
The difference is 91.279.773 in score.
That would be between
Bartholomeus Strobel 911728.56.00
and
Emil Fuchs 913456.35.00

-> Eggplant, shine, by jordan grimmer, by Emil fuchs, by peter max
-> Eggplant, shine, by jordan grimmer, by Bartholomeus Strobel, by Emil fuchs, by peter max
Additionally, by including Bartholomeus, I start to experience getting a lot more frames with my outputs - Perhaps there's an unintended high-limit with scores?

But, that science project, requires me to have more resources available - so I don't have to think about cost-performance stuff

stone garden Oct 5, 2022, 1:25 PM

#

ok let's do some tests on that waldo 🙂

bleak swallow Oct 5, 2022, 1:41 PM

#

tried using the huggingface textual inversion colab to get the style of first pic, ended up getting second pic instead, feels bad

#

problem is it would take at least another 3 hour training session to figure out if changing some settings can fix the results or not

#

now I remember why I wasn't able to get into machine learning...

stone garden Oct 5, 2022, 1:57 PM

#

that was on how many learning steps ?

bleak swallow Oct 5, 2022, 2:02 PM

#

3000, I left all the parameters at the notebook default

#

honestly it wouldn't surprise me if the style is too weird for TI to even be able to locate it

#

I mean the style probably fails the filtering of the SD training data

stone garden Oct 5, 2022, 2:08 PM

#

I'm still in the dark on TI... it requires testing to really get right, but that takes so much time

#

progress on the waldo training

#

getting closer

bleak swallow Oct 5, 2022, 2:11 PM

#

definitely looks like a waldo

#

although I have a bad feeling that even as the overall picture looks waldo-like, the individual figures in it might stay indistinct blobs forever

stone garden Oct 5, 2022, 2:14 PM

#

it could, especially since this is only TI, but the goal is not to make more "where is waldo", but to extract an embedding for "waldo style"

#

I'd like that intricate style with lots of characters in lots of other styles and subjects

#

I'm training it as a style there, not a subject

bleak swallow Oct 5, 2022, 2:16 PM

#

I see, interesting idea

ashen perch Oct 5, 2022, 4:33 PM

#

ashen perch I'm messing around with Textual Inversion, used a bunch of town art from Heroes ...

Gave this another try, now with img2img, using League of Legends' Summoner's Rift as a base. The prompt was "a forest in mountains with 3 dirt roads, 8k, highly detailed, homm3-500" and these settings

#

#

and the results are confusing for me

#

1000 steps are so good, while the final 3000 steps look overcooked, my previous attempt also looks horrible with 3000, 8500 and 11k steps

ashen perch Oct 5, 2022, 6:13 PM

#

Any suggestions how should I restart training it?

bright obsidian Oct 5, 2022, 7:22 PM

#

is there a browser version of webui?

#

for my frens that don't want to have to download anything

serene condor Oct 5, 2022, 7:36 PM

#

i want to train on 35 photos, what settings should i choose and how long should i expect it to take to finish?

#

should i run anything over 15,000 steps?

rough hamlet Oct 5, 2022, 8:14 PM

#

I've found 12,200 steps and 3 vector tokens works pretty good for faces. haven't tried training any styles yet

#

Here are some samples of my friend. I don't have the original training images on this comp, but the first set of faces is pretty much the same as the training set. Trained with 22 closeups of the face, 12200 steps, init word=man, vectors per token=3, learning rate of 0.005.

gilded crater Oct 6, 2022, 3:15 AM

#

I would advise caution against posting anything child related on any public area. Even if it's your own child.

ESPECIALLY IF ITS YOUR OWN CHILD.

viral jay Oct 6, 2022, 3:21 AM

#

gilded crater I would advise caution against posting anything child related on any public area...

Thanks, makes sense

eternal wing Oct 6, 2022, 3:22 AM

#

Where can I look for details on training in general rather than asking a bunch of questions? Checked pins, but don't see anything

vivid mural Oct 6, 2022, 4:29 AM

#

https://youtu.be/lSUPMSKdfZo


Nerdy Rodent

Dreambooth Vs Dreambooth Vs Textual Inversion

Stable Diffusion. It has a lot of things going for it, but how do some of these things compare? Dreambooth SD? Textual Inversion? Dreambooth Diffusers? Which is better? What does better even mean anyway? All these questions and more are answered right here, right now...


lament moat Oct 6, 2022, 5:04 AM

#

After a long night of futile research, I could use some help. I use the SD Automatic1111 UI on my PC. I have a single .pt textual inversion file. How do I run it (get it to work) on my version of Automatic1111?

hot breach Oct 6, 2022, 5:15 AM

#

lament moat After a long night of futile research, I could use some help. I use the SD Autma...

did you try looking at the wiki for his repo?

ashen perch Oct 6, 2022, 5:21 AM

#

lament moat After a long night of futile research, I could use some help. I use the SD Autma...

Put the pt file in embeddings folder, then you can use it with the file name eg. file is some-random-name, then “a photo of some-random-name” will use the embedding

lament moat Oct 6, 2022, 5:22 AM

#

hot breach did you try looking at the wiki for his repo?

I have looked through it more than once. I see a lot about CREATING a textual inversion, but nothing about simply adding an already created pt file

hot breach Oct 6, 2022, 5:22 AM

#

https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Textual-Inversion#using-pre-trained-embeddings

lament moat Oct 6, 2022, 5:24 AM

#

ashen perch Put the pt file in embeddings folder, then you can use it with the file name eg....

Thanks Assassin, I am using the latest version, but there is no "embeddings" folder that I can find. Should I just create one? Then, if the pt file is dogsinspace.pt, I would prompt "a photo of dogsinspace looking out the window"?

ashen perch Oct 6, 2022, 5:25 AM

#

lament moat Thanks Assassin, I am using the latest version, but there is no "embeddings" fol...

Then create it manually

#

And yes, it works like that
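
The steps described above (create the embeddings folder if it is missing, drop the .pt file in, and use the file name as the prompt token) can be sketched as follows. The folder layout follows what is said in this chat; `install_embedding` is a hypothetical helper, and the demo uses a temporary directory with an empty stand-in file rather than a real web UI install or a real embedding.

```python
import shutil
import tempfile
from pathlib import Path


def install_embedding(pt_file, webui_dir):
    """Copy a .pt embedding into <webui_dir>/embeddings and return its prompt token."""
    embeddings_dir = Path(webui_dir) / "embeddings"
    embeddings_dir.mkdir(parents=True, exist_ok=True)  # create the folder manually if absent
    shutil.copy2(pt_file, embeddings_dir / Path(pt_file).name)
    return Path(pt_file).stem  # the file name without .pt is the token


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "dogsinspace.pt"
    source.write_bytes(b"")  # empty stand-in, not a real embedding
    token = install_embedding(source, Path(tmp) / "stable-diffusion-webui")
    print(f"a photo of {token} looking out the window")
```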

ashen perch Oct 6, 2022, 5:26 AM

#

lament moat Thanks Assassin, I am using the latest version, but there is no "embeddings" fol...

But I think I had one when I’ve downloaded from automatic1111’s webui, because there is an empty txt with a name something like “place embedding here”

#

I’m messing around with Papa Franku and I don’t know what’s wrong, but the results from 2 different training runs both look overcooked to me

#

Filthy-frank-6 used only 1500 steps and 6 vector tokens, the others used only 3, and I think I stopped somewhere between 5000-8000

stone garden Oct 6, 2022, 7:52 AM

#

I did a waldo style, on 30 tokens and trained it a little too much. the results are fun though

📎 waldo-23500.pt

ashen perch Oct 6, 2022, 8:00 AM

#

stone garden I did a waldo style, on 30 tokens and trained it a little too much. the results ...

Why did you choose 30 tokens? Is it better for styles if there are more tokens?

stone garden Oct 6, 2022, 8:02 AM

#

ashen perch Why did you choose 30 tokens? Is it better for styles if there are more tokens?

because I'm navigating blind and didn't try that many tokens yet, I don't know what I'm doing here, just testing to see if it would land better results, using about half of the available tokens

ashen perch Oct 6, 2022, 8:07 AM

#

It’s good to hear that I’m not the only one 😂

lilac helm Oct 6, 2022, 9:41 AM

#

stone garden I did a waldo style, on 30 tokens and trained it a little too much. the results ...

This looks fun, gonna download this!

turbid patrol Oct 6, 2022, 9:42 AM

#

stone garden I did a waldo style, on 30 tokens and trained it a little too much. the results ...

this is neat, i can imagine these hung up on a wall somewhere

#

fun to look at

stone garden Oct 6, 2022, 9:42 AM

#

TI still has quite some fun to show me, I think. I'm just trying it out, especially TI on a style, not a subject

turbid patrol Oct 6, 2022, 9:44 AM

#

i haven't been able to replicate styles too well yet. let us know how your experiments go!

weak sonnet Oct 6, 2022, 10:06 AM

#

i was trying to teach it how to draw a specific swimsuit
it learns how to draw them well, could be better, could be worse
but in the process it includes the most horrible faces i've ever seen: enormous cheekbones and lips, shaved heads, something like neanderthals
and i cannot remove this face even with specific prompting. needless to say there is nothing like that in the initial image set
i tried everything: different vector sizes, initialization texts, different template files and captions, including only images with/without faces
the only thing that helps is lowering the learning rate, but that is not enough
most disturbing is that i get the most pleasant results at 200-500 steps of training. it's not perfect, it's still thinking about the initial prompt and not the pictures i included, but it's nice to look at. and then it only gets worse and worse and worse

stone garden Oct 6, 2022, 11:20 AM

#

rick and morty, 10 tokens, 13500 steps
@mental hatch

📎 rickmorty-13500.pt

#

it's still struggling

#

I'll send a zip with all checkpoints here when I reach 15k

stone garden Oct 6, 2022, 11:48 AM

#

📎 embeddings.zip

#

mental hatch Oct 6, 2022, 12:40 PM

#

It just has issues with R&M

sharp grove Oct 6, 2022, 1:07 PM

#

stone garden it's still struggling

Man SD is not getting on board with burp Rick and Morty at all

half folio Oct 6, 2022, 1:15 PM

#

Is that textual inversion?

high nest Oct 6, 2022, 1:16 PM

#

/dream apple

half folio Oct 6, 2022, 1:16 PM

#

Nevermind, just saw those files

stone garden Oct 6, 2022, 1:20 PM

#

yep it is

rigid starBOT Oct 6, 2022, 1:20 PM

#

@high nest

FAQ: I'm new here, how do I generate images ? Where is the bot ?

Welcome! There is no bot currently to generate your images on Discord. You may want to start by taking a look at the #1014939219904450590 channel. You can access Stable Diffusion in different ways:
1️⃣ The official website, https://beta.dreamstudio.ai/. The easiest and fastest way to access Stable Diffusion, with 200 free credits. For any question on it, you can find help in the #1025467151206854736 channel.
2️⃣ Installing Stable Diffusion on your computer. There are numerous projects that let you do that, and you will find help in the #🤝｜tech-support channel.
3️⃣ Running Stable Diffusion in the cloud, through rented GPU services, using notebooks. You can find lots of them shared and discussed over in the #1011228442399883294 channel.

half folio Oct 6, 2022, 1:23 PM

#

https://github.com/huggingface/diffusers/blob/finetune-txt2img/examples/text_to_image/train_text_to_image.py


#

hmm, this certainly looks interesting

viral jay Oct 6, 2022, 4:04 PM

#

well, after a lot of trying it seems like I found a good spot for face learning with textual inversion: 6 tokens and up to 20000 steps was the answer. with 1 token above 15000 steps it was bringing random results, with 6 it's giving me realistic faces and is still kinda flexible to work on styles and stuff

whole ice Oct 6, 2022, 5:57 PM

#

Hi friends! I just published my introduction to Training guide on my site, it's meant to be a quick way to get started with training a person into a Stable Diffusion compatible model with a cloud GPU. Hope you enjoy! https://stablediffusionguides.carrd.co/#training-p

Arki's Stable Diffusion Guides

Stable Diffusion Guides by Arki

whole ice Oct 6, 2022, 5:58 PM

#

whole ice Hi friends! I just published my introduction to Training guide on my site, it's ...

bright obsidian Oct 6, 2022, 8:04 PM

#

finally finetuned a model with dreambooth, kinda painful

#

would anybody be interested in a website that lets you do dreambooth finetuning?

amber cape Oct 6, 2022, 9:23 PM

#

bright obsidian would anybody be interested in a website that lets you do dreambooth finetuning...

Yes

viral jay Oct 6, 2022, 10:01 PM

#

guys what are the effects of increasing the learning rate?

#

lower rate means more accurate? or just slower?

half folio Oct 6, 2022, 10:06 PM

#

lower rate means the training will be slower, yes

#

it controls the rate at which the model learns

#

I'd suggest not to increase it much if you're doing dreambooth finetuning as it will most likely make your model worse

#

you could increase it slightly if you're training on more images than recommended
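
A toy illustration of the learning-rate trade-off discussed above, assuming plain gradient descent on f(x) = x². This is not how DreamBooth or textual inversion training is implemented; it only shows the general behaviour: a small rate converges slowly, a moderate rate converges faster, and a rate that is too large overshoots and diverges. `descend` is a hypothetical helper.

```python
def descend(learning_rate, steps=50, start=1.0):
    """Run gradient descent on f(x) = x**2 and return the final x (the optimum is 0)."""
    x = start
    for _ in range(steps):
        gradient = 2 * x  # derivative of x**2
        x -= learning_rate * gradient
    return x


slow = descend(0.01)      # small steps: still far from 0 after 50 steps
fast = descend(0.1)       # larger steps: much closer to 0
diverged = descend(1.1)   # too large: every step overshoots and the error grows

print(f"lr=0.01 -> {slow:.4f}")
print(f"lr=0.1  -> {fast:.6f}")
print(f"lr=1.1  -> {diverged:.1f}")
```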

viral jay Oct 6, 2022, 10:09 PM

#

its actually for textual inversion, I'm using around 20-25 images, is that good amount?

#

I'm trying to improve the results. it's for face learning: one training run gave great results but it's kinda rigid on styles, so I cropped the images to the face bounds, but then things got worse. getting the best out of it isn't easy

half folio Oct 6, 2022, 10:13 PM

#

I'm not sure how good textual inversion is for that, I've only trained it on styles and not faces or objects.

#

I think you could try increasing the learning rate a bit

viral jay Oct 6, 2022, 10:13 PM

#

it did learn my face pretty well

half folio Oct 6, 2022, 10:15 PM

#

that's pretty cool

viral jay Oct 6, 2022, 10:16 PM

#

that's pretty accurate

half folio Oct 6, 2022, 10:17 PM

#

yes, the quality is very decent

viral jay Oct 6, 2022, 10:18 PM

#

the images above were generated without inpainting, so I think TI was very good, but it seems to only be efficient at learning my ugly face 😂

#

well discord stopped uploading the images 🤔

#

is there some way to finetune an embedded model?

#

for example, I got a training with very nice results but its with incorrect shape on chin

hot breach Oct 6, 2022, 11:23 PM

#

https://drive.google.com/file/d/1FqEKUaUV5hlvhQQZrgp_z5KWHz-FWFje/view?usp=sharing 4 character FF7R model with Cloud Strife, Tifa Lockhart, Barret Wallace, and Aerith Gainsborough


ff7rV2-steps7995.ckpt

#

new version of what I posted the other day on reddit

viral jay Oct 7, 2022, 12:58 AM

#

I just found the [face:.... face:0.5] parameter, are there more like that?

tough gazelle Oct 7, 2022, 1:11 AM

#

hot breach https://drive.google.com/file/d/1FqEKUaUV5hlvhQQZrgp_z5KWHz-FWFje/view?usp=shari...

How did you get a 4 character model without them all merging together?

hot breach Oct 7, 2022, 1:11 AM

#

use per image prompts on the training images instead of a global class word

tough gazelle Oct 7, 2022, 1:12 AM

#

How do you do that?

hot breach Oct 7, 2022, 1:12 AM

#

https://github.com/kanewallmann/Dreambooth-Stable-Diffusion this repo enables it, uses the filename of the image

tough gazelle Oct 7, 2022, 1:13 AM

#

Ah ok and that works well then? It doesn't merge features at all?

hot breach Oct 7, 2022, 1:13 AM

#

I'm running it locally, not sure if the notebook was updated

#

nope doesn't merge

tough gazelle Oct 7, 2022, 1:13 AM

#

Would you say that if you're doing 2 characters you should do double the steps?

#

Also I wish I'd seen that earlier lol before I created this monstrosity.

hot breach Oct 7, 2022, 1:14 AM

#

that model was ~7500 steps and >500 training images

tough gazelle Oct 7, 2022, 1:15 AM

#

Ok I've been doing 4000 for 30 images for 1 character

hot breach Oct 7, 2022, 1:15 AM

#

can get more than one character in frame, still sometimes mixes their features but that's a global issue with SD

00003-1198296186-a_photo_of_cloud_strife_and_tifa_lockhart_hugging_kissing_film_still_50mm_f5.6.png

#

I believe by adding more multi-character training images it can learn to separate them better though

tough gazelle Oct 7, 2022, 1:15 AM

#

Ok, but I'm guessing if you use one of their subject words it just looks like them?

#

And you don't get, say, one person's outfit colours on another's

tough gazelle Oct 7, 2022, 1:16 AM

#

hot breach I believe by adding more multi-character training images it can learn to separat...

When you do this do you name the image something like "subject a and subject b"?

hot breach Oct 7, 2022, 1:17 AM

#

I'm using blip to interrogate the images, and then fine tuning the prompts from there

#

adding my character names in place of "man" or "people" and such, tweaking things like duplicates or fixing incorrect prompt words
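
The caption clean-up step described above (take an auto-generated caption and put the character name in place of a generic word like "man") could look roughly like this. It is a sketch under assumptions: the caption string is a made-up example rather than real BLIP output, the word list is hypothetical, and manual review is still needed for duplicates and wrong words.

```python
import re

# Generic subject words to swap out; this list is an assumption.
GENERIC_WORDS = ("man", "woman", "person")


def personalize_caption(caption, character_name, generic_words=GENERIC_WORDS):
    """Replace the first generic subject (with its article, if any) by the character name."""
    words = "|".join(re.escape(word) for word in generic_words)
    pattern = r"\b(?:a |an )?(?:" + words + r")\b"
    return re.sub(pattern, character_name, caption, count=1)


caption = "a man with a sword on his back standing on a sidewalk"  # made-up caption
print(personalize_caption(caption, "cloud strife"))
```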

tough gazelle Oct 7, 2022, 1:18 AM

#

Ok cool, thanks. I'll give it a go.

hot breach Oct 7, 2022, 1:19 AM

#

00010-3340262792-a_photo_of_cloud_strife_and_barret_wallace_film_still_85mm_f4.5.png

#

there's a bit of style transfer here: barret sort of looks like he's wearing cloud's outfit more than his own, but it sorta knows they are separate characters

#

my theory is adding multi-character training images helps attention work better, this is much better than a previous attempt and I only went from like 35 group photos to about 55

tough gazelle Oct 7, 2022, 1:21 AM

#

You think you need a lot of training images for it to work well then

#

I guess you could also use this to get multiple outfits for 1 character under different subject names as well

hot breach Oct 7, 2022, 1:21 AM

#

yes, I've spent many hours assembling the training set, but I'm working on automating it. blip and txt2mask can probably help a lot with that

#

yes I tried adding many outfits into the training data as well, it actually helps with style transfer, it's easier now to get characters to wear other outfits from base SD by including many outfits in the training set, it can attend to the face vs. body I think when it sees a character name associated with different outfits

#

and also vice versa, putting base SD faces on characters

00011-2757917824-a_photo_of_emma_watson_as_aerith_gainsborough.png

slow mantle Oct 7, 2022, 1:40 AM

#

https://drive.google.com/file/d/1DtL1KHC6ZFx9ATfVnSVnvBlM1_9DwG0B/view?usp=sharing


2022-10-07T00-56-13_davidbowie-jareth_30_training_images_3030_max_t...

#

Link for Jareth from Labyrinth

hot breach Oct 7, 2022, 1:58 AM

#

00021-444897199-jareth_man_riding_a_horse_in_a_dirt_equestrian_arena.png

00023-3295976420-jareth_man_riding_a_horse_in_a_dirt_equestrian_arena.png

#

my sis will love this, she's obsessed with labyrinth

tough gazelle Oct 7, 2022, 1:59 AM

#

hot breach and also vice versa, putting base SD faces on characters

ok cheers I'm running through it now. Looks like it's not had the optimisations put into it like the Dreambooth-SD-Optimised has unfortunately, as it's running at 1.40s/it instead of the 1.1s/it you get on that.

hot breach Oct 7, 2022, 1:59 AM

#

yeah its a lot of hours of training

tough gazelle Oct 7, 2022, 1:59 AM

#

Also got a bit confused as it seems to expect you to have the images in a directory called images, so I was pointing it to where the files were and was getting a "Not A directory" error. But I think I've got it going now.

#

Well I must have, it wouldn't start if it couldn't find any training or reg images

hot breach Oct 7, 2022, 2:01 AM

#

this is how mine is organized

#

reg folder is the same

tough gazelle Oct 7, 2022, 2:02 AM

#

I just have them all in 1 folder, but the files are named accordingly like it says on the Github page

#

So
SubjectA_001
SubjectB_001
SubjectC_001

And so on

hot breach Oct 7, 2022, 2:03 AM

#

yes

tough gazelle Oct 7, 2022, 2:03 AM

#

Does this generate 3 test images at the intervals?

#

Or just the one

hot breach Oct 7, 2022, 2:03 AM

#

it only seems to generate one pair

tough gazelle Oct 7, 2022, 2:03 AM

#

Ok, would be nice to have all 3 so I can see that it's working properly.

hot breach Oct 7, 2022, 2:03 AM

#

it also does some batch size hunting at the beginning I think because my sets are mismatched

tough gazelle Oct 7, 2022, 2:04 AM

#

Probably not that hard to implement, but I'd have to find where it is in the code

hot breach Oct 7, 2022, 2:04 AM

#

I'm not positive how it chooses what to generate; I guess whatever prompt/image it is on when the training image logger triggers

#

yeah

tough gazelle Oct 7, 2022, 2:04 AM

#

Yeah I would assume that too. It must just look at the last training subject and class word and use that

hot breach Oct 7, 2022, 2:04 AM

#

probably could use some better collation

tough gazelle Oct 7, 2022, 2:05 AM

#

I suppose if it's doing that you could just feed the script the different names you've put in and tell it to do all of them at the interval

hot breach Oct 7, 2022, 2:05 AM

#

my training images almost all have entirely unique names

#

ex "cloud strife turned away from the camera with a buster sword on his back"

#

cloud strife standing on a sidewalk, night, streel lights_ (15).webp

tough gazelle Oct 7, 2022, 2:06 AM

#

Oh I've not gone that far with this test. Just using simple names for each character

hot breach Oct 7, 2022, 2:06 AM

#

tifa lockhart in a purple dress holding her right hand up standing in front of a shelf with liquor bottles on it's shelves.webp

#

a lot of stuff I get out of blip

#

yeah just a suggestion, I think it will help with attention in training, and also maintaining the existing model knowledge
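
As a rough illustration of the descriptive-filename approach above, here is a minimal sketch of turning a file name into a caption. It assumes the trainer takes its prompt from the file name; the helper `caption_from_filename` and its cleanup rules (dropping the extension and a trailing `_ (15)`-style counter) are hypothetical and not taken from the repo.

```python
import re
from pathlib import Path


def caption_from_filename(filename: str) -> str:
    """Hypothetical helper: derive a training caption from a descriptive file name."""
    stem = Path(filename).stem  # drop the extension, e.g. ".webp"
    # Drop a trailing duplicate counter such as "_ (15)" or " (3)".
    stem = re.sub(r"[_\s]*\(\d+\)$", "", stem)
    # Collapse repeated whitespace and trim stray underscores/spaces at the ends.
    return re.sub(r"\s+", " ", stem).strip(" _")


print(caption_from_filename(
    "cloud strife standing on a sidewalk, night, streel lights_ (15).webp"
))
```

Simple names such as `SubjectA_001` pass through unchanged, so the same helper would cover both naming styles discussed here.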

tough gazelle Oct 7, 2022, 2:09 AM

#

Ok, well I'm just running this for a couple hours before I sleep to see what it's like.

#

That SD-Optimized one also provides more images, like it shows you what phrase it was using to get the image. This one unfortunately doesn't.

#

So I'm just going to have to assume it's working

hot breach Oct 7, 2022, 2:13 AM

#

you can tweak the v1-finetuning.yaml and change the imagelogger interval

#

```yaml
    image_logger:
      target: main.ImageLogger
      params:
        batch_frequency: 400  # <--- the logging interval
```

tough gazelle Oct 7, 2022, 2:23 AM

#

oh

#

it got to 1000 steps and died

#

```
Traceback (most recent call last):
  File "main.py", line 832, in <module>
    trainer.test(model, data)
  File "/root/miniconda3/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 911, in test
    return self._call_and_handle_interrupt(self._test_impl, model, dataloaders, ckpt_path, verbose, datamodule)
  File "/root/miniconda3/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 685, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/root/miniconda3/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 954, in _test_impl
    results = self._run(model, ckpt_path=self.tested_ckpt_path)
  File "/root/miniconda3/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1128, in _run
    verify_loop_configurations(self)
  File "/root/miniconda3/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/trainer/configuration_validator.py", line 42, in verify_loop_configurations
    __verify_eval_loop_configuration(trainer, model, "test")
  File "/root/miniconda3/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/trainer/configuration_validator.py", line 186, in __verify_eval_loop_configuration
    raise MisconfigurationException(f"No `{loader_name}()` method defined to run `Trainer.{trainer_method}`.")
pytorch_lightning.utilities.exceptions.MisconfigurationException: No `test_dataloader()` method defined to run `Trainer.test`.
```

hot breach Oct 7, 2022, 2:24 AM

#

you have to pass the number of max steps or set it >1000

#

```python
    parser.add_argument(
        "--max_training_steps",
        type=int,
        required=False,
        default=12000,
        help="Number of iterations to run")
```

tough gazelle Oct 7, 2022, 2:24 AM

#

I changed it in the v1-finetune_unfrozen file

#

Is it different on this one?

hot breach Oct 7, 2022, 2:24 AM

#

yeah it is overridden by the cmd line arg

tough gazelle Oct 7, 2022, 2:24 AM

#

Ah ok

#

So I leave the file alone and pass that command line

hot breach Oct 7, 2022, 2:25 AM

#

I just changed it in code there on the arg definition

#

but you can just add it to your arg list too, either works

#

I believe max_steps in the finetune.yaml is also respected

#

so whichever comes first

tough gazelle Oct 7, 2022, 2:26 AM

#

I changed it in v1-finetune_unfrozen.yaml

#

And it didn't listen to that

#

I'll use the command line

hot breach Oct 7, 2022, 2:28 AM

#

the command line arg inserts an interrupt, hence the error message; the one in the yaml I believe doesn't throw an error, but they behave the same, and both are active limits

#

it should still dump the ckpt either way

tough gazelle Oct 7, 2022, 2:28 AM

#

It did dump it, but at 1000 steps

#

Then stopped

hot breach Oct 7, 2022, 2:28 AM

#

correct, as designed, you should set both limits higher

tough gazelle Oct 7, 2022, 2:29 AM

#

Ok, you didn't need to set it in the command line on the other repo I've been using

#

It just looked at the file

hot breach Oct 7, 2022, 2:29 AM

#

correct, kane added that for whatever reason I guess

#

you could just go into the argparse code and set the default value to 99999 or whatever if you want

#

above I set mine to 12000
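
To make the "whichever comes first" behaviour concrete, here is a minimal sketch. It assumes, as described above, that both the `--max_training_steps` argument and `max_steps` in the yaml are active limits; the function `effective_step_limit` is a hypothetical illustration, not code from the repo.

```python
def effective_step_limit(cli_max_training_steps: int, yaml_max_steps: int) -> int:
    """Training stops at whichever limit is reached first (per the discussion above)."""
    return min(cli_max_training_steps, yaml_max_steps)


# Raising only the yaml value does not help if the command-line value stays low.
print(effective_step_limit(cli_max_training_steps=1000, yaml_max_steps=6000))
# With both raised, the lower of the two still decides when training ends.
print(effective_step_limit(cli_max_training_steps=12000, yaml_max_steps=6000))
```

This is why editing only `v1-finetune_unfrozen.yaml` still produced a stop at 1000 steps: the other limit was reached first.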

slow mantle Oct 7, 2022, 3:36 AM

#

hot breach

Yay, I'm glad the model will be seen by further people

hot breach Oct 7, 2022, 3:38 AM

#

she rides dressage and uses the labyrinth music for her routine, I sent her a photo of jareth on a horse 😆

tough gazelle Oct 7, 2022, 3:58 AM

#

Ok so I only had time to run through it 3000 steps tonight to test it, but it's worked reasonably well

#

1 of the characters doesn't turn into a blob like it does with back to back training

#

There's still a lot of style transfer, but I think more training images and more steps will most likely help that

hot breach Oct 7, 2022, 4:00 AM

#

yeah include a few with both characters in it, with a proper prompt, I used ~55 images out of 550 as group photos, even that seems to have helped a lot, I think a bit more may be better

tough gazelle Oct 7, 2022, 4:00 AM

#

It seems like the class has bled into it a little too much as well

#

On a normal model, putting in the class doesn't really make a massive difference. On this, it starts making them look like the regularisation images

#

Maybe I used too many reg images

#

I'll have to give it another go tomorrow and leave it a lot more steps

#

But it's at least partially working at low steps

#

One of the models of the 3 is worse than the other

#

But it did have slightly fewer training images

#

It was Darkness that had fewer images, and she doesn't come up much at all.
I guess it could just be that, at low steps, it didn't have time to train as well with fewer images.

coral mist Oct 7, 2022, 4:19 AM

#

When training SD, what do I need to do to have it start from a previous checkpoint? I'm seeing references to --finetune_from and --resume and --resume_from_checkpoint

#

Which is it lol. I tried --finetune_from since that was referenced on the pokemon diffusion but it took me two epochs before I realized it was trying to start from scratch

fallen nova Oct 7, 2022, 5:32 AM

#

textual inversion shenanigans with Yves Tanguy paintings as the input for training

00603-3583385407-A_log_cabin_interior_in_the_style_of_Yves_Tanguy_Painting.png

00607-1281000597-Telegraph_poles_in_the_desert_in_the_style_of_Yves_Tanguy_Painting.png

00608-2469139706-Telegraph_poles_in_the_desert_in_the_style_of_Yves_Tanguy_Painting.png

#

if anyone wants to try

📎 Yves_Tanguy_Painting-15000.pt

#

if it works like that.. heard you can just drop it in an embeddings folder on the automatic1111 fork but i'm not sure if it just works plug and play like that

#

oh damn guess it does

coral mist Oct 7, 2022, 6:13 AM

#

Ran a few tests at different gpu configurations to see what made sense for training speed vs cost, maybe it will be useful to others

1xa6000 ($0.63 /hr)
80 in 2 minutes

4xa6000 ($2.40 /hr)
200 in 2 minutes
costs ~300% more and only 150% faster

4xa100 ($3.20 /hr)
252 in 2 minutes
costs 33% more and only 26% faster
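
The trade-off in these numbers is easier to see as a cost per unit of work. The sketch below only reuses the prices and counts quoted above; the message does not say what is being counted (steps, images, ...), so the code calls them "units", and `cost_per_1000_units` is a hypothetical helper name.

```python
# Figures from the message above: (label, price in $/hr, count completed in 2 minutes).
runs = [
    ("1xa6000", 0.63, 80),
    ("4xa6000", 2.40, 200),
    ("4xa100", 3.20, 252),
]


def cost_per_1000_units(price_per_hour: float, units_in_2_min: int) -> float:
    """Dollar cost of 1000 units at the measured throughput."""
    units_per_hour = units_in_2_min * 30  # 30 two-minute windows per hour
    return price_per_hour / units_per_hour * 1000


for label, price, units in runs:
    print(f"{label}: ${cost_per_1000_units(price, units):.3f} per 1000 units")
```

On these figures the single a6000 is the cheapest per unit; the multi-GPU setups mainly buy shorter wall-clock time.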

stone garden Oct 7, 2022, 7:43 AM

#

fallen nova if anyone wants to try

thanks ! yes, it's plug and play like that. I could even continue training the thing

tame aurora Oct 7, 2022, 9:47 AM

#

hot breach I'm using blip to interrogate the images, and then fine tuning the prompts from ...

Do you care about BLIP's performance/accuracy, or is whatever result it gives fine?
I tried it once on one type of image, but I was afraid of using it to caption my training data since the descriptions were very vague and not very accurate. Would that make tuning worse in your opinion?

#

For example, I skimmed some of LambdaLab's Pokemon images and BLIP did decent job IMHO so I concluded an accurate caption would benefit the fine tuning

tame aurora Oct 7, 2022, 9:58 AM

#

coral mist When training SD, what do I need to do to have it start from a previous checkpoi...

Do you continue from your own checkpoint or from a downloaded one?

for an external ckpt I found it most reliable to add a ckpt_path parameter in the config file (with absolute path to the ckpt file). I mean in the params: section of the model
for my own tuning and resuming I had to use --resume to pass the exact logs directory (something like logs/2022-10-07...

tidal orbit Oct 7, 2022, 11:36 AM

#

hot breach yes, I've spent many hours assembling the training set but working on automating...

@hot breach That must have been quite the challenge. What do you think was your average (image,caption) output per minute ?

#

How do you assemble a training set? Are there places with good libs for that? I would be interested to finetune on some fantasy/sci-fi image gallery

hot breach Oct 7, 2022, 1:17 PM

#

a lot of it has been brute force data prep, but I'm working on workflow improvements

hot breach Oct 7, 2022, 1:19 PM

#

tame aurora Do you care about BLIP's performance/accuracy or it's fine whatever the result i...

I'm usually just grabbing segments of the caption while I work on some automation. It does weird things like "such and such in a sci fi fi fi fi fi fi fi" or "with a gun in their hand and a gun in their other hand" when there is no gun (probably due to black gloves), so I remove that stuff

#

I didn't fully blip prompt the entire set either, maybe like 30%, too time consuming for now but a lot of this can be automated
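
One part of that cleanup is easy to automate: the stuttered words. Here is a minimal sketch, assuming the captions are plain strings; `collapse_repeats` is a hypothetical helper, and it does nothing about hallucinated objects such as the nonexistent gun, which still need a manual pass.

```python
import re


def collapse_repeats(caption: str) -> str:
    """Collapse a word repeated back-to-back, e.g. "sci fi fi fi" -> "sci fi"."""
    return re.sub(r"\b(\w+)( \1\b)+", r"\1", caption)


print(collapse_repeats("such and such in a sci fi fi fi fi fi fi fi"))
```

Only immediately repeated words are merged, so "such and such" is left alone.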

tulip comet Oct 7, 2022, 1:28 PM

#

what do the parentheses stand for? like what's the difference between highly detailed, (highly detailed) and ((highly detailed))

tame aurora Oct 7, 2022, 1:37 PM

#

hot breach I didn't fully blip prompt the entire set either, maybe like 30%, too time consu...

Thank you for the feedback!
It sounds encouraging for smaller datasets (like < 1k samples)

hot breach Oct 7, 2022, 1:37 PM

#

tulip comet what do the parenthesis stand for? like what's the dif between, highly detailed,...

prompt weighting

charred jackal Oct 7, 2022, 1:56 PM

#

Is there a list of model weights and what they have been trained on? (I'm using Artroom stable diffusion, if that makes a difference)

rough hamlet Oct 7, 2022, 2:14 PM

#

tulip comet what do the parenthesis stand for? like what's the dif between, highly detailed,...

Emphasis: use (text) to make model pay more attention to text and [text] to make it pay less attention
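
A small sketch of how nesting adds up. It assumes each pair of `()` multiplies the attention weight by 1.1 and each pair of `[]` divides it by 1.1, which is how the automatic1111 webui documents its emphasis syntax; other front ends may use a different factor, and `emphasis_weight` is a hypothetical helper.

```python
def emphasis_weight(token: str, factor: float = 1.1) -> float:
    """Weight implied by wrapping brackets, assuming each level scales by `factor`."""
    weight = 1.0
    while len(token) >= 2:
        if token[0] == "(" and token[-1] == ")":
            weight *= factor   # one more level of emphasis
        elif token[0] == "[" and token[-1] == "]":
            weight /= factor   # one more level of de-emphasis
        else:
            break
        token = token[1:-1]    # peel off the outer pair and look again
    return weight


for text in ["highly detailed", "(highly detailed)", "((highly detailed))", "[highly detailed]"]:
    print(f"{text!r}: {emphasis_weight(text):.3f}")
```

So under that assumption ((highly detailed)) is weighted 1.1 × 1.1 = 1.21 relative to the plain phrase.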

tulip comet Oct 7, 2022, 2:18 PM

#

nice thanks both! @hot breach @rough hamlet

fallen nova Oct 7, 2022, 2:41 PM

#

stone garden thanks ! yes, it's plug and play like that. I could even continue training the t...

you might want to change the classifier if you're going to do that lol, currently its "Yves Tanguy Painting" instead of something like "*" which if im not mistaken takes up a lot of tokens lol

stone garden Oct 7, 2022, 2:47 PM

#

fallen nova you might want to change the classifier if you're going to do that lol, currentl...

I try to keep my identifiers a single word, preferably one not already recognised too much by the AI, or at least not recognised as something else (don't use a nickname for your face, for example; I was called a monk for a long time and I used monk as part of the identifier, and learning went wild)
but with TI, if you use the identifier corresponding to an embedding, it's not the number of words in your identifier that matters, it's how many vectors per token you chose when you created the embedding.

#

like, waldo I shared earlier is 30 tokens on its own I think

fallen nova Oct 7, 2022, 2:47 PM

#

damn wow

#

so yves tanguy painting would be a shit ton then

stone garden Oct 7, 2022, 2:48 PM

#

it just depends how you create your embedding

#

I would choose a word like "yvestanguystyle"

#

and put about 8 tokens in to see how it works already

fallen nova Oct 7, 2022, 2:48 PM

#

there u go then thanks

rose cosmos Oct 7, 2022, 3:22 PM

#

does anyone know why my loss while training with dreambooth doesn't go down? tested at 5e-5 and 5e-6 on 20 instance pics and 200 regularization pics, at 800, 1600 and 3200 steps

tawdry forge Oct 7, 2022, 5:11 PM

#

made with a textual inversion watercolor training

delicate rock Oct 7, 2022, 5:50 PM

#

is it possible to finetune on sizes that aren't powers of 2? i want to use data from drawception which has a fixed size of 600x500

viral jay Oct 7, 2022, 5:53 PM

#

guys, what's the best method to train a whole person with textual inversion? I'm running a training now, but I've separated some good quality images of head / face and body, and I'm using subject_filewords so it uses the filename to guide it. Is this a good approach?

sweet sand Oct 7, 2022, 7:12 PM

#

Hey there!

#

Quick question: I would like to start training a model in Automatic1111 and share the model with friends. Any advice? My questions are:
1- Should I train on my 3080 rtx or pop some AWS instance instead if I want to do mass training (cost not really an issue, I have a pool of credits on AWS)
2- Can a model trained in Automatic1111 output a .pck that I put back in the stable diffusion folder?
3- Any training I could test to learn? (Ex: a sample folder with X images that I could follow)
4- Anyone have some time to walk me through the process? Will gladly pay for the training 🙂

viral jay Oct 7, 2022, 7:20 PM

#

sweet sand Quick question: I would like to start training model in Automatic11111 and share...

I'm also learning, but the process is pretty straight forward on the web ui

#

I have a 3080ti too, with around 20-25 images it does approx. 10000 steps per hour (may take less time)

#

it does generate few .pt files

#

that goes inside of embeddings and textual inversion folders

sweet sand Oct 7, 2022, 7:25 PM

#

In the main folder, I have models > Stable-Diffusion. This is where I put the models I find, for example: robo-diffusion-v1.ckpt

viral jay Oct 7, 2022, 7:25 PM

#

result from training I did

viral jay Oct 7, 2022, 7:26 PM

#

sweet sand In the main folder, I have mode > Stable-Diffusion. This is where I put the mode...

Hmm that's a different kind of learning from textual inversion, I haven't worked with ckpt yet but textual inversion is giving me some good results with some work

sweet sand Oct 7, 2022, 7:27 PM

#

Wow, thanks for all the info. Let me know If I understand:

1) Name: Whatever I want
2) Initialization text: Description of the prompt for that image set?
3) Number of tokens: 75?
4) Source: Path on my folder with all image
5) Destination: Where I want the process image
6) I click Flip and Caption?

viral jay Oct 7, 2022, 7:28 PM

#

with 6 tokens on textual inversion and only 8000 steps I could get some very nice results even mixing the style with real photos

sweet sand Oct 7, 2022, 7:28 PM

#

This will create a nice 512x512 folder I can feed to the embedding part?

silent spear Oct 7, 2022, 7:29 PM

#

viral jay guys, what's best method to train a whole person with textual inversion? I'm run...

I am still working my way through the TI process, but my overall gist is (re: auto1111 TI interface, I should say):

yes, use filewords, but maybe make a new subject_filewords.txt file for humans, because when you use the default, the phrasing tends to be a bit wonky... you don't have a good photo of "a jane_doe", y'know? 🙂
I use more initialization text than I think may be standard: say, "person female woman actress". For some reason, "actress" really made a difference. I used to use very few words, but then I'd have the effect of "clear face, blurry everything else", which I suspect was SD saying "I know that jane_doe is this face, but I dunno WTF the rest of her looks like"... the extra words allow it to borrow from existing concepts, if that makes sense.
I use 2 vectors per token max. Any more and I can't use the character the way I want (different costumes, styles etc).
learning rate at 0.001 for about 3000-5000 steps. I actually usually save out 2 versions, one for "I need this to look like the character exactly" and one for "let's have some fun and be flexible".

Thus far, it's working pretty well. I refine it constantly, but I've made a handful of characters this way, and those are the general parameters I use.

sweet sand Oct 7, 2022, 7:30 PM

#

A: Embedding: I guess this will have something once I did the first part?
B: Learning Rate: 0.015?
C: Dataset Directory: Any path I want?
D: Log - Ok
E: Prompt Word: uh?
F: Max Steps: 15 000?
G: Save image and save copy: disable or 500?

tough gazelle Oct 7, 2022, 7:31 PM

#

If you guys are running that training repo that lets you do multiple characters I've noticed that it doesn't have the speed upgrades.
https://github.com/kanewallmann/Dreambooth-Stable-Diffusion

You can get the speed upgrades from
https://github.com/gammagec/Dreambooth-SD-optimized

Just replace the files in the kanewallmann repo with the ones from Dreambooth-SD-optimized. Increases training speed to 1.06s/it on a 3090, instead of around 1.5 you get currently.

Files to replace:
/ldm/modules/attention.py
/ldm/modules/diffusionmodules/model.py

On Linux I also had issues with an older version of Pytorch being used causing it to use more memory and therefore run out.
To fix this edit the Environment.yaml so it uses:

pytorch=1.11.0
torchvision=0.12.0

viral jay Oct 7, 2022, 7:31 PM

#

sweet sand Wow, thanks for all the info. Let me know If I understand: 1) Name: Whatever I w...

1) Yes, put the name you want there; when you use it in a prompt, it will use the info learnt
2) Keep as is
3) Tokens: I use 4-12. More tokens means it can capture more detail, but your prompt is limited to 75 tokens, which means you will get more and more rigid results as you increase towards the max of 75 tokens
4) Source path is the folder with the images (preferably cropped to the subject you want to learn)
5) This is where the images will be output after you press the process button, and it is what you enter as the dataset directory
6) I don't use flip or caption. I tried with caption and my results got worse, so I didn't really try much with it; I don't use flip because faces are asymmetric
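
The token trade-off in point 3 is simple arithmetic. The sketch below takes the 75-token prompt limit from the message above and assumes that each vector in the embedding occupies one slot of that limit whenever the embedding is used in a prompt; `tokens_left_for_prompt` is a hypothetical helper.

```python
PROMPT_TOKEN_LIMIT = 75  # limit quoted in the message above


def tokens_left_for_prompt(vectors_in_embedding: int, limit: int = PROMPT_TOKEN_LIMIT) -> int:
    """Slots left for the rest of the prompt once the embedding is used in it."""
    if not 0 <= vectors_in_embedding <= limit:
        raise ValueError("embedding size must be between 0 and the prompt limit")
    return limit - vectors_in_embedding


for n in (4, 12, 30, 75):
    print(n, "->", tokens_left_for_prompt(n))
```

Under that assumption, an embedding created with the maximum of 75 leaves no room for anything else in the prompt, which matches the "more and more rigid" observation.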

tidal orbit Oct 7, 2022, 7:33 PM

#

hot breach a lot of it has been brute force data prep, but I'm working on workflow improvem...

When I think about it I imagine that you need at least several seconds on average to correct or modify (image|caption) pairs. Let's say 5s, which makes you produce 12 pairs a minute. That's 720 an hour of really concentrated work, given that you have a workflow that is streamlined.
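
The same estimate as a couple of lines of arithmetic, so other per-pair times can be plugged in. The 5 seconds per pair is the figure guessed above; the 550-image set size is the one mentioned earlier in the channel, and `pairs_per_hour` is a hypothetical helper.

```python
def pairs_per_hour(seconds_per_pair: float) -> float:
    """How many (image, caption) pairs fit into one hour of uninterrupted work."""
    return 3600 / seconds_per_pair


rate = pairs_per_hour(5)
print(rate, "pairs per hour, i.e.", rate / 60, "per minute")
print(550 / rate, "hours for a 550-image set at that pace")
```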

viral jay Oct 7, 2022, 7:34 PM

#

@silent spear gave some good tips, I'm still learning too. For the learning rate I use the default of 0.005; I have increased it but didn't notice any changes, so I might try decreasing it as he said. For steps, I've noticed that going above 5000 does not make a huge difference, but I always let it run up to 10-15k and then delete the higher-step files from the folder if I notice a degradation above a certain step count

hot breach Oct 7, 2022, 7:34 PM

#

yeah I spent a lot more than an hour on it lol

tidal orbit Oct 7, 2022, 7:34 PM

#

hot breach yeah I spent a lot more than an hour on it lol

How many pairs did you produce. The magnitude I mean.

sour pond Oct 7, 2022, 7:34 PM

#

Does the optimized version of Dreambooth work on a 3080?

hot breach Oct 7, 2022, 7:34 PM

#

I was screenshotting the game, then I'd go back resize/crop, then fix prompts

silent spear Oct 7, 2022, 7:35 PM

#

Oh, one thing that I think is very important (for everyone, regardless): never feed TI an image that isn't "right". If you give it a photo of your subject that looks different or weird or non-standard, it will latch on to that sucker and reproduce all the wonky qualities you hate in every image you create from then on.

viral jay Oct 7, 2022, 7:37 PM

#

That's very true, and it's very good at picking up bad details lol

#

a face I was training had a small shadow in the middle of the lower lip, which made the algorithm think the lower lip was split...

#

got good once I removed that single image from the dataset

tough gazelle Oct 7, 2022, 7:39 PM

#

sour pond Does the optimized version of Dreambooth work on a 3080?

If you mean the Dreambooth-SD-Optimised one, no, it uses 23.3GB of VRAM.

If you mean the diffusers one, then sort of. It can work on just under 10GB VRAM. However windows running probably means you'd run out, so you'd probably have to run it in a Linux console for it to work.

sour pond Oct 7, 2022, 7:42 PM

#

tough gazelle If you mean the Dreambooth-SD-Optimised one, no, it uses 23.3GB of VRAM. If you...

Cool. Guess I'll wait until I have a 4090.

tough gazelle Oct 7, 2022, 7:44 PM

#

You can rent 3090's for like $0.40 an hour if you really want to give it a go

sour pond Oct 7, 2022, 7:46 PM

#

Its all good. 4090 releases next week and the price of a 3090 has fallen dramatically.

sweet sand Oct 7, 2022, 7:49 PM

#

tough gazelle You can rent 3090's for like $0.40 an hour if you really want to give it a go

Where?

tough gazelle Oct 7, 2022, 7:50 PM

#

sweet sand Where?

Vast.ai or runpod are the ones I know of

#

I'm using Vast.ai as it has a $5 minimum credit amount and Runpod has $10

sweet sand Oct 7, 2022, 7:50 PM

#

Nice thanks

#

i'll take a look

#

People who train and output a .ckpt (ex: the one who published robo-diffusion-v1.ckpt) - are they using another software?

tough gazelle Oct 7, 2022, 8:26 PM

#

The main tools output ckpt files. It's only the diffusers packs that don't

#

I've never used the diffusers one as I've heard it's worse quality

viral jay Oct 7, 2022, 8:30 PM

#

Can anyone explain to me what a token is exactly? Is it the training trying to cook information into a single word? I'm trying your params @silent spear, and with 2 tokens and 0.001 learning rate it's doing reasonably well, but 5k steps wasn't enough. I'm letting it do some more training, and at 6k it's giving some nice results, but I'll let it go up to 10k

#

what I'm noticing is that its learning less data, the character suit is less detailed, let me grab some examples of it

#

This is Wraith from Apex Legends, with 2500 steps / 6 tokens / 0.005 learning rate

#

this is with 2500 steps / 2 tokens / 0.001 learning rate

#

2750 steps / 2 tokens / 0.001 learning rate, when comparing to first image its much less detailed

#

and this one is the real character from the game

Apex_Legends_Screenshot_2022.10.07_-_16.05.19.06.png

#

will now do a test with 12 and 24 tokens to see how this reflects the quality to have some parameters, but I believe I will get back to average of 4-6 tokens as they seem to give decent output

viral jay Oct 7, 2022, 9:01 PM

#

2500 steps / 12 tokens / 0.001 learning rate, yeah not very good tbh

#

3750 steps / 12 tokens / 0.001 learning rate, starting to get interesting. It learns more detail on the clothes, but the face seems better on the low-token one. I'll run this up to 10k and see if there's a good improvement

viral jay Oct 7, 2022, 9:27 PM

#

here's some comparison

#

2 tokens (top) vs 12 tokens (bottom)

#

same prompt and parameters on both images, it's easy to see that more tokens create a stronger bias towards the source images

tough gazelle Oct 7, 2022, 10:06 PM

#

hot breach https://github.com/kanewallmann/Dreambooth-Stable-Diffusion this repo enables it...

Thanks for this, worked really well
https://cdn.discordapp.com/attachments/1010577750077210726/1028065177473650789/unknown.png

viral jay Oct 7, 2022, 10:07 PM

#

tough gazelle Thanks for this, worked really well https://cdn.discordapp.com/attachments/1010...

can this run on windows with a 12gb card?

tough gazelle Oct 7, 2022, 10:07 PM

#

No

#

24GB

tough gazelle Oct 7, 2022, 10:18 PM

#

tough gazelle Thanks for this, worked really well https://cdn.discordapp.com/attachments/1010...

Trained for 9k steps. I used some files from the Dreambooth-SD-Optimized repo to speed it up

crimson sandal Oct 7, 2022, 10:30 PM

#

This might be a naive question, but I successfully was able to fine tune SD using a set of images I provided. If I wanted to train that same model on another set of images, can I do that? Can someone link me to an explanation? For instance, if I trained it on <cat-toy> images and now I want to add <dog-toy> images, how can I make that happen?

tough gazelle Oct 7, 2022, 11:34 PM

#

crimson sandal This might be a naive question, but I successfully was able to fine tune SD usin...

You can, but it doesn't work very well, it merges the results together. You're better off using the repo I was talking about above and training them all at the same time.

shy widget Oct 8, 2022, 12:55 AM

#

You could also try training both as different embeds and then referencing them using the composable diffusion operators ( https://energy-based-model.github.io/Compositional-Visual-Generation-with-Composable-Diffusion-Models/ )

Compositional Visual Generation with Composable Diffusion Models

#

like <cat-toy> AND <dog-toy>

plucky swan Oct 8, 2022, 12:56 AM

#

There's a repo that's able to finetune dreambooth with 8gb of vram, have you guys tried it?

stiff dust Oct 8, 2022, 4:31 AM

#

if you are talking about the diffusers dreambooth with deepspeed, I made that PR. It works fine for me on 8GB VRAM, some other people have tested it too successfully

#

it's also not written anywhere yet, but replacing the Adam optimizer with Deepspeed version of Adam gives very substantial speed up

gloomy belfry Oct 8, 2022, 6:17 AM

#

plucky swan Theres a repo that able to finetune at 8gb of vram for dreambooth, have you guys...

Didn't work for me

gloomy belfry Oct 8, 2022, 6:18 AM

#

stiff dust if you are talking about the diffusers dreambooth with deepspeed, I made that PR...

How did you get it to work? I'm on a 3080 10GB with 64GB ram and it's throwing oom

strange crest Oct 8, 2022, 6:19 AM

#

I'm new here... can anyone direct me to where to learn better prompts? I keep getting images with blurred faces and extra limbs

plucky swan Oct 8, 2022, 6:19 AM

#

Crazy that it had at least a degree of success anyway, did you find the quality significantly worse or its mostly the same?

gloomy belfry Oct 8, 2022, 6:19 AM

#

strange crest I'm new here... can anyone direct me to where to learn better prompts? I keep ge...

Lexica

stiff dust Oct 8, 2022, 6:20 AM

#

I don't think I did anything special, the instructions are included. Make sure to set batch size 1, gradient checkpointing and mixed_precision=fp16

gloomy belfry Oct 8, 2022, 6:20 AM

#

Yeah I did all that

#

Idk

plucky swan Oct 8, 2022, 6:20 AM

#

Are there some sacrifices on quality for the 8gb repo?

stiff dust Oct 8, 2022, 6:20 AM

#

no, but it requires a lot of CPU RAM

gloomy belfry Oct 8, 2022, 6:21 AM

#

And is slower

#

If you could share your Adam code @stiff dust id like to try it again

plucky swan Oct 8, 2022, 6:22 AM

#

Interesting this allows the tech to be a lot more accessible by just adding more system ram

stiff dust Oct 8, 2022, 6:22 AM

#

from deepspeed.ops.adam import DeepSpeedCPUAdam
...
optimizer_class = DeepSpeedCPUAdam
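
Since the DeepSpeed import can fail on machines without a working install, one option is to fall back to a stock optimizer. This is a minimal sketch, not part of the PR: `pick_optimizer_class` is a hypothetical helper, and the fallback to `torch.optim.AdamW` is an assumption about what you would otherwise be using.

```python
def pick_optimizer_class():
    """Prefer DeepSpeed's CPU Adam, fall back to torch's AdamW, else None."""
    try:
        from deepspeed.ops.adam import DeepSpeedCPUAdam
        return DeepSpeedCPUAdam
    except Exception:  # deepspeed is not installed or failed to import
        pass
    try:
        import torch
        return torch.optim.AdamW
    except Exception:  # torch is not available either
        return None


optimizer_class = pick_optimizer_class()
print(optimizer_class)
```

The fallback only covers import failures; problems that show up later, when the optimizer is actually constructed, would still need handling separately.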

strange crest Oct 8, 2022, 6:22 AM

#

gloomy belfry Lexica

Thanks!

stiff dust Oct 8, 2022, 6:22 AM

#

it should work without that change too but using DeepSpeed Adam gives about 2x speedup

gloomy belfry Oct 8, 2022, 6:22 AM

#

Nice

#

Thanks

stiff dust Oct 8, 2022, 6:23 AM

#

also that requires cuda toolchain that is same version as pytorch

#

I had troubles with it and built pytorch from source to make it work

gloomy belfry Oct 8, 2022, 6:23 AM

#

Ah

#

Well that may be too excessive

sullen zephyr Oct 8, 2022, 10:39 AM

#

Has anyone tried finetuning the decoder?

delicate stream Oct 8, 2022, 2:21 PM

#

viral jay This is Wraith from Apex Legends, with 2500 steps / 6 tokens / 0.005 learning ra...

Yo.....what she holding?

half terrace Oct 8, 2022, 2:25 PM

#

viral jay guys what's the effects of increasing the learning rate?

Learning rate changes the step size for each cycle (how much it's allowed to adjust the network weights) higher values make it faster, and easier to pass local minima (values that the ANN thinks are good) but it decreases stability and increases the risk that you'll blast past actually optimal values.
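
A toy example of that trade-off, nothing to do with Stable Diffusion itself: plain gradient descent on f(x) = x², where the step size is the only thing that changes. The function `gradient_descent` and the chosen rates are illustrative assumptions.

```python
def gradient_descent(lr: float, steps: int = 20, x0: float = 1.0) -> float:
    """Minimise f(x) = x**2 with plain gradient descent and return the final x."""
    x = x0
    for _ in range(steps):
        grad = 2 * x       # derivative of x**2
        x = x - lr * grad  # step against the gradient, scaled by the learning rate
    return x


for lr in (0.01, 0.1, 0.9, 1.1):
    print(f"lr={lr}: x after 20 steps = {gradient_descent(lr):.4g}")
```

The smallest rate is still far from the minimum at 0 after 20 steps, the middle ones get close (the larger one by bouncing from side to side), and the largest one overshoots further on every step and diverges.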

ashen perch Oct 8, 2022, 3:40 PM

#

does anyone have a sample prompt template for training specific people?
subject.txt and subject_filewords.txt have a bit strange combinations

hot breach Oct 8, 2022, 4:03 PM

#

ashen perch does anyone have a sample prompt template for training specific people? subject....

what code are you running?

ashen perch Oct 8, 2022, 4:15 PM

#

I'm using automatic1111's webui

austere wigeon Oct 8, 2022, 4:21 PM

#

so I am trying to run the Dreambooth on a runpod https://github.com/JoePenna/Dreambooth-Stable-Diffusion I have a Runpod with 24gb VRAM which I thought should be enough based on info I found... but I am still faced with RuntimeError: CUDA out of memory. Tried to allocate 58.00 MiB (GPU 0; 23.68 GiB total capacity; 18.33 GiB already allocated; 39.56 MiB free; 18.41 GiB reserved in total by PyTorch) at step 1... any ideas?

GitHub

GitHub - JoePenna/Dreambooth-Stable-Diffusion: Implementation of Dr...

Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) by way of Textual Inversion (https://arxiv.org/abs/2208.01618) for Stable Diffusion. Tweaks focused on training faces, objects, and s...

half terrace Oct 8, 2022, 5:24 PM

#

When running TI, is there any issue with using a longer initialization text than the number of tokens for an embedding?

sweet sand Oct 8, 2022, 5:32 PM

#

I am running Arki's RunPod for training. When it's completed, do I download the last.ckpt in the logs (checkpoint) to put it in automatic1111?

viral jay Oct 8, 2022, 5:35 PM

#

half terrace Learning rate changes the step size for each cycle (how much it's allowed to adj...

Interesting. Yesterday I was playing with values and did tests with learning rates up to 1.0. I noticed that it was overshooting: at step X it was looking close, but at step Y it was completely different. After some testing I think values between 0.001 and 0.005 are good

half terrace Oct 8, 2022, 5:37 PM

#

viral jay Interesting, yesterday I was playing with values, I did tests up to 1.0 of learn...

appropriate values will depend on both the dataset you're training on and what's already in the model. It's really good at some types of images and in those cases you can probably use really high values.
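
The overshooting at high learning rates described above can be reproduced with a toy example. This is not Stable Diffusion training: the sketch runs plain gradient descent on a hypothetical one-dimensional loss f(x) = x**2, and the function name descend and the three rates are illustrative assumptions.

```python
def descend(lr, steps=20, x=1.0):
    """Run plain gradient descent on f(x) = x**2 and return the final x."""
    for _ in range(steps):
        grad = 2 * x          # derivative of x**2
        x = x - lr * grad     # the learning rate scales the size of each update
    return x


small = descend(lr=0.1)   # shrinks steadily toward the minimum at 0
edge = descend(lr=1.0)    # jumps between 1.0 and -1.0 and never settles
large = descend(lr=1.5)   # every step overshoots further, so the value blows up
```

Which rate counts as "too high" depends on the shape of the loss, which fits the point that good values depend on the dataset and on what the model already knows.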

viral jay Oct 8, 2022, 5:38 PM

#

hmmm, what I've noticed is that it seems better if I throw in very different images of the same subject instead of a lot of similar images that differ only in angle

ashen perch Oct 8, 2022, 6:03 PM

#

what prompts were u using?

tough gazelle Oct 8, 2022, 6:30 PM

#

sweet sand I am running Arki's RunPod for training, when it's completed, do I download the ...

pretty much yes

#

I believe the Colab one does some pruning first to make it smaller, so make sure you read the instructions

sweet sand Oct 8, 2022, 6:52 PM

#

tough gazelle pretty much yes

Is it normal that I get an 11GB file out of my 4GB SD model?

#

And do you advise 2k, 6k or 15k steps of training?

tough gazelle Oct 8, 2022, 6:57 PM

#

sweet sand Is it normal that I get an 11GB file out of my 4GB SD model?

Yes, you can prune it down to 2GB or so. Not sure how the Colab is set up for that though. I have a script to do it manually.
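
A checkpoint saved during training is often much larger than the base model because it can carry optimizer state and full-precision weights alongside the model itself. The script mentioned above is not shown, so the following is only a minimal sketch of what such pruning could look like, assuming a PyTorch Lightning style checkpoint whose weights sit under a "state_dict" key. The names prune_checkpoint and prune_file are hypothetical.

```python
def _to_half(value):
    """Cast floating-point tensors to half precision; leave other values alone."""
    is_float = getattr(value, "is_floating_point", None)
    if callable(is_float) and is_float():
        return value.half()
    return value


def prune_checkpoint(ckpt, half=True):
    """Return a new checkpoint dict that keeps only the model weights."""
    weights = ckpt.get("state_dict", ckpt)  # fall back if it is already weights-only
    pruned = {name: (_to_half(v) if half else v) for name, v in weights.items()}
    return {"state_dict": pruned}


def prune_file(src, dst):
    """Load a .ckpt with PyTorch, prune it, and save the result (assumes torch is installed)."""
    import torch  # imported lazily so prune_checkpoint works without it

    torch.save(prune_checkpoint(torch.load(src, map_location="cpu")), dst)
```

Dropping everything except the weights and halving their precision is what would account for most of the size reduction in a sketch like this; whether a particular Colab or pod already does this is not something the sketch can tell you.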

tough gazelle Oct 8, 2022, 6:57 PM

#

sweet sand And do you advise 2k, 6k or 15k steps of training?

I've had good results training between 3k and 6k for single character models

#

Just make sure your training images are of good quality and you have a decent number of regularization images

#

I use around 20-30 training images and 300 regularization images.

grave carbon Oct 8, 2022, 7:10 PM

#

Can using too many regularization images be bad?

tough gazelle Oct 8, 2022, 7:15 PM

#

grave carbon Can using too many regularization images be bad?

Yeah, seems to cause issues with the class

sweet sand Oct 8, 2022, 7:41 PM

#

tough gazelle Yes, you can prune it down to 2GB or so. Not sure how the collab is setup for th...

I use the pod ark has set, do you know the path to the prune script?

sweet sand Oct 8, 2022, 7:41 PM

#

tough gazelle I use around 20-30 training images and 300 regularization images.

Got 73 training images, about 300 regularization images from the regular pack

tough gazelle Oct 8, 2022, 7:42 PM

#

sweet sand I use the pod ark has set, do you know the path to the prune script?

No idea, I have my own that I don't remember where I got it from.

inner rock Oct 8, 2022, 7:58 PM

#

I'm trying out textual inversion in the automatic1111 webui on an AWS server. I used 5 training images, all head shots from different angles and backgrounds, cropped to 512x512. I edited the subject.txt file to match the photos, along the lines of "a close-up photo of a smiling [name] looking forward". I trained for 6,000 epochs, which was less than an hour. Is there something similar to what JoePenna describes on his dreambooth repo readme as "If you trained with joepenna under the class person, the model should only know your face as: joepenna person". I don't see a way to indicate the class in the webUI version. Is that needed in some way, or am I good to go with just using my token? TIA!

ashen perch Oct 8, 2022, 8:25 PM

#

ashen perch does anyone have a sample prompt template for training specific people? subject....

thanks for the help, i got a lot of answers 🙂

austere wigeon Oct 8, 2022, 8:53 PM

#

can anyone help me with runpod and automatic1111... the version of automatic1111 that the template starts is old... I am wondering how I get the new version? I did a git pull, but I'm not sure how to restart everything... starting webui-user.sh doesn't seem to do anything

#

I have downloaded my custom dreambooth model and want to use it in the runpod environment now...

#

I have used it locally successfully, but I want to run it with the powerhouse GPU I rent with the runpod service 😄

dusky pecan Oct 8, 2022, 9:08 PM

#

austere wigeon can anyone help me with runpod and automatic1111... the version of automatic111 ...

Before today it was just git pull and pip install -r requirements_versions.txt

#

But there were some breaking changes for that today

austere wigeon Oct 8, 2022, 9:08 PM

#

I mean it is already installed

#

it is just running an old version

dusky pecan Oct 8, 2022, 9:09 PM

#

Yes, you have to connect to jupyter and update it

#

git pull
git checkout 4999eb2ef9b30e8c42ca7e4a94d4bbffe4d1f015

#

In a new terminal in the webui dir

#

Then pip install the requirements again and then restart the pod with the little pen icon

#

You have to checkout a commit from earlier today.

austere wigeon Oct 8, 2022, 9:16 PM

#

thanks!

#

that worked

austere wigeon Oct 8, 2022, 10:55 PM

#

hm suddenly getting Sizes of tensors must match except in dimension 0. Expected size 82 but got size 77 for tensor number 1 in the list. even after restarting and even with default settings

#

huh okay I shortened the prompt which seems to have fixed it

austere wigeon Oct 9, 2022, 1:36 AM

#

so wondering, can you use good outputs of dreambooth creations as training for a 2nd round of dreambooth training?

sweet sand Oct 9, 2022, 2:09 AM

#

base_learning_rate : any advice, i have 1.0e-06

hot breach Oct 9, 2022, 5:12 AM

#

I've seen people using lower value, like 5e-7 but the base 1e-6 has been serving me well on the mega model for ff7r so far

stray idol Oct 9, 2022, 5:18 AM

#

inner rock I'm trying out textual inversion in the automatic1111 webui on an AWS server. I...

Isn't the class the 'Initialization text'? I've seen reference to people training sigh...'wet t-shirt' on 't-shirt' so that'd make sense.

plucky swan Oct 9, 2022, 10:53 AM

#

In diffusers dreambooth especially, the class token is the most important thing to choose after the training images and the number of steps. A good matching class can nicely fill in the gaps that the training data didn't cover

#

Armor class token for me ended up resulting mostly like this

#

And warrior class token is able to create something like this

main ocean Oct 9, 2022, 12:49 PM

#

Im having some ugly outlines that I can’t get rid of … with img2img . Any ideas. ? 🥹

obtuse shuttle Oct 9, 2022, 2:01 PM

#

Quick question about fine tuning: I’ve just finished creating a CKPT via dreambooth, which can run in SD. But it seems as though the training has influenced everything from that ckpt file. Every single person now looks like the person I fed into the prompt. Is it possible to isolate the CKPT file to only come when explicitly prompted, as with Textual Inversions?

zenith fable Oct 9, 2022, 2:39 PM

#

Anybody know if I can train an additional person onto an existing dreambooth ckpt, i.e. put the 2GB model into the directory and train new faces with a new prompt onto it? It's a pain to swap between ckpt files when making images with different faces. Hope this makes sense.

tough gazelle Oct 9, 2022, 4:41 PM

#

obtuse shuttle Quick question about fine tuning: I’ve just finished creating a CKPT via dreambo...

This sounds like you used person as the class word and didn't have enough regularisation images. Or you did too many steps and it's bled into the class. Or both.

tough gazelle Oct 9, 2022, 4:42 PM

#

zenith fable Anybody know if I can train an additional person onto an existing dreambooth ckp...

You can do it, but it won't be clean. It will most likely merge the two subjects together in some ways

bleak swallow Oct 9, 2022, 5:22 PM

#

if you have any checkpoints from partway thru training you could try those 🤔 or you could try merging the vanilla sd model with it

zenith fable Oct 9, 2022, 5:36 PM

#

tough gazelle This sounds like you used person as the class word and didn't have enough regula...

Hmm was afraid of that. Same thing happened when I tried to merge. Is this a feature that could ever be developed? Would be great to have one model trained to what I need

tough gazelle Oct 9, 2022, 5:36 PM

#

zenith fable Hmm was afraid of that. Same thing happened when I tried to merge. Is this a fea...

You can do it but you need to train them all at once

#

https://github.com/kanewallmann/Dreambooth-Stable-Diffusion

GitHub

GitHub - kanewallmann/Dreambooth-Stable-Diffusion: Implementation o...

Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion (tweaks focused on training faces) - GitHub - kanewallmann/Dreambooth-Stable-Diffusion: Implementation of Dream...

#

That repo has the functionality

#

Trained it with 3 characters and it worked well

#

Speaking with someone else, you can apparently include training images where the characters appear together, which makes it more likely for group shots to work, rather than what it usually does, where you just get multiple clones

zenith fable Oct 9, 2022, 6:32 PM

#

That's ace thanks! I'll give it a try

obtuse shuttle Oct 9, 2022, 8:56 PM

#

tough gazelle This sounds like you used person as the class word and didn't have enough regula...

I had around 400 regularisation images, and did 2k steps

long summit Oct 9, 2022, 9:01 PM

#

Question about Dreambooth style training and regularization images.
When I train a new style, what do I put in the training and regularization images?

Training images: various class images in the specific style
Regularization images: photographs of various classes, e.g. cat, car, dog, person, etc.

Is it ok?

visual atlas Oct 9, 2022, 9:02 PM

#

Do you think that there is a way to finetune the model to give only good images to our liking (using a simple classification good versus bad of images output from normal model for retraining)?

gloomy hatch Oct 9, 2022, 10:02 PM

#

Is it possible to combine AItemplate and Xformers to get the benefits of both?

sturdy charm Oct 9, 2022, 10:43 PM

#

Hi guys,
Which enhancer works best for anime faces? Like the GFPGAN but for anime o:

hot breach Oct 9, 2022, 11:23 PM

#

anime-specific SRGANs or Real-ESRGAN

#

might help a bit at least

#

they are super-resolution GANs, but typically you can downsize then upsize to have it fix up the styling some

hot breach Oct 10, 2022, 3:00 AM

#

was able to get some concepts of the city and style for midgar into latest multi-model for FF7R, along with a 5th character

00138-3720754183-aerial_photo_of_iron_man_over_midgar_city.png

silent spear Oct 10, 2022, 12:50 PM

#

visual atlas Do you think that there is a way to finetune the model to give only good images ...

I've been experimenting with that in a way. I created a TI style called "me-likey" based on 300 images I thought looked good, then just added it as a style influence the way I would any other style. Sadly, it probably takes more discipline and diversity than I have patience for... the definition of "good" clearly has some commonalities I didn't notice (unconscious bias) so the outputs were all kinda samey and eventually painfully repetitive. I'd probably have to start with a much bigger, much more random base to avoid that. But it's definitely possible, at least.

pulsar arrow Oct 10, 2022, 1:17 PM

#

plucky swan In diffusers dreambooth especially, the class token is the most important thing t...

Thanks for the tip, this helps me understand class better!

visual atlas Oct 10, 2022, 1:42 PM

#

silent spear I've been experimenting with that in a way. I created a TI style called "me-lik...

Cool. I wish I had a GPU to test on 10000 input images

final matrix Oct 10, 2022, 6:55 PM

#

does anybody here have experience with fine tuning? like actual finetuning? like this https://github.com/LambdaLabsML/examples/tree/main/stable-diffusion-finetuning

GitHub

examples/stable-diffusion-finetuning at main · LambdaLabsML/examples

Deep Learning Examples. Contribute to LambdaLabsML/examples development by creating an account on GitHub.

shy widget Oct 10, 2022, 7:07 PM

#

Sys VRAM: 12288/12288 MiB (100.0%)
wheew, no OOM

#

must have been really close

dreamy zenith Oct 10, 2022, 8:02 PM

#

Are there any good rules of thumb for picking a TI step count? Is there a mathematical formula or something?

proper thorn Oct 10, 2022, 8:53 PM

#

Hey, im looking for some help or direction on how to get started training my own model. if anyone could point me in the right direction that would be great

pastel torrent Oct 10, 2022, 10:45 PM

#

main ocean Im having some ugly outlines that I can’t get rid of … with img2img . Any ideas....

photoshop em out

stuck arrow Oct 11, 2022, 12:07 AM

#

proper thorn Hey, im looking for some help or direction on how to get started training my own...

Can you provide any specifics? I recently (finally) got a model trained from scratch on a proprietary dataset at my company, I might be able to offer some advice if you like.

#

I'm thinking of writing up my experiences and troubleshooting into a blog post sometime soon too

pulsar arrow Oct 11, 2022, 1:21 AM

#

Anyone able to use the Shivam Shrirao Dreambooth collab notebook on the premium google GPUs? It only seems to work on standard GPUs for me.

proper thorn Oct 11, 2022, 2:41 AM

#

stuck arrow Can you provide any specifics? I recently (finally) got a model trained from scr...

I have seen a lot of people use their own faces for a model and I was wondering how they did that and if I'm even capable of doing it on my own machine

#

i have an rtx 3070 and a r7 5800x

gloomy belfry Oct 11, 2022, 3:00 AM

#

proper thorn i have an rtx 3070 and a r7 5800x

it's called DreamBooth and I don't believe it would run on either of those, you can rent a runpod

viral jay Oct 11, 2022, 3:31 AM

#

proper thorn I have seen a lot of people use their own faces for a model and I was wondering h...

you can use textual inversion

proper thorn Oct 11, 2022, 3:41 AM

#

ahh ok

#

ill look into that

cursive obsidian Oct 11, 2022, 4:38 AM

#

https://www.reddit.com/r/StableDiffusion/comments/x1587s/sampler_step_count_comparison_with_timing_info/

r/StableDiffusion - Sampler / step count comparison with timing info

47 votes and 14 comments so far on Reddit

#

this was a badass reddit thread

#

after looking at all the results/timings, I don't really see a reason to use any of the dpm or heun samplers

stoic bough Oct 11, 2022, 4:52 AM

#

For textual inversion is it better to have around 3-5 images or is it better to have more images of the object/style?

#

And if I were to increase the number of images, would I need to increase the training steps?

viral jay Oct 11, 2022, 5:21 AM

#

From my newbie tests, I found it's better to have different images of the subject being trained, with different lighting and backgrounds. Close-ups of faces worked worse for me than shots of the whole face from farther away, but you must be careful, as it may also learn other stuff around the subject, like wallpaper or text (like the text on my chair). From my experiments, it seems that having fewer images with more variation while the subject you want to train remains stable is better than a lot of pictures of the same thing with only small changes.

#

3-5 images were enough for some stuff, others I trained with 10-15 images, others with 20-25 images. It really depends on whether you're getting the results you want in the output, but in my tests more images did not always equal more quality.

#

Steps are also something weird. Sometimes I got a well-trained result with 2000-3000 steps for one face, for another I had to run 6000-8000 steps, and in a few situations I ran it up to 30k steps. Above 10-15k I didn't notice a huge difference, but that's for what I've been training; maybe something that's not a person's face could take advantage of more steps.

stoic bough Oct 11, 2022, 5:41 AM

#

I see, thanks for the info

#

Will just try a bunch of random stuff ig

limber peak Oct 11, 2022, 7:13 AM

#

proper thorn i have an rtx 3070 and a r7 5800x

rtx 3070 is 10gb or smth, right? There was an 8gb version of DB I saw a while ago, though I haven’t tried it myself

#

It uses deepspeed

gloomy belfry Oct 11, 2022, 8:13 AM

#

limber peak It uses deepspeed

doesn't work for me and a bunch of other folks

#

seems like it OOMs when pinning the memory for the optimizer even if I have plenty more free memory

proper thorn Oct 11, 2022, 10:08 AM

#

yes its 8gb, but ill look into that to see if there are any updates then

plucky swan Oct 11, 2022, 10:36 AM

#

Anyone trained subjects that have very limited training images (rare character or object)

limber peak Oct 11, 2022, 11:11 AM

#

gloomy belfry doesn't work for me and a bunch of other folks

Oof that sucks

worn river Oct 11, 2022, 1:21 PM

#

Is anybody making a horror/body horror kind of model, because that would be epic

fervent grail Oct 11, 2022, 3:07 PM

#

Hey guys!! I am running Dreambooth on a 3060 and can't tell if it's doing 'good enough' or not. It very clearly has to do with settings - but I am getting a high loss this time. Should I nuke it and try something else? My first try I got like .04 or something like that - but I was extremely unsatisfied with the results, so I deleted the model and tried again

fervent grail Oct 11, 2022, 3:48 PM

#

ahh! thats a great idea too! I saw a cool one with Studio Ghibli art style - was that you?

right now experimenting with my face, and it seems like it's working! doesn't feel 'exact' so I think I'll just have to fine tune it as I go along. My eyes were really bad the first time I tried it, and so were the teeth, but when I ran it again, it got better.

Thanks for the response! I was a little confused about that!

fringe shell Oct 11, 2022, 4:57 PM

#

how did you get that "loss" data to show up? i'm only seeing this;

```
Total progress: 100%|████████| 30/30 [00:35<00:00,  1.19s/it]
100%|████████| 60/60 [01:13<00:00,  1.23s/it]
Total progress: 100%|████████| 60/60 [01:12<00:00,  1.22s/it]
```

fervent grail Oct 11, 2022, 5:58 PM

#

fringe shell how did you get that "loss" data to show up? i'm only seeing this; ```100%|████...

I saw that when my output folder had info in it already. What I had done - was nuke everything in the output folder, and then ran it again

#

3060 - 165 photos - only took 19 minutes. I was surprised.

random ocean Oct 11, 2022, 6:38 PM

#

stuck arrow I'm thinking of writing up my experiences and troubleshooting into a blog post s...

Please do!

hot breach Oct 11, 2022, 7:45 PM

#

some info on how I'm producing the FF7R multi-character + style model: https://gist.github.com/victorchall/67bc53472f86641aef1ebee1e154f5d1 best read if you already know about dreambooth and how to run it locally on a 24GB card (joe/xavier/kane etc repos without using a notebook)

hot breach Oct 11, 2022, 8:30 PM

#

https://github.com/huggingface/diffusers/commit/66a5279a9422962b1cff3ad0e5747e8903ae067b wow ok forget my link above... this may be massive, attention bounding box capability

wary stratus Oct 11, 2022, 9:17 PM

#

Hey for those who might be interested, I modified the Dream Booth notebook, mostly so that all available parameters of the training function can be used. I also reorganized a bit because of that, and added the possibility to just put a path to a gdrive folder where your images are from the start. Oh and an auto-disconnect at the end so you can let it run and it will disconnect on its own without having to wait to be kicked out.
https://colab.research.google.com/github/Zinston/colab_notebooks/blob/main/DreamBooth_Stable_Diffusion_(advanced_settings).ipynb

Google Colaboratory

hot breach Oct 12, 2022, 12:25 AM

#

new FF7R mega model, 1400+ training images, 13k+ steps, info/pictures here: https://old.reddit.com/r/sdforall/comments/y1ojm6/ff7r_mega_model_v4_style_characters_and_more/iryj3n6/

r/sdforall - Comment by u/Freonr2 on ”FF7R mega model V4, style, ch...

0 votes and 1 comment so far on Reddit

#

several styles and a bunch of characters all in one

muted relic Oct 12, 2022, 12:31 AM

#

what setting am i missing? i just want more sky

gloomy belfry Oct 12, 2022, 12:47 AM

#

https://github.com/huggingface/diffusers/issues/807

GitHub

Dreambooth doesn't train on 8GB · Issue #807 · huggingface/diffusers

Describe the bug Per the example featured in the repo, it goes OOM when DeepSpeed is loading the optimizer, tested on a 3080 10GB + 64GB RAM in WSL2 and native Linux. Reproduction Follow the pasteb...

#

created an issue on the diffusers repo for the 8gb version if anyone wants to chime in their experience

fervent grail Oct 12, 2022, 1:07 AM

#

Anyone know if Dreambooth with over 2000 samples makes a difference? Tried 2000 after doing 800 and it was night and day.

Also - probably obvious - but restore faces has the unfortunate side effect of removing some features.

muted sierra Oct 12, 2022, 5:29 AM

#

what's a good number to set the "image creation progress every N" to?

#

in the web UI

still shore Oct 12, 2022, 5:34 AM

#

https://colab.research.google.com/drive/1vrh_MUSaAMaC5tsLWDxkFILKJ790Z4Bl

Google Colaboratory

#

Found this in a post previously, anyone tried it? Allows fine tuning but isn’t dreambooth and looks like you can have captions with the images etc

#

I’ve run out of compute units for the month trying to get Textual Inversion working on Hugging Face's Colab…

still shore Oct 12, 2022, 6:41 AM

#

Or a hypernetworks finetune colab? I’d love to try an alternative to TI and compare.

limber peak Oct 12, 2022, 11:38 AM

#

Is joepenna’s DB version still up to date, or are there better models for finetuning now?

hybrid pilot Oct 12, 2022, 11:49 AM

#

muted relic what setting am i missing? i just want more sky

out painting has been meh, at best, for me as well. I think its not quite up to the task yet.

ashen perch Oct 12, 2022, 12:24 PM

#

i'm trying to achieve the style of Heroes of Might and Magic 3 with Textual Inversion, I've used these 11 images, maybe the descriptions are not the best

#

I've used 8 tokens and the initialization text was illustration style

#

I've used only this line in my custom prompt template:

[filewords], in style of [name]
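
To make the template concrete, here is a minimal sketch of how a line like this could expand into a training caption. It assumes that [filewords] is replaced by words taken from the image's file name and [name] by the embedding name. The function fill_template, the file name, and the embedding name "my-style" are hypothetical, and this is not the webui's own code.

```python
import os


def fill_template(template, image_path, embedding_name):
    """Expand [filewords] and [name] placeholders for one training image."""
    stem = os.path.splitext(os.path.basename(image_path))[0]
    # Treat underscores and hyphens in the file name as word separators.
    filewords = stem.replace("_", " ").replace("-", " ").strip()
    return template.replace("[filewords]", filewords).replace("[name]", embedding_name)


caption = fill_template(
    "[filewords], in style of [name]",
    "train/a_castle_on_a_hill.png",   # hypothetical training image
    "my-style",                        # hypothetical embedding name
)
```

Under these assumptions the caption becomes "a castle on a hill, in style of my-style", so how well the file names describe each image directly shapes what the embedding is asked to explain.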

#

After 5500 steps, loss is at 0.28 👀

#

what am I doing wrong?

#

should I rename my sample images?

#

the name of the embedding is homm3-v2-tk8-illustrationstyleinit-customtemp, is it a problem that I use homm3?
if I generate an image with the prompt homm3 it seems like SD has some knowledge of it, can it influence the result?

#

ok, i'll try it again

#

new descriptions

#

#

a new name for embedding: sksv3tk8styleinitcustomtemp

#

initialization text is style

ashen perch Oct 12, 2022, 12:50 PM

#

ashen perch I've used only this line in my custom prompt template: ``` [filewords], in style...

same prompt template

wintry girder Oct 12, 2022, 1:10 PM

#

Using auto1111, how do I know I'm generating images using the embed I created with textual inversion?

stray idol Oct 12, 2022, 1:11 PM

#

I'm pleased to announce that the auto1111 repo now supports generating embeddings as shareable .png images:

#

Which can be dropped into your embeddings folder and loaded in just the same way a .pt can.

#

Uses the custom prompt from "Preview prompt" so you can choose what image is used as a representation of your work.

ashen perch Oct 12, 2022, 2:01 PM

#

ashen perch ok, i'll try it again

ran it again till 5500 steps with the settings above, got a bit better results, but still doesn't look good

ashen perch Oct 12, 2022, 2:17 PM

#

a castle, in style of sksv3tk8styleinitcustomtemp

#

with 5500, 4500, 3500, 2500, 1500 and 500 steps

#

same seed with simply a castle

#

what am i doing wrong? :/

stray idol Oct 12, 2022, 2:30 PM

#

doesn't look like an unreasonable style from the images.

#

dilute the idea of 'castle' a bit in your prompt?

ashen perch Oct 12, 2022, 3:04 PM

#

I don't know, this is with London, sksv3tk8styleinitcustomtemp
None of them look like London and I don't know where these characters come from

stone garden Oct 12, 2022, 5:44 PM

#

#

pls zoom in the face

#

does it look overtrained to u?

ashen perch Oct 12, 2022, 5:47 PM

#

I don't know, you can't really tell from a single image

stone garden Oct 12, 2022, 5:47 PM

#

how can I tell?

#

I have many images

ashen perch Oct 12, 2022, 5:48 PM

#

I try different styles eg. comic

#

if it won't be anything like a comic, it's overtrained I think

#

but I might be wrong

stone garden Oct 12, 2022, 5:48 PM

#

ok ill try

hot breach Oct 12, 2022, 5:49 PM

#

stone garden does it look overtrained to u?

can't tell without seeing your training images really

stone garden Oct 12, 2022, 5:50 PM

#

hot breach can't tell without seeing your training images really

https://www.instagram.com/enakorin/

えなこ(Enako) (@enakorin) • Instagram photos and videos

2M Followers, 329 Following, 789 Posts - See Instagram photos and videos from えなこ(Enako) (@enakorin)

hot breach Oct 12, 2022, 5:50 PM

#

usually with overtraining you get "cooked" faces, like sunburnt or with too much contrast. You can sometimes get it to look good again by lowering the CFG scale, but it's best to use a copy of your CKPT from earlier steps if you start to see that

stone garden Oct 12, 2022, 5:51 PM

#

#

there is just something off about the face

hot breach Oct 12, 2022, 5:51 PM

#

I know some repos/colabs dont let you control it, but some let you keep copies of the CKPT every so many steps, or you can start training again on an existing trained CKPT just to add some steps

stone garden Oct 12, 2022, 5:51 PM

#

like the features are too defined

#

ignoring the common issues with eyes and teeth, thats just SD

#

yea I used my own GPU I have all the checkpoints

silent spear Oct 12, 2022, 5:53 PM

#

Feels like it's just a tiiiiiny bit overtrained. Though the "render as a comic" trick is a good way to tell.

stone garden Oct 12, 2022, 5:54 PM

#

I did 100k steps on 800 images

hot breach Oct 12, 2022, 5:54 PM

#

100k???

#

wow

stone garden Oct 12, 2022, 5:54 PM

#

I'll do the comic thing

#

yea 100k was the default on automatic's webui

hot breach Oct 12, 2022, 5:54 PM

#

oh

stone garden Oct 12, 2022, 5:54 PM

#

so I just left it at that, its textual inversion not dreambooth

#

it took 7 hours

#

on an ampere a5000

hot breach Oct 12, 2022, 5:55 PM

#

ok, yeah I'm using "dreambooth" and the most I've ever done is like 15k steps with 1400 images which is maybe 8-10 hours on a 3090

#

not quite comparable
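
One rough way to put the two runs side by side is to divide the step count by the image count. The sketch below assumes a batch size of 1, since neither message states one, and passes_over_data is an illustrative helper rather than part of any trainer.

```python
def passes_over_data(steps, images, batch_size=1):
    """Approximate number of times each training image is seen."""
    return steps * batch_size / images


ti_run = passes_over_data(100_000, 800)    # textual inversion run: 125.0 passes
db_run = passes_over_data(15_000, 1_400)   # dreambooth run: roughly 10.7 passes
```

Even with the ratio in hand the runs stay hard to compare, because textual inversion only optimizes an embedding while dreambooth updates the model weights.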

stone garden Oct 12, 2022, 5:56 PM

#

can u share the guide u followed

#

I tried to use dreambooth first but couldnt find proper docs

#

and was getting VRAM issues

#

I have 24GB

hot breach Oct 12, 2022, 5:57 PM

#

https://gist.github.com/victorchall/67bc53472f86641aef1ebee1e154f5d1 this is my (short) guide for using kanewallmann's repo with captions for every image

Gist

Quick start on caption training with kanewallmann repo

Quick start on caption training with kanewallmann repo - gist:67bc53472f86641aef1ebee1e154f5d1

stone garden Oct 12, 2022, 5:57 PM

#

thanks

hot breach Oct 12, 2022, 5:57 PM

#

I'm doing multi-character + style training all in one go

#

you can name your training images like "[whatever name] in a white dress holding her left hand up to her face.png" and such

#

example filename in training set "a food truck in the slums distrct of midgar city with people standing around_2.png"
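
For file names like these, the caption is essentially the name with the extension and any trailing counter removed. How the repo actually parses names is not shown here, so the following is only a sketch under that assumption; caption_from_filename is a hypothetical helper.

```python
import os
import re


def caption_from_filename(path):
    """Derive a caption from an image file name."""
    stem = os.path.splitext(os.path.basename(path))[0]
    # Drop a trailing de-duplication counter such as "_2" or "_ (31)".
    # Note: this would also strip a meaningful number after a final underscore.
    stem = re.sub(r"_\s*\(?\d+\)?$", "", stem)
    return stem.strip()


caption = caption_from_filename(
    "a food truck in the slums distrct of midgar city with people standing around_2.png"
)
```

With this approach several images can share the same caption by differing only in the counter at the end of the name.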

#

"a close up of barret wallace in a brown collared vest and a necklace around his neck with a concerned look on his face_ (31).webp"

silent spear Oct 12, 2022, 6:03 PM

#

My tolerance for accuracy is a bit wonky, but generally speaking ~3000-9000 is pretty good on TI, I find. But with 800 images, that changes the math a bit. Best way to think of it, I find, is to imagine someone locked you in a room with a pile of images and said "tell me what all these images have in common" ... the longer you worked at it, the more delirious you'd get, and you'd start seeing patterns that weren't really there, and eventually you'd become convinced that this "subject" was all about the mole on her right cheek. You want to reduce the number of steps accordingly, to avoid the trainer losing its mind.

(this is also why the comic style test is useful, because if you've overdone it, it won't be able to fathom that person existing in any form except a photograph, because it's convinced the "photograph" aspect is vital to the definition)

stone garden Oct 12, 2022, 6:04 PM

#

silent spear My tolerance for accuracy is a bit wonky, but generally speaking ~3000-9000 is p...

actually my prompt is dry acrilic painting but all the images are photographs

#

so I must have overtrained a lot then

silent spear Oct 12, 2022, 6:04 PM

#

hot breach I'm doing multi-character + style training all in one go

I am still fighting with the multi-element stuff in Dreambooth. Getting there, slowly.

hot breach Oct 12, 2022, 6:04 PM

#

try "a photo of [whatever name], 50mm f5.6" or something like that

#

without the brackets

silent spear Oct 12, 2022, 6:07 PM

#

stone garden so I must have overtrained a lot then

Sometimes the boundary between overtrained and not-overtrained is really slight. I have two versions of most of my models, so I can swap when the "strict" version doesn't work. For instance, one of my characters absolutely cannot wear shirts with collars using the "strict" model, but drop to the "flex" version and it has no problem at all. The difference between strict and flex is 500 steps. 3500 to 4000.

ashen perch Oct 12, 2022, 6:09 PM

#

Is it possible to train TI with transparent sprites?

#

like this

#

all 256x256

stone garden Oct 12, 2022, 6:11 PM

#

I guess yea you'd just have to upscale them to 512x512 for training I think

silent spear Oct 12, 2022, 6:11 PM

#

I've trained it on ~300 transparent PNGs and it didn't have any issues relating to the transparency, at least. I just couldn't make it generate anything useful from the training. Might've been my source, though. It wasn't the most cohesive set.

ashen perch Oct 12, 2022, 6:21 PM

#

Is it a nice practice to mirror the images to get more images for training?

#

I mean is it a good idea?

silent spear Oct 12, 2022, 6:23 PM

#

I haven't had a problem with it myself, but apparently mirroring can cause problems if your subject NEEDS asymmetry, like one side is uniquely different than the other, and needs to stay that way. Again, I've never had the issue myself, but I've read that it can be an issue.
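
The asymmetry concern can be shown on a tiny grid of pixels. This is a toy sketch using nested lists instead of a real image library, and the helper names are made up for illustration.

```python
def mirror_horizontally(pixels):
    """Flip an image, given as a list of pixel rows, left to right."""
    return [list(reversed(row)) for row in pixels]


def is_symmetric(pixels):
    """True when mirroring leaves the image unchanged."""
    return mirror_horizontally(pixels) == [list(row) for row in pixels]


# A 2x3 "face" with a distinguishing mark (1) on its right side.
face = [
    [0, 0, 1],
    [0, 0, 0],
]
flipped = mirror_horizontally(face)   # the mark now sits on the left side
```

If the training set mixes face and flipped, the model sees the mark on both sides, which is harmless for symmetric subjects and a problem when the side matters.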

prisma kiln Oct 12, 2022, 6:44 PM

#

What are the top ESRGAN models everyone is using for (drawing, illustration and photos)?

stray idol Oct 12, 2022, 7:12 PM

#

stray idol I'm pleased to announce that the auto1111 repo now supports generating embedding...

Okay, a serious attempt that wasn't a Rick Astley embedding, ulrikbadass's strong outline style work:

silent spear Oct 12, 2022, 7:35 PM

#

stray idol Okay, a serious attempt that wasn't a Rick Astley embedding, ulrikbadass's stron...

This embedding-image tech is one of my favorite things ever. Here is a quick image using my current WIP.

stray idol Oct 12, 2022, 7:36 PM

#

Nice, spookily similar style chosen.

silent spear Oct 12, 2022, 7:37 PM

#

Oh, I used your embedding style for that. Just wanted to see how nicely it would play with my character embedding.

stray idol Oct 12, 2022, 7:42 PM

#

Ahh, that makes much more sense! 🧠

ashen perch Oct 12, 2022, 9:31 PM

#

silent spear I've trained it on ~300 transparent PNGs and it didn't have any issues relating ...

I think transparency is a problem, after 7000 steps, everything has a black background

stray idol Oct 12, 2022, 9:41 PM

#

ashen perch I think transparency is a problem, after 7000 steps, everything has a black back...

Doesn't seem like a problem with transparency, it'll be squashing it to RGB, and most PNG encoders set the RGB to constant in 100% transparent regions.

#

Just data as presented

#

If they're doing that, though, you might get weird colour edges where the image transitions to 100% transparency and the colours get turned off.
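
If the alpha channel is simply dropped and the fully transparent pixels happen to store black, the trainer would see a black background. One way to avoid that is to composite each sprite onto a chosen background colour before training. The sketch below does the per-pixel arithmetic in plain Python; the function names and the white default background are assumptions for illustration, and a real pipeline would use an image library.

```python
def drop_alpha(rgba):
    """Discard alpha, keeping whatever RGB the file stored for that pixel."""
    return tuple(rgba[:3])


def flatten_pixel(rgba, background=(255, 255, 255)):
    """Alpha-composite one RGBA pixel (0-255 channels) over a solid background."""
    r, g, b, a = rgba
    alpha = a / 255
    return tuple(
        round(channel * alpha + bg * (1 - alpha))
        for channel, bg in zip((r, g, b), background)
    )


transparent_black = (0, 0, 0, 0)
naive = drop_alpha(transparent_black)         # stays black
composited = flatten_pixel(transparent_black)  # becomes the background colour
```

Choosing the background explicitly means the training data shows the backdrop you intend, instead of whatever colour the encoder left in the transparent regions.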

fervent grail Oct 13, 2022, 12:20 AM

#

ashen perch I think transparency is a problem, after 7000 steps, everything has a black back...

whats the recommended amount of steps? it doesnt seem to change GPU peg at all.

gloomy belfry Oct 13, 2022, 5:38 AM

#

https://github.com/devilismyfriend/diffusers-multi-token-logging-dreambooth

GitHub

GitHub - devilismyfriend/diffusers-multi-token-logging-dreambooth: ...

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch

#

changed the diffusers dreambooth to accept multiple subjects in one training session for anyone interested

alpine rose Oct 13, 2022, 8:00 AM

#

https://youtu.be/7m__xadX0z0
Hey guys, I followed this tutorial to train the model on custom people, however I'm struggling to use them correctly.
It's as if I was very limited in creativity once I include my person in the prompt.
Is that something to expect based on the quality and diversity of the training images?
My training pictures are mainly portraits and not very diverse in terms of composition (chest and above).
Also, I'm wondering if you can overfit the model? Is 4000 steps too much? I guess the number of steps to use depends on the number of training pictures?

YouTube

Aitrepreneur

DREAMBOOTH: Train Stable Diffusion With Your Images Using Google's AI!

Dreambooth is Google’s new AI and it allows you to train a stable diffusion model with your own pictures with better results than textual inversion. Dreambooth is originally based on Imagen text-to-image model and this technology makes it possible for you to insert any character (yourself, your friends, your family), object or animal you want in...

gloomy belfry Oct 13, 2022, 8:32 AM

#

gloomy belfry https://github.com/devilismyfriend/diffusers-multi-token-logging-dreambooth

added image and checkpoint logging like in the non-diffusers repos

ashen perch Oct 13, 2022, 11:29 AM

#

fervent grail whats the recommended amount of steps? it doesnt seem to change GPU peg at all.

I have no idea, I’m just trying everything I can 😂

fervent grail Oct 13, 2022, 1:33 PM

#

ashen perch I have no idea, I’m just trying everything I can 😂

Sensible.

So far I settled with 5k. 800 to 2k made soooo much difference. Then I realized I could just keep raising it.

spice solar Oct 13, 2022, 1:50 PM

#

alpine rose https://youtu.be/7m__xadX0z0 Hey guys, I followed this tutorial to train the mod...

If you cannot style it, it's probably overfitted... try fewer steps!

fervent grail Oct 13, 2022, 3:49 PM

#

So - I ran 2 people in Dreambooth. When I ran the second one, it seems that it overwrote the first. Is there a way to... you know... not?

ashen perch Oct 13, 2022, 4:04 PM

#

stray idol Okay, a serious attempt that wasn't a Rick Astley embedding, ulrikbadass's stron...

how do you generate an embedding as a png?

alpine rose Oct 13, 2022, 4:05 PM

#

spice solar If you cannot style it, it's probably overfitted... try fewer steps!

thanks ;) will try

stray idol Oct 13, 2022, 4:07 PM

#

ashen perch how do you generate an embedding as a png?

There's an option in auto1111, it's on by default, it creates the embeddings in a folder called image_embeddings next to the regular embeddings.

ashen perch Oct 13, 2022, 4:07 PM

#

oh i see it now

hoary slate Oct 13, 2022, 6:12 PM

#

What is best for training the model to recognize a certain pose? Textual inversion, dreambooth or training a hypernetwork?

#

I have 50-100 images of the same pose, but it's not the same person in any of them. Would that be something that could be trained?

sullen apex Oct 13, 2022, 6:39 PM

#

Probably. On that note, does anyone have a reference for a list of all of the classes? Like person, dog, etc.

hybrid pilot Oct 13, 2022, 6:50 PM

#

hoary slate I have 50-100 images of the same pose, but it's not the same person in any of th...

At least with Imagen there was another channel you could use, and one guy on LAION used a pose estimator to create that extra channel's worth of data, so his models would always do really well with posing. I'm not sure if anything like that would be possible with SD, but I'm not a good enough programmer to know for sure.

ashen perch Oct 13, 2022, 6:58 PM

#

Preprocessing with "Use BLIP for caption" used to name the images with the captions; now the captions are placed in a .txt file. Does anyone know why?

velvet glen Oct 13, 2022, 7:19 PM

#

silent spear I've trained it on ~300 transparent PNGs and it didn't have any issues relating ...

I used transparent PNGs for training and it was a bad idea: the transparent areas become opaque and the invisible part is trained as part of the image, giving weird results when using the resulting model to generate new images... 😦

silk crystal Oct 13, 2022, 7:38 PM

#

Hello, I am trying to finetune SD with Textual inversion and I am getting poor results
In short, the training and validation losses oscillate during training, and the images generated in logs/images don't show any improvement even after 3h of training

I am using the InvokeAI colab notebook (https://github.com/invoke-ai/InvokeAI), which uses the hyperparameters of the original paper, and I have 5 images in my dataset, as it recommends

#

Here are images in the training set :

#

And here is a sample of the results :

silent spear Oct 13, 2022, 7:42 PM

#

velvet glen I used transparent PNGs for training and It was a bad idea: the transparent area...

Argh, sorry about that. I'm wondering if the transparency actually played more of an issue in my tests too. Maybe I was misunderstanding where the problems were coming from :/

velvet glen Oct 13, 2022, 7:43 PM

#

silent spear Argh, sorry about that. I'm wondering if the transparency actually played more o...

I guess we will have to wait for another model that takes into account the alpha channel as well as RGB

silent spear Oct 13, 2022, 7:46 PM

#

silk crystal Hello, I am trying to finetune SD with Textual inversion and I am getting poor r...

I haven't used the InvokeAI training myself, but I find that with styles (especially more fantastical styles, or with fantastical subjects) you need more source images than normal training. At 5 images, it will be picking up the gist of the style (which it seems to be doing well) but has no idea how to apply it more broadly. Depending on your tolerance for pain, I'd try adding images in batches of 5 to see what happens. There'll be a sweet spot in there somewhere.

silk crystal Oct 13, 2022, 7:46 PM

#

Interesting, thanks

#

Also if you know an alternative way to finetune SD with textual inversion I would be glad to check it

silent spear Oct 13, 2022, 7:48 PM

#

I've switched over to using Automatic1111's version lately. The settings are a bit easier to manage, and it gives fancy PNG-based embeddings, which are 900 kinds of awesome.

silk crystal Oct 13, 2022, 7:48 PM

#

Thanks a lot, I am gonna take a look 😄

#

It looks very awesome 😄

silent spear Oct 13, 2022, 8:02 PM

#

I used to run a more customized version based on some colabs (a few weeks ago) but Auto just moves so fast that anything else felt like I was missing out on the future. Though as a warning: you may wake up some days and discover everything has turned upside-down and nothing works anymore. Just wait a few minutes and there'll probably be a new version you can git pull.

nimble harness Oct 13, 2022, 8:42 PM

#

https://twitter.com/Rahmeljackson/status/1580653799223402497?s=20&t=YS31XkqHF-0Rj9rgJPbR2w

jinofcoolnes (@Rahmeljackson)

The Arcane Vi model is out, Let me know what you think and hope you like it!

Download link here:
https://t.co/Fad09PI8Z8

#digitalart #arcanefanart #LeagueOfLegendsFanArt

silk crystal Oct 13, 2022, 8:45 PM

#

silent spear I used to run a more customized version based on some colabs (a few weeks ago) b...

What do you suggest for the number of vectors per token for a style transfer training ?
I would have tried myself before asking if I wasn't renting a GPU for that 😅

silent spear Oct 13, 2022, 8:49 PM

#

I think for style training you want to be around the 10ish range (someone can correct me if I'm wrong). Then again, your specific style is both visual and conceptual, so maybe a bit higher would work too. All I can say for sure is that I once trained a style on 50 images at 30 vectors/token and all it would produce was very strange garbage that looked similar to my source, but in a truly demented way. Heh. Not sure that helps 🙂

crimson wasp Oct 13, 2022, 9:47 PM

#

If anybody wants to try experimenting with textual inversion which limits embeddings to the range of weights seen in the original embeddings, I wrote some changes to Automatic's code with his help. You can replace modules\textual_inversion\textual_inversion.py with this, and you can play with the power of the effect on line 265. The original author of the textual inversion paper said that it should in theory help to retain editability of an embedding, and make it play nicer with other prompts

📎 textual_inversion.py

#

The changes are just the function determine_embedding_distribution, and where it's called to get the floor/ceiling, and then where they are used
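For readers without the attachment, here is a rough sketch of the general idea of limiting a trained embedding to the range of values seen in the original embeddings. This is not the code from the attached file: `embedding_range`, `clamp_embedding` and the `strength` parameter are hypothetical names, plain lists stand in for tensors, and a per-dimension min/max is only one assumed way to define the floor and ceiling.

```python
def embedding_range(original_vectors):
    """Per-dimension (floor, ceiling) over the original embedding vectors."""
    columns = list(zip(*original_vectors))
    return [min(col) for col in columns], [max(col) for col in columns]


def clamp_embedding(vector, floor, ceiling, strength=1.0):
    """Pull each value toward the [floor, ceiling] range.

    strength=1.0 clamps fully; 0.0 leaves the vector unchanged.
    `strength` is a hypothetical stand-in for the "power of the effect".
    """
    clamped = [min(max(v, lo), hi) for v, lo, hi in zip(vector, floor, ceiling)]
    return [v + strength * (c - v) for v, c in zip(vector, clamped)]


# Toy data: three "original" 2-dimensional embeddings and one trained vector.
original = [[-0.5, 0.25], [0.5, -0.25], [0.0, 0.0]]
floor, ceiling = embedding_range(original)
trained = [2.0, -0.125]
limited = clamp_embedding(trained, floor, ceiling)
```

In this toy case only the first value lies outside the observed range, so only that value is pulled back; the second is left alone.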

hidden hatch Oct 14, 2022, 12:41 AM

#

What went wrong with this embedding? Too many steps?

silent spear Oct 14, 2022, 12:49 AM

#

Feels like it, yeah. How many steps vs source images?

hidden hatch Oct 14, 2022, 12:58 AM

#

13 source images of my dog and 6k steps

#

At 1k steps the animal didn't quite look like my dog; it looked very generic.

tribal rapids Oct 14, 2022, 12:59 AM

#

ashen perch Preprocessing with Use BLIP for caption used to name the images with the captio...

if you look in dataset.py you can see it's looking at these text files now

#

text_filename = os.path.splitext(path)[0] + ".txt"

silent spear Oct 14, 2022, 1:02 AM

#

hidden hatch 13 source images of my dog and 6k steps

Hmm, yeah, the sweet spot is probably in the 3-4k range, I suspect. I'm finding there's a spot pretty early on (1.5-2k) where things start to look decent, and by 3k they look solidly recognizable... and then it goes downhill fast once you pass 5k. But then no two trainings seem to be exactly alike, so it's hard to pin down an absolute truth to all this. At least not yet. I'm still working out a process 🙂

tribal rapids Oct 14, 2022, 1:02 AM

#

@ashen perch if it can find the text file, it uses the words in there, otherwise it splits them from the filename (the old way)
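The lookup described here can be sketched roughly as follows. Only the `text_filename` line comes from the message above; the function name `caption_for` and the exact filename fallback (turning dashes and underscores into spaces) are assumptions for illustration, not the repo's actual code.

```python
import os


def caption_for(image_path):
    """Return the caption for an image: sidecar .txt if present, else the filename."""
    text_filename = os.path.splitext(image_path)[0] + ".txt"
    if os.path.exists(text_filename):
        with open(text_filename, "r", encoding="utf8") as f:
            return f.read().strip()
    # Fallback (assumption): treat the file stem as the caption,
    # turning dashes and underscores into spaces.
    stem = os.path.splitext(os.path.basename(image_path))[0]
    return stem.replace("-", " ").replace("_", " ").strip()
```

So renaming the images by hand still works, but a `.txt` file next to the image takes priority if one exists.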

ashen perch Oct 14, 2022, 4:53 AM

#

I already renamed them manually 😄

#

I might be doing something wrong 😦

#

I took some screenshots from Heroes 3 HD, and cropped some parts into separate images, rescaled them into 512x512 and named them

#

there are 40 images total, +40 mirrored

#

#

I think the descriptions might be the problem

#

at 1600 steps, it produced results like this (I saved an image every 100 steps)

#

#

at 2500

#

#

at 5k

#

#

and it became worse, I ran for 23500 steps 😄

midnight knot Oct 14, 2022, 5:04 AM

#

blip is kinda stupid... like 90% of the pictures of me that I ran thru it says that im holding a pizza or a cellphone or a remote and wearing a backpack.

#

is there a better option than BLIP?

silk crystal Oct 14, 2022, 5:17 AM

#

silent spear I think for style training you want to be around the 10ish range (someone can co...

In your opinion, is it better to have as many images as possible from the same artist even if the style (mostly color palettes) slightly diverges,
or should I stick to the ones I really want SD to learn?

bleak swallow Oct 14, 2022, 5:22 AM

#

try the new deepdanbooru

ashen perch Oct 14, 2022, 5:26 AM

#

ashen perch

Please help me figure out what I’m doing wrong

#

I used 30 tokens and * as initialization text

silk crystal Oct 14, 2022, 5:48 AM

#

I am as ignorant as you unfortunately
We can just wait for people to live their lives before they answer 😛

ashen perch Oct 14, 2022, 5:58 AM

#

I’ve been trying to make it work for a week and I got about 0 help from here 😅 I don’t know if the problem is the file names, the images, or the initialization text

silk crystal Oct 14, 2022, 5:59 AM

#

From my limited experience, initialization text can make a huge difference
your file names look fine

ashen perch Oct 14, 2022, 6:33 AM

#

And what should I enter if I wanna train styles?

silk crystal Oct 14, 2022, 6:37 AM

#

For me entering the actual style I wanted to learn gave better results

"Sci-fi" or "space-opera"

#

I may be very wrong though

#

(and may confuse initialization text, embedding name and initializer words 😅)

final matrix Oct 14, 2022, 8:03 AM

#

so i am trying to finetune a model using this pokemon diffusion tutorial https://lambdalabs.com/blog/how-to-fine-tune-stable-diffusion-how-we-made-the-text-to-pokemon-model-at-lambda/ and this repo https://github.com/justinpinkney/stable-diffusion and I keep running into this error:

Found nested key 'state_dict' in checkpoint, loading this instead
/venv/lib/python3.8/site-packages/pytorch_lightning/loggers/test_tube.py:105: LightningDeprecationWarning: The TestTubeLogger is deprecated since v1.5 and will be removed in v1.7. We recommend switching to the pytorch_lightning.loggers.TensorBoardLogger as an alternative.
  rank_zero_deprecation(
Monitoring val/loss as checkpoint metric.
Merged modelckpt-cfg: 
{'target': 'pytorch_lightning.callbacks.ModelCheckpoint', 'params': {'dirpath': 'logs/2022-10-13T08-13-33_pokemon/checkpoints', 'filename': '{epoch:06}', 'verbose': True, 'save_last': True, 'monitor': None, 'save_top_k': -1, 'every_n_train_steps': 2000}}
ModelCheckpoint(save_last=True, save_top_k=-1, monitor=None) will duplicate the last checkpoint saved.
/venv/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/ddp.py:20: LightningDeprecationWarning: The pl.plugins.training_type.ddp.DDPPlugin is deprecated in v1.6 and will be removed in v1.8. Use pl.strategies.ddp.DDPStrategy instead.
  rank_zero_deprecation(
/venv/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:297: LightningDeprecationWarning: Passing Trainer(accelerator='ddp') has been deprecated in v1.5 and will be removed in v1.7. Use Trainer(strategy='ddp') instead.
  rank_zero_deprecation(
/venv/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:317: LightningDeprecationWarning: Passing <pytorch_lightning.plugins.training_type.ddp.DDPPlugin object at 0x7fd1e0c3f790> strategy to the plugins flag in Trainer has been deprecated in v1.5 and will be removed in v1.7. Use Trainer(strategy=<pytorch_lightning.plugins.training_type.ddp.DDPPlugin object at 0x7fd1e0c3f790>) instead.
  rank_zero_deprecation(
/venv/lib/python3.8/site-packages/pytorch_lightning/loops/utilities.py:92: PossibleUserWarning: max_epochs was not set. Setting it to 1000 epochs. To train without an epoch limit, set max_epochs=-1.
  rank_zero_warn(
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

Traceback (most recent call last):
  File "main.py", line 846, in <module>
    data.prepare_data()
  File "/workspace/stable-diffusion/main.py", line 211, in prepare_data
    instantiate_from_config(data_cfg)
  File "/workspace/stable-diffusion/ldm/util.py", line 79, in instantiate_from_config
    return get_obj_from_str(config["target"])(**config.get("params", dict()))
  File "/workspace/stable-diffusion/ldm/util.py", line 87, in get_obj_from_str
    return getattr(importlib.import_module(module, package=None), cls)
  File "/venv/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 973, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'ldm.data.local'

The Lambda Deep Learning Blog

How to fine tune stable diffusion: how we made the text-to-pokemon ...

Stable Diffusion is great at many things, but not great at everything, and getting results in a particular style or appearance often involves a lot of work "prompt engineering". If you have a particular type of image you'd like to generate, then an alternative to spending a long time crafting

GitHub

GitHub - justinpinkney/stable-diffusion

Contribute to justinpinkney/stable-diffusion development by creating an account on GitHub.

#

any ideas on how to fix that?

final matrix Oct 14, 2022, 8:05 AM

#

bleak swallow try the new deepdanbooru

@midnight knot my blip captions were also garbage and i can confirm that the deepdanbooru were much much better and more accurate.

midnight knot Oct 14, 2022, 8:07 AM

#

final matrix @midnight knot my blip captions were also garbage and i can confirm that ...

They're somewhat better in my case. not great either tho.

final matrix Oct 14, 2022, 8:08 AM

#

you can also change the tags (like 1girl to girl or brown_hair to brown hair) by using notepad++: press Ctrl+Shift+F (Find in Files) and select the folder with your captions
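The same renames can be scripted. A minimal sketch, assuming the captions are comma-separated tags stored in `.txt` files in one folder; the `RENAMES` table and both function names are illustrative, not part of any existing tool.

```python
import os

# Illustrative renames; extend as needed.
RENAMES = {"1girl": "girl", "1boy": "boy"}


def clean_tags(caption):
    """Rename listed tags and turn underscores into spaces in a comma-separated caption."""
    tags = [t.strip() for t in caption.split(",") if t.strip()]
    tags = [RENAMES.get(t, t).replace("_", " ") for t in tags]
    return ", ".join(tags)


def clean_caption_folder(folder):
    """Rewrite every .txt caption file in `folder` in place; returns how many changed."""
    changed = 0
    for name in sorted(os.listdir(folder)):
        if not name.endswith(".txt"):
            continue
        path = os.path.join(folder, name)
        with open(path, "r", encoding="utf8") as f:
            original = f.read()
        cleaned = clean_tags(original)
        if cleaned != original:
            with open(path, "w", encoding="utf8") as f:
                f.write(cleaned)
            changed += 1
    return changed
```

Since this overwrites files in place, it is worth running on a copy of the caption folder first.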

midnight knot Oct 14, 2022, 8:10 AM

#

yeah

#

i gotta figure out if im doing things correctly. tried training hypernetwork but it didnt work at all

final matrix Oct 14, 2022, 8:12 AM

#

final matrix so i am trying to finetune a model using this pokemon diffusion tutorial https:/...

i see i am the only one with this error here. i searched it in the search box and nothing came up.

so that means i must have missed something in the tutorial. but the tutorial is so lightweight idk.

midnight knot Oct 14, 2022, 8:24 AM

#

try newer python

final matrix Oct 14, 2022, 9:22 AM

#

midnight knot try newer python

like which one

#

but also i don't think that's going to fix the error, because a file clearly seems to be missing?
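One quick way to test the "missing file" theory is to ask Python which parts of the dotted path it can actually find. A minimal sketch, to be run from the repo root in the same environment as main.py; `module_available` is an illustrative helper name, and the module names are taken from the traceback above.

```python
import importlib.util


def module_available(dotted_name):
    """True if `dotted_name` can be located on the current sys.path."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. 'ldm' or 'ldm.data') is itself missing.
        return False


# Walk down the dotted path to see where the lookup first fails.
for name in ("ldm", "ldm.data", "ldm.data.local"):
    print(name, "->", "found" if module_available(name) else "MISSING")
```

If `ldm` and `ldm.data` are found but `ldm.data.local` is not, the checkout being run has no `ldm/data/local.py`, which points at the repo or branch rather than the Python version.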

midnight knot Oct 14, 2022, 9:46 AM

#

i dont see any local.py file in my setup

#

for ldm

#

3.10.6 is the one i work with

silk crystal Oct 14, 2022, 9:47 AM

#

The issue might be the stable diffusion version

#

I didn't look in the details though

midnight knot Oct 14, 2022, 9:52 AM

#

you might be better off looking for a more popular repo.

final matrix Oct 14, 2022, 9:56 AM

#

midnight knot you might be better off looking for a more popular repo.

i am already following a waifudiffusion tutorial but i still wanna know what's wrong with this one lol
ill try 3.10.6 then

silent spear Oct 14, 2022, 12:29 PM

#

silk crystal In your opinion, is it better to have as many images as possible from the same ar...

It kinda depends, but I can put it this way: I loaded up a ton of images from an artist who does painting and pencil art (nicely shaded, but noticeably pencil). The results were insanely messy, because it seemed to be trying to reconcile the two visual effects at the same time (weirdly, eyeballs all came out very pencil-drawn, while the rest of the face was painted). So if it's mostly the color palettes in play, you might be able to mix and match, but in general I will load an artist's different styles as unique embeddings, just to keep them distinct.

(and, as with everything SD, don't show it anything you don't want it to learn from, because it will invariably obsess over the ONE image you didn't really want 🙂)

stray idol Oct 14, 2022, 12:44 PM

#

Got tired of gimp: https://github.com/dfaker/quick-ti-cropper

GitHub

GitHub - dfaker/quick-ti-cropper: quick cropping utility to grab 1:...

quick cropping utility to grab 1:1 ratio sections from a folder of images

silent spear Oct 14, 2022, 12:49 PM

#

ashen perch Please help me figure out what I’m doing wrong

This one is gonna be tricky 🙂

So, as a kinda foundational concept, assume the AI is incredibly stupid and needs a lot of hand-holding. Give it a picture of a mausoleum and it will say "WTF is that?" Tell it it's a mausoleum and it will learn that all mausoleums look like that one image. Ask it to draw a mausoleum and it will spit out exactly what you showed it. That's why we use multiple images for each subject, to help it learn enough about a mausoleum that it can connect certain dots and make up its own stuff.

For TI, I like to give it some help by feeding some initialization text like "building" so it can blend its general knowledge of buildings with the image of the mausoleum, which lets it fill in gaps more easily.

BUT: you are actually feeding it multiple things at once. You've got a style + subjects. So you're basically saying "here's a picture of a castle in the middle of a forest" and the AI is learning and saying "OK, I am ready for another castle in the middle of a—" and then you give it a picture of a medieval building with a flag on the roof. And the AI is saying "WTF I don't see what any of these have in common..." and starts grasping at straws to find commonalities.

Now again, if the initialization text says "building" then it might have a bit more of a foundation (ha!) to build on, but in my experience it's gonna struggle either way. The longer you train it, the more desperate it will be to find ANYTHING that connects the source images, and you'll start to get really freaky images. I have become overly fond of the concept of locking the AI in a room with a set of photos and telling it the only way it can get out is if it figures out what they all have in common... the longer it's in there, the more delirious it's gonna be 🙂

(tbc)

untold peak Oct 14, 2022, 12:53 PM

#

final matrix so i am trying to finetune a model using this pokemon diffusion tutorial https:/...

I actually suggest not following the tutorial fully. They used Blip for their training set labels and they are terrible. It can produce pokemon every time, but it really doesn't take advantage of a fine-tuning.

#

It might as well have used a single token.

silent spear Oct 14, 2022, 12:59 PM

#

When I train styles, I like to give it very plain subjects that I know it will understand. Standard-issue humans, trees, bridges etc. Things I know for sure it will understand without much hassle. That way, it will be looking at the style, not the subject. Anything "unique" or "different" will send it down the wrong path.

The tricky part with your set (and this is out of your control, I know) is that you've got cool architecture and angles and overall CONCEPTUAL style, as well as the visual style. So the AI is going to be struggling to understand what it is you want it to learn. Especially in things like the mountains (which may not match any mountains the AI recognizes off the top of its head) or the graveyard (which is busy and probably hard for it to pick out individual features from). Imagine it's locked in a room and looking at that graveyard pic and you're saying "what do you notice about this photo of a graveyard?!" and it's scanning its memory trying to reconcile what it knows about graveyards with what you've given it. It's not going to focus on the art style, it's going to obsess on the wrong things.

All of which is to say: I would trim back your set to only include images that are fairly clear and distinct, where there is clearly a building that it might recognize as a building, with very little excess noise around it (so avoid shots where the BG has colors that match the foreground, or it may not recognize the object) and see how that works.

Then, once you get your style locked, you can use that style to generate new sub-classes of, say, architecture. So you can say "draw me a mausoleum in the style of style-123" and it will hopefully give you variations on a mausoleum that match the style you've built.

final matrix Oct 14, 2022, 12:59 PM

#

any of you know ways to convert a ton of images from webp to jpg or png very fast?

final matrix Oct 14, 2022, 1:02 PM

#

untold peak I actually suggest not following the tutorial fully. They used Blip for their tr...

yes we already talked about how shitty blip captions are and how deepdanbooru ones are better. thats not the issue i am having.

bleak swallow Oct 14, 2022, 1:02 PM

#

Irfanview can do it but it needs a plugin to open webp

final matrix Oct 14, 2022, 1:03 PM

#

bleak swallow Irfanview can do it but it needs a plugin to open webp

thats fine

#

ill try it thx

untold peak Oct 14, 2022, 1:04 PM

#

oh sure, I just wanted to make sure you weren't wasting your time and resources doing something suboptimal.

#

also FFMPEG can convert webps

#

So you can write a batch script.

#

"ffmpeg -i image.webp image.png"
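A batch version of that command could look roughly like this. It is a sketch that assumes ffmpeg is installed and on PATH and that all the .webp files sit in one folder; the function names are illustrative.

```python
import os
import subprocess


def build_convert_commands(folder, target_ext=".png"):
    """One ffmpeg command (as an argument list) per .webp file in `folder`."""
    commands = []
    for name in sorted(os.listdir(folder)):
        stem, ext = os.path.splitext(name)
        if ext.lower() != ".webp":
            continue
        src = os.path.join(folder, name)
        dst = os.path.join(folder, stem + target_ext)
        # Same shape as the single-file command: ffmpeg -i image.webp image.png
        commands.append(["ffmpeg", "-i", src, dst])
    return commands


def convert_folder(folder, target_ext=".png"):
    """Run the conversions one by one; requires ffmpeg on PATH."""
    for cmd in build_convert_commands(folder, target_ext):
        subprocess.run(cmd, check=True)
```

Calling `convert_folder("my_images")` would write a .png next to each .webp; pass `target_ext=".jpg"` for JPEG output instead.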

final matrix Oct 14, 2022, 1:08 PM

#

on another server just now:
"i use irfan view"
"go learn ffmpeg cmd line tools"

ashen perch Oct 14, 2022, 1:16 PM

#

silent spear When I train styles, I like to give it very plain subjects that I know it will u...

I’ll give it a try when I get home, should I take single buildings?
And should I set Style as initialization text?

silent spear Oct 14, 2022, 1:17 PM

#

Yeah, I would stick to single buildings where possible, and add maaaaaybe "painting style" as the initialization text. Just so it knows what ballpark it's playing in.

final matrix Oct 14, 2022, 1:21 PM

#

irfanview was easy, just had to install 2 exes

silent spear Oct 14, 2022, 1:34 PM

#

I know this is assigning a personality where none exists, but still: I was testing a 3k step TI embedding just now: "photo of a woman played by b153, short hair" and it generates an image where the woman is too far away to accurately gauge the face. I keep trying, keep getting distant shots, almost like it's so uncertain about the face that it's AVOIDING drawing it.

"photo of a woman played by b153, head and shoulders, close up, short hair" --- and it generates image after image of the woman turned away from the camera so I can't see the face.

😐

"photo of a woman played by b153, head and shoulders, close up, short hair, (facing camera:1.3)"

...

Shot of the woman with long hair swept across her face.

viral jay Oct 14, 2022, 3:02 PM

#

Hello guys, so after some experimentation, I think hypernetwork can't be used alone, but it's a good complement to textual inversion. Here's a test I did: 3k steps of textual inversion at a 0.002 learning rate, then 3k steps of hypernetwork at a 0.000002 learning rate. The last image is the control image.

silk crystal Oct 14, 2022, 3:14 PM

#

Not Bad gg

fair perch Oct 14, 2022, 3:23 PM

#

viral jay Hello guys, so after some experimentation, I think hypernetwork can't be used al...

i've read in a youtube comment that hypernetworks works for styles and poses, not for objects

viral jay Oct 14, 2022, 3:24 PM

#

fair perch i've read in a youtube comment that hypernetworks works for styles and poses, no...

yeah, it can actually improve the subjects as you can see on her face/clothes after hypernetwork, but alone I think its not able to do a good job like textual inversion

#

the downside is that, as it's more efficient on style, most of the images produced with the hypernetwork give her a doll-like face

#

but its producing some very nice results this way

00344-4022985128-face_wattson_face_0.2_throne_of_glass_fantasy_portrait_of_a_magician_casting_a_spell_d__d_high_fantasy_detailed_digital_a.png

00348-1127386896-wattson__6_young_jada_pinkett_smith_as_aeon_flux_profile_picture_by_Greg_Rutkowski_the_matrix_white_hair_dynamic_pose_intr.png

00343-1951463436-portrait_of_beautiful_blonde_face_wattson_face_0.5.png

alpine rose Oct 14, 2022, 4:07 PM

#

stray idol Got tired of gimp: https://github.com/dfaker/quick-ti-cropper

couldn't test yet, but I assume you can crop the same image multiple times? (to get face, upper body, whole body)

stray idol Oct 14, 2022, 4:11 PM

#

yeah, click as many times as you like for repeated crops.

alpine rose Oct 14, 2022, 4:25 PM

#

so good 🤩

stray idol Oct 14, 2022, 4:26 PM

#

Simplest thing that could possibly work 🧠

alpine rose Oct 14, 2022, 4:27 PM

#

do you know of a tool to extract images from multiple videos ?

#

lets say one image every sec

stray idol Oct 14, 2022, 4:28 PM

#

I have an ffmpeg script to do just that
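For reference, one frame per second from every video in a folder can be done with ffmpeg's `fps` filter. The sketch below is not the script stray idol is referring to; it assumes ffmpeg is on PATH, and the function names, extension list and output naming pattern are illustrative choices.

```python
import os
import subprocess

# Extensions treated as videos (an illustrative list).
VIDEO_EXTS = {".mp4", ".mkv", ".mov", ".webm"}


def build_frame_commands(video_dir, out_dir, fps=1):
    """One ffmpeg command per video, sampling `fps` frames per second."""
    commands = []
    for name in sorted(os.listdir(video_dir)):
        stem, ext = os.path.splitext(name)
        if ext.lower() not in VIDEO_EXTS:
            continue
        # ffmpeg expands %05d into a running frame number per output image.
        pattern = os.path.join(out_dir, stem + "_%05d.png")
        commands.append(["ffmpeg", "-i", os.path.join(video_dir, name),
                         "-vf", "fps=" + str(fps), pattern])
    return commands


def extract_frames(video_dir, out_dir, fps=1):
    """Create `out_dir` if needed and run every command; requires ffmpeg on PATH."""
    os.makedirs(out_dir, exist_ok=True)
    for cmd in build_frame_commands(video_dir, out_dir, fps):
        subprocess.run(cmd, check=True)
```

The extracted frames can then go through a cropping tool to get the 1:1 training images.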

alpine rose Oct 14, 2022, 4:28 PM

#

HAHA no way

#

i'll look into it

silk crystal Oct 14, 2022, 4:30 PM

#

silent spear This one is gonna be tricky 🙂 So, as a kinda foundational concept, assume the...

> So you're basically saying "here's a picture of a castle in the middle of a forest" and the AI is learning and saying "OK, I am ready for another castle in the middle of a—" and then you give it a picture of a medieval building with a flag on the roof.
Very interesting
So if I have multiple characters in the same art style doing different things (shooting, looking through a microscope...), I shouldn't give those details in the prompts (like BLIP does)?
I should just put "a cartoon character in the style of X"?

silent spear Oct 14, 2022, 4:34 PM

#

silk crystal > So you're basically saying "here's a picture of a castle in the middle of a fo...

That's where you run into the very difficult question of "what will the AI get hung up on?" because if it's reasonably clear what's happening, the extra detail in the prompts seems to help it focus on the art, but for instance "looking through the microscope" has a decent chance of causing you trouble, because it may not know what a microscope is (with any certainty, anyway) and skew further away from the goal while trying to figure it out.

So yeah, I typically stay with "a cartoon character in the style of X" and hope for the best. But then I also don't usually have a fantastically diverse set of source images to begin with, so that probably hurts/helps too 🙂

silk crystal Oct 14, 2022, 4:34 PM

#

Thanks a lot for those very useful insights 🙏

silent spear Oct 14, 2022, 4:35 PM

#

silk crystal Thanks a lot for those very useful insights 🙏

I run into walls so others don't have to 😄

stray idol Oct 14, 2022, 4:40 PM

#

alpine rose HAHA no way

getframes.py added to that same repo.

alpine rose Oct 14, 2022, 4:43 PM

#

https://tenor.com/view/giga-chad-gif-23143840

Tenor

#

@stray idol atm

#

i was getting somewhere 🥺

stray idol Oct 14, 2022, 4:45 PM

#

It was just laying around my old project junk.

silk crystal Oct 14, 2022, 5:05 PM

#

It would be nice if there was a way to identify the images where training loss is clearly rising 🤔

stray idol Oct 14, 2022, 6:58 PM

#

silk crystal It would be nice if there was a way to identify the images where training loss i...

Great idea, a top 10 source image vs loss table?

stray idol Oct 14, 2022, 7:13 PM

#

silk crystal It would be nice if there was a way to identify the images where training loss i...

Gave it a quick go, quite variable as it depends on image vs prompt

#

fervent grail Oct 14, 2022, 7:19 PM

#

Why would you try tagging everyone? You monster

silk crystal Oct 14, 2022, 8:10 PM

#

stray idol Gave it a quick go, quite variable as it depends on image vs prompt

Interesting, where do you have to edit the code to print that?

Maybe a mean loss per image would be more relevant

stray idol Oct 14, 2022, 8:22 PM

#

textual_inversion.py

#

Set up a dict before the loop (entryLoss = {} ahead of for i, entry in pbar:), then push entries inside the loop after the loss is calculated: entryLoss[entry.filename] = loss.item()

#

yeah eventually maybe
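The mean-loss-per-image idea can be sketched as a small extension of that dict. In the sketch below a list of (filename, loss) pairs stands in for the real training loop, and `record_loss` and `worst_images` are illustrative names; in the actual loop the call would presumably sit where `entryLoss[entry.filename] = loss.item()` goes.

```python
from collections import defaultdict

entry_losses = defaultdict(list)  # filename -> every loss value seen for that image


def record_loss(filename, loss_value):
    """Call once per training step, right after the loss is computed."""
    entry_losses[filename].append(float(loss_value))


def worst_images(top_n=10):
    """Images ranked by mean loss, highest first."""
    means = {name: sum(vals) / len(vals) for name, vals in entry_losses.items()}
    return sorted(means.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


# Stand-in for the training loop: (filename, loss) pairs.
for filename, loss in [("castle.png", 0.5), ("graveyard.png", 1.0),
                       ("castle.png", 0.25), ("graveyard.png", 0.5)]:
    record_loss(filename, loss)

ranking = worst_images(top_n=2)
```

Averaging over many steps should smooth out some of the prompt-dependent variation mentioned above, though an image can still rank high simply because it is hard, not because it is a bad fit.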

ashen perch Oct 14, 2022, 8:36 PM

#

silent spear This one is gonna be tricky 🙂 So, as a kinda foundational concept, assume the...

I'm giving it a try. I still generate an image every 100 steps, and this is the result at 300 steps. Why did it create people? The prompt was "a homestead with thatched roof, in style of homm3v10tk10buildstyle". I used only 10 tokens this time, and the initialization text was "building style".

#

these are my sample images

silk crystal Oct 14, 2022, 8:45 PM

#

your dataset looks very heterogeneous, from my understanding of entropie's messages
I would keep only the castles

ashen perch Oct 14, 2022, 8:46 PM

#

you mean these 6?

#

or the first 2 and the 4th?

silk crystal Oct 14, 2022, 8:48 PM

#

everything on this picture except the last one

obtuse sleet Oct 14, 2022, 8:52 PM

#

for anyone doing training of any kind, i believe this tool to be invaluable compared to other methods of image cropping: https://github.com/sneerz-1/squarize-images-update

GitHub

GitHub - sneerz-1/squarize-images-update

Contribute to sneerz-1/squarize-images-update development by creating an account on GitHub.

silk crystal Oct 14, 2022, 8:58 PM

#

On my side, here are my current results

#

The training set
All images were captioned by "a cartoon man/woman"

#

1000th training step :

#

8000th training step:

#

21000th:

#

Attempt to generate something with the fine tuned model with the prompt "a woman in the style of mush characters" :

#

(a training set sample to compare more closely)

ashen perch Oct 14, 2022, 9:02 PM

#

what was your initialization text? did you use textual inversion?

silk crystal Oct 14, 2022, 9:03 PM

#

initialization text was "sci-fi character" with 10 tokens and i am using textual inversion

#

I feel like I am close to getting very good results but it lacks something
Using more precise prompts gives very poor results, so as the textual inversion paper suggests, I think there are too many images in the training set
poke @silent spear 👉 👈

next nimbus Oct 14, 2022, 9:19 PM

#

hey guys, can we run dreambooth on windows without installing Linux?

fervent grail Oct 14, 2022, 9:20 PM

#

next nimbus hey guys, can we run dreambooth on windows without installing Linux?

you have to at least have WSL 2 - which is a linux subsystem, but you run it on windows

next nimbus Oct 14, 2022, 9:24 PM

#

fervent grail you have to at least have WSL 2 - which is a linux subsystem, but you run it on ...

So there is no way around linux yet?

#

And 2nd question, what if I run colab on my pc with jupyter?

fervent grail Oct 14, 2022, 9:27 PM

#

next nimbus So there is no way around linux yet?

WSL is not a large download, and not invasive at all.

#

I can't answer that one, I don't use colab

next nimbus Oct 14, 2022, 9:28 PM

#

fervent grail I can't answer that one, I don't use colab

I see, thanks m8, going to check colab and if it didn't work, I will try the WSL.

silk crystal Oct 14, 2022, 9:28 PM

#

next nimbus And 2nd question, what if I run colab on my pc with jupyter?

you could do that yep, if you find a notebook which works (or want to craft it yourself)

next nimbus Oct 14, 2022, 9:28 PM

#

silk crystal you could do that yep, if you find a notebook which works (or want to craft it y...

Ah, cool, thanks!

vale egret Oct 14, 2022, 9:28 PM

#

In automatic 1111 how do styles work?

fervent grail Oct 14, 2022, 9:28 PM

#

vale egret In automatic 1111 how do styles work?

elaborate.

vale egret Oct 14, 2022, 9:29 PM

#

There’s a create style button. What does it do, and how do you use it?

next nimbus Oct 14, 2022, 9:30 PM

#

Any of you guys know of a colab that uses deepspeed so it will work on 8GB VRAM?

#

The one I found asks for 14GB :/

silk crystal Oct 14, 2022, 9:31 PM

#

you can pay to use GPUs with more VRAM 🤷

next nimbus Oct 14, 2022, 9:33 PM

#

silk crystal you can pay to use GPUs with more VRAM 🤷

Sure, but I was thinking of using it locally on my GTX 1070.

vale egret Oct 14, 2022, 9:33 PM

#

vale egret There’s a create style button. What does it do, and how do you use it?

So, anyone know what this does?

unborn basin Oct 14, 2022, 9:36 PM

#

When training embeddings, does the model you train on matter? For example, if I use waifu diffusion to train an embedding, will that same embedding be as accurate as one trained on the regular model?

next nimbus Oct 14, 2022, 9:36 PM

#

vale egret So, anyone know what this does?

either look at the wiki of auto or search for it here, you may find more info about it.

unborn basin Oct 14, 2022, 9:37 PM

#

vale egret So, anyone know what this does?

I'd also suggest looking at the code to see what the button does

gloomy belfry Oct 14, 2022, 9:43 PM

#

obtuse sleet for anyone doing training of any kind, i _believe_ this tool to be invaluable co...

very interesting, thanks for sharing!

raven rain Oct 15, 2022, 1:04 AM

#

Does anyone know if it's possible to resume training a Dreambooth model from where you left off? Or do I have to start from the beginning on a fresh SD 1.4 model?

#

I can't find this answer anywhere, sorry if it's been asked

crimson wasp Oct 15, 2022, 1:16 AM

#

raven rain Does anyone know if it's possible to resume training a Dreambooth model from whe...

I've never trained the model but it should be possible, since you're already resuming from where the 1.4 model left off. I think if you prune the model to shrink the size it might remove what's needed to resume training though

hot breach Oct 15, 2022, 1:23 AM

#

I've resumed on 2gb models several times, no issues with the python cli from xavier and forks

hidden hatch Oct 15, 2022, 1:27 AM

#

Is there a way to delete an embedding? Do I just have to revert to the basic SD model?

stray idol Oct 15, 2022, 1:40 AM

#

just delete/move the embedding file, or don't use its special term.

hidden hatch Oct 15, 2022, 1:44 AM

#

I found the embedding file but wasn't sure if deleting it would break anything. I guess if it does it's easy enough to start over.

stray idol Oct 15, 2022, 1:48 AM

#

no that's safe, don't worry.

hot breach Oct 15, 2022, 2:17 AM

#

put a write up along with the checkpoint for my ff7r mega model here for those interested: https://huggingface.co/panopstor/ff7r-stable-diffusion

panopstor/ff7r-stable-diffusion · Hugging Face

#

onward to adding back laion data now

dreamy zenith Oct 15, 2022, 4:26 AM

#

Should the total step count in TI be proportional to the amount or detail of input images?

ashen perch Oct 15, 2022, 4:37 AM

#

silk crystal I feel like I am close to getting very good results but it lacks something Using mor...

wow

#

actually it worked

#

what you suggested

ashen perch Oct 15, 2022, 4:38 AM

#

silk crystal your dataset looks very heterogeneous from my understanding of entropie's message...

i mean this

#

not perfect, after 10k steps it still hasn't quite caught what I wanted

#

but I chose `render style` as initialization text, the prompt was `a render of [filewords], [name] style` and the preview prompt was `Big ben, homm3v11tk30renderstyle style`

#

and it gave me Big Ben!!!

#

first time it actually gave me what I entered in the prompt

#

😄

#

or not exactly 🤔

#

preview images are good

#

but if I try anything with `homm3v11tk30renderstyle style`, it gives me a random old guy

#

e.g. `Budapest, homm3v11tk30renderstyle style`

silk crystal Oct 15, 2022, 7:10 AM

#

ashen perch e.g `Budapest, homm3v11tk30renderstyle style`

Well this prompt doesn't seem precise enough to me

#

Try castle, building, mausoleum etc.

wintry girder Oct 15, 2022, 11:45 AM

#

What's the difference between a hypernetwork and an embed, in a practical sense?

#

After searching a lot, I see that the question is asked a lot but never answered

silk crystal Oct 15, 2022, 12:15 PM

#

I want to finetune SD on 50x50 images
Do I need to upscale them to 512x512 ? If yes, what is the best way to do it ?

stone garden Oct 15, 2022, 12:55 PM

#

Can someone explain hypernetworks and/or textual inversion and/or link to a website or article that explains it

crimson sandal Oct 15, 2022, 1:26 PM

#

hot breach put a write up along with the checkpoint for my ff7r mega model here for those i...

@hot breach as a huge FF7 fan, this is awesome. Quick question, did you do anything different in the training? Or did you feel like better labels and image groups did the trick? I am working on fine tuning a single model with multiple people that are not in group photos and am trying to find the best approach

unborn basin Oct 15, 2022, 1:29 PM

#

stone garden Can someone explain hypernetworks and/or textual inversion and/or link to a webs...

I'm an amateur at this, but my understanding of Textual Inversion is this: the language model translates your prompt into a vector that is used to guide the unet from the random noise towards an image. TI is the act of fine tuning that vector for specific concepts that the language model might not have a word for. Basically, TI is telling the network "when I say 'asadayo', I'm talking about images that look like this"
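
A toy sketch of the mechanism described above, under heavy simplification: the "model" is a fixed 2x2 matrix standing in for the frozen networks, and the only thing gradient descent updates is one small embedding vector. Nothing here is Stable Diffusion code; `frozen`, `target`, and the function names are made up for illustration.

```python
def matvec(matrix, vector):
    # Plain matrix-vector product on nested lists.
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def loss_and_grad(frozen, emb, target):
    # Squared error between the frozen model's output and the target,
    # plus the gradient of that error with respect to the embedding only.
    out = matvec(frozen, emb)
    err = [o - t for o, t in zip(out, target)]
    loss = sum(e * e for e in err)
    grad = [
        2 * sum(frozen[i][j] * err[i] for i in range(len(frozen)))
        for j in range(len(emb))
    ]
    return loss, grad


def train_embedding(frozen, emb, target, lr=0.05, steps=200):
    # Gradient descent on a copy of the embedding; `frozen` is never modified.
    emb = list(emb)
    for _ in range(steps):
        _, grad = loss_and_grad(frozen, emb, target)
        emb = [e - lr * g for e, g in zip(emb, grad)]
    return emb
```

The point of the toy is the division of labour: the weights stay exactly as they were, and the new "word" is just a vector that has been nudged until the frozen model produces the desired output for it. That is also why deleting an embedding file is safe, since the base model was never changed.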

#

I do not understand hyper networks and haven't experimented with them yet.

#

I would also love if someone who actually understands this stuff could chime in. I have a feeling I'm missing something about the mechanics of TI

humble bramble Oct 15, 2022, 2:31 PM

#

So, if I have understood correctly, TI/embeddings are instructions to tell the model "what" a word is and hypernetworks are instructions to tell the model how to recreate a particular style? What about VAE?

hot breach Oct 15, 2022, 3:55 PM

#

crimson sandal @hot breach as a huge FF7 fan, this is awesome. Quick question, did yo...

captioning images and having group photos are both big deals, I've trained the model with progressively more and more data now about 10 times

#

if 2 particular characters have a group photo, the likelihood you can get a good image of them together at inference time is a lot higher

#

I can do a pretty good job getting Cloud/Barret/Tifa in one image because there are a couple of examples of all three of them together in the training set, plus a ton of training images for each of them individually (140+ each now). Mixing aerith/jessie together is much harder: there are no images of those two together (not even sure they meet in the game?) and jessie has a slightly smaller training set herself (~90), which may affect that

#

worth noting jessie still looks spot on by herself with just the 90 solo images and a few with her and cloud/biggs/wedge

#

wedge/biggs have much smaller sets and don't look very good, I have more captures ready but trying to avoid just training again until I get laion data introduced back to replace regularization

feral lava Oct 15, 2022, 6:04 PM

#

is there any resource or chart for explaining the differences in the "Samplers"

hot breach Oct 15, 2022, 6:27 PM

#

feral lava is there any resource or chart for explaining the differences in the "Samplers"

#🤝｜tech-support message and next post as well

green flax Oct 15, 2022, 6:39 PM

#

do any of you know of a tool that will grab every frame from a video that contains a specified character

#

it should be possible to use ai to do this somewhat reliably and it would be an excellent source of images for textual inversion

hot breach Oct 15, 2022, 6:56 PM

#

only aware of speech models for "diarization", you'd need to have a model you could first train to recognize individual characters, certainly possible just don't know if something like that exists

sacred grail Oct 15, 2022, 8:41 PM

#

ashen perch but if I try anything with `homm3v11tk30renderstyle style`, it gives me a random...

homme is french for guy so that might be the problem

hot breach Oct 15, 2022, 8:42 PM

#

@stone garden @gusty thicket @hollow surge #🔧｜finetune message

stone garden Oct 15, 2022, 8:44 PM

#

hot breach @stone garden @gusty thicket @hollow surge https://discor...

well, I'll read in more detail, download, test, and share some feedback to you.
But already, thanks a lot for this detailed method, you are the MVP in the finetune game

hot breach Oct 15, 2022, 8:46 PM

#

one interesting thing to test is the main characters vs. side characters, it will give you an idea of how the weight of training data impacts quality, i.e. "biggs ff7r" has much less data than "cloud strife" and looks worse for it, and "rufus shinra" has a tiny training sample

#

"red xiii" is also still pretty awful, barely 10 images out of 1400

hollow surge Oct 15, 2022, 8:48 PM

#

how did waifu diffusion train a waifu model? is that dreambooth? textual inversion? or something else?

hot breach Oct 15, 2022, 8:48 PM

#

I don't think they used dreambooth

#

they trained only on anime content afaik, and did not attempt to "protect" the existing model, so it can no longer produce "a photo of tom cruise" etc

#

I actually haven't tested, I don't have WD ckpt

hollow surge Oct 15, 2022, 8:49 PM

#

it actually can, it's just more anime styled

hot breach Oct 15, 2022, 8:49 PM

#

yeah

hollow surge Oct 15, 2022, 8:49 PM

#

but yes it totally messed with every generation

hot breach Oct 15, 2022, 8:49 PM

#

they stomped on the model so to speak, reformed it to be very specialized

#

I believe next step for fine tuning is introducing laion data back and dropping regularization

#

I'm working on that as we speak

#

once regularization is dropped, it's not really "dreambooth" anymore

hollow surge Oct 15, 2022, 8:51 PM

#

no clue what that means but sounds good

#

regularization is the weird fact you have to generate like 300 class images right

hot breach Oct 15, 2022, 8:51 PM

#

yes

#

it's used to keep the model from overtraining on the training images, to "protect" the model to not forget how to draw the stuff it knew before you try to train new things

#

so the FF7R model can still do "a photo of tom cruise" and he doesn't look like a Final Fantasy video game character for instance
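
The "protect the model" idea can be sketched as a two-term loss, in the spirit of the prior-preservation loss from the DreamBooth paper: one term for the new training images and one weighted term for the generated class (regularization) images. This is a plain-Python illustration on lists of numbers, not code from any of the repos mentioned; `prior_weight` and both function names are assumptions.

```python
def mse(pred, target):
    # Mean squared error between two equal-length lists of numbers.
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def prior_preserving_loss(instance_pred, instance_target,
                          class_pred, class_target, prior_weight=1.0):
    # Instance term: how well the model fits the new training images.
    instance_term = mse(instance_pred, instance_target)
    # Prior term: how well it still reproduces what it generated for the
    # generic class before fine-tuning (the regularization images).
    prior_term = mse(class_pred, class_target)
    return instance_term + prior_weight * prior_term
```

With `prior_weight=0.0` the second term vanishes and only the new images are fitted, which is roughly the "dropping regularization" case discussed above; replacing the generated class images with real captioned data would change what the second term is computed on rather than the shape of the loss.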

hollow surge Oct 15, 2022, 8:52 PM

#

that makes sense

hot breach Oct 15, 2022, 8:53 PM

#

he will if you prompt "tom cruise standing on the rooftops of midgar city slums district" however, as the style transfer happens when you start prompting "midgar city slums district" and so forth

#

the gist or the huggingface links have links to imgur with examples of that

#

yeah it's on the gist if you scroll down, links there to imgur

hollow surge Oct 15, 2022, 8:57 PM

#

❗❗ THE NUMBER ONE MISTAKE PEOPLE MAKE ❗❗
Prompting with just your token. ie "joepenna" instead of "joepenna person"

#

is this true?

#

i haven't been using the class name and i thought my results were excellent. i'll have to try adding the class

hot breach Oct 15, 2022, 8:57 PM

#

I've loaded some models from other people and I've found person may not be required

#

it may learn enough on just "joepenna" to work, but keep in mind there are a variety of techniques being used

#

several of us have not been using "sks" this or that, or "person" for class_word on the older repos at all

#

I trained "john carmack" just like that, without "person" anywhere at all and it worked perfectly fine

#

richservo has a giant list of models trained this way as well, no "person" at the end, just the name

#

there's also no particular reason to train without spaces in names and such that I can tell, though really long names do use up tokens later when you want to prompt I guess

hollow surge Oct 15, 2022, 9:03 PM

#

btw, i just noticed automatic1111 ui seems to have no tokens limit!? that's crazy, didn't expect that to happen so soon / ever

fervent grail Oct 15, 2022, 10:37 PM

#

hollow surge btw, i just noticed automatic1111 ui seems to have no tokens limit!? that's craz...

This was very recent. I think I noticed it like last night on accident?

atomic lagoon Oct 15, 2022, 11:32 PM

#

I am installing webui and my PC is lagging so much, I can't move anything, is it normal?

fervent grail Oct 15, 2022, 11:44 PM

#

atomic lagoon I am installing webui and my PC is lagging so much, I can't move anything, is it...

yeah dont touch anything until its installed.

atomic lagoon Oct 15, 2022, 11:49 PM

#

fervent grail yeah dont touch anything until its installed.

Ok, how much time did it take?

inner turtle Oct 16, 2022, 12:06 AM

#

atomic lagoon Ok, how much time did it take?

Depends on the disk/bandwidth speed on your first time since it's downloading SD and creating a new environment. For me it took between 4-6 mins. But after the first time it takes 30-60 seconds

atomic lagoon Oct 16, 2022, 1:27 AM

#

how do I reopen it after the installation?

north nest Oct 16, 2022, 1:30 AM

#

Does anyone know What’s the vram requirement for the textual inversion training in webui?

bitter matrix Oct 16, 2022, 1:32 AM

#

Ok so I turned on live preview for sample steps. How does this help me refine an issue I see during the render? For example, an incomplete weapon missing a hilt... I'm not sure how this benefits me?

maiden coyote Oct 16, 2022, 3:44 AM

#

bitter matrix Ok so I turned on live preview for sample steps, How does this help me refine an...

For long upscales it allows you the ability to interrupt and restart after tweaking some parameters.. also if you are doing in painting and you see something you are not liking you can interrupt instead of having to wait the extra minute or so for the image to finish.

#

You can't force SD to fix something like missing body parts/incomplete weapons, etc., but you can at least identify that it's not coming out how you want it and interrupt early to tweak settings and try again

unborn fulcrum Oct 16, 2022, 3:47 AM

#

How good is hypernetwork? Anywhere I can see results?

viral jay Oct 16, 2022, 5:07 AM

#

unborn fulcrum How good is hypernetwork? Anywhere I can see results?

It's mainly for styles, I didn't achieve good results with face learning, but it's a helper for subject improvement

#

without hypernetwork

#

with hypernetwork

#

control

rose flame Oct 16, 2022, 6:00 AM

#

hollow surge btw, i just noticed automatic1111 ui seems to have no tokens limit!? that's craz...

Does that mean we can use multiple textual inversions at once?

rose flame Oct 16, 2022, 6:01 AM

#

unborn fulcrum How good is hypernetwork? Anywhere I can see results?

I used the same image library to train a ckpt last night and a hypernetwork, and discovered it's very helpful with a trained model or the normal SD model

#

My most important discovery was I had to turn hypernetwork strength to 0.5 for it to work!

#

I was getting rubbish results nothing like my reference material until I did that

#

Are people combining ckpt, hypernetwork and textual inversion with good results, or is that considered overkill, or is one of them considered redundant at this point?

harsh viper Oct 16, 2022, 8:09 AM

#

Good news, for those who love to mess around with hyper networks or embedding folder, you don’t need to restart the whole program to have the dropdown listing your new pt files. With my change (just merged to master) just simply click the refresh button, you will see the new pt files in the drop-down

silk crystal Oct 16, 2022, 8:23 AM

#

Hey, is there a playbook for style transfer learning ? 🤔

stray idol Oct 16, 2022, 8:59 AM

#

harsh viper Good news, for those who love to mess around with hyper networks or embedding fo...

A great convenience!

sturdy willow Oct 16, 2022, 9:19 AM

#

hypernetwork is great for styles , it does take some time to get it right

#

I trained a hypernetwork using some Mob psycho anime screenshots. Here are some comparisons

00068-852133260-bishounen_blue_eyes_medium_hair_japanese_armor_1girl_masterpiece_best_quality_masterpiece_medium_breasts_full_armor_l.png

00067-1304611545-bishounen_blue_eyes_medium_hair_japanese_armor_1girl_masterpiece_best_quality_masterpiece_medium_breasts_full_armor_l.png

#

26k steps

#

started with e5 then e6 then back to e5 for the last 10k steps

#

same params , just changed the HN for each one

silk crystal Oct 16, 2022, 10:53 AM

#

Did you use different characters in your training set ?

half folio Oct 16, 2022, 1:45 PM

#

Has anyone tried finetuning the full SD model (not TI or DB) using the script in the diffusers repo?

#

It looks very straightforward to me but I don't know if it works on colab or if the results are any decent as it says the script still is experimental.

viral jay Oct 16, 2022, 2:08 PM

#

what I've noticed is that hypernetworks allow bigger image sizes than TI

#

I can train up to 2048x2048 on my 12gb card

#

So after a bit of testing, I figured out that hypernetworks are actually able to learn faces too, but they work differently from TI

#

That's me (photo)

#

that's a test with hypernetwork, 2048x2048, 42 images, 0.000025 learning rate, 1500 steps, images have been labelled with BLIP

#

from my tests during training I can say that a lower learning rate wasn't giving me the desired output: I was training 15-20k steps and still got nothing. So I kept increasing the learning rate, and at 0.00005 it went crazy (noise texture) above 1200 steps, so I've reduced the learning rate by half and it seems to work so far

sturdy willow Oct 16, 2022, 2:41 PM

#

silk crystal Did you use different characters in your training set ?

yes, like 4. I did notice it's more consistent with guys, so next time I'll add more girls to the data

#

@silk crystal just in case if you are interested , here is the images and meta data that I used for my HN, keep in mind that I used the [redacted] model

📎 processed.rar

#

another comparison using the HN trained and different strength

viral jay Oct 16, 2022, 2:57 PM

#

did some comparison too, that's what I said about it going crazy: with 0.000025 and above 2500 steps it starts to get really bad, but the sweet spot is close to it at 2000 steps, so kind of a delicate balance I would say

#

will do more testing with different learning rates, to see how it behaves at a lower rate

stone garden Oct 16, 2022, 3:02 PM

#

I have 800 pics of a person, I want to do textual inversion, any idea what are the best settings to have it learn a face/person but without overtraining? last time I used all default settings in automatic1111's UI (100k steps) and it learned the photo style so I couldnt then use it to do any different style like oil painting or anime drawing.

dreamy zenith Oct 16, 2022, 3:02 PM

#

rose flame Are people combining ckpt, hypernetwork and textual inversion with good results...

I just use TI.

sturdy willow Oct 16, 2022, 3:02 PM

#

stone garden I have 800 pics of a person, I want to do textual inversion, any idea what are t...

I used this as a base https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/2670

#

had to tweak some stuff and start over and over

#

but it's a good start to get an idea

stone garden Oct 16, 2022, 3:03 PM

#

Also is it better to have it learn on the full pics or should I extract the faces only and train it on that?

sturdy willow Oct 16, 2022, 3:05 PM

#

viral jay did some comparison too, that's what I said about it going crazy, with 0.000025 ...

mine were between 5e-5 and 5e-6, switching between them when it got too stale without making any relevant changes. I was checking every 200 steps

sturdy willow Oct 16, 2022, 3:06 PM

#

stone garden Also is it better to have it learn on the full pics or should I extract the face...

mm in my case I was trying to train the style , so I'm not sure about that

stone garden Oct 16, 2022, 3:07 PM

#

sturdy willow mm in my case I was trying to train the style , so I'm not sure about that

yea I saw that guide, its for styles only I think

viral jay Oct 16, 2022, 3:09 PM

#

stone garden Also is it better to have it learn on the full pics or should I extract the face...

the stuff around you may affect it. I haven't tried with such a big amount of photos, but I use between 10-40 photos and I usually crop at the upper part of the body so it focuses more on the face. I've noticed that sometimes it learns stuff around me like wallpaper; a chair had text on it, so it was adding that text (or something like it) to the generated images too

#

I never tried 100k steps, I think it may be overkill. In my case for TI I use between 10-20k steps, sometimes it gets good even at lower steps like 5-7k

#

something that also plays a big role is the token count: a higher token count takes away control, so I would keep it between 1-10 tokens only, higher makes it difficult to apply styles

stone garden Oct 16, 2022, 3:14 PM

#

viral jay something that also plays a big role is the token count: a higher token count tak...

thanks for the info, yeah 100k seemed overkill but it was the default value on the webui for some reason, and I chose 16 tokens, will train again for about 20k and maybe choose 8 tokens

sturdy willow Oct 16, 2022, 3:31 PM

#

I thought 10k was the default

#

aah you are talking about TI

hot breach Oct 16, 2022, 3:32 PM

#

you can experiment running at a certain LR rate for X steps, then continue with more steps at a lower LR, assuming whatever code you're using allows you to continue training on your own bin/pt/ckpt
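
A small sketch of that "high learning rate first, lower one afterwards" idea as a step function. The two rates, 5e-5 and 5e-6, are the ones mentioned earlier in the chat; the step boundaries and the name `stepped_lr` are hypothetical, and whether a given trainer accepts a schedule like this or needs a manual stop and resume is not something the chat settles.

```python
def stepped_lr(step, schedule):
    """Return the learning rate for a given training step.

    `schedule` is a list of (learning_rate, until_step) pairs sorted by
    until_step; each rate applies while step < until_step. Past the last
    boundary the final rate keeps being used.
    """
    if not schedule:
        raise ValueError("schedule must contain at least one (lr, until_step) pair")
    for lr, until_step in schedule:
        if step < until_step:
            return lr
    return schedule[-1][0]


# Example: 1000 steps at 5e-5, then continue at 5e-6 (boundaries are made up).
EXAMPLE_SCHEDULE = [(5e-5, 1000), (5e-6, 3000)]
```

When the training code only supports a single fixed rate, the same effect can be approximated by training to the first boundary, saving, and resuming from that file with the lower rate.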

stone garden Oct 16, 2022, 3:43 PM

#

I think I'm just gonna try dreambooth instead

#

since it seems like the superior option

#

just gotta figure out how to run it on my 24GB VRAM without it crashing lol

#

would be nice if automatic adds dreambooth to the webui

viral jay Oct 16, 2022, 3:46 PM

#

yeah unfortunately I have only 12gb 😦 so can't say about it

#

so a bit more of testing, hypernetwork with 0.00001 learning rate and from 0 to 6k steps

xy_grid-0021-32131321-a_beautiful_photo_of_man_with_mustache_and_glasses.png

stone garden Oct 16, 2022, 4:41 PM

#

nice pixel art model ! https://publicprompts.art/pixel-art-v1-dreambooth-model/

hot breach Oct 16, 2022, 4:41 PM

#

neat!

ashen perch Oct 16, 2022, 4:55 PM

#

sacred grail homme is french for guy so that might be the problem

does it interpret different parts of the name of the embedding? i'll try to rename it and train again

ashen perch Oct 16, 2022, 4:55 PM

#

silk crystal Well this prompt doesn't seem precise enough to me

do you mean my test prompt or the initialization text?

silk crystal Oct 16, 2022, 4:55 PM

#

test prompt

ashen perch Oct 16, 2022, 4:55 PM

#

ashen perch do you mean my test prompt or the initialization text?

a castle, homm3v11tk10render style gives me this result

silk crystal Oct 16, 2022, 4:56 PM

#

well i was wrong then

tribal rapids Oct 16, 2022, 4:56 PM

#

hi, if i'm using dreambooth training, e.g. with token=jmp909 class=man, so `jmp909 man`, is there a way to give more weight to the token to steer it towards picking up my likeness? e.g. `a photo of a [man:jmp909 man:0.3] on a beach holding an icecream` or whatever? or `a photo of a (jmp909 man:1.3) on a........`

silk crystal Oct 16, 2022, 4:56 PM

#

But btw, I got way better results with this repo : https://github.com/JoePenna/Dreambooth-Stable-Diffusion @ashen perch

GitHub

GitHub - JoePenna/Dreambooth-Stable-Diffusion: Implementation of Dr...

Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) by way of Textual Inversion (https://arxiv.org/abs/2208.01618) for Stable Diffusion (https://arxiv.org/abs/2112.10752). Tweaks focuse...

#

But you will need to gather a bunch of images of buildings for regularization for your specific problem

ashen perch Oct 16, 2022, 4:57 PM

#

my card only has 8gb of vram so dreambooth is off for me, maybe through colab

ashen perch Oct 16, 2022, 4:58 PM

#

silk crystal But you will need to gather a bunch of images of buildings for regularization fo...

I even thought about taking pics ingame like this and cropping out different parts

wintry girder Oct 16, 2022, 4:59 PM

#

What's the difference between a hypernetwork and an embed, in a practical sense?

#

I'm thinking that no one actually knows, is that accurate?

hot breach Oct 16, 2022, 5:05 PM

#

ashen perch I even thought about taking pics ingame like this and cropping out different par...

you could try cropping such a screenshot twice, once left and once right, but getting SD to generate that afterwards in wide aspect may be difficult

#

if the game has a square aspect it might be better

#

if you can use caption training you could add "with a toolbar on the right" only to the right side crops, maybe it would learn the difference, just an idea
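
A minimal sketch of the left/right cropping idea: given the size of a wide screenshot, compute the two square crop boxes. The boxes are in `(left, upper, right, lower)` order, which is the box convention Pillow's `Image.crop` uses, but the function itself only does arithmetic. `left_right_square_boxes` is a hypothetical name and 1920x1080 is an assumed screenshot size.

```python
def left_right_square_boxes(width, height):
    """Return two square crop boxes for a landscape image.

    The first box is flush with the left edge, the second with the right
    edge; both use the full image height as the side length.
    """
    if width < height:
        raise ValueError("expected a landscape (wide) image")
    side = height
    left_box = (0, 0, side, side)
    right_box = (width - side, 0, width, side)
    return left_box, right_box


# Example with an assumed 1920x1080 screenshot.
LEFT_BOX, RIGHT_BOX = left_right_square_boxes(1920, 1080)
```

For a 1920x1080 capture the two boxes overlap in the middle, so only the right-hand crop contains the right edge of the screen. That fits the suggestion of adding "with a toolbar on the right" to the right-side captions only.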

stone garden Oct 16, 2022, 5:14 PM

#

wintry girder I'm thinking that no one actually knows, is that accurate?

something something hyperparameters good for styles not for subject something something

wintry girder Oct 16, 2022, 5:15 PM

#

stone garden something something hyperparameters good for styles not for subject something so...

I'm assuming you're talking about hypernetworks in particular? If so, does that make your answer something like:

"Hypernetworks are better than embeds for styles, but embeds are better for subjects"?

stone garden Oct 16, 2022, 5:16 PM

#

wintry girder I'm assuming you're talking about hypernetworks in particular? If so, does that ...

yes exactly

wintry girder Oct 16, 2022, 5:16 PM

#

Great, thanks. Is this from experience, or do you have a source?

stone garden Oct 16, 2022, 5:17 PM

#

based on the context I inferred by reading what other people that don't understand it either said

wintry girder Oct 16, 2022, 5:17 PM

#

Gotcha

viral jay Oct 16, 2022, 5:17 PM

#

embeds allow you to create a word for some subject, while hypernetwork seems to learn the overall information

wintry girder Oct 16, 2022, 5:18 PM

#

Does the hypernetwork not use a word also?

viral jay Oct 16, 2022, 5:18 PM

#

nope, if I write a prompt like "a man with mustache" with my hypernet any man will be like my face

wintry girder Oct 16, 2022, 5:19 PM

#

Hmmm... So every generation is thereafter contaminated by the hypernetwork?

viral jay Oct 16, 2022, 5:20 PM

#

if hypernetwork is enabled, yes

wintry girder Oct 16, 2022, 5:20 PM

#

Do we get any control over it?

#

Right

viral jay Oct 16, 2022, 5:20 PM

#

on automatic webui you can control strength and choose which one

wintry girder Oct 16, 2022, 5:21 PM

#

Oh... Where do I find that in auto1111?

viral jay Oct 16, 2022, 5:22 PM

#

I'm still testing it. For my face learning it seems to be doing a very good job now. I've taken some more photos of myself (was using 40, now I'm at 80 photos), changed my shirt and took some photos with and without glasses at different places in my house. I want to see if it stops showing my jacket on all generations lol

#

ah sorry

#

misunderstood your question

wintry girder Oct 16, 2022, 5:22 PM

#

Maybe it appears after you create a hypernetwork? I can't actually seem to create one because it says "Error" 😂

viral jay Oct 16, 2022, 5:23 PM

#

I have this problem sometimes too, you need to close/open it

#

often happens after training a TI

wintry girder Oct 16, 2022, 5:23 PM

#

Gotcha

viral jay Oct 16, 2022, 5:23 PM

#

the strength you can control on settings

#

or do like myself

#

on settings > quicksettings list use "sd_model_checkpoint, sd_hypernetwork, sd_hypernetwork_strength"

#

this adds the options to top of page so its faster to change them

wintry girder Oct 16, 2022, 5:25 PM

#

I don't have quicksettings, maybe I have to update

viral jay Oct 16, 2022, 5:27 PM

#

hm yeah I think its some recent addition

wintry girder Oct 16, 2022, 5:27 PM

#

When I'm done with my current work I'll update and have a look, thanks for the info 🙂

viral jay Oct 16, 2022, 5:28 PM

#

I'm getting very happy with results on hypernet now, with 80 photos it's producing less biased content

wintry girder Oct 16, 2022, 5:28 PM

#

What application are you thinking about for it?

viral jay Oct 16, 2022, 5:30 PM

#

my idea is to make it learn the faces so I can make different styles of them and maybe create shirts, cups, etc using those

wintry girder Oct 16, 2022, 5:31 PM

#

I see, and is there a reason you're choosing hypernetworks over embeds for that?

viral jay Oct 16, 2022, 5:31 PM

#

I'm actually still experimenting with it. On my first try I wasn't getting anything good out of hypernetworks, but after some tries I'm starting to get good outputs. I still need to try with other faces

wintry girder Oct 16, 2022, 5:32 PM

#

Cool. And have you already experimented with embeds?

viral jay Oct 16, 2022, 5:32 PM

#

from comparison I think the HN is producing more accurate face than embeddings

#

yeah, with it I had to generate several times to get something good out of it

wintry girder Oct 16, 2022, 5:33 PM

#

Ok interesting

viral jay Oct 16, 2022, 5:52 PM

#

lol I'm getting really good results with hypernetwork

#

what I can say is that it's producing more natural images compared to TI

tribal rapids Oct 16, 2022, 5:56 PM

#

@viral jay what’s your training settings? (Steps, images etc). Thanks

viral jay Oct 16, 2022, 5:58 PM

#

6000 steps, 0.00001 learning rate, 73 photos (had 80 but I removed some out of focus photos), all of them tagged with BLIP

tribal rapids Oct 16, 2022, 5:59 PM

#

Thanks