#✨|sdxl
dude
you asked, I answered
yeah, bitches eating shit
wth XD
found a prompt my lora is very very unhappy with
girl turns into an amanita muscaria mushroom, raw photo, 8k
without/with lora
partially jk ||but any mention of such horrors should beckon a permaban|| but jk
so there's an issue about two girls sat around a table sharing one cup of tea?
Apologies, but as a good friend of mine used to say, "It's not the mind it comes out of, it's the mind it goes into"
If you choose to interpret 4 innocent words completely differently to me then there's nothing I can do about that 🙂
yes my mind is quite damaged by the early years of the internet. i can't deny this. i'm a product of the world.
but you knew what you intended too so 😛
prove it 🙂
i was seriously asking
i dont have to. you brought it up. now i'll forever consider you a poop fetishist
I have some bad news for pretty much every time you drank tea
Faces lmao
you know like that guy who cuts one inappropriate joke about hitler and then everyone considers him that guy forever? is it everyone else's fault for him cutting that joke? naw
/godwins law
if i can answer the question, it's not as good as anything v3 at creating creative mechas
on 1.5 models , especially the anime ones, i could create delorean mecha
Haven't caught one from xl yet
From my experience with sdxl it is not good at faces at a distance, nor is it very good at eyes at all. There’s always a bad one. Wheels on cars are still kind of an issue but it’s definitely not bad.
still?
oh, that's something which needs a model. i don't have much computing power yet, but i have the dataset, sadly
bad faces are usually solvable with hires fixes, stronger prompting, and detailer passes
which v3?
what are mechas?
if you have old school type issues with SDXL this means you likely didn't infer it properly
anything v3. it's the quintessential anime model imo. anything v5 is good too. anything v4 was just a fake knockoff merge from a stooge who didn't like v3's popularity
mechas are like, giant robot mechs from japanese anime
I get good results but it requires more than what the base and refiner have to offer. Sdxl is very very good. Not saying it’s bad but can be better. Heck i don’t even use the refiner for most of my gens
i didnt get what versions you are referring to
OK WTF, that's an apple logo
the name of the model is "anything"
who's on first
oh, i think i got few
so basically v3, is like finetunings?
yeah it can do them for sure. but not creatively like "delorean mecha" or "1977 honda civic model mecha"
@trim orbit do you have lora settings to train a face, got bad results from 3 trainings
i can train a lora, if you want
it's a sd1.4 based model that is extremely specialized to anime
@indigo carbon i ran your workflow but for some reason can't find the output image lol...
i dont need them. i just came across that case that sdxl couldn't do. I always generate deloreans on every model. delorean mechas too
you are likely missing required nodes. it's pretty self explanatory
it runs / completes and node manager says i have everything - just not seeing anything in the actual Output folder which is strange
oh, I know what's happening 😄 , you don't have the upscaler I use, it saves the image after upscaling
likely. it should work perfectly fine.
anyways, who said SDXL can't do mechs?
creative mechs. like delorean mecha 
the car?
if sdxl is lacking that, i can train, but tryna find params before i kickstart training diff loras
@boreal bough Initial impression is good (ignore the hands I was just using my usual test Seed lol)
Left to Right
No LoRA, default SAI offset LoRA, your LoRA
Loras at 0.6 strength
it can do both mech AND it can do deloreans, very well
i'm not sure how you'd train a lora for better zero shot capabilities here
isn't this that car? idk what a delorean is, but this is what it makes from "delorean mecha"
so then what is it lacking exactly
i gotta get some exp
i downloaded anything v5 because i didn't have them anymore. here's a result. it often does them like a silver haired man too. i think because it's pulling info about mr delorean's portrait hahaha.
hard to say. zero shot means it can generate something it's never seen. while it's very great at zero shotting photos like Furkan has shown, riding a dinosaur and a lion, it's also limited when turning cars into robots
it might be because cars are very over fit
i think, it wasnt trained on a great dataset of transformers
transformers and mecha are slightly different aesthetics but yeah, maybe that'll be a better approach. prompting is king after all.
i'm firing this motif off on "psy animated" now and its actually getting some better results than i had with the base
the base would often draw a mecha AND a delorean, entirely separate
similar situation but the mech is much more aesthetically a delorean now
the more i use psy animated the more i like it
forgive me, for I have prompt sinned
P: abstract expressionism art,(ahegao:1.2),(a 29 yo woman, (Zooey Deschanel:0.2), (redhead:1.1), cute smile, detailed face, perfect eyes),neon,hdr10,they are watching into glowing wires everywhere,transformers,woman as a transformer,megatron prime
N: monochrome error,multiple views,blurry,uncloned face,disconnected limbs,fused limbs,long body,long legs,body,fused limbs,long legs,missing limb,missing arm,no arms,short limb,nonsense,huge face,bad eyebrows,ugly beard,mutated,
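The `(text:1.2)` groups in the prompt above follow the attention-weighting convention used by A1111-style UIs and ComfyUI. As a rough illustration only, here is a minimal sketch that pulls out the flat `(text:weight)` pairs from a prompt string. The function name `explicit_weights` is made up for this example, and the regex deliberately ignores nested groups and escapes that real prompt parsers handle.

```python
import re

# Matches a flat "(text:number)" group with no nested parentheses inside.
WEIGHTED = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")

def explicit_weights(prompt: str) -> list[tuple[str, float]]:
    """Return (text, weight) for every flat '(text:weight)' group in the prompt."""
    return [(m.group(1).strip(), float(m.group(2))) for m in WEIGHTED.finditer(prompt)]

# Shortened version of the prompt from the chat.
prompt = "abstract expressionism art,(a 29 yo woman, (Zooey Deschanel:0.2), (redhead:1.1), cute smile),neon"
pairs = explicit_weights(prompt)
print(pairs)
```

The outer `(a 29 yo woman, ...)` group is skipped because it carries no explicit weight of its own, so only the two inner weighted terms are reported.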
why zoey would ever be a 0.2 is beyond me. this is the greatest sin
oh til. it actually is zooey
for appearing in the crap remake of HHGTTG in 2005
Tried your flow as is TD and it didn't like me
As for lora we only use a few images, so we can caption using minigpt4
CMD log?
i loved the movie and feel it's the best version it could've ever been of the first book
same error as in the UI
I used to have it on vinyl 🙂
bombed in box office though so i don't argue that with people who insist it's bad. i just keep on enjoying it
the radio show? oh god. now that's bad
that's probably not true. that type of error usually shows like "imput:0 error" or something like that. if it's still broken for you you can try manually placing the AIT modules.
close enough?
a (transformer:1.4) Autobots in the shape of the delorian with a woman sitting inside it, 29 yo old, Zooey Deschanel, Autobot
hah . that's the kind of results i get a lot of the time. a delorean next to a robot.
cool delorean!
oh wait, I see what's up. you're on SM75, aren't you?
I'll get there! XD I'm already getting closer
prompting is king! i believe in you.
front end looks more Fox Bodied Mustang than delorean
that one is more of a firebird tho
i agree. it's blending models
and they believe as well!
closest I'll get with zooey hogging all the latent space XD
she really does have commanding screen presence
chonk delorean
so i was wrong
sdxl through psy animated has been getting good hits
without her its much more doable ^^'
did you try DSXL or CCXL?
These are bomb!
Shouldn't be reloading the checkpoint again, but it will have to run through your text encoders again.
which should be fast. show your workflow
Unsure regarding that
How much memory do you have?
if your GPU gen is under 3000 series, it's not supported yet for AIT. however, in a few days all the modules will be ready.
No, main memory
yes that should be fine
Ah gotcha, 2060 so yeah. I'll keep an eye on it
I think when you change the prompt it has to redo the CLIP encoding or something
I know trying to use wildcards in my prompt was KILLING my performance, lol
I have 16 GB of RAM 😄
but will it support a 1080Ti ;o)
ComfyUI sits at 18GB memory use for me
It might simply be dropping unused data in your case, but then it has to reload it to compute new embeddings.
im about to run a prompt so I won't be able to type for a bit~~ lol
I think it’s likely. And RAM is cheap these days anyway.
Also, are you looking at its VRAM usage or RAM usage?
Both matter, but obviously they’re not the same.
Depends on your motherboard.
Do you have 2x8 right now?
4x8 should work, but you might lose some performance.
Should try to make it the same model though.
DDR3200.
Match the model if possible, match the speeds and clock rates if not.
@indigo carbon Using your workflow. Want to replicate an anime robot but the lion head is in the wrong place.
Well, yes. DDR 4 and 5 are completely different.
On the bright side, DDR4 is extra cheap. :p
At what speed?
Ok. Well. I can’t guarantee it’ll work, but worst case if you reduce the speed it should.
Running four modules at your current speed.
No, because 3200 is an oversimplification and really they need to match impedance and clock speeds in nanosecond regimes, on wires that are so long relative to the frequency that they act as antennas, not wires.
but unless you're stats watching it will probably be fine
4 more than doubles the chance of problems.
No.
Mainstream desktop motherboards are dual-channel, so 2 is optimal. They handle the 2 DIMMs mostly independently.
With 1, you actually lose 50% of your memory bandwidth.
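The 50% figure follows from how theoretical peak bandwidth is usually computed: transfers per second, times 8 bytes per transfer on a 64-bit channel, times the number of populated channels. A small sketch of that arithmetic, taking DDR4-3200 as an example speed (real-world throughput is lower, and the function name is just for illustration):

```python
def peak_bandwidth_gbs(mega_transfers_per_s: int, channels: int, bus_bytes: int = 8) -> float:
    """Theoretical peak in GB/s: transfers per second x bytes per transfer x channels."""
    return mega_transfers_per_s * 1e6 * bus_bytes * channels / 1e9

single = peak_bandwidth_gbs(3200, channels=1)  # one stick, one channel in use
dual = peak_bandwidth_gbs(3200, channels=2)    # two sticks, both channels in use
print(single, dual)
```

With one module only one channel is populated, so the theoretical peak is half of the dual-channel figure.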
If you have infinite money? Sure.
Otherwise do 2 or 4.
1 and 3 are both bad.
I run 4 x 16GB Crucial Ballistix BL2K16G30C15U4B and have done for the last 18 months without any issues
system is stable and plenty fast enough
As I’d expect. Matching all four usually works just fine.
yes, theoretically you can alter the node's code and compile the module yourself, but that's insanely complicated
Having four modules sometimes causes trouble, but usually not. Really the issue is going from two to four… because you’re likely to not have four identical ones.
Yeah.
But also.
Just about every combination works fine if you leave it in the failsafe configuration, at 1200 MHz.
it already can, but the older gen your cuda cores are, the less boost AIT will cause.
Which is the default.
Unfortunately a lot of people buy fast memory, then never flip it to high speed. :p
just go all in and grab 2x32GB sticks, it's cheap enough. Pop your existing stuff in the drawer just in case
nice! does the AIT node work for you? many people report issues with old GPUs with it
No. Has error while using the node
GPU?
There’s an AIT node? Where?
custom node, experimental, and I seem to be the only one on windows it works for =\
Maybe? But for DDR5 you’d also need to swap your motherboard, and maybe CPU too.
my bad those are SODIMMS
Hit me. 
I’m on Linux anyway.
Either will work. 3000 is maybe 2% slower in practice.
unless you're chasing numbers just take the 2 old sticks out and put these two in
it will be fine
you will notice the benefit and be able to have even more chrome tabs open whilst generating images
Hahahahaha…. … yes.
What’s the minimum ram quantity requirement?
oh nice. you should be fine. just load the workflow from one of my gens and restore nodes it says you don't have. you shouldn't need to worry about compiling the modules yourself as long as your GPU is 3000 series or above. if it is a newer gen it should handle the modules on its own
0
I mean, we’ve seen it work on 4GB…
because the first one was SODIMMS for laptops
Generalised list of DDR4
https://pcpartpicker.com/products/memory/#Z=65536002&sort=price&ff=ddr4
Choose Memory
I have 256gb @3600
depends on how much of a snob and a number chaser you are
I’d get one of the crucial kits.
ddr5 is more efficient if you want speed
That one should be fine, sure.
yeah, who needs fancy rgb lights on their ram? nobody lol
hmm. I'll have to check that out
pip really does enjoy just eating up storage space
TBH, I’ve had data corruption bugs. Don’t use it unless you know very well what you’re doing.
You can just let me debug those first. It is remarkably good otherwise though—I’m genuinely getting SSD speeds with HDD storage capacities.
my issue is I'll sometimes realize a pip install is 100 gb or something, and my primary drive is a bit on the small side
just purged 25 gb
Ah. Yeah. I just have everything in a single giant pool.
so one day I'll have 160 gb free space then install a couple things and it's 20 somehow
Two NVMEs and two HDDs. Mirroring, of course.
well I store a lot of things on hdd and backup in cloud storage. but realized loading sdxl models from my hdd takes FOREVER
like 5+ minutes
Mm. With bcachefs, it writes files to the SSDs on access.
So you still need a decently sized SSD, but it’ll automatically delete the oldest cache elements when that fills up.
In practice, it just means that older stuff I don’t use ends up in the HDDs, and I don’t need to think about what is where.
…also writes start on the SSD, so that’s fast as well.
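The behaviour described here (files land on the SSD when touched, and the oldest cached entries are dropped when the SSD fills up) is essentially a least-recently-used cache. The toy model below only illustrates that idea; it is not how bcachefs is actually implemented, and the class name, file names, and sizes are all hypothetical.

```python
from collections import OrderedDict

class SsdCache:
    """Toy model: a fixed-size fast tier that evicts the least recently used file."""

    def __init__(self, capacity_gb: float):
        self.capacity_gb = capacity_gb
        self.files = OrderedDict()  # name -> size in GB, oldest first

    def access(self, name: str, size_gb: float) -> None:
        """Reading or writing a file places it on the fast tier as most recent."""
        self.files.pop(name, None)
        self.files[name] = size_gb
        # Evict the oldest entries until everything fits again.
        while sum(self.files.values()) > self.capacity_gb and len(self.files) > 1:
            self.files.popitem(last=False)

cache = SsdCache(capacity_gb=20)
cache.access("sdxl_base.safetensors", 7)
cache.access("sdxl_refiner.safetensors", 6)
cache.access("anything_v5.safetensors", 4)
cache.access("sdxl_base.safetensors", 7)       # touched again, now most recent
cache.access("dreamshaper_xl.safetensors", 7)  # total would be 24 GB, so the oldest goes
print(list(cache.files))
```

In this run the refiner file is the one evicted, since it is the entry that has gone longest without being touched.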
this chat turned into LTT lol
not sure what LTT is but it sounds like it must be pretty cool
Maybe just install One Button Prompt in ComfyUI? It's available as an extension in the extension manager of ComfyUI.
there's literally a one button node right? or am I crazy
well hardware is a big part of this - at least when you run it locally heh
true, true
nice background (and foreground) 🙂
did you hit that with the face detailer?
yeah well, sometimes you hit a different part of the latent space while exploring it 😉
yes, liminal latent space
isn't the backrooms a liminal space?
yes
I have a backrooms lora for 1.5
tonight - join picturesonpictures on a visit to the backrooms of the latent space
ooh, maybe I should combine USSR propaganda lora with the "truck" lora
soviet truck propaganda
that's a top tier lava lamp
@indigo carbon I think they are updating something recently on the AITemplate node repo. It broke something. Wait until they are ready for it.
yes, it works with base SDXL models on SM80 gpus, they said refiner support is coming very soon.
@visual glade said he is working AIT support without any custom nodes, but that was a few weeks ago
do yall know the newest engine name for api usage
adding chapters to my new massive tutorial, hopefully coming today
yes but I'm a bit busy in LA right now
yeah it's fine. the custom node (kinda) figured it out, well, for the base at least
i need a female photo reg dataset. anyone got one?
i am very liking the psy animated xl today. more deloreans
Hey folks I’m seeing a few issues with image quality in ClipDrop. Is this the best place to talk to someone about it?
in 1h I should finish my last video! Will post here (Made with SDXL and MJ)
In ComfyUI, does queuing up prompts use the prompt AT the time you queued it up or at the time the prompt is running?
uses the params at the time you queue it. feel free to change params after queuing
Yeah everything is AT the time. Including the timestamp of image creation, so if you use that parameter in your image filename, it will be the stamp when you hit generate, not when the file completes
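The "snapshot at queue time" behaviour described above can be pictured as a queue that stores a copy of the parameters, plus a timestamp, at the moment of queuing. This is only a sketch of the idea under that assumption, not ComfyUI's actual code; `enqueue` and the parameter names are invented for the example.

```python
import copy
import time
from collections import deque

queue = deque()

def enqueue(params: dict) -> None:
    """Store a frozen copy of the parameters plus the time of queuing."""
    queue.append({"params": copy.deepcopy(params), "queued_at": time.time()})

ui_params = {"prompt": "delorean mecha", "steps": 30}
enqueue(ui_params)
ui_params["prompt"] = "soviet truck propaganda"  # edited after queuing

job = queue.popleft()
print(job["params"]["prompt"])  # the queued job still holds the original prompt
```

Because the job holds its own copy, later edits in the UI do not leak into work that is already queued, and a filename built from `queued_at` reflects when generate was pressed.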
stable-diffusion-v1 is this the latest stable diffusion engine, or is there a new one i can use for api
does "shuffle caption" shuffle words within the caption or shuffle captions between images? and if there are no caption files then this setting shouldnt do anything?
I have trouble even loading the model. I have 15gb of shared gpu memory and 8.5gb dedicated. But I can't even get it started; somehow 90% of it is dedicated somewhere else.
Pretty sure it shuffles words within the caption
Shuffling caption between images doesn’t make sense lol
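For a concrete picture of "shuffling within the caption": as far as I understand the kohya scripts, the shuffle works on comma-separated tags and can leave the first few in place. The sketch below shows that idea only. The function, the `keep_tokens` parameter as used here, and the example caption are illustrative and are not copied from the trainer's code.

```python
import random

def shuffle_caption(caption: str, keep_tokens: int = 0, rng=random) -> str:
    """Shuffle comma-separated tags, leaving the first keep_tokens tags in place."""
    tags = [t.strip() for t in caption.split(",") if t.strip()]
    fixed, rest = tags[:keep_tokens], tags[keep_tokens:]
    rng.shuffle(rest)
    return ", ".join(fixed + rest)

caption = "portrait photo, woman, red hair, soft light, 85mm"
shuffled = shuffle_caption(caption, keep_tokens=1, rng=random.Random(0))
print(shuffled)
```

Under this sketch, a caption with no commas (for example a bare folder name) is a single tag and comes back unchanged.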
why do you think 15gb of shared gpu memory should be used?
in this case, if there are no caption files and the "caption" is the folder name, will it shuffle the folder name words?
How much Vram do you need to even start the sdxl model?
Doubt it lol
more than 8gb by default im pretty sure
im sure theres optimizations
some ppl are running cpu only
Running in ComfyUI just fine, 2060 Super with 8gb
what do you use?
runnin sdxl on dual core apu
I think my second attempt at LoRA (the first [supposedly] diverse enough to be character LoRA) has failed. It seems to have too much effect and any attempt of photorealistic, 3d render or photo fails either by still generating 2d sprites or the character is near completely lost. Beyond that, I have to add the details of the character in the prompt, which I think I shouldn't have to (the 2d sprites are not that bad, actually)
Anybody else having an absolutely terrible experience with training LoRA's for SDXL as of late?
No matter what I do, all of my LoRA's are trash now
problem is making lora or using them?
using is fine. Making is horrible. I just can't obtain decent result
making is suddenly near impossible
I made a few just fine before
now, no matter what settings I use, no matter what captions or dataset, it learns nothing
for me it learns wrong in hours 😄
I have had @boreal bough and other extremely smart people when it comes to LORA's help me, and yet I am getting nothing, and I mean NOTHING
my LoRA's are basically identical to base SDXL
full hour of training on a 3090, loss is identical no matter the settings, LoRA is identical to SDXL
loss for my tests is 0.15, which is too high
try training something absolutely off the wall, something where it would be extremely easy to tell if it's doing anything at all
this is 2 different training tests with different learning rates, different DIM, and different alpha values
what are you training
A portrait photograph LoRA
my first test had some results, but the effect was very undertrained
I got a bigger dataset and spent way more time captioning, and now my LoRA's are identical to just stock SDXL
its different from run to run in the end, but it never matters
SDXL is just refusing to learn anything now
everything is either stagnant/inconsistent/increasing loss no matter what I do
ah. see i have a suspicion that SAI was very conscious about portrait photography and gave it excellent representation. Without a stupidly large finetune, I don't see XL getting better at portrait photography in general. You could weight it towards a certain photography style or camera setting/effect though.
it has a long way to go
even my super fast 20 image 5 epoch LoRA already suggested some huge improvements
Sytan in all of your tests did you use adamw8bit?
Use adamw or adafactor
for me it learns, but not in the way I want. Images are too far from the training ones (style train adafactor)
if prodigy didn't work, likely nothing will. Its made to adapt as best as possible in real time
At least try Sytan
agreed, but i dont think a lora is gonna get it there. The Dreamshaper finetune has thousands of portraits handpicked over 6+ months, yet it still only barely improves portraits
even with prodigy, it manages to have a consistently rising loss
Thats cause the dreamshaper XL finetune is kinda trash
and by kinda I mean like... very
What scripts are you using for LoRA training?
Kohya
me too
Anyone tried diffusers?
but it just doesn't work 😦
We decided in this channel that dreamshaper is def the winner for portrait photos out of all finetunes currently available
A friend has, I am just not willing to mess with diffusers ATM
If it’s doing nothing try to make it do something. Like adamw 1 epoch high learn rate
any proof of that?
I'd love to see some images
cause I am pretty confident my base SDXL images will beat them
nothing
I have now run 16 different LoRA tests
all of them end the same one way or another
i said out of all finetunes available. base not included
i'll find it one sec
start strong, after a few steps, rapidly rise to 0.12 loss, and stay there, no matter how long you give
Either that, or they go up to like .14
I've been tinkering with inference with diffusers. building a lora environment as we speak.
If a finetune is worse than the base, then I am not sure why the finetune would be suggested, personally
unless I am misunderstanding
All I know is from what I have seen of dream shaper XL, it was a super rushed and poorly executed finetune
They should have given it more time to be good, rather than rushed to be "first"
https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth_lora_sdxl.py
Not sure if it's good, but I will find out.
there are good LoRAs that work, I would like to know how they did it.....
I know there are, I have made some
Its just... all of a sudden, LoRA's will not train for me
@high skiff #✨|sdxl message
5 is dreamshaper. unanimous vote. yes there are other factors to consider but for photorealism portraits on any seed dreamshaper was doing the most real lighting, and most dynamic environments and outfits
this is strange
I've never been able to with sdxl
ah yeah, looks exactly how I expected, very plastic
they all are
Its a cool style, sure, but its not realistic at all
xl is more plastic
base SDXL can do realistic just fine if you know what you are doing
There's another too: https://github.com/huggingface/diffusers/blob/main/examples/text_to_image/README_sdxl.md
and so can dreamshaper lol..... base XL is in that comparison on the exact same seed, yet dreamshaper is LESS plastic
thats because whoever did the comparison doesn't know how to prompt base, IDK what else to say
its the same prompt on base and dreamshaper. if the prompt caused the plastic in the base then it would cause even more plastic in dreamshaper if dreamshaper was more plastic than base
scientific process
The documentation is sparse
that makes no sense, there is nothing "scientific" about that
They weren't finetuned on the same data, for all you know the way to prompt realism for one is way different than the other
Maybe sometime I will download dreamshaper XL and run some proper realism prompts through it
maybe then I would have a better feel for how good it works
Before that guy's test, I also thought the same, i remembered dreamshaper and realismengine and all of those looking waxy on 1.5 and 2.1. Yet due to that comparison I tried it and boosted realism across ALL of my generations. very surprised and pleased
mostly the benefit is the boost to coherent details especially in clothing
I still have a hard time believing it will be any better. Base SDXL can look leagues better than all of those with some proper prompting, but who knows, maybe dreamshaper can look even better-er lol
all I know is that SDXL base with proper prompting blows MJ out of the water for portraits
is there any way to do textual inversion train?
I believe Kohya can do it now, but don't quote me on that 😅
base vs. dreamshaper.
This is a horrible comparison for many reasons, but shows the gist of added color, dynamics, less artifacts
those are a little less fake looking, I'll give you that one
it complements the lora, helps make up for what it damages about the model in training
this is the difference my first poorly trained very small dataset LoRA makes for portraits
no LoRA
LoRA on the same seed with the same prompt
and its very poorly made
my only confusion now is why it can't train anymore
I should know, these are not the portraits I am referring to when I say that SDXL is exceptionally good at portrait realism
This was just a very quick prompt that I threw together without any real focus on realistic image prompting
Just wanted to see quick results
It trained in like 15 minutes, so it's definitely very undertrained, and the data set that I used was very small and limited, and also had pretty bad quality images in it
My new data set is multiple times bigger, and monumentally higher quality
anyone have a suggestion for a good image to image workflow?
and you've scaled the epochs up with the dataset?
No, I replaced the data set as it introduced a shit ton of issues because of the low quality images I originally sourced
I ended up finding a really good repository for the images I needed, so I was able to ditch the low quality images I originally trained on
I went from a 24 image data set where almost all of the images were quite low quality, to a 64 image data set where basically all of the images are near or above 4K resolution
Although the LoRA is training at 1024x
I share my last work done with SDXL and MJ
Welcome to 'An AI Journey to Dante's Inferno'. Meticulously crafted using the innovative powers of Midjourney and SDXL, this digital exploration merges the forefront of artificial intelligence with the profound depths of Dante's timeless literary masterpiece. Although the visuals may be chilling and evocative, rest assured that the words recited...
It's honestly got me very curious what would happen if I use the same data set on a 1.5 model
And still training at 1024x
Even the 64 image data set is still a relatively small scale test
If at some point I can manage to unfuck it, I would want to do a full version of around at least 400 images
i'd just raise learning rate or steps until i see a change. at my learning rate i would need 4+ hours of training for 64 imgs, but thats for faces
It's not my learning rate that's causing issues, something else is fundamentally messed up
I've tried a range of learning rates from 0.0001 all the way up to 0.1
They all do the exact same thing, start off strong, then after a few steps go up to 0.12 loss, and then either stay there or continuously rise for the rest of the training
I guess I can try installing derrian again
you're one of the only people i know that looks at loss graph. whether it looks good at the end is subjective and idc if the math says i did it wrong if i like how it looks.
but it sounds like youre not seeing the change either
I need a Seinfeld XL Lora
that was right after the original dreambooth paper (maybe even along with it), super outdated now
The loss graph is not inherently indicative of training performance, however, it should generally at least go down in some capacity as it slowly learns the patterns of the images that you train
For example, my first successful LoRA I made for SDXL used 23 images, and it trained for 20 epochs at 20 repeats per image, and it started at 0.15 loss, and ended at 0.06 loss
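To put those numbers in perspective, the amount of training is roughly images x repeats x epochs, divided by the batch size to get optimizer steps. The batch size is not mentioned in the chat, so the values used below are assumptions, and the sketch ignores gradient accumulation and resolution bucketing.

```python
import math

def total_steps(images: int, repeats: int, epochs: int, batch_size: int = 1) -> int:
    """Optimizer steps when each epoch sees every image `repeats` times."""
    steps_per_epoch = math.ceil(images * repeats / batch_size)
    return steps_per_epoch * epochs

steps_bs1 = total_steps(23, 20, 20)                # batch size 1 (assumed)
steps_bs4 = total_steps(23, 20, 20, batch_size=4)  # batch size 4 (assumed)
print(steps_bs1, steps_bs4)
```

So 23 images at 20 repeats and 20 epochs means each image is seen 400 times, which helps explain why a small dataset could still show a clear drop in loss.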
And I retrained that one a couple of different times to test out some values, and no matter what values I changed, it still at least consistently went down
Whereas a lot of the tests that I'm running currently start at like 0.105 loss, and then by the time I'm done it's at 0.135 or higher
So it's quite literally getting worse at recreating the images as it goes along, which I just don't understand
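Since per-step loss is noisy, one simple way to check whether a run is trending up or down is to compare the average of the first few logged values with the average of the last few. The sketch below does just that. The two loss series are made-up numbers shaped like the runs described above, not real logs, and `loss_trend` is a hypothetical helper.

```python
def mean(xs):
    return sum(xs) / len(xs)

def loss_trend(losses, window=3):
    """Mean of the last `window` values minus mean of the first `window` values.
    Negative means the loss went down overall; positive means it went up."""
    return mean(losses[-window:]) - mean(losses[:window])

healthy = [0.15, 0.14, 0.12, 0.10, 0.08, 0.07, 0.06]        # made-up values
diverging = [0.105, 0.11, 0.115, 0.12, 0.125, 0.13, 0.135]  # made-up values
print(loss_trend(healthy), loss_trend(diverging))
```

A negative result matches the earlier successful run, while a positive one matches the "getting worse as it goes along" pattern.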
that paper is what taught us to freeze the text encoder for improved quality
Thanks. This one? https://arxiv.org/abs/2208.12242
Large text-to-image models achieved a remarkable leap in the evolution of AI,
enabling high-quality and diverse synthesis of images from a given text prompt.
However, these models lack the ability to mimic the appearance of subjects in a
given reference set and synthesize novel renditions of them in different
contexts. In this work, we present a...
i believe so
afaik no one is using dreambooth anymore. it was just a method of using regularization for training but everyone stopped using reg so it was essentially just regular training like in the original Stable Diffusion paper. and then loras got good.
dreamboothing (finetuning) SDXL is pretty unrealistic for consumers, we're mostly stuck with loras for now
I use reg images for LoRA's sometimes, they can be massively helpful
but they are hard to get right
i just started using them again, specifically ground truth (real) photos. there's no doubt that it helps with training faces
I would personally say don't use real images, as thats not what reg images are for, but hey, if it works for you, do you
see now that right there has been a massive debate since the paper posted above. in the end I'm going to believe the creators of Everydream2, who stick to real images
to each their own
How do you go about captioning your images?
if i come across a dataset of 600+ person images made by XL, i'll do a comparison
i stopped captioning. works just fine for people on XL
for anything other than people I'd recommend captions
i just used blip but that's far from the best
but the captioning tool in kohya is nice
As if the model is better at understanding what it looks at already? Is that how it works?
Is there a way to only import prompt and lora from a image instead of everything in comfyui?
i assume so. a person isnt a very complicated concept for it to get. it makes guesses if you dont caption, and if the only thing in the image is a person then it's easy for it to guess what's happening.
completely theoretical tho
SDXL seems to be well trained so if the assumption is true it would make sense.
I have yet to do a direct comparison of captions vs. no captions but my results without captions are very good so idc
same... should i make one?
Haha, if you want to, go for it! I'm honestly, thinking of making a George lora once I am done working on other projects.
Booru Dataset Tag Manager: https://github.com/starik222/BooruDatasetTagManager
speeds up captioning edits after Blip.
we need a lora of kramer doing all his weird poses haha.
Do you know if it's possible to change only model path to another drive in comfyui? I change checkpoints every half hour and it would be nice to have them on my SSD
extra_model_paths.yaml.example
rename this file to .yaml and edit it in a txt editor.
add your checkpoint folder on the appropriate line
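As a rough guide to what the edited file might end up containing, the helper below prints a minimal section with a base folder and a checkpoints subfolder. The key names `base_path` and `checkpoints` are what I recall from the example file, and the section name and the SSD path are placeholders, so treat all of it as an assumption and check it against your own copy of `extra_model_paths.yaml.example`.

```python
def render_extra_model_paths(base_path: str, checkpoints: str = "checkpoints/") -> str:
    """Build the text of a minimal extra_model_paths.yaml section."""
    return (
        "comfyui:\n"
        f"    base_path: {base_path}\n"
        f"    checkpoints: {checkpoints}\n"
    )

# Hypothetical SSD folder; substitute your own drive and layout.
text = render_extra_model_paths("D:/sd-models/")
print(text)
```

The printed text is what would go into the renamed `.yaml` file; ComfyUI needs a restart to pick it up.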
any controlnet for 1111 sdxl out yet? I know there is for comfyui but I prefer 1111
not yet
Hi. What learning rate was used for pretraining SDXL?
It's really difficult finding a technical writeup for the model online
When I got SDXL for my Automatic1111 there was a file called lora:offset_0.2, I downloaded it at the same time but then never tested it until today, what is it supposed to do? It keeps my image similar at very low values like 0.002, but for every step up the images look worse and more crazy.
how's that work?
magic =]
so fast!
looks so damn fast, and with 67 steps
how did you integrate AITemplate, what are the limitations and what do you need to show us how to do it? 😉
this is not simple. AITemplate isn't fully integrated yet (don't pressure me, I didn't make the module). by changing the code of the custom node I managed to get AIT to work on the base. the creators of the node say they'll release a more stable and fully implemented version of AIT soon.
what is AIT and is it hardware-specific?
more complex than that. it's a self optimized optimization that currently gives an insane boost to newer cards
Thanks for the info! I'm kidding! Not pressuring you, man. Just amazed and of course super interested. Who wouldn't want to utilize their hardware better to explore this thing.
The first time I read about it a couple of months ago, I was hoping it would come out at some point, since it looks so very promising - like the jump we made with xformers and also 4000 cards would finally get a real speed boost.
So I'm just really excited seeing it in your setup! Awesome!
keep in mind that what I demonstrated isn't the fastest possibility. in the workflow I made (in the video) I only figured out how to implement it at base stage, so just wait until the refiner will also be implemented, the speed will get even faster. I bet with an RTX4090 you might get INSTANT image generation like that
What graphics card do you have now?
thanks for showing it off, Tdg8uU. if that doesn't get people excited... I'm only using base right now anyway 😄
4070ti
would aitemplate work on a 3090? lol
I'm happy my 2070 can do SDXL at all, so i'm not too saddened that this new witchcraft might pass me by for now
does anyone know why SDXL has the scheduler config "steps_offset": 1
yes, boost might be lower than for 4000 series, but because 3000 series is also SM80, you should be good
nice. where might i be able to look for an update on that for when it comes out for comfy?
it will be all over this channel (and the internet) once it's easier to implement - I think you can be sure of that 😄
there are currently 2 pending releases people are waiting for. one is to wait for the custom AIT node to support the refiner, the other is wait for @visual glade to officially implement it into ComfyUI
for now the custom node seems to only work on base
what about tools to convert the checkpoint?
okay ill wait for comfy to implement. im sure ill hear about that. at that speed everyone will know about it lol.
that's a story for another day. both the custom AIT node and Comfy claimed to provide precompiled modules for different architectures. AIT is NOT checkpoint specific, it's architecture specific.
is it possible to make it work on pascal cards or only works on RTX?
wasn't tested yet, theoretically it is possible. I'm sure when Comfy will officially implement it, it should work on new cards (including AMD cards) however, older gen cards are unlikely to get any boost from this whatsoever.
the compiling tools themselves are incredibly difficult to use, therefore the node and (presumably) the official Comfy version will do that on their own.
i have another question then, does it use more vram/ram or is it roughly the same or less?
but well, after you got all that set up, the boost is totally worth it IMO, I'm literally generating extremely high quality images in about 10 seconds.. (without it even being implemented in the refiner)
a little less
oo nice!
spent like total 6 days for recording + editing 😄 https://youtu.be/sBFGitIvD2A
In this tutorial, you will learn how to install Automatic1111 Web UI for SDXL. How to use LoRAs with Automatic1111 SD Web UI. How to install Kohya SS GUI scripts to do Stable Diffusion training. How to train LoRAs on SDXL model with least amount of VRAM using settings. All of the details, tips and tricks of Kohya trainings. How to do x/y/z plot ...
Has anyone got an aesthetic scorer to work in Comfy? I tried both **ComfyUI_Strimmlarns_aesthetic_score** and **ImageReward** and neither installs, it seems I have dependency errors
any quick way to save uuid file names in comfy?
I had image reward working but then it conflicted with something
Thanks, I've installed it again without errors... I think. I even ran the pip command on the requirements.txt file and that went through but when I start ComfyUI I get:
from .AestheticScore import *
File "C:\Applications\StableDiffusion\ComfyUI\python_embeded\lib\site-packages\ImageReward\models\AestheticScore.py", line 16, in <module>
import clip
ModuleNotFoundError: No module named 'clip'
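install OpenAI CLIP into that python_embeded environment with "pip install git+https://github.com/openai/CLIP.git" (the package named "clip" on PyPI looks like a different project, so plain "pip install clip" may not give you the module ImageReward imports)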
if you want to use that
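For anyone hitting the same ModuleNotFoundError: a minimal sketch for checking which modules the interpreter can actually see, assuming you run it with the same Python that launches ComfyUI (the `python_embeded` one from the traceback). The helper name `module_available` is made up for illustration, and the module names checked are just the two from this conversation.

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if a top-level module `name` can be found by this interpreter."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Shows which Python is running, so you can tell embedded vs. system installs apart.
    print("interpreter:", sys.executable)
    for name in ("clip", "ImageReward"):
        status = "found" if module_available(name) else "missing"
        print(f"{name}: {status}")
```

If `clip` shows up as missing here but pip said the install succeeded, pip most likely installed into a different Python than the one ComfyUI uses.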
You say here that using regularization images further tunes the model to that style/quality, and that's why you use high quality photo datasets. is that still true for small trainings like on 16 images?
Will give it a try, thanks
you need to give this a try https://civitai.com/models/124110/furry-wolf-sdxl
still working on my blender addon for comfy, gotta add workflow to import via button
will be cool to see it work with post generated images and creating depth and normal maps from a workflow
that a funny looking sexy babe on the beach😆
lol dont judge me!
that's awesome!
anyone submitting to the civitai contest? Curious what you do for the "missing generation data" if you use ComfyUI to generate your images
coming along nicely, few stumbling blocks
i usually dont have that issue with missing generation data
what gpu did you run that on?
They are using AI Template on a 4070ti i believe
AIT significantly improves speed. and hopefully soon to be supported in Comfy
reading the github on it, didnt realize it was made by facebook
what is made by facebook?
AI Template
what is ai template?
a thing that does a thing that helps speed up ai gens
this ☝️
Using ComfyUI? SDXL image?
speed of what?
cant tell but think it's more compiled with the cuda 11 package rather than being "part" of comfyui
scroll that engineering team off this page
lmao
I wonder what it actually does....sounds like it removes the need for certain middle apps in the process of making things like SD work...comfy or one of the better UI's like a1111 and SDNext
scroll up, I already answered that question
So it only works with marketypush UI, Comfy?
yep
So not very useful
very useful. i love comfy
Facebook developed something to improve comfy?
comfy is love comfy is life. lmao
odd how the whole community changed in a flash to talking about comfy
if you hate nodes just use comfybox xD
tbf i didnt know about comfy until sdxl 0.9 leaked
Comfy and A1111 are different tools for different jobs. I hope they both continue to be developed.
The facebook github doesn't even mention comfy
i hated comfy when i first started using it, all my scripts were using a1111's api, exclusively use comfy now just took a bit to understand it
Because the other UIs were late to fully supporting SDXL.
odd that it's made specifically for comfy, but it doesn't even mention comfy
Nobody was late. SDXL doesn't support SDXL...we've lost tools like controlnet
i think it's made for AI generation in general, someone else is making something to make it work "with" comfy
right
No tools were lost, they are just for a different product
Comfy fully supports the SDXL generation process (can't say workflow anymore because people think that means spaghetti nodes). A1111 doesn't support the full process yet.
The electric start on the packard wasn't lost. The 1960 car you now have to crank start is a different product
I hope for SDXL inpainting model someday. It really does work better for many processes.
SD NExt supports it
I've said it multiple times, SDXL has been out for 2 weeks, and people expect it to have literally everything that 1.5 has which took months to get to
what aspect of the process...the workflow, which now you're not using because marketing...doesn't A1111 support?
i'm using inpainting in comfy now with sdxl, seems to work, controlnet "soon" we think
Inpainting works if you inpaint based on the whole picture. But if you want to inpaint only the masked area you need an inpainting model for best results. Same as 1.5/2.1. No different.
When the cars came out in 1960, they had everything the Model T's had
they didn't go back to 1900
Is there anywhere I can pay to train a lora for sdxl
Controlnet will come. Base model is base model.
ah
Lets continue to compare apples to oranges
they didn't trot out horse carriages, insist they were a step forward, and soon we'd have engines again
Runpod (not affiliated).
I’ll look into it
new gpu friday, then gotta learn how to train, then maybe lol
what gpu are you getting smutz?
☝️
just use ComfyBox if you despise nodes so much, if you want the speed boost I demonstrated earlier you need Comfy as a backend.
wtf is comfybox? lol, now i gotta go look
what part of the workflow...er, sorry, the marketing language is now to call it process, does sdnext not support?
frontend for ComfyUI, usable with the AIT workflow I made.
does it support animation extensions?
which animation extensions?
mov2mov,sd animation,deforum,animatediff
Well, I don't know. You mean with SDXL, right?
yea can u just paste those extensions into sd next or someone has to port them?
they worked for 1.5. I don't know if they work for SDXL now
I rebuilt my PC into a new case today, my 4080 is hitting temps that seem pretty steep and I was wondering if someone else here could monitor their temps during a generation to increase my sample size. Pretty please and thank you.
can you point me to them and how to get them into comfy?
cant get them in comfy yet
Yes those are high. Try to improve GPU airflow.
I'll have to wait til tomorrow then. My corsair fans don't fit the new thermaltake mounting bracket so I'm lacking 3 intake fans
Thanks
That could be why. Need airflow to move hot air away from GPU.
Put a yurblotz in there
I did make two pretty pictures during testing at least...
A Yurblotz will help with your temps
lol I thought you were just making words up at first
I'll look up yurboltz
how's nido? 🙂
yurblotz
A yurblotz is a kind of tiny frost dragon. I have a few of them at my house instead of AC. They eat bugs, and they breathe out cold air
Just the best boy ever, as always
Tell the dog "Burf" for me.
Will do, after his nap of course
I dont have a dog. However, there are some coyote packs around here that are friendly with me
its true what they say,sdxl does insane realistic cat pics
I have a sure fire winner for the generate contest
the 4090 will be mine
I am working on an image of a man gazing into a crystal ball wherein is shown a fight between a dragon and the men of a town
However, that is not the winning image
the contest won't be a best generation contest. It will be a who gets the most people to like their entry contest
aka...solicits people to go choose it
Welcome to internet.
they claim that won't be allowed. How can they stop it?
The bottom line is that to overcome that, someone would need an image that gets people to choose it despite going there to choose something else. I have such an image
Image McImageface?
nope
my image is like that one that's been making the rounds on Reddit
a boy goes to a witch to curse a pendant....it will make whoever wears it die in 7 days
the boy places it around his mother's neck - the doctor says she won't live through the night
If someone could make that - it's a 4 panel comic - on one image generation, it would win
i won the contest gimme the 4090
That would make a good pet
An image that might win would be the image of my meeting with a vulture. One day 6 vultures were on my roof. They had caught something, and they were eating it. I turned on the sprinkler to scare them away because when I went out there to shoo them, they attacked me from all sides. Their claws and beaks were sharp. I could nae fight them all off.
All of them left except one, the runt. I guess it was desperate for food, and normally it didn't get much because the bigger ones got all the food
I had mercy on it - turned off the sprinkler so it could eat.
Today it returned, spit out at me the skull of a crow. I guess that was my reward for scaring off all the other vultures
hmm tasty vultures
my arms got all cut up from them
I think they are black vultures. They didn't have red heads
That video of those images generating - it sure was fast
How many of this style do you have now?
Get it so that inside isn't just colors, but an actual model of something like a town or mountain...something like that
hah put your text in for fun
dood
If you could get all four panels in one generation - the story is a tear jerker. Of course, only a fool would use black magic for any purpose, even a good one
No one will stop to read the image while voting. They will look for colors, details, waifus.
I think they'd look at that one
The winner I have, however, will instantly grab attention and appeal to people in this community
What is this Waifu? Is it like Kung Fu, like a waif fighting, a street urchin who fights using martial arts?
WAIFU - What Actually Is F_d Up in this image 😉
what image?
Everybody was waifu fighting... 🎶
Kung Fu is not as much about fighting as it is about strengthening the body
just thought of a good way to remember sd15. its the one that made every girl look 15.
rip sd15. long live sdxl
how to remember SD2.1? faces generated with SD2.1 look like they are missing 2.1 million brain cells
even 15 is a little generous..many looked a lot younger
and mind. the roots are for any skill that requires mastery and discipline over time if i understand it right
correct
also had 15 extra fingers and facial features
yeah, classic
also conjoined twins with wide images
hard to believe that was about a month ago
you'd get a lot of centaurs too
I wonder what we'll have in a year or two.
what i call the people riding themselves
more sdxl asian girl loras
holodeck confirmed
Moar waifu is a given. I wonder if there will be another new model by then or if SDXL is it for a while.
SDXXXL soon
text2movie
on consumer hardware? maybe in ten years at the rate nvidia is drizzling us with hardware refreshes
seems to require a ton of processing power for video, which makes sense. i think for most people, video won't be for a few years.
Apples AR headset is gonna have people walking around with GPU backpacks turning all the pedestrians around them into waifus
people already do that at home
With VR sure
i don't think it'll be that integratable. it'll do that with macbooks though
and macbooks got some of the best ML silicon on them
google glass all over again
lol I just think it's a hilarious thought
Now with ai!
ppl even used a term for them i think it was called glassholes
i'm thinking of getting one just to jailbreak it and put googly eyes on the visor
they refused to interact with you if u were wearin a google glass
I just wish I could record my perspective when playing with my dogs but pervs gotta ruin everything
This dog's day will come.
a loud minority was against google glass. the reason they really didn't move is because they were ugly and the actual text display sucked. it was pointless
building your own google glass wouldn't be that hard today
cameras are EVERYWHERE. no one cares
Yeah the tech wasn't all that useful yet.
could probably even make it look decent with some custom 3d printed parts
glass failed because it was dumb
there's tons of smart glasses on the market still. raybans that take photos for your phone. all sorts
the thing that is keeping these things from really being awesome is power... battery capacity is still garbage and cpu's still use too much of it
https://www.ray-ban.com/canada/en/rayban-stories these things are actually getting some market penetration
smart glasses would be great if they werent bulky+had the capability of high res video streaming
they could but we have that power issue
yeah but now super conductors
yep, thats why i specifically pointed that out lol
nope that LK99 was debunked
i know 😦
i read about an idea of a smart battery vest to power all your wearables
but super conductivity at ambient is going to be a thing.
never
if universal, products could be designed/developed going forward with that in mind as a power source
hmm. a vest that could ignite at a moment. i like that cool idea
keeping a bomb in my pocket is something i've gotten used to so lets just vest up i guess
it's just an engineering problem. the math is real.
math is just a story, stories can be made up
perhaps math is the greatest story ever told
or the worst, but there are usually better stories every couple decades or so
only because children are stupid and have no cultural knowledge so missed that story last time it was told
ugh kids are so dumb it gets on my nerves so much
everyone is dumber yesterday than they are tomorrow, its how our minds work in this linear spacetime we are trapped in unfortunately
well yeh, people have strokes, eat like shit, or just stop giving a fuck
or that entire generation of people who were exposed to wide lead contamination from leaded gasoline emissions. that can happen too
pretty sure covid dropped the global iq average a couple points as well
you are forgetting the leaded paint that was quite popular around that time and is still very much present today
that long vid has me dealing with concussion recovery again. i def got stupider
i love microplastics on my food
nope. i am not forgetting the leaded paint on all of the toys and bedrooms and cribs and lead in everything.
story goes lead paints taste sweet so there'd be kids who ate the paint chips
or irradiated seafood from nuclear power plants dumping their waste into the pacific...
looks like XL used MSPaint for one of the eyes
or fake sugar, fake colors and fake fruit flavor on my icecream
i'm less worried about the radioactive water since the pacific is huge
i haven't checked lately but after that disaster in japan the levels of radiation were quite high in the fish/crab/etc being pulled out of the pacific, even as far as NA coastlines
i often find it gets eyes just a little off that way. like on one the reflection is trying to be sharp and on the other it's soft. detailing pass on the face is my favorite for portraits, but i've noticed the settings i got dialed in make them sort of plasticky. maybe too much denoising or too many steps
effort to get back on topic. existential dangers are a rabbit hole
eyes have been an issue all the way back to the SD 1.x days, usually from a VAE that needs more training or a jacked up text encoder... wouldnt be surprised SDXL has one or both of those things going on
higher than normal. a banana has more radiation exposure. remember, the pacific is literally huge. one of the hugest things on the planet
take all the water in the pacific and put it in a ball, it'd be huge
relative size of the pacific doesn't have much literal correlation with the risk posed from an industrial sized nuclear powerplant dumping all of its material into it
i think all powerplants are industrial sized. they're kind of industry that way
wtf thought it would be a good idea to build a nuclear powerplant on the beach in an area that frequently sees tidal waves
but yeah. it does. the pacific is huuuuuuge
not really on a cosmic scale
that radioactive water would have a bad local effect def but the wider? its not homeopathy
it did tho
cosmic scales are unfathomable. huge still sort of it
it has measurable impact on the radioactivity of the fish being pulled up off the north american coast
measurable yeh. higher than typical yeh.
if its measurable its not homeopathy 😛
i live on the pacific coast directly across the currents from fukushima. the debris washed up on my shores on vancouver island. i'm informed on the matter. i live it
worry more about a solar flare than fukushima. there's been at least a dozen that dump more radiation into the atmosphere since then
luckily the atmosphere is huge
we also have layers of both physical, magnetic, etc composition that filters out most of that crap
not saying a huge ass solar flare couldn't royally screw us tho
not the ones that break through and cause auroras and measurable effects on the electrical grid
well keep enjoying your uranium infused fish, ill go eat my lead paint chips
also the other problems are free radicals and other ionized particles. but they're in such diluted amounts that literally a banana exposes you to more
one to beam down, Scotty
you'll never beam me lucky charms
sphinx of black quartz, judge my vow
Cant make Spock do the Vulcan salute. dont think XL knows what it is
have you tried the phrase "making a Jewish Priestly Blessing" or "Kohanim Priestly Blessing"?
will try
haha yeah it loves doing him with a hand up, but fingers splayed
nope. He just has his fingers raised
amazingly im seeing 5 fingers consistently
higher cfg
Live well and propagate.
spock on the bridge
Spoculues
amazing
i mean if you want good palms, i guess prompt spock
these hands are outstanding
but they can't do the vulcan salute
sadly not
got a controlnet for sdxl handy?
hehe, handy. I get it
my latent to your latent
my prompt to your prompt
"My mind...to your mind. My thoughts...to your thoughts. It is in the stars, when at war always with the Sith there are two...master and apprentice. AT LAST we shall have our revenge on the Jedi! When the stars trek, if a hand has six fingers, then hours are like days. The repairs will be complete within four days....if hands have six fingers...or four."
did anyone see the SNG episode recently where Spocks ears were made human and he had to wear a beanie to meet his mother?
Terrible show
really ? I'm loving it
anyway...controlnet could easily do this
[:splayed:0.25] in negative is aiding a bit
wheres the fun in that lol
helps to investigate the latent space's mobility without controlnet still. i like this character study
prompting is king
Situations sometimes work
I have found that if I cant get a clothing style, I can describe a place and/or activity to get it
try holding your mouse to your mouth like a microphone, and then speak the prompt
yeah prompting i always put a location. context helps sdxl a lot more than 15
"Computer....give me an image of Spock making the vulcan salute"
anyone been able to use comfyui and load iA3 created lora files? DyLORA as well?
Highly Illogical · Leonard Nimoy (from Spaced Out - The Best of Leonard Nimoy & William Shatner, ℗ 1967 Geffen Records)
postcard from vulcan
This output is highly illogical
all the spock pics were probably all captioned with "holding hand up" so it got blended with knowledge of every other hand
future dataset building opportunities
The Borg
john spock. you don't want to see what he's like when you kill his tribble and steal his shuttle craft
Spock Motor Sales is #1 because our prices are always logical.
The most deadly type of Borg. The Bjorn Borg.
The Bjorg
This is all three segments from the Magnavox advertisement laserdisc: "Leonard Nimoy Demonstrates the Magnavision Videodisc Player"
This advertisement has become a bit of a legend amongst laserdisc enthusiasts and Nimoy fans alike.
I noticed that previous versions uploaded to YouTube missed out the second segment so thought I'd upload my own...
why has no one done 7of9 yet ......................
I tried. got someone completely different
me too, man 😁
that is the most enjoyable trek in recent years.
psy borg
Star Trek Woke Frontiers is not enjoyable
Here's another of Seven Eight Nine
Here is from the awful Picard series
Trying to remember the name of the fan based series I watched on YT a year or two back that basically redid the TOS episodes anew. It was most excellently done
at least she didn't plastic surgery and botox herself into oblivion the way Dcctor Crusher did
Star Trek New Voyages
it's better than STD or picard, imo 🤷♂️
and you can't watch anything without woke these days
I can
nope not that one, this one was better
Star Trek Continues, then
that was the one 🙂
New Voyages has some terrific episodes - the one with the doomsday machine, sulu, checkov
i loved the new episode of strange new worlds where they entered the pocket of space that caused them all to sing
and last season they did a freaky friday episode. it was amazing
that was funny as
you close your eyes when woke scenes appear or don't watch them at all? 🫣
Starship Exeter has some good episodes
I tried to watch some superhero show on netflix
Heroes of Tomorrow or something
another psy borg.
although the SNG episode 2 before where they did the crossover with Lower Decks
that was your mistake, watching anything on netflix 😀
I dont watch much of anything anymore
I watch anime when I work out or bible study videos and debates
bard on how to better overwrite a celebrity token for training a new face. does this make any sense to anyone?
they turned dororu into woke garbage
who is doruru
it's about this baby that is born without body parts, skin, internal organs...his father sacrificed him to demons to gain power. Some sorcerer keeps the kid alive, builds wooden parts for him, and then the kid goes around slaying the demons to get his body parts back
the main character is a boy, a thief named Dororo (I had it spelled wrong) who helps this guy
In the Manga, it turns out Dororo is a girl whose mother raised as a boy in order to protect her from being raped in war torn Japan
in the anime, they change it to that Dororo identifies as a male trapped in a girl's body
the Manga focuses on the demon slayer's relationship with his father and younger brother...the battles between them
the anime focuses on Dororo's gender identity - becomes unwatchable from about half way on
the main twist of the manga is played down in the anime in favor of Dororo's gender identity
can we not do subtle hate and prejudice against people? not even in any channel here/ that'd be nice
yes, please don't
you seem really angry about woke, whatever you think that is.
it's probably just a facade for hate as usual.
don't
Im dropping it because I don't want to get kicked out or banned
I expressed no anger - that's the last thing I'll mention
well that confirms the context you meant it in then.
Feel free to PM to discuss it
if you think you'll get banned for it, then you know exactly wtf you're doing
naw. ew
This would be easy with controlnet depth or cammy...whatever it's called
I bet you could train a vulcan salute lora, too
or just use a hand pose with controlnet
openpose
since no controlnet for sdxl your best option is img2img
img2img
that's a good one
that looks nothing like spock
I am getting beyond frustrated with making LoRA's for SDXL
I can't tell if something is just flat out broken with Kohya right now or not
its been more than 2 weeks and still no Steven Seagal lora for sdxl 😔
how's that new GPU?
This is Spock 🙂
that's great 🙂 glad to hear it
Now, if only I could get any results from SDXL training
dumb question, how does the StableDreamer bot handle prompt compared to a local comfy ui workflow?
like i feel like bot follows prompt better
I present to you - training with the nuclear option known as "caption dropout"
unet diffusion_pytorch_model.safetensors 10.3G,long time downloading..for running inference API on huggingface
At least you have results 
probably worth setting up your install anew
also, kohya gui got preset updates last night, for sdxl
so those should be useable now - including prodigy
might be a good time to reinstall it
git LFS pull is much slower than LFS push
is it in real-time or something? 
yeah, using kohya samples XD
I added too many new settings to this training run - so I needed at least some way to track it live
I did
I have an old install, a recent one, a brand new, and brand new Derrian from today
yeah no. if none of them are working then there is an issue O_O
python on 3.10.9?
cuda cores are actually working? 🤣 and not starving for power? (/jk, can't imagine this being an issue)
how u train lora?with kohya GUI or just script running?
with the gui - since it adds all the working things from the dev branch of kohya-ss
what are the prodigy params? cuz i just train lora through hardcoded python, so i have to modify the training optimizer config file
I'm running my first LoRA training now but the example dataset (pokemon) is large. Should get a smaller set to test on. 😄
adafactor's loss is around 1.1, if i want to minimize the loss value, it seems like i have to change to another optimizer type
mine are a copy of this post:
https://civitai.com/articles/1022/sdxl-trainingbdsqlsz-lora-training-advanced-tutorial2best-optimizerprodigy-is-all-you-need
This is the new updated version live on kohya gui: (I didn't test it - but I'm guessing its working if its recommended for everyone?)
https://github.com/bmaltais/kohya_ss/blob/master/presets/lora/SDXL - LoRA prodigy AI_Now v1.0.json
your GPU VRAM volume? what's the minimum requirement?
Stable Diffusion XL (SDXL) enables you to generate expressive images with shorter prompts and insert words inside images.
I run it at just below 24gb vram
but I've seen in the comments, that people got it down to 8gb vram - can't help on that, as I never tried it
Just published my new Inflated lora https://civitai.com/models/126131
Thanks to @upbeat summit for helping test it out 🙂
Use the word inflated to add the inflatable look to your images with this lora.
trained a lora on another character with a bokeh prompt. if you generate through the refiner model, just make sure to chain it with upscale nodes, otherwise the render will lose detail. i will post an update on how to do inference on huggingface later, it will be more interesting and time saving
what are the difference between no caption and caption dropout?
Dropout caption every n epochs
Usually, images and captions are learned as a pair, but it's possible to train just on "images without captions" every certain number of epochs.
This option allows you to specify "drop out captions every N epochs."
For instance, if you set this to 2, you will conduct image training without captions every 2 epochs (2nd epoch, 4th epoch, 6th epoch...).
By training on images without captions, it is expected that your LoRA will learn a more comprehensive feature set from the images. It can also help prevent the image features from being tied too closely to specific words. However, if you use captions too sparingly, your LoRA could become ineffective at prompts, so be cautious.
The default is 0, and in the case of 0, caption dropout is not performed.
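A minimal sketch of the "every n epochs" rule as it is described above, assuming epochs are counted from 1. This only illustrates the description and is not taken from the kohya scripts; `captions_dropped` is a made-up name.

```python
def captions_dropped(epoch: int, every_n: int) -> bool:
    """Return True if captions are dropped for this 1-based epoch.

    every_n == 0 means caption dropout by epoch is disabled.
    """
    return every_n > 0 and epoch % every_n == 0


# With every_n = 2, the captionless epochs are the 2nd, 4th, 6th, ...
dropped_epochs = [epoch for epoch in range(1, 7) if captions_dropped(epoch, 2)]
print(dropped_epochs)  # [2, 4, 6]
```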
Rate of caption dropout
This is similar to the "Dropout caption every n epochs" mentioned above, but during the entire learning process, you can train on "images without captions" for a certain proportion of the time.
Here, you can set the proportion of images without captions. 0 means "always use captions during training," and 1 means "never use captions during training."
Which images will be trained as "images without captions" is determined randomly.
For example, if you train LoRA with 20 images, reading each image 50 times for just 1 epoch, the total number of image learnings is 20 images x 50 times x 1 epoch = 1000 times. If you set the rate of caption dropout to 0.1, 1000 times x 0.1 = 100 times, you will train on "images without captions."
The default is 0, and all images are learned with captions
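The arithmetic in the rate example can be checked with a small simulation. This is a sketch under the assumption that each image read independently drops its caption with probability equal to the rate, which is how the text above describes it; `count_captionless` and its parameters are made up for illustration and are not kohya's API.

```python
import random


def count_captionless(num_images: int, repeats: int, epochs: int,
                      rate: float, seed: int = 0) -> tuple[int, int]:
    """Return (total image reads, reads that were trained without a caption)."""
    rng = random.Random(seed)
    total = num_images * repeats * epochs
    # Each read drops its caption independently with probability `rate`.
    dropped = sum(1 for _ in range(total) if rng.random() < rate)
    return total, dropped


total, dropped = count_captionless(num_images=20, repeats=50, epochs=1, rate=0.1)
expected = total * 0.1
print(f"total reads: {total}, expected captionless: {expected:.0f}, simulated: {dropped}")
```

Because the choice is random, the simulated count lands near 100 rather than exactly on it, which is also what you should expect in a real run.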
caption dropout info
hahah works great. nice work
nice 😄 glad you like it
Might be I need to do a compare between no caption, caption dropout, full caption.
base and a plain lora
left is base model lora training image,looks fine,right is upscaling refine model generation,breathtaking
it's been a lot of fun. very unique fine-tuning!
thanks again for helping to test it out 🙂
you're welcome! thanks for letting me explore it
wonder why it flagged this as 'sexual acts'
I tried to click the arrow 
maybe check the prompt
it didn't seem to detect any of the metadata either
sdxl refine model render details is so good,left base model,right is refine lora upscaling
this is totally related to the number of repeats. if you use more repeats it will have a stronger effect, because it will use that many images during training as regularization images
that's great!
thanks
sigh wrote an article on my nodes on civitai and it just went to a 404 page when I went to publish it, doesn't even save a draft or anything
damn 😦 yes, their page is currently unable to handle the load
I since write everything in my editor or an email draft first heh
face lora guide + settings is up 😄
#🔧|finetune message
lora with sdxl refine is awesome
The more I increase steps, the less it follows my controlnet in comfyui, anyone know how I could increase steps, while having it keep the same amount of controlnet control as it would have with 20 steps? I hope it makes sense what I am saying.
okay, i admit the character generation surpasses midjourney in range of styles and diversity, dark style is so good as well
When Controlnet XL ?
even this lora has same settings as the old one?
I see you're training clip. if it works then what were these "unexpected results" we kept getting warned about with that?
interesting poses
the head looks out of scale to the shoulders as well
training clip is an all or nothing situation normally - but that's why I'm so specific on how to caption it, exactly how long to run the training
it's a very controlled environment
For my big datasets I'm always training clip now - since I have more than 150 images per caption word I'm training - with a total of 4k images 🤣
in that case it works
but the smaller the dataset gets - the harder it gets to train clip and get better results than without training it
I see. for my purposes I can't manually edit captions so I probably shouldn't train clip, plus my training datasets are always 12-26 imgs
yeah :/ was gonna recommend something with user input... but that fails at the individual image level - where you can't exactly ask them to tag if they're wearing a sweater or not XD
yeah probably would create more problems.
I'm getting good results with unet only right now but there's still a hint of the celebrity used for the token if XL knows their face too well. increasing LR doesn't help and I really don't want to sacrifice more epochs, it's currently an hour for 16 imgs
"professional model" and "model photoshoot" didn't work as tokens
I'm pretty confident that the 4 class tokens listed above can be made to work.
also relies on caption dropout.
those would always yield exact reproducible results - but testing various dropout settings would be a pain until you finally get it right 🥲
i just uploaded the actress upscaling model on https://civitai.com/models/126239
this is a realistic lora of maggie Q trained on the sdxl 1.0 base model, first of all, i trained on 68 raw images of maggie Q with kohya python scripts on 16G...
clip skip 2? O:
I haven't tried clip skips at all yet in sdxl. is there value in training with it?
since sd2.X, it seems like clip skip isn't available anymore, anyway, i just leave it for reference.
My understanding is that the refiner doesn't work with loras, and can't be trained.... Therefore most of the time the refiner actually regresses those learned traits
well there is a node in Comfy for it but can't say I've tried using it
the refiner model just renders more details than upscaling
Fair enough, I guess if you keep the denoise low enough
currently, i'm focused on the hot inference API of huggingface, it takes some time to deploy
base model denoise 1.0, refiner model 0.25, it's not a big deal, neural network param optimization and working effectively and efficiently are more important
First image is Clip Set Last Layer = -1 , Second Image is Clip Set Last Layer = -2
All other settings identical
(don't count fingers and ignore the foot lol)
how could i get something close to this?
it was relevant for the novelAI model back in SD1.5
since they finetuned at clip skip 2, so it gave better results for inference at clip skip 2
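(editor's sketch, not from the chat: roughly what "Clip Set Last Layer = -1 vs -2" picks. It assumes the text encoder exposes a list of per-layer outputs and ignores any final normalization a real implementation may apply; `pick_clip_layer` is a hypothetical helper, not a ComfyUI function.)

```python
def pick_clip_layer(hidden_states, set_last_layer=-1):
    """Return the per-layer output that a given 'last layer' setting selects.

    hidden_states: text encoder outputs ordered from first layer to last.
    set_last_layer: -1 uses the final layer, -2 the one before it
    (what other UIs call clip skip 2).
    """
    if set_last_layer >= 0 or -set_last_layer > len(hidden_states):
        raise ValueError("set_last_layer must be a negative index within range")
    return hidden_states[set_last_layer]


if __name__ == "__main__":
    # Placeholder strings stand in for real tensors.
    layers = ["layer_10", "layer_11", "layer_12"]
    print(pick_clip_layer(layers, -1))
    print(pick_clip_layer(layers, -2))
```

A model finetuned on the second-to-last layer's output would then expect the same setting at inference, which is the point made about novelAI above.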
what does "it" in it/s stand for?
iterations
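(editor's sketch, not from the chat: it/s is sampling steps per second, so a rough time per image is steps divided by it/s. The function names are made up for illustration, and this ignores model loading and VAE decode time.)

```python
def seconds_for_steps(steps, its_per_second):
    """Rough sampling time in seconds for a given step count and it/s reading."""
    if its_per_second <= 0:
        raise ValueError("its_per_second must be positive")
    return steps / its_per_second


def its_from_seconds_per_it(seconds_per_it):
    """Convert a slow 's/it' reading into it/s."""
    if seconds_per_it <= 0:
        raise ValueError("seconds_per_it must be positive")
    return 1.0 / seconds_per_it


if __name__ == "__main__":
    print(seconds_for_steps(20, 2.0))
    print(seconds_for_steps(20, its_from_seconds_per_it(4.0)))
```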
You shall not pass.
That's a cool approach, to use a percentage for the base.
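(editor's sketch, not from the chat: assuming "a percentage for the base" means splitting the total sampling steps between the base model and the refiner. `split_steps` and the 0.8 default are hypothetical; the values would then be typed into whatever start/end step inputs your workflow uses.)

```python
def split_steps(total_steps, base_fraction=0.8):
    """Split total_steps into (base_steps, refiner_steps).

    base_fraction is the share of steps given to the base model;
    0.8 is only an illustrative default, not a recommended value.
    """
    if total_steps < 1:
        raise ValueError("total_steps must be at least 1")
    if not 0.0 <= base_fraction <= 1.0:
        raise ValueError("base_fraction must be between 0 and 1")
    base_steps = int(round(total_steps * base_fraction))
    return base_steps, total_steps - base_steps


if __name__ == "__main__":
    base_steps, refiner_steps = split_steps(20, 0.8)
    # The base would run steps 0..base_steps, the refiner the rest.
    print(base_steps, refiner_steps)
```

Using a fraction means the split stays proportional when you change the total step count.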
In ComfyUI, is there a way to navigate the Canvas without having to grab outside a node and drag?
Currently I have to zoom in and out then grab and drag a lot. I'd prefer if I could stay zoomed in more often and just scroll around the canvas.
spacebar
cries in cpu generation
i am on GPU
but are you... really? 🤔
i am on MX130.
jk. you prob have something causing a vram overflow into standard ram, which is causing it to run at cpu speeds
hm.
Chrome may be eating too much RAM
maybe i can try Edge
2GB VRAM is obviously not enough
oh god no XD
but i wouldn't think it's caused by the RAM
6gb is the lowest I've seen work
8gb vram 'works'
12gb vram is recommended for 'full' support
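(editor's sketch, not from the chat: the VRAM numbers above written as a lookup. The thresholds are just the ones people reported in this conversation, not official requirements, and `sdxl_vram_tier` is a hypothetical helper.)

```python
def sdxl_vram_tier(vram_gb):
    """Map a VRAM size in GB to the rough tiers mentioned in the chat."""
    if vram_gb >= 12:
        return "recommended for 'full' support"
    if vram_gb >= 8:
        return "'works'"
    if vram_gb >= 6:
        return "lowest seen working"
    # Below 6 GB the chat suggests overflow into system RAM at cpu speeds.
    return "likely overflows into system RAM"


if __name__ == "__main__":
    for size in (2, 6, 8, 12):
        print(size, "GB:", sdxl_vram_tier(size))
```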
i still got it to work actually
oh yeah it always works - but not via gpu
i am always on GPU
that's the cpu doing it in the background
even in SDXL lol
what dis?
@halcyon tusk ( right ping?)



