#🍥|anime
That's perfect anatomy ngl
heres another
is it bc of lora or model?
both
it's weird, just adding the birds, the model transformed the colored lights into them!
from this to that
@wooden coral @dull jackal bad link in chat 



i wonder what happens if i randomize the noise of the upscaler
how the?.....
i absolutely love comfy
it's a constant source of surprises
a part of me wants to cover known characters in atypical mysteries, but the dominant one just wants to push the absolute chaos out of ai generation XD
the next thing i implement is wildcards
no can has bunbun
But...what if there's...nothing to click!? 
lol
Santa's givin' that person coal
It's based on this girl I know. She has this bunny she loves.
Also, yeah, Comfy is so great!
I always wanted to have a bunny. I had a bff who had one when I was a kid, so I've always loved them. They're so soft and adorabibs, so how can you not?
They have their downsides, but they're cute.
i had 2, but they were big, and chewed all my pc cables, so, i gave them back
not a thing for me
i prefer my cats, who constantly drop vases and cups on the floor, but i solved that with plastic ones lol
haha, yeah, small animals will be out for your cables, for sure! Cats, dogs, and rats.
Well, you know it's a rite of passage when your cat takes something and whaps it to the floor.
no more glass or porcelain for me XD
well, they are cats. 10 times better than lara croft XD
me too ❤️
haha, high five!

What's your fave!?
i'm a cat guy, always raised cats since childhood
the ones i find on the streets
every time the time takes them away from me, it's a trauma, but even that is love
what they gave me will always be part of me, now and ever
It's always tough when that happens, but I will always love all the animals I've owned.
Gorgeous! Love all the traditional art aspects/blending going on.
this workflow is insane. i don't know what's going on myself
Tell me about it.
i'll be fast showing you a screen XD
sweet!
A different kind of bunny lol
and this is still an incomplete project
some nodes may be connected mistakenly
and some settings could potentially even damage a lower-end pc
and my pc is in the mid-low range
but what does it do
lcm and coadapter to set a pose and depth to the image
then i pass it through 3 ksamplers: the first starts the work in exponential, the second in karras, the third is connected to a new seed and does the gen with a small upscale, to add detail
the result gets depth, superimposed images, like the birds in this case, and absolutely unprompted details, which give more complexity to the image
i even added concat, when i add single details that gives even more randomness to the end result
in this case, in concat i added birds and the gen spawned birds everywhere
maybe i should add regional on concat to add details in some precise zones
and use wildcards on the other 2 concats to make things even messier
in the end, i just have upscaler and face adetailer
on the top right, ipadapter
cute lol
adding regional prompting on the concats, influenced by wildcards
i can't wait to assemble it
i love comfy. i can't do such things on auto
I like it!
Sorry, had to do real life for a bit
ah no problem, i'm generating while multitasking, reading and listening
time is never enough
the true madness, is in setting every node
as if connecting them is not enough
I can see the various different aspects in your image---it's certainly a lot of effort
but I like the overall output
and the depth to the background
And again, c'est la vie, but....FAREWELL
it's never enough
when i finish with existing nodes, i will study how to develop nodes myself XD
Am i using somehow a bad model or sth.. I only get bugged anatomy(literally)
What are yall using for fixing extra limbs/legs/hands/fingers?
uh, wait. Currently on win 10. Gonna need some time as im unpacking something
Can you send an image with info that you generated
face adetailing this will be hell
i knew, adetailer is capturing every single face lol
Well... that's not a bad thing here. It looks great.
It's doing what it was made to do. Don't take that away from it, lol.
Well, zooming in, it looks like you don't really have a high denoise on face detailer. Put that up to 0.4-0.5 and have the steps at 12 or so. That should fix the problem for the faces that it's actually identifying correctly.
i lost the old lady on the right
old lady?
lowres, polar lowres, bad anatomy, bad face, bad hands, bad body, bad feet, bad proportions, {bad leg}, {more legs}, worst quality, low quality, normal quality, gross proportions, blurry, poorly drawn, text,error, missing fingers, missing arms, missing legs, short legs, extra digit, 2girls, long skirt, out of frame, only_upper body, only_lower body, [petite], low background, distorted perspective, dynamic angle, dynamic pose, nsfw,
now on ubuntu
Oh the horrors
I mean, it looks like you just need a simple full upscale first before you start inpainting faces.
for the old lady, well, it's a miracle
and an example of it:
or this
or here the face and the hand
that's literally insane, i don't see much of fuck ups
I wish I could understand comfy metadata
oh it's comfy
isn't 12 cfg kind of high?
not really, the first gen needs it ig
getting upscaled by model and then downscaled latent for more details
soo.. what you guys using and not getting ana faults?
This
The only thing I change depending of what i'm going for is the resolution/num of steps
How Do I Get Good Hands with a 1.x Anime Model?
Ah yes, a question as old as the time of AI itself.
If you're using 1.x: First, you need a killer model; one that's extremely cohesive like Aurora or DarkAlfa. Then you need to prompt for the model. If it's a booru-tagging model like Aurora, try to use as little natural-language prompting as possible (describing the subject or scene in sentences is a no.) If it's a natural-language model like DarkAlfa, use short descriptive sentences combined with booru tags to help it along.
Next, a decent negative embed will help it along in the direction that you're wanting it to go; something like negative_hand works well, if subtly. An additional general negative like AuroraNegative or EasyNegative also helps.
Additionally, a solid negative prompt that has as little as you can while still taking care of potential issues is the goal. Here's what I use as a starting neg prompt in Auto1111, but I often cut it down a bit further before genning.
```
BREAK
(disfigured, unclear, indistinct:1.3), ugly hands, extra arm, extra hand, split arm, missing finger, extra finger, three fingers, four fingers, six fingers, merged fingers, (bad anatomy:1.3), misplaced hand, misplaced foot, (text:1.3), (signature:1.3), (title:1.3)
```
Final tips:
- Use the CFG you want, but know that the lower you get, the less cohesive, with more potential for bad hands and indistinct features.
- Use an amount of steps that works for the sampler that you're using. 20 is a good baseline, but if you want more details then 40 is good as well for samplers like 2M Karras. If you ever use any that end in Exponential, then increase the steps a bit, as it requires more.
- Check network extensions. Sometimes it's a bad LoRA or incohesive embed that can mess things up.
- Be aware of your settings as a whole. There are many moving parts that can affect things.
Also, keep in mind that pretty much any kind of weapon comes out looking like shit if you're not on sdxl
😔
Negative Hand embed: #🍥|anime message
Aurora and EasyNegative: #🍥|anime message
i just saw on civitai that some people on the model i use, use badhandv4, should i also do that?
or only one at time?
Additional guides of mine here.
- **How to Upscale with the Most Detail and Least Problems:** #🍥|anime message
- **Help I Can't Increase My Contrast when Generating!** #🍥|anime message
- **What 1.x Anime Model Should I Use??** #🍥|anime message
Don't use badhandv4. Just use the negative_hand that I linked. And never use more than one bad hand embed.
alr thanks, i'll try out
hmm, i just generated some interesting image, it's just hiding the hands.. (No embedding currently, my internet is busy with another model)

welcome to the reality of entire humanity
can i add just the prompt in comfyui or how do i include the neghand?
I feel like it's easier to make mature looking characters in them
🤔
If you have the file the same name as what's in my neg prompt then it'll work. Just make sure you've restarted your ComfyUI after you've put it in the embeddings folder.
you do it!!! bravo!!!
Oh, and if you're using comfy, you'll need to concatenate the negative prompt clip conditioning instead of using BREAK.
um, what does that mean?
You'll need something like this. If you search "concat", you'll find the node. It essentially combines two CLIP Text Encode conditionings into one.
If you don't already know, you can double click empty space in the workspace to manually search for a node to place.
Right, you'd put what's before the BREAK in one and what's after the BREAK in the other.
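For anyone following along, here's a rough sketch of that split in Python. It only shows how an A1111-style prompt divides at `BREAK` so each piece can go into its own CLIP Text Encode node; the function name and the sample prompt are made up for illustration and are not part of ComfyUI.

```python
import re


def split_on_break(prompt: str) -> list[str]:
    """Split an A1111-style prompt into chunks at each standalone BREAK keyword."""
    chunks = re.split(r"\s*\bBREAK\b\s*", prompt)
    # Trim stray commas and whitespace at the chunk edges, then drop empty chunks.
    cleaned = [chunk.strip(" ,\n\t") for chunk in chunks]
    return [chunk for chunk in cleaned if chunk]


# Hypothetical example prompt, not the one from the guide above.
neg = "negative_hand, EasyNegative\nBREAK\n(bad anatomy:1.3), extra finger"
for i, chunk in enumerate(split_on_break(neg), start=1):
    print(f"CLIP Text Encode #{i}: {chunk}")
```

Each printed chunk would be pasted into a separate text encode node, and the concat node then joins their conditionings.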
Ig it worked, does comfy also give meta data out? if yes how'd i read them?
wait nvm, it clearly worked
not perfect, but much better than double/mixed up hands
it definitely improved 👍
{{{{{{{lens flare}}}}}}} what exactly do those {}? Do they make the weight go up or down?
DON'T use {} use () to increase attention.
I recommend not using silly amounts of brackets though.
Neither
Just use something like (black and white:1.3) or (skinny:0.5) to increase or decrease weighting.
(((((((((())))))))))) for curse image
Hahahaha
it makes my eye ache to read the prompt, does this mean it would be 1.7 because of 7 {?
**{} is NAI’s “implementation” of ()** what is nai and what does that exactly mean?
NovelAI diffusion. Don't worry about it--it's not a local thing that we have access to with SD.
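On the "would it be 1.7?" question, a small sketch of the arithmetic. It assumes the multipliers described in the A1111 wiki (1.1 per `()` pair in A1111, 1.05 per `{}` pair in NovelAI); those numbers come from that wiki, not from this chat, and the helper name is just for illustration.

```python
def nested_weight(depth: int, multiplier: float) -> float:
    """Effective attention weight after wrapping a tag in `depth` bracket pairs."""
    return multiplier ** depth


A1111_PAREN = 1.1  # assumed multiplier for each ( ) pair in A1111
NAI_BRACE = 1.05   # assumed multiplier for each { } pair in NovelAI

print(round(nested_weight(7, NAI_BRACE), 3))    # seven {} pairs
print(round(nested_weight(7, A1111_PAREN), 3))  # seven () pairs
```

So the weights multiply rather than add: under these assumptions seven `{}` come out around 1.41, not 1.7, and seven `()` would be close to 1.95, which is why a single explicit weight like `(lens flare:1.3)` is easier to reason about.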
Novel AI, probably wayyyyy back
oh, alr
I'm sorry for interfering. But what about this syntax - [blue hair:medium:1.25], for example
what does medium:1.25 mean?
general purpose with weights would usually require a value between 1.0-3.0 max on the incremental sides and you would rarely go below 0.1
That's prompt editing. https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Features#prompt-editing
In short, it allows you to change that part of the prompt over time, from something to something else.
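A rough sketch of how the `[from:to:when]` timing can be read, based on the wiki page linked above: a `when` below 1 is treated as a fraction of the total steps, anything else as a step number. The function names and the exact rounding are assumptions for illustration, not A1111's actual code.

```python
def switch_step(when: float, total_steps: int) -> int:
    """Step after which [from:to:when] swaps `from` for `to`."""
    if when < 1:
        # Fractions are read as a share of the total step count.
        return int(when * total_steps)
    # Otherwise the number is read as a step index (assumed to be truncated).
    return int(when)


def active_prompt(from_text: str, to_text: str, when: float, step: int, total_steps: int) -> str:
    """Which text is in effect at a given 1-based sampling step."""
    return from_text if step <= switch_step(when, total_steps) else to_text


for step in (1, 5, 6, 20):
    print(step, active_prompt("blue hair", "medium", 0.25, step, total_steps=20))
```

Under that reading, a value like 1.25 would switch almost immediately (after step 1), so it acts nothing like a weight of 1.25.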
When prompting, I generally don't recommend going above 1.5 in weighting. If you are, then it's likely that the problem is elsewhere.
alright.. now my generation is fkd up
What did you do 
changed all the {} to weight
lol..
Show me your prompt
Model issue
Yep. Or a network extension.
general idea is that (test) is actually a weight of 1.1 and there are cases of (((test))), just a general outline of how the weight works, but yes going above 1.5 is not necessary
the one up there i left with {} as i was a little double confused
Holy cow. Ok, get rid of all the {}. Make them disappear.
Not the parentheses. The brackets.
Not (), but {}
Anything that's over 1.4, bring it down to 1.4.
Or even down to 1.3.
like this?
{shimmer hair:1.4}
Change that to (shimmer hair:1.3)
And the same for the rest that are 1.4 and over.
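If doing that by hand gets tedious, here's a minimal sketch of the same cleanup in Python: swap weighted `{}` for `()`, drop bare braces, and cap explicit weights. The 1.3 ceiling follows the suggestion above; the function name and regexes are made up for illustration, and nested parentheses like `(hair (houseki no kuni):1.3)` are deliberately left untouched.

```python
import re

MAX_WEIGHT = 1.3  # ceiling suggested in the chat; adjust to taste


def fix_weights(prompt: str, max_weight: float = MAX_WEIGHT) -> str:
    """Convert NAI-style braces to parentheses and cap explicit weights."""
    # {tag:1.4} -> (tag:1.4): braces that carry an explicit weight.
    prompt = re.sub(r"\{([^{}:]+):(\d+(?:\.\d+)?)\}", r"(\1:\2)", prompt)
    # Any remaining bare braces are simply removed: {{{lens flare}}} -> lens flare.
    prompt = re.sub(r"[{}]", "", prompt)

    def clamp(match: re.Match) -> str:
        text, weight = match.group(1), float(match.group(2))
        return f"({text}:{min(weight, max_weight):g})"

    # Cap every simple (tag:weight) group at the ceiling.
    return re.sub(r"\(([^():]+):(\d+(?:\.\d+)?)\)", clamp, prompt)


print(fix_weights("{shimmer hair:1.4}, (messy spiky hair:2), {{{lens flare}}}"))
```

Weights already at or below the ceiling, such as `(masterpiece:1.2)` or `(skinny:0.5)`, pass through unchanged.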
I forgot Network Extension even a thing...
You're gonna make me cry
still weird
Copy the text of the prompt in here. Put three of ` before and after the text.
(best quality:1.1), (masterpiece:1.2), (highres:1.1), extremely detailed girl, solo, sharp focus,(flower shop:1.3), (cinematiclighting:1.2), (character:1.3), (1 girl:1.3), solo, (beautiful plain pink t shirt:1.3), pants, standing, light angry, closed mouth, beautiful detailed eyes, blue eyes, (sharp focus:1.3), (masterpiece illustration:1.3), (medium shot:1.3), shiny hair, short hair, blonde hair, (messy spiky hair:2), (shimmer hair:1.3), glowing hair, (iridescent silver hair:1.1),(hair (houseki no kuni):1.3), disdain, science fiction, (lens flare abuse:1.3), glowing light, bloom, black wear, holding deathly alien weapon, war ready, cyber city
So many weights
That actually doesn't look too bad, if still an eyesore.
I know 
i'll try to load back my old prompt, wait a sec
Wait, why the heck is the image so big lol
wdym
wdym
fyi the general idea of using weight is not to use them just for writing prompts but only where attention is required that might be missing in your render
no hires, img too wide etc?
upscaling
Alright, well I've loaded your workflow up in my Comfy. Lemme see what I can do.
now it works again, with those {{{}}}
Well, does comfy even accept brackets as an attention modifier?
Probably due to this reason @rich tartan
If it doesn't then it's not even changing the weights.
Tell ChatGPT to remove all brackets and paste it in ComfyUI?
Maybe removing all weights and start adding on what's needed for the image
I loaded up Aurora and am checking the gen now.
Yep. Horrors, lol.
Ah
This is probably at least part of the issue "(messy spiky hair:2)"
yh, you found out?
hm, lemme try again
Hold on, lol. Still fixing. There's something up with the flow
kk
Who wrote this prompt... it's making my eyes hurt
Ok now we're getting somewhere. Let me change a couple more things.
best quality, masterpiece, hires, extremely detailed girl, solo, sharp focus, flower shop, cinematic lighting, character, 1girl, solo, beautiful plain pink t-shirt, pants, standing, light angry, closed mouth, beautiful detailed eyes, blue eyes, sharp focus, masterpiece illustration, medium shot, shiny hair, short hair, blonde hair, messy spiky hair, shimmer hair, glowing hair, iridescent silver hair, disdain, science fiction, lens flare abuse, glowing light, bloom, black wear, holding deathly alien weapon, war ready, cyber city
yh works also
thanks, but i wonder what messed it up..
weighting can mess it up so hard?
btw taken from your prompt you could save this as a template for generic anime prompts ...
best quality, masterpiece, highres, extremely detailed, sharp focus, cinematic lighting, illustration
howd you guys read the metadata actually?
TIL Illustration is generic prompt
just put it into auto1111?
yes
i copied your paste and edited them in my editor to remove brackets
but you could also read metadata with PNG Info tab on a1111
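If you'd rather read it without opening a UI, here's a minimal standard-library sketch that pulls the text chunks out of a PNG. It assumes the usual conventions, which are not stated in this chat: A1111 writes its settings under a `parameters` key, and ComfyUI writes `prompt` and `workflow` keys. It only handles plain `tEXt` chunks, so metadata stored as `iTXt` or `zTXt` is skipped.

```python
import struct
import sys

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text(data: bytes) -> dict[str, str]:
    """Return the tEXt chunks of a PNG as a {keyword: text} dict."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, body, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            keyword, _, text = body.partition(b"\x00")
            chunks[keyword.decode("latin-1")] = text.decode("latin-1")
        if ctype == b"IEND":
            break
        pos += 12 + length  # the CRC is skipped, not verified
    return chunks


if __name__ == "__main__":
    # Usage: python read_meta.py image.png  (script and file names are hypothetical)
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            for key, value in read_png_text(f.read()).items():
                print(f"{key}: {value[:200]}")
```

The ComfyUI values are JSON, so they can be passed to `json.loads` if you want to dig through the node graph.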
yh, i thought it wouldn't be compatible with comfy actually
Ok. You also didn't have a VAE in. Which was a big problem.
Or... maybe it's built into your model.
it changed the design a little, it's more of a face photo now
built in
Every model has a VAE inside
can't even gen image without vae
no, some models need an external vae
good catch, im removing illustration from that list
what a splendor
Was wondering since model I use "Illustration" turns into 2D illustrate, but I guess those are difficult to find.
don't you see that fantastic smily face? XD
Something I noticed awhile ago
yeah its logical not to put that word into generic template
and i switch a lot between realistic and anime
ig, the design also improved by removing those {{}}, it listens more to the prompts
Alright. I'm using a different model, but the prompt results look much better now. Try it with your model and see if it works. Just open the image in a web browser, download it, then drag it into your comfy to adopt the workflow. I tried to keep as much of the original as I could, and I'd really like to concatenate your positive prompt like I did the negative one but...
stunning colors
Oh, and you'll have to reattach your model VAE lines to the original spots.
under mask
dam, this now looks like russian to me xD
Uh... not sure what you mean by that. DDIM is a very... outdated sampler method these days.
If you want a softer look, then try euler A or DPM 2 Ancestral
Just be sure to up the steps to at least 28 if you use Euler A.
I haven't used that sampler in like 2 years

Yeah 
it looks more detailed with ddim and simple
I disable it on my A1111
The eyes most times suffer
I'm not seeing the 'more detail' part, but it does look cleaner. Try Euler A. That'll likely be what you're looking for.
im actually a PLMS user, but comfy doesn't have it, or i couldn't find it
yh, i'll try that
Euler A with 28 steps on mine.
ig that's much worse
Oh, and scheduler set to None
Well, in the end, you should gen with what you're happy with and what looks best to you. If you like how DDIM works with your setup, then I'm certainly not going to tell you that's wrong.
But why is it outdated tho?
ddim on the upper and euler_a the bottom one
same gen time
Because more efficient and better resulting samplers have come out since then.
DPM 2M Karras is still the go-to for most.
DPM 2M or 3M SDE Exponential is another that some have moved to.
Restart, UniPC, etc.
They all have their quirks and ideal step counts.
And for some, they just like the 'look' of the sampler in their particular setup.
Also most people who upload models don't test DDIM etc. Results vary too, no? @untold glacier
Results can vary by so much
Model, steps, CFG, etc. etc.
uni_pc is also good if i remember correctly from auto1111
I remember liking Uni_pc_bh2 in Comfy when I used it a few months back.
i'll try, the best dpm i found was the adaptive one, takes very long tho
Whatever you do, just have fun doing it. Don't be pressured by others to change your workflow if you like the way things are going. Just try to keep an open mind and don't stop learning! That's the best thing about generative AI: it's always changing and evolving as new techs and solutions come out.
do what he says and you will end with a monstrosity like mine
and the prompt fixing, it really did a massive change to easier prompting
The (((((((()))))))) is really funny lol, I swear I saw it in Civit everywhere
Esp the NSFW gen ((((big backpain))))
i also love heun for the environments, but the characters looks broken
like u prompt a glass full of ((((((((water)))))))))) and you will get the sea
LOL
hmmm wait
is it possible to generate with heun the environment and put the character that is generated with another sampler into it?
But would need a transparent background ig
Not easily. You'd have to automatically inpaint it with a Detailer.
can adetailer fix the face for heun?
i remember there was this detector that could distinguish people and objects
Essentially you'd gen the image first, character and all, and then it comes back in with a Detailer inpainting the character with different settings.
ComfyUI prob gonna convince me to do inpaint and mess with things like these in future..
using it and ip adapter, maybe
I swear I just go TXT2IMG and never look back
yeah, uni pc is actually really good
ig that's gonna break my brain at the end
Be breakable. Mold your mind to...
the p o s s i b i l i t i e s
Ooga uunga
But to do any of the detailer stuff you'll need the impact pack https://github.com/ltdrdata/ComfyUI-Impact-Pack
My dirty mind seeing "SEGS" nodes..
those nodes gonna extend out of my screen into my nightmare, bet
if i'm not wrong, segs was that detector i was talking about
charming
transparent/white background i would wish, but seems like nah
auto inpaint gonna be a mountain to crack with this
you could use a very rare color for bg and then delete that color with another app, to get a transparent bg
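A tiny sketch of that "delete the rare color" idea, working on plain RGB tuples so it needs no image library. The magenta key color, the tolerance value and the function name are all assumptions for illustration; a real app would also have to deal with anti-aliased edges around the character.

```python
def key_out(pixels, key, tolerance=0):
    """Turn RGB pixels close to `key` into fully transparent RGBA pixels."""
    out = []
    for r, g, b in pixels:
        # A pixel counts as background if every channel is within the tolerance.
        close = all(abs(c - k) <= tolerance for c, k in zip((r, g, b), key))
        out.append((r, g, b, 0 if close else 255))
    return out


MAGENTA = (255, 0, 255)  # assumed "rare" background color
row = [(255, 0, 255), (250, 4, 251), (30, 120, 200)]
print(key_out(row, MAGENTA, tolerance=8))
```

The first two pixels end up with alpha 0 and the third keeps alpha 255, which is the per-pixel rule a background remover applies.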
i may want to train and use some loras first, never really used some. Any recommendations and what loras take effect on?
i personally use digitalpainting, epicrealismhelper, hyperdetailer
all your gen are really gorgeous
your style is very peculiar and unique
still not perfect, but very cool
perfect to what
some plastic effect is still stuck on the image, but the realistic, vintage style is very cool
bitcoin spam too.....
yup...
I would reduce the JPFilmGrain strength a little.
no, me like grain
I definitely get FFVII:AC original dvd release end credits vibes from this.
With "Calling" by Kyosuke Himuro
Before they had to put Gerard Way in there and remove that whole section for the Bluray release. 
What could ever go wrong if you add ((((cursed)))) 
hmm, if i give more to the background lora, the character suffers, if i give more to the character, the background suffers.. What do you guys do in such situations? Any ideas?
wow
that's just a question of model
i knew neko-girl could fly in space
i guess yes, without the lora that supports the same style as the model, the character breaks or doesn't even get drawn.. Can't really balance them without something breaking or getting left out
some models are just very good with waifu, but inconsistent with bg, others are good with bg, but fail at generating excellent waifus
to maintain the form of the character, maybe using controlnet may help
character is now more detailed vs background is now more detailed
hm, that sounds good, is controlNet available for comfy?
yep
alr thank you
Perfection
Finally managed to get some good emotions
@wispy canopy a quick nudge since you are working on a new merge and if you want to consider turbo ... https://civitai.com/models/215418
Buttons.. must press them 
damn, how tf did you notice that lol
had to check it 5 times to see it
Haha it's very common for AI to place buttons and like armor stuff over there
@nova remnant knows all the secrets of Ai
hey guys merry christmas fools
having to download multiple XL models is becoming an issue for me, especially when each XL models can cover all of the styles
and trying to restrict a XL model to do one specific style is a waste of checkpoint file itself
only kind of difference you'd see with XL models are those that are well trained and those that aren't
I don't think that's an XL thing, 1.5 has the same thing. Most models are fairly generalist and require loras to push a specific style or use. Any heavy leaning one way or the other is specifically from the training and merges in it and has nothing to do with the architecture. Realistically the only difference between SD 1.5, SD 2, and SDXL are the base resolutions used for training the original model. Which can also be changed by training
The only reason SDXL has "more detail" or "higher quality" is because it has 4x the pixel information in its training, but that doesn't always mean it's better or actually good quality
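For reference, the "4x" figure is just the pixel count ratio, assuming the commonly cited base training resolutions of 512x512 for SD 1.5 and 1024x1024 for SDXL (those numbers are not stated in this chat):

```python
SD15_BASE = (512, 512)     # assumed SD 1.5 base resolution
SDXL_BASE = (1024, 1024)   # assumed SDXL base resolution


def pixel_count(size: tuple[int, int]) -> int:
    """Number of pixels in a width x height image."""
    width, height = size
    return width * height


ratio = pixel_count(SDXL_BASE) / pixel_count(SD15_BASE)
print(ratio)  # 4.0
```

Doubling both sides quadruples the pixels, which is also why VRAM use and generation time climb so quickly at the larger size.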
Realistically, most people won't have any real benefit using SDXL over 1.5, especially if they aren't taking advantage of the larger resolution, because that's the only real advantage it has. On top of that, SDXL is really slow in comparison on lower end hardware. 1.5 is also vastly more mature as far as availability of model types, loras, embeddings, etc, and can be trained on normal low end consumer hardware. Training something for SDXL takes much more hardware and time, and it's more finicky because of the extra pixel data so it's hard to make training even work properly
Not to mention 1.5 models are also 2.5-4.1GB usually, my smallest SDXL model is 6.7GB
I mean, XL also has both the same CLIP model from 1.5 as well as OpenCLIP which is a larger text encoder. The UNET shape is also different I think, but I don't know very much about that.
Well, there are multiple technical differences, but as far as practical use it doesn't matter as much. The text encoder changes how prompting works, but also people are training and using the models with the same tag mess from 1.5 so it doesn't really matter on heavily trained models
If you were comparing just base 1.5 with base XL, there's massively obvious differences in how they function with prompts and outputs
With sdxl I tend to prompt more "real english", but throw in some tags when i need to push a weight on something as it seems to work best that way. 1.5 was just word salad
That's because CLIP is pretty shit text encoder that we've just gotten decent at bending to our will when combined with the grossly overfitted models that we use

OpenCLIP is quite a bit better in that respect.
But, tbh the SDXL anime models that I've used are really not great yet. They have little existing character knowledge, and are largely just as overtrained if not more-so than the 1.5 models that we're used to.
I think that mostly has to do with the novelAI thing, we don't have a leaked model to merge into everything with XL
unless someone puts the effort and time/money into training a solid base anime model, it can only get so far
but anime specifically also has the benefit of not really needing XL to begin with, it doesn't have as much detail as a realistic image so it can just be produced at a smaller resolution and upscaled
upscaling a photo is a lot more difficult and benefits a lot more from denser pixel data from the start
my problem so far with most of the XL anime models is that the CLIP is destroyed with all the random merges people are doing with no actual training on anything, so they have horrible prompt coherence and disfigured bodies all the time
I'm still trying to figure out why models like bluepencil have so many merges, it's almost like they think merging actually adds all the data into the existing model so now it contains everything they merged
yes, SDXL based model, most of the prompt is in the filename - must be generated since the text doesn't make any sense 🙂
lol
yeah, this is with a fully NSFW capable model and it tends to substitute buttons for... other things 🙂
having a better prompt understanding (like SDXL) would benefit anime as well, although people have become used to booru tags there anyway
it would benefit us if it could create the anime characters we are typing
maybe, but most of the people can't even properly run sdxl <.<
and even fewer can train it
maybe at a business lvl, would make sense
no chance, sdxl can't do it
this is a grid lock
it can do it, it might just not be trained in a current model
sdxl has certain advantage over realistic look but not w/o block like render
gib money ,no longer a gridlock
right
this is how great ideas fall
$$$$
also as you already know sd1.5 and sdxl are different in their versatility
by design sd1.5 can be trained for specific tasks that are focused on that task itself but sdxl by design has a much wider scope of doing things across different styles, and i dont get why we have to download multiple xl models when one can do all those tasks given it's properly trained
total technological mess up
who said you had to download multiple models?
sdxl can be trained for specific things too, it's not generalist because it has to be, it's being trained that way
u have to if u want to change the style
you can do all the styles with one xl model
ok, but you have to change 1.5 models to change the style too, or use loras
its how xl is designed in its core
yea i can just use a 1.5 style lora on a single 1.5 model
if xl 'can do everything' then why are you getting multiple models is the question
thats cause sd1.5 has certain limitations
devs will tell you if you want decent anime its 1.5
my roommate has been playing with the realcartoon 1.5 model and it's so far been more flexible than most of the XL models i've tested
he doesn't do anime though
I don't care what people tell you, I've found multiple anime models that will do what I want
but what I want and what you want is not the same thing
fox girl with white hair wearing a long dress dancing
exactly what i asked for, it does the job 🤷♂️
I see a lot of people complain about this sucks or that sucks, but no one with actual comparisons, just vague "I can't get what I want"
cant really compare something that doesnt exist
but, like I said earlier, if 1.5 does what you want and you can't get that from XL, then don't use XL, it's a really simple solution
especially since XL is horrible for performance, even on high end hardware
tbh, I really want to see an example of "XL can't do this" that was made in 1.5 just to see what magically impossible task is being asked of XL. I'd be willing to bet it's a limitation of the available trained models/loras, or at least the few tested, and not a limitation of the architecture
You'd be correct. The architecture is better in every way on paper. But having a better architecture is utterly meaningless unless the models actually leverage it. Take SD2.x for example. On paper the architecture is higher resolution with higher parameter count than SD1.x. In reality the text encoder was absolute garbage and the training left a lot to be desired so it consistently was outperformed by SD1.x models until it was basically forgotten.
emad is busy with new business strategy to profit, dont think we can expect much from sdxl
The fact is that SDXL is more difficult to train both due to hardware requirements and the hyperparameters being harder to dial in (and likely captioning being far more important) than SD1.5 is which will slow progress on finetunes significantly.
sdxl is now into 6 mo. since its release
And SD1.x models didn't really start to get good until around 6 months after the initial release even with the huge boon of the NAI leak helping it along.
Were there only a few models available at the time? Or why was nai model such "hit"?
nai was heavily trained by professionals on a massive high quality dataset and was not intended to be freely available
but once it was, it got merged into everything else vastly improving the models
It kickstarted the whole anime model lol
Without the leak I doubt AI would be as popular
at the very least it wouldn't have produced such high quality so quickly
It started with anything v3 iirc
yes without nai 1.5 anime models would suck like 2.1
Interesting, i also started to get know the ai with nai, as i thought first this would be Standard sd
and with recent nai sdxl 3.0 looks like nai are the ones who know how to train models properly
and they also have the money so 💸
Yeah, nai is the best anime model now
that's the ultimate problem, those guys do that as a job and have access to the hardware to make it happen, no one is going to dedicate the time and money to training that kind of model for the good of the free public
i think waifudiffusion tried to do that but the result is not the same
(they dont have money that nai has)
on the other hand, even if I wanted to spend the time and money to make some kind of amazing anime model, i'd have to know what's wrong with the current ones but no one can seem to explain it
With how many est. pics do they train the model?
Probably million lol
on nai 1.5 i think it was 5million pics
Novel AI is a company with huge amounts of money who were able to leverage a massive amount of compute along with a ton of GPU hours to make a model that was very good at what they wanted, then someone burned a zero day exploit to gain access to the private repository and leaked the model. Since then, virtually every decent SD1.x model that's been released has contained some amount of the leaked NAI model.
estimated to be at least the entirety of the danbooru database, likely with more added from other sources.
Dam, that's actually a heavy number, i thought about 100k or something
It's still scary how good nai v3 is
It's been only a year since the first ever nai 1.5 right
NAI leak was Oct of 2022 and it had been around for at least a couple months on their site before then.
From anime girl eating ramen the weird way to copying over 7k artist styles
look i got the hand right ... (less than five fingers, more than five fingers, missing fingers, extra fingers)
waving prompt hack
i actually had waving in prompt
no negatives
anime girl with long brown hair wearing a christmas sweater waving, standing in the snow at night, street lamps, from side,
is that xl model?
btw bluepencil v2.9.0 coloring and brightness are too harsh compared to v2.0
yes it is
and with bluepencil v2.0
xl models literally always have such clean colors ig, also looks much more realistic
and btw, did you guys try out to use two vaes at once?
I wonder what happens if i mix a vae into latent image and higres with that
Interesting, mixing heun into ddim gives some more accessories without messing the character up
I'm not sure what you mean, VAE is either converting latents to pixels or pixels to latents.
oh yeah right, i was in the mind mixing with some preset image or sum
instead of the pre-gen one
finally, heun also gives background with good character instead of void or messed up face/focus..
uh, what is the difference between approx and none?
sounds cool, but if you look at the previous version page it's just a merge of albedo, juggernaut and mergeheaven, and it can't even do 'anime girl' properly
very suggestive
shush 🤫
cute
chitanda rules
they remind me so much of the original
what are you missing in SDXL? for my taste, that's anime-ish enough (just "a cute anime cyber ninja, cherry blossoms zen" with SDXL anime style)
and for old school, just add "((flat 2d cel shaded studio ghibli style)) a cute anime cyber ninja, cherry blossoms zen"
me? nothing, that's what I was wondering, people keep complaining about sdxl being bad at anime or "not as good" or whatever, but I think it's fine
right. it does everything anime I need...
the only thing the SDXL models i looked at fail with is kinky stuff (no novelAI leak for SDXL 🙂 )
yeah, I've been trying to get a good nsfw merge figured out since that's the only real failing
well, it does more adult stuff nicely (this is still SFW i think, just for illustration)
up to full and detailed nudity, anime and real, but i'm not posting that here 😉
"(watercolor) ((flat 2d cel shaded studio ghibli style)) a cute anime cyber ninja, sitting on a bench in a beautiful flower garden"
all just SDXL, nothing special, no LoRA
less flat
still more 3d
damn, why is kohya_ss such a pain in the a**? somehow i can't get it working, it tells me cuda dependencies not found or sth
dreambooth ooms me instantly, can't run it
What GPU do you use?
rtx 4070
on a desktop? ..any other gpu's (internal maybe that is getting used)
yh ubuntu, i don't have any other gpu in my pc(also no igpu)
ah ..linux, don't think kohya_ss works properly on linux, try going over all the steps again, else ask in #tech-support
https://github.com/bmaltais/kohya_ss?tab=readme-ov-file#linux-and-macos
may or may not be legal in some states
may or may not be legal in some states for a xmas gift
thanks, i did, but im currently trying to compile a package myself, that's somehow not working
and it says 6gb vram users can run this thing
ooooooooooooooooooooooooh
my maximum bucket res was at 2048
meanwhile it should be 512
alr
i even crash with oom at 1024
weird
(just tested)
ask in #🤝|tech-support i would say
oh wait, i had batch size at 10
are those settings good for lora?
1 epoch, 100steps pic
for training I prefer this over the regular sd-scripts repo. gives a nice gui wrapper that you can use but also installs sd-scripts which you can run normally if you ever want/need to for any reason. https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
i dont know any anime that looks like this
thats way too slow for a 3070
people are split into two views: anything-goes sketch anime and actual perfection of anime.
doesnt look like a sketch either,looks like a cg render
the checkpoint that looks like a good anime style is the neko one
are u using linux?
i have broken out of convention with canvas size
sd1.5 with 640x800 and sdxl 816x1024
those are also 4:5 ratio
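Quick way to check those ratios, just a sketch using Python's standard `fractions` module. The helper name `aspect_ratio` is made up for illustration. It shows 640x800 is exactly 4:5, while 816x1024 reduces to 51:64, which is very close to 4:5 but not exact.

```python
from fractions import Fraction


def aspect_ratio(width: int, height: int) -> Fraction:
    """Return width/height reduced to lowest terms (hypothetical helper)."""
    return Fraction(width, height)


if __name__ == "__main__":
    for w, h in [(640, 800), (816, 1024)]:
        r = aspect_ratio(w, h)
        # Show both the reduced ratio and its decimal value for comparison with 0.8
        print(f"{w}x{h} -> {r.numerator}:{r.denominator} ({float(r):.4f})")
```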
a 3060 gets 4it per second on XL at 1024x1024 so a 3070 should get like 5 or 6
the 1.5 looks normal
7 to 8 it/s
oh yea somethings wrong then
how much ram do u have?
then idk maybe try with another nvidia driver version
no,older versions
536.23
yea try that before downgrading drivers
also sometimes slow speed is due to wrong webui installation
yea still needs at least 4its,maybe u are not using your ram in dual channel?
u could try to downgrade the driver to test if it works, but it's an annoying process: i first had to uninstall the current driver with DDU, restart, install the new one, restart again, then delete the venv folder and let it regenerate, then good 2 go
u could try with driver version 536.99 or with 531.61
when i downgraded to both versions i had to do the same process of DDU and several restarts so it takes a while
@supple raptor whats your it/s on a 1024x1024 img on sd 1.5 with euler a sampler 20 steps?
Let me start up sd real quick and check. I know I gen at batch size 2 1920x1088 at ~2s/it
3.42 it/s
That's in A1111 with a 3080 12gb so Comfy should be a little faster. ^
yea idk why he gets 1 it on a 3070 with 1.5
Could be running out of vram if HW acceleration isn't turned off. that gen used 7.8GB on my system.
Using the wrong Commandline Args could also cause slowdowns.
yea or maybe hes using wrong python version, needs more info 😔
i am on a 3060 with python 3.11.7 and
onnxruntime_gpu-1.16.3-cp311-cp311
let me do a test in comfy real quick
got prompt
model_type EPS
adm 2816
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
missing {'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_l.text_projection'}
left over keys: dict_keys(['denoiser.sigmas'])
Requested to load SDXLClipModel
Loading 1 new model
Requested to load SDXL
Loading 1 new model
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:17<00:00, 1.12it/s]
Requested to load AutoencoderKL
Loading 1 new model
Prompt executed in 25.05 seconds
Since he said he just got a new card, my recommendations would be
- Run DDU in safemode to completely remove all traces of graphics drivers, then reboot and reinstall the latest version.
- Check python version. 3.10.6+ is recommended, though 3.11.x should also work now.
- Delete Venv and rebuild. Latest torch is 2.1.2+cu121 and the compatible xformers version is 0.0.23.post1
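For the version checks in that list, something like this can be run with the venv's own Python. It's a minimal sketch that only uses the standard library (`sys`, `importlib.metadata`); the function names `installed_version` and `environment_report` are made up here, and it only reports what is installed, it doesn't judge whether the versions are compatible.

```python
import sys
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report() -> dict:
    """Collect the interpreter version plus torch/xformers versions, if present."""
    return {
        "python": ".".join(str(part) for part in sys.version_info[:3]),
        "torch": installed_version("torch"),
        "xformers": installed_version("xformers"),
    }


if __name__ == "__main__":
    for name, version in environment_report().items():
        print(f"{name}: {version or 'not installed'}")
```

Compare what it prints against the versions mentioned above before deciding whether to delete and rebuild the venv.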
so 1 it with sdxl 😔
yea for sdxl 32gb ram its better
4 minutes to load a SDXL model into RAM 😄
think genning at 512 and then multipass upscale is faster than genning at 1024
yea its faster to 512 then highres fix to 1024
Harley Quinn, a villain, anti-hero, unique brand of chaos, unpredictable nature, fascinating character.
its where u have webui installed
It's a folder inside either your WebUI or Comfy folder.
ComfyUI Windows Portable does not generate a Venv, in case he is using that
at least that is the one from the official github: https://github.com/comfyanonymous/ComfyUI/releases
very very limited as i can tell
Good to know, I avoid all of the portable versions. I'd rather have control over what versions of everything it's using.
Just clone the repo to the folder you want it to be in. If you already have git installed then the command is just git clone https://github.com/comfyanonymous/ComfyUI.git from the terminal. It will put the comfy folder in the current path that the terminal is in, so make sure to cd to wherever you want it to be downloaded to first.
here's the manual install instructions https://github.com/comfyanonymous/ComfyUI?tab=readme-ov-file#manual-install-windows-linux
What commandline args do you use?
Yeah I'd just do a manual install so you can make sure the correct versions of everything are installed.
u could also try to download a1111 version 1.6 and install it to see if u get the same speed there
just download the zip from here https://github.com/AUTOMATIC1111/stable-diffusion-webui/releases
after that just open webui-user.bat and it will install everything
now try with 512x512
yea that looks better
is that on auto111 or comfy?
what arguments are u using on auto111 ?
yea that looks good,so what did u do? driver downgrade or just reinstall comfy and auto?
yea u prob installed something wrong thats why it was slow 
sorry idk about comfy workflows in a1111
it doesnt have workflows so u dont need to do the same thing after u reopen it
oh yea just click the blue arrow under the orange generate button
when u open the webui again it uses the prompt u were using before u closed the webui
if u use it while u have it open it will just reset the prompt again,it only works when u just opened the webui and havent typed anything
Better yet, go through and set everything to the default values that you want then go to settings and in the defaults section you can set the current values to your defaults so that everything will be as you normally want it every time you load the WebUI.
you just have to clear out the positive prompt to make it work if webui is already running. You can also drag an image from the outputs folder into the prompt field and then click the blue arrow to apply the full workflow.
oh i didnt know about the drag image part
🙇♂️
In this moment, the girl became a living work of art, her nails an expression of her personality and a testament to the beauty that could be found in the details. The stable diffusion of admiration for her nails echoed in the room, creating an atmosphere of awe and appreciation for the artistry on display.
you can set some specific settings in ui-config.json under your webui folder so that you don't have to repeat setting them every time, you can also set prompt-related workflows in your styles.csv file under the webui folder
model loading mostly depends on SSD speed (hoping you have no spinning disk drive anymore 😉 ) - e.g. a 2GB SD 1.5 model loads in 4-5 seconds with a good m.2 SSD
and 6GB SDXL in much less than 10s
and 30 steps dpm++2mk with 1024x1024 SDXL takes around 5 seconds on a 4090 (no turbo) - forget about the it/s, it's the actual generation time that counts
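To put it/s and wall-clock time side by side, here's a small sketch of the arithmetic. The function names are made up, and it only covers the sampling loop: in the comfy log earlier, 20 steps at about 1.12 it/s is roughly 18 s of sampling, while the whole prompt took 25.05 s once model loading and VAE decode are included.

```python
def sampling_seconds(steps: int, its_per_second: float) -> float:
    """Estimate time spent in the sampling loop only (no model load, no VAE decode)."""
    if its_per_second <= 0:
        raise ValueError("its_per_second must be positive")
    return steps / its_per_second


def implied_its(steps: int, seconds: float) -> float:
    """Back out an average it/s figure from a step count and a measured time."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return steps / seconds


if __name__ == "__main__":
    # 20 steps at the 1.12 it/s shown in the log above
    print(f"{sampling_seconds(20, 1.12):.1f} s of sampling")
    # 30 steps in about 5 seconds, as quoted for the 4090
    print(f"{implied_its(30, 5):.1f} it/s average")
```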
The ui-config.json is no longer required. Just set them in the UI then change the current parameters to defaults in settings.
oh thats sweet, thanks for the update.
keep in mind that the defaults function in the settings updates ui-config.json
yes thats where it stores the values w/o needing to manually open that file in editor
nubby said it in a way that meant ui-config.json didn't need to exist anymore
I meant editing the ui-config.json is not necessary.
yesh
Double sampling, heun for environment, uni_pc for the character
And it really fixes the anatomy of heun while keeping the background/environment
you lose a few effects patching heun, but it's still better than no/bad background
at 0.5 denoise btw
interesting
Maybe training another lora will fix the background's anatomy of the char that's still having some issues. Any methods you guys use to get perfect anatomy?
I dunno, but i may try out
yes depends on model
that model was simply too cursed lol
hmm, yeah.. Did you guys manage somehow to fix/help those models?
you could try to fix them if u know how to play with supermerger and weights
especially because, with time, i noticed, the models are improving
supermerger? i'll look into that, thank you
you would also need the recipe of the model u are tryin to fix
its easier just to download a new model,unless the one u tryin to fix has certain style that hasnt been replicated yet
that's my issue
hurts inside? why?
I think no. unless it's Gosick
what anime is this?
I want a box like this with the same gift!
same
Grave of the Fireflies (Japanese: 火垂るの墓, Hepburn: Hotaru no Haka) is an animated war drama film written and directed by Isao Takahata, and produced by Studio Ghibli. It is based on the 1967 semi-autobiographical short story of the same name by Akiyuki Nosaka.
The film stars Tsutomu Tatsumi, Ayano Shiraishi, Yoshiko Shinohara and Akemi Yamaguchi....
ah this is!
sadly my ssd's are too smol so got all my Ai gen stuff on HDD :/
Thats.. slow..
Dang, this one is new for me
2x12 resolution at 104 steps 😛
yh xD
im currently researching new ways and dumping many methods to find the best settings for better anatomy, it worked in my other workflow, but suddenly in this workflow it's messed up
But im really surprised that it kept a good anatomy
I wish I could send the stuff that I'm generating here 
ah.. scheduler exponential makes it pixelated
in my other workflow much less, but still pixelated compared to others
but this one has the fewest mutations
tried with karras and so on, all made mutations
Gonna try to come up with a fix
the left one is the double sampled one
looks like a defective controlnet model
throw a 1-2 blur and one more ksampler 8 steps with a low denoise 0.32 works to fix edge artifacts or pixelations, before doing a ydetailer hand and face fix
yeah trying to do, but more mutations come up with a 3rd sampler.. Putting the denoise of the 2nd to 0.7 fixes the pixelation, but mutations come with it
hmm..can you throw me the prompt .. lets see what my wf does with it
i need it at 0.4, my results showed me that it fixes mutations very well compared to other schedulers
yh sure, but i need to make one without the loras
wait a sec
here, just move the pic into your workspace and it should load in a few secs
The thing is, the upscale method makes a big diff, nearest-exact is more pixelated and so is the result
But on the other hand, nearest-exact gives the 2nd sampler more to work with
let me save my own WF 😄
have you tried 4x Ultra sharp?
i didn't but can it fix the sampler?
currently the one im using
never got clean results with RESRGAN
https://openmodeldb.info/models/4x-UltraSharp
I think it's this one. Just put it on esrgan folder.
thanks i'll try out
but this one also fixed a lot of pixelations
never tried using canny and depth controlnets together, but from what I see depth can improve details quite a lot. before, the trigger on the guns was impossible to render and grenades ended up as part of the clothes. Anyway merry Christmas to everyone
works, but youre not upscaling and improving mid
exponential dks up when using a "broken" image
interesting, the way you input the latent, the way it also comes out
But this defeats the reason i wanted to use it: it fixes mutations better than the other schedulers
Both refused to sit in the box!
That's why it did better.. i gave it a latent that wasn't broken
alr at least something i researched
i was dying to use this
Normal >karras > exponential > sgm_uniform > simple > ddim_uniform
yeah, i think lora will need to take hand here for better anatomy
One thing i could improve from my dumping: setting my denoise 0.1 lower on the second sampler fixes more anatomy
(left one 0.4 right one 0.5)
hm.. 0.7 denoise gave even better
u controlnet/mask the face?
ydetailer messed up 😄
oh, never used any detailer
left one ddim_uniform, right one exponential both at 0.7 denoise, quite some good differences, and at 0.7 it's not much pixelated anymore
do you like the background @native halo
you like pvc style?
i mean, i love it so much, don't even bother using something diff currently
pvc style lmao
you can get good results tho
But this one is the most "PVCish" style i got to

lcm?
i love that massive hat
i rate 40/5
aye
deepbonk 
@nova remnant you must be hungry for a lil love
that thing you did was a unique bonk stuff
lmao
my fav girl from my fav anime
Close up of Suu from Monster Musume 😄
@supple raptor I think implementing buzz system is inspiring user engagement and activities on civitai
needing it is not a bad thing btw, maybe a lil discretion for the public eye :p
and if that need ever stops human race will go extinct no matter how you put the argument
unless otherwise age reversal is a thing and death becomes avoidable
are those with low sampling steps or lcm?
or are you using any LoRA style to add grain?
i see, what weight are you using?
i can picture that third image to look fantastic if it were bit crispier
thats bit too much btw, .. try with maybe 0.3
can you please share that metadata?
ahh ok
ok saved, booting up sd now
👀 1970s \(style\), 1girl, above clouds, aerial fireworks, american flag, astronaut, aurora, balcony, blue hair, blue legwear, campfire, cape, card \(medium\), christmas, christmas lights, christmas tree, city, city lights, cityscape, cloud, constellation, constellation print, crescent moon, desert, dress, dusk, dust, earth \(planet\), embers, fireflies, fireworks, full moon, galaxy, garter straps, globe, gradient sky, halftone, halftone background, hat, hirschgeweih antennas, horizon, lamppost, light, light particles, lighthouse, looking at viewer, milky way, moon, moonlight, night, night sky, outdoors, pine tree, planet, print headwear, purple sky, reindeer, rocket, shooting star, shore, short hair, sky, skyline, skyscraper, smile, snow, snowing, solo, space, space craft, space helmet, spacesuit, sparkler, star \(sky\), star \(symbol\), starry background, starry sky, starry sky print, summer festival, sunrise, tanabata, tanzaku, telescope, thighhighs, twilight, ufo, ultimate madoka, window, winter, witch hat
thats a mini dictionary btw
its hard to imagine what that would generate exactly
booru tag guesses words randomly
often gets carried away from the main focus lol
but if u use a 1.5 anime model u want to use the booru tags
sure thing, but adding such diverse words doesn't show an actual image concept
yep, thats cause it picks up every random thing
try with a 1600 words prompt 🤠
booru tag style is cool if you can give it a focus with words but the booru tag extension we have kinda sucks
steven i love using booru style lol, it makes things lot easier to convey ideas but i keep them focused on the idea
one of the prompt ideas i play with is ask google bard something along the line of ...
write a creative prompt based on the following words: girl, flowers, garden, stunning visual, sunset
that's just one example of the things you can play with your ideas
can't do that, cause the booru tag system on a1111 isn't too meaningful other than throwing every possible words of an image, but here is my logical interrogation of that image... 1girl, blue hair, witch hat, short dress, cape, stockings, scepter in hand, starry sky
garter belt
A lot of prompting is also needlessly repetitive and convoluted. Most models don't need similar concepts to be repeated, whether it's 1.5 or XL. For example, instead of "christmas, christmas lights, christmas tree" in the above prompt, something like "christmas decorations" covers all of that and will do just as well. It also covers a number of other concepts, like winter, pine tree, etc. Sometimes you need to add an extra concept to specifically push the model in a certain direction, but generally it's best to keep prompts as simple as possible. By default the 1.5 model only accepts 75 tokens to begin with, and the UIs are just using tricks to break prompts into chunks and process them differently, which just waters down the terms used, making most of them ineffective
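Rough sketch of the chunking idea, to make the 75-token point concrete. This is not how A1111 or Comfy actually implement it: `rough_tokens` is a made-up stand-in that just splits on whitespace and commas, whereas real UIs count tokens with the CLIP tokenizer, so real counts will differ. The point is only that a long prompt spills into extra chunks and a shorter phrasing doesn't.

```python
def rough_tokens(prompt: str) -> list:
    """Crude stand-in for a tokenizer: words and commas each count as one token."""
    return prompt.replace(",", " , ").split()


def chunk_tokens(tokens: list, chunk_size: int = 75) -> list:
    """Split a token list into consecutive chunks of at most chunk_size tokens."""
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


if __name__ == "__main__":
    repetitive = "christmas, christmas lights, christmas tree"
    compact = "christmas decorations"
    print(len(rough_tokens(repetitive)), "rough tokens vs", len(rough_tokens(compact)))

    # A 160-token prompt needs three chunks under a 75-token limit
    print([len(chunk) for chunk in chunk_tokens(list(range(160)))])
```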
not a fault lol, its not perfected
u also need negatives like this
faints
yeah, 90% of that won't even do anything, you're better off getting a good negative embedding or 2
thats just adding in words from template w/o having any meaningful ideas
its funny tho 🙂
but guess what, studying those and questioning those can help learn prompting lol
I saw a post on reddit that was something like "the perfect negative prompt" and it was 12 pages of text
lowres, polar lowres, bad anatomy, bad face, bad hands, bad body, bad feet, bad proportions, bad leg, more legs, worst quality, low quality, normal quality, gross proportions, blurry, poorly drawn, text,error, missing fingers, embedding:bad-hands-5, embedding:ng_deepnegative_v1_75t, all i need 😄
oh yeah those are more focused on what you are trying to fix with anatomy and general image quality
just do 5-12 passes and all the ugliness gets fixed with it 😄
one thing i also learned is that using a lot of textual inversions can affect the model you are using
It's better to start with a blank negative prompt and only add things you actually need it to remove from the generations it's making. if you need things like 'blurry' or 'lowres' (especially when already using negative embeddings) it's a garbage model and you should just get a new one
i think @supple raptor mentioned that in discussion and i saw that in my example over time
yeah, textual inversions are fairly similar to loras as far as what they can affect, though it affects a different part of the process, but they can be just as destructive (or helpful)
you know that's one of the thing SDXL tried to reduce
@nova remnant if i pictured that image and tried to use that long prompt list to get it, it would be like playing the lottery
what was the point tho, randomness?
cause if you were trying to generate random images i think that's fair to add lots of keywords
you would have a certain pattern to it tho 🙂
but honestly the more i think on it, the more i feel like creating a reality of a new kind w/o any of the current preconceived patterns lol
we have reached the end of 5th dimension
i mean 4th
i start generating random.. and then i lock the seed and tweak the prompt or add a 2nd or 3rd pass
its cool 🙂
randomness is just vastly something beyond our logical inference but also come from patterns of possibilities
some of the contradictory words in that prompt were....
above clouds, astronaut, aurora, balcony, campfire, city, city lights, cityscape, cloud, constellation, constellation print,desert .. etc
i feel like a god to be prompting the entire universe ya know?
trimmed 1girl, aerial fireworks, blue hair, blue legwear, campfire, cape, crescent moon, desert, garter straps, looking at viewer, shooting star, short hair, smile, snow, thighhighs, twilight, witch hat
@nova remnant btw i didnt mean to discourage you, was sharing some ideas with you, and that prompt of yours gave me some fun ideas too that im generating now
some technical bits, for latent upscaling the denoise value is best used 0.5 and up, for esrgan based upscaling 0.5 and below
when using hires fix .. its ideal to use 1/3rd the sampling steps, so if your sampling steps are say 30 then you'd set hires steps to around 10
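Those two rules of thumb written out as a sketch. The function names and the "latent" / "esrgan" labels are made up for illustration, and the numbers are just the ones quoted above (about a third of the sampling steps, denoise 0.5 and up for latent, 0.5 and below for esrgan), not anything a UI enforces.

```python
def hires_steps(sampling_steps: int) -> int:
    """About one third of the base sampling steps, never less than 1."""
    return max(1, round(sampling_steps / 3))


def suggested_denoise_range(upscaler: str) -> tuple:
    """Return (low, high) denoise bounds for the given upscaler family."""
    if upscaler == "latent":
        return (0.5, 1.0)  # latent upscaling: 0.5 and up
    if upscaler == "esrgan":
        return (0.0, 0.5)  # esrgan-based upscaling: 0.5 and below
    raise ValueError(f"unknown upscaler family: {upscaler!r}")


if __name__ == "__main__":
    print(hires_steps(30))  # 10
    print(suggested_denoise_range("latent"))
    print(suggested_denoise_range("esrgan"))
```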
deepbooru is pretty cursed. Use WD-Tagger instead. If you're using A1111, here's the extension https://github.com/picobyte/stable-diffusion-webui-wd14-tagger.git

I think Swin V2 tends to be the most accurate
Yeah, it should try to tag everything in the image since all of it is important for training which was the intended usecase.
no cause that thing is a mess
getting a strange warning when launching webui
i wonder
this is the entire line of warning message im getting
could that be related to recent updates on tag autocomplete?
I get this one too. Tag Autocomplete: Could not locate model-keyword extension, Lora trigger word completion will be limited to those added through the extra networks menu.
Auto complete still works tho
ahh ok, works here too but why would that warning show up
no idea
as long as it works
have been getting it for a while now
probably more than a month alrd
there are some new files released i saw in git pull
because u need the model-keyword extension
whats that?
something i need to install or can skip?
u can skip it but if u want it here it is https://github.com/mix1009/model-keyword