#✨|sdxl
as have I, so I totally get what you mean
@spark bear some moar
just messing with other races and such
the blue is a little strong, but its a pretty small dataset and I have to reinforce the skin color
how does it look with other art styles, outside of photorealism?
How are you captioning your datasets, @high skiff? You doing it manually?
is there some refiner upscaling workflow for comfy
I have yet to try that, and likely won't for a bit
in order to make higher res image without duplication
yeah, i always manually caption
its still doing the high cheekbones/strong jawline thing for every face..I really hope 1.0 doesn't have that tendency, its really annoying
Just upscale it with a pixel upscaler and then run it through img2img. It washes out detail though.
or well, i run it through a tagger and then hand curate it and add tags I see as well
I see...
I was just thinking of making 1080p wallpapers with SDXL
I did make some decent ones
Yeah it's annoying. It will do stuff like fix eyes, mouths, hands (a bit), but it removes the texture off the skin and stuff in the background
You can try a latent upscale, but you have to use a lot of denoise
finetunes will likely allow for higher res upscaling, just like it did for 1.5
how do I quickly copy an image to clipboard in Comfy
trying to use high res fix with 1.5 base is pathetic lol
I can only copy to "clipspace"
click open image, then right click there
oh so it was copying the node then
you're welcome
just know that people can take that image and have all your settings and workflow
Prompt too
that doesn't matter for me
I'm just using a workflow off of reddit lol
and a bunch of generic prompts
eh for safety reasons I'll remove
if anyone wants the workflow I can just link them the reddit page
prompt, nodes, settings, model names, everything you can see on your screen, you share
well that's nothing private for me, but out of being paranoid I'll keep that removed
If you're using a Chromium-based browser though and you copy and paste from the browser, it strips all that data
Wait so if I copy paste from the web browser then it wont have the metadata
does it only have it when I post the image directly?
If it's Chromium based
Then I believe it still carries the data
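For anyone who wants to check rather than guess: ComfyUI saves its prompt and workflow as text chunks inside the PNG, which is why dragging an image into the UI restores the graph. Below is a minimal sketch, standard library only, that lists and strips those chunks. The function names are made up for illustration, and the assumption that the data sits in uncompressed `tEXt` chunks (under keys such as `prompt` and `workflow`) should be verified against your own files.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
TEXT_CHUNK_TYPES = (b"tEXt", b"zTXt", b"iTXt")


def iter_chunks(data):
    """Yield (chunk_type, chunk_body) for every chunk in a PNG byte string."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos < len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        yield ctype, data[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + body + 4 CRC


def make_chunk(ctype, body):
    """Serialize one PNG chunk, recomputing its CRC."""
    crc = zlib.crc32(ctype + body) & 0xFFFFFFFF
    return struct.pack(">I", len(body)) + ctype + body + struct.pack(">I", crc)


def read_text_chunks(data):
    """Return the key/value pairs stored in uncompressed tEXt chunks."""
    found = {}
    for ctype, body in iter_chunks(data):
        if ctype == b"tEXt":
            key, _, value = body.partition(b"\x00")
            found[key.decode("latin-1")] = value.decode("latin-1")
    return found


def strip_text_chunks(data):
    """Rebuild the PNG without any textual metadata chunks."""
    kept = [make_chunk(t, b) for t, b in iter_chunks(data)
            if t not in TEXT_CHUNK_TYPES]
    return PNG_SIGNATURE + b"".join(kept)
```

Usage would be something like `read_text_chunks(open("image.png", "rb").read())` on a file saved from Comfy and on the same image after it has passed through a browser or Discord, to see whether the keys survive.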
I wonder what SD 3.0 will be about
since that will have the opted out artists and stuff
maybe it will be some clever "architectural" upgrade
Scuffed Rock
I preach my settings twice daily XD
I have no secrets
yeah joe biden looked a little odd too
What are these magic settings then
1 repeat lmao
1 repeat, a shit ton of epochs
it makes some very interesting results
I mean that's what I've been doing anyway
Mainly because I can't be bothered waiting
cause if you think about it, running a high BS with 1 repeat would have several different images being gradient accumulated together, rather than many versions of the same image
its an interesting concept for sure
but yeah, if SAI is doing 50 epochs of like 600 images at only 1 repeat, its not nearly as long as you would assume
compared to how I do it
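To put rough numbers on the "not nearly as long as you would assume" point, here is a small sketch of the step arithmetic. It assumes the usual convention that one epoch sees every image `repeats` times and that steps are counted per batch. The 10-repeat comparison and the batch sizes are made-up values for illustration, not anyone's actual settings.

```python
import math


def total_steps(num_images, repeats, epochs, batch_size=1):
    """Optimizer steps for a run: batches per epoch times number of epochs."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs


# 600 images, 1 repeat, 50 epochs (the figures mentioned above)
print(total_steps(600, repeats=1, epochs=50))                # 30000
print(total_steps(600, repeats=1, epochs=50, batch_size=4))  # 7500

# Same dataset with a hypothetical 10 repeats per epoch, for comparison
print(total_steps(600, repeats=10, epochs=50))               # 300000
```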
My usual "style" dataset is 289 images. Typically, doing one repeat every epoch is better, and I usually wish it went longer. (Only tried SDXL training once so far, though.)
first of all training speed
Unet Training: 1e-3 <- ideal setting for 95% of situations (From dataset size of 10~1000)
Dimension/Alpha = 8/1
Unet training only! (--network_train_unet_only)
Resolution = 1024,1024
Bucket size = 512,2048
repeats = 1
epochs = around 20 should be where the model is 'perfect'. Train for 40 to be sure. (scales very slowly with dataset size - but not nearly as much as in 1.5)
@boreal bough hey man, any reason why we use such different alpha and dim compared to 1.5 with SDXL?
I was curious about that
That's basically what I've been doing, except DIM 16
offset noise = 0. (you want to use the original sdxl offset noise which is hardcoded into kohya if you leave it on 0)
so far I am very happy with the diversity I can get out of this LoRA with a dataset this small
How many training images Sytan?
my problem with large datasets is my lack of enthusiasm to caption them all lol
also curious about training a multi Layer LoRA, with different concepts
like some Avatar Creatures, as well as some of their nature/wildlife flora
yeaahhh, thats same here haha
Thats why I do single digit data sets Arron
tho, I am able to see what tags to remove from my auto tagger, as it supports removing some tags
Even though you're not training the text encoder captions still matter right?
i did a single image LoRA, it worked better than expected, but it bakes certain things in quite hard.
guys btw what is lora?
I forgot, I was supposed to try to train a new hybrid LoRA so I can get centaurs.
It's a way to train/finetune the model to add new stuff to it
I would assume so? I am not sure
I know it makes clip do nothing
but like, I can still trigger single concepts this way, so IDK
yeah
is it better or the same as Stable Diffusion training on Google Colab?
I wish highres fix worked properly 😢
a person in our research server has text encoder training working
One of these is a pixel upscale, the other is trying to make it "better" via img2img
You can tell the img2img one because there's no texture on the ground
very very much.
is that gigapixel?
gigapixel is so antient by now
same as how prompting has changed for sdxl, the same should apply to captioning.
good captions are more important than they used to be - especially that the most important words that describe your thing are well chosen
because gigapixel tends to make fake detail to make their image more detailed
antient?
No it's 4x-Ultrasharp and it's not adding any detail, it's just the img2img takes it away
thats kinda how all pixel upscalers work lol
It's not too bad on images like this just using Ultrasharp
Just for faces, especially ones that have dodgy eyes it's awful
if you want to teach sdxl a block of text for example, using "text" will give you only garbage, while using "word" gives immediate results.
a bit of an extreme example, but important to know
Img2img fixes the face, but it smooths out everything else.
SD 1.5 didn't do that, so it's annoying.
ya but gigapixel has actually good AI to make fake detail seem real
gigapixel is pretty outclassed by a lot of new upscalers
and some ai just make it sharper
like what?
also, dont try to redefine 'key' words unless you're brave enough. like 'boy', 'girl', 'portrait' <- it will work, eventually. but just using any other word that describes the same thing saves you soooo much trouble
dude i tried tons of ai upscalers and enhancers but nothing comes close to Gigapixel and they always update their dataset/ai
I had a gigapixel license from a few years ago... What are some new ones because I may not update my license for it if there are better ones. What are some of the newer better ones???
Tried using "word" with the bot
Prompt: a man holding a sign with the word "smile" in bubble text
wasn't gigapixel corporate focused?
so it works in real life applications, but doesn't shine in anything that deviates from practical use cases?
basically
not sure if I'm confusing it with another company right now x_x
They are locally run gans, like 4x ultrasharp, which is what I use all the time
its very good for realism. It adds more detail than a lot of others, but without the crunch of most of the high detail ones
just checked. yeah that was them. topaz labs are great for work situations.
I like the 4x ultrasharp... thats what Ive been using.. I just thought you meant there were better ones than that hiding somewhere
that one already beats gigapixel pretty handily for realism
cant be that long for photoshop to add ai upscaling though. its a low hanging fruit - unless they plan to do it at a level where it doesn't compare, but beats locally run upscalers
they
they have it
they have had it for like
2 years lmao
its called detail enhancer I think
I used it a while ago, along with gigapixel
@high skiff ow shit youre right https://www.reddit.com/r/StableDiffusion/comments/12vppkk/gigapixel_ai_for_second_stage_upscaling_seems/
i see more detail with ultrasharp when zoomed in
is there any ultrasharp on google colab? i wanna try it
yeah, its a great little ESRGAN, and it runs super fast on local hardware
the one for camera raw? yeah. but that one only works on real camera photos.
I mean a general style upscaler that works on any image like the typical esrgan
oh, you can convert any image into a dng and use it that way 😅
uffff xD
thats how I used it on non raw images lol
WHY IS THERE A FACE
thats ultrasharp. i also notice ultrasharp has blocky artifact
probably has Face restoration and ultrasharp thought thats a face lol
yeah that's normal for face restoration or diffusing more than you should 
what do i do with this? i dont run any ai locally on my pc
no idea then
is this correct ai?
Gigapixel can learn what detail is missing and can add some in a very intuitive way, detail that was not there in the original, but does not look out of place when added!
On your drive in Google Colab, you can go into Drive/ComfyUI/Upscaler and then upload it there. I did it this morning and it works. ComfyUI on Colab and SDXL
IDK, I know 0 about non local AI
ow shoot google colab can run sdxl normally?
I keep getting weird markings and scratches and other things with the Upscaler in ComfyUI any advice would be appreciated.
I made the boxes on the face to show where the weird markings are
https://github.com/comfyanonymous/ComfyUI#colab-notebook I'm on Colab Pro with added RAM, but idk if you need to. Maybe free won't be able to use the refiner?
thanks a lot. can i also fine-tune using this notebook?
ah nvm. i just checked, guess not. but thanks a lot for this link
Look at how it is setup. Once you do run the environment setup code on Colab (assuming you loaded the comfyUI notebook), you should see the drive on the left side. Chk out my setup.
do u have gigapixel?
https://github.com/Linaqruf/kohya-trainer it's not in the readme but if you look in the files there's a colab for sdxl training
GitHub - Linaqruf/kohya-trainer: Adapted from https://note.com/kohya_ss/n/nbf7ce8d80f29 for easier cloning
yes but its a really old version
Yes Google Colab ComfyUI can use SDXL pretty easily. Just drag and drop any image from this forum (sdxl form) into your comfyUI page and voila, the workflow is set!
what is comfyUI? never heard of it before. is it like automatic1111
?
im still new to SDXL stuff
It’s a node based workflow ui
But like Factorio version
ComfyUI is the UI to use to create images using different stable diffusion models (SDXL being one of them). Easy to use on Colab, since you get free GPU as well (up to a limit)
In this ComfyUI Tutorial we'll install ComfyUI and show you how it works.
ComfyUI https://github.com/comfyanonymous/ComfyUI
Download a model https://civitai.com
ComfyUI Manager https://civitai.com/models/71980
ComfyUI Examples https://github.com/comfyanonymous/ComfyUI_examples
FREE ComfyUI workflow for 1.5 models here:
https://www.patreon.co...
comfyui looks like im editing or developing some games lol
new minecraft version hits different bro
lol this ui seems like im coding or something
Looks daunting at first but slowly you get the hang of it! 🙂
can i use like different sdxl model like dreamshaper xl?
You could, but I would recommend that you stick to SDXL Base and Refiner models for now.
Does anyone know if SDXL will be able to generate text consistently?
kinda. its a mixed bag
sometimes it can generate text flawlessly sometimes its not
So, I came to know of Colab like 3 days ago, set up my first notebook 2 days ago and created my first SDXL images two days ago. I am not a coder by any means and I do strategy work for a healthcare company. So, some of the videos on YouTube really helped me get up to speed 🙂
really excited to train my own images. sd is already good at training artstyle i cant imagine when i train An artstyle in sdxl. thank you so much again
Cool that it can do this now 🙂 I wonder what the success rate is and where it might struggle.
in my experience it has 25% success rate. if u generate 4 images at least one can have complete text
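A quick sanity check on that estimate: if each image independently has a 25% chance of clean text (independence is an assumption, since seeds and prompts may interact), then a batch of 4 gives roughly a 68% chance of at least one good result, not a guarantee. A minimal sketch of the arithmetic:

```python
import math


def at_least_one_success(p, n):
    """Probability that at least one of n independent tries succeeds."""
    return 1 - (1 - p) ** n


def images_needed(p, target):
    """Smallest n for which at_least_one_success(p, n) reaches the target."""
    return math.ceil(math.log(1 - target) / math.log(1 - p))


print(round(at_least_one_success(0.25, 4), 3))  # 0.684
print(images_needed(0.25, 0.90))                # 9
```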
A nice to have feature in ComfyUI would be a stop rendering button.
optimizer? lr scheduler?
view queue > cancel
Thank you... That is exactly what I needed
I wish text further away was as good as it can be on signs now
adamw8bit - constant with warmup 5%
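For reference, "constant with warmup 5%" usually means the learning rate ramps linearly from zero over the first 5% of steps and then stays flat. A minimal sketch of that schedule follows. The function name is hypothetical, and pairing it with the 1e-3 Unet rate quoted earlier in the chat is an assumption about which run this answer refers to.

```python
def constant_with_warmup(step, total_steps, base_lr=1e-3, warmup_fraction=0.05):
    """Learning rate at a given step: linear ramp during warmup, then constant."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr


# With 1000 total steps, warmup covers the first 50 steps.
for step in (0, 25, 50, 999):
    print(step, constant_with_warmup(step, total_steps=1000))
```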
yea; nothing really compares to sdxl right now outside of like the private models like parti
parti?
results with controlnet and lora are yet to be seen
google's imagen, microsoft also has NuWa
neither are available to the public afaik
i thought they had something better than imagen
wasnt aware of either
imagen > muse?
do spill the tea
nah, google doesn't even get close
o i meant parti was google's big image generation thing
i forgot they also literally named one of them imagen
xD
nah, gotta give credit where it's due, parti 20b and nuwa-xl surpass anything available publicly rn
even SDXL?
microsoft is a bit more open though, nuwa might eventually see the light of day
parti isnt high res
1024x1024 is what we're at and that's what parti 20b is at
in terms of coherence it's better afaik
oh, do you have the source for that?
Imagen is finally available to some Google cloud customers need to request and be approved. Existing Vertex customers only I think
what about detail?
if you check the images on the page they're all 1024x1024
like these are all 1024s
I have my doubts that either exist as such. Model architecture, but since nobody can touch them, it's likely hype. Just like Bard before its release.
MS has invested so much in AI + the demos they have r pretty extensive for nuwa/nuwa-xl
i see no reason to doubt it
nuwa would probably get integrated in Bing AI or something down the road
Idk, I refuse to believe after all SAI's efforts SDXL is still not the best, when I use A1111- it gets to an insane level of detailing
sdxl is amazing
but googles & ms resources r like infinite
lol
ms put $10bil down for AI stuff
not too long ago
nah, it's not as good with details as SDXL
But I'd also argue that Midjourney V5 is probably mostly superior in the ways that SDXL is good. DeepFloyd IF is definitely the most accurate text to image model that we can demonstrably show.
Moreso than SDXL
that's not true. even 1.5 models are way better than MJ. let alone SDXL..
Meta has a new model too https://ai.meta.com/blog/generative-ai-text-images-cm3leon/
i wouldn't say 1.5 models are but for SDXL many people voted the SDXL pictures better than midjourney iirc
you have not seen Midjourney 5.x based models
so midjourney's next model might be SDXL based
for a 20B model, thats pretty damn rough
yea; the parti paper is older
nuwa-xl is showing better progress rn
understandable in that case
oh, I did, people here can vouch for that
As far as unreleased products, in the world of stuff like that, there is so much fakery about products that maybe 'could be'
I fully expect Imagen was never finished
MJ 5.x does look incredible, I will say
But while its images have a lot of detail, I find its "realism" images never look that real
like, they have a ton of detail, but it looks overwhelmingly artificial
i wouldn't doubt the capabilities of google/ms, even if the products never get released
SDXL is far better than MJ. Idk about Imagen
I saw one really silly thing with MJ, you ask for different nationalities in buildings and it always goes so over the top
Custom 1.5 models can definitely be better than MJ
they can be, yeah
The capabilities of a company must at least in some degree be tied to something that eventually gets released 😄
especially for more stylized things
I expect Google has amazing research
what model/prompt did you use for that, they look gorgeous
lmao
yes, custom finetuned models can surpass
the images have the metadata on them still, you can drag directly into comfyui if you have it
was testing a random prompt
yep, I remember myself actually making a 1.5 model that surpassed MJ
it also output these
if MJ is the mind of Holz, I would hate to see what the mind of Emad looks like lol
probably V2.1 honestly
not even close
I'm still kinda new with sd, is comfyui in the files or in the webui?
comfyui is a different UI to the web UI, i don't know if you'd be able to recreate that with auto1111 rn or if it even has sdxl support yet on the non dev branches?
if you have comfyui you'd just take the picture and put it into the ui itself
because that also lets you have finer control with the prompt
i didn't use clip vit-l
only vit-g
and also a refiner model
a1111 would send it both into vit-g and vit-l which would be a 100% diff result afaik
yes it would
oi, thanks for the help a few days ago in setting up comfy
Half of that was gibberish to me
Idk, A1111 does SDXL perfectly, except it isn't implemented with the refiner
Basically, if you install comfyui, and the 2 SDXL 0.9 models, you just put the picture into the web browser with the UI loaded up and it'll copy all the settings immediately
and you'll be able to see the full workflow that made that
like, does Emad really think MJ can outdo these?
does it have a separate text box for vit-g and vit-l?
if its any consolation we didnt all get qualifications in this, we started where you are and kind of fumbled around together.
because i rarely use vit-l
So... then its not perfectly 😅
It has LITERALLY everything, but half of it
Then its not ALL OF IT lmao
mmm well, a1111 is great, but the vram usage is huge
i can barely do 768x768 on a1111
and easily 2048(non sdxl, sdxl falls at ~1900x1900 on 6gbvram) on any model
using vit-l decimates the quality unless it's used for a very specific reason for the prompts i use
like i'll sometimes add a few supporting terms in vit-l
but i don't want vit-g copied to vit-l
idk what to tell you, what ever it is it's better than any ComfyUI workflow I used.
and yea, refiner is pretty big part of the process for image clarity
I used the github link and am downloading it, I guess I'm just unpacking the .zip in my C:/AI/ folder?
I would assume thats user error in that case
nooooooooooooo
git clone the repository
so you're able to update it
any news on training the refiner yet?
like, dude, whatever A1111 did it damn works
after you do that, you can link your SD venv in here:
why is the chat active
It's funny I've found that without vit-l often key attributes are completely missing from my prompts. I'm thinking that by skipping this you're missing some of the knowledge in -l
I see no announcements
ok, how do I launch git, I have like 3 executables in my search bar
I mean, it looks fine
and only change this path, keep the rest
Bash, Gui, or CMD
you can do it through windows terminal/powershell
this is what the workflow looked like btw
Does using the basic condition node prompt both? Or just vit-g
but if you just used green text in a1111 it wouldn't do same result cuz it'd be feeding into vit-l too
They'll probably be putting both text encoders into the model, which is fine, it just gives you a bit less control
I bet there are specific prompts where either g or l have destructive knowledge, and times where g or l have specific knowledge that you'd want
the BF16 one does
yea
that's why
it speeds up generations massively at the end
you put specific terms in one or the other
But the trick is SUPER long term (I guess) learning which terms should go where
cat diffusion when?
like maybe l has a great idea of what an elf looks like but g doesn't
i more or less have an idea through testing what words i wanna put where
but i never wanna directly copy paste
or like
use same terms for both
*SDXL now
they didn't go far enough
i want a cat with wings, a devil cat, a cat with a fork, a cat playing poker
i even sparsely use negative unless i really need to
some kind of interface to be able to mark words to 'stay off' of one prompt or another might be interesting. Or literally... someone should do one of those charts, trying each prompt term in each combination of each clip model
none, l, g, both * each word
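A rough sketch of how that chart could be enumerated: every term gets one of four routes (none, l, g, both), and each combination yields one pair of prompt strings. The `text_g` / `text_l` names mirror the two inputs on the CLIPTextEncodeSDXL node mentioned later in this chat. The function name and the comma-joined prompt format are assumptions for illustration only.

```python
from itertools import product

ROUTES = ("none", "l", "g", "both")


def routing_grid(terms):
    """Enumerate every way of sending each term to vit-l, vit-g, both or neither."""
    grid = []
    for assignment in product(ROUTES, repeat=len(terms)):
        pairs = list(zip(terms, assignment))
        text_g = ", ".join(t for t, r in pairs if r in ("g", "both"))
        text_l = ", ".join(t for t, r in pairs if r in ("l", "both"))
        grid.append({"assignment": assignment, "text_g": text_g, "text_l": text_l})
    return grid


grid = routing_grid(["elf", "forest"])
print(len(grid))  # 16 combinations for 2 terms (4 ** 2)
for row in grid[:4]:
    print(row)
```

The grid grows as 4 to the power of the number of terms, so this is only practical for a handful of words at a time.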
u using caith's params?
no, I am using very different ones
you can type in cmd in the address bar of any folder to open a command prompt there. git has a nice windows gui for keeping things up to date but unless you're actively coding its not v useful. think of git as like the update service for projects, you can largely just have it there.
Make a Sumo wrestling na'vi
you would need to install the nightly build of torch, which I do not remember how to do
i remember
ok, now that is a lot haha
pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu121
ok i want this prompt now mate.
so it's some experimental feature for now 😔
oh that's it, thanks
care to share? these settings arent working for faces
okay, so how do I get started, Idk wtf to do
the generation times are good enough for me so I won't experiment with BF16 for now
my heart can't take it anymore
bubble twins
I am not gonna share my settings at the moment as they are EXTREMELY wasteful
Like, 70% of the LoRA was wasted on over training, and I need to tweak it a ton
we should maybe move to the tech channel, this moves pretty fast
The one thing that is pissing me off about this LoRA is it just like... randomly decides to not generate a Na'vi, and instead generates a black guy
it keeps happening, and I have no idea why
like, WTF
lmfao
gotcha. I'm trying to do the same, just need a coherent starting point for the process, and so far i'm getting major distortions. might be my captions
its so stupid, and I have no idea how to stop it from doing that
im sure training the text encoder will help once we figure that out
Sytan: I can't remember having issues like that with different skintone LoRA with 1.5. But things are a bit different with SDXL if you're not training TE.
does prompt weighting help?
good point
and that too
yeah we should wait for SDXL 1.0 anyway
I don't understand it well enough to explain, but he did get it working
I heard SDXL 1.0 is delayed because SA wanted to help with 0.9 Loras and stuff
is it hard because it's a guessing game or is it hard because it's demanding?
it requires you to train 2 LoRA's
I also saw some post that suggests that we can use T2I-Adapter with SDXL
That seems to be a misunderstanding and it isnt' really delayed due to 0.9 capabilities.
I need to find a way for SDXL to stop racially profiling Na'vi lmao
It would suck if we end up having to train 3-4 Loras for each subject (unet, TE, and refiner-and/or refiner TE)
If you train a new one for it yes.
That's what they did for that scribble thing
cause it just keeps generating black people
aah
I think it's something to do with how it captures general skin tone. Because earlier when I was trying to do a red skinned demon, it kept generating people with darker skin as well
the second I put african in the negative, all it generates is white guys
What happens if I put multiple models in the models folder? do I get like a character selection screen or an error that kills everything? 😂
oh wow, that one reminds me of Mao Mao haha
Just change the eye color and its basically Mao Mao haha
Cheshire Mao Mao lol
how does one use xy plot node in comfy?
It moved away from full cat person lol
anime style sometimes does that
we back
@eternal fog I am using the same prompt for the prompts, is there any way to connect these 2 nodes using another node, so that i can just input the prompt once?
Just use the one node and plug it into both
the cat on the christmas tree
to which nodes is it connected, its too difficult to find the path when the noodles are straight
They plug into the Text_g and text_L in the CLIPTextEncodeSDXL Nodes
ok
this doesnt even look like a kitten.
wtf is pooled output
is it not just adding --bf16-vae when launching comfy?
might try thx
the smoke is realistic af
a fully tattooed young pretty and attractive Brazilian tall female fitness supermodel wearing biking and super muscular with six pack in the bedroom with face and chest towards camera full front ultra realistic 8k image hyperrealistic. Her each and every body part is fully covered with tattoos.
also in my case the patched fp16 vae on HF is actually faster than the standard cast to bf16 by quite a bit for decoding. Encoding seems broken though.
Is that Midjourney?
SDXL0.9
@delicate grotto what prompt/artstyle did u use for this? I loved the soft shading/anime style like League of Legends
soo good man
yeah, the power of A1111
@indigo carbon do u have any idea for this artstyle?
mmm can't recall
i think
a catgirl in a purple yukata, sakura trees in the background, pink background
5 steps with the style: Enhance, Cinematic
15 steps with: Enhance, anime, digital art, Cinematic
Do u use ComfyUI and custom Model? @delicate grotto
sdxl 0.9 pruned - comfy
Also achieved a very colourful splashy style with "dreamy, (style of Hannah Yata:1.1), colorful Psychedelic Art, neon colors" in my prompt. sdxl 0.9 pruned - comfy
how do you get a prompt from a photo using comfy? simply dragging it into the web browser ui doesn't do anything
well, i can't replicate the original but it does look alike i think
You probably will not get a workflow from the drag and dropped image since Discord is probably messing with the EXIF data from the uploaded images.
workflow should work, but you gotta copy it from file browser tho iirc, not from the comfyui
well, here are 4 pics from the directory itself, not sure how to extract info from comfy
just drag it into discord
Thanks, could replicate it.
Yeah, just drag and drop. Comfy is awesome.
i thought i was going to get a prompt and steps only
Wait until you have missing nodes 😄
at least its not a bsod
can't load a refiner
Let me check that out real quick
i need to do it in two workflows...
sadly
one to save the latent image, and one to pass through the refiner after deloading the refiner
on god I won't sniff
just want to remove
all any way no posssible
cant really do that easily as of yet
how to do it
like lets say
I will train this xl
with thousands of 18+ images
how I will stop
this shit
genrates
kids and arrest me
whats the best way?
its time to stop...
its real question I ma pay
like 2k usd
who can do it
joe penna himself said that if u can delete trained things from models that he'll hire you on the spot lmao
I don't want to get my ass arrested for this sht
yea like umm
maybe we can train like
hmm
we can collect all keywords ok these pedos search
and we can train for those keywords
a fbi logo
no it wont work
if it will generate a %0.1 percent change
a kid
I go arrested
tf is this how to
tell stable owner
don't use any kid shit
youre not going to get arrested for accidentally creating that lol
he's trolling...
i think hes genuine
Jesus christ, i was 2 seconds away
yeah it gets tricky if youre hosting a service
Here with refiner
like you are not a developer or like u are user?
hey people already asked on #🤝|tech-support but you seem more active, but im getting something weird, basically with A1111, i tried the default SDXL0.9, the Dreamshaper one, and the WaifuDiffusion one, and WaifuDiffusion runs flawlessly, like, it's perfect, but the base 0.9 and the dreamshaper one just, they take forever to load, and when they're loaded they seem to generate the image "fine" but then when it reaches 99% it throws out a cuda memory error, which is weird because it does that with every resolution i try, even 512x512, while WDXL works till 1024x1536
so there must be a way
either you will create your own model
that has %0 kids in it
or like how we can
mmm add 3d to the positive
K, let me see
before um releasing the image
or pixar, the refiner kinda ruined it and flattened it
it will check for age
not sure how the refiner works, but try to use it on the last 2-3 steps?
i saw another host post how he handles it: he replaces prompts of what he thinks will create that into another word. so child becomes ape or something. obviously that means noone can ever make pics of children ever with his service but that's a small price to pay
yea me saw it to but
its not %100
its like %99.9 and
right
if a guy is a fully pedo
he will find a way
I am sure
and you get ur ass whoped
like trained model
must have %0 child image I guess
all adult
sdxl was trained specifically to work on 1024 right?
from what ive seen, most github repos will have a license stating that the dev isnt responsible for anything that a user does with their tool, im sure there's something similar for a service provider
yea yea but umm
you need to make sure before giving it away
they will take down ur server domain and shit
i can see that happening, yeah
there must be a way like to not generate any %100 no chance u know
I can add a age detector and reverse prompts
Don't know if that any better. I am more the realistic guy
its 11 yo 💀
yep, looks better, didn't get flattened
you using comfy or auto1111?
yo bro are you good with stable dif
Comfy
I was gonna buy a mercedes I invested in this nude stuff need help
At the moment yes. I always worked with Auto, but i completely changed my workflow.
i prefer the gradio UI too much to change it tbh
i tried your workflow
Yeah the learning curve is pretty steep. But once you get the hang of it, you will appreciate Comfy. I am no fanboy or something, and i loved Auto, but now i stick to comfy
How did you do that?
IMPRESSIVE!
Is the sdxl 10 times slower than other models??
where did you get the pruned refiner?
Not really, it just generates in higher resolution.
on high resolutions it's faster
So you think its better than any other is it
tbh i've found that WHEN it works it's faster
Some guy uploaded it one day after the leak on huggingface
Can you send 💀
What do the DM buttons under the bot image results do? Who gets DMed?
you
so you have them saved if you want them for later
Hell no.
haram
too bad, its not there anymore
oh well
also, is the refiner really needed, sometime i see it improves images, sometime it's worse
good question. A1111 isn't using the refiner and it runs it the best
I have read that you can prune them by yourself with automatic. Don't know if that's a thing
How do you use it on a1111
it needs api key
dev branch
yes you can
I don’t see anything in my DMs; maybe it’s because of restrictions in my account settings?
aye could be
Alrighty. No big deal.
if anyone knows where there is a pruned no-ema sdxl/refiner please send it here
Sorry, i have searched without luck. I am sorry, but i can't upload it. You might have to convert it by yourself.
Here is a thread about it on reddit https://www.reddit.com/r/StableDiffusion/comments/14uaoca/how_convert_sd_xl_base_09safetensors_to_pruned/
you can sort of prune the main model from 13gb to like 6.5 with auto1111 but it didnt seem to work with the refiner, the size was the same
WHY DID IT MAKE TILE IMAGE
Base: 6.775.435 kb / refiner 5.933.544 kb. Can confirm that they are both fitting on my RTX 4070 12 GB in Comfy
Generation Time: Approx. 20 sec 1208x1024 with my shitty workflow inc. refiner
I think the better question is: How did you make a tile image?
weird firey revolver
i get tile images a lot on certain seeds with short prompts. was the prompt short? hopefully 1.0 fixes that. happens a lot to me too
What did you actually want to achieve with that prompt?
Well, i closed some stuff
A 3060 mobile 6gb can fit it too
~30 sec render time
Uh, thats nice. Lets not look at the ghostly thing top left
some people may say that what you focus on speaks to your state of mind ;o)
30 sec is ok for that setup i think. Was it now with the refiner?
Yep
Nice. Oh btw, yesterday i found this here somewhere:
Im still not sure how the refiner works, but i didnt notice any difference
Style: Enhance
Positive: breathtaking {prompt} . award-winning, professional, highly detailed
Negative: ugly, deformed, noisy, blurry, distorted, grainy
Style: Anime
Positive: anime artwork {prompt} . anime style, key visual, vibrant, studio anime, highly detailed
Negative: photo, deformed, black and white, realism, disfigured, low contrast
Style: Photographic
Positive: cinematic photo {prompt} . 35mm photograph, film, bokeh, professional, 4k, highly detailed
Negative: drawing, painting, crayon, sketch, graphite, impressionist, noisy, blurry, soft, deformed, ugly
Style: Digital art
Positive: concept art {prompt} . digital artwork, illustrative, painterly, matte painting, highly detailed
Negative: photo, photorealistic, realism, ugly
Yeah, i got it on a note, extremely useful for the style
Style: Comic book
Positive: comic {prompt} . graphic illustration, comic art, graphic novel art, vibrant, highly detailed
Negative: photograph, deformed, glitch, noisy, realistic, stock photo
Style: Fantasy art
Positive: ethereal fantasy concept art of {prompt} . magnificent, celestial, ethereal, painterly, epic, majestic, magical, fantasy art, cover art, dreamy
Negative: photographic, realistic, realism, 35mm film, dslr, cropped, frame, text, deformed, glitch, noise, noisy, off-center, deformed, cross-eyed, closed eyes, bad anatomy, ugly, disfigured, sloppy, duplicate, mutated, black and white
Style: Analog film
Positive: analog film photo {prompt} . faded film, desaturated, 35mm photo, grainy, vignette, vintage, Kodachrome, Lomography, stained, highly detailed, found footage
Negative: painting, drawing, illustration, glitch, deformed, mutated, cross-eyed, ugly, disfigured
Style: Neonpunk
Positive: neonpunk style {prompt} . cyberpunk, vaporwave, neon, vibes, vibrant, stunningly beautiful, crisp, detailed, sleek, ultramodern, magenta highlights, dark purple shadows, high contrast, cinematic, ultra detailed, intricate, professional
Negative: painting, drawing, illustration, glitch, deformed, mutated, cross-eyed, ugly, disfigured
Style: Isometric
Positive: isometric style {prompt} . vibrant, beautiful, crisp, detailed, ultra detailed, intricate
Negative: deformed, mutated, ugly, disfigured, blur, blurry, noise, noisy, realistic, photographic
Style: Lowpoly
Positive: low-poly style {prompt} . low-poly game art, polygon mesh, jagged, blocky, wireframe edges, centered composition
Negative: noisy, sloppy, messy, grainy, highly detailed, ultra textured, photo
Style: Origami
Positive: origami style {prompt} . paper art, pleated paper, folded, origami art, pleats, cut and fold, centered composition
Negative: noisy, sloppy, messy, grainy, highly detailed, ultra textured, photo
Style: Line art
Positive: line art drawing {prompt} . professional, sleek, modern, minimalist, graphic, line art, vector graphics
Negative: anime, photorealistic, 35mm film, deformed, glitch, blurry, noisy, off-center, deformed, cross-eyed, closed eyes, bad anatomy, ugly, disfigured, mutated, realism, realistic, impressionism, expressionism, oil, acrylic
Style: Craft clay
Positive: play-doh style {prompt} . sculpture, clay art, centered composition, Claymation
Negative: sloppy, messy, grainy, highly detailed, ultra textured, photo
Style: Cinematic
Positive: cinematic film still {prompt} . shallow depth of field, vignette, highly detailed, high budget Hollywood movie, bokeh, cinemascope, moody, epic, gorgeous, film grain, grainy
Negative: anime, cartoon, graphic, text, painting, crayon, graphite, abstract, glitch, deformed, mutated, ugly, disfigured
Some of them btw mess up with anime
Style: 3d-model
Positive: professional 3d model {prompt} . octane render, highly detailed, volumetric, dramatic lighting
Negative: ugly, deformed, noisy, low poly, blurry, painting
Style: pixel art
Positive: pixel-art {prompt} . low-res, blocky, pixel art style, 8-bit graphics
Negative: sloppy, messy, blurry, noisy, highly detailed, ultra textured, photo, realistic
Style: Texture
Positive: texture {prompt} top down close-up
Negative: ugly, deformed, noisy, blurry
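For anyone who wants to script these instead of pasting them by hand, here is a minimal sketch of how a template like the ones above could be filled in. `STYLES` and `apply_style` are made-up names for illustration, and only two of the styles are copied in. How the bot merges your own negative prompt with the style's is not shown in this chat, so the comma-join below is an assumption.

```python
# Hypothetical helper: fill a style template's {prompt} slot.
STYLES = {
    "Enhance": {
        "positive": "breathtaking {prompt} . award-winning, professional, highly detailed",
        "negative": "ugly, deformed, noisy, blurry, distorted, grainy",
    },
    "Texture": {
        "positive": "texture {prompt} top down close-up",
        "negative": "ugly, deformed, noisy, blurry",
    },
}


def apply_style(style, prompt, negative=""):
    """Return (positive, negative) prompt strings for the chosen style."""
    template = STYLES[style]
    positive = template["positive"].replace("{prompt}", prompt)
    # Assumption: the user's own negative prompt goes after the style's.
    parts = [template["negative"], negative]
    merged_negative = ", ".join(p for p in parts if p)
    return positive, merged_negative


pos, neg = apply_style("Enhance", "a lighthouse at dusk", negative="text")
print(pos)
print(neg)
```

Adding another style is just one more entry in the dict with the same two keys.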
Sorry for the spam
@delicate grotto 😛
weird, it got better when I prompted "beautiful beautiful beautiful beautiful beautiful beautiful beautiful beautiful beautiful beautiful beautiful beautiful beautiful beautiful beautiful beautiful beautiful beautiful beautiful beautiful beautiful beautiful beautiful beautiful beautiful "
wonder how repetition actually plays out in CLIP.
I got latent walks working
https://twitter.com/Birchlabs/status/1682531741100015617
#SDXL conditioned on interpolated CLIP hidden states
thanks @cafeai_labs for the compute, and @RiversHaveWings for answering my questions
pedantically it's interpolating hidden states rather than latent embeddings but that's as close as is possible
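As a rough illustration of what "interpolating hidden states" means: you take the text encoder's output for two prompts, blend them step by step, and condition the model on each blend. The sketch below uses plain linear interpolation on tiny made-up vectors. Whether the linked walk used linear or spherical interpolation is not stated here, and `lerp` and `walk` are names invented for the example.

```python
def lerp(a, b, t):
    """Blend two equal-length vectors: t=0 gives a, t=1 gives b."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def walk(a, b, steps):
    """Return `steps` blends from a to b, endpoints included (steps >= 2)."""
    return [lerp(a, b, i / (steps - 1)) for i in range(steps)]


# Toy stand-ins for two prompts' hidden states. Real ones are
# (sequence_length x hidden_size) tensors, not 2-element lists.
start = [0.0, 2.0]
end = [4.0, 0.0]
frames = walk(start, end, steps=5)
print(frames[2])
```

Each entry in `frames` would be fed to the sampler as conditioning for one frame of the walk.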
My biggest hope for improvement between 0.9 and 1.0:
The refiner model: All of the faces turn out more Caucasian and generally more plain looking than the intended expression. Like if a person is another ethnicity they come out looking at least half white. Also if the original face has a more stylized rendering, it almost always turns out more photoreal in the refiner version.
i really don't understand why they went with the refiner model to begin with. i'm no expert, i don't understand the process, but it seems to just make the process more complicated
what is joe byron talking about
does anyone know how to fix this
it implements eDiff-I's "ensemble of expert denoisers" idea
https://arxiv.org/abs/2211.01324
it reminds me of gpt 4
what if they had like 8 different refiners trained on different aspects or styles
and combined them
I guess it wouldnt be local at that point
it's happened like 7 times in a row bro what is going on 💀
if the conditioning is good enough, model has enough params, and the things you want are represented in the data: one refiner is enough
okay, thanks luluco
the refiner already is an expert of an "aspect" anyway
it's a high-frequency details expert
ooh thats right
and I guess it's biased towards being responsible for aesthetics
because it's conditioned on LAION's CLIP
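A toy sketch of the "ensemble of expert denoisers" idea as described above: early, high-noise steps go to one model and the last, low-noise steps (where fine detail gets resolved) go to another. `pick_denoiser` and the `handoff=0.8` split are assumptions for illustration, not an official SDXL setting.

```python
def pick_denoiser(step, total_steps, handoff=0.8):
    """Route a sampling step to 'base' or 'refiner'.

    handoff is the fraction of steps handled by the base model.
    0.8 is an assumed value for illustration only.
    """
    return "base" if step < int(total_steps * handoff) else "refiner"


schedule = [pick_denoiser(s, total_steps=10) for s in range(10)]
print(schedule)
```

With 10 steps and the assumed split, the first 8 steps go to the base model and the last 2 to the refiner.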
do you know if diffusion models in the future could get the text right? Like if you ask it for a newspaper
DeepFloyd IF already does well on text
true
but I think the people who've done best, conditioned their UNets on text encoders which tokenized the prompt into characters rather than words
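To make the character-vs-word point concrete, here is a tiny sketch. Splitting on whitespace is only a stand-in for a word-like tokenizer (real ones such as CLIP's use subword byte-pair pieces), and both function names are made up for the example.

```python
def word_tokens(prompt):
    """Stand-in for a word-level tokenizer: split on whitespace."""
    return prompt.lower().split()


def char_tokens(prompt):
    """Character-level tokenizer: every character is its own token."""
    return list(prompt.lower())


sign = "OPEN 24 hours"
print(word_tokens(sign))
print(char_tokens(sign))
```

With character tokens the model sees each letter it is being asked to draw, which is the intuition behind why such encoders tend to spell better.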
does that also remove the issue of two word objects being split up? for example dragon ball
hmm is that even an issue?
it sometimes puts a dragon
hmmm
its not a good example but
it's the most normal thing in the world for the tokenizer to tokenize your prompt badly
but attention still learns which tokens cause each other
the token embedding it learns for that token, may end up a bit tangled with multiple concepts
yeah it does sometimes
I think if you're not getting a dragonball, the more likely thing is it wasn't sufficiently represented in the training set of the text encoder, and/or of the Unet
does some of y'all got a good ComfyUI workflow?
even tokenizing it really well can't make up for the text encoder or Unet not having seen it during training
solution is to train a lora and put a unique token in the training data?
like dbz
in pinned
hmmm loads of solutions
textual inversion can search the text encoder's embedding space for coordinates that would already do the trick, even though they have no name in tokens
ooh
LoRA can't change the shape of model weights, so if you want to extend the text encoder's vocab: you'd need to resize the input embedding and output projection
(there are ways to do both of those without a full finetune)
then you'd finetune text encoder (you could use LoRA)
I think thats beyond me, its not really an issue though
then you'd finetune Unet (you could use LoRA)
but I think you can get away with the text encoder not understanding what you're talking about
lots of people just overfit when they finetune the Unet
just make it always draw dragon girls
dragon girl fumo
I have fumo irl so I can just stick it to my monitor
I made a fumo textual inversion embedding
https://birchlabs.co.uk/machine-learning#textual-inversion
I was amazed how close textual inversion got; it doesn't involve finetuning the text encoder or the Unet
theyre so cute
the thing that keeps stopping me from doing a proper finetune is I don't have a decent fumo dataset
theres a bunch of really good 3d fumo scans
there's a lot of data here, if a way can be found to de-watermark them
https://fumo.website/fumo_checklist.html
true
most of them dont have watermarks?
I'll take 10
oh the touhou logo
or whatever that icon says
it's all solvable, but more work than I really want to do
I just realized that the formatting like %date:yyyy-MM-dd-hh-mm-ss% works in comfy for filename outputs
aint gonna lie this doesnt seem optimal
very cute 🙂
heylo
something is wrong with this. what is your workflow
Got a couple new LoFi girl style images... What do you folks think?
really looks like somethings going wrong with that one mikey
They look pretty good. Reminds me a bit of cyberpunk for some reason.
i'm not sure what lo-fi means in this context. i usually see it as "low fidelity" nice images though. maybe a little underrefined. no refiner?
HiFi girl
I absolutely love this, would you mind sharing the workflow and prompt?
a terrifying creature lurks in the shadows of an abyss, another creature is spooked + fantasy art, imaginative themes, surreal elements, mythological creatures, dreamlike scenery
So
I have news
I wish not to get my own hopes up
but my 3090 is currently working without issues
and I am not sure if I have found the permanent fix, or if this is just a temporary postponement, but whatever it is, it's the first time since I got it that I have not immediately bluescreened after basic high intensity compute
was it on the new computer? or working on the old one?
It's not the 3090 correct?
If this is the fix, it definitely was the 3090
Both of them
what's the fix?
interesting
The seller and I exhausted practically every option, and I considered the GPU completely dead, so we did some more reckless things
did it involve an oven?
I disassembled the entire GPU and started running it without the backplate, to diagnose any potential heating problems on the VRMs and the pads that he included, it was still crashing at that point
So then I wanted to test with the backplate reinstalled just out of curiosity, so I disassembled the whole graphics card again, reassembled it with the front and back on, and it hasn't had any problems since
I don't know if maybe the pads weren't making proper contact, or potentially there was too much mounting pressure on the GPU, which I know can cause problems with CPUs
ahh
Sounds to me like the seller installed some pads incorrectly
I'd likely assume that it is the mounting pressure situation, as the seller has nearly two months of squeaky clean windows event logs, and temperature logs and everything that show absolutely no out of the ordinary characteristics for the GPU
So our current guess is that maybe the box sustained some form of high impact that messed up the mounting pressure asymmetrically, so one side of the die was experiencing more pressure than the other, which can cause some pretty big issues with CPUs, and I would assume GPUs as well
The thermals look identical, the performance looks identical, but it's not crashing anymore
I genuinely have no idea, and I don't want to get too over hopeful that the issue has been completely solved, but this is the first thing out of dozens of different things I've tried that has yielded any form of improvement, let alone a complete eradication of the problem as of now
What are you using for thermals? Does the app you use show the VRAM temp etc.?
I just got my 3090 from eBay myself. The seller used it for light gaming only. It's working fine so far for me.
But just for a comparison, my GPU has sustained over 60 BSODs, every single time being identical without fail
The criteria for the BSOD was doing some form of thing that required the GPU to hit its maximum clock, waiting for it to fall back down to idle, and then having the GPU try and reach its maximum clock again
Every single time that led to a crash
Are you saying it might have gotten bounced around in shipping?
As of right now, I have run through that exact same system almost 70 times without a single hiccup or BSOD
So the single thing that failed every single time on the graphics card has now worked consecutively over 70 times
That's the guess, because the seller 100% knows his shit about GPUs.
I'm pretty educated on the topic, but this guy was running circles around me in terms of specifications, tests that we could run, potential hardware fixes, all sorts of shit
He's got a squeaky clean record of like 300 and some sales on eBay, and a phenomenal rating as well
The entire time he was optimistic, helpful, and very polite, and overall just seemed confident that there was some weird or stupid issue at play
He stopped me from giving up several times, and it seems like it may have been for good reason
He was immensely confident that he did nothing wrong on his part, as he was literally running live tests for me with logs and everything as I was contemplating buying the GPU
And I know for a fact that I did nothing wrong, as I handled the GPU extremely gently, and I've installed graphics cards dozens of times
It really does seem like some form of fluke mounting pressure problem
congrats, and that's damn weird
i might even take it back apart and look for a short
but you solved the problem @high skiff ?
hopefully it keeps working
As of right now, yes
good
I have no idea if it's a temporary, or a permanent, or whatever
I think i got lucky with my 3090. Had it for a week and a half and so far it runs very nicely. image iterations which took 30 mins on a 4GB 1050 Ti are now taking 30 seconds. lol
although, with more complicated images with multiple controlnets it does tend to sound very loud with that founders edition turbo fan. lol
LoFi girl is a sub genre of music... its a youtube channel and its all about easy to listen to music with no lyrics that helps some folks study or just relax. The LoFi Girl is the face of that channel. Lookup LoFi Girl on youtube. Even if you don't find that you like the music you will see what I mean... And yes its meant to be a combination of the Neon CyberPunk with The LoFi Girl in a photo format instead of an illustration. Sorry for the wall of text. Just figured it could use the explanation.
don't apologize! that's totally the sauce i wanted. see, i'm old. i don't get some lingo sometimes. wanted to know where that good old oldy, lofi, was coming from in a modern context. thanks for laying it all out for me.
Thank you 🙂
funnily enough, lo-fi tries to go for the older analog warm tones, what would have been called hi-fi back in the day
heh, yeah but hifi was analog stuff that was just purrre with zero noise. it was pretty impressive gear. i was never alive while it was the hot market, but there's a level of quality to old hifi audio systems that was just.. mmm..mmmmmmm!
I think in the context they use it in its less about the quality of the music and more to signify that the music is low energy and relaxing
I keep getting this error message here every time I try to generate any image:
NotImplementedError: No operator found for memory_efficient_attention_forward with inputs:
    query : shape=(1, 4096, 1, 512) (torch.float32)
    key : shape=(1, 4096, 1, 512) (torch.float32)
    value : shape=(1, 4096, 1, 512) (torch.float32)
    attn_bias : <class 'NoneType'>
    p : 0.0
cutlassF is not supported because:
    device=cpu (supported: {'cuda'})
flshattF is not supported because:
    device=cpu (supported: {'cuda'})
    dtype=torch.float32 (supported: {torch.float16, torch.bfloat16})
    max(query.shape[-1] != value.shape[-1]) > 128
tritonflashattF is not supported because:
    device=cpu (supported: {'cuda'})
    dtype=torch.float32 (supported: {torch.float16, torch.bfloat16})
    max(query.shape[-1] != value.shape[-1]) > 128
    Operator wasn't built - see python -m xformers.info for more info
    triton is not available
smallkF is not supported because:
    max(query.shape[-1] != value.shape[-1]) > 32
    unsupported embed per head: 512
could be, there's only so much to go around
Do you guys know how to fix this? Bc I can't load any image prompts in Stable Diffusion
I might be able to help
Thank you
Have you edited the .bat file you use to run ComfyUI?
lofi hiphop might've been what she started with
Uh... kinda? It's the WebUI-user.bat file
Are you using ComfyUI or VLAD A1111?
I opened it up in Notepad and added these arguments to COMMANDLINE_ARGS: --xformers --autolaunch --skip-torch-cuda-test --no-half
It's A1111
ok let me check and see what I have in that one.
What graphics card do you have?
It might be
sdxl only works in the release_candidate branch of auto1111
Where do I find the graphics card?
Look: can we take this to DMs? Bc I don't want this chat getting crowded or drowned out
git switch release_candidate
git pull
Where do I input that?
hmm. in the terminal. you might want to wait for the model to be released fully instead of just throwing it at a random install that you don't understand. I could explain how git works to you, but there's the next question, then the next.
I mean, they're git commands. where do you figure? in the root of the folder. someone should tell you this. beta software requires some knowledge about how the software works, generally.
learn your tool chains. read the manuals.
i believe in you
Look: perhaps we should take this to DMs, and you can explain the step-by-step process in installing Stable Diffusion and guiding me through it
do i seem like a person who feeds people step by step?
Bc I've clearly been thrown into the deep end
sink or swim
How am I supposed to know how to use the program?
Look: do you know somebody who can help me with the installation process?
Hold on there are videos that can walk you through the whole process
/points at the entirety of the internet and its vast annals of knowledge, with a wide grand gesture
I would like to see images created with sdxl as an example of its power, where can I find them?
I've followed each and every one of those videos - step by step. Nothing seems to work
it hasn't been released yet right?
the #1100170312106127410 and other rooms let you make images
That is what brought me to my question... What graphics card do you have... If you don't have one it changes the entire process... If you do have one we need to know if it is an AMD or NVIDIA graphics card... These are questions I can't answer for you.
I'm trying to find my graphics card
I'm on my device manager. Where do I find my graphics card?
It would probably be something you would know about because they are really big and pretty expensive... It's not really something you would miss.
if you want the simple way of running SDXL get comfyui and the workflow pinned in this channel
No, I don't have a graphics card. It seems to be inside my computer screen, which also doubles as a monitor
It's basically a hybrid of sorts, but I don't think it contains a graphics card
Then you would need to follow the installation process to use the CPU and RAM instead of a graphics card... The next question is are you on a Mac or a Windows PC?
its an all in one. it's ancient
Windows PC
now we are getting somewhere
So, honestly unless this is very new and very powerful it is probably going to take a very long time to generate an image with your PC if it can even handle it... However you need to set command line arguments that force it to use the CPU and RAM and not to look for a graphics card.
Alright, so I would need to have Stable Diffusion installed into a CPU and RAM
not exactly
Yes, I got my PC a year or so ago. So it's brand-new
What command lines am I looking for?
The program gets installed on a hard drive. The program needs to know that it needs to use your CPU and RAM. The CPU is the brain of the computer, and the RAM is what stores temporary memory and helps the CPU
I will need to look up the command line arguments for A1111
I need a few minutes unless someone else knows what they are
Depending on your computer you will be best off using ClipDrop or the bot here to use SDXL as running on CPU and not GPU is really slow.
Alternatively you can use Google Colab.
I'm sorry I'm not trying to be confusing... There is a big learning curve though if you aren't familiar with your systems components
It's alright
Just give me 5 minutes and I will try to get those command line arguments and then we can go through those steps
Alright I'm re-installing Python, Git, and the Stable Diffusion 1.5 and checkpoint
You probably don't need to do that
Try to put this for the command line argument
--use-cpu all --precision full --no-half --skip-torch-cuda-test
I'll try
It should look exactly like this spaces and all
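To make the two setups in this conversation easier to compare, here is a small sketch that builds the `COMMANDLINE_ARGS` line for `webui-user.bat`. The flags are the ones quoted in this chat. The `commandline_args` helper is hypothetical, and dropping `--xformers` on CPU is inferred from the error message above, which lists `device=cpu` as unsupported.

```python
def commandline_args(has_nvidia_gpu):
    """Return the COMMANDLINE_ARGS value for webui-user.bat."""
    if has_nvidia_gpu:
        # xformers needs CUDA, so it only makes sense with an NVIDIA card.
        flags = ["--xformers", "--autolaunch"]
    else:
        # CPU-only flags exactly as given in the chat.
        flags = ["--use-cpu", "all", "--precision", "full",
                 "--no-half", "--skip-torch-cuda-test"]
    return " ".join(flags)


line = "set COMMANDLINE_ARGS=" + commandline_args(has_nvidia_gpu=False)
print(line)
```

The printed line is what replaces the existing `set COMMANDLINE_ARGS=` line in the .bat file.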
The one on the left is the one you use the LoRA on?
left is off. discord reversed the order i put them in.
both suck though it's just my prompts
So you want them to look like the one on the right? The blue one with 4 tracks?
towards the asteroid goal
It looks very futuristic
are you having any luck?
i threw 200 in-game graphics from the terran race from starcraft and a bunch of marketing art at a lora. but didn't tag any of it with starcraft words so it's more generalized.
I like it.
I'm waiting for Stable Diffusion and the checkpoint to finish downloading
I might try to learn how to train my own model or LoRA at some point. I would really like to see some High End Gundams.... If you know what those are you might just want to switch over to those from tanks... They are pretty badass
Alright, managed to clone everything on Stable Diffusion into a new folder
Awesome... Let me know when you edited your webui-user.bat file
I did
Copied everything into the commandline args
Now I just gotta wait until I can copy my CetusMix checkpoint into the models file
I did
i got mechs in the dataset. you've heard of gundam but not starcraft?
Let me know if you have any success with it.
Alright
I heard of it sure... I used to see it when I used to log into my World of Warcraft account. I never had any friends who played it so I never got into it.
If it still fails with the CetusMix checkpoint you should try it with the basic original Stable Diffusion 1.5 checkpoint... and if that doesn't work then it might just be that your computer cannot handle the image generation process.
Alright, I'll try. Even if it might not generate the results I'm looking for
right but at that point its really about finding out if your computer is even capable to do this or not and less about the specific results
Well, it's a pretty new computer that has plenty of space
Its more complicated than that
So it's just compatibility
Its about the parts in the computer... think about it like this.. Would you use a toyota prius to race a porsche? or would you use your everyday car to enter in a nascar race? The parts matter... It doesn't matter if its a brand new toyota prius if it doesn't have the parts to win the race... That is the best way I can think of it.
I understand
It’s starting to freeze. I think it’s working
It might take a really long time to generate one image but give it a chance.
I know
Let’s hope it doesn’t give me some kind of error message or whatever
And it looks like I might have to manually open Stable Diffusion once all of this has finished installing
Okay, it’s given me a couple of notices
That a new release of pip is available - 23.2
And if I need to update, I need to run: venv/scripts/python.exe -m pip install --upgrade pip
it feels messy. hmm.
oh that guy? you don't need to worry about him
Alright
https://www.youtube.com/watch?v=-u9zr6oF4uA thank you for allowing me an opportunity for a solid gold reference
Warning: caught exception 'Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx', memory monitor disabled
Uh... is this a problem? I got it in the cmd.exe file I'm running for the WebUI User program
well if you have no graphics card there would be no NVIDIA driver so at least the error makes sense
Added in "Dimentio from Super Paper Mario, masterpiece" into the prompt
Thanks for helping me out, @spring fulcrum
It takes a while to get a prompt that works perfect but at least now you are able to keep trying with the prompts
You're welcome... Happy to help.
Good luck with your images
Now I'm gonna figure out how to add in some styles to better improve my AI performance
Alright, any idea where I can find all kinds of styles I can use for Stable Diffusion?
prompt them. you just describe what you want. here's the style prompts the bot uses though
okay thank you
you can see images with prompts here. kinda see how different people prompt https://lexica.art/
while yeah, good example of a good prompt database, it's all skewed to their highly finetuned version nowadays. another problem is it's all geared towards the CLIP from sd1. One of SDXL's text encoders is CLIP and one is OpenCLIP
I think they're using 1.5 for the first time
mb
https://civitai.com/ @livid phoenix here's a popular host of custom models. Different models or adding LoRAs can impact style a lot
Wait... was I supposed to put the character checkpoint in the models folder? Or does that not really matter?
I'm not sure what you have. If you got a character LoRA, those go in a specific folder; textual inversions go in a different folder
I have cetusMix
Apparently it's a style checkpoint
models/Lora #Lora
models/Stable-diffusion #full model
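A small sketch of where each kind of file goes in an A1111 install, using the folder names from this chat plus the `embeddings` folder for textual inversions. The `destination` helper, the root folder name, and the file name are placeholders for illustration.

```python
from pathlib import Path

# Folder layout relative to the webui root.
SUBFOLDERS = {
    "checkpoint": Path("models") / "Stable-diffusion",
    "lora": Path("models") / "Lora",
    "textual_inversion": Path("embeddings"),
}


def destination(webui_root, kind, filename):
    """Return the path a downloaded model file should be copied to."""
    return Path(webui_root) / SUBFOLDERS[kind] / filename


# "example.safetensors" is a placeholder file name.
target = destination("stable-diffusion-webui", "lora", "example.safetensors")
print(target.as_posix())
```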
ok thx
It's a full model. Even LoRAs are checkpoints; it's pretty much the same as saying model
Okay, where do I find these
the folders to place your models? mine looks like this: sd\stable-diffusion-webui\models\Stable-diffusion, wherever your sd folder is. are you using A1111?
Yeah, I'm using A1111, and yes: I did figure that out. I put in the cetusMix style in there
No, I'm talking about where I find the Lora models I need
models/Lora #Lora
models/Stable-diffusion #full model
Where can I find these so I can put it in the folder?
I'm on Civit.ai
adjust filters
I did
If you're looking for a specific character or style, just try to search it directly on civitai. The results will have little text on corner telling you whether it is a checkpoint or lora or textual inversion (TI). Then you can adjust the filter accordingly if you're getting too many results
Okay
For characters and styles, Lora, TI and LyCoris will work for you
Got it
Another thing: how exactly do I make fusions of two or more characters in Stable Diffusion?
TI will go in embeddings folder
You can use two loras in your prompts. Just adjust their weights acc to your needs
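In A1111 a LoRA is pulled into a prompt with a `<lora:name:weight>` tag, so mixing two of them is just two tags with different weights. Here is a minimal sketch of building such a prompt. `with_loras` is a made-up helper, and the LoRA names are placeholders for whatever files sit in your models/Lora folder.

```python
def with_loras(prompt, loras):
    """Append A1111-style <lora:name:weight> tags to a prompt.

    loras maps a LoRA file name (without extension) to its weight.
    """
    tags = [f"<lora:{name}:{weight}>" for name, weight in loras.items()]
    return " ".join([prompt] + tags)


# Placeholder LoRA names and weights for illustration.
text = with_loras("1boy, portrait", {"characterA": 0.6, "characterB": 0.4})
print(text)
```

Raising one weight and lowering the other shifts the result toward that character.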
Alright
How do I make it so that I get male models?
more specifically: how do I edit my prompts such that I get male characters?
you can just prompt things together with the base model too, e.g. "bobby hill as spider-man". words describe what you want; it's text to image
