#✨|sdxl
this was my initial fave but 2m behaving ok too
not too slow
waifdoom
Best prompt to get products to position directly in the camera?
OMG
my training exploded on the last epoch lmfao
I have never seen anything like this
second to last epoch
last epoch lmfao
what in the everloving fuck happened in the last epoch lmao
I'm not sure if I can ask here questions about sdxl but I'll give it a try.
https://m.youtube.com/watch?v=zzA1iUgtiEs
Will this let me use sdxl more easily? I haven't seen the whole video yet due to university work and I won't be able to give it a look for a long while. Can anyone tell me if with this trick I'll be able to use my 2080S (8GB) with sdxl models? I know it's literally in the thumbnail but did someone try it?
Thanks :)
(Update: while the update is able to solve CUDA memory errors, I have seen it to be very slow with SDXL... it is not very practical to use with low VRAM... works, but slow, hopefully in the next update, we get better performance than the curre...
Can you suggest a good workflow to turn real life objects into anime style objects? When using my usual workflow, with high denoising the details are wrong and with low denoising they don't look very anime.
Have you tried using a controlnet?
not so far, can you suggest a guide/youtube video? I'm using automatic1111
I am sorry, but I gave up on Automatic1111 months ago and I am completely out of the loop for resources or videos. Maybe the others here have recommendations.
Maybe this video here, but once again, i have no idea. https://www.youtube.com/watch?v=gkkfdWukb5I
results look a lot better with controlnet compared to using img2img, thank you @noble shoal
You are welcome
In img2img workflows I've been taking the input image, sharpening it and adding a little noise, and the resulting quality difference is apparent. Easy trick
mannnnnpages. here's a real guide. https://github.com/Mikubill/sd-webui-controlnet#installation
Does anyone know if there is an SDXL ControlNet 'Tile' model around?
Dynamic angle| looking at the viewer| POV
centered
I really like that sampling->custom_sampling nodes exists in ComfyUI but idk what I'm doing. could anyone direct me to papers on samplers, schedules or something that could help me understand the options available ?
Thor reluctantly agrees to wear a wedding dress to fool the Utgard giant king in an attempt to retrieve his stolen Mjolnir hammer
Took a bit to get this image lol
shameless model post ^^ https://civitai.com/models/148871 finally decided to release a version 1.2 of my model 😄 and it's pretty fucking good. I've not had many fail cases
Thanks,
concerned ape
can't wait for some haunted chocolatier business
some haunted chocolatier business (this is the actual whole prompt)
i love it!
i feel like graveyard keeper inspired CA to make a spooky game that was more wholesome
Looks like she hates her side job lol.
generally speaking, should I always be batching?
as in?
maxing my batches per prompt?
Depends how confident you are you'll get the image you desire with your prompt or not
It seems to add 15 seconds for 3 images compared to 1
oh in that regard. I dunno. I can only do 1 image at a time
More is still better.
If I could do batches, I would
@visual glade Any plans to add LCM Scheduler to ComfyUI by chance? I have some ideas I would love to mess around with
I have a upscale node that I want batches in, how would I be able to connect these OR is there a node that I should be using?
My newest test realism LoRA absolutely fucken SLAPS
base vs with the LoRA
base vs with LoRA
the upscale image node should work with batches, or do you mean you want to upscale a batch of latents? not sure I follow
Nah this is my bad for not articulating it well enough.
So I swapped my nodes to handle img2img and I don't have any node that I can find that adds batches:
So I'm wondering how I would add batches to my prompts.
uhh do you mean like "Repeat Latent Batch" but for images?
I don't think there's a node for that by default actually
like so?
Maybe, it's a little confusing as repeat to me means it spins up a new prompt every iteration?
Or is it batching the entire process into one prompt and outputting 5 images like your example?
a batch like above would do something with all 5 images at once
Batch count in the menu would just queue up x prompts one after another
I'll try and find that node, would you happen to know if I attach it right off my input image?
if you're trying to upscale the image x times then I guess so, but that node doesn't exist in base comfy afaik
there's probably something like that in one of those huge packs
or I can put it on gist or something
Add that to custom nodes?
yeah, not in any folder, just next to example_node.py.example
I just found your custom node package and that's exactly what I need, thank you for your time and your nodes!
no problem (if you mean the archived repo then it's not really updated anymore but I guess if it still works that's good)
yep all golden, i did have to update comfy.
@bright valley I am now officially being sponsored by a company. They will be paying for all of my compute in the cloud, and we are discussing a monetary value for the work as a whole
As well as partnerships on many, much bigger things
So yes, you CAN be supported off of making LoRA's, granted their experience with my workflow gave me a big leg up, as well as my resources
My stance has not changed. Let me know when you've received a dollar bill.
but I hope whatever it is works out great for you
Comfy? or anyone. I really like that sampling->custom_sampling nodes exists but idk what I'm doing. could you direct me on what I should be searching for to learn more about samplers and schedules and what each variable means
Anyone run into comfyroll nodes broken with latest comfy update?
This has support for LCM scheduler now?
it's a sampler not a scheduler
I mean like sigma's and eta
Oh, interesting, alrighty
Thank you, I will mess with it in a bit when I am able to
I think most of my answers might be in diffusers documentation
is it weight or bias or something else that leans towards image / text prompt?
just regular ol' sdxl
@lusty moss Doom is an activation word in my model
It's very weak in the training, so I don't list it on the model page, but it still works kinda!
grockster's took heavy influence from it as well
@bright valley cool lora!
did someone have a workflow that would randomly choose an item from multiple text files to build strange prompts?
could really use one but struggling to figure out how to build it out
you just need a wildcard node, and then use {__file1__|__file2__|__file3__}
that will randomly select between the 3 files
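For anyone curious what that wildcard syntax is doing, here's a minimal sketch in plain Python. It is not the code of any particular wildcard node. It assumes `__name__` means "a random line from `name.txt` in some wildcard folder" and `{a|b|c}` means "pick one option"; the function name `expand_wildcards` and the folder layout are made up for illustration.

```python
import random
import re
from pathlib import Path

# Assumed conventions (hypothetical, for illustration only):
#   {a|b|c}   -> pick one of the options
#   __name__  -> pick a random non-empty line from <wildcard_dir>/name.txt
CHOICE = re.compile(r"\{([^{}]*)\}")
WILDCARD = re.compile(r"__([A-Za-z0-9_-]+?)__")


def expand_wildcards(prompt, wildcard_dir, rng=None):
    """Resolve {a|b} choices first, then __file__ wildcards."""
    rng = rng or random.Random()
    wildcard_dir = Path(wildcard_dir)

    def pick_option(match):
        return rng.choice(match.group(1).split("|"))

    def pick_line(match):
        path = wildcard_dir / f"{match.group(1)}.txt"
        lines = [ln.strip() for ln in path.read_text(encoding="utf-8").splitlines()]
        return rng.choice([ln for ln in lines if ln])

    prompt = CHOICE.sub(pick_option, prompt)
    return WILDCARD.sub(pick_line, prompt)
```

So `{__file1__|__file2__|__file3__}` first collapses to one of the three file references, and that reference is then replaced by a random line from the matching file.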
Thanks man, really cool stuff!
worked great, ty!
it works really well paired with other loras
Yes I agree, although I've only tried a couple personally, I see people make all kinds of wild stuff with them on discord/civit
yeah it doesn't seem to overpower other networks so easily. off doing its own thing in the xl sized latent spaces
oh this one was pretty cool too
LCM-LoRa + IP Adapter
Nice
tried SDXL for the first time today
https://huggingface.co/blog/lcm_lora "Latent Consistency Models (LCM) are a way to decrease the number of steps required to generate an image with Stable Diffusion (or SDXL) by distilling the original model into another version that requires fewer steps (4 to 8 instead of the original 25 to 50)"
yes... it works in ComfyUI if you're looking to try it https://gist.github.com/comfyanonymous/b948fd68aacc65c2ed37d743d0964f27
Save and drag or load into comfyui to get the workflow - Instructions.md
img2img fail, the face is obvious - look at the lion's front paws
fueled by coffee
LCM 22.90 seconds
A mini scare...
10.72 seconds
Harrologos2 LoRA is a great improvement!
Where to find the ipadapter+face SDXL?
Thanks, I googled but could not find it.
does making longer, more detailed descriptions also help in lora making like it does in model training
yes if it's a question
burning that midnight oil
That's bad ass
Has anyone had success or know of LoRa's for generating DnD battlemaps ? I have tried prompting for things like: top down view, birds eye view, digital battlemap, etc. with not great success.
https://x.com/krea_ai/status/1723067313392320607?s=20 this is the most astonishing UX I've ever seen
At the moment is there anything that supports transparency? If I use an image with transparency there's no workflows that will infer that and use it?
I don't click links, show pic.
It's a video clip of realtime image generation based off basic shapes
Is this with the new IP Full-face model they launched?
yeah, with kinda lower weights . here's one that better shows off IP Adapter using your face
Wow thank you so much man! Always love to see the creative stuff you make!
Aeroplanes are just as hard in XL as always.
An I said An Airplane
I am not even sure what those are anyway, lol
Original Juggernaut
young girl / young woman / woman / older woman
Juggernaut + My lora
young girl / young woman / woman / older woman
^^^ just did this Lora... https://civitai.com/models/196290
Engulfed_by is a surrealistic, artistic and fun way to pour some smoke into your generations. Lora was inspired by Explosive Dust lora for 1.5 but ...
Can any comfyui experts help me in #🤝|tech-support ? I'm getting desperate here...
instructions unclear. hair go brrrrrrr
jk. works great when not mixed with general-purpose loras
@boreal bough you early adopter you 🙂
oh hey Caith! long time no see / read
heyo 👋
Engulfed By black, Deep in the forest a girl stands surrounded by ghosts, lovecraftian, horror, still from the movie "the thing"
with / without
got to admit i'm a fan of the white sheets...
DMT elf, SDXL
retouched
Reminds me of bioshock infinite
It should as it was part of this genre style.
I never played it only the first two
I played through most of the first one, though that was on the switch years ago
infinite is more of a pure action game than a bioshock. fun and cool though
infinite was also on the switch
well, that is probably why I never grabbed it
I loved Bioshock
fell right into this style which I have loved for eons
BeeJeesus
I would love to use my goth lora on that one
Have u guys used FreeU?
Yay or nay?
in my opinion, it can be useful as a tweaking tool.
As in, you find an image you really like the general composition of, then you can fiddle around with the FreeU settings to keep the general feel and get variations of that (keeping the same seed and such)
But, that's not generally how I do my images, so I dumped it
So it serves the purpose of noise variation.
I tried it but way too much work. And makes images so bright. I thought I was doing something wrong
FreeU gives a great deal more detail at low iterations ...10, 15, 20
What are your settings for SDXL or is this something u have to tweak model per model like we do w cfg and steps?
There are some nodes which open more options for it, cannot remember what pack had it anymore
freebasing is bad. Oh, freeu I never tried.
I'd have to do some work - my FreeU was included in NerdyRodent's IPAdapter Setup/Workflow ...
As soon as I know I'll get back to you
All the info im getting is it sacrifices details for coherence.
Wait. I remember u. I think. When i was trying to do 2.0 embeddings
yes
What a waste of everyone’s life was 2.0. Tho guess positive side is it showed us what bad was so we don’t repeat it.
Well, kinda sorta. 2.1 was its own kind of training nightmare.
2.x I should say.
Never did manage to pop out a training on it from Jan until I went to XL
All 2.x 🚮
well, elongated necks in 2.1 and in XL alien, snake, or blind eyes
not sure how that is a thing in base XL
What u mean?
wonky eyes is a thing in XL
I never move off of base as I train on base only.
I could live with elongated necks in 2.1 too
Training models loras or embedding?
I haven't done an embedding in almost 10 months now
Kohya isn't the same as a1111 way
I need my nsfw and 2.x didn’t budge on that.
So loras?
loras and full models though I don't release those publicly just the loras.
my loras are up on civit under this name
Ok let me stalk you real quick and peep the loras. 😃
Looks nice. But may need more lora strength.
Ok now we are getting there
This lora is a bit under trained as it was just a test one
This was nice pic regardless
Oh. No wonder it can’t beat captain America
Needs to train more
I need an image of him to train on
The image I need would come from the first movie
before he froze
That’s another SDXL issue that white outline on pictures of people/characters
is that avail somewhere? awesome work!
How to use stable diffusion
wtf is this white sickness :X
SD is sadly all too white! 😦
"I can't see their eyebrows!!!" 😄
still white halo lol
she has white halo like lighting coming from behind yet there is none
heres a few without the halo
Anybody have a good auto cropping tool that sees subject to crop tighter in, while also using a given aspect ratio? With computer vision?
needs more baubles? 😆
I hope you aren't wearing suede boots in that gross water
Today we explore how to use the latent consistency LoRA in your workflow. This fantastic method can shorten your preliminary model inference to as little as 0.7 seconds and in only 4 steps using ComfyUI and SDXL. This will also make it a lot easier to run these models on older hardware and is just mind-blowing fast! Now, it isn't perfect, but...
From Scott Detweiler
so weird. I would not recommend that "hack". Every time you pass the vae the image loses fine details
does anybody know of any optimized cli scripts for running sdxl with diffusers? the vae uses so much vram even with tiling/slicing/offloading enabled compared to comfy I'm losing my mind
idk if I'm missing something
batch size > 1 seems to murder it
with BS1 it's pretty close
you could also try changing the vae sample count, I believe that affects vram
also make sure you enable offloading for the whole pipe not just the vae
else the unet won't get offloaded
👍 guessing sample count is supposed to be vae.sample_size
I think sample size is its target latent resolution
You have probably read this already but worth mentioning. https://huggingface.co/docs/diffusers/main/en/optimization/memory
I saw that but pretty sure some of the optimization stuff is outdated
I'm pretty sure I've seen another article about it recently. Can't find it though.
that's about it
yep thanks, will try in a second
oh that seems promising
just had a thought offloading the text encoders might be useful too, idk if that happens automatically
I think instead you need to change the count input into AutoencoderKL.encode().latent_dist.sample()
with model offload it does
when I use pixart it peaks at 12GB during text encoding and only 4.5 GB during sampling
I assume sequential offload also does too
I have the gen at 4.0gb now but the vae still hurts, going to try the sample thing
you have Nvidia right
yeah, 3060
hm
if you really wanted you could run each part of the pipe manually. like text_encoder.encode(), unet.sample() and shuffle the models around in vram as you like
instead of just calling the pipe which runs them all at once
if it's just for SDXL should be easy to do
could even call torch cache clears in between
that's an option I guess, yeah
so you can see exactly what's using vram
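A rough way to see which stage is eating the VRAM, sketched with PyTorch's CUDA memory counters. The helper names `vram_peak` and `fmt_gib` are made up here; the `torch.cuda` calls are real, but note they only count memory allocated by PyTorch tensors, so the number in `nvidia-smi` will usually be higher.

```python
from contextlib import contextmanager


def fmt_gib(n_bytes):
    """Format a byte count as GiB with two decimals."""
    return f"{n_bytes / 2**30:.2f} GiB"


@contextmanager
def vram_peak(label):
    """Print the peak CUDA memory allocated while the wrapped block runs.

    Needs PyTorch and a CUDA GPU; torch is imported lazily so fmt_gib
    stays usable on machines without it.
    """
    import torch

    torch.cuda.empty_cache()              # drop cached blocks from earlier stages
    torch.cuda.reset_peak_memory_stats()  # start the peak counter from zero
    try:
        yield
    finally:
        torch.cuda.synchronize()
        print(f"{label}: peak {fmt_gib(torch.cuda.max_memory_allocated())}")


# Hypothetical usage, with `pipe` being an already loaded diffusers pipeline:
#   with vram_peak("sampling"):
#       latents = pipe(prompt, output_type="latent").images
#   with vram_peak("vae decode"):
#       image = pipe.vae.decode(latents / pipe.vae.config.scaling_factor).sample
```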
but the vae specifically you can run alone by setting the pipeline output to latent
since that's the main issue
ohhhh
one thing
in SDXL diffusers forced the VAE to fp32 by default
where comfy I think uses bf16
so call autoencoderkl.to(torch.bfloat16) before you encode
oh I think that might be the issue
yea. there's an issue on diffusers git to do that automatically but it hasn't been touched in a while
so I'd return latent from the pipeline then manually cast the latent and vae to bf16 and decode
so like 3 extra lines of code ig
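Roughly what those extra lines could look like with diffusers, as a sketch rather than a tested recipe. The model id is an assumption (the chat above was actually using ssd-1b), the function name is made up, and bf16 needs a GPU that supports it. Imports sit inside the function so nothing gets loaded until you call it.

```python
def generate_with_bf16_vae(prompt, model_id="stabilityai/stable-diffusion-xl-base-1.0"):
    """Sample to latents, then decode them separately with the VAE cast to bf16."""
    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        model_id, torch_dtype=torch.float16, variant="fp16", use_safetensors=True
    )
    # Offload the whole pipe (text encoders, unet, vae), not just the vae.
    pipe.enable_model_cpu_offload()

    # Stop before the VAE: the pipeline returns latents instead of images.
    latents = pipe(prompt, output_type="latent").images

    # The extra lines: cast VAE and latents to bf16, undo the latent scaling, decode.
    vae = pipe.vae.to(dtype=torch.bfloat16)
    latents = latents.to(torch.bfloat16) / vae.config.scaling_factor
    with torch.no_grad():
        decoded = vae.decode(latents).sample

    return pipe.image_processor.postprocess(decoded, output_type="pil")[0]
```

Because the pipeline never reaches its own decode step, its fp32 upcast of the VAE is skipped and the decode runs in whatever dtype you cast to.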
I tried torch.inference_mode and it got the total vram usage to around 7.5gb, still a bit too much
that's with decoding the latent separately too 
bfloat16 and all?
yeah, afaics it's already all correct dtypes, I wanted to try sequential offload but getting some error about it not being implemented
maybe because I'm using ssd-1b
oh, I had custom_pipeline="lpw_stable_diffusion_xl" for longer prompts, sequential offload works without it
the only issue is that's twice as slow now, but the max vram usage is 2.3gb so that's certainly something
I think I tried compel but this one seemed easier, will take a look at it again
just copy paste their XL example and it seems to work
if your script runs on multiple models just check for the tokenizer and text encoder attribute on the pipe first
https://x.com/ShouldHaveCat/status/1723328495831425374?s=20 clever/smart use of most likely animatediff
Damn, just realizing the fp16 vae has the weird rainbow effect issue.
Hello guys, I'm going to do a full finetuning for SDXL to learn the aesthetic. I'm wondering if there is a need to finetune the text encoders of SDXL as well? Or do I just need to finetune the UNet? Some people told me they always disable the training of the text encoders of SDXL when doing a full finetune. Many thanks
I want to introduce a brand new node that was just added by Comfy to his stable diffusion system this morning, it's called FreeU. The concept here is you are able to change some of the underlying contribution mechanisms of the u-net, and this is the core of stable diffusion. The results tend to be much better, and it doesn't slow us down or co...
Adds extra detail in the high frequency range
how can i create a loop that will take a generated image and put it in a load image node?
?
it's for img2img style workflow..
so, i want the image to be loaded automatically
oh, you want that node to go through images in a folder?
no such thing exists that I have seen
nope
i want something that sends the outcome of a workflow to the beginning.
i remember there was something like that with outpaint, but perhaps reading file names is the best option
Try this
good ole was. thanks
Just released this
https://civitai.com/models/197757/aether-pixel-lora-for-sdxl
Aether Pixel - making stuff fall apart into pixels.
since launch
Bloody Nora!!!
Not intended but not bad...
What's the min VRAM for sdxl to work? My buddy has 8gb, but a1111 crashed while trying to load the model in.
i got trouble with comfyui + controlnet, anyone here who could maybe help me a bit?
it's fake. i can tell because of the pixels
this looks real though, right?
💯
exaggerated clavicles give that one away INSTANTLY. can't fool me
too bad. now I have to stop making finetunes forever.
This is fake... but a good one.
you are joking, right?
I'm very serious, and don't call me Shirley
Wow, i had to google that reference. Nerd. 😅
hi guys ,
any1 got any idea if i want to create a design for a room but for the windows i only want to show a blue sky or something, any ideas how to control the windows?
That’s not realistic sweat.
Blue skies visible through the windows
Bruh
Get ready for getting banned like never before.
@uncut steeple
Is so beautiful 🥹
Close-up vertical industrial fan, lumpy mud turds, extreme hyper-realism, painterly shading, exaggerated mud splashes, rich colors, dramatic contrast lighting, texture highlight, 1986 novelty card illustration style, conspicuous shading
it really hit the fan
any of you got experience with comfyui + controlnet ?
new qrcode controlnet is out. combining that with a little pretext rendering leads to great guidance with harrlogos. used this site to make some ez text https://www.textstudio.com/logo/white-text-on-black-background-134
using other controlnet models with harrlogos or just controlnet on their own, doesn't get as good text effects
i think the new qrcode monster was trained with some text knowledge too so that might be where it's coming in handy
was the source i used. put some 3d effect on it for cheese but it didn't do much
Do you have the link for the weight?
Wasn't it a lora? or it is a control lora?
ohhh you mean the monster qrcode thing
yeah
thank you.
Does anyone know where to find sdxl openpose and lineart controlnet models?
I found 4 but they didn't work at all
I tried 4 from those and they don't work.
t2i makes everything green and ignores the controlnet, and kohya doesn't listen and ignores it entirely.
..thanks.
im not using preprocessors, im just using the model. i have the outline/openpose thing of what i want but it's not listening
before you rush to declare the models not working, maybe consider there is a layer 8 issue
for the open pose, i used sai's openpose control lora, and for the lineart i used the t2i-adapter for lineart
in my experience though, most models work to some extent.
haha it's an old techy term. the roots are in the 7 layer OSI model, which defines a technical stack. good to know for troubleshooting. layer 8 is that layer where the end user exists. it was just made up by pricks like me to be like "maybe it's layer 8?"
..oh.
this basically?
ignore my weights i change them all the time
yeah doing it without preprocessors works for me too
well looks like sai's openpose is too heavy, just crashed out.
thats actually the light weight one
wait, where did i get my open pose from? says sai in the filename but they never released one
oh youre right
well i mean, i'm using it fine, so.....
wtf do i have then
can u see ?
ohhh the file name doesn't say sai, it says controllora, which my brain just added sai to
yeah it's only 750mb compared to other controlnets that's light
they've got the actual controlnet for openpose on that and its 5gb
then why do some of them say 42mb on the collection you sent me?
OH YEAH those are the kohya control lite ones. i haven't messed with them yet
they don't work very well, they ignore everything.
does the extension support them?
It reads them so i think so?
it shows up in the dropdown when you click on the box to select a thing
it'll show any safetensor in that folder
https://huggingface.co/kohya-ss/controlnet-lllite manpages. says it's all good
always go to man pages
okay gonna try this
if it works for you i'm gonna get a little upset lmao
he's not shrugging but it's following it. i got a few of these
not bad for 45mb and its trained on anime too
works fine with shrek. vader just doesn't shrug it would be too powerful i guess
what weights are you using?
1
look at your logs when it doesn't work. might give you some indication that cnet failed
Scar Won Edition sickkkkkk
By combining the LCM Lora and TensorRT extension I made 320 images @ 512x512 in just over 1 min.
any suggestions on what prompts would be good at making realistic, not deformed eyes?
use adetailer to help
thanks! 🙂
give that eye an adjective. "cool eyes"
thanks! trying this out too!
Each time I select sdxl_vae it switches back to Automatic
newbie here;)
this looks super cool, i'd love to hear more about it
Anyone know why some images have issues with upscaling tiles? And if there's a solution?
depends on the upscaler and the denoising level, but it's a common problem when using tiles
only way to get rid of them is by giving the finished img a pass on img2img with low denoise like 0.2
Thanks mate, appreciate it, will test it out.
Generated in 4 steps. (LCM lora + Euler A sampler in A1111.)
The lora works for any SDXL model (I've tried a few, and all worked perfectly). You can get much higher speeds and better quality in ComfyUI, since it supports the LCM sampler and handles image memory buffers better, but it still works in A1111 with Euler A.
Less than 2 seconds per image.
is there an article out there about how the prompt is converted into.... I don't even know what comes after the text input!? like how it works step by step
Which interface are you using? You mean like how it gets turned into the image or a pipeline/workflow for say ComfyUI?
code, turned into whatever the gpu can understand
like "a photo of", how does that help if everything is weighted on keywords?
I'm sure there are but I am unaware of all of that. Believe it gets tokenized then read by the GPU but unsure really.
I think the model is what dictates the "words" interpretation by tags applied during training.
I see people using a very long "story" as a prompt input, resulting in a mess
I believe there is no LLM so everything must be weighted keywords
I haven't had much success with the long stories.. this is the prompt for the above image.
(Robot Biomechanical Lizard:1.2), (Gold gears:1.05), (Red Wires:1.2), (Steam powered:1.4), Medium Close-Up, Majestic Lighting, (CGI, Pixar:1.2), (8K, Realistic, Hyperrealism, Depth of Field, 85mm lens, F/4.6:1.15), (ultra detailed, ultra accurate detailed:1.1), (Bokeh, Bokeh Lighting:1.1), (surrealism:1.05), (Victorian:1.1)
the input to SD isn't an LLM, but it is a language model (CLIP is 'Contrastive Language-Image') and it does actually prefer written sentences rather than keywords, especially SDXL
so @lusty moss 's prompt could be better with a sentence?
The text goes into the CLIP Tokenizer, which just converts words to numbers, then the tokens go into CLIP itself which creates an embedding approximately halfway in space between text and an image, and then the UNet takes the embedding and diffuses a latent image, which the VAE then converts to an image
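To make that chain a bit more concrete, here's a small sketch that only tracks tensor shapes through the stages for SDXL base. It runs no model. The function name is made up; the numbers (77-token CLIP context, 768 + 1280 embedding width from the two text encoders, 4 latent channels, 8x VAE downscale) are the standard SDXL base values, and UIs that accept longer prompts chunk them in their own ways that this ignores.

```python
def sdxl_stage_shapes(width=1024, height=1024, batch=1):
    """Return the tensor shape at each stage of text -> image for SDXL base."""
    if width % 8 or height % 8:
        raise ValueError("width and height must be multiples of 8")

    context_len = 77        # CLIP context window, in tokens
    embed_dim = 768 + 1280  # CLIP ViT-L and OpenCLIP ViT-bigG outputs, concatenated

    return {
        "token_ids": (batch, context_len),                   # tokenizer: words -> numbers
        "text_embedding": (batch, context_len, embed_dim),   # text encoders
        "latent": (batch, 4, height // 8, width // 8),       # what the UNet denoises
        "image": (batch, 3, height, width),                  # VAE decode output (RGB)
    }
```

For a 1024x1024 image the UNet works on a 4x128x128 latent, which is why the VAE decode at the very end can be the VRAM spike.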
this is the worst prompt conversion ever, i need to slap my LLM, but a robotic biomechanical lizard, with gold gears and red wires visible on its body. The creature is steam-powered and has a majestic lighting effect around it, giving off an otherworldly vibe. It's rendered in CGI with Pixar-like quality and is shot using a 8K camera, capturing every detail of the creature. The image uses an ultra detailed approach with hyperrealism to create depth and realistic lighting effects using a 85mm lens at F/4.6, creating an immersive experience for the viewer. The image also features bokeh lighting and surrealism to add a dreamlike quality, while still maintaining its realistic nature. Lastly, the creature is set in a Victorian-style environment with intricate details and rich colors to complete this stunning image.
This giant story mess still produces very clean results, albeit it's a wall of nonsense text so not quite there
good stuff thanks! I'll try to dig into reading the code
hand-rewriting it to a more sentence format: a medium closeup of a victorian steampunk robot biomechanical lizard with gold gears and red wires, Majestic Lighting, (CGI, Pixar:1.2), (8K, Realistic, Hyperrealism, Depth of Field, 85mm lens, F/4.6:1.15), (ultra detailed, ultra accurate detailed:1.1), (Bokeh, Bokeh Lighting:1.1), (surrealism:1.05) works great
Good info, you use a LLM to convert it from what i pasted?
Generally you want text that vaguely resembles image descriptions on the internet, ie a short sentence describing it and a few keyword meta-tags on the end. There's a lot of photography on the internet in the format a picture of [object] at [place] during [thing], [camera type], [location], [date] more or less
that was my first attempt there ye but the LLM did a poor job
oh so even the camera type is trained?
(StableBeluga2-13B with a simple preprompt of a few prompt cleanups)
ye
🤯
in the prompt that was used above 85mm lens, F/4.6 is camera details
the model absolutely understands those (to a degree)
honestly even if you give incorrect or dumb camera details it still "helps quality" to a degree cause the subset of images on the internet that have camera details are statistically more likely to be very high quality professional photography
(it's available on the bot in the r/StableDiffusion discord btw)
so let's say "a sun in the top left corner" is possible? like where to place object relative to the frame
I think the model understands general terms like 'slow shutterspeed' or 'high iso'
I have seen the model place things if you mention left side or right side, or on one side, etc, but it doesn't put them where you want, so kind of like a child that doesn't know their left from right 😄
@wicked frigate TY for the info for sure!
take story prompts like that and paste them after "print a terse version of this prompt, keeping as much of the detail as possible but properly formatted for stable diffusion " e.g. Robotic biomechanical lizard with visible gold gears, red wires, steam-powered, majestic lighting, otherworldly vibe. CGI, Pixar-like quality, 8K camera, ultra-detailed, hyperrealistic, 85mm lens at F/4.6, immersive, bokeh lighting, surreal yet realistic. Set in a Victorian-style environment, intricate, rich colors
but can still be useful if you say 'on left side fire and the right side ice', even if it gets the sides wrong it still helps
oh also! other prompting tip:
when grabbing prompts from other people
google image search the words they use
for example, go google image search Hyperrealism and ask yourself whether those images look like something you actually want here
(answer in this case: probably no)
"Hyperrealism" is a specific art style, and not just a vague "more realistic plis" request
my favorite magic word for quality in 1.5 was absurdres (for absurd resolution). supposedly this was the tag that early AI upscaler output (posted to anime forums) was indexed with. it really works, it ends up being a whole look
Anywhere that has good cliff notes on things like this?
https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Features this is less rules-of-thumb and more straight documentation but it's helpful to know all this stuff
how do you get to write this way
Do you mean the syntax?
Yess
If you type something like Dog, put your cursor in the text, say between D and o and press CTRL+UP/Down Arrow. It will "weight" the word.
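If it helps to see the `(text:1.2)` syntax pulled apart, here's a minimal parser sketch in Python. It only handles the explicit, non-nested `(text:weight)` form. The real A1111 parser also handles nesting, bare `(text)` (x1.1), `[text]` (/1.1) and escaped brackets, none of which this covers. The name `parse_weights` is made up.

```python
import re

# Matches "(some text:1.2)" with no nested brackets inside.
WEIGHTED = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")


def parse_weights(prompt):
    """Split a prompt into (text, weight) pairs; unweighted text gets weight 1.0."""
    parts = []
    pos = 0
    for match in WEIGHTED.finditer(prompt):
        plain = prompt[pos:match.start()].strip(" ,")
        if plain:
            parts.append((plain, 1.0))
        parts.append((match.group(1).strip(), float(match.group(2))))
        pos = match.end()
    tail = prompt[pos:].strip(" ,")
    if tail:
        parts.append((tail, 1.0))
    return parts
```

A weight above 1.0 asks the model to pay more attention to that chunk, below 1.0 less; CTRL+Up/Down in the UI just edits that number for you.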
hand-rewriting it to a more sentence format: a medium closeup of a victorian steampunk robot biomechanical three - headed dragon with gold gears and red wires, Majestic Lighting, (CGI, Pixar:1.2), (8K, Realistic, Hyperrealism, Depth of Field, 85mm lens, F/4.6:1.15), (ultra detailed, ultra accurate detailed:1.1), (Bokeh, Bokeh Lighting:1.1), (surrealism:1.05) works great
copy/pasting messages isn't how you generate - go to #1100170312106127410 to use the gen bot
What are your fave SDXL words/adjectives so far?
You can control this with regional prompting
I'm stuck, how do I implement this stuff for CLIP-ViT-B-32-laion2B-s34B-b79K in Clip Vision (G)?
that question doesn't particularly make sense? 0.o
that's an entirely different clip model
I'm finding it difficult to find a file of that name ...
what
I have d/loaded pytorch.model.bin and open_clip_pytorch_model.bin - but don't have a clue where to put them
put them for what?
these are models that have nothing to do with anything SD related
Its an IPAdapter setup by NerdyRodent
IPAdapter does not use Vit-B
Or even Scott Detweiler
they have a ViT-H and a ViT-bigG
in either case, standard ip-adapter software autodownloads the clip model for you, you don't do it manually
(or where they require you download models, you download the IP-Adapter model, not the source clip)
eg you'd install this nodepack https://github.com/cubiq/ComfyUI_IPAdapter_plus and these models https://huggingface.co/h94/IP-Adapter/tree/main/sdxl_models
I know, and I am finding it difficult to locate and d/load CLIP-ViT-B-32-laion2B-s34B-b79K
there is no vit-b to download that is not a relevant model
OK, but Scott Detweiler uses it ...
Scott Detweiler does not use any vit-b model
in his video on IP-Adapter https://www.youtube.com/watch?v=xzGdynQDzsM, he uses the ip-adapter official models
which notably are the standard (g) and the alternate (vit-h) models, absolutely no vit-b in sight
he also in that video for some reason separately loads a vit-h clipvision
I know, and that ViT file I cannot find
you don't need that and shouldn't use that idk why he does
either way it's H not B
all model links are @ https://github.com/cubiq/ComfyUI_IPAdapter_plus#installation
(you should be using the bigG models not the H ones)
(if you use StableSwarmUI the bigG clipvision literally downloads itself for you)
Yeah, he uses this vit h - so he says the laion version also needs to be vit-h
you should just use the standard (non-h) version
It worked pretty well. I had a brain-fart - I Symlinked my Models folder - forgetting that Symlinking at first empties your folder!!!
So I'm having to re download it all
I'll maybe try StableSwarmUI
Where's aka Twigs?
Something about Microsoft dworking dotnet is stopping me installing StableSwarmUI
OK, the zip file worked whereas the git clone version would not
Has anyone tried training LoRA's for LCM?
you shouldn't be downloading the repo directly, it has install instructions in the readme
there's a oneclick installer
LCM itself is a lora now, meaning you can just add the lcm lora on top of any other model/lora you like
I got an LCM LoRA giving me SD top quality every 2.5 seconds
10 Steps, cfg 5
I want to change up to SDXL ...
Running an 8Gb VRAM RTX 2070
Just tried it. Amazing
LCM LoRA - each image at 512x512 took 2.5 seconds/10 steps/cfg 5
I gotta admit, I generated 1500 pictures yesterday ... ! I randomized the Style Selector - some good - some weird!!! 🙂
this is the best one
that would be great havent tested it with sdxl so share results when u do🙏
Each picture took 2.5 seconds!
I must admit, this LCM LoRA is Breast-Centred - almost 75% of the output is NSFW
🤩
SDXL with LCM lora, 10 steps
trying some lora stuff with this lcm animation
Yeah I know, But I was just wondering if trained LCM LoRAs give better Quality than loading LCM LoRA with older loras ? Since they have provided a training script in diffusers repo. But the script doesn't work.
Where to get this node, especially the LCM sampler?
It's natively integrated into ComfyUI now, so you may need to update to the latest version. Check out this sample workflow in how to use it: https://gist.githubusercontent.com/comfyanonymous/b948fd68aacc65c2ed37d743d0964f27/raw/a7042ace206ff20eac38fb23cb79b5d841b4bc8c/lcm_lora_workflow_attached.png
let me know if you have any questions
2432x1664 in 21 Seconds. With a 4 Step LCM Upscale / Highres fix. (RTX 4070). Total sampling time: 7 Seconds. Total Steps: 10 (Base + Upscale).
It says there is an item in the queue, but nothing's happening!
did you check that all the models are available in the nodes? if it's missing something the node should have a red border
It's OK - I had sd_xl_base_1.0 and it was looking for sd_xl_1.0
I got it to work - 5 steps, cfg 1.8 - good quality SDXL in 7 seconds
Down from about 24
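Outside ComfyUI, the same "LCM is just a LoRA on top of your model" idea can be sketched with the diffusers library. This is an assumption-heavy sketch, not the workflow used above: the two Hugging Face repo ids are my assumption for the base model and the LCM LoRA, and the step/cfg values are simply the ones reported in this chat. The imports are kept inside the function so the file can be read without a GPU stack installed.

```python
# Settings reported in the chat for SDXL + LCM LoRA (5 steps, cfg 1.8).
LCM_STEPS = 5
LCM_CFG = 1.8

def build_sdxl_lcm_pipeline(device: str = "cuda"):
    """Load SDXL base, swap in the LCM scheduler, and stack the LCM LoRA on top."""
    # Imported lazily so this file can be imported without torch/diffusers present.
    import torch
    from diffusers import DiffusionPipeline, LCMScheduler

    pipe = DiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",  # assumed base checkpoint
        torch_dtype=torch.float16,
        variant="fp16",
    )
    # LCM needs its own scheduler; the LoRA alone is not enough.
    pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
    pipe.load_lora_weights("latent-consistency/lcm-lora-sdxl")  # assumed LoRA repo
    return pipe.to(device)

def generate(pipe, prompt: str):
    """Run one low-step generation with the settings from the chat."""
    return pipe(
        prompt,
        num_inference_steps=LCM_STEPS,
        guidance_scale=LCM_CFG,
    ).images[0]
```

Usage would be something like `generate(build_sdxl_lcm_pipeline(), "a steampunk dragon")`; whether it fits in 8 GB of VRAM is not something this sketch checks.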
This is waaaay cool!
Upscale times are also reduced I see?
if you are doing latent upscale, yes. but pixel upscaling is not accelerated.
LCM's performance is great, but the image quality loss is rough. I think it's great for real time applications, exploring animation and experiments. it's very impressive, just not something you would currently use to push image fidelity.
Yeah I was hoping quality was the same, but I can't use it for final images.
But can be used to check quickly if a prompt generally works
The eyes are the only casualties of the speed ... much worse than "slow-SDXL"
Some cool examples, Melting Candles and Shaun Tan style LoRAs with Dynavision model
Kewl
this is fr?
what is lcm?
wrong file 😄
LCM-LoRA - SDXL Acceleration Module! Tested with ComfyUI, although I hear it's working with Auto1111 now! Step 1) Download LoRA Step 2) Add LoRA al...
right now I'm using LCM to explore motion design animation (WARNING: flashing lights and lots of motion)
Even kewler
too bad MP4s do not loop in discord. here's a low fps gif
https://youtu.be/wTTlRNQ2x9M?t=3479 animated motion masks via blender
Guys, why is inpainting so slow with SDXL? I use Stable.Art and AutoPhotoshop-SD, same problem on both, inpainting is so slow, while with a 1.5 model, it's really fast.
I can easily generate txt2img with SDXL, but inpainting is slow as hell
because sdxl is bigger than 1.5
if you have a 3090 or 4090 its fast
(not because each image is fast, but you can batch generate 4 images in one go - takes like 10~25 seconds depending on your settings and step count)
i dont think youre on the right website bud
how did he get in there
stable diffusion v1.6 is set to auto-redirect to sdxl base on huggingface XD
apparently enough people had that question that an auto-redirect was set up
Of course, I know all about that.
~~But the areas I select are not large.
With Lasso/selection, I take a very small area (in pixels), a value that is much smaller than what I generate in txt2img.
It takes me 52 sec to generate a 1024x1024 txt2img image.
For inpainting, I don't get away with less than 15 minutes.~~
Don't listen to what I said. It's beyond comprehension. Before putting --no-half in arg, using inpainting with SDXL on Stable.Art caused an error and didn't work. With Auto-Photoshop-SD, it was extremely long and buggy.
I've just removed --no-half, and it works much faster than before...
the area you select might not matter at all
depends on the chosen tool, but most tools will do an img2img on the complete image
oh yeah :/ no half go nom nom on your vram
even if you inpaint a tiny region it will denoise the complete image
Trust me, no xD
My poor 2070S could not support my hell PSD at 8000x4000 pixels.
It just selects the zone used
as I said, it depends on the tool
Yeah, i'm talking about Inpaint ones
some tools will do inpainting on a subimage
but even then the subimage will be "native" size
which is 512x512 in SD 1.5 but 1024x1024 in SDXL
(if the tool is good. Many tools are stupid and run SDXL with 512x512 which will heavily impact quality)
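To make the subimage point concrete, here is a rough sketch of how a tool could pick the crop it actually denoises: grow the selection to the model's native size and clamp it to the image. `inpaint_crop_box` is a hypothetical helper for illustration only; it is not how Stable.Art or Auto-Photoshop-SD are known to work.

```python
def inpaint_crop_box(mask_box, image_size, native=1024):
    """Expand a mask bounding box to a native-size crop clamped to the image.

    mask_box:   (left, top, right, bottom) of the selected region, in pixels
    image_size: (width, height) of the full image
    native:     side of the model's native resolution (512 for SD 1.5, 1024 for SDXL)
    """
    left, top, right, bottom = mask_box
    width, height = image_size
    # The crop is at least native-sized, but never larger than the image itself.
    crop_w = min(max(right - left, native), width)
    crop_h = min(max(bottom - top, native), height)
    # Center the crop on the mask, then clamp so it stays inside the image.
    cx, cy = (left + right) // 2, (top + bottom) // 2
    x0 = min(max(cx - crop_w // 2, 0), width - crop_w)
    y0 = min(max(cy - crop_h // 2, 0), height - crop_h)
    return (x0, y0, x0 + crop_w, y0 + crop_h)
```

With a 100x100 px selection on an 8000x4000 canvas this still yields a 1024x1024 crop for SDXL versus 512x512 for SD 1.5, i.e. four times the pixels to denoise for the same tiny selection.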
if he ran out of vram though, it may have defaulted to cpu ram x_x so... that's like 2 min to generate an image
yes, it's probably a vram issue. I just say that inpainting in SDXL will always be slower, just because it will run on larger subimages
Obviously, it depends on the settings and how the plugins work.
Personally I'm really disappointed with Auto-Photoshop-SD. Stable.art does a really BETTER job.
I don't know any of these tools, so I cannot help 🤷♂️
still a single creator behind Auto-Photoshop-SD, who's probably doing it as a side project
photoshop generative fill is kinda unbeatable for me atm XD
so not like I'm in the market for plugins
local or cloud?
You're missing the best part. Firefly can go back to bed with all the crazy models in SDXL. Having that in Inpainting, with Photoshop's blend masks... Makes everything so easy and amazing...
depends on purpose. for real life applications, such as photo retouching, fixing areas in paintings, removing unwanted elements in artwork, the generative fill based on firefly is near instant and has not failed me once yet. It's only when i try to misuse it as a stablediffusion alternative that it falls flat on its head - but that's also cause that wasn't the intention. The intention is photographers or people at corporations who need to fix small areas from renders, mockups, blueprints, etc...
if you want a stable diffusion alternative to compare at the enterprise level, then that's Dall-e 3
offtopic xD I made an old comic style lora
damn thing took 900 manually tagged images to complete
poor eyes & hands took forever to get right
SD does this very well too, except that it lets you use models of all categories. I did this very quickly too. You just need the right prompt 🙂
don't get that. You mark the mouth but it changes the whole face
you selected the mouth but regenerated the whole face somehow 😅
guess it's not like content aware fill where it only needs the surrounding area
It's several layers and different generations, I simply deactivate all the layers to show the before/after.
Alt+click on a layer to show only it, in this case I do Alt+click on the base image, without modifications.
well, be that as it may it's nice to see a photoshop plugin that doesn't connect to some saas
which model is this? can't find it on civi
Not 100% certain, but probably the sdxl QRMonster release
https://huggingface.co/monster-labs/control_v1p_sdxl_qrcode_monster
I haven't looked at all into it, but I know qrmonster was working on their sdxl, so I just quickly checked that HF page and saw that it was updated 2 days ago, so I'm just assuming it's somewhere there
Bulma san
@noble shoal Do you mind if I DM you a question about one of your LoRA's?
Does anyone know where I can find the ModelSamplingDiscrete node for ComfyUI? Been searching and can't find it.
it's a new native node of comfyui. you probably need to update
I thought so.. trying to instruct someone else to use my workflow and they couldn't find it, even after update..
refreshed the browser after restarting comfyui?
Hey @lusty moss. do you have to be on dev channel of comfyui?
Advised the same lol. Thanks for confirming I'm not crazy!
yea i did
Default channel
hmmm
Ironman is sad, his lasagna is ruined,
his son is a little.. slow. its okay , he works every day to teach him to walk. His mother drank a lot.
normally it should work, but maybe the update fails because of it
FETCH DATA from: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/node_db/dev/custom-node-list.json
Update ComfyUI
Cmd('git') failed due to: exit code(1)
cmdline: git stash
stderr: 'error: 'output/_output_images_will_be_put_here' is beyond a symbolic link
fatal: Unable to process path output/_output_images_will_be_put_here
sounds like it
cmd git failed
that's what it updates with
ok
i ll kill the symlink
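If it's not obvious which path git is tripping over, a quick way to list every symlink under the ComfyUI folder is sketched below. `find_symlinks` is a throwaway helper written for this thread, not part of ComfyUI or ComfyUI-Manager.

```python
import os

def find_symlinks(root: str) -> list[str]:
    """Return paths under root that are symbolic links (files or directories)."""
    found = []
    # followlinks=False (the default) keeps os.walk from descending into linked dirs.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                found.append(path)
    return sorted(found)
```

Calling `find_symlinks("ComfyUI")` from the parent folder would list a linked `output` directory like the one in the error above.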
you can specify your own output directory using the command line parameter --output-directory
that never worked for me
great news though
it updated
got the node @lusty moss
need to try --output-directory
@upbeat summit Thank you!!
np! rafraf fixed it 😉
yea thanks both of you
Iron man happy in the mushroom kingdom
that looks amazing
not sure how whitespace characters are handled but it should work
i seem to remember, that the output path needs to be on the same disk volume? is that right?
annoyingly, my dropbox is on e: and comfy on d:
that I don't know. I haven't used the option myself in a while
but with some image save nodes (WAS for example) you can specify a complete path in the node widget with a drive:\ and I know it worked - at least a couple of weeks ago.
ok might be worth trying that
you could also try creating a symlink inside your ComfyUI\output folder. ComfyUI\output\dropbox and save everything there. all save image nodes should support subdirectories
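A minimal sketch of that suggestion, assuming you want `ComfyUI\output\dropbox` to point at a folder on another drive. `link_into_output` and the default name `dropbox` are just illustrative; the paths are yours to fill in, and on Windows creating symlinks may need Developer Mode or an elevated prompt.

```python
import os

def link_into_output(output_dir: str, target_dir: str, name: str = "dropbox") -> str:
    """Create output_dir/name as a symlink pointing at target_dir and return its path."""
    link_path = os.path.join(output_dir, name)
    # Refuse to overwrite anything that is already there (file, folder, or link).
    if os.path.lexists(link_path):
        raise FileExistsError(f"{link_path} already exists")
    os.symlink(target_dir, link_path, target_is_directory=True)
    return link_path
```

Anything a save node writes into the `dropbox` subfolder then lands in the target folder, while `output` itself stays a normal directory.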
Any recommended checkpoints/LORAs or methods to get good line art? Tried with Corel Draw 2023's Auto-trace
high resolution found footage of a demon in a dark forest at midnight
high resolution dslr photograph of a white devil with glowing red eyes in a blizzard
I am just so sexy it hurts.
I dare anyone to run adetailer on this
medieval german harley quinn wearing intricately detailed gothic plate armor, 4k UHD photograph fujifilm XT3
Sure. Go ahead. I don't mind.
if this lil critter proposed would you say yes?
lol, inspiration from above 😆
rawr
when zero snr hits it sure hits
it's kicking in
i got another controlnet/comfyui question: how can i change a person, but keep the face the same? like a closeup photography and give the person a hat?
and a second question: is it possible by now to do textual-embedding with sdxl and a1111 ? (i know that people use LORA, but i do different stuff and that's not working with LORA)
make a mask selecting above their head, load the image, load the mask, use the mask as input for inpaint controlnet and image as input for sampler, play with the prompt and denoise... something like that
uhhh wait there's probably no inpainting controlnet for sdxl //nvm, there seems to be one looking at hf
can you give me a link of where to get what exactly maybe? @fierce hollow
I see an inpainting controlnet here but no idea if it works, haven't tried
https://huggingface.co/williamberman/sdxl_controlnet_inpainting
there are some TE's for XL on civitai. not sure how they were made though
how would that inpainting.safetensors work in comfyui? wouldn't you need some new nodes or something?
and i want to do my own TE's @heady vale , and with A1111, not comfy
does anybody know of any good freeU settings for ssd-1b/finetunes? both the sd and sdxl defaults seem to bake the images way too much
she should have washed her hair before she went to eat though.
Hey Guys,
I am working on a project for creating different cartoon characters for children's stories. There are 3 characters in total that are consistent in every story, so I have fine-tuned a dreambooth sdxl model using autotrain on each character. The results for each character (solo) are great, but I have to make an sdxl model that will generate images for all three characters. Can I do this by combining 3 dreambooth models, and if yes, then how? Or do I have to create one model and train it on all three characters collectively?
I would really appreciate any help regarding this, thanks.
you can do TI with kohya-ss. But honestly, loras are just faster and easier to train. I haven't found any advantage in TIs yet for SDXL
I would do both. You can combine them, but it will still mix up your characters from time to time. Using inpainting you can fix these problems, though. But you might get better results by also finetuning your model on images containing all three characters at the same time
how will i then make my model understand the different characters against their specific names when training on a combined dataset?
just write your own captions, don't use the auto-generated captions of your tool
in kohya_ss for example you can add for each training image file a text file ending on .txt with same name where you write your caption. You have to set --caption_extension=".txt" and it will use your captions instead of the autogenerated files
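A small sketch of generating those caption files from your own captions, one `.txt` next to each image with the same name. `write_captions`, the extension list, and the character names in the usage note are all made up for illustration; only the same-name `.txt` convention comes from the message above.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to whatever your dataset uses.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}

def write_captions(image_dir: str, captions: dict[str, str], ext: str = ".txt") -> list[Path]:
    """Write one caption file per image, named like the image but ending in ext."""
    written = []
    for image in sorted(Path(image_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_EXTS:
            continue
        # Look the caption up by file stem, e.g. "milo_01" for "milo_01.png".
        caption = captions.get(image.stem)
        if caption is None:
            continue  # leave images without a hand-written caption untouched
        sidecar = image.with_suffix(ext)
        sidecar.write_text(caption + "\n", encoding="utf-8")
        written.append(sidecar)
    return written
```

For the three-character case, the captions would name each character explicitly, e.g. `{"trio_01": "milo the fox, pip the owl and tam the bear at a picnic"}`, so the model sees which name goes with which character.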
TheLastBen Papercut SDXL LoRA
I recommend FreeU to add detail to low resolution pictures; works made with a low number of steps. It increases the high frequency detail. I am trying it alongside the astonishingly rapid LCM LoRA ...
FreeU seems to work best on work made with a low number of steps. It adds detail in the high frequency portion of your pictures. If your step number is already producing good, intricate detail, applying FreeU will necessarily overcook!
These images feel like mortal kombat
can't wait to see a fatality 😉
that'll be what gets AI banned from all videogames. When Ed boon gets his team to do AI fatalities
Jack Thompson will show up again
it'll be a mess
hehe
nothing really to it. just i know that the qrmonster control net has been used by people to hide text and logos, so i combined that with your lora and it was able to do a lot of those text effects on the text masked into controlnet qrcode
