#✨|sdxl
last I saw, it was actually going quite well. But controlnets are a pain in the ass to train lol
ah very cool 😄 and thanks, can't wait
Same here lol
I am nowhere smart enough to train something like that
his uses a floating point matrix instead of stick figures, so it should be easier to get more complex poses without tripping it up with overlapping "lines"
Astronaut doing a side karate kick
lol nice
thanks 😄 yea thats about what i got using Depth but it looked more like a picasso. 🙂
you usually should prompt a little closer to the image, as the controlnet won't have much context with just "astronaut". it's just another aid to the diffusion steps, and it's still a text to image system at its core
a simpler pose did work 🙂
my astronaut in space reflected a desert and yours indoors reflected space. latent spaces are weird
same open pose, didn't work much but you can see it guiding
normally i would use a more descriptive prompt, but i prefer to test with a basic prompt just to see how well the models work 🙂
yeah you found the limit. weird poses.
lol XD
not bad,
are these using openpose? or just prompting?
open pose. astronaut doing a handstand on the surface of a comet.
lol i used the same image
Good evening everyone, can someone guide me with sdxl? I heard it needs a powerful machine to run it.
You will get decent performance from the last few generations of nvidia cards with at least 6gb VRAM, preferably 8gb. It runs on my laptop's 3060 6gb GPU too
hey all, does anyone know what the best SDXL inpainting model is? I saw one published in diffusers, but curious what y'all are using for inpainting/outpainting
For training, I should use Auto11, and not comfy?
yeah it does that with the word love on shirts sometimes
Look what I've been cookin' up... a lil Dyna update in the works.
I have RTX 3070 Ti, 16 GB RAM, i9-12900H
Are my specs powerful enough to run SDXL?
i am here
what did i miss
The only issue you might have is running out of ram, your GPU and CPU look great
yes
i have a 1660 and im fine, but, 32gb ram
overkill lets go!!!
can sdxl run multiple CN's?
Oh, how can I adjust the VRAM or thread count to cause no issues
In stable diffusion
I have maximum of 8 VRAM and would like to use 3
So 8?
also i dont think theres a way for u to adjust vram
theres no such thing in auto1111
Okay good to know
In the civitai web site, there are options for images I cant find in the web ui
You can do this in python.
Such as the vae, hires prompt etc
Yes how can I do it
This is done in the settings in auto11
Google,
Could tell me where to find them there?
The settings tab.
Yes but there are bunch of things
You will need to read them, there's a Vae field on the first tab.
hires prompt is done in the actual prompt flags, there will be a checkbox for this.
This isn't rude, but I would watch an introduction to save yourself some time.
They go over all this.
I know, I have watched a couple of videos and I am still learning
Is there a prompt to generate full body picture like from head to toe
full body
full body picture
Oh as simple as that!
mentioning both feet and eyes too
I have successfully generated my first sdxl image in 7 minutes!
shoes, legs, standing, lots of ways to describe
Using Realism Engine
I heard it doesn’t generate feet and hands properly. How can I fix that
these are natural language systems. it turns your prompt into a latent target and denoises it
Oh, is there a way to fix them?
generate another image and see if it's closer. investigate prompt styles that target better hands, find refined models or loras that would get it better, inpainting , controlnet, lots of options
Oh yeah styles, it’s a custom prompt, where do I find it
there's an extension called adetailer for a1 webui. it'll automatically mask the hands and do an inpaint pass against them
Refined models? Controlnet?
What does it do? Why does it mask the hands?
yeah. anyone can take the base sdxl weights and refine them further on their own image sets
inpainting is when you only do diffusion on a specific masked area of an image. changing only that area
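A minimal sketch of the masking idea described above, assuming plain NumPy image arrays. Real inpainting applies the mask during the denoising steps (usually in latent space); this only illustrates the "change only the masked area" part, and `composite_inpaint` is a hypothetical helper, not part of any webui.

```python
import numpy as np

def composite_inpaint(original, regenerated, mask):
    """Keep `original` where mask is 0 and take `regenerated` where mask is 1."""
    # Add a trailing axis so the 2D mask broadcasts over the color channels.
    mask = mask[..., None].astype(np.float32)
    return original * (1.0 - mask) + regenerated * mask

# Toy 4x4 RGB images: black original, white "regenerated" pass.
original = np.zeros((4, 4, 3), dtype=np.float32)
regenerated = np.ones((4, 4, 3), dtype=np.float32)

# Mask only the central 2x2 region (think: the hands).
mask = np.zeros((4, 4), dtype=np.float32)
mask[1:3, 1:3] = 1.0

result = composite_inpaint(original, regenerated, mask)
```

Only the masked pixels come from the second pass; everything outside the mask is left exactly as it was.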
Oh
that second pass with a prompt specifically for hands, helps a lot
controlnet is a system where you can give a visual control to the composition of the image. depthmaps and poses and other means of compositional control
https://github.com/Mikubill/sd-webui-controlnet
https://github.com/Bing-su/adetailer
two extensions you'll want
There is also a dropdown of files on the right hand side of the model page on civitai. Like full model fp16 and 32
What are those
datatypes that the model uses. the amount of precision they have. floating point numbers (decimals) with 16 bits of precision or 32 bits of precision. super technical stuff. you'll only ever need half precision. fp16.
more precision may be better for training but as far as i can tell, not needed for inference at all
default should download the half precision fp16 model
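To make the fp16 vs fp32 difference concrete, here is a small NumPy sketch. The random array is a hypothetical stand-in for model weights, not an actual checkpoint.

```python
import numpy as np

# Hypothetical stand-in for a model's weights: one million random values.
rng = np.random.default_rng(0)
weights_fp32 = rng.standard_normal(1_000_000).astype(np.float32)

# Converting to half precision rounds each value to 16 bits.
weights_fp16 = weights_fp32.astype(np.float16)

# fp16 stores each value in 2 bytes instead of 4, so the file is half the size.
print(weights_fp32.nbytes, weights_fp16.nbytes)

# The rounding error introduced by the conversion is small relative to the values.
max_error = np.abs(weights_fp32 - weights_fp16.astype(np.float32)).max()
print(max_error)
```

That halved size is why the fp16 download is the usual default for generating images.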
I personally think bing create is one of the better generators out there, anyone know if there's a opensource model that's comparable?
no
sdxl
closest thing to any of the monolith models from corps
Yeah I assumed, just wanted to check if I missed anything new.
https://pixart-alpha.github.io/ theres this which uses t5 but we don't have weights yet
Are they finding the NLP is the key to photoreal?
it's a whole lot of things that get better results. a larger pretraining model than openclip helps. that's t5. also, a larger unet is important too.
https://twitter.com/EMostaque/status/1714982965212508228 emad keeps showing off stuff like this that has great prompt comprehension so i think they got some research up their sleeves
@heavy zinc fyi, If you use ComfyUI it will cost less VRam and work without crashing
a node graph for affecting diffusion
Just to be sure, SDXL is same as 1.6 right?
1.6 is the version of automatic1111 you're using
i dont think u watched the videos u were talkin about
sdxl is the 3rd generation of model released from stability ai. it's an architecture / base model
they release those "weights", a bunch of numbers that multiply together according to the inputs, for free. another program like automatic1111 or comfyui uses those weights to generate images with
realism engine is a version of sdxl that someone else refined with their own images. they train it for many hours so that it learns more.
no but theres another neg embedding that does work for xl
yes
some are pt and some are safetensors,just put the name of the embedding in the negative prompt box
Is it in Textual Inversion tab?
yes its also there just click it and it will add itself
🤷♂️
Vram is allocated automatically.
one of the weirdest AI videos i have ever seen
What is the best way to enhance low res pic in SD?
Why does the Lora tab show nothing? I have put my files in the LORA folder and restarted sd
are you trying to use a sd15 lora on sdxl?
things like loras, embeddings, hypernetworks, controlnets, they all need to be developed for the same architecture. sd1, sd2, sdxl
I am Legend ^
@heavy zinc In general sd1.5 is trained to make 512 x 512 resolution, sd2.0 is 768 x 768 and sdxl is 1024 x 1024.
you're using A1111?
I'm not too sure, I only use ComfyUI. I do have A1111 that I started with but got into ComfyUI when they released SDXL and I haven't gone back. What are you trying to make, "Text to image"?
yes
if you use a SD 1.5 model 512 x 512 is the right size
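The native sizes quoted above can be kept in a small lookup so the width and height always match the loaded architecture. This is a sketch: `native_size` and the dictionary keys are hypothetical names, and the numbers are just the ones given in this chat.

```python
# Native training resolutions as quoted in the conversation above.
NATIVE_RESOLUTION = {
    "sd1.5": 512,
    "sd2.0": 768,
    "sdxl": 1024,
}

def native_size(architecture):
    """Return the (width, height) a given base architecture was trained at."""
    side = NATIVE_RESOLUTION[architecture.lower()]
    return side, side

print(native_size("SDXL"))
print(native_size("sd1.5"))
```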
What differences will it make when using ComfyUI?
I downloaded 1.6
but it says 512 512
Comfy give you a better idea what's going on with the process and where its stuck at
ComfyUI is also more complicated to use than A1111, still learning it myself
it "can" be more complicated to use
yes its not for me
By reusing the seed, yes
or you can use a premade build and just plugin the values much as any other ui
how do I do that
Under seed, right next to the box with the whole load of numbers, there's a green recycle button. Click on the finished image that you'd like to add hires fix to, and then click on the green recycle button.
I saw this article from huggingface https://huggingface.co/blog/sdxl_jax, and was wondering if it would be possible to apply these same concepts to speed up SDXL model training, and if anyone has tried doing it already
Then, with the same prompt, click generate.
Mhm
It'll retrieve the seed, and reuse it.
To randomize it again, click on the dice icon.
but it wont randomize once you grab the seed.
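Why reusing the seed works: the seed fixes the starting noise, so the same seed with the same prompt and settings starts from the same place. A minimal NumPy sketch of that idea follows. `initial_noise` is a hypothetical helper and the shape is only illustrative; the webui generates its noise with its own random generator.

```python
import numpy as np

def initial_noise(seed, shape=(4, 128, 128)):
    """Generate the starting noise for a given seed (illustrative shape)."""
    return np.random.default_rng(seed).standard_normal(shape)

same_a = initial_noise(1234)
same_b = initial_noise(1234)
different = initial_noise(1235)

# Same seed gives identical noise; a different seed gives different noise.
print(np.array_equal(same_a, same_b))
print(np.array_equal(same_a, different))
```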
I have the seed number
Do I save it ?
or what
I just clicked the generated image on the right side then opened Hires options with the same prompts
What do I do next?
u should read the webui wiki
this one covers all the basics https://stable-diffusion-art.com/automatic1111/
always save every seed number. you'll need those later
When do I need seed number for later?
I am using @nimble heart's tinted latent node for SDXL, and man, it works better than offset noise when paired with my realism LoRA
Here are some extreme examples.
It seems to play much better with my first gen Realism LoRA than it does with finetuned realism models
get as bright or as dark as you would like
it can go brighter and darker, but that's a bit much haha
and you can do COLORED tints as well
Prompt: A photograph of a warm forest at dusk
These are all the same seed/settings/prompt, just with different latent colors:
cyan latent, orange latent, dark grey latent, magenta Latent, yellowish green latent
its still really experimental, and I actually chose a pretty bad prompt for that haha
let me try some others in a more neutral prompt
how can I find this node?
just a sec
I am using the RGB latent colorizer
thanks ill check it out, always love your renders
thanks :>
please stay tuned, gonna be testing a new setting validation and new image tagging for the V1.5 of my realism LoRA (trying 120 images instead of the 90 I have right now)
one step closer to the 500 image 2.0
And one step closer to the 2500 image 3.0
goal is to be able to control image framing, color grading, and realism subject effortlessly
while also having way better lighting than any of the realism finetunes (something I have already achieved)
"A cinematic portrait photograph of a pretty Japanese fashion model sitting in a room"
Normal Latent, dark grey latent, pink latent, blue latent, orange latent, yellow latent, white latent, blood red latent, purple latent:
Some prompts and scenes do better. I was trying to get one pretty neutral. It also works well if you prompt for specific colors as a theme and match with the latent
"A cinematic portrait photograph of a white woman dressed in red in a red room with red lips"
Normal Latent, red 20%, red 50%, red 75%, red 100%
You can see it gets a bit much at the end lol
(please excuse the weird noise and slight image artifacts, I am benching my LoRA with no negatives to see what to correct in my dataset)
"A cinematic portrait photograph of a tiger in a forest at night"
normal, 25% black, 50% black, 75% black, 100% black
infinitely controllable offset noise
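One way to picture the percentages above is as a linear blend between the normal starting latent and a constant tint. This is only a sketch under that assumption: the chat does not show how the node is actually implemented, and `tint_latent` plus the tint values are hypothetical.

```python
import numpy as np

def tint_latent(latent, tint, strength):
    """Linearly blend a latent toward a constant per-channel tint.

    strength=0.0 returns the latent unchanged, strength=1.0 returns pure tint.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    # Shape the tint as (channels, 1, 1) so it broadcasts over height and width.
    tint = np.asarray(tint, dtype=latent.dtype).reshape(-1, 1, 1)
    return (1.0 - strength) * latent + strength * tint

# An empty 4-channel latent and a made-up tint, blended at 50%.
latent = np.zeros((4, 128, 128), dtype=np.float32)
hypothetical_tint = [0.5, -0.2, 0.1, 0.0]
tinted = tint_latent(latent, hypothetical_tint, 0.5)
```

Because the strength is a continuous value, the effect can be dialed in anywhere between "normal latent" and "100%".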
Yes, the node is awesome.
Just be warned, it doesn't play with models as well as it plays with my LoRA, at least in dark values
it's such a general purpose hack it should apply everywhere with a little tweaking of the values. Even works with the refiner if you really want
but yea what 100% black does will vary from model to model based on the contrast they cooked in
I've found an anomaly in SDXL. why tf does it ALWAYS put the Apple logo on laptops
subliminal sponsorship
also cars are often just Mercedes if not specified for some reason
hey guys, how do i unbind the 8188 port or set it another way ?
PermissionError: [Errno 13] error while attempting to bind on address ('127.0.0.1', 8188): an attempt was made to access a socket in a way forbidden by its access permissions
lil cuties
there is a --port flag in ComfyUI
just add the flag to whatever your ComfyUI startup command is
and after you write the open/desired port you wish to use
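If it is unclear which port is open, a short standard-library check can find one before passing it to the `--port` flag mentioned above. This is a sketch: `can_bind` and `first_free_port` are hypothetical helper names, and the range scanned is arbitrary.

```python
import socket

def can_bind(port, host="127.0.0.1"):
    """Return True if a TCP socket can bind to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Covers PermissionError (Errno 13) and "address already in use".
            return False
    return True

def first_free_port(start=8188, attempts=20):
    """Scan upward from `start` and return the first bindable port, or None."""
    for port in range(start, start + attempts):
        if can_bind(port):
            return port
    return None

print(first_free_port())
```

Whatever number it prints is the value to put after `--port` in the startup command.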
tbh, though I lol'd, so many images to scrape for training are Apple laptops and Mercedes is big in cars too. A bit odd.
yeah, I find it humorous that SDXL thinks the only cars are Mercedes and the only laptops are Apple
maybe a SAI x those brands collab is inevitable
I am working my ass off to scrape images that are ideal for my dataset. it took me a week to get all 500
and its gonna take me forever to caption them as well, but I have faith it will lead to my LoRA kicking the asses of SDXL realism finetunes even more :p
I was mostly referring to my model responding better to black offset specifically. The other models seem to kinda fight it and get muddier results
As seen in these:
The models go from left to right as follows: Mine, RealisticVisionXLV2, RealismEngine, RealStockPhoto
realistic vision plays really bad with this trick for black latents I have noticed
SDXL and diffusion in general are meant to be general purpose, but LoRAs can be convenient sometimes
when it comes to realistic photography, my LoRA is the GOAT for most things
At least from what I have seen and tested (around 700 prompts, from 9 different SDXL models)
Same order as above
mine seems to take much better to the black latents, while also having much better realism/lighting/background separation
thats 50% black offset
idk man, realism doesn't seem to be an issue for my experience
those look alright, but definitely not like my model
my model 9/10 times stomps the competition from my tests, non cherry picked. Thats not to say it doesn't lose, cause it does sometimes
they look real enough to me
i like the 1st one on the second row
Im a sucker for glowy bits
Green is real pretty too
they look passable, yeah. Just not like they were taken on a real camera. Just kinda the standard flat depiction of real life with improper lighting, inconsistent background blur, and impossible dynamic range
Its a stylized realism look
plus, the model I used there is actually a model I've calculatedly made by block merging. it's not meant to be realism or anything; just a good general model.
In all of these, the model order goes:
Mine in top left, RealisticVisionXLV2 top right, RealismEngine bottom left, RealStockPhoto bottom right
Well, let's look at how much money they have and the circle they live in as well.
mine consistently looks a lot more properly lit, with more realistic dynamic range, better foreground/background separation, and typically more coherent backgrounds with better fidelity
the whole point is it does really good realism with painfully basic prompts
same order here, Mine, RealVis, RealismEngine, RealPhoto
Entire prompt is "Dog"
"butterfly"
when unguided, realism engine is the closest to mine, and when guided, real stock photo gives me a run for my money occasionally on some prompts. Every once in a while real stock outputs very good images
these are the parrots I usually get
the main way to improve SD is the text encoder itself, CLiP is kinda a bottleneck
those aren't bad, though I assume you use negative prompting and such as well. I am not using negative prompts, just to see what they can do base.
All of these prompts are a single relatively short sentence
I do like the aesthetic of your images, but its still not "this was taken on a DSLR" like what I am going after specifically
which apparently takes a ton of work to do
evidently lol
just prompted "parrot in a jungle", that's it
SDXL has a SOLID UNET, but the text encoder is a stick in its wheel
if thats really the case, then you have the best realism model by an absolute 10x landslide lol
ah, the base style is helping some
also interesting you are below 1m res, which is something I have messed with too
the base style is empty, it means nothing is being applied
I have found SDXL looks a little better slightly above/below 1024x1024 eq
I'd be interested to try this model you are talking about then, cause if this is really the output with no guidance, then your model is substantially better than all of the SDXL realism finetunes by a landslide
jesus christ, 67 steps? whats that about?
uses AITemplate, sampling takes 10-15 seconds for batch 4 anyways
so I can go ham with the steps
thats just wasteful regardless. Not sure why you would add all of those steps for no reason
that's just how I like to set my workflow, my personal preference
Fair enough I guess. I would prioritize speed/energy savings over wasting energy for no gain 😅
is your model out? I would love to test it if so
yeah, it is. it's on CivitAI and I explained exactly how I made it there
didn't train anything, just calculated model merging
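For anyone wondering what "calculated model merging" by block can look like, here is a rough sketch: each weight tensor is interpolated between two models, with a different ratio per block. The actual ratios and method are described on the CivitAI page, not here; `merge_blocks`, the key prefixes, and the ratios below are all hypothetical.

```python
import numpy as np

def merge_blocks(model_a, model_b, block_ratios, default_ratio=0.5):
    """Interpolate two weight dicts, picking the ratio by key prefix.

    A ratio of 0.0 keeps model_a's tensor, 1.0 takes model_b's.
    """
    merged = {}
    for name, tensor_a in model_a.items():
        ratio = default_ratio
        for prefix, block_ratio in block_ratios.items():
            if name.startswith(prefix):
                ratio = block_ratio
                break
        merged[name] = (1.0 - ratio) * tensor_a + ratio * model_b[name]
    return merged

# Toy "models" with made-up key names and tiny tensors.
model_a = {"input_blocks.0.weight": np.ones(2), "output_blocks.0.weight": np.ones(2)}
model_b = {"input_blocks.0.weight": np.full(2, 3.0), "output_blocks.0.weight": np.full(2, 3.0)}

# Keep model A's input blocks, take model B's output blocks.
merged = merge_blocks(model_a, model_b, {"input_blocks.": 0.0, "output_blocks.": 1.0})
```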
there has got to be something else going on here, cause these results look impossibly good. I must test for myself
these results are so much better than any SDXL model I have ever used, I just don't see how its a mash up of all of them
could also be my workflow =\
I am willing to bet thats more likely
testing out in just a sec when its done
I am gonna drop realism engine, cause its consistently the least good on average
greetings!
for anybody interested, the absolute worst realism model I have used is juggernaut XL
it got absolutely trashed in every single image gen test I did
it was like 5 steps behind all the others I tested
and it breaks horrifically with any upscale diffusion
I am not sure what is up with that model
The parrot thing seems not to be that high of a challenge. Prompt: Parrot in jungle
those look pretty bad compared to @indigo carbon
I am pretty nitpicky, and I see a lot of issues with those two
Yeah, for instance the burned out whites
I see issues with TDG's as well, but there are a lot more positives than the ones you posted
thats probably one of the most accurate things in the whole image, for it being from a camera at least
thats one of the reasons why the SDXL realism finetunes always look so flat and muddy. They have impossible dynamic range that just makes images look artificial, cause real cameras don't capture things in perfect dynamic range
for me it just makes these
I had to train my model to revert that, and thats one of the things that makes it look much more realistic in lighting
ok, these look significantly worse than the ones you posted above. Whats the difference?
He even missed the focus.
oh shoot, i downloaded your model 2 times lol
I didn't change anything, Idk
only difference is seed
hmm... weird
those look much much much less good/accurate
those look way more like base SDXL
way more scrambled, incoherent, messy, inaccurate lighting, no focus planes, plastic rendering on the feathers, wood texture looks bad, hallucinating other parrots and colors in the background
I would have never thought those were from the same model had you not said something
These get a 3/10 for me
these get a 8/10 for me
that is crazy that those came from the same model
they went from pretty damn solid to pretty damn bad IMO
the left ones have so much more fine details, better feather rendering, better light sources, proper rendering on the leaves with the textures, much more accurate wood, proper focus planes, better tonal contrast. I am having a hard time believing you didn't change something considerably in your workflow
and these look as good as the first set
what are you changing, cause I do not believe you when you say its just seed lol
that is an unreasonable difference from seed to see if thats the case
idk man, I'm just diffusin'
I guess I will have my answers soon, gonna try it for myself
that second set of images looks way more like what I am expecting
this conversation reminds me of Gus Fring talking with a scientist about the quality of their product
how so? lol
these look as good as 1/3
1,3, and 4 look the same
2 looks really bad by comparison
ok, yeah, just tested your model. Something is massively different with your workflow, and now you have my interest piqued, cause your model looks really bad for me lol
you're talking about the batch itself or the batch(es)?
I think all of the images in batch 1, 3, and 4 look 8/10 for me
While all of the images in batch 2 look really bad to me
and the results I just got from testing look even worse than 2 for me. So I really think your workflow is doing something really good
your batch 1/3/4 look solid is what I am saying
Wait, does a parrot have nose holes? 🤔
it's just a basic workflow that uses AITemplate
yes
interesting, if I add "photo" to the prompt, your model does much better
literally not even close to the IPAdapter workflows I did
dare I say, your model looks as good, if not slightly better than the realism models I am testing
wait, you are using unclip conditioning?
in the normal workflow? nope
@indigo carbon I am officially swapping out realism engine for your model in my testing going forward. I am very impressed with what I am seeing
"my workflow isnt that complicated"
cool, don't credit me, all I did was the math for the block merging
you may have ended up making the best realism merge out there that I have tested
ok nope, nevermind, it falls on its face like the others with portraits 😅
interesting
it did damn good with that parrot tho
give me the portrait prompts, I kinda want to also see how it'll do them
Portrait photograph of a pretty woman standing in a field of flowers wearing a suit and tie with braided hair at sunset
there is one
yeah, it looks like it struggles just as hard as the others for portraits without key word mashing
yeah, super plastic
exactly what I am seeing
they also look stretched in what you sent. Did you not update your latent sizes?
makes sense, like I mentioned earlier; that model isn't meant to be the best at anything, just a good general one
Mine, real vision, yours, real stock photo
Very interesting. I've got basically the same person with the base model.
LOAB?
this won't be the first time we see a woman reappearing in multiple generations
"A portrait photograph of a tiger walking in the snow up in the mountains"
Mine, real vis, yours, realstockphoto
yours is a LoRA, isn't it? the shapes are entirely different
my LoRA is reallyyy overfit on a specific tiger image lmao
yeah, I trained this LoRA really hard
and its super overfit on a specific tiger lmao
I only have one in the dataset
the new 500 image dataset has 7 different tigers
the tiger is the only thing that overfit in this LoRA lol
Oh, I also had the LoRA turned up too high. Thats why it was a little high on contrast
this image! lmao
it reallllyyyy grabbed onto that tiger lmao
still, you are comparing 3 models whose main purpose is realism to a model that's meant to be generally decent
I know, and yours is holding its own
dare I say... I like it more than the other two
your generalized merge seems to be better at realism than the realism finetunes, albeit slightly. I am testing more
you can see again here how overfit my LoRA is on that tiger lmao
it's similar shapes on the ears for some reason, odd.
it just really overfit on that one image. A lot of subjects were represented with just one image, but the tiger held on SUPER HARD
so my new 5.2x bigger dataset has a few more tigers in it to fix that lol
and I am gonna remove that image
cause god damn does it love that tiger lmao
this reminds me of when I tried to blend Markiplier and Shrek
he's too strong, it's just markiplier dressed up like shrek
A portrait photograph of a creepy ceramic garden gnome wearing glasses in a swimming pool
Mine, real vis, yours, real stock photo
mine looks the most like hes on the SO list lmao
but I like all of the results in different ways
yours looks the most like a terracotta/clay gnome in the mexican style
mine looks the creepiest and the most like a real photo
real visions is super cute
and I like the tiles on the pool in stock photo lol
all of them get a pass for me
that means it failed, you prompted using the word "creepy"
yeah, true haha
mine wins in creepy, but I like them all!
they all get a pass in my heart lol
looks like the images I added for tree bark are working well too
god, real vision consistently looks ass for non people lmao
also, a model (not SD) that uses T5 is about to be released, so we'll get a chance to see what other text encoder can do
man, these all look bad lmao
T5 is the same text encoder that OpenAI used on Dall-E 3 also
I'm unsure if SAI will also adopt that encoder, because it'll need to be quantized to work on normal machines
it listens really well (except for the times it flips you off and chooses to do literally nothing you ask for)
But its results usually look really bad IMO. Always flat, desaturated, no color grading, soft and lacking any real fine details and textures
I think, personally, that DE3 is the worst "looking" image generation model out right now
For visual aesthetics, I would take SD 1.5/SDXL/kandinsky/MJV4-5 over it ANY DAY
yeah, the quality is bad. but the coherence is in its own league
coherence = good
Aesthetics = ew
lmao
its like, every time I see an image from DE3 I think "Thats a cool concept, wonder what it would look like made by a visually pleasing model"
like "Wow, its so coherent! too bad it looks like ass and I don't wanna see it again"
for sure, I'm unsure if the quality is bad due to T5 or if they fucked up something
because they did confirm it's latent diffusion
I would rather have an aesthetic lump of crap than a flawlessly coherent boring blah that inspires nobody and nothing with the sheer impressive levels of drab-ness it exudes
I look at DE3 images, and I feel static in my brain
It feels the most lifeless of all of the models by a landslide, IMO
I would for sure take DE2 and its mountain of issues over DE3, just cause DE2 looks fun
I think T5 might be a double edged sword, it's a tradeoff for coherence and quality
or it's just OpenAI being dumb.. we'll find out eventually when that experimental T5 model releases
don't get me wrong, I hate OpenAI and all of the BS they have pulled, fucking over their investors, lying about what they are trying to do, all of that. And I am slightly biased against DE3, but I can recognize where its awesome, but god damn does it look like garbage aesthetically
run zoom-out and zoom in comfyUI with SD lol..
oh thats kinda fun haha
that damn default comfy image is burned into the back of my retinas lol
same, the "open" in OpenAI stands for "Open your wallet"
supposedly the quality improves if you dont use bing
but yea still not midjourney ✨ aesthetics 🌈
I sure hope so, cause dalle looks great in comp, and nothing else. Zoom in, its static shit, look at the overall levels, its flat, uninspired and drab, look at the textures, they are dull and fuzzy
wonder if its pixel based like dfif
if aesthetics is what you're talking about, SDXL is definitely the winner there
also DALL-E 3 is latent diffusion, so it could be T5 giving it that grainy effect
im mostly just wondering if they're running it at like the bare minimum steps for all the bing users
They all fail in one area: Skin. All models, some to a higher degree than others, still have the fake airbrushed look that has been all the rage on social media
Yeah, i will agree thats probably the worst skin I have seen mine do
my model can do skin texture, it just failed in that image
i had the LoRA set too high, and I also had the sampler set wrong
it doesn't matter, you need to train specifically for realistic (natural looking skin), not an easy feat
please post some of the better looking skin 🙂
i think we're all missing the obvious solution here
reject human women and only diffuse attractive dragon women instead
hehe
something like the skin on the right or more detailed?
This is a crop
Please excuse the eyes. This LoRA is not done training at the moment. It didn't get a chance to fully converge on high frequency details like eyes
i'd say eyes are mid freq
also, I have tags specifically for more textured skin that I don't often use
high freq would be skin
the issue with 99% models is the lack of skin subsurface scattering
skin looks either plastic, dull, or artificially shiny
uhhh
sub surface scattering is not a common thing in like 99% of photography, excluding ears lol
but in a real photo, you will see it

maybe a black latent would show the pores better
gonna have to x on that one
maybe, I could try sometime
I use 100mp photos for work and I can assure you, that skin looks different
mk
you ever see a dragon and just like think "would"?
it's the conundrum of the photoretoucher, how much skin texture is ok to show on a fashion shot
too little and it looks airbrushed
too much visible pores and it looks like you went too far with your high pass filtering
yeah, like Donkey from Shrek
I know, lol
oh god
it's a fine line
these realism models make people with "acne" look like they have skin diseases, oh my god lol
yep
mine and real stock didn't, they look a lot better
or everyone looks like they came out from a Chanel makeup Ad
I feel like you haven't messed around with SDXL enough if you think that TBH
really? 🙂
and you're pretty passive aggressive as well, so I think I am gonna just move on. Have a good one lol
you don't have to be so sensitive, I was just trying to help with constructive criticism
old people always have detailed skin with SDXL
I am not sensitive, and I don't need your criticism, but thank you. I just don't wanna interact with you. So I won't
anyways
all good
still waitin for that realistic skin pic
I sent one above, though its not the greatest
ew wtf
makes old people really crusty lol
no i mean to warlock,he should send one so we can see what kind of skin hes talkin about,show us an example 🙏
ahhh, I get you
the skin sure is more detailed when it makes old people
Probably to exaggerate age
the problem is my LoRA has unrefined residual noise cause I didn't train it fully, so it adds some unwanted patterns on skin when too close
when I train it properly and get that to go away, it will look mucchhh better
I would train a model but i heard that it's time consuming and requires some nice resources.
well I'd probably make a lora first, since that seems to probably be the easiest.
my realism LoRA seems to work better than realism finetunes. LoRA's seem to be the way to go for dramatic changes
that... looks so crusty and artificial OMG
think it'd be a good idea to make a "latent gradient" node to go with the colored latents? for making simple patterns merging two latents
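A rough idea of what such a "latent gradient" could compute, assuming a simple left-to-right linear ramp between two latents of the same shape. This is not the node itself (it had not been written at this point in the chat), and `horizontal_gradient_blend` is a hypothetical name.

```python
import numpy as np

def horizontal_gradient_blend(latent_a, latent_b):
    """Fade from latent_a on the left edge to latent_b on the right edge."""
    width = latent_a.shape[-1]
    # Ramp goes 0 -> 1 across the width and broadcasts over channels and rows.
    ramp = np.linspace(0.0, 1.0, width, dtype=latent_a.dtype)
    return latent_a * (1.0 - ramp) + latent_b * ramp

# Two toy latents: all zeros on one side, all ones on the other.
left = np.zeros((4, 8, 8), dtype=np.float32)
right = np.ones((4, 8, 8), dtype=np.float32)
blended = horizontal_gradient_blend(left, right)
```

Swapping the ramp for a vertical or radial one would give other simple patterns.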
I was wondering about that!
I think the skin texture of older subjects is not necessarily more detailed. it has wrinkles and more stuff going on which gives it the impression of more detail but overall it's just as airbrushed, missing pores and blemishes.
btw, @nimble heart I have been using your workflow for some time and love it
kinda like this but without the math
I think this skin looks WAY more realistic than that random non topology conforming skin
which one? Principled?
but the residual noise makes it look a little too over-textured
grrrr
its an issue cause the LoRA didn't finish training
yes, principled
my non-principled workflows I haven't updated in like a few months at least
nice. try the colored latents in the newest version. pretty spicy
still trying to wrap my head around it though, trying to squeeze out the most out of it
I did 🙂
it's sorta like an auto1111 style pipeline
lmk if there's something in particular that's confusing so I can clarify.
there is pores texture, but no fine lines/wrinkles, which makes it look more fake IMO
looks like she needs some sleep
I like in particular the upscale and the latent color grading
love the skin pissing contest btw
latent color you kinda just have to try. Upscale was just reworked into its own separate node so if you're talking about that, it's basically an auto1111 hiresfix. Just un-bypass the two purple nodes with ctrl-b and it'll do everything else for you more or less.
if you're still on the old all in one node then it's even closer to Auto.
ah maybe I misread I thought the upscale was confusing
I need sleep
toodles
I'm using this one
what in the world is going on here
I am so confused
is that a sampler issue?
I am so sure it would look much better if I was able to get rid of that last bit of latent red noise on her face
yea that's basically it. the pixel buster node does math ops in this case to fix the black/white levels. Two sliders control scale and neutral.
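For anyone wondering what a "scale" and "neutral" levels fix might look like as math, here is a minimal sketch. It is a guess at the idea, not Pixelbuster's actual syntax or implementation: `fix_levels` is a made-up function that stretches a 0..1 pixel value away from a neutral point and clamps the result.

```python
def fix_levels(value, scale=1.0, neutral=0.5):
    """Stretch a 0..1 pixel value away from (or toward) a neutral point.

    scale > 1 raises contrast, scale < 1 lowers it; `neutral` is the
    value that stays put. Both parameter names are assumptions.
    """
    adjusted = (value - neutral) * scale + neutral
    # Clamp so blacks and whites do not leave the valid range.
    return min(1.0, max(0.0, adjusted))


for v in (0.25, 0.5, 0.75):
    print(v, fix_levels(v, scale=2.0, neutral=0.5))
```

Moving `neutral` below 0.5 biases the stretch toward brightening, above 0.5 toward darkening.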
The full Pixelbuster manual is its own node called "Pixelbuster Help". pretty cool NGL
definitely not saying that cause I'm the one who wrote the library
That was another topic, I need to read the Help file
I can tell there's a lot under the hood
Xiao, what is going on...
why the crunchies
are you using too few samples for the sampler you are on?
The principled node is mostly just existing ComfyUI nodes strung together automatically so it should behave as expected. Only change is the steps are dependent on denoising a la auto1111. I might revert that tbh, I haven't fully decided.
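"Steps dependent on denoising" roughly means that a lower denoise runs fewer sampler steps. A minimal sketch of that arithmetic, assuming the simple scheme of multiplying steps by denoise strength; the exact rounding in auto1111 or in the principled node may differ, and `effective_steps` is a made-up name.

```python
def effective_steps(steps, denoise):
    """Sampler steps actually run for a given denoise strength.

    Assumes steps scale linearly with denoise, capped just under 1.0,
    and that at least one step always runs.
    """
    denoise = min(max(denoise, 0.0), 0.999)
    return max(1, int(steps * denoise))


for d in (0.25, 0.5, 1.0):
    print(d, effective_steps(20, d))
```

The alternative (what ComfyUI's stock sampler does, as far as I know) is to always run the requested number of steps regardless of denoise.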
the other nodes aren't too bad either. I'd wager 95% of custom nodes are more complicated than mine.
Testing with my experimental fine tune. above image is like 35 steps, 5.5 cfg without refiner
interesting
a slightly dark latent almost always makes images look a little better haha
make the tiger pop
That's what I like about yours, it's not over bloated like a swiss army knife with 50 different options 😉
I'm just messing around
at one point principled was longer with the scaling and some experimental stuff built in but I've sorta minimized it with as sane of defaults as I can. I used to even expose the XL clip sizes but now they're just auto calculated.
so far the only thing is principled won't work with conditioning nodes like IPA.
I could split the conditioning from the node but then it'd be super messy. Loras and whatnot already work so eh
@high skiff I'll try making that gradient node tomorrow probably till then night night
them apples
also, what is realism engine? never heard that name before
it's a model for SDXL. It's not very good from my tests. It almost always comes in last
it makes everyone nude
umm... I did see that happen a couple times lol
sounds like typical CivitAI model, am I right?
where girls were clearly wearing no bras under their dresses so it gave them some really pronounced nipples, or really scantily clad dresses lol
yeah, sounds like something this community makes
I make pizza burgers
he hates it
how about now?
I like how it made him depressed now
they should join forces
yum
Now somebody needs to generate a newspaper article about the Pineapple pizza wars.
lol i like the pineapple head guy in this one, you can imagine they scrambled his voice in the interview
why is there a guy with a pineapple head on there 
a mushroom cloud made out of pineapple pizzas
I'd eat the first dish as a taste-test 
second one though, my taste buds can imagine how nasty that might taste 
🤣
mcgruber!
Is that also served on a jeans? 🤔 Based on the beverages and the overall set-up I would say it was cooked and served by a drunk guy.
Do "where are my farts" next
Hahaha gas pains, the worst
I used Dynavision with several Loras
Aether glitch and Aether bubbles from @icy brook
modern oil paint from @delicate kelp
and sketchbook from @sinful falcon
Not eating crab again...
Hello does anyone know if sdxl have an API
Very nice!
You mean ComfyUI run as an app? Yes, you can do this: use MS Edge and install the web UI as an app on your desktop.
What's on the menu?
Pizza
Without Pineapple!
of course
Everything else would break TOS Rule Nr. 4
Smiling about it...
he's smiling too
and running for the victory
and after, having a Finnish wooden sauna
Ah nice thank you
Hi all, anyone have experience with tiled vae
Experience as in?
I use a tiled vae node in ComfyUI for my final upscaled vae decode
Nice. I like the sketchy ones.
Yes to use less vram
Yeah in ComfyUI, it will automatically switch to tiled vae if you don't specify it, but it will take a few extra seconds for it to detect that.
Alternatively you can set a tiled vae node manually if you know you always hit that threshold anyways, and thus save a few extra seconds.
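The VRAM saving comes from decoding the image in overlapping tiles instead of all at once. A minimal sketch of how the tile boxes could be laid out; `tile_boxes` and its default sizes are illustrative assumptions, and a real tiled VAE also blends the overlapping regions, which is left out here.

```python
def tile_starts(length, tile, overlap):
    """Start offsets so tiles of size `tile` cover `length` with overlap."""
    if overlap >= tile:
        raise ValueError("overlap must be smaller than the tile size")
    if length <= tile:
        return [0]
    step = tile - overlap
    starts = list(range(0, length - tile, step))
    starts.append(length - tile)  # last tile sits flush with the edge
    return starts


def tile_boxes(width, height, tile=512, overlap=64):
    """Return (x0, y0, x1, y1) boxes that cover the whole image."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in tile_starts(height, tile, overlap)
        for x in tile_starts(width, tile, overlap)
    ]


boxes = tile_boxes(2048, 2048)
print(len(boxes), "tiles, first:", boxes[0], "last:", boxes[-1])
```

Peak memory then depends on the tile size rather than the full image, at the cost of more decode calls.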
I heard ComfyUI is more complicated to set up, and I am new to SD
it can be, or you can just download a template someone already made and use it just like any other webui
Or use something like #🐝|swarm-ui which has a more user friendly ui
but still uses Comfy on the backend
John Bauer?
Can someone ELI5 why DALL-E seems to "get" what the prompter is thinking of far more easily than SDXL? It's really a shame they put so many restrictions on its use
It was trained differently to have more focus on that, but it suffers in quality and graininess in the final images
yes
Hi folks, does anyone know what prompt works well to get something similar to this style? I've already tried "vintage design," but nothing happens.
Gravure, lithography
SDXL Base (I use a "more art" LoRA, but it's similar without it)
thanks buddy
maybe also linocut and woodcut. basically anything that has to do with etching the plate for printing in ye olden times
I'll make a coffee and explore these valuable tips, thank you very much
Guys I am new to this, what command should I use to generate an image here?
in the bot channels, use /dream
apparently prompting for an angry Thor, that means he has a light saber
no, I am your father 😆
lol and zombie teeth with a shovel
🤣 gotta change up this prompt hahaha
Good night
popped eyes would hurt
Oh well now you're just horsing around
anyone know a video upscaler? preferably something I can run in a colab. I have a RIFE interpolator, that's working well

Hey, is there any command for ComfyUI and SDXL for Low VRAM? I'd like to know if I can improve the performance of my 8GB 2070 Super.
Comfy defaults to medvram.
I have a 2060S also with 8gb, I wouldn't consider going the full lowvram route
You should probably turn on --disable-smart-memory though
I've read things more carefully, it seems like everything is fine with the use of the 2070S. I confess I was hoping for some miracle to reduce my rendering time.
Always a hope to reduce rendering times haha
What does the refiner do, I forgot!
refines the image.
adds and refines minor minor details, is what it was designed to do.
Less needed as more finetunes have been released, but can still be used at your discretion
have any of you 24 gb giants in here tried using this thing? apparently the code downloads the weights
So I can use the same model to refine?
This is the official refiner model
https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0/blob/main/sd_xl_refiner_1.0_0.9vae.safetensors
hey guys, i read that it is recommended to have at least 32 GB of RAM for SDXL, would it work fine if i set up a pagefile on my SSD?
Saying that, I often (mostly always) now use whatever model I want to do my refiner process.
Sometimes I want the look of Dynavision, but want it to finish out less animated, so will use a different realism model. And/or vice versa
make the process your own, experiment and see what you ultimately prefer
Their HF repo doesn't have the weights uploaded yet.
ah. the code relies on Hugging Face.
Guys, do you have any Upscaler X2 for SDXL? Currently I only have X4.
Where do I put this file and how do I use it?
and What is the best Upscaler?
download it to your normal models folder where all your other models are, and then just select it for the refiner process.
Guessing you're using A1111, I haven't used A1111 since SDXL came out so I'm unsure where exactly that's all at
so models/Stable Diffusion?
Yeah
SDXL is on A1111? isn't it?
Thanks, I'll try it, I'm working with images that don't need x4.
This is my favorite upscaler overall
https://openmodeldb.info/models/4x-NMKD-Siax-CX
sdxl is just a model, so yeah, if you use an sdxl model, then you're using sdxl
I have two models
One called Realism Engine SDXL and RealVisXLV20
I think RealVisXL is based on the Realistic Vision model
I tend to use ZavyChroma very often
SDXL is trained on these ratios, 1MP sizes. Best to stick to these if you can
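If you want a size for a ratio that isn't on the list, a minimal sketch is to solve for roughly one megapixel and snap to a multiple of 64. This is only an approximation: it will not always land on the exact sizes SDXL was trained on, and `sdxl_size` and the snap-to-64 rule are assumptions for illustration.

```python
import math


def sdxl_size(ratio_w, ratio_h, target_pixels=1024 * 1024, multiple=64):
    """Approximate a ~1 megapixel width/height for an aspect ratio."""
    aspect = ratio_w / ratio_h
    width = math.sqrt(target_pixels * aspect)
    height = math.sqrt(target_pixels / aspect)

    def snap(value):
        # Round to the nearest multiple, never below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


for ratio in ((1, 1), (16, 9), (9, 16)):
    print(ratio, sdxl_size(*ratio))
```

When the result differs from the posted list, prefer the list.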
its very good but theres not a "best upscaler"
I have 4x-UltraSharp also
Yeah "best" is very subjective in all of this haha
It's why I said it's the one i prefer personally
@crisp owl somebody said that DATs are very good upscalers. Have you tried them?
Nope, I've meant to go on an upscaler journey, but just always have something else taking my time away lol
🙂
Anyone knows what is the recommended sampling method?
subjective like everything lol
I use either dpmpp_2m/Karras or dpmpp_3m_sde_gpu/exponential (don't think A1111 has that one yet)
DPM++ SDE Karras
i think it has it
Yes, I am using it
ah nice
missing a few, but a good read
ok I will read
Wow, my friend, what an insane speed this model is operating at! This was great! I dream of the day when the base model can do this with images.
so the refiner file goes to Stable Diffusion folder and not the VAE?
correct
Unless A1111 is structured to read differently, but it's a model, so I'd figure where the other models are
It's because the others resize to 4x and then multiply by 0.5 to get 2x. This one apparently just resizes to 2x directly.
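The arithmetic behind the two routes, as a small sketch (`upscale_route` is a made-up helper, nothing from A1111 itself): a 4x model has to produce a 4x image first and then shrink it, while a native 2x model stops at the target size.

```python
def upscale_route(width, height, model_scale, target_scale=2):
    """Return (model output size, resize factor, final size)."""
    model_out = (width * model_scale, height * model_scale)
    resize = target_scale / model_scale
    final = (round(model_out[0] * resize), round(model_out[1] * resize))
    return model_out, resize, final


print(upscale_route(1024, 1024, model_scale=4))  # 4x model, then shrink
print(upscale_route(1024, 1024, model_scale=2))  # native 2x model
```

Both end at the same final size, but the 4x route passes through an intermediate image with four times as many pixels.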
You can as well try SwinIR 2x
Okay
Thanks
What affects the processing speed/generation speed, the gpu, cpu, ram?
mostly gpu
Only the GPU really matters. If your GPU doesn't have enough memory, it will fall back to shared video memory, which slows things down even more.
some but not a whole lot. Some things like models are just called from there, but it doesn't do any processing
I have RTX 3070 Ti
If you have 16 GB you should be OK. With 8 GB of RAM you will probably need a larger page file.
what do you mean
16 GB RAM is 8 VRAM=8 Threads
i think it is apples and oranges 🙂
huh?
idk what are u asking
RAM and VRAM
RAM is way slower. If you don't have enough VRAM, everything can probably still be processed from RAM, but it will take ages. That's why there are the --medvram and --lowvram switches.
BTW, I'd suggest using the --medvram-sdxl switch @heavy zinc
it is good for 8GB gpu cards. Ask if you should use it in #🤝|tech-support
@heavy zinc depends on your sampler and number of samples
Is one able to run SDXL directly from the command line? I’m running an automated process from a GPU that I SSH into, so it’ll be ideal if I can run a generation with a single command. I’ve done this in previous versions but can’t find the txt2img.py file. Any help appreciated :)
Probably easiest to use the diffusers pipeline. You can make a really small python script that just forwards the args through argparse
works for 1.5, 2.1, and other diffusion models too
Isn't diffusers pipeline not as efficient as Comfy though?
it's close. faster than auto for sure.
think comfy is still ahead on my card by like 10% from the subquad impl
on Nvidia with SDP it might be closer even
use torch autocast before you build the pipeline and it go zoom
Also, Comfy will get more efficient very soon, AIT will be a built in thing in ComfyUI (including AMD)
but without autocast yea it crawls. like 5s/it vs 3it/s
I mean IDK how useful comfy is from a shell which was the question
if you really wanna minmax you can set up whatever speedhacks with diffusers. Pretty sure torch compile into tensorrt works for Nvidia
thought that was months ago and they gave up?
cause it wasn't seamless
they gave up because the thing FizzleDorf and I made (I only did part of it, just the compilation of the modules) was already pretty much seamless when it's deployed on supported hardware
thanks a lot. i'm experienced with py but much less so with sd. is this article a good place to start?
but since new ComfyUI versions kill it, now it will be officially kept on life support
I've only used diffusers a few times but it's super easy. On the hugging face website they give you examples to get image gen working in like 3 lines of code.
basically just import pipeline, load model, prompt
yea and if you dive into it there's lots of functionality that most webuis have
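For the "single command over SSH" case, here is a minimal sketch of the small argparse script mentioned above. It assumes `diffusers` and `torch` are installed, an Nvidia GPU, and the public `stabilityai/stable-diffusion-xl-base-1.0` weights; the flag names and defaults are my own choices, not anything official.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Generate one SDXL image from the shell.")
    parser.add_argument("--prompt", help="text prompt")
    parser.add_argument("--negative", default="", help="negative prompt")
    parser.add_argument("--model", default="stabilityai/stable-diffusion-xl-base-1.0")
    parser.add_argument("--steps", type=int, default=30)
    parser.add_argument("--cfg", type=float, default=6.0)
    parser.add_argument("--width", type=int, default=1024)
    parser.add_argument("--height", type=int, default=1024)
    parser.add_argument("--seed", type=int, default=0)
    parser.add_argument("--out", default="out.png")
    return parser


def generate(args):
    # Heavy imports live here so that --help stays fast.
    import torch
    from diffusers import StableDiffusionXLPipeline

    # variant="fp16" assumes the repo ships fp16 weights (the base repo does).
    pipe = StableDiffusionXLPipeline.from_pretrained(
        args.model, torch_dtype=torch.float16, variant="fp16", use_safetensors=True
    )
    pipe.to("cuda")
    generator = torch.Generator(device="cuda").manual_seed(args.seed)
    image = pipe(
        prompt=args.prompt,
        negative_prompt=args.negative,
        num_inference_steps=args.steps,
        guidance_scale=args.cfg,
        width=args.width,
        height=args.height,
        generator=generator,
    ).images[0]
    image.save(args.out)


def main(argv=None):
    parser = build_parser()
    args = parser.parse_args(argv)
    if not args.prompt:
        parser.print_help()
        return
    generate(args)


if __name__ == "__main__":
    main()
```

Saved as, say, `sdxl_cli.py` (hypothetical filename), it would run as `python sdxl_cli.py --prompt "astronaut doing a handstand" --out astro.png`. The same shape should work for 1.5 or 2.1 by swapping the pipeline class and model id.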
perfect tysm man
yea I saw that new repo with a patch file
and the patch didn't apply cause I was a few commits ahead lol
you were actually behind, it's not compatible yet because Comfy needs to do stuff on his end to make it work right away
thought the patch file was to bring it up?
the AIT node will eventually become a built in node
that's literally what the description says
"apply this patch before it's integrated into ComfyUI"
it won't work, I mean- you can try it..
regardless it's still missing a lot of shit in the new repo so I'm not going to try and make it work in amd anytime soon
the only reason the repo will also stay as a custom node when built in is to also provide compilation scripts for AMD users and such
wonder what's stopping it from hot compiling
like Exllama does
does it take 10 minutes for a single kernel or something?
it doesn't compile kernels
or whatever it does
it compiles MSVC engines that are later used to load the models
and it can take even hours, the modules that have a range (the ones I uploaded to the repo that has the precompiled modules) took a few hours to compile
but if you are compiling for specific settings it should take a few minutes
so when you do a range it basically batch compiles for every possible combo of settings...
cursed 99
does tune affect it or only unet arch?
that's the only way to make an optimization that'll work on any hardware and will be as fast as stuff like TRT and OneFlow
OneFlow is not compatible with anything besides the most specific things it's meant for
damn. probably thinking something else
the engines compiled with OneFlow aren't even architecture specific like the ones AIT can make and are even more painful to produce
what'd be extra baller is if AMD added their own backend to torch compile. then it'd be vendor agnostic
pretty sure they already have a graph compiling thing for that similar to cuda
those engines include the model's weights and are checkpoint specific; not ideal. the engines made with AITemplate are architecture specific, so it's automatically a better optimization
also once the repo comes with prebuilt pip packages; the compilation scripts will just work ™️ on anything
heck, due to it being built in in future ComfyUI versions, it could just be included in ComfyUI itself
so then you would just use the node to make the engines, and you'll be able to use them right off the bat
maybe if you can multi thread the compilation it won't be too bad
you'll just need to do it once for every single model architecture, it's not too bad
however if AITemplate would be as flexible as stuff like TRT, you'd need to compile for EVERY MODEL- and that's much worse
just got my first image there! thanks again
(POV, close up portrait, cinematic lighting), (snobby anthropomorphic dolphin:1.3) wearing blue sweater and white sunglasses, pool lounge chair, (POV:1.3),1986, (extremely detailed depth, realistic color, saturated, deep blacks), style of Paul Kidby, highly detailed, style of comic book cover illustration, inspired by William Brodie,professional, clear, high contrast, high saturated, , vivid deep blacks, crystal clear
Would be cool to see how the AI sees reflections and draws them. Cause the reflections are often pretty spot on
it guesses
I don't remember exactly, but doesn't it just look at what it has done before and mirror that?
No clue, but I'd be interested to know haha Just for the curiosity sake
Here it failed a bit, columns reflections are perfect but... Probably not thinking about it as mirror?
Still a pretty decent job at capturing it. A bit wonky yeah, but the concept is there
Hey guys, not sure if this is the right place for my question, but i'll give it a try: I want to recreate images with stable-diffusion-webui/SDXL, which were created using the BotChannels here. But it just produces junk / totally different images. I guess i am not using the correct model/settings or something else. Is there any information about which settings the Bots here use or any other way to reproduce the outputs locally?
My white stage...
is the refiner needed, or can I generate ok images without?
Guys, the question might seem simplistic, but I'd like to know if I can delete the images in both the Output and Temp folders. I want to find out if deleting the contents can affect the generation of the next SDXL and ComfyUI content.
I've blended a duck and a snowball.. LOL
temp folder erases itself each time ComfyUI is started up
https://huggingface.co/segmind/SSD-1B looks neat. the name confuses me though. distilled models in the past have been something different
If anyone's looking for a good time, I nailed down a consistent comic book/graphic novel style I really love. Hope it's fun for someone else too! https://civitai.com/models/173569?modelVersionId=194889
Hey lads, trying to get this to work in comfy https://civitai.com/models/23900/anylora-checkpoint?modelVersionId=95489 but unable to, I've read the civitai description but I don't see any info, am I missing something?
man even SD makes female fantasy races just different flavored sexy humans
worst trope
agreed, 1 out of 10 got an actual female minotaur, rest just chicks
with those overfitted waifu models you can prompt animals and sprinkle in human anatomy words with a low weight and the overfitting will just bend the dragon or whatever into a human shape
might not work as much for pixel art tho
well i was gonna show an example but i guess i deleted it 🙃
how do i fix a lora doing this
These are really great ^ nightmare fuel for me, would make good ecological propaganda
propaganda
the smell must be terrific!
I enjoy the prompt generation on bing and would like to see that functionality added to SDXL.
I think I know what you're trying to convey, but can you expand on that? There's been quite a bit of discussion around that since its update.
I would imagine you mean how well it interprets the prompt.
bing = good
Yes, I do mean how well SDXL interprets the prompt. Say I ask Bing for a person sitting: when I use the same sentence structure in SDXL, I do not get the same results.
I feel very robotic adding a lot of negative prompts using commas. Or for example with a seated subject, having to put in the negative: kneeling, knelt, etc. I don't know how Bing does it but somehow they pull it off.
I guess I like how on bing it feels like I am typing a story, and in SDXL it feels like I am just typing elements which feels more detached and makes me feel more like a programmer.
The reason that seems to make the most sense has to do with SDXL using CLIP and Bing's recent upgrade to DALL-E 3 using the GPT LLM; or at least that's what I've seen discussed, which makes sense to me. Now, implementing that locally? That's a whole other challenge. I don't think it'd be feasible locally. But there are some really knowledgeable people here who are deep in and know the underpinnings better than I do and I'm sure they could speak to it well.
I think you're probably right, but I'm no expert. I do see smarter people than me saying that Bing seems to send it to ChatGPT or something to create the image. There are local ChatGPT-like things, right? Maybe they could just be an add-on like the checkpoints? I would like to use the prompt style from Bing in SDXL, locally of course.
I think over time we'll get there; to me that seems inevitable. But it'll just be continued development both in software and hardware to get us there.
I would also like to be able to get an image created in SDXL by another user load it into my pnginfo and have a list of their addons appear that it can download by checking a box.
If you move to ComfyUI and use Comfy Manager, you can do that with images created there.
often when adding images to pnginfo with the metadata as a new user I am unable to tell what is a word and what is an addon.
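On telling words from add-ons in A1111 prompts: LoRAs and similar extras show up as angle-bracket tags like `<lora:name:weight>`, and everything else is plain prompt text. A minimal sketch that splits the two; `split_prompt` is a made-up helper, and the LoRA names in the example are invented.

```python
import re

# Matches A1111-style extra network tags such as <lora:name:0.8>.
EXTRA_NETWORK = re.compile(r"<(\w+):([^:>]+)(?::([^>]+))?>")


def split_prompt(prompt):
    """Return (plain prompt text, list of add-ons found in the prompt)."""
    addons = [
        {"type": kind, "name": name, "weight": weight or "1"}
        for kind, name, weight in EXTRA_NETWORK.findall(prompt)
    ]
    words = EXTRA_NETWORK.sub("", prompt)
    # Collapse the empty comma slots left behind by the removed tags.
    words = re.sub(r"\s*,\s*(,\s*)*", ", ", words).strip(" ,")
    return words, addons


text, addons = split_prompt(
    "portrait of a woman, <lora:sketchbook:0.8>, oil paint, <lora:glitch:0.5>"
)
print(text)
for addon in addons:
    print(addon)
```

The `name` part is the LoRA's file name without extension, so that is what you would search for on a model site. Embeddings are trickier since they look like ordinary words in the prompt.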
It won't help with A1111 images, but it's a step in the right direction.
A lot of us here use Comfy.
I am using A1111 so could this be better solved by building that into SDXL? that way we all can read it no matter who made it if it was made in sdxl?
I just followed a YT tutorial I had no reason to pick A1111 over anything else, i chose it arbitrarily not knowing the pros cons or even that other things exist.
It's not really about SDXL so much as it is the program wrapped around inserting the resulting image into whatever container it goes in.
So what really is SDXL, so I can better understand what I mean when I talk about it? Is it, uhhh, hmmmm, I don't know what to compare it to. Is it like Windows, and A1111 is a program?
You should check out ComfyUI. Don't get discouraged at first, just take it slow. Grab an SDXL workflow for ComfyUI, load it up, and begin experimenting. The other thing I recommend is start to read up on all the terminology so that you know what each setting really does...that helps a TON with tuning your image creation.
theres a bunch of gpt models u can run locally but you need 💸 for it
Well, I might do that next generation. Right now I can't get SDXL to do what I want either way. I can get Bing to do it though, which is aggravating since I have SDXL sitting here working just fine.
I am willing to spend money once this whole hand thing gets fixed lol. then i fear maybe it will hit the normie sphere but maybe that isnt a bad thing.
u could also hook gpt to SD so it "fixes" your prompt and sends it to SD but its useless since their bing integration is better
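The "hook GPT to SD" idea is just a text step in front of the image step. A minimal sketch with the language model left as a plug-in function: `rewrite_prompt` and `fake_llm` are made-up names, and `fake_llm` is a stand-in that returns a canned answer so the example runs without any model or API.

```python
def rewrite_prompt(story, llm):
    """Ask a language model to turn a sentence into an SD-style prompt.

    `llm` is any callable that takes a string and returns a string,
    for example a wrapper around a local model.
    """
    instruction = (
        "Rewrite the following description as a short, comma-separated "
        "Stable Diffusion prompt. Reply with the prompt only.\n\n" + story
    )
    return llm(instruction).strip()


def fake_llm(text):
    # Stand-in for a real model call; always returns the same reply.
    return " woman sitting on a park bench, autumn, soft light \n"


prompt = rewrite_prompt("A woman is sitting on a bench in the park in autumn.", fake_llm)
print(prompt)  # this string is what would be sent on to SDXL
```

The rewritten prompt still goes through CLIP on the SDXL side, so this only changes what you type, not how well the model follows it.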
Stable Diffusion in general is image inference from models. The common model versions now are 1.5 and SDXL. XL can infer generally reliably for images sized about 1 megapixel, while 1.5 models infer at one quarter of that size.
i am happy, very happy with the quality its amazing really! even with my junk 2070 to me it looks incredible.
It'll only get better as time goes on. Enjoy the ride. 🙂
wait what is this? link it to SD? that might help solve my problem maybe I have been having in SDXL
A small GPT model you could run locally on 24 GB of VRAM, so a 3090 or a 4090 would be good enough. If you want GPT-3 or 4 you would probably need a cluster of A100s, that's like $50k probably.
But in the end, it's still just passing the verbiage to SDXL, which is interpreting it via CLIP.
So am I understanding it right that the hands look wonky in SDXL because I use really low settings on purpose? If I dialed them up, would it fix the hands? If it can, I'd be interested in checking out combining it with GPT.
Hands are always a challenge...eventually, add-ons will get better for SDXL with that, solving the issue.
Ive seen some horrors on Bing as well, hands, extra legs, humans bent up like pretzels to fit in the frame, its bizarre. so it seems everyone is in the same boat for deformities.
so with SDXL will there be new versions, or does it just update the current version? for example would there be a SDXL 1.0 1.1 etc?
This DALLE on bing it uses its own code for the whole image? asking since when I have multiple subjects sometimes the front one is crystal clear, but the other ones looks like poorly rendered SD peoples faces.
It's possible. I don't know if anyone here actually knows if there are plans for that aside from those at Stable.
you can fix faces with adetailer extension
(re: new versions)
from what I have seen as a new user and outsider it seems to me that SDXL currently is superior to the other SD versions. That is my personal opinion only, speaking from seeing photorealistic images.
both versions can create beautiful photorealistic images but sdxl is more coherent
I better go thank you! just wanted to leave that feedback about the prompting method which I enjoy on bing and would like to see implemented. Computers will catch up if they need more computational power.
SD 1.5 has the advantage of being around longer and having lower requirements, therefore there are a lot of add-ons/extensions to give it better capabilities than it had out of the box. But give SDXL time and that will occur more and more. SDXL definitely has a better starting position.
👀
is that pytorch rocm or rocm repo?
yea
which one he asked?
Damn tattoos are getting good 👍
Hey guys is there any prompt generator which help you construct a prompt to generate proper images?
Hello everyone,
I'm excited to share Fooocus-Control.
Fooocus-Control is a ⭐free⭐ image generating software (based on Fooocus , ControlNet ,👉SDXL , IP-Adapter , etc.).
Fooocus-Control adds more control to the original Fooocus software.
what's the word for those fucking michael bay black movie bar things so I can add it to negative
letterbox maybe
cinematic bars?
Letterbox is correct. Michael bay is incorrect. They would be exploding if they were from Michael Bay.
beyond the explosions his movies are super tonemapped and widescreen
or at least transformers was
Could it be that your latent offset node forces them if you have cinematic in the prompt and have it turned down to a negative value? Just assumptions here.
yea im using a 100% black latent which paired with underwater monster seems to pull from movie stills and results in black bars
turning it down to like 90% fixes but it's not as contrasty I guess
I love the nodes btw
mermaid 💗
when it's not making IMAX letterboxing, 100% black can do some dope things
Hey folks, are there any prompts besides 'brush strokes' that work effectively for emphasizing brushwork in painting?
Try "thick paint", not sure it will work but probably will.

