#✨|sdxl
you could edit the metadata to insert some vaguely similar prompts and such instead of yours, there should be some tool online for it if you're not into python much, no idea other than that
Yeah, someone is bound to take it on. Backend is fine it is the front end that needs the work.
Yeah. I mean there is Swarm which McMonkey wrote. I'm certain there will be more, or is more.
I've just gotten so used to it now
I can't get used to it as it screws me over when I start to go complex.
Too much noise on the screen and my mind melts
I have my fair share of moments where I'm just staring at the screen trying to figure out what needs to go where next lol
Slowly in the process of building out a new 1.5 workflow to share
There's a branch working on subprocesses
those help with screen mess
Ableton Live has had radio nodes for a very long time
Forgive me if it's been asked, but I'm trying to learn the ins and outs of the prompt system. What I've noticed is that my attempts to replicate an SDXL style on my own (i.e. "comic book") don't look anywhere near as good as when I tell the dream command to use the comic book style. Are there any available resources on what underlying style prompts exist?
radio nodes are simply two nodes a transmit and a receive. 100% no wires between on the screen between those nodes and so easy to trace issues down.
I never used dream so not sure.
@crisp owl Damn, I changed your workflow to be more in line with what I know from the website and it stopped being the same.
Got it
had the picture still open before you deleted it
works now
If I save either pic does the entire workflow come along for the ride?
Yup
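A minimal sketch of how one might check that a saved picture still carries its workflow, using only the Python standard library. It assumes the image is a PNG and that ComfyUI stores the graph as plain tEXt chunks under keys such as "workflow" and "prompt"; those key names are an assumption here, not something stated in the chat. The helper name read_png_text_chunks is hypothetical, and chunk CRCs are not verified.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data: bytes) -> dict:
    """Return {keyword: text} for every uncompressed tEXt chunk in PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    found = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            # tEXt payload is "keyword\0text", both Latin-1 encoded.
            keyword, _, text = body.partition(b"\x00")
            found[keyword.decode("latin-1")] = text.decode("latin-1")
        elif chunk_type == b"IEND":
            break
        pos += 12 + length  # skip length + type + data + CRC
    return found
```

To try it, read a saved image with open(path, "rb").read() and look at the returned keys. An empty dict would suggest the metadata was stripped somewhere along the way.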
Oh, sweet. I have so many workflows I was saving
well you're working on workflows...I'm working on styles. There're so many styles in base model that it's incredible
macro
by Slinkachu
there are thousands of styles
I tried a few hundred
Damn, I tried a few dozen then saw the onslaught of them and said that was enough
I also found an incredible Lora that powers all images and all styles
xl_more_Art
Kyoto Animation studio, as usual sdxl base
Looking to the Future...
Toei Animation studio
Michael Carson
Go pro camera style got taken literally... 🤭
Apologies, which model are you referring to, is it the base stability 1.0?
Bread Pit. Am i doing it right?
So.. the joke is about replicating a famous person and then naming them in a weird phony/botched name, but that doesn't mean that the AI generated image matches what you name it as, so for example, Bread Pit isn't literally a "Bread Pit", but is just an image of Brad Pitt. 
I am currently learning Yapanese, I hope you know the language
Beats me it is something SD decided to do. I moved on and didn't save it
ahh
Who 💩 in my porridge!
Pretty stable Photoreal (donald trump:1.3) photograph, (television broadcast, news footage:1.2), charging up, spiky saiyan hair,realism
😬
Japanese CEO trump
Is there some sort of API endpoint that I can run a prompt through programmatically to check it for filtered words? Or is there a CSV of filtered words?
Added more latent manipulation nodes if anyone wants to toy
https://github.com/Beinsezii/bsz-cui-extras/blob/master/bsz-nodes/bsz-latent-manipulation.py
different latent colors @ 70% strength
left column: red/green/blue/black
right column: cyan/magenta/yellow/white
So many things to tinker with 😅
black/white is the most useful by far
its basically just an input version of what my offset node does
I liked those two the best
white @ 70 is a bit strong
though the yellow is a nice effect also
yea the scaling isnt even it seems. Blue @ 50% is mostly gray while cyan @ 50% isnt even close
same how you can set black all the way to 100% and still get a colorful image while white @ 100 will turn it almost into line art
Yeah but that's just in general, sometimes changing a color like cyan vs yellow in photoshop also yields near nothing to completely changing the entire image
its extra pronounced in latent basically
the more luminant colors are spicier
why the right column is more extreme changes
rough approximation of SDXL latent channels based on just kinda looking at it
the rightmost is like cream at 10.0 and dark navy blue at -10.0 so idkwtf to call that
it's like off-luma
hot/cold luma
hey guys
beats me, that's beyond what I've looked at lol
yo @crisp owl , can I ask a followup question?
I'll answer if I know lol
I correctly set the Inspire nodes, but there is no image at the end
there's a latent debug node on my repo that spits out the average values of each channel if you wanna play
The sampler finishes but nothing shows up
Does anything show up in your output folder?
"unload clone 1"???
yeah idk honestly
I've noticed that before in my cmd, never could determine where it came from.
But I still got images
What's your full workflow look like?
no idea, but yes I got custom nodes
mines super different lol
I can't really post an output workflow since it doesn't do outputs...
without the Inspire nodes it works
So maybe it loads all the prompts at the same time? CLIP encode took a long time (3-4 minutes) to finish before I cut down the number of files
I tried both approaches, all in one file, and different files
Am I doing something dumb by any chance? Sorry if its obvious
First try just putting normal text prompts through into your ksampler to determine if it's the conditioning where it's breaking or something else.
is that an XL model or 1.5?
yes, it's the conditioning breaking, works that way
Conditioning takes amazingly long , 3 minutes or so with a big text file, just what I want to accomplish with this setup...
yes 1.5
Well, I haven't played around with that aspect of the pack yet, so really uncertain how it is required to be setup. Seems it should be pretty simple though if it's just pulling from the txt file, and even if it pulled nothing, it should still result in an image as an empty prompt still generates an image.
So it could be it doesn't like the 1.5 models?
Or something is connected incorrectly
Yes... The wiki says it zips the prompts, then it unzips it. Doesn't really say if the prompts are loaded all at once, or what the purpose of the node is...
I don't think the setup (other than the inspire nodes) is wrong, because it works when I remove them and use manual clip input
Do you know if the creator messages here?
He does, but goes by a different name if I recall, perhaps he doesn't want to be bothered here
ah makes sense, couldnt find him by name
It seems there is another solution to this, will try it tomorrow
Someone also said there is no control over the sequence of prompts in the inspire pack
Always more than one way in Comfy lol
Hahah yeah seems so, thanks again for the help @crisp owl
thank you as well, gn!
used IPA to make the room darker
that'd probably be a good use case for offset latent
Ive only played a bit with the offset lora. thought I would try with this
if you start with a latent closer to black instead of middle gray it darkens the image
ex: #✨|sdxl message
yeah theres a lot to play with there
you could probably ref a black image in IPA to achieve something similar
modifying the input latent is free performance-wise though. idk if ipa slows it down like cnet and lora
tried a complete black image and result wasnt as good
I used an image that was generated with more shadows, less lighting
@vital ermine ayyy the blockers are gone in miopen. an internal testing build might be up on dockerhub by the time your buddy gets his 7900
oh, snap
Oh okay mcdonad's, keep half filling my fries and giving this diabetic regular cola, one day people are going to say they're done, and then what...
major libs all have a 6.0 release branch cut now too it seems
5.7 was put together pretty fast so hopefully 6.0 doesnt end up like the first rdna 3 release where it's just internal testing for like a solid month before any news of the actual release
they're hustlin lmao
thats not even half of the libs with major activity in the last hour
yep, that was reducing the ipa image levels in PS
never use levels for lowering values, always use a curve
so the clipping vals are retained
levels are for creating clipped values in the first place
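A rough sketch of the levels-vs-curve point, on normalized 0.0-1.0 pixel values. This is one reading of the advice, not how Photoshop implements either tool: "levels" is modeled as moving the input black point (which clips shadows), and "curve" as a simple gamma curve that keeps both endpoints. The function names and the 0.2 / 2.0 defaults are made up for illustration.

```python
def levels_darken(value: float, black_point: float = 0.2) -> float:
    """Levels-style darkening: everything at or below black_point clips to 0."""
    scaled = (value - black_point) / (1.0 - black_point)
    return min(1.0, max(0.0, scaled))


def curve_darken(value: float, gamma: float = 2.0) -> float:
    """Curve-style darkening: a gamma curve that keeps 0 -> 0 and 1 -> 1."""
    return value ** gamma


shadows = [0.05, 0.10, 0.15]
print([levels_darken(v) for v in shadows])  # all three collapse to 0.0
print([curve_darken(v) for v in shadows])   # still three distinct values
```

Both darken the image, but only the curve keeps the shadow values distinguishable, which is presumably what matters when the darkened image is used as an IPA reference.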
I reduced the greyscale
this was just a darker image, not altering anything
it's interesting that it actually changed the image based on the input image lighting
yea, all 8 of the color examples I posted earlier are the same prompt/seed
just different latent colors
Yeah, tomorrow it arrives and he is a Nvidia guy so knows nothing of the AMD world. He does know this will be "fun" to get all to work.
Worst part is he trains so I hope 6 fixes that
she asked me how to make beef stroganoff... I sent her this
i think it turned out quite well
i dont get which vae exactly use with sdxl
i always get weird artifacts for some vae
even the new fixed ones
like rainbow colored scanlines, can't really describe it
https://huggingface.co/stabilityai/sdxl-vae/resolve/main/sdxl_vae.safetensors
this is the correct one. it's the SDXL 0.9 VAE and produces no decoding artifacts
boy typos do some things sometimes...
its weird i had some fp16 fix versions that work for me
all those produce scan lines
looks ok now
🙂
do u know if there is some full package of all the different upscalers? 😄
you can probably find some collection repos on huggingface but I got mine mostly from https://openmodeldb.info/ and github
many are using 4xUltrasharp, lollipop or upscalers by NMKD. I personally use 4x SwinIR s64w8 https://openmodeldb.info/models/4x-classicalSR-DIV2K-s64w8-SwinIR-M. it gives me the best output quality compared to many others I've tested including ultrasharp
+1 for that swinIR upscaler. I found it to be really good
i found 4xultrasharp the best for an in-between upscale if you're doing further denoising after
yeah it's a bit slower than 4xUltrasharp but absolutely worth it in my opinion. the details are clearer, with less over-sharpening and fewer artifacts - at least when creating realism images
yeah as in-between it's great because it's really fast. but as a final or post-processing upscaler SwinIR s64w8 looks much better in my opinion and it doesn't add unnecessary details and stays very close to the source
ultrasharps' aggressive smoothing and accentuation also helps the 2nd denoise pass follow details better.
while the denoise fixes any ringing or other overtune artifacts
yeah - it's great for that
I did start to experiment by injecting noise in between passes but it's quite wild if you put upscalers in the mix heh. so I mostly upscale the latent now without a model
I even tried an iterative approach where I latent upscale and denoise between 768p -> 1080p -> 1440p -> 2160p
okay - I mean of course I get a lot more bad images but the ones that work have pretty good details
and it was still just a soupy mess
I can latent upscale up to like 1920x1080 without issue but every extra pixel after that gets exponentially harder
is highresfix in auto doing the same as ultimate upscale ? like when i have 1024x1024 and do a highresfix on 2
is it the same when i do it in ultimate upscale?
it is a simple img2img , isnt it?
ultimate tiles and does other stuff doesnt it
high res fix can do latent upscale, ultimate upscale cant
I never managed to get latent upscale to work and everyone said it is rubbish because they couldn't either.
but in the end its a img2img? like when i send my pic to img2img, set everything the same, and set the res higher, denoise -> i get the same result?
when it works it works but when it doesn't it just destroys the image
needs a higher denoise, at least 0.5
u have to try different denoising strengths to make it work and also some loras can break it or make the img deformed
yeah, for me 0.6+ and my image is so changed just not worth it
whats beef strokingoff
hmm... so right now I'm also using a lot out-of-spec resolutions to get those nice details. more bad images, but worth it when it hits. depends on the model. The Vision models by @delicate kelp have really great coherence and work quite well with higher resolutions.
so my current experimental workflow (ProtoVision 0.6.2.0):
- 1st pass: in-spec 1216x832 (optional FreeU to try to get more coherence)
- latent upscale scale_by 1.25 to 1520x1040
- (optional) injecting noise via Power Fractal Noise
- 2nd pass: latent upscale pass (hires fix)
- pixel upscale using SwinIR s64w8 scale_by 0.33
- final output resolution: 3040x2080
0.7 🙂
man i dont know what gpus yall have but my 1070ti is sucking
1520x1040 is so small you could almost generate that directly
but the sweetspot is 0.57 from my tests
everything below does too much latent blurriness
give him a mccrack pipe
yes, but you won't get those details without the latent pass
hes like a joker ronnie hybrid
problem is upscalers like Swin/ESR looks like shit on a native 4k display
so you need a latent 4k pass
oh i have no idea about those more passes, just using auto 1111 😄
cant just render like 20% higher res then SwinIR the rest
What gpu do you guys use?
3060ti
I'm not a fan of pixel upscalers at all. but I don't think they look super bad when I use them in my pipeline and I got good displays I suppose
3dfx voodoo 3
i will probably upgrade to a 3060 soon
im on a gtx 1070 but my upscaling can never go more than 2
for sd 1.5 just use tiled vae
my old wallpapers were all rendered in Blender @ native 4k, moving to SD wallpapers and 1080p upscaled to 4k looks like blurry garbage
gives u a lot of power
yes medvram il grab my prompts
but this time i upscaled with ultimate upscale
not sure if i can do 2x out of the box
maybe with tiled vae if its available for sdxl, im not sure
what upscale model you use with UUSD?
do you use sdxl models @static prawn
0.18 denoise 30, @ tile size = resolution size of image
this time it was 4x ultrasharp, i love the image itself but the quality isn't the best for the cat, anyway i like sd 1.5 way more than sdxl / im not so much interested in photorealistic stuff and i think that's where sdxl shines
this is my only prompt i have --medvram-sdxl
@nimble heart check these in full size.
- base 1536x1024
- upscale before latent upscale to 2304x1536
- final output: 3041x2028
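The numbers in this list can be checked with a little arithmetic. A small sketch, assuming the latent upscale factor is 1.5 (inferred from 1536 -> 2304, not stated) and that the pixel upscale is the 4x SwinIR model followed by the scale_by 0.33 mentioned earlier. The helper name scale_size is hypothetical.

```python
def scale_size(width: int, height: int, factor: float) -> tuple:
    """Scale a resolution by a factor and round to whole pixels."""
    return round(width * factor), round(height * factor)


base = (1536, 1024)
latent = scale_size(*base, 1.5)        # assumed latent upscale factor
final = scale_size(*latent, 4 * 0.33)  # 4x model, then scale_by 0.33

print(base, "->", latent, "->", final)
```

Under those assumptions the chain lands on 2304x1536 and then 3041x2028, matching the list.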
i mean i like the cat but missing some details here and there, but i like the overall composition
i need more power lol ai art is addictive
but i use auto1111 so i dont have the power like u can have with comfy ui 😄
See how artistic SDXL is?
ultrasharp @ 0.3 denoise vs latent upscale @ 0.55
with sdxl
yeah if you compare it to a real native 4K render - sure. I totally agree
beauty
is that brittney spears
so I cant use the refiner either, even with ultrasharp
native 4k? no it can't
brittany
base XL does okay enough with a low denoise I can get some images to work
but yeah photorealism is exceptional
and that's no refine, no upscale. not sure where the upscaled version went, it is ridiculous though
4k @ 0.3 denoise coming from Ultrasharp. dont look too bad imo
so this is not so hot
0.3 is enough to clean up the upscaling fuzzyness
but yea refiner breaks even @ like 0.1
checked the details and sharpness on my images? they are not native 4k sharp, but I suppose the fidelity is alright 😉
there's a decent one
any generative fill plugins for SD to only change certain sections of an image yet?
it's very stylized. if your intention was to push it towards realism it does look blurry. it depends on the style you want to do here
that was an SDXL artistic one
clean edges really. I mostly dont do photoreal
so the ultrasharp's into 4k denoise has worked fine.
idk I can get up to like 1440p probably with purely latent techniques so maybe SwinIR or something will perform better with more input data
yeah - it really depends what you are aiming for. as a stylized theme this looks very cool. but of course you are looking for specific details you want enhanced
do you see yourself losing a lot of details when upscaling, and does the first pass look better to you?
but yea I have a few SD 1.5 wallpapers that i made using 1080p upscaled with whatever GAN looked best and the comparison to a 4k denoise isnt even close
wdym
I mean how does the first pass image look - does it have the same problems you find in the upscale?
as the GAN upscale or the latent?
first pass looks fine. usually the best version for thumbnail sizes
I do 1368x768
these are all the passes from one of mine
- starting with native 1536x1024
- latent upscale (denoising strength 0.57)
- final output (3041x2028)
- final output + post processing (some film grain...)
not 4k I know 😉
1024 -> 2028 just latent or is that also swin?
there are definitely limits how much you can push it. at one point you will just get garbage
1368x768 is SDXL's native 16:9 aspect
yeah and I work a lot in 1536x1024. it's out of spec and does give you deformations and mutations, but it still gives you nice images and the fidelity can be a lot higher - because more native pixels
first is ultrasharp4x into a 30% 4k denoise and the second is 1080p XL just upscaled with SwinIR. the high frequency details look jpeg-artifacty to me in the 1080 scale without the additional 4k denoise
seeds are a bit different ofc but you get the idea
yeah these are upscale artifacts
@jolly zinc is yours like this after enabling tiling it wont let me pick an SDXL model checkpoint
set COMMANDLINE_ARGS= --medvram-sdxl
1.5 couldnt render above 1080 so I never tried in-between resolutions
guess I could
actually mistake, that 2nd one was the wrong scaler
I started doing 1920x1080 experiments with SD2 and got about 2 good images in 50 maybe heh. so lots of wasted compute. SDXL is much better at it
I mean I can't say how swinir s64w8 works with every kind of style or outlines, but it does work great for realism and also for renders
but there's a limit. I tried out how far I can push it before the look gets too artificial
left is 1st -> ultrasharp -> 4k 30%
right is 1st -> 1080 latent 60% -> swinIR
was to demo how GANS can't produce native clarity
im doing a 1st -> 1440 latent 60% -> swinir now
but the kernel compiles are killing me
I mean as said it really depends on the style, but I don't know how coherent the base image is. if you look on all the electronics components, they are not very coherent in the upscale - lots of AI noise.
the upscalers might not be able to handle that in a satisfactory way to retain the style you are looking for.
bro 
alright these are the 3 results
yeah I'm not criticizing the image, style or artwork at all. it's just that upscalers might not be good at upscaling textures like that
yea they do good at small repeating textures like dirt and anime lines and thats it
hence why I'm so determined to get a native 4k pass
10240x5760 
so you are doing supersampling
this could give you sharpness, but also new artifacts
wdym "so you are doing supersampling"?
swin 4x's it, you're always going to have to downscale if its your last step
if the final step is just an SDXL denoise I dont supersample
i denoise at my native monitor resolution
i wonder...
yea idk i try lots of things but the 1MPX -> ultrasharp -> 4k 30% always looks closest to native to me. Its what Sytan developed and I just integrated it into my principled node.
4k runs @ like 7s/it for me so it kinda hurts but I mean it looks so much nicer when it works out it's worth it IMO
kinda wild 4k is 7s/it but 1080p is like 1.4 it/s
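For scale, a quick back-of-the-envelope on those two speeds. This assumes "4k" means 3840x2160 and "1080p" means 1920x1080. The attention remark in the comments is a general property of self-attention (cost grows roughly with the square of the token count), offered as one plausible contributor rather than a measured explanation for this particular GPU.

```python
PIXELS_1080P = 1920 * 1080
PIXELS_4K = 3840 * 2160

sec_per_it_1080p = 1 / 1.4  # quoted as "1.4 it/s"
sec_per_it_4k = 7.0         # quoted as "7s/it"

pixel_ratio = PIXELS_4K / PIXELS_1080P        # 4x the pixels
slowdown = sec_per_it_4k / sec_per_it_1080p   # about 9.8x slower per step

# If cost were purely linear in pixels the slowdown would equal pixel_ratio;
# if it were purely quadratic (attention-dominated) it would be pixel_ratio ** 2.
print(f"pixels: {pixel_ratio:.1f}x, time per step: {slowdown:.1f}x")
print(f"linear bound: {pixel_ratio:.0f}x, quadratic bound: {pixel_ratio ** 2:.0f}x")
```

So the observed slowdown sits between the linear and quadratic cases, which is at least consistent with part of the step scaling worse than linearly in resolution.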
yeah. that's why I work a lot in out-of-spec resolutions. it might not be the most productive workflow since you are wasting a lot of compute, but when it works the fidelity can be pretty good. these use just 1 sampler pass in native 1920x1024.
why not just add the extra 56 pixels to do native 1080p?
because it will perform worse. you will get more incoherent images when using full-hd. it makes a difference from my experiments
keeping the one edge native res?
tbh I tested that a bit when people insisted that mults of 64 performed better and I found it was more an area thing than an exact size thing
1920x768 (~21:9)
1920x1024 (~16:9)
are both working quite well
like in this 1920x1080 image you can basically see where it loses itself. the approximate 1024x1024 square on the right with the girl looks OK then when you leave that it gets funky
and that's how you end up with dupes
like it only has 1024² pixels of vision to look at at once
so the shape of those pixels isn't really the killer, it's more the area
are you using all the SDXL text encoder values (text encoder width/height, target width/height)? it's not a 100% fix but it can greatly increase proportions in different specs
yes
well
not crop
just height/width + targets
they're set automatically by my Principled node
based on latent sizes
maybe you could fiddle with crop to center the 'square' more I guess
evenin'
I don't use it. does it set the target values correctly based on SAI's best practice? that is that the longest side is always 4096
Not at all
but it should 🙂
I did my own measurements specifically for upscaling
yeah
I found targeting the ratio as 1MPX did the best
personally
the difference is super tiny though
of course. there's no right or wrong. I just adapted what we discussed with Joe Penna in chat and built the Preset Ratio Selector node with humblemikey (Mikey Nodes). I built a preset file with all official SDXL resolutions from the official paper and calculated all the target values following this method
tbh I found the official resolution spread kinda lame too. Long as it's 1MPX you're good
so I do 1368x768 since its closer to 16:9
instead of 1344 or whatever they listed
well I wouldn't call it lame. it's the resolutions it was trained on
training ≠ mandatory
im a bit sick of people arguing with me that 1368x768 is literally satan
yes, but there are reasons for it
I work in film so I want industry standards as well 😄
i mean if there's a reason to use the res for you besides "its what's on the list" then go for it.
sdxl-raw-hdr please
but if someone's trying to make a 16:9 image specifically, use what's closest which is 1368x768
yeah it's the closest to 16:9 in-spec resolution
I've never had one person produce compelling evidence that the tiny difference in aspect between 1368 and 1344 or whatever does anything outside of mild seed variance
or 1536x640 for ~21:9-ish
Like I'm down to change my mind but whenever I ask for a clear demonstration it's just some random cherrypick like "look how cool this image is"
yeah you can use whatever res you like. you do get more coherent images if you stay within 1024x1024. but since I mostly work outside of that it's rng anyway
its more you get more coherent the squarer it is I found
21:9 can be rough even at 1 megapixel sometimes
since it's a 1024x1024 model that makes sense. however, and a lot of work was invested in that, if the target values are used you can make very wide or tall aspect ratios without much deformation
think I did 9:16 and cropped for my phone cause 9:21 was rough
yea my node straight up takes the latent sizes and feeds them into the clip sizes with the appropriate math so that shouldn't be an issue
this isn't really a stable rule. this black box neural network thing does sometimes stuff that makes no sense because we have no tools to analyze it
my code for sizes in Principled
yeah I think we talked about that before - you also looked at my json with your fancy shell viewer 😄
I used to have them set higher res but idk I found targeting XL's native res helped it not add random noisy BS
I've evolved since then. I switched from bash to nushell which can natively manipulate dataframes such as json
it did fix many of my proportions issues when using 4096 on the longest side (target values) when working in a) native resolutions b) very high or wide aspect ratios
my first pass is always native res so I guess I havent had that issue. it was mostly for doing a 2nd pass denoise
more the a1111 way of doing things I guess
in a1111 you can't even access all that stuff heh
I meant like hiresfix
my principled node is built to do an a1111 hiresfix
but better
i havent tested the 4096 thing as far as duplication goes but I couldnt imagine it'd help tbh
maybe I'm misunderstanding what you mean, but the target values are not meant as your final resolution output values. they are configuring the target bucket values when pulling data from the SDXL model and are supposed to help with coherence.
it's sadly not really explained anywhere afaik how it really works. I only have the info from talking to SAI and what I picked up in chat.
yea
I typically target 1MPX cause it seems to do the best all around
in the appropriate aspect
but I meant I havent tried say 16:9 with a 4096 edge as target for things like direct 1080p
I have not explored it further the last few weeks since AIT doesn't work correctly with many increments of SDXL. so only 1920x1024 works - not 1080
for 1368x768 using my current method it'd be
width: 1368
height: 768
target_width: 1368
target_height: 768
and for 1920x1080 it'd be
width: 1920
height: 1080
target_width: 1368
target_height: 768
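A sketch of the "target 1MPX in the appropriate aspect" idea, for anyone who wants to reproduce those numbers. This is not the Principled node's actual code. The function name, the 1024x1024 area and the snap-to-multiple-of-8 step are assumptions, chosen because they reproduce the 1368x768 target quoted above.

```python
import math


def one_megapixel_target(width: int, height: int,
                         area: int = 1024 * 1024, multiple: int = 8) -> tuple:
    """Keep the aspect ratio, rescale to roughly `area` pixels, snap to `multiple`."""
    aspect = width / height
    raw_w = math.sqrt(area * aspect)
    raw_h = math.sqrt(area / aspect)

    def snap(value: float) -> int:
        return int(round(value / multiple)) * multiple

    return snap(raw_w), snap(raw_h)


print(one_megapixel_target(1368, 768))
print(one_megapixel_target(1920, 1080))
```

Both calls come out at 1368x768 under these assumptions, so a 1080p latent would get the same target values as the 1MPX 16:9 one.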
I found by targeting low it seems to reduce the amount of random AI noise gibberish you start to get at higher resolutions
"1344x768 (AR ~16:9 / Near 7:4 / DEC 1.75:1)": {
"custom_latent_w": 1344,
"custom_latent_h": 768,
"cte_w": 1344,
"cte_h": 768,
"target_w": 4096,
"target_h": 2340,
"crop_w": 0,
"crop_h": 0
},
"[CUSTOM] 1920x1080 (AR 16:9 / DEC 1.78)": {
"custom_latent_w": 1920,
"custom_latent_h": 1080,
"cte_w": 1920,
"cte_h": 1080,
"target_w": 4096,
"target_h": 2304,
"crop_w": 0,
"crop_h": 0
}
i've not really played with width/height I just leave those at the latent sizes because I'm not sure what they do tbh
idk. i'll leave it as-is and I'll test that later once I can get flash attention or something working
get more than 1.3 it/s on 1080p
target values have been rounded by multiples of 16
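The other approach (longest side 4096, rounded to a multiple of 16) could be sketched like this. The function name is hypothetical and the exact rounding used in the preset file is not shown in chat, so treat this as an approximation. For 1344x768 this version gives a target height of 2336, while the preset above lists 2340, so the real preset may round differently.

```python
def longest_side_target(width: int, height: int,
                        longest: int = 4096, multiple: int = 16) -> tuple:
    """Scale so the longest side equals `longest`, then snap both sides to `multiple`."""
    scale = longest / max(width, height)

    def snap(value: float) -> int:
        return int(round(value * scale / multiple)) * multiple

    return snap(width), snap(height)


print(longest_side_target(1920, 1080))
print(longest_side_target(1344, 768))
```

The 1920x1080 case matches the 4096x2304 entry in the custom preset.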
yea
other people round at 64
idk
if its just affecting what "bucket" its giving extra attention to rounding shouldnt matter since those are pixel-space
I choose 16 because it was recommended when we talked about it in chat and also some experimental nodes by SAI used that
so I chose this approach
yea im too cynical to use things people in chat recommend without A|B|X testing it myself on 20 images first
man I was wondering why it took so long to render then when it finished I realized I had all the wrong denoising settings lol
sure - but it wasn't just some random chatter 😉 also I did so many tests... and found it worked quite well
this is all so new, experimentation is the only way to go anyway
there's no right or wrong
no
yea
I defaulted the refiner node to being muted in my workflow and honestly I've never felt the need to turn it back on
just way more consistent
okay much better
I only used 1 sampler for the last couple of weeks and have recently included a latent upscale / hiresfix pass
I did explore the refiner and it does have its uses. I have it included in my workflow but it's almost always bypassed.
I fix eyes and other details with the latent upscale now - the refiner sometimes changes a lot of stuff or fixes a face but gets rid of a lot of texture details
it can be used for more stylized stuff, but I tried a lot to make it work for me
that feeling when you accidentally load a regular model instead of refiner and generate a hundred images and wonder why they all look off
something I have not seen, i looked a little, but are there any refine refiner models anyone has trained? or is that not even a thing
refiner only images can be great too 😄
yeah I've done that too
or reversed the m
that's ...creepy , dont like the tall man
tall man scary
i wondered why my stroganoff looked a bit off
cos i was refining with dream shaper
just imagine 2 people wearing the coat
it clearly is, i see his arm, sneaky
hehe
try creating an image with refiner starting with a 768x resolution and then upscale
nothing wrong about it. many use other models for refining
you know exactly what's going to happen, right? 
you get a 2.1 image 😄
lol
2.1 is best model
for keeping your friends disinterested in AI.
I wish I could share this image, I mean it is not NSFW but mods be like... cos it's a skin tone body suit thing, but tthis fairy looks like she has down's syndrome :\
that's what happens when you use dreamshaperxl as refiner
looks fine to me 
for dalle-mini maybe
that would be pretty insane for dall-e mini hehe
well the face, the rest of the detail, yes
true
well maybe not, dall-e mini looks like it has down's syndrome and some crayons
or at least did, when my friend sent me some images a few years back
are they even still around on that project?
there it is on hugging face lol
looks about the same I guess
ok no, DALL-e mini is sick powerful
I'm done with sdxl, this is the future
What a sweetie
here's a image started with refiner at 768 and then 2x upscale at 0.4 denoise
refiner does mid-frequency details at the cost of high-frequency details. so pebbles in gravel will look better but grains of sand will look worse
so for skin texture and fabrics base XL usually does better I found
while the refiner improves face contours
she ate the wrong brownies
Keter class
SCP-1239123312932 Potato Hat
or maybe that was the random seed, not sure.. .either way
yeah, I agree. same for me in my experiments.
if I run the refiner it's usually not the full 20% as a result
its more like 10%
but yea recently I've just left it @ 0%
same - I used it mostly below 25%
the details are pretty good
I mean the medals went a bit crazy, but overall good fidelity 🙂
looks like mj about 6 months ago
that's better than the BDSM KISS Army harpy last time I tried when i had my models mixed up
finally made my ai waifu, took years... but now...
ok done with dallemini
so, do past runs bleed into repeat fixed runs?
no
I mean yes they change, but I tried to make this with a fixed seed in comfy, over and over, with nothing suggesting this background. it repeated underground / gold and treasures in both L/G, and I tried to negate castle, but this one seed only seems to do this very similar style
just strange it won't listen at all,
sometimes the SD gods just are not on your side
yeah I switched seeds and now it is underground every time without the prompt, I dunno. that's just strange to me,
like it remembered
Is automatic1111 stablediffusion 2?
Why are my pictures coming out weird
Should I add something
make sure sdxl models use the sdxl vae
and likewise for 1.5, make sure they use 1.5 vae.
Ok, let me try, thanks
And for SDXL keep to around the 1024x1024 ratio, or similar
So automatic1111 is 1.5?
yeah that's the standalone sdxl vae
You can manually select it in the A1111 settings
just select the button to see all options, then do a control+f search for vae and select it, check the box or whatever to make A1111 use that instead of baked in vae
it's somewhere in the settings, I don't run A1111 anymore
It’s not on the checkpoint section
in the settings menu, on the left there should be a button that shows all options, then just ctrl+f and search for "vae"
It'll automatically use that vae when you process any image now. Just make sure you have an SDXL model selected "checkpoint"
If you use the SDXL vae with a 1.5 checkpoint, or vice versa, you will get wonky images
Should I use SDXL with a 2.0 checkpoint?
like for like. SDXL checkpoint with SDXL vae
What type of images do you prefer?
Idk like ones of skylines
Cyberpunk
Cities
Bright colors
Neon
Idk stuff like that
The base model of course you can always go with, it's pretty good.
For more bright colors, vividness and such, I like to use ZavyChromaXL, it tends on the ends of realism but pushes more artistic colors
I don’t understand, the what is a checkpoint really?
Is vae like an extension of a checkpoint?
checkpoint=model
vae is ultimately what's converting the latent space into pixel space
its like the decoder
so in more basic terms, think what the computer sees vs what we see
So it’s a prerequisite?
model is required, vae is required
optional
What does it do
can be used to help direct an image towards something. Some controlnet models can force a pose, while there are others that will extract the hard edges of an input image and push that to your generated image to follow
some models have the vae baked into the checkpoint, so a separate file isn't always needed
Where’s the download button for hugging face
On the files and versions tab
I cloned it but it didn’t work so I just wanna download it from the website
little down arrow
screenshot
thats Nightvision
Night vision is the model?
Nice
I don’t have any memory for that
Minimum size before you start getting too wonky results would probably be 768x768 equivalents
anything below and you'll likely get off results
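If you want sizes for other aspect ratios, one way to get them is to keep the pixel count near 1024x1024 and snap each side to a multiple of 64. This is a rough sketch under those two assumptions, not an official table, and `sdxl_size` is a made-up helper name.

```python
import math

TARGET_PIXELS = 1024 * 1024  # assumed pixel budget SDXL is happiest with
STEP = 64                    # assumed snapping step for width and height


def sdxl_size(aspect_w, aspect_h, target=TARGET_PIXELS, step=STEP):
    """Return (width, height) near the target area for an aspect ratio."""
    ratio = aspect_w / aspect_h
    width = math.sqrt(target * ratio)
    height = math.sqrt(target / ratio)

    def snap(value):
        # Round to the nearest multiple of `step`, never below one step.
        return max(step, round(value / step) * step)

    return snap(width), snap(height)


for aspect in [(1, 1), (4, 3), (16, 9)]:
    print(aspect, sdxl_size(*aspect))
```

With these assumptions 4:3 comes out as 1152x896 and 16:9 as 1344x768. Other ratios can land a step away from the lists people pass around, since those lists are not pure area math.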
Yea
If you're using Auto1111, make sure you launch with --xformers and --medvram, and also install the Tiled VAE extension. With all that, my laptop with 8GB of VRAM can run all of the SDXL favored image sizes, and can even run batches of 4 x 1024. Not the fastest thing in the world, but it works.
Feels like I created a new style lol
shelobmobile
are those spider webs?
They are more like a web of neurons 😛
well, I wouldn't drive it

Return of Conan Biker
Hello. Does somebody know if there is a model for NormalMap with SDXL ?
hey guys, i am working on using stable diffusion sdxl model for inference purpose for generating images using some prompts, What is the hardware requirement for fine tuning sdxl? Thanks
If you can, I would scoop up a nicely used 3090. You can probably do with less, but you won't regret having 24GB. That's the cheap route, otherwise get a 4090 or A6000.
Where is the room to generate AI in SDXL named Distillery?
I have only seen the depth models, which work very well. I have not seen any normalmaps
Cursed Conan
Threw together a quick few for Friday the 13th, Here's some "Jason" Voorhees art. Happy Friday The 13th!
@wet nacelle have any prompt tips for the kind of photography you've been doing?
First use this model ||https://civitai.com/models/133808/pyros-nsfw-sdxl||
Then use this negative prompt. (cartoon), 3d, render, low res, low resolution, ((text)), ((watermark)), ((logo)), tongue out, old, ugly, masculine, over exposed, vibrant, colorful, .com, ((tanlines)), (( ososedki.com))
You also need to have your sampler set to dpmpp_sde and your scheduler set to normal in the KSampler node.
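On the parentheses in that negative prompt: a small sketch of how the emphasis is usually read, assuming the A1111-style convention where each pair of parentheses multiplies attention by about 1.1 (ComfyUI treats plain parentheses similarly as far as I know). `emphasis_weight` is a hypothetical helper, not either UI's real parser.

```python
import re

EMPHASIS_STEP = 1.1  # assumed multiplier per pair of parentheses


def emphasis_weight(term):
    """Return (bare_text, weight) for a term such as '((text))'."""
    match = re.fullmatch(r"(\(*)(.*?)(\)*)", term.strip())
    opening, text, closing = match.groups()
    # Only count balanced pairs of parentheses.
    depth = min(len(opening), len(closing))
    return text.strip(), EMPHASIS_STEP ** depth


negative = "(cartoon), 3d, ((text)), ((watermark))"
for term in negative.split(","):
    print(emphasis_weight(term))
```

So `(cartoon)` is pushed a little and `((text))` a little more (roughly 1.21x), while plain terms like `3d` stay at 1.0.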
⚠️⚠️⚠️ PLEASE TRY BOTH VERSIONS AND TELL ME WHICH ONE YOU PREFER ⚠️⚠️⚠️ This will help me decide how I train the next version! Thanks! Like what you see?...
what do you guys use to get directional facing? I can't seem to get that down at all, they just look forward. may be an issue of the model, but I can't imagine it is just that...
a closup photograph of a {profile left|profile right|forward|slightly-left|slightly-right|upward|
slightly-upward|downward|slightly-downward|upward-left|upward-right|downward-left|downward-right} facing woman
definitely does not seem to have much effect
side angle
hrmm
photorgraph of a woman whose face is angled {side|forward|slightly left|slightly right|up|
slightly up|down |slightly down|up left|up right|down left|down right} and has a {happy|sad|neutral|bored|excited|upset|unhappy|coy|intense|neutral} expression on a black background
that works well
the "direction facing" wording was not catching the way this "face is angled" modification does
a photograph of a woman whose face is angled down and a unhappy expression on a black background
but it still is angled, and not directly forward so may need to adjust down and up
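For anyone wondering what the `{a|b|c}` bits in those prompts do: wildcard / dynamic prompt extensions pick one option per group before the prompt is sent to the sampler. Here is a minimal sketch of that idea. It is not the extension's actual code, and `expand_variants` is a made-up name.

```python
import random
import re

# Matches one {option|option|...} group with no nested braces.
_GROUP = re.compile(r"\{([^{}]*)\}")


def expand_variants(template, rng=None):
    """Replace every {a|b|c} group with one randomly chosen option."""
    rng = rng or random.Random()

    def pick(match):
        options = [opt.strip() for opt in match.group(1).split("|")]
        return rng.choice(options)

    return _GROUP.sub(pick, template)


template = ("photograph of a woman whose face is angled "
            "{side|forward|slightly left|slightly right} and has a "
            "{happy|sad|neutral} expression on a black background")
print(expand_variants(template, random.Random(13)))
```

Passing a seeded `random.Random` makes the picks repeatable, which helps when comparing which direction words the model actually follows.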
I don't even know why I use prompts
I could just do a head movement openpose model in blender and tween it for looking in all the directions and animate the hell out of that
this is magnificent,
-Blender version 3.5 or higher is required.- Download: blender.org. Guide: https://youtu.be/f1Oc5JaeZiw Character bones that look like Openpose for blender Ver94 Depth+Canny+Landmark+MediaPipeFace+finger + customizable body mesh. This time, customizable body meshes for Depth/Normal/Seg/Lineart have been added. It is easier to create extreme body sha...

and this will be much easier for controlnet openpose animation in comfyui, does anyone know if there is an image loader that will increment, instead of hand loading like the load image node, or will i need to make such
i dunno much about automation and animation in comfy
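If nobody has a node for it, the core of an incrementing loader is small. Below is a rough sketch of the logic only, in plain Python with no ComfyUI node wrapper. The function names and the idea of passing in a run counter are assumptions for illustration.

```python
import re
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def natural_key(name):
    """Sort key so that pose_2.png comes before pose_10.png."""
    parts = re.split(r"(\d+)", name)
    return [int(p) if p.isdigit() else p.lower() for p in parts]


def list_frames(folder):
    """Return the image files in a folder in natural order."""
    paths = [p for p in Path(folder).iterdir()
             if p.suffix.lower() in IMAGE_SUFFIXES]
    return sorted(paths, key=lambda p: natural_key(p.name))


def frame_for_run(frames, run_index):
    """Pick the frame for the Nth queued run, wrapping at the end."""
    return frames[run_index % len(frames)]
```

A custom node would call `list_frames` on the pose folder and feed `frame_for_run` a counter that goes up by one per queued prompt.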
I have not laughed so hard, I mean I do not know why it is so... just.. other things have been funny, but something about this truly hit me somewhere..
has to be his pose... it's just.. campy af but... so right
lol thanks, inspiration came from this ad
https://youtu.be/BqpJvey-7-s?si=7-ZXW8rtYmlAqPEs
Dodge Challenger - George Washington "Freedom" American Revolutionary War Ad. Music by Jay Ungar, discovered by meus95.

Here's a couple open pose resources I've used before
https://openposes.com/
https://app.posemy.art/
I have officially cleared my pc
300gigs of storage
Now I can get like 273834 models
almost
image ratio or vae generally
GPU won't have any difference on the quality of the final image
no
upscaling from the initial image is how to get a larger scale image
Not larger scale, image quality
no, image quality is not a toggleable item.
Final image output depends on a couple factors. Your prompt, the seed, and the image ratio.
If you attempt to use a ratio outside the trained parameters, likely gonna get bad results.
The prompt is pretty forgiving, but can of course make a difference.
And the seed is a luck of the draw. Some seeds just work better than others, so hitting "generate" again is all you can do for that.
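On the seed being luck of the draw: the seed only fixes the random starting noise that the sampler then denoises. Here is a toy sketch using Python's `random` as a stand-in for the real latent noise generator (the UIs use torch for this). `starting_noise` is made up for illustration.

```python
import random


def starting_noise(seed, count=4):
    """Return a few Gaussian samples standing in for the initial latent."""
    rng = random.Random(seed)
    return [round(rng.gauss(0.0, 1.0), 4) for _ in range(count)]


same_a = starting_noise(1234)
same_b = starting_noise(1234)
other = starting_noise(1235)

print(same_a == same_b)  # same seed, same starting noise
print(same_a == other)   # a new seed gives a different starting point
```

Same seed plus same settings means the same starting point, so rerolling the seed is the only way to get a different draw.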
yes, like this is a canny representation of the input image.
The final output would resemble something akin to that outline
So that’s what control net does
You give it an outline
you give it an image
And it helps use that and the prompt
it then transposes it to whatever model you select.
canny is an outline basically
depth creates a depth map
openpose extracts the pose of the character
etc
Oh so it takes something from a photo and implements it
yes
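To make "canny is an outline" a bit more concrete, here is a toy edge map in plain Python. It is NOT real Canny (that also does smoothing, non-maximum suppression and hysteresis). It only thresholds brightness differences between neighbouring pixels, and `edge_map` is a made-up helper.

```python
def edge_map(gray, threshold=50):
    """Mark pixels whose right/below neighbours differ a lot in brightness.

    `gray` is a list of rows of 0-255 values; the result uses 255 for an
    edge pixel and 0 elsewhere.
    """
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            right = gray[y][min(x + 1, width - 1)]
            below = gray[min(y + 1, height - 1)][x]
            gx = right - gray[y][x]
            gy = below - gray[y][x]
            if abs(gx) + abs(gy) >= threshold:
                edges[y][x] = 255
    return edges


# Dark left half, bright right half: the edge shows up at the boundary.
image = [[0, 0, 255, 255] for _ in range(3)]
for row in edge_map(image):
    print(row)
```

The ControlNet canny model gets a map like this (from a proper preprocessor) and pushes the generated image to follow those lines.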
2060 super
I have 1060 super
6gb should be enough to do full size images at least in ComfyUI, not certain about A1111.
ComfyUI is more resource friendly
but ComfyUI is not as "user friendly" per se
nope
How is it so photo realistic
What does inpainted mean
mask a section, generate an image to go into that masked section
in A1111 you see that in the img2img tab
Ohhh
Happy Friday the 13th...
What model did you use
I do the same here. I think I get within a few pixels of your precalculated values. Not figured out your math though so can't understand why we differ. 😄
@vapid roost can I dm?
was wondering why this image was taking a while
102 faces being corrected by the facefix node 😅
heh
idk if you wanted it to do that or not but there is a way to filter the max number of faces that facefix will inpaint
Nah, I'd rather wait the time than have some wonky af looking faces next to fixed ones lol
true, although with that method its giving everyone the same face
if you dont have wildcards
Buncha angry dwarfs, close enough lol
Thanks!
Think I saw them post it in this channel after they made it, but can't find in search (at least not while mobile)
anyone running a comfy custom node wiki yet?
a wiki?.. not that I am aware of
Where can I find this clip vision ip adapter?
you should be able to find them on their git
Not there
found it though on hugging face, nope i didn't lol
will search for it, thanks
ClipVision (For IPAdapter AND Revision):
Clip Vit_G:
install below in ComfyUI_windows_portable\ComfyUI\models\clip_vision
https://huggingface.co/stabilityai/control-lora/blob/main/revision/clip_vision_g.safetensors
install below in ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus\models
https://huggingface.co/h94/IP-Adapter/blob/main/sdxl_models/ip-adapter_sdxl.bin
Clip Vit_H:
an available lighter-weight version of ONLY IPAdapter (will not work with Revision) below:
install below in ComfyUI_windows_portable\ComfyUI\models\clip_vision
https://huggingface.co/h94/IP-Adapter/blob/main/sdxl_models/image_encoder/model.safetensors
install below in ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus\models
https://huggingface.co/h94/IP-Adapter/blob/main/models/ip-adapter_sdxl_vit-h.bin
Clip Vit_H PLUS:
an available heavy-weight version of ONLY IPAdapter (will not work with Revision) below:
Use the same clip_vision as Vit_H above
install below in ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus\models
https://huggingface.co/h94/IP-Adapter/blob/main/sdxl_models/ip-adapter-plus_sdxl_vit-h.bin
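If you'd rather script the placement than click through, here is a small sketch for the Vit_G pair from the list above. It only prints a plan and does not download anything. The `/blob/` to `/resolve/` swap is how Hugging Face direct file links normally look, and the helper names are made up.

```python
from pathlib import PureWindowsPath

# Folders taken from the install notes above.
COMFY = PureWindowsPath(r"ComfyUI_windows_portable\ComfyUI")
CLIP_VISION_DIR = COMFY / "models" / "clip_vision"
IPADAPTER_DIR = COMFY / "custom_nodes" / "ComfyUI_IPAdapter_plus" / "models"

# (page URL, destination folder) for the Vit_G setup.
VIT_G_FILES = [
    ("https://huggingface.co/stabilityai/control-lora/blob/main/"
     "revision/clip_vision_g.safetensors", CLIP_VISION_DIR),
    ("https://huggingface.co/h94/IP-Adapter/blob/main/"
     "sdxl_models/ip-adapter_sdxl.bin", IPADAPTER_DIR),
]


def direct_url(page_url):
    """Turn a Hugging Face file page link into a direct download link."""
    return page_url.replace("/blob/", "/resolve/", 1)


def download_plan(files):
    """Return (direct_url, destination_path) pairs without downloading."""
    plan = []
    for page_url, folder in files:
        filename = page_url.rsplit("/", 1)[-1]
        plan.append((direct_url(page_url), str(folder / filename)))
    return plan


for url, dest in download_plan(VIT_G_FILES):
    print(url, "->", dest)
```

The Vit_H and PLUS entries would follow the same pattern with their own URLs and the same two folders.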
T2I is a better way to go over controlnet for a doodler.
App results are a lot cleaner now
I wonder if Comfy is going to add glora to his comfyui?
yummy
nice! unusual toned down scifi stuff 🙂 love it
have you folks seen this?
https://github.com/YingqingHe/ScaleCrafter
4k x 4k gens with sdxl in 1 pass possibly, wonder how hard it would be to implement in a111/comfy
is it out?
Not yet. Need to have a break this weekend 😅
Chic Sofa in a form reminiscent of a pumpkin, stylish loft apartment works very well, making me rethink my prompting style
uuuh, this is so cool!
this mf is stuck generating things that look like 2022 stable diffusion
but talking like he's generating some god-tier stuff lol
I think what he was saying is that he inpainted for four plus hours
and the result is bad
to be fair: it's much easier generating good images "by chance" than generating an image where you already have specific constraints and requirements
hm, tried it. It works okay for photography. For anime and paintings it gave me totally unusable results, though. I would say that tiled upscaling still works better :/
but the idea is cool
I guess it could work if you fine-tune SDXL on super highres images
oh, guess it's not as good as claimed
at least not in my tests. It still has artefacts, in particular for 4k x 4k resolution, while texture is very low detail.
What is even the advantage of sdxl? I don't really get it?
It says you can make pics over 512x512, but i can already do that with normal models?
2 controlNets at the same time is not working on auto1111 on SDXL? or is it just me?
hey hi, where can i ask for a model sugestion?
???
SDXL is MUCH better than SD 1.5
In what way?
the naming might be confusing. SDXL is a new version of SD, basically SD 3.0
I have some 1.5 models that give me much better results than sdxl
Yeah it doesn't work for me either
yeah, you can always pick a model that is better at one particular thing
such models exist for sdxl, too. Wanna have better photorealism then use custom models for photorealism
but these custom models are always overfitted as hell and don't work as well in general
SDXL is the better base model which can do more or less every style. From that you can improve with custom models
but in contrast to SD 1.5 which is shitty without custom models, you don't necessarily need custom models for sdxl
nice name 🙃
you'd probably get banned in several discords.
Why would i
you know why 😏
Not really
that's ok. me neither 
The depth of detail in sdxl is much higher, prompt with it for a while you'll see.
Can you recommend a good xl model?
The 1.5 model i use right now can do all famous people, im using it on the discord bot on our server right now
The xl model i use not so much
Pablo Picasso ∞ SD XL 1.0: https://civitai.com/models/162136/pablo-picasso-infinity-sd-xl-10
Introducing ∞, an innovative LoRa (Text to Image) model that embarks on a journey through the captivating world of Pablo Picasso's artistry.
Pablo Picasso ∞ utilizes advanced deep learning techniques to immerse your images in the essence of Picasso's artistry.
What sets this model apart is its unique ability to interpret specific prompts.
By including the year of a Picasso artwork and the name of the piece in your prompt, you can guide the model to transform your images in the style of that particular period.
Each transformed image carries the year of the artwork and the name of the piece in its captions, offering a unique connection to Picasso's timeline.
Use trigger word: p1c4ss0
Trained for 28000 steps on a large, highly detailed, hand-captioned dataset.
- Special thanks to the people who help me and who I care a lot about:
@MarkOREZ
@upbeat summit
@osiworx
@mix
@Thibaud
@Kamikaze(Elon Musk)
Thanks to the help of folks here and elsewhere I've published my first LoRA ❤️ It's for creating stylized storybook sketches. I hope it comes in handy for someone: https://civitai.com/models/162118/storyboard-sketch
What model?
I'm using the model I made by calculative merging with IPA for image input
You using a lora for trump?
nope
for the first one it's just pure txt2img optimized with AIT, the second uses the first as input, then it remixes it with a prompt using IPA
on paper it can seem extremely complicated, but it's compressed into a simple workflow(s)
allowing SDXL to blend and/or remix images was a difficult task due to CLIP being a bottleneck for SDXL, but it was possible
pure txt2img is turned into a science by SDXL, but the rest of the capabilities are limited due to CLIP imo
if SD3.0 would be a thing, I'd say the goal should be upgrading to a better text/image encoder like BLiP; while still going crazy with the UNET like done with SDXL
So great. Top right has amazing attitude
i enjoy this Lora so much!
blanka?
A bit unsettling, but here is an image of the tooth fairy. Not planning to release a children's book any time soon.
Nano-Virus
Loving Vincent (van gogh) finally making a comeback in Lora form for XL. coming soon
Why do previous generations bleed into completely new generations with new prompts lol. which one of you said this is not true lol
and i think it does so, only when attached to a seed maybe?
cos changing the seed tends to clear that weirdness, but switching prompts with same seed, does some goofery
no way robocop is ''judy alvarez" when NOTHING in the prompt suggests this, and my t2i adapters all have robocop images for depth and color transfer lol
I had made this one, and then it shat this out after,
i refuse to believe that it is not bleeding previous prompts.
or something in the workflow is caching oddly
yep changed the seed and suddenly more robocoppy
I noticed the same thing when using MidJourney
hmm weird, i wouldn't think MJ would have single user caching of any sort, but yeah it was weird with the same random seed. this was automatic1111, not comfy, so not sure how that translates
this is the not bled version that it made
I said it, and the only way you could get similarity of images is similarity of prompts. Some words are harder weighted than others
Perhaps there's a lora turned on.
Or a custom prompt node not properly updating at new image generation.
Hehehe
How common is it for the image to have the prompt text legible in the image?
depends on model,
your best bet is using inpainted external masks you make of the words you want
I was just curious because it happened and I'd never seen it do that
oh yeah that'll happen, natural is spelled wrong lol
I did a bunch of wanted posters, they were like WWANИИED
prompt was "naturepunk"
and it was kind of a weird mix of greek and cyrillic and latin alpha
which is interesting, it decided where to break the word
well there are probably images trained, that have Punk and Nature separate in the image
but not "naturepunk", so not having it separate is even less likely
I've only been messing w SD for a couple of days but did a month of MJ and never saw anything like this happen so it was a shock
sdxl can be pretty good with text if called for, especially if you "quote" your text.
It'll still of course mess up, but it's come a long way
Thanks for this info. Just to clarify, I didn't use quotes in the prompt in this case
How can I get it to fill in more detail? Currently using sdxl refiner and 4x-Ultrasharp
Also, is 512 too small of an empty latent to use for sdxl?
yuup
Do I want 1024?
yes
Thank you
Welcome
these are the recommended
LOL, I was just about to screen cap that for him too
That is likely my issue with details
100