#✨|sdxl
on a1111 should be same place
Except a1111 can't run SDXL with refiner
I found it - checked is sd_model_interface
(EDIT: This was dumb, but we're dumb before we're smart, and my direction was realigned by pseudo. I leave this here for anybody else to benefit similarly.)
what if you reinitialized the weights and trained a model on ONLY hands with overly verbose and descriptive captions and then used, like, an adetailer or some other form of segmentation->inpainting pipeline? Although, that may defeat the purpose of "foundation model" but there is data within the foundation that conflicts with how we want to be able to control hands. So, just, take it out, no? (Obviously easier said than done but it needs to be said before it can be done)
Or sd_model_checkpoint I mean
Add refiner then apply
man i need that scene from billy madison, "Not a single word of your answer, was correct"
SD_refiner or whatever it's called
I love being wrong, thank you!
i am going to watch it cause I also love that scene.
SD_Model_Refiner?
Exactly
Add? Or change?
OK, done, I am restarting 😄
i just watched it lmao
we're talking about how to add things to quicksettings.
his explanation of the puppy who lost his way as "the industrial revolution" is also hilarious
been a good minute since I have. isn't the answer to the question, "the printing press"?
the puppy was a dog but the industry was a revolution.....ok....
honestly it's one of the best rants in cinematic history
in 10 words or less, why was i wrong?
SDXL 0.9 Prompt - nocturnal animals with eyes shining in the night, jungle, detailed bushes, anatomical earth structure, dine, soft lighting
So the full 1.0 version of SDXL comes out tomorrow?
The refiner works?
Are you in comfy or a1111?
not asking to be taught. asking what direction to independently search for my answer with something more precise than "lolreddituser23 says so" or other forms of influencer rhetoric. I am not referring to you as lolreddituser23. You are clearly well educated and well informed. I want to do the hard work to get there. It's just hard when the rhetoric is so hype-oriented
A1111
Damn, did you select it in the sd_model_refiner drop-down menu
I thought you said Vlad's
I selected it in the Stable Diffusion checkpoint menu
Yes Vlad ... but these new menus I've yet to experience
Where is the sd_refiner_drop_down_menu?
In the SD checkpoint menu, select base SDXL, in refiner menu select the refiner.
Did you apply the settings?
it's not necessarily incorrect, it's just not simple like that either
I have base selected - but no further option
there, 10 words
WHERE IS THE REFINER MENU? (Sorry for shouting!!!) 🙂
no one knows what toolkit you're using unless you tell us 
there's like 80 billion ways to run SDXL
i don't read either so if you already told us, well
Uh idk then. @uneven dove can you help him? I gotta go
i gotta go do work.
Thanks for all your help 🙂
Venice Carnival Grand Guignol
Thank you. As per my personal protocol, I'll keep rereading the papers about training methods and multimodal image/text understanding and evaluating/testing various resources available until my ability to rearticulate the complexities is second nature and then, with that validity on my back, try to acquire the specialized information behind the "it's not that simple" barrier through social communication with living beings. Since that is my labour to perform before I can perform the empathy necessary for me to ask for help while considering the other real human being across from me, I shall continue to look up all I can and read documentation before asking further questions on the topic.
what you're thinking of might be some kind of LoRA/LyCON or something
those are randomly initialised networks that are trained and merged into the base unet at runtime
i'm built differently
release still planned for tomorrow?
Are you guys using the Sytan setup, and also using an easy image to image option?
how do i get there?
The best words I have, when observing my thought, are that, in the [THIS IS A...] box, we have "specialized inpainting model" and, in [THIS IS NOT A...] box was "LoRA/LyCON"
base model for just hands then extracting a lora might be an overlap between what i'm thinking and, according to my best understanding, your communications
an inpainting model would need more knowledge than just hands in order to be able to compose hands into the scene though
I haven't heard anything to agree or disagree with that 18th target. But how cool is it to get a great new base, and then another one a week later
ahh....i see what I missed now
which would mean this. i see i see
thank you
to my understanding: going back down a level loses the contextual understanding, making my idea revolutionary in, like, 2015 but it's 2023 now soooooooo we can do better
in how many hours does SDXL release? please
Target date has been given as tomorrow
July 18th UTC
SDXL 0.9 Vlad AUTO1111
Did you compare with comfyui?
what's the preference between A1111 vs Vladmandic? mostly concerned with the great API features and use them daily via python scripts, does Vladmandic still utilize the same API structure?
both codebases have a bunch of technical debt but vlad seems to be making some good decisions like switching to diffusers
if you want good api features there are better solutions
what's good about diffusers? why haven't they been used yet?
i haven't made any workflows myself yet, just used others'......is there a way to drag an image from my batch and easily use it as inspiration on the next batch? using SDXL in Sytan's setup
FastAPI seems pretty solid, was easy to understand for an idiot like myself lol
i woke up today thinking it was the 18th. i am too excited for this.
Same. Lol
Have you heard any updates? Are they still planning for tomorrow??
so will a 3060 12gb vram 16gb dram be enough?
on comfyui yes
for 1.0?
a lot of people use diffusers it's a library not a UI
im using a quadro RTX 4000 8 gig and it works great in comfy
Yes, for the release of 1.0
RTX 4000 and 3060 are pretty close in performance i believe
has 1.0 been leaked?
oh
1.0 has the same performance requirements as 0.9
what are estimated generation times for 3060
and does all generations need to be equal to 1024x1024
yeah it's a core library for inference if i understand it right. i'm curious why there's buzz around them now with sdxl coming out. they've been around for a while but it has always seemed they're a hassle. like models need converting
nope, im changing resolutions to all kinds of things, above and below
will 1.0 work in a1111
i loaded up d8ahazard's new ui that uses diffusers and had to do some recompiling process to make any model i wanted into diffusers
even if it does, just switch to comfy for now
Depends on what samplers you use, but using ddim, I'm getting 85 seconds for a batch of four 1024x1024 images on my 3060
So far a1111 only has it as an extension. ComfyUI or Vlad (SDNext) can run it natively
i got it working but that whole process seemed like a whole unneeded hassle for no benefit
I don't think a1111 implemented the ensemble base + refiner pipeline yet
yep, that sounds about right.... similar to me
will probably switch to comfy when an API comes out for it, would be cool to just call in json files from python
so for now it's comfy, vlad, maybe others
comfyui looks nicer but im so used to a1111 lmao
comfyUI works with amd too
there is an api
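For anyone wanting to call workflow JSON files from Python: a minimal sketch, assuming a local ComfyUI server on its default address `127.0.0.1:8188` and a workflow that was exported in API format. It queues the workflow by POSTing it to the `/prompt` endpoint. The file name `workflow_api.json` and both helper names are made up for illustration.

```python
import json
import urllib.request


def build_prompt_request(workflow, host="127.0.0.1:8188"):
    """Build (but do not send) the POST request that queues a workflow."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        f"http://{host}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
    )


def queue_workflow(path, host="127.0.0.1:8188"):
    """Load an API-format workflow JSON from disk and queue it on the server."""
    with open(path, "r", encoding="utf-8") as f:
        workflow = json.load(f)
    with urllib.request.urlopen(build_prompt_request(workflow, host)) as resp:
        return json.loads(resp.read())


# Usage (needs a running server): queue_workflow("workflow_api.json")
```

Splitting the request builder from the network call keeps the payload easy to inspect before anything is sent.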
yeah i looked thru the links you sent last week, not wrapping my head around it as easily as i did with a1111
Vlad is a branch of a1111 and sdxl runs on it
i saw automatic1111 ui working with 0.9 this morning. my neighbor has 0.9 and we got it working in auto
im still blown away that i can make batches of 8 images, and large ones without any error
@visual glade There's a switch node for comfy ui? A reroute node that receive two sources and you choose which one by clicking like a light switch?
not in the base but there's probably one somewhere
SD-2.1 partial diffusion with SDXL refiner ❤️
i havent even tried using 2.1 models with refiner. nice
https://github.com/AUTOMATIC1111/stable-diffusion-webui/tree/dev this branch allows sdxl
it's working pretty well @broken hare
I don't think the SDXL pipeline is properly implemented, I think he just implemented the models
I see folks use links to images in their prompts; when I look at them, they don't look like the result. Is this a function in SDXL that people are just using incorrectly, or is it just a prompt trend that does nothing?
it's not some conspiracy to track and control ai images. the invisible watermark has been around for a while now. we've been labeling 1.x generations with watermarks.
The idea is that future scraping of the web will encounter junk from old models and won't want to train with those images potentially.
those two jesters are from my ptx0/pseudo-real model, and these two are from ptx0/pseudo-flex-base, through SDXL refiner (50% refiner strength)
the watermark identifies them as a legacy generation
i watched a video comparing results to comfy with same prompt/steps and everything that he could make the same, and they just look blotchy and weird out of A1111
appreciate the explanation Flowwolf
it seems pretty sus up front haha
lol in this world we live in right now, yeah hah
if there was some cabal that wanted to track things they wouldn't use a library called invisible watermark.dll
do you think we will get some other software to accompany release of 1.0?
trying to make it mechatronic
Everyone is so fixated on AI fakes now and wants them to have hidden markings to spot faked images, but the best faked images are still made in Photoshop by skilled artists.
i want tickets to this show
i'm getting way better widescreen results using my flex-base model into the refiner, than 0.9 base. that thing performs poorly outside of its base square res for me
comfy UI and Auto1111 can't make exactly the same image because they generate noise in different ways
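A toy illustration of that point, not the actual code of either UI: the same seed fed through two different noise routines gives different starting noise, so the final images cannot match. Here Python's `random.gauss` and `random.normalvariate` stand in for the two tools' generators; that stand-in is purely an assumption for the demo.

```python
import random

SEED = 3696284803  # any integer seed works for the demo


def noise_a(seed, n=4):
    """Gaussian noise drawn with one algorithm (random.gauss)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def noise_b(seed, n=4):
    """Gaussian noise drawn with a different algorithm (random.normalvariate)."""
    rng = random.Random(seed)
    return [rng.normalvariate(0.0, 1.0) for _ in range(n)]


# Same seed, same routine: reproducible.
print(noise_a(SEED) == noise_a(SEED))
# Same seed, different routine: different noise.
print(noise_a(SEED) == noise_b(SEED))
```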
love that final tracking refresh mania fun
really? Im actually liking my 1280 width shots so far, but i think i will try it out with some 2.1 based models for fun. Artius for example
artius has really effective bucketing in the models it merged, it does pretty well too but not in portrait res
is that a new computer on the way?
people get really hung up on platform parity in this realm. but why?
I get it for like minecraft. you want a seed number to always reproduce the same world for player consistency. But why are we wanting to recreate the exact same image again? can't we just save the image?
just feels like platform parity doesn't matter very much outside a few fringe scenarios
my 3090 for my new PC
My new PC got here 3 days ago heh
its the final part
it is a hassle. But the design of diffusers is the better one. In auto111 people tend to do stuff like model pruning and so on to keep file size small. In contrast, diffusers always just downloads the part of the model you need. That's why the format is a bit different.
Nevertheless, I agree with comfy that diffusers is a popular and widely used library. Just not by people who want a UI and, therefore, use auto111
if the end-user case is a convoluted process requiring conversion to diffusers before a model can even be loaded, i don't think it's designed better imo
i've never heard any reasons why diffusers are better. just told they are. i'd like to know. i've looked it up but all i find are lots of toxic comment threads of people yelling at each other
diffusers can load single files now
i only care about use cases. i'm not a ui developer.
they are better because their code is decent
and their performance is better than a1111
that's a strange argument. There are just two different formats out there. If you have your model in auto111 format you have to convert it to use it in diffusers
I took some of their performance tricks for comfyui which is one of the reasons people say it's faster
other way around: if your model is in diffuser format you would have to convert it to auto111 to use it there
https://huggingface.co/docs/diffusers/conceptual/philosophy i've found this but it leaves me head scratching. saying they're not making it easy, they're not making it performant. i don't care about it lacking abstraction. that's a developer preference. i see them hyped all the time, but i just don't get why end users should care.
safetensors wasn't invented by auto1111
that has nothing to do with safetensors
comfyui natively supports both formats by the way
you can load safetensors in diffusers, too. The question of safetensors vs. checkpoints is a completely different thing
diffusers philosophy page says they're going for usability and not performance
hype hype hype

it's about the naming of the weight matrices IN the safetensors or checkpoint file
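A rough sketch of what "naming of the weight matrices" means in practice: converting between layouts is largely a matter of renaming keys in the state dict. The three key pairs below are examples recalled from the UNet conversion logic and should be treated as assumptions; a real conversion covers many more keys and handles more than renaming.

```python
# Assumed example pairs: original-checkpoint UNet key -> diffusers UNet key.
LDM_TO_DIFFUSERS = {
    "input_blocks.0.0.weight": "conv_in.weight",
    "time_embed.0.weight": "time_embedding.linear_1.weight",
    "out.2.weight": "conv_out.weight",
}


def rename_keys(state_dict, mapping):
    """Return a new dict with known keys renamed; tensor values are untouched."""
    return {mapping.get(key, key): value for key, value in state_dict.items()}


# Placeholder values stand in for real tensors.
converted = rename_keys({"input_blocks.0.0.weight": [0.1, 0.2]}, LDM_TO_DIFFUSERS)
print(converted)
```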
INDEED
they still are more performant than a1111
WOOOOOOOOO
they are more academic. They want to implement stuff as it is described in the paper
when i used them on d8ahazard's ui, it was much slower. probably just their code then
I KNOW HOW THAT FEELS LOL
and don't get me wrong. Not everything in diffusers is great
and many things in auto111 are better BECAUSE they did not follow the research papers but experimented with what was better for artists
i just don't understand diffusers at all. they're often heralded as something so great and bringing accuracy and benefits, but as an end user, i see nothing. the only thing diffusers has ever presented me with is a giant hassle
it's just: auto111 is not a proper API. It's a gradio interface around the stable diffusion code which is growing over time and getting less and less maintainable
it's about development mainly
when I started with stable diffusion I looked into the auto111 code, ran away, looked into the diffusers code and could finally start programming with it
it's just so much cleaner and easier to understand
so it's like, "we're using microsoft visual studio run time v1.284 instead of v1.187" kind of hype.
Curious if 1.0 will be better than 0.9!
something like a1111 for example has to implement pretty much everything from scratch
totally meaningless to end user
while people using diffusers can just update the library and have everything that's new
use the bots and see for yourself!
you should never mix a backend with a frontend. But unfortunately, that is what is happening more or less with auto111 in my opinion
Cannot see that diffusers/safetensors are a big deal - both produce stunning results!!!
there should be a good and maintainable API which can be extended by developers and which should be independent from the UI the enduser is using
auto won't last. it has an expiry date. feels like duct tape engineering to get'r done. i appreciate that kind of workmanship, but at some point you've gotta come back and get it done right
diffusers is not for end users. It's for tools like ComfyUI that build upon it
My Vlad AUTO1111 startup line = webui --medvram --backend diffusers
Vladmandic enthusiasts have been preaching that diffusers are the jesus of diffusion tech it feels like. i've been trying to figure out wtf is the deal with them
Vlad AUTO1111 SDXL 0.9
what is the problem? That people hype something 🤷 That's the internet
feels more like the "Blast Processing" of diffusion tech to me. just some marketing buzz word that end users don't see at all
but if that is your question: no, Jesus and diffusers are two different things
hey guys i have a macbook and really want to get into stable diffusion, would you recommend running it through google colab?
for developers it is a great thing (and definitely not a marketing buzz word) because we have one stable code base we can all work upon
hmmm. no. that was not my question. if you read it more closely, you'll see i represented that particular point as a simile in a hyperbolic fashion. Analogy.
Blast processing was great for developers too, but when it showed up on sonic magazine ads it was just hilarious and meaningless
picking diffusers just shows that he's making the right decisions from a software dev perspective
look inward
you've got me blocked. consider that before pinging me in the future
in the end there are, to the best of my knowledge, two code bases for stable diffusion
As a complete non-techy/mostly artistic user of SDXL - I feel that diffusers and safetensors BOTH are doing a great job!
actually 3 because comfyui only uses the models from ldm and throws out everything else
humble flex
can you please share the seed and the prompt including the -ve prompt
I lack the context. Are we comparing divine creation to diffusers library? 🤣
i told ya why, it's because you seem proud to not know stuff, or seek out wisdom. i think everyone in here understands your 'end-user perspective' and it is not anyone else's job but your own to educate yourself on how things work, or stop questioning others who work on it 
the original code base is not maintained. It's SAI's own thing and they use it for developing and improving their models. But they don't throw out any new features like controlnets, loras, sag and so on
thus, we are left with diffusers as the only really maintained and extended SD codebase. or we do what auto111 is doing and every tool reinvents the wheel and hacks all its features again and again in the same old codebase
Seed = -1 Sampler = Euler A Iterations = 20 CFG = 14 Prompt = lowbrow, 8k, matte painting, in the style of zentangle fauvism fibonacci Lilia Alvarado Salvador Dali Rembrandt Kandinsky Mark Ryden Anthony Ausgang Robert Williams Amy Sol Kenny Scharf Camille Rose Garcia, rauschenberg, diana ejaita, basquiat, remedios varo, rob gonsalves, keith haring, zentangle, blek le rat, ntombephi ntobela, richard burlet
this is getting really immature as people ignore context. i'm not attacking diffusers. just trying to understand their benefits. asking questions isn't a bad thing in my world view.
Using base SDXL 0.9 no negative prompt
i'll drop it now
seed -1 randomizes the seed, so after the generation, you can view the seed
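A small sketch of that convention, assuming -1 means "pick a random seed" and that seeds live in a 32-bit unsigned range (the range is my assumption, not something stated in the chat). The helper name `resolve_seed` is hypothetical.

```python
import random


def resolve_seed(seed):
    """Return the seed to use; -1 is replaced by a freshly drawn random seed."""
    if seed == -1:
        return random.randrange(2**32)  # assumed 32-bit seed range
    return seed


used = resolve_seed(-1)
# Log the resolved value so the image can be reproduced later.
print(f"seed used: {used}")
```

Sharing the resolved number, not -1, is what lets someone else rerun the same generation.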
OK, but you require the exact seed?
hopefully we get both of the SDXL's
there are a few versions, the large ones are "aesthetic" and the stock 1.0
haha, to be honest, I asked the same questions about ComfyUI and this was also in no way meant as offense against this tool
not sure which we will even get in the end, might be just sdxl 1.0
possible vram optimization?
I mean, even if we ignore all it can do for standard inference, finetuning or lora training would be hell without it
Seed 3696284803
do they have any theories on how diffusers can overcome the current pitfalls it has always faced?
don't ask me man. asking questions about them got a mod favorite regular all heated at me. didn't you just see me get ripped a new one because i admitted to not knowing?
drop the topic please. or at least stop pinging me into it
honestly, I don't know who said anything offending to you. Maybe you overreact a little bit?
Anyways 🤷
naw. drop it please.
strange man
hey guys i am working on something, basically they're mockups for clothing brands where you can add your own design on the t-shirt through a photoshop, for the mockups i want to create generate realistic model pictures like these 2 below, would stable diffusion be something i should look in to? 🙂
stable diffusion isn't an instant solution to your drop ship website. not without a lot of hand crafting to get it right. That'll be r&d you need to invest into to get a jump on the coming AI generated drop ship sites. no one has made tools public for that yet.
it's not for a dropship website, just want to provide realistic assets for startup brands with no capital to do photoshoots, i would edit it in photoshop so they could mock up their designs and use it to display on their website etc
i've been able to create these very basic ai generated models, but never really looked into stable diffusion so i figured this might be the place to ask
if its just about making artificial model shoots, I would say that's definitely possible
So I'm trying to achieve an image of a man standing on a cliff/rock outcrop overlooking a cyberpunk city, with a shattered/crumbling looking planet in the sky. Anyone have any recommendations on how to prompt for that? No matter what I do, I can't seem to get it
while you could probably get the relevant parts working (heads & hands, where you just overlay clothes in photoshop)
you wouldn't actually avoid the fact that its commercial usage
Not saying you cant use it, just that you need to properly read the license agreements (from Stability AI), and stay on top of new laws in regards to AI gen image usage
Odds are higher you'll save money by getting a one month subscription to envato elements - download like 1000+ images of models wearing various clothes, to which you then permanently own commercial rights
(also legal aspects change, depending on which country you deal with - which doesn't make using AI gen images any easier)
yes exactly that, they need to be very basic models, like the ones i've sent, no designs, nothing special, because i would be adding the design onto it through photoshop
Just like this i will make it possible to mock up your own designs and change color etc, just wondered if stable diffusion would be a good option to generate the base models
depends.
1.5 -> get an anime checkpoint
sdxl -> use an interrogator running ViT-H on an existing similar anime picture, then use the prompt it gives you and work from there. It won't make much sense why it works - but it works XD
it's that good now?
you can make it very specific and it gets that close?
cause using an interrogator in older models just kind of sucked
I've been trying to get an interrogator to work with SDXL, similar to clip interrogator. Are there any easy methods to get that up and running?
Are you using colab?
yes. it can give you the images you seek.
but seriously, don't ignore legal issues. As an ex web dev, I've seen more than enough customers suffer real bad from images with shady licensing - and the costs for messing up there run between 1k~15k, so please take it seriously
thank you! i will definitely have to do some research about this because i wasn't really aware of it
they still do - but Vit-H works like magic on sdxl
thanks
How do you do that? Is that functionality built into ComfyUI?
i have a macbook so i can't run stable diffusion locally so i googled how to run it through web browser and they told me to use google Colab to run it
Just download a model off civitai and put it in the same folder you put the 1.5 model. Switch to the new model. Type what you want. Enjoy
this is the interrogator that caith mentioned, you feed it an image, and it gives you a prompt that should make something like that image. like anime ladies or whatever
what time will be the SDXL launch tomorrow? I want to get it as soon as it's released lol
also check out this site
https://elements.envato.com/
most new artists/web designers don't know about it - but it's a lifesaver when you're starting fresh
What prompt did you use for that?
Wonder if it is just struggling for me because of the aspect ratio im trying to use
Cinematic Shot of a man standing on a cliff overlooking a city, cracked planet in the night sky
And then Cyberpunk in the 2nd text box for style
hello can someone tell me why with the 3060 ti i get very low it/s
What aspect ratio?
9:19
thanks! i have an envato elements subscription also, just wanted to see what i could do with AI, i have used the basic AIs everyone can use but just recently started to learn about stable diffusion
I'll try that, works at 16:9 though
are you on mobile? and what res?
im 6gb 3060 mobile(non ti) 125W at ~ 1.25it/s and at times when it is hot at 1.3s/it - 1024x1024
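Since those two figures use opposite units (it/s versus s/it), here is a quick sketch for turning either into seconds per image. The 20-step count in the usage lines is an assumed example, not a number from the message above, and the function names are made up.

```python
def seconds_per_step(value, unit):
    """Convert a sampler speed reading into seconds per step."""
    if unit == "it/s":
        return 1.0 / value
    if unit == "s/it":
        return value
    raise ValueError(f"unknown unit: {unit!r}")


def estimate_seconds(steps, value, unit):
    """Rough sampling time for one image, ignoring model load and VAE decode."""
    return steps * seconds_per_step(value, unit)


# Assumed 20-step run at the two speeds quoted above.
print(estimate_seconds(20, 1.25, "it/s"))
print(estimate_seconds(20, 1.3, "s/it"))
```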
Hydra jordan with sd 1.5 realisticVisionV40 and LyCORIS ,painful
aspect ratios are heavily biased 🫂
but try to like move it up and down a bit, often things get a lot better by making a small compromise
im on pc, res is 512x768
Got some interesting results this run through
if i want to run stable diffusion via my browser, do i have restrictions compared to running it locally? i have a macbook so i'm afraid i can't run it
yeah, seems kinda low for a reason
is your cpu old? also what is your xformers version?
how can i see the xformers version?
if you want to use AI commercially, you can use Adobe Firefly - on the legal side they've cleared everything for their users
cpu is a ryzen 5 5600x
The cracked/broken planet is hard. Because do you mean the planet you are standing on, or the one in the sky. I don't think it can differentiate
it looks really really good on your end
lol @ people coming in here to ask if there are restrictions. this only means so many things
anyone have a good example python script and workflow.json for img2img using comfy
models/stable diffusion
I meant the planet in the sky. Yeah, it's super tricky, been trying in sd 1.5 as well, and I couldn't even get the cracked planet anywhere there lol
you are using comfy right? should show you when you start it
at the very top
RIP this channel
long gone are the fun times, now its all basic tech support ._.
You know there is a #🤝|tech-support channel right
This is specifically for SDXL discussion
@winter raptor
I'm gonna quote this next time we devolve again
@boreal bough don't make me call you out 
I'm liking the composition you are getting
doesn't show here, anyway go to tech-support and ask around, i can't remember the command to update xformers
oke
Exploding planet works quite well, but maybe not what you have in mind
Same place you put the 1.5 model
this channel feels like K-Mart now for some reason
Who are we kidding, the channel always devolves 
real K-Mart vibes.
It’s gunna be even worse when 1.0 comes out
What??? with restrictions i meant like can i run everything like i would if i ran it locally, i am only researching before i want to learn
can we bring it up to the level of Sears, if not Macy's, and ideally Radioshack but i know we'll never get there
I mean I'd prefer for it to not shut down like radioshack, so maybe we don't aim for that specifically lol
we'll just spam it back on topic with images & stuff XD
thank you!
go nuts, kiddo
I've noticed it can do the exploding planet well as well, struggles with the shattered planet much more. Been trying to find a way around it using the split prompting, mixed results. Maybe a lora is required.
Maybe get a planet and then inpaint cracked planet
christian slater approves this message
not sure what mac you got, but you almost for sure can run it on your CPU
PowerMac 5500
why not do it in steps? first make the planet and then add a few steps for the cracks
I mean that's sort of what I meant
I heard someone say that there will be multiple sdxl 1.0 models. Y’all think that could be true? Cuz I don’t believe it
eh, i was a bit annoyed with sytan's username/hardware saga
btw, found a way to not let it fully denoise a pic in the amount of steps the K-sampler outputs?
macbook pro 2020
2GHz quad-intel core i5
intel iris plus graphics 1536 MB
16 GB 3733 MHz
What do you mean?
yeah, seems like CPU mode is the only possibility for you
it takes much longer to output pics like that but it is possible
using google's cloud-based CPU servers would solve this issue, right?
when you, let's say, give the k-sampler a task and ask it for 5 steps - it would usually fully filter the noise into a picture
if you give it 20 it would still fully filter the noise
Yeah, just set "Return with leftover noise" to enable
mmm right that is an option
it sounds like you're talking about what we do with the refiner
nah, just prompt injection mid steps like in au1111
is there a specific command?
No, just 2 KSamplers with different prompts fed into them
mmm yeah, so what i've been doing so far
so it does transfer the noise... in the preview i couldn't see any noise so i thought it fully filtered it after x steps
this is a complex example but this is how it can be done: https://comfyanonymous.github.io/ComfyUI_examples/noisy_latent_composition/
If you tell it to yes
the previews don't show noise because they show the denoised image not the Xt
I thought it was more useful than showing the image gradually denoising itself
any way to show the noise too?
why would you want to
just to know how much noise is left to be filtered, more noise means that more can still be changed without adding more noise
i usually use that to know when to inject the prompts in au1111
you did great tbh, it works better than au1111 and i gotta give ya props for that, got a donation page?
It sort of works, put the prompt "Cracked Planet" in half way through a 30 step render
at 15 steps
at 30 steps
what's the minimum amount of images you would recommend for training lora on SDXL?
while extremely cool, I also doubt it. But a man can dream? right? xD
It's pretty believable to me, I would like a photorealism model and an anime model, maybe a traditional art model too
1~30 <- hard mode. would not recommend
30~50 <- lowest that works without too much effort. make sure to tag backgrounds as well!
50~100 <- good amount that avoids 90% of issues
100~500 <- getting harder again (to keep up good tagging), but you can achieve really cool things!
4000~10000 <- while training for 2 eternities, you can achieve finetune levels of results on sdxl with just a small 43mb lora
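The tiers above can be read as a simple lookup. This is only a sketch of one poster's rule of thumb: where the ranges touch (for example exactly 30 images) the first matching tier wins, and counts the list does not cover (such as 500 to 4000) are reported as such. The `face_only` flag applies the "divide my numbers by 3" advice given nearby in the chat by scaling the count up by 3 instead; function and variable names are hypothetical.

```python
# (low, high, note) taken from the tiers listed in the chat; bounds inclusive.
TIERS = [
    (1, 30, "hard mode, would not recommend"),
    (30, 50, "lowest that works without too much effort"),
    (50, 100, "good amount that avoids 90% of issues"),
    (100, 500, "harder to keep up good tagging, but capable of cool results"),
    (4000, 10000, "very long training, finetune-level results"),
]


def lora_dataset_tier(num_images, face_only=False):
    """Map an image count onto the rule-of-thumb tiers from the chat."""
    # Dividing the thresholds by 3 is the same as tripling the count.
    effective = num_images * 3 if face_only else num_images
    for low, high, note in TIERS:
        if low <= effective <= high:
            return note
    return "outside the ranges given in the chat"


print(lora_dataset_tier(40))
print(lora_dataset_tier(20, face_only=True))
```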
in pickApic there are 2 SDXL models
SDXL and SDXL aesthetic
(if it's just your own, or anothers face you wanna train, divide my numbers by 3)
no the project doesn't have any expenses
That’s actually really good, would you mind sharing the workflow for that?
Just a 2nd ksampler with a 2nd text encoder.
Feed the first ksampler into the 2nd, with it ending the steps before it's finished and returning with leftover noise.
Then have the 2nd text encoder have something different in that's going into the 2nd ksampler.
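A sketch of how the two samplers might be configured for that hand-off, using the 30-step run with a switch at step 15 mentioned earlier. The field names follow what I recall of ComfyUI's "KSampler (Advanced)" node inputs and should be treated as assumptions; the helper itself is hypothetical and only builds plain dicts, it does not talk to ComfyUI.

```python
def split_sampler_settings(total_steps, switch_step):
    """Return input settings for two chained KSampler (Advanced) nodes."""
    if not 0 < switch_step < total_steps:
        raise ValueError("switch_step must fall strictly inside the run")
    first = {
        "add_noise": "enable",                   # start from fresh noise
        "steps": total_steps,
        "start_at_step": 0,
        "end_at_step": switch_step,              # stop before fully denoised
        "return_with_leftover_noise": "enable",  # hand the noisy latent on
    }
    second = {
        "add_noise": "disable",                  # reuse the leftover noise
        "steps": total_steps,
        "start_at_step": switch_step,
        "end_at_step": total_steps,
        "return_with_leftover_noise": "disable",
    }
    return first, second


first, second = split_sampler_settings(total_steps=30, switch_step=15)
print(first["end_at_step"], second["start_at_step"])
```

Each sampler gets its own text encoder output, so the second prompt only steers the remaining steps.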
mmm well, then im glad that your project went so well
Hm that’s interesting.
Also interesting is whether they'll release information on styles, or if those were just injected prompts
I’ll give it a try, I haven’t played around with changing the steps, noise output, etc, in the samplers yet.
It's the same sort of thing you do with the refiner
Hope it is not spam - my fav dragons
good question, because the 2ndary prompt was a surprise and it might have been it
not looking forward to it
These are really cool
likely as it is already, I will probably be abandoning this channel
These are so dope
Imagine standing on 4 legs 🙂
same. but thanks to the bots, they'd be reverse engineerable with a few hours of effort. so we get them either way 😄
cause I'm really liking the sticker style
those are chefs kiss pretty!
weird to see the resolution attention problems at 524x524 pixels while also, this is an sdxl channel not 1.5 RV channel
workflow should be in the images
if anyone has slowness issues on the latest nvidia drivers on ComfyUI update it and try: --cuda-malloc
Does that fix the awful memory management?
"Churrarura" poor dragon got named after spare ribs :"D
trying it now ❤️
that's what I want to find out
Idk what this is, but it looks cool lol
The #SDXL Merry-go-Round
have you tried microsoft mimalloc with 2MiB hugepages?
thanks for prompt ❤️
translates well into other styles c:
@lilac wren Like your F40 so much that I had to do something similar.
Is it just me or changing the prompt halfway through the steps only seems to affect the background?
are you outputting noise as well?
For the last sampler? No
for the midchange
I do
odd 😦
For the first sampler
too many steps perhaps?
If your initial image is already established it will be hard for it to change major components
Try lowering the denoise value on the first one
cool - I play around with subject prompt and the part with location - it seems working well. Also keyword location in context seems to work well, like "tundra location, ancient ruins location, old ruined pharmacy interior location" and similar
The 2nd one should be at 1 i think?
It just seems like the background is more malleable than the foreground for some reason
in case you haven't tried it yet, :: also works, but it changes how the weights work significantly
A detailed portrait of Amy Quesada :: City center dripping with black ink and black slime in the background, lights reflecting gasoline colors:: Bojan Jevtic + Ashley Wood :: maximalist intricate detailed :: ray tracing :: hyperdetailed, maximalist, psychedelic, post-apocalyptic, photorealistic, 64k resolution concept art, dynamic lighting, trending on Artstation
an extreme version using it, to show what it's capable of
thx!
What does :: Do?
and how I used it for your prompt
[[fantasy monstrous compendium zoology illustration]], (epic colored painting of legendary animal fauna style "Churrarura" animal-hybrid, dynamic pose, Full body, (aquatic triceratops necromancer bone-dragon which body is made from bones, with visible skull and bones through decayed flesh, (skeleton dragon:1.6)) roaring:1.1), (by Norman Rockwell and Kazuki Takahashi and Frank Frazetta and Takeshi Obata:1.2), MTG illustration concept, [dnd character sheet], symmetrical, (detailed biology specie), centered subject, highly detailed, broad brushstrokes, fine art, (fantasy cursed dungeon with giant hall and stone obelisks location:1.4) :: lithograph, risograph
essentially gave like 0.5 to the first part, and 0.5 to lithograph & risograph - which is why they have such a strong impact in the image I showed
uh, :: isn't handled specially 😛
New option --cuda-malloc in ComfyUI improved generation time by a second
it's similar (but not the same), to using this
(prompt part 1, with this style going on:0.5) , (prompt part 2, with this other style going on:0.5)
@visual glade --cuda-malloc seems to mostly fix the fucky memory issues. But it still dips into it a little bit and causes a small slowdown. It's nowhere near as bad as before where it would try to fill up RAM and crash my PC
so it ignores prompt order and makes a 50/50 split
Where's the documentation for it if you don't mind?
kinda
not sure about official documentation
came across this
https://aidailies.com/midjourney/what-are-double-colons-weights-iw-and-multiprompts-how-do-i-use-them
Isn't that something that would have to be implemented in whatever system you are using?
Is it implemented in comfy
it's not implemented in most places other than midjourney
@visual glade am I spreading snake oil?
cause I could swear it's working :/
thanks for testing if nobody reports issues I'll probably enable it by default on the standalone builds
I think it's just the fact that : is a token, I don't think it's doing what you think it's doing
maybe XD
I don't think it would really make much sense for what the UI is designed for
As you'd be changing the input of a node part way through.
even among comfyUI, InvokeAI, the diffusers LPW pipeline, and Automatic1111, they all have different attention emphasis syntax
@visual glade Ok did a secondary test, it's definitely a fair bit faster on my upscaling workflow.
A lot actually wtf
comfyui just has (test:1.2)
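For anyone following the syntax argument: a rough sketch of how a `(text:1.2)` style weight could be pulled out of a prompt. This is not ComfyUI's actual parser. `parse_weighted_prompt` is a made-up helper, it ignores nesting and escaped parentheses, and it assumes anything outside parentheses gets a weight of 1.0. Note that `::` gets no special treatment here, it just stays in the plain text.

```python
import re

# Hypothetical helper, NOT ComfyUI's real parser: handles only flat
# "(text:weight)" groups, with no nesting and no escaped parentheses.
_WEIGHTED = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")


def parse_weighted_prompt(prompt):
    """Split a prompt into (text, weight) pairs; unweighted text gets 1.0."""
    parts = []
    pos = 0
    for match in _WEIGHTED.finditer(prompt):
        # Plain text between the previous group and this one.
        plain = prompt[pos:match.start()].strip(" ,")
        if plain:
            parts.append((plain, 1.0))
        parts.append((match.group(1).strip(), float(match.group(2))))
        pos = match.end()
    # Whatever is left after the last weighted group.
    tail = prompt[pos:].strip(" ,")
    if tail:
        parts.append((tail, 1.0))
    return parts


pairs = parse_weighted_prompt("a cat, (test:1.2), lithograph :: risograph")
```

Here `pairs` keeps `lithograph :: risograph` as one unweighted chunk, which matches the point that `::` is just more tokens unless the UI implements it.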
stunning portrait of Caith consuming a horse size placebo pill, epic, cinematic
oops
wrong window 
@visual glade Yeah this is like 20 - 30 seconds faster over my entire upscale workflow
that's on the latest nvidia drivers?
how does the speed compare to vs 531?
I cba re-installing the previous one, but I was getting something like 100 seconds for my entire workflow when changing prompt, now it's 80
using sytan's workflow, adjusted to 30 steps, i generate in 18 seconds with new drivers and the cuda malloc switch. feels faster. i'll relaunch without the switch to see
ok thanks for testing
time to enable it by default then
but only on the standalone
@visual glade Could you put the switches on the main page along with the keyboard commands maybe?
So people know they exist.
Or even just put run main.py with --help to get a list of switches
I was looking for a wiki page that listed them and couldn't find any
Wow prompt scheduling is so powerful
hmmm. my generation speeds are all over the place. without it i'm getting 11 and 10
😄
i'm on a 4080.
If it's for memory allocation it probably shouldn't really affect cards with larger memory. I forget how much the 4080 has.
Yeah, I have 10GB
So before on these drivers, it was being stupid and going OOM when switching to the refiner, but it didn't report that to the app for it to stop, so then it just starts leeching System Memory until it's full and your PC locks up
looks like across a longer set of gens, it's the same speed on both counts for me. what you say makes sense.
if anyone that has the standalone updates now you should get it as long as you didn't remove --windows-standalone-build from the .bat
probably eventually
I have to do some tests to see if it breaks AMD before enabling it for everyone
8~10 can be made to work - but does need to cut corners
should we add this to the target line if we build from git?
...I should really save one of the 8gb layouts so I can repost it
from git add: --cuda-malloc
alright, ty.
assuming it doesn't break anything I'll probably enable it by default for everyone soon
ooooo
using like 5GB less VRAM now on vae decode phase
and it seems faster
quite a bit faster lol. nvidia really messed up something
i can batch 3 with VRAM to spare now
instead of capping out at 2
xD
What exactly did NVIDIA do to break it? because this is a massssiiiive difference
went from 17 secs per image to 13 secs and able to batch 3 at a time, and still have vram to spare, not enough for a 4th batch though xD
Does automatic1111 work with sdxl by default ?
/dev branch does
You have to "cut corners" with 10GB of Vram? I thought 8 was sufficient for all features?
should probably merge tonight or tomorrow
no way... lol
8 is definitely not usable for sdxl from what i know unless you also use system ram/cpu and this
I have seen SDXL run fine on 6GB VRAM
will make it take.... so long
hooow? Lol
tiled vae - though I think comfy made that default for low vram systems (so you dont see it happening)
I have seen plenty of people running it just fine on 6GB
not sure what else is being done in background
The base model only uses like 5.9GB when genning, and the refiner only like 5.1, it's just VAE decode that uses a lot of VRAM
sacrificing speed. switching models in and out. different optimization tricks. comfyui gives lots of power to that kind of cause
Wait.. So if you have 8-10GB of Vram. You need to use tiling, and not a native res?
In my experience, personally, 10GB VRAM doesn't have any compromises for SDXL
Usually only for vae decode
vae only - basically colors are a small bit off - but not in a way that's a real issue
load/unload - wasn't 12gb the magical amount where that stops happening?
Ah. Ok. Thanks btw.
oh, well maybe, but it takes what, like a single second?
I think the text encoders take more time than that
depends on a lot of things. 1~20 seconds, depending on your environment (Ram/ssd/hdd)
usually on the lower side though
also, 12GB was the magical number where people were getting OOM cause the new NVIDIA bug was getting big enough balls to keep the models cached
on my old PC it was like 2 seconds, now its like 1
at what time is SDXL releasing?
new GPU should be here any time now
its already 18th here
nobody knows
git your pitchforks AND LIGHT THEM ON FIRE
we needs 1.0 XD
My guess would be in like 26 hours ish, about 3PM, that's when I feel it would launch
Stability is a UK company. where the fuck is it ! 🔥 ⛏️
assuming its even dropped
26hrs?
that would be the 19th
true
oh really? Thats surprising
its 7pm on the 17th in london. they're fucking lucky. no wrath today
I hereby predict, in 19.5 hours!
I hope they took the time to launch it with controlnets ready
SAI's reckless abandon and hype mongering felt very American to me lmao
UK has precedence for machine learning in their law i think
lets see how right/wrong I get it XD
the rebel spirit is much much older than america
If that’s the case it’ll be tomorrow morning for me. Ugh works gunna go so slowww
I'm just patiently waiting for my 3090 ||impatiently||
Don't think we do.
We are just fucked after leaving the EU so wanting ways to make money. If the EU is going to start blocking stuff the UK is a much better location for an AI Business.
the suspense is agonizing lol
I hope you get your 3090 before 1.0 drops so you have time to install it before it drops
oh I will
its out for delivery right now
I just need it sooner rather than later, cause I need to sleeeeeep
maybe for you lol
s/waiting/dying
am so tired right now
Let's enjoy these final moments before this channel gets flooded with people asking the same question every few minutes
whats that question gonna be? anyone willing to bet money?
“How to use sdxl”
So far it's "when does it drop?" "does it do NSFW?" and "can I make money off of it?"
"why does my img2img look like shit"
Gonna be full of that
soon/soon/soon?
the best ones are the ones that come in and just try write prompts directly into random chats
I mean you didnt ask legally XD
absolutely based
Those are the 3 I keep seeing in here nonstop
I guarantee most of the people that ask if they can make money off it are children
you should expect those sorts of questions when involving such a large and diverse community 🙂
I wonder if they'll stick with the responsible AI license, or write a custom one
Thinking they can become millionaires from having a computer generate images
Free money, no effort
well its probably more likely than doing real art 😂
These are the sort of people that get scammed lol
Depends what art it is and who you have to buy it.
18th? 19th?
I'm soooo happy I left IT in time XD don't think I could bear working through all the AI scams 🥲
It's going to be funny tomorrow when they delay it
I wonder when the SDXL 1.0 weights are coming out, and if they will release the Style "photography/cinematic" weights as well
AI scam?
literally tomorrow apparently
I mean that's only if you work in service desk. I did that for a few years, but it was internal so a slightly less amount of morons
We can hope
#🌶|off-topic if you wanna talk about it - but yeah, pretty bad
they promised 18th 🤷♂️
They didn't promise
They said target
One of the devs was in here last night pointing that out
Genuinely already wanna throw up
well they will delay it if they have to
well any sort of visual art really, otherwise you'd need years of training, and in most cases you just end up doing industrially streamlined art styles with little creative freedom
Idiots who don't know what they are doing plastering bad info to the masses
Just wait for the Youtube videos
as someone who actually goes out and shoots photos with a lot of kit, i wholeheartedly understand why people seek realism and photography from AI, doing releases for every single person if you intend to make money or even just practice is just crazy depending where you live
Didnt mcmonkey just say yesterday that they were releasing in a couple days along with the comfyui presentation layer
Its definitely coming on the 18th
He said target was 18th
Bro he was clearly implying they were set for the 18th
theres gonna be tons of img2img tutorials that dont acknowledge it being broken at this point
is this the latest version then?
(asking for a friend)
they also said that they might do a delay and have like another beta model
tbf, their prompt guide is correct
no overkill tags, less commas, few on point negatives
I have a potential solution for the broken img2img, but I need somebody with coding experience to fix a node for me ._.
Second is the negative
I try not to use commas at all now, it's just an extra token to add noise into the prompt, it's rarely helpful.
Okay, they are going to release on the 18th. What type of last minute decision is gonna push them back?
wont be surprised if it delays, not beholden to release before its ready 🙂 will be disappointed but not surprised
would be nice if they could update us on that before it releases so we dont hype ourselves up like little children on christmas eve
yep - just corrected
What's broken about it?
I think that was the point in his message, to stop people being daft and getting over hyped and then getting mad tomorrow
It's a lot to explain, and a lot of guesses on my part, but I have had a pretty damn good track record with my guesses towards SDXL
What node
I wonder if @uneven dove could help me work with it
saying something on discord is a poor way to get your message across
It's a third party node
When I've used it, it seems to work well enough
Have you not seen SAIs messaging before lol. It's not brilliant.
My whole workflow is img2img so I'm curious
Oh no, this new thing I am talking about is leaps and bounds beyond anything I have done with SD up until this point
Is it actually bad for sdxl
how is saying something in a public forum for discussion a poor way to get your message across?
Also, just wanted to say that I had a staff member tell me my workflow is actually the right/intended way to use SDXL
I mean it is in their paper.
I think you just interpreted it properly.
Which makes me feel pretty good
no kidding, this is a pretty damn good guide for prompts
I am amused by the one interrogator example they included - but rest is good
Care to elaborate?
theres no point using it right now other than experimentally.
guessing they just lurked in the bot section?
No it's notttttt

That is a terrible way to write that prompt lmfao
oh boi........
so how would you tell a complete newbie to use it?
yes tell us
genuinely curious
Well Sytan fix it before 1.0 drops lol
The way I have it documented in my workflow, I included a whole multi-paragraph explanation as to how I recommend prompting everything
I don't think I can
but that's for photorealistic only
depends what forum. if you post something in discord chat and expect enough people to clip it and share it, it won't reach very far
i get tensor size doesn't match nan error, how do i fix?
The way I try to write prompts, is imagine what these images would be called online. Filenames, how they'd be described.
Now an image of a person for example, when would anyone put "Detailed Beautiful Eyes" into an image like that.
I need to fix a third party node to selectively upscale tiles of the image, then run a very low denoise pass on top of them, then pass that denoise pass on to a refiner sampler that only utilizes the third fourth of the sampling time scale
Did I not show you how to prompt in pseudos server?
Is that the ultimate SD Upscale node?
all channels will become tech support
Nevermind, I showed diodotos
it is inevitable /echos
My prompting guide works consistently across all styles, it's just a formula
nope XD
So by img2img you mean refiner only img2img?
Refiner is kind of a dumb model Id rather use base
Yes, image with base, denoise value, refiner on top as an img2img pass
god this refining thing just complicates everything
yes you can see it says: cudaMallocAsync
I hope this becomes just a few sliders in Webuis
It does, but the results speak for themselves when they work
I'm just curious if it was the only place it was stated. If so, sure. But if the message has multiple instances, i don't mind. it makes sense to me. it's on me to read the news and search for info, read the blog posts, threads, documentation, etc. Just my thoughts but I'm not sure how people live without search bars
refiner strength, refiner prompt, refiner steps, etc
In my workflow for comfy, it's not even a slider, it's all setup for you, you just need 2 positives and a negative
@high skiff Are you just wanting to essentially add a proper way to do the mixed diffusion on the Ultimate SD Upscale Node?
Because I was looking at that as well.
some guy was talking about a quickfix solution on reddit but I'm not fully certain he understands what he's talking about or just pretends to
well i think they should make a statement before tomorrow regardless, if they decide to delay it.
I might try...?
Low key kinda wanna make a funny YouTube video when sdxl 1.0 comes out on how to set up and use it.. prolly won’t tho but I have some funny ideas running through my head😂
I think they should do whatever they want in retribution of some individual forcing them to make statements before they were ready by leaking 0.9
If this was true and he knows all about the code that's causing it. Why doesn't he fix it?
but that's probably just me
yeah he seems like a Dunning-Kruger kind of situation going on.
They are right here, right on the nose
you either have 100% noise or 0% noise
Not what I wanted but looks cool 😄
no no no do it but do it on a 4 GB VRAM card and then rage about it not working and then insert a short clip of coming to discord to ask why
They are saying that it doesn't do that in the normal KSampler though. So surely there should be an img2img difference in using them. But I get the same results out of both for the same equivalent denoising rates
a carnivorous rabbit 😮 you are twisted lol
so could you control the noise being applied better with a custom node?
the problem is, I am not sure how to describe what I need to do with the node. The only person i know who can code my workflow is pseudo, as he was my partner in implementing it into the official diffusers workflows
and then the end of the video will just be you zooming in on wherever the recommended hardware/software is posted
big problem is people start using the base or the refiner wrong as they're trying to figure it out, and then they don't take responsibility. they blame the model and refuse to believe they're doing it wrong. so they double down on their workflow and make it "correct" with weird convoluted work arounds. always happens with new technology platforms. technically inclined newbs get in there thinking they're experts in the new domain, and they haven't even read the manpages yet
Olivio let me down..... 😦
potentially
I have an idea for a workaround for it that could use just stock nodes, but it would likely be messy and lossy
i like potentially
ego and bravado all over the place
and would still have very little control
Surely @visual glade Is the best person to ask about this anyway
@high skiff Are you saying we should only use refiner for img2img? Or your fix applies to the base model too?
idk, img2img works fine for me
well i asked on element if anyone was looking into it but no answers
my fix uses the base and the refiner on one image without using img2img, thats what the fix is
I'll make a megapost on reddit & civitai a few days later (prob sunday) - after making sure nothing changed about lora training
Showing how to do training - different results based on different training setups
• how to get around refiner issues when adding loras
Linking to working comfyui setups
Adding a bunch of prompts that work out of the box
and will try to keep it updated as things change
when you do it correctly even using img2img can have details preserved, it's a non-issue
I am for sure, I just don't have the raw python knowledge to get it working myself
it should be noted that I am more focused on high res fix than straight up img2img though
ok ok so i'm supposed to use the refiner first, right? THEN the base?
(i'm trying to come up with the most absurd questions possible so i don't get irrationally angry on release)
I hope they release refiner for 1.0 but I also hope it goes out of style with the community lol
and oh god I hope mcmonkeys grid is released by then - cause I'm not excited to stitch together 200~400 images x_x
same here haha
refiner 1.0 is "finished" according to emad twitter
Finished as in complete or finished as in gone?
I wanna play with mcmonkeys grid addon big time
yes
is this the good infinity grid or the bad OOM xyz grid?
who knows, maybe the refiner is gonna be outrageously good, we have seen nothing of the 1.0 refiner
i wouldn't call that reddit ranter right when he says the only potential fix is some magical solution that hasn't appeared yet
its amazing
the good shit - with probably more features based on his excitement
i'd say he fundamentally misunderstands things if he thinks only magic can solve it
you can have a single grid that can automate 10's of variables, and its all stored in its own GUI that allows you to pick and directly compare images
well high res fix is actually what im trying to achieve aswell, but thats basically also img2img at a different resolution than the native resolution
magic is tech you don't get so makes sense
bruh this is my jesus
i have an 8640 image grid that just finished on runpod
bullshit artists, especially technically inclined ones, tend to weave a little bit of truth into their tapestries. gives it that truthiness feel
i love me the infinity grid
if my understanding of how I got SDXL to where it is right now holds true, I am pretty sure I could get it working, I just need some proper brain power time
the only problem i was able to experience with img2img was when i tried refining the finished result
i can channel some to you with my crystal ball
i think asimov said that first, and it was more along the lines of technology that is sufficiently complex and unable to be understood is indistinguishable from magic.
I don't think solving noise problems is quite there yet. feels magical but it's just well made technology that is easy to grasp if you try
I will say, my workflow works very closely off of his, and also is limited for the exact reason he states, and my workflow was good enough to make it into the new img2img code for diffusers, so I would say my guess that he's onto something is pretty accurate
Yeah I've been wondering if that's what people mean when they say img2img is broken
it likely is
i don't know why the refiner can't do img2img outputs, it just can't lol
The base model seems to behave pretty normally for img2img in my tests
If I have the refiner enabled in "From image" in SD.next I keep getting "'Tensor' object has no attribute 'astype'" error. Error goes away if I disable the refiner. Also refiner works perfectly in "From Text" .. anyone had that issue ?
I was shilling for it in this server - until mcmonkey tagged me. then I posted HIM the link to my favorite addon - the infinity grid... and saw that url I was linking was his own damn domain XD
Refiner is probably fine by itself for img2img with low denoising strength too I guess
I would sacrifice controlnet for the infinity grid
same
yeah so you're using the base model, which you can get to preserve SOME detail with very low denoise. But you are basically not getting the full SDXL quality out of img2img no matter what
it's actually great, it can improve the quality but it doesn't change the img
he's parroting the same research that's been going around. He has some right to his words. but he says things are unsolvable currently.
Yet people are doing img2img gens very well. there's a ton of experimentation going on. Perhaps he's overstating the problem.
puh-leese. i can use up to like .8 strength img2img
and it looks great
"is quite there yet" = "is sufficiently complex to warrent 'magical' techniques"?
the post seems to illustrate things as unsolvable
well maybe you're not doing images that are supposed to have fine details like skin texture in the first place, in which case it could work. But you're still missing a lot of potential quality increases by not using the refiner (which doesn't work for img2img)
yes i am
input -> output
@shy kelp instead of calling me a clown, quit proving to the world that you are one
woah you turned shit into more shit*
tired of your stuff, welcome to the block list, child
👏
Hmm, the Ksampler and Ksampler Advanced nodes do have different results for Img2img, even with the same supposed denoise values
It's kind of annoying you cant end at step 0 in comfy
end at the beginning?
beginning at the end
if you mute the tab, does it still pass it on?
Like just to pass through when you dont want to do double prompts
Ill try
end of time and space beginning of every end
ouroboros
have fun ruining all your good SDXL images, kid
Yeah no
the danny devito transformation above was after putting the input through img2img, eight times
it doesn't nuke details, some people just don't know what they're doing
this is a bit wrong: the amount of noise returned if you enable return leftover noise is the exact amount of noise the next step expects
@high skiff Confirm I'm not being stupid here.
If I have 20 steps 0.600 denoise is the same as starting at step 8 isn't it.
It's the same number of "denoise"
this is why if you chain samplers it actually works
denoise at .8 isn't the same as starting from a given step
bro of course it doesn't remove detail at .8 😂 you're basically regening the same image over and over and probably didn't use that much refiner strength for your first pass to begin with 😂
denoise adds noise, starting from a given step simply continues with the residual noise
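The two ideas being compared here can be sketched with a toy schedule. Big assumptions: `toy_sigmas` is a made-up linear schedule, not what any real sampler uses, and the `denoise` behaviour is modelled as "build a longer schedule of `int(steps / denoise)` steps and run only its last `steps` entries", which is my understanding of the basic KSampler and not confirmed anywhere in this chat. Starting at a later step just slices the original schedule instead.

```python
def toy_sigmas(steps, sigma_max=10.0):
    """Made-up linear noise schedule: steps + 1 values from sigma_max down to 0."""
    return [sigma_max * (1 - i / steps) for i in range(steps + 1)]


def sigmas_for_denoise(steps, denoise):
    """Assumed denoise behaviour: still run `steps` steps, but start at lower noise."""
    if denoise >= 1.0:
        return toy_sigmas(steps)
    total = int(steps / denoise)              # longer schedule, e.g. 33 for 20 / 0.6
    return toy_sigmas(total)[-(steps + 1):]   # keep only the low-noise tail


def sigmas_for_start_step(steps, start_at_step):
    """Start-at-step behaviour: same schedule, just skip the first entries."""
    return toy_sigmas(steps)[start_at_step:]


a = sigmas_for_denoise(20, 0.6)     # 20 steps to run, first sigma about 6.06
b = sigmas_for_start_step(20, 8)    # 12 steps to run, first sigma about 6.0
```

Under these assumptions both start from roughly 60% of the maximum noise, but `a` spends 20 steps getting to zero and `b` only 12, so the two settings are close cousins rather than identical.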
Is there any img2img workflow for comfyUI (I mean JSON or image i can "borrow")?
What does the denoise value do then? Does it just add a different amount of noise?
But the starting at step adds some set amount?
Yeah but if you have add noise enabled
If that's the case shouldn't the KSampler Advanced also have a denoise slider so you can control the amount of noise
well, guess i don't know what i'm talking about lol
sytans workflow -> https://github.com/SytanSD/Sytan-SDXL-ComfyUI
or the one I linked is from mimizukari
it does, start at step is basically that
Thats what "add noise" does in comfy Im pretty sure
Adds noise based on the steps you have left or whatever
I will always remember the person that drove back to the store, bought new computer parts, went back home, did the hardware install, and it still wasn't working. Then I came, looked at it, and pressed the power button. When I'm working out problems, I usually narrate. They asked me what I just did. They heard me JUST say, "hmm I wonder if the button was pressed" and when I asked for context about their response, they said they didn't know they had to. At which point, I got mad. Now, I just don't help anymore. Not unless you can prove that you have basic literacy and the ability to hold dialectic opinions

But how much noise does it add. What pseudo is saying makes sense.
There's a difference between run through all 20 steps, but just add less noise and only start adding noise at step 8 but a whole bunch of it.
you can use either of those two as a start for whatever project you wanna pursue in comfy
@thorn lion the best is "how do i do that?" and you have to direct them toward the glowing button
thank you - appreciate your help
OMG accelerate config
.....
how do you get confused on accelerate config?
I would assume it adds noise depending on the step / total steps ratio like you were saying earlier
For the first step
10 steps, .8 denoise = 8 steps of noise added
Yeah, but you don't have that slider in the advanced sampler in comfy, you only have the start on steps, so you have no control over the amount of noise it seems.
I'm going to try add the noise slider into the node and see what it does.
steps are just an abstraction layer on top of the sampler's internal concepts
the start on step will likely calculate the amount of noise by that
you have zero reason to add more noise than you intend on scheduling
tbf, when I started out I managed it wrong twice in a row 🤣
I thought it was hip to return slightly noisy latents into our decoders now lol
things get abstracted to get simplified but people want less abstraction with more granular control AND for it to be easy, simple, and pretty
@civic sigil a field of "active research"
I think, and its just a guess, that the refiner only works well on certain noise patterns leftover by an image generated from 100% noise. If you just img2img by adding x% noise instead of 100% noise, it drastically decreases in quality because that does not make the same noise patterns (leftover by the base model) that it was trained on.
only if you operate in the fractional space between dimensions.
the twilight zone
the timestep zone
but it sounds more fun!
That’s actually an interesting thought. I could see that being the case. But who knows🤷♂️
--cuda-malloc should now be enabled by default if you have torch2.0 and up installed
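A small sketch of what "on by default for torch 2.0 and up" could look like as a check. Everything here is illustrative: `supports_cuda_malloc_async` and `allocator_env` are hypothetical helpers, not ComfyUI code, and the link between the `--cuda-malloc` switch and PyTorch's `PYTORCH_CUDA_ALLOC_CONF=backend:cudaMallocAsync` setting is an assumption based on the `cudaMallocAsync` name mentioned earlier.

```python
import re


def supports_cuda_malloc_async(torch_version):
    """True for torch 2.0 and up, given a version string like '2.0.1+cu118'."""
    match = re.match(r"(\d+)\.", torch_version)
    if match is None:
        return False
    return int(match.group(1)) >= 2


def allocator_env(torch_version, env=None):
    """Return a copy of env that requests the async allocator when supported."""
    env = dict(env or {})
    if supports_cuda_malloc_async(torch_version):
        # Assumed to be roughly what the switch toggles; an existing
        # user-provided value is left alone.
        env.setdefault("PYTORCH_CUDA_ALLOC_CONF", "backend:cudaMallocAsync")
    return env
```

The function returns a new dict instead of touching `os.environ`, so it is safe to poke at without changing how your current session allocates memory.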
this also holds up with, the more noise you add for img2img, the better it gets. if you do 0.8 noise it's way better at doing its thing than 0.2
also very very dependent on the subjects in the image, and the supplied prompt
for photoreal its a lot more proactive - than for some artstyles where it does only the bare minimum of making lines neater and adding contrast
also same. I wasn't confused but it didn't work the way I wanted and it was cause I didn't understand fp16/bf16/fp8 and why they exist. but then I googled the same question people came to places to ask and found that I wasn't the only one that misunderstood. Then I learned from someone else asking the question and someone else performing the labour of answering it because the internet allows information and discussions to be preserved. then I tried again and got it right.
TL;DR try, think, search, read, interpret, try again
@uneven dove I added the denoise slider to the Advanced Node along with the start at step and it absolutely makes a difference.
@visual glade Does the Ksampler Advanced, just add a set amount of noise at the starting step when you start from a later step?
but the ones i'm talking about are the ones who don't ask a direct question. they say "it's not working. what do i do?"
Does anybody know if there is a ComfyUI keyboard shortcut that cancels the current running prompt?
if you have add noise enabled yes
So you don't have control over the amount of noise it adds then?
that's controlled by the step
and, like, it tells you.....if you have a more developed question like "why pick bf16 over fp16?" that's a different story. it's a basic foundational knowledge type nugget but it's new enough that it could spawn questions.
under running, there should be an X for you to cancel it
you can enable that by clicking 'see queue'
sigma is tied to the step which is tied to the amount of noise added
@visual glade is there anything in the plans for making the refiner better for img2img/hires fix?
there's some parts of this core architecture where there's no "fixing" it, it's just something that you were never meant to do
No, that's not how that works
bro just block me already omg
The denoise is the amount of noise it's added on top of the image, so where 1.0 would be pure gaussian noise, 0.6 is 60% of that total gaussian noise on top of the base image
It still does all of the steps from the smallest sigma to the highest sigma, or the lowest frequency details to the highest frequency
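Taking that description at face value, "denoise as a fraction of the full noise" can be sketched like this. It is only an illustration of the claim in the message, not how ComfyUI or any sampler actually does it: `add_noise_to_latent` is a hypothetical helper, the latent is a flat list of floats, and `sigma_max=1.0` is a placeholder value.

```python
import random


def add_noise_to_latent(latent, denoise, sigma_max=1.0, seed=0):
    """Add seeded Gaussian noise scaled by denoise * sigma_max to each value."""
    rng = random.Random(seed)      # fixed seed so the same call repeats exactly
    scale = denoise * sigma_max    # 1.0 = full noise level, 0.6 = 60% of it
    return [x + scale * rng.gauss(0.0, 1.0) for x in latent]


latent = [0.5, -0.5, 0.25]
full = add_noise_to_latent(latent, 1.0)
partial = add_noise_to_latent(latent, 0.6)   # same noise pattern, 60% of the size
```

With the same seed the noise pattern is identical and only its size changes, which is the sense in which 0.6 would be "60% of that total gaussian noise" in this toy model.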
i remember when they tried perlin noise in the thing for a day or two, i don't think i ever saw that brought up again lol
I think there might be a benefit to letting users change it separately. This is an image of a Cat, starting at step 2 and ending at 20. If I run this normally I get this man.
But if I manually add in a denoise slider to it and reduce the amount of noise it supplies I can get a proper cat and I can't get that exact image otherwise.
Thanks, Caith!
I meant a keyboard+shortcut. Because right now in my 1920x1024 workflow, which obviously delivers lots of bad ones, I go through a lot of images. I keep canceling the generation as soon as I see I will not like how it is converging.
It would be way faster with a keyboard shortcut instead of clicking 😉
literally made a reddit post with examples showcasing exactly how it doesn't work https://www.reddit.com/r/StableDiffusion/comments/14yggse/sdxl_09_currently_does_not_work_particularly_well/
and your own examples only confirm it
yknow
i find it interesting
that someone makes so much noise about something NOT working
on 0.9
but we don't have 1.0
I wonder. I WONDER
if they fixed it and didn't tell you
This isn't to do with the model though @thorn lion It's to do with how the inference is being done I think.
they've also asked that once an issue is acknowledged that it's kind of, moved on from 😄 and not to rehash it repeatedly, because it is not useful
I don't think it would be an issue with the model weights themselves but the inference code
yup but there are certain people accountable to SAI, right? it's not out of distribution to consider them employing NDAs after the leak
Oh Arron already said that lol
or just courtesy from the developers like comfy
time will tell... but at least i could potentially have made them aware of the issue if they weren't. or made other people aware of it in case there's no fix tomorrow, so they don't go get themselves frustrated trying to get it working as well as it does for SD1.5
Wow, this is amazing
i bet comfy knows SO MUCH that we don't even consider
would never press them for it though
ah so a single pass rather than a double pass as per your 0.5 release
Is there any way to simply add noise to a latent image in comfyui?
well, I see what you're saying and you make a good point there. But I'm gonna bet that one opinion out of (anybody have stats on people using gen-ai?), or even from the perspective of one specific SDXL beta user amongst many, given that many many people try to make use of the pipelines and workflows and techniques that you are trying to employ, it is highly likely that they have much more information in regards to the issue that you are currently able to provide
no, otherwise I would do that and would have solved this issue by now haha
I'm going to experiment with a denoise slider on my KSampler (Advanced), you can get some different results with it
Seems like it would be simple to do
probably. comfy is a community project though, and the solution might come from some community member if comfy is preoccupied with other things.
I've been fine tuning this prompt all day - it was fun.
portrait of a battered defeated humanoid robot made out of silver metal standing on a hill overlooking the ruins of a destroyed urban city, from behind, golden hour, dystopian retro futuristic, natural light photo, Canon 85L f4.8, ISO320, 5000K colour balance, (pulp art by Robert Mcginnis:0.9) and (pixar:0.7)
and SDXL interpreted it really well! And many thanks to @tender timber . Got me inspired with the prompt he kindly shared.
"preoccupied with other things" could be fixing the thing and not being allowed to tell you about it
I don't know if this helps but WAS Suite has a Latent Noise Injection Node
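For anyone wondering what "adding noise to a latent" boils down to, here's a minimal sketch in plain Python. It is not the WAS Suite node or any stock ComfyUI code; `inject_noise`, the flat-list latent, and the `strength` parameter are all hypothetical stand-ins for the usual "latent + strength * gaussian noise" idea.

```python
import random


def inject_noise(latent, strength, seed=0):
    """Add seeded gaussian noise, scaled by strength, to every latent value."""
    rng = random.Random(seed)  # seeded so the same seed gives the same noise
    return [x + strength * rng.gauss(0.0, 1.0) for x in latent]


latent = [0.0] * 8                       # stand-in for a real latent tensor
noisy = inject_noise(latent, strength=0.5, seed=42)
print(noisy)
```

A real latent is a 4-channel tensor rather than a flat list, and a real node would likely scale the noise by the sampler's sigma instead of a bare strength value.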
this is how my workflow works
Purple is the base, green is the refiner
My workflow has the base handle the low to medium frequency details, and the refiner handle the upper medium to high frequency details
Wow that’s an awesome prompt. I’m still waiting with bated breath for you to release your node setup and workflow one day lol
the comfy project has winamp level energy to me. winamp was such a feature rich professional grade audio player when it dropped for free. breaking open the capabilities of audio playback for all PC users
PSA imagemagick can create nice grids of automatic or N size with one command. pretty sure they have windows downloads you can just add to PATH and use in powershell
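A small sketch of that one-liner, wrapped in Python so it can be dropped into a script. The `montage` tool and its `-tile` / `-geometry` options are real ImageMagick features; the helper names and the default gap are made up here. On some ImageMagick 7 installs the tool is invoked as `magick montage` instead, so adjust if `montage` isn't on PATH.

```python
import shutil
import subprocess


def montage_command(images, output, columns=None, gap=4):
    """Build the argv for an ImageMagick montage grid."""
    cmd = ["montage", *images]
    if columns:
        cmd += ["-tile", f"{columns}x"]          # N columns, rows chosen automatically
    cmd += ["-geometry", f"+{gap}+{gap}", output]  # keep image size, add a gap
    return cmd


def make_grid(images, output, columns=None):
    """Run montage; raises if ImageMagick is missing or the command fails."""
    if shutil.which("montage") is None:
        raise RuntimeError("ImageMagick 'montage' not found on PATH")
    subprocess.run(montage_command(images, output, columns), check=True)


print(" ".join(montage_command(["a.png", "b.png", "c.png", "d.png"], "grid.png", columns=2)))
```

Leaving `columns` unset drops the `-tile` option and lets montage pick the layout itself.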
if you do img2img, it spreads all of the steps across the lowest to highest frequency detail
comfy really whips the llama's ass
it does help, though I can only use stock nodes for my workflow on the official comfy wiki
I had WAS on my old install
cool cool i hope you are right
sorry but what county did you learn that from? I've heard many a redneck saying in my life and that's a new one
painful, but a good alternate solution 👍
the county of winamp!
2.91 version of the iconic software splash screen and remembered track
You have been posting some really good stuff yourself! This is an amazing image #✨|sdxl message
auto1111 released their official SDXL support a few days ago tho, with no hi-res fix... i wonder why......
and, if i remember correctly, community projects have a very well established protocol for being worked on, which include pull requests, which i've understood you think is clown doo doo.......
so start your own with your own rules and guidelines and shut up
OR
be more amicable in your dealings with the community that passionately cares about the community project. whether you wanna make a linkedin profile pic or roop your face into hentai, we are all here for the same essential reason
because major members of this community are working on something we want.
Why would you....be rude.....in that situation? or disagree disrespectfully? Or be a conceited ass?
Am I calling you any of these things? idk. I forgot I was talking about you and started venting general frustrations
technically didn't someone release an SDXL plugin to use with A1111 rather than A1111 doing it directly
someone released a comfy wrapper to load comfy in a1111
that gave me a chuckle
send the result to img2img, that's basically what their high res fix does anyways
oh? if that's the case, my bad
No reason not to use A1111 now
Do you have any recommendations for achieving a specific art style in SDXL? Like say there is a picture and I really like it’s style (whether it be a specific drawing style, photography style, etc). How would you go about trying to achieve it?
loving it ❤️
half a story. I remember seeing the comfy release and SAI staff being like, "we talked to A1111 but" (in my words) they couldn't understand why he was loafing.
auto bloating to the point of containing all of comfyui inside it is something else
like
if we're gonna do THE THING, at least be comprehensive. don't selectively include details
becoming the Emacs of diffusion UIs
Because auto doesn't even have my mixed diffusion, let alone proper noise injection for img2img or high res fix
i really like that you can load up comfy so fast and work on your prompt while the model is loading... i hate that i cant do that in auto1111 (not without changing the code at least)
True
that was when i knew i had to switch tbh
I still have it for inpainting and the openpose editors tbf
Comfyui is really forcing me to learn python better cause of having to make a million custom nodes lol
@high skiff What do you think about this? It's not perfect because I'm messing around. But I put a denoise slider on the Advanced KSampler so you can control the amount of noise it applies more.
Left is what it would normally do, right is more steps allowed, but I've restricted the amount of noise it can use
also Sytan did you even sleep
It's changed the look but I did that on purpose to make it pronounced
Nope
Might need to test this in DMs potentially
it was great that the extensions ecosystem was so active but it wasn't managed. now, i'm not a big fan of regulation or someone telling me what to do in general but it is clear that we need some modifications to systems that are centered around deregulation or self-correcting communities
I am sick, I woke up at 8PM
you have the circadian rhythm of a vampire
It's 12:42 pm
Can if you want. I need to get some food first though.
It's been established lol
I have diagnosed nocturnal sleep alignment disorder lol
well, I've been working on the raaaailroad
looks like its really messing up anatomy, but the detail on the armor is more fine
sorry i've been working on trainings recently with minimal generations and i still have to learn comfy
Sounds good, I'm waiting for my 3090, impatiently lol
I cranked it up really far to get an obvious change
so i don't know what comfy CAN'T do yet or what is more my style to do in A1111
still out for delivery?
Load images from a folder
I don't think the extensions are the issue I think the main repo adds features faster than it fixes broken ones
I’ve really gotta learn what these terms and techniques are, like noise injection. Any recommendations on where to start learning about that stuff?
some of the stuff in the main repo should just be its own official extension, like everything in the Extras tab
okay so just....don't update once you have a version that does what you want?
but also this. i did like that comfy gives you a big blank canvas versus 9287346578923645 settings in A1111 GUI
(i assume. never opened comfy but i am remembering a screenshot)
I like the far superior memory handling
the first two humps are 4 images in SD1.5 in Comfy and note how the VRAM clears
Then 4 in A1111 and you can see the VRAM is kept in use even when finished
Then 4 using Comfy & SDXL
First hump is higher due to the VRAM left in use by A1111 and then comfy released it
I’m loving the skin detail on the right
Here's a totally unnecessary 7680x4096 (4x1920x1024) test image using 4x-UniScaleV2_Moderate.pth for a quick and dirty upscale (the eyes, background depth of field not that great).
I also tried to give the prompt more realism in this variation.
really the only 'settings' comfy has are cli flags for optimizations and setting input/output directories. everything else is just done with nodes
OH SO I'M NOT CRAZY okay thanks you
My GitHub hosts my workflow that features mixed diffusion.
It's a workflow where instead of generating a full image with the base model and then doing an image to image pass for the refiner to fix the flaws, it instead only does part of the base image with the base model, and then passes the incomplete image to the refiner to continue from where it left off.
It takes less time to do, and produces better results on average
It ended up working well enough that diffusers has implemented it as their go to image to image workflow in their pipeline now
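A sketch of the step split described above, assuming a single shared step count where the base stops early and the refiner picks up at the same step. The field names are modeled loosely on the KSampler (Advanced) inputs as I remember them, `split_steps` is a hypothetical helper, and the 0.8 handoff is only an example value, not the author's setting. If I recall correctly, diffusers exposes the same idea through `denoising_end` on the base pipeline and `denoising_start` on the refiner.

```python
def split_steps(total_steps, base_fraction):
    """Split one denoising schedule between a base and a refiner sampler."""
    handoff = round(total_steps * base_fraction)
    base = {
        "add_noise": True,                    # base starts from fresh noise
        "start_at_step": 0,
        "end_at_step": handoff,
        "return_with_leftover_noise": True,   # hand over a still-noisy latent
    }
    refiner = {
        "add_noise": False,                   # continue, don't re-noise
        "start_at_step": handoff,
        "end_at_step": total_steps,
        "return_with_leftover_noise": False,  # finish denoising completely
    }
    return base, refiner


base_cfg, refiner_cfg = split_steps(total_steps=20, base_fraction=0.8)
print(base_cfg)
print(refiner_cfg)
```

The point of the design is that the refiner never sees a finished image: it receives the partially denoised latent and only runs the remaining low-noise steps, which is also why it takes fewer total steps than a full base pass plus an img2img pass.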
sorry. settings and parameters
so that includes things you can set on the nodes but comfy is just prettier
according to comfy, comfy's default is equivalent to auto's --med-vram
and I play Satisfactory so spaghetti doesn't scare me
yeah not letting go of ram is a serious a1111 issue. especially when you can count your ram on one hand 😉
It’s always the eyes with SDXL
but I am loving the detail in this image
which is fine as that's what I ran A1111 at, however Comfy releases the VRAM after generating whereas A1111 doesn't
Awesome, and how about things like noise injection? Do you play around with the noise levels when prompting, steps, etc, or just focus on the prompt?
this ELI5 aligns with my intuition after reading the paper. thank you for your contributions
and just runs quicker generally 🙂
for a while early on i thought wonky eyes was a deliberate overt watermark on AI images!
Back in the 1.4 days I used to have to restart the server every few minutes as my 1070 would gradually run out of VRAM lol
think even now they're even adding a button in the UI to restart itself for similar reasons
i still occasionally have to restart the server after switching models and resolutions a few times
a flush handle, like a toilet. 😄
I do have an experimental workflow that I call fractional offset, but utilizes the fractional misalignments between 21 and 20 steps in order to inject a little bit more detail
lol
Downside however is that Comfy does like to use system RAM to cache data in between runs, which again is fine as I have 64GB of that and it's at least relatively cheap & easy to upgrade, but could be an issue for people with less than 32GB (potentially)
yea when I first got my 24 gig card I made a big grid with all the 1.5 tunes I had and ran out of VRAM...
A1111? it already exists. In settings -> actions. it would purge the model but it didn't fix the issue and i dunno why
lol mcmonkey infinity grid ftw
no oom on grids
specifically 20 and 21 as opposed to.......actually, there is stuff i need to educamate myself on that you've already typed out. i'll go look for that
some people may say it because the author "borrowed" a significant portion of the code, however I have no conclusive evidence or proof of that
data caching is super nice though. On SDXL for example if you change something on just the refiner it won't re-run the base again.
And on my hi res fix if I want to save the low res I can just wire up a save image after its done and not have to re-gen the low res latents
if you're gonna give me a story, give me the whole story. all perspectives
So, does A1111 just suck now?
orthogonal , isometric, forced
does that actually fix the auto issues or is that a troll lol
Or am I completely missing the point of the convo?
people are overly mean tbh.
its my preferred UI, but it lacks in some areas compared to others
They didnt give them early access to the weights and are now criticizing them for not supporting XL, there is a very obvious agenda here
okay they do actually call it sometimes lol
reminds me of the bad old days of creating a RAM cache to run windows in lol
NOTE WELL THAT I AM GUESSING
like, just unload model, clear cache, close torch, open torch, load model SHOULD fix the issue because that SHOULD clear the vram but if the issue is due to something other than VRAM not being garbage collected properly (which was what my understanding was based on what I read)
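In the same guessing spirit, a minimal sketch of the "clear cache" part in PyTorch. `free_cuda_cache` is a hypothetical helper, not A1111's actual code; the calls it makes (`gc.collect`, `torch.cuda.empty_cache`, `torch.cuda.ipc_collect`) are real. The import is guarded so the snippet still runs on a machine without torch.

```python
import gc


def free_cuda_cache():
    """Try to hand cached CUDA memory back to the driver; True if CUDA was touched."""
    gc.collect()                      # drop unreachable Python objects first
    try:
        import torch
    except ImportError:
        return False                  # no torch installed, nothing to clear
    if not torch.cuda.is_available():
        return False
    torch.cuda.empty_cache()          # release cached blocks no longer in use
    torch.cuda.ipc_collect()          # clean up memory shared across processes
    return True


print("CUDA cache cleared:", free_cuda_cache())
```

This only frees memory that nothing references anymore. If the model (or a leftover tensor) is still referenced somewhere, that VRAM stays allocated, which would fit the "purged the model but it didn't fix the issue" experience.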
right now, for sdxl specifically - yes
in all other instances, no
also this statement will probably change sooner rather than later
Did they give ComfyUI access to the weights?
🙂
Yes very early on
but, like, if there is something deeper, well, i'm not sure. but there was something else I was doing and staring at my nvitop and the shape was the same as the memory bleed i got from A1111 so I did what I said above and it, at least, stopped being so serious
And money lol
that's why sdxl works faster for me in AUTO1111, in ComfyUI, it has to load the model every generation
i'm not trolling. I honestly feel like some people are trolling when they don't know some things but no, i am being serious and I think I am right and I am prepared to be corrected
Not really old, I have my browser download directory set to a 48 gigabyte ramdisk
Do you know why they supported Comfy but not A1111?
Idk honestly
comfy himself is stability staff
they were in conversation with A1111 and Comfy around the same time. Comfy did the thing. so everyone used sdxl with them. A1111 only just did. idk why. that's all i've observed.
was he staff before he was staff
like, was he staff before ComfyUI? or was ComfyUI the thing that got them to hire him?
First Image2Image Workflow Attempt.
interestingly the empty cache is only in unload_model_weights() which a cursory glance seems like xyz_grid.py doesn't even call
So then will we see A1111 be as optimized with SDXL compared to Comfy?
i genuinely laughed. was just caught up on the train of thought
Because from what I've heard, A1111 is really vram heavy compared to Comfy with sdxl
ask A
The misalignment between 21 and 20 just happens to coincide with the amount of steps that I normally generate at, technically you could do all sorts of different misalignments, that's just the one that works for me
I'm pretty sure he was hired sometime after starting that project but I mean he's in here so we can just ask lol
i'm just regurgitating information that has been publicly shared. my abilities end where my memory does
it was implemented in comfyui before because that's what we use to test the model
In this particular example, it looks like the noise you're adding is lower resolution than the diffusion, which would result in that big blotchy noise you're seeing.
But I'm kind of curious, are you saying the flow of base model complete denoise -> add noise -> refiner denoise always results in some defect similar to that, or only for upscaling?
I tried flows that did the ''''proper'''' step stop -> finish denoising in refiner and didn't like how it lost fine detail coherence (for example, resulting in kimonos made of tiny bathroom tiles etc)
