#✨|sdxl
loading the lora just affects everything like i would expect now. very nice
Can you show a zoomed in version highlighting the watermark? I don't know what to look for.
Rtx 2060 6gb
32gb ram
i5 8400
When I switch to the refiner model it makes my system crash
I did use xformers and medvram
What else can be fixed?
The overlaid red and green pixels
diffusers may have it built in
AMA - I developed an LLM, what questions should I ask to test it out?
even if you get it running, it will run ultra slow, as it will always overflow into ram. so we're talking like 5~15min per image. might as well use #1100170365604483202
Wow, that's far from invisible. And it covers the whole image? They need a better watermarking method.
yeah diffusers has that watermark enabled by default and I think it's not implemented correctly
it should not be that visible
yes I would also use the automated feature available in auto1111, kosmos2 offers solutions for those elements not typically found in the other datasets simply because the original human organisers didn't include them, yet basically with the built in feature we will easily find the poses, genders, clothing etc to make the images appear as we deem/
Is there any optimized version of xl 1.0 models for these 6gb variants?
i give up on SDXL im going back to 1.5
Or I run 0.9?
the most efficient way to run SDXL right now is ComfyUI
But I hate node system 😦
yeah i was thinking that it could really be less visible and actually invisible
if I have two cups of water, drink half a cup, pour a new glass, drink another half cup, how many cups of water remain
not on day 3 at least. give it like a few weeks, to see what optimizations pop up
but you can run it on any machine, using google colab (12gb vram)
Does the paid version of Colab give 12gb vram, or the free one?
comfyui can do SDXL on free colab
hey does someone know how to change the sampler in comfy ui?
on the ksampler node
thank u
I understand the need to watermark to prevent AI training on itself and maybe even the need to embed information about who generated the image in case it needs to be traced but it needs to be less visible.
i meant how do i add a sampler that i downloaded?
you can download samplers?
i really like how this lora is working like i've loaded a full model now
yes
how do you download a sampler?
Yeah, I'm digging through github issues now. There's a variety of reports and the consensus I'm getting is that it's not supposed to be that visible
Also waiting for inpainting model
GREAT, putting this to test
Yes I did, the answer is YES, but refiner model needs loras specifically trained to work with It so if you want to use loras with SDXL and the refiner model you'll need a pair of them, one lora trained for the base model and the same lora trained for the refiner model, and load both (one in the base and other in the refiner) in order to get refined images that keep the lora. Not very practical yet.
there's currently no solution for it on diffusers. most people have decided to disable the watermark generation
that's not for comfyui and it's not actual improvement
is there code for training refiner lora? Kohya doesn't seem to support it
the karras sampler is not in comfy ui and it is better
we all wish there were ❤️
I'll add a checkbox to disable it for the moment until it is fixed
karras is called a "scheduler" in comfyui
I would just make a 1024 model from the two and a merge then use the new model as a "refined" set and make a lora with that
you can use it with all samplers
imagine telling the dev how it works
comfy has the patience of a monk XD
AMA - I developed an LLM, what questions should I ask to test it out?
If you start with two cups of water, each containing 100ml, you have a total of 200ml of water.
After drinking half a cup (50ml), you are left with 150ml of water.
When you pour a new glass, you add another 100ml of water, making the total amount of water 250ml.
After drinking another half cup (50ml), you are left with 200ml of water.
Therefore, you have 200ml of water remaining, which is equivalent to 2 cups.
"Write me a dnd statblock for a CR3 goblin that likes using a spear"
The ANSWER it gave back
how do i add a scheduler?
#1098025024541167646 or #🌶|off-topic maybe
good, it has logic awareness and context awareness, the adding a new cup is a difference!
if it gets the HP right, then it passes my LLM test XD
look right below the sampler on the node
so far only gpt-4 succeeds
LLM spam is like meme spam. go to another channel with it
thank u
anyone else wants to ask a question? me asking would be a bit too biased
is your resolution low?
Sdxl possessed, just run the exorcist.bat
its supposed to be 1024 output
for sure
SDXL 1.0 Super Stage VOD is up!
If you missed our live Discord stage, fear not! Here is all the scoop from our SDXL 1.0 stage with our @Stability_AI Applied ML team, Emad (GPU Emperor) and our host with the most Amli! A wild "cam on" appears 👀
Please enjoy.
🌟 Stage Summary by GPT🌟
🎛️ Trained on newer architecture for better control
🔍 Dual CLIP encoders for improved text...
downloading comfyui rn
if you are using comfyui try their discord and drag some 1.0 images over they are likely sharing nodemaps
i was using that automatic1111 i believe
you need the latest auto1111, right click on the area next to the files and open powershell then type git pull if you have git installed
there is also SDnext, they have the same extensions as Auto
but it runs 1.0 and everything
what's the best UI option?
command line lol
How do I view the parameters people used to make the images they upload here? Does discord strip the exif data?
nice discord banner 😂
here's a good starting setup
auto has most extension support and broad userbase thusly more community networking but comfy is basically set to jet and it even includes all these special features like adding noise
thanks haha
and this is the official comfyui workflow
https://comfyanonymous.github.io/ComfyUI_examples/sdxl/
what am i doing wrong why do my images look like this ?
you sure your image resolution is correct?
what resolution should i pick?
1024 x 1024
1152 x 896
896 x 1152
1216 x 832
832 x 1216
1344 x 768
768 x 1344
1536 x 640
640 x 1536
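A quick sanity check on the list above, as a rough sketch: each size lands close to 1024x1024 worth of pixels, and both sides are multiples of 64. The helper name and the report layout are made up for illustration, and the multiple-of-64 pattern is only what this particular list shows, not an official rule.

```python
# Landscape sizes quoted in the chat; the portrait ones are the same pairs swapped.
SDXL_SIZES = [(1024, 1024), (1152, 896), (1216, 832), (1344, 768), (1536, 640)]

BASE_PIXELS = 1024 * 1024  # pixel count of the square reference size


def describe(width, height):
    """Summarise one resolution relative to 1024x1024 (hypothetical helper)."""
    pixels = width * height
    return {
        "size": f"{width}x{height}",
        "ratio": round(width / height, 2),
        "pixels_vs_base": round(pixels / BASE_PIXELS, 3),
        "multiple_of_64": width % 64 == 0 and height % 64 == 0,
    }


report = [describe(w, h) for w, h in SDXL_SIZES]
for row in report:
    print(row)
```

So picking any of these keeps the total pixel count roughly where 1024x1024 is, just in a different shape.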
thanks
Yes, you're correct. My hope is that ROCm (whatever the new version is supposed to be) closes the gap and I hope Intel ARC makes moves too. With AI being the shiny new thing, all parties are vested in making their cards better, or so one would hope. I'm not interested in other AI tasks nor am Interested in video games so I'll wait a few months, see how the software goes and then choose.
why did u remove yours
there were a few confusing dimensions included, that aren't beginner friendly and may cause more confusion than solutions. pure fires answer is simple and correct
Okay, I've checked.
We have indeed included the unet with SDXL.
Maybe we should @everyone
Can we inpaint in 1.0
Someone likely has an inpaint model on civitai, check the huggingface modelcard to see if they mention inpainting
they do in the other releases there is some alterations to the modelling method
and if you can't try matching a 2.1 model for small details like eyes they are still contextually aware of awesome imagery
just tried SDXL base with transformers on my pc. the vram usage when inferencing is 9GB without xformers. however when it is decoding, the vram increased to 14GB. i tried base+refiner and the result is still same (base 8GB, use refiner 10GB, but then when decoding suddenly 14GB). does anybody know why? thanks!
yes comfyui. no presets for it though, so you have to mess with the workflow yourself
I did some rocm 5.6 benches, with 5.5 benches linked:
#✨|sdxl message
are you sure it's the vram you are checking on? much smaller cards can run 1.0
stablestudio will soon get that support though, so might as well wait it out, to have it easy and well working
can you please post it again so i can try more aspect ratios
They said 16gb of system RAM was sufficient they might limit requirements in those parameters
What do you mean?
though how they do that with a 12 GB model is another question, they reduced size to 6...
the ETA in A1111 is so janky. ;_;
1024x1024
896x1344
1152x768
1024x680
1280x720
1920x1080
1184x664
1248x568
512x2048
my refiner seems to quadruple generation time
the others had nothing to do with generation, they were for target settings
but like they say it isn't required
Thanks, seems to be a painful exercise for not much speed gain, if at all. I was looking more for the 20 GB VRAM on some AMD cards as a net offset against the slower speed but I'll wait a bit. Doubt Nvidia will lower their prices and $2100 for a 4090 is beaucoup dollars
rocm 5.6 partially fixed the VAE times so there's that at least.
My 1.5k vae decodes are 850% faster lol
😦
you can try those cardsizes on runpod for 49c an hour
but yea was hoping to see 2k inference times magically like 30% faster
comfyui runs on 10gb vram in standard, and with vae decode tiled, in 8gb vram
A1111 caches irrelevant stuff, so that's not on SDXL1.0
| 1:1 | 2:3 | 3:2 | 128:85 | 16:9 | 148:83 | 156:71 |
+---------------+---------------+---------------+--------------+---------------+--------------+--------------+
| 512x512 | 896x1344 | 1152x768 | 1024x680 | 1280x720 | 1184x664 | 1248x568 |
| 768x768 | 1024x1536 | 1536x1024 | | 1920x1080 | | |
| **1024x1024** | 2048x3072 | 3072x2048 | | 3840x2160 | | |
| 1280x1280 | | | | | | |
| 2048x2048 | | | | | | |
if you want pretty graph XD
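The odd-looking column headers in that table (128:85, 148:83, 156:71) are just the sizes reduced by their greatest common divisor. A small sketch to reproduce them; `aspect_ratio` is a name invented here for illustration.

```python
from math import gcd


def aspect_ratio(width, height):
    """Reduce width:height to lowest terms, e.g. 1280x720 -> '16:9'."""
    d = gcd(width, height)
    return f"{width // d}:{height // d}"


# First-row sizes from the table above.
sizes = [(896, 1344), (1152, 768), (1024, 680), (1280, 720), (1184, 664), (1248, 568)]
ratios = {f"{w}x{h}": aspect_ratio(w, h) for w, h in sizes}
print(ratios)
```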
Other problems is certain really new libraries like bitsandbytes have 0 official rocm support, and compiling it yourself may take a lot of tinkering.
I play games a lot and have a 4k screen so the 7900 XTX is justified but if you're only AI I'd consider a 2nd hand 3090 or something instead
I hope Intel and AMD support AI better on consumer cards so Nvidia has competition. But if they really want to make money they will restrict it to datacenter cards.
im running with transformers?
I did see some people running one of those massive like 60B language models using a custom patched 4 bit mode and some other shit they hacked together on a 7900 XTX but I don't have the brain power for that
I'm kicking myself for not grabbing 3090 earlier this year when they were a little over $1100 CAD new. But I'd have to have upgraded the whole machine too and would rather spend that on my mortgage 🙂
yea the house is more important than a toy made of electric sand metal any day
Update for any it/s minmaxxers that are curious:
just hit a new record of 1.5it/s and 29 second generation times for 1024x1024 on my RTX 3060 (12gb vram) and 16gb system ram
I was getting 1.3 it/s before
Updating Nvidia drivers gave me a small boost
Using nightly pytorch version of comfyui gave me another small boost, my favorite thing about it is that the ram total lockup periods are smaller and it seems to hover around max ram use instead of hitting it
Do you just select the value in lora loader or have to put the lora in the prompt as well for it to work?
I'm doing base 20 steps euler, refiner 10 steps euler, no tiling vae
Hmm.. I'm getting 1.57s/it on driver 531.79 at 30 steps (20+10) and euler on a 16 GB RAM systems with the 12GB 3060 and latest Comfy as of this afternoon anyway. I'll have to update the drivers to see if that helps. Oh, that's with the default Comfy SDXL flow, so it has the VAE I believe
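Watch the units when comparing these numbers: 1.5 it/s and 1.57 s/it are reciprocal units, and the progress bar can show either. A minimal conversion sketch, assuming 30 sampling steps (20 base + 10 refiner) as in the messages above; the function names are made up, and the result covers sampling only, so model loading and VAE decode come on top.

```python
def its_per_second(seconds_per_it):
    """Convert s/it into it/s."""
    return 1.0 / seconds_per_it


def sampling_seconds(steps, its_per_sec):
    """Time spent in the sampler only (no model load, no VAE decode)."""
    return steps / its_per_sec


fast = sampling_seconds(30, 1.5)                    # the 1.5 it/s report
slow = sampling_seconds(30, its_per_second(1.57))   # the 1.57 s/it report
print(f"{fast:.1f}s vs {slow:.1f}s")
```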
wonder what Titans go for on ebay now. Might be a really cheap 24 gigs of vram
so comfyui says it will work on rtx 2060 6gb version with xl 1.0 ? with this
pretty close to a 3090 used
damn
stupid college researchers and their 64 bit compute
buying up all the titans
can anyone also tell me what resolution should be kept tho to generate on this configs
Try it and find out
if you're on windows open task manager and watch your vram graph
HEY i finally understood what this meant. you were correct. thank you so much
what resolutions work with 1.5 models?
Just grabbed the latest NV drivers and gained about 4-5it/s.. nice
Newer drivers are usually better. The rumors about some drivers being "slow" were due to not understanding new memory management features.
err...0.05
but how do other 1.5 creators make different aspect ratios?
1.5 models do work at non-512, it's just worse the farther you get from that size
There are various upscaling techniques they use. You can use any resolution but you should understand that resolutions other than 512x512 could cause strange behavior such as duplicated objects.
Nice. Good to hear as i wanna buy a 3060 12gb😁
some of the community models are better at higher resolutions than the base 1.5 model
ok thank u
I'm getting far fewer 170 second generations where the computer gets locked up and getting around 30 seconds pretty consistently now which makes this actually fun now
170 means vram overflow. use vae decode tiled, to further reduce vram if you dont have enough
vae decode tiled was giving me mismatch artifacts around the tiling, while also not helping at all
Switching to newer pytorch and updating nvidia drivers seemed to fix my problems
what is clip skip?
stopping the gen before the last steps
how can i do that in comfy ui and why would i do that?
I only started using comfy yesterday so its as new to me also
some people say the results are better
just started today. im digging it so far. got my own little setup going. Wish there were a little more node options to play with though.
yeah it has so much customisation and theres a lot to learn
it is possible to create custom nodes
is it possible to hide the noodles?
Not exactly. It is about being less specific. 'A woman blonde and with blue eyes' would be 'a woman blonde', so the more clip you skip the more generic the image
There is a Comfyui custom node manager that allows you to download a bunch of different custom nodes. Probably many you might be thinking of are already created
kinda like where every input has an address and then you can link inputs from different nodes into one and make a custom control box?
Launching Web UI with arguments: --xformers --lowvram --no-half --disable-nan-check
i launched with this should i be good to go with 1024 x 1024 ?
on rtx 2060 6gb
Remove --disable-nan-check. NaN is an error and you want to know about it. Add --no-half-vae because SDXL VAE doesn't work with fp16.
Expect it to be very slow.
at what resolution then i keep on
Actually --no-half may also apply to VAE already.
COOL what is it called
1024x1024 for SDXL. Smaller is not good. If you want smaller use 1.5 or 2.1.
idk if it can even handle it
If it works it will be very slow. But you can try it.
hope it wont crash black screen like earlier
If it gives you black screen image that was because of --disable-nan-check. So remove it and then read the error message.
Why not just try it? Takes 1 minute to test.
--no-half-vae or just --no-half
what u prefer
and medvram or lowvram
You need --no-half-vae for sure. I don't know if 2060 supports fp16. So try it.
i see
I don't know. Takes 1 minute to try it or 1 hour to ask every question.
ok launching ill see if it goes to 1024
If you get error show the exact error.
Interesting fail. The last thing you want to see walking home at night is a faun looking at you with his hand down his pants.
Managed to get rid of the watermark speckling in diffusers but I think I've run into the interlacing effect that I think I saw mentioned earlier
use the 0.9 vae or download the newer model with it built in
I'm having the same problem but haven't bothered fixing yet
no errors but slow yes generated 1024 x 1024
correct, that model should fix it. I'm using comfy but I ran tests of the two models and the only difference is those chromatic lines in the vae decoded result
2 minutes is not bad for that card.
ye
Need 16 GB VRAM to run full speed SDXL in A1111.
Just double checking that I am loading the new model with the VAE and not accidentally falling back to the model repo
which one would i need? well i was looking to upgrade cpu tho
now i guess vram is the way
VRAM is king for AI. King above all.
And right now you need Nvidia. Nvidia is king for AI. No AMD or Intel.
but then i have to upgrade power supply too 
Someday AMD and Intel might be better.
you could also use the https://huggingface.co/stabilityai/sdxl-vae/blob/main/sdxl_vae.safetensors which should be the same as the updated model with 0.9 vae
while using refiner model is it compulsory for restore faces check ?
this is the difference between the two VAEs, just subtracting one result from the other on a direct image > encode > decode with each VAE, no sampling
Nothing is compulsory.
alright
Yeah, it doesn't seem to have fixed the issue. I'll try the second link but I have to build an override in - I don't have a way in my Unreal UI to specify different VAEs yet
the restore faces option is a post-process done by a GAN, not part of stable diffusion. You can use it or not, up to you. Sometimes it helps, so try it and see what happens 😉
anyone know what causes comfyui to tile the VAE? I've seen "trying again with tiled vae" in stdout once or twice but it usually just slows TF down when I exceed memory instead of tiling unless I go out of my way to use the tiled vae node
thank you sure
WTF?
i love it @midnight shuttle before after . amazed to see refiner model is fast af then base model takes 2 min
If you are using refiner in A1111 with low denoise (0.25) it will run fewer steps. Check your console output to see.
For posterity, I am a moron. I thought the latest sd_xl_base_1.0_0.9vae.safetensors file was the VAE and was trying to load it along side the baseline 1.0 checkpoint. Hours of my life I'm not getting back.
yes on 0.25 already
and 19 steps
So it is faster because fewer steps. Look in console. Probably only 5 steps.
refiner model speed
on 2060
I can only imagine trying to load a 6GB VAE lol
As I guessed. Only 5 steps. Look in your output.
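Rough arithmetic behind the "only 5 steps" remark, as a sketch: with an img2img-style refiner pass, only about denoise × steps of the schedule actually runs. The formula below is an assumed approximation that happens to match the numbers in this chat (19 steps at 0.25 denoise); the exact rule in A1111 may differ and can depend on settings, so trust the console output over this.

```python
def refiner_steps(steps, denoise):
    """Approximate number of steps actually run for a partial-denoise pass.

    Assumption for illustration: truncate steps * denoise and add one.
    """
    return int(min(denoise, 0.999) * steps) + 1


print(refiner_steps(19, 0.25))
```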
pretty amazing yh
now hoping inpaint model soon in this week or month
and what if u generate images with refiner model ?
Me too.
No point. It is not made for that. You will get strange results.
alright alright
In my defense, the naming convention doesn't make it clear to a novice.
yeah but the file size
yeah the file name isn't great, not obvious
thats nice balance on this card
You would prefer to see it at 100%. It is probably stuck waiting for transfer to/from system RAM.
task manager doesn't do a good job of reporting ML workloads
Again, novice. I had no prior knowledge of what an expected filesize should be. It says XL in the name, could have meant file sizes!
i wish they would have combined refiner and base model cuz it need to switch everytime and takes a bit time to load
They tried but results were not as good.
Often results are good enough without using refiner.
Likeness isnt quite there but its still pretty damn good for a base model
I don't know what magic comfyui uses, but it somehow keeps both models cached even in 8GB of VRAM
A1111 has a long way to go for SDXL support still
It's some amazing magic since each model is over 6 GB. Maybe one-time swap to system RAM isn't as slow as constant unified memory swaps?
yeah it has to be that
but also A1111 is just terrible at memory management in general
whats the lowest vram 1.0 xl is supported on?
me personally? 16
0 GB. You can run with CPU inference but I think this will take at least 1 hour per image. Maybe more.
8GB definitely, I think 6GB is possible?
lmao
might need lowvram option for 6GB though
ofc i am on 6gb working fine on 1.3 mins
what image size?
1024 x 1024
nice
refiner model working nice too
If I have time this weekend I will look at A1111 code to try to understand memory usage.
I was testing it on a A4500 earlier in A1111, 20GB of vram and it was still erroring at 1536x1024
it would get 1 or 2 gens out, then after that each time it would run out of memory when trying to decode, and just quit
that was through wsl and docker though, so I think it couldn't use shared RAM
i want to upscale or refine real world images would that work ? without prompts
with refiner model tho
Upscale just use the Extras tab in A1111. To refine a real world image you must give a prompt so it knows what the image is.
you could use clip interrogate to get a prompt from the image
oh smart ye clip interrogate sure
you are better off trying Gigapixel or something similar. SD models wont know the faces and change them too much
Salad fingers in this one
gigapixel vs 1.5 is a questionable option. vs xl it's a no brainer. set up an xl upscale workflow and blow gigapixel away
for photos?
yup. gigapixel is over 5 years old. its like 3 generation of AI behind. old news really. and it's still paid software.
i am here
what did i miss
ControlNets came out!! ||naw jk||
... why spread disinformation
chaotic neutral
lamest excuse for acting like a shit
honestly, you filled my heart with joy for a brief second so worth it
💩
I cant imagine how messy comfy workflow will look like with multiple controlnets working together with upscale etc
its gonna get fun

technically there was a circles controlnet released for SDXL 0.9, but I think the page got deleted, it's a 404 now https://huggingface.co/sayakpaul/controlnet-sdxl-circles
wish i could code so i could help, but i'll be moral support instead, LG TEAM!!!
yo did you ever figure this out? its clearly using latent fill rather than original fill but how to change🤔 or it could just be using grey pixels
you could probably track it down if you really want to generate a bunch of circles though
https://huggingface.co/lllyasviel/ControlNet/tree/main/training the training data used for controlnet 1 here. any of us could make our own
its not very hard either, thibaud did 2.1 with little experience
when he released that i figured he was some phd guy
no i helped him learn DB when he made the double exposure model
SAI have mentioned they have some internally already, but they haven't released because they're trying to make a lighter weight solution
they are changing the Controlnet architecture for sdxl, theirs is not really going to be quite the same as training an old-style one again
well it was a community effort
old controlnet still works with sdxl. they're making a new controlnet architecture
we just need to train models
yes and I wish they'd release it externally (whoever made it quietly puts it on their personal HF) but if SAI resources were used then they probably cant
ugh I hate that there's still no way to hide the '1 blocked message' thing in discord
really want a lobe type interface for comfy, i'll pay the extra cpu cost
https://github.com/canisminor1990/sd-webui-lobe-theme
they had 0.9 models but they don't work well anymore and are huge and hard to swap in and out of memory.
new research since has allowed an opportunity to do it a better way
ik im just impatient, like how we all trained 0.9 loras
hoping the wacky thing im trying ends up making it much easier for people to train cnets with less compute at some point (but like, no promises)(we have a few though to test)
The problem is ControlNet in its current state makes a complete copy of the UNet which becomes cumbersome for SDXL - you'd have another billion parameter model on your hands and it may not actually be necessary
@sour obsidian and @visual glade have a much better idea
i'll take that as a promise!
This isn't a direct reason for it at all, but something I personally think about is the energy cost of inferencing generative models and the impact on the environment. If we can standardize a much smaller model with the same results it's a win win for everyone and everything.
I'm excited for the future potential. Controlnet made 1.5 SO much more powerful and useful for img2img workflows.
ohhh is that the same seed without a lora? geeze it leaned into my prompt good
Hi, do anyone know where can I test run the SDXL 0.9 refiner model online? My computer has no GPU, so I'm looking for online trial
so far from early testing controlling xl is going to be so wild. So much more power behind it, really hoping these improvements turn out 
Do it again @sour obsidian do it again
lol you can do this too, you just want to showoff
❤️
oh wait i just realized. this was a controlnet demo! 😮
I'm in bed on my phone hahaa I just can't keep away 😉
you scallywag
any plans for a TemporalNet or other vid-aimed controlnet releases for sdxl?
definitely thinking on it
nice
hahahah
I just want pretty pictures, more pictures, more pretty
smooth pretty?
idk its been a long week
haha
Haha
If you've seen a paper or heard of it, we've read it and thought about it basically
until we find out the thing we are struggling on for 2 weeks was solved 3 years ago and was left in a pile no one looked at 

Hi, do anyone know where can I test run the SDXL 0.9 refiner model online? My computer has no GPU, so I'm looking for online trial
I have an idea. Joe strongly recommends using celebrity names for face-lora training. What would be the best tool to find the closest "celebrity lookalike" given an input image of someone unknown? tryna scale this
If you mean along with the base as intended then DreamStudio and ClipDrop both use it
There's a bunch online, not sure the best
... they're all kinda... Not great 😅
Lots of gimmicks
Something has gone terribly wrong ...
relevant username
Name checks out
but like python. I think deepface can do it but in a complicated way
winning!
LOL
DreamStudio has the SDXL 1.0, but I'm not sure whether it is base or refiner 😅
Somebody here did end up explaining it to me and I did solve it, however I do not remember the solution at this moment
I've been away from my computer all day today, I'll likely get back to SDXL and comfyUI stuff tomorrow
I shall await the ancient wisdom
maybe comfy but use cpu instead of gpu
XL spawned new life, but not as we know it
Ok, I believe this was the image somebody shared with me that helped
i love how easy it is to spank out new desktop wallpapers now
Same here, I can get reliable 1440p gens out of SDXL now
Even on ultrawides
same. i'm only operating at 1080p for speed
Ok I think thats how im doing it except with a ksampler in there and my masks are white cuz they're generated by CLIPseg that way
but i still have to go up to 0.8 denoise and the edges are solid, no mask blur even though there is blur set
is there a high res fix in comfy ui so i can generate high res images (1.5)
multicolor glowing node to show where you are in the diffusions
also there's this
https://github.com/ssitu/ComfyUI_UltimateSDUpscale
Trying that setup with a refiner included is giving odd results
is the high res fix in automatic1111 also just an upscaler? i thought it gets generated in high res directly
are you sure you are starting with the base model, then sending it to refiner?
yea i think so 🤷♂️ 😂 \
i think you switched them up
my main problem in 1.5 is this behavior

was gonna say resolution but this is 386 x 579🤔
the blue arrows are how I have it set up. Tried it without the refiner, came out even scarier.
On what card? Seems to be an X090?
It was a typo, I mean 0.05 it/s. So I went from 1.25 to 1.3it/s at 1024, 40 steps on a 3080
did you ever get to try 4k before your 3090 popped?
Cant say what is the issue here, Maybe try basic 1024sq and see if that helps
i have tried that but it seems like the image renders in high res but looks low res and bad quality
Inpainting works?
no
with comfy?
indeed
hard to compare unless you use the exact sampler and settings etc
is this ok?
I did not, but my 3080 is enough to generate 4096 X 4096 images
But those are base resolution generations, not upscales, so they are horribly inconsistent
I'm sure with enough tweaking I could get 4K working with my workflow, but it would just be a lot of additional nodes and variables to ship to the end user
damn do you use tile vae?
I do not, but I'm pretty sure it runs out of memory and picks it up from there
just downloaded and havent tested, trying it now, steps look way low o0
when it tiles your vae gets a progress bar I think
maybe that's only the dedicated tiled node though
So far, 1.5 seems light years ahead in terms of output. Sdxl has had a few amazing wins on some prompts but in general I think it probably needs a lot of lora and model support before it's really good.
@shy kelp I think you should remember and take into consideration just how dog water 1.5 was on release, and how SDXL can keep up with it considerably, or even surpass it in certain things such as realism
It is an incredibly powerful next generation tool that is light years ahead of its predecessor when not fine-tuned, so I have huge hope for the success of it over time
training an SDXL1.0 lora and only got 12s/it by my 3080. Is this normal? using LoRA_Easy_Training_Scripts
yea people 100% are over-remembering 1.5 after being so used to tunes and tooling.
compare 1.5 base without embeds/loras/cnet to xl and there's no competition
my first generated
I find it very good at general concepts but very hard to add certain details when the main sampling direction has been taken
changed the settings a bit
interesting, i'm generating a batch, watching a stream, AND playing a game. And nothing is hitching. I wonder what it was that got better. it's always been if i tried playing anything while genning, either A1111 would slow to infinite ETA, the game would stutter, or both. This is really nice.
going all the way up to 4k wallpaper sizes is kinda throwing me.
base image
2 pass pixel upscale - fast and coherent-ish but lame
2 pass latent upscale - schizo
3 pass latent upscale - schizo with butterflies
Maybe the mythical 4 pass will save me.
....or 5?
yeah 1.5 couldn't do this ^
ComfyUI question, advice for sharing the same primitive node for the "steps" and "start_at_step" (i.e. get rid of the "Prenoise steps Duplicate" in the image below), it won't allow me to drag the same INT output to both, so I am guessing these are registered as different datatypes
Ohhh, I know why my generations still have lines in them even though I specified the new model that has the 0.9 vae baked in, I only downloaded the base one and not the refiner one
I feel stupid
maybe it would also help to scale the resolution instead of add flat count to it, so for 3 scaling stages I can use the cube root of 3840 / 1368 as the factor...
so what's the best way to upscale things ?
ultimate SD upscale
it has its own custom nodes you can install
0.9 vs 1.0 (same seed and workflow)
I think I got Hires Fix working. But it doesn't like anything outside of 1024x1024.
just found out the refiner got its own vae
Hmm, the 1.0 base, 0.9 vae refiner model has taken over 20 minutes to run 8 iterations. The normal 1.0 refiner finishes in about 10 seconds
How much Vram do you have?
damn 🔥
I know right?!
sd_xl_base_1.0_0.9vae.safetensors this one I guess?
yo mangler
how would you improve this work flow, its repeated 3 times for different prompts, but man, your detail kills
I'm going back to the guide
check the photo of my comfyui nodes above
yes, let me dwnld it and reverse engineer it, might learn something, tx
seems to work better with photorealism than 2D
it says these two
and this one
so what is sd_xl_base_1.0_0.9vae.safetensors exactly? a vae or a newer version of sdxl1.0?
Well after creating a full 5-pass pipeline with 5 base stages and one refiner stage with each pass latent scaling by a definitely-not-cursed factor of √∜((3840×2160)÷(1368×768)) to achieve perfectly uniform relative pixel area increments...
...it maybe looks slightly better 💀
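The not-cursed factor above works out to the area ratio raised to the 1/8 power, which is what you get if the same linear scale is applied at each of four scaling steps between five passes (that step count is inferred from the formula, not stated in the chat). A minimal sketch; function names are hypothetical, and a real workflow would probably also round sizes to whatever multiple the latent space needs.

```python
def per_pass_factor(start, target, scaling_steps):
    """Uniform linear scale per step so pixel area grows by the same ratio each time."""
    area_ratio = (target[0] * target[1]) / (start[0] * start[1])
    # Area scales with the square of the linear factor, hence the 2 in the exponent.
    return area_ratio ** (1 / (2 * scaling_steps))


def stage_sizes(start, target, scaling_steps):
    """Resolution at each pass, including the starting one."""
    f = per_pass_factor(start, target, scaling_steps)
    return [(round(start[0] * f**i), round(start[1] * f**i))
            for i in range(scaling_steps + 1)]


start, target = (1368, 768), (3840, 2160)
factor = per_pass_factor(start, target, 4)
sizes = stage_sizes(start, target, 4)
print(round(factor, 4), sizes)
```

Since 1368x768 and 3840x2160 are not exactly the same aspect ratio, the last stage lands near the target rather than exactly on it.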
I don't understand how you had good results with this. Every gen i tried with your Workflow is completely broken
Try this.
only SDXL trained loras work so search
Sharing my findings with the watermark + 0.9 vae here. First cat has the watermark enabled (bugged so the red and green pixels appear) and is using the 1.0 VAE. Second cat has watermark disabled and is using base+refiner 1.0 both with the 0.9 VAE
I installed Dreamshaper SDXL in a new Automatic 1111 directory Seems to work well. https://start.me/p/xb4Npa/ai
if you drag a raw image created with comfyui onto the UI it automatically loads the workflow, it blows my mind
cringe
Yeah, that's the diffusers watermark that's supposed to be invisible. If I stub out the watermark function in the pipeline then it goes away
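If you load SDXL through diffusers you may not need to patch the pipeline at all. A minimal sketch, assuming a diffusers version whose `StableDiffusionXLPipeline` accepts the `add_watermarker` argument. The helper name is just for illustration.

```python
def load_sdxl_without_watermark(model_id="stabilityai/stable-diffusion-xl-base-1.0"):
    """Load the SDXL base pipeline with the invisible-watermark step turned off."""
    # Imported lazily so this snippet can be read without diffusers/torch installed.
    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        model_id,
        torch_dtype=torch.float16,
        add_watermarker=False,  # assumption: supported by your diffusers version
    )
    return pipe
```

Calling it downloads the model, so try it on a machine that already has the weights cached.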
Here's the image with no watermark and the 1.0 vae. You can see the colour banding at the bottom of the eye when you zoom in.
Do you recommend the 1.0 vae or the 0.9 vae?
Hello @boreal bough
Can you tell me your settings please? I would like to compare
Thank you so much for linking my video. @boreal bough said there are some flaws. I would like to learn his settings and make a comparison video. But network rank is totally up to the user's choice. A higher rank keeps more info and thus means a bigger file. If we train only the unet we will get subpar results. New tokens: we have no info related to that. Captions: if you want realism, captions severely reduce it. Also, in my previous attempts at training with captions I never found good results.
Realism and styling had different workflows. Even though I did Realism oriented training the model were still amazing to produce styled output like these : https://www.reddit.com/r/StableDiffusion/comments/154assg/here_some_amazing_results_with_my_free_training/?utm_source=share&utm_medium=android_app&utm_name=androidcss&utm_term=1&utm_content=1
How should I prompt in SDXL? The same like in 1.5?
@somber hill is there a way of compressing my LORA output ........
also thanks for putting up your video
Im getting output sizes over a Gigabyte
You need to reduce network rank. But it will have lesser info
hmmmm ok
Network rank is the inner dimension of the small low-rank matrices that LoRA adds to each trained layer. So if it is high, it trains more parameters and thus holds more info, and the file gets bigger. When we do dreambooth instead of dreambooth lora we train the full weights. Thus it is better than lora but requires more hardware
Lora is an optimization technique. Originally found for LLMs
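To make the rank vs. size tradeoff concrete, here is a small Python sketch. It assumes the usual LoRA setup of two small matrices per linear layer. The 1280-wide layer is a made-up example size, not something read from SDXL.

```python
def lora_param_count(d_in, d_out, rank):
    """Trainable parameters LoRA adds to one d_in x d_out linear layer.

    LoRA learns two small matrices, A (rank x d_in) and B (d_out x rank),
    instead of updating the full d_out x d_in weight.
    """
    return rank * (d_in + d_out)


full = 1280 * 1280  # parameters touched by a full fine-tune of this one layer
for rank in (8, 32, 128):
    added = lora_param_count(1280, 1280, rank)
    print(rank, added, f"{added / full:.1%} of the full layer")
```

File size scales roughly linearly with rank, which is why lowering the rank is the usual answer to gigabyte-sized outputs.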
what am i lacking for sdxl?
looks pretty good to me
Ok thanks ..........ill keep going with it then because my outputs have been great
vae things i think
Is that the original resolution?
what does the empty latent image do in comfy?
I got a very small man in the rain 😄
Noticed the same color banding/jagged edges thing with images generated in SD.Next or Comfy. 1.0 models with 0.9 baked VAE fixed it
not that bad for base + refiner models without fine tuning
What was your prompt? I want to see how my refiner only setup does against base+refiner
In simple terms: It gives you the size of your image. You empty canvas so to say.
Do you have a refiner only setup for comfyui?
No. Automatic webui. I am using refiner at 680x680
I generated the prompt using a fine tuned LLama2-13B model I am making for making SD prompts
"Honda Civic parked on Highway, mountains in distance, dynamic composition, aerial perspective, black and white output, original vignette:1.27 during the blue hour, from a drone view [input], angular style, Plexure, toneset, hdr tone mapping, cinematic lighting, rim lighting, photolypze, pldirectrad, physically-based rendering, highly detailed , dvo stabilizer, rpyta aspoh otjecshron abjecshroomAIart"
thats why its not normal
This is what I got
SDXL 1.0 or 0.9?
1.0
I am not using it as intended tho. You are not supposed to use the refiner on its own but I am, as an experiment.
I am really surprised on how fast and easy the LLama2-13B model understood the SD prompts, it still needs filters and a little more fine tuning to make it work
how fast is it?
3 hours of training on a dataset me and a friend made to get the prompt that you just used with a 3 word input
"car on mountain"
lol no, I entered your entire prompt except the last phrase which wasn't in english
What env do you run LLama on?
it is 2.96it/s on a 4070
is it possible yet to train embeddings with sdxl and a1111 ?
512px or 1024px?
not loras, but textual-embeddings?
Can’t even use it properly. Crashes my computer on both auto11 and comfy
I have a 2070 super with 8Gb vram
I thought you were asking how fast the model was with the loras loaded in the text gen web ui
I am wondering this too. Can’t get it to work in auto rn anyway
yea that's what I was asking but resolution affects speed too. 2.96it/s would be a nice speed for 1024x1024px image but bad for 512x512px.
how much system RAm do you have? XL needs quite a lot
I have an issue with a lora. I trained a lora which is able to reproduce the original image features, but when I use a base+Lora->refiner workflow, the refiner makes the image less similar to the original. How do I solve this?
16 gb
yep thats the main reason with 8gb vram
its 3.25 it/s on 1024x1024 on a 4070 for me
That I need more system ram than 16 gb?
ideally you want 32gb RAM on system and 8gb vram just passes
Same here
12 gbs works just fine
isnt it s/it?
Where does it say that?
Im just observing what it uses on my system
Nope 4070 its it/s
I didnt read it I thought it said vram 12 gbs of vram and 32 gbs of system ram
We're all coming slowly round to the idea that 8Gb VRAM might not be sufficient to the task any more!
How fun 🙂
in A1111 it was using 20-30GB RAM and all 10gb on the gpu. With comfy its about 6-8gb vram and 20GB RAM for system
I hope to upgrade to 24Gb VRAM
I could run my own 2.1 based 1024px model on 8Gb vram and 15 gb ram with no issues
T_T
how can i get restore faces for comfy?
XL is a different beast
that is really slow what card?
2070 on medvram
its sad but we all need to upgrade and spend a bunch of money if we all want to run the latest stuff
Hehe. Won’t happen
Many people in poor countries that will have to resort to online
yea cloud GPUs are a big thing, they're not that expensive if you use runpod
3070 currently takes about 30 seconds for a 1024 image
is civitai down right now?
yep
no
Just install the ComfyUI - FaceRestore Node from Civitai
great thanks
any good finetuned models i can get?
what style?
one thats pretty flexible, can do creative jobs and realism equally well
What does SAI mean that they have made finetuning easy on sdxl?
Idk why people want a high-end GPU when I am here running xl 1.0 on a 6gb vram 2060 at 1024 with just slight command line changes made in notepad
30 second wait time is good enough tho for me
why is there no option to load the sdxl vae in the comfy sdxl workflow examples?
is it built in?
I would personally go to this https://civitai.com/models/4384/dreamshaper
it has a little bit of everything imo
I should note that it is a SD 1.5 model
I have the same gpu. Do you use automatic1111 or comfyui? And which optimizations do you have enabled?
im using the dreamshaper xl model as of now, just checking for "better" models
the models right now are pretty mediocre, it probably takes a few months for the really good models to come out, stuff just takes time
if you look on civitai for models trained for things like cars or anime then you will get better results than a general purpose model and you're going to need to wait for SDXL models to come out
Any possibility to train embeddings on SDXL?
I seem to be getting better results with dreamshaperXL10_alpha2Xl10.safetensors as a refiner step than sd_xl_refiner_1.0.safetensors, don't know if that is expected or if I am just using the refiner incorrectly? Showing zoomed image of the face, workflow attached:
is there a way to set the restore faces strength? because right now it comes out way too strong
I have not found a way yet, but maybe you can blend the original output with the Restore output with the ImageBlend node?
works fine as hell wit xl 1.0 1024 on this
I might give it a try too
The fact that you're considering this definitely needs to pop up everytime a topic shifts to SAI's carbon footprint, that's actually super considerate. Not sure if anyone could make a quantifiable/verifiable fact on its impact but in the company that I work for, we praise the cape-wearing heroes who do so anyways.
quality is maintained too and refiner model works fine
auto111
where did you get the SDXLPromptStyler node?
that might work thanks
Its not my node, but i am trying to code in a restore strength parameter right now
screenshot of installed extensions: ComfyUI-Manager
you don't seem to be using the right VAE
Can anyone please tell me how can I inpaint in comfyui with SDXL?
that would be amazing, the other method you mentioned works pretty well too.
Well, and that was just a guess 😅
at which step? I tried sdxl_vae.safetensors for the last VAE decoder and it made no difference
is there a highresfix for comfy?
these lines are a symptom of the wrong vae
and you know how to fix that?
downloading the latest official vae
Just connect the latent output of the first sampler into a upscale latent node, and that output into a second sampler. But experimentation is key.
and using that instead of the embedded
ah yes I see it, you're a legend thanks.
if you downloaded it at day-1, get the new vae, they updated it
Just finished my first style LoRA on SDXL 1.0
Children's book style illustrations, trained on ~500 images generated by a personal 1.5 merge model
(each image is from a different epoch)
I personally get the image out of the latent space first, upscale it with a good upscaler, convert it back to latents and then run it through the sampler again.
whats the difference here ?
on this model
Ahh, yes, nevermind this is better: instead of using the VAE from the refiner
that's the one
thx 🙂
I think I'm doing something wrong, still getting used to this comfy ui
man i see many people using comfy ui and not auto111 , i find node system very overwhelming and too much i am simple guy whats so good on comfy ui
it's mostly the SAI shills pushing comfyui over automatic1111
me too it looks like rocket science to me but it works better for 8gb cards so I dont have much of a choice
my 2060 6gb works fine on some optimization with 30 seconds time on 1024 with auto111
Every time, or only with the skull image?
btw is there a better way to switch to refiner model fast ? or just a click
comfyui 😂 , but hopefully in the future update of a1111 soon
agh i am not going on that i dont wanna deal with nodes
it's not just shills but comfy does seem to work for sai now so you're not wrong about his ui being promoted a lot 
Comfy is heaps faster and once you learn to use it is technically better to be creative
Anyone have a good image2image workflow for sdxl1.0 in comfyui?
What's a leg and what's a tail? 😆
thanks for the workflow
just load one of the already done setups and use that. just change the prompts and resolution and don't care about the other things. There isn't any extension out anyways right now. So changing the prompt is nearly the only thing you should care about right now
Did anyone write a history browser for ComfyUI yet?
(It looks fairly straightforward... but I thought I should check.)
Is there a discord server specifically for questions / help around ComfyUI ?
I got plenty workflows of comfyui
Check out the github file of this tutorial
Updated for SDXL 1.0. #ComfyUI is a node based powerful and modular Stable Diffusion GUI and backend. This UI will let you design and execute advanced Stable Diffusion pipelines using a graph/nodes/flowchart based interface. In this video I will teach you how to install ComfyUI on PC, Google Colab (Free) and RunPod. I will also show you how to i...
Updated for sdxl 1.0 too
Will check that thanks
is this a sign that i'm using the wrong vae?
its a sign that you might want to see a dentist 🙂
how do i separately load the vae in comfy?
There's a 'Load VAE' node in the loader section.
It'll be offered as a suggestion if you drag a line from the VAE input to empty space.
I see it, thanks
I thought it was possible to drag and drop an image posted here into comfy to get the node setup used to generate it?
It is but you have to download the image first and drag and drop from the explorer
also it has to be an image that was uploaded straight from comfy
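If you would rather look at the embedded data yourself: as far as I know ComfyUI stores the graph as JSON in PNG text chunks under the keywords `workflow` and `prompt`. Below is a stdlib-only Python sketch under that assumption. It only reads uncompressed `tEXt` chunks and skips CRC checks, and the function names are made up.

```python
import json
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data):
    """Return {keyword: text} for every tEXt chunk found in PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, body, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            keyword, _, text = body.partition(b"\x00")
            chunks[keyword.decode("latin-1")] = text.decode("latin-1")
        if ctype == b"IEND":
            break
        pos += 8 + length + 4
    return chunks


def load_comfy_workflow(data):
    """Parse the 'workflow' chunk as JSON, or return None if it is absent."""
    text = read_png_text_chunks(data).get("workflow")
    return json.loads(text) if text is not None else None
```

Usage would be something like `load_comfy_workflow(open("image.png", "rb").read())`. If it returns None, the image was probably re-encoded somewhere and the metadata is gone, which matches the "uploaded straight from comfy" caveat.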
1.0 models with baked 0.9 VAE fixed this for me
pixel art without any lora
What is the prompt? I would like to add it to my github file
pos (masterpiece, best quality), mysterious,16 bit pixel art, epic composition of a explorer in a boat reaching an island in the clouds, by studio ghibli, cinematic still, hd
neg (worst quality, low quality:1.3), (greyscale, monochrome:1.1), cropped, lowres, text, jpeg artifacts, signature, watermark, username, blurry, artist name, trademark, watermark, title, multiple view, extra hand, mask, (animal ear:1.4), blur
have you tried turning it off and on again?
❤️
looks so squeaky clean
obv its distorted in areas but this looks too good to be just AI
the sloppyness is intended (using terms: scribble, doodle, sloppy, unfinished + LoRA)
anyone have an image with the img2img workflow in it?
no, I trained it just now.
cool
yes thank you
I'm having trouble with the WAS node add-ons for ComfyUI. When I try to connect to the new nodes, the connecting wire just seems to pick up the WAS node I'm trying to connect to and it gets stuck to my mouse. Base ComfyUI is fine. Any ideas?
I'd post it as an issue on the wasnode github
Good idea, will do, but also wanted to see if anyone had seen something similar.
The output of the new nodes works fine, but not the inputs.
I'm connecting from STRING to TEXT. I can start the line from the new nodes or the old ones, but it messes up when I try to connect them.
Unfamiliar with it on comfy but I know in unreal engine, we'd sometimes need to convert the type first. It could be unrelated but Id look to see if you're able to convert string to text
any idea on when controlnet releases for sdxl?
mixing two of my LoRAs (style + subject) JuN10R + gr3g0r
looks great!
i really like both 🙂 can you send me ? i would like to experiment some
Thanks. In this case the conversion is one that's common in the example workflow.
was that directed toward me? 🤷
yep 🙂
I'm still testing and tuning the LoRA, I'll release it when it's done.
do you know how to train an sdxl LoRA on google colab after git cloning diffusers?
I train locally
i can't figure it out: after git clone diffusers, i get cannot import name 'DiffusionPipeline' from 'diffusers', but when i pip install diffusers, it works well
@zealous horizon maybe i could help to test ?
just a friendly offer though, i can wait for release, too 🙂
what's your GPU SMI or configuration?
how much vram to train sdxl ?
If you just clone diffusers you're probably not pulling the dependencies down at the same time. Pip will install all the dependencies for you
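A quick way to see what is missing after a bare clone is to check which packages Python can actually import. This is a small sketch. The package list is illustrative only, so check the repo's own setup.py or requirements for the real one. Running `pip install -e .` inside the clone should pull the dependencies in the same way a normal pip install does.

```python
import importlib.util


def missing_packages(names):
    """Return the packages from `names` that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Illustrative list, not the official dependency set.
wanted = ["diffusers", "transformers", "accelerate", "torch"]
print("missing:", missing_packages(wanted))
```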
I have 3090
lol, the secret was not to ask for a "furry octopus" but for "a furry creature with the shape of a octopus"
yes, but the official readme doc doesn't specify the dependencies at all
is there an easy to follow guide for making lora's in sdxl?
hi , i test sdxl with comfy ui , i would like to know how to make a correct prompt with text inside? possible or not?
I run a 4090 on an i9 13900KS
sorry if this is an inane question, but im still getting the knack of SDXL and my computer is slow so cant just hammer at keys and learn that way. see where it says Style: X? how am i entering or injecting that into a prompt in comfy then?
is that just the bot parsing it as a variable so other people can mix it?
hey kiksu if you dont mind can i know what style or prompt for image "man at the mountain" , but if you dont want share it's okey i understand
thank you before
Anyone try training yet? Hearing 1.0 is harder to train than 0.9
Just save the image and read the data. It's all in the image 👍🏼
potentially dumb question. What is the significant of text_g vs text_l?
Are these just arbitrary var names?
How's the consistency? I tried training several childrens book styles for 1.5 and was unable to retain any decent consistency especially for different types of scenes. We ended a project /w some friends over the tech just not being there.
it isn't as bad as you think. You can resize and reposition them where you want to make using it easier. I agree that as a rule it is a pain panning left and right all the time to do stuff. But the sheer speed and ability to make even modest systems fly is a boon beyond measure
my 1.5 model and L0RA already did a nice job at consistent style.
This is from the 1.5 version + character LoRA
Looks like 2 diff styles to me. More consistent than what I was able to achieve but not perfect. How many training images and steps?
hands and feet looks worse than ever
thank you before
Here are some I did the other day
that wasn't a microdose
help me more sense of technology
@woeful patio I made it to simulate my DMT trips LOL

@thick goblet I also have the RTX 2060 but for me a generation in sdxl 1 takes like 2 minutes. What are you doing different?
In automatic1111
Use Comfy
@brazen patrol I did but I dont like the workflows. I like to do image2image too and I cant find a good way for that in comfyui
🐢
i wanna use it in my video cover help please
Nevertheless, that is the secret to speed and performance. Comfy says you can do img2img and inpainting.
it is a total mess
do you have an image with the workflow json embedded?
Fantastic! 😄
This is from SD.Next and the generation data is in the image
try more
But it's kinda random because I run One Button Prompt -script
Yes, I would completely agree with this. I get that I'm making a comparison of a mostly virgin release compared to a release that has months of fine tuned user models.
Maybe my wording didn't come out right, but what you said is essentially what I'm hoping for. That with some time, and some user trained models, it will be far superior.
At the moment, I have a hard time getting superior results.
not so messy but still the hands are a bunch of meat
This could be a strange fetish
I've had some pretty good luck with hands lately
On the flip side there are things it does and understands that 1.x could never do no matter how much I tried. Chess imagery was a disaster through and through, as was trying to get it to make a logo. SDXL is in a whole different league.
are you using comfy or A1111?
this one looks better, maybe becuase the feet are mostly hidden
i am using invokeAI and it is horrible with hands and feet
otherwise images are beautiful
Does it natively allow you to use the refiner and Clip?
true, there will be a day where we won't have to worry about hands anymore
hands have been difficult in comfyui
Don't generate at 1024, use 768 x 768 instead
Then upscale
Or use refine model then upscale
In invokeAI? yes, refiner is loaded along base
is there a way to generate videos in ComfyUI yet?
does any1 have a tip for fast generation? What sampler at how many steps do you use? Right now i got 23 Steps with dpm++_2m_sde Karras for the base, and Refiner starts at Step 23 and ends at 30 with dpm++_2m Karras...
depends on what you want to define as fast lol
and speed isnt everything, think tortoise & hare
well... faster than my current setup xD
well as you havent provided a frame of reference...............
this was testing material with different weights and epochs of that LoRA. It's a process of iterating over the model and generating new training material.
But those images were actually from the first version, therefore I'm pretty happy.
It also helps if the character-sheet is using the style you're intending as a final product.
Mine looks like this but needs improvement. but it's super highres (11520x5760)
For speed use comfyui
the settings are the frame of reference i think... if you mean in terms of absolute time: i'm used to 10 seconds per image including highres fix with 1.5 Models
q
It's XL, it works differently. What's your it/s?
Is there any way to train Lora in reasonable time?
3,5 for base, 2,7 for refiner
so thats quicker than I get with my Venerable 1080Ti, basic 1.5 512x512 are around the 10 second mark, basic SDXL 1024x1024 are around the 50 Second mark
SDE has usually been slower sampler. I run 1280x720 Euler in both around 1.1it/s with 3060 12GB.. I'm just happy that it works.
IMHO total time in seconds from clicking go to the finished image is a far better metric than it/s etc., as those can be subject to the reporting whims of the UI being used
Yes this is true
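For comparing the numbers people post, here is a tiny Python sketch that turns steps and a reported speed into sampler time. It covers the sampling loop only. Model loading, VAE decode and refiner switching are not included, which is exactly why wall-clock time can differ so much from it/s. The function name is just for illustration.

```python
def sampling_seconds(steps, speed, unit="it/s"):
    """Seconds spent in the sampler loop for a given step count and speed."""
    if unit == "it/s":
        return steps / speed
    if unit == "s/it":
        return steps * speed
    raise ValueError(f"unknown unit: {unit}")


print(sampling_seconds(30, 3.0))          # 30 steps at 3 it/s
print(sampling_seconds(30, 2.0, "s/it"))  # 30 steps at 2 s/it
```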
with euler normal the numbers dont really change xD
That is an incredible image. Three hands and two eyes from the side. Should not be possible. 😄
unpossible to possible 😆
"a pretty young woman wearing a diaphonous summer dress running trhough a meadow full of flowers on a sunny summers day"
Anyone got a set of binoculars for me please lol
Don't tell me it was the first shot.
nah. did a batch of 10. that was the only one
the bike didnt fare too well though
No, that's broken unfortunately.
Does anyone know of a node in comfy that can allow you to manually toggle something on/off without needing to break a connection like I am doing currently?
You should pass it through the refiner. The texture of the field is poor.
https://github.com/RockOfFire/ComfyUI_Comfyroll_CustomNodes these i've seen. haven't tried much with them
sigh that was a 2 stage pass
Ctrl + M could disable the part.
Toggle would be nice though, it's been requested before.
the texture of that field looks great to me
reminds me of a photo i once took over a field
nice. giving that a look now
I didnt know that, thanks
it doesn't reroute the inputs and outputs though. i don't know why it keeps getting recommended.
muting nodes just breaks the entire layout usually
technically it was a 3 stage pass, I forgot
sdxl inpainting in the works or not?
I have one connection going to Upscale, but I dont always want it active, so a toggle would be nice
no they've halted all work on sdxl and have called it done || /s ||
so no inpaint model then
none yet. you can already inpaint and soon we'll have their new controlnet architecture which will enable greater inpainting
the devs were teasing me with their controlnets last night. rapscallions!
any release date for controlnet sdxl?
heard it was still in beta
no date. when they're happy with it i suppose
dukes remixed my photo on me last night. i couldn't even tell at first.
is there an extension that has preset prompts for setting parameters like lighting, angle, composition etc. for generating photographs?
If there isn't, I would like to create one 👀
WAS Node Suite has a Prompt Styles Node which uses A1111 style CSVs
(ie 3 columns, "name,prompt,negative_prompt" )
where can i find a good controlnet workflow
(presuming you're using ComfyUI)
I was talking about A1111. So, do you think there's value in creating such an extension?
either for automatic or comfyui?
Hi, is there a guide on training controlnet for SDXL? I would like to train the openpose one
Why? A1111 has built-in styles saved in a csv. Refer to the documentation
here you go, drag into ComfyUI for workflow 🙂
I've seen those flowers...my only complaint so far is very samey images given a single prompt where 1.5 (or admittedly user checkpoints) would give a wide selection of creatively different images
but only one style can be applied at a time right?
can anybody explain how to use this (im clueless)
No
You can apply multiple styles
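To show what stacking styles from a "name,prompt,negative_prompt" CSV can look like, here is a short Python sketch. The two style rows are invented, and the `{prompt}` placeholder handling is my understanding of how A1111-style CSVs behave, not code taken from the webui.

```python
import csv
import io

STYLES_CSV = '''name,prompt,negative_prompt
golden hour,"{prompt}, golden hour lighting, long shadows","flat lighting"
low angle,"low angle shot, looking up at subject",""
'''


def load_styles(text):
    """Map style name -> CSV row (a dict with prompt and negative_prompt)."""
    return {row["name"]: row for row in csv.DictReader(io.StringIO(text))}


def apply_styles(prompt, negative, style_names, styles):
    """Apply several styles in order and return (prompt, negative_prompt)."""
    for name in style_names:
        style = styles[name]
        if "{prompt}" in style["prompt"]:
            # The placeholder marks where the user's prompt goes.
            prompt = style["prompt"].replace("{prompt}", prompt)
        elif style["prompt"]:
            prompt = f"{prompt}, {style['prompt']}" if prompt else style["prompt"]
        if style["negative_prompt"]:
            extra = style["negative_prompt"]
            negative = f"{negative}, {extra}" if negative else extra
    return prompt, negative


styles = load_styles(STYLES_CSV)
print(apply_styles("portrait of a sailor", "", ["golden hour", "low angle"], styles))
```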
nvm, reddit guy found it out
I am trying my best to figure this stuff out.
I have come to understand there is OpenCLIP-ViT/G and CLIP-ViT/L
diy time
Is there a guide on training controlnet for SDXL? I would like to train the openpose one
Okay, so you need two of them. You can convert clip g and clip l to inputs
basically 2 inputs?
just give me few minutes. Okay, you basically convert clip g and clip l to inputs and connect them to primitive node
Comfy now supports SDXL using the normal textencode btw
I did something similar yesterday
nicee
imma try this workflow from civit
interesting, link?
can someone help me add a lora to this workflow?
cool, link please 🙂
Until the release of version 1.3 of the workflows, a beta version of the improved Reborn workflow can be downloaded from this pastebin . Please mak...
uses 3 prompts instead of 1
thats 3x the fun!
Introduction SuperStability 😅
SDXL -> Refiner -> Juggernaut Final + UltimateSDUpscale
Yes that the method @high skiff developed previously 🙂
sick
Ai doesnt knows the gigachad. Would you guys add him into his mind?
A Lora
how does sdupscaler compare with realesrgan and 3x ultrasharp?
thank you very much!
SDUltimateUpscale makes use of upscale models too, it just uses a tiling approach (requires the tile controlnet). This allows for much larger images since the whole image is not inferred/loaded in vram at once
should i get it?
nice. results like this are why i keep laughing whenever someone still thinks gigapixel is useful over half a decade later
I use it all the time personally
It was fascinating at the time 😅
does anybody have a good prompt for getting helmet mounted camera like pictures?
thats quite satisfying to watch
Not sure what would be the characteristic of those?
I thought during the launch that ControlNet models would be too over the top compute intensive?
1.5 upscaling already destroys Topaz's software for quality. Topaz is just a one button operation instead of a little bit of a skilled hand. It does still work, its just, the topaz signature artifacting is there always.
they're possible. 8gb people may not be able to run them. they'll be bigger in size.
a new controlnet architecture is being developed by the team at stability
Apparently it will work but slightly differently, still finding it hard to find actual concrete info about that... I would love to train them if I knew how....
It's the thing I miss the most by far in SDXL (yes I do 1.5 -> SDXL img2img to cheat it but it's not ideal)
mainly first person views but higher up (or lower down if it's a body cam)
https://github.com/lllyasviel/ControlNet/blob/main/docs/train.md here's the man pages. looks easy enough. building a dataset of 100k images though
questions like, do i use 512 pixel images or 1024? i dont know
How do you make a cat sound like a dog? || Douse it in gasoline and throw a match at it||
Yep but it doesn't go over either openpose or sdxl unfortunately, I did train my own circle controlnet to test
yeah i might try to run the circle dataset myself
after a while they'll smell the same too
openpose should be straight forward. you get 1000s of people photos and run the annotators on them
Euler-a vs Heun Sampler. Same seed. 32 steps. 1024px. Base-SDXL.
I was noticing a lot of bad textures and repeated patterns for concepts mostly related to nature or concepts that might have very few training images e.g. "indian village". It seems to be a sampler problem.
try different cfgs too. i've seen good results all the way to 12
Yes, I'm finding samplers work wildly differently with SDXL
I feel like the best CFG for the most part on SDXL is 4.2, I tried many different CFGs
but, also, 6.9
Lol
I was serious. it's good on A1111
magic hands
reminds me of the impossible umbrella i got yesterday wait
VR hands
what a fat fat man. possibly the fattest
powahhhhh
It doesn't get POV, head mounted, first person much
did you see when he couldn't race in the newest grand tour because he was too fat?
best part
TONIGHT
Actually I've been having a lot of trouble with viewpoints and perspective, even (low angle) wasn't working for me.
i find they work better when passed to the L clip.
yeah it really enjoys doing generic angle front pose pictures of one character
getting away from that seems very hard
describing with natural language is useful for the G clip, "looking up at subject from the ground" or "viewing from above"
by default it goes towards perfect composited portraits, but with strong prompts you can push it elsewhere into the latent spaces
Are any unofficial controlnets out?
yes! || jk no ||
Okay im very confused, all of a sudden all my gens are appearing like this in all comfyUI workflows.
Is there an img2img workflow in ComfyUI?
best samplers for refiner to add detail??
there are tools to make them in comfyui. you can find predefined layouts other people have made
Encode your image, pass this as the latent of your sampler lowering the denoise
I trained an SDXL lora using Blame! manga art (faces only though) and it's not bad at all 🤔
roughly 1.5k steps, 90 images in half an hour. Maybe it needs more steps though
Ah right, I was there before but missed the img2img example. thx
should i keep these refiner settings or not?
It depends on a lot of things, like the checkpoint, which usually recommends a sampler to use, but you should just try a few on the same seed.
Denoise is important only in img2img otherwise it should be 1
awesome, I think it's difficult to transfer that style
surprisingly that was first try!
what settings are you using for xl refiner?
theres some weird artifacts in 3rd pic.. dithering? Also, I thought it would be difficult to find 1024x faces, did you upscale them?
should note that the training data for this lora is just in game units and portraits from starcraft 2 terran race. some screenshots and concept art too. realizes the aesthetic so well
Is there a good repo for SDXL comfy UI workflows?
I do 2 refiner pass for now (sdxl refiner and then a 1.5 model)
On the base sampler I use return leftover noise; on the SDXL refiner I start at half the steps and run until the end step
There are now official samples
mostly transparent pngs in the training data too
This is for the refiner (so img2img), I prefer 0.2 denoise myself
maybe because very few training steps, no idea. Yes, I cropped the faces using opencv from the master edition manga, then upscaled them with chainner using 4x_eula_digimanga_bw_v2_nc1_307k.pth model
Where can I find that?
ok cool
imo it completely depends on the workflow, I don't even use denoise at all personally, only start and end steps
that is denoise
just more thorough
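A rough way to see why start/end steps and denoise are the same knob, sketched in Python. This treats denoise as simply the fraction of the step schedule that gets run, which is an approximation. Real samplers space their noise levels unevenly, so the two settings will not match exactly. Both function names are made up.

```python
def start_step_for_denoise(total_steps, denoise):
    """Step a sampler would start at to roughly mimic a denoise value."""
    if not 0.0 <= denoise <= 1.0:
        raise ValueError("denoise must be between 0 and 1")
    return round(total_steps * (1.0 - denoise))


def denoise_for_steps(total_steps, start_step, end_step=None):
    """Approximate denoise implied by running start_step..end_step of the schedule."""
    end_step = total_steps if end_step is None else end_step
    return (end_step - start_step) / total_steps


print(start_step_for_denoise(30, 0.2))  # refiner start step for about 0.2 denoise
print(denoise_for_steps(30, 23))        # refiner running steps 23 to 30
```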
A bit better
hey! I loaded this workflow in comfyUI and installed missing nodes but seed as text is still missing, any idea for that?
I just discovered your GTM "ForYou-Photo"! It works really really well for the 2nd refiner pass
Thank you, it does 🙂
replace it with a regular primitive. seed as text just converts text to an int
In the credits and notes. Tells you where to get it
That's what I used in the image above.
Is there a reason why so many use comfy over a1111 with sdxl apart from the two step thing going straight from base latent into the refiner?
speed, memory efficiency, and layouts that allow prompts to each encoder separately
Much better workflows imo
reusability, reproduction, tinkering
Ah shit guess I’ll be downloading comfy tonight then 😂
I still fire up A1111 from time to time, just to try out new extensions or model integrations
eat a pile of vegetables today so your thinking cap is fired up later
Although that will break the file naming and prompt txt file generation 🙂