#✨|sdxl
but i don't need to see telemetry to know that a lot more people use photoshop
and that's really the crux of it. you start talking about having very strong UX opinions, even opinions i agree with
but you will never have the tracing or telemetry to know if it really matters. or even necessarily poignant user feedback
whether or not i personally like photoshop - i don't - it's pretty serviceable
right now, my reaction is
Is there a way to use stable cascade locally with comfyui? I haven't been on discord in a few months and it seems like some exciting things are happening. Any ideas on when Stable Diffusion 3 will release?
okay, you are very opinionated about how krita or kliks or whatever is written. you have a single canvas.js file, so i can't possibly hope to contribute to it
Yes, SC works in Comfy. No SD3 news that I've heard recently. 😦
that's my feedback to you. being "anti typescript" is a big mistake. if you want something that makes sense for others to modify or improve. so no telemetry, no practical way to "pull request". it's your call
LOL yeah. I'm great at coding in my way, but practically incompetent at learning other people's codebases.
for cosxl, is there any reason to get the non edit version of the model, what is the difference?
I think it can do pitch black and pure white, which is something SD has struggled with in the past. Maybe just has better dynamic range overall.
Are any of the OGs still on... @soft zealot @high skiff @indigo carbon ... Just trying to see if any of my other buddies are on still.
I'm still here, but I'm not active in the server. I'm working on other things now.
Working with a research group I'm a part of to potentially try and train our own foundational model to release to the public as an alternative to stable diffusion :>
That seems pretty fantastic
Can't really share any more information about it, mainly for IP reasons, but also because not everything is decided yet haha
ah ok thx
Yeah, we're hoping that it ends up working out properly. We already have somebody on our team who has single-handedly fully retrained SDXL from randomized weights, so we definitely do have experience with the training aspect
We also have an architecture engineer who is working on implementing some papers that they themselves wrote into this new model. The goal is to be an even more open and more community-driven image generation company that shares research findings and actively takes suggestions from the community
It's still very very early on, but the hope is there
I also do contract work now, though it hasn't been going very well lol
@uncut gull sorry to ping you again, just quick question, i noticed on some pics i use, the output of cosxl seems kinda blurry, is this purely a problem with the parameters im using in the sampler or is this an actual limitation of cosxl?
Oh, I also recently started working on my own special way of training SDXL, and it is so incredibly promising I might write my own paper about it.
It focuses specifically on fixing the coherency issues in SDXL, to the point where a single 30 minute training was able to turn base SDXL into a less deformed model than Juggernaut v9 out of the box. The prompt adherence is as good if not better, and the deformations are considerably better
It also expands SDXL to work at native 1536x, and all the way down to 128x128
An example of the results that I've had. The four on the left are base SDXL, the middle is my 1.0 attempt, and the right is my 2.0 attempt
That is 100% base SDXL 1.0 with just a coherence fix on it. Less than a dollar worth of compute
Well if you cant really talk about the science of how it works I understand and respect that, but if you need a random unbiased person to test it out and give feedback I would be happy to do so.
I meant in reference to the new stuff you are working on
Sorry, I've only read about it. Haven't experimented yet.
It also improves text, regional prompting, multi-res coherence, and various other things.
I'm also working on testing different versions of it, as I think I could find a way to use it to train models better and faster than LoRA's
This is base SDXL versus my 1.0 attempt
Over my head, but I hope it goes well!
Only required 50 images, and about 30 minutes worth of training
That is a huge savings in time and GPU resources.... Are you still rocking that 3090?
ah np, il experiment later, thx
Posted in general, but also messaging here for visibility:
How do I best finetune SDXL? I would just use the default DreamBooth training script, but looking at the diffusers documentation for training on Hugging Face, it seems like there is another option? Dreambooth: https://huggingface.co/docs/diffusers/en/training/dreambooth?gpu-select=16GB
vs
just SDXL directly? https://huggingface.co/docs/diffusers/en/training/sdxl
Yep, single 3090 for training
mostly curious what the second option is, since I'm familiar with DreamBooth
Here are some more examples
It's still in its infancy, and I've been messing around with some LLMs today to take a break from imageGen training, but I am looking to potentially write a paper about my findings, if they continue to be as successful as I project them to be
Those images are really awesome though
huh
oh these are nice
Without getting too far into the exact method I'm using, I'm doing a form of local adversarial training. I'm not only just training on an example of what I want it to look like, but a direct example of what I do not want it to look like.
It works as two loss functions, one pulling towards the concept training that you want, one pushing away from the concepts that you don't want. It helps the model not get stuck in local minima in its unet, and it also learns in a way that is very fast, yet non-destructive when overtrained
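The actual method was not shared in this chat, so the following is only a generic illustration of the "one term pulls, one term pushes" idea described above. Every name here (`push_pull_loss`, `push_weight`, `margin`) is hypothetical, and clamping the push term at a margin is an assumption made so the push stays bounded; it is not a claim about how the author's training works.

```python
# Hypothetical sketch only: the real method was not disclosed in the chat.

def mse(a, b):
    """Mean squared error between two equal-length lists of floats."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def push_pull_loss(pred, wanted, unwanted, push_weight=0.5, margin=1.0):
    """Pull `pred` toward `wanted` and push it away from `unwanted`.

    The push term is clamped at `margin` (an assumption), so it stops
    contributing once the prediction is far enough from `unwanted`.
    """
    pull = mse(pred, wanted)
    push = max(0.0, margin - mse(pred, unwanted))
    return pull + push_weight * push
```

With these toy definitions, a prediction that matches the wanted example and is far from the unwanted one scores zero, while a prediction sitting on the unwanted example is penalised by both terms.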
You can throw an infinite amount of training time at this, but after a certain point it doesn't change anymore. The loss function will reach a maximum value, and never budge.
You can pull a result at 10 epochs worth of training or at 1000 epochs of training, and they will be virtually indistinguishable from each other, as it prevents overbaking
You can see it here in the loss function, it learns very rapidly for the first epoch, then slowly simmers down to a stable result, then it does not change additionally from that point on.
The results from those last three epochs are effectively identical
I have a few more things I can share real quick, but then I have to get going
It greatly fixes things like yoga, as well as the coherence of background details. This is with exactly zero image training on people doing yoga, yet it is able to clear up the results monumentally
It's definitely not perfect, but it is also a microscopic training lol
It also helps fix duplications in very wide and very tall images
And finally, here is an example of how much it fixes things like small faces and crowds. Once again, less than 30 minutes of compute on only 50 images to get this result on base SDXL 1.0
The inclusion of ComfyUI in the log path interests me. Are you actually doing this training using ComfyUI and the model modification nodes available through addons, or are you just happening to use models that you're storing in ComfyUI's directory?
Oh no, that's just my output path. It automatically goes into my Laura folder, regardless of what I'm training. It's something that I forgot to fix for that
*LoRA
Normally I only use Kohya for LoRA training, so I usually just have it go automatically in there to where I don't have to do anything in order to validate
With that limited of a dataset, have you tested how well it works on things not included? A lot of fine-tuned models on civitai suffer when trying to create anyone who isn't a woman in her 20s who is both caucasian and asian at the same time, especially if you include keywords implying that she's a model or attractive.
Have you tested how well your technique performs if you tell it to generate something that wasn't included in the dataset, like a crowd of congolese women or a group of saudi arabian men with pickles in their mouths?
Also, does your technique seem to affect how well the model adheres to prompt semantics, like "man wearing pink shirt and woman wearing blue shirt"?
Yes, I've tested extensively on how it helps certain things. For example, in the images shown above, there are no images of crowds, no images of women doing yoga, no images of shirtless men, no images of the beach, no images of digital art, and no images of the Muppets. Pretty much every single example I have given has been stuff that it wasn't trained on, and I will also include a little bit of information to state that I am training completely without captions
without details on how you're setting up the adversarial process, it's hard to tell what the total range is of problems that your trick solves
It does, it actually makes it more likely to succeed with prompts like that. I have an example where it's able to listen to races considerably better than base SDXL
For example, this image is supposed to be a photograph of a black man on the left with a white woman on the right standing in front of building ruins
The left four are from base SDXL, the right four are with my coherence training
I will note, for the 1.0 attempt, I did mess up some of my training settings so if you notice the contrast seems to be messed up, that was later remedied
I also tested directly against Juggernaut v9, and the coherence training I did also seemed to succeed against v9 as well
given how much models tend to hyperfixate on attractive young light-skinned women, I'd be very interested in seeing how well your model performs generating people with darker skin of both sexes, in a range of ages and countries of origin
Against, sure. Have you tried doing the training on Juggernaut?
This is not really meant to be a sort of fine tune. This is more so meant to fix base SDXL to where it's a lot more open to being fine-tuned.
This is all very very early research for this, and I'm still trying different methods continually
alright
I have, it doesn't really seem to improve it any more than just doing this on base SDXL. In fact, training on a model that has the underlying issues more to where it can see a bigger delta between what you don't want and what you do want seems to yield better results for the output
Do you have a plan on when to talk about what you're actually doing? If you're doing so much experimentation with methods and hyperparams, I'd love to join in with trying things out, and I'm sure others would too
Also, I do have some experience with Juggernaut. I worked with RunDiffusion on V7; my realism trainings that I was working on at the time were to be merged into Juggernaut. However, it turned out that my realism trainings did significantly improve realism, but at the cost of the model's general performance in other areas. So I ended up keeping the realism training to myself, and I have thus been training my own in-house model that I have hopes to release with RunDiffusion, or other companies in the future
I'm sure you want to make sure you lay claim to the idea first to get proper credit, but I'm eager to get to the point where this technique has already been known for a few months and has been incorporated into the new status quo :p
I quite literally just started this yesterday, so this is only a single day's worth of results. I'm also trying to make sure that I have everything set up right locally before I start running various different tests to try different combinations. But I would be open to having people test the models at some point
not bad
Do you mean to say that SDXL-fixed performed better than Juggernaut-fixed?
I suppose that would be hard to grade exactly
I've been using this style of training for a little bit now, however I had never attempted to do it for something like this before. The results were completely unexpected from my original understanding of how the training works, and since then I've been able to train better 1536x results using only a few images as examples. Again, I'm operating at a very small data scale here. I did end up doing a secondary training on 3,075 images, and those are the results that you see from the three image comparisons
but like, rough estimate on your part?
The main thing here is the fact that if the model already has most of the issues fixed in it, reapplying something on top of it that is supposed to fix it again will result in kind of baked outputs. That is to say that the yield of improvement is significantly better on base SDXL than any of the fine tunes that I tried it on.
In a very very small scale test that I cannot validate the reproducibility of, I was able to get a small glimmer of hope that the coherence fixed version of SDXL is significantly more open to being fine-tuned non-destructively than normal SDXL or other fine tunes
makes sense
From what I've seen, base SDXL with this fix can do hands better, and it can listen to the prompt a little bit better as well. It has more protection from duplicates, it can handle extreme aspect ratios better, and for whatever reason unbeknownst to me as my data set had literally no information about hands, it can do nearly perfect hands almost all of the time when prompting for hand poses
weird
I have some examples of that as well, let me grab them real quick
Or rather, I should say that it's not unbeknownst to me, as I have figured out why exactly it happened since then, but it is still shocking to me that this method of training in this way yields such incredibly improved results
If I toss you a comfyui json file later, would you be down to run it with your model?
I want to rig up a test bench of sorts. A big 'ol XY grid of different specific things SD tends to fail at.
Here is base SDXL versus SDXL with the coherence fix
I'd be willing to. I'm not sure if I will be able to today as today I'm working on large language models, but I could definitely take the workflow and save it for when I am focusing on image gen again
I've been going really hard at this stuff as of late, so I'm giving myself a little bit of a break to mess around with funny text gen AIs lol
better hands, sure. What was the style prompt? the first looks much more like old wizard-of-oz-style films if that's what you were going for.
The low contrast was the result of my bad training settings for the first version. That was the thing that I fixed later on. I've also tested this with linguistic prompts, tag prompts, pretty much everything and it seems to perform the same across the board, as it wasn't trained with any captions, so it didn't really pick up on anything like that.
I will say, it is photographic realism leaning, as my entire data set was only based off of photographic results. However the 3075 image version that I did of it does seem to be more dynamic and capable of fixing other concepts as well
The base model that the research group I'm a part of is looking to fully train in house is supposed to have a 1 million image multi concept coherence fix put on it at the end using this method of training I've been messing around with
You can see examples like this one, where the contrast is not as degraded
Left is base SDXL, middle is the V1 of my coherence fix, and right is V1 merged with V2
In the end, this entire training premise and idea is not to train anything new in or out of the model, but to rather detangle all of the deformations and the effects of having tokens really close to each other
I do want to talk a little bit more about how it works as it's very cool, and extremely promising, however I don't want to give away any more information just yet haha
would you say it's left cartoon/abstract images worse, or unchanged?
I do also have an example of what it did to 1536 X 1536 gens. I was not expecting it to greatly improve the coherence of higher resolution outputs from SDXL, but it did
It's obviously not perfect, but it's also such a microscopic training
It did better, but the contrast issue that I was having at the time hit much harder as well
Left is base SDXL, right is with the fix
Same with these two. Significantly more coherent, but the contrast issue hit way worse on these
yeah the hatsune one especially
I have not yet had the time to test if my new 3000 plus image fix also helps the contrast in these images as well. I barely ran any of the tests on the new 3000 one before I went to sleep, and then I woke up feeling super drained from all the work I did and decided to just focus on funny AI text bots today lol
This one also shows the difference between base SDXL, my V1 fix, and my V1 plus V2
Could you help me understand more about the variety of SDXL fine-tuning methods and what options make sense for what use-case? I know DreamBooth is an option, but it seems like there are many others. Such as, you are doing this adversarial method, which I understand you don't want to detail yet, but just wanted to ask since you are knowledgeable
what was it that caused the contrast issues?
If I'm being honest, I have an extreme amount of experience with LoRAs, but I only just within the last 2 weeks started fine tuning, as the updates to OneTrainer allow me to very easily fine-tune SDXL on 24 GB of VRAM.
Most of my findings were with the guidance of people in the research group that I'm a part of, they were giving me lots of tips and tricks, and then I ended up finding a different method of training, and then adapting it with the information that I learned from this and it ended up being very successful
It was a mixture of issues. Part of it had to do with some dropout that I had in there that I didn't mean to have, a lack of SNR gamma, and a lack of offset noise. As well as a processing issue on my original data set that led to the images themselves being lower contrast.
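Two of the settings named above have simple textbook forms that can be sketched in a few lines. This is a minimal illustration, not the trainer's code: Min-SNR weighting scales the loss at each timestep by min(SNR, gamma) / SNR for noise-prediction training, and offset noise adds one shared random shift per channel on top of ordinary Gaussian noise. The values `gamma=5.0` and `offset=0.1` are illustrative assumptions; the chat does not say which values were used.

```python
import random


def min_snr_weight(snr, gamma=5.0):
    """Loss weight for noise-prediction training: min(SNR, gamma) / SNR."""
    return min(snr, gamma) / snr


def offset_noise(channels, height, width, offset=0.1, rng=random):
    """Gaussian noise plus one shared random shift per channel.

    Returns a nested list shaped [channels][height][width].
    """
    noise = []
    for _ in range(channels):
        shift = offset * rng.gauss(0.0, 1.0)  # same shift for the whole channel
        noise.append(
            [[rng.gauss(0.0, 1.0) + shift for _ in range(width)]
             for _ in range(height)]
        )
    return noise
```

The per-channel shift is what lets a model learn overall brighter or darker images, which fits the low-contrast symptom described above.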
I paid special attention to fixing that for the 3000 image training, and from the little bit that I was able to try, it does seem to have helped
My responses may get a little slow here as I am currently unloading my dishwasher lol
I've also been using voice typing this entire time, so my apologies if any words came out weird. It's a lot of stuff to manually type on a small phone screen lol
the majority of the community uses LoRA and LyCORIS training because there are established tools for those, and their effectiveness is proven. There's also embeddings, which don't add new "material" to the model, but can teach it to generate something that it could technically already do on command.
Aside from training whole new models off the base, pretty much everything else is considered "experimental", like what this guy is working on. It's hard to say what's "out there", because aside from the established tools, there's tons of people trying stuff. And a lot of their experiments work, but in specific cases and specific conditions.
I figured this was the case when you name-dropped Laura :p
For example, I have trained over 1,000 LoRAs for base SDXL. And my results from those trainings got me a contract position at RunDiffusion, and are currently getting me a contract position at full journey.
My experience with full unet fine tuning is very minimal by comparison. However, this type of training that I'm using right now is excessively easy to run
Yeah lol, I tried to catch the words when I can, but I'm super ultra multitasking right now lol
At a minimum, I'm trying to figure out what the difference between this (https://huggingface.co/docs/diffusers/en/training/sdxl) and DreamBooth (https://huggingface.co/docs/diffusers/en/training/dreambooth) is. Would appreciate if you guys have any thoughts, but I also understand that you might have never used or looked at HF diffusers before
The main difference between them is that a full unet fine-tune does not have prior preservation loss, whereas a DreamBooth does
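The difference described above comes down to one extra loss term. A minimal sketch, with plain Python lists standing in for tensors and all function names hypothetical: a plain fine-tune minimises the error on your own images only, while DreamBooth with prior preservation adds a second error term computed on generic "class" images so the model does not forget what the class normally looks like. In the diffusers DreamBooth script this term is opt-in, so treat the framing here as the usual configuration and not a hard rule.

```python
def mse(a, b):
    """Mean squared error between two equal-length lists of floats."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def fine_tune_loss(pred, target):
    """Plain fine-tune: error on the training images only."""
    return mse(pred, target)


def dreambooth_loss(instance_pred, instance_target,
                    class_pred, class_target, prior_loss_weight=1.0):
    """DreamBooth-style loss: instance error plus weighted prior-preservation error."""
    instance_loss = mse(instance_pred, instance_target)
    prior_loss = mse(class_pred, class_target)
    return instance_loss + prior_loss_weight * prior_loss
```

Setting `prior_loss_weight` to 0 makes the two functions equivalent on the instance images.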
Also, depending on how much VRAM you have access to, you may want to choose different options
For example, 24 GB of VRAM is not exactly enough to do fine-tuning on SDXL in most trainers. The only one that I know of that is able to easily do that is OneTrainer, as they added some recent optimizations that make it use less than half the amount of VRAM it normally does
By comparison, batch size 1, unet-only, 1024x full unet fine-tuning in Kohya will use 23.2 GB of VRAM
The same in OneTrainer will use 9GB
I can read online that for dreambooth you can have a lot of variation, but it doesn't need a lot, but I'm unsure for full unet fine-tune
It massively depends on what exactly you're trying to do. If you're trying to train in a single subject, it could be a few tens of images. If you're trying to reform the entire structure of the model to be significantly better at a specific concept, you'd want to have several thousand images. If you're trying to untrain all of the biases in the model and have something that works on its own, that does not produce fundamentally similar results to base SDXL, you'll want hundreds of thousands or millions of images
If you want a challenge you could start training an audio model if you want 🙂
For example, the leader of the research group I'm in did a full randomized weight retraining of SDXL 1.0 using over 8 million images over the course of several months. It is now so fundamentally different that it is completely incompatible with normal LoRA's or SDXL samplers
Its results don't even look like they come from the same architecture family now
oh cool, link or pub if available?
It's considerably better in some ways, but also considerably worse in some others. You get trade-offs, especially if you're going to dump literally all of the original training that went into SDXL like he did
I think I'm looking at using about 1000 images and want the model to get better at a specific type of object photoshoot style and composition
It's on his Hugging Face, I believe it's called terminus V2. I don't know how long ago he updated it, but it is continuing to be trained, and it has gotten really quite good as of late
All trained on images sourced, captioned, and trained by a single person with one A100
That sounds great in that case. I do full realism fine tuning on SDXL, and I use just over 3,000 hand curated and personally captioned images to make sure that the results are exactly how I want. The main thing that you need to know if you're going to be doing a full fine-tune training is that you need a significantly lower learning rate than you do for a LoRA
Using the same learning rate that you normally would will likely instantaneously destroy the entire unet
I learned that the hard way lol. And you also have to make sure that your images are bucketed to a 64-pixel edge resolution. My data set was not, as I usually use Kohya, which will rebucket those images to have the 64-pixel spacing, but having non-64-pixel dimensions will result in horrifically deformed results lol
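The 64-pixel rule above can be checked before training with a few lines. This is a minimal sketch under the assumption that rounding each edge down to the nearest multiple of 64 is acceptable; real trainers also resize and crop to fixed aspect-ratio buckets, which is not shown. `snap_to_multiple` and `bucket_size` are hypothetical helper names.

```python
def snap_to_multiple(value, step=64):
    """Round `value` down to the nearest multiple of `step`, never below `step`."""
    return max(step, (value // step) * step)


def bucket_size(width, height, step=64):
    """Target (width, height) with both edges snapped to multiples of `step`."""
    return snap_to_multiple(width, step), snap_to_multiple(height, step)
```

For instance, a 1000x700 image would be cropped or resized to 960x640 under this rule, while 1024x1024 is left unchanged.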
I am going to have to go here momentarily, but I will let you guys know if anything or if I end up writing a paper @ancient cairn @rigid laurel
I guess one follow-up: what might you recommend for my use-case? Maybe just start out with dreambooth and then try a full fine-tune, using something like the diffusers script I pulled out?
Sounds good on the paper--I would be interested in learning about the training method you described
That really depends on what hardware you have
What GPU would you be training this on?
If it's 24 gigabytes or less, you're going to have to do a full fine tune, and it's going to have to be in OneTrainer, as Kohya does not yet have the changes to the training code that allow it to run properly within 24 gigabytes or less
If you wanted to do that, then a DreamBooth could potentially work. However I would probably recommend just going with a normal full network fine tune in OneTrainer. For example, a full network fine tune in OneTrainer on a 24 GB VRAM GPU can run at basically the same batch size as a DreamBooth on an A100 with 80 GB
I don't know when Kohya is going to be updating the code to have the new fixes that allow for fine tunes to be done so easily like in OneTrainer
Isn't the tab I've highlighted a full fine-tune?
nvm, don't think that's what you were talking about
That is a full fine-tune yes, however you will not be able to run that in Kohya on 24GB VRAM
The only option to do a full fine-tune properly on SDXL is to use a different trainer called OneTrainer. They have new settings in it that allow it to use significantly less memory than it normally would.
For example, in Kohya, on 24GB VRAM, you can train at a max of BS1 at 1024 with no TEs.
In OneTrainer, 24 GB VRAM can train at a batch size of 12 at 1024 without the TEs, or BS8 with the TEs
Yeah, I'm also somewhat familiar with onetrainer. I guess if memory isn't a constraint, I might just stick with Kohya since I could easily access a 48gb or 80gb if necessary. Why are you recommending full fine-tune over dreambooth for my use-case?
Also understand if you have to go, nw, just wanted to clarify that if you had time
Because of the cost savings. A 3090 for a full fine-tune in OneTrainer can run about the same batch size as a much more expensive A100 80GB for a DreamBooth in Kohya
and I recommended fine-tune because OneTrainer only has fine-tune, not dreambooth
@high skiff Actually, if your method is what you want to guard for now, would you be willing to release the fixed models on civitai? Even if v2 isn't as good as a hypothetical v3 or v4, it would give people a chance to experiment with it, report back, and see if your theory about derivative fine-tunes being more stable is correct. Uploading the results shouldn't put the secret of their creation at risk.
How can I make really nice detailed backgrounds?
Do I need a LoRA or do I just need to tweak and play around with my prompt?
I'm actually kinda trying to make the background the main star of my generations - I would like to generate my own desktop wallpaper
Preferably at like 1440p or 4K which I think I can do with Ultimate SD Upscale..? But since I use XL and not 1.5 I don't really have access to ControlNet Tile
If you really want to be extra, try generating the background by itself
then either inpaint the foreground subject, or use that transparent SDXL trick to generate the subject by itself then paste it on top
Does anyone have a solution for the WASasquatch WAS Node Suite? I keep getting this?
Do SD 1.5 negative embeddings still work in XL?
no
yes, I'm not that active here currently, but I'll likely go back to making stuff once SD3 or a new model releases
I'd not necessarily consider myself an "og", but have been around for a while. After playing around using SD for "fun" I've found a pursuit to actually put my work to use. So I've spent much of my creation time working on that project instead of sharing on discord. I try and occasionally chime in here and there though, post the occasional image.
"We're all mad here"
Is it supposed to enhance the contrast when merged?
yes, for example here's some tests with random prompts on civitai, albedo vs the cosxl albedo merge in my example
a person holding a man
Some first CosXL tests. Still experimenting with the settings.
Does it differ much from using the offset lora?
so I haven't really done a/b testing and no direct comparison to SDXL base for example, but I think the extended color range does make a difference imo and can't be compared to the offset lora since that will not really enhance the dynamic range.
but without testing I can't really tell you how much is placebo.
Right now I'm experimenting with dpmpp_sde_gpu karras, cfg 4-6, steps 30-45, 1 pass and also 2 passes using latent upscale.
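The ranges mentioned above lend themselves to a small sweep. A minimal sketch that only builds the list of setting combinations; the dictionary keys mirror the usual sampler inputs (`sampler_name`, `scheduler`, `cfg`, `steps`), but `sampler_grid` itself is a hypothetical helper and how each entry is fed into a workflow is left open.

```python
from itertools import product


def sampler_grid(cfgs, steps_list,
                 sampler_name="dpmpp_sde_gpu", scheduler="karras"):
    """Build one settings dict per (cfg, steps) combination."""
    return [
        {
            "sampler_name": sampler_name,
            "scheduler": scheduler,
            "cfg": cfg,
            "steps": steps,
        }
        for cfg, steps in product(cfgs, steps_list)
    ]
```

Calling `sampler_grid([4, 5, 6], [30, 45])` gives six combinations, which is enough for a small side-by-side comparison at a fixed seed.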
Your examples above do look good... enough to make me give it a try 🙂
Cool 🙂 I think they're close to getting overcooked, but you can still see details in the shadows which shows off the dynamic range and color balance a bit.
So I don't know if CosXL will just look very contrasty and saturated overall.
My first tests contained a lot of noise. The above sampler settings that I mentioned currently give me the least noise for my outputs, but it's a work in progress. I don't know what I'm doing (yet) 😉
Those images are just cosxl, without any merge?
@visual glade not sure the checkpointsave node is saving the merged clip correctly. Image made with merge in workflow VS image made with model saved from the merge
are you on the latest comfyui?
I updated it yesterday, but I'll update again now
Works for me. Images look the same during the merge as using the saved checkpoint.
Using updated windows portable version.
Makes a big change to the image output. These first 4 are from my model without merging...
After merge (same settings/seed)
still have the issue after running the update
I have a similar problem when trying to use the cosxl_edit model. Images are washed out and blurry
Actually might not be the clip, but the unet that is saved. Just tested the saved model, with the merged clip and got the same bad result. Then tried the saved clip with the merged model and got the better result
Same prompt, before and after merge with CosXL
good details and coherence, contrast is a bit strong
Some said sd3 has more channels in the vae. What does that mean?
finer details are better and (better dynamic range or better colour accuracy), that's my guess from what I remember so far
like SD3 2B is just 512px instead of SDXL's 1024px, yet it looks just as detailed if not better
(SD3 8B is 1024px btw, just in case it ends up being confused)
the smaller models which are 512px (such as 2B) are getting fine tuned on 1024px, but they are not as good atm
@meager canopy I have a similar problem when trying to use the cosxl_edit model. Images are washed out and blurry
How can I use stable diffusion ? I’m totally lost
Depends. If u have a good gpu u can run it on ur pc. Or u can run it in the cloud. There are good yt tutorials that can help u
Interesting
Does anyone know if there is an update coming for WASasquatch in ComfyUI? Mine seems to be broken.
Do you mean was-node-suite? Its latest version was early last month, with version ID 6c3fed7.
Thats the one... It is broken for me.
Import failed
works fine, are you sure you actually installed it?
did you do install custom nodes from the manager
and kill comfyui entirely by closing the console window
I installed it from the manager
yes
check the console log for errors related to it
Manager can sometimes cause a bad install. Try to uninstall the custom pack and reinstall it manually
WAS nodes - without exception - always come up RED for me in Manager!!! Can never get them to import...
I understand that Manager does not always update correctly; so I use a script "cd custom_nodes - cd instant id - git pull - cd.. - cd rgthree - git pull - cd .." etc etc etc
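The manual "cd into each pack, git pull, cd back" routine described above can be looped. This is a minimal sketch, assuming every node pack is its own git checkout directly under `custom_nodes` and that `git` is on the PATH; it is not the script that was shared in the chat, and `find_git_repos` and `update_all` are hypothetical names.

```python
import subprocess
from pathlib import Path


def find_git_repos(custom_nodes_dir):
    """Return the immediate subdirectories that contain a .git entry, sorted."""
    root = Path(custom_nodes_dir)
    return sorted(
        p for p in root.iterdir() if p.is_dir() and (p / ".git").exists()
    )


def update_all(custom_nodes_dir):
    """Run `git pull` in every node pack; return {pack name: git exit code}."""
    results = {}
    for repo in find_git_repos(custom_nodes_dir):
        proc = subprocess.run(["git", "pull"], cwd=repo)
        results[repo.name] = proc.returncode
    return results
```

Calling `update_all("custom_nodes")` from the ComfyUI folder would pull each pack in turn; any pack with a non-zero exit code is one to reinstall by hand.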
Try running this inside the custom_nodes folder
I edited the startup .bat to allow selectable updates during startup
Thank you!
||a cat in house||
I love me some Satable Dififel.
Hi everyone, I am new to SD. Wanted to get some help on how to create line/flat sketches like the below images from an actual colored image. For example, how can I get the outline image from an image of the house. Would super appreciate any help with the workflow here. Thank you
I have a node that is broken... Called ComfyUIStyler I have tried everything I can think of to fix it.
Any Help would be appreciated.
Update the node?
Tried that
And did the update occur?
I believe so.
What does the console say when it tries to load that node?
If you have the console output from when the update occurred, that would also be useful.
If not, try to update again and grab it.
The thing is I am having a lot of trouble finding which custom node this actually belongs to
Are you running ComfyUI Manager?
yes and it shows everything is up to date. nothing in red
When you open up the manager screen, on the left there's a drop-down for the badges. Turn on that and you'll see what packs each node came from.
Ok...so if you go to Install Nodes and change the top-left drop-down to the selection that isolates broken packs, what do you see?
Pardon my ambiguity; I just got to my machine and don't have Comfy opened up yet.
its all good. I appreciate the help
A quick google search and it looks like the original author removed the repository, which is why you can't see an issue. However, it looks like you can install this one (comfyui-styles-all) as a replacement:
https://github.com/aegis72/comfyui-styles-all
Its supposed to look like this potentially
This is the page I am trying to get the workflow from
https://comfyworkflows.com/workflows/a45ad88b-c9e1-4fd8-9484-c641635eefcc
The replacement node pack is ID# 391.
Hey that worked... I installed the node via the url you shared and it worked I didnt have to change anything
Thanks
🙂
I have been banging my head on my desk for 3 days over this...
Don't ever let things go that far. Bang your head on your desk for 1 day, then ask. But next time, do it in #🤝|tech-support 😉
Thanks... will do.
create a flying saucer
do it yourself in your garage
living his best life
"It all seems so REAL!"
Student living
dorm lyf
That room has seen some action! 😉
can almost see the crabs 😬
Are people doing anything special (e.g. update of different branch) to run cosxl in forge? When I use the model in Comfy it works fine, and when I use other xl models in Forge they work fine but with cosxl it just gives me noise
It's probably not supported yet 🤷🏻♂️
yeah, seems like it. A bit surprised since I didnt think anything in Comfy has been released specifically for it either
Yes, it has 🙂
must be, I just thought it hasn't since I don't use the default model loader but Efficient Loader from a Custom Node and I didn't think I've even updated the custom nodes recently (though maybe that one loads the default comfy ones under the hood)
sucks that A1111 is seemingly not under development, and that forge was maybe more of a one-time thing
I guess Ill look around for a Comfy inpaint workflow that is comparable to a111 to test cosxl_edit
It's not really an inpaint model, as such
it sounds like image + prompt which is pretty similar when you add the standard stuff on top of it like a brush for choosing which part of the image to process
It changes the look of the whole image
Here are some examples of what you can do with a simple prompt #1071937700778741860 message
cool, but I guess you dont have a node for say selecting just the face putting angry or whatever and then it changing just that and recombining it?
It may be able to do that, I don't see why not, but haven't played with that side of it.
Red golden retriever
dunno man looks like a cat to me
Yes but he identifies as a...
Drag queen with ice and fire
..we are still waiting
i got used to sdxl 😦
even the best sd1.5 models dont cut it anymore after sdxl was finetuned...
it was a fun 3 months tho while my primary focus was sd1.5
ella 1.5 is finally out. that means we have sd1.5 but with the prompt adherence of dalle3 https://m.youtube.com/watch?v=_Pr7aFkkAvY
Diffusion models have made incredible strides in text-to-image generation, but they still struggle with dense, complex prompts that involve multiple objects, detailed attributes, and intricate relationships.
Enter ELLA - the Efficient Large Language Model Adapter that's poised to revolutionize how diffusion models handle sophisticated prompts. ...
What's cosxl ?
a new sdxl model from stability with a bigger color range
Is it possible to use it with the diffusers library?
Yes. If it's not in diffusers format you can use scripts to convert it
create a black elephant
is there any extension or tool that evens out colors after generation? i use adetailer a lot and usually the face comes out a different color tone from the body unfortunately
It's just a newer text encoder from what I understand. Nothing special yet.
Oh
How can a text encoder increase color space?
You're mixing 2 conversations.
CosXL = color space, Ella 1.5 = prompts
so what implementation of ella are you guys using? I grabbed the ella wrapper one.
is one better than the other?
i'm generally getting very good results... but it's hit or miss.
anyone using the style transfer with ipadapter_plus in comfyui, did something change? after update it no longer says "SDXL ONLY" and does not seem to work with sdxl either.
yes, the baseline is significantly improved
Anthropomorphic furry creatures joyfully dancing on a submarine's roof, submerged coral reef in the background with darting fish, tiny barnacles on the submarine's periscope, Edward Hopper, Soft diffused underwater lighting, A sense of whimsical adventure
the issue is not of composition or understanding, but of style loss
the vectors learned when training the control net can dodge the learned style of models
#🏞|general-with-images Lotus flower swaying in the wind, realistic, photorealistic, super detail, super sharp,
Here is the image you requested
elon glaze
the bulletproof luxury car
Senegalese black man, 40 years old, profile picture, real photo
now i get what you mean. i think it's special: it uses an llm and an adapter model as a text encoder
Dod Charer.
Here is the image you requested.
what is latest stable sdxl version and where can we dl it
is there any animal control net for sdxl?
Angry Maximilian
Had the pleasure of working alongside @maxcoopermax on his latest project, called “Seme”.
He asked me to experiment around the concept of renaissance era, giving life to a set of AI intervened custom-recordings [ft. @chinalabaig], experimental oscilloscopes [tbr, hopefully soon], and @touchdesigner systems, for his shows last week in collabor...
cloud,sky,science fiction,scenery,day,outdoors,building,science and technology sense city,(mechanical structure:1.2),(hard surface:1.2),car,BJ_Gundam,, masterpiece, best quality, complicated details,extremely detailed CG,perfect lighting,RAW,Masterpiece,Ultra High Resolution,(Dynamic Perspective),Sharp focus,(Masterpiece, Best Quality),8K,oc rendering,hd rendering
No bot
You could use a leash.
4K
Any of you guys want to participate in the weekend telephone game in #🔆|dailies ?
Yeah...someone needs to pick up from where I left off.
I was gonna pick it up if no one else did but I just didn't want it to just be you and me.
really good, gen data?
Birds; Mascot; Flat wind; Wearing a bachelor's hat on the head; Holding a pencil in hand
Vegan eatery, grimacing piranhas devouring a crispy lettuce salad with rabid appetite amidst horrified onlookers - surrealist oil painting. this is ai art creator from paincreator on civitai with a touch of andrea75c's cute 3d render lora
probably stick to sd1.x for that, xl's best anime is animagine 3.1 and kohaku delta/epsilon
Please create beautiful 17 century woman in period dresses high resolution
momentum-conserving unified unsampling-sampling sigma schedule
Nah do it urself
Here is the image you requested.
An astronaut is playing billiards in a dreamy color, high-definition, 8K
Here is the image you requested.
locomotive with sci fi power plant
Here is not the images you requested.
Good Booooy
Before vs After
epic
oh crap I forgot to write what I was using
😅
https://github.com/pamparamm/sd-perturbed-attention here's the repo that you can use in a1111 and comfy
slenderman killing a person
great upscale, it really kept all the important details in
It's basically just Self Attention Guidance + Perturbed Attention Guidance + IPAdapter2 Composition transfer IMG2IMG.
Why can't it be used anymore?
How do I use it?
hi
/dream/a dong
https://github.com/frankchieng/ComfyUI_Aniportrait hi folks, I've released a ComfyUI custom nodes project, enjoy
the 3D mesh to 2D face landmarker is full of algorithms and transform matrices
the dress is nice, great texture
can you do the same, but not cartoon face, but realistic photography ?
A what?
Yep, here ya go
A bit stronger
a blue sun and a yellow ocean
no
hallo
Hi ! i need help :3
I use stable diffusion locally and I would like to add the zaxychromaXL module in order to have access to SDXL Styles and ControlNet Integrated.
I put zaxychromaxl_v60.safetensors in models > stable-diffusion but when I select this model in stable diffusion, nothing happens..
Probably best off going to the Tech support channel, but SDXL Styles and ControlNet are not part of the zaxychromaXL model, they are additional addon Extensions for Automatic1111, if that is what you are using?
Ah OK ! I thought this was part of zaxychromaXL. How do I install these extensions?
Go to the Extensions tab and hit "load from" and search for the ones you like, is the easiest way.
Oh thanks! its okay for sdxl but i dont find ControlNet Integrated... do you know if this extension has another name?
can anyone create for me future cartman with gray suit from south park and with really bad hairline ?
what's the equivalent of noise offset for sdxl? is there a darkness/light slider?
nope
Here is the image you requested.
Hmmmmm, what about the latent offset?
I think so. Works pretty well.
BSZ cui Extras has a node called BSZLatentOffsetXL. That's your slider. Let's just assume you are using comfyui.
ill check it out. the offset loras didnt give great results...
without the offset
And now to +1
that's beyond -1
-1 wasn't enough
so i intercepted the latent halfway through (step 30 out of 60) and dropped it down -0.5 again
Of course you did
same thing with the signs flipped
Me developing analog Film at home
lol
uhm so you use it before or after the ksampler?
After your empty latent image.
But feel free to experiment
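As a rough picture of what a latent offset does: a constant gets added to the latent before (or partway through) sampling, which pushes the result brighter or darker. The sketch below is plain Python on a nested [channel][row][column] list. It is an assumption about the general idea, not the actual BSZLatentOffsetXL code, and the choice of channel 0 as the "brightness" channel is also an assumption (it is commonly reported for SDXL latents, but check for yourself).

```python
def offset_latent(latent, amount, channel=0):
    """Return a copy of a [C][H][W] latent with `amount` added to one channel.

    `latent` is a nested list of floats; the input is left unmodified.
    """
    shifted = []
    for c, plane in enumerate(latent):
        if c == channel:
            # Shift every value in the chosen channel by the same constant.
            shifted.append([[v + amount for v in row] for row in plane])
        else:
            # Other channels are copied unchanged.
            shifted.append([list(row) for row in plane])
    return shifted


# Tiny 2-channel, 1x2 "latent" just to show the effect of a -0.5 offset.
demo = [[[0.0, 1.0]], [[2.0, 3.0]]]
darker = offset_latent(demo, -0.5)
```

Dropping the latent again halfway through the steps would just be a second call like this on the intermediate latent.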
workflow is in the image
haha putting it after made the image black 😂
I like the guy at the window
this is a gif right?
I hope it's not the 400 steps workflow 
discord strips all metadata on images
... i could def add a couple continents worth of disconnected nodes if you need that
Amazing 😍. A room where the lights are switched off.
oh nvm, i guess it doesn't anymore or at least not for pngs
yeah, fuck
the whole "throw a stupid lamp in at the last second" sdxl shit drives me nuts
shit, just lost my whole setup testing it out lol
Since I am a member of this channel discord has never stripped any metadata.
im not even trying to make something dark its just that this controlnet and ipadapter makes things much lighter than the source material for some reason...
How about darkening the source?
other approach that works well, might be best to do them together: brighten your source, shuffle blur it with the ipadapter noise node, and feed that into the negative input
best thing i've found is to color dodge it with itself
AFTER the shuffle blur
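For anyone unsure what "color dodge it with itself" means numerically, here is a small sketch using the standard color dodge blend formula, result = base / (1 - blend), clipped to 1, with the image used as both base and blend layer. It works on a flat list of channel values in the 0..1 range. The function names are illustrative and this is not what any particular ComfyUI node runs internally.

```python
def color_dodge(base, blend):
    """Color dodge for one channel value in [0, 1]."""
    if base <= 0.0:
        return 0.0  # black stays black
    if blend >= 1.0:
        return 1.0  # avoid dividing by zero, result is full white
    return min(1.0, base / (1.0 - blend))


def dodge_with_self(values):
    """Blend an image with itself: each value v becomes v / (1 - v), clipped to 1."""
    return [color_dodge(v, v) for v in values]


# Dark values barely move, anything at 0.5 or above blows out to white.
print(dodge_with_self([0.0, 0.25, 0.5, 0.75, 1.0]))
```

The effect is a strong brightening that leaves the shadows mostly alone, which fits the goal of feeding an over-bright version into the negative input.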
And then only 400 steps res_momentum aaaaaaand done.
hell yeah
last night i said something about having generated 100k images in the last 5 months or so and got a comment... "quality over quantity"
What do auto1111 users talk about these days? 🤔
<-- dr. 400 step res.............
don't think there's much action on that end anymore
forge dude has been abducted by aliens too which isn't helping
trying to see if it works for me or not
Latest commit was 1 month ago. That's....not that good.
yup, and left bugs that broke regional prompter on non-standard resolutions
memory leak
shit, it stopped working for me 100%
i had to roll back
So he went full auto
afaik, a1111 still doesn't even have differential diffusion, and that's been out for months
still no playground 2.5 support
a flaky extension for cascade
and a whole pile of other things that've come out recently
F that piece of software. When I installed it for the last time it auto downloaded SD 1.5 Base. That was giga shit
it's such a resource pig
Go to hell
here's what you wanna see in your basement at night
@uncut steeple @smoky patrol
this is the aesthetic i'd like in a house
nice
Until you go to the basement

You don't need to ping them. Just react to the message with ⚠️
alrighty, thanks
oh right, freeu is also good at making dark images if you take values below 1
thats interesting
probably also lowers quality lol
sometimes i like that
i have a tumblr dedicated to that
it removes the standard metadata, just not the workflow part that is apparently stored elsewhere
thank god for that
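A quick way to check whether a downloaded PNG still carries a workflow is to read its text chunks directly. The sketch below uses only the standard library and assumes, as far as I know, that ComfyUI saves the graph as JSON in PNG tEXt chunks under the keywords "workflow" and "prompt". Whether Discord keeps those chunks is just what is being reported in the chat above. `make_chunk` exists only to build a demo file.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_text_chunks(png_bytes):
    """Return {keyword: text} for every tEXt chunk in a PNG byte string."""
    if png_bytes[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    found = {}
    pos = 8
    while pos + 8 <= len(png_bytes):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", png_bytes[pos:pos + 8])
        data = png_bytes[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            keyword, _, text = data.partition(b"\x00")
            found[keyword.decode("latin-1")] = text.decode("latin-1")
        if ctype == b"IEND":
            break
        pos += 12 + length
    return found


def make_chunk(ctype, data):
    """Build one PNG chunk (used here only to construct a demo file)."""
    crc = zlib.crc32(ctype + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + ctype + data + struct.pack(">I", crc)


# Demo: a header-only "PNG" that carries a workflow text chunk.
demo_png = (
    PNG_SIGNATURE
    + make_chunk(b"IHDR", b"\x00" * 13)
    + make_chunk(b"tEXt", b'workflow\x00{"nodes": []}')
    + make_chunk(b"IEND", b"")
)
print(read_text_chunks(demo_png))
```

For a real file, pass `open(path, "rb").read()` instead of `demo_png`. An empty result means the workflow did not survive.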
disco elysium portraits
Nobody cared who I was before I put on the mask
They mean business.
😄
Here is the image you requested.
it looks like he's taking his head off 😁
Don't see much of a resemblance, yet funky image anyway 😄
Like the two in the background 🙂
I dreamed of a bull-like zombie breathing next to a garbage container in the forest.
like this
😄
ok one sec
Can you make it so that the zombie looks like a human and breathes into the sky and its breath is visible?
This is not my native language, I'm using translation, sorry.
Just like our breath is visible when the weather is cold
in what way is a human zombie like a bull?
breath is like a bull
not the zombie
His breathing is like a bull, he is not himself
Can you show me how you did it?
😄
Good job, but it sounds more like fog is coming from behind, which is nice, the fact that it's leaning against the trash bin adds another angle.
yeah, when i said it wasn't a mist, it was coming out of his chest or head. it didn't want to make it come out of his mouth
so breathing out a mist is the closest thing.
What do you use when doing this?
The idea of a zombie breathing seems foreign to artificial intelligence.
do you have comfyui?
What do you mean? He draws it by hand! 🤣
that looks like forge
so you can download comfyui along with comfyui manager and then drop my picture onto it to see the workflow.
you can also generate images with the new pixart-sigma which I'm using, with this:
Does it run on the cloud or the graphics card?
4090
the great thing about pixart sigma is that it can run the language model in system ram (20 gigs) and only needs 3 gigs of vram.
Looks like it needs nails.
Using the hexagon seed image from today's #🔆|dailies post, I decided to run through a few other concepts. (@warm hazel)
This time it looks like it's not in the cat's hand but in the air almost in front of its hands 😄
bro why i can't ai generate
generate ai can't i why bro
i want generate ai and taking my ai art of my previous ai generate from #1103708504142925824
Access No from generate ai previous my of art ai my taking and ai generate want i
You not?

Just a wild thought, if I were to merge a checkpoint dozens of times with many loras I want, would I sorta be making a "new" checkpoint long term that way? like flushing the old checkpoint with enough data from the loras so the loras start becoming a new checkpoint..
Or that definitively won't work and be a waste of time?
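For context on the question, this is roughly what merging a LoRA into a checkpoint does to a single weight matrix: the low-rank product of the LoRA's two matrices is scaled and added on top of the existing weights. The sketch is plain Python on lists of rows. The names are illustrative, and real LoRA files also carry an alpha/rank scale, which is folded into `strength` here as a simplifying assumption.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, up, down, strength=1.0):
    """Return weight + strength * (up @ down).

    weight: out x in, up: out x rank, down: rank x in.
    """
    delta = matmul(up, down)
    return [
        [w + strength * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# 2x2 base weights plus a rank-1 LoRA merged at half strength.
base = [[1.0, 0.0], [0.0, 1.0]]
merged = merge_lora(base, up=[[1.0], [2.0]], down=[[3.0, 4.0]], strength=0.5)
print(merged)
```

Each merge adds a delta on top of what is already there, so repeated merges pile up a sum of low-rank changes while the original weights remain part of the result. Whether that ends up behaving like a "new" checkpoint is something you would have to test.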
@copper kraken Are you rob?
why?
great colors!
Thanks! I love these 😍
they are beautiful
Close-up, 3 glass bottles containing a different coloured Intricate gorgeous detailed Neon mythical creature in each bottle, triadic colours
great prompt 🙂 I made some modifications to make it react better with PixelWave
Close-up, glass bottles containing a different coloured Intricate gorgeous detailed Neon mythical creature in each bottle, triadic colours, vibrant glowing neon, dark dimly lit night
Nice! is that Pixelwave-Sigma?
I'm currently working on showcase images for the new PixelWave version (09) by @west breach which isn't out yet (08 available here: https://civitai.com/models/141592/pixelwave). I also use Perturbed-Attention Guidance with those images to get those next-gen details and coherence
I'm using Sigma, with an XL-Cos refiner
sounds like an interesting workflow
My initial Sigma image is rubbish, but the refiner makes a huge difference. This was the sigma output of the above image:
uhh... very interesting
Just that sigma has really good prompt comprehension, so I use it for composition of the image.
try putting Perturbed-Attention Guidance in the mix and see how it goes 🙂 it's now natively integrated into ComfyUI. You might need to tweak your settings so you don't burn the image - good scale values to start off are 1.0 - 1.75.
do you have an example how you're using sigma in your workflow?
I'm just adding auto CFG and PAG to it 😉
ha perfect 🙂 thank you
Needs a shit load of RAM, or VRAM for the T5 model
so the init image is pixart and then you use CosXL to refine it?
CosXL explains your awesome image contrast for sure
A custom merge of CosXL, yes
uhh - nice 🙂
Whoa!
fantastic!
Sorry, I was confusing Pixelwave with Pixart! 😄
The whole thing is a bit more complex 😄
There's also an Ultimate SD upscale in there, but doesn't need it.
yeah - seems like a great pipeline
cinematic still, vibrant glowing neon, dark dimly lit night, Close-up, coloured intricate gorgeous detailed neon mythical dragon, triadic colours, glow fog magical sparks
/slash dream
Hey there, do you have a pixart one posted on civit? I'm always looking to see what people are doing to improve pixart. I'm doing an sdxl denoise on it, but I haven't tried doing the new pag/sag on it.
I don't
looks fantastic
Thanks. Do you want the workflow?
yeah definitely
awesome, thanks
nice
I have many custom nodes, and some of those are grouped. Have you tried converting them back to nodes?
yeah, i was able to "fix node" on the u sd upscaler, but the middle one seems ungroupable/fixable. I'll recreate it. Thanks for it, it tells me what I was looking to see with the other ones.
PAG powered pixel / voxel art aesthetic
detailed, spellbinding portrait of a spirit, warlock's well monsoon, simple and clean, pixel art, 50mm, cherry and turmeric hue
great colors!
What happened to #stable-cascade?!?!?
I dunno - maybe being shifted since SD3 api is out now
All that history, just deleted. 
maybe it comes back later under the archive
I don't see any archive. The same with the old bot results...gone
you need to get the specific archive role - cascade is not there yet
Thanks
WTF?
that sucks
people were still using it too
Yep, I still do
@uncut steeple any way we could keep #stable-cascade around?
been enjoying your creations too
We should take over another channel for Cascade 😄
Not my decision, sorry
could you put in a request to at least make it available as an archive? thanks
Guys, think positive, at least we can keep the ultra helpful #1098025024541167646 channel. A place full of wonders and joy.
Oh shit. I completely forgot about that. Don't touch that! My name is somewhere buried in there.
o.O that cascade channel had a wealth of info and workflows
yup. it's really disappointing they'd just delete it...
it would also be nice if whoever is running this discord/making decisions would communicate at all with the people who are actually participating here. maybe i missed something but i haven't seen it
pantheon channel with bot gens went poof as well :/ but that one didn't have good info. It's rough for me as i used cascade chan as a reference guide, but totally rude to bigger contributors like you, everything just got lost .
yeah, we put a lot of time into it, and there's still people working on big finetunes with cascade
matteo (IPadapter plus guy) has expressed a lot of interest in cascade
This Discord isn't run for the community, just as advertising space where we have no say.
Ill bring it up
yeah it's a pretty weird server
kinda feels like the movie home alone 😛
is it possible to use ella with sdxl?
nope
Should be accessible now with the archive role
Thanks
How do i get the archive role, i thought i have community archive and dreambot archive roles, but not seeing it
(Can you still bring up that cascade is very much not dead, and it seems really soon to bury the discussions about it)
I'll agree on this, I still fire it up every day or two for certain types of generations
Civitai even has a section for it with recent uploads so I think that's enough reason to keep it around for now
Unfortunately also not my decision to make, community archive role should make it visible, archives are at the bottom of the channel list
understood, we don't blame ya or anything like that. 🙂 thanks for passing the word along
who is making these decisions anyway? just curious
but i didnt know they have actual control over this server, i guess i thought this was purely community created server.
oh well
yeah, that's what i thought at first
it appears that the community only gets the role of "guides" or modest moderation permissions
nice
so i was just looking at the archived section... might want to archive anything in here you care about because this section will probably get archived or deleted without warning just like the #stable-cascade section
looks like that's what happened with sd2.1 but yeah
Where is the best place to find all the latest xl controlnets?
gorgeous
civitai
depends on the style you're looking for but that's a good one
there's also cheyenne
can someone help me understand why auto is crashing when i enable ip-adapter
hello
Whats the best method to fix skin?
Load image node to vae encode to ksampler denoise doesn't remain true to the original composition and rather creates a new one. Meanwhile upscalers don't really do much of a fix to the skin if it was already lacking detail.
Not sure of any method that helps with it without messing with the composition (faces, hands, eyes, etc)
Controlnets: depth + blur|tile, low cfg, model fine-tuned for detailed skin
Thanks for the tips, will look into it
Channels dont get deleted as far as I know, just closed, but keeping all the channels around for everyone makes no sense, the server already has a lot of stuff going on and can be overwhelming, if you want access to the closed channels I can ask for that
We would like to know why an active model channel was even closed at all. Cascade is a good model that a few of us are actively using.
Calling it active is a stretch tbh
Where'd the SD3 channel go?
It's Cascade that's gone
Ah! SD3 is back ... 🙂
Full fine-tune or LoRA? Do you also do DreamBooth?
where?
At the bottom of the channel list
I don't see it
No Access
Nope
There's only Community and Dreambot Archives
Ill bring it up
ok
unofficial implementation of Comfyui magic clothing - frankchieng/ComfyUI_MagicClothing
I tried this but only seems to generate female models
SDXL still top dog.
Well,I guess the question then is what level of activity is enough
Galaxy was posting stuff in there daily, I wasn't posting new stuff in there much recently but I use it a few times a week still as it remains the best at a few things (dark images in particular)
And there are channels with a lot less activity like the LLM one
Bring back Cascade!
Imo if there's ppl posting stuff there daily, even if it's not being heavily used around the world, it's enough... Otherwise it just will end up cluttering up the general images channel
If it doesn't come back it's not the end of the world but it just makes things less organized and harder to find down the road
Cascade was the least loved model. What a sad story.
I tried it and it was neat for comfy. Out of the box it beat the base SDXL.
But no one cared....
I did 😄
Yeah it is a bummer I agree
It has one single issue, a pretty big one but something that seems very fixable
Stage B tends to oversample the image and generate what looks like leftover latent noise
The only issue it had, was that SD3 was announced so nobody wanted to waste time tuning Cascade.
It was a very weird release strategy.
Not sure if cemetery or public toilet.
yuuup
i bet the stage B issue would've been fixed in two or three weeks otherwise...
the most weird image
I love seeing very realistic looking architecture and prop placement that makes 0 sense in reality. Very surreal.
is it really santa claus?
juggernaut + mistral ai with loras
can anyone help me setting up sdxl temporalnet in comfyui?
Happy White Squarepants feet, dancing with joy because SD3 🎉
Waiting for SD3, we'll see what it will bring
Sword hand
Kids these days
Ah, good old crappy digital picture aesthetics. Nice.
I've made upside down but it was by generating the person right side up then flipping
Vhs lora
I see, i see. Somebody is fishing in my lora territory 🕵️
Challenge accepted
What model do you use?
I will try good old Base. But i was under the impression i had a VHS Lora, but that was for Cascade (RIP Cascade Channel). I am training an SDXL Lora for that right now.
A lot to unpack here.
oooer
ooooer
Could be painful
damn it
boxes are created by folding timespace.
I will fold you
reminds me of the definition of jiujitsu
involuntary folding of your clothes while you’re still in them
What movie?
Really?!
yes really
Matrix
Can we give DoesStuff a timeout for that?
lol I’ve only ever watched reloaded
Dude, this isn't even a matrix movie. There is only The Matrix.
i know a lora on civitai
type vhs sdxl in search
I am more of a homemade guy
ha i see
hey there, which model are you using for this one? i have my own merge going, but this is pretty sharp while still having fantasy details and lots of background stuff going on that doesn't get pushed out.
Perfection
damm
try the prompt A view of a dark and sinister suburb with dilapidated houses and deserted streets. At the center of the scene is an overgrown garden filled with broken gravestones and eerie statues. Dead and twisted trees create threatening shadows on the ground, and a thick and opaque fog covers the entire landscape. Ghostly figures can be seen moving slowly between the graves. The image reflects a nightmarish and oppressive atmosphere, with dull colors and a dark and stormy sky.
My own custom model and workflow.
trained on your own images or a merge? any major models that you can point me to that I should be merging?
It's my merge on Civit that has then been merged with CosXL and the flow uses a local LLM to enhance the prompt. It goes through a refiner stage with another model and then upscaled with my merge again.
looks like a lost media lol
Shower pool
is it possible with a tsunami in a bedroom?

