#✨|sdxl
Not 1.2?
1.1 distilled, or 1.2 non-distilled.
hopefully city96 will get his stuff updated and then we can do 1.2 distilled too
grabbing the non-distilled now
yeah even non-distilled 1.2 isn't working with the city96 nodes. i guess he has to update stuff for both. so stick with 1.1 distilled for now
Hi, Can support for Hunyuan DIT 1.2 be added please? I attempted to load files from https://huggingface.co/Tencent-Hunyuan/HunyuanDiT-v1.2 and get the following error: Error occurred when executing...
i've tried so many times too. Do you run windows by any chance?
Yes, but it's not compatible with the node yet.
yup. F hope they fix it soon.
Oh ok awesome. The city 96 guy has been really great with this stuff
anyone got a comfy workflow for this? https://civitai.com/articles/5834/silvi-v2-upscale-method-super-inteligence-large-vision-image
I am out here...
🌟 Visit for Latest AI Digital Models Workflows: https://aiconomist.gumroad.com/
Learn how to use the latest and greatest Stable Diffusion XL ControlNet models in ComfyUI! This step-by-step tutorial covers everything from downloading and installing the new SDXL ControlNet models to generating incredible AI images with Canny, Depth, and OpenPose...
is it worth to upgrade from the old controlnet models? anyone tried these?
Yes, they are fantastic. That new depth model is on par with marigold and is like 1000000x faster. Haven't tried the pose one because I don't make a lot of stuff with people in it
Mistoline and that new depth are the ones I use nowadays
It appears a 404 error U.U
Wait now it works, i´ll see it
Found it on reddit, I haven´t tested it yet
negative prompt: "simplified, low resolution, canvas frame, ((disfigured)), ((bad art)), ((deformed)),((extra limbs)), blurry, (((duplicate))), ((morbid)), ((mutilated)), [out of frame], extra fingers, mutated hands, ((poorly drawn hands)), ((poorly drawn face)), (((mutation))), (((deformed))), ((ugly)), blurry, ((bad anatomy)), (((bad proportions))), ((extra limbs)), cloned face, (((disfigured))), out of frame, ugly, extra limbs, (bad anatomy), gross proportions, (malformed limbs), ((missing arms)), ((missing legs)), (((extra arms))), (((extra legs))), mutated hands, (fused fingers), (too many fingers), (((long neck))), Photoshop, video game, ugly, tiling, poorly drawn hands, poorly drawn feet, poorly drawn face, out of frame, mutation, mutated, extra limbs, extra legs, extra arms, disfigured, deformed, cross-eye, body out of frame, blurry, bad art, bad anatomy, 3d render, digital art, bad art, (deformed iris, deformed pupils, semi-realistic, cgi, 3d, render, sketch, cartoon, drawing, anime:1.4), text, close up, cropped, out of frame, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck, 3d render, digital art, bad art, 3d render, digital art, bad art, 3d render, digital art, bad art"
🤣
That´s a big ahh negative prompt 😁
The funniest thing is that if you use that in the positive prompt it probably would look better 😁
hello
that pink and yellow room is great
I can't see any details inside the windows 🤷🏻♂️
is that still a thing or does it just add more tokens if u go overboard?
clip itself is limited to 77 tokens (75 usable once you count the start/end tokens) - no way around that.
you can only split the caption into parts of 77 tokens each and process each separately
how you then feed these tokens into SD depends on the implementation. You can just use all the tokens, but as sd is trained on max 77 tokens this will probably harm quality. You can also interpolate the tokens or down weight them
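A minimal sketch of the splitting step described above, assuming the prompt has already been tokenized without special tokens. The ids `BOS_ID` and `EOS_ID` and the choice to pad with the end token are assumptions based on the common CLIP tokenizer, not something stated in the chat; how the resulting windows are fed to the model still depends on the implementation.

```python
from typing import List

BOS_ID = 49406        # assumed start-of-text id of the common CLIP tokenizer
EOS_ID = 49407        # assumed end-of-text id, also used here as padding
WINDOW = 77           # CLIP context length, including both special tokens
USABLE = WINDOW - 2   # 75 prompt tokens fit in each window


def chunk_prompt_tokens(token_ids: List[int]) -> List[List[int]]:
    """Split prompt token ids into 77-token windows with BOS/EOS added."""
    chunks = []
    # max(..., 1) makes an empty prompt still produce one padded window.
    for start in range(0, max(len(token_ids), 1), USABLE):
        body = token_ids[start:start + USABLE]
        padding = [EOS_ID] * (USABLE - len(body))
        chunks.append([BOS_ID] + body + [EOS_ID] + padding)
    return chunks
```

Each window would then be encoded separately and the embeddings concatenated, averaged, or down-weighted, which is the implementation-dependent part.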
i have a negative prompt style with 6000 words
been working on a workflow for sdxl as I put it down for a while when 3 came out to play with it
link?
I have a good one on my Patreon. These are before/after images using the Silvi example image.
Pink sky with blue clouds and green sun
almost, but the lady got in the way and her hat stole the green away from the sun
I guess the model knows clouds, but clouds are also part of sky. So when trying pink sky, it also projects pink to the cloud-part of the sky.
and sun, well that should always be yellow-ish or red-ish
tried to fool the model by replacing green sun by green sun-like star ... but it just wont get fooled
ok, enough. going to put (yellow, yellow sun:1.5) in the NEG prompt
We've hit a border. Something prevents us from having a green sun. Though this border protection worker seems familiar.
really cool sharp lines on this one
ella
Thanks but I'm banned from Patreon 😅
They didn't give an explanation
I think nsfw content is against their tos tho
@humble cape Apologies for the tag. I've been going through and love your images and the style. Would I be able to DM you some questions?
Hello, I will be happy to show you how I generate my images
Nudes are allowed.
Looks like there's a sea wall on the horizon
Maybe a very long pier 😄
"I feel...different..."
Drones battle it out over NYC
That small, overlayed image just makes it look like a badly denoised render.
That’s my watermark 😂
I keep forgetting to share the non-watermarked ones
It still doesn't look good. It's not clear enough to be worthwhile putting there, so you should either shrink it to stop obliterating the image, or remove it.
Just my thoughts, but it's up to you 🙂
I like em sorta big harder for ppl to strip and re sell because fb is filled with re sellers or artist theft
But I also make sorta small too
People are stupid enough to buy them?! 🤣
They don't belong to anyone anyway 🤷🏻♂️
😂 it belongs to the original artist well to be fair I do add sfx etc to the photos so it’s not just SD /SDXL procreate does play a part
Before I drew in
Depends how much of a "significant" change you make from what AI created.
After
That's so much better without the splatter! 😄
Thanx but it needed the horror to it 😂 other than that it’s too girly for me 😂 jk
Just change the prompt to add tattoos
True but I love drawing too
A man walks on Mars
Some variations of it...
The first and last one are awesome second can def be fixed in procreate
No point when you can just squeeze out some more images 😄
Yes that too 😂
See I thought the nurse outfit was perfect leaning more towards the Sony games
A cartoon girl in bed
@iron rover A cartoon girl in bed
Here’s the image u requested
That’s def beautiful
@iron rover An astronaut looking up at the Earth
Here is the image you requested.
@meager canopy A cartoon girl in bed
Here is the image you requested.
The very particular/distinct look of some DMT trips : D
yeah, cause they use paypal or stripe and those processors dont allow explicit nudity
Here’s the image u requested
Mangled Merge XL V3 has been released if anyone would like to play around with it. I'd love to see what images others can make with it. 🙂
did a few with it
A drone swarm composed of three drones inspects a building in a realistic style, close to the real scene
Nice! They look awesome! What are you using to upscale? Need to set something like that up myself. 😊
that one was just itself with a 1.5x upscale and 0.5 denoise. "upscale image by" node followed by another ksampler of it.
where can i generate images with sdxl please ?
Got the upscaling process working in Comfy now. 🙂
Does anyone have a good SDXL latent downscaler? I’d like to go from 128x128x4 to 64x64x4.
embrace the lord of frenzy !
have i got the woman for him. 🙂
We have grapes!
gaze in to frenzy ~ xD
hi! Wondering does anyone have ideas for how to get this style?
I can't find any loras for it
I need it for img2img
to make a image in this style
like style transfer
lite-brite style?
this is great, what sort of model and prompt are these action shots?
looks a bit like marvel movie but a bit more cartoony
Is it possible to outpaint with SDXL models? I've tried inpaint controlnet models and none worked so far
(with A1111/forge)
yeah any diffusion model can inpaint or outpaint
because there is a way to do it by masking the latents
in the img2img>inpaint tab?
I don't use A1111 so I don't know
I either use comfy, diffusers or direct pytorch workflow
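A rough sketch of what "masking the latents" usually means, assuming the common approach where, after each denoising step, everything outside the mask is overwritten with the original latent (noised to the current step's level). Flat lists stand in for latent tensors, and `blend_latents` is a hypothetical helper name, not part of any UI or library mentioned here.

```python
from typing import List


def blend_latents(generated: List[float],
                  original_noised: List[float],
                  mask: List[float]) -> List[float]:
    """Take generated values where mask is 1, keep the original where mask is 0."""
    if not (len(generated) == len(original_noised) == len(mask)):
        raise ValueError("all inputs must have the same length")
    return [m * g + (1.0 - m) * o
            for g, o, m in zip(generated, original_noised, mask)]
```

For outpainting, the canvas is padded first and the mask covers the newly added border, so the same blend applies.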
This was a prompt I found on midjourney and rendered with hunyuan 1.2, refined with level 4 XL. prompt: trumpet player fights against a giant bloboid Cthulue-esque monstrosity by playing songs on a trumpet in a messy shady brick alley
I just want a model that makes pictures that are 100% indistinguishable from the real world. Without Photoshop pitty work.
One does exist her name is Photoshop
I mean I can fix anything in PS but then we're back to square one aren't we. XD
Right.....................
I think this nailed what you're looking for: #✨|sdxl message
Maybe PS has advanced since then. After all, it was the first and has just evolved since then
I think ai has gotten lazy here lately and I see everyone not here but Bing users look like it’s the same prompt same art same color… if that makes sense
Fire fly is actually really cool if u want to evolve ur SD images into something unique
I think AI kind of peaked. We have to accept that it will be just another (quite amazing) tool in our toolbox. but it's not a one click magic bullet.
I think sd3 2b gets really close to that real photo look with the better vae. obviously it's not trained well enough hence the bad trumpet etc, but this is miles closer to a real photo than sdxl
I agree Dodge Coin
Yes that's why I feel bad for SD3 it is a better start foundation than SDXL but if no one trains it... :/
well, supposedly better version soon
Someone will train it just ppl don’t get the license
IN 2 WEEKS! 😄
finally got a reasonable trumpet
Also Lykon said he will be making Dreamshaper / sd3 so there is that
Glad this room has zero drama in it and hope they don’t kill this channel like they did with Cascade
I hope not! =0
when new version come out they archive the old channels as they did us with SD2
If you like cronenbergs.
I noticed that what happened to sd XL?
That model came out at the same time as Cascade did
When I used to be here every single day a bunch of us were using the SD2.x channel. XL came out and they ripped it away from us. We asked for it to stay and they did but not for long as they archived it and no further posts allowed.
I remember us asking for the please stay but I’m sure they can un archive so stupid they did that too
A few of us did not wish to move to XL so preferred 2.x. They decided to allow the channel to come back but it was short lived
I eventually left this discord to find happier grounds after I migrated to XL not long after.
Everything short lives in this server it seems like
I found out how to make extremely good images with sdxl-turbo
Just refine the image Generated with SDXL-turbo 3 times with the sdxl refiner.
The first image with SDXL-turbo and the second is the refined image.
did you try downsampling and then just color a other image based on downsampled source image. AI would just make artifacts
not too bad but some might prefer first one since it has more detail.
Yes, sadly this approach can add some artifacts but it removes more artifacts than it adds. One example is the eyes. But sdxl-turbo surpassed sd3 with the same prompt.
Лев (Lion)
Could you please use English to minimize the risk of translation errors?
Lion
thats a fox
thats a good idea ty
yeah you gotta specify what artifacts... best to provide a couple examples tbh
the bits in her hair
I am using a lora, one I trained through civitai, plus another texture lora called ReaLora
its not very common but sometimes its very lightly on the body etc
like its really visible on the stomach here
the stuff in the hair looks like a noise problem
that can be the result of a bad checkpoint/lora, or many other things with sampling
best thing to do is turn off all loras, and revert to some basic standard sampler settings that don't add noise... something like dpmpp_2m or euler, 35 steps, karras
and see if it goes away
that can help you narrow down the problem
if it's still there and you're using a really generic WF... then it must be your checkpoint
or you got some stranger problem
Kolors, it's almost SDXL isn't it 😉
Cats eating noodles is cute
Those cats were such a nice surprise! (my ehm, creativity is basically "dear llm, i have these 2 prompts, create 10 more like them" and the cats were one of those 😉)
those were all cute
Different kind of prompt (An atmospheric HDR 8k digital octane render....) It's pretty fun model, not the best (doesn't do various styles all that well, doesn't react to artists, prompt following is hit and miss for recent models), but it's very promptable for a lack of better word, once a prompt works it works for various subjects, its failure mode is still reasonable images and results are varied.
@gloomy lark I managed to merge Kolors with my sdxl ft. 1: original 2: merged. Using this for inference: https://github.com/kijai/ComfyUI-KwaiKolorsWrapper The original conversation is in SD3 channel but I think SDXL channel is more appropriate
wow those are fantastic
just as i was messing with lumina/cascade.. 🙂 (penguin image)
so I'm up to speed on using kijai's nodes, but how do we merge kolors with sdxl? I tried kind of but it just kept erroring out.
yeah... lumina on left, level4 sdxl upscale on upper right, kolors on bottom right
and they all get the too cute award
hah well I'm highly jealous of what @tribal lantern posted earlier
take a look at the model page and tech page on the huggingface space i gave you the link to
yeah i've been playing with it quite a bit since yesterday.
the biggest issue with upscaling stuff like lumina which is multi-subject, is upscaling with sdxl which isn't. but kolors is...
so now I just need xiaozhi's formula for sdxl merging
with it
i never use the AI i'm generating with to upscale. that's a recipe for disaster
Just now finding out that adding iterations to the Dare merge nodes is like merging on steroids. Super stoked about it. These are the results after smoothing the last run of merges.
Need to skip a layer which has different shape between kolors and sdxl
and comfyui node could not recognize kolors as sdxl unet due to that layer. I write a script to merge it
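A minimal sketch of the idea of skipping a mismatched layer during a merge. This is not the script mentioned above. Flat lists stand in for tensors (so "shape" is just length), the key names in the usage example are hypothetical, and keeping model A's weights for skipped layers is an assumption.

```python
from typing import Dict, List, Tuple

Weights = Dict[str, List[float]]  # flat lists stand in for tensors


def merge_skipping_mismatches(a: Weights, b: Weights,
                              ratio: float = 0.5) -> Tuple[Weights, List[str]]:
    """Blend matching layers; keep A's weights where B is missing or differs in shape."""
    merged: Weights = {}
    skipped: List[str] = []
    for key, wa in a.items():
        wb = b.get(key)
        if wb is None or len(wb) != len(wa):
            merged[key] = list(wa)   # assumption: fall back to model A
            skipped.append(key)
            continue
        merged[key] = [(1.0 - ratio) * x + ratio * y for x, y in zip(wa, wb)]
    return merged, skipped
```

Returning the skipped keys makes it easy to confirm that only the expected layer was left out.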
anyone used the new controlnet union model ?
ControlNet Union ++ is the new ControlNet Model that can do everything in just one model. Here is how to use it in Comfyui
Links from my Video
https://huggingface.co/xinsir/controlnet-union-sdxl-1.0
Get my Workflow here: https://www.patreon.com/posts/107705790
Join and Support me
Buy me a Coffee: https://www.buymeacoffee.co...
#artisan-1 a typical factory in Pskov
mexican skeleton frog robot is amazing
everyone should leverage this, all in one controlnet https://huggingface.co/xinsir/controlnet-union-sdxl-1.0
Are there any good models for SDXL controlnet anymore? I have a lot of models that perform poorly, once better, once worse, but certainly not the level as in SD 1.5. Are there any new sensible models for CN?
i dont know if a1111 supports that controlnet union but it works really well on comfyui as i've tested. and you don't need multiple separate CN models when using union which is about 2gb and cover all styles
a, I haven't heard of something like "controlnet union" I'm already getting around to reading about it, thanks for the info
I hope it works in A1111 or Krita (ai addon) cuz really dont want to use comfy.
its pretty new, dropped a day ago, they will probably have support for it on a1111 in days to come, but so far its working good on comfyui
i tried, well, openpose in preview working good but in practice not working, at least in a1111 for now or I just set something wrong, but hmmm... it should be like this? (openpose_full)
dont think a1111 supports it yet. Usually with controlnet you'd have to select a particular pre processor and load that respective model, but with sdxl union you select the model and select the preprocessor based on it
Are all these from the past few days from some newer version of your lora or a model merge? Been digging them a lot. Almost have a Berserk manga feel to them, but mixed with some high contrast color.
no new model... been fuckin with the RES sampler
most from this evening have been with the following strategy:
run a step with two seeds, then run two seeds for the next step (2x2), then two seeds from that next step for a total of 8 outputs over 3 steps
find the median output as a very rough estimate of "more probable" (thinking of this like a potential energy surface in chemistry, except we can't take a single point energy here)
and then repeat
one of the more interesting things i'm noticing is it seems to be especially good at extracting styles etc from a lora
I've seen other workflows with contrast this good
but your workflow has by far the best colours I have ever seen
not sure exactly which part is doing that though
Ahh yeah that's a pretty interesting approach, it's like a seed average almost. You're kind of making a little bell curve for every pixel and then picking from the peak, kind of like a gaussian distribution function
Cool shit though man
After you have your little three step average, you then just ride it out to completion?
Kind of reminds me of the trick from a while back where you'd use 2-3 steps to prep a latent, but not return the noise, then feed that in as the base latent for the actual generation
so something like this? (im just adding them all and multiplying by 0.125 for an average, but i did the 2, lead into two, that lead into two each thing, they are just arranged for the screenshot)
i could probably write a simple node to pull this off to avoid summoning the flying spaghetti monster. Edit: looking at some of the previews, I can see it's not quite right, I think I need to change some of the add noise/return noise settings on them
bro im bored and trying to distract my brain with tedium
yeah there we go, just needed them all to add noise and to not return noise
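A small sketch of the two ways of combining the eight branch outputs discussed above: adding them and multiplying by 0.125 (the mean), or taking a per-element median. Flat lists stand in for latents, and reading "median output" as an element-wise median is an assumption; it could equally mean picking the most central of the eight outputs.

```python
from statistics import median
from typing import List


def combine_latents(latents: List[List[float]], mode: str = "mean") -> List[float]:
    """Combine same-sized latents element by element."""
    if not latents:
        raise ValueError("need at least one latent")
    if any(len(lat) != len(latents[0]) for lat in latents):
        raise ValueError("all latents must have the same length")
    columns = zip(*latents)
    if mode == "mean":
        weight = 1.0 / len(latents)   # 0.125 when there are eight latents
        return [sum(col) * weight for col in columns]
    if mode == "median":
        return [median(col) for col in columns]
    raise ValueError("mode must be 'mean' or 'median'")
```

The median is less sensitive to one branch going off in a very different direction, which may matter with only eight samples.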
anyone here has information about fine-tuning? I have no idea how much more/less data it needs. the example I had from dreambooth dataset was like 10 images per object/class.
A coloring book page, white background, illustration of an pocket room Coloring Book page , Calm, Cozy, and Peaceful Spaces for Relaxation and Stress, ink drawing, line art, clipart, simple line illustration, Relaxing Corner, black and white,8K
Light gray and green landscape, pines and cypresses, temples, water ripples, cranes, mountains, water, Chinese style, gold foil silhouette, plan view side view, full length shot, photo realistic, soft shadows, no contrast, clear sharp focus, film texture , natural lighting, professional color grading, Leica lens
This is allinone controlnet model. The middle is the original (random pic from reddit), Left is canny alone, right is openpose alone. The model is a pony merge
https://github.com/xinsir6/ControlNetPlus/blob/main/requirements.txt this i would watch out for
that requirements.txt might break stuff
258 lib reqs? Holy moly lol!
No idea what this repo is. I just downloaded the model
gotcha
yeah just wanna make sure anyone that decides to give that shot is first aware that they should give that requirements.txt some serious consideration
no < or > just == across the board
odds are it will uninstall and reinstall every single thing you got lol
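A quick way to see how many hard pins a requirements file contains before installing it, as a sketch. The parser is deliberately simplistic (it ignores options, URLs and environment markers beyond stripping them), and `find_exact_pins` is a hypothetical helper, not a pip feature.

```python
from typing import Dict


def find_exact_pins(requirements_text: str) -> Dict[str, str]:
    """Return {package: version} for every line pinned with '=='."""
    pins: Dict[str, str] = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()      # drop comments
        if not line or line.startswith("-"):     # skip blanks and pip options
            continue
        if "==" in line and "===" not in line:
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.split(";", 1)[0].strip()
    return pins
```

Comparing the result against what is already installed shows which packages would be swapped out; installing into a separate virtual environment avoids touching an existing setup at all.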
Nice detail! Which workflow are you using?
Thanks o// It's two samplers with an Ultimate SD upscale with relatively high denoise settings.
guys, will a model understand me if i speak english or do i need to speak in danbooru tags
Depends on model, just keep it simple and do experiments
realCrystalPonyXL generating some good looking group portraits. Since this model is a 40%pony/60%nsfwsdxl mix, I had to blur some parts manually.
Why does this happen?
Every time I use a pony/XL model it creates this shit image and can't work out why!
looks very much like a vae error
What vae do you recommend?
What do you mean by that? Sorry haven't done it before
then its probably a vae error, but if you want share your prompt
test
Anyone know how to access / set the API in comfy UI?
Your prompt looks awesome
how does one use more than one CN? I used to use depth and canny (or lineart) together but in the new all in one I can't figure out how to do it.
Just put a second controller in line with the previous. Pipe your canny image into one and depth into the other and adjust their settings individually.
Oh and pipe the same cn model into both of them, just share the same node to prevent it from trying to double load the model
That being said, I am not 100% sure if the cn has an internal "automatically detect" mode for picking. I know in the guy's code, he pipes in an extra argument tensor like 0,0,1,0,0 or 0,1,0,0,0 and so on for picking
I saw no where to do that as before I would just daisy chain them.
are you talking about this node and just image out to image in of another of the same node?
like this if youre trying to use the same union cn for both depth and canny
that is the old way
no, it's the right way
oh you can use AIO if you want, it does the same shit under the hood. also, there isn't actual comfyui support for union yet, like i said, there are technically extra arguments that have to go into the sampling. it happens to work for the most part, for now, but it's not completely correct. depth seems to work really well though, but that might be the default within the model
Hopefully he brings in a connection point to allow more so depth and canny/lineart can be combined.
well i just showed them working together
but anyways, from his code:
added_cond_kwargs = {"text_embeds": add_text_embeds, "time_ids": add_time_ids, "control_type": union_control_type.reshape(1, -1).to(device, dtype=prompt_embeds.dtype).repeat(batch_size * num_images_per_prompt * 2, 1)}
this isn't implemented into comfy yet through any plugins
for instance, canny would be torch.Tensor([0, 0, 0, 1, 0, 0])
text_embeds =, is that the one that throws a cli error as missing in my loras?
depth is torch.Tensor([0, 1, 0, 0, 0, 0])
`# 0 -- openpose
1 -- depth
2 -- hed/pidi/scribble/ted
3 -- canny/lineart/anime_lineart/mlsd
4 -- normal
5 -- segment`
there's the list of 0 0 0 0 0 0 options
and he technically supports multicontrol in his code as well
so you can do stuff like 0, 1, 0, 1, 0, 0
that would enable both depth and canny
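The index list above can be turned into that selection vector with a small helper. This is only a sketch: plain Python lists stand in for `torch.Tensor`, the index mapping is copied from the list quoted above, and `control_type_vector` is a hypothetical name, not a function from the ControlNetPlus repo.

```python
from typing import Iterable, List

# Index mapping as quoted in the chat above.
CONTROL_TYPE_INDEX = {
    "openpose": 0,
    "depth": 1,
    "hed": 2, "pidi": 2, "scribble": 2, "ted": 2,
    "canny": 3, "lineart": 3, "anime_lineart": 3, "mlsd": 3,
    "normal": 4,
    "segment": 5,
}


def control_type_vector(names: Iterable[str]) -> List[int]:
    """Build the 6-slot selection vector; several names give a multi-control vector."""
    vector = [0] * 6
    for name in names:
        vector[CONTROL_TYPE_INDEX[name]] = 1   # KeyError on unknown names
    return vector
```

For example, `control_type_vector(["depth", "canny"])` gives `[0, 1, 0, 1, 0, 0]`, and an empty selection gives six zeros.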
iow, comfyui is the culprit?
no, the lack of implementing the special arguments is. it's not on comfyanon to implement every single proprietary thing for people, it's on those creators to make a plugin that meshes with comfyui
alright, thanks
here's the union creators git:
https://github.com/xinsir6/ControlNetPlus/
he has plans to support comfy, but its not done yet
looking through his code, it seems he takes that selection tensor value and "inject control type info to time embedding to distinguish different control conditions"
control_type = added_cond_kwargs.get('control_type')
control_embeds = self.control_type_proj(control_type.flatten())
control_embeds = control_embeds.reshape((t_emb.shape[0], -1))
control_embeds = control_embeds.to(emb.dtype)
control_emb = self.control_add_embedding(control_embeds)
emb = emb + control_emb
so where ever stuff like that happens in comfy, is where he'd have to either find a way to inject it or to monkeypatch it in
requirements.txt could use some refining... lol
Looks like it is coming to ComfyUI though. https://github.com/comfyanonymous/ComfyUI/commit/faa57430b0ff882275b1afcf6610e8e9f8a5929b
This is only the model code itself, it currently defaults to an empty
embedding [0] * 6 which seems to work better than treating it like a
regular controlnet.
TODO: Add nodes to select the image type.
oooof, i don't think i've ever seen one that long
oh cool, figured it wouldn't take long for someone to throw together
but my point earlier is that it shouldn't be on comfyanon to implement all this himself
yeah, brutal, and everything is == with a specific version
dependency suicide
for real lol
Breaking stuff intentionally
is it possible to have a pic of a character as reference to generate art about that character with one single image? without training a lora, hope someone can help
look into the ipadapter plus nodes
how can i use that in a1111?
you can't, you should switch to comfyui or stableswarmui anyway though
they're faster, more efficient with vram, have more features and are better supported
lmao the saviour we deserve
is that CFG 1?
2200 loras merged successfully.
Merging 2200 loras would be like mixing every color in a painter's kit. It would just average everything out and you'd be left with something slightly better or worse than the base model. Unless the unet actually increased in size, you'd just end up with every lora stepping all over the same concepts they all share
Averaging out the concepts they all know wouldn’t be 100% bad if it gives you a more variety in what turns out to be good in one shot prompts.
I’d be willing to play with it at least to see if it turned out to be a good sparring point. Then specific Lora’s to get you to the end goal.
Is it something you’ve uploaded?
But that's just it, it wouldn't. You'd have some anime cartoon lora, some photorealism lora, some boring lora, some fine art lora, etc etc, all competing for defining the concept of say a human woman, her proportions/poses, her skin texture, her hair, and so on
That's why I said it would likely be marginally better or worse than the base model. It might help a tiny bit with anatomy, since people focus on making waifus with most major loras, but then it will just overrestrict composition and potential variety
Only up to 1800 https://civitai.com/models/447902?modelVersionId=619849
I’ll give it a try when I get off work.
I’m interested to see how well it does when I throw random things at it
Cool! Let me know your honest opinion.
@tawdry current it's not meant to be a waifu model. It's more of an experiment in model merging and to see how many I can merge in before it breaks beyond repair. But, there are plans to make specialized versions eventually. You're right however. But that goes for all models, by training or merging, no one is truly making the underlying architecture for sdxl any better. They are just moving weights around. But still, people do it because they can or to see if they can. So what.
there is a spectrum between underfitting and overfitting
and a second spectrum between acquiring new knowledge and catastrophic forgetting of old knowledge
very few models are at the optimal point on each spectrum
and if they are further to the left, then you can safely add knowledge or change the fit
this will move rightwards on the spectrums
but if you have some headroom then that is fine
if you do not have headroom then it will not work as well
my personal view is that the vast, vast majority of deep learning models are overfit
and that we would be better off with slightly less capable models that have better generalising ability
Right, but my point is that loras, in general, are very heavily biased toward making portraits of pretty women. Doesn't matter the art style either, just take a look at civitai and you'll see what 95% of the thumbnails are. No joke, I just scrolled through ~150-200 rows and counted 30 thumbnails that weren't females. There are five per row. That means that female related loras account for 96-97% of what I saw...
So the bulk of the loras you used in that 2200 are likely waifu oriented loras, or at least have a large amount of their trainings oriented on them
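The 96-97% figure follows from the counts given above (150 to 200 rows, five thumbnails per row, 30 exceptions). A tiny sketch to check it, with `share_of_rest` as a hypothetical helper name:

```python
def share_of_rest(rows: int, per_row: int, exceptions: int) -> float:
    """Percentage of thumbnails that are not exceptions."""
    total = rows * per_row
    return 100.0 * (total - exceptions) / total
```

With 150 rows that is 720 of 750 thumbnails (96%), and with 200 rows it is 970 of 1000 (97%).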
Pony should be called SDXXXL
that being said, i did see an interesting lora that i want to try out now lol... https://civitai.com/models/521190/0682-hoarder-room
eww the base model is pony, nevermind
I'm not putting loras of waifus in there because they are boring. I do put portraits in every now and then to help with faces. But yeah, no waifus, no specific people, no graphic sex acts, just loras I think are cool.
i've lived in places like that at times in my life, they really do be like dat
I like how pretty much any lora or model you go to check it doesnt matter whatsoever what the thing specializes in based on example pics and description the community made pics below quickly devolve into nudes and orgies
for what its worth the model doesn't seem to be broken
I suspect you will take at least a subtle hit to image quality eventually and get some overfitted areas but that might be worth it
seems like a very efficient way to add more subject matter to the model
I think you may be on to a really good idea here
Nah this is Kohya's Hires fix with sampling that kicks in too late : D
Thank you! I was thinking the same thing. It's kind of a form of RLHF training but through merging instead of fine tuning.
yeah it has the advantage of being drastically easier which might make it worth the potential issues
if you tend to use loras with looser weights that will help
still dialing in settings, there is 100% a bit of an over cook on cinematic bars on the top and bottom
still figuring out prompting for it might be some of that as well
I've noticed that as well. I usually put 'framed' in the negative. Sometimes it helps, other times not so much.
it also likes to forget about the background and make a subject on a black background
What a madman. LMAO.
does seem to be able to upscale ok though
one thing i did notice a bit is that it would make table top models every now and then, thinking that maybe there might be one too many of those loras if there are any in there
would explain some of the behavior
Noticed that too. You kinda have to add it into the prompt or it gives a blank background often. Its done that since V1
Yeah there are a few lora based on miniatures in there. Once I get a better grasp on smoothing I will be starting over. The first few versions were done the old way, but I'm finding the DARE/Ties method to be a lot better.
Exploring a more "experimental" approach on Stable Diffusion training, I tried to "merge" many (44) of the most iconic art styles into a single LORA (SDXL).
Music by Daft Punk, Amon Tobin, and Radiohead.
What ideas would you like me to try next?
You can freely access this LORA through the link in bio.
#touchdesigner #stablediffusion #animation
Outpainting : D
How can i create different versions of a image with a different colorpalette ? (i am using A1111)
Denoising works quite well to create "similar" images but the color palette is most likely identical.
I was playing around with controlnet but havent got a good result. May i could need some advice for settings / prompt.
@sweet abyss Something similar to this?
yeah
30s film grain at work. 40s version too
Well, the easiest way is to use solid color as an input of img2img. In this case it was a simple light blue rectangle:
And of course control net to replicate the form of the original image
Oh, important part. High denoising strength (0.95+) with Exponential scheduler
control net does not work well for me, i have top view perspective with small details which turn out like simple forms in control net.
What model and what settings do you use? Can you share your image?
stable diffusion sdxl, dpm++2, sampling steps 20, 1368x2048
controlnet is useless once u go beyond a threshold of weird angles
top of head, below weird side angles, yoga poses, forget it,
early days folks, emerging tech, we are bleeding at the edge - the days of rendering a perfect consistent 3d view of a thing from any angle are still some time in the future
but its coming.
how do I get pony to stop making every generation so fuzzy and 'painty'? I like it sometimes but I wish I could have something more defined
Using comfyui and euler ancestral, but it also shows up with dpmpp
i noticed that with pony too
images are very monochrome and washed out
try adding "vivid colors, contrast and saturated" in the prompt
That seemed to help a lot actually, 4->7
i guess pony likes high cfg
i didn't mess with it much coz it only renders a lady in an empty room no matter what I prompt. 😄
I'm gradually learning more about iterative dare/ties merging. Merged clips of Halcyon 1.7 with Mangled Merge XL V2 at 50% ratio, no drop, at 2 iterations and then smoothed it by merging it with the SDXL base clip at 1 iteration 50% ratio. Getting some nice outputs with it.
really good stuff
So what exactly happens when i merge two checkpoints?
Thanks!
so if i merge 25 checkpoints and prune it does that mean i get a great checkpoint or a mess? merging is not the same as training i assume
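For reference, the simplest merge mode is usually a plain weighted sum of matching tensors, which is why it is not the same as training: nothing new is learned, the weights are only blended. A rough sketch under that assumption, using toy dicts of float lists in place of real checkpoint tensors (the names `weighted_merge`, `model_a` and `model_b` are made up for illustration):

```python
def weighted_merge(a, b, ratio=0.5):
    """Blend two checkpoints key by key: (1 - ratio) * a + ratio * b."""
    if a.keys() != b.keys():
        raise ValueError("checkpoints must share the same tensor names")
    return {
        name: [(1.0 - ratio) * x + ratio * y for x, y in zip(a[name], b[name])]
        for name in a
    }


# Toy "checkpoints" with a single two-element tensor each.
model_a = {"w": [1.0, 0.0]}
model_b = {"w": [0.0, 1.0]}
merged = weighted_merge(model_a, model_b, ratio=0.5)
print(merged)  # {'w': [0.5, 0.5]}
```

With many checkpoints at equal weight, every value gets pulled toward the average, which may be why a 12-way or 25-way merge tends to come out "different" rather than clearly better.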
Never tried it. I just merge everything already pruned. Worth looking into though. I'd be interested to see the outcome.
thats the thing i tried merging like 12 sd15 checkpoints and ...
idk
was the result better? i am not sure
it was different yes
You gotta mess with the weights. The merging method also matters. I'm still trying to wrap my head around attention gradients and masks. Then factor in iterations which I'm convinced is merging on steroids ... but can easily break the model.
Not even to mention block weights which is still fairly foreign to me.
Then there are new methods that haven't been implemented into comfy yet like the DELLA method. Which looks promising.
https://arxiv.org/html/2406.11617v1 It's all gibberish to me, but I'm working on understanding it. lol
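A toy sketch of the DARE/TIES idea mentioned earlier, in case it helps with the gibberish. It is not the Comfy node's implementation and it is not DELLA. DARE works on the delta (tuned minus base), randomly drops entries and rescales the survivors; the TIES-style step then picks a sign per parameter and averages only the deltas that agree with it. Flat float lists stand in for tensors, the function names are made up, and the magnitude-trimming step of TIES is skipped.

```python
import random


def dare(base, tuned, drop_rate, rng):
    """DARE step: drop each delta with probability drop_rate (< 1), rescale the rest."""
    keep_scale = 1.0 / (1.0 - drop_rate)
    return [
        0.0 if rng.random() < drop_rate else (t - b) * keep_scale
        for b, t in zip(base, tuned)
    ]


def ties_merge(base, deltas, weight=1.0):
    """TIES-style step: elect a sign per parameter, average the agreeing deltas."""
    merged = []
    for i, b in enumerate(base):
        column = [d[i] for d in deltas]
        sign = 1.0 if sum(column) >= 0 else -1.0
        agreeing = [v for v in column if v * sign > 0]
        mean = sum(agreeing) / len(agreeing) if agreeing else 0.0
        merged.append(b + weight * mean)
    return merged


rng = random.Random(0)
base = [0.0, 0.0, 0.0]
tuned_a = [1.0, -1.0, 2.0]
tuned_b = [3.0, 0.5, -4.0]

# drop_rate=0.0 corresponds to "no drop": the deltas pass through unchanged.
deltas = [dare(base, t, 0.0, rng) for t in (tuned_a, tuned_b)]
print(ties_merge(base, deltas))  # [2.0, -1.0, -4.0]
```

Where the two models disagree on direction (second and third values), the larger total wins and the losing delta is ignored instead of being averaged in, which is the main difference from a plain weighted sum.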
Pony models require different prompting setup:
wasset?
Needs a ton more training, but it's definitely on the right track. They're going to be very open with it and you'll likely see incremental updates to it pretty frequently. Think they stated that there's no reason for them to hide it behind closed doors when they want active community feedback on it
Wouldn't surprise me if you see monthly incremental updates to it
i just saw that... pure fire, that's why i'm excited
reminds me of the philosophy and approach that led Baldur's Gate 3 to historic greatness
Well as long as people give them the right feedback and it doesn't turn into some waifu creator echo chamber, the model will continue to improve.
But it's off to a kickass start. I've seen some wild prompt comprehension examples people have posted and it nails them
Usually
absolutely
i was really surprised how well it understands complex prompts
on a lot of things it matched 8b
definitely better than 2b on many things
Seems limited on samplers that work with it
yeah, rectified flow
i'll admit it's gonna be hard to convince me to make a full switch away from SDE samplers
espec with what i've been doing more recently with my nodes
These are fast changing times 🤷🏻‍♂️
wow it's a 6.8B parameter model
You've gotta go with the flow 😄
so it's way bigger than the 2B
yeah, much bigger
so Kolors and AuraFlow are both apache license 2.0
I actually can't see how SD3 can compete when such strong models have true open license
I mean presumably Stability AI will now use their massive recent investments to train a much stronger SD4
but I can't see how SD3 can compete in the current generation
if you don't have open license then you have to have a much, much stronger model
such as midjourney
I tried adding the ollama node and use llama3 to enhance the prompt. It's flattened my 4090
😮
lol
ah yeah I tend to rent 4x RTX A6000 48GB when I do image gen stuff
its running big upscalers on big batches that is the biggest issue
this is one reason i like cascade
i can generate directly at 2880x1728 with 4gb vram lol
wow yeah that's really nice
have you used hidiffusion?
its the sota for native gen res
i remember looking into it a while ago but never used it much iirc
Some strange eyebrow things going on here
oof
hidiffusion is best thing that came out in last 12 months in my opinion
if you combine hidiffusion with canny+depth control net
with cherry picking 1 out of every 200 gens
then you can push models really far with resolution and aspect ratios
an easier version of hidiffusion is Kohya Deep Shrink combined with PAG
most of my generations are keepers in my mind
i've found PAG messes up a lot of images tbh
PAG is subjective yeah
it fixes obvious problems, but creates really subtle ones that are hard to fix
creates noise patterns across the entire image
it can add a sense of sharpness
kinda low/mid frequency-ish ones
sometimes it looks like a vague pattern of dots, especially with faces/skin
yeah there are side effects
my goal lately has been to just get the sampling right on the first try
no controlnets, no ipa, no SAG, PAG, anything
no upscales, or refiners
What is aura and why is it 16GB?
A model, because they made it that large.
lol
lol
upscaling is my main interest personally
I spend 99% time working on upscaling and 1% working on actual image gen LOL
what can aura do, compared to the other models ?
More from Aura
does it require different usage ? or can i just pick it as model?
ok so is it worth it? is just sdxl? or sd3?
boobs are covered i see
i just installed kolors, like a new model weekly now 😄
I just want an image of some guns with some roses! 😄
That's because llama3 is turning "Guns and Roses" into a full band prompt.
Bob would not approve
hmm, looks like i can't just load the checkpoint from auraflow, does it need special nodes in comfyUI ?
Yes, you need to update. They were added in the past few hours
what node do i need?
one of these ?
You don't need to install it, just update Comfy
It's native
how do i update it ?
run update_comfyui.bat ?
Yes 😄
(it's a serious question with all the fragile code these days, ANYTHING could happen 🙂 )
It's portable, easily replaced if anything goes catastrophically wrong 😉
I update every day...sometimes more
I have a script that prompts me if I want to upgrade Comfy, and then if I want to upgrade custom nodes.
i try to not touch all those fragile things if not necessary 🙂
Never had any real problem with it. Can fix it or rollback if required.
come on.... is it good, is it worth the install?
It means the model didn't unload from the GPU. Set the alive time to zero to unload the model after it generates the prompt
It will stay in regular ram as long as you don't force windows to garbage collect by loading a ton more stuff. I use ollama and llama3 all the time with comfy on an 8gb gpu
After the first run, it's been quick 🤷🏻‍♂️
Well like I said, just set the alive time to zero. It will prevent that from happening again
What "alive time"?
that usually happens most often when using ancestral or SDE samplers, think it's even noted in the plugin git page
on the node there is a setting called "keep_alive"
Not on mine
oh thats why, youre using some other less commonly used one for it
https://github.com/stavsap/comfyui-ollama is the one im using
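For anyone whose node doesn't expose that setting: as far as I know, `keep_alive` is a field on Ollama's own `/api/generate` request, so the same thing can be done when calling Ollama directly. A rough sketch assuming a default local install on port 11434 and the `llama3` model; the helper names `build_request` and `enhance_prompt` are made up for illustration.

```python
import json
import urllib.request

# Assumes Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="llama3", keep_alive=0):
    """Build a generate request; keep_alive=0 asks Ollama to unload the model afterwards."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "keep_alive": keep_alive,
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def enhance_prompt(prompt):
    """Send the request and return the generated text (needs a running Ollama server)."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["response"]
```

Calling `enhance_prompt("guns and roses")` should return the rewritten prompt and leave VRAM free for the image model, at the cost of reloading llama3 on the next call.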
Failed gens? 😉 🤣
art ! 🙂
bro those are sick, not fails
i only get gibberish out of that aura thing though
those images are from further steps within the workflow
does it have to be a certain sampler / scheduler ?
yes
same as sd3 afaik
It's not so good at generating a prompt by default though, you need to coax it out
which one would that be ?
the only difference might be whatever sysprompt they have in the code that isn't exposed, which you can manually change. also, in the pack i use, you can just use the advanced node and define your own sysprompt right there (could also be settings like temp and such)
which sampler / scheduler ? @tawdry current
you gotta remember, the addon isn't the one doing the lifting, it's ollama that it makes calls to. so as long as they have identical settings for things like temp, context, seed, etc, they'll produce identical results
sd3 uses sgm_uniform for noise and euler/dpm++2m for samplers, well any sampler as long as it isn't ancestral or SDE
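That rule of thumb ("anything that isn't ancestral or SDE") is easy to turn into a quick filter. A small sketch; the sampler names are the ones ComfyUI's KSampler dropdown uses as far as I know, the list is only a sample, and the helper name is made up.

```python
def avoids_extra_noise(sampler_name):
    """True for samplers that are neither ancestral nor SDE variants."""
    lowered = sampler_name.lower()
    return "ancestral" not in lowered and "sde" not in lowered


# A sample of sampler names, not the full list.
candidates = [
    "euler",
    "euler_ancestral",
    "dpmpp_2m",
    "dpmpp_sde",
    "dpmpp_2m_sde",
    "uni_pc",
]
usable = [name for name in candidates if avoids_extra_noise(name)]
print(usable)  # ['euler', 'dpmpp_2m', 'uni_pc']
```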
gonna try kolors on comfy
I should stop abusing free HuggingFace Spaces
they are very generous but I feel guilty using them too much
it works really well, im playing with kolors right now
https://github.com/kijai/ComfyUI-KwaiKolorsWrapper is the addon you use for it
just follow his guide and use the quantized encoders
this kolors gen is the best image I have ever made
I wish SD3 could do this
its literally midjourney only
Uni_pc / Simple
i will try that, too
VAE
yeah probably the vae
Using the VAE in the model?
no, what do i have to do and where do i get what ? 🙂
Nothing, it's in the model
there is a way of baking VAE
it was not common at first
but now most models bake VAE
yes, but show the red wire and where it's coming from
The vae decode
it should look like galaxy's
No need to use the advanced ksampler
that's it, there's the problem, you mean ?
Not sure if Auto CFG will work with it
Try bypassing your pipe, in case that's not working
The rest looks correct. Not sure if that ksampler advanced is messing with it, but shouldn't do.
i try without the pipe
What's your empty latent image node?
ha, now i know !
that one is taking a different vae
stupid me 🙂
hmm, but it shouldnt matter?
since i make an image out of it ?
and later on encode it again
(with the correct VAE)
but i try remove the encode/decode part
Sounds like a waste of processing time
Perhaps 🤷🏻‍♂️
trying out now 🙂
i dont have to
fancy noise?
there is a one in a trillion trillion trillion chance
that the empty latent will randomly be a real picture
and so the VAE decode will work 👍
that would be so cool ! 🙂
🤣
You didn't bypass the pipe, and try a basic ksampler, without the random additions