#🆕|sd3


halcyon yarrow
#

so im guessing you input the image + this prompt to generate the bottom image?

gusty trail
halcyon yarrow
#

i don't get it

gusty trail
#

I made a node to automatically patch right or bottom depending on the input image

halcyon yarrow
#

that's different than what you were showing earlier right? I was expecting to see the same image of the couple as the top frame and then a new image as the interpretation as the bottom frame, like the knight and the princess

#

can you do the 3x frames like the knight and princess one where its the same original top image and new bottom image?

gusty trail
#

Oh I see

mortal mesa
#

visual identity design in context lora

gusty trail
#

It is ok with that

halcyon yarrow
#

are you using xiao's workflow to input an image and used the visual identity lora?

gusty trail
#

Just need a extra prompt for the new scene

mortal mesa
#

its embedded, i modified a workflow Klinter made

halcyon yarrow
# gusty trail Just need a extra prompt for the new scene

An energetic scene of a fiery chili pepper character and a frosty popsicle character interacting in harmony: the chili pepper radiates a warm, blazing aura while the popsicle exudes a frosty chill, their contrasting energies blending in the center to create a swirling mist of steam. The backdrop is a fantastical fusion of molten lava and glittering frost, with small flames and snowflakes dancing in the air around them. Their expressions are playful yet balanced, symbolizing the coexistence of opposites in a vibrant, dynamic moment.

halcyon yarrow
mortal mesa
#

exactly that

halcyon yarrow
#

cool you guys are really taking it one step further

gusty trail
#

The edge is copied from above

halcyon yarrow
#

lol oh wow that's really cool well done it follows the prompt really well too

#

id give it a high score for adherence

gusty trail
#

Using another llm generated prompt

halcyon yarrow
#

i like the first one better, anyways thats really cool so @gusty trail that's not the "try on" lora? what's the name of that workflow?

gusty trail
#

The try on lora is very specific for try on

#

You just need to load this lora and do the prompt

#

Or any other examples' lora

halcyon yarrow
#

but you need a specific workflow to do the top mask/ bottom mask right?

halcyon yarrow
#

product-design lora + mask workflow right?

gusty trail
#

This is the workflow

#

yes

muted dove
halcyon yarrow
#

lol yikes

gusty trail
gusty trail
muted dove
gusty trail
halcyon yarrow
#

i think the try-on lora demo'ed replacing the face

gusty trail
#

This try on lora changed too much tbh

halcyon yarrow
#

yeah its not as good as inpainting methods

gusty trail
#

My friends used ic-lora + inpainting which seems pretty good

#

And I think you could actually do the second pass with inpainting to add or modify some generation

mortal mesa
#

XiaoZhi do you happen to know what this one is? TTPlanet/Migration_Lora_flux

gusty trail
mortal mesa
#

oh duh i see now

halcyon yarrow
#

i think of the in-context lora set my favorite is still the filmboard

#

set up a complex prompt for all 3 scenes and see how well the model can adhere to it, i wish they would make this lora for SD3

#

speaking of SD3 I feel the community hasn't embraced it as much as I expected, I'm pretty much the only person posting SD3 content on civit, it's like really dead in there, i never see SD3 content on my feed either, like ever

bitter hearth
#

TTPlanet has nice upscaling nodes also

#

caption tiles, and a decent splitting node also

halcyon yarrow
#

speaking of which @sullen moss I tried your suggestion for that 4xFFHQDAT upscaler and while it is undeniably better in quality than the one i settled on I was unable to adopt it bc it took 45 seconds to upscale an image with it

sullen moss
halcyon yarrow
#

i settled on BSRGANx2, which delivers solid results in 10 seconds, I also liked RealESRGAN_x2plus but the other one edged it, had to do a bunch of side by sides to really settle on which one looked better

#

lol yeah i don't make art in an industrial scale either

#

the way i see it is Im dedicating my precious GPU time, which it could be spending rendering images, on upscaling those images that I'll be sharing online for the enjoyment of others, so I don't think its crucial to spend another 30 seconds on increasing the quality by that much

#

like if it was +5 or even +10 more seconds id consider it but at 30 seconds, I could've made 2 images in that time, its not that important especially since civit is going to downscale the thumbnails anyway

sullen moss
#

The main thing is that there’s a choice, and everyone can find what suits them best. That’s the beauty of open source.

halcyon yarrow
#

yeah you're right, i ended up on this website: https://openmodeldb.info/?t=arch%3Aesrgan it was super handy finding what i need

#

i could just set up the filters and the scale to suit my needs, download all the results and just run them and compare

#

if i set it to Faces and 2x that's pretty much the only result too

bitter hearth
#

there are models ranging from really fast to really slow now

#

since you liked RealESRGAN maybe you would like RealPLKSR

halcyon yarrow
bitter hearth
#

there's more around the internet

halcyon yarrow
#

oh cool okay i thought that website was the end-all be-all lol

bitter hearth
#

not rly although it is good

halcyon yarrow
#

i tested my current upscaler of 2xBSRGAN vs 4xPurePhoto-RealPLSKR and it looks darker and fuzzier/blurrier/softer with the 4x

mortal mesa
#

Shuttle 3 832x1216 2 step ----> 2396x3501 via 6 panels 4xNMKD-Siax and 2 step per panel, the TTP_Toolset workflow

halcyon yarrow
#

that looks way better

#

i like how it didn't just upscale but it enhanced

#

how fast is it tho?

mortal mesa
#

fast 2 steps!

#

14 steps total

halcyon yarrow
#

yeah but ive tried "4 step" flux models before that are still just as slow as the counterparts, like how fast is it in seconds?

#

so the left is the original and the right is the upscaled right?

mortal mesa
#

yes

#

i only have a 2080TI soooo....

halcyon yarrow
#

what does it say in the end like in regards to how much time it took

#

this line: Prompt executed in 63.24 seconds

mortal mesa
#

Prompt executed in 311.40 seconds

#

that was first model load also

halcyon yarrow
#

lol pfft

mortal mesa
#

might be faster now

halcyon yarrow
#

yeah but like 40% faster at best, that's still a good 150 seconds even if we half the time, and you got 11gb whereas i got 8 so for sure mines wouldn't be that fast

mortal mesa
#

toss me a hard to upscale prompt 🙂

halcyon yarrow
#

score_9, score_8_up, score_7_up, ((solo)), ((adult)), cinematic, best quality, 1girl,, pale skin, messy hair, short hair, auburn hair, freckles, freckles on chest, green eyes, dark makeup, tradwife, sundress, field, flowers, outdoors,

#

original left, remix right

mortal mesa
#

lol ill go swipe some somewhere else, comma salad

halcyon yarrow
#

lol

#

field of flowers in the prompt is usually a good upscale test bc there's usually lots of tiny detail in the distance that's very easy to judge

mortal mesa
#

mmm good idea

halcyon yarrow
#

have you tried InstantIR?

#

i personally felt betrayed by how long it took regardless of the quality simply bc it's name implied it was going to be fast, like instant fast lol

mortal mesa
#

i havent, there was something else like that i wanted to try

#

man, i forget

craggy crest
#

bilateral symmetrical dichotomy

halcyon yarrow
#

@mortal mesa so explain to me this setup:

Shuttle 3 832x1216 2 step ----> 2396x3501 via 6 panels 4xNMKD-Siax and 2 step per panel, the TTP_Toolset workflow

what's shuttle 3?
what's 6 panels?
so the upscaler you used is 4xNMKD-Siax but it doesn't work with the built in comfyui Load Upscale Model stuff so I need the TTP_toolset custom_nodes to make that upscaler work?

craggy crest
halcyon yarrow
#

so he's using shuttle diffusion 3 to generate 832x1216 @ 2 steps and then upscaling it to 2396x3501 using 4xNMKD-Siax and what's the 2 step per panel and 6 panel thing about? is that a special node?

mortal mesa
#

the time i said isnt going to be accurate either cuz i ran this whole section im going to excise out

halcyon yarrow
#

it looks so much better

mortal mesa
#

and each panel gets run through florence for a prompt

#

i didnt make it, just adapted it
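
A minimal sketch of the tiling idea described above: the upscaled image is cut into panels, and each panel gets its own short sampling pass. The `tileRects` helper, the 2x3 grid layout and the 64px overlap are assumptions for illustration; the TTP_Toolset nodes may split the image differently.

```javascript
// Hypothetical helper (not part of TTP_Toolset): split an image into a
// cols x rows grid of panels that overlap their neighbours by `overlap` px.
function tileRects(width, height, cols, rows, overlap) {
  const tileW = Math.ceil(width / cols);
  const tileH = Math.ceil(height / rows);
  const rects = [];
  for (let r = 0; r < rows; r++) {
    for (let c = 0; c < cols; c++) {
      // Grow each panel by the overlap, clamped to the image bounds.
      const x0 = Math.max(0, c * tileW - overlap);
      const y0 = Math.max(0, r * tileH - overlap);
      const x1 = Math.min(width, (c + 1) * tileW + overlap);
      const y1 = Math.min(height, (r + 1) * tileH + overlap);
      rects.push({ x: x0, y: y0, w: x1 - x0, h: y1 - y0 });
    }
  }
  return rects;
}

// 2396x3501 is the upscaled size from the chat; 2x3 = 6 panels is assumed.
const panels = tileRects(2396, 3501, 2, 3, 64);

// 2 steps for the base image plus 2 steps per panel.
const totalSteps = 2 + panels.length * 2;
console.log(panels.length, totalSteps); // 6 14
```

The step count lines up with the "14 steps total" mentioned earlier: 2 for the base render plus 2 for each of the 6 panels.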

halcyon yarrow
#

lol damn that's some level of care going into upscaling thats next level

bitter hearth
#

seemed to beat Supir

halcyon yarrow
#

yeah from that screenshot I posted (not sure if you saw it) it really did seem like the absolute best hands down

mortal mesa
#

supir was way too heavy for me but ya could do some nice stuff

bitter hearth
#

the advantage of Supir currently though is that its been broken apart in ComfyUI now by Kijai so you can swap certain bits in and out

mortal mesa
#

i should revisit it, i was talking about when it first came out with zero optimizations haha

halcyon yarrow
#

i'm replying to the comparison image of the upscalers if you wanna look at it

bitter hearth
halcyon yarrow
#

left SUPIR right InstantIR

mortal mesa
dusky thistle
mortal mesa
#

the upscaled chest and spider kinda looks bad full size :/

halcyon yarrow
#

i tried PMRF but i have cuda 12.1 not 12.4 installed so im gonna put it off for now

bitter hearth
#

had some coloured artifacts

dusky thistle
mortal mesa
#

ya im messing with noise and denoise levels, i have to say i am pretty impressed with the tiling method, better than what ive tried

craggy crest
halcyon yarrow
#

@mortal mesa I modified that workflow you're using a few ways

  • replaced flux shuttle with sd3.5 large fp8, I could probably optimize it further by trying turbo instead of large but i found large is faster than turbo when using the AIO model bc it comes built in with fast clip models
  • removed the image generation part, converting it to a purely upscale workflow
  • replaced florence with BLIP for speed

I got it down to 117 seconds and the results are outstanding

#

Prompt executed in 98.99 seconds in subsequent runs

#

original image left, second one is pre-texture detailer, third is post-texture detailer

mortal mesa
#

sooo speaking about the texture detailer area, look where the negative prompt is coming from, it doesn't seem right, i dont understand it

#

but ya ill be reusing that middle section also

halcyon yarrow
#

yeah i noticed that, im replacing all that jazz with clownshark

dusky thistle
mortal mesa
gusty trail
civic trail
fast pendant
#

Hello everyone

#

how can I use SD3.5 large for sketch to image?

fast pendant
#

?

civic trail
civic trail
dusky thistle
cunning lintel
dusky thistle
halcyon yarrow
#

@dusky thistle check out this artifacting, using sd3 large q8

#

i think its really interesting how the sampler will never produce an image that is objectively flawed its only subjectively bad

dusky thistle
#

wow

#

what's it look like with euler in ksampler, or the new euler_ancestral in ksampler

halcyon yarrow
#

i could add that to the queue but I gotta wait for res_3s to try a pass at it first

#

that was using res_2m

#

and this is the source image not that it matters bc i didn't use img2img, and this is the prompt that generated that image:
hideous witch, by Sergei Parajanov, <lora:aidmaMJ6.1_v0.3:1> <lora:aidmaImageUpgrader:1>,

#

same seed, same exact parameters for both, res_2m left, res_3s right

dusky thistle
#

idk if you've tried rk_exp_5s yet, it's a little slower than res_3s but another step up in quality espec with SD3.5M

halcyon yarrow
#

res_3s understood the assignment when given that prompt for the "hideous witch", instead of making some crazy wizard it actually did a witch and i wouldn't say she's hideous (everyone is beautiful in their own way) but shes quite shocking to look at lol

dusky thistle
#

do the artifacts show up with regular SD3.5L (or do you not have the vram to test)

halcyon yarrow
#

you can still see some of the same patterns the wizard showed as far as texture and artifacting, ill def try rk_exp_5s and compare it

#

i dont have the full 3.5L unpruned to test I just have the fp8 AIO, the q8 and q8 turbo

dusky thistle
#

can you screencap the clownshark params and paste the prompt? i could give it a shot over here, i have the full

halcyon yarrow
#

btw i don't choose that layout, its just how i programmed it, i feel like i should spend some time to program the layout better

#
if (!nextItemCloned.workflow.oldWorklowUsed) {
    nextItemCloned.rk_type = _.sample(['rk_exp_5s', 'res_3s']);
}

here's what I'm going to do moving forward, in situations where I ask to use the new workflow (shark) instead of always picking res_3s it'll randomly pick one of those 2, any other high quality and slow sampler I could add to this list ?

dusky thistle
#

it's the length of your prompts

#

SD35 is really really weird about it

#

never seen a model do it but if you go past 72 tokens it starts going downhill, espec with the neg conditioning

halcyon yarrow
#

ah makes sense

dusky thistle
#

the truncate conditioning option in shark is there just to safeguard in case you go over and are seeing problems, it cuts it down to the size for a one chunk embed

halcyon yarrow
#

so it really does make sense to truncate SD3.5 negative prompts right?

dusky thistle
#

it goes downhill with both

#

but it's worse with the neg

halcyon yarrow
#

but the truncate option does so for both positive and negative right?

dusky thistle
#

yeah

halcyon yarrow
#

ill just add some code on my end to truncate the negative to 77 tokens for SD3.5 and maybe flux too?

dusky thistle
halcyon yarrow
#

actually for flux i'm just blanking it out

dusky thistle
#

72 tokens for whatever reason is the limit

#

i still haven't gotten around to looking at what's going on whatsoever, but once you hit 73 input tokens, truncate changes the output

#

72, it's the same

#

so it's probably moving onto the next block at 73 for whatever reason

#

yeah i'm just using a blank neg for sd35 myself

halcyon yarrow
#
if (isSd3Model) {
    // truncate to 250 characters (a character cutoff, not a token count)
    nextItem.negative_prompt = nextItem.negative_prompt.substring(0, 250);
}

alright truncating in place
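
A rough sketch of a token-budget version of that truncation, since the limit discussed above is in tokens rather than characters. Everything here is an assumption for illustration: `approxTokenCount` counts words and punctuation marks as one token each, which only approximates what a real CLIP or T5 tokenizer would produce, and `truncateNegative` is a hypothetical helper name. The default budget of 72 comes from the number quoted in the chat.

```javascript
// Rough stand-in for a real tokenizer: each word or punctuation mark
// counts as one token. Real BPE tokenizers will often count more.
function approxTokenCount(text) {
  const matches = text.match(/[A-Za-z0-9_]+|[^\sA-Za-z0-9_]/g);
  return matches ? matches.length : 0;
}

// Hypothetical helper: keep whole comma-separated tags until the
// approximate token budget would be exceeded, so no tag is cut in half.
function truncateNegative(prompt, maxTokens = 72) {
  const tags = prompt
    .split(",")
    .map((t) => t.trim())
    .filter((t) => t.length > 0);
  const kept = [];
  let used = 0;
  for (const tag of tags) {
    // +1 accounts for the comma that separates this tag from the previous one.
    const cost = approxTokenCount(tag) + (kept.length > 0 ? 1 : 0);
    if (used + cost > maxTokens) break;
    kept.push(tag);
    used += cost;
  }
  return kept.join(", ");
}

console.log(truncateNegative("bad quality, worst quality, blurry face", 5));
```

Cutting on tag boundaries avoids leaving a half word at the end of the prompt, which a plain `substring` can do.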

#

well i figured sd3.5 isn't distilled and it's always handled negative prompts so i dont want to treat it like flux and just blank it out

dusky thistle
#

i haven't done comprehensive tests or anything, but the ones i did do... seemed to degrade with negative prompts of any kind

halcyon yarrow
#

ive seen that for sure on flux

dusky thistle
#

it def isn't like it was with cascade or sdxl (or sd15) where something like "bad quality" actually did generally lead to a better image

halcyon yarrow
#

where any negative prompting heavily reduces the quality of the image, even when using a dedistilled model

#

i used a really long negative prompt for this one score_6, score_5, score_4, source_pony, source_anime, pink nipples, source_furry, source_cartoon, censored, deformed hands, deformed fingers, extra fingers, missing fingers, extra limbs, missing limbs, bad eyes, ugly face, blurry face, wrong anatomy, crossed eyes, missing leg, missing foot, unattached hand, deformed, deformed face, bad teeth, ugly teeth, low quality, bad quality, worst quality

dry wave
#

negative prompting is not necessary for cfg, in fact it's not even part of the original cfg implementation

halcyon yarrow
#

awww dude - Value not in list: sampler_name: 'rk_exp_5s' not in (list of length 28) lol c'mon so thats a new sampler then? gotta do the old git pull? are there any hidden bombs i should be aware of before i pull it?

dry wave
#

cfg itself is a hack, but it's necessary for diffusion models to reach good performance

#

negative prompting is a hack of a hack

halcyon yarrow
#

wow didnt know that

dry wave
#

if you can get rid of it, perfect. If you need it, use it, if the model works without that's great

halcyon yarrow
#

i much rather have cfg than have a distilled model where I get a range of 1 to 1.8 and i lose control over the aesthetics of the image

dry wave
#

I prefer cfg, too, but I talk about negative

#

negative prompts are not part of cfg, they are an optional feature

#

they might work, or they might make things worse. People often misinterpret how negative prompts work and might overuse them or use them wrongly

halcyon yarrow
#

yeah i agree with that statement, heck it took me a while to grasp the concept of negative prompts

#

not my creation, just a fun share, created using flux

zenith terrace
halcyon yarrow
#

sd3.5_large_fp8...aled | 🌱 4224251490 | 🦶 29 | 🦮 3.5 | cfg_scale_alt 3.5 | 🧠 sd35_VAE | 🎤 dpmpp_2m | 🕦 sgm_uniform | 🗓 11/16, 11:33 AM | ⏱️ 107s
(ignore the sampler/scheduler its just using res_2m) my only gripe with it is that vertical line on the left side

halcyon yarrow
#

@short thicket it makes sense to see the mangled model performing on par with the flux destill model, i wouldn't say it's any faster

civic trail
craggy crest
halcyon yarrow
#

last chart I promise, i removed the destilled models, and the SDXL model at the bottom and included all results regardless of sample size or percentile

halcyon yarrow
#

weird how medium is taking the same time for me as large right? you see how red and pink are like inline with each other?

craggy crest
craggy crest
halcyon yarrow
#

oh wow interesting so you're seeing the same thing too, i feel like that's kinda bullshit, that gives me 0 incentive to ever use medium

#

don't you find that's weird considering one is a much smaller model?

craggy crest
halcyon yarrow
#

oh i didnt know that

craggy crest
#

medium makes a nice refiner, too

halcyon yarrow
#

so it's not the same training data just distilled?

craggy crest
halcyon yarrow
#

that changes things good to know

#

cfg 7, cfg 6, then 5 then 4. it's clear lower cfg improves image quality I just really liked the scene at cfg 7, im doing another run at 3.5 to see what that looks like

craggy crest
halcyon yarrow
#

oh cool ive been playing with the idea of trying to make a performant one, I was messing with the florence 2 yesterday, i think Kagi or NeonNinja shared it

#

i could get it down to 100s with decent results, Ill try that one and modify it to start with Load Image rather than 3.5L and see what kind of times I can get with it

craggy crest
#

sounds good :)

mortal mesa
halcyon yarrow
#

lol wow talk about upscale

#

thats fun to look at, you can even see inside the caves

#

@mortal mesa how long did it take you to make that image?

mortal mesa
#

normal time, nothing special, what i was doing yesterday with slightly different settings

halcyon yarrow
#

so that's the 6 panel workflow with florence and the NMKD upscaler right?

mortal mesa
#

mmm i was swapping upscale models, i forget what was used on that one ide have to load it, ide bet 4x ultrasharp, but ya that WF. raised denoise and lowered noise injection

craggy crest
#

@mortal mesa you really need to animate that, that's got huge potential

mortal mesa
hallow lion
craggy crest
short thicket
halcyon yarrow
# short thicket which one is mangled and which is acorn?

Did you see the chart and skip my comment? It’s the one “on par with the destill” model in other words the red line is clearly destill being the slowest model of the group and the orange line that touches it is mangled. ChatGPT made the chart don’t blame me on the color selection lol

short thicket
halcyon yarrow
#

Lol it’s fine, I was upset about not being able to tell either and I had to reason my way to figure it out

short thicket
#

Have you tried it past 3?

halcyon yarrow
#

I haven’t tried it past 3 LoRAs

#

I wish I could convey the sample size per entry in a chart

#

I feel like having 100+ entries for a given model and LoRA count would be more accurate than something with 1 entry

civic trail
craggy crest
halcyon yarrow
#

@short thicket okay the time it took me from when i said that to when I was happy with a result is 40 minutes so you could almost say i spent 40 minutes making this chart (for fun of course)

#

the green means a confident number bc there's enough tests done for that scenario that the indicator is a good measure of actual average times
the yellow means not so much bc there's between 10 and 100 tests done
and the red means take it with a grain of salt bc less than 10 were done so it might be an edge case outlier as far as actual expected times
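
The colour coding above can be sketched as a small lookup. The function name `confidenceColor` is hypothetical, and treating exactly 100 tests as green is an assumption, since the message only says yellow is "between 10 and 100".

```javascript
// Hypothetical helper: map the number of recorded tests for a scenario
// to the chart colour described above.
function confidenceColor(sampleCount) {
  if (sampleCount < 10) return "red"; // too few runs, may be an outlier
  if (sampleCount < 100) return "yellow"; // some data, not yet solid
  return "green"; // assumed cutoff: 100 or more runs
}

console.log(confidenceColor(3), confidenceColor(42), confidenceColor(250));
```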

bitter hearth
#

how much vram do you have and are you sure you didn't fill your vram during generation?

#

I ask because that changes the numbers

#

if vram filled

short thicket
bitter hearth
#

it does yeah especially on ada/hopper

#

but it does on any gpu

halcyon yarrow
#

in my experience the fp8 version for SD3 does run faster, in fact here's the chart for it

craggy crest
#

@halcyon yarrow you are putting way too much time into this to just post it here. you should consider making a video tutorial

halcyon yarrow
#

but i don't think the fp8 is faster bc of the pruning method but rather bc it forces me to use the built in clip, which uses a lower quality set than the triple clip setup i normally use

halcyon yarrow
craggy crest
#

that'd be good too

halcyon yarrow
#

im also genuinely curious so im doing it for myself and sharing with others

craggy crest
#

it'll get buried here and lost.

bitter hearth
#

gguf is slower than fp8

#

particularly on GPUs that have native fp8 matmul

#

neither of these are pruning

#

pruning is something a bit different

halcyon yarrow
#

quantizing something isn't a form of pruning? i think so

bitter hearth
#

speaking of which, 3B pruned flux came out just now https://huggingface.co/TencentARC/flux-mini

halcyon yarrow
#

pruning is to selectively remove bits from the model, by quantizing you're selectively removing the precision bits

craggy crest
#

when you quant something, what do you do?

halcyon yarrow
#

you're rounding the precision on the weights right?

craggy crest
#

when you prune, you actually remove data

halcyon yarrow
#

if i'm wrong then civitai is wrong bc they call different options like bf16 and fp8 as different pruning types

bitter hearth
#

quantisation is converting floating point numbers to less precise formats, whereas pruning is actually removing weights from the calculation
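
A toy sketch of that distinction on a plain array of numbers. Both functions are illustrative assumptions: `quantize` snaps each weight to a coarse uniform grid (real fp8 or GGUF formats are more involved), and `prune` shows simple magnitude pruning, which is only one form of pruning and not necessarily what any particular released model used.

```javascript
// Quantisation: every weight is kept, but each one is stored less precisely.
// Here values are snapped to a uniform grid with `levels` steps up to max |w|.
function quantize(weights, levels) {
  const maxAbs = Math.max(...weights.map(Math.abs));
  if (maxAbs === 0) return weights.slice();
  const step = maxAbs / levels;
  return weights.map((w) => Math.round(w / step) * step);
}

// Magnitude pruning: weights below a threshold are removed from the
// calculation (set to zero here); the surviving weights keep full precision.
function prune(weights, threshold) {
  return weights.map((w) => (Math.abs(w) < threshold ? 0 : w));
}

const weights = [0.52, -0.27, 0.13, 1.0];
console.log(quantize(weights, 4)); // all four weights survive, rounded
console.log(prune(weights, 0.2)); // the smallest weight is dropped
```

In the quantised output every weight still contributes, just rounded; in the pruned output one weight no longer takes part at all.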

halcyon yarrow
#

i agree when you prune you actually remove data, by rounding a number you're effectively removing data

craggy crest
halcyon yarrow
#

a full unpruned 22gb model has had its data removed to become a q8 model at 11gb

craggy crest
#

(if it's flux, it doesn't need 4 gig of the padding anyway)

halcyon yarrow
#

i never said it did right?

craggy crest
#

but that's what pruning is

#

the weights are actually removed

halcyon yarrow
#

i think that's a form of pruning

#

by reducing the number of weights we can also call that distilling

craggy crest
#

pruning is when you cut branches off a tree. quant is when you put a ring around the base and don't let the roots grow out very far

bitter hearth
#

these terms mean specific things its not actually debatable

halcyon yarrow
#

look all i'm saying is if I'm using the wrong terminology take it up with CivitAI.com bc they're a really big player in the industry and they're using that terminology to refer to these different methods like fp8, bf16, q8 etc

#

i think pruning is a general term that can refer to distilling (to reduce weights) or quantizing (to round weights), with both methods you're effectively removing data

craggy crest
craggy crest
halcyon yarrow
#

its not a big deal, let's just agree to disagree 🤝

mortal mesa
#

call it pruning the precision

halcyon yarrow
#

There we go that’s a good one, speaking of pruning I wanna try that Flux Mini 3b Neon showed off, looks bad ass

bitter hearth
#

I think I badly misunderstood this applet

craggy crest
#

you're not supposed to feed it dinner

bitter hearth
#

I do like the composable loras on this site

#

separate loras for background and characters etc

halcyon yarrow
#

12 seconds to generate using Flux Mini, this is the default prompt ComfyUI puts in

#

@bitter hearth have you tried it on comfy yet?

#

18 seconds, 40 steps, cfg 4.5 ddim beta

bitter hearth
#

gonna download it now

halcyon yarrow
#

67 seconds at 40 steps cfg 4.5 ddim beta

craggy crest
#

flux mini seems to be struggling

bitter hearth
#

this was SDXL on the same prompt, earlier in the year

halcyon yarrow
#

lol yeah i was just gonna post that

#

this is with clownshark sampler

#

ksampler just sucks thats what it is

bitter hearth
#

clown stuff is just so much better yeah

shell bloom
halcyon yarrow
#

that was 170 seconds too

#

its a diffusers based model so you cant use load checkpoint

#

you gotta use load diffusion model and then load it up with the clips and standard flux vae on the side

shell bloom
#

ahhh ok thanks

bitter hearth
#

this part of comfy is confusing

#

do I put the model in unet folder or diffusion_model folder

halcyon yarrow
#

put it in the diffusion_model folder

bitter hearth
#

ok thanks

halcyon yarrow
#

you can just load my workflow if you wanna try it

shell bloom
craggy crest
halcyon yarrow
#

sometimes it doesn't copy the metadata when i just copy the image through the clipboard so here's the fileupload

bitter hearth
#

LOL

halcyon yarrow
#

im trying res_2m see what times i get, i think i can safely bring it down to 20 steps too

bitter hearth
#

but there is no nice UI unless Matteo's project does well

halcyon yarrow
#

i don't like diffusers format, it just makes things more confusing and mostly bc my whole codebase is written around the checkpoints folder, i dont support models in that diffusion_models folder

bitter hearth
#

flux dev is ok at 20 steps yeah it will be worse quality than 40 steps but will still give an ok image

halcyon yarrow
#

im gonna have to convert it to SD format i have a script for it later

craggy crest
bitter hearth
#

same TBH

#

his IP adapter stuff is excellent

halcyon yarrow
#

126 seconds, res_2m, 40 steps

bitter hearth
craggy crest
bitter hearth
#

this is why there is a long delay for stuff to get ported from diffusers to comfy

halcyon yarrow
#

it lost coherency at 20 steps, 66 seconds but the bottle went missing

bitter hearth
#

its a bit tricky

#

if you are below 40 steps probably want eta = 0

#

or very low eta

craggy crest
halcyon yarrow
#

have you seen the ComfyUI wanna be UI that's specifically for diffusers? I saw it on a video recently it looks cute

halcyon yarrow
bitter hearth
halcyon yarrow
#

what about eta 0, res_3s and 15 steps? lets see...

halcyon yarrow
bitter hearth
#

going above order 2 requires a great many steps

halcyon yarrow
#

that stuff is above my level, i dont really get what eta is doing i just understand its a factor of noise

bitter hearth
#

eta = 0 means no extra noise is added each step

#

if eta is anything above zero then extra noise is being added

craggy crest
#

you add the noise for a number of reasons

halcyon yarrow
#

eta 0, 20 steps, 67 s, res_2m

bitter hearth
#

mostly keep s_noise at 1.0 its quite spicy
on some models s_noise 1.03-1.07 can be a nice detail boost

craggy crest
halcyon yarrow
#

clown renamed s_noise i dont see that field in the sampler anymore lol

bitter hearth
#

LOL

#

yeah clown renames everything a couple of times per day

#

its part of the mystery

craggy crest
#

squirrel!

bitter hearth
#

the d_noise thing is similar to that "lying sampler" node that went viral

#

or the "detail daemon" node that is similar

#

for the most part either it will boost detail a bit if you increase it, or it will break the model, depending on the model

halcyon yarrow
#

15 steps, res_3s, 135 seconds

bitter hearth
#

actually d_noise might want to go down rather than up, depends how it was implemented

#

the res_2m ones seem better

#

generally res_2m is the one for below 40-60 steps

#

and then above 40-60 steps res_2s with eta on is good

halcyon yarrow
#

ive been using res_2m for everything by default, i could add logic where if steps > 40 then ill auto switch it to res_2s or res_3s. that's some good feedback thx Neon

#

those sort of nibbles of knowledge are fun to consume bc they make my system better overall
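
The auto-switch idea above could look roughly like this. `pickSampler` is a hypothetical helper; the 40-step threshold is the low end of the "40-60 steps" range quoted in the chat, and the default `etaWhenOn` of 0.5 is purely an assumed placeholder, since no concrete eta value was given.

```javascript
// Hypothetical helper: choose a sampler and eta from the step count,
// following the rule of thumb quoted above (res_2m with eta 0 at low
// step counts, res_2s with eta on at high step counts).
function pickSampler(steps, opts = {}) {
  const threshold = opts.threshold ?? 40; // assumed cutoff
  const etaWhenOn = opts.etaWhenOn ?? 0.5; // assumed value, not from the chat
  if (steps > threshold) {
    return { sampler: "res_2s", eta: etaWhenOn };
  }
  return { sampler: "res_2m", eta: 0 };
}

console.log(pickSampler(20)); // { sampler: 'res_2m', eta: 0 }
console.log(pickSampler(60)); // { sampler: 'res_2s', eta: 0.5 }
```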

craggy crest
#

"...nibbles of knowledge..." i'm stealing that

halcyon yarrow
#

20 steps using res_3s at 172 seconds. I think this is what most would consider the "gold standard" image for this prompt, something like this image

#

its an interesting model with times ranging from 12 to 130 seconds

bitter hearth
#

one issue I have with these models is they could end up losing the hyper/turbo lora compatibility

halcyon yarrow
#

last one and then i gotta go, res_2s, 15 steps, 50 seconds, and I think what I changed that's making them better is i changed the base shift from 0.8 to 1.5 as per wizard's recommendation way back when

bitter hearth
#

1.5 shift is fine yeah

#

I use a bit of a different method but it requires multiple k-samplers

halcyon yarrow
#

Lol sounds expensive

bitter hearth
#

the model goes from sigma 1 (pure noise) to sigma 0 (sharp, finished image)
and the important thing is that it has a decent number of steps before sigma 0.8 or so, or even sigma 0.9 or so

#

shift is one way of doing that
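
A sketch of how a shift value can push more of the schedule toward high sigma. The formula below is the SD3-style time shift as I understand it, `shift * sigma / (1 + (shift - 1) * sigma)`; treat it as an assumption. Flux's `base_shift` / `max_shift` settings are parametrised differently, so this is only meant to show the general effect.

```javascript
// Assumed SD3-style shift: remaps a sigma in [0, 1] so that values are
// pulled toward 1 when shift > 1, leaving the endpoints 0 and 1 fixed.
function shiftSigma(sigma, shift) {
  return (shift * sigma) / (1 + (shift - 1) * sigma);
}

// A plain 5-step schedule before and after a shift of 1.5.
const plain = [1, 0.8, 0.6, 0.4, 0.2, 0];
const shifted = plain.map((s) => shiftSigma(s, 1.5));
console.log(shifted);
```

With shift above 1 every intermediate sigma moves up, so a fixed number of steps spends more of its budget in the noisy early part of the run.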

craggy crest
#

(or you could just roll dice and see what happens)

bitter hearth
#

I prefer to use a node called split at sigma and then have a separate ksampler for sigmas 1-0.8 and sigmas 0.8-0
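
A minimal sketch of splitting a schedule at a sigma value so two samplers can each take one part. `splitAtSigma` is a hypothetical function written for illustration; the actual node may pick the split point differently. The example schedule is made up.

```javascript
// Hypothetical helper: split a decreasing sigma list at `breakpoint`.
// The first part ends on the last sigma >= breakpoint, and the second
// part starts from that same sigma so the two stages join up.
function splitAtSigma(sigmas, breakpoint) {
  const idx = sigmas.findIndex((s) => s < breakpoint);
  if (idx === -1) return { high: sigmas.slice(), low: [] };
  if (idx === 0) return { high: [], low: sigmas.slice() };
  return { high: sigmas.slice(0, idx), low: sigmas.slice(idx - 1) };
}

// Example schedule with extra steps above sigma 0.8.
const schedule = [1, 0.9, 0.8, 0.6, 0.4, 0.2, 0];
const parts = splitAtSigma(schedule, 0.8);
console.log(parts.high); // [ 1, 0.9, 0.8 ]
console.log(parts.low); // [ 0.8, 0.6, 0.4, 0.2, 0 ]
```

Each part can then go to its own sampler, with separate settings for the high-sigma and low-sigma stages.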

craggy crest
#

comfyUI needs a random dice roll node that'll set every value to something random

bitter hearth
#

yeah that might be good TBH

#

if you are lucky enough

craggy crest
#

i'm sure there would be horrors, but i'm equally sure really cool stuff would happen

bitter hearth
#

a lot of my favourite things I found by accident

bitter hearth
#

I guess it didn't know what to do with a frame LOL

#

all the video models just explode if they try to make R2D2 move

#

they can rotate around him while he sits still though

craggy crest
#

i kind of like the expanding frame idea

bitter hearth
#

wow didn't know

#

which one is this

craggy crest
bitter hearth
#

I mostly used cog, maybe they are better now

craggy crest
#

zuckerberg's AI

halcyon yarrow
craggy crest
bitter hearth
#

they decided to show people a scheduler name

#

instead of a list of sigmas

#

but what comes out of your scheduler node looks like this: 1, 0.8, 0.6, 0.4, 0.2, 0

#

if you choose something like SGM Uniform 5 step

#

might not be exactly that but its a decreasing list of numbers from 1 to 0

#

one number per step
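
To make the "decreasing list of numbers" concrete, here is a sketch of an evenly spaced schedule. `linearSigmas` is a hypothetical helper, and real schedulers such as SGM Uniform may not produce exactly these values, as noted above. `percentDone` just expresses the progress-bar reading of sigma.

```javascript
// Hypothetical helper: evenly spaced sigmas from 1 (pure noise) down to
// 0 (finished image). `steps` sampling steps need steps + 1 boundaries.
function linearSigmas(steps) {
  const sigmas = [];
  for (let i = 0; i <= steps; i++) {
    sigmas.push((steps - i) / steps);
  }
  return sigmas;
}

// Progress-bar view: how much of the sigma range has been covered.
function percentDone(sigma) {
  return (1 - sigma) * 100;
}

console.log(linearSigmas(5)); // [ 1, 0.8, 0.6, 0.4, 0.2, 0 ]
console.log(percentDone(0.5)); // 50
```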

halcyon yarrow
#

The takeaway for me is that sigma is a factor that starts at a value of 1 and progresses to 0 by the time generation is finished

bitter hearth
#

yeah

#

you could see it as a progress bar in some ways

halcyon yarrow
#

And then how it progresses is based on the scheduler and rather than having a general curve you like to split the curve with two ksamplers

bitter hearth
#

sigma 0.5 is always 50% done

pseudo owl
bitter hearth
#

yeah the pink curve is my overall sigmas

#

afraid I lost the workflow for this one

#

yellow curve is first sampler, then blue is second

#

pink is the overall combined curve

halcyon yarrow
halcyon yarrow
bitter hearth
#

yeah

#

and it goes up a bit because I set it to re-do some of the image

halcyon yarrow
#

Does clown have anything to say about this? Is there anything he can do to facilitate achieving something like that with a custom scheduler in the nodes?

bitter hearth
#

don't think clown particularly likes this method lol

halcyon yarrow
#

Lol oh I see

bitter hearth
#

from what I have seen he doesn't change shift or scheduler much

halcyon yarrow
#

Hey @pseudo owl if you want I can link you to the GitHub where I put it if you wanna try it

bitter hearth
#

there is already a "split at steps" node in comfy

#

or a "split denoise" node

#

so its not too different from that

halcyon yarrow
#

So it would be scheduler one res 3s, sigma breakpoint 0.8, schedule two res 2m
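
A rough sketch of that split, assuming the sigmas are a plain decreasing list. `split_sigmas` and `cutoff` are hypothetical names, not an existing node's API. The boundary value is kept in both halves so the second sampler picks up exactly where the first one stopped.

```python
def split_sigmas(sigmas, cutoff):
    """Split a decreasing sigma list at the first value <= cutoff.

    Returns (high, low). The boundary sigma appears in both parts so the
    second sampler resumes from where the first one ended.
    """
    for i, sigma in enumerate(sigmas):
        if sigma <= cutoff:
            return list(sigmas[: i + 1]), list(sigmas[i:])
    # No value at or below the cutoff: everything stays in the first part.
    return list(sigmas), []


high, low = split_sigmas([1.0, 0.8, 0.6, 0.4, 0.2, 0.0], cutoff=0.8)
print(high)  # [1.0, 0.8]                 -> first sampler (e.g. res 3s)
print(low)   # [0.8, 0.6, 0.4, 0.2, 0.0]  -> second sampler (e.g. res 2m)
```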

pseudo owl
bitter hearth
#

you do need a lot of steps for res 3s, sometimes like 60-100

#

res 2s and res 2m need fewer

halcyon yarrow
#

I always change res 2m to res 3s without changing the steps (it’s like an option in my UI) to retry rendering an image and it always does fine

#

I think res 3s works at low steps just fine. In my experience I don’t think I’ve ever rerun an image with the better sampler and not gotten better results

bitter hearth
#

there's a way to measure sampler error to know for sure

#

its in the original DPM paper
haven't seen someone make a comfy node of it but that might be cool

halcyon yarrow
#

I do wish I could automate retrying. I know there are solutions out there, like a classifier that detects if the image is garbage or not (artifacts, a solid color image, messed up patterns). I’m just wary of going down that road bc it can also be subjective, plus there’s the added overhead of classifying each image generated

#

If I could measure the error rate that could be a more light weight metric to trigger a retry
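
As a cheap stand-in, the "solid color image" case can be caught without any classifier. The sketch below is not the sampler error measure from the DPM paper; it only checks whether every color channel is nearly constant. `looks_blank` and its threshold are assumptions made up for illustration.

```python
from statistics import pstdev


def looks_blank(pixels, threshold=2.0):
    """Return True if an image is close to a single solid color.

    pixels: iterable of (r, g, b) tuples with values in 0-255.
    threshold: hypothetical cutoff on the per-channel standard deviation.
    """
    pixels = list(pixels)
    if not pixels:
        return True  # nothing rendered counts as a failed image
    return all(
        pstdev([p[channel] for p in pixels]) < threshold
        for channel in range(3)
    )


if looks_blank([(12, 12, 12)] * 64):
    print("solid color detected, queue a retry")
```

A check like this would only flag the most obvious failures; artifacts and broken patterns would still need something smarter.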

bitter hearth
#

there's image quality assessment tools yeah

pseudo owl
bitter hearth
#

lol yeah

halcyon yarrow
halcyon yarrow
#

These look like a good 7-8s clip

pseudo owl
craggy crest
#

i do that with luma sometimes - kick it off and come back tomorrow

halcyon yarrow
#

I think my max is 85 frames which comes out to 3 or 5 secs depending on the fps. At 85 frames it takes my 8gb GPU about 25 mins to process
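
The arithmetic behind "3 or 5 secs", as a quick sketch. The 16 fps figure matches the one-second example in this chat; 24 fps is an assumption for the faster case.

```python
def clip_seconds(frames: int, fps: float) -> float:
    """Clip length in seconds for a given frame count and frame rate."""
    return frames / fps


for fps in (16, 24):
    print(f"85 frames at {fps} fps = {clip_seconds(85, fps):.1f} s")
```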

#

If I wanna render just one second or like 16 frames it takes 5 minutes sometimes 4 so it’s not bad lol

dusky thistle
dusky thistle
craggy crest
dusky thistle
#

SD35M still

craggy crest
halcyon yarrow
#

@bitter hearth I was upset that this flux mini requires it to exist in diffusion_models instead of the checkpoints folder, so after like an hour talking to 4o, then o1 preview about it I finally figured out the solution and wrote a script to convert mini to be compatible with ComfyUI's load checkpoint. yay. Posting the first image generated with this WF

halcyon yarrow
#

made using SD35L

craggy crest
#

SD3.5 large

dusky thistle
civic trail
halcyon yarrow
sullen moss
#

Yeah, since the release of 3.5, there hasn’t been much visible interest from the community. On Flux, custom models were already available just a week after launch

mortal kite
#

discord busted

#

this milk texture is NOT quite right

#

maybe is ok?

halcyon yarrow
#

Also yeah the number of new LoRAs for sd35 is really low/slow. I remember sd3 had a lot more LoRAs at its release

mortal kite
halcyon yarrow
# mortal kite

I’ll wait for the model that has the full realistic skin suit, I hear it’s coming out soon too lol 😆

bitter hearth
#

I agree some people prefer checkpoint so they should offer a range

halcyon yarrow
#

dude can you believe flux mini took down their project?

#

its coming back with a 404 now

bitter hearth
#

wow is your version the only version now?

#

lol

halcyon yarrow
#

it actually took a lot of research to figure out how to convert it from what the base was to something that'll work via load checkpoint, then a lot of work to figure out there's a Save Model node I can repackage it with, with the CLIP and VAE built in

#

and then it was easy breeze converting it to gguf after I got past the Load Checkpoint hurdle

#

i was trying to manually compile the safetensors file using python and then after a little research i felt dumb realizing i can do it in comfy

bitter hearth
#

save model node yeah

#

I've used the save clip node once as well its similar, or save diffusion model

halcyon yarrow
#

I also tried to bake in stuff like flan instead of t5xxl and that doesn't work, I tried to bake in LongClip in various different ways with and without the dedicated node and that also didn't work, so the baked in version has t5xxl fp8 and the vit14 finetune by zero point

bitter hearth
#

that's fine most people like that Clip L fine tune

#

personally I liked flan too but its controversial

halcyon yarrow
#

yeah its a 3b model so it needs all the help it can get

bitter hearth
#

yeah for sure

#

I am not sure about flan as one day it helps and one day it doesn't

halcyon yarrow
#

why is it controversial? id like to know

#

oh i see i think you just said why

bitter hearth
#

the model wasn't trained with flan

#

so it is not clear it is a good idea

#

this applies to the Clip L fine tunes too by the way

#

I am not sure personally, I often try both

halcyon yarrow
#

its interesting messing with that stuff, you sort of get to peek behind the scenes. i figured t5xxl and flan internally were the same structure but its likely different layers and then comfyui is doing some special sauce to adjust to the different layers internally

#

same goes for longclip, i cant build it in bc it always expects the 'standard' clip L and it only works via the dualclip/tripleclip loader bc of internal adjustments they're making after the fact

#

this whole thing started bc i want to use flux mini but i don't want to code support for the diffusion_models folder, so it turns out all I had to do to convert it from the base model to something that'll work with Load Checkpoint was just prefix the layer keys with a certain string and that's it. super simple change
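
A minimal sketch of that kind of key rename, operating on a plain dict that stands in for the loaded state dict. The actual string is not named in this chat; the `model.diffusion_model.` prefix and the example key below are assumptions for illustration, and `add_prefix` is a made-up helper.

```python
def add_prefix(state_dict, prefix):
    """Return a copy of state_dict with every key starting with prefix.

    Keys that already carry the prefix are left alone, so running the
    conversion twice does not double-prefix anything.
    """
    return {
        (key if key.startswith(prefix) else prefix + key): value
        for key, value in state_dict.items()
    }


# Placeholder values stand in for tensors; the prefix is an assumed example.
weights = {"double_blocks.0.img_attn.qkv.weight": "tensor..."}
converted = add_prefix(weights, "model.diffusion_model.")
print(list(converted))
```

With real files the same function would sit between loading the tensors and saving them again; the loading and saving calls are left out here.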

#

i kept calling the base model diffusers format but that was incorrect, its actually in "flux transformers format", so i kept arguing with o1 preview like "okay if its already in the target format why isn't it working?" i ended up dumping the structure of an existing flux model that works via load checkpoint just so gpt can review and compare and figure out the solution

bitter hearth
#

I've been using single clip loaders and then concatting the embeddings, for what its worth
if you use SD 1.5 with ELLA T5 and Clip L then you have to do it this way

#

I need to check exactly what the dual/triple clip loaders and prompt text encode nodes actually do TBH

pseudo owl
bitter hearth
#

I was saying on comfy discord a while ago that I want to make a new set of loader and encoder nodes

#

which will be model-agnostic

#

so for example you will use the same loader and encoder node set to encode prompt as you do to encode images for IP adapter embeds

halcyon yarrow
bitter hearth
#

ah there is a special thing for that

#

its called ELLA

#

its really cool

#

https://github.com/TencentQQGYLab/ComfyUI-ELLA they had to train it

halcyon yarrow
bitter hearth
#

SD 1.5 with ELLA has better prompt adherence than SDXL

#

its crazy

#

the best things tend to have zero hype for some reason

halcyon yarrow
#

wow that's crazy Neon pretty cool stuff the level of care they apply to this stuff

bitter hearth
#

there's no downside to ELLA, that I know of

halcyon yarrow
#

I've just been using LongClip with SD1.5 but I'd be willing to try ELLA and compare it

pseudo owl
halcyon yarrow
#

and ELLA is only compatible with 1.5, its not compatible with SDXL?

bitter hearth
#

very sadly they made ELLA for SDXL but did not release it

#

every now and then people go ask them on github

halcyon yarrow
halcyon yarrow
bitter hearth
#

some of the best stuff is not released
there is a fine tune of Lumina that looks as good as Flux to me
but its not released (its in the I-max paper)

pseudo owl
bitter hearth
#

you can add Clip-L embeddings as well to get some back
but yeah actually that's a good point

#

it will not be as good as pure Clip-L for subject knowledge

halcyon yarrow
#

i don't use sd1.5 that much, i do support it but sdxl would be where it's at

bitter hearth
#

cos even if you add Clip-L embeddings with "concat conditioning" node, the T5 embeddings are competing with them

halcyon yarrow
#

this is my understanding of what the model architecture supports
SD15 - only L
SDXL - L & G
Flux - L & t5
SD3 - L & G & t5

bitter hearth
#

this is the sort of style that I like SD 1.5 for:

#

for some reason I can't get this look in other models

#

its very grainy and stylized but still a photo

#

"SDXL - L & G, Flux - L & t5, SD3 - L & G & t5": yea that's right

halcyon yarrow
#

the left one clearly shows signs it was made with an inferior model but the right one could almost pass for flux at low rez

bitter hearth
#

yeah the detail and clarity is low

#

and hands bad

halcyon yarrow
#

i might release an update to flux mini aio model replacing the t5xxl fp8 with t5xxl v1.1 fp8, took me a bit of searching huggingface to find it bc you only see the full 22gb model and im not gonna embed that into flux mini lol, i had v1.1 but only in gguf and you cant bake those in either

bitter hearth
#

the one on the left is 1536x1536 which is why it looks worse
really hard to get SD 1.5 to do that res

pseudo owl
halcyon yarrow
#

no gguf baking, no longclip baking, no flan baking

bitter hearth
halcyon yarrow
bitter hearth
#

there is a 46GB version of Flan T5 XXL
I've been thinking about using it with SD 1.5 as a joke
cos I sometimes rent the 80GB servers (only $0.70 per hour luckily)

halcyon yarrow
#

so with lavi bridge in theory we could get SDXL to work with T5 right?

bitter hearth
#

not sure

halcyon yarrow
halcyon yarrow
pseudo owl
#

I think ELLA is actually slightly better but Lavi-bridge is easier to train.

bitter hearth
pseudo owl
#

Also I believe the t5xxl models are so large since they also include the decoder part which is not even used in text encoding, the actual encoder part of t5xxl which is used itself should be like only 9gb.

bitter hearth
#

yeah you can set the decoder layers to 0

halcyon yarrow
#

some guy commented on the flux mini model's page "NO use for it all images dont follow prompt and bad anatomy, loras dont work ... ! and i checked allready 50 flux models" lol. like granted loras don't work, i agree, but if its not following prompt or anatomy that seems more like a setup issue than a model issue

halcyon yarrow
bitter hearth
#

the main benefit is it would be smaller
so less download time, and faster loading in comfy
it may or may not be faster, but it would definitely not be slower

#

mostly models get faster when you set layers to zero but sometimes its not a big gain, it depends

halcyon yarrow
#

so set it to 0 before I bake in the t5 model, i.e. exclude the decoder part so it's more lightweight all the time? that could be handy, ill def look into that too

#

do you know the node name that lets me zero out layers?

bitter hearth
#

this isn't doable in comfy it seems

#

comfy tends to not be so good for LLM stuff yet

halcyon yarrow
#

so i could just manually edit the model with the safetensors library in python and then just use that modified model in comfy to bake it in. i can just switch gears and do that instead, ill try it when i build the next flux mini
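
A hedged sketch of the filtering step, again on a plain dict standing in for the loaded tensors. It assumes the usual Hugging Face T5 key layout, where decoder weights start with `decoder.` and the output head with `lm_head.`; a given file may be laid out differently, so inspect the keys first. `encoder_only` is a made-up helper name.

```python
def encoder_only(state_dict, drop_prefixes=("decoder.", "lm_head.")):
    """Drop decoder-side weights and keep everything else.

    drop_prefixes is an assumption about how the keys are named.
    """
    return {
        key: value
        for key, value in state_dict.items()
        if not key.startswith(drop_prefixes)
    }


# Placeholder values stand in for tensors.
weights = {
    "shared.weight": "tensor...",
    "encoder.block.0.layer.0.SelfAttention.q.weight": "tensor...",
    "decoder.block.0.layer.0.SelfAttention.q.weight": "tensor...",
    "lm_head.weight": "tensor...",
}
print(list(encoder_only(weights)))
```

If the filtered dict has the same number of keys as the original, the file was already encoder-only and there is nothing to gain.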

#

i even posted on flux mini's discussion board a bunch of text about how cool it is and how I posted it on civit and how i was offering these different versions. im sure Tencent team didn't take it down permanently, perhaps they're just preparing for another launch. their github was broken at the time so a better release would be apt

#

also @dusky thistle I linked to your github and suggested your sampler for using flux mini bc it does produce better results

#

hopefully i can convert some people to jump on the shark bandwagon

dusky thistle
#

good to know that it does! which rk_type/sampler type did you choose

halcyon yarrow
#

res_2m and res_3s are the only ones i ever used. i wanted to ask you about that new one you suggested, its not in the list of samplers so im guessing i have to do a git pull, but i wanted to confirm with you there aren't any bombs i should be aware of that I would have to adjust for?

#
  • did you change the order of any of the inputs or outputs of the nodes?
  • did you add or remove any fields to any of the nodes?
  • did you remove any of the options in any of the existing fields?
bitter hearth
#

"res_2m and res_3s are the only ones i ever used": same but also res_2s

#

similar though

#

I also liked the soft scaling more than hard 🤔

halcyon yarrow
#

I'm going to be generating some images using flux mini to post them on the model's page and showcase its ability

A breathtaking landscape of a rugged mountain range covered in dense evergreen forests, with rocky outcroppings in the foreground. The bright blue sky and scattered clouds add depth and serenity to the scene.

A detailed portrait of a young woman wearing a luxurious red dress with intricate lace details, accessorized with pearl jewelry. Her confident gaze and the soft lighting create a regal and timeless atmosphere, reminiscent of classical art.

bitter hearth
#

one on the left is incredible for 3B

#

one on the right needs refine

#

but that's ok

#

for example eye

halcyon yarrow
#

yeah the right eye could use a little help but its still pretty good

#

oh you think the left eye could use help too?

bitter hearth
#

LOL we both said eye at the same time

#

yeah

halcyon yarrow
#

i think the left eye is fine if she's looking that way but the right eye looks deformed

bitter hearth
#

its not "bad" but even 2 steps of Realvis Schnell would help a lot

#

we have "eye detailer" now

#

like face detailer but for eye, nose etc

#

impact pack can do that

halcyon yarrow
#

A beautiful still life painting of vibrant pink flowers in a ceramic vase placed on a wooden table by the window. The sunlight softly illuminates the petals, creating a warm and inviting atmosphere, inspired by classic oil painting techniques.

A stunning surreal cosmic landscape featuring a majestic lightning bolt striking through vibrant orange clouds, with planets and stars in the background. A lone figure stands in awe, surrounded by ethereal beauty, evoking a sense of wonder and exploration.

A serene and atmospheric scene of a train station nestled in a lush tropical forest, illuminated by warm lights. The station features vintage architecture, and people walk leisurely along the platform under the towering palm trees.

bitter hearth
#

but I don't like impact pack I do it in other ways

halcyon yarrow
#

these are all using res_2m btw

#

i dont know if the WF is embedding into these images im just copying the image via clipboard

bitter hearth
#

impact pack wants to do everything in terms of a new data structure called a "SEG"
but I don't want that

pseudo owl
halcyon yarrow
#

i just asked gpt4o to generate me a prompt for each of these images and that's how i got the prompts, civit took down this image bc of the kid in the third frame

bitter hearth
#

oh is that why

halcyon yarrow
#

this was what got me into flux mini like if it can generate stuff this good its gotta be worth trying

bitter hearth
#

your checkpoint is the only one now lol

halcyon yarrow
#

the original woman in red was using the q8 model by the way, after i quantized it to q8 the model is actually 3.4gb

#

imagine 3.4GB flux model

#

q8 model left, original base model right

#

q8 left, original right

bitter hearth
#

nice its the same, essentially

halcyon yarrow
#

again q8 left, original right. they're super similar to each other, for being 1.5GB smaller its pretty astounding how good it still is

bitter hearth
#

yeah Q8 is great

halcyon yarrow
#

A vivid and colorful depiction of a nebula in deep space, featuring intricate clouds of gas and dust in shades of red, blue, and gold. Bright stars shine through, creating a mesmerizing and otherworldly cosmic vista.

bitter hearth
#

sometimes Q6, or Q5 ones can be good

halcyon yarrow
#

its that term, diminishing returns. i like q8 bc I just want less memory usage at the expense of a little bit of loss, im not willing to accept more than just a little bit of loss lol

#

i think this is the train station in the forest one, people don't look so great here

bitter hearth
#

I kinda agree Q8 is a good choice these days

#

personally I do everything FP8 but there are costs to that

halcyon yarrow
#

look at my times for generating these images on my 8gb gpu, like I see one going as low as 43 seconds

bitter hearth
#

what sort of hardware is this

#

oh you said 8GB sorry

#

yeah that's good for 8GB

halcyon yarrow
#

then i click on that little share button in the corner for each image, it'll run it through 2x VLMs for post title and tags, run it through the 2xBSRGAN upscaler and use exif_tool so CivitAI can read all the details on how it was made

bitter hearth
#

ah yeah I love automated chains like that

halcyon yarrow
#

and it'll do all that in 10 secs per image

bitter hearth
#

its hard to recommend upscalers to people cos there are so many variables but you can definitely do better than 2xBSRGAN

halcyon yarrow
#

it has to be 2x bc I dont want 17-30mb image files laying around, i think 4-5mb is decent, and it has to be performant. ive used better models that def look better but im not willing to dedicate 30-45 seconds to upscale it

bitter hearth
#

https://github.com/Phhofm/models/releases/tag/all_models this script downloads lots of good ones

#

actually this link might be more helpful it compares speed:

#

https://github.com/the-database/traiNNer-redux/wiki/PyTorch-Inference-Benchmarks-by-Architecture

#

cos there are really fast ones now too

#

would recommend span

#

ah this one seems perfect https://openmodeldb.info/models/2x-NomosUni-span-multijpg-ldl

#

its a 2x SPAN one for photographs

halcyon yarrow
#

awesome dude thanks for the links Ill def go over those benchmarks i love that kind of stuff

bitter hearth
#

no problem

#

with these upscale models its always worth trying a bunch

#

cos unlike diffusion models, the upscale models cannot work well outside of their exact training data

#

so it depends if your image matches what they expect

halcyon yarrow
#

@shell bloom lets talk here buddy

#

so you're saying you got 4gb of ram and your fastest time yet with the 3B model at 1024px is 54 seconds using the aio model

shell bloom
#

No as a video card I have 12gb of vram, a 3060

halcyon yarrow
#

oh dude with 12gb of vram you should be getting way faster times

#

when testing make sure to try two similar prompts twice

#

the first time it has to load the model, the second time is significantly faster

#

ive gotten times as low as 42 seconds with my 8gb so you should have no problem going even faster

shell bloom
#

what graphics card do you have?

halcyon yarrow
#

4070 on a laptop so it's technically more like a 4060 for pc

shell bloom
#

I have an rtx 3060 but it is slower than a 4070 laptop, but I honestly don't know

halcyon yarrow
#

the newer generations have more tensor cores maybe that's why? im not too sure. if you need help loading the unet model just take the image posted on my gallery and load that into your comfyui workflow

shell bloom
#

i7 8700k

craggy crest
craggy crest
#

a 3060 is going to be fairly slow as it is. so you can't have anything else running that'll want that GPU

shell bloom
#

I'm waiting for the new rtx 5000 to build a new pc

craggy crest
#

for now, make sure no games or anything else that wants the gpu are running while you're generating

halcyon yarrow
bitter hearth
#

3060 is about 4x slower than a 4090

#

its not too bad

frail shoal
halcyon yarrow
#

@shell bloom I think that @craggy crest is the right guy to ask in terms of sd35 training Ive read some of his messages and he's keeping up on that. im not into training or finetuning or any of that

halcyon yarrow
#

i have a script called shuffle-checkpoints that'll reassign items destined for other models to a specific model I set, so I just queued up 500 images into the flux-mini queue, expect to see some more flux-mini examples posted shortly

#

@bitter hearth testing your theory, wrote a quick script to remove the decoder side from t5, queued up a generation to see if it works or if itll produce an error

bitter hearth
#

okay thanks, I've been looking for flux-mini samples lol

halcyon yarrow
#

looking at the t5 xxl v1.1 fp8 and it doesn't have the decoder layers so it's already optimized

bitter hearth
#

what is that in the background on the right

#

is it a green chair

halcyon yarrow
#

i think they're supposed to be like orcs or monsters of some sort lol

bitter hearth
#

cannot tell if orc or chair 😂

mortal kite
craggy crest
rapid pivot
bitter hearth
#

maybe they'd still have the One Ring if they had orc-chairs on their side

mortal kite
halcyon yarrow
#

@mortal kite So you’re taking low Rez images of shirts and essentially up scaling them?

bitter hearth
#

I think it might be that control-lora thing

#

since that went viral this week

halcyon yarrow
#

The in context thing?

bitter hearth
#

yeah

#

that little one on the left is the k-sampler preview rather than the input

halcyon yarrow
#

I just hit 500 downloads on that Lora earlier actually

#

Yeah I think I got confused for that being the load image node lol

#

500 downloads for the in context Lora’s and not a single person has used it in the civit generator to post anything online, that’s kinda frustrating for me, I was excited to see how the community would use it but everyone is just offline generating and then not tagging if they even share it online

bitter hearth
#

I actually don't use previews personally

#

if I want to see generation partly finished I would just stop k-sampler early

halcyon yarrow
#

I’m the opposite, in fact I rewrote ComfyUi latent preview node just so I can see the preview of the batch

#

It’s super fun monitoring the preview of a batch of 4 in a grid and then on step 16/20 it decides to completely redo one of the set and it makes it way better or worse

#

I used to do batches of 5 now I do 4 just so it fits neatly as a grid lol

bitter hearth
#

you might like this node pack https://github.com/blepping/ComfyUI-bleh it has improvements to the k-sampler previews
might be useful for ideas

#

oh yeah I don't use batching, its useful though

halcyon yarrow
#

I think at this point I’m a die hard shark sampler guy, the only beef I have with shark sampler is how it handles the preview but I don’t hold it against him

bitter hearth
#

I used TCD sampler for the vast majority of my images
not for Flux though

halcyon yarrow
#

He’s letting the system handle it based on global user preferences and then he’s listening for the preview events if enabled, whereas ksampler efficient advanced ignores those settings and uses the local node settings to decide whether to show

#

What’s TCD?

bitter hearth
#

its like hyper its a distilled version of SDXL or SD 1.5

#

this is the sampler for it https://github.com/JettHu/ComfyUI-TCD

halcyon yarrow
#

So the sampler is also a distilled version of a model? Thats interesting

bitter hearth
#

no the distilled versions are loras

#

the sampler just works well with them

#

cos it came from the same paper

#

the sampler is similar to euler_a

halcyon yarrow
#

How does TCD handle artifacts?

#

You know how sometimes ksampler will do green splotches? Like in the corner of the mouths or the eyes or nose

bitter hearth
#

not sure I haven't seen those

#

it's generally worse than regular SD 1.5 or SDXL for accuracy

halcyon yarrow
#

That’s like the ultimate pet peeve for me, spent all this time generating an image and it almost feels cruel bc it’s artifacting key areas lol

bitter hearth
#

restart sampler is the best, technically

#

as far as I know

#

does not work on flux though

halcyon yarrow
#

Restart sampler? Don’t let ClownSharkBatwing hear ya lol

bitter hearth
#

lol

#

noisy DPM/Res/Deis with a decent amount of Eta is also good but restart sampler is a bit better

halcyon yarrow
#

I think it’s gonna be hard to switch from his sampler tbh, I’ve noticed on really complex images where the chances of artifacts are high it’ll go into this mode where rather than green splotches it’ll do like these artistic overlays I wish I could show you lol

remote holly
bitter hearth
#

there is no restart sampler for flux, SD 3.5 or auraflow

#

is the main issue

halcyon yarrow
#

That’s kind of a deal breaker for me

#

I especially like how I can run all 5 base models with the exact settings and it handles it like a champ

bitter hearth
#

yeah I actually don't use restart personally anyway since I use TCD

halcyon yarrow
#

Sd15, sdxl, pony, flux and sd35 all with the same settings
ETA 0.5 Gaussian Gaussian res_2m beta57

bitter hearth
#

the reason I like TCD in particular is that it is the acceleration lora with the highest image complexity
(there is a model they use in papers now that judges image complexity)

#

those settings are good yeah

#

in my tests res_3s needed a lot more steps than res_2s
but it depends on settings/model/workflow etc

halcyon yarrow
bitter hearth
#

there's implicit steps as an option too

#

but you have 8GB so it might not be worth it

#

there is a limit to how slow would feel okay

gusty trail
remote holly
halcyon yarrow
#

@bitter hearth so I found this ComfyUI node to use lavi https://github.com/kijai/ComfyUI-LaVi-Bridge-Wrapper/issues/1 and then I looked up the issues before tackling an install to confirm it works for sdxl, and the team said:

Thank you for your interest in our LaVi-Bridge! We did not include SDXL in our current work, but we are conducting experiments on SDXL with LaVi-Bridge and will update our progress promptly in both the research paper and this repo.

#

that was 6 months ago they said that

#

comments are precious:

ELLA folks did release the adapter checkpoint for t5+SD1.5 (spoiler: it's not good) but announced that SD XL adapter will not be released.

With the ELLA team shooting the open-source community in the back by not releasing its SDXL tool, it's now come to your team to be our savior. Good luck! We're all rooting for you.

bitter hearth
#

LOL

#

I was under the impression ELLA was better though, rather than worse

halcyon yarrow
#

this comparison screenshot of before and after is pretty impressive

bitter hearth
#

wow nice

halcyon yarrow
#

the embeddings really do play a major role in image composition so being able to inject t5 into these legacy models would give them a huge boost, i'm just not willing to add support for it if its only 1.5. once sdxl support comes out I'm first in line to try it

bitter hearth
#

the llama 7B results were even better apparently

craggy crest
#

@halcyon yarrow do you know how to get comfyUI to run in a docker container on linux?

civic trail
halcyon yarrow
#

I could share my deploy.sh but it's specific to RunPod

#

see the image I use is this one:

imageName: "runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04"
it doesn't even come with comfy, just pytorch and cuda bc those are the 2 hardest/longest things to install

#

and then my script just does some basic stuff like:

# Clone ComfyUI only if it is not already present
if [ ! -d "$COMFYUI_DIR" ]; then
    git clone https://github.com/comfyanonymous/ComfyUI.git "$COMFYUI_DIR"
    # Enter the checkout (bail out if that fails) and sync it with upstream master
    cd "$COMFYUI_DIR" || exit 1
    git reset --hard origin/master
    git pull origin master

    # Step 4: Move the bundled custom nodes from $BOOTSTRAP_DIR into $COMFYUI_DIR/custom_nodes
    # (mv -n never overwrites a node that is already installed)
    mkdir -p "$COMFYUI_DIR/custom_nodes"
    mv -n "$BOOTSTRAP_DIR/custom_nodes/"* "$COMFYUI_DIR/custom_nodes/"
fi
craggy crest
halcyon yarrow
#

if you're using stuff like AWS i'm sure there's prebuilt AMIs that have ComfyUI built in or at least pytorch/cuda pre-installed but those instances are very expensive

pseudo owl
sudden parcel
#

i have a weird looking rendered image when doing SD3.5 in A1111

#

im using SD3.5 large

#

i assumed it had something to do with a VAE ... but i do something wrong, maybe its the wrong file or wrong location

hallow lion
sudden parcel
#

i will make a note

craggy crest
sudden parcel
#

does anybody know what i might do wrong?

craggy crest
sudden parcel
#

i put the sd3.5_large safetensor file into the models/stable-diffusion folder. I did download a vae file from civit.ai

#

do i put the vae file into models/vae or models/stable-diffusion folder?

#

i tried both, but it is not working

halcyon yarrow
#

@sudden parcel are you talking about ComfyUI? bc ComfyUI doesn't have a stable-diffusion folder afaik, it's checkpoints, diffusion_models, or unet

sudden parcel
#

A1111

craggy crest
craggy crest
#

and you can make a folder in checkpoints called sd3.5 if you want

sudden parcel
#

there is no checkpoint folder

#

i mentioned twice its A1111

craggy crest
# sudden parcel i mentioned twice its A1111

for a1111, you would put sd3.5 where the other models go. it doesn't need a special folder. but you still need to use a VAE that's written for it, not just any vae you found somewhere. and you still need to make sure you're not using a model version that has the VAE baked into it if you are going to use a separate vae

#

and you still need to make sure you have cfg, steps, and other settings correct

unkempt compass
# mortal kite

It's interesting, but it's a pre-rendering. Not the final product. Because you'll have to provide a compatible file, with transparency, to any printing firm.
Do you have a plan for that?

sudden parcel
#

ok....

craggy crest
# sudden parcel ok....

no. the sd3.5_large.safetensors on the files page doesn't have the VAE. and the VAE in the folder on the files page is the one you want for it. you'll find your encoders on that page, in their folders, too

sudden parcel
#

or is there a better place to go and download the sd3.5 model?

craggy crest
#

and grab the sample workflow, too

halcyon yarrow
#

@pseudo owl looking at LLM2CLIP, I like it, I want to try it, I'm confused as to how to use it, is it just a drop in replacement? do I just add it to my clip folder in ComfyUI? I don't get it lol

sudden parcel
#

the vae file is not available

craggy crest
craggy crest
#

yes it is. it's called diffusion_pytorch_model.safetensors and it's in that VAE folder

#

just name it something like sd35VAE and put it in the folder your vaes go in

halcyon yarrow
#

i mean im just gonna try it, there's a 1.2GB .bin file in there that I'm gonna rename to safetensors and give it a whirl

#

these results are outstanding

#

LongCLIP has been officially dethroned, look at base clip L all the way inside of the blue circle, base clip L sucks a lot

dusky thistle
sudden parcel
#

ok, downloading

dusky thistle
sudden parcel
#

i placed it into the vae folder

#

rendering a test image needs 12 minutes

pseudo owl
halcyon yarrow
#

yeah renaming the bin file did not work lol

#

not sure what you mean by "work normally" the Load Clip node in comfyui doesn't support .bin files

bitter hearth
#

they updated

#

look their HF account

#

there is safetensors now

sullen moss
pseudo owl
halcyon yarrow
#

but it didn't seem to work

halcyon yarrow
#

yeah i see that 2.3GB safetensors file wowza

#

i thought the EVA02 version was better than that version tho that's why I went EVA02 first

mortal mesa
#

We expect to release all the parameters of the text model, adapter, and related components today. Previously, we experienced some delays due to precision issues during the Hugging Face conversion process

pseudo owl
#

it contains the image encoder part too but thats completely useless for image gen models, the clip load node should load only the text encoder part.

EVA02 is a different model I believe, which is a better alternative to clip, but the image gen models don't use it, they use clip l. The best clip model is SigLip by google but again, no model uses it as a text encoder.

halcyon yarrow
#

i keep getting: 'NoneType' object has no attribute 'device'

#

this whole thiing feels like it's not ready for us to try yet, if you guys get it working I'd love to hear about it

sudden parcel
#

hmm... same results, no proper image, Sd3.5 model in the model folder and the vae file in the vae folder

craggy crest
sudden parcel
#

that was my fear 🙂

#

downloading, that will take some time

craggy crest
# sudden parcel that was my fear 🙂

if you're willing to change, i suggest you consider installing SwarmUI and just letting it handle all the technical stuff, and using comfyUI inside it

#

it'll make your life a lot easier

pseudo owl
#

@halcyon yarrow did you try any text with Flux-mini?

sudden parcel
#

i just downloaded the comfyUI windows portable file

#

will this be a problem?

halcyon yarrow
sudden parcel
#

i put the SD3.5 model into checkpoints

craggy crest
sudden parcel
#

i put the vae file into the vae folder

#

is there anything else i have to do?

halcyon yarrow
#

@pseudo owl i tried a bunch of text, now remember I'm using LongClip + t5 flan, so while it's usually excellent with flux-d models, mini seems to have completely lost the ability to do any legible text

pseudo owl
halcyon yarrow
#

ok ill report back w the results

sudden parcel
#

i need to leave, emergency

halcyon yarrow
#

lol thats terrible adherence

dusky thistle
pseudo owl
halcyon yarrow
#

one more

  • white cat: yes
  • blue dog: no
  • brown couch: yes
  • living room: yes
  • 4 cow pics: no
  • ufo hovering: yes
  • outerspace: no
#

it managed to nail 4 out of 8 elements. now to be fair that's 27 steps, let's try 40 and give it a good shot, see what we get

#

40 steps

pseudo owl
#

nice increase in quality, what are those 4 things flying though lol

halcyon yarrow
#

it managed to get 5.5 out of 8 now, I'll give it half a point for the half cat half dog creature

#

they almost look like flying cows lol

#

overall i think flux mini is a fun thing to play with but it has no loras. unless civitai adds it as its own base model, and the community rallies behind it and makes finetunes and loras for it, little mini is just destined to not be famous

mortal kite
#

I was planning on making these myself just at home with iron-on or something

#

was generated with Flux fp8

halcyon yarrow
#

that's funny for #8 they cited my page but I didn't create anything new I just reposted their stuff from the model zoo

mortal kite
halcyon yarrow
#

@pseudo owl this sd35 turbo

halcyon yarrow
# mortal kite

is the lora only able to handle tshirts or can it do weird garments?

mortal kite
halcyon yarrow
#

like for example can you try a tank top with a radioactive symbol or something like that

#

or a crop top instead of a tank top to make it even harder

mortal kite
halcyon yarrow
#

oh i see, i thought you were still LORA'ing, that's cool

mortal kite
#

actually, I'm using PixelWave not flux base

#

forgot

#

its the Pixelwave flux model

#

I can try crop top

#

this time it puts a person in there

halcyon yarrow
#

i love pixelwave its so good

#

btw have you tried that 4koma lora?

mortal kite
#

no I have tried a few LORAs but haven't heard about that one

halcyon yarrow
#

design a little wonky but yeah otherwise nailed it

mortal kite
#

seed lottery

#

interesting, when you say crop top it always puts a model in

halcyon yarrow
#

yeah you almost gotta say just the garment or just the top

mortal kite
halcyon yarrow
craggy crest
#

count the number of cow pictures and where they are placed

halcyon yarrow
#

yeah the cat is a little wonky but i'd still give it a pass, its a cat on top of a dog lol

#

that's the only deduction i gave it that's why i said one of the 8 elements was the placement of the paintings

#

white cat: yes
blue dog: yes
brown couch: yes
living room: yes
4 cow pics: yes
placed in right corners: no
ufo hovering: yes
outerspace: yes
thus 7/8
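The checklist scoring used above is easy to make repeatable. A minimal sketch, assuming each element gets 1 for yes, 0 for no, or a fraction for partial credit (like the half point given earlier for the half cat, half dog creature). The helper name `adherence_score` is hypothetical:

```python
def adherence_score(checks):
    """Sum per-element credit and return (points, number_of_elements).

    Values may be True/False or a float such as 0.5 for partial credit.
    """
    points = sum(float(value) for value in checks.values())
    return points, len(checks)


# The eight elements and verdicts from the checklist above.
checklist = {
    "white cat": True,
    "blue dog": True,
    "brown couch": True,
    "living room": True,
    "4 cow pics": True,
    "placed in right corners": False,
    "ufo hovering": True,
    "outerspace": True,
}

points, total = adherence_score(checklist)
print(f"{points:g}/{total}")
```

Keeping the element list fixed across models and step counts makes the scores comparable, though the yes/no calls are still a judgment by whoever is scoring.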

craggy crest
# halcyon yarrow yeah the cat is a little wonky but i'd still give it a pass, its a cat on top of...

the prompt is, however, unclear on a number of elements. it says "A photograph of a white cat on top of a blue dog sitting on a brown couch in a living room. Behind them is a window and 4 cow pictures, one in each corner. Outside the window is a ufo hovering and outer space." and (picky magazine editor coming out) if we break that down, we have some fairly unclear concepts.

it starts out fine "a photograph of a white cat" and then says "on top of a blue dog" - on top... how? lying? sitting? sprawled? something else? and then it moves on to this "... sitting on a brown couch in a living room" - what is sitting? the cat? the dog? both? if an author had sent me that, i'd stop right there, and send it back and tell them to revise so it was clear to the reader what he was describing.

to go on, however, we come to "behind them is a window and 4 cow pictures, one in each corner" - behind has to refer back to the cat and dog, that's good. and window is clear. we know that we expect to see a window as their background. windows have to be in walls so we expect to see a wall. but then we see this: "four cow pictures, one in each corner" - each corner of what? the window? are they actually IN the window, in the top right, top left, bottom right, bottom left corners? are they on the wall next to the window's four corners? Or does the author actually mean that the pictures are on the wall that the window is in, but are in the 4 corners of the wall? not a clue what he really means, he didn't really say.

moving past that we see "outside the window is a ufo hovering and outer space" - the UFO is clear, we expect to see a stereotypical ufo through the window, but what does outer space mean? does it mean we just see a starry sky the way we see outer space from earth? does it mean we see outer space from some other point of view such as one of the images did, with the earth and stars around it? does it mean something else? again, it's not very clear.
- and if I, a professional publisher and editor, am going to rip this apart and tell the author to revise it and make it clear, because I can not really tell what's being described, the poor AI that hasn't got any real experience other than its training data set is really going to have a hard time figuring out what is actually wanted.

halcyon yarrow
#

yeah you're right about all of that, the prompt is lazy and unclear but i guess that's sort of the challenge, seeing if the model can make something that fits those loose words. i was also stumped when scoring/judging it what "ufo hovering and outer space" meant, like how that could work, and then i saw your variation and im like "wow thats a really good interpretation of that text" bc despite how unnatural it is, the view outside the window is of outer space

bottom line is you're right, the prompt is unclear, but it managed to put together all the elements in the prompt to deliver a really nice coherent image

glass meteor
craggy crest
halcyon yarrow
#

agreed

mortal mesa
#

maybe AI will get better one day

craggy crest
halcyon yarrow
#

[4koma] In this light and cheerful comic:
[SCENE-1] In a bright forest clearing, a cheerful boy with short brown hair stands facing an orange fox. The boy smiles and says, "Good morning, little friend!"

[SCENE-2] The fox holds up a shiny red apple, looking proud. The boy responds with a smile, "Wow, that’s so kind!"

[SCENE-3] Both sit beside a stream. The boy points at the water, laughing. He says, "This place is perfect!"

halcyon yarrow
#

using that in-context lora that's new called '4koma' designed to make cute scenes like this, flux dedistilled is the only model that can consistently nail a 'complex' set of text, literally every other flux model ive tried today can't even get one word bubble correct
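If you want to reuse the tag layout from that prompt, here is a small sketch that assembles it. It only mirrors the `[4koma]` / `[SCENE-n]` pattern seen in the prompt above; the function name `build_4koma_prompt` is made up, and whether the LoRA cares about line breaks or blank lines is an open assumption:

```python
def build_4koma_prompt(intro, scenes):
    """Build a prompt with a [4koma] intro line and numbered [SCENE-n] lines."""
    lines = [f"[4koma] {intro}"]
    for number, scene in enumerate(scenes, start=1):
        lines.append(f"[SCENE-{number}] {scene}")
    return "\n".join(lines)


prompt = build_4koma_prompt(
    "In this light and cheerful comic:",
    [
        "In a bright forest clearing, a cheerful boy stands facing an orange fox.",
        "The fox holds up a shiny red apple, looking proud.",
        "Both sit beside a stream. The boy points at the water, laughing.",
    ],
)
print(prompt)
```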

mortal mesa
halcyon yarrow
# craggy crest have you tried recraft yet?

i got as far as making an account, finding there was no model i can download, seeing it wasn't free, i get there is 50 free credits per day but im not gonna get attached to something that's not free so i stayed away from it

craggy crest
#

yeah, it's not free, but it's very good at illustration and cartoon

halcyon yarrow
#

@mortal mesa that looks like a remix of the one you posted the other day, cool variations. I like the first one the most bc it was such a lush set of greenery, this one is more 'dead' despite all the plants, cool concept that's for sure

mortal mesa
#

ya same prompt/seed, just with loras

halcyon yarrow
#

this is recraft, just copied and pasted the exact prompt into each box, super intuitive interface

#
  • it nailed the first scene no complaints there
  • the boy's face looks weird in the second, wish he was wearing the same clothes too, the text is wrong too
  • lost context and now it's a girl? what happened to the fox?

I'm sure with a few adjustments to the prompt I could solve all that. I could def see the use for this for creative professionals

#

there we go, some slight tweaks to the prompt and I basically did the same concept in less than a minute with recraft, while locally rendering that with flux took like 700 seconds lol

#

@craggy crest yeah recraft is pretty cool that was fun to make, took up 28 credits to do so i have room to make about 2 of these per day

obtuse hinge
#

Generate a poster

craggy crest
dusky thistle
dusky thistle
dusky thistle
dusky thistle
#

all SD35M

dusky thistle
native mist
#

cute boy,with black glasses

patent acorn
#

i still wish there was a major sd3.5 finetune to improve big stuff

#

recraft composition is CLEAAAN

halcyon yarrow
craggy crest
dry wave
#

flux is really good for multi-panel images, too. It usually preserves character identity really well. Complex text still might be a problem, though

bitter hearth
#

what appealed to me a lot in the paper was the sandstorm thing
would be cool to try to train ones for smoke effects or lighting effects

halcyon yarrow
# dry wave flux is really good for multi-panel images, too. It usually preserves character ...

i agree, I think between the in-context lora for multi-panel and doing really good text it's an excellent model for handling that, but at the same time stuff like recraft is more practical for the non-tech savvy who want to do something like that and don't have the skills or means to run a setup like that locally, plus 700 seconds vs 60 seconds. if recraft was free free I'd be defending it more but a paid service is kinda lame

bitter hearth
#

out of the closed-source ones, I think FLUX Pro 1.1 Ultra is the most impressive cos its 2048*2048 in just 10 seconds

#

but if you include upscalers I think its possibly the latest Topaz Gigapixel
saw a video where it did a creative upscale that ended up over 19,000 pixels wide

halcyon yarrow
#

sometimes I'll generate what I consider garbage (made using fluxubooru) bc it didn't adhere to the prompt or the source image, it just sort of did its own thing. I shared it on civit anyways, ive had 10 reactions and 30 buzz from it

frail shoal
#

SD3.5M quality seems great, but I'm only using it as a refiner to Pixart sigma. It does shitty compositions otherwise, very bland. Happy to have such a small model packing so many pixels.

bitter hearth
halcyon yarrow
#

lol yeah its nothing particularly outstanding, just funny one man's garbage is another man's treasure

bitter hearth
#

its very subjective yeah

halcyon yarrow
#

lol i do have some free credits with Kling

craggy crest
halcyon yarrow
#

lol that's a good story

#

perfectionist syndrome

craggy crest
#

:) yup. all artists suffer from it - too close to the trees and can't see the forest

#

too busy looking at the bark beetles to see anything else

halcyon yarrow
#

personally i don't see myself as an artist, or a perfectionist, i'm not detail oriented and I oftentimes overlook glaring mistakes, like i didn't notice the cat on top of the dog looked weird until you pointed it out lol

bitter hearth
#

||56y||

hexed mulch
#

cat

craggy crest
craggy crest
pseudo owl
# craggy crest the prompt is, however, unclear on a number of elements. it says "A photograph o...

Yeah I know it’s not very well formatted, but surprisingly, most of the time cleaning it up doesn’t really improve quality. This is flux schnell 8 steps, same seed. Left has 3 cow pictures but the dog has no head; right has 2 cow pictures and the dog has its head.

For example, left is
A photograph of a white cat sitting on top of a blue dog. The blue dog is sitting on the brown couch. Behind the couch is a square window with a square cow picture in each corner of the window, the total amount of windows being 4. Outside the window is a ufo hovering in dark outer space.

Right is

A photograph of a white cat sitting on top of a blue dog sitting on a brown couch in a living room. Behind them is a square window and 4 square cow pictures, one in each corner of the window. Outside the window is a ufo hovering and dark outer space.

craggy crest
pseudo owl
#

Ok yeah, that’s justified.

bitter hearth
#

the most common prompt adherence benchmark, clip score on ms-coco, uses prompts like this:
```
227590,The passenger train is painted brown and white.
467578,A box of donuts with a coffee in front of it.
379476,A long tunnel with a long table with lots of seats and candles next to wine glasses.
35206,a tennis player crouching down near the net
173208,A plate of food has some sesame seed bagels.
416059,Two people walk through the snow behind a dog.
350278,A zebra standing on top of a dirt field.
143224,An airplane on the tarmac and the glass passageway leading to its door
294853,"A man in a red cap, green shirt and white shorts holds a tennis racket under his arm."
323552,A young girl with glasses appears to be waiting with luggage at the baggage center.
185181,A giraffe on display in a glass enclosure.
43850,A man standing over his dog on a beach while holding a surfboard next to the ocean.
351369,A landing jet airplane kicking up spray on a wet runway.
558661,A woman that is standing in the grass with a frisbee.
119516,A beautiful woman standing on the side of a rad next to a street.
89790,A man in a parking lot talking to the driver of an army green pickup truck.
```

halcyon yarrow
halcyon yarrow
bitter hearth
#

they've started to move on to harder benches yeah
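For reference, a rough sketch of how caption lines in that `id,caption` format could be parsed and scored. The parsing uses Python's standard `csv` module (which handles the quoted caption with commas). The scoring part is only the general shape of a CLIP-style score: a scaled cosine similarity clipped at zero between an image embedding and a text embedding. The embeddings would come from a CLIP model that is not shown here, and `load_captions`, `clip_style_score` and the `scale` value are assumptions for illustration:

```python
import csv
import io
import math

# Three of the caption lines quoted earlier, used as sample input.
SAMPLE = '''35206,a tennis player crouching down near the net
294853,"A man in a red cap, green shirt and white shorts holds a tennis racket under his arm."
185181,A giraffe on display in a glass enclosure.
'''


def load_captions(text):
    """Parse 'id,caption' rows into a dict keyed by integer id."""
    reader = csv.reader(io.StringIO(text))
    return {int(row[0]): row[1] for row in reader if row}


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def clip_style_score(image_emb, text_emb, scale=100.0):
    """Scaled cosine similarity, clipped at zero (scale is an assumed constant)."""
    return scale * max(cosine(image_emb, text_emb), 0.0)


captions = load_captions(SAMPLE)
print(len(captions), "captions loaded")
```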

halcyon yarrow
#

here's two examples of clownshark sampler doing some interesting effects rather than creating artifacts, prompt is:

score_9, score_8_up, score_7_up, source_anime, masterpiece, best quality, perfect anatomy, very aesthetic, absurdres, (3 girls), cute, standing in a fancy restaurant, carrying menu, french maid, intricate detail, 1girl

bitter hearth
#

the extra noise helps yeah

#

it randomly pushes the model out of areas with low score function gradient

#

sometimes the model thinks it has found a good solution but it only found a good solution for that particular area of the solution space, that didn't have a lot of gradient
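A toy 1D picture of that idea, not the clownshark sampler itself: plain gradient steps on a made-up score function stall on a flat region where the gradient is zero, while the same steps with annealed random noise wander off the plateau and settle on the peak. Everything here (the `score` shape, step size, noise schedule, seed) is invented for illustration:

```python
import random


def score(x):
    """Toy score (gradient of a log-density): flat on [-1, 1], peak at x = 3."""
    if x > 1.0:
        return -4.0 * (x - 3.0)   # pulls toward the peak at 3
    if x < -1.0:
        return -4.0 * (x + 1.0)   # soft wall pushing back toward the plateau
    return 0.0                    # plateau: no gradient signal at all


def sample(x0, steps=2000, step_size=0.1, noise0=0.0, seed=0):
    """Follow the score from x0, adding Gaussian noise that anneals to zero."""
    rng = random.Random(seed)
    x = x0
    for t in range(steps):
        sigma = noise0 * (1.0 - t / steps)  # linear anneal of the noise level
        x = x + step_size * score(x) + sigma * rng.gauss(0.0, 1.0)
    return x


stuck = sample(0.0, noise0=0.0)    # no noise: never leaves the plateau
escaped = sample(0.0, noise0=0.3)  # with noise: drifts off and finds the peak
print(stuck, round(escaped, 3))
```

Real samplers work in very high dimensions with a learned score, so this only shows the intuition of noise kicking the state out of a low-gradient area.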

craggy crest
craggy crest
mortal mesa
#

yes works also

halcyon yarrow
craggy crest
bitter hearth
#

ms-coco is kinda old now, came out in 2014
it was for object detection so for that prompt really it was designed to put a bounding box on the hat, shirt, shorts and racket

#

it gets kept around for historical reasons but its not optimised for image gen at all

#

the downsides of switching the widely used benchmarks are so high that they only change benchmark when they really really have to

#

FID is very flawed also, and it is now well-known how to game FID (fake a good score, which for FID means a low one)

#

but it correlates decently enough with human preferences so they still keep it

craggy crest
#

sd3.5 L prompt: a pink poodle eating a large taco while sitting on a barrel

#

prompt: a pink poodle sitting on a barrel. it is holding a large taco in its front paws and gnawing on it

#

be specific in your prompt, you'll get closer to what you want

halcyon yarrow
#

@bitter hearth I looked into LLM2CLIP further and I had a few takeaways

  • so the LLM model is out and the vision model is out
  • I tried the vision nodes in ComfyUI to try to make it work somehow and I don't think this one is compatible with that architecture
  • it seems like the only way this is going to work is basically an upgraded ClipTextEncode node where you type in the prompt and, rather than sending it to an LLM to generate a better prompt which then gets converted to embeddings, it sends it to an LLM to generate better text embeddings
  • ultimately this is one of those "the more compute you throw at a problem the better output you get" things, i just don't think I wanna have an 8B LLM model as part of the processing pipeline to improve my images
  • Once someone gets that stuff working in ComfyUI maybe I can try quantizing the LLM into like a Q2 to make it real quick tho
craggy crest
bitter hearth