#🆕｜sd3 | Stable Diffusion | Page 124

halcyon yarrow Nov 15, 2024, 6:38 PM

#

so im guessing you input the image + this prompt to generate the bottom image?

gusty trail Nov 15, 2024, 6:44 PM

#

#

halcyon yarrow Nov 15, 2024, 6:44 PM

#

i don't get it

gusty trail Nov 15, 2024, 6:46 PM

#

I made a node to automatically patch right or bottom depending on the input image

halcyon yarrow Nov 15, 2024, 6:47 PM

#

that's different than what you were showing earlier right? I was expecting to see the same image of the couple as the top frame and then a new image as the interpretation in the bottom frame, like the knight and the princess

#

can you do the 3x frames like the knight and princess one where its the same original top image and new bottom image?

gusty trail Nov 15, 2024, 6:48 PM

#

Oh I see

mortal mesa Nov 15, 2024, 6:48 PM

#

visual identity design in context lora

gusty trail Nov 15, 2024, 6:48 PM

#

It is ok with that

halcyon yarrow Nov 15, 2024, 6:49 PM

#

mortal mesa visual identity design in context lora

lol i didn't know it applied to tattoos so how did you get that?

#

are you using xiao's workflow to input an image and used the visual identity lora?

gusty trail Nov 15, 2024, 6:49 PM

#

Just need an extra prompt for the new scene

mortal mesa Nov 15, 2024, 6:49 PM

#

its embedded, i modified a workflow Klinter made

halcyon yarrow Nov 15, 2024, 6:50 PM

#

gusty trail Just need an extra prompt for the new scene

An energetic scene of a fiery chili pepper character and a frosty popsicle character interacting in harmony: the chili pepper radiates a warm, blazing aura while the popsicle exudes a frosty chill, their contrasting energies blending in the center to create a swirling mist of steam. The backdrop is a fantastical fusion of molten lava and glittering frost, with small flames and snowflakes dancing in the air around them. Their expressions are playful yet balanced, symbolizing the coexistence of opposites in a vibrant, dynamic moment.

halcyon yarrow Nov 15, 2024, 6:51 PM

#

mortal mesa its embedded, i modified a workflow Klinter made

so it's similar to what xiao is doing where you can input the image on the left and have the lora generate the image on the right?

mortal mesa Nov 15, 2024, 6:51 PM

#

exactly that

halcyon yarrow Nov 15, 2024, 6:52 PM

#

cool you guys are really taking it one step further

gusty trail Nov 15, 2024, 6:57 PM

#

The edge is copied from above

halcyon yarrow Nov 15, 2024, 6:58 PM

#

lol oh wow that's really cool well done it follows the prompt really well too

#

i'd give it a high score for adherence

gusty trail Nov 15, 2024, 6:59 PM

#

Using another llm generated prompt

#

halcyon yarrow Nov 15, 2024, 7:01 PM

#

i like the first one better, anyways that's really cool so @gusty trail that's not the "try on" lora? what's the name of that workflow?

gusty trail Nov 15, 2024, 7:01 PM

#

No, I am just using this one https://civitai.com/models/933026/flux-product-design-in-context-lora

#

The try on lora is very specific for try on

#

You just need to load this lora and do the prompt

#

Or any other examples' lora

halcyon yarrow Nov 15, 2024, 7:03 PM

#

but you need a specific workflow to do the top mask/ bottom mask right?

gusty trail Nov 15, 2024, 7:03 PM

#

https://civitai.com/models/933018/workflows-flux-in-context-lora-for-product-design

halcyon yarrow Nov 15, 2024, 7:03 PM

#

product-design lora + mask workflow right?

gusty trail Nov 15, 2024, 7:03 PM

#

This is the workflow

#

yes

muted dove Nov 15, 2024, 7:05 PM

#

gusty trail No, I am just using this one https://civitai.com/models/933026/flux-product-desi...

Would this work with people, faces or hands?

halcyon yarrow Nov 15, 2024, 7:06 PM

#

lol yikes

gusty trail Nov 15, 2024, 7:06 PM

#

muted dove Would this work with people, faces or hands?

You could see some example in the workflow

gusty trail Nov 15, 2024, 7:07 PM

#

halcyon yarrow lol yikes

The mask node is not available on manager yet. You need to download it from https://github.com/lrzjason/Comfyui-In-Context-Lora-Utils

#

muted dove Nov 15, 2024, 7:08 PM

#

gusty trail You could see some example in the workflow

I mean realistic images. Like replacing a face, or bad hands.

gusty trail Nov 15, 2024, 7:08 PM

#

muted dove I mean realistic images. Like replacing a face, or bad hands.

You might need to use it with inpainting

halcyon yarrow Nov 15, 2024, 7:08 PM

#

i think the try-on lora demo'ed replacing the face

gusty trail Nov 15, 2024, 7:09 PM

#

#

This try on lora changed too much tbh

halcyon yarrow Nov 15, 2024, 7:09 PM

#

yeah its not as good as inpainting methods

gusty trail Nov 15, 2024, 7:10 PM

#

My friends used ic-lora + inpainting which seems pretty good

#

And I think you could actually do the second pass with inpainting to add or modify some generation

mortal mesa Nov 15, 2024, 7:50 PM

#

XiaoZhi do you happen to know what this one is? TTPlanet/Migration_Lora_flux

gusty trail Nov 15, 2024, 7:53 PM

#

mortal mesa XiaoZhi do you happen to know what this one is? TTPlanet/Migration_Lora_flux

This is the trained ic lora. It is on hf. You could find the link in github

mortal mesa Nov 15, 2024, 7:57 PM

#

oh duh i see now

halcyon yarrow Nov 15, 2024, 8:31 PM

#

i think of the in-context lora set my favorite is still the filmboard

#

set up a complex prompt for all 3 scenes and see how well the model can adhere to it, i wish they would make this lora for SD3

#

speaking of SD3 I feel the community hasn't embraced it as much as I expected, I'm pretty much the only person posting SD3 content on civit, it's like really dead in there, i never see SD3 content on my feed either, like ever

bitter hearth Nov 15, 2024, 10:10 PM

#

TTPlanet has nice upscaling nodes also

#

caption tiles, and a decent splitting node also

halcyon yarrow Nov 15, 2024, 10:14 PM

#

speaking of which @sullen moss I tried your suggestion for that 4xFFHQDAT upscaler and while that one is undeniably better in quality than the one i settled on, I was unable to adopt it bc it took 45 seconds to upscale an image with it

sullen moss Nov 15, 2024, 10:15 PM

#

halcyon yarrow speaking of which @sullen moss I tried your suggestion for that 4xFFHQ...

Well I don't make art on an industrial scale, for me it's all about quality

halcyon yarrow Nov 15, 2024, 10:16 PM

#

i settled on BSRGANx2, it delivers solid results in 10 seconds. I also liked RealESRGAN_x2plus but the other one edged it, had to do a bunch of side by sides to really settle on which one looked better

#

lol yeah i don't make art in an industrial scale either

#

the way i see it, I'm dedicating precious GPU time, which could be spent rendering images, to upscaling images that I'll be sharing online for the enjoyment of others, so I don't think it's crucial to spend another 30 seconds on increasing the quality by that much

#

like if it was +5 or even +10 more seconds i'd consider it but at 30 seconds, I could've made 2 images in that time, it's not that important especially since civit is going to downscale the thumbnails anyway

sullen moss Nov 15, 2024, 10:19 PM

#

The main thing is that there’s a choice, and everyone can find what suits them best. That’s the beauty of open source.

halcyon yarrow Nov 15, 2024, 10:19 PM

#

yeah you're right, i ended up on this website: https://openmodeldb.info/?t=arch%3Aesrgan it was super handy finding what i need

OpenModelDB

OpenModelDB is a community driven database of AI Upscaling models. We aim to provide a better way to find and compare models than existing sources.

#

i could just set up the filters and the scale to suit my needs, download all the results and just run them and compare

#

if i set it to Faces and 2x that's pretty much the only result too

bitter hearth Nov 16, 2024, 12:48 AM

#

halcyon yarrow yeah you're right, i ended up on this website: https://openmodeldb.info/?t=arch%...

their discord is excellent if you want upscaling advice

#

there are models ranging from really fast to really slow now

#

since you liked RealESRGAN maybe you would like RealPLKSR

halcyon yarrow Nov 16, 2024, 12:54 AM

#

i see 2 valid options, they're both for anime, one sharp one soft; the others are for text and for VHS tapes https://openmodeldb.info/?q=RealPLKSR&t=arch%3Arealplksr+scale%3A2

bitter hearth Nov 16, 2024, 12:55 AM

#

there's more around the internet

halcyon yarrow Nov 16, 2024, 12:55 AM

#

oh cool okay i thought that website was the end-all be-all lol

bitter hearth Nov 16, 2024, 12:56 AM

#

not rly although it is good

halcyon yarrow Nov 16, 2024, 1:58 AM

#

i tested my current upscaler of 2xBSRGAN vs 4xPurePhoto-RealPLSKR and it looks darker and fuzzier/blurrier/softer with the 4x

mortal mesa Nov 16, 2024, 2:15 AM

#

Shuttle 3 832x1216 2 step ----> 2396x3501 via 6 panels 4xNMKD-Siax and 2 step per panel, the TTP_Toolset workflow

halcyon yarrow Nov 16, 2024, 2:16 AM

#

that looks way better

#

i like how it didn't just upscale but it enhanced

#

how fast is it tho?

mortal mesa Nov 16, 2024, 2:17 AM

#

fast 2 steps!

#

14 steps total

halcyon yarrow Nov 16, 2024, 2:17 AM

#

yeah but ive tried "4 step" flux models before that are still just as slow as the counterparts, like how fast is it in seconds?

#

so the left is the original and the right is the upscaled right?

mortal mesa Nov 16, 2024, 2:18 AM

#

yes

#

i only have a 2080TI soooo....

halcyon yarrow Nov 16, 2024, 2:19 AM

#

what does it say in the end like in regards to how much time it took

#

this line: Prompt executed in 63.24 seconds

mortal mesa Nov 16, 2024, 2:19 AM

#

Prompt executed in 311.40 seconds

#

that was first model load also

halcyon yarrow Nov 16, 2024, 2:19 AM

#

lol pfft

mortal mesa Nov 16, 2024, 2:19 AM

#

might be faster now

halcyon yarrow Nov 16, 2024, 2:20 AM

#

yeah but like 40% faster at best, that's still a good 150 seconds even if we halve the time, and you got 11gb whereas i got 8 so for sure mine wouldn't be that fast

mortal mesa Nov 16, 2024, 2:20 AM

#

toss me a hard to upscale prompt 🙂

halcyon yarrow Nov 16, 2024, 2:20 AM

#

score_9, score_8_up, score_7_up, ((solo)), ((adult)), cinematic, best quality, 1girl,, pale skin, messy hair, short hair, auburn hair, freckles, freckles on chest, green eyes, dark makeup, tradwife, sundress, field, flowers, outdoors,

#

original left, remix right

mortal mesa Nov 16, 2024, 2:21 AM

#

lol ill go swipe some somewhere else, comma salad

halcyon yarrow Nov 16, 2024, 2:21 AM

#

lol

#

field of flowers in the prompt is usually a good upscale test bc there's usually lots of tiny detail in the distance that's very easy to judge

mortal mesa Nov 16, 2024, 2:22 AM

#

mmm good idea

halcyon yarrow Nov 16, 2024, 2:23 AM

#

have you tried InstantIR?

#

i personally felt betrayed by how long it took regardless of the quality simply bc its name implied it was going to be fast, like instant fast lol

mortal mesa Nov 16, 2024, 2:24 AM

#

i havent, there was something else like that i wanted to try

#

man, i forget

craggy crest Nov 16, 2024, 2:24 AM

#

mortal mesa lol ill go swipe some somewhere else, comma salad

an ornate box, open lid, a spider crawling out of it; Rob Gonsalves; hyper-realistic,hyper-detailed, fantasy, cosmic art; elegant, intricate, detailed, extremely textured, colossal, monstrous.

#

bilateral symmetrical dichotomy

halcyon yarrow Nov 16, 2024, 2:26 AM

#

@mortal mesa so explain to me this setup:

Shuttle 3 832x1216 2 step ----> 2396x3501 via 6 panels 4xNMKD-Siax and 2 step per panel, the TTP_Toolset workflow

what's shuttle 3?
what's 6 panels?
so the upscaler you used is 4xNMKD-Siax but it doesn't work with the built in comfyui Load Upscale Model stuff so I need the TTP_toolset custom_nodes to make that upscaler work?

craggy crest Nov 16, 2024, 2:26 AM

#

halcyon yarrow @mortal mesa so explain to me this setup: Shuttle 3 832x1216 2 step ---...

shuttle diffusion is flux - make sure your other models work with flux

halcyon yarrow Nov 16, 2024, 2:28 AM

#

so he's using shuttle diffusion 3 to generate 832x1216 @ 2 steps and then upscaling it to 2396x3501 using 4xNMKD-Siax, and what's the 2 step per panel and 6 panel thing about? is that a special node?

mortal mesa Nov 16, 2024, 2:29 AM

#

the time i said isnt going to be accurate either cuz i ran this whole section im going to excise out

#

halcyon yarrow Nov 16, 2024, 2:33 AM

#

it looks so much better

mortal mesa Nov 16, 2024, 2:34 AM

#

halcyon yarrow @mortal mesa so explain to me this setup: Shuttle 3 832x1216 2 step ---...

Shuttle 3 is a modified flux schnell checkpoint that they advertise as 4 step, it also works well at 2 steps. The initial image is generated at 2 steps, it's upscaled by the NMKD model, cut into 6 panels, and each panel is run through ksampler again at 2 steps, then reassembled

#

and each panel gets run through florence for a prompt
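
The panel step described above can be sketched roughly as follows. This is only an illustration of cutting an upscaled image into a grid: the `splitIntoPanels` name, the 2 x 3 layout and the `overlap` parameter are assumptions, not how the TTP_Toolset nodes are actually implemented.

```js
// Hypothetical helper: split an upscaled image into a grid of panels.
// The grid shape and the optional overlap are assumptions for illustration.
function splitIntoPanels(width, height, cols, rows, overlap = 0) {
  const panels = [];
  const tileW = Math.ceil(width / cols);
  const tileH = Math.ceil(height / rows);
  for (let row = 0; row < rows; row++) {
    for (let col = 0; col < cols; col++) {
      // Grow each tile by `overlap` pixels, clamped to the image bounds.
      const x0 = Math.max(0, col * tileW - overlap);
      const y0 = Math.max(0, row * tileH - overlap);
      const x1 = Math.min(width, (col + 1) * tileW + overlap);
      const y1 = Math.min(height, (row + 1) * tileH + overlap);
      panels.push({ x: x0, y: y0, width: x1 - x0, height: y1 - y0 });
    }
  }
  return panels;
}

// The 2396x3501 upscale from the chat, cut into 6 panels (2 columns, 3 rows).
const panels = splitIntoPanels(2396, 3501, 2, 3);
```

Each rectangle would then be captioned, resampled for 2 steps and pasted back at its `x`/`y` offset. A non-zero overlap is a common way to hide seams between panels, though whether this workflow uses one is not stated in the chat.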

#

i didn't make it, just adapted it

halcyon yarrow Nov 16, 2024, 2:35 AM

#

lol damn that's some level of care going into upscaling thats next level

bitter hearth Nov 16, 2024, 2:35 AM

#

halcyon yarrow have you tried InstantIR?

InstantIR is quite possibly the best upscaler

#

seemed to beat Supir

halcyon yarrow Nov 16, 2024, 2:36 AM

#

yeah from that screenshot I posted (not sure if you saw it) it really did seem like the absolute best hands down

mortal mesa Nov 16, 2024, 2:36 AM

#

supir was way too heavy for me but ya could do some nice stuff

bitter hearth Nov 16, 2024, 2:36 AM

#

the advantage of Supir currently though is that its been broken apart in ComfyUI now by Kijai so you can swap certain bits in and out

mortal mesa Nov 16, 2024, 2:37 AM

#

i should revisit it, i was talking about when it first came out with zero optimizations haha

halcyon yarrow Nov 16, 2024, 2:39 AM

#

i'm replying to the comparison image of the upscalers if you wanna look at it

bitter hearth Nov 16, 2024, 2:39 AM

#

mortal mesa i should revisit it, i was talking about when it first came out with zero optimi...

this happened with the Flux unsampling thing too
people tried a slightly janky version when it first came out so they thought it was bad

halcyon yarrow Nov 16, 2024, 2:41 AM

#

left SUPIR right InstantIR

mortal mesa Nov 16, 2024, 2:41 AM

#

Oh and this was one i was curious about also https://github.com/2kpr/ComfyUI-PMRF

dusky thistle Nov 16, 2024, 2:47 AM

#

bitter hearth Nov 16, 2024, 2:48 AM

#

mortal mesa Oh and this was one i was curious about also https://github.com/2kpr/ComfyUI-PM...

thanks will try it

mortal mesa Nov 16, 2024, 2:53 AM

#

the upscaled chest and spider kinda looks bad full size :/

halcyon yarrow Nov 16, 2024, 2:57 AM

#

i tried PMRF but i have cuda 12.1 not 12.4 installed so im gonna put it off for now

bitter hearth Nov 16, 2024, 3:16 AM

#

mortal mesa the upscaled chest and spider kinda looks bad full size :/

might be the noise injection

#

had some coloured artifacts

dusky thistle Nov 16, 2024, 3:23 AM

#

mortal mesa Nov 16, 2024, 3:29 AM

#

ya im messing with noise and denoise levels, i have to say i am pretty impressed with the tiling method, better than what ive tried

craggy crest Nov 16, 2024, 3:52 AM

#

mortal mesa

those came out fantastic

halcyon yarrow Nov 16, 2024, 4:51 AM

#

@mortal mesa I modified that workflow you're using in a few ways

replaced flux shuttle with sd3.5 large fp8, I could probably optimize it further by trying turbo instead of large but i found large is faster than turbo when using the AIO model bc it comes built in with fast clip models
removed the image generation part, converting it to a purely upscale workflow
replaced florence with BLIP for speed

I got it down to 117 seconds and the results are outstanding

#

Prompt executed in 98.99 seconds in subsequent runs

#

original image left, second one is pre-texture detailer, third is post-texture detailer

mortal mesa Nov 16, 2024, 5:00 AM

#

sooo speaking about the texture detailer area, look where the negative prompt is coming from, it doesn't seem right, i dont understand it

#

but ya ill be reusing that middle section also

halcyon yarrow Nov 16, 2024, 5:08 AM

#

yeah i noticed that, im replacing all that jazz with clownshark

dusky thistle Nov 16, 2024, 5:19 AM

#

#

#

#

mortal mesa Nov 16, 2024, 6:02 AM

#

gusty trail Nov 16, 2024, 7:51 AM

#

civic trail Nov 16, 2024, 9:35 AM

#

fast pendant Nov 16, 2024, 10:19 AM

#

Hello everyone

#

how can I use SD3.5 large for sketch to image?

fast pendant Nov 16, 2024, 10:53 AM

#

?

civic trail Nov 16, 2024, 10:53 AM

#

fast pendant ?

I gave you flux

civic trail Nov 16, 2024, 10:57 AM

#

fast pendant how can I use SD3.5 large for sketch to image?

But hey google is your friend: https://comfyui-wiki.com/tutorial/advanced/stable-diffusion-3-5-comfyui-workflow.en-US#google_vignette

Stable Diffusion 3.5 Workflow Tutorial in ComfyUI – ComfyUI-Wiki

Master Stable Diffusion with ComfyUI Wiki! Explore tutorials, nodes, and resources to enhance your ComfyUI experience.

dusky thistle Nov 16, 2024, 11:25 AM

#

#

#

#

#

#

#

#

#

#

#

cunning lintel Nov 16, 2024, 12:06 PM

#

dusky thistle Nov 16, 2024, 2:04 PM

#

halcyon yarrow Nov 16, 2024, 2:31 PM

#

@dusky thistle check out this artifacting, using sd3 large q8

#

i think its really interesting how the sampler will never produce an image that is objectively flawed its only subjectively bad

dusky thistle Nov 16, 2024, 2:33 PM

#

wow

#

what's it look like with euler in ksampler, or the new euler_ancestral in ksampler

halcyon yarrow Nov 16, 2024, 2:34 PM

#

i could add that to the queue but I gotta wait for res_3s to try a pass at it first

#

that was using res_2m

#

and this is the source image not that it matters bc i didn't use img2img, and this is the prompt that generated that image:
hideous witch, by Sergei Parajanov, <lora:aidmaMJ6.1_v0.3:1> <lora:aidmaImageUpgrader:1>,

#

same seed, same exact parameters for both, res_2m left, res_3s right

dusky thistle Nov 16, 2024, 2:41 PM

#

idk if you've tried rk_exp_5s yet, it's a little slower than res_3s but another step up in quality espec with SD3.5M

halcyon yarrow Nov 16, 2024, 2:42 PM

#

res_3s understood the assignment when given that prompt for the "hideous witch": instead of making some crazy wizard it actually did a witch, and i wouldn't say she's hideous (everyone is beautiful in their own way) but she's quite shocking to look at lol

dusky thistle Nov 16, 2024, 2:42 PM

#

do the artifacts show up with regular SD3.5L (or do you not have the vram to test)

halcyon yarrow Nov 16, 2024, 2:43 PM

#

you can still see some of the same patterns the wizard showed as far as texture and artifacting, i'll def try rk_exp_5s and compare it

#

i dont have the full 3.5L unpruned to test I just have the fp8 AIO, the q8 and q8 turbo

dusky thistle Nov 16, 2024, 2:44 PM

#

can you screencap the clownshark params and paste the prompt? i could give it a shot over here, i have the full

halcyon yarrow Nov 16, 2024, 2:45 PM

#

is this easier for you?

📎 message.txt

#

btw i don't choose that layout, its just how i programmed it, i feel like i should spend some time to program the layout better

#

if (!nextItemCloned.workflow.oldWorklowUsed) {
    nextItemCloned.rk_type = _.sample(['rk_exp_5s', 'res_3s']);
}

here's what I'm going to do moving forward, in situations where I ask to use the new workflow (shark) instead of always picking res_3s it'll randomly pick one of those 2, any other high quality and slow sampler I could add to this list ?
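
For anyone without lodash, the same random pick can be sketched in plain JavaScript. The sampler names come from the message above; `pickSampler` and the injectable `random` argument are hypothetical, added only so the choice can be made repeatable when comparing runs.

```js
// Sampler names are taken from the chat; the helper itself is hypothetical.
const SLOW_SAMPLERS = ['rk_exp_5s', 'res_3s'];

// Pick one entry uniformly at random. `random` defaults to Math.random but
// can be swapped for a fixed function to make a test run repeatable.
function pickSampler(samplers, random = Math.random) {
  if (samplers.length === 0) {
    throw new Error('no samplers to choose from');
  }
  const index = Math.floor(random() * samplers.length);
  return samplers[index];
}

const chosen = pickSampler(SLOW_SAMPLERS);
```

Adding another slow, high quality sampler later only means appending its name to the array.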

dusky thistle Nov 16, 2024, 2:47 PM

#

halcyon yarrow ```js if (!nextItemCloned.workflow.oldWorklowUsed) { nextItemCloned....

here's the origin of the artifacts

#

it's the length of your prompts

#

SD35 is really really weird about it

#

never seen a model do it but if you go past 72 tokens it starts going downhill, espec with the neg conditioning

halcyon yarrow Nov 16, 2024, 2:48 PM

#

ah makes sense

dusky thistle Nov 16, 2024, 2:48 PM

#

the truncate conditioning option in shark is there just to safeguard in case you go over and are seeing problems, it cuts it down to the size for a one chunk embed

halcyon yarrow Nov 16, 2024, 2:48 PM

#

so it really does make sense to truncate SD3.5 negative prompts right?

dusky thistle Nov 16, 2024, 2:48 PM

#

it goes downhill with both

#

but it's worse with the neg

halcyon yarrow Nov 16, 2024, 2:48 PM

#

but the truncate option does so for both positive and negative right?

dusky thistle Nov 16, 2024, 2:48 PM

#

yeah

halcyon yarrow Nov 16, 2024, 2:48 PM

#

i'll just add some code on my end to truncate the negative prompt to 77 tokens for SD3.5 and maybe flux too?

dusky thistle Nov 16, 2024, 2:48 PM

#

https://novelai.net/tokenizer i just use this to double check how many i'm using

NovelAI

NovelAI - The AI Storyteller

NovelAI is a monthly subscription service for AI-assisted image generation, storytelling, or simply a LLM powered sandbox for your imagination.

halcyon yarrow Nov 16, 2024, 2:49 PM

#

actually for flux i'm just blanking it out

dusky thistle Nov 16, 2024, 2:49 PM

#

72 tokens for whatever reason is the limit

#

i still haven't gotten around to looking at what's going on whatsoever, but once you hit 73 input tokens, truncate changes the output

#

72, it's the same

#

so it's probably moving onto the next block at 73 for whatever reason

#

yeah i'm just using a blank neg for sd35 myself

halcyon yarrow Nov 16, 2024, 2:50 PM

#

if (isSd3Model) {
    // truncate to 250 characters
    nextItem.negative_prompt = nextItem.negative_prompt.substring(0, 250);
}

alright truncating in place
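
A plain `substring` can cut a tag in half. A possible refinement, sketched here under assumptions, is to drop whole comma-separated tags once a character budget is used up. The 250-character budget mirrors the snippet above and is only a rough stand-in for the 72 to 77 token limit discussed here, since it does not count real tokens; `truncateTags` is a hypothetical name.

```js
// Hypothetical helper: keep whole comma-separated tags within a character
// budget. Characters are only a rough proxy for tokens, not a real count.
function truncateTags(prompt, maxChars = 250) {
  const tags = prompt.split(',').map((tag) => tag.trim()).filter(Boolean);
  const kept = [];
  let length = 0;
  for (const tag of tags) {
    // Account for the ", " separator that join() adds between tags.
    const added = (kept.length > 0 ? 2 : 0) + tag.length;
    if (length + added > maxChars) break;
    kept.push(tag);
    length += added;
  }
  return kept.join(', ');
}

const shortened = truncateTags('bad quality, low quality, worst quality', 25);
```

With a budget of 25 characters the example keeps the first two tags and drops the third instead of leaving a partial word. Counting with an actual tokenizer would be more accurate than any character budget.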

#

well i figured sd3.5 isn't distilled and it's always handled negative prompts so i dont want to treat it like flux and just blank it out

dusky thistle Nov 16, 2024, 2:51 PM

#

i haven't done comprehensive tests or anything, but the ones i did do... seemed to degrade with negative prompts of any kind

halcyon yarrow Nov 16, 2024, 2:52 PM

#

ive seen that for sure on flux

dusky thistle Nov 16, 2024, 2:52 PM

#

it def isn't like it was with cascade or sdxl (or sd15) where something like "bad quality" actually did generally lead to a better image

halcyon yarrow Nov 16, 2024, 2:52 PM

#

where any negative prompting heavily reduces the quality of the image, even when using a dedistilled model

#

i used a really long negative prompt for this one score_6, score_5, score_4, source_pony, source_anime, pink nipples, source_furry, source_cartoon, censored, deformed hands, deformed fingers, extra fingers, missing fingers, extra limbs, missing limbs, bad eyes, ugly face, blurry face, wrong anatomy, crossed eyes, missing leg, missing foot, unattached hand, deformed, deformed face, bad teeth, ugly teeth, low quality, bad quality, worst quality

dry wave Nov 16, 2024, 2:55 PM

#

negative prompting is not necessary for cfg, in fact it's not even part of the original cfg implementation

halcyon yarrow Nov 16, 2024, 2:55 PM

#

awww dude - Value not in list: sampler_name: 'rk_exp_5s' not in (list of length 28) lol c'mon so that's a new sampler then? gotta do the old git pull? are there any hidden bombs i should be aware of before i pull it?

dry wave Nov 16, 2024, 2:55 PM

#

cfg itself is a hack, but it's necessary for diffusion models to reach good performance

#

negative prompting is a hack of a hack

halcyon yarrow Nov 16, 2024, 2:55 PM

#

wow didnt know that

dry wave Nov 16, 2024, 2:56 PM

#

if you can get rid of it, perfect. If you need it, use it, if the model works without that's great

halcyon yarrow Nov 16, 2024, 2:59 PM

#

i'd much rather have cfg than have a distilled model where I get a range of 1 to 1.8 and i lose control over the aesthetics of the image

dry wave Nov 16, 2024, 3:01 PM

#

I prefer cfg, too, but I'm talking about negative prompts

#

negative prompts are not part of cfg, they are an optional feature

#

they might work, or they might make things worse. People often misinterpret how negative prompts work and might overuse them or use them wrongly
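
To make the distinction concrete, here is a minimal sketch of the classifier-free guidance combination on plain arrays. The arrays stand in for the model's noise predictions and `cfgCombine` is a hypothetical name. In the usual formulation the `uncond` prediction comes from an empty prompt; negative prompting simply feeds a different prompt into that same slot, which is why it is an add-on rather than part of cfg itself.

```js
// Classifier-free guidance on plain arrays standing in for noise predictions.
// uncond: prediction for the empty prompt (or for a negative prompt, if used)
// cond:   prediction for the positive prompt
// scale:  the cfg scale; 1 returns the conditional prediction unchanged
function cfgCombine(uncond, cond, scale) {
  return cond.map((c, i) => uncond[i] + scale * (c - uncond[i]));
}

const guided = cfgCombine([0, 0], [1, 2], 3.5);
```

At a scale of 1 the unconditional branch cancels out entirely, which matches the observation that distilled models running near cfg 1 get little or nothing from a negative prompt.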

halcyon yarrow Nov 16, 2024, 3:30 PM

#

yeah i agree with that statement, heck it took me a while to grasp the concept of negative prompts

#

not my creation, just a fun share, created using flux

zenith terrace Nov 16, 2024, 3:49 PM

#

catwhaaa

halcyon yarrow Nov 16, 2024, 4:59 PM

#

sd3.5_large_fp8...aled | 🌱 4224251490 | 🦶 29 | 🦮 3.5 | cfg_scale_alt 3.5 | 🧠 sd35_VAE | 🎤 dpmpp_2m | 🕦 sgm_uniform | 🗓 11/16, 11:33 AM | ⏱️ 107s
(ignore the sampler/scheduler its just using res_2m) my only gripe with it is that vertical line on the left side

#

halcyon yarrow Nov 16, 2024, 5:31 PM

#

#

#

#

@short thicket it makes sense to see the mangled model performing on par with the flux destill model, i wouldn't say it's any faster

civic trail Nov 16, 2024, 6:17 PM

#

craggy crest Nov 16, 2024, 6:17 PM

#

halcyon yarrow

what on earth?

#

halcyon yarrow Nov 16, 2024, 6:20 PM

#

last chart I promise, i removed the destilled models and the SDXL model at the bottom, and included all results regardless of sample size or percentile

halcyon yarrow Nov 16, 2024, 6:20 PM

#

craggy crest what on earth?

you dont understand it or you dont believe the stats?

#

weird how medium is taking the same time for me as large right? you see how red and pink are like inline with each other?

craggy crest Nov 16, 2024, 6:21 PM

#

halcyon yarrow you dont understand it or you dont believe the stats?

it looked like a mountain and i visualized a ski run

craggy crest Nov 16, 2024, 6:22 PM

#

halcyon yarrow weird how medium is taking the time for me as large right? you see how red and p...

not really. in my experience, medium and large run about the same speed

halcyon yarrow Nov 16, 2024, 6:23 PM

#

oh wow interesting so you're seeing the same thing too, i feel like that's kinda bullshit, that gives me 0 incentive to ever use medium

#

don't you find that's weird considering one is a much smaller model?

craggy crest Nov 16, 2024, 6:24 PM

#

halcyon yarrow oh wow interesting so you're seeing the same thing too, i feel like that's kinda...

medium is designed to be more artsy than large. it has a use

craggy crest Nov 16, 2024, 6:24 PM

#

halcyon yarrow don't you find that's weird considering one is a much smaller model?

not really

halcyon yarrow Nov 16, 2024, 6:24 PM

#

oh i didnt know that

craggy crest Nov 16, 2024, 6:24 PM

#

medium makes a nice refiner, too

halcyon yarrow Nov 16, 2024, 6:24 PM

#

so it's not the same training data just distilled?

craggy crest Nov 16, 2024, 6:24 PM

#

halcyon yarrow so it's not the same training data just distilled?

no, it's not

halcyon yarrow Nov 16, 2024, 6:25 PM

#

that changes things good to know

#

cfg 7, cfg 6, then 5, then 4. it's clear lower cfg improves image quality, I just really liked the scene at cfg 7. i'm doing another run at 3.5 to see what that looks like

craggy crest Nov 16, 2024, 6:26 PM

#

halcyon yarrow that changes things good to know

this workflow is for 3.5 l with upscaling and 3.5m as a refiner. it was released with the rest of the 3.5 releases by SAI. you might take a look at it

📎 SD3.5L_plus_SD3.5M_upscaling_example_workflow.json

halcyon yarrow Nov 16, 2024, 6:27 PM

#

oh cool ive been playing with the idea of trying to make a performant one, I was messing with the florence 2 yesterday, i think Kagi or NeonNinja shared it

#

i could get it down to 100s with decent results, Ill try that one and modify it to start with Load Image rather than 3.5L and see what kind of times I can get with it

craggy crest Nov 16, 2024, 6:30 PM

#

sounds good :)

mortal mesa Nov 16, 2024, 6:32 PM

#

halcyon yarrow Nov 16, 2024, 6:36 PM

#

lol wow talk about upscale

#

that's fun to look at, you can even see inside the caves

#

@mortal mesa how long did it take you to make that image?

mortal mesa Nov 16, 2024, 6:39 PM

#

normal time, nothing special, what i was doing yesterday with slightly different settings

halcyon yarrow Nov 16, 2024, 6:42 PM

#

so that's the 6 panel workflow with florence and the NMKD upscaler right?

mortal mesa Nov 16, 2024, 6:44 PM

#

mmm i was swapping upscale models, i forget what was used on that one, i'd have to load it. i'd bet 4x ultrasharp, but ya that WF. raised denoise and lowered noise injection

craggy crest Nov 16, 2024, 6:48 PM

#

@mortal mesa you really need to animate that, that's got huge potential

#

mortal mesa Nov 16, 2024, 7:09 PM

#

craggy crest @mortal mesa you really need to animate that, that's got huge potential

yes i was pleasantly surprised, local video stuff is tough for me (time and OOM) but ya could be nice

hallow lion Nov 16, 2024, 7:16 PM

#

halcyon yarrow not my creation, just a fun share, created using flux

Russian DUNE.

craggy crest Nov 16, 2024, 8:21 PM

#

short thicket Nov 16, 2024, 8:48 PM

#

halcyon yarrow @short thicket it makes sense to see the mangled model performing on par ...

which one is mangled and which is acorn?

halcyon yarrow Nov 16, 2024, 8:50 PM

#

short thicket which one is mangled and which is acorn?

Did you see the chart and skip my comment? It’s the one “on par with the destill” model in other words the red line is clearly destill being the slowest model of the group and the orange line that touches it is mangled. ChatGPT made the chart don’t blame me on the color selection lol

short thicket Nov 16, 2024, 8:51 PM

#

halcyon yarrow Did you see the chart and skip my comment? It’s the one “on par with the destill...

Did you see the chart and skip my comment? Yup. LOL I read it afterwards. My bad.

halcyon yarrow Nov 16, 2024, 8:52 PM

#

Lol it’s fine, I was upset about not being able to tell either and I had to reason my way to figure it out

short thicket Nov 16, 2024, 8:52 PM

#

Have you tried it past 3?

halcyon yarrow Nov 16, 2024, 8:52 PM

#

I haven’t tried it past 3 LoRAs

#

I wish I could convey the sample size per entry in a chart

#

I feel like having 100+ entries for a given model and Lora count would be more accurate than something with 1 entry

civic trail Nov 16, 2024, 9:00 PM

#

craggy crest Nov 16, 2024, 9:13 PM

#

halcyon yarrow Nov 16, 2024, 9:33 PM

#

@short thicket okay the time it took me from when i said that to when I was happy with a result is 40 minutes so you could almost say i spent 40 minutes making this chart (for fun of course)

#

the green means a confident number bc there's enough tests done for that scenario that the indicator is a good measure of actual average times
the yellow means not so much bc there's between 10 and 100 tests done
and the red means take it with a grain of salt bc less than 10 were done so it might be an edge case outlier as far as actual expected times
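
The color legend above amounts to a small threshold function. A sketch, assuming green starts at exactly 100 tests (the message only says "between 10 and 100" for yellow, so the boundary handling is a guess) and using the hypothetical name `confidenceColor`:

```js
// Map the number of test runs behind a data point to a legend color.
// Treating exactly 100 runs as green is an assumption; the chat does not
// say which side of the boundary it falls on.
function confidenceColor(sampleCount) {
  if (sampleCount < 10) return 'red';     // too few runs, may be an outlier
  if (sampleCount < 100) return 'yellow'; // some runs, moderate confidence
  return 'green';                         // enough runs to trust the average
}

const colors = [1, 10, 99, 100, 250].map(confidenceColor);
```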

bitter hearth Nov 16, 2024, 9:37 PM

#

how much vram do you have and are you sure you didn't fill your vram during generation?

#

I ask because that changes the numbers

#

if vram filled

short thicket Nov 16, 2024, 9:42 PM

#

halcyon yarrow the green means a confidient number bc there's enough tests done for that scenar...

Also, I noticed these are all gguf. Ive heard fp8 safetensors runs faster than gguf.

bitter hearth Nov 16, 2024, 9:43 PM

#

it does yeah especially on ada/hopper

#

but it does on any gpu

halcyon yarrow Nov 16, 2024, 9:45 PM

#

short thicket Also, I noticed these are all gguf. Ive heard fp8 safetensors runs faster than g...

i thought gguf was designed to be faster than fp8, by being better at memory conservation it was inherently faster than fp8?

#

in my experience the fp8 version for SD3 does run faster, in fact here's the chart for it

craggy crest Nov 16, 2024, 9:46 PM

#

@halcyon yarrow you are putting way too much time into this to just post it here. you should consider making a video tutorial

halcyon yarrow Nov 16, 2024, 9:46 PM

#

but i don't think the fp8 is faster bc of the pruning method but rather bc it forces me to use the built in clip, which is a lower quality set than the triple clip setup i normally use

halcyon yarrow Nov 16, 2024, 9:46 PM

#

craggy crest @halcyon yarrow you are putting way too much time into this to just post i...

maybe i'll put it as a civitai article, ive done those before

craggy crest Nov 16, 2024, 9:46 PM

#

that'd be good too

halcyon yarrow Nov 16, 2024, 9:47 PM

#

im also genuinely curious so im doing it for myself and sharing with others

craggy crest Nov 16, 2024, 9:47 PM

#

it'll get buried here and lost.

bitter hearth Nov 16, 2024, 9:47 PM

#

gguf is slower than fp8

#

particularly on GPUs that have native fp8 matmul

#

neither of these are pruning

#

pruning is something a bit different

halcyon yarrow Nov 16, 2024, 9:48 PM

#

quantizing something isn't a form of pruning? i think it is

bitter hearth Nov 16, 2024, 9:48 PM

#

speaking of which, 3B pruned flux came out just now https://huggingface.co/TencentARC/flux-mini

craggy crest Nov 16, 2024, 9:48 PM

#

halcyon yarrow quantizing something isn't a form of pruning? i think it is

not really

halcyon yarrow Nov 16, 2024, 9:48 PM

#

pruning is to selectively remove bits from the model; by quantizing you're selectively removing the precision bits

craggy crest Nov 16, 2024, 9:49 PM

#

when you quant something, what do you do?

halcyon yarrow Nov 16, 2024, 9:49 PM

#

you're rounding the precision on the weights right?

craggy crest Nov 16, 2024, 9:49 PM

#

halcyon yarrow pruning is to selectively remove bits from the model by quantsizing you're selec...

you're not removing data however, you're just stopping how many decimal places you do the math out to

#

when you prune, you actually remove data

halcyon yarrow Nov 16, 2024, 9:50 PM

#

if i'm wrong then civitai is wrong bc they call different options like bf16 and fp8 as different pruning types

bitter hearth Nov 16, 2024, 9:50 PM

#

quantisation is converting floating point numbers to less precise formats, whereas pruning is actually removing weights from the calculation
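
A toy illustration of that distinction, using only the standard library. The weights, the rounding step and the threshold are made-up numbers; real fp8/GGUF quantisation and real pruning (which usually removes whole blocks or layers) are far more involved than this.

```python
weights = [0.8137, -0.0042, 0.2591, -0.6318]


def quantize(ws, step=0.25):
    # Every weight is kept, but each one is snapped to the nearest
    # multiple of `step`, so it can be stored with fewer bits.
    return [round(w / step) * step for w in ws]


def prune(ws, threshold=0.1):
    # Small-magnitude weights are dropped from the model entirely.
    return [w for w in ws if abs(w) >= threshold]


quantized = quantize(weights)
pruned = prune(weights)

print(len(weights), len(quantized), len(pruned))  # same count vs. fewer weights
print(quantized)
print(pruned)
```

Quantising keeps the same number of weights at lower precision; pruning leaves fewer weights at the original precision.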

halcyon yarrow Nov 16, 2024, 9:50 PM

#

i agree when you prune you actually remove data, by rounding a number you're effectively removing data

craggy crest Nov 16, 2024, 9:50 PM

#

halcyon yarrow if i'm wrong then civitai is wrong bc they call different options like bf16 and ...

those are the number of decimal places you allow on the end of a number, which affects the precision of the math

halcyon yarrow Nov 16, 2024, 9:50 PM

#

a full unpruned 22gb model has had its data removed to become a q8 model at 11gb

craggy crest Nov 16, 2024, 9:51 PM

#

halcyon yarrow a full unpruned 22gb model has had it's data removed to become a q8 model at 11g...

it hasn't had the weights removed.

#

(if it's flux, it doesn't need 4 gig of the padding anyway)

halcyon yarrow Nov 16, 2024, 9:51 PM

#

i never said it did right?

craggy crest Nov 16, 2024, 9:51 PM

#

but that's what pruning is

#

the weights are actually removed

halcyon yarrow Nov 16, 2024, 9:51 PM

#

i think that's a form of pruning

#

by reducing the number of weights we can also call that distilling

craggy crest Nov 16, 2024, 9:52 PM

#

pruning is when you cut branches off a tree. quant is when you put a ring around the base and don't let the roots grow out very far

bitter hearth Nov 16, 2024, 9:52 PM

#

these terms mean specific things its not actually debatable

halcyon yarrow Nov 16, 2024, 9:53 PM

#

look all i'm saying is if I'm using the wrong terminology take it up with CivitAI.com bc they're a really big player in the industry and they're using that terminology to refer to these different methods like fp8, bf16, q8 etc

#

i think pruning is a general term that can refer to distilling (to reduce weights) or quantizing (to round weights), with both methods you're effectively removing data

craggy crest Nov 16, 2024, 9:55 PM

#

halcyon yarrow look all i'm saying is I'm using the wrong terminolgy take it up with CivitAI.co...

civit does a lot of things wrong - but @bitter hearth is a programmer, and math guy, and civit is not. i would listen to him.

craggy crest Nov 16, 2024, 9:55 PM

#

halcyon yarrow i think pruning is a general term that can refer to distilling (to reduce weight...

it isn't, however. it refers to a very specific action that is taken on a model

halcyon yarrow Nov 16, 2024, 9:56 PM

#

its not a big deal, let's just agree to disagree 🤝

mortal mesa Nov 16, 2024, 10:08 PM

#

call it pruning the precision

halcyon yarrow Nov 16, 2024, 10:37 PM

#

There we go that’s a good one, speaking of pruning I wanna try that Flux Mini 3b Neon showed off, looks bad ass

craggy crest Nov 16, 2024, 10:41 PM

#

@dusky thistle @bitter hearth https://everlyheights.tv/everly-heights-xyz-grid-evaluator/

#

think that's worth anything?

bitter hearth Nov 16, 2024, 10:43 PM

#

I think I badly misunderstood this applet

craggy crest Nov 16, 2024, 10:44 PM

#

you're not supposed to feed it dinner

bitter hearth Nov 16, 2024, 10:45 PM

#

I do like the composable loras on this site

#

separate loras for background and characters etc

halcyon yarrow Nov 16, 2024, 11:12 PM

#

12 seconds to generate using Flux Mini, this is the default prompt ComfyUI puts in

#

@bitter hearth have you tried it on comfy yet?

#

18 seconds, 40 steps, cfg 4.5 ddim beta

bitter hearth Nov 16, 2024, 11:14 PM

#

gonna download it now

halcyon yarrow Nov 16, 2024, 11:15 PM

#

67 seconds at 40 steps cfg 4.5 ddim beta

craggy crest Nov 16, 2024, 11:18 PM

#

flux mini seems to be struggling

bitter hearth Nov 16, 2024, 11:20 PM

#

this was SDXL on the same prompt, earlier in the year

halcyon yarrow Nov 16, 2024, 11:20 PM

#

lol yeah i was just gonna post that

#

this is with clownshark sampler

#

ksampler just sucks thats what it is

bitter hearth Nov 16, 2024, 11:21 PM

#

clown stuff is just so much better yeah

shell bloom Nov 16, 2024, 11:21 PM

#

halcyon yarrow 12 seconds to generate uusing Flux Mini, this is the default prompt ComfyUI puts...

Where do you use flux mini, on comfyui? Because it gives me an error there

halcyon yarrow Nov 16, 2024, 11:21 PM

#

that was 170 seconds too

#

its a diffusers based model so you cant use load checkpoint

#

you gotta use load diffusion model and then load it up with the clips and standard flux vae on the side

shell bloom Nov 16, 2024, 11:22 PM

#

ahhh ok thanks

bitter hearth Nov 16, 2024, 11:22 PM

#

this part of comfy is confusing

#

do I put the model in unet folder or diffusion_model folder

halcyon yarrow Nov 16, 2024, 11:22 PM

#

put it in the diffusion_models folder

bitter hearth Nov 16, 2024, 11:22 PM

#

ok thanks

halcyon yarrow Nov 16, 2024, 11:22 PM

#

you can just load my workflow if you wanna try it

shell bloom Nov 16, 2024, 11:23 PM

#

halcyon yarrow putu it ini the diffusion_model folder

Thanks

craggy crest Nov 16, 2024, 11:23 PM

#

bitter hearth do I put the model in unet folder or diffusion_model folder

the better question is, do i bribe Comfy to create standards or do i just accept him doing stuff at random

halcyon yarrow Nov 16, 2024, 11:23 PM

#

sometimes it doesn't copy the metadata when i just copy the image through the clipboard so here's the fileupload

bitter hearth Nov 16, 2024, 11:23 PM

#

LOL

halcyon yarrow Nov 16, 2024, 11:23 PM

#

im trying res_2m see what times i get, i think i can safely bring it down to 20 steps too

bitter hearth Nov 16, 2024, 11:24 PM

#

craggy crest the better question is, do i bribe Comfy to create standards or do i just accept...

I've actually mostly switched to Diffusers at this point just because its more standardised TBH

#

but there is no nice UI unless Matteo's project does well

halcyon yarrow Nov 16, 2024, 11:25 PM

#

i don't like diffusers format it just makes things more confusing and mostly bc my whole codebase is written around the checkpoints folder i dont support models in that diffusion_models folder

bitter hearth Nov 16, 2024, 11:25 PM

#

flux dev is ok at 20 steps yeah it will be worse quality than 40 steps but will still give an ok image

halcyon yarrow Nov 16, 2024, 11:25 PM

#

im gonna have to convert it to SD format i have a script for it later

craggy crest Nov 16, 2024, 11:25 PM

#

bitter hearth but there is no nice UI unless Matteo's project does well

i have faith in matteo

bitter hearth Nov 16, 2024, 11:25 PM

#

same TBH

#

his IP adapter stuff is excellent

halcyon yarrow Nov 16, 2024, 11:25 PM

#

126 seconds, res_2m, 40 steps

bitter hearth Nov 16, 2024, 11:26 PM

#

halcyon yarrow i don't like diffusers format it just makes things more confusin and mostly bc m...

oh I agree having two setups with different structure is confusing

craggy crest Nov 16, 2024, 11:26 PM

#

bitter hearth Nov 16, 2024, 11:26 PM

#

this is why there is a long delay for stuff to get ported from diffusers to comfy

halcyon yarrow Nov 16, 2024, 11:26 PM

#

it lost coherency at 20 steps, 66 seconds but the bottle went missing

bitter hearth Nov 16, 2024, 11:26 PM

#

its a bit tricky

#

if you are below 40 steps probably want eta = 0

#

or very low eta

craggy crest Nov 16, 2024, 11:26 PM

#

halcyon yarrow it lost coherency at 20 steps, 66 seconds but the bottle went missing

that's no galactic arm, that's a rip in reality!

halcyon yarrow Nov 16, 2024, 11:26 PM

#

have you seen the ComfyUI wanna be UI that's specifically for diffusers? I saw it on a video recently it looks cute

halcyon yarrow Nov 16, 2024, 11:27 PM

#

bitter hearth its a bit tricky

got it i was at 0.5 eta

bitter hearth Nov 16, 2024, 11:27 PM

#

halcyon yarrow have you seen the ComfyUI wanna be UI that's specifically for diffusers? I saw i...

no I haven't seen it, would probably use one if it was good

halcyon yarrow Nov 16, 2024, 11:27 PM

#

what about eta 0, res_3s and 15 steps? lets see...

halcyon yarrow Nov 16, 2024, 11:27 PM

#

bitter hearth no I haven't seen it, would probably use one if it was good

it looked good it looked like a cleaner comfyui

bitter hearth Nov 16, 2024, 11:28 PM

#

going above order 2 requires a great many steps

halcyon yarrow Nov 16, 2024, 11:28 PM

#

that stuff is above my level, i dont really get what eta is doing i just understand its a factor of noise

bitter hearth Nov 16, 2024, 11:28 PM

#

eta = 0 means no extra noise is added each step

#

if eta is anything above zero then extra noise is being added

craggy crest Nov 16, 2024, 11:29 PM

#

you add the noise for a number of reasons

halcyon yarrow Nov 16, 2024, 11:29 PM

#

eta 0, 20 steps, 67 s, res_2m

bitter hearth Nov 16, 2024, 11:29 PM

#

mostly keep s_noise at 1.0 its quite spicy
on some models s_noise 1.03-1.07 can be a nice detail boost

craggy crest Nov 16, 2024, 11:29 PM

#

bitter hearth mostly keep s_noise at 1.0 its quite spicy on some models s_noise 1.03-1.07 can ...

i do not want to be invited to your house for a spicy dinner

halcyon yarrow Nov 16, 2024, 11:29 PM

#

clown renamed s_noise i dont see that field in the sampler anymore lol

bitter hearth Nov 16, 2024, 11:30 PM

#

LOL

#

yeah clown renames everything a couple of times per day

#

its part of the mystery

craggy crest Nov 16, 2024, 11:30 PM

#

squirrel!

bitter hearth Nov 16, 2024, 11:30 PM

#

the d_noise thing is similar to that "lying sampler" node that went viral

#

or the "detail daemon" node that is similar

#

for the most part either it will boost detail a bit if you increase it, or it will break the model, depending on the model

halcyon yarrow Nov 16, 2024, 11:31 PM

#

15 steps, res_3s, 135 seconds

bitter hearth Nov 16, 2024, 11:31 PM

#

actually d_noise might want to go down rather than up, depends how it was implemented

#

the res_2m ones seem better

#

generally res_2m is the one for below 40-60 steps

#

and then above 40-60 steps res_2s with eta on is good

halcyon yarrow Nov 16, 2024, 11:33 PM

#

ive been using res_2m for everything by default, i could add logic where if steps > 40 then ill auto switch it to res_2s or res_3s. that's some good feedback thx Neon
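
That auto-switch could look something like the sketch below. `pick_sampler` is a hypothetical helper, the threshold of 40 comes from the message above, and the eta of 0.5 for the high-step case is just the value mentioned earlier in the chat, not a recommendation.

```python
def pick_sampler(steps: int, threshold: int = 40) -> tuple[str, float]:
    """Return (sampler_name, eta) for a given step count."""
    if steps > threshold:
        # Plenty of steps: single-step sampler with some extra noise (eta on).
        return "res_2s", 0.5
    # Few steps: multistep sampler with no extra noise added per step.
    return "res_2m", 0.0


for steps in (15, 40, 60):
    print(steps, pick_sampler(steps))
```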

#

those sort of nibbles of knowledge are fun to consume bc they make my system better overall

craggy crest Nov 16, 2024, 11:34 PM

#

"...nibbles of knowledge..." i'm stealing that

halcyon yarrow Nov 16, 2024, 11:35 PM

#

20 steps using res_3s at 172 seconds. I think this is what most would consider the "gold standard" image for this prompt, something like this image

#

its an interesting model with times ranging from 12 to 130 seconds

bitter hearth Nov 16, 2024, 11:37 PM

#

one issue I have with these models is they could end up losing the hyper/turbo lora compatibility

halcyon yarrow Nov 16, 2024, 11:37 PM

#

last one and then i gotta go, res_2s, 15 steps, 50 seconds, and I think what I changed that's making them better is i changed the base shift from 0.8 to 1.5 as per wizard's recommendation way back when

bitter hearth Nov 16, 2024, 11:39 PM

#

1.5 shift is fine yeah

#

I use a bit of a different method but it requires multiple k-samplers

halcyon yarrow Nov 16, 2024, 11:40 PM

#

Lol sounds expensive

bitter hearth Nov 16, 2024, 11:40 PM

#

the model goes from sigma 1 (pure noise) to sigma 0 (sharp, finished image)
and the important thing is that it has a decent number of steps before sigma 0.8 or so, or even sigma 0.9 or so

#

shift is one way of doing that
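
A small sketch of why shift helps, assuming the time-shift formula used by SD3-style flow models as far as I know. The shift value of 3.0 is only for illustration, and this is not the same thing as the Flux base shift setting mentioned above, which works differently in detail.

```python
def shift_sigma(sigma: float, shift: float) -> float:
    # Pushes sigmas towards 1.0 while keeping the endpoints 0 and 1 fixed.
    return shift * sigma / (1 + (shift - 1) * sigma)


def count_early_steps(sigmas, cutoff=0.8):
    # How many schedule points fall in the high-noise region.
    return sum(1 for s in sigmas if s >= cutoff)


plain = [1 - i / 20 for i in range(21)]          # evenly spaced, 20 steps
shifted = [shift_sigma(s, 3.0) for s in plain]   # same steps, shifted

print(count_early_steps(plain), count_early_steps(shifted))
```

With the shift applied, more of the 20 steps are spent above sigma 0.8, which is the part of the schedule the message above calls important.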

craggy crest Nov 16, 2024, 11:41 PM

#

(or you could just roll dice and see what happens)

bitter hearth Nov 16, 2024, 11:41 PM

#

I prefer to use a node called split at sigma and then have a separate ksampler for sigmas 1-0.8 and sigmas 0.8-0
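
Roughly what a split like that does to the list of sigmas. This is a stand-in written for illustration, not the actual node's code, and the sigma values are made up.

```python
def split_at_sigma(sigmas, split=0.8):
    """Split a decreasing sigma list at the first value <= split.

    Both halves share the boundary value, so the second sampler
    picks up exactly where the first one stopped.
    """
    for i, s in enumerate(sigmas):
        if s <= split:
            return sigmas[: i + 1], sigmas[i:]
    # Never dropped to the split point: everything goes to the first sampler.
    return sigmas, sigmas[-1:]


high, low = split_at_sigma([1.0, 0.9, 0.8, 0.5, 0.2, 0.0])
print(high)  # sigmas for the first ksampler
print(low)   # sigmas for the second ksampler
```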

craggy crest Nov 16, 2024, 11:41 PM

#

comfyUI needs a random dice roll node that'll set every value to something random

bitter hearth Nov 16, 2024, 11:42 PM

#

yeah that might be good TBH

#

if you are lucky enough

craggy crest Nov 16, 2024, 11:42 PM

#

i'm sure there would be horrors, but i'm equally sure really cool stuff would happen

bitter hearth Nov 16, 2024, 11:42 PM

#

a lot of my favourite things I found by accident

craggy crest Nov 16, 2024, 11:43 PM

#

bitter hearth a lot of my favourite things I found by accident

bitter hearth Nov 16, 2024, 11:43 PM

#

I guess it didn't know what to do with a frame LOL

#

all the video models just explode if they try to make R2D2 move

#

they can rotate around him while he sits still though

craggy crest Nov 16, 2024, 11:45 PM

#

i kind of like the expanding frame idea

craggy crest Nov 16, 2024, 11:46 PM

#

bitter hearth all the video models just explode if they try to make R2D2 move

#

no they don't

bitter hearth Nov 16, 2024, 11:47 PM

#

wow didn't know

#

which one is this

craggy crest Nov 16, 2024, 11:47 PM

#

bitter hearth which one is this

meta

bitter hearth Nov 16, 2024, 11:47 PM

#

I mostly used cog, maybe they are better now

craggy crest Nov 16, 2024, 11:48 PM

#

zuckerberg's AI

halcyon yarrow Nov 16, 2024, 11:48 PM

#

bitter hearth I prefer to use a node called split at sigma and then have a separate ksampler f...

Wow talk about advanced techniques, I don’t really understand what sigmas are or the concept but I do retain some of what you’ve said how it has to setup the layout in the first 1 or 2 steps and that relates to the sigma somehow

craggy crest Nov 16, 2024, 11:48 PM

#

halcyon yarrow Wow talk about advanced techniques, I don’t really understand what sigmas are or...

sigma - isn't advanced, it's math

bitter hearth Nov 16, 2024, 11:48 PM

#

they decided to show people a scheduler name

#

instead of a list of sigmas

#

but what comes out of your scheduler node looks like this 1, 0.8, 0.6, 0.4, 0.2, 0

#

if you choose something like SGM Uniform 5 step

#

might not be exactly that but its a decreasing list of numbers from 1 to 0

#

one number per step

halcyon yarrow Nov 16, 2024, 11:50 PM

#

The takeaway for me is that sigma is a factor that starts at 1 and progresses to 0 as the generation finishes

bitter hearth Nov 16, 2024, 11:50 PM

#

yeah

#

you could see it as a progress bar in some ways

halcyon yarrow Nov 16, 2024, 11:51 PM

#

And then how it progresses is based on the scheduler and rather than having a general curve you like to split the curve with two ksamplers

bitter hearth Nov 16, 2024, 11:51 PM

#

sigma 0.5 is always 50% done
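
The "progress bar" idea in a few lines, assuming the simplest possible evenly spaced schedule. Real schedulers space the values differently, and both function names are invented for the example.

```python
def linear_sigmas(steps: int) -> list[float]:
    # steps + 1 values: the starting noise level plus one value per step.
    return [round(1 - i / steps, 4) for i in range(steps + 1)]


def percent_done(sigma: float) -> float:
    # sigma 1.0 is pure noise (0% done), sigma 0.0 is the finished image.
    return round((1 - sigma) * 100, 1)


for sigma in linear_sigmas(5):
    print(sigma, f"{percent_done(sigma)}% done")
```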

pseudo owl Nov 16, 2024, 11:51 PM

#

bitter hearth I mostly used cog, maybe they are better now

cog should be able to do it, thats right now the best open source model for img2vid. really waiting for mochi img2vid support, should be amazing then for open source.

bitter hearth Nov 16, 2024, 11:51 PM

#

yeah the pink curve is my overall sigmas

#

afraid I lost the workflow for this one

#

yellow curve is first sampler, then blue is second

#

pink is the overall combined curve

halcyon yarrow Nov 16, 2024, 11:52 PM

#

pseudo owl cog should be able to do it, thats right now the best open source model for img2...

Did you try my img2vid node? lol I made one for mochi it’s crude but it works

halcyon yarrow Nov 16, 2024, 11:52 PM

#

bitter hearth yellow curve is first sampler, then blue is second

So that break in the pink is where you split it right?

bitter hearth Nov 16, 2024, 11:53 PM

#

yeah

#

and it goes up a bit because I set it to re-do some of the image

halcyon yarrow Nov 16, 2024, 11:53 PM

#

Does clown have anything to say about this? Is there anything he can do to facilitate achieving something like that with a custom scheduler in the nodes?

bitter hearth Nov 16, 2024, 11:54 PM

#

don't think clown particularly likes this method lol

halcyon yarrow Nov 16, 2024, 11:54 PM

#

Lol oh I see

bitter hearth Nov 16, 2024, 11:54 PM

#

from what I have seen he doesn't change shift or scheduler much

halcyon yarrow Nov 16, 2024, 11:54 PM

#

Hey @pseudo owl if you want I can link you to the GitHub where I put it if you wanna try it

bitter hearth Nov 16, 2024, 11:55 PM

#

there is already a "split at steps" node in comfy

#

or a "split denoise" node

#

so its not too different from that

halcyon yarrow Nov 16, 2024, 11:55 PM

#

bitter hearth from what I have seen he doesn't change shift or scheduler much

He could add a field for like 2nd scheduler and then another field for the sigma breakpoint so you can pick res_3s for the first 20 percent and res 2m for the rest since the start is so crucial

#

So it would be scheduler one res 3s, sigma breakpoint 0.8, scheduler two res 2m

pseudo owl Nov 16, 2024, 11:56 PM

#

halcyon yarrow Did you try my img2vid node? lol I made one for mochi it’s crude but it works

nope, I did see some examples tho, seemed surprisingly decent but very little motion. nice work

bitter hearth Nov 16, 2024, 11:57 PM

#

halcyon yarrow He could add a field for like 2nd scheduler and then another field for the sigma...

maybe yeah, a lot of things can work

#

you do need a lot of steps for res 3s, sometimes like 60-100

#

res 2s and res2m need less

halcyon yarrow Nov 16, 2024, 11:57 PM

#

pseudo owl nope, I did see some examples tho, seemed surprisingly decent but very little mo...

Lol yeah indeed, I was gonna try posting with a technique ChatGPT suggested called latent interpolation where I feed it the start and end input image and have it try to make an image to video that way but I feel like it would only work for the simplest of examples

#

I always change res 2m to res 3s without changing the steps (it’s like an option in my UI) to retry rendering an image and it always does fine

#

I think res 3s works at low steps just fine in my experience I don’t think I’ve ever reran it with the better sampler and didn’t get better results

bitter hearth Nov 16, 2024, 11:59 PM

#

there's a way to measure sampler error to know for sure

#

its in the original DPM paper
haven't seen someone make a comfy node of it but that might be cool

halcyon yarrow Nov 17, 2024, 12:01 AM

#

I do wish I could automate retrying, I know there’s solutions out there like a classifier that detects if the image is garbage or not, stuff like artifacts or just a solid color image or messed up patterns I’m just wary of going down that road bc it can also be subjective plus added overhead of classifying each image generated

#

If I could measure the error rate that could be a more light weight metric to trigger a retry

bitter hearth Nov 17, 2024, 12:02 AM

#

there's image quality assessment tools yeah

pseudo owl Nov 17, 2024, 12:02 AM

#

funny mochi gen

bitter hearth Nov 17, 2024, 12:05 AM

#

lol yeah

halcyon yarrow Nov 17, 2024, 12:07 AM

#

pseudo owl funny mochi gen

Thats really good too!

halcyon yarrow Nov 17, 2024, 12:08 AM

#

pseudo owl funny mochi gen

Have you tried pushing your vram to see what your max frames count is?

#

This looks like a good 7-8s clip

pseudo owl Nov 17, 2024, 12:12 AM

#

halcyon yarrow Have you tried pushing your vram to see what your max frames count is?

this one is from the official website which uses unquantized mochi with 200 steps + an upscaler, locally I only tried videos with very short frame counts since longer takes forever, and I cant wait 20+ mins.

craggy crest Nov 17, 2024, 12:13 AM

#

pseudo owl this one is from the offical website which uses unquantized mochi with 200steps ...

you could - just kick it off before you crash for the night

#

i do that with luma sometimes - kick it off and come back tomorrow

halcyon yarrow Nov 17, 2024, 12:14 AM

#

I think my max is 85 frames which comes out to 3 or 5 secs depending on the fps. At 85 frames it takes my 8gb GPU about 25 mins to process

#

If I wanna render just one second or like 16 frames it takes 5 minutes sometimes 4 so it’s not bad lol
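
Quick arithmetic on those numbers. The 16 fps figure is inferred from "one second or like 16 frames"; the 24 fps alternative is an assumption used to show where "3 or 5 secs" could come from.

```python
def clip_seconds(frames: int, fps: int) -> float:
    return frames / fps


def seconds_per_frame(render_minutes: float, frames: int) -> float:
    return render_minutes * 60 / frames


print(clip_seconds(85, 16))                 # length at 16 fps
print(round(clip_seconds(85, 24), 2))       # length at 24 fps
print(round(seconds_per_frame(25, 85), 1))  # render cost per frame, long clip
print(round(seconds_per_frame(5, 16), 1))   # render cost per frame, short clip
```

Both clip lengths work out to roughly the same render cost per frame, so the time scales about linearly with frame count.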

dusky thistle Nov 17, 2024, 1:36 AM

#

SD35M

dusky thistle Nov 17, 2024, 2:41 AM

#

craggy crest Nov 17, 2024, 2:42 AM

#

dusky thistle

no more jelly beans before generating

dusky thistle Nov 17, 2024, 3:28 AM

#

#

#

#

SD35M still

craggy crest Nov 17, 2024, 4:13 AM

#

dusky thistle SD35M still

don't look now, but there are bugs in your code

#

genai_m4_pnb_vll_v3upload_media_12089482_11_16_2024_20_21_40_600745_840594513631646892.png

halcyon yarrow Nov 17, 2024, 4:43 AM

#

@bitter hearth I was upset that this flux mini requires it to exist in diffusion_models instead of the checkpoints folder, so after like an hour talking to 4o, then o1 preview about it I finally figured out the solution and wrote a script to convert mini to be compatible with ComfyUI's load checkpoint. yay. Posting the first image generated with this WF

halcyon yarrow Nov 17, 2024, 5:06 AM

#

made using SD35L

craggy crest Nov 17, 2024, 6:28 AM

#

SD3.5 large

dusky thistle Nov 17, 2024, 7:52 AM

#

civic trail Nov 17, 2024, 10:56 AM

#

halcyon yarrow Nov 17, 2024, 2:47 PM

#

@bitter hearth I posted flux mini on civit and converted it to 3 other formats so 4 models posted

https://civitai.com/models/955242/flux-mini-3b

sullen moss Nov 17, 2024, 2:49 PM

#

Yeah, since the release of 3.5, there hasn’t been much visible interest from the community. On Flux, custom models were already available just a week after launch

mortal kite Nov 17, 2024, 2:51 PM

#

discord busted

#

this milk texture is NOT quite right

SPOILER_0650-a_poster_with_an_image_of_a_young_brunet-Fluxflux1-dev-fp8-231033754.jpg

#

maybe is ok?

halcyon yarrow Nov 17, 2024, 2:54 PM

#

sullen moss Yeah, since the release of 3.5, there hasn’t been much visible interest from the...

lol yeah I agree wholeheartedly, often times it feels like I’m the only one posting any images for sd35

#

Also yeah the number of new Lora’s for sd35 is really low/slow. I remember sd3 had a lot more Lora’s in its release

mortal kite Nov 17, 2024, 3:19 PM

#

0718-A_high_quality_magazine_advertisement_of-pixelwave_flux1Dev03-731723992.jpg

#

0721-A_high_quality_magazine_advertisement_of-pixelwave_flux1Dev03-1381305337.jpg

halcyon yarrow Nov 17, 2024, 3:33 PM

#

mortal kite

I’ll wait for the model that has the full realistic skin suit, I hear it’s coming out soon too lol 😆

bitter hearth Nov 17, 2024, 4:04 PM

#

halcyon yarrow @bitter hearth I posted flux mini on civit and converted it to 3 other fo...

thanks a lot, the Q8 will be helpful

#

I agree some people prefer checkpoint so they should offer a range

halcyon yarrow Nov 17, 2024, 4:04 PM

#

dude can you believe flux mini took down their project?

#

its coming back with a 404 now

bitter hearth Nov 17, 2024, 4:04 PM

#

wow is your version the only version now?

#

lol

halcyon yarrow Nov 17, 2024, 4:05 PM

#

it actually took a lot of research to figure out how to convert it from what the base was to something that'll work via load checkpoint, then a lot of work to figure out there's a Save Model node I can use to repackage it with the CLIP and VAE built in

#

and then it was easy breeze converting it to gguf after I got past the Load Checkpoint hurdle

#

i was trying to manually compile the safetensors file using python and then after a little research i felt dumb realizing i can do it in comfy

bitter hearth Nov 17, 2024, 4:06 PM

#

save model node yeah

#

I've used the save clip node once as well its similar, or save diffusion model

halcyon yarrow Nov 17, 2024, 4:06 PM

#

I also tried to bake in stuff like flan instead of t5xxl and that doesn't work, I tried to bake in LongClip in various different ways with and without the dedicated node and that also didn't work, so the baked in has t5xxl fp8 and the vit14 finetune by zero point

bitter hearth Nov 17, 2024, 4:07 PM

#

that's fine most people like that Clip L fine tune

#

personally I liked flan too but its controversial

halcyon yarrow Nov 17, 2024, 4:07 PM

#

yeah its a 3b model so it needs all the help it can get

bitter hearth Nov 17, 2024, 4:07 PM

#

yeah for sure

#

I am not sure about flan as one day it helps and one day it doesn't

halcyon yarrow Nov 17, 2024, 4:08 PM

#

why is it controversial? id like to know

#

oh i see i think you just said why

bitter hearth Nov 17, 2024, 4:08 PM

#

the model wasn't trained with flan

#

so it is not clear it is a good idea

#

this applies to the Clip L fine tunes too by the way

#

I am not sure personally, I often try both

halcyon yarrow Nov 17, 2024, 4:08 PM

#

its interesting messing with that stuff you sort of get to peek behind the scenes, i figured t5xxl and flan internally were the same structure but its likely different layers and then comfyui is doing some special sauce to adjust to the different layers internally

#

same goes for longclip, i cant build it in bc it always expects the 'standard' clip L and it only works via the dualclip/tripleclip loader bc of internal adjustments they're making after the fact

#

this whole thing started bc i want to use flux mini but i don't want to code support for the diffusion_models folder so it turns out all I had to do to convert it from the base model to something that'll work with Load Checkpoint was just prefix the layer keys with a certain string and that's it. super simple change
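
The key-prefixing step might look roughly like this. It is only a sketch on a plain dict: the prefix string and the layer names are placeholders, since the chat doesn't say which string was actually used, and a real conversion would load and save the tensors with a library such as safetensors.

```python
def add_key_prefix(state_dict: dict, prefix: str) -> dict:
    """Return a copy of the state dict with every key prefixed.

    Keys that already carry the prefix are left alone, so running
    the conversion twice doesn't double it up.
    """
    return {
        (key if key.startswith(prefix) else prefix + key): value
        for key, value in state_dict.items()
    }


# Placeholder layer names and prefix, purely for illustration.
base = {"double_blocks.0.weight": "tensor A", "single_blocks.0.weight": "tensor B"}
converted = add_key_prefix(base, "example.prefix.")

for key in converted:
    print(key)
```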

#

i kept calling the base model diffusers format but that was incorrect its actually in "flux transformers format" so i kept arguing with o1 preview like "okay if its already in the target format why isn't it working?" i ended up dumping the structure of an existing flux model that works via load checkpoint just so gpt can review and compare and figure out the solution

bitter hearth Nov 17, 2024, 4:13 PM

#

I've been using single clip loaders and then concatting the embeddings, for what its worth
if you use SD 1.5 with ELLA T5 and Clip L then you have to do it this way

#

I need to check exactly what the dual/triple clip loaders and prompt text encode nodes actually do TBH

pseudo owl Nov 17, 2024, 4:14 PM

#

halcyon yarrow its interesting messing wit that stuff you sort of get to peek behind the scenes...

it is the same layers no? its just a finetune right, I might be mistaken though.

bitter hearth Nov 17, 2024, 4:14 PM

#

I was saying on comfy discord a while ago that I want to make a new set of loader and encoder nodes

#

which will be model-agnostic

#

so for example you will use the same loader and encoder node set to encode prompt as you do to encode images for IP adapter embeds

halcyon yarrow Nov 17, 2024, 4:14 PM

#

bitter hearth I've been using single clip loaders and then concatting the embeddings, for what...

i don't understand that statement, i thought internally SD1.5 was just designed for clip L so you're making t5 work for sd1.5?!

bitter hearth Nov 17, 2024, 4:15 PM

#

ah there is a special thing for that

#

its called ELLA

#

its really cool

#

https://github.com/TencentQQGYLab/ComfyUI-ELLA they had to train it

halcyon yarrow Nov 17, 2024, 4:15 PM

#

pseudo owl it is the same layers no? its just a finetune right, I might be mistaken though.

there's no fine tune I made. regarding the layers, flux mini is a completely different architecture, as the post said they went from so and so many single and double blocks to nearly a fraction of that

bitter hearth Nov 17, 2024, 4:15 PM

#

SD 1.5 with ELLA has better prompt adherence than SDXL

#

its crazy

#

the best things tend to have zero hype for some reason

halcyon yarrow Nov 17, 2024, 4:16 PM

#

wow that's crazy Neon pretty cool stuff the level of care they apply to this stuff

bitter hearth Nov 17, 2024, 4:16 PM

#

there's no downside to ELLA, that I know of

halcyon yarrow Nov 17, 2024, 4:16 PM

#

I've just been using LongClip with SD1.5 but I'd be willing to try ELLA and compare it

pseudo owl Nov 17, 2024, 4:16 PM

#

halcyon yarrow there's no fine tune I made. regarding the layers the flux mini is a completely ...

oh I thought you were talking about flant5xxl vs t5xxl, yeah flux-mini should have far fewer layers than it

halcyon yarrow Nov 17, 2024, 4:17 PM

#

and ELLA is only compatible with 1.5, its not compatible with SDXL?

bitter hearth Nov 17, 2024, 4:17 PM

#

very sadly they made ELLA for SDXL but did not release it

#

every now and then people go ask them on github

halcyon yarrow Nov 17, 2024, 4:17 PM

#

pseudo owl oh I thought you were talking about flant5xxl vs t5xxl, yeah flux-mini should ha...

oh yeah in terms of those 2 there must be some different structure or layers internally, there has to be bc i tried baking it and it would just generate a black image

halcyon yarrow Nov 17, 2024, 4:17 PM

#

bitter hearth very sadly they made ELLA for SDXL but did not release it

lol aw what a shame

bitter hearth Nov 17, 2024, 4:18 PM

#

some of the best stuff is not released
there is a fine tune of Lumina that looks as good as Flux to me
but its not released (its in the I-max paper)

pseudo owl Nov 17, 2024, 4:18 PM

#

bitter hearth there's no downside to ELLA, that I know of

The only downside was that some knowledge was lost but that was just because of the dataset but it has far better prompt following than sd1.5: https://github.com/TencentQQGYLab/ELLA/issues/35

bitter hearth Nov 17, 2024, 4:18 PM

#

you can add Clip-L embeddings as well to get some back
but yeah actually that's a good point

#

it will not be as good as pure Clip-L for subject knowledge

halcyon yarrow Nov 17, 2024, 4:19 PM

#

i don't use sd1.5 that much, i do support it but sdxl would be where it's at

bitter hearth Nov 17, 2024, 4:19 PM

#

cos even if you add Clip-L embeddings with "concat conditioning" node, the T5 embeddings are competing with them

halcyon yarrow Nov 17, 2024, 4:20 PM

#

this is my understanding of what the model architecture supports
SD15 - only L
SDXL - L & G
Flux - L & t5
SD3 - L & G & t5
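
The list above can be restated as a small lookup table. This is only a sketch: the dictionary name and the encoder labels (`clip_l`, `clip_g`, `t5xxl`) are illustrative shorthand, not identifiers from any library.

```python
# Text encoders each base architecture is conditioned on, as listed above.
# The labels are illustrative shorthand, not real file or class names.
TEXT_ENCODERS = {
    "SD15": ("clip_l",),
    "SDXL": ("clip_l", "clip_g"),
    "Flux": ("clip_l", "t5xxl"),
    "SD3": ("clip_l", "clip_g", "t5xxl"),
}


def uses_t5(model: str) -> bool:
    """Return True if the architecture takes T5 embeddings natively."""
    return "t5xxl" in TEXT_ENCODERS[model]


for name, encoders in TEXT_ENCODERS.items():
    print(f"{name}: {' + '.join(encoders)}")
```

A table like this makes it easy to see why adapters such as ELLA are interesting for SD15 and SDXL: neither takes T5 embeddings on its own.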

bitter hearth Nov 17, 2024, 4:20 PM

#

this is the sort of style that I like SD 1.5 for:

#

for some reason I can't get this look in other models

#

its very grainy and stylized but still a photo

#

SDXL - L & G
Flux - L & t5
SD3 - L & G & t5

yea that's right

halcyon yarrow Nov 17, 2024, 4:21 PM

#

the left one clearly shows signs it was made with an inferior model but the right one could almost pass for flux at low rez

bitter hearth Nov 17, 2024, 4:22 PM

#

yeah the detail and clarity is low

#

and hands bad

halcyon yarrow Nov 17, 2024, 4:23 PM

#

i might release an update to flux mini aio model replacing the t5xxl fp8 with t5xxl v1.1 fp8, took me a bit of searching huggingface to find it bc you only see the full 22gb model and im not gonna embed that into flux mini lol, i had v1.1 but only in gguf and you cant bake those in either

bitter hearth Nov 17, 2024, 4:23 PM

#

the one on the left is 1536x1536 which is why it looks worse
really hard to get SD 1.5 to do that res

pseudo owl Nov 17, 2024, 4:23 PM

#

There's also lavi-bridge which actually makes sd support llms like llama 2 7b as text encoder and pixart support t5 large instead of t5xxl
https://github.com/ShihaoZhaoZSH/LaVi-Bridge

[ECCV 2024] Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation - ShihaoZhaoZSH/LaVi-Bridge

halcyon yarrow Nov 17, 2024, 4:23 PM

#

no gguf baking, no longclip baking, no flan baking

bitter hearth Nov 17, 2024, 4:23 PM

#

pseudo owl Theres also lavi-bridge which actually makes sd support llms like llama 2 7b as ...

oh yeah thanks I forgot about lavi-bridge
its the competitor to ELLA I need to try it

halcyon yarrow Nov 17, 2024, 4:24 PM

#

pseudo owl Theres also lavi-bridge which actually makes sd support llms like llama 2 7b as ...

that's pretty cool just read it, seems very similar to ELLA in my mind

bitter hearth Nov 17, 2024, 4:25 PM

#

there is a 46GB version of Flan T5 XXL
I've been thinking about using it with SD 1.5 as a joke
cos I sometimes rent the 80GB servers (only $0.70 per hour luckily)

halcyon yarrow Nov 17, 2024, 4:25 PM

#

so with lavi bridge in theory we could get SDXL to work with T5 right?

bitter hearth Nov 17, 2024, 4:25 PM

#

not sure

halcyon yarrow Nov 17, 2024, 4:25 PM

#

bitter hearth there is a 46GB version of Flan T5 XXL I've been thinking about using it with SD...

yeah that's the one, it was 46gb not 22gb, i misspoke

halcyon yarrow Nov 17, 2024, 4:26 PM

#

bitter hearth not sure

i think that would be an interesting bit of testing i could try, ill def pursue that over ella

pseudo owl Nov 17, 2024, 4:27 PM

#

I think ELLA is actually slightly better but Lavi-bridge is easier to train.

bitter hearth Nov 17, 2024, 4:27 PM

#

pseudo owl I think ELLA is actually slightly better but Lavi-bridge is easier to train.

my understanding from months ago was this yeah
but I should try lavi-bridge myself before dismissing it

pseudo owl Nov 17, 2024, 4:28 PM

#

Also I believe the t5xxl models are so large since they also include the decoder part which is not even used in text encoding, the actual encoder part of t5xxl which is used itself should be like only 9gb.
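
The size claims in this part of the chat are just parameter count times bytes per weight. A rough sketch of that arithmetic follows; the parameter counts (about 11 billion for the full encoder plus decoder model, about 4.8 billion for the encoder alone) are approximate assumptions, not figures taken from the chat.

```python
def checkpoint_size_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate on-disk size in GB (10^9 bytes), ignoring file metadata."""
    return num_params * bytes_per_param / 1e9


# Assumed, approximate parameter counts for T5-XXL.
FULL_MODEL_PARAMS = 11e9   # encoder + decoder
ENCODER_PARAMS = 4.8e9     # encoder only

for label, params in (("full model", FULL_MODEL_PARAMS), ("encoder only", ENCODER_PARAMS)):
    for dtype, width in (("fp32", 4), ("fp16", 2), ("fp8", 1)):
        print(f"{label:12s} {dtype}: ~{checkpoint_size_gb(params, width):.1f} GB")
```

Under these assumptions the full model at fp32 lands in the mid-40 GB range and the encoder alone at fp16 lands a little under 10 GB, which is in line with the 46gb and 9gb figures mentioned here.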

bitter hearth Nov 17, 2024, 4:29 PM

#

yeah you can set the decoder layers to 0

halcyon yarrow Nov 17, 2024, 4:29 PM

#

some guy commented on the flux mini model's page: "NO use for it all images dont follow prompt and bad anatomy, loras dont work ... ! and i checked allready 50 flux models" lol. like granted loras don't work, i agree, but if it's not following prompt or anatomy that seems more like a setup issue than a model issue

halcyon yarrow Nov 17, 2024, 4:30 PM

#

bitter hearth yeah you can set the decoder layers to 0

interesting and so would doing that somehow improve performance or whats the point of that?

bitter hearth Nov 17, 2024, 4:31 PM

#

the main benefit is it would be smaller
so less download time, and faster loading in comfy
it may or may not be faster, but it would definitely not be slower

#

mostly models get faster when you set layers to zero but sometimes its not a big gain, it depends

halcyon yarrow Nov 17, 2024, 4:33 PM

#

set it to 0 before I bake in t5 model so exclude the decoder part so it's more lightweight all the time, that could be handy, ill def look into that too

#

do you know the node name that lets me zero out layers?

bitter hearth Nov 17, 2024, 4:35 PM

#

this isn't doable in comfy it seems

#

comfy tends to not be so good for LLM stuff yet

halcyon yarrow Nov 17, 2024, 4:36 PM

#

so i could just manually edit the model with safetensors library in python and then just use that modified model in comfy to bake it in, i can just switch gears and do that instead, ill try it when i build the next flux mini
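
A minimal sketch of that idea, assuming the checkpoint uses the usual Hugging Face T5 naming where decoder weights start with `decoder.` and the output head is `lm_head.` (that key layout is an assumption; check the real file's keys first). The function names are made up for illustration; only `safe_open` and `save_file` come from the safetensors library.

```python
def keep_encoder_only(names, drop_prefixes=("decoder.", "lm_head.")):
    """Return the tensor names that do not belong to the decoder side."""
    return [n for n in names if not n.startswith(drop_prefixes)]


def strip_decoder(src_path: str, dst_path: str) -> int:
    """Copy every non-decoder tensor from src_path into a new file at dst_path.

    Returns the number of tensors written. Requires torch and safetensors.
    """
    # Imported here so the name filter above can be used without these packages.
    from safetensors import safe_open
    from safetensors.torch import save_file

    tensors = {}
    with safe_open(src_path, framework="pt", device="cpu") as f:
        for name in keep_encoder_only(list(f.keys())):
            tensors[name] = f.get_tensor(name)
    save_file(tensors, dst_path)
    return len(tensors)


if __name__ == "__main__":
    demo = ["shared.weight", "encoder.block.0.layer.0.SelfAttention.q.weight",
            "decoder.block.0.layer.0.SelfAttention.q.weight", "lm_head.weight"]
    print(keep_encoder_only(demo))
```

Filtering by name keeps the encoder weights byte-for-byte identical, so the only thing that changes is file size and load time.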

#

i even posted on flux mini's discussion board a bunch of text about how cool it is and how I posted it on civit and how i was offering these different versions, im sure Tencent team didn't take it down permanently, perhaps they're just preparing for another launch, their github was broken at the time so a better release would be apt

#

also @dusky thistle I linked to your github and suggested your sampler for using flux mini bc it does produce better results

#

hopefully i can convert some people to jump on the shark bandwagon

dusky thistle Nov 17, 2024, 4:40 PM

#

good to know that it does! which rk_type/sampler type did you choose

halcyon yarrow Nov 17, 2024, 4:42 PM

#

res_2m and res_3s are the only ones i ever used, i wanted to ask you about that new one you suggested, its not in the list of samplers so im guessing i have to do a git pull but i wanted to confirm with you there aren't any bombs i should be aware of that I would have to adjust for?

#

did you change the order of any of the inputs or outputs of the nodes?
did you add or remove any fields to any of the nodes?
did you remove any of the options in any of the existing fields?

bitter hearth Nov 17, 2024, 4:43 PM

#

> res_2m and res_3s are the only ones i ever used

same but also res_2s

#

similar though

#

I also liked the soft scaling more than hard 🤔

halcyon yarrow Nov 17, 2024, 4:44 PM

#

I'm going to be generating some images using flux mini to post them on the model's page and showcase its ability

A breathtaking landscape of a rugged mountain range covered in dense evergreen forests, with rocky outcroppings in the foreground. The bright blue sky and scattered clouds add depth and serenity to the scene.

A detailed portrait of a young woman wearing a luxurious red dress with intricate lace details, accessorized with pearl jewelry. Her confident gaze and the soft lighting create a regal and timeless atmosphere, reminiscent of classical art.

bitter hearth Nov 17, 2024, 4:44 PM

#

one on the left is incredible for 3B

#

one on the right needs refine

#

but that's ok

#

for example eye

halcyon yarrow Nov 17, 2024, 4:45 PM

#

yeah the right eye could use a little help but its still pretty good

#

oh you think the left eye could use help too?

bitter hearth Nov 17, 2024, 4:45 PM

#

LOL we both said eye at the same time

#

yeah

halcyon yarrow Nov 17, 2024, 4:46 PM

#

i think the left eye is fine if she's looking that way but the right eye looks deformed

bitter hearth Nov 17, 2024, 4:46 PM

#

its not "bad" but even 2 steps of Realvis Schnell would help a lot

#

we have "eye detailer" now

#

like face detailer but for eye, nose etc

#

impact pack can do that

halcyon yarrow Nov 17, 2024, 4:47 PM

#

A beautiful still life painting of vibrant pink flowers in a ceramic vase placed on a wooden table by the window. The sunlight softly illuminates the petals, creating a warm and inviting atmosphere, inspired by classic oil painting techniques.

A stunning surreal cosmic landscape featuring a majestic lightning bolt striking through vibrant orange clouds, with planets and stars in the background. A lone figure stands in awe, surrounded by ethereal beauty, evoking a sense of wonder and exploration.

A serene and atmospheric scene of a train station nestled in a lush tropical forest, illuminated by warm lights. The station features vintage architecture, and people walk leisurely along the platform under the towering palm trees.

bitter hearth Nov 17, 2024, 4:47 PM

#

but I don't like impact pack I do it in other ways

halcyon yarrow Nov 17, 2024, 4:47 PM

#

these are all using res_2m btw

#

i dont know if the WF is embedding into these images im just copying the image via clipboard

bitter hearth Nov 17, 2024, 4:47 PM

#

impact pack wants to do everything in terms of a new data structure called a "SEG"
but I don't want that

pseudo owl Nov 17, 2024, 4:48 PM

#

halcyon yarrow > A beautiful still life painting of vibrant pink flowers in a ceramic vase plac...

Wow, that's actually pretty great! When I tested mini-flux, I got some pretty bad imgs but I think it was my settings probably.

halcyon yarrow Nov 17, 2024, 4:49 PM

#

i just asked gpt4o to generate me a prompt for each of these images and that's how i got the prompts, civit took down this image bc of the kid in the third frame

bitter hearth Nov 17, 2024, 4:49 PM

#

oh is that why

halcyon yarrow Nov 17, 2024, 4:49 PM

#

this was what got me into flux mini like if it can generate stuff this good its gotta be worth trying

bitter hearth Nov 17, 2024, 4:49 PM

#

your checkpoint is the only one now lol

halcyon yarrow Nov 17, 2024, 4:50 PM

#

the original woman in red was using the q8 model by the way, after i quantized it to q8 the model is actually 3.4gb

#

imagine 3.4GB flux model

#

q8 model left, original base model right

#

q8 left, original right

bitter hearth Nov 17, 2024, 4:52 PM

#

nice its the same, essentially

halcyon yarrow Nov 17, 2024, 4:52 PM

#

again q8 left, original right. they're super similar to each other, for being 1.5GB smaller its pretty astounding how good it still is

bitter hearth Nov 17, 2024, 4:54 PM

#

yeah Q8 is great

halcyon yarrow Nov 17, 2024, 4:54 PM

#

A vivid and colorful depiction of a nebula in deep space, featuring intricate clouds of gas and dust in shades of red, blue, and gold. Bright stars shine through, creating a mesmerizing and otherworldly cosmic vista.

bitter hearth Nov 17, 2024, 4:54 PM

#

sometimes Q6, or Q5 ones can be good

halcyon yarrow Nov 17, 2024, 4:56 PM

#

its that term, diminishing returns, i like to q8 bc I just want less memory usage at the expense of a little bit of loss, im not willing to accept more than just a little bit of loss lol

#

i think this is the train station in the forest one, people don't look so great here

bitter hearth Nov 17, 2024, 4:56 PM

#

I kinda agree Q8 is a good choice these days

#

personally I do everything FP8 but there are costs to that

halcyon yarrow Nov 17, 2024, 4:57 PM

#

look at my times for generating these images on my 8gb gpu, like I see one going as low as 43 seconds

bitter hearth Nov 17, 2024, 4:57 PM

#

what sort of hardware is this

#

oh you said 8GB sorry

#

yeah that's good for 8GB

halcyon yarrow Nov 17, 2024, 4:59 PM

#

then i click on that little share button in the corner for each image, it'll run it through 2x VLMs for post title and tags, run it through the 2xBSRGAN upscaler and use exif_tool so CivitAI can read all the details on how it was made

bitter hearth Nov 17, 2024, 5:01 PM

#

ah yeah I love automated chains like that

halcyon yarrow Nov 17, 2024, 5:01 PM

#

and it'll do all that in 10 secs per image

bitter hearth Nov 17, 2024, 5:01 PM

#

its hard to recommend upscalers to people cos there are so many variables but you can definitely do better than 2xBSRGAN

halcyon yarrow Nov 17, 2024, 5:02 PM

#

it has to be 2x bc I dont want 17-30mb image files laying around, i think 4-5mb is decent, and it has to be performant, ive used better models that def look better but im not willing to dedicate 30-45 seconds to upscale it

bitter hearth Nov 17, 2024, 5:02 PM

#

https://github.com/Phhofm/models/releases/tag/all_models this script downloads lots of good ones

#

actually this link might be more helpful it compares speed:

#

https://github.com/the-database/traiNNer-redux/wiki/PyTorch-Inference-Benchmarks-by-Architecture

#

cos there are really fast ones now too

#

would recommend span

#

ah this one seems perfect https://openmodeldb.info/models/2x-NomosUni-span-multijpg-ldl

#

its a 2x SPAN one for photographs

halcyon yarrow Nov 17, 2024, 5:46 PM

#

awesome dude thanks for the links, I'll def go over those benchmarks, i love that kind of stuff

bitter hearth Nov 17, 2024, 5:48 PM

#

no problem

#

with these upscale models its always worth trying a bunch

#

cos unlike diffusion models, the upscale models cannot work well outside of their exact training data

#

so it depends if your image matches what they expect

halcyon yarrow Nov 17, 2024, 5:58 PM

#

@shell bloom lets talk here buddy

#

so you're saying you got 4gb of ram and your fastest time yet with the 3B model at 1024px is 54 seconds using the aio model

shell bloom Nov 17, 2024, 5:59 PM

#

No as a video card I have 12gb of vram, a 3060

halcyon yarrow Nov 17, 2024, 5:59 PM

#

oh dude with 12gb of vram you should be getting way faster times

#

when testing make sure to try two similar prompts twice

#

the first time it has to load the model, the second time is significantly faster

#

ive gotten times as low as 42 seconds with my 8gb so you should have no problem going even faster
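
The warm-up advice above (ignore the first run because it includes model loading) can be sketched as a tiny timing helper. `generate` is a hypothetical stand-in for whatever triggers one image generation; nothing here is a ComfyUI API.

```python
import time


def time_warm_runs(generate, warmup_runs: int = 1, timed_runs: int = 2):
    """Call generate() a few times untimed, then return durations of timed calls.

    The untimed calls absorb one-off costs such as loading the model.
    """
    for _ in range(warmup_runs):
        generate()
    durations = []
    for _ in range(timed_runs):
        start = time.perf_counter()
        generate()
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    def fake_generate():
        # Placeholder workload; replace with a real generation call.
        time.sleep(0.01)

    runs = time_warm_runs(fake_generate)
    print(f"best warm run: {min(runs):.3f}s over {len(runs)} runs")
```

Comparing the best warm run between two machines is a fairer test than comparing first-run times.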

shell bloom Nov 17, 2024, 6:00 PM

#

what graphics card do you have?

halcyon yarrow Nov 17, 2024, 6:00 PM

#

4070 on a laptop so it's technically more like a 4060 for pc

shell bloom Nov 17, 2024, 6:01 PM

#

I have an rtx 3060 but it is slower than a 4070 laptop, but I honestly don't know

craggy crest Nov 17, 2024, 6:02 PM

#

shell bloom I have an rtx 3060 but it is slower than a 4070 laptop, but I honestly don't kno...

what's your cpu?

halcyon yarrow Nov 17, 2024, 6:02 PM

#

the newer generations have more tensor cores maybe that's why? im not too sure. if you need help loading the unet model just take the image posted on my gallery and load that into your comfyui workflow

shell bloom Nov 17, 2024, 6:02 PM

#

i7 8700k

craggy crest Nov 17, 2024, 6:03 PM

#

shell bloom i7 8700k

are you running anything else in the background when you're generating?

shell bloom Nov 17, 2024, 6:03 PM

#

craggy crest are you running anything else in the background when you're generating?

I should check

shell bloom Nov 17, 2024, 6:03 PM

#

halcyon yarrow the newer generations have more tensor cores maybe that's why? im not too sure. ...

I will try mate, thanks

craggy crest Nov 17, 2024, 6:04 PM

#

a 3060 is going to be fairly slow as it is. so you can't have anything else running that'll want that GPU

shell bloom Nov 17, 2024, 6:04 PM

#

I'm waiting for the new rtx 5000 to build a new pc

craggy crest Nov 17, 2024, 6:05 PM

#

for now, make sure no games, or anything else that wants the gpu while you're generating

halcyon yarrow Nov 17, 2024, 6:05 PM

#

https://civitai.com/images/40558181 click on the little blue Nodes button and itll copy the WF to your clipboard so you can just paste it into ComfyUI using CTRL+V

shell bloom Nov 17, 2024, 6:06 PM

#

craggy crest for now, make sure no games, or anythign else that wants the gpu while you're ge...

I will, thank you.

bitter hearth Nov 17, 2024, 6:06 PM

#

3060 is about 4x slower than a 4090

#

its not too bad

frail shoal Nov 17, 2024, 6:12 PM

#

halcyon yarrow Nov 17, 2024, 6:19 PM

#

@shell bloom I think that @craggy crest is the right guy to ask in terms of sd35 training Ive read some of his messages and he's keeping up on that. im not into training or finetuning or any of that

shell bloom Nov 17, 2024, 6:20 PM

#

halcyon yarrow <@414634119033782272> I think that <@407561236339752981> is the right guy to ask...

thanks buddy

halcyon yarrow Nov 17, 2024, 6:21 PM

#

i have a script called shuffle-checkpoints that'll reassign items destined for other models to a specific model I set, so I just queued up 500 images into the flux-mini queue, expect to see some more flux-mini examples posted shortly

#

@bitter hearth testing your theory, wrote a quick script to remove the decoder side from t5, queued up a generation to see if it works or if it'll produce an error

bitter hearth Nov 17, 2024, 6:25 PM

#

okay thanks, I've been looking for flux-mini samples lol

halcyon yarrow Nov 17, 2024, 6:28 PM

#

#

looking at the t5 xxl v1.1 fp8 and it doesn't have the decoder layers so it's already optimized

bitter hearth Nov 17, 2024, 6:33 PM

#

what is that in the background on the right

#

is it a green chair

halcyon yarrow Nov 17, 2024, 6:37 PM

#

i think they're supposed to be like orcs or monsters of some sort lol

bitter hearth Nov 17, 2024, 6:40 PM

#

cannot tell if orc or chair 😂

mortal kite Nov 17, 2024, 6:42 PM

#

#

#

craggy crest Nov 17, 2024, 6:51 PM

#

bitter hearth cannot tell if orc or chair 😂

is both

rapid pivot Nov 17, 2024, 6:53 PM

#

bitter hearth Nov 17, 2024, 6:53 PM

#

maybe they'd still have the One Ring if they had orc-chairs on their side

mortal kite Nov 17, 2024, 6:58 PM

#

#

halcyon yarrow Nov 17, 2024, 7:18 PM

#

@mortal kite So you’re taking low Rez images of shirts and essentially up scaling them?

bitter hearth Nov 17, 2024, 7:19 PM

#

I think it might be that control-lora thing

#

since that went viral this week

halcyon yarrow Nov 17, 2024, 7:19 PM

#

The in context thing?

bitter hearth Nov 17, 2024, 7:19 PM

#

yeah

#

that little one on the left is the k-sampler preview rather than the input

halcyon yarrow Nov 17, 2024, 7:19 PM

#

I just hit 500 downloads on that Lora earlier actually

#

Yeah I think I got confused for that being the load image node lol

#

500 downloads for the in context Lora’s and not a single person has used it in the civit generator to post anything online, that’s kinda frustrating for me, I was excited to see how the community would use it but everyone is just offline generating and then not tagging if they even share it online

bitter hearth Nov 17, 2024, 7:23 PM

#

halcyon yarrow Yeah I think I got confused for that being the load image node lol

I was thinking that's what happened yeah

#

I actually don't use previews personally

#

if I want to see generation partly finished I would just stop k-sampler early

halcyon yarrow Nov 17, 2024, 7:25 PM

#

I’m the opposite, in fact I rewrote ComfyUi latent preview node just so I can see the preview of the batch

#

It’s super fun monitoring the preview of a batch of 4 in a grid and then on step 16/20 it decides to completely redo one of the set and it makes it way better or worse

#

I used to do batches of 5 now I do 4 just so it fits neatly as a grid lol
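
Why 4 tiles neatly and 5 does not can be shown with a small layout calculation. This is a sketch of one common way to pick a near-square grid, not the logic of any particular preview node; `grid_shape` is a made-up name.

```python
import math


def grid_shape(batch_size: int) -> tuple[int, int]:
    """Return (rows, cols) for a near-square grid that holds batch_size images."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    cols = math.ceil(math.sqrt(batch_size))
    rows = math.ceil(batch_size / cols)
    return rows, cols


for n in (4, 5):
    rows, cols = grid_shape(n)
    print(f"batch of {n}: {rows}x{cols} grid, {rows * cols - n} empty cell(s)")
```

A batch of 4 fills a 2x2 grid exactly, while a batch of 5 needs a 2x3 grid and leaves one cell empty.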

bitter hearth Nov 17, 2024, 7:26 PM

#

you might like this node pack https://github.com/blepping/ComfyUI-bleh it has improvements to the k-sampler previews
might be useful for ideas

#

oh yeah I don't use batching, its useful though

halcyon yarrow Nov 17, 2024, 7:27 PM

#

I think at this point I’m a die hard shark sampler guy, the only beef I have with shark sampler is how it handles the preview but I don’t hold it against him

bitter hearth Nov 17, 2024, 7:28 PM

#

I used TCD sampler for the vast majority of my images
not for Flux though

halcyon yarrow Nov 17, 2024, 7:28 PM

#

He’s letting the system handle it based on global user preferences and then he’s listening for the preview events if enabled, whereas ksampler efficient advanced ignores those settings and uses the local node settings to decide whether to show

#

What’s TCD?

bitter hearth Nov 17, 2024, 7:28 PM

#

its like hyper its a distilled version of SDXL or SD 1.5

#

this is the sampler for it https://github.com/JettHu/ComfyUI-TCD

halcyon yarrow Nov 17, 2024, 7:29 PM

#

So the sampler is also a distilled version of a model? Thats interesting

bitter hearth Nov 17, 2024, 7:29 PM

#

no the distilled versions are loras

#

the sampler just works well with them

#

cos it came from the same paper

#

the sampler is similar to euler_a

halcyon yarrow Nov 17, 2024, 7:30 PM

#

How does TCD handle artifacts?

#

You know how sometimes ksampler will do green splotches? Like in the corner of the mouths or the eyes or nose

bitter hearth Nov 17, 2024, 7:31 PM

#

not sure I haven't seen those

#

it's generally worse than regular SD 1.5 or SDXL for accuracy

halcyon yarrow Nov 17, 2024, 7:31 PM

#

That’s like the ultimate pet peeve for me, spent all this time generating an image and it almost feels cruel bc it’s artifacting key areas lol

bitter hearth Nov 17, 2024, 7:32 PM

#

restart sampler is the best, technically

#

as far as I know

#

does not work on flux though

halcyon yarrow Nov 17, 2024, 7:32 PM

#

Restart sampler? Don’t let ClownSharkBatwing hear ya lol

bitter hearth Nov 17, 2024, 7:32 PM

#

lol

#

noisy DPM/Res/Deis with a decent amount of Eta is also good but restart sampler is a bit better

halcyon yarrow Nov 17, 2024, 7:33 PM

#

I think it’s gonna be hard to switch from his sampler tbh, I’ve noticed on really complex images where the chances of artifacts are high it’ll go into this mode where rather than green splotches it’ll do like these artistic overlays I wish I could show you lol

remote holly Nov 17, 2024, 7:33 PM

#

bitter hearth Nov 17, 2024, 7:34 PM

#

there is no restart sampler for flux, SD 3.5 or auraflow

#

is the main issue

halcyon yarrow Nov 17, 2024, 7:35 PM

#

That’s kind of a deal breaker for me

#

I especially like how I can run all 5 base models with the exact settings and it handles it like a champ

bitter hearth Nov 17, 2024, 7:36 PM

#

yeah I actually don't use restart personally anyway since I use TCD

halcyon yarrow Nov 17, 2024, 7:36 PM

#

Sd15, sdxl, pony, flux and sd35 all with the same settings
ETA 0.5 Gaussian Gaussian res_2m beta57

bitter hearth Nov 17, 2024, 7:36 PM

#

the reason I like TCD in particular is that it is the acceleration lora with the highest image complexity
(there is a model they use in papers now that judges image complexity)

#

those settings are good yeah

#

in my tests res_3s needed a lot more steps than res_2s
but it depends on settings/model/workflow etc

halcyon yarrow Nov 17, 2024, 7:38 PM

#

bitter hearth the reason I like TCD in particular is that it is the acceleration lora with the...

Interesting, I am willing to adopt even more hardened samplers that can tackle challenges better. currently if res2m fails me I use res3s but even that still fails. tho from our last chat I could try switching to res3s and add double the steps to see if it improves

bitter hearth Nov 17, 2024, 7:40 PM

#

there's implicit steps as an option too

#

but you have 8GB so it might not be worth it

#

there is a limit to how slow would feel okay

gusty trail Nov 17, 2024, 7:52 PM

#

remote holly Nov 17, 2024, 7:54 PM

#

halcyon yarrow Nov 17, 2024, 8:35 PM

#

@bitter hearth so I found this ComfyUI node to use lavi https://github.com/kijai/ComfyUI-LaVi-Bridge-Wrapper/issues/1 and then I looked up the issues before tackling an install to confirm it works for sdxl and the team said:

Thank you for your interest in our LaVi-Bridge! We did not include SDXL in our current work, but we are conducting experiments on SDXL with LaVi-Bridge and will update our progress promptly in both the research paper and this repo.

GitHub

I heard this can be used with SDXL. · Issue #1 · kijai/ComfyUI-LaVi...

Is SDXL supported with Lavi?

#

that was 6 months ago they said that

#

comments are precious:

ELLA folks did release the adapter checkpoint for t5+SD1.5 (spoiler: it's not good) but announced that SD XL adapter will not be released.

With the ELLA team shooting the open-source community in the back by not releasing its SDXL tool, it's now come to your team to be our savior. Good luck! We're all rooting for you.

bitter hearth Nov 17, 2024, 9:01 PM

#

LOL

#

I was under the impression ELLA was better though, rather than worse

halcyon yarrow Nov 17, 2024, 9:03 PM

#

this comparison screenshot of before and after is pretty impressive

bitter hearth Nov 17, 2024, 9:03 PM

#

wow nice

halcyon yarrow Nov 17, 2024, 9:04 PM

#

the embeddings really do play a major role in image composition so being able to inject t5 into these legacy models would give them a huge boost, i'm just not willing to add support for it if its only 1.5. once sdxl support comes out I'm first in line to try it

bitter hearth Nov 17, 2024, 9:05 PM

#

the llama 7B results were even better apparently

craggy crest Nov 17, 2024, 9:10 PM

#

@halcyon yarrow do you know how to get comfyUI to run in a docker container on linux?

civic trail Nov 17, 2024, 9:12 PM

#

halcyon yarrow Nov 17, 2024, 9:18 PM

#

craggy crest <@156588917875933184> do you know how to get comfyUI to run in a docker containe...

the only experience I have with that is using RunPod, technically it's a docker container in linux, so I would just pick the image that has the prebuilt toolchain and then just have a provision script

#

I could share my deploy.sh but it's specific to RunPod

#

see the image I use is this one:

imageName: "runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04"
it doesn't even come with comfy just pytorch and cuda bc those are the 2 hardest/longest things to install

#

and then my script just does some basic stuff like:

if [ ! -d "$COMFYUI_DIR" ]; then
    git clone https://github.com/comfyanonymous/ComfyUI.git "$COMFYUI_DIR"
    # Navigate to the ComfyUI directory and update the repository;
    # stop if the directory is missing so git never runs in the wrong place
    cd "$COMFYUI_DIR" || exit 1
    git reset --hard origin/master
    git pull origin master

    # Step 4: Move the bootstrap custom_nodes into $COMFYUI_DIR/custom_nodes
    # (mv -n never overwrites nodes that already exist there)
    mkdir -p "$COMFYUI_DIR/custom_nodes"
    mv -n "$BOOTSTRAP_DIR/custom_nodes/"* "$COMFYUI_DIR/custom_nodes/"
fi

craggy crest Nov 17, 2024, 9:20 PM

#

halcyon yarrow the only experience I have with that is using RunPod, technicallyy it's a docker...

drat. okay. i keep running into people that want to use linux, and i have one guy now that's trying to use docker as well.

halcyon yarrow Nov 17, 2024, 9:21 PM

#

if you're using stuff like AWS i'm sure there's prebuilt AMIs that have ComfyUI built in or at least pytorch/cuda pre-installed but those instances are very expensive

pseudo owl Nov 17, 2024, 9:42 PM

#

Also this is pretty nice, improves performance of clip in general, they are more focused on multimodal models but it should work in sd3/sdxl/flux and other models which use clip
https://microsoft.github.io/LLM2CLIP/

LLM2CLIP: Powerful Language Model Unlock Richer Visual Representation

sudden parcel Nov 17, 2024, 9:48 PM

#

i have a weird looking rendered image when doing SD3.5 in A1111

#

im using SD3.5 large

#

https://gyazo.com/9129d12b1c529a1bf1e617a72dec552d

#

i assumed it had something to do with a VAE ... but i do something wrong, maybe its the wrong file or wrong location

hallow lion Nov 17, 2024, 9:54 PM

#

sudden parcel https://gyazo.com/9129d12b1c529a1bf1e617a72dec552d

Print it, frame it, sell for 50 million usd.

sudden parcel Nov 17, 2024, 9:55 PM

#

i will make a note

craggy crest Nov 17, 2024, 9:55 PM

#

halcyon yarrow the only experience I have with that is using RunPod, technicallyy it's a docker...

@atomic quest you might want to read this

sudden parcel Nov 17, 2024, 9:56 PM

#

does anybody know what i might do wrong?

craggy crest Nov 17, 2024, 9:57 PM

#

sudden parcel does anybody know what i might do wrong?

explain? you might be using the wrong vae, wrong lora, wrong encoders, given the prompt wrong, have all sorts of other settings wrong ... can you post your workflow?

sudden parcel Nov 17, 2024, 9:59 PM

#

i put the sd3.5_large safetensor file into the models/stable-diffusion folder. I did download a vae file from civit.ai

#

do i put the vae file into models/vae or models/stable-diffusion folder?

#

i tried both, but it is not working

halcyon yarrow Nov 17, 2024, 10:01 PM

#

@sudden parcel are you talking about ComfyUI bc ComfyUI doesn't have a stable-diffusion folder afaik, it's checkpoints, diffusion_models, or unet

sudden parcel Nov 17, 2024, 10:01 PM

#

A1111

craggy crest Nov 17, 2024, 10:01 PM

#

sudden parcel i put the sd3.5_large safetensor file into the models/stable-diffusion folder. I...

you have to use a VAE that's for SD3.5 - and it sounds like maybe the one you have isn't for sd3.5. also, you want to make sure you're using a model that doesn't have the VAE baked into it, if you are going to configure a vae inside comfy

craggy crest Nov 17, 2024, 10:02 PM

#

halcyon yarrow <@256658090710138880> are you talking boutu ComyUI bc ComfyUI doesn't have a sta...

for sd3.5 it's /comfyUI/models/checkpoints

#

and you can make a folder in checkpoints called sd3.5 if you want

sudden parcel Nov 17, 2024, 10:03 PM

#

there is no checkpoint folder

#

i mentioned twice its A1111

craggy crest Nov 17, 2024, 10:07 PM

#

sudden parcel i mentioned twice its A1111

for a1111, you would put sd3.5 where the other models go. it doesn't need a special folder. but you still need to use a VAE that's written for it, not just any vae you found somewhere. and you still need to make sure you're not using a model version that has the VAE baked into it if you are going to use a separate vae

#

and you still need to make sure you have cfg, steps, and other settings correct

unkempt compass Nov 17, 2024, 10:09 PM

#

mortal kite

It's interesting, but it's a pre-rendering. Not the final product. Because you'll have to provide a compatible file, with transparency, to any printing firm.
Do you have a plan for that?

sudden parcel Nov 17, 2024, 10:10 PM

#

ok....

#

does this have the vae baked into it? https://huggingface.co/stabilityai/stable-diffusion-3.5-large

stabilityai/stable-diffusion-3.5-large · Hugging Face

craggy crest Nov 17, 2024, 10:14 PM

#

sudden parcel ok....

no. the sd3.5_large.safetensors on the files page doesn't have the VAE. and the VAE in the folder on the files page is the one you want for it. you'll find your encoders on that page, in their folders, too

sudden parcel Nov 17, 2024, 10:15 PM

#

or is there a better place to go and download the sd3.5 model?

craggy crest Nov 17, 2024, 10:15 PM

#

and grab the sample workflow, too

halcyon yarrow Nov 17, 2024, 10:15 PM

#

@pseudo owl looking at LLM2CLIP, I like it, I want to try it, I'm confused as to how to use it, is it just a drop in replacement? do I just add it to my clip folder in ComfyUI? I don't get it lol

sudden parcel Nov 17, 2024, 10:15 PM

#

the vae file is not available

craggy crest Nov 17, 2024, 10:15 PM

#

sudden parcel or is there a better place to go and download the sd3.5 model?

that's the actual SAI release page but they also released it here https://civitai.com/models/878387/stable-diffusion-35-large

craggy crest Nov 17, 2024, 10:16 PM

#

sudden parcel the vae file is not available

https://huggingface.co/stabilityai/stable-diffusion-3.5-large/tree/main/vae

stabilityai/stable-diffusion-3.5-large at main

#

yes it is. it's called diffusion_pytorch_model.safetensors and it's in that VAE folder

#

just name it something like sd35VAE and put it in the folder your vaes go in

mortal mesa Nov 17, 2024, 10:18 PM

#

halcyon yarrow <@842033136560242708> looking at LLM2CLIP, I lke it, I wnt to t ry it, I'm confu...

looks to not be complete yet

halcyon yarrow Nov 17, 2024, 10:18 PM

#

i mean im just gonna try it, there's a 1.2GB .bin file in there that I'm gonna rename to safetensors and give it a whirl

#

these results are outstanding

#

LongCLIP has been officially dethroned, look at base clip L all the way inside of the blue circle, base clip L sucks a lot

sudden parcel Nov 17, 2024, 10:23 PM

#

ok, downloading

sudden parcel Nov 17, 2024, 10:26 PM

#

i placed it into the vae folder

#

rendering a test image needs 12 minutes

pseudo owl Nov 17, 2024, 10:34 PM

#

halcyon yarrow @pseudo owl looking at LLM2CLIP, I like it, I want to try it, I'm confu...

I didn't try it yet, just found it. I think it should work normally, except renaming it to safetensors might not, idk. Just loading the bin file should work I think.

halcyon yarrow Nov 17, 2024, 10:37 PM

#

yeah renaming the bin file did not work lol

#

not sure what you mean by "work normally" the Load Clip node in comfyui doesn't support .bin files
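
A rough sketch of why the rename can't work: a safetensors file starts with an 8-byte little-endian length followed by a JSON header, while a PyTorch `.bin` checkpoint is pickle-based (newer ones are zip archives), so changing the extension leaves the bytes in the wrong format. The `looks_like_safetensors` helper below is a hypothetical check written for illustration; it only inspects the header and does not validate the tensors.

```python
import json
import struct
from pathlib import Path


def looks_like_safetensors(path: Path) -> bool:
    """Return True if the file starts with a plausible safetensors header."""
    file_size = path.stat().st_size
    with path.open("rb") as f:
        prefix = f.read(8)
        if len(prefix) < 8:
            return False
        # First 8 bytes: header length as an unsigned little-endian 64-bit int.
        (header_len,) = struct.unpack("<Q", prefix)
        if header_len == 0 or header_len > file_size - 8:
            return False
        try:
            # The header itself must be UTF-8 JSON describing the tensors.
            header = json.loads(f.read(header_len).decode("utf-8"))
        except (UnicodeDecodeError, json.JSONDecodeError):
            return False
    return isinstance(header, dict)
```

A renamed `.bin` would be expected to fail this check, which is consistent with the loader rejecting it; an actual conversion has to load the weights and re-save them in the new format.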

bitter hearth Nov 17, 2024, 10:40 PM

#

they updated

#

look at their HF account

#

there is safetensors now

sullen moss Nov 17, 2024, 10:41 PM

#

Flux

pseudo owl Nov 17, 2024, 10:42 PM

#

yeah new clip l/14, just updated, https://huggingface.co/microsoft/LLM2CLIP-Openai-L-14-336/tree/main
didn't know bin files dont work with comfyui.

halcyon yarrow Nov 17, 2024, 10:42 PM

#

bitter hearth there is safetensors now

can you link us? I'm on this page: https://huggingface.co/microsoft/LLM2CLIP-EVA02-L-14-336/tree/main
I don't see them there, anyways I just used this script and converted it:
https://github.com/Silver267/pytorch-to-safetensor-converter?tab=readme-ov-file

#

but it didn't seem to work

pseudo owl Nov 17, 2024, 10:43 PM

#

halcyon yarrow can you link us? I'm on this page: https://huggingface.co/microsoft/LLM2CLIP-EVA...

this one: https://huggingface.co/microsoft/LLM2CLIP-Openai-L-14-336/tree/main

halcyon yarrow Nov 17, 2024, 10:43 PM

#

yeah i see that 2.3GB safetensors file wowza

#

i thought the EVA02 version was better than that version tho that's why I went EVA02 first

mortal mesa Nov 17, 2024, 10:45 PM

#

We expect to release all the parameters of the text model, adapter, and related components today. Previously, we experienced some delays due to precision issues during the Hugging Face conversion process

pseudo owl Nov 17, 2024, 10:46 PM

#

it contains the image encoder part too but thats completely useless for image gen models, the clip load node should load only the text encoder part.

EVA02 is a different model I believe, which is a better alternative to clip, but the image gen models don't use it, they use clip l. The best clip model is SigLip by google but again, no model uses it as a text encoder.

mortal mesa Nov 17, 2024, 10:48 PM

#

more pieces https://huggingface.co/microsoft/LLM2CLIP-Llama-3-8B-Instruct-CC-Finetuned/tree/main

halcyon yarrow Nov 17, 2024, 10:51 PM

#

i keep getting: 'NoneType' object has no attribute 'device'

#

this whole thing feels like it's not ready for us to try yet, if you guys get it working I'd love to hear about it

sudden parcel Nov 17, 2024, 10:55 PM

#

hmm... same results, no proper image, Sd3.5 model in the model folder and the vae file in the vae folder

craggy crest Nov 17, 2024, 11:05 PM

#

sudden parcel hmm... same results, no proper image, Sd3.5 model in the model folder and the va...

maybe it's time to stop using a1111 and switch to comfy?

sudden parcel Nov 17, 2024, 11:18 PM

#

that was my fear 🙂

#

downloading, that will take some time

craggy crest Nov 17, 2024, 11:37 PM

#

sudden parcel that was my fear 🙂

if you're willing to change, i suggest you consider installing SwarmUI and just letting it handle all the technical stuff, and using comfyUI inside it

#

it'll make your life a lot easier

pseudo owl Nov 17, 2024, 11:46 PM

#

@halcyon yarrow did you try any text with Flux-mini?

sudden parcel Nov 17, 2024, 11:46 PM

#

i just downloaded the comfyUI windows portable file

#

will this be a problem?

halcyon yarrow Nov 17, 2024, 11:53 PM

#

pseudo owl @halcyon yarrow did you try any text with Flux-mini?

oh that's a good question, in fact I have the perfect prompt

sudden parcel Nov 17, 2024, 11:55 PM

#

i put the SD3.5 model into checkpoints

craggy crest Nov 17, 2024, 11:59 PM

#

@dusky thistle https://huggingface.co/THUDM/CogVideoX1.5-5B

sudden parcel Nov 18, 2024, 12:01 AM

#

i put the vae file into the vae folder

#

is there anything else i have to do?

halcyon yarrow Nov 18, 2024, 12:13 AM

#

sudden parcel is there anything else i have to do?

the clip files into the clip folder

#

@sudden parcel if you want a quick solution without having to gather a bunch of files you can try this one with the "Default Workflow" https://civitai.com/models/879259/comfyorg-stable-diffusion-35-large-fp8?modelVersionId=984291 just put that into the checkpoints folder and you're good to go

#

@pseudo owl i tried a bunch of text, now remember I'm using LongClip + t5 flan, so while it's usually excellent with flux-d models, it seems mini has completely lost the ability to do any legible text

pseudo owl Nov 18, 2024, 12:18 AM

#

halcyon yarrow @pseudo owl i tried a bunch of text, now remember I'm using LongClip +...

oh thats sad, how about prompt following?

Try this prompt, I don't expect too much since even sd3.5 large or flux 8b messes this up often but why not: "A photograph of a white cat on top of a blue dog sitting on a brown couch in a living room. Behind them is a window and 4 cow pictures, one in each corner. Outside the window is a ufo hovering and outer space."

halcyon yarrow Nov 18, 2024, 12:20 AM

#

ok ill report back w the results

sudden parcel Nov 18, 2024, 12:23 AM

#

i need to leave, emergency

halcyon yarrow Nov 18, 2024, 12:34 AM

#

lol thats terrible adherence

pseudo owl Nov 18, 2024, 12:39 AM

#

halcyon yarrow lol thats terrible adherence

I mean thats not too bad, pixart also got something similar and it seems better than sdxl.

halcyon yarrow Nov 18, 2024, 12:43 AM

#

one more

white cat: yes
blue dog: no
brown couch: yes
living room: yes
4 cow pics: no
ufo hovering: yes
outerspace: no

#

it managed to nail 4 out of 8 elements, now to be fair that's 27 steps, let's try 40 and give it a good shot, see what we get

#

40 steps

pseudo owl Nov 18, 2024, 12:45 AM

#

nice increase in quality, what are those 4 things flying though lol

halcyon yarrow Nov 18, 2024, 12:45 AM

#

it managed to get 5.5 out of 8 now Ill give it half a point for the half cat half dog creature

#

they almost look like flying cows lol

#

overall i think flux mini is a fun thing to play with but it has no loras. unless civitai adds it as its own base model, and the community rallies behind it and makes finetunes and loras for it, little mini is just destined to not be famous

mortal kite Nov 18, 2024, 1:18 AM

#

halcyon yarrow <@688962710956015690> So you’re taking low Rez images of shirts and essentially ...

No, this was generated entirely with that prompt

#

I was planning on making these myself just at home with iron-on or something

#

was generated with Flux fp8

halcyon yarrow Nov 18, 2024, 1:22 AM

#

oh i see there's a new set of models here:

https://github.com/ali-vilab/In-Context-LoRA?tab=readme-ov-file#community-creations-using-ic-lora

I'll have to update the model page to include those too then:

https://civitai.com/models/929592/creative-effects-and-design-lora-pack-in-context-lora

#

that's funny for #8 they cited my page but I didn't create anything new I just reposted their stuff from the model zoo

mortal kite Nov 18, 2024, 1:24 AM

#

lol

halcyon yarrow Nov 18, 2024, 1:38 AM

#

@pseudo owl this sd35 turbo

halcyon yarrow Nov 18, 2024, 1:39 AM

#

mortal kite

is the lora only able to handle tshirts or can it do weird garments?

halcyon yarrow Nov 18, 2024, 1:39 AM

#

like for example can you try a tank top with a radioactive symbol or something like that

#

or a crop top instead of a tank top to make it even harder

mortal kite Nov 18, 2024, 1:39 AM

#

halcyon yarrow is the lora only able to handle tshirts or can it do weird garments?

There is no LORA. This is Flux Dev FP8 directly rendering the prompt "a fashion advertisement of a olive green colored T-shirt with an image of a DNA Double Helix. The Helix is bordered by a 70's style multicolor line. Text below the helix reads "CODE IS LIFE""

halcyon yarrow Nov 18, 2024, 1:40 AM

#

oh i see, i thought you were still LORA'ing, that's cool

mortal kite Nov 18, 2024, 1:40 AM

#

actually, I'm using PixelWave not flux base

#

forgot

#

its the Pixelwave flux model

#

I can try crop top

#

this time it puts a person in there

halcyon yarrow Nov 18, 2024, 1:40 AM

#

i love pixelwave its so good

#

btw have you tried that 4koma lora?

mortal kite Nov 18, 2024, 1:41 AM

#

#

no I have tried a few LORAs but haven't heard about that one

halcyon yarrow Nov 18, 2024, 1:41 AM

#

design a little wonky but yeah otherwise nailed it

mortal kite Nov 18, 2024, 1:41 AM

#

seed lottery

#

interesting, when you say crop top it always puts a model in

halcyon yarrow Nov 18, 2024, 1:43 AM

#

yeah you almost gotta say just the garment or just the top

craggy crest Nov 18, 2024, 2:14 AM

#

pseudo owl oh thats sad, how about prompt following? Try this prompt, I don't expect too ...

SD3.5 l

#

and... flux

halcyon yarrow Nov 18, 2024, 2:31 AM

#

craggy crest and... flux

this is the best one, it just missed one out of the 8 elements

craggy crest Nov 18, 2024, 2:35 AM

#

halcyon yarrow this is the best one, it just missed one out of the 8 elements

look closer at that cat

#

count the number of cow pictures and where they are placed

halcyon yarrow Nov 18, 2024, 2:36 AM

#

yeah the cat is a little wonky but i'd still give it a pass, its a cat on top of a dog lol

#

that's the only deduction i gave it that's why i said one of the 8 elements was the placement of the paintings

#

white cat: yes
blue dog: yes
brown couch: yes
living room: yes
4 cow pics: yes
placed in right corners: no
ufo hovering: yes
outerspace: yes
thus 7/8
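
The tally above can be written down as a tiny checklist score. This is only a sketch of the hand scoring used in the chat; the `adherence_score` helper is a hypothetical name, and the yes/no values are copied from the list above rather than measured automatically.

```python
# Hand-judged checklist for the Flux image, copied from the list above.
flux_result = {
    "white cat": True,
    "blue dog": True,
    "brown couch": True,
    "living room": True,
    "4 cow pics": True,
    "placed in right corners": False,
    "ufo hovering": True,
    "outerspace": True,
}


def adherence_score(checklist: dict[str, bool]) -> tuple[int, int]:
    """Return (elements satisfied, total elements) for a prompt checklist."""
    hits = sum(1 for ok in checklist.values() if ok)
    return hits, len(checklist)


hits, total = adherence_score(flux_result)
print(f"{hits}/{total}")  # prints "7/8"
```

Keeping the element list fixed across models makes scores comparable between images, which matters here because the two tallies in the chat list different numbers of elements.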

craggy crest Nov 18, 2024, 2:46 AM

#

halcyon yarrow yeah the cat is a little wonky but i'd still give it a pass, its a cat on top of...

the prompt is, however, unclear on a number of elements. it says "A photograph of a white cat on top of a blue dog sitting on a brown couch in a living room. Behind them is a window and 4 cow pictures, one in each corner. Outside the window is a ufo hovering and outer space." and (picky magazine editor coming out) if we break that down, we have some fairly unclear concepts.

it starts out fine with "a photograph of a white cat" and then says "on top of a blue dog" - on top... how? lying? sitting? sprawled? something else? then it moves on to "... sitting on a brown couch in a living room" - what is sitting? the cat? the dog? both? if an author had sent me that, i'd stop right there, send it back, and tell them to revise so it was clear to the reader what they were describing.

to go on, however, we come to "behind them is a window and 4 cow pictures, one in each corner" - "behind" has to refer back to the cat and dog, that's good. and the window is clear: we expect to see a window as their background, and windows have to be in walls, so we expect to see a wall. but then we get "4 cow pictures, one in each corner" - each corner of what? the window? are they actually IN the window, in the top right, top left, bottom right, and bottom left corners? are they on the wall next to the window's four corners? or does the author actually mean that the pictures are on the wall that the window is in, but in the 4 corners of the wall? not a clue what he really means, he didn't really say.

moving past that we see "outside the window is a ufo hovering and outer space" - the UFO is clear, we expect to see a stereotypical ufo through the window, but what does outer space mean? does it mean we just see a starry sky the way we see outer space from earth? does it mean we see outer space from some other point of view, as one of the images did, with the earth and stars around it? does it mean something else? again, it's not very clear.

and if I, a professional publisher and editor, am going to rip this apart and tell the author to revise it and make it clear, because I can not really tell what's being described, the poor AI that hasn't got any real experience other than its training data set is really going to have a hard time figuring out what is actually wanted.

halcyon yarrow Nov 18, 2024, 2:57 AM

#

yeah you're right about all of that, the prompt is lazy and unclear but i guess that's sort of the challenge, seeing if the model can make something that fits those loose words. i was also stumped when scoring/judging it on what "ufo hovering and outer space" meant, like how that could work, and then i saw your variation and im like "wow thats a really nice interpretation of that text" bc despite how unnatural it is, the view outside the window is of outer space

bottom line is you're right, the prompt is unclear, but it managed to put together all the elements in the prompt to deliver a really nice coherent image

glass meteor Nov 18, 2024, 3:01 AM

#

#🆕｜sd3 a car

craggy crest Nov 18, 2024, 3:11 AM

#

halcyon yarrow yeah you're right about all of that, the prompt is lazy and unclear but i guess...

yes, but it's a really good example of why AI's make so many mistakes, and why it's very important to not only be clear but also think like the computer does - unless you just really want random results

halcyon yarrow Nov 18, 2024, 3:13 AM

#

agreed

mortal mesa Nov 18, 2024, 3:57 AM

#

maybe AI will get better one day

craggy crest Nov 18, 2024, 3:59 AM

#

mortal mesa maybe AI will get better one day

halcyon yarrow Nov 18, 2024, 4:22 AM

#

[4koma] In this light and cheerful comic:
[SCENE-1] In a bright forest clearing, a cheerful boy with short brown hair stands facing an orange fox. The boy smiles and says, "Good morning, little friend!"

[SCENE-2] The fox holds up a shiny red apple, looking proud. The boy responds with a smile, "Wow, that’s so kind!"

[SCENE-3] Both sit beside a stream. The boy points at the water, laughing. He says, "This place is perfect!"

craggy crest Nov 18, 2024, 4:23 AM

#

halcyon yarrow > [4koma] In this light and cheerful comic: > [SCENE-1] In a bright forest clear...

too cute award!

halcyon yarrow Nov 18, 2024, 4:24 AM

#

using that in-context lora that's new called '4koma' designed to make cute scenes like this, flux dedistilled is the only model that can consistently nail a 'complex' set of text, literally every other flux model ive tried today can't even get one word bubble correct

craggy crest Nov 18, 2024, 4:24 AM

#

halcyon yarrow using that in-context lora that's new called '4koma' designed to make cute scene...

have you tried recraft yet?

mortal mesa Nov 18, 2024, 4:24 AM

#

#

halcyon yarrow Nov 18, 2024, 4:27 AM

#

craggy crest have you tried recraft yet?

i got as far as making an account, finding there was no model i can download, and seeing it wasn't free. i get there are 50 free credits per day but im not gonna get attached to something that's not free so i stayed away from it

craggy crest Nov 18, 2024, 4:27 AM

#

yeah, it's not free, but it's very good at illustration and cartoon

halcyon yarrow Nov 18, 2024, 4:27 AM

#

@mortal mesa that looks like a remix from the one you posted the other day, cool variations. I like the first one the most bc it was such a lush set of greenery, this one is more 'dead' despite all the plants, cool concept that's for sure

mortal mesa Nov 18, 2024, 4:28 AM

#

ya same prompt/seed, just with loras

halcyon yarrow Nov 18, 2024, 4:29 AM

#

this is recraft, just copied and pasted the exact prompt into each box, super intuitive interface

#

it nailed the first scene no complaints there
the boy's face looks weird in the second, wish he was wearing the same clothes too, the text is wrong too
in the third it lost context and now it's a girl? what happened to the fox?

I'm sure with a few adjustments to the prompt I could solve all that. I could def see the use for this for creative professionals

#

there we go, some slight tweaks to the prompt and I basically did the same concept in less than a minute with recraft, while locally rendering that with flux took like 700 seconds lol

#

@craggy crest yeah recraft is pretty cool that was fun to make, took up 28 credits to do so i have room to make about 2 of these per day

obtuse hinge Nov 18, 2024, 5:36 AM

#

Generate a poster

craggy crest Nov 18, 2024, 5:39 AM

#

obtuse hinge Generate a poster

read the information in this channel -> #artisan-faq

dusky thistle Nov 18, 2024, 9:02 AM

#

all SD35M

dusky thistle Nov 18, 2024, 9:35 AM

#

obtuse hinge Generate a poster

Here is the image you requested.

native mist Nov 18, 2024, 11:14 AM

#

cute boy,with black glasses

patent acorn Nov 18, 2024, 3:09 PM

#

halcyon yarrow @craggy crest yeah recraft is pretty cool that was fun to make, took up ...

ok this is real impressive

#

i still wish there was a major sd3.5 finetune to improve big stuff

#

recraft composition is CLEAAAN

halcyon yarrow Nov 18, 2024, 3:12 PM

#

patent acorn recraft composition is CLEAAAN

agreed

craggy crest Nov 18, 2024, 3:31 PM

#

patent acorn i still wish there is a major sd3.5 fintune to improve big stuff

you could create one

dry wave Nov 18, 2024, 3:39 PM

#

flux is really good for multi-panel images, too. It usually preserves character identity really well. Complex text still might be a problem, though

bitter hearth Nov 18, 2024, 4:07 PM

#

what appealed to me a lot in the paper was the sandstorm thing
would be cool to try to train ones for smoke effects or lighting effects

halcyon yarrow Nov 18, 2024, 4:17 PM

#

dry wave flux is really good for multi-panel images, too. It usually preserves character ...

i agree, I think between the in-context lora for multi-panel and doing really good text it's an excellent model for handling that, but at the same time stuff like recraft is more practical for the non-tech savvy who want to do something like that and don't have the skills or means to run a setup like that locally, plus 700 seconds vs 60 seconds. if recraft was free free I'd be defending it more but a paid service is kinda lame

bitter hearth Nov 18, 2024, 4:34 PM

#

out of the closed-source ones, I think FLUX Pro 1.1 Ultra is the most impressive cos its 2048*2048 in just 10 seconds

#

but if you include upscalers I think its possibly the latest Topaz Gigapixel
saw a video where it did a creative upscale that ended up over 19,000 pixels wide

halcyon yarrow Nov 18, 2024, 4:45 PM

#

sometimes I'll generate what I consider garbage (made using fluxubooru) bc it didn't adhere to the prompt or the source image, it just sort of did its own thing. I shared it on civit anyways, ive had 10 reactions and 30 buzz from it

frail shoal Nov 18, 2024, 4:45 PM

#

SD3.5M quality seems great, but I'm only using it as a refiner to Pixart sigma. It does shitty compositions otherwise, very bland. Happy to have such a small model packing so much pixels.

bitter hearth Nov 18, 2024, 4:47 PM

#

halcyon yarrow sometime I'll generate what I consider garbage (made using fluxubooru) bc it di...

this looks fine but its normal dev quality

halcyon yarrow Nov 18, 2024, 4:48 PM

#

lol yeah its nothing particularly outstanding, just funny one man's garbage is another man's treasure

bitter hearth Nov 18, 2024, 4:49 PM

#

its very subjective yeah

craggy crest Nov 18, 2024, 5:01 PM

#

halcyon yarrow sometime I'll generate what I consider garbage (made using fluxubooru) bc it di...

now go animate that.

halcyon yarrow Nov 18, 2024, 5:01 PM

#

lol i do have some free credits with Kling

craggy crest Nov 18, 2024, 5:01 PM

#

halcyon yarrow lol yeah its nothing particularly outstanding, just funny one man's garbage is a...

i wrote a short story that you need to read http://www.bewilderingstories.com/issue240/sculptor.html

The Sculptor

What is perfection in art? Who knows? But keep in mind that the 'rough spots' may be part of it.

halcyon yarrow Nov 18, 2024, 5:03 PM

#

lol that's a good story

#

perfectionist syndrome

craggy crest Nov 18, 2024, 5:04 PM

#

:) yup. all artists suffer from it - too close to the trees and can't see the forest

#

too busy looking at the bark beetles to see anything else

halcyon yarrow Nov 18, 2024, 5:13 PM

#

personally i don't see myself as an artist, or a perfectionist, i'm not detail oriented and I often overlook glaring mistakes, like i didn't notice the cat on top of the dog looked weird until you pointed it out lol

bitter hearth Nov 18, 2024, 5:13 PM

#

||56y||

hexed mulch Nov 18, 2024, 5:22 PM

#

cat

craggy crest Nov 18, 2024, 5:24 PM

#

halcyon yarrow personally i don't see myself as an artist, or a perfectionist, i'm not detail o...

the dictionary defines art as 'human creativity expressed' - not what the final result is, or anything else. if you're being creative, you're making art. and if you're making art, you're an artist

craggy crest Nov 18, 2024, 5:25 PM

#

hexed mulch cat

dog

halcyon yarrow Nov 18, 2024, 5:25 PM

#

craggy crest the dictionary defines art as 'human creativity expressed' - not what the final ...

Ok you got me lol 😆

pseudo owl Nov 18, 2024, 6:14 PM

#

craggy crest the prompt is, however, unclear on a number of elements. it says "A photograph o...

Yeah I know it's not very properly formatted but surprisingly, most of the time, fixing that doesn't really improve quality. This is flux schnell 8 steps, same seed. Left has 3 cow pictures but the dog has no head, right has 2 cow pictures and the dog does have a head.

For example, left is
A photograph of a white cat sitting on top of a blue dog. The blue dog is sitting on the brown couch. Behind the couch is a square window with a square cow picture in each corner of the window, the total amount of windows being 4. Outside the window is a ufo hovering in dark outer space.

Right is

A photograph of a white cat sitting on top of a blue dog sitting on a brown couch in a living room. Behind them is a square window and 4 square cow pictures, one in each corner of the window. Outside the window is a ufo hovering and dark outer space.

craggy crest Nov 18, 2024, 6:16 PM

#

pseudo owl Yeah I know it’s not very properly formatted but surprisingly most of the times,...

Left has 3 cow pictures but dog has no head, < the dog has a head, it's turned away from the camera and the cat is sitting on it, blocking it from view

pseudo owl Nov 18, 2024, 6:17 PM

#

Ok yeah, that’s justified.

bitter hearth Nov 18, 2024, 6:25 PM

#

the most common prompt adherence benchmark, clip score on ms-coco, uses prompts like this:
```
227590,The passenger train is painted brown and white.
467578,A box of donuts with a coffee in front of it.
379476,A long tunnel with a long table with lots of seats and candles next to wine glasses.
35206,a tennis player crouching down near the net
173208,A plate of food has some sesame seed bagels.
416059,Two people walk through the snow behind a dog.
350278,A zebra standing on top of a dirt field.
143224,An airplane on the tarmac and the glass passageway leading to its door
294853,"A man in a red cap, green shirt and white shorts holds a tennis racket under his arm."
323552,A young girl with glasses appears to be waiting with luggage at the baggage center.
185181,A giraffe on display in a glass enclosure.
43850,A man standing over his dog on a beach while holding a surfboard next to the ocean.
351369,A landing jet airplane kicking up spray on a wet runway.
558661,A woman that is standing in the grass with a frisbee.
119516,A beautiful woman standing on the side of a rad next to a street.
89790,A man in a parking lot talking to the driver of an army green pickup truck.
```
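
The rows above look like `image_id,caption` CSV, with quotes around captions that contain commas. A minimal sketch of reading them, assuming that two-column layout holds for the whole file; `parse_prompts` and `SAMPLE` are hypothetical names, and the actual CLIP scoring step is not shown here.

```python
import csv
import io

# Three rows copied from the list above, including the quoted one.
SAMPLE = '''227590,The passenger train is painted brown and white.
294853,"A man in a red cap, green shirt and white shorts holds a tennis racket under his arm."
350278,A zebra standing on top of a dirt field.
'''


def parse_prompts(text: str) -> dict[int, str]:
    """Map image id -> caption, letting the csv module handle quoted commas."""
    reader = csv.reader(io.StringIO(text))
    # Skip empty rows; assumes exactly two columns per non-empty row.
    return {int(row[0]): row[1] for row in reader if row}


prompts = parse_prompts(SAMPLE)
print(len(prompts))  # prints 3
```

Splitting on the first comma by hand would also work for these rows but would leave the surrounding quotes in place, which is why the sketch leans on the `csv` module.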

halcyon yarrow Nov 18, 2024, 6:40 PM

#

using KLING 1.0

#

KLING 1.5

halcyon yarrow Nov 18, 2024, 6:42 PM

#

bitter hearth the most common prompt adherence benchmark, clip score on ms-coco, uses prompts ...

i almost feel like running the whole set against flux dedistilled just to confirm it would score 100% on it

bitter hearth Nov 18, 2024, 6:43 PM

#

they've started to move on to harder benches yeah

halcyon yarrow Nov 18, 2024, 6:44 PM

#

here's two examples of the clownshark sampler doing some interesting effects rather than creating artifacts, prompt is:

score_9, score_8_up, score_7_up, source_anime, masterpiece, best quality, perfect anatomy, very aesthetic, absurdres, (3 girls), cute, standing in a fancy restaurant, carrying menu, french maid, intricate detail, 1girl

bitter hearth Nov 18, 2024, 6:47 PM

#

the extra noise helps yeah

#

it randomly pushes the model out of areas with low score function gradient

#

sometimes the model thinks it has found a good solution but it only found a good solution for that particular area of the solution space, that didn't have a lot of gradient
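
A toy 1-D illustration of that idea, not the ClownShark sampler or a real diffusion model: the "score" here has a flat patch around 0 with no gradient and peaks at +3 and -3. The `grad` and `sample` functions, the step size, and the linear noise schedule are all assumptions made up for this sketch.

```python
import random


def grad(x: float) -> float:
    """Gradient of a toy score: flat near 0, pulled toward peaks at +3 and -3."""
    if abs(x) < 0.1:
        return 0.0  # plateau: no gradient signal at all
    return (3.0 if x > 0 else -3.0) - x


def sample(noise_scale: float, steps: int = 200, lr: float = 0.1, seed: int = 0) -> float:
    """Follow the gradient from x = 0, optionally adding annealed Gaussian noise."""
    rng = random.Random(seed)
    x = 0.0  # start inside the flat region
    for t in range(steps):
        sigma = noise_scale * (1 - t / steps)  # noise shrinks toward zero
        x += lr * grad(x) + sigma * rng.gauss(0.0, 1.0)
    return x


stuck = sample(noise_scale=0.0)    # no noise: never leaves the plateau
escaped = sample(noise_scale=0.5)  # with noise: ends near one of the peaks
```

Without noise the update is always zero, so the point stays where it started; with noise it gets knocked off the flat patch and the gradient takes over from there, which is the behaviour described in the messages above.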

craggy crest Nov 18, 2024, 6:51 PM

#

bitter hearth the most common prompt adherence benchmark, clip score on ms-coco, uses prompts ...

The passenger train is painted brown and white< i wanna see that image

craggy crest Nov 18, 2024, 6:51 PM

#

halcyon yarrow i almost feel like running the whole set against flux dedistilled just to confirm it...

just say no to flux

craggy crest Nov 18, 2024, 6:51 PM

#

halcyon yarrow here's two example of clownshark sampler doing some interesting effects rather t...

that's a pony specific prompt. it's only going to work correctly with pony

mortal mesa Nov 18, 2024, 6:52 PM

#

yes works also

halcyon yarrow Nov 18, 2024, 6:53 PM

#

craggy crest that's a pony specific prompt. it's only going to work correctly with pony

yeah it was rendered with pony, i just finally found an example I can share of this cool effect the sampler is doing

craggy crest Nov 18, 2024, 6:55 PM

#

bitter hearth the most common prompt adherence benchmark, clip score on ms-coco, uses prompts ...

for this prompt "A man in a red cap, green shirt and white shorts holds a tennis racket under his arm." the word 'holds' means 'gripped in the hand'. the normal way to say this is 'tucked under one arm' - but i have to wonder how many images in the data training set show tennis players with a racket under an arm as opposed to how many show them holding the racket in a hand?

bitter hearth Nov 18, 2024, 7:02 PM

#

ms-coco is kinda old now, came out in 2014
it was for object detection so for that prompt really it was designed to put a bounding box on the hat, shirt, shorts and racket

#

it gets kept around for historical reasons but its not optimised for image gen at all

#

the downsides of switching the widely used benchmarks are so high that they only change benchmark when they really really have to

#

FID is very flawed also, and it is now well-known how to game FID (fake a good score, which for FID means a low one)

#

but it correlates decently enough with human preferences so they still keep it

craggy crest Nov 18, 2024, 8:12 PM

#

bitter hearth it gets kept around for historical reasons but its not optimised for image gen a...

SD3.5 does fine, but he's 'holding' a racket in his hand, not under his arm - because you never talk about someone holding a racket if you mean it's tucked under an arm

#

sd3.5 L prompt: a pink poodle eating a large taco while sitting on a barrel

#

prompt: a pink poodle sitting on a barrel. it is holding a large taco in its front paws and gnawing on it

#

be specific in your prompt, you'll get closer to what you want

halcyon yarrow Nov 18, 2024, 8:20 PM

#

@bitter hearth I looked into LLM2CLIP further and I had a few takeaways

so the LLM model is out and the vision model is out
I tried the vision nodes in ComfyUI to try to make it work somehow and I don't think this one is compatible with that architecture
it seems like the only way this is going to work is basically an upgraded ClipTextEncode node where you type in the prompt and, rather than sending it to an LLM to generate a better prompt which then gets converted to embeddings, it sends it to an LLM to generate better text embeddings
ultimately this is one of those "the more compute you throw at a problem the better output you get" things, i just don't think I wanna have an 8B LLM model as part of the processing pipeline to improve my images
Once someone gets that stuff working in ComfyUI maybe I can try quantizing the LLM into like a Q2 to make it real quick tho

bitter hearth Nov 18, 2024, 10:31 PM

#

craggy crest prompt: a pink poodle sitting on a barrel. it is holding a large taco in its fro...

oh this poodle taco thing worked really well thanks