#🆕｜sd3 | Stable Diffusion | Page 125

bitter hearth Nov 18, 2024, 10:32 PM

#

he definitely looks like he is holding it in second one

craggy crest Nov 18, 2024, 10:33 PM

#

welcome :)

bitter hearth Nov 18, 2024, 10:33 PM

#

I'm going the other direction, trying to involve an LLM every time now

#

they can do other things than just prompts, for example bounding boxes

craggy crest Nov 18, 2024, 10:34 PM

#

bitter hearth I'm going the other direction, trying to involve an LLM every time now

that'll work well for sd3.X and flux as they were captioned by an LLM - but it won't work so well for, say, sd1.5 as it was pretty much captioned by people who put their images up with google SEO in mind - because those are the text labels in the LAION database

bitter hearth Nov 18, 2024, 10:35 PM

#

yeah SD 1.5 is wild

#

I really love that model

craggy crest Nov 18, 2024, 10:35 PM

#

bitter hearth yeah SD 1.5 is wild

but fun

bitter hearth Nov 18, 2024, 10:35 PM

#

the LLM captioned ones (basically all the modern ones) never quite captured the chaos of SD 1.5

craggy crest Nov 18, 2024, 10:36 PM

#

bitter hearth the LLM captioned ones (basically all the modern ones) never quite captured the ...

here's a text string i found in the laion database that works really well for sd1.5: melting liquid falling into the bottom of the drop

#

see what you can do with it

pseudo owl Nov 18, 2024, 10:37 PM

#

bitter hearth they can do other things than just prompts, for example bounding boxes

Yeah this makes the llm generate regions for flux, the prompt following is truly amazing with this, basically regional prompting with flux but better and faster. https://github.com/NJU-PCALab/RAG-Diffusion

GitHub

GitHub - NJU-PCALab/RAG-Diffusion: Region-Aware Text-to-Image Gener...

Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement 🔥 - NJU-PCALab/RAG-Diffusion

bitter hearth Nov 18, 2024, 10:37 PM

#

thanks I saved this one

#

on SDXL "colorful background:1.3" is the best hidden gem I have found I think

#

for some reason the model listens to it a lot

craggy crest Nov 18, 2024, 10:38 PM

#

bitter hearth on SDXL "colorful background:1.3" is the best hidden gem I have found I think

:1.3 is supposed to be weights and sdxl doesn't use that. that's midjourney

#

change the numbers and see what it does

bitter hearth Nov 18, 2024, 10:39 PM

#

oh yeah I'm just using that to communicate it
in comfy you have to make sure the strength increase is actually done how you intend cos so many nodes differ
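
For readers unfamiliar with the notation being discussed: a minimal sketch of how a "(text:weight)" emphasis string can be split into weighted chunks. The parenthesised syntax and the helper name `parse_weights` are assumptions for illustration only. As noted above, different UIs and nodes interpret the resulting weights differently, so this shows the parsing step, not how any particular node applies the strength.

```python
import re

# Matches "(some text:1.3)"; nested parentheses are deliberately not handled.
_WEIGHTED = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")

def parse_weights(prompt):
    """Split a prompt into (text, weight) chunks; unweighted text gets 1.0."""
    parts = []
    pos = 0
    for m in _WEIGHTED.finditer(prompt):
        plain = prompt[pos:m.start()].strip(" ,")
        if plain:
            parts.append((plain, 1.0))
        parts.append((m.group(1).strip(), float(m.group(2))))
        pos = m.end()
    tail = prompt[pos:].strip(" ,")
    if tail:
        parts.append((tail, 1.0))
    return parts

chunks = parse_weights("a cat, (colorful background:1.3), photo")
print(chunks)
```

Changing the number and re-running the same seed is the quickest way to see how a given node actually treats the weight.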

craggy crest Nov 18, 2024, 10:40 PM

#

bitter hearth oh yeah I'm just using that to communicate it in comfy you have to make sure the...

sure, but see what changing the numbers does, anyway

pseudo owl Nov 18, 2024, 10:40 PM

#

bitter hearth on SDXL "colorful background:1.3" is the best hidden gem I have found I think

Color does actually look pretty nice, and this is base sdxl.

bitter hearth Nov 18, 2024, 10:42 PM

#

yeah that's the sort of thing

#

SD 1.5 one

#

livens them up a bit

dry wave Nov 18, 2024, 10:44 PM

#

pseudo owl Yeah this makes the llm generate regions for flux, the prompt following is truly...

wow that looks cool

halcyon yarrow Nov 18, 2024, 10:44 PM

#

@bitter hearth LLMs used for object detection, prompt enhancing and now embedding generation? it's like they're taking over the image gen scene!

bitter hearth Nov 18, 2024, 10:44 PM

#

halcyon yarrow @bitter hearth LLMs used for object detection, prompt enhancing and now ...

LLMs wrote my prompts from day 1 TBH

#

mostly OpenAI models or Florence 2

bitter hearth Nov 18, 2024, 10:51 PM

#

pseudo owl Yeah this makes the llm generate regions for flux, the prompt following is truly...

yeah this is SOTA, currently I think

#

there was a competition won by someone using LLM agents with Powerpaintv2 which may be better, not sure

#

or progressive outpainting by LLM agents, there's been a couple papers on that but they weren't compared here

#

feels like the "Soft Refinement" stage could also be applied to inpainting workflow 🤔

pseudo owl Nov 18, 2024, 10:58 PM

#

bitter hearth yeah this is SOTA, currently I think

The only thing I don't like is that inference time grows considerably with more masks, not as much if you do a 4-step/8-step lora but still a pretty large amount. Any link to the llm agent with powerpaintv2 or progressive outpainting? That seems pretty interesting.

craggy crest Nov 18, 2024, 11:00 PM

#

bitter hearth LLMs wrote my prompts from day 1 TBH

prompt: fluffy cotton candy clouds, whipped cream, splatters, transmission fluid rain

bitter hearth Nov 18, 2024, 11:01 PM

#

looks tasty I guess

bitter hearth Nov 18, 2024, 11:02 PM

#

pseudo owl The only thing I don't like is that inference speed grow considerably with more ...

I'll post it tomorrow
I always forget the names of papers

craggy crest Nov 18, 2024, 11:03 PM

#

llm is never going to think about coming up with a prompt like that

bitter hearth Nov 18, 2024, 11:03 PM

#

pseudo owl The only thing I don't like is that inference speed grow considerably with more ...

and yeah the inference time is long, my regional workflows have been taking over a dozen minutes on an L40s
it's pretty rough
and this is with just euler and 8-10 step acceleration loras

#

not even clownsampler

craggy crest Nov 18, 2024, 11:03 PM

#

bitter hearth Nov 18, 2024, 11:04 PM

#

craggy crest llm is never going to think about coming up with a prompt like that

I do think GPT 4 is still pretty bad at prompting

craggy crest Nov 18, 2024, 11:04 PM

#

bitter hearth I do think GPT 4 is still pretty bad at prompting

just say no to anything closedAI does

bitter hearth Nov 18, 2024, 11:05 PM

#

I'm actually not an open source enthusiast personally
although I understand the motivations behind the movement

craggy crest Nov 18, 2024, 11:06 PM

#

bitter hearth I'm actually not an open source enthusiast personally although I understand the ...

i'm anti-OpenAI - i'm also anti-altman

#

he's a scam artist from what i've seen

bitter hearth Nov 18, 2024, 11:08 PM

#

have definitely seen some shenanigans in the news regarding that company

craggy crest Nov 18, 2024, 11:08 PM

#

bitter hearth have definitely seen some shenanigans in the news regarding that company

which is why i call them closedAI

bitter hearth Nov 18, 2024, 11:08 PM

#

yeah I find the name pretty funny, it does fit

craggy crest Nov 18, 2024, 11:09 PM

#

if you want a good llm, use claude from anthropic, or meta.ai - or one of the opensource llama versions

bitter hearth Nov 18, 2024, 11:09 PM

#

they said they still won't open source the original GPT 3.5 because it is too dangerous

#

even though it's weaker than some 7B now

craggy crest Nov 18, 2024, 11:09 PM

#

bitter hearth they said they still won't open source the original GPT 3.5 because it is too dang...

microsoft owns them, the instant microsoft became their exclusive partner, they closed everything and made that excuse

bitter hearth Nov 18, 2024, 11:10 PM

#

I still need to try claude yeah

craggy crest Nov 18, 2024, 11:10 PM

#

and until that contract is over - and it won't be over until they succeed in developing AGI - microsoft gets their technology for free in exchange for giving them access to their data centers, also for free

rapid pivot Nov 18, 2024, 11:10 PM

#

Where images

craggy crest Nov 18, 2024, 11:10 PM

#

so until they aren't in microsoft's backpocket, they do what small and limp says to do

rapid pivot Nov 18, 2024, 11:11 PM

#

sadcat

craggy crest Nov 18, 2024, 11:11 PM

#

rapid pivot sadcat

#

i'm turning them into videos

bitter hearth Nov 18, 2024, 11:11 PM

#

rapid pivot Where images

rapid pivot Nov 18, 2024, 11:11 PM

#

Videos are still images

#

waow 👍

craggy crest Nov 18, 2024, 11:11 PM

#

rapid pivot Videos are still images

yes, but that's where they are, headed into kling

rapid pivot Nov 18, 2024, 11:12 PM

#

bitter hearth

I see a feline face in it

bitter hearth Nov 18, 2024, 11:12 PM

#

on the right? yeah

#

on the right is real life LOL

rapid pivot Nov 18, 2024, 11:13 PM

#

What's happening there

craggy crest Nov 18, 2024, 11:13 PM

#

bitter hearth on the right is real life LOL

you have very strange clouds

bitter hearth Nov 18, 2024, 11:13 PM

#

rapid pivot What's happening there

if I remember rightly this image was testing a flux realism lora

craggy crest Nov 18, 2024, 11:14 PM

#

pseudo owl Nov 18, 2024, 11:14 PM

#

Pixtral large came out too today, really impressive. Its text capability is similar to gpt4o, llama 405b, gemini1.5 pro while its image understanding is actually better than gpt4o, sonnet3.5, and gemini1.5 pro. And it's far cheaper than gpt4o, 405b, and sonnet 3.5.

https://huggingface.co/mistralai/Pixtral-Large-Instruct-2411

mistralai/Pixtral-Large-Instruct-2411 · Hugging Face

bitter hearth Nov 18, 2024, 11:15 PM

#

yeah this is huge

craggy crest Nov 18, 2024, 11:15 PM

#

craggy crest Nov 18, 2024, 11:16 PM

#

bitter hearth yeah this is huge

wait till next year at this time. you will not recognize the world

bitter hearth Nov 18, 2024, 11:17 PM

#

not sure
it doesn't feel like ML changed that much in 2024 compared to a year ago

#

has grown less than I expected

craggy crest Nov 18, 2024, 11:17 PM

#

bitter hearth not sure it doesn't feel like ML changed that much in 2024 compared to a year ag...

it hasn't. wasn't supposed to. next year - massive video and robot push

bitter hearth Nov 18, 2024, 11:18 PM

#

next year will be video year yeah

craggy crest Nov 18, 2024, 11:18 PM

#

things are lining up for that push now

pseudo owl Nov 18, 2024, 11:18 PM

#

bitter hearth yeah this is huge

and the vision encoder is just a measly 1b params. Llama 90b's vision encoder is 20b params, but mistral large still beats it by far.

bitter hearth Nov 18, 2024, 11:18 PM

#

I don't know about robots, don't follow that area

#

I guess this model is gonna need A100 80GB for Comfyui

craggy crest Nov 18, 2024, 11:19 PM

#

bitter hearth I don't know about robots, don't follow that area

you might want to come up to speed and fast in that case

#

don't prune your model, remove the trash from your data training set

#

cute adorable big-eyed happy chirping, fluffy baby bird by artist "jacek yerka", by artist "Jasmine Becket-Griffith"'

bitter hearth Nov 18, 2024, 11:21 PM

#

looks pixar

craggy crest Nov 18, 2024, 11:23 PM

#

bitter hearth looks pixar

it does. see what you can do with the prompt

#

try it in sd1.5 ;)

bitter hearth Nov 18, 2024, 11:23 PM

#

yeah will have a go next time I get server

rapid pivot Nov 18, 2024, 11:24 PM

#

Quick tell me @craggy crest what sampler and cfg for 3.5

#

waow

craggy crest Nov 18, 2024, 11:24 PM

#

rapid pivot Quick tell me @craggy crest what sampler and cfg for 3.5

sampler: euler_ancestral, scheduler: linear_quadratic, cfg: 3.5 to 4, steps: 32

rapid pivot Nov 18, 2024, 11:25 PM

#

Linear who

#

sadcat

craggy crest Nov 18, 2024, 11:25 PM

#

unless you have an old version of comfy

rapid pivot Nov 18, 2024, 11:25 PM

#

Sounds like math

#

I don't see this option here

craggy crest Nov 18, 2024, 11:25 PM

#

linear quadratic

rapid pivot Nov 18, 2024, 11:25 PM

#

Surely karras will be fine

craggy crest Nov 18, 2024, 11:26 PM

#

with euler ancestral

craggy crest Nov 18, 2024, 11:26 PM

#

rapid pivot Surely karras will be fine

ehhhh talk to @bitter hearth about that

rapid pivot Nov 18, 2024, 11:26 PM

#

sadcat what is this

craggy crest Nov 18, 2024, 11:26 PM

#

rapid pivot sadcat what is this

a confused female

bitter hearth Nov 18, 2024, 11:27 PM

#

rapid pivot sadcat what is this

sounds like you tried Karras LMAO

craggy crest Nov 18, 2024, 11:27 PM

#

simple or beta

bitter hearth Nov 18, 2024, 11:27 PM

#

the Karras schedule is not good for SD 3.5, Auraflow or Flux
because it takes steps that are too big early on

#

these models want schedules that take small steps early on in the process

craggy crest Nov 18, 2024, 11:28 PM

#

here @rapid pivot https://docs.google.com/spreadsheets/d/17jzqpz3FyolUwvUREQ-oY1SaRLHlliGVkGoy7F5rfaI/edit?usp=sharing read through that

Google Docs

35-sampler-scheduler-compares

bitter hearth Nov 18, 2024, 11:28 PM

#

increasing shift helps with this
switching to beta also helps, and linear quadratic helps massively

#

you need to take care as small steps early on necessarily means large steps later

#

so there are limits to how much you can focus your steps in the early stages
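
The step-size argument above can be checked numerically. This is a rough sketch, not what ComfyUI runs internally: it assumes sigmas in the range 0.002 to 1 for the Karras schedule (rho = 7), a linear time grid for the flow-style schedule, and the usual time-shift formula sigma = shift * t / (1 + (shift - 1) * t). The function names and the choice of 10 sigmas are illustrative.

```python
def karras_sigmas(n, sigma_min=0.002, sigma_max=1.0, rho=7.0):
    """Karras-style schedule: interpolate linearly in sigma**(1/rho) space."""
    lo = sigma_min ** (1.0 / rho)
    hi = sigma_max ** (1.0 / rho)
    return [(hi + i / (n - 1) * (lo - hi)) ** rho for i in range(n)]

def shifted_sigmas(n, shift=3.0):
    """Linear time grid from 1 to 0, warped by the flow-model shift formula."""
    ts = [1.0 - i / (n - 1) for i in range(n)]
    return [shift * t / (1.0 + (shift - 1.0) * t) for t in ts]

def step_sizes(sigmas):
    """Size of each denoising step (difference between consecutive sigmas)."""
    return [a - b for a, b in zip(sigmas, sigmas[1:])]

karras_steps = step_sizes(karras_sigmas(10))
shift3_steps = step_sizes(shifted_sigmas(10, shift=3.0))
shift6_steps = step_sizes(shifted_sigmas(10, shift=6.0))

print("first step, karras :", round(karras_steps[0], 4))
print("first step, shift=3:", round(shift3_steps[0], 4))
print("first step, shift=6:", round(shift6_steps[0], 4))
print("last step,  shift=3:", round(shift3_steps[-1], 4))
print("total,      shift=3:", round(sum(shift3_steps), 4))
```

Under these assumptions the Karras schedule spends a large share of the total range on its very first step, while a higher shift shrinks the early steps. The step sizes always sum to the same total, which is why small early steps force larger ones at the end.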

rapid pivot Nov 18, 2024, 11:29 PM

#

This looks like it's working

#

Emo girl red room

#

I'm smart prompt

bitter hearth Nov 18, 2024, 11:29 PM

#

its good

craggy crest Nov 18, 2024, 11:29 PM

#

rapid pivot I'm smart prompt

red emo girl room

bitter hearth Nov 18, 2024, 11:30 PM

#

if Sana comes out it will be a bit more flexible

#

they removed the positional embeddings that cause Flux grid

craggy crest Nov 18, 2024, 11:31 PM

#

bitter hearth they removed the positional embeddings that cause Flux grid

if robin would just fix flux, flux would be fine

rapid pivot Nov 18, 2024, 11:31 PM

#

craggy crest red emo girl room

Here it is thomas

craggy crest Nov 18, 2024, 11:31 PM

#

rapid pivot Here it is thomas

interesting that all it affected is her lipstick

rapid pivot Nov 18, 2024, 11:32 PM

#

And her hair

craggy crest Nov 18, 2024, 11:32 PM

#

didn't make her hair red

rapid pivot Nov 18, 2024, 11:32 PM

#

Looks like my hair if you told me to cut it myself

#

thomas

bitter hearth Nov 18, 2024, 11:32 PM

#

personally I am the biggest fan of Lumina
it uses rotary embeddings
needs aesthetic fine tune though

craggy crest Nov 18, 2024, 11:32 PM

#

watches the entire world try to learn to think like a computer... and fail

bitter hearth Nov 18, 2024, 11:33 PM

#

there are some comfy things that can stop prompt bleeding

#

I don't use the fancier ones as I am not that bothered

craggy crest Nov 18, 2024, 11:33 PM

#

bitter hearth there are some comfy things that can stop prompt bleeding

or you can just learn how to prompt correctly ;)

bitter hearth Nov 18, 2024, 11:33 PM

#

yeah you can do it with prompt engineering

#

I use concat conditioning node when things get spicy, it boxes things off

craggy crest Nov 18, 2024, 11:34 PM

#

you've seen my workflows. they are very small and never any negative prompts. and i don't get bleeding unless i want it

#

https://youtu.be/9XOTRBY9MOY?si=xxXFjoeXGzqsuw0C

YouTube

MONTY

TOKTIK

▶ Play video

bitter hearth Nov 18, 2024, 11:36 PM

#

its been a few months since I last used a negative yeah

rapid pivot Nov 18, 2024, 11:36 PM

#

waow

#

I love red, looks squishable

bitter hearth Nov 18, 2024, 11:37 PM

#

I like how recent models don't have SDXL's trend towards yellow

rapid pivot Nov 18, 2024, 11:38 PM

#

#

Now it looks better

#

5cfg waow

craggy crest Nov 18, 2024, 11:42 PM

#

rapid pivot 5cfg waow

5 is real close to the edge. go much farther and you'll start overcooking it - however you might find that useful in some situations

rapid pivot Nov 18, 2024, 11:43 PM

#

craggy crest 5 is real close to the edge. go much farther and you'll start over cooking it - ...

Let me try something more then

#

I think I upped it too much thomas @craggy crest

#

It's cooked now

craggy crest Nov 18, 2024, 11:46 PM

#

rapid pivot It's cooked now

you're coming over for dinner and bringing that with you, right?

bitter hearth Nov 18, 2024, 11:46 PM

#

AI food is always so good

rapid pivot Nov 18, 2024, 11:46 PM

#

bitter hearth AI food is always so good

thomas

#

Let me get you something

bitter hearth Nov 18, 2024, 11:47 PM

#

Playground v2.5 cakes were amazing

craggy crest Nov 18, 2024, 11:47 PM

#

bitter hearth AI food is always so good

not... always...

#

#

#

rapid pivot Nov 18, 2024, 11:48 PM

#

bitter hearth AI food is always so good

craggy crest Nov 18, 2024, 11:49 PM

#

rapid pivot

your chef needs to learn how to cut up pineapple

rapid pivot Nov 18, 2024, 11:49 PM

#

Those are special jamanderee pineapples

mortal mesa Nov 18, 2024, 11:56 PM

#

gritty gale Nov 18, 2024, 11:57 PM

#

Hello

dusky thistle Nov 19, 2024, 1:17 AM

#

winter crater Nov 19, 2024, 1:29 AM

#

does a1111 support sd3.5?

craggy crest Nov 19, 2024, 1:39 AM

#

winter crater does a1111 support sd3.5?

kinda?

still wadi Nov 19, 2024, 2:02 AM

#

Two posters for the Black Friday event

patent acorn Nov 19, 2024, 2:04 AM

#

if only you'd lend me 9 thousand H100s

rapid pivot Nov 19, 2024, 2:08 AM

#

What will a bloon do with so many h100s

#

I'm scared

craggy crest Nov 19, 2024, 2:14 AM

#

patent acorn if only you'd lend me 9 thousand H100s

sorry, they're already out on loan

patent acorn Nov 19, 2024, 2:17 AM

#

rapid pivot What will a bloon do with so many h100s

i will make a more powerful architecture than recraft and open source it (i don't have a degree in computer science)

dusky thistle Nov 19, 2024, 2:18 AM

#

#

rapid pivot Nov 19, 2024, 2:30 AM

#

patent acorn i will make a more powerful architecture than recraft and open source it (i don't have...

With the power of friendship we can do it all ! waow (has zero coding or math skills)

halcyon yarrow Nov 19, 2024, 2:40 AM

#

dusky thistle

I love this neon glow in the dark style I wish people made more art like this

dusky thistle Nov 19, 2024, 2:47 AM

#

#

brave tide Nov 19, 2024, 6:25 AM

#

I overhauled a Comfy KSampler and built it with T2I. Does anyone want to use this KSampler?
I'm thinking of making it public if there is a response.
I would like to contribute to the development.
The size is 1024 x 1024.
Created with StableDiffusion 3.5.

craggy crest Nov 19, 2024, 7:29 AM

#

dusky thistle Nov 19, 2024, 8:30 AM

#

mellow quartz Nov 19, 2024, 8:51 AM

#

aple

dusky thistle Nov 19, 2024, 8:52 AM

#

#

muted dove Nov 19, 2024, 11:27 AM

#

#

sudden parcel Nov 19, 2024, 5:19 PM

#

i have stable-diffusion-3.5-large, where do i find the clip files for it?

gusty trail Nov 19, 2024, 5:20 PM

#

Using IC Lora to create each Chinese character and concatenate all together

dusky thistle Nov 19, 2024, 5:32 PM

#

real terrace Nov 19, 2024, 5:43 PM

#

I haven't used SD3.5 for a while. From all these models, which would you choose for speed or quality? Or is there some other finetune or model around these days?

pseudo owl Nov 19, 2024, 5:45 PM

#

real terrace I haven't used SD3.5 for a while, from all this models, what would you choose, f...

Depends on how much vram you have, turbo is the fastest and 2nd best, large is the slowest but the best.

Medium uses the least vram but is worse quality and worse speed than turbo.

real terrace Nov 19, 2024, 5:48 PM

#

pseudo owl Depends on how much vram you have, turbo is the fastest and 2nd best, large is t...

ty, I have 12 GB VRAM

I have a workflow with medium. Can I switch to turbo, or do I have to change the workflow for it?

pseudo owl Nov 19, 2024, 5:53 PM

#

real terrace ty, I have 12 GB VRAM I have a workflow with medium, I can switch to turb...

Sd3.5 large, turbo or not, won't fit normally, you will need quantized versions. These will lose a bit of quality (still much better than sd3.5 medium) but use far less vram. The Q8 one is basically lossless and should fit, but lower ones will take even less vram.
https://huggingface.co/calcuis/sd3.5-large-gguf

calcuis/sd3.5-large-gguf · Hugging Face

real terrace Nov 19, 2024, 5:57 PM

#

pseudo owl Sd3.5 large turbo or not won’t fit normally, you will need quantized versions. T...

I'm downloading that one, will test it ty

real terrace Nov 19, 2024, 6:24 PM

#

oh I found I downloaded some other gguf

mortal mesa Nov 19, 2024, 7:34 PM

#

run the full boat one twice if the speed is acceptable stay with it - some guy on the internet

craggy crest Nov 19, 2024, 7:42 PM

#

so city96 seems to have gone off the deep end https://huggingface.co/city96/Flux.1-Heavy-17B

city96/Flux.1-Heavy-17B · Hugging Face

halcyon yarrow Nov 19, 2024, 7:43 PM

#

lol @craggy crest have you tried running it?

craggy crest Nov 19, 2024, 7:44 PM

#

halcyon yarrow lol @craggy crest have you tried running it?

you need 80 gig vram. i don't have that

halcyon yarrow Nov 19, 2024, 7:44 PM

#

what about in lowvram mode?

craggy crest Nov 19, 2024, 7:44 PM

#

halcyon yarrow what about in lowvram mode?

i don't have 40 gig, either ;)

halcyon yarrow Nov 19, 2024, 7:46 PM

#

$1.20/hr to rent 80GB vram runpod for an hour. I have $4 in credits I could play with it there but ehh....

#

so he just merged it on itself? i don't get that concept, like I could understand if he merged a bunch of loras in like @short thicket did but even that doesn't increase parameter count. do you understand what he did @craggy crest

craggy crest Nov 19, 2024, 7:50 PM

#

halcyon yarrow so he just merged it on itself? i don't get that concept, like I could understan...

did you read his entire front page?

bitter hearth Nov 19, 2024, 7:55 PM

#

craggy crest did you read his entire front page?

LOL I'm downloading it already

mortal mesa Nov 19, 2024, 7:56 PM

#

Do you feel like you have too much VRAM lately? Want to OOM on a 40GB A100? This is the model for you!

#

lmao

craggy crest Nov 19, 2024, 7:57 PM

#

bitter hearth LOL I'm downloading it already

oh lord. can't wait to see what you do with it

mortal mesa Nov 19, 2024, 7:59 PM

#

" Usage: Good luck."

halcyon yarrow Nov 19, 2024, 8:00 PM

#

I was really curious/interested in what he meant by this sentence: "Merging was done similarly to 70B->120B LLM merges, with the layers repeated and interwoven in groups." so I had a chat about it with chatgpt to better understand it: https://chatgpt.com/share/673cee36-c87c-800f-bc3a-c956e7ff1ac7

ChatGPT

A conversational AI system that listens, learns, and challenges

halcyon yarrow Nov 19, 2024, 8:00 PM

#

bitter hearth LOL I'm downloading it already

doing the lord's work my man lool report back and let us know if you can get anything out of it, even if it's a 128x128 image

#

is that even possible? what's the smallest image you can make with flux and SD3 and the like?

bitter hearth Nov 19, 2024, 8:01 PM

#

Flux does 100x100

#

this is the sort of self-merge it is based on https://old.reddit.com/r/LocalLLaMA/comments/1aj2jw0/miqu_120b_selfmerge_like_venusmegadolphin/

craggy crest Nov 19, 2024, 8:03 PM

#

halcyon yarrow is that even possible? what's the smallest image you can make with flux and SD3 ...

1 pixel by 1 pixel

bitter hearth Nov 19, 2024, 8:03 PM

#

not even wrong TBH

rapid pivot Nov 19, 2024, 8:03 PM

#

craggy crest

Bring that person closer to the camera

#

Then it reveals it is actually Shrek

#

thomas

bitter hearth Nov 19, 2024, 8:04 PM

#

did you know SD 1.5 and SDXL can also make rly small images like 250x250 or less
if you use Unet Temperature node

#

I only found that last week

craggy crest Nov 19, 2024, 8:04 PM

#

rapid pivot Bring that person closer to the camera

wait'll you see the entire video ;)

rapid pivot Nov 19, 2024, 8:04 PM

#

I can't wait for the Shrek reveal

#

waow

bitter hearth Nov 19, 2024, 8:05 PM

#

someone should secretly finetune shrek into the model

#

but only certain tokens trigger him

halcyon yarrow Nov 19, 2024, 8:05 PM

#

bitter hearth this is the sort of self-merge it is based on https://old.reddit.com/r/LocalL...

On one hand, it's amazing that you can improve a model by effectively copying around information that it already contains.

On the other hand, doesn't this suggest that the way inference currently works is suboptimal? If a program like mergekit can produce a 120b model from a 70b model that outperforms that 70b model without needing any additional information, shouldn't it be possible to build this into the inference code itself, and get the performance of the frankenmerge from the 70b model directly, without requiring additional memory?

this is exactly what i was thinking and why i asked chatgpt about it

#

even chatgpt was incredulous this technique would work or offer any benefit and yet it does

bitter hearth Nov 19, 2024, 8:05 PM

#

deep learning in general is the most suboptimal thing

halcyon yarrow Nov 19, 2024, 8:06 PM

#

what would happen if we GGUF q2 or q8 the flux 17b model?

bitter hearth Nov 19, 2024, 8:07 PM

#

would go fine

#

it will fit on 8GB GPU

halcyon yarrow Nov 19, 2024, 8:07 PM

#

can i run it on 8gb of gpu memory then?! lol

bitter hearth Nov 19, 2024, 8:07 PM

#

yeah

halcyon yarrow Nov 19, 2024, 8:07 PM

#

is that what you're going to do?

#

or are you gonna try running the full thing?

bitter hearth Nov 19, 2024, 8:08 PM

#

no I'm just gonna make R2D2 pictures

halcyon yarrow Nov 19, 2024, 8:08 PM

#

do you have the requisite vram tho?

bitter hearth Nov 19, 2024, 8:08 PM

#

ye I rented L40s

#

about $0.8/hr

halcyon yarrow Nov 19, 2024, 8:10 PM

#

sweet dude, make sure to try the best of the best for everything, t5xxl fp32, don't hold back lol

mortal mesa Nov 19, 2024, 8:12 PM

#

its bf16 already i was gonna convert to fp16 if it was fp32

bitter hearth Nov 19, 2024, 8:13 PM

#

the t5 can be Q8, its the same performance as fp32 for inference

rapid pivot Nov 19, 2024, 8:14 PM

#

bitter hearth someone should secretly finetune shrek into the model

A model that makes everything look 1% Shrek, even objects

#

thomas

halcyon yarrow Nov 19, 2024, 8:19 PM

#

whats better bf16 or fp32?

halcyon yarrow Nov 19, 2024, 8:39 PM

#

[INFO ] model.cpp:793 - load flux.1-heavy-17B.safetensors using safetensors format
[INFO ] model.cpp:1776 - model tensors mem size: 9436.40MB
[INFO ] model.cpp:1811 - load tensors done
[INFO ] model.cpp:1812 - trying to save tensors to flux.1-heavy-17B.q4_0.gguf
convert 'flux.1-heavy-17B.q4_0.gguf' success
Conversion completed in 0 hour(s) 15 minute(s) 19.4 second(s).
Press any key to continue . . .

#

works for me 🙂 @bitter hearth Prompt executed in 80.52 seconds

bitter hearth Nov 19, 2024, 9:00 PM

#

guys what do we do if Flux Heavy is better LOL

dull star Nov 19, 2024, 9:05 PM

#

what

#

well we will start crying then

bitter hearth Nov 19, 2024, 9:08 PM

#

well we have the model, so we don't need to be sad

halcyon yarrow Nov 19, 2024, 9:14 PM

#

bitter hearth guys what do we do if Flux Heavy is better LOL

wo so that's the same seed right?

#

do side-by-side comparisons using the same seed, bc I think heavy is better but it could be considered subjective. it's not amazingly better, right?

bitter hearth Nov 19, 2024, 9:17 PM

#

that was same seed yeah

#

how did you get a GGUF?

halcyon yarrow Nov 19, 2024, 9:20 PM

#

i made it

bitter hearth Nov 19, 2024, 9:20 PM

#

ah okay nice

halcyon yarrow Nov 19, 2024, 9:21 PM

#

here's the link if you just wanna download it: https://civitai.com/models/964045?modelVersionId=1079329

#

its already up on civit lol

#

here's some samples

#

I'd have more but mochi is hogging the queue right now

rapid pivot Nov 19, 2024, 9:25 PM

#

What's better about it

halcyon yarrow Nov 19, 2024, 9:43 PM

#

rapid pivot What's better about it

the guy who created it just showed a picture of the base model, vs a picture after it self merged and the after was somewhat better than the base, not much to go on

bitter hearth Nov 19, 2024, 9:43 PM

#

looks like your GGUF was done correctly, thanks a lot
it does lose a fair bit in Q4 but it works

halcyon yarrow Nov 19, 2024, 9:43 PM

#

i'm still skeptical about the whole concept of self-merge but it's a thing and it's been demonstrated to actually improve the model so i'm waiting on @bitter hearth to post some side-by-sides

#

yeah the GGUF seems to hold out well compared to the full 17b model

#

can you try testing the full model on complex text? I'm seeing poor results on my end for that. i'm also using a cheap setup so i'm gonna try to push it on that end in a minute after i'm done with the images for the gallery

mortal mesa Nov 19, 2024, 9:50 PM

#

halcyon yarrow here's the link if you just wanna download it: https://civitai.com/models/964045...

is that you? i saw that name talking about using klown sampler with mochi somewhere and it was nice

halcyon yarrow Nov 19, 2024, 9:52 PM

#

mortal mesa is that you? i saw that name talking about using klown sampler with mochi somewh...

lol yeah man gotta spread the knowledge

mortal mesa Nov 19, 2024, 9:53 PM

#

ya was good stuff

halcyon yarrow Nov 19, 2024, 9:54 PM

#

i noticed that if i upload mochi videos to civitai as webp they get treated as images and they get filtered from the images feed and the videos feed so basically they don't get shown

#

ended up having to add another node to convert it to mp4 so i can share it properly

pseudo owl Nov 19, 2024, 9:56 PM

#

halcyon yarrow here's some samples

seems really oversaturated for some reason, high cfg?

halcyon yarrow Nov 19, 2024, 9:58 PM

#

cfg 3 or 3.5, maybe I should set it to 1 since flux-d and therefore this version is distilled and im not using any flux guidance nodes?

pseudo owl Nov 19, 2024, 10:00 PM

#

halcyon yarrow cfg 3 or 3.5, maybe I should set it to 1 since flux-d and therefore this version...

yeah maybe, probably a good idea.

winged seal Nov 19, 2024, 10:02 PM

#

halcyon yarrow here's some samples

Ok, so this is a compressed repeat layer style merge? Interesting. So the model itself isn't any bigger because it's just cloning the same weights, but inference will be much slower?

halcyon yarrow Nov 19, 2024, 10:07 PM

#

winged seal Ok, so this is a compressed repeat layer style merge? Interesting. So the model ...

the model is bigger in parameter count and size and inference will be slower

winged seal Nov 19, 2024, 10:07 PM

#

Wait, so how is it only 9.8GB if its 17B params at FP8, yet Fp8 Flux Dev is ~12 GB?
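
A likely explanation, assuming the file in question is the q4_0 GGUF from the conversion log earlier rather than an FP8 checkpoint: q4_0 stores weights in blocks of 32 four-bit values plus one 16-bit scale, which works out to 4.5 bits per weight. The back-of-the-envelope sketch below uses nominal parameter counts taken from the model names (12B and 17B) and ignores metadata and any tensors kept at higher precision, so real file sizes will differ somewhat. `estimate_gb` is a hypothetical helper.

```python
def estimate_gb(n_params, bits_per_weight):
    """Rough weight storage in GB (1 GB = 1e9 bytes), ignoring file metadata."""
    return n_params * bits_per_weight / 8 / 1e9

FP8_BITS = 8.0
# q4_0 block: 32 weights * 4 bits + one 16-bit scale = 144 bits per 32 weights
Q4_0_BITS = (32 * 4 + 16) / 32

flux_dev_fp8 = estimate_gb(12e9, FP8_BITS)   # nominal 12B parameters
heavy_fp8 = estimate_gb(17e9, FP8_BITS)      # nominal 17B parameters
heavy_q4_0 = estimate_gb(17e9, Q4_0_BITS)

print(f"Flux dev   fp8 : {flux_dev_fp8:.2f} GB")
print(f"Flux heavy fp8 : {heavy_fp8:.2f} GB")
print(f"Flux heavy q4_0: {heavy_q4_0:.2f} GB")
```

So a 17B model at roughly 4.5 bits per weight can plausibly come out smaller than a 12B model at 8 bits per weight.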

halcyon yarrow Nov 19, 2024, 10:07 PM

#

it's a really interesting concept. when I asked chatgpt about it, the LLM described this as how self-merge works:

How It Works
a. Layer Duplication and Interleaving

Duplication: Each layer of the original model is copied one or more times.

Interleaving: The duplicated layers are interwoven with the original layers in a specific sequence.

For example, consider a simplified model with layers [L1, L2, L3]. A self-merge might result in [L1, L1', L2, L2', L3, L3'], where L1' is a copy of L1.
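
The layer pattern in that example can be sketched on a toy list of layer names. This only illustrates the ordering described above and is not the recipe used for any actual merge. `duplicate_each` and `interleave_groups` are hypothetical helpers, and the group and overlap sizes are arbitrary. The second function approximates the "repeated and interwoven in groups" wording from the model card with overlapping slices.

```python
def duplicate_each(layers):
    """[L1, L2, L3] -> [L1, L1', L2, L2', L3, L3'] (each copy marked with ')."""
    out = []
    for name in layers:
        out.extend([name, name + "'"])
    return out

def interleave_groups(layers, group, overlap):
    """Stack overlapping slices of `group` layers; overlapped layers repeat."""
    if not 0 <= overlap < group:
        raise ValueError("overlap must be in [0, group)")
    stride = group - overlap
    out = []
    start = 0
    while start < len(layers):
        out.extend(layers[start:start + group])
        if start + group >= len(layers):
            break
        start += stride
    return out

base = ["L1", "L2", "L3", "L4"]
print(duplicate_each(base[:3]))
print(interleave_groups(base, group=2, overlap=1))
```

Either way the merged stack has more layers than the base model, so parameter count, file size and per-step compute all go up even though no new weights were trained.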

winged seal Nov 19, 2024, 10:08 PM

#

Yeah, people do it all the time for LLM's, but it never really improves anything, just allows you to post a flashy number

halcyon yarrow Nov 19, 2024, 10:08 PM

#

so technically it's duplicating the layers right? and then quantizing is rounding of the weights in the layers so it's almost like we're artificially doubling the size and then putting it in a zip file

winged seal Nov 19, 2024, 10:08 PM

#

I am not sure what the benefits would be, as flux lite already looks just as good as, has full compatibility with, and runs way faster than flux dev

winged seal Nov 19, 2024, 10:09 PM

#

halcyon yarrow so technically it's duplicating the layers right? and then quantizing is roundi...

I see. Still not sure why anybody would want to make flux bigger when its already obese/oversized 😅

halcyon yarrow Nov 19, 2024, 10:09 PM

#

winged seal Yeah, people do it all the time for LLM's, but it never really improves anything...

that's not true, someone posted a reddit thread earlier that showed a 70b that scored higher benchmarks at 120b by self merging

halcyon yarrow Nov 19, 2024, 10:09 PM

#

winged seal I see. Still not sure why anybody would want to make flux bigger when its alread...

the author of the model showed an image of improvement from base model to 17b using the same prompt so it does show signs of improvement there too

winged seal Nov 19, 2024, 10:09 PM

#

yeah, benchmarks. Self merging increases biases and patterns, which means over-expressed concepts like information trained in specifically to cheat benchmarks expresses even more

winged seal Nov 19, 2024, 10:09 PM

#

halcyon yarrow the author of the model showed an image of improvement from base model to 17b us...

I'd be curious what the "improvements" are

halcyon yarrow Nov 19, 2024, 10:10 PM

#

@bitter hearth is actively testing the 17B model on a rented A40, we'll see if he can come up with anything that can 'wow' us as far as improvements

winged seal Nov 19, 2024, 10:10 PM

#

base dev without training sucks ass for anything except over baked plastic images 😅

And I say that as somebody who might soon have a job dedicated to training flux lol

halcyon yarrow Nov 19, 2024, 10:10 PM

#

to be fair i'm using the q_4 model

#

here's some more 512px images using q4 flux heavy 17b

winged seal Nov 19, 2024, 10:11 PM

#

My research partner and I were able to demo incredible style/concept improvement in dev with proper training, so we are in the stages of securing funding

winged seal Nov 19, 2024, 10:11 PM

#

halcyon yarrow here's some more 512px images using q4 flux heavy 17b

jesus fucking christ she looks burnt

halcyon yarrow Nov 19, 2024, 10:11 PM

#

winged seal My research partner and I were able to demo incredible style/concept improvement...

have you tried flux dev de-distilled? i swear by that model

winged seal Nov 19, 2024, 10:12 PM

#

Our interest is in full flux dev for corporate

halcyon yarrow Nov 19, 2024, 10:12 PM

#

winged seal jesus fucking christ she looks burnt

agreed

pseudo owl Nov 19, 2024, 10:12 PM

#

winged seal base dev without training sucks ass for anything except over baked plastic image...

nah idk, there's no finetune that improves the general capability of flux dev? except de-distilled.

winged seal Nov 19, 2024, 10:12 PM

#

pseudo owl nah idk, theres no finetune that improves general capability of flux dev? except...

that is not true lol

#

PixelWave Flux is a monumental improvement for flux across the board

#

a majority of the others are pretty ass though, I will agree. Most people are too aggressive and impatient with training

pseudo owl Nov 19, 2024, 10:13 PM

#

winged seal PixelWave Flux is a monumental improvement for flux across the board

flux dev's prompt following, human anatomy, and text are better or similar from what I tested, art styles are improved for sure though in pixelwave.

winged seal Nov 19, 2024, 10:14 PM

#

winged seal Nov 19, 2024, 10:14 PM

#

pseudo owl flux dev's prompt following, human anatomy, and text are better or similar from ...

you just said it's all better or equally as good... so it's an improvement by literal definition lmao

halcyon yarrow Nov 19, 2024, 10:15 PM

#

winged seal PixelWave Flux is a monumental improvement for flux across the board

pixelwave IS really good I agree but it's no flux dev de-distilled

pseudo owl Nov 19, 2024, 10:15 PM

#

winged seal you just said, its all better or equally as good... so its an improvement by lit...

I said flux dev's prompt following, human anatomy, and text are better or similar to pixelwave, art styles are better but it has some cost of the above things.

halcyon yarrow Nov 19, 2024, 10:15 PM

#

pseudo owl flux dev's prompt following, human anatomy, and text are better or similar from ...

yea it's still within the same class, it's better but within the same class, whereas de-distill is in a class of its own imo

winged seal Nov 19, 2024, 10:16 PM

#

pseudo owl I said flux dev's prompt following, human anatomy, and text are better or simila...

oh, yeah, you have to learn how to prompt pixel wave flux, that is true. But when you have a good prompt, it follows it better than dev by a huge amount. It absolutely trashes SD3 in every regard too, thats for sure 😅

pseudo owl Nov 19, 2024, 10:16 PM

#

halcyon yarrow yea it's still within the same class, it's better but within the same class, whe...

The main problem with de-distill is that it takes roughly 2x as long, and flux dev is already really slow, but yeah, it's a slight improvement over everything.

halcyon yarrow Nov 19, 2024, 10:16 PM

#

i agree it's the slowest model by far, were you around when i posted that chart with my average model render time?

winged seal Nov 19, 2024, 10:16 PM

#

halcyon yarrow yea it's still within the same class, it's better but within the same class, whe...

I'm not seeing anything impressive about it. got any examples?

pseudo owl Nov 19, 2024, 10:17 PM

#

winged seal oh, yeah, you have to learn how to prompt pixel wave flux, that is true. But whe...

nah idk, try to format this example prompt for pixelwave, and I'll try it with flux dev
A photograph of a white cat sitting on top of a blue dog. The blue dog is sitting on the brown couch. Behind the couch is a square window with a square cow picture in each corner of the window, the total amount of windows being 4. Outside the window is a ufo hovering in dark outer space.

halcyon yarrow Nov 19, 2024, 10:17 PM

#

but it's one of those "you get what you pay for" situations, if you have the time to do it right and you don't care how long it takes for flux to do a good job, then flux de-distill is the way to go

winged seal Nov 19, 2024, 10:17 PM

#

pseudo owl nah idk, try to format this example prompt for pixelwave, and I'll try it with f...

ok

halcyon yarrow Nov 19, 2024, 10:17 PM

#

winged seal I'm not seeing anything impressive about it. got any examples?

https://civitai.com/models/843551

winged seal Nov 19, 2024, 10:17 PM

#

I know, I am not seeing anything impressive on its page

#

there's like 4 images lol

pseudo owl Nov 19, 2024, 10:18 PM

#

Probably not the best examples lol, just try it I guess. There is a huggingface space too.

halcyon yarrow Nov 19, 2024, 10:18 PM

#

it's less impressive when you just see an image. try some of your rubric prompts, stuff that's hard to adhere to, where you see a model doesn't always hit all the elements. 9/10 times flux de-distill will nail a very complex prompt

halcyon yarrow Nov 19, 2024, 10:18 PM

#

winged seal there's like 4 images lol

lol you don't know about cross post? you gotta scroll down to the gallery bro

#

oh sorry you're right

#

look at the Q8 version

#

that's where all the party happens

winged seal Nov 19, 2024, 10:19 PM

#

Ok, I'm there

#

there's like 10 pics

winged seal Nov 19, 2024, 10:20 PM

#

pseudo owl nah idk, try to format this example prompt for pixelwave, and I'll try it with f...

wait a sec, I already ran this

halcyon yarrow Nov 19, 2024, 10:20 PM

#

winged seal Ok, I'm there

here's the direct link: https://civitai.com/models/843551?modelVersionId=943891 again, scroll to the bottom where it says Gallery and you can see all the images generated with it

winged seal Nov 19, 2024, 10:20 PM

#

this one. It missed the photograph style because it's not trained in as "photograph"

pseudo owl Nov 19, 2024, 10:20 PM

#

winged seal there's like 10 pics

the q8_0 gguf one? I mean all the examples aren't probably the best and all nsfw but its mostly just flux dev with a bit more detail from my testing.

halcyon yarrow Nov 19, 2024, 10:21 PM

#

but again w/o context it's hard to judge an image and whether it's any good at adherence

winged seal Nov 19, 2024, 10:21 PM

#

halcyon yarrow here's the direct link: https://civitai.com/models/843551?modelVersionId=943891 ...

I am seeing all of those, they all look really mediocre to me

halcyon yarrow Nov 19, 2024, 10:21 PM

#

winged seal this one. It missed photograph style cause its not trained in as "photograph"

what model was this? curious

winged seal Nov 19, 2024, 10:21 PM

#

like, I'm just not seeing anything lol

pseudo owl Nov 19, 2024, 10:21 PM

#

halcyon yarrow what model was this? curious

pixelwave I believe

winged seal Nov 19, 2024, 10:21 PM

#

pixelwave yeah

halcyon yarrow Nov 19, 2024, 10:21 PM

#

not bad for pixelwave

winged seal Nov 19, 2024, 10:22 PM

#

I mean, the prompt adherence is almost perfect, so I am happy haha

halcyon yarrow Nov 19, 2024, 10:22 PM

#

you see how it missed a lot of the crucial elements tho? try it on de-distill and you'll see it nail it like 100%, not a single thing missed. i swear by de-distill bc it has adherence above and beyond anything else out there

winged seal Nov 19, 2024, 10:22 PM

#

like what?

#

white cat on blue dog on brown couch. 4 cow pictures in the window, outside is space, with a UFO. All it missed was the photographic style (cause its not tagged as photograph), and the 4 pictures being in the 4 corners

#

SD3.5 hasn't been able to get this image even partially right for me 😅

pseudo owl Nov 19, 2024, 10:24 PM

#

I mean I got this with Flux.1 alpha 8steps lora which got everything right, even the picture in corner part.

halcyon yarrow Nov 19, 2024, 10:24 PM

#

this is a repost from another day

mortal mesa Nov 19, 2024, 10:24 PM

#

im still messing around with Shuttle, quite underrated and

halcyon yarrow Nov 19, 2024, 10:24 PM

#

pseudo owl I mean I got this with Flux.1 alpha 8steps lora which got everything right, even...

yeah id say that's 100% nailed it

pseudo owl Nov 19, 2024, 10:24 PM

#

but let me try with dev, will take forever but lets see.

winged seal Nov 19, 2024, 10:25 PM

#

pseudo owl I mean I got this with Flux.1 alpha 8steps lora which got everything right, even...

that's not correct. That's not a photograph, that's not space (it's a night sky), and the pictures are outside of the corners of the window, not inside of them like requested

halcyon yarrow Nov 19, 2024, 10:25 PM

#

and this is with de-distill models added to the list plus a rogue SDXL model at the bottom

winged seal Nov 19, 2024, 10:26 PM

#

mortal mesa im still messing around with Shuttle, quite underrated and

oh right, shuttle

pseudo owl Nov 19, 2024, 10:27 PM

#

winged seal thats not correct. Thats not a photograph, thats not space (its a night sky), an...

yeah, accidentally tested the old version of the prompt, let me try with the one I gave, which has the dark outer space part.

winged seal Nov 19, 2024, 10:27 PM

#

when adding the proper photographic style tag, and changing the prompt to have the 4 pictures BY the corners, not IN them

#

anyways, I have to go for now 😅

halcyon yarrow Nov 19, 2024, 10:28 PM

#

just so we're all on the same page the prompt we're using is this one:

A photograph of a white cat on top of a blue dog sitting on a brown couch in a living room. Behind them is a window and 4 cow pictures, one in each corner. Outside the window is a ufo hovering and outer space

No adjusting the prompt or the wording or enhancing it right?

winged seal Nov 19, 2024, 10:29 PM

#

@halcyon yarrow I'll keep an eye on de-distilled, but the pictures on civit aren't impressive, so hopefully there will be better images to interest me when I look back

halcyon yarrow Nov 19, 2024, 10:29 PM

#

one of the main problems @craggy crest had with that prompt is that it's very loose and incomplete and open to interpretation, just wanna make sure we're agreeing that's the prompt before i try it with flux heavy 17b

winged seal Nov 19, 2024, 10:29 PM

#

wait, its a month old? nevermind

winged seal Nov 19, 2024, 10:29 PM

#

halcyon yarrow just so we're all on the same page the prompt we're using is this one: > A photo...

Mine is modified, but almost that, yes

halcyon yarrow Nov 19, 2024, 10:29 PM

#

winged seal @halcyon yarrow I'll keep an eye on de-distilled, but the pictures on civi...

i don't think any amount of images will change your mind, it's just one of those things you have to really give a shot and try yourself, and do a side by side comparison on your own to really test its power

#

well any modifications are 'unfair' in the sense that, again, it's a bad prompt full of holes, so by changing the text you're giving the model a leg up on exactly what it should do and how

winged seal Nov 19, 2024, 10:30 PM

#

halcyon yarrow i don't think any amount of images will change your mind, its just one of those ...

I guess. I will do it some other time then. Too busy trying to secure proper funding for our corporate version

halcyon yarrow Nov 19, 2024, 10:30 PM

#

do a side by side with your corporate version. overall you can use cfg 3 to 7 and set the steps to a minimum of 60; ddim with beta is what I like to use on ksampler

mortal mesa Nov 19, 2024, 10:31 PM

#

i got a good prompt somewhere from here i gotta find, it was like orange blueberries and blue orange on a blue plate with orange wall on an blue napkin, thats not it but it was like that

winged seal Nov 19, 2024, 10:31 PM

#

A digital color photograph of a white cat sitting on top of a blue dog. The blue dog is sitting on the brown couch. Behind the couch is a square window with a square cow picture next to each corner of the window. Outside the window is a ufo hovering in dark outer space.

My version. I had to add the style tags for photographic style, since "photograph" is too broad for a model with multiple different styles of photography trained in

I also specified the cow pics should be NEXT to the corners of the window, not IN them

pseudo owl Nov 19, 2024, 10:31 PM

#

halcyon yarrow just so we're all on the same page the prompt we're using is this one: > A photo...

previously I used the old prompt, which was similar but different details
this is with same exact prompt, no enhancing with flux dev. It seems to get the prompt correct but detail is lacking, could be fixed with a better sampler.

winged seal Nov 19, 2024, 10:32 PM

#

the composition is good, but the style/look is horrifically bad lmao

#

but that's kinda dev in a nutshell

#

anyways, gotta go

halcyon yarrow Nov 19, 2024, 10:32 PM

#

i wouldn't say those are square cow pictures next to the windows

#

later @winged seal nice talk

craggy crest Nov 19, 2024, 10:33 PM

#

winged seal the composition is good, but the style/look is horrifically bad lmao

you're alive!

winged seal Nov 19, 2024, 10:34 PM

#

craggy crest you're alive!

Yeah 😅

Haven't really been here since our project stopped using SD3.5

#

anyways, I really do need to get going, I'll talk later fellas

craggy crest Nov 19, 2024, 10:34 PM

#

winged seal Yeah 😅 Haven't really been here since our project stopped using SD3.5

was starting to get kinda worried about you. good to see you :)

mortal mesa Nov 19, 2024, 10:35 PM

#

Challenge prompt: A blue orange on a blue plate against an orange background with orange blueberries on a blue napkin

#

i got that from here long ago, its a great test

bitter hearth Nov 19, 2024, 10:40 PM

#

Flux Dev same prompt/seed:

#

Big Flux Thing, same prompt/seed:

pseudo owl Nov 19, 2024, 10:41 PM

#

bitter hearth Flux Dev same prompt/seed:

prompt? looks nice

bitter hearth Nov 19, 2024, 10:41 PM

#

can't send prompt, discord said its too long

#

it responds well to loras as well, you just need to put the strength high

errant dust Nov 19, 2024, 11:09 PM

#

What is Big Flux Thing?

pseudo owl Nov 19, 2024, 11:11 PM

#

errant dust What is Big Flux Thing?

self-merge of flux dev, 17b params. does seem pretty interesting but has some cons and pros

errant dust Nov 19, 2024, 11:12 PM

#

What are the most obvious cons?

#

Speed I am guessing is one

pseudo owl Nov 19, 2024, 11:14 PM

#

yeah that's one, I didn't try it yet so can't say much about quality. From examples tho, seems more creative and detailed than flux but worse at other things?

errant dust Nov 19, 2024, 11:15 PM

#

well, I figured you meant worse at some things. My question was what

#

if anything stands out

pseudo owl Nov 19, 2024, 11:16 PM

#

Text at least, an example(not mine, but author was showing)

bitter hearth Nov 19, 2024, 11:16 PM

#

speed seemed ok

#

the downside is its a bit overcooked, like CFG burn from high CFG

#

but that might be possible to deal with

errant dust Nov 19, 2024, 11:20 PM

#

Is it consistent? because Flux flubs text too. It is not perfect all the time

#

but ok, just curious. Have been busy last week or two so catching up to see if anything cool has developed for either Flux or SD3.5L

bitter hearth Nov 19, 2024, 11:22 PM

#

not sure about text, didn't test that

#

will do some more tests later

#

I had to shut down the server cos someone released a GGUF

#

so I was wasting money with 45GB server lol

halcyon yarrow Nov 19, 2024, 11:23 PM

#

bitter hearth Big Flux Thing, same prompt/seed:

i think the big flux thing is better in that one right?

bitter hearth Nov 19, 2024, 11:23 PM

#

probably better yeah

#

its a bit overcooked but not too bad

#

detail seems higher

pseudo owl Nov 19, 2024, 11:34 PM

#

errant dust but ok, just curious. Have been busy last week or two so catching up to see if a...

I mean pixelwave v0.3 finetune of flux is kinda impressive, knows many art styles with flux capability in prompt following at least.

bitter hearth Nov 19, 2024, 11:35 PM

#

I liked this flux version best https://civitai.com/models/941929/flux1-dedistilledmixtuned-v1?modelVersionId=1054490

#

description: Based on Flux-Fusion-V2, Merge of flux-dev-de-distill, finetuned by ComfyUI, Block_Patcher_ComfyUI, ComfyUI_essentials and other tools. Recommended 6-10 steps. Greatly improved quality compared to other Flux.1 model.

pseudo owl Nov 19, 2024, 11:37 PM

#

6-10? I need to try it then, I like speed.

bitter hearth Nov 19, 2024, 11:40 PM

#

yeah I haven't gone beyond 8 steps in ages

halcyon yarrow Nov 19, 2024, 11:53 PM

#

bitter hearth I liked this flux version best ```https://civitai.com/models/941929/flux1-dedist...

i downloaded it and tried it and wasn't super impressed by it

bitter hearth Nov 19, 2024, 11:54 PM

#

not sure

#

we could try stacking more on top lol

halcyon yarrow Nov 20, 2024, 12:28 AM

#

A photograph of a white cat on top of a blue dog sitting on a brown couch in a living room. Behind them is a window and 4 cow pictures, one in each corner. Outside the window is a ufo hovering and outer space

flux-dev-de-dis...Q8_0 | 🌱 2503417111 | 🦶 62 | 🦮 3.0 | cfg_scale_alt 3.5 | 🧠 flux_aeSft.sft | 🎤 res_2m | 🕦 beta | 🗓 11/19, 7:27 PM | ⏱️ 507s

#

#

technically the cat is on top of a blue dog, it's worded loosely so it doesn't mean the cat has to physically be on top. it missed the outer space part and the 4 corners

bitter hearth Nov 20, 2024, 12:35 AM

#

can only discourage this test prompt as much as possible TBH
it feels weird how the most ambiguous test prompts end up being popular

pseudo owl Nov 20, 2024, 12:37 AM

#

bitter hearth can only discourage this test prompt as much as possible TBH it feels weird how ...

It’s usually a good prompt to test though for prompt following, sd1.5/sdxl models perform the worst, pixart sigma, sd3.5 medium are middle, and auraflow, flux, sd3.5 are the best at it.

halcyon yarrow Nov 20, 2024, 12:40 AM

#

2nd shot to see if it did any better

pseudo owl Nov 20, 2024, 12:40 AM

#

I mean I tested with 25 steps and 8steps. 62 steps is kinda unfair but yeah still flux de-distilled nailed it.

halcyon yarrow Nov 20, 2024, 12:41 AM

#

I wouldn't say nailed it, i think the outer space view from the window is a pretty crucial element of the prompt

#

I'm willing to forgive the paintings not being in the corner but yeah like Neon said that prompt is pretty ambiguous so its not really 'fair'

#

if you let me rewrite it and really establish all the elements, with an enhanced prompt flux de-distill would 100% get it on the first shot

#

here's my rewrite:

A realistic photograph capturing a white cat physically sitting on top of a blue dog on a brown couch in a cozy living room. The couch sits against a wall featuring a large window. The window is bordered with four distinct cow pictures, each precisely placed in one corner of the window frame, creating a symmetrical arrangement. Through the window, the scene reveals the vastness of outer space, with a dark star-filled sky, distant celestial bodies, and a UFO hovering midair. The juxtaposition of the living room's warm ambiance and the surreal outer space view creates a striking visual contrast.

dusky thistle Nov 20, 2024, 12:47 AM

#

halcyon yarrow Nov 20, 2024, 12:48 AM

#

that's so cool it looks like an art scene set up in an existing library

#

Come Your Visit The Pleasentville Local Library Before Thursday
Art expo featuring works by Sharky McSharkton and his famous shark themed art pieces

#

first shot with enhanced prompt, it got all the elements except the pictures in the 4 corners

dusky thistle Nov 20, 2024, 1:00 AM

#

#

halcyon yarrow Nov 20, 2024, 1:18 AM

#

@dusky thistle so i looked into that idea of monitoring your posts and sharing them on civit. I would need to use this library called discord-js-selfbot-v13, which is basically a bot impersonating a real user and using the tokens from a real session to access the data in this room. it's very taboo and it could get me banned for using it, so I gave up on that idea lol

#

took 4 shots but I'd say this one nailed it 100%

#

wouldn't the frames prevent the window from sliding open? don't think about that
shouldn't the cat be physically on top of the dog? not exactly
what prompt was used?

A realistic photograph capturing a white cat physically sitting on top of a blue dog on a brown couch in a cozy living room. The couch sits against a wall featuring a large window. The window frame is adorned with a cow picture at each of its four corners, ensuring all frames are immediately adjacent to the vertices of the rectangular window. Through the window, the scene reveals the vastness of outer space, with a dark star-filled sky, distant celestial bodies, and a UFO hovering midair. The juxtaposition of the living room's warm ambiance and the surreal outer space view creates a striking visual contrast.

limpid thunderBOT Nov 20, 2024, 2:04 AM

#

Thank you for using comcom analytics.
"comcom analytics" supports all community managers (moderators and server owners) by stats, visualization, and analytics.

If you have any questions, feel free to ask us!
Your dashboard
Help
Support server

Other languages
en: help
ja: help Japanese

dusty patio Nov 20, 2024, 2:05 AM

#

help

dusky thistle Nov 20, 2024, 3:32 AM

#

dusty patio help

Here is the image you requested.

craggy crest Nov 20, 2024, 5:03 AM

#

wrong kind of help

halcyon yarrow Nov 20, 2024, 5:12 AM

#

some guy posted some new files for clip G, https://civitai.com/models/929400?modelVersionId=1064550 haven't tested it on SD3 yet but this is SDXL.

left is the FP32 version
right is the standard clip_g everyone uses
fixed seed

craggy crest Nov 20, 2024, 5:14 AM

#

halcyon yarrow some guy posted some new files for clip G, https://civitai.com/models/929400?mod...

The refiner model uses the single clip loader, not the dual like SDXL base. that is going to be a problem

halcyon yarrow Nov 20, 2024, 5:14 AM

#

wdym?

craggy crest Nov 20, 2024, 5:15 AM

#

for SD3.5 - clip_g is your workhorse and clip_l and t5xxl share tokens and work along side it. it looks like he's tried to combine both of the encoders that sdxl uses, which are 2 of the three 3.5 uses

#

interested to see what your tests show

halcyon yarrow Nov 20, 2024, 5:19 AM

#

i mean so far it's shown a difference, i don't know if i like the fp32 version better but maybe it's more apparent with sd35. i'm gonna try it with turbo, so this is my default setup, which relates to the file in the screenshot

#

that's a 1.3 gb file, so then i'll try the 2.7gb file called fp32SDXLFLUXRefinerCLIPG_clipGLargePrunedFP32.safetensors, fixed seed, we'll see in a bit

craggy crest Nov 20, 2024, 5:26 AM

#

dusky thistle Nov 20, 2024, 5:27 AM

#

#

craggy crest Nov 20, 2024, 5:31 AM

#

mortal mesa Nov 20, 2024, 5:31 AM

#

@halcyon yarrow one is sdxl refiner 1.0 with fp32 clip g, the other is clip g large pruned fp32

#

definitely a confusing listing

halcyon yarrow Nov 20, 2024, 5:32 AM

#

so why not use the refiner's clip g?

#

or the pruned one?

#

confusing indeed

mortal mesa Nov 20, 2024, 5:32 AM

#

he added in regular fp32, the other is "large" version

halcyon yarrow Nov 20, 2024, 5:33 AM

#

im guessing the 1.5gb is the pruned version and the 2.7gb is the full fp32 version I get that, but why not use the full fp32?

#

so it's large bc it's fp32 or is there like a medium size?

#

so the large version is the fp32 2.7gb file right?

#

i can't get his gguf version of the clip model to work

#

the one on the left is using the 2.7gb file and the one on the right is using the 1.4gb file

#

the one on the left is the 1.4gb file, the one on the right is the 2.7gb file

#

again fixed seed, same everything

mortal mesa Nov 20, 2024, 5:48 AM

#

poking around to try to figure out what the heck they were saying i find this, and why stop at large when you can go gigantic cs-giung/clip-vit-gigantic-patch14-laion2b

brave tide Nov 20, 2024, 5:48 AM

#

halcyon yarrow Nov 20, 2024, 5:48 AM

#

the way i see it, the size difference between 1.4 and 2.7 is so small that it's not really a lot of extra memory overhead, especially when the G model is so important for sd35
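A quick back-of-the-envelope check on those two sizes, under the assumption (not confirmed anywhere in this thread) that the smaller file holds the same weights at half precision rather than a structurally pruned model. The helper names are made up for this sketch, and file headers and metadata are ignored.

```python
# Bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def approx_params(file_size_gb: float, precision: str) -> float:
    """Estimate the parameter count from a file size in GB (1 GB = 1e9 bytes)."""
    return file_size_gb * 1e9 / BYTES_PER_PARAM[precision]


def approx_size_gb(params: float, precision: str) -> float:
    """Estimate the file size in GB for a given parameter count and precision."""
    return params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    params = approx_params(2.7, "fp32")
    print(f"~{params / 1e9:.2f}B parameters")
    print(f"~{approx_size_gb(params, 'fp16'):.2f} GB at fp16")
```

If that assumption holds, the 2.7 GB file works out to roughly 0.68B parameters, and the same weights at fp16 would land near 1.35 GB, which is in the range of the smaller file. The extra cost of the fp32 file would then be about 1.3 GB of memory.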

#

it's hard to tell which one is better, it's so subjective. i don't wanna be biased towards the larger file but they really do look very similar even if not exactly the same. what do you think @mortal mesa, would you say either of those 2 side-by-sides is objectively better?

mortal mesa Nov 20, 2024, 5:54 AM

#

seems slightly finer but ya i might be imagining that

halcyon yarrow Nov 20, 2024, 5:56 AM

#

I'm just going to set it as my default for my workflows moving forward, it's one of those things where if i notice a steep drop off in quality or speed i could always revert, so moving forward my configuration will be

{
    "clip_name1": "Long-ViT-L-14-BEST-GmP-smooth-ft.safetensors",
    "clip_name2": "fp32SDXLFLUXRefinerCLIPG_clipGLargePrunedFP32.safetensors",
    "clip_name3": "flan-t5-xxl-Q8_0.gguf"
}

brave tide Nov 20, 2024, 6:21 AM

#

Prompt:An insanely sleek and futuristic hypercar races through the rain-slicked streets of New York City at night. The car's aerodynamic body gleams under the glow of neon signs and streetlights, with water droplets streaming off its surface as it cuts through the rain. Its LED headlights pierce through the misty air, reflecting off the wet pavement and creating vivid light trails. The urban backdrop is alive with towering skyscrapers, glowing billboards, and bustling traffic blurred by the car's incredible speed. The atmosphere is intense and cinematic, capturing the raw power and elegance of the hypercar against the vibrant energy of the rain-soaked city.

limpid thunderBOT Nov 20, 2024, 6:32 AM

#

Last 7 days <Nov 13 2024> → <Nov 19 2024>

Member counts
345992 ↗ 346021 ↗ 346029 ↗ 346047 ↘ 346035 ↗ 346070 ↗ 346093
Action members
0 → 0 → 0 → 0 → 0 → 0 ↗ 77
Message members
0 → 0 → 0 → 0 → 0 → 0 ↗ 57
Reaction members
0 → 0 → 0 → 0 → 0 → 0 ↗ 34
More details

Summary | comcom Analytics

comcom analytics is a completely free dashboard for analyzing and monitoring communities run on Discord or Slack. It is currently available as a public beta.

bitter hearth Nov 20, 2024, 9:17 AM

#

halcyon yarrow the one on the left is the 1.4gb file, the one on the right is the 2.7gb file

seems better to me

#

not quite sure at the moment whether stock or upgraded encoders are the best idea

#

you can replace:

#

Clip-L with Longclip-L or Improved Clip-L
Clip-G with this one
T5-xl with T5-xxl or Flan-T5-xxl

#

and use higher precisions, but I am not sure what is worth it

little fossil Nov 20, 2024, 10:14 AM

#

#artisan-1 running shine

Whole_body_Japanese_NARUTO_Hatake_KakashiLightning_in_the_handSharinganrunningmasterpiece_best_qualityfr_style-Fantasy_Art_width-768_height-1344_aspect-9-16_seed-0ts-1697194632_idx-0_1.png

muted dove Nov 20, 2024, 1:43 PM

#

#

#

#

muted dove Nov 20, 2024, 2:11 PM

#

#

#

#

rapid pivot Nov 20, 2024, 2:43 PM

#

how to light a sword on fire

#

thomas

halcyon yarrow Nov 20, 2024, 2:51 PM

#

@muted dove are those made using the incontext lora for flux?

muted dove Nov 20, 2024, 2:52 PM

#

halcyon yarrow @muted dove are those made using the incontext lora for flux?

No loras

halcyon yarrow Nov 20, 2024, 2:52 PM

#

so just pure flux? if so can you show us what one of those prompts looks like?

muted dove Nov 20, 2024, 2:59 PM

#

halcyon yarrow so just pure flux? if so can you show us what one of those prompts looks like?

Sure...

A hyperrealistic technical tutorial illustration depicting the step-by-step process of building a Roller-coaster. Each step flows logically, with consistent lighting and style throughout the image.

halcyon yarrow Nov 20, 2024, 3:00 PM

#

that's it? did those incontext guys just trick us and it's not needed at all? that's a super simple prompt too. is that base flux dev or a specific finetune?

muted dove Nov 20, 2024, 3:02 PM

#

I used AtomixFlux, but dev should do it too. I do feed that through an LLM as part of the workflow, but try it 😉

pseudo owl Nov 20, 2024, 3:05 PM

#

halcyon yarrow that's it? did those incontext guys just trick us and it's not needed at all? that...

In context Lora improved quality by a very large amount but flux can do it without it too. See the images below, these are just plain prompts nothing else at 1024x1024 res.

halcyon yarrow Nov 20, 2024, 3:05 PM

#

is there a node in comfyui that you know about that can take those images and gif-y them?

halcyon yarrow Nov 20, 2024, 3:06 PM

#

pseudo owl In context Lora improved quality by a very large amount but flux can do it witho...

is that a lora or is that just prompting too?

pseudo owl Nov 20, 2024, 3:06 PM

#

halcyon yarrow is that a lora or is that just prompting too?

Just prompting with base flux dev.

#

And you can ask ChatGPT for code to make them into gifs
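For what it's worth, a minimal sketch of what such a script could look like, assuming the generated image is an evenly divided 2x2 grid read left to right, top to bottom. The function names and the default frame time are made up here, the GIF step needs Pillow installed, and a grid with uneven panels would need smarter splitting.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, upper, right, lower) in pixels


def grid_boxes(width: int, height: int, rows: int = 2, cols: int = 2) -> List[Box]:
    """Return crop boxes for an evenly divided grid, in row-major order."""
    cell_w, cell_h = width // cols, height // rows
    return [
        (c * cell_w, r * cell_h, (c + 1) * cell_w, (r + 1) * cell_h)
        for r in range(rows)
        for c in range(cols)
    ]


def grid_to_gif(grid_path: str, gif_path: str, rows: int = 2, cols: int = 2,
                frame_ms: int = 200) -> None:
    """Cut a grid image into frames and save them as a looping GIF."""
    from PIL import Image  # imported here so grid_boxes works without Pillow

    with Image.open(grid_path) as sheet:
        frames = [sheet.crop(box) for box in grid_boxes(*sheet.size, rows, cols)]
    frames[0].save(gif_path, save_all=True, append_images=frames[1:],
                   duration=frame_ms, loop=0)
```

Calling `grid_to_gif("grid.png", "out.gif")` on a 1024x1024 image would produce four 512x512 frames; both file names are placeholders.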

halcyon yarrow Nov 20, 2024, 3:09 PM

#

what's it called when they come out like this? isn't that called a spritesheet?

#

https://github.com/stormcenter/ComfyUI-autosplitgridimage something like this would be cool in the workflow

GitHub

GitHub - stormcenter/ComfyUI-AutoSplitGridImage: ComfyUI-AutoSplitG...

ComfyUI-AutoSplitGridImage: A custom node for ComfyUI that intelligently splits images into grids, combining edge detection for columns and uniform division for rows. - stormcenter/ComfyUI-AutoSpli...

prime totemBOT Nov 20, 2024, 3:15 PM

#

halcyon yarrow Nov 20, 2024, 3:17 PM

#

@pseudo owl can i get the prompt for any of the gif ones you made?

pseudo owl Nov 20, 2024, 3:36 PM

#

halcyon yarrow @pseudo owl can i get the prompt for any of the gif ones you made?

I don’t have the full prompt now, but you can start with
“A seamless 4-image grid of consecutive frames from a gif. The gif is of …..,”

halcyon yarrow Nov 20, 2024, 4:50 PM

#

yay!

A seamless 4-image grid of consecutive frames from a gif. The gif is of pink teddy bear dancing

craggy crest Nov 20, 2024, 7:54 PM

#

@bitter hearth look what dropped https://huggingface.co/InstantX/SD3.5-Large-IP-Adapter

InstantX/SD3.5-Large-IP-Adapter · Hugging Face

bitter hearth Nov 20, 2024, 8:18 PM

#

craggy crest @bitter hearth look what dropped https://huggingface.co/InstantX/SD3.5-La...

wow nice

#

a good IP adapter would be great

#

I can put R2D2 pictures into it for style transfer

cinder lichen Nov 20, 2024, 8:21 PM

#

And "Large" means definitely for sure won't work with medium, correct? 😛

craggy crest Nov 20, 2024, 8:21 PM

#

cinder lichen And "Large" means definitely for sure won't work with medium, correct? 😛

nope. it's trained for large, the blocks are different from medium, but it might work. you could try it

bitter hearth Nov 20, 2024, 8:25 PM

#

sometimes stuff weirdly works when it shouldn't
one of the PAG nodes, made for SDXL, works with Flux as the python syntax happened to coincide with some other ComfyUI code regarding blocks

#

and my favourite SDXL lora is one that was trained on SD 1.5 but happens to have an effect on SDXL

#

or replacing T5 with Flan-T5, a Google fine tune not made for diffusion, improved my images

toxic bone Nov 20, 2024, 8:33 PM

#

bitter hearth and my favourite SDXL lora is one that was trained on SD 1.5 but happens to have...

there's likely embeddings in the lora that affect the clip layer. weird that it would, but the code is probably taking the embedding and applying it where it works

#

black magic imo

bitter hearth Nov 20, 2024, 8:35 PM

#

oh yeah that's a really good point it could have affected clip

#

cos I always use lora loaders that include clip

toxic bone Nov 20, 2024, 8:45 PM

#

I don't think it would fly on automatic1111. You've got something special

bitter hearth Nov 20, 2024, 8:55 PM

#

A1111 is essentially just legacy code at this point
its only really for the people who started on it, and don't want to move off due to familiarity

toxic bone Nov 20, 2024, 9:01 PM

#

i wholly disagree but i wont argue against someone's clear biases. I'll just recognize those.

icy drift Nov 20, 2024, 9:01 PM

#

bitter hearth A1111 is essentially just legacy code at this point its only really for the peop...

Forge is A1111 with modern everything. I tried it but gave up because I couldn't separate T5 and Clip-L for Flux, which makes it unusable (compared to Comfy).

bitter hearth Nov 20, 2024, 9:13 PM

#

toxic bone i wholly disagree but i wont argue against someone's clear biases. I'll just re...

why are you accusing me of being biased?

halcyon yarrow Nov 20, 2024, 9:20 PM

#

I don’t like forge or a111 bc ultimately I just want extreme level control of my setup, ComfyUi is the only tool I don’t have to depend on developers to add support for something for me to keep moving forward

craggy crest Nov 20, 2024, 9:21 PM

#

i wouldn't use forge if you paid me, they're slow to add in support for what I want to use, if they add it in at all. and auto1111 was good when it came out. it's no longer good.

#

but if someone else wants to use them, more power to them

toxic bone Nov 20, 2024, 9:28 PM

#

bitter hearth why are you accusing me of being biased?

we are all biased. don't take it personally. you think a111 is out of date or as you put it, "Legacy code", and that the only reason someone would want to use it is the one you define. That's why i "accused you" of it. I won't argue someone's biases. We are all free to have our own beliefs.

pseudo owl Nov 20, 2024, 9:34 PM

#

toxic bone we are all biased. don't take it personally. you think a111 is out of date or ...

Yeah I mean everyone is biased (including me) but a111 doesn't even support flux natively nor sd3.5 from last I heard, you need to use a separate branch, I would kinda consider that outdated.

pseudo owl Nov 20, 2024, 10:27 PM

#

The cogvideox controlnet and vid2vid with the new reward lora is kinda amazing

halcyon yarrow Nov 20, 2024, 10:33 PM

#

Wow yeah that does look kinda cool

#

I’ve been taking my sweet time to adopt cog I’m still playing with mochi but yeah those videos look good. So the left is the guidance and the right is the render so video plus prompt to video right?

bitter hearth Nov 20, 2024, 10:41 PM

#

toxic bone we are all biased. don't take it personally. you think a111 is out of date or ...

okay so you didn't have a real reason. I'm going to block you and I suggest you do the same

toxic bone Nov 20, 2024, 10:42 PM

#

i'm not trying to insult you. your biases and choices are valid for you, so i offer respect by not arguing against them.

inland patrol Nov 20, 2024, 10:52 PM

#

pseudo owl Nov 20, 2024, 10:53 PM

#

halcyon yarrow I’ve been taking my sweet time to adopt cog I’m still playing with mochi but yea...

Yeah left is using this model: https://huggingface.co/alibaba-pai/CogVideoX-Fun-V1.1-5b-Control It's similar to controlnet union, accepts depth, surface normals, pose, and more.

Right is using it too, but plain vid2vid.

Both have hps lora from here which improves quality: https://huggingface.co/alibaba-pai/CogVideoX-Fun-V1.1-Reward-LoRAs

queen fable Nov 20, 2024, 10:54 PM

#

Design professional logos for my Instagram platform, where we market products, using the name ando and the colors dark blue, gold, light pink, and black.

pseudo owl Nov 20, 2024, 11:19 PM

#

halcyon yarrow I’ve been taking my sweet time to adopt cog I’m still playing with mochi but yea...

1280x768 res for the left one, the weird morphing of the face mostly goes away, and much better overall.

foggy cloak Nov 20, 2024, 11:45 PM

#

pseudo owl 1280x768 res for the left one, the weird morphing of the face mostly goes away, ...

Very impressive, how long do generations usually take?

halcyon yarrow Nov 20, 2024, 11:45 PM

#

inland patrol

wow cool set of prompts, what model did you use, any loras? how did you come up with these prompts? some of them look worthy of being wallpapers

halcyon yarrow Nov 20, 2024, 11:46 PM

#

foggy cloak Very impressive, how long do generations usually take?

that's a good question, i tried one 13 frame video at 1280x768 and it was taking on the order of like 1hr+, i was like screw it

foggy cloak Nov 20, 2024, 11:46 PM

#

halcyon yarrow that's a good question, i tried one 13 frame video at 1280x768 and it was taking...

What gpu?

halcyon yarrow Nov 20, 2024, 11:46 PM

#

pseudo owl Yeah left is using this model: <https://huggingface.co/alibaba-pai/CogVideoX-Fun...

i think CogVideo is the next big thing, between all the gadgets and loras it's set up to be the next gold standard for video tooling

halcyon yarrow Nov 20, 2024, 11:46 PM

#

foggy cloak What gpu?

8gb laptop 4070

foggy cloak Nov 20, 2024, 11:47 PM

#

Hmm, might be vram limitation

halcyon yarrow Nov 20, 2024, 11:47 PM

#

to put it into perspective a 13 frame 480x840 usually takes 110 to 170 seconds, I start getting into the 230-300+ range if i use res_3s or the res_5s one

pseudo owl Nov 20, 2024, 11:47 PM

#

foggy cloak Very impressive, how long do generations usually take?

The high res takes pretty long, 18mins.

halcyon yarrow Nov 20, 2024, 11:48 PM

#

foggy cloak Hmm, might be vram limitation

yeah for sure nothing sucks more than spending 90 minutes waiting for something to render and then getting nothing bc it ran out of memory, that one was for 1696x960

foggy cloak Nov 20, 2024, 11:48 PM

#

Yep agony

#

I’m praying the 5080 has 24gb but it’ll likely only have 16gb again

pseudo owl Nov 20, 2024, 11:49 PM

#

halcyon yarrow that's a good question, i tried one 13 frame video at 1280x768 and it was taking...

You usually don’t need that high res but might be a bit better. The videos are made by that hps Lora which improves quality overall.

pseudo owl Nov 20, 2024, 11:50 PM

#

halcyon yarrow i think CogVideo is the next big thing, between all the gadgets and loras its se...

Yeah it was already the best video gen model before mochi came out. Mochi beats it in text to video but cogvideo has so many extra tools now. Tora is really amazing too.

halcyon yarrow Nov 21, 2024, 12:07 AM

#

I haven't even heard of Tora, going to have to check it out. i also messed with svd yesterday, results were okay

#

I just took an image I made with a figurine model and just cranked the motion settings to max, and then tried a few variations. I'm thinking I want to integrate both SVD and Mochi into my system so I can click a button and turn that into a little clip I can share or create a video from text input

craggy crest Nov 21, 2024, 12:19 AM

#

halcyon yarrow I just took an image I made with a figurine model and just cranked the motion se...

looks like it's just showing the front and then showing the front again

inland patrol Nov 21, 2024, 12:19 AM

#

halcyon yarrow wow cool set of prompts, what model did you use, any loras? how did you come up ...

Thank you much Richard! I used Sd3 Large and Medium for these, no loras required. They have kinda come as the consequence of experimentation trying to capture the right vibe and feel. I wanted to make a sort of band merch-background image, but the desire morphed into making these when I felt I had got some of the right key words down. Honestly, it was incremental and word based improvement. I gained my knowledge from @craggy crest who is wonderfully talented and well versed. She taught me kinda how to get from a point A to a point B. 🙂

pseudo owl Nov 21, 2024, 12:33 AM

#

halcyon yarrow I just took an image I made with a figurine model and just cranked the motion se...

Something perfect for this is OrbitX loras with cogvideox. Really good at rotations around anything, wait let me show some examples.

#

dapper rune Nov 21, 2024, 12:44 AM

#

muted dove Sure... ``` A hyperrealistic technical tutorial illustration depicting the step-...

Would you mind giving some details on how you are using the LLM as part of your workflow? I have been playing around with similar prompts in Flux Dev since you posted those examples, but I can't get anything similar. The LLM is clearly doing some heavy lifting.

halcyon yarrow Nov 21, 2024, 12:59 AM

#

pseudo owl Something perfect for this is OrbitX loras with cogvideox. Really good at rotati...

So you’re saying rather than working with svd for orbit animations I’m better off adopting cogvideo with the orbit loras and I’ll get a much better result right? Bc that’s what I’m getting from it. That gives me motivation to set it up now lol

sudden parcel Nov 21, 2024, 1:03 AM

#

im trying to make a seamless texture, i placed a "seamless tile" node and a "Circular VAE decode (tile)" node... and the textures do not render as seamless

#

im at my wits' end

craggy crest Nov 21, 2024, 1:23 AM

#

seriously? what makes you think anyone's going to fall for this scam?

halcyon yarrow Nov 21, 2024, 1:32 AM

#

@pseudo owl the Lora you refer to is this one right?

https://huggingface.co/wenqsun/DimensionX/tree/main

wenqsun/DimensionX at main

pseudo owl Nov 21, 2024, 2:45 AM

#

halcyon yarrow @pseudo owl the Lora you refer to is this one right? https://huggingf...

Yes these ones

muted dove Nov 21, 2024, 6:13 AM

#

dapper rune Would you mind giving some details on how you are using the LLM as part of your ...

Search my name on Civitai and look at the latest workflow I uploaded. That has an LLM in it, and I'm using it with Ollama.

timber root Nov 21, 2024, 10:17 AM

#

https://tenor.com/bzxqI.gif

Tenor

sullen moss Nov 21, 2024, 4:01 PM

#

https://blackforestlabs.ai/flux-1-tools/

Black Forest Labs

Introducing FLUX.1 Tools

Today, we are excited to release FLUX.1 Tools, a suite of models designed to add control and steerability to our base text-to-image model FLUX.1, enabling the modification and re-creation of real and…

dry wave Nov 21, 2024, 5:14 PM

#

omgomgomg

#

so hyped 😁

bitter hearth Nov 21, 2024, 5:41 PM

#

wow look at that outpainting range
this is gonna be so good

pseudo owl Nov 21, 2024, 5:43 PM

#

bitter hearth wow look at that outpainting range this is gonna be so good

Comfyui already supports it i believe

bitter hearth Nov 21, 2024, 5:43 PM

#

ah nice

#

I wish Comfy prioritised supporting the Int4/FP4 Flux

#

its the fastest thing for GPUs 24GB and under

#

for big GPUs Comfy still has max speed cos they can TensorRT flux

halcyon yarrow Nov 21, 2024, 5:47 PM

#

sullen moss https://blackforestlabs.ai/flux-1-tools/

i thought flux already supported controlnet, ipadapter and inpainting. what is this new tools thing offering?

bitter hearth Nov 21, 2024, 5:47 PM

#

first-party

pseudo owl Nov 21, 2024, 5:47 PM

#

halcyon yarrow i thought flux already supported controlnet, ipadapter and inpainting. what is th...

Mostly it’s just far better and as loras.

bitter hearth Nov 21, 2024, 5:47 PM

#

lora form is big for small GPUs yeah

#

quality is the main improvement though

halcyon yarrow Nov 21, 2024, 5:49 PM

#

so if i wanted to use something like flux redux its about the size of a lora

#

and then since there isn't a Comfy node for it yet I would just use this script: https://github.com/black-forest-labs/flux/blob/main/src/flux/cli_redux.py

GitHub

flux/src/flux/cli_redux.py at main · black-forest-labs/flux

Official inference repo for FLUX.1 models. Contribute to black-forest-labs/flux development by creating an account on GitHub.

pseudo owl Nov 21, 2024, 5:51 PM

#

halcyon yarrow and then since there isn't a Comfy node for it yet I would just use this script:...

https://blog.comfy.org/day-1-support-for-flux-tools-in-comfyui/

Comfy Org Blog

Day 1 Support for Flux Tools in ComfyUI

We’re thrilled to share that ComfyUI now supports 3 series of new models from Black Forest Labs designed for Flux.1: the Redux Adapter, Fill Model, ControlNet Models & LoRAs (Depth and Canny).

These additions provide users with easy and precise control of details and styles in image generation.

halcyon yarrow Nov 21, 2024, 5:51 PM

#

sweet thank you man

#

@pseudo owl have you set it up and tried it yet?

pseudo owl Nov 21, 2024, 5:53 PM

#

halcyon yarrow <@842033136560242708> have you set it up and tried it yet?

Not yet, but I’ll try soon.

halcyon yarrow Nov 21, 2024, 5:55 PM

#

Redux
The Redux model is a model that can be used to prompt flux dev or flux schnell with one or more images.

Download the sigclip_vision_patch14_384.safetensors model and put it in your ComfyUI/models/clip_vision folder and download the flux1-redux-dev.safetensors and put it in your ComfyUI/models/style_models folder.

You can then load or drag the following image in ComfyUI to get the workflow

https://huggingface.co/Comfy-Org/sigclip_vision_384/blob/main/sigclip_vision_patch14_384.safetensors
https://huggingface.co/black-forest-labs/FLUX.1-Redux-dev
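
A quick way to check the two files ended up where the instructions above say. This is only a sketch: the filenames and folders are the ones quoted in the message, `missing_redux_files` is a made-up helper, and the ComfyUI root path is whatever your install uses.

```python
from pathlib import Path

# Filename -> ComfyUI subfolder, taken from the instructions above.
REDUX_FILES = {
    "sigclip_vision_patch14_384.safetensors": "models/clip_vision",
    "flux1-redux-dev.safetensors": "models/style_models",
}


def missing_redux_files(comfy_root):
    """Return the expected file paths that do not exist yet."""
    root = Path(comfy_root)
    missing = []
    for name, folder in REDUX_FILES.items():
        target = root / folder / name
        if not target.is_file():
            missing.append(target)
    return missing


if __name__ == "__main__":
    # "ComfyUI" here is a placeholder for your actual install directory.
    for path in missing_redux_files("ComfyUI"):
        print("missing:", path)
```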

pseudo owl Nov 21, 2024, 5:58 PM

#

Finally bfl open sourced something apart from the original flux models at least, where’s video gen model though.

bitter hearth Nov 21, 2024, 6:00 PM

#

they've open sourced more than half their entire company lol

#

we can't criticise BFL for being too closed

dry wave Nov 21, 2024, 6:01 PM

#

halcyon yarrow i thought flux already supported controlnet, ipadapter and inpainting. what is th...

I haven't tried the BFL Controlnet yet, but I'm very sure it will be high quality. The XLabs-AI control nets were undertrained and almost unusable

bitter hearth Nov 21, 2024, 6:02 PM

#

ye I've been doing the control net stuff in SD 1.5 and then refining with flux so far
now can probably do it all in flux

dry wave Nov 21, 2024, 6:02 PM

#

bitter hearth we can't criticise BFL for being too closed

I agree. Also, in contrast to SAI, BFL never claimed to be an open-source company. Still they deliver more than SAI so far

bitter hearth Nov 21, 2024, 6:02 PM

#

yeah I'm very happy with BFL

#

they also gave Schnell with Apache 2 whereas SD 1.5 and SDXL are OpenRAIL

#

Apache 2 is a lot better

pseudo owl Nov 21, 2024, 6:06 PM

#

Flux outpainting (someone tested in Banodoco discord)

bitter hearth Nov 21, 2024, 6:08 PM

#

someone needs to wake up Clownshark
inpainting/outpainting benefits a lot from better samplers

pseudo owl Nov 21, 2024, 6:10 PM

#

Yep for sure

Flux with redux(right is flux, left is original)

bitter hearth Nov 21, 2024, 6:12 PM

#

I wonder if redux would stack with depth and canny

halcyon yarrow Nov 21, 2024, 6:12 PM

#

halcyon yarrow Redux The Redux model is a model that can be used to prompt flux dev or flux sch...

just got it working

#

i just updated to the latest comfy, don't like how im on the new UI finally

pseudo owl Nov 21, 2024, 6:17 PM

#

bitter hearth I wonder if redux would stack with depth and canny

I believe it works

bitter hearth Nov 21, 2024, 6:17 PM

#

halcyon yarrow i just updated to the latest comfy, don't like how im on the new UI finally

you can downgrade

#

unclear why you would want to though aside from familiarity

#

if you get bugs maybe

halcyon yarrow Nov 21, 2024, 6:35 PM

#

i have to stay on this latest version if i want the flux tools

#

but yeah its just familiarity, i just hate having to relearn where they put all the old stuff

bitter hearth Nov 21, 2024, 6:37 PM

#

I meant downgrade the GUI not the overall Comfy install

#

there is an option in settings

halcyon yarrow Nov 21, 2024, 6:42 PM

#

i looked for the option but was unable to find the legacy UI mode. anyways its fine, ill be a big boy and adapt. i noticed t5xxl v1.1 doesn't work with redux, it creates a black image

#

i lost my sampler preview tho and i don't see the option to enable that in settings 😦

halcyon yarrow Nov 21, 2024, 7:08 PM

#

i don't have base flux dev installed so i used shuttle 3 diffusion, works great lol

#

flux mini however wasn't compatible, in case anyone is curious lol

#

just tested it with "UNET LOADER GGUF" and it does not work with the pixelwave model but it does work with @short thicket 's model

#

it also produced way nicer results than shuttle

#

i am 100% fast-tracking and integrating flux redux into my stuff, its cool, but im having trouble getting it to adhere to the prompt. i just give it a random image and say do the same thing as the examples, "sketch, b&w" for example, and it completely ignores it

bitter hearth Nov 21, 2024, 7:50 PM

#

these days with conditioning you want to set area, timesteps and strength for all conditioning types

#

so there are a lot of variables to tweak

mortal mesa Nov 21, 2024, 8:09 PM

#

what happened to shuttle 4, it disappeared

bitter hearth Nov 21, 2024, 8:19 PM

#

doesn't rly matter since the model wasn't on there yet

pseudo owl Nov 21, 2024, 9:11 PM

#

mortal mesa what happened to shuttle 4, it disappeared

there was a shuttle4?

pseudo owl Nov 21, 2024, 9:25 PM

#

bitter hearth doesn't rly matter since the model wasn't on there yet

Oh it just came out looks like: https://huggingface.co/shuttleai/shuttle-4-diffusion

shuttleai/shuttle-4-diffusion · Hugging Face

bitter hearth Nov 21, 2024, 9:30 PM

#

pseudo owl Oh it just came out looks like: https://huggingface.co/shuttleai/shuttle-4-diffu...

oh thanks, I didn't see when I looked

#

I should have scrolled

pseudo owl Nov 21, 2024, 9:50 PM

#

need to test it but waiting for quants

dry wave Nov 21, 2024, 10:13 PM

#

pseudo owl

how did you make that? I have the feeling, too, that Flux ignores the prompt as soon as you condition it on an image

bitter hearth Nov 21, 2024, 10:14 PM

#

I haven't booted a server to test yet but
timesteps and strength are what I would play with

dry wave Nov 21, 2024, 10:16 PM

#

comfyui has no strength for the style model yet

#

nah, I don't wanna mess with comfy code. I will wait for the next update

bitter hearth Nov 21, 2024, 10:17 PM

#

I wonder if the conditioning multiply node would work on it

#

or otherwise, you could multiply the strength of the conditioning coming out of your text encode node

#

this is a random idea but also maybe ClipAttentionMultiply or Clip Temperature Multiply
those nodes are really good on SD 1.5

dry wave Nov 21, 2024, 10:22 PM

#

hm, I think the way it works is that it adds additional tokens to your prompt

#

like expanding the prompt by a new prompt it generated from the image

bitter hearth Nov 21, 2024, 10:23 PM

#

ah okay that makes sense
if it works via prompt then it might be better on flux-dev-de-distill
cos that seems to follow your prompt better

dry wave Nov 21, 2024, 10:24 PM

#

nah, they showed examples that it works

#

I rather believe there is a bug in comfy ^^°

bitter hearth Nov 21, 2024, 10:26 PM

#

maybe yeah, could also be related to this

dry wave Nov 21, 2024, 10:27 PM

#

okay, increasing prompt length definitely helps

toxic bone Nov 21, 2024, 10:27 PM

#

flux has a dedicated text network that runs along side image generation. It's all self attention.

bitter hearth Nov 21, 2024, 10:29 PM

#

I am struggling to work out if they are actually better but there is Longclip or Zer0int's fine tunes for Clip L
and then Flan T5 for T5
as alternative text encoders

dry wave Nov 21, 2024, 10:29 PM

#

I think the issue is

#

SIGCLIP is basically transforming the image into tokens

#

and then the style model translates these tokens into T5 prompt space

#

and they are added to the prompt

#

the thing is now: the number of tokens in the image might be quite large

#

and if your prompt is very short, the newly added tokens just outweigh the prompt

#

I got consistently anime images by just repeating over and over in the T5 prompt that I want an anime image
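
The "image tokens outweigh the prompt" idea can be put in numbers. A rough sketch, assuming the image tokens are simply appended to the text tokens and every token counts equally (real attention doesn't weight tokens equally, and padding is ignored here). The 600 image-token figure in the demo is a placeholder, not a measured value, and `text_share` is a made-up name.

```python
def text_share(prompt_tokens, image_tokens):
    """Fraction of the combined conditioning sequence that comes from text."""
    total = prompt_tokens + image_tokens
    if total == 0:
        raise ValueError("need at least one token")
    return prompt_tokens / total


if __name__ == "__main__":
    image_tokens = 600  # placeholder count, an assumption
    for prompt_tokens in (10, 50, 200, 600):
        share = text_share(prompt_tokens, image_tokens)
        print(f"{prompt_tokens:4d} prompt tokens -> {share:.1%} of the sequence")
```

Under those assumptions a short prompt is a tiny slice of the sequence, which would fit with repetition helping.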

bitter hearth Nov 21, 2024, 10:30 PM

#

oh this makes a lot of sense yeah

#

could maybe downscale reference image

dry wave Nov 21, 2024, 10:31 PM

#

I think they are always fixed in CLIP models

#

actually, they are already super small

#

like 350x350 pixel or something like that

#

I mean, its not that bad:

An anime character in the style of anime and manga artists like studio Ghibli with vibrant colors, clear anime line arts, its a perfect anime image. An anime image of a young man.
This transforms any photo of a man into an anime image

bitter hearth Nov 21, 2024, 10:34 PM

#

ah yeah overstating things can help a lot

#

I've started just dumping 1000 tokens from GPT 4o in prompt boxes and that works well

halcyon yarrow Nov 21, 2024, 10:55 PM

#

You guys are talking about Redux?

pseudo owl Nov 21, 2024, 10:55 PM

#

I kinda didn't get the hype for shuttle diffusion 3 but from some high-res testing, it's much better than schnell and even dev sometimes. A quick gen I made with just 4 steps and Euler.

halcyon yarrow Nov 21, 2024, 10:55 PM

#

Have you guys tried it yet? I couldn't get it to work per se

#

Side note t5xxl v1.1 produces a black image but v1 works fine . Haven’t tried flan. You guys can confirm it works?

#

I’m thinking flux redux only works as intended when using base flux dev only, my experience is that I’ll type in a prompt let’s say sketch black and white and I tried 6 or 7 flux models I have via the unet gguf loader and I would get different stylized versions of the original image but they would all be in color, it would just ignore my prompts completely basically, I even cranked up cfg to 8 to make sure it wasn’t that

bitter hearth Nov 21, 2024, 11:00 PM

#

haven't tried the new stuff today yet

halcyon yarrow Nov 21, 2024, 11:01 PM

#

dry wave and if your prompt is very short, the newly added tokens just outweigh the prom...

I think this might be happening and explain what I’m experiencing using a super short prompt but I’m just following the example WF given and in their website they use super short prompts

bitter hearth Nov 21, 2024, 11:01 PM

#

pseudo owl I kinda didn't get the hype for shuttle diffusion3 but from some high-res testin...

shuttle diffusion seems great yeah

dry wave Nov 21, 2024, 11:01 PM

#

yeah, I'm trying to look through the comfyui code but its as messy as usual X_x

halcyon yarrow Nov 21, 2024, 11:02 PM

#

It’s easy to test just overly elaborate on how it should be a black and white pic maybe 1000 tokens worth and see if it affects the image

#

So let’s say vision model gets 1000 tokens out of an image then 1000 from the prompt should balance it

#

I guess I could load in a black and white pic as my source style image and replace empty latent image with the target image and the set a high denoise?

bitter hearth Nov 21, 2024, 11:06 PM

#

cos T5 has relative positional embeddings you could try dumping a huge prompt in (use LLM to write)
it was trained on 512 tokens or so but people have got it to recall things that were over 3,000 tokens in

#

depends how the node and back end are coded though they might split it automatically

dry wave Nov 21, 2024, 11:06 PM

#

its weird, yes, cause I think the additional tokens are appended on top of the 512 tokens

#

thats why Redux is so slow

#

and I think its 576 additional tokens
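
Rough arithmetic for the slowdown, as a sketch only. Self-attention cost grows with the square of the sequence length, so extra conditioning tokens cost more than their share. Assumptions, none confirmed in this chat: one latent token per 16x16 pixels for Flux (8x VAE downscale plus 2x2 patching), the 512 and 576 counts guessed above, and only the attention part of the model is counted. Function names are made up.

```python
def flux_image_tokens(width, height, pixels_per_token=16):
    """Latent tokens for an image, assuming one token per 16x16 pixels."""
    return (width // pixels_per_token) * (height // pixels_per_token)


def relative_attention_cost(image_tokens, text_tokens, extra_tokens):
    """Attention cost with the extra tokens, relative to without them.

    Uses the quadratic scaling of self-attention with sequence length.
    """
    base = image_tokens + text_tokens
    return ((base + extra_tokens) / base) ** 2


if __name__ == "__main__":
    latent = flux_image_tokens(1024, 1024)
    ratio = relative_attention_cost(latent, text_tokens=512, extra_tokens=576)
    print(f"{latent} latent tokens, attention cost x{ratio:.2f}")
```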

#

but I don't think you have to write such a long prompt. I found it sufficient to just repeat what you want a few times

#

"black & white image, monochrome, black and white, an monochrome image in black and white"

bitter hearth Nov 21, 2024, 11:08 PM

#

yeah I haven't tested optimal prompt length yet

dry wave Nov 21, 2024, 11:08 PM

#

thats probably already enough?

bitter hearth Nov 21, 2024, 11:09 PM

#

maybe yeah, for photographic prompts I tended to only repeat 2-3 times

craggy crest Nov 21, 2024, 11:12 PM

#

bitter hearth oh thanks, I didn't see when I looked

bitter hearth Nov 21, 2024, 11:14 PM

#

I think I see the chicken's point tBH

halcyon yarrow Nov 21, 2024, 11:45 PM

#

@dry wave original image on the left, using shuttle diffusion on the right, my prompt is:

black and white sketch, black and white sketch,black and white sketch, black and white sketch,black and white sketch, black and white sketch,black and white sketch, black and white sketch,black and white sketch, black and white sketch,black and white sketch, black and white sketch,black and white sketch, black and white sketch,black and white sketch, black and white sketch,black and white sketch, black and white sketch,black and white sketch, black and white sketch,black and white sketch, black and white sketch,

using the flux_redux_model_example image provided by the website

pseudo owl Nov 21, 2024, 11:54 PM

#

@halcyon yarrow @dry wave

For redux, Text prompt isn’t supposed to matter by default. It’s supposed to be image variation.

You can hack it though by averaging the text prompt and redux prompt, then the text prompt matters, you can also have strength control by multiplying the redux prompt.
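
What that hack could look like on plain lists, as a sketch. Assumptions: both conditionings are sequences of equal-width embedding vectors, the shorter one is zero-padded to the longer length, and "averaging" means a weighted mix. This is not any ComfyUI node's actual code, and `blend` / `pad_to` are made-up names.

```python
def pad_to(tokens, length, dim):
    """Copy tokens, then zero-pad (or truncate) to the given length."""
    padded = [list(tok) for tok in tokens[:length]]
    while len(padded) < length:
        padded.append([0.0] * dim)
    return padded


def blend(text_tokens, redux_tokens, text_weight=0.5, redux_strength=1.0):
    """Weighted average of text and Redux conditioning.

    redux_strength multiplies the Redux embeddings before mixing, which
    is the "strength control" part of the idea.
    """
    dim = len(text_tokens[0])
    length = max(len(text_tokens), len(redux_tokens))
    text = pad_to(text_tokens, length, dim)
    redux = pad_to(redux_tokens, length, dim)
    return [
        [
            text_weight * t + (1.0 - text_weight) * redux_strength * r
            for t, r in zip(t_tok, r_tok)
        ]
        for t_tok, r_tok in zip(text, redux)
    ]


if __name__ == "__main__":
    text = [[2.0, 2.0]]
    redux = [[0.0, 4.0], [2.0, 2.0]]
    print(blend(text, redux, text_weight=0.5, redux_strength=0.5))
```

Lower `redux_strength` or higher `text_weight` pushes the result toward the text prompt.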

halcyon yarrow Nov 21, 2024, 11:56 PM

#

i tried it but that didn't work

#

i also just tried my latent img trick aka img2img setup

#

that didn't work

#

so instead of empty latent image i replaced it with load image > vae encode > sampler

pseudo owl Nov 21, 2024, 11:58 PM

#

halcyon yarrow i tried it but that didn't work

How did you try it?

halcyon yarrow Nov 21, 2024, 11:59 PM

#

so as you can see in the top left the style image is this black and white circular image, so the theory is it's reading that image and extracting the style from that, then for the text prompt im overloading it with that repeated text 'black and white sketch' and the image attached is what im getting

#

and then for the denoise i tried 0.2 and 0.8 and i get similar results, but nothing in black and white

#

like their own blog post says it, my new theory is that it only works on base flux dev model bc somehow the lora is aligned only with the token space from the base model

#

https://blog.comfy.org/day-1-support-for-flux-tools-in-comfyui/
https://comfyanonymous.github.io/ComfyUI_examples/flux/

ComfyUI_examples

Flux Examples

Examples of ComfyUI workflows

pseudo owl Nov 22, 2024, 12:08 AM

#

halcyon yarrow like their own blog post says it, my new theory is that it only works on base ...

Hmm, I think something is missing then probably. I’m not sure 🤔

halcyon yarrow Nov 22, 2024, 12:09 AM

#

yeah it feels that way, they do show how to chain 2 images together which is pretty cool too

#

i wonder what'll happen if i chain the 2 images together like that

#

maybe in the example shown in the blog they're actually chaining 2 images? bc from what i see on the blog post its just 1 image and prompt

pseudo owl Nov 22, 2024, 12:15 AM

#

halcyon yarrow maybe in the example shown in the blog they're actually chaining 2 images? bc f...

Hmm, maybe it’s only pro. The examples are pro as well.

In addition to the [dev] adapter, the API endpoint allows users to modify an image given a textual description. The feature is supported in our latest model FLUX1.1 [pro] Ultra, allowing for combining input images and text prompts to create high-quality 4-megapixel outputs with flexible aspect ratios.

halcyon yarrow Nov 22, 2024, 12:17 AM

#

oh i must've missed that

pseudo owl Nov 22, 2024, 12:21 AM

#

It’s really sad if only flux pro supports it, but seems like we can hack our way to use a text prompt.

halcyon yarrow Nov 22, 2024, 2:07 AM

#

yeah that is sad, if you can figure out a way to hack it @pseudo owl do tag me, I'd love to get something like it. for now it still has value tho, I can replace my img2img setup with this WF and get higher quality, more coherent output

#

I'm actively running a script that's processing 540 loras I have for flux-d by running them through llama 3.2b uncensored to assign them 3 of 25 possible categories
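
The fiddly part of a script like that is usually checking the LLM's reply against the allowed list. A minimal sketch of just that step, assuming the model answers with a comma or newline separated list. The category names in the demo are placeholders, `pick_categories` is a made-up helper, and the actual LLM call is left out.

```python
def pick_categories(reply, allowed, k=3):
    """Extract the first k distinct allowed categories from an LLM reply.

    Matching is case-insensitive; anything not in `allowed` is ignored.
    Raises ValueError if fewer than k valid categories are found.
    """
    lookup = {name.lower(): name for name in allowed}
    picked = []
    for part in reply.replace("\n", ",").split(","):
        key = part.strip().strip(".-* ").lower()
        name = lookup.get(key)
        if name is not None and name not in picked:
            picked.append(name)
    if len(picked) < k:
        raise ValueError(f"expected {k} categories, found {len(picked)}")
    return picked[:k]


if __name__ == "__main__":
    allowed = ["anime", "style", "portrait", "landscape"]  # placeholders
    print(pick_categories("Anime, style\n- Portrait.", allowed))
```

Raising on a short list makes it easy to retry a lora instead of silently filing it under the wrong categories.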

halcyon yarrow Nov 22, 2024, 3:23 AM

#

https://www.youtube.com/embed/PrazoJZtN3A

YouTube

Olivio Sarikas

FLUX TOOLS - Run Local - Inpaint, Redux, Depth, Canny

FLUX TOOLs - Run Local - Inpaint, Redux, Depth, Canny. Here is how to run all the new FLUX Tools on your computer.

Links from my Video

Get my Shirt with Code "Olivio" here: https://www.qwertee.com/
https://blackforestlabs.ai/flux-1-tools/?ref=blog.comfy.org
https://huggingface.co/black-forest-labs/FLUX.1-Canny-dev-lora
https://huggin...

▶ Play video

young blade Nov 22, 2024, 4:22 AM

#

canny and depth loras are really bad quality output with the default workflows, anyone got a better one yet?

#

halcyon yarrow Nov 22, 2024, 6:02 AM

#

young blade

i think that's pretty cool actually, I just finished integrating redux into my stuff

#

left is using Flux redux + empty latent image
right is using SDXL + load image

remote holly Nov 22, 2024, 9:37 AM

#

has anyone tried redux?

muted dove Nov 22, 2024, 10:01 AM

#

remote holly has anyone tried redux?

Yes!

#

Before / after

#

Another

muted dove Nov 22, 2024, 10:23 AM

#

Sorted out the aspect ratio and added auto prompt

#

A comparison showing the difference between default (first) image and the same using "Lying Sigma" at -0.5 strength.

dry wave Nov 22, 2024, 11:26 AM

#

halcyon yarrow like their own blog post says it, my new theory is that it only works on base ...

I don't think so

#

the Redux lora is just a projection from CLIP-Vision to T5 latent space

#

it should not matter which Flux Checkpoint you use as every Flux Checkpoint operates on the same T5 latent space

#

I'm pretty sure we can get Redux working better by playing around with the generated tokens. It's insane to generate ~600 additional tokens to describe an image. Maybe we can cluster and merge them to downweight their impact. I might play around with that on the weekend
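
One simple version of "merge them", sketched as average pooling. Assumptions: the image tokens come from a square grid of patches in row-major order (e.g. 24x24 if the 576 guess from earlier is right), and neighbouring tokens can be averaged without breaking anything. Neither is confirmed here, and `merge_tokens` is a made-up name.

```python
def merge_tokens(tokens, grid, factor):
    """Average-pool a grid x grid token layout by `factor` per side.

    tokens: list of embedding vectors in row-major grid order.
    Returns (grid // factor) ** 2 merged vectors.
    """
    if len(tokens) != grid * grid:
        raise ValueError("token count does not match grid size")
    if grid % factor != 0:
        raise ValueError("grid must be divisible by factor")
    dim = len(tokens[0])
    merged = []
    for r0 in range(0, grid, factor):
        for c0 in range(0, grid, factor):
            cell = [
                tokens[r * grid + c]
                for r in range(r0, r0 + factor)
                for c in range(c0, c0 + factor)
            ]
            merged.append(
                [sum(tok[i] for tok in cell) / len(cell) for i in range(dim)]
            )
    return merged


if __name__ == "__main__":
    fake = [[float(i)] for i in range(24 * 24)]  # stand-in embeddings
    print(len(merge_tokens(fake, grid=24, factor=2)))
```

Pooling by 2 per side cuts the token count to a quarter, so the text prompt takes up a larger share of the sequence.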

remote holly Nov 22, 2024, 12:08 PM

#

muted dove Before / after

Amazing

#

What vision clip can i use for redux style ?

muted dove Nov 22, 2024, 12:09 PM

#

The one recommended with the comfy workflow. The workflow is in the images above.

#

https://huggingface.co/Comfy-Org/sigclip_vision_384/blob/main/sigclip_vision_patch14_384.safetensors

sigclip_vision_patch14_384.safetensors · Comfy-Org/sigclip_vision_3...

cinder lichen Nov 22, 2024, 12:16 PM

#

Are there big differences between the different clips. They are still a huge mystery to me. I know only certain models support certain clip models but outside of that…

halcyon yarrow Nov 22, 2024, 1:28 PM

#

dry wave I'm pretty sure we can get Reflux working better by playing around with the gene...

I think Neon found the answer on why it’s not working, the specific redux feature to do prompt + image is limited only to pro model. It says so right on their website

dry wave Nov 22, 2024, 1:29 PM

#

I'm pretty sure you can make it work for flux dev, too

halcyon yarrow Nov 22, 2024, 1:31 PM

#

Yeah I agree with what you said, the model doesn’t matter, it works on base flux dev as well as any fine tune, but it only works for image to image, essentially ignoring the prompt. If you wanna do stuff like on the website where you just give it a short prompt saying black and white and an image and have that work you need to use flux pro. @bitter hearth quoted a snippet of the website where it says so that I missed

dry wave Nov 22, 2024, 1:32 PM

#

yes, that might be. But I'm pretty sure we will find a way to make it work for flux dev, too. As said, the issue seems to be that the added tokens outweigh the original prompt tokens. So it's a matter of weighting, interpolating, maybe subsampling the added tokens

halcyon yarrow Nov 22, 2024, 1:32 PM

#

pseudo owl Hmm, maybe it’s only pro. The examples are pro as well. > In addition to the [d...

Sorry it wasn’t neon it was @pseudo owl

Hmm, maybe it’s only pro. The examples are pro as well.

In addition to the [dev] adapter, the API endpoint allows users to modify an image given a textual description. The feature is supported in our latest model FLUX1.1 [pro] Ultra, allowing for combining input images and text prompts to create high-quality 4-megapixel outputs with flexible aspect ratios.

halcyon yarrow Nov 22, 2024, 1:34 PM

#

dry wave yes, that might be. But I'm pretty sure we will find a way to make it work for f...

If you can figure out a way to hack it like you said that would be cool

#

i integrated flux redux into my system and left it running overnight, seems to have generated about 50 images. it's interesting, it seems like it's being cropped

original left, remix right

#

you would think that the model would sort of reformat the layout and move the text down but it keeps cutting off the Finding

#

original left, remix right, it cut off the shoes. they're both portrait images but the aspect ratio on the remix is not as tall, i wonder what's going on in the latent space that the model can't find a way to fit the whole subject(s) there

strange grotto Nov 22, 2024, 1:53 PM

#

muted dove A comparison showing the difference between default (first) image and the same u...

start end is 0.1,0.9?

muted dove Nov 22, 2024, 1:54 PM

#

strange grotto start end is 0.1,0.9?

WF is embedded, but I think it was, yes.

muted dove Nov 22, 2024, 1:55 PM

#

halcyon yarrow original left, remix right, it cut off the shoes, they're both portrait images b...

That's why they released the outpaint model at same time 😄 😉

halcyon yarrow Nov 22, 2024, 1:57 PM

#

but i don't want a different aspect ratio, i don't want a larger image, i was hoping it would sort of 'reformat' the layout so that everything (including the shoes) would fit within the image

#

it's almost like it has a fixed height for the concept in latent space and then it doesn't fit in the canvas and it just crops it out, very interesting don't you think @muted dove ?

dry wave Nov 22, 2024, 1:58 PM

#

halcyon yarrow original left, remix right, it cut off the shoes, they're both portrait images b...

its CLIP

remote holly Nov 22, 2024, 1:58 PM

#

is flux inpainting great ?

dry wave Nov 22, 2024, 1:58 PM

#

CLIP works on 384x384 pixel images

#

so it can only process square images

#

usually it will crop your images to square and then downscale it to 384x384

remote holly Nov 22, 2024, 1:59 PM

#

ha, that is an issue

dry wave Nov 22, 2024, 1:59 PM

#

that's why it cuts off the borders
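
The crop described above can be sketched in a few lines. This is a minimal illustration, assuming a centered square crop (the chat only says the image is cropped to square and then downscaled to 384x384, not where the square is placed); `center_crop_box` and `lost_fraction` are hypothetical helper names, and the 832x1216 size is just an example.

```python
def center_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


def lost_fraction(width, height):
    """Fraction of the source image that falls outside the square crop."""
    side = min(width, height)
    return 1.0 - (side * side) / (width * height)


# Hypothetical 832x1216 portrait: only the middle 832 rows survive, so the
# strips at the top and bottom (hair, shoes) never reach the vision model.
box = center_crop_box(832, 1216)
print(box, round(lost_fraction(832, 1216), 3))
```

Under that assumption, roughly a third of a tall portrait is discarded before encoding, which would fit the cut-off shoes reported earlier.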

muted dove Nov 22, 2024, 1:59 PM

#

It didn't here #🆕｜sd3 message

halcyon yarrow Nov 22, 2024, 2:00 PM

#

ah i see so internally it's actually cropping before the vision model even gets it?

muted dove Nov 22, 2024, 2:00 PM

#

halcyon yarrow its almost like it has a fixed height for the concept in latent space and then i...

Does it do it every time, or just this one? It is supposed to create a different image every time.

dry wave Nov 22, 2024, 2:02 PM

#

halcyon yarrow ah i see so internally it's actually cropping before the vision model even gets...

yes

halcyon yarrow Nov 22, 2024, 2:03 PM

#

i did try re-running the same WF multiple times and it does create a different image every time, also it goes w/o saying but the quality of your output is determined by the model, reviewing the results it did run on flux mini and flux heavy and while it looked low quality it did manage to somewhat land the concept for both

#

original input image for Redux

#

same seed, batch size of 2, using the STOIQ model

#

same seed as STOIQ, batch size of 2, using Fluximate v1

remote holly Nov 22, 2024, 2:07 PM

#

great !

halcyon yarrow Nov 22, 2024, 2:23 PM

#

hey @dusky thistle i see you removed that part of the instructions where you had to manually install a library now it's just calling requirements.txt. is that correct or did i miss that part in your README somewhere?
Also I put in a good word to a famous Youtuber called Olivio Sarikas, urged him to try your sampler and do side by side testing, maybe he'll even feature your stuff in one of his videos 👍 He did say he was going to try it so we'll see

remote holly Nov 22, 2024, 2:42 PM

#

i get this with redux

#

original img

dry wave Nov 22, 2024, 2:52 PM

#

halcyon yarrow If you can figure out a way to hack it like you said that would be cool

I made a very quick hack and it seems to work

halcyon yarrow Nov 22, 2024, 3:26 PM

#

dry wave I made a very quick hack and it seems to work

show us your results I'm eager to see how well it works, pls try the benchmark of converting something to a black and white image as that seems to be the least subjective prompt

dry wave Nov 22, 2024, 3:26 PM

#

black and white is the hardest xD

halcyon yarrow Nov 22, 2024, 3:28 PM

#

yeah bc its not like "oh i guess I can kind of see that effect in there" least subjective. you mind showing me the before and after with your hack?

dry wave Nov 22, 2024, 3:32 PM

#

no, it's because the clipvision comes with color information and if you mix color with "black&white" you just get "unsaturated" images back

#

this is the original image (taken from pexels)

#

#

using the prompt "black & white photo. Monochrome photo. Black and white photography. gray, monochromic black and white. b&w. old photo in black and white."

#

and Redux I get

#

#

using my hack I get

#

#

same with anime:

#

prompt is "anime, anime style, studio ghibli anime"

#

the normal redux gives me this

#

#

my hack gives me this

#

cunning lintel Nov 22, 2024, 3:39 PM

#

Haven't tried redux, but looking at the examples it's remarkable how well the composition and even details like the flower in one person's hair and the other kind of flower on the right are all kept. Flux is sooo good at guiding/prompting tiny details

exotic sapphire Nov 22, 2024, 3:43 PM

#

What's the difference between the flux and sd3.5 large architecture? Is there a potential for sd3.5 large to hit the same or even higher standard compared to flux after fine-tuning?

dry wave Nov 22, 2024, 3:49 PM

#

I think SD3 made the error of training on CLIP as primary text encoder. Yeah, they also support T5, but the model always relies on CLIP as main information source. I think SD3 will never have a prompt understanding close to Flux

#

However, the main problems of SD3 are anatomy. That's something that might be fixed in future finetunes, who knows

dusky thistle Nov 22, 2024, 3:58 PM

#

halcyon yarrow hey @dusky thistle i see you removed that part of the instructions where...

Yep should be a lot easier now 🙂

halcyon yarrow Nov 22, 2024, 4:00 PM

#

dry wave I think SD3 made the error of training on CLIP as primary text encoder. Yeah, th...

your hack def seems worth trying, so what did you do? is it just a wf change or did you make any code changes to the nodes?

dry wave Nov 22, 2024, 4:00 PM

#

it's a small custom node that merges tokens that are close anyways. It's really a hack for now
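
The attached code is not reproduced in this log, so the following is only a rough sketch of what "merging tokens that are close" could look like, not the actual node. It assumes cosine similarity as the closeness measure and a greedy clustering pass; `merge_similar`, the `threshold` value, and the toy 2-dimensional tokens are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def merge_similar(tokens, threshold=0.9):
    """Greedily fold each token into the first cluster whose mean is close enough."""
    clusters = []  # each entry: [running_sum_vector, member_count]
    for tok in tokens:
        for cluster in clusters:
            mean = [s / cluster[1] for s in cluster[0]]
            if cosine(tok, mean) >= threshold:
                cluster[0] = [s + t for s, t in zip(cluster[0], tok)]
                cluster[1] += 1
                break
        else:
            # No existing cluster was close enough: start a new one.
            clusters.append([list(tok), 1])
    # Each cluster is replaced by the mean of its members.
    return [[s / n for s in total] for total, n in clusters]


tokens = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0], [0.02, 0.98]]
merged = merge_similar(tokens)
print(len(tokens), "->", len(merged))
```

Fewer image tokens means the text prompt makes up a larger share of the conditioning, which is the effect the hack is after.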

exotic sapphire Nov 22, 2024, 4:00 PM

#

that's a great insight. initially, i was pretty amazed by 3.5 large's prompt understanding as well, possibly not comparable to flux but i think it's pretty close. it's just that the images come out much less attractive. not so sure how much fine-tuning would be required to take it a step further.

cause for sdxl fine-tuned weights, there were some improvements, but it wasnt a tremendous jump from the base model. so im really not sure how much we can improve on top of the current 3.5 weights.

dry wave Nov 22, 2024, 4:01 PM

#

that's the code

📎 message.txt

halcyon yarrow Nov 22, 2024, 4:04 PM

#

cool dude i'll try it for sure

dusky thistle Nov 22, 2024, 4:08 PM

#

dry wave that's the code

btw saw your github... so we got at least two chemists in here! 🙂

#

(synthetic organic here, used gaussian a lot)

#

i like how i'm often reading a paper on some sampler algorithm and they'll suddenly jump from image generation to, say, calculating frontier orbital energies

neon igloo Nov 22, 2024, 4:47 PM

#

#1237460438229450772 A realistic photo shows a crime scene of a elderly bodybuilding Japanese lifeguard found a missing lady laid on the bush over the beach.

pseudo owl Nov 22, 2024, 5:17 PM

#

@halcyon yarrow new video generation model, it's crazy fast and uses low vram, pretty good quality and supports i2v and t2v. On fal it takes just 3 sec for a 5 sec video and on a 4090 it takes 15 sec for a 5 sec video (without any of the extra optimizations). Quality is surprisingly great too.

https://huggingface.co/Lightricks/LTX-Video

Lightricks/LTX-Video · Hugging Face

halcyon yarrow Nov 22, 2024, 5:37 PM

#

that's crazy wild 15 seconds to render a 5 second video? that's unheard of

young blade Nov 22, 2024, 5:48 PM

#

halcyon yarrow i think that's pretty cool actually, I just finiished integrating redux into my ...

it's the actual quality of the image vs the controlnet for me, it's not from the style of the input image as it does the same thing no matter what the input image is, just really terrible textures

halcyon yarrow Nov 22, 2024, 5:49 PM

#

young blade it's the actual quality of the image vs the controlnet for me, it's not from the...

i agree, ive generated over 100 images with Redux and it consistently makes the images appear grainy/blurry/out of focus and too soft, i guess i dont want to over generalize it and it could be my config for a lot of them but overall I think i need to adjust the way it works for me

pseudo owl Nov 22, 2024, 5:51 PM

#

halcyon yarrow that's crazy wild 15 seconds to render a 5 second video? that's unheard of

Even faster now for a 4090 lol, and takes less than a min for a 4060.

halcyon yarrow Nov 22, 2024, 5:51 PM

#

pseudo owl Even faster now for a 4090 lol, and takes less than a min for a 4060.

wow well i already downloaded the models and installed the plugin, just waiting on a render to restart comfy

#

i'm gonna try kai bro's custom node first "Apply Style (Advanced)" and then ill try the new video thing

#

i will say this, given a source image + prompt I'll take Redux over trying to give the sampler a low noise latent representation of the original image and having it try to figure out how to redo it, I only wish I could use Redux with all my other models this clip vision tech is great

#

original left, redux right, using shuttle 3, b&w prompt, default style node

#

merge strength of 0.8 on the left
merge strength of 0.4 next one
0.55 the last one
@dry wave and my prompt was:

Rendered entirely in black and white, the image captures the interplay of stark contrasts, with deep shadows and bright highlights accentuating every detail. A sketch-like quality pervades the scene, blending fine lines and subtle cross-hatching into a harmonious texture. The monochrome tones, abbreviated as b&w, evoke a timeless simplicity, stripping the scene of distraction and leaving pure form and light in focus.

dry wave Nov 22, 2024, 6:06 PM

#

I added two additional sliders: downsampling and weighting

halcyon yarrow Nov 22, 2024, 6:06 PM

#

this upgrade is way better now I have actual control and artistic freedom as to how much of the image's style i want to apply to my new image

#

can you explain them pls? how does it affect the image?

#

is strength the same thing as weighting but a slider?

dry wave Nov 22, 2024, 6:07 PM

#

weighting is just multiplying the token latent with a value between 0-1, shrinking it towards zero
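
As a toy illustration of that weighting step (a sketch only; the real node works on tensors, and `weight_tokens` is a hypothetical name):

```python
def weight_tokens(tokens, weight):
    """Scale every token vector by `weight`, shrinking it towards zero."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight is expected to lie in [0, 1]")
    return [[weight * x for x in tok] for tok in tokens]


# Two toy 2-dimensional image tokens (real embeddings are much wider).
image_tokens = [[2.0, -4.0], [0.5, 1.0]]
print(weight_tokens(image_tokens, 0.5))
```

A weight of 1.0 leaves the image tokens untouched and 0.0 zeroes them out; the token count stays the same either way.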

#

downsampling is similar to token merging, but instead of merging similar tokens it merges neighbouring tokens

#

a combination of these things gives me whatever I want, just... I still don't know which works better and which combination works best X_x

halcyon yarrow Nov 22, 2024, 6:09 PM

#

you should publish that code, I just added your class to the nodes.py file for now, i didn't use any of your imports just the class and the helper function

#

give me a copy of that but you should publish it too

dry wave Nov 22, 2024, 6:09 PM

#

I have it on github

#

give me a second

halcyon yarrow Nov 22, 2024, 6:12 PM

#

I'm trying LTX video @pseudo owl , first attempt OOM error with 65 frames, cranked it down to 17 frames @ 512px and still OOM, going to try 9 frames @ 512px

dry wave Nov 22, 2024, 6:13 PM

#

https://github.com/kaibioinfo/ComfyUI_AdvancedRefluxControl/tree/main

GitHub

GitHub - kaibioinfo/ComfyUI_AdvancedRefluxControl

Contribute to kaibioinfo/ComfyUI_AdvancedRefluxControl development by creating an account on GitHub.

#

#

downsampling works really well!

#

it does not make the image blurry in contrast to merging

halcyon yarrow Nov 22, 2024, 6:14 PM

#

dang I can't even get past just loading the model with LTXV Model Loader node, forget the frames, it's just a tiny 9gb model file too, that sucks

dry wave Nov 22, 2024, 6:14 PM

#

yes... I think that's it. Downsampling works by far best of all I tried so far

halcyon yarrow Nov 22, 2024, 6:15 PM

#

dry wave

yeah that's still crispy

dry wave Nov 22, 2024, 6:15 PM

#

"vintage comic"

halcyon yarrow Nov 22, 2024, 6:15 PM

#

this is a good example original image left, redux right. like it's a nice reimagination of the same image but the redux is blurry right?

dry wave Nov 22, 2024, 6:15 PM

#

marble statues

halcyon yarrow Nov 22, 2024, 6:16 PM

#

dry wave marble statues

wow man that's cool that's exactly what BFL promised and you delivered

#

@dry wave so if i were to take the original image on the left, that purple princess pic, what settings would you recommend to get a reimagination while still keeping things crispy?

dry wave Nov 22, 2024, 6:18 PM

#

Currently I have the feeling that downsample factor 1:3 is the best setting overall

halcyon yarrow Nov 22, 2024, 6:18 PM

#

dry wave Nov 22, 2024, 6:18 PM

#

uuuuh... that looks like an old version

halcyon yarrow Nov 22, 2024, 6:19 PM

#

i just installed it from your stuff on github

#

last commit says 5 minutes ago

#

oh i see you're good

dry wave Nov 22, 2024, 6:19 PM

#

thats weird. Can you restart and update your UI?

halcyon yarrow Nov 22, 2024, 6:20 PM

#

i tried running it and it says 0.55 not in list

#

#

so i just manually fixed it

#

probably cached somewhere

dry wave Nov 22, 2024, 6:20 PM

#

ah, okay. Maybe you used one of my older images as workflow?

#

yes, that looks correct

halcyon yarrow Nov 22, 2024, 6:20 PM

#

i just used the workflow for the previous node

dry wave Nov 22, 2024, 6:20 PM

#

downscale 1:3 and everything else on 1.0

#

if the effect is too weak, you can try to shrink one of the other two options additionally

halcyon yarrow Nov 22, 2024, 6:22 PM

#

so can you explain real briefly what i can expect to see between downscales like 1:1, 1:3 and 1:9? am I essentially merging more of the visual tokens and therefore making the text prompt stronger the higher the ratio goes?

dry wave Nov 22, 2024, 6:22 PM

#

yes. By default you have 27 x 27 visual tokens

#

so 729 tokens in total. Which is ~3 times as much as your text prompt

halcyon yarrow Nov 22, 2024, 6:23 PM

#

purple princess with that b&w prompt @ 1:3,1,1

dry wave Nov 22, 2024, 6:23 PM

#

when using downsample 1:3 you have 9x9 tokens, so 81 in total

#

and with downsample 1:9 you have 3x3 = 9 tokens in total
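
The token counts above can be reproduced with a small pure-Python sketch. It assumes that "merging neighbouring tokens" means averaging each non-overlapping block of the 27 x 27 grid, which is one plausible reading and not confirmed by the chat; `downsample_grid` is a hypothetical name.

```python
def downsample_grid(grid, factor):
    """Average non-overlapping factor x factor blocks of a square token grid."""
    side = len(grid)
    if side % factor:
        raise ValueError("factor must divide the grid side length")
    dim = len(grid[0][0])
    out = []
    for r in range(0, side, factor):
        row = []
        for c in range(0, side, factor):
            block = [grid[r + i][c + j] for i in range(factor) for j in range(factor)]
            row.append([sum(tok[d] for tok in block) / len(block) for d in range(dim)])
        out.append(row)
    return out


# 27 x 27 grid of toy 2-dimensional tokens (real embeddings are much wider).
grid = [[[float(r), float(c)] for c in range(27)] for r in range(27)]
for factor in (1, 3, 9):
    small = downsample_grid(grid, factor)
    print(f"1:{factor} -> {len(small)}x{len(small[0])} = {len(small) * len(small[0])} tokens")
```

With 1:3 the image contributes 81 tokens instead of 729, so a text prompt of a couple hundred tokens is no longer outnumbered.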

dry wave Nov 22, 2024, 6:24 PM

#

halcyon yarrow purple princess with that b&w prompt @ 1:3,1,1

yeah, the only downside of downsampling is that you cannot use a "weaker" effect strength. If its too strong you have to use the other two options

halcyon yarrow Nov 22, 2024, 6:24 PM

#

got it, so the ratio is how far to reduce the visual tokens from the visual input, based on the default spec of 27

dry wave Nov 22, 2024, 6:24 PM

#

yes

halcyon yarrow Nov 22, 2024, 6:24 PM

#

wow outstanding results now

#

the simplistic one on the right is the b&w prompt + 1:3 and the one on the left is the original prompt + 1:3

#

i am literally going to integrate this right now before I do anything else into my system so i can see how much better it does

craggy crest Nov 22, 2024, 6:27 PM

#

halcyon yarrow wow outstanding results now

check the announcements on the comfyUI discord

halcyon yarrow Nov 22, 2024, 6:27 PM

#

@dry wave I have a system that dynamically builds a ComfyUI WF based on the requirements of the image generation. this is not a ComfyUI workflow, this is my own structured format so I can input a config object with the stuff it needs to make it and have it make the WF for me

📎 dslWorkFlowV5.json

halcyon yarrow Nov 22, 2024, 6:28 PM

#

craggy crest check the announcements on the comfyUI discord

i'm not on ComfyUI's discord, anything notable? this thing is way better than the base one

craggy crest Nov 22, 2024, 6:29 PM

#

halcyon yarrow i'm not on ComfyUI's discord, anything notable? this thing is way better than the...

maybe notable - if you want to do video

halcyon yarrow Nov 22, 2024, 6:31 PM

#

im guessing the takeaway from the announcement is "update to the latest version of ComfyUI"

craggy crest Nov 22, 2024, 6:33 PM

#

halcyon yarrow im guessing the takeaway from the announcement is "update to the latest version ...

nope. sent you a DM

halcyon yarrow Nov 22, 2024, 6:34 PM

#

oh i read the announcement before you sent it as a screenshot and I missed the whole point of it also working natively

#

maybe ill try that too and see if i can get it to load

dry wave Nov 22, 2024, 6:35 PM

#

what's the announcement...?

halcyon yarrow Nov 22, 2024, 6:36 PM

#

the LTX video thing works using the built in nodes w/o needing to install custom nodes like mochi

#

they're giving LTX the VIP treatment like mochi got

dry wave Nov 22, 2024, 6:36 PM

#

ah, the ltx looks very interesting

halcyon yarrow Nov 22, 2024, 6:37 PM

#

i think they haven't done that for cogvideo bc cogvideo is so fragmented

mortal mesa Nov 22, 2024, 6:38 PM

#

you need the kijai chart

halcyon yarrow Nov 22, 2024, 6:39 PM

#

links to the example WF: https://github.com/Lightricks/ComfyUI-LTXVideo/?tab=readme-ov-file and i gotta tag @pseudo owl so he can check it out too

GitHub

GitHub - Lightricks/ComfyUI-LTXVideo: LTX-Video Support for ComfyUI

LTX-Video Support for ComfyUI. Contribute to Lightricks/ComfyUI-LTXVideo development by creating an account on GitHub.

#

HOLY COW MY EYES ARE BLEEDING!!! 😮 took only 70 seconds to render for me, oddly appropriate animation too
Prompt executed in 69.21 seconds

#

no way that was on model load too, i just did a subsequent load and it took just 7 seconds Prompt executed in 7.74 seconds

#

Prompt executed in 33.28 seconds my mind is so blown right now

craggy crest Nov 22, 2024, 6:44 PM

#

sort of doesn't work with cartoons and stuff though. just mostly realistic, photographic images

#

at least that's the discussion on the L3 discord

halcyon yarrow Nov 22, 2024, 6:46 PM

#

i had a few cartoony benchmark prompts I used, I could rerun those again. im looking for my max frame count before i OOM

#

i'm at 177 frames at 76 seconds, this is nuts, already at double what mochi can do and in a fraction of the time, mochi can do 86 frames in like 15 minutes lol this does 177 in 76 seconds, i can't even

#

201 frames in 155 seconds, it takes LTX the same amount of time to give me 201 frames at the same resolution and steps as Mochi did for 13 frames. that's a 15x speedup

#

the next question is does this work with the great ClownSharkSampler? @dusky thistle only one way to find out 🙂

bitter hearth Nov 22, 2024, 6:53 PM

#

I don't do video so I am not sure if video models work with clown stuff

#

would be cool if they did

craggy crest Nov 22, 2024, 6:59 PM

#

bitter hearth I don't do video so I am not sure if video models work with clown stuff

if they don't, i don't expect that's a challenge that @dusky thistle will avoid

dusky thistle Nov 22, 2024, 6:59 PM

#

yeah def will want this shit working with video

halcyon yarrow Nov 22, 2024, 7:01 PM

#

Mochi and ClownsharkSampler work together, I have a good feeling LTX is going to work too on the same principle

#

A lantern festival at dusk by a peaceful lake, glowing lanterns drifting into the sky, their warm light reflecting on the water, as bursts of fireworks illuminate the scene in vivid colors.
Mochi left, LTX right. LTX didn't even do any fireworks or lanterns

craggy crest Nov 22, 2024, 7:08 PM

#

halcyon yarrow > A lantern festival at dusk by a peaceful lake, glowing lanterns drifting into ...

did really fast water, though

bitter hearth Nov 22, 2024, 7:09 PM

#

halcyon yarrow Mochi and ClownsharkSampler work together, I have a good feeling LTX is going to w...

depends on a few things
stiffness and stability of the ODE/SDE, and then the noise scaling

halcyon yarrow Nov 22, 2024, 7:09 PM

#

i guess I could do image to video and give it something of high quality to start off with so it can match mochi but then that feels like cheating, it makes longer videos and it's 15x faster and it does img 2 video i mean I'm sure it'll get better right?

halcyon yarrow Nov 22, 2024, 7:10 PM

#

bitter hearth depends on a few things stiffness and stability of the ODE/SDE, and then the noi...

try low ETA first and then scale it up i guess? Mochi was handling res_2m, res_3s and even the 5s one like a champ at 0.5 eta

#

awwww I think it's not compatible 😦
`The expanded size of the tensor (216) must match the existing size (864) at non-singleton dimension 4. Target sizes: [1, 3, 208, 120, 216]. Tensor sizes: [3, 201, 480, 864]` @dusky thistle i guess some adjustments are in order maybe?

#

i could probably hack a solution using ksampler adv eff. again see if that solves it

dusky thistle Nov 22, 2024, 7:12 PM

#

halcyon yarrow awwww I think it's not compatible 😦 `The expanded size of the tensor (216) mus...

is this that new one?

#

that comfy just added today

halcyon yarrow Nov 22, 2024, 7:12 PM

#

yeah that's the brand new one, here's the relevant link if you wanna try it: https://comfyanonymous.github.io/ComfyUI_examples/ltxv/

ComfyUI_examples

Lightricks LTX-Video Model

Examples of ComfyUI workflows

dusky thistle Nov 22, 2024, 7:13 PM

#

k cool

#

i will take a look at that later

bitter hearth Nov 22, 2024, 7:13 PM

#

halcyon yarrow try low ETA first and then scale it up i guess? Mochi was handling res_2m, res_3...

its likely either all will work or none of them

#

like if you use stock comfy SDE it just doesn't work at all cos noise scaling wrong

#

but then with the noise scaling fixed the same sampler types work

craggy crest Nov 22, 2024, 7:16 PM

#

bitter hearth its likely either all will work or none of them

that's why i posted all the links and everything else on the L3 discord - not posting that here

bitter hearth Nov 22, 2024, 7:17 PM

#

I agree to keep more experimental stuff on the more experimental discords yeah

craggy crest Nov 22, 2024, 7:18 PM

#

bitter hearth I agree to keep more experimental stuff on the more experimental discords yeah

they could post it in their comfy channel here but i sort of feel like that's their job

bitter hearth Nov 22, 2024, 7:19 PM

#

I kinda see that as a dead channel now

craggy crest Nov 22, 2024, 7:19 PM

#

kinda. same for swarm - the devs have their own discords and aren't part of sai any more

bitter hearth Nov 22, 2024, 7:20 PM

#

I just tried the Flux outpainting default workflow
switched it from euler to DPM++ 2S, and doubled steps
results immediately better LOL

craggy crest Nov 22, 2024, 7:20 PM

#

bitter hearth I just tried the Flux outpainting default workflow switched it from euler to DPM...

;) the euler curse, eh?

toxic bone Nov 22, 2024, 7:22 PM

#

https://tenor.com/view/ferris-bueller-gif-8034772

Tenor

bitter hearth Nov 22, 2024, 7:24 PM

#

euler has been causing shenanigans for centuries yeah

#

even switching to DPM++ 2M helped
didn't even need the ancestral

toxic bone Nov 22, 2024, 7:26 PM

#

https://tenor.com/view/eulers-disk-vsauce-gif-27098675

Tenor

#

disc pic

dry wave Nov 22, 2024, 7:38 PM

#

https://github.com/kaibioinfo/ComfyUI_AdvancedRefluxControl/tree/main

GitHub

GitHub - kaibioinfo/ComfyUI_AdvancedRefluxControl

Contribute to kaibioinfo/ComfyUI_AdvancedRefluxControl development by creating an account on GitHub.

#

I added documentation now

bitter hearth Nov 22, 2024, 7:41 PM

#

thanks, this looks great
token merging is an interesting solution to the issue
I use token merging for speedups but it makes sense they would help here

halcyon yarrow Nov 22, 2024, 7:57 PM

#

dry wave https://github.com/kaibioinfo/ComfyUI_AdvancedRefluxControl/tree/main

you probably have a very good grasp of what's going on internally, i understand it at a basic level but I don't think I had the understanding to have built a node like that, doing something like that requires knowing what's even possible to achieve, you can't do it when you don't know what's possible, anyways big thx for that node I'm gonna be using the heck out of it. you want me to tag you with the comparisons?

dry wave Nov 22, 2024, 8:12 PM

#

halcyon yarrow you probably have a very good grasp of what's going on internally, i understand ...

oh, sure! I'm curious myself what's the best settings!

halcyon yarrow Nov 22, 2024, 8:14 PM

#

1:3 looked really good with the purple princess but i don't like the output with this other cyber girl i'm doing

#

original image left, redux right using 1:3

#

more 1:3 samples, not happy with the quality they dont feel sharp enough

dry wave Nov 22, 2024, 8:18 PM

#

hm, maybe the image contains too many details that are blurred away by merging

halcyon yarrow Nov 22, 2024, 8:18 PM

#

yeah the prompt is huge:

cybernetic female, holding pistol, great care is taken to depict the young woman to have anatomically correct arms and hands, intricate circuitry pupils,
tattoo, petite body,
modular cybernetics, an android young woman with medium blonde drill hair haircut in a malfunctioning teleporter merges people with objects, science fiction time travel, a scientist experimenting with time travel technology intricate details, 2d, detailed action background, The art style is sleek and polished, with clean, precise lines that contrast with the gritty world it portrays, it has a semi-realistic style, Each detail is sharp, from the smooth, reflective surfaces of cybernetic limbs to the crisp outlines. The overall look is refined, capturing a high-tech elegance amidst the dystopian backdrop, where every element—from intricate machinery to flowing organic forms—is meticulously rendered with a sense of precision and understated sophistication.

#

about 285 tokens and 1:3 reduces it from 729 to 81 tokens in total right?

#

1:9 looks even worse imo

dry wave Nov 22, 2024, 8:22 PM

#

#

thats what I get with 1:3

halcyon yarrow Nov 22, 2024, 8:22 PM

#

that looks fantastic

#

maybe its my negative prompt?

dry wave Nov 22, 2024, 8:23 PM

#

oh, I haven't tried it with CFG yet

halcyon yarrow Nov 22, 2024, 8:23 PM

#

no it's empty

dry wave Nov 22, 2024, 8:23 PM

#

#

can you try this as input image?

halcyon yarrow Nov 22, 2024, 8:23 PM

#

im using a distilled model so cfg is set to 1.2

dry wave Nov 22, 2024, 8:24 PM

#

as said, clip vision is cropping your input image automatically. Often its better to crop it yourself to ensure that the right part of the image is retained

dry wave Nov 22, 2024, 8:24 PM

#

halcyon yarrow im using a distilled model so cfg is set to 1.2

Let me try with cfg

halcyon yarrow Nov 22, 2024, 8:27 PM

#

this is 1:3 with merge strength of 0.8 and the uncropped image, i think you might be onto something with your theory that it's my cfg, there's 2 cfg fields, the one on the sampler and the one on the clip text encode node, that one should be set to 3.5+ and it was set to 1.2 too so i think that's probably where the source of my problems were coming from

dry wave Nov 22, 2024, 8:28 PM

#

yes, it looks like a cfg issue

#

I'm currently making a cfg workflow and try it myself

bitter hearth Nov 22, 2024, 8:29 PM

#

its very confusing but there are two common token merging methods
tome and todo
if you used tome for the node you might get better results with the todo method
I use a node I found here for it https://github.com/ethansmith2000/comfy-todo

dry wave Nov 22, 2024, 8:30 PM

#

#

this is with cfg

#

cfg=1.7

halcyon yarrow Nov 22, 2024, 8:30 PM

#

1:3, 1, 1 using cropped image and cfg of 1.2

halcyon yarrow Nov 22, 2024, 8:31 PM

#

dry wave

ill take that over anything im generating

dry wave Nov 22, 2024, 8:31 PM

#

maybe its your workflow?

halcyon yarrow Nov 22, 2024, 8:31 PM

#

100% it's my workflow, im pretty sure the text encoder cfg shouldn't be at 1.anything

#

1:3 + cropped image + cfg 3.5. thanks for helping me find this bug kai it's my cfg settings after all

dry wave Nov 22, 2024, 8:33 PM

#

so one thing you should always do when using cfg in a distilled model is to skip the first k and last k steps

#

hm, but even if I don't skip steps the image looks good
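
One way to picture "skip the first k and last k steps" is as a per-step CFG scale. This is a minimal sketch, assuming that a scale of 1.0 stands for "CFG off" on the skipped steps; `cfg_schedule` and the step counts are hypothetical, and the chat does not say how the skip is wired up in the actual workflow.

```python
def cfg_schedule(total_steps, cfg, skip):
    """Per-step CFG scale: 1.0 (CFG off) for the first and last `skip` steps."""
    return [
        cfg if skip <= step < total_steps - skip else 1.0
        for step in range(total_steps)
    ]


# 10 sampling steps, real CFG of 1.7 only on the middle 6 steps.
print(cfg_schedule(10, 1.7, 2))
```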

#

halcyon yarrow Nov 22, 2024, 8:34 PM

#

it was a bug in the code I was doing Math.min instead of Math.max

dry wave Nov 22, 2024, 8:34 PM

#

ah, okay

halcyon yarrow Nov 22, 2024, 8:34 PM

#

I rescale the cfg from whatever it is to a 1 to 1.8 range for the sampler and I leave the original cfg for the text encoder

dry wave Nov 22, 2024, 8:35 PM

#

hm, both values are totally different

#

I wouldn't mix them up

halcyon yarrow Nov 22, 2024, 8:35 PM

#

when i talk about the text encoder cfg I mean the 'guidance' field in CLIPTextEncodeFlux

dry wave Nov 22, 2024, 8:35 PM

#

I know. It's good that they renamed it to "guidance"

#

it's really confusing calling it cfg

#

it's a distilled cfg, but it works fundamentally differently from real cfg

halcyon yarrow Nov 22, 2024, 8:35 PM

#

yeah i hate the whole subject personally

#

i'm upset flux even had to go that route its made the whole thing confusing

dry wave Nov 22, 2024, 8:36 PM

#

to be honest, I would only use real cfg when you need negative prompts

#

also: real cfg is twice as slow. You don't want to use it every time
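
The "twice as slow" point follows from how classifier-free guidance is usually written: each step needs one prediction for the positive prompt and one for the negative (or empty) prompt, which are then combined. The sketch below uses the standard formula with a toy stand-in model; `toy_model` and the vectors are hypothetical and only count forward passes.

```python
def real_cfg(cond, uncond, scale):
    """Classifier-free guidance: push the prediction away from the negative prompt."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


calls = 0


def toy_model(prompt_vector):
    """Stand-in for one denoiser forward pass; counts how often it runs."""
    global calls
    calls += 1
    return [2.0 * x for x in prompt_vector]


cond = toy_model([1.0, 0.5])     # pass 1: positive prompt
uncond = toy_model([0.0, 0.25])  # pass 2: negative or empty prompt
print(real_cfg(cond, uncond, 1.7), "model calls:", calls)
```

With a scale of exactly 1.0 the formula collapses to the positive prediction alone, which is why the distilled "guidance" value, fed to the model as an input, needs no second pass.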

halcyon yarrow Nov 22, 2024, 8:36 PM

#

but it's not like I can choose to not have it

dry wave Nov 22, 2024, 8:36 PM

#

?

halcyon yarrow Nov 22, 2024, 8:36 PM

#

the dual clip text encoder uses the guidance field

#

so i have to put something in there

dry wave Nov 22, 2024, 8:36 PM

#

yes, guidance.

#

you need guidance. Its not optional

halcyon yarrow Nov 22, 2024, 8:37 PM

#

lol yeah exactly, so what i set for guidance is what the original image parameters had set for cfg, and what i set for cfg_scale (for the sampler) is the rescaled version of the original cfg value. makes sense? so if the original cfg was let's say 10 then cfg_scale becomes 1.8 and guidance becomes 10

#

if the original cfg was 3.5 then cfg_scale becomes 1 and guidance becomes 3.5
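
One way the translation described above could look, as a sketch only. The chat gives two data points (3.5 maps to 1, 10 maps to 1.8). The linear interpolation between them, the clamping outside that range, and the names `rescale_cfg` and `translate_params` are all assumptions.

```python
def rescale_cfg(original_cfg, src_min=3.5, src_max=10.0, dst_min=1.0, dst_max=1.8):
    """Map the source cfg linearly into [dst_min, dst_max], clamping at both ends."""
    t = (original_cfg - src_min) / (src_max - src_min)
    t = max(0.0, min(1.0, t))  # clamp so the result never leaves the target range
    return dst_min + t * (dst_max - dst_min)


def translate_params(original_cfg):
    """Keep the original cfg as 'guidance'; give the sampler the rescaled value."""
    return {"guidance": original_cfg, "cfg_scale": rescale_cfg(original_cfg)}


if __name__ == "__main__":
    for cfg in (3.5, 7.0, 10.0):
        print(cfg, translate_params(cfg))
```

Getting the clamp direction wrong (swapping `min` and `max`) pins every result to one end of the range, so it is worth checking both endpoints.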

dry wave Nov 22, 2024, 8:38 PM

#

I think you can just set both values independently from each other

#

use real cfg whenever the model does not follow your prompt correctly

#

like I use it when the model makes super pretty characters although my prompt says they should look ugly xD

dry wave Nov 22, 2024, 8:40 PM

#

dry wave

I don't want to praise myself, but the picture looks extremely good. Razor sharp !_!

halcyon yarrow Nov 22, 2024, 8:41 PM

#

yeah i agree, image does look nice and crispy

#

i do set both values independently, but the source image gen params only have cfg, so I have to translate that to something that'll work with my stuff. that's why i recalculate cfg for distilled flux models, and for other ones like flux destill, mangled, fluxbooru i leave cfg as-is

dry wave Nov 22, 2024, 8:55 PM

#

bitter hearth its very confusing but there are two common token merging methods tome and todo ...

I mean, they talk about image patches in a unet. So it's unclear whether their findings are also valid for text token merging. But yeah, subsampling instead of merging would also be possible. I don't think it will give you more details, though.

#

oh, but when I think about it...

#

I said you can only downsample by factor 3

#

but using torch.nn.functional.interpolate you could use arbitrary downsampling factors

#

this would allow for more fine-grained control

halcyon yarrow Nov 22, 2024, 9:13 PM

#

sounds to me like potentially a new version of your style apply node 👼

dry wave Nov 22, 2024, 9:15 PM

#

hm, I don't want to spam too many versions, but if you want to play around and experiment with it I can later upload a version with arbitrary downsampling factors and different interpolation options

halcyon yarrow Nov 22, 2024, 9:17 PM

#

yeah i agree, I think sometimes simplicity is key, i'm personally happy with your initial recommendation of 1:3, 1, 1. don't find myself needing more fine grained controls so maybe it's overkill anyway

dry wave Nov 22, 2024, 9:32 PM

#

nearest neighbour:

#

toxic bone Nov 22, 2024, 9:32 PM

#

she is pretty close to her

dry wave Nov 22, 2024, 9:34 PM

#

yes, I think this could work

#

nearest neighbour is blurry, though

#

actually this is quite nice

#

you can set any downsampling factor

#

and you have several interpolation methods

#

"area" is what was the default before (just averaging)
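
A small sketch of arbitrary-factor downsampling with `torch.nn.functional.interpolate`. The tensor layout (batch, channels, height, width), the helper name `downsample_grid`, and the choice to round the target size are assumptions for illustration. The actual node may arrange its tokens differently.

```python
import torch
import torch.nn.functional as F


def downsample_grid(tokens, factor, mode="area"):
    """Downsample a (batch, channels, height, width) grid by a factor >= 1.

    mode is passed straight to F.interpolate, e.g. "area" (averaging),
    "nearest", "bilinear" or "bicubic".
    """
    if factor < 1.0:
        raise ValueError("factor must be >= 1")
    _, _, h, w = tokens.shape
    # Passing an explicit size avoids float rounding surprises with scale_factor.
    size = (max(1, round(h / factor)), max(1, round(w / factor)))
    return F.interpolate(tokens, size=size, mode=mode)


if __name__ == "__main__":
    grid = torch.arange(36, dtype=torch.float32).reshape(1, 1, 6, 6)
    print(downsample_grid(grid, 3, mode="area").shape)
    print(downsample_grid(grid, 1.5, mode="nearest").shape)
```

With `mode="area"` each output value is the average of the input cells it covers, which matches the "just averaging" behaviour mentioned above.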

#

I will make a push on a separate branch and update the main branch after further testing