#🆕｜sd3 | Stable Diffusion | Page 117

muted dove Oct 29, 2024, 1:13 PM

#

With a flex in their profile of "Crypto holder". 🤣

#

https://tenor.com/view/laughter-emoji-gif-26681851


turbid grotto Oct 29, 2024, 2:13 PM

#

is sd35m out??

#

https://huggingface.co/stabilityai/stable-diffusion-3.5-medium

stabilityai/stable-diffusion-3.5-medium · Hugging Face

#

sponging gonnabegood

bitter hearth Oct 29, 2024, 2:17 PM

#

could someone put on civit pls my shell scripts are written for that

signal shuttle Oct 29, 2024, 2:18 PM

#

SD3.5M is OUT https://huggingface.co/stabilityai/stable-diffusion-3.5-medium


finite osprey Oct 29, 2024, 2:20 PM

#

waiting for the first woman lying on the floor

signal shuttle Oct 29, 2024, 2:21 PM

#

Made with SD3.5M

bitter hearth Oct 29, 2024, 2:23 PM

#

detail levels look good

signal shuttle Oct 29, 2024, 2:24 PM

#

Text ain't half bad

turbid grotto Oct 29, 2024, 2:35 PM

#

yea that is definitely improvement

sacred jewel Oct 29, 2024, 2:40 PM

#

cobalt moon Oct 29, 2024, 2:42 PM

#

This image shows a cozy, sunlit room with a relaxed, homey atmosphere, featuring Furina from Genshin Impact. She stands barefoot near a window, dressed casually in a tank top and shorts, holding a plate of food and a glass of water, looking slightly surprised or concerned. The warm sunlight streaming through the curtains softens the scene, creating a peaceful, domestic vibe. On the left side, a pair of playful, water-like creatures—one wearing a whimsical top hat—are causing mischief in the kitchen, splashing water all over the sink and bubbles floating in the air. The room is filled with small, charming details, like shelves full of books, potted plants, and scattered objects such as cans and a tipped-over cup. The balance between Furina’s calm stance and the chaos in the background gives the scene a playful, slice-of-life feel, capturing a moment of quiet absurdity in a seemingly ordinary day.

not a great prompt but hey.

turbid grotto Oct 29, 2024, 2:42 PM

#

textures are really good

dusky thistle Oct 29, 2024, 2:52 PM

#

noble coyote Oct 29, 2024, 2:52 PM

#

Flux/Florence2 + LoRAs img2img

patent acorn Oct 29, 2024, 3:08 PM

#

cobalt moon > This image shows a cozy, sunlit room with a relaxed, homey atmosphere, featuri...

smh base model doesnt recognize furina

cobalt moon Oct 29, 2024, 3:08 PM

#

patent acorn smh base model doesnt recognize furina

doesn't recognize Loona from Helluva Boss either

#

I mean the basic feature is there

patent acorn Oct 29, 2024, 3:08 PM

#

cuz its not trained on dat

#

wait for finetunes

turbid grotto Oct 29, 2024, 3:09 PM

#

am I crazy or it is not worse than large?

patent acorn Oct 29, 2024, 3:09 PM

#

"sd3 medium is all you need"

turbid grotto Oct 29, 2024, 3:10 PM

#

turbid grotto am I crazy or it is not worse than large?

no, 8b is better but that one good too

dull star Oct 29, 2024, 3:10 PM

#

waiting for gguf

cobalt moon Oct 29, 2024, 3:10 PM

#

dull star waiting for gguf

it's here already
https://huggingface.co/ND911/stable-diffusion-3.5-medium-GGUF/tree/main

ND911/stable-diffusion-3.5-medium-GGUF at main

dull star Oct 29, 2024, 3:10 PM

#

oh okay

turbid grotto Oct 29, 2024, 3:10 PM

#

dull star waiting for gguf

why, it is small

dull star Oct 29, 2024, 3:10 PM

#

I was looking at city's huggingface

#

oh wait yeah

#

I just got used to gguf lmao

turbid grotto Oct 29, 2024, 3:11 PM

#

cobalt moon it's here already https://huggingface.co/ND911/stable-diffusion-3.5-medium-GGUF/...

bro was camping

dull star Oct 29, 2024, 3:11 PM

#

didn't even remember WHY I was using gguf

cobalt moon Oct 29, 2024, 3:11 PM

#

turbid grotto why, it is small

someone like me with 2GB VRAM sadcat

dull star Oct 29, 2024, 3:11 PM

#

yeah its good for that

patent acorn Oct 29, 2024, 3:11 PM

#

cobalt moon it's here already https://huggingface.co/ND911/stable-diffusion-3.5-medium-GGUF/...

wtf

dull star Oct 29, 2024, 3:11 PM

#

thank god for these GGUFs

#

they have so much better quality than fp8

turbid grotto Oct 29, 2024, 3:11 PM

#

cobalt moon someone like me with 2GB VRAM sadcat

oh curious how q4 gonna run

cobalt moon Oct 29, 2024, 3:12 PM

#

will try that tomorrow

#

like with my 2GB VRAM setup

#

I did have my 6GB VRAM laptop tho

dull star Oct 29, 2024, 3:12 PM

#

q4 is good on like 7B and up models (talking about LLMs)

#

but this may apply to DiTs since these are transformer based

#

but the point is, a large quantized model might outperform a small unquantized model

#

but it will be way slower for sure
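A rough sketch of what a 4-bit quant trades away, assuming a simple symmetric per-block scheme. This is a simplification for illustration, not the actual GGUF Q4 or K-quant layout; `quantize_q4`, `dequantize_q4` and the block size of 32 are hypothetical choices.

```python
import random

BLOCK = 32  # weights per block (assumption for this sketch)

def quantize_q4(weights):
    """Symmetric 4-bit quantization: one float scale per block, ints in [-8, 7]."""
    blocks = []
    for i in range(0, len(weights), BLOCK):
        chunk = weights[i:i + BLOCK]
        amax = max(abs(w) for w in chunk)
        scale = amax / 7 if amax > 0 else 1.0
        q = [max(-8, min(7, round(w / scale))) for w in chunk]
        blocks.append((scale, q))
    return blocks

def dequantize_q4(blocks):
    """Rebuild approximate float weights; this extra work happens at inference time."""
    return [scale * v for scale, q in blocks for v in q]

random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(1024)]  # toy "layer"
restored = dequantize_q4(quantize_q4(weights))

max_err = max(abs(a - b) for a, b in zip(weights, restored))
mean_abs = sum(abs(w) for w in weights) / len(weights)
print(f"max abs error {max_err:.5f} vs mean |w| {mean_abs:.5f}")
```

Each weight ends up as one of 16 levels per block, so the rounding error is bounded by half a level. The dequantize step is extra compute per layer, which would fit the "way slower" expectation above.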

#

hmm waiting for all the reddit posts that trash on SD3.5 Medium

#

thomas its to be expected

turbid grotto Oct 29, 2024, 3:21 PM

#

I hope it will be adopted, it looks so good

#

look at this aesthetics

cobalt moon Oct 29, 2024, 3:23 PM

#

dull star hmm waiting for all the reddit posts that trash on SD3.5 Medium

The comments saying "why use this when you have Flux" already appeared thomas

#

That guy probably has a dual 4090 setup

dull star Oct 29, 2024, 3:24 PM

#

yeah he can fit flux and T5 at FP32 on his GPUs

turbid grotto Oct 29, 2024, 3:24 PM

#

imagine opposite happens

patent acorn Oct 29, 2024, 3:24 PM

#

cobalt moon The comments saying "why use this when you have Flux" already appeared <:thomas:...

LOL i saw that

turbid grotto Oct 29, 2024, 3:27 PM

#

q4 is smaller than 1.5 but speed drops from 1.16s/it down to 1.41s/it

#

quality does not drop heavily

#

q4 vs fp16

noble coyote Oct 29, 2024, 3:30 PM

#

"My Triple 5090 512Gb RAM, 3 x 10Tb SSD is gonna ... "

#

... need a pocket nuclear-reactor to power

patent acorn Oct 29, 2024, 3:31 PM

#

wanted to try style blending between photo and anime, unfortunately it didnt do well in medium maybe i suck at prompting 😂
"a photo of a cafe at night. there is an anime girl sitting on a chair"

turbid grotto Oct 29, 2024, 3:36 PM

#

patent acorn wanted to try style blending between photo and anime, unfortunately it didnt do ...

i couldn't either but I don't know correct prompting for that

patent acorn Oct 29, 2024, 3:37 PM

#

only seen lykon trying 3 subjects in separated style

noble coyote Oct 29, 2024, 3:37 PM

#

raw photo of a cafe at night. an anime girl is sitting at a table

patent acorn Oct 29, 2024, 3:37 PM

#

ok lemme try

#

nope

turbid grotto Oct 29, 2024, 3:48 PM

#

rapid salmon Oct 29, 2024, 3:50 PM

#

cobalt moon it's here already https://huggingface.co/ND911/stable-diffusion-3.5-medium-GGUF/...

Super nice. Curious though, how was a K quant made? When I look into the options for quantizing models (e.g. leejet's stable-diffusion-cpp), they only have non-K-quant methods.

(also I'm dying to see Forge get K-quant support, from my understanding it's still not in yet, though I could be relying on outdated info)

lavish osprey Oct 29, 2024, 3:50 PM

#

turbid grotto

this is kind of impressive, ngl

rapid salmon Oct 29, 2024, 3:52 PM

#

It seems like ComfyUI is the only source (that I know of) that supports K quants but...I really prefer other tools. :x

turbid grotto Oct 29, 2024, 3:52 PM

#

lavish osprey this is kind of impressive, ngl

yea that is very good at various styles

remote holly Oct 29, 2024, 3:57 PM

#

#

is really good

lavish osprey Oct 29, 2024, 3:59 PM

#

about to upload my upscaling workflow

bitter hearth Oct 29, 2024, 4:04 PM

#

ok nice, tiled upscale is what I think I might use it for

turbid grotto Oct 29, 2024, 4:05 PM

#

what this node doing in "SLG" workflow?

sacred jewel Oct 29, 2024, 4:05 PM

#

lavish osprey about to upload my upscaling workflow

Where will it be uploaded to?

Thank you..

lavish osprey Oct 29, 2024, 4:06 PM

#

sacred jewel Where will it be uploaded to? Thank you..

HF

#

waiting for review

lavish osprey Oct 29, 2024, 4:06 PM

#

turbid grotto what this node doing in "SLG" workflow?

this is SGL sampling

bitter hearth Oct 29, 2024, 4:06 PM

#

I don't use small GPUs but the smallest quant being just 1.79 GB is kinda cool

lavish osprey Oct 29, 2024, 4:07 PM

#

runs on a mobile phone lol

bitter hearth Oct 29, 2024, 4:07 PM

#

oh yeah I forgot phones
that's a legit use case for smol model

turbid grotto Oct 29, 2024, 4:10 PM

#

lavish osprey this is SGL sampling

hmmm that is nearly perfect

muted dove Oct 29, 2024, 4:11 PM

#

turbid grotto hmmm that is nearly perfect

Is that just by adding the SGL node?

sacred jewel Oct 29, 2024, 4:13 PM

#

lavish osprey HF

This one I assume?

bitter hearth Oct 29, 2024, 4:14 PM

#

if its anything like as good as PAG was for SDXL then this is a big deal

#

been missing PAG so badly

turbid grotto Oct 29, 2024, 4:14 PM

#

muted dove Is that just by adding the SGL node?

there is workflow on hf page with "SGL"

remote holly Oct 29, 2024, 4:14 PM

#

turbid grotto Oct 29, 2024, 4:21 PM

#

turbid grotto hmmm that is nearly perfect

#

seems like it does help

#

not always of course

graceful osprey Oct 29, 2024, 4:25 PM

#

SLG SLG SkipLayerGuidance

#

Yes it should help

#

The underlying mechanism is very similar to PAG

lavish osprey Oct 29, 2024, 4:27 PM

#

@lunar canopy how comes Dango has the yellow name and not me

graceful osprey Oct 29, 2024, 4:27 PM

#

Oh that's my old days dev title still there ? Lol

lavish osprey Oct 29, 2024, 4:27 PM

#

chat, I need dog plushie vs dangos fight images now

bitter hearth Oct 29, 2024, 4:28 PM

#

graceful osprey The underlying mechanism is very similar to PAG

oh awesome
in my opinion PAG is the biggest quality boost we got in last year or two

#

in terms of just a single node giving a boost

lavish osprey Oct 29, 2024, 4:29 PM

#

one thing you might have missed with Medium, is that it works at 512x512 too

#

unlike Large that ONLY works at 1mp

#

at the end of the day, they complement each other very well

#

I generated over 2k images with Medium as refiner yesterday and I love it

graceful osprey Oct 29, 2024, 4:31 PM

#

SLG is not as direct as PAG in terms of layer selection given how DiT model works - but play with it - it is fun

lavish osprey Oct 29, 2024, 4:31 PM

#

noble coyote Oct 29, 2024, 4:32 PM

#

Flux/Florence2 + LoRAs img2img

bitter hearth Oct 29, 2024, 4:33 PM

#

graceful osprey SLG is not as direct as PAG in terms of layer selection given how DiT model work...

yeah that's okay, even a little bit of PAG effect would be good

graceful osprey Oct 29, 2024, 4:34 PM

#

Relevant layers:
2,4,7,8,9
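A toy sketch of how a skip-layer guidance term could be combined with regular CFG. The assumption is an extra conditional pass with the chosen layers skipped, whose prediction is pushed away from, similar in spirit to PAG. The function names and the numbers are hypothetical; this is not a copy of the ComfyUI node.

```python
def cfg_combine(uncond, cond, cfg_scale):
    """Standard classifier-free guidance on a flat list of predictions."""
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]

def skip_layer_guidance(cfg_result, cond, cond_skipped, slg_scale):
    """Push the result away from the prediction made with some layers skipped."""
    return [r + slg_scale * (c - s)
            for r, c, s in zip(cfg_result, cond, cond_skipped)]

# Toy predictions for two latent values (made-up numbers).
uncond = [0.0, 0.0]
cond = [1.0, 2.0]
cond_skipped = [0.5, 2.0]  # pretend skipping layers only degraded the first value

guided = skip_layer_guidance(cfg_combine(uncond, cond, 4.0), cond, cond_skipped, 3.0)
print(guided)
```

Only the value that the skipped layers changed gets the extra push, which may be why picking layers matters more than with plain CFG.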

bitter hearth Oct 29, 2024, 4:34 PM

#

Divisive Norm and Spectral Modulation from here gives a bit of PAG effect also and works on all DiTs https://github.com/Clybius/ComfyUI-Latent-Modifiers

#

should stack with SLG

turbid grotto Oct 29, 2024, 4:36 PM

#

lavish osprey one thing you might have missed with Medium, is that it works at 512x512 too

6s per image on rtx3060 with 512 and looks fine!

lavish osprey Oct 29, 2024, 4:36 PM

#

I'm also curious to test this stuff here ^^^

turbid grotto Oct 29, 2024, 4:36 PM

#

turbid grotto 6s per image on rtx3060 with 512 and looks fine!

it could go down to 3s with 8step lora and maybe to 2s with TensorRT

icy coral Oct 29, 2024, 4:36 PM

#

lavish osprey

Wouldn't it be wise to release your base models together with an example fine tune, one maybe done in partnership with a prominent fine tuner, so the people that are less tech savvy and more focused on quality rather than trainability and versatility can have a glimpse of what the model could offer down the line?

lavish osprey Oct 29, 2024, 4:36 PM

#

turbid grotto 6s per image on rtx3060 with 512 and looks fine!

hey that's naked

turbid grotto Oct 29, 2024, 4:37 PM

#

lavish osprey hey that's naked

oops

lavish osprey Oct 29, 2024, 4:37 PM

#

icy coral Wouldn't it be wise to release your base models together with an example fine tu...

I'm gonna release some finetunes soon

bitter hearth Oct 29, 2024, 4:37 PM

#

the model might be stronger structurally at 512, cos in my opinion flux is strongest structurally at 512 or sometimes even 384

remote holly Oct 29, 2024, 4:37 PM

#

lavish osprey Oct 29, 2024, 4:37 PM

#

bitter hearth the model might be stronger structurally at 512, cos in my opinion flux is stron...

less long distance attention issues with a smaller image

#

cause... you don't have long distances

#

you can also attempt a "cascade-like" workflow

#

where you generate low res, then upscale and refine
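A back-of-the-envelope sketch of why smaller images mean shorter attention distances. It assumes an 8x VAE downscale and 2x2 latent patches; those values and the helper name `token_count` are assumptions for illustration.

```python
def token_count(width, height, vae_downscale=8, patch_size=2):
    """Image tokens the transformer attends over, under the assumptions above."""
    step = vae_downscale * patch_size
    return (width // step) * (height // step)

for w, h in [(512, 512), (1024, 1024)]:
    n = token_count(w, h)
    # Self-attention compares every token with every other token: n * n pairs.
    print(f"{w}x{h}: {n} tokens, {n * n} attention pairs")
```

Under these assumptions, going from 512x512 to 1024x1024 gives 4x the tokens and 16x the attention pairs, which is one way to read the "generate low res, then upscale and refine" suggestion.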

bitter hearth Oct 29, 2024, 4:38 PM

#

yeah "cascade-like" is how I do flux, its good for these

turbid grotto Oct 29, 2024, 4:38 PM

#

lavish osprey I'm gonna release some finetunes soon

gonnabegood

icy coral Oct 29, 2024, 4:39 PM

#

lavish osprey I'm gonna release some finetunes soon

Nice, because people love focusing on defects any base model has that are fixable relatively easily through training, and maybe releasing them together would prevent that somehow

dull star Oct 29, 2024, 4:39 PM

#

how do I use the skip layer guidance

bitter hearth Oct 29, 2024, 4:39 PM

#

the simpletuner dev showed some screenshots of sd3.5m fine tune testing and it already looks better

dull star Oct 29, 2024, 4:39 PM

#

or do I even need it

lavish osprey Oct 29, 2024, 4:39 PM

#

icy coral Nice, because people love focusing on defects any base model has that are fixabl...

well, my finetune currently focuses more on artistic view than on fixing issues. But we're also working on those.

dull star Oct 29, 2024, 4:40 PM

#

sd3.5 medium

#

https://tenor.com/view/edp-i-mean-its-all-right-gif-26702511


turbid grotto Oct 29, 2024, 4:40 PM

#

bitter hearth the simpletuner dev showed some screenshots of sd3.5m fine tune testing and it a...

where can I see them?

dull star Oct 29, 2024, 4:43 PM

#

idk if euler ancestral is helping but I'm using it for sure

#

im gonna do 40 steps with it

#

since this is almost 2 it/s

#

#

https://tenor.com/view/nice-good-well-done-gif-14854677


#

medium still has that superior VAE compared to sd3.5 large

#

no speckled "dust" artifact

remote holly Oct 29, 2024, 4:45 PM

#

bitter hearth Oct 29, 2024, 4:46 PM

#

turbid grotto where can I see them?

terminus research group discord

dull star Oct 29, 2024, 4:46 PM

#

lol dem fingers

remote holly Oct 29, 2024, 4:47 PM

#

the styles are amazing

turbid grotto Oct 29, 2024, 4:48 PM

#

3072x512

#

medium

dull star Oct 29, 2024, 4:49 PM

#

#

#

https://tenor.com/view/michael-rosen-like-project-noice-smile-gif-17002790


#

remote holly Oct 29, 2024, 4:53 PM

#

dull star

this is amazing

dull star Oct 29, 2024, 4:53 PM

#

thanks

#

I prompted it

#

thomas thomas thomas thomas thomas

#

euler ancestral, cfg 4, ddim_uniform, 40 steps

#

sadly it doesn't like ugliness, almost like flux

remote holly Oct 29, 2024, 4:54 PM

#

thanks

dull star Oct 29, 2024, 4:56 PM

#

Prompt: The man is shirtless and is barefeet and covered in mud. His long pants are old and torn. He looks thin and frail.

Reality: The man is genetically perfect, has abs and a nice jawline, he is NOT covered in mud cause that would be unappetizing... Rather, we just brushed him with a tiny bit of dirt. He looks like someone who is well fed and his pants cannot be torn as that doesn't look aesthetically pleasing...

analog dome Oct 29, 2024, 4:56 PM

#

I know a lot about Stable Diffusion, but one thing I never looked up and would like someone to explain: what does the TX5 (or whatever it's called) do for sd3.5? What does it improve? I'm running without it and it works fine, but I want to know what I'm missing

dull star Oct 29, 2024, 4:58 PM

#

T5_XXL is a large language model, but unlike most of the ones that we use today (which are decoder-only), this one has an ENCODER too.
Some image models use that encoder as a text encoder (alongside CLIP) to improve prompt adherence and text capabilities.
If you are mostly making images without text or rather simplistic scenes then yeah... you won't ever need it.

dull star Oct 29, 2024, 4:59 PM

#

dull star > Prompt: The man is shirtless and is barefeet and **covered in mud**. His long ...

ok putting "ugly" before "man" gives him a worse jawline and a receding hairline. Perfection

#

he still has abs though and his pants are NOT torn

analog dome Oct 29, 2024, 4:59 PM

#

thank you very much, I think this sd3.5 + the announced controlnet support will do wonders

dull star Oct 29, 2024, 4:59 PM

#

@cunning lintel

turbid grotto Oct 29, 2024, 5:00 PM

#

is there info about controlnets?

dull star Oct 29, 2024, 5:01 PM

#

BEFORE and AFTER | SkipLayerGuidanceSD3 (default settings)

turbid grotto Oct 29, 2024, 5:02 PM

#

dull star BEFORE and AFTER | **SkipLayerGuidanceSD3** (default settings)

is this sd35L?

dull star Oct 29, 2024, 5:02 PM

#

medium

turbid grotto Oct 29, 2024, 5:02 PM

#

dull star medium

sponging

#

looks great

lunar canopy Oct 29, 2024, 5:05 PM

#

lavish osprey @lunar canopy how comes Dango has the yellow name and not me

because

lavish osprey Oct 29, 2024, 5:08 PM

#

turbid grotto is there info about controlnets?

coming

turbid grotto Oct 29, 2024, 5:17 PM

#

lavish osprey coming

gonnabegood

#

256x256 lol

#

2s per image

#

native fullhd

sacred jewel Oct 29, 2024, 5:34 PM

#

SD3.5M ... random seeds, all other settings the same.
prompt: what?

vital surge Oct 29, 2024, 5:35 PM

#

turbid grotto native fullhd

Sorry, looks squished

sacred jewel Oct 29, 2024, 5:39 PM

#

Prompt: zombies running screaming with giant billboard in the background that reads "WHOA"

Not sure if my output is indicative or I am doing something wrong? Using the workflow example from the HF Repo for SD3.5M

Definitely not happy with these results

bitter hearth Oct 29, 2024, 5:42 PM

#

not necessarily doing something wrong, the model isn't as strong as the big boys

mortal mesa Oct 29, 2024, 5:42 PM

#

Uh where is skip layer guidance node from

bitter hearth Oct 29, 2024, 5:42 PM

#

its a core comfy node

mortal mesa Oct 29, 2024, 5:43 PM

#

mmm i thought i updated, ill do it again

keen palm Oct 29, 2024, 5:43 PM

#

This is 0.5 more B than I need though

bitter hearth Oct 29, 2024, 5:43 PM

#

might have to do it manually from github

mortal mesa Oct 29, 2024, 5:44 PM

#

ya i havent actually found it yet lol just started to look

#

OK it is indeed part of Comfy, had to update

sacred jewel Oct 29, 2024, 5:45 PM

#

a more complex prompt using the Triple CLIP Text Encoder

bitter hearth Oct 29, 2024, 5:45 PM

#

mortal mesa OK it is indeed part of Comfy, had to update

it will work on flux too but it needs adapting

#

obviously SAI won't make the flux node so we have to

keen palm Oct 29, 2024, 5:47 PM

#

Will there be a turbo version of Medium?

bitter hearth Oct 29, 2024, 5:47 PM

#

sacred jewel a more complex prompt using the Triple CLIP Text Encoder

not sure Clip L and Clip G would like the big paragraphs

#

I don't know much about prompting so I'm not sure, but it might be doing harm giving big paragraph to weaker text encoders

lunar rivet Oct 29, 2024, 5:49 PM

#

I see only downvotes under the 3.5 medium announcement, is it that bad?

keen palm Oct 29, 2024, 5:50 PM

#

lunar rivet I see only downvotes under the 3.5 medium announcement, is it that bad?

Think it's just nightmares from the 3.0 model

bitter hearth Oct 29, 2024, 5:50 PM

#

lunar rivet I see only downvotes under the 3.5 medium announcement, is it that bad?

go by Arxiv reactions rather than Reddit reactions

lunar rivet Oct 29, 2024, 5:50 PM

#

I mean thumbs down under announcement here but guessing reddit isn't doing much better

bitter hearth Oct 29, 2024, 5:51 PM

#

oh, reddit and discord are same anyway

#

it takes a while but in a few months there will be Arxiv papers covering these models, there's already a few papers that talk about Flux

mortal mesa Oct 29, 2024, 5:53 PM

#

Full HD

digital thorn Oct 29, 2024, 5:56 PM

#

Medium test

signal shuttle Oct 29, 2024, 5:56 PM

#

I wonder how fast SD3.5M going to be on my dual 3090 setup 🤔, shame I can't test it out right now

bitter hearth Oct 29, 2024, 6:02 PM

#

for the most part you can just take the parameter count to be the speed multiplier

#

not always true but its not far off

#

however Flux goes double speed per parameter cos it doesn't need a negative
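The rule of thumb above can be sketched as parameters times forward passes per step. The parameter counts here are approximate assumptions (about 2.5B for Medium, 8B for Large, 12B for Flux dev), and `relative_step_cost` is a hypothetical helper; it ignores resolution, text encoders, VAE and memory bandwidth.

```python
def relative_step_cost(params_billion, passes_per_step):
    """Crude per-step cost: parameters times forward passes per sampling step."""
    return params_billion * passes_per_step

# (approximate parameters in billions, passes per step)
# CFG with a negative prompt needs two passes; no negative needs one.
models = {
    "SD3.5 Medium (CFG)": (2.5, 2),
    "SD3.5 Large (CFG)": (8.0, 2),
    "Flux dev (no negative)": (12.0, 1),
}

costs = {name: relative_step_cost(p, n) for name, (p, n) in models.items()}
for name, cost in costs.items():
    print(f"{name}: {cost:.1f}")
```

By this estimate Medium is roughly 3x cheaper per step than Large, and Flux lands between the two despite having the most parameters. As said above, it's not always true but not far off.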

sacred jewel Oct 29, 2024, 6:02 PM

#

bitter hearth I don't know much about prompting so I'm not sure, but it might be doing harm gi...

Didn't do much better with the smaller prompts either 🤭 which is why I have both short and long examples. The prompt was expanded by GPT-4o and tailored to the strengths of each encoder.

What do I know 🤷‍♂️

bitter hearth Oct 29, 2024, 6:04 PM

#

seems like the image quality you got in the end was similar to the others
its a limited model compared to flux and sd3.5l

sacred jewel Oct 29, 2024, 6:04 PM

#

OK, now Large vs Medium using a smaller prompt for both L/G

bitter hearth Oct 29, 2024, 6:05 PM

#

yeah large is just drastically better

#

might be the case that medium is nice for tiled upscale, not sure yet

#

I think Lumina is worth looking at again also, there was a fine tune of it to 2k resolution in the I-max paper, sadly they did not release it but it would be possible to replicate it

sacred jewel Oct 29, 2024, 6:09 PM

#

Large vs Medium different prompt:

#

Using @lavish osprey 's upscaling workflow which uses both Large and Medium models...

bitter hearth Oct 29, 2024, 6:13 PM

#

showing some potential

#

on the upscale

#

needs a bit more aesthetic finetuning or preference optimisation

sacred jewel Oct 29, 2024, 6:16 PM

#

That's a mighty hitchhiking thumb

SD3.5L

carmine vessel Oct 29, 2024, 6:20 PM

#

In general, how does this perform in comparison to 3.0?

sacred jewel Oct 29, 2024, 6:20 PM

#

SD3.5L original vs. Upscaled result through Medium (Lykon's upscale workflow)

signal shuttle Oct 29, 2024, 6:20 PM

#

Apparently SD3.5M loras can be trained in mere minutes https://x.com/peacej/status/1851288045712191572

Jerry Chi (ジェリー・チー) (@peacej) on X

trained a DreamBooth LoRA on my face in a few minutes using
Stable Diffusion 3.5 Medium. (I was able to train it in a few minutes.) Updated guide coming soon.
#SD35M #stablediffusion

carmine vessel Oct 29, 2024, 6:20 PM

#

Mostly good, but what is the versatility in unique applications such as stylized artwork?

sacred jewel Oct 29, 2024, 6:21 PM

#

carmine vessel In general, how does this perform in comparison to 3.0?

Not sure... I would have to put them side by side... let me see...

pseudo owl Oct 29, 2024, 6:23 PM

#

carmine vessel In general, how does this perform in comparison to 3.0?

It’s definitely better at human anatomy by a mile, a bit less than large, not sure about other things.

lavish osprey Oct 29, 2024, 6:23 PM

#

signal shuttle Apparently SD3.5M loras can be trained in mere minutes https://x.com/peacej/stat...

oh a random Jerry Chi

bitter hearth Oct 29, 2024, 6:27 PM

#

3.5m hands are a lot better than 3.0m hands

#

its a big improvement over 3.0m

#

its easy to forget what previous models were like, did sd 1.5 for last week and didn't see 1 correct hand in like 1000 generations lol

#

I really love sd 1.5's lighting and general "vibe" though sometimes, no other model like it

pseudo owl Oct 29, 2024, 6:32 PM

#

Yeah I think flux spoiled us, basically perfect anatomy each gen.

I still prefer flux but sd3.5 is much more creative than it.

signal shuttle Oct 29, 2024, 6:32 PM

#

pseudo owl Yeah I think flux spoiled us, basically perfect anatomy each gen. I still prefe...

And more trainable too

#

In regards of full fine tunes

lunar rivet Oct 29, 2024, 6:33 PM

#

huh 3.5 medium isn't that bad after all, resolutions above 1MP don't break like in 3.5 large and the quality isn't terrible for the size and speed

bitter hearth Oct 29, 2024, 6:33 PM

#

flux spoiled us yeah exactly
cos what comes out now will always have flux as context

sacred jewel Oct 29, 2024, 6:33 PM

#

SD3.5M vs SD3.5L vs SD3M

same settings for all three generations.

slender fox Oct 29, 2024, 6:37 PM

#

sacred jewel SD3.5M vs SD3.5L vs SD3M same settings for all three generations.

Based on what I've tested so far, even though medium outputs feel messy, it creates quite nice looking skin tones.

sacred jewel Oct 29, 2024, 6:42 PM

#

SD3.5L vs SD3.5M vs SD3M ... same settings for all three

lunar rivet Oct 29, 2024, 6:44 PM

#

bitter hearth Oct 29, 2024, 6:44 PM

#

hands are the final boss of models

heady copper Oct 29, 2024, 6:45 PM

#

my last experience with diffusion models is SD2 but by god it's progressed a lot

serene spindle Oct 29, 2024, 6:45 PM

#

Because SD2 sucked

sacred jewel Oct 29, 2024, 6:45 PM

#

SD3M vs. SD3.5M vs SD3.5L

signal shuttle Oct 29, 2024, 6:46 PM

#

Wait I just realized SD3's "woman laying in grass" was just a copy of SDXL's "woman doing yoga" prompt

sacred jewel Oct 29, 2024, 6:46 PM

#

p.s. misspelling was on-purpose.

heady copper Oct 29, 2024, 6:47 PM

#

serene spindle Because SD2 sucked

you could do some interesting stuff on SD2 but compared to now especially it's pretty mid (and believe me, I loved that model for some reason)

serene spindle Oct 29, 2024, 6:47 PM

#

I liked 1.5 a lot more tbh

heady copper Oct 29, 2024, 6:48 PM

#

I was mostly doing embedding training for 2 which was pretty neat but unfortunately the architecture for embeds is no more

#

oh well, this isn't the channel for nostalgia lmao

bitter hearth Oct 29, 2024, 6:50 PM

#

you can still do embeddings 🤔

sacred jewel Oct 29, 2024, 6:55 PM

#

SD3.5L vs SD3.5M vs SD3M

bitter hearth Oct 29, 2024, 6:56 PM

#

SD3.5L did it 🙂

heady copper Oct 29, 2024, 6:57 PM

#

bitter hearth you can still do embeddings 🤔

Actually? That's cool- thought SDXL kinda killed it with the multiple text encoders, worth looking into I suppose

slender fox Oct 29, 2024, 6:58 PM

#

Well, hands are not going to happen, I tried to generate a woman showing her hands, did 100 images, only 3-5 ok.

bitter hearth Oct 29, 2024, 6:58 PM

#

I think multiple text encoders is gonna go away also

#

they are mostly a temporary anomaly cos its expensive to train a model to fully replace clip

sacred jewel Oct 29, 2024, 7:01 PM

#

Flux for comparison...

#

although the only one that got the BACK of the hand was SD3.5L 🤦‍♂️

#

With more "encouragement" it finally got the back of the hand LOL

signal shuttle Oct 29, 2024, 7:13 PM

#

sacred jewel With more "encouragement" it finally got the back of the hand LOL

Bro is holding a gun to the models head with that "(back of a hand:2.5)"
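For anyone who hasn't seen the `(text:weight)` syntax, here is a minimal sketch of how such a prompt could be split into weighted chunks. The regex and `parse_weights` are hypothetical and don't handle nested parentheses; real UIs have their own parsers, and the example prompt is made up around the quoted fragment.

```python
import re

# Matches "(some text:1.25)" with no nested parentheses.
WEIGHT_RE = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")

def parse_weights(prompt):
    """Split a prompt into (text, weight) pairs; unweighted text gets 1.0."""
    parts, pos = [], 0
    for m in WEIGHT_RE.finditer(prompt):
        if m.start() > pos:
            parts.append((prompt[pos:m.start()], 1.0))
        parts.append((m.group(1), float(m.group(2))))
        pos = m.end()
    if pos < len(prompt):
        parts.append((prompt[pos:], 1.0))
    return parts

print(parse_weights("a woman showing the (back of a hand:2.5), photo"))
```

A weight of 2.5 is far from the default 1.0, hence the "holding a gun to the model's head" comment.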

keen palm Oct 29, 2024, 7:19 PM

#

heady copper Actually? That's cool- thought SDXL kinda killed it with the multiple text encod...

Multiple text encoders with sdxl?

signal shuttle Oct 29, 2024, 7:23 PM

#

keen palm Multiple text encoders with sdxl?

SDXL uses clip_l and clip_g if I am not mistaken

bitter hearth Oct 29, 2024, 7:27 PM

#

yeah

#

clip g and clip l for SDXL

keen palm Oct 29, 2024, 7:28 PM

#

signal shuttle SDXL uses clip_l and clip_g if I am not mistaken

Okay. I haven't seen any people experimenting with dual clip loader workflows for sdxl, so I thought it was a single.

bitter hearth Oct 29, 2024, 7:29 PM

#

SD 1.5 can use T5 using ELLA

#

as a weird bonus

sacred jewel Oct 29, 2024, 7:32 PM

#

signal shuttle Bro is holding a gun to the models head with that "(back of a hand:2.5)"

I was like "I MEAN IT, DO NOT TEST ME! HAVE :5 READY!"

#

I take a Mythbusters attitude. If it doesn't burn on its own, I will MAKE IT burn.

red nacelle Oct 29, 2024, 7:34 PM

#

signal shuttle And more trainable too

is it tune'able tho? like what are the requirements to tune 3.5L? in terms of memory only lol

#

not mentioning datasets because somehow people find/make those which is lowkey crazy to think about

digital thorn Oct 29, 2024, 7:36 PM

#

If lora training works this time I'm sure that it will be a really good refiner model to use...at least for my graphic card. 0.25 denoise for the image to the right. Just a quick test

muted dove Oct 29, 2024, 7:40 PM

#

sacred jewel SD3.5L original vs. Upscaled result through Medium (Lykon's upscale workflow)

It smoothed the skin detail away.

cunning mesa Oct 29, 2024, 7:43 PM

#

Also lost a thumb.

sacred jewel Oct 29, 2024, 7:43 PM

#

muted dove It smoothed the skin detail away.

across the board, it looks smoother overall.

digital thorn Oct 29, 2024, 7:48 PM

#

cunning mesa Also lost a thumb.

Depends on how good we can train loras for anatomy of course. If it's even trainable...I'll believe that when I see it though

cunning mesa Oct 29, 2024, 7:50 PM

#

SD3.5M is probably a good test bed for training if nothing else.

frail shoal Oct 29, 2024, 7:51 PM

#

did you guys have any luck inpainting with sd3.5 medium ? Results get completely squashed. worked well as a refiner but inpainting not at all. I'm wondering if i'm doing something wrong

bitter hearth Oct 29, 2024, 7:51 PM

#

are you using a stochastic sampler or a deterministic one?

#

for inpainting stochasticity is the most important thing

frail shoal Oct 29, 2024, 7:52 PM

#

using the suggested sampler dpmpp_2m with sgm uniform scheduler

bitter hearth Oct 29, 2024, 7:54 PM

#

would suggest trying the node version of dpmpp_2s, the one that lets you adjust eta and s_noise
keep s_noise at 1 and set eta as high as you can without the image breaking
that can help a lot
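A small sketch of what `eta` controls in an ancestral-style (stochastic) step. It uses the classic k-diffusion-style split of the target noise level into a deterministic part and re-injected noise. Treat it as an illustration: samplers for flow-matching models like SD3.5 may use a different formula, and `ancestral_step` is a hypothetical name.

```python
import math

def ancestral_step(sigma_from, sigma_to, eta=1.0):
    """Return (sigma_down, sigma_up) for one step from sigma_from to sigma_to.

    sigma_down: noise level reached by the deterministic part of the step.
    sigma_up:   std of fresh noise added afterwards (scaled further by s_noise).
    """
    if eta == 0:
        return sigma_to, 0.0  # fully deterministic, no noise re-injected
    variance = sigma_to**2 * (sigma_from**2 - sigma_to**2) / sigma_from**2
    sigma_up = min(sigma_to, eta * math.sqrt(variance))
    sigma_down = math.sqrt(max(0.0, sigma_to**2 - sigma_up**2))
    return sigma_down, sigma_up

for eta in (0.0, 0.5, 1.0):
    down, up = ancestral_step(2.0, 1.0, eta)
    print(f"eta={eta}: sigma_down={down:.3f}, sigma_up={up:.3f}")
```

Higher `eta` means the step denoises further and then adds more fresh noise back. That is the stochasticity being recommended for inpainting, and it fits the advice to raise eta until the image breaks.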

frail shoal Oct 29, 2024, 7:55 PM

#

it was supposed to be a face

frail shoal Oct 29, 2024, 7:55 PM

#

bitter hearth would suggest trying the node version of dpmpp_2s, the one that lets you adjust ...

will test the behaviour with large to see if it's the same

bitter hearth Oct 29, 2024, 7:59 PM

#

it should give you higher image quality as well as helping inpainting

#

the main downside is stochastic samplers need more steps

slender fox Oct 29, 2024, 8:02 PM

#

SD 3.5M testing after work, pretty much random stuff I wanted to try. I guess it could have many uses despite of several things that don't seem to be working with it.

#

#

gusty hamlet Oct 29, 2024, 8:06 PM

#

SD 3.5 Medium, M2 Mac Pro

bitter hearth Oct 29, 2024, 8:07 PM

#

slender fox

by far the best set of images I have seen from SD 3.5M
really good job

#

I think with a bit more fine tune it has potential

frail shoal Oct 29, 2024, 8:08 PM

#

bitter hearth it should give you higher image quality as well as helping inpainting

sd3 medium had inpainting problems as well. But the 3.5m version seems way worse. SD3.5 large seems to be working a bit better, but not good at all. 30 steps here, 5 cfg, 0.4 denoise

frail shoal Oct 29, 2024, 8:08 PM

#

slender fox SD 3.5M testing after work, pretty much random stuff I wanted to try. I guess it...

can you share some settings ?

bitter hearth Oct 29, 2024, 8:09 PM

#

not sure if I could convince you to try more steps
inpainting is a hard task, a lot of steps can help a lot

frail shoal Oct 29, 2024, 8:09 PM

#

bitter hearth not sure if I could convince you to try more steps inpainting is a hard task, a ...

how many steps for inpainting ?

bitter hearth Oct 29, 2024, 8:10 PM

#

I would have done 100-150, but maybe 40-60 would be ok?

slender fox Oct 29, 2024, 8:11 PM

#

frail shoal can you share some settings ?

I don't think I had any special settings. Tried different samplers and schedulers, steps from 20 to 40. CFG from 3 to 6.

bitter hearth Oct 29, 2024, 8:11 PM

#

if you want to do like 30 steps then the stochastic sampler I recommended might be worse
for low steps, Deis and UniPC can be good

frail shoal Oct 29, 2024, 8:11 PM

#

bitter hearth not sure if I could convince you to try more steps inpainting is a hard task, a ...

well with flux, sdxl and sd 1.5 you don't need more steps. But i hope it's a settings problem. Generating images seems fine, step swap as well (refiner), but inpainting i'm getting horrible results and low quality. Wondering what might be the culprit. I will try more steps to see

frail shoal Oct 29, 2024, 8:12 PM

#

slender fox I don't think I had any special settings. Tried different samplers and scheduler...

ah you are not even using the new slg node

frail shoal Oct 29, 2024, 8:12 PM

#

slender fox I don't think I had any special settings. Tried different samplers and scheduler...

very nice results however

bitter hearth Oct 29, 2024, 8:13 PM

#

I agree flux was able to inpaint with less steps
even Schnell could inpaint lol

dim geyser Oct 29, 2024, 8:13 PM

#

my macbook pro did fine with SD3.5L but it really isn't liking SD3.5M. all the results are coming out messed up.

slender fox Oct 29, 2024, 8:13 PM

#

frail shoal very nice results however

I did generate 1000+ images at one go, and I may be picky about what is OK... so there is that too 😄

frail shoal Oct 29, 2024, 8:14 PM

#

slender fox I did generate 1000+ images at one go, and I may be picky about what is OK... so...

but in terms of image quality is very nice

slender fox Oct 29, 2024, 8:14 PM

#

frail shoal but in terms of image quality is very nice

Yes, I did tweak the prompts for several hours, generating images doesn't take long with decent GPU.

#

With SD 3.5 medium.

bitter hearth Oct 29, 2024, 8:15 PM

#

ah I didn't realise these were with euler
so with a stronger sampler you could get some more quality also

frail shoal Oct 29, 2024, 8:19 PM

#

bitter hearth ah I didn't realise these were with euler so with a stronger sampler you could g...

what stronger sampler ?

bitter hearth Oct 29, 2024, 8:22 PM

#

euler is the weakest out of any of the default comfy samplers
for deterministic, DPM++ 2M, Uni PC and Deis are particularly good
for stochastic, DPM++ 2M SDE, 3M SDE and 2SA are good, although only 2SA works with Flux and SD 3.5
you can get better samplers than those default ones but they require custom nodes

frail shoal Oct 29, 2024, 8:22 PM

#

bitter hearth euler is the weakest out of any of the default comfy samplers for deterministic,...

what are those better samplers, i have custom nodes for samplers

bitter hearth Oct 29, 2024, 8:24 PM

#

Clownshark's node pack gives lots of stochastic samplers that work with Flux and SD 3.5
https://github.com/ClownsharkBatwing/RES4LYF
Sampler RK is the latest node as far as I know, and its got a few to choose from

#

if you put Eta to 0 then they run in deterministic mode

quaint flare Oct 29, 2024, 8:28 PM

#

digital thorn If lora training works this time I'm sure that it will be a really good refiner ...

What has happened to her thumb?

digital thorn Oct 29, 2024, 8:51 PM

#

quaint flare What has happened to her thumb?

Yeah, bad hands in Medium. Like I said, if lora training works it might have a brighter future than 3.0 model.

vale oyster Oct 29, 2024, 8:58 PM

#

I gotta ask, as I just had to hit the wrong setting in my BIOS so my PC is in no post mode atm, as it power cycles.

So I need to rely on a cloud provider -- are there any good big names as of recently for 3.5 and flux?

#

I was looking @ modal.. But is runpod still an actively worked on cloud gpu provider?

violet escarp Oct 29, 2024, 9:14 PM

#

digital thorn Yeah, bad hands in Medium. Like I said, if lora training works it might have a b...

I have a feeling it's going to be painful to finetune sd3.5 medium and people are just going to go back to large. I think the architecture makes it so you need a high parameter count to train efficiently. Flux is a good example. It's as big as it is because that's the cheapest to train, even if it sounds counter-intuitive.

hallow lion Oct 29, 2024, 9:14 PM

#

so medium looking good huh

#

much faster and higher than 1 megapixel

pseudo owl Oct 29, 2024, 9:15 PM

#

yeah anatomy isn't that great but nice for its size, great for people with weak gpus.

hallow lion Oct 29, 2024, 9:16 PM

#

hypemad

real terrace Oct 29, 2024, 9:16 PM

#

2:54 generation time here for 823x1152

#

In original workflow clip3 was fp16

#

#

Really like the details, they need upscaling but pretty interesting to work on

dusky thistle Oct 29, 2024, 10:33 PM

#

dim geyser Oct 29, 2024, 10:34 PM

#

vale oyster I gotta ask, as I just had to hit the wrong setting in my BIOS so my PC is in no...

SD's introducing post mentions the Stability AI API, Replicate, & DeepInfra

dusky thistle Oct 29, 2024, 10:44 PM

#

patent acorn Oct 29, 2024, 10:44 PM

#

the post-falling down the stairs

dense blade Oct 29, 2024, 10:52 PM

#

What's your performance difference between SD 3.5M and L? I use SD 3.5M (default) and SD 3.5L (Q8) with the following speeds: 3.96s/it (M), 7.18s/it (L). Both use the same settings: shift 3, 40 steps, 4.5 cfg, dpmpp_2m, sgm_uniform. I expected to get better performance from the M version.

#

My env: Linux, AMD Radeon RX 6700 XT (12GB), pytorch 2.5.0+rocm6.1
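
For scale, the per-iteration speeds quoted above can be turned into per-image sampling times at 40 steps. This is only arithmetic on the numbers in the message; it ignores text encoding and VAE decode time, and the dictionary labels are just illustrative names.

```python
# Per-image sampling time from the quoted speeds (seconds per iteration).
STEPS = 40
speeds = {"SD 3.5M (default)": 3.96, "SD 3.5L (Q8)": 7.18}

# Total sampling time per image for each model.
totals = {name: s_per_it * STEPS for name, s_per_it in speeds.items()}
for name, seconds in totals.items():
    print(f"{name}: {seconds:.1f} s (~{seconds / 60:.1f} min)")

# How much faster Medium is than Large per step on this setup.
speedup = speeds["SD 3.5L (Q8)"] / speeds["SD 3.5M (default)"]
print(f"Medium is about {speedup:.2f}x faster per step")
```

So on this card Medium is a bit under twice as fast per step, even though it has far fewer parameters than Large.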

sullen moss Oct 29, 2024, 11:38 PM

#

Hmm

hallow lion Oct 30, 2024, 12:01 AM

#

Red Panda?

#

hypemad

#

So basically we have entered singularity when a new ai model comes out every day.

craggy crest Oct 30, 2024, 12:13 AM

#

hallow lion Red Panda?

vaporware

winged seal Oct 30, 2024, 12:32 AM

#

sullen moss Hmm

I have done a few hundred images of ranking. Red panda seems to have a little more variation in composition, mixed with really bad Dall E 3/MJV6 "Aesthetislop" which people look at and go "oo pretty" cause there is an unnecessary amount of contrast/noise/"detail"

#

Examples of aesthetislop, where there is just noise and "detail" everywhere to try and get the same reaction as jingling keys in front of a baby's eyes lol

#

I get why people like it, but its just all noise/nonsensical detail over anything logical or visually pleasing in a toned down way

hallow lion Oct 30, 2024, 12:39 AM

#

fast food for the eyes

winged seal Oct 30, 2024, 12:40 AM

#

pretty much haha

#

just overwhelm the senses so people don't take a look closer and realize its all meaningless noise and nonsensical details/tones

hallow lion Oct 30, 2024, 12:41 AM

#

i like the new word aesthetislop

winged seal Oct 30, 2024, 12:41 AM

#

A lot of people who do the head to head comparisons on that space go for "oo pretty" within 2.5 seconds of looking instead of taking in the composition, stylistic variation, or interpretation of the prompt

winged seal Oct 30, 2024, 12:49 AM

#

hallow lion i like the new word aesthetislop

my friends in my research group have been using it for nearly 2 years now haha. It really does best explain the look haha

sacred jewel Oct 30, 2024, 12:51 AM

#

Zombiecore LoRA - Match made in heaven 🤭

cedar vortex Oct 30, 2024, 1:08 AM

#

winged seal Oct 30, 2024, 1:21 AM

#

cedar vortex

goodness, more baked than snoopdog

craggy crest Oct 30, 2024, 2:02 AM

#

winged seal Examples of aesthetislop, where there is just noise and "detail" everywhere to t...

it's also vaporware right now

runic tusk Oct 30, 2024, 2:03 AM

#

winged seal Oct 30, 2024, 2:09 AM

#

craggy crest it's also vaporware right now

we shall see, we don't even know what model it is

#

Blueberry was around for a bit, and now you can use it

#

I am just curious who its made by

dusky thistle Oct 30, 2024, 2:10 AM

#

craggy crest Oct 30, 2024, 2:11 AM

#

winged seal we shall see, we don't even know what model it is

we also don't know if the images you're voting for were created by something new, or something old with a new name. we know nothing about it at all

winged seal Oct 30, 2024, 2:11 AM

#

craggy crest we also don't know if the images you're voting for were created by something new...

you mean for redpanda?

#

yeah, we know nothing about redpanda as of now

craggy crest Oct 30, 2024, 2:13 AM

#

winged seal you mean for redpanda?

yeah. it's vaporware. the images could have been created with anything. once we get a demo, and know who's behind it, then it's time to worry about it

winged seal Oct 30, 2024, 2:13 AM

#

craggy crest yeah. it's vaporware. the images could have been created with anything. once we ...

I mean, you are saying pretty obvious stuff lol

#

but yeah, curious to see who made it

#

last time it was a new Black Forest Labs model, so we will see

dusky thistle Oct 30, 2024, 2:15 AM

#

winged seal Oct 30, 2024, 2:16 AM

#

We have been due for a new Dalle4 for a long time, honestly. Dalle 3 was a prompt adherence champion, but my god did it look abysmal

errant dust Oct 30, 2024, 2:22 AM

#

We'll agree to disagree.

#

In any case, for those awaiting it. The GGUF build of SD3.5 Medium is out

#

https://huggingface.co/city96/stable-diffusion-3.5-medium-gguf

city96/stable-diffusion-3.5-medium-gguf · Hugging Face

sacred jewel Oct 30, 2024, 2:40 AM

#

sullen moss Hmm

I can probably make one of those charts in Photoshop in :30 seconds... I'll call mine Hawk Tuah and give it a score of 2001, a rate of 95% and a selections # of 76871 and profit 😎

errant dust Oct 30, 2024, 2:50 AM

#

What is Red Panda?

real terrace Oct 30, 2024, 2:51 AM

#

errant dust We'll agree to disagree.

Hi excuse me, how do you load those models in Comfy?

cobalt moon Oct 30, 2024, 2:53 AM

#

errant dust What is Red Panda?

A mysterious AI model that no one has a single clue about where it came from

#

It just appeared in Artificial Analysis

errant dust Oct 30, 2024, 2:54 AM

#

real terrace Hi excuse me, how do you load those models in Comfy?

GGUF?

real terrace Oct 30, 2024, 2:54 AM

#

yes

errant dust Oct 30, 2024, 2:55 AM

#

cobalt moon Oct 30, 2024, 2:55 AM

#

I think it's probably a codename

errant dust Oct 30, 2024, 2:56 AM

#

Workflow is there. You need to install the GGUF extensions and make sure Comfy is updated

cobalt moon Oct 30, 2024, 2:56 AM

#

like Flux's codename "Blueberry"

errant dust Oct 30, 2024, 2:56 AM

#

https://github.com/city96/ComfyUI-GGUF

GGUF Quantization support for native ComfyUI models - city96/ComfyUI-GGUF

real terrace Oct 30, 2024, 2:57 AM

#

errant dust

great thanks, I found the missing nodes there

real terrace Oct 30, 2024, 2:59 AM

#

errant dust

Also I see now that the text encoder can be quantized too

#

what is the SD3-5 vae? I didn't use none before I think

craggy crest Oct 30, 2024, 3:04 AM

#

@dusky thistle comfyUI has a new scheduler in it

craggy crest Oct 30, 2024, 3:05 AM

#

real terrace Also I see now that the text encoder can be quantized too

all models can be quantized. that doesn't mean they should be

real terrace Oct 30, 2024, 3:05 AM

#

craggy crest all models can be quantized. that doesn't mean they should be

great, yes I was kind of dazzled about so much quantization

#

at this point I load some model and it worked somehow

#

#

I don't even remember where did I get it

#

The only thing about these models is that they take so much I don't fiddle or play too much with the parameters, I just set prompts and img size

dusky thistle Oct 30, 2024, 3:19 AM

#

craggy crest @dusky thistle comfyUI has a new scheduler in it

what scheduler

real terrace Oct 30, 2024, 3:20 AM

#

your gun is backwards sir

#

#

hallow lion Oct 30, 2024, 3:48 AM

#

real terrace your gun is backwards sir

lmao

real terrace Oct 30, 2024, 3:58 AM

#

craggy crest Oct 30, 2024, 4:00 AM

#

dusky thistle what scheduler

linear_quadratic

real terrace Oct 30, 2024, 4:02 AM

#

#

they need refinement

#

#

#

#

sh,don't shout the prompt bro

#

god dammit

sterile pendant Oct 30, 2024, 4:52 AM

#

craggy crest linear_quadratic

Afaik, it's pretty much only used with mochi or w/e that model is called. I haven't tested to see if it works at all with other models

#

Off the top of my head, it should likely look like something between an sgm_uniform and simple sigma curve

#

ahh nvm, looks like they use some wonky exponent for it. looks like it spends a lot of time shifting things around a bunch, then rapidly drops into fine details

#

it's definitely meant for flow models only. sdxl models usually have concave curves and it didn't respond to that when I tried to run this with an sdxl model.

#

either way, it's likely not all that useful for non-mochi models.

dusky thistle Oct 30, 2024, 5:23 AM

#

#

mortal mesa Oct 30, 2024, 5:30 AM

#

craggy crest Oct 30, 2024, 5:30 AM

#

sterile pendant either way, it's likely not all that useful for non-mochi models.

Sd3 is a flow matching model

dusky thistle Oct 30, 2024, 5:36 AM

#

#

#

#

#

sterile pendant Oct 30, 2024, 5:39 AM

#

craggy crest Sd3 is a flow matching model

It is, but that kind of curve is not what the model was trained on. So sure, it will work, but it's not optimal. It will spend far less time resolving medium and fine details using that linear quadratic curve

#

Think of it like model shifting to an extreme value away from the norm

dusky thistle Oct 30, 2024, 5:49 AM

#

i think beta's been best

#

it's got just enough time spent on the fine details

#

craggy crest Oct 30, 2024, 5:50 AM

#

sterile pendant It is, but that kind of curve is not what the model was trained on. So sure, it ...

Shrug

dusky thistle Oct 30, 2024, 5:56 AM

#

mortal mesa Oct 30, 2024, 6:00 AM

#

dusky thistle Oct 30, 2024, 6:06 AM

#

papa toilet

severe phoenix Oct 30, 2024, 6:24 AM

#

please can anyone explain to me what these two clips do. is one more powerful than the other?

sterile pendant Oct 30, 2024, 6:26 AM

#

dusky thistle i think beta's been best

Yeah beta has been my go to for a while now

dusky thistle Oct 30, 2024, 6:28 AM

#

severe phoenix please can anyone explain to me what these two clips do. is one more powerful th...

T5 and clip L, T5 will be for sentences, L for danbooru style tags, T5 is way way more influential

mortal mesa Oct 30, 2024, 6:38 AM

#

severe phoenix Oct 30, 2024, 6:45 AM

#

dusky thistle T5 and clip L, T5 will be for sentences, L for danbooru style tags, T5 is way wa...

ahh thanks, what kinda tags are these?

craggy crest Oct 30, 2024, 6:45 AM

#

severe phoenix please can anyone explain to me what these two clips do. is one more powerful th...

here's the architecture

#

SD3 uses all three encoders, flux only uses two

#

clip_g is the workhorse - it drives the entire thing

#

clip_l and t5-xxl work along with it

#

t5xxl gets your detail rich, narrative, natural language prompt

#

clip_l gets your artsy, ambient, background, fine details prompt

#

clip_g gets the no-nonsense, just-the-facts-mam information about what the image is

#

IF you do not use a node that allows you to put in a separate prompt for each encoder, then the text you put into the positive prompt will be given to each encoder anyway, and they will fight with each other - they aren't in sync
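
The fallback behaviour described above can be sketched in a few lines. This is not ComfyUI code: `EncoderPrompts` and `resolve_prompts` are hypothetical names made up for illustration, and the only thing modelled is "each encoder gets its own text if one was given, otherwise the shared positive prompt".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EncoderPrompts:
    """Hypothetical container for per-encoder prompt text (not a ComfyUI API)."""
    clip_l: Optional[str] = None
    clip_g: Optional[str] = None
    t5xxl: Optional[str] = None


def resolve_prompts(shared: str, split: Optional[EncoderPrompts] = None) -> dict:
    """Return the text each encoder would receive.

    Any encoder without its own prompt falls back to the shared positive prompt,
    which is the "same text goes to all three" case from the chat.
    """
    split = split or EncoderPrompts()
    return {
        "clip_l": split.clip_l if split.clip_l is not None else shared,
        "clip_g": split.clip_g if split.clip_g is not None else shared,
        "t5xxl": split.t5xxl if split.t5xxl is not None else shared,
    }


# Single prompt: all three encoders see identical text.
print(resolve_prompts("apocalyptic ruins; pink tree growing flames;"))

# Split prompts: short facts for clip_g, a full sentence for t5xxl.
print(resolve_prompts(
    "apocalyptic ruins; pink tree growing flames;",
    EncoderPrompts(t5xxl="the scene is set in apocalyptic ruins."),
))
```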

real terrace Oct 30, 2024, 6:50 AM

#

craggy crest IF you do not use a node that allows you to put in a separate prompt for each en...

nice explanation, I would like to experiment with that concept, it seems to be really important if they work like that

#

for example in this img the text that appeared was just some text to reinforce the idea, not some main title to write

craggy crest Oct 30, 2024, 6:51 AM

#

real terrace nice explanation, I would like to experiment with that concept, it seems to be r...

here's a workflow to start with. this has the SD3-2b-medium model loaded, but you can just change that to SD3.5 if you wish

📎 3-encoder-SD3-workflow.json

real terrace Oct 30, 2024, 6:51 AM

#

it would be in the clip_l I guess

craggy crest Oct 30, 2024, 6:52 AM

#

real terrace it would be in the clip_l I guess

the encoders alone do lousy text. you need at least two of them for it to be readable

real terrace Oct 30, 2024, 6:53 AM

#

I wonder how much difference you would find between a single prompt and these 3 clip prompts, has anyone compared them?

severe phoenix Oct 30, 2024, 6:53 AM

#

craggy crest IF you do not use a node that allows you to put in a separate prompt for each en...

damn thanks for t5xxl level explanation 😃🙏

real terrace Oct 30, 2024, 6:53 AM

#

craggy crest the encoders alone do lousy text. you need at least two of them for it to be rea...

the idea is that I didn't want that text to be readable, it was just some text to reinforce the idea of the image

craggy crest Oct 30, 2024, 6:53 AM

#

this is an SD3.5 Large workflow i'm working with right now, it also has the node for all three encoders. to get the workflow: 1. click the image to open in viewer. 2. click the Open in browser words. 3. right click, save as. then drag and drop into comfyUI

craggy crest Oct 30, 2024, 6:54 AM

#

real terrace the idea is that I didn't want that text to be readable, it was just some text ...

so type random letters: here we see a sign with the word "swrenwtw rerwun ewwn"

#

the AI will try to guess what it might be

real terrace Oct 30, 2024, 6:55 AM

#

craggy crest so type random letters: here we see a sign with the word "swrenwtw rerwun ewwn...

there wasn't a sign mentioned, it was "advertisement" so it make a text and put something in the prompt.

craggy crest Oct 30, 2024, 6:55 AM

#

real terrace there wasn't a sign mentioned, it was "advertisement" so it make a text and pu...

just an example of what you'd do for random text

real terrace Oct 30, 2024, 6:59 AM

#

so clip_g would be the classic SD prompt, t5xxl would be the "ChatGPT" type prompt, clip_l would be booru tags, like "trending on artstation" ?

craggy crest Oct 30, 2024, 7:01 AM

#

real terrace so clip g would be the classic SD prompt, clip t5xx would the "chat GTP" type ...

i would advise you to never use trending on artstation unless you just want random dice rolls for noise. and most of the other tags like it. they are just noise.

#

here

#

clip_l: trees, warped, twisted, weathered, ruined, cracked, flames, fire, smoky; atmospheric lighting; peter mohrbacher, james jean, william morris, ernst haeckel, zaha hadid
clip_g: apocalyptic ruins; pink tree growing flames;
t5xxl: the scene is set in apocalyptic ruins. in the center, we see pink tree growing flames rather than leaves; All that is left of the buildings are parts of walls with crumbling bricks

#

=======

#

clip_l: thick impasto painting, heavy canvas texture, noticeable brush strokes
clip_g: Stratocumulus; Renaissance Island beach sunrise, maya beach by artist "Paul Dougherty", by artist "Nicolette Ceccoli"
t5xxl: the scene is a stunning sunrise at the beach. cumulostratus clouds cover the sky with red and gold, while transparent green water and white foam breaks on the light brown sand,

#

=====

#

there are 2 sets of prompts to play with

#

@dusky thistle comparisons. model: sd3.5 large sampler: euler_ancestral schedulers: on the left, linear_quadratic. on the right, beta workflow is in the images

dusky thistle Oct 30, 2024, 7:05 AM

#

that looks real good on the right

#

what's interesting i've noticed with these flow models is it's like it just needs to set the trajectory real carefully

#

almost like aiming a rifle

craggy crest Oct 30, 2024, 7:05 AM

#

it does. i'm working on that sampler/scheduler sheet - should be done tomorrow

dusky thistle Oct 30, 2024, 7:05 AM

#

and then pow, it can take massive steps

craggy crest Oct 30, 2024, 7:06 AM

#

dusky thistle and then pow, it can take massive steps

did you grab the sd3.5 medium workflow with SLG?

dusky thistle Oct 30, 2024, 7:06 AM

#

no idea, been coding all night

craggy crest Oct 30, 2024, 7:06 AM

#

skipping layers, like matteo did with flux back there. skipping the blocks

dusky thistle Oct 30, 2024, 7:06 AM

#

just got a brutally, just, oh-my-gawd-don't-want-to-admit-it bug fixed that's been hanging around for weeks

#

cool yea sounds like PAG with zeroing out the V

craggy crest Oct 30, 2024, 7:07 AM

#

dusky thistle just got a brutally, just, oh-my-gawd-don't-want-to-admit-it bug fixed that's be...

https://huggingface.co/stabilityai/stable-diffusion-3.5-medium

stabilityai/stable-diffusion-3.5-medium · Hugging Face

#

files are there when you're ready

craggy crest Oct 30, 2024, 7:08 AM

#

dusky thistle cool yea sounds like PAG with zeroing out the V

do you remember that really long node matteo did for flux, where he set up every block so you could adjust them?

#

and then he talked about how certain blocks didn't do anything and we played around with not using them?

mortal mesa Oct 30, 2024, 7:09 AM

#

severe phoenix Oct 30, 2024, 7:09 AM

#

craggy crest i would advise you to never use trending on artstation unless you just want rand...

ohhh so clipl recognizes artists and styles and t5 is just straight up natural language. woow i'll try this thanks!

dusky thistle Oct 30, 2024, 7:10 AM

#

it's great having comprehensive nodes like that

#

i make shit like this all the time

craggy crest Oct 30, 2024, 7:11 AM

#

dusky thistle it's great having comprehensive nodes like that

okay, that's what SLG is. only not nearly as long and annoying a node as those

#

there are two workflows on that repo, one with, one with out and you should probably DM lykon about that setup you have and talk to him

dusky thistle Oct 30, 2024, 7:11 AM

#

sounds good

#

gonna have to go pass out soon here but will do

craggy crest Oct 30, 2024, 7:12 AM

#

and i want your workflow

#

i can't find the one matteo released

dusky thistle Oct 30, 2024, 7:12 AM

#

3rd order RES is looking pretty nice with ultracascade

#

which WF?

craggy crest Oct 30, 2024, 7:12 AM

#

he's got teeny wings coming out of his neck

dusky thistle Oct 30, 2024, 7:12 AM

#

hahah yeah

craggy crest Oct 30, 2024, 7:12 AM

#

dusky thistle which WF?

the one that screen shot was from

#

i'll bet those nodes aren't available, are they?

craggy crest Oct 30, 2024, 7:14 AM

#

dusky thistle hahah yeah

add the term symmetrical to the prompt as the first term

dusky thistle Oct 30, 2024, 7:14 AM

#

they should be

#

https://github.com/ClownsharkBatwing/UltraCascade

#

i think they're in there, if not i can get em to ya

craggy crest Oct 30, 2024, 7:16 AM

#

dusky thistle i think they're in there, if not i can get em to ya

i'll go poke around and let you know if i can find them or not

#

DM lykon, and go crash

sacred geode Oct 30, 2024, 7:44 AM

#

viral plaza Oct 30, 2024, 8:09 AM

#

@lavish osprey @graceful osprey SD 3.5 Medium modelspec resolution is 1440x1440, however the model struggles to run this resolution natively. It's fine at 1024x1024, and I bet it can upscale to 1440 happily, but native from the ground up at 1440 doesn't seem to work too well. Maybe with the skip layer guidance, a longer prompt, a luckier seed, and all that you can push it into working? idk, even then I feel like recommending 768 or 1024 with an upscale is smarter than recommending a native start at 1440.

#

that's "a photo of a cat", and the staircase it's inside of is the early steps being completely corrupted, with the latter detail steps managing to refine it into something that almost looks like a real structure

bitter hearth Oct 30, 2024, 8:12 AM

#

Flux doesn't do well at the advertised top resolution of 2048 either

#

unless you are using I-max

turbid grotto Oct 30, 2024, 8:12 AM

#

viral plaza @lavish osprey @graceful osprey SD 3.5 Medium modelspec resolution i...

i was able to generate FullHD

#

but not first try

viral plaza Oct 30, 2024, 8:12 AM

#

bitter hearth Flux doesn't do well at the advertised top resolution of 2048 either

To be clear, I'm not discussing the advertised top resolution, I'm discussing the defined default resolution, which inside of the released SD3.5 Model is set at its top, but should not be

#

that's defined in the metadata header of the model if you're unfamiliar with modelspec

#

per https://github.com/Stability-AI/ModelSpec?tab=readme-ov-file#inferencing-tools-and-uis ("Apply relevant keys where logical (eg resolution in image models should be applied as the default resolution for images made with that model)"), that key is intended to be the default usage resolution
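
For anyone who wants to look at that header themselves, here is a minimal sketch using only the Python standard library. It relies on the safetensors layout (an 8-byte little-endian length followed by a JSON header with an optional `__metadata__` map). The key name `modelspec.resolution` is taken from the ModelSpec naming scheme and is an assumption here, as are the helper names and the example file name.

```python
import json
import struct


def read_safetensors_metadata(path: str) -> dict:
    """Return the __metadata__ map from a .safetensors header (empty if absent)."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    return header.get("__metadata__", {})


def default_resolution(path: str):
    """Look up the assumed ModelSpec resolution key, e.g. '1440x1440', or None."""
    return read_safetensors_metadata(path).get("modelspec.resolution")


# Usage (file name is a placeholder):
# print(default_resolution("sd3.5_medium.safetensors"))
```

Only the header is read, so this is quick even on multi-gigabyte checkpoints.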

bitter hearth Oct 30, 2024, 8:14 AM

#

ah thanks
I knew there was metadata inside .safetensors files but I have never looked at them

#

okay yeah you have a good point then

noble coyote Oct 30, 2024, 8:21 AM

#

Clip_l, Clip_g, Clip_round_the_ear - what does this one do?! 🥳

muted dove Oct 30, 2024, 8:30 AM

#

noble coyote Clip_l, Clip_g, Clip_round_the_ear - what does this one do?! 🥳

With questions like that, you'll soon find out my lad! 😄

#

Even the strongest have their weak days.

bitter hearth Oct 30, 2024, 8:33 AM

#

on Civit there is someone adding Flan-t5-xxl to Flux and SD3.5

patent prism Oct 30, 2024, 8:33 AM

#

a cat

muted dove Oct 30, 2024, 8:33 AM

#

I doubt it's a cat that's adding it.

muted dove Oct 30, 2024, 8:58 AM

#

real terrace Oct 30, 2024, 9:15 AM

#

bitter hearth on Civit there is someone adding Flan-t5-xxl to Flux and SD3.5

so what are the best clips (!?)

#

I'm trying this

real terrace Oct 30, 2024, 9:20 AM

#

craggy crest clip_l: trees, warped, twisted, weathered, ruined, cracked, flames, fire, smoky;...

ty, I have to try that

meager patio Oct 30, 2024, 9:22 AM

#

Hello everyone, I really need help. I was struggling all night, installing a stable diffusion, literally for one job, and right now I'm launching it for the first time and it drops what's in the picture, talking about old drivers. I can't update it because this is the limit for the video card.

I want to ask someone to do literally one job. I'll be extremely grateful if someone responds.

real terrace Oct 30, 2024, 9:22 AM

#

This large model took as much as the GGUF SD3.5 medium q5_0

#

bitter hearth Oct 30, 2024, 9:23 AM

#

the big speed up is only gonna come once you have a setup that fits within your VRAM

#

regarding clips, some people like this for Clip-L https://huggingface.co/zer0int/CLIP-GmP-ViT-L-14
although I am not sure about it

sterile pendant Oct 30, 2024, 9:24 AM

#

Another big key difference about sd3.5 medium is that the t5 should contain no more than 256 tokens

bitter hearth Oct 30, 2024, 9:24 AM

#

Flan-t5-xxl for T5 I am more sure about

real terrace Oct 30, 2024, 9:24 AM

#

bitter hearth the big speed up is only gonna come once you have a setup that fits within your ...

what setup might that be?

bitter hearth Oct 30, 2024, 9:25 AM

#

make sure the text encoders get put onto the CPU before the Ksampler starts
and try Q4_0

#

or just offload the text encoders entirely

#

after text encode
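
A rough way to check whether a given quant could fit in VRAM is to estimate the weight size from bits per weight. The sketch below uses the block layouts of the simple GGUF quants (32 weights per block plus an fp16 scale). The parameter counts are assumptions not stated in this chat, real GGUF files keep some tensors at higher precision, and activations, text encoders and the VAE need memory on top, so treat the output as a lower bound.

```python
# Approximate bits per weight: each block of 32 weights stores the quantized
# values plus one fp16 scale (Q5_0 also stores 32 extra high bits per block).
BITS_PER_WEIGHT = {"F16": 16.0, "Q8_0": 8.5, "Q5_0": 5.5, "Q4_0": 4.5}

# Assumed parameter counts in billions (not taken from the chat).
MODELS = {"SD3.5 Medium": 2.5, "SD3.5 Large": 8.1}


def weight_size_gb(params_billion: float, quant: str) -> float:
    """Estimated size of the diffusion model weights alone, in GB (10^9 bytes)."""
    total_bits = params_billion * 1e9 * BITS_PER_WEIGHT[quant]
    return total_bits / 8 / 1e9


for model, params in MODELS.items():
    row = ", ".join(f"{q}: {weight_size_gb(params, q):.2f} GB" for q in BITS_PER_WEIGHT)
    print(f"{model} -> {row}")
```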

sterile pendant Oct 30, 2024, 9:26 AM

#

And SLG works well with 3.5 medium. Don't know about large though since medium has the slightly different architecture with extra self attention layers or w/e it was

bitter hearth Oct 30, 2024, 9:26 AM

#

SLG will work with Large and Flux I think
just needs porting and recalibrating

#

its a clever idea, it drops layers for the negative so the structure gets messed up

fossil pagoda Oct 30, 2024, 9:27 AM

#

👁️

241030101902_A_beautiful_mythical_eye_reminiscent_of_the_works_of_WETA_FX_and_Industrial_Ligh__00003_.png

bitter hearth Oct 30, 2024, 9:27 AM

#

since models try to do the opposite of the negative, you end up with better structure

#

for Flux this will require generating a negative, with the 100% speed penalty

#

would be good to try this method with Perp-Neg also
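
The idea above can be written as one extra term on top of ordinary CFG. This is a toy sketch on plain lists, assuming SLG is applied additively; `cond_skipped` (the conditional prediction with some layers skipped) and `slg_scale` are hypothetical names, and the actual node may combine things differently.

```python
def cfg(uncond, cond, scale):
    """Classifier-free guidance: move from the unconditional toward the conditional prediction."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


def skip_layer_guidance(cfg_result, cond, cond_skipped, slg_scale):
    """Push the result away from the degraded (layers-skipped) prediction.

    The degraded prediction plays the role of an extra "negative": the further
    it is from the full conditional prediction, the stronger the correction.
    """
    return [r + slg_scale * (c - s) for r, c, s in zip(cfg_result, cond, cond_skipped)]


# Toy numbers standing in for two values of a model prediction.
uncond = [0.0, 0.0]
cond = [1.0, 2.0]
cond_skipped = [0.5, 2.5]  # same prompt, but with some layers dropped

guided = cfg(uncond, cond, scale=4.0)
guided = skip_layer_guidance(guided, cond, cond_skipped, slg_scale=2.0)
print(guided)
```

Note that the skipped-layer pass is one more model evaluation per step, which is where the speed penalty mentioned above comes from.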

muted dove Oct 30, 2024, 9:44 AM

#

SD3.5L using Google FLAN and Flux refiner.

#

remote holly Oct 30, 2024, 9:57 AM

#

what are the resolutions supported by sd3.5 ?

muted dove Oct 30, 2024, 9:58 AM

#

muted dove Oct 30, 2024, 9:59 AM

#

remote holly what are the resolutions suported by sd3.5 ?

I just use the same as SDXL, which are in the "SD3 Select Latent Resolution" node

#

graceful osprey Oct 30, 2024, 9:59 AM

#

viral plaza @lavish osprey @graceful osprey SD 3.5 Medium modelspec resolution i...

Hmmm weird. 1920*1024 should just work out of box

#

ah but you might need a higher shift

#

like 4~5
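
To see what a higher shift actually changes, here is a small sketch of the time-shift formula commonly used with SD3-style flow-matching models, sigma' = shift * sigma / (1 + (shift - 1) * sigma). The function name and the 5-point grid are illustrative, and this only shows the remapping, not a full scheduler.

```python
def shift_sigma(sigma: float, shift: float) -> float:
    """Remap a flow-matching sigma in [0, 1]; shift=1 leaves it unchanged.

    Larger shift values keep sigma high for longer, so more of the sampling
    steps are spent on the noisy, structure-forming part of the trajectory.
    """
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


# Uniformly spaced sigmas from pure noise (1.0) to clean (0.0).
uniform = [1.0, 0.75, 0.5, 0.25, 0.0]

for shift in (1.0, 3.0, 5.0):
    shifted = [round(shift_sigma(s, shift), 3) for s in uniform]
    print(f"shift={shift}: {shifted}")
```

The endpoints stay fixed at 1 and 0; only the spacing in between moves toward the noisy end as shift increases.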

muted dove Oct 30, 2024, 10:00 AM

#

alpine summit Oct 30, 2024, 10:01 AM

#

viral plaza Oct 30, 2024, 10:03 AM

#

graceful osprey Hmmm weird. 1920*1024 should just work out of box

1920x1024, Seed 1, a photo of a cat, CFG 7, Sigma Shift 3 on left and 5 on right, Sampler Euler scheduler Normal, Steps 20
in both cases while watching live previews it's clear the early steps are failing and the latter steps are trying to recover the corruption into something coherent (in this case a TV screen type of effect on the background, and a cat with a patchy outline lol)

graceful osprey Oct 30, 2024, 10:03 AM

#

Steps 20 - try dpmpp_2m steps50

#

should work - let me try as well

viral plaza Oct 30, 2024, 10:05 AM

#

here's that at the end. The two other images are intermediate previews just to show what it's doing in the early steps

#

for comparison at 1024 the early step previews have a relatively clean coherent image, just without details built yet

bitter hearth Oct 30, 2024, 10:06 AM

#

In Flux I got better results with stochastic samplers
the built in DPM++ 2SA in Comfy has been adapted to work with rectified flow

graceful osprey Oct 30, 2024, 10:06 AM

#

Noisy beginning is kinda expected but should not be this bad

viral plaza Oct 30, 2024, 10:06 AM

#

and also those early previews resemble the final image moreso than the high res

viral plaza Oct 30, 2024, 10:06 AM

#

graceful osprey Noisy beginning is kinda expected but should not be this bad

yee

bitter hearth Oct 30, 2024, 10:09 AM

#

I've been trying to port restart sampler to rectified flow, it might help with these models
its given me the best results in SD 1.5 and SDXL

#

it adds the noise in a special way, its a bit tricky

alpine summit Oct 30, 2024, 10:11 AM

#

remote holly Oct 30, 2024, 10:13 AM

#

what are the schedulers and samplers ?

muted dove Oct 30, 2024, 10:15 AM

#

#

#

#

alpine summit Oct 30, 2024, 10:59 AM

#

remote holly Oct 30, 2024, 11:02 AM

#

does anyone know a prompt system for prompt enhancing?

alpine summit Oct 30, 2024, 11:08 AM

#

#

alpine summit Oct 30, 2024, 11:58 AM

#

noble coyote Oct 30, 2024, 11:59 AM

#

Flux/Florence2 + LoRAs img2img

#

severe phoenix Oct 30, 2024, 12:04 PM

#

muted dove

omg i never knew stuff like this existed! is there one for img2img

turbid grotto Oct 30, 2024, 12:39 PM

#

viral plaza @lavish osprey @graceful osprey SD 3.5 Medium modelspec resolution i...

hi, I found same problem when not using T5.
1,2 - clips only
3,4 - clips + T5

flat oracle Oct 30, 2024, 12:44 PM

#

why is my workflow i used for sd3.5 l not working for sd3.5 m

viral plaza Oct 30, 2024, 12:44 PM

#

oh shit, i'm so used to ignoring T5 with SD3 since it did almost nothing with the SD3.0 models, is it actually needed for SD3.5 Medium now?

flat oracle Oct 30, 2024, 12:44 PM

#

viral plaza oh shit, i'm so used to ignoring T5 with SD3 since it did almost nothing with th...

you can run with 1 text encoder, 2 or all 3.

viral plaza Oct 30, 2024, 12:45 PM

#

flat oracle you can run with 1 text encoder, 2 or all 3.

i know that lol refer to context of the conversation above, it's a much more specific thing

turbid grotto Oct 30, 2024, 12:45 PM

#

viral plaza oh shit, i'm so used to ignoring T5 with SD3 since it did almost nothing with th...

i have been using with no clips too and saw no errors until increased resolution above 1024px

flat oracle Oct 30, 2024, 12:46 PM

#

viral plaza i know that lol refer to context of the conversation above, it's a much more spe...

you asked if t5 is needed. The answer is no. lol

#

By any chance you know why my workflow is not working?

viral plaza Oct 30, 2024, 12:46 PM

#

flat oracle you asked if t5 is needed. The answer is no. lol

again, within a specific context of the conversation above

flat oracle Oct 30, 2024, 12:47 PM

#

viral plaza again, _within a specific context_ of the conversation above

this is an open chat sry dawg. Maybe reply to the message

viral plaza Oct 30, 2024, 12:48 PM

#

turbid grotto hi, I found same problem when not using T5. 1,2 - clips only 3,4 - clips + T5

seed=1, steps=20, cfg=7, CLIPs + T5. Still seems wonked out. Left is 1440x1440, right is 1920x1024

#

oh i left dpm++ on instead of euler, slightly different details with euler but still wonked

turbid grotto Oct 30, 2024, 12:52 PM

#

"a photo of a cat"
seed 1
steps 20
1440x1440
dpmpp_2m
sgm_uniform
cfg 4
fp16 everything

turbid grotto Oct 30, 2024, 12:53 PM

#

viral plaza seed=1, steps=20,cfg=7, CLIPs + T5, Steps=20. Seems still wonked out. left is 14...

maybe too high cfg?

#

hmm there are stripes in the background, but they seem to be gone if I specify a background

viral plaza Oct 30, 2024, 1:21 PM

#

turbid grotto "a photo of a cat" seed 1 steps 20 1440x1440 dpmpp_2m sgm_uniform cfg 4 fp16 ev...

that looks very similar to my results on euler at cfg 7. It's clearly corrupted on the background, and the cat itself doesn't look great (those eyes are barely existent)

viral plaza Oct 30, 2024, 1:22 PM

#

turbid grotto hmm there is stripes in background but seems to be gone if I specify

the specified background reduces the visibility of damage, but that image still looks pretty off

#

i think this is just a case of, the model wasn't trained enough at or above the 1440x1440 res, and the more input you give the more it can compensate and make it work, but if you give minimal input the difference shines aggressively

#

this is a general case with image diffusion models - they are largely self-correcting, so giving it more of anything helps cover issues. More prompt tokens = wider attention = more self-correction. More steps = more actual entire runs of the model = more self-correction

#

(relatedly in the other direction, generating with empty prompts is a great way to suss out implicit biases in a model: whatever it outputs without conditioning will be approximately representative of the types of content it was trained on the most and may have developed a general bias towards)

#

(in that test, SD3.5Medium has a very nice broad range, and seems to favor 768x768 for the most stable generations, even at 1024x1024 with empty input it displays some striping patterns)
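
(A minimal sketch of how the empty-prompt test above could be organised as a sweep. It only builds the list of jobs; the field names are hypothetical and would need mapping onto whatever UI or API actually runs the generations. The default steps and cfg mirror the settings posted earlier in the chat, not any official recommendation.)

```python
from itertools import product

def empty_prompt_jobs(resolutions, seeds, steps=20, cfg=4.0):
    """Build one job per (resolution, seed) pair, every one with an empty prompt."""
    jobs = []
    for (width, height), seed in product(resolutions, seeds):
        jobs.append({
            "prompt": "",          # no conditioning: the model falls back on its own biases
            "width": width,
            "height": height,
            "seed": seed,
            "steps": steps,
            "cfg": cfg,
        })
    return jobs

# Resolutions discussed in the chat: 768 (stable), 1024 (some striping), 1440 (corrupted).
jobs = empty_prompt_jobs([(768, 768), (1024, 1024), (1440, 1440)], seeds=range(1, 5))
print(len(jobs), "jobs; first:", jobs[0])
```

Comparing the outputs per resolution across several seeds makes it easier to tell a resolution problem from one unlucky seed.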

turbid grotto Oct 30, 2024, 1:30 PM

#

viral plaza i think this is just a case of, the model wasn't trained enough at or above the ...

yea there is definitely undertraining in some cases but I do not judge it much as it is base model and will be improved if adopted by community

viral plaza Oct 30, 2024, 1:30 PM

#

yee

turbid grotto Oct 30, 2024, 1:33 PM

#

viral plaza (in that test, SD3.5Medium has a very nice broad range, and seems to favor 768x7...

does that mean the model is unbiased, or is that a bad thing?

viral plaza Oct 30, 2024, 1:34 PM

#

that means model lacks any visible bias, which is good

#

I've seen some models eg continually generate humanoid outputs on empty prompt, indicating that the model was violently overtrained on humans to win on "omg it makes pretty girl" aesthetic evals but otherwise useless as a foundation for anything

#

SD3.5Medium has a very broad range of random outputs from that, indicating it likely hasn't been overly tuned on anything in particular, making it very optimal for a base model for community tuning adoption

#

(noting however that's just a quick n dirty test, not a guarantee, so grain of salt and all that)

#

sd35m promptless outputs

#

maybe a touch of over tendency towards artsy stylings but eh

#

and for comparison, Flux Dev

#

burnt hard on high quality photoreal art

#

that woman included is The Flux Woman who everyone sees constantly, and the community have taken to identifying by way of her specific "butt chin" shape

turbid grotto Oct 30, 2024, 1:49 PM

#

viral plaza I've seen some models eg continually generate humanoid outputs on empty prompt, ...

majority of finetunes sadcat
but it is probably the inevitable future of sd35 to become coherent. I compared sd35 with flux on "lying on the grass" and sd35 often tried to do the hardest thing, like upside down, while flux often chose the easiest route by generating non-upside-down and non-horizontal poses (idk how to name it)

turbid grotto Oct 30, 2024, 1:49 PM

#

viral plaza and for comparison, Flux Dev

yea I had similar result

#

so, you either have a diverse model lacking coherence or a coherent one lacking diversity

viral plaza Oct 30, 2024, 1:50 PM

#

turbid grotto majority of finetunes sadcat but it is probably inevitab...

yeah, i went on whole rants about that during the sd3 launch. no model is good at upside down woman, sd3 was just the one model that tried to do it without asking

viral plaza Oct 30, 2024, 1:52 PM

#

turbid grotto so, you either have diverse model with lacking coherence or coherent with lackin...

i think the ideal option is the diverse capabilities, but with something to avoid it defaulting to things it's bad at. eg an LLM prompt augmenter that takes short prompts and writes long prompts for it that don't suck and give good defaults for things the image model might make silly choices for

#

it's funny how often people have said my discord bot running flux schnell looks better than their flux dev results (or similar comparisons)... the model's worse, but my discord bot has an LLM that extends your prompt, and that does magic
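
(A rough sketch of the prompt-augmenter idea described above, assuming the LLM is reachable as a plain callable that takes a request string and returns text. `augment_prompt`, `SYSTEM_TEMPLATE` and `fake_llm` are made-up names for illustration; this is not the actual bot's code, and the stand-in "LLM" just appends fixed defaults so the example runs offline.)

```python
from typing import Callable

SYSTEM_TEMPLATE = (
    "Rewrite the user's short image prompt as one detailed paragraph. "
    "Keep the subject unchanged, and add sensible defaults for pose, "
    "framing, lighting, and background.\n\nShort prompt: {prompt}"
)

def augment_prompt(prompt: str, llm: Callable[[str], str], min_words: int = 12) -> str:
    """Expand short prompts through `llm`; pass long prompts through untouched."""
    prompt = prompt.strip()
    if len(prompt.split()) >= min_words:
        return prompt  # already detailed enough, do not rewrite the user's wording
    expanded = llm(SYSTEM_TEMPLATE.format(prompt=prompt)).strip()
    return expanded or prompt  # fall back to the original if the LLM returns nothing

def fake_llm(request: str) -> str:
    """Offline stand-in for a real LLM call (hypothetical)."""
    short = request.rsplit("Short prompt: ", 1)[1]
    return f"{short}, upright pose, centered framing, soft daylight, plain background"

print(augment_prompt("a photo of a cat", fake_llm))
```

The point of the defaults is the one made in the chat: steer the image model away from choices it is bad at when the user did not ask for them.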

lavish osprey Oct 30, 2024, 1:56 PM

#

viral plaza burnt hard on high quality photoreal art

not sure if it's just the distillation, but those all look like MJ images. Synthetic data, not just any "photoreal art". Especially the woman "sameface" with the butt chin, that's 100% MJ

lavish osprey Oct 30, 2024, 1:58 PM

#

viral plaza yeah, i went on whole rants about that during the sd3 launch. _no model_ is good...

yeah, another artifact on training on real world data. If you go on photo datasets they're usually using various different angles. Models trained on synth data tend to overfit on upright position or, in general, simple poses.

signal shuttle Oct 30, 2024, 1:59 PM

#

After playing a lot with flux and SD 3.5 large, i began to forget how fast 1s/it was, Man SD 3.5 medium is so fast

noble coyote Oct 30, 2024, 2:00 PM

#

Budget's just finished - penny off a pint of beer!

#

I'm off down the pub!!! 🥳

#

flat oracle Oct 30, 2024, 2:03 PM

#

flat oracle why is my workflow i used for sd3.5 l not working for sd3.5 m

if anyone wonders: got it fixed by updating ComfyUI via Manager.

noble coyote Oct 30, 2024, 2:04 PM

#

<---------------- needs OmniGen on ComfyUI

turbid grotto Oct 30, 2024, 2:04 PM

#

viral plaza it's funny how often people have said my discord bot running flux schnell looks ...

I probably should add some tiny llama to extend prompts

#

I often see people saying sd35 is not cinematic/realistic but that is sooo wrong, they are just used to models that default to a certain style. In that case you have to prompt for it, and you will get it

turbid grotto Oct 30, 2024, 2:07 PM

#

noble coyote <---------------- needs OmniGen on ComfyUI

agree, even if I have to run it for 5 min, it would be interesting, and I saw it is actually great, but it suspiciously reminds me of flux's aesthetics, ||but I might hallucinate||

viral plaza Oct 30, 2024, 2:08 PM

#

#1243166025000943746 message relevant omnigen testing in the swarm discord earlier

noble coyote Oct 30, 2024, 2:08 PM

#

It's the power to meld disparate elements of different photos - seamlessly - which excites me

viral plaza Oct 30, 2024, 2:09 PM

#

seems like it has cool capabilities but its general image gen quality is very "meh" tier, making it hard to justify the disgustingly long gen time

muted dove Oct 30, 2024, 2:09 PM

#

noble coyote Budget's just finished - penny off a pint of beer!

The more you drink, the more you save! 🙂

noble coyote Oct 30, 2024, 2:09 PM

#

viral plaza https://discord.com/channels/1243166023859961988/1243166025000943746/12988209845...

My 8Gb VRAM 'excluded' me from using Omnigen on pinokio

#

There is an Omnigen Huggingface https://huggingface.co/spaces/Shitao/OmniGen

OmniGen - a Hugging Face Space by Shitao

signal shuttle Oct 30, 2024, 2:40 PM

#

(SD3.5L + SD3.5M)

dusky thistle Oct 30, 2024, 2:43 PM

#

signal shuttle Oct 30, 2024, 2:53 PM

#

dusky thistle Oct 30, 2024, 2:55 PM

#

#

noble coyote Oct 30, 2024, 3:32 PM

#

errant dust Oct 30, 2024, 3:37 PM

#

Maybe I am just choosing the wrong subjects, but so far I have been underwhelmed by SD 3.5 Medium, compared to Large of course. (Workflows included, but the prompt is basically: "a comic with strong outlines of a tree that stands tall in the center, with cats of various sizes, colors, and expressions scattered around its trunk and branches. Some cats are nestled among the leaves, while others are perched on the tree's main branches. The tree's trunk is brown, and the leaves are green. The cats exhibit a range of facial expressions, from curious to content.") First is SD 3.5M at 768x768, then 1024x1024, and finally SD 3.5 L

#

same seed

untold valley Oct 30, 2024, 3:39 PM

#

Anyone have a quick TLDR of the state of things? How is 3.5 compared to the last released model or Flux? Seems there are striping patterns, is that the worst of it?

errant dust Oct 30, 2024, 3:39 PM

#

I have not seen striping issues myself, but it seems to completely wig out at sizes over 1024 x 1024. Large that is.

#

the output is quite different and they have different strengths and weaknesses. If you want hands, well, Flux is your choice, but there is more to imagery than that. Comics are clear win (no Loras) for SD 3.5 L

untold valley Oct 30, 2024, 3:41 PM

#

errant dust I have not seen striping issues myself, but it seems to completely whig out at s...

other than that all good?

#

what about text?

#

faces/body how malleable is it?

#

can it do the dreaded woman laying on grass

#

astronaut on the moon

errant dust Oct 30, 2024, 3:42 PM

#

The thing is I don't really do stock photography with people. But that much is fine. Just try it IMHO. Nothing endemic

rapid pivot Oct 30, 2024, 3:44 PM

#

Hello Chess man

untold valley Oct 30, 2024, 3:44 PM

#

comfy still the best way? forge? anything developed better?

rapid pivot Oct 30, 2024, 3:44 PM

#

waow

errant dust Oct 30, 2024, 3:44 PM

#

Hey @rapid pivot

pseudo owl Oct 30, 2024, 3:45 PM

#

untold valley other than that all good?

Large can do woman in grass too, but obviously worse than flux in anatomy. Flux is also better at text.

errant dust Oct 30, 2024, 3:45 PM

#

untold valley comfy still the best way? forge? anything developed better?

It would imply I am trying a bunch of other GUIs. I tend to stick with what I have if I am satisfied and it does what I want and well

#

and Comfy fits that description

#

The trick is to explore them with the themes and types you want, and learn which are best at what

untold valley Oct 30, 2024, 3:47 PM

#

pseudo owl Large can do woman in grass too, but obviously worse then flux in anatomy. Flux ...

so essentially Flux is currently outperforming and we will need to wait for 3.5 models to be retrained by enthusiasts? were legalities and licensing changed to encourage more of this or we still at old restrictive ones?

errant dust Oct 30, 2024, 3:47 PM

#

I can tell you my findings, but they only cover what interests me. So anime? No clue. Chicks with boobs? Check Civit. Etc. Artwork? Stock imagery. Creative content? Text? I'm your guy

#

I mean, in terms of text, the clear no.1, of any AI, is Ideogram 1.0 and 2.0. It isn't even close. Flux is a decent second, but distant. Logos? Flux is the king today. All the others, MJ included, are just behind

#

MJ6 that is. MJ 5 was strong

#

Satirical comics? Imagen 3 is the best now. Also not close.

untold valley Oct 30, 2024, 3:51 PM

#

afaik Flux is hard to train and that's why we were waiting on 3.5 or a model like it. and it appears 3.5 needs more training tossed at it.

errant dust Oct 30, 2024, 3:51 PM

#

I don't fanboy any model. I just try to learn what the best tools are for each case

untold valley Oct 30, 2024, 3:52 PM

#

errant dust I mean, in terms of text, the clear no.1, of any AI, is Ideogram 1.0 and 2.0. It...

looking for local

flat oracle Oct 30, 2024, 3:52 PM

#

sd3.5 M works pretty good for me. However, CLIP Text encode node (positive as well as negative) takes extremely long to load, any ideas why?

errant dust Oct 30, 2024, 3:52 PM

#

flat oracle sd3.5 M works pretty good for me. However, CLIP Text encode node (positive as we...

Have you tried the GGUF models?

flat oracle Oct 30, 2024, 3:52 PM

#

errant dust Have you tried the GGUF models?

Sorry, forgot to mention, thats what im trying rn

errant dust Oct 30, 2024, 3:53 PM

#

I use the T5 Q8_0. Works fine for me. Though YMMV

untold valley Oct 30, 2024, 3:53 PM

#

ah damn, do we also now have different T5 models we need to deal with?

errant dust Oct 30, 2024, 3:54 PM

#

untold valley looking for local

Flux has a big edge RN in that it has been out for 3 full months now. So TONS of strong LoRas. SD 3.5 hasn't had this level of development yet, so give it time

errant dust Oct 30, 2024, 3:55 PM

#

untold valley ah damn, do we also now have different T5 models we need to deal with?

GGUF means it is quantized. Designed to load and be processed faster and with less memory needs

#

as it is loaded in blocks

#

and not as a whole

noble coyote Oct 30, 2024, 3:55 PM

#

craggy crest here's a workflow to start with. this has the SD3-2b-medium model loaded, but yo...

Changing to 3.5 brings a tokenizer fault at the negative prompt

untold valley Oct 30, 2024, 3:56 PM

#

errant dust Flux has a big edge RN in that it has been out for 3 full months now. So TONS of...

models need to be trained, not just loras. the downside of Flux model, however we go back to how easy/ licencing for 3.5 is otherwise wont get models created

flat oracle Oct 30, 2024, 3:56 PM

#

errant dust I use the T5 Q8_0. WOrks fine for me. Though YMMV

im just wondering because im even trying with dual clip loader t5 & clip l + gguf variants of sd3.5.
Shouldnt it be way faster than flux 1 q8 with both of them?

#

i mean the ksampling is pretty fast. no problems at all, just with text encode

errant dust Oct 30, 2024, 3:57 PM

#

flat oracle im just wondering because im even trying with dual clip loader t5 & clip l + ggu...

I use three clips, but have not bothered to time the entire lot. I should and will, but overall I have found them to be pretty close. At least not different enough to warrant Ohs and Ahs

errant dust Oct 30, 2024, 3:59 PM

#

untold valley models need to be trained, not just loras. the downside of Flux model, however w...

I am talking output results. For me the talk about what is the driver for an improved image, be it LoRA or checkpoint, is semantics. I don't actually care so long as I get what I want from the combination.

flat oracle Oct 30, 2024, 4:00 PM

#

errant dust I use three clips, but have not bothered to time the entire lot. I should and wi...

kk thanks!

untold valley Oct 30, 2024, 4:02 PM

#

you can theoretically have an endless stream of loras, but I do not like to stay with a single acute style. It's easier when a model inherently knows what it needs to, and it makes the creative image-generating workflow more malleable. essentially LoRAs are an overfitting run on top: sure, an easy, fast way of getting what you want if you want specifics, but as a whole a model that understands you by itself is better.

errant dust Oct 30, 2024, 4:02 PM

#

Reminds me of an argument I had with a strong chess master who offered to mentor me to mastery myself. He set out this study plan that involved self studying matches as far back as the late 1800s so I could 'absorb' the evolution of the game and blablabla. I told him I would prefer a more focused and pragmatic approach that maximized results (so long as they did not impair evolution). We soon parted ways. I don't have infinite time, nor patience.

#

How is citing endless culture and names going to matter if I lose because I simply played worse? Same for AIs. Talking about why one way sounds nicer on paper is unimportant to me. If I get the results I seek, all the technical perfections of another setup or method won't matter one whit to my eyes.

untold valley Oct 30, 2024, 4:09 PM

#

i am asking the wrong person then. was looking for more of a power user, jack of all trades approach.

errant dust Oct 30, 2024, 4:09 PM

#

I wasn't aware you were asking anything at all.

#

You asked what is best. I told you in results. You then railed about why models and Loras should be splitting the output development differently, which is semantics. But there was no question involved.

#

Is it the purity of some technical aspect that matters to you? Or better images that fulfill your requests?

#

I made clear where I position myself

#

Devolving into personal attacks? Is it that hard to just have a civil discussion?

viral plaza Oct 30, 2024, 4:34 PM

#

errant dust Maybe I am just choosing the wrong subjects, but so far I have been underwhelmed...

If you were expecting 3.5 Medium to be equivalent to or better than 3.5 Large... well your expectations are backwards. Medium is smaller than Large. It's expected to be a bit weaker on quality, but with the perk of running fast and using less vram

viral plaza Oct 30, 2024, 4:34 PM

#

untold valley comfy still the best way? forge? anything developed better?

SwarmUI is the best UI

untold valley Oct 30, 2024, 4:36 PM

#

viral plaza SwarmUI is the best UI

many thanks, will give it a try, took a hiatus from SD and trying to find where things are at as a whole. Swarm is a gui with comfy backend right?

viral plaza Oct 30, 2024, 4:36 PM

#

flat oracle im just wondering because im even trying with dual clip loader t5 & clip l + ggu...

if you use dualclip (CLIP G + CLIP L) in non-gguf, it should be blazing fast and near-equivalent results to triple (the T5 on SD3 is relatively weak, but the most intensive to run)

viral plaza Oct 30, 2024, 4:37 PM

#

untold valley many thanks, will give it a try, took a hiatus from SD and trying to find where ...

yes

errant dust Oct 30, 2024, 4:38 PM

#

viral plaza If you were expecting 3.5 Medium to be equivalent or better to 3.5 Large... well...

I quite understand, but the images I showed emphasize issues in the output that make it unusable. For those topics as I explained

#

I used SD 3.5 L to show that it wasn't a general failing by SD 3.5.

#

That said, here is the type of image that Flux can only dream about doing, but that SD 3.5 can do:

pseudo owl Oct 30, 2024, 4:43 PM

#

untold valley so essentially Flux is currently outperforming and we will need to wait for 3.5 ...

Depends on what you mean outperform, but yes imo. Sd3.5 is more creative but flux is obviously better at text, anatomy, and sometimes prompt following.

So sd3.5 is a great tool too, and it also has more knowledge but from what flux knows, it performs better.

pseudo owl Oct 30, 2024, 4:43 PM

#

errant dust That said, here is the type of image that Flux can only dream about doing, but t...

Prompt?

untold valley Oct 30, 2024, 4:43 PM

#

pseudo owl Depends on what you mean outperform, but yes imo. Sd3.5 is more creative but flu...

thank you, appreciated

errant dust Oct 30, 2024, 4:43 PM

#

in the image

#

all settings and meta are always in images I post here

#

an Impressionist Cartoon of a tree covered in whimsical cats on the base and branches all drawn in a variety of colors and facial expressions in the style of Andy Kehoe and Skottie Young. The outlines in the layered 2d art are strong and reminiscent of Keith Haring.

dry wave Oct 30, 2024, 4:50 PM

#

from what I heard flux is also easier to train, despite being a distilled model

#

guess the way to go for now is using sd3 for creative arts and flux for more realistic/clean images

pseudo owl Oct 30, 2024, 4:52 PM

#

dry wave from what I heard flux is also easier to train, despite being a distilled model

Maybe for LoRAs but not for full finetuning I believe, and yeah i think using flux and sd3 for different tasks is a good idea.

errant dust Oct 30, 2024, 4:52 PM

#

viral plaza if you use dualclip (CLIP G + CLIP L) in non-gguf, it should be blazing fast and...

How about the 32-bit T5?

dry wave Oct 30, 2024, 4:52 PM

#

who needs full fine-tuning

errant dust Oct 30, 2024, 4:52 PM

#

huge file even GGUFed. (18+ GB)

viral plaza Oct 30, 2024, 4:53 PM

#

errant dust How about the 32-bit T5?

blink

#

what

errant dust Oct 30, 2024, 4:54 PM

#

I meant F32, sorry

viral plaza Oct 30, 2024, 4:54 PM

#

what about it

pseudo owl Oct 30, 2024, 4:54 PM

#

dry wave who needs full fine-tuning

It’s better than LoRAs, especially if you want to make the model learn more. LoRAs are only good for small-scale learning (one or a few styles)

errant dust Oct 30, 2024, 4:54 PM

#

You said the T5 for SD3 is weak. I asked whether you expected tangible benefits using the F32 version

viral plaza Oct 30, 2024, 4:55 PM

#

errant dust huge file even GGUFed. (18+ GB)

GGUF makes things smaller by reducing the bitwidth, so if you explicitly use a high bitwidth yeah it's big

viral plaza Oct 30, 2024, 4:55 PM

#

errant dust You said the T5 for SD3 is weak. I asked whether you expected tangible benefits ...

no, the problem is that the model's backbone (the MMDiT part, that does actual image generation during sampling) doesn't particularly care about what the T5 has to say, because CLIP G provides much more useful information during training

#

the bitwidth doesn't enhance anything, it's just narrow refinements to precision - fp32 requires twice the memory of fp16 and microscopically more precise data, it's incredibly pointless to use fp32 for anything outside of training

#

even fp16 barely provides more precision data than fp8 does
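
(Back-of-the-envelope arithmetic for the bitwidth point above: weight size is roughly parameter count times bits per weight. The 2.5B figure for SD3.5 Medium comes from the Replicate listing quoted later in this log; the ~4.7B figure for the T5-XXL encoder is an assumption, picked because it lines up with the "18+ GB" F32 file mentioned above. Activations and runtime overhead are not counted.)

```python
# Bits stored per weight. q8_0 is 8.5 because each block of 32 int8 weights
# also carries one 16-bit scale: (32 * 8 + 16) / 32 = 8.5.
BITS_PER_WEIGHT = {"fp32": 32.0, "fp16": 16.0, "fp8": 8.0, "q8_0": 8.5}

def weight_gb(n_params: float, fmt: str) -> float:
    """Approximate size of the weights alone, in decimal GB."""
    return n_params * BITS_PER_WEIGHT[fmt] / 8 / 1e9

MODELS = {
    "SD3.5 Medium (2.5B)": 2.5e9,
    "T5-XXL encoder (~4.7B, assumed)": 4.7e9,
}

for name, n_params in MODELS.items():
    sizes = ", ".join(
        f"{fmt}: {weight_gb(n_params, fmt):.1f} GB" for fmt in BITS_PER_WEIGHT
    )
    print(f"{name} -> {sizes}")
```

Doubling the bitwidth doubles the memory, which is the whole cost side of the fp32 vs fp16 trade-off described above.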

errant dust Oct 30, 2024, 4:56 PM

#

I use Q8_0 myself

#

a cross between them, though very very close to fp16
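
(A simplified sketch of Q8_0-style block quantization, to show why it stays so close to fp16: weights are grouped into blocks of 32, and each block stores one scale plus 32 signed 8-bit integers. This is an illustration in plain Python, not the ggml implementation; the real format stores the scale as fp16, while a Python float is used here.)

```python
import math

BLOCK = 32  # Q8_0 groups weights into blocks of 32

def quantize_q8_0(values):
    """Return a list of (scale, int8_values) blocks."""
    blocks = []
    for start in range(0, len(values), BLOCK):
        chunk = values[start:start + BLOCK]
        amax = max(abs(v) for v in chunk)
        scale = amax / 127 if amax else 1.0   # map the largest magnitude to +/-127
        quants = [round(v / scale) for v in chunk]
        blocks.append((scale, quants))
    return blocks

def dequantize_q8_0(blocks):
    """Rebuild approximate float weights from the blocks."""
    return [scale * q for scale, quants in blocks for q in quants]

weights = [math.sin(i) * 0.1 for i in range(64)]   # toy "weights"
blocks = quantize_q8_0(weights)
restored = dequantize_q8_0(blocks)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(f"{len(blocks)} blocks, max round-trip error {max_error:.2e}")
```

The rounding error per weight is at most half a scale step, and the scale is set per block of 32, so one outlier only costs precision inside its own block.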

viral plaza Oct 30, 2024, 4:58 PM

#

viral plaza no, the problem is that the model's backbone (the MMDiT part, that does actual i...

(this btw is the reason why Flux doesn't have CLIP G: by removing the clear G signal, they force the model to learn T5, and once it's willing to learn T5 it can achieve better results in the long run)

errant dust Oct 30, 2024, 5:01 PM

#

That's quite interesting. I assume this was understood before SD 3.5 went into training. Any reason why this path was not chosen?

signal shuttle Oct 30, 2024, 5:08 PM

#

(SD3.5L + SD3.5M)

errant dust Oct 30, 2024, 5:37 PM

#

Hmmmm! Playing around with the samplers with SD 3.5L dpmpp_2s_ancestral just gave me my best result by far in my Keith Haring cats in trees prompt. So far my request for his style and strong simplistic art and outlines was ignored, though it did adhere to my desire for rich multicolored whimsical cats. To be explored:

craggy crest Oct 30, 2024, 5:47 PM

#

untold valley i am asking the wrong person then. was looking for more of a power user, jack of...

you're dissing someone that is what you're looking for. this makes little sense to me

craggy crest Oct 30, 2024, 5:48 PM

#

noble coyote Changing to 3.5 brings a tokenizer fault at the negative prompt

use the workflow in this then, this is 3.5 large

untold valley Oct 30, 2024, 5:50 PM

#

craggy crest you're dissing someone that is what you're looking for. this makes little sense ...

we will do this another time. in the meantime, have you found the SLG workflow useful, or are you running without it?

craggy crest Oct 30, 2024, 5:54 PM

#

untold valley we will do this another time, in the mean time have you found SLG workflow usefu...

i'm currently finishing the sampler/scheduler compare sheet. however i did work with it some yesterday, and i played around with the same idea when matteo released his block skip node for flux a couple months back. it's very useful, but you need to be extremely careful with the values. one interesting thing is that by enabling skip, you may get better text - but you will also lose fine details. image with skip turned on, on the left. image with it off, on the right

#

it's implemented in 3.5 medium as a way to tweak things like hands and feet if needed, or other things that are warping that you don't want to warp.

#

it's not intended to be a 'turn it on and use it on everything'

untold valley Oct 30, 2024, 5:56 PM

#

they are completely different pictures all together, guess there is no free lunch. results for skip for text based on that sample are crazy though.

craggy crest Oct 30, 2024, 5:56 PM

#

skipping layer 2 only - skip on the left, without skip on the right

#

you can see that the fine details are deleted

untold valley Oct 30, 2024, 5:57 PM

#

wow yes, right picture looks really nice with the splattering of paint

mortal mesa Oct 30, 2024, 5:57 PM

#

i dont think its barely been touched yet, people plug it in and use the defaults in it, can have lots of effects but i sure as hell aint gonna figure it out

craggy crest Oct 30, 2024, 5:58 PM

#

untold valley they are completely different pictures all together, guess there is no free lunc...

if you're referring to the block skip for unet called free lunch, this works in the same way, but the blocks we use for SD3 are NOT a unet structure so you will have to carefully play with the values and layers to get an idea of what you want to turn off or on and when

craggy crest Oct 30, 2024, 5:58 PM

#

mortal mesa i dont think its barely been touched yet, people plug it in and use the defaults...

i am. that's the project i start after i finish this sampler/scheduler compare sheet

untold valley Oct 30, 2024, 6:00 PM

#

craggy crest if you're referring to the block skip for unet called free lunch, this works in ...

there is a lot of possibilities with this, maybe there is a magic combo somewhere, will need to go see the paper to see if they talk about what layers represent what aspects

craggy crest Oct 30, 2024, 6:00 PM

#

untold valley there is a lot of possibilities with this, maybe there is a magic combo somewher...

actually, you really want to go talk to matteo. or talk to @dusky thistle - they're the ones that have done the most digging into this sort of thing

untold valley Oct 30, 2024, 6:03 PM

#

thank you for the resources, let me dig into and familiarize myself with it first.

noble coyote Oct 30, 2024, 6:06 PM

#

#

#

#

sacred jewel Oct 30, 2024, 6:12 PM

#

Nihilism LoRA

untold valley Oct 30, 2024, 6:14 PM

#

@craggy crest top is SLG, lord have mercy, really hard to find a sweet spot: the higher the start percentage goes the more distorted it gets, needs a low initial value.

noble coyote Oct 30, 2024, 6:17 PM

#

craggy crest Oct 30, 2024, 6:17 PM

#

untold valley @craggy crest top is SLG, lord have mercy, really hard to find a sweet spo...

heh. yeah, it's not an easy, point and shoot, idea. the people that have dug the most into what each of the blocks are actually doing are clownshark and matteo. and yes, you want very small values. not sure if your node will allow decimals but if so, try them. 0.001 vs 0.00001

viral plaza Oct 30, 2024, 6:17 PM

#

errant dust That's quite interesting. I assume this was understood before SD 3.5 went into t...

i assume just keeping matched to SD3
ie avoid too much arch change until there's enough to justify an "SD4"

noble coyote Oct 30, 2024, 6:17 PM

#

craggy crest Oct 30, 2024, 6:22 PM

#

https://replicate.com/stability-ai/stable-diffusion-3.5-medium SD 3.5 medium is up on replicate if you don't have a machine that can run it

stability-ai/stable-diffusion-3.5-medium – Replicate

2.5 billion parameter image model with improved MMDiT-X architecture

hallow lion Oct 30, 2024, 6:23 PM

#

Cats with 4GB VRAM (send help)

craggy crest Oct 30, 2024, 6:24 PM

#

@bitter hearth

craggy crest Oct 30, 2024, 6:25 PM

#

hallow lion Cats with 4GB VRAM (send help)

just type @cat and then pick from the list to tag

#

@dusky thistle one of these is dpm_2/linear_quadratic, the other is dpm_2/beta - 3 guesses which is which

sacred jewel Oct 30, 2024, 6:32 PM

#

craggy crest @dusky thistle one of these is dpm_2/linear_quadratic, the other is dpm_...

I will guess Quadratic is the more detailed one (second one)

#

mortal mesa Oct 30, 2024, 6:55 PM

#

Beta/Linear_quadratic, worked very nice on this one, my opinion

craggy crest Oct 30, 2024, 7:21 PM

#

mortal mesa Beta/Linear_quadratic, worked very nice on this one, my opinion

ddim_uniform tends to warm images up

mortal mesa Oct 30, 2024, 7:26 PM

#

bitter hearth Oct 30, 2024, 7:40 PM

#

linear_quadratic is a bit of a spicy choice
sprints through low sigmas at lightspeed

craggy crest Oct 30, 2024, 7:53 PM

#

bitter hearth linear_quadratic is a bit of a spicy choice *sprints through low sigmas at light...

don't trip at that speed...

mortal mesa Oct 30, 2024, 7:54 PM

#

sprints through low sigmas at lightspeed

craggy crest Oct 30, 2024, 7:55 PM

#

@bitter hearth

mortal mesa Oct 30, 2024, 7:56 PM

#

i dont really know what im doing but i got detail daemon in there too boosting early if that even makes sense

bitter hearth Oct 30, 2024, 8:03 PM

#

detail daemon will help with linear_quadratic
because it will offset some of the detail loss

#

so yeah that makes sense 🙂

craggy crest Oct 30, 2024, 8:11 PM

#

these are all dpm_fast. one is exponential, one is karras, one is normal

#

can you guess which is which?

bitter hearth Oct 30, 2024, 8:19 PM

#

never actually checked out what exponential graph looks like TBH

#

I think the third one is normal

#

hmm not sure
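
(For anyone else who never looked at the curves: a small sketch of two of the schedules being compared, using the standard Karras formula with rho=7 and a log-linear "exponential" schedule. The sigma range 0.03 to 1.0 is an arbitrary illustrative choice, not the range any particular SD3.5 setup uses, and the trailing zero that samplers usually append is left out.)

```python
import math

def karras_sigmas(n, sigma_min, sigma_max, rho=7.0):
    """Karras et al. schedule: linear interpolation in sigma**(1/rho) space."""
    min_inv = sigma_min ** (1 / rho)
    max_inv = sigma_max ** (1 / rho)
    return [(max_inv + i / (n - 1) * (min_inv - max_inv)) ** rho for i in range(n)]

def exponential_sigmas(n, sigma_min, sigma_max):
    """Exponential schedule: linear interpolation in log(sigma) space."""
    lo, hi = math.log(sigma_min), math.log(sigma_max)
    return [math.exp(hi + i / (n - 1) * (lo - hi)) for i in range(n)]

karras = karras_sigmas(10, 0.03, 1.0)
exponential = exponential_sigmas(10, 0.03, 1.0)

print("step  karras  exponential")
for step, (k, e) in enumerate(zip(karras, exponential)):
    print(f"{step:>4}  {k:6.3f}  {e:11.3f}")
```

Both start and end at the same sigmas, but at every step in between the exponential schedule sits lower than Karras, so it reaches the low-noise, fine-detail range sooner.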

gritty steeple Oct 30, 2024, 8:31 PM

#

craggy crest Oct 30, 2024, 8:34 PM

#

bitter hearth hmm not sure

;) now you have to go play

real terrace Oct 30, 2024, 8:38 PM

#

craggy crest here's a workflow to start with. this has the SD3-2b-medium model loaded, but yo...

👍

#

I still can't fully understand what to put in each clip; sometimes I feel like I have it clear, but then I don't.

#

If I have to repeat or reinforce the idea in all

#

#

I'm using this GGUF model, but it takes 4:40 minutes, while

#

this large model I think it took 4 minutes only

signal shuttle Oct 30, 2024, 8:50 PM

#

real terrace I'm using this GGUF model, but it takes 4:40 minutes, while

GGUF models trade speed for a lower vram requirement, so a GGUF model will be slower than an FP8 one while also being much lighter on vram

gritty steeple Oct 30, 2024, 8:51 PM

#

real terrace Oct 30, 2024, 8:54 PM

#

signal shuttle GGUF models trade speed for less vram requirement, so a GGUF model will be slowe...

now that you say it, with this model my PC hasn't gone OOM or stuttered, that's true, iirc with the large one there were some steps where videos would stop playing for a bit

pseudo owl Oct 30, 2024, 8:59 PM

#

The red panda model which was #1 on the text-to-image leaderboard (beat flux.1.1 pro, flux.1 pro, dev, schnell, sd3.5 large and turbo): https://x.com/recraftai/status/1851706399631224939

It has very bad prompt following (maybe only in this prompt idk), but great realism.
prompt: a high quality photograph of a white cat sitting on top of a blue dog on a brown couch in a living room. Behind them, is a window, with 4 cow pictures, one in each corner. Outside the window is outer space and a ufo.

Flux.1 schnell 8-step on left, Recraft v3(red panda) on the right

dull star Oct 30, 2024, 9:04 PM

#

pseudo owl The red panda model which was #1 on the text-to-image leaderboard(beat flux.1.1 ...

#

indeed its finally revealed on the image leaderboard too

#

idk man it looks average

#

especially if its closed source

lucid swift Oct 30, 2024, 9:05 PM

#

dull star especially if its closed source

bs its very good

pseudo owl Oct 30, 2024, 9:06 PM

#

dull star idk man it looks average

you will see realism and text is pretty great but prompt following is very disappointing. No reason to use it over flux or sd3.5 large imo.

lucid swift Oct 30, 2024, 9:06 PM

#

recraft

dull star Oct 30, 2024, 9:06 PM

#

I guess it does do "crappy style" photos out of the box, but that's required for it to be used as a service with no lora options

#

A close-up, realistic portrait of an elderly man dressed as a military soldier. He has deep wrinkles, white stubble, and a stern, weathered expression, symbolizing years of service. The uniform is slightly worn, with medals and insignia visible, and a green camo pattern typical of an army soldier. His eyes reflect resilience and experience, capturing the weight of his journey. The background is blurred, focusing solely on his face and upper uniform, creating a dramatic, respectful portrayal.

#

the texture is nice

lucid swift Oct 30, 2024, 9:07 PM

#

dull star A close-up, realistic portrait of an elderly man dressed as a military soldier. ...

use the hard flash style option it looks much better

dull star Oct 30, 2024, 9:07 PM

#

I don't want to login

#

idc that it's free

#

wake me up when its open weight

pseudo owl Oct 30, 2024, 9:08 PM

#

lucid swift recraft

Yeah its nice at realism, but as I said above, prompt following isn't great. Open source models can do similar gen's too in terms of realism and have much better prompt following.

dull star Oct 30, 2024, 9:08 PM

#

and its gonna be paid sooner or later

lucid swift Oct 30, 2024, 9:08 PM

#

dull star and its gonna be paid sooner or later

it is paid

dull star Oct 30, 2024, 9:08 PM

#

how much per image/credit

lucid swift Oct 30, 2024, 9:08 PM

#

but u get like 50 free images

#

and then 5 free every day

#

but after that 10 or 20 dollars a month

#

if you want more

#

you can also upload images and make your own style

pallid ginkgo Oct 30, 2024, 9:09 PM

#

kinda like those ai video websites

lucid swift Oct 30, 2024, 9:09 PM

#

yes

pseudo owl Oct 30, 2024, 9:12 PM

#

lucid swift bs its very good

Flux schnell (considerably worse than flux dev and sd3.5 large) on the left vs Recraft v3 on the right,

prompt: A blue block on top of a red block. Next to the red block is a green block with a candle on top of the green block.

lucid swift Oct 30, 2024, 9:13 PM

#

pseudo owl Flux schnell(considerably worse than flux dev, and sd3.5 large) on the left vs R...

it seems like its worse at prompt understanding

#

i kinda like this image

nature--with-rooks-and-a-river-and-a-cliff-and-old1.jpg

#

reminds me of something idk

#

but it less often has that fake-real look with the smooth skin

dull star Oct 30, 2024, 9:21 PM

#

aesthetics are good though

dull star Oct 30, 2024, 9:22 PM

#

lucid swift you can also upload images and make your own style

oh like ipadapter for SDXL

#

like there you could load an image and it would make an image in that style

#

granted, it wasn't perfect, but it was good enough

pseudo owl Oct 30, 2024, 9:24 PM

#

lucid swift it seems like its worse at prompt understanding

Yeah its considerably worse than the new open source models on prompt following (flux dev, schnell, sd3.5 large, turbo) imo.

But yes its for sure more realistic than out of the box models, but there are many ways to make models more "realistic".

craggy crest Oct 30, 2024, 9:26 PM

#

real terrace I still can't fully understand what to put in each clip; sometimes I feel like I...

t5xxl: give it natural language that is rich in details and adjectives. clip_g: just give it the basics of the image. clip_l: give it all the fine details and artsy stuff.
example:
t5xxl: closeup on a vase with roses in it, dew sparkles on the petals. the light shines in from the side at a slant casting shadows across the scene.
clip_g: roses in a vase near a window. closeup. dynamic light and shadows.
clip_l: sparkling bits of light bouncing off dew on the rose petals. soft, bokeh background.
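For anyone scripting this outside ComfyUI, a minimal sketch of the same three-way split. It assumes the diffusers `StableDiffusion3Pipeline` convention where `prompt` goes to CLIP-L, `prompt_2` to CLIP-G and `prompt_3` to T5-XXL; worth checking against the diffusers docs. `build_sd3_prompts` is a hypothetical helper name.

```python
def build_sd3_prompts(clip_l: str, clip_g: str, t5xxl: str) -> dict:
    """Map per-encoder text onto diffusers-style keyword arguments.

    Assumption: `prompt` feeds CLIP-L, `prompt_2` feeds CLIP-G,
    and `prompt_3` feeds T5-XXL.
    """
    return {"prompt": clip_l, "prompt_2": clip_g, "prompt_3": t5xxl}


prompts = build_sd3_prompts(
    clip_l="sparkling bits of light bouncing off dew on the rose petals. soft, bokeh background.",
    clip_g="roses in a vase near a window. closeup. dynamic light and shadows.",
    t5xxl=(
        "closeup on a vase with roses in it, dew sparkles on the petals. "
        "the light shines in from the side at a slant casting shadows across the scene."
    ),
)

# Hypothetical usage (needs the model weights and a GPU, so it is not run here):
# from diffusers import StableDiffusion3Pipeline
# pipe = StableDiffusion3Pipeline.from_pretrained("stabilityai/stable-diffusion-3.5-medium")
# image = pipe(**prompts).images[0]
```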

bitter hearth Oct 30, 2024, 9:26 PM

#

lucid swift recraft

the model does look good here
during the testing period on Artificial Analysis I mostly voted against it though

#

my preferences seemed to be for Flux Pro 1.1, Ideogram V2 and Midjourney V6

#

however these Elo tests are unfair to the open source models

#

because they get fine tuning and Comfy workflows

#

but the Elo test doesn't reflect that

lucid swift Oct 30, 2024, 9:27 PM

#

bitter hearth the model does look good here during the testing period on Artificial Analysis I...

when i tested it was always very close; most of the time prompt understanding was not the problem

bitter hearth Oct 30, 2024, 9:29 PM

#

its close yes, at the top of the leaderboard

#

I did come away with the impression that the gaps between models are very small

lucid swift Oct 30, 2024, 9:33 PM

#

at some point you cant make it more realistic

bitter hearth Oct 30, 2024, 9:37 PM

#

there are upscaled SD 1.5 images that look like photos even

#

the difference between workflows is like 100x larger than the difference between models

lucid swift Oct 30, 2024, 9:39 PM

#

but these are very limited finetunes

bitter hearth Oct 30, 2024, 9:41 PM

#

yes but I don't see the downside in checkpoints specialising

#

you could train a small router model to route prompts to appropriate checkpoints for example

#

I guess storage, and loading/unloading to VRAM is the disadvantage there
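A toy sketch of the router idea, using keyword matching as a stand-in for a trained classifier. The checkpoint names and keyword sets are made up for illustration.

```python
# Toy stand-in for the "small router model": keyword scoring instead of a
# trained classifier. All names below are hypothetical.
ROUTES = {
    "anime_checkpoint": {"anime", "manga", "chibi"},
    "photo_checkpoint": {"photo", "photograph", "portrait", "dslr"},
}
DEFAULT = "general_checkpoint"


def route_prompt(prompt: str) -> str:
    """Pick the checkpoint whose keyword set overlaps the prompt the most."""
    words = set(prompt.lower().replace(",", " ").split())
    best, best_hits = DEFAULT, 0
    for name, keywords in ROUTES.items():
        hits = len(words & keywords)
        if hits > best_hits:
            best, best_hits = name, hits
    return best
```

Each route change still means loading a different checkpoint into VRAM, which is the cost mentioned above.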

craggy crest Oct 30, 2024, 9:43 PM

#

SD3.5 medium. Prompt: fantasy painting of a handsome lion Knight,long wavy hair, slight smile, piercing green eyes; emerald, symmetrical,intricate engraved armor; hyperdetailed. the words "Kings wear Crowns"

craggy crest Oct 30, 2024, 9:44 PM

#

bitter hearth yes but I don't see the downside in checkpoints specialising

too many checkpoints after a while and you get lost in the clutter

bitter hearth Oct 30, 2024, 9:54 PM

#

Civit does feel like that

#

interesting that even Midjourney chose to have their anime checkpoint separate though

low stone Oct 30, 2024, 10:04 PM

#

pseudo owl The red panda model which was #1 on the text-to-image leaderboard(beat flux.1.1 ...

yikes that recraft is rough. I just tried about 10 prompts that look amazing on flux and was really unimpressed by recraft. I'd probably pick sd 3.5m over it for looks (just don't prompt for hands).

sacred jewel Oct 30, 2024, 10:05 PM

#

sullen moss Oct 30, 2024, 10:06 PM

#

Recraft vs Ideogram

#

To be honest, I still don't understand why this model is in first place...

gritty steeple Oct 30, 2024, 10:10 PM

#

pseudo owl Oct 30, 2024, 10:13 PM

#

low stone yikes that recraft is rough. I just tried about 10 prompts that look amazing on ...

Yeah I mean it’s great at realism but it seems even more constrained than flux and has much worse prompt following.

low stone Oct 30, 2024, 10:14 PM

#

sullen moss Recraft vs Ideogram

that recraft would be decent on its own, but that ideogram pic is killer and makes recraft look like sdxl

craggy crest Oct 30, 2024, 10:47 PM

#

pseudo owl Yeah I mean it’s great at realism but it seems even more constrained than flux a...

at a guess, it's someone's attempt at a flux finetune.

bitter hearth Oct 30, 2024, 11:23 PM

#

its a big one, going by API pricing and timings

winged seal Oct 30, 2024, 11:24 PM

#

@craggy crest Wanted to share that I have been taking a bit of a "break" after all from bigger overarching training, in favor of my first concept training on Flux Lite, which seems to be working extremely well for my first attempt, so I am happy about that haha

bitter hearth Oct 30, 2024, 11:24 PM

#

the aesthetic fine tune is a bit off, especially small details

#

its got signs of a strong model though in composition and blur

craggy crest Oct 30, 2024, 11:25 PM

#

winged seal @craggy crest Wanted to share that I have been taking a bit of a "break" afte...

cool - pictures or it didn't happen

winged seal Oct 30, 2024, 11:27 PM

#

craggy crest cool - pictures or it didn't happen

Training a dappled sunlight LoRA. Cooking up some more examples, just real slow cause I am on my 3060ti lol

#

my dataset is pretty small and less than ideal, so I will be trying to get good enough results to supplementally train off of for better feature reach

craggy crest Oct 30, 2024, 11:29 PM

#

winged seal Training a dappled sunlight LoRA. Cooking up some more examples, just real slow ...

prompt?

winged seal Oct 30, 2024, 11:29 PM

#

winged seal Oct 30, 2024, 11:30 PM

#

craggy crest prompt?

striped dappled light on the face of a young black boy. The light on his face is in striped and he is wearing a gray tank top with very short black hair, dark background

bitter hearth Oct 30, 2024, 11:30 PM

#

right hand side definitely better in both cases, nice job

winged seal Oct 30, 2024, 11:30 PM

#

thanks <3

craggy crest Oct 30, 2024, 11:31 PM

#

winged seal striped dappled light on the face of a young black boy. The light on his face is...

sd 3.5, no lora

winged seal Oct 30, 2024, 11:31 PM

#

mixed in a smidge of my realism LoRA's to make it look a touch better as well haha

craggy crest Oct 30, 2024, 11:31 PM

#

just your prompt

winged seal Oct 30, 2024, 11:31 PM

#

craggy crest sd 3.5, no lora

honestly, not bad

craggy crest Oct 30, 2024, 11:32 PM

#

winged seal honestly, not bad

modified your prompt to:HDR photography: striped dappled light on the face of a young black boy. The light on his face is in striped and he is wearing a gray tank top with very short black hair, dark background

#

sd3.5 large, no lora, just the prompt

sacred jewel Oct 30, 2024, 11:32 PM

#

Comic Book LoRA

winged seal Oct 30, 2024, 11:33 PM

#

craggy crest modified your prompt to:HDR photography: striped dappled light on the face of a ...

its a decent beginning for that as a concept, nice

craggy crest Oct 30, 2024, 11:33 PM

#

sacred jewel Comic Book LoRA

you really like the word 'whoa!' don't you?

craggy crest Oct 30, 2024, 11:33 PM

#

winged seal its a decent beginning for that as a concept, nice

and no need to sit there and battle for hours and hours and hours with a base model that's frozen and doesn't want to do what you want

winged seal Oct 30, 2024, 11:33 PM

#

ok, this looks ASTRONOMICALLY better haha

craggy crest Oct 30, 2024, 11:34 PM

#

winged seal ok, this looks ASTRONOMICALLY better haha

??? what... does?

winged seal Oct 30, 2024, 11:36 PM

#

craggy crest and no need to sit there and battle for hours and hours and hours with a base mo...

Oh yeah, I talked to two people with doctorates in this scene, and they both agreed that flux is not "frozen" and that it was a completely false claim made by some dude on reddit with no real truth behind it

They spoke way higher level about it with me, much higher level than I understood, that's for sure, but yeah, two different people were 100% sure that "frozen" is a BS claim from people who don't understand flux or distillation. One said friend was the creator of Libre Flux and the writer of the paper used to prune flux down from 12B to 8B, so I would wager he knows what he's talking about 😅

winged seal Oct 30, 2024, 11:36 PM

#

craggy crest ??? what... does?

this

craggy crest Oct 30, 2024, 11:36 PM

#

winged seal Oh yeah, I talked to two people with doctorates in this scene, and they both agr...

i couldn't care less what they say. i know what was done, they don't

winged seal Oct 30, 2024, 11:36 PM

#

casually doesn't read the part where they wrote an entire paper about flux and pruning/de-compressing it

craggy crest Oct 30, 2024, 11:37 PM

#

winged seal *casually doesn't read the part where they wrote an entire paper about flux and ...

casually reminds you that i'm a programmer and said i 'know' not 'im assuming'

winged seal Oct 30, 2024, 11:37 PM

#

ok man, I will continue to believe people who are much more qualified 😅

it doesn't matter anyways, flux trains great regardless of what some people try to say about it 😅

craggy crest Oct 30, 2024, 11:38 PM

#

winged seal ok man, I will continue to believe people who are much more qualified 😅 it do...

i highly doubt they are much more qualified. just that they support your assumptions.

winged seal Oct 30, 2024, 11:38 PM

#

do you have a doctorate in machine learning?

craggy crest Oct 30, 2024, 11:39 PM

#

winged seal do you have a doctorate in machine learning?

i don't need one. if that's a requirement, then robin rombach, the creator of stable diffusion, is unqualified

winged seal Oct 30, 2024, 11:39 PM

#

actually no, I am not gonna get into this, it literally does not matter in the grand scheme of things

#

anyways

craggy crest Oct 30, 2024, 11:40 PM

#

winged seal actually no, I am not gonna get into this, it *literally* does not matter in the...

of course it doesn't. you can sit there and pull your hair out, melt your gpu, and pay outrageous electric bills to train unnecessary loras. i don't care.

winged seal Oct 30, 2024, 11:40 PM

#

how are things going with medium? I heard its pretty good

#

better or worse than large?

craggy crest Oct 30, 2024, 11:40 PM

#

winged seal how are things going with medium? I heard its pretty good

you haven't played with it yet?

winged seal Oct 30, 2024, 11:40 PM

#

No, I haven't

#

I am waiting for better training tools

craggy crest Oct 30, 2024, 11:40 PM

#

ah. well - get a few moments to go play with it and see what you think - try using it as a refiner for your images generated with flux

pseudo owl Oct 30, 2024, 11:41 PM

#

winged seal better or worse than large?

Well considerably worse in anatomy at least but ok in art styles. Decent as an upscaler/refiner.

winged seal Oct 30, 2024, 11:41 PM

#

jesus, worse in anatomy 😭

#

I guess it is more of a refiner after all, so oh well there

craggy crest Oct 30, 2024, 11:41 PM

#

winged seal jesus, *worse* in anatomy 😭

it is a smaller model - that's why you can also use SLG with it if you need to tweak the anatomy

mortal mesa Oct 30, 2024, 11:42 PM

#

how would you do that

craggy crest Oct 30, 2024, 11:42 PM

#

winged seal I guess it is more of a refiner after all, so oh well there

a lot of people have been playing with it as a refiner

winged seal Oct 30, 2024, 11:42 PM

#

Yeah, that seemed to be more of the direction people were leaning in for a while

craggy crest Oct 30, 2024, 11:42 PM

#

mortal mesa how would you do that

grab the example workflow with the SLG node for comfyUI, update your comfyUI, and then play around with the values

mortal mesa Oct 30, 2024, 11:42 PM

#

oh you don't have any input

winged seal Oct 30, 2024, 11:42 PM

#

I'd be down to use it on my flux lite gens

craggy crest Oct 30, 2024, 11:42 PM

#

mortal mesa oh you don't have any input

you asked how you would do that. not what tests i'd done?

pseudo owl Oct 30, 2024, 11:42 PM

#

Yeah that can help a bit, it’s far better than sd3 medium in anatomy but around base sdxl level in anatomy (worse without slg).

winged seal Oct 30, 2024, 11:43 PM

#

jesus thats rough

#

base SDXL was a bit of a nightmare haha

craggy crest Oct 30, 2024, 11:43 PM

#

there are 3 workflows in the huggingface repo for medium:
SD3.5L_plus_SD3.5M_upscaling_example_workflow
SD3.5M_SLG_example_workflow
SD3.5M_example_workflow

#

but update your comfyUI before loading any of them so you get the new node and scheduler

winged seal Oct 30, 2024, 11:43 PM

#

I should know... I have trained over 1000 LoRAs on base and was part of the beta program for it where I gave feedback on it before public release

pseudo owl Oct 30, 2024, 11:44 PM

#

I would kinda expect it since it’s smaller than sdxl. But it’s workable at least, not as bad as sd3 medium.

winged seal Oct 30, 2024, 11:44 PM

#

I mean, its basically the same size as SDXL, and its over a year newer with new tech, much more training, and a WAY more mature industry

craggy crest Oct 30, 2024, 11:44 PM

#

pseudo owl I would kinda expect it since it’s smaller than sdxl. But it’s workable at least...

it IS sd3 medium, finished

mortal mesa Oct 30, 2024, 11:45 PM

#

craggy crest you asked how you would do that. not what tests i'd done?

was wondering how you would tweak SLG for anatomy as you said, but no info

#

just looking for info

winged seal Oct 30, 2024, 11:45 PM

#

I still think that medium will be far more viable than large, so I will have my eyes on that if anything

craggy crest Oct 30, 2024, 11:46 PM

#

mortal mesa was wondering how you would tweak SLG for anatomy as you said, but no info

SLG is skip layer guidance. so you are turning various blocks on and off, and changing values. as an example, this is using it to tweak text. for this i had 3 layers turned off and only two of the three encoders working on the prompt. With is on the left. without is on the right
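A rough sketch of the idea as commonly described, not ComfyUI's actual code: alongside the usual conditional and unconditional predictions, the model is run once more with the chosen layers bypassed, and the result is pushed away from that degraded prediction. The function name and the plain-list representation are assumptions for illustration.

```python
def cfg_with_slg(cond, uncond, cond_skipped, cfg_scale, slg_scale):
    """Combine three noise predictions for one denoising step.

    cond          prediction with the prompt, all layers active
    uncond        prediction with the negative/empty prompt
    cond_skipped  prediction with the prompt but the chosen layers bypassed
    """
    out = []
    for c, u, s in zip(cond, uncond, cond_skipped):
        cfg = u + cfg_scale * (c - u)          # ordinary classifier-free guidance
        out.append(cfg + slg_scale * (c - s))  # push away from the skipped-layer result
    return out
```

With `slg_scale` set to 0 this reduces to plain CFG, which matches turning the node off.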

sacred jewel Oct 30, 2024, 11:47 PM

#

@craggy crest better? 😛 😛 😛

craggy crest Oct 30, 2024, 11:47 PM

#

sacred jewel @craggy crest better? 😛 😛 😛

those are really cool :) very DC comics style

sacred jewel Oct 30, 2024, 11:47 PM

#

craggy crest those are really cool :) very DC comics style

It's a nice LoRA indeed. Available for SD3.5 as well 😉

craggy crest Oct 30, 2024, 11:48 PM

#

zombie ironman?

sullen moss Oct 30, 2024, 11:49 PM

#

winged seal this

Very cool!

pseudo owl Oct 30, 2024, 11:53 PM

#

Sd3.5 2.6b vs Allegro 2.8b(text to video model)

bitter hearth Oct 30, 2024, 11:54 PM

#

mortal mesa was wondering how you would tweak SLG for anatomy as you said, but no info

work out which step the detail you want to work on appears, then apply SLG only around that step, trying each scale from 0-10 in intervals of 1
you can start with the default blocks

#

its important to have it off for the final smaller sigmas

#

cos it will reduce fine detail, just like PAG does

#

you also probably want it off for the first 30% of sigmas or so, because it could reduce image diversity
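The sweep described above could be scripted roughly like this. The `start_percent` / `end_percent` keys are assumed to mirror the SLG node's fields, and `radius` is a hypothetical knob for how many steps either side of the target to cover.

```python
def slg_sweep(target_step: int, total_steps: int, radius: int = 2):
    """Return one settings dict per scale 0..10, active only near target_step."""
    start = max(target_step - radius, 0) / total_steps
    end = min(target_step + radius, total_steps) / total_steps
    return [
        {"scale": float(scale), "start_percent": start, "end_percent": end}
        for scale in range(0, 11)
    ]


# Example: the detail of interest shows up around step 14 of 28.
runs = slg_sweep(target_step=14, total_steps=28)
```

Each dict is one render to compare; the layers are left at their defaults.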

craggy crest Oct 30, 2024, 11:57 PM

#

bitter hearth you also probably want it off for the first 30% of sigmas or so, because it coul...

i think you just made kagi's eyes crossed. can you boil that down?

#

i think he really would like to be able to use it

pseudo owl Oct 30, 2024, 11:58 PM

#

I don’t think this is Albert Einstein lol, allegro 2.8b
prompt: Albert Einstein walking around in a futuristic world, far away

winged seal Oct 31, 2024, 12:00 AM

#

sullen moss Very cool!

Thanks <3 Training flux 8b haha

bitter hearth Oct 31, 2024, 12:01 AM

#

craggy crest i think you just made kagi's eyes crossed. can you boil that down?

maybe this: leave layers and scale at default, start 0.3, end 0.7

#

keeps it away from the spicy areas
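In code terms, that window is just a check on how far through sampling you are. A minimal sketch, assuming start and end are fractions of total steps (the actual node may define them in terms of sigmas):

```python
def slg_active(step: int, total_steps: int, start: float = 0.3, end: float = 0.7) -> bool:
    """True when the current step falls inside the SLG window."""
    progress = step / total_steps
    return start <= progress < end


# With 20 steps, SLG is off for the early composition steps and the final detail steps.
active_steps = [i for i in range(20) if slg_active(i, 20)]
```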

bitter hearth Oct 31, 2024, 12:03 AM

#

pseudo owl I don’t think this is Albert Einstein lol, allegro 2.8b prompt: Albert Einstein ...

I like minimax the most out of these proprietary video ones

real terrace Oct 31, 2024, 12:14 AM

#

#

#

sullen moss Oct 31, 2024, 12:17 AM

#

Flux

real terrace Oct 31, 2024, 12:19 AM

#

#

sullen moss Oct 31, 2024, 12:34 AM

#

#

sullen moss Oct 31, 2024, 12:56 AM

#


kind flume Oct 31, 2024, 12:58 AM

#

help

#

#artisan-1

sullen moss Oct 31, 2024, 1:01 AM

#

pseudo owl Oct 31, 2024, 1:03 AM

#

bitter hearth I like minimax the most out of these proprietary video ones

Allegro is open source with an Apache 2.0 license. It’s a 2.8b params dit.
There’s also mochi-1, which is considerably better, also Apache 2.0, and a 10b dit.

In the text to video leaderboard, mochi-1 is #2 behind mini max, and beats kling, luma, gen3.

sacred jewel Oct 31, 2024, 1:05 AM

#

sullen moss Oct 31, 2024, 1:10 AM

#

real terrace Oct 31, 2024, 1:20 AM

#

pseudo owl Oct 31, 2024, 1:20 AM

#

Mochi is amazing, especially for an open source model. I prefer it to mini max honestly. These ones are generated by the official website, which uses the open source model and an upscaler.

sacred geode Oct 31, 2024, 1:33 AM

#

craggy crest Oct 31, 2024, 1:49 AM

#

pseudo owl Mochi is amazing especially for an open source model. I prefer it to mini max hon...

sound on

real terrace Oct 31, 2024, 2:19 AM

#

#

#

#

#

#

sacred geode Oct 31, 2024, 2:32 AM

#

@craggy crest Is this correct "frozen just means no gradients apply to it"?

craggy crest Oct 31, 2024, 2:33 AM

#

sacred geode @craggy crest Is this correct "frozen just means no gradients apply to i...

not what I mean, at least. flux is basically a giant lora

real terrace Oct 31, 2024, 2:33 AM

#

sacred geode Oct 31, 2024, 2:35 AM

#

I was asking in general, from a convo I just had. I have had so many failures on Flux, then I take the same dataset and throw it at SD3.5 and BANG, success after success. I mentioned that SAI said theirs were open (unfrozen) weights and was told they will say anything to stay relevant. I don't know; I just know I have had no real success with Flux locally while I have with 3.5. If the malformed appendages could be fixed then it would be a dream to work in for me.

real terrace Oct 31, 2024, 2:37 AM

#

#

#

bitter hearth Oct 31, 2024, 2:46 AM

#

sacred geode I was asking in general from a convo I just had. To me I have had so many failu...

there have been a couple of pieces of news about Flux that are important regarding training
first, the simpletuner dev managed to train flux for over 2500 H100 GPU hours without it collapsing; this is $7,500 worth of training, more than needed for the vast majority of checkpoints
secondly, the realvis dev managed to train a new checkpoint on flux de-distilled and boost the aesthetics a lot
so progress can be made, it's just tricky

real terrace Oct 31, 2024, 2:51 AM

#

sacred geode Oct 31, 2024, 2:56 AM

#

bitter hearth there has been a couple of pieces of news about Flux that are important regardin...

My hope is for Juggernaut, since they are working with BFL directly; it may give us a simpler way to train on Flux.

dusky thistle Oct 31, 2024, 3:04 AM

#

#

sacred geode Oct 31, 2024, 3:13 AM

#

craggy crest Oct 31, 2024, 3:39 AM

#

sacred geode I was asking in general from a convo I just had. To me I have had so many failu...

not sure who told you that but - flux is not only distilled, it's also DPO'd. it's frozen. 3.5 has had none of that done to it.

#

to do anything at all to flux, you basically have to break it and then assemble the pieces into something that's not flux

sacred geode Oct 31, 2024, 3:40 AM

#

I agree, and thank you for the confirmation.

craggy crest Oct 31, 2024, 3:40 AM

#

sacred geode I agree, and thank you for the confirmation.

there's a reason that when asked about it, black forest said they were not interested in making it trainable. also a reason they never released any information on how to train it.

sacred geode Oct 31, 2024, 3:41 AM

#

BINGO

craggy crest Oct 31, 2024, 3:41 AM

#

and a very specific reason why you can train SD3.5 almost instantly, and it is such a massive fight to do anything to flux

sacred jewel Oct 31, 2024, 3:41 AM

#

sacred geode Oct 31, 2024, 3:41 AM

#

Local it sure is for everyone I know who trains

craggy crest Oct 31, 2024, 3:41 AM

#

we're being 'watched'

sacred jewel Oct 31, 2024, 3:42 AM

#

craggy crest we're being 'watched'

wristwatched to be exact 😛

craggy crest Oct 31, 2024, 3:42 AM

#

sacred geode Local it sure is for everyone I know who trains

there's also a reason that all the devs who have been fighting to do anything with flux since it released dropped it and grabbed 3.5 the second they could

craggy crest Oct 31, 2024, 3:43 AM

#

sacred jewel wristwatched to be exact 😛

rock-n-rolex?

sacred geode Oct 31, 2024, 3:43 AM

#

craggy crest there's also a reason that all the devs who have been fighting to do anything wi...

Beautiful news

craggy crest Oct 31, 2024, 3:43 AM

#

trainers came out, there was a lora the DAY it released and the guy that made it didn't have early access, etc

#

it's almost effortless to work with

sacred geode Oct 31, 2024, 3:44 AM

#

Flux is sledgehammer training then you finally manage it and concept bleed ruins it. SD3.5 is a sponge.

craggy crest Oct 31, 2024, 3:44 AM

#

flux is an uphill battle unless you stick within its very tight range.

#

it's a tool. a very good tool, with a very specific purpose. use it for that and don't try to make it be something it's not

sacred geode Oct 31, 2024, 3:47 AM

#

I wish we had a save to lora node, or save to checkpoint then extract a lora from it, for 3.5.

craggy crest Oct 31, 2024, 3:48 AM

#

sacred geode I wish we had a save to lora node, or save to checkpoint then extract a lora fro...

not how it works. you have to train a lora, just like you do any other model. you can't just 'save' a file

sacred geode Oct 31, 2024, 3:49 AM

#

I'm not stupid I meant you have 3.5, you have your lora. you adjust the lora. perfect. time to save it out. YES, there is a save lora node in comfy just not for 3.5

#

the old way was save it to checkpoint and extract

craggy crest Oct 31, 2024, 3:49 AM

#

sacred geode I'm not stupid I meant you have 3.5, you have your lora. you adjust the lora. ...

well - if you save the workflow, you'll always have your settings. and you can save templates

#

just set the lora up, and save it as a template

sacred geode Oct 31, 2024, 3:50 AM

#

of course, but this is for release. once we get the tools we had in XL then I can rock the training for 3.5

#