#🆕|sd3

muted dove
#

With a flex in their profile of "Crypto holder". 🤣

bitter hearth
#

could someone put on civit pls my shell scripts are written for that

finite osprey
#

waiting for the first woman lying on the floor

signal shuttle
#

Made with SD3.5M

bitter hearth
#

detail levels look good

signal shuttle
#

Text ain't half bad

turbid grotto
#

yea that is definitely improvement

sacred jewel
cobalt moon
#

This image shows a cozy, sunlit room with a relaxed, homey atmosphere, featuring Furina from Genshin Impact. She stands barefoot near a window, dressed casually in a tank top and shorts, holding a plate of food and a glass of water, looking slightly surprised or concerned. The warm sunlight streaming through the curtains softens the scene, creating a peaceful, domestic vibe. On the left side, a pair of playful, water-like creatures—one wearing a whimsical top hat—are causing mischief in the kitchen, splashing water all over the sink and bubbles floating in the air. The room is filled with small, charming details, like shelves full of books, potted plants, and scattered objects such as cans and a tipped-over cup. The balance between Furina’s calm stance and the chaos in the background gives the scene a playful, slice-of-life feel, capturing a moment of quiet absurdity in a seemingly ordinary day.

not a great prompt but hey.

turbid grotto
#

textures are really good

dusky thistle
noble coyote
#

Flux/Florence2 + LoRAs img2img

patent acorn
cobalt moon
#

I mean the basic feature is there

patent acorn
#

cuz its not trained on dat

#

wait for finetunes

turbid grotto
#

am I crazy or it is not worse than large?

patent acorn
#

"sd3 medium is all you need"

turbid grotto
dull star
#

waiting for gguf

dull star
#

oh okay

turbid grotto
dull star
#

I was looking at city's huggingface

#

oh wait yeah

#

I just got used to gguf lmao

dull star
#

didn't even remember WHY I was using gguf

cobalt moon
dull star
#

yeah its good for that

dull star
#

thank god for these GGUFs

#

they have so much better quality than fp8

turbid grotto
cobalt moon
#

will try that tomorrow

#

like with my 2GB VRAM setup

#

I did have my 6GB VRAM laptop tho

dull star
#

q4 is good on like 7B and up models (talking about LLMs)

#

but this may apply to DiTs since these are transformer based

#

but the point is, a large quantized model might outperform a small unquantized model

#

but it will be way slower for sure
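
A rough way to put numbers on the "large quantized vs small unquantized" point: weight size is roughly parameter count times bits per weight. This is only a sketch. The parameter counts (about 2.5B for SD3.5 Medium, about 8B for SD3.5 Large) and the 4.5 effective bits for a q4-style quant are assumptions for illustration, and real GGUF files keep some tensors at higher precision.

```python
def weight_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), ignoring file metadata."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9


# Assumed parameter counts, not taken from this chat.
MEDIUM_B = 2.5
LARGE_B = 8.0

for name, params, bits in [
    ("medium fp16", MEDIUM_B, 16),
    ("medium q4-ish (assumed 4.5 bits)", MEDIUM_B, 4.5),
    ("large fp16", LARGE_B, 16),
    ("large q4-ish (assumed 4.5 bits)", LARGE_B, 4.5),
]:
    print(f"{name}: ~{weight_size_gb(params, bits):.2f} GB")
```

Under these assumptions a q4-style Large lands in the same size range as an fp16 Medium, which is why the comparison is worth making at all. It says nothing about speed.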

#

hmm waiting for all the reddit posts that trash on SD3.5 Medium

#

thomas its to be expected

turbid grotto
#

I hope it will be adopted, it looks so good

#

look at this aesthetics

cobalt moon
#

That guy probably has a dual 4090 setup

dull star
#

yeah he can fit flux and T5 at FP32 on his GPUs

turbid grotto
#

imagine opposite happens

turbid grotto
#

q4 is smaller than 1.5 but speed drops from 1.16s/it down to 1.41s/it

#

quality does not drop heavily

noble coyote
#

"My Triple 5090 512Gb RAM, 3 x 10Tb SSD is gonna ... "

#

... need a pocket nuclear-reactor to power

patent acorn
#

wanted to try style blending between photo and anime, unfortunately it didnt do well in medium maybe i suck at prompting 😂
"a photo of a cafe at night. there is an anime girl sitting on a chair"

turbid grotto
patent acorn
#

only seen lykon trying 3 subjects in separated style

noble coyote
#

raw photo of a cafe at night. an anime girl is sitting at a table

patent acorn
#

ok lemme try

turbid grotto
rapid salmon
lavish osprey
rapid salmon
#

It seems like ComfyUI is the only source (that I know of) that supports K quants but...I really prefer other tools. :x

turbid grotto
remote holly
#

is really good

lavish osprey
#

about to upload my upscaling workflow

bitter hearth
#

ok nice, tiled upscale is what I think I might use it for

turbid grotto
#

what is this node doing in the "SLG" workflow?

sacred jewel
lavish osprey
#

waiting for review

lavish osprey
bitter hearth
#

I don't use small GPUs but the smallest quant being just 1.79 GB is kinda cool

lavish osprey
#

runs on a mobile phone lol

bitter hearth
#

oh yeah I forgot phones
that's a legit use case for smol model

turbid grotto
muted dove
sacred jewel
bitter hearth
#

if its anything like as good as PAG was for SDXL then this is a big deal

#

been missing PAG so badly

turbid grotto
remote holly
turbid grotto
#

seems like it does help

#

not always of course

graceful osprey
#

SLG = SkipLayerGuidance

#

Yes it should help

#

The underlying mechanism is very similar to PAG

lavish osprey
#

@lunar canopy how come Dango has the yellow name and not me

graceful osprey
#

Oh that's my old days dev title still there ? Lol

lavish osprey
#

chat, I need dog plushie vs dangos fight images now

bitter hearth
#

in terms of just a single node giving a boost

lavish osprey
#

one thing you might have missed with Medium, is that it works at 512x512 too

#

unlike Large that ONLY works at 1mp

#

at the end of the day, they complement each other very well

#

I generated over 2k images with Medium as refiner yesterday and I love it

graceful osprey
#

SLG is not as direct as PAG in terms of layer selection given how DiT model works - but play with it - it is fun

lavish osprey
noble coyote
#

Flux/Florence2 + LoRAs img2img

bitter hearth
graceful osprey
#

Relevant layers:
2,4,7,8,9

bitter hearth
#

Divisive Norm and Spectral Modulation from here gives a bit of PAG effect also and works on all DiTs https://github.com/Clybius/ComfyUI-Latent-Modifiers

#

should stack with SLG

turbid grotto
lavish osprey
#

I'm also curious to test this stuff here ^^^

turbid grotto
icy coral
# lavish osprey

Wouldn't it be wise to release your base models together with an example fine tune, one maybe done in partnership with a prominent fine tuner, so the people that are less tech savvy and more focused on quality rather than trainability and versatility can have a glimpse of what the model could offer down the line?

lavish osprey
turbid grotto
lavish osprey
bitter hearth
#

the model might be stronger structurally at 512, cos in my opinion flux is strongest structurally at 512 or sometimes even 384

remote holly
lavish osprey
#

cause... you don't have long distances

#

you can also attempt a "cascade-like" workflow

#

where you generate low res, then upscale and refine
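
One way to see the "no long distances" point: a DiT attends over a grid of latent patches, and that grid grows quickly with resolution. A minimal sketch, assuming an 8x VAE downsample and 2x2 patches (the usual SD3-style layout, not something stated in this chat); the function name is made up for illustration.

```python
def latent_tokens(width: int, height: int, vae_factor: int = 8, patch: int = 2) -> int:
    """Number of image tokens the transformer attends over (assumed layout)."""
    step = vae_factor * patch
    if width % step or height % step:
        raise ValueError(f"width and height should be multiples of {step}")
    return (width // step) * (height // step)


for w, h in [(512, 512), (1024, 1024), (1440, 1440)]:
    print(f"{w}x{h}: {w * h / 1e6:.2f} MP, {latent_tokens(w, h)} tokens")
```

With these assumptions the low-res first stage of a cascade-like workflow works on about a quarter of the tokens of a 1024x1024 pass, so composition is decided on a much smaller grid before the upscale and refine step.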

bitter hearth
#

yeah "cascade-like" is how I do flux, its good for these

turbid grotto
icy coral
dull star
#

how do I use the skip layer guidance

bitter hearth
#

the simpletuner dev showed some screenshots of sd3.5m fine tune testing and it already looks better

dull star
#

or do I even need it

lavish osprey
dull star
#

sd3.5 medium

dull star
#

idk if euler ancestral is helping but I'm using it for sure

#

im gonna do 40 steps with it

#

since this is almost 2 it/s

#

medium still has that superior VAE compared to sd3.5 large

#

no speckled "dust" artifact

remote holly
bitter hearth
dull star
#

lol dem fingers

remote holly
#

the styles are amazing

turbid grotto
#

3072x512

#

medium

remote holly
dull star
#

thanks

#

I prompted it

#

euler ancestral, cfg 4, ddim_uniform, 40 steps

#

sadly it doesn't like ugliness, almost like flux

remote holly
#

thanks

dull star
#

Prompt: The man is shirtless and is barefeet and covered in mud. His long pants are old and torn. He looks thin and frail.

Reality: The man is genetically perfect, has abs and a nice jawline, he is NOT covered in mud cause that would be unappetizing... Rather, we just brushed him with a tiny bit of dirt. He looks like someone who is well fed and his pants cannot be torn as that doesn't look aesthetically pleasing...

analog dome
#

I know a lot about Stable Diffusion, but one thing I never looked into and would like someone to explain: what does the TX5 (or whatever it's called) do for sd3.5? what does it improve? I'm running without it and it's working fine, but I want to know what I'm missing

dull star
#

T5_XXL is a large language model, but unlike most of the ones we use today, which are decoder-only, this one has an ENCODER too.
Some models can use it as an encoder (like CLIP I think??) to improve prompt adherence and text capabilities.
If you are mostly making images without text or rather simplistic scenes then yeah... you won't ever need it.

dull star
#

he still has abs though and his pants are NOT torn

analog dome
#

thank you very much, I think this sd3.5 + the announced controlnet support will do wonders

dull star
#

@cunning lintel

turbid grotto
#

is there info about controlnets?

dull star
#

BEFORE and AFTER | SkipLayerGuidanceSD3 (default settings)

turbid grotto
dull star
#

medium

turbid grotto
#

looks great

lavish osprey
turbid grotto
#

256x256 lol

#

2s per image

#

native fullhd

sacred jewel
#

SD3.5M ... random seeds, all other settings the same.
prompt: what?

vital surge
sacred jewel
#

Prompt: zombies running screaming with giant billboard in the background that reads "WHOA"

Not sure if my output is indicative or I am doing something wrong? Using the workflow example from the HF Repo for SD3.5M

Definitely not happy with these results

bitter hearth
#

not necessarily doing something wrong, the model isn't as strong as the big boys

mortal mesa
#

Uh where is skip layer guidance node from

bitter hearth
#

its a core comfy node

mortal mesa
#

mmm i thought i updated, ill do it again

keen palm
#

This is 0.5 more B than I need though

bitter hearth
#

might have to do it manually from github

mortal mesa
#

ya i havent actually found it yet lol just started to look

#

OK it is indeed part of Comfy, had to update

sacred jewel
#

a more complex prompt using the Triple CLIP Text Encoder

bitter hearth
#

obviously SAI won't make the flux node so we have to

keen palm
#

Will there be a turbo version of Medium?

bitter hearth
#

I don't know much about prompting so I'm not sure, but it might be doing harm giving big paragraph to weaker text encoders

lunar rivet
#

I see only downvotes under the 3.5 medium announcement, is it that bad?

keen palm
bitter hearth
lunar rivet
#

I mean thumbs down under announcement here but guessing reddit isn't doing much better

bitter hearth
#

oh, reddit and discord are same anyway

#

it takes a while but in a few months there will be Arxiv papers covering these models, there's already a few papers that talk about Flux

mortal mesa
#

Full HD

digital thorn
#

Medium test

signal shuttle
#

I wonder how fast SD3.5M going to be on my dual 3090 setup 🤔, shame I can't test it out right now

bitter hearth
#

for the most part you can just take the parameter count to be the speed multiplier

#

not always true but its not far off

#

however Flux goes double speed per parameter cos it doesn't need a negative
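
The rule of thumb above can be written down as parameters times forward passes per step. This is a sketch of that heuristic only, not a benchmark. The parameter counts are assumptions for illustration (about 2.5B for SD3.5 Medium, 8B for SD3.5 Large, 12B for Flux), and real speed also depends on resolution, quantization and whether everything fits in VRAM.

```python
def relative_step_cost(params_billions: float, uses_negative: bool) -> float:
    """Rule-of-thumb cost per sampling step: parameters x forward passes."""
    passes = 2 if uses_negative else 1  # CFG runs a positive and a negative pass
    return params_billions * passes


# Assumed parameter counts, not taken from this chat.
models = {
    "sd3.5 medium (with negative)": relative_step_cost(2.5, True),
    "sd3.5 large (with negative)": relative_step_cost(8.0, True),
    "flux (no negative)": relative_step_cost(12.0, False),
}
for name, cost in models.items():
    print(f"{name}: {cost:.1f}")
```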

sacred jewel
bitter hearth
#

seems like the image quality you got in the end was similar to the others
its a limited model compared to flux and sd3.5l

sacred jewel
#

OK, now Large vs Medium using a smaller prompt for both L/G

bitter hearth
#

yeah large is just drastically better

#

might be the case that medium is nice for tiled upscale, not sure yet

#

I think Lumina is worth looking at again also, there was a fine tune of it to 2k resolution in the I-max paper, sadly they did not release it but it would be possible to replicate it

sacred jewel
#

Large vs Medium different prompt:

#

Using @lavish osprey 's upscaling workflow which uses both Large and Medium models...

bitter hearth
#

showing some potential

#

on the upscale

#

needs a bit more aesthetic finetuning or preference optimisation

sacred jewel
#

That's a mighty hitchhiking thumb

SD3.5L

carmine vessel
#

In general, how does this perform in comparison to 3.0?

sacred jewel
#

SD3.5L original vs. Upscaled result through Medium (Lykon's upscale workflow)

signal shuttle
carmine vessel
#

Mostly good, but what is the versatility in unique applications such as stylized artwork?

sacred jewel
pseudo owl
bitter hearth
#

3.5m hands are a lot better than 3.0m hands

#

its a big improvement over 3.0m

#

its easy to forget what previous models were like, did sd 1.5 for last week and didn't see 1 correct hand in like 1000 generations lol

#

I really love sd 1.5's lighting and general "vibe" though sometimes, no other model like it

pseudo owl
#

Yeah I think flux spoiled us, basically perfect anatomy each gen.

I still prefer flux but sd3.5 is much more creative than it.

signal shuttle
#

In regards of full fine tunes

lunar rivet
#

huh 3.5 medium isn't that bad after all, resolutions above 1MP don't break like in 3.5 large and the quality isn't terrible for the size and speed

bitter hearth
#

flux spoiled us yeah exactly
cos what comes out now will always have flux as context

sacred jewel
#

SD3.5M vs SD3.5L vs SD3M

same settings for all three generations.

slender fox
sacred jewel
#

SD3.5L vs SD3.5M vs SD3M ... same settings for all three

lunar rivet
bitter hearth
#

hands are the final boss of models

heady copper
#

my last experience with diffusion models is SD2 but by god it's progressed a lot

serene spindle
#

Because SD2 sucked

sacred jewel
#

SD3M vs. SD3.5M vs SD3.5L

signal shuttle
#

Wait I just realized SD3's "woman laying in grass" was just a copy of SDXL's "woman doing yoga" prompt

sacred jewel
#

p.s. misspelling was on-purpose.

heady copper
# serene spindle Because SD2 sucked

you could do some interesting stuff on SD2 but compared to now especially it's pretty mid (and believe me, I loved that model for some reason)

serene spindle
#

I liked 1.5 a lot more tbh

heady copper
#

I was mostly doing embedding training for 2 which was pretty neat but unfortunately the architecture for embeds is no more

#

oh well, this isn't the channel for nostalgia lmao

bitter hearth
#

you can still do embeddings 🤔

sacred jewel
#

SD3.5L vs SD3.5M vs SD3M

bitter hearth
#

SD3.5L did it 🙂

heady copper
slender fox
#

Well, hands are not going to happen, I tried to generate a woman showing her hands, did 100 images, only 3-5 ok.

bitter hearth
#

I think multiple text encoders is gonna go away also

#

they are mostly a temporary anomaly cos its expensive to train a model to fully replace clip

sacred jewel
#

Flux for comparison...

#

although the only one that got the BACK of the hand was SD3.5L 🤦‍♂️

#

With more "encouragement" it finally got the back of the hand LOL

signal shuttle
keen palm
signal shuttle
bitter hearth
#

yeah

#

clip g and clip l for SDXL

keen palm
bitter hearth
#

SD 1.5 can use T5 using ELLA

#

as a weird bonus

sacred jewel
#

I take a Mythbusters attitude. If it doesn't burn on its own, I will MAKE IT burn.

red nacelle
#

not mentioning datasets because somehow people find/make those which is lowkey crazy to think about

digital thorn
#

If lora training works this time I'm sure that it will be a really good refiner model to use...at least for my graphics card. 0.25 denoise for the image to the right. Just a quick test

muted dove
cunning mesa
#

Also lost a thumb.

sacred jewel
digital thorn
cunning mesa
#

SD3.5M is probably a good test bed for training if nothing else.

frail shoal
#

did you guys have any luck inpainting with sd3.5 medium ? Results get completely squashed. worked well as a refiner but inpainting not at all. I'm wondering if i'm doing something wrong

bitter hearth
#

are you using a stochastic sampler or a deterministic one?

#

for inpainting stochasticity is the most important thing

frail shoal
#

using the suggested sampler dpmpp_2m with sgm uniform scheduler

bitter hearth
#

would suggest trying the node version of dpmpp_2s, the one that lets you adjust eta and s_noise
keep s_noise at 1 and set eta as high as you can without the image breaking
that can help a lot
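
For anyone wondering what eta actually changes: in k-diffusion-style ancestral samplers each step is split into a deterministic part and a dose of fresh noise, and eta sets how big that dose is. The sketch below is the classic variance-exploding formulation; samplers adapted for SD3.5 and Flux use a modified version that is not shown here. The function name is illustrative.

```python
import math


def ancestral_step(sigma_from: float, sigma_to: float, eta: float = 1.0):
    """Split a step into (sigma_down, sigma_up).

    The sampler first denoises down to sigma_down, then adds fresh noise
    scaled by sigma_up (and by s_noise) to land on sigma_to.
    """
    if eta == 0.0 or sigma_to == 0.0:
        return sigma_to, 0.0  # no fresh noise: the step is fully deterministic
    sigma_up = min(
        sigma_to,
        eta * math.sqrt(sigma_to**2 * (sigma_from**2 - sigma_to**2) / sigma_from**2),
    )
    sigma_down = math.sqrt(sigma_to**2 - sigma_up**2)
    return sigma_down, sigma_up


for eta in (0.0, 0.5, 1.0):
    down, up = ancestral_step(2.0, 1.0, eta)
    print(f"eta={eta}: sigma_down={down:.3f}, sigma_up={up:.3f}")
```

Higher eta means more of each step is re-randomized, which is what lets the sampler correct earlier mistakes, and also why it tends to need more steps.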

frail shoal
#

it was supposed to be a face

frail shoal
bitter hearth
#

it should give you higher image quality as well as helping inpainting

#

the main downside is stochastic samplers need more steps

slender fox
#

SD 3.5M testing after work, pretty much random stuff I wanted to try. I guess it could have many uses despite several things that don't seem to be working with it.

gusty hamlet
#

SD 3.5 Medium, M2 Mac Pro

bitter hearth
# slender fox

by far the best set of images I have seen from SD 3.5M
really good job

#

I think with a bit more fine tune it has potential

frail shoal
bitter hearth
#

not sure if I could convince you to try more steps
inpainting is a hard task, a lot of steps can help a lot

frail shoal
bitter hearth
#

I would have done 100-150, but maybe 40-60 would be ok?

slender fox
bitter hearth
#

if you want to do like 30 steps then the stochastic sampler I recommended might be worse
for low steps, Deis and UniPC can be good

frail shoal
frail shoal
bitter hearth
#

I agree flux was able to inpaint with less steps
even Schnell could inpaint lol

dim geyser
#

my macbook pro did fine with SD3.5L but it really isn't liking SD3.5M. all the results are coming out messed up.

slender fox
frail shoal
slender fox
#

With SD 3.5 medium.

bitter hearth
#

ah I didn't realise these were with euler
so with a stronger sampler you could get some more quality also

bitter hearth
#

euler is the weakest out of any of the default comfy samplers
for deterministic, DPM++ 2M, Uni PC and Deis are particularly good
for stochastic, DPM++ 2M SDE, 3M SDE and 2SA are good, although only 2SA works with Flux and SD 3.5
you can get better samplers than those default ones but they require custom nodes

frail shoal
bitter hearth
#

Clownshark's node pack gives lots of stochastic samplers that work with Flux and SD 3.5
https://github.com/ClownsharkBatwing/RES4LYF
Sampler RK is the latest node as far as I know, and its got a few to choose from

#

if you put Eta to 0 then they run in deterministic mode

quaint flare
digital thorn
vale oyster
#

I gotta ask, as I just had to hit the wrong setting in my BIOS so my PC is in no post mode atm, as it power cycles.

So I need to rely on a cloud provider -- is there any good big names as of recently for 3.5 and flux?

#

I was looking @ modal.. But is runpod still an actively worked on cloud gpu provider?

violet escarp
hallow lion
#

so medium looking good huh

#

much faster and higher than 1 megapixel

pseudo owl
#

yeah anatomy isn't that great but nice for its size, great for people with weak gpus.

hallow lion
real terrace
#

2:54 generation time here for 823x1152

#

In original workflow clip3 was fp16

#

Really like the details, they need upscaling but pretty interesting to work on

dusky thistle
dim geyser
dusky thistle
patent acorn
#

the post-falling down the stairs

dense blade
#

What's your performance difference between SD 3.5M and L? I use SD 3.5M (default) and SD 3.5L (Q8) with the following speeds: 3.96s/it (M), 7.18s/it (L) - both use the same settings: shift 3, 40 steps, 4.5cfg, dpmpp_2m, sgm_uniform. I expected to get better performance from the M version.

#

My env: Linux, AMD Radeon RX 6700 XT (12GB), pytorch 2.5.0+rocm6.1

sullen moss
hallow lion
#

Red Panda?

#

So basically we have entered singularity when a new ai model comes out every day.

craggy crest
winged seal
# sullen moss Hmm

I have done a few hundred images of ranking. Red panda seems to have a little more variation in composition, mixed with really bad Dall E 3/MJV6 "Aesthetislop" which people look at and go "oo pretty" cause there is an unnecessary amount of contrast/noise/"detail"

#

Examples of aesthetislop, where there is just noise and "detail" everywhere to try and get the same reaction as jingling keys in front of a baby's eyes lol

#

I get why people like it, but its just all noise/nonsensical detail over anything logical or visually pleasing in a toned down way

hallow lion
#

fast food for the eyes

winged seal
#

pretty much haha

#

just overwhelm the senses so people don't take a look closer and realize its all meaningless noise and nonsensical details/tones

hallow lion
#

i like the new word aesthetislop

winged seal
#

A lot of people who do the head to head comparisons on that space go for "oo pretty" within 2.5 seconds of looking instead of taking in the composition, stylistic variation, or interpretation of the prompt

winged seal
sacred jewel
#

Zombiecore LoRA - Match made in heaven 🤭

cedar vortex
winged seal
craggy crest
runic tusk
winged seal
#

Blueberry was around for a bit, and now you can use it

#

I am just curious who its made by

dusky thistle
craggy crest
winged seal
#

yeah, we know nothing about redpanda as of now

craggy crest
# winged seal you mean for redpanda?

yeah. it's vaporware. the images could have been created with anything. once we get a demo, and know who's behind it, then it's time to worry about it

winged seal
#

but yeah, curious to see who made it

#

last time it was a new Black Forest Labs model, so we will see

dusky thistle
winged seal
#

We have been due for a new Dalle4 for a long time, honestly. Dalle 3 was a prompt adherence champion, but my god did it look abysmal

errant dust
#

We'll agree to disagree.

#

In any case, for those awaiting it. The GGUF build of SD3.5 Medium is out

sacred jewel
# sullen moss Hmm

I can probably make one of those charts in Photoshop in :30 seconds... I'll call mine Hawk Tuah and give it a score of 2001, a rate of 95% and a selections # of 76871 and profit 😎

errant dust
#

What is Red Panda?

real terrace
cobalt moon
#

It just appeared in Artificial Analysis

real terrace
#

yes

errant dust
cobalt moon
#

I think it's probably a codename

errant dust
#

Workflow is there. You need to install the GGUF extensions and make sure Comfy is updated

cobalt moon
#

like Flux's codename "Blueberry"

errant dust
real terrace
real terrace
# errant dust

Also I see now that the text encoder can be quantized too

#

what is the SD3-5 vae? I don't think I used one before

craggy crest
#

@dusky thistle comfyUI has a new scheduler in it

craggy crest
real terrace
#

at this point I load some model and it worked somehow

#

I don't even remember where did I get it

#

The only thing about these models is that they take so much I don't fiddle or play too much with the parameters, I just set prompts and img size

dusky thistle
real terrace
#

your gun is backwards sir

hallow lion
real terrace
craggy crest
real terrace
#

they need refinement

#

sh,don't shout the prompt bro

#

god dammit

sterile pendant
# craggy crest linear_quadratic

Afaik, it's pretty much only used with mochi or w/e that model is called. I haven't tested to see if it works at all with other models

#

Off the top of my head, it should likely look like something between an sgm_uniform and simple sigma curve

#

ahh nvm, looks like they use some wonky exponent for it. looks like it spends a lot of time shifting things around a bunch, then rapidly drops into fine details

#

it's definitely meant for flow models only. sdxl models usually have concave curves and it didn't respond to that when I tried to run this with an sdxl model.

#

either way, it's likely not all that useful for non-mochi models.

dusky thistle
mortal mesa
craggy crest
dusky thistle
sterile pendant
# craggy crest Sd3 is a flow matching model

It is, but that kind of curve is not what the model was trained on. So sure, it will work, but it's not optimal. It will spend far less time resolving medium and fine details using that linear quadratic curve

#

Think of it like model shifting to an extreme value away from the norm

dusky thistle
#

i think beta's been best

#

it's got just enough time spent on the fine details

dusky thistle
mortal mesa
dusky thistle
#

papa toilet

severe phoenix
#

please can anyone explain to me what these two clips do. is one more powerful than the other?

sterile pendant
dusky thistle
mortal mesa
severe phoenix
craggy crest
#

SD3 uses all three encoders, flux only uses two

#

clip_g is the workhorse - it drives the entire thing

#

clip_l and t5-xxl work along with it

#

t5xxl gets your detail rich, narrative, natural language prompt

#

clip_l gets your artsy, ambient, background, fine details prompt

#

clip_g gets the no-nonsense, just-the-facts-ma'am information about what the image is

#

IF you do not use a node that allows you to put in a separate prompt for each encoder, then the text you put into the positive prompt will be given to each encoder anyway, and they will fight with each other - they aren't in sync
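
The fallback described here (one shared prompt goes to every encoder unless you give an encoder its own) can be sketched as a small data structure. This is only an illustration of the idea, not how any particular node is implemented; the class and field names are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TriplePrompt:
    """One shared prompt plus optional per-encoder overrides (hypothetical helper)."""
    shared: str
    clip_l: Optional[str] = None
    clip_g: Optional[str] = None
    t5xxl: Optional[str] = None

    def resolve(self) -> dict:
        # Any encoder without its own prompt falls back to the shared text.
        return {
            name: getattr(self, name) or self.shared
            for name in ("clip_l", "clip_g", "t5xxl")
        }


prompt = TriplePrompt(
    shared="a photo of a cat",
    clip_l="soft window light, film grain",
)
for encoder, text in prompt.resolve().items():
    print(f"{encoder}: {text}")
```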

real terrace
#

for example in this img the text that appeared was just some text to reinforce the idea, not some main title to write

craggy crest
real terrace
#

it would be in the clip_l I guess

craggy crest
real terrace
#

I wonder how much difference you'd find between a single prompt and these 3 separate clip prompts, has anyone tested that?

severe phoenix
real terrace
craggy crest
#

this is an SD3.5 Large workflow i'm working with right now, it also has the node for all three encoders. to get the workflow: 1. click the image to open in viewer. 2. click the Open in browser words. 3. right click, save as. then drag and drop into comfyUI

craggy crest
#

the AI will try to guess what it might be

real terrace
craggy crest
real terrace
#

so clip_g would be the classic SD prompt, t5xxl would be the "ChatGPT" type prompt, and clip_l would be booru tags, like "trending on artstation"?

craggy crest
#

here

#

clip_l: trees, warped, twisted, weathered, ruined, cracked, flames, fire, smoky; atmospheric lighting; peter mohrbacher, james jean, william morris, ernst haeckel, zaha hadid
clip_g: apocalyptic ruins; pink tree growing flames;
t5xxl: the scene is set in apocalyptic ruins. in the center, we see pink tree growing flames rather than leaves; All that is left of the buildings are parts of walls with crumbling bricks

#

=======

#

clip_l: thick impasto painting, heavy canvas texture, noticeable brush strokes
clip_g: Stratocumulus; Renaissance Island beach sunrise, maya beach by artist "Paul Dougherty", by artist "Nicolette Ceccoli"
t5xxl: the scene is a stunning sunrise at the beach. cumulostratus clouds cover the sky with red and gold, while transparent green water and white foam breaks on the light brown sand,

#

=====

#

there are 2 sets of prompts to play with

#

@dusky thistle comparisons. model: sd3.5 large. sampler: euler_ancestral. schedulers: linear_quadratic on the left, beta on the right. workflow is in the images

dusky thistle
#

that looks real good on the right

#

what's interesting i've noticed with these flow models is it's like it just needs to set the trajectory real carefully

#

almost like aiming a rifle

craggy crest
#

it does. i'm working on that sampler/scheduler sheet - should be done tomorrow

dusky thistle
#

and then pow, it can take massive steps

craggy crest
dusky thistle
#

no idea, been coding all night

craggy crest
#

skipping layers, like matteo did with flux back there. skipping the blocks

dusky thistle
#

just got a brutally, just, oh-my-gawd-don't-want-to-admit-it bug fixed that's been hanging around for weeks

#

cool yea sounds like PAG with zeroing out the V

craggy crest
#

and then he talked about how certain blocks didn't do anything and we played around with not using them?

mortal mesa
severe phoenix
dusky thistle
#

it's great having comprehensive nodes like that

#

i make shit like this all the time

craggy crest
#

there are two workflows on that repo, one with, one with out and you should probably DM lykon about that setup you have and talk to him

dusky thistle
#

sounds good

#

gonna have to go pass out soon here but will do

craggy crest
#

and i want your workflow

#

i can't find the one matteo released

dusky thistle
#

3rd order RES is looking pretty nice with ultracascade

#

which WF?

craggy crest
#

he's got teeny wings coming out of his neck

dusky thistle
#

hahah yeah

craggy crest
#

i'll bet those nodes aren't available, are they?

craggy crest
dusky thistle
#

they should be

#

i think they're in there, if not i can get em to ya

craggy crest
#

DM lykon, and go crash

sacred geode
viral plaza
#

@lavish osprey @graceful osprey SD 3.5 Medium modelspec resolution is 1440x1440, however the model struggles to run this resolution natively: (it's fine at 1024x1024, and I bet it can upscale to 1440 happily, but native from the ground at 1440 doesn't seem to work too well. Maybe with the skip layer guidance, longer prompt, luckier seed, and all you can push it into working? idk even then I feel like recommending 768 or 1024 with an upscale is smarter than recommend native start at 1440)

#

that's "a photo of a cat", and the staircase it's inside of is the early steps coming out completely corrupted, with the later detail steps managing to refine it into something that almost looks like a real structure

bitter hearth
#

Flux doesn't do well at the advertised top resolution of 2048 either

#

unless you are using I-max

turbid grotto
#

but not first try

viral plaza
#

that's defined in the metadata header of the model if you're unfamiliar with modelspec

bitter hearth
#

ah thanks
I knew there was metadata inside .safetensors files but I have never looked at them
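
If you want to look at that metadata yourself: a .safetensors file starts with an 8-byte little-endian length, followed by a JSON header of that length, and free-form string metadata lives under the `__metadata__` key. A minimal sketch using only the standard library; the `modelspec.resolution` key name in the usage note is an assumption about how modelspec stores the value.

```python
import json
import struct


def read_safetensors_metadata(path: str) -> dict:
    """Return the __metadata__ dict from a .safetensors header (empty if absent)."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # 8-byte little-endian size
        header = json.loads(f.read(header_len).decode("utf-8"))
    return header.get("__metadata__", {})
```

Something like `read_safetensors_metadata(path).get("modelspec.resolution")` would then show the advertised resolution, if the checkpoint carries that key. Only the header is read, so this is fast even on multi-GB files.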

#

okay yeah you have a good point then

noble coyote
#

Clip_l, Clip_g, Clip_round_the_ear - what does this one do?! 🥳

muted dove
#

Even the strongest have their weak days.

bitter hearth
#

on Civit there is someone adding Flan-t5-xxl to Flux and SD3.5

patent prism
#

a cat

muted dove
#

I doubt it's a cat that's adding it.

muted dove
real terrace
#

I'm trying this

meager patio
#

Hello everyone, I really need help. I was struggling all night, installing a stable diffusion, literally for one job, and right now I'm launching it for the first time and it drops what's in the picture, talking about old drivers. I can't update it because this is the limit for the video card.

I want to ask someone to do literally one job. I'll be extremely grateful if someone responds.

real terrace
#

This large model took as much as the GGUF sd3.5 medium q5_0

bitter hearth
#

the big speed up is only gonna come once you have a setup that fits within your VRAM

#

regarding clips, some people like this for Clip-L https://huggingface.co/zer0int/CLIP-GmP-ViT-L-14
although I am not sure about it

sterile pendant
#

Another big key difference about sd3.5 medium is that the t5 should contain no more than 256 tokens

bitter hearth
#

Flan-t5-xxl for T5 I am more sure about

real terrace
bitter hearth
#

make sure the text encoders get put onto the CPU before the Ksampler starts
and try Q4_0

#

or just offload the text encoders entirely

#

after text encode

sterile pendant
#

And SLG works well with 3.5 medium. Don't know about large though since medium has the slightly different architecture with extra self attention layers or w/e it was

bitter hearth
#

SLG will work with Large and Flux I think
just needs porting and recalibrating

#

its a clever idea, it drops layers for the negative so the structure gets messed up

fossil pagoda
#

👁️

bitter hearth
#

since models try to do the opposite of the negative, you end up with better structure

#

for Flux this will require generating a negative, with the 100% speed penalty
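
A toy version of the idea, to make the arithmetic concrete: on top of normal CFG you add a term that pushes away from a prediction made with some layers skipped. This is a sketch under assumptions, not the actual node code. The exact form of the extra term, the argument names and the scale values are all illustrative, and plain lists of floats stand in for latent tensors.

```python
def guided_prediction(cond, uncond, cond_skipped, cfg_scale, slg_scale):
    """CFG plus a skip-layer term, applied elementwise to equally sized lists.

    cond         : prediction for the positive prompt
    uncond       : prediction for the negative prompt
    cond_skipped : prediction with selected layers skipped (degraded structure)
    """
    return [
        u + cfg_scale * (c - u) + slg_scale * (c - s)
        for c, u, s in zip(cond, uncond, cond_skipped)
    ]


print(guided_prediction([1.0, 2.0], [0.0, 1.0], [0.5, 2.5], cfg_scale=4.0, slg_scale=3.0))
```

The skipped-layer prediction needs its own forward pass, which is where the extra cost comes from on models that otherwise run without a negative.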

#

would be good to try this method with Perp-Neg also

muted dove
#

SD3.5L using Google FLAN and Flux refiner.

remote holly
#

what are the resolutions supported by sd3.5?

muted dove
muted dove
graceful osprey
#

ah but you might need a higher shift

#

like 4~5
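
For context on what "shift" does: SD3-style flow models commonly remap the noise level with a time shift, so a higher value keeps sampling at high noise for longer, which is where global structure gets decided. A minimal sketch, assuming the shift setting maps to the usual formula below, with sigma running from 1 (pure noise) to 0 (clean image).

```python
def shift_sigma(sigma: float, shift: float) -> float:
    """Remap a noise level in [0, 1]; shift=1 leaves it unchanged."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


for shift in (1.0, 3.0, 5.0):
    row = ", ".join(f"{shift_sigma(s, shift):.3f}" for s in (0.25, 0.5, 0.75))
    print(f"shift={shift}: {row}")
```

The endpoints stay fixed at 0 and 1; only the middle of the schedule moves toward the noisy end, which is why bigger images are usually paired with a bigger shift.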

muted dove
alpine summit
viral plaza
# graceful osprey Hmmm weird. 1920*1024 should just work out of box

1920x1024, Seed 1, a photo of a cat, CFG 7, Sigma Shift 3 on left and 5 on right, Sampler Euler scheduler Normal, Steps 20
in both cases while watching live previews it's clear the early steps are failing and the latter steps are trying to recover the corruption into something coherent (in this case a TV screen type of effect on the background, and a cat with a patchy outline lol)

graceful osprey
#

Steps 20 - try dpmpp_2m steps50

#

should work - let me try as well

viral plaza
#

here's that at the end. The two other images are intermediate previews just to show what it's doing in the early steps

#

for comparison at 1024 the early step previews have a relatively clean coherent image, just without details built yet

bitter hearth
#

In Flux I got better results with stochastic samplers
the built in DPM++ 2SA in Comfy has been adapted to work with rectified flow

graceful osprey
#

Noisy beginning is kinda expected but should not be this bad

viral plaza
#

and also those early previews resemble the final image moreso than the high res

bitter hearth
#

I've been trying to port restart sampler to rectified flow, it might help with these models
its given me the best results in SD 1.5 and SDXL

#

it adds the noise in a special way, its a bit tricky

alpine summit
remote holly
#

what are the schedulers and samplers?

muted dove
alpine summit
remote holly
#

does someone know a prompt system for prompt enhancing

alpine summit
alpine summit
noble coyote
#

Flux/Florence2 + LoRAs img2img

severe phoenix
# muted dove

omg i never knew stuff like this existed! is there one for img2img

turbid grotto
flat oracle
#

why is my workflow i used for sd3.5 l not working for sd3.5 m

viral plaza
#

oh shit, i'm so used to ignoring T5 with SD3 since it did almost nothing with the SD3.0 models, is it actually needed for SD3.5 Medium now?

flat oracle
viral plaza
turbid grotto
flat oracle
#

By any chance you know why my workflow is not working?

viral plaza
flat oracle
viral plaza
#

oh i left dpm++ on instead of euler, slightly different details with euler but still wonked

turbid grotto
#

"a photo of a cat"
seed 1
steps 20
1440x1440
dpmpp_2m
sgm_uniform
cfg 4
fp16 everything

turbid grotto
#

hmm there is stripes in background but seems to be gone if I specify

viral plaza
viral plaza
#

i think this is just a case of, the model wasn't trained enough at or above the 1440x1440 res, and the more input you give the more it can compensate and make it work, but if you give minimal input the difference shines aggressively

#

this is a general case with image diffusion models - they are largely self-correcting, so giving it more of anything helps cover issues. More prompt tokens = wider attention = more self-correction. More steps = more actual entire runs of the model = more self-correction

#

(relatedly in the other direction, generating with empty prompts is a great way to sus out implicit biases in a model, whatever it outputs without conditioning will be approximately representative of the types of content it was trained the most on and may have developed a general bias towards)

#

(in that test, SD3.5Medium has a very nice broad range, and seems to favor 768x768 for the most stable generations, even at 1024x1024 with empty input it displays some striping patterns)

turbid grotto
viral plaza
#

yee

turbid grotto
viral plaza
#

that means model lacks any visible bias, which is good

#

I've seen some models eg continually generate humanoid outputs on empty prompt, indicating that the model was violently overtrained on humans to win on "omg it makes pretty girl" aesthetic evals but otherwise useless as a foundation for anything

#

SD3.5Medium has a very broad range of random outputs from that, indicating it likely hasn't been overly tuned on anything in particular, making it very optimal for a base model for community tuning adoption

#

(noting however that's just a quick n dirty test, not a guarantee, so grain of salt and all that)

#

sd35m promptless outputs

#

maybe a touch of over tendency towards artsy stylings but eh

#

and for comparison, Flux Dev

#

burnt hard on high quality photoreal art

#

that woman included is The Flux Woman who everyone sees constantly, and the community have taken to identifying by way of her specific "butt chin" shape

turbid grotto
turbid grotto
#

so, you either have a diverse model lacking coherence or a coherent one lacking diversity

viral plaza
viral plaza
#

it's funny how often people have said my discord bot running flux schnell looks better than their flux dev results (or similar comparisons)... the model's worse, but my discord bot has an LLM that extends your prompt, and that does magic

lavish osprey
lavish osprey
signal shuttle
#

After playing a lot with flux and SD 3.5 large, i began to forget how fast 1s/it was, Man SD 3.5 medium is so fast

noble coyote
#

Budget's just finished - penny off a pint of beer!

#

I'm off down the pub!!! 🥳

flat oracle
noble coyote
#

<---------------- needs OmniGen on ComfyUI

turbid grotto
#

I often see people saying sd35 is not cinematic/realistic but that is sooo wrong, they're just used to models that default to a certain style. In that case you have to prompt to get it, and you will

turbid grotto
viral plaza
noble coyote
#

Its the power to meld disparate elements of different photos - seamlessly - which excites me

viral plaza
#

seems like it has cool capabilities but its general image gen quality is very "meh" tier, making it hard to justify the disgustingly long gen time

muted dove
noble coyote
signal shuttle
#

(SD3.5L + SD3.5M)

dusky thistle
signal shuttle
dusky thistle
noble coyote
errant dust
#

Maybe I am just choosing the wrong subjects, but so far I have been underwhelmed by SD 3.5 Medium, compared to Large of course. (Workflows included, but the prompt is basically: "a comic with strong outlines of a tree that stands tall in the center, with cats of various sizes, colors, and expressions scattered around its trunk and branches. Some cats are nestled among the leaves, while others are perched on the tree's main branches. The tree's trunk is brown, and the leaves are green. The cats exhibit a range of facial expressions, from curious to content.") First is SD 3.5M at 768x768, then 1024x1024, and finally SD 3.5 L

#

same seed

untold valley
#

Anyone have a quick TLDR of the state of things? How is 3.5 compared to the last released model or Flux? Seems there are striping patterns, is that the worst of it?

errant dust
#

I have not seen striping issues myself, but it seems to completely wig out at sizes over 1024 x 1024. Large that is.

#

the output is quite different and they have different strengths and weaknesses. If you want hands, well, Flux is your choice, but there is more to imagery than that. Comics are clear win (no Loras) for SD 3.5 L

untold valley
#

what about text?

#

faces/body how malleable is it?

#

can it do the dreaded woman laying on grass

#

astronaut on the moon

errant dust
#

The thing is I don't really do stock photography with people. But that much is fine. Just try it IMHO. Nothing endemic

rapid pivot
#

Hello Chess man

untold valley
#

comfy still the best way? forge? anything developed better?

rapid pivot
errant dust
#

Hey @rapid pivot

pseudo owl
errant dust
#

and Comfy fits that description

#

The trick is to explore them with the themes and types you want, and learn which are best at what

untold valley
errant dust
#

I can tell you my findings, but they only cover what interests me. So anime? No clue. Chicks with boobs? Check Civit. Etc. Artwork? Stock imagery. Creative content? Text? I'm your guy

#

I mean, in terms of text, the clear no.1, of any AI, is Ideogram 1.0 and 2.0. It isn't even close. Flux is a decent second, but distant. Logos? Flux is the king today. All the others, MJ included, are just behind

#

MJ6 that is. MJ 5 was strong

#

Satirical comics? Imagen 3 is the best now. Also not close.

untold valley
#

afaik Flux hard to train and that's why we were waiting on 3.5 or model like it. and it appears 3.5 needs more training tossed at it.

errant dust
#

I don't fanboy any model. I just try to learn what the best tools are for each case

flat oracle
#

sd3.5 M works pretty good for me. However, CLIP Text encode node (positive as well as negative) takes extremely long to load, any ideas why?

errant dust
flat oracle
errant dust
#

I use the T5 Q8_0. Works fine for me. Though YMMV

untold valley
#

ah damn, do we also now have different T5 models we need to deal with?

errant dust
# untold valley looking for local

Flux has a big edge RN in that it has been out for 3 full months now. So TONS of strong LoRas. SD 3.5 hasn't had this level of development yet, so give it time

errant dust
#

as it is loaded in blocks

#

and not as a whole

noble coyote
untold valley
flat oracle
#

i mean the ksampling is pretty fast. no problems at all, just with text encode

errant dust
errant dust
untold valley
#

you can theoretically have an endless stream of loras, but I do not like to stay with a single acute style. It's easier when a model inherently knows what it needs to, and it makes the creative image-generating workflow more malleable. Essentially LoRAs are an overfitting run on top: sure, an easy fast way of getting what you want if you want specifics, but as a whole a model that understands you by itself is better.

errant dust
#

Reminds me of an argument I had with a strong chess master who offered to mentor me to mastery myself. He set out this study plan that involved self studying matches as far back as the late 1800s so I could 'absorb' the evolution of the game and blablabla. I told him I would prefer a more focused and pragmatic approach that maximized results (so long as they did not impair evolution). We soon parted ways. I don't have infinite time, nor patience.

#

How is citing endless culture and names going to matter if I lose because I simply played worse? Same for AIs. Talking about why one way sounds nicer on paper is unimportant to me. If I get the results I seek, all the technical perfections of another setup or method won't matter one whit to my eyes.

untold valley
#

i am asking the wrong person then. was looking for more of a power user, jack of all trades approach.

errant dust
#

I wasn't aware you were asking anything at all.

#

You asked what is best. I told you in results. You then railed about why models and Loras should be splitting the output development differently, which is semantics. But there was no question involved.

#

Is it the purity of some technical aspect that matters to you? Or better images that fulfill your requests?

#

I made clear where I position myself

#

Devolving into personal attacks? Is it that hard to just have a civil discussion?

viral plaza
viral plaza
untold valley
viral plaza
errant dust
#

I used SD 3.5 L to show that it wasn't a general failing by SD 3.5.

#

That said, here is the type of image that Flux can only dream about doing, but that SD 3.5 can do:

pseudo owl
errant dust
#

in the image

#

all settings and meta are always in images I post here

#

an Impressionist Cartoon of a tree covered in whimsical cats on the base and branches all drawn in a variety of colors and facial expressions in the style of Andy Kehoe and Skottie Young. The outlines in the layered 2d art are strong and reminiscent of Keith Haring.

dry wave
#

from what I heard flux is also easier to train, despite being a distilled model

#

guess the way to go for now is using sd3 for creative arts and flux for more realistic/clean images

pseudo owl
dry wave
#

who needs full fine-tuning

errant dust
#

huge file even GGUFed. (18+ GB)

viral plaza
#

what

errant dust
#

I meant F32, sorry

viral plaza
#

what about it

pseudo owl
errant dust
#

You said the T5 for SD3 is weak. I asked whether you expected tangible benefits using the F32 version

viral plaza
viral plaza
#

the bitwidth doesn't enhance anything, it's just narrow refinements to precision - fp32 requires twice the memory of fp16 and microscopically more precise data, it's incredibly pointless to use fp32 for anything outside of training

#

even fp16 barely provides more precision data than fp8 does

errant dust
#

I use Q8_0 myself

#

a cross between them, though very very close to fp16
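
A rough back-of-envelope for the sizes being discussed, assuming the usual GGUF block layouts (Q8_0 stores 32 weights in 34 bytes, Q5_0 in 22, Q4_0 in 18) and a hypothetical parameter count of about 4.7 billion for the T5-XXL encoder. Real files differ somewhat because some tensors are kept at higher precision:

```python
# Approximate bits per weight for each storage format.
BITS_PER_WEIGHT = {
    "F32": 32.0,
    "F16": 16.0,
    "Q8_0": 34 * 8 / 32,  # 8.5 bits: 32 int8 values + one fp16 scale
    "Q5_0": 22 * 8 / 32,  # 5.5 bits
    "Q4_0": 18 * 8 / 32,  # 4.5 bits
}


def size_gb(n_params: float, fmt: str) -> float:
    """Approximate file size in GB (10**9 bytes) for a given format."""
    return n_params * BITS_PER_WEIGHT[fmt] / 8 / 1e9


T5_XXL_PARAMS = 4.7e9  # assumption for illustration, not a measured value

for fmt in BITS_PER_WEIGHT:
    print(f"{fmt:5s} {size_gb(T5_XXL_PARAMS, fmt):6.2f} GB")
```

Under those assumptions F32 lands around 18.8 GB, in line with the "18+ GB" figure above, while Q8_0 is roughly half the size of F16.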

viral plaza
errant dust
#

That's quite interesting. I assume this was understood before SD 3.5 went into training. Any reason why this path was not chosen?

signal shuttle
#

(SD3.5L + SD3.5M)

errant dust
#

Hmmmm! Playing around with the samplers with SD 3.5L dpmpp_2s_ancestral just gave me my best result by far in my Keith Haring cats in trees prompt. So far my request for his style and strong simplistic art and outlines was ignored, though it did adhere to my desire for rich multicolored whimsical cats. To be explored:

craggy crest
craggy crest
untold valley
craggy crest
# untold valley we will do this another time, in the mean time have you found SLG workflow usefu...

i'm currently finishing the sampler/scheduler compare sheet. however i did work with it some yesterday, and i played around with the same idea when matteo released his block skip node for flux a couple months back. it's very useful, but you need to be extremely careful with the values. one interesting thing is that by enabling skip, you may get better text - but you will also lose fine details. image with skip turned on, on the left. image with it off, on the right

#

it's implemented in 3.5 medium as a way to tweak things like hands and feet if needed, or other things that are warping that you don't want to warp.

#

it's not intended to be a 'turn it on and use it on everything'

untold valley
#

they are completely different pictures all together, guess there is no free lunch. results for skip for text based on that sample are crazy though.

craggy crest
#

skipping layer 2 only - skip on the left, without skip on the right

#

you can see that the fine details are deleted

untold valley
#

wow yes, right picture looks really nice with the splattering of paint

mortal mesa
#

i dont think its barely been touched yet, people plug it in and use the defaults in it, can have lots of effects but i sure as hell aint gonna figure it out

craggy crest
craggy crest
untold valley
craggy crest
untold valley
#

thank you for the resources, let me dig into and familiarize myself with it first.

noble coyote
sacred jewel
#

Nihilism LoRA

untold valley
#

@craggy crest top is SLG, lord have mercy, really hard to find a sweet spot. For start percentage, the higher it goes the more distorted, needs a low initial value.

noble coyote
craggy crest
viral plaza
noble coyote
craggy crest
hallow lion
#

Cats with 4GB VRAM (send help)

craggy crest
#

@bitter hearth

craggy crest
#

@dusky thistle one of these is dpm_2/linear_quadratic, the other is dpm_2/beta - 3 guesses which is which

sacred jewel
mortal mesa
#

Beta/Linear_quadratic, worked very nice on this one, my opinion

craggy crest
mortal mesa
bitter hearth
#

linear_quadratic is a bit of a spicy choice
sprints through low sigmas at lightspeed

mortal mesa
#
sprints through low sigmas at lightspeed
craggy crest
#

@bitter hearth

mortal mesa
#

i dont really know what im doing but i got detail daemon in there too boosting early if that even makes sense

bitter hearth
#

detail daemon will help with linear_quadratic
because it will offset some of the detail loss

#

so yeah that makes sense 🙂

craggy crest
#

these are all dpm_fast. one is exponential, one is karras, one is normal

#

can you guess which is which?

bitter hearth
#

never actually checked out what exponential graph looks like TBH

#

I think the third one is normal

#

hmm not sure

gritty steeple
craggy crest
real terrace
#

I still can't fully understand what to put in each clip; sometimes I feel like I have it clear, but then I don't.

#

If I have to repeat or reinforce the idea in all

#

I'm using this GGUF model, but it takes 4:40 minutes, while

#

this large model I think it took 4 minutes only

signal shuttle
gritty steeple
real terrace
pseudo owl
#

The red panda model which was #1 on the text-to-image leaderboard (beat flux 1.1 pro, flux.1 pro, dev, schnell, sd3.5 large and turbo): https://x.com/recraftai/status/1851706399631224939

It has very bad prompt following(maybe only in this prompt idk), but great realism.
prompt: a high quality photograph of a white cat sitting on top of a blue dog on a brown couch in a living room. Behind them, is a window, with 4 cow pictures, one in each corner. Outside the window is outer space and a ufo.

Flux.1 schnell 8-step on left, Recraft v3(red panda) on the right

dull star
#

indeed its finally revealed on the image leaderboard too

#

idk man it looks average

#

especially if its closed source

lucid swift
pseudo owl
lucid swift
dull star
#

I guess it does do "crappy style" photos out of the box, but that's required for it to be used as a service with no lora options

#

A close-up, realistic portrait of an elderly man dressed as a military soldier. He has deep wrinkles, white stubble, and a stern, weathered expression, symbolizing years of service. The uniform is slightly worn, with medals and insignia visible, and a green camo pattern typical of an army soldier. His eyes reflect resilience and experience, capturing the weight of his journey. The background is blurred, focusing solely on his face and upper uniform, creating a dramatic, respectful portrayal.

#

the texture is nice

lucid swift
dull star
#

I don't want to login

#

idc that it's free

#

wake me up when its open weight

pseudo owl
# lucid swift recraft

Yeah its nice at realism, but as I said above, prompt following isn't great. Open source models can do similar gen's too in terms of realism and have much better prompt following.

dull star
#

and its gonna be paid sooner or later

lucid swift
dull star
#

how much per image/credit

lucid swift
#

but u get like 50 free images

#

and then 5 free every day

#

but after that 10 or 20 dollars a month

#

if you want more

#

you can also upload images and make your own style

pallid ginkgo
#

kinda like those ai video websites

lucid swift
#

yes

pseudo owl
# lucid swift bs its very good

Flux schnell (considerably worse than flux dev and sd3.5 large) on the left vs Recraft v3 on the right,

prompt: A blue block on top of a red block. Next to the red block is a green block with a candle on top of the green block.

lucid swift
#

i kinda like this image

#

reminds me of something idk

#

but it less often has this fake real look with that smooth skin

dull star
#

aesthetics are good though

dull star
#

like there you could load an image and it would make an image in that style

#

granted, it wasn't perfect, but it was good enough

pseudo owl
# lucid swift it seems like its worse at promt understanding

Yeah its considerably worse than the new open source models on prompt following (flux dev, schnell, sd3.5 large, turbo) imo.

But yes its for sure more realistic than out of the box models, but there are many ways to make models more "realistic".

craggy crest
# real terrace I still can't fully understand what to put in each clip; sometimes I feel like I...

t5xxl: give it natural language that is rich in details and adjectives. clip_g: just give it the basics of the image. clip_l: give it all the fine details and artsy stuff. example: t5xxl: closeup on a vase with roses in it, dew sparkles on the petals. the light shines in from the side at a slant casting shadows across the scene. clip_g: roses in a vase near a window. closeup. dynamic light and shadows. clip_l: sparkling bits of light bouncing off dew on the rose petals. soft, bokeh background.
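
The same split written as a small data structure, in case it helps to see the three slots side by side. `TriplePrompt` and the fallback behaviour (reuse the T5 text when a CLIP slot is left empty) are assumptions made for this sketch, not something any UI is confirmed to do:

```python
from dataclasses import dataclass


@dataclass
class TriplePrompt:
    """Hypothetical container for the three SD3.5 text-encoder prompts."""
    t5xxl: str        # rich natural-language description
    clip_g: str = ""  # basics of the image
    clip_l: str = ""  # fine details and style

    def as_encoder_inputs(self) -> dict[str, str]:
        # Assumption: an empty CLIP slot falls back to the T5 text,
        # i.e. "repeat the same idea everywhere" is the default.
        return {
            "t5xxl": self.t5xxl,
            "clip_g": self.clip_g or self.t5xxl,
            "clip_l": self.clip_l or self.t5xxl,
        }


roses = TriplePrompt(
    t5xxl=("closeup on a vase with roses in it, dew sparkles on the petals. "
           "the light shines in from the side at a slant casting shadows "
           "across the scene."),
    clip_g="roses in a vase near a window. closeup. dynamic light and shadows.",
    clip_l=("sparkling bits of light bouncing off dew on the rose petals. "
            "soft, bokeh background."),
)
print(roses.as_encoder_inputs()["clip_g"])
```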

bitter hearth
# lucid swift recraft

the model does look good here
during the testing period on Artificial Analysis I mostly voted against it though

#

my preferences seemed to be for Flux Pro 1.1, Ideogram V2 and Midjourney V6

#

however these ELO tests are unfair to the open source models

#

because they get fine tuning and Comfy workflows

#

but ELO test doesn't reflect that

lucid swift
bitter hearth
#

its close yes, at the top of the leaderboard

#

I did come away with the impression that the gaps between models are very small

lucid swift
#

at some point you cant make it more realistic

bitter hearth
#

there are upscaled SD 1.5 images that look like photos even

#

the difference between workflows is like 100x larger than the difference between models

lucid swift
#

but these are very limited finetunes

bitter hearth
#

yes but I don't see the downside in checkpoints specialising

#

you could train a small router model to route prompts to appropriate checkpoints for example

#

I guess storage, and loading/unloading to VRAM is the disadvantage there
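
A toy version of the routing idea, with keyword matching standing in for the small trained router model. All checkpoint names and keyword sets here are made up for illustration:

```python
import re

# Hypothetical specialised checkpoints and the words that suggest them.
CHECKPOINT_KEYWORDS = {
    "anime_checkpoint": {"anime", "manga", "chibi"},
    "photo_checkpoint": {"photo", "photograph", "portrait", "realistic"},
    "comic_checkpoint": {"comic", "cartoon", "outlines"},
}
DEFAULT_CHECKPOINT = "base_checkpoint"


def route(prompt: str) -> str:
    """Pick the checkpoint whose keyword set overlaps the prompt most."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    scores = {name: len(words & kws) for name, kws in CHECKPOINT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    # No keyword matched at all: fall back to the general-purpose model.
    return best if scores[best] > 0 else DEFAULT_CHECKPOINT


print(route("a photo of a cat"))
print(route("a comic with strong outlines of a tree"))
```

A real router would also have to weigh the load/unload cost mentioned above, for example by preferring whichever checkpoint is already in VRAM when scores are close.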

craggy crest
#

SD3.5 medium. Prompt: fantasy painting of a handsome lion Knight,long wavy hair, slight smile, piercing green eyes; emerald, symmetrical,intricate engraved armor; hyperdetailed. the words "Kings wear Crowns"

craggy crest
bitter hearth
#

Civit does feel like that

#

interesting that even Midjourney chose to have their anime checkpoint separate though

low stone
sacred jewel
sullen moss
#

Recraft vs Ideogram

#

To be honest, I still don't understand why this model is in first place...

gritty steeple
pseudo owl
low stone
craggy crest
bitter hearth
#

its a big one, going by API pricing and timings

winged seal
#

@craggy crest Wanted to share that I have taken a bit of a "break" after all from bigger overarching training, in favor of my first concept training on Flux Lite, which seems to be working extremely well for my first attempt, so I am happy about that haha

bitter hearth
#

the aesthetic fine tune is a bit off, especially small details

#

its got signs of a strong model though in composition and blur

craggy crest
winged seal
#

my dataset is pretty small and less than ideal, so I will be trying to get good enough results to supplementally train off of for better feature reach

winged seal
winged seal
# craggy crest prompt?

striped dappled light on the face of a young black boy. The light on his face is in striped and he is wearing a gray tank top with very short black hair, dark background

bitter hearth
#

right hand side definitely better in both cases, nice job

winged seal
#

thanks <3

winged seal
#

mixed in a smidge of my realism LoRA's to make it look a touch better as well haha

craggy crest
#

just your prompt

winged seal
craggy crest
# winged seal hoenstly, not bad

modified your prompt to: HDR photography: striped dappled light on the face of a young black boy. The light on his face is in striped and he is wearing a gray tank top with very short black hair, dark background

#

sd3.5 large, no lora, just the prompt

sacred jewel
winged seal
craggy crest
craggy crest
winged seal
#

ok, this looks ASTRONOMICALLY better haha

craggy crest
winged seal
# craggy crest and no need to sit there and battle fo rhours and hours and hours with a base mo...

Oh yeah, I talked to two people with doctorates in this scene, and they both agreed that flux is not "frozen" and that it was a completely false claim made by some dude on reddit with no real truth behind it

They spoke way higher level about it with me, much higher level than I understood, that's for sure, but yeah, two different people were 100% sure that "frozen" is a BS claim from people who don't understand flux or distillation. One said friend was the creator of Libre Flux and the writer of the paper used to prune flux down from 12B to 8B, so I would wager he knows what he's talking about 😅

winged seal
craggy crest
winged seal
#

casually doesn't read the part where they wrote an entire paper about flux and pruning/de-compressing it

craggy crest
winged seal
#

ok man, I will continue to believe people who are much more qualified 😅

it doesn't matter anyways, flux trains great regardless of what some people try to say about it 😅

craggy crest
winged seal
#

do you have a doctorate in machine learning?

craggy crest
winged seal
#

actually no, I am not gonna get into this, it literally does not matter in the grand scheme of things

#

anyways

craggy crest
winged seal
#

how are things going with medium? I heard its pretty good

#

better or worse than large?

craggy crest
winged seal
#

No, I haven't

#

I am waiting for better training tools

craggy crest
#

ah. well - get a few moments to go play with it and see what you think - try using it as a refiner for your images generated with flux

pseudo owl
winged seal
#

jesus, worse in anatomy 😭

#

I guess it is more of a refiner after all, so oh well there

craggy crest
mortal mesa
#

how would you do that

craggy crest
winged seal
#

Yeah, that seemed to be more of the direction people were leaning in for a while

craggy crest
mortal mesa
#

oh you dont have any input

winged seal
#

I'd be down to use it on my flux lite gens

craggy crest
pseudo owl
#

Yeah that can help a bit, it’s far better than sd3 medium in anatomy but around base sdxl level in anatomy (worse without slg).

winged seal
#

jesus thats rough

#

base SDXL was a bit of a nightmare haha

craggy crest
#

there are 3 workflows in the huggingface repo for medium: SD3.5L_plus_SD3.5M_upscaling_example_workflow
SD3.5M_SLG_example_workflow
SD3.5M_example_workflow

#

but update your comfyUI before loading any of them so you get the new node and scheduler

winged seal
#

I should know... I have trained over 1000 LoRAs on base and was part of the beta program for it where I gave feedback on it before public release

pseudo owl
#

I would kinda expect it since it’s smaller than sdxl. But it’s workable at least, not as bad as sd3 medium.

winged seal
#

I mean, its basically the same size as SDXL, and its over a year newer with new tech, much more training, and a WAY more mature industry

mortal mesa
#

just looking for info

winged seal
#

I still think that medium will be far more viable than large, so I will have my eyes on that if anything

craggy crest
sacred jewel
#

@craggy crest better? 😛 😛 😛

craggy crest
sacred jewel
craggy crest
#

zombie ironman?

sullen moss
pseudo owl
bitter hearth
#

its important to have it off for the final smaller sigmas

#

cos it will reduce fine detail, just like PAG does

#

you also probably want it off for the first 30% of sigmas or so, because it could reduce image diversity
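
That on/off window can be sketched as a simple check on sampling progress. The 30% start comes from the message above; the 80% end and the function name are placeholder assumptions, and other people in this channel report better results with much earlier windows, so treat both numbers as things to tune:

```python
def slg_enabled(step: int, total_steps: int,
                start: float = 0.3, end: float = 0.8) -> bool:
    """Return True if skip-layer guidance should apply on this step.

    Off at the start (to keep image diversity) and off at the end
    (the final small sigmas, to keep fine detail).
    """
    progress = step / total_steps
    return start <= progress < end


# Which of 20 steps would have the guidance applied?
print([step for step in range(20) if slg_enabled(step, 20)])
```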

craggy crest
#

i think he really would like to be able to use it

pseudo owl
#

I don’t think this is Albert Einstein lol, allegro 2.8b
prompt: Albert Einstein walking around in a futuristic world, far away

winged seal
bitter hearth
#

keeps it away from the spicy areas

bitter hearth
real terrace
sullen moss
real terrace
sullen moss
sullen moss
kind flume
#

help

sullen moss
pseudo owl
sacred jewel
sullen moss
real terrace
pseudo owl
sacred geode
real terrace
sacred geode
#

@craggy crest Is this correct "frozen just means no gradients apply to it"?

craggy crest
real terrace
sacred geode
#

I was asking in general from a convo I just had. To me I have had so many failures on Flux then I take the same dataset and throw it at SD3.5 and BANG, success after success. I mentioned that SAI said theirs were open (unfrozen) weights and was told they will say anything to stay relevant. Don't know just know I have had no real success with Flux locally while I have with 3.5. If the malformed appendages could be fixed then it will be a dream to work in for me.

real terrace
bitter hearth
# sacred geode I was asking in general from a convo I just had. To me I have had so many failu...

there has been a couple of pieces of news about Flux that are important regarding training
the simpletuner dev managed to train flux for over 2500 H100 GPU hours without it collapsing- this is $7,500 worth of training, more than needed for the vast majority of checkpoints
secondly the realvis dev managed to train a new checkpoint on flux de-distilled and boost the aesthetics a lot
so progress can be made its just tricky

real terrace
sacred geode
dusky thistle
sacred geode
craggy crest
#

to do anything at all to flux, you basically have to break it and then assemble the pieces into something that's not flux

sacred geode
#

I agree, and thank you for the confirmation.

craggy crest
sacred geode
#

BINGO

craggy crest
#

and a very specific reason why you can train SD3.5 almost instantly, and it is such a massive fight to do anything to flux

sacred jewel
sacred geode
#

Local it sure is for everyone I know who trains

craggy crest
#

we're being 'watched'

sacred jewel
craggy crest
craggy crest
craggy crest
#

trainers came out, there was a lora the DAY it released and the guy that made it didn't have early access, etc

#

it's almost effortless to work with

sacred geode
#

Flux is sledgehammer training then you finally manage it and concept bleed ruins it. SD3.5 is a sponge.

craggy crest
#

flux is an uphill battle unless you stick within its very tight range.

#

it's a tool. a very good tool, with a very specific purpose. use it for that and don't try to make it be something it's not

sacred geode
#

I wish we had a save to lora node, or save to checkpoint then extract a lora from it, for 3.5.

craggy crest
sacred geode
#

I'm not stupid I meant you have 3.5, you have your lora. you adjust the lora. perfect. time to save it out. YES, there is a save lora node in comfy just not for 3.5

#

the old way was save it to checkpoint and extract

craggy crest
#

just set the lora up, and save it as a template

sacred geode
#

of course, but this is for release. once we get the tools we had in XL then I can rock the training for 3.5

#

they should come fast if the community can leave flux for 3.5

dusky thistle
#

lakers win the world series!!

craggy crest
dusky thistle