#🆕|sd3

bitter hearth
#

from what i understood, to calculate the trade-offs you'd have to factor in the model and the workflow; that's what the results depend on, and that's where things will vary

errant dust
silver sluice
#

Wow how is it that q3 beat q8 in that benchmark? Thats funny I use the 3.2 3b model too and I’ve been wanting to make it more performant, I didn’t consider gguf’ing it

errant dust
#

It isn't Q3 per se

silver sluice
#

Yeah it has all those extra letters afterwards, must be a special q3

real terrace
#

so instead of the (vanilla?) SD35large or large turbo, which other models may be worth trying, in terms of quality/speed/performance?

errant dust
#

Suffice it to say, I use Q8 as my go to for these all. I personally want to sacrifice as little quality as possible and reap the most benefits from these big models.

silver sluice
#

I use the 3b model for cross-base-model remixing: I’ll take a pony prompt and have the model generate the t5 xxl and clip g text for image gen, then have it pick the most appropriate LoRAs by category. Anyways, all that takes about 30-40 seconds. I wonder if I can speed it up with a little q8 (or q3) tech. I've only used gguf in comfy, so I’m not sure how difficult it’s gonna be to use those with torch and the transformers library

#

@bitter hearth so I copied and pasted your text from your LLM and gave it to o1-mini and added this

Given all that text, if you had to quantify, try to quantify me a potential error rate, even if it’s just a ballpark, I understand it says slightly, how slightly? Maybe a range? Factor in the models and workload?

bitter hearth
#

im still debating myself on using q4 vs q8

silver sluice
#

Yeah 0.1 to 2% for most tasks

bitter hearth
silver sluice
#

I have 8gb and I only use q8, the time it takes isn’t that important for me

bitter hearth
#

you won't get a gross drop in image quality

silver sluice
#

Has anyone done a deep dive comparison like side by side image quality?

bitter hearth
silver sluice
#

dang 10 to 20% that’s too much hard pass for me

#

Q2 flux dev looks a lot more like flux-s

bitter hearth
silver sluice
#

ComfyUi has a really nice low vram mode so it’s not like we have to worry about OOM errors

bitter hearth
silver sluice
#

I should adopt the gguf t5 model bc the q8 model is better than fp8, right? So they take an fp16 model and quantize it down to q8; that should be better than the normal fp8 and faster too, right?
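
For context on the q8 vs fp8 question, here is a minimal sketch of how a GGUF `Q8_0` tensor is laid out, as far as the format is commonly described: weights are grouped into blocks of 32, and each block stores one fp16 scale plus 32 signed 8-bit integers (34 bytes per 32 weights, so 8.5 bits per weight). The function names are made up for illustration, and this is plain Python rather than anything ComfyUI or llama.cpp actually runs. It says nothing about speed, only about the rounding error you get from one round trip.

```python
import math
import struct

BLOCK = 32  # Q8_0 groups weights into blocks of 32 values


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def quantize_q8_0(weights):
    """Return a list of (scale, int8_values) blocks (illustrative helper)."""
    blocks = []
    for i in range(0, len(weights), BLOCK):
        chunk = weights[i:i + BLOCK]
        amax = max(abs(w) for w in chunk)
        # One scale per block, stored as fp16 in the real format.
        scale = to_fp16(amax / 127.0) if amax else 0.0
        if scale:
            q = [max(-127, min(127, round(w / scale))) for w in chunk]
        else:
            q = [0] * len(chunk)
        blocks.append((scale, q))
    return blocks


def dequantize_q8_0(blocks):
    """Rebuild approximate float weights from the blocks."""
    return [scale * v for scale, q in blocks for v in q]


if __name__ == "__main__":
    w = [math.sin(i) * 0.1 for i in range(64)]  # toy "weights"
    restored = dequantize_q8_0(quantize_q8_0(w))
    worst = max(abs(a - b) for a, b in zip(w, restored))
    print(f"worst absolute round-trip error: {worst:.6f}")
```

The per-block scale is the main design point: a block of small weights gets a small scale, so its rounding error stays small relative to those weights. Whether that ends up better than a given fp8 checkpoint in practice still depends on the model and workflow, as discussed above.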

silver sluice
#

Thanks for the link

#

@bitter hearth you saw my comfy screenshot of how I setup the nodes for the clip loader?

silver sluice
#

By the way your triple clip loader could still use one more optimization, you can swap out that clip L fine tune for the long clip model by the same guy and it’ll work

bitter hearth
#

im not using clip l

silver sluice
#

Yeah you are, the second one called Vit14

#

vit l = clip L

bitter hearth
#

vit is improved which is why i replaced the clip l with it

silver sluice
#

Just saying you can swap out that entry by the long clip model from the same guy and it’s even better and more optimized

bitter hearth
#

long clip model?

silver sluice
#

Yeah so like if vit let’s say is 20% better than clip l, long clip is 20% better than vit

#

Same guy, just go to zero points hugging face it’s his only other project

#

It’s really fascinating stuff. He didn’t build it, some Chinese crew did, but they managed to extend the context width from 77 tokens to 248

silver sluice
#

Clip L actually has a 20 token effective length

silver sluice
#

It’s a drop in replacement for sd3, for sdxl and flux you need a special node to make it work

#

You’re going from 20/77 token width to 77/248 that’s pretty huge and they quantify the improvements it’s like double digits percent gains

bitter hearth
silver sluice
#

Yeah you gotta install some special custom nodes for flux to get it to work, it’s all linked in there that’s how I found it, they even cite the workflow

#

You just add the special node after the triple loader and pick your long clip file in there so it ignores whatever L model you picked in the triple model

#

Sea art something

bitter hearth
#

tested it on my end w/o any extra nodes with flux... works fine

hallow lion
#

So when will SD35 be GGUF'd?

#

Also rename this channel folks. Put the past behind you. the future is bright.

#

SD3 = SD35

hallow lion
#

dang

bitter hearth
silver sluice
silver sluice
bitter hearth
#

also thanks for mentioning the long clip 🙂

silver sluice
#

i gave you one tip to optimize and you gave me one tip to optimize so i'd say we're even 🤝

bitter hearth
#

sure lol

turbid grotto
#

will we get tensorrt for sd35? Turbo generates in only 9s on rtx3060! Trt can bring it down to 6s

hallow lion
#

Anything under a minute blows my mind

#

C O N S U M ER H A R D W A R E

errant dust
#

If I voluntarily toss out 20 Elo on that chart, then I have pretty much given up the advantage the ultra large model offers

#

Ex: Now it is as fast as the Medium model. Sure, and not much better either

bitter hearth
hallow lion
#

these charts should be taken with a boulder of salt.

sage burrow
errant dust
hallow lion
#

Nevertheless AI image/video/sound gen is moving ahead with major strides

errant dust
#

at what point are you just moving down to a whole different model?

#

and just kidding yourself

#

It is why I stick to Q8. I cannot even run plain fp16 in case you wonder, so this is literally as good as I can get

hallow lion
turbid grotto
errant dust
errant dust
#

4060

hallow lion
#

wow

errant dust
#

4060 laptop to be precise

hallow lion
#

I can run them all on 3060 12 gb ram on PC

sage burrow
errant dust
#

well laptop has 8GB Vram

#

not 12

hallow lion
#

sd35 takes about 3 minutes lol turbo sd35 20 seconds

errant dust
#

and maybe Medium will do 20 seconds too. and per their chart is about equal in quality

#

shrug

silver sluice
hallow lion
#

So basically long story short it will be a while before we can run Mochi locally XD

hallow lion
errant dust
#

That's kind of my point. I'm not dictating what others should want or do. I am simply sharing my view and what I want. The text production and adherence in Flux and SD3.5 are huge huge upgrades

turbid grotto
silver sluice
#

that's pretty good 20 seconds with 12gb vram thx for the info

sage burrow
#

So is anything online out yet to create 3.5 loras? 😉

errant dust
#

Yes, I think. I saw on Civitai

bitter hearth
turbid grotto
hallow lion
hallow lion
#

19.6 seconds to be exact

errant dust
#

Well, come Black Friday I will be upgrading this laptop, so no big deal

#

likely 4080 or 4090 laptop

bitter hearth
hallow lion
#

Good for u lol

turbid grotto
hallow lion
#

Maybe its coz I was streaming with OBS and doing other things

hallow lion
errant dust
turbid grotto
#

with clips only encoding is near instant btw

#

could be great for upscaling to not waste memory and time

errant dust
#

with plain fp16 T5 there is a significant wait while it processes a new prompt. With the Q8 version the time gain is huge

hallow lion
#

I have workflows that take like 15 minutes... first gen with sdxl then background refine with sd3 then upscale with sd15 and IC lighting with sd15 then further upscale with sdxl - lmao efficiency

#

XD

errant dust
turbid grotto
errant dust
#

In fact, if you really want to be adventurous, GGUF has an fp32 of the T5

turbid grotto
hallow lion
silver sluice
errant dust
turbid grotto
turbid grotto
bitter hearth
hallow lion
#

I must be messing up then!

bitter hearth
#

and i mean the turbo model

#

which requires 4 steps

silver sluice
# hallow lion X wings!

unlike most flux models this one lets you adjust the cfg in the ksampler between 1 and 10, whereas with most models you have a range of 1 to 1.8. it also requires 60 steps minimum, much slower than any other flux model, but i honestly believe this is the closest you can get to flux pro level quality

hallow lion
silver sluice
#

@bitter hearth i can confirm that this works using my flux workflow, i swapped out the old vit14-L for long clip and t5 fp8 for the t5 q8 gguf and it worked without any special nodes now

silver sluice
#

im averaging 700 seconds per image with that flux model, that's like 11 minutes for one image lol, 8gb vram, using the q8 version

turbid grotto
bitter hearth
silver sluice
#

i love the animation aesthetics of stable diffusion models tho, something about flux it just has this overtrained weirdness when it comes to doing anime, comics, cartoon, illustrations in general like SAI's training data has some magic sauce that just makes for nicer looking images when it comes to non-realistic images

turbid grotto
silver sluice
#

Flux-dev-de-distill
This is an experiment to de-distill guidance from flux.1-dev. We removed the original distilled guidance and make true classifier-free guidance reworks.

Model Details
Following Algorithm 1 in On Distillation of Guided Diffusion Models, we attempted to reverse the distillation process by re-matching guidance scale w. we introduce a student model x(zt) to match the output of the teacher at any time-step t ∈ [0, 1] and any guidance scale w ∈ [1, 4]. We initialize the student model with parameters from the teacher model except for the parameters related to w-embedding.

Since this model uses true CFG instead of distilled CFG, it is not compatible with diffusers pipeline. Please use inference script or manually add guidance in the iteration loop.

Train: 150K Unsplash images, 1024px square, 6k steps with global batch size 32, frozen teacher model, approx 12 hours due to limited compute.

Yeah the last line seems to imply he did perform training to it, the whole text is just way over my head tho
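
For anyone else finding the model card dense: "manually add guidance in the iteration loop" most likely means standard classifier-free guidance, where each step runs the model twice (with the prompt and with an empty or negative prompt) and mixes the two predictions. Below is a rough sketch under that assumption. `denoise` is a hypothetical callable standing in for the model, latents are plain lists, and the Euler update assumes a velocity-style prediction; none of this is taken from the actual inference script.

```python
def cfg_combine(cond, uncond, w):
    """True CFG: start at the unconditional prediction and move w times
    along the direction towards the conditional one."""
    return [u + w * (c - u) for c, u in zip(cond, uncond)]


def sample(denoise, latent, timesteps, w, prompt, negative=""):
    """Toy Euler sampling loop with guidance applied at every step.

    `denoise(latent, t, text)` is a stand-in for the real model call.
    """
    for t, t_next in zip(timesteps[:-1], timesteps[1:]):
        cond = denoise(latent, t, prompt)      # pass 1: with the prompt
        uncond = denoise(latent, t, negative)  # pass 2: without it
        pred = cfg_combine(cond, uncond, w)
        latent = [x + (t_next - t) * v for x, v in zip(latent, pred)]
    return latent
```

With `w = 1` the result is just the conditional prediction. The two model calls per step are a big part of why de-distilled models run so much slower than the distilled flux-dev.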

turbid grotto
#

even schnell which now is 50steps and has cfg))

sage burrow
#

civitai downloads stopped 😦

turbid grotto
#

why

craggy crest
sage burrow
craggy crest
sage burrow
craggy crest
craggy crest
silver sluice
turbid grotto
silver sluice
craggy crest
craggy crest
#

i know what he's doing - what he's not getting is flux undistilled. what he is getting is his mashup version

turbid grotto
craggy crest
#

and, quite frankly, he'd have been better off just using SDXL than what he's trying to do

turbid grotto
#

hope people will switch these resources to sd35l

#

rundiffusion said a while ago that they were training flux but nothing yet, maybe it didn't work and they're planning to train sd35l now, wish them luck

craggy crest
#

(instead of spinning their wheels trying to 'unbreak' it)

craggy crest
silver sluice
#

as someone that has generated 800+ images with the de-distilled model i can personally attest to its superiority over other models, here's a screenshot of how many images i've generated per flux model

#

the leading number is # of images made, the x cross is just a bug bc it thinks the files don't exist since they're in the unet folder not safetensors

turbid grotto
turbid grotto
#

will probably take time to test different trainings

sage burrow
#

has anyone merged 3.5 with flux dev yet? How did it go?

turbid grotto
#

yes it is too different

silver sluice
turbid grotto
#

you can't

sage burrow
craggy crest
craggy crest
#

short answer - you can't. longer answer - I'll DM you

turbid grotto
#

should be fast

craggy crest
sage burrow
#

So it seems that 3.5 had a larger training dataset than flux did.

silver sluice
#

i for sure like the illustration output of 3.5 better than flux's but i think flux is better than sd3 when it comes to photorealism

turbid grotto
#

it will be 2.5b, so probably made from scratch

#

will be funny if it appears to have better anatomy than large

bitter hearth
turbid grotto
bitter hearth
#

another sd3.5 render, how would flux fare?

turbid grotto
silver sluice
bitter hearth
#

not quite the same feel

silver sluice
bitter hearth
#

the last one is

silver sluice
turbid grotto
craggy crest
bitter hearth
turbid grotto
oblique parcel
#

Grip enhancing gems

hallow lion
#

Pimp diddy controller

craggy crest
oblique parcel
#

If there’s a joke
im not getting it

hallow lion
#

lmao

turbid grotto
#

just 4 steps ||(don't mind outfit, I am testing how much details can clips only handle)||

#

q4 btw

turbid grotto
bitter hearth
#

and here i thought you wanted q4 over q8

turbid grotto
bitter hearth
turbid grotto
bitter hearth
#

another

craggy crest
brittle nexus
#

Any explanation for Q8_0 being faster than Q5_1?

turbid grotto
turbid grotto
bitter hearth
turbid grotto
turbid grotto
bitter hearth
#

Flux schnell a royal setting, princess Jasmine in her castle. cinematic.

old mango
sage burrow
craggy crest
sage burrow
#

much better lettering 🙂

bitter hearth
#

this is what im getting on sd3.5 turbo

sage burrow
#

3.5 is a perfect and complete model. It knows what spy vs spy is! 😄

bitter hearth
turbid grotto
#

I don't like the idea of refining with sdxl anymore, it removes all the prettiness of sd35l

sage burrow
bitter hearth
#

i stopped using sdxl

sage burrow
#

ohhh, glif has 3.5 already! 😄 😄 😄

dusky thistle
mortal mesa
dusky thistle
#

SD3.5L has a real sense of style

bitter hearth
dusky thistle
craggy crest
sage burrow
craggy crest
craggy crest
dusky thistle
sage burrow
low stone
sage burrow
bitter hearth
#

I think my best images out of any so far came from refining Flux with Realvis

sage burrow
#

double sided katana!

bitter hearth
#

darth maul vibes

craggy crest
sage burrow
#

that team rolled a 1...

bitter hearth
#

this was Flux into Realvis
you keep the vibrance of Flux but Realvis tones down the Flux look a lot

sage burrow
dusky thistle
#

basic bueeehhhhler sampler

#

vs 3rd order RES-SDE with hard noise scaling and CFG++

bitter hearth
#

I don't think its possible to get people to stop using euler sadly 🤣

#

even when their image quality is maybe 1% of an appropriate sampler

sage burrow
#

I think the AI has been partaking this evening

hallow lion
bitter hearth
#

you know your sampler is old when the developer looked like this

sage burrow
dusky thistle
#

with special control over the variance of the noise

#

first was cfg++, second cfg

bitter hearth
#

snow is nice in the cfg++ one

dusky thistle
#

yeah, cfg++ is generally leading to less blown out areas, smudging, color bleed, etc

bitter hearth
#

I keep forgetting I need to switch to only using cfg++ really
their paper showed really big gains in some of the lower down examples

dusky thistle
#

yeah, def

bitter hearth
#

Blepping added Flux support to DiffuseHigh node yesterday

#

might be useless or might be amazing

dusky thistle
#

seems ol' bleppin' is as hopeless as me

#

nonstop fiddling with the repos

bitter hearth
#

haha yeah similar repos in some ways

dusky thistle
#

prompt from the great festivalman

#

cfg++ on the left, cfg on the right

bitter hearth
#

Extraltodeus repos are always fun too

dusky thistle
#

good example of the kinda diff

bitter hearth
#

cfg++ is so much nicer yeah

dusky thistle
#

yeah, def

#

it tends to have a ceiling on where you can set it

bitter hearth
#

I get confused by the cfgpp variable
cos something like the built in euler_cfgpp in comfy doesn't have a variable

dusky thistle
#

there's something sketch about the math, probably my own understanding of it, but i feel like it's not lined up exactly like the paper describes... every time i try to do it the way it's laid out there, i get a latent explosion lol

#

yeah

#

theoretically you're not supposed to set it to anything outside of 0 to 1, but i've found 1.5 is really nice

#

2 can be good... by the time you hit 3 it usually completely burns to fucking death lol

#

and i mean to death

#

like if you set cfg to 45.0

bitter hearth
#

ah I checked the original paper and they gave code but in diffusers

dusky thistle
#

i looked on their repo

#

they've got it done a few diff ways but it's real similar to what i'm using atm

bitter hearth
#

recently a paper gave code but for Compvis SD 1.4 repo, but using SD 2.1 model
it was baffling

dusky thistle
#

i'm prolly just botching the math at some point cuz i failed to find the equivalency

#

it doesn't help that i keep my notes on printer paper and my cats periodically launch everything all over my office and scramble everything

#

as in... several times a day lol

bitter hearth
#

oh yeah cats are crazy

dusky thistle
#

love em

bitter hearth
#

with the built in comfy one I used 1.5 a lot so I think that is ok

dusky thistle
#

gotcha

bitter hearth
#

there is this other benefit of cfg++ where the early steps look nicer if you end sampling early

#

basically this

#

so you can end it a little bit early to get a crazily soft image

dusky thistle
#

yeah the lack of overshooting is nice

bitter hearth
#

was a while ago but that's what this was

dusky thistle
#

oh yea i vaguely remember that one

bitter hearth
dusky thistle
dusky thistle
bitter hearth
dusky thistle
#

my advice is to buy a 10lb bag of wild bird seed and sit on your couch and sort it for a couple days without sleeping

#

when you're done, mix it back up and sort it again

#

then you'll be ready to prompt

bitter hearth
#

you are kidding me now

dusky thistle
#

i actually did that once

bitter hearth
# dusky thistle

im wondering what was the secret to making that image even.... it cant be just LLM or some controlnet

#

that's a decent answer from the LLM to be honest

dusky thistle
bitter hearth
# dusky thistle

i had no freaking idea ... i thought you were just messing around with that sampler, but chatgpt cant be trolling about it, im trying to figure out how to apply that sampler node into the workflow

dusky thistle
#

my WFs are embedded in these

#

you need my repo to use it

turbid grotto
craggy crest
turbid grotto
bitter hearth
#

for example this is Flux, refined with SDXL at 4096x4096

craggy crest
bitter hearth
craggy crest
turbid grotto
dusky thistle
#

i found 2B screwed my images up more often than not tbh

craggy crest
dusky thistle
#

yup

#

it would latch onto some little detail and change it

craggy crest
bitter hearth
dusky thistle
#

and degrade some of the other details... gettin to this look like there were jpeg artifacts creeping in, blown out pixels like the levels slider was pulled up too high on the blacks

#

it'd do things like take a creepy face somewhere in the corner, and turn it into a smiling bimbo, while the rest of the image wasn't changed

craggy crest
dusky thistle
#

i've come to vastly prefer just getting models trained that allow me to generate directly at my target resolution

#

i don't remember, been a while

bitter hearth
#

it might be okay with heavy cherry picking (generate 10 and pick 1)
I have to use cherry picking with SD 1.5 a lot

craggy crest
bitter hearth
#

this is SD 1.5 hands with heavy cherry pick
the top 10% of hands are fine

craggy crest
dusky thistle
# craggy crest what'd you have shift and cfg set to. and oh, btw - paper for you to read https...

"This insight leads us to explore a rescaled version of the CFG update direction and incorporate a momentum term, similar to adaptive optimization methods. The rescaling is motivated by the need to control large update norms, which can cause significant drifts in the sampling process. To prevent this, we constrain the updates to lie within a sphere. For the momentum term, unlike with traditional optimization, we apply a negative value to introduce a repulsive effect between consecutive updates, effectively down-weighting components already present in previous steps. We refer to this as reverse momentum. By combining rescaling, reverse momentum, and projection, we introduce a new method, called adaptive projected guidance (APG), which allows the use of higher guidance scales without oversaturation or degradation in image quality."

really interesting
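
The three pieces named in that excerpt (rescaling, reverse momentum, projection) can be sketched roughly as below. This is one reading of the description, not the authors' code or any ComfyUI node: the class name, the default values, and the use of flat Python lists for the predictions are all assumptions made for illustration.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class APG:
    """Illustrative adaptive projected guidance on flat lists of floats."""

    def __init__(self, guidance_scale, momentum=-0.5, norm_threshold=15.0, eta=0.0):
        self.w = guidance_scale
        self.momentum = momentum              # negative => "reverse momentum"
        self.norm_threshold = norm_threshold  # radius of the sphere; 0 disables
        self.eta = eta                        # weight kept on the parallel part
        self.running = None

    def __call__(self, cond, uncond):
        diff = [c - u for c, u in zip(cond, uncond)]

        # Reverse momentum: subtract part of what earlier steps already pushed.
        if self.running is None:
            self.running = [0.0] * len(diff)
        self.running = [d + self.momentum * r for d, r in zip(diff, self.running)]
        diff = self.running

        # Rescaling: keep the update inside a sphere of fixed radius.
        n = math.sqrt(_dot(diff, diff))
        if self.norm_threshold > 0 and n > self.norm_threshold:
            diff = [d * self.norm_threshold / n for d in diff]

        # Projection: split the update into parts parallel and orthogonal
        # to the conditional prediction, then down-weight the parallel part.
        cc = _dot(cond, cond)
        coef = _dot(diff, cond) / cc if cc > 0 else 0.0
        parallel = [coef * c for c in cond]
        orthogonal = [d - p for d, p in zip(diff, parallel)]
        update = [o + self.eta * p for o, p in zip(orthogonal, parallel)]

        return [c + (self.w - 1.0) * u for c, u in zip(cond, update)]
```

With `momentum=0`, `norm_threshold=0` and `eta=1` this collapses back to ordinary CFG, which is a handy sanity check before trying it at higher guidance scales. One instance should be reused across the steps of a single generation so the momentum term carries over, then discarded.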
#

i was thinking about something vaguely related today... wondering about implementing some kind of guidance on the guidance like this

craggy crest
dusky thistle
#

just a kernel of a thought, nothing fleshed out whatsoever

craggy crest
#

and it works, really really well

dusky thistle
#

awesome

#

gonna have to look into implementing this

bitter hearth
#

I've wanted for like a year now to make a workflow where it re-rolls the hands until a vision model says its okay
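
The control flow for that kind of re-roll workflow is simple enough to sketch, even though the judging model is the hard part. Both callables here are hypothetical placeholders: `generate(seed)` would be whatever produces an image, and `looks_ok(image)` would be the vision-model check that does not really exist at the needed quality yet.

```python
def reroll_until_ok(generate, looks_ok, base_seed, max_tries=10):
    """Try consecutive seeds until the checker accepts an image.

    Returns (image, seed) on success, or (None, None) if every try fails.
    """
    for attempt in range(max_tries):
        seed = base_seed + attempt
        image = generate(seed)
        if looks_ok(image):
            return image, seed
    return None, None
```

The `max_tries` cap matters: with a weak checker that rejects almost everything, an uncapped loop would just burn GPU time.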
but our vision models are not strong enough yet

turbid grotto
dusky thistle
bitter hearth
turbid grotto
craggy crest
craggy crest
turbid grotto
bitter hearth
#

https://github.com/MythicalChu/ComfyUI-APG_ImYourCFGNow

#

bear in mind it might not be quite what the paper wanted

craggy crest
#

before you use them, read the paper. out loud. and understand what it's for

#

so you use it correctly

turbid grotto
turbid grotto
bitter hearth
#

the authors showed up in this discussion, if that helps https://github.com/huggingface/diffusers/pull/9626

craggy crest
turbid grotto
bitter hearth
#

not sure

dusky thistle
#

even if it does, you're doubling your runtime again

bitter hearth
#

there is also Characteristic Guidance Prediction from https://github.com/redhottensors/ComfyUI-Prediction

bitter hearth
#

it links the paper

#

https://arxiv.org/abs/2312.07586

#

fair warning it will be slow

turbid grotto
bitter hearth
#

no I keep forgetting to try it

#

I am okay with long generation times so this thing could be really good

turbid grotto
#

A while ago I smashed several model patches without thinking much and made sdxl slower than flux

bitter hearth
#

oh yeah when I ran 3 PAG nodes and 3 SEG nodes it was way slower than flux

#

funnily enough the Sana demo has PAG, didn't expect that

turbid grotto
#

oh Sana, looking forward to it
they will probably release after medium to not get left in shadow

dusky thistle
#

dinner is fucking served, guys

bitter hearth
#

at least there are some chilli peppers for flavour

turbid grotto
#

taesd3 works with 35

bitter hearth
#

I wish we had something half-way between taesd and the regular VAEs

turbid grotto
#

yea, I had sdxl workflow where 2k pass takes less time than final vae decoding

bitter hearth
#

yeah happens on a bunch of workflows

#

acceleration loras and high res can easily get that issue

turbid grotto
#

also, tensorrt - massive speedup

bitter hearth
#

yeah I always compile

#

the exception is when you want to tweak certain things a lot e.g. lora weights

dusky thistle
bitter hearth
#

rather than tensorrt I use the current compile model node in comfy these days

turbid grotto
dusky thistle
bitter hearth
elder elbow
turbid grotto
bitter hearth
dusky thistle
turbid grotto
turbid grotto
bitter hearth
#

that's matteo's pack

bitter hearth
#

maybe you can get it working on windows

turbid grotto
bitter hearth
#

its just 10 times less effort doing dev or AI stuff on linux

#

I didn't have the motivation to work out how to get it working on windows

dusky thistle
#

yea it's really pretty easy to use linux these days

bitter hearth
#

the way SD 3.5 mixes realistic with fantastical is cool

#

I don't know anything about anime but some people mixed anime with photographic too

dusky thistle
muted dove
bitter hearth
#

hmm maybe anime loras are ok

dusky thistle
turbid grotto
craggy crest
turbid grotto
muted dove
#

Floridian hospitality

noble coyote
muted dove
noble coyote
#

SD3.5L

bitter hearth
#

colours are amazing

#

on reddit someone was saying that in some ways SD 3.5 colours can be better than flux

noble coyote
#

I think SD3.5 has the upper hand when you stray away from realism towards a more artistic look

#

Flux has excellent realism

bitter hearth
#

it might be that Flux is better for me then, I am still testing

noble coyote
#

SD3.5L's photorealism has a grittiness or a grain - an unattractive quality imho

bitter hearth
#

personally my goal is to make photos that look real

noble coyote
#

It also seems to mimic M3DB fractals

#

Yes, photorealism is cool, but 3.5 has an inherent grain (noise?) which needs to be tackled

bitter hearth
#

ah yeah that sounds like a problem

#

I think Realvis are doing a fine tune currently so maybe that helps

craggy crest
muted dove
#

I like how it put prawns on the dish 😄

remote holly
strange rose
#

holy fuck

#

SD3 Large

#

took up every byte of RAM (32 gigs) and still wouldnt load

remote holly
#

Do you use fp16 model ?

muted dove
#

Supes on a night out

noble coyote
#

Same prompt - Flux the first image - SD3.5_Q8_GGUF the second

#

Just sayin'

cobalt moon
#

I actually love SD3.5 aesthetics a lot.

#

especially abstract art

wispy epoch
#

anyone know the optimal LoRA training parameters for sd3.5

#

?

muted dove
#

This is it refined by Flux

noble coyote
#

Your SD3.5 version is perfect

muted dove
#

...or your workflow with that GGUF model

noble coyote
#

Both 3.5L and 3.5_GGUF_Q8 give that weave lattice aberration - but in Flux it's OK

noble coyote
muted dove
#

It looks worse in Discord than full size, but I think the refined looks more natural

noble coyote
#

The refined does look more natural

muted dove
#

Another from SD3.5

noble coyote
#

I think 8Gb VRAM is the likely cause

#

... of the weave aberration

muted dove
#

Probably not helping. But I also found that SD3.5 doesn't like doing multiple stages

#

Flux is ok with it

noble coyote
#

It is quite possibly a sign that my six year old GPU is actually on its last legs?!

muted dove
#

Try a different workflow, that only uses a single sampling stage.

noble coyote
#

I am using the ComfyUI default w/f ...

sage burrow
wispy epoch
#

training battlefield lora for 3.5, 2 epochs, looking good, 1st training image, 2nd dataset img

muted dove
#

These are from the same prompt, but with this part removed...

Fibonacci, voronoi, fauvism, 
sage burrow
muted dove
noble coyote
muted dove
noble coyote
#

A kind of banding on her 'jumper' at the bottom?

muted dove
#

The shadow on this one! 🤣

noble coyote
#

Papua New Guinea head-shrinkers!!!

pseudo owl
noble coyote
#

The ComfyUI default w/f may not be robust enough to support the prompt's intricacies?

remote holly
#

is it possible to use only t5 instead of the triple clip loader?

turbid grotto
noble coyote
muted dove
noble coyote
muted dove
#

There's not a big difference. Could be sampler, steps, not separating the Clip prompts...? 🤷🏻‍♂️

noble coyote
#

With your w/f and the same prompt there is no difficulty, as far as I can see

muted dove
#

Are you using the LLM and Flux refiner as well?

noble coyote
#

I'll check your clips/schedulers/samplers/CFGs and see if mine compare

#

Not the LLM, and Flux Refiner just hangs ...

remote holly
#

i have this error, do i need to install the sd3.5 vae?

muted dove
#

Mine is using turbo model and fp16 T5

#

SD3.5 vae is in the model...unless you're using a diffusion model

noble coyote
#

I am using a modified version txxl_fp16 and SD3.5 Large checkpoint

#

It would be interesting to find out why the ComfyUI default w/f goes so awry with my prompt ...

#

... and yet yours makes it work perfectly?

remote holly
#

when it tries to decode with the vae i get an error

gritty steeple
remote holly
#

i should add a node here right ?

gritty steeple
muted dove
noble coyote
#

Papercut Stylee

#

GTM SD3.5 (modified) w/f

gritty steeple
muted dove
#

@noble coyote I made my workflow more like the Comfy example...

#

Flux'd

noble coyote
#

Great idea - at least your version/GPU/PC does it well

muted dove
#

Only difference is I'm using the turbo model, so fewer (5) steps

#

Ah, and only using T5 clip too

noble coyote
#

Yes, I'm just using t5 as well now

#

GTM SD3.5 (modified) w/f

muted dove
#

Nice improvement! 🙂

noble coyote
#

"I kinda miss the weave!!!" 🥳

muted dove
#

🤣

noble coyote
#

I have used Portrait Master to make dozens of prompts. Those bracketed values allow for some very fine tuning.

#

But that's just me

#

Water-colour look

wispy epoch
noble coyote
#

... just d/loading SD3.5_Large_Turbo ...

muted dove
wispy epoch
#

okay sd35 large + Loras works

muted dove
#

Also, the results seem to be much better if you only use T5 clip

pseudo owl
bitter hearth
#

Mochi is amazing but I think people forget how good Sora was

wispy epoch
muted dove
pseudo owl
sage burrow
bitter hearth
#

its not just resolution, Sora was ahead in the scenes also

#

but its hard to compare as we don't really have benchmarks for this

muted dove
# muted dove

Same seed and prompt as this image, but changed scheduler from sgm_uniform to beta

noble coyote
#

GTM SD3.5 (modified) w/f

sage burrow
noble coyote
pseudo owl
# bitter hearth but its hard to compare as we don't really have benchmarks for this

Sora's size is also probably massive, it took 10-20 min to generate a short scene on openai gpus.

On fal’s gpu clusters, it just takes 1 minute to generate with mochi.

Sora's outputs were heavily upscaled, enhanced, and cherry-picked. It’s not very fair to compare them to mochi then. Also, those same heavily enhanced outputs on the sora page don’t seem better than mochi.

sullen moss
#

What I don’t like about Flux is that it adds pronounced facial wrinkles to female characters, whether they’re young or old. Even children don’t look like children but like dwarfs. Of course, you can tweak the prompt, but that’s not a real solution

noble coyote
#

Flux skin is preternaturally oily/sweaty and with obvious pores

#

I like the muted tones of SD3.5 for skin in comparison

noble coyote
#

Portrait Master prompts in modified GTM 3.5L workflow

bitter hearth
pseudo owl
bitter hearth
#

my taste is unusual anyway
I don't like Flux for example, but Flux is clearly preferred by the average person

sullen moss
#

For me, at the moment, Kling is the best video generator

bitter hearth
#

there is some new Minimax one as well

#

on the closed-source side

wispy epoch
#

3.5 with battlefield lora xd

bitter hearth
#

wow awesome

#

those were great games

noble coyote
#

Portrait Master prompts in SD3.5L

wispy epoch
sullen moss
muted dove
#

Trouble in paradise

noble coyote
wispy epoch
noble coyote
#

Turbo_Large_3.5 w/f

muted dove
noble coyote
sacred jewel
noble coyote
#

Yes!

#

Turbo?

#

(Can you use LoRAs with Turbo at all?)

muted dove
#

Off-camera, this trio love socialising together

wispy epoch
muted dove
wispy epoch
alpine summit
muted dove
#

Baby Hulk...SMASH!!

mental bison
#

anyone got a gguf sd3.5 workflow

#

trying out the flux workflows but they don't work

sacred jewel
finite hollow
#

sd3.5

fossil pagoda
muted dove
finite hollow
wispy epoch
#

3.5 anime goooo

muted dove
silver sluice
#

i should make a bot that collects all the cool images on this chat and then posts them on civit lol

#

farm those likes, a lot of these images are high quality stuff too

silver sluice
#

lol is that in reference to what i said?

muted dove
#

Yes

errant dust
dusky thistle
wispy epoch
#

SD 3.5 + Anime LoRa

#

fp8

noble coyote
#

I'm going to try out OmniGen in the new SD Next (just released today!)

wispy epoch
bitter hearth
#

I really think models like OmniGen are the future

bitter hearth
wispy epoch
bitter hearth
#

clownshark batwing, torcello, galaxytimemachine, youfunnyguys are among those who have that secret recipe that im trying to understand

gusty trail
fossil pagoda
bitter hearth
# fossil pagoda

hey boto how did you incorporate that idea into merging bunny with tea pot? just prompt or some specific tool you used?

#

it feels to me there is more than just prompts behind creating those ideas

gusty trail
fossil pagoda
#

It has the comfy workflow for the curious

bitter hearth
# fossil pagoda Just random stuff thrown at a wall

thanks, that's what i've been doing with random ideas, but apparently i can't compare those methods of mine with what these guys are creating, im missing out on something that i don't fully understand yet.

#

what they are doing seems meticulous and targeted

#

and what they are creating is not something from ideas alone, they must be using some method/tools

lunar rivet
#

are you guys getting anything good above 1024px? (with sd3.5)
was used to flux just casually doing 2k but with this one it's just noise everywhere

lunar rivet
#

I mean like specifically where both dimensions are above 1024px, not like 1536x768

bitter hearth
#

i would recommend using SDXL Resolution node to choose any of those set sizes then maybe upscale later by 2.x

#

if you are choosing resolutions loosely and randomly that might have issues

gusty trail
#

You might try with upcoming 2.5B

#

3.5 L is still 1024px checkpoint

bitter hearth
#

they mention using a res of about 1mp with dimensions that are multiples of 64... but dw about that, just use the node i mentioned
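
A rough sketch of that rule in Python, for anyone who wants to compute a size by hand instead of using the node. The function name and the choice of 1024x1024 pixels as "1mp" are assumptions for illustration, not anything taken from SAI's docs or from the node itself.

```python
import math

# Assumption: "1mp" is treated as 1024 * 1024 pixels.
TARGET_PIXELS = 1024 * 1024


def snap_resolution(aspect_w, aspect_h, target_pixels=TARGET_PIXELS, multiple=64):
    """Return (width, height) close to target_pixels for the given aspect
    ratio, with both sides rounded to the nearest multiple of `multiple`."""
    ratio = aspect_w / aspect_h
    height = math.sqrt(target_pixels / ratio)
    width = height * ratio

    def snap(value):
        # Round to the nearest multiple, never going below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


if __name__ == "__main__":
    for aspect in [(1, 1), (2, 3), (16, 9)]:
        print(aspect, snap_resolution(*aspect))
```

The snapped sizes land near the 1mp budget rather than exactly on it, because each side is rounded independently.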

lunar rivet
#

is 2.5b supposed to be trained on higher res?

gusty trail
#

seems trained with 1080p but I am not sure about 2k.

lunar rivet
#

I know I can stick to safe resolution and get 1280x896 image with sd3.5 but that's not what I'm asking, I can do 2048x2048 with flux in 1 pass, no upscaling required

bitter hearth
mortal mesa
bitter hearth
#

fun fact, you can render images with 2048 even with sd1.5

#

in one pass

lunar rivet
#

sure, but they will look like ass, you can downsize the blocks with kohya's fix or whatever but that's still not native res and the details suffer

#

the above is also upscaled 2x with esrgan from what I can see

bitter hearth
#

no they wont look bad

#

most of the later sd1.5 models create brilliant images

#

im rendering one at 2048x768 on sd3.5

#

then upscaled by 2x

#

beautiful blonde warrior princess in tattered clothes. epic mountains in the background.

#

if you were to choose a size randomly that isn't a multiple of 64, like 2048xrandom, that might not work

lunar rivet
#

that is not the issue... but also notice the artifacts on left and right

bitter hearth
#

what were you asking?

#

none of the models would take random resolutions and produce coherent results regardless of versions

#

as for the artifacts you could prompt around it, i just did a basic prompt without bells and whistles

lunar rivet
#

look at your prompt at 1024x1536 vertical basically, that is a 64 divisible resolution, sigh

#

the artifacts above 1mp are horrible

bitter hearth
#

does that round 1024x1536?

#

you'd usually choose 640x1536

lunar rivet
#

I just said it's divisible by 64

bitter hearth
#

proportion seems a bit off

lunar rivet
#

it's not about proportion, the whole lower half beyond 1mp is smudgy

bitter hearth
#

yes i see that

#

not sure why it's blurring out at the bottom edges but that image dimension is workable

lunar rivet
#

flux has absolutely no such issues

bitter hearth
#

im trying that on turbo model btw

#

are you testing on turbo or large regular?

lunar rivet
#

I'm not even on turbo, large 3.5

bitter hearth
#

ok

#

hmm

lunar rivet
#

would expect turbo to be even worse ig

bitter hearth
#

i wonder about large regular, copying it into my ssd

mortal mesa
#

flux was trained up to 2mp images

lunar rivet
#

looking at sai's charts I was under the impression sd3.5 was supposed to be better than flux but I guess that's with 3 asterisks and only at select resolutions

bitter hearth
signal shuttle
bitter hearth
#

yeah a possible work around would be to generate within lower res then upscale

lunar rivet
#

sounds like excuses 🤷

bitter hearth
#

its not an excuse, its a technical approach

#

this is large regular btw

#

at 1024*1536

lunar rivet
#

well yeah, already tried it so no big surprise there, it's still smudgy

signal shuttle
# lunar rivet sounds like excuses 🤷

Sai knows that the average person doesn't generate above 1Mp so why train it to be able to generate above 1Mp? it's like when devs know that 98% of people use windows so why make a linux version

lunar rivet
#

I don't follow that logic, as an average person I generate above 1mp, bfl also released models capable of generating above 1mp to the public full of average people

bitter hearth
#

why not generate at 768x1152 then upscale by 2x?
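
The arithmetic behind that suggestion, as a small sketch. The helper names and the 1mp budget (taken here as 1024x1024 pixels) are assumptions for illustration; the upscaler itself is out of scope.

```python
# Assumption: the "1mp" budget is treated as 1024 * 1024 pixels.
MEGAPIXEL = 1024 * 1024


def megapixels(width, height):
    """Pixel count expressed in units of 1024 * 1024 pixels."""
    return width * height / MEGAPIXEL


def two_pass_plan(final_width, final_height, scale=2, multiple=64, budget_mp=1.0):
    """Work out the base render size for a generate-then-upscale workflow and
    report whether that base size stays inside the assumed 1mp budget."""
    base_w = final_width // scale
    base_h = final_height // scale
    base_mp = megapixels(base_w, base_h)
    within_budget = (
        base_w % multiple == 0
        and base_h % multiple == 0
        and base_mp <= budget_mp
    )
    return {"base": (base_w, base_h), "base_mp": base_mp, "within_budget": within_budget}


if __name__ == "__main__":
    # 768x1152 upscaled 2x gives 1536x2304.
    print(two_pass_plan(1536, 2304))
    # 1024x1536 as a base is already over the budget.
    print(megapixels(1024, 1536))
```

By this measure 768x1152 is about 0.84mp and 1024x1536 is 1.5mp, which lines up with the smudging reported at the larger size.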

signal shuttle
mortal mesa
#

i'd also like to make apple juice out of an orange

bitter hearth
#

using turbo ... at 768x1152 then upscaled by 2x

lunar rivet
#

why would I want to upscale when I can generate in 1 pass with superior details... but okay, you guys have fun, I already know everything I wanted to know about the model

bitter hearth
#

why wouldn't you when you are bumping into technical issues with how the model is trained?

lunar rivet
#

because there is a better model that can do it out of the box, it would appear

bitter hearth
#

and why would you even generate a large image in 1 pass when you can optimize time/speed by the method i mentioned

signal shuttle
bitter hearth
#

ok gl you are arguing over things that dont make sense to me

pseudo owl
#

Mochi1 can gen a video of someone watching a video lol. Amazing open source model for text-to-video, but we need image-to-video now.

lunar rivet
#

I don't know, just a crazy thought

mortal mesa
#

you've been lied to!

#

anyway 3.5 has high potential probably why most are here

gusty trail
#

Just use what you like. People are still using sd1.5 nowadays. A new model won't stop people from using an old one.

hallow lion
pseudo owl
hallow lion
low stone
bitter hearth
noble coyote
#

At the entrance of a dimly lit cave, a towering, majestic dragon with sapphire-hued scales glistens in the faint light. The dragon stands tall, holding two crystalline prisms in its claws, angled precisely like those in the reference photo. The sunlight streams through the cave entrance, hitting the prisms at specific angles, causing vivid, realistic beams of light to split into a spectrum of colors, casting a radiant rainbow on the dusty ground. The surrounding area is shrouded in partial shadow, with the play of light and dark creating a mysterious atmosphere. The dragon’s intelligent, piercing eyes gaze at the viewer, offering a silent challenge: solve the ancient riddle of light and shadow. The cave walls are rugged and dark, with faint engravings hinting at forgotten knowledge. The overall mood is one of mystery, magic, and high-stakes intellect, as the dragon stands guard over the path forward.

#

SD3.5 Turbo

bitter hearth
azure zealot
#

Anyone getting decent generation times for SD3.5 Large with 12GB VRAM?

craggy crest
azure zealot
bitter hearth
#

and turbo takes 10s

azure zealot
#

Comfy/Forge?

bitter hearth
#

comfyui

#

i need to mention this too. i have 32gb ram

azure zealot
#

Yeah he has 64GB RAM but it's a 4070 super, dunno what he's doing wrong

bitter hearth
#

that should be fast enough unless he's multitasking with gpu intensive tasks

#

i have dual monitor setup and mostly have youtube playing at 1080p while i render with turbo

craggy crest
mortal mesa
#

maybe they are 4k images

#

haha

azure zealot
#

Yeah I asked for workflow screenshot, we'll see

bitter hearth
#

thats the kinda res i use.. so that's 2k

#

if your friend has heavy workflow that could be one reason

azure zealot
#

Your unet/diffusion model file is about 16GB?

bitter hearth
#

8gb gguf

azure zealot
#

Ohhhh lol that explains it, he's using 16GB main model I think

bitter hearth
#

but even with 16gb original file from SAI that didn't take me longer than 15s with turbo

azure zealot
#

He's not using turbo at all so that's a different animal

bitter hearth
#

a 16gb model shouldn't strain your 64gb of ram enough to affect render time

#

yeah he's using regular as you mention, but that would take max 30s on a 40-series card

#

people are saying fine tuning sd3.5 would be fairly easy, and i can't wait for it to fix some of the obvious artifacting

hallow lion
pseudo owl
hallow lion
#

omnigen + mochi = movie studio?

bitter hearth
#

we need better upscaler for sd3.5

#

the hack of using sdxl to refine is not cutting it

sage burrow
#

GGUF and I don't seem to be getting along 😦 Are there new special clips to add? Recommended settings?

bitter hearth
#

im gonna stop upscaling until then

bitter hearth
#

the image just above is with gguf

sage burrow
bitter hearth
#

umm, are you using sd3 or sd3.5?

sage burrow
#

3.5

#

I put in the same 3.5 gguf 8 workflow

bitter hearth
#

your workflow is not correct then, you need triple clip loader for sd3.5

sage burrow
#

does a regular 3.5 workflow work with gguf as well?

bitter hearth
#

here is a workflow you can try, just replace the clips with the clip g, clip l, and t5xxl that you have

sage burrow
bitter hearth
#

not really, what im using has been around, you dont need them strictly

#

but by using the ones im using those are supposed to give slightly better results

#

not by huge lot

sage burrow
#

what is the LONG VIT clip I see all the time in workflows? Is it better than g?

bitter hearth
#

its made by a guy who works for Glif lol

silver sluice
bitter hearth
#

that upgraded clip l gave me worse images on an SD 1.5 checkpoint when I tried it

#

so just to warn you it might be a downgrade rather than an upgrade

silver sluice
#

@bitter hearth i come back and i see you're still spreading the gospel on better clip models, it's good stuff im a big fan

bitter hearth
#

Clip l is the normal clip for SD 1.5 though

#

i havent had bad results with that long clip using it on sd3.5 or flux

silver sluice
bitter hearth
#

can't remember which

#

i didnt know sd1.5 needed clip loader 👀

#

how would it read your prompt without clip lol

#

well i mean i never had to specifically use a node for it when i used sd1.5 on comfy

silver sluice
#

from left to right

  • realismBYSTABLEYOGI_ponyV2FP32.safetensors
  • sd3.5_large_fp8_scaled.safetensors
  • flux-dev-de-distill-Q8_0.gguf
  • acornIsSpinningFLUX_devfp8V11_Q4_0.GGUF
bitter hearth
#

you probably used load checkpoint

#

in that case

bitter hearth
#

i also didnt think sd 1.5 were diffusion models

#

hmm

#

what did you think sd 1.5 was

silver sluice
bitter hearth
sage burrow
silver sluice
#

sd3.5_large_fp8...aled | 🌱 2439241852 | 🦶 25 | 🦮 3.5 | 🎤 euler | 🗓 10/24, 3:25 PM | ⏱️ 143s

bitter hearth
#

goes into model/clip folder

silver sluice
#

my friend tried the Longclip model in Forge as a drop in replacement and he can confirm it's not compatible with forge yet

noble coyote
silver sluice
#

sd3.5_large_fp8...aled | 🌱 400971331 | 🦶 24 | 🦮 3.5 | 🎤 dpmpp_2m | 🕦 sgm_uniform | 🗓 10/24, 3:28 PM | ⏱️ 90s

bitter hearth
silver sluice
#

too bad the hands in that pic with the pink haired girl and the girl with those white jeans look so mangled

bitter hearth
#

sd3.5 is better left un upscaled

bitter hearth
silver sluice
#

i think it's safe to say that SAI has redeemed themselves with 3.5. whereas 3.0 was a debacle, they managed to push out something super decent that's exciting to work with and rivals Flux in a lot of ways

bitter hearth
#

they had no choice 🙂

#

people were literally giving up on sai after sd3.0 and with the arrival of flux

silver sluice
#

well the other choice was to crawl under a rock and just become forgotten

#

afaik SAI was dead and 3.0 was the last model and they had just shrugged and given up, it's nice to see they still got some decent talent making good models in there

bitter hearth
#

model wise sd3.0 was an utter mess and unusable for any work, and then they had a ridiculous licensing policy that backfired, community was way pissed with sai

silver sluice
#

stuff like hands is just minor quibbles that can be fixed by loras, adetailers, or whatever people want to use in post-processing. overall tho I much prefer SD3.5's illustration aesthetics over Flux's. I think flux is still king for realism but I'd give the crown to sd3.5 in the anime/comics/cartoons category

mortal mesa
#

sd3 could have been much nicer if, you know, they didn't try to "protect" things with their trust and safety

#

literally the only issue

bitter hearth
#

yeah dont try to censor what is fundamental to life 🙂

silver sluice
#

i'd say the 3 downfalls of 3.0 were the quality, the licensing, and what CivitAI did to hinder its popularity early on

sage burrow
#

it wasn't any more censored than the first release of 1.5 or sdxl though right?

bitter hearth
bitter hearth
silver sluice
#

i agree with their out of the box support for nipples in 3.5 whereas 3.0 insisted on a blank chest, i also agree with not training it for genitals and let the community handle that

mortal mesa
bitter hearth
#

i dont understand the discrimination towards what's a fundamental aspect of life

mortal mesa
#

ya its weird

bitter hearth
#

i mean there are surely some aesthetic concerns to it but with proper female models that can be aesthetic enough

sacred jewel
silver sluice
#

i think there's something to be said about having corporate models or models that can be used in enterprise or restricted situations where it's not really okay to be generating full blown porn images lol, i think if we can all agree that nipples are universally okay then we're off to a good start, there's a fine line between building a pervy tool and building a creative tool

bitter hearth
# sacred jewel

is there any crash course material for your image render secrets?

silver sluice
# sacred jewel

my favorite one from your set is the terminator skeleton dude

#

awww and there's no workflow data in the image 😦

bitter hearth
#

@sage burrow did you get the files sorted out?

sage burrow
sacred jewel
sage burrow
noble coyote
bitter hearth
sacred jewel
sacred jewel
bitter hearth
mortal mesa
noble coyote
#

Turbo 3.5L + LLM

sacred jewel
#

This is what it looks like in normal operation 😛

sage burrow
#

Anyone have a 3.5 GGUF workflow embedded in an image, that works? 😄

bitter hearth
# sacred jewel

looks complex but im excited to study it over a cup of coffee / tea later on in my quiet time 🙂

bitter hearth
mortal mesa
bitter hearth
#

im genuinely curious btw, i can't think of a reason for the blurred image you are getting

#

i would double check the node files you are using

#

models, clips, vae

sage burrow
bitter hearth
#

I like how its neatly organised and then clownshark workflows are like a whirlwind

#

I think mine are sort of halfway between the two

#

@sacred jewel i might need gpu upgrade from the look of that workflow..

#

i was looking into it... lots of gpu intensive tasks in it

#

you could tile

#

tiling reduces VRAM loads
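
A minimal sketch of why tiling helps: each pass only ever handles one tile-sized region, so peak memory tracks the tile size rather than the full image. The function name and the default tile/overlap values are assumptions for illustration, not what any particular ComfyUI tiling node does internally.

```python
def tile_boxes(width, height, tile=1024, overlap=128):
    """Split an image into overlapping (left, top, right, bottom) boxes.
    Every box is at most tile x tile, and together they cover the image."""
    stride = tile - overlap

    def starts(size):
        # One tile is enough if the image fits inside it on this axis.
        if size <= tile:
            return [0]
        positions = list(range(0, size - tile, stride))
        # Make sure the last tile ends exactly at the image edge.
        positions.append(size - tile)
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


if __name__ == "__main__":
    boxes = tile_boxes(2048, 2048)
    print(len(boxes), "tiles")
    for box in boxes:
        print(box)
```

The overlap is there so the seams between tiles can be blended afterwards; a larger overlap means more tiles and more total work, but the per-tile memory cost stays the same.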

mortal mesa
sacred jewel
bitter hearth
#

yeah I always use FP8 or even NF4 and its same

sage burrow
# noble coyote

This one looks like the regular 3.5 large, then also the GGUF?

bitter hearth
#

it seems that SD 3.5 turbo comes out better than Schnell for some reason

#

even though its the Schnell equivalent

sage burrow
#

I never did like schnell 😦

#

3.5 seems pretty awesome though in comparison

bitter hearth
#

the realvis schnell was better

#

I don't think it will get that much fine tuning attention
but if it did, I think its possible for Schnell to catch up to dev

#

with the right loras and nodes, SDXL 4 steps is very very close to regular SDXL

sacred jewel
#

Skynet Cyberdyne Systems LoRA

(a little YFG SpyWorld50s LoRA mixed in)

noble coyote
sage burrow
#

The second one is from my system using Torcello's workflow, but config up to 30 (it wasn't good at all at 8). The first one is from Mage.

sage burrow
#

now clearly my prompting sucks in this case lol

icy drift
#

OmniGen running locally! 😄

noble coyote
hallow lion
noble coyote
#

🥳

hallow lion
icy drift
#

I'm getting about 30 seconds per image on 4090.

#

It's slower for image to image though. (Using an image in the prompt.)

#

LOL, well, not sure it's a boy... Could be I guess. 😛 It changed the hair color.

bitter hearth
icy drift
#

Okay, you can't necessarily give OmniGen instructions like "rotate this coffee mug". You still have to prompt it with a description of the image you want it to generate, but your prompt can include images for reference.

sage burrow
icy drift
#

Success! 🎉 She went clubbing.

#

Wow it really works.

#

And it's MIT license. I hope someone is willing to finetune it.

sage burrow
#

that's really neat 😄

hallow lion
#

sure is

icy drift
#

It's moderately uncensored. (Did a very lite test though. Your mileage may vary.)

sage burrow
#

I can turn all the anime waifus into ZOMBIES! 😄

dull star
#

I got lucky with a seed

icy drift
#

Pose transformation and facial expression change.

sage burrow
#

Glif should add OmniGen 😄

hallow lion
#

tremendous

dull star
#

@cunning lintel

icy drift
#

Render at twice the speed using 20 steps.

#

And the results are slightly better???

#

(Might be lucky seed / better with anime vs. realism etc.)

sage burrow
#

I guess character consistency has been (extra) solved?

icy drift
sage burrow
#

does it work for "photos" as well?

icy drift
dull star
cunning lintel
#

Those look good!

dusky thistle
dull star
#

Negative prompt: background blur, bokeh, illustration, photo, pencil drawing, crayon, photorealistic, anime, video game, CGI
euler, simple scheduler, 2.4 cfg, modelsamplingsd3 left at 3.00

#

I'm trying artists

#

it seems the zdzisław beksiński somewhat works but it just melts everything which makes it pretty much useless

#

and sometimes claude monet creeps in if I prompt for it

#

I just prompted with István Csók this time but idk if it's even doing anything

cunning lintel
#

there's a few left that work? nice 🙂

dull star
#

impressionist oil painting by István Csók of a man with torn and poor clothing sitting on a bed in depression and sadness. There is a window which illuminates the whole room with a warm orange glow.

#

somewhat I guess