#🆕|sd3

cobalt moon
#

of course you won't say we gonna train 16-channel VAE and YOLO

#

at least it is an attempt

dull star
#

hell yeah you will

#

you will wait 30x longer

#

and so what

#

/s

cunning lintel
#

Bit unfair to compare this effort to SAI, they used a tried architecture from SAI, then threw some data at it. It's fun, would be nice to see it grow, but it is not even in the same league of work needed

cobalt moon
dull star
cunning lintel
#

One party researches and writes a recipe, the other party uses the recipe, yet it's all the same work 👀

low stone
#

I know it's been sdxl refined, but I keep being impressed at the composition that hunyuan puts out. The blue ball vs red hat stuff is not the best, but 1 to 1.5 subjects on the screen has way better composition than sdxl and different than ella and pixart so it complements them well for prompt running across multiple models.

dull star
#

that's very nice

#

I hope we'll get safetensor weights so I can try it as well

#

there's a comfyui plugin so you can run it there with no complex setup

#

its the same plugin that lets you use pixart-sigma and other stuff

cunning lintel
#

How many custom nodes are in your comfyui? Checked them all? That's a much larger attack vector than this set of pickled weights that's been out for some time and no red flags have been raised

#

Just use it if you want to try 🙂

dull star
#

oh yeah right

#

they can just update anything anytime and I won't really know

bitter hearth
#

whats that guy responding to ?

#

I can't see cause twitter requires you to have an account to even see anything, just garbage

hallow lion
#

sd def feels more accomplished than sdxl

bitter hearth
#

have a shrek

lavish sparrow
bitter hearth
dusky thistle
#

does this mean it's just going to take a while and they are going to release them? or the plan may have shifted away from an open release

bitter hearth
#

bruh

#

How many times have they said they will release it open source and all

#

sadcat dies

raven fern
dull star
little quarry
#

Two more weeks??

dull star
#

only two more weeks after two weeks

#

🙏

woven dock
woven dock
woven dock
#

another day, another 2 weeks

dull star
#

16 channel imo is a sweet spot for finer things like text

woven dock
#

Midjourney's vae is around that size

dull star
#

damn

hallow lion
#

hang in there Emad just fix the hands first then drop the Weights.

dull star
#

emad will fix the release date

hallow lion
#

Emad is the new Gaben.

#

But he CAN count to three.

teal fossil
hallow lion
#

You can buy bootleg Russian SD3 weight copies.

#

Axl Rose leaked them on the net too.

#

Someone make "In 2 weeks" t shirts.

raven fern
#

that moment when half life 3 releases before sd3

hallow lion
#

Another day, another two weeks.

icy drift
teal fossil
low stone
#

took a little while to tell it what I didn't want and it stopped showing me that stuff

icy drift
low stone
#

the answer is, everything is on there, good, bad, everything. you just have to curate your feed.

dull star
#

its okay to have an account to keep up with news and crap

teal fossil
low stone
#

eh? literally the whole world is on there but they should know what you want?

#

once I followed all the right academic and ai people, and said no to the political stuff, I don't get anything stupid in my "for you" feed anymore.

icy drift
dull star
#

can we just get SD3 2B

teal fossil
teal fossil
dull star
#

3B?!

teal fossil
dull star
#

ah

#

minor spelling mistake

raven fern
#

2B and 9S models (no one gonna get this reference)

frozen lynx
#

guys what to do while waiting for sd3?

raven fern
#

keep generating waifus and husbandos

cunning lintel
teal fossil
cinder junco
raven fern
#

haha

#

but man.. every time i get a ping from this server, i kinda hope it's from the announcements channel, but it turns out it's some random people... smh :3

#

but i still love ya

low stone
cinder junco
#

I don’t really understand the logic of choosing T5 for the text encoder. Wouldn’t a newer llm (e.g. llama 3) or even a reduced parameter count T5 using new distillation methods be better?

raven fern
#

hopefully the t5 module is somehow plug and play and we eventually can replace it

rain current
bitter hearth
bitter hearth
dull star
#

I wonder if I'll switch over to invokeAI if that gets SD3 support, just so img2img and regional stuff will be easier to do

dull star
#

yeah idk about replacing it with other weights that do not match the size

#

as they literally just replace the bits of the weights of T5 with zeros in the MMDiT or whatever to make it not use T5

raven fern
#

we are still in the dark when it comes to how this is actually all structured code wise :3

#

depending how their pipeline actually works

dull star
#

no wait

#

I got it wrong

bitter hearth
raven fern
#

the community will find ways to fix stuff anyway

dull star
#

they replace the T5 EMBEDDINGS with zeros, to make the conditioning

#

still, the T5 model they are using is going to perform the best
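A minimal sketch of the "replace the T5 embeddings with zeros" idea, assuming the conditioning is a sequence of CLIP token embeddings followed by T5 token embeddings of the same width. The function names and the tiny shapes are hypothetical, and this is not taken from any released SD3 code.

```python
def zeros_like(seq):
    """Return a sequence of all-zero vectors with the same shape as seq."""
    return [[0.0] * len(row) for row in seq]

def build_context(clip_emb, t5_emb, use_t5=True):
    """Concatenate CLIP and T5 token embeddings along the sequence axis.

    When use_t5 is False the T5 slot is filled with zeros, so the model
    still receives the same sequence length it was trained with.
    """
    if not use_t5:
        t5_emb = zeros_like(t5_emb)
    return clip_emb + t5_emb  # list concatenation = sequence-axis concat

# Toy shapes: 3 tokens of width 4 per encoder (real widths are far larger).
clip_emb = [[0.1, 0.2, 0.3, 0.4] for _ in range(3)]
t5_emb = [[0.5, 0.6, 0.7, 0.8] for _ in range(3)]

ctx = build_context(clip_emb, t5_emb, use_t5=False)
print(len(ctx), len(ctx[0]))  # sequence length and embedding width
```

The point of the sketch is that the T5 weights never need to be loaded in this mode: only the shape of its output is kept.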

raven fern
#

im also curious about the edit model, i hope it fixes the artifacts and deformations from cosxl

dull star
#

I'm glad that inpainting will be a thing out of the box

raven fern
#

and i also shaved

dull star
#

god SD3 with highresfix would be a treat 🙏

raven fern
#

OOM 🙂

dull star
#

not for me

#

and if its gonna be 2B, then the base res is gonna be 512

raven fern
#

yea i cant wait to upgrade my pc

dull star
#

yeah its easy, just simply buy a new pc

#

duh

raven fern
#

i mean im just waiting for 5090

bitter hearth
raven fern
#

and if they dont give us at least 32GB vram, im gonna kill the nvidia dude... jk of course...

bitter hearth
#

(in minecraft)

raven fern
#

kek

#

i really dont like to have a beard personally, just always itchy, so i try to shave as much as possible, but sometimes lazy as hell

#

and technically my eyesight is not great, but i just dont want to wear glasses

bitter hearth
#

Just see better

#

?

#

profit

low stone
violet escarp
cinder junco
# violet escarp quantization is better than distillation

Interesting. Do you have a source for that? Is the rationale something like distillation (reducing number of weights so you can keep higher weight precision for a given model file size) effectively reducing the model's breadth of knowledge while retaining the accuracy of the knowledge it has, while quantization makes the knowledge more approximate but maintains its breadth?

violet escarp
#

I might have misremembered the paper, but quantization is still one of the best ways to reduce vram requirements

#

I guess you could distill and quantize

sterile pendant
#

Yes in almost every case with llms, this is the answer. More parameters will almost always be better, assuming it's at least Q3 or higher. Like a 13B at Q4 will outperform a 7B at Q8. The 7B Q8 will be close to 7GB in size and have a perplexity like 5.9, the 13B Q4 will be around 7gb in size as well and have a perplexity of like 5.3 (lower is better)

#

But you can also distill the model as well to make room for more relevant data. If you're using a model for writing English novels, you probably don't need the model to contain a shitload of data about math and science

#

So in the same file size, you can stuff more relevant data in the model if you need to(or just shrink the overall filesize to play nicer with multimodal setups like when doing stable diffusion so you don't need 256 terabytes of ram for swapping 3000 models in and out of the GPU lol)
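The 7B Q8 vs 13B Q4 size comparison above can be checked with rough arithmetic. This is a sketch that assumes file size is just parameters times nominal bits per weight; real quantization formats store slightly more than the nominal bits, so actual files come out somewhat larger. `approx_size_gb` is a made-up helper name.

```python
def approx_size_gb(params_billions, bits_per_weight):
    """Rough weight-file size in GB: parameters x bits / 8.

    Ignores metadata, scales, and any layers kept at higher precision.
    """
    return params_billions * bits_per_weight / 8

print(approx_size_gb(7, 8))   # 7.0
print(approx_size_gb(13, 4))  # 6.5
```

Both land around 7 GB, which is why the comparison is fair on file size and the difference shows up in perplexity instead.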

noble coyote
#

Wombo Dream i2i into SDXL+LoRA+PAG Advanced NOT SD3

cobalt moon
#

do you even know how's the performance of LLaMa 3 8B Q4 + SDXL

#

seriously even some of the SSD can't handle it

fossil pagoda
teal fossil
low stone
low stone
quartz mulch
#

When I finally got it to work.

lavish sparrow
#

however, you can't use the LLM directly as a tokenisation step 😦

teal fossil
lavish sparrow
#

but depending on model, you might have to change output, i'm using ollama as backend

lavish sparrow
#

switched from llama3 to phi medium, so gonna have to find my bearings again a bit

lavish sparrow
#

this is the node i'm using to use ollama

#

important is that you get the model instruct template right!!!!

#

for this workflow, the system prompt MUST be used, or you'll be getting gibberish that you cannot use

#

right now i'm trying to find a better system prompt

#

and then stuff like this happens

#

@teal fossil workflow is inside the image

abstract nymph
#

still no news? goodness

hallow lion
#

Two weeks.

abstract nymph
hallow lion
#

Echoes in the chambers of the mind.

#

Mind of Emad.

little quarry
#

Two weeks

woven dock
#

wrong server bub

drifting oak
#

This honestly pisses me off, and it's only the people who don't know how AI works or they've just never used AI before, AI has been with us for a whole while it only got popular with t2i and llms

dull star
#

decades ago: There is no such thing as a digital artist thomas

drifting oak
#

Lol even VFX, a vfx artist uses AI tech to track footage so they can add CGI stuff later on, but they wouldn't know that bcoz they're npcs

hallow lion
#

Luddites man, get off your PC.

#

Use analogue only. Kodak Agfa family moments. Go develop that shit!

#

Better yet only cave art is true art, only uses natural raw materials, paint me some bisons.

icy drift
dull star
#

its only two weeks away on every single morning

icy drift
rotund ibex
low stone
autumn arrow
#

Had the hands fixed afterwards

#

But text is SD3

low stone
abstract nymph
#

you never know eh?

low stone
low stone
half tundra
#

kids' cartoon

hallow lion
#

How heavy are the weights? 50kg? 500? 5000?

#

Can we even lift the weights after Emad drops them? Are we worthy and capable?

hallow lion
#

Time to sleep.

#

Maybe by the time I wake up the SD3 weights will drop.

hallow lion
hallow lion
remote holly
#

Has sd3 been released?

low stone
night condor
#

Feu

teal fossil
raven fern
#

2 weeks plus tax

quartz mulch
#

I'd be soooo all over it dropping. IF I could even think about running it locally without my gpu sprouting legs and getting the hell out of my country.

#

even if it drops, I'll still have to use the api.

noble coyote
#

2 epochs and an era™

noble coyote
lavish sparrow
#

telling it what it should do

#

else it would just respond with a default answer

noble coyote
#

OK, I have a ChatGPT4 account - will that link to this node?

lavish sparrow
#

i dunno if you can do a system prompt on gpt4

noble coyote
quartz mulch
#

In general no. ChatGPT does not offer api access. You need a playground account for that. A separate thing entirely.

#

But you can use openrouter. It's what I do, and the charges are exactly the same as in playground. But offers many, many more models.

noble coyote
#

I do have a PG a/c - so I have plenty of options...

quartz mulch
#

But chatgpt seems to be entirely non-functional for me for the past 5 minutes or so. API calls work though.

noble coyote
remote holly
teal fossil
sterile pendant
lavish sparrow
remote holly
#

Usually I'm a patient person but the SD3 demos impressed me so much that I can't wait to have the open source weights to test, but I guess releasing a model takes time

frigid saffron
#

sd3 open means midjourney in danger, i strongly believe

remote holly
#

Same

#

A midjourney quality open , fully finetunable , is the best thing about sd3

desert garnet
hallow lion
#

What happens is that Stability AI GPUs are so strong that time bends around them.

#

We are in a time loop of two weeks created by this phenomenon.

remote holly
#

Or 42

desert garnet
#

but the CTO said May, he doesn't lie

remote holly
#

Which will be released first, sd3 or gta6?

desert garnet
#

manhunt 3

remote holly
#

Half life 3

dull star
#

hope 2B comes at the end of the month fr

#

8B and others need cooking though

#

2B is a perfect candidate for accessibility and training

hallow lion
#

Whats the diff between 2B and 8B?

dull star
#

quality probably for example

#

I am scared that prompt adherence too

hallow lion
#

and Size

dull star
#

a little but sure, but I don't know if it's gonna be massively worse than 8B in prompt adherence

#

if it's gonna be on the level of the others such as pixart-sigma I might as well wait for 8B or something

#

all I could do is generate paintings as they are in the style I like

#

finetunes of 2B would be fine though

desert garnet
dull star
#

kek

#

2B is closer to a full train as its a smaller parameter model, so we might even get better quality with 2B than what we see on the API

#

I just wonder how GOOD 8B would be at its fullest potential

#

like how llama3 8B was trained for 15T tokens and its wonderful

#

we won't get anything close to that, but I'm still willing to wait if stability still has the opportunity to train further as much as they can

gusty gale
#

I feel like the time complexity of SD3 releasing is [ O(n^∞)+ (2 weeks) ]

#

it releases in two weeks, at any given time

pliant osprey
#

sd3 not supporting unet is a problem with no support for controlnet ootb

remote holly
gusty gale
remote holly
#

"i will catch the turtle maybe in june "

dull star
remote holly
dull star
#

they said that they will release the smaller models first

noble coyote
lavish sparrow
noble coyote
#

I have a fresh API Key and a positive balance $£

noble coyote
#

I have d/loaded it into Custom_nodes and selected the Ollama_Vision Node in comfyUI

#

OK, I got the LLama Server running in the background - says Payload Too Large!!! 😄

noble coyote
noble coyote
dull star
#

YESSSSS

#

this would be amazing

#

bunch of variants to choose from

#

the community will find which one's the best for each model size

raven fern
#

We might release some variants.
We might release some
We might release
We might
sadcat

bitter hearth
#

sd3 2030 confirmed

cursive mist
#

2030 AD?

pale aurora
#

trying to use the stability api and I can't even gen a picture of a woman in a shirt. Just comes back blurred. You can clearly see that she's NOT naked

raven fern
#

enhance.... enhance....

#

that second pic tho.. there is something sus going on on the bottom part of the pic LOL

cunning lintel
jolly drum
raven fern
#

kek

raven fern
pale aurora
low stone
pale aurora
#

I also have 'nude' as a negative

raven fern
#

that straight up looks like one of those videos when you start MGS4, its like some weird tv channels @low stone

bitter hearth
#

y'all think we will ever get a truly photorealistic model?

low stone
raven fern
#

i mean some sdxl models are very good with photorealistic, assuming you prompt correctly

cunning lintel
pale aurora
raven fern
#

so you telling me we cant generate horror?

cunning lintel
#

actually, it works better than females 🙂

#

No, in all seriousness, that filter is bizarre, but it is what it is. The good thing is that there's a lot of prompts to try for the first time once you can use sd3 local 😂

bitter hearth
raven fern
#

well to answer your question, one day for sure

#

i dont see why not

bitter hearth
#

i hope so, looking at ai photos all day makes it so easy to tell. Kinda ruins it

raven fern
cunning lintel
#

it's at the api level

raven fern
#

ah

#

so basically you get them blurred on comfy for example from the api right?

cunning lintel
#

yup

raven fern
#

well that sucks

cunning lintel
#

i think the api is the only source, so it's the same everywhere

raven fern
#

yea just have to wait for the weights :3

#

i mean can't they just release at least one? like heck even give us the small 2B model to play with

#

sigh sadcat

cunning lintel
#

If the model isn't good yet, the reactions will be "is this it???" and no one will remember it's the limited 2b model. Don't think SAI can win whatever they do at this point

#

Though some openness/updates would be much appreciated

pale aurora
#

sdxl wasn't really all that great when it released either. The community made it good

raven fern
#

i would be happy even with a text update that says, it will release in June or something

#

just give us some info

pale aurora
bitter hearth
pale aurora
#

one moment

raven fern
#

take your time

bitter hearth
#

no problem, for me the best i've got out of sdxl is

pale aurora
#

I mean, that's pretty good?

bitter hearth
#

maybe, it's hard to tell after a while if it's real enough

pale aurora
#

aside from the iris, I wouldn't know it was ai, and only because I'm looking for it

pale aurora
pale aurora
weary crystal
pale aurora
pale aurora
weary crystal
pale aurora
low stone
pale aurora
raven fern
#

is there any possibility before June? :3

low stone
#

not this weekend because it's a holiday... so that's the only sad part i'm focusing on. 🙂

low inlet
#

Hello Guys

#

can someone join diffusers 4 vc to talk ?

dull star
#

lmao

#

guess he got contacted that the model is not 2 weeks away, but like 2 months

#

😔

#

Everything is soon if you personally have access to it

pale aurora
dusky thistle
dusky thistle
pale aurora
dusky thistle
#

they do

#

i tried generating flags with hunyuan

#

won't post em here, but it was a test for how carefully they've checked their surely massive data set

#

they have def gone through every image one by one

#

flags that are easily confused: holland, russia, france. nailed em all consistently

#

britain, USA, australia: nailed all those consistently

pale aurora
#

pixart sigma might as well be sd2 for generating anything more than fully clothed people

dusky thistle
#

zero trouble with their own of course, couldn't do taiwan, and had absolutely no concept of what a nazi flag was or the confederate flag. none.

pale aurora
#

'flag of taiwan' probably gets you extradited

dusky thistle
#

both of those latter symbols are fn everywhere in all kinds of random images

#

historic stuff, garbage online, photos of random political events

#

the fact those are not in their model at all is honestly kinda shocking/impressive

pale aurora
dull star
#

SD3 8B (or 2B if the prompt adherence is at least better than pixart and other open ones) is something that would be amazing

#

I do believe it coming out

#

just not soon

#

pixart-sigma isn't even close to what SD3 8B can do

dusky thistle
#

what's impressive is that it is completely not in the data set at all

#

that's a ton of labor

dull star
#

SD3 8B is like 70% of the way there to ideogram level of adherence, and pixart is like 40% or worse, idk how to say it

#

it also all boils down to motion/graphic elements and text working in SD3, whilst not in Pixart-Sigma

#

but pixart-sigma, for a 0.6B model is still extremely impressive

#

for complex compositions its waaaaay better than something like SDXL

#

but it doesn't meet what I want to do, which SD3 8B gets really really close

pale aurora
#

oh yeahhh forgot about ideogram. That model is great

dull star
#

ideogram is amazing, I just wish there was something like it offline

#

SD3 8B gets quite close, and I suppose finetunes would be 99% of the way there tbh

cunning lintel
#

pixart sigma does some nudity by accident quite often. maybe the prompts were filtered, but the images it was trained on less so it seems

dull star
#

2B or 4B being able to be finetuned offline, I would make a bunch of models that would boost the motion graphic(?) element capabilities

dull star
#

it got through the censorship very rarely

#

I don't know how it happened

cunning lintel
#

oh i remember that... it feels as if the filter has some sort of cumulative scoring system. woman: 1pt, photorealistic: 1pt, upper body skin: 2pts, breast structure: 1pt. 3 is out. That image evaded it as well, totally covered upper body 😂

dreamy sundial
#

raw sd3 output

faint breach
dull star
#

same guess

#

its not even well trained

#

I wonder if they are just using the laion nsfw detector or whatever

severe phoenix
#

Playground is about to drop their v3 model which looks like it might be up there with sd3. they didn't specifically say this but they sound pretty confident about its ability to render better faces and prompt adherence.

low stone
dull star
#

I want SD3 so bad man

#

these images were from a month ago

#

the meme potential

cunning lintel
#

||A surreal, dreamlike portrait of a brunette, with a mesmerizing, infinite zoom effect, where a circular section of her face is magnified, revealing the intricate texture of her skin, with tiny, industrious construction workers, no larger than a grain of rice, busily at work, filling the pores,||

cunning lintel
#

The effect is 10+, but for prompt understanding, well, i had something else in mind 😂

raven fern
low stone
cunning lintel
#

inset map, makes sense , obviously still too much off the beaten path

cunning lintel
#

In a dreamlike, 8K photo, a whimsical, furry, futuristic white cyber owl with purple streaks in its fur, sits in a mystical Valdivian forest, surrounded by bioluminescent foliage, with a tiny garden gnome stepping into a mushroom house. The aurora borealis swirls above, casting an ethereal glow on a secret arctic vault, where a cute, velvet-skinned goblin with whiskers poses.

#

A stunning, ultra-realistic portrait of a black Barbie doll stands in a secret arctic vault, surrounded by towering ice mountains, and a kaleidoscope of colors. In the background, a cute, velvet-skinned goblin with whiskers poses, surrounded by clockwork machinery and glowing orbs.

#

In a surreal, 8K photo, a futuristic, cyberpunk cityscape unfolds, with a vibrant, candy village, where a furry, futuristic white cyber dog rides a dragon-zebra chimera. A woman barbarian rides a majestic, agitated dragon-zebra chimera, through a dense, mystical forest of Schwarzwald.

#

A mesmerizing, 8K portrait of a brunette, with a mesmerizing, infinite zoom effect, reveals the intricate, labyrinthine texture of her skin, where tiny, industrious construction workers, no larger than a grain of rice, busily toil within the pores, building minute skyscrapers and suspension bridges. In the background, a surreal, fantasy cityscape unfolds, with towering ice mountains, and a vibrant, candy village, where a furry, futuristic white cyber dog with purple streaks in its fur, wearing a dogtag saying "soon", rides a dragon-zebra chimera through the streets.

#

A stunning, ultra-realistic portrait of a black Barbie doll, dressed in intricately detailed clothes and jewelry, stands in a secret arctic vault, surrounded by towering ice mountains, and a kaleidoscope of colors, swirling with abstract, Dalí-esque patterns. In the background, a cute, velvet-skinned goblin with whiskers poses, surrounded by clockwork machinery and glowing, iridescent orbs, as the aurora borealis swirls above.

#

In a surreal, 8K photo, a vibrant, fantasy confectionery wonderland unfolds, with a whimsical, furry, futuristic white cyber owl perched on a mushroom house, surrounded by biscuits, and a flowing chocolate river. A woman barbarian rides a majestic, agitated dragon-zebra chimera, through a dense, mystical forest.

dull star
#

chat?

#

@low stone

#

@faint breach

#

4.2B model

#

97s inference time tho

#

but still, 3GB vram requirement for 4.2B

faint breach
#

pretty neat. always great to see new models that can run on cheap hardware

#

especially since stability is going to release sd3 and be done with image models

dull star
#

we need to see how this affects bigger resolutions and accurately captioned datasets (prompt adherence)

#

currently its just 256x256

severe phoenix
# low stone That's exciting. Any idea about clip/language model side of things?

nope, they haven't revealed any technical things, not even images. which honestly i kinda like...they said it's about 30% done, so i'm guessing 2 more weeks or months who knows lol. i just wish these companies would just keep things quiet and drop things when it's actually finished. this whole waiting thing is beginning to get tedious and a bit annoying. you're not selling a movie or something geez.

severe phoenix
faint breach
#

That's entirely fine though. It's sad to see Stability stop in this field, but it's fine. There is enough research and resources available that others can more easily keep going.

lucid swift
#

i think their strength is in image models

faint breach
#

theres more money in LLMs

lucid swift
#

sure but the open models are so good and you compete with the gpus and engineers and money of meta

#

but for open image models there is almost no competition

faint breach
#

how? lol. Meta's business is advertising. They give away all their models for free. How do you compete with that as an exclusively model building company?

#

it's not a popularity contest. Stability is an actual business that needs revenue

hallow lion
raven fern
lucid swift
real terrace
#

*Congratulations you've been added to the Stable Diffusion 3 early preview waitlist!

You'll be notified by email with an invite to our Discord server when you've been granted access to the preview. *

nothing since that 😕

low stone
#

Jeez that is the most ridiculous prompt

low stone
raven fern
#

is there a way to try that in comfy?

low stone
raven fern
#

nice, i like to try new toys 🙂

cobalt moon
#

me who only have 2GB VRAM :

hallow lion
#

Maybe the cat with 4GB of Vram can help you instead.

jolly swan
#

Once again I have to remind that there is no Pony team, it's just me.

I would be happy to train on whatever (which has the right license) but it's a tricky question. Good data can make meh architecture shine but will still be inferior to good model with good data, hence I am still optimistic about SD3 option. If it does not happen for whatever reason and I need to find back up - honestly another XL version with improved data is probably fine.

woven dock
#

1.1gb is quite small for 4.2b in fp8
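A quick way to see why 1.1 GB looks small for 4.2B parameters: work backwards from file size to bits per parameter. This is a sketch that assumes the file holds nothing but the weights, and `bits_per_param` is a made-up helper name.

```python
def bits_per_param(file_size_gb, params_billions):
    """Average bits stored per parameter, assuming the file is only weights."""
    return file_size_gb * 8 / params_billions

# fp8 would mean 8 bits per parameter, i.e. about 4.2 GB for 4.2B params.
print(round(bits_per_param(4.2, 4.2), 2))
# A 1.1 GB file for the same parameter count works out to roughly 2 bits each.
print(round(bits_per_param(1.1, 4.2), 2))
```

So either the file is quantized well below fp8, or not all 4.2B parameters are stored in it.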

woven dock
#

Like what's the vram requirements

jolly swan
woven dock
#

Does the vram usage scale up as more images are dumped into the dataset?

jolly swan
#

Not the memory, you need model weights + (batch size * per image vram) so more VRAM means more images per iteration (but it also does not scale linearly)

#

Generally you just need enough to fit weights and 8+ batch

#

So 40GB is most likely enough for XL finetuning

#

I remember 3090/4090s not having enough to use Adam (which you want to use)
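The "weights + (batch size * per image vram)" rule above can be sketched as a small calculator. The fixed cost and per-image numbers below are placeholders picked for illustration, not measurements of SDXL, and `max_batch_size` is a hypothetical helper.

```python
def max_batch_size(total_vram_gb, fixed_gb, per_image_gb):
    """Largest batch that fits in VRAM.

    fixed_gb covers everything that does not grow with the batch
    (weights, gradients, optimizer state); per_image_gb is the
    activation memory each extra image adds.
    """
    free = total_vram_gb - fixed_gb
    if free < per_image_gb:
        return 0
    return int(free // per_image_gb)

FIXED_GB = 28.0      # placeholder: weights + gradients + Adam state
PER_IMAGE_GB = 1.5   # placeholder: activations per image

for card_gb in (24, 40, 80):
    print(card_gb, max_batch_size(card_gb, FIXED_GB, PER_IMAGE_GB))
```

With these placeholder numbers a 24 GB card cannot fit even one image once the optimizer state is counted, while 40 GB reaches a batch of 8. Dataset size never enters the formula, which matches the answer above.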

muted dove
violet escarp
#

and Stability really needs money

dusky thistle
jolly swan
vale furnace
#

gm

muted dove
#

You're still allowed to fine tune it 😃

#

Please 🙏

jolly swan
muted dove
#

Worth trying to raise funds from the community? I suppose it all hinges on the SD3 release anyway.

jolly swan
#

I would prefer to figure out functional economy rather than using some one time large events like kickstarter

lunar rivet
#

iirc redmondai sponsored a cascade finetune once upon a time, maybe they'd be up for another

woeful spindle
#

that's like $15k

#

would it be more expensive if you rented 9xA100 for a month?

#

or 270xA100 for a day🤔
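The two options above are the same amount of compute on paper, which a few lines of arithmetic make explicit. This sketch assumes perfect scaling across GPUs (which large jobs rarely get) and uses a hypothetical hourly rate, not a quoted price.

```python
def gpu_days(num_gpus, days):
    """Total compute in GPU-days."""
    return num_gpus * days

def rental_cost_usd(num_gpus, days, usd_per_gpu_hour):
    """Rental cost if billing is purely per GPU-hour."""
    return gpu_days(num_gpus, days) * 24 * usd_per_gpu_hour

RATE = 2.0  # hypothetical USD per A100-hour

for gpus, days in [(3, 90), (9, 30), (270, 1)]:
    print(gpus, days, gpu_days(gpus, days), rental_cost_usd(gpus, days, RATE))
```

All three rows come out to 270 GPU-days and the same bill, so under per-hour pricing the wider run only buys wall-clock time.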

jolly swan
sterile pendant
# woeful spindle would it be more expensive if you rented 9xA100 for a month?

Oh I misread this, I see that they said they have 3 a100s and it took 3 months.~~ Without accounting for power supply efficiency, A100s pull around 300w at max load plus some minor CPU and other peripheral usage. So let's round up and say 3000w for the 9, you now have to offset two space heater's worth of heat with HVAC for a whole month(~10k btu which is a large 120v window unit that would pretty much have to run nonstop).~~ Meh it's too early in the morning to pencil all this shit out, but these are some of the things to consider lol. Basically, ponyboi would have a massive power bill, but you'd have to do the math to see which would be cheaper. Him buying all the equipment+his power bill vs renting a server to do it in a much shorter amount of time. Plus, he makes multiple models with the hardware he invested in.
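For anyone who does want to pencil it out, here is a rough sketch of the power side using the 3000 W figure from the message. The electricity price is an assumed placeholder, and cooling overhead and power supply losses are ignored.

```python
WATTS_TO_BTU_PER_HOUR = 3.412  # 1 W of load is about 3.412 BTU/h of heat

def heat_btu_per_hour(watts):
    """Heat output that cooling has to remove."""
    return watts * WATTS_TO_BTU_PER_HOUR

def energy_kwh(watts, hours):
    """Energy used at a constant load."""
    return watts / 1000 * hours

def power_cost_usd(watts, hours, usd_per_kwh):
    return energy_kwh(watts, hours) * usd_per_kwh

RIG_WATTS = 3000            # 9 x ~300 W, rounded up as in the message
HOURS_IN_MONTH = 24 * 30
PRICE_PER_KWH = 0.15        # assumed price, varies a lot by region

print(round(heat_btu_per_hour(RIG_WATTS)))        # roughly 10k BTU/h
print(energy_kwh(RIG_WATTS, HOURS_IN_MONTH))      # kWh for one month
print(round(power_cost_usd(RIG_WATTS, HOURS_IN_MONTH, PRICE_PER_KWH), 2))
```

The heat figure lines up with the ~10k BTU estimate in the message. Comparing the resulting bill against a rental quote still needs the hardware purchase price added on the ownership side.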

#

Oh and also, a lot of data centers have rules against using them for nsfw stuff

rain current
#

sigma 2k - superhands....

cunning lintel
rain current
#

I think it wouldn't work, it will detect potatoes instead of hands

sullen moss
#

😂

#

Two weeks mf

low stone
#

@cunning lintel since you were posting complicated prompts yesterday. (Some sd3 pics, some not) / people traveling through a tube network over a futuristic city. Robot Santa. Man is friend with a robot. One eyed women. Doctors with Cthulhu heads. Scientists with space ship shipping businesses.

#

Another: large throne room where a scientist with Cthulhu head sits on his throne made of lightsabers. Frantic and panicking crowds beneath him as he decrees his next royal orders to begin the end.

peak kettle
low stone
noble coyote
#

SD3@ClipDrop

cunning lintel
low stone
#

@cunning lintel hunyuan at 2.40:1 aspect ratio

raven fern
#

wide

cunning lintel
#

i looked, looked again, but well, of course cthulhu has a third leg with all those tentacles!

low stone
#

even gods need sensible footwear

sick cedar
hallow lion
faint breach
#

non commercial research only license really kills a model. also shutting down the official channel for it does too

muted dove
#

Yep...Gits! 🤣

teal fossil
autumn arrow
#

How do I buy more artisan credits?

#

The site just bumps me back to discord

teal fossil
dull star
#

same with SD3, but with SD3 you can buy a license

#

I forgot if you can also do that to cosxl

teal fossil
sterile pendant
dusky thistle
#

i think it's bizarre that a company with a cash crisis has so many models without a commercial license

#

you'd think you'd at least leave the door open to a conversation

little quarry
#

Two more weeks after another 2 weeks

#

Rumors of the rumored SD3

rain palm
#

ah, reminds me of Nintendo's announcement announcement announcement.

#

so, which dataset is SD3 based on?

dull star
#

my friend's friend who was a stability employee 6.53 years ago heard from my uncle that he heard from emad that it will come out tomorrow (this is legit)

dull star
# rain palm so, which dataset is SD3 based on?

no idea, but I bet its partially laion, trained on 512px first, then on 1024px for 8B and later on the smaller models
the dataset was captioned 50/50 by CogVLM (detailed accurate prompts) and the raw captions

#

they truncated the prompt length to like 72 or whatever because of clip and I don't know if this is real (I heard from a random discord user who heard from a random discord user who heard from a twitter user who was claiming to be a stability employee but it was actually my uncle all along), but they might ditch clip and use T5 only and continue training with non-truncated prompts or whatever idk what was really told, I bet I'm wrong, it could just be clip being ditched and that already heavily improved the text adherence and the prompts were never truncated or something I don't know, it doesn't even matter how hard you try you will never know the truth cause we are never given it.

rain palm
#

sounds like a lot of room for error 😛

dull star
#

a lot of room for misinformation spread around as fact because stability tries their best not to inform us, so we make up random shit constantly and get proven wrong

#

I need to read the paper again

#

random screenshot go!!!! (this must mean something idk)

rain palm
#

i'm not an astrophysicist

dull star
#

its simple rocket science, what do you not understand thomas

#

T5 has 512 context length for sure

#

but I don't know if the cogvlm prompts were actually shortened to 77 tokens or not

#

and if they were, does it sabotage the prompt adherence and make the T5 context length less important as it was never trained on prompts longer than 77

twin tulip
dull star
#

that makes sense

#

so they were shortened

twin tulip
#

77x4096, 77-2 = 75 due to start and end token

#

75 is still a decent length at least

dull star
#

like I'm not gonna use up 512 anytime soon, but like idk ~200 would have been a little more useful or something (longclip has 248)

twin tulip
#

the issue is the vram required for cross attention goes up substantially as you increase either/or resolution and embedding size

#

there are several hacks out there that try to deal with it, like localized or sparse attention, or the chunking of the token blocks, they have drawbacks though
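
A rough way to see the scaling mentioned above, as a sketch only. It counts the entries in one attention score matrix (image tokens times text tokens) for one head in one layer. The 8x VAE downscale and 2x2 patch size are assumptions for illustration, and `attention_score_entries` is a made-up helper, not anything from a real library.

```python
def attention_score_entries(height, width, text_tokens, downscale=8, patch=2):
    """Entries in one image-to-text attention score matrix (one head, one layer).

    downscale and patch are assumed values for illustration only.
    """
    side = downscale * patch
    image_tokens = (height // side) * (width // side)
    return image_tokens * text_tokens

# Doubling the resolution quadruples the image tokens;
# a longer text context multiplies the matrix again.
for size in (512, 1024):
    for text_tokens in (77, 512):
        print(size, text_tokens, attention_score_entries(size, size, text_tokens))
```

This ignores activations, batch size, head count and memory-efficient attention kernels, so it only shows the trend, not real VRAM numbers.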

dull star
#

I see..

#

this is 310 CLIP tokens

#

and 360 gpt2 tokens (idk about T5 tokenizer)

twin tulip
#

chunking seems to be the most popular, which is likely what that is

dull star
#

chunking?
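
For anyone else wondering, chunking roughly means splitting a long prompt into windows of 75 tokens, wrapping each window in start and end tokens so it fits the 77-position CLIP context, encoding each window separately and concatenating the results. A minimal sketch of just the splitting step follows. The token ids 49406 and 49407 and the choice to pad with the end token are assumptions here, and `chunk_prompt` is a hypothetical helper, not the code any particular UI uses.

```python
BOS, EOS = 49406, 49407   # CLIP start/end token ids (assumed here)
CHUNK = 75                # prompt tokens per window: 77 minus start and end

def chunk_prompt(token_ids, chunk=CHUNK):
    """Split a long prompt into 77-position windows (illustrative helper)."""
    windows = []
    for start in range(0, len(token_ids), chunk):
        body = token_ids[start:start + chunk]
        padding = [EOS] * (chunk - len(body))   # pad the last, shorter window
        windows.append([BOS] + body + [EOS] + padding)
    return windows

# A 310-token prompt, like the one counted in the chat, with dummy ids 0..309.
windows = chunk_prompt(list(range(310)))
print(len(windows), [len(w) for w in windows])
```

The drawback mentioned earlier shows up here: tokens in different windows never attend to each other inside the text encoder, so context can get cut at window boundaries.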

teal fossil
# dull star also

Afaik T5 gets the short end of the stick bc they are also still using Clip G & L.

dull star
#

nothing changes the context length of T5 obviously, but idk how the shortened prompts in the dataset affect it

teal fossil
#

And they are testing (apparently) if they can use T5 only instead.

dull star
teal fossil
#

Btw T5XXL is not CogVLM, but good for captioning.

teal fossil
dull star
#

I would use tags for styling if it helps

#

otherwise I would just use T5 because of natural prompting

twin tulip
#

T5 has a very large embedding dimension, I imagine that's why it was used, there's more data there

#

4096 vs 768 or 1024 or 1280 or whatever of common clip models

#

but its not a VLM and wasn't trained on any cross entropy loss with a VIT or anything, its just an encoder/decoder model, like something you'd use for language translation

#

could've just as easily used Llama3 or something else

dull star
#

isn't that decoder only? or is that not an issue?

#

I saw lavi-bridge, which could use decoder only models such as llama2

twin tulip
#

I guess i'd have to noodle on the impact of using a decoder only network, but you can get the features from whatever model I suppose and use that

twin tulip
#

I think a lot of the vlms are just VIT tacked onto (often decoder only) llms with adapters

dreamy sundial
dry wave
#

in encoder architectures every token gets context information from any other token

#

in decoder only models you have a causal mask, so every token only gets information from the past tokens

#

so "a cat with blue fur" in clip or t5 would have the information about blue fur in the cat token

twin tulip
#

yeah makes sense

dry wave
#

in llama3 in contrast the token "cat" has no further information while the token "fur" contains this information

twin tulip
#

right due to causal mask
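
The cat and fur example above can be sketched in a few lines. This only models which positions a token is allowed to attend to, not real attention weights, and `visible_tokens` is a made-up helper for illustration.

```python
tokens = ["a", "cat", "with", "blue", "fur"]

def visible_tokens(tokens, causal):
    """For each token, list the tokens its attention is allowed to see."""
    seen = {}
    for i, tok in enumerate(tokens):
        # A causal mask stops at the current position; an encoder sees everything.
        limit = i + 1 if causal else len(tokens)
        seen[tok] = tokens[:limit]
    return seen

encoder_view = visible_tokens(tokens, causal=False)  # CLIP/T5-encoder style
decoder_view = visible_tokens(tokens, causal=True)   # decoder-only style

print(encoder_view["cat"])  # ['a', 'cat', 'with', 'blue', 'fur']
print(decoder_view["cat"])  # ['a', 'cat']
print(decoder_view["fur"])  # ['a', 'cat', 'with', 'blue', 'fur']
```

So with the causal mask, "cat" never sees "blue fur", while the last token sees the whole prompt.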

dreamy sundial
dry wave
#

I would imagine that this makes the cross attention more difficult because the last token contains all the information instead of having all tokens equally

dreamy sundial
dry wave
#

doesn't mean it wouldn't be possible with llama3, but I guess that's why they prefer encoder architectures like t5

twin tulip
#

yeah just as is would probably not be as efficient

#

I think the vlms are using full self attention on the image tokens prior to the attachment to the llm part

dull star
#

I wonder if in the future we'll get large parameter (lets say, 12B or larger) ternary diffusion transformer models

#

idk which companies would be willing to test it further

#

at smaller parameter sizes, it has a massive FID/quality penalty, but it starts to climb back up the larger the parameter size is, whilst retaining low VRAM requirements

#

only problem, the inference time is terrible at larger parameter sizes

#

but damn, the small checkpoint size and only 3GB vram required for a 4.2B model
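
A back-of-the-envelope check on the weight storage only, as a sketch. It assumes a ternary weight costs either 2 bits (simple packing) or log2(3) bits (ideal packing), and it ignores activations, text encoders and the VAE, so it will come out lower than the 3GB total quoted above. `weight_gigabytes` is a hypothetical helper.

```python
import math

def weight_gigabytes(params, bits_per_weight):
    """Storage for the weights alone, in GB (1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9

params = 4.2e9  # the 4.2B model size mentioned in the chat

fp16 = weight_gigabytes(params, 16)
ternary_2bit = weight_gigabytes(params, 2)
ternary_ideal = weight_gigabytes(params, math.log2(3))

print(f"fp16: {fp16:.2f} GB")
print(f"ternary, 2 bits per weight: {ternary_2bit:.2f} GB")
print(f"ternary, ideal packing: {ternary_ideal:.2f} GB")
```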

sick cedar
twin tulip
#

just uses the embeddings from the text model, its a different embedding space but in theory still has contextual meaning

raven fern
cinder walrus
#

What's best way to use SD3 on a phone? Can use via api with Comfy etc but when I'm on the go what's the best solution currently?

#

Ideally outside of discord and not the stability assistant because it's trash

low stone
white current
#

SD3 Open Source Weights, when ?

muted dove
verbal epoch
#

Guys SD3 release in May?

dusky thistle
finite hollow
#

🙂 we had this with 2.0 already 🙂 month was correct, just not the year 😉

abstract nymph
#

tbh would just be nice to see some communication eugh

cobalt moon
#

here is mostly for art sharing or super-technical discussion

low stone
#

hyper detailed, photorealistic, myriad witnesses, frantic crowds panicking, surreal aerated landscape, inter dimensional planetary robotic networking

honest cedar
#

Hi, what's the difference in quality between SD3 and SD3 Turbo?

turbid grotto
low stone
low stone
honest cedar
muted dove
low stone
# honest cedar thanks for this

Yeah I haven't tested it. I've only used the regular sd3 on the api. The main sd3 model is so fast via that I never think about speed and want the turbo. Obviously that could change locally.

hallow lion
#

Emad is the hero we didn't know we needed and we didn't ask for.

low stone
dull star
#

if SD3 comes out, they'll make a super finetuned version of SD3 or SD3 Turbo, like with Core (sdxl turbo) and it will make it a competitive choice

#

especially if they optimize it for stuff like tensorRT, it will decrease the price of the credits

fiery wharf
#

nice jokes you have there

dull star
#

its really funny

low stone
dull star
#

really?

#

I mean they don't give a seed option

hallow lion
#

why don't they just put a commercial license on all of this, IF you make money, and free for personal use... why not?

#

sdxl sd15

#

all of it

dull star
#

money

#

they wanna keep making models

hallow lion
#

ask musk for money

dull star
#

hmm

hallow lion
#

a few billion is nothing for him and this is up his alley

dull star
#

if you say so

hallow lion
#

he supports this sort of thing

dull star
#

that would be good

#

well musk invested in openai or whatever

#

SD3 being commercial now is the most logical, as they have opted out a lot of artists

#

so to me it feels less morally incorrect

#

but I would still not sell ai art tbh

fiery wharf
low stone
dull star
#

oh yeahhhh

#

I haven't been using the api for a long time 💀

hallow lion
#

Hello papa musk, it's me Emad from stability AI. We hear open AI joined the evil empire and backstabbed you but if you are still into freeing AI for the masses we are doing the same and we won't backstab you because we are righteous dudes. So we need money coz we're nice. We have a long track record of putting out free stuff and we are committed to the cause. Drop me ugh us a call and we talk.

#

Thats all it takes a tweet

#

whatshisface sad billionaire didn't think they'd buy Minecraft when he tweeted about being fed up with this world

#

but they did

hallow lion
#

he doesnt have to buy it, just help out or buy a share in the company... whatever rich people do

fiery wharf
honest cedar
hallow lion
#

they did amazing things

#

Emad is no hobo!

fiery wharf
hallow lion
#

CIVITAI, SAI, ComfyUI etc, incredible what this community did and it's for free. It's better than paid products! Could you imagine what people could do if money wasn't a hindrance. If everyone could pour ALL their time and effort into what matters to them and their calling.

#

mind boggling potential

muted dove
#

Earlier posts said how bad it was and not to touch it, so maybe it was improved after then 🤷🏻‍♂️

fiery wharf
#

give them more money and the model will improve

#

two weeks to improve after you pay

hallow lion
#

Whatever is going on with AI it's better than getting involved with anything crypto related... 🤮

#

But it is true based on my experience - NOTHING beats a good model.

#

No inpainting, no face detailing, no loras, no perturbed attention guidance.

fiery wharf
noble coyote
#

SD3@ClipDrop - prompt = Vibrant colours, Bold Brush Strokes, Strong Symbolic Imagery.
Deeply Personal, Reflective of Emotional and Physical Struggles.
Mexican Culture, Folklore, Surrealism.
Highly Emotional Depictions of Pain, Suffering, and the Human condition.
Symbolism of The Monkey and the Humming Bird, Symbols of Hope and Duality

#

This came from a question to ChatGPT4: extemporise the qualities of the art of Frida Kahlo.

hallow lion
noble coyote
#

SD3@ClipDrop - prompt = photorealistic assassin’s creed cybernetic male assassin in an ivory
electrical-rococo elaborate robe by nexro xiii, light and mysterious, in
superhero pose, light and bright, mysterious, magnificent and cybernetic royal, warrior like, light and mysterious immense details, HD, cinematic lighting, cinematic, epic, photoreal by Riccardo Federici, Frank Frazettaby Bill Sienkiewicz and donato giancola and anders zorn, cinematic, dramatic lighting, rembrandt light

hallow lion
#

and it doesn't work. sure, you get paid in crypto, ok cool, but as soon as you need to buy stuff in YOUR location, anything like a house, a car, whatever, your government and bank will find out because you have to convert your crypto to real currency and the tax man comes

noble coyote
#

SD3@ClipDrop prompt = Art Nouveau style, face by Anna Dittmann, snake eyes, snake young, large illuminati symbol in the boarder, Celtic knot with pine tree and pine cones, perfect eyes, A painting of a norwegian woman with flowers on her head, botanical art by Pierre-Joseph Redouté, vivid, blond hair, 1920s short dress, trending on deviantart, pop surrealism, detail

hallow lion
#

These are nice but I want to see realism.

#

And good hands.

desert garnet
#

those images are so cool i feel like burning my money rn

noble coyote
#

SD3@ClipDrop prompt = antique damaged portrait war poster, devil,portrait, by Albert Bierstadt, by Andy Warhol, by Annibale Carracci, by Caravaggio Michelangelo Merisi, by Takashi Murakami, Spray Paint, Halfrear Lighting, Soft Lighting, Linen, Posterization

#

SD3@ClipDrop prompt = a realistic beautiful autumn queen, headshot, close up, night time, autumnal mood, venice carnival, grand guignol, wavy hairstyle, white hair, character concept art, created by victo ngai henri rousseau vladimir kush coles philips elizabeth catlett arief putra john currin alenka sottler itzchak tarkay anita inverarity maxfield parrish peregrine heathcoate tamara de lempicka mads berg isaac maimon iwona lifsches non binary heart connection/detailed modern art style 8k

noble coyote
desert garnet
#

best eyes i have seen,these folks i tell you,HUGE and BEST hands,they are very very great like our country

calm dew
#

hello

low stone
#

oh... nevermind, their url is the same i guess, i just have to pass that model in json

abstract nymph
deep pebble
#

SD3@ClipDrop prompt = masterpiece,best quality,fine_art_parody,realistic,real,solo,multiple_girls,alternate hair length,wet hair,tears,tsurime,white colored eyelashes,looking at viewer,red eyes,narrowed eyes,large breasts,crop top,gothic_lolita,tabi,cross-laced_footwear,demon horns,half middle_finger,smoking,

low stone
cunning lintel
#

@honest cedar sd3 turbo is in another league, but not in a good way

#

each time same prompt 2x sd3, 2x sd3 turbo

#

maybe those were a bit unfair, one more this time more suited for sd3 turbo's looks, it's usable for this kinda prompt (cartoon illustration of a woman in a hat holding a gun, digital art, fantasy art, steampunk, redhead, weird west, portrait of lady mechanika, cowgirl )

cunning lintel
bitter hearth
#

I put epic battle in the prompt

wild remnant
cunning lintel
#

A powerful agent, her eyes aglow with an unholy power, stands atop a ruined, gothic spire, as a stormy, apocalyptic landscape unfolds behind her, in the styles of Michael Garmash, Guy Denning, and Olive Cotton
Neg: boring, tranquil, wrong, low quality, photo

#

A resourceful operative, her eyes in a determined gaze, infiltrates a secret society's masquerade ball, surrounded by masked figures and candelabras, in the styles of Michael Garmash, Guy Denning, and Olive Cotton.
Neg: boring, tranquil, wrong, low quality, photo

#

A haunting portrait of a weary agent, her face deathly pale, surrounded by ritualistic symbols and forbidden knowledge, as candles flicker with an otherworldly energy, in the styles of Michael Garmash, Guy Denning, and Olive Cotton.
Neg: boring, tranquil, wrong, low quality, photo

wild remnant
strange skiff
#

1

wild remnant
spice rain
#

A photorealistic portrait of a 20-year-old South Korean girl radiates beauty with her long, flowing black hair, mesmerizing brown eyes, and captivating smile. She stands at 166 cm tall, with fair skin and a slim, D-cup figure reminiscent of Blackpink's Lisa. Dressed in a white shirt and deep blue jeans, she exudes elegance and charm. The portrait should be a full-body shot, 8k HDR, with high detailed features and a natural, approachable expression, illuminated by soft, golden-hour sunlight.

low stone
dull star
#

I expect SD3 Turbo to look more stylistic and have less variety

shut kiln
#

sheep standing on hind legs whereing a gas mask looks up in the sky away from a cell phone

drifting oak
noble coyote
frigid saffron
#

Any update news for sd3 open ?

pine canopy
#

A striking, azure Lamborghini, sleek and aerodynamic, thunders down a sun-kissed coastal road, its engine roar blending with the crashing waves and salty sea breeze. Majestic seagulls soar overhead, adding a dynamic element to the scene's exhilarating motion.

sullen moss
frigid saffron
sullen moss
worthy hound
gusty trail
#

sd3 2weeks edition

cerulean orchid
#

A dimly lit alleyway in Mumbai, with shadows looming ominously in the background, setting the tone for the dark and gritty atmosphere of the film --ar 16:9

#

#🆕|sd3 A dimly lit alleyway in Mumbai, with shadows looming ominously in the background, setting the tone for the dark and gritty atmosphere of the film.

potent idol
frozen patrol
#

futuristic headphone advertising

delicate hollow
#

where can i generate images

wild remnant
hallow lion
#

Two weeks? Maybe one?

#

who cares, I don't even know if I can run this on my machine. I am scared. What if it's 25 gigs for the model and it takes 8 minutes to generate one pic

woeful spindle
#

I dont remember step count but it was about 30-45 iirc

woeful spindle
hallow lion
#

theres a cat here with 4 (send help) what is he gonna do?

sullen moss
#

5090 with 32 gig soon

primal summit
#

800 million model Can it produce good images?

river umbra
#

Do you know how to prompt image here

noble coyote
teal fossil
#

Btw guys with TagGui you can already pretty easily play around with captioning images with T5-Xxl and it's not bad.

prisma cypress
#

uhmmm does anyone know when the SD3 model will be available on hugging face

gusty trail
#

2 more weeks (until it's released)

woeful spindle
#

there's gonna be a time where it's exactly 2 weeks before SD3's release

#

probably

fiery dawn
#

Only then we will get SD3 checkpoints.

#

Until then it will be "two more weeks" gaslighting like Emad did a month ago.

woeful spindle
#

Hope stability wont go bankrupt by then

#

Someone's gotta pay that gpu rent

fiery dawn
woeful spindle
#

They have limited time until Google's imagen 3 and GPT-4o's image generation roll out

#

Those two have all the things SD3 promises

errant dust
#

If you look at the paper on SD3, they consider the biggest rival to be Ideogram

past flame
#

Plot twist: It'll be unstable

woeful spindle
#

They show some examples under "explorations of capabilities" title

#

New model has near-perfect text generation

#

those images in these examples are definitely cherrypicked

#

but it is still impressive

#

I don't think they call it Dall-E 4

acoustic kite
woeful spindle
#

I just generated that with gpt4o

dull star
#

not bad

woeful spindle
#

it's not readable

#

but not looking like the old models

dull star
#

how did you generate with it

woeful spindle
#

they've definitely rolled it out

sullen moss
#

Well, SORA can also generate images. Paradoxically, to advance DALL-E, they just need to remove the filters 😂

errant dust
woeful spindle
errant dust
#

For text accuracy it is miles behind Ideogram

woeful spindle
raven fern
woeful spindle
#

I don't know if it's just me but it looked too cartoon-ish

dull star
#

ideogram is still the goat imo

#

I hope finetuned SD3 can get close

errant dust
#

I am not here to wax poetic on ideogram on all things, since they all have their weaknesses and strengths. Ideogram as well, but if you want text, Ideogram is king.

#

As to looking cartoonish, the text, it is a matter of knowing how to engineer the prompt. This is still a factor today.

dull star
#

thankfully SD3 is good enough for Movie Titles

raven fern
#

lol

raven fern
errant dust
#

There are things for example where Midjourney can do things none of them can yet. But SD traditionally can compete soon enough once you get specialized Loras, so my comment is on the vanilla experience

#

SD3 overall is super exciting, don't get me wrong. I'm just underwhelmed for now at the cost. $19 for 300 images in a month? RLY? I can imagine spending the 9 bucks for a test run of 130 odd images, but never for a regular experience.

#

You can get 60 fast for free with Copilot (Dall-E 3) per day, and more if you can wait a bit, and same for Ideogram

dull star
#

they need to optimize it first for tensorRT and stuff, and you have to consider that they need the money
but yeah its very expensive for what they offer

errant dust
#

They can need the money and will have me nodding my head in sympathy, but that doesn't mean the end user/consumer is going to opt for paying more for less

cunning lintel
#

I guess SAI can still lean on the stable diffusion brand and its promise of (local) generation with superior tooling. The main reason I'm interested in SD3 is the promise of much better tooling combined with state of the art generation. It's why i'm now paying a little for playing with SD3, curiosity to see how well it performs. But if the tooling (control nets, inpainting (read on twitter there was no such thing as SD3 inpainting yet, ouch), style transfer, regional prompting, customizable guidance, weighted/mixed prompts) turns out to not be there, it'll be more and more waning interest in SD3 for me. If it's just texttoimage, might as well use something that seems more capable.
But I'm no pro or heavy user, whether I pay or not, is of no consequence to SAI, they should cater to heavy professional use, but I'm afraid for those the tooling story is much the same if all you need is texttoimage stock-footage en masse, there's plenty other options and at the current price point SD3 is not competitive at all.

lucid swift
dull star
#

and even then, we don't know which

#

openpose, canny and depth would be more than enough

lucid swift
#

"A first person view of a robot typewriting the following journal entries:

  1. yo, so like, i can see now?? caught the sunrise and it was insane, colors everywhere. kinda makes you wonder, like, what even is reality?

the text is large, legible and clear. the robot's hands type on the typewriter."

#

ideogram seems worse for this example

low stone
#

svd with an sd3 raw image

noble coyote
woeful spindle
noble coyote
cursive mist
#

I dreamt I was in the place of Fry from futurama, and when I asked Bender if SD3 was finally out, he replied “two more weeks”.

noble coyote
icy drift
#

Remember February when everyone was like, "Don't need Stable Cascade since we're getting SD3 in two weeks"... and then proceeded to waste half a year completely ignoring the model that can literally generate 4K images in 30 seconds because SD3 would be better.

icy drift
#

My desktop background is still made by SC, because I only update it when we get the weights for a new model. (And no, CosXL and Hunyuan DiT don't count at all for subtle and ethereal reasons.)

primal summit
icy drift
#

Maybe once SD3 releases and people realize how much RAM it costs to get large renders out of it, SC will suddenly look more interesting. I could just be wishing though...

primal summit
icy drift
primal summit
#

Looks like we'll be stuck with the huge SDXL model

icy drift
icy drift
primal summit
icy drift
# primal summit ?

What's the "huge SDXL model" we're stuck with? Do you just mean to say that SDXL is huge compared to 1.5? Or did someone do a mergekit / mixture-of-experts or something that I don't know about?

primal summit
#

SDXL has 6 billion parameters so it consumes a lot of resources

icy drift
primal summit
#

But imagine if SD3 was released in the form of 2 billion and it is as powerful as SDXL. Life will be easier

primal summit
#

There is an 8 billion model, but I think 90 percent or more will not be able to operate it or will find it not worth the effort.

primal summit
#

But the parameters are not everything, but also the structure of the model and the text encoder play the most important role

icy drift
#

Perplexity

#

Sounds about right.

#

What about the VAE though.

primal summit
icy drift
primal summit
icy drift
#

Perplexity fail oh well.

#

I'm sure the info is out there somewhere.

edgy kelp
#

My guess is that a lot of people will use the smallest SD3, just as most people are still using SD1.5 instead of XL

icy drift
#

A model the size of SD1.5 with the power of SDXL would be a huge step-up. Wouldn't be really useful until all the controlnets etc were trained though.

#

I bet that will only take 2-3 months.

#

After SD3 releases in 2 weeks.

edgy kelp
#

Devs said that 800M SD3 will be more powerful than base 1.5 (despite 1.5 having slightly more than 800M parameters), that's already good enough to play around with

icy drift
edgy kelp
#

Either way you'd need to wait for the controlnets, but having the smallest SD3 to work with should cost you less electricity and power anyway... which sounds convenient

hardy merlin
#

so any idea about sd3 release eta or something? I just joined the channel and am looking for some good news.

primal summit
icy drift
#

Two weeks from [insert current day here].

primal summit
dull star
#

same with 2B beating SDXL, despite it being smaller than SDXL (3.5B)

#

and 8B is just undertrained 😔

icy drift
#

You can actually just calculate the release date using this handy javascript function.

var SD3 = new Date();
SD3.setDate( SD3.getDate() + 14 ); // always two weeks from today
console.log( SD3.toISOString() );

low inlet
#

is it the 8b or the 3.5b ?

dull star
#

I think the API uses 8B

icy drift
#

I think I remember hearing it was an early train of the 8b model at one point. Without T5.

dull star
#

Well we can't train T5 though can we

low inlet
#

because i'm wondering if there is any chance that the one on the api is gonna get any better ?

dull star
#

isn't it frozen

low inlet
#

or are those its limits ?

#

because it can't render hands or legs correctly most of the time

icy drift
#

SD only gets amazing once the community fine-tunes it.

low inlet
#

true

icy drift
#

And then you need specialized models and controlnets to get production-quality usable stuff.

dull star
#

I know that finetunes will help

icy drift
dull star
#

I wish 8B Loras will be possible with 24GB, but I'm doubtful

icy drift
dull star
#

yes Loras are finetunes

#

like modular finetunes, you can use it with models

#

and its less VRAM intensive, etc

icy drift
dull star
#

yes

#

but even that will require a lot of VRAM if we would try 8B

low inlet
#

because 1.5, 1.6 and 2 use different encoders, and not a lot of people are using sd 2

#

it's sd 1.5 and sdxl

sterile pendant
#

Even still, if you're working with images in the 1024² range, then for an 8B model, it's probably going to take 32-48gb vram to train them with even just the clip encoders and no t5

dry wave
#

just because the model is 8b doesn't mean you have to train all 8b parameters

#

in sdxl you often achieve the same results when training only the cross attention layers as when training everything. Similarly, you don't have to train the down-layers of the unet in sdxl

#

usually our loras are several times too large for no reason

#

similarly, you can train only a part of the 8b model and you will be fine
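
To put rough numbers on that, here is a sketch that assumes fp16 weights (2 bytes each), fp16 gradients and about 8 bytes of Adam state per trainable parameter. Those byte counts are assumptions, activations are left out entirely, and `training_gigabytes` is a made-up helper, so treat the output as a trend rather than a prediction for SD3.

```python
def training_gigabytes(total_params, trainable_params,
                       weight_bytes=2, grad_bytes=2, optimizer_bytes=8):
    """Memory for weights plus per-trainable-parameter overhead, in GB.

    All byte counts are assumed values; activations are not included.
    """
    weights = total_params * weight_bytes
    overhead = trainable_params * (grad_bytes + optimizer_bytes)
    return (weights + overhead) / 1e9

total = 8e9  # the 8B model discussed above

full = training_gigabytes(total, trainable_params=total)
partial = training_gigabytes(total, trainable_params=0.01 * total)

print(f"train everything: {full:.1f} GB")
print(f"train 1% of the weights: {partial:.1f} GB")
```

Under these assumptions the frozen weights dominate once only a small slice is trained, which is why quantizing the frozen part and gradient checkpointing matter so much for a 24GB card.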

dull star
#

well I hope that will work for 24GB and less

dry wave
#

I would say that's just a question of proper gradient checkpointing. We just have to wait for someone to implement efficient training

finite hollow
#

do you guys use stable cascade at all ?

dull star
#

used it for a little while

#

its good, but doesn't match what I want

#

and the results are smooth

#

as long as you aren't looking for super photorealistic images, the model makes nice and clean images

finite hollow
#

can you show some of the nicer pictures you made with it?

dull star
#

uhhh

#

didn't really save any I think

muted dove
finite hollow
#

the white cloth girl portrait is nice

#

the rest has a strong elder-scrolls touch

dull star
#

yeah these look SUPER clean

finite hollow
sterile pendant
dull star
#

guess I'll have to train 0.0000000000001 ranks then

#

😔

sterile pendant
# dull star guess I'll have to train 0.0000000000001 ranks then

Well that's why I brought up the Dora thing. Also since sd3 doesn't use a unet, it could potentially take far more or far less vram to lora train per "billion model parameters" if that makes sense. It's likely also going to take some time for people to get the tooling up and running for it as well

#

I haven't looked into that aspect much yet, so I can't give you an educated guess on how demanding the training will or won't be. Just using sdxl training as a reference since I've trained dozens of loras for it

dull star
#

then again, how will the quantization go then

low stone
dry wave
#

neither for sdxl

#

even less for sd3

#

think of it as wanting to teach the model a new concept. The size of the concept doesn't necessarily scale with model size

#

in particular, if you only have a few megabytes of training images, training a gigabyte lora is a rather dumb idea anyways

low stone
sterile pendant
# dry wave you don't need dim 128

Oh I don't train that high ever, it's just some superstitious thing a lot of "guides" lead people into thinking they need to use, so most people use it anyways. I normally just do 16 or 32, but it depends on what you're training and how you're training.
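
For a feel of what the dim actually costs: a LoRA on one linear layer with a `d_out x d_in` weight adds two small matrices, `rank x d_in` and `d_out x rank`, so `rank * (d_in + d_out)` extra weights. A minimal sketch, where the 1024 layer width is a made-up number and `lora_params` is a hypothetical helper:

```python
def lora_params(d_in, d_out, rank):
    """Extra trainable weights LoRA adds to one d_out x d_in linear layer."""
    return rank * (d_in + d_out)

d = 1024  # hypothetical layer width, not taken from any real model
full_layer = d * d

for rank in (16, 32, 128):
    extra = lora_params(d, d, rank)
    print(f"dim {rank}: {extra} weights, {extra / full_layer:.1%} of the layer")
```

File size grows linearly with the dim, so dim 128 is 8x the size of dim 16 for the same set of layers.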

sterile pendant
# dry wave even less for sd3

We don't actually know yet, but if it's like training llms that the architecture is based around(dit), then it's still going to take some hefty resources to train correctly

#

Llms are a lot more forgiving than image based generation

little quarry
#

2 more weeks

rain palm
#

aight.

teal fossil
#

So did anyone test the latest API (or whatever)? How does it compare?

dull star
#

latest api?

#

did it change?

lucid swift
lucid swift
icy drift
lucid swift
lucid swift
#

these two are from the anime finetune but its still not done.

icy drift
#

Oh finetunes. I could use an anime finetune if there was a lineart controlnet.

finite hollow
lucid swift
icy drift
dreamy sundial
raven fern
# finite hollow

she is sitting in the middle of the car with a seat belt on, where is that seat belt connected to from the middle? LOL

finite hollow
#

its one of the new mercedes 🙂

raven fern
#

kek

finite hollow
low stone
hallow lion
finite hollow
remote holly
#

I am learning the word "soon" in any language since sd3 release day 1 :
in catalan : aviat

hallow lion
#

in two weeks is what in catalan?

#

What a weird situation we have here. Cascade has been out for 6 months. We know it's a huge improvement on SDXL. It works on current hardware... It makes huge images. It's also VERY fast... Yet here we are in two weeks... Waiting. Can you imagine what cascade would be in 6 months if it was embraced at least half as much as sdxl...

#

What are we waiting for anyway, more than half of us won't even be able to run this thing locally.

#

So sad.

#

People always want what they can't have and ignore what they have.

#

So of course they even closed the channel, nobody cared. this is all our fault

#

So lets wait then forever

#

for nothing

cunning lintel
#

cascade has one limitation similar to SDXL: prompt understanding. I think it's mostly the promise of a model that will soon be available and that improves on that aspect that invalidates further work on cascade (and sdxl), the difference being that a lot of the effort/research on sdxl started earlier and is only recently being published.

hallow lion
#

It was very fast

#

faster waaaaay faster than sigma

dull star
#

I remember cascade having very average speed

hallow lion
#

and same or better results

dull star
#

than sigma? maybe

#

better results yes

#

just not smarter

hallow lion
#

yes

#

so

dull star
#

speed is questionable

cunning lintel
#

But just as we now still see sd1.5 research published, i'm sure sdxl will be there to stay for a long while

hallow lion
#

where are the controlnets? and why does no one refine it

#

lol

#

where are the cascade loras

cunning lintel
#

cascade is great for what it is

dull star
#

if T5 seems to be too good to not use, and it will be a hassle to load/use, SDXL is staying

hallow lion
#

Yes it is great hence its amazing how underrated it is

#

sdxl is good too yes

#

but cascade has greater potential

cinder junco
#

I didn’t mess with it much, but got frustrated with cascade. It LOVES flat backgrounds. Like, it lacks creativity. If you ask for a subject, it won’t build a scene around it, just slap it on a blue or gray background.

remote holly
remote holly
lucid swift
remote holly
abstract nymph
#

heyyyy uh, any news?

lucid swift
remote holly
#

I love the second

low stone
#

Ultra-realistic 8K image of will smith in an exploded view. The components should be meticulously detailed and appear to float against a black background, highlighting their complexity and precision craftsmanship, hyper-realistic detail

cunning lintel
#

oh i like, let's put will smith in an earlier prompt i borrowed

ultra-detailed photo of a shattered sculpture made of rose quartz depicting will smith, full body enlarged, ((pink glitter explosion)), side view, motion effects, ((shattering sculpture)), colored crystal particles floating as the sculpture breaks into many tiny pieces, studio lights, ultra sharp focus, high speed photo, Mschiffer art, soft colors,

#

giraffe confident expression, pixar style, expression