#🆕|sd3
can confirm lol
this is beautiful.
this feels very sd1.5 base
A better question - does SAI have any PR people left.
true
when its not humans, the model does great
Anyone getting good images without people in them at least? Since it seems to be useless at that.
SAI is dead.
Also, we wouldn't be in this situation if you guys weren't acting like you have a superiority complex against anything that isn't the base model, and acting like it's other people's fault
every image i have is a masterpiece.
Cute 🤗
that isn't a human lol
yep way better, thanks
that's out of date info. The current situation is people wanting to pay for the license but not getting a response after contacting stability ai
thanks 😄
nice yoga
almost
Pony isn't a base model, pony is a finetune of the current SDXL, and they were capable of monetizing it. they won't be capable of doing that with SD3 if they can't get a deal. And this isn't just exclusive to Pony but many other finetunes.
The person suggesting to them not to have public business dealings directly with a smut model creator might be part of their PR department
what's the prompt you're using for those?
awkward moment when pony steps on pwsy
it's not really a prompt, it's a complex old workflow from sdxl
he's grimacing in pain because he yoga'd his body into an amorphous pretzel heh
hands still have a lot of issues
So they should get a deal, SAI needs to be able to monetize as well.
just dont monetize it?
have you tried different cfg ?
noooooooo
remember, its not the model, its the prompt 😄
That's what Astral is trying to do.
Lmao
You are quite literally running in circles through the same 5 points lol
Weird it does fine on the api
Hi all
That one might be too much 
thats hopeful atleast
How can I download and install SD 3
Believe me, I'm aware.
So best of luck to astral?
💀 bro do you think it is cheap to finetune? Do you think the good finetunes of sdxl like juggernaut and pony are just charity?

I can see tes fighting from miles away lol
The API isn't using a basic default workflow with bad settings either
so what WENT wrong with 2B? dataset censorship ig
Do you use the full version of SD Medium or only with Clip?
I think something is not right with the current comfy workflow or they uploaded the wrong files. SDXL actually did good out of the box
hey folks I wasnt around for the sd 1.5 or XL release so im curious with your impressions of this one vs those previous ones on release
So it's a matter of waiting, even more hilarious
Full version > Clip
Maybe it should have been trained on 1 instead of 3 then?
API, I just said, don't know what they are using there 
Bro think if its expensive just print money
too early to tell for me
i agree man SDXL medium is a bit underwhelming at the moment, doesn't feel next gen
Why is SD3 so horrible? What went wrong, and how can they release something like this?
its not exactly fast
do we know that this is the reason why? We don't know the status of their licensing team right now
i think maybe we're so used to SDXL + loras that we forgot how just a bare base model is compared?
dont worry this one just needs a few more months of training before its ready for local release! the girl isnt safe-shaped enough. hand me a scalpel...
finetuners will save sd3 in a month ig
idk but i think realism engine is better than both of those and that one isnt paid 😅
I'm pretty sure we all expected the base model to blow after all the "safety" bs that SAI talked about
no way it's that bad

this one includes t5 right? Can Stable Swarm use it natively?
sdxl medium?
just like how it saved 2.0? nah
If there's even motivation to fine tune because of the licensing
So far SD3 is very nice, but @lavish osprey you guys are gonna need to remove the moronic 6000-image limit from the creators license. Most likely the dumbest thing I've ever seen in my life. 😂
SD3 will fade away as fast as SC with that in place, it makes no sense and isn't enforceable.
Telling people how much of their own compute they can use, come on now 😂
Censorship.
any upscale tests from anyone?
at least leaving it open that its another issue gives them an off ramp for the issue
they're only able to because of the release in the first place. thanks obama.
Dudes and dudettes, you need to read the README and use the right model and text encoder.
If you're only using "withclip", the model's performance will degrade a lot.
Also, you need to prompt very differently. The model is very good!
its not working when I tried sd3_medium.safetensors with the tripleCLIPloader. I put the 3 text encoders in my clip directory
sorry i meant to say sd3 medium, getting mixed up lol
That's what happens when you censor a model thanks to MFs who say they'll sue you
Most finetunes aren't paid, although I do support finetuners getting paid.
People are newbs they just want to be mad about sucking. and blame someone else for it
People don't know how to prompt for it yet, so they think it's bad, it happens everytime there's a new model
any upscale workflows already working?
only if...
get em Lykon, do the "for the 101st time" thing
if you use the sd3_medium_incl_clip model @pesky then you can ignore the triple loader and take the clip directly from the loaded model
I wanted to do the "It's alive!" scene but I still need to think about the prompt
Only if what? SAI had better licensing?
to be fair I am trying to use some of the pony prompts with the score_ tags so that's probably messing it up too
Waddup Axel. I hope your new movie isn't as bad as 3 was
Show us the true girl lying on grass prompt 
To be clear if you missed earlier conversation: it's a limit of 6k commercial images. So eg a generation-as-a-service company will hit the limit, but if you're generally for personal usage there's no limit, and if you're generating images while testing your commercial finetune or whatever, those images aren't commercial so the limit doesn't apply
@barren spindle if you use the sd3_medium_incl_clip model @pesky then you can ignore the triple loader and take the clip directly from the loaded model
I really want to believe this
There's an internal discussion atm about making this more clear
Yes but then the model will perform much worse! Just FYI.
I know, but wasnt someone saying you get better results with the triple loader?
im not an expert at this but id say yes
Show me a woman lying in grass do it!
OMG guys look!
||cascade result||
@viral plaza you gonna need to make that a lot more clear for sure, i had a feeling it was something like that
Tell me, why am I getting different results when I use the separate CLIP files as opposed to the embedded ones? Is that normal? Just some weird RNG effect (yes, the seed is fixed and same)?
at least it knows Batman
why is it worse @ornate oar and where do i get those other files from then ?
mutilated
why is the 6k limit only for SD3? it seems like an afterthought
at least it can do banks!
it seems the model that includes t5xxl (without triple loader) as well as trying to load t5xxl with the tripleCLIPloader breaks it for me
Look at that lettering, thats impressive
So if someone makes a finetune and shares it with people across the world, will that commercial license apply to all of them or just the finetuner? For example, if one of the users gets past 6k commercially used images, what happens?
better than any base model we've had before
because it's new and generally they don't want to add new rules for old models
SD 1.5 bros are out on suicide watch
I have about 120k$ of equipment around. Pony is expensive. I would like to burn only most of my money and not all.
Talk about being salty
6k commercial images/month is a handful of power users. None of the other limits on the Creators license are relevant in light of the 6k limit. If the goal of the Creators license is really "less than 1 million active users" https://stability.ai/license, you probably want an inference limit of about 1000 images/user a month (there'll be both left and right tails of the curve), so something like 1,000 images * 999,999 users = 999,999,000 images/month seems much more reasonable. Round it to a billion images a month.
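For anyone who wants to check that arithmetic, here is a quick sketch. The 1,000 images/user figure is the suggestion from the message above, not anything taken from the license text, and the function name is made up for illustration.

```python
def monthly_image_budget(max_users: int, images_per_user: int) -> int:
    """Total images/month if every user generates images_per_user images."""
    return max_users * images_per_user


# "less than 1 million active users" -> at most 999,999 users (from the message above)
users = 999_999
# 1,000 images per user per month is the poster's suggested figure (an assumption)
budget = monthly_image_budget(users, 1_000)
print(f"{budget:,} images/month")

# How many times larger that is than the current 6k/month cap
print(budget // 6_000, "x the 6k limit")
```

The point is only the order of magnitude: roughly a billion images a month versus six thousand.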
Lazy cat 
First thing I noticed is lower CFG
ask cogvlm to describe a similar image and then prompt with what it says
2 CFG seems to be best
Small noodle soup
/Be rich dad. Give daughter a corvette for her birthday. she wanted a pink corvette and he gave it to her without her bf watching and now she's on the ground throwing a tantrum.
vibes
I just moved them around a bit to condense my view
Once I get past HuggingFace Authentication (I can't!) I'll join all y'all 🙂
Each individual or business has a separate count for commercial usage.
If a model is commercial, and you use it, that's not commercial usage necessarily - that's just personal usage of a commercial model.
You're only commercially using images if you're selling the generation as a service, or including them in a professional marketing campaign, or things like that
oh, it's 6k images created A MONTH, and i think those only count for commercially used images. so that excludes like 99.99% of people
Have any devs addressed the criticism yet? Are people just trolling or is the model actually like that?
almost with 2 cfg
Having the same issue
ye
nvm
Second thing I noticed is more steps are needed, about 50 steps seems good
no one has answers here
6000 images a month per user?
truly next gen
There must be something wrong
people figured out that "woman lying in the grass" makes a mess and are generating 100s of copies of it
I make 6k images in a day
smh
like, that's easily in a day
the model's not perfect, esp. at complex human poses. If you want perfect humans with complex poses wait for finetunes
woman in any pose not standing makes a mess
Did anyone reproduce these prompts?
not just lying in the grass bro
as long as you dont distribute them commercially you should be good
So I just installed swarm, and put in the SD3 model. Any tips on what settings to use? (steps/cfg/sampler...)
but ye, it's not a very useful commercial license imo
if you let stableswarm set up your workflow it just works https://old.reddit.com/r/StableDiffusion/comments/1de65iz/how_to_run_sd3medium_locally_right_now/
Devs don't need to address most of the criticism with the way it's being delivered. Once somebody writes a mature and honest report about their findings, then that's worth responding to. Knee-jerk comments like "The model is shit!" don't need dev responses
they have no control over the images themselves, so I thought it was the act of using the model commercially to make those images

I guess it would exclusively be for personal use like doing commissions or something
xDD
steps 28 cfg 5 sampler euler scheduler normal, as per https://new.reddit.com/r/StableDiffusion/comments/1de65iz/how_to_run_sd3medium_locally_right_now/
butt naked real?
can someone use this prompt?
A man with a mustache and a monocle, upside down view
Why is it worse?
Because SD3 has a new text encoder called T5, but also the old XL encoders, CLIP-L and CLIP-G.
If you don't use T5, you'll get worse results.
All the files are in the repo.
you can identify something like a butt in there ?
So... Cute...
makes sense. sdxl was the same way
guys u using negative prompt?
people complain when we hold the model back to train more, people complain when we release the first version of it, there's no winning people will always complain
Dude everyone is using T5 here...
sit pose works decent too ngl
but it doesnt have the curves does it count
I just reacted to comments saying "just use withclips, then you don't need triplecliploader". 
who invited man
SD3 can't actually do cats.
a cat laying in the grass on its back laughing
you can see it was trained on terrible stock photos, it has that studio lighting even on exterior pics
Nobody will even utilize SD3 8B. It'll take more than 24GB of VRAM easily.
People also don't understand that a base model isn't like a new version of dreamshaper or protovision or whatever. It's an entirely new base architecture with entirely new base latents
performing has the association
The model is very capable, but just like every other model released, people want it to be perfect every time for every prompt right out of the gate.
That's not reality.
want me to compare the prompt to SD 1.5
Bro, try to train a person on shit like this.
Thats a problem for finetuning too
there's always MJ they can move to, complaining for something free that cost the creators millions to make is kinda crazy
And it's much better than XL. SD3 finetunes will be banging. 🙂
should we be using normal scheduler? if i use exponential or like euler_ancestral it starts looking like silicon
worst case scenario it will take months for a good finetune
I think it'll be fine. SDXL base is the same with complex poses, but people train fine on it
For what point? I could easily find seeds with 1.5 that would have similar issues rendering the prompt.
it seems so...
Is it free to use here?
yup. but that's beside the point.
any way to know the dataset ?
without tes it's 16gb
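That 16gb figure lines up with a simple weights-only estimate: parameter count times bytes per parameter. A minimal sketch, assuming half-precision weights and ignoring activations, the VAE and the text encoders (so real VRAM use will be higher); the helper name is hypothetical.

```python
# Bytes needed to store one parameter at each precision
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "int8": 1}


def weight_gb(num_params: float, dtype: str = "fp16") -> float:
    """Rough size of the model weights alone, in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


print(weight_gb(8e9, "fp16"))  # 8B model in fp16 -> 16.0
print(weight_gb(8e9, "fp32"))  # same model in fp32 -> 32.0
print(weight_gb(2e9, "fp16"))  # 2B (Medium) in fp16 -> 4.0
```

By the same estimate, quantizing to 8-bit would halve the fp16 numbers.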
anything wrong with this? its throwing error
why not Dreamshaper?
The finetuners will be fine. Astralite is being weird about not wanting to pay anything ever
Or Dalle 3 via bing chat or coze
I get horrible results without T5.
You don't need tes?
apple silicon people pulling up like
dreamshaper author already finetuned sd3 2b
i'm not sure but if i were to guess id say the file didn't download properly, got corrupted, interrupted, etc
Morons did the same thing when sdxl came out, only to advocate in favour of sdxl when sd3 came out. Rinse and repeat. They never learn.
Yes I know
Is it true that stability is not allowing finetuners commercial licenses if they have nsfw in their finetunes?
no idea, all i know is that big finetunes are in a dilemma with the commercial license
oh wow its horrible
Guess I'll need to wait a bit for custom models, not quite impressed with it so far. But I understand it is a base model for now
fuck licenses
Because the fine tunes saved SDXL
It's new, people just like to complain.
already redownloaded them a second time already
mmmmmm I don't think so
why shouldn't stability earn money?
could you please show comparisons or pics, I actually forgot to test this lmao
Dangerous hunter
people taking credit for things they have nothing to do with
isn't it making profit off of it? i saw that astralite would gladly buy the enterprise sub
we have no idea, it could be due to the layoffs making it difficult for the licensing team to do their work or other internal company issues
try running the update script and see if you get any errors
base model is the core of all finetunes. Stability is responsible for all of them being made.
Sure. Give me a second.
Lmao, how can you call a model "revolutionary" when it is incapable of even creating a normal image of a person? How can you slaughter your training dataset to the point where people become eldritch horrors, go online and say "Oh we created a revolutionary model, its so goooooood" and then release a complete mess?
its even worse than sd2
any particular reason we've gone back to euler as the standard sampler, after DPM++ 2M was preferred for SDXL?
so how did Lykon get such good images?
they should have private partnerships to make it more profitable for them and release some models to the public, rather than disappearing like they did
A base takes millions of dollars to train and they give it away. License conditions for commercial use aren't a bad thing. The entitlement in this room is palpable
my dude... all fine tunes are made from the base models that stability makes
you do.
He does want to pay though? I thought that we'd already been over this. The ball is in Stability's court right now
trying redownload again, then I'll try that
i get either:
Lykon, whats going on here xDDD
yeah i know and getting sued doesn't sound like much fun
@viral plaza what is the stance regarding, let's say, an artist using sd3? Would an image they post on a service like patreon or whatnot count toward the 6000? Or are we literally talking about, i guess, an automated process pumping out images as a service to users?
i mean it's a decent model
SD3 is arguably better than 1.5
And fine tunes will save sd3 as well, as they did 1.5, which was hilariously bad at release
well maybe you can start a company with your own capital and do business that way.
Charging for licenses is their business model, and that's fine
i agree
how much does that pony finetuner even make to not count in the creator license? 🤣
you could try updating while it's redownloading
Haters gonna hate
the quality is lower than sdxl and 2.1 vanilla
That's if people will be motivated to fine tune off of it considering the licensing
Kek
He likely is making far more than the license would cost him
Hate? Brother, just go and generate an image of a woman sitting on grass
I can't even recreate the images from the paper, can someone try these: "translucent pig, inside is a smaller pig."
"A crab made of cheese on a plate"
"A swamp ogre with a pearl earring by Johannes Vermeer"
the issue is with the 6k image a month limit, not the profits
bro that image LMAO
If I want to upscale do I need to change the tile width and height also?
yea
my settings are messed up
SD3 is decent at generating stylistic or anything that does not contain women
paper was 8b
at this point, you could even say that SD3 is freaking sexist as hell
it's a shame we can't have a stake in it or something, now we indeed have to launch a new company
8b has far more knowledge, trained for more time on more data
I wonder if sd3 will kill sd1.5. if sd3 gets further optimized and made to run easily on 4GB VRAM i think it could kill sd1.5, because most users (sd1.5 users) only have 4GB VRAM
gotta decide which one is funnier tho
if I downloaded the t5 model, how do I use the t5 stuff, or is it automatically done?
how did you get this
ALL the images generated with his model do not count towards his own 6k
"SD3 CAME OUT!! WHAT ARE YOU GUYS MAKING?!"
".....hundreds of images of women in grass"
It's been out for 2 hours lol. Communication will be sorted, they want to make money too
I see no problem with the people spending a ton of money to make these base models setting reasonable terms for their commercial use
it's extremely good at following the prompt but fucks up legs, arms, hands, height, eyes, distant faces, light and patterns
How to create?
you can probably run quantized versions of sd3 on much lower vram, and use only 1 te
Everyone is just going around in circles here. Something gets debunked and then 10 minutes later a new batch of people are back to the same arguments
We were told the licensing would be dealt with before the launch.
You still use it with sd3, just make sure to pair it with sgm_uniform
that kind of looks like a crab made of cheese and crab
medium with text is running 30gb on my machine
Damn, so 2B was not all you need after all...
wait is the current sd3 fp32?
"image of a girl sitting on the grass laughing, the image is comfy and there is a subtle light coming from the trees around her" 2cfg 50 steps
soon ™️
can't complain, it's 5 fingers
All these issues can be fixed by finetuners. The real issue is whether or not stability is denying finetuners enterprise licenses due to nsfw.
bout 2 hours
It's again the argument "SD3 8b is far better than 2b", yea so WHY do you release the garbage 2b version xD
is the 2b model the same one being used in the artisan? I couldnt even recreate those
I think an image posted directly to patreon would count yes (but, yknow, if you individually post 6000 images to patreon you're insane)
I don't think the license specifies its 6k per individual. Its 6k for the entire operation based on the wording
looks like a torch version issue??
update python dependencies with the bat file
Yep we need community models, the potato and the french fries supposed to be "fighting"
i mean not all companies are willing to share their models, that's my main concern. i am fortunate enough to have the resources to somewhat train a model, but i know for sure i won't publish the latest version open source
how do i do that @onyx sphinx ?

Does T5 automatically happen in any prompt I write, or do I need to use a specific node?
8b needs more time to safe-ify the anatomy. please understand
"a big chair"
lol i played with that one yesterday
nice hands bb
if he's selling image gen as a service, yes it's 6k for the total service.
the update_comfyui_and_python_dependencies file in the update folder
a crab made of cheese?
if you use the TripleCLIPLoader node then it's in
if a finetune is released, anyone who uses it / buys it / whatever does not affect the 6k
that interpretation makes absolutely no sense whatsoever in the context of how these loras and checkpoints are distributed.
I am ready to pay anyone for a custom checkpoint on sd3 that is solely based on realism, hit my dms
I just downloaded the 10gb model
I need someone to get their ass on quantizing sd3 (and to make a1111 webui support sd3)
the images from the paper were on SD3-8B, not the current 2B release
5 legs
the 8B is better at complex understanding
@viral plaza for sure. Im just more concerned for the future in some sense. What would happen if lets say AD trained some stuff for SD3, and we ended up having people creating single animations with like 1000'frames' ? etc
Are you using the t5 model? I couldn't recreate it without it
will 8B come out in august or?
Looking at these images sd3 seems to be. Ummm. Lacking in areas. Lol. Hopefully fine tunes will fix it? If people decide to finetune it.
still same error after updating and redownload
there's no smaller pig inside but that's pretty decent
yes, it hasn't gotten the h.p lovecraft treatment yet
I don't know exactly how video would play into it. I would imagine the appropriate intent-interpretation is it counts once per video rather than once per frame
is there a release date for the 8b ?
don't have a planned date yet
how much more ram is it gonna take ?
First one is T5. Second image uses CLIP only.
Make sure the files actually exist.
so when is sd3 horse molester edition?
im sure with better prompting?
Hopefully the threadmills are included when 8B is released.
well the current one is a bit easier to run than sdxl
how large are your text encoder files?
you can already test it on apis
Now make him lay down in the grass
That ? looks suspicious. Are they symlinked to the wrong target?
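A quick way to run the checks suggested above (do the files exist, how large are they, is a symlink pointing nowhere) is a small script like this. It's only a sketch: the three filenames and the `ComfyUI/models/clip` path are assumptions about a typical setup, so swap in whatever your install actually uses.

```python
from pathlib import Path

# Assumed text encoder filenames; adjust to match the files you downloaded
EXPECTED = ["clip_g.safetensors", "clip_l.safetensors", "t5xxl_fp16.safetensors"]


def check_encoders(clip_dir, names=EXPECTED):
    """Return a {filename: status} report for each expected text encoder file."""
    report = {}
    for name in names:
        path = Path(clip_dir) / name
        if path.is_symlink() and not path.exists():
            # exists() follows the link, so this is a link with a missing target
            report[name] = "broken symlink"
        elif not path.is_file():
            report[name] = "missing"
        else:
            report[name] = f"ok, {path.stat().st_size / 1e9:.2f} GB"
    return report


if __name__ == "__main__":
    # Hypothetical location of the clip directory; change to your own
    for name, status in check_encoders("ComfyUI/models/clip").items():
        print(f"{name}: {status}")
```

A file that reports a size far smaller than what the download page lists is a likely sign of an interrupted download.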
@viral plaza Ok makes sense, sweet. But ya, it would prob be good idea to expand a bit on that 6000images part of the agreement. Just so it doesnt like, scare off lets say AD creators from giving SD3 a go in the first place and what not
it looks like SD3 can gen higher than 1024x1024 without weird duplicates though
Album art material right there.
with 2 pass it still take 12gb, it's alright but a tiny bit more than SDXL for me
What's the number of the new SD model?
It is less restrictive in terms of resolution!
copied and pasted them from downloads into the directory
not without weird noises around the edge
Understood, I was expecting even the 2B to outperform sdxl though, with new captioning and fancy new architecture. Even a little bit of prompt adherence would have been good.
Ehm yes, cause it creates weird duplicates even on 512x512 ???
#1237459938901491852 a person
cascade could do different resolutions better tbh
heunpp there
you mean it's dumber? it doesn't look so worse
at extreme ranges sure, but 1536x1536 or 1152x1536 work great
using custom fractal initial starting latent
some prompts it just refuses to follow
it definitely does lol.
I actually cherrypicked the second one (2 tries). The first picture is more detailed.
damn ok cool then, at least one piece of good news ^^
do you happen to know the base training resolution ?
use the text encoders separately and use comfy ui. I am on 6GB vram and can generate an image in 32 seconds on a 3060
trash model
i guess it ignored the white part because samurais aren't usually white
it does outperform SDXL on most things (not all). Especially on general image quality and complex scene understanding SD3-Medium is way better than XL. When you get into specifics like human anatomy... yeah wait for SD3 finetunes
saying this cause some ratios seem to have weird artifacts
model respects proper race
im so dissappointed it looks so bad
I'm really trying to stay composed and not be too rough, I'm glad we are getting new A.I. But why are your mods and employees always saying contradicting things? Like for example why do they say that 8b is way better than 2b, and yet you release 2b as revolutionary? That does not make sense at all
(you can get awesome looking humans out of SD3-Medium if you don't ask it to perfectly pose the entire body, eg a portrait shot usually comes out great)
thats uh... racist
I think there is also a "learn to use it" factor.
here's a native 1536x1536 gen
From what I noticed SD3 has very precise prompt following, this is good and bad, because you need to specify what you want in lots of details.
Example: "abstract background" in SDXL would result in pretty backgrounds.
With SD3, it's literally random noise until you specify more.
I'd gladly take score_9, score_8_up, score_7_up, score_6_up, score_5_up, score_4 , which was obviously an expensive mistake since they don't have a hundred million dollars of investments they aren't planning on paying back to re-do it, over this:
for 2b? 1024x
anyone manage to run SD3 with only clip-l or clip-g yet?
i have no comfyui knowledge so i won't bother
it is different for the 8b ?
Seems more like people intentionally misunderstanding things than contradicting messages
Does (stuff:69999) work?
"a blonde woman getting a massage, oiled skin, fuji 400h, shallow depth of field, chiaroscuro, moody, gorgeous"
no way they really thought this was prepared for release
If someone wanna use SD3 Medium in hf spaces, https://huggingface.co/spaces/Nick088/Stable-Diffusion-3-Medium
Anybody scored bingo yet #🆕|sd3 message ?
The weight
we didn't spend "a hundred million dollars of investments" on 2b by the way lol
pretty sure it's the same but just more pictures
nice cat
pretty much anything other than standing or sitting tbh 😔 some synthetic data may have helped here
no
I am not able to get such results, even compared to sdxl 1.0. Can you double check if you uploaded the right checkpoints lol or if the comfy workflow on github is correct?
apparently not
this clip loader error is going to drive me crazy
The 2B we're releasing is better quality than the 8B we have, and the existing SDv1/SDv2/SDXL models.
8B once it's fully trained will be the awesome SOTA lead model, but it's still in training.
SD3 in general is a revolutionary awesome new architecture
here's the thing, i think you might have it configured wrong because i've seen other people's generations with sd3 where it looks just fine
ok thanks for the info
how can I create here?
In today's video, we go over exactly how to download and use Stable Diffusion 3 (SD3) Medium locally without relying on the API. This method is completely free and lets you use the newest model on launch day.
https://huggingface.co/stabilityai/stable-diffusion-3-medium
Happy creating!
but wasnt that the advertised improvements? "realism in hands and faces, is achieved through innovations such as the 16-channel VAE" or is that only foreshadowing the finetunes
fair enough, it was a gross exaggeration (8
ban for spam?
Things are popping off!
whats the point of score_9 on every image prompt though? that's good for a refine for people that want one specific macro aesthetic, not a base model
just G or just L technically works but is silly to do vs G+L
cant make unsafe content if you cant make content
why does it look so bad
git gud?
skill issue
stop lying
In the long run, I expect "LLM rewrites the prompt by default, but also provide the option to send a raw prompt" to end up the standard for AI prompts
it's what udio has
@pseudo isle square
Nice, welcome back
nah it's to essentially filter out the low quality art but there was a bug in how it was trained iirc
but it also lets you still use it to generate a huge range of styles without censoring out that low quality art
?
layer 8
any help on this error ?
but is this possible with the released nodes?
SD3 Hugging Face Spaces are available, check the sd3 model card
im
I agree that prompt following is a lot better, but anatomy is bad. Saying there's a "learn to use it" factor for a model that is supposedly easy to use and generates great images from natural-language prompts doesn't hold up; it should do that without having to prompt engineer like crazy. That is a training issue and most likely censorship beyond anything ever done before
for them tbh
wait a minute... why in stable swarm when I press import from generate tab are the clip models labeled SDXL?
I think so, too. Many SD wrappers have "prompt magic"! 
DK rap
quality aside, and let's suppose we have a good finetune in the next month or so (best case scenario), what about controlnet? that is still lacking for SDXL
the 16-channel VAE thing is referring to small details, eg distant faces in groups in an image, or the eyes, or etc. where XL and v1 previously kinda just utterly failed until you upscale the image a bit to make those details large enough to process well
wot
Thanks for SD3 StabilityAI devs, awesome work again ❤️
Also, SD3 has much less AI artifacts.
can someone test the same seed and prompt but with different schedulers, steps and so on ????
because the damn text encoders in the sd3 release arent working for lots of people

Having the same issue
Hello dark
they likely filtered out most images of humans until they couldn't get sued for being able to generate illegal content from the base model
hello
no u
nudity isn't illegal

Very horrible anatomy.
try with SingleClipLoader. I don't know offhand if comfy added support for that because it's very silly to do
well that's made up
using different Scheduler gives very different results
i dont have access @barren spindle
i still get this error and the update_comfyui_and_python_dependencies.bat didnt go through with the above mentioned error
pose problems due to lack of training data?
site is down
i think you replied to the wrong message
THE WEBSITE IS DOWN https://www.youtube.com/watch?v=uRGljemfwUE
maybe, there are so many messages here lol
SD3's text encoders are literally a copy/paste of the SDXL ones, plus an optional T5-XXL
i am at work
and im still getting it after redownloading them 4 times to ensure it's not a corruption issue, running update all in comfy manager, and checking for updates of comfyui
if you have SDXL models with finetuned tencs you can try them on SD3
wait im confused, better in quality how? like lighting etc or prompt adherance? If so why are api and paper results of 8b so much better ?
😭
sdxl better tbh
i mean you might as well just trash your whole setup, just create a new folder, download a fresh copy of portable comfy and just use that, then bring whatever it is you had over to the new folder to ensure a fresh working install
But again, if you claim to be on par with midjourney or dall-e, that also includes usability, and they are just a lot better at creating poses and anatomy with simple prompts
My SD3 is broken....
will the sd15 clip layer be like quake 2 code in future AI models? 20 years down the road and it'll still be in the newest generative models
yeah its not looking good for stability.ai
SD 1.4 was better and hotter
You can use both you know
its just trash
skill issue
@silver sluice could you give me a link of where to download the fresh comfyui best ?
no
meanwhile dalle3 gets it on it's first try
Big noodle soup 
so by default it's not using the T5 in stable swarm?
nah, thats like 18/20 images trying to generate this prompt
Guy has MJ as a profile pic but nobody wanna be like him
skill issue for stability 😂
bump, anyone know what to do for this? I've redownloaded the files many times to rule out corruption, and checked for updates for comfyui and all nodes
SD3 is still confused about what a cheese-shaped crab looks like
quality refers to, like, the visual details n stuff generally.
SD3 in general has much better prompt following than XL does.
8B is better at prompt following than 2B.
2B is better at visual details (like the stuff you see when you zoom in and look close) than 8B is currently (until 8B training gets further along)
my day is ruined, let's discontinue the model.
seemingly yes they keep shoving L into everything aaa
lol just help me out, what am i doing wrong? does SD3 need elaborate prompting to be good?
its just not well made
Ah alright
what is the min torch version required for sd3?
join me brother, show me a good SD3 prompt that will make me a cheese-shaped crab im obsessed about it now
try "cheese sculpture shaped like a crab", you keep asking the model to make a crab, it will make a crab lol.
No, I'm creating different prompts.... "A woman with brown hair is lying amidst a lush green field of tall grass. She is wearing an orange dress with white polka dots and is looking upwards, possibly lost in thought. The field stretches out as far as the eye can see, with the woman being the central point of interest"
yes, edit this param to turn T5 on or off:
shes literally me
thank you!
i'll probably swap it to default CLIP+T5 if you have enough RAM
Nah sd3 is good, but we HAVE to create and use community finetunes. it's clear that the censorship of the model affects the generated people (the same thing happened in 2.0) and the dataset used was also limited, it can't generate images of famous games or styles. but the images it can generate look good, with good lighting and textures
yea i mean "lying down"
obviously, but CSAM and deepfakes are :)
uhhh what is that
yeah good point, I kept saying "a crab made out of cheese" but then I changed it to a "cheese shaped like a crab" but its still doing the same thing, ill try the sculpture bit and see if that helps SD3 understand intent
So, thats nothing you can sue Stability for
that is not comfy official download
🤣
bump, anyone know the fix? what is the min torch version required?
welp :/ I'll try restarting
you can sue people for anything and a judge determines if it has merit or not
If I downloaded the 10gb model what do I need to do to activate the text encoder?
yeah it's not it's like a preloaded version it's what i use and it works well, sorry if i should only link to the official release one
Yea but it does not have merit, because you can't create deepfakes out of the box. Better just stop training on celebrities instead of nsfw
better take it down, we had the recent malware distribution because of custom nodes, better be safe
Server -> Update and Restart
Your copy of swarm is a day or two outdated
what are you guys' opinions on sd3 so far?
Thats crazy detailed
and once you use photoshop to create the deepfake, it has 0 merit in any form or way
it is NOT good
but it is
and you can say after 3 hours.......
base models generally arent great compared to the community finetunes
SD3 architecture is good, training is bad
For certain images it produces good results, but the trained data IS COMPLETE GARBAGE
how is it good, people are all deformed it looks unusable
orange flavor
its dead we lost time to pack it up and play with Elon model
The problem is the training data
apples to apples dalle3 converted my prompt:
generate for me an image of a cheese shaped like a crab on a plate
to:
A piece of cheese sculpted to look like a crab, placed on a plate. The cheese has intricate details mimicking the crab's legs, claws, and shell. The plate is white and simple, with the cheese-crab placed in the center. The background is plain to keep the focus on the cheese-crab sculpture.
SD3 on the left, Dalle3 on the right
yes?
You can also specify to use it alone
i think the saving grace of sd3 will be how easy it is to finetune so far
what is the min torch version required by SD3?!
reddit c*** box vibes
is it 2b?
@lavish osprey indeed sculpted was the key to getting it to look right
Comparison between different clip models ? Is clip only bad ? How much ram should one need for clip+T5 fp8 ?
yeah running locally
no. Did you see SD 1.5 standard model?
Are we sure that the model is implemented properly on local repos? The results from the api vs local are so different.
yes its trash by todays standards
If anyone gets a license for it..
pixart will save us
damn, drop the settings+prompt
shes showing her arms,ban
Pix who
i think the saving grace is that it's free and it's weights. good enough for me. The new VAE is pretty saved too.

pixart sigma
SD 1.5 weights were one thing. Finetuned models another one. This will be the same with SD3
@viral plaza What is the difference between the fp8 text encoder and the fp16? I get quite different results, which is why I ask
but it cant even generate people doing actions, its so censored
That is the thing that i dont understand, why people have to contact SAI by email instead of buying the license like its another random software?
Less precision, the text encoder takes up less space in memory but is more inaccurate
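to put rough numbers on that, a sketch (not how any loader actually does it). assumptions that aren't from this chat: about 4.7 billion parameters for the T5-XXL encoder, fp16 keeping 10 mantissa bits, and the fp8 file being an e4m3-style format with 3 mantissa bits. the rounding helper ignores exponent range entirely and only illustrates the loss of mantissa bits:

```python
import math

ASSUMED_T5_PARAMS = 4.7e9  # approximate parameter count, an assumption


def weights_gib(params, bytes_per_param):
    """Memory needed just to hold the weights, in GiB."""
    return params * bytes_per_param / 2**30


def round_mantissa(x, bits):
    """Round x to `bits` explicit mantissa bits (exponent range ignored)."""
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)       # x = m * 2**e with 0.5 <= |m| < 1
    scale = 2 ** (bits + 1)    # +1 covers the implicit leading bit
    return math.ldexp(round(m * scale) / scale, e)


print("fp16 weights:", round(weights_gib(ASSUMED_T5_PARAMS, 2), 2), "GiB")
print("fp8 weights: ", round(weights_gib(ASSUMED_T5_PARAMS, 1), 2), "GiB")

w = 0.1  # an arbitrary example weight
print("fp16-like error:", abs(w - round_mantissa(w, 10)))
print("fp8-like error: ", abs(w - round_mantissa(w, 3)))
```

half the memory, but each weight gets snapped to a much coarser grid, which is one plausible reason the two encoders give visibly different images.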
We need chinese models to become better kekw. SAI cant keep destroying our hopes.
i love it such fun
Awfully slow I stick to Turbo for now.
the hand on top of the head looks like ps2 low poly, lol
Important to note that Pixart has commercial restrictions too. Theres a limit on how many end users can use the model through your service. So a place like Civit has to license it to tencent
if Civit doesn't want to pay tencent for licensing, Pixart may never take off
That is actually matching my results, notably with text.
people complaining about sd3 just havent gotten it to work right
im confused rn,where is the girl????
pony guy has shown interest in experimenting with a pixart finetune, so perhaps one day..
mmmm corn
really bad
looks like it worked 😉
idk i was supposed to make a gigachad
git gud
True
Can SD3 make me a pizza with horrific toppings?
dead model SAD!
because this is how it's always done in business 🤦‍♂️
I created a dataset with different angles, left and right hand poses and so on, around 1500 images captioned with gpt4 vision and curated. Once training is added to kohya or onetrainer, i will try out how good poses can be improved
its a controlnet hidden image you have to squint
different model 😂
So, whatcha guys think so far?
Then why is it that some people CAN'T get a license?
it makes no sense
Pony guy, the goat, or shall i say, the horse kek
Sitting is too risque huh?
no
git gud? its still deformed. Look at the leg
here another
Difference between clip only and clip+T5 fp8
18:14:26.383 [Info] Generated an image in 398.86 (prep) and 1197.11 (gen) seconds
so its out?
also bad
it looks so different
I'm reasonably enjoying myself. Oversized apples...
No 2 hours
this is what you call gaslighting. somebody accomplishes something other people can't, so they tear it down instead.
pls dont do a cascade on this and wait for 8B and not train this and make tools for it
dalle3 does the same thing as SD3 where prompting it this way makes a crab shell made of cheese flesh

Is this supposed to be impressive?
I mean, it's definitely better than the shit some of you are posting 😄
First image is T5 only. The second one has only CLIP loaded.
yes,my soul is out of my body as i keep getting error on comfy
ouch
I found a pretty sick upscaler
Can't. My roots go too deep.
again, this is how it works in business. You contact the company and make a deal. I assume it's not you who try to buy a licence but someone else. I wouldn't care so much about information from third sources
looks the same as sdxl
no, but it looks nowhere near as horrific as your gens. Git gud.
How many of us actually slept last night? lol
Yea, we are posting with a model from you guys.. using simple prompts that you guys said could create great results
lmao
She has three legs and one of the arms is really weirdly bent.
better prompt adherence my ass, it was : cinematic, interior design of a modern living room, designer furniture, polished granite stone, marble annd golden accent details, maculine design, trendinng on AD
cant even do words properly lmaoooo
skill issue
What.....?
stfu
lol
unskilled feedback is not really useful. Learn to use the model first.
thicc
Brother i can feel your hands shaking behind the screen...
put the computer down..
stop coping with the skill issue, it's a terrible model and you know it
I keep getting decent ones all the time
how long until finetune scripts get made
Sure, why is midjourney and Dall-e capable with the same prompt? 😄
maculine design!
pixart seems to have no issues with this prompt
some people better get ready to apply for unemployment benefits lmao
Lots of work for the community I guess.
they do, and a lot
Which clip model are you using ?
yeah ever heard of stark ?
i installed the new comfyui portable, but it doesnt have all those nodes, and it doesnt want to install them with the install missing nodes in comfy manager either, why is that?
dalle is not a model, mj is a finetune (of xl? maybe).
Why are you comparing a base model to a service? Compare Dalle to Ultra lol
imagine being so gross and toxic that you wished people lost their jobs because you're bad at prompting
don't try me on design mate
is this good?? really?? can't do "very deep learning"
@lavish osprey also you keep saying learn to use the model first while posting garbage results all the time?
In this economy? Don't have to be replaced by AI to get unemployed. lol
Her legs are completely wrong.
Holy shit what did I come back to lol
i see a third leg here
but it's not made of cheese, it just looks like a crab made out of a golden baked bread shell or something lol
Stop sitting on SAI's lap, the model is trash.
surely part of the dall-e tool chain comprises a model
hands baby
It got a bit crazy while you were gone.
How much ram is needed for clip+T5 fp8 ?
Don't lol
I can not get a single decent gen of a woman sitting. Just a woman sitting.
the face of every person using SD3
LMAOOOOOOOOOOOOO
try running the update script after downloading the latest portable version, that's what I did, i also verified the update script points to the real github source for updates. also if you have tortoise git installed you can just right click and run git pull on the ComfyUI directory
Don't what?
yeah this is unusable
it's a very special cheese!
i'm over here using a free model. it's fine.
if it's trash then you got serious skill issues. generative ai been out for years and you still haven't figured it out? Figure it out
It fails miserably at generating human bodies in any context
skill issue (c) Stability staff
Im glad that someone from stability staff, on the day of release, is just telling people to "git gud" and then post ok-ish images as a gotcha. SD3 might be better down the line with some finetunes but this launch is ROUGH.
skill issue
how is that a skill issue lol
They are not even ok-ish, look at the legs, this is body horror.
bro is paid by SAI
skill issue (on behalf of the bakers)
Then dont say its our fault that your model doesn't behave as you advertised. Just say go finetune it, we didnt bother to train right
i mean, lots of people making images that are fine is why
the update script doesnt seem to work in the new comfyui
it is trash, and the skill issue is from stability.ai, i didn't make the model
thats wild
what? since when!? I guess they're paying me with free models.
I meant rather don't come back for while, it is wild here
😁
They are cherry picked
🤣
Then please, Explain how to proompt.
u gotta be trolling
Dude got MJ as a profile pic and can't even make a free throw shot. /sucks teeth
why are you coping then, what's in it for you to congratulate such a disappointment?
more plz
sitting doesn't seem so bad tho?
it seems to be a dice roll at this point really. quite disappointing after seeing how stability flaunted sd3's ability to generate good hands before its release
Idk, I feel like this is a bethesda moment of "the modders will fix the game". "The finetunes will fix it", sure finetunes will always be better but out of the box the model should at least be decent.
i'm not disappointed. Actually quite entertained today
like.... which ones
Pretty sure SD v1 and SDXL launches were dozen times better than sd3?
if you ignore the 3 legs
SOTA in rock generation
well not yours
the other script gives me this error:
i bet yeah
this config with dpmpp_2m gives much better results than the default workflow comfy comes with
its the future whats wrong with 3 legs
one more foot please

/spit take
yeah now im trolling.. everybody can see its unusable stop coping
sadly, have to do other stuff than hanging out here, but appreciate the sentiment! 
as expected everyones complaining... ill try it later when the dust settled a bit
for real these guys are mental
it's skill issues everytime until we get the finetunes and are better haha anyway are CN models released?
just a minority of people with serious skill issues
Maybe ppls settings are wrong or something. Different CLIP needed or something hasnt updated. Why are the results so different?
yeah show me one good one lets all see
try it now,whats the matter getting tired eh sunshine? ill drop u like a bag of potatoes
HUGE SHOES
naw i don't jump when clowns tell me to
The same happened when XL released. 
i has a bit of skill, i'll pass it thru my awesome noodletown setups later
the only thing that truly matters
landscape seems to take much longer now
Average woman's shoe
you cant, bc its trash lol
oh yeah great result, what a fine image to show how good sd3 is
platform shoes was difficult to do on sdxl so thats a win
40 gens so far, not a single not mutated woman sitting.
Yeah DPM++ 2M should be better than Euler in general, setting it up with an ODE solver might work even better since that's the traditional method for sampling RF models
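for reference, this is what "sampling an RF model with an ODE solver" means in the simplest possible case. it's a toy sketch with a plain Euler solver, not the ComfyUI sampler code. the names are hypothetical and the velocity field is the exact one for a single target value, assuming the convention that t=1 is pure noise and t=0 is data, with x_t = (1 - t) * data + t * noise:

```python
def euler_sample(x_start, velocity, steps):
    """Integrate dx/dt = velocity(x, t) from t=1 down to t=0 with Euler steps."""
    x = x_start
    dt = 1.0 / steps
    for i in range(steps):
        t = (steps - i) / steps      # current time, goes 1.0 -> 1/steps
        x = x - dt * velocity(x, t)  # step toward t = 0
    return x


TARGET = 3.0  # stand-in for "the data", a single number in this toy


def toy_velocity(x, t):
    """Exact rectified-flow velocity (noise - data) when the data is one point."""
    return (x - TARGET) / t


result = euler_sample(-1.25, toy_velocity, steps=4)
print(result)
```

in this toy the path is a straight line, so even a few Euler steps land on the target. a real model's learned velocity field is not perfectly straight, which is where higher-order solvers can make a difference.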
I DID IT!!!!!
1920x1080 is too much on my vram... can i load this without text? just use 2 clip?
may need to look this meme up @dreamy sundial it might be before your time
"The model is good!" "Can you show us?" "No clown". Glad we are all being constructive here.
maybe we need to use diff samplers here, unipc is a bit of a hidden gem
Try a more detailed prompt
How is being more detailed going to make it fuck up less?
See like some of the results are ok. Why does it vary so wildly?
fucking clown
Im trying for the most simple prompt
x2, im not disappointed in the model, it works, and if the base model can achieve these things the community finetunes will be epic. im just disappointed in the dataset they used to train the base model, there are a lot of things that it just "forgot" that were possible on older sd models. of course the community will be able to finetune it and fill those missing gaps, but idk why SAI keeps removing stuff related to copyright when the competition is better and doesn't do it
when I was using API .. it seemed like the upscale made images way better,
Are there any papers, guides or other documentation about the nuances of prompting SD3 you'd suggest?
doesn't look too much different from sdxl base by the way. Maybe you're comparing a base model to 1 year of finetunes or enterprise apis.
If that's your goal I suggest you compare our Ultra or Core services with those.
Whose being constructive where? i'm not going to work when scrubs demand it. Ever.
I got nothing to prove to y'all lol. You're just clowns with skill issues sulking
put in some epic prompts. does it like natural language?
We need a skull issue
bro were comparing it to base sdxl, and its worse with anatomy than sdxl
THE PROBLEM IS YOUR ADVERTISEMENT. You basically said its the next coming of christ
Man you all have such skill issues
Then post your examples
n-nuh uh.. yall have skill issues, clowns! Clowns!
SAI...
its a new model folks, it has its quirks, we need to figure it out
I genuinely think that the skill issue lies in the negative prompting
Ok bud, we hear ya. Everyone is clapping.
by the way, the same happened at sdxl launch, where people were comparing it to DreamShaper, RealisticVision, etc, and even using those as "refiner"
just me. sulk harder
Is it just me, or is this model just bad? At the very least, it's absolutely nothing like SD3 through the api
the model doesn't seem as good at generating from nothing, it might need more handholding
This
Ill do base sdxl side by side if you want.
bro you cant even show us one image with good anatomy, get lost
i remember people prompting with just the refiner and then declaring that stability only ever releases trash
sdxl sucks bud,1.5 better
I wouldn't even hate SD3 if you just had given us realistic expectations lol
do it pls
SDXL 0.9 didn't output images like these for me.
Gotta be some strong tea. She don't know where to look lol
just updated to CUDA 12.1 and updated torch accordingly, still getting same error
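when asking for help with errors like this it helps to post the exact versions that are installed. a small sketch using only the standard library (the package list is just an example, and the function name is made up). run it with the same Python that ComfyUI uses, otherwise you'll be reading a different environment:

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Example package names only; adjust to whatever the error mentions.
for name in ("torch", "torchvision", "safetensors"):
    print(name, installed_version(name) or "not installed")
```

this only reports what is installed, it doesn't say what the minimum required version is.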
Everybody's all up in arms at the very start of the release of the model. Y'all need to chill, learn the model and its quirks, then you can feel free to complain.
this guy is a troll mate don't answer to him, he is coping so hard
i'm not going to publish anything i make today. The low skill babys are primed to kick over every sand castle they see
yeah
whatever get lost
Some people troll for living
Try and use t5 fp16
GOOD JOB STABILITY XD
By the way, can we nsfw with sd3?
Make a free throw. Can't do it
some sd3 2b vs sdxl base comparisons.
Don't respond to flowwolf he's just trying to irritate you.
They censored the model like hell, yet they allow biden to go in bed with trump xD
not the same model we have
If you can't make a free throw right now right this instant, you're a troll.
Bent elongated arm and hand, three feet, broken trousers, if you think this is good I have no words.
Now try that in anything not standing perfectly straight
lol
make them sit
But can it do people lying down?
nope
It can barely do people standing up
I did and posted it and had no issues. Understand that with 3 text encoders you need to be clear in your prompt, or they will fight.
Bu-but skill issue
How much ram needed for clip+T5 ?
mind share your prompts or setup? your gens looks really decent
for real though. i am having skill issues. my hugging face download keeps stalling and failing.
lykon how is it possible you were showing very decent previews for the last couple months and now we got this? XD
Yea and that are like hyper trained images in your dataset? They are the same tho, so nothing special compared to SDXL
can't use the t5 model yet
Uses roughly ~16GB
i use the new portable version now and i did update and it did a lot of things, but its still missing EmptySD3LatentImage for example, and not letting me install it
im not hating or something, just curious
Eep 
T5 only. StableSwarmUI. No upscaling.
I will make the following suggestion to others: use the external file for the fp16 encoder instead of the embedded smaller fp8. I am getting night and day differences, notably in text reproduction. Here is fp8:
No you did not, you posted a mutant
Thanks
yep
and this is fp16:
Careful when you walk down the stairs folks
SD3 sucks balloney sugma model better
oh, sorry, a bit more in total, but 16GB VRAM.
This is using them without even the other clip files
Already much better
Now my results look like garbage... Very pixelated images despite using the refiner.
It is good at understanding itself hahaha
can it do Booba? Sign me in
Too Much
CFG: 2
dpmpp_2m, 30 steps
you are now banned from SAI
git gud
Foot is a bit iffy
getting the scheduler wrong on purpose actually seems pretty cool for glitch art
2/4 of how many, and it still has mutated fins for hands
90% of subjects coming out asian is wild btw
thanks
So you're just looking for arguments. Alright
Bro are you trolling? look at your stairs
Im looking for a single decent gen not standing
I wouldn't mind myself
👍
yup. if you show results the goal posts change. it's why i don't post examples for these people. nothing is good enough for them
look at your images before you say "git gud"
i know why the anatomy is so bad. it was trained on your mom
https://i.gyazo.com/9f092f34b2b39751d6fdb3028b5fd7e3.png
What is a Download Mirror and why is Load Balancing a good idea.
i can do the same when sneezing on napkins after eating a burger at wendys
git gud
my comfyui is still saying this:
but update comfyui says its newest version
definitely better than this #🆕|sd3 message 😄
Thanks man
NSFW is dangerous according to SAI and its investors, so no. Wait for a proper finetune (months).
This probably happened because of the removal/censorship of some images in the training dataset, the community will fix it fast
i wouldnt call Velveeta "cheese" tho 😉
git gud
wasn't supposed to be... ?
Thank you, first worthwhile gen ive seen of anything not standing. Ill try that prompt
AI Safety is a real concern. Those at the bottom can't see the view from the top though. Understandable that you wouldn't get it
no his stairs actually look kinda okay
how/where can i get a comfyui portable version that works with SD3?
If people are motivated to finetune because of the licensing
not same model
yeah perfect image
this will take months to have something usable dude
yours look like they have been cutout from paper and glued back together
haha
literally downloaded from HF lol
Nobody is getting hurt, Lemme gen some booba.
Very short-sighted.
You've got sd15 and pony already. go nuts
were not getting nearly the same results
how/where can i get a comfyui portable version that works with SD3?
git
||good||
Yes but this is not something it can be fixed with scripts or controlnets, its a pure training mistake
@compact forge just quietly posting banger gens. Go you!
github, just update
can you give me a link @topaz flume ?
Garbage anatomy aside, why does it only want to make Asian women even with it in the negative prompt?
Yeah like.
whatever
Lots of them in the training i guess?
"british woman"
Oh my god this wizard is powerful
This is so sad, hearing all those people bitching about SD3. just because the release model is bad at hands and anatomy you say it's bad overall. look at the other things it can generate, or go use whatever model you like and pay for it, without fine tunes, the flexibility of comfy, and programmers who chat with you here.
Google Gemini Image gen returns...
https://github.com/comfyanonymous/ComfyUI this one ?
question i'm curious about. would using the fp8 clip be better on a Ada 40 series card, since it has the hopper transformer engine? Or would that do nothing for the efficacy of the math?
sure https://github.com/comfyanonymous/ComfyUI
just install the portable and make sure it's updated using the ComfyUI manager
Did you generate this with SD3?
ill try adding some booba to the prompt
yes
maybe we should just upload the fp16 version lol. Too hard for people to use them separately
Im going to try using the encoders separately instead of the all in one
maybe that is an issue
👍
@lavish osprey so here's where I'm at. sdxl had a lot of the same issues with hands/limbs etc. Over the last month we've gotten the align your steps, and the res_momentumized sampler. Between those 2 things, all of those limbs/hands issues have largely vanished. res sampler is out because it's like sde/ancestral. Do you think there's options for new samplers to fix this like they did for sdxl?
@topaz flume i got it from there, but it says it cant do the sd3 stuff and its not updating itself further
Ah yes, because clearly ppl don't use AI image gen for generating ppl
doesnt work
probably a good idea haha
Left: sd3_medium + TripleClipLoader (clip_g, clip_l, t5 fp16)
Right: sd3_medium_incl_clips_t5xxlfp8
weird... all my images look pixelated as f*ck
And StableSwarmUI is shit too
Bro, you found that most nsfw image! No get banned!
ok
dpmpp_2m and sgm_uniform are the only config options that work?
skill issue
seems that way. after he showed the fp16 version i deleted the model i had and am downloading the version without the clip
this is trained entirely on asian women or something
no for sure it is working with the correct update
dpm2 also works.
Hope Forge is updated fast
DPM++ 2m + simple works too
is it? works on the first go with the comfy ui workflow
do I use t5 fp16 or fp8?
do you want more quality or less vram?
those sandals...
Have the VRAM for 16? Then use 16. If not, then not. 16 > 8 in quality.
signs like this should be addressed and potentially bannable. this is just beyond toxic now. @lavish osprey thoughts?
must be from Hong Kong
I have no clue, hope my answer helps in any way
ty, should i use gpu?
you want ComfyUI c8b5e08 version or newer
This is what im talking about, it can generate very good images, just not about all the subjects
ty
if you get that option yes
It's a meme dude. I'm not trashing the model.
Shocking that they don't want to be associated with weirdos making furry and real people porn
deleted, better safe than sorry
I actually like the model.
i cant