#🆕|sd3
you win
horse gets a taste of lesson 4
Ha , i will try with diffusers
[Chinese] Not out of the oven yet.
moon runes
Oh nice. Git cloning it actually works.
Only people with API access and developers.
Do you have a lib account? I'll follow you.
fantastic, so I just drop it into models folder? 😄
Yes. I think they will give you 10 or 20 free credits.
Are we live?
Still 404 on HF
huh
very good, I tested Midjourney prompts with SD3. SD3 wins across the board
The model seems to have been uploaded.
thanks! made with SD Ultra
model has been uploaded, what are they preparing now
(I'll edit this line when their upload is finished, meanwhile is a 404)
so start your f5ing on HF
hopefully Ultra can be done with 2B workflow
I hope so too
or better!
putting all the latest info together, ultra currently runs an older version of 8B
Are we there yet?
I think so too
I heard it's been
hey @gusty gale 😄
Hopefully no delays with release today
yes
so 2B should be better from release. someone with access to both old 8B and 2B mentioned currently 2B is better
no
Huggingface will break
that processor is fast... damn
Are we really waiting on an upload? The F5s seem wasteful if not
@viral plaza so it rejects, is it true?
no
it's already uploaded, it's just not public
woo
API actively rejects bad prompts, the model just doesn't know how to gen nsfw
yeah makes sense lmao
Loras will take care of that
404 Weights
anymode guy prob got sum misinfos
Are there any official ComfyUi SD3 workflows i should be downloading?
No official workflows uploaded yet afaict
oh that's legit. https://www.reddit.com/r/StableDiffusion/comments/1de2qne/announcing_the_open_release_of_stable_diffusion_3/
feeling like waiting for death, 3h left?
10 PM Beijing time
get real
Links to a 404 page
I need it yesterday
need it Asap Rocky
Will you change your name when you get 24GB of VRAM?
I need now
I mean post is legit, was nothing here on server #📣|announcements
will be updated
be ready for astral long paragraph
true
does git clone work on hugging face?
which model is this? the 8b?
yeah
yes
Yeah, but you're better off using the hf CLI tool.
I'll just use Git
Wait did the weights drop
Otherwise you'll double the space needed
after release will it support diffusers..?
uploaded but private
?
git clone keeps a copy of everything, i.e. git cloning 10 GB for T5 will keep a copy of it in the .git folder too. So you're actually taking up 20 GB.
Where is it
just wait for it.. we're gonna generate lots of sloppery!!
that's stupid
How much is sd3 medium superior to dalle 3 and stable image ultra?
No, it's a version control system. 😄
thats how git works
Those who have already tested
Yea true,I just wanna fuck around with the text in images
👍
Open source
oh
at least i can run medium on a T4
there are a bunch of git lfs commands like dedup/prune/skip-smudge you can use to save space
Wait, is today the release date?
yes
ofc
Yeah, if you want to get technical, but the hf cli just downloads what you need.
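(Editor's sketch of the disk-space argument above, in Python. It assumes a plain git clone of an LFS-backed repo stores every large file twice, once in the working tree and once under .git, while a selective download stores only the files you asked for, once. The file names and sizes in the listing are hypothetical placeholders, not the real repo contents.)

```python
from fnmatch import fnmatch
from typing import Dict, List

# Hypothetical repo listing: file name -> size in GB (illustrative values only).
REPO_FILES: Dict[str, float] = {
    "model.safetensors": 4.0,
    "text_encoders/clip.safetensors": 1.5,
    "text_encoders/t5_fp16.safetensors": 10.0,
}


def git_clone_footprint_gb(files: Dict[str, float]) -> float:
    """Assumed cost of a plain clone: working-tree copy plus the copy kept under .git."""
    return 2 * sum(files.values())


def selective_download_gb(files: Dict[str, float], patterns: List[str]) -> float:
    """Cost of fetching only the files whose names match one of the glob patterns."""
    return sum(
        size
        for name, size in files.items()
        if any(fnmatch(name, pattern) for pattern in patterns)
    )


if __name__ == "__main__":
    wanted = ["model.safetensors", "text_encoders/clip*"]
    print("full git clone:", git_clone_footprint_gb(REPO_FILES), "GB")
    print("selective download:", selective_download_gb(REPO_FILES, wanted), "GB")
```

With these made-up sizes, skipping T5 and avoiding the duplicate .git copy is the difference between roughly 31 GB and 5.5 GB on disk.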
Yes
it's going to be released
Probably release on 23:59,9999999999999999
Generally, DALL-E is superior in prompt following, composition and consistency. SD3 does more lifelike human faces, but mainly you can run it uncensored!
There was a PR yesterday I think from ComfyAnonymous with SD3 support. I guess they will update in time with the release.
So, looks like the licensing terms are up and... it makes things more confusing?
Am I correct that the following changes have been made:
- Professional Tier is no longer a thing, it has been replaced by Creator License?
- The Creator License is limited to, well, creators and includes the following new restriction: "In addition, for customers who have entered into a Creator License Agreement, “Core Models” includes Stable Diffusion 3 Medium in English, which may be used to create up to 6,000 images per month."
- What do I need to do to actually talk to someone about the Enterprise License? I reached out via the SD3 form as soon as it became available and have not received any communication. Is this an attempt to control what kind of models are actually produced from SD3? "This ensures that businesses can leverage the full potential of our model while adhering to our usage guidelines."
- Where does the finetuning fall into this? I assume it does not matter until the SD3 model is deployed somewhere but some clarifications would not hurt.
impatient for SD3 medium release? use your gems to skip the time
oh shit astral sent it
never tried downloading anything with hf cli tbh so no clue about that, git clone is just dead simple so prefer that personally
two hours again?
I feel like the uncensored part is gonna make sd3 superior cause dall e 3 is bottlenecked by its restrictions
2 more hours for real this time i swear on Yeezy
what will you guys do if for whatever reason they decide to delay the release again
2 square hours
ok so my imagination is way ahead again
Yeah you cannot even prompt things like "battles" a lot of the time or celebrities or even characters.
give me consistency
Glory to OPENAI
small creator licence 20$ a month https://stability.ai/license
Stability AI licenses offer flexibility for your generative AI needs by combining our range of state-of-the-art open models with self-hosting benefits.
open ai should be called closed ai
sigma is a tiny 0.6b model thats undercooked, it's currently just a really cool tech demo
It's a really contradictory name
really impressive for what it is
that makes sense
sigma
so sigma
on sigma
In my SD3 test, the large API model is, on the contrary, superior to DALL-E 3 in prompt following, but DALL-E 3 wins on quality. It also wins in complex compositions
And Stability should be called Unstability
Please reread what I asked 🙂
In the Reddit announcement, there is a link dedicated to discussing the license agreement.
chat, what are people on reddit talking about in regards to safety btw?
Very interested in the safety feature aka censorship. Pretty sure any prompts including "Taylor Swift" will be completely blacked out.
that's not how it's supposed to work unless using api or if you have the nsfw thing enabled in diffusers, right?
astral did read, why would they send that message?
Have you compared the sd3 medium?
Reddit being reddit
wasn't answer, just observation of the price
worst case would be sd just generating a person who looks nothing like taylor swift I think
I am confused by the 6,000 image limit, what happens if you want to go over that?
pls don't drop SD3 in next 30 min. I have something to do! :D❤️
screw u!!!!!!!!
Very true
No just been using the API
i hope not so it should drop around 2pm utc
Sorry, I must be blind - can you please point me to it?
But you can enable sd3 medium in the API. Didn't you turn it on?
I also see it now says that the creator license is specifically for the 2B model, was there an announcement that the 8B model is no longer for local use by small creators?
Or just that it isn't ready yet
Channel going nuts. Discord will crash when SD3 drops.
"The weights are now available under an open non-commercial license and a low-cost Creator License. For large-scale commercial use, please contact us for licensing details. "
(In the announcement on Reddit, the link is embedded in this part of the post ^)
I think you will sadly need to contact them via email. But that is where they direct you to.
so!!!! SD3 is not free?
It is still free for noncommercial use
I think its free for personal boobs and stuff but not for commercials?
I have been using it with Clipdrop (as it was cheaper) and then Glif (as it is free) so no idea what version they are using, probably an older one.
basically if you want to make money with sd3 you need to get in contact with stability first
if you sell your sd3 made boobs without acquiring your sd loicense, sai will send swat to your house
lol
is there any license?
can someone try : VHS tracking issues distorted vertical hold noisy signal chromatic aberration low light artifacts blown out highlights faded pastel colors worn VHS tape wear eerie nighttime cemetery scene with crumbling headstones and overgrown weeds trees with twisted branches and long shadows cast by flickering moonlight bright red light in the distance casting an otherworldly glow on the mist-shrouded graves worn and weathered stone angels and crosses looming in the darkness like sentinels grainy texture and video noise adding to the sense of unease and foreboding
for commercial use
Model is heavily censored. So I guess no.
There is 20$/mo license for small creators and I guess custom pricing for enterprise
not until a finetune comes out
Right, so it would be nice to have some clarifications. And I did contact them, but I think based on general perception it's not very likely for me to hear back.
haha
in reality unless they're trying some shenanigans like watermarking the outputs again nobody will know what you made your boobs with 🤷
you can but they have to be your own boobs
Just like openai's API ,for free 5$ limits?
Idk anything about the API, I'm just thinking about self hosted stuff for commercial use personally
damn, i would like to hear mcmonkey message about this again since the weight is about to drop
looks like they are not very fond of pony,maybe they are going OpenAI direction
the non commercial use thing applies to the local model as well
Yeah, I assume 6,000 image limit also applies for local use?
oh damn was just about to launch, but since you don't want that i'll go ahead and cancel the launch and reschedule to next week for you
6000 images. So one overnight session then what?
wow, not releasing sd3 as core, just wow. 6k images is about as close to nothing as you can get
I have seen boobs slip though the censoring on the API, it can definitely do boobs fine, I don't know how good it is at other parts.
yeah i think so
Seems only logical.
my boss let me check sd3 for commercial use, now it's impossible
Now look what you did
we should kill him
it still should be possible, you need to talk to someone from stability and discuss it
8B isn't ready for local use yet, still in training
It's probably better to wait
I would like some sort of scaling though, like $40/mo for 12000 images etc
Makes sense
6000 commercial images. If you're just generating locally and not using those gens commercially, that's personal noncommercial usage
if you're selling image gen as a service and exceed 6000/mo then you need to contact the enterprise license thingy
Dang
it's so cheap!! that's acceptable
dammmm , thanks so much , the result is amazing , is sd3 2b ?
Hoof you guys. Neural networks can learn and generalize, that's the whole point. Even if there were absolutely 0 boobs in the training data, you could continue pretraining and add as many boobs as you want. If there happened to be any, you just fine-tune it for more boobs. Also absolutely no one can limit you when you have the weights.
probably yeah, but you gotta contact enterprise support to make a deal if you're above the 6k cap for commercial usage
Does contact us now mean I get a tea with Emad?
Stable Image Ultra
he no longer ceo :(
I prefer flat anyways
ceo of my heart
Agreed
I know you're not in charge of this, but don't you see 6k is a joke, you have those 1 million limits, but then, oh, oopsie, you also can just gen 6k images/month. That makes that license non applicable for anyone doing anything even slightly commercial 😂 even a finetuner can't demo its model on their own hardware like this
sd3 8b ?
a man of culture
Personally, I do not go into such sources, because they are questionable. I prefer from the official website
I did the moment this form became available, they did not get back to me. While I am happy to wait, the question is simple - is this related to "This ensures that businesses can leverage the full potential of our model while adhering to our usage guidelines."?
when sd3?
sorry but looks like it only gens holes where chest is located,black hole size
lol glif
So commercial use has changed to include the output from the model even though it says in the current non-commercial license that output is excluded from the license?
a finetuner demoing the model on their own hardware is not using the model commercially, ie they can do whatever they want infinitely for free
I think it's only applying to images that are directly commercialized but I agree it is limiting for certain use cases
ngl idk what's up with that
Yeah but it was costing me $10 a DAY when SD3 API first came out and I was going broke!
Also, how will they know if an image was generated in SD3 and used commercially? There's no watermarking when using local gen?
You coomers don't have to worry about commercial limits whatsoever as long as you have the weights on your personal computer floppy disk.
sorry i don't understand what your question is?
I think "policing" commercial use of SD3 is going to be a nightmare ...
ha is a model
Except most finetuners are sponsored by 3rd parties, which makes the fine-tuning endeavor commercial, and thus anything related to it 🤷♂️ You're either acting commercially as an entity or not. You can't really split into two, one offering the preview, the other being sponsored
... a nightmare with IMPOSSIBLE at the centre of it!
wtf
it's mostly only relevant if you're selling image-gen-as-a-service, ie you're making a website/discordbot whatever where you pay to get access to image gen
Then it's a badly named tier.
i think any individual use eg putting generated images in an ad campaign or something can't possibly hit the 6k limit anyway so it doesn't matter
I think it's primarily about people running competing APIs/SaaS so shouldn't be that big of a concern
it more of a warning than anything, if you're a large company you'd rather not risk it
i get the gist is, ok, demo all you want, but the way that license looks now, it can't be done
So getting this straight: commercial use of self-produced images: OK ... commercial use/re-selling as a backend: NOT OK
the "creator license" tier is for if you're a creator and not selling a service. If you're selling a gen service you have to use the enterprise tier
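(Editor's sketch, in Python, of the tier logic as people in this chat describe it. This is only a paraphrase of the messages above, not the license text and not legal advice; the function name, parameter names and returned labels are made up for illustration.)

```python
CREATOR_MONTHLY_IMAGE_CAP = 6000  # figure quoted in the chat for the Creator License


def likely_tier(commercial_use: bool,
                sells_generation_service: bool,
                commercial_images_per_month: int) -> str:
    """Rough reading of the chat: which tier a use case would fall under.

    Assumptions (all from chat messages, none verified against the license):
    - purely personal, non-commercial generation needs no paid tier;
    - selling image generation as a service means the enterprise tier;
    - going over the monthly cap on commercial images also means enterprise.
    """
    if not commercial_use:
        return "non-commercial"
    if sells_generation_service:
        return "enterprise"
    if commercial_images_per_month > CREATOR_MONTHLY_IMAGE_CAP:
        return "enterprise"
    return "creator"


if __name__ == "__main__":
    print(likely_tier(False, False, 50_000))  # hobbyist generating locally
    print(likely_tier(True, False, 300))      # creator selling a few images
    print(likely_tier(True, True, 300))       # paid image-gen website or bot
```

If the wording of the actual license differs from this reading, the license wins.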
And the creator license is affordable?
Idk bro i made that shit a year ago after seeing the top post of r/femboys
What are we waiting on for the HF repo to go public?
I'm StableSwarming as we wait ...
My big question is -> Are apple using SD 800m in the "image playground" 😄
I hope stability got a good paycheck from that
Anyone have a docker container for comfyui + swarm?
This is what the wait's doing to all y'all
What if I'm selling a service that sells creators that then sell services that sells images of the USS Enterprise?
If you want to go more than 6k images you need to contact sales, and reading the reddit post I can see the following:
Large-scale commercial users and enterprises are requested to contact us. This ensures that businesses can leverage the full potential of our model while adhering to our usage guidelines.
So my question - how are the usage guidelines evaluated and what are they?
Apple uses a 3B model locally.
if you're finetuning and demoing, even under a commercial environment, I am not a lawyer but you're probably fine.
If it matters a lot to you contact a lawyer, but if you're not a "I have a lawyer on retainer i'm a big business person" type, then, it probably just doesn't affect you to need anything above the $20 tier
She was playing "PATIENCE" by the way!!! 😄
simple: just finetune the model for 1 step and call it not-sd-3
my question is it their own or fine tuned off SD? As technically that would fall under commercial use
has the license been fully decided already?
I think they mentioned in the State of the Union at WWDC that it was an Apple model.
this is a long document that basically just says "please don't use our models to commit literal crimes thx"
MJ had the most enigmatic copyright/licensing formula - and I paraphrase: "MJ reserves copyright on all material produced; but MJ extends a perpetual and inalienable right of use of any material produced using MJ!!!"
Is there any prompt guide already?
It also says don't make nukes using our models but I might have to think about that a bit
its just natural language now, ask your llm to write a prompt for you. works for pixart so it should work for sd as well
llms go a bit crazier than needed but that works
what is this 6000 image stuff? will my PC explode after i make 6000 images or something?
it out yet?
yes
N o p
aw shit emad
You missed it.
Still 404
you need to buy more credits with emad's shitcoin after you exceed 6000 images
nice! What gives me the most problems are specific words for specific light conditions or styles. Like those light reflections in the air - but I guess that's just my lack of English
i accidentally generated anthrax and wiped New York oopsie
i have to test out this explosion thing, i have yet to make 6000 images on any model yet i think
[Esperanto] That is correct
Thanks
seriously i can't find anything on this 6000 image thing, what does it mean?
is this another meme
its for the commercial license, it doesn't apply to you
crepuscular rays?
[Esperanto] Bc is an abomination
Esperanto?
oh
[Esperanto] Yes
I get that's the idea, and honestly in this case i'd just not ask permission and just interpret it like "cool, i'm not selling my gens, totally non commercial", but the way it's worded, i might only follow the intention, not the letter, of the license, which is fine for interpreting law, but a bit less so for interpreting licenses. Either way, not an issue or anything, just thinking this license has a few barbs.
alright well see you all later, this channel will be unusable for the foreseeable future.... let the sea split and the messiah come. I don't want to be around to see it
:))))
Mklink /D "SD3_Medium" "Torcello's_ComfyUI"
wait dont leave yet,just stay here 2more hours
looooool
And I think it's great list of policies (which I fully support) but does Use Policy == usage guidelines? (as ridiculously as it sounds, I am not sure).
first step of addiction
If Steven Segal says I should abide
https://stability.ai/core-models
scroll down
The Core Models are available to Professional and Enterprise Members for commercial use under the terms of their Membership Agreement.
under the table
Well, prepare for chat to go boom in a few hours. Lol
we need one of those bingo cards, "model is shit", "vae is watermarked", "needs 24gb of vram", etc
I hope there won't be a lot of disappointed people. At least be thankful they release them for free at all...
Will I have to leave the country if I make 6,001 images at all? 😉
I feel like we will be downloading it with 100kb
"Can I use it on A1111?"
of course, how could I forget
and "I can do the same with 1.5"
why
It should be fast on hf
How will SAI know that I've made 6,001, 6,002, 6,003 ... images etc?
Also nice but I meant lens flares (Thank you, now I know two new nice effects)
good then
"Can this run on a GTX -050TI?"
That had nothing to do with non-commercial and Hunyuan-DIT follows the same architecture
Let's keep it going.. "What does medium mean?"
I believe so yes. Whenever they reply to your enterprise contact they will probably make it clearer
How do they Police the 6,000 image rule?
it's only relevant for large scale commercial usages, eg image generative services
ie if you're not running torcellosd3.com where you sell image gen, it's not relevant to you
Can I use SD3 on a dual Voodoo 2 setup on my Pentium 2? Asking for a friend...
And besides, the commercial value of an SD image is zip, zilch, none
(Shucks!!! Blown my cover already!!!) 😉
Only on TempleOS
so, Astralite should be fine now?
Any chance to know how big the backlog is? Should I just apply again? It would be cool for the form to send you an email to indicate the data was at least received.
"When is SD4?"
Nope, the opposite.

2 years
"SD4 is currently being researched on the Dark Side of the Moon!!!"
Lykon said SD4 is not needed, which gives an idea of how well SD3 should be trainable - which is nice!
i don't know (to be clear I am a software developer here i'm not involved in the businessy stuff). I'll relay that comment that it should email you internally to the people who handle that
SD4 will use the new mmDitDitDitDahDahDah tech
@viral plaza can you confirm that sd3 can run locally on a 3060 (12gb vram) and 32gb of ddr4
Please I need to get my hopes up
when is lykon releasing dreamshaper sd3
When is DitHuYuanMobiusPony arriving?! 
You have more than 4 GB VRAM, you're fine
There won't be
Awesome,thanks mang
Pretty sure hes already on it - if even needed?
yeah should be good on that
can i run it on rtx 3060 12gb
Of course!
You can always use your CPU to offload the T5 text encoder too. You have enough system ram to do that.
ye
yall might not want to use T5 as that'll push your RAM, but other than that all golden
My cpu is an i7 8700k lol
I have a clockwork GPU.
Question. What is t5 and what does it do
You can already see my next question ... ? 🙂
it's one of the three text encoders SD3 supports. You can use the two CLIPs with or without T5, or T5 on its own
I've seen much worse. Besides. You can even run SD3 without the T5 encoder. Most users wont need it.
It's a text encoder, it encodes text.
T5 is a natural language encoder
the two CLIPs without T5 is very close in most gens to CLIP+T5
btw, how to caption images with text for sd3? something like that - "...a sign on the wall saying "welcome""?
That's almost exactly how CogVLM will caption text, so yes? probably?
does it have replace face to a real person?
This is my biggest selling point for sd3,imagine making an otherwise really stupid but high quality image with text and no need for photoshop
Text in SD3 "Ideogram! So What?!"
I appreci... Oh wait
Or mebbe even "Harrlogos! See ya later!!!" 😄
ye
I suppose 16gb ram already won't be enough for sd3 + T5?
you might still want to photoshop the text a bit. Or gen repeatedly to get a good seed. It's not always perfect
t5 is absolutely massive, I'm not sure if it's just the encoder part that is used here but it's easily 10gb+ even then in fp16
Wild guesses for top 3 hate-comments?
I made one hand that looks like a chipmunk
My text says noob instead of boob
You can't generate an elephant ant with the prompt "Small animal with big ears"
Simple words are usually fine, but the longer/more complex your text the worse your odds
What’s the percentage of more or less successful text gen’s
Back in 1989 was a great band named The The. For some reason SD3 text recalls that same band all the time!!!
yea thats awesome
Its main use is for text to appear coherent in images. But it was discovered in testing that it adds a little bit of extra intellect in how the images are generated (mostly with very large and complex prompts). Otherwise just using the 2 CLIP models on their own performs relatively close to having T5 enabled. You just can't generate text as consistently without T5. That is the main drawback. But if you don't care about that, you're more than fine to run SD3 without the T5 encoder.
Encoder-only fp16 T5-XXL is 10GiB yes
or 5GiB in fp8
if you have 16gigs of ram definitely don't use T5
6 GB VRAM not enough? My gf has a 1060 6 GB
Get a new gf
"gf"
uhh it might work if you have enough ram to pair with it
probably want fp8 weights for that
entirely depends on complexity of the text you're trying for
you'll be able to test for yourself and see soon here today
oh, so it's upgrading time
I guess 24gbs of vram and 96gbs of ram is overkill. Haha
What about 4070ti with 12gb vram? Will it be possible to offload T5 to system ram?
or buy her new GPU? I don't leave house much
ye lol definitely good for everything with that
I think transformers did something at one point that you don't have to use fp32 weights in cpu ram anymore so 16gb of ram should be enough to offload t5 (probably?)
how long does it take to upload a model btw
yes offloading is the standard thing to do, comfy/swarm does it by default.
4070ti should be more than fine to run SD3-Medium normally. T5 will work if you have enough sysram to hold it
As far as i know. Yes. You should be able to do it fine. As long as you have enough system ram.
you need 5-10 GiB of sysram for T5 depending on if fp8 or fp16
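(Editor's sketch of the memory arithmetic behind the 5-10 GiB figure, in Python. The parameter count is back-derived from the "10 GiB in fp16" number quoted in this chat, not taken from an official spec, and the 8 GiB reserved for the OS, UI and other models is a made-up placeholder.)

```python
GIB = 1024 ** 3

# Assumption: chosen so that fp16 weights come out at the 10 GiB quoted in chat.
T5_ENCODER_PARAMS = 5 * GIB


def weights_gib(n_params: int, bytes_per_param: int) -> float:
    """Size of the raw weights: parameter count times bytes per parameter."""
    return n_params * bytes_per_param / GIB


def fits_in_ram(weights: float, total_ram_gib: float, reserved_gib: float = 8.0) -> bool:
    """True if the weights fit next to a (hypothetical) reserve for everything else."""
    return weights + reserved_gib <= total_ram_gib


if __name__ == "__main__":
    fp16 = weights_gib(T5_ENCODER_PARAMS, 2)  # 2 bytes per parameter
    fp8 = weights_gib(T5_ENCODER_PARAMS, 1)   # 1 byte per parameter
    for ram in (16, 32):
        print(f"{ram} GiB RAM: fp16 fits={fits_in_ram(fp16, ram)}, fp8 fits={fits_in_ram(fp8, ram)}")
```

Under these assumptions fp16 T5 is a squeeze on 16 GiB of system RAM while fp8 leaves some room, which lines up with the advice given in the chat.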
No censors is gonna create so much more yt shorts ai slop 😭
yeah that sounds about right then
It must be love
huh? depends on your network speed?
is there a specific node for SD3 local on Comfy? Or just load it into an existing workflow?
what is stabilityai's network speed? 😄
Is TensorRT gonna be usable on ComfyUI?
12 GB VRAM minimum
Comfy updated the TensorRT repo for SD3 earlier this morning
"how long does it take to flip a switch" more likely 🤡
Would be nice if it wouldn't black out the image 4/5 times 🙂
is SD3 available now?
Ideogram handles text separately
I get an approx 2.5x speed up on SDXL with tensorrt.
I used TensorRT in A1111 4 months ago - took 6 minutes to make a very mediocre Porsche (which in the end looked like a Trabant!!!)
Oh hai!
it has some specific nodes needed - if you use Swarm it can autogenerate a valid workflow for you, if not a swarm user there will also be example SD3 workflows provided
But SD3 (at last) is a worthy competitor to ideogram
huh?
SD3 is already uploaded if that's not clear, it's just not live
man you guys really taking your sweet time right now
tensorrt is a gamechanger btw, literally went from 20 to 11s, and then to 4-6 with Hyper, AYS, automaticCFG and quality is awesome.
Hope it is gonna work as good with sd3
local download of SD3-Medium will be available later today
got it, I have Swarm. Just select the model as usual and it'll auto generate?
Ordering doordash and champagne
please stop spreading made up times
Yes, because it takes 6 minutes to build the tensorrt engine, once that's built it'll take much less time. Comfy's new tensorrt nodes save out the engine and can do dynamic engines with multiple resolutions. So I just made my model into a tensorrt engine and use that
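(Editor's sketch, in Python, of when a one-off engine build pays for itself. It combines numbers from two different people in this chat, the 6 minute build and the 20 s to 11 s per-image speedup, so treat the result as an illustration of the arithmetic, not a benchmark. The function name is made up.)

```python
import math


def break_even_images(build_seconds: float,
                      seconds_per_image_before: float,
                      seconds_per_image_after: float) -> int:
    """Number of images after which the engine build time has been won back."""
    saved_per_image = seconds_per_image_before - seconds_per_image_after
    if saved_per_image <= 0:
        raise ValueError("the engine must be faster per image for a break-even to exist")
    return math.ceil(build_seconds / saved_per_image)


if __name__ == "__main__":
    # 6 minutes to build, 20 s per image before, 11 s per image after.
    print(break_even_images(6 * 60, 20, 11))
```

Since the engine is saved to disk and reused, the build cost is only paid once per model and resolution range.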
can't wait anymore
that was posted by lykon earlier tho
there's a schedule it's just not public
I doubt it on day 1 maybe on day 101.
Are we talking scale of minutes or hours?
if he did that it was a meme or something idek
haha
it's not the correct time
The correct time will be when that 404 turns into a 200... 😉
Sorry, I thought it was true 😅
(or at least the links i saw posted earlier aren't, there's been multiple, maybe one of them is)
yea, not a problem
(if lykon did post the correct time ever i need to stab him)
doing god's work
(for legal reasons i don't mean that literally)
yep
Remember, like SAI, Hamlet was always procrastinating ... ye know, Release 2B or not 2B
in minecraft of course yes 
Why would I ever need to gen Hunter Biden?????
Why not tho?
y not?
fuck, you guys really got me there
SWARM - it takes some time "to drive ComfyUI blind!!!" 😄
Coming in to see anything about SD3 and greeted with Bingo cards. Cool
For legal reasons the best thing to do is to keep quiet until you have a lawyer. And that is what I advise you to do as well
@viral plaza what kind of model sampling shift do you recommend? I want to say you said something about staying in the 1-3 range yesterday, but I can't remember exactly.
Buy a 3DFX Voodoo card to run SD3.
This is genius
the default is 3
"is there an app?"
MJ is attempting an app?
I said multiple times I won't post the time because if we are 3 seconds late people will freak out
ahh okay, was just about to dig through the comfyui code to see. i saw the modelsamplingsd3 node defaulted to 3 but wasn't sure what the internal default was
I am pressing f5 button nonstop so no worries 🙂
idk offhand what happens in comfy if you miss that node
if you use swarm, that node is autoadded
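(Editor's sketch, in Python, of what a sampling "shift" of 3 does to the noise schedule. The formula below is the timestep shift as I understand it from the SD3 paper and ComfyUI's flow-model sampling code; treat it as an assumption rather than a copy of either implementation. The function name is made up.)

```python
def shift_sigma(sigma: float, shift: float = 3.0) -> float:
    """Remap a noise level in [0, 1] so more of the schedule is spent at high noise.

    Assumed formula: shift * sigma / (1 + (shift - 1) * sigma).
    shift = 1 leaves the schedule unchanged; larger values push
    intermediate sigmas toward 1 while keeping 0 and 1 fixed.
    """
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


if __name__ == "__main__":
    for s in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"sigma={s:.2f} -> shift 1: {shift_sigma(s, 1.0):.3f}, shift 3: {shift_sigma(s, 3.0):.3f}")
```

With shift 3, the midpoint of the schedule (0.5) maps to 0.75, which is why changing the value visibly affects composition and detail.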
Please don't kill HF. You'll be notified for sure when it goes public.
Do we win a prize if we are the first person to d/load SD3 weights? 🙂
3070 8gb vram love?
Will SD3 work on Oobabooga
as soon as available this chat will break down
you get to upload it on a torrent tracker then in case of a sudden rugpull
I'm all-a-tremble!!!
You can make infinite images locally.
The limit is about the images you sell or generate at a cost for users.
You could add an "...ish" to the time 😉
Today-ish
@lavish osprey sd3 got released?
i have a discord bot that bridges ooba for textgen and swarm for imagegen https://github.com/mcmonkey4eva/SimpleDiscordAIBot so for that setup yes
SD3-ish
I knew it!!
SD3 o'clock ish
Does that mean I can generate 6000 images and sell?
probably just has some default built in like with other models that have used similar things (think cascade was one?)
No, release canceled because some users are asleep. It wouldn't be fair for them
just saw comfyanon updated the node to default to 3 as well https://github.com/comfyanonymous/ComfyUI/commit/c8b5e08dc39171babb5d43f160cc04271591743e
brother clickbaited for updoots
im awake!
Joking if that wasn't clear.
I am going to have a quiet luncheon, and back to the mayhem later 🙂
The truth is, One Piece spoilers just dropped, so we are finishing reading them before publishing.
i'm awake, but i got 10 hours of work soon, can we postpone??
main question is how are we gonna train with sd3
What about this super cool question @viral plaza
drop it here
I connected my computer to smartphone's 5G internet for faster downloads.
bill? idk
By wanting it very very much, the purity of your heart will do the rest
2 hours this time surely
im not pure at heart
Where is this link?
2 hours.
Is training the model as complex as SD1.5 and XL?
Lykon, is there a need for Dreamshaper or is the model so full of possibilities that it's not needed?
Where is it?
Where can I see this?
3070 should run it fine, if you have low systemram you might need to not use T5
jokes apart. we will have to wait for new sd3 script right? im not seeing any new script for sd3 in any of the popular trainer. we did see early script before sdxl dropped. so they didnt get the access before or we dont need any new script for sd3?
so just to understand something, the encoders use system ram rather than vram?
I think it's a bit more complex depending on what your purpose is, like if you want to preserve text capabilities and prompt understanding.
Even realistic models are gonna have a hard time not ruining the current realism capabilities imo, like you can't really use synthetic images this time or you'll ruin realism (and some people don't have an eye for it, their brain can't tell the difference)
I still think Stability should release in-house finetunes and controlnets, expecting people to make stuff for your commercial model is kind of crazy
We had to scrap some training runs because maybe we had good results in an area, but we cooked text or stuff like that
Low vram setting on comfy runs t5 on ram/cpu. T5 is fine
Train tenc while finetuning is ok?
I really want the community to develop their own without making our same mistakes.
diffusers will probably be able to train right away, other tools you'll have to wait for yeah. They'll probably be rushing to impl it
that's why i said if you have low systemram
"reverse engineering from comfy update intensifies"
So 8gb vram and 32gb ram should suffice for t5/sd3 right?
Ah I see. My only worry is that I spend a minimum of 7 to 12 hours to train a character and style LoRA on SDXL. Never tried dreambooth, as someone said it would be weeks of training or so on a 4090. Just wondering whether it will be the same with SD3.
Will a fine tune guide and tools be available from you guys?
Low rank has less risk of ruining things
SD3 Soon™️
What is classified as low ram though. Would 16GB @ 5600 be sufficient?
They probably will but it seems weird to expect people to contribute to a commercial product out of the kindness of their hearts
if we made everything ourself and didn't want the community to do anything, we'd put it all behind an API and never release it and act like midjourney/openai do. and that'd suck for everyone
Release Soon™️?
- in 2hrs
I am rooting for SD3 though 🙏
based on what I learned here, yeah, but t5 will be pretty slow on cpu
you probably don't want to use t5 if you only have a 16gigs of ram. It'll technically fit but not at the same time as having anything else open so eh
Kindness of heart is very powerful. It's mostly the reason why we release open source models instead of going the openai route and make easy money without any competition
Will there be an improvement in training time? Will it work if I use an LLM to caption? Will a 3090 be enough?
Yep
Yes it's true the good will from the previous openRAIL releases could be considered
Even on a latest generation i9? How would one control using t5 on cpu vs using it on GPU?
it won't be too slow you're fine
it can just run on gpu and autooffload to sysram
Time depends on lr, compute, dataset, purpose of the training.
Any type of caption should work well, tags might make the model less intelligent. 3090 is definitely enough if you preprocess captions.
at worst it's like an extra second or two of processing at the start of an image
Is it still 2 hours?
Always
it's always 2 hours forever until suddenly it's 0 hours
Imagine in two hours someone says it’s releasing in two hours.
lmao i already did that earlier with the real time to send to a coworker
haha
dunno why, but this old ass game popped in my mind. god i used to love that game
it's already 13th somewhere in the world 😔
True™️
When will SD3 be released?
it was great until i realized i was absolutely horrible at it
13th already on New Zealand, scam 😔
2 Hours
one of my top childhood memories when we got it for christmas and the whole family started playing and guessing
SD3, Stability AI based in London, I'm guessing 3pm London time drop. Get it, cause it's SD3 :p
I wonder if learning2optimize could work well for finetuning specific image models. But i suspect training a model free l2o model is still very much infeasible.
23:59 + 1/t
Please explain a bit about preprocess captioning.
i saw some people put out a recompiled version of it recently, like with native code execution, and it has all kinds of bells and whistles
so glad to see alex, lykon and comfy being active in the last two days
that sounds even more complicated than the game was back then 😄
Precompute embeddings.
So you don't need the tes in vram
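A minimal sketch of what "precompute embeddings" could look like, assuming a hypothetical `encode_caption` stand-in for the real text encoders (the helper names and the JSON cache layout are made up for illustration, not any trainer's actual format). The idea is just that each caption gets encoded once, cached on disk, and the training loop reads the cache, so the text encoders don't have to sit in VRAM while training:

```python
import hashlib
import json
import tempfile
from pathlib import Path


def encode_caption(caption: str) -> list[float]:
    """Hypothetical stand-in for a real text encoder (CLIP / T5).

    Returns a small deterministic vector so the sketch runs without a GPU.
    """
    digest = hashlib.sha256(caption.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:8]]


def cache_path(cache_dir: Path, caption: str) -> Path:
    # One file per caption, keyed by a hash of the caption text.
    key = hashlib.sha256(caption.encode("utf-8")).hexdigest()[:16]
    return cache_dir / f"{key}.json"


def precompute(captions: list[str], cache_dir: Path) -> int:
    """Encode every caption once and store it. Returns how many new files were written."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    written = 0
    for caption in captions:
        path = cache_path(cache_dir, caption)
        if not path.exists():
            path.write_text(json.dumps(encode_caption(caption)))
            written += 1
    return written


def load_embedding(cache_dir: Path, caption: str) -> list[float]:
    # The training loop reads from disk; no encoder call is needed here.
    return json.loads(cache_path(cache_dir, caption).read_text())


captions = ["a cat wizard", "a dancing protogen"]
cache_dir = Path(tempfile.mkdtemp()) / "embeds"
first_pass = precompute(captions, cache_dir)   # encodes both captions
second_pass = precompute(captions, cache_dir)  # everything is already cached
```

In a real setup the encoder would run once up front (possibly on CPU) and could then be unloaded before the actual training starts.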
my additional hope for SD3 besides what I've already known to have improved (prompt adherence, base model quality without damaging variety between seeds and faces) is structural anatomy or whatever, like stuff feeling more like it's there within that space and it feels correct (the car image on reddit)
I think this is wrong, but with all the timezones funkery I'm not sure how much it's wrong
is sd3 github out yet xD
Imagine, Stable Diffusion 3 is not just an improved version of an AI model, but a secret plan by an elite group of scientists and tech gurus. This group, let's call them "The Algorithm Masters," has decided to revolutionize the world with a new technology that will completely transform human understanding of creativity.
Why 42 minutes? Simple, the number 42 is known in pop culture as the answer to the ultimate question of life, the universe, and everything, thanks to Douglas Adams' "The Hitchhiker's Guide to the Galaxy." This number is no coincidence. The Algorithm Masters believe that releasing the new version exactly 42 minutes after a certain time (e.g., after a significant event or a secret meeting) has the perfect cosmic resonance to unlock the technology's full potential.
But it goes even deeper. These 42 minutes are also meant to send a symbolic message to all initiates that this is the beginning of a new era – an era where artificial intelligence and human creativity merge in an unprecedented way. Only those who understand the true potential of the number 42 will be able to harness the full capabilities of Stable Diffusion 3.
Of course, the rest of the world will think this is just a clever marketing gimmick, but the truth is much more mysterious and profound. The Algorithm Masters have been working in secret for a long time to develop this technology and prepare the world for this moment. And in exactly 42 minutes, everything will change.
The clock is ticking...
Ok then give us your exact release time and your timezone. We will calculate it.
the hell is that x3
No vram?
@viral plaza would stab me
There are too many letters. I can't handle it.🫠
I don't know anything about the architecture but I think SD3 will be great just based off the captions, those old Laion captions were so bad in hindsight 😂
You saw through me 😅
meow
human brain can't see more than 75 tokens
How long until the 1.5 community stops with their schizo prompting and actually write for what they wanna see
Is this the same as the .npz files we get with kohya?
when we get ELLA uncensored 👍
25.2 hours ago you said 1.8 hours from now. Thanks for taking one for the team 😆
oh they sucked in foresight at the time too there just wasn't a way to fix it before the recent rise of VLLMs
yeah well sd3 has ella included basically so we good
lucky for me, that was wrong lol
(Not)
yes u only need to create a good finetune
copium

If you're anything like me with deadlines, you'll release it at 11:59:59 pm
I'm sure it's gonna be that time in some timezone
if it's anything like the captioning oai did for dalle it should be pretty interesting
Times Square NYC - the Ball is falling - the crowd is counting-down ... 1,000,000,000 and 1 - 1,000,000,000 - 999,999,999 - 999,999,998 ...
Man I can't agree more everytime I see a prompt start with 1girl solo
Haha... I once did that with a grant deadline... Submitted at 5:00:59 pm and had to call the grant agency to confirm that counted as meeting the 5pm deadline and they weren't rounding up
It became a habit for me

Get some help my man
always works tho
I dont know when is the last time I use CLIP prompting
OpenAI wonked their version of it a bit - they trained only on generated captions and broke their model's ability to understand natural language outside of the caption structure (thus the need for an LLM in front)
well if its a NAI merge you do need those words to trigger it
True
SD3 has mixed raw captions and cogvlm captions to make it more adaptable to however people end up writing prompts
wasn't it like 95% synthetic?
my minuscule test prompt list I made ever since February that has been growing ever since
average 1.5 negative
effectively only synthetic
the way they made it sound in their 'paper' made me think it was a success
LMAO
average 1.5 negative is as long as an average SD3 positive prompt
GTA
I found that if you just copy paste that entire thing, it gives great outputs.
there are only 30008127387 different subjects in there 😅
(((ugly))), ((low quality)), (((((((((bad hands))))))))), ((blurry)), sfw, small booba
I got a great image once by accidentally using a multimillion token prompt
They also kind of botched it by allowing LLM interpretations ("creates a playful and unique scene") rather than just having it factually describe the scene
yeah I hate those
even so dalle outputs are really good when it comes to prompt following so I'm curious to see how sd3 will do outside of cherry-picked cases
Based on his comments that people will be unjustifiably mad if it's three seconds late, we know that it will be 11:58:56 or earlier
1.5 still better than sdxl in some cases, inpainting for example
Does sd3 understand values such as " :, ; , "?
(((deformed,bad hands,too many fingers,three legs,bad eyes,6 fingers,massive toes))):2.5)
I pasted 5 different prompts in there last night and it did well. 🙂
are the weights going to release this morning or later in the afternoon?
Won't it break the generation?
don't know about that, ever since fooocus SDXL inpainting stuff came out, it got really good, but I haven't tested extensively
they're keeping it a surprise lmao. let's go live our lives in the meantime
aw
Is this countdown real?🤨
No, i choose insanity
It's in 2 hours
(((((((1 finger)))),(((((2 fingers)))),(((((3 fingers)))),(((((4 fingers)))),(((((6 fingers)))),(((((7 fingers)))),(((((8 fingers)))),(((((9 fingers)))),(((((10 fingers)))),(((((11 fingers)))),(((((12 fingers)))),(((((13 fingers)))),(((((14 fingers)))):25))
fooocus has a good builtin inpainting model (something like a inpainting controlnet) but it still cannot make seamless inpainting like 1.5 can
Imagine if there was a single model that could produce text, images, video, and audio within a single neural network. Crazy science fiction
(((((((((((((masterpiece))))))))))))
I think oai is well on their way there with gpt4o
It does because I was able to generate text with them, even if 8b did it more reliably than 2b.
example:
me right now
that's then different from what they describe in their paper
(((((best quality,big quality,high quality,ultra mega hyper 8k cinematic wallpaper official art))))
It is not one model, but several models that work together
comfy will ignore that, it only has the (text:weight) notation, like (something:1.2)
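For anyone wondering what the explicit `(text:weight)` form means in practice, here's a toy parser. To be clear, this is only a sketch and not ComfyUI's actual parser: `parse_weights` and the regex are made up for illustration, and it only handles the explicit-weight form with a plain decimal number, giving everything else a weight of 1.0.

```python
import re

# Matches "(text:weight)" where weight is a plain non-negative decimal number.
WEIGHTED = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")


def parse_weights(prompt: str) -> list[tuple[str, float]]:
    """Split a prompt into (text, weight) chunks; unweighted text gets 1.0."""
    chunks = []
    pos = 0
    for match in WEIGHTED.finditer(prompt):
        # Text between the previous match and this one is unweighted.
        plain = prompt[pos:match.start()].strip(" ,")
        if plain:
            chunks.append((plain, 1.0))
        chunks.append((match.group(1).strip(), float(match.group(2))))
        pos = match.end()
    # Whatever is left after the last match is unweighted too.
    tail = prompt[pos:].strip(" ,")
    if tail:
        chunks.append((tail, 1.0))
    return chunks


parsed = parse_weights("a photo of a cat, (something:1.2), sharp focus")
```

How the resulting weights get applied to the embeddings is up to the frontend, so treat the output as a way to read the notation and nothing more.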
Mrbean_up, allthecats_up, 6fingers_up
is there difference between '.' and ',' for sd3?

(((((((((((((((9score_up)))))))))))))))), (1score_up:-99999), 1 thumb, 6 fingers, anthrofurry, furry, human,
the text and audio part at least are a single model from what they said but it's not like there's any solid evidence for or against
will sd3 understand danbooru tags?

On a rainy day early morning, road filled with hazy fog and a paddy field with a scarecrow.a cop finds a corpse with blood splatter in the shirt and shocked with a confused and investigative mindset :
it includes CLIP and CLIP does, so...
thank you
CLIP G and CLIP L right?
Caf, half_caf, soy, diesel
Ahhhh, now I understand why the hands didn't come out with 5 fingers
and T5 xxl, yes
like, point could mean that you describing a new thing
I write all my sd3 prompts in Binary.
00101000 00101000 00101000 00101000 00101000 00101000 00101000 00101000 00101000 00101000 00101000 00101000 00101000 00101000 00101000 00111001 01110011 01100011 01101111 01110010 01100101 01011111 01110101 01110000 00101001 00101001 00101001 00101001 00101001 00101001 00101001 00101001 00101001 00101001 00101001 00101001 00101001 00101001 00101001 00101001 00101100 00100000 00101000 00110001 01110011 01100011 01101111 01110010 01100101 01011111 01110101 01110000 00111010 00101101 00111001 00111001 00111001 00111001 00111001 00101001 00101100 00100000 00110001 00100000 01110100 01101000 01110101 01101101 01100010 00101100 00100000 00110110 00100000 01100110 01101001 01101110 01100111 01100101 01110010 01110011 00101100 00100000 01100001 01101110 01110100 01101000 01110010 01101111 01100110 01110101 01110010 01110010 01111001 00101100 00100000 01100110 01110101 01110010 01110010 01111001 00101100 00100000 01101000 01110101 01101101 01100001 01101110 00101100
? no
Spam filter: bypassed
lmao
maybe but probably not much
Don't forget the mile long list of synonyms for deformed and bad quality
that dog that I HATE
rookie. i run all my prompts through AES-256 encryption
okay, thanks!
I made a node that outputs the tokenized prompt so I can optimize the fuck out of commas
Any idea on the model file size?
do not worry SD3 doesnt have that dog,it has a pony instead 🐴
that pony that I HATE
What about difficult movements and positions, such as gymnastics or acrobatics? Is there a difference between the two models?
T5 XXL means Natural language?
only image and video left now
in their paper they used a mix of synthetic and original captions
Which is better to use following your tests and comparisons? Sd3 medium or Stable Image Ultra (service)
It should use this
oh
COZE was a lifesaver, but even there, the filters have been tightened.
they did say that 4o can produce images without dalle but idk if that works in any of their demos for now
I hope dalle 4 never releases because it will immediately make SD3 feel outdated
Is SD3 weights released
1 and a half hours
cmon the models are still private so not yet
"Today... on Kitchen Nightmare"
sd 3 released yet?
@lavish osprey man i need ur response..
Go back to pony guys, it ain’t out yet
no more Wega pls 😔
....
obamber smoker
01:58:35 approx
Dalle 3 will lose its purpose while Midjourney will still have its ✨ Aesthetics ✨
so do we have any idea what we should be doing with G and L? same thing as with SDXL? as in, rumors that you should be using a different style for each, but the reality being you should just give up and use the same prompt for both?
Will it create some retro animation images? I love the early 90s style
It was 2 hours an hour ago. It'll be 2 hours an hour from now.
obviously 8b gonna be better, but dynamic is kinda hard still, I think lora or specific finetune would be needed for that
SD3 makes Video Games and Movies really good
It's always 2 hours away
there will be examples in the repo
Looks like some people have a kind of syndrome That Always thinks SD3 is heavily censored
hmm how come midjourney etc end? I mean cant they use the sd3 and make a comeback again?
It will be tho
GPT-4o definitely only works with DALL-E as of now. Image on the left is what they demoed. Image on the right is what I got trying their own prompt through GPT-4o now.
yo did my message send
SD3 won’t end either
just disappeared i guess
People use MJ because it’s easier
yeah I know it uses dalle internally, it's just what they themselves said before
i wonder if mjers will train on sd3
Hypothetically, can Model 2b do this after fine-tuning?
I don't really trust text capabilities of models behind API. They might do what Ideogram does, you never know if it's actually the same model handling text or if it's a pipeline.
Other way around. SD3 was probably trained on MJ
It’s like how all LLMS train on GPT4
me fr cs im so tired of ppl saying its 2 hours away
I think you can do anything if you finetune on it 😁
Would like to explain this?
How does Ideogram do it?
can you believe it guys?
no
🤔
This is crazy
SD3! JUST TWO WEEKS AWAY!
today is not todaing yet
Well they have an entire safety page if you want to read that. And it depends on what censored means for you. Because it can’t do porn which some might say it means it’s censored
is that a lora or can base output waifus? 
nuh uhh
clicked anyway D:
in 2 weeks
SD3 just 7200 seconds away!
that's non-existent link made by someone else
seriously that guy tried to do "first" too early
sd3 next month guys!!!
Sd3 medium?!
it's not made by someone else, it's on the official FAQ page. But the model just isn't set to public yet...
🔥🔥
i can't even run it today because i am on my steam deck 😭
APi
((((((((((((((((9score_up))))))))))))))))), (1score_up: -99999), 1 thumb, 6 fingers, anthrofurry, furry, human,
Is SD3 out?
no
kratos son aged up
what was his name again
soon
T^T
Boy
BOY
serious ?
What is the model in the API?


that hugging face link on FAQ goes there.
no i am trolling xD
as most open models nowadays
Idk
real confirmed
thats me
try those on Ultra
they got a lil freaky 👅👅
that's base.
sorry guys sd3 got canceled
It was weeks ago. I dont have the prompt
Does SD3 have the freaky pass?
lets get freaky 👅
You stopped my heart for a split second
🤣
xD
do not ever believe anything on the internet
wtf is happening
surpasses dalle 3
Well I'll start with you first and I won't believe what you just said 🤣
that's pretty good for base 👍
lol
no hoodie, delete this.
Someone’s gonna post the actual link soon and y’all won’t believe them
the world is gonna explode within 1:12 hours
actual link is already out
stability dropping all their samples 
current model failed the second shot prompt as well
That is fake
well I did type shirt, its the lykon from the dimension that likes pineapple pizza
(I'm kidding I love it)
the link is BROKEN! 😦 i wanted the 100b edition model
@lavish osprey If someone trains SD3 2B on another 1 million or 1 billion images, will that make it SD3 3B or something?
And what was the prompt for the first picture?

It's the end, bye, oh wait, it's 1.45 here, I'm safe
2T weight please, it'll run on the 4GB vram, right? right??
no. You can't change param numbers with dataset size
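Quick illustration of why that is, using a toy fully connected network instead of SD3 (the layer sizes are made up, and `mlp_param_count` is just a helper for this sketch). The parameter count is computed from the architecture alone; dataset size isn't even an input:

```python
def mlp_param_count(layer_sizes: list[int]) -> int:
    """Weights plus biases for a plain fully connected network."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


arch = [64, 256, 256, 10]  # hypothetical toy architecture

# Same architecture, no matter whether it sees 1 thousand or 1 billion samples.
params_small_dataset = mlp_param_count(arch)
params_huge_dataset = mlp_param_count(arch)

# Only changing the architecture (here: wider hidden layers) changes the count.
params_wider = mlp_param_count([64, 512, 512, 10])
```

More data changes the values of the weights, not how many of them there are.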
Are we there yet?


about another 2 hours for it
((((50000 steps)))))
i swear i was so close to clicking on one of them until i saw that racist edition

I like you
lololol
💀
oki sorry. but we are so close x3
Here’s the actual SD3 link - https://en.m.wikipedia.org/wiki/Trollface
wait
IT EMBEDDED
lemme do something
NO
embedded generations
Nooo
it embedded
we desperately need this
It embedded noo
yes plz
```python
import random
import time

# Generate a random delay in seconds (up to 12 hours)
delay = random.randint(0, 12 * 60 * 60)
# Wait for the random delay
time.sleep(delay)
# Set the variable after the delay
world_exploded = "yes"
print("The world has exploded!")
```
So it will be over feeding a chicken and making it bloat
i hope the 2b version got lots of furry data in its training
SD3 now has built in blockchain, no need to add credits, pay as you use
Is there still two hours left? Seriously? Has time stopped moving forward?
i clicked it, it does not work
it feels like it has, must be true.
cant wait for Pony-Coin 🪙
yes, and its going backwards
sd3 would not exist without furries
I'm with you
nooo
ah yes, huggingface
LOLOLOL
You did it 
How big is the model?
Zucc did say their 8b model still kept improving even as they kept training on their dataset so it definitely takes a lot of time to hit the bloat point
2B
I believe some people did this with sdxl (or was it 1.5) by duplicating some layers and only training those layers
VERY
F for those who didn't sleep waiting
2 big as your mother
yes but it's a like 95% or 99% mix, ie basically just synthetic only
I mean storage size
4gb model + vae but text encoders are gigantic too
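The ~4 GB figure lines up with simple arithmetic, assuming the 2B weights are stored at 16 bits each (that precision is an assumption, nobody here stated it). A rough sketch that ignores file metadata, the VAE, and the text encoders:

```python
def checkpoint_size_gb(num_params: float, bytes_per_param: int) -> float:
    """Rough weight-file size in decimal gigabytes (weights only, no metadata)."""
    return num_params * bytes_per_param / 1e9


fp16_gb = checkpoint_size_gb(2e9, 2)  # 2B params at 16 bits per weight
fp32_gb = checkpoint_size_gb(2e9, 4)  # same params at 32 bits per weight
```

The same function applies to the text encoders if you plug in their parameter counts, which is why they add so much on top.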
just imagine if the model released 5 minutes from now
No, please... i'm heading to work!
mogu
It seems that it will work with sd1.5 specifications
If only SD3 could generate stuff like this
my pessimistic prediction is somewhere around 10 hours from now cuz lykon said "less than 24 hours" near that time yesterday
YES, WOULD
u just need to give stability 1b bucks and they could make a sora video model. but it would be too big and nobody could ever finetune it
what were the token limits for sd3 again? is it still native 75?
75 and 512 somehow
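```python
def check_sd3_release(message):
    # Only one question gets a real answer
    if message.lower() == "is sd3 out yet?":
        return "2 hours away"
    else:
        return "I can only respond to the question 'Is SD3 out yet?'"


# Keep answering until the user kills the script
while True:
    user_input = input("Enter your message: ")
    response = check_sd3_release(user_input)
    print(response)
```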
Hey guys, just run this in python. It's basically the same as asking here 😛
I hope t5 isn't packed in the same checkpoint with all the other stuff or it will take ages to load
twosday
That's not how it works. Models don't store images
you can train both 2b and 8b on infinite data, the model will just learn better or worse depending on your data and your settings
How would they know? I'm not sure I will

Then do it
lol
Give us AGI
wtf is that
stop scaring me
What's the first thing yall are generating
photorealistic stuff I guess
bottles
definitely not furry
Cat wizard
AGI! 10 YEARS AWAY!
no matter how much shit you throw at it, an image generation model will never be AGI lol.
Oh ok, Sounds interesting.
you'll just make it smarter or stupider.
You won’t know until you try!
i had prompts in mind

edgy stuff or cool prompt adherence test
2 hours we are so close
What was the result? Did they break the model? Did the model hallucinate?
1 hour actually
Never mind I will take up Photography again 
30mins in fact
a dancing protogen
15s fact
14s
you're both wrong.

13s
reee
Ah I got you. Learning more today. Thank you for answering me patiently ❤️
Oh shit it's the man himself
onii chan senpai~ >w<
trust me
bomb launching
Yessir the CEO of stable diffusion himself
Fr…?
1 minute!
Lykon didn't say the 13s was wrong..
I don't remember which one it was, some vega model maybe, or one of those chinese ones(?), but it still functioned afaik, better or worse is subjective, it's similar to how people duplicate llm layers
im gonna touch you
that means its right!
OMGOMGOMG 1 SECOND LEFT
dinner first 

2 seconds!!
holyf uck

