#🆕|sd3
1 messages · Page 1 of 1 (latest)
yeah i'm using it that way too, its really better than gpt4 in terms of descriptions
Retro urban photography, vibrant neon sign reading 'Torcello's Art!' glowing in red and yellow against a clear blue sky, classic motel buildings with red roofs in the background, palm trees providing contrast, sharp shadows indicating midday sun, desolate street emphasizing a nostalgic feel, 70s color grading for a vintage aesthetic.
So it was able to read the text of the larger part, although not the clip drop piece.
do you have paid gpt4o?
Yeah I'm using it through the paid api endpoint.
kek
Depends on how trainable it'll be. If it can pick up concepts way better than XL etc.
I have different intentions
better than sd3 is debatable, i've ran some tests and it seems to be really similar, finetuned sd3 will be at the same level or even better
how do i create image'
I prefer watching videos, but now I'm tired of watching pictures.
Good things are also used.You won't see bad things anymore.Poor things can no longer be seen, who told them to send a video and send 1080P P.Such a high-definition video is still a long video, which makes me feel bad when I see the pictures now.
Are you all Asians? It is reasonable to say that new york time is only now.It's 4:00 AM
PM
We are all Asians.
All of us.
have you tried the img2img thingy where you can turn imgs to carcatures and upload a character thatcan be used to tell consistent stories
i think you have to use the full app for that. i don't have plus, i just have the api keys so i probably can't use that.
ohh ok, mahn i wanna try that so bad
ella / pixart / sd3
In a hyper-realistic, cinematic style with a shallow depth of field and warm, golden lighting, a massively oversized CPU and motherboard sprawl before us, their intricate circuits and components gleaming amidst a sea of tiny, furry creatures - rabbits, mice, and squirrels - that swim through a flood of glowing, iridescent "memory leaks" like a digital tidal wave, as shocked and disbelieving onlookers, their faces illuminated only by the soft glow of nearby diagnostics screens, stumble backward in awe and horror.
ella vs sd3
so this is with a much shortened prompt. maybe that's why sd3 did so much better.
Gigantic cybernetic squirrel with glowing eyes emerging from a mechanical gateway flanked by screens displaying underwater scenes, with a crowd of people observing, amidst intertwining wires, in soft ambient light.
SD3 definitely prefer the prompts short, the longer the prompt the loser the interpretation it seems
i had gpt4o rewrite the prompt and now sd3 is way better.
Yes
Sd3, doesn't, need, this, too
And every single type of camera settings in the prompt x3
Makes me go insane
i used to be telling llama "compress" but "bring down to its essence" works better,, now i need to figure out how to make my system prompt do that by default without filling 1k tokens with examples 🙂
I've been having pretty good look with brief and terse and limit to 5 or 10 descriptive elements. It's hit or miss.
https://fxtwitter.com/gdb/status/1790869434174746805 do you think that gpt4o generates vae(image) , audio, and text tokes with the llm? all in one thing?
Festival man here’s ur prompt in chatgpt4o
the sd3 one is much more atmospheric, this squirrel doesn't look very cybernetic and the brightness is very offputting
allegedly it still asks dall-e for it and they haven't enabled the direct gpt-4o images in the UI yet
same for voice and whisper
I see what you mean!
I have found GTP 4o is a great prompt interrogator, if you have an image you want to reproduce and don't have a prompt for it , it gets so close when you re-generate it with Dall.e 3 somehow.
I think 4o might have the image to tokens part in already, just not the generate image from it part in the UI yet
but not sure, they've been very unclear about it
Gpt4o really is incredible as a vision model. I put in this stuffed animal and dog image from YouTube, got this prompt, and made the image on the right with sd3. Amazing stuff. The prompt: A vibrant, illustrative style with detailed textures and warm lighting shows a golden retriever with soft, expressive eyes and velvety fur, being petted while a fluffy stuffed tiger with bold black stripes is held nearby, office setting with people at desks, modern chairs, computer screens, cables under desks, carpeted floor, distant shelving with various objects, colorful post-it notes on partition walls, and soft ambient indoor lighting enhancing the scene's warmth.
Yeah I was trying to recreate this without a prompt and got the 2nd image, which is amazing for something so abstract!
I would even say the new one is better than the original
is it actually better at giving you a similar prompt to generate a similar image than say using an ipadapter instead
another example, but I changed the prompt to add the arrows.
Hah I was about to say and then you can change it up easily but you're way ahead of me.
i'm about to test this using movie screenshots to see really how good it is at description
Hyper-realistic, vibrant, wide-angle shot of a massive, chaotic desert dystopian vehicle convoy with a fire-spewing guitarist atop a towering speaker-laden truck, surrounded by armored cars and motorcycles kicking up clouds of sand, under the harsh midday sun casting sharp shadows and flaring highlights.
One on right is Ella. Sd3 didn't do well with it but that just needs fine tuning.
It seems to be trained on movies as well.
crazzy, can i see sd3 version
ohh wow i never thought of doing this too. incredible
Sdv made a neat animation of it though.
theyve said multiple times they will
As expected
Or they save it for a potential buyer that will own kt
It
And then offer the product with a price tag
yeah sd not worth a price tag
Exactly this. There's no way they're going to release it when a potential buyer will probably want to monetize it. Sigh.
only real selling point of sd is being able to train anything on it xD
its still confusing to me how stability didnt even try to have a leading ai gen site like leonardo nd playground. its mind boogling to me. they should be leading in that space right now.
if monetize sd just become a worse midjourney xD
I have exactly the same suspicions
The problem is that as awesome as sd3 is for us, it's far inferior to almost every other big commercial image creator out there. With these new announcements by meta, gpt4o, imagen from Google, they're already better than sd3.
not only at all, the tooling around it as well
So where's the value in monetizing it? Sigh getting bigger.
literally none of them are better if you want to do complex stuff
for pure text to image sure
Dude, did you see the new demos for gpt4o? It's an another planet compared to sd3
meta got an nnouncement too? gaddamn when?
sure, and you cant generate say a gif with my face in it
When they released llama3, they also released their image creator. Does lightning fast amazingly prompt following images and even does short animations of them. Only problem is that it's highly censored.
dalle3 actually gets the job done 80% of the time, on the first generation attempt.
be it something for work, something for marketing, something for dnd, for a one-time-use meme, or background assets when i design printable sheet
llama3 is great i enjoy it
Gpt4o can absolutely do that.
sdxl + loras usually take me like 10~30min, to get the same value or a bit better than dalle3 in seconds
very true, imagetoimge nd model finetuning is sd's biggest leg up. i cant imagine what this img gen space would've been without them honestly
tried sd3 via the api... and uff... I wept
ohh yeah i saw that, i thought you meant they announced sthng like yesterday or sthng.
what demo have they shown that can do that? or most stuff you use say controlnets for
I really really hope the final model is leagues better
same for a bunch of other finetuners I know. they also all tried sd3 by buying like 10$ of credits... and our experiences were sadly all the same 😦
Have to get to work, but have a look at the demos they put out in the last day or so. Full image upload with manipulation
what I really hope is that the tooling, especially controlnets are better for sd3, it's so sad that sdxl has worse tooling than 1.5
<<<Splutter!!!>>>
'image' yes, which is why I said gif
Googles new stuff can do sora like videos. You think it can't make gifs? A lot has changed in the last 2 days.
I think it can make gifs... but not with my face on them
which is why I said 'say a gif with my face in it'
none of those so far look like they come with the kind of control and tooling we have with SD for complex workflows
also google's video model looked kind of bad? the videos weren't that great
not that I could signup anyway, Europe isn't allowed to
Unfortunately, for now, open source is lagging behind, and at the moment, there are no circumstances that could change that.
Cash will win this race!
I've been with software companies whose original business model was "free updates for life!" And it was great - while it lasted. Topazlabs were then hit by the development of AI - so had to renege of the F4L business model - lost a lot of goodwill, and customers; but they had to do it! They're still trading!
FLStudio also does free-upgrades-for-life - but their code is so old! They do not have the income resources to afford a complete rewrite; but I am grateful for over 21 years of free upgrades!
I feel the Stability AI Community should Crowdfund SAI: we'd all be stakeholders, ensuring a continuously affordable product ...
Now that I think about your request, you're asking for it to make deepfakes which they'll never do.
You could do that face swap locally though and then have them animate it
More than that
Although Midjourney is not in the best spot either
But they are keeping themselves above the water for now
But they also got millions of people and some companies paying subscription to them
And they dont have to deal with other expenses in comparison to SAI
Unrealistic
And also not enough
If you are at a dept of 100 million dollars, have a current year quarter bill of 30 million while having a revenue of 5 million...that aint gonna work out
And the reaction in the community says it all as well
Yeah I'm seeing more articles and reactions today on Twitter. It's really bad.
its almost like it was predestined to end this way
basically the game was rigged from the start
it was borrowed time
and the lawsuits are not to be ignored either
i mean i know a lot of people here laughed about the lawsuits against Stability AI, Midjourney and DeviantArt, but those can have big consequences even if the lawsuits dont end up very satisfying for the people that did sue
Do I believe stability would win the lawsuits? Most likely. Do I believe they have the money to fight it properly, doubtful.
open ended book as of now
but the lawsuits might have had consequences already
think of the customers and especially companies that refuse to become customers because of the risks
"who cares about companies, we care about our OS community here!" thats what some ppl think and this is such a stupid one as well
like what companies
right i remember when nvidia and intel invested in SAI
they must be quackin on their boots rn
Nvidia and Intel arent its customers tho, how much and what did they invest
intel like 20million
ah okay, that was the one that was mostly computing power they sent
forgot about Nvidia in detail
but thats one time investment simply isnt enough. You need actual long term partners and customers
yes and SAI couldnt get them because they released the weights for free + their models are not very user friendly, midjourney and openai prob have some sort of software to help big companies generate stuff easily
well yeah they want to get investment back
Midjourney doesnt offer more than their Discord stuff + recently also their web UI. OpenAI on the other hand does more, yeah
OpenAI, Google and Adobe
and Microsoft
Not really. I don't see why open-weights with hosted services and commercial restrictions don't work. But seeing what SAI has produced, the llms, the sd3d thingy, the audio, it never caught on, and competitors that did specialize outperformed each. Can't help but feel SAI just squandered their resources, look what pixart put out with minimal resources but by focusing on one thing only, or the recent chinese PoCs. Apart from that all recent news screams mismanagement and frivolity, it might have been the attitude that got us stable diffusion, but now it's what makes investors walk away. Over promise/hype under deliver might get you attention, but you do antagonize your investors really really badly when you can't deliver.
yeah but just look at the reaction of people whenever pricing is brought up
a bunch of people would rather watch SAI die as long as SD3 gets released than donating money or paying for service to keep the development going etc.
although its unrealistic anyway in this case
That's a lot of Reddit too. I've raced to do the sd3 api etc because I want to support them where possible. Even people like datavoid on Twitter who heavily uses stability products hasn't hesitated to say #notsd3 and down with stability when they mentioned that sd3 would be downloadable with just an email sign up. That was a step too far apparently. Sigh.
That's just shouting on social media, another thing altogether.
Well, you find out who your friends are when you're not in good shape. If the people who are only where they are today because of stability immediately go to abandon them....
Though I myself would never just donate (in the sense of gift) money to SAI as is either. If it goes that way it should be a proper non profit with a solid governance structure.
selfishness
people get one finger, want the whole hand but refuse to give anything back
profit amongst all
and then they shit around against corporations how greedy those are lol
the irony
i said often enough, "the people" arent better than corporations (besides that its not all black and white anyway)
Yeah true, that's madness. I just meant the "popular opinion" of random reddit users
It's ironic, the main use for SD3s text capabilities is memes about SD3/SAI 🤡
Gpt4o text capabilities is way more advance than others. I think it might used some text planing pipeline to allocate the text area.
GPT-4o vision is from other planet
Agree, like an alien tech
"One model to rule them all" great, but how can opensource ever catch up to that, how can you even run inference such a model locally. Yet it seems the way forward.
unless the production cost gets lower i dont see that happening
no, shrek is the main use for sd3
What they showed is indeed from another level xD
the open ai logo with text in it, I'd believe it was really generating text in image and not placing with some technique
This example I mean
The closest thing I know is this. https://huggingface.co/spaces/modelscope/AnyText
It is like a controlnet to guide the generation
well thankfully SD3 will be either released or leaked anyhow
only ethical concern is that many people will lose their jobs
Where can a donation be made?
By simply paying for a membership, even if you don't actually need it. I'm sure if you contact them, they likely have a means for just donating to them as well.
contact them here on the server?
i wouldn't count on a leak
I would
that's their most valuable IP (as far as we know) and any buyer would want that carefully guarded
there's a reason we don't have weights for dalle or know what the MJ pipeline is
Nosumers unite.
Sell it to X lol
So far, do you guys like SD3 or MJ v6?
But then again, those companies didn't outsource their models to other companies
And are not on the verge of bankruptcy
I do agree though that it still may not be leaked due to legal issues
Dont use either of them anymore but if i had to choose it would be MJ i guess
most ppl aren't going to be willing to risk their careers for an AI image model leak
Why you don’t use any of them anymore? 🤔
I feel like the only reason ppl believe that is because of NovelAI, but the only reason that leaked was because 4chan managed to crack their GitHub accounts (aka, figured out that the password was "password")
One doesnt fit my workflow and current situation amongst all, the other one i dont want to pay for
oh ok
Ig also some of the devs said that as well, but again that's prob basing it off NovelAI's leak
I would have been genuinely curios tho how SD3 compares to others and to what i use
SD3 is claimed to be better than Midjourney v6, but I haven’t seen much comparison done.
Would have to see that and more importantly comparison to DALL E
I’m assuming you use DALL E?
People complained that in reality it didnt turn out that well
To supplement Adobe Firefly/Photoshop
Or vice versa depending on how you look at it and usecase
Oof
And no i dont pay for D3 ^^
That’s a good idea
SDXL 0.9 also leaked early.
Yeah and the same approach i have with creative software and workflow i have in doing art and gamedev
SD3 can be aesthetic, but nowhere near the quality of midjourney
prompt adherence though 
MJ is really good. It can't do what sd3 can do, but what it does do, it does very well. I use all of it. 🙂 I usually tend towards fhe stuff that doesn't censor what I ask for, which isn't that edgy, just that many of the services are overly tight
I feel like anything sd3 can't do is just because it needs community finetunes, not that the engine can't do it
🤔
you guys see what 4o image gen can do? it is wild. can even make its own fonts. not release just yet though
Well in art professionals generally have a whole pipeline and ecosystem of tools they use, rarely do you find people thst work on one single package
SD3 is still the best in text, knowledge and prompt adherence out of all OPEN models
This can apply to genAI too
yeah SD3 finetunes would be SUPER sick
dalle will be over, 4o has a built in image maker
Wdym
MJ seems the lowest bar, MJ is just aesthetics. It's Dalle-3 and Ideogram (and probably the new gtpt4-40, google and meta image gens) that are the real benchmark
I also saw that sd3 is better than all of these. Oh it's not released yet, so I can't show it off. 🙂
I hope finetuned SD3 will become Ideogram quality
ideogram is the perfect meme maker in my experience
SD3 doesn't perform as well
no chance
but for open standards, SD3 is still the best
gpt4o is some multi modal omni model and built ground up with image, text, video creation things. they show off an example that was wild
I took ideogram and dalle images and made a Lora of robots punching their fist through buildings. Now sdxl can punch through buildings. Same will be with sd3.
but right now it does not have the image mker turn on yet
I hope, with a much larger scale (so like 3038712847192874 more actions and examples)
you need to consider the usecase tho
I was joking about boasting of models that haven't been released. 🙂
i have use sd3 at my cousins place, he buy some credits. it is just another stable meh release i think. impressive tech but it really lack too, like all they put out xl was over hype so will sd3 imo
the biggest advantage Stable Diffusion has over its competitors is the customization
SD3 will probably do the types of memes ideogram can if I would train a lora on it
its simple really
a synthetic dataset could be easily made even
Still can't help but wonder what a fully trained SD3 looks like, i'd expect that does away with halfway ending limbs, the horrid hands and such
here is the one image they tease with 4o, which will also be able to create you own font. look at that image and also the text unreal imo: https://twitter.com/gdb/status/1790869434174746805?s=46
I would simply expect slightly (yes, only slightly) less hallucinations
4o going to leave the rest in the dust at release and it should be in a few week time
i guess someone will bring SD3 to Photoshop via that A111 plugin as well
i’m banned from gpt, so im downvoting it 🤣
haha
Well, maybe just that, but there's also talk about looking at the text encoder, at the least, maybe longer prompts, at best, better understanding. But that's not so much aesthetics.
Time will tell, or not
i really did not see any difference between sd3 and xl when it was hype. just my opinion of course and still impressive
if they somehow uncap the prompt length (if they have the non truncated prompts) and train further with soejmthing like longclip
I wonder if that would make a change
Also curious what our ruskie friends are brewing, kandinsky was always "just not there" but an sd3 clone just not there would be pretty good :p
we need longclip-g as well
Dalle is one of the most censored models out there. If gpt4o is too, it's not of much use to me.
dalle provide what i need, i am not after some gore or sex stuffs
but yes i am sure 4o will not be open like sd3 could be
I'm after gore if I'm trying to do zombies or war depictions or artistic violence, so SD3 will do just fine
When I was grabbing the images for the Lora, and I wanted a soldier carrying an ar-15, it denied the majority of images. I kept hammering away and got it to give me enough of them. That's not gore or sex.
Right now SD3 is more problematic than Dalle to me 😂 I just gave up prompting females, it always ends in blurrs, though i get the point, sd3 without that filter is less restrictive than dalle 🙂
im hoping for a soonish release of some of the SD available features to Adobe Firefly and the ecosystem
well sam altman say they are going to unleash more of that kind of thing too. who knows. they have to have guard rail of course because they are the first mover in all this, to most people who know ai, they of chatgpt, so they will also get all the bad press too if people overeact
where do you use SD? Clipdrop?
i do not still wear the first pair of socks ive bought
look at the global panic when someone make some taylor swifts haha. they cant have that kind of thing
started with the api, then clipdrop, now mostly pixeldojo (though they have fake delays, it is advertised as unlimited)
any of you guys ever use meta image maker? i am curious how that is
It's really good. Couldn't do text but it's great for other stuff. Only does square images though which kinda blows
Do you guys think weighted models will be outphased by cloud based?
oooh okay
square image is good some time too. i got a co pilot pro subscription once and it only would make widescreen dalles haha. nice but some time i want the option of square too
what do you mean?
depend on the tech. we probably could not run dalle3 or 4o model or whatever on a home computer
i dont have issues with those usually, if any i can outpaint them with ease
in time i suppose a home computer tech will be cheap enough and a model advance enough to run on a phone. who would think 20 years ago we could have the compute power we do in our cell phones
I mean instead of installing a webui, and using it locally on a device, everything will be moved to a cloud server where it will be updated routinely
oh yeah i suppose so
but that also means people gotta pay
for the service
has to be some profit motive for a company to make a local model of course. stable is/was rare but they still were getting funds not do it for charity
which is where SAI failed at
a lot of people seem to think everything should be free, but why would a company just give away a free model you know
people are very selfish
to get buisness
welcome to subscription era lol
it pays out xD
this
if i actually had to buy my whole software pipeline at once i couldnt afford it at all
i would have to sell my family xD
companies know this
and they offer subscriptions
or free models
rarely
to get corporate buisness
There really aren't that many people able to use those big models at home. Just need to limit commercial use (which SAI now does) so a part of that profit flows back to you. (then you have the best of both worlds, generate income, and "free" research using your models)
Stability AI isnt a charity organization
tbh i prefer cloud based where it’s $1 a month lol
and i do not think SAI is at top of the game either, im not sure how well their approach work for them
never was
realistically $$$ wins, look at Grok, OpenAI pissed off Elon and Poof Grok was made
at the top of the game are the corporations which large ressources and influence
Google, Meta, Adobe
not even OpenAI is on their level
oh and Microsoft
alibaba
nvidia could release some big thing too if they want.
there are players for sure
they rather put their technoly underneath the products of the big boys
technology*
A while back ago, Elon Musk was in discussion of a partnership with MJ. I wonder if that’s still ongoing, or if Stable Diffusion is the new canidate, or even canidate.
yes and seem to pay off big time for them for now
exactly and even those boys are compromising on their products
to ot overspend
not
Midjourney or OpenAI?
look how far we have come in just... idk was dalle 2 or stable 1.4 even two years ago? in another year maybe even we will probably be close to 1 shot image creation where it is perfected. then it will just be a matter of the person pick which company they like best as the image creators will all be near perfect
yeah the api censorship is super aggressive
i dont believe in that tbh
no one will be talking about image generation cuz all the models will be able to, will be talking about its other multimodel capabilities
yes kagi. we will be onto video haha
pieces of the puzzle
Anything online is utter trash compared to SD. I don;t even know why ayone would even use them other than being uninformed or lazy.
i prefer dalle for my needs i think it much superior model
bullsh*t Dodge
0 control and trahs quality + sencorship
why anyone even woudl consider using dalle or MJ is beyond me
you are talking purely from ideology
trash quality i would say is more sd even sd3
peoplea r eclueless
I just want a model where I don’t need to keep installing 😂
you have no idea
Does anyone really think this looks good Yes text is impressive, the board itself and hands though, awful. If this is wat photo-realistic looks like, it's not that great. I feel often we're wowed by certain aspects of imagen, while other aspects degrade. (And that's one thing SD3 does seemingly pretty well, their model in no way feels like a sidestep, but is fully better than SDXL)
well some of us just look for best option, of course we have to get into the fan wars like any corps haha
i dont even know why i talk to clueless trashtalkers like you lol
idk it would have fooled me if i did not know it wasnt real. i would love to see sd3 try to duplicate it
you have no control, no ccustom trained checkpoints and cencorship.
comes into the talk and just starts trashtalking
the onyl reason they seu MJ and dalle is coz either they dont know about SD or ar elazy to install it
Honestly, idk why ppl care about text so badly. If you wanted to put text in an image you'd... put text on an image
have you considered that people use multiple products?
Why have the AI do it?
who cares about multiple procudts
well if you really want an image you would paint it or draw it. why have the AI do it?
tbh true
you have so many custom chekpoints on civitai you can do whatever you want
doesn’t sd3 have censorship cause you can’t makws nsfw?
thats how to know you are clueless
Because that's requires more effort than using the text tool in Paint.net?
yes SD3 has cencorhsip
and adding natural chalk text to a guy at a blackboard in some other method does not?
but sdxl doesnt
cascade doesnt
sd15 doesnt
and SD3 wont have any cencosrship too ass son as the weights drop
if we are to get to your level, Why would i use Stable Diffusion if i have MJ or DALL-E + Adobe Photoshop/Firefly or for vector Illustrator? Nobody needs VAE control and similar
so whats your point?
That can prob easily be done with the correct font and a bit of img2img/Cnet
lets see you try it then, post your image here
i just dont get it?
should be able to make it in a few minute right? it so easy to do
i think we have to deal with a fanatic here lmao
dude casually joins the channel and starts trashtalking bs
sd being sued?
i feel like i am on some comic discord where someone is upset we say iron man cooler than batman
Stability AI is sued, yes
for what lol
for real
copyright infringement
together with Midjourney and DeviantArt
dunno them
but i guess i apologize some too. this is a sd3 area and we are talking about ai in general. i suppose 4o and other thing are ot haha
🤔
but to me, i am no fan of any, i use what is best and use them all, not plant my flag like some keyboard warrior for any corp only one digital landscape
SD for the win
sure
Nothing comes close
Your computer must be strong af lol
yeah thats why i cant take the guy previously seriously. Professionals dont use exclusively one tool
haha i wish. i can not even run sd really.
anyone who uses PC seriously has 6-8gigs VRAM these days
my pipeline exist out of like 10 software
that i use
at the same time sometimes
well not literally all open
i guess i am not serious enough pc user 😂
do you even p.c., bro?
or you simply dont need that much xD
you cna fooooocuse with 4gigs please
also my software pipeline is extremely expensive
SD staff reading these comments 🫣
you dont need that i suppose
and eating popcorn
maybe Dodge is their bot
well
someone that works with OpenAI said youll never need more than 16KB memory
wasnt it supposed to be released already?
they wont until the money dry up
I am not their bot
a bot would have more nuance i think
Infidelis.bot
I had to actually upgrade my PC just to use SD but I know its the future and a good oivnestment
MJ and dalle are hot garbage
bruh come on xD
cencored and bad quality and no variety
theres one lora that emulates midjourney for instance
one single lora
reporduces midjourney style for 200MB
for free
is that midjourney v6 asthetics?
If people ar etoo intimidated to install all these scary apps like A1111 and comfy well its their loss
people dont use SD for quality or prompt adherence, they use SD because of local usage and customization possibilities
it took me months to also get used to it and learn it but thats what ti is
and those customizations are absolute overkill and unnecessary for a lot of people
for me for example
nooooo
Btw is it just me or does A1111 run up to only Python 10
4o is going to have consistent character, another big feature for it
i just got access to it
i mean to the model
not 4o i think, it is not turn on yet inside of 4o
Thats why Sd is so greta
i mean the image creation
so the future is not at corporations
it still run on dalle3 for now
yeah its not there yet
oh ok 4o is awesome
also i wonder how many messages i can send before it gets set to 3.5 again
because its limited
if you are free i think not many
dont tell me like 5
lol "death is typing...
idk i am a plus user haha. but so many people are try to make use of it it may be 5 haha
i deleted my original OpenAI account, before that i got rid of MJ sub
by customization possibility do you mean pkgs, loras, etc?
well i got rid of MJ much earlier
MJ is artys fartsy
lol
yes, controlnet too
too high contrast and dramatic and too much color
im signing off
I think Sd was always the perect ballance between grey boring flatness of dalle and artsy fartys wildness of MJ
🤙🏻
have a nice day/night 😄
🙂
say hi to Dream for me
i test sd3 a few week ago, but on my cousins pc since he pay for some credit haha
oh
but i will wait to see it on here if they offer free test. i think they did for xl
it was not very good with hands still i know
hands will still suck at SD3 i feel
if any i would gladly replace D3 with SD3 if SD was better and preferably for free but without local usage lol
we had much better success with the image 2 image which is of course always a nice feature
on the other hand Firefly/Photoshop remains untouched for me
i hear firefly 3.0 is release. good quality?
It may be released in August
if you ask me why not local, because of my specs
and yes its more demanding than my usage of Unreal Engine 5
lmao
take that
😮
Are you a game developer?
how do you even use firefly? in photoshop only?
text2image is good, but in general (worse in some cases, better in other) MJ and D3 beat it in that regard, in editing, in further features it blows them out of the water
yes i am just trying to give some compliment to sd too haha
at this point almost exclusively there and rarely via web UI, they fully integrated it into Photoshop Beta and will so with the major one probably in october
then there is the model for vector and soon for video
but thats not Firefly 3.0
yes
what are the big player then: dalle, mj, firefly, meta, sd... anything else?
cool job
its a side hustle currently
until get eventually freelancer/self-employed or end up in the industry
i*
starting in the alley, one day on the boulevard haha
Me too but I use Unity. Have you mastered the engine?
if i was employed there i wouldnt even have to pay 250-300€/month for creative software
lol
the company would pay for me
nah, sadly not. I dont even work with C++ yet, i play around with blueprints
well C++ is still in you know what i mean but i work with visual script simply
how is it going with Unity in your experience? ^^
I suggest you don't waste your time with it. It is very difficult and the demands are great. Unity is much simpler and powerful at the same time, especially as a visual programming language. You will not even need to read the documentation if you understand its basics.
I mastered it almost completely in about one year. I am currently working on a big game that I will finish soon
for my case its not a waste at all, its exactly what im looking for tbh. Well i dont look for C++ ofcxD
but i have to learn it at some point
otherwise i would have taken Unity indeed
from absolute beginner with no coding experience?
I still don't know things related to online games and also the C# language, I only have the basics
I learned basics but have no experience
oooh okay, we can go to off topic if you want 😄
i think slowly someone might punish me here for being OT all the time lol
Oh, I see. Are you planning to work as a C++ programmer?
not really but since im in Unreal Engine i have to learn it anyway
sooner or later
Lol, I remember that I tried to learn it for 6 months, but I did not understand anything
Don;t fail us Emad. Pleas eremember what this is about.
Money is dead.
Even if you make it taxes and inflation will take it.
Don't deviate.
Stay on target.
By the way, is there a way to train stable-cascade online using Colab or Kagle?
possible lol. Some people get along with it, others not so much. This happened to me with 3D software Blender
I tried a lot but the bugs never end
ho
Yes
Money being dead wont be announced on the news I am afraid just like the pandemic... It wa snever annoucned officially beign over.
Also now we are in WW3 but you will not hear it on news
spo same for money bign dead
its not something media will proclaim
even if its true
You have to kind of use your brain to see it
it can be frustrating ^^
sure is
Maybe if they take the Blender Foundation's business model, it would be great for them and for us
what should i say with me juggling with prototypes currently lol
mhm doesnt work out tbh
Blender is a completelly different case
Emad is gone
i will tell the bank next time they want a house payment, money is dead, you guys just need to see it haha
as well as some of the best researches from Stability AI
allegedly
What I mean for Stability Company, if they work like the Blender Foundation, is to focus on only important models while accepting volunteer developers and scientists
you see 1 google ad for every SD3 generation...
And then u can use SD3 adblocker of course eventually but dont tell anyone
stable probably will get bought by someone else if they even worth the effort at this point idk. hard to see a future beyond sd3 at this rate
oooh so narrow down to for example only image models and there focus on one
the rest should be handled if at all by community and co.
Well money almst died three times in 20tyh century so...
2008 2016 and the stock market crash
Breaking News we are happy to announce that Adobe acquired us today

I use photoshop from 20 years ago
(they wont do that)
uffff
because i just cant stand the montly whatever subs model they sue now
its laughale
what if i dotn have internet?
or a phone
u can tlive without your phine now?>
are we serious?
its absolutely bizare and crazy that peopkle assume everyone has phine now
is a phoen an extension of your body now
i used to torrent them longer time ago, but im a full fledged customer since meanwhile like 2 years
to a certain point, yes
i mean you are on the internet right now, hardly some luddite
which is just a fatter phone that need a cord to be in the wall socket
the idea that i need a phone for everythign is not acceptable
what if i dont have phone?
then you wont have that advantage. idk i dont care if you have a phone or not
yeah
my passwords are so long adn weir di will neve rtype them in on the phone
i juts use my phone to chekl them time and read stuff on the toilet

Poll: how many people read stuff on the toilet on their phones?
i even do draw and paint on my tablet while being on the toilet
soon i will also sculpt in 3D on iPad when Zbrush gets released for it

😄

good
for what

you'll have a better time over on ideogram.
Until the open source Freesound audio model drops. Yep. That's probably what he's taking about
yeah i figured out so far the best couple of settings for sdv. it now gives good output more than it doesn't.
I must try SDV one day ...
whers is sd3?
?
Guys we will get SD3. It has limited useful lifespan, and SAI values community engagement. HanyuanDiT already publicly released with similar performance (and basically same tech). In a year, SD3 will be outdated, and in ten years, it will cost about $100 to train a similar model.
SD1.5:
HanyuanDiT are still below the performance of fullbaked SD3
as well it is just a forecast of SD3's new captioning
ok where is the fullbaked sd3 to test
You can safely assume Artisan SD3 is like 75-85% baked
we cant really assume anything since we dont have the weights 
We do have another SD3 clone apart that Chinese model. That one project carried by Simo Ryu in Twitter
If I won't wrong he literally train a model from scartch with those paper.
If the company is facing a sale, the development team is lost, and the training equipment is an external cloud platform, then the existing big model results are the only thing that attracts buyers. Until the funds are resolved, it is highly likely that they will not be visible. For buyers who spend money to purchase large model assets, with a previous operating loss of 20 million and a debt of over 100 million US dollars from the supplier's cloud platform, how to exchange for profits above the acquisition costs is also a constraint. Ideals are good, but difficulties are great.
do not worry,in 10 years we will be living in underground bunkers so we wont have to worry about ai models
pretty sure you guy know how much is 44k parameter
So what are you actually implying?
Mebbe that SAI is joining MJ, OAI, Clipdrop etc in a monthly subscription business model?!
@morew4rd @GoogleDeepMind Sd3 is due to drop now I don’t think folk will need that much more tbh with right pipelines
I mean they pretty much are in but its not working well when it comes to revenue
seem pretty funny thinking that it is 2 days already and you are the first one posting it here
Weights when?

Just tried hanyuanDiT (got it running on my PC), and it's just bad at prompt following. 😕
A red-haired woman wearing a blue hoodie is standing on the right. A white-haired man wearing a green tuxedo is standing on the left. In the background is an abstract watercolor design featuring arabesque patterns.
Lemme test form consistency.
Oh sweet mercy what am I even looking at right now??!!
This is unbelievable. 😱
I mean, it got the neck a little wobbly, but I bet picking a different sampler would help with that. It has the right number of fingers on both hands, 6 tuner pegs on the head, and 6 strings over the hole (sorry IDK guitar terminology).
Will test more have to go to work.
It's for his cat.

HunyuanDIT is a fully-censored model! 💯
I've never seen that for a chinese model before, but mostly I'm just amazed at how well they managed it.
As a large language model trained on Stack Overflow, this question is off-topic and will be closed.
so you are ai
i cant join this server either,its broken
Hmn.
It can't do text. Wow this is the worst chinese innovation so far. Lightning and Hyper were legit.
A cute kitten holding up a sign that says "SD3".
Is there any other way
3
sd3
the photo just says 3
3
Two weeks SD3
Wait a second. I thought Temu is the worst.
I want to create some original pictures with any convenient AI
Isn't that a shopping app? IDK. Just mean bytedance and tencent have done really awesome stuff with SD, but HunyuanDiT is terrible.
Yeah, not the best innovation. I am Team Pixart. No text at all, but nice prompt following. They just need a model with more parameter.
I ran your comment as a prompt and paid money for this. I call it, octopus hands.
so does artisan use a different version of SD3 compared to the API?
hard to say
probably same
either way, you think SD3 is 75-85% baked?
hope end of may is the release of SD3
with controlnets and finetuning and stuff
I wanna see how highresfix works with it
cause pixart had a lot of issues
Me too, without the paying part. I call it: Basket Hand
nvm
On a slight rant, the fact that sd3 is still just as terrible at hands as it always has been, makes it worthless for so many generations. I want to be able to make a joke picture based on something that happened at work and send it to my coworkers, but I can't do that if there's something blatantly obvious that is majorly distracting from the actual content of the picture. Pixart and Ella are amazing, but they generate octopus hands at random too. Knowing that it's not going to get any better with sd3 is disheartening.
The other image services seemed to have figured it out but not sd3.
I've tried using hand detailer but the original image's hands are so bad, it can't even fix it
DALL E 3 cant even do it well I mean
or in fact all of the AI art generation model
I get great results from dalle all the time.
I generated 30-40 images of men carrying rifles from dalle for a Lora training set and 100% of the images had perfect fingers and perfect hand position on the rifle.
because under machine learning you can't "exclusively" training hand than other part of the body which are much simpler in anatomy movement.
and plus there are like 100 different hand position with 100 different perspective
in 100 image dataset
I just did 2 images from dalle. Lots of hands. All perfect.
I just paid for 5 images from sd3. Every single hand is a mangled mess.
Well, while training hands is damn hard. it is entirely possible that you could lessened the chance of malformed hands. I actually don't quite know how they can able to achieve it without slight "deformation"
increasing parameters and bombard training may worked but eh...
just like how people try to do their job weakening the power of funny hands with finetunes training and stuff.
but anyway thx for sharing
And how can it get better?
Subscribe and turn on notifications 🔔 so you don't miss any videos: http://goo.gl/0bsAjO
Make sure you never miss behind-the-scenes content in the Vox Video newsletter, sign up here: http://vox.com/video-newsletter
Hands drawn by robots … often just don’t look right. Why is that, and what will it take to get better...
Vox did a great job explaining it
If only it was just hands (after a request of "how to create this image earlier in this topic, tried to create it with sd3) . It's also terrible at limbs in general. Those feet don't look healthy. And I think everyone knows how a lying person looks like
What's even going on here?
I think that's because the complexity of number of characters
SD can't count how much limbs are there
I hope this is part of what "sd3 not ready yet" means. i said it before, it reminds me of the early sd1.x leak in many ways, it garbles so many things
Thanks
But really, use ideogram if the intention is to get as close as possible with a promptL https://ideogram.ai/g/s_dmCBfhTPGFjFiQi3kPoQ/2
4 large men in the background wearing sunglasses, pointing their finger at a tiny man, as big as the laptop in front of him. The big men seem to have a newspaper texture all over them,, as if they're made from it, and the small man wears a yellow jersey, the small man sits on what appears wooden boarding, face looking at the laptop on the boarding, legs crossed. his back to the viewer so the laptop screen is visible. Only the jersey and the laptop screen have color, the laptop screen shows some graphic design, the rest of the image is back and white
This is good
yeah ideogram is ridiculously good
If only SD3 could perform as good
maybe with finetunes one day... 😔
it actually got the newspaper part a bit better, but the size difference is non-existent
not that bad tbh
especially for a base model
and yes, I did cherry pick this lol
ideogram works very well, I like it a lot. Sometimes it produces real garbage, other times it creates wonders. The main problem is that the results are very similar, even if you change or add prompts
Well, with MJ subscription, we get unlimited generations… So SAI loses here with it’s credit system, no?
Cage: Not the beeeez! not my eyes! not the beeeeeeez!
Agreed - I have a one year subscription at MJ - and don't really use it!!! LOL
:D
fr
cant wait to see how it performs with larger step counts, highresfix, etc
how much difference there is between the model sizes
what if I only use clip-G+T5
etc
are SD3 checkpoints going to be more than 6 gigs in size?
lemme check the ones i downloaded from hf
o wait
sdxl is more than 6gb and sd3 is bigger so yeh of course
hmm
or I guess the smallest sd3 will be under, the biggest will be over
4B 8B absolutely
don't exactly know what MMDiT weighs like
but fp16/bf16 weights will probably match expectations
SDXL is 3.5, so between 2B and 4B of course
you also have to take into account the T5 model, which is an extra few gigs
https://huggingface.co/city96/t5-v1_1-xxl-encoder-bf16/tree/main this one is 10GB and is recommended over the useless fp32 one which consumes much more RAM in comparison and loads like a hog
you can run this one (or the fp16 one) on CPU RAM just fine and it loads in just a minute or two on SSD
T5 on cpu usually takes like 5-10 seconds to create the encoding, on GPU it's near instant
(talking from Pixart-Sigma experience, which also uses the t5-xxl encoder)
T5 can be turned off entirely (and probably should be when you're not trying to generate text)
If you don't use it, can you give it to me? 
👍
since I make memes I will have it on, and even then I'll have it on just in case
I hope its mostly just text, cause I don't want to have T5 a massive advantage, cause that would turn users away from SD3
like "oh I have to download +10GB and consume more VRAM/RAM just to make it better than SDXL? Why even have SD3"
here's a sneak peak at using only each or special combos of the clip/T5 models
by mcmonkey/Alex
sadly prompts are cut out so its annoying to compare
I was about to post that lol
T5 likes to make photos/photoreal images
@hallow lion What are your personal thoughts on SD3 moving to cloud-based (subscription based) if the company were to be bought by X (Elon Musk), and migrated to their website X with the full functionality (idk if this is possible, but let's assume it was; my knowledge in the tech field is limited) of what is similar to Automatic1111?
It woudl be better than MJ and dalle as there would be more features obviously... but still nothing beats local... I would hope Musk would take that opportunity to really free the weights and alllow open AI sicne his attempt with öpenäi backfired and they went the corp direction
I really hope SD3 gets out and we can refine it and it should be enough for a few years as far as image generations go really...
Ok thanks for your thoughts on that subject! lol
The next thing we should focus on is the tools
consistency is the next big thing
And not you know 90% and ipadapter and hacks and tricks
but real consistency
down to a tea
Yeah that would be a good approach.
你好
He is the worst person to get the company because he will make it pay in the end
If Huggingface would buy SAI (like they teased a while ago), that'd be amazing.
yes facehuggers would be epic choice
Increase resolution and create a HDR photograph
Enhance
Will be sd3 available on replicate?
Impossible
Huggingface cant even afford to acquire Stability AI
They would go bancrupt (almost) instantly
Ups i meant @teal fossil
XD
nice
now try to lift two cats, you can't... :3
You can if they only carry 4GB of ram.
true
oh so SD3 will come in december
only 2 weeks after decemeber? 🙏
I'm taking the Guardian website photos of the day, putting them through gpt4o describe, then changing up details.
I have to give it credit, pixart/ella couldn't do the powerwash sprayer.
sd3 could
although it only got it right 1 out of 12 images, which makes me feel like it was luck.
Wow SD3 is better than ella sigma?
it IS better than those 2, but only where the finetune trainining happens to be better than my sdxl refiner model. that doesn't happen often, but like in this robot one, it was.
Refined SD3 will rule the universe.
it's why it would be tragic if sd3 didn't get released. pixart/ella is awesome, but only because it's a great stopgap until sd3. not instead of.
SD3 ^^
pixart/ella ^^^
In a surrealist style, vibrant colors, and exaggerated forms, Will Smith confidently strides through a bustling, modern airport terminal under dramatic, high-contrast lighting, aggressively pursued by animated, anthropomorphic spaghetti paparazzi with flashing cameras, surrounded by travelers, sleek architecture, and large windows with incoming sunlight.
neither one could quite nail it
@dull star they are experimenting with t5 only.
https://old.reddit.com/r/StableDiffusion/comments/1cgr74j/april_30th/l2bxv66/
https://old.reddit.com/r/StableDiffusion/comments/1ciyzn5/sd3_weights_are_never_going_to_be_released_are/l2dhd6q/
They originally trained it with t5 limited to 77 tokens to match clip. Training t5 only allows them to bump that back up to 512.
challenge: this chat 1 day withtout doom and gloom
if they decide to limit it to 77 tokens then t5 is basically irrelevant
interesting?
pretty much
yeah
I didn't get this initially
then again, we have longclip-L, now we need longclip-G as well
While SD3 is 100% the better/more complete model, often when i have a prompt that looks really good in SD3, i try it in pixart, and to my surprise, it looks really good there as well.
to make them ~240 token length globally
(I have nothing to say)
clownshark and I have been messing around with gpt4o, and the prompts it make are massively more following for sd3 if they're under 77 tokens. it's kinda frustrating because pixart/ella are both 300+ tokens
pixart sigma trains on 300 token length
I think sigma is underbaked and doesn't have the best dataset
but the arch is still pretty good
pixart is awesome and frustrating at the same time. I'm ready to start training it but I spent hours last night trying to get it working and couldn't. too many wacky python dependencies.

kinda sucks right. I'll take the challenge!
hah yeah i try to sneak it in there.
Textured tempera painting, billowing digital anime by Pascal Blanche, Ross Tran, and Glen Keane, big tareme eyes, -- dramatic, low-angle shot of the girly explorer, standing at the edge of a misty, abandoned funhouse, a crumbling, creepy clown face looming in the background
very cool
reminded me of tomb raider
haha, now that you mention it, she's a girly explorer too 😉
There. Gonna be released. Got that! No more fear mongering. XD
Is that Bexos?
lmao
If his eyes were any further apart, they would be in his ears.
That's the punishment for becoming one with the spaghetti
never go full noodle
lol
go full nude instead
mhm
that didn't do the animation. :\
lol
I think it also implies that it will take longer to release compared to what they said earlier
SD3 two more weeks!
sounds like stable is going to be sold
not sure that sd3 is coming from things you read online
If it's musk related then you can pretty much ignore it.
If they did sell to musk then StabilityAI would kind of be backpedaling away from their whole vision of safety.
It is just taking a bit of time for them to finalize the safety DPO training they are currently doing. That is why they released over API first. In order to gather enough data on model behavior to ensure the Safety training worked before full release
well emad tweet about it, it seems more a breaking news kind of thing. i guess we'll see
he tweeted what what's breaking news?
i was just doing some catch up on reddit ai news and in stable they talking about a sale that is pending, emad tweeted about it too. look over maybe first 4 or 5 top stories.
where?
just post a link if you have one
stable reddit
can't find that tweet
wish ppl would actually provide their sources
real? probably? maybe?
well its posted on they reddit i assume they would not leave it up
this was the other thread i was reading
oh wait now i found it
@literallydenis Apparently its been about to be sold and run out of money for a couple of years 🤷
Also apparently it spends more money than it makes unlike other AI companies, crazy.
don't know why ctl f didn't find that
yeah either way the only buyer i could see them finding is someone who has the means to monetize models via online generation, a plan to do so, or some that is a near trillionaire who just throws away tens of billions for the f of it
look if we all just blow into the cash worm, money comes out the top. problem solved.
just ask good ol' Polyjaws for help
this is what happens when you train a lora on dall-e output. I can just say yeah, dall-e made that.
it's uncensored for me, i don't know what your problem is.
I feel like that could be an impulse buy on etsy
i would buy it that's for sure
i would buy it that's for sure
gm
"Lego Lara" 🙂
SD3@ClipDrop 🥳
SD3@ClipDrop - the red knight drinks beer and eats pancakes with the white witch!
SD3@ClipDrop
In a misty dawn forest, a majestic Cucumber Creature poses, its slender, elongated body adorned with intricate, swirling patterns resembling tiny seeds. Delicate, leaf-like fins decorate its arms, and its gentle, smiling face features expressive, sparkling green eyes. Soft, pale green skin glows with a soft, luminescent light, set against a whimsical, romantic backdrop. with enchanting, dreamy colors and intricate details.
Negative: boring, threatening, desaturated, comic
Emad was being sarcastic. Like “they were saying we were running out of money and would be sold imminently two years ago, too. And Stability is the ONLY AI company not turning a profit right now, oh noes!!!” That is, it wasn’t and it’s not.
Yeah he was deflecting.
presentation background, about it and business, keep it simple
Lol how come this isn't blurred? 😄
its not about content
I bet the nsfw detector was only trained on "too much exposed skin = bad!!!1!"
its unreliable
Thanks for this!
Christian is the CTO right?
whaat???
I thought it's upsacled with dreamshaperxl or somethign



