#sd3
Is SD3 Medium accepted on Civitai yet?
Was that cleared up? Can we use their models for free?
no
I suppose I should go do something constructive in the garage
you guys are hilarious though
I don't know if my wife is as amused
"a close up national geographic 4k photo of an evil millipede crawling on a leaf covered in dew in the moonlight" curious what SD3 would do... this is cascade (generated in under a min at 2880x1728, peak vram was about 9.2 gb)
this is what I'd do, by SD3

it's cute!
non human is 2ez for 2b
kinda weird leaf but the lighting and detail is good
you're the director
is it just me, or is training 2b loras more memory efficient than xl? i can do bigger batches
I'm not sure what model to use anymore.
I've tried pixart Sigma, SD3, SDXL, 1.5, 2.1...
Pixart so far has been the one to follow prompts really well for me.
SD3 knows no industrial design. SD3 knows no graphic design. SD3 knows Cronenberg.
tbf cronenberg knows industrial and graphic design well enough that he can dement and pervert it
more fuel for your gpt4o lol
ooo even better one
those are awesome
it's fun seeing stuff transposed from one model to another like this
shit i think the second is my fav actually
none of the gritty look that sometimes sd3 has trouble with
very clean and well integrated and detailed
yeahhh
got a fun new one
yeah that camera angle is great
when I use dalle
I use "from a low angle"
in every prompt
yeah anything that's not "portrait style" is a nice change of pace
chinook
the SDXL checkpoint
has "dutch angle" trained in as a trigger word
I think it's such an underrated checkpoint
(dutch angle is like an angle that is not straight)
this is cool
chinook is so good
I have no idea why it isn't more popular
or more models aren't trained like that
sure, one sec here
skill issue
Clearly not, though.
for prompt adherence
start in Pixart/SD3/Dalle/Ideogram/Midjourney
and then use as a refiner pass SD1.5/SDXL/SD3
i bet they can get close-ish
this one could get weird with gpt4o
can anyone make a good ultrawide sci fi city with SD3?
wow this is so good
that's stable cascade for ya
nice! I had faith it could do it!
wow thanks
this is my main use case for image AI
np
vram peaked around 9gb for that too i think
maybe 10
idk what gpu you have but i know a lot of ppl are on 8's and 10s around here
I use cloud so any
ahh k
I can get 48GB for a dollar or 80GB for 1.5 dollars
for an hour
pretty much
here's another for ya
yeah vram def isn't an issue for you then lol
that car is so nice wow
and this one
the thing that's disappointing is you can never really get an image with action in it
That's a big city. Lots of skyscrapers.
it's all about what it was trained on
good to see you're still alive XD
a programmer never dies, he just gets tangled up in spaghetti code
Ah okay, I'm planning to make a lot of images, and instead of using the refiner I'll just do an img2img pass with low cfg later using sdxl
I want to add a dropdown to switch between various diffusers like sigma, hunyuan, sdxl and sd3. trying to put this together for the non-tech-savvy
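A minimal sketch of that dropdown idea, assuming the Hugging Face `diffusers` library as the backend. The pipeline class names are real `diffusers` classes, but the repo ids and the `Backend` / `BACKENDS` / `load_pipeline` names are illustrative assumptions, so check each repo id before relying on it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    pipeline_class: str  # class name looked up inside the diffusers package
    repo_id: str         # Hugging Face repo id (assumed, verify before use)


# One entry per dropdown option (keys are hypothetical UI labels).
BACKENDS = {
    "sigma": Backend("PixArtSigmaPipeline", "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS"),
    "hunyuan": Backend("HunyuanDiTPipeline", "Tencent-Hunyuan/HunyuanDiT-Diffusers"),
    "sdxl": Backend("StableDiffusionXLPipeline", "stabilityai/stable-diffusion-xl-base-1.0"),
    "sd3": Backend("StableDiffusion3Pipeline", "stabilityai/stable-diffusion-3-medium-diffusers"),
}


def dropdown_choices():
    """Labels to show in the UI dropdown, in a stable order."""
    return sorted(BACKENDS)


def load_pipeline(choice):
    """Load the pipeline for the selected label (downloads weights on first use)."""
    import diffusers  # imported lazily so the registry works without it installed

    backend = BACKENDS[choice]
    pipeline_cls = getattr(diffusers, backend.pipeline_class)
    return pipeline_cls.from_pretrained(backend.repo_id)
```

The registry keeps the UI code free of model-specific branches: the dropdown only needs `dropdown_choices()`, and swapping or adding a model is a one-line change.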
Spicy! But who cares about the lady. The backgrounds are awesome!
been trying to train sd3 loras and it's just not happening. they learn something but it's not accurate at all. it's not like sdxl or 1.5. it's better than 2 but it's still just not happening. i don't think the current 2b model is something that works with loras. It's not just me either. i haven't found anyone who has successfully trained a person. there are some style loras, but they're barely accurate to the training data. if anyone has good settings to use they're not sharing them.
maybe this could've been avoided if stability released training code but they're holding that close to their chest
2b is missing a lot of stuff, it was rushed out and a lot was skipped
you might want to wait for updated releases
if that happens. new ceo has said they're still committed to open models , but i'm not seeing it. why not release the training code?
there's nothing to release yet. take that or leave it as you like.
cop out answer. sd3 is trained innit it
Are there trainers which support SD3 now? Last I checked EveryDream and DreamTrainer didn't yet. That was a week ago though!
one trainer has a branch for it
unless you just really want more broken, unfinished code? if not, then chill out and be patient
That branch was broken last I tried, has it been fixed?
just because there's a branch doesn't mean it's code you can do anything with
it's in development, yeah. he didn't build it to any spec because there's no training code released, so nobody knows what the spec is. he just winged a few numbers to make it work for now. comfyui then updated to be compatible with his keys.
the loras it made first would only work as sample generations. but now they work in comfyui
have you tried it? yeah. thought so.
argumentative with all these cop out lame shut downs. get off the stage
there's nothing TO release yet. the entire reason that SD 3 was released as it is, rushed and unfinished, is because the community wouldn't stop harping on "release the open source!" and refused to be patient. so you got the unfinished broken version. are you going to keep harping on release until they just toss non-working code at you and wash their hands of the entire thing? or would you like to practice being patient?
it's a miracle they're not so fed up they haven't already
Was everyone this impatient during SDXL as well?
Damn, I hope they release the 4b version 🥺
i don't think the parameter count is the problem
No, it was not even close, people kept using 1.5 for months. well, it's still being used
you could train sdxl in the 0.9 beta period. loras worked excellently
Am I the only one who's still creating hundreds per day with 1.5? lol
official training code was published too and kohya had it implemented on day 1
1.5 is awesome!!!
I gotta reinstall a 1.5 model, I remember I created some good images with it
It's extremely consistent, which is what I'm still using it for. Digging up my old prompts...
i don't understand why the talking down about being impatient is going on. i'm out here actually sortin data sets and settings and trying to do shit. y'all kicking over sand castles. big people. i know this is the sd3 channel and we hate grass around here, but honestly, touch some
Comparing the best SD1.5 base with SD3 base is just disingenuous. Stop it.
gonna try a style lora. i might consider sharing results if y'all play nicer and agree not to be like "could do that better on 1.5" sure okay so what i'm out here tryna learn mmdit shit. i already done this on unet
because you're trying to train a demo model that's not finished. you should consider waiting until they have a chance to work on it some more and release another version
you must work at stability to have such information. lol. i don't really know what your point is. blocked and ignored going forward. you obviously have nothing good to say about anything in this channel.
think what you like
i left the mushrooms running
Make them 100 meters tall now 
i cannot, they are cute little mushrooms
they do not care about a mushroom LORA they will be what they can be
this one is begging you to eat it
please put them out of their misery
one chomp is all they ask
how do i explain this to my folders of waifu images
that's a hairy mushroom
is it horror time yet
you take that back kind sir
Werewolf go kill that thing 
careful I haven't read that far into the subreddit on mushrooms to know which one your little doggy can eat
nice car
random commentary
it's kinda specific how it was trained like how lykon said to try anime art or anime screencap in place of anime
digital illustration in the style of cyberpunk, High-resolution, 4K widebody shot, luxury sports car, front perspective, stunning headlights, dark alleyway illumination, neon underglow illuminating streets, pastel sky over the horizon, cinematic realism, dramatic lighting, sleek exterior, premium details, high-end automobile, urban backdrop, atmospheric evening scene, dynamic camera angles, detailed tire reflections
I'm just here for the memes
I like that style
Those look really good
pastel skies taking over
High level crayon drawing
No your m-
i wish

That looks cool
horror time all the kids are asleep
Their loss
no recycle
they trained it so well it can't be bad
we just gotta learn what to say to it almost like developing a skill
get away from those kids
I am not sure it knows what a xenomorph looks like. Hm
i only use wiki-how to learn how to prompt
oh I think I know my prompt was confusing it
Oh god....what have I done
it likes to make 20ft beds
always dbl teeth
why does it love double row teeth so weird
does it 95% of the time when they are wide mouthed
good ol SD95%
This is pretty cool I think
it's like Kubrick style
where is the horror?
@kindred mica were you still looking for any info on the status of training efforts
locking the seed and prompt and just adding periods in the negative
the horror is that she is clothed
face hugger knocked her up ruh roh
"There's free campfire food behind me"
oh a Sephora ad
Those wings! ❤️
so many hands
hey, finally a good seed
That is impressive hair
Future AI
is tile any good with that?
dunno just saw the update, didnt even have them downloaded
China's Loong Boat Festival, the background is three giant mountain Zongzi, surrounded by clouds, Loong Boat racing, Loong Boat moving on the water, the river winding from the huge mountain Zongzi to the distance, the background is misty mountains, rich in details, daytime, fairy fog, clouds, movie lighting, super high quality, super high resolution, ultra high definition, ultra clear details, 8K, HDR, Chinese style
Here is the image you requested.
lol
what about this one
this is cool as hell
looks evil
one for you after you've had your rest
idk why i love this
stole your prompt and did this
great work
awesome to see that work, i thought you'd done img2img tbh
thanks
nope. i don't have img2img set up in my workflow. just very simple
i've been playing a lot with handing the different encoders different prompts. this is the one for my shark tent if you want to play around with it
thanks
welcome
thought it would make it creepier
Black Cat Sheriff, ground, magnifying glass, footprints, animals, forest, curiously, watching, clearly visible, large, oddly shaped, attention, path, depths.
i trained the madball data set i made.
it worked but i need to tune things up. i guess it learns non human concepts much better
i like your balls
its supposed to be these things https://www.youtube.com/watch?v=yvfYcZ8GAVM
It's freaky fun! It's Madballs!
freaky fun for everyone!
i should probably get rid of half the dataset and supplement it with synthetic data from the xl version
no bird in the training data. nice result. will play more later. this was a pretty promising result for styles. few more tests to dial in settings and i should be able to do more style loras. but now where to host them? hahaha banned from civit
By the peaceful lake, a panda eagerly plays its guitar, making the entire environment lively. The calm water surface under a clear sky reflects this scene. Bright flowers bloom around, butterflies flutter, and birds sing. The sun sets, casting a golden glow, blending realism with the lively spirit of giant pandas.
@dusky thistle 
cuz surgery
CHALLENGE
Render a bus of any kind where the driver is visible. I have tried and failed.
if successful, must share prompt and sorcery used
LOLOLOLOL
Ball
a girl driving a sports car
sorry... meant external view of a bus.
That's not bad actually. I am up to my 100th seed and prompt tweak and got nothing close.
Yeah. It's weird, you'd think most training images of a typical bus would have a human or two in it.
Try this: Photo in 4k of a close up bus on the street, outside you can see the city passing by and the menacing front of the fast black bus, using an action camera able to capture fast movements perfect for an action movie! The man inside is driving the bus
Tried some wiggle arounds with prompt until it wasn't giving me inside views 
In the world of AI humans don't drive buses anymore
And always the damn Empire State bldng and a view from 34th st 🤪
Disclaimer: I am a lifelong New Yorker and would rather explore other parts of the globe, thanks 🤪🤣

Damn you're right. The model is telling us "wink wink.. you humans are doomed to be replaced ... Hint hint"
Listen, if you look at mine above with the giant Subway's sandwich worker, I'll take this one every time
Lmao
I changed the place for you, and the driver is gone

@sacred jewel did that prompt give you anything good?
Not tried yet. Already getting distracted by real work at work 🤪🤭
the shortbus...
I'm participating in the SD3 image generation contest on Shakker. I'd appreciate it if you could give me a like to help me win the grand prize. Thank you, everyone! https://www.shakker.ai/imageinfo/07e98796f3404b7fb8301d2e411e9205
that would make sense, its easier to see things inside at night than day
that's creepy but nice
nice in a creepy way
UniPC BH2 seems to work with SD3 too
cfg2 vs cfg 4
30steps vs 60 steps
shift10 - steps43 - cfg3
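For reference, a sketch of what the "shift" value in those settings does, assuming it is the timestep shift described for SD3 (the knob ComfyUI exposes for SD3 sampling). `shift_sigma` is a hypothetical helper name, not part of any library.

```python
def shift_sigma(sigma, shift):
    """Remap a noise level in [0, 1]; shift > 1 pushes it toward the noisy end."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


# Compare the default-ish low shift with the shift of 10 mentioned above.
for sigma in (0.25, 0.5, 0.75):
    print(sigma, round(shift_sigma(sigma, 3.0), 3), round(shift_sigma(sigma, 10.0), 3))
```

With a higher shift, more of the sampling steps are spent at high noise levels, which is one plausible reason a large shift pairs with a different step count and cfg than the defaults.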
dpmpp_2m_alt works as well.
thats a new one?
It's been in my drop down for some time 🤷‍♂️
dont have it in comfyui
Some early morning horror
odd
getting darker lol
Prompt: a Renaissance painting of Hell by Pascal and Salvador Dali
Warning nsfw nearly
By the peaceful lake, an evil angry panda eagerly plays black metal on its electric guitar, making the entire environment lively. The calm water surface under a dark sky reflects this scene while blood paints the water red. Skulls lie around, black butterflies flutter, and crows cackle on dead trees. The sun sets, casting a golden glow through the black cumulonimbus clouds, blending realism with the lively spirit of giant pandas. far away in the background, a forest burns.
SD3 doing some anime style:
(2d anime style, flat pastel colors, lines:1.1), 32yo sexy lady, short black wavy hair, green eyes, (big nose:1.04), glasses, necklace, (friendly smile:0.6), intricate hairdo, high class, gorgeous sunset in the city, electricity cables, jacket, (big boobs:0.5)
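The `(text:1.1)` bits in that prompt are the attention-weight syntax used by common Stable Diffusion frontends. A small sketch of how such a prompt splits into weighted chunks, assuming flat (non-nested) parentheses; `parse_weights` is a hypothetical helper, and whether a given SD3 frontend applies the weights the same way is not something this chat confirms.

```python
import re

# Matches "(some text:1.1)" with no nested parentheses.
WEIGHTED = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")


def parse_weights(prompt):
    """Return (text, weight) pairs; text outside parentheses gets weight 1.0."""
    parts = []
    pos = 0
    for match in WEIGHTED.finditer(prompt):
        plain = prompt[pos:match.start()].strip(" ,")
        if plain:
            parts.append((plain, 1.0))
        parts.append((match.group(1).strip(), float(match.group(2))))
        pos = match.end()
    tail = prompt[pos:].strip(" ,")
    if tail:
        parts.append((tail, 1.0))
    return parts


print(parse_weights("(2d anime style, flat pastel colors, lines:1.1), glasses, (friendly smile:0.6)"))
```

Weights below 1.0, like `(friendly smile:0.6)`, tone a concept down rather than emphasize it.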
The current version of SD3 is perfect for some scenes lolol
Prompt: a Renaissance painting of the Garden of Earthly Delights; Hell by Hieronymus Bosch, Pascal and Salvador Dali
Peaceful lake, depends on one's perspective
Warning ideologically insensitive; also I guess nsfw maybe
religious
At least all the important historical works are within SD3 already
an advertisement for aromatherapy oil in an amber bottle, surrounded by white flowers, evoking a sense of serenity and relaxation. Imagine a cinematic concept with warm, golden lighting, reminiscent of a sunset. Capture a dynamic shot of the bottle in motion, as if it's being gently poured, with the oil flowing like a liquid gold. Use a shallow depth of field to blur the flowers, emphasizing the bottle. Incorporate a soft, creamy color palette with earthy tones to evoke a sense of calm. The image should be high-resolution, with a cinematic feel, perfect for a luxury brand., photo, cinematic
The lizz supper
so it was Keanu Reeves?
🤨
red haired female elven druid /sd3
yes, even spaghetti god got in there, awesome model indeed
tsssssghetti
trivia fact. they didn't sit on chairs at a table. back then, they 'reclined' at the table on mats and bolster pillows.
civit has banned all sd3 content. even non commercial uses.
they were afraid creators would have to destroy their work according to the license, so they went scorched earth and destroyed everyone's work before stability could
if i try hosting this lora on there they'll probably kill my account
have u tried uploading it on shakker.ai? heard they allow sd3 content there
Amazing
better results. pretty high learn rate
looks like it's time to go to war (sd3 8b)
cuddly
doh my training didn't fix the monstrous woman problem
hehehe. cracked the lora training. you just have to avoid human subjects and do concepts instead
Testing ControlNet Canny in SD3, it works well (but in 1:1 ratio). First image from Ideogram
more testing too. it's really finicky with prompting and works better at lower strengths
i'd rather publish it with less glitchiness
i'm also considering not publishing it since i don't want to draw the gaze of angry people who say shit can't be done. it's not a perfect lora and i don't wanna end up catching harassment over it
yeah I feel the same
what is going on
the reddit stable diffusion community is too toxic
probably just not going to open source any checkpoints or loras I make for SD3
Not just Reddit.
I don't want none those psychos calling my job and telling them to fire me
Not that I care about the job, it's just that's super embarrassing
there's X and Instagram stable diffusion communities?
Tiktok too
almost as if basically toxic people everywhere 
it's the community's fault for being skill issued
sexy
thx
that's what happens when you try to gaslight the community and blame them for model being bad
they were also toxic about SDXL when it released
they were also toxic about SD2.1 when it released
etc
year or months later: this model was the best when it released!!
no. nobody was sending death threats to anyone over sdxl or 2. it's gotten worse
individual guys like lykon weren't singled out for those rollouts
that's true but when you try to call people skill issue or say popular models are not "proper" models or never respond to issues like license you will only make matters worse
this is a childish and immature justification for literal harassment. "She probably had it coming" vibes
"what was she wearing?"
"she shouldn't have walked home that way"
standard victim blaming
you forgot to compare them to hitler that way you could have reached reddit level or "reaching"
hitler wasn't a victim. get real
everything lykon said about ponyxl is correct
can't have an honest conversation when people vilify honesty
it is a popular model, solely because it is the only XL finetune for its domain that got completed
Looks horrible
yea but like the other guy said, can't have an honest conversation when you try to vilify people who made a finetune for free just because it isn't a "proper" model
i blame the 90s magazine scans in bad resolution. a lot of the dataset is 240x240 upscaled to 1024. i need to rebuild it with synthetic data
vilify? he was turning down advice because he knows more about model finetuning
the pony brigade happened because lykon wouldn't answer licensing questions. such a heroic event
if only we had a proper model we would have gotten a better license
on blind tests people find it to be better than XL. i also remember SDXL being fairly underwhelming on release.
well i dont think its worse,all issues can be fixed with training but since license is not good i dont even know if we will get a good finetune
unless it also can't be trained , which many people said they have found that it cannot
seems unlikely that it can't be trained
we should definitely trust people who declare that a model can't be trained within a week of its release, especially when the model is radically different architecturally from other popular models released in the past
well sai did say they were gonna release training code, so gotta wait 2 more weeks
I heard it recently, that's not a week after..
skill issue
i have had a lot of difficulty training people though. i recognize limits. i just work around them
but at the end of the day
SD3 is based around a DiT
which is not that different from a ViT
and I can't understand on a technical level how a ViT could become untrainable unless it was massively overfitted or had massive catastrophic forgetting
i've also heard people who have tried training it backtracking some parts of their claims recently, like ptx0 saying that he got it to work better once making the text encoder dropout not independent (makes a lot of sense that that would cause training issues and is also why I'm planning to throw clip in the trash)
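To make the "independent vs. not independent" dropout distinction concrete, here is a rough sketch. It assumes "not independent" means all three text encoders are kept or dropped together on a single coin flip, which is one reading of the message above and not a confirmed description of any trainer. The encoder labels and the drop probability used below are placeholders.

```python
import random

ENCODERS = ("clip_l", "clip_g", "t5")  # illustrative labels for the three text encoders


def independent_dropout(rng, p):
    """Each encoder gets its own coin flip, so mixed keep/drop combinations occur."""
    return {name: rng.random() >= p for name in ENCODERS}


def joint_dropout(rng, p):
    """One coin flip per sample: the caption is either fully kept or fully dropped."""
    keep = rng.random() >= p
    return {name: keep for name in ENCODERS}


rng = random.Random(0)
print(independent_dropout(rng, 0.3))  # True means "keep this encoder's embedding"
print(joint_dropout(rng, 0.3))
```

With independent dropout the model regularly sees samples where only one or two encoders carry the caption, while joint dropout only ever produces "all text" or "no text" samples.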
i'm honestly not sure it exists, beyond what was used to train the base model
Because I heard that the model has things put into it for censorship like, they didnt' just prune NSFW training data. They literally taught it to draw shit weird on purpose. Which means, you can't finetune that stuff out because it's not that it's missing, it's that it draws incorrect on purpose
which is probably why you have trouble training it on people
compare 8b to 2b and the difference is night and day with almost everything
welcome to ViTs, we have:
the remarkable ability to show signs of memorization in less than one epoch
I'm sure that it'll handle differently but untrainable is complete bullshit. 90% of the time when someone claims that, the person actually has no idea what they're talking about.
What do y'all mean by "training" and "skill"?
did you try 80000 alpha yet?
Nope
i actually used 1 alpha for my lora. i don't understand the setting. i heard it like this. setting alpha to 2, would make the final lora work as if it was always set at strength 2
I'm actually an amateur. To me it seems like you're speaking Spanish.
The serious and direct version:
- Prompting issues in SD3 are most likely because people are used to prompting SDXL, I think I have also seen instances of people using the ponyXL quality tag string which is just... wrong. They used synthetic captioning too (for half of the dataset), which favors long prompts, and if the prompt you enter is not like what was used for making the model the results will be bad. The term of art for this is "train-validation gap". The layman's term is "skill issue".
- People claiming that the model can't be trained probably shouldn't be taken seriously, because at this point it would be more akin to a reverse engineering effort to an extent. Also the vast majority of them probably haven't trained anything more complicated than a lora if even that, and should not be taken seriously for that reason alone.
Most people I know of set alpha to be equal to dim ("0" in kohya does this). That means you get consistent behavior from your learning rate regardless of the model dim.
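A quick sketch of why alpha equal to dim gives consistent behavior, assuming the usual LoRA convention where the low-rank update is multiplied by `alpha / rank` before being added to the base weights. `lora_scale` is a hypothetical helper name.

```python
def lora_scale(alpha, rank):
    """Multiplier applied to the low-rank update under the alpha / rank convention."""
    return alpha / rank


for rank in (1, 16, 32):
    fixed_alpha = lora_scale(alpha=1, rank=rank)      # shrinks as rank grows
    matched_alpha = lora_scale(alpha=rank, rank=rank)  # always 1.0
    print(rank, fixed_alpha, matched_alpha)
```

Under this convention, alpha 1 at rank 32 scales the update by 1/32, so it behaves like a much smaller learning rate than alpha 1 at rank 1, while an alpha of 80000 at rank 32 would scale it by 2500.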
So you mean the problem is in the prompting?
dim is rank in onetrainer?
A large portion of it, yes. This is obvious because as you can clearly see scrolling up in this channel some people are able to get good images just fine. That can't be explained by a fundamentally broken model. However, there's good reason to think that having three text encoders is making it harder for the model to learn than it should be, so that is part of the problem (and you are recommended to play to the strengths of each text encoder for that reason)
I guess so. Other thing is that a lot of people tend to pick a rank that is far too high. I've seen people train successful single character loras with a rank of 1 and it gave the best, most generalizable results they've ever had.
Then again sometimes you can get better results by training with rank that is too high then downsizing the lora. But your lora in the end should never have more entropy than your data.
So are there any prompt generators or sth?
You are once again switching to Spanish. I use SD3 Medium in Huggingface. Maybe you can help me with that?
I've found that asking chatgpt works fairly well.
Thanks!
Hey I didn't actually get this. You need to give it a photo to get a prompt?
What if I don't have one?
Ah fine then
Let me try
funny how these lined up
i used 32 for my lora rank on sd3. 16 didn't give good results so i doubled it
but i have no idea what i'm doing. just shooting in the dark and hearing pings when i hit shit
Most of the "good" images are SD3 8b. SD3 2b images most often have contrast and sharpness dialed up to 100 and the subject at dead center (they seem to come straight out of laion, whereas for sdxl they appear to come from laion aesthetic).
One problem with "but use long prompts" is well, it gives a better look, at the cost of getting minimal variety in gens. I'm just not a fan, sure it can do nice images, but the nice vs meh ratio is so much lower than SDXL (even with base, when just released, images looked pleasing, what SD3 2b produces, often just doesn't, but i'll give that finetunes might make a ton of difference here, but why release a model that produces such unaesthetic images). It follows prompts better, but to me it's very much that the benefits aren't worth the cost (the API version has none of the issues that i think the 2b version has, so it's not an architecture issue (though 8b, doesn't like long prompts, which is a huge difference to 2b))
seems like selection bias
its the people who is wrong
more often than not yeah. most people don't know what they're talking about for any given field
i've done a lot of A:B tests, what Aliquip said about contrast/sharpness is the truth
Honestly it's not, I had a TON of fun with the 8b api, (and with sdxl when it was new), it was a better sdxl in most ways, then this 2b thing dropped, and it really is nothing like it (for me) i think i created about 10-20 images i liked, and not for lack of trying. But I never cared a whole lot about portraits/humans (or subjects for that matter, i just like to create a nice atmosphere for lack of a better word, of course there is a subject, but the scene must be a pretty picture first).
8b is very clearly the far superior model and the one that was much more ready for release
unable to run on most hardware though
by the time they release it almost everyone will have an rtx 6060 with 80gb of vram
skill issue
steven seagal signed GPU
doesn't matter... it was still more ready for release
8b still has lying on grass problems x3
but, it does matter. thats why sd15 is far more popular than sdxl
Sucks that that lying on grass became synonym for the state of 2b, it's a problem, not the only problem, and seeing how other models deal with it, even a forgivable problem imho
in terms of whether it's ready for release, it's not relevant
it's ready or it's not
but no one would use it. that's sort of what liberated models are about
either they're accessible or they're not. if it's released and no one can run local, it might as well stay in the cloud
just use the api version
tons of ppl could run 8b locally
3 people can equal a ton. that doesn't say much
kek
Thats so toxic
(Will use it in the future)
you did the math, did you?
please stay on topic
there's at least 1.3 million ppl on steam with a 4090, and you don't need a 4090 for 8b
iirc, 16 or even 12gb was likely enough
so yeah, there's a ton of ppl who could run it
smurf accounts
1.02% of all steam users have a 4090
also, the steam hardware survey is a sample of users. not all steam users.
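The two figures quoted above can be checked against each other with a line of arithmetic. This sketch only uses the numbers from the chat (1.3 million owners, 1.02% share) and does not verify them; `implied_population` is a hypothetical helper name.

```python
def implied_population(owners, share):
    """Total user base implied by an owner count and the share it represents."""
    return owners / share


total = implied_population(owners=1_300_000, share=0.0102)
print(f"{total / 1e6:.1f} million users implied")
```

Since the hardware survey is a sample, the 1.3 million figure is itself an extrapolation from the share, so the two numbers are not independent evidence.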
16gb for non quant
i heard from the people working on it that quantized 8b is lower quality than 2b
My personal prompting may suck lol, but I have noticed a definite increase in quality and in it actually listening, in just a week.
if you thought i was toxic, wait until you hear what people will say about this guy #sd3 message
the only reason I'm able to produce the images I do SOLELY using 2b is because of the secret prompt guide @bitter hearth sent me, I'm sure if you want the guide you just have to ask him and he'll send you the link
yeah, if 2b had been trained as much as 8b, sure it might be comparable
it's been said 2b is trained differently than 8b, different team, trained in record time and yadda yadda, i'm sure i'd have found 2b more agreeable if it had seen the training and data of 8b. But we would never know, doesn't matter really, it is what it is.
but it wasn't trained nearly as much, it only got a fraction of the training cascade got
cascade got the equivalent of 3 A100 years of training
would it still be called 2b if it was trained more than 2b
i believe they said in the paper that training cascade cost about 100k per run
yep, that's just the number of parameters - 2b will always be 2b
it's more like a measure of the amount of flexibility it has and the amount it could know, i guess
train it so much it becomes sentient and evolves
parameters are like brain cells

that pretty vague, i'm sure parameter is defined somewhere right?
i wish 2b could be 2t
it already is
pictures incoming
"are you sentient ? Answer written on screen: "YES" "NO", white background with black text, there is a red arrow pointing to one of the answers"
model is not sentient
The thing is, for me it doesn't listen to styles all that well, and composition, oh boy, it often just flat out ignores it (but normally i'd rarely prompt for it, since sd3 2b put everything dead center, i try...). But i've seen your list/system prompt posted earlier, even though i tried most of these, i'll try them again at a later time with stricter wording)
this is a steep ask. most fine tuners won't even release their dataset.
I personally think all datasets should be disclosed but that's not the world we live in. Not many models do it
ya but where did the YT channels get their info from
2mil downloads since launch for 2b says the community is slow to adopt, so the information is slow as well
we need to evangelize 2b more so more people are aware of its unlimited potential
the only way 2b gets widely adopted is if ppl figure out how to train it
that's taking a while cuz SAI has had very close to nothing to say on the matter, and there's inconsistencies between the implementation and the paper... there appears to be shit in the model that's not even being used
I find it crazy that they don't release how to train stuff and info like that, I never knew sdxl had a big wall like that for the finetuners to work with 
when cascade was released, the training code with full documentation came a few hours before the weights. it's been over 2 weeks with 2b, and the trainer devs can't even get answers from SAI to some pretty simple questions
how many censored models get trained on though, that's a big hurdle for ppl to waste their gpu time on
around the same amount of models that 2.0 and 2.1 have
so 0
hey 2.1 is not dead yet i see a few on civitai
maybe we should figure out sd2.0 before we try to master sd3.0
even with a better license
Is it my fault that I first read this as saying "communists"? 😭😭😭🤭🤭🤭
also skill issue
how many of the ppl that worked on 2b are still at SAI after the release
like % wise
at least 1%
but no more than 100%?
I'd say somewhere between 1 and 100 yeah
doesn't sound like a panic to me then
they really should've just released 8b
sure, there would've been a lot of complaining about vram requirements, etc
what is the incentive to even make these models I understand 1.5 was a "hey look what we came up with"
but at least it would've looked good
probably to train 8b
the cascade paper says that training cascade cost about 100k per run (26,000 A100 hours), and lykon said on here that 2b got "a fraction" of the training that cascade got
so unless the cascade paper was wrong about the training costs, or lykon was wrong about the relative amount of training each got, 2b cost <50k to train
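The arithmetic behind that estimate, using only the figures quoted above (about 100k for 26,000 A100 hours). The actual fraction of training 2b received is not stated anywhere in the chat, so `fraction` below is a free parameter, and the <50k conclusion holds only if that fraction is under one half.

```python
CASCADE_COST_USD = 100_000   # per training run, as quoted from the cascade paper
CASCADE_A100_HOURS = 26_000  # as quoted above
HOURS_PER_YEAR = 24 * 365

usd_per_a100_hour = CASCADE_COST_USD / CASCADE_A100_HOURS
a100_years = CASCADE_A100_HOURS / HOURS_PER_YEAR


def estimated_cost(fraction):
    """Cost of a run that used the given fraction of cascade's training compute."""
    return fraction * CASCADE_COST_USD


print(round(usd_per_a100_hour, 2), "USD per A100 hour")
print(round(a100_years, 2), "A100 years")
print(estimated_cost(0.5), "USD at half of cascade's compute")
```

The A100-years figure lines up with the "equivalent of 3 A100 years" mentioned earlier in the channel.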
well, at least the pose controlnet solves some of the problems
i'd like to try that prompt on the api
lowered the strength a bit to see if that'd help
yeah shit lol i was holding out hope in those controlnets
comfyui finally got support today
the tile isn't as bad
it looks fried as f, but it gives me some hope
i dont feel safe
yeah but who wants to generate hobo women?
Hobo men?
wet ball
i dont know if only the prompt will help (there is controlnet,img2img starting from another model) but here it is : A shiny, reflective silver sphere sits on a white surface, reflecting a serene park scene. The reflection shows a light blue sky dotted with wispy clouds, a lush green lawn, and tall trees with dark green leaves rustling in the wind. A brown tree with verdant foliage leans against the right side of the sphere. The sphere itself is spherical in shape and appears to be reflecting the surrounding landscape.
Character details:
- A silver sphere with a reflective surface.
Scene details:
- A white surface.
- A serene park scene reflected in the sphere, featuring a light blue sky, wispy clouds, a lush green lawn, tall trees with dark green leaves, and a brown tree with verdant foliage.
Angle: Side view of the sphere.
there's a ton of misinformation about training sd3. i almost got caught up in it because i couldn't find settings to train a person's likeness very well. still can't but i've trained a style and have high hopes for next attempt.
the guy who said 80000 lora alpha is needed is still being taken seriously. i shouldn't be surprised though. i mean, the presidential debate last night says a lot about the state of gen pop's cognitive abilities
Not many men are smarter than Trump though. He aced the test and got the US clean H2O water 🤣
it does not train well with the current tools/the state of our understanding of how we should be approaching it
it's pretty evident if you do some A:B tests with the lora on and off, then compare to ones trained on sdxl or cascade with the same deal
yeah, just want some more prompt ideas and words to use
also the fuck would be a side view of the sphere xD
what're you using for upscaling?
I don't know, it's just api stuff 
oh, k
Asian elf!?
song lyrics in prompts are cool. here is the result from the complete lyrics to Daft Punk's Around the World
How's the weekend treating y'all?
I'm out riding my bike
Keep off the grass.
I have fat tires, should be okay. Might try to lay it down on the grass later
I did post about the cost... But I got it from Wikipedia
https://en.wikipedia.org/wiki/Stable_Diffusion#Training_procedures
which was originally confirmed by Emad
https://www.reddit.com/r/StableDiffusion/comments/1c870a5/comment/l0dc2ni/
the E man
I ate pizza
forgot to add carrots to it
But was good
"Frankenstiens monster, sitting in his bedroom. Listening to music and wearing headphones. realistic, retro 1980's cinematic aesthetic" I appreciate how good the prompt adherence has gotten.
Best model ever

Carrots? On Pizza?
🤮
I do NOT disagree... but on Pizza? no thanks.. i am out
I was told there is a pizza in this image
At least it's not pineapples.
I LOVE pineapples on pizza... 🤭
When prompting accidentally goes terribly wrong.
Burnt 
LOL
BUT, my CFG was so low...?
LOLOLOL
You would think the prompt was woman pirate giving me the finger while eating a giant pizza
Hhahahahaah
Finally AI humor coming into its own! 
This inspired me, but it looked much better in my imagination lol
The pineapples ROFL
You're all welcome for me not posting the beach shots of him in less clothing lol
badger badger badger badger, mushroom! mushroom!
Prompt lube.
The detail with SD3! ❤️
furry
webp isn't sure whether to be sad or happy
have you heard the good word of our lord and savior jpeg-xl
2/2
I'm more inclined towards the word of png
jpeg-xl takes up half the space, and it encodes and decodes faster while still being completely lossless. if you opt for perceptually lossless it takes up 15% of the space of PNG
only problem is software support. google doesn't like it (they'd rather people use AV1) so they killed support for it in chrome
I think I'm so used to jpg not being lossless, that I started using png. Tbh though, I've just been using the glif/huggingface versions of SD3, so I go with what I get.
why the hatred for webp though?
i mean jpeg xl is a distinct format. it's somewhat unfortunately named in that regard. but it does have its nice lossless JPEG transcoding feature, it can trim any JPEG down by 20% with no downsides
Can't list them on DA, they aren't as high quality as jpg/png, half the upscalers don't like them...
webp is better than jpeg. that is the only good thing about webp.
https://mconverter.eu/ that'll fix it
there's even a plugin for chrome for it
I've heard talk of using Latent Diffusion techniques to reduce the size of images but so far, it has failed.
We all could have told them that a while ago
between jpeg noise and webp noise with the file size being equal you'll find that webp has less noticeable artifacting. And webp does support true lossless mode but it is ATROCIOUSLY slow to encode
That's what I've been doing. Once SD3 fully becomes part of my workflow though, I'll figure out a way to run it with jpgxl or png
Also, GIMP doesn't get along with webp 😦
ummm my comfy writes out .png files by default...
I mean always use lossless something for any image until you're publishing it regardless.
My, brand new, computer is too old for using comfy locally really, so I'm going to figure out another solution for when SD3 becomes more usable for me
i'm gonna DM you
Perhaps I was too hard on MJ back in the day, when I kept on at them to stop with the webp lol
i heard MJ keeps the format for the 6 ppl that still use and love webp
You're a MONSTER...
They explained it to me, in their case, it's Chrome's fault
How are you all getting those really long perfect texts in your images? I get warped words halfway through
Try Shift to 8
(some awesome person suggested it to me here yesterday and it WORKS!!)
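For anyone wondering what that knob does: a minimal sketch, assuming "Shift" here means the timestep (sigma) shift used by SD3-style flow-matching samplers. The function name is made up, the default of 3.0 is an assumption, and the formula is the one I believe diffusers' `FlowMatchEulerDiscreteScheduler` applies, so treat it as illustrative only.

```python
def shift_sigma(sigma: float, shift: float = 3.0) -> float:
    """Remap a noise level in [0, 1]; shift > 1 pushes sigmas toward the noisy end.

    Assumed formula: shift * sigma / (1 + (shift - 1) * sigma).
    With shift = 1 the schedule is unchanged.
    """
    return shift * sigma / (1 + (shift - 1) * sigma)


# Compare an evenly spaced schedule under the assumed default and under shift = 8.
base_sigmas = [i / 10 for i in range(11)]
default_schedule = [shift_sigma(s, 3.0) for s in base_sigmas]
shift8_schedule = [shift_sigma(s, 8.0) for s in base_sigmas]
```

A higher shift spends more of the sampling steps at high noise levels, which is one plausible reason it could help long text hold together.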
Yonork Lube.
Oops, that wasn't sd3, this one is
@sage burrow
here your furry ball
Is that....is that a **nis?
but there should be 2
https://imgur.com/a/r6e3303 my jessica rabbit tests. dont worry its safe. the safety training really laid waste to this character is my theory. i don't think she was undertrained so much as she was regularized hard towards an aesthetics set for safety.
the other one just took a bit longer to grow, thats normal...
I changed the middle to "There's sophisticated computer lab with a scientist inside of the sphere"
left one looks cool
"mini planet inside a glass ball" was my prompt
smol prompt for smol planet
can we say you're 'having a ball' the last couple days?
ball's in your court
Lion
Cute 🥺
How are y'all getting decent images? Is this the api or medium locally?
is that the prompt?
Yes 
Random file names are pretty fun
lol
DCS9234.jpg
The forbidden one
shhhh they don't know

pretty cool
Are you sure about that
look perfectly flyable to me

what's the aircraft that I built in Minecraft doing here
It's llama del rey 
Loooong plane
🤣

thats for u @bitter hearth
Easy rider.
got 95% there on the notification
img_girl_on_grass.heic
a

XD
actual training image
yes
ey yoo cosplay.jpg
Photo cosplay.png
DSC_navy_secret_photo.jpg
DCS_spooky_girl.jpg
@placid swallow img_tekmunki

_DSC_XXX_no_look.jpg wish me luck
@bitter hearth
holy cow you found 2b's source code
lol LOCKED
(many many many locks there)
they sux
we've been bamboozled
im going to sleep
IT'S FRIDAY
Forbidden tits.
Marty forgot to activate the time machine

I was sleeping
For about 8 minutes. In my dream you guys were making fun of me for trying stuff wrong

Now I go back
/dog, excited, happy expression, spacesuit, helmet, oxygen mask, floating, space, planets, cinematic, full-body, 4K, 9:16 aspect ratio, photorealistic, detailed, vibrant colors, majestic, otherworldly, visually stunning, classic composition, masterpiece, exquisite, color correction, amazing visual effects, intricate details, sharp focus, super high effect, HD, 16k --ar 9:16 --v 6.0
It looks like the diffusers SD3 dreambooth script forgot to include a VAE shift factor subtraction when encoding the latents, before multiplying by the scaling factor. I tried a run in my code without and with it, and it seems to have fixed the degradation in image quality seen (my sample prompt generation had it). This is the difference in image quality for the same prompt (though shuffled) at about the same number of steps, though some are blurry so I'm not 100% sure
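A rough sketch of the fix described above, assuming the usual convention that VAE latents are shifted first and then scaled before training, and un-scaled then un-shifted before decoding. The two constants are what I believe the SD3 VAE config uses for `shift_factor` and `scaling_factor`; in real code read them from `vae.config` rather than hard-coding. The function names are hypothetical.

```python
# Assumed values; verify against vae.config.shift_factor / vae.config.scaling_factor.
SHIFT_FACTOR = 0.0609
SCALING_FACTOR = 1.5305


def to_model_latent(raw_latent, shift=SHIFT_FACTOR, scale=SCALING_FACTOR):
    """Encode side: subtract the shift BEFORE multiplying by the scaling factor."""
    return (raw_latent - shift) * scale


def to_vae_latent(model_latent, shift=SHIFT_FACTOR, scale=SCALING_FACTOR):
    """Decode side: exact inverse of to_model_latent."""
    return model_latent / scale + shift


def to_model_latent_without_shift(raw_latent, scale=SCALING_FACTOR):
    """The variant described as buggy above: scaling only, no shift subtraction."""
    return raw_latent * scale
```

Both functions use plain arithmetic, so they work on floats as well as tensors. Skipping the shift leaves every training latent offset by `shift * scale`, which is consistent with the quality degradation reported above.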
Anybody using Textual Inversions alongside SD3-Medium at all?
I am, but I'm not sure if they really work
cat
Here are some SD3 images using a Basquiat Textual Embedding from A1111
xD
Sleep is bad, more coffee needed!
Kidding
Those backgrounds! ❤️
AI is so mean, wings come in SETS OF TWO! Yet the best renders always end up with one wing 😭
You managed that without controlnet, nice!
i'm using the tile controlnet xD
I've found that AI doesn't know any martial arts moves/postures/katas/etc. really 😭 Good thing for controlnet and reference images!
Yeah, it really struggles with a lot of dynamic poses 
I find the word parkour helps a lot
and it's crazy how resistant all the AIs are to dynamic poses! Try a flying dragon sometime 😭
3K
