#sd3
How much ram? I have 16gb
Though it's very likely this is what is slowing me down!
results were inconclusive, take those measurements with a grain of salt FYI
gonna be hard to upscale very high with fp8
32gb of ram but when i had 16gb of ram i was still averaging 30-40 seconds per render. i9 13th gen CPU if that helps you compare
it's not working just like you said earlier, but we still had to try
i7 here
yeah there's a lot of factors, im going to mess with the diffuser nodes for comfy and see if i can get sd3-flash to work for me
Please tag me when you post results
I'm really curious what the min requirements are for that, I didn't see it on the download page
the biggest factor is vram. if you overflow it, the system has to buffer with system memory and that'll be an order of magnitude slower
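A rough way to sanity-check the VRAM point before a run. This is only a back-of-the-envelope sketch: it counts model weights alone and ignores activations, latents, the VAE and the text encoders. The helper names, the 2 GiB headroom and the 8 GiB card are made up for illustration; the 2B parameter count is the "sd3 2b" size mentioned in this chat.

```python
# Back-of-the-envelope check: do the model weights alone fit in VRAM?
# Hypothetical helpers; ignores activations, latents, VAE and text encoders.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "fp8": 1}


def weight_gib(num_params: float, dtype: str) -> float:
    """Approximate size of the weights in GiB for a given precision."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


def fits_in_vram(num_params: float, dtype: str, vram_gib: float,
                 headroom_gib: float = 2.0) -> bool:
    """True if the weights plus an assumed working headroom fit on the card."""
    return weight_gib(num_params, dtype) + headroom_gib <= vram_gib


if __name__ == "__main__":
    for dtype in ("fp16", "fp8"):
        size = weight_gib(2e9, dtype)
        print(f"{dtype}: {size:.2f} GiB, fits in 8 GiB: {fits_in_vram(2e9, dtype, 8)}")
```

If the check fails, the overflow into system memory described above is the likely cause of a slowdown.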
im gonna start my own manga :3
This thing does not have as great of prompt comprehension as people think
I'll say "object <X> is on the left" and it still puts it on the right quite often. And it always makes spelling errors or misses a word when putting in text
I tried, using realistic photo style art. Was not popular, aside from my close friends lolol
You mean flash right?
i was actually kidding, but yea, prob wont be popular, cause it's not just about images anyway, you have to then make something that makes sense with story, character development, setting, etc, a lot of work
smol boi :3
via online flash, exact same prompt, just different seeds
How?! Too bad about the bg image lol
sd3 2b
Thanks>>>
This uses 'dpm_adaptive', this means it ignores the set number of steps and adaptively determines its own. Check the console window it is probably really using >30 steps not 4.
You're welcome
Oh!! I'll have to do a test where I put it at 30 (my usual), then try it at 4, and see which is faster. Somehow though, putting it at 4 reduced the time the generations took by about 75%
I think I am doing OK with the LLM aiding the prompts...
The speed of dpm_adaptive is usually more dependent on the cfg.
missed the L 
CFG 1
VERY FAST == 3secs
CFG 4, same steps of 4
9 secs
Same settings, CFG changed to 8,
15 secs
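Putting those three timings side by side. This is only arithmetic on the numbers reported above (one machine, dpm_adaptive, steps set to 4), not a benchmark; the function name is made up for illustration.

```python
# Seconds per generation at each CFG value, as reported in the messages above.
REPORTED_SECONDS = {1: 3.0, 4: 9.0, 8: 15.0}


def slowdown_vs_baseline(seconds_by_cfg, baseline_cfg=1):
    """Return each timing as a multiple of the baseline CFG's timing."""
    base = seconds_by_cfg[baseline_cfg]
    return {cfg: secs / base for cfg, secs in seconds_by_cfg.items()}


if __name__ == "__main__":
    for cfg, factor in slowdown_vs_baseline(REPORTED_SECONDS).items():
        print(f"CFG {cfg}: {factor:.1f}x the CFG 1 time")
```

So on these numbers CFG 4 took about 3x and CFG 8 about 5x as long as CFG 1, which fits the point that dpm_adaptive speed depends more on the cfg than on the step setting.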
I think he's cheating, but it'll do
finally some hard muscles Becky
i have no idea about this game
Base Model
And with the extra fingers, she can smash you even worse than a regular bodybuilder
Cheat day
one strong and one skinny leg. you know what that means.
Zombies, by Bosh
Bosch probably isn't the best artist to name until the model is more perfected
So what are the content rules of this particular section again?
I'll just say that SD3 is really good at horror.
very nice
Is that the famous Calgon?
Is there a good guide anywhere on how you're supposed to prompt differently with T5 for stable diffusion specifically? I know SD3 is better, but I'd like to leverage the new quality as much as possible.
well, I tried
Trying to get the hands right... same seed... prompt tweaking
well, there'd have to be an Unstable Diffusion 1 first, but anyone who's been here since the start knows how that turned out lol
This is closer to what I was trying for...
"Mooooom I'm reeeaddyyy"
I think they want you to spoiler any image that is bad gore
LOL that is also my question, but decided to save everyone lol. I was thinking of posting them on my alt DA account and the link here lol
So, interesting experiment.
using the singleCLIPTextEncoder with SD3,
This Prompt:
score_9, score_8_up, score_7_up, score_6_up, score_5_up, score_4_up, 1woman \nbored, annoyed, unimpressed,\ns, nipshiny_skin,\n\nsolo, ear_piercing,\n\nby rakisha, by totesfleisch8, by meesh, extremely detailed background, BREAK, โ trending on (artstation:1.2) โ โ โ โ โ โฆโฆโฆโฆโฆ, painted concept style
and this modified prompt removing most of the "stuff"
generates literally identical results...
1woman bored, annoyed, unimpressed, nipshiny_skin, solo, ear_piercing, by rakisha, by totesfleisch8, by meesh, extremely detailed background, trending on artstation,
oh man, this would have been perfect except for the AI messing up the window
Hey that's pretty good
The one thing that DOES MAKE A DIFFERENCE?
Ready for this?
"trending on artstation"
LOLOLOL
amazing... didn't expect that
all the same prompt, but different seeds
That is quite the difference!
Excessive... ah well lol
I think they forgot some bits with their censorship
Or they crawled my MJ account, one of the two ROFL
omg this actually freaks me out
skurry skurry
Not safe for sleep
Prob want to spoiler that one, don't want SD to get into trouble with discord
Thank you
how do you spoiler something
phone or computer?
Start to post the image, but before you hit send, there's this little eye on it that you can click. Hover over it to see it
oh! good to know!
When you look at the closet door at night
Discord (overall, not just here) has some strict rules about public being able to see certain things. So in very public places, spoilers are good :>
Nice information all, thank you
I think I see that, I walk the other direction
He is just a humble demon trying his best!
But he has cookies!
I'm trying to swap the girl and the zombie but the AI ain't having it. So the zombie is in bed and the girl is looking in. I guess this sort of is that.
Well, I mean, pastries. Ok I'll listen
nom nom nom
SD3 is really really good at pixel art, it seems to listen better with that style even
Best SD3 I've seen in a while
Complete innocence...
So check out my Twitter account @mortal kite
I like the glowy alien lol
ONLY difference is "trending on artstation"
Gimme prompt
uhh, are you sure you aren't just getting an unconditional image? the only things in that prompt that would actually be likely to do anything would be "woman", which the model is very likely to generate anyways with no prompt.
For Ultra, the SD Discord Artisan section (you do have to get a subscription though). Or Glif for nearly Ultra
of vhs video still of a teen girl in a dark hallway at night. The little girl is running toward the camera screaming. A hideous tentacle monster zombie is behind her chasing her. Film grain. Vhs tape.
For some reason it still made a little girl, not a teen
Negative: "ugly, disfigured, malformed, low quality, watermark, poorly drawn, bad proportions, out of focus, bad anatomy, disproportional, amputee, missing feet, missing foot, missing fingers, missing arm, watermark, signature"
pony tag strings definitely will not do anything to SD3 and it absolutely isn't going to know furry artists (until my team fixes that problem of course)
How the heck would I know? I am just a normal regular NON model expert, dude
You wrote little girl..
Stephanie has her first visit to the I.T. department
The only reason those are still stuck to so many prompts is that for experimentation between SDXL and SD3, I like to keep the EXACT same prompt, even the irrelevant bits.... (so that was my fault)
i cutefied it
i never started using pony because i already have enough trouble with clip context windows without a 42 token string at the start of my prompt
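To put a number on that complaint: CLIP's text encoder has a 77-token context, which includes the start and end tokens. The 42 is the count quoted in the message above, not something measured here, and the helper name is made up for illustration.

```python
CLIP_CONTEXT = 77    # CLIP text encoder context length, in tokens
SPECIAL_TOKENS = 2   # start-of-text and end-of-text tokens


def remaining_tokens(prefix_tokens, context=CLIP_CONTEXT, special=SPECIAL_TOKENS):
    """Tokens left for the actual prompt after a fixed prefix string."""
    return max(context - special - prefix_tokens, 0)


if __name__ == "__main__":
    left = remaining_tokens(42)
    print(f"{left} of {CLIP_CONTEXT - SPECIAL_TOKENS} usable tokens left after the prefix")
```

Under those assumptions the score string would eat more than half of a single context window before the real prompt starts.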
oh you are right I still had "little girl" in another part of that prompt
baking quality alignment into the model from the start is a mistake imo. it's so easy to do after the fact, and you can do it in a much less intrusive way. i managed to get decent quality alignment with a single negative embedding that's 6 tokens.
what even is the point of style up tags?
its decent. Still gets confused if you have multiple subjects in the scene
Pony took me FOREVER to figure out after I was used to 1.5! It's so completely different, or something. I mean even with the scores
I don't care about anatomy or artists names in it, so it's basically the best model for me atm lmao
People say it was a training error or something. But also if you use some, but not all, you can get different results
And the ultra fine detail!! ❤️
quality alignment. if you train a model on a lot of booru images... well, they almost all contain useful information, but most of them look like shit, so the mean of the distribution is going to look like shit too. so you want something to bias it towards images that look good.
the need for them is probably a lot greater on pony because the uncond model is completely dead
Dalle 3 is exceptionally good if you want to make images for loras
but with dpo, if you're prompting all the low quality stuff to be ignored, it's just not useful then. it might as well just not be in the set then right?
if you want the model to learn low quality why not just use low quality
no, the images still have useful information. even better -- if it's in the model, you can target it with a TI embedding, then put that in the negative. then, that will push you away from the low quality data and towards everything else
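The "push away" part can be sketched with the usual classifier-free guidance combine step, where the negative prompt's prediction stands in for the unconditional one. This is a toy sketch on plain lists with made-up numbers, not any particular UI's sampler code; `cfg_combine` is a hypothetical name.

```python
def cfg_combine(pos_pred, neg_pred, scale):
    """Classifier-free guidance: start at the negative prediction and move
    `scale` times the (positive - negative) difference away from it."""
    return [n + scale * (p - n) for p, n in zip(pos_pred, neg_pred)]


if __name__ == "__main__":
    pos = [1.0, 0.0]   # toy prediction for the positive prompt
    neg = [0.0, 1.0]   # toy prediction for the negative prompt / embedding
    print(cfg_combine(pos, neg, scale=1.0))  # equals the positive prediction
    print(cfg_combine(pos, neg, scale=4.0))  # pushed further away from the negative
```

At scale 1 the negative term cancels out entirely, so whatever is in the negative (a low quality embedding included) only matters once the scale is above 1.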
yeah maybe i might make a pony embedding instead of using a default prompt
i experiment with the pony lineage sometimes but it's annoying when you're trying to avoid nsfw
I recommend either Autismmix or one of DucHaiten's pony models, straight up pony is too difficult to get good results
At least on the api I get a very high variation of colors, if i want stuff super contrasty i can put "shot from iphone" or something else if I don't
Her first nightmare
Thanks for the insights
I prompt the clothing in those cases, works 80% of the time. I have an entire gallery of shorts failures lololol
doesn't Pony have "option_safe" or such, it won't create lewd then
another thing that caption dropout helps with... if most of the distribution is NSFW and that's represented in the uncond model, you'll be pushed away from NSFW pretty hard unless you ask for it
"rating safe" is a tag that avoids nsfw on it, never really had a problem with that tag in place, just have in mind what your prompt can do, if you're trying things that could closely relate to nsfw you can pump the weight a bit or do some "rating explicit/questionable" in negatives
It works better on base pony than it does on the realism variants i've tried. i think it has an underscore too
I don't know if it matters but "rating_safe" or rating... nm lol
rating_safe yes
Well, those merges usually do that ye, break the original model 
Those all look conventional not fission... 🤭
sorry I had to
this lack of reference image stuff is making me sad 😦
lol i'm not going to keep complaining about pony tags this is sd3
I dont care about politics but it knows how to draw them so well
i thought image 2 image workflows worked fine. do you mean controlnet and ipadapter? yeah, only the diffusers file format is available, and it isn't compatible with everything else
I remember on the competitor's forums, that the staff were complaining that by far, the most prompted thing/person was Trump ROFL
Anyone have a SD3 with ControlNet available to share? And where the heck are the controlnets anyway? LOL
they were on huggingface a day or 2 ago!!! (404 now)
i generally tune him out. i've had a lot of practice living on his doorstep. He was funnier as an entertainer clown man on The Apprentice. Innocent like ozzy crying for Sharon is.
Everything changed when the birther nation attacked
furry fail!
were images of either made popular ever? i know the aftermath had images published. They sure didn't set up cameras to take pics of the event
challenge accepted!
I'm pretty sure both cities are rebuilt today too
there was only the bomber that had the range to get there. they didn't exactly have a naval fleet with air craft carriers
Enola Gay?
i guess these are the only pics of the event
yeh the super fortress mofo
Ooops damn portal opened again
looks very similar to the image i linked. the two japanese detonation photos clipped together
Bockscar?
I'm more than happy to make fun of all political parties
made this the other day lol
time for snu snu?
nah that's enuf of snu snu for a while ain't it?
is this what your mystery code made?
Looks like me preparing for work
my code was a fail then cause this gen didn't work how I wanted
The trump lovers are in chat

i personally dont love any political person :3
It's extremely annoying to me, that so many folks I talk with blame the more nsfw censorship, on the political party that they personally don't vote for. I mean the people who feel that censorship is bad.
nah here we go
Is pretty unifying honestly on people being afraid of AI and regulating it. It's pretty non-partisan -- around the entire world, honestly
Me on a good day be like
||Copium||
Just before SD3 came out is when I kept seeing/hearing that
wtf
i blame taylor swift
Taylor is known for hating images of her laying on grass.... 🤨
Gemini is my fave help me figure out stuff bot
You get a 6 months free trial if you have google drive premium or whatever it's called
Some people say Claude is better though
Literally me
||
after few years of self care and gym maybe||
is that a chad?
You guys should watch the Nvidia presentation that happened this year 
It is! Recently, any online news about ai, more than a week old is irrelevant quite often!!!
that is one pissed off raccoon
whatever this bugatti is doing under the sea
Just search for jensen huang keynote computex 2024 
just two trash pandas doing trash panda things
MAGATs cult is alive and well 🤭
Can you guys try to get animals wearing masks? Had a hard time on that with api
like a dog wearing a cat mask?
@kindred mica were you demoted? i dont see the pink :3
best i can do is a red panda watching miley cyrus while eating his salad
Some pretty amazing AI stuff in there, it definitely shows how much more insane things AI is doing in the real world, our little image gen fun is small compared x3
Yeah anything, even tried a generic "black mask" it wasn't working well
hmm i got it to do a astronaut chef's helmet
I see a lot of this in AI, models placing characters face outside the window cx
ya..ugh
ya I dunno I put red panda wearing a ski mask to commit crimes and it just makes him happy in all the photos no mask
and my failed gen (not an animal)
Not an animal, it's nightmare fuel 
well he's got gloves now
happy to commit crimes, doesn't need a mask we all know red pandas are horrifying creatures now
lol wearing a paper clown mask just put him infront of a circus tent
does sd3 have inpainting yet? Use that for the mask?
sd3 barely has a license, let alone inpainting :3
even scary bee creatures need to go shopping.
hammerhead shark
missing a few chromosomes in that hammer there
🤣
I think my sd3 has gotten lazy and started using photoshop to make my prompts
Differential diffusion works just fine with SD3. Just drop in the node between the model and ksampler and make sure to blur the mask
Create A luminous Russian girlfriend, her skin shimmering with delicate opalescent hues, stands amidst a misty Rome in a mesmerizing oil painting. The soft, dappled light filters through the ancient buildings, illuminating her flowing silver hair and intricate, iridescent gown. The artist's meticulous brushstrokes capture every intricate detail. This breathtaking masterpiece transports the viewer into a mystical realm of ethereal beauty and wonder.
ah yea i haven't tried that yet with sd3, will try tomorrow
very sexy
for sure
I heard someone made a good lora with SD3. Is training going well?
i like the sketch lora
full body of robert pattinson walking through the fire and flames, handsome, athletic, cowboy hat, desperado style black suit, using a yamato katana, the lonely shepherd, kill bill origins, natural look, eastman kodak 35mm vision camera negative film stock, Vision 500T 5279, anamorphic panavision lenses, twilight breaking dawn, red dead redemption aesthetics, sergio leone spaghetti western style, tenet time entropy inversion cinematography by christopher nolan, directed by quentin tarantino
i remember that movie
i like how it has the guy in the back ground staring like "wtf" is that guy wearing makeup
"a clown on the street? what the hell man..."
Please teach how to generate this images in this server by mobile phone
no click on the artisian-faq and read it to find out how to use the server to generate images
SD3 vs SDXL
nm found it thanks!
you're welcome
single concept lora paper authors on life support, sks now has meaningful information encoded on it. although that is DEFINITELY not an SKS.
I'm using Ollama with IF_AI_Nodes - they seem set up to do 3 paragraphs ... it wasn't down to me - I mean, I didn't request 3 paragraphs! Hope this helps?! Havaniceday
Anyone can tell me how to run Flash SD3 on ComfyUI locally?
I tried it as a LoRA and it does not seem to be working 😦
i guess that VLM captions make it much better at dealing with complete nonsense. I just put in the WHAT IN FUCK IS DONE TO THIS POOR RIFLE copypasta and got this. these are overall very coherent guns, and they also reflect the prompt I suppose
i've never used any of these "unique" tokens like sks and my loras have always worked and train easy. i've often found that youtubers relay a lot of information confidently, that's just not how it is in practice.
Unique in quotes since so many loras use them
i actually did an analysis on tokens to find the one that had the absolute least impact on the direction of the resulting clip embedding. i found that the gender neutral couple emoji had the least impact.
emojis as tokens that sounds fun
t5 doesn't support tokenizing emojis though sadly
i tried prompting with emojis in t5. it wasn't as good as i thought it'd be. i thought it would
can't we just train loras with only the clip layers and not t5?
don't train text encoders with sd3, i wouldnt even do it for loras
oh yeah because transformers or something
because MMDiT specifically. it is unique to this model having its own space for conditioning
Prompt?
Looks great
i was just thinking. A bunch of researchers that left stability built the mmdit architecture? Where did those guys go?
I hope their next project doesn't go like RWML did , and it's closed forever
an photo of Female Knight standing tall, clad in gleaming gilded armor adorned with an abundance of ornaments that shimmer and shine. Her armor is a masterpiece of craftsmanship, boasting intricate details and engravings that seem to dance across her imposing physique. In one hand, she wields a massive sword, its blade etched with runes that pulse with energy. The sword appears to be an extension of her unyielding spirit, forged from countless battles and victories. She wears an towering shield in the other hand Castle background
Depends on if you are trying to train in new data/concepts that don't exist in the base model or not. If not, then you actually only want to train the tencs to get better at mapping to the existing data/concepts in the model.
HAHAHAHAHAHA. guess what prompt got me this
gorilla war fighter
it's the fucking navy seal copypasta. gorilla warfare.
i haven't explored copy pastas too much. i been doing poetry and song lyrics as prompts haha
Batman: EU
tracksuit makes him russian batman.
Own a musket for home defense, since that's what the founding fathers intended.
when the musket first came out, that mustve been like dalle 1 for them. just so remarkable
we're all making laws for ai today that are going to be grossly misinterpreted in 200 years
This one is excellent. Which sampler are you using? Is it upscaled? If so - how?
Background: A dark, foggy night scene.Main Image: A ghostly silhouette centrally positioned.Text: "HauntedMinutes" in a creepy font, placed at the bottom of the image.Colors: Use dark greys, blacks, and white or light blue accents for ghostly elements.
no upscale https://up.reselltek.com/u/xSudOV.png
LLAVA and/or ZEPHYR. I am using IF_AI_Nodes and the two input methods - i2i or t2i. Each method returns a prompt which is then fed into Clip ... Ollama examines the input image and makes a prompt; for t2i I have used this phrase "You are a creative director and I am going to give you a project you need to work on. I want you to come up with generative ai prompts in grammatical sentences - and in a natural-language format related to this project. Develop entirely random prompts on entirely random subjects. Be as anarchic and subverting as possible with combinations of subject and theme. Start with "PROMPT = " This will be used as a generative ai prompt. Keep its length to 77 tokens."
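Since that system phrase asks the LLM to start its answer with "PROMPT = ", here is a minimal sketch of pulling the prompt back out of a reply before it goes to the text encoder. `extract_prompt` is a hypothetical helper written for this note; it is not how IF_AI_Nodes or Ollama handle it internally, and the sample reply is made up.

```python
from typing import Optional


def extract_prompt(reply: str, marker: str = "PROMPT =") -> Optional[str]:
    """Return the text after the marker, with surrounding whitespace and
    double quotes trimmed, or None if the marker is missing."""
    idx = reply.find(marker)
    if idx == -1:
        return None
    return reply[idx + len(marker):].strip().strip('"').strip()


if __name__ == "__main__":
    sample = 'Sure! PROMPT = "a ghostly silhouette in a foggy street"'
    print(extract_prompt(sample))
```

Returning None when the marker is missing makes it easy to retry the LLM call instead of feeding chatty filler into CLIP.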
Then your prompt was great and the noise-Gods were on your side.
may I share some SD3 models here?
i will allow it
Go ahead! Also share how they were trained, please.
I'd vote for that
I still can't comprehend why politicians (nr.1 aim for Fakenews and political deepfakes) were not purged, but everyone else was.
I've heard he is a very funny guy
i think most celebs are mostly in through clip base knowledge and politicians have so many pictures of them that it is impossible for the model to not learn their likenesses
Yeah, tragic and sad...
surreal. bee movie script
not sure it knows who Harley Quinn is either
harley quinn at home
even got the Temu Harley Quinn
#sd3 On a gloomy day, a little dog with a wine glass on its head comes to find a little girl. The wine glass is filled with milk, realistic style.
https://www.shakker.ai/modelinfo/090553f4ee5843fca21949fae75d7b55
you can have a try
Version Description:
Base film: The 10G version includes clip and T5 version SD3;
Fusion: Fantastical fusion, SDXL Realistic Scene v3, paintinglORA (fun version, generated models? indeed generated. Is it effective? A fantastic answer!)
Workflow: ComfyUI
ps: As long as people are not painted, there will be no terrible body parts!
Our hub provides members with exclusive access to an elite selection of AI image generation models, designed to produce superior quality images that stand out in any creative project
you need to download to comfyui
If we shifted from women laying on the grass to cats, I have lots more to say:
Discord wobbles in a weird way when I try to upload a picture. What's that??
It's because of the timer
The one above is the upscaled version with SUPIR. This is the original generated by SD3:
nice one
Supir and others cause it to lose some detail because of the SD3 VAE, but I prefer it that way for several use cases. Even if the SDXL based upscales lower the tones and details.
Discord compressed my SUPIR version. It's more detailed than that.
And there is that. Yeah.
What's the difference here? Which is which?
I mean, the discord compression makes things worse. But I've opened your 2 images and the Supir one loses quality in details and colors. And I still prefer the lower quality (upscaled) one depending on the situation.
discord does not compress images, you have to click "open in browser" to view the original
I don't know anymore, there was a time it did compress and then it didn't. I don't care. I did open his images in the "Open in browser" and compared.
adobe super resolution on the right
A girl, walking on the grass, full body, smiling
A girl, walking on the grass, full body, smiling
A girl, walking on the grass, full body, smiling
You are right. Open in browser shows the 4K image as it is
"B.L.T." ๐
hello
Oof Monkey also resigned from stability

Got a link or something where did you see it 
Yes I'm trying to copy the link from my phone from Reddit, it's a post by comfy and monkey is commenting
I'll try and check on Reddit, is it the same one with that comfy post?
No it's a new one called "next step for comfyui"
Man I hate that fkin app, every other one easily lets you generate a link
step 1. hire Comfy guy
step 2. Comfy guy leaving saying releasing the best model is not SAI management priority
step 3. part of SAI team leaving to Comfy
kek, love this guy, hope comfy will prevail
Yea SAI really killed themselves. It's like management tried to destroy SAI for some reason. If they lose all their Devs, how are they gonna hire new ones especially with their bad rep lol

Just waiting for the bankruptcy news now. I'm no expert in this field, but I'm expecting nothing else at this point
Hopefully then someone will come out and really say what the hell went wrong
so many strange / bad decisions in a row yes
I would assume that it's all about trying to find some funds for SAI, just extremely poor execution
but yeah, we don't have too much information on that
Poor is an understatement of titanic proportions. It was criminally poor execution, sabotage one might say. I am not sure some average clown like me could have made worse decisions lol
They also hire Devs who fully believe in open source, promise them the world and then force them into corporate shenanigans. What could've gone wrong?
a pig
BUT! They seem to be very confident about comfy and the future which gives me 2 ideas: New Base models from other companies are going to be more integrated AND they know how the SD3 model works and may have an idea how to fix it
The problem with anything to do with SD3 is that license that's still not been clarified
lmao shrek lost some weight xD
He's been working yes
I'm not sure they are interested in fixing it, it's not their problem anymore. I think they will just add basic support for it and be done with it. There are way more promising alternatives out there that should be focused on
But they work on it and I really think, there is something coming. Nevertheless: There are plenty of alternatives out there. And they are getting better. Just imagine them having community support and they will be as good as SD in a minute
Exactly what he also mentions on Reddit - sounds great IMO
#general-with-images
Image Prompt:
Create an image of a modern, sleek office environment where a User Interface/User Experience (UI/UX) engineer is deeply immersed in work. The scene should include a high-resolution computer screen displaying a sophisticated design software interface, with multiple design elements and a color palette visible. The UI/UX engineer, dressed in casual professional attire, is shown intently focused, with sketch pads and wireframes spread across the desk. The background includes minimalistic decor and a whiteboard filled with flowcharts and user journey maps. The lighting is bright and natural, suggesting a creative and productive atmosphere. The overall theme should convey innovation, creativity, and attention to detail.
Comfy has formed a not-for-profit named Comfy.org. He and his team remain dedicated to making ComfyUI "the best open-source free software available anywhere!"
The team includes Dr.ltd.Data, and pythongosssss
i smell a gpt prompt
Ollama's version of P R O M P T Create an image of a modern, sleek office environment where a User Interface/User Experience (UI/UX) engineer is deeply immersed in work. The scene should include a high-resolution computer screen displaying a sophisticated design software interface, with multiple design elements and a color palette visible. The UI/UX engineer, dressed in casual professional attire, is shown intently focused, with sketch pads and wireframes spread across the desk. The background includes minimalistic decor and a whiteboard filled with flowcharts and user journey maps. The lighting is bright and natural, suggesting a creative and productive atmosphere. The overall theme should convey innovation, creativity, and attention to detail.
Ollama
an image like that ? from dalle 3 ? no definitely, thats at least dalle 4
Peak AI !
nah thats Dall F at this point
Has anyone else been able to get Ollama to work for SD3 and comfy?
Chibi baby Siberian cat, posing with dual lightsabers, background with a chaotic urban street, chibi cat's face, baby superhero pose, full body shot, full-length view, wide angle, perfect composition, hdr, soft light, epic
so hands are fine....as long as they are cat hands.
do NOT mess with kitty
I have been wondering...
Did the team who made 2B (who are also the researchers who already left SAI)
Instead of only doing an abliteration (negative reward if prompt produced is NSFW, training without image), they instead did part of training on a bunch of nsfw, but with the loss function kind of inversed, like, instead of noise reconstruction, trying to make as much noise as possible on nsfw prompts....
And so we get those mangled bodies, bc noise, structural level
The images don't turn out bad if you prompt N SFW, they turn out without any uhm, specific bits
They just "forgot" to train a few things I think
"army of cats invading the streets of new york" x3
So many kitties! 🩷
Well that escalated quickly!
I have yes!
Interesting (free) extension for ComfyUI-Photoshop https://exchange.adobe.com/apps/cc/3e6d64e0?pluginId=3e6d64e0&workflow=share
I mean other people, I'm just convinced you have coding superpowers LOL
There's a GIMP one as well fortunately, for doing inpainting!!!
My first (SDXL) image via ComfyUI-Photoshop - the first one is the base image; and the second is the ComfyUI image masked over the original. As you can see, it "doesn't do text!!!"
lmao
evil kitty
Sucks its own blood!!! 🥳
actually it's an anime bat
Here's a few more
Nah, this is sd3 with cosxl-sdxl refinement stage. As long as the denoise on the sdxl stage is high enough with the res_momentumized sampler, it'll fix the hands etc.
One of the most difficult prompts for any model, of all time: the infamous scorpion prompt. SD3 almost gets it:
cat paws 
clever!
Me and my army are taking over this place, hand over ALL your catnip and cardboard boxes!
what's the prompt?
Try putting the word "Scorpion" and see if the model gets it right. It never does. Most models get a lobster type thing, nothing like a scorpion. Every SD model fails this prompt. Even MJ and Dalle did at one point (I haven't tried in the recent versions)
This is what it would be like if I were the one doing these 
Anthro scorpions are also difficult!
Will try a scorpion in SD3 now
I'm not sure what it is precisely, but I'd stay indoors that day
Here's a couple
Dev list getting shorter and shorter. How long until SAI collapses under its own incompetence?
It always reminds me of a dancing lobster
Will be funny when flowwolf and crystalwizard are the last two people in here insisting SD3 was perfect to an empty room.
He's the star of a scifi B movie I think
oh goodness
they will be the last 2 ppl on the building washin the used empty toilets 🤣
Lizard warrior of the realm
Couldn't resist, sorry
@magic turtle CONFESS U GOT MUTATED PEOPLE TO PHOTOSHOOT ON GRASS ON PURPOSE FOR THE DATASET
with some hair ^^
Somebody's been watching fallout
I never finished fallout
coming for help
Finish it or it will finish u
Zendaya without makeup (she's reptilian)
I'm not joking though
It will finish u, the marketing team will come to get u if u don't @bitter hearth
forgot muh shield muh lady
I had to look her up
Do I need to delete my statement ๐คจ
Now ur admitting to hiding
no!!! I am a good Amazon Citizen
Just like the vault people… ur done for rip
WHO DO U THINK THEIR MARKETING TEAM IS?
ChatGPT
We can't even generate examples of this moment… cuz it's been censored from the model
plus or free?
oh plus now, for sure
Some people dig it. Don't judge man.
oh dang its emad 😮
I lost my 2fa again so got a new account
What has been censored from the model? (mean besides the obvious anatomy)
nice to meet you sd3 ceo
I'm talking about characters from shows and stuff
🫡
Even with all the chaos of the last week, I'm still having fun with SD3 medium!
its a solid model. just prompt it in a different language or something
I wouldn't say solid, but there's some gold to be found.
Was it trained on multilingual stuff?
Thank you emad very cool
this model sucks, there's not enough anthro ladybugs in the dataset!!!
<sarcasm>
mebbe
oh guys look its the very real emad
updated my screen name
Oh! I've only tried presents so far... I don't watch TV or movies, so this might be a stretch, but I'll try
He means another language like morse code or braille
french will do
skill issue if you aren't prompting in braille
damnit
I see no lack of celebrity data in this model
Oh wait, you might mean nsfw stuff of celebrities? That's illegal these days, so just put some shorts and a top on them
Well at least he didn't take an arrow to the knee
thats johny deeper
Actually only working with this, so prompt understanding is.... meh.
Now it makes sense, the wall of dots in negative is actually a very smart braille negative
exactly 
oh wow already
I suppose the point of this is to have a realtime VAE like the regular TAESD, but with 16 channels and stuff
I tried it 
What the fish doing
it needs some time to get it doing okay stuff, but okay is the upper end. Mac User, waiting for Draw Things...
The girl was meant to be underwater ...but i like how it came out 🤣
No that wasn't what I was suggesting
Reminds me of Gyo by Junji Ito, great story
We found hidden footage from deep inside Stability AI just prior to the Stable Diffusion 3 release.
one of us
Oh, thx for reminding me of Junji Ito! Will try to generate something in his style. Love Tomie
be sure to share here gotta see that
@upper snow With your custom node for ODE samplers, do you have any guidance on which scheduler is best? I tried a number last night and they all seemed to give identical output except for one (I think it was "simple"). I just saw this repo and the corresponding arxiv paper for GITS (third link in the repo readme), which appears to be a new scheduler optimized for ODE diffusion models. https://github.com/zju-pi/diff-sampler
any other open apis without registration out there?
damn that guy followed all of the discord drama hahaha
ok I can't be doing Junji Ito. It's too much nightmare fuel
So SD3 does know something about his style
I have done so many in his style before, but not with SD, what an awesome idea
I'm still trying.
um. Hm. Maybe don't do it
Yeah, junji ito style is ๐
I find it easier sometimes if you say their name, then also the name of one of the characters they play
This one's pretty good
hahah. Ok...I think I need to stop
Pls don't 🤤 Love this!
gonna need to purge my hard drive after this morning
How is that different from the Flash version on there? (in layman's terms please :D)
Done with Taesd via huggingface - SD3
you got a link to the flash version?
That's a big gerbil
This one is awesome! OK I need to get out my old Junji Ito prompts
I have to get my Bible and bless my PC now
As I understand it (being anything but a pro, expert, or even informed), it's a very small encoder, so it shows the generation quickly, but it struggles with a lot of tokens and ends up ignoring some of them
But, but reference images for later
eyebleach
I would say there's a tad more control via CFG and step choice, and therefore better results, especially for longer prompts (only very quick testing)
if you click the advanced settings button, you can change the seed, and try all the various seeds with your prompt, to find the best one. Each is VERY different.
I Wonder... what if... Junji Ito + girl laying on the grass?
Thank you! Fortunately now I can do side by side comparisons, with both sites.
a DVD screengrab, of a horror movie, of a lady laying on the grass
Just looks like a lady that got into a bar fight is all (all limbs are fine)
https://huggingface.co/spaces/philipp-zettl/stable-diffusion-3-medium wanna try a third one? I wonder if there is a difference from the Taesd3 version
okay, those are the same
Getting there slowly
Taesd3 web, Junji Ito
Just looked into the SDXL channel after a while. This here looks a lot like kindergarten
I like this one the least of the 3. I think I like the Taesd3 the best
taesd is fun because you see how the image generates which gives some nice insights
The flash one has the instantaneous change the seed and get a new image, thing though!
for non-human i got the feeling less cfg around 3 is good and for humans you can go up to 10
๐คฃ

Perfect
Lmao
Junji Ito and Banksy
Inflatable toy of... Steven Seagal 
@desert garnet
Sensei, that's why he's the goat
Have you guys seen the video on SD3 by Latent Vision? It's pretty interesting
Lol I'm watching it right now. "Let's confuse the little bast*rd." Boom it worked.
there is so much good information in that video, really explains SD3 well without bashing it too much
i hope the video can restore people's hope in SD3 2B
No. He shows hacks to make SD3 work. It's still too much work.
If we can automate the scrambling, then we have a solution
The architecture for SD3 is so good, especially its VAE. I just wish we had better weights for it, and a clearer license. :/
It really had the chance to be the last image model we'd ever need.
(Octane render, Unreal Engine, by Weta Digital, Cinematic, Photography, DSLR, Nikon D750, Short Exposure, F/2.8, Cool Color Palette, Kodachrome, 32k, Ultra-HD, Super-Resolution, HDR, DCI-P3, Natural Lighting, Flare, Cinematic Lighting, Contre-Jour, Beautiful Lighting, Ray Tracing Global Illumination, Ray Tracing Reflections, Ray Tracing Ambient Occlusion, RTX, VFX insanely detailed and intricate, hypermaximalist, elegant, ornate, hyper realistic, super detailed) A whimsical robot finds an unexpected garden in the heart of the city, with the robot being curious on exploration and its interaction with the vibrant flora and fauna. The scene is complete with vivid colors and intricate details.
use this alternative detail prompt if it gets truncated: (hyperdetailed, masterpiece, 108mp, hyperintricate subtle details, cinematic volumetric lighting, imax, extreme contrast)
This is nice!!
thanks!
would use it as wallpaper
True. In 8k would be nice
i used 8b for that, 2b gives somewhat ok of a result
looks nice, did you use an upscaler?
yeah. i used a glif app that has one equipped with it
Such a beautiful image 🥺
trying different stuff with oil painting
some 2b images using the same prompt
My hope for SD3 is being restored already. I kinda like the results I'm getting. The only problem so far is the anatomy.
If anatomy were the only problem in SD3, it could be ironed out with a finetune, but there are so many weird issues and quirks with SD3 that it might take a long time to fix them
2B
Comfyui or SD.next?
missing a wheel but looks nice
Still, this would not be my wallpaper
yeah well it's less "artsy"
Hate the License as well
thats super nice 
why do I get the feeling this is a woman laying in grass?
That was the attempt lol. Wanted to make it in style of junji ito...
kinda cool honestly lol
lmao
It needs to be pointed out that this is very old-school prompting. It will still be ideal for the SD XL type models, but that sort of prompting relied on just throwing in a ton of keywords and hoping the parser made something nice from it.
Yeah i modified it a lot
the end result was more like an oil painting, so the unreal engine prompt didn't make any sense
gotcha. i was just experimenting with those details that i used with the older models such as 1.5 and 2.1 back in the day to get the best result out of the images that i generated there.
an app i'm using to generate sd3 images has an llm in between so it redetails some of the stuff in a more human tone and truncates it without getting rid of the most important things of the prompt (mostly)
It takes some real readjusting, so it was not meant as a bash, just a heads up. When Dall-E 3 came out, the first to really emphasize natural language prompting, it took quite some time to 'relearn' prompting. Nowadays with MJ6, Ideogram, and SD3 (8B and 2B), natural language prompting is the norm.
To be fair, before it made no sense to bog it down with natural language as the parser would almost randomly choose which words to latch onto and which not. So avoiding anything 'distracting' was the underlying key to prompting.
Qwen2 prompt
Latent Vision has interesting tips for SD3 prompting, saying that we should start first with the most difficult elements to generate and then the easier elements and then background and lastly everything else, in this order
This is not actually new advice, and applied to old-school prompting as well. It was just how you wrote it that was different.
Before, and still applicable today, you started with the main topic, then threw in some key info and details, then proceeded to stylistic desires, and finally into aspects of quality, detail, etc.
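A minimal sketch of the two orderings described above, assuming prompts are just comma-joined fragments. The helper `build_prompt` and every example fragment are hypothetical illustrations, not anything from the Latent Vision video or from SD3 tooling.

```python
def build_prompt(*parts):
    """Join non-empty prompt fragments in the order they are given."""
    return ", ".join(p.strip() for p in parts if p and p.strip())


# Ordering attributed to Latent Vision in the chat:
# hardest elements, easier elements, background, everything else.
latent_vision_order = build_prompt(
    "two hands holding a ceramic mug",   # hardest element first (made-up example)
    "a woman in a wool sweater",         # easier elements
    "a rainy cafe window behind her",    # background
    "soft morning light, 35mm photo",    # everything else
)

# Old-school ordering described in the chat:
# main topic, key details, style, quality tags.
old_school_order = build_prompt(
    "a woman in a cafe",                 # main topic
    "holding a ceramic mug, wool sweater",
    "oil painting",                      # stylistic desires
    "highly detailed, masterpiece",      # quality / detail tags
)

print(latent_vision_order)
print(old_school_order)
```

Both lists end up as one string; the only thing the sketch varies is which fragment the text encoder sees first.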
Who you gonna call?
why is he giving a scuffed mutated middle finger lol
That's what i wanna know. But call center is busy
Today's parsers are far more advanced, so if you don't respect this order, they can usually do it for you, and are thus far more forgiving. This means they respect your prompts far more as a rule, and are far more user-friendly by nature.
his arms are fused too
It's like the mutant creature from Altered States making a phone ad.
A great movie from 1980 that fused Huxley's The Doors of Perception with sci-fi extrapolation/speculation
zipper man in a dark forest
darth zipper
robots all have the same shape, it's hard to get another type
the upper body is almost the same
where is the best place to browse \ discover local image gen models?
๐
huggingface or github
death metal
buzz (sd3 8b)
this is like too sharp
SD1.5 finetuned
@ancient cape and SDXL Epicrealism Hades
seems like the upscaler did an extraordinary job with that one. makes me wonder if it's really meant for close-ups
Touching grass tutorial.
try crocodile
am I the only one who enjoyed SC ?
Just because it wasn't cared about too much
I did too.
Having fun with this ComfyUI to Photoshop Plugin
I really liked Stable Cascade too, but it felt more like a transition to SD3
Stable cascade was just too slow on my mac
i was able to make 4k * 4k images without upscale on 8gb ram
ill try with SD3 when I have time
did anyone try double upscale with sd3?
lying in lava
that dude's expression
"They pay us to do what?"
so, ahem guys, today's assignment is
The same Felicity Jones prompt after 4 epochs of training the wider dataset, way further along than when training in isolation. It shows that a lot of the learning was probably the prompt format and high quality photo tag (which is trained in a lot of other images), whereas the actual subject is quite easy to learn. Previously it took 16+ epochs to begin showing any likeness at all.
pikachu after the LOBOTOMY and CENSORING of SD3 (meme)
Pika on fentanyl
how long did that take to train 
the dataset was ~8k images and each batch was the same image repeated twice at a high and low timestep. About 4 hours per epoch, training only the transformer with bf16, stochastic rounding and fused backpass. Encoded prompts were pre-cached, though the images were encoded through the VAE again each time with flipping.
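A rough sketch of what "the same image twice, at a high and a low timestep" could look like as a batch plan, plus the per-batch time implied by the numbers above. Everything here is an assumption for illustration: the 1000-step range, the split at the midpoint, and the names `paired_timestep_batch` and `epoch_batches` are hypothetical, and the actual training script was not shared.

```python
import random


def paired_timestep_batch(image_id, rng, num_train_timesteps=1000):
    """Two-item batch: one image at a high and at a low timestep (assumed split at the midpoint)."""
    half = num_train_timesteps // 2
    low_t = rng.randrange(0, half)                      # low-noise half
    high_t = rng.randrange(half, num_train_timesteps)   # high-noise half
    return [(image_id, high_t), (image_id, low_t)]


def epoch_batches(image_ids, seed=0):
    """Yield one paired batch per image, in shuffled order."""
    rng = random.Random(seed)
    order = list(image_ids)
    rng.shuffle(order)
    for image_id in order:
        yield paired_timestep_batch(image_id, rng)


batches = list(epoch_batches(range(8)))  # tiny stand-in for the ~8k-image dataset

# Back-of-envelope from the quoted figures: ~8000 batches in ~4 hours.
seconds_per_batch = 4 * 3600 / 8000
print(batches[0], seconds_per_batch)
```

With one batch per image, the quoted figures work out to roughly 1.8 seconds per batch, treating both "~8k" and "about 4 hours" as approximate.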
trying one of my own characters with more training data, there's no apparent likeness aside from maybe the outfit colour. So it seems it was probably riding on the original person being known to the model somewhat
cctv footage of an army of cats inside a bedroom at night, nightvision view
Thanks
Any response from SAI to Civitai yet? News? Updates?
No
last we heard they said nothing
All my attempts at 'a photograph of a Female eating a banana' were a FAILURE...
This one is SDXL 🤣
i mean that's one way to eat a banana
People eating food isn't one of SD's strong suits
Lmao
It's my favorite of the models thus far.
people doing anything isn't a strong point
๐คฃ๐คฃ๐คฃ๐คฃ
using LLM output for the prompts definitely makes it feel useable, I like letting it run all night and seeing what it came up with in the morning
I guess just ask it for deformed horrors. Makes sense. Strong suit.
it's like it has built in monster loras
Just don't use humans ROFL ROFL ROFL
Qwen2 prompting for the three encoders
I'm just doing phi3 medium with the triple clip
Has anyone tried using natural language in the negative prompt? It's hilarious. `The image is a close-up, cropped mess. It's hard to tell what's going on because of the poor quality and weird colors. It appears to be a drawing, possibly anime-inspired, but it's poorly done.
The main focus seems to be a creature with two faces. Both faces have ugly eyes and deformed features. The hands are especially disturbing, with extra fingers, mutated proportions, and interlocked digits. There might even be extra limbs poking out of the frame.
The overall impression is one of pain and disfigurement. The artist seems to be trying to depict some kind of mutation, but the execution is clumsy and unsettling.`
Latent Vision has done an excellent video on SD3 - a neat way to use the negative prompt (which SD3 doesn't seem to like at all!); and a way to regularly generate at 1728x1024 resolution (low CFG apparently!) https://www.youtube.com/watch?v=OrST6Nq1NUg
How does SD3 work? Is it any good? No drama, no politics, only the technical side of things.
The SD3 Negative node is part of the Comfy Essentials: https://github.com/cubiq/ComfyUI_essentials
Free SD3 generations at OpenArt: https://openart.ai/create?ai_model=stable-diffusion-3-sd3
Discord server: https://discord.com/invite/W2DhHkcjgn
Github ...
Also from "Latent Vision" - a 'hack' to stop deformed bodies is to step away from the "dimensions must be divisible by 64" rule. He suggests 1044x1044 - try it and see?!
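For anyone wondering how far off the usual grid 1044x1044 actually is, here is a small arithmetic check. The helper name `divisibility` is made up for this sketch, and it says nothing about how any particular UI rounds or crops such a size.

```python
def divisibility(width, height, factors=(64, 16, 8, 4)):
    """Report, per factor, whether both dimensions are exact multiples of it."""
    return {f: (width % f == 0 and height % f == 0) for f in factors}


for size in [(1024, 1024), (1044, 1044), (1728, 1024)]:
    print(size, divisibility(*size))
```

1044 is a multiple of 4 but not of 8, 16, or 64, so the suggestion really does leave the usual grid, while 1728x1024 stays on the 64-pixel grid.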
That one is more natural than the one i asked ai to make Cx
"The deformed creature, with its mangled limbs, disfigured face, and body, was twisted and distorted, its abnormal proportions making it look unnatural and aberrant, monstrous and grotesque, misshapen and unusual, anomalous and deformed, crippled and ghastly, its very existence a grotesque anomaly in the natural world, demons, horns, horror"
Grotesque anomaly in the natural world is so funny to me lmao
It works surprisingly well, sometimes it stops me from making ugly images on purpose
It makes you want to put it in the positive
SD3 Civitai unban when? 
2 weeks
^..^<
Wouldn't have it any other way.
๐
welp thats a tad terrifying
I'm not entirely sure on its surface if this is a general ODE solver or if it's like DPM-Solver++ (DPM++ 2M) and is hyper specialized for diffusion models and not necessarily flow models, but we'll take a look at it!
When their earnings start to subside?
Kissing, maybe
Large Format generations using Latent Vision's idea of small CFG numbers/no negative prompting to make clearer output at the higher resolution. 1728x1024
My comprehension of the paper was low, but even though they talked about it as a sampler, they seemed to deal more with the noise schedule than anything else.
ya there's some good info in his video i'm running with the changes now to see how it feels
These are perfect
ah, the latest one is specifically for probability flow ODEs (which afaik does not include rectified flow models). It might still work but the general ODE solvers will probably work better
One of those fat gentlemen even got nipples
^..^<
Are there any finetunes of SD3 available yet?
There are some SD3 Controlnets available for A1111
Haha, I don't know what those two categories mean! I guess this is a sampler rather than a scheduler, then?
Hopefully, as rectified flow models start to take over, we'll start seeing more papers looking into better sampling methods for them specifically. But for now we can surely settle for using the same ones people design airplanes with.
SAI is done. I see no future in SD3 architecture.
1728x1024
i'm trying out the shifted 6.0 with no negs for the details and it seems like it does change
SD3 8B won't be released at this point.
It is a sampler/solver yeah. The main difference here is that diffusion models are an SDE, and SDEs are incomplete, and you have to model the incompleteness using random perturbations (like the extra noise injected by SDE/ancestral samplers). Rectified flow models are ODEs, they are complete problems, and you can run them forwards or backwards.
Yes, change on an image-by-image basis - I mean don't leave it set at 3
This model is like a puzzle... Like a riddle!
As long as you're not making any NSFW content SD3 is great, can't wait for finetunes
Do you have any recommendations for the default schedulers then? Or do the ODE samplers remove the scheduler effect?
i like it too but its a bit deeper than just NSFW
I'm trying to dream-up a Model Merge using PiXart-Sigma and SD3-Medium ... A model-merge node rather than a checkpoint mash
The adaptive ODE solvers literally only take a start point and an end point. So just use sgm_uniform, because no other scheduler will make a difference (or if one does, the result will be wrong). The start point should be 1 and the end point should be 0.
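A toy sketch of why the scheduler stops mattering once the solver is adaptive. This is not the custom node's code and not SD3: `velocity` is a made-up 1-D field standing in for the model, and the step-doubling Heun scheme is just one simple adaptive method chosen for illustration. The fixed-step solver walks whatever list of times it is handed; the adaptive one only receives the start (1) and end (0) and picks its own steps.

```python
import math


def velocity(x, t):
    """Toy stand-in for the model's predicted velocity (assumption, not SD3)."""
    return x * math.cos(4.0 * t)


def heun_step(f, x, t, h):
    """One Heun (RK2) step of signed size h."""
    k1 = f(x, t)
    k2 = f(x + h * k1, t + h)
    return x + 0.5 * h * (k1 + k2)


def solve_fixed(f, x, times):
    """Fixed-step solve: the result depends on the schedule in `times`."""
    for t, t_next in zip(times, times[1:]):
        x = heun_step(f, x, t, t_next - t)
    return x


def solve_adaptive(f, x, t_start, t_end, tol=1e-6):
    """Step-doubling adaptive solve: only the endpoints are supplied."""
    t = t_start
    h = (t_end - t_start) / 4.0          # signed initial guess
    steps = 0
    while abs(t_end - t) > 1e-12:
        if abs(h) > abs(t_end - t):
            h = t_end - t                # do not overshoot the end point
        full = heun_step(f, x, t, h)
        half = heun_step(f, heun_step(f, x, t, h / 2), t + h / 2, h / 2)
        if abs(full - half) <= tol:      # accept and try a larger step next
            x, t = half, t + h
            steps += 1
            h *= 1.5
        else:                            # reject and retry with a smaller step
            h *= 0.5
    return x, steps


x1 = 1.0                                          # starting value at t = 1
exact = x1 * math.exp(-math.sin(4.0) / 4.0)       # closed form for this toy field
uniform = [1.0 - i / 10 for i in range(11)]       # evenly spaced 1 -> 0
skewed = [(1.0 - i / 10) ** 2 for i in range(11)] # same endpoints, different spacing

x_adaptive, steps = solve_adaptive(velocity, x1, 1.0, 0.0)
print("fixed, uniform:", solve_fixed(velocity, x1, uniform))
print("fixed, skewed: ", solve_fixed(velocity, x1, skewed))
print("adaptive:      ", x_adaptive, "in", steps, "steps")
print("exact:         ", exact)
```

The two fixed-step runs disagree with each other because they were handed different schedules; the adaptive run has no schedule to disagree about, which is the point being made in the chat.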
let's see if this 3060ti can handle a 1728x1024 with upscale
Overload the negative with random characters, make a weird resolution, perturb it, trick it in the prompt and it works!
Emad left us a puzzle! It was his way of saying bye.
Latent Vision's Negative Prompt (I kid you not!) aaaaaaaaaaaaaaaaaaaaaaaa aaaaaaaaa aa aaaaaaaa
Cool, thanks. Might be useful to explain in the GitHub readme for non-experts.
With LoRAs it will be perfect
accidentally did 1728x1032 but it came out pretty good, even if it took 2mins
Squirkittens
afk
6912 x 4128 after upscale
that's crazy it's so hard to tell it's not a picture from a camera now
Sadly there is another eye at the end of the tail
Where
how can you even spot that
ya it's still not perfect but the pixel definition are ๐ค
kitty evolves
That too
that's just the mouse it's saving for later
With a trained AI's eye
Look what i just made (surely not ai)
You say that like it's a bad thing?
you have gifted drawing skill you should try to sell your stuff
wow you are goood
Oh my god that's so small xD eagle eyes stuff
let it fly the world is a weird place
@craggy ridge Here is your drawing! Haha sadly the api isn't bad enough at "woman lying in grass" to do the meme consistently
halved the initial latent and it still comes out pretty good after upsizing to 3456 x 2048
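Quick arithmetic on the sizes quoted in this thread. The 4x upscale factor is inferred from the numbers people posted, not something anyone stated, and "halved" is read here as halving each side of 1728x1024; both are assumptions. The helper `scale` is hypothetical.

```python
def scale(size, numerator, denominator=1):
    """Scale a (width, height) pair by numerator/denominator using integer math."""
    width, height = size
    return (width * numerator // denominator, height * numerator // denominator)


base = (1728, 1024)
halved = scale(base, 1, 2)               # each side halved
upsized = scale(halved, 4)               # 4x upscale of the halved size
earlier = scale((1728, 1032), 4)         # the earlier 1728x1032 render, 4x

pixel_ratio = (halved[0] * halved[1]) / (base[0] * base[1])
print(halved, upsized, earlier, pixel_ratio)
```

Under those assumptions the halved start is 864x512, which is a quarter of the pixels to diffuse before the upscaler takes over.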
So much blur im thinking it's sdxl 
Discord doesn't let me, don't know what word is banned
Photo of a group of people riding a roller coaster on a clear, sunny day. The roller coaster, named "RollerCoaster Tycoon 84," has blue and yellow cars filled with excited passengers, many of whom have their hands raised in the air. The wooden tracks and supportive beams of the coaster are prominently visible, with the bright blue sky serving as a backdrop.
those supportive beams
i didn't have any problem with censorship?
I only sent the G clip text.
T5 clip text: The image captures a thrilling moment on a roller coaster named "RollerCoaster Tycoon 84." The coaster is filled with an enthusiastic group of passengers, all of whom are experiencing the exhilaration of the ride. The front car prominently displays the name "RollerCoaster Tycoon 84" in bold, yellow letters against a deep blue background. The passengers, a diverse group of men and women of various ages, are clearly having a fantastic time, with many of them throwing their hands up in joy and excitement as the coaster plummets down a steep drop. The wooden structure of the roller coaster is sturdy and intricately designed, showcasing the engineering marvel of the ride. The beams and tracks are detailed and create a sense of speed and motion as they lead the eye through the twists and turns of the coaster. The backdrop of the image is a flawless, bright blue sky, enhancing the sense of an ideal, fun-filled day at the amusement park. The vivid colors, dynamic angles, and the expressions of the riders all come together to create a snapshot of pure, unadulterated joy and thrill, perfectly encapsulating the essence of a memorable roller coaster ride.
L clip text: Photo, vibrant colors, dynamic composition, high contrast, sharp details, action-packed, lively atmosphere, detailed textures, energetic mood, high resolution, playful scene, immersive feel, bold typography, vivid imagery, fun and excitement theme.
I could send them in one discord message each, but not allowed together...hmmm
funny, you use L and G exactly the other way round i do.
Could I interest you in some fresh mouse crepes or a succulent house plant?
L is better for object and G better for style?
Lmao i love this
its what i use, but is it better? i dont know. i just find it funny you use L for style and i use G for style
maybe we should use T5 for style instead? who knows?
