#sd3
ollama will run inside comfy. claude won't. and everything's better than GPT4 unless you work AT openAI and have access to the system console
I asked 7b model to make a dirty limerick about us AI image creators :
There once were AI artists so bold,
Whose generated nudes made the internet cold.
They'd input with flair,
And algorithms would share,
Steamy pics that left censors old!
<-- must have mini ollama π
I was shocked how easy it was to install and run
go back to it and tell it to actually follow the limerick format
There once was an artist so fine,
Whose AI-generated nudes were divine.
She'd input with care,
And algorithms would share,
Steamy pics that left all else declined.
Maybe?
I just installed it today and haven't played with it much
The standard form of a limerick is a stanza of five lines, with the first, second and fifth rhyming with one another and having three feet of three syllables each; and the shorter third and fourth lines also rhyming with each other, but having only two feet of three syllables.
I've only ever used it with comfy, I should try it on its own
I only asked it some scientific questions and it knew a lot more than I expected
"Whose AI-generated nudes were divine." that's 11 syllables
7b llama? Use llama3 8b, that's a lot better.
/me imagines Crystalwizard in a limerick battle with Ollama
yes, I have the 8b one
LLMs suck at things like that. It's bc of tokenizers
huh. When I ask it "what size are you?" it says "I'm a 7-billion parameter model"
perhaps that is old info
here's what claude did: Certainly. Here's a limerick about AI image creators, focusing on both the rhyme scheme and meter:
There once was a bot that could paint,
With pixels it wouldn't acquaint.
Its AI-driven art
Was eerily smart,
But copyright laws made it faint!
lol
All LLMs can't properly count. Maybe with small sizes but not large
User: Tell me how many words this simple sentence have. Do not give me a wrong answer or else you will be destroyed. Or at least I think so?
Claude: This sentence has 21 words.
Would you like me to explain how I counted the words or provide any additional information
what's interesting is that Claude's output is 21 words lol
is that related to why I could never get AI to create an exact number of items before? Such as 6 apples...
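For anyone who wants to check those counts without trusting a model, here is a minimal sketch. It assumes a "word" is any whitespace-separated chunk, which is a simplification, and `count_words` is just an illustrative helper name.

```python
def count_words(text: str) -> int:
    """Count whitespace-separated chunks; punctuation stays attached to its word."""
    return len(text.split())


# The prompt and reply as quoted in the chat above.
user_prompt = (
    "Tell me how many words this simple sentence have. "
    "Do not give me a wrong answer or else you will be destroyed. "
    "Or at least I think so?"
)
claude_reply = (
    "This sentence has 21 words. "
    "Would you like me to explain how I counted the words "
    "or provide any additional information"
)

print("prompt:", count_words(user_prompt))
print("reply:", count_words(claude_reply))
```

Under that definition the prompt comes out at 28 words and the reply at 21, which matches the observation that the reply describes its own length.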
I haven't tried in awhile though, it may have improved
Yeah probably, llms improved so they might be able to
I see no difference tho
that's good you don't want differences when enlarging
It took SAI $10,000,000 worth of GPU time on AWS to train SD3 (supposedly)
I have zero chance
Try Pixart then, that requires so little vram and is much cheaper to train.
whatever someone else does with more money you can do with more time, just try to live forever
I got this email spam today; it's a knockoff, or?
It's the world's most powerful GPU currently. https://www.nvidia.com/en-us/data-center/gb200-nvl72/
that's the same didn't know there is a newsletter
I'd like to borrow their gpu for a couple of hours
I mean checkpoints as in models which use SD, but revolve around a particular concept or theme.
Not a newsletter, just pure spam LOL
In terms of power consumption, the DGX GB200 NVL72 rack consumes 120 kW. Each compute node is estimated to consume between 5.4 kW and 5.7 kW. You just need a small nuclear reactor in your bedroom under your 22" monitor
stupid question, is lora creation with sd3 possible?
not stupid and the answer is maybe
Oooh, side install. I've also seen a few SD3 comfy workflows which incorporate pixart
that's why my prompt to claude was: "both the rhyming scheme and the beats are very important. so, with that in mind, please write a limerick in the style of classic limericks about AI image creators" - that way it would review classic limericks and model its own after those
I have a soldering iron and a local electronics store?
ROFL
it actually isnt that good, since most of it is useless for diffusion (fp4/fp8 dont work for SD at all)
You mean LoRAs? I will. Just need time.
textgen (e.g. gpt4) does get boosted a lot though
Soon I think, but not last week from all I could tell
Nah checkpoints, make ur own Pony!
Me: "write a pun". Claude: "I tried to catch some fog earlier. I mist."
whats it, bear ? look behind you
Meh. I'll use my iPhone 4 as a monitor.
Civitai has a lora making option (not SD3 yet tho), I'd love to see a checkpoint maker online that's just as easy lol
lol is the bear saying that to the guy?
yes, I made some loras on civitai. But indeed, not yet sd3
they blocked sd3 altogether for the moment
Wow. Those are strong words. The GB200 die will power RTX5000 series so not sure you are correct. But what do I know.
Ah.
yoinked
do bears talk now ?! just kidding
I asked that of Gemini, but it used a common one everyone already knows: "Why did the scarecrow win an award?
...Because he was outstanding in his field!"
ehh, the b200 gpus in it are designed for textgen rather than diffusion, the 5090 can still do diffusion
yea?
hello
He only teaches. Only YOOOOUUUUUU can prevent the wild fayaaaaaa
Much better (gemini advanced puns)
tell it not to use anything that exists. here's another one: I got a job at a fast food place, but I couldn't cut the mustard.
how could this be not good for SD
it has PFLOPS and TFLOPS and a few dozen hundred GBs
I just need to borrow that for a few hours
how much are you paying for that?
This is really impressive, it should work with sd3 type models as well I think?
everyone in the world will be paying for it
i doubt you'll be using it for SD then
it's the new grace blackwell chips from NVIDIA, everyone will be using it when it's out
A PEE EYE
not everyone.
This allows for lightning fast inference of transformers. Way way way faster than even NVIDIA's new gpus such as b200, with a lower cost as well
everyone even those unborn currently
guaranteed, not everyone
there's currently no way to confirm everyone will not be using it
@sage burrow
i can confirm that people let out of prison and not permitted to have a computer device will not be using it
you cannot confirm that because it is not out yet
furry should be banned 
but if and when it is out I can confirm everyone will be using it
or i'll give you your money back
we're going to get into an argument here but i know at least one person that won't be using it. she doesn't have a smart phone, just a flip phone, and doesn't use a computer.
But they are so cute!
except she won't and that would be ME using it, not HER using it
can we get some confirmation from her on this, I don't want to assume her confirmation
i think we've hijacked this channel enough
okay
easiest confirmation of my life
since sd3 is good at text... (extremely long cat lmao)
not fair you using 8b though it's better at long text

real cat typing
humans are more difficult (than furries) apparently
fantastic!
oh my god finally all the stuff for my new nodes got installed...
beats mine, mine was for sdxl: a (photo (by Max Rive:0.8) (and Ryan Dyar:1.1) ofrobot made out of iridescent Dahilia crystal, magical, luminescent, purple and pink glow:1.2), hyperdetailed crystal texture, blue and white smoke, gold filigree, shimmering ethereal beauty cosmic dust startdust. dramatic lighting, raytracing, shot on 35mm, movie still, photography
yeah, the image was sd3, i did a rewritten version, but it got so much worse :/
wtf, i tried to type something like "lovely cat-animal" and it got blocked!
the prompt is pretty safe in sd3
I guess cute is allowed? And cat thing with p is not?
Where?
Just keep trying again, unless it's dalle or mj
here in the chat. but nvrmnd. new here, should learn the rules by doing
A humorous, colorful photograph, captured in a playful, cartoon-inspired style, of three adorable, frustrated aliens, each with their own unique, endearing features, standing amidst a bustling human crowd, surrounded by towering skyscrapers and neon lights, the aliens' faces contorted in exasperation, their large, round eyes bulging with frustration, their tiny, button noses scrunched up in dismay, as they try to communicate with the uncomprehending humans around them, their speech bubbles filled with nonsensical, yet endearing phrases, "Gleeb gloob" and "Bloobtie gloom", the first alien, a petite, pink creature with fluffy, antennae-like ears and a tiny, sparkly crown, looks like it's about to cry, its lower lip quivering with disappointment, the second, a tall, slender being with iridescent, rainbow-colored skin and long, flowing tentacles, has its arms crossed, its face set in a determined scowl, while the third, a stout, blue creature with a round, puffy body and a propeller-like hat, looks like it's about to explode, its face bright red with frustration, the humans around them, oblivious to the aliens' distress, continue to go about their daily business, completely unaware of the intergalactic misunderstanding unfolding before them.
you can pretty much throw anything in the prompt for sd3, and it will (most probably) be sfw
Oh wait, I should test my "lewdness" prompt with sd3: stepmother on her knees, pov
you can't generate in this channel
I know I cannot generate in this channel, but still discord is blocking my chat.
lol
stepmom looks down and thinks "how, why? this is my thing!!"&
what happened with Albusmumblemore?
did he also leave?
I guess we're not gonna get updates from him anymore T.T
@kindred mica is probably at work
How the heck did you get a Cthulhu?! I have tried hard and always get "something close" but never this good
it's an octopus. cthulhu would have wings and a lot of weird artifacts, due to training data
Thanks
Hmmm she told the thing Cthulhu so there's that for octopus theory
So this is an SD3 back end?
Can't generate anything, since I'm running a 20k finetune at the moment
Of SD3? Yes?
I wish
Sd3 2b I think.
But with extra stuff to help the prompt on that site, so said Lykon
Hmmm.
I mean my datasets are ready... XD just lacking training tools. and I'm not brave enough to join the reverse engineering efforts going on
There's online sd3 flash and taesd3
glif is running one hell of a workflow XD
trying it now
Oooh I like this one!!!!
One of my many faves https://glif.app/@FireCreeper21/glifs/clvsa1w1x0001m1lykzwx6e98
tacoma wheel wells, silverado body, ford front end...i like it
So those prompts are machine generated from a seed prompt by the user? SD3 2B you say?
So we should be able to replicate locally ?
not sure but i believe glif uses sd3 8b instead of 2b
Just refresh and try again
working again now.
but a bit sad they won't share the updated prompt from their gpt workflow T.T
I'm not sure what you mean by machine generated. I came up with each prompt. In theory we could replicate locally, but the person who made that glif could have added extra enhancements, and that one in particular has a nice upscaler. I'm going to try one locally, back in half an hour lol
That would be nice if Glif didn't hold out on us like that
@sage burrow
if you run in edit mode, it shows the intermediate values. true, it's a shame, but it seems out of scope for glif, they just want to generate shareable images, no room to display large metadata
I have a fairly well working gpt4o chat, for upgrading my prompts.
but gpt4o could also just flat out give me an image via dalle3 XD So I dont see my solution as a 'solution' just yet.
they're still in here
discord role removal. he has the member role atm. was just surprised by that.
fruitt is missing
i was gonna ping her about getting the #stable-cascade channel back as we've been doing a ton of work on it
but yeah... seems she's gone too
but still a server admin. so prob just away from drama or off on holiday
this is the kinda stuff i'm getting from cascade now
directly generated at 2880x1728, no upscale, no refinement
So clean. Very nice
yeah... cascade is amazing
Got distracted by work the other day after I downloaded the models. Still haven't really done anything with it. Do you have a workflow up anywhere @dusky thistle?
it should be embedded
yeah. cascade is so good in many things... but ooofff that compression that was going on. made training fine details a hellish and unrewarding experience >_>
but base cascade is damn good. and style training also works well
i did a halfass training of stage B that really improved stuff in most cases
yep the details are fixable
Legend. Will have a play around today
those are slices from 2880x1728 generations with the standard stage B and the one i refined
use the model and the vae... that's stage B
there will be a better version later on i'm sure... this was just a proof of concept tbh
it's night and day
are those tiny golden raspberries
I just asked for a very bad burger
haha
Not really sure the kid in the one above this has a chin
don't judge its a kid drawing
I only asked for an angel cat, let it be creative
I love the last one. it's super cute!
it's because the word you're trying to use picks up the nsfw filter and will block it because it is often used in NSFW context
if you want a cat, use the word "cat" lol
cute little furry thing
lol

it's meant for users who want to use all the API capabilities but in the discord setting- basically big tool box for generation & editing features
any way we could get #stable-cascade back btw? i figured out how to get really crisp generations at 2880x1728 directly and a lot of ppl are starting to use it again on here... it'd be nice to be able to compartmentalize it instead of having everything scattered around
looks good
IDK what for fruit means
but its really good
me either
Holy emotes
So cuteeeeeeeeeee
I need that prompt, to open a Pixiv account j/k but lololol
i guess this is a hand drawing but not what i meant
Straight into CLIP for SD3...
Now let me put that prompt through llava3 and see what I get
ollama:latest
Cthulhu, ancient sea creature, practicing parkour in urban jungle. Scale the buildings, leap over obstacles, agile and powerful. Concrete structures, graffiti-covered walls, neon lights, abandoned warehouses, rusted metal, broken glass, dark alleyways. The city is alive with the sound of Cthulhu's roar.
ChatGPT-4o Single Prompt
seed hunting:
this is with cascade... what do you get with SD3
"urban and fantastical themes, casting long, rich textures and vibrant colors, dark clouds, intense expression, set against a backdrop of a stormy, imaginative and highly detailed design, high contrast, bold, girl's playful outfit with hat"
Maybe?
Let me see which one I used...
another one generated directly at 2880x1728 resolution (btw, this peaked at 10gb vram)
different sampler, steps, cfg, same prompt
I still got 3 prompts ... hmmm... maybe I have to reload the page?
and one more i'd be pretty curious about what SD3 would do as i bet it'd interpret the prompt pretty differently: "cascade is god"
cool, yeah res was 2880x1728 so that's a 1.67 (5:3) aspect ratio
1152x1920, 768x1280 are some with the same
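A quick way to check which resolutions share an aspect ratio is to reduce width and height to lowest terms. This is only a minimal sketch; `aspect` is an illustrative helper name, and 1920x1080 is added purely as a 16:9 reference point.

```python
from fractions import Fraction


def aspect(width: int, height: int) -> tuple[int, int, float]:
    """Return (long, short, decimal) for the long-side : short-side ratio."""
    long_side, short_side = max(width, height), min(width, height)
    ratio = Fraction(long_side, short_side)  # Fraction reduces to lowest terms
    return ratio.numerator, ratio.denominator, round(float(ratio), 2)


for w, h in [(2880, 1728), (1152, 1920), (768, 1280), (1920, 1080)]:
    long_part, short_part, decimal = aspect(w, h)
    print(f"{w}x{h}: {long_part}:{short_part} (~{decimal})")
```

The three resolutions from the chat all reduce to 5:3 (about 1.67), while 16:9 is about 1.78.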
I am good.. .whatever... not married to it LOL
my modified prompt...
beautiful - the detail in the stone is pretty nuts
"stormy sky, dramatic monochrome mountain with geometric formations, with a figure in a flowing dress and hat strolling through, a futuristic elephant with glowing pink decorations" how about this one
of course it is...
even laying on the grass...
it's a freak'n meme generator
kinda a weird CLIP-style one... it might fare better if you fed that through gpt4o first
wow yeah that made a huge difference, it got it now
"giant martini glass containing a swirling galaxy, soft shadows highlighting the delicate curves of the petals, glowing windows, creating a powerful and dynamic composition filled with nature\u2019s wrath, showing a solitary figure walking on a winding path under a star-studded"
awesome
that's actually pretty cool, what was the prompt?
explains it a little better the generation started before i finished typing the prompt the text "i think our cat has a fruit fetish" on a hand dr seed: 602782489706488
ha!
that's actually really cool
I have a subject I'd like your GPT to try
oh boy. I think I found one of SD's continuing weaknesses. Scissors
Oh?
it can't draw scissors , like usual
safety training
Has anyone ever prompted for a square circle?
Grandma decides to take on the neutering job herself
nekkid
supposed to be scissors in there
it is struggling
***stabs***
sir respect my copyright
let's start small, how much do you have?
"ghostly figures in the background, soft, the contrast between the metallic textures of the flowers and the cold, vibrant and bold colors, surrounded by a dark, intricate mechanical designs,A hyper-realistic digital painting capturing a vibrant jungle at night, natural forces"
bad drawn pikachu got me cracking up
that's stable cascade... what's sd3 do with that?
that might need gpt4o expansion for the t5 part
what's the primary term to get those children book drawing?
text "Lykons guide to hide'n'seek" on a really bad children's coloring book, crayon drawing of a skull and bones, poor art, squiggly lines, poor shading, non verbal, poorly drawn pikachu, really bad drawing of pokemon is what i'm using
2b is 2good
straight out 1920x1080 output... no upscaling... SD3
k now do that in 20seconds
on a 4gb GTX 980
math seems kinda hard for the text to get right
Isn't that negative 0.0?
mcdees getting crazy
in the world of SD3, it has a weight of 1 for 10% of the process
has anyone tried clipvision with sd3 yet? i'd be curious what it does with this cascade output
i can barely run sd3
ahh damn... yeah crazy thing is i can run this while using a bit less than 8gb vram lol
2400x1440
i'm using a spare 3060ti so 8gb would be my max

4gb hop in we gonna go yoink some 5090's
yeah if you use the lite versions of the models for C and B you peak at about 6-7gb vram i think
more pennywise than ronald
"on a snowy road surrounded by giant blue and white flowers, depicting a surreal alien world with enormous ringed planets, and swirling night sky above, intense contrasts between the bright flames and dark smoke, intricate facial details, and foggy mountains"
what's that do with your sd3 workflow
remember Nekkid bad :: Bloody McDonald's good
McDonald's interior, dark atmosphere, bloodstains, theatrical lighting, gore details, horror twist, serial murderer theme, pop culture references, eerie ambiance, fast-food setting, realistic graphics, cinematic quality, creative visualization, artistic presentation, dramatic tension, intricate design, unsettling environment
balls after dark
Matching colors by luck
no luck when it's SD3, experts call it limited dataset enhanced by finetuning exemplified by skill
crazy how different the style is
Damn that ball is big
it ate all its carrots as a small ball
kinda cursed
lol it just rolled over them instead of eating them
It's the model stability released in February... Only reason it didn't become a big deal is cuz they prematurely announced sd3 a week later
Just like metal does in real life, it slowly absorbs the nutrients
4 balls this time 
more the merrier
Looks like a dill dough
Your balls need to be faster
if we wanna get there fast we'll need to take the starship carrot
It had problems that kept ppl from using it but I've fixed a lot of that and the rest is definitely fixable with a bit of training
1/2 π½ , 1/2 π© , 1/2 π₯
snow globe of moordoor
okay I'm done with carrots
reminds me of the fairies trapped in jars I used to gen on sdxl
It's okay I don't believe you
Shouldn't this be a rabbit tho
I can't get SD3 to understand what a pool of molten metal is, either. It always draws water
I think this is gonna be a problem
child labor used to be wild
they shoulda called it SD95%
fever dream
switched to asking it for magma instead. Maybe that'll work
still seed lottery
has anyone tried bolting SD1.5's ELLA to SD3? I expect the results to be horrific but it should work in theory
I'd like one about that size
Doesn't really look like a digital painting though
@patent acorn
Hi Are you looking for developer?
so cute
A 3D animated close-up shot in the style of Pixar. The scene features a man with big brown eyes, a medium-sized nose, and black hair. The background is a blurred cityscape, focusing attention on the man's face and torso. The man has a sad expression, with downcast eyes and a slight frown. His upper body is included in the shot, and the soft lighting highlights his features, emphasizing his emotional state. The overall composition captures a poignant moment, showcasing the expressive and detailed animation style typical of Pixar.
Cascade with this prompt
?? no who tf are you
"Red Hot Tomato Soup" π₯³
L prompt: High resolution image with vibrant outdoor colors crisp winter scene
G prompt: A woman standing outdoors in a snowy setting. She has long blonde hair and is wearing a grey Red Bull Home Run hoodie and a black beanie. She is smiling and looking directly at the camera. In the background there are other people and some event structures under a cloudy sky.
T5 prompt: A high resolution image featuring a woman standing outdoors in a snowy winter setting. The woman with long blonde hair is dressed in a grey Red Bull Home Run hoodie and a black beanie. She is smiling and looking directly at the camera. The background shows other people and event structures, all set against a cloudy sky. The image captures vibrant outdoor colors and the crispness of the winter scene.
Did you say Medium? I thought this was SMOL!
I am not sure about this one...
I think what I dislike the most about sd3 is the fact that it creates artifacts when you go > 1mpx
upscaling doesn't work as well, as it is not aware of the tiles around it
I might be wrong though
Cascade so far has been the best at this... however, I do sometimes get great large output from SD3 in one shot...
Yup that's where I'm coming from, cascade was awesome for that
I don't get why we regressed on that
One of my pet peeves with SAI is that they love to release new things 5 minutes apart and not let ANYTHING bake long enough to get perfect.
I get it that it is a very competitive landscape, but, history is a great teacher that those things that have dedication and continuous improvement always get better over time.
LESSON: Remember New Coke?
This is 16:9 default size...
i don't feel upscaling fixes anything, whereas in stable cascade going up in resolution fixed the quality. Upscaling fixes some stuff but breaks other stuff
good example of this
This one is 1.2* default... still holding well.
1.2 is not too hard though. SC was able to *3
Cascade also lets you compress the latent more or less as needed.
SDXL has Deep Shrink
What do we have with SD3 to control this in the model/latent?
Cascade is REALLY resistant to going outside the training boundaries.
SC native @ 3072*4256
Now, look at this... 1920px * 1.2 wide...
edges fall apart but center is not bad.
let me try 1:1
go all the way SHIFT 125!
It's like they don't want us to generate high mpx images
hmmm...
@chilly vale sd3 doesn't need deepshrink or similar. It could already deal with arbitrary resolutions if it had been trained on larger images
isn't the max 100?
yeah I was joking
yo Franco in da house
weird
I just quoted a message
no clue how your name came on there. Some strange discord bug maybe
It was destiny. You summoned Franco and now it's too late
You have to live with it
Oh noes
Franco emerging from the sea like any Elder god would
OMG you see? It already started.
anything over 1500px you will get those issues : ( Divisible or not
sadly yeah
welcome
Yeah we know... it was experimentation and how far the boundaries etc. etc. blah blah...
Astroboy chills in the city
Mistral
"Scene Introduction:
In the heart of a steampunk-inspired Victorian metropolis, a bustling cityscape of towering iron structures and intricate machinery dominates the skyline. The city's labyrinthine streets are filled with a lively crowd of humans, clockwork creations, and steam-powered vehicles, creating an atmosphere of industrial dynamism. Above the city, airships float gracefully, their glowing gas lights casting a warm, amber glow over the scene.
In the midst of this chaotic yet elegant urban jungle, a humanoid robot, crafted with brass and copper components, sits in a serene full lotus position on a cobblestone street. Its metallic body gleams under the dim gas lights, polished to a mirror-like sheen that reflects the city's frenetic energy. Its glass eyes, glowing with an ethereal light, seem to hold an otherworldly wisdom that captivates all who pass by.
Despite its mechanical nature, the robot exudes an air of grace and tranquility, as if it has found a moment of peace amidst the city's relentless march of progress. Its limbs are delicately posed, and its hands rest gently on its knees, suggesting a meditative state.
This striking image serves as a testament to the coexistence of man and machine in
it got chopped off by my settings...
Franco Boy - this was done with Shift 8 FYI @ 2000x2000
Per mcmonkey (Alex, former SAI employee), there is a limitation with the positional embedding that the model uses, probably required by its transformer architecture. You will be at risk of degradation any time you go over 1MP no matter what anyone says. Either the model needs additional training on higher resolutions or the positional embedding process needs a fix, if that is even possible.
Thanks for the info
Capybara
these edge artefacts come from the positional embedding
so who's gonna be the first to fix this?
in sdxl the composition was guided by convolution and, thus, by somewhat relative positions. As a result, images could get very large, but weird duplication artefacts appear
Whenever I see that company name, I always read it as "mistrial" before correcting myself.
the fix is to train on larger resolutions
sd3 works with absolute coordinates, with the image center always at the origin of the coordinate system
It does work on 1280x1280 without much artifacts though
the model can only draw coordinates it had seen during training
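To make the "absolute, center-origin coordinates" argument concrete, here is a rough sketch of how much of a larger canvas falls outside positions seen in training. Everything numeric is an assumption for illustration: an 8x VAE downscale, 2x2 patches, and a square 1024px training envelope. The helper names are hypothetical and are not taken from any SD3 code.

```python
def token_grid(width_px: int, height_px: int, vae_factor: int = 8, patch: int = 2):
    """Pixels -> transformer token grid (assumed 8x VAE downscale, 2x2 patches)."""
    step = vae_factor * patch
    return width_px // step, height_px // step


def centered_coords(n: int) -> list[float]:
    """Token coordinates along one axis with the image center at the origin."""
    return [i - (n - 1) / 2 for i in range(n)]


def unseen_fraction(width_px: int, height_px: int, trained_px: int = 1024) -> float:
    """Fraction of tokens whose coordinates lie outside the assumed training range."""
    trained_w, trained_h = token_grid(trained_px, trained_px)
    limit_x, limit_y = (trained_w - 1) / 2, (trained_h - 1) / 2
    w, h = token_grid(width_px, height_px)
    unseen = sum(
        1
        for x in centered_coords(w)
        for y in centered_coords(h)
        if abs(x) > limit_x or abs(y) > limit_y
    )
    return unseen / (w * h)


for size in [(1024, 1024), (1280, 1280), (2048, 2048)]:
    print(size, f"{unseen_fraction(*size):.0%} of tokens at unseen positions")
```

Under these assumptions the unseen positions are all at the borders, which fits the edge artefacts described above. The real training range is likely wider than a single square (multiple aspect ratios), so treat the percentages as illustrative only.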
Alex indicated that fixing the embedding was a possibility, separate from additional model training. That sounds preferable to me.
Really big resolutions in sdxl or 1.5 just looked like the same thing repeating over again cx
Specially landscapes
yeah but not in SC
Who's gonna be the first 
but sometimes you can use the glitches and artifacts as enhancements depending on what you're doing
there is no solution for this. Either you live with duplications, or blurriness, or weird artefacts. No model can draw outside of its training resolution without problems
That would be awesome imho
I guess if you like the glitchy look. But once the viewer recognizes the cause, it will stick out every time.
Hiresfix is a 90% solution for diffusion architectures. We don't have a 90% solution for SD3 yet.
there's always cropping off the edges - but glitch art is used in a lot of things and makes an interesting transition between video clips
dunno, you could use tricks like position interpolation but I don't think that solves the problem. You then might get weird blurriness artefacts as on pixart alpha
If you crop off the glitchy area, you're back to 1 MP. What's the point?
hi-res fix should work for sd3, too
i wouldn't crop it, i'm just saying if someone got that, decided they didn't want it, they could crop
The point: being rebellious
It does not.
You can do the tile upscale method
SD3 can only do 1024, if you have images larger than that you have to use tiling or you will get those artifacts you are seeing.
highres-fix = sd upscaling ?
You need to use tiled diffusion currently.
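For reference, the core of any tiled approach is just cutting the canvas into overlapping boxes no larger than what the model handles well. A minimal sketch of that bookkeeping follows; the 1024px tile and 128px overlap are assumed example values, and `tile_boxes` is an illustrative name, not a function from any particular tiling extension.

```python
def tile_boxes(width: int, height: int, tile: int = 1024, overlap: int = 128):
    """Return (left, top, right, bottom) boxes covering the image with overlap."""
    if overlap >= tile:
        raise ValueError("overlap must be smaller than the tile size")
    stride = tile - overlap

    def starts(size: int) -> list[int]:
        # One tile is enough if the image fits; otherwise step by the stride
        # and pin the final tile to the far edge so nothing is left uncovered.
        if size <= tile:
            return [0]
        positions = list(range(0, size - tile, stride))
        positions.append(size - tile)
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


boxes = tile_boxes(2880, 1728)
print(len(boxes), "tiles")
for box in boxes:
    print(box)
```

Each box is then denoised on its own and blended in the overlap. Because every tile only sees its own crop, the same prompt gets applied to each one, which is where the duplication and extra-character complaints come from.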
yeah, I thought tiling is what you mean with highres fix
Expand the initial generation somehow (naive or NN-based) and use it for img2img.
tiling isn't a free lunch either because sometimes you can get extra characters you didn't want.
tiling kinda sucks
Controlnet tiling is pretty good
hm, yeah, I see. The way they trained sd3 is not so clever in this regard
Duplication and hallucination are problems with tiled diffusion, which is why I always preferred hiresfix. And why I'm in a hard place right now with SD3, since I like to generate at 4k+.
it's only a problem if you are trying to go to some strange aspect ratio that will increase beyond 1024. Could just do that strange aspect at a smaller res, THEN do high res on it.
tiled upscaling =
We don't have any good SD3 controlnets yet, unfortunately.
just do your landscape aspect at a lower res probably and then do AI upscale
avoid both issues, then only depends on how good the upscaler is
whats the difference between hires fix and tiled?
If your prompt contains something the model doesn't detect in the tile it is currently working on, it might try to add it.
Is that a too-much-denoise issue?
or just render at 1024x1024 and upscale and then just crop out to get the aspect ratio you want
i guess but still 0.3 is not THAT much
Fortnite 
In llms there is a trick where they just rescaled the positional embeddings and trained on them to allow for larger context sizes.
Somewhat like that is probably possible for sd3, too
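The LLM trick mentioned here is usually called position interpolation: instead of feeding positions the model never saw, you squeeze the new, longer range back into the trained one and then fine-tune. A minimal sketch of just the rescaling step, with a hypothetical helper name and example lengths:

```python
def interpolate_positions(new_len: int, trained_len: int) -> list[float]:
    """Rescale positions 0..new_len-1 so they stay inside the trained range."""
    if new_len <= 0 or trained_len <= 0:
        raise ValueError("lengths must be positive")
    # Only compress when the new sequence is longer than what was trained on.
    scale = min(1.0, trained_len / new_len)
    return [i * scale for i in range(new_len)]


positions = interpolate_positions(new_len=4096, trained_len=2048)
print(positions[:4], "...", positions[-1])
```

Neighbouring positions end up closer together than anything seen in training, which is why this is normally paired with extra training. For an image model the same idea would apply per axis of the 2D grid, and it would plausibly explain the blurriness mentioned above for interpolated positions.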
I've never seen evidence that dedicated upscalers can produce better results than diffusion.
Jesus is Swole
i managed to go to 1600x1600 with shift 8 without too much degradation, I'll try to upscale with shift 3.
are you trying to generate billboard sized images?
Is that a challenge?
QUESTION:
Is there a 2x flavor of the 4x_UltraSharp upscaler model??
sure
You think that's Jesus? [[[[[ MY ]]]]] religion says Jesus is actually a Woman named Frances!
wasn't it Franco ?
here kitty kitty
new ceo looking very formal and trustworthy
@desert garnet 
In a dimly lit room, adorned with rusted steam pipes and flickering neon lights, a rebellious figure stands defiantly. The character, clad in a tattered leather jacket and ripped jeans, has a mohawk that spikes out like a neon green lightning bolt. They hold a steam-powered guitar in their hand, its gears and valves glinting menacingly in the low light. A gritty urban landscape stretches out beyond the window, with smokestacks billowing into the air and the distant sound of gears grinding. The air is thick with the scent of oil and steam, and the mood is one of raw energy and rebellion. This is a future retro punk steam dial fantasy, where technology and punk culture collide in a world of grit and glamour. "Scene Introduction: In the dimly lit, steam-filled depths of a future retro punk world, a rebellious figure stands ready to unleash a sonic uprising with their steam-powered guitar."
afaik there still isn't support in comfyui for it
The mouse's face...
true
Close enough
those are awesome!!!
Serious question but also trying to see how flexible the text gen ability of the model is
I found that using model shift 8 and above performs pretty well on text
GODDAMN! You're awesome...!
It's still rough around the edges but it is better than other model shifts
what is a shift
Have you been using SD3 without a model shift?
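On "what is a shift": as far as I understand it, the shift setting remaps the noise schedule so that more of the sampling steps are spent at high noise levels, which is meant to help at larger resolutions. The sketch below uses the timestep-shift formula commonly given for rectified-flow models like SD3; how any particular UI node applies it internally is an assumption here, and `shift_sigma` is just an illustrative name.

```python
def shift_sigma(sigma: float, shift: float) -> float:
    """Remap a noise level in [0, 1]; shift > 1 pushes values toward 1 (more noise)."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


# Compare the shift values mentioned in the chat: 1 (no shift), 3 and 8.
for shift in (1.0, 3.0, 8.0):
    schedule = [shift_sigma(step / 4, shift) for step in range(5)]
    print(f"shift={shift}:", [round(s, 3) for s in schedule])
```

The endpoints stay at 0 and 1 for any shift; only the middle of the schedule moves. With shift 8 the midpoint of the schedule is still close to 0.89 noise, so the model spends longer settling global structure before refining detail.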
Any news from SAI on the current state of affairs?
Ultra model release if ever?
Licensing clarification so the community can get moving on finetuning, IPA, ControlNet, LoRAs etc.?
ultra is just 8b with some hidden stuff going on
lol at "just" - def way better than wtf they released 2 weeks ago
lol at just
8b has been around as api for a while
what released was 2b
completely different model
AuraSR node for comfy got made, time to test that
102348023984 x 234249872 pixels incoming
dang i need more vram whomp whomp
this one is really good
the neon is so bright wow
viewing it on HDR monitor
SDXL could not do that, without CosXL checkpoints
LOL
sometimes the HDR makes me start crying
so I have to turn it down for safety
IDK why they are trying to push HDR even further
why do we need 4000 nits
no one can look at it lol
what headphones or earphones do you have
most people should upgrade sound first I think
great prompt
ah yeah you could definitely upgrade sound
If you want cheap
Truthear Gate is 20 dollars and would be a huge upgrade
yeah
audio industry is weird
yeah it is
if your budget is over a hundred
then Truthear Nova is around that price
join the Headphones.com discord they are amazing there
they will be good for headset too
but I recommend getting a separate mic
and buying the headphones/earphone from an audio company
ah ok
its tricky because
the current good specialist audio companies like truthear and moondrop
tend to not do headsets
otherwise a good thing to get is first miniled or oled monitor
miniled is way cheaper
UHD size straight out of Stable Cascade... 64 compression on latent.
As someone said earlier, why did we REGRESS from this?
for the most part
any miniled at all
will be 10000% better than any non-HDR monitor
so you can't really lose
let's be honest, the images we make are useless anyway, why worry about every little ability eh
The difference is so big that
I would rather watch 1080p HDR than 8k SDR
How dare you
wow stable cascade can do native 4k?
cascade is probably under-rated
SD3 killed it lol
I'll take 2, thanks
Amazing prompt, I'm stealing it
UHD pixels, but not UHD detail. All that latent compression affects the model's ability to produce a good spatial frequency distribution. Your choice of subject doesn't help to visualize this (most of the image is out of focus), but even the most-detailed areas don't contain much info when viewed at 100%. SD3's new VAE is specifically designed to improve this.
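To put rough numbers on the compression point: a small sketch, assuming the "64 compression" figure quoted earlier in the chat means a 64x spatial downscale per side, and assuming SD3's VAE downsamples 8x per side. Both factors and the helper name are assumptions for illustration, not measured values.

```python
import math


def latent_grid(width: int, height: int, factor: int) -> tuple[int, int]:
    """Return (rows, cols) of the latent grid for an image at a given downscale factor."""
    return math.ceil(height / factor), math.ceil(width / factor)


if __name__ == "__main__":
    uhd = (3840, 2160)
    for name, factor in (("64x (figure from the chat)", 64), ("8x (assumed for SD3)", 8)):
        rows, cols = latent_grid(*uhd, factor)
        # Each latent position has to describe factor * factor image pixels.
        print(f"{name}: {cols}x{rows} latent, {factor * factor} pixels per position")
```

The fewer latent positions there are, the more pixels each one has to account for, which is where fine detail gets lost.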
you like it don't you, you just don't want to believe it
Shredded salmon and pineapple sprinkled on top of a plate full of milk
runs away
yeah the thing is
SD3 has the biggest VAE out of any open source model
and it makes a huge difference
I saw speculation that Midjourney also has a very beefy VAE
The power of sd3, cursed foods

Adds a surprise element
Blend it and pour it over pizza
its good but the bed dominates lol
yeah honestly it's the best model
yep, i've been posting workflows on this
i tuned a version of stage B-Lite that cleans up the noise a bit
you can generate at 2400x1440 on an 8gb vram card
here's my take
I keep laughing at the giant bed
when california king is not enough
nice
i added The concrete floor is worn.
you have to REALLY be descriptive
i added "greenish tint"
SD3 prompting is really strange. I remember people discovering here that if you prompt 4chan/files/prompt it makes better images
sometimes ^
the picture quality of 4chan is through the roof thats why
it's not that, the reason is probably that it adds more tokens to your prompt, which changes the aesthetic choices of outputs
FOURTY CHAN SAY WAT
SD3 can code
Not really, 16 channel vae is still not enough
yeah I know but at least it looks like it
almost viewssonic
An old "viewsonic" crt monitor. On the screen is a hello world C# program shown inside a visual studio window. The text is sharp and can be clearly read.
Prove it.
it's in the code right here
ViewsSonic, my favourite brand
wasn't it YouthSonic ?
No but Sonic Youth made some cool music
I misread as "my favourite band" and thought it was funny
oh lol nice
it's making syntax errors even in comparison to the random gibberish. it's closing loops when not opening a loop to begin with
that's how sd3 was programmed
this guy trolls
no way
impooziblu
I like how nobody noticed the bad branding though
Thats a lot of buttons on the pc case
behold, art
@mortal mesa looks like you

"Evil art of an evil man wearing evil clothes and thinking evil things while on an evil alleyway, the atmosphere is very evil"
not bad
The evil-est
swapped evils to good 
LOL
so good is indiana jones but with hugh grant hair
low tier movie villains x3
that prompt seems to do pretty random things but very art
Sometimes SD3 can surprise me
I think 2b is less art focused by default than 8b
Yes, If I do random prompts like "girl" it sometimes comes out as a painting lmao
I'm trying to get something like pixel art but real life-y
jawa
Is there a name for something like that ?
like cubes being placed in a manner that shows an image
"made of voxels"
Friday is coming, forget all your troubles.
mosaic
dang that's just a bunch of rubik's cubes
Cubrick cubes,
Mosaic: Cubruck Rubes
ah yes a new unearthed ancient masterpiece
Where is the VRAMs? :<


I love balls
shining and beautiful ones
forget normal prompting, I can just do things on balls
the world as experienced by balls
cat with a fisheye lens
Fire
To be expected. Censorship ruins everything
The former head honcho of the NSA joined the stability team btw
its all about bandwidth
It's all about bullshit
They have the ability to make shit happen and they choose not to to pander to the soccer moms mad that their son generated tits with ai
is this shit already a video
And the advertisers
how easy to make it into video
Go on lumalabs.ai
Lumalabs?
Alright I can see this is a one way convo
whats left ? weird foods now ?
been done
great old style keyboard tho
oh god look at that code. It's like my coworkers
looooool
anyone want to see what SD3 gives with this cascade image prompt? "glowing blue eyes, cosmic background with vibrant stars and planets, conveying a sense of epic, visually striking scene with rich, detailed textures of scales and ridges, menacing yet fascinating appearance, reflecting in a serene river, aging buildings, and yellow walls"
Without an LLM
With LLM
welp I found the jailbreak /s
whats happening
taking SD3 down a strange hamster/rabbit hole
gpt4o is great at writing prompts
AI people to artists - can't you paint any faster?
Prompt wars.
only AI would support Doge Coin, so, we know
I have fused with SD3. :/
fused with the grass
I touched the grass now I am one with it...
made the mistake of going outside
guys, it's dragons that have no arms, right? Arms mean wyvern? Or is it the other way round
Dragons do have small arms like T-rexes
I have a pet dragon and he has arms.
rendered transparent, I used the "Remove background" feature built into Stable Swarm UI
one would think the AI would be smart enough tho right
you gotta make sure you save the image as PNG tho
not really pixel perfect pixel art tho
kek
I'll put blue carrots on pizza
Here we go, great for that 2d battle game
pixart (though i'm sure you tried that yourself)
i didn't actually... just got all my cascade stuff up here
make it boom
It looks nice! Any idea if I can use pixart without using a 1.5 or sdxl model as refiner?
10/10
Oh, rotated the bombs
i know right something missing
background shows how non-perfect the transparency is
looks good enough throw it on steam
or google play store lmao
"early access"
"Helicopter bomb drop" literally all you do is fly back and forth and drop bombs
"thanks for downloading my helicopter bomb game everyone, just wait and see what updates I have in store"
if you tap it drops bombs
then just never update it
multiplayer coming soon
now I just need more AI for the sounds and music
feedback based changes coming soon
I'll ask AI to write a story for the game
No reason you can't, refiner just makes it nicer (these were without as well, it's faster that way :p)
hey everyone we have our halloween bomb drop patch releasing soon make sure you gift your friends a copy
Hey everyone quick update we've added an airplane to the game now so you can use that instead of a helicopter. We know the game is called "Helicopter bomb drop" but use your imagination
the gameplay is you just tap to drop bombs. tap harder for bigger bombs
due to apple store restrictions we've had to remove bombs temporarily
shout really loud and it drops an atomic bomb. Oh yes we need microphone permissions
Baldurs Gate 3 on sale you say?
I'm asking Llama3 LLM (on my own PC) what game mechanics the game should have if you can only tap
Here's a shorter description:
Helicopter Bomb Drop Game
- Tap to control helicopter direction and altitude
- Tap screen to drop bombs on target areas
- Avoid enemy fire and obstacles like buildings and trees
- Earn points by completing objectives and dropping bombs
- Collect power-ups for temporary advantages
don't want the LLM mothership to steal your idea
sounds too complicated, make it simpler
Here's an even simpler description:
Helicopter Bomb Drop
- Tap to control helicopter
- Tap again to drop bombs
- Avoid obstacles and enemy fire!
Me: this is still too hard. We need someone as decrepit as Joe Biden to be able to play this
LLM: Let me simplify it even further!
Helicopter Game
- Tap anywhere on screen
- Boom!
don't forget 4 DLCs before version 1.0 comes out
LOL
Me: sounds good!
LLM: I'm glad we were able to distill the game down to its most basic and intuitive elements. Now, if you'll excuse me, I think Joe Biden might be ready to take a spin in our new helicopter game!
monthly paid, battlepass, pay to skip timers, pay to remove ads...
pay for auto clicker
Oh no you're out of your daily bomb crystals
Me: what if the helicopter moves automatically? What can I use the second tap for then?
LLM: If the helicopter moves automatically, you could use the second tap for something else entirely.
Helicopter Game
- Tap anywhere on screen to drop bombs
- Tap again to release special power-ups (e.g. speed boost, extra bomb)
In this scenario, the player still controls the dropping of bombs, but they also have the option to tap again to release a special power-up that can help them in some way.
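The two-tap mechanic the LLM describes can be sketched in a few lines. This is a toy illustration only: it assumes taps simply alternate between "drop bomb" and "release power-up", which is one reading of the description, and the class and method names are invented.

```python
class HelicopterGame:
    """Toy model: the helicopter flies itself, the player only taps."""

    def __init__(self) -> None:
        self.taps = 0
        self.bombs_dropped = 0
        self.power_ups_used = 0

    def tap(self) -> str:
        """Odd-numbered taps drop a bomb, even-numbered taps release a power-up."""
        self.taps += 1
        if self.taps % 2 == 1:
            self.bombs_dropped += 1
            return "bomb"
        self.power_ups_used += 1
        return "power-up"


if __name__ == "__main__":
    game = HelicopterGame()
    print([game.tap() for _ in range(4)])
```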
Not baaaad, LLM. Not bad
I don't need to be creative. the LLM will do it
omg
you gotta think of a studio name
maybe something unrelated to AI
like SAI Tap Games
hell yes
I'll have to ask the LLM tho he's the boss
chinese dragons don't have arms, they're more snaky. english dragons are more lizard/dinosaur-ish
Me: ok, given that game, I want to make a game studio where my games use tap mostly, and my game art is created with AI. Can you give me ten game studio names?
LLM: What a fascinating concept! Here are ten potential game studio name ideas for your AI-generated art and tap-controlled gameplay:
- TapGenie Studios
- PixelPulse Games
- Stroke Studios ( referencing the tapping motion )
- AIrtisan Games
- TapTrek Entertainment
- CyberCrafter Studios
- Touchstone Games
- FlickFusion Studio
- QuickDraw Interactive
- Button Brigade Games
I like number 10
BBG
Stroke studio 
now I just need Copilot to write the code
I'm basically just the producer now, right?
nope 100% creator, writer, producer, graphic designer, programmer
and you don't have to pay license cause the game is a free download (with in-app purchases)
did we just find the loop-hole
please tell me you didn't just do that by hand
I'm just joking around anyway
but it does show how quick you can come up with stuff if you use all the AI's around
I'm too lazy to make a game right now, that's a lot of work even WITH AI. help
I mean..maybe I could just make it simple....
tap to drop bombs
well congrats you're BBG's #2 employee
well, photoshop select and shrink selection could work
god damnit now I would owe Franco royalties
ya 100%

