#🆕｜sd3 | Stable Diffusion | Page 9

hallow lion Jun 9, 2024, 7:08 AM

#

Comfyui nodes and workflow will arrive 5.2 seconds after the weights drop.

sterile pendant Jun 9, 2024, 7:29 AM

#

I don't know how long kohya will take to get it up and running with the UI, but I'm sure there will be some update to accelerate and the other Python libraries that will support it, and when that happens you'll be able to do it manually with a CLI script (all kohya's interface really does is give you a convenient UI to build the command, then run it in the command prompt)

hallow talon Jun 9, 2024, 7:42 AM

#

sterile pendant I don't know about how long kohya will take to get it up and running with the UI...

Unfortunately I have no idea what that means. I don’t know much about Python itself.

neon wagon Jun 9, 2024, 8:04 AM

#

SD3 loras = 2 weeks after the weights
SD3 finetunes = 1 month after the weights

#

but good finetunes and merges need 3 months minimum after the weights.

dry wave Jun 9, 2024, 8:14 AM

#

sterile pendant I don't know about how long kohya will take to get it up and running with the UI...

kohya sd-scripts is a cli script.

#

you don't need updates to accelerate (you don't even need accelerate if you use a single gpu). You need updates to the training script

rain current Jun 9, 2024, 8:40 AM

#

Where can I see the results of 2B?

bitter hearth Jun 9, 2024, 8:42 AM

#

"the villain, a villainous villain making villain stuff, in a dark alleyway"

#

very villain

#

storm saffron Jun 9, 2024, 9:13 AM

#

neon wagon but good finetunes and merges need 3 months minimum after the weights.

2 weeks? 3 months? Why so long? Fine tuners will already have datasets ready to go.

sterile pendant Jun 9, 2024, 9:31 AM

#

dry wave kohya sd-scripts *is* a cli script.

I know it is, I already said it's just a UI to make things easier to build the command. There's even a button to print the command in the command window so that you can use it without the interface if you desire. I said accelerate because that's literally what gets called when you run kohya, it sorts the rest out. But yeah, the other libraries, like I mentioned, will need to be updated before anything can propagate downstream from it.
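
A minimal sketch of the "UI just builds the command" idea, for anyone who hasn't seen it. The script name `hypothetical_train.py` and every option name below are made up for illustration; they are not real kohya flags.

```python
import shlex


def build_command(script, settings):
    """Turn a dict of UI settings into an argv list for a training script."""
    argv = ["python", script]
    for name, value in settings.items():
        flag = "--" + name
        if value is True:
            argv.append(flag)              # boolean switch: flag only, no value
        elif value is False or value is None:
            continue                       # unset option: leave the flag out
        else:
            argv.extend([flag, str(value)])
    return argv


# Hypothetical script and option names, for illustration only.
settings = {
    "learning_rate": 1e-4,
    "output_dir": "out/my lora",
    "cache_latents": True,
    "use_wandb": False,
}
argv = build_command("hypothetical_train.py", settings)

# shlex.join quotes arguments that contain spaces, so the printed line
# can be pasted into a shell as-is.
print(shlex.join(argv))
```

A UI's "print command" button does roughly this, then hands the same argv to a subprocess.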

bitter hearth Jun 9, 2024, 9:32 AM

#

sterile pendant Jun 9, 2024, 9:32 AM

#

Should only take a few days to weeks before you can train though.

vapid radish Jun 9, 2024, 9:56 AM

#

dry wave Jun 9, 2024, 10:11 AM

#

sterile pendant I know it is, I already said it's just a UI to make things easier to build the c...

yes, and that's wrong

#

kohya_ss is the python script that does the training. It's not a UI

#

there are several UIs on top of kohya_ss, that is what you are referring to

#

accelerate has nothing to do with that. It's just a wrapper script that simplifies things like multi gpu training

sterile pendant Jun 9, 2024, 10:29 AM

#

dry wave kohya_ss is the python script that does the training. It's not a UI

See what I said about other python libraries...

#

And accelerate can then call those other libraries. But yeah, I should have worded things a little more clearly. Accelerate is like diffusers in the sense that it simplifies all the platform/hardware stuff and simplifies the calling of other libraries. Can you manually call those the hard way? Sure. The training scripts will be updated pretty quickly, but won't work their way into UIs until things like accelerate are updated, since you'll have people trying to train on a variety of PC environments

dry wave Jun 9, 2024, 10:33 AM

#

yes and that is still wrong lol

#

you don't have to update accelerate

#

accelerate is really just a "simple" wrapper that lets you switch between different precision types and gpu setups.

sterile pendant Jun 9, 2024, 10:35 AM

#

and you're 100% positive that it will be able to handle switching to the requirements that the DiT architecture has?

dry wave Jun 9, 2024, 10:35 AM

#

doing that manually is not more effort or complicated than using accelerate. The idea of accelerate is that you can easily turn off these things without changing the code

#

yes

#

DiT is just a transformer architecture running on pytorch

#

and we had this discussion already but: DiT is using transformers, sdxl is also using transformers. You call the same pytorch kernels under the hood

cedar gale Jun 9, 2024, 10:38 AM

#

sterile pendant Jun 9, 2024, 10:42 AM

#

dry wave doing that manually is not more effort or complicated than using accelerate. The...

i suppose we'll see, just figured there'd potentially be compatibility issues with SD3 due to all the components that will have to be shuffled around like if you're trying to train sd3+encoders+t5 at the same time. i imagine they will add in some kind of control over where to keep the various models, like having the T5 stay in ram instead of vram and so on. but yeah, at the core, they are all going to just call pytorch at the end of the day

dry wave Jun 9, 2024, 10:43 AM

#

yes, but accelerate is not doing these kinds of things

sterile pendant Jun 9, 2024, 10:44 AM

#

https://huggingface.co/docs/accelerate/en/package_reference/accelerator


#

i'll read through a bit more, but it definitely looks like it handles where shit sits

#

from a wrapper point of view

#

But anyways, again, other libraries are going to have to be updated, along with training scripts, etc etc, before you'll see UIs implement them. Your average user isn't going to want to go into command prompts with a scalpel to manually train. 99% of people are going to rely on things like kohya and onetrainer

#

So it will likely take days to weeks for all the stars to align to be able to just do it through a UI

storm saffron Jun 9, 2024, 10:48 AM

#

sterile pendant i suppose we'll see, just figured there'd potentially be compatibility issues wi...

You can freeze the embeddings for the text encoders and the image latents before training to save vram. All the stuff for training DiTs is in pytorch already otherwise pixart and Hunyuan couldn't train.
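
A rough sketch of what precomputing the frozen parts looks like, assuming "freeze" here means running the text encoders and the VAE once and caching their outputs. The two encoder functions are stand-ins made up for illustration; real ones would be CLIP/T5 and the VAE encoder.

```python
def precompute(dataset, text_encoder, image_encoder):
    """Run the frozen encoders once and keep only their outputs."""
    cache = []
    for caption, image in dataset:
        cache.append((text_encoder(caption), image_encoder(image)))
    return cache


# Stand-in encoders (hypothetical). The counters show how often they run.
calls = {"text": 0, "image": 0}


def fake_text_encoder(caption):
    calls["text"] += 1
    return [float(len(word)) for word in caption.split()]


def fake_image_encoder(pixels):
    calls["image"] += 1
    return [p / 255.0 for p in pixels]


dataset = [("a red ball", [0, 128, 255]), ("a green cube", [255, 255, 0])]
cache = precompute(dataset, fake_text_encoder, fake_image_encoder)

# The training loop reads from the cache, so the encoders are never called
# again and do not have to stay in VRAM next to the DiT.
for epoch in range(3):
    for text_embedding, latent in cache:
        pass  # a real loop would compute the loss and update the DiT here

print(calls)
```

Each encoder runs once per sample no matter how many epochs follow, which is where the VRAM saving comes from.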

sterile pendant Jun 9, 2024, 10:49 AM

#

storm saffron You can freeze the embeddings for the text encoders and the image latents before...

that's true, so there's already a leg up there

dry wave Jun 9, 2024, 10:50 AM

#

stuff like cpu offloading, pre-computing and so on has to be done by the library itself (diffusers or kohya_ss)

#

but you are right for sure that it will take some weeks until everything runs smoothly for the end-user

storm saffron Jun 9, 2024, 10:52 AM

#

Yeah not everyone but the enthusiasts and early adopters will want to fiddle with cli scripts.

sterile pendant Jun 9, 2024, 10:53 AM

#

dry wave stuff like cpu offloading, pre-computing and so on has to be done by the librar...

right, again, the main point is that people want to be able to do this in UIs like kohya and onetrainer, so all the various libraries used in those are going to need to be updated. there's a reason why linux only has a 2.3% (counting steamdecks) usage rate for home PCs, people hate CLI bullshit (even though modern versions don't really make you do cli stuff anymore, people still have the bad taste in their mouths from when they tried linux ages ago lol)

storm saffron Jun 9, 2024, 10:53 AM

#

We're just seeing new people starting to train pixart now that OneTrainer added support.

sterile pendant Jun 9, 2024, 10:54 AM

#

storm saffron We're just seeing new people starting to train pixart now OneTrainer added suppo...

i saw that they finally added support for it. personally, i like onetrainer more than kohya

storm saffron Jun 9, 2024, 10:55 AM

#

WSL does kinda help with that though if you truly wanted to train a lora the day SD3 comes out (assuming they release the LoRA code same day)

sterile pendant Jun 9, 2024, 10:57 AM

#

even if they don't release it day one, i'm sure people would try to frankenstein something up anyways

dry wave Jun 9, 2024, 10:58 AM

#

if you have fine-tuning code going to lora is trivial
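
To show why it's a small step: a LoRA layer keeps the base weight frozen and adds a trainable low-rank update, y = W x + (alpha / r) * B (A x). A minimal pure-Python sketch, not any trainer's actual code. The class name is made up, and A is filled with a constant here only to keep the example deterministic (real code would initialise it randomly).

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


class LoRALinear:
    def __init__(self, weight, rank, alpha):
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight                           # frozen base weight
        self.scale = alpha / rank
        self.A = [[0.01] * d_in for _ in range(rank)]  # trainable down-projection
        self.B = [[0.0] * rank for _ in range(d_out)]  # trainable up-projection, zero init

    def forward(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]


layer = LoRALinear(weight=[[1.0, 0.0], [0.0, 1.0]], rank=1, alpha=2.0)
x = [1.0, 2.0]

# With B at zero the layer behaves exactly like the frozen base layer.
before = layer.forward(x)

# "Training" only ever touches A and B; here B is nudged by hand.
layer.B[0][0] = 1.0
after = layer.forward(x)
print(before, after)
```

Only A and B are trained and saved, so the fine-tuning loop stays the same and the optimizer just sees far fewer parameters.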

storm saffron Jun 9, 2024, 10:58 AM

#

sterile pendant even if they don't release it day one, i'm sure people would try to frankenstein...

I'm sure. The code will be there in diffusers, just gotta glue it together.

sterile pendant Jun 9, 2024, 10:59 AM

#

it's not some new architecture or anything, llms have been using it for many years now, so a lot of the functions are already there. obviously, you just have to get the steps in the right order for how SD3's specific flavor is laid out though

storm saffron Jun 9, 2024, 11:13 AM

#

Wonder what happens if you give T5 and clip completely different prompts entirely that contradict each other when inferencing. 🤔

wide pagoda Jun 9, 2024, 11:16 AM

#

Similar things have already been done with SDXL
Anyway I assume clip would dominate

silver sluice Jun 9, 2024, 11:16 AM

#

So after following the conversation, from my understanding, even tho people already have early access to the weights, none of those people have knowledge or ability to integrate or update the UIs

cobalt moon Jun 9, 2024, 11:17 AM

#

silver sluice So after following the conversation, from my understanding, even tho people alre...

comfyanonymous :

silver sluice Jun 9, 2024, 11:17 AM

#

Seems like another missed opportunity by SAI, not giving early access to devs who integrate stuff and fine tuners, seems like a strong case for that VIP room

cobalt moon Jun 9, 2024, 11:17 AM

#

Bro literally SAI staff

silver sluice Jun 9, 2024, 11:19 AM

#

So we’ll have to rely on CLI scripts to generate anything after release? I realize someone said to expect workflow for comfy quickly after release but it seems prudent to just have given access to devs who would be making this work so it can be refined and ready for launch

cobalt moon Jun 9, 2024, 11:19 AM

#

Lykon, who also got access to SD3 Medium, said he will train a new Dreamshaper based on that model

silver sluice Jun 9, 2024, 11:19 AM

#

cobalt moon Lykon who also got access on SD3 Medium said he will train new Dreamshaper based...

I’m a big fan of dreamshaper, is lykon the guy who made it?

cobalt moon Jun 9, 2024, 11:19 AM

#

silver sluice I’m a big fan of dreamshaper, is lykon the guy who made it?

Yes, he just said that in Civitai Discord server

storm saffron Jun 9, 2024, 11:22 AM

#

silver sluice So we’ll have to rely on CLI scripts to generate anything after release? I reali...

I believe comfy will be updated same day as launch to support SD3 (it's literally what they are using to test and generate the images you see on X and discord)

cobalt moon Jun 9, 2024, 11:26 AM

#

SAI literally use Comfy to generate SD3 images yeah

sterile pendant Jun 9, 2024, 11:30 AM

#

silver sluice Seems like another missed opportunity by SAI, not giving early access to devs wh...

Maybe one in a hundred thousand to million users train stuff. It's really not as high of a priority as you'd think. We're just in an echo chamber of enthusiasts. Plus, I'm sure SAI just can't wait to see all the degenerate stuff people will immediately train; that further gives AI a bad rap and further pressures governments to want to crack down on generative AI even harder...

cerulean sphinx Jun 9, 2024, 11:31 AM

#

Gradio is so easy to use that updating the UI to train SD3 should be quite quick once you've got a CLI script to train it.

silver sluice Jun 9, 2024, 11:31 AM

#

cobalt moon SAI literally use Comfy to generate SD3 images yeah

So comfy supports SD3 internally and they’re just going to release the weights along with the software update I’m guessing?

silver sluice Jun 9, 2024, 11:32 AM

#

sterile pendant Maybe one in a hundred thousand to million users train stuff. It's really not as...

I’m not concerned or interested with training stuff just wanna do basic level stuff like use it with comfy lol but I do see your point

cobalt moon Jun 9, 2024, 11:32 AM

#

silver sluice So comfy supports SD3 internally and they’re just going to release the weights a...

Yes

cerulean sphinx Jun 9, 2024, 11:32 AM

#

But personally I think using config files is easier in the long run than dealing with any UI. The only real benefit of the UI is reasonable presets for training.

cobalt moon Jun 9, 2024, 11:33 AM

#

cerulean sphinx Gradio is so easy to use that updating the UI to train SD3 should be quite qui...

Well, I don't think it matters that much for high quality finetune creators

cobalt moon Jun 9, 2024, 11:34 AM

#

sterile pendant Maybe one in a hundred thousand to million users train stuff. It's really not as...

This is what you get without regulations

#

Everyone (at least here) hates regulation, but it has to be there, for legal purposes

silver sluice Jun 9, 2024, 11:34 AM

#

cerulean sphinx But personally I think using config files is easier in the long run than deali...

So you prefer editing a file and running a command over using a UI?

cobalt moon Jun 9, 2024, 11:35 AM

#

You do not want people to easily generate porn with your face on it and share it online

fiery wharf Jun 9, 2024, 11:36 AM

#

cobalt moon You do not want people to easily generate porn with your face on it and share it...

funny that they use that excuse to regulate ai models but ai military models used to bomb other countries are not regulated

cobalt moon Jun 9, 2024, 11:36 AM

#

fiery wharf funny that they use that excuse to regulate ai models but ai military models use...

AI Military Model? Care to share one of them please

sterile pendant Jun 9, 2024, 11:36 AM

#

cobalt moon Everyone ( at least here ) hate regulation, but they must be there, for legal pu...

Yep, and so long as people are non-consensually making vulgar loras of people, along with endless amounts of other illegal content, governments will feel more and more pressure to regulate and enact laws

cobalt moon Jun 9, 2024, 11:37 AM

#

If you cannot show it then it is just scaremongering

fiery wharf Jun 9, 2024, 11:37 AM

#

cobalt moon If you cannot show it then it is just scaremongering

https://interactive.aviationtoday.com/avionicsmagazine/december-2019-january-2020/artificial-intelligence-efforts-for-military-drones/

sterile pendant Jun 9, 2024, 11:39 AM

#

fiery wharf funny that they use that excuse to regulate ai models but ai military models use...

if you live on land within the planet earth, you are subject to the rules of who ever owns that land. they make the rules, end of story. military offense/defense is critical in keeping a nation going. making waifus and training ||drippy butthole || loras are not.

#

(i was legit horrified to learn that it wasn't just a meme, that these were actual loras people trained...)

fiery wharf Jun 9, 2024, 11:39 AM

#

sterile pendant if you live on land within the planet earth, you are subject to the rules of who...

oh yea bombing people with ai drones is better than some nude b

sterile pendant Jun 9, 2024, 11:41 AM

#

fiery wharf oh yea bombing people with ai drones is better than some nude b

look, i get it, you have a toy and you don't want people telling you how you can and can't play with it. but you're comparing two completely different things

fiery wharf Jun 9, 2024, 11:41 AM

#

sterile pendant look, i get it, you have a toy and you don't want people telling you how you can...

not really, we are comparing government regulation of ai and why they aren't touching ai military models

#

reading comprehension 101

sterile pendant Jun 9, 2024, 11:42 AM

#

fiery wharf not really, we are comparing government regulation of ai and why they aren't touchi...

oh i get what you're trying to say, but what i'm saying is that you're too immature and self-centered to realize that they aren't even remotely in the same league

#

this isn't checkers, it's chess

#

national safety is exponentially more important to a country than you playing with ai waifus

fiery wharf Jun 9, 2024, 11:44 AM

#

sterile pendant oh i get what you're trying to say, but what i'm saying is that you're too immat...

government wants to regulate ai but doesnt touch ai military models,there i kept it clean and simple for you seems like english is not your forte

#

if they truly cared about security they would regulate both

sterile pendant Jun 9, 2024, 11:46 AM

#

fiery wharf government wants to regulate ai but doesnt touch ai military models,there i kept...

it's actually my third language, yes. the governments want AI for the military for national security. it's an arms race... protecting your country from other countries that want to harm your country. AI will make a larger difference in physical and virtual warfare, than going from the sword to a drone strike.

sterile pendant Jun 9, 2024, 11:47 AM

#

fiery wharf if they truly cared about security they would regulate both

no, they will regulate the one to prevent internal stability issues that would arise from people making fake content that could cause economic/political issues within a country. like deepfaking a politician doing or saying something

storm saffron Jun 9, 2024, 11:49 AM

#

sterile pendant no, they will regulate the one to prevent internal stability issues that would a...

I mean that's already been possible for years and years with Photoshop premiere, etc.

sterile pendant Jun 9, 2024, 11:49 AM

#

storm saffron I mean that's already been possible for years and years with Photoshop premiere,...

not easily

desert garnet Jun 9, 2024, 11:49 AM

#

this reads like one of those evangelical clowns in america telling me why killin ppl is better and safer than sex

sterile pendant Jun 9, 2024, 11:49 AM

#

and not doable by 99% of people that can type simple words out

sterile pendant Jun 9, 2024, 11:50 AM

#

desert garnet this reads like one of those evangelical clowns in america telling me why killi...

i'm an atheist, but reality is reality. if you want to keep a country running, you need to stay a step ahead of your enemies and rivals

storm saffron Jun 9, 2024, 11:50 AM

#

sterile pendant and not doable by 99% of people that can type simple words out

No, but they don't really matter in the big scheme of things, instability comes from large actors who have enough money to be consistent

tiny otter Jun 9, 2024, 11:50 AM

#

@storm saffron i need to create images where should i go

storm saffron Jun 9, 2024, 11:51 AM

#

tiny otter @storm saffron i need to create images where should i go

I dunno.

desert garnet Jun 9, 2024, 11:51 AM

#

sterile pendant i'm an atheist, but reality is reality. if you want to keep a country running, y...

yes the ai drone that killed the humanitarian workers from belgium in gaza helped to keep my country safe

sterile pendant Jun 9, 2024, 11:51 AM

#

desert garnet yes the ai drone that killed the humanitarian workers from belgium in gaza helpe...

do you know how many civilian casualties there have been, historically, in wars throughout the past century?

#

it absolutely sucks and ideally should be zero

#

but modern warfare has cut it down astronomically

tiny otter Jun 9, 2024, 11:52 AM

#

tiny otter @storm saffron i need to create images where should i go

Anyone know ??

desert garnet Jun 9, 2024, 11:52 AM

#

thats true,to increase that number we should train more models to automatically target ppl based on an algorithm

sterile pendant Jun 9, 2024, 11:53 AM

#

#

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8581199/

#

i can dig up even more credible stats like this

desert garnet Jun 9, 2024, 11:54 AM

#

perfect argument for ai robots in wars

sterile pendant Jun 9, 2024, 11:55 AM

#

which is why im all for it in militaries. let the bots duke it out with the other bots and leave the people out of it. it would just be the modern version of siege warfare, minus wasting human lives

desert garnet Jun 9, 2024, 11:56 AM

#

yea if your family dies because of a misplaced airstrike, its just collateral damage bro,man up

sterile pendant Jun 9, 2024, 11:56 AM

#

the point is, that number will be far lower vs historical statistics

cobalt moon Jun 9, 2024, 11:58 AM

#

desert garnet thats true,to increase that number we should train more models to automatically ...

And to target you and your family?

#

Yeah, sarcasm 100%

sterile pendant Jun 9, 2024, 11:58 AM

#

so yes, governments are going to continue to strengthen AI usage in their militaries and yes, they are also going to crack down on what citizens are allowed to do with AI usage. you can cry into the wind about it, but it is how it is

cobalt moon Jun 9, 2024, 11:58 AM

#

We went from SD3 to whole AI military regulation and usage

desert garnet Jun 9, 2024, 11:59 AM

#

cobalt moon We went from SD3 to whole AI military regulation and usage

well there was someone cryin over waifus so it always goes down that way

sterile pendant Jun 9, 2024, 11:59 AM

#

yeah, it was a dumb false equivalency fallacy thing that someone brought up, like usual

low stone Jun 9, 2024, 12:33 PM

#

fair spruce Jun 9, 2024, 1:23 PM

#

cedar gale Jun 9, 2024, 1:25 PM

#

#

silver sluice Jun 9, 2024, 1:44 PM

#

low stone

can you do the same image but monochrome fuzz pets and instead of kids replace it with men in black suits? lol i dont know if you're taking requests or not but that one came out really nicely, looks great

patent acorn Jun 9, 2024, 2:16 PM

#

cedar gale

i'd rather see the mice facing to the side instead

cedar gale Jun 9, 2024, 2:17 PM

#

You can mount your hamster however you like.

#

#

#

patent acorn Jun 9, 2024, 2:28 PM

#

no way

urban arch Jun 9, 2024, 2:40 PM

#

Ribbed, for HER pleasure.

cedar gale Jun 9, 2024, 2:53 PM

#

raven fern Jun 9, 2024, 3:05 PM

#

low stone

awww that's cute :3

frail shoal Jun 9, 2024, 3:06 PM

#

i'm curious, is the SD3 prompt comprehension better than pixart sigma that has a 20GB T5 model ? Has anyone compared them both ? I'm currently using sigma with sdxl refiner and it can understand things like concepts, emotions and feelings and represent them visually like a real artist. Curious if SD3 text comprehension can do that

sterile pendant Jun 9, 2024, 3:21 PM

#

frail shoal i'm curious, is the SD3 prompt comprehension better than pixart sigma that has a...

The issue isn't so much the size of the t5, it's how well it maps to the image generation model. Think of t5 like chatgpt and how well it can handle and understand conversational input. Since it can better understand semantics like a red ball on a green cube on a blue table, it can then do a better job (in theory) of pairing those colors to those shapes when you go to generate it. That 20gb is it in fp32, there are also fp16/bf16 versions of it that are around 10gb. Pixart sigma is awesome, but is pretty small and not all that trained yet (in comparison to say sdxl finetune level image quality). SD3's t5 will likely be on par with pixart's, but the image creation half will blow it out of the water since it has a far larger dataset and more training to map the concepts

frail shoal Jun 9, 2024, 3:24 PM

#

sterile pendant The issue isn't so much the size of the t5, it's how well it maps to the image g...

base image quality in sigma is shit, that's why i use sdxl refiner

sterile pendant Jun 9, 2024, 3:25 PM

#

Also keep in mind that SD3 has three text encoders: the same two clips from sdxl, plus t5. I'd imagine you can run it with any combination of the three, maybe even pure T5, but I can't remember if they've verified that or not

#

yeah sigma is meant to be more of a research paper kind of model, they don't have the resources to fully "train" train it all the way

#

It costs a lot to make quality base models

#

but sigma is great for hashing a scene out to then be resampled with sdxl. it has kinda been the poorman's sd3 while we wait for sd3 to come out

silver sluice Jun 9, 2024, 3:31 PM

#

how many gigabytes do you guys expect SD3 2B model to be? looking for a ballpark estimate

cedar gale Jun 9, 2024, 3:42 PM

#

trail frost Jun 9, 2024, 3:47 PM

#

Are there any open source image generation models other than sdxl 1.0 and pixart?

gusty trail Jun 9, 2024, 3:53 PM

#

hunyuan-dit, lumina-t2i, Kandinsky 3.0

low stone Jun 9, 2024, 3:54 PM

#

gusty trail hunyuan-dit, lumina-t2i, Kandinsky 3.0

Hunyuan has been great. It has different composition than pixart so it adds variety to use both

trail frost Jun 9, 2024, 4:03 PM

#

gusty trail hunyuan-dit, lumina-t2i, Kandinsky 3.0

Thank you so much

frail shoal Jun 9, 2024, 4:07 PM

#

sterile pendant but sigma is great for hashing a scene out to then be resampled with sdxl. it ha...

exactly, and i find that with the poorman's sd3, i prefer the images this workflow makes over midjourney's. Midjourney does not do what you want most of the time

frail shoal Jun 9, 2024, 4:17 PM

#

sterile pendant The issue isn't so much the size of the t5, it's how well it maps to the image g...

also i'm using 20 gb model since i can use it in cpu. The half precision one only works on gpu with comfyui

#

and i need gpu room for sdxl refiner

#

i only have 6gb vram

lucid swift Jun 9, 2024, 4:59 PM

#

living avian theropods

#

cerulean sphinx Jun 9, 2024, 5:17 PM

#

silver sluice So you prefer editing a file and running a command over using a UI?

Yes, that makes it easier to test out different parameters when training.

sterile pendant Jun 9, 2024, 5:24 PM

#

lucid swift living avian theropods

Was kinda just hoping it would spit out a chicken since they are living descendants of them lol

#

(well all birds are living descendants of theropods)

lucid swift Jun 9, 2024, 5:44 PM

#

sterile pendant (well all birds are living descendants of theropods)

yes xD

lucid swift Jun 9, 2024, 5:45 PM

#

sterile pendant Was kinda just hoping it would spit out a chicken since they are living descenda...

another result xD

cedar gale Jun 9, 2024, 5:46 PM

#

#

lucid swift Jun 9, 2024, 5:47 PM

#

what the hell

cedar gale Jun 9, 2024, 5:47 PM

#

Lol thats a dualtops.

lucid swift Jun 9, 2024, 5:51 PM

#

cedar gale

cedar gale Jun 9, 2024, 5:51 PM

#

😄

#

11 / 10.

lucid swift Jun 9, 2024, 5:52 PM

#

😎

#

cedar gale Jun 9, 2024, 5:56 PM

#

FBI catfish?

lucid swift Jun 9, 2024, 5:56 PM

#

xD yes

#

#

the thinnest smartphone that exists android

#

#

the thinnest smartphone that exists android 1999

#

cunning lintel Jun 9, 2024, 6:04 PM

#

New china model (Lumina-Next-SFT) https://github.com/Alpha-VLLM/Lumina-T2X
If these devs keep this progress up this lumina thing might start to be actually good


lucid swift Jun 9, 2024, 6:06 PM

#

cunning lintel New china model (Lumina-Next-SFT) https://github.com/Alpha-VLLM/Lumina-T2X If ...

looks impressive

#

sd3

fallen swallow Jun 9, 2024, 6:18 PM

#

WTF ... this is awesome!

silver sluice Jun 9, 2024, 6:27 PM

#

cerulean sphinx Yes, that makes it easier to test out different parameters when training.

i could see that, where you need to batch things so you quickly script a thing that'll batch a few different scenarios to examine the results. sounds about right?
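
Something like this, as a minimal sketch: start from one base config and expand a small grid of overrides into one config per run. The key names (`learning_rate`, `network_dim`, `max_epochs`) are made up for illustration and are not taken from any particular trainer.

```python
import itertools
import json

# Hypothetical keys, for illustration only.
base_config = {"learning_rate": 1e-4, "network_dim": 16, "max_epochs": 10}
grid = {"learning_rate": [1e-4, 5e-5], "network_dim": [8, 32]}


def expand(base, grid):
    """Yield one config per combination of the grid values."""
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        config = dict(base)               # copy, so the base stays untouched
        config.update(zip(keys, values))
        yield config


configs = list(expand(base_config, grid))
for i, config in enumerate(configs):
    # A real script would write each config to disk and launch a run with it.
    print(f"run_{i}.json", json.dumps(config))
```

Two values for each of two keys gives four runs, and anything not in the grid is inherited from the base config.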

trail frost Jun 9, 2024, 6:38 PM

#

Anyone here using Hunyuan dit if so can you please share some examples

sterile pendant Jun 9, 2024, 6:47 PM

#

cunning lintel New china model (Lumina-Next-SFT) https://github.com/Alpha-VLLM/Lumina-T2X If ...

That's the new finetuned version, right?

#

I need to reinstall it to mess around with it again

lucid swift Jun 9, 2024, 7:17 PM

#

cunning lintel New china model (Lumina-Next-SFT) https://github.com/Alpha-VLLM/Lumina-T2X If ...

cool

frail shoal Jun 9, 2024, 7:18 PM

#

sterile pendant I need to reinstall it to mess around with it again

can you use it in comfyui ?

dreamy sundial Jun 9, 2024, 7:21 PM

#

cunning lintel New china model (Lumina-Next-SFT) https://github.com/Alpha-VLLM/Lumina-T2X If ...

this starts to look good

storm saffron Jun 9, 2024, 7:21 PM

#

silver sluice how many gigabytes do you guys expect SD3 2B model to be? looking for a ballpark...

If nobody has answered... 4GB

silver sluice Jun 9, 2024, 7:22 PM

#

storm saffron If nobody has answered... 4GB

thanks i've been super curious about that, so the 2 billion parameter model will be around 4 gigabytes and the fine tuned models are expected to be 6-7 kind of like SDXL right?

storm saffron Jun 9, 2024, 7:23 PM

#

silver sluice thanks i've been super curious about that, so the 2 billion parameter model will...

Nope, 4. SDXL was around 6 due to also including CLIP and VAE in the safetensors file, if they also include the VAE and CLIP in it then it'll end up around 5.5GB (ish)

#

SDXL was 2.6B and ended up at 5GB just for the UNET

#

Without VAE and CLIP
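
The arithmetic behind those numbers, as a rough sketch: file size is about parameter count times bytes per parameter. This assumes the weights are stored in half precision (2 bytes each) and ignores file metadata; the function name is made up.

```python
# Bytes needed to store one parameter in each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2}


def checkpoint_size_gb(num_params, dtype="fp16"):
    """Approximate checkpoint size in GB (10^9 bytes), weights only."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


sd3_2b = checkpoint_size_gb(2.0e9)           # 2B-parameter model in fp16
sdxl_unet = checkpoint_size_gb(2.6e9)        # SDXL UNet, 2.6B parameters in fp16
sd3_2b_fp32 = checkpoint_size_gb(2.0e9, "fp32")
print(sd3_2b, sdxl_unet, sd3_2b_fp32)
```

That gives about 4 GB for a 2B model and about 5.2 GB for the 2.6B SDXL UNet, in line with the figures above; full precision doubles both.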

lucid swift Jun 9, 2024, 7:24 PM

#

Lumina-Next-SFT prompt: "This image is a captivating digital artwork that portrays a surreal scene set in a misty, swamp-like environment. The dominant color palette is a deep, neon purple, which bathes the entire scene in an otherworldly glow. The atmosphere is thick with fog, obscuring the details of the trees and vegetation that surround the scene. The water in the foreground reflects the eerie light, creating a mirror-like effect that adds to the sense of depth and mystery.

In the center of the image, there is a figure seated on a motorcycle, poised as if ready to depart. The rider is clad in a dark, form-fitting outfit that merges with the shadows, and their helmet has a reflective visor that mirrors the intense purple hue of the surroundings. The motorcycle itself is sleek and modern, with a design that suggests speed and agility. The rider's posture exudes a sense of anticipation, as if they are waiting for the perfect moment to make their move.

Lightning bolts pierce through the dense fog, creating a stark contrast with the otherwise monochromatic scene. These bolts of electricity add a dynamic element to the image, suggesting a storm or a supernatural event. The lightning's jagged lines and the way they illuminate the mist create a dramatic and intense atmosphere, enhancing the overall sense of suspense and wonder.

The image is a blend of natural and surreal elements, creating a dreamlike quality that invites viewers to immerse themselves in its mysterious world. The neon purple color scheme and the ethereal lighting contribute to a feeling of otherworldliness, making the scene both captivating and unsettling." image

cedar gale Jun 9, 2024, 7:24 PM

#

storm saffron Jun 9, 2024, 7:25 PM

#

at least 8 GPUs are required for full fine-tuning of the Lumina-T2X 5B

Ouch.

lucid swift Jun 9, 2024, 7:29 PM

#

storm saffron `at least 8 GPUs are required for full fine-tuning of the Lumina-T2X 5B ` Ouch...

i think there is a 2b model?

#

at least it says 2b in the web gui

storm saffron Jun 9, 2024, 7:29 PM

#

There's a bunch of models and it's not that clear how to actually use any!

lucid swift Jun 9, 2024, 7:30 PM

#

yes they should have separated the models into different github repos

#

but i guess it gets more stars this way or something

storm saffron Jun 9, 2024, 7:30 PM

#

no requirements.txt either for pip

lucid swift Jun 9, 2024, 7:31 PM

#

💀

#

Lumina-Next-SFT

storm saffron Jun 9, 2024, 7:31 PM

#

What's SFT?

lucid swift Jun 9, 2024, 7:33 PM

#

storm saffron What's SFT?

i think that means its finetuned to look better and its not a base model

storm saffron Jun 9, 2024, 7:33 PM

#

Not just the base model's safetensors version?

lucid swift Jun 9, 2024, 7:34 PM

#

storm saffron Not just the base model's safetensors version?

"Lumina-Next-SFT is a 2B Next-DiT model with Gemma-2B serving as the text encoder, enhanced through high-quality supervised fine-tuning (SFT)."

storm saffron Jun 9, 2024, 7:34 PM

#

Ahhhh ok

#

It doesn't look that fine tuned from the pic above. 😄

lucid swift Jun 9, 2024, 7:34 PM

#

xD

storm saffron Jun 9, 2024, 7:35 PM

#

What was your prompt?

lucid swift Jun 9, 2024, 7:35 PM

#

i think it understands natural language better, so my prompt was not optimal: "photograph of ghost special force agent, adorned in all-black human anthropomorphic furrsona fish in fursuiter at a con highly detailed, the interplanetary from "2001 a space odyssey"

storm saffron Jun 9, 2024, 7:35 PM

#

I'm gonna compare it with my PixArt finetune.

#

Have you tried their 2K model?

lucid swift Jun 9, 2024, 7:36 PM

#

storm saffron Have you tried their 2K model?

no

storm saffron Jun 9, 2024, 7:37 PM

#

There, my pixart FT

#

lucid swift Jun 9, 2024, 7:39 PM

#

this one is more like a fish

#

"photograph of ghost special force agent, furrsona detailed, the interplanetary"

storm saffron Jun 9, 2024, 7:39 PM

#

Anthropomorphic, ✅

turbid grotto Jun 9, 2024, 7:40 PM

#

A while ago I saw some disagreement about whether SD3 is limited to 75 tokens because of CLIP or 512 because of T5. Is this still a concern, or is it not a problem?

lucid swift Jun 9, 2024, 7:40 PM

#

turbid grotto I have seen some disagreement about sd3 75 tokens due to clip or 512 due to T5 a...

it's still a problem

storm saffron Jun 9, 2024, 7:40 PM

#

OK now it's gone more furry since I changed furrsona to fursona

lucid swift Jun 9, 2024, 7:40 PM

#

we are not sure how it will behave
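
For reference, the 75-versus-512 question above comes down to where each text encoder cuts the prompt off. Below is a minimal Python sketch, assuming the limits quoted in the chat (75 usable CLIP tokens, 512 for T5) and using a plain list as a stand-in for real tokenizer output. `visible_tokens` and `chunk` are hypothetical helpers, not part of any SD3 tooling. Splitting a long prompt into 75-token windows is one workaround some UIs use for CLIP-based models; whether SD3 pipelines will do the same is not known from this conversation.

```python
CLIP_LIMIT = 75   # usable tokens per CLIP window, as quoted in the chat
T5_LIMIT = 512    # T5 token limit, as quoted in the chat


def visible_tokens(tokens, limit):
    """Return (kept, dropped) when an encoder truncates at `limit` tokens."""
    return tokens[:limit], tokens[limit:]


def chunk(tokens, size=CLIP_LIMIT):
    """Split tokens into consecutive windows of at most `size` tokens."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


if __name__ == "__main__":
    # Stand-in for a tokenized 200-token prompt (assumption: real tokenizers
    # produce different counts than a naive word split).
    prompt = ["tok"] * 200
    kept, dropped = visible_tokens(prompt, CLIP_LIMIT)
    print(len(kept), "tokens seen by CLIP,", len(dropped), "dropped")
    print([len(c) for c in chunk(prompt)], "tokens per window if chunked")
```

With a long natural-language prompt, the truncating encoder simply never sees the tail, which matches the "only able to fit a small part of the prompt" behaviour people report.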

lucid swift Jun 9, 2024, 7:41 PM

#

storm saffron OK now it's gone more furry since I changed furrsona to fursona

these models don't like misspellings

storm saffron Jun 9, 2024, 7:41 PM

#

Comfy needs a spellchecker node

turbid grotto Jun 9, 2024, 7:41 PM

#

lucid swift it's still a problem

someone mentioned "clip stacking" or something like that, could it be a solution?

storm saffron Jun 9, 2024, 7:42 PM

#

I guess it's going for Ghost Recon?

#

Considering it's a 600M param model that I badly finetuned, it's doing alright. 😄

lucid swift Jun 9, 2024, 7:42 PM

#

turbid grotto someone noted "clip stacking" or somthing like that, could it be a solution?

i am not sure.

#

yes that's smaller than 1.5

lucid swift Jun 9, 2024, 7:43 PM

#

turbid grotto someone noted "clip stacking" or somthing like that, could it be a solution?

it might allow 512 tokens

storm saffron Jun 9, 2024, 7:44 PM

#

Well DiTs behave more like models 1.5 - 2x the size of their UNet counterparts of the same parameter count.

lucid swift Jun 9, 2024, 7:44 PM

#

storm saffron Well DiTs behave more like models 1.5 - 2x the size of their UNet counterparts o...

what do you mean

#

storm saffron Jun 9, 2024, 7:45 PM

#

I mean they behave more like models bigger than the equivalent UNET model.

lucid swift Jun 9, 2024, 7:46 PM

#

storm saffron I mean they behave more like models bigger than the equivalent UNET model.

oh

storm saffron Jun 9, 2024, 7:46 PM

#

Hard to explain. A 2B parameter DiT is more like a 3-4B parameter UNET in capability

lucid swift Jun 9, 2024, 7:46 PM

#

lucid swift

vs sd3, but sd3 was only able to fit a small part of the prompt

lucid swift Jun 9, 2024, 7:47 PM

#

storm saffron Hard to explain. A 2B parameter DiT is more like a 3-4B parameter UNET in capabi...

daim

#

i think one problem of the lumina model is that they use sdxl vae

storm saffron Jun 9, 2024, 7:48 PM

#

That could be an issue, especially with small faces and text.

#

Same problem with PixArt

#

IDK if I can be bothered to get Lumina working with SD3 just a few days away

lucid swift Jun 9, 2024, 7:50 PM

#

storm saffron IDK if I can be bothered to get Lumina working with SD3 just a few days away

xD if i would be home i would install it xD but you can also just use the web demo

#

lumina vs sd3 (but sd3 is like 8b and lumina is like 2b ) The image depicts an alien-like creature with a large, elongated head and dark, almond-shaped eyes. The skin appears textured and rough, reminiscent of reptilian or amphibian skin. The creature is sitting in a body of water, partially submerged, with its legs and lower torso hidden below the surface. The background is foggy, adding a sense of mystery, and features tall, thin reeds and barren trees, creating a marsh-like or swamp environment. The overall atmosphere is eerie and otherworldly, with muted colors and low light enhancing the creature's unsettling appearance.

silver sluice Jun 9, 2024, 7:52 PM

#

storm saffron Nope, 4. SDXL was around 6 due to also including CLIP and VAE in the safetensors...

thanks for the technical answer i appreciate the insight

storm saffron Jun 9, 2024, 7:52 PM

#

lucid swift lumina vs sd3 (but sd3 is like 8b and lumina is like 2b ) The image depicts an a...

I like the composition and light of the 1st one.

lucid swift Jun 9, 2024, 7:52 PM

#

yes

#

but the second one looks more like a photo

#

but i never said photo in the prompt

storm saffron Jun 9, 2024, 7:53 PM

#

PixArt. (600M param)

lucid swift Jun 9, 2024, 7:53 PM

#

storm saffron PixArt. (600M param)

this looks also interesting

storm saffron Jun 9, 2024, 7:54 PM

#

I'll do a widescreen version, which I think would enhance it a bit

#

lucid swift Jun 9, 2024, 7:55 PM

#

"The image shows a cozy, eclectic room with a vibrant, colorful ambiance. The ceiling is draped with multiple tapestries featuring intricate designs, including mandala patterns and depictions of plants and celestial motifs. The lighting is soft and atmospheric, with various sources contributing to the overall mood: Ceiling Lighting: There are red, pink, and purple lights that illuminate the tapestries, highlighting their patterns and adding a warm glow. String Lights: Multi-colored string lights are draped around the room, adding to the festive and relaxed atmosphere. Television: A flat-screen TV on the wall displays a scene from the animated show "The Simpsons"

lucid swift Jun 9, 2024, 7:55 PM

#

storm saffron

cool

storm saffron Jun 9, 2024, 7:56 PM

#

Lumina seems pretty undertrained or not at all finetuned.

storm saffron Jun 9, 2024, 7:56 PM

#

lucid swift "The image shows a cozy, eclectic room with a vibrant, colorful ambiance. The ce...

lucid swift Jun 9, 2024, 7:57 PM

#

storm saffron Lumina seems pretty undertrained or not at all finetuned.

yes this looks undertrained

lucid swift Jun 9, 2024, 7:57 PM

#

storm saffron

impressive

storm saffron Jun 9, 2024, 7:57 PM

#

PixArt constantly impresses me for its size

lucid swift Jun 9, 2024, 7:58 PM

#

did Stability improve fursuits?! "The image shows a person dressed in a colorful fursuit against a plain pink background. The fursuit features a large, stylized head with prominent, pointed ears that are white on the inside and purple on the outside. The head is covered in bright red fur with a white stripe across the face and purple fur around the muzzle. The eyes are large, black, and oval-shaped. The person is wearing a black shirt with long sleeves and a purple skirt. The sleeves have blue and purple striped arm warmers extending to their paws, which are covered in purple fur. The lower part of the fursuit includes purple fur-covered legs and feet. The overall style is vibrant, playful, and highly stylized, typical of fursuits often seen in the furry community."

#

maybe the lumina model wants different prompting or something

storm saffron Jun 9, 2024, 8:00 PM

#

I guess they would have trained it on fursuits to please a certain demographic?

storm saffron Jun 9, 2024, 8:01 PM

#

lucid swift did Stability improve fursuits?! "The image shows a person dressed in a colorfu...

It's definitely NOT a fursuit. It's a weird combo between the two. Hmmm... Try "Mascot Costume"

lucid swift Jun 9, 2024, 8:02 PM

#

storm saffron I guess they would have trained it on fursuits to please a certain demographic?

idk but Stability models are normally very shit at fursuits. i know that xD

storm saffron Jun 9, 2024, 8:02 PM

#

Mascot costume is not as bad, but still not good. 😄

#

*notes* If I retrain my PixArt finetune, add more furries.

lucid swift Jun 9, 2024, 8:03 PM

#

storm saffron *notes* If I did a retrain on my finetune of pixart add more furries.

the only one good at furries is OpenAI, here are my tests xD

storm saffron Jun 9, 2024, 8:04 PM

#

lucid swift the only one good at furrys is open ai here my tests xD

My dreamMODE model for SDXL is pretty good at furries.

lucid swift Jun 9, 2024, 8:04 PM

#

storm saffron My dreamMODE model for SDXL is pretty good at furries.

also fursuits?

storm saffron Jun 9, 2024, 8:04 PM

#

I'll try your prompt in it

lucid swift Jun 9, 2024, 8:05 PM

#

the prompt was just made with ChatGPT so it might not be a good one

storm saffron Jun 9, 2024, 8:05 PM

#

It'll probably miss out half of it due to being SDXL

lucid swift Jun 9, 2024, 8:05 PM

#

xD

storm saffron Jun 9, 2024, 8:05 PM

#

lucid swift Jun 9, 2024, 8:05 PM

#

classic clip

#

it's better than base, but it does not look like a real fursuit xD

storm saffron Jun 9, 2024, 8:06 PM

#

It's close, hold on.

#

Better?

#

A person in a grey wolf fursuit at a comicon

#

lucid swift Jun 9, 2024, 8:08 PM

#

yes it looks better

storm saffron Jun 9, 2024, 8:08 PM

#

that's dreamMODE CosXL on civit if you want it. 😄

cedar gale Jun 9, 2024, 8:08 PM

#

storm saffron ```A person in a grey wolf fursuit at a comicon```

lucid swift Jun 9, 2024, 8:08 PM

#

i just asked ChatGPT to generate the image with the prompt and it made this

storm saffron Jun 9, 2024, 8:08 PM

#

cedar gale

Why so much fire?

cedar gale Jun 9, 2024, 8:08 PM

#

I have no idea I tested your prompt.

storm saffron Jun 9, 2024, 8:09 PM

#

That seems like a lot of fire for it not being in the prompt. 😄

lucid swift Jun 9, 2024, 8:09 PM

#

cedar gale I have no idea I tested your prompt.

https://tenor.com/view/shannon-sharpe-shay-nope-nah-nuhuh-gif-14856215

cedar gale Jun 9, 2024, 8:10 PM

#

Ok ok I admit, I might have spiced it up a little.

storm saffron Jun 9, 2024, 8:14 PM

#

did you just add "on fire"

lucid swift Jun 9, 2024, 8:15 PM

#

✅

storm saffron Jun 9, 2024, 8:16 PM

#

😛

#

felt puppet nightmare, night, fog

cedar gale Jun 9, 2024, 8:16 PM

#

Noice.

#

#

Pikachu edition.

lucid swift Jun 9, 2024, 8:22 PM

#

#

lumina cant do text

cedar gale Jun 9, 2024, 8:24 PM

#

#

lucid swift Jun 9, 2024, 8:28 PM

#

Lumina, Ideogram, SD3 "The image is a top-down view of a photograph placed on a wooden surface. The photograph appears to be in black and white and has a vintage quality. The edges are clean and straight. The subject of the photograph is a person standing and wearing a long dress with a high collar and buttons down the front, with their hands clasped together in front. The person is wearing a mask or headpiece resembling a goat's skull with large, curved horns. The background behind the subject is dark, suggesting the use of a flash which highlights the subject and makes them stand out against the darkness. The photograph itself is not dirty or damaged. Below the photograph, handwritten text reads: "Fear is weakness.""

storm saffron Jun 9, 2024, 8:41 PM

#

Hands are no better on Lumina either

#

PixArt (once I added polaroid)

storm saffron Jun 9, 2024, 8:49 PM

#

cedar gale Ok ok I admit, I might have spiced it up a little.

I mean there are some sub-sets of furry culture to also try prompts for, but that might get removed. 😄

cedar gale Jun 9, 2024, 8:49 PM

#

Tell me more.

sterile pendant Jun 9, 2024, 8:50 PM

#

frail shoal can you use it in comfyui ?

at the moment? no, not that I'm aware of.

storm saffron Jun 9, 2024, 8:52 PM

#

cedar gale Tell me more.

I'd rather not. 😄 Have a bat minion instead

#

cedar gale Jun 9, 2024, 8:53 PM

#

storm saffron I'd rather not. 😄 Have a bat minion instead

The information will be used very wisely. 😄

#

As u know me. 😉

storm saffron Jun 9, 2024, 8:55 PM

#

Research is left to the reader. 😉

cedar gale Jun 9, 2024, 8:55 PM

#

I do thank you kindly for the nice bat minion tho. 😄

sage burrow Jun 9, 2024, 8:56 PM

#

image1-Core-photograph-of-a-female-alligator-anthro-hybrid-skater-punk-riding-a-skateboard-wearing-shorts-and-a-hoodie-extremel_2.jpeg

vapid radish Jun 9, 2024, 9:30 PM

#

#

bitter hearth Jun 9, 2024, 10:08 PM

#

vapid radish

gimme prompt for these 2

#

waow

#

#

lmao the explosion became a christmas tree

silver sluice Jun 9, 2024, 10:14 PM

#

can the SD3 API generate stuff that's like licensed/copyrighted? Like if you ask for it for super mario riding a skateboard on a sidewalk will it do that or will that throw an error?

bitter hearth Jun 9, 2024, 10:15 PM

#

silver sluice can the SD3 API generate stuff that's like licensed/copyrighted? Like if you ask...

silver sluice Jun 9, 2024, 10:15 PM

#

oh wow so like fully unlocked lol cool thanks for the share

vapid radish Jun 9, 2024, 10:25 PM

#

bitter hearth gimme prompt for these 2

The Monster was:
In a foggy mountain pass, a repulsive creature emerges, but its form is indistinct and partially obscured by the dense mist. The winding road disappears into the thick fog, flanked by towering pines that create a tunnel-like effect. The fog muffles sound, creating an eerie silence and a sense of mystery and solitude. The creature's grotesque appearance is only vaguely visible through the haze, blending seamlessly with the eerie atmosphere of the journey through the unknown

The Cat:
cinematic photo film still of a little cat, (close shot), snow treading, rain dripping, fog filling, .shallow depth of field, vignette, highly detailed, high budget, bokeh, cinemascope, moody, epic, gorgeous, film grain, grainy, cinematic photorealistic, 8k uhd natural lighting, raw, rich, intricate details, key visual, atmospheric lighting, 35mm photograph, film, bokeh, professional, 4k, highly detailed

low stone Jun 9, 2024, 10:27 PM

#

So they just released a new version of Lumina-Next-SFT which is kind of impressive and more prompt following. I prodded city96 to see if we can get comfy support for it.

#

bitter hearth Jun 9, 2024, 10:29 PM

#

@vapid radish am I doing it right waow

vapid radish Jun 9, 2024, 10:30 PM

#

bitter hearth @vapid radish am I doing it right waow

Would you rather fight the monster-sized cat or the cat-sized monster?

bitter hearth Jun 9, 2024, 10:32 PM

#

thomas

lucid swift Jun 9, 2024, 10:47 PM

#

storm saffron I mean there are some sub-sets of furry culture to also try prompts for, but tha...

bs has nothing to do with furrys and it also exists in every other group of people

storm saffron Jun 9, 2024, 10:51 PM

#

lucid swift bs has nothing to do with furrys and it also exists in every other group of peop...

I'm confused. Anyway let's move on.

vapid radish Jun 9, 2024, 10:54 PM

#

dull star Jun 9, 2024, 10:55 PM

#

silver sluice oh wow so like fully unlocked lol cool thanks for the share

Yes, I love that pop culture characters and the like weren't omitted, and neither was gore/blood!
And yet a bunch of artists were opted out, and it doesn't change the quality of the model that much

#

And fine-tuned models are yet to come

patent magnet Jun 9, 2024, 11:05 PM

#

Side by side comparison: left is SD 3 Ultra raw, right is the Upscaled + Eye-Corrected version


#

Prompt: anime art, 1girl, kemomimi, harpy, curvy, [white|black] hair, looking at viewer, [yellow|aqua] eyes, hood, fur coat, underbust, corset, night, moonlight, castle, sitting, squatting, Large_wings, hands between legs, from below, rooftop, wide_shot

vapid radish Jun 9, 2024, 11:16 PM

#

silver sluice Jun 9, 2024, 11:24 PM

#

patent magnet Side by side comparison: left is SD 3 Ultra raw, right is the Upscaled + Eye-Cor...

I prefer the left on all occasions as far as general image composition but the right has better eyes, if i could take the eyes from the right and the composition from the left that'd be the winner, feels like upscaled is making the image overall worse

torpid forge Jun 9, 2024, 11:36 PM

#

#

patent magnet Jun 9, 2024, 11:37 PM

#

silver sluice I prefer the left on all occasions as far as general image composition but the r...

The upscale toned down the colors a bit, and some parts of the lighting made it worse, I agree. But the final quality is still superior to the original.

Yes, I could do that. Correct the eyes without the upscale. Very easy to do. 👍

silver sluice Jun 9, 2024, 11:38 PM

#

patent magnet The upscale toned down the colors a bit, and some parts of the lighting made it ...

so you have access to the sd3 weights and you're doing this in comfy or something?

torpid forge Jun 9, 2024, 11:40 PM

#

patent magnet Jun 9, 2024, 11:42 PM

#

silver sluice so you have access to the sd3 weights and you're doing this in comfy or somethin...

I'm using the official Ultra API colab for the SD3 generation, and doing the post-processing in Fooocus

torpid forge Jun 9, 2024, 11:42 PM

#

#

I'm just making little goats

#

and clowns

severe phoenix Jun 10, 2024, 12:08 AM

#

vapid radish

soo cool, pls whats the prompt for these

brisk bay Jun 10, 2024, 12:12 AM

#

a fat man 40 years drinking beer funny picture

bitter hearth Jun 10, 2024, 12:55 AM

#

torpid forge Jun 10, 2024, 1:11 AM

#

wild remnant Jun 10, 2024, 1:59 AM

#

fading fiber Jun 10, 2024, 2:19 AM

#

nice

fleet meteor Jun 10, 2024, 2:30 AM

#

low stone

Amazing gens

sick cedar Jun 10, 2024, 2:31 AM

#

can anyone tell me what version of the T5 model SD3 uses?

patent acorn Jun 10, 2024, 2:43 AM

#

https://tenor.com/view/rick-and-morty-rick-laughing-terrifying-morty-gif-10529203898491388485

#

yo sd3 make me an image of maniac laughing rick from rick and morty

low stone Jun 10, 2024, 2:45 AM

#

fleet meteor Amazing gens

Llama3 prompt (it's so creative) and SD3. June 12th is gonna be great: A grotesque Trump mask made from human skulls and twisted metal hangs upside down from a rusty chain suspended above a trash-strewn alleyway illuminated by flickering fluorescent lights casting eerie shadows on crumbling brick walls.

#

low stone Jun 10, 2024, 3:17 AM

#

#

hallow lion Jun 10, 2024, 5:19 AM

#

2 more days

#

😄

#

now it's not 2 weeks!

#

just 2 more days

desert garnet Jun 10, 2024, 5:21 AM

#

after that we only need to wait 2 weeks for 8b 👍

bitter hearth Jun 10, 2024, 5:48 AM

#

hallow lion now it's not 2 weeks!

agony

#

I miss the good old days

#

sadcat

compact forge Jun 10, 2024, 6:11 AM

#

desert garnet after that we only need to wait 2 weeks for 8b 👍

whats 8b?

desert garnet Jun 10, 2024, 6:15 AM

#

compact forge whats 8b?

8B is a unit in NieR:Automata, and one of the targets in the "YoRHa Betrayers" quest. 🤖

compact forge Jun 10, 2024, 6:16 AM

#

great, cant wait

bitter hearth Jun 10, 2024, 6:22 AM

#

@desert garnet thomas since when did you become the sd3 ceo

desert garnet Jun 10, 2024, 6:23 AM

#

bitter hearth @desert garnet thomas since when did you become th...

2 weeks ago thomas

bitter hearth Jun 10, 2024, 6:30 AM

#

fallen swallow Jun 10, 2024, 7:01 AM

#

^^

#

torpid forge Jun 10, 2024, 7:18 AM

#

severe phoenix Jun 10, 2024, 7:44 AM

#

lucid swift living avian theropods

dude pls whats prompt for this

#

are these made with ideogram?? 😲

noble coyote Jun 10, 2024, 7:58 AM

#

severe phoenix are these made with ideogram?? 😲

Yes 🙂

muted dove Jun 10, 2024, 8:12 AM

#

severe phoenix dude pls whats prompt for this

These are with SDXL, but I used "national geographic documentary style shot of a living avian theropod"

hallow lion Jun 10, 2024, 9:46 AM

#

severe phoenix are these made with ideogram?? 😲

No, it's a real photo.

remote holly Jun 10, 2024, 10:14 AM

#

Have developers of tools like a1111, fooocus or comfyui received the sd3 models in advance or will we have to wait a bit for integration?

#

Or can we use the model directly?

jolly abyss Jun 10, 2024, 10:18 AM

#

remote holly Have developers of tools like a1111, fooocus or comfyui received the sd3 models ...

I am pretty sure comfy has them and is ready for the "launch". I doubt that A1111 and Fooocus will support SD3 from the start.

#

Oh and swarm obviously too.

remote holly Jun 10, 2024, 10:22 AM

#

ah too bad, in any case I can't wait to see the workflows that it will be possible to do in comfyui

mild bramble Jun 10, 2024, 10:58 AM

#

Request: Can anyone try to push sd3 to its maximum capacity for realism in ai images

lucid swift Jun 10, 2024, 11:10 AM

#

severe phoenix dude pls whats prompt for this

living avian theropods

low stone Jun 10, 2024, 11:38 AM

#

mild bramble Request: Can anyone try to push sd3 to its maximum capacity for realism in ai im...

You can use artisan in the other channel for that

sage burrow Jun 10, 2024, 11:53 AM

#

image0-Ultra-a-character-turnaround-sheet-of-a-porcupine-anthrofurry-wearing-shorts-showing-several-different-active-poses-fron.jpeg

turbid grotto Jun 10, 2024, 12:00 PM

#

glif-stablediffusion-3-cds899-oi73skmt7doml5jqapen3nez.jpg

#

made a mistake in word "truck"

glif-stablediffusion-3-cds899-i1qjcx2vsski709op20si4c9.jpg

#

lykon is teasing with further trained 8b sponging

low stone Jun 10, 2024, 12:05 PM

#

cedar gale Jun 10, 2024, 12:17 PM

#

low stone Jun 10, 2024, 12:33 PM

#

rain current Jun 10, 2024, 1:38 PM

#

past flame Jun 10, 2024, 2:13 PM

#

Ah yes

#

The Hort

patent acorn Jun 10, 2024, 2:14 PM

#

rain current

thats SO good

#

could have some simpsons comic parodies in sd3

low stone Jun 10, 2024, 2:54 PM

#

#

news reporter with a microphone standing in front of a chaotic scene where a hideous gigantic monster is destroying buildings and stores

lucid swift Jun 10, 2024, 2:56 PM

#

sage burrow

what is your prompt?

noble coyote Jun 10, 2024, 3:00 PM

#

lucid swift what is your prompt?

"Sonic the Hedgehog does Sears Catalog?!?!?" 😄

#

SDXL, not SD3

#

lucid swift Jun 10, 2024, 3:03 PM

#

noble coyote "Sonic the Hedgehog does Sears Catalog?!?!?" 😄

i dont think thats the prompt

patent acorn Jun 10, 2024, 3:04 PM

#

who pinged?

sage burrow Jun 10, 2024, 3:05 PM

#

lucid swift what is your prompt?

#artisan-3 message

lucid swift Jun 10, 2024, 3:05 PM

#

sage burrow https://discord.com/channels/1002292111942635562/1237460408651223121/12496920611...

nice! thank you

fallen swallow Jun 10, 2024, 3:53 PM

#

steep widget Jun 10, 2024, 4:29 PM

#

New system available ♥ - https://www.youtube.com/watch?v=-mTT49TDxIE

YouTube

uisato

Measuræ / Audio-Gen Geometries - [TouchDesigner + Stable Diffusion]

New system for audioreactively generative geometries, intervened with various SD configs.

You can access this new TD patch and SD configs (3), plus many more systems, experiments, and tutorials, through: https://linktr.ee/uisato

#touchdesigner #stablediffusion #generativeart

0:00 - AI intervened - 1
0:07 - AI Intervened - 2
0:15 - AI Int...

noble coyote Jun 10, 2024, 5:02 PM

#

SD3@ClipDrop

grand_guignol_venice_carnival_catherine_abel_marc_chagall_frottage_tamara_lempicka_mads_berg_7.png

grand_guignol_venice_carnival_catherine_abel_marc_chagall_frottage_tamara_lempicka_mads_berg_4.png

grand_guignol_venice_carnival_catherine_abel_marc_chagall_frottage_tamara_lempicka_mads_berg_.png

grand_guignol_venice_carnival_catherine_abel_marc_chagall_frottage_tamara_lempicka_mads_berg_1.png

microscopic___1_closeup___1_backlit_jewel_tiffany___3_intricate_needlepoint_closeup_murmuration___3_13.png

microscopic___1_closeup___1_backlit_jewel_tiffany___3_intricate_needlepoint_closeup_murmuration___3_16.png

microscopic___1_closeup___1_backlit_jewel_tiffany___3_intricate_needlepoint_closeup_murmuration___3_10.png

microscopic___1_closeup___1_backlit_jewel_tiffany___3_intricate_needlepoint_closeup_murmuration___3_4.png

low stone Jun 10, 2024, 5:43 PM

#

rain current Jun 10, 2024, 5:47 PM

#

agony

noble coyote Jun 10, 2024, 5:48 PM

#

rain current agony

Too Daze alreddie

fair spruce Jun 10, 2024, 5:51 PM

#

pseudo stone Jun 10, 2024, 6:04 PM

#

rain current agony

12th june in 2 days tho

fair spruce Jun 10, 2024, 6:13 PM

#

raven fern Jun 10, 2024, 6:15 PM

#

just like Naruto says in the dub: "Believe it!"

sterile pendant Jun 10, 2024, 6:22 PM

#

What if it was a dyslexic mixup in d/m/y vs m/d/y and they actually meant December 6th?

fair spruce Jun 10, 2024, 6:30 PM

#

#

#

#

#

cunning lintel Jun 10, 2024, 7:01 PM

#

A_UHD_full-body_shot_of_a_mystical_creepy_insect-hairy_moth-human-monkey_chimera_w.png

dull star Jun 10, 2024, 7:08 PM

#

so close 🙏

#

its so good that we have a date

#

and 8B will come later and will be released as well

mortal mesa Jun 10, 2024, 7:15 PM

#

is it known whether it's comfyui ready?

mild bramble Jun 10, 2024, 7:15 PM

#

fair spruce

what fr?

mild bramble Jun 10, 2024, 7:15 PM

#

mortal mesa is it known that its comfyui ready?

it'll be like sdxl I guess so yea

raven fern Jun 10, 2024, 7:22 PM

#

mortal mesa is it known that its comfyui ready?

as far as i understood, it will be day 1 ready on comfyui

#

or rather, available day 1

mortal mesa Jun 10, 2024, 7:23 PM

#

raven fern as far as i understood, it will be day 1 ready on comfyui

ya just lots of things have been said, wondering if reconfirmed

raven fern Jun 10, 2024, 7:30 PM

#

maybe ask comfy directly :3

ancient cape Jun 10, 2024, 7:34 PM

#

mortal mesa ya just lots of things have been said, wondering if reconfirmed

internally they are using comfy, so yes, it's ready day 1

#

same for stableswarm, if you wanna generate grids

dull star Jun 10, 2024, 7:37 PM

#

exactly, comfyui and stableswarm are day 1 things

#

anything else is probably gonna take a few days if not weeks, don't know

#

we should also know that controlnets are not going to be day 1 things, but we might expect 1.5 quality if not better due to the MM part of MMDiT

#

it's just that it may take more time for people to do research about it first

#

as long as regional prompting is day 1 and they figure out a pos embed fix so that we can get proper highresfix, I'm perfectly satisfied for the time being

raven fern Jun 10, 2024, 7:40 PM

#

i dont care about controlnets right away, il be happy enough with t2i and i2i for now 🙂

dull star Jun 10, 2024, 7:41 PM

#

oh btw they might've already started training 8B

raven fern Jun 10, 2024, 7:41 PM

#

😮

dull star Jun 10, 2024, 7:41 PM

#

#

Lykon's been posting pics

turbid grotto Jun 10, 2024, 7:41 PM

#

I dream about memes finetune

dull star Jun 10, 2024, 7:41 PM

#

then again, these must've used Ultra's workflow with like highresfix and everything

dull star Jun 10, 2024, 7:41 PM

#

turbid grotto I dream about memes finetune

if finetuning is made easy, that's LITERALLY the first thing I'm doing

#

I can't wait

#

and since its 2B, we can actually train it thomas

turbid grotto Jun 10, 2024, 7:42 PM

#

dull star then again, these must've used Ultra's workflow with like highresfix and everyth...

hiresfix can be done only with sdxl rn?

dull star Jun 10, 2024, 7:43 PM

#

no, with any Unet model to my knowledge

#

but with DiT models such as pixart and now SD3, highresfix is broken

turbid grotto Jun 10, 2024, 7:43 PM

#

dull star if finetuning is made easy, that's LITERALLY the first thing I'm doing

I will try too!

dull star Jun 10, 2024, 7:43 PM

#

for PixArt, the entire image is noisy and distorted, and with SD3, the area outside the base resolution part is a blurry mess

#

turbid grotto Jun 10, 2024, 7:44 PM

#

dull star but with DiT models such as pixart and now SD3, highresfix is broken

tiled upscale breaks bokeh and blur for me agony

dull star Jun 10, 2024, 7:44 PM

#

if you go out of resolution range without fixing the positional embedding handling or using tiling, it does this (clear image in center, distortion on the outer edges)

#

from Alex (mcmonkey)
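
The tiling workaround mentioned above can be sketched in a few lines of Python: the large canvas is covered with overlapping tiles, each no bigger than the resolution the model was trained for, so the model never sees positions outside its range. This is only an illustration under assumptions. The 1024-pixel tile size and 128-pixel overlap are made-up defaults, `tile_starts` and `tile_boxes` are hypothetical helpers, and real tiled upscalers also blend the overlapping regions, which is left out here.

```python
def tile_starts(length, tile, overlap):
    """Start offsets of tiles of size `tile` that cover [0, length).

    Neighbouring tiles share at least `overlap` pixels. Requires overlap < tile.
    """
    if length <= tile:
        return [0]
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile sits flush with the far edge
    return starts


def tile_boxes(width, height, tile=1024, overlap=128):
    """Return (left, top, right, bottom) boxes covering a width x height canvas."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in tile_starts(height, tile, overlap)
        for x in tile_starts(width, tile, overlap)
    ]


if __name__ == "__main__":
    # A 2048x2048 upscale target is split into a 3x3 grid of 1024px tiles.
    for box in tile_boxes(2048, 2048):
        print(box)
```

Because each tile is denoised on its own, global effects such as bokeh or a smooth background blur can come out inconsistent between tiles, which is the complaint raised in the next messages.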

dull star Jun 10, 2024, 7:44 PM

#

turbid grotto tiled upscale breaks bokeh and blur for me agony

this is exactly why I am worried about them saying "just use tiled upscale"

#

then again, lykon's images must've been made with upscaling, and they look wonderful

#

maybe low denoising is okay? 🤷‍♂️

#

I really hope its just a case of fixing a part of the code and not a limitation with DiT

turbid grotto Jun 10, 2024, 7:46 PM

#

dull star then again, lykon's images must've been made with upscaling, and they look wonderful

maybe sd3 is just smarter)

#

also tiled controlnet might be more effective

dull star Jun 10, 2024, 7:48 PM

#

turbid grotto maybe sd3 is just smarter)

no its definitely upscaling

#

but the VAE is superior though

#

tropic plume Jun 10, 2024, 7:49 PM

#

guys

dull star Jun 10, 2024, 7:49 PM

#

this is an image without highresfix and the woman's face doesn't look like a mess

#

the eyes look perfect

tropic plume Jun 10, 2024, 7:49 PM

#

do i create images here or can i invite bot to my dms

dull star Jun 10, 2024, 7:49 PM

#

I actually don't know, but you can use #1237459938901491852 channels of course
check out #artisan-faq

tropic plume Jun 10, 2024, 7:49 PM

#

thankyouuu

turbid grotto Jun 10, 2024, 7:51 PM

#

dull star but the VAE is superior though

will it be slower?

dull star Jun 10, 2024, 7:51 PM

#

no idea actually

turbid grotto Jun 10, 2024, 7:51 PM

#

hope not 4 times slower waow

dull star Jun 10, 2024, 7:51 PM

#

I hope its mostly vram difference and it can be solved with a tiled VAE

#

but I hope its not 4 times slower

tropic plume Jun 10, 2024, 7:52 PM

#

it wont let me type anything in that channel

teal fossil Jun 10, 2024, 7:52 PM

#

dull star

Is he using the 2B we'll get or a version of the 8B?

dull star Jun 10, 2024, 7:53 PM

#

these are 8B pictures

teal fossil Jun 10, 2024, 7:55 PM

#

dull star these are 8B pictures

Meh. That tease right before the 2B release is in bad taste.

dull star Jun 10, 2024, 7:56 PM

#

honestly doesn't look any better than 2B right now

hallow talon Jun 10, 2024, 8:01 PM

#

Has there been any update on what time on Wednesday we should expect the model to be released?

dull star Jun 10, 2024, 8:01 PM

#

don't recall

#

just simply expect the worst and you won't be disappointed

#

I would wager midday to night in the US

#

I'd be surprised if they released it around greenwich midday time

hallow talon Jun 10, 2024, 8:04 PM

#

as long as it comes out at some point on wednesday as promised I'm happy lol.

#

thanks 🙂

dull star Jun 10, 2024, 8:05 PM

#

it cannot possibly be delayed, the model is trained well now

hallow talon Jun 10, 2024, 8:05 PM

#

oh yeah, wasn't expecting it to be! Might've worded that wrong. I was just curious what time to expect it 🙂

vapid radish Jun 10, 2024, 8:20 PM

#

dull star Jun 10, 2024, 8:24 PM

#

https://github.com/comfyanonymous/ComfyUI/commit/8c4a9befa7261b6fc78407ace90a57d21bfe631e

GitHub

SD3 Support. · comfyanonymous/ComfyUI@8c4a9be

#

GUYS

#

YESSSS

#

#

#

#

YESS THANK YOU COMFY

#

excellent!!!!!

#

we can leave prompts empty to try it all out

bitter hearth Jun 10, 2024, 8:39 PM

#

Comfy

#

sadcat

#

(the UI)

hallow talon Jun 10, 2024, 8:40 PM

#

hopefully someone will make a workflow cause I can use comfy but I don't know exactly how to connect everything together properly (especially now that it got more complicated with SD3)

remote holly Jun 10, 2024, 8:43 PM

#

How many GB of memory does SD3 take, including all components?

gusty trail Jun 10, 2024, 8:50 PM

#

cliploader cliptextencoder not a very good name for t5

dull star Jun 10, 2024, 8:50 PM

#

yeah idk why its still cliptextencodesd3

#

when there's a T5 as well

#

but I suppose its for familiarity

dull star Jun 10, 2024, 8:51 PM

#

hallow talon hopefully someone will make a workflow cause I can use comfy but I don't know ex...

https://comfyanonymous.github.io/ComfyUI_examples/ not now, but later it will probably be here

ComfyUI_examples

ComfyUI Examples

Examples of ComfyUI workflows

vapid radish Jun 10, 2024, 8:55 PM

#

Do we know how big the SD3 model file is going to be yet? I did just order another 4TB SSD just in case 😀

dull star Jun 10, 2024, 8:57 PM

#

2B isn't going to be big, especially since its going to be at bf16 or fp16

#

it's T5 that's gonna be massive of course

#

which is 10 GB if you haven't installed it

turbid grotto Jun 10, 2024, 9:10 PM

#

why T5 is so huge 💀

dull star Jun 10, 2024, 9:11 PM

#

its a large LLM model (large large language model kek)

#

even if we are only using the encoder part of it

teal fossil Jun 10, 2024, 9:12 PM

#

dull star it cannot possibly be delayed, the model is trained well now

Don't jinx it now. 😉

dull star Jun 10, 2024, 9:12 PM

#

I mean if they train it more, I don't mind

#

okay maybe I do, cause I kinda want to try the model offline now 😔

teal fossil Jun 10, 2024, 9:13 PM

#

dull star

Do you know what Shift refers to?

dull star Jun 10, 2024, 9:14 PM

#

no idea, but cascade also needed a shift node

#

yayy its alex!!!!1!

teal fossil Jun 10, 2024, 9:14 PM

#

dull star we can leave prompts empty to try it all out

Pretty neat, but like early SDXL prompting where we experimented with G or L only or custom prompts for both... it'll probably turn out not that useful.

viral plaza Jun 10, 2024, 9:14 PM

#

gusty trail cliploader cliptextencoder not a very good name for t5

lolyeah had that conversation earlier

dull star Jun 10, 2024, 9:14 PM

#

I think T5 will make a difference

viral plaza Jun 10, 2024, 9:15 PM

#

dull star https://github.com/comfyanonymous/ComfyUI/commit/8c4a9befa7261b6fc78407ace90a57d...

swarm too

teal fossil Jun 10, 2024, 9:15 PM

#

bitter hearth Comfy

The noodles don't scare me no more. I have become one with the noodle.

dull star Jun 10, 2024, 9:15 PM

#

viral plaza swarm too

oh absolutely 🔥

viral plaza Jun 10, 2024, 9:15 PM

#

dull star https://comfyanonymous.github.io/ComfyUI_examples/ **not** now, but later it wil...

probably on launch day yeah

viral plaza Jun 10, 2024, 9:15 PM

#

hallow talon hopefully someone will make a workflow cause I can use comfy but I don't know ex...

if you use Swarm it autogenerates workflows for you

#

great for getting set up and still lets you take the workflow itself and muck with it after at will

teal fossil Jun 10, 2024, 9:17 PM

#

dull star even if we are only using the encoder part of it

Alex linked a T5XXL version that is only the Encoder. It's smaller and apparently loads faster.

viral plaza Jun 10, 2024, 9:17 PM

#

vapid radish Do we know how big the SD3 model file is going to be yet? I did just order anoth...

SD3-Medium-fp16 is a 4GiB file, it relies on a separate 1.3 GiB CLIP-G, 0.24 GiB CLIP-L, and optionally a 9.5GiB T5-XXL.
When downloading finetunes expect to usually only download variants of the 4GiB file, not the textencs

teal fossil Jun 10, 2024, 9:17 PM

#

Hi Alex. 🙂

viral plaza Jun 10, 2024, 9:17 PM

#

Also it's possible to store the model in FP8, making it only a 2GiB file when you do that

dull star Jun 10, 2024, 9:18 PM

#

teal fossil Alex linked a T5XXL version that is only the Encoder. It's smaller and apparentl...

yeah its the encoder part only at like fp16 so its around 10GB

turbid grotto Jun 10, 2024, 9:18 PM

#

dull star its a large LLM model (large large language model kek)

what about q4? waow

dull star Jun 10, 2024, 9:18 PM

#

it loads WAAAY faster than fp32

viral plaza Jun 10, 2024, 9:18 PM

#

turbid grotto what about q4? waow

I don't think we have the software tooling ready for launch day for T5-4bit

#

hypothetically if we did it should work fine and would be a ~2.5GiB file
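The FP8 and 4-bit figures quoted here follow from plain bit-width arithmetic. A rough sketch, assuming file size scales linearly with the bits stored per weight and ignoring any quantization metadata (the function name is made up for illustration):

```python
def rescale_size_gib(size_gib: float, from_bits: int, to_bits: int) -> float:
    """Estimate a file's size after changing the bits stored per weight."""
    return size_gib * to_bits / from_bits


# Figures quoted in the chat: SD3-Medium fp16 is ~4 GiB, T5-XXL fp16 is ~9.5 GiB.
sd3_fp8 = rescale_size_gib(4.0, from_bits=16, to_bits=8)
t5_4bit = rescale_size_gib(9.5, from_bits=16, to_bits=4)

print(f"SD3-Medium in fp8: ~{sd3_fp8:.1f} GiB")
print(f"T5-XXL in 4-bit:   ~{t5_4bit:.1f} GiB")
```

Real quantized files usually come out a little larger than this estimate because they also store scales and other bookkeeping.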

dull star Jun 10, 2024, 9:19 PM

#

I wonder if a ggml implementation of encoder models such as T5 would help 🤔 (I'm a broken record)

viral plaza Jun 10, 2024, 9:19 PM

#

however you can just not use T5 at all as an option, imo probably the best launch-day option

#

yeah t5 works well in ggml 4bit, just need a convenient way to shove that into comfy

bitter hearth Jun 10, 2024, 9:20 PM

#

turbid grotto what about q4? waow

waow

dull star Jun 10, 2024, 9:20 PM

#

there's llama-cpp-python pip package or whatever its called

teal fossil Jun 10, 2024, 9:20 PM

#

viral plaza SD3-Medium-fp16 is a 4GiB file, it relies on a separate 1.3 GiB CLIP-G, 0.24 GiB...

That's interesting... for SDXL (and v-1.5) training the TENC alongside the Unet makes all the difference. You think that's unnecessary now?

turbid grotto Jun 10, 2024, 9:20 PM

#

viral plaza _however_ you can just not use T5 at all as an option, imo probably the best lau...

thank you for the info ♥️

teal fossil Jun 10, 2024, 9:21 PM

#

Basically the difference between mere Dreambooth and Finetunes / multi-concept-LoRA's

viral plaza Jun 10, 2024, 9:22 PM

#

teal fossil Do you know what Shift refers to?

Sigma Shift, if you're familiar with sigmas in SD it literally just offsets those values, see here https://github.com/comfyanonymous/ComfyUI/commit/8c4a9befa7261b6fc78407ace90a57d21bfe631e#diff-6c3064a93127b01542c5772a797c9d356b876fc9940ec14951f95ff8ea270656R172-R204
or for a simplified version of the same code:

teal fossil Jun 10, 2024, 9:23 PM

#

viral plaza I don't think we have the software tooling ready for launch day for T5-4bit

Taggui can apparently run Salesforce/blip2-flan-t5-xxl in 4bit. Maybe that helps?

dull star Jun 10, 2024, 9:23 PM

#

well there's bnb4bit

#

we can load it in 4-bit, but storing is a different question

bitter hearth Jun 10, 2024, 9:24 PM

#

no clue what im reading

viral plaza Jun 10, 2024, 9:25 PM

#

teal fossil That's interesting... for SDXL (and v-1.5) training the TENC alongside the Unet ...

it's quite possible that training the textstream will replace the power of training the tenc.

probably still training the tenc is a powerful tool, but it's less valuable with the streams, and also harder with the 3 tenc setup dealio, so the tradeoff made the most sense to just not include tencs

teal fossil Jun 10, 2024, 9:25 PM

#

viral plaza Sigma Shift, if you're familiar with sigmas in SD it literally just offsets thos...

I've been hearing about Sigmas more and more for a few weeks, but apart from "something something detail" I don't really get them. So changing the Shift will have a direct impact on the output and should be experimented with?

viral plaza Jun 10, 2024, 9:26 PM

#

and for reference when I say this, I'm basically personally the reason SDXL has tencs in the model lol, others wanted the tenc separated for XL but i fought to include it because training it is so worthwhile and the tencs are only 1 out of the 7 gigs of space the model takes anyway

#

balance is different for SD3 so I can't really argue it on this one

viral plaza Jun 10, 2024, 9:27 PM

#

dull star we can load it in 4-bit, but storing is a different question

bnb is so weird and jank with all the limitations like this. ggml and exllama just do things so much better

teal fossil Jun 10, 2024, 9:28 PM

#

viral plaza it's quite possible that training the textstream will replace the power training...

So of course they would have to be loaded alongside SD3 2B to be trained with it (excluding T5 since that would need more than consumer hardware)?

#

Textstream == the new Ddit architecture?

dull star Jun 10, 2024, 9:29 PM

#

viral plaza bnb is so weird and jank with all the limitations like this. ggml and exllama ju...

I saw in the terminal that apparently it's an immovable brick due to bitsandbytes, so could it be that you cannot offload it to RAM after you have used it?

teal fossil Jun 10, 2024, 9:30 PM

#

viral plaza balance is different for SD3 so I can't really argue it on this one

It's ~1.5GB for L & G now. Are they exactly the same Clip's SDXL used or custom ones?

viral plaza Jun 10, 2024, 9:31 PM

#

teal fossil I've been hearing about Sigmas more and more for a few weeks, but apart from "so...

Uh so the short of it is:

in early diffusion models, we had timesteps 1-1000 exactly
in the modern era of diffusion, we now have dynamic steps (eg 20 steps or 50 steps etc) and that works by converting to approx timesteps (so eg for 50 steps, you multiply your step by 20 to get a timestep value) and adding the sigma value to represent the timesteps in a way the model can actively process
the shift effectively curves the timestep space, so it can spend more time in the early (structural) steps or more time in the later (detail) steps

If you've ever used my Dynamic Thresholding toolkit in auto/comfy/swarm, the CFG Scheduler feature is very similar to sigma shift (albeit of course using the CFG rather than sigmas to push this preference schedule)

#

in the case of SD3 I think sigmas are basically just linear by default until you apply the shift (vs other models had more of an algorithm to em)

#

(the sigma goes through an embedder to turn into latent magic inside of the model)
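The step-to-timestep conversion described above can be sketched in a few lines. This assumes 1000 training timesteps and a plain linear sigma (timestep divided by 1000) before any shift is applied; both helper names are hypothetical and not taken from any UI's code:

```python
def step_to_timestep(step: int, num_steps: int, train_timesteps: int = 1000) -> float:
    """Map a sampler step (counted from num_steps down to 0) onto the training timestep range."""
    return step * train_timesteps / num_steps


def linear_sigma(timestep: float, train_timesteps: int = 1000) -> float:
    """Assumed unshifted schedule: sigma grows linearly with the timestep."""
    return timestep / train_timesteps


num_steps = 50
for step in (50, 25, 1, 0):
    t = step_to_timestep(step, num_steps)
    print(f"step {step:2d} -> timestep {t:6.1f} -> sigma {linear_sigma(t):.3f}")
```

With 50 sampler steps each step spans 1000 / 50 = 20 training timesteps.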

teal fossil Jun 10, 2024, 9:33 PM

#

So Sigma Shift 0.5 -> more time for structure Vs Shift 2 -> more time for detail?

viral plaza Jun 10, 2024, 9:33 PM

#

teal fossil So of course they would have to be loaded alongside SD3 2B to be trained with it...

TextEncs don't have to be loaded, you can precalculate embeddings separately. Better for VRAM that way. Kohya and other trainers support this out of the box iirc

the textstream is part of the model, it can't be separated. Presumably trainers will let you select which parts of the SD3 model you want to freeze or unfreeze
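The idea of precalculating embeddings can be sketched as a simple cache. The encoder here is a toy stand-in (an assumption for illustration, not how Kohya or any real trainer does it): it only shows that each unique caption is encoded once, after which the text encoders no longer need to sit in VRAM.

```python
import hashlib


def toy_text_encoder(caption: str) -> list[float]:
    """Hypothetical stand-in for a real text encoder: caption -> small vector."""
    digest = hashlib.sha256(caption.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:4]]


def precompute_embeddings(captions, encoder):
    """Run the encoder once per unique caption and keep the results."""
    cache = {}
    for caption in captions:
        if caption not in cache:
            cache[caption] = encoder(caption)
    return cache


captions = ["a cat", "a dog", "a cat"]
cache = precompute_embeddings(captions, toy_text_encoder)

# From here on the encoder could be unloaded; training only reads the cache.
batch = [cache[c] for c in captions]
print(len(cache), "unique captions encoded for", len(batch), "training samples")
```

A real trainer would typically write the cached tensors to disk next to the images rather than keep them in a dict.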

teal fossil Jun 10, 2024, 9:34 PM

#

Like we could (later down the road once it's fixed) go 0.5 Shift for the initial image and Shift 2 for the HiresFix / Ultimate Upscale / whatever?

viral plaza Jun 10, 2024, 9:34 PM

#

teal fossil Textstream == the new Ddit architecture?

the "mm-" in "mm-dit" is "multimodal", referring to the multiple streams - in SD3 base there's text & image streams, but hypothetically if you go to add eg a controlnet, you'd just add a controlnet stream

viral plaza Jun 10, 2024, 9:35 PM

#

teal fossil It's ~1.5GB for L & G now. Are they exactly the same Clip's SDXL used or custom ...

SD3 is, for all official purposes, CLIP-G + CLIP-L + T5-XXL. While I think in practice everyone's gonna just yeet t5 out a window and focus clips-only, the official reference publication stuff is all including T5

rugged nova Jun 10, 2024, 9:36 PM

#

https://huggingface.co/Alpha-VLLM/Lumina-Next-T2I Uses RoPE scaling to generate at higher resolutions. Any idea if SD3 will also be capable of this?

Alpha-VLLM/Lumina-Next-T2I · Hugging Face

dull star Jun 10, 2024, 9:36 PM

#

#🆕｜sd3 message

#

this was 5 days ago, idk if something changed

#

but we can do tiled upscaling for the time being

viral plaza Jun 10, 2024, 9:37 PM

#

teal fossil So Sigma Shift 0.5 -> more time for structure Vs Shift 2 -> more time for detail...

this graph shows blue = no shift
red = shift 3
blue = shift 0.5

red as you can see runs through timesteps much faster early on in structural phase, then spends more time slowly on the later detail steps

#

SD3's default reference recommended shift is 3

#

it makes sense to play in the range of 1.5 to 3

#

you probably wouldn't ever do below 1
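To see what different shift values do to a schedule, here is a minimal sketch. It assumes sigmas run from 1 (pure noise) to 0 (clean image) and uses one common form of the shift for flow models; the exact formula is an assumption here, not something quoted in the chat.

```python
def shift_sigma(sigma: float, shift: float) -> float:
    """Warp a sigma in [0, 1]; shift == 1 leaves it unchanged (assumed formula)."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


steps = 4
base = [1.0 - i / steps for i in range(steps + 1)]  # linear: 1.0, 0.75, 0.5, 0.25, 0.0

for shift in (0.5, 1.0, 3.0):
    shifted = [round(shift_sigma(s, shift), 3) for s in base]
    print(f"shift {shift}: {shifted}")
```

The endpoints stay at 1 and 0 for every shift value; only the spacing of the sigmas in between changes.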

dull star Jun 10, 2024, 9:38 PM

#

nice, this is a good demonstration

viral plaza Jun 10, 2024, 9:39 PM

#

rugged nova https://huggingface.co/Alpha-VLLM/Lumina-Next-T2I Uses RoPE scaling to generate ...

not at launch, and probably won't be the exact same code (I think RoPE doesn't apply here iirc, the posembeds are weirder) but most likely someone will find a way

#

("RoPE" is "rotary positional embeds" and "RoPE Scaling" is a technique to scale RoPE, SD3 has its own pos embedding logic)

teal fossil Jun 10, 2024, 9:43 PM

#

viral plaza TextEncs don't have to be loaded, you can precalculate embeddings separately. Be...

I still have to wrap my head around the fact that very early day embeddings are back on the menu... xD

viral plaza Jun 10, 2024, 9:48 PM

#

TI embeddings have been awesome the whole time

#

they are incredible on XL

#

i wish people would use them more / write good tools for them

#

it's theoretically possible to train a small TI in a few seconds (some people even published code that does this before, then deleted it off github argh)

#

LoRAs are more flexible obviously but for single-concept training TIs are hard to beat for speed/quality/usability

dull star Jun 10, 2024, 9:55 PM

#

also aren't they going to work across all 3 (or 4) model sizes?

#

since they use the same clip?

dull star Jun 10, 2024, 9:57 PM

#

viral plaza TI embeddings have been awesome the whole time

the only time I remember using TIs for fun were in SD2 days

#

idk what happened, but stylistic TIs were really good in quality

#

we had like midjourney TIs and Greg Rutkowski TIs

#

they were nice and super small in filesize thanks to TI's nature

#

would making a TI for SD3 have nearly the same procedure as making a TI for SDXL?

fleet meteor Jun 10, 2024, 9:58 PM

#

dull star would making a TI for SD3 have nearly the same procedure as making a TI for SDXL...

we'll see in 2 days 😁

dull star Jun 10, 2024, 9:58 PM

#

eh but I'm inexperienced in training 😔

#

I'll have to wait for random rentry blogs lmao

#

all I did is use google colab and then later some super braindead easy GUI to train Loras for SDXL

fleet meteor Jun 10, 2024, 9:59 PM

#

I think I was decent in training until I tried training a lora / checkpoint with more than 30 images

#

My model broke

dull star Jun 10, 2024, 10:00 PM

#

damn

#

https://github.com/Nerogar/OneTrainer okay this has a super easy installation method and it has an SDXL Embedding preset

turbid grotto Jun 10, 2024, 10:16 PM

#

dull star https://github.com/Nerogar/OneTrainer okay this has a super easy installation me...

It is super easy to use

#

and convenient

raven fern Jun 10, 2024, 10:22 PM

#

dull star https://github.com/comfyanonymous/ComfyUI/commit/8c4a9befa7261b6fc78407ace90a57d...

oh shit.. waddup 😮

dull star Jun 10, 2024, 10:23 PM

#

ah, CEST has 1 day left

dry wave Jun 10, 2024, 10:29 PM

#

I found TIs amazing in SD 2.1 but in SDXL they didn't work well for me

#

also I had the feeling that TIs overfit much worse than unet training

wild remnant Jun 10, 2024, 10:32 PM

#

viral plaza Jun 10, 2024, 10:33 PM

#

dull star also aren't they going to work across all 3 (or 4) model sizes?

Yes. For that matter an SDXL TI trained today will work on SD3 when it's out

viral plaza Jun 10, 2024, 10:34 PM

#

dull star ah, CEST has 1 day left

that's not how timezones work but lol

wild remnant Jun 10, 2024, 10:35 PM

#

dull star Jun 10, 2024, 10:36 PM

#

viral plaza Yes. For that matter an SDXL TI trained today will work on SD3 when it's out

excellent, thanks

raven fern Jun 10, 2024, 10:51 PM

#

wild remnant

damn son.. give me :3

wild remnant Jun 10, 2024, 10:59 PM

#

@raven fern lol

coral sable Jun 10, 2024, 10:59 PM

#

hype is unreal, last day (or 2) of waiting catroll cowroll pikaroll

raven fern Jun 10, 2024, 11:00 PM

#

happemad

wild remnant Jun 10, 2024, 11:00 PM

#

@raven fern

raven fern Jun 10, 2024, 11:00 PM

#

❤️

coral sable Jun 10, 2024, 11:02 PM

#

release will be as planned, on schedule?

#

raven fern Jun 10, 2024, 11:05 PM

#

all according to keikaku

coral sable Jun 10, 2024, 11:07 PM

#

agile hornet Jun 10, 2024, 11:17 PM

#

When it drops on huggingface I should be able to just grab the model and toss it in A1111 right?

low stone Jun 10, 2024, 11:23 PM

#

low stone Jun 10, 2024, 11:24 PM

#

agile hornet When it drops on huggingface I should be able to just grab the model and toss it...

Nope

#

There'll be comfy support at launch. Not a111

agile hornet Jun 10, 2024, 11:24 PM

#

noooooo

low stone Jun 10, 2024, 11:25 PM

#

Comfy is written by stability ai. The other guys are their own separate devs so they'll have to work on integrating afterwards.

fleet meteor Jun 10, 2024, 11:25 PM

#

Aaa I can't wait to try it

raven fern Jun 10, 2024, 11:28 PM

#

132k lines changed 😮

molten valley Jun 10, 2024, 11:34 PM

#

low stone Nope

it will be on huggingface right ?

low stone Jun 10, 2024, 11:39 PM

#

raven fern 132k lines changed 😮

dull star Jun 10, 2024, 11:48 PM

#

low stone

very epic new nodes

low stone Jun 10, 2024, 11:49 PM

#

dull star very epic new nodes

Ooooh very exciting. Didn't realize they pulled it into main branch already

viral plaza Jun 10, 2024, 11:50 PM

#

agile hornet noooooo

you can use it in Swarm as well on day 1, which has a friendly interface like auto1111 does

viral plaza Jun 10, 2024, 11:50 PM

#

molten valley it will be on huggingface right ?

ye

viral plaza Jun 10, 2024, 11:50 PM

#

raven fern 132k lines changed 😮

most of that is just the tokens file for t5 lol

viral plaza Jun 10, 2024, 11:51 PM

#

low stone Ooooh very exciting. Didn't realize they pulled it into main branch already

yep lol everything goes straight to master for comfy

low stone Jun 10, 2024, 11:51 PM

#

viral plaza most of that is just the tokens file for t5 lol

Sshhh don't say that, it's all code that you guys toiled away on. Just go with it. 🙂

viral plaza Jun 10, 2024, 11:52 PM

#

dull star oh btw they might've already started training 8B

8B started training months ago but yes it's regained focus for the training team now that 2B is about ready

dull star Jun 10, 2024, 11:52 PM

#

excellent!

viral plaza Jun 10, 2024, 11:53 PM

#

sick cedar can anyone tell me what version of the T5 model SD3 uses?

(scrolling through backlog sorry if this already got an answer) T5-XXL, encoder only, as found eg here https://huggingface.co/mcmonkey/google_t5-v1_1-xxl_encoderonly/tree/main

coral sable Jun 10, 2024, 11:53 PM

#

updated and rdy ^^

viral plaza Jun 10, 2024, 11:54 PM

#

coral sable updated and rdy ^^

if you update swarm you can get a valid SD3 workflow out of it too lol

low stone Jun 10, 2024, 11:55 PM

#

Is there a new ksampler as well or still using the same one?

dull star Jun 10, 2024, 11:55 PM

#

nope

#

all of the nodes I've shown you are the all of the nodes I've shown you

#

🔥

low stone Jun 10, 2024, 11:55 PM

#

Ah

agile hornet Jun 10, 2024, 11:55 PM

#

Not sure if I can run Swarm, I got a 2080TI with 8GB of ram and all the other ones work well but I heard swarm uses more resources to run, is that true?

viral plaza Jun 10, 2024, 11:55 PM

#

low stone Is there a new ksampler as well or still using the same one?

same ksamplers as normal

viral plaza Jun 10, 2024, 11:56 PM

#

agile hornet Not sure if I can run Swarm, I got a 2080TI with 8GB of ram and all the other on...

Swarm uses less resources than auto does lol

#

it's way more efficient

agile hornet Jun 10, 2024, 11:56 PM

#

nice

#

im gonna set it up today then

dull star Jun 10, 2024, 11:56 PM

#

viral plaza same ksamplers as normal

what about schedulers?

#

can we use all samplers and schedulers?

#

how does the flow matching influence it

viral plaza Jun 10, 2024, 11:57 PM

#

dull star can we use all samplers and schedulers?

You generally want Euler+Normal, but any Sampler that isn't Ancestral or Stochastic (ie SDE) should work

#

ancestral/stochastic are incompatible with flow
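For a feel of what a plain Euler sampler does with a flow model, here is a toy sketch. It assumes a rectified-flow setup where the noisy sample is a straight-line mix of clean data and noise and the model predicts the velocity (noise minus clean); the "model" below is a stand-in that returns the exact velocity, so everything here is illustrative rather than any UI's sampler code.

```python
def euler_flow_sample(x, sigmas, velocity_fn):
    """Deterministic Euler integration: follow the predicted velocity, add no new noise."""
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        v = velocity_fn(x, sigma)
        x = [xi + (sigma_next - sigma) * vi for xi, vi in zip(x, v)]
    return x


clean = [0.2, -0.5, 1.0]   # pretend "image"
noise = [1.5, 0.3, -0.7]   # starting point at sigma = 1


def exact_velocity(x, sigma):
    """Toy stand-in for the model: the true velocity of the straight path."""
    return [n - c for n, c in zip(noise, clean)]


sigmas = [1.0 - i / 10 for i in range(11)]  # 1.0 down to 0.0
result = euler_flow_sample(list(noise), sigmas, exact_velocity)
print([round(r, 6) for r in result])
```

Each update only moves along the predicted velocity. An ancestral or SDE sampler would also inject fresh noise at every step, which this update rule has no place for.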

low stone Jun 10, 2024, 11:58 PM

#

dull star what about schedulers?

That's a good question. Is there any benefit for the perturbed and align your steps nodes with sd3?

raven fern Jun 10, 2024, 11:59 PM

#

go with the flow 🙂

dull star Jun 10, 2024, 11:59 PM

#

we'll see, about that..........

viral plaza Jun 10, 2024, 11:59 PM

#

AYS won't work out of the box, but can be retrained for SD3

raven fern Jun 10, 2024, 11:59 PM

#

so nvidia has to release something new for sd3?

dull star Jun 11, 2024, 12:00 AM

#

it's gonna be cpu by default?

viral plaza Jun 11, 2024, 12:00 AM

#

alternate guidance options like SAG and PAG will probably just flat not work as-is and need new research to make things like em

raven fern Jun 11, 2024, 12:00 AM

#

yea some new research

dull star Jun 11, 2024, 12:00 AM

#

and freeu neither, cause its specifically for unets

viral plaza Jun 11, 2024, 12:01 AM

#

raven fern so nvidia has to release something new for sd3?

for AYS they published a unique list of sigma values to use per model arch, so they'd need to rerun it and get a list specific to SD3

raven fern Jun 11, 2024, 12:01 AM

#

yea basically anything unet related wont work

viral plaza Jun 11, 2024, 12:01 AM

#

dull star and freeu neither, cause its specifically for unets

oh yeah freeu is entirely out

#

as is cacheysamply :(

#

my precious baby, killed to death twice in a row

raven fern Jun 11, 2024, 12:01 AM

#

viral plaza for AYS they published a unique list of sigma values to use per model arch, so t...

right

#

i personally like the Inspire pack ksampler for ays, cause it's already built in, dont have to use a lot of nodes for AYS

dull star Jun 11, 2024, 12:03 AM

#

at least we can experiment with stuff on day one, like messing around with text encoders, textual inversion embeddings, tiled upscaling

#

what about regional sampling?

coral sable Jun 11, 2024, 12:06 AM

#

ok I'm convinced to try SwarmUI, isn't it on Pinokio, cant find

low stone Jun 11, 2024, 12:06 AM

#

dull star what about regional sampling?

I should try my stone man shooting money from his hands regional prompt on sd3. Seeing as it understands placement keywords, it can probably just do it natively

raven fern Jun 11, 2024, 12:08 AM

#

wait, i just noticed comfy did a small fix to cosxl edit models, does this mean they fixed the artifacts? im gonna check 😮

dull star Jun 11, 2024, 12:08 AM

#

oh shit I just realised

#

top-bottom memes will work perfectly due to regional sampling

carmine blaze Jun 11, 2024, 12:09 AM

#

is API call the only way to test SD3 right now?

dull star Jun 11, 2024, 12:09 AM

#

currently, yes

#

but just wait till 12th and you can use it offline

carmine blaze Jun 11, 2024, 12:09 AM

#

sweet thx

dull star Jun 11, 2024, 12:09 AM

#

you're welcome

raven fern Jun 11, 2024, 12:12 AM

#

didnt fix the artifacts, welp

#

actually wait, il try with some other pics

dull star Jun 11, 2024, 12:13 AM

#

hmm I wonder when we'll get inpaint models

#

ngl it would be perfect for SD3

raven fern Jun 11, 2024, 12:14 AM

#

man greatness awaits 🙂

viral plaza Jun 11, 2024, 12:17 AM

#

raven fern i personally like the Inspire pack ksampler for ays, cause it's already built in...

SwarmKSampler (built in with swarm) has it too

raven fern Jun 11, 2024, 12:17 AM

#

nice

viral plaza Jun 11, 2024, 12:18 AM

#

coral sable ok I'm convinced to try SwarmUI, isn't it on Pinokio, cant find

swarm is super easy to install natively, you don't really need an install manager like some other uis need https://github.com/Stability-AI/StableSwarmUI?tab=readme-ov-file#installing-on-windows

GitHub

GitHub - Stability-AI/StableSwarmUI: StableSwarmUI, A Modular Stabl...

StableSwarmUI, A Modular Stable Diffusion Web-User-Interface, with an emphasis on making powertools easily accessible, high performance, and extensibility. - Stability-AI/StableSwarmUI

low stone Jun 11, 2024, 12:18 AM

#

dull star what about regional sampling?

I think I was able to achieve directly what I did with regional prompting with this one. One of the things that's so impressive about sd3 so far is the variation available between seeds. It's not just the same image from a slightly different view each time. Every one is very different which is great.

raven fern Jun 11, 2024, 12:18 AM

#

@viral plaza do you know when we will get an edit model for sd3, like cosxl edit?

viral plaza Jun 11, 2024, 12:18 AM

#

raven fern wait, i just noticed comfy did a small fix to cosxl edit models, does this mean ...

was just a bugfix

viral plaza Jun 11, 2024, 12:18 AM

#

dull star hmm I wonder when we'll get inpaint models

don't need an inpaint model really

viral plaza Jun 11, 2024, 12:21 AM

#

viral plaza don't _need_ an inpaint model really

here see for example i did an inpaint with sd3 medium release candidate directly

#

my lil terminator kitty looks awesome

#

i set the creativity a bit too high so there's some discoloration on the edges but not too bad unless you're looking really really close

#

this isn't some SD3 magic btw this is just Swarm's inpaint code working well lol

#

can do the same with XL

viral plaza Jun 11, 2024, 12:23 AM

#

raven fern @viral plaza do you know when we will get an edit model for sd3, like ...

iiiddunno. Hopefully somebody finds a better way to make one than just ip2p. If not... well somebody's gonna make an ip2p on sd3 at some point

raven fern Jun 11, 2024, 12:23 AM

#

yea

#

also, look at Alex flexing with the release candidate version 🙂

dull star Jun 11, 2024, 12:24 AM

#

viral plaza here see for example i did an inpaint with sd3 medium release candidate directly

wow that works well for a base model

#

like yeah its 0.9 strength, but still

viral plaza Jun 11, 2024, 12:25 AM

#

can of course also just do the mask better to prevent the discoloration

dull star Jun 11, 2024, 12:25 AM

#

kek

raven fern Jun 11, 2024, 12:26 AM

#

nice kitty

viral plaza Jun 11, 2024, 12:26 AM

#

oh actually turn off Mask Shrink Grow and this looks really good

dull star Jun 11, 2024, 12:26 AM

#

how come a1111 users didn't switch to stableswarm yet, it even has these convenient features similar to it

viral plaza Jun 11, 2024, 12:26 AM

#

... anyway tldr point is, if you use a good UI like Swarm and fiddle settings a bit, you don't really need an inpaint model

viral plaza Jun 11, 2024, 12:27 AM

#

dull star how come a1111 users didn't switch to stableswarm yet, it even has these conveni...

main reason most people aren't using Swarm is just cause they don't know about swarm or don't really comprehend what swarm is

#

most people once they try swarm never go back to anything else

dull star Jun 11, 2024, 12:27 AM

#

its not 1:1 to a1111, so therefore it must be bad thomas

viral plaza Jun 11, 2024, 12:28 AM

#

comfy users it's a 100% no-brainer to use swarm, auto webui there's differences to learn but ... like, most of the differences are improvements so lol

#

the one pain point for auto users is if you have old auto extensions you really like - you can usually find a comfy equivalent, but if you don't like the noodles, it's awkward

coral sable Jun 11, 2024, 12:28 AM

#

waiting installation to finish :c

viral plaza Jun 11, 2024, 12:28 AM

#

swarm is generally less reliant on extensions though, a lot of stuff is built in

#

you only need extensions when you're getting really really crazy

coral sable Jun 11, 2024, 12:30 AM

#

meanwhile octopussy v2 (SD3)

raven fern Jun 11, 2024, 12:30 AM

#

stable swarm seems very well built, il try it one day

coral sable Jun 11, 2024, 12:30 AM

#

not sure if she be eating herself or that's how she is

#

when ~~lambo~~ sd3🤣

ocean lance Jun 11, 2024, 1:12 AM

#

viral plaza (scrolling through backlog sorry if this already got an answer) T5-XXL, encoder ...

would this work too? i tried with pixart and the output was identical https://huggingface.co/city96/t5-v1_1-xxl-encoder-bf16

city96/t5-v1_1-xxl-encoder-bf16 · Hugging Face

#

just saw yours is roughly the same size

viral plaza Jun 11, 2024, 1:13 AM

#

that looks like it's the same thing just bf16 instead of fp16, so yeah

ocean lance Jun 11, 2024, 1:13 AM

#

do you know why the team chose xxl over xl? xl seems to be more popular just from searching for fine tunes of xxl

viral plaza Jun 11, 2024, 1:15 AM

#

ocean lance do you know why the team chose xxl over xl? xl seems to be more popular just fro...

they literally just grabbed the biggest one they could find im pretty sure lol

ocean lance Jun 11, 2024, 1:16 AM

#

bigger is better 😄

hallow lion Jun 11, 2024, 1:23 AM

#

Prepare the consistency nodes for the arrival of SD3.

#

the only thing left to conquer - consistency.

#

we have it in lighting btw

#

😄

#

we just need shape/texture consistency now

#

1 day left

#

dont predict the arrival of the messiah

#

well in this case we can

low stone Jun 11, 2024, 1:31 AM

#

faint breach Jun 11, 2024, 1:57 AM

#

can sd3 make it 2 more days from now though? can it time shift?

frozen lynx Jun 11, 2024, 2:04 AM

#

guys we gotta think of something to make time go faster so it's 2 days from now

viral plaza Jun 11, 2024, 2:04 AM

#

faint breach can sd3 make it 2 more days from now though? can it time shift?

no but effectively-time-travel has been proposed before

#

("anticausal" = model that's trained to solve in the reverse of the direction of time)

faint breach Jun 11, 2024, 2:05 AM

#

using improbability fields? thonk

viral plaza Jun 11, 2024, 2:06 AM

#

i was thinking MM-DeLorean

faint breach Jun 11, 2024, 2:06 AM

#

this is heavy

frozen lynx Jun 11, 2024, 2:07 AM

#

like .. if u move fast enough u time travel to the future right? we all just gotta get on a spaceship that can travel at near light speed

faint breach Jun 11, 2024, 2:09 AM

#

naw. that'll just age us infinitely before we get back to the same moment

wild remnant Jun 11, 2024, 2:33 AM

#

faint breach Jun 11, 2024, 2:41 AM

#

if i eat that will it be 2 days from now?

ocean lance Jun 11, 2024, 3:07 AM

#

maybe from a food induced coma

compact forge Jun 11, 2024, 3:09 AM

#

whos ready for unexpected delay on release day like last time 😆

sterile pendant Jun 11, 2024, 3:24 AM

#

compact forge whos ready for unexpected delay on release day like last time 😆

And then all the selfish people that will complain about their free toy being late

compact forge Jun 11, 2024, 3:25 AM

#

obviously if you delay something right as people expect it to come out

frozen lynx Jun 11, 2024, 3:25 AM

#

I think I figured it out. so like time is a measurement of light moving, so if we can move at light speed, it'd be like pressing pause on a video game but then we could just skip to any point in time instantly so to get to SD3 launch all we gotta do is go fast like sonic

desert garnet Jun 11, 2024, 3:29 AM

#

compact forge obviously if you delay something right as people expect it to come out

damn selfish ppl who pay for api access

sterile pendant Jun 11, 2024, 3:32 AM

#

desert garnet damn selfish ppl who pay for api access

That already have access to it and whose credits are still going to be in their accounts...

compact forge Jun 11, 2024, 3:32 AM

#

yeah man fucking leeches

#

i hate people so much

ocean lance Jun 11, 2024, 3:33 AM

#

is this the first model that hasn't leaked?

desert garnet Jun 11, 2024, 3:33 AM

#

compact forge yeah man fucking leeches

we truly live in a society

sterile pendant Jun 11, 2024, 3:35 AM

#

But for real, people will complain about anything. Even when it does come out on time, people will complain that they can't run it on their 2gb vram gpu and 8gb ram laptop from 10 years ago

#

Or that it "suck"

#

Because no bobs n vegine

compact forge Jun 11, 2024, 3:37 AM

#

sterile pendant Because no bobs n vegine

thats valid tho

desert garnet Jun 11, 2024, 3:39 AM

#

too many bobs and vegana,ban model pls

sterile pendant Jun 11, 2024, 3:46 AM

#

compact forge thats valid tho

To who? 90% of the complainers are just people from countries where prawn is banned, like India and China.

#

They just happen to come from the two largest countries on the planet

compact forge Jun 11, 2024, 3:47 AM

#

sterile pendant To who? 90% of the complainers are just people from countries where prawn is ban...

lol you think they gen images cause they have no access to porn? 😆

sterile pendant Jun 11, 2024, 3:48 AM

#

It's illegal in both those countries.

desert garnet Jun 11, 2024, 3:53 AM

#

compact forge lol you think they gen images cause they have no access to porn? 😆

didnt u know? ppl in china and india have an iron mask on their head with tiny wires coming out of it

compact forge Jun 11, 2024, 4:02 AM

#

sterile pendant It's illegal in both those countries.

so im asking if they are using ai as a workaround?

viral plaza Jun 11, 2024, 4:02 AM

#

ocean lance is this the first model that hasn't leaked?

we still got 2 days for somebody to be stupid about it

#

like the person that saw "stable audio open" uploaded to HF and stole it and reuploaded a "leak" with the word "open" removed

#

(i really wonder what the goal was. Was it, like, clout chasing? or... why tho?)

compact forge Jun 11, 2024, 4:03 AM

#

viral plaza we still got 2 days for somebody to be stupid about it

do they even have a set of closed beta testers like sdxl?

viral plaza Jun 11, 2024, 4:03 AM

#

compact forge do they even have a set of closed beta testers like sdxl?

nope, only a few business-tier partners this time

#

vs XL for example was sent out to anyone with a .edu address

ocean lance Jun 11, 2024, 4:04 AM

#

so why not anyone with .com address? ;p

#

any sensible training parameters to recommend for training a lora?

viral plaza Jun 11, 2024, 4:05 AM

#

that'd be a @lavish osprey question

bitter hearth Jun 11, 2024, 4:48 AM

#

Is there a free way to use sd3?

ocean lance Jun 11, 2024, 4:52 AM

#

bitter hearth Is there a free way to use sd3?

when the weights are released you can run it on your own computer for free

bitter hearth Jun 11, 2024, 4:53 AM

#

ocean lance when the weights are released you can run it on your own computer for free

Thanks

noble coyote Jun 11, 2024, 5:46 AM

#

bitter hearth Is there a free way to use sd3?

Fabian@GLIF

junior dune Jun 11, 2024, 5:47 AM

#

What’s the estimated vram requirement for sd3?

#

12gb and 32gb ram?

junior dune Jun 11, 2024, 5:50 AM

#

dull star how come a1111 users didn't switch to stableswarm yet, it even has these conveni...

Cause auto1111 lets me remotely share and access the website on any network

ocean lance Jun 11, 2024, 5:54 AM

#

junior dune What’s the estimated vram requirement for sd3?

my guesstimate is an 8gb gpu and 32gb of ram should be okay

#

since T5 can be loaded in CPU ram

#

pixart is a 0.6B model, requires 1.5 GB of VRAM

#

so roughly 5gb for the mmdit

#

loading clip g and clip l uses 1.8gb of vram
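A rough sketch of the arithmetic behind the guesstimate above, assuming VRAM scales linearly with parameter count (a simplification; real usage also depends on precision, resolution and offloading). The function name is made up for illustration; the figures are the ones quoted in the messages above.

```python
def scale_vram_gb(ref_params_b, ref_vram_gb, target_params_b):
    """Scale a reference VRAM figure linearly with parameter count (assumption)."""
    return ref_vram_gb * target_params_b / ref_params_b

# Reference point from the chat: PixArt, 0.6B parameters, 1.5 GB of VRAM.
mmdit_gb = scale_vram_gb(0.6, 1.5, 2.0)

# CLIP-G and CLIP-L are quoted at 1.8 GB; T5 is assumed to sit in CPU RAM.
total_gb = mmdit_gb + 1.8

print(f"MMDiT estimate: {mmdit_gb:.1f} GB, with CLIP: {total_gb:.1f} GB")
```

Under these assumptions the total stays under 8 GB, which is where the "8gb gpu" guess comes from.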

viral plaza Jun 11, 2024, 6:00 AM

#

you can just not use t5

#

at which point the reqs are <= sdxl's reqs

viral plaza Jun 11, 2024, 6:00 AM

#

junior dune Cause auto1111 lets me remotely share and access the website on any network

swarm does too!

#

here's the docs about it https://github.com/Stability-AI/StableSwarmUI/blob/master/docs/Advanced Usage.md#accessing-stableswarmui-from-other-devices

ocean lance Jun 11, 2024, 6:01 AM

#

comfy only uses 3.6gb to run sdxl

#

does that make the 2B model excluding the clip models less than 1.8gb?

junior dune Jun 11, 2024, 6:03 AM

#

viral plaza swarm does too!

Awesome! Thank you

#

Does it generate a link you can access on mobile on another network like A1111 does?

viral plaza Jun 11, 2024, 6:04 AM

#

ocean lance does that make the 2B model excluding the clip models less than 1.8gb?

SDXL minus CLIP is ~5.5GiB, SD3-Medium (2B) minus textenc is ~4GiB
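As a sanity check on the ~4GiB figure, here is a minimal sketch that converts a parameter count into weight size. It assumes 16-bit weights (2 bytes per parameter), which the chat does not state, and it counts weights only, not activations or other runtime overhead.

```python
GIB = 1024 ** 3  # bytes per GiB

def weights_gib(params, bytes_per_param=2):
    """Size of the raw weights in GiB; 2 bytes per parameter assumes fp16/bf16."""
    return params * bytes_per_param / GIB

sd3_medium = weights_gib(2e9)
print(f"2B parameters at 16-bit: {sd3_medium:.2f} GiB")
```

A 2B-parameter model at 16-bit comes out a little under 4 GiB, which is in line with the number quoted above.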

viral plaza Jun 11, 2024, 6:04 AM

#

junior dune Does it generate a link you can access on mobile on another network like A1111 do...

mostly yep, check the docs i linked, use cloudflared option probably

#

mobile support is a liiil wonky atm

#

it works but not a great experience tbh

ocean lance Jun 11, 2024, 6:05 AM

#

is comfy offloading the clip to cpu maybe?

junior dune Jun 11, 2024, 6:05 AM

#

viral plaza mostly yep, check the docs i linked, use `cloudflared` option probably

Sick

viral plaza Jun 11, 2024, 6:05 AM

#

ocean lance is comfy offloading the clip to cpu maybe?

if you're only using 3.6GiB of VRAM then yes, comfy is offloading things for you

ocean lance Jun 11, 2024, 6:06 AM

#

so might be able to run on my 6gb laptop card then :>

viral plaza Jun 11, 2024, 6:06 AM

#

yeah probably

ocean lance Jun 11, 2024, 6:06 AM

#

cowdance

ocean lance Jun 11, 2024, 6:11 AM

#

viral plaza yeah probably

do you know if it's any quicker for our friends with apple computers than sdxl?

viral plaza Jun 11, 2024, 6:15 AM

#

ocean lance do you know if it's any quicker for our friends with apple computers than sdxl?

https://x.com/argmaxinc/status/1790785157840125957

argmax (@argmaxinc) on X

On-device Stable Diffusion 3
We are thrilled to partner with @StabilityAI for on-device inference of their latest flagship model!

We are building DiffusionKit, our multi-platform on-device inference framework for diffusion models. Given Argmax's roots in Apple, our first step

visual hamlet Jun 11, 2024, 7:02 AM

#

Hahaha

teal fossil Jun 11, 2024, 7:37 AM

#

@viral plaza Does Swarm's Inpaint put the whole image through the Vae (degrading the image over time) or is it using a stitch-approach to mitigate that?

viral plaza Jun 11, 2024, 7:41 AM

#

teal fossil @viral plaza Does Swarm's Inpaint put the whole image through the Vae (...

there's an Advanced param under Init Image that controls that, defaults to enabled (ie stitches the image back to prevent VAE damage)
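To illustrate the stitch idea in general terms (this is not Swarm's actual code, just a sketch of the technique): only the masked pixels are taken from the freshly decoded image, and everything else is copied from the original, so untouched regions never pick up VAE round-trip damage. Images are plain nested lists here to keep the example dependency-free.

```python
def composite(original, generated, mask):
    """Per-pixel blend: mask 1.0 takes the generated value, 0.0 keeps the original."""
    return [
        [o * (1.0 - m) + g * m for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]

original = [[10.0, 10.0], [10.0, 10.0]]
# Pretend the VAE round trip slightly shifted every pixel (10 -> 9),
# and the inpainted pixel at the top right is the only intended change.
generated = [[9.0, 50.0], [9.0, 9.0]]
mask = [[0.0, 1.0], [0.0, 0.0]]

stitched = composite(original, generated, mask)
print(stitched)
```

Pixels outside the mask come back exactly as they were in the original, no matter how many inpaint passes are run.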

teal fossil Jun 11, 2024, 7:46 AM

#

viral plaza there's an Advanced param under Init Image that controls that, defaults to enabl...

Perfect.

I think I'll try to get used to Swarm more with SD3. (as I can still use my overconvoluted crazy Noodle workflows in that tab)

ocean lance Jun 11, 2024, 7:46 AM

#

does swarm have tiled upscale?

teal fossil Jun 11, 2024, 7:47 AM

#

Kinda like KoboldCpp + SillyTavern.

remote holly Jun 11, 2024, 8:00 AM

#

Will it be possible to use SD3 online tomorrow?

#

Online and free

haughty jasper Jun 11, 2024, 8:14 AM

#

Where can I use SD3?

cobalt moon Jun 11, 2024, 8:21 AM

#

haughty jasper Where can I use SD3?

tomorrow. OK?

#

ComfyUI just updated its compatibility for SD3

#

the Midnight is coming

#

thomas

#

uhhhh... actually are they going to just do the license announcement along with open weight

viral plaza Jun 11, 2024, 8:23 AM

#

ocean lance does swarm have tiled upscale?

yknow not built in currently in a convenient way but i oughtta add that before sd3 launch eh

#

(can do in comfy tab of course)

haughty jasper Jun 11, 2024, 8:23 AM

#

Is SD3 available for download now? I thought I could experience it on Discord.

cobalt moon Jun 11, 2024, 8:24 AM

#

haughty jasper Is SD3 available for download now? I thought I could experience it on Discord.

nah, the Discord one is kinda like Midjourney, which needs a subscription

haughty jasper Jun 11, 2024, 8:26 AM

#

But I couldn't find the SD3 bot in Discord.

cobalt moon Jun 11, 2024, 8:27 AM

#

haughty jasper But I couldn't find the SD3 bot in Discord.

Artisan. I suggest you wait until tomorrow, when the announcement is here

haughty jasper Jun 11, 2024, 8:28 AM

#

okk

remote holly Jun 11, 2024, 8:56 AM

#

Sd3 on hugging face ?

viral plaza Jun 11, 2024, 8:56 AM

#

ocean lance does swarm have tiled upscale?

added support for https://github.com/BlenderNeko/ComfyUI_TiledKSampler - if you install that, under Refiner you'll get Refiner Do Tiling checkbox. It's not perfect but if your RefinerControlPercentage isn't too high it does its job well

cobalt moon Jun 11, 2024, 8:56 AM

#

yeah

viral plaza Jun 11, 2024, 8:56 AM

#

haughty jasper But I couldn't find the SD3 bot in Discord.

#🗣｜artisan-support-feedback

remote holly Jun 11, 2024, 8:56 AM

#

Ha, cool. I hope it will run correctly, because Stable Cascade has a lot of issues with the diffusers lib

ocean lance Jun 11, 2024, 9:02 AM

#

viral plaza added support for https://github.com/BlenderNeko/ComfyUI_TiledKSampler - if yo...

Should've told you I made a tiled sampler 😛 It adds the tiles together using a feathered mask technique which is pretty seamless. https://github.com/bash-j/mikey_nodes/blob/main/mikey_nodes.py#L3015

torpid forge Jun 11, 2024, 9:03 AM

#

viral plaza Jun 11, 2024, 9:06 AM

#

ocean lance Should've told you I made a tiled sampler 😛 It adds the tiles together using a ...

ooo
any chance you'd be willing to PR the bare minimum version of it into swarm? (I'm intentionally trying to keep Swarm's dependencies at a bare minimum, no large packs except when absolutely necessary)

ocean lance Jun 11, 2024, 9:08 AM

#

viral plaza ooo any chance you'd be willing to PR the bare minimum version of it into swarm?...

I can try, where would it go? I have a function that splits the image into tiles, then a function that stitches the tiles back together
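For readers wondering what "split into tiles, then stitch back with a feathered mask" looks like, here is a minimal one-dimensional sketch. It is not the mikey_nodes implementation; the function names, the linear ramp and the 1D layout are all assumptions made to keep the example short. Each tile gets weights that ramp down toward its edges, and overlapping tiles are combined as a weighted average.

```python
def split_tiles(length, tile, overlap):
    """Return (start, end) spans covering [0, length) with the given overlap."""
    if tile <= overlap:
        raise ValueError("tile must be larger than overlap")
    if length <= tile:
        return [(0, length)]
    step = tile - overlap
    spans, start = [], 0
    while True:
        end = min(start + tile, length)
        spans.append((start, end))
        if end == length:
            return spans
        start += step

def feather_weights(n, overlap):
    """Linear ramp up at the left edge and down at the right edge of a tile."""
    return [min(1.0, (i + 1) / (overlap + 1), (n - i) / (overlap + 1)) for i in range(n)]

def stitch(tiles, spans, length, overlap):
    """Weighted average of overlapping tiles using the feather weights."""
    acc = [0.0] * length
    wsum = [0.0] * length
    for values, (start, end) in zip(tiles, spans):
        weights = feather_weights(end - start, overlap)
        for offset, (v, w) in enumerate(zip(values, weights)):
            acc[start + offset] += v * w
            wsum[start + offset] += w
    return [a / w for a, w in zip(acc, wsum)]

row = [float(i) for i in range(10)]
spans = split_tiles(len(row), tile=4, overlap=2)
tiles = [row[s:e] for s, e in spans]          # a sampler would process each tile here
restored = stitch(tiles, spans, len(row), overlap=2)
```

If the tiles are left unprocessed, stitching returns the original row; when tiles differ, the overlap region fades from one tile to the next instead of showing a hard seam.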

viral plaza Jun 11, 2024, 9:12 AM

#

ocean lance I can try, where would it go? I have a function that splits the image into tiles...

as a parameter to, or variant of, SwarmKSampler: https://github.com/Stability-AI/StableSwarmUI/blob/master/src/BuiltinExtensions/ComfyUIBackend/ExtraNodes/SwarmComfyCommon/SwarmKSampler.py#L115

storm saffron Jun 11, 2024, 9:41 AM

#

@viral plaza do you know if SD3 will work with tensorRT (not sure anyone would have tried it yet)?

viral plaza Jun 11, 2024, 9:44 AM

#

storm saffron @viral plaza do you know if SD3 will work with tensorRT (not sure anyon...

it will yes, nvidia will publish stuff for it

coral sable Jun 11, 2024, 10:20 AM

#

bro Swarm has everything, you were right. I'm rdy for SD3 w00t

bitter hearth Jun 11, 2024, 10:21 AM

#

has cat waow

coral sable Jun 11, 2024, 10:23 AM

#

bitter hearth has cat waow

lots of cat

bitter hearth Jun 11, 2024, 10:24 AM

#

lavish osprey Jun 11, 2024, 10:27 AM

#

viral plaza that'd be a @lavish osprey question

params for lora training? It would greatly depend on the type of lora, the dataset size, the goal, etc

#

so far we haven't tested any style or character LoRAs (well, we briefly did, but then we had to move on to other stuff).
We did do lots of aesthetic or "fix" low-rank training

prisma zenith Jun 11, 2024, 10:32 AM

#

So I read thru the history to try to understand. But I still have a few questions. Would appreciate it if someone could fill me in 😄

SAI is releasing SD3 Medium, but there could possibly be a small and a big? Also what is the impact of the 2B vs the 800M vs the large? Is the impact more on the model's ability to translate text into images and understand what it's trying to create? Or will it have an impact on image quality too? (my understanding is that it's a bit of both because it will have fewer latents to draw from, right?)

Finally, what is SD3 Ultra? Is it a ComfyUI workflow for use with the SD3 API? Or is it the large model?

Thanks to anyone who takes the time to answer!

lavish osprey Jun 11, 2024, 10:32 AM

#

It's style aligned well enough to allow most types of lora training imo

bitter hearth Jun 11, 2024, 10:33 AM

#

lavish osprey Jun 11, 2024, 10:34 AM

#

prisma zenith So I read thru the history to try to understand. But I still have a few question...

parameter count impacts all kinds of things, from general knowledge to inference capabilities (i.e. the ability to create new concepts from the ones you know)

coral sable Jun 11, 2024, 10:34 AM

#

bitter hearth

lavish osprey Jun 11, 2024, 10:34 AM

#

that being said, 2B mmdit is likely all a human can ever need.

bitter hearth Jun 11, 2024, 10:35 AM

#

coral sable

I also did that prompt and got a similar image lmao

#

coral sable Jun 11, 2024, 10:38 AM

#

SD3, same prompt(lots of cat)

bitter hearth Jun 11, 2024, 10:38 AM

#

coral sable

lavish osprey Jun 11, 2024, 10:38 AM

#

prisma zenith Jun 11, 2024, 10:39 AM

#

lavish osprey param numbers impacts all type of things, from general knowledge to inference ca...

thank you for clarifying!

#

What about what SD3 Ultra is? Is it a workflow? or the larger model?

lavish osprey Jun 11, 2024, 10:39 AM

#

it's only "Ultra" in theory, not "SD3 Ultra"

prisma zenith Jun 11, 2024, 10:40 AM

#

but what is it? a model? or a workflow to make the most of the output from the model currently on the API

lavish osprey Jun 11, 2024, 10:40 AM

#

what's in Ultra is a trade secret, you can think about Core and Ultra in terms of msg, which is salt on crack, while base models are just salt.

prisma zenith Jun 11, 2024, 10:41 AM

#

I see, so Ultra is basically the model that is available via API is that it?

#

or is that the one available via stable assistant?

bitter hearth Jun 11, 2024, 10:41 AM

#

coral sable Jun 11, 2024, 10:41 AM

#

lavish osprey

lavish osprey Jun 11, 2024, 10:42 AM

#

prisma zenith I see, so Ultra is basically the model that is available via API is that it?

ultra is not a model

#

there is no model named ultra

prisma zenith Jun 11, 2024, 10:42 AM

#

so what is ultra then? a workflow?

lavish osprey Jun 11, 2024, 10:43 AM

#

ultra refers to an api endpoint, what's behind that is not for me to say

prisma zenith Jun 11, 2024, 10:43 AM

#

lavish osprey what's in Ultra is a trade secret, you can think about Core and Ultra in terms o...

or was this a meme/joke that I'm not getting xD

prisma zenith Jun 11, 2024, 10:43 AM

#

lavish osprey ultra refers to an api endpoint, what's behind that is not for me to say

I see! i get it now

#

thanks!

lavish osprey Jun 11, 2024, 10:43 AM

#

prisma zenith or was this a meme/joke that I'm not getting xD

https://tenor.com/view/uncle-roger-king-of-flavour-gif-18146501

#

enjoy some ai generated pizza

bitter hearth Jun 11, 2024, 10:44 AM

#

thomas

prisma zenith Jun 11, 2024, 10:45 AM

#

lavish osprey ultra refers to an api endpoint, what's behind that is not for me to say

this is the one on fireworks right?

coral sable Jun 11, 2024, 10:45 AM

#

that's evil, I'm hungry

lavish osprey Jun 11, 2024, 10:45 AM

#

prisma zenith this is the one on fireworks right?

no, Ultra is provided directly by us, not Fireworks

bitter hearth Jun 11, 2024, 10:45 AM

#

coral sable that's evil, I'm hungry

lavish osprey Jun 11, 2024, 10:45 AM

#

coral sable that's evil, I'm hungry

then enjoy a dog on a ball on a surfing board

lavish osprey Jun 11, 2024, 10:46 AM

#

bitter hearth

@viral plaza give me the yellow name so I can start banning people who put pineapple on cheese and tomato sauce

#

thomas

prisma zenith Jun 11, 2024, 10:47 AM

#

got it! i'm looking at the api now. This one has Sketch & Canny as controlnets right?

lavish osprey Jun 11, 2024, 10:47 AM

#

prisma zenith got it! i'm looking at the api now. This one has Sketch & Canny as controlnets r...

elaborate on "this one"

prisma zenith Jun 11, 2024, 10:47 AM

#

https://platform.stability.ai/docs/api-reference#tag/Control/paths/~1v2beta~1stable-image~1control~1sketch/post

coral sable Jun 11, 2024, 10:47 AM

#

tested some MJ prompts with SD3

lavish osprey Jun 11, 2024, 10:48 AM

#

prisma zenith https://platform.stability.ai/docs/api-reference#tag/Control/paths/~1v2beta~1sta...

I don't remember if Structure has canny to be honest

bitter hearth Jun 11, 2024, 10:48 AM

#

#

runs away

viral plaza Jun 11, 2024, 10:51 AM

#

lavish osprey what's in Ultra is a trade secret, you can think about Core and Ultra in terms o...

For legal reasons I would like to clarify that Stability AI does not compare its services to crack, Lykon's choice of phrasing is entirely his own

viral plaza Jun 11, 2024, 10:51 AM

#

lavish osprey that being said, 2B mmdit is likely all a human can ever need.

for not-pissing-off-reddit reasons I'd like to clarify this is also Lykon's own phrasing gasjlasg

lavish osprey Jun 11, 2024, 10:53 AM

#

let's just piss off reddit

#

catlurk

lavish osprey Jun 11, 2024, 10:56 AM

#

viral plaza For legal reasons I would like to clarify that Stability AI does not compare its...

https://tenor.com/view/uncle-roger-msg-cocaine-of-cooking-gif-14767225509662422148

teal fossil Jun 11, 2024, 10:58 AM

#

lavish osprey that being said, 2B mmdit is likely all a human can ever need.

I so hope that this is true.

I can't wait to try Lora & FineTuning...

Can't wait to see how differently the text stream (& image stream?) will train compared to the XL UNet + text encoder.

#

And I wonder how long it'll be until Kohya & OneTrainer add manual Clip-G/L training for concepts.

#

And dedicated simple Embedding training.

lavish osprey Jun 11, 2024, 10:59 AM

#

you don't really need to finetune TEs for concepts

#

historically NAI used SD1.4 TE without any change

#

and Pony finetuning TEs kind of destroyed them, to the point it lacks basic knowledge like "a bank"

#

the first 5-6 months of SDXL lora finetunes didn't touch the text encoders and they mostly work very well

#

even with made up words as activation

#

(as a matter of fact, I'd suggest keeping the TEs untouched and just using them for preprocessing)

ocean lance Jun 11, 2024, 11:08 AM

#

lavish osprey you don't really need to finetune TEs for concepts

say I have 50 photos of my cat and I want to train a LoRA, where should I start? Would it be any different from training a LoRA on 100 paintings by Monet?

storm saffron Jun 11, 2024, 11:10 AM

#

lavish osprey even with made up words as activation

Is that because the text encoders will just output tokens for whatever you put in there anyway? Even if it's completely nonsensical?

teal fossil Jun 11, 2024, 11:13 AM

#

lavish osprey and Pony finetuning TEs kind of destroyed them, to the point it lacks basic know...

And it has problems with colors and color bleed.

I have high hopes that SD3 got rid of most color bleed.

teal fossil Jun 11, 2024, 11:22 AM

#

lavish osprey (as a matter of fact, I'd suggest to keep the tes untouched and just use them fo...

I'll definitely try that.

For SDXL at least, training the UNet only was never enough for new concepts (at least during my tests) - adding the text encoder always got it going in the right direction.

dry wave Jun 11, 2024, 11:27 AM

#

storm saffron Is that because the text encoders will just output tokens for whatever you put i...

each token gets an embedding, yes

#

that's how the original dreambooth paper worked: they used a "non-sensical" input token and trained on that

#

instead of using one token you can also just use a bunch of tokens (like your name)

#

the rarer (and more nonsensical) the token is, the harder it is for the model to learn from it, but there is also much less overfitting and damage

dry wave Jun 11, 2024, 11:31 AM

#

teal fossil I'll definitely try that. For SDXL at least training the Unet only was never en...

it also worked for SDXL, but training the UNet takes much more time than training the TE

lavish osprey Jun 11, 2024, 11:31 AM

#

storm saffron Is that because the text encoders will just output tokens for whatever you put i...

tokens are the input, not the output, but your intuition is correct.

lavish osprey Jun 11, 2024, 11:33 AM

#

dry wave each token gets an embedding, yes

^ this is mostly correct. The model will produce an embedding of your prompt depending on the tokens. So there are still going to be vectors that the UNet/DiT can "catch" to understand new concepts
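A toy sketch of why a made-up activation word still works as input. This is not CLIP's or T5's real tokenizer; the vocabulary, the greedy longest-match rule, the word "ohwx" and the fake embedding values are all invented for illustration. The point is only that an unknown word falls back to smaller known pieces, and every piece gets a vector.

```python
import string

# Toy vocabulary: a few whole words plus single letters as a fallback.
# Real text encoders use learned subword vocabularies; this is illustration only.
WORDS = ["photo", "of", "cat", "oh", "wx"]
VOCAB = {tok: i for i, tok in enumerate(WORDS + list(string.ascii_lowercase))}

def tokenize(text):
    """Greedy longest-match split of each word into known vocabulary pieces."""
    ids = []
    for word in text.lower().split():
        pos = 0
        while pos < len(word):
            for end in range(len(word), pos, -1):
                piece = word[pos:end]
                if piece in VOCAB:
                    ids.append(VOCAB[piece])
                    pos = end
                    break
            else:
                raise ValueError(f"cannot tokenize {word!r}")
    return ids

def embed(ids, dim=4):
    """Look up one (fake, deterministic) vector per token id."""
    return [[float(i + 1) / (d + 1) for d in range(dim)] for i in ids]

prompt_ids = tokenize("photo of ohwx cat")
prompt_vectors = embed(prompt_ids)
print(prompt_ids)
print(len(prompt_vectors), "vectors, one per token")
```

Here the nonsense word is split into two known pieces, so the prompt produces five token ids and five vectors. Training can then attach a new meaning to that otherwise unused combination.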

#

with SD3 you also have more stuff you should check when training, like "did I ruin text understanding" or "did I create conflicts among the various text encoders"

amber nexus Jun 11, 2024, 11:34 AM

#

If the SD3 2B is released tomorrow, will I be able to use the inpainting or upscaling features directly in ComfyUI?

outer cloak Jun 11, 2024, 11:35 AM

#

A quick dumb question is there a fixed time for the release like 12PM PDT or something?

teal fossil Jun 11, 2024, 11:35 AM

#

Another thing - I started pruning my LoRAs per weight - for some concepts that was highly effective and countered the detrimental effects of my LoRAs on the base models.

Will that be possible / necessary with the new architecture, or is it structured differently?

dry wave Jun 11, 2024, 11:36 AM

#

what do you mean by "per weights"?

#

but loras work technically the same everywhere. They are not specific for diffusion methods

teal fossil Jun 11, 2024, 11:37 AM

#

dry wave what do you mean with "per weights"?

You can load LoRA weights individually (can't remember what the Comfy Node was called exactly).

teal fossil Jun 11, 2024, 11:37 AM

#

dry wave but loras work technically the same everywhere. They are not specific for diffus...

But aren't they like a "UNET jank" for XL?

dry wave Jun 11, 2024, 11:38 AM

#

each matrix has its own pair of lora matrices. You can remove them individually, yes

low stone Jun 11, 2024, 11:38 AM

#

dry wave Jun 11, 2024, 11:38 AM

#

teal fossil But aren't they like a "UNET jank" for XL?

I don't know what you mean by unet jank.
But they are not limited to unet anyways. You can have text encoder loras, too

bitter hearth Jun 11, 2024, 11:46 AM

#

fathom path Jun 11, 2024, 11:47 AM

#

bitter hearth Jun 11, 2024, 11:47 AM

#

gonnabegood

vapid radish Jun 11, 2024, 11:47 AM

#

It's funny that if you mess up the prompt with SD3 it really tries to follow it, with hilarious results.
"vampire with fangs, wearing a black cape and black with red lining inside shoes"

bitter hearth Jun 11, 2024, 11:49 AM

#

vapid radish It's funny that if you mess up the prompt with SD3 it really tries to follow it,...

https://tenor.com/view/unico-uni-mario-galaxy-goomba-unicouniuni3-fight-punch-cat-gif-835764120138785088

#

its a goomba

desert garnet Jun 11, 2024, 12:01 PM

#

bitter hearth gonnabegood

it's Wednesday the 12th in New Zealand already, where's SD3?

#

https://tenor.com/view/discord-who-asked-me-looking-for-who-asked-travolta-meme-travolta-confused-gif-27133622

gusty gale Jun 11, 2024, 12:02 PM

#

desert garnet it's Wednesday the 12th in New Zealand already, where's SD3?

two weeks =]

desert garnet Jun 11, 2024, 12:02 PM

#

catsprout

teal fossil Jun 11, 2024, 12:04 PM

#

dry wave I don't know what you mean with unet jank. But they are not limited to unet anyw...

True. But it was just established we should try not training the Tenc first.

We'll see I guess.

Someone else wrote "jank" as in LoRAs interact with UNets in a strange way.

dry wave Jun 11, 2024, 12:12 PM

#

I don't know what you mean 😅

#

if you train the TE then you should train it first

#

or you train it together with unet

#

but training unet first and then te sounds wrong

#

regarding lora and unet: I have no clue what this "strange" interaction should be

#

in the end a Lora is doing nothing strange, it's just a weight update
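To make "it's just a weight update" concrete, here is a minimal sketch of merging a LoRA into base weights, with each layer carrying its own pair of low-rank matrices that can be skipped individually. The layer names, the `alpha / rank` scaling convention and the tiny 2x2 matrices are assumptions for illustration; real trainers and loaders differ in details.

```python
def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def apply_lora(weight, down, up, alpha):
    """Return weight + (alpha / rank) * up @ down."""
    rank = len(down)
    scale = alpha / rank
    delta = matmul(up, down)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]

def merge(base, lora, alpha, skip=()):
    """Merge per-layer LoRA pairs into base weights, leaving skipped layers untouched."""
    merged = {}
    for name, weight in base.items():
        if name in lora and name not in skip:
            down, up = lora[name]
            merged[name] = apply_lora(weight, down, up, alpha)
        else:
            merged[name] = [row[:] for row in weight]
    return merged

identity = [[1.0, 0.0], [0.0, 1.0]]
base = {"attn.to_q": identity, "attn.to_k": identity}   # hypothetical layer names
pair = ([[1.0, 1.0]], [[0.5], [0.0]])                    # (down: rank x in, up: out x rank)
lora = {"attn.to_q": pair, "attn.to_k": pair}

merged = merge(base, lora, alpha=1.0, skip={"attn.to_k"})
print(merged["attn.to_q"])  # updated by the low-rank delta
print(merged["attn.to_k"])  # skipped, identical to the base weights
```

Dropping a layer's pair is the same as adding a zero update for that layer, which is why this kind of pruning works the same way for a UNet, a DiT or a text encoder.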

calm surge Jun 11, 2024, 12:23 PM

#

dull star Jun 11, 2024, 12:29 PM

#

lavish osprey It's style aligned well enough to allow most types of lora training imo

Generic phone selfies look so good YESSSSSS, I hope CCTV and other styles work as well as 8B, this is good news

bitter hearth Jun 11, 2024, 12:32 PM

#

dull star Generic phone selfies look so good YESSSSSS, I hope CCTV and other styles work a...

you can just take a selfie yourself

#

thomas why gen them

sick cedar Jun 11, 2024, 12:34 PM

#

viral plaza (scrolling through backlog sorry if this already got an answer) T5-XXL, encoder ...

Yeah. Thanks for this m8!
I was making sure I could run it, because I was having trouble running Pixart with the same encoder.
It turned out in the end that I was just doing something wrong with the workflow, because it works great off of the CPU now.
Thx again for the reply. 🤗

tropic aspen Jun 11, 2024, 12:44 PM

#

So, just want to mention we're around 1 day away, and we still don't know how the licensing for SD3 works despite being told it would be explained before SD3 launches

bitter hearth Jun 11, 2024, 12:48 PM

#

storm saffron Jun 11, 2024, 12:51 PM

#

dull star Generic phone selfies look so good YESSSSSS, I hope CCTV and other styles work a...

Generic selfies aren't exactly difficult.

dull star Jun 11, 2024, 12:52 PM

#

I don't like how this one looks

#

why did the AI dude use portrait mode 😔

dull star Jun 11, 2024, 12:53 PM

#

dull star Generic phone selfies look so good YESSSSSS, I hope CCTV and other styles work a...

this one looks 300x more convincing to me

#

generic ceiling light, no depth of field, just a dude smiling

storm saffron Jun 11, 2024, 12:54 PM

#

dull star why did the AI dude use portrait mode 😔

#

There, badly lit, mildly out of focus, terrible selfie

dull star Jun 11, 2024, 1:00 PM

#

this one's better than the last

#

but I still prefer SD3's selfies

storm saffron Jun 11, 2024, 1:06 PM

#

I'd prefer to see examples of things that can't be done already in SDXL.

bitter hearth Jun 11, 2024, 1:06 PM

#

#

thomas

bitter hearth Jun 11, 2024, 1:07 PM

#

storm saffron I'd prefer to see examples of things that can't be done already in SDXL.

Tell me one thing SDXL can't do

storm saffron Jun 11, 2024, 1:08 PM

#

bitter hearth Tell me one thing SDXL can't do

Bow and arrows.

#

SDXL

bitter hearth Jun 11, 2024, 1:09 PM

#

Kek

#

Weapons are annoying indeed

#

Don't think this basic version available can do any better

storm saffron Jun 11, 2024, 1:10 PM

#

It can do it, sometimes, a bit better

desert garnet Jun 11, 2024, 1:11 PM

#

storm saffron I'd prefer to see examples of things that can't be done already in SDXL.

6 dogs standing next to each other,each one is of a different breed and has a different fur color

storm saffron Jun 11, 2024, 1:12 PM

#

Or a woman talking to another woman in the street, one has blonde hair and is wearing a red dress, the other is wearing a purple dress and a baseball cap

#

I mean it was fairly ambiguous.. 😄

#

more specific (for SD3 purposes)
a woman talking to another woman in the street, one has blonde hair and is wearing a red dress and a fedora, the other has brown hair is wearing a purple dress and a baseball cap

#

SDXL has mixed hair colours now, and only baseball caps

lavish osprey Jun 11, 2024, 1:26 PM

#

storm saffron Generic selfies aren't exactly difficult.

but this doesn't look anywhere near as realistic as the one I posted.

lavish osprey Jun 11, 2024, 1:27 PM

#

storm saffron There, badly lit, mildly out of focus, terrible selfie

this is likely the most realistic one you posted but still looks like "good cgi" to me and not a real photo