#💬｜general-chat
You havent tried Harrlogos then…
Well hopefully 3-5 weeks now that it's been a week since the CTO said 4-6 weeks 
We could be playing around with the smaller models by now
You release the small models first and theyll get such a headstart that no one will use the big one on release
Especially with how much cheaper it would be to train on
Good morning, everyone!
Indeed. That's the great thing about all of this: SD3 on release will be lightyears behind 6 months from then.
Same
I have 1660ti, quite an old card, and forge is the only viable option for me to run something
I have a 3090 coming soon, but it was nice to be able to give SD a try even with low end pc
I like forge, good inbuilt extensions and good model management under the hood giving good speed. They just need to get compatibility with supermerger sorted
I will be moving into comfy with the 3090. Its complicated but also necessary for real world use cases.
there is rtx 3090 for 670$ and 3080ti for 500$. I think it's better to buy a 3090 for sd 3. Otherwise, the 3080 might not pull it
Whats your use case??
If you can spring for the 3090, go for it as the extra vram always worth it
Im visualizing menus. You pick a style or just improve the aesthetic of existing menu photos. Its good for new restaurants, more engaging menus, and advertising
So text to image for all menu items
absolutely 3090
even if it ends up running on 12GB you will still benefit from being able to do higher resolutions and more tools running alongside SD3
like controlnets and such
im gonna get a 3090 as well
hi
Will version 3.0 require a large onboard GPU memory?
Where can I test the model now?
I don't need a commercial licence
its not on civit.ai
(yes I've signed up for EA)
its not out yet
its not out yet
its not out yet
its not out yet
(its not out yet)
How grateful our great grandchildren will be when SD3 releases
agree,cant believe u woke up and came here to spit straight facts
im sorry
why

Does anyone know of any daily challenges anywhere? I'm looking to learn and broaden my horizons and a daily challenge would be a fun way to do it.
I posted a daily challenge in #dailies you can try out.
sd3 release when
Could be tomorrow, could be next June idk
😭
Is there a reasonably accurate guide on chkpt models, as in which ones are good for what? I'm getting model fatigue as there seem to be too many. I know storage space is cheap, but I'm doing all of this on a notebook.
Or, can I run chkpt's from an external usb drive in ComfyUI?
there is a file named extra_model_paths.yaml in the comfyui folder, you can probably edit it to redirect to an external drive
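A minimal sketch of what that edit could look like, assuming the layout used by the `extra_model_paths.yaml.example` file that ships with ComfyUI (a named section with a `base_path` plus per-type subfolders). The section name `external_drive`, the mount point `/mnt/usb/sd-models`, and the helper `render_extra_model_paths` are made up for illustration; check the key names against the example file in your own install before relying on them.

```python
def render_extra_model_paths(base_path: str, section: str = "external_drive") -> str:
    """Build the text of one extra_model_paths.yaml section.

    `section` is a hypothetical label; the subfolder names below are
    assumptions about how the external drive is organised.
    """
    subfolders = {
        "checkpoints": "checkpoints",
        "loras": "loras",
        "vae": "vae",
    }
    lines = [f"{section}:", f"    base_path: {base_path}"]
    for key, folder in subfolders.items():
        # Each entry is a path relative to base_path.
        lines.append(f"    {key}: {folder}")
    return "\n".join(lines) + "\n"


# Print the section so it can be pasted into extra_model_paths.yaml.
print(render_extra_model_paths("/mnt/usb/sd-models"))
```

Loading from a USB drive should only affect how long the model takes to load, not generation speed once the weights are in memory.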
I have some questions that hopefully someone here can answer as I have only been tinkering with ai for about a month now. I have used both Automatic 1111 and Forge. Forge is definitely a bit faster on my 3080ti. I am generating batches of images and I am coming across some issues. First the faces need to be fixed and I am using Adetailer and that seems to work just fine. Adetailer adds about 1.5-2 seconds of extra generation time. This seems to be acceptable but I am curious if anyone knows any other fast fixes for faces at the moment. Next sometimes my fingers come out strange. I have tried embeddings such as badhandv4, negative_hand-neg, and deepnegative. They all seem to give better yet different results. I do not know what the best embeddings are for hands. I am curious if anyone has more info on this. Finally I use ChaiNNer to batch upscale my images by 3x. ChaiNNer honestly seems to be like a godsend piece of software. Very fast and very reliable.
Hello, where can I generate a poster?
ponyxl for boobs, dreamshaperxl or juggernautxl for fantasy, zavychroma for realism
That's the stable diffusion checkpoint starter pack
Oh yes! Ride me like a pony!
Any news or hints recently on when SD3 will drop?
You should have asked yesterday
@honest spear , @opal hedge Thank you both!
i'll be using the bathroom in about two minutes
@alex v 'preciate it.
Soapdrop?
Epicrealism is pretty good, it actually does what I say.
But I do have to type the prompt more like a sentence

Turns out it was shit.
Better change your supermarket ^^
can someone help me
i downloaded a style LORA and idk where to put the file, in lora folder?
Is there any way to create videos
I heard somewhere there is a model
Not like Sora videos
Check general chat with images
oh its comfy... 🤢
Is there like an img2img sort of thing where you put in two images and it generates something that averages the two pictures?
Why the barfy face?
A1111 for life
What's your reasoning?
UX
Fair. I would like the interface to be better..
use the new ipadapter
Forge is better tbh
i just downloaded forge
what is new abt it
noticing its faster
theres a model that used to take like 10 mins for an image on auto1111
but its taking way less on forge
yup, it is indeed 2-3x faster
it's much faster on my 4090 and my 3080 12gb, and every other card i've heard of other ppl trying
and uses less vram too
hi
All I have is a simple 4080 😦
4080 is a really good card

How long should StableDiffusionPipeline.from_pretrained() take for a local model?
hello
How does Imagine.art have SD3?
good day
will SD3 be better than SD-XL?
Is SD3 going to be on civitai
will SD3 change the world
I guess, unless a license prohibits them from doing
will sd3 cure furries?
It will change the galaxy
Only a bullet to the head will do that
Most likely on hugging face

nah, we're hard at work preparing to cure SD3 of its normie-ness. with it we will march ever closer to mandatory fursona assignment day
It could support furries
and waifus
the multiverse won't ever be the same
It will usher in the messianic age.
I hate to say it but I agree. I like the customizability of comfyui but I like the interface of automatic1111 more.
Wait till you try Forge

I hope 🥺
What's Forge? The second coming?
A way better version of a1111
A1111 interface with comfyui backend (its the fastest ui right now)
Ahhh. Do you have to mess with the backend, or can you just mess with it if you want to get your hands dirty? I like the simplicity of A111
Its like a1111, you dont have to use nodes or things like that
You can even make the images animated with SVD
Ok now you are selling me.
I'll send you a photo of how it looks on #general-with-images @verbal osprey
Forge is available right now? I tried one a111 clone, it was called vlads or something like that. It didn't really impress me too much
Is this agreed upon as the best successor to A111
Forge can change a life!!
Also, can I keep my models in my A111 folder and point to them in the forge UI? I have a lot of stuff organized and I really don't want to have to copy large model files to another directory
Not yet, i have to try it
here's what's weird though.
idk why.
forge is actually significantly faster than comfyui...
There's night and there's day, and there's a super nova

for dpmpp_2m_ancestral with karras, 1024x1024 on a 4090...
a1111: 1.83 it/s
comfyui: 2.96 it/s
forge: 3.71 it/s
I never understood these it/s, I always assumed it was mb a second or something

100%|██████████| 60/60 [00:16<00:00, 3.62it/s]
Requested to load SDXL
Loading 1 new model
100%|██████████| 60/60 [00:27<00:00, 2.16it/s]
Prompt executed in 55.23 seconds
got prompt
[rgthree] Using rgthree's optimized recursive execution.
Requested to load CLIPVisionModelProjection
Loading 1 new model
Requested to load SDXL
Loading 1 new model
100%|██████████| 60/60 [00:30<00:00, 1.98it/s]
take the steps and divide by the it/s and then look at the runtime
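To spell out that arithmetic: it/s is sampler iterations (steps) per second, not megabytes, so the sampling time is roughly steps divided by it/s. A small sketch using the three rates from the pasted console log; the function name `estimated_seconds` is just for illustration.

```python
def estimated_seconds(steps: int, its_per_second: float) -> float:
    """Approximate sampling time: number of steps divided by iterations per second."""
    return steps / its_per_second


# The three 60-step runs from the pasted console output.
for rate in (3.62, 2.16, 1.98):
    seconds = estimated_seconds(60, rate)
    print(f"60 steps at {rate} it/s is about {seconds:.1f} s")
```

These come out close to the 16 s, 27 s and 30 s elapsed times shown in the progress bars. The "Prompt executed in" total is larger because it also covers model loading and everything else outside the sampling loop.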
😮
I was super reluctant to come around, but having run it now for 3 weeks, yah it's the real deal
lol
good to hear ppl are trying it
it really is like a1111 minus the whole being a memory hog slug
what are some things about it that you like better than a111?
it's over twice as fast and uses far less vram
more stuff is ready to go out of the box
Does it support most of the extensions that A111 supports? I have maybe 3 or 4 extensions that I use a lot.
It generally supports them all.
yeah, i'm sure there's some it doesn't, and there's some bugs here and there that are different than the ones a1111 has
Some extensions have specific Forge versions.
but everything i've tried has been no prob
my advice for the last month or so to everyone has been...
use forge, comfy, or ideally, both.
Forge + SDXL = Lit Combo
literally the only thing i use a1111 for now is lycoris-ia3 as neither forge nor comfyui seem to support it (at least, that was the case a few weeks ago)
for one, I never get OOM anymore, even after switching models multiple times. it does seem faster, but that really wasnt what I was looking for. the built-in extensions are well thought out and at first I thought, why all this extra crap, but it starts up pretty fast, so it's not actually a burden and I appreciate most of them being there
yep those extensions are really useful ones you def want
i was happy to see them there out of the box
You guys are selling me. I'm going to try it.
you're going to laugh your ass off if youre anything like me
at how much faster it is than a1111
and again, as a primarily comfyui user (btw, i hate that shit ppl sometimes do on here... Team This, Team That)
forge is definitely faster. don't know why, but it is.
you're not gonna get the control and power user capabilities you'll get with comfyui, but you'll be able to do most things still just fine
i use comfyui mostly because i like to make really, really wild images
I haven't noticed it being THAT much faster. I ran some benchmarks and it had some mild differences. Maybe because I have 12GB of VRAM...
there's def a possibility that there's some shit going on with driver versions etc, idk
version of windows vs linux, etc
but on my 3080 12gb, forge was vastly faster than a1111
i couldn't run batch sizes greater than 3 on a1111 with sdxl without OOM
forge, i was running batches of 8 without it even blinking
anyone know how to do text pictures like SD4 locally?
and yeah, on my 4090, forge is about 2-3x faster depending on the sampler
batch size != speed
Need more controlnet stuff for sdxl
Last I recall there were still limitations.
what is forge?
yeah controlnet for sdxl isn't great
but there's other things now that are even better imo
I never really understood what this was
like, I would like to create images of the same character
character consistency or whatever
right now, afaik it is comfyui only, if there's a forge/a1111 equiv i'd def like to know so i don't misinform ppl... but ipadapter plus has new nodes (these are NOT new models) that can do stuff that's sdxl-only like style transfer and composition transfer
the style one in particular is spectacular
Yeah, that's mostly the only resolution that I use
it's funny, I'm fiercely loyal to a fault, so switching was hard, but after so much talk curiosity overcame me, I'm a convert
😮
i'll drop a piece of software in 2 seconds if i find something better
it's just a tool
Comfy drives me absolutely batty, so I only fire that up if absolutely necessary.
#TeamNothing
hahaha. A111 definitely feels slow......I'm excited.
a1111 is slow as fuck
Its faster than comfyui on my PC
the reason i tried comfyui in the first place was because a1111 performance was so god awful on my 3080 12gb
forge is?
Forge > a1111 > comfyui
what's your gpu and what it/s for dpmpp_2m_ancestral with karras as the scheduler, 1024x1024?
Im using a rtx 3060, 1024x1024, comfyui up to 22 seconds, a1111 up to 19 seconds, forge up to 14-15 and sometimes 17 in some models
oh, well that's really weird, i wonder if your comfyui install is wack
I'm mostly concerned about when stable diffusion 3 comes out. A111 is always slow to incorporate new SD releases.
my only concern with comfy has nothing to do with performance, its simply that I would spend more time tweaking workflows, which has nothing to do with why I'm actually in it in the first place
can you still expose the api to share with ollama?
Right....I'd rather run default settings that work 90% of the time than mess with tweaking workflows
this i hear a lot... here's my solution to that
i don't worry about that shit, i have lots of functions hotkeyed on my g502 gamer mouse
i just fire up a workflow, copy paste stuff from old ones and go
no organizing or anything unless i'm sharing it on request
haha, nice, but what would make that first image killer is if all those tiles made a stick figure or something, that would be meme-worthy
i find if i start carefully organizing anything it turns into the rabbit hole you described... hours wasted, nothing made
ha yeah
I just have the MX Master 3 and the Razer Viper Ultimate

ahh nice
here's the issues with comfyui as it stands imo
- documentation for most nodes either is shitty and incomplete, or flat out doesnt exist.
- a lot of times there's unnecessary redundancy (why can't there be a preview or save image node that has a vae and latent input?!)
- needs better node grouping support... in particular the ability to group stuff like a function, where it could be expanded and collapsed, would be hugely useful, or if you double clicked on one, it'd open a new window with the "subworkflow" in it for that grouped node
- chaos with node packs where people just throw in the kitchen sink with everything they've made that again, is often undocumented
great things about it: you can do just about anything, and if you stop worrying about making it look nice, you can work just as fast as in webui
I think it shines with automated workflows, so maybe when video workflows are less crappy
but by then someone would probably make a cool video workflow for forge
i find working in webui to be more tedious and repetitive because a lot of tasks i just group up some nodes for in comfyui i end up having to do manually
but yeah, again, i'm doing some weird shit
if you want to make images that make ppl think your acid use has expanded into the workweek, comfyui is the shit
So out of the box forge allows you to make video out of images? Or do I need an extension for that?
Why can't I find the discord creation channel, which is a channel where I can generate pictures online? Can any expert explain it?
The bots are down.
Hey hey hey, no wonder they are all deployed locally, I see they are all comparing the differences
prompt me in general images and you might get something you like
technically yes, but there's no consumer video tech out there right now worth anything, which isnt forge's fault
I bet some people use SD just to generate babe's
Anyone familiar with using Kohya to train your own models?
Thank you for your kindness, for me the creative process is the most wonderful
checkpoint training - dreambooth yes, full finetuning, never had the patience or resources
oh i mean if you type a prompt in there i might copy it verbatim into comfyui and see what comes out
I prefer to generate things, I would also like to generate characters from games like mario and sonic, etc etc, but I don't think SD is built for game characters like that, coz its more of a realistic model im guessing
oh you can def make those
😮
Agreed. I want to be able to replace a character in a movie with someone else using stable diffusion, or change the look of the movie using a specific model. That will be when video really gets crazy. Imagine The Hobbit as painted by Vincent van Gogh
im using an AMD machine, it looks like the Kohya installer is asking me questions about an NVIDIA card, im not too sure how to proceed
I'm amd as well, it can work, but there's a bit more to the install. linux is the best way to go with AMD
I have a 4090 but I am salivating at the power that a 5000 series card might have. I need more power!!!
you have the best consumer card on the market, chill
I built a PC from scratch, but didnt want to cheap on those parts to get a good card (or pay the nvidia tax), so I just got a decent GPU at the time and splurged a bit more on the board and other components knowing I'd eventually upgrade the GPU anyway
im being asked about installing fp16, bf16, fp8 or none, how do you think i should proceed?
bf16 seems like the safe choice, it's about the mixed precision
yeah bf16 is pretty good
realistically, there's probably little diff between fp16/bf16 for local training, but if you do any cloud training, it might matter. they discuss it here https://cloud.google.com/blog/products/ai-machine-learning/bfloat16-the-secret-to-high-performance-on-cloud-tpus
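For anyone weighing that choice, here is a rough sketch of the difference, using the standard bit layouts: fp16 has 5 exponent bits and 10 mantissa bits, bf16 has 8 and 7, and fp32 has 8 and 23. The helper names are made up for illustration. The point is that bf16 keeps fp32's range and gives up precision, while fp16 does the opposite.

```python
# (exponent bits, mantissa bits) for each format; one sign bit is implied.
FORMATS = {
    "fp16": (5, 10),
    "bf16": (8, 7),
    "fp32": (8, 23),
}


def max_finite(exp_bits: int, man_bits: int) -> float:
    """Largest finite value of an IEEE-style binary float with this layout."""
    max_exp = 2 ** (exp_bits - 1) - 1  # largest unbiased exponent
    return (2 - 2 ** -man_bits) * 2.0 ** max_exp


def epsilon(man_bits: int) -> float:
    """Gap between 1.0 and the next representable value."""
    return 2.0 ** -man_bits


for name, (exp_bits, man_bits) in FORMATS.items():
    print(f"{name}: max ~ {max_finite(exp_bits, man_bits):.4g}, "
          f"epsilon = {epsilon(man_bits):.3g}")
```

fp16 tops out at 65504, which is why it can overflow during training without loss scaling; bf16 reaches about the same range as fp32 but with coarser steps between values.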
is there a bf32?
not that I know of
I wonder if 32-bit precision will eventually become a thing
“On the full SWE-bench test set, SWE-agent resolves 12.29% of issues, achieving the state-of-the-art performance on the full test set.”

So it's a robot code janitor that can hypothetically go into the entire GitHub repository database and clean up 12% or so of all lingering issues?
Hello everyone
Does anyone know when SD 3.0 will be released to the public?
soon
lol I hope so
im told 6 weeks, but who knows
Ok thanks!
Haven't had the chance to look but the glass half empty in me immediately asks "so if 12% succeed, are the other 88% catastrophic failures that introduce issues?"
The other 88% resulted in a “I could not fix. Please understand.” message
Really though, very interesting, thanks for sharing
imagine reading this with 4 gb ram
couldn't imagine it
Hello, Everyone.
I want your help.
I am going to convert any dog image into sympsonized one using stable diffusion project.
what is sympsonized
looool
high soggy camel variable
the taco way shrimp cattle way
upper echelon tacorex-PYrex
kawasaki cheese rex
airbag jennie chewables
WHY WONT THE BOT GENERATE
SD3 where are you? 🥹
No bot here
@lavish lake @wintry stream Sorry for the ping, you both were involved in a support ticket I had open last September. I'm hoping you may be able to help push along my new ticket. I opened it Mar 21st, it was then claimed by someone on March 22nd, and I have not seen them online since then. If you were able to take a look I would really appreciate it!
1324
hello
hello
1
1
1
1
11 00 101
heelo
hello
what about RAM for sd xl? I have 16 GB of ddr3. is this enough if the graphics card is rtx3090? will there be any problems due to lack of RAM?
VRAM more important ... should work ...
I would try Stable Diffusion Forge ...
okay
Easy to start and uses less resources ... advanced might wanna check ComfyUI ...
https://forms.gle/9i4jM9BQu9bVVAAF6
As a young professional just a few years into the workforce, there is a constant, low-humming anxiety about proving yourself and finding that mythical work-life balance everyone talks about.
Sometimes you can't help but wonder - is this really what you signed up for? Or is there a better way to approach this whole "work" thing as a young professional?
We at 5day.io have the same questions as you.
Our goal for this survey is to understand your work habits and present you with a work management ecosystem so good that it brings your spark back.
Do you have a minute to simply tap on 7 answers?
It will not take you more than 1 minute 49 seconds. We checked.
how fast u guys generating a 1024x1024 XL 20 steps on A1111? Idk if im going slow or what
With a rtx 3060 in about 19 seconds
hmm ty ty, 3070ti taking like 35sec. trying loads of guidance ive found
🤞
will keep trying
It should be way faster, what settings are you using? I use dpm++2M karras and about 5 to 7 cfg scale
I hope you can fix it 🤞
Euler a, 7 cfg, dynavisionXL checkpoint.
still new to all this so hopefully figure it out
you can always try forge instead of a111 or lightning models to speed up a lot but yeh that sounds too long
check if your drivers and whatnot are up to date
also idk maybe you are using a vae and everyone else isnt or something
yeee so many variables to compare i suppose, everything upto date so prolly smthn dumb on my end
Are you using xformers or other optimizations in the webui.bat COMMANDLINE_ARGS? @austere pollen
I have medvram-sdxl and xformers
Nooooo 😭
dooomed
Oof it looks like some models use 8.6 gb of vram, maybe its offloading it to the ram, im not sure then bro, sorry :C
Maybe you gotta use medvram and remove xformers but im not sure, here are all the optimizations you can use
ty for trying bro, my ram and gpu are both at like 100% when i generate
ill keep trying, ty tho ❤️
🤞
you are probably using a1111 that one sucks for XL
forge is faster with XL
yeah i am, i tried invokeAI and then moved to A1111
does Forge have dynamic prompting n stuff? might have to try
forge is a1111 it just has the comfy optimizations,only major difference is that some extensions that work on a1111 dont work on forge
okkk, lemme look into it. thank you guys! i just use the dynamic prompt extension so hopefully that works
hello
Hi!
hello.
hi
Chinese person
Which don't work? I haven't tried enough to run into problems... I know there's a memory leak with Regional Prompter at high resolutions but that's it
Hi everyone i hope I'm not bothering you. I tried to install the program but couldn't. I really need an AI drawing. Could anyone help me out with that? 
hello
supermerger,agent scheduler
some of the stuff in supermerger works but not all of it
what program/platform are you trying to use?
I wrote A1111 and open Github and followed the rules, but there was always a problem (sorry for my english btw)
It's okay, I DMed as well
nihao
What is the functional difference between models and their XL counterpart? I see that image generation size is a thing, but from what I read it comes at a price in variety and ease of getting the image to match your prompt if you're generating anything other than people.
helo
Sorry I can't help you with this mate, gonna try pinging @sudden ruin and @bleak matrix
Guys, stable audio released their 2.0 model.
the actual model?
It's an audiosparx model so it's still web only.
But you can prompt for 3 minutes for free now.
Alsoo.....
Stable Radio is back.
it's okay, seems to be a little repetitive for the most part
I would just say to open a new ticket!
running forge now, 49sec vs 90 sec for 4 images! ty
so glad to hear you're getting such a huge improvement!
moving from a1111 to forge is as good as moving from a 3080 to a 4090
Hello friends, would anyone be able to provide aid to someone who's a little bit stupid
just ask
I'd like to make some means of converting an image into a set style, consistently
I tried to make my own model but the results are all over the place
Ill see if I can get more informations on this 
you can do that with ipadapter in comfyui now if you have enough vram to use SDXL
How would you explain this if you were talking to a 5 year old?
what gpu do you have
are you 5?
No just a bit dim
RTX 4070 Ti
I used DreamBooth and fed it a bunch of images
that's 12gb vram i think? def enough for sdxl
anyway, install ComfyUI and i can give you a workflow that'll do it for you https://github.com/comfyanonymous/ComfyUI
if its an artist style or a highly stylized/specific style what you want to do is a lora
give me a sample image of a style you'd like to imitate and a prompt, and i'll give you a demo of what it's capable of so you know if it's worth bothering with
Much appreciated, shall I DM you?
you can if you want, but others might find it handy, i'd just post it in #general-with-images
Alrighty, shall do
got another Q if anyone knows:
im making stuff with dynamic prompts and it keeps putting text from one prompt onto other prompts.
i.e. wearing an adidas tshirt also makes the hat say the words "adidas" even though the hat is defined to say 420
and its split up like {adidas tshirt}{hat with "420" text}
I hope everyone at SAI are doing ok. 
...something happen today?
Not in particular no.
maybe not today, but that forbes article a couple days ago sort of painted a gloomy picture
are there still no anime styles in stable diffusion like in midjourney? for example, if I want the Osamu Tezuka style, then I will only have to train my own?
There are plenty of them on civitai
So, is the bot ever coming back online?
is there like all in one?
What do you all use stable diffusion for? if i might ask?
porn
I dont know how to reply to this....
I like using it for creating character portraits for tabletop RPGs
Otherwise, it's just for the fun of art
thats pretty cool.
Also I have used it to create a logo before, but I don't need to create logos very often
My PFP on here is also made with Stable Diffusion
Im just trying to get ideas of things i can do with stable diffusion besides just for fun.
you must be good with prompts?
More like I'm good with generating a lot until I like the result. I rarely go in with a particular picture in mind
oh, thats a unique way to do it, i guess.
Pretty much the same as the whole "gardener vs architect" bit for writing (I fit on the gardener side best)
The subtle-est, but most important tip I can give for getting good results though is this:
Be intentional about your aspect ratio. You're not gonna get a good side view of a horse with a tall skinny aspect ratio
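One way to act on that tip is to pick the width and height from the aspect ratio you want instead of defaulting to a square. A rough sketch, assuming an SDXL-style budget of about 1024x1024 pixels and dimensions snapped to multiples of 64; both numbers are assumptions rather than requirements, and `dims_for_aspect` is a made-up helper.

```python
import math


def dims_for_aspect(ratio_w: int, ratio_h: int,
                    target_pixels: int = 1024 * 1024,
                    multiple: int = 64) -> tuple[int, int]:
    """Return (width, height) near target_pixels with the requested aspect ratio."""
    ideal_w = math.sqrt(target_pixels * ratio_w / ratio_h)
    ideal_h = ideal_w * ratio_h / ratio_w

    def snap(value: float) -> int:
        # Round to the nearest multiple, never going below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(ideal_w), snap(ideal_h)


# Wide frame for a side view of a horse, tall frame for a standing portrait.
print("3:2 landscape:", dims_for_aspect(3, 2))
print("2:3 portrait: ", dims_for_aspect(2, 3))
print("1:1 square:   ", dims_for_aspect(1, 1))
```

Snapping to a multiple changes the ratio slightly, so treat the result as a starting point.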
That and just use the finetunes, DreamShaper Lightning (SDXL) really lowers the bar for what makes a good prompt in my experience
you know ive been searching for a creative outlet and i think ai might be able to help, but at the same time im lost.
too bad i cant run sdxl due to hardware.
I get a huge amount of inspiration from community spaces, particularly https://remix.ai/ (disclaimer: I help build this), but occasionally civitai and leonardo as well
is stable audio going to be open source?
I kinda want to tell a story with the stable diffusion images like a movie or a comic but those are complex at this time.
https://twitter.com/maxescu - might be able to find some tips and whatnot from this guy. I don't follow him very closely, but I know he's specifically trying to make a full-length movie using AI (he does go cross-tool, but should at least be able to inspire a little)
There's also a lot of story that can be told with just a few stills, https://www.reddit.com/r/3FrameMovies/ comes to mind, but I'm sure there are better examples. You can always start small
3 frame movie looks interesting, ive never heard of it before.
That particular subreddit takes famous movies and tries to boil them down to three frames. But I could see making new stories that work the same way
Constraints breed creativity
-- I don't remember
ive also seen some use only images to create a movie, thats commonly done with ai versions of music videos.
Yeah, and AnimateDiff can get you pretty far if you have some source video you want to reskin or you're okay with your base images not being faithfully recreated as frames
(When doing more advanced stuff for myself, I use ComfyUI, and I haven't gotten deep into animation yet)
sparse control net works really great with animatediff v3 modules. but it's not supported on forge last i checked. just base a1111
how to is described in here https://github.com/continue-revolution/sd-webui-animatediff/blob/master/docs/features.md#controlnet-v2v
i use it with my 4080 16gb with fp8 enabled
now I can use SegMOE
Did segmoe ever get integrated into Comfy/a1111??
its smarter but not a massive massive boost
congrats, getting the 24gb vram is big
thank you
Hello, I don't see bot channels anymore... Will they come back?
Hi, does anybody know when stable diffusion 3 is coming out?
ty
nobody knows, unfortunately
anyone knows where can I try latest models for free?
im completely new to this. Is it anything like midjourney? I know we can install on the computer but is it pretty much the same concept where we prompt on discord?
not really, after installing it on the computer, you can install a UI graphical interface so you can access the model like comfyUI
or dreambooth
decrepit smooth chili rice
for free?
i cant wait for 3 because im tired of ai misspelling stuff lol
bad news, sd3 cant do perfect spelling always
it can do text though
aw what?
I think I just have 12 on my 4080
12 on the 4080?!
sorry but sd3 isnt a 35b model
its 8b
and it's really good for 8b
and t5 is very big so it's not clear if sd3 full will run on 8gb vram
it can
:3
(with offloading)
8B or 6B
really? T5 at what quantization
4bit
wow
clips and NotUNet?! at fp16
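Some back-of-the-envelope numbers for the figures being thrown around here: weight memory is roughly parameter count times bits per parameter divided by 8. The 8B size comes from the chat above; the ~4.7B parameter count for the T5-XXL text encoder is an assumption on my part, and this ignores activations, the CLIP encoders and the VAE.

```python
def weight_gib(params_billion: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 2 ** 30


components = [
    ("8B diffusion model at fp16", 8.0, 16),
    ("T5-XXL encoder (~4.7B, assumed) at fp16", 4.7, 16),
    ("T5-XXL encoder (~4.7B, assumed) at 4-bit", 4.7, 4),
]

for label, params, bits in components:
    print(f"{label}: ~{weight_gib(params, bits):.1f} GiB")
```

By this estimate the 8B model at fp16 is about 15 GiB of weights on its own, which is why offloading to system RAM matters for an 8 GB card, and why quantizing T5 to 4-bit saves several GiB.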
impressive, but it'll take more time to generate, or not?
yeah more
NotUNet is the best architecture 🔥
would my system be able to run it? Alienware laptop 13th Gen Intel(R) Core(TM) i9-13980HX
NVIDIA GeForce RTX 4090 Laptop GPU
64gb memory
wait really?
yea
I know the 46% win rate when it comes to prompt adherence sounded good, but I don't know
compared to what?
like could it do my prompt with 2 different fighters
t5 being there was a last second decision if i had to guess
wow
then what did they make the MultiModal part of MMDiT for exactly?
could we use other LLMs?
cause iirc they just simply replace the T5 weights with zeros to NOT load T5
Alienware laptop 13th Gen Intel(R) Core(TM) i9-13980HX
NVIDIA GeForce RTX 4090 Laptop GPU
64gb memory would my system be able to run it?
yep
(rough memory from the paper)
yep
the ram will help a lot for offload
how long is the setup process from start to finish?
you could shove in any random T5xxl model but the finetune'd one will help the most
ahh
can you point me to a site where it shows me step by step on how to get up and running?
thanks
#1080946152318443610 follow the comfyui guide
Also how does 1 step of SD3 MMDiT work? Doesn't it do like 6 forward whatevers per step
you guys sound like how I sound when I talk about crypto lmfao
seems like it
does that mean that it does 6 steps in one step or does it not matter to us
so having 32gb ram will offer some advantages over 16 because there is more room for offloading?
yeah i think so
SD3 Turbo is basically SDXL Lightning, which is just so epic 
what does this mean? Prompt outputs failed validation
CheckpointLoaderSimple:
- Value not in list: ckpt_name: 'v1-5-pruned-emaonly.ckpt' not in []
i guess i need to install a checkpoint?
oh nice i got it working!
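For anyone else hitting that validation error: `not in []` means the loader's list of available checkpoints is empty, so ComfyUI found no model files at all. A quick sketch for checking the folder yourself, assuming the default `ComfyUI/models/checkpoints` location relative to where you run it; `list_checkpoints` is a made-up helper, not part of ComfyUI.

```python
from pathlib import Path

# File extensions treated as checkpoints in this sketch.
CHECKPOINT_SUFFIXES = {".ckpt", ".safetensors"}


def list_checkpoints(checkpoint_dir: str) -> list[str]:
    """Return the sorted names of checkpoint files in a folder ([] if it is missing)."""
    folder = Path(checkpoint_dir)
    if not folder.is_dir():
        return []
    return sorted(
        path.name
        for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() in CHECKPOINT_SUFFIXES
    )


found = list_checkpoints("ComfyUI/models/checkpoints")
if found:
    print("Checkpoints available to the loader:", found)
else:
    print("No checkpoints found; download one into the checkpoints folder.")
```

After adding a file, refresh the UI so the CheckpointLoaderSimple dropdown picks it up.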
Hey guys im [name unreadable], im a certified and a walking W for work so im a pretty big fucking deal i'll tell you that, i used to be a part time W but recently i became full time so yeah i js wanted to introduce myself to y'all
ok so whats the best way to prompt. I want a rocket going to the moon and I want the text $Print written on the rocket. Is that possible?
hey guys, idk if this is the proper channel to ask but i hope so.
what's the difference between sdxl and sd 1.5, at first I thought that they were just checkpoints but then I was investigating and found that they have like completely different architectures i guess, that's why they have different LoRas and need to be implemented, etc. but i suppose that they are also checkpoints, like there is the architecture and then the model, and the model only works on that architecture, and then the checkpoints that take the sd 1.5 or sdxl as base model take the model, adds its own data and publish the new checkpoint that runs on the architecture of the base model, am I correct? Also, why is the sd 1.5 model used as a base and not the sd 2.1?
1.5 was an architecture, 2.1 was the next architecture, but not well adopted, SDXL is most recent architecture, and SD3 will be next
SDXL is the best
mostly none of them are compatible from a trained model POV
-for image quality, for time 1.5 is faster
If you want extra limbs
well finetuned models don't make those
yah, but every time I go back to 1.5 I'm horrified with all the deformations
but then again most finetuned models give you portraits
like it was sooo bad, that I forget sometimes
Some arms turn into legs, random legs come out of the belly lol
are you sure you are not talking about the base model lol
Gotta go XL mate
im amazed on how fast this shit is
Which GPU which resolution?
Nice.
Is a 4090 on the laptop the same power as a 4080 desktop ?
im not sure
I prefer 1024
give me a prompt and i'll do it at 1024
oh okey, and every architecture comes with a "checkpoint" that i guess are the weights? and then the other checkpoints like Juggernaut XL take that as the base model?
yeah i think that's about right
XL can change a life!
tip: after downloading custom models, place them in the checkpoints/ directory
less than 60% the cuda cores of the desktop 4090
juggernaut xl has the arch of XL, but its own weights
i got this laptop for free from work so I aint complaining lol . They gave me the option for the alienware desktop but then I wouldn't be able to game on it at home lool
SD3-XL when
yah at 1024 it does it in like 5 seconds
prob never
what do you mean by arch? i'm pretty new as you may have guessed
architecture, like how the weights are arranged
Arch Linux
800m, 2b, 6b, 8b
whats the best sampler?
biased choice: dpmpp_2s_ancestral
arc de triumph
yeah, this gives the best quality in general
reality is though that there is no best sampler
man you guys got me up and running in no time. Thank you!
there are some cases where dpmpp_2m is better, the sde samplers are better, the gpu versions are better, res_momentumized is better (and worse!) along with a litany of others
i thought it was like this 5 hr long process getting it all set up
the easier thing to do is identify which ones are BAD
i can say that ddpm is the best sampler and i'd be right (if i were talking about speed:img quality)
dpm_fast
but... that doesn't mean it's the best sampler
Forge is best
agree with the comment there's no best, some of them take fewer steps, and there's ancestral vs non ancestral, which I seem to recall the non-ancestral ones are better if you're trying to reproduce an image
oh, you mean comfyui? yeah
that means it may be best via a somewhat subjective metric (image quality) with regard to only runtime
yep
No, I mean Forge
ancestral injects a bit of noise with each step i think
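A rough sketch of what "injects a bit of noise with each step" can look like, modeled on the ancestral step split used in k-diffusion-style samplers. The function name, the `eta` parameter, and the sigma values in the usage note are illustrative assumptions; real samplers do more than this.

```python
import math


def ancestral_step(sigma_from, sigma_to, eta=1.0):
    """Split a step from noise level sigma_from down to sigma_to.

    Returns (sigma_down, sigma_up): the sampler first denoises down to
    sigma_down, then adds fresh random noise with standard deviation
    sigma_up, so the result sits at sigma_to overall.
    """
    if eta == 0.0:
        # No fresh noise: this is the plain, non-ancestral behavior.
        return sigma_to, 0.0
    sigma_up = min(
        sigma_to,
        eta * math.sqrt(sigma_to**2 * (sigma_from**2 - sigma_to**2) / sigma_from**2),
    )
    # Variances add, so sigma_down^2 + sigma_up^2 == sigma_to^2.
    sigma_down = math.sqrt(sigma_to**2 - sigma_up**2)
    return sigma_down, sigma_up
```

For example, `ancestral_step(10.0, 5.0)` gives a `sigma_down` of roughly 2.5 and a nonzero `sigma_up`, while `eta=0.0` returns `(5.0, 0.0)` with no noise added back.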
so comfyui with a gradio frontend, yeah its good
i prefer comfy
the best sampler is the one that does the job you have at hand
give me a prompt i can test
Forge is a lot simpler I like that
want a really crazy image with a lot of wild reimagining with each step? res_momentumized is king
for you probably
want stability with an image when upscaling? dpmpp_3m_sde_gpu with exponential, after unsampling with dpmpp_2m
lol
the weights are the files that hold all the information about the images they were trained on right?
yep
the easier thing to understand when you're starting out... is which schedulers you should be using
yah karras is making up some mangled images lol
interesting
rule of thumb there: karras should be your default, exponential if you want more stability in the image with each denoising step (so upscaling...), sgm_uniform is kinda in the middle in terms of behavior and has some special uses too
probably:
SD3-S
SD3
SD3-M
SD3-L
ah
the others are handy too but if you just stick with karras and exponential in the beginning you'll rarely have problems that trace back to the scheduler
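To make the karras vs exponential difference a bit more concrete: a scheduler just decides which noise levels (sigmas) the sampler visits. A minimal sketch of the two spacings in plain Python, assuming the commonly published Karras formula with `rho=7` and a log-uniform spacing for exponential. The function names and the `sigma_min`/`sigma_max` values in the usage note are illustrative assumptions, not values read out of any particular UI.

```python
import math


def karras_sigmas(n, sigma_min, sigma_max, rho=7.0):
    """Noise levels spaced evenly in sigma**(1/rho), from sigma_max down to sigma_min."""
    min_inv = sigma_min ** (1.0 / rho)
    max_inv = sigma_max ** (1.0 / rho)
    return [
        (max_inv + i / (n - 1) * (min_inv - max_inv)) ** rho
        for i in range(n)
    ]


def exponential_sigmas(n, sigma_min, sigma_max):
    """Noise levels spaced evenly in log(sigma), from sigma_max down to sigma_min."""
    log_min = math.log(sigma_min)
    log_max = math.log(sigma_max)
    return [
        math.exp(log_max + i / (n - 1) * (log_min - log_max))
        for i in range(n)
    ]
```

Printing `karras_sigmas(10, 0.03, 14.6)` next to `exponential_sigmas(10, 0.03, 14.6)` shows both start and end at the same noise levels but take different-sized steps in between. Both functions need `n >= 2`.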
settings?
the other thing is... keep that step count at 25 or higher even if your computer sucks ass
my recommended default is 35
uhhh
unless you are say using lightning and 25 will fry it really bad
scheduler is karras, cfg 8, steps 20, control randomize, denoise 1
20-25 is good enough
Some ppl do 50 steps
i run into scenarios all the time where it's not
I usually just check what settings the cool looking images on civitai used with a given model and use those
^^good advice too
this is awesome! im tired of paying midjourney
what's your sampler?
Imagine paying
sampler is dpmpp 2s ancestral
midjourney sucks compared to SD if you have 12+gb vram and the time to learn how to use comfyui
yeah try bumping your steps up to 35, drop your cfg to 6
see if that cleans things up
idk, out of the box midjourney looks way more amazing at way less effort if you just want images, tho I dont use it
I wonder what 100 steps would do
i've tried that, i've tried thousands
usually 100 steps will make it overprocessed
someone give me a prompt, my prompts suck lol
my advice is cap it at 60, with the SDE samplers and the ancestral ones you can get really good results with the higher step counts
res_momentumized is the exception... especially with cascade, 100+ steps can do some amazing shit
much more intricate compositions
when is this coming back, i swear its been down for ages
Imagine making an image 2048x2048
what's your checkpoint?
pruned emaonly
yup
you have a 4090 mobile right?
right
first thing is that's the base SD1.5 checkpoint and base SD1.5 is pretty damn bad
looks good, I think you can decrease cfg and increase steps based on me eyeballing it
second is, with a 4090 mobile you can absolutely run SDXL no problem
and i recommend doing so
how do i do that?
the tiling is probably because you generated at 1024x1024
Imagine having a 4090 in ya phone
download an sdxl checkpoint
it'll be in no time. tech is moving so damn fast
are those links for me?
a proper 4090 is the size of like 20 phones, and uses the energy of a phone per 15m or something when running at max
what should i set them to?
just play around in those directions
first order of business is get a SDXL checkpoint
don't bother doing anything till you download that link
are you on a1111 or forge or...?
though if you are changing checkpoints it doesn't really matter how it was set on your 1.5 one, since the sweet spot for settings differs a lot between checkpoints
the juggernaut one?
juggernaut is the most popular sdxl one though
damn these files are huge lolo
yup, it is worth it though
gonna run out of ssd space talking to yall lol
you def want at least one sdxl checkpoint
yeh you need a lot of space, especially with controlnets and upscalers and shit
how about steps 50 cfg 4?
try it
good to experiment as much as you can, but regarding cfg...
and change the batch so you generate like 2/4 at once to choose the best one
best default is around 5.5-6.5 for most checkpoints
below that can give you a nice grainy photographic look, but tends to lose detail, and get washed out, faded
higher than that can be nice for illustrative styles and means it'll follow the prompt slightly better
but can quickly start looking blown out - insane contrast, hypersaturated, just bad looking
i rarely go outside 5-8
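Why pushing cfg too high blows things out: classifier-free guidance extrapolates from the unconditional prediction toward the prompt-conditioned one, scaled by cfg. A toy sketch on plain Python lists (real samplers apply this to latent tensors, and `apply_cfg` is a hypothetical helper name used only here):

```python
def apply_cfg(uncond, cond, cfg_scale):
    """Combine unconditional and prompt-conditioned predictions.

    cfg_scale = 1 returns the conditioned prediction unchanged; larger
    values push further in the direction (cond - uncond), which amplifies
    whatever the prompt changes about the prediction.
    """
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]
```

With `uncond=[1.0]` and `cond=[1.5]`, a cfg of 6 lands at 4.0 and a cfg of 12 at 7.0, far outside the range of either prediction, which is one way to think about the oversaturated look at high values.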
ok i dropped the check point in the folder but its not showing up
if i do, there's a reason
which folder?
you have to hit the refresh button usually
click the refresh button next to the models
if you're in webui it's a lil blue arrow
and when you load the new model change the resolution to 1024/1024
got it
k now i'll give you something to test
dpmpp_2s_ancestral, karras, 35 steps, 1024x1024 resolution, CFG = 6... negative prompt = "low quality, low detail, bokeh, blurry, text, watermark, signature"... positive prompt = "a freak cannibal clown shark ninja with a pickle sword leaping sideways with a metallic giraffe out of an airplane inside a scuba pool in a rooftop barn in a cyberpunk city"
i'm sure it'll be interesting
I think my 4080 does that faster
hugely better than what you saw a min ago!
really?
are you using forge or automatic 1111
cuz if you're using automatic 1111, use forge, it's much faster
what about you kill krill?
im using comfy
k cool that's good
not sure where the giraffe came from lol
ever need any workflows, i always leave mine embedded in my images... lots of ppl here do
the prompt
let me rerun it actually time it
Do you feel comfy?
lol
you prolly had a delay from loading the model
comfy waits until you run the workflow the first time to load it
I wonder how a M3 Pro MacBook Pro would perform

damn same 18 sec
that's not too far off from where it usually is with that sampler
dpmpp_2s_ancestral is about half as fast as a number of the others
dpmpp_2m_sde is a lot faster and also pretty damn good
same with dpmpp_3m_sde and dpmpp_3m_sde_gpu
but def recommend sticking with the slightly higher stepcounts not 20 cuz otherwise you end up wasting time rerunning shit when it looks weird or crappy
i have a m2 mac book at home i wonder if it would be this or not
another 3-4 seconds per gen sure isn't gonna kill ya
im sure the I9 is better than m2
yeah you're def better off with the 4090 mobile
I have noticed 512 images are extremely fast to do, but they are also more blurry and lower quality, which makes sense
i have Gigapixel AI which is awesome
are you referring to 512x512 with sdxl
ye
yeah that's the wrong resolution for sdxl
😮
you wanna stick with 1024x1024 and other resolutions with similar total pixels
thanks again guys, you are awesome!
1024 x 1024
1152 x 896
896 x 1152
1216 x 832
832 x 1216
1344 x 768
768 x 1344
1536 x 640
640 x 1536
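A quick way to see what "similar total pixels" means for the list above: every size stays close to the 1024 x 1024 pixel budget and uses dimensions that are multiples of 64. A small sketch (the `pixel_ratio` helper is made up for this example):

```python
# The SDXL sizes listed above, as (width, height).
SDXL_RESOLUTIONS = [
    (1024, 1024),
    (1152, 896), (896, 1152),
    (1216, 832), (832, 1216),
    (1344, 768), (768, 1344),
    (1536, 640), (640, 1536),
]

TARGET_PIXELS = 1024 * 1024


def pixel_ratio(width, height):
    """Total pixels relative to a 1024 x 1024 image."""
    return (width * height) / TARGET_PIXELS


for w, h in SDXL_RESOLUTIONS:
    print(f"{w} x {h}: {w * h} px, {pixel_ratio(w, h):.2%} of 1024x1024")
```

By the same measure, 512 x 512 is only a quarter of that pixel budget, which fits the blurry results described earlier.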
do any models work well with 2048? I've noticed it tends to just repeat a person. I'm told you need to turn tiling off, but I never see a tiling option lol
probably controlnet tile
😮

nope, don't bother trying
you gotta upscale if you want that
natively generating at that resolution is pretty hopeless atm unless you're using cascade
sometimes faces come out perfect sometimes not lol
ัะพ
Uno
I did taylor swift came out perfect, but kylie jenner and it was botched
Gotta do Batman

dropped in chat with img
😮
how do I generate the same thing? the prompt gives out different options every time, is it possible to regenerate the option you like and add details? change the character and so on
If any teams are looking to locally train/develop their own AI text-to-image or text chat models, but need help securing the right on-premise hardware, we have liquid cooled RTX 4090 24GB, RTX 6000 Ada 48GB, Nvidia L40S 48GB, as well as the just recently released, proprietary Phison AI100 SSDs that are compatible with aiDAPTIV+ middleware for 70 billion+ parameter, large language models all in stock. DM for more info, happy developing
I'd recommend AWS Sage tbh

WEED AND CIGARETTES THE TWO BEST THINGS IN LIFE. PERIOD.
weed is nice
does that just put the fans on high
been trying it since it came out
i wanted that upgraded Sony ToF sensor.
SageMaker's Nvidia V100 Clusters are quite capable for those with solid internet connectivity, however the TCO over the next 3 years doesn't scale well compared to the sheer volume of add'l AI users that will be active by 2027. Per SageMaker, "The compute cost for Amazon SageMaker is $2,549 in year 1, $3,059 in year 2, and $3,569 in year 3, totaling $9,177 over a three-year period."
A lot of us in this group will still be using AI models in 2027, so why pay $9k+ when we can just build our own machines for less? Training takes a bit longer, of course, but a lot of users back in the day with 56k dial-up connections left their machines on overnight to download 480p movies. This same multi-tasking methodology is still feasible today for training large models, especially for those living in disadvantaged cities with poor high-speed internet availability. Starlink isn't really that cheap for AI devs that want to live out in nature, but still work remotely.
Which channel do we use the dream command in again?
it won't let me use the command in there 😦
try typing a prompt at me in there as plain text
I am using forge, but I cant seem to get openpose to recognize models. Any ideas what folder models are supposed to go in for forge's inbuilt openpose?
I guess so lmfao
Actually it OC the cpu and video card
SD3 is never coming out :,(((((
Y because ceo left?
😮
i have a g15 and whenever i turn on performance mode in alienware cmd center it literally just ups the fan speed
Emad quit and took SD3 home with him.
Idk, I don't keep track of rich ppl
hello
hi
how to use
Stability is probably sitting on SD3 hoping investors will come around, but our current global economy would serve SD3 up as a poison apple.
Hello is Stable Audio 2 available via API?
Not yet, just the site widget
hello
good
hi
Nihao1
spaghet
ball (singular)
hello
gesundheit
hello
hello
hello
yellow
thing
hi
123
tunak
Kamusta
I store my stuff on a nas, all the SD stuff is on SSD, but the generated images go to a cheap and deep NAS
I feel like these SD models tend toward nudes, I have to put it in the negative prompt sometimes to prevent it lol
hello
hello
hi
hi
you see see you
1
hi
nice
nihao
้ๅผยท
Could anyone help? Using Forge XL webui with dynamic prompts.
My prompts are leaking into each other, i.e. text designated solely for a tshirt also shows on the hat - I have it separated with { } so not sure if it's just something that always happens.
Pretty sure a few of the stable diffusion breakthroughs came from them.
that's just normal, brackets don't help with that
try the "cutoff" extension, it helps with bleeding
will have a look, thank you
ไธๅๅฅฝ
.
1
2
3
Hey everyone, I need help with something. I want to generate photographic images of people that look as realistic as possible. I used realistic vision v6 for this and it worked fine for lower resolutions. But for higher resolutions like 2k it fails miserably and takes a lot of time to generate on my local system and even on a Google Colab GPU. Any suggestions to speed this up and create realistic looking 2k images?
generate at half that, latent upscale and use fewer steps
Reduced steps to 25, still no good
use fewer for hiresfix
And the image generated, even after taking a lot of time, is distorted
do you have hiresfix on
I don't think so
I like started 2 days ago, reading about stable diffusion and trying to implement the APIs
Daz AI finally, you know how long I've been trying to get Daz and SD to team up on some level lol.
Upscale it as an image then convert back to latent and use tiled ksampler at 0.4-0.5 denoise, 35 steps dpmpp ancestral Karras or exponential
Good morning, everyone! How are we all today?
Very nice
I am going to make a sick abandoned building lora, I'm making the images for the dataset rn
How about you
why SD3 is delayed?
why it wouldnt?
It's not delayed?
Thanks
yah there has to be a communicated release date for it to be delayed
I can't wait for any update at all about Stable Diffusion 3, maybe just a few new invites so we know it's progressing
I've been waiting for research access for a month... I've just accepted that the company is tanking and it's not gonna happen anytime soon
I have a question, I'm trying to make clothing Loras on SDXL, but whenever I do and apply the Lora, the image quality of the generated images gets severely degraded, like there's a filter over the image and it doesn't look sharp anymore.
What am I doing wrong in the training for it to look like this?
Even at lower values @hot wraith ?
So at lower values that goes away, definitely, but then the problem is the item of clothing just becomes something general. I want it to reproduce that exact piece of clothing. Like say i'm doing a jacket I made in real life, I want it to learn that jacket exactly 😦
you overcooked the lora thats why
I guess that's it, yeah. How do I avoid that? I followed some youtube guide for clothing, and i did 15 epochs at 20 repeats, had 31 training images.
20 * 31 * 15 - that's like 9300 steps
and with only 31 images each image was repeated 300 times
what I typically do is take 500 / (num images) and round down to get a decent starting number for base repeats, save every 1 epoch, and maybe do 8 epochs, test each one until you are satisfied
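The arithmetic above, spelled out as a small sketch. It assumes a batch size of 1 so that one image exposure equals one step (an assumption, the chat does not say), and the helper names are made up for this example.

```python
def total_steps(num_images, repeats, epochs):
    """Training steps at batch size 1: every image is seen `repeats` times per epoch."""
    return num_images * repeats * epochs


def exposures_per_image(repeats, epochs):
    """How many times each individual image is shown over the whole run."""
    return repeats * epochs


def suggested_repeats(num_images, target=500):
    """Rule of thumb from the chat: 500 / (num images), rounded down (at least 1)."""
    return max(1, target // num_images)


# The overcooked run: 31 images, 20 repeats, 15 epochs.
print(total_steps(31, 20, 15), exposures_per_image(20, 15))

# The suggested starting point: 500 // 31 repeats, 8 epochs, saving every epoch.
repeats = suggested_repeats(31)
print(repeats, total_steps(31, repeats, 8))
```

For 31 images the rule gives 16 repeats, so 8 epochs comes to 3968 steps instead of 9300, with a saved checkpoint to test after each epoch.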
That's exactly what i did yes, looking at it now it seems insane. I'm going to try your 500 approach and see where that gets me, thank you. Are there any settings you recommend for clothing?
Hello, I haven't used this bot for months and I see that it changes a lot, how do I create images now?
that's not something I've typically attempted, so not going to be much help there. You're just doing 1 outfit per lora right? if you are doing multiple it could confuse the training
One outfit per lora yes, definitely
just dont describe the outfit would be my approach, describe everything else, it's basically a subject training, but not a character
The bots are turned off, but you can generate images with stable diffusion on for example https://leonardo.ai/
Interesting, so you describe everything but the item/subject? I've heard various instructions. Some even saying to skip descriptions entirely. I've honestly only created one Lora & did pretty thorough descriptions of everything and it turned out ok. I think I may have just been lucky though
we got a number 1 victory royale
"a man standing outdoors wearing a xyzpdq coat.", coat is a class that the model will understand, so recommended to use it, xyzpdq is a custom token. Skip any other description unless you want variations, like different colors for example
hey guys, quick question - If we put in a second graphics card will it help to generate quicker? Does it work with SLI?
currently I have 24GB nvidia 4090
they've lost developers and investors. in Big Picture language, that means they're approaching SD3 as their poodle in a window. But... pets from malls is a strange thing.
What cfg are you using? Sdxl loras vs cfg strength are fickle I've found, especially if you overtrain. Try setting your cfg to like 2-4 and see if it's still bad
Also, you describe everything you want to be able to change and the thing you want to "keep" is usually associated with a unique token
sophistication of context will be what divides the public from the private sectors. I think. /talkingoutloud
So it might be something like " a 36 year old woman with brown hair wearing a yfeh7sd, denim jeans, a blue hat in an empty stadium. She's posing with her arms in the air"
If the shirt is what you're trying to train into the lora
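The captioning idea above, as a tiny sketch: describe what you want to stay changeable, and refer to the trained item only by its unique token plus a class word. `build_caption` and its parameters are hypothetical, and the tokens are the ones used as examples in the chat.

```python
def build_caption(subject, token, class_word, extras=()):
    """Build a training caption that names the item only by token + class.

    `subject` and `extras` describe things that should remain promptable
    (person, setting, pose); the item's own details are left out on purpose
    so they get absorbed into `token`.
    """
    parts = [f"{subject} wearing a {token} {class_word}"]
    parts.extend(extras)
    return ", ".join(parts)


print(build_caption("a man standing outdoors", "xyzpdq", "coat"))
print(build_caption(
    "a 36 year old woman with brown hair",
    "yfeh7sd",
    "shirt",
    extras=("denim jeans", "a blue hat", "in an empty stadium"),
))
```

Whether to include the class word next to the token is a judgment call. The chat shows both styles.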
it isn't going to be released. move on
also, what did you do with stable cascade? it is extremely powerful
they really had just 1 shot at capturing enough mindshare for interesting stuff to happen. it happened with stable diffusion 1.1, for the kind of idiosyncratic reason of generating porn. stable cascade is phenomenally capable. yet the manic individuals building stuff in the stable diffusion ecosystem haven't done anything with it
it is too bad really.
deepfloyd/if foreshadowed all of this
Yeah
Part of the problem is the announcement was followed almost immediately by the sd3 one
In retrospect they should've waited to announce sd3 for a bit
Obviously they weren't as ready as they thought
taking off the shelf research models and training them on an excellent dataset is actually a much better idea than trying to create a really good research model
heart of the matter is data ownership rn
because SD3, when the "full weights" version of the model is hosted
and the lightweight model, that is of limited use, is released
it will be obsolete in a year. but their dataset will not be
True that
and with upscaling becoming more contextually powerful, the industry has already underestimated how useful lower resolution models have managed to remain
i think their biggest obstacle is convincing investors who have very simple narratives in their head, like
- compute is expensive
- be #1 on someone's charts
it is easy to raise money to spend on compute, not because it makes sense (it doesn't), but because investors buy it
and to all of those points, they are all extremely contentious and difficult ones
at the moment
compute itself is in danger of becoming monopolized like every other commodity
well i think they tried to convince people that compute isn't actually expensive (it's not) and that there are unlimited kinds of charts to be #1 on (there are)
maybe
compute isn't expensive, but neither is ketamine, neither are tomatoes.
lol
it's the overhead
that's what i worry about
the greed
one day, we will be mining helium in a sustainable way, and we may yet have zero point energy to actually offer all of this stuff as a basic free service for all humans
etc.
okay well, well capitalized enterprises, they never pay someone else's monopoly pricing for a long time. anthropic is a great example of this: they have a $XX billion convertible investment from amazon, let's say $10b, which costs amazon only $100m to provide, but $10b in "investment from amazon" lets them raise whatever, $100m-750m in hard cash from investors.
because "compute is expensive"
it's investments for sure
stability is going out there telling a much more complicated story
ape no like complex
but that's what the system is instinctively trying to make for itself
but i'm going to "just" "use the best thing."
i mean there are people who watch this chat who work at stability
it's a simple yes or no question: do you want to make another ideogram?
they have all the evidence in the world right in front of them that the dataset is more valuable for the role of a private enterprise in an ecosystem where researchers give out brilliant ideas for free
but it still sounds so sexy, to be an "R&D" guy, and sexier than going to do research at the universities they are too non-traditional to be admitted to
personally i think they should directly monetize the dataset for enterprises. accept the most straightforward billing possible: a sort of DLC where each new part of the dataset is a small additional charge. they should be doing the kind of enterprise work that i am doing lol
this thing where they want to stick to a 0th or 1st level expanding brain meme playbook. it's over. they lost their chance for that. and anyway, i want them to thrive, and they will not thrive competing directly with ideogram and midjourney. they will only thrive coexisting with them. i'm sure emad said all the same exact things.
models are amalgamations. we build things, products. AI has a medial nature that contradicts our own intellectual metabolism, reinforced by economic principles
however, data is contextual, and the weight of context is prone to shifts. rapid changes in value.
not to play the devil's advocate
@shell tendon scale.ai already does this, and has been very successful, for a very long time
hmm, it's more likely that stuff that is literally old and has meaningless context and little business value today, like internet archive snapshots, is the most valuable data now, because it isn't polluted by generative AI spam
generative AI is the best thing that ever happened to dead authors
it just goes to show that the real problem is that people are really fucking stupid
i don't mean you lol
but let's say that a bunch of that context happens to be built from human data, from living humans who unknowingly contributed to it, et.
etc*
but that there is a big, popular discourse about many of the issues stability faces, and it gets things 200% wrong
it's like... a royalty at that point. isn't it
okay but if it weren't for the goodness of a few people's hearts, there wouldn't be an Internet Archive. they had very humanistic ideas for what they want to do, and until generative AI, its greatest "real" value to ordinary people was as a paywall bypass for very, very recent news articles. kind of the opposite of what they want to be involved in. presumably if the people who put up random geocities crap thought it was actually valuable, they would store their twee little writing somewhere, but they don't. that suddenly your twee little writing about twee shit has ancillary value for training an AI: nobody could have anticipated that. but it fulfills all of internet archive's humanistic goals, that because they preserved this twee geocities internet culture stuff, it can "provide universal access to all knowledge" via chatgpt, that is used by millions of people every day
nobody was reading your geocities crap ever. it is only interesting in some big aggregate with everyone else's geocities crap
Hm, but if large data represents structures of a whole new magnitude, then that data represents building blocks to pyramid-like vectors that enable the already powerful to be more powerful, etc.
i don't know
nobody is misusing geocities crap to get more power
That's beside the point
But I know what you mean
When it comes to AI models, for every hypothesis you find in your output you are eliminating xxxxx amounts of null hypotheses in the form of those geocities rants, etc.
it has similar energy to saying that mark zuckerberg will personally will himself to be president or whatever the fuck, because he controls all the most powerful communications platforms in the world right now. and yet. you talk to a real political analyst, and the most likeable thing about him, the real reason he could get elected, is because of his lovely family.
The null data is perspective
Ty for chatting btw, I gotta go to work
lol me too
