#💬|general-chat
Im sure anime stuff will be great and will get the most finetunes, it's like the style you should be least worried about
I will say anime is not even a challenging part in generation stuff.
I mean SD3 tho
I need to make friends in this community, so I can bug them about issues I run into, that have probably already been solved by others.
°^°
use the search function in discord
:O You expect me to know how to do that?
btw for A1111 and Forge there's a Civitai extension I really like. It downloads models, preview pictures, model information ... "Civitai Helper"
Oh? I have been just manually going to civitai
Can you dm that to me so when I sleeb, I can see it still.
You search for it in the Extensions tab of your Forge or A1111
Ahh, like I had to do with controlnet
I still couldn't get the 3d model tab to appear.
Which irked me so much.
ControlNet is already integrated in Forge ... but yes, it's the same way you did it in A1111
Oh, you've got me so excited for forge now.
Am I gonna have to transfer all my saved prompts over?
And ... you can use your downloaded models in every UI, no need to download them again. Youtube will help with that ...
https://www.youtube.com/watch?v=q5MgWzZdq9s&t=20s gave me some nice information
(Maybe not for that problem)
You'll have to transfer everything if you plan to remove A1111, but it's worth it
Forge is king right now
At least compared to A1111
I dont see the need for A1111 when you have forge but you do you 
I don't have unlimited space, so if I don't need them both I'll just switch over.
As long as I'm not missing out on anything, then I'll be fine.
Sometimes an update breaks a system, so I keep both 🙂
Forge ftw
Hearing that scares me.
It's just that other parts need updates, too. So you have 1-3 days to play around with your other A.I. system 🙂
I'll keep both 😭
I am terrible with writing prompts, so I need as much practice as possible. I wish I had a cheat sheet of useful prompts, and what order to place things for the best results, but I know it differs from model to model.
It's like every other language. The more you practise, the more you learn. If you wanna try out things, it might be an idea to try with a fixed seed to see the difference.
Fix seed ...
And by trying out things I mean small changes in one prompt ... I'm on the run to work, sorry 🙂
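A rough sketch of the fixed-seed idea above: keep the seed constant and change one small thing in the prompt per run, so any difference in the image comes from the prompt and not from new random noise. The helper names (`initial_noise`, `comparison_jobs`) and the example prompts are made up for illustration. `random.Random` only stands in for the sampler's noise generator here, and the job dicts would still need to be handed to whatever UI or pipeline you use.

```python
import random


def initial_noise(seed, n=4):
    """Stand-in for the sampler's starting noise: same seed, same numbers."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def comparison_jobs(base_prompt, variants, seed):
    """One job per small prompt change, all sharing the same seed."""
    jobs = [{"prompt": base_prompt, "seed": seed}]
    for extra in variants:
        jobs.append({"prompt": f"{base_prompt}, {extra}", "seed": seed})
    return jobs


# Hypothetical prompts, only to show the shape of the comparison.
jobs = comparison_jobs(
    "portrait of a knight", ["golden hour", "oil painting"], seed=1234
)
for job in jobs:
    print(job)
```

Running every job with the same seed and comparing the results side by side shows what each added phrase actually changes.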
Have fun at work!
how far is the run?
Marathon ... 🙂
anyone tried ella?
there were examples from yday in #🏞|general-with-images
I set it up but forgot I dont even have 1.5 models anymore when I tried to run it, so left it for today, will probably test in a bit
tho having to use 1.5 in 2024 kind of yuck
No, don’t even know her
ella somehow cant do anime for some reason, even tho they have anime examples in their paper
:D
What kind of "anime" are you trying to do?
some extensions just dont work on forge, so having both is better
No I mean what are you trying to make it do with the anime
i dont get ur question im sorry, just normal prompting, its the quality that is ruined
I hear some people who have the Pro Max iPhone say that they don’t need an iPad coz the phone is big enough.

Did you follow their directions exactly? Also, how exactly are you using it like are you using their GitHub stuff or some kind of implementation within another app like comfyui?
im using it thru comfyui
and someone else on reddit is getting the same ruined quality as i am getting
Sup guys! Maybe some of you know a good guide on how to use an EMBEDDING with an SD 1.5 model in code using diffusers?
What exact model are you using it with?
This is done with Ella #🏞|general-with-images message
The quality is indeed not as good as the images generated without
ive tried several, i even tried the exact one they use in their research paper, which is Counterfeit v3
here are the examples another user posted
im not sure why ella wrecks the generation on anime
but its unusable
what GPU are you using? it might be autocasting to some other precision level or something
im using a 4060 ti 16gb
feel free to try it out urself, if u have the time, id love to get it to work
I’m just using a ye old 4080
looking at their github code, looks like it has a default cfg of 10, uses stable-diffusion-v1-5 and DPMSolverMultistepScheduler for scheduling
i tried 10 as well, no difference
some checkpoints were way worse than others as well
give me a few, i'll test it out some. i don't really mess with 1.5 models anymore, so i hadn't bothered to install it yet
Good morning, everyone! How are we all this morning?
I like XL
yeah i pretty much exclusively use sdxl based models or cascade
Base model keeps having deformed legs, arms, body, and multiple body parts everywhere
1.5 can be superior in some cases it seems, especially when it comes to anime
and even animating it seems
there are a few decent overly trained 1.5 models that can put out decent stuff, but they are so damn biased toward pretty waifus its not even funny
but its so much i have not yet had time to try out
Problem with XL are the not so good controlNets
the control nets work plenty fine and i use the hell out of them. you just need to learn how to make proper depth maps or canny maps
but yeah so far, this ella is god awful for anime stuff lol
Nah they degrade the image quality a lot
i'll have to check the author's wrapper, he might have bad code in it or something
Especially openpose
I don’t use control nets, what do they do?
they do not, you're just using them wrong. the strengths are misleading because it's actually based on steps under the hood. i don't really use openpose when it's easy enough to make a quality depth map with zbrush/blender/ue5/etc
a good depth map>>>>>>>>>>>openpose
when making videos u need to use openpose whos gonna waste time making a depth map for a video with hundreds of frames lol
plus controlnet is more than depth and canny, theres shuffle, softedge, instructpix2pix, tile etc
most of these are not on XL and the cn tile sucks
you can render out sequences with 3d packages, it's fast. stop using tiktok dance videos lol
they are not bad they are just incomplete
the main ones that the vast majority of people use work perfectly fine... depth and canny.
the vast majority of people aren't making videos
yea they are fine
the 1.5 are better but 1.5 models are less precise so nothing is perfect
and another issue people have with controlnets and sdxl is that they carry over a bunch of bad 1.5 habits like having a million things in the negative prompt
or poorly wording the prompt
it's not a lie, they are not even the same as 1.5's (and youd hope they'd be better) but worse
they are still usable but they are a step backwards when you expect a step forwards
Thank you. I always say that and also always get the "nah, you're just doing it wrong" response...
Can someone recommend me a model best for generating Pokémon that fits in 8GB of VRAM?
Nintendo wants to know your location (also probably Cascade).
I’m not using a Web UI as I’m building my own Pokémon generator thingy
shhh
It’s gonna be totally unique too lol
‘Totally unique’, to an extent lol
My current model gives me blurry and obviously AI generated stuff (runwayml/stable-diffusion-v1-5)
Good afternoon! i would like to make an ai female character that seems life real. I also want to get into pornography here. I'm still new to this and have no idea where to start. Who wants and can help me?
bruh.jpg
I’ve also tried https://huggingface.co/lambdalabs/sd-pokemon-diffusers and got the same poor result.
🤣
They most definitely arent perfect, I had a lot more consistent looks with the openpose model from thibaud
As well as using it together with highresfix
That exterminates most issues for me
Hi! Can you tell me how to set up the display of all information under the generated image? Where are these settings stored?
Hello, if you need someone for mixing, mastering or effects for your instrumental, do not hesitate to contact me
Usually they are stored in the meta data, you can drag and drop images into the PNG Info tab in A1111/Forge or get the full workflow from ComfyUI images by dragging and dropping them into your comfyUI
Hi everyone, does anyone know how to load a local DreamBooth adapter (not from HF) to a pipeline? Not using A1111
finally, weekend is coming
almost 1 week closer to SD3
I wish the waitlist beta test would come next week or something
Dear... God...
people keep talking about ella, why? Is it better than kohya or something?
Ella isnt like kohya at all, kohya is for training. ella equips SD models with LLMs for better prompt understanding and improved visuals
Yeah that git is about reverse engineering ELLAs code to make a SDXL version

Reverse engineering ... they did with humans a long time ago ^^

Aboutta do it w u
imjk
What's the best UI these days? Let's say that I'm programming in python
I like the UI for rapid testing of prompts, and then I like to run a larger job through python
Quick'n'easy sounds like SD Forge to me. Your abilities more sound like ComfyUI could be fun ...
Oh yeah, that's right I heard about ComfyUI
I took a break for a few months and things keep changing, so felt it was worth checking if there was new hotness
Forge is the new A1111 now ... comfy still the same 🙂
Yeah A1111 had some annoying things I didn't like (such as the weird notation with [1:2] and [1.0:2.0] having completely different behavior)
depending on whether it was decimals or integers, it completely changed the meaning of what you were doing lol
Forge is a fork of A1111, handling some stuff a bit cleverer, a bit faster, and in my eyes more complete, cause it comes with many useful extensions
class Character:
    def __init__(self, traits):
        self.traits = {}
        self.traits['age'] = Age(traits["age"])
        self.traits['gender'] = Gender(traits["gender"])
Yeah I was working on basically some code like this, so that you could select the kind of character you wanted, and it has predefined way to construct the prompt.
Bad luck I'm not good in understanding code. I still have some basics cause of Borland Pascal ... but never used it ^^
Nah I don't need help with code, just show and tell
My idea was like: say you wanted to make a picture of a 16 year old, it has code that automatically adds the positive words Teen and negative words middle aged
so instead of choosing the words, you just pick the age you want and it makes the right prompt for you
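A minimal sketch of how that idea could look, assuming a hypothetical `Age` trait with made-up age buckets and word lists (none of these come from the original code). Each trait returns positive and negative words, and `Character` joins them into the two prompts.

```python
class Age:
    # Hypothetical buckets: (max age, positive words, negative words).
    RULES = [
        (12, ["child"], ["adult", "elderly"]),
        (19, ["teen"], ["middle aged"]),
        (39, ["young adult"], ["elderly"]),
        (64, ["middle aged"], ["teen"]),
    ]
    FALLBACK = (["elderly"], ["young"])

    def __init__(self, years):
        self.years = int(years)

    def words(self):
        """Return (positive, negative) word lists for this age."""
        for max_age, positive, negative in self.RULES:
            if self.years <= max_age:
                return positive, negative
        return self.FALLBACK


class Character:
    def __init__(self, traits):
        # Only the age trait is sketched here; others would follow the same shape.
        self.traits = {"age": Age(traits["age"])}

    def build_prompts(self, subject):
        """Combine the subject with every trait's words into two prompts."""
        positive, negative = [subject], []
        for trait in self.traits.values():
            pos, neg = trait.words()
            positive.extend(pos)
            negative.extend(neg)
        return ", ".join(positive), ", ".join(negative)


positive, negative = Character({"age": 45}).build_prompts("portrait of a hiker")
print(positive)  # positive prompt
print(negative)  # negative prompt
```

Keeping the word lists in one table per trait means tuning them for a different model only touches that table, not the prompt-building code.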
Well, with SD XL many don't really work with negatives any longer ... like in SD 1.5 times...
i rarely if ever use negatives with finetuned sdxl models
Yeah XL just came out when I paused my work
SD 3 could change even more ... so you may be a bit late for that idea
yeah next step for me is just getting XL to run in comfyUI I think. Not needing negative prompts would be pretty great
comfyui is actually very easy, not sure why people think it's some black magic
Yeah it barely needs negative prompts, I only ran into a few issues where I depended on them
I remember the other big problem was getting consistent colors (such as "red shirt, yellow pants" instead of getting the colors swapped, or ignoring one color etc)
I had ideas on how to get around that, but I think XL doesn't have that problem nearly as much
That problem still exist sometimes ...
Eye color was very annoying lol
the color problem can be solved most of the time if you put more emphasis on the color and the object, like (red sneakers:1.2) and (orange shirt:1.2) or something, but yea it's not perfect, maybe SD3 will solve all that
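A tiny helper for that emphasis trick, as a sketch. `emphasize` is a made-up name, and the output just follows the `(text:weight)` form quoted above; whether it fixes the color mixing still depends on the model.

```python
def emphasize(phrase: str, weight: float = 1.2) -> str:
    """Wrap a phrase in the (text:weight) emphasis form used in the chat."""
    return f"({phrase}:{weight:g})"


# Hypothetical prompt: keep each color glued to its object and weight the pair.
prompt = ", ".join(
    ["photo of a runner", emphasize("red sneakers"), emphasize("orange shirt")]
)
print(prompt)
```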
That extension pretty much solves it
at the end of the day, we are all waiting for SD3 to solve all of our problems 🙂
yeah!
🤣
I've been playing around with Ella a lot and if sd3 can do that, we should be in good shape.
I hope it can wake me up with my good morning coffee 😄
well i saw their technical report or paper and it does do that it seems, and a lot more, so i am excited
SDXL wasn't really good first ...
btw, what's currently the best approach to latent couple or regional prompting with sdxl in comfy? i saw this civitai page, it seems decent, but wondering if it's outdated or if there is some better methods: https://civitai.com/models/21100?modelVersionId=94130
yeah SD3 will also do a wider range of stuff like celebrities. ELLA refuses to generate some (if not most of them? haven't tested)
It knows Brad Pitt as a random black man
hahah yea ella was giving me some weird stuff too
but now that ELLA training is possible, it's going to change the game
imagine uncensored trains of ELLA
do you guys think this could make SD3 DOA (for AI corn addicts)
Isn't T5 a Google creation? They're into making black nazis
i think by the time people train ella, sd3 will be released, so yea....
hehe I get you but I think it's just ELLA's dataset not being wide enough
or maybe its just censored, idk
the ella we got is 1.5, not sdxl, so no wonder it's bad
idk SD3 is a superior pretrain with a superior VAE so I would personally try to train AI corn (if I were to be into that lol)
Indeed. I tried throwing prompts at the new cosxl. It was horrible. Shows how far the fine tunes have come
CosXL sucked for some reason
like there's a CFG threshold or something where above that everything becomes weird abstract art
i mean cosxl might be bad, but it is cool tech to edit images with instructions, the edit version of sd3 gonna be awesome
even with merges its bad for me
and it literally looks like someone took up the contrast effect
Is it that blurry thing?
I bet there's some mathematical explanation behind how it's like superior but I don't know
it just makes stuff look worse for me
I wonder if that merged data is still bad though because it's based on the old model. It probably has to be retrained with the source imagery.
haven't tried it extensively but I suppose its a sneak peek at what 2B edit and other models could do
there is a way to maybe fix cosxl outputs, you generate first with cosxl, then you apply a very light img2img with some sdxl model to fix the output along with upscalers, or whatever
idk I just hope SD3 will have bokeh optional somehow
but yea sd3 hopefully will do better
like you can do a negative prompt or type these into the positive prompt: "deep depth of field" or "narrow aperture"
I hope that will work
and yea i hope we can deal with bokeh, cause i hate it most of the time
I like it for portraits and certain occasions but
if I want believable generic images it ruins it
boring reality Lora is so good for this
we need anti bokeh lora
there is, but it makes everything daylight for me
I think its just the dataset not being diverse enough maybe
also the guy behind boring reality lora is making a new batch of loras If I recall correctly
I hope its gonna be even more amazing
it honestly shows the bigger parameter size superiority much more than a bunch of generic bokeh portraits
or...30 seconds in photoshop
ppl get tunnel vision for some reason and forget that photoshop still exists
Or they don't want to pay for Photoshop 🙂
im not a huge fan of having to constantly switch tools to do a job, so i would prefer if sd3 or something can do all that by itself
Enjoy your malware rootkits, they all have them
I'm baaaaack, good morning people c:
I am going to attempt to switch to Forge now O^O
YaY! Good luck! 🥳
Hello, community! Do you know if an image generated via stable diffusion several months ago for ourselves or someone is findable or not?
Y'all checked out that new audio model
You can make some hype stuff with it
I almost have a hard time believing it's AI
Thank you!
It's very high quality, highest I've heard
im personally more interested in models that i can locally install and use :3
hopefully some version of stable audio comes out
@woven panther haha you are a legend man... brushnet now too? you are implementing all the cool stuff for comfy ❤️
Pretty quick to wrap that stuff now that I've learned a bit, I was surprised how good brushnet seems to be for outpainting especially
i also saw powerpaint, but not sure if that is older or newer than brushnet
yea it's some cool stuff
Just looked at that, they adapted to use brushnet now with some changes, I'll look into including that
😮
nice
btw i saw the marigold thing, is that like a new depth map thingy?
il try it out
Marigold is oldest of them now, but still one of the best
There's also depth-fm and geowizard available now (wrapped both)
Geowizard creates absolutely astounding normal map predictions for a lot of stuff, like architecture
wow i didnt know about those, im gonna spend a lot of time on your repos
probably missed a lot of cool new stuff
Yeah haven't had time to actually use anything much... every day something new comes out 😅
Hope so
Just around the corner -> 2-3 weeks left 
gib mii
fr
Well... if it turns out true. Another week went by without any news at all. They promised API access like a month ago.
I hope they don't spend all that time lobotomizing it further...
Armed Jewish colonialists attack a village and burn homes and vehicles
insert that Godfather scene :3
the boring reality lora was made from just a few images. i think you can achieve the same thing with vanilla unclip
they cant
they could restart the entire training process and prune the dataset with 99% NSFW checking strength like with SD2.0 and Cascade 
then it would be lobotomized for sure
Cascade can learn and trains fast... but it's a Diva to use. The strange cousin of the SD family.
Did you look around the corner?
Hard for me not to make a joke 🙂
Im too scared to look around the corner :3
any of you have any plans for when sd3 comes out, and what if you are let down?
i expect to be let down as any base model of sd so far, and i expect the finetunes to be better 🙂
Finetunes are usually better for some odd reason.
i'm honestly afraid that they'll maxlobo SD3 and basically make it PG6
that would be very bad
just give us something we can work with and we are on our way to greatness
i hope it will run on my low end device because i can run sd1.5 but not sdxl.
if you cant run sdxl, it might be hard for you... but will see
ouch, you havent run sdxl this whole time?
people will maybe come up with pruned versions and whatnot
yeah, and im scared of google colab for some reason.
Just download on Torrent (Joke off)
well, there are other options, but they cost money
you are scared of your own shadow too :3
on the free colab it's technically against the EULA
dang
people still use it, but you risk google banning your account
is there a free alternative by chance?
there's something called kaggle, but I havent tried it
I might look into that.
kaggle is pretty good - you get 30 hours of access on their t4 machine per week, but they may block a1111 installs IIRC
aws sagemaker has a free tier
IIRC? Maybe better don't use shortcuts ...
you can do cheap, you can do good, you cant do both
Suno getting generous with their free credits, I wonder why
Quick question fellas, what's the easiest prompt to generate a desired text on a t-shirt?
example: "white t-shirt with LOL word on it"
?
woman wearing a t-shirt with the word "LOL" try that, but dont expect it to work every time
text is kind of tricky
OMG, BBQ, and LOL are considered the Holy Trinity of Internet Abbreviations
Also WTF is too, but more like in a fourth band member sort of way
yah, dont expect to be able to generate a paragraph
I suppose once SD3 drops you could try generating something like “I live in the age of nerfed datasets and quantified love”
Hm aight, thank you
Whoa!!
So cool 🤯 🌈
But then comes the kid that wants a new and unique story every day, and then suddenly they want a new and unique story every day as an adult and then suddenly they are some kind of sociopath
So… 🤷
If it ever drops
it will
hi, i discovered this AI from a youtube vid, is it still free to use ?
I hope multi subject Loras will be trainable for SD3 8B on 24GB, but I bet that's a stretch
I only got 16gb
yoinked said that training an 8B with 24GB of vram is barely possible
2B Loras can probably be trained with as low as 12GB 🤷♂️
since 3.5B SDXL requires slightly more than 12GB
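A back-of-the-envelope sketch for the sizes being thrown around here: memory for the weights alone is roughly parameter count times bytes per parameter. The function name is made up, and it assumes 2 bytes per parameter (fp16/bf16). Training needs noticeably more than this on top (gradients, optimizer state, activations), which fits the 12GB figure quoted for SDXL being well above its raw weight size.

```python
def weight_memory_gib(params_billion: float, bytes_per_param: int = 2) -> float:
    """Memory needed just to hold the weights, in GiB (no training overhead)."""
    return params_billion * 1e9 * bytes_per_param / 2**30


# Parameter counts as quoted in the chat.
for name, size in [("2B", 2.0), ("SDXL 3.5B", 3.5), ("8B", 8.0)]:
    print(f"{name}: {weight_memory_gib(size):.2f} GiB in fp16")
```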
You guys see the sd3 thread on Reddit? He does lots of peoples prompts. Better than sdxl in a lot of ways. Lots of the sdxl problems like intersubject interactions are still there. Hands are still rather messed up. It'll be a nice movement forward but it's no Dalle or ideogram. Not by a long shot.
thank you, installed the thing but can't figure out how to make it work XD, trying to recreate images using other image i have for pixel art
you might want to search for a model that is good for pixel art
although im not an expert at this stuff.
🤷♀️
good luck!
Here’s a question, how come there’s always UFO sightings in the US, but not Australia??!! Are we not good enough for the aliens???

that's because the kangaroos are the aliens
Have you seen Australia - there could be an alien race having a giant rave party in the middle of the outback and noone would ever know!!!
yeah that's ok, thanks for the answer haha
Yamer has a decent pixel art model - and searching pixel on civitai should net you 5-10 models that will all do a reasonable job
https://civitai.com/models/381734/hysteria he has a new horror model coming out in a day
Honestly i have no idea how to use the tool like at all, i don't even know what to do with that info XD, my first goal is to recreate a really similar art style for an indie developer, he lost his art guy and we want to add more sprites in his game, the style is pretty specific, so yeah. But it's noted, i'll look into it probably tomorrow, going to stream a bit on youtube for now lol
Ive seen the previews, its pretty sweet

hello
Hello God Like SDXL user
sightings tend to be correlated with two things: the amount of military bases nearby, and also the population density
Ah, so we screwed:(
but with statistics, outliers are predictable
and those reports are simply outliers
and these days, if they aren’t actual antigravity devices then they’re one of four things: russian hypersonic technology, chinese hypersonic technology, united states hypersonic technology, or nation ___
or, honorable mention: a made-up news blurb designed to distract from someone’s affair or murder
or war
its def the obama reptilian,he can fly
😮
yes i am told you can hear his “whoofle-whill” sound if you leave a bag of kumquats and cola nuts in front of the savannah at dawn
We don’t even get these ghosts everyone is raving on about in the US lol
key thing to remember is that it isn’t everyone
it’s just one person saying it’s everyone
sometimes two
I believe in the reptilian armada hiding within our ranks.
So about a week ago, there was a thread on reddit and Emad replied:
https://new.reddit.com/r/StableDiffusion/comments/1bvt1yb/emad_is_no_longer_the_main_shareholder_of/ky3cf09/?context=3
He says at the end "SD3 will release in a few weeks probably API much sooner worry not."
The API part seems true since there is a dude on reddit with API access it seems, telling people to post prompts so he can generate them. So, Emad was right on the API access part when he said "much sooner", so we can perhaps trust he will be right for "few weeks" as well? which would put it still around April 26 idk? which i guess is approx the same range as the new lead dude said at Stability. 😮
so if we are lucky, maybe like 2 weeks from now
You change your mind like a girl changes clothes
I signed up and never got anything back, I would like to see the bot return
Is everyone still on the wait list?
It will be out before you get the invite
It could be that stability has decided that having a massive amount of beta testers via the API/discord bots isn't particularly useful at the moment, so they're not going to send out any more invites.
The whole wait list may have been an Emad stunt more than anything they really really needed to develop SD3
Wooooowwww wow wow wow this is huge why didnt i hear of this
👀 👀 👀 👀
is it just me or has progress on realistic SDXL model training really petered out, civitAI seems to be overrun now with the most repulsive and depraved Pony variants imaginable
pony popular,realistic models are good enough already
good enough for what
All of the realistic models tend to be extremely overtrained. You tend to run into a diminishing returns effect
You'll see the same handful of facial structures and looks
It's easy to test just by setting the cfg to 1
yeah, and flexibility for doing action poses is very limited
So it's kind of pointless for a lot of them to continue training more into the models because it will break their vertical slice presentations of the models
hello world was very promising, then they just stopped.
By adding more variation to the model, you'll make sacrifices in quality elsewhere
There's only so much information you can pack into X number of gigabytes
If you want an ultra quality model with a ton of variety in say juggernaut, you'd probably have to double or triple the model size, which would knock it out of usability for 95% of local users
still thats fine if you are seeing fine tuning to customize it for specific things, but thats not really happening as much any more, except for pony models that specialize in various juices overflowing from engorged anuses
you can always do what pony did,get a few a100's and train them yourself
they spent like 50k to train pony
I only recently learned about the pony stuff and was absolutely disgusted, but not surprised by it
someone said the midjourney model is 80GB in size, not sure if true, but interesting if it is the case
no one really knows since its closed source
since its so much more flexible than any individual SD model
MJ probably has a dozen different models that get used based on the prompt. They use LLMs to prompt expand and probably pick the correct model to use with it
yeah hence the disclaimer
imagine that much VRAM
i mean that would mean its running on A100s or whatever, again i have no idea if thats true, its just something someone told me, and i forget who, also your fingernails don't keep growing after you die, that was a lie too.
i mean you can simulate having that much vram pretty easily, it will just be extremely slow. you can just do CPU renders and use regular ram. 128gb isn't really that expensive. but you'd probably have 5 minute render times lol
or Nvidia could put some more fuckin vram into their goddamn cards
😮
no need for consumer cards. the vram gimmick is a lie. even the most demanding 4k games don't really need more than 12gb (even though clickbait youtubers think they do). pretty much every major game engine uses pooling and just dedicates some chunk of vram. the actual usage of the allocated space might only be a small fraction of it. also, asset streaming and virtual textures. no point in keeping a 4k texture for a rock that's 5% of your screen space when the texel resolution is based on your screen pixel count. so it will drop it down to like 512 without any loss in visual quality. but there are also some garbage game engines out there that don't manage pooling well, so they use the extra vram as a crutch
only 27.72% of GPUs have more than 8gb of vram btw, according to steam hardware survey
and only 4.8% have more than 12gb
talking about using it more for 3d rendering, compositing, visual effects, editing, professional work
just the vram and nvlink/ai support
yep
but the chip is the same
yep, well almost, but yep
quadros use less power
and they still let you use blowers to run 4 cards in parallel on the one machine
and pool the vram
but all that stuff they used to let you do on consumer cards
until they took it away
so they could charge more
and if you render in redshift you dont even need nvlink any more tbh
nvlink is shit anyway
point is, the vram is there for those cards since they actually need it. gaming/run of the mill cards do not. nvidia knows this and anyone else in the gaming industry knows this as well. but hey, there's an 8khz polling rate keyboard out there that will give you a 300% advantage in fortnite (sic)
so they aren't "holding" the vram out from rtx cards, they just don't need it right now and it's a waste to put it in there
They say more VRAM = happy life
its not a waste to 3d artists, they just dont give a fuck about them as a customer base since they are a relative minority
there was a card that needed good vram and nvidia killed it with shitty bandwidth: the 4060ti with its 128bit bus
its just a dumb gamer fixation for now, like idiots that think they need more than 120hz monitors
but anyone who does 3d has been screaming about getting more vram for years
except for those kinda people who say "why wld u need that" on every forum when you ask for anything to get better ever
not really, you can do quite a bit with even 8gb cards. if you find yourself needing more than that, you need a render farm anyways lol
Imagine having 4gb vram
oh and i should say that i've been doing vfx and game development for over 15 years now
they know they can just gatekeep the 48gb cards and sell them for 10k usd instead of putting them in the x090 ti's and selling them for 5k usd
yep
8GB vram is useless for 3d rendering anything but really basic shit, i presume you are being hyperbolic here
Imagine SLI
if you need more than 8gb to render out 1080p videos (you dont waste time rendering 4k content on a single machine, it would be asinine), then you need to learn how to optimize the scene and settings
and that's counting gpu accelerated rendering
they didnt offer good support for SLI either, they just abandoned it, instead they chose to support NVlink cuz they know only the $10k+ cards support that
Tbh, they should allow SLI on modern cards, even if it isn’t efficient. Good to have options
well i have a small render farm of rtx6000s and 8000s and 4090s at home, i used it to render a bunch of 4k shit recently for films i was making in redshift and vray gpu, i can tell you none of that would have rendered in 8gb vram, big fluid dynamics sims with millions of foam particles blah blah,
sli was just carryover from 3dfx iirc
then youre not optimizing the scenes out and not precaching the fluid sims correctly
Ocean sims are amazing
a long time ago, i learned the importance of proper optimizing. you can get within margin of error on images but the poorly optimized one might take 5x longer to render and use far more ram/vram
uh huh
oh and again, rendering at 4k on consumer level hardware is pointless. it would be 10x faster to render at 1080p and AI upscale to 4k, with an absolutely minimal loss in quality
But did you learn coding?
yes
Nice
im an engineer by formal education

oh and another big hit in render times fabricatedgirls: your sampling settings. you can also allow for far noisier renders and just use a good denoiser. gotta find the right balance though, if you go too low with it, the denoiser will smooth the scene out too much
Me when I solve a problem
lol
why are you trying to give me a remedial beginners lesson in optimizing 3d scenes
because internet
lol
Back in my day
the whole point of remedial is to relearn something you either didn't learn or didn't learn correctly the first time
back in my day we waited 35 mins for a single frame
appreciate the condescension, i think most people who do 3d professionally would not make the claim that you can output any render of any scale or complexity using 8gb vram if you just work smartly to optimize the ram usage, since some things tend to have more detail than other things and to do the things with more detail it uses more vram no matter how much you dial in your render settings, and even IF you use a "good" denoiser.
anyway
back in my day, a frame was mailed to each other 1 at a time

Back in my day we used oil paint to create a frame ^^
Just wanted to be part of the "Back in my day" thing ^^
Oil paint is cheating, back in my day we used manganese dioxide and charcoal on cave walls
lol i cant figure out a good way to make characters grab stuff like sword = /
Repeat, repeat, repeat ...
not sure what im doing wrong, ive tried inpainting, img to img
controlnet + openpose
well, i figured i could generate an image, add a sword over the hand in editor like photoshop, then regenerate it and have her grab the sword
im not trying to make it in one go
maybe its just too few pixels to work with
can you try openpose to help guide the inpainting?
idk never had to try this
specifically
posted image in #🏞|general-with-images
ah i see
if you placed the sword in photoshop why not just take a photo of a hand gripping a sword from somewhere and paste that roughly in, then img2img will have more info to work with
hmm thats a good idea, we might need some cool looking swords in #🏞|general-with-images
Hey everyone! I'm offering $100 to anyone who can help me install two local instances of voice cloning software like TortoiseTTS, X-TTS, etc. I'll pay $50 after each successful install, which means it should consistently clone the voice of a chosen person.
I've run into some snags trying to do it myself, so be prepared for a few challenges along the way.
Specs wise, I'm working with an NVIDIA GeForce RTX 4070 Ti, 13th Gen Intel(R) Core(TM) i5-13400F, and 32 GB of RAM, so hardware shouldn’t be an issue.
I’ll be checking my DMs tomorrow at 18:00 BST and will give everyone a fair shot—first come, first serve. Looking forward to your messages!
i have to paste a whole hand in to just make it understand? T_T
well you can do it really quick and rough, and it'll be a lot faster to get what you want done than the way you are going about it now imo
anyway thats the first thing i'd try, since you are using photoshop anyway, its a trivial amount of work to test it out
oh i see you got it working anyway, nice one
kind of odd because i just made a vdb in embergen with a 512x512x128 resolution(~33m voxels) and two data channels. rendering it with two lights and at 4k, i didn't even go over 3gb of vram in the task manager(i forgot to check the peak within the render window, but it would have been much lower). don't have maya on this pc, so i'm doing it with blender 4.1 with experimental feature set. i'll try running some more sanity checks with actual scenes+vdb fluid cache to see how high it goes
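the voxel numbers can be sanity-checked with a quick sketch. it assumes a dense grid (every voxel stored) with float32 or float16 values; VDB is a sparse format, so the real footprint depends on how much of the volume is active and how the renderer loads it.

```python
def dense_grid_bytes(nx: int, ny: int, nz: int, channels: int, bytes_per_value: int) -> int:
    """Bytes for a dense voxel grid; sparse formats such as VDB may store less."""
    return nx * ny * nz * channels * bytes_per_value

# Grid size and channel count from the message above.
voxels = 512 * 512 * 128
fp32_bytes = dense_grid_bytes(512, 512, 128, channels=2, bytes_per_value=4)
fp16_bytes = dense_grid_bytes(512, 512, 128, channels=2, bytes_per_value=2)

print(f"{voxels:,} voxels")
print(f"float32: {fp32_bytes / 2**20:.0f} MiB, float16: {fp16_bytes / 2**20:.0f} MiB")
```

the voxel data alone stays well under 1 GiB either way, so most of the vram in a render like this would be going to other things (scene geometry, textures, render buffers).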
SD3 out yet? haha just kidding! -_-
Download on Torrent ...
April fools was awhile ago
hello
Had some time to rig up a vdb from embergen into the demo junk shop scene. the smoke effect was around 50 million voxels and the majority of them are in view of the camera. I set blender to render at 4k and cranked up a few light bounce settings. Peak memory usage was 5043mb and my gpu never went above 7.7gb(didn't go into shared memory) and that's with the viewport scene assets still cached in the vram(it would have evicted them if it needed the space). Kinda a slow render, like 2m30s per frame, but this PC only has a 2080 in it. But yeah, that's a fully fleshed out scene, with a very large amount of voxels and rendered at 4k, all fitting within 8gb of vram.
oh and render tiles set to 2048 and the vdb step size set to auto(it will determine based on voxel size). I don't really use blender much, so I can't think of any other settings to really mess with to make it more or less difficult to render.
Me when I solve a problem
hello
I'm looking for some help. I have some very stylized text from a game logo and I would like to try and use comfy UI to generate different words/letters in the same style. Anyone know how I might be able to achieve this workflow?
How to create 360 panorama? (ComfyUI)
Dang it! I forgot to make Sword images
my god supir is painfully slow. the electric bill for using it on an entire video must be insane...
What?
I said "MY GOD SUPIR IS PAINFULLY SLOW" 😄
its used for image restoration / upscale. typically need around 30 gb vram ( you can use some methods to decrease that a bit) to run it
I only got 16gb 😦
you could run it with that im pretty sure. ive got 20. a 1.5x upscale took just over 30 minutes though, 2560x2560 to 3840x3840. Its good for sharpening up an image as like a final upscale pass without changing the actual details. just for putting that final polish on something you really like
ok... i just zoomed in to compare side by side. im actually blown away. well worth a 30 minute render
Supir is the best upscaler in the known multiverse.
what parameters you used?
oh man. which parameters lol
more or less default on everything, was just a test run since i just installed it
Is there anyone here that has been able to deploy stable diffusion serverless to the cloud?
did enable background restore, ran it at fp16 , and used tilevae
I'm kind of lost and was wondering if anyone could give me some tips
BRO a grandma will get done walking from state to state before this company be ready to finish what their doing with SD3
Like Tacobell will already have came out with a brand new menu before this AI company gets ready
my guess is that its been ready but "safe guards" are being put in place. same as with sora. been ready, they just want to ensure it cant be misused 😐
i personally never understood the point of "safe guards". to me this would be the equivalent of like manufacturing a weapon and then nerfing it, which doesnt make sense. the problem is the "person" using said weapon, said software, etc.. go after those people, but dont nerf the product, i mean idk.. might as well then not make anything at all if we are this worried about misuse or whatnot..
I got a question. Do you need to download all the controlnet models if you are using forge? I installed forge that has controlnet baked in, but I haven't installed the models for controlnet, but they have still worked, all the ones I have tried at least.
Like the SD3 has the capability to create something of a celebrity, a photo that never existed. So this SD version to celebrities could be threatening to them
Or making some form of fabrication, or worse or unspeakable things
Nah, if that were to be the case we all shouldn't exist if mother nature didn't knows we would misuse the purposes of our living, like hurting or unaliving another person.
would you like a half baked model?
would you like something barely better than sdxl with higher vram reqs?
or would you like something that is Basically Sota On Your PC?
(also unfortunately they will lobotomize it, not as much as 2.x lol)
so they CAN lobotomize it?
how so, especially somewhat late in the training?
LECO or something like that
more likely will just be training on pure sfw for a while
so erasing concepts
damn I hope that doesn't hurt the model's quality
if not then idc, cause guns and blood are still available in SD3
or maybe not blood, I don't remember having seen blood
its mainly just to remove pure nsfw, e.g. what most anime models excel at
guns n [relatively low] blood, while not pure, are in the data afaik
I mean I thought about this too, removing concepts from the finished model somehow, but I was always worried about it accidentally hurting other parts of the model
I mean blood will probably look like thick ketchup anyway
just like with current models
I have seen someone on reddit try TF2 for me and it didn't know about TF2 still, it just made "generic cartoony hero shooter"
idk if the stable assistant was using 8B though
yeah i doubt it has enough tf2 data
I thought it might have a bit of TF2 like with other dataset, but with a higher parameter size it would perform better with more obscure stuff
but I suppose that's not how it works
I dont even know the current dataset
the way i think it works is that while it might know what tf2 is, it probably only has like 4-5 images, since tf2 is really obscure in the grand scheme of things
re-ran it, yeah still with "Generic Hero Based Shooter"
aw you have access
good for you hehe
well yeah
I'm trying to search it in Clip Front (laion clip retrieval) but the backend url it uses is literally 502 bad gateway rn
so I cannot even see if Laion-5B had it
reading people hate on sd3 is amazing lol
Laion-5B had almost everything, including angry video game nerd, which SDXL and SD1.5 failed to catch of course
so I thought maybe an 8B model with like a higher portion of Laion would perform better in more obscure territories that previous models would have missed upon
fr
on reddit they keep comparing it to Ideogram and DALLE3 and Midjourney
it is not dall3, it is probably better than mj (i have not used mj)
and they're like "eh not following the prompt 10/10 and hands aren't perfect so this model fckin sucks 👎"
lol
the person who got kicked literally gave it a 500 word prompt and complained when it didnt follow it perfectly
oddly better and worse than MJ, from the Stable Assistant results it seems to perform worse in 2 subjects, yet better in every other territory
it's weird
i think it more or less decided that it shouldnt waste params for an obscure image and use them for something else
uhhmm!!!!! 500 tokens are within the 512 context length!!!1!

hmm
maybe they also use aesthetic filtering that might accidentally prune these as well
🤷♂️
awww I wish I could use Clip Front man
I mean its not like DALLE3 was miles better
it did give the characters gloves and ammo belts so they all looked like heavy weapons guy
so it was somewhat like TF2, but it's still not it
sort of the same for todays model
if we can train multi concept loras on 24GB on 8B I'd be super happy and I would not even mind missing info, but you said that training simple loras on 8B with 24GB is already barely possible
lora vram reqs are gonna suck for sd3 (not as much though)
well 2B Loras are obviously going to be trainable on 24GB (right? 😬)
oh yeah the bot's model updates every now and then
damn
so there is some hope wow
like I wont expect PERFECT TF2
yep, if you can train xl you can train 2b (unet)
but if it can get to whatever style DALLE was trying to depict then that's awesome already
ahh so it does somewhat work like that
point is, SD3 is already amazing for a base model

these are Stable Assistant SD3 images
WHAAAT
so thats why people say it looks like 512px upscaled
if it's 2B then I'm amazed
hi guys, anyone can help me? I'm trying to do face swap video, I use roop unleashed, it works +- when the person is facing forward but when the person turns their head it completely bugs. Is there any special configuration that needs to be done? Or another software?
the only example of 2b ive seen is a picture of yellow elmo (its better than xl)
and alex (mcmonkey) made some cat images that compare 2B-512 with SDXL
and it looks as good as SDXL (if not better), and prompt adherence wise it obviously won
the 16 channel VAE is really showing its worth with those 512px images
the new vae is amazing
I thought it was only going to be for better colour accuracy
and it doesnt look fingerprinted
wdym
remember xl1.0vae?
yep
also its funny that people complained about an SD3 result, because the prompt had "very low resolution" and "pixelated background" and it made the whole image pixelated, yet when people generated DALLE3 results they were all 2D pixelart games, whilst SD3 was 3D geometry with big pixels
but I mean DALLE3 modified the prompt anyway so we don't know
@crude notch https://www.reddit.com/r/StableDiffusion/comments/1c2je28/comment/kzc0ltv/
Amazing painter strikes again
wtf he complained about all those stuff but could not take 5 seconds to read about the dataset captioning from the paper
its like a 15 second job with the search function
also what's funny is that the TF2 prompt I had where a blue soldier is holding a rocket was terrible with DALLE3, it kept holding some weird rifle or grenade launcher
oh would you look at that, someone posted a comment explaining each point and why its inaccurate
who could that be 
i have no idea 
also wasn't that painter guy using an older version too since he got banned like weeks ago
yep
even then, the model still could do that
the funniest thing would be if he prompted it like SD1.5, I would have laughed my ass off
i wish i could post the prompts.
i dont want to lose sd3 access, but it would be really funny.
its more like if someone treated sd3 as a local dall3 clone
hmm I was thinking of Superprompt v1, but that would need more training to get it to work better with SD3 tbh
it would help a lot for the SD1.5 community though
make them learn natural prompting and etc
which, as i have to say, it is not
yeah, tag based prompting (at least for launch sd3) is dead.
wooowwwww a model that's less than half the size as DALLE3 is not as good wow its trash it's literally DOA on god fr fr
"i cant believe this 8b+12b model is slightly worse than a 20b+gpt4 model!!!!"
I wonder what would happen if we omitted or left the prompt for the clip model empty which performs better on tags
clip_l or clip_g
I don't know how the two clip conditioning models work alongside each other
they dont want to get sued
^^^^
and they need good PR, eg: "we worked a lot on SAFETY so that the prompt wont destroy our lives and take over the entire world"
they always target the open models
Thing is...people are going to misuse their product anyway, and it's funny because they know this
sure, people will train it back in
but as a company, you need to censor it.
if you don't, then people will know Stability as a company which gives out models that can generate nsfw images of celebrities
they have to cover their asses, legally
if they censor the model and people train it back in, it's less of a hassle for stability, since they promised a safe model which would prevent bad actors
if you make a model that can do nsfw of any celeb, then those celebs will get mad, and will sue you (most likely)
pushes the blame onto those other people
Sigh why does anyone care anyway what Harry Potter looks naked...
Sounds like prompt material ... thanks! 😁
This stupid bad guy might be more interesting ^^
SD3 almost seems like a myth with how early the release papers were out and still no actual release
yeah idk why they announced it so early
marketing stunts
they built hype too early
mainly because this is a whole new arch
and they want to make sure that It Works Well
https://subpac.com/ they've waited 2 years for their product
never came out
Looks great, but in reality it's not real
okay but SD3 will come out
CTO stated multiple times that it WILL have an OPEN release
The rest is just fluffy bs 🙂
I could state this 100 times for 2 years
guess we will have to wait for another two months
does that make you feel better on signing up for something i haven't released yet?
I just put out some papers?
its been like 2 months, cant you wait 2 more months/weeks?
that's pretty ridiculous
to announce features 4 months ahead
anyway, g'day lads, just thought i'd chime this in
announcement: WAY too early
Elder Scrolls VI
yes, it was early, will it release? of course
yessir
2 months till it releases then another month or two for the fine tuning
epic win
And only then we will be able to prompt shrek sitting on a chair made out of spongebob
i doubt that wd will take that long lol
I'm still patient, but here's what this did to my mind: I quit using SD all together until SD3 was released, yall keep pushing it back
well wd is kinda dead
I have not done any Ai because of the SD3 wait
i'm more waiting for animagine sd3
sort of, wdvxl is soon
tried deepfloyd, Pixart-Sigma and ELLA SD1.5
lol
pixart-sigma is the closest since its a DiT with a captioned dataset, but it's tiny (600M, smaller than the smallest SD3 model)
last time i tried a wd model it gave me errors or was very low quality
but the quality and prompt adherence isn't even close to SD3
1.5e3?
probably yeah
yeah its mainly since auto webui doesnt like 2.x
interesting
hopefully they make it work this time around
its gonna be cursed.
thats all i can say
lol probably
those guys committed to agi anime waifu anyway
or some llm thing i don't remember the announcement
agi tomorrow (probably)
lol
oh kino's announcement yeah
anime waifu? lol

emad
yep lol
its really time for sd3
I keep checking the stability.ai twitter feed every couple days like an addict waiting for sd3 this is kinda sad
I would try it out on the demo or even 2B locally so bad lmao
but it would mess things up sadly
for weeks/months ive read „next few days“
I check each weekend
stop trusting those lol
but it's gone so long that now im like ??? i need to go to the discord and ask
CTO's ETA isn't even close to "few days"
i gave up
Did the beta access even go out ?
or 3 if we're lucky
- yes its happening
- 3-4 weeks
yep!
not for the waitlist people iirc
I swear lykon is the only dude that got access
for the waitlist people too?
i have it, just cant share without permission
some even mentioned, beta finished two weeks ago
bruh
oh that dude.
what's your twitter yoinked?
i dont use twitter
based.
i only post on discord
less brainrot
you have access to sd 3?
he uses tiktok like a real man

yep
lol
how well it works with artists?
where 👀
here or tohoai
wildest? obscure video game character (didnt get every detail but it got shockingly close)
I wonder are people already mixing models with it?
not yet*
uhh I cant find it lol
So you don't have access to weights just like an api thing?
can you DM link or something
oh, should just be .gg/touhouai
yeah discord bot
same for every non-sai employee
💋
Oh got it yeah makes sense
they really dont want it to leak
guys
i want to turn my pic to a cartoon and put a pikachu hat on it
is it possible to do in here eh
no answer eh !
not with any bot no
Try with R2 D2.
I've been using Auto1111 for a long time now but I'm wondering if it's time to pivot. Are there any guis that have surpassed a1111 I should try out?
I've tried comfy but I'm not sure it's my thing.
Stable Diffusion Forge is the new A1111
I thought the repo owner abandoned it?
Damn ...
is it?
thats a shame
it was better than A1111
except for the bugs
lots and lots of bugs
oh if you like the performance then just use comfyui
if you like spending your time zooming in and out of things
swarm then
ngl I tried swarm and I was more overwhelmed than just comfyui
I got into this custom workflow thing and I just cannot use an auto1111 type interface ever again
auto is limiting
but swarm is really cool though
anyone care about foooooooocus any more
basically comfyui but you dont have to do the hard bits
you get fast updates and everything
i would recommend comfy for three main reasons:
- usually is the first to have support for the latest new stuff
- has better memory management and requires less vram for stuff
- there are tons of modules that are implemented almost on the daily, based on new papers released, etc,
and are usually available way faster than A1111 or similar where you have to wait who knows how long till someone
implements it, if at all.
Also, I think technically comfy has way more modules developed for it than any other framework anyway at this point,
just lot more things to connect and try and have fun experimenting with, gives you a lot of flexibility where as the others
you cant connect them in a way you want, you have to follow the way it works in the background and that's the only way.
stable cascade support, and now SD3 support WHEN IT COMES OUT
Comfyui was forged in Hell.
but I mean technically comfyui DOES run SD3 right now, we just don't have the code and weights
it literally already supports it lol
i take it there is still no cascade support in A1111, then, other than that useless stable cascade extension/tab thing
comfyui might take over, you guys... 😔
stableswarm will be your only option for now if you want to try SD3 on day 1
with very good VRAM management btw
comfy supports sd3 day -1
i bet 80% of ppl still use A1111 over comfy tho (wonder what the actual numbers are)
yeah cause its simpler
i switched from a1111 a long time ago
and many people just want to type prompt and click generate
a1111 will support sd3 in like 3-4 months after release
but I have been using stable diffusion since 2022 august so uhhh
lmao real
plugins for a1111 supporting SD3 will come out before A1111 implements it lmfao
With 64GB Vram needed.
why so slow, is the coder an alcoholic?
hello
😄
hai
less than you expect
codebase is a fuck
it is simply put... a fuck...
12GB VRAM for 8B fp16 and T5 at 4-bit?
yea
and that's the biggest model
correct
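a weights-only back-of-envelope for those numbers. the 8B figure is from the chat; the ~4.7B size for the T5 text encoder is an assumption, and activations, the VAE, the CLIP encoders and any offloading are ignored, so this is not a prediction of real VRAM use.

```python
def weight_gib(params: float, bits_per_param: int) -> float:
    """Memory for model weights alone, in GiB."""
    return params * bits_per_param / 8 / 2**30

# 8B parameters as quoted in the chat.
print(f"8B @ fp16:  {weight_gib(8e9, 16):.1f} GiB")
print(f"8B @ 8-bit: {weight_gib(8e9, 8):.1f} GiB")

# Hypothetical T5 encoder size (assumed, not stated in the chat).
ASSUMED_T5_PARAMS = 4.7e9
print(f"T5 @ 4-bit: {weight_gib(ASSUMED_T5_PARAMS, 4):.1f} GiB")
```

on this arithmetic the fp16 weights of an 8B model alone are larger than 12GB, so the figure quoted above would presumably depend on offloading or lower precision somewhere, which this sketch doesn't model.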
and if stable assistant is using 2B or 800M, then people don't even have to worry
2B beats SDXL
why was forge abandoned then, seems like it solved a lot of the problems.
alex (mcmonkey) showed an example of 2B-512 with equal quality + way better text and prompt adherence
they have to keep up to date on both a1111 and comfy
forge is same coders as comfy?
Lvmin Zhang (Lyumin Zhang)
yeah lllya
Didn't he write an exam?
Hello guys 😄 do anyone have SUPIR?
yeah i have supir
Any source?
might not be latest version tho, havent updated in a bit
How long is the average processing time on your current GPU?
Even NASA can't run Supir.
I own a 3090, I don't know if I can install it
3090 will be fine
I heard it uses 32gigs of vram
They tried it but some satellites started crashing so they had to stop.
They were trying to upscale naked pics of alien girls.
How many GB is the SUPIR download? I'm currently on an internet package 😦
3090 will be more than enough for SD3
with vram left for controlnets and loras as well
my supir install folder is 68GB
jfc
wtf
thank you for information man 😅
yeah but that includes several SDXL checkpoints (juggernaut/helloworld and at least one more)
plus it downloads llava
😮
we're talking about Dr. Furkans supir frontend here btw
and autoinstaller
when there's a cloud front for it, it'll be cool, but I gave up after getting frustrated with OOMs
its good tho! Best upscaler i've ever seen, i paid for a magnific sub and cancelled it when this came out lol
only downside is it couldnt uprez to super high resolutions due to ram limits but maybe they've addressed that since, not touched it in about a month
It's incredible.
yeah its really impressive
Almost as good as those Blade Runner "zoom in" things.
I did all the magnific tests for furkan in his video about it, it pissed all over magnific in every single test without exception
🙂 yes.
There is always a better free alternative out there for all the AI services.
You just pay with time and your own resources.
actually did some upscale tests from tiny low rez blade runner images lol, and the results were amazing hahahaha
it preserves the subtleties of peoples facial expressions better than anything else
Where to get it?
Yes, I took a look at dr.Furkan's video. He is truly a competent person
@crude notch my god I'm starting to give up on this community
hello
More like goodbye apparently
woohooooo
woohooo indeed
i'm subscribed to his patreon
dont think he made it public?
its 5 bucks a month
Ahhh... thanks! I found that ...
went to sleep so missed this. I have no clue why you are trying to argue that because you can load a very low resolution voxel grid into a basic demo scene that ships with the software, that this somehow means people using gpus to render and comp in production dont need more than 8GB vram to render production quality 3d renders. I don't understand what you are attempting to achieve by arguing this? Literally what is the point? If this was the case why do they even sell quadros with 48gb vram that you can pool with nvlink? is it to render bigger scenes or not? People here will argue literally anything.
fuuuuuuuuuuuuu
metamorphic generation = more contextual fidelity
and they’re training an even stronger model with the help of that Open Sora project
so we got zords merging into an open source m e g a z o r d
We are cool as a polar bear's nipple.
Ohhh... never touched that kinda nipple ^^
When SD3 drops we'll be as cool as liquid helium.
Midjourney and Dalle will melt in our bright light.
Imagine how much VRAM you’d need
sd3 release when
sometime in the future
In the future.
as a real power rangers fan i can confirm
you guys excited for WW3?
Sort of
ww3 release when
i am excited for SD3
they are beta testing WW3 right now
so I have learned to temper my expectations.
But WW3 is more likely to come out before SD3 right now.
no pop
:/
I thought my game is a dystopia. But with the way things are going my game is actually a utopia.
😄
We need to generate some cool Swords in images
I mean two fronts have opened up and we are just waiting for the Franz Ferdinand moment for full fun and games…
I am totally excited for WW3 yes. 😄
Pick your sides
Geriatric Biden or obese Kim
Lets go
!
How is everyone doing today??
Pretty good
mmm toast.
hmm does anyone know how i can install embeddings etc?
wait till SD3 comes out
wait its coming out soon?
we were told soon, but who knows when
It should
ahh ok
like this month?
after WW3
idk
also does anyone know how to install embeddings?
the pt files
and how to use them etc
cant really find tutorials online
oh, like you dont have to type any prompt to activate them or anything?
are you using A1111
no, im using forge
ok same thing, theres a tab with loras etc etc
Forge can change a life!
there's one for these
ohh ok
you just click on 'em in the tab and it adds them to the prompt
np
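for the install step itself, a minimal sketch, assuming the default A1111/Forge layout where embedding files sit in an `embeddings` folder under the webui root. `install_embeddings` and both paths in the usage comment are hypothetical names for illustration; dragging the files into that folder by hand does the same thing.

```python
import shutil
from pathlib import Path

# File types treated as embeddings in this sketch.
EMBEDDING_SUFFIXES = {".pt", ".safetensors"}

def install_embeddings(download_dir, webui_root):
    """Copy embedding files from download_dir into <webui_root>/embeddings."""
    target = Path(webui_root) / "embeddings"
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for f in sorted(Path(download_dir).iterdir()):
        if f.is_file() and f.suffix.lower() in EMBEDDING_SUFFIXES:
            shutil.copy2(f, target / f.name)
            copied.append(f.name)
    return copied

# Example call (hypothetical paths, adjust to your own install):
# install_embeddings("C:/Users/me/Downloads", "C:/stable-diffusion-webui-forge")
```

after copying, refresh the embeddings tab (or restart the UI) so the new files show up there.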
Is it possible to earn money with free ai tools/software, or do you all think its impossible?
i really want to start working towards a better computer/device but im not allowed to get a job yet, and my autism limits me greatly.
i did concept art for an upcoming movie using SD and some shots in a TV commercial in dec, in combo with 3d rendering, etc. But people are a bit antsy now about using generative AI, due to how enraged it makes people, and copyright concerns etc