#💬|general-chat
Though, the base model images already look pretty dope
like DreamshaperXL level quality
I'm not really a Linux guy but I had to do some projects in a class. I think you can use a .service file to do it.
I will see what i can come up with tonight. See if i can figure out the services file. thank you.
is there an argument or variable i can change to let SD use multiple GPUs to even out the work load?
I know I read about multiple GPUS recently but I can't remember if it was even supported
I had an Automatic 1111 that was supposed to use every GPU but i could not get the features i wanted through that install. It was worth a quick ask.
Good afternoon, everyone! How are we all today?
I guess I'm alright, how about you?
starting it via a service is an option, but it'll be tricky to get all the environment stuff correct. Your other option is to use 'nohup' to run it and tell it to background like "nohup ./webui.sh &"
after that you can close the terminal and it should still be running
i will try the nohup but the & does nothing to hide it.
hide it? why do you need to hide it ? you can redirect the output to a file?
or to null
like "nohup ./webui.sh >/dev/null 2>&1 &"
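For anyone who would rather script this than type it, here is a rough Python sketch of the same idea as the nohup one-liner above. The helper name `launch_detached` is made up for illustration, and `start_new_session=True` is only an approximation of what nohup does (it puts the process in its own session so that closing the terminal should not take it down), so treat that as an assumption to check on your own machine.

```python
import subprocess


def launch_detached(cmd):
    """Start cmd in the background with its output discarded.

    Hypothetical helper, roughly comparable to: nohup <cmd> >/dev/null 2>&1 &
    """
    return subprocess.Popen(
        cmd,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,  # same role as >/dev/null
        stderr=subprocess.STDOUT,   # same role as 2>&1
        start_new_session=True,     # own session, detached from the terminal (POSIX)
    )
```

Usage would look like `proc = launch_detached(["./webui.sh"])`, run from the webui folder. As mentioned in the chat, throwing the console output away loses useful information, so pointing `stdout` at a log file instead is probably the better habit.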
the console output is pretty handy though
good day Sunny! i'm good, what about you?
bot is down
and 🤣 at that prompt
is civit down for anyone else?
yup
hope its not the end for them...
nah, it happens all the time. they'll be back in like an hour max.
hello
I have a question regarding running stable diffusion locally
I was able to generate ai images txt2img just fine
but I do not understand why, when using img2img, the AI completely changes the base image
you need to turn down the denoising strength
this is what I mean
okay i will try that
okay well that actually worked thnx alot
img2img isn't like "take the composition and style of this image and make a new image!" Instead it just takes an image, adds a bunch of noise to it, then runs the diffusion schedule on those now-noisy pixels from an adjusted starting point. "Denoising strength" is that adjustment: 0.7 adds roughly 70% noise to the image and starts the denoising schedule about 30% of the way in.
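To put numbers on that, here is a minimal sketch of how denoising strength can translate into steps. It assumes the common approach of truncating `steps * strength` to an integer, which is similar to what some img2img pipelines do; the function name is hypothetical and your UI may round differently.

```python
def img2img_schedule(num_inference_steps, strength):
    """Return (steps_skipped, steps_run) for a given denoising strength.

    Assumed behaviour: strength decides how far into the noise schedule
    the input image is pushed, and only the remaining steps are run.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    steps_run = min(int(num_inference_steps * strength), num_inference_steps)
    steps_skipped = num_inference_steps - steps_run
    return steps_skipped, steps_run


for s in (0.25, 0.5, 0.75, 1.0):
    skipped, run = img2img_schedule(20, s)
    print(f"strength={s}: skip {skipped} steps, run {run}")
```

At strength 1.0 nothing is skipped, so the base image is effectively replaced by noise, which matches the "completely changes the base image" behaviour described above. Lower values run fewer steps and keep more of the original.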
you're welcome :)
if u dont mind me asking but where Can I actually know more abt each and every modifier in full detail?
if you want to keep composition while denoising even more, you need ip-adapters or controlnet
I see
https://stable-diffusion-art.com/glossary/ when learning things i always jump into glossaries
for me it was a combination of youtube tutorials and experimentation
oh so this is like the documentation?
nice
yeah. loads of media consumption and studying and experimenting.
i love this channel for high level looks at technical details. https://youtu.be/1CIpzeNxIhU
glossaries are common term definitions. good for reference or perusing
okay
its a jargon rosetta stone
Well then all I need is a good roadmap
roads? /flips shades/ where we're going, we don't need, roads
Future? where we're going, we don't need , future
So not fetch
yo, its not free anymore?
care to elaborate?
what server name?
top left
top left reads stable diffusion
I dont know what news you're referencing, but it's been opensource until now
I dont know man, give up is my suggestion
thats great man
these stimulating conversations are what keep me going
when was the server ever named free?
you weren't around then
lol, dont even try flowwolf, your mind will explode, poof
do or do not! there is no try! (smart fans noticed this was an absolute right away)
I've been inactive for 1 year in the stable diffusion, what has changed?
based on my interactions here, this populace is undeserving. but i'm overtly harsh in my social judgements. don't take me too seriously. i wouldn't give me any power either.
oh oh i get it now. you are talking about the free bot.
bots turned off. stable diffusion is still free and open though. good luck!
Has it gotten better?
I'm ok with that
if it's worthwhile it'll come out, if it isnt, meh
or someone else releases something, or stability goes under, a million possibilities under the sun
dont make it like this, if we all work together..., no
Does anyone know how the performance scales, 4060ti vs 4070s vs 4070ti super? 🤔
what are the odds anyone here has all those cards? so really what you're asking is for someone else to research this for you
why ask? takes 5 seconds to google
idk, but I got a 4080
Can't seem to find an answer to this online. But has anyone found a way to simulate a photoshoot?
Like I feed SD a single image of a Bride and Groom. Then SD goes off and produces a set of images based on that single photo as if I had taken the entire photo shoot. I could settle for pose changes but likeness, and clothes need to stay the same in every img.
Nothing in google, 4070s is pretty fresh
You don't have to have it all, you can just describe your own performance in it/s, someone with other card will do the same and voila
Or maybe I have found something , they say
The 4060 Ti sure seems like its still the sweet spot. 16 GB of ram and about 12 it/s is pretty decent for $700
The 7900 is getting about 20 to 21 it/s. $1,570
The 4070 TI Super is getting about the same; 20 it/s. $1,417
The 4080 Super is getting between 24 and 30 it/s, for $1,799
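A quick way to compare those quotes is price divided by speed. The sketch below uses only the figures posted above, collapses the quoted ranges to their midpoints (an assumption), and ignores VRAM size, model, and resolution, so it is a rough ranking at best.

```python
# Figures quoted in the chat above; ranges are collapsed to their midpoint.
cards = {
    "4060 Ti": (700, 12.0),
    "7900": (1570, 20.5),          # "20 to 21 it/s"
    "4070 Ti Super": (1417, 20.0),
    "4080 Super": (1799, 27.0),    # "between 24 and 30 it/s"
}


def dollars_per_its(price, its):
    """Price divided by throughput: lower means more speed per dollar."""
    return price / its


ranked = sorted(cards.items(), key=lambda kv: dollars_per_its(*kv[1]))
for name, (price, its) in ranked:
    print(f"{name}: ${dollars_per_its(price, its):.2f} per it/s")
```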
4080 is good enough 🙂
sorry for the question but how do I install these in automatic1111?
```
root@redacted:/workspace/sd/stable-diffusion-webui/models/Lora/temp# tar -xvf 25jrju.tar
lora.safetensors
embeddings.pti
special_params.json
```
I mean if it were a lora it would just be the safetensors file only, right? why the .pti too ??? It doesn't work either way when I load it as a lora
Guys how to fix this? (NansException: A tensor with all NaNs was produced in Unet. This could be either because there's not enough precision to represent the picture, or because your video card does not support half type. Try setting the "Upcast cross attention layer to float32" option in Settings > Stable Diffusion or using the --no-half commandline argument to fix this. Use --disable-nan-check commandline argument to disable this check.)
Maybe try to provide sysinfo with such cases, cause there is a lot of options
it's a long text
pastebin or a link to file
how do I do that? (shy)
This is output from your console
To get sysinfo you have to go into your webui settings
let me try
why did u guys react with thinking emoji
am i dumb
fp8 support from the transformers engine, basically doubles memory of 4080 . 3090 can manage fp8 too, with a significant speed hit though.
inference at quarter precision really isn't noticeable at all. same seeds are identical
Cause I have no idea what is this thing and what are you doing, especially on linux
I'm in the settings now, where do I find sysinfo?
I downloaded the weights from my finetuned model on replicate.com
and I just want to add them to automatic1111 web ui
you know
😮
Hi, sorry, where to enter the prompts?
i think it's built into forge, but base auto1111 needs the extension----no my bad. it is in auto1111 now
I'm lost here at discord and used midjourney 1 Year ago
"System info file, generated by WebUI. You can generate it in settings, on the Sysinfo page. "
I guess you have to ask the creators of the weights, or deduce it from the contents
i mean you guys heard something about pti?
ah fuck it
im probably just gonna train on the model again
You are on linux, there is no file extensions on linux
I mean, there are, but its just a string
ya
ok then I paste it into pastebin, enable highlight, then what?
File extensions are windows os concept, linux has shebangs (and magic bytes)
where do i share my art? nsfw
XD
Make a link right there, and paste such thing always with your problem when you ask for help
how do I make the link? Just clicking the paste?
Create new paste button
I run a 4090 and I get around 7 it/s at 1024x1024 with SDXL base
So 50 steps takes like 7~8 seconds
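The arithmetic behind that estimate, as a tiny sketch (the function name is made up, and it ignores model loading and VAE decode time):

```python
def seconds_for(steps, its_per_second):
    """Sampling time only: number of steps divided by iterations per second."""
    return steps / its_per_second


print(f"{seconds_for(50, 7):.1f} s")  # 50 steps at 7 it/s
```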
chili rice retro wrap
Drop --no-half or just add --disable-nan-check
How? 🙂
It's your starting parameters, its most likely in the webui-user.bat if you are not passing it manually
please never use --disable-nan-check
soggy nuggets dipped in spring water or something
Had to use it every day on AMD 🤷♂️
then you probably did something wrong every day
Just try dropping --no-half
but @warm junco helped me figure out, it was one of my models, juggernaut gives me problems
you mean delete that line?
set COMMANDLINE_ARGS=--xformers --no-half-vae --device-id 0
https://en.wikipedia.org/wiki/Filename_extension came from unix servers and windows filenames are just strings too
where do people get this stuff?
Yeah, delete this from your starting parameters
no he meant use that set COMMANDLINE_ARGS=--xformers --no-half --device-id 0
anyone knows where i post nsfw art?
not here #✍🏼|rules-and-tos , rule 4
ok getting rid of the -vae part
It's more than a string on windows, you have things in the registry for the given extension types, but nvmnd
default apps?
both gnome and kde provide support for default apps. pretty sure other environments do too. its quite standard behavior
i wonder how good lora will be on these new models
NansException: A tensor with all NaNs was produced in Unet. Use --disable-nan-check commandline argument to disable this check.
continuing in #🤝|tech-support
still error
with juggernaut
More than that, windows ignores magic bytes completely. You can make a file with multiple extensions and it will use only the last one, and there have been many attacks based on this afaik, while on linux you can do gcc yourfile.c -o yourfile.png.bmp.bin.whatever and it will still run correctly based on the magic bytes alone
extensions are still implemented on linux. always have been. it's the primary way to decide what apps open what file. reading the first few bytes of a file is so inefficient compared to reading the last few bytes of the filename
you can't guarantee every file from the internet will have magic bytes either
It is more on the gui side, but ok
kde will display a thumbnail if the filename is .png . not if its .blipitybop. even if it has a magic byte
the purpose of those is something else. not general use
you're way off base bud. extraordinary claims to say linux doesn't use file extensions. ugh. i quit
Nice, always using gnome or openbox. Linux doesn't have file extensions as the operating system, there is no such things in the linux kernel, period
if there's no file extensions in the kernel at all, why the fuck does the code implement .ko file extensions? honestly. confidently wrong beyond anything dunning-kruger could've predicted
It's all for the user only, you can name anything any way you want
Build your own linux from scratch and check it if you don't believe me
nothing about file extensions are a microsoft thing. they come from the early unix days. inherent to file systems
Between A1111, forge, and comfy, which UI do people prefer?
I dabbled in A1111 a while back but have heard good things about the others too
yeah, cause unix is much older, but *nix systems are not using the extensions technically speaking, at least under the hood
if you mean what does everyone prefer, how would you ever get an answer, people prefer different apps for different reasons
i have. i tested myself years ago and got gentoo up and running. can't do that without dealing with object code. wonder how those files are saved?
Just a concept that you can change, now try to do it with .dll files on windows
i assure you, it is much more efficient to load the last bytes of the filename in the lookup table, than to seek the file, open it, and pull the first bytes out of it.
Magicbytes are a security measure for servers to verify files are what they are. not to replace file extensions
True
First part, at least
(╯°□°)╯︵ ┻━┻
you could make some generalization like, comfy does better automation, auto1111 is more intuitive, forge is like auto, but more geared toward performance, but at the end of the day it's your experience
https://civitai.com/models/267242/proteus new proteus
hoping the belly button artifact isn't on this refinement. lets find out
┬─┬ノ( º _ ºノ)
You can't generate pictures here anymore?
File extensions aren't actually a thing on windows either. Rename some image.jpg to just image, right click and open with paint. It still loads. The extension just flags what should load what.
it was the weirdest flex. it was the strangest flex.
some guy asking how to use a lora on linux, starts telling him theres no such thing as file extensions

So they do do something then…
Any chances that bot get back online !?
Beep boop. Image requests in #🏞|general-with-images please.
need to be invited to the bot this time around
a doubled egg for each celebration
chili rice in a jar
obese slopathen
doing stuff the taco way
tronisanator landings
can’t get rid of the sour obese slopathen
skank tallat marks author
a vengeance doubled egg for each century
giffa gabba from the tyrants
oop i forgot abt that gibba gabba
Beep boop. Image requests in #🏞|general-with-images please!
For the actual file? No. They are essentially just a tag like putting a sticker on the outside of a folder. It doesn't change or have anything to do with the actual file contents.
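For anyone following the extension vs magic bytes back-and-forth, here is a small sketch of the difference. It assumes only a few well-known signatures (PNG, JPEG, ELF), and the helper names are made up for illustration. It shows the two kinds of information side by side: what the name claims and what the first bytes say. It does not settle which one any particular OS or desktop actually uses.

```python
import os
import tempfile

# A few well-known file signatures ("magic bytes").
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"\x7fELF": "elf",
}


def sniff_type(path):
    """Guess a file's type from its first bytes, ignoring its name."""
    with open(path, "rb") as f:
        head = f.read(8)
    for magic, kind in SIGNATURES.items():
        if head.startswith(magic):
            return kind
    return "unknown"


def extension_of(path):
    """What the name claims: the text after the last dot, lowercased."""
    return os.path.splitext(path)[1].lstrip(".").lower()


# Demo: PNG header bytes saved under a misleading name.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "picture.blipitybop")
    with open(path, "wb") as f:
        f.write(b"\x89PNG\r\n\x1a\n" + b"\x00" * 16)
    print(extension_of(path), sniff_type(path))
```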
But anyways, this is a stable diffusion server, not an acktchully debate server
Hi! Which chat can I ask for advice on workflow? I'm interested in how to change the style in ComfyUI, to make an already upscaled 80's photo look like a modern photo without changing the face. And secondly, where do you discuss Lora's training here? I have a lot of questions about training. Or maybe there is a better discord for this? (noob mode)
we have a comfyui channel and a training channel
where is the free gpu channel
Hmm, could you point it out pls? I don't see it 🙂 I see comfy only in the dev section, but my question is not dev related. And which one is for training?
Workflow sharing and general Comfy UI discussion.
use A1111 or Forge for beginners
ComfyUI gives more customization but also increases the difficulty for using
What about this what were they using, ComfyUI?
yes
#🔧|finetune sorry forgot to link the other one, this channel is where folks usually talk about training
I tried to install A1111 (couldn't get pip to update or xformers to download). I know this isn't tech support, was just saying
Hi
also tyvm
which is better you think
forge or A1111
which specialize in what
forge is a1111,its just a fork from a1111 thats better optimized but not all of the a1111 extensions work with it
Fooocus best for beginners, a1111/forge intermediate and comfyUI advanced
Acly extension for krita also pretty easy
could you go into details? I'm trying to decide, or find somewhere to read about it. Sorry to ping
Who knows how to use Stable Diffusion? I am a newcomer
I'm not usually one to say this, but Google and YouTube are your best friends. Look up "how to install stable diffusion automatic1111" and you'll get plenty of tutorials
Hi
I am following one
but xformer wont do anything for me
I feel sad
What card do you have?
NVIDIA GeForce RTX 2060
🤦♂️
He was asking about .pti extension, which can mean many things even on windows, and he was on linux, where ppl name things as they wish
Linux uses the same python scripts that look for the same file extensions chief. This stuff actually is all root down in linux and ported to windows after the fact
What stuff?
pytorch. the whole field of ML
You are dead wrong
weird hill to die on. the linux ecosystem has plenty of support for file extensions and they're used extensively. Find an expert you admire, since you've decided you know more than i do on the subject, and ask them for clarification.
its 3am and you woke me up with that notification. now kindly drop the confidently wrong act
For years Nvidia support for linux was so great that Torvalds said "Fuck Nvidia" in a public speech. The first CUDA release was for both windows and linux, but nobody used the linux version afair
Your 3 am, your problemo, for me its 11pm
Yeah I know. I was there. Back when i built gentoo, dapper drake ubuntu had come out earlier that year and i was running beryl / compiz desktops with nvidia drivers. There was a lot of hate for the binary drivers then too since no source. Been dealing with linux "experts" who say they know better than me for decades. i just keep doing what i do instead. file extensions go brrrrr
Linux makes up such a small percentage of non-server PCs that you'll just end up in echo-chambers when talking about it. I think last time I checked, Linux was like <2% on steam hardware survey and that's counting the surge of users with steamdecks? Yeah, that's why drivers have always sucked for Linux...
Nice to meet you, I'm Japanese.
Can I ask Stable Diffusion questions in this Discord group?
Yes.
thx!
There are various items on the left side, where should I ask questions?
(If possible, if you add a reply, you will receive a notification so it will be easier to notice)
Here is a good place for general SD related questions. for technical support questions, go to #🤝|tech-support
Thank you, I'll write it over there!
is cascade unoptimized?
I have no idea but it runs on 12GB
maybe even 8GB, idk
in comfyui, weights are loaded and offloaded so it never exceeds the VRAM limit for me
(this is probably how we might see SD3 + T5 run on 8-12GB too)
i have heard that it requires MINIMUM of 18
which is why i was asking
@warm junco @bleak matrix⚠️ @still glacier
no its better to be cautious
I probably shouldn't be so optimistic about SD3
I just want to believe
okay I'm gonna load in Stable cascade models and see the highest peak in VRAM usage
Okay, on a batch Size of 4 it goes above 8GB
Can I help?
hmm decreasing batch size barely decreases VRAM 🤔
yeah for me, batch size 1 on the Stable Cascade Stage C model it peaks at around 8.1 GB of VRAM
so maybe 12 is needed 🤷♂️
Taken care of 🙂
Thanks!
What do people think of ELLA and SD3's long wait? I feel like we'll have any LLM hooked up to any diffusive model within weeks compared to months o.o
i have no strong feelings one way or the other
SD3 is probably going to need 18-24gb of vram for the quality levels they've been demoing. I'm sure they have some quantized versions, but those will obviously sacrifice quality and accuracy.
Minute one people will be able to just use llms to make good prompts, since SD3 has good prompt cohesion. Just take a simple idea for a prompt, run it through an llm with an instruction like "take my prompt and expand it into 100 words, include XYZ descriptions, yada yada yada"
Fooocus has a built in gpt2 prompt expander that does this under the hood. Same with dalle and MJ, they use an llm to detail prompts
I think prompt cohesion isn't going to be as useful when the model is censored. The time taken to safety check SD3 will be pointless if people find alternative ways to get other diffusion models to work effectively, such as 1.5
95% of people that use SD don't actually care because we don't sit around making lewd waifus all day. I'm fine with it being censored. Plus, if you're really that interested in making that kind of content, just do it in 1.5, upscale, unsample, resample it in sd3
The censorship of the model is a censorship of creativity. Maybe some hobbyists won't mind it, but serious creators don't want to have their tools limited
I got banned on dalle-3 due to an error in the filter when I asked for a very sfw artistic picture of a cosmic angel... The filters are not perfect, and I don't see the point of moving to an open source tool only to be limited yet again if there is censorship
Dall-e 3/Bing AI basically censors or bans you for drawing women at all haha
Actually the reason I recently moved to SD is because I was trying to make art for a Pathfinder character who's a female fighter
And her design isn't even sexualized
But if you use the word woman or girl and anything related to being physically strong
It censors you
Yeah, they really censor strong/powerful women for some reason
You can make a shirtless guy shaped like a Baki character though, that's obviously fine 😂
but sdxl isn't censored right?
All base models are afaik
With LCM, Turbo and Lightning models maybe its closer than we think
Yeah I just mentioned those since it looks like we're heading towards faster models, at least for image generation
Latent Consistency Model
They are much faster models
Ive seen people reaching like 12 pictures per second
Thats about generation
I just used those as an example cause im more familiar with image generation
Anyone here who know of a tutorial on how to make your own model? I've used a model but it's like 1+ year old and would like to make my own with the images i've created..
hm, im having some memory loss or something, i make a few images, then after while im getting out of vram error, then i restart and i can continue = /
thats the bad one,plenty of ppl tested it and is not very good
I'm testing it, getting good results but I wasn't getting bad results before, so I don't know if it is doing much, which is the good one?
Ive read somewhere that it wasnt even meant for upscaling? Not sure
yea its not made for upscaling and thats how most ppl use it
how many imgs you have and whats your gpu
whats the best image tagger tool to implement in comfyui workflow?
In the guide it gives an upscaling guide
doesnt matter what the guide says,the results are the ones that matter,i already deleted my imgs so maybe ask @shell tendon he made several imgs with it maybe he can post the results in sdxl chat comparing them.
Several 1000+ 😄 and 2080ti
I just made this, it works (!?) but really as I say I should test it without it, I don't understand if it doing something #🏞|general-with-images message
As I say before, Ultimate Upscaler seems to work fine for me
you should probably make a lora instead or just download a better checkpoint,u need like 50k to 100k imgs to make a good checkpoint and all those imgs dont fit in 11gb vram
There is tons of checkpoints out there and people are doing them. I wanna learn on how to create my own
yes for 1.5 checkpoints a 3090 should be enough
we talking about training 🗿
ah i see, sorry my bad
which is the good one?
there isnt one yet
Mm, just tested it without it and it is upscaling just the same, I don't know if it is doing anything
yea theres no difference with vs without it
Can anyone help me by explaining how the supermerger extension works (the Merge Block Weights function and the LoRA merging)?
theres also one for anime but idk i havent tested that yet
yes me neither
When you cough up millions of dollars to create and train a model like SD3, on god knows how many A100s, let me know... It's their product and they get to pick how they want it used. Just because it's "open source," doesn't mean it's "open freedom" or "open to degenerates feeling entitled to be able to make degenerate things." If anything, serious creators, as in commercial creators that will generate revenue for stability, are far more likely to want a censored model anyways.
wait you can generate revenue for stability if u create imgs?
Good thing is if it's like SD or sdxl ppl will find a way to get nsfw soon after release 
https://stability.ai/membership Unless you're a pro or enterprise member, you're not allowed to use stable diffusion commercially. Also, the three tiers of membership are in reference to the core models list and SD3 isn't in that list yet. It might get put there, it might not. Maybe they will have their own set of commercial rules for it, we don't know yet.
damn so all those ppl in twitter and patreon makin money with sd must be paying a lot
Go into any AI art community and ask the creators there. Literally nobody will want increased censorship lol. It's a tool, creators don't want photoshop to tell them what to do
StabilityAI likes some of the buzz that hobbyists can generate, but hobbyists don't fund massive projects like this. Technically, StabilityAI can go after the people making money with Patreon and whatnot (legally, you are required to enforce your IP or you can lose rights to it) that aren't paying for a pro membership. Commission work with SD is still commercial work, so at the minimum, they'd just need to have the pro membership which is cheap(unless they are pulling in over a million). "For creators and developers with less than $1M in annual revenue, $1M in institutional funding, and 1M monthly active users (all three must apply)"
tbf i wish stability had a donations page,maybe we could help them a little bit not a lot but at least help them with all those training costs
I think the censorship happens to save them from legal fights and stuff, they are on the safe side releasing a model thats censored. What we do with it is on us and not their fault.
You are allowed to use old models commercially from my understanding, only the new models from turbo and upwards are covered by the membership iirc
SD3 will surely be part of it as well
I mean you can donate by buying the membership and even get some benefits with it. You could also buy clipdrop credits or what else they sell 
if they make sd3 available early to test on some site with credit, ill gladly drop some coin for it
Read the core model page, they explain the other models not listed and how the terms are listed on HF/git. I imagine some of the other models aren't allowed to be used for commercial purposes. As for SD3, we'll see. It's a big leap that's going to directly compete with the other juggernauts, so now might be the time they want to get a big return on all the RnD over the years.
Yeah it does not list any 1.5 models or the normal SDXL for example
Yeah SAI only recently started doing the premium membership thing
older models aren't under it
Which means you need to refer to the terms of use on HF/GitHub like I said. If they say they can't be used commercially, then you can't use them to make money at all; legally.
For 1.5 im pretty sure you can use it commercially. SDXL was a special case iirc. Dont pin me on anything I say here 
Yeah it's an openrail M license, i remember reading it a while ago
You're allowed to sell anything you create with 1.5 models
Not sure how custom finetunes are handled tho, civit handles things differently
SDXL 1.0 is listed as open rail ++-m on github, it can be used for commercial as far as I know. Turbo and SDXL 0.9 have different licences
It's good practice to read what license is offered on the model page, sometimes it can happen they merged with models that prevent commercial use, but it's rare
Oh yeah turbo was the outlier
where can I make the images?
bot is offline for now. you can generate images locally or use online services to generate
where do i go for requests?
can you not read a single message above you?
im in the wrong discord server then
no i remember a server that had a bunch of channels where people can request and other people would make images for them
maybe it was called unstable diffusion
no i need the image now while it can still be funny in context of the conversation
you can make an image for me?
Does anyone here use SD for game development? Making 2d/3d assets?
Good afternoon, everyone!
I dont but there's a guy over on the stable diffusion training discord that does, goes by Hackmans
❤️
How are you?
It seems that using Playground v2.5, semantic understanding, attribute matching and spatial position relationships can be better achieved. With the addition of Prompt-to-Prompt, accurate replacement of local objects can be easily achieved! ! !
But what I want to emphasize is that Prompt-to-Prompt is really easy to use! ! ! GN (AGAIN)!
The original post quoted is the picture of #SD3 posted by @EMostaque at that time. You can make a comparison:
fine! happy that spring arrived & finally sd3 is cooking! what about you?
so 12 is needed
that is, somewhat fine i guess
way more okay than 18
doubt it could get optimized even more
Me too! Oh, just hanging out, and doing some 3D modeling!
thank you a lot
anyone implement it yet?
i ran their test script and it worked. haven't seen it as a comfy node or extension yet
https://github.com/philz1337x/clarity-upscaler very cool thanks for the link @pine fiber . can't find anything on the reddit sub right now. The "my son" meme brigade is burying any new content.
lol
i don't think it's a healthy community anymore. people downvote new releases of tools and hate on china constantly there. the tone is just.. sometimes tone policing is needed but that can kill a community too.
lost cause really.
yeah i need to find new information sources. It's just a cesspool on reddit since the emigration happened.
i'm on x looking for people to follow, but the main feed keeps throwing anti woke shiit at me. i dont even follow any of those people. x just figures i'm in the demographic that needs to see that shit.
Time to move to Threads
X is very good for ai and stable diff related papers news, it was already good during 1.5 period, the problem is that a paper without a comfy node or a1111 extension is not very much for the normal users
Interesting to see how much this improves 1.5 and sdxl prompt adherence
SD3 when
Sd3 already has t5 lol
SD4 when?
u mean sdxl
Next presidential election
No, sd3 according to the paper
Ok I'll prepare the RTX 7090 with 512GB of VRAM 
I just have a simple 4080

I'm back from the future and the 7090 will cost $4,500

I wonder how much faster a 4090 will generate SD images compared to the 4080
I still have a 3060 so I don't know 😂
I've always dreamed of having a 4090, a bit like everyone else
8 months ago, I just didn’t wanna pay an extra $1200
How much did you pay for the 4080?
I think it was $1700 AUD
that's too cheap
in a decade nvidia will still only give us 24GB max on the top cards 
We deserve more guys, we have to fight to get 512GB of VRAM 😂
oh god. is it? does it have to be? it might just be
Better off just buying used server GPUs
24gb limit is more about the chip density than what nvidia is willing to provide. TSMC's usual scaling plan was wrenched up by covid. won't be till 2027 that they get factories producing denser chips online
Threads like creating threaded conversations, or the meta thing
there's only so much supply and they don't want to compete with themselves, so only enterprise get cards with a lot of chips
did yall see the new google sima?
Canada sued Google over Google news, and now all their AI services are banned in Canada as punishment. We don't get any of Google's ai up here
basically extortion
You're not missing much
what about the tinygrad things
the tinyboxes
Anything google does in the machine learning space is a bad time. I know how they're willing to wield their models.
google's a prime example of why the worlds largest advertising company can't have exclusive control over models. Why open weights matter so much
can't wait for lavi bridge on comfyui
i paid $1200 after sales tax for my 4080 in Canada in December. It was the only card that retailers weren't surging the price way over MSRP. The sales guy even took a slide at me for not getting a "better card" from the 30 series. I laughed. Youtube tech journalists spread so much fud.
i was looking at the code. everything looks straight forward. if i had more of a basic understanding of diffusers and the math, what goes where kind of stuff, i bet i could port it myself. I'm surprised it's not ported yet. People seem to be ignoring lavi-bridge
llama 13b at 4-bit can run under 12GB iirc 🤔
when i run the t5 test file, my vram only hits 7gb
also I wonder why stability chose T5 specifically out of all the LLMs.. is it cause it's an encoder? easiest to implement?
damn!
does it have int8 anywhere in the code
like bitsandbytes or whatever is used
no idea. a lot of it is diffusers calls.
i'm not particularly experienced with python let alone diffusers
yeah, i think it loads everything in half precision. sounds right
it was the t5-unet.py test file, so i'm not sure which t5 it uses. probably not large
Yes because it's an encoder-decoder. GPT and Llama are decoder only so they don't "make" a text embedding that can be used.
documentation in the project is pretty sparse
ah that makes sense then, thank you
theres a llama2 adapter in the project too but i couldn't get it running
i got the weights for llama downloaded through facebook's weird download script, but when i point their script at that folder it doesn't recognize it
T5EncoderModel.from_pretrained("t5-large")
there you have it. large . surprising
I just want to know if that means XXL or not, cause if that's true, 8-bit could probably get it running on 8GB and less
this would further prove that SD3 could possibly run on our lower vram computers
flan-t5-xxl's model card was modified from t5-large, so this could be it!
https://huggingface.co/google-t5/t5-large the large model card. i think its the one ella will be using too
loading it in 8-bit is just as simple as doing this
model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-xxl", device_map="auto", load_in_8bit=True)
code often is simple. learning the incantations though....
at least it will be easy to implement for them
but yeah I would not even understand a bit from the actual code that is running beneath these

yeah my toying around with the lavi bridge code, i have no idea what i was doing, but it gave me lots of hope for the coming days
i'd love to do more testing but i hate editing code to make prompts haha. That's how i first started when i found a colab script for stable diffusion (no ui)
This offers means for trading off model performance for improved memory efficiency, which is particularly relevant for the 4.7B parameters of T5-XXL
🤔
its weird cause google's version of XXL is 11B
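rough weights-only math for the numbers being thrown around here. a minimal sketch: it assumes memory is just parameter count times bits per parameter, and it ignores activations, the diffusion model itself and any framework overhead, so real usage will be higher. the parameter counts are the ones quoted above (4.7B, 11B, 13B) and `weight_memory_gb` is a made-up helper name, not something from any library.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory needed for the weights alone, in GB (10^9 bytes)."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1e9


# Parameter counts as quoted in the chat; treat them as rough figures.
models = {
    "T5-XXL encoder (4.7B, per the quoted paper)": 4.7e9,
    "T5-XXL full encoder-decoder (11B)": 11e9,
    "Llama 13B": 13e9,
}

for name, params in models.items():
    for bits in (16, 8, 4):
        gb = weight_memory_gb(params, bits)
        print(f"{name} at {bits}-bit: ~{gb:.1f} GB of weights")
```

under these assumptions the 4.7B encoder is about 9.4 GB at 16-bit and about 4.7 GB at 8-bit, and Llama 13B at 4-bit is about 6.5 GB, which lines up with "under 12GB".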
the thing that excited me about the repo most is there was training code too
I think they rushed out to beat Ella with their methodology haha
I'm happy that they will release SD3 with stuff like controlnets, fine tuners and it seems a Turbo model too
i should read about turbo models more. i like turbos in general. that sound when a turbo charger is spinning up. ohh baby 🤌
honestly the removal of T5 is so contradicting
we observe limited performance drops when using only the two CLIP-based text-encoders for the text prompts
Only for complex prompts involving either highly detailed descriptions of a scene or larger amounts of written text do we find significant performance gains when using all three text-encoders.
So is it a limited performance drop or not man

isn't the point of SD3 is to have highly detailed prompts (or text, but that's less important)
without T5 the context length isn't 512, it's 77 again (or 77+77 if both clip models are somehow used)
they won't
they're pretty entrenched yeah
most people stay on SD1.5 because of corn
it's got the juice
and performance wise too, SD3 will probably only run as fast and as efficiently as SDXL
if at all
i probably won't ever turn off t5, even for simple prompts
same
thank god that it won't hinder generation speed
we only will have to wait like a second or two (if not less?) where it generates the conditioning
to me, 5 images a minute or 10 images a minute, negligible difference of speed. but if i were at 1 image a minute, i'd want some improvements (why i bought a 4080).
at some point, the returns on the speed are low
with highres fix it takes like 30 secs to generate an image with SDXL Lightning (on a nasty weak little 3080ti I know)
I'll likely be making images with highres fix in SD3, Lykon's results with highres fix are immaculate
I'm on an amd, it's probably more like 45s for me 😉
and I dont even bother with training anymore, I just do it with cloud compute
I could train fine on 1.5
a year or two ago I was dreaming about training my own 1.X models offline, being able to generate whatever I would want to
now I don't even feel like doing so
I only trained a funny SDXL lora on google colab a few months ago
vast works well enough, and the money goes a long way as long as you dont forget to shut the dang instance down, which I have once or twice
I'll bet you could easily train a dozen loras on $20
We can talk about SD3 all we want, but it still ain’t here ! 😦

my plan was just to watch people post images, when the time comes, that's good enough
one of my favorite general "rules" is the 1% rule. means 1% of invited people are publishing
The 1% would most likely be YouTubers and researchers

wait whaaa
the twitter #sd3 tag is like 3 people with access posting images, and a lot of other people hijacking the tag
once ella is released
yeahhh
ella been delayed
or someone trains a lavi adapter
No point if no one has it
If I would have gotten access it would have fricking exploded
I have a whole prompt list ready
4 people are posting images
yeahh
Realistic Pickles!!!!
1girl big gazongas
https://www.youtube.com/watch?v=Fdsomv-dYAc one of the most noble and true uses of AI to date

sorry, that never gets old
the fact you had a default pfp made it look real 😭
wow and this was 3 years ago
give us access and the world will be a better place 
lol
i'm not sure how much AI and how much genuine love and craft was used, but i love it
is it sfw? lol
yes
people keep saying a1111 or comfyui, but cant i use both?
sure, why not
for lets say
i generate something with comfy right?
i want to inpaint!
i send that one image to inpaint and proceed from a1111
you can inpaint in either
a1111 feels easier in usage
but of course, you gotta do the nodes and the this and that, I'm a a1111 guy
yeah inpainting in comfy is a little more bothersome
is there a way to merge these two
nothing the nsfw diffusion crowd comes up with is noble. not in my books.
stableswarm is like an easier UI for Comfyui I guess
but idk if that has inpainting
🤔
I guess you'd have to define what noble means to you
not doing the helicopter dick like willem dafoe is a big start
ok, actually laughed out loud there, nice
if i had to personify my definition, it would be Tyrion Lannister. Quite a perverted dude but composes himself in public still. Keeps it in his pants you know?
all I'm saying is, until I see that personified in SVD or animated diff, those arent ready yet
man I thought mattvidpro would be getting access
he would have made a good video on it
train a motion lora for it
so my plan is, generating an image using comfy and then using that image on A1111 onward for inpainting and stuff
and i found some video that does something similar
what you're saying is totally doable
the helicopter thing, well that remains to be seen
is it a good choice of action
i want efficiency
good point
like do you want a good image, and to be comfortable with your choice of tools, or do you want to force a workflow...no wrong choice here
i want both good image and efficiency
I personally stay within a1111, but I dont do a bunch of complicated batch-type things that would need that node framework
I read that as
results > weed
also, people here seem to stress about it/s. it's so silly if you ask me
get over it, hah
Hey!! Is there a way to automatically generate photos using Fooocus and alter the face without having to manually edit each photo with faceswap or inpaint?
first foocus user should get some kind of prize
i have yet not met any linux expert who calls himself an expert, especially the ones who claim to know better than anyone tend to be just super toxic noobs, like bragging that you have run your desktop on some 3d cube 15 years ago and i have to accept your authority and your subjective opinions as a general truth, sorry i wont, i am also using linux longer than many ppl here lives, so what
id say those last two statements could be viewed as toxic
til file extensions existing all over linux operating systems is just an objective opinion
top answer says exactly what i'm saying.
brrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr
only medication i take is an anti-inflammatory over the counter. bad knee tendons. That whole "flowwolf needs meds" thing was just invented by certain people and upheld by the mod team.
doesn't really get at me. Just makes me laugh whenever "Mean Girls" movie becomes reality
if I generate more images at once, will this strain my (weakass gtx1060) computer harder or will it just do them one by one in queue?
if you generate more at once, it will generate more at once. Not sure I get the question.
That's why you have both batch count and batch size in automatic1111.
batch size, how many picture do you want to generate at once for one batch.
batch count, how many batch you want to cook
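to put the two sliders in numbers, a tiny sketch. it only assumes what was said above: batch size is how many images are generated together in one batch, batch count is how many batches run one after another. `batch_plan` is a hypothetical helper for illustration, not part of the webui.

```python
def batch_plan(batch_count: int, batch_size: int) -> dict:
    """Summarise what a batch count / batch size combination means."""
    if batch_count < 1 or batch_size < 1:
        raise ValueError("batch_count and batch_size must both be at least 1")
    return {
        # Images produced overall.
        "total_images": batch_count * batch_size,
        # Images generated together, so this is what drives VRAM use.
        "images_at_once": batch_size,
        # Batches that run one after another.
        "sequential_batches": batch_count,
    }


# Same 10 images, two different loads on the GPU.
print(batch_plan(batch_count=10, batch_size=1))  # one image at a time, ten times
print(batch_plan(batch_count=1, batch_size=10))  # ten images in a single batch
```

on a low vram card the first setup (batch size 1, higher batch count) is the gentler one, since only one image is being worked on at a time.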
oh this is my first time using the actual stablediffusion, up to now I used civitai's on-site generator
I'm not sure where batch count/size is
depends of the software you re using
uhhh sd 1.5 webui? idk any other answer
that s not a thing.
sd 1.5 is one of the base sd models, webui is a family of software using sd models
ok so...webui?
that s not precise enough, what are you driving ? a car.
I don't even know how to answer you
what tutorial did you follow ?
I'd like to easily create endless video, for website backgrounds. Would you know an easy way to to do it, at least a way to realize that?
I already installed comfyUI. Maybe you have a ready to use workflow that works with a humble computer
so that would be automatic1111's stable-diffusion-webui aka auto1111
yeah.
batch size and batch count should be sitting pretty obviously in the middle of the page
there's a tooltip, a hover bubble for basically every button and slider explaining what they do.
what gpu do you have ?
gtx1060 6gb
I'd prefer to just use civitai but a lot of the models I want to use aren't actually usable on their site so I had to download them to use locally
open webui-user.bat, find the set COMMANDLINE_ARGS= line, replace it with set COMMANDLINE_ARGS=--xformers --medvram --no-half-vae
is that something I can do while it's in the middle of processing an image?
sure but it won t apply until you restart the webui
alright
Naw brl
Stable diffusion just bsod my pc
Downloading it
How bad is the experience with a 3060 12gb??
I am using this video card and with other gui fooocus works good
no problem whatsoever
Stable diffusion no
Is it my setup
I was using juggernaut
Maybe it's filling up all 12gb of vram
I think I need more vram for sdxl
Too big
sdxl can run pretty easily on 12gb
Is it like large language models
even on 8gb
No?
even on 6...
I ran sdxl quite well on 12GB
I have to go for tonight
8GB would probably be possible as well but using facedetailer and other fun toys would be much harder
to be clear I should probably avoid doing much else on my computer while generating images right?
On the border of all my images with juggernaut
I run text interface and image interface
Lol
can you load up a bunch of different prompts and queue them to generate in order on automatic1111? like if I want to go watch a movie while I generate 10 images for 5 different characters (so 50 in total)
There is an extension for that
Google a1111 queue extension
Wildcards is an option as well
just generate infinitely 24/7
What images you making?

Beep boop! Image requests must be made in #🏞|general-with-images
https://github.com/ELLA-Diffusion/ELLA/issues/13 Ella won't be released for SDXL
good, 1.5 is the one that would benefit the most with it
it'd sure be nice to have it for both
"good" weird
"weird" good
they didnt say they wouldnt release it, just said itll take some time because they need to review it
does it make sense with 1.5 even?
It doesn't sound likely based on the phrasing there
Agreed, it sounds like they would have difficulty working with the different licence
I'm on phone can someone tell me how to generate photos
with your own computer if u have a decent gpu
So I can't with phone?
what phone do u have
Android m01s
you have to pay
then no,u can run it locally on a phone but it has to be one of the top models
if you wanna mess around sites will give you free credits
no such thing as free and unlimited
hi. I have a question about embeddings, where should I ask questions about that topic?
Have the early access sd3 people been releasing produced images?
Thibaudz did
Hello! Possibly, someone knows any free desktop tool for AI TTS?
I want to TTS the whole book but paid ai tts like elevenlabs is extra expensive, but sounds nice
i mean i want to find really good tool, because 90% of things that i found sounds like pepsi vending machine talks to me
RVC-2 is voice2voice and combined with a weaker tts model it can do txt2voice pretty well
does anyone know why i can’t see promote post on my x post activity area even though i switched to a pro account ? 😵💫
i'll check that, thank you
Bark
which upscaler is best for realistic models ?
try lollypop
4x ultrasharp/nmkd superscale/nmkd siax/foolhardy remacri all do a good job, it's more about the process than the upscaler itself.. an ultimate sd upscale with a low denoise and a good amount of steps helps
If I have a specific prompt, generate 20 images, find 4 seeds that I like and want to rerun all 4 of them with hires. fix. Is there any way to input all 4 seeds to run them in the same go? Or do I need to generate one at a time?
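one possible way to do that in a single go, as a hedged sketch and not the only answer: if the webui is started with the `--api` flag, each seed can be sent as its own request to the `/sdapi/v1/txt2img` endpoint with hires fix switched on. the prompt, seeds, step count, scale and denoise values below are placeholders, the helper names are made up, and the address assumes a default local install.

```python
import json
import urllib.request


def build_hires_payloads(prompt, seeds, hr_scale=2.0, denoising_strength=0.4, steps=25):
    """Build one txt2img request body per seed, with hires fix enabled."""
    return [
        {
            "prompt": prompt,
            "seed": seed,
            "steps": steps,
            "enable_hr": True,  # turn hires fix on for this image
            "hr_scale": hr_scale,
            "denoising_strength": denoising_strength,
        }
        for seed in seeds
    ]


def post_txt2img(payload, base_url="http://127.0.0.1:7860"):
    """Send one payload to a locally running webui (needs --api). Not called here."""
    request = urllib.request.Request(
        base_url + "/sdapi/v1/txt2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder prompt and seeds; swap in the real ones.
payloads = build_hires_payloads("portrait photo of a knight", [111, 222, 333, 444])
print(f"{len(payloads)} requests prepared")
```

looping `post_txt2img` over `payloads` would then run the four seeds back to back. the sampler, resolution and other settings have to match the original generation for a seed to reproduce the same image.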
I'm so fed up with XL always getting the short end of the stick...
most of the time people don't bother with XL cause it's more expensive to train ig
But "At this point, nobody uses SD1.5, so the community will likely not show interest if it's based on it." is not a 100% true claim.
many people still use 1.5
yea pretty sure 1.5 is still the most popular local model out there and second place is sdxl
its cheaper to run and train and its less restrictive for corn
SD3 won't replace SD1.5, it will replace SDXL
maybe the ELLA creators are waitin for sd3 thats why they are only releasing the 1.5 ver
most people who use SDXL use it cause it's "cutting edge" anyway, higher base resolution and fidelity and slightly better prompt adherence, but it's also more expensive to run and it may not excel in EVERY region (like corn)
I'm not so sure if the 2B or 800M version of SD3 will replace SD1.5
probably, never tried it
I have on my 3060 12gb
Sometimes it works good
Sometimes no
Can I run it comfortably on a 3060 12gb
which one sdxl or 1.5?
Sdxl
oh yea u can run it on a 3060
yeah 12GB of vram is plenty enough for SDXL
Easily or barely?
easily
And what about training
depends how much ram u have
its not a question for VRAM, its about the cuda cores
and for training VRAM is more important
probably 16GB no idea, never looked into it
you can train SD1.X with like 8GB offline if not less with Lora
yea it will be tight but it works
I only remember 8GB because of dreambooth back in like 2022 lmao
Cuda cores mid on the 3060
god there's been some time since then
It's slower
my stable diffusion is stuck at the "Commit hash:" line in the command line
how can i fix it?
yeah it's frustrating. sdxl is pretty great but hasn't ever had the full feature set
i blame nvidia for their price gouging/enterprise product protecting vram limitations
I really really hope that changes with SD3.
pretty sure even with 48gb still wouldnt be enough,pony creator had several a100's and still took him several months to train their sdxl model with 2million imgs
help me 🥲
you might want to check in the tech-support channel, and provide better information
commit hash doesnt mean anything
sure 👍
well they will launch with controlnets
Here is a good place, but you're welcome to ask me directly. I've created quite a few.
Any ETA when API is fixed? Not sure where to ask API related questions
How come only Comfy gets a channel? What about Forge and A1111 and Fooooooooooooooooooooooocus?
btw my stable cascade thing in forge/A1111 died
it said i can only run it with 16gig vram on the webzone but i was able to run it with 12 but now its dead, i think it permanently suicided when i tried to render more than 1024/1024 and now it refuses to live again even after reinstal
I have been using Loba's skin on my style, but where can I find others and test them out? I have trouble with Lora's view and etc with it.
SD3 release when 😭
I'm waiting on sd4 tbh
yeaa frrr just imagine its potential
i wanna try out sd3 nowww
like if there was any way
Where can you see if you got SD2?
probably u need 40 gigs of vram... they have to optimize it and stuff im guessing
stability likes comfyui
and comfyanonymous is also a stability dev
Environment
well they have given early access discord invite to like 12 people and only 2 of them are posting on twitter
Fooocus is better than stable diffusion
one of them was kind enough to accept requests
you mean like fooocus gui guarantees good images
Config is better for sure
its kind of like an open source midjourney
fooocus is so easy
comfy is arch linux
Best quality.images are from fooocus
Does anyone else have this issue with sdxl
It doesn't follow the prompt
hmm
You can't change eye position etc
What's up with that
I can't make them look at something
And I can't make car chases
no idea 🤷♂️
never made those types of images
either prompt weighing or its not possible
I probably gotta train a lora
you can probably use 1.X models with fooocus
if you have loras or they perform better with those
Quick question yk the images you put in a lora
but fooocus is an sdxl first gui
Is it like a deep fake will it generate people in your lora
uhh
well you can train faces and people
you can train subjects
and you can train styles
with loras
I trained a TF2 lora for SDXL and that was fun
What gpu
idk which colab I used but it was in colab
even if I had enough vram I would not train on my PC
3060 12gb good enough for local training?
Privacy is king
my GPU clearly outperforms the T4 in a lot of ways (except vram lol), but still
When it's all going through my GPU I have a sense of ownership
Yeah
that's why I inference offline though
My dataset is derived off Google images and discord images
It's about 1gb of images
1 GB?
Of training data
gosh
that would be good for like a massive dreambooth fine tuning where it becomes a big project like DreamShaperXL
Gonna feed it a bunch of dashcam footage
Split videos into images and feed it that too
what model are you using fooocus with btw
Dreamshaper or juggernaut
if I recall correctly, fooocus's inpainting works with any sdxl model
cause its some sort of special model or adapter
What's the best model to train for realism and natural camera shots
In dim rooms and lighting
juggernaut and dreamshaperxl
I think
you can look at some juggernaut and dreamshaper images by searching this: ( from: darksm in: ✨|sdxl has: image )
no I mean you use one of them
just look at the preview images and decide which one you'd prefer
my images are similar to lykon's simple portraits
I do use highresfix though
highresfix can increase fidelity so much with any model
including SD3
this is why Lykon's images with SD3 look so extra nice
When is sd3 coming out
no idea
hopefully weights drop in april and way more people get access to the discord server towards the end of this month
the model looks good enough for me, but idk how much they are training it still
- controlnets + fine tuning tools they are promising + optimizations
etc
controlnets are going to be so much superior than sdxl controlnets
sd3 is AGI CONFIRMED
well
Mistral 7B with projector and internet RAG (GGUF) and SD3 image generation sounds like GPT4V
offline
I wonder if that would be possible with 12GB
Let me get this right
I went on civitai
And there is people selling loras of ai influencers
☠️
Well
Some of them are selling for like 60
You can buy a gpu for 300 but fair enough 300 is still alot
I'm gonna try multiple models in general image and see which is best
you can sell things on civit??
but isn't that not allowed by the license ?
I am drawing on stable diffusion but it is taking too long, I think my computer is good enough, can you help me figure out why it is taking so long?
32 GB RAM
RTX 4050 6GB VRAM
i5-13th Gen
wdym
do lora models you train have a license for example?
if you mean finetuning a model itself then sure, its probably still licensed under open-rail or whatever
what resolution are you doing? perhaps its a bit high
1024 - 1024
and how long is it taking?
that is a bit high
and i watch some youtube videos, so people have same system i have but they doing it like 1 minutes
why i dont know
time to invite me to the beta, i would cook
same, got a whole prompt list ready lmao
would you prompt according to how CogVLM would?
So with natural language and not tags
just to be 100% sure-whats your meaning of tags?
this type of prompting for example: "1girl, with blue shirt, in castle, fog, dust, masterpiece"
I just call them tags, idk the official name of the prompt type
u have 6 GB of Vram, that is barely enough for simple SDXL use in ComfyUI. what is happening is that when your system runs out of Vram it starts using your regular RAM, which is super slow, and that is why your generations take that long. get a better GPU with lots of Vram like a 4060 Ti 16GB.
you know I expected the r/stablediffusion subreddit to explode with a bunch of images and stuff but literally nothing happened besides like 2 twitter users posting some images for like 2-3 days and then nothing (one of them was kind enough to take a bunch of requests too)
yeah 6gb is pretty low, even less then i have
6GB was the perfect vram minimum back in 1.5 and 2.x days
fast enough to run with xformers and no medvram required
back in my day 👴
well one thing is for sure, amd completly sucks ass for stable diffusion
mostly natural language! some tags are very powerful. it gets playful when one tries to balance between those. as a photographer/visual artist i see the contrast changes that occur when balancing those. what i love: using subjective language! it often brings the magic out of models
im choosing parts for a new pc now
ah yeah
but its hard to decide on gpu
what are you nvidia choices
i think the amount of people in the beta is super low. just industry friends & so on
4070 ti super instead of 4080 I guess
lower tdp if I recall correctly?
and similar cuda cores
but the price between 4070ti and 4080 super is fairly close
not the type of people to share extensive, probably some none at all
its that much of a power draw difference? i didnt know
both has 256-bit memory bus, nice
depends on where you live I guess, 40 series is already lower than 30 series by default
if you live somewhere where energy bills are expensive of course go for the lower tdp and it might just pay for itself in the long term
but yeah 285 vs 320 may not be a huge huge difference 🤔
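quick math on whether the lower tdp "pays for itself". a sketch only: the 285 W and 320 W figures are the ones mentioned above, while the hours per day and the price per kWh are made-up example values, and it assumes the card sits at full tdp the whole time it is generating, which overstates real draw.

```python
def yearly_cost_difference(watts_a, watts_b, hours_per_day, price_per_kwh):
    """Yearly electricity cost gap between two cards running at full TDP."""
    delta_kw = abs(watts_a - watts_b) / 1000  # power gap in kilowatts
    kwh_per_year = delta_kw * hours_per_day * 365
    return kwh_per_year * price_per_kwh


# Hypothetical usage: 4 hours of generation a day at 0.30 per kWh.
gap = yearly_cost_difference(285, 320, hours_per_day=4, price_per_kwh=0.30)
print(f"extra cost per year for the 320 W card: {gap:.2f}")
```

with those example numbers the gap is about 15 a year in whatever currency the price is in, so a 35 W difference takes a very long time to cover a price gap of a couple hundred.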
hmm idk to be honest
4070 ti is $1050 and 4080 super is $1250, where i live
tax and stuff bring it up where i live, sadly
same fr fr
Oh
4080 super
4070 ti super vs 4080 super
that's a better comparison
yeah the memory bus is much inferior
same amount of vram but not the same speed/strength vram
does memory bus impact stable diffusion a lot?
yes


