#💬|general-chat
1 messages · Page 170 of 1
FP8 checkpoint (unet+clip+VAE) is supposed to be about 15GB, based on other discussions I've seen at least
^ this might be wrong, people debating lol
Its 8b. Flux is 12b
SD3.5 was a real surprise hit
From a dude that sounds like he knows what he's talking about:
100% positive. You can always verify the contents of a safetensors file via the metadata.
sd3.5_large_fp8_scaled.safetensors contains:
-
first_stage_model
-
model.diffusion_model
-
text_encoders
- text_encoders.clip_g
- text_encoders.clip_l
- text_encoders.t5
Small and cute.
nice, should fit
pinged out of nowhere for that announcement lol
The "random" channel on the comfy discord is talking about sizes a lot
exactly same here lol 😂
They were smart to avoid the hype trap this time around
Just found a Stable Diffusion 3.5 lora 👀 https://huggingface.co/Shakker-Labs/SD3.5-LoRA-Linear-Red-Light
but how tf, it just released lol
are people just waiting with compute to modify these base models lol
So any updates on licensing?
it's like you joined a cars club but the club became about spaceships
the transition from SD1.5 to FLUX, Pony, and SD3.5 is crazy, the more advanced it gets the less people it interests
in a few years no one will be able to run any model
wtf is INTELELCT1, nevermind
Every 2-3 years computing power doubles
Although it's probably approaching 4 by now
who can afford this
So it's either fork over a bunch of cash or be stuck with an old timer
it's crazy
Also there is for some reason a hard limit when it comes to the performance to compute "ratio"
In other words, no other way but forwards
fine, just wanted to rant a little
Me too, don't get me started on Nvidia prices
And market dominance
+refusing to give us more vram
ong
Speaking of which, do we know how much vram this model ideally needs
well just have to wait for quantum PCs - generate images in nanoseconds
use quantum entanglement to generate images BEFORE you write the prompt
Is it only available by using comfy?
Sd3.5 large? At fp16/bf16 it should need fit in 24gb vram. With quantization, you can probably make it fit in 6gb vram gpus.
Ah ty
Diffusers also supports it but I think apart from comfyui, no other ui supports it. Not sure.
Comfyui master race. XD
The FP8 variant of SD 3.5 fits on the Arc A770 with 16gb of VRAM*
Given you run the encoders on cpu
hopefully forge gets an update to run this 🤞
esp cince sd3 doesnt work on it anymore
Hello just joined the server 👋
pretty sure they will - but how long it'll take, who knows
here's the SD3.5 SAI huggingface space https://huggingface.co/stabilityai/stable-diffusion-3.5-large
if you're using it in comfy, all you need is the sd3.5_large.safetensors file. stick it in comfyui/models/checkpoints - there's also an example workflow
hello
besides here and CivitAi, are there any other good places to browse other peoples work for inspiration? I like just copying other peoples prompts and tweaking them to something I like. Need inspiration...
hello just joined
You also could just find images you like and use vlm like paligemma or joycaption to get prompts from those images and try to use those in flux to see what comes out 🙂
We encourage the distribution and monetization of work across the entire pipeline - whether it's fine-tuning, LoRA, optimizations, applications, or artwork.
Does this mean the license changed? Can we expect something like what PonyXL family of models / derivatives did for SDXL?
(I remember the license of SD 3.0 was problematic enough that Civitai had to clarify stuff before it was even allowed on the site)
Today the Pony trainer said they don't plan to use SD3.5 and are sticking to AuraFlow
Other trainers will be diving in though I'm sure
anyone have an idea what settings are used in huggingface space/replicate for the sd 3.5 large model? which sampler?
I remember he said that much back when SD 3.0 and Flux were new, so I'm sure it also has to do with the amount of work involved
can't just switch because a model is the hot new thing, I just hope it's not a license limitation
I'm really glad if SD 3.5 is good for finetuners, Flux Dev can use some competition!
Hello, recently the vae I was using for SD was marked as suspicious in huggingface, so I'm thinking about what other vae could I use for anime image generation... Any recommendations?
This is supposed to be a rough explanation:
UPDATE: the main issue is the enterprise license which Astra is worried about. And also a possible rug pull by stability. Which is understandable when he's sinking this much time and effort into the project
So some concern with the license I guess, still early though obviously
makes sense, if I'm not mistaken Aura Flow had a very permissive license closer to what SDXL had
would be really ideal if SD 3.5 can be licensed like SDXL but I guess they must be already worried about ways of making money with the release
(I'm not exactly sure if/how Stability made money from releasing SDXL)
After flux dropped it sounded like FAL wanted to stop putting money into Auraflow. We're already overdue for 0.4 judging from the time it took to go from 0.2 to 0.3, so maybe they already pulled the plug
hello
I'm trying to use Automatic1111's Web UI, I just installed today and I'm encountering some technical difficulties, is there a channel dedicated for helping technical difficulties?
nvm just found #🤝|tech-support
What are some key differences in the datasets that SD3.5 and FLUX got trained on? Will SD3.5 rival FLUX in quality?
The license changed and it’s better now but it is still not suitable for Pony. 1 million revenue is a large number but it’s ~80k MRR which is not that crazy for a company that does contracting and gets model licensing payouts(as per license it’s total company revenue, not specific model). The moment we cross that line SAI will tell us to delete the models. At least o have very strong reasons to believe that, you can of course call me crazy but I really don’t see a reason to gamble when multiple great alternatives exist.
TLDR: I have no I data to believe SAI will stand by this statement, and a lot of data pointing otherwise.
Pony is 90% about data so it’s really not such a huge issue to try different model architectures. Plus we have enough funding to try different things…
makes sense, wishing the best to finetuners
i hope y'all find girl laying in green grass in your lives..oh happy day!
as per usual i think the finetunes will be where we'll see the true beauty of this model. especially when it comes to aesthetics and i thin thats where it'll knock flux out
Why is there a waitlist for ComfyUI?
thats pretty sus tbh
i dont trust that link at all lol
for the desktop application - it's not out yet, they're taking early release applications
Are there any demos of SD 3.5 that have a negative prompt?
what are you talking about man, the model is so good, you dont need negative prompt :3
My process requires a negative prompt to combine styles.
But if you have some other way of discovering what the model thinks Picasso without Cubism looks like (for example), please let me know
hmm maybe i misunderstood the word "demo" in your question, i assumed you meant the online demos, like on huggingface, cause most dont have negatives. if you mean offline stuff, then i dont know why you cant put negative prompt yourself in your workflow, assuming you are using comfyui or idk...
you are now also using the word "process" which would suggest some offline thing anyway, cause online demos dont really allow for "workflows" or processes.
I meant my process of exploring the latent style space. And there are HF demos of SDXL that have negative prompts, and of Stable Cascade, and SD 3.0.
Which is how I knew to completely ignore 3.0. it was terrible with a huge bias to a particular style and subject matter.
so are you using mainly online demos or are you using offline too?
cause if you are using online only, you might have to wait who knows how long until someone decides to put a negative
Currently mainly using online demos. Was using the free tiers of some services for SDXL, but that got nerfed.
Well, hence my query, right?
yep i get it now, online only is a struggle i can imagine, i like the freedom with offline stuff :3
man i made a LoRA of my character but the black glasses f*cked everything up with the eyes consistently, the struggle is real
I'll eventually invest in better hardware, meanwhile online is sufficient for me to explore how styles can be manipulated.
If you want an example of the kind of thing I do, try a prompt (in SDXL) of "By Alan Aldridge" then try the same seed adding a negative prompt of "Mati Klarwein" and then try again changing the negative prompt to "James Rosenquist".
i dont get it... if your lora has black glasses, of course its gonna be on top of eyes, you are not gonna see the eyes anyway.. right?
but then again you didnt post a pic so no idea what it is supposed to be anyway :3
Additional basic examples in these threads:
Oooh..
.. 3.5 ':x
Is there a comparison of how much better it is to 3.0 in terms of images?
3.0 is deprecated
:c
a1111 is deprecated cough
What does that mean though
its just a bad joke to say that a1111 is terrible compared to the other alternatives
but im not gonna stop you if you want to keep using it
Just to be transparent, I'm not one for jokes or ribbings, nor understand them really. (Kind of why I deleted and asked in tech support instead once realising I'm in general.)
If there's something better than A1111, which ever is the best I'd like to use it, assuming it's free.
it's usually for beginners, cause its easy for them to work with stable diffusion stuff. more advanced users go for comfyui and variants that use comfy in the background
Ah, I'm a beginner, yes.
Oh my god.
That reminds me of umm..
There's this tool I used to use when I tried to do databasing.
I was really confused by it.
if you follow the youtube series by scott detweiler for example, he has a very nice friendly beginner tutorial on how to use comfyui from simple stuff to more advanced workflows, it feels like some udemy course for free lol
you could learn comfy with it
Good grief.. ':<
I need help with something kind of embarrassing... I really want to see how stable would take my prompt, but I don't have a computer or anything powerful enough to handle it
.. I'll try it :U
❤️
you can always try online demos maybe?
The demos freeze my phone.... I'm using an old android
oh wait you said embarrasing LOL
I'm taking 'embarassing' as NSFW, so I doubt he could. ';x
welp...
i mean you could ask someone from here and they could dm you LOL
Yap :3
No, it's not NSFW, I just said embarrassing cuz I'm poor yet still on this disc lmao
Aw :<
it's ok man, dont hide it, we know what you want to generate 🙂
Bwahahaha, I'm trying to generate a villainous character design
:U!
Was it some UML tool?
whats the latest easiest packaged way to startup on stable diffusion, is it still automatic111?
omg i remember UML stuff , ugh
anyone have experience with the intel ai playground?
not a single soul
figured
i bought a a750
very pleased so far
wanted to try AI, don't know if theres compatibility issues with the normal apps
testing out GPU offload on LM studio rn
its not that complicated as it may seem tbh
spooky lady got spooked
Spooked :<
you are doing fine :3
im happy with my lora, even though it only has like 30 images in its training data and needs a decent amount work and finangling with settings to get a consistent result, the pure dopamine when i get something that resembles my character almost perfectly is like crack
i know that feel
Is SD3.5 like 2, crap? Or is it worthwhile?
It's lower quality than flux dev but with a lot more variability. In other words, really, really promising for finetuning
Also with surprisingly little censorship
how do you make sd3.5 generate anime?
not having much luck with retro anime style, a woman with long flowy blonde hair, wearing bold makeup, in her casual t-shirt and denim pants, navel cutout, indoors, solid red background. dramatic perspective.
install this node https://github.com/BlenderNeko/ComfyUI_ADV_CLIP_emb
switch it to token_normalization: mean and weight_interpretation: A1111
then try cranking up the weight on the anime token
you should probably go play with it and see what you think
use the word 'anime' in your prompt?
did, but doesn't work, i think artist name ref is needed
i wouldn't use the word retro - that's likely to throw the AI into the wrong area of latent space.
Question: Is there a way we can use SD3.5 through a web app?
Fal will have it soon
many will
sure. go to mage.space and make a free account then look for the General (stable diffusion 3.5) concept
unlmited gens
Is there a unet-only SD3.5 fp8 anywhere?
What happened to the original dream studio? I see it only has SDXL on it, and an old version of Stable Diffusion 1.6?
even if there isnt, isnt there a way to "extract" a unet from a checkpoint?
Possibly, but last time I tried something like that in comfy it didn't work, and I don't know enough otherwise how to. Also downloading the text encoders for the 10th time gets annoying lol
but the text encoders are the same, you dont have to download them again :3
you don't like bundles? 😛
depends
I don't get even why it is bundled, it saves like 2 clicks for gigs of bandwidth and storage
3.5 does not use unet
no. unet is a neural network architecture. 3.5 uses a different neural network
you'd have to ask the people in charge of maintaining it
what about version 3.5? lol. Again, I don't know enough but what the weights without the VAE and CLIP is called
they're just called the model weights.
Well that's confusing
how so?
On Civitai that would range from loras, vaes, embeddings, etc..
those are all models of various types, they all ahve to be trained, just like your image model does, so they all have data they have assigned weights to as they trained.
I'm thinking the correct term is "Diffusion model" but that would be easily confused to possibly include the vae, and clip, at least in my opinion. shrug
a diffusion model is a model that uses the diffusion process to learn how to create images (simple explination) a VAE is a variable auto encoder model. CLIP is Contrastive Language-Image Pre-training and is an archetecture. when you talk about the AI doing this or that, what you're really talking about is a collection of models all doing individual jobs together to return a single result
Yes, but I just want the weights without the CLIP and VAE 😁 I'll just be patient and wait for somebody to eventually probably post it another day
the weights for the image model? and what will you do with them?
save the bits to my ssd and probably only run once before I download them again using a finetune or something lol
do you understand what the weights are?
does anybody?
yes. so when the image model trains, it looks at images and text caption pairs. as part of learning the information, it assignes 'weight' to the concepts and values and assigns vectors to them. the SD3.5 model weights are on huggingface - but if you aren't a developer you can't do anything with them. what you need is the actual 3.5 model.
from the US department of Commerce: "Model weights reflect distillations of knowledge within AI models and govern how those models behave. Using large amounts of data, machine learning algorithms train a model to recognize patterns and learn appropriate responses. As the model learns, the values of its weights adjust over time to reflect its new knowledge. Ultimately, the training process aims to arrive at a set of weights optimized to produce behavior that fits the developer’s goals. "
I'm really confused where we are disagreeing but I know all the "high level" stuff just nothing underneath if that makes sense
it makes sense. we're not disagreeing, but what you're asking is for me to give you all the molecules that make up flour sugar and salt so you can bake a cake.
but what you need is the actual flour sugar and salt
if i give you the flour sugar and salt, all the molecules will be in them, and you can actually use them
lol, All I know.. right now all I see is model bundled (with VAE and clip text enc), I just want the bundle-less. Because I already have the VAE and clip on my drive.. I'm too tired to continue this conversation and learn anything new
save model node in comfy
load the result in "load diffusion model" node
https://huggingface.co/stabilityai/stable-diffusion-3.5-large/tree/main go here and just download the .safetensor's file thats on the bottom of that page
I'll try it tomorrow
you should probably also grab the example workflow on that page as well
i don't think you need anything else, but if you do, they will be in the folders on that page
I get confused about all the repos and which is which
when I used SD 3.5 earlier I just set it to download from like 5 different repos LOL
and tried them all
https://huggingface.co/stabilityai/stable-diffusion-3.5-large/tree/main that's the huggingface stabilty.Ai page for devs but it also has the 3.5 large model on it
ah okay yeah
and it's also been uploaded to civit now https://civitai.com/models/878387/stable-diffusion-35-large
there is this experimental "scaled fp8" one that Comfy made but I haven't got it working yet
I saw that but also bundled lol
i haven't tried. i did download it but i don't need it
I rarely have the correct CUDA or pytorch for stuff like that
i'm also not using Comfy's workflow. i'm using the workflow from SAI as i know it has all of the controls in it that it should have
oh yeah I should switch to that, I was on the Comfy one
using the workflow of the model creators at first is good yeah
Looks like gguf formats are also available, but those take like 5 minutes to load for me for some reason (well the flux ggufs did)
I haven't learnt how to use gguf yet
the cool part about gguf is they work here https://github.com/leejet/stable-diffusion.cpp
no python
halo!
guys which checkpoint do you guys recommend for anime art?
Will some 3.5 model support 6gb vram gpus
Waifu diffusion
btw, do you know a good checkpoint to modify clothes
and a checkpoint to modify facial expression?
with the inpaint
I want to be able to close eyes
or open mouth as if the character is talking
where command Chanel
read the information in this channel #artisan-faq
where is this workflow
https://huggingface.co/stabilityai/stable-diffusion-3.5-large/tree/main on this page near the bottom along with the 3.5 large safetensors file
ignore all the other folders and stuff, that's for the devs that want to play with diffusers
do you have an idea how to get results like in the HF space or replicate with sd 3.5 large? @desert dagger
like which sampler they use and so on. the results im getting there are great
compared to my comfy results
you want one of my workflows?
yes! please
i'll DM it to you, can't post it in this channel and don't want to post in general with images
thanks
how long does it take to generate one image? using sd3.5 or sd3
@desert dagger How much experience do you have?
I got added randomly from someone named Jon Snow.... Their messages seem a tad off and it says they are a mutual server member to this server...
Hi ! im new in the server & to this community & its my first time using Stable Diffusion with the A1111 GUI, and my goal is to make some realistic women pictures taken from a smartphone in a room with natural lights & not too much exposed reflections on the skin, like in the real life
i already did some try, but i was never satisfied to the final results
Well first of all use comfyui or at least forge- then u can try some of the realism loras from civitai, there is even a VHS lora
aight bet, ill start with this
thanks
its always better to use a checkpoint & a LoRa at the same time ?
depends, if you want something very specific then loras help
On my machine, it takes about 30 seconds
I kinda suck at prompts
it works very well to go to meta or claude, tell them what AI you're prompting, tell them it's specifically drawing an image, and ask them for a prompt for it
dunno whats meta or claude. Those are image to prompt AI's?
https://www.meta.ai/ <--- meta - it's an LLM
Thanks man
it's better than chat gpt I suppose
chat gpt has limited amount of image you can send per day
everything is better than chatGPT - meta is free, and if you just give it a prompt, i'tll generate images. if you ASK it about generating images, though, it'll tell you that it can't
any good guides for using automatic 1111 api?
can you run sd3.5 in a1111
and if so what model im guessing medium but im not sure
im running it on a 3080 btw
(medium isnt out yet)
is it possible to use large on consumer hardware, guessing i should use comffyui
Nope
I tested it on H100 and took 9 seconds. but ti's not competitive with flux model
is there any solution to run flux model in a few seconds?
Try this: https://github.com/aredden/flux-fp8-api
Should only take 2-3ish seconds.
Also, sd3.5 large should be competitive with flux(few pros and cons for both), did you use sd3 medium?
hello?
Is SD 3.5 more akin to flux in "structure"? Or closer to SD 1.5 and SDXL? Thinking since we haven't gotten a proper tensorRT converter for flux, but maybe for SD 3.5?
Asking due to it using the same text encoder as flux
sd3.5 and flux use teh same neural network arcitecture
https://arxiv.org/abs/2403.03206 this is the SD3 arXiv paper
Ah, shit. I can forget about finding a way to speed it up then
Unless LCM loras is the rescue there for 3.5 due to not releasing only as distilled unlike flux.
Didn't fully understand what the paper meant as most of that is hieroglyphs for me as of right now :P
But got the gist that it more follows how it should do, vs the older models being less accurate
the older models use unext. sd3 and flux use a different neural network
Though, as usual, we're gonna have to wait for finetuned models, as stabilityA.I is still sloppy with details and especially certain anatomical traits
and while flux is frozen and untrainable, sd3.5 is flux, flexible, and trainable. just give the community time to write stuff
i'm going to take that as a personal insult
Yep, that's the plan, and i've waited for the proper release of the full SD3 since medium released in june. Now to wait for the popular trainers to go ham on the base model
a lot of stuff has already been implimented in the last 24 hours, and the dev behind AI toolbox has already got code in place for a 3.5 trainer. he posted that yesterday
Was more general question regarding how the structure was, and whether it'll be "soon", or quite some time until tensorRT has been written/re-written to properly convert and accelerate sd3.5.
Guys, I'm absolutely horrified....
it's the season for that
Yes, but I'm not in the way favorable
it's the month for that, too
Noice :) Cause i was going to try training for a 3.5 lora as soon as ai-toolkit updated for 3.5 as flux nailed my first ever successful lora immediately. But sdxl failed all 12+ times lol
My friend used 3.5....
he posts updates to his twitter profile, if you're not following him, you might want to
did they now
Used a generic anime model
there are no generic anime models for 3.5
so what model was it for?
I'm at the point where i forget i have a twitter. That's how little i use it. Plus pretty much stopped using it after musky elon bought it and killed the bird
It was just a common anime esque one
I don't remember what exactly, it was just anime ish
you can't jsut use a model here and a model thre and hope they work. checkpoints and loras MUST be used with teh model they are trained for - or you either get nothing or you get a real mess
and there aren't any for 3.5
We got a result...
Lol, found this on civitai:
SD3.5 Med/Lite (11GB) Improved Duel CLIP (Full Checkpoint) Duelling clips 
Maybe it wasn't 3.5 they said it was newer though
i'm sure you did. and i'm equally sure it was not correct. because you used a check point/lora written for something other than 3.5
there's a really cute kitten lora for 3.5 on there too
or you get a real mess
That's part of the fun! Especially when you use the wrong lora, and you'll get a deepfried meme as well. So 2 birds with 1 stone xD
3.5 doesn't need a lora or fine tuned for anime anyway. just needs the right prompt
Not lora.. VAE lol. Took me a good sec to remember xD
The result we got wasn't exactly within full coordination with our promp....
What's the name of the model?
It was... One might say very quesionable
No clue
tell ya what. go back to 3.5. get rid of the lora, get rid of the checkpoint. jsut use 3.5 by itself, prompt it and see what you get
ther's a whole bunch of 3.5 stuff in the #🆕|sd3 channel, including anime
using the wrong vae will get you interesting results for sure
Oh shit, forgot i downloaded the fp8 sd3.5, need to give that one a go
ROFL! forget you have a twitter, forget you downloaded a cool model... are you trying to do too much all at once?
you probably want to use that one with comfy's workflow
Also, do you know of a tool that can auto sort my models within a folder? 
As i might have dumped some loras intended for SDXL into SD1.5 for instance, so wanna quick move them into subfolders of sd1.5 and sdxl, then cut them from those folders where they belong :P
um no? but if you ask in the #🤝|tech-support channel, someone in there might
Popcorn brain mate. Welcome to the world of having adhd 
had 2 kids and a nephew that had that, one so bad he could only make sound effects to express himself. not a fun life at times
True
Though, not too sure if it fits as "support", as it's not really a "problem" per say
but that's where most of the really technical folks hang out
Aye. Nephew i have no idea if he has adhd, autism or what, but he expresses himself by overreacting quite like "that's so raven", or other "live action" disney actors when they exaggerate them expressing themselves lol.
Aaaand that's why i bought my loop switch ear buds. IRL volume knobs to shut everything up
3 decibel volumes to choose from
how old is he?
7
could just be in the first stages of puberty and out of sync with himself
And when he's playing on his tablet, he constantly narrates everything, at every second lol
Could be. I was quite the opposite. I was quiet, but my mental narration/fantasizing/activity was all over the place :P
and https://x.com/cocktailpeanut/status/1849201053440327913 you need to read that
sd35 refined models will mop the floor with flux.
😏
See this is the moment weve been waiting for
Definitely worried about Cat with4 gb of vram (send help). He surely shouldn't miss these historical moments.
im not using flux or 3.5 cause my computers not good enough lol, ill stick with SDXL for now
Fair... Whats your specs?
the bare minimum these days is around 8 GB vram I believe.
i have a (regular) RTX 4080 with 16GB of VRam and 32GB of regular RAM, and i tried Flux Dev, and it was slow but worked, then i tried to put a LoRA in and my computer started chugging really badly so i got scared and de-installed it, i might just wait until the far future where flux is easily available without all these shenanigans
I have a 3060 rtx with 12 gb of vram and 32 ram and I trained some loras for flux and they work... I mean its slower but it works - I can stack loras and it slows down but it works...
Takes a few mintues what can we do... at leats its local. after our next upgrade in afew years it will take a second probably.. Still amazing...
And I mean DEv too not Shnell...
i think i also tried with schnell too and put a lora and it chugged like crazy, i use forge for flux and automatic1111 for sdxl and 1.5 cause thats what im used too
A1111... well maybe thats your problem... try comfyu or at least forge
just try comfyui - u can set it to use the A1111 folders so u don't have to re-download al the models again
comfy is the most optimised always
idk about comfy's ui though with all the node connecting stuff, i like UI like automatic and forge
Yes. I was scared at first too for about 4 months but after that...
I gave in and I do not regret it.
hide the connecting wires if you don't want to see them. they're just for your information so you can see how the functions are connected
Comfyui is for "serious folks". A1111, Forge, Focus is for "casuals".
It's like an unspoken rule: everyone knows Comfy is in charge but no one admits it. 😉
man this is giving me a headache, all i want to do is to run flux with at least 1 or 2 LoRA's without my computer chugging and that doesn't take forever to start then generate, and that is also better then SDXL with prompt stuff and inpainting or whatever so i dont have to spend forever inpainting something to make it look good
Hmmm....
Well this is an emerging tech.
I am waiting for about a year now for a decent local video gen.
Flux will take a few minutes on almost any system now especially with some loras.
I started about a years ago with all this during SD15 times. It took forever so I upgraded to handle SDXL. which I can handle now ok but now there is flux and sd35 and hunyuan and so on. things move fast now and it will be a bit more time before we can settle into a pipeline we are comfortable with and that does what we want it to do.
At least 10 years from now we will be able to say "we were there when this started" because this tech will change everything we do radically.
yeah, i just wish it wasn't so confusing to figure it out
wELL OUR GRANDPARENTS TOOK A WHILE TO FIGURE OUT THE INTERNET
A SERIES OF TUBES
gRANDPA: HOW DO I DOWNLOAD THE INTERNET?
lol sorry for caps
keyboard is on pillow.
that's okay, today is capslock day, i think
Works stable Diffusion 3.5 with a 4070 TI 12gb vram? Or i get Out Off vram? I dont want to wait longer then 20 sec for a picture.
how and from where do i dowload stable diffusion 3.5 on my pc to run locally yk
How can I generate an image in this Server? like we used to do using Bots!
youll need quantized flux model. Should not be a problem with 4080
what's the deal with SD3.5? it's too early to bother using it, right? how does it do next to flux and SDXL? i've heard prompt adherence is much better?
you could use it right away if you want although its probably not gonna seem quite as visually impressive as flux to you in its current state
I don't think it is far behind though
I was trying to run latest stable diffusion 3.5 large model on g5xlarge with 24 gb of GPU memory . But seems like it is not killing the task due to less computation.
Does anybody knows GPU requirements for running stable diffusion 3.5 large model ???
It should run easily on 24GB (I've done this myself), run reasonably on 16GB, run OK on 12GB, then I think recent optimizations make 8GB possible
I just saw my system RAM is getting killed .. it's not the GPU problem
My system RAM is 16 gb
It's a base model so I think the best thing to do right now is compare it to SDXL Base, and I'd say it's an improvement. Flux Dev is definitely better than SD 3.5 base, but the real question is if 3.5 fine-tunes will surpass Flux Dev
Ouch yeah that will limit you. Try FP8 versions whenever possible (unet & t5xxl clip)
Fp8 version meaning what settings I have to touch ?
Or directly use fp8 model u are saying
and for sd 3.5 altough i havent used it yet: https://civitai.com/models/879701/stable-diffusion-35-fp8-models-sd35
Thanks !
yeah i heard of that, though im not sure what exactly that is, or if it'll work with loRAs or be as good as regular flux
do you use comfyui?
i use forge for flux
Quantize = Exchange quality with file size. Same quality as regular flux? No. Quite as good? Possible.
Will loras work: For me yes they do, but it depends on what checkpoint the lora was trained with.
Ive never used forge so i cant help you out with that.
But I can tell you that im running flux on a 3060 with approx 1:20 per image with lora on 1024x1024
This is my comfy workflow if still interested: https://civitai.com/articles/7292/flux-easy-workflow-lowvram-gguf
thanks for the info, last time i tried out flux with forge, it took me 2 minutes because it was unloading the model? then doing a bunch of things then loading the model again then generating?? when i looked in the command window, so i think ill try out comfyui for flux since i heard you can hide the node stuff which im not comfortable with
Yea newest update adds hiding nodes. Workflow is very very simple.
What you are talking about is the model offloading from gpu to ram. This is common procedure.
With more VRAM you need less offloading.
ohhhh, it was probably doing that because i wasn't using any quantized versions, luckily i was just messing around with it and made some comic meme panels that got deleted with the install (i forgot to move it)
exactly! (smaller file size, less offloading too) you understood! 🙂
im trying my best lol, AI stuff goes so fast nowadays
hard to keep up, for sure
FLUX 2 is probably gonna release tomorrow and make this whole thing irrelevant knowing my luck lol
With probably 48gb of file size huh 😄
400GB of VRAM required and tomorrow after that, its down to like 3GB
anyways, ill try to find the right version of quantizised flux thats appropiate for my 4080 and 32GB build
Imagine:
Like deforum
With time stamps and images, but it makes a video, like upload an entire script from a movie and it plays out
we'll probably have the singularity and agi by then (and hopefully they won't terminator or matrix us by then)
@unborn hedge https://civitai.com/articles/6730/flux-gguf
Checkout the Quantization Comparison Table
What is NFT DUE?
the next image gen model should really be called FLEX
FLUKE
and the small horse finetune will be FLEXXX
hi
does anyone know an AI gif generator?
I want to make like an animation of a character that I created
like idle, closing eyes and stuff
is free? 🙂
i've never paid anything to use it.
I wonder if I can input an image
docs page
you can do all sorts of stuff with it, go explore it
okay
Hi !
did someone here already done a faceswapping on a existing image on Forge ?
because actually i try to see how to do it, but it doesnt seem to work for me
ahh whatever... have a great day & good luck with Stable Diffusion yall, im too autistic with these kind of stuff fr
thank you for the help! i'll have to get more training data before i start using FLUX for my comic though, it'll take a little bit
would someone experienced with comfyUI give me proper launch args for a 6GB card specifically the 1660 Super as I am having an issue where my id 0 gen will be super slow (90-160 s/it) and in console it says that it's requested to load SDXL and then loads it before the gen. But then on my id 1 gen it will gen super fast at around 7 s/it but it says unloading models for lowvram mode and then unloaded 0 models, but it's generating super fast so I don't have to wait 20 minutes or longer for a single generation. This is in lowvram mode, but in normalram mode it's slow every gen. So how do I get comfyui to a point where I can always gen at that 7 s/it on SDXL
Anyone know if controlnet-union works with pony or illustrious based models?
@bleak matrix uhh
is MexxL_LCM2 some kinda super mega forbiden jutsu model? can't find it anywhere
i did a google search for MexxL_LCM2 and found several hits
Hi everyone, I'm newbie in SD and ComfyUI
welcome
I'm using Mac M3, I tried to generate image with Flux1.dev, it took super long to generate
is there any recommended lightweight model that I can use, to generate in seconds, so I can focus on testing ComfyUI functions?
you shoudl probably ask in the #🤝|tech-support channel
aah nice, I'll ask them, thankss
Hi all, new here
all i know is what i found on google.
guess it's gone then lol
yo! can anyone here help me to find out what A.I tool they used for a Comic? 😄
we can try. which comic?
im making a comic too! im gonna use FLUX because i heard its the most cutting edge generator so far and i need consistency and prompt adherence in my comics, especially with the super fine-tuned loRA im making so FLUX knows what my character is
@unborn hedge I'm trying to find how to get that rendered vibe the did with consistency ! You mean flux A.I? i will check it now
Image ratio on flux is too low! @unborn hedge
midjourney, stable diffusion, flux, and others, will all do that style
look on civitAI for comic book loras
you can't fine tune a lora for flux - and you probably don't need to. try just prompting it
but i see loras for flux on civitai tho???
yeah, i know. but you said "super fine tune" and flux - is basically just a giant lora itself. it has a very narrow range. stick in that range, you don't need a lora. try to go outside that range, you're going to get nowhere. if you really want to create fine tuned stuff, use stable diffusion 3.5 which just released
You need to train a model first step
@desert dagger at loras i found a video on yt
you don't need to train squat. just use the base flux model and prompt it well
But the quality doesn't look this sharp . or the backgrounds etc. it's basic comic
do you get the same character without training?
there are good loras on civitAI for comic book art.
the only i found close to the quality aspect and the render is Leonardo A.I
i have a specific-looking character though that im positive that flux wont get right
you're probably going to ahve to contact the creator of that video and ask him or her what they used
and i'm positive you can't train flux - wait a week till the trainers are updated and use SD 3.5
They say that they trained a model for the character from scratch and then do the whole thing. but they dont mention what A.I was used.
at a guess, they used SDXL
Stable Diffusion XL
oh
so if flux is a giant lora then whats all the other stuff on civitai? civitai says its a lora as well
are you familiar with what a LoRA is actually for? what it does?
i am
then you know that the entire reason LoRA was developed was so you could update the weights of a model and not have to retrain the entire model just to update information in it, or add information to it, correct?
im trying Lora right now. but i doesnt seem like you can use this for pro stuff. The image quality aspect ratio is ridicullusly low
i think you might want to start by watching some how to train loras and checkpoints tutorials on youtube. i'ts not that easy
putting a good data set together first is a skill
that's a rather odd aspect ratio
1792x2688 is the Ultra quality i can get from Leonardo max out.
it's close to the comics aspect
okay, but the AIs all have aspect ratios that they do well in, and outside of those, you get issues
if you want to use an AI to do everything, you're stuck with the AR that the AI you pick will do without issues
yes but to have a crisp render or a good image you need at least 1000x2000 and above
naw, you just upscale
otherwise it's only for showcasing on the internet chats or sites
you can upascale but .... meh
nonsense. and you don't need that large for print, either
it does the trick yes upscaling you are correct
but its not the same . for the detailes
up scaling sometimes for example makes the image realistic
unless you're printing on a web press
I'm talking for the new era of comics generated with A.I
not for general creations
yes . if you want to print also you need a good source
and i'm talking about actual publishing. you don't need sizes that large. and if you do, you can upscale the image with something like topaz, which makes it larger without changing it.
but before you can create your lora, or finetune a check point, you need to learn how to make a good data set. you really should consider watching some tutorials
i will check it yes!!
the character consitency is the hardest part !! @desert dagger
Topaz probably still beats the open source stuff on average yeah
the Enhance Everything discord server is good if you like upscaling
the majority of popular upscalers with .DAT .ATD etc were made by people there LOL
if someone manages to extract the control net from Supir so we can use it in regular Comfy workflows then that might be the best, but attempts have not succeeded so far
why do you want to do that?
what happened was the Supir team made the best Tiled control net of all time
but they locked it up in a lot of Diffusers code so it needs extracting
which model did they make it for?
oh its SDXL
StableSR is SD 2.1, funnily enough, and is still very good, proves that SD 2.1 is a strong model
2.1 is a very strong model, but 2.0 had ... issues and no one liked the 2.X series after that. so SDXL...
is there something similar to AnimateDiff but for flux 1 dev ?!
nope
feels sad
you could try using cogVideo with the images you create with flux maybe
@desert daggerare you pretty good with it ?
nope, but others are using it and they seem to be liking it
Hi everyone, I have a question, I'm looking for a new computer with a budget of around 800💲 that I can use for modeling and rendering as well as using sd and comfyui, any recommended configurations?

For training a lora? If ur using GGUF variants of FLUX you can probably train the lora with the base checkpoint. (I think Q8 performs best with LORA and Q4 the worst. Some said below Q4 loras wont work properly).
However if you want to use for example a flux 8 step model like Hyper Flux, youll probably need to train it with that as the base model. Just keep that in mind.
what settings would you use for generating realistic images with sd35, shift, steps, cfg, sampler and scheduler?
After finetuning a model, will licensing remain the same?
What about merging two checkpoints?
Mochi 1 is definitely the best open text to video if you want(it’s very slow locally, 1 min on fal), CogVideoX is considerably worse but supports image to video, trajectories and is faster.
gm all
Where is the best place to follow to see the latest releases of checkpoints and Loras to use?
CivitAI?
Well you know in the good old days you had to wait for film to be developed before you could see anything, now you wait for video gens to generate... :/
my video on kling has been almost ready for nearly a day XD
this is why I preach local, anything else is worthless, better invest in a good GPU
A day? Yikes, you can use Fal too, it takes mochi like 1 min to generate there.
i know i tried it - mochi has great promise
omnigen + mochi = movie studio
from what i understand voice/lip syncing is also very advanced by now tho I am not sure what would do that locally
I only ever tried Animatediff tbh, I wonder if CogVideoX is better, I like the image to video feature
it is better but not as good as mochi
until they tame mochi cog is our best local video gen tho
ohh ok, I wonder how it runs on a RTX 4080, I took a glance at the GitHub and kept seeing mentions of 4090 but I don't have that GPU
Hi, is there no more bots for freem stable diffusion imge generation ? I only see the artisan .
hello, tell me please how to generate image for free?
There is a fp8 Version of mochi out, with unloading of the VAE it should work with 24gb VRAM at least with like 160 frames...
still out of my reach...
https://huggingface.co/models?other=base_model:adapter:stabilityai/stable-diffusion-3.5-large a lot of LoRAs for 3.5 already on huggingface
Can anyone show if sd 3.5 can do, 'A woman lying on top of a pool of marshmallows.' As far as I know it can do, a woman lying on grass'. But can it do marshmellows.
can't post images in thsi channel, so posted it in #🏞|general-with-images
any suggestions for a model to generate artwork for a house track i want to upload to soundcloud? wouldnt mind there being text in there but np if not
with nothing more than that to go on, all the models out there can generate images that would work. more details maybe?
so I asked GPT to make me a prompt for artwork for my track i uploaded - ''Eerie, dark logo with the text 'Heavy Music' in bold, distorted font. Shadowy, misty background with a subtle, ghostly glow in shades of dark green and purple. Abstract, twisted shapes and faint, flickering lines, giving a haunting, mysterious feel. Minimalistic yet chilling, with a touch of horror and a supernatural vibe''
oh i need to post the img in the other chanel
not much like the prompt description .. but im totally new
yeah. this channel doesn't allow images so that it can be more conducive to chat :)
@fervent thunder https://x.com/_akhaliq/status/1849945005269336096
Any open source way to do what runway act one does
cogvideoX (not very good yet) mochi 1 (very good but not yet adapted to most consumer hardware)
@warm hull just in case i wouldn't go there, that's some random server
Wait i'm confused, what happened?
hello, i have rtx 3060 and i can't use 3.5 large error of memory
i do'nt understand
i try all driver and also cuda 118 and other
im bored
bored
the support ticket is a scam ?
you could just go to meta.ai and tell it to review what it knows of SD3 and then ask it to create a prompt
support ticket where?
I swear there was a guy who claimed to me an admin and directed me to create a support ticket. but the messages are deleted now
Is meta.ai any better than chatgpt? I'll try it
if it was here, the support tickets are legit. if it was somewhere else, i ahve no idea
not only is it better, it's free
@warm hull went afk but ya i assume it was some scam, why i said something, certainly nothing to do with SAI
the beware of scammers message was an invite link to a discord before it was edited
Ahhh ok
I'll try it out for sure and see how it goes. Thanks
I probably fell for that ngl i'm still confused. Cause there was some support ticket thing and i clicked on it but @desert dagger said it's legit
in this case crystal is wrong, it was an invite to a discord server named Support Ticket, that scream scam
they didnt see it before it was edited, so confusion
dude was quick lol
i said if it was for a ticket here - it was legit. if it's a discord server invite, it's not here
mb , just some misunderstanding
Is this your blog post ?
nope.
https://x.com/poe_platform/status/1849831370866164136 Stable Diffusion 3.5 Large is now on Poe
Sup everyone?
thanks this is great
yeah I wanna put vision models throughout my comfy workflows to have them judge how the image is doing LOL
Guys
I want to deploy a custom SD model(Juggernaut) on runpod. Is there a template i can start with ?
I don't know runpod but on vast the default comfy ones work well
you also save at least 30% by using vast
I already loaded up credits on runpod. You think it's worth it to make the change to vast ?
yeah
Any reason why you recommended comfy over automatic111 @fervent thunder
comfy is more complex but it is more powerful, and while it might be hard to start from scratch comfy supports workflow
I bet there is no Forge template in Runpod so yeah if you want to be straightfoward just choose A1111
For now i want something more straightforward so i might so for A1111
A1111 has maybe 0.01% of the features of Comfy
On vast can i build custom models(juggernaut) images or i have to build it locally and push to vast ?
I already have a live website with users so i'm looking for something that is quick and easy to setup , but i'll definately keep comfy in mind
I am pretty certain Vast have a Stable Diffusion template. I am not sure how you supposed to push the model onto the SD webui itself but well you gonna have to do that.
Hugging Face may be useful
also you barely need a template
there will be swift install instructions for any of these UIs on an empty server
Thanks. Have you ever tried pushing a custom sd model into a SD webui ?
I'm kinda new to all this so i dont understand what you mean here
Got it. Thanks anyways
you can install on empty server
or on a basic template that only gives pytorch
A1111 and Comfy UI are just software like Google Chrome or Microsoft Excel
you don't need special setup
hello
gm
hello
Hey guys, sorry for the stupid question. But I don't get the SD updates (2, 2.1, 3, 3.5..) I was using 1.5 a year ago with better and faster results, than 3.5 is doing right now. Why would I use a bigger/slower model than? I mainly use it for pictures of woman (nude and non-nude)
they're different models, different training. what i suggest is that you run the same prompt in each of them and compare your results
Well i would say that you don't need any face or hand restore, you can add text etc. so the base result follow the prompt better, ... Still it is possible to achive great images (lora, ipadapter, fine tuned models,...) with sd 1.5
Thanks, I will try that! But I should use a different sampler for the different models, right?
not if you want to see what each does and compare results. you want everything as identical as possible.
What about the seed?
keeping seed same would be best yeah
I think most people don't, when comparing models, but it would actually help
you're using different models. they ahve different latent spaces. seed wont' have any effect
what you want to see is render quality, prompt comprehension, etc
the seed has an effect its just really small
with default ksamplers
if you used Sharksampler with fractal noise and alpha = 3 then the effect is big
totally different latent spaces, and really what she needs to see are things that the seed won't do much even in the same latent space
they are different latent spaces yeah so they will interpret the seed differently
but what you can find is that a certain seed will tend to force there to be an object in a certain location, if the noise has a feature there
or force a certain colour in a certain area, if the noise has a colour bias there
this effect is small, but still present for the default high frequency gaussian noise that is the default in ksamplers
if another type of noise is used, that is more low frequency (more lumpy) then the effect can be large
yes, but for what she's doing, trying to see the major differences between each model and figure out which works best for what she generates, that's a lot of work that's probalby not necessary
yeah that's true its probably overkill
most people don't fix seed when doing comparisons like this
@fervent thunder @shell tendon so this happened https://x.com/abursuc/status/1849945966536950080
CLIP, ViT, DINOv2, SAM and Depth-Anything in one model wow
that's really cool
i thought so :)
interesting that the highest performing combo didn't need the depth sensing
makes sense I guess cos you can get depth info out of the ViT
you gonna play around with it?
yeah I wanna start doing more vision stuff
at the moment I mostly just use clipseg, DINO+SAM and then my yolo fine tunes
oh florence2+SAM rather than DINO+SAM these days
:) too many toys
literally yeah
Hello
so im a complete noob when it comes to SD i just downloaded it and im trying to figure out where to put model files i downloaded, its not a .ckpt and nothing in the file name says where to put it. anyone know?
@alpine mantle Kindly contact the support, they should be able to help you from there
best to ask technical questions in #🤝|tech-support
alright thanks
Genuine question... Do people actually use Artisan?
sure
people were generating in the channels here all day today
I just started a cyberpunk 2077 fan art community on Reddit https://www.reddit.com/r/CyberpunkFanArt/ It's also for the cyberpunk genre in general. It's new so nothing is posted yet, but since there are few places to share AI art where it isn't reviled and derided, I thought I'd promote it here.
Hello everyone, does one of you have an idea to transform the drawings into photos while keeping all the details of the image? We find a lot of things to go from photography to drawing but not from drawing to photography. Especially keeping the details
Depends on what you would like to achive. Image to image works, you can use controlnet to keep depth or lines accurate, you could use ipadapter style transfer with a realistic image as source for the style...
hey i'm new to this server - Anybody here from india? let's connect
can someone please explain to me what the hell this is? The upload of recordings is not working on Stable audio! I keep writing to support to get them to do something about it but I guess they don't give a shit if they have our money! Why the fuck isn't it working!!!!!!!!
i forgot to resize my training data so its all at 512 x 512...dangit, i needed it to be 1024 x 1024 because i heard that captures the details better..
Is flux 1.1 inpaiting availble anywhere?
Thanks for your answer, I had started to move in this direction but I didn't get there, the tools are not so easy for a beginner, I will insist;)
you can't just resize the images - you'll wind up with the same data just stretched over a larger area
how about some more details of what sort of recordings you're trying to upload, what errors you are getting, and so on
Anyone using the Krita plugin?
Ok
I have question. Am I violating copyright if i use AI to create image of character from a movie or game without commercial use and without publication anywhere?
i couldn't understand comfyUI so im going back to forge, this time im playing around with the quantized version of flux which should be fun and shouldn't blow up my computer this time
you'd have to ask the company that owns the intellectual property of that character - the issue isn't the tool you used to create it with, the issue is the character itself.
oh hey i was able to load a lora this time without my machine exploding lol, looks like the quantized version worked!
guys help
1 picture (inpainting with "segment anything") is taking forever. is 1.5mb/s normal for something that needs 3GB?
diffusion_pytorch_model.safetensors: 10%|███▉ | 336M/3.46G [02:03<19:30, 2.67MB/s]
diffusion_pytorch_model.safetensors: 39%|███████████████▍ | 1.34G/3.46G [12:22<09:01, 3.92MB/s]
i would just let it finish
yea but if i want to make more than 1 picture every 6 hours?
can somone help my I cant seem to find other ways to run stable diffusion 3.5 only through comfy?
it's only been out a few days. give the other interfaces a chance to code in support
ah thanks
How viable is local generation with an AMD GPU?
Anyone got one and had success with it?
boys
yall think is feasible to use SD 3.5 comfortably with a RTX 3060 on comfy?
or better to upgrade to a 3090
go into the #🤝|tech-support channel and read the AMD guides pinned in it. follow them, and you'll have quite a bit of success
depends on if you want to use the quant versions or not
tyty
Something with better quality and results than sdxl
after formatting my pc and reinstalling windows im currently backing up all my models and stuff from my previous server installs and planning to do a clean install of a server, has anyone used the ux ui forge server
https://github.com/anapnoe/stable-diffusion-webui-ux-forge
this one rather
or is auto111 recommended for stability and compatibility
kinda just want the most basic one for making cute pictures
eeellooo
Hey Guys, what is currently the best local ChatGPT like model i can host locally with Ollama? To do the stuff like ChatGPT can, without generating Pictures and T2S or S2T
It depends mostly on how much vram do you have
For example I have 12gb of vram and the best model I can run is gemma2 9b or mistral nemo 12b
Auto1111 would be the basic one.
But if you need flux or sd3.5 support you need forge
I want to run it on a Dell Server, sadly no graphics card
You can run it on CPU, it will be way slower but it will run, how much RAM does it have?
Llama 3.1 is really good or gemma or mistral nemo/orca
gave the vm 8 haha but i have like 128gb
Nice!
i have a intel xeon e5-2670 v3 2.30 ghz. I want it as a business solution for our company so the people can test it out, maybe work more efficient. Cause we blocked chatgpt natively cause of data protection reasons. maybe that helps you for the recommendations. 🙂
It would probably run gemma 2 9b at a decent speed, you should test it first in my opinion, then you could try with the 27b version
Sadly I haven´t tested bigger models, llama 3.1 8b should work fine too
mhh okay is there a big difference between them in case of features? they should work with internet search too later on. Idk what it takes to make that happen i'm pretty new but yeah that should work
And it would be nice if i could restrict the model, cause its beeing used in business environment
idk about the internet part but with ollama you cant connect it to the internet search.
but checkout this, maybe they have more features:
https://anythingllm.com/
they have options to run stuff localy or cloud and you can change between different services
and pdf support is nice
oh so you can upload pdf so it can work with it? nice
https://github.com/open-webui/open-webui
together with ollama and a good vector db/framework works also with PDF, internet search and even images if you host a comfyui anywhere. You could also use a ChatGPT entpoint if you can make sure to get one executed within a save azure environment for example.
it seems that the models seem to be the problem in case of internet not ollama itself
but i think i will try llama first to get in touch with local ai
thx guys
The models are not the problem as they are local files that cant do anything when not used with a tool like ollama or Anythingllm
Its the implementation
i read there are plugins that make llama in ollama internet capable, that true?
Hi
"Ollama now supports tool calling with popular models such as Llama 3.1. This enables a model to answer a given prompt using tool(s) it knows about, making it possible for models to perform more complex tasks or interact with the outside world.
Example tools include:
Functions and APIs
Web browsing
Code interpreter
much more!"
this may be what i need
Hi everyone! I have a passion for reading and love diving into all sorts of topics—I'm an explorer at heart! Looking forward to connecting, learning, and sharing ideas with all of you.
if you mean generate here on this discord, you do that in the artisan channels. start by reading the information in this channel #artisan-faq
Hello everyone! How are you all doing today?
@warm junco Do you know if flux 1.1 is available anywhere with inpainting?
Dont know sry
Im making a discord bot rn that can do that, I havent seen it anywhere online though
Anyone using stable diffusion for architectural design? How to go about “sketch to render”?
how much VRAM?
12
better is a compareson word and doesn't mean anything.
it'll be slow. you might want to use https://civitai.com/models/879701/stable-diffusion-35-fp8-models-sd35
does it require better hardware than flux
which will take less vram
the GGUFs are here https://huggingface.co/city96/stable-diffusion-3.5-large-gguf/tree/main
But does it run faster than flux dev
Both quantised versions in comparison
What’s the difference in performance
depends on if you are generating a negative with dev
if you are generating a negative with dev then SD 3.5L is faster, otherwise flux is faster
depends on your hardware too. but flux is frozen and has a very narrow range of what it can do. stick to that range, you get very nice images. try to go outside that range, it won't let you
hello everyone. Been a while since ive touched base here. I have a question and i hope it makes sense. has anything cropped up making it easier to run a locally hosted stable diffusion install? and im not talking about on my PC. I want to be able to install it on linux server on my network and access it from my pc, maybe witha web interface.
civitai is lagging?
have you trued using comfyUI?
or swarm?
no i havent i will take a look at that, thank you so much
nvida and mit release this new diffusion model, github coming soon but you guys can try it out 🙂
https://hanlab.mit.edu/projects/sana
What is the recommended ram and vram for sd 3.5 large
It needs about 16gig vram
Whats your prompt
For FP 16 ? And also what is the loss going to fp 16 from fp 32
sure
where did you find an fp32?
So there is no fp 32 version of sd 3.5 ... okay ... then what is the quality loss between fp 16 and fp 8 ? Can 4060 ti 16 gb run sd 3.5 fp 16 effectively ?
that's what i have and i have no issues running the large model exactly as SAI relesaed it
runs fast, on my machine
unless i stupidly kick off 10 image batches without remembering i had it set for 10
Ooo okay got it ... u cleared a lot of my inquiries
Why do some LoRAs require trigger words and others don't? Is it just up to the creator or is there a useful purpose?
lora training is still the wild west
there are lots of different techniques and it is unclear which is best
What is the maximum number of tokens that can be used to write a prompt with Stable Diffusion 3.5 ?
512
Hello!
@lone hawk how did you determine this information ? thanks
Hey guys. So there seems to be a problem with Ollama, VM and Cores, right? I have an VM in Proxmox with Ollama, WebUi and Llama3.2 3b. I gave it 64gb RAM and 20 cpu cores. but it does not take more than 50% and its slow af.
When's sd 3.5 medium going to be released ? It's 29th october
lets wait for a little longer
Mods. Remove that guy's balls
yokoso watashi no soul society
SD 3.5M is out!
https://huggingface.co/stabilityai/stable-diffusion-3.5-medium/
setttings to use ultimate upscale with sd3.5 medium?
dead on release model
"9.9 ONLY"
9.9 GB? aww man I'm 8GB
wow, that's a whole new level of "out of touch with reality"
Did you really just @ everyone?
2024
wanting to use SD on 8gb vram
why are people like this
GGuf models can be smaller for the 8gb cards
Do we have access to the api to use the new medium ?
why do people put out things and its not even accessable to eveyone?
Yeah same
"Most consumer GPUs"
Lemme doubt that lol
How r v supposed to catch it if it runs out of the box?
cant even access the model
I want to point out the graph of which GPU is compatible is misleading, mostly because of NVIDIA
there are two 4070's with different VRAM capacities
will this even work with amd?
1.5 still best for 8gb VRAM?
I had the exact same though. They should've said "many" instead to be more accurate.
I'm also restricted to 8GBs, but I'm sure there's optimizer peeps out there that'll squeeze it down further.
I'm sorry but, isn't the graphic misleading? Flux can certainly be ran on 24GB cards
probably, but see if you can run SDXL/Pony, I think you can
(worthy of note is that, despite the name, Pony can do anime and even realism with checkpoints like "pony realism" or "damn! pony realistic")
you need to edit it to run 8 gigs of vram
you mean get a pruned model or something?
dunno if quantizing is a thing but it maybe is
with 24GB I just run 6 images in parallel so I assume you can do 1 with 12GB
9.9vram is insanity considering my consumer grade 3080 has 0.1 more than that
Is sdxl still the best or nah haven't been active here much the last year
RTX 3080 has 12GB
No, I literally have one
Yes
depends on which one it is
thanks
anyone know offhand how much vram sd 3.5 large requires?
is that a mobile variant?
6GB even was always totally fine for XL models already in Comfy, if the card is Nvidia and of Turing or later architecture
No it's just a normal 3080
Before EVGA stopped making gpus
Such a painful loss to the PC world
3.5 Medium should have zero issues on 8GB VRAM + 16GB system ram if you run FP8 T5 or even better the Q8_0 T5 GGUF
Is 10 vram enough for normal 3.5?
Then what's the difference I thought medium was for less vram
You can use GGUF quants of the actual image model too though for Large
10GB 3080 is such a dick move by nvidia, even the default should have been 16GB not 12GB
I didn't even know they had a 12gb 3080 till right now LMAO
the 12gb is the standard 3080
I hope Intel and AMD get off their asses and stop dreaming they can simply charge 10% less than nvidia
whats wrong with amd? i heard their latest GPUs are great
Is it an issue with AMD or just that no one is making software for AI with amd?
AMD charges a little less, for a product with a little crappier software stack (a lot of AI stuff and other productive apps expect CUDA)
It is, just don't use the big chungus full FP16 T5 encoder, use FP8 or Q8 like I said
Ah so it's just software people not making amd options
Geohot (the tinycorp guy that previously jailbroke the iPhone and PS3) offered to fix ROCm if AMD open sourced it, AMD drops the ball very often
AMD steals defeat from the jaws of victory (only on the GPU front, the CPU front is doing great)
ROCm is open source
btw what do all those abbreviations stand for?
FP - floating point
BF - alternative floating point format with more bits allocated to exponent rather than mantissa
that is what I know due to limited programming experience I've had
but Q?
is it just Quant? what format is this?
I thought AMD was the good option for low end and Nvidia was best high end
refusing to acknowledge the existence of quants for both the image models themselves and the text encoder
why are people like this
the driver side isn't I think?
if buying new, yes! but you can get used Nvidia stuff for better prices because the market is so full of it
womp womp
Should be all open source
Only Adrenalin for Windows not
Honestly they should improve their autoencoder
I don't think I'd want anything pre3000 generation vs a new amd card 😭
Yknow how much faster stable diffusion could be if they just adapted an autoencoder that could compress more
yeah this idea is called Sana
its coming soon
quality is a bit hmm
I'd send an example right now but I can't
looked it up and it seems they actually open-sourced and fixed the thing geohot was complaining about, Lisa Su even responded to him, nice!
There's this open source autoencoder that can compress 64 fold and has a better image FID than Stabke diffusion's 8x vae
I don't doubt that the autoencoder does that well
but I doubt the diffusion model trained using that autoencoder would do well
no, I mean you can get an used RTX 3090 for the price of a new 7900 XTX, in that case the Nvidia one makes more sense
oh yeah this is Sana
They also made a diffusion model
It is Sana?
Can you send me a link?
Oh really? Albeit I haven't looked at gpus in 4 months last I checked a 3090 used was $1500
they dropped to the 1000s
Nvidia works with ai more reliably
So I'd rather pick that
Unless you're just gonna generate images using stable diffusion and game
Yeah just checked 3090 is $1200s now, not good but cheaper than a couple months ago while 7900 XTX is $800s
Holdon did they make or use DC ae?
yeah
its confusing cos Sana debuted multiple things at once
but Sana is also the DC-AE paper
7900 XTX for $849 is an amazing deal holy
other models could be trained with that VAE though
This
And sdxl wasn't really that amazing
TIL my 3090 is as valuable as it was 2 years ago when I bought it new, nice to know
Yeah, there seems to be massive shortage in some places. I got mine used for 700e.
yeah still 1k+ holding its retail value highly
a 12B with DC-AE would be cool
EU?
Yeah.
Knew it, all my EU friends got their gpus so cheap
strange it has the super title but i dont see why someone would buy that instead of a 4080
just questioning why that card even exists
It's be able to be ran on more hardware and faster
bro got muted for spamming 😭
idk maybe because 4080s cost over 1000
whereas 4070 supers dont
really? i saw a bunch going in the 800s on ebay
like 3 weeks ago
$599 i guess that makes sense
200-300 jump for a used 4080
I wanna compress Minecraft images effectively
More than 32 or 64x
Using DC ae, but idk how I'd train that
efficient and minecraft in the same sentence?! insanity
apparently a new shaders mod came out for sodium that has the most efficient ray tracing for minecraft as of yet
Photonics
ray tracing in minecraft yuck
i love rethinking voxels
dont mind me, i still love my refurbished 1080 ti 11 gb xD
I wanna get a new phone
Has anyone managed to generate over one megapixel images with 3.5 large? For me everything above that turns to mush. Can't wait to test 3.5 medium later today as it should support up to 2MP
what are you generating on?
i got a 1080 in my backup pc 😭 poor thing only being used for my media server
Ah yes
Stable diffusion 🍷
Or maybe something about whatever you did is somehow wrong
That's just how ai works
Not sure what you're asking, rtx 3090, ComfyUI
It's very highly I'm doing something wrong, just recently moved from A1111, though I've read 3.5 large might only support up to 1MP
that would be correct
i like swarmui!
basically comfyui but with an option to use simple stuff similar to forge/A1111 for when you cant be bothered
I had my own UI before all the ControlNet stuff etc. became mainstream. Didn't bother to study how those work in code so just moved to A1111 and now to Comfy, though I'm going to go full circle and make a new UI that uses Comfy API 😅
i did the same exact switch
except after comfy i switched to forge and now on swarmui
Where can we try stable diffusion medium 3.5 online for free?
Does Stable Diffusion Medium 3.5 have photoshopping functionalities like DALL E 3?
Hey there having problems with the Web based stable diffusion all iam getting in grey images what am I doing wrong
a used 3090 costs less than a 7900 xtx if you shop around, but at the very least the same
if they cost the same it's a no-brainer, if they cost less it's even moreso
really? if they're both the same why would i get a 3090 instead of a 7900 xtx
yeah, you might wanna check where is the place to buy used stuff in your country, dunno about the US, I think facebook marketplace?? but of course double check
ebay primarily, facebook is better for lower end atleast in my area
seems ebay prices are ~$690 (nice)
Yeah it seems a 3090 is even worse than a normal xt much less a xtx
damn 4090 is amazing though
depends on the task, if ROCm is worse than CUDA for that task, it's 24GB
what i was saying
I think AMD caught up in SD and LLM stuff but I'm not sure
yeah im looking at 3090/3090ti/4090/7900xtx/7900xt comparisons
either way it sucks balls that a lot of projects are hard-wired to CUDA and as an end-user you can't do much about it
(aside from going into the code and making changes yourself... not fun)
i wonder if a xtx is either a 4080 or a 4080 super
xt seems to essentially be a 3090 ti
depending on what your task is, 7900 XTX ~= RTX 3090
but look for people posting metrics that are relevant to what you want
like tokens per second, iterations per second, etc
ok yeah an xtx is essentially a 4080 super
im looking at overall not anything in particular
majority of situations
I have a 7900xtx, great card for Gaming.
For SD on windows it works good with ZLUDA and hip.
For Linux you can use ROCm
