#💬|general-chat
press L1 to kiss
yo
hey everyone, does anyone use Streamdiffusion here?
Hey
did sd3.1 come yet ?
not sure its ever coming
Which ui should i start with? I see 4 on github and Im kinda confused.
I mean like automatic1111 or comfyui
what even is the difference
Fooocus is probably the simplest solution, comfy uses noodles and nodes, Auto1111 is more advanced than Fooocus but still uses a similar HUD
Not sure, im not an AMD user
But zluda development was dropped by them iirc
hi guys, i just got sdxl to work somehow, was very confusing but i got it working now! on my local automatic1111
The github file for zluda was updated yesterday. Why would the development be dropped?
dang
zluda is much much faster than directml
My latest audio-reactive system! What do you guys think?
as amd user i say : AMD get your fat lazy ass up and provide an alternative for windows AND linux dammit! amd is always 2 steps behind concerning ai frameworks
https://github.com/lshqqytiger/stable-diffusion-webui-amdgpu this version of automatic1111 runs on windows with amd cards using zluda, i am using it myself
Here is a guide for ZLUDA: #🤝|tech-support message
I know, im following it rn, but theres a step about installing zluda, and idk what version to install
rocm5 or rocm6
what is Auratura?
When I launch auto1111 it says "Zluda works best with SD.Next. Please consider migrating." This isn't an issue, my auto has been working fine since we fixed those issues, I'm just curious as to how SD.Next runs better with Zluda.
question, is it possible to fuse lora with sd turbo?
thanks for your answer , this is the first time I've heard of 'turbo lora'
I am trying to train a lora for my streamdiffusion setup running on sd-turbo
streamdiffusion is an area I am not familiar with
yeah I would have liked to use sdxl-turbo , but apparently it doesn't work with tensorrt, which is important for me since I need the frame rate to be high
but thanks for your help anyway
Is there a way to keep models in memory with flux? As for every change i do to prompts, it empties the vram, and has to load it all slowly back in
That won't work unless they have the exact same gpu it's trained on
Unless they have worked out its kinks to work on any RTX?
i know that file in particular wont. but it's demonstrating that it does.
Ah. I've used tensorrt a lot. Downside with them, you need a model for each resolution/resolution range, and that shit can stack up real quick
can this work with streamdiffusion somehow?
talk to streamdiffusion devs. i'm just showing that there are models available. this one looks like it was even made by nvidia, so maybe it is worked out like duck was saying
What does this mean exactly? Talk to me like I'm a caveman, because I am.
ok so A1111 is one piece of software for making AI images using stable diffusion
A1111 also gets called WebUI sometimes
also because A1111 is so popular, sometimes people get confused and think it is the only GUI so you get people coming in and calling A1111 "stable diffusion" sometimes
anyway
Huggingface made another library called Diffusers
which is a code library rather than a GUI (it's code)
and some GUIs, such as Invoke, use it as a backend
however what SD Next did is use both Diffusers and an A1111-style backend
so it has dual back end
there is also Comfy UI which has some other slightly different lineage, although its closer to A1111 than diffusers
as far as I can tell Comfy UI is mostly based on K-diffusion which is based on the Karras 2022 paper commonly referred to as EDM
which is the same paper where both the karras scheduler and the sampler known as "Heunpp2" come from
this long story is why you use a K-sampler node in comfy
Thank you!
Rate the UI of the web
https://vexel.pages.dev/
Np, but when you're talking about adding prompts then it's the styles.csv
I meant the actual default prompts that get put into your face when you first open A1111, and also I'm redoing the defaults for my Adetailer as well!
@warm junco It reset my stuff to default lol
Hi, I'm new to the server, my name is Christian.
Normally the webui launches without any default prompts
I put prompts into "txt2img/Prompt/value":
and there's nothing
And it also reset my Adetailer
And my resolution 😂
I don't get it
This is way too weird
I edited some stuff in txt2img, txt2img adetailer, img2img, I think img2img adetailer as well and that's about it
Idk what could be the issue
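For anyone hand-editing ui-config.json: one possible cause of a reset (an assumption, nothing in this thread confirms it) is a small JSON syntax slip such as a trailing comma. Here is a minimal Python sketch that changes defaults through the json module instead, so the file always stays valid JSON. The "txt2img/Prompt/value" key is the one quoted above; the file path and the example prompt are placeholders.

```python
import json
from pathlib import Path


def set_ui_defaults(config_path, updates):
    """Load ui-config.json, apply the given key/value updates, write it back.

    json.loads raises an error if the existing file is not valid JSON,
    which surfaces hand-editing mistakes before the web UI reads the file.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config.update(updates)
    path.write_text(json.dumps(config, indent=4), encoding="utf-8")
    return config


# Hypothetical usage; adjust the path to your own install:
# set_ui_defaults("ui-config.json", {"txt2img/Prompt/value": "photo of a forest"})
```

Keeping a backup copy of the file before running something like this is a good idea, and the web UI should be closed while the file is being edited.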
can i upscale multiple images at once, or only one by one?
Hello 👋
hiiiiiiiiii
I can say after days of testing, USDU is superior to Tiled Diffusion.
(unless im not using TD optimally)
you definitely can, but the devil is in the details, there are different UI's and each have different approaches to that, and some maybe cant
what??
hello
hello everyone
what are the most commonly used negative prompts for hands overall?
What is the best model or method to make faces similar but not identical, like siblings (brother/sister)? Also, how can I maintain the same style and posture?
Similar to face-swapping, but not exactly like that. I want to create something similar, like a brother or sister.
Anyone know if you can do the stable fast 3d stuff here via command line if so can I ask how?
not sure
hi! can someone help me understand how seeds and batch sizes work? Does every batch get one seed? I've noticed that the n-th number of image in a certain batch will stay the same as long as the seed doesn't change. Is there a way to isolate that one generation without having to generate the whole batch?
it depends on the code base but in general a common way to implement it is for each generation in the batch after the initial one to have the seed incremented by one
he is. he made a bet with his friend and now he's gotta get banned fast or pay his friend
if you want to work with batches in more detail then the nodes Latent from Batch, Latent Selector, Flip Latent and MultiLatentComposite are nice
i dont know what any of those words mean when you put them together like that, will it let me isolate single images from batches?
your first image in a batch will use the seed you specify. all the others get a different seed, or every image would be identical
use a save image node and they will be saved individually in your /comfyUI/output folder
you have to have some randomness somewhere for batching to make sense
the vast majority of the time this comes from the seed
but any source of randomness can work
oh shit are we talking about comfy UI
ye
seed is a stable diffusion thing, doesn't matter what interface you use
batch is universal. you can't use the same seed with every image in a batch or they will all be the same
and auto1111 is going to save the images somewhere
its ok I thought that too at first
the seed is the number used to initialize the random number generator. if you use the same number, you get the same image
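A tiny Python sketch of that idea, using the standard library's random module as a stand-in for the noise generator a diffusion UI would use (that substitution is an assumption made for illustration; real UIs generate noise tensors, not short lists). The function name make_noise is made up for this example.

```python
import random


def make_noise(seed, size=4):
    """Return a short list of pseudo-random numbers for the given seed."""
    rng = random.Random(seed)  # a private generator, initialised from the seed
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


same_again = make_noise(42) == make_noise(42)    # same seed, same "noise"
other_seed = make_noise(42) == make_noise(43)    # different seed, different "noise"
```

Since the starting noise is the same whenever the seed is the same, everything downstream of it is the same too, as long as the other settings are untouched.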
but it wouldn't work if there is no randomness it would just make the same image
i found an image i liked 12th down in a batch
that's what they are for. so you can actually get the same image multiple times if you want to. that lets you tweak other settings
so i've been generating 16 pictures at a time trying to figure out the right CFG for one image
oh no
ouch. um you should probably ask auto1111 questions in #🤝|tech-support
in comfy you could get the seed but
I am worried A1111 might not have a way
lol i thought about that but
my question seemed so dumb that it didn't fit in the same caliber of
how to train loras and stuff
that's where the people that can help with auto1111 usually hang out
there are no dumb questions. you can't know what you dont know
i'm probably going to try comfyui
i just don't use auto1111 so i don't know the answer - except i can't imagine it doesn't have any way to specify a seed
no it's embedded in the metadata
i literally just didn't notice the last digit is changing
if I had to guess, putting your seed number for the start of the batch plus 15
will get you the seed of the 16th image in the batch
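That guess as arithmetic, in a short Python sketch. It assumes the UI really does add one to the seed for each further image in the batch, which matches the "last digit is changing" observation above but is not guaranteed for every code base. The function name and the start seed of 1000 are made up.

```python
def seed_for_batch_position(start_seed, position):
    """Seed of the image at a 1-based position in a batch, assuming +1 per image."""
    if position < 1:
        raise ValueError("position is 1-based, so it must be at least 1")
    return start_seed + position - 1


sixteenth = seed_for_batch_position(1000, 16)  # start seed plus 15
twelfth = seed_for_batch_position(1000, 12)    # start seed plus 11
```

Under that assumption, putting the computed seed in the seed box with a batch size of 1 and otherwise identical settings should reproduce just that one image.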
you know that feeling when you realize you're just dumb
if it's in the meta data, just use an EXIF viewer
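If no viewer is handy, the text can also be pulled out with a few lines of Python. This is a rough sketch that only handles uncompressed tEXt chunks in a PNG. As far as I know A1111 stores its generation info in a text chunk named "parameters", but treat that key name as an assumption and just print whatever the function returns. The function name and the file name in the usage comment are made up.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data):
    """Return {keyword: text} for every tEXt chunk found in raw PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    found = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk is: 4-byte length, 4-byte type, body, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            keyword, _, text = body.partition(b"\x00")
            found[keyword.decode("latin-1")] = text.decode("latin-1")
        elif chunk_type == b"IEND":
            break
        pos += 8 + length + 4  # skip header, body and CRC
    return found


# Hypothetical usage:
# with open("my_image.png", "rb") as f:
#     print(read_png_text_chunks(f.read()))
```

An empty result means the text chunks are gone or stored in a form this sketch skips (compressed or international text chunks), in which case the seed cannot be recovered this way.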
welcome.
👍
using flux at the moment
I use SDXL but generation time depends on so many factors
my generation time has varied between 0.1 seconds and multiple hours
for SDXL
i just mean compared to sd1.5
but i'm guessing that's probably a nonissue since you didn't lead with that
i'm actually kinda psyched about flux
the generation time has a lot to do with your hardware and not so much to do with the model
i heard it's great for text gen
I very recently did go back to SD 1.5
keeping all other variables the same, SD 1.5 is a decent multiple faster
I didn't keep track but something like 2-4x faster
SD 1.5 is a really good model, just requires some fancy prompting
yeah i probably have a long way to go
i mean
i just figured out seeds worked today :/
🙂 you're doing fine.
my favourite model is actually Kolors
which doesn't get as much attention these days
what's good about it?
honestly i just want a good photoshop-esque GUI
can you imagine being able to inpaint controlnet stuff with the pentool?
Does fooocus support controlnet?
Kolors has unusual training data
firstly its trained on a certain specific look
but also its a specialist chinese model
which gives it different results to the others
to prompt it with best results you actually have to translate your prompt into chinese first
don't think so
someone made this https://github.com/fenneishi/Fooocus-ControlNet-SDXL
it's using the SD3 open source as far as i know
Is it possible to change a face to make it look very similar, like a brother or sister, but not exactly the same while keeping the posture and everything else unchanged? I want to experiment with face obfuscation so that a person looks similar to others but can't be recognized by facial recognition software.
Has anyone done something similar?
maybe for the bulk of the data yeah, but its got a strong chinese bias
towards objects and people
which I kinda like
I use an SDXL model called LuanFangHua sometimes which has that too
i feel like SD already has a strong asian bias
some checkpoints do yeah
would recommend CinenautXL, LuanFangHua, RealvisXL and Leosam's Hello World
yeah, most of the checkpoints do - because most of them are anime based
LOL
RealvisXL and realistic vision are the same guys
i just checked it out
saying 乱芳华 has a slight asian bias is like saying
the fire was a little hot
LOL
乱芳华 is definitely a specialist one yeah
and that's the watered down version that has been merged with real-vis
容华 is the original and is even stronger bias
any stronger and it's going to start generating CCP propaganda unprompted
its same with CinenautXL
its been merged with jugger
but the original CineVisionXL is stronger bias towards cinematic
but I don't use the originals
I use the ones that got merged with photo models
merging a fantasy/cinematic/Chinese model with a photo model
is the best look in my taste
yeah it looks like something out of a jin yong novel
i dig it
wait, how does flux compare to SDXL in speed?
also if you do multiple passes
you can choose the direction you want the image to go in
if the first pass ends and you want it more realistic, you can do second pass with jugger or realvis
or if you want it more cinematic, you can do second pass with kolors or helloworld
flux speed was originally slower
but now its complex because they made quants of flux
i never thought about switching models in the middle of my workflow
ye its rly fun
cos you can make realistic versions of cartoons, or cartoon versions of realistic
for example
lol
also you can use depth map
to only change the background
so you get realistic foreground but cinematic or cartoon background
just add a geographic location to your prompt and you don't have to deal with the bias
i just asked chatgpt if there is a way to generate a batch with different settings on the same seed
and it wrote a step by step guide on
- choosing a seed
- choosing my settings
- generate
- change my settings
- generate again
i've never felt so patronized in my life
try half strength and turn it off half way
or even after a third
you also might have to add something like soft-edge or canny
because depth map alone can sometimes confuse the model
when there is something in the depth map and it can't quite work out what it is
because most of my images are in the setting of a palace or a mansion, when the model gets confused it tends to put a cloth or a blanket over the thing that confused it LOL
i didn't think of it that way, usually when i start losing style/coherence i always start taking things away
like, less prompts, less CFG and etc
it goes for the reverse too
with canny alone, what can happen is it thinks a foreground line is a background line, or the opposite
hello
Is there any way to turn an img2video via one click in comfyui?
in other words how easy it is?
gm
SimpleTuner, kohya_ss, ai-toolkit - there are so many tools now that support Flux training
if you talk about full finetuning: although that's possible with enough vram, it seems to be not necessary. Training lora is more efficient and sufficient
What if i have lots of concepts?
Learning a character or celebrity is one thing, learning anime is another
there are other parameter efficient finetuning techniques like lokr that can learn complex things more efficiently
gets a bit squiffy trying to stuff lots of concepts into one lora
that full finetuning is necessary was always a myth
in theory a full rank lora is equivalent to a full finetuning
so increasing lora rank brings you closer to finetuning
with loha or lokr you can get somewhat the best of both worlds, having a low rank model but still being able to approximate full finetuning
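To put rough numbers on the rank argument for plain LoRA: the update to a weight matrix of shape d_out x d_in is stored as two smaller matrices of shapes d_out x r and r x d_in. A small Python sketch of the parameter counts follows. The 4096 x 4096 layer size is a made-up example, not a measured Flux or SDXL layer.

```python
def full_finetune_params(d_in, d_out):
    """Trainable values when the whole weight matrix is updated."""
    return d_in * d_out


def lora_params(d_in, d_out, rank):
    """Trainable values for a LoRA pair: (d_out x rank) plus (rank x d_in)."""
    return rank * (d_in + d_out)


D = 4096  # hypothetical layer width, for illustration only
full = full_finetune_params(D, D)    # 16,777,216
low_rank = lora_params(D, D, 16)     # 131,072, under 1% of the full count
at_full_rank = lora_params(D, D, D)  # 33,554,432, twice the full count
```

At rank 16 the LoRA trains under one percent of the values of the full matrix. At full rank the pair can express any update the full matrix could, but it actually stores more numbers than the matrix itself, which is why pushing the rank all the way up stops being "parameter efficient".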
Rank its not a problem, also i have lots of data, its more about finding the right configs
Im renting the compute so vram is no problem either
I will do some research on loha and lokr, atm im using ai-toolkit
if you have enough $ and time you could make a full checkpoint if you want
Yeah but flux level checkpoints are in the 100k+ range
are you sure
seems very high
anyone can let me know or is it not doable?
Actually 100k is a very, very low value. Flux was trained on 2.3 billion images, it's a 12B-parameter model, the training cost is likely in the millions. Requiring months + hundreds of gpus to train.
That is without factoring in storage costs and experimentations.
checkpoint doesn't mean base model
Oh
so to take the example of SDXL
the base model is SDXL
but a checkpoint might be "Dreamshaper" or "Juggernaut"
does that make more sense?
Yeah
what would be the way? teach me master
i have comfyui and flux installed
you've got a few options
AnimateDiff or SVD are ok
can i get a consistent face and body via animateDiff?
and also where can i install animatediff on comfyui? can i install directly via comfyui manager?
you can get a bit of consistency yeah but its not perfect
personally I don't use the manager I use terminal
but it may well be in the manager yeah
how do i install animatediff via comfyui manager?
not sure if it is in there
but you could search around
generally manager is not for models its for nodes
but they do include a few models there
so basically what I would have to do is create an image through flux and then restart comfyui with animatediff?
Is there checkpoint training for flux yet?
this method seems pretty comprehensive
https://github.com/Kosinkadink/ComfyUI-AnimateDiff-Evolved
in terms of installing things with comfy its always just a case of either using wget or using git clone via terminal
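For the git clone route, custom node repos conventionally go into the custom_nodes folder inside the ComfyUI directory. A small Python sketch that builds the command for the repo linked above. The "ComfyUI" root path is a placeholder for wherever your install lives, and the helper name is made up.

```python
from pathlib import Path


def custom_node_clone_command(repo_url, comfy_root):
    """Build a `git clone` command that targets ComfyUI's custom_nodes folder."""
    # Folder name is the last URL segment, minus any trailing slash or ".git".
    repo_name = repo_url.rstrip("/").removesuffix(".git").rsplit("/", 1)[-1]
    target = Path(comfy_root) / "custom_nodes" / repo_name
    return ["git", "clone", repo_url, str(target)]


command = custom_node_clone_command(
    "https://github.com/Kosinkadink/ComfyUI-AnimateDiff-Evolved",
    "ComfyUI",  # placeholder: path to your own ComfyUI folder
)

# To actually run it (needs git installed and network access):
# import subprocess
# subprocess.run(command, check=True)
```

ComfyUI needs a restart afterwards to pick up the new nodes. The motion model files themselves are a separate download, so check that repo's README for where they go.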
checkpoint training doesn't need to "come out" you can train a neural network as soon as it is released
Ok, how?
you can do it in pytorch but its not rly possible to teach all of pytorch over discord
can I make pictures from text here?
what is the point of merging a lora into the model? What benefits come from it?
not much benefit
do you need animatediff for AnimateDiff-Evolved?
bro
do u know how to fix it?
wait a sec
Lets say you want to use 20 loras and dont have enough memory to load, you could merge them into the model.
Btw, you typically do not want 20 loras though...
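What "merging" means in practice, as a toy Python sketch: the LoRA's two small matrices are multiplied together and the result is added onto the base weight once, so nothing extra has to stay loaded afterwards. This uses plain nested lists and tiny made-up matrices. Real tools do the same thing on tensors, layer by layer, and the function names here are invented for the example.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_up, lora_down, scale=1.0):
    """Return weight + scale * (lora_up @ lora_down) without modifying the inputs."""
    delta = matmul(lora_up, lora_down)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


base = [[1.0, 0.0], [0.0, 1.0]]  # toy 2x2 base weight
up = [[1.0], [2.0]]              # toy 2x1 LoRA "up" matrix
down = [[3.0, 4.0]]              # toy 1x2 LoRA "down" matrix
merged = merge_lora(base, up, down, scale=0.5)
```

Merging several LoRAs is just repeating that addition, which is also why stacking a lot of them can go wrong: every one of them nudges the same weights.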
Bro,do u know how to fix
TypeError: 'NoneType' object is not iterable
all the trainers usually have checkpoint training available
Which now gives me the stupid idea to throw as many loras into my merged model
I just remember that it's just a bad idea for flux.
Im just wondering if theres a way to train flux on a lot of images in a new domain, like anime or nsfw that it doesn't know. It would have to learn both general stuff and specific concepts. I dont think 1 lora is enough.
why? There are many loras out there training on new domains
including nsfw stuff
Yes but not good enough ones
Learning a new variation of something it already knows 60% of is different than learning nsfw, which it knows nothing about. That's why the loras are not that good
they are not worse than sdxl finetunes
Thats not a high bar to set xd
sdxl base had basically no idea about nsfw, you need to finetune it much more.
I've had a lot of issues with SDXL loras
one of the issues I get is secondary characters being in the training data of the lora
for example C3PO being in someone's R2D2 lora
you assume that low quality comes from loras, which is not the case. To train fully new concepts you have to mix them into a training set of millions of images and train with high batch size. Nobody is doing that. It doesn't matter if you use Lora or full fine-tuning to achieve that
the really big projects like pony were in the millions of images ye
its a lot
I do wonder, and will test it, if maybe instead you create some nsfw lora with general basic concepts, it will be bad, decent at best. Then you merge that to the model and repeat the process
sounds like the mangled model someone on here made
it worked well
they merged something like 2,000 loras
Wow
I need to meet this guy
if you search mangled
it should show up
how do i use animatediff
Will do
in this channel #🏞|general-with-images
can you check my post please? would do appreciate it
With flux, is there a way to remove the background blur depth of field bokeh effect from photos without negative prompts? I find the effect annoying, and the current methods for negative prompts slow down speed by a huge amount. There is a lora for this, but it seems to negatively impact details.
sup
if you're trying to create a realistic photo, flux is bound and determined that the background needs to be out of focus. i doubt negative prompting would even stop that
yeah it seems so deeply built into the model
it is. it probably trained on way too many artsy photos with blurry backgrounds. to be fair, that's really what you want, it makes your subject stand out. you might try creating the subject in one run, just creating the background in a second run, then compositing the images
yeah that could work, but the subject might seem too large or small and the perspective might not line up. Maybe the reverse with inpainting, but idk if flux works well with that
you'd have to try it and see
I've never done inpainting before, but maybe soon. I'm getting crisp backgrounds with an sdxl-based model much faster than flux generates its blur, so I'm probably going to wait for good flux finetunes.
I saw some people sending a Lora to fix it on reddit, maybe you can find something there
I probably tried that one, the blur is still there, even in some of the preview images. I tried a few loras
how does this work again?
how does what work?
i have some problems making a good background, it usually gives me unknown things btw
anyone can help?
When people make loras, do you just have a closeup of whatever you want added? Like just picture of a nose? Or do you include the whole face?
if it's a person lora, portraits work, but u want to add a few full body shots there to increase variety
More for certain details. Like "i want this cat, but with a snout of that dog", but that dog breed isn't in the model.
then yes a bunch of closeup pics of the snout works and then maybe a few of the cat without the snout as regularization imgs
Quick question, can you use ComfyUI to help you create images with 2 different characters in it without it mixing their traits and stuff? Especially with different character loras
Hanging onto that one as i couldn't for the life of me get 3 characters to wear their normal getup without colors being all wrong lol
Ahh, so that's what regularization means :P
As i use A.I-toolkit, how would regularization work there? As i've just watched nerdyrodent's videos, and he just used base images of training data without those.
idk about that srry i only use kohya or the civitai trainer
Gotchu. I kinda find kohya bloated with too many options i've no idea how to use, and all my attempts have been either lora does nothing, or generation is only training data images
🥺 
ai art do be hard tbh
no
Guess who fixed his drive
Turns out the cable is the fault and my drive is fine
I'm ready to continue to try to boot stable diffusion 1.5 from an external drive
Depends on what u mean
The lora can create whatever u train it on
Like if u want it to be capable of making full body shots u gotta add full body shot images within the other angles of the character
Not according to "real artists" they say we just write prompts and voila!
Is there a node in ComfyUI that allows to expand a mask in a specific direction, such as downward or to the left?
how can I create a model that has consistency? how many images are needed for that?
for example, I want to make a lineart of a character or a pose and have the model reproduce the pose
Id say about 20
Character + specific poses (that you have images of)
As much as u can find
Just simply collect every single image u could find with that pose
Currently using A1111-Forge because my laptop is severely outdated. I'm currently upscaling an image using the Tiling Resample method, and I've seen the "Seams fix" option down below the img2img. A lot of people said not using it is the best option. Is that true?
My laptop is a mid-range 2017 Acer Aspire E Series with a Core i7 U-series CPU, a 2GB VRAM NVIDIA GPU, 12GB of slower DDR4 RAM and a SATA3 SSD.
you could try passing it through an sdxl refiner + upscale workflow...probably any solution will 'alter details'
I'm mostly working with SD1.5 due to obvious hardware limitations. Is SDXL even feasible for a system that ancient and weak?
Though I'm currently holding off on upgrading until Q3 next year, when NVIDIA announces the 5000 series and the prices stabilise.
you don't actually have to make images fast
I've waited an hour before
if you are willing to adjust to patient and slow style then its fine
I actually think it can even be a benefit sometimes to go slowly as you spend more time trying to make the next image good
dont hold your breath, the 3090's are still pretty expensive even. there's still lots of demand. so even when the new gen comes out, I wouldnt expect a huge price drop
I just got a 3090 to replace my old AMD card, but I took a chance on a refurbished one
so far so good, no issues
the other thing to keep in mind, depending on just how outdated your system is, these newer cards need a decent amount of power, so you'll need a power supply and a pci bus to drive that performance, which means a nicer board. if you're doing that you'll need newer ram and might as well get a newer gen CPU, so basically you're doing everything
the reason I had an AMD card was because I made those decisions a couple years ago to invest on everything but saved money on the GPU which is the most expensive component by far, so I struggled with a slow GPU for a few years and finally upgraded that just recently, but at least I had already done the other stuff
Hello!
i am new at discord
fr, making a good image takes me multiple generations, like up to 100's depending on the image i wanted
is that normal btw, to have to make 100's or 10's of images until the seed gets you something that looks good?
With flux its much better ratio
well i cant use flux tbh so im sticking to SDXL for now until it gets super super optimized
You need an upgrade.
im not a billionaire lol
Then become one
If I want to make a virtual background for a camera that is 1280x720, would it be better to use normal stable diffusion or XL?
Thanks for the deep insight. However, I'll probably stick with laptops for another decade or so until I'm confident enough to own my first PC.
Tbh, I'm the type of person who carries a laptop around a lot. Therefore, unless I settle down in a job that doesn't require me to move my butt all the time, I'll probably get a laptop.
just use nf4v2 flux or gguf q4 flux dev(i believe nf4v2 is better right now) and it should use around the same vram as sdxl.
Also, I would recommend applying hyper flux lora to flux dev since that improves quality(both prompt following, and image quality) while also making it 3-4x faster.
really, I'll have to check out that lora
yeah here is the link: https://huggingface.co/ByteDance/Hyper-SD, you can use 8 steps with 2-3.5 guidance scale and it will most of the time be better quality than flux.1 dev. It's somewhat similar to flux.1 schnell or the dev-schnell merges but much much better.
does anyone know what's the difference between euler a simple to exceptional
euler a is a sampler, specifically an ancestral sampler, the simple and exceptional are schedulers
yea i know that but whats the difference between the simple and the exceptional
speed and quality are at play between the different schedulers...I'm not sure if you meant practically or technically
so the exceptional gets you better quality ?
that's the idea, at the sacrifice of speed....but I havent done a ton of practical testing on that
the sd3 model? or their sdxl models? because the sdxl models will still have the late gen vastly inferior VAE.
Easy to fix if you just run it through flux as a refiner for 20 steps at low denoise. Or whatever you find is appropropriate for the image.
i'm not sure sd3 has the knowledge depth to always be better than flux as well.
none, I said their flux models. They released flux models as well.
oh keen!
is the front of the prompt more important than the later part?
why do all my outputs just look like a cave wall?
oh, i cant add pics here, ill add them in #🏞|general-with-images
exponentially, not exceptional
exponential
it's about the timestep schedule. How much time is spent on composition and how much time is spent on fine details
right now im using a sdxl model, i always prefer better details so which should i go for
I think the karras-style schedulers (both exponential and karras) spend more time on fine details
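For the curious, here is a rough Python sketch of how those two schedules place their noise levels, based on the commonly cited formulas: karras interpolates in sigma^(1/rho) space with rho = 7, exponential interpolates in log(sigma) space. The sigma range and the step count are illustrative assumptions, not values read from any particular model or UI. High sigma is the early, composition part of sampling and low sigma is the late, fine-detail part.

```python
import math


def karras_sigmas(n, sigma_min, sigma_max, rho=7.0):
    """Noise levels interpolated linearly in sigma**(1/rho) space (needs n >= 2)."""
    min_inv = sigma_min ** (1.0 / rho)
    max_inv = sigma_max ** (1.0 / rho)
    return [
        (max_inv + (i / (n - 1)) * (min_inv - max_inv)) ** rho
        for i in range(n)
    ]


def exponential_sigmas(n, sigma_min, sigma_max):
    """Noise levels interpolated linearly in log(sigma) space (needs n >= 2)."""
    log_min, log_max = math.log(sigma_min), math.log(sigma_max)
    return [
        math.exp(log_max + (i / (n - 1)) * (log_min - log_max))
        for i in range(n)
    ]


def steps_below(sigmas, threshold):
    """Count how many steps run at a noise level under the threshold."""
    return sum(1 for s in sigmas if s < threshold)


# Illustrative range only (an assumption, not taken from a specific model).
SIGMA_MIN, SIGMA_MAX = 0.03, 14.6
karras = karras_sigmas(11, SIGMA_MIN, SIGMA_MAX)
exponential = exponential_sigmas(11, SIGMA_MIN, SIGMA_MAX)
```

Both lists start at the high noise level and end at the low one. Calling steps_below on each with a threshold such as 1.0 shows how many steps each schedule spends in the low-noise range. In this sketch the exponential list sits at or below the karras list at every step, so treat "which one gives more detail" as something to test on your own images.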
dpm++ 2m ?
oh ok
that fixed the colors mostly, but it still looks like a painting instead of a photo
For photos you need lower CFG
maybe try euler a vs euler and bump up the steps to like 35, you already said you've tried other models
also you need custom models, SDXL base is not very good for that
I saw realistic vision in his first image
realvis is probably the best model for photorealism
I still say start at cfg 7, you can lower it from there, but it'll listen to your prompt less
still looks ass, ill send a pic in general with imgs
prompt it differently
don't use 8k
also you don't need "photorealism"
use "photo"
or better:
"photo of a girl in the forest, 35mm photograph "
you can also try to just write "girl in a forest", because realvis should already make the image photo-like even without prompting for it
also
use width and height of 1024
not 512
it looks better but still not as realistic as they should be
im using real vis 5.1
90% sure its not an xl model
have they gotten worse than they were previously? i understand they wouldnt be better, but this is worse than real vis 3 used to be
i havent really done much with sd since then, so im not up to date at all
if your hardware is good enough, use the best model available (Flux) or SDXL (if your hardware is a bit older)
ill try some out, still confused why it looks worse than it used to
there's a realvisxl I have a v40 not sure if that's the newest, and there's juggernaut and a slew of others for XL
i just tried out real vis 3 and it looks as good as it used to, i think its just the new model and the way you have to prompt it
if you're not doing people, there's a dreamshaper
Hello, does anyone know where I should contact for questions about stability.ai?
dont they have a contact on their page?
I know that this discord is not the right place
yah there's a giant button on their page that says "contact us"
and if they are unresponsive, it's because skeleton crew 😄
I noticed on top of my Forge instance, I have two extra options: Swap Method and Swap Location. What are these for and what are the differences?
On Swap Method I have:
- Queue (block method)
- Async (asynchronous method)
On Swap Location I have:
- CPU
- Shared
cfg 7 on SDXL without tonemapping, rescaling or thresholding is way too much
so there's an ipadapter for flux now, and it actually works. so exciting (it's been out for over a week, but it's news to me)
ye it came fast
the xlabs one? people been telling me it's really bad and doesn't work, and other people been telling me they're just using it wrong and don't know what they're talking about. i haven't had a chance to get it working yet. i used forge to get started
i saw some examples where they plugged in black and yellow striped theme image and got the same idea out of the gens. not sure what parameters they used. like i said i haven't dug in too hard on it.
some image types might work better than others according to their dataset
Yes.
thats why non local generators are useless for anything other than casual gambling. u need many many iterations so your credits will be eaten fast (which they count on) + the censorship is also a big issue with anything non local
ah ok, thanks for explaining!
what I've noticed, even if you only look at the last hour of conversation, you can get into endless debates on what is good...so I wont go there, but I was able to produce an image with ipadapter, and it did create something that looked like the source image
I'm using swarm/comfy, so the gentleman above using forge, I cant comment on that functionality
Hey all
Sorry if this is a stupid question, I'm just starting to explore AI stuff
I've been using DALL-E for inspiration and reference pics for a worldbuilding project I'm doing, but it would be nice to have something I could run locally
I generally like the results I get when I ask for a "hatched illustrated" style (very medieval-y, gets the vibes I'm looking for)
Guess my main question is, is there a 'best' model for this kind of thing? Most of the Stable Diffusion stuff I've seen is geared towards either hyperrealism or anime. Is it a matter of a specific model or tool, or of training an existing one, or both, or something else?
historically speaking, auto1111 and forge lag comfy by some weeks usually for most things
that of course may change going forward, because comfy is no longer part of stability
Can someone help me find out what model was used to make an image?
that's possible, but only if the metadata is still there in the image
how can i check it ?
what GUI are you using?
forge
in forge, there's a pnginfo extension, drop it in there and it tells you everything used to generate it (if it's not been stripped out)
let me check...
it doesnt show anything :c
then there's no way to really know, sorry
and if i send u the image?
it wont matter, if the metadata is gone, it's anyone's guess
ffs
hello
can any of you help me?
i want to get similar images like the one on the screengrab
I don't know how to upload a screen shot
I mean something similar to what this coloring book shows
that link goes to an amazon site that requires login, your better option is to use a screen grab utility and put it here https://discord.com/channels/1002292111942635562/1004159122335354970 then ask us to look at it
just dont post anything against the discord policy, like nudes, etc
ok will try
@warm junco Hey what do you think I could do to fix the ui-config.json issue I have? I edited some prompts and resolutions in txt2img and img2img in the text file, and it just reset everything to default, no default prompts, no Adetailer prompts and parameter settings (only the default ones).

How many steps do you guys do per image for x512, x768 and x1024?
@warm junco Hey I'll leave this here if you get on and still feel like helping me:
When in A1111 I go to **Settings** ➡️ **Other** ➡️ Defaults and click View Changes, I get this in my CMD:
I'm missing perms
To send the txt file
But yeah
I get Traceback
venv, modules, and Python in AppData
depends on what model you are using, for flux.1 dev, around 30 steps for 1024 is usually good quality. The lower res, the lower steps you can do but I would keep it at 20.
Also, I would just recommend using hyper flux since it's usually better quality than flux dev and just requires 8 steps for all res, so it's a lot faster too.
For sdxl, kolors, sd3, the best amount is usually also 30ish steps but they will not work well below 1024 res.
You can try to set the ui-config.json to read only
Properties ➡️ General ➡️ Attributes ➡️ Read-only ➡️ Apply?
It no worky
(same result)
Oh damn, only 30? I've been doing 100 per image xD Thought even for SD 1.5/xl it was recommended to do like 130 or 150 per image?
Gotcha.
How long does it take to train a base model for SD1.5, SDXL, SD3, and Flux (respectively)
Wait, several million dollars?
yes. those models are billions of parameters. that's a lot of gpu time
...nevermind
datacenter gpu time
Then how were those non-merge model guys able to train it
they're just making loras and checkpoints for the base model
Oh... OH
those just update small amounts of the base model's weights for specific information
I meant like checkpoint, not an actual base model
I just call them base models
Like base model for SD1.5, SDXL, etc
okay well, depends on how specific you want your checkpoint or lora to be. you can get away with only 30 images for a lora if you're really specific and it doesn't take very long
I remember reading somewhere non-merge checkpoints take millions of images to train
depends on if you want to cover a wide range or not. you could, technically, make one that was smaller. what is it you're wanting to create one for?
Another anime model in a distinct style
Even though I know Loras can do that... I just want to have the experience of training a checkpoint
you'd probably want to do a lora for that, really, and just train it on the style, then run it with an existing anime model
the only difference is the number of images you have to label
Yeah besides I don't have a computer with GPU yet
I use google colab to train my Loras 😂
most don't train at home, they train on colab or replicate or civit, etc
what pc do i need to run stable diffusion without issues? I want to create high res images, and be able to make a model that keeps the person looking the same over diff images and positions
you want an intel cpu, an nvidia gpu, and as much power as you can get
i found a used workstation on ebay for 600€ it has a AMD threadripper 1950x, 64gb ram and rtx 3060 12gb
would that work
as my budget is low
According to a friend, technically you still can run stable diffusion with only CPU but the image will take days to create and then it will just crash.
So... you need a GPU
that's a question for @warm junco - he's the AMD expert. post in #🤝|tech-support if you would
it's got an rtx 3060... that's a gpu
No I mean like this
And meanwhile the cheapest one i can find is for $600 before taxes it's an RTX 2050 laptop
what about those mini pc's where u can allocate the ram to the iGPU
basically u got one with 64gb ram
then allocate 16gb ram to the Igpu
laptops are really not designed for heavy graphics work
you get what you pay for, more so in computer equipment than anything else. you buy cheap, you get cheap - and cheap isn't going to work well
this is actually what I use to test workflows (iGPU using regular system ram)
its like 5 minutes per image or something
the way I do things is to build the workflow at home, couple of tests with CPU
then upload to cloud GPUs to run fast
if I had to use CPU all the time what I would do personally is look into 1 step, or maybe 2-3 step hyper or LCM workflows for SD 1.5
👀
where do I post 1.5 images?
On your wall. XD
ty, posted
Shit.. appears training 1024 res images with flux eats juuust a bit over 24GB video memory. Or 23GB plus 0.1-0.2 swap storage. So I might limit my Lora making for it to 768 or 512
damn
so 12gb vram wouldnt be nearly enough
to do proper shit?
with rtx 3060
sup chat
you can train with MUCH less
I can even train on batch size 3 with 24gb
batch size 1 would be around 16gb I think, but there are ways to shrink it down even more
you should definitely use quantisation. As you use quantisation in inference, too, this is not a quality problem
and don't use a plain adamw optimizer but something more lightweight like adamw_16bit or adamw8bit (or adafactor, although this one often gives worse results)
sup
Anyone aware of a workflow that lets you specify an area on an image cut it out and feed it to sampler to regenerate (with optional upscale or user can add it), and place it back in the correct position of the original image with the original scale? Same idea as inpainting but done differently.
I mean, that's ADetailer from impact-pack
unfortunately, they chose to solve this kind of simple problem in the most complicated way possible.
I might have used ADetailer before, I'm not sure. But I'd like to break down the process for various reasons: one is that it might give the user more control, the other is that it might break up the processing. I tried FaceDetailer (or face fix, whatever it's called) and it uses a lot of processing; if I broke the process down manually, my computer would have no issue performing the exact same process, one step at a time.
you can break it up as much as you want. They have hundreds of nodes for each single step
it's just totally unnecessarily complicated and proper documentation is missing
I am somewhat new to ComfyUI. I have definitely used it, but I am not super familiar with its nodes, and trying to figure out which node does what, or which is best for what, is complete trial and error that can take forever. So I was wondering if anyone had already broken down this process. I am sure that someone who is familiar with many of the nodes would be able to set it up quite easily. But yes, the main thing is more user control and customization, and breaking up the process so it's not so processor heavy.
I did this for SDXL, but not for Flux yet
I'm still using mainly SD 1.5 since I have only 6gb VRAM. I occasionally use SDXL, but it runs out of memory if I try to do anything a bit complex beyond something like UltimateSDUpscale, like adding ipadapter or whatever. Sometimes it generates, but most of the time it runs out of memory.
If you're using Auto1111, go into Settings and enable FP8 mode for sdxl.
With that it uses less vram
Did you guys see that RunwayML deleted all their Stable Diffusion 1.5 repos on HuggingFace and Github?
yes. sd 1.5 is almost not used at all by anyone any more, and they are finished with it
really ? thats bad
we have civitai backup.
its not normal to nuke old repos
yea, but like 99.9% of people there talk English, if you want people to understand what you're saying it'd be better to use English too
maybe they part away ?
looks like SAI dmca the models 2 years ago lol ... its taking effect now!
its funny how sd1.5 was a leaked model XD
sure it is, i do it all the time
I think its mostly a shame if public available resources are removed in general, with some exceptions of course
Hey everyone! 👋
I'm a beginner in the field of data analysis and I'm actively seeking an internship to kickstart my career. I'm eager to learn and grow in this exciting field.
If any of you know of any internship opportunities or have advice on how to get started, I'd be grateful for your help! 😊
interesting fake social media app that uses ai.
I don't know what to use lol. I only watched nerdyrodent on how to get started :p
Hey everyone,
I wanted to share my latest work titled "Enhancing Conditional Image Generation with Explainable Latent Space Manipulation". It explores a novel approach to improving fidelity and contextual alignment in text-to-image synthesis by integrating diffusion models with explainable latent space manipulation.
If you find this topic interesting or have any thoughts or questions about the work, feel free to reach out to me. I'd love to discuss it further!
You can check out the paper here: https://arxiv.org/abs/2408.16232
Github Implementation: https://github.com/kshitij79/CS-7476-Improvements-in-Diffusion-Model
ye I am so glad other people also feel this way
I never really even got past understanding "what is a SEG" and why do we need it
from the Python, a SEG is an abstraction over the following:
"cropped_image", "cropped_mask", "crop_region", "bbox", "control_net_wrapper", "confidence", "label"
so its 7 things that we mostly have existing nodes for already
what I can't work out is if this specific usage of SEG has actually been used before, e.g. in a paper or model somewhere
or if it really was made up for Impact Pack
the use case is usually that you run a segmentation algorithm beforehand. Many segmentation algorithms also output bounding boxes.
they are then the starting point for the detailing
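To make the SEG idea (and the earlier "cut out an area, regenerate it, put it back" question) a bit more concrete, here's a toy sketch. The field names are the seven listed above; the types, the `Seg` class itself, and the helper names are my own assumptions for illustration and not Impact Pack's actual code. Images are plain nested lists so it runs without any dependencies.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple

Image = List[List[int]]          # toy grayscale image: rows of pixel values
Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2), with x2/y2 exclusive


@dataclass
class Seg:
    """Hypothetical container mirroring the seven SEG fields named in the chat."""
    cropped_image: Optional[Image]
    cropped_mask: Image          # 1 = regenerate this pixel, 0 = keep original
    crop_region: Box             # area cut out and sent to the sampler
    bbox: Box                    # tight box around the detected object
    control_net_wrapper: Any = None
    confidence: float = 1.0
    label: str = ""


def crop(image: Image, region: Box) -> Image:
    """Cut the crop region out of the full image."""
    x1, y1, x2, y2 = region
    return [row[x1:x2] for row in image[y1:y2]]


def paste_back(image: Image, patch: Image, seg: Seg) -> Image:
    """Place a regenerated patch back at its original position, masked."""
    x1, y1, _, _ = seg.crop_region
    out = [row[:] for row in image]  # copy so the original stays untouched
    for dy, patch_row in enumerate(patch):
        for dx, value in enumerate(patch_row):
            if seg.cropped_mask[dy][dx]:
                out[y1 + dy][x1 + dx] = value
    return out


original = [[0] * 4 for _ in range(4)]
seg = Seg(
    cropped_image=crop(original, (1, 1, 3, 3)),
    cropped_mask=[[1, 1], [1, 0]],
    crop_region=(1, 1, 3, 3),
    bbox=(1, 1, 3, 3),
    label="face",
)
regenerated = [[9, 9], [9, 9]]   # stand-in for the sampler's output
result = paste_back(original, regenerated, seg)
```

In a real workflow the crop would also be upscaled before sampling and scaled back down before pasting; that step is left out here.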
should I switch to SDXL from SD1.5?
I only have a 3060 Ti
and I'm most worried about long generation times + model storage size
what's best lora and prompt for fixing hand
anyone can help?
there's actually a really good comfy workflow with a hand fix masking setup on scott's channel
If you use Auto1111 and have --xformers --medvram-sdxl --no-half-vae in the webui-user.bat
You should be fine with performance
idk but can i get a link?
So do I install SDXL first and then add this stuff?
or do I just add this stuff to the .bat
You add this stuff to the bat at the line commandline_args=
Then you're ready to use sdxl models fast
https://youtu.be/PLSIegjSEDg?si=dzt5NPdU8PNsSdf4 worked well for me at least
ty sir
could you explain more about how the command works pls? I'm not very experienced with commands
does adding that command install a medium vram version of sdxl next time I open up auto1111?
You need to right click the webui-user.bat and edit it.
Then add the commands to the commandline_args= line
They dont install sdxl
yeah I know how to add it but like what do the parts of the command do
You still need models
yeah
These commands make sdxl run faster on GPUs with 8gb or less.
Xformers is a speed boost in general (reducing vram usage a lot) and always recommended to use.
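--medvram-sdxl applies the --medvram optimization only to sdxl models: the model is split into parts that are loaded into vram as needed, which lowers vram usage so generation doesn't stall on low-vram cards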
--no-half-vae is for VAE compatibility
In ai-toolkit, is flowmatch the only recommended sampler for flux? And which optimizers are there,and what's recommended for flux?
https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Command-Line-Arguments-and-Settings
Here's a list of what does what.
So in webui-user.bat, you would make it look like this.
@echo off
set PYTHON="%LOCALAPPDATA%\Programs\Python\Python310\python.exe"
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS=--xformers --medvram-sdxl --no-half-vae
call webui.bat
Who has 3060 or 3060 ti gpu?
What speed are u getting on sdxl and flux models
can i upscale multiple images at once?
I just tried JuggernautXL on a 3060 Ti and it was incredibly slow
even with xformers and med vram
really?
how slow?
I tried to generate a batch of 4 images at 1024x1024 and I think it finished the first image but after that the time left kept going up
it started at 5 minutes and kept going up and up to 10
it made my PC so slow that even the interrupt option wasn't working
I had to close the tab and terminate the command prompt from task manager
how many iterations per second u got?
it has 8GB VRAM right
yeah
didn't check that, idk where to find that
which UI did u use
auto1111
did u try fooocus?
nope
was it SDXL model or normal SD model?
🤔 and it wasnt working well?
one question, can stable diffusion work without the need for a super powerful computer using only cpu?
lightning models MAYBE
on the AI cpus
I'm relatively new to this, so I don't understand anything.
yeah
yeah
you basically NEED a GPU because video game math happens to line up with AI math
how much ram do u have?
because the freezing issue happens when all ram is used up i think
this
i had this too , then i upgraded to 64GB ram from 32GB and it was fixed
Mh, 4 gb (this is not my pc)
nah i am asking kraken
that was what I saw in task manager
I only have 16gb
yeah it's probably low RAM
everything freezes
but jeez, 64 gb needed?
I think I might go back to running SD 1.5
I only have 16 gb ram and can't upgrade
sadge
Yep. Super slow compared, but doable.
I tested it on my steam deck. 512x512 25 steps. 25 min
5 min on steam deck's 8CU's gpu
You need to increase the Windows Pagefile then to use sdxl without freezes
how?
i'm thinking of getting a new laptop, and i noticed this one from best buy. However it doesn't seem to have a gpu/vram, and i'm not knowledgeable about pcs and stuff. Can anyone tell me how well it might run ponyxl, or if it would run at all?
ASUS - Zenbook S 16 16” 3K OLED Touch Screen Laptop, AMD Ryzen AI 9 365 - 24GB Memory - 1TB SSD - Scandinavian White
follow this guide until you're on the pagefile screen:
https://www.tomshardware.com/news/how-to-manage-virtual-memory-pagefile-windows-10,36929.html
Then make sure its only enabled for the C drive and disabled for any other drive.
Then set it to custom sized and set 16000 Min and 24000 Max
Then save and restart your PC
tysm
seems to be much faster
ty again
Perfect np
it's difficult to put into words just how unhappy you would be with that choice. It's an amd gpu. Here are the specs: https://www.asus.com/us/laptops/for-home/zenbook/asus-zenbook-s-16-um5606/techspec/ It has an 890M GPU which is a 4Gb vram card. In essence if you could get anything at all to work it would take a millennium to do it. For more fun, see this review of that 890M https://www.tomshardware.com/pc-components/gpus/amd-latest-integrated-graphics-perform-like-an-eight-year-old-nvidia-midrange-gpu
if it must be a laptop, then look for high end gaming laptops, dont look for "AI" tags. That's going to be a gimmick
Hm I see I see
another question, can I add custom hires fix models to auto1111? where?
that's actually 2 questions
How do I generate images?
models\ESRGAN - put the .pth file in it for whatever upscaler
ty
I feel silly for asking so many questions
SDXL/1.5 checkpoint suggestions for generating spacecraft/rockets/aerospace technology/vehicles?
Epic Photonism
actually how well would it run on a macbook m3 pro?
very poorly
its unfortunate that doing deep learning on apple silicon gets suggested all over the internet
unless you are making extremely secret images you can just use cloud at a verified datacenter and there is a very high chance your data will be fully private anyway
otherwise, even a very low end GPU, such as 8GB VRAM, would be a better choice than apple silicon
if you do decide to go with apple, its worth getting good with the speed-ups like turbo, lightning, hyper, LCM etc
SD 1.5 hyper at 1, 2 or 3 steps can make ok images
and that would actually be fast enough to be tolerable TBH
hm not too concered with running sd on it honestly, i'm using a 4070 for my pc
but i was hoping that i could run sd when i'm away from my pc for long periods of time
there's some personal choice cos not everyone cares about speed
here, and on reddit, there are some people very happy with their very slow setups
in terms of “value for money”, which one is a better pick: rtx 3060 or 4070 super?
rtx 3060
😮
sup
gm
what fabric is the simpson's couch made from?
not leather, right? What would you call it in a prompt?
just "fabric"
remember its not about finding the right word
its about finding what was in the captions
right, I wish I knew what words were used in captions for various things
same
this is one of the areas where I disagree with most model creators the most
I think they should open source extensive amounts of their data set
doesn't have to be anywhere near enough to copy the model so they don't have to be scared of doing that
but it would allow us to get a better idea of what prompts the model likes
the other big area where I disagree is that I would have made image diversity a larger priority, even if that damaged image quality a little
also I would have preferred almost all existing checkpoints and loras to have a looser fit, so we can stack more loras
loras could be like lego blocks where you build up an image from like 10 loras that each add a little bit
but that's not rly what Civit loras are like, they just give you a fixed finished image
rtx 3060 12gb good enough to create photo realistic images of humans?
yeah definitely
you want at least 6-8GB so 12GB is definitely fine
the minimum is like 2GB
hey im trying to install automatic 1111 but when i try to run it it says error no checkpoints found
where do i get the checkpoint I need?
huggingface
hi, can someone help me?
I just did the math and apparently I'd need 282,641,285.12 strands of DNA to hold my entire comfyui folder
Back up of all family photos from 2012 to 2019 = about 9,983,283,200 bits, so need 99,832,832 strands of DNA to hold entire data of folder
School video from last year = about 484,380,672 bits, so need 4,843,806.72 strands of DNA to hold entire data of video file
Entire math class summer packet = about 1,044,480 bytes, so need 10,444.8 strands of DNA to hold entire data of pdf file
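For anyone checking the math: every figure above works out to 100 bits per strand (e.g. 9,983,283,200 / 99,832,832 = 100), so that's the capacity I'm assuming in this sketch. It isn't stated anywhere in the messages. One thing to note: the summer packet is given in bytes, so if that unit is literal it would need 8x more strands than the 10,444.8 quoted.

```python
BITS_PER_STRAND = 100  # assumption: inferred from the ratios in the messages above


def bytes_to_bits(size_bytes: int) -> int:
    """Convert a size in bytes to bits."""
    return size_bytes * 8


def strands_needed(size_bits: int, bits_per_strand: int = BITS_PER_STRAND) -> float:
    """Number of strands needed to hold size_bits at the assumed capacity."""
    return size_bits / bits_per_strand


photos = strands_needed(9_983_283_200)                # family photo backup
video = strands_needed(484_380_672)                   # school video
packet = strands_needed(bytes_to_bits(1_044_480))     # PDF, if the size really is in bytes

print(f"photos: {photos:,.2f} strands")
print(f"video:  {video:,.2f} strands")
print(f"packet: {packet:,.2f} strands")
```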
how can i generate a video with custom prompts, using the API??
Hey guys!
Starting a challenge today: I’m earning 1€ and will increase it by 1% daily. By year’s end, I’m aiming for around 37€ a day.
It’s day one, so I’m looking for small tasks to hit that 1€ goal. If you need help with anything or know of any gigs, hit me up. Thanks!
yello bible thumpers
Jk i only thump the AI teachings
Anyway guys if anyone wants to dm or discuss AI or code related ideas, speculations or just imagination based, impossible becoming possible coding or AI developing I'm always game!
So I've been out of the loop for about 2 weeks which = 2 years in AI land. What are the latest and best locally runnable models for Comfyui, including video? I hear some new video thing is out?
there is still free stuff...
cogvideox has excellent motion and definitely sota in open source but sometimes the video quality might be poor. It has no image to video support which is pretty bad but it's sota in text to video.
You can try FancyVideo if you need image to video, which is very good quality as well. It's definitely better than Animatediff as it has better motion and no flickering.
It can work for animation too?
alright! thanks!
Iirc in the cmd window
Do people mostly use forge these days, as well as for flux?
I've seen a lot of hype on the Flux model lately. How good is it actually?
it depends on how you judge it
the overall image quality is very good, aside from certain specific styles, which sadly includes realism
its distilled which has quite a few side effects, not all of which are even fully known
So, photorealism's out then?
(as in, other models can perform that better and easier)?
other models can perform photorealism better and easier yeah
can u also make it so u can generate same human face and everything in diff positions and locations?
inpainting/controlnet is what Id look into for that?
Anyway, looks like digital painting style is what flux shines in?
some of the distillation issues with flux are a lack of image diversity and loss of ability to work well with negatives
digital painting yeah or especially midjourney cinematic style
I mainly use my workflows for the following, Photorealism, digital painting, and anime, different models for each but, looks like it could be worth trying out flux for the digital painting stuff.
It's been a long while since I've touched any 1.5 model though to be honest
mainly been using SDXL/Pony
Hey does anyone know how I can get to the image bot
yeah I use SD 1.5 less now
what things if any do you use 1.5 for nowdays?
the base model is really chaotic
so you can use it to start off an image
and then refine with SDXL
Yeah... I remember.... it would generate a monstrosity with 100 arms and other nonsense from time to time.
To train loras with A.I-toolkit for SD 1.5, can i use any model to make the lora of? Or only barebones 1.5 from hugging?
is there's a ai for animation that is free?
Comfyui has a good few that can make text to video, image to video, and video to video, all on windows
Though, as far as i know, only available models can only make 16 frames worth, then it needs to blend 4 frames, generate more, but those more isn't identically continuing, but blends into something else
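Rough sketch of the "16 frames, then blend 4" idea as index arithmetic only, assuming a fixed window of 16 frames where neighbouring windows share 4 frames that get blended. The function name and defaults are made up for illustration, and real nodes may schedule their windows differently.

```python
def context_windows(total_frames: int, window: int = 16, overlap: int = 4):
    """Split a clip into (start, end) frame ranges that overlap by `overlap` frames."""
    if not 0 <= overlap < window:
        raise ValueError("overlap must be >= 0 and smaller than window")
    if total_frames <= 0:
        return []
    step = window - overlap  # how far each new window advances
    windows = []
    start = 0
    while True:
        end = min(start + window, total_frames)
        windows.append((start, end))
        if end >= total_frames:
            break
        start += step
    return windows


# A 40-frame clip becomes three 16-frame chunks; frames 12-15 and 24-27
# appear in two chunks each and are where the blending would happen.
print(context_windows(40))
```

Each chunk is generated more or less on its own, which fits the observation above that the continuation drifts into something else instead of continuing identically.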
stable studio or comfyui
tell me, I installed
flux1-dev-bnb-nf4 but all the pictures are blurry. The image is visible, but it's not clear at all. What else have I not installed?
Forge with CUDA 12.1 + Pytorch 2.3.1 also installed
4070TI Super
loooool im getting Pond5 watermarks in my cogvideos XD
ultimately its not a bad video generator probably beats svd... needs img2vid tho XD
yeah img2vid for the bigger cog video model would be amazing
Is topaz Photo ai the best upscaler?
I think this upscaler is the best https://openmodeldb.info/models/4x-NomosWebPhoto-atd
what's the best option to capture/transfer hair likeness? like if I have an image of someone with unique hair and want to generate more images with similar hair
SUPIR is very good at upscaling a blurry/bad photo and making it good, but not so great at taking a good image and making it great/better
the main face specialist one is https://openmodeldb.info/models/4x-FaceUpDAT
it does do a lot better with eyebrows and hair than similar models
these two big HATs are good too for people https://openmodeldb.info/models/4x-Nomos8kHAT-L-otf https://openmodeldb.info/models/4x-Nomos8kSCHAT-L
SUPIR is for super degraded images yeah
it was slightly strange that Supir got a strong reputation as a general upscaler
models trained on heavy degradations smooth things over too much (by design)
Hi - anyone using SDXL 1.0 ?
They already have an img2vid model but they are doing extra evaluation and might release it since many people want it.
hi
wow great news
Hi,
I'm looking for ComfyUI tutorials for advanced architecture and interior design.
Could anyone help with some relevant resources?
anyone having problems with colab? my sd stops generating images after 15 min or less, and I have to run colab again... It's been happening for almost 2 weeks, idk what this is 😣
the free one or pro?
Pro
Id think pro should work, imo colab is pretty crappy. you'd be better off on something like vast which still uses jupyter and cheaper and gives you more choices over the runtime environment
I don't know vast, I'll take a look on it, thanks ^^
Is this like colab? Do you have some notebook to suggest me?
what do you like to use? here's one for comfy with flux. https://cloud.vast.ai/?ref_id=62897&template_id=c221e79c01d15b1461748f54946f530d just search templates, tons of choices
choose a template, then you choose your runtime environment, once you've done it one time, it's easy
most of what I use vast for is training, but it can certainly do more
I'm looking for automatic1111 with sd 1.5 first. Thanks a lot, I will look at it carefully
kind of blows my mind how many people use that old model
hey
sd1.5 is a very good model.
yah, for its time
and it used less resources I guess
whenever I go back to it though, I'm horrified
more abominations, more issues with drawing things in the distance due to the small latent space, and the prompt comprehension is the worst of all the models
well drawing things in the distance is a funny one
cos the thing about SD 1.5 is it wasn't trained on a lot of bokeh
I remember when SDXL base came out, I used to hang around in slightly different AI communities back then
and they were super unhappy with the bokeh
I dont really mean bokeh, but I see where you're going.
I know what you mean yeah- accurate fine details
what I would say is it depends on the resolution
for sure flux seems to prefer photographic and bokeh
SD 1.5 is much better at high resolution than SDXL
because Hi-diffusion works better with SD 1.5
for example
I'm guessing a finetune will fix that, just like the flurry of amazing finetunes made the base XL model all but unused
there's at least 4 big releases coming up this year
SD 3.1, the OMI and the next generations of Pixart and Auraflow which will have the modern VAEs
would also expect something from China e.g. Lumina team, Kolors team or that Huan one
I partly think its likely people will switch from flux
if they do, it'll have to be because something is way better, because flux is very trainable and that's usually all the community wants
well, as a way to define it, I would say that training on 1.5 and xl frequently led to 3 headed monsters, extra limbs, and burned out very quickly
Hi 👋
Does someone know how to fix anime full body pictures in ComfyUI?
sure, you can go into weeds on training params and such, but just saying, so far it seems the flux model is pretty tolerant and hard to overtrain
18k steps later ,you still get pretty solid images, in my experience
ok so here you are referring to precision
but that's not the issue with flux, in that area it is the strongest model, the problem is recall and conditioning
it can make very high quality images, but it cannot make a wide variety of images
and it cannot handle a wide range of conditioning vectors before getting off-manifold issues (CFG burn)
when people say "Flux can't use negatives" its a bit of a misnomer
Flux can use negatives it just burns the image though, by flying off the manifold
the thing with the negatives is a fair point and I do miss them at times, or I work around it by feeding it an image
try setting your cfg to 1.5 so that it pays some attention to the negs and doesn't go too overboard with overcooking
yeah that's a good tip
you can get a little bit of negative guidance
on reddit some mad lads got up to cfg 6 by chaining a bunch of anti cooking nodes (rescale/tonemap/threshold/auto CFG etc)
I actually had to do this with SDXL lightning today, its a similar issue, just not as bad as with flux
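For anyone wondering why cfg 1 ignores negatives and 1.5 doesn't, here's a minimal sketch of the standard classifier-free guidance mix, with plain lists standing in for the model's noise predictions. How a particular UI or a distilled model like flux wires this up internally is an assumption on my part; the formula itself is the usual one.

```python
def cfg_combine(cond, uncond, scale):
    """Standard CFG mix: start at the negative/unconditional prediction and
    move `scale` times the distance toward the positive prediction."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


positive = [1.0, 2.0]   # toy prediction for the positive prompt
negative = [0.0, 4.0]   # toy prediction for the negative prompt

at_1 = cfg_combine(positive, negative, 1.0)    # equals `positive`: negative cancels out
at_1_5 = cfg_combine(positive, negative, 1.5)  # pushed slightly away from the negative
at_6 = cfg_combine(positive, negative, 6.0)    # pushed far past the positive prediction

print(at_1, at_1_5, at_6)
```

At scale 1 the negative term cancels exactly, so the negative prompt has no effect. Higher scales extrapolate further past the positive prediction, which is the "burn" described above and what the rescale/tonemap nodes try to rein in.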
can someone help me find a checkpoint/lora thats similar to this style?
https://www.youtube.com/watch?v=ClSkX17HnRY&t=377s
can i pay someone to make me a flux lora
Wondering there are githubs or repositories extending SD 3 to video editing or generation?
You think anyone made a Lora based off of Super Smash Bros' Master / Crazy Hand?
because you know how AI and hands work
or rather, don't work 😂
I want to upgrade from my intel graphics card so that SD doesn’t take ages to do anything. Any recommendations for a card that has a good price to performance ratio?
scott detweiler has a good video on his youtube channel for fixing hands. because the masking part runs after the image generation, you can use it with flux as well as stable diffusion
you want an intel CPU and an nvidia GPU
hi
in ComfyUI, does lora loader have to be near the start of a workflow? I'm hoping to split up a generic prompt 10 times into loras and see exactly how much impact each lora has.
#🧣|comfy-ui better channel to ask in probably
Don't get anything under 8GB vram. 6GB vram, for me at least, struggles when you add things together (controlnet, ipadapter, etc), especially when using SDXL.
8+ GB vram, got it
If you want to run LLM's, just be warned that it will take a LOT more vram.
you want at least 16G vram, more if you can afford it
what does 8GB look like? Longer time to output an image?
1
I've played around on most VRAM sizes and 8GB does not feel great
its doable but you get out of memory errors a lot
once you start adding extra stuff like control net or upscaling
does anyone have a link to a working and supported google colab notebook fork of "deforum stable diffusion" please?
sadly that would involve me fixing it and I am beyond thick in those terms
. They have stopped supporting it and it was my main tool for creating stuff!
I was hoping a clever person out there would have an updated version they are supporting
does the notebook not run?
not as it did....
"⚠️ NOTICE: This project is no longer maintained. ⚠️ This repository is no longer actively maintained or updated. Users are advised to find alternative solutions or fork the project if they wish to continue development."
you might post to twitter, ping angry penguin, and see if he can suggest a solution
thanks
Hello, does anyone know how to use adetailer + soft inpainting in the img2img tab of a1111?
I don’t get this error on my crappy potato graphics card, it just takes 20 minutes per image. Have you tried telling the computer to slow down when it’s low on memory instead of throwing an error message?
hehe
its a funny idea but yeah it sadly would not work
it just can't fit stuff in the box its not a matter of speed
does anyone know how to use soft inpainting with Adetailer? 😢
damn all that installation just to realize my 1070ti is too old to generate images 😂
Its not, even 1060 can generate images.
Come to #🤝|tech-support for install help
Hey, guys. I have a question. Can I, for example, generate an NSFW image? Doesn't this violate the policy of using Stable Diffusion 3 Medium?
depends on the new license
anybody got this version of forge ui? Version: f0.0.17v1.8.0rc-latest-276-g29be1da7 I'm looking for this one please share me if you have it or tell me where I can find it
hi, does anyone know how to use soft inpainting with Adetailer?
guys, best background remover with an API thats cheap?
hi there..i am running flux in forge, and i was hoping someone could explain what this means in the cmd prompt as model runs
[Unload] Trying to free 18359.89 MB for cuda:0 with 0 models keep loaded ...
[Unload] Current free memory is 8456.80 MB ...
[Unload] Unload model JointTextEncoder
[Memory Management] Current Free GPU Memory: 8959.56 MB
[Memory Management] Required Model Memory: 6246.84 MB
[Memory Management] Required Inference Memory: 10239.00 MB
[Memory Management] Estimated Remaining GPU Memory: -7526.28 MB
i have 32gb ram, plus 12gb on my 3080RTX.
is just the way Forge makes sure your pc doesn't run out of memory
i do recommend grabbing the NF4 version of Flux as that one can run on 3GB VRAM even and not 16 GB at minimum
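The last line of that log is just the three numbers above it combined. This sketch reproduces it from the log values; it is inferred from the printed numbers only, not from Forge's source, and the function name is made up.

```python
def estimated_remaining_mb(free_mb: float, model_mb: float, inference_mb: float) -> float:
    """Free VRAM minus what the model and the inference working memory want."""
    return free_mb - model_mb - inference_mb


remaining = estimated_remaining_mb(
    free_mb=8959.56,        # "Current Free GPU Memory"
    model_mb=6246.84,       # "Required Model Memory"
    inference_mb=10239.00,  # "Required Inference Memory"
)
print(f"{remaining:.2f} MB")  # prints -7526.28 MB
```

A negative result means the model plus working memory doesn't fit in free VRAM all at once, so something has to be unloaded or moved out of VRAM, which is what the [Unload] lines are reporting.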
ok
100% looks like a phishing hub scam situation. i wouldn't start spamming your "young site" since it looks wholly illegitimate
oh god, it's unsecured http too. this guy is just farming IP addresses most likely.
anything sent to this page is sent unencrypted and will be plain text to local network admins and any processes running at your ISP.
or if you use any proxy server, they'll see everything too
the only time you see http is when some inexperienced person is trying to launch a scam off and can't figure out how to get a signed domain
Error occurred when executing PipelineLoader:
Error no file named config.json found in directory D:\ComfyUi\ComfyUI\ComfyUI_windows_portable\ComfyUI\models\IDM-VTON.
File "D:\ComfyUi\ComfyUI\ComfyUI_windows_portable\ComfyUI\execution.py", line 151, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUi\ComfyUI\ComfyUI_windows_portable\ComfyUI\execution.py", line 81, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUi\ComfyUI\ComfyUI_windows_portable\ComfyUI\execution.py", line 74, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUi\ComfyUI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-IDM-VTON\src\nodes\pipeline_loader.py", line 46, in load_pipeline
vae = AutoencoderKL.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUi\ComfyUI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\huggingface_hub\utils\_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUi\ComfyUI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\diffusers\models\modeling_utils.py", line 567, in from_pretrained
config, unused_kwargs, commit_hash = cls.load_config(
^^^^^^^^^^^^^^^^
File "D:\ComfyUi\ComfyUI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\huggingface_hub\utils\_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUi\ComfyUI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\diffusers\configuration_utils.py", line 374, in load_config
raise EnvironmentError(
guys, i'm having this error, can anyone tell me how to solve it? i've been trying really hard and i'm going to lose my mind now
can anyone tell me how can i solve this
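the traceback above ends in diffusers' from_pretrained failing to find a config.json in the IDM-VTON folder, which usually points to the model files not being downloaded into the layout the loader expects. a rough sketch for checking that, assuming the configs live either in the folder itself or in its direct subfolders (the function name is hypothetical, and which subfolders the node actually needs is not something the traceback shows):

```python
from pathlib import Path


def find_config_dirs(model_dir):
    """Split model_dir and its direct subfolders by whether they hold a config.json."""
    root = Path(model_dir)
    candidates = [root] + sorted(p for p in root.iterdir() if p.is_dir())
    have = [p for p in candidates if (p / "config.json").is_file()]
    missing = [p for p in candidates if p not in have]
    return have, missing


if __name__ == "__main__":
    # Path taken from the error message above; adjust it to your install.
    model_dir = Path(r"D:\ComfyUi\ComfyUI\ComfyUI_windows_portable\ComfyUI\models\IDM-VTON")
    if not model_dir.is_dir():
        print(f"folder does not exist: {model_dir}")
    else:
        have, missing = find_config_dirs(model_dir)
        print("config.json found in:", [p.name for p in have])
        print("config.json missing in:", [p.name for p in missing])
```

if every folder shows up as missing, the download probably never completed or landed somewhere else.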
when using stable diffusion, is it better to use prompts structured as sentences or prompts as individual characteristics separated by commas?
when using SD 1.5 you can go for attributes separated by commas. It's not that it was trained on that - it's just that the CLIP-L text encoder used by SD 1.5 is so bad at text understanding that it will only listen to certain words anyway, without getting the complete sentence
SD 2.1 and SDXL use better text encoders, but they are still CLIP-based, so they still have very limited text understanding. They benefit more from natural captions, but you can still use comma-separated lists
Flux and PixArt use T5, which is a powerful language model that really understands your prompt. They benefit a lot from natural language prompts. Also, both models were likely trained on natural language captions, so it's better to stick with that
the 1.5 anime models and their derived merges were actually trained on comma keywords, because they contain the NAI dataset and that was trained on booru tags
some XL models were also trained on booru tags, like Kohaku XL, Animagine and Pony
even Flux was trained on booru tags
but a) will this bias the output to anime and b) will the model have better prompt coherence with natural language
yeah, on models with t5 it's better to just use natural language
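to make the rule of thumb from this discussion concrete, here is a small sketch. the table only restates what was said above (encoder per model family and the prompt style that tends to suit it); the names PROMPT_STYLE, join_naturally and build_prompt are made up for illustration, and anime finetunes trained on booru tags are an exception the table ignores.

```python
# model family -> (text encoder as described above, suggested prompt style)
PROMPT_STYLE = {
    "sd1.5": ("CLIP-L", "tags"),
    "sd2.1": ("CLIP-based", "either"),
    "sdxl": ("CLIP-based", "either"),
    "flux": ("T5", "natural"),
    "pixart": ("T5", "natural"),
}


def join_naturally(items):
    """Join items as 'a, b and c'."""
    if not items:
        return ""
    if len(items) == 1:
        return items[0]
    return ", ".join(items[:-1]) + " and " + items[-1]


def build_prompt(family, subject, details):
    """Build a comma list for 'tags' models, a sentence for everything else."""
    _encoder, style = PROMPT_STYLE[family]
    if style == "tags":
        return ", ".join([subject, *details])
    if not details:
        return f"A picture of {subject}."
    return f"A picture of {subject} with {join_naturally(details)}."


print(build_prompt("sd1.5", "a red fox", ["snowy forest", "soft light"]))
print(build_prompt("flux", "a red fox", ["a snowy forest behind it", "soft morning light"]))
```

families marked "either" are treated like natural language here, since the comment above says they benefit more from natural captions.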
anyone got a current guide on running stable diffusion on windows + AMD?
tried using lshqqytiger's fork of automatic1111 and am continuously running into issues
yes there is one in tech support
Could someone help me understand how i would make an animation or video using ai?
tx
do we have a somewhat decent text=>3d or img=> 3d yet?
how to know what i installed in ComfyUI
there are a couple that were posted to twitter in the last few days that look promising, but that's all
you mean as far as nodes or as far as models?
i'm seeing new stuff on comfyui called "flux"
and i dont know if i installed it or not
flux is a model. you would know if you downloaded it. Comfy will run it, but it's not a node specifically for comfy.
can comfyui track the prompt and send it to text to text like SD?
but even if not, it kind of does it already, by embedding the workflows in the images
so you just pick the image that you want to recreate and drop it in comfy
so what I do is have a subfolder in my output folder for saves, if I like something I put it in there so I can reuse that workflow
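if you want to peek at what is embedded in an image without opening comfy, here is a minimal stdlib-only sketch that lists the text chunks of a PNG. it assumes the metadata sits in plain tEXt chunks, and that the interesting keys are "prompt" and "workflow"; both are assumptions about how the files are written, and the function names are made up for illustration. compressed or international text chunks (zTXt, iTXt) are not handled.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data: bytes) -> dict:
    """Collect tEXt chunks (keyword -> text) from raw PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    texts = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte length, 4-byte type, body, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            keyword, _, text = body.partition(b"\x00")
            texts[keyword.decode("latin-1")] = text.decode("latin-1")
        if ctype == b"IEND":
            break
        pos += 8 + length + 4
    return texts


def make_chunk(ctype: bytes, body: bytes) -> bytes:
    """Demo helper: build one PNG chunk with a correct CRC."""
    crc = zlib.crc32(ctype + body)
    return struct.pack(">I", len(body)) + ctype + body + struct.pack(">I", crc)


if __name__ == "__main__":
    # Tiny fake file (no pixel data), just enough to exercise the chunk walker.
    demo = (
        PNG_SIGNATURE
        + make_chunk(b"tEXt", b'workflow\x00{"nodes": []}')
        + make_chunk(b"IEND", b"")
    )
    print(read_png_text_chunks(demo))
    # On a real image (hypothetical filename):
    # print(read_png_text_chunks(open("my_output.png", "rb").read()).keys())
```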
I'm afraid I don't quite understand the question
I think they are saying they want to save old prompts as text files
SD can track the input prompt of an image made by it and send it to the text to text label, idk if comfyUI can do something like that too
SD is stable diffusion, that's not a GUI, maybe you refer to automatic1111
idk but i think yeah...
yes they mean automatic1111 probably
i just installed comfyUI and know that it can use the same models as SD/auto1111, so idk if it can do what i asked too?
in any case, if you get used to the workflow in the images that's actually easier to search later
visually
and swarm also lets you save workflows separately as kind of like a favorites
the UI looked hard to me at first, but i'll try to learn
I think they are saying save prompts
you can do this with a node like Save Text File from was-node-suite
#🏞|general-with-images message idk but can you guys confirm it?
generate a few images with different prompts and drop them back onto comfy, you'll see what I mean
Hi guys!
Hi everyone
hello, anyone comfortable with comfyUI + mask nodes?
Just bought a 3060 with 12 GB for my secondary PC (host PC) so that I can set up my AI and Stable Diffusion on it and run it locally on my network 😄
I am excited 😄
automatic1111 just broke suddenly. was producing images okay, all of a sudden it's all buildings? any checkpoint/vae I use, all buildings no matter what prompt
Send screenshots etc in #🤝|tech-support and we can help you better
will do!
