#💬|general-chat
what are you opening - are you starting a runpod or something?
and are you using comfyui or a1111/forge?
is bot down
Am i allowed to talk about ai animation with Stable Diffusion here? or is there a better channel for that?
I always double check when it comes to channels, by asking.
Sure! You can also try #🎥|animation
I wasn't sure but i guess i'll start here.
is the bot not active?
is animatediff the only animation workflow for comfyui or are there better ones to use for comfyui?
I would assume so?
there is svd, hxsvd, and deforum
which one would you suggest for a beginner if i might ask? im thinking of trying to make a super short film if i can.
Ok i just googled them and im debating between hxsvd and deforum but i guess ill try as many as i can first like you said.
one more thing where should i share the ai videos at?
I would be interested in looking at the videos. I have only seen examples on SD reddit
haha that commercial was just making fun of AI art
i get mostly ID-10-T user errors
where can I find artists to hire that are not from civitai? I've been through a lot of them already
What do artists currently charge?
It ranges based on their skill and their ego
Some charge $30 USD, some $120, and some over $500
for how many images?
For images? I'm talking about lora creation
oh ok I have no idea what I am doing with SD really. I can't get anything even close to midjourney
anyone know how to install open_clip dependency?
pip install open-clip-torch
thanks man, figured that out a few minutes ago. Now I’m just getting the configuration set up correctly. For text to video do you know off hand what the YAML is for that?
no sorry
can we put link here? cant find rule against it. I have something to ask
ye you can paste it
I've been asked to imitate this photobooth installation. I've only been given 2 months to research it or I will lose my job
I'm a graphic designer with 0 programming skills, do you think this project is achievable? 💀
And I wonder if Stable Diffusion / ComfyUI is the right tool to reach that goal
Yeah that doesn't look crazy - it looks like it takes a photo input, uses something like rembg or segment anything to remove the background, and then it is either photoshopped onto a movie poster, or onto an AI generated background (the text in the movie posters looks too good to be ai)
If you have a stupidly good setup (like a desktop pc with a 4090, and a good digital camera), you should be able to do each of the steps manually in a week, and then you've got a few weeks to figure out how to put it together in a sleek and automated fashion
still dont know how people dont throw up when you buy a 2500 dollar gpu
Hardware is not a problem, i got two A5000
My car isn't even worth that much!!
Sweet..
why buy the a5000 when you can just get a non server grade gpu that has the same vram
I'm not really sure if it's manually photoshopped or not, but my boss expects the generated image less than 3 minutes after the photo was taken
the PC is often used for rendering for days, so I think a server grade GPU is best for constant work
you can do it in like 2 mins if u have the backgrounds and the text already generated
amd has a server grade 7700xt called the W7700 and its basically the same thing
but the W7700 is 1000 while the 7700xt is like 500
It should only take a few seconds to save the file and then strip the background from it, then about a minute to generate a set of potential images
also u dont need rembg or segment anything if you take the picture against a solid background like the one used in the video. its easier if u have a green screen in your photo studio
we also have a wx9100
good point
If you have the backgrounds and text already generated, yeah l can see 3 minutes being more than reasonable
if you do the solid color background, it's always a good idea to set up a back flood light so that color splash doesn't happen
nearly every program I know slacks when it comes to AMD
fill light? i'm not entirely sure of the technical terms. don't want green reflecting back is the goal
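The solid-background idea above can be sketched in a few lines. This is a minimal, pure-Python illustration of chroma keying, not what the installation in the video actually does: the function names, the `dominance` and `min_green` thresholds, and the tiny pixel grids are all assumptions made up for the example. A real pipeline would read camera frames with an imaging library and would probably need softer edges and spill correction.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (red, green, blue), each 0-255
Image = List[List[Pixel]]      # rows of pixels


def is_green_screen(pixel: Pixel, dominance: float = 1.3, min_green: int = 90) -> bool:
    """Treat a pixel as backdrop when green is bright and clearly dominates red and blue.

    The thresholds are illustrative guesses, not tuned values.
    """
    r, g, b = pixel
    return g >= min_green and g > dominance * max(r, b)


def chroma_key(foreground: Image, background: Image) -> Image:
    """Replace backdrop pixels in `foreground` with the matching `background` pixels."""
    if len(foreground) != len(background):
        raise ValueError("images must have the same height")
    result: Image = []
    for fg_row, bg_row in zip(foreground, background):
        if len(fg_row) != len(bg_row):
            raise ValueError("images must have the same width")
        result.append([bg if is_green_screen(fg) else fg
                       for fg, bg in zip(fg_row, bg_row)])
    return result


if __name__ == "__main__":
    photo = [[(0, 255, 0), (200, 150, 120)]]   # one backdrop pixel, one skin-tone pixel
    poster = [[(10, 10, 10), (20, 20, 20)]]    # pre-generated background
    print(chroma_key(photo, poster))
```

Because the mask is a simple per-pixel test, this step takes a fraction of the time budget; most of the 3 minutes would go to whatever generation or compositing comes after it.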
I feel like there would be people in here having that going in a couple of hours if they had the hardware and the boss was paying them to do it
I don't really understand hardware. my office is willing to pay a stupid amount of money for hardware but not for human resources. That's why they asked a graphic designer to do this instead of hiring suitable experts
so yeah, 2 months is more than feasible
i remember when i tried daz studio with an amd gpu and it was so slow rendering with nvidia iray
I hear Devs talking shit about AMD and then I hear AMD talking shit about dev studios
like with blender... I think its the studio at this point because they keep adding junk instead of improving stuff
Can you help me by giving me more insight, what are the important points I need to learn
im really new into AI stuff
look for a tutorial on inpainting backgrounds
thanks
I have few pages on facebook, all 500k-1m followers, I want someone to JV with me. As I don’t create my own content? Anyone up for it? #content #help
jv?
where do people go to ask question about stable diffusion, seems asking here it's hard to archive and find it back in the future
depends what the question is..
how do i use stable diffusion to steal the declaration of independence? the real secret one kept under lincoln
I have over 200 2D designs, each as both a first-pass 2D sketch and its final polished version. How do I go about training a model to learn the transformation from sketch to final polished drawing, so that I can pass it a new 2D sketch and it spits out a final polished image that matches the previous 200 final designs in both quality and style?
getting my question ready
i would train a lora on your finished work and use existing sketch controlnets
is there a "beginner's guide" video that you can point me to? I'm total beginner
https://www.nextdiffusion.ai/tutorials/how-to-install-controlnet-extension-in-stable-diffusion-a1111 good site with tutorials here. they do videos and text guides.
learning how to train a style lora is going to be a little more involved. easy enough but you will have to experiment a bunch
https://www.youtube.com/watch?v=RT2jj-5t8x8 the scripts kind of update over time, so this is a little old. like most videos. but yeah. good primer for it still
hey did anyone familiar with integration between Stable Diffusion and TouchDesigner?
watched all of it, so many concepts are unfamiliar, lora, controlnet... don't even know how to dissect it. for image to image and with my use case in mind, what kind of project would be suitable for me to start experimenting on, or what keywords should I search videos for?
You've bit off a big piece of cake to chew for someone who has no experience. Try starting with a glossary of terms
Huh
Uh
hello
hey wassup guys am new here
What is the next logical step for Stable Diffusion? More SVD development? Improving SDXL turbo?
What is best model/method to make dating app pictures atm in 2024?
With a camera 
Hahaa might be true. But i would love to know latest thing right now and test it. Im so bad at taking pictures of myself 😦
I dont think people appreciate if youd use AI, just be yourself and youll be fine
You are propably right. But i would like to test and see What are the results. At somepoint dreambooth was way to go, But not sure What is bleeding edge technology right now
IPAdapters probably right now
Or using a face swap tool
Okey! Thanks alot!
what version of automatic1111 is best for amd gpu on windows?
Depends on what you want, Pure Speed or (Usability/Compatibility)
pure speed. i tried comfyui but i only get 1.4 it/s
and it doesnt use all my gpu ram so it crashes trying to generate anything above 512
whats your gpu?
6700xt
ik i should have got nvidia but i didnt know about this AI stuff and upgraded from a 1060 😔
then i would recommend using the amd version of auto1111
you would get the most out of it
do you know what fork that is?
yea its the one from lshqqytiger,
youll find my full install guide of it in the #🤝|tech-support channel, in the pinned messages
used it on a 6700xt, 6800xt, Vega7 and 7900xt, and the usage is good, speed is okay, not the fastest
if you want max speed you would need to use the Shark webui from Nod.ai, but that lacks support for most features, and has no extension support
how to use the ai?
the bots here to generate images are currently under maintenance.
you would need to use another service, or install Stable Diffusion locally on your PC
oh thanks
I followed the tutorial and still get 1.4 it/s 😦
yea it uses directml, the same as comfyui uses for AMD, but you can generate at higher res than 512
and you can even use Hires fix to upscale images when done right
everything you are going to try will perform more or less the same under comfyui & directml on windows.
improved comprehension of the prompt
you can fine tune sdxl to learn your style. you can use controlnet to turn sketches into drawings.
hello everyone!
hi, im back after quite a while and am wondering, where i can type in my prompts, because the bot1-17 etc. system seems to be offline...
guys, what technique is good for upscaling? ESRGAN 4x upscaling, hires fix, refiner? it's a lot of nodes and options and i don't know what to connect with what 😄 SDXL and SD, or all connected in one? can someone recommend something?
For now im using two workflow, simple workflow with hires fix and another simple workflow with ESRGAN 4x upscale, but i dont connect refiner and stuff
Fooocus seems simpler to work with but it has the same issue as the other ones: Performance on my laptop lol
Fooocus is too simple for me and lacks things
depends on what someone wants, i dont need SD for more than some generations of things which i cant achieve with Firefly
but it hits my laptop hard so i guess i will have to go for some web UI
no local
Is there an AI tool that is good at including text in images? I'd like to find an SVG generator that can input things like a person's name, and add decorations, embellishments etc.
It's not SD but there's || Microsoft Designer ||
Iirc some SDXL finetunes are also good at this
As well as StarryAI and LeonardoAI
I usually use Bing Image Creator/Copilot nowadays
AI art rocks though, so many cool ideas I would've never come up with on my own
I just doodled something for like the first time in months, drawing is super slow but the AI helps a lot
Is there any sdxl normalmap controlnet/t2i-adapter/anything? I'm not sure if there actually isn't one, or I'm only looking in the most obvious places...
Anyone here good with underwater/landscape/simplistic cartoon-like styles? look at my post in community projects
looking to collab!!👀
i like MS Designer/Image Creator but i wish it would have less censorship since i already use another one that is heavily censored
I have a newb question. I'm trying out a new (to me) SDXL checkpoint, and the images look great all the way until the final step, and then it switches to oversaturated, bad coloring. I'm seeing comments that say this is a VAE issue with unpacking? but I'm not really sure what to do to fix it. Any advice?
a1111 or comfy?
a1111
if its a1111 and the vae is set to automatic or none - download this vae and set it to use this https://huggingface.co/madebyollin/sdxl-vae-fp16-fix/resolve/main/sdxl.vae.safetensors?download=true
Boom, fixed. You're awesome. Should I keep that as the default for all checkpoints, or only use it when I run into this saturation issue?
default for sdxl checkpoints
Saw news about CUDA Library replacement for AMD GPU ZLUDA. Looks really promising for AMD folks
I do not like how AI use poor language, like you do not get a cat when asking for a pu**y, and what shall we write if we want a cherub?
No no no no no, that is not a small child, a cherub was a horrible monster with a lion's body, eagle wings, a human head with a beard and also goat hoofs. Will there ever be a grammatically correct AI.... the dyslexic person asked.
Fgs, even this server blocked a correct word for cat.
Hey guys i make 30k with dropshipping thank you lord
It uses whatever it was trained to use.
Hey G's, how y'all doin? I need help with my Stable Diffusion setup. When I run the webui-user.bat file to get to the Stable Diffusion interface, it gives me this error: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check. Any ideas on how to fix this?
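One way to narrow down a "Torch is not able to use GPU" error is to ask PyTorch directly what it can see. The sketch below is only a diagnostic, and it assumes you run it with the same Python environment the webui uses (usually its own venv); `describe_torch_gpu` is a made-up helper name, not something shipped with A1111.

```python
def describe_torch_gpu() -> str:
    """Return a one-line summary of whether this Python's torch can use a CUDA GPU."""
    try:
        import torch  # imported lazily so the script still runs where torch is missing
    except ImportError:
        return "torch is not installed in this Python environment"

    if torch.cuda.is_available():
        # Device 0 is the first GPU torch can see.
        name = torch.cuda.get_device_name(0)
        return f"CUDA OK: {name} (torch {torch.__version__})"

    # torch.version.cuda is None for CPU-only builds of torch.
    return (f"torch {torch.__version__} cannot see a CUDA GPU "
            f"(built for CUDA: {torch.version.cuda})")


if __name__ == "__main__":
    print(describe_torch_gpu())
```

If it reports a CPU-only build, or the machine has an AMD card, then `--skip-torch-cuda-test` only hides the check and generation falls back to the CPU, so a different torch build or a different webui fork is the more likely fix.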
Yes, the downfall of AI is that it is trained on the open Internet when all good info is behind pay walls 😄
Just call Discord by it's name 😂
On the other hand, the Discord AI do kind of funny images as long as you do not use educated language, you never know what you get when ask for a pu*** in Boots.
Does someone know about the amd gpu issue? i have a rx580 and it doesn't use my gpu, it uses my cpu
there are various models for controlnet, like here some:
https://civitai.com/models/38784/controlnet-11-models
can you share some info in #🤝|tech-support ? most likely you're running the wrong webui
It depends on what you are trying to generate
Unfortunately not for SDXL though 😦
lots for sdxl too
thanks, yeah seems like there's no normal map unfortunately. will try a bit more with midas or zoe depth estimation
https://huggingface.co/bdsqlsz/qinglong_controlnet-lllite this guys stuff too
wtf wow thank you for finding this I would have never seen it. Seems like it was only trained on anime, but idk, maybe it'll work decently for photoreal as well. I hope it works
yeah theres lots of obscure efforts
curious how you found it? What search terms did you use? B/c even when I use huggingface search I didn't see it. If there are other obscure efforts like this I want to be able to find them
Do any of you know if there is something for stable diffusion that takes an image or text and turns it into a 3d model, or is that not a thing yet?
Stuff like Pokemon and so on works...but when it gets more controversial it doesnt
I cant make parody of Jesus or prophet Mohammed for example
Jesus works with some things
There was something but a) if i remember correctly last updated like 6 months ago and b) bad quality (like with all of them tbh)
I'll be willing to check it out, do you remember the name of the tool/extension? or should i just learn 3d modeling...
Zero 123
Stable Dreamfusion.
I encourage people to give 3D modelling a try but the question arises what do you want?
I mean the same applies to 2D as well
Yeah this one as well
For me what im excited for are actually those 3D gen AI things that generate point clouds instead of meshes
Meshes are "dirty", have bad textures, topology of doom, mesh anomalies that have to be cleaned up as well and so on
Hey everyone i have a question about loras something I really don't get
Is it useful to use multiple loras at 1.0+ each, or with weights adding up to more than 1.0?
i feel like i've already seen many prompts that do, but i am not sure, because from what i first read the total needs to be <=1.0
i never heard of going over 1.0
usually u go 1.0 or less: 1.0 on the lora u want more of in the art, and the other loras under that, so that they appear, but not as pronounced
and from there u can kinda tweak the values
i dont think going over 1.0 is even supported, and if it is, it prolly has no effect on how much its used over the other loras if such a value is assigned to that lora
bc no matter what, the loras under that one, set to 1.0, will always be less than the primary
its kinda like, think of 1.0 as 100%, and anything under, for example 0.9 or 0.6, like 90%, 60%, etc. i think anything above 1.0 is still technically 1.0
:3
Some loras will go from -5 to 5... just depends on the training
whaa
<v> but then that begs the question, where is middle
if a lora can go to -5, a negative integer cant possibly be valid
Hello everyone!
I have a question about SD1.5
I'm trying to generate a human character with cat tail.
But whenever i try to generate the image, it always has multiple tails or misplaced tails.
I tried positive, negative prompts several times but it hasn't worked so far.
Do you have any kind of ideas of solving this issue??
Thanks in advance 🙂
kind of just depends, many times if you have more than one at >=1.0 then you will start to have lots of image artifacts, or lose the ability to produce things the model knows how to draw well. if I'm using multiple loras at once in txt2img, I often will set the ones high that will get the basic shapes in the image (think of like what you would want to see in a controlnet depth map), then using controlnet to maintain the integrity of that, you can play around more with the various loras without using really high values
in general 1.0 only means, 100% of the strength of the training, but that could be overtrained or undertrained. that in a nutshell is why "it depends"
my personal anecdotal experience with loras posted in public domain, many are overtrained, and that's not a bad thing really, there are times you want to nudge the interpretation a bit harder, and you can always reduce the weight
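For the "is anything above 1.0 still 1.0" question, it may help to see what the weight does numerically. In the general LoRA formulation the adapter stores two small matrices, and the strength simply scales their product before it is added to the base weights. The sketch below uses plain Python lists and made-up 2x2 numbers; it leaves out the alpha/rank scaling that real implementations also apply, and `apply_lora` is a hypothetical name, not a function from any UI.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x k) and b (k x m)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def apply_lora(base: Matrix, up: Matrix, down: Matrix, strength: float) -> Matrix:
    """Return base + strength * (up @ down), the merged weight for one layer."""
    delta = matmul(up, down)
    return [[w + strength * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(base, delta)]


if __name__ == "__main__":
    base = [[1.0, 0.0], [0.0, 1.0]]   # toy "model" weights
    up = [[1.0], [2.0]]               # 2 x 1 (rank 1)
    down = [[0.5, 0.5]]               # 1 x 2
    for strength in (0.0, 0.6, 1.0, 2.0, -1.0):
        print(strength, apply_lora(base, up, down, strength))
```

Nothing in the arithmetic clamps at 1.0: a strength of 2.0 pushes the weights twice as far as training did, and a negative strength pushes them the opposite way. Whether the result still looks good is a separate question, which is the "it depends on the training" point made above.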
I'm Taylor David, a skilled AI developer with a focus on developing cutting-edge conversational AI systems. Creating AI chatbots, like ChatGPT, with a focus on voice AI, SMS, cold calling, and Twilio connections is one area of expertise. competent in creating Broadcast Chatbots, Auto-responder Bots, Telemarketing Bots, and Appointment Bots. competent in deploying frontline AI, VoIP sales, IVR, live chat, and conversational AI for Plivo SMS, guaranteeing smooth and efficient interactions.
does it matter how many lora's you use together--even if they are all 1.0 or lower?
Where do i make pics? Cant find bot1 etc on list
Found out...
Sad news..
Read last message
Cold calling voice AI. Classy.
and i'm a unicorn
increase cfg scale
by like, default is 7, so try 9
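For anyone wondering what raising CFG from 7 to 9 changes: classifier-free guidance runs the model twice per step, once with the prompt and once without (in A1111-style UIs the negative prompt takes the place of the "without" branch), then pushes the result away from the unconditioned prediction by the scale factor. A minimal sketch with toy numbers standing in for the noise predictions; `cfg_combine` is a hypothetical name.

```python
from typing import List


def cfg_combine(uncond: List[float], cond: List[float], scale: float) -> List[float]:
    """Classifier-free guidance: uncond + scale * (cond - uncond), element by element."""
    if len(uncond) != len(cond):
        raise ValueError("predictions must have the same length")
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


if __name__ == "__main__":
    uncond = [0.0, 1.0]   # toy prediction without the prompt
    cond = [1.0, 1.0]     # toy prediction with the prompt
    for scale in (1.0, 7.0, 9.0):
        print(scale, cfg_combine(uncond, cond, scale))
```

A scale of 1.0 just returns the prompted prediction; higher values exaggerate whatever differs between the two branches, so the image follows the prompt harder at the cost of more contrast and artifacts if pushed too far.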
also try my negative prompt,
"boring_e621_v4, aid291, wings, bad_prompt_version2, ugly, tiling, poorly drawn hands, poorly drawn feet, poorly drawn face, out of frame, extra limbs, disfigured, deformed, body out of frame, bad anatomy, watermark, signature, cut off, low contrast, underexposed, overexposed, bad art, beginner, amateur, distorted face"
have you searched civitAI for loras that do this
why does it take more than zero seconds to refresh the lora folder in A1111, losing my mind
wym
Because it looks into each lora to determine if it is the right architecture, and then only displays compatible loras
does it dial a modem first
well looks like SAI Japan was too fast with their fingers and published a major SAI announcement early
I wonder if they will release it officially today
or in few days
Looks like a huge improvement over SDXL again. Increase in VRAM for the default models, but the aesthetic score improvement probably means it should be waaaay better at prompt following, reducing mutant generations and so on.
hopefully competes with the big 2 other image gens rn. i need something to be able to work locally again with high qual for anime-esque pics.
aesthetic score compared to what lol
scored by whom
to SDXL
by same people i assume
They wouldn't be releasing worse model than SDXL because there would be no point
lol most ppl still use SD 1.5 instead of SD 2/2.1 and XL
but hopefully it'll be faster and easier to train than XL
isnt that what happened with 2.1?
yes
SD1.5 was not released by StabilityAI, it was released by researchers who were involved in the creation of SD and who decided, against Stability's wishes, to release the model uncensored.
2.1 is better than 1.4, which is what it should be compared to
Yes i did. But none of them give me exact 1 tail
there is a standardised aesthetic scoring model out there..
You mean all the ones who work at stability?
I see you were not there at release of 1.5 and following controversy spat between Emad/Stability and researchers
I was definitely there haha. Since that the 1 researcher who was not at stability joined stability
Whole team is here now 
Then why do you not remember that spat ? Should i post logs from hugging face or something ? They are still there. Clearly 1.5 was not meant to be released as such.
I do remember it well. Tbh was a little silly at the end of the day. 1.4/1.5 were not all too dif from one another, given it was only 550k steps of retrain in a laion sub portion. Either case its long past now
yeah
I guess I just like people to know on the research end we are all buddies at least haha
so was japanese SAI's finger too fast, or is this a japan release?
I mean researchers sure are buddies no question about it. management and corporation when money is involved are entirely different thing, especially laws to follow
Emad on twitter was raving about something to be released soon and i assume it was it
Yea definitely, navigating the gen ai landscape can be a whole thing to be thoughtful in
i remember japanese courts outright said AI companies can use copyrighted data unlike US or elsewhere where it is still gray area
so maybe japan release has some meaning
Huh wonder which thing it was, tbh I am not the best at following his Twitter 😅
truth be told SAI is cooking multiple things at the same time so it might be about something else
Not japan specific, just think they are up early
nice, then so today release, hope whoever sleeps with button on hugging face repo wakes up early haha
Speaking of which, I would love to see a proper image-to-text-to-image model released. So for example you talk with the model: "paint me a tree", and then you say "ok, now make the tree taller and put bananas on it".
in theory you can do it with a vision model and then an LLM as a fetch for SDXL creating the prompt, but live editing would require a whole end-to-end model
I think creatively this is where the goldmine is, more so than prompt style.
Definitely, iterative models are an area to be majorly improved on. We saw some early variants of this with, say, ip2p and Emu Edit, and you can honestly get away with a lot just doing plain old noise inversion. There are some potential model changes that could be fun to help with this task, but really I think the biggest barrier is generalizing every form of edit someone might want, both on the language side (which will likely incorporate a VLM) and on the image model side that enacts the change, and that comes down to the data for training one. Open source VLMs still need some love and improvement, and then a lot of this will become easier
Shouldn't the dataset be a problem? I mean creating an image from an image-text pair is workable, but teaching how to take language and modify an image is an entirely different problem.
I mean such a dataset would probably need to be created from scratch
and you have chicken and egg problem essentially
What gives me hope is that with huge LLMs you have emergent properties. Like spatial knowledge, where the model can infer gravity or other forces without having experienced gravity
Yea data is pretty much the biggest barrier to most problems I still see in the entire generative space. Making a set that encompasses enough radical and tiny modifications to generalize a model of that form to most things someone would want is not an easy task.
I still use the 3 wooden boxes problem to test model intelligence. The puzzle goes like this: you have 3 wooden boxes of exactly the same shape and size; you put the second on top of the first and the third at the side of the second. The puzzle does not contain any information about friction, gravity and other things you should take into account, and most LLMs fail at it aside from the really big ones (or best ones) like GPT4 or miqu70b
Effectively puzzle of second order when answer relies on things outside of puzzle
Yep, just need enough strong data points and a model capable of learning from the distribution and you get some magic haha. A very strong accessible VLM would be helpful in making a synthetic data set for a task like this or at least filling knowledge gaps in training
So if the model is just a next word generator it will fail, but if it has an emergent property of reasoning based on experience then it will not
another thing to consider is probably compression artifacts.
the ability of 7B model to call up data from a terabytes/petabytes of data is insane
it leaves in dust even the best compression methods, but it has its price in form of compression artifacts aka wrong data, hallucinations and so on.
dude is literally spamming his resume as a spammer, lol
Do you guys feel any difference on using Comfyui or automatic 1111 on pinokio or external browser? for me automatic 1111 is way slower...
try sd-webui-forge
a1111 is slower, but forge is as fast if not faster than comfy, but with the same webui
Did you find?
hey question, i use remote desktop to connect to a couple of machines on my home network to drive them from my office, but more recently one of the machines (a 4-GPU Quadro NVLink, 32-core Threadripper machine) has started kicking me off the connection when i'm generating stuff with A1111. I can reconnect again shortly afterwards but it keeps happening. Anyone ever experience anything like this?
network speed is 10g
Maybe the vram is so maxed out that it can't keep the stream on
Because the remote software uses vram too
yeah thats what i was wondering so i plugged the video outputs into a different card and its still happening
20GB VRAM, kek.
Try using --listen and then use the web browser on the connecting system instead of RDPing in
Hi
yah i've used that before, just annoying because i still need to be able to drive the machine, not just for SD
Im using Rustdesk for Remote control, maybe give that a try
thx will look into it
Do yall know any extension which can help me improve my prompts?
Like suggest me what should I add.
Or any other cool extension you'd suggest using
what version download this model from files list?
Hi
Is here someone from Poland?
yes, two things: dynamic-prompts has a 'prompt magic' feature that runs your existing prompt through one of several language models and adds cooler stuff. you might also want to look up some styles files, which are snippets of good prompts for certain styles, like vector, photorealism, retro, cute, 3d render, that can be appended to your prompt to beef it up, kind of like an autotune for your existing prompt. 😄
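To make the "styles files" idea concrete, here is a minimal sketch of how a style snippet can be merged into a prompt. It mirrors how A1111 styles behave as far as I understand them (a `{prompt}` placeholder is filled in, otherwise the style text is appended after a comma), but treat that as an assumption; `apply_style` and the two example style strings are invented for illustration and do not come from any particular styles file.

```python
def apply_style(prompt: str, style: str) -> str:
    """Merge a style snippet into a prompt.

    If the style contains the placeholder {prompt}, the prompt is substituted there;
    otherwise the style text is appended, separated by a comma.
    """
    if "{prompt}" in style:
        return style.replace("{prompt}", prompt)
    parts = [part for part in (prompt.strip(), style.strip()) if part]
    return ", ".join(parts)


if __name__ == "__main__":
    # Hypothetical style snippets, similar in spirit to what a styles file contains.
    vector_style = "vector art, flat colors, clean lines"
    retro_style = "retro travel poster of {prompt}, halftone print, muted palette"

    print(apply_style("a red fox", vector_style))
    print(apply_style("a red fox", retro_style))
```

The placeholder form is handy when the style needs to wrap the subject ("poster of ...") instead of just adding tags after it.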
Hallo
which text2image model (local) has the best results for a A100 GPU?
maybe on dalle3 level
i want this art style: https://24posters.co/products/crypto-kids-6
maybe someone know a specific model/checkpoint
You are a lifesaver. Was looking for something exactly like that. Tysm
Any chance you can paste the link of that dynamic prompts extension ?
I struggle finding the right one.
https://github.com/adieyal/sd-dynamic-prompts/ sadly im not sure if they have a comfyui version or not. if you use A1111 install this via the UI rather than outside of it. (extensions > from url)
https://github.com/adieyal/comfyui-dynamicprompts oh the dev DOES!
: )
that's cool, no probs. One thing i will say about using magic prompts. never use any of the models tagged unsafe, they contain some really horrible creepy stuff that would make a 4channer blush and you really don't want something randomly generating images on that.
its actually worth looking the extension up on youtube too, it does quite a lot
Oh : / . I will not download random models then
Do you know the best one?
Just 1 model for general work. (No nsfw).
Which was trained on the most prompts
yeah, the gustav model is great.
Ty i will get that one then
a1111 will download the model for you first time you select it
Oh wow cool.
i forgot, it also has "im feeling lucky" which will take random prompts from lexica.art and play with them too. if you cant think of anything at all to prompt.
Thats handy too. Tysmm. Is there any other cool extension you suggest ?
I am noobie.
Or Tool
controlnet is very very good too. i would suggest playing with dynamic prompts first and then getting controlnet. or controlnet first. both are kind of a whole thing to learn so you dont want to have 2 extensions and not be familiar with either
Oh I got that one. Controlnet.
yeah i dont think i know anyone that doesn't by now
dont think i have. i just between comfy and a1111 a lot
You should definatly get it
It removes things from picture with your brush.
Just paint on what y dont want and it will fill
https://github.com/huchenlei/sd-webui-openpose-editor this is good too, it works like an extension for controlnet to let you edit the open pose models and move little things around, or attach a hand if one wasn't detected.
From ControlNet extension v1.1.411, users no longer need to install this extension locally, as ControlNet extension now uses the remote endpoint at https://huchenlei.github.io/sd-webui-openpose-editor/ if no local editor installation is detected. nevermind, its actually integrated into controlnet now, im behind the times 😄
i will look into lama cleaner though, i tried something similar last year, but it was a bit underwhelming
Oh. So it lets me edit pose image?like i can move hand points around with mouse?
I didnt see this feature yet
I thought it would need a 3d program or some website or photo editor
Bcz it generates a flat 2d pose image ig
Mikuu
no its tucked away. i cant share a screenshot with you but if you press the little explosion icon beside preprocessor (using openpose) it will generate the pose model for you. on that image it will have small buttons. edit, JSON and close. click edit and it will open up a finetune editor. its a bit janky but still good.
So its already added to controlnet?
I will definately look for it next time I use
Thanks so much.
^_^
no problem, learning and sharing is pretty much the whole AI community. we're all running blind figuring things out from each other since theres no definite academic courses invented yet 😄
: )
any front end using stable cascade ? Comfy UI ? Automatic ? etc ?
Not yet
damn it, from online demo it works amazing
it is a really great improvement in following prompts
Nice! I've not had chance to try it yet.
there is online demo but it is currently hugged to death lol
Example: Red squirrel sitting on motorcycle in middle of shopping mall
output on first generation:
damn it online demo deletes fast generations
but it followed the prompt really really really well
Kind of like DALL-E3
Looks promising. Can you try some with hand in and see how that copes?
from initial reviews hands are still problem, dalle3 still has those issues.
still to early to tell. If it can improve in that aspect by 50% that is still a win
Not even a little better hit rate?
online demo is hugged to death by public rn
generating anything is almost impossible
is there something similar to photoshop's "puppet warp" here in SD?
say theres something im curious about, saw it a couple of times but dont know what its called. Basically before you use an image as reference for a generation, you grey it all out so only a silhouette of the original picture remains.
Any ideas whats that called?
looks like online demo got some improvements and now there is proper queue and you can see generation in real time
woman buying dragon egg in fantasy mall
war scene from movie, lizards attack human knights
Maybe you mean Controlnet with Canny mode
Or Depth
depth, thats the one
scene from sci-fi movie, spaceship battle over planet
those all first generations no redos
Stable Cascade looks pretty promising.
does the bot use stable cascade now
There is no bot #1047610792226340935
ok
depth can be very good at getting poses correct. a lot of people dont really use it, but if you combine it with openpose as the second controlnet it works very well for fidgety poses like dance or crossed arms or legs that pose just doesnt quite work 100% with, like understanding which leg is in front of the other.
i mean openpose as the second controlnet and depth the first, my grammar is weird 🙂
training needs 20-24 gb ??? but what about running a model ?
Depends what batch size they recommend for training but given the numbers of parameters involved it should be similar to sdxl
Assuming stage models are swapped out
i am not a model trainer so my question was, can i run it offline with 16gb vram
because it is very near dalle3
and to my surprise its based on wuerstchen
@nova zodiac
Im guessing yes but that may be premature
Guy in the DB training server said it was ~9gb if you use gpu offload - and there are multiple models so will just have to see....
Oh Nice, stable cascade. Sounds like a nice excuse for me to ditch my RTX2060, or will it run?
I admit I haven't researched this at all, but what exactly is the difference between SD1.5, SD2 and SDXL?
I assumed SDXL would simply be the newest and best version, but for example on Civitai there's far less LoRas for it
There are other deeper differences but SDXL is 1024x1024 it requires more time and vram to train. That's why there are less on Civi.
Ah interesting! What are the other differences then?
I am currently using SD1.5 and generating my pictures around 400-600 pixels depending on the aspect ratio I need.
If I used SDXL for that, would that also take longer?
Thanks!
stable cascade is the newest and best version, just came out today
Now if that isn't an amazing coincidence
but i'm not sure if anything has been updated to be able to use it yet
there's a demo for it on hugging face @ https://huggingface.co/spaces/multimodalart/stable-cascade but it's being... hugged to death if you will xD
when it works the image quality is great.
there's standalone scripts released for it too. no ui. its fun to use generative ai that way sometimes
where's the standalone scripts? i don't mind it being clunky, haven't found anything yet tho.
Ah so it's not possible to load a stable cascade checkpoint into Auto1111 or Comfy?
not yet, it's the very first tri-model
sdxl is a dual model, cascade uses 3 models in its inference
https://stability.ai/news/introducing-stable-cascade this is the official announcement
i know they support the wurschein (sp?) architecture with the first model they released. shoudn't be long before cascade gets ui support
Great
ugh; the scripts are in notebook format, i'm not very good at setting up notebook stuff
wurstchen! with an umlaut but i don't have access to those characters
I got it running locally with this: https://github.com/EtienneDosSantos/stable-cascade-one-click-installer
nice, ty.
Würstchen !
wait that's wuerstchen
Yes ue can replace ü
That said, I think it loads all the biggest models, so YMMV
is wuerstchen v3 and stable cascade the same thing? i've been seeing people say that
They're built in similar ways, I believe
that script is specifically for wuerstchen v3 i think
but i'll try and get it setup & see
It's not pulling wuerstchen, it's using StableCascadeDecoderPipeline and StableCascadePriorPipeline? But I've never used wuerstchen, so I can't really compare
ah i was reading the code directly and it says "pip install git+https://github.com/kashif/diffusers.git@wuerstchen-v3"
there are demos on HF, quick give me a prompt before it gets hammered
I think that because they're built so similarly, they borrow some underlying base code. But the models it uses and the diffusers it uses are all Cascade
oh ok
also i never used diffusers before, is it still like 100% local?
i always used the manual setups/webuis/etc
What I just sent you is, yes 🙂
alright. ty. gonna run it now and see
Umlauts are so metal : 🤘
It uses just around 20GB vram with whatever models it downloaded - I think that it's using the biggest models. I'm not quite certain where they got downloaded to, so I'm not sure how one would switch them out
np; i have 3090
i might have to close youtube tho
or nah i have more than enough leftover even with yt open xD
will cascade finetunes be 3 models or only one will be the finetune and the rest is the same?
YouTube using vram? Weird
sometimes discord/youtube open total to 1GB vram but not right now
seems i'm only using 0.5
Vram?? Or system ram?
vram i think, when i used to use kobold AI and maxed out my vram i had to close everything out
I've never encountered such things..nifty
it generates at 1536 x 1536
that's the max u can go before quality loss?
@keen ruin it doesn't use 20GB, it uses just under 15GB (bfloat16). it would work on Colab free tier if float16 was not overflowing, and thats the full fat version, the lite version should be significantly smaller
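(A minimal sketch of the float16 overflow point above, using only Python's standard library. The helper names are made up for illustration, and the bfloat16 emulation simply truncates a float32, which is an assumption rather than how any particular framework rounds.)

```python
import struct


def fits_float16(x: float) -> bool:
    """Return True if x can be stored as a finite half-precision float."""
    try:
        struct.pack(">e", x)  # "e" is the struct code for IEEE 754 half precision
        return True
    except OverflowError:
        return False


def to_bfloat16(x: float) -> float:
    """Emulate bfloat16 by keeping only the top 16 bits of the float32 encoding."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]


if __name__ == "__main__":
    for value in (1000.0, 70000.0):
        print(value, fits_float16(value), to_bfloat16(value))
```

float16 tops out at 65504, so a large intermediate value overflows, while bfloat16 keeps float32's exponent range and only loses precision.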
dont know, just on a huggingface space and it let me gen at that
Well, the script I linked above is using 20gb - I can't exactly tell you why, but it do!
I've been switching between the Cascade model and a base SDXL model and this may be anecdotal, but **hands** generate a lot better! I'm having more trouble breaking hands
2.1.2+cu118 won't work, will it do fine if i do 2.2.0+cu121?
o same with torch vision, also fails install so need to bump up the ver
did you have to do anything special to set it up? the torchvision version and pytorch version it wants to dl don't work & it fails to build for me.
I did not - I can see the author is currently adding commits, though, so I have no idea if what you cloned and what I cloned is going to be the same 😅 I have the commit with Code_Example-v0.0.3.py
the install.bat is the problem for me, i can't get the specific version of pytorch/vision it wants apparently
That's odd, could be a Python version issue? Mine seemed to install fine without any errors. I'm not sure if it pulls any global dependencies or if it's all in the venv (I'm not too knowledgeable with Python specifically)
using 3.11 or 3.12?
3.10.11 actually!
"Could not find a version that satisfies the requirement torch==2.1.2+cu118 (from versions: 2.2.0, 2.2.0+cpu, 2.2.0+cu118, 2.2.0+cu121)
ERROR: No matching distribution found for torch==2.1.2+cu118"
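(A small sketch for reading an error like the one above: it pulls the "from versions" list out of the message and shows which builds share the pinned CUDA tag. Both function names are hypothetical helpers, not part of pip. When the pinned release is missing from the list entirely, a likely cause is that no wheel exists for the running Python version, which fits the 3.10 vs 3.11/3.12 exchange here.)

```python
import re


def available_versions(pip_error: str) -> list[str]:
    """Extract the versions pip says it could find from its error text."""
    match = re.search(r"\(from versions: ([^)]*)\)", pip_error)
    if not match:
        return []
    return [v.strip() for v in match.group(1).split(",") if v.strip()]


def same_cuda_candidates(pinned: str, versions: list[str]) -> list[str]:
    """Return the available versions with the same local tag (the part after '+')."""
    _, _, local = pinned.partition("+")
    return [v for v in versions if v.endswith("+" + local)]


if __name__ == "__main__":
    message = (
        "Could not find a version that satisfies the requirement "
        "torch==2.1.2+cu118 (from versions: 2.2.0, 2.2.0+cpu, 2.2.0+cu118, 2.2.0+cu121)"
    )
    found = available_versions(message)
    print(found)
    print(same_cuda_candidates("2.1.2+cu118", found))
```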
ah okay i'll downgrade
its silly, but there are so many dependencies on it they have to keep it there for now. like how big military bases all use old computers. it would take forever to upgrade everything and make everyone else do it without issues.
yea; i know newer versions sometimes don't work, used to a lot of the stuff being made for the new stuff.
also if you can, use pip to downgrade
yea it's downloading it fine now lol
its annoying when you have drivers and stuff screaming to you to upgrade and also being told to keep it low. i remember the little screaming fit moving up from torch caused 😄
I have been using 3.10.11 for like 8 months without issue (knock on wood), I'm scared for the day I need to upgrade
also if you ever need cuda different versions (you almost never ever should unless youre devving) you can log in to nvidia with an academic account and get builds.
Thanks for the tip!
getting no module named 'torch' error now ;_;
have you maybe missed a step in installation? like does your computer have the binaries?
well with the install.bat i'm getting some kind of error at the end
ERROR: Failed building wheel for insightface
Failed to build insightface
ERROR: Could not build wheels for insightface, which is required to install pyproject.toml-based projects
o this is hyper-manual, i need the visual studio c++ build tool things maybe?
yeah the build tools
and it might also cry that you installed them after instead of before. first time is a headache but youre moving through it quite well
i've done some of it before, just on a fresh windows install now >w<
oh thats a face swap thing. thats a whole iffy issue i wont give tech support to. just because. anything else, sure.
"one-click install" is def deceptive XD
but i'll do as much as i can to get it running, it shows promise
install wheel
pip install wheel?
ya
its worth the tears
then try again
finishing vs build stuff first
lemme upgrade my powershell to terminal again too one sec
a good powershell script would be great for new users
instead of a bat.
dont worry though, you'll get there 😄
im sad i dont have time today for this, maybe someone will implement the diffusers pipeline to comfy nodes 🙂
ehehe. i wish. cuz it also uses a better version of pytorch with a more efficient
thing that makes it run better
xD
you might scream "never again" but you will get it. getting AI working on your PC is honestly more than any academic course would ask you to do. if you dont know about python or wheels or orders or venvs its especially infuriating. there should be a small cottage industry of people installing it for people
i knooo i'm not a total novice, but i'm on a fresh windows install so i have to get every single thing it needs manually
it's progressing more than before
i was expecting one-click install to be one-click install xD python is actually really intuitive
cuz it gives you nearly all the info you need to resolve a problem luckily
I’ve read the stable cascade introduction. It mentioned that it’ll be more efficient and better for lower end hardware, but at the same time mentions 20GB VRAM
I’m currently using SD1.5 as my setup takes too long with SDXL. Can I expect to be able to run Cascade then?
the screenshot the creator of the one-click script uploaded showed it using all their 16GB VRAM
no no, its not a trap for idiots, its just quite unfair if you've not got a pristine clean machine. 9/10 times for me its a venv issue and im accidentally in the wrong environment.
ah yea, i made sure to read the scripts
and saw that it's properly activating the right venv n stuff too
finallly it finished
before/after cooking videos 😉
xD
How is Stable Cascade compressing the images so heavily at the first step different than what a U-Net does? It sounds very similar to me.
christ im sure its a white paper i couldnt even read the first sentence of. everyone lost me at LCM 😄
Well that could depend on what they’re generating. But how does it compare to the SD models relatively?
Will see in a bit, i was able to gen a couple things with it and it looked like early dall-e 3
couple things with the huggingface one
i posed 3 cats in gen w images
@keen ruin Are you doing a lot of images per prompt? they've gone for a way of doing it in the script that uses more memory than necessary, assuming you're interested in using less memory
Are you talking to me?
@long wigeon Sorry no didn't notice the @ thingy didn't pick up a name
it's definitely improved over sdxl but it's not as big of a leap as i was hoping for; still will be worth experimenting with a lot.
No - I was only using one image per prompt.
(I am fine with using 20gb on a 4090, my note was more for anyone else)
i think its phenomenal, huge detail first pass
It's much improved over SDXL, not as much as I'd like for anime style images.
ahh havent tried
i modified the script a bit already to crank out gens faster since i'm not changing height/width/num of images
seems realism/regular drawings improved a lot too
it follows prompts well but i think it needs more than 30 steps
That’s great
oh maybe it just needs less guidance scale
the ones made with like 6.5-7+ didn't come out well but 4 came out really well
Stability just dropped a brand new model they've been teasing for months, called Stable Cascade. It's really good at handling text (Including text into images)
@stiff quail Yep the sample code uses 4 for the prior and 0 for the decoder call
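(A hedged sketch of the two guidance settings mentioned above. The split into a settings helper plus a lazily importing `generate` function is my own arrangement. The step counts, the 1024x1024 size and the Hugging Face model IDs are assumptions that are not stated in this chat, and it presumes a diffusers build that ships the two Stable Cascade pipelines.)

```python
def cascade_call_kwargs(prompt, negative_prompt="", width=1024, height=1024):
    """Build keyword arguments for the prior and decoder calls."""
    prior_kwargs = dict(
        prompt=prompt,
        negative_prompt=negative_prompt,
        width=width,
        height=height,
        guidance_scale=4.0,       # value quoted in the chat for the prior
        num_inference_steps=20,   # assumption, not from the chat
        num_images_per_prompt=1,
    )
    decoder_kwargs = dict(
        prompt=prompt,
        negative_prompt=negative_prompt,
        guidance_scale=0.0,       # value quoted in the chat for the decoder
        num_inference_steps=10,   # assumption, not from the chat
        output_type="pil",
    )
    return prior_kwargs, decoder_kwargs


def generate(prompt):
    """Run prior then decoder; needs torch, diffusers and a CUDA GPU."""
    # Imported lazily so the settings helper works without these packages.
    import torch
    from diffusers import StableCascadeDecoderPipeline, StableCascadePriorPipeline

    prior = StableCascadePriorPipeline.from_pretrained(
        "stabilityai/stable-cascade-prior", torch_dtype=torch.bfloat16
    ).to("cuda")
    decoder = StableCascadeDecoderPipeline.from_pretrained(
        "stabilityai/stable-cascade", torch_dtype=torch.float16
    ).to("cuda")

    prior_kwargs, decoder_kwargs = cascade_call_kwargs(prompt)
    prior_output = prior(**prior_kwargs)
    images = decoder(
        image_embeddings=prior_output.image_embeddings.to(torch.float16),
        **decoder_kwargs,
    ).images
    return images[0]
```

Lowering `guidance_scale` in `prior_kwargs` is the knob that corresponds to the "4 came out really well" observation earlier in the thread.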
Was just reading about it. Looks promising. Did not fully understood but you can’t use it with A1111 or Comfy, right?
not yet. there's a script for it but you'll need vsbuild , python 3.10 & git all manually setup
you can use pinokio
Yea, thought it would be something like that. I found the three models but thought it’s not possible to use them like that yet. Better to wait then. As far as I understood lora training is greatly improved too.
heres a funny one
if i generated a image with a character
is there a way to remove the character, so that i can showcase the background behind them?
You could try experimenting with inpainting, mask the character and generate the background on it
Lower cfg and medium denoising might be something I’d start with.
Then experiment with fill/object and whole image/masked. Object/masked could work
That would be A1111 but those translate to Comfy too I think
downloading via pinokio let's see what it can do
does it take the a1111/comfy authors to do some kind of update for it to work or is there something you can do manually?
I’m guessing a1111/comfy needs updates or patching or something to add the functionality
quickest way is to use photoshop AI object select and erase tools
its unusable
it works great for me
are you in discord of pinokio
any models for a gpu with 6gb of vram?
?
check dm
it takes around 9 seconds for me with increased steps compared to default, at 1024x1024
from my testing this is huuuuge step up
almost as big as from 1.4 to SDXL
does it understand general text like dalle?
the biggest difference so far is lack of "wrong" parts. If you say to generate something it will probably generate it in just one generation correctly rather than you repeating generations until you get something that is "correct"
yes, understanding of text is much much better
as for censorship it is slightly less censored than SDXL and can do better faces from get go which was SDXL problem
Do you guys think it would be possible to run that on a 12gb gpu? 🥺 , I tested it but only the online demo
there is an online huggingface demo
thx
12gb gpu, yes absolutely.
original post from stability is that parts of stable cascade can be "replaced" with lower models or something like that
the main benefit of this model seems to be (according to SAI) much better and faster finetuning
its supposed to be easier to train right?
so far from my testing it is 90% DallE3
just without crazy censorship
whether it is higher or lower than 90% of dalle3, i need more time with it
hands wise it still has issues, but dalle3 has those issues as well, and generally i see improvement
please stop calling a thing not doing boobs and things censorship. It isnt censorship when the guy at subway, or your tax guy or whatever doesnt flop himself out at you. he could, doesn't. hes not censoring he's just being normal.
models are also better spatially aware, so if you have sword characters seem to make less mistake with how swords are held
Sounds amazing if that´s the case, remember its the base model, the community will fine-tune it too like sdxl or 1.5
Yeah finetuning of both 1.5 and SDXL directly increased their capabilities by a lot
well it is censorship... anything that deliberate blocks information is censorship
What i am waiting for is proper frontend integration with control net
stuff should be crazy, or inpainting
yeah literally, definitionally, it is.
thats really really silly. so anything that could and doesnt is censorship? that's not what censorship is. anyone could do anything. that they don't doesnt mean theyre censoring themselves.
If it can't generate anatomically correct human then it is censored period. The issue with faces in SDXL that needs heavy fixing via finetuning is also caused by censoring.
i don't see why you have issue with calling spade a spade
when you're saying censoring it sounds a lot like pouting to me. you wanted a thing and didnt get a thing. super mario bros was censored?
what do you think censorship is
ive just explained with examples twice. you literally know what i consider censorship and just not doing a thing.
i think no one understand what you consider censorship
you dont walk down the street censored do you?
because your definition of censorship is wrong
you just pointed to examples that arent censorship and said an example of censorship is that
When you want to curse around children and you don't do that is it censorship to you or not ?
https://www.aclu.org/documents/what-censorship
here is the ACLU definition of what is censorship
the suppression of words, images, or ideas that are “offensive,” happens whenever some people succeed in imposing their personal political or moral values on others. Censorship can be carried out by the government as well as private pressure groups.
there is also self censorship
gore and nudity is censored with SDXL. just because it effects the public image of SD
thats an intention you have. you're putting your intention on a model
either way offtopic and no need to continue debate
Stable Cascade is slightly less censored than SDXL from my brief experience
And about the same if you want to produce someone likeness
hi how do i restore comfyui to default again? like when you delete the env folder in automatic1111?
only on the discord because there are literally children in this server
i like when there are people in ai servers who are borderline nonsentient
another thing i noticed with it compared to SDXL is that it way better follows all kinds of signs
so you can type transparent saying XYZ and it will generate it
Is Image Variation a new thing with Stable Cascade?
nope
if you are doing anything with words its better to just take it in to the GIMP and do the work there instead of trying to generate it in SD
I look at AI as something to storm new ideas not so much to do the work for me
prompting SD to act on your masked controlnet words is best approach for words. HarrlogosXL is neat to play with too
keep in mind, as the end user, only you understand the key concepts of logo design and versatility. SD just knows how to denoise towards a trained goal.
@fervent thunder you're already here bozo
Are there any extension that add MP4 or Webp support to previews?
is using a mix of similar celebrities a viable way to train a face lora to get an "original" face?
There's one but I forget the name. Would take each step as a frame.
Howdy all! Very early on in my SD adventure, but loving it...Q: trying to understand if there's a way to inpaint in a video?
Where do we write prompts again, and how?
Yep, but it will require learning and effort. The Deforum Automatic1111 extension can achieve it using masks and ControlNet. AnimateDiff is easier but less powerful. There are other proprietary services that can do it even easier, like Runwayml. I'm sure others have more suggestions.
Does anyone know anything about why there's still no ControlNet Tile for XL? Is it difficult to produce?
not sure tbh, but the same guy that did controlnet maintains fooocus and now sd-webui-forge as well, so me thinks he might be quite busy!
can anyone tell me if there is a tool that can allow me to upload a random face, and then that face can be used as a consistent character?
FaceID, InstantID, PhotoMaker, Roop, ReActor....
InstantID for SDXL only, PhotoMaker probably best bets
thanks for the response ill try them
Roop is dead i guess, you can try reactor and other options suggested above
hey guys im looking for creative to help out in a project we are launching summer!
how do u do the ai
soft and tenderly
first I take it out to dinner... 😛
I mean are you asking how does one make an image?
ok xbox make AI
valentines day sucks when you're single
I want a picture of a classroom with Chinese female students in it
Stable Diffusion, i choo choo choose you to be my valentine! 🚂
Would anybody explain what is the process to train a model with comfyUI
you don't train with comfy
you use kohya_ss or onetrainer
Can someone tell me how I can train a lora for clothes such that it picks up the pictures printed on them and shows the same type of clothing with the picture in the output
in discord, not really anything free - you are better off going to the civitai website and using the generator there or going to aiscribbles.com
you could also just use ipadapter instead
I believe Sebastian Kamph has a good tutorial on that
Check this out https://boximator.github.io/
stellar. 3months out. that'll be neat
ty
"Design a charming anime-style illustration featuring a lovable dog adventurer on a white background, set against a backdrop of serene landscapes or bustling cityscapes, capturing the excitement and wonder of exploration in every pawstep."
Dang, that looks very promising.
ya
Is it possible to use Stable Diffusion Reimagine in Auto1111?
how can i use this model https://civitai.com/models/271592/big-head-3dxl?modelVersionId=306137 with sdxl on Python?
i need a python code
so i can use my 2 gpus at the same time
for beginners i recommend using stability matrix, u can install everything there in one click, and it comes with --xformers --vdram etc. ready, also python, git etc
Good morning, everyone! How are you all today?
gm frens!
no pay ? then why?
hey guys who is into dopshipping here
its fun and meaningful and it could lead to a cool job
its not fun to waste time on a project just with the promise that you may get a job,thats why artists ask ppl to pay first when doing commisions
well im not paying anyone
hey guys what the recommended way to use SD?i got something, i forgot the name, whats the version with the nodes?
u dont need to pay anyone,u have 2 hands to make it yourself
i have this version: AUTOMATIC1111/stable-diffusion-webui
what version do you guys reccomend
i remember seeing a UI thats node based
it is not version, it is an application to run AI art stuff
Node-based UI is ComfyUI
made by ComfyAnonymous.
Most beginner-friendly UI is lllyasviel's Fooocus, which gives you some of the basic settings like Guidance Scale and steps. He is also the creator of Forge, which is an optimized fork of A1111
bot is down forever ?
Who would like to start dropshipping
Can't figure out why I can't reproduce an exact image that I have the PNG data for, loaded into PNG Info, sent to txt2img and selected the exact same checkpoint.
Also made sure other settings like sampler etc are matching as well.
Even did a batch size of 10x2 images and none of the 20 were the same as the one I created yesterday
hey friends
we fittin' to play RLCraft. we got 6 people so far but we need more since its hardcore mode. hit me up if interested (via dm)
What’s Dropshipping?
am i missing something where are the bot channels?
Dropshipping is a business model where you sell products without having to keep inventory. Instead, when you receive an order from a customer, you purchase the item from a third party and have it shipped directly to the customer. It's a great way to start an ecommerce business without the hassle of managing inventory.
Entirely off-topic for this server
It's a great way to sell stuff you've never seen and don't know the condition of
Hey guys
Do you know if there’s any impact on performance or speed etc. if I have the checkpoints and loras in Automatic1111 directory while using comfy ui ?
big if true
small if false
it looks way too good to be true, tbh
ya gotta remember the tech world runs on hype
can SD models be used to produce wide format art--like 21:9 or more?
any suggestions
Sdxl models can do 21:9, 1536x640
thanks--i've heard the product of the HxW resolution has to equal an exact amount of pixels--if setting other than 512x512, is that correct?
well that 1536x640 has a much different amount tbh
Both axes should be divisible by 64 to get the best results
ohhh
They do go best at a square image, but they do also train on other ratios and there is a guide for sdxl floating about
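(A rough sketch of the "divisible by 64" advice: pick a width and height for a given aspect ratio that stay near a pixel budget and snap both to multiples of 64. The function name, the 1024x1024 default budget and the round-to-nearest strategy are assumptions for illustration, not an official SDXL rule.)

```python
import math


def snap_resolution(aspect_w, aspect_h, target_pixels=1024 * 1024, multiple=64):
    """Return (width, height) close to target_pixels, both multiples of `multiple`."""
    # Ideal, unrounded size for this aspect ratio at the target pixel count.
    height = math.sqrt(target_pixels * aspect_h / aspect_w)
    width = height * aspect_w / aspect_h

    def snap(value):
        # Round to the nearest multiple, never below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


if __name__ == "__main__":
    for ratio in ((1, 1), (16, 9), (21, 9)):
        print(ratio, snap_resolution(*ratio))
```

For 21:9 this lands on the 1536x640 mentioned above, which is a bit under the 1024x1024 pixel count, so the total does not have to match exactly.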
noob here but it looks like the bot is down...i am still seeing generations, are these running locally and people are posting? Didn't see much under announcements, please don't flame me
#1047610792226340935 - its all good, we get this 3-4x a day so somethings not right with current comms methods
yah sorry about that
what about if i wanted use it to produce super-ultrawide wallpapers---pushing it too far?
a 7680x2160 monitor is like 32:9 aspect ratio
i have two ultrawide monitors so it's probably almost 42:9
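(The aspect ratio arithmetic above can be checked by dividing both sides by their greatest common divisor. `aspect_ratio` is a hypothetical helper name.)

```python
from math import gcd


def aspect_ratio(width, height):
    """Reduce a pixel resolution to its simplest width:height ratio."""
    divisor = gcd(width, height)
    return width // divisor, height // divisor


if __name__ == "__main__":
    print(aspect_ratio(7680, 2160))  # the super-ultrawide monitor mentioned above
    print(aspect_ratio(1536, 640))   # the SDXL size suggested for 21:9
```

7680x2160 reduces to exactly 32:9, while 1536x640 reduces to 12:5, which is 21.6:9 and so only approximately 21:9.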
anything produced can always be upscaled after correct?
Probably better off stitching two results together
Although 2048x512 should work for sdxl, def not for 1.5
does cascade open more options though?
I dont push it much past 768x768 unless i know the model is trained on 1024x1024
Not sure yet, too new
are most models suitable to use atleast 1536x640 though?
or will it cause rendering issues
Is there a spot on the discord where i can post questions about DreamBooth?
Yo, is there a list of useful prompts for pictures? And negative ones as well?
i've heard kohya-ss is preferable to dreambooth for creating loras/checkpoints
why?
check civitai--in all the models posted there they usually give you the prompts if you click the screenshot previews
apparently the usefulness varies by SD updates--less stable i assume
i think kohya is more specialized for it, but i haven't managed to get it working unfortunately
requires a small amount of command line tinkering
false
yea trying to get dream booth to work right now. Just tried doing a simple extension install but its missing tabs
if you want info regarding dreambooth I recommend you check out the dreambooth subreddit
check out this guide I posted there as well
I've timetested this method
it's by far the best
appreciate it @rich kestrel, would you mind if i ran a question by you?
This only applies for dreambooth, for LoRAs you still want to use kohya
I'm still rather intimidated by all the steps for either
I need to get my hands a bit dirtier
yeh
yea, i figure it is like most tech software though, perseverance will get you results eventually
what are you referring to then--what do you output with dreambooth
I just installed the dreambooth extension onto stable diffusion but for some reason the input tab is missing from the gui. Have you encountered this error before?
isn't dreambooth for creating safetensor loras/checkpoints also?
or are you using the training folder thing
yeh but dreambooth models are way different than loras
got some sour news, but modern dreambooth extensions are pooh
if you want something half decent, you can use kohya
if you want the best possible output, you have to use an older A1111+DB commit
does your dreambooth use a live folder for image training data
live folder?
like a folder you dump your images in and the tag uses them i guess
yeh
i have a very limited understanding how it operates
yh thats how it works
i am primarily interested in creating models atm
checkpoints or loras
also it seems to indicate dreambooth requires more ram/vram than i have (8gb/16gb ddr)
i see. I guess i have a couple followup questions. 1) does this mean i will have to have a separate dreambooth install (outside of stablediffusion altogether) or is there a way to install older versions of the plugin? 2) do you have links to where i can download or view older models?
yeah ur gonna struggle with small vram
I would stick to loras in that scenario
You would install a new folder with the old A1111, at least that's the simplest way I can think of
have any of you tried running SD in a cloud server
i have been considering HF spaces or AWS for this
I have before
Too much pain in the ass
seems like it yea
I forgot the name
but how does it perform
runpod
it works well, but since its all linux and cloud, it's just soooo much extra work
i have done the SDXL turbo demo on HF and it's quite fast
I doubt locally it would work well
if you find a repo for training on it, then by all means
just dont expect it to be easy
The way i have been doing it is running StableDiffusion in a virtual environment that i made with python. Is this what you mean?
HF spaces has a python sand box--could you just dump a1111 files there
all SD's come with venv
so the other one would run on its own venv
its just a matter of git pulling the right commit
I cant say it works bc I havent tried it, but I would try it if you don't have any alternative
@rich kestrel do you happen to have access to the dreambooth discord?
Super newbie inquiry.
Can anyone offer tips and tricks for keeping the imported image intact while the QR code diffuses?
Meaning I want to keep the imported image as is.
tf
@sudden ruin NSFW spam.
So I suppose I should change the question to "Is Illya the only person who knows how to create ControlNet Tile or something?". Seems like there's demand for XL Tile, but no one taking it on.
No idea sorry

hammerbanned
Can I do keyword blending with more than two keywords?
yep, concatenate contex node
you use Automatic1111 or Comfy Ui
If it's automatic, I don't know how to help you, I think it only allows text input, but why do you need to repeat a command?
XD
I can't wait to get my new computer so I can run locally
I'm using colab right now, and its both buggy and expensive
Still, I haven't had this much fun in ages
automatic is [token1:token2:ratio] in the prompt for concept blending iirc
Totally, I was just wondering if automatic had syntax for scheduling three or more prompts
as in a queue system.. ie generate image 1, then image 2, then image 3? not that I've seen but theres probably an extension for it
Hey Guys - This is Ai music with sound engineering + Ai Art enjoy: https://www.youtube.com/watch?v=UgHmTPwkiY0
With a good laptop, and stable local broadcast/other AI, how many seconds does it take to create 4 images (like on midjourney) ?
I'm increasingly considering buying a very good (expensive) laptop that could run all the big recent games, chat GPT/other local AI. But I don't know how to choose yet (1500-2500 euros in budget) because I'm still a noob on this subject.
When a model is asking for a image mask for inpainting, what format should that image be? I assumed white pixels where the inpainting should occur and transparent where nothing should happen.
I'm trying to interact with a model by api and the inpainting isn't working as I'd expect given my above assumptions. Curious if I have it wrong, thanks!
Quick check of my understanding
[A|B] toggles between prompts, first one on one step, then the other on the next, right?
As opposed to, say, trying to enforce both at once?
If I wanted to enforce both at once then I'd just include both keywords, right?
yup
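(A simplified model of the step-by-step toggling confirmed above, not Automatic1111's actual parser. `active_option` is a made-up name. The same modulo logic extends to three or more options, so a list like [A|B|C] would cycle A, B, C, A, and so on.)

```python
def active_option(options, step):
    """Return which option is used on a given 1-based sampling step."""
    return options[(step - 1) % len(options)]


def schedule(options, total_steps):
    """List the active option for every step from 1 to total_steps."""
    return [active_option(options, step) for step in range(1, total_steps + 1)]


if __name__ == "__main__":
    print(schedule(["A", "B"], 6))
    print(schedule(["A", "B", "C"], 6))
```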
I'm assuming that's spam
I'm doing the reading and experimentation of course
But could someone walk me through the process of establishing a lora for my own character such that I can start making all sorts of images of them?
Like, ELI5 XD
And, follow up. Lets say I sort this out. Now, I want to start building a big collection of works of this character. Would the next step be figuring out poses with control net and open pose?
hello
how to queue prompts to generate images automatically
multiple different prompts
Depends on the interface
most base models are, as I'm guessing a lot of photos of people taken to try and look their best and so are either airbrushed, photoshopped, or the person is wearing makeup
and thats what gets thrown in the data sets
hello, ive been inactive for a past few months..and i foundout that bot channel to generate image is gone, something happened?
yes
try civitai's onsite generator, or aiscribbles.com, or happyaccidents.ai as good starting points
is it normal for a 600 frame (40 secs) deforum to take 6.5 hours to generate lol
gtx 1660ti and intel i5-9400f
what
what happened to
stable ai
dreamstudio is there, it isn't free; and as for the bot here #1047610792226340935
what resolution?
the three above I suggested
when they said soul destroying they meant that it made someone depressed, not flattening the South Korean capital 😛
720 x 1280
39s/it
6.5s/it lol
390 minutes * 60 seconds/600 frames..
math checks out but idk my command prompt says around 6.5s/it and my max frames is set at 600
and of course the ETA was 6 hours
when i started
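(The arithmetic in this exchange, written out. The function names are illustrative. Reading the gap between 39 s per frame and 6.5 s/it as "about six sampler iterations per frame" is an inference, not something the chat states.)

```python
def seconds_per_frame(total_hours, frames):
    """Average wall-clock seconds spent on each frame."""
    return total_hours * 3600 / frames


def iterations_per_frame(sec_per_frame, sec_per_iteration):
    """How many iterations fit into one frame at the reported speed."""
    return sec_per_frame / sec_per_iteration


if __name__ == "__main__":
    per_frame = seconds_per_frame(6.5, 600)
    print(per_frame)
    print(iterations_per_frame(per_frame, 6.5))
```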
the one thing ive noted is only my gpu memory usage is maxxing out, everything else is below 50% usage
6.0 gb of gpu memory
can someone help me with this ? https://www.reddit.com/r/StableDiffusion/comments/1ar454s/modelsafetensors_download_always_very_slow_to/
In which case you're running out of vram and swapping into ram - theres a setting in the nvidia control panel to stop that from happening
Dont download through python - grab the file from huggingface or civitai and put it in the right location
ok, but how can I discover the file using python? I'm using pinokio
For a1111 you put the model into stable-diffusion-webui/models/Stable-diffusion
If there are files there the system should try and load them first before resorting to downloading them
I know, but how can I find the link ? in this case is just model.safetensors according to image I've posted
theres no information
If not add --no-download-sd-model to the commandline argument
Its usually downloading stable diffusion 1.5 model; i would recommend grabbing a different model though as that one is old now
its automatic download I'm using comfuy through pinokio.
In that case download a model from civitai (try ICBINP to start with) and put it in the models/checkpoints folder in your comfy install then fire it up
ok but how can I discover which model is being downloaded just by looking at the image I've posted?
You are just guessing (or heading into the code to find the url, but I'm 99% sure it's the base stable diffusion 1.5 model)
that's the point, I need to know how to discover which model is being downloaded, what the link is and where to put it..
@bleak matrix you around??
I'm telling you to grab any checkpoint from Civitai and put the safetensors file in your ComfyUI install in the models/checkpoints folder
@sudden ruin?
there's a lot of models there already, around 13. I need to know specifically which model is being downloaded, that's the point
This is a warning to all parents: remember to show attention to your kids and not drop them on their head
Fetal alcohol syndrome is real
If there are 13 files in the models/checkpoints folder, what are they??
other models, but not this one which is being downloaded very slowly.
Got a screenshot of the list?
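In place of a screenshot, the folder contents could be listed with a small script. This is only a sketch: the path "ComfyUI/models/checkpoints" is an assumption (a Pinokio install may keep it somewhere else), and `list_checkpoints` is a hypothetical helper name.

```python
from pathlib import Path

# File types commonly used for checkpoints; extend if your install uses others.
MODEL_EXTENSIONS = {".safetensors", ".ckpt"}


def list_checkpoints(folder):
    """Return (file name, size in GB) for each model file directly inside folder."""
    folder = Path(folder)
    if not folder.is_dir():
        return []
    found = []
    for path in sorted(folder.iterdir()):
        if path.is_file() and path.suffix.lower() in MODEL_EXTENSIONS:
            found.append((path.name, path.stat().st_size / 1024**3))
    return found


# Assumed location; point this at the real models/checkpoints folder.
for name, size_gb in list_checkpoints("ComfyUI/models/checkpoints"):
    print(f"{size_gb:6.2f} GB  {name}")
```

A file that is much smaller than the others, or whose size keeps growing, is probably the download still in progress.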
would the only way to get characters from a specific show be to train my own model? not a mainstream show so models dont really know it well from what ive noticed
or is there like a broad general tv characters model that might be trained
lol qanon nonsense
how is that acct not banned yet
mlp profile pic is the secret key to discord mod hearts
Train own lora
can't use windows 11? skill issue
Although a lot of common characters are available on civitai
yeah i would just get a dataset of about 20-50 of the character and train a lora on the base model
i've been using it this whole time. when is pay to use coming?
i'll just use the 2024 version
Dont engage with crazy!!
steam dominates pc games. that wont happen
They will drag you down to their level and beat you with experience 😛
who says i'm not the one dragging ?
is it possible to have one model for multiple characters, or does it need to be only one? like if i upload a bunch of pics of various characters and make sure to name them in the description for the training will that work
all you gotta do is seed a little doubt and they wind themselves up so hard. its like giving a cat some nip
you can train multiple characters into one lora but it just needs more dataset organizing and larger datasets means more training time.
training a whole entire model is less needed for such things. loras can be applied to many different models too.
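The "dataset organizing" for a multi-character lora could look roughly like the sketch below. It assumes a source folder with one subfolder per character, and it assumes the trainer accepts kohya-style folder names (`<repeats>_<name>`) with a `.txt` caption next to each image; check what your trainer actually expects. `build_dataset` and the repeat count of 10 are illustrative choices, not anything from the chat.

```python
import shutil
from pathlib import Path

IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def build_dataset(source_dir, output_dir, repeats=10):
    """Copy source_dir/<character>/* into output_dir/<repeats>_<character>/.

    Writes a caption file per image containing the character name so each
    character gets its own trigger word. Returns image counts per character.
    """
    source_dir, output_dir = Path(source_dir), Path(output_dir)
    counts = {}
    for character_dir in sorted(p for p in source_dir.iterdir() if p.is_dir()):
        name = character_dir.name
        target = output_dir / f"{repeats}_{name}"
        target.mkdir(parents=True, exist_ok=True)
        counts[name] = 0
        for image in sorted(character_dir.iterdir()):
            if image.suffix.lower() not in IMAGE_EXTENSIONS:
                continue  # skip anything that is not an image
            copied = target / image.name
            shutil.copy2(image, copied)
            copied.with_suffix(".txt").write_text(name, encoding="utf-8")
            counts[name] += 1
    return counts
```

The returned counts make it easy to spot a character with far fewer images than the 20-50 suggested above.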
you've never once read any software license ever it turns out. wuhwuh
Steam also works better on linux than it does windows, so them caring about what windows is doing is like.. whaaaat? what planet are you on?
i've got 1000 game steam library uh oh. guess i'm just really skilled at steam and you apparently are very low level
I warned ya
quick question
i'm having problems with automatic1111, can i ask here?
or is there a specific place?
Here or #🤝|tech-support
Much better 😛
gotcha do you know what its called or where it would be?
within the control panel i mean
thanks i appreciate that. I assume the drastic slowing is from the act of swapping to ram? even tho itll still be limited at 6gb vram it will be more consistent i assume?
always fast, but will limit resolution heavily when doing animation
interesting so then would you say to just upscale it using another AI after i get my final generation or
by animation does that include deforum because thats mainly what im interested in yeah
i also dont know what tiled vae is haha lots to learn but dont want to give u 25 questions lol
Not sure if deforum plays nice with tiled vae or not
Tiled vae is part of the multidiffusion extension - breaks the image down into smaller tiles that the gpu can process
do you have the multidiffusion extension link handy?
Not to hand, but it is searchable in the extensions tab
is it possible to have GPU/CPU/RAM usage floating over all windows? kinda annoying to check task manager everytime
Yes with Rainmeter
Its a tool for floating windows, after the install it already adds CPU, ram etc usage
Very cool program with a lot of overlay extensions and themes
rainmeter kinda complicated
doesnt mean I cant do it but
something like Glasswire maybe? it can show net usage panel
Idk if there is something for that like glasswire
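If an overlay tool feels like too much, a lighter option is a script that polls nvidia-smi and prints the numbers in a small terminal window. This is a sketch that assumes an NVIDIA card with nvidia-smi on the PATH; it does not float over other windows by itself, and `parse_gpu_line` / `read_gpu_stats` are hypothetical helper names.

```python
import subprocess

# Fields requested from nvidia-smi, in this order.
QUERY = "utilization.gpu,memory.used,memory.total"


def parse_gpu_line(line):
    """Parse one CSV line such as '45, 5980, 6144' into a dict of floats."""
    util, used, total = (float(part.strip()) for part in line.split(","))
    return {"gpu_util_percent": util, "vram_used_mib": used, "vram_total_mib": total}


def read_gpu_stats():
    """Run nvidia-smi once and return one dict per GPU."""
    output = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return [parse_gpu_line(line) for line in output.splitlines() if line.strip()]
```

Calling `read_gpu_stats()` in a loop with a `time.sleep(2)` between reads gives a rolling readout of GPU load and VRAM use.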
Hello there! I have a question: what's the best way to use a really specific bottle in image generation?
Should I train a checkpoint, a lora, both or something else?
Thanks
lora
So just by training a lora I'll be able to put the same specific bottle in every image I want, with no problems?
well not no problems, but it's the best bet at maintaining the shape of the bottle.. ipadapter with a high control strength is another option
Mmm, interesting. What is ipadapter? Any tutorial or guide?
Hi there, I'm very new to this. Can someone here explain to me how I would get a person to look away from the camera when generating images while still keeping the character consistent? (using fooocus)
Sebastian Kamph has a tutorial on it on youtube
Ah ok thanks, I'll check it out and I hope it will work
Add looking away from camera to positive prompt, and looking at viewer in negative prompt usually is enough
Any other suggestion is welcome on how to insert a photo of a specific object and play with the scenario around it
Thanks
still looks at the camera - could the reason be that I'm using the "face-swap" feature of fooocus in image prompt, where that image has the character looking at the camera?
If yes, then how would I generate an image with the character looking away while keeping the character consistent with previous images?
Yeah that'd need to be off, no idea on how to do it in fooocus sorry
which other front end would you do it in? The first i tried was comfyUI but it was a little overwhelming for me as a beginner so i swapped to Fooocus
Automatic1111
A middleground between the two, and theres lots of tutorials on yt for it
Alright ill install Automatic1111 - once installed - could you guide me to generating an image with the character i already have but looking away from the camera?
or link me a video or article?
Again, Sebastian Kamph with the ipadapter tutorial is the best bet, but you'll probably want to lower the controlnet strength and set it to "My prompt is more important"
For an install guide of Automatic1111 checkout the Pinned Messages of #🤝|tech-support
Hello. Can I use SDXL with 6GB VRAM? Or it's not worth the effort?
i've noticed that some models are more "expensive" than others. like most models are 2-3GB but i got revAnimated recently, gorgeous model. 5GB. takes forever to switch to it and sometimes renders will just crash, I have to use smaller word counts and smaller resolutions for it. is that normal?
hey guys - I'm new to this stable diffusion and wanted to ask
when you use a picture of yourself, how do you instruct the prompt to 're-imagine' the picture you provide of yourself
That depends on which of the trazillion different user interfaces you are using. Not everyone uses the same terms or even same techniques / Models for that.
i've just installed easy diffusion and trying to put my face onto a character or body
what is the easiest way to learn how to do this
I am being silly in this question btw
Still learning but it seems like so much. I really want to learn control net well
to get as specific outcomes as possible
I get my new gpu today
thanks fawks for replying
what GPU are you using for this?
I am a complete amateur when it comes to using stable diffusion. I am in the process of buying a new computer, and the options are a MacBook Pro with M3 or a PC with an NVIDIA RTX 4070 Ti Super 16 GB VRAM. Is one clearly better than the other?
the rtx 4070 ti super is better