#💬|general-chat
I'm not sure, I'd have to reload it, try to duplicate the problem and come back.
Closing out Invoke and restarting would help but not solve it. Matter of fact it got worse, I'd get 1-2 generations and then black squares.
I have ComfyUI installed and it runs, I've just put off really digging into it. Felt like if I wasn't fine-tuning the “easier” programs then I'd be more lost in Comfy.
I have Forge installed as well, had the most “luck” there. The images tend to look soft but it stays consistent at least
there's different ways to mess up memory management, is why I was asking them
i have been trying to download comfyscript for 2 days but can't get it to work at all, i have followed the instructions on the github page and tried to google, if anyone knows how to download it, please tell me :D
https://www.tomshardware.com/pc-components/gpus/nvidia-may-release-the-rtx-5080-and-5070-super-with-boosted-memory-configurations-according-to-leaker if this is true ppl should hold out on buying gpu till the release of 5080 super
Hey guys,
is kohya_ss still state of the art when I want to train my own LoRA, or are there some other usable alternatives out there?
The reason I am asking is because kohya_ss is giving me a ton of dependency error messages after clicking the "Start Training" button, which I can't seem to fix
RuntimeError: operator torchvision::nms does not exist
20:22:47-499620 INFO Training has ended.
fluxgym 100x easier
is api pricing flat fee for stable video?
thats only for flux no?
it's only for flux. luca taco has good trainers on his replicate
Thanks for the suggestion, I actually got kohya working yesterday and trained my own LoRA successfully, yay
Do you guys have suggestions for good tutorials on how to make the best use of stable diffusion?
yeah, start with scott's https://www.youtube.com/@sedetweiler
I have issue with stable diffusion running on AUTOMATIC1111
Everyone facing same issue or its only me?
hi
I do not currently have a computer, only an ipad. I am assuming I can use artisan? And only artisan? Is that correct? It looks like much of the site is devoted to people running stuff on their own computer systems, and that artisan is the only part of the site that is for creating on discord? Please ping me.
Probably, but you could look into other cloud services like CivitAI, which allow you to use more tools and models than just stable-diffusion
However these are paid services
But using sdxl, 5 dollars should get you about 900-1500 images depending on whether you use loras or not
You can try Draw Things. It's a free app, you can find it in the app store. You can join their discord server for help.
I recommend but, more than likely, one will indeed need help getting it set up and going... lots of moving parts but I'm glad to see something like that available.
hello i was wondering how good stable diffusion is at converting a sketch to a nice looking image
in an accurate way? chat seems to change it to how it wants it to look, not what i sketched
There's nothing to set up, it has a complete interface. You do need to start with downloading the base models to get things working.
Next to their discord there also are helpful videos on yt (although due to the ongoing development of the app, a few of them are already outdated). @near seal
I just happened upon the app last night, I had only minutes to try it out but something's not right because everything that generates looks like food that only cooked halfway. Lmao
Okay, I have a question. The images I used to train my SDXL model are all the same size, 1216x832. When I used the trained model to generate images at an output size of 1024x1024, the images were not that good, but when I used the exact training size of 1216x832, the generated images were amazing. Why is that? Can anyone explain?
resolution is part of the input the model might overfit on
if all your training images have same resolution, then the model associates the resolution with your training data (same way as you would associate a trigger word in the prompt)
just crop some of your training images to 1024x1024
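A rough sketch of what the cropping suggestion could look like in numbers, assuming the originals are larger than 1024 pixels on both sides: scale the image so its short side is 1024, then take a centered square. The function name and the rounding choice are illustrative assumptions, not something from the conversation.

```python
def square_crop_plan(width, height, target=1024):
    """Plan a resize (short side -> target) followed by a centered square crop.

    Returns ((resized_w, resized_h), (left, top, right, bottom)).
    Hypothetical helper for illustration only.
    """
    if min(width, height) < target:
        raise ValueError("image is smaller than the target crop size")
    scale = target / min(width, height)
    resized_w = max(target, round(width * scale))
    resized_h = max(target, round(height * scale))
    left = (resized_w - target) // 2
    top = (resized_h - target) // 2
    return (resized_w, resized_h), (left, top, left + target, top + target)


# Example with the source image size mentioned later in the chat.
print(square_crop_plan(6969, 4640))  # ((1538, 1024), (257, 0, 1281, 1024))
```

The two tuples have the shapes that an image library such as Pillow expects for its `Image.resize` size and `Image.crop` box arguments, if that is what you use to apply the plan.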
Any recommendations on where to start getting into generative AI art? A video by Shadiversity got me interested in more complex use of AI to get better results than with just one prompt at some site.
Well it depends on your sketch and what the important features of your sketch are. The objects and their lineart, the color, size and composition, ... Depending on the input and tools used, you have a lot of control over the conversion.
Another thing I observed: for example, if I try to generate an image of size 2048x2048, it makes a bad image. If I want to generate an image of a pizza, the generated image will have multiple pizzas just thrown around, but if I generate at the exact training size, only one pizza comes up
Why is it like this
that's an artifact of the convolutions/unet. But in general you have to train on similar settings as you want to do inference
HYEEEAAAAAAAAAAAAAAAAH
So let me tell you what I'm actually doing. My input images are 6969x4640. Previously, when I was experimenting and making buckets with max resolution set to 1024x1024, the bucket that got made was 1216x832, which is about 6 times smaller than my original images. So I increased my max resolution to 4096x4096, and now the bucket I got was 4096x3792, which is better than the previous one. Now in the training config file I'm using exactly the same resolution as my bucket, because all of my images are the same size. Do you think this is the correct way of getting high-resolution images at inference without getting multiple pizzas?
So did I do it wrong or right? For now my training is running
Wdyt
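For anyone following the bucket numbers above, here is a simplified sketch of aspect-ratio bucketing: enumerate sizes in steps of 64 pixels that fit inside the maximum pixel area, and pick the one whose aspect ratio is closest to the source image. This is an illustration under assumed rules (step of 64, area cap only), not the exact algorithm of any particular trainer, so real tools can produce different buckets.

```python
import math


def closest_bucket(width, height, max_side=1024, step=64):
    """Pick the (w, h) bucket with area <= max_side**2 and closest aspect ratio.

    Simplified, assumed bucketing rule for illustration only.
    """
    max_area = max_side * max_side
    target = math.log(width / height)
    best = None
    for w in range(step, max_area // step + 1, step):
        # Largest height (multiple of step) that keeps the area under the cap.
        h = (max_area // w) // step * step
        if h < step:
            continue
        error = abs(math.log(w / h) - target)
        if best is None or error < best[0]:
            best = (error, w, h)
    return best[1], best[2]


print(closest_bucket(6969, 4640))  # (1216, 832)
```

With the 1024x1024 setting this reproduces the 1216x832 bucket mentioned above for 6969x4640 sources.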
👀 gotta let all the channels know bout that fiddy dollaz huh
What are the best AI Image to Video generators that accept watching videos to earn tokens
Well there are some that have a tradeoff like that... I'm not familiar with them tho. You could check em out.
Is the stable diffusion or is the diffusion stable? Hence the question.
-Shakespear probably
Just use that wai one locally, results I've seen seem pretty goooood
Ye
What is that
Not sure if checkpoint is the correct word for this, but it's basically that to make videos
So a local AI generator? Like ComfyUI or Stable Diffusion? I have an RX 570, not the best GPU
Stuff like those yea, I imagine even a1111 can do it
Dunno, I use Nvidia gpus
No clue if that one's good or not
What is the name of the local AI you mentioned?
Ya mean the app itself? It's really just called a1111
Serves me well for pictures, haven't tried videos just yet
Still, if a1111 can't do it I'm sure comfy or swarm can
I mean, how do I run it? ComfyUI or what?
depends on which of these ya wanna use lol
Comfy UI doesn't work for me? Any good alternative?
try swarm, you can do stuff like this locally, this one was made locally https://civitai.com/images/73141789
So swarm uses my GPU, but is web based?
idk man, never used swarm, i use a1111, ya gonna have to ask that in tech-support
i simply heard it was good
Gimme link to a1111
you should probably make sure it can make videos first lol
as i said i'm not sure, never tried
Just gimme
Ay Ay CAPTAIN
happy to help
now ya install it, i don't remember the exact steps and it's probably a bad idea to tell ya from memory lol
read its page, i'm sure the instructions are there
@abstract quarry please reply this
i have downloaded stable diffusion for the first time and have downloaded a model off civitai but i am getting an error when generating, it says the model is for Pony not sd1.5 like in the tutorials i'm watching, could someone give me some insight/tips? thank you
Revisit CivitAI and check the upper right-hand corner on the models page. There is a filters icon. From there, you can filter which model type to use. Skip Pony for now and try out the SD 1.5 first.
Guys is there any way of increasing training speed without affecting the model's quality?
I have enough vram
Wtf is that link
I just want to increase the training speed it's just that I don't know what values do I have to change to use more of the vram
Does anyone have recommendation for a SD model that is good at creating things other than characters? There's virtually an infinite amount on Civit.ai that are trained to be good at making people, but I want one that can create interesting spaceships, buildings and the like.
Spam links, just ignore
thought so
I do
I'm in the market for a custom built PC and thinking of getting an RTX 3000 or 4000 series graphics card with a good amount of VRAM so I can generate something like 90s cel anime like sailor moon or berserk or evangelion, or even something in the aqua teen hunger force style
what are your recommendations?
I personally have a 4090 but Ive seen people with lesser GPUs having good generations
Idk what is your budget
something less than $1000 if possible for good quality
But aqua teen is an extremely simple art style so you shouldn't need that much
There must be someone here better suited for answering this
what about cel anime like from the 90s?
It should work
If I'm not mistaken having realistic generations is harder than cartoons in general
so a 3000 series should do the trick I presume
my current computer is using a gtx 1060 6GB graphics card from 2016
I see people within the 3000 series making good generations
But I'm not sure
You should aim for the best cost-performance gpu
But dunno which one fits that limit today
suppose I get my custom pc with that kind of graphics card and I set it up, what do I download to get the locally hosted stable diffusion to start and operate?
ForgeUI is ok for a start
It's a user interface that helps you make generations
so I download forgeUI for that computer and its ready to go just using the GPU? can I access it over wifi from my android phone when Im away?
I like forgeUI for generations
You download the AI models you want to use and put it in the corresponding folder
Start the program and works at least for me
There are some YouTube videos with simple guides
and those models I can select from forgeUI after downloading it right?
Dunno if you can directly download from forgeUI
where do you usually download your models for yours?
I just take one model I like from civitai, put it inside the stable diffusion folder and it's usually good to go
ah ok
Try to look for more opinions about what gpu is good for you
But if you have the money try to get something good
what chat can i send the scatch in
@oblique elk i sent it in the other general chat can you take a look
quick question
i want to use the hunyuan v2v model where i upload a video of my own and give a prompt, which it then changes accordingly and gives back to me
but to install hunyuan do i just install it normally as a model and then get a workflow for v2v or is there a specific v2v model i need to install?
I used to run 3070ti (8gb vram) illustrious and sdxl is absolutely no problem for that model
Higher vram however allows you to run it faster
My 8gb vram back then was about 20-30s per image with 1-2 loras
(I am unable to send any image in this chat) Hello everyone I am technical officer at genotek, a product based company that manufactures expansion joint covers. Recently I have tried to make images for our product website using control net ipadapters chatgpt and various image to image techniques. I am giving a photo of our product. This is a single shot render of the product without any background that i did using 3ds max and arnold render.
I would like to create a image with this product as the cross section with a beautiful background. ChatGPT came close to what i want but the product details were wrong (I assume not a lot of these models are trained on what expansion joint cover are). So is there any way i could generate environment almost as beautiful as (2nd pic) with the product in the 1st pic. Willing to pay whoever is able to do this and share the workflow.
int4 flux is 6.64 GB so 8GB is ok
i recommend reposting in either #🏞|general-with-images or #🌶|off-topic to include images
Does anyone have the detail++ Overall Detail SD1.5 embedding? It was removed from civitai and was curious where I can find it
Will use the basic generation model but within the Video sampler you would use images from the input video as samples input.
I've seen someone gen with a 2060
It's pretty nuts the optimizations some webuis do
I myself used to use a laptop GPU but that was with SD1.5
2060 was with Illustrious
Tho admittedly no clue about Forge, I use ComfyUI
Hi guys, not sure if I’m allowed to post this but I am looking to hire a LoRA trainer
Hi
how can I make spiderverse style images with dreamstudio ai
??
and animate them
yo
do you guys have a good to go formula depending on the number of images?
does anyone know how to generate an image of width = 4096 and height = 2752 with an sdxl model? if it's possible, how can i do it, and how do i do multi-gpu inference with sdxl?
- can't natively SDXL is trained for 1024x1024, you have to upscale using hires.fix/img2img/outpaint/etc
- no software (to my knowledge) does multi gpu inference. At best you can run one instance per gpu.
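The "one instance per gpu" approach boils down to splitting the work list up front. A minimal sketch of that split, assuming a simple round-robin assignment (the function name is hypothetical):

```python
def shard_prompts(prompts, num_gpus):
    """Split prompts round-robin so each GPU instance gets its own list."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    shards = [[] for _ in range(num_gpus)]
    for index, prompt in enumerate(prompts):
        shards[index % num_gpus].append(prompt)
    return shards


jobs = ["pizza", "spaceship", "castle", "forest", "robot"]
for gpu_id, shard in enumerate(shard_prompts(jobs, 2)):
    print(gpu_id, shard)
# 0 ['pizza', 'castle', 'robot']
# 1 ['spaceship', 'forest']
```

Each shard would then go to a separate process, for example one started with the `CUDA_VISIBLE_DEVICES` environment variable set to that GPU's index. This speeds up batches of images, not a single image.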
see, i generated an image of 2048 x 2048 with sdxl
like, i was using the kohya sd-scripts inference script to generate images of this size
actually i generated a 2752 x 2752 image too
so it's not like it's just limited to 1024 x 1024
have any of you seen the inference script given in the kohya repo?
Generate an image with 688 height and 1024 width and upscale it. Either with some hires / tile controlnet etc. to add further details to the image or without to keep the amount of details from the first generation.
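The 1024x688 figure follows from keeping the target aspect ratio while shrinking the long side to 1024. A small sketch of that arithmetic, assuming sides are rounded to a multiple of 8 (the rounding step and function name are assumptions for illustration):

```python
def base_resolution(target_w, target_h, long_side=1024, multiple=8):
    """Return (base_w, base_h, upscale_factor) for a generate-then-upscale flow."""
    scale = max(target_w, target_h) / long_side
    base_w = round(target_w / scale / multiple) * multiple
    base_h = round(target_h / scale / multiple) * multiple
    return base_w, base_h, scale


print(base_resolution(4096, 2752))  # (1024, 688, 4.0)
```

So a 4096x2752 target means generating at 1024x688 and upscaling by a factor of 4.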
you can do multi-gpu in diffusers
these days the models are getting so big that the larger models like cosmos or stepfun come with their own multi-gpu scripts as well
can u give me documenation or provide some link to it
What is the current stable diffusion experience for supported amd cards? Trying to get an idea of whether or not the 9070 is worth it for me when it gets full support
if you are willing to write custom kernel, driver and compiler code then amd can be ok
otherwise rly not
iirc amd is usable up to sdxl, but you gotta pull some tricks for upscaling etc
What sort of stuff for upscaling and why?
tiled upscaling, zluda etc
Umm
scam dont click
Yes
Hello, I'm new to SD and want to buy a GPU. Would it be better to buy a 12GB RTX 3060 or would it be worth spending $250 more for a 16GB RTX 5060 TI?
More vram the better generally
And considering the 5000 series is newer you'd get warranty too
I don't live in USA and won't be able to claim warranty if anything goes wrong.
Would 4GB more be worth the $200-$250 more?
@atomic mortar
Depends on how much 200 dollars is worth in your country
Well 1 month rent for a studio apartment for a college student is like $300 USD here
16GB can make a big difference, but may still not be enough depending on what you intend to do
I plan on using it for inpainting and text to image inference mainly. May also play around with audio but I would probably be using APIs for that
hello 😄
what is the fastest workflow for wan? i am running it through comfyui and the least i can get is 5 minutes, i have a 4090. any faster workflow please? and maybe same quality. thanks 🙂
Is there a quality program for downloading models off CIVITAI in Bulk? Looking to fill an 8TB hard drive with at risk models
anyone up?
Tea cache maybe but you'll lose quality for like a 20s improvement, video ai just takes that long
Hmm elaborate? Like in bulk by providing a link and click download? And it fetches all metadata?
swarmUI has a downloader integrated with civitAI apikey support
Noted, was looking to download all models with safetensor and XYZ tag
Well if you want to download all models you'd have a gigantic task ahead
Hey I got almost a month
Does anyone know what happened to the batch processing support added to SD Ultimate Upscale? - https://www.reddit.com/r/StableDiffusion/comments/11ul4t8/ultimate_sd_upscale_update_announce/
do u people really trust comfyui?
i really feel that the comfyui generated results are not good, because when i generated images with comfyui the results were not good at all, whereas if you use raw code to generate images they are amazing, that's what i have observed
we cant blindly trust comfyui at all
Its opensource no? if you don't trust it just dive into it or compile it your self
Forge however uses obfuscated code to hide they used code from comfy
yes of course, i'm just saying we can't blindly trust comfyui for generating images. for example, if someone is finetuning an sdxl or sd3 model, investing so much, and they generate results from comfyui and the results don't come out good, they will think that the training didn't go well
because that happened with me
but when i used raw code to generate images results changed so much
Just a case of samplers and schedulers
You know comfy is just a bunch of nodes (code blocks) transmitting data to one node to another
You just made your own workflow
Tedious but workable for you
it's not tedious i guess, i just mean that we can't trust it blindly
hi guys, does anyone know any checkpoint that goes well with Vroid models? i'm still a noob and my loras suck, and when i use different checkpoints for Vroid the generated image style changes so much. also how important are regularization images for loras? i've been at it for a week but all my loras look like deformed monsters, especially the eyes xd
sd 1.5 btw
can anyone please help me with some settings, I'm going crazy
and people, instead of trying to help, just start yapping at you
I just need to understand some things I'm having trouble with
no one?
If it is a technical problem feel free to ask your questions in the support channel if you have problems with prompting or prompt settings ask in the prompt channel. …. If someone can answer your question they will.
No blind trust every line of code is visible. Even people write their own sampler and nodes.
may I assk you what training program do you use?
I just want to compare some training settings
Great question and would fit perfectly into #🔧|finetune
And pretty sure others already recommended you Kohya, onetrainer and fluxgym.
The parameters are not comparable as it depends on so many different factors. Amount source images, concept / style or object. Close to the source model or far away. Overfitting useful for later generation etc.
yeah, Im in a million discord servers and I confuse the channels sorry
yeah parameters change between training programs, that's the problem
I find settings that dont work for another trainer
and its a bit confusing
Yes, copy and paste does not really work for training. At least if you do not want the exact same result as the tutorial, blog,… and you use the same input files and descriptions.
So there's no other choice but reading and understanding the effects of the different parameters. And then start with trial and error with a small and well-labeled dataset.
assuming everything is correctly tagged, if you and I have the same number of images in more or less the same style, shouldn't I get good results with the same settings?
Does anyone know why refine/upscaling uses up so much more vram on rocm compared to normal image generation or the equivalent process on nvidia cuda?
I would guess because internally it is upscaling the latents, which uses more ram.
Anyone proficient with Stability Matrix and in general running models locally? I need some guidance on a couple of issues I'm encountering with certain models...
not with matrix but i do run sd locally
no clue, kinda something we just gotta live with if we want to fix eyes and artifacting lol
rocm is less optimised
especially the typical consumer rocm setup
it seems to fill up my vram even if i am only upscaling by a really small amount. It makes me wonder if its some kind of bug
My name is Ziliah and I had a few breakdowns and pretty much decided to rebuild my life by being fit and healthy, yk
I like listening to music and reading novels
I currently own an online store that brings in some amount per week because I am passionate about helping my parents retire and achieving financial freedom.
I'm open to friendly conversations
does anybody know an easy way to ai generate images for a book? it doesn't even need to be the whole scene, one object theoretically would be enough, although the whole scene would be more impressive.
it would need to be some kind of text2text generation first (book2prompt), afterwards prompt2image
i would need one picture per page
hmm that would just be text to image though?
like you describe the object or scene
You might need to use a llm (ChatGPT, Gemini,..) to summarize the page and to focus on main subjects / main objects. Then let the llm create a prompt and your favorite image gen ai creates the image.
BUT for a bit of consistency I would add some style tags to avoid each image looking different (comic, real, anime, sketch,…).
Additionally you would need to remember the last object to avoid repeating images on each page if the text main object overlaps over many pages.
Another BUT if your book has a main object/ topic/ character that appears on multiple pages you will not get similar images. So a reader might be confused why the brunette woman from the start is now blonde. …
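A minimal sketch of how the three points above (fixed style tags, skipping repeated subjects, consistent characters) could be handled when assembling the image prompt. The style tags, the character sheet, and the function name are all made-up placeholders, and the subject is assumed to come from an earlier LLM summarization step.

```python
# Assumed, fixed style tags so every page gets the same look.
STYLE_TAGS = ["sketch", "muted colors"]

# Hypothetical character sheet: one fixed description per recurring character.
CHARACTERS = {"anna": "brunette woman, green coat"}


def build_prompt(subject, characters_on_page, previous_subject=None):
    """Return a comma-separated prompt, or None if the subject repeats."""
    if subject == previous_subject:
        return None  # same main object as the last page: reuse the old image
    parts = [subject]
    for name in characters_on_page:
        if name in CHARACTERS:
            parts.append(CHARACTERS[name])
    parts.extend(STYLE_TAGS)
    return ", ".join(parts)


print(build_prompt("rusty knife on a table", ["anna"]))
# rusty knife on a table, brunette woman, green coat, sketch, muted colors
```

Pasting the same description for a character on every page helps keep the wording consistent, though it does not guarantee the generated face or clothing will match between images.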

exactly those two BUTs are the issues i face. additionally i don't have the money to pay for gemini / openai APIs and my local resources are limited to a quadro p4000 (8gb), 32gb RAM, ryzen 7 3800x
flux.1-dev onnx fp4 (for scenes) and ssd-1b (for single objects) are the models i use
Would it be easier—or even possible—for an AI to pair a scene from a book page with the corresponding sharp frame from the movie adaptation? If so, how could that even be achieved?
that would be way too easy. just giving the book scene to sd or flux results in shitty images. it's text2text2image. i mean if you prompt an ai to generate an image, you would never write like an author of a book
I mean if your scene has a lets say a rusty knife, just prompt for a rusty knife in said setting?
i don't want to manually prompt 300+ images
Ahh thats the gist, yeah you need a llm for that to somewhat get a prompt out of it
that's what i'm asking for. an ai generating the prompt from a given book scene
Chatgpt or a local one could do the trick yeah
exactly
Even copilot could somewhat manage if its below 3k letters
gemma 3, bart, pegasus, t5, llama3.2... I've tried a lot, they just make it shorter, but nothing you could use as a prompt for an image generation model
not even when you include a manual description of every person/setting
Realistic? Anime?
colored scratchbook drawing
i had one including description of every person/setting and a description of the style i want, output prompt was in sentences and too complex for flux
Hmm flux and sd could work
But hmm extremely long prompts arent image gens strong suit
that's what i'm struggling with
As it will just ignore certain elements after a certain point
book page 2 short image comma seperated gen prompt
or just add something to make the scenery more "realistic"
fck...
thanks for your help anyway : )
Well the only way would be manual work
Hmmmm if its single item/people prompts however
Then you could maybe, ask the llm to highlight the most important person or object in the page
or just a simple scene, just needs to be a background image for every single book page
Yeah you could do that
the issue is the additional unnecessary stuff the llm will output. sth like "here is the highlight of the given text snippet"
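One way to deal with that kind of chatty wrapper is a small post-processing step between the LLM and the image model. The sketch below strips a leading "here is ..." style line and turns the rest into a short, deduplicated, comma-separated tag list. The list of preamble words and the tag limit are assumptions and would need tuning for a given model.

```python
import re

# Assumed preamble openers; extend this for whatever your LLM tends to say.
PREAMBLE = re.compile(
    r"^\s*(here is|here's|sure|certainly|okay|ok)\b[^\n:]*[:\n]",
    re.IGNORECASE,
)


def clean_llm_output(text, max_tags=12):
    """Drop a chatty first line and return short comma-separated tags."""
    text = PREAMBLE.sub("", text.strip(), count=1)
    tags = []
    for raw in re.split(r"[,\n]", text):
        tag = raw.strip(" .\"'*-").lower()
        if tag and tag not in tags:
            tags.append(tag)
    return ", ".join(tags[:max_tags])


reply = "Here is the highlight of the given text snippet:\nrusty knife, wooden table, Rusty knife, candlelight."
print(clean_llm_output(reply))  # rusty knife, wooden table, candlelight
```

Capping the number of tags also helps with the problem of image models ignoring elements late in a long prompt.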
The user will show you a page of a book and you will generate an ideal prompt for generating a colored scratchbook photo for that model on Flux/FAL. You will confirm whether the model is an object/product, a person, a pet animal, or an art/photography style.
Ironically for complex things I found it was easier to 3d model what I wanted manually then img2img to change the style
Your prompt should start with the first sentence setting overall context and containing the model name mentioned by the user. The overall prompt should account for location, model overview, expression and pose, angle of the shot, placement when the model is an object/product, lighting, colour palette, and styling.
Could be a decent start as a base prompt for gpt
the scene thing... someway to get every different place/setting shown in the movie, image2image generation to remove everything in the centre of the image, and a simple python script to use those images as pdf background
tysm!
i'll try later
Youd have to edit it though a little but i hope that gives you a way further
Still thinking you could look at tools like Dify. They allow you to chain up different AI tasks (LLM, scrape, image gen). So you would create one AI agent to determine the key element of a given text (system prompt etc.), then pass the result to another LLM agent that creates sketch image prompts from the first agent's output. (Could even be 2 different LLMs.) Finally, send the prompt to a third agent which uses, for example, a predefined ComfyUI workflow to create the image.
Hello
Hello
If you have the option to work at home, please contact me. I understand how things can be challenging now and I want to help you find an appropriate opportunity. There are many vacancies to apply 💌
Hello
Hello
is there any channel here to discuss trainings?
or another server for it
I dont want to spam this channel
Fine tune channel got removed a while ago
where do I go then
And tbh even if you spam here nobody would mind
Its not like theres a big discussion going on
what a difference with other servers, where they start crying because people ask questions
I mean I don't know if you have noticed, 300k members yet this morning a 6 hour gap between "hello"s
The only big active ones are anime and techsupport lmao
Nothing wrong with us animu lovers 😭
I know, its one of the few channels im active in lol
We got the opposite here once, dude was out here talking like we the webui CEO, demanding to make stuff work for him lmao
Once? All the time
Well I only saw it live once
I only wish that people like donny just use off topic for their vague resume job search
Instead of posting it twice a week everywhere including techsupport
Yes, not sure the chat is a good place for Job begging
Just use fiverr or communities made for it ngl
Oh cool literally what we just talked about
Wait tech support is a genuine server? I thought it was spam
no?
the channel #🤝|tech-support
'>??
not the server
the server is a scam
Ahh ok XD
the stable diffusion channel is not lmfao
I just need some place where I can discuss training settings with people
I got good generations but they could be better I think
iirc werent you banned in the one trainer server?
ah i see we dont share this one server
imma link it to you
I was very annoying to be honest
but I just need to ask some questions
something with some back and forth
Let me guess your question: can anyone share their working parameters for training. And best they work with resolution 2048x2048….
no
this is the kind of thing that pisses me off
If you join the tech support server they tell you to please wait a moment and DO NOT REDEEM
That's what sucks about AI, it's full of those
Im gonna ask here too if its ok for you now that Im here
do you know that some games put the same exact image but with different face expressions for dialogue? like a visual novel
I was wondering if feeding all of those images helps the training
because they are the same images but not really; if I tag the images all the same except for those face expressions, will the model learn to differentiate?
What? That was a tech support scam joke lmao
yeah that's what I'm saying, ai servers are full of scam links
more than others
Should work, even cutting the original multiple times should work
I've heard that using the same image counts as data poisoning but maybe they are wrong dunno
Not what I heard, but hey the best way to really find out is to try it
because otherwise, if I can use "variation" images as completely new ones, my dataset increases from couple of dozens of images to hundreds
that for a lora
if you want to do a full finetune you need more
Im getting good generations now after some tries, now Im just nitpicking
Tagged you on the chat with images for example
so I suppose I have to change some tags and images
Far as I remember reading, that char's Lora had about 20 or so pics
That does matter indeed yes
10-50 Ive heard
like 10-20 the bare minimum
This specific one is 20
yeah and you also have to change the number of repetitions depending on dataset
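The relationship between dataset size and repeats is usually just step-count bookkeeping: steps per epoch are images times repeats divided by batch size, multiplied by the number of epochs. A sketch of that arithmetic follows; the target of 1500 steps in the example is an arbitrary placeholder, not a recommendation from this chat.

```python
import math


def total_steps(num_images, repeats, epochs, batch_size=1):
    """Optimizer steps for a run: ceil(images * repeats / batch) per epoch."""
    return math.ceil(num_images * repeats / batch_size) * epochs


def repeats_for_target(num_images, epochs, target_steps, batch_size=1):
    """Repeats needed to land near a chosen total step count (at least 1)."""
    return max(1, round(target_steps * batch_size / (num_images * epochs)))


print(total_steps(20, 10, 10, batch_size=2))            # 1000
print(repeats_for_target(20, 10, 1500, batch_size=2))   # 15
```

This is why a larger dataset usually gets fewer repeats: the same total step count is spread over more images.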
Ofc the more variety the better in Theory
I use the technique of just cutting pics with mine
Works fairly well
Not flawless I'm sure
I mean yeah you can always rotate them and the ai will think of them as new images I suppose
Any small edit to the picture should do it, cut a corner or smth
I was thinking more something like this #🏞|general-with-images message
like in these cases it's clear that the artist made 1 image and edited a bunch of face expressions with layers
Yes, that should work perfectly fine
In fact a lot of facial expression variation is good
what if they show the whole body/more stuff
and its like 90% of the image the same, except for the faces
In theory should be fine too, don't see why it wouldn't
dunno let me search for some quick examples
If it's a known char I suggest getting fan art, screenshots and much more
#🏞|general-with-images message this for example
Preferably in different styles
not only the face but a whole body
So it doesn't super glue to 1 style
is it better to crop and get the face only or just feed the whole image
At the moment I'm trying a character in a specific style so it shouldn't be a problem
Ofc, assuming each of those is 1 pic
The same char multiple times like that in the same pic... Not sure that's a good idea
then the thing is tagging so they have the exact same tags except maybe 1 or 2
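A sketch of what "same tags except 1 or 2" could look like in practice: one shared tag list, with only the expression tag changing per file. The file names, tags, and trigger word are invented for illustration, and writing one caption per image is an assumption about the trainer's expected input.

```python
def variant_captions(trigger, base_tags, expressions):
    """Build one caption per image: shared tags plus that image's expression."""
    captions = {}
    for filename, expression in expressions.items():
        captions[filename] = ", ".join([trigger, *base_tags, expression])
    return captions


# Hypothetical face-set files that differ only in expression.
faces = {"face_01.png": "smile", "face_02.png": "angry", "face_03.png": "crying"}
for name, caption in variant_captions("mychar", ["1girl", "upper body"], faces).items():
    print(name, "->", caption)
# face_01.png -> mychar, 1girl, upper body, smile
# face_02.png -> mychar, 1girl, upper body, angry
# face_03.png -> mychar, 1girl, upper body, crying
```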
not in the same pic, it's just an example
Oh, should be fine then yea
just imagine that in the example I gave you they are separated in 4 diff images
Yes, that should be fine
that opens a lot of possibilites then
Indeed it do
do you have any tips for tags
I just use WD14 then filter some things that could be wrong
Well, best tip I can give you is making the trigger a unique word
I still dont get this trigger thing, do people use triggers or not, which is better
Also the most words possible.
Some peeps only use 1 word for the whole thing and then it's a mess if you wanna change literally anything
for a character
Like removing a hat
I do, so I advise doing so
Imagine a character like Son Goku which always has the exact same hair
You could just give the hair "Goku hair" tag to avoid any confusions
Or if you're brave lol short hair
do I remove black hair and spiky hair then?
Do whatever you find better
do you have some examples you trained on?
Trial and error
Not sure I understand the question?
as a general rule,you have to remove the tags you want the model to learn right?
if you are training dunno, a goku character, you remove all the tags that define "Goku"
am I correct?
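For reference, the "remove the defining tags" approach being asked about can be sketched as below: tags the trigger word should absorb are dropped from each caption and the trigger is put first. Whether this beats tagging everything is not settled in this conversation, and the trigger word and tag set here are hypothetical.

```python
def prune_caption(caption, trigger, absorbed_tags):
    """Drop tags the trigger word should absorb and put the trigger first."""
    absorbed = {tag.lower() for tag in absorbed_tags}
    kept = []
    for raw in caption.split(","):
        tag = raw.strip()
        if tag and tag.lower() not in absorbed:
            kept.append(tag)
    return ", ".join([trigger] + kept)


caption = "1boy, black hair, spiky hair, orange gi, smile"
print(prune_caption(caption, "g0ku_char", {"black hair", "spiky hair"}))
# g0ku_char, 1boy, orange gi, smile
```

Tags that stay in the caption (like the outfit here) remain things you can prompt away later, which matches the hat example discussed nearby.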
Remove? I don't remember doing so no
Honestly, ya should just try the same tutorial I use lol
Used*
No expert by any means, simply had a guide to make mine
Or use Civitai to train it
There it's not free costs buzz
But it has an amazing in-depth guide
I prefer training locally
btw is there any reason characters' eyes look weird?
like the body usually looks ok but the AI seems to have problems with the eyes, I see this with many models
lack of using hires fix i'd assume
it's amazing to fix eyes, give a lil more detail and fix artifacting
Yeah lack of hires or ADetailer
what exactly is hires fix
And if it isn't a close-up it's not gonna look great without it
idk the tech mumbo jumbo terms but for normal peeps like me: it seems to make the picture twice (a second pass at a higher resolution) but fixes eyes, adds detail and fixes artifacting
assuming you're using a111 like me there's a lil box you can tick to use it
there's many upscalers tho
i suggest remacri
never disappointed me
its just a high resolution upscaler then
guess so yea
so you tag everything and add a trigger word then?
to use the hires fix? it doesn't use any trigger word
no I mean for a character lora
if you're asking if you should tag everything in the char? i'd say so yes
for example if a char has a hat and there was no word for it... that shit ain't coming off lol
I heard both things, tag everything and add a trigger word or remove tags that define the character
well, just telling ya my process, i don't remember removal
but ya can try both and see
which works better
sounds like a case of trial and error to me
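To make the two approaches easier to compare, here is a minimal Python sketch that builds one caption per image either way: keep every tag and prepend a trigger word, or prune the tags that define the character so they get absorbed into the trigger. The function name `build_caption`, the trigger `g0kuchar` and the tag list are made up for illustration, and this is not how any particular trainer does it internally.

```python
def build_caption(tags, trigger, prune=()):
    """Return a comma-separated caption with the trigger word first.

    tags    -- tags from an auto-tagger such as WD14
    trigger -- unique trigger word for the LoRA (hypothetical value below)
    prune   -- tags to leave out so the trigger word absorbs them
    """
    dropped = {t.strip().lower() for t in prune}
    kept = []
    for tag in tags:
        clean = tag.strip()
        # Skip empty tags, pruned tags and exact repeats.
        if clean and clean.lower() not in dropped and clean not in kept:
            kept.append(clean)
    return ", ".join([trigger] + kept)


tags = ["1boy", "black hair", "spiky hair", "orange gi", "smile"]

# Approach 1: tag everything, add a trigger word.
print(build_caption(tags, "g0kuchar"))
# g0kuchar, 1boy, black hair, spiky hair, orange gi, smile

# Approach 2: drop the tags that define the character.
print(build_caption(tags, "g0kuchar", prune=["black hair", "spiky hair"]))
# g0kuchar, 1boy, orange gi, smile
```

Writing both caption sets out and training once with each is a cheap way to do the trial and error mentioned above.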
Hi
Hello
just train on a single image, that's fine
wdym
you cant train on a single image only
you can
you can also train iteratively: build an initial model by training on a single image. use this model to generate a larger set of training images (using e.g. controlnet to increase variety) and then train again
thats cool
I just wanted to know if I could do this #🏞|general-with-images message
I have tons of those and if it helps when I tag them correctly then Ill do it
it might help to teach the model which prompt stands for which expression
but I don't think it helps much with training a single face
I have tons of images, its just that some are cloned like that
looks like rpgmaker face sets 😅
yeah its basically that
many Vns and games do that
if I feed it with more different images it should help the training right, as long as tags are good
so I increase my dataset like 10 times
what do you want to achieve?
I don't think that these face pics are complicated to generate with flux. I don't think you need a lot of training data
at this moment create a character lora
I already have good results but I want to improve it
its just that I dont have that many images but I do have that many images if I include the face expression variations
hey
Gonna try to train on a single image and all of its variations and see what goes
Btw does anyone know the purpose of image repeats in a dataset?
I know this option makes an image be processed X times during an epoch
But I don't know what does that really mean for the training
Why is it a thing?
You can use it for weighting. Let’s say 10 of your images are great, so you might repeat them 4 times. 10 are good, so you repeat them 3 times. And 20 images are for reference but not your training goal, so you might repeat them only one time. That way the training is more biased towards your best images.
So it's a way of giving more influence to some images and less to others
I've heard this is also good to increase if you have a small dataset vs a big one, to compensate
What I don't really get is that this logic points to a standard N° of steps threshold everyone is trying to imitate
But I thought there was no formula for this stuff
let's say you train a model on a specific style and your training images are men and women. You have 30 images of men but 90 images of women. So you set repeats on men to 3 such that on average the model sees the same amount of men as women
if your model sees many more women than men then it gets biased towards women. You see this very often in anime models where the model generates a woman even if your prompt asked for a man
however, anime models are extremely biased. Having a 1:3 ratio would not induce such a strong bias
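The arithmetic behind the 30 men / 90 women example can be sketched in a few lines of Python. This is only a rough illustration: `balanced_repeats` and `steps_per_epoch` are made-up helper names, and it assumes the trainer shows each image `repeats` times per epoch, which is how repeats are described above.

```python
import math


def balanced_repeats(image_counts):
    """Pick a repeat count per group so every group is seen about equally often."""
    largest = max(image_counts.values())
    return {name: max(1, round(largest / count))
            for name, count in image_counts.items()}


def steps_per_epoch(image_counts, repeats, batch_size=1):
    """Samples seen in one epoch (images * repeats), divided into batches."""
    samples = sum(image_counts[name] * repeats[name] for name in image_counts)
    return math.ceil(samples / batch_size)


counts = {"men": 30, "women": 90}
repeats = balanced_repeats(counts)
print(repeats)                                        # {'men': 3, 'women': 1}
print(steps_per_epoch(counts, repeats, batch_size=2))  # 90
```

Both groups end up contributing 90 samples per epoch, which is the balancing described above.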
guys what the hell is DPM++ SDE CFG++
i cant find it anywhere
krita diffusion seem to not have this sampler
there is dpm++ 2m cfg++ and dpm++ 2s a cfg++ but not sde
ohhhh its only comfy and reforge thing
That makes sense thanks
I can also set repeats to 1 and just copy paste the images in my folder
I think in some GUIs you can only set repeats to ALL the dataset, not just some images
Is that basically the same thing or does the logic behind it not exactly work like that?
yes, it's the same, just needs more preprocessing time and disk storage
So as you said, if I have repeated images like those RPGmaker sheets or visual novels, even if I tag the poses and the elements in those images, the model will be more biased to show those at generation with neutral prompts
So I should have that in consideration while prompting and while tagging because it will have a bias with those repeated elements, like character poses
I'm only assuming, but I had some generations copy the poses of my database if not specified
For example, if I say "looking at front" sometimes looks from a diagonal, because my dataset has those, it is *technically looking at front (dunno if I'm explaining it well)
is 4070 fine for creating lora?
Absolutely. The only restriction you have is VRAM, depending on the dataset and which checkpoint the training targets.
If you're willing to spend a lil you can also train on Civitai, that way you're not dependent on hardware.
Think it's about 5 euros to train a Lora
And far as I'm aware they do come out well (?) dunno, never done myself
works pretty good imo
Morning
What kind of systems are you guys running? Because mine crashes on wan video generating (3080TI)
5080, takes bout 10-12min
Hi, when I try to use any name for example "GLOBE" the ai generates an image with incorrect letters. eg. it would make it "GLLOBE". any idea how to correct this?
Assuming you're using XL, luck
Sharing this in case anyone else is deep into messing with GPT model and figuring out jailbreaks prompts. Its a list of prompts and models that the prompts can be used on. uncensored/jailbroken.
Not mine. Just figured someone else might want to check it out:https://shrinke.me/Z6zVWp
If the link dies, I’ll try to reupload it later. No idea how long it’ll stay up.
What changes do I need to make.. i was using Fooocus
Xl isnt the greatest at text
Flux dev /sd3.5 large is better at it but still not perfect
ok... looking to create a logo
I recommend a textless logo or photoshopping after ngl
ok... and what is ngl?
ok, but what is photoshopping after ngl
Do you know photoshop?
yes
Well youd need to photoshop said result to try and fix the text probably
ok... thats sounds good! thanks for the suggestions... appreciate it!
isn't there any sd model that is good at text?
What gpu do you have?
3060 12GB
Hmm im not sure if you could use flux dev, Maybe someone else could chime in for better text results
Well depending on how you want to integrate your text within the logo. Flux and SD3.5 are ok for generating images with short text even short sentences.
Another way would be a simple depth controlnet with the text as source and for example sdxl as output.
yes its a short text for logo, is there a specific way to prompt it to avoid the spelling mistakes it makes?
@oblique elk thanks for the alternatives
do you use basic and specific tags or only specific tags for prompts?
like eyes, blue eyes
with automatic taggers I get many clothing tags
and I dont know if I have to leave coat, white coat, white long coat or only white long coat
@oblique elk you know when you use wd14 you get tags
Or while prompting, you can get tags that include other tags
Maybe eyes is a bad example
For example if a character has a white coat, do you put coat, white coat
Or do you only put white coat
What is better?
Ahh bottom one
Personally
Both work
Its trial and error ngl
No one size fits all solution
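If you want to try the "only the most specific tag" option on a whole dataset, a small Python sketch like this can strip the general tags automatically. It assumes a very simple rule (a tag is redundant when a longer tag ends with the same words), and `drop_subsumed_tags` is a made-up helper name, not part of WD14 or any trainer.

```python
def drop_subsumed_tags(tags):
    """Drop a tag when a longer tag ends with the same words ("coat" vs "white coat")."""
    result = []
    for tag in tags:
        words = tag.split()
        subsumed = any(
            len(other.split()) > len(words)
            and other.split()[-len(words):] == words
            for other in tags
        )
        if not subsumed:
            result.append(tag)
    return result


print(drop_subsumed_tags(["coat", "white coat", "long white coat", "eyes", "blue eyes"]))
# ['long white coat', 'blue eyes']
```

Since both tagging styles reportedly work, running the same training with and without this filter is one way to see which your model prefers.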
Did you get to see any patterns?
Maybe if it's a very complicated images, too many tags is not good
Or maybe more detail is better, but if a model is already trained it should at least have some kind of knowledge right?
XD
But honestly tag based prompting is nice for smaller prompts to keep it accurate as possible
But once you get complex prompts, cant beat natural language
Illustriousv3 will support it
Illustriousv2 works sometimes with it but also does tags
Illustrious v0.1 does only tags
why not post the question while you wait for someone to come online eh?
you wanna stream AI?
no i need help with setting up workflow
well if you wanna stream comfyUI workflows im afraid you're gonna need to dive into it a little
stable diffusion
though if you use forge its pretty much working straight out of the box
yes thats a model
i just wanna make some spaceships 😭
uhh
its important to know before i recommend anything really
NVIDIA GeForce RTX 2060 SUPER
and i have 16 gigs
i can run stable diffusion
but idk how to make images 😭
8gb vram works for xl
https://github.com/CS1o/Stable-Diffusion-Info/wiki/Webui-Installation-Guides i recommend Forge (since theres more guides )
its not just one thing lol
oh
you need help installing the UI it seems
https://www.youtube.com/watch?v=zqgKj9yexMY&list=PL-pohOSaL8P_VxpGxcay1EJFtqX4m8WqZ this video will help the way you describe it
you can post images in #🏞|general-with-images
k
i DO recommend following the installation guide from CS1o however in the github link i posted above
yeah i see the image
the video will help in this case 👍
posted
why is stable diffusion only capable of generating one image per prompt?
Depending on the UI you can do multiple images at once but theres generally no speed improvement
I can make 3 images concurrently but i save about 0.7s or cost 1.5s on my 5080
So its not really worth it to do it like that on consumer hardware
what settings are great for flexible character illustrious lora?
flexible as can handle more loras, styles, concepts, etc? and not getting baked..
im using civitai's trainer btw
Hello everyone!! So i have some questions about stable diffusion models. What model do you think is the best and can you recommend me one?
It completely depends on your usecase and hardware
You want anime? Illustrious type models, you want realism? 3.5 large/flux
Dont got the vram? Sdxl
If you got less than 6? Well i think sd 1.5 is pretty cool still
hey guys!
I'm akash and i head bd at Hive intelligence. We're building infrastructure for AI agents
have a few synergies in mind and would love to connect w the bd team at Stability AI to discuss this in detail
can anyone from the team help me connect to the right POC?
I think someone here said it before, but tags are "aware" right? if I make a custom tag in my training or prompting like potatocamp the AI knows about potatoes and camps and it will have some kind of influence over that tag
talking about wd14 tagging
so some tags dont have to resemble what is only shown on screen, but also concepts
It'll be more like a trigger word. to achieve this direct result in flux, i used "sp1ky" for example https://discordapp.com/channels/1002292111942635562/1004159122335354970/1362174882330316870
but it will have that context integrated?
Yep. In this case, the trick is to "tell" the training that X thing is actually Y, so instead of saying it's actually porcupine needles, i tell it that the handful of needles are "hairstraws", and the training learns that.
what I mean, imagine Im tagging something that has nothing to do with potatoes or camps, nothing. If I use a custom trigger word like potatocamp, even if its all in a single word and not 2 separated words, it will "learn" some context about potatoes and camps?
so maybe randomly in one of my generations I get a random potato generated?
Yep, hence triggerword. An all in one unique word/"password" if you will to unlock the lora's full training trigger
how am I sure then that Im using a trigger word that only focuses on the things I want to train on
with characters I suppose I could use a triggerword that already has the name of the character on it
like duno CustomGoku
it isnt really Goku but it will get characteristics from it right?
you want your keyword for your lora to be something that's not going to be normally used in a prompt so it's not triggered accidentally
so you wouldn't use potatocamp - you would use p0t@toc@mp
so its context aware then
in this case to confuse the AI and not trigger potato generations or concepts related to that
ty
very, yes.
even a . in your prompt has an effect on what the AI draws. or a , or a ; <--- those are noise, but it's aware that: this is an apple. and this... is an apple aren't really the same phrase and don't really mean the same thing
and If you have something like a mix of colors what tag do you use?
I have like a white greyish background for some images
do I use white background, grey background, both?
i use phrases like "red to gold gradient"
im using wd14 tags
remember that the AI wasn't trained on color charts, or pantone colors. so try to stick with the common terms you'd find out there on the net
does using white background, grey background make the model mix concepts or does it confuse it
run those tags as the only thing in your prompt to see what the AI understands them to be. you might be surprised
I dont really get a clear answer
it also generates more stuff in a simple model like sdxl
I dont know what to do, are models able to mix colours
or is it better to stick to the color that resembles it more
the thing is, those tags aren't giving you what you think they are. to the AI they are probably just noise. unless you train a lora on those specific colors with those specific tags, the AI isn't gonna have a clue.
stick with common color names that you'd find thousands of times in the google webscrape database
but tags have information on their own as you said right?
not if they aren't in the data training set for the model, or a lora you are using with it
if the model you train on already knows what a white background or a grey background is
sure. because if you go search google for 'white background" you get millions of hits
like these are wd14 tags which mean there are already images with those tags trained on
and so that's likely a term in the database
so if I have a greyish white background, do I use white, grey or both
neither. you use the term "light grey"
if I want to recreate the background in my images as close as possible?
then you use photoshop
its just an example, imagine more concepts like that
concepts that are mixes of wd14 tags
think of the AI as a fancy sort of camera - you get raw footage out of it. you don't have perfect control and you'll need to do post production work
color matching like that is post production work
you dont really have a tag to define them or the tags are too abstract, in this case, I cant really know what grade of white or grey the tags refer to
at the moment im just tagging, I just want to get as close as possible now that im doing it
that would be why you trained a lora on that specific concept. so that when you used a specific tag for a color, the AI knew exactly what color to create
give me one of those tags that you're using please.
so while training, the model "changes" the meaning of tags a bit then
no, it doesn't.
so if for example you use white background in the base sdxl model, you get white backgrounds, but if in a training all images have a more greyish background and then you train on those and you generate images with white background prompt, you will get more greyish backgrounds
can you please tell me one of those tags you're using
white background XD
and white hair
do I use white hair, silver hair
what do you consider a "wd14 tags"
there are 2 ways of doing prompts
doing descriptions
or, doing, tags, like, I, am, doing
what do you consider a "wd14 tags"
those aren't tags. that's just you adding noise injection by the use of commas, into your prompt
that's not a phrase, with proper punctuation, so the ai won't look at is as a concept. it'll use the commas as noise
you dont know what wd14 is?, its an auto tagger for anime
and is there any reason you think a model not trained to understand those tags would have any idea what they mean?
there are models trained with those right?
like, its something very common with AI image generations
for a while now
probably the pony models
pony and illustrious models are trained on danbooru etc no
forgot about illustrious - yeah, those are the models you want to use with those tags
Ah ok
Was asking because that tag system isn't something fluid like a long descriptions
So you can't say greyish white if there is no such tag in the first place
So maybe mixing was the correct way to do it, if anyone knows please let me know
if you want the tags to actually work, use the models that are trained to understand them. those will be the pony models and the illustrious models. no other model is going to have a clue what you mean when you use the tag
Hello! In AI image generation field, what can you do with 24GB of VRAM you can't with 20GB? I have a choice between RTX 3080 Ti 20GB for 45.000 rubles (550$~) and RTX 3090 for 65.000 rubles (800$~). They are almost the same in performance, so the difference is in VRAM capacity and price.
you can load models that require 24 gig of vram into 24 gig of vram, and you can't load them into 20g
hey everyone how are you
no you cant
UAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Hey uu
waz good
I'm building a visually-driven lifestyle brand and I’m looking for an AI image expert to help me generate highly realistic, on-brand product photos using tools like Midjourney v6, DALL·E 3 (ChatGPT), or Flux AI.
Our products are luxury-inspired fashion items (bags), and we need a consistent, editorial-style visual direction that feels clean, elevated, and believable.
This job is part hands-on creation and part training
i think you need a photographer more than an AI artist
Hi everyone
Hey! I’m building Claity ai, an AI tool that integrates ChatGPT, Claude, Gemini & more in one place with smart prompt routing.
Just launched the landing page and would love opinion or support!
Join the waitlist here: https://claity.netlify.app/

You know i was about to make a snarky comment about all these advertising posts but it's the most we get here lol
Helloo
Yo
so, a1111 can't do text to vid right? preferably being able to use ILLU stuff
if not, can forge do it?
i'm trying everything to not install comfy bro lmfao
Sooner or later we will get you. Come to the dark side of comfy users and node torturers
Maybe as entrance drug try swarm 🙂
was avoiding that one too
😭
so forge can't either? fuuuuuuuuck
Not quite sure. To be honest I am not very patient, so I prefer a “UI” which is designed to be able to add new model support fast. Long time comfy user. Especially T2V models seem to be released nearly every two weeks.
i shall google it
now that my masterpiece is finished
doesn't seem like it. videoless i stay
ain't worth the pain lol
deforum, animatediff, etc etc there are ways to do video on a1111/forge. I don't think there is any support for wan and hunyuan stuff however.
shame, hunyuan seems to be able to use illu loras, stuff so i was looking forward to that one
so yeah do it the "old way" or switch to comfyUI
neither, i shall remain videoless
valid choice
Why avoid swarm?
cuz i'm lazy
Valid but your loss for video ai
a sacrifice i'm willing to make
Hey guys anyone got edu maıl? currently studyıng ın unıversıty can you reach me pls
i'm saying hello because discord made me
Sorry for mmy basic question but how do i generate a image? Which channel?
It isnt free?
it's not on the server for starters, there's many different ways
there's different ways
theres a variant of automatic1111 called forge that is an 'upgrade' to a1111. search up youtube videos on how to install automatic1111 forge. once downloaded you can generate images for free locally on your pc.
While correct it's dependent on your hardware
Only thing i miss from forge/auto, is the extension to fetch preview images and json info of checkpoints, lora, embeddings and so on. There's one extension now for lora and checkpoint, but not embeddings and the like 
oh yeah that had me not touching comfyUI for a while too. I ended up making my own half baked node for previews. Never released tho, code was wack
https://github.com/willmiao/ComfyUI-Lora-Manager This is the one i use now, quite neat. But wish it had a node thing for inside the main comfyui to click and auto apply lol
why is SD 3.5 Large not working? It's only generating empty images (none at all)
a111 can use embeddings
been using them for a while... rarely tho, since illu there's barely a need for it
pony is miserable without them tho lol
is it a black square? last time that happened when i got a checkpoint and forgot VAE
Tbh, use flux, way more accurate results.
it's just an image icon
welp, no clue then
I used this https://stabledifffusion.com/tools/sd-3-5-large
@karmic brook is there ever gonna be a v4? v3.5 is nowhere close to the top anymore, neither closed nor open
how does this exactly work
Good afternoon everyone! 🌞 How are you all today?
i'm ok, how are you?
Hello, has anyone using Flux NF4 All in One models in Comfyui been getting errors on their workflows since last updates? Error: mat1 and mat2 shapes cannot be multiplied (1x1 and 768x3072)?
you're doing it wrong
Fooocus, it generated me an image based on certain prompt, how do I get more variations of this single model from the photo. This faceswap shi aint working or i just dont know how to use it. Anyone willing to help
i literally do not believe that images in chat gpt image generation gallery on civitai are ai
i refuse to believe it is ai
it is literally not fair at this point
Results are absolutely mindblowing even if you write something like "make colorful manga with coherent story" in prompt
yeah crazy, maybe u know answer to my question
They were working fine before the update 5 days ago, do you have a simple workflow that I could test out? Have used some and all give the same error.
Hi
I'm new to Stable Diffusion (SDXL A1111) and am thoroughly enjoying figuring out all the ins and outs of the tech, installation, etc. After a couple of weeks of learning, I'm finally creating images that I'm pleased with. Here's my question: with the proliferation of online AI text-to-image sites, what are the advantages of SD beyond the offline/local generation and no censorship?
Hi there y'all! Can I delete the .cache folder or move it to another drive? Right now it's taking almost 20GB of space from my C drive and I'm just wondering if I can move it to the same drive where I installed SD WebUI Forge if it's not advisable to delete it.
nothing? the whole point of local is that: no filters
it's always slower for us consumers and all that, and we are restricted to whatever our hardware can handle
but ya know
NO FILTERS
Hey, you shouldn't move that folder.
But open up a cmd and run
pip cache purge
That will give you a few GB back
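Before purging anything it can help to see what is actually eating the space. A rough Python sketch, assuming the .cache folder sits in your user profile (that path is a guess, adjust it to wherever yours is); `dir_size_bytes` and `largest_subfolders` are made-up helper names:

```python
from pathlib import Path


def dir_size_bytes(root):
    """Total size in bytes of all files under root, including subfolders."""
    return sum(p.stat().st_size for p in Path(root).rglob("*") if p.is_file())


def largest_subfolders(root, top=5):
    """Return (size_in_bytes, folder_name) for the biggest direct subfolders."""
    sizes = [(dir_size_bytes(child), child.name)
             for child in Path(root).iterdir() if child.is_dir()]
    return sorted(sizes, reverse=True)[:top]


# Example usage (path is an assumption):
# for size, name in largest_subfolders(Path.home() / ".cache"):
#     print(f"{size / 1024**3:6.2f} GB  {name}")
```

This only reads file sizes, it does not delete or move anything.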
howdy all 🙂 i have a few servers here at home i run proxmox on, and am thinking about making a vm to run stable diffusion with. im just wondering what GPU i should pass through to the vm. the gpus i currently have are as follows:
Tesla K80 12gb x 2 (Dual GPU card)
Quadro M4000 8gb
1060 6gb
id like to use the tesla gpu but have seen talk of the Kepler architecture maybe not being supported for a lot of things. ive debated buying the M40 instead because of this.
idk what do yall think? or should i be trying for something Pascal or newer?
hi hi !
Will do that, thank you!
you have so much more control with all these things like ipadapter, controlnets and so on.
ChatGPT/Sora is great, but even there you not always get perfect results with the first try, but you don't have much possibilities/control to improve the image afterwards
I've been out of the loop for a while, have there been any large/notable improvements in the world of open source image generation in the last couple of months?
should i put vertical borders on both sides of a vertical image to make them fit a square for lora training
and tag it as "pillarboxed"
or will that screw the training over and put out generations with borders
hi what is this server about
Hi, have you heard of ai generated images before?
Just tested LTXV, and after a few gens, it feels like it's basically "animatediff done properly", while wanvideo is a full on video-only gen
https://github.com/OpenLounge --- Follow up my startup organization 🌟🌟
What's the difference between all the recent ai video generators and creating videos using SD? I think SD uses animatediff?
I am assuming with SD its only good using img to video, or video to video, basically it can't really generate animations on its own?
hi just wondering how important is resolution for training images for lora (im not talking bout the parameter setting) ? im using 768 x 1465 for some of my training images, is this enough ?
Hhgg
Hello !
Hi! I’m new! Is it possible to create images from text directly here on discord?
video models are more efficient/faster and more coherent
with animatediff you see the image (in particular background) changing all the time
also, video models can do stronger changes and animations without getting incoherent
yes and no,
yes #artisan-faq
no its not free
we mostly run it locally
yeah, you don't want to use Artisan... ^^°
imo the civitai site one is a billion times better
the whole idea behind Stable Diffusion and Flux is that you can run it on your own gpu.
If you want to do it online, there are so many websites offering strong models like Sora, Midjourney, Flux Pro
Stable Diffusion 3 is not competitive to those in my opinion
if you want to start generating images on your own graphics card, look for tools like
- SwarmUI
- InvokeAI
- Reforge
Idk about the Kepler architecture being supported but you may want to look at the Tesla P40 since they typically are very cheap on ebay
Animatediff was one of the earlier attempts to make animated generations. Wan/hunyuan can do text to video and img/video to video much more effectively and cleanly. And wan is basically a step up from hunyuan, but requires quite a bit more video memory.
Hello
how to generate a picture
Hi there
Does anyone know how to take an image, preserve a central part of it, and then use the prompt to build things around it or something?
hi guys im just wondering im using sd 1.5 and my stupid 8gb vram pc keeps crashing when upscaling and i dont have the money to pay for cloud, do you think its better if i save money and buy a 16gb vram or a 24gb vram ? i know 16gb vram is enough, but if 24gb vram is much faster and can do some other things that 16gb vram can't i will consider buying it. any feedback appreciated 😄
What's your GPU and your webui?
rtx3090 and automatic1111
ah wait might be wrong i will check again
but im sure its 8gb vram
rtx 3060 - its weird bc it says its 12 gb in google but thats what pops up when i task manager
Are you on a laptop?
no, computer
OK strange a 3060 should have 12gb and a 3060ti has 8gb
do you have any recommendation of what i should get ? i guess logically since 3090 is 24gb and its older than 4000 and 5000 models its the best cause only vram matters ?
If you get a cheap 3090 thats a go to. But if not go for something like a 4060ti with 16gb vram
But we can fix your Upscaling problem too
You need to add --xformers --medvram-sdxl
to the COMMANDLINE_ARGS line in your webui-user.bat for the best performance
Then you shouldn't have problems with upscaling or using sdxl models
@warm junco omg ty so much i just checked the price its so cheap. you da real mvp bro. i thought i had to sell my kidneys for a new gpu xD
Np ^^
Vram matters so go with the best price/vram option.
If you just want to use ai stuff for fun then nothing wrong with 16gb.
If you want to do professional stuff or high quality video generation then 24gb or more is required.
nice thanks for telling me this. i want to pursue it professionally so i have to consider this well. hayzzz why does everything need money haha
can community guides remove ppl? because theres a scammer/phishing attempt in #🍥|anime
Can anyone help me with stable diffusion forge?
Anyone know where to find the pre-github-removed reactor?
Where to find releases?
yall ever heard of "invoke ai" ?
apparently it's crazy good
been meaning to move on from a111 so might give it a shot
One of the top 7 UIs for local AI generation I would say. And as usual it depends a bit on the use case.
2080ti 22gb is almost half the price of a 3090 and its OC'ed so close to 3090 speed
drivers work well too
hello
guys
my noobai just broke
i rechecked couple times and my settings are perfectly fine
theres a shit ton of artifacts on the image
my wai also broke......
what the fuck
Check the VAE you are using
checkpoint default
i never touched this setting
Yeah uh try grabbing the VAE from a checkpoint you know has a VAE embedded or just download the VAE as FP16 separately and load it (it's like 200MB it's not that big)
If you are using ComfyUI, also try re-making the nodes to see if there was maybe an update that broke the old ones
Especially the Sampler
im using krita diffusion
at this point id just block any link other then huggingface, civitai, reddit and github
can anyone tell me what settings i should use to cut the time taken to generate images without costing quality in stable diffusion
currently using: automatic1111 (ui),
models: juggernaut XL, realistic vision,
sampling method: DPM++ 2M,
refiner: checkpoint juggernaut XL, switch at 0.9,
sampling steps: 80,
on rtx 4060 without using --medvram
moving to forge, lowering the steps
A1111 is hilariously outdated and forge is optimized for XL
80 steps is excessive imo
what is best sampling method
its different per model
for juggernaut xl
what version are you running?
as it is producing better images from other models
anyways for jugg XL, rundiffusion recommends:
```
Recommended Settings (VAE is baked in):
Res: 832*1216 (For Portrait, but any SDXL Res will work fine)
Sampler: DPM++ 2M SDE
Steps: 30-40
CFG: 3-6 (less is a bit more realistic)
Negative: Start with no negative, and add afterwards the Stuff you don't wanna see in that image.
VAE is already Baked In
HiRes: 4xNMKD-Siax_200k with 15 Steps and 0.3 Denoise + 1.5 Upscale
```
(juggernautXL_juggXIByRundiffusion.safetensors [33e58e8668]) where i can check version ??
and should i run on --xformers --medvram
no idea about the xformers but medvram could help
no idea though since you're using a hella outdated UI
and about refiner setting, what is recommended??
last line:
HiRes: 4xNMKD-Siax_200k with 15 Steps and 0.3 Denoise + 1.5 Upscale
Always use xformers for auto1111
And dont use any refiner
If you have 8gb vram then also add --medvram-sdxl
Then you have the best performance
¯_(ツ)_/¯
This looks fun
https://openmuse.ai/ some of the collection is wild
Hello, I would like to use stable diffusion locally, does exist a package ready to install with all plugins and setup ready to create images ?
#🤝|tech-support first pinned message, however you need a pc with at least 8gb vram for decent speeds
For sdxl
Hello
I’m still pretty new to LoRA training, but I’ve had a clear plan in my head for months – now I just want to finally put it into action.
I really want to get things moving, but setting everything up (Kohya, WebUI etc.) is dragging on for way too long.
I’ve got 30–60 images, generated with Astria AI.
The face is nearly identical across most images, although there’s still a slight AI look to it. The poses are pretty repetitive – mostly frontal or simple angles, very little variation.
What I’m looking for:
Someone who can build a strong starter LoRA model out of this. Ideally, someone who can also give me a bit of feedback or guidance (I’d even be happy to chat on WhatsApp if that’s easier).
I’ve spent over 50 hours working with Kohya, Stable Diffusion, WebUI, etc. – but I’m honestly just not that fast with tech stuff. I need help speeding up the process and doing it right.
I’m definitely willing to pay!
Depending on the level of support and effort, I’ll compensate fairly and immediately.
I’m based in Germany, super motivated, and really ready to get things rolling.
Would appreciate anyone who’s open to help – thanks so much!
Just use luca tacos trainers on his replicate repo
Thank you for the response, can you explain that a little bit?
question: does anyone know how to transfer drawing style into an image using controlnet ? for example image A(reference image): any ghibli style drawing, e.g., ghibli house, image B: keanu reeves = image C(generated image): keanu reeves in ghibli style
hello
Man, i love this SHA256 hash script that i had GPT make
Scans all my models from checkpoints, embeddings etc to find duplicates as i'm in the 100's and 1000's of each lol.
Sadly it's quite slow because i'm scanning my server's model repo at 100MB/s, so it's gonna take a hot hour lol
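Not the actual script from the message above, just a minimal sketch of how a SHA256 duplicate scan like that could look. The extension list and function names are assumptions. It groups files by size first so only same-size files get hashed, which can save a lot of reads on a slow network share:

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Assumed model file extensions; adjust to whatever your repo holds.
MODEL_EXTS = {".safetensors", ".ckpt", ".pt"}


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in 1 MiB chunks so multi-GB checkpoints never sit fully in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return {sha256: [paths]} for every hash shared by two or more model files."""
    # Cheap pre-filter: files of different sizes cannot be identical.
    by_size = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in MODEL_EXTS:
            by_size[path.stat().st_size].append(path)

    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # unique size, no need to read the file at all
        for path in paths:
            by_hash[sha256_of(path)].append(path)

    return {h: sorted(p) for h, p in by_hash.items() if len(p) > 1}
```

Usage would be something like `find_duplicates("/path/to/models")` (hypothetical path), then printing each hash with its list of paths and deleting by hand.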
Hi there, you would need the ipadapter instead of controlnet.
@oblique elk thanks !
@oblique elk hi, sorry, could you go into more detail, like what are the exact settings, like preprocessor and model. sorry im really bad and all the youtube videos ive watched seem to be about image reference + text prompt rather than my desired image reference + image
Well you could simply google for ipadapter style transfer
And go for a video or blog post:
Morning yal
Just recently got into stable diffusion, learning all the tools on how to make it effective
Is it possible to have SD use an image onto another image creating a new one utilizing both?
I was able to create an image using chat gpt, and then merged the image that was created with a google image to create a new one
A detail that is important, is that the google image was a greatsword. I had replaced the ai generated greatsword with one I chose from a screenshot ( from an mmo )
Substituting aspects of an image with another is basically what im wondering.
100x easier
through photoshop
than doing inpainting
in my HUMNLEEE opinion
go to Luca taco's replicate repo here https://replicate.com/lucataco and scroll through it, look for his trainers. and use one of them that fits the base model you're wanting to train a lora for
is installing comfyui from pinokio the same as from the github?
Thank you man, can you recommend a model for my project: I have about 50-70 pics of a woman generated as a lora from Astria AI but I only have the pics, no safetensor data. And then about 20 pics further developed by me. Tried all sorts of programs, facefusion, kohya gui, comfy ui, but nothing really sensible came out of it 🤣
.
not really, no
hey!, I just joined and I am new here but I was wondering if anybody can help me out with stable diffusion, my pc is very capable for generation but things like hands and feet are still a bit wonky, just send me a dm if you are able to help me out please!
Oh, deleted as i was about to report lol
when making a concept lora how many images are you guys generally using for the training dataset?
Im trying to load flux 1-dev in comfyui but why's it using my ram instead of vram?
How much vram do you have?
24gb
Is there a hub or place like a website or discord server apart from CivitAI where I can download Lora character models?
hello
one way is to first "photoshop" the object into the image (you don't need Photoshop, gimp or even paint are sufficient) and then inpaint over it
Hi, I have a character that I generated and I want to use him for more pictures. How do I do this?
Hello guys, i need someone help
who can teach him SD
Draw raccoons with Hong-Kong egg waffles in their hands
SD doesn't understand what a "Hong-Kong egg waffle" is
I'm gonna be honest, i don't know what an egg waffle is either, but you could try image to image
Seeing it's cone shaped, you could try generating an icecream cone at first, photoshop the egg waffle over it and do an img2img / inpaint to
Match the style
Hi everyone. is there someone who has experience in replicate API integration?
are there any open source IMAGE GEN (not video) to SORA? that level of quality?
Image to sora?
Or you mean images at the quality of sora?
open source models that are comparable to SORAs image gen capabilities
sorry i didnt phrase it properly
I have a character that I generated and I want to use him for more pictures. How do I do this?
depends. For many tasks open source image gens are comparable or even better than Sora.
In terms of prompt following and text generation, Sora is clearly superior, though
but Sora is quite new. It's normal for a new model to be better than older models. Just wait a few months until new open source models come out
can you recommend me a model to use? i am new to this
i set up an environment, but not sure what to download
Flux is usually the best I would say. Except for anime stuff, where Illustrious and NoobAI are the most popular ones
But Flux + custom model/loras on civitai is usually giving you great results
thanks 🙏
Hi everyone, I'm new to generating images with AI, nice to meet you, I was wondering if anyone could help me get this ( animatediff_motion_module_v1.safetensors ) file so I can animate my drawings with AnimateDiff
Tbh, you're better off toying with framepack, apparently works with only 6GB vram
https://github.com/lllyasviel/FramePack
Or give ltxv a try, quite better than animatediff. https://github.com/Lightricks/LTX-Video
My goal is to make an ai music video rhythmically arranged by ai to my music. I am using an amd 6700xt but would consider buying a used RTX 3090 if necessary... do you have suggestions for what I can try to use for this? Thank you in advance
First question, how much ram and vram do you have? As flux can be quite hefty.
With framepack, you don't need as much vram, as it can do repeated few seconds-half a second runs "infinitely". But yeah, look at used 30 series cards, even if they're hella power hungry, they're quite cheap, and with 30 series, you can install triton-windows for comfyui and sageattention which can speed it up further by up to 30%
I came to ask about the best locally run video generation models of today. FramePack and LTX-Video, is it?... Everything standalone these days? No more SD-webui integrations like with animatediff?
hello all, new to Stable Diffusion. Still getting used to prompting. Been trying Layer diffusion with no luck yet. Seems it is broken. Maybe somebody knows a way to get it working better? You get the checkerboard background, and the image that is supposed to be transparent has its colors smeared across the background, but you do not get any transparency.
Wan is better imo but ltx/frame pack runs with less vram
But yes, stand alone nowadays
should I bother with rtx 4070 ti 16 gb?
Running with a 5080 16gb vram and it takes about 10min for 81 frames
the 1.3B model? (ComfyUI integration)
The 13b
I can only see 1.3b and 14b, and the latter already exceed 16 GB. Anyway, I'm gonna give the small one a try. Thanks for the tip.
Woops, meant 14b, yeah 14b works on 16gb albeit slow
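Rough back-of-the-envelope on why the 14b is tight on 16gb. This is a sketch that counts the weights only; it assumes 2 bytes per parameter for fp16/bf16 and 1 byte for fp8, and ignores the text encoder, VAE and activations, so real usage is higher:

```python
def weight_gib(params_billion, bytes_per_param):
    """Approximate memory for the model weights alone, in GiB."""
    return params_billion * 1e9 * bytes_per_param / 2**30


for name, params in [("1.3B", 1.3), ("14B", 14)]:
    for precision, nbytes in [("fp16/bf16", 2), ("fp8", 1)]:
        print(f"{name} @ {precision}: {weight_gib(params, nbytes):.1f} GiB")
```

By this estimate the 14b at 16-bit needs well over 16 GB for weights alone, so on a 16gb card part of it has to be quantized or offloaded to system RAM, which would fit the "works albeit slow" experience.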
1.3b is comparable to the old ltx ngl
On RAM? Must be pretty slow alright. Maybe I'm gonna try that, too. Need to set everything up first. Switching from A1111 to Comfy just now.
Try swarmUI, it's pretty good at making it easy to run
does tiled upscaling for stable diffusion 1.5 cause more deformation when upscaling ? im forced to use tiled upscaling bc of low vram, so ive never tried normal upscaling and want to know the experience of other people
does anyone here use forgeUI?
@floral umbra I took a look at the github for FramePack. I'm not sure what I need to download. Is installing from Comfy Manager enough to get up and running using the ComfyUI_RH_FramePack?
Hiii guys, girls and queers !
Can anyone here help me set up SD / comfyUI on a Linux VM (win11) with AMD 6700xt GPU? 😅
Having issues!!! 😂
Click Here to Download One-Click Package (CUDA 12.6 + Pytorch 2.6)
You click on this one bit down on the git. It's standalone and comes with everything needed.
Better to ask the actual question than askin if you use X, as that leaves no one interested :P
You'd want as much vram as you can buy if you want higher res/longer-per-gen videos
And there's apparently a intel B580 24 and 48GB in the works that may cost less than a 4070.
i really should link this to people more ngl
https://dontasktoask.com/
Hi Everyone, 👋 new to this channel
Foobar123:
Any Java experts around?
This is bad form, for several reasons. What the person is actually asking here is,
Foobar123:
Any Java experts around who are willing to commit into looking into my problem, whatever that may turn out to be, even if it's not actually related to Java or if someone who doesn't know anything about Java could actually answer my question?
yo welcome mike
Quick question - is there a recommended channel/thread on this server (or elsewhere) to look for freelance SDL devs for an upcoming project?
this channel, #🏞|general-with-images or #🌶|off-topic
Much appreciated!
granted you might have more luck on fiverr if you want to hire somebody
Got it - thank you
hmm the more specialized servers like L3 or Bandaco have more power users
here it's mostly people asking basic questions or how to get started running it themselves nowadays
Yep lol. In the server i co-moderate, we have a ?justask command which brings up our in-house bot and links to said website 
Its often the most silly questions too
But ngl i hate discord as the normal forums have died and you can't tell people to just Google it anymore
Nope. Literally 20% of the time when i google up an issue, i find the same problem asked from half a decade ago, and first comment says "just google it".. Like dude.. i did, it led me to your useless comment lol
I should make a basic flowchart / faq since its mostly the same question being asked over and over
Maybe a PowerPoint so they cant accidentally skip a step
Bros posting this 3-4x a day
And deletes their old post 