#general-chat
Dang. And I did, can't mimic the blend of artists i like. Anilustrious and Nova3d can get there, but neither is for XL
can i run comfyui and wan video on a 4GB nvidia graphics card?
Hey people
Can anyone here help me with runpod, trying to create a Lora. Walking into too many problems. Especially when I do all the preliminary things, press start training and nothing happens. Endless clicking
I just use RunPod, well, I just learnt what it was. Seems like it'll be good if you need higher graphics card execution
this is probably a path problem, a missing dataset, or a corrupt configuration file. those frequently cause training not to start after setup
Walked into it 100 times, no idea what I'm doing wrong, especially after waiting ages for my files to upload
well, that's quite hard. best possible idea is to really double check everything or really start from scratch
What do you use? If you do?
uhhh A1111. pretty user friendly. i mean i also try to make sure everything is configured correctly. I usually begin by double-checking my dataset and configuration files. then i monitor the logs throughout training to catch any errors early. also, running a small test batch helps identify problems before full training
guys, when i'm generating anime, there is some floating stuff, like a view of an internal organ
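The "double-check the dataset and config before training" routine above can be sketched as a small preflight script. This is only a sketch under assumptions nobody in the chat confirmed: that the config is a JSON file, that each image has a caption `.txt` file with the same name next to it, and the function name `preflight` is made up for illustration.

```python
import json
from pathlib import Path

IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}


def preflight(dataset_dir, config_path):
    """Return a list of readable problems; an empty list means nothing obvious is wrong."""
    problems = []

    dataset = Path(dataset_dir)
    if not dataset.is_dir():
        problems.append(f"dataset folder not found: {dataset}")
    else:
        images = sorted(p for p in dataset.rglob("*") if p.suffix.lower() in IMAGE_EXTS)
        if not images:
            problems.append(f"no images found under {dataset}")
        for img in images:
            # Assumed layout: caption file sits next to the image, same stem.
            if not img.with_suffix(".txt").exists():
                problems.append(f"missing caption file for {img.name}")

    config = Path(config_path)
    if not config.is_file():
        problems.append(f"config file not found: {config}")
    else:
        try:
            # Assumed format: JSON. A TOML config would need a different parser.
            json.loads(config.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"config is not valid JSON: {exc}")

    return problems
```

Running something like this on the pod right after the upload finishes, and before pressing start, should at least separate "wrong path" from "broken config" from "trainer problem".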
how to remove those?
Stop prompting for it? I dunno without seeing your prompt I can't tell you what's causing it.
^, if you tried to use natural language in an anime model, or used tags like "xray", "viscera" or other, well, grotesque tags, that could be the cause
noobai v-pred seems to have the most knowledge
Sadly it's just been outputting garbage tier images
Hello everyone is there a guide to jump on making LORA? I'm bored of seeing only the women from anime. Where are big muscles
maybe something is wrong with your parameters? Because it is top 1 anime model for me
Although it is true that it is less coherent than others
I was using it properly, but the output always looked like badly drawn fanart
Even adding in a style lora and it was bad
hmm, not what I experienced, where are you using it? it requires v-prediction and zsnr enabled, comfy should do it automatically
Forgeui
sadly, I did not use it for long
It's just a far more efficient tool imo. Setting up adetailer and control net on comfy was a pain
Also forgeui generates lightning fast
Speed was the same on my tests, but it is fair about efficiency. It is not really convenient for manual work
Invoke is cool for manual generative editing
I think I'll probably just start training my own Lora for what I want. The few models I stick with for illustrious are so good that it's just a matter of getting the odds and ends I want ported
that will give the best results, if trained right
So far as wishlisting, I kinda wish that there was an equivalent to NovelAI's vibe check, or their multi character prompt system
The former is such a reliable way to generate the exact same style and character parameters
maybe regional prompting and optionally quick sketching can help?
can't wait for flux kontext 
Maybe. For now I'm just learning and getting my preferred stuff figured out. I'll be upgrading to a 5080 soon so that'll help run bigger stuff
Great stuff
I am still on 3060
I'm using a 4060. Not bad, runs more efficiently than I had thought
12gb is what saves me
I need a better card though for more than just ai. High poly sculpting with blender
Blender uses GPU vram
got it
Hey guys, apparently fkunn1326/openpose-editor has been dysfunctional for some time now. The tab won't show even if you install it, and I've seen people with similar problems. I've also tried downloading the sd-webui-openpose-editor in the Extensions > Available tab, but I don't know if it didn't work or I can't find it. So, are there alternatives for SD 1.5 AUTOMATIC1111?
I also tried to open Latent Couple, but I'm having the same problem.
Anyone got a link to a decent tutorial on LoRa training?
Hey guys, I would like to morph one face into another, generating e.g. five intermediate images. Is there a model that you know of, that could perform this task? Many thanks, chillbird
@tawny drum search not4talent on youtube hes the best ! helped a lot although im too untalented to make a decent lora ive made a barely passable one with his help. its for sd 1.5 automatic 1111 tho
I think he meant to ping missalys but accidentally pinged you somehow?
Is anyone using AI to write custom nodes? I just tested how easy it was, and I was able to get a node for rotating an image, one for box selecting an area of an image, and another for loading multiple images and making them work in various ways, etc. It's basically as easy as 'one, two, three'. It takes just 1 prompt to create custom nodes nowadays, and 1 to 3 minutes. So is anyone using AI to create custom nodes?
Anyone have ideas for custom nodes?
Reach out to me
is a1111 still a good interface to use? i just now got it working with zluda xD
im too noob to give you an answer myself, so I'll copy and paste a discussion from another group chat:
"Equinox Psychosis: Don't use A1111, it hasn't been updated in nearly a year. Since extension authors are focusing on support for forks that supersede it, a lot of them won't work with main anymore. I suggest giving forge or forge classic a try.
Kylada, 5/31/2025 1:45 PM: but i am so used to a1111...
Equinox Psychosis, 5/31/2025 2:59 PM: In that case I'd recommend forge classic. It stays more true to the original a1111 webui while keeping things optimized and up to date. It does remove some of the outdated stuff tho, like the old textual inversion training and the old model merge menu (You should be using sd-scripts and super merger instead)"
thanks
i think ill stick with it until i learn how to use sd
and then will see about forge
thats my plan too
it already feels a bit overwhelming by itself, i dont need more stuff to worry about while im learning it
and using an amd card is basically a gamble getting anything ai to work on it
i swear, anytime ive tried to do anything ai on amd, ive ended up spending a weekend looking for workarounds, not just stable diffusion, but i was playing with openai whisper at one point and it too was a pain in my ass
ohh coincidentally i also used whisper for one of my college classes xD
how did it work out for you?
i used it for creating captions for a japanese concert
good. i used the cloud version but it was too slow, so i ran it locally no problem
it seems they discontinued the v2 onwards large models so i used v1 i found a workaround eventually but there was some weird stuff at the start
hm, it was pretty slow for me local, but not too bad, took like an hour for a 3 hour video
i recorded my lectures then summarized in chatgpt lolol
hm, its been awhile but i felt like i got better translations from the med model
yeah, im sure it works way better if it only using english
oh, and it didnt hallucinate a bunch of crazyness?
oh ya it had weird shit but as long as i got the general concept it was fine
true
mine had music in it so that didnt help, but also, i think long pauses really screws it up too
it would say some crazy stuff
and some times it would just say the same word like 20 times in a row
lol, man, i spend so much time learning how to get this stuff to work, use it one time, and now i dont even remember how i ever got it to work
i even installed ubuntu just for whisper
i guess i learned how venvs work tho
its funny tho, i couldnt get whisper to work in windows, and couldnt get a1111 to work in linux
Hi guys, I'm finishing the recording of my SD course for architecture in Portuguese (Brazil)
I've been recording this for 1 year, not the whole time, but that's when I started. It's 10 hours of practical video classes and 3 hours about history, installation and interface.
The course is 100% focused on architecture with A1111 and SD1.5
https://www.youtube.com/watch?v=bSaYeWK83wk&t=2s
I want to thank the server and the whole community, many of you have helped me by answering questions, giving attention, thank you very much.
I would especially like to thank @oblique elk , @warm junco and @fervent thunder . I would like to mention your names in the acknowledgments of the official course material, may I?
Yes if your focus is on 1.5 or sdxl, pony, illustrious models it works still great
yeah, im still new to it all, i just got done just playing with img2img, lost track of the time actually xD
and reported
Has anyone experience with doing motion capture with auto1111/forge/comfyui and using that data in blender? I've seen videos where they make the opposite like use a rig to create openpose images. But I'm looking for smth I can run locally that allows me to capture motion data from images/videos and use those in blender.
hiii
tried it once, extracting poses from a video using ComfyUI with OpenPose. then i used some scripting to map the good keypoints it provided me with per frame into blender. it also required some cleanup and correct skeleton matching before you can use the keypoints to drive a rig with constraints or convert them into BVH
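The "map the good keypoints per frame" step above could look roughly like this. It is a sketch that assumes the per-frame data is in the classic OpenPose JSON layout (a `people` list whose entries carry a flat `pose_keypoints_2d` array of x, y, confidence triples); the output of a ComfyUI pose node may be shaped differently, and the `parse_pose` name and the 0.3 threshold are made up for illustration.

```python
import json


def parse_pose(frame_json, min_conf=0.3):
    """Turn one frame's flat [x, y, c, x, y, c, ...] list into {joint_index: (x, y)}.

    Joints below min_conf are dropped, which is the "keep only the good
    keypoints" part; only the first detected person is used.
    """
    data = json.loads(frame_json)
    people = data.get("people", [])
    if not people:
        return {}
    flat = people[0].get("pose_keypoints_2d", [])
    joints = {}
    for i in range(0, len(flat) - 2, 3):
        x, y, conf = flat[i:i + 3]
        if conf >= min_conf:
            joints[i // 3] = (x, y)
    return joints
```

These are 2D image coordinates only. Depth, skeleton matching and smoothing between frames still have to be handled on the Blender side, which is the cleanup mentioned above.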
um hello guys, I'm a computer science student who is into deep learning. I have taken machine learning and neural networks courses and I want to expand my knowledge in that domain. I know a few deep learning topics, but I haven't really wrapped my head around how RNNs get trained yet, same with attention. So does anyone know if this server is a good place to do deep learning projects?? I know that stable diffusion is a pre-trained generative model
yea I've given up on that, the more I read about it the less sense it makes. There is actually software that outputs video to mocap data even that struggles with hands and limbs out of view. So I'd have to still clean it up too much.
any 5060ti guys in the chat
is it worth getting over the 5070 (12gb) for the extra vram
ngl compared to veo 2 and others Stable diffusion picture generation hurts my eyes
its a failed abomination
doomed to die
the XL finetunes are still pretty good for stuff that isn't complex
hello
Hi, everybody
I'm an AI developer specialized in generative AI and LLMs, LangChain and RAG
Hi guys
Hi folks. I'm trying to figure out how to generate video using a starting frame on ComfyUI, like how I'm able to do with Kling/VEO/etc. I tried WAN Image to Video, but it's not adhering to the reference image. Any help?
Are there any decent methods to make backgrounds and characters look appropriate in size compared to each other? Like making a bed look realistic and not like a toy next to them, or have them not be the same height as a door?
yo
yea the struggle is real haha try out savro though, been using it for quite some time now and it does the work. less hassle for me
Hello, I haven't heard of savro before. Is it new?
That won't allow me to do mocap on my own 3d models so kinda pointless for me. Not trying to avatar here.
yea i get you it's not for mocap folks. just love how it takes so much hassle out of content stuff. Kinda a game-changer for my workflow tbh
uh not sure if it is, havent looked that far behind. just found it on socmed and tried it out
@abstract quarry do you know any Ui for multimodal models? lm studio cant handle safetensors it seems
Had no problem at all with the example workflow from the template library. Did you use the I2V model? Or maybe the regular T2V?
hey guys im like day 1 of stable diffusion and had some questions
can someone help me?
You're better off asking the questions directly and if someone has time or knows they could reply :-)
yeah
i got the help now


artwork by noper, made using SD: https://x.com/_lisa_gallery/status/1930655934105161904
I've been using FOOOCUS UI for a long time now. I really wanna try using Flux but I'm not exactly sure what's the best way to use it locally. Do you guys have any tips?
what about no ?
if someone joined that discord, leave it. It's all scammers trying to install ransomware / get your wallet
@gloomy helm
Oh no I didn't join. But thank you for warning
can someone recommend a lora for adjusting clothes that is not Pony?
hello guys, seem to be getting problems with Wan lately. I am running on 4090 but when I run it, it is getting stuck at ksampler part, disconnects and tries and reconnect but will not finish my prompt. I am also getting this error, but not all the time. Any help? Warning: Ran out of memory when regular VAE encoding, retrying with tiled VAE encoding.
Was hoping to find some activity in here for some help too
My SD runs but the art is horrible
I'm trying to use dreamshaper but it looks like it's not even switching to it
check your dm.... and block this scammer
Bruh
there is no "external support discord" or whatever
it's just a bunch of scammers trying to get you to install ransomware and get your wallet
also if you want some help, you can ask #tech-support or #prompting-help but you're gonna need to give more details.
because "it looks bad" is not very descriptive
what software are you using, which settings, give some output examples, etc
Gotcha I'll collect some now
gotta go to work for now, I might help if I have some time.
fixed it
fixed it, but now am getting this error
Given normalized_shape=[1280], expected input with shape [*, 1280], but got input of size[1, 257, 1664]
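For anyone reading that error: it says a normalization layer built for features of width 1280 was handed a tensor whose last dimension is 1664. The rule it enforces can be sketched in plain Python; the helper name is made up for illustration and this is not the library's own code.

```python
def layer_norm_accepts(input_shape, normalized_shape):
    """True when the trailing dimensions of input_shape equal normalized_shape.

    This mirrors the check behind the message
    "expected input with shape [*, 1280], but got input of size [1, 257, 1664]".
    """
    n = len(normalized_shape)
    if n == 0 or n > len(input_shape):
        return False
    return list(input_shape[-n:]) == list(normalized_shape)
```

With the shapes from the log, `layer_norm_accepts((1, 257, 1664), [1280])` is false. One plausible cause, which is an assumption and not something the log confirms, is that two models loaded in the workflow do not belong together, for example an image encoder producing 1664-wide features feeding a model that expects 1280-wide ones.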
Hey guys, I was wondering if anyone knows if there are any models that have better prompt understanding than FLUX. I really like how FLUX understands the prompts so well, but I don't like the appearance of it that much.
well there are others that offer greater flexibility when it comes to style like RealVisXL, SavroAI or Juggernaut. you can give them a try if you're searching for something with a different appearance but a similar prompt strength
I've tried Juggernaut and RealVisXL, but they also use the older tag-cloud style right? FLUX seems better at understanding a long descriptive text rather than a list of tags.
oh yeah they do. try savro like it works with good descriptive prompts
afaik yeah no local model at this time. like if you don't want to deal with hardware or setup issues. you can directly produce models there
there are many loras and checkpoints for flux
if you are more into artistic stuff, PixelWave is a great flux checkpoint for arts
then there are a lot of loras for photographic styles in flux
The most simple way to fix that is to just edit the image yourself manually, resize whatever object, add some rough filler details, and have the ai regenerate. This can be done more efficiently using krita ai (krita with comfyui)
It's not exactly simple but it's the most straightforward way to fix such issues effectively.
Hello..
hey, is there a lora trainer for realistic here? :D
Scam
No one click that
No one is going to send you to a different discord server to give you help
Question: so I want to make ai art again. I previously used forge, and I'm told that's no longer being supported? So should I use ComfyUI, automatic1111 or swarm?
don't reach out to this scammer
(sorry, was watching their messages history didn't realise this message was days old)
@still glacier They already scammed me but they only took $55.5 billion, from me so it wasn't too much..
ComfyUI is the best. I haven't used the others except a1111. But you can do so much. Including building your own nodes to do whatever.
Yeaaa but first of all you have to train a good lora for the beginning of comfy ui but I have big big trouble with KoHYA GUI
kohyass is terrible last time i tried to use it.
something I noticed with most gen ai model that works with wan 2.1 is they associated long hair as blonde (when you do image2vid) so the vid always turns out as a blonde long hair girl
heyy, when you say building your own nodes, like does this apply in the same way when making ai influencers and stuff? like does it make it easy to customize?
guys
my flux is consistently making one image twice as long as previous one
1st image takes like 60-70 sec, second like 140 sec
consistently
why
is a model required if you are just creating ai images from existing images you have on your local folder?
and how does one accurately tell the ai image to image to use the picture included and just add a simple thing, like a hat to a bald person in the picture
while still keeping the picture the same
@bold trout img2img > inpaint > only masked area. im using automatic 1111 sd1.5 tho. if its an anime style, i can try and do it for you
it's not an anime style. more of a realistic style using f222. For example, I'm trying to keep a picture of let's say woods from black ops but make him bald or wear a boonie hat
im noob and i havent tried realistic but if you send it in the general-with-images section i can give it a go and show you my process
are existing images from the web allowed?
I'll share an example from the web in that gen with images
Also can you add prompts in complete multiple sentences like for ex: make a stickman wearing a boonie helmet. Also have him carry an ak47
@warm junco nsfw is not allowed to be talked about here in any form? right?
correct
Oh ok then. I might have to take it to the reddit then since it seems that nsfw is (somewhat) allowed there regardless of the rules? I think they allow discussion but no pics
you can ask me in dm if you like, i dont mind nsfw in private
I've seen people create all sorts of nodes and workflows for all sorts of things beyond just image generation. ComfyUI can handle more than just stable diffusion. It can make video, 3d models, audio, etc. You can even talk to LLMs in ComfyUI and have nothing to do with image generation. Being able to create your own nodes, if you are willing, or at least create your own workflows with pre-existing nodes, allows you to customize almost anything in ComfyUI. But to answer your question: yes, it does make it easier to customize 'ai influencers' or anything that can work in ComfyUI.
just started my comfy journey after learning forge and A1111. Steep initial learning curve, but its well worth it. And i havent even touched video gen or any of the other stuff.
The very basic workflow that you start with should get you an idea of how things work. Then you can start messing around adding and changing thing and press the play button and see what happens and youll start learning, if you want to go the self experimental way. Otherwise just youtube it.
had a buddy show me his flow, and he explained what each portion or group did and why. the connection dependencies im familiar with since ive done some light automation for work with n8n and powerautomate. but theres a bunch of stuff forge does under the hood that you have control of with comfy and i didnt even know where to begin looking for all of it.
If you don't know you can also use LLMs to help you create nodes yourself, if you dont want to learn how to do it, and do all sorts of other things. I havent tried asking it to build a workflow for me, but i suspect, it can actually build workflows. The only other thing is that you do have to have the models that have the ability to do whatever available for the nodes to work with
yeah, assuming its a reasoning model or can search the web, im sure it can. i had them teach me how to make wildcards when i was using forge so im sure it could probably teach me to make custom nodes. but that sounds like a lot of work. might look into it if pressing generate starts getting stale though.
Hello
I had the LLM help me make a few nodes from scratch and it took only a few minutes. I didn't need to know anything about how nodes work or how to build them, other than requesting the LLM to build those nodes for me. The LLM knows everything it needed to know - plus it has internet access for anything else. AND it's all local, and free.
huh, might have to try it out
I want people to realize how easy it is because I want people to start building a bunch of awesome nodes.
haha theres already so many to pick from, it would be hard to stand out and have your nodes be seen and used lol
I also tried Comfy for the first time yesterday, after using A1111 for a year and a half. I can see where people come from regarding the "flexibility" for building the pipeline, but regretfully for SDXL most of the extensions that matter to me still only work properly on A1111, like the Multidiffusion tiled upscaler. Just like Forge's, Comfy's version of the extension remains broken
I will still have to get into comfy one way or the other if I want to start practicing with Flux and SD3.5 though
hmmm multidiffusion tiled upscaler eh. thats something i never really messed with. im surprised it isnt already something someone has fixed up on comfy though
so does the interrogate feature describe the image source in question? when doing image to image generation?
it will attempt to tell you the prompt, but it wont extract metadata if thats what your asking. though i do believe there is a way to do that if the image has metadata attached to it
deepbooru also can interrogate and convert to danbooru tags, but its not super reliable
so it will tell you the estimated prompt of the image, or what the prompts could be for the image?
it will try to. its not 100% accurate, youll often have better luck with sharing an image with chatgpt and asking it lol but if your dealing with nsfw you can give it a shot. i know comfy has a metadata extraction node that can pull full prompts from images.
got any link for that?
they are designed for Lora trainers and checkpoint refiners, to automate the very labor intensive metadata filling for each picture in the dataset. as Retsubu mentioned, they are not meant to be fully reliable, so afterwards the user is expected to clean up the false positives themselves. but the bottom line is that as someone who is just generating and not a developer, you shouldn't look at it as a way to find out tags or what to write. it takes some time, but in my opinion the best approach is to just bookmark the danbooru wiki and look things up as you need them; with time you will memorize which tag correlates with which concept
the clip interrogator is built into forge, and deepdanbooru can be downloaded through the extensions tab in the webui itself. it connects directly to github
I was talking about comfy srry
uhhhh id have to dig through the nodes. one sec
its in the crystools node pack
oh I see, not meant for casual generators. how does one use the danbooru images from the wiki (searches) and implement them into stable diffusion?
dumb question again, where is that node pack lol
in the comfy ui manager, you can look it up and install it there
also how does one save prompts in stable diffusion in both positive and negative prompts
create a .txt file and save the metadata from your gen to it, or go into your settings and tell it to copy a file each time you generate an image of the metadata, which will contain your prompts and settings/parameters
if you mean how to save them into the pictures as metadata, that happens automatically upon generation. if you mean how to save them in the workflow, that happens automatically as you save your workflow before closing the program. if you mean how to save the prompt as stand-alone text so you can use it later... well, it is literally just text, you can copy paste it into a .txt and open it later to use as a reference
so for someone new to the danbooru wiki: how does one use those images to create images in stable diffusion
my apologies, I should have been clearer, this is something specific to anime oriented models. SDXL is trained on I think 8 billion pictures, and those pictures carry associated descriptions of their content, often in the form of short phrases. The developers of the anime fine-tunes of SDXL realised they could use this idiosyncrasy to train their checkpoints on some hundreds of thousands of anime pictures carrying those tags, so their models would be most likely to respond when prompted with those "tags". if you are not planning on using an anime model this is straight up useless for you
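Since the prompts end up inside the picture as metadata, they can be read back out with a few lines of standard-library Python. This is a sketch that only handles plain `tEXt` chunks of a PNG; prompts containing characters outside Latin-1 may be stored in a different chunk type that this does not read. The keyword under which a given UI stores its prompt (for instance `parameters`) is an assumption to verify against your own files.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text(data):
    """Return {keyword: text} for every tEXt chunk found in the PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    out = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            # tEXt payload is "keyword", a zero byte, then Latin-1 text.
            key, _, text = body.partition(b"\x00")
            out[key.decode("latin-1")] = text.decode("latin-1")
        if ctype == b"IEND":
            break
        pos += 12 + length
    return out
```

Usage would be along the lines of `read_png_text(open("some_image.png", "rb").read())`, where the file name is a placeholder, followed by writing whichever value you want to keep into a .txt file.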
yeah no anime models just realistic ones.
Does anyone know of SDXL finetunes which have been trained for native 1536x1536 generation, like some Illustrious models have been?
so when it comes to prompts, is there a priority on which one comes first? and how does one tell the ai there are 2 characters in the picture and how to designate an action to each one? ex, vietcong soldier and american soldier. american soldier wielding an m16 while vietcong wields ak47.
No. You just have to hope that the model has been trained enough on images + captions that have this level of detail. The more recent models like whatever is in current chatgpt and things like Flux Kontext are improved over earlier models like Flux and SD3.5. And those (Flux and SD3.5) are improved over earlier models like SDXL. For now, a prompt like your example may or may not come out correct. Although in general I would expect these models to be bad at differentiating between specific gun types.
Hi there. I've been exploring AIs recently. How does it make it less hassle? AI definitely has its catch so i'm pretty sure this might need quite some time to get used to like ComfyUI too
thanks. I also I saw that doing <character> is with <character> adds 2 characters to the picture. ofc it can't be too long. yeah I gave it a shot and it seems like guns are really difficult to generate
Hi everyone, I'm using a new RTX 5080 (Blackwell, sm_120) and I've tried to run AUTOMATIC1111 WebUI with SD.
I get errors like "no kernel image... sm_120 not supported".
Has anyone managed to get Stable Diffusion running on RTX 5080? What steps or builds did you use?
Thanks for any tips
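A "no kernel image" error of this kind generally means the installed PyTorch build was not compiled for the GPU's architecture. The comparison can be sketched as below. The helper name is made up, and it is a simplification that ignores forward-compatible `compute_*` entries; in a real environment the two inputs would come from `torch.cuda.get_device_capability()` and `torch.cuda.get_arch_list()`.

```python
def arch_supported(capability, arch_list):
    """Check whether a build ships native kernels for a GPU.

    capability: (major, minor) tuple, e.g. (12, 0) for sm_120.
    arch_list:  strings such as "sm_86" naming the architectures in the build.
    """
    major, minor = capability
    return f"sm_{major}{minor}" in arch_list
```

If this comes back false for (12, 0), the fix is on the PyTorch side (a build compiled with CUDA 12.8 or newer), not in the WebUI settings.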
Subject: Stable Diffusion compatibility with RTX 5080 (Blackwell architecture)
Hello,
I'm trying to run Stable Diffusion (SDXL/WebUI) on a new NVIDIA GeForce RTX 5080 GPU, but the current version does not yet support this architecture (sm_120 / Blackwell).
PyTorch nightly builds with CUDA 12.8+ still fail or require manual compilation, and several components (like xformers, flash-attn, or torch-sdp) are also not working.
Can you please let us know:
- Is support for RTX 5080 (sm_120) planned?
- When approximately do you expect compatibility with Stable Diffusion (esp. SDXL)?
- Are there any known workarounds or official plans to address this?
Thanks in advance for any updates. Many users with new GPUs are currently blocked from using SD.
Best regards,
Stable diffusion and most models work fine with cuda 12.8+ comfyUI works fine too.
So I am not sure if you are addressing the right people here?
hellooo, well comfyUI can be a lot at first, been there and am still there lmao. Savro's way more beginner-friendly since it's all plug-and-play, no setup or downloads and that's how it is less of a hassle for me :))
I faced the same trouble once I upgraded to my 5090. To solve this, simply do a fresh install of the dev branch; that one includes PyTorch compiled for CUDA 12.8
https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/16818
basically open your console and change directory to where you want to install A1111. then run this:
git clone --filter=blob:none -b dev https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
let it run for the first time and test that it is working fine. xformers is functional but doesn't appear on the list by default. so after running it for the first time, close it and add this to webui-user.bat:
set COMMANDLINE_ARGS= --xformers --force-enable-xformers
afterwards launch A1111 again and go to settings > stable diffusion > optimizations and you will see xformers once again in the list. select it and apply settings. From here on you have a fully functional A1111 for Blackwell
What is the best image to video tool?
https://github.com/lllyasviel/FramePack if you want easy lightweight simple image to pretty much as long video as you want, and apparently even only needs 6GB vram minimum.
Or comfyui if you want a more comprehensive image/text/video to video
do you know how to make it do a shot where it doesn't move the camera much or not at all?
trial and error, let me show you an example prompt
Slow motion video of a woman gently kissing the viewer while she slowly tilts her head to the sides with puckered lips and a feminine expression. The soft diffuse light highlights her detailed skin textures and delicate contours of her nose and face in full detail
Adapt that prompt to whatever you are trying to create and see if it improves
thanks, yes I guessed it needed some trial and error but the camera movement and changes of shots happens almost all the time, but I'll try that, thanks
yeah its a common problem
Also tell it that "camera holding position during filming", literally direct the clip like a movie director describing what will happen. Ain't like older sd models where you just do short keywords, gotta prompt it like a movie director.
tea cache speed up vid generation by a lot
thanks, the movie director approach helped, and with re.cut I was able to take out some nonsense
teacache, if you got 30 series or better, sageattention, skip blocks node, speed loras, so much that can boost it by sacrificing accuracy. Or if you're lucky enough to have 50 series from nvidia, nf4 which can speed it up even more.
What i wish existed was a TensorRT converter for wan/comfyui that splits all the components into smaller ONNX files, then once converted, packs it back up into the final .trt, which can speed up generation by up to 3x.
I am rocking a 3060 12gb will be getting 5080 super 24 gb when it is out
hello
https://www.correlation-one.com/dod-cyber-sentinel today is the deadline. only US citizens are eligible to enter
Thanks for the info, but I don't seem to understand it at all.
you can search on https://civitai.com/
have you installed A1111 before at all?
Yes. that was the first thing I did after I created the folder where the SD will be installed.
did you use GIT? all you have to for blackwell to work is to install the specific version I told you
just go to your folder (either a new one or delete the contents from the previous install) and run this in CMD/powershell
git clone --filter=blob:none -b dev https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
once you have it running, give me a heads up and I will guide you on how to activate xformers
guys what is the difference between SDXL and Pony?
SDXL is the base AI model developed by Stability AI. all currently popular models, such as Pony, Illustrious, Noob and Animagine, are simply fine-tuned (retrained to a lesser degree) versions of the very same SDXL. Pony in particular forked very early with its idiosyncrasies, so among the rest it is the one that feels most different to operate, but it is still SDXL under the hood, like the rest.
next gen illustration/anime models are already looking into DiT for their next projects but progress has turned difficult, allegedly. the next animagine is gonna be based on SD 3.5, Illustrious has already been cooking with aura flow, and who knows what else they modded under the hood which allows them to aim for natural language prompting just like NovelAI. Noob introduced for the first time (I think) a V-prediction tweak into their training. but as of right now they are still SDXL
So I have it running.
cool. is it working properly?
I don't know yet, it downloads after launch
Yeah, don't worry, it does that. The one thing you will notice is that by default it will not recognize or list xformers (the best memory optimization setting) however, it is absolutely installed and it works just fine, activating it is just a matter of adding a couple lines into a text file, don't worry we will get there when we get there
it installed and it works. Thanks
Any Brazilians who can help me? I'm having a problem where my Stable Diffusion takes 15 minutes to generate one image with CFG 5, 30 steps and a fixed seed, without hires fix, ADetailer or ControlNet
Help me generate some illustrations of Jiangshan City's intangible cultural heritage projects
thanks! I didn't know about all that! What are DiT and V-prediction? I will check out Illustrious, Noob and Animagine too. Now I understand better what a fine-tune is
The short answer is that DiT (diffusion transformer) is a newer architecture used in newer generative AI. It is very resource intensive for users, and even more so for trainers, which is why adoption has stalled in the open source community. I am told it is very difficult to research and train without the resources of a big company. V-prediction changes what the model is trained to predict: instead of the noise itself, it predicts a "velocity" that mixes noise and image. It is not a component you can hot-swap into any model; it has to be used during training for it to work as intended during generation
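To make the noise-prediction vs. V-prediction difference a bit more concrete, here is a minimal sketch. It assumes the usual variance-preserving setup where a noisy sample is `x_t = alpha * x0 + sigma * eps` with `alpha**2 + sigma**2 == 1`, and the commonly cited target `v = alpha * eps - sigma * x0`. The function names are made up for illustration, and this is not taken from any particular model's training code.

```python
import math

def noisy_sample(x0, eps, alpha, sigma):
    # Forward process: blend the clean values with noise.
    return [alpha * a + sigma * e for a, e in zip(x0, eps)]

def eps_target(x0, eps, alpha, sigma):
    # Classic (epsilon) prediction: the model is trained to output the noise.
    return list(eps)

def v_target(x0, eps, alpha, sigma):
    # V-prediction: the model is trained to output a mix of noise and image.
    return [alpha * e - sigma * a for a, e in zip(x0, eps)]

def x0_from_v(x_t, v, alpha, sigma):
    # With alpha^2 + sigma^2 = 1, the clean values can be recovered from v.
    return [alpha * x - sigma * vv for x, vv in zip(x_t, v)]

# Toy numbers, purely illustrative.
alpha, sigma = math.cos(0.3), math.sin(0.3)
x0 = [0.5, -1.0]
eps = [0.2, 0.7]
x_t = noisy_sample(x0, eps, alpha, sigma)
recovered = x0_from_v(x_t, v_target(x0, eps, alpha, sigma), alpha, sigma)
```

Since the two targets differ, a sampler has to know which one the checkpoint was trained on, which fits the point above that it can't just be swapped in afterwards.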
I got it!
@valid oar I'm making a presentation to explain what SD and ControlNet are for a Portuguese-language course that I'm recording here in Brazil, using A1111 and SD1.5 in this first module. Can I DM you and show it to get your opinion?
Sure, no problem. I don't consider myself an authority in this or anything but I'll be glad to share some thoughts
Hey will upgrading from 16GB to 32GB RAM make a difference?
I generate with illustrious model in 832x1216 with rx 6600
If my goal is to produce photos of myself or some other person (have a library of training photos data) is there any advantage of training a full model from a base model or does a Lora basically give full power at fraction of training time/size?
Also are there any new techs for achieving this?
I need to be able to produce consistent images (body and face)
I actually don't need hundred percent adherence to the original person, but consistency is more important
Second question, what is the best uncensored open source models nowadays? For photorealistic images
i think realistic vision v6
or maybe cyberrealism
i never used them locally but as far as i see they are most realistic for me
i have made a workflow for flux all in one
you can find it at https://civitai.com/models/1668904/desaign-workflow
anyone gotten dreambooth to work with 50 series yet?
I used Google Labs to generate an image. There is one image it generated that I like, and I'd like to change the rotation of the image but keep everything else. Is that possible?
So you want to change the orientation of the image from portrait to landscape but you need more content left and right? Then this is a task for outpainting.
@oblique elk It's a bottle, and I want to keep everything about the bottle but change its angle and rotation
he wants to change the composition but keep the bottle (subject) the same
Well if you look at the bottle from a different angle and rotation the subject (bottle) will change as you see another part of it or less depending on the rotation etc. Like this would be the bottle from above "o" while this "I" from the side. So you would need to change the object and the background elements. Totally not worth the work compared to generating a new one in the correct position
How do I know how to get the bottle the same when generating a new image ?
save the seed from the original might do the trick. depends on how strong your prompt was
You won't. You could try to inpaint the surroundings with the elements you need. You could use the image as a depth map or canny image input for ControlNet to keep at least the shape and dimensions of the bottle.
No idea if it'll work, but currently with git's copilot, i'm gonna attempt to make a script with tensorrt's sauce code on git to convert wan/hunyuan checkpoints to .trt for a 2-3x generation speedup
It's like with these image generators, they can generate what you type but if you get the result you want but need to do some changes. That's when it creates a whole new image and you can't get anything close to the previous image.
You can. You just gotta lock seed, then backspace until that image appears again 
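Why locking the seed works: the starting noise is derived from the seed, so the same seed with the same prompt and settings gives the same starting point. A tiny sketch of that idea, using Python's `random` module as a stand-in (an assumption for illustration; the actual UIs use their own tensor generators, and `make_noise` is a made-up name):

```python
import random

def make_noise(seed, n=4):
    # A seeded generator always produces the same sequence of values.
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

same_a = make_noise(42)
same_b = make_noise(42)   # identical to same_a
other = make_noise(43)    # a different seed gives different noise
```

Anything else that changes (prompt, sampler, steps, resolution) still changes the result, which is why you have to undo the prompt edits to get the earlier image back.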
And what you speak of is basically SD 1.5 turbo, lightning or whatever it was, and very low steps, it'll look like porridge, but you can even use it as a low fps webcam, i've tried that 
On my 3090, it can generate such images in 300 millisec or so.
Didn't know google had one. I've blocked off it's annoying LLM popup from my search results if that's what's doing it, so i don't know how that one works.
I'm using comfyui for all local generations with a library of models and loras on my server.
Which is the best service for renting GPUs for SD?
hello, yeah, LoRA often gives you most of what you'd want from a full model but with way less compute and training time, especially if your focus is on one person and you just want consistency. If you want consistent body and face outputs, good image curation and tagging matter just as much as model choice. And for your second question, there are a bunch, for example RealVisXL and Juggernaut. They did not work for me though hahahahah. I tried savro (a plug and play) and results have been good so far :))
anyone have a link to a decent img2vid (ideally wan 2.1) colab notebook they use?
dreambooth is still king
but a pain in the ass to get working
ideally dreambooth + lora will give the best possible results
Hey guys, I'm new to ComfyUI. And I'm having this problem: "'./ComfyUI/output/2025-06-11/ComfyUI_0001.txt' is a write-protected path.
Please add it to the whitelist file". Can anyone help please?
Is it better to use rundiffusion or spin up a runpod? For Lora training and generating images
Also is a MacBook pro m4 suitable for all this stuff or do I need Cuda enabled Nvidia chip?
If a LoRA has "unknown" or "other" as the base model, how do I find out what base model it's for? Would the only way be to run it through every base model and see which one doesn't give a colored static/deep-fried image?
If you are a bit more patient, most things work on your MacBook. Some audio or video models won't. Flux and SDXL, for example, work.
Yeah, figuring out the right base model when a LoRA says "unknown" can be a headache. Easiest way is to start with the usual suspects anything that's V3 is one and see what sticks without turning into deep-fried chaos. Sometimes the file size or sample images can give clues too.
Yep. And the script i made with LLM support that scans and sorts the model as well as lora database from civitai, but even there a lot of the models are unknown
As i've seen some loras have unknown or "other" there as well
I'd almost wanna make my own trained image analyzer model based on deep fried images and use non deepfried to sort them that way 
Hi, I am completely new... what should I install first to get started with Stable Diffusion? I am broke from the Midjourney paid version... heard Stable Diffusion is on GitHub...
Hey, what's your GPU?
Hi... I've got a Lenovo Legion laptop
Does the GPU matter? No idea, but it's a decent laptop... one of my friends is using lower specs than my laptop, but his images are very good... he runs Python etc., something like that... he works from that
Its RTX 3070 8gb
hey everyone
I'm looking for somewhere to ask questions about a diffusion model I'm trying to build
I need some human feedback
GPU is what drives all these hallucinating models, be it text to text, image, video or what have you.
And 3070 8GB will do fine.
So depends if you just want a simple Gui like forge webui or node based (comfyui).
@vapid dove scammer above. As we just filtered that scam in a server I staff in. Who added it as a link filter. the server ID that starts with 1294 and ends with 6045 in the first numbered after attachments/
I installed Python... GitHub... then Civitai, and got a safetensors file, DreamShaper 8, something like that... did something I forget... now 127.0.0 something like that came up
You install python 3.12, pytorch 2.7 with cuda 12.6, install cuda toolkit 12.6, and remember to add all to path when installing, install git for windows
I get this reply
RECOMMENDED SETUP (Stable + Tested):
Component: Recommended Version
Python: 3.10.6 or 3.10.11
PyTorch: 2.7.1 + CUDA 11.8
CUDA Toolkit: NOT needed (PyTorch comes prebuilt with CUDA)
Git for Windows: Yes (needed to download WebUI)
GPU: RTX 3070 is perfect
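If you go with the pasted recommendation (Python 3.10.x), a quick way to see which Python you are actually launching is a check like the one below. This is only a sketch, and it assumes 3.10 is the version you are aiming for. `is_recommended_python` is a made-up helper, not part of any WebUI.

```python
import sys

def is_recommended_python(version_info=sys.version_info):
    # True only for Python 3.10.x, the series named in the pasted reply.
    return (version_info[0], version_info[1]) == (3, 10)

if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    verdict = "matches" if is_recommended_python() else "does not match"
    print(f"Python {current} {verdict} the 3.10.x recommendation")
```

Run it with the same `python` command the WebUI launcher uses, since several Python installs can sit side by side on one machine.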
Never used midjourney, only dall-E 1 or 2 times. Since then i've used different local diffusers since september 2021
Oh... how do I upscale an image? Because with it we only get 512x512
hello
are you able to create ai videos by default on stable diffusion?
I didn't know this was a thing lol
sort of.
animatediff
but it sucks ahh compared to sora
sora has a few jailbreaks for those wanting to do some semi-erotic stuff
Hello
[NEWS] Lockheed Martin Corp. - The Most powerful arms corp in the world invest in First Quantum Currency ever made - QRL (Quantum Resistant Ledger)
With what tool?
Yep. I use wan video and hunyuan video. There's a lightweight image and iirc text to video called framepack from illyasviel which works even on a 6GB GPU.
Lol. No idea what that was. I don't use stock client on my phone, so whatever ammo discord thought was a good idea to add to scammers for instance wouldn't work on me, all I see is raw code 
I've been working as a software engineer for over 7 years, mostly focused on web and mobile. Recently, I built a time management app, so I even managed my own time pretty well! I also have experience building apps in travel, news, POS systems, and integrating with weather APIs. I'm currently looking for new opportunities. Thanks!
Hi
This is my issue
Can anyone help with which version to download and a step-by-step process... someone who did it recently?
Stable Diffusion + Karras Incident Summary (with GPU info)
- Installed Python 3.10.6 after issues with newer versions.
- Set up Stable Diffusion WebUI (v1.10.1) successfully.
- WebUI worked fine with model (dreamshaper_8.safetensors) and default samplers.
- Wanted DDIM++ / Karras samplers, so tried adding a custom sampler.py script or extension.
- Script needed xformers, so I tried installing xformers==0.0.37, but it doesn't exist for CUDA 11.8.
- Tried various .whl files from GitHub and VastAI; none worked.
- My GPU is NVIDIA RTX 3060 12GB (CUDA 11.8): compatible but still faced issues.
- Accidentally pasted multi-line shell commands; possibly corrupted environment.
- Got error: sampler_names[0] IndexError; UI failed to load due to empty sampler list.
- After multiple failed fixes, ended up formatting my PC to start clean.
Hey guys, this might appear basic as hell. I just started with sd yesterday and here is my situation. I have real images/backgrounds that i took in real life. I also have created some characters/models in stable diffusion and I would like to place them in the very real backgrounds that I took without altering them. How do I go about doing this? I have tried inpainting, imgtoimg with my background but my character never comes out quite right. (I only have faces for the characters) any help would be greatly appreciated. Thanks,
Hey, best for you is forge webui
Do you know how to generate characters over backgrounds?
You can try Inpainting
In img2img
Oh you already tried it
Do I inpaint over the background and then generate the character in there? I've tried that, but then how do you inpaint the right area where they would actually be? The face for some reason never matched the face I had in the IP-Adapter ControlNet. It was always too deformed to be able to tell anyway
You make a mask where you want to have the character, then prompt for it and play around with the denoising strength.
But I think the whole thing is too much to ask of SD and can maybe be done better in Photoshop
Installation on Forge is the same as Auto1111
The reason I'm doing sd is because I only have a characters head, not the full body. I am trying to go for ai influencer type thing but I want the backgrounds to at least be real and not all ai generated
Hmm okay, I mean you could just generate them on a fake background and use an extension like rembg to remove the background, then paste them onto the real image
Okay, I'll look into that, thank you!
Np
Many here use auto or forge ?
Curious to know... heard Forge has more limited resources than Auto... is that true?
can u help me
For the setup guides, check out the first link of the pinned messages in #tech-support
Forge is a fork of Auto1111 which adds a few extensions by default, plus native RTX 50 series support as well as Flux model support.
It also has some performance improvements for GPUs with lower vram.
Auto1111 has a better extension support but forge nearly supports the same extensions. Also Auto1111 hasn't been updated in months.
gm, showing some art made with SD by noper [@bagdelete]
feel free to share https://x.com/_lisa_gallery/status/1933192651588989002
anyone here have much experience with Invoke?
@wicked badge is as far as I know an invoke hero.
I haven't ran SD for well over a year. Last used on Automatic1111.
Is there an easy to use face swapper that's decently good these days? I have a decent amount of RAM and a 3090.
Do you have any tips for avoiding extra fingers for characters? No matter how much I specify it in the negative prompt, when my character's hand holds an object, his hand always ends up with 6 fingers instead of 5
I'll say one thing about these bots. I admire how wide of a net they are casting. Don't see this much coordination usually. Shame it's such a shit tier obvious scam that they're peddling.
copy and paste the characters into the bg image ("Photoshop") then do inpaint to improve it.
Or use Flux Kontext as soon as it comes out
whts the prblm
Thats exactly what I'm thinking to do, what is flux kontext?
i put the picture of it in tech support
the free version not out yet. It's a flux model that allows you to change images via prompt. For example, you can give it an image of your background and an image of your character and then use the prompt "insert this character into the background."
Ok, got a question. Does anyone know if there is an AI program that can generate normal maps based on a given mesh/UV? Say I have a low poly count mesh that is UV'd, is there a program yet that could read that and then generate a normal map?
get lost
don't follow that link, it's a scam.
There is no "external official discord support" or something like that
they'll jus try to get in your wallet or install ransomware
I'm interested in finding an expert to pay for some consulting. I'm using COmfyUI, ipadapter, controlnet
oh i know, lol, didnt see this till after it was deleted anyway
which one is better
Stability's API or Runway's API?
Any dev who has experience in both ?
not an expert haha, but an avid reader on Reddit. People say Stability's API gives you more control if you like to tweak things, while Runway's is easier and smoother to use. It depends on whether you want flexibility or simplicity. Some folks mix both.
So, if you're still exploring, savro.ai is also a solid middle ground for this one and playground.ai for experimenting
where to download stable diff?
Scammer
Hello, sorry to bother you, I am pretty new to stable diffusion, I followed some tuto, but it seems to not like my prompts cause I never get what I want, do you know where I could get some help?
gguf models are slower? i mean in wan, vace, fusion x etc
I want whichever one gives me better results
hello guys
Hello server
anyone able to help me figure out what was used in a certain set of images? I've not found the lora or anything
it really depends tho, for fine tweaking stability AI and for user friendly, fast stylized vids runway, for plug and play savro and playground.ai for experimenting
you guys know any good illustrious checkpoints or loras that help enforce an 2d, anime screenshot style?
or do you just rely on an artist style that is focused on that particular aesthethic?
i luv gemini
Hey everyone! I'm an AI art creator experimenting with future human and city concepts using AI tools. Looking forward to learning and sharing ideas here!
hi welcomeee!!! also looking forward to hearing urs!!
guys, do I understand correctly that to use an open source model nowadays, the model needs an encoder and a decoder?
or is the encoder/decoder not part of the model, but a tool from the interface?
is this what is called "architecture"? the way the model is trained? Or the tools of the interface?
and is the model (checkpoint) that is trained with denoising called the U-Net?
a
Lots of words in there I'm not sure you understand all of them.
Understanding how AI works can take months/years. It's definitely not something that you can master just by reading my oversimplified rambling. But here we go!
Usually encoder/decoder/VAE in the context of Stable Diffusion refers to the part of the pipeline which transforms "pixel information" into "latent information" (and back). What it means is that instead of working on an image made of pixels, you're working on an image made of "latent information". What is that? Let's see.
The core of Stable Diffusion/AI models is a "neural network", meaning it's a huge graph made of neurons connected by (roughly) billions of weights. Each neuron has the role of recognizing a specific feature, for instance: circle shapes, fur, nose, whiskers (in reality they recognize things so abstract we don't have proper words to identify them...). So if you feed all of those neurons the picture of a cat, the neurons in charge of recognizing cat features will light up much more than the other ones. And through the magic of math, the model can recognize a cat. The map of all those neurons being lit up is a description of your image in "latent information".
But how does it make a picture? Simply put, we feed it a picture made of noise (random pixels) and ask the model "hey, do you see a cat there?" The model answers "dunno, maybe there, if I squint my eyes really hard I can see a cat in the top left". Then we modify the "picture" (in latent space) to reflect those results, ask again, etc... until we have a proper picture in latent space. Then we ask the decoder to transform it into pixels.
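The "ask, adjust, ask again" loop above can be sketched as a toy. Big assumption here: the "model" in this sketch cheats and already knows the target values, whereas a real model has learned a noise estimate from training and is steered by your prompt. `toy_denoise`, the step count and the 0.2 factor are all made up for illustration.

```python
import random

def toy_denoise(target, steps=30, seed=0):
    rng = random.Random(seed)
    # Start from pure noise, like the random "picture" fed to the model.
    x = [rng.gauss(0.0, 1.0) for _ in target]
    for _ in range(steps):
        # Stand-in for the model: "the noise I think I see" is whatever
        # still differs from the target.
        predicted_noise = [xi - ti for xi, ti in zip(x, target)]
        # Remove only a fraction of it, then ask again on the next step.
        x = [xi - 0.2 * ni for xi, ni in zip(x, predicted_noise)]
    return x

target = [1.0, -0.5, 0.25]      # pretend these are the latents of "a cat"
result = toy_denoise(target)    # ends up very close to target
```

More steps get closer to the target, which is the rough intuition behind the "steps" setting, although real samplers are a lot more involved than this.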
If you're into music, think of "pixel space" and "latent space" as when you're applying a Fourier transform to a song to get its "frequency domain / space". Some operations are much easier in "FFT space"; the same goes for "latent space".
Usually "open source model" refers to model with "public source". We know the code behind the model, how to run it, how it works and we can run it ourselves to replicate the results. (But it does not necessarily mean that the data used for its training are public tho)
it's really difficult to explain U-net properly tho.....
U-net is not "one thing", it's the assembly of multiple blocks working in latent space.
If it helps, think of its job as follows;
U-net segment/split/reduce the """""resolution"""""" (in latent space) of the image, then work on it much more easily because of its low """"""resolution"""""" then upscale it back.
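A toy version of that "shrink, work, grow back" shape, on a plain list of numbers instead of a latent image. This only shows the general idea; real U-Nets use learned convolution and attention blocks and usually concatenate the skip connection rather than averaging it. All function names here are made up.

```python
def downsample(xs):
    # Average neighbouring pairs: half the length (expects an even length).
    return [(xs[i] + xs[i + 1]) / 2 for i in range(0, len(xs), 2)]

def upsample(xs):
    # Repeat every value twice: back to the original length.
    return [x for x in xs for _ in range(2)]

def toy_unet(xs, process=lambda v: v):
    skip = xs                               # keep the full-resolution copy
    low = [process(v) for v in downsample(xs)]  # cheap work at low resolution
    up = upsample(low)
    # Merge the coarse result with the detail kept in the skip connection.
    return [(s + u) / 2 for s, u in zip(skip, up)]

out = toy_unet([1.0, 3.0, 5.0, 7.0])
```

The skip connection is what lets fine detail survive the trip through the low-resolution middle.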
I'm so sorry for any AI scientist reading this....
https://www.youtube.com/watch?v=NhdzGfB1q74 This explains it way better than I could in text, but it requires some "basic" AI knowledge to understand it.
Not sure if I helped or made your headache 10 times worse.
you helped me! You broadened my understanding of some things, like neural networks
@still glacier can you give me some time to reply to you? I will watch the U-Net video and answer you. I'm working now, and I need time to reflect on everything you said!
Hi, does anyone have any tips for generating images with sunglasses that are completely opaque?
anyone here knows about inpainting?
I cant manage to get good inpainting with my pics
is there any trick to it?
It starts with a good inpainting model. Not every model that exists can be used successfully for inpainting. There are Flux inpainting models, SDXL ones, etc.
The next thing is masking, feathering, etc. to get the new elements embedded well.
but you have to use an inpainting model right?
I cant just use the lora for inpainting
They work at least a bit better. Got good results with the basic flux as well
I only want to make small corrections and small changes to images
are they trained to maintain the art style of the image, or do they have problems with less common art styles?
you might try tools like invoke or krita with the comfyui ai plugin
I wanna use an inpainting model in forge
what do people use for non realistic images? anime, cartoon, comic...
@abstract quarry what's the model everyone's using for inpainting nowadays?
@oblique elk do you have any model you like?
Most used the last days is chroma. I do also like the combination of two models. Like illustrious for the object and second pass with realvis, or juggernaut.
I see a lot of people talking about chroma, is that good?
do you have a link for those models or the base models are also baked for inpainting?
does anyone here use flux Kontext in comfyui?
If you want to do a style transfer, how do you refer to the second image to transfer the style from?
it looks like it uses an image stitch. I think the model only takes one image input
what version are you using? SD1.5, SDXL, SD3.5, Flux?
im using illustrious
sdxl basically
With Realistic Vision (SD1.5), the inpaint model doesn't create, it just removes! The normal model creates. This confused me for a long time. I don't know if it's only Realistic Vision or if it's common to all inpaint models. If it happens on SDXL, that could be it
In theory, what I understand is that inpaint models are better for removing and substituting, and normal models for creating
@ancient mauve #šļ½general-with-images message
about mask options
must be, Juggernaut is huge! I like this model, it's my favorite for SDXL, but I used SD1.5 more
but this is a normal model, not an inpaint version
where is the inpainting version
you know Civitai? it's easy to find models there
I will show you
maybe this one?
https://civitai.com/models/403361/juggernaut-xl-inpainting
but I will show you other thing
hi, Aretys, I saw the U-Net video, it is very cool! It's what I was thinking about encoder and decoder, noise and denoising... I liked when you talked about what open source is; this helped me a lot to understand better what exactly it is and what is free, and, as I said, about neural networks. The example was very good. I talked with ChatGPT and now understand better what the architecture is on SD1.5, SDXL, SD3.0, SD3.5 and Flux
can I DM you to show you one thing?
I think im gonna ditch flux and go back to Stable Diffusion.
Working with that model in WebUI is absolutely impossible
Is there any way to use OpenPose with Flux in WebUi???
Yeah, if you have ControlNet installed, just load up the OpenPose model and it should work.
I have the integrated ControlNet, but the generation doesn't follow it. The file in the model section of OpenPose is only for Stable Diffusion, and there doesn't seem to be a .pth file for Flux
hmmm.. have you tried generating the oamge separately then uploading it as a reference instead? not quite sure why that happened but maybe because flux is expecting a format that it can read?
oagme?
Is there a way to inpaint without masking and using only prompts?
Hello, please I'm new to stable diffusion. Please let me know how to install SD on my pc.
Hey, check out the first link of the pinned messages in the #tech-support channel for the install guides.
Does something good come from here?
does SwarmUI automatically update when I launch it? I haven't used it for a long time
does someone already have a Yunara (new LoL champ) LoRA?
sure.
how do i download stuff on huggingface? All i see is "clone repo" and i have no idea how to do it...
files tab, then click the arrow next to the file you want to download
or click on the files then on the download button
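If you'd rather script it than click, the download button on a file page points at a "resolve" URL, which you can build yourself. A minimal stdlib-only sketch: `hf_file_url` and `download` are made-up helper names, the repo id below is a placeholder, and gated or private repos need a login token that this sketch does not handle.

```python
from urllib.parse import quote
from urllib.request import urlretrieve

def hf_file_url(repo_id, filename, revision="main"):
    # Direct-download URL for one file in a Hugging Face repo.
    # Slashes in filename are kept so files in subfolders work too.
    return (
        f"https://huggingface.co/{repo_id}/resolve/"
        f"{quote(revision, safe='')}/{quote(filename)}"
    )

def download(repo_id, filename, dest):
    # Saves the file to dest; model files can be many GB, so expect a wait.
    urlretrieve(hf_file_url(repo_id, filename), dest)

# Placeholder names, not a real repository:
example_url = hf_file_url("some-user/some-repo", "model.safetensors")
```

Calling `download("some-user/some-repo", "model.safetensors", "model.safetensors")` with a real repo id and filename would save the file next to the script.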
thanks
i'm trying to download the Wan 2.1 I2V model
i have 10 GB ram
there's a 480P model FP8 and a FP8-scaled, which one should i pick?
Also how do i save a video in SwarmUI? I see no option to download or save anywhere
i see so many conflicting and cryptic info about all the img2video everywhere, i can't find a definitive answer (sorry i'm bad at this, never did video before, only used txt2img on a1111)
hey,
im using A111 with SD but want to get into Flux locally....
Which is the best way to start? ComfyUI or Forge? I m looking for best support also for LORAs.
I have a GeForce RTX 2060 Super
I also dont like switching between stuff often, so I would prefer the environment that is best for long term use
is it normal that img2vid takes way longer that text to vid with wan 2.1?
I believe Comfy is more complete, but you can try Forge for now if you already understand A1111
has anyone already heard about Mellon UI?
yeah, maybe a good idea.
Mellon UI is another alternative?
I don't know, I saw something about it but I don't understand if it's a future project
looks like ComfyUI
forge was a good decision, really easier than I thought
thx
@vapid dove support scam
Do not join that server, it's just going to make you install malware or something else
the only project i know of that you might be referring to is Cubiq's Mellon project. where'd you see it
Hey everyone,
I recently used AI to generate some weird futuristic designs, imagining what humans might look like if they lived underwater.
The result is strange, otherworldly... but also kind of beautiful.
I made a short 15-sec YouTube video of the result here:
https://youtube.com/shorts/WLOVT8nHYT4?si=arc-d4cl3H7W5zPF
What do you guys think?
Would you survive in a world where humans evolved to live underwater? Or what would you change in this design?
Any creative thoughts or weird ideas are welcome! I might generate new ones based on your comments.
Any reliable website where u can hire people to train Lora for you?
Anyone have Open WebUI installed locally? I got Ollama and a couple of other models working on it, and I heard there's a way to integrate Stable Diffusion into it for images inside chats. Not sure how to go about it though. Thanks for any help
You will need to have ComfyUI or Auto1111 running to generate images.
It will get busy on your host, as a language model plus a model like Flux would fill up the VRAM very, very fast.
Hello, I just purchased Stability AI credits and I'm looking to test out their video API.
Is there a place I can request access to it?
Also, hello everyone.
I know, I block that. Thanks!
on L3, I thought it was from Matt3o, he was presenting
can you talk more about it?
they are the same person, I just saw his github
guys what are the resolutions to use on forge for photos?
Hello
Resolution is not dependent on the user interface but the models you use. SD1.5 Models base resolution 512x512 and sdxl, flux, ... 1024x1024. Landscape and Portrait variant work too (for example 400x600 or 800x1196,...)
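A small helper for picking landscape/portrait sizes around a model's base resolution. It is a sketch under assumptions: it keeps roughly the same pixel count as base x base and rounds each side down to a multiple of 64, which is a common convention rather than a hard rule of any UI. `fit_resolution` is a made-up name.

```python
import math

def fit_resolution(base, aspect_w, aspect_h, multiple=64):
    # Keep about base*base pixels while matching the requested aspect ratio.
    area = base * base
    width = math.sqrt(area * aspect_w / aspect_h)
    height = math.sqrt(area * aspect_h / aspect_w)

    def snap(value):
        # Round down to the nearest multiple, never below one multiple.
        return max(multiple, int(value // multiple) * multiple)

    return snap(width), snap(height)

sdxl_portrait = fit_resolution(1024, 2, 3)   # → (832, 1216)
sd15_square = fit_resolution(512, 1, 1)      # → (512, 512)
```

The SDXL portrait result is the same 832x1216 size mentioned earlier in the chat for Illustrious.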
ok,thx
Yeah. Matteo is cubiq. That's what you'll find his GitHub under, and what everyone knows him as. He's written most of the Comfy nodes they use in most projects
He's tired of the issues Comfy has and is writing his own interface
genius*, I hope he finishes his project! S2
So does everyone else. He's got a channel for updates for it on L3
interesting, I will take a better look at that! I will pay attention
Call for Participants: Help Shape the Future of AI!
My name is Elie and I'm a researcher at Stripe Partners, a London-based research agency. We're preparing a study on how people use agentic AI tools in their personal lives: tools like AI agents, operators, browser extensions, or chatbots that can complete tasks for you, such as planning a trip, organizing your calendar, or building a website. We want to understand how people are using this new technology, what they like about it, and any challenges they have faced trying to make it useful.
We're looking for participants who use these kinds of tools outside of work to take part in a paid research study. This will include:
A short digital diary over several days
A focus group with others exploring similar tools
If you're interested in how AI fits into everyday life, and you're using it to get things done, we'd love to hear from you!
wow, that's funny, I'm so interesting
What model should i use for this? https://civitai.com/models/1585622/causvid-accvid-lora-massive-speed-up-for-wan21-made-by-kijai?modelVersionId=1909719
I'm a super veteran for image gen and ESPECIALLY text gen, but I'm so new to video gen. I've been using text and image generation locally since basically the day they both started, and I've seen the growth trajectory of both technologies, and then I look at video generation, and there is just so much competition, and rapid improvement. It boggles my mind and overwhelms the crap out of me!
I don't even know where to begin
I've been using frame pack for like 2 months, but I took a big break in the middle. I just updated it and started using it again and I'm getting some pretty cool results, but I know that the results that I'm getting are nowhere near what you can get now with newer models, but I just cannot for the life of me figure out where the hell to begin
Does anybody have any suggestions or resources that they can point me towards so I can try and dip my toes into more advanced local video generation?
Hey folks!
Has anyone tested Stable Diffusion (especially SDXL or video models) on the new AMD Ryzen AI Max+ 395?
I'm particularly curious about:
• Performance when generating video (e.g. using FLUX)
• Compatibility with ComfyUI, FLUX, and video workflows
• Any issues or limitations with drivers / acceleration on AMD?
I know it's a very new chip, but if anyone has hands-on experience or even rough benchmarks, please share!
Would love to hear your thoughts before investing.
Thanks in advance!
Hey im trying to use the hassakuXLpony model in python with the diffusers library and i want to increase the token limit for clip so i can pass more tokens, how can i do this?
does forge or reforge have controlnet built into it?
in forge, you can add the controlnet
i already have a1111...do you know where i can get the controlnets for sdxl?
find github
Anyone wanna tell me how I can use my Stability API key with credits to generate videos from my discord bot?
I would really appreciate it.
I can get the image generation figured out, but not the video.
Hey everyone,
I recently used AI to generate some weird futuristic designs, imagining what humans might look like if they lived underwater.
The result is strange, otherworldly... but also kind of beautiful.
I made a short 15-sec YouTube video of the result here: https://youtube.com/shorts/5JCtGOwgixU?si=wS0IoUL-nHzSjr19
What do you guys think?
Would you survive in a world where humans evolved to live underwater? Or what would you change in this design?
Any creative thoughts or weird ideas are welcome! I might generate new ones based on your comments.
Hi all!
If you work with AI or large language models, I've built a free open-source toolkit (WFGY) that can instantly boost reasoning accuracy and stability, no extra setup required.
Details are in my profile/bio if you're curious. Feel free to check it out!
"NVidia Cosmos Predict2! New txt2img model at 2B and 14B!"
I saw this on reddit
It's correct I assume that the only open models now are SD, Flux and now Nvidia?
good link, I never found it on Civit AI XD
I like to use this one, have ControlNet for sd1.5 and sdxl
https://github.com/Mikubill/sd-webui-controlnet/wiki/Model-download
how to generate text to video
my A1111 randomly screwed itself up...
i didn't run git pull myself, but now infinite image browser and pretty much nothing else works :(
what could have happened?
does it auto update?
Maybe a browser adblocker blocks it? If not come to #š¤ļ½tech-support and share a cmd log
does anyone know if i can use a111 with my 5060 ti or not
why is it showing me thats it not compatible with pytorch
hello guys, does anyone know if there are significant differences between 32GB and 24GB of VRAM in image generation? maybe in video/gif generation too
Hello, everyone!
I'm an AI Full-Stack Developer with over 8 years of experience in the software industry.
Services I Offer:
Automation: I specialize in automating tasks using tools like n8n, Zapier, and Make.com.
NLP: I handle advanced NLP tasks with models such as GPT-4.5, GPT-4o, Claude 3-7 Sonnet, Llama-4, Gemini2.5, Mistral, and Mixtral.
Model Deployment: I assist with the seamless deployment of machine learning models across various platforms.
TTS / STT: I implement both TTS and STT solutions for interactive and conversational AI experiences.
AI Agents & Chatbots: I develop custom AI agents, Agentic AI, chatbots, and VoiceFlow applications for diverse business needs.
Portfolio: Check out my portfolio in my profile
If you have an innovative project idea, feel free to reach out. Let's bring your vision to life!
Thank you.
which ComfyUI Sampler is the equivalent to the Euler A Sampler on Forge?
hi, can someone help me? i generated a good picture but it has bad eyes. does someone have a guide or prompt, or can someone help me generate good anime eyes USING INPAINT on stable diff? thank you for answering!
You can use ADetailer to give specific features more detail (for example face, eyes, hands...)
Guide for installation and usage: https://youtu.be/Gxj_bYoZ4dg?si=O_bwyt9_ui8z577Q
If you want to download a custom ADetailer model, you can find them on Civitai.
For eyes I recommend the following: https://civitai.com/models/150925/eyes-detection-adetailer
Don't get confused by the words "finder" or "detection"; ADetailer simply locates the specific features and adds details to them.
Also, if you're wondering where to put your downloaded file so you can select it in Stable Diffusion under ADetailer:
...\stable-diffusion-webui-forge\models\adetailer
.
Anyone have a ComfyUI Workflow i can use for Character Illustrations?
I used Forge up to this point, there i used some LoRAs, ADetailer and sometimes i used ControlNet.
I've tried just drag and dropping a Forge generated image into ComfyUI to "recreate" the workflow. A Workflow appeared, but once i started generating, the image was just random noise...
on my whatsapp volunteer group, a girl said that ai is about screwing over the little guy. I told her ai is about control and if you can't beat ai, join them. Classic simps jumped in with "that's a defeatist attitude". Another had a "you have to sacrifice yourself so others can have a good life" attitude.
Hey folks!
Has anyone tested Stable Diffusion (especially SDXL or video models) on the new AMD Ryzen AI Max+ 395?
I'm particularly curious about:
• Performance when generating video (e.g. using FLUX)
• Compatibility with ComfyUI, FLUX, and video workflows
• Any issues or limitations with drivers / acceleration on AMD?
I know it's a very new chip, but if anyone has hands-on experience or even rough benchmarks, please share!
Would love to hear your thoughts before investing.
Thanks in advance!
I downloaded a detection model for adetailer but I have no idea where to put it so I can actually use it
Bit of a noob question: I want to take image A e.g a landscape, and reimagine it in a different time. What would my best bet be to produce that? Control net is good but a bit too restrictive I think
something like what chatgpt will do, where it'll draw from the image when creating the new image, but it isn't restricted to the depth map
Hi guys, I'm a beginner who wants to generate celebrity faces for my new yt shorts channel. I own a pc with gtx 1060 6gb and 16gb ram. I'm seeing if there are solutions where the model can be run locally. Where do I get started?
hey so I wanted to create my first lora and I was wondering if any of you in here have any recommended parameters for training. I have read a lot of pages but each one says something different, like the recommended number of steps and epochs. My lora would be based on an artist's style and so it's supposed to have more steps, but I can't decide what numbers to use. Any help would be appreciated
You have to manually switch the pytorch version;
try this:
Go into the stable-diffusion-webui folder
Open a cmd prompt in that folder by clicking into the address bar, typing cmd and hitting enter.
Then copy and paste these three commands one by one:
venv\scripts\activate
pip uninstall torch torchvision torchaudio -y
pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128
Then relaunch the webui-user.bat
ever heard of pinokio.computer?
Hi, everyone. I'm a web developer. I find someone who can lend me a real account. I'll pay you 20% of my salary. If you want this, please contact me.
A Hollywood friend of mine is messing with stable diffusion and green screen for high level production. He made this concept piece to hopefully pitch to the big wigs. I am just showing my friend who is wildly unappreciated some love. Thanks for your patience, guys!
hmm..
Yo guys, My cousin is looking for a EU Boyfriend
Hello there
Hello mate
I'm new here, I was referred to come here to learn how to set up and configure SD to generate on my machine, but I can't find much on the channels.
That's cool. Might wanna join a dating discord for that
Tech support mostly for issues you have.
But for easiest start out, forge webui which has a neat GUI with sliders and buttons.
Or if you're ok with nodes and dragging "cables" between each "next step", there's comfyui which is my goto.
hey, check the first link of the pinned messages in the #tech-support channel. The setup guides are there.
@fervent thunder 1 crucial point to start out with from the pinned messages:
Always check the git repos for which python, pytorch and cuda versions each SD generation tool requires. And install the cuda toolkit for each version of cuda you use. And make sure they are compatible with each other. That's the one point i struggled with most when i started out :P
The biggest annoyance is when i need to have several comfyui venvs, as some nodes are too outdated, so i can't use the latest stuff on those lol.
Hi! I got a quick question, is there an open source version of Flux Kontext that I can use locally for free?
Thanks! I'll keep it in mind. I'm new to this and all the vocabulary still remains fuzzy with no clear target in my mind, but it gets better with time.
Thanks, I'll check this!!!
Yep.
For comfyui, for basic SD and SDXL, all you need for simple nodes are:
Load checkpoint > Text encoders (one for positive, one for negative) > KSampler that generates the image > VAE decode (that converts the denoised latent into an image) > save image (to write it to a file)
Plenty of workflows online that you can just drag and drop into the webui to easily recreate images
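For anyone who prefers seeing that chain as data, here is a rough sketch of the same graph in the API-style JSON layout ComfyUI exports. The node class names and input names are as I recall them from the default workflow, so check them against your own install. The checkpoint filename, prompts and sampler settings are hypothetical placeholders, and the `EmptyLatentImage` node is an addition the message above doesn't mention (the sampler needs a starting latent).

```python
import json

def build_basic_workflow(ckpt_name, positive, negative,
                         width=1024, height=1024, seed=0):
    """Return a node graph: node id -> {class_type, inputs}.

    A link to another node is written as [source_node_id, output_index].
    All concrete values here are illustrative assumptions.
    """
    return {
        # Load checkpoint: outputs are MODEL (0), CLIP (1), VAE (2)
        "1": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": ckpt_name}},
        # Text encoders: one for the positive prompt, one for the negative
        "2": {"class_type": "CLIPTextEncode",
              "inputs": {"text": positive, "clip": ["1", 1]}},
        "3": {"class_type": "CLIPTextEncode",
              "inputs": {"text": negative, "clip": ["1", 1]}},
        # Starting latent (not listed in the chat message, but required)
        "4": {"class_type": "EmptyLatentImage",
              "inputs": {"width": width, "height": height, "batch_size": 1}},
        # KSampler: does the actual denoising
        "5": {"class_type": "KSampler",
              "inputs": {"model": ["1", 0], "positive": ["2", 0],
                         "negative": ["3", 0], "latent_image": ["4", 0],
                         "seed": seed, "steps": 20, "cfg": 7.0,
                         "sampler_name": "euler", "scheduler": "normal",
                         "denoise": 1.0}},
        # VAE decode: latent -> pixel image
        "6": {"class_type": "VAEDecode",
              "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
        # Save image: writes the result to a file
        "7": {"class_type": "SaveImage",
              "inputs": {"images": ["6", 0], "filename_prefix": "basic"}},
    }

workflow = build_basic_workflow("my_model.safetensors", "a lighthouse at dusk", "blurry")
print(json.dumps(workflow, indent=2))
```

The point is only to show how each step feeds the next; in practice you would build this in the UI and export it rather than type it by hand.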
Most likely it's still "too early" or beta of it, and if they do like the rest, it will be available publicly once it's ready for the masses.
Ok thanks for the info!
Hey everyone! I'm Blackemarlin, big fan of cinematic AI art and post-apocalyptic vibes. I mostly use Sora + Photoshop and love storytelling through images. Looking forward to seeing your work and sharing ideas!
The official Flux Kontext dev is not released yet, but I am pretty sure it will be released soon. HiDream has a model called E1 which is able to accept prompts for image editing. Flux Redux can also help with some of the tasks you would do with Kontext.
Yo guys, is ComfyUI better for realism or auto1111, or is there any difference?
The model, prompt, LoRAs etc. determine the result far more than the front end used. So no, not really a difference.
no, but I mean comfy can use loras better
this is what I've watched on YT

Hello
I have a question. I want to test stable-diffusion.cpp with a small model
which of these works
- sd3_medium_incl_clips_t5xxlfp16.safetensors
- sd3_medium.safetensors
?
would sd3_medium work with prompts?
hey guys im new to stable diffusion. Any advice on getting to to not generate people with multiple limbs or deformed bodies?
me
does anyone know how those video generations are made? ive only used stable diffusion but would like to try it one day. im looking to upgrade my pc rn so i wanna know what specs is needed for those video generations. is it just vram like SD ?
Hey, use a resolution that is supported by the model.
For example 1.5 models are trained on 512x512
While sdxl based models are trained on 1024x1024.
If you go extremely over the trained resolutions it will start to deform and clone stuff
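A tiny sanity check along those lines, as a sketch only. The 512 and 1024 native sizes come from the messages above; the "more than 2x the native pixel count is risky" cutoff is my own assumption for illustration, not a documented limit, and the function names are made up.

```python
# Native training side length per model family (from the chat above).
NATIVE_SIDE = {"sd15": 512, "sdxl": 1024}

def pixel_ratio(family, width, height):
    """How many times more pixels the request has than the native square."""
    native = NATIVE_SIDE[family]
    return (width * height) / (native * native)

def is_risky(family, width, height, max_ratio=2.0):
    """True if the request is far above native size.

    max_ratio=2.0 is an arbitrary illustrative threshold (assumption).
    """
    return pixel_ratio(family, width, height) > max_ratio

print(is_risky("sd15", 1024, 1024))  # 4x the native pixels for a 1.5 model
print(is_risky("sdxl", 1024, 1024))  # exactly native for SDXL
```

If you need a bigger image than the model was trained on, generating near native size and upscaling afterwards is the usual way around the deformed/cloned limbs problem.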
Hello!
I need help activating the free trial version of Google AI Premium. Due to regional payment restrictions, I can't complete the required verification step myself. If someone can assist me with this, I would be very grateful! Please DM me if you can help. Thank you!
Hey, can't help, but don't trust support scammers who will invite you to shady crypto/nft discord servers
Okay, thanks bro!
how to use the bot?
@sinful seal Make sure your output resolution is not exceeding your model's resolution. Controlnet canny can help maintain a pose or composition from an image. Revisit your prompt and or add negative prompts.
hi
Hello mate
Is there a way to save your previous prompts? Or at least access them once you've closed stable diffusion?
You can access the last used prompt and settings when you click the blue/white arrow below the Generate button
Got it, thank you
sure, let me know specifically what youre struggling with
Has anyone tried to use IP Adapter or Reactor with WAN yet? If this is possible, this would be huge
VACE/WAN is already insane at replicating subjects with just a reference image. Using IP Adapter or Reactor alongside would be perfection
Hello!
Hello
does anyone know of a Wan VACE Video-to-Video workflow download anywhere? i'm not familiar with comfyui and i don't wanna spend hours trying to figure out how to make this work, and i can't find it anywhere online
is there anyone who can solve my error in comfyui
What should I use for video generation, WAN, LTX or FramePack?
So far, I got WAN to run in comfy. LTX is just giving me errors most of the time (also in comfy). FramePack works fine but feels kinda limited in its UI-options and mostly produces completely still backgrounds (in non-realistic pictures).
I'm trying to create a character and once I describe everything and have it generated, the result is completely different from what I wrote. Where am I making a mistake?
It's hard to say, without seeing the prompt and knowing what model you're prompting against.
and the tools / settings used
Hi guys, I'm back at ai image generation, and I want to make product images. I have my products and I need to keep the product image and put it in different scenarios.
Is there some stable diffusion GUI like NMKD, but a "2025" version, where I can use controlnet?
Thanks in advance
Hi, I'm trying Stable Diffusion for the first time and downloaded SDXL 1.0 as the base model. But I have a problem rendering things without an "anime/manga/illustration" style. It will not generate things in a photorealistic style. Can someone help me fix that issue? I didn't add a LoRA or refiner.
just use prompts like "35mm photography" or "digital camera photo". However, SDXL base is a weak model. You should use models specialized on photography
for sdxl these are models like realviz
or use flux. But flux base also has some kind of plastic look and needs custom loras for perfect photorealism
i tried on 512x512. 1024x1024 is better, as i read here in some channels. I'm creating some pictures atm, but it still takes a long time because i work in "only cpu" mode (Intel Mac). But i think this way with 1024x1024 is much better. Thank you for your help and sorry for my broken english. Thanks a lot.
99% of bugs sitting in front of the system...
Hi all! What's a good tool chain to generate a variety of consistent images for games, sprite sheets? I haven't played with stable diffusion in a year. I think I'm using the comfyui interface. Based on my memory of what I was experimenting with, I can't see a path to solving my question, but maybe someone more experienced has a suggestion.
Does anyone know whats the best tool for animating an anime style SD image ? Like if I have a character and I want to make him move. From what Ive heard, ppl say Luma is good
āļø
Hi
Hello
yeah, luma's good for cinematic style. there's also animatediff, it's pretty good when what you're aiming for is the anime movement itself
Hello!
Do you guys know if there's any comfyui workflows, or just a git repo with code designed to create lora preview images? As lora manager fetches what's available of images from civitai, but a lot of loras fetched elsewhere has no images.
As i tried to make a automated script, but it doesn't fully give the output i want it to have.
@pale wedge thanks dood
Hello!
anyone know a good tutorial to makes loras? Not necessarily character loras, art style loras would be nice
Hello!!
To train a GLora, you'll need Jelo's fork of Derrian's trainer
This will break a library, to fix it go to the trainer's location and run these commands:
.\backend\sd_scripts\venv\Scripts\Activate
pip uninstall numpy
pip install numpy==1.26.4
Starter Settings
Gradient Checkpointing: True
Gradient Accumulation: 4
Batch Size: 4
Number of Repeats: 1
Max Training Time: (Epochs) 50
LoRA Type: GLoRA
Network Dim/Alpha: 18/18
Unet Learning Rate: 2e-5
TE Learning Rate: 1e-7
Optional Args: "weight_decay=0.07", "betas=[0.99, 0.999, 0.99995]"
https://github.com/Jelosus2/LoRA_Easy_Training_Scripts
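To get a feel for what those starter settings mean in terms of training length, here is some back-of-the-envelope arithmetic. This is a sketch under assumptions: the 100-image dataset is hypothetical, the function names are made up, and trainers differ in how they count steps (optimizer updates vs. micro-batches, whether a partial last batch is kept), so treat the numbers as rough.

```python
import math

def steps_per_epoch(num_images, repeats, batch_size, grad_accum):
    """Optimizer updates per epoch, counting a partial last batch as a step."""
    effective_batch = batch_size * grad_accum  # 4 * 4 = 16 with the settings above
    return math.ceil(num_images * repeats / effective_batch)

def total_steps(num_images, repeats, batch_size, grad_accum, epochs):
    return steps_per_epoch(num_images, repeats, batch_size, grad_accum) * epochs

# Starter settings from above, with a hypothetical dataset of 100 images.
per_epoch = steps_per_epoch(num_images=100, repeats=1, batch_size=4, grad_accum=4)
overall = total_steps(num_images=100, repeats=1, batch_size=4, grad_accum=4, epochs=50)
print(per_epoch, overall)  # 7 350
```

So with batch size 4 and gradient accumulation 4, each weight update sees 16 images, which is part of why the epoch count looks high compared to guides that assume batch size 1.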
aint gonna get better than that
Nice to meet you, all friends. I am here to help you.
@steady parrot still need?
yea
you can dm me, what kind of picture, and ur img2img settings
check dm
I would like to ask you how to achieve the effect of stable diffusion patching pictures or eliminating parts of pictures for appropriate filling.
hi,
i have a RTX 2060 Super with 8GB Vram and while this is ok to run, I wonder if I can, for a couple hundred bucks, aiming 500 EUR max, increase the Power of my PC for Local (Forge Based) AI generation. I saw there are some weird cards that basically pack VRAM on top, but I m not familiar with that kind of stuff.
TLDR can I somehow buy a dedicated card ON TOP of my RTX 2060 Super that helps with local AI?
I want to primarily be able to load big models that require 16, 32 or 64 GB
A 5060ti has 16gb of vram.
You can only use one GPU per webui (one image/video or task).
But multiple GPUs can run at the same time to generate multiple images. (But no way to merge their vram etc).
And then you can sell the 2060 or use it for streaming or upscaling games. (Lossless scaling)
np š
make sure to pick one with 16gb vram and not a 8gb version.
also if your case has the space go for a 3 fan design
thx, will check that out.
haven't been here for a while here and would like to know what's the easiest to install comfyui i2v generator/workflow that has loras and all of that stuff? thanks
Is there an ai service that you can make text to speech but change many aspects of the voice; for a character voice over ?
I am waiting for 5080 ti/super with 24gb vram
What exact tech do companies that take like 10 selfies of you and generate professional headshots use?
Do they have some stable diffusion based pipeline they optimized, or do they have their own foundational models?
who knows. But it's so easy to do that I guess they just use Stable Diffusion or Flux or similar models
I tried creating a Lora with stable diffusion and it was pretty awful
i guess i just suck at it?
Flux is really good for that. But SDXL is decent, too
how many photos are optimal? Also if i managed to get some ai generated photos that look nice and look like me, can i include those as training data too? why not, right?
however, you can also use FaceID or IPAdapter and stuff like that
What are those? tools to just replace face?
but i'd like for my body shape to be preserved too, if possible
I'll look into them too though
they act a bit like controlnets, but different. You give a face photo, they transfer the facial features into the character in your image
of course, for best results you should train a lora
my discord profile photo is a Flux generated image lol It's so fucking realistic, every single hair of my beard is correct
yes. Flux gives best results, in particular for photorealism
how many photos did you use? what platform did you use? did you spin up some machines on runpod or something?
my own machine, a lot of photos (probably 100...) and a lot of time X_x
ohhh
but you don't have to
wow im jealous
10 images are enough
i don't have a good machine
in particular if they are high quality
my own photos were all a bit shitty cause my smartphone is so old and the camera is not very good
what constitutes high quality? can i just do 10 selfies?
Did you follow any lora training guide? or do you already know the best settings and stuff
and what exact model did you use? flux base?
yes, Flux base
i tried using runpod as well as rundiffusion for training a lora with sdxl i believe, and i just hate working with cloud machines :/
lora rank 24
I think learning rate around 1e-4 to 5e-4 depending on how much time you have
I used SimpleTuner, but other tools are probably also okay
there are many xD
kohya probably
it seems a bit like automatic 1111
yeah that one
i didn't like it too much, too many settings
i wished there was something way simpler
I used kohya a lot in the past. It took a while until they had Flux support, thats why I switched to SimpleTuner
I see. I'll def take a look at simpletuner thanks
its not an easy to use tool xD More complicated than kohya
no, never do that xD
yeah. I prefer local ones, though xD Otherwise all my images are on ChatGpt
Gemma and Mistral are quite good. I would use Gemma if you have enough vram
gosh.. i think i need to get a machine with graphics card
yeah, so annoying that all these old gpus are still so expensive
if i use my friend's macbook m4 with 24 ram can i run this stuff? or do i need nvidia CUDA?
i want to use cloud machines but they are SOOOO annoying to use
you can probably, it will be just super slow I guess
Gemma quantized runs well on 24gb vram. You might even quantize it down to use it with 16gb vram, but thats all possible
(not all possible, but all that makes sense)
But ultimately the photos should be captioned right? it's not a good idea to train with just the main keyword?
hm, my feeling is that captions are better in the long run
when training only on a keyword, the model might later have problems if you have a long prompt
but you can train only on a keyword and it will work
it will just get worse results if you use it later on a very long prompt
Ah, i see
btw if i have like a 5 year old desktop but with decent cpu and ram
and i upgrade the graphics card
will it be sufficient for this stuff?
or will there be a bottleneck
it's probably more like 10 years old but its still like an i7 cpu
with like 8 gb ram
uh no, you need more ram and a good graphics card
the model should fit into your normal ram
yeah, its probably much cheaper than getting new hardware
cool cool. I think i have a much better idea of where to go. Thanks a ton for your advice. I sent a friend request in case i have more questions later. Please accept
Hello
can someone suggest a site where I can rent a pc to use my architectural programs and play games? My pc is a potato
post: young girl anime showing tidies and vagena
civitai: looks good to me
me: semi-2d girl cycling bycycle with all skin covered
civitai: realistic image of minor is not allowed, you dumbb-ass pedd
anyone know any things for voice generation where as I can change many options of the voice for text to speech ?
Always fun when you have 100's of loras without pictures, and you have to generate previews for every single one for lora manager 
Hi there,is there any sd prompt generator , free or paid one , not prompthero and online based, i heard there are many like andrewongai etc......
Hey! I'm looking for a skilled Veo3 user to collaborate on an IG video project (Not freelance work, long-term partner setup.) DM if interested.
Hello everyone!
I am an AI & Generative Media Expert | Computer Vision | VEO3 | AI Image/Video/3D Generation
I can help you.
please DM!
AI & Generative Media Expert | Computer Vision | VEO3 | AI Image/Video/3D Generation
Hey creators and innovators - I'm your partner for AI-driven image, video, and 3D generation, computer vision, and custom machine learning solutions.
What I Do:
• Generative AI - AI-powered image, video, audio, and 3D generation using VEO3, Stable Diffusion, ComfyUI, and Hugging Face
• Computer Vision - Object detection, pose estimation, and video analysis with YOLO, OpenCV, TensorFlow, and PyTorch
• AI Model Development - Fine-tuning, training, and deployment of custom models for vision, media, and generative tasks
• AI Video Generation - Leveraging VEO3 and diffusion models for cinematic, high-fidelity AI videos
• Automation & APIs - Backend pipelines, API development, and full-stack AI-powered tools
Highlights from My Work:
• quickpose.ai - AI-powered real-time pose detection for fitness and sports
• www.hellohistory.ai - AI chat app for conversations with historical figures using LLMs
• kyzo.ai - Computer vision system for movement tracking and analysis
What You Get:
• Fast execution
• High-performance AI pipelines
• Creative and production-grade generative models
• Transparent communication and daily updates
Let's create next-generation AI tools together. DM me anytime!
Hi, does anyone have a HeyGen AI membership, or know how to do AI voice dubbing? Contact me
Hello
For training LoRa's on people, is it best to have simple selfies / headshots, or is it better to have a variety of photos?
hello hi
@abstract quarry should i run preprocessing on the training images? should i resize them? should i crop them to show mostly face? should i upscale them?
Hello guys, I'm new here and trying to understand everything. I've heard Kohya is the best LoRa trainer in term of quality and professionalism, is that true ? (better than Flexgym but harder to use ?)
Also, what tool you use then to generate the images with the LoRa trained with Kohya ?
AUTOMATIC1111 Web UI
ComfyUI
InvokeAI
Which one of these is the best according to you ?
Thanks a lot
@raw hemlock Hey, im kind of new to and trying to learn
I had SimpleTuner recommended as lora trainer
you can take a look too
and i think ComfyUi is generally considered best for generating
hi
hi anyone on?
Hello mate
@spare flicker Can u help me
What's your issue mate?
Error: ''NoneType' object has no attribute 'sd_checkpoint_info''. Before reporting, please check your schedules/ init values. Full error message is in your terminal/ cli.
whats this issue
pls help me someone
We need more infos. Which webui do you use?
like deform
control net
i want to use like image to video
u help me na pls
i cant find a fix of this
@warm junco
Can you provide a full cmd log in #tech-support ?
ok
Need someone who can merge me a model with 8 Loras.
I can't do it because my SD only runs on CPU. (MacBook Pro 2019.)
And now I'm not able to merge this.
So, if we find someone, I'll pay them some coffees!
@stone tangle you?
don't upscale them. Downscaling is okay, upscaling will decrease quality. You want to have a good mix of training images with some showing face as portrait and some show upper body or full body. Face is most important, though. You can crop an image to only show your face, but this only makes sense if the image is super high resolution, otherwise, the face will be too lowres. Making a good portrait photo is better, probably.
Flux wants your image dimensions divisible by 16 pixels.
Most tools automatically crop your photos such that they have the right resolution. You have to be careful though, as sometimes they do a lot of "preprocessing" that rather damages your data
like most tools come from a time where you HAD TO train on very specific resolutions and so they expect you to provide the resolutions as parameters somehow
for flux, you can train on any resolution you want, so just make sure that your training images are not accidentally cropped or resized in a weird way by the training tool you use
you might also add some low res photos (e.g. 512x512, 768x768) to your train data to speed up training. Just downscale your highres photos (1 mpx) down to 1/4 mpx
one more thing: add group photos to your training data, i.e. photos where you and 1-2 other people are clearly visible. This is important for the model to learn that not everybody is you
hello
Oh can you really train on arbitrary resolution photos as long as they're divisible by 16? all the online guides seem to say like 512x512, 768, 1024 etc. Can you go super arbitrary, like (just as an example) 1040 x 528?
And if you add group photos, how should they be captioned?
yes, flux can
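If you'd rather pre-check sizes yourself than trust a trainer's auto-crop, something like this works as a sketch. The divisible-by-16 rule and the "quarter megapixel" downscale idea come from the messages above; the function names and the choice to round down (rather than pad or round to nearest) are my own assumptions.

```python
def snap_to_multiple(value, multiple=16):
    """Round down to the nearest multiple, but never below one multiple."""
    return max(multiple, (value // multiple) * multiple)

def training_size(width, height, multiple=16):
    """Largest size not exceeding (width, height) with both sides divisible."""
    return snap_to_multiple(width, multiple), snap_to_multiple(height, multiple)

def quarter_pixels_size(width, height, multiple=16):
    """Halve each side (1/4 of the pixels), then snap to the multiple."""
    return training_size(width // 2, height // 2, multiple)

print(training_size(1040, 528))         # (1040, 528): already fine as-is
print(training_size(1050, 530))         # (1040, 528): trimmed slightly
print(quarter_pixels_size(1024, 1024))  # (512, 512)
```

So the 1040 x 528 example from the question passes untouched, since 1040 = 65 x 16 and 528 = 33 x 16.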
just caption them as you want, e.g. with "photo showing [trigger word] and another man with black hair" or "photo showing [trigger word] on the left and a woman on the right"
it's just important that the method has some training images where it sees your face together with other faces
Ah I understand. but it's also interesting, first time hearing something like this. Have you done comparisons of LoRAs trained on just solo pictures and LoRAs trained on solo and group pictures and noticed a difference?
yes. Without group photos it will transform any face into your face
its even enough to add collage photos
has anyone managed to make chroma work in ForgeUI yet, now that it's been added?
with collage photos I mean: just add two portrait photos next to each other into one image
the important thing is that the model learns to recognize your face from other faces
Ahh super interesting. I'll try out that technique
is there a starboard here still?
Flux Kontext is out!
Is Flux Kontext better than dev at text to image
or is it only better at image and text editing?
Comfyui. Forge has some extensions that do a better job, like inpaint, outpaint, and controlnet with stick figure. Comfyui's is annoying to use in that regard. So they each have their strengths and weaknesses.
Thanks I will check about Forge. And what is the best LoRa trainer according to you ?
A lot of different ones will serve different base models. AI-toolkit i've found nicest for flux, diffusion-pipe for wan/hunyuan iirc, but sadly sdxl and the like i've never managed to make a successful lora with.
Luca Taco has several very good trainers for different models in his replicate space
Do you use the discord for rendering or some kind of app?
Hi, I have a 4080 super and got a1111 working w some random model. What's the best model I could reliably run? For text to image? Or if possible text to video?
does anyone know if you can use blender or any vfx tool to polish anime videos? my plan is to create an anime clip then polish it with blender. is this feasible? from my knowledge you need to have a 3d model to use blender, not a video. (haven't tried any of them yet, so i'm clueless)
What are some good UIs for Stable Diffusion nowadays? (No Comfy)
Hey! I just launched a blurred vault called SceneSnatch, would love to join any creator network servers or vault collabs you're part of. Open to swaps or promos!
guys can i make a job post here?
there's a111, tons of features there. hard to navigate tho
there's also invoke ai. cleaner buttttt also has a higher vram requirement, so good luck w the gpu memory huhu unless u invest in good GPUs then ur set w this
there's also savro which is a plug and play, so it's easier to use. i use this nowadays since i make the model myself and there's consistency
in general there are a ton aside from comfy, u can use whatever suits your liking. i personally prefer savro thoo
Hey. Can you make images based on an input image using SD?
Like a reference image to copy artstyle from?
there is flux Kontext that can do that
released yesterday
for stable diffusion there are several ipadapter
Not sure where to ask. Are there any decent low-profile GPUs for stable diffusion?
Looking to create a series similar to black mirror. If anyone's interested hmu.
with Flux Kontext, can you do a controlnet to only modify certain areas?
I'm guessing yes?
Hello pals,
I don't know literally anything about AI at this point, but I am an expert in traditional art and VFX. I want to be able to take a fairly simple physical object as a video of all angles of it (or whatever) or photos of it from all angles and compose it into an AI created video, or in-painting it into something else etc.. Would this require training LoRa for my specific object? I am interested in consulting with an actual expert (at least that can do what it is I am talking about and understands the comfyui pipeline or whatever for it), paid of course, or interested in someone here (or anywhere, or a recommended platform to hire) an ai artist to do this for me as a example as I can't find similar thing online really.
Anyone know or is interested please let me know. I'm going to start learning about this stuff full time from this week on until I get it sorted, find someone to do it for me, do it myself, or learn what I want is not possible.
Thanks, and pleased to meet you all.
hi
guys, what is the difference from Forge to ReForge?
you can just use inpainting for that
Also is there a comfyui node for comparing different outputs? maybe with a slider
Yes kontext can be used instead of all kind other control models for image generation. It can be used for face replacement, style transfer, inpainting, etc. and works quite good so far.
The comfy node you are looking for is called Image Comparer (rgthree) but there are other ones too
Does Flux respond better to a different prompting style than stable diffusion? i can't remember if i read something like that or not. Like more prose natural language prompting style vs SD keyword style prompting
Yes flux can work pretty good with long natural descriptions and sentences. No comma separated list of keywords needed
Can you use multiple images as input for kontext? like 5 instead of just 1.
yes you can
Any advice on how I can make the videos I generate on the Image-to-video model better?
Unsure of how to go about this. For instance the image is fine, I have even created a separate AI agent node in my workflow to create a video prompt and advise on how the video should be brought to life, though the rendering isnt great - clearly shit
And how do you feed that? I see there's a image stitch node i can use to stitch two photos together. But more? Do i have to stitch the photos manually first?
wait, comfyui is open source right? i want to contribute a node that stitches arbitrary number of images into a grid
Should i? lol
Well there is a image stich node which can be used for it and repeated use leads to the same result, so well it might not be required.
Feeding depends on your scenario. If you want to make a group picture etc. you would stich all the images together and then prompt them for whatever they should do.... The people on the picture are sitting at a restaurant,....
If you have a special background etc. you would use the background as another condition and merge the group and the background.
But as the model works best with step-based editing, I would think about adding steps in between to avoid glitches
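On the grid-stitch node idea from above: the layout part is small enough to sketch in plain Python. This is only a rough sketch, not a ComfyUI node. `grid_layout` is a made-up helper name, and it assumes all tiles were already resized to the same width and height. It just works out the canvas size and where each tile would be pasted.

```python
import math

def grid_layout(n_images, tile_w, tile_h, cols=None):
    """Return (canvas_w, canvas_h, offsets) for pasting n_images equal-size tiles.

    Hypothetical helper for illustration; assumes every tile is tile_w x tile_h.
    """
    if n_images < 1:
        raise ValueError("need at least one image")
    # Default to a roughly square grid when no column count is given.
    if cols is None:
        cols = math.ceil(math.sqrt(n_images))
    rows = math.ceil(n_images / cols)
    # Top-left paste position of each tile, filled row by row.
    offsets = [((i % cols) * tile_w, (i // cols) * tile_h) for i in range(n_images)]
    return cols * tile_w, rows * tile_h, offsets

canvas_w, canvas_h, offsets = grid_layout(5, 512, 512)
print(canvas_w, canvas_h)
print(offsets)
```

With 5 tiles this picks a 3-column grid and leaves the last cell empty, which is the same result you would get by chaining the stitch node by hand.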
I wanted to stitch together multiple photos of the same person so the model understands the person better. Almost like a hacky LoRA. Do you think this is a good use of the model or not really suited?
I would not do it as it does not learn, so better to choose the best-suited image (angle, lighting, ...) for the situation you want to generate
Is there a way to connect the output of a generation directly/automatically to a load image node?
so i can iterate without having to download and upload over and over?
There are loop nodes for ComfyUI, but these are not easy to use... some example workflows exist though. The most common use case is iterating over different settings (denoise, steps, ...)
A question not related to Kontext.
I'm trying to create a Lora and lets say i have 10 images. I've already created workflows using Lora or otherwise (using Kontext or other models etc) and generated maybe 20 more images, based on the 10 original. Of course the 20 were handpicked to be the best representations, out of lets say 100s of outputs.
Now, if i train a new LoRa on the 10 + 20 = 30 images, will there be an increase in quality over a lora trained on the 10 original, even though the 20 extra are "based off of" the original 10?
How do I enable Xformers in ComfyUI?
I am kinda nooby here is there any resource where installation of latest model is shown?
For the Webui installation guides check out:
https://github.com/CS1o/Stable-Diffusion-Info/wiki/Webui-Installation-Guides
Models can be downloaded on Civitai.com
Does Flux Kontext dev make IP adapter and maybe controlnet more or less obsolete when it comes to editing images? Flux Kontext dev seems to be much better at generating the same face and such. IP adapter face ID seems to max out at about 80 to 90% but Flux Kontext dev seems closer to 99%
is there a discord community for automatic1111 users?
Here
You can chat about any webui here and get help in #tech-support if you have technical questions or #prompting-help for getting better images, asking for settings/extensions etc
hey hey everyone, Elios is the name, animation is my game! ❤️
Hey
hey
not so much controlnets, but definitely ipadapter
ipadapter works by processing the image with a vision model and extracting information from it. This information is then injected into the diffusion model. The important thing here is that your input image is not given 1:1 to Flux, only some kind of description or summary of it
that's why ipadapter can never use all details of the image you give it
flux Kontext in contrast is putting the image as it is into the diffusion model, so it can use all details of the image
(one reason why ipadapter is not doing this, besides it being computationally more demanding, is that if you give the diffusion model all the details of an image, the risk is high that it just outputs the very same image. Like you say "give me an image similar to this" and the diffusion model just outputs the image itself. Flux Kontext works because it was trained for a very long time on different tasks, very likely MUCH longer than ipadapter)
Controlnet definitely has more fine control, especially compared to ip adapter as far as I know, and I assume Flux Kontext dev doesn't provide that level of control yet.
It has less level of control but it covers 90% of image editing tasks like "Remove this, add that, or colorize the sketch"
control nets and kontext work differently
Flux Depth (and Flux Fill, both use the same technique) place the control image and the generated image on top of each other. So each patch in the generated image gets the exact corresponding patch from the control image
so this makes it easier to transfer shapes, depth maps, contour lines and so on. It's also more compute efficient than Kontext
in Kontext control image and generated image are placed "next to each other". This makes it easier to change the generated image and more difficult to keep it exactly the same.
Nevertheless, you could train Flux Kontext on depth maps and it would probably perform almost as well as Flux Depth. It's just: why would you want to do that? Flux Kontext is slower than Flux Depth.
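To put rough numbers on the "on top of each other" vs "next to each other" point above, here is a back-of-the-envelope sketch. The 8x VAE downscale and 2x2 latent patching are my assumptions about Flux, the function names are made up, text tokens are ignored, and the control image is assumed to have the same resolution as the output.

```python
def image_tokens(width, height, vae_factor=8, patch=2):
    """Approximate number of image tokens the transformer sees.

    vae_factor and patch are assumed values, not read from any model config.
    """
    latent_w, latent_h = width // vae_factor, height // vae_factor
    return (latent_w // patch) * (latent_h // patch)

def sequence_length(width, height, mode):
    """Token count when a same-size control image is added.

    "stacked":      control is concatenated per patch (Flux Depth / Fill style),
                    so the sequence length does not grow.
    "side_by_side": control tokens are appended to the sequence (Kontext style),
                    so the sequence length doubles.
    """
    tokens = image_tokens(width, height)
    if mode == "stacked":
        return tokens
    if mode == "side_by_side":
        return 2 * tokens
    raise ValueError(f"unknown mode: {mode}")

stacked = sequence_length(1024, 1024, "stacked")
side = sequence_length(1024, 1024, "side_by_side")
print(stacked, side)
# Attention cost grows roughly with the square of the sequence length.
print((side / stacked) ** 2)
```

Twice the tokens means roughly four times the attention work, which fits with Kontext being the slower of the two.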
I mainly use IP adapter for face transfer and Flux Kontext dev seems better for it, so ip adapter is going to the basement for a good while.
flux Kontext can also be finetuned
as to be expected, many people currently waste their compute on unclothing finetunes and stuff
but I think/hope we will see cool and interesting finetunes of Kontext in the future
Outfit swap between two pictures can be interesting and look totally doable/trainable
How does flux kontext decide the output image size if i didn't set it? it doesn't necessarily seem to match the input
the same way as the base model
if you use comfyui, the default workflow is to use the conditioning image as input for img2img. just change the workflow to use an empty latent image in arbitrary resolution
usually it defaults to a standard resolution rather than matching your input image size. If you want more control, you'll need to explicitly set the width and height in your settings. you can also mention the size in your prompt to be more specific
Output image size and dimensions are determined by latent. It can be any size. Reference image on the other hand is part of condition together with text prompt
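To make the "size comes from the latent" point concrete, here is a tiny sketch of how pixel size maps to latent shape. `latent_shape` is a hypothetical helper. The 8x downscale is the usual VAE factor as far as I know, and the channel counts (4 for SD-style VAEs, 16 for Flux) are assumptions you should check against your model.

```python
def latent_shape(width, height, batch=1, channels=4, factor=8):
    """Shape (batch, channels, h, w) of an empty latent for a given pixel size.

    channels and factor are assumed defaults, not read from a model.
    """
    if width % factor or height % factor:
        raise ValueError(f"width and height should be multiples of {factor}")
    return (batch, channels, height // factor, width // factor)

# The output resolution follows from this latent, not from the reference image.
print(latent_shape(1024, 768))
print(latent_shape(1024, 768, channels=16))
```

So an empty latent of whatever shape you pick sets the output size, while the reference image only rides along as conditioning.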
Hey guys, I've been doing some SDXL in diffusers as my first AI project. If you wouldn't mind I'd love some input on my code! I have a local branch trying to configure the 2x latent upscaler instead of the beefy 4x
https://github.com/luckielordie/sdxl
yeah
Hello @outer patio
I'd love to help you train an AI model using your image.
I have hands-on experience with custom AI training pipelines such as DreamBooth, LoRA, and Stable Diffusion, and I can generate high-quality, personalized outputs whether you're looking for realistic portraits, stylized art, avatars, or any other use case.
Just dmed u
https://github.com/VectorSpaceLab/OmniGen2 damn it is looking good
anybody here using Microsoft Designer (Dall-E 3 via Bing Chat?)
Pwo
Who sell ig followers
I could promote your account
If that works
Hahah
Wish me luck. Middle of installing rocm via distrobox to run comfyui on steam deck xD
Hell, maybe even attempt with the highly unstable zluda lol
Rip, with rocm in distrobox, it got as far as loading model, and halted on ksampler 
There is a rocm flatpak for alpaca llm on the flathub store.
Idk if it works but thought maybe you can access or use it in a way
Ain't llm, but image gen. Comfyui in this case.
is there a way to mask images just using prompts? Is there any extension i can download for that?
check this platform.
@tender minnow Check out Comfy's Seg Anything Ultra node (layer style node pack). It allows you to mask by prompt. you can do subject, face, hands, and possibly other features.
https://github.com/chflame163/ComfyUI_LayerStyle
hi there, i'm pretty sure you can't really mask images purely with prompts, but you can use inpainting which lets you cover the areas you want. If you want something simpler, I've used Savro since it has an easier interface for masking without extra extensions.
hello newbie here 
Just a question, as you spam the savro thing on other servers as well, even when a complete, correct answer has already been posted (segment anything node). Are you related to the product in any way???
is there a way to view the previous results? isnt it saved in a different folder? does anyone know where it is?
Maaaaan i can't get good results with Flux Kontext (using their api for their pro model even)
I ask it to change the background of a photo of a person, and the person definitely looks slightly different
Disappointing...
Might have to go back to LoRA training.. i was hoping Kontext could be a hacky alternative to using LoRAs
you can train a lora for kontext
hi
hi
this usually worked very well for me. What is your prompt?
Hi, all.
If you have a project in mind or if you're just exploring potential enhancements to your website or web application, I'd love to chat with you as a full stack developer.
@stanle
hello gamers
Hoooi looking to someone to help me with some generations
Paid of course and willing to ask for more if it works uwu
🗣️
One answered me but just ghosted dhdhd
does stable diffusion have the capability to keep an images overall composition and such but update the colors/rendering to improved styles?
HI
Hey
Gm
that would be style transfer, or controlnet, or img2img, etc.
Hi
Hi @novel tide
Hiee
Hello everyone! We need person who understand how stable diffusion works and can help us with workflow. Write for more details, payment is negotiable!
Random, can somebody please tell me why there are so many tech layoffs. Yet accounting seems to be more stable right now. I know things in the future will eventually be taken by ai but would like to know why software engineering jobs seem to be more impacted by ai than accounting for example.
if i had to guess, maybe its bc its less forgiving to make a mistake in coding than when it comes to finances. you can always debug later too
more forgiving*
Would say that the optimization for accounting has been done over the last 20 years with all kinds of tech. OCR for documents, fraud detection, ... The remaining potential is not very high. The "mass" jobs which don't need much skilled labor are already done by machines. Bottom line, the workforce is cheaper and more readily available.
IT on the other side has not been automated much. Still, there are enough easy tasks that can be done by ai instead of junior devs. And even junior devs are kind of expensive, so the potential cost cut is kind of large. Additionally, in some countries skilled engineers are rare, so finding another solution is more urgent
Is there anyone you know who might be able to help me try on some of my designs?
What do you mean by try on?
try (as in to try and upgrade the rendering/style etc) on some of my design
You could post it in general with images maybe someone will pick it up
I posted a thread to "community projects"
Hi can someone help me choose a model for generation of creatures?
A curiosity when creating the Lora. Is it better to use images of that character without the background (transparent) or with a black and white background?
there is no transparent
diffusion vae does not support alpha channel
if you have a transparent background it will usually just get black
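If you don't want transparent backgrounds to silently turn black in a LoRA dataset, you can flatten them onto a color you choose before training. A minimal sketch of the per-pixel math in plain Python (`flatten_pixel` is a made-up name, and real image tools do this for whole images at once):

```python
def flatten_pixel(rgba, background=(255, 255, 255)):
    """Composite one 8-bit RGBA pixel over an opaque background color.

    Integer version of: out = alpha * foreground + (1 - alpha) * background.
    """
    r, g, b, a = rgba
    return tuple(
        (c * a + bg * (255 - a) + 127) // 255  # +127 rounds to nearest
        for c, bg in zip((r, g, b), background)
    )

# Fully transparent pixel takes the background color instead of going black.
print(flatten_pixel((0, 0, 0, 0)))
# Fully opaque pixel is unchanged.
print(flatten_pixel((200, 30, 30, 255)))
# Same transparent pixel on a black background, which is what you
# effectively get when the alpha channel is simply dropped.
print(flatten_pixel((0, 0, 0, 0), background=(0, 0, 0)))
```

Which background color works best for training is a separate question; this only makes the choice explicit.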
HELLO
hi there!
I've got the dumbest question in the history of ComfyUI but I've got to try.. I had this green dino in my ComfyUI that could save workflows and clean cache or VRAM. Now I had to re-install the whole thing and it's gone.
Anybody knows the name of the plugin/node?

hello from India
hello
Hey
Do you guys know if stable diffusion or flux models understand colors in terms of hex codes or HSL or whatever?
Hey is anyone using fooocus ai on amd?
heha
yeah, kinda.. they don't really know hex or hsl but they can guess if you put it in the prompt
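Since the models mostly know color words rather than codes, one workaround is to turn the hex code into the nearest plain color name and prompt with that. A small sketch, where the palette and function names are my own picks and not any standard:

```python
# Hypothetical mini palette; extend it with whatever color words you prompt with.
PALETTE = {
    "red": (255, 0, 0),
    "orange": (255, 165, 0),
    "yellow": (255, 255, 0),
    "green": (0, 128, 0),
    "teal": (0, 128, 128),
    "blue": (0, 0, 255),
    "purple": (128, 0, 128),
    "pink": (255, 192, 203),
    "brown": (139, 69, 19),
    "white": (255, 255, 255),
    "gray": (128, 128, 128),
    "black": (0, 0, 0),
}

def hex_to_rgb(code):
    """Parse '#rrggbb' (or 'rrggbb') into an (r, g, b) tuple."""
    code = code.lstrip("#")
    if len(code) != 6:
        raise ValueError("expected a 6-digit hex color")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))

def nearest_color_name(code):
    """Return the palette name closest to the hex code (squared RGB distance)."""
    rgb = hex_to_rgb(code)
    return min(
        PALETTE,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(rgb, PALETTE[name])),
    )

print(nearest_color_name("#e63946"))
print(f"a {nearest_color_name('#101010')} leather jacket")
```

Plain RGB distance is crude and does not match how people see color, so treat the result as a starting word and adjust by eye.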
Do you know any (free) program to remove people from a vacation photo? I usually use affinity photo but it's too hard to do it by hand
Hi, there
Hi, there.
I am an AI/ML developer with over 8 years of experience solving real-world challenges across multiple industries like healthcare, law, education and so on.
I specialize in developing AI-powered solutions such as chatbots, AI agents (MCP/Agentic/Voice), Prompt engineering or implementing RAG system, and LLM models (training, deployment, and fine-tuning).
With a deep understanding of both AI/ML and web technologies, I provide end-to-end solutions from conceptualizing AI models to integrating them into practical, scalable applications.
My track record of successful projects across multiple industries ensures that I can deliver high-quality, tailored solutions that meet your specific needs.
Services I Offer:
Automation: I specialize in automating tasks using tools like n8n, Zapier, and Make.com.
NLP: I handle advanced NLP tasks with models such as GPT-4.5, GPT-4o, Claude 3-7 Sonnet, Llama-4, Gemini2.5, Mistral, and Mixtral.
Model Deployment: I assist with the seamless deployment of machine learning models across various platforms.
TTS / STT: I implement both TTS and STT solutions for interactive and conversational AI experiences.
AI Agents & Chatbots: I develop custom AI agents, Agentic AI, chatbots, and VoiceFlow applications for diverse business needs.
** Check out my portfolio in my discord profile**
I always try to learn new and cutting-edge technologies, and I place great importance on collaboration with team members in development.
If you have an innovative project idea, feel free to reach out. Let's bring your vision to life!
Thanks.
Does anyone know how to automatically get the preview pictures for models and loras on Forge? On SD.Next there is a button to press that, like, goes to websites and auto-downloads pictures etc. Wondering if it can be found in the other SD UIs
Hey, I am a ai automation developer if you need any help or service just drop me a msg @split haven
hey guys im superr stuck, i successfully saved 2 workflows. I'm trying to save my 3rd but... its not saving to my workflow folder, i am clicking save as, and when i try to save again it is complaining that the file already exists
what benefits do comfyui nodes have over forge/automatic1111 ? is it worth learning it even tho it's harder ?
yeah use img2img with low denoise strength at around 0.2
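Rough intuition for why a denoise of about 0.2 keeps the composition: in diffusers-style img2img (as I understand it) the strength decides how far back along the schedule sampling starts, so only a fraction of the steps actually run. ComfyUI's KSampler handles its denoise setting differently, so take this as an illustration rather than what your UI does exactly. `img2img_steps` is a made-up name.

```python
def img2img_steps(num_steps, strength):
    """Return (start_index, steps_run) for a given denoise strength.

    Assumes the convention steps_run = int(num_steps * strength),
    capped at num_steps.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    steps_run = min(int(num_steps * strength), num_steps)
    start_index = num_steps - steps_run
    return start_index, steps_run

# Low strength: most of the schedule is skipped, so the input image's
# layout survives and only colors and fine rendering get reworked.
print(img2img_steps(20, 0.2))
# Full strength: the whole schedule runs and the input is mostly ignored.
print(img2img_steps(20, 1.0))
```

If the style barely changes at 0.2, nudging the strength up in small steps trades composition fidelity for more restyling.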
so you're looking for an auto-downloader that grabs example images and metadata previews for models?
Excuse me, where should I type the text prompt to generate an image?
hello!
https://crowdllm.ct.ws I found this tool to crowd-build datasets for LLMs, the concept is interesting. You create a task and people can contribute by adding prompt-answer pairs, just adding prompts, or just answering prompts that don't have an answer yet.
Which model do you guys think is best for "X artist style of Y" like "Picasso style cat"? SDXL, SD1.5, Flux
I would try SDXL checkpoints like Juggernaut XL or Realities Edge XL, Dreamshaper isn't bad either.
flux
it honestly depends
there are certain workflows that are only available through comfy
and most of the newer stuff is going to be comfy-only
100% recommend learning comfy if you truly want to reap the benefits of AI
