#general-chat
It works surprisingly well for powershell & scripts
So far had gpt alter some comfyui's nodes to be faster, mmap cache models to load way faster, like from 300MB/s stock to 4GB/s, but nothing i can "sell any skills for", as it was all GPT's handiwork lol
Lately it's actually been great! (if it was regarding gpt)
Hmm if its selling the improved nodes i dont see a problem Personally
If someone commissions me a custom workflow i charge them by the time i put into it
Might need to elaborate, "if it's selling"? As in if it's gained popularity?
Hmm well if you regularly optimize nodes you could post a few for free and some behind a patreon or paywall
Tho it goes against the spirit of opensource but a man gotta eat
Indeed. I'm on disability income, so if i gained money from this, it'd be a plus-income 
Hmm since you work with comfy you could sell custom workflows if thats something you can do
Sell the skill not the product so to speak
Indeed. Though, in my case it's more a "word wizard" for A.I, if that's even a skill, as i still can't wrap my head around large amount of info easily as others can.
You might see a beautiful lush forest, i struggle to even grasp the rest of the tree from the first branch.
Ahh hmm in this case I'd imagine more of a workflow to make such images instead of the image it self
Though you could always try to sell t-shirts online with custom images lol
Nah, as A.I generated is not original art. It's existing people, styles and art thrown into a blender. So sadly i find it hard to want to take money for non-original work.
Plus, i'm all for open-source stuff, so since i've no proper coding skills, i have no clue how to even make a job/freelance out of any of it 
Prompt > image > photoshop/edit > shirt
Feels wayy better since theres work put into it
Where i can find it sellable of generated stuff, is use generated as illustration/inspiration, then make your own on top.
Hmm could also work, or use ai as reference etc
Yep. Like the video game i am super slowly starting on, i need inspiration for the "character", so i generate images of illustrations of said character, find a few i like, then create character models in blender from there
When I use the Wan image-to-video example workflow in ComfyUI, the output has absolutely zero to do with my input image... Any idea why this might be happening?
Also... the LTX example workflows (https://github.com/Lightricks/ComfyUI-LTXVideo) refuse to load at all. I think I installed all the necessary custom nodes and models, but nothing.
Anyone know the tech behind Kling AI?
I only know as much that it uses 3D Spatiotemporal Joint Attention, which gives generations their natural physics based results.
yeah $4200/3 Months, I want to make a open source at that point.
It actually gave me an idea to "enhance" wan results by implementing a in-between node for comfyui which will use depth models like midas, and bake it into the latents. GPT is scrambling together the code as we speak 
I can help if you're open to sharing
Oh, that branched to an idea where i'm gonna have GPT make a node that outputs not only regular video results, but a controlnet stick figure as well 
API prices like this make no sense, what is the difference between professional and standard?
Professionals buy in bulk and probably get a discount etc
Of course! As i'm gonna throw the code over to my git
Sadly, i can't code for shit yet, just some basic knowledge if there's a red because a syntax error, i can somewhat fix those x)
Token costs and result are nowhere to be found.
I'm a SE, 20+ years.
Ping me, let me know how it goes or if you need python help.
I haven't had good results with WAN yet.
Added ye so we can take it to dm's or something :P
Hi guys,
I want to create a gif or a video thumbnail that has static picture with the information of my song with moving sound wave across the picture.
I tried videos by Sora, but it doesn't work. it doesn't keep static image on the background. Images also do not work, because sound frequencies waves do not move
I would really appreciate any info that could point me in the right direction. Maybe any other simpler tool than SD?
Thanks a lot
Bro again?

I honestly recommend looking on LinkedIn or fiverr man, this isnt working
Its seriously clogging up gen chat
Im not short of money but i recommend looking on places where businesses look like the one i posted above
Almost all people here are for image generation etc
true
you can easily do this with python and a frequency wave form.
You mean python the programming language?
What else would there be?
You're just converting frequency to a image, any language can do that easily.
This wouldn't be a task for ai
Is anyone using dreambooth with a 50 series GPU? If so, can you tell me how?
I don't know, like real python for example :). I'm equally scared and inexperienced in handling both of them :)))
Sorry, maybe I didn't make myself clear. I meant to say that I don't need to convert the sound wave frequency into an image.
I need to have static image and running soundwave frequency on top of the static image. I don't know if it's possible at all with AI tools like Sora or filmora or SD
Here's the simple example of what I'm referring to. It is YouTube thumbnail for a song.
https://youtu.be/VqAj8yL2lOA?si=sI71ELV539eKVeY_
Thought I think this thumbnail is a video thumbnail, not .gif
This has no real application in AI, generating soundwaves is just a series of images (gif) of said soundwave, you would capture and export that using a program or your own written app.
You can generate the background image with SD no problem.
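A rough sketch of the "python and a waveform" idea from above, using only the standard library. It turns audio samples into one row of bar heights per video frame; drawing those bars over the static picture and encoding the frames as a gif or video is left out and would need an imaging tool of your choice. The function name, the parameters and the test tone are made up for illustration, and the bars follow peak loudness over time rather than a true frequency spectrum.

```python
import math


def bar_heights_per_frame(samples, sample_rate, fps=30, bars=32, max_height=100):
    """Return one list of bar heights (in pixels) per video frame.

    samples: mono audio as floats in the range [-1.0, 1.0].
    """
    samples_per_frame = sample_rate // fps
    frames = []
    for start in range(0, len(samples) - samples_per_frame + 1, samples_per_frame):
        window = samples[start:start + samples_per_frame]
        chunk = max(1, len(window) // bars)
        heights = []
        for b in range(bars):
            part = window[b * chunk:(b + 1) * chunk]
            # Peak loudness of this slice, clamped to 1.0, scaled to pixels.
            peak = max((abs(s) for s in part), default=0.0)
            heights.append(round(min(peak, 1.0) * max_height))
        frames.append(heights)
    return frames


# Hypothetical input: one second of a 440 Hz tone at half volume.
rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
frames = bar_heights_per_frame(tone, rate, fps=25, bars=16, max_height=100)
print(len(frames), "frames,", len(frames[0]), "bars each")
```

Each inner list is what you would draw on top of the same background image for that frame, so the picture stays static and only the bars move.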
Thank you, however, at this point I'm not looking to do any custom coding myself.
I don't think all the musicians that post that type of thumbnails on YouTube are coders either .
I think I'm looking for more simple solution than coding.
Have ChatGPT code for you G
Humans werent made for coding. That's what AI is for
I made a game in Python with me being the architect and ChatGPT being the translator from English to Python. So far, so good. o4-mini-high is insanely good at coding
He gets it
@vapid dove fairly well known scam
Hi. My name is Aki and im from Austria. Im pretty new to ai. Nice to meet you.
general with images is more art this is just general
this discord is mostly art
yeah, thats what im wondering, i just joined, been playing with automatic1111
ive kinda just been playing around with whatever ai i can get my hands on
Mix of both kinda but tech support gets the most traffic
ah, cool, i honestly dont know what this server is, i kinda just stumbled on it through the discover tab
Ahh its the official stable diffusion community server
do people generate images in this server or do they do them local and then upload them?
Hmm people in #artisan-faq do it here but most do it locally
I prefer local due more control & freedom
i prefer local so i know what im doing when the elites cut us off xD
Lmao, that and the only costs i have (after investing initially) is my electricity bill
true
biggest pain in my ass is im using an amd gpu...
amd doesnt even try foreal
but they cheap
Ahh rip, hmm amd is close to getting native pytorch support
So once thats around youd get a massive performance boost
they been saying that for years tho
whats crazy is, i couldnt get automatic1111 to run on my linux os, i got it to run in windows
I mean fair since a1111 is outdated by a year+
what is the better option now and is it free?
Hmm for amd forge is better for sdxl
whats great is i can use ai to help me set up ai xD
I 10000% do not recommend that
But yeah in techsupport theres a guide for amd cards forge written by Cs1o
yeah, youre probably right
it was walking me in circles
i kinda just figured it out myself
Personally i use SwarmUI but for amd you'd need to change the backend manually to use zluda
i ended up just running the bat file normal, uninstalling the torch files, and installing torch-directml
Ohh directml while it works
Its sorta takes up more vram
But if you dont mind slower gen speeds, swarmUI is peak
directml? doesnt sd run on top of that?
iono, i feel like it had to have that installed to get it to work
oh well, thanks for the info, im not too serious with it atm, i kind of just wanted to see what it could do
Fair fair
the models i was using didnt produce anything good
and even tho i have 16gb of vram, i didnt seem to be enough for more intensive stuff
Yeah with amd you can't really run anything above sdxl
And directml uses a lot of vram on its own
With my 16gb nvidia i have 5s gens in sdxl
oof, i would kill for an nvidia card
or for amd to get off their ass
oh well, i guess its just not their market
their aim is more, selling mid range cards and undercuting prices as little as they can get away with xD
they pretty good at gaming tho
does anyone know how to add eta noise into comfyUI?
Oh rip. Dont use directml
With a 16gb vram GPU and zluda you will have way more fun than on directml.
Checkout the install Guides for Zluda.
First link in the pinned messages of #tech-support
Hmm you could rent an H100 for $1 an hour
I almost got Wan 2.1 14b to work on my 5090 but I was too dumb to write my own Cuda kernels
lol I am also struggling to write cuda kernels
H100 for $1 an hour is not gonna be high volume its not profitable price
?? 14b should be easily possible on a 5090
I run it on my 5080
guys how to generate figure 1 in Midjourney in the style of figure 2? thank uuuuu
Seeking a sales & marketing automation specialist!
I need an expert in AI-driven workflows to accelerate lead generation, outreach and campaign analytics. While many of us build multipurpose agents (myself included), if crafting scalable, data-driven specific to sales funnels and marketing optimizations is your passion, let's connect! (and yes this was written by ChatGPT)
how do I add eta noise seed delta in comfyUI?
gotta enjoy spambots
@ancient mauve have you tried asking in the comfyUI discord or other more specialized discords? Like in L3
Im asking in various servers and no one is answering
im also searching in google and everywhere and nothing
mind me sending one where you might have any luck?
Im trying to recreate an image and the metadata says it has ENSD 13337
yes in forge is easy to set up
but not so in comfyUI
so its just a number added to the seed?
i assume so from that post
It just makes the seed more random as it already is
but I know both the seed and the ensd
Try recreating another image without an ENSD, if you're only doing this to test whether you can get the same results
You won't even get the same image if you use a different webui anyway
Also using different GPUs (nvidia, amd, intel, cpu) makes also different outputs.
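A small sketch of the "number added to the seed" reading of ENSD discussed above. This is an assumption about how A1111-style UIs treat the setting (the offset is added to the seed used for the sampler's extra noise), not something confirmed in this chat. The function names are invented, `random.Random` only stands in for the UI's real noise generator, and the seed 1234 is hypothetical; 13337 is the ENSD from the metadata mentioned above.

```python
import random


def eta_noise_seed(seed, ensd):
    """Seed used for sampler noise when an ENSD offset is set (assumed behaviour)."""
    return seed + ensd


def noise_preview(seed, n=3):
    """First few values from a seeded generator, standing in for sampler noise."""
    rng = random.Random(seed)
    return [round(rng.gauss(0.0, 1.0), 4) for _ in range(n)]


seed, ensd = 1234, 13337
print(eta_noise_seed(seed, ensd))  # 14571
# The offset seed produces a different noise stream than the plain seed.
print(noise_preview(seed) == noise_preview(eta_noise_seed(seed, ensd)))
```

If that reading is right, knowing both the seed and the ENSD is enough to work out the offset seed, but as noted above a different UI or GPU can still change the result.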
Does comfy UI work better enough to justify the installation and the learning curve?
Try swarmUI, you can try comfyUI whenever but its optional as there's a handy GUI
Ui***
It will automatically make a workflow in the background depending on the settings you change so you don't "have to" use comfy but still have all of its benefits
Be sure to not accidentally install stableswarmui though
As its the old old version
I have installed Automatic1111 and Forge. they work, but there are things that are "Broken"
Are you on amd by any chance
Can anyone help me understand, i was told to check out Hyperlora for the best consistent character workflow. But it seems to use a reference image for control net, but i dont want that - i only want the face to be needed + a text prompt to describe the scene. Any advice?
anyone experience with comfyui and wan img to vid? When I try I get rave fest lol :D
Works pretty good for me (comfyui) which WAN I2V do you use?
I downloaded wan2.1flf2v 720p 14B fp16, I have no idea what settings I need. I tried 1024x1024 but I seem to take forever now to generate
did you tried the example workflow which you could reach in the menu (Workflow > browse templates > video > wan image to video?
Yesnt, I downloaded 720p instead of 480p but I essentially didnt change anything besides sampler, and I also put image gen in front of it instead of providing one
and the generated image is "clear" not too much noise and not too much objects in it?
its basically just jittering around and stretches into oblivion with a lot of rave like colors :D
just tried it with the basic workflow and it worked flawless.
I tried these model to confirm the example workflow is working:
wan2.1 480p and wan2.1 720p both FP8
The umt5_xxl Clip
and the wan.vae
Okay thank you, will try that in a bit!
Okay I have now everything, do I need to edit the prompts since I dont really have an image yet? Do I adjust them to fit the image I insert?
Oh, that's way too big, as unless you got a 48GB+ gpu, and a latest 5000 series quadro, it'll take an hour xD
Try starting at 480x640 with 41-85 frames
I just barely got Wan 720p to work on a 5090. I quantized just the linear layers. And it worked for the t2v, but ran out of memory for i2v
Exactly 1 hour, but down to 40 min when I used sage attention
Has anyone managed to get characters with different poses? I keep trying but it never works.
nah only got 4080 super :D
No intel , with nvidia 12 gb VRAM
Ah what are the things that are "broken" then?
Can someone help me with Stable Audio's input audio upload? I am trying to upload audios but they all just fail. One succeeded and the others are the exact same files as this one, but they fail. It just says "Contact support"
Depends on the base model, some further trained base models has been fine tuned/further trained with better pose controls.
Alternatively, you can use the good old stick figure with controlnet :P
Layer diffusion and other things
Ah extensions, makes sense if not maintained
An important paper just dropped
https://arxiv.org/abs/2505.10046
"Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis"
Focus on 6.1 which finds timestep conditioning to be a waste of parameters
Hi, I'm learning to use Stable Diffusion for the first time. I'm trying to use Controlnet for posing the image, but it seems like most of the checkpoints (even that's a new word for me) I want to use are too "stylized" to listen to Controlnet's Openpose.
I don't actually know for sure what the problem is.
Do you have any advice or workarounds for this?
Trying my best to understand your question. One of the available controlnets is an openpose controlnet. To make it work you need a special openpose image (a stickman with different colors for the limbs etc.) or a preprocessor to extract the pose from a source image (which again gets converted to the openpose format). Then just write a prompt that includes a character, and the character should take the pose if the strength of the controlnet is high enough.
Thank you.
I use a 3D modeling program to make the pose I'm looking for, then I upload that backgroundless png to Controlnet. I use the openpose preprocessor to process the image, and it shows me a preview that the stick figure took the pose and understands what I am trying to do. Then, just to see if it's working, I set controlnet's priority to 2, the highest possible strength I see, and then select the box that says "Controlnet is more important" than prompts.
Then I add minimal written prompts.
When I generate the image, it makes a picture without the pose I gave it.
Also, the image I gave Controlnet keeps saying "start drawing" over it despite it recognizing the pose I assigned it.
It seems that some checkpoints are listening to Controlnet, while most of the ones I've tried are not responding at all. I hope there is a way to still use Controlnet since I like the art styles of the ones that aren't working.
could you share the pose (output from the preprocessor in general with image?
So the pretty good animated images or videos on civic are made with banger hardware I assume? Since whenever I try to make it work its so wonky
Video minimum requirements (if not endless patience available) would be 16gb vram
Yes im on 4080 super which has 16gb I believe but still not super sure what Im doing wrong, maybe my prompts are just bad
What model are you using.
WAN? Huan? ITX
Are you running wan, hunyuan, ltx
WAN
Or the ancient tech animatediff
Wan just released a new thing where gens get sped up massively
480p 14B fp16
Things previously took like 30min on a 4090 is now 3min or something with wan
generation takes about 5-10 minutes but its just off
Are you trying to animate something real or 2d?
2d
Thats the problem lol
I want an animated/cartoon cat walking/bouncing from left to right and its just raving lol :D
Iirc theres another group working on animating anime but its not ready yet
Previews while great were wonky lol
I wonder how its done on civicai website then, seen some incredible stuff
Trial and error probably, could keep trying but with our hardware it takes a while
You got the original negative prompt?
With all the Chinese in it?
Hey everyone how's it going? I'm trying to do a simple animation in stable diffusion with an existing image using img2img and animate diff but im ending up with the picture basically just tearing apart. Wondering if anyone has some tips or can point me to a basic setup to achieve this. It's basically just a still image and i just want evry minimal animation like camera movement or breathing.
Animate diff is ancient
Great to know, how would you recommend setting this up?
What gpu do you have?
rtx 3060
So if I wanted to make the cat work, I rather try to animate a real one?
Hmm im not sure how to get animate diff to work since its been a while sorry
thats fine im not married to it
literally just trying to animate my still images. 5 second animations, nothing fancy
Better chances yeah, and people use tricks to get longer videos like extending from the last frame etf
Etc
ahh neat, thankers!
Your using comfyUI?
A1111 and comfyui, I have both installed
i don't really care about the tool stack or anything but for example i have a guy standing there i just want some motion, maybe it looks like he's breathing or something
started with a1111 then switched to comfyui cuz I like the nodes
I'm not sure how to realistically achieve that within a reasonable time frame with a 3060, your best bet could be #prompting-help or #tech-support if its not behaving as intended
You could combine them and use swarmUI
Hella discussions going on with wan and its new light speed lora
I see, I will look into it the coming days. I mostly do this out of curiosity anyway so there is no rush :P
thanks
Bet, if you want i could dm you an invite so you can peek into the server since swarmUI has a comfyui backend
So you can use comfy whenever and have the benefit of most models working out of the box without issue & documentation
sure sounds cool :D
Just doesn't support comfymanager that well
are you saying a 3060 is not capable of generating a 24 frame animation in a couple minutes or ?
My 5080 takes 10 minutes
for how long
what is the newer stuff?
which image gen/checkpoint would you recommend that I can use with wan after?
Any really
Aight, tyty
Traveling home, you could ask in the swarm one since they know a lot more then me immusama
no worries, thank you!
i downloaded a release of forge but no stable diffusion video https://github.com/lllyasviel/stable-diffusion-webui-forge/releases/tag/latest
Does anyone know what to use to get the following.https://youtube.com/shorts/EpzprZuWEwI?si=6ro8nkmvIvk2CAls
Sure, if I may tag you. I don't think I can do it here.
Is there a list of stablediffusion benchmarks for the 50 series cards yet?
hello
i'm trying to train a lora style model
in the tags part, should I only focus on the style or do I have to tag the character appearance as well so the AI can know what is going on?
Does anyone have a good prompt for generating a solid white background with no shadows? (For creating transparent images)
Hello
hello
i asked ai to give me a prompt and here's what it said
"Generate an image of [your subject] placed against a perfectly solid white background, completely uniform and without any shadows or shading. Ensure the background is pure white with no texture, gradients, or lighting effects."
Hello Adrian here, aka Sm0ke
I tried to fetch civitai's entire library of lora's with a script someone made that made the SQL DB of all the models, tried to fetch all of SDXL lora's specifically for archiving..
Well.. I ran out of storage on my C drive..
Reached 600GB ish worth of models. And i got this far lol
SDXL 1.0: 4%
Moving them to my server to continue there lol. Though slower because spinning rust.
Had GPT make the script that reads from the model database, and download all relevant images, all versions of a model, sadly it also fetched non relevant ones too, like FLUX and SD1.5 versions of the same lora, so had to manually "post process" them by searching for any .safetensors with pony, flux, sd1.5 and so on in the name 
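The manual "post process" step described above could be sketched roughly like this. The keyword list only contains the three names mentioned in the message (the "and so on" is not filled in), and the function names and example filenames are made up. Matching on filenames alone will miss any file that does not mention its base model in the name.

```python
from pathlib import Path

# Keywords taken from the message above; extend as needed.
OFF_BASE_KEYWORDS = ("pony", "flux", "sd1.5")


def is_off_base(filename, keywords=OFF_BASE_KEYWORDS):
    """True for a .safetensors file whose name mentions a non-SDXL base model."""
    name = filename.lower()
    return name.endswith(".safetensors") and any(k in name for k in keywords)


def find_off_base(folder):
    """Recursively list .safetensors files under folder that look off-base."""
    return sorted(p for p in Path(folder).rglob("*.safetensors") if is_off_base(p.name))


# Hypothetical filenames, just to show the matching.
examples = [
    "CoolStyle_SDXL.safetensors",
    "coolstyle_Pony_v2.safetensors",
    "coolstyle-flux.safetensors",
    "notes_flux.txt",
]
print([n for n in examples if is_off_base(n)])
```

Reviewing the list from `find_off_base` before moving or deleting anything is the safer route, since a style LoRA can legitimately have "pony" in its name.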
You only need 15tb for sdxl then nice
Depends on the remaining models, if it's only 1 model, or 7 versions, as well as if the model was trained with high rank, or low rank :P Low as in rank 16 will leave it at a few 100MB, 128 rank? 1.2GB per model :P
Got 20+TB free on my server atm, and more freeing up as i'm transcoding all my movies and shows to av1 xD
Hmm if sdxl 4% is 600 gb then 100% is 15000gb avg but then again idk how that script tracks progress
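The back-of-the-envelope estimate above, written out. It only holds if the progress percentage is proportional to bytes downloaded and the remaining models are about the same average size as the ones already fetched; the function name is made up for illustration.

```python
def estimate_total_gb(downloaded_gb, percent_done):
    """Linear extrapolation of total size from a partial download."""
    if percent_done <= 0:
        raise ValueError("percent_done must be positive")
    return downloaded_gb * 100 / percent_done


total = estimate_total_gb(600, 4)
print(total)         # 15000.0
print(total / 1000)  # 15.0 (decimal TB)
```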
SDXL 1.0 | Total: 40470
- Refiner : 1 models (0.0%)
- Standard : 805 models (0.17%)
- Unknown : 39664 models (8.51%)
Other | Total: 6843
- Inpainting : 2 models (0.0%)
- Standard : 4553 models (0.98%)
- Unknown : 2288 models (0.49%)
NoobAI | Total: 2458
- Standard : 4 models (0.0%)
- Unknown : 2454 models (0.53%)
Hunyuan Video | Total: 1473
- Standard : 8 models (0.0%)
- Unknown : 1465 models (0.31%)
Wan Video | Total: 604
- Standard : 4 models (0.0%)
- Unknown : 600 models (0.13%)
SD 2.1 768 | Total: 567
- Standard : 556 models (0.12%)
- Unknown : 11 models (0.0%)
Flux.1 S | Total: 504
- Standard : 12 models (0.0%)
- Unknown : 492 models (0.11%)
SD 1.4 | Total: 300
- Standard : 228 models (0.05%)
- Unknown : 72 models (0.02%)
It tracks progress based on model count, not sizes. Wish it did tho.
Here's percentage of models that civitai had as of a week and a half ish ago.
Hey! Would anyone be kind enough to share a working ComfyUI workflow package for cinematic / realistic scenes (like Netflix-style)?
I'm working on viral hook content and struggling to get clean outputs.
A working .json or folder setup would help me a lot. Thanks in advance!
Which version of python is everyone using
if i want to do more than 3 characters, am i forced to use regional prompter? Like when doing 4 characters, instead of picking the 4 of them, he will do 3 and the 4th one will be a repeat of one of the ohters
I'm using Python 3.10.6 (with ComfyUI), no issues so far.
3.12 for comfyui, but any other/older version for other python programs written for older python.
using 3.13 but i 1000% advise against it as only comfy will work but any other nodes will break
unless you patch them
Yep. I keep a few different comfyui venvs based on what i use them for. Stableaudiotools iirc needed 3.10.
Use ltxv, it can easily gen in a minute on a 3060. Although, wan2.1 is probably still sota quality, ltxv is still decently close and very very fast. Both are far better then svd.
https://github.com/Lightricks/LTX-Video or something else
heyyyy
Hi. anybody on?
If you clicked that link and logged in you got scammed
Its a common scam here
You better change passwords
If you gave them anything
Any guide here is found in #tech-support pinned messages
is there a guide on prompting? i cant figure out how to link keywords.
I realized it after I joined the server and left immediately so np ig
What model? Xl anime style?
waiNSFWIllustrious its the only model i got to work
i tried making different girls with different hair and skin color. is there a way to link those to one particular girl without it being randomly distributed?
Ah, then you could reference danbooru for tags or do you mean something else in tagging
Ahhh
Thats mostly luck
Or regional prompting
But I'm not an expert on that since i mostly make 1 girl or man women combos
so which are your favourite models to use?
they are the only ones that work for me ^^
Illustrious and NoobAI are both solid for anime ive found. But every model has its use case.
ive never seen a more obvious scam link in my life lol
Oh boi, i'm having a "blast" not attempting to have GPT try to make a node that loads sd 1.5 model lora's, be parsed as if they were sdxl models, and have if any blocks the sd 1.5 lora has to be applied with SDXL
So far fuckall results, but i'm still gonna see if it's any way possible lol
that doesn't make any sense
Why not use a sd1.5 model with the Lora followed by a sdxl image2image run?
you can load the clip l part for 1.5 loras. Besides that, sdxl and sd 1.5 are incompatible
while it doesnt make much sense to me since i dont know the inner workings of xl and 1.5 but it does sound like your trying to run a diesel car on cola
Nope. Idea is to find out if I can make a 1.5 Lora to be loaded with a xl gen
More like trying to make diesel work with gasoline x)
Both are combustible, but combusts under different environments.
Testing new A.I chat with Darth Vader in Fortnite https://youtu.be/r37wXl-i6Ik
you don't have to test that. It won't work.
Is anyone here good with photomaker v2 in stable diffusion?
Nvidia GTC Taipei 2025 keynote should be starting in a bit https://www.youtube.com/watch?v=TLzna9__DnI the live pregame show just ended
https://github.com/smthemex/ComfyUI_YuE?tab=readme-ov-file I don't understand this. Yue is for making music how is it link to comfy ui workflow
Hey, I want to create raw footages like the one in this video https://www.youtube.com/watch?v=QIQYzi7IPGk. I can act out myself but change the background and the face with of someone else's. what all softwares do you suggest me to use
what's comfy ui?
like a multipurpose board to plug in different models?
comfy ui is just workflow management program mostly for image diffusion
Can I make a video using that?
will try it out thanks
I think so
PLEASEEE
What gpu do you have?
The background is easy as its a single image but the faceswap in video live takes some capacity
Post process too
Anyone have any good models for sd sdnext or even site that can make the talking baby videos
Hi, I'm thinking about saving up and investing into a new computer for making AI art, particularly comics like a Webtoon. Right now I'm working with an NVIDIA RTX 2080 Super and I must have done something wrong because generating one picture took up to an hour a couple times, and the computer froze. Usually it's 2 or 3 minutes, that's okay, but I add enough LoRAs and controlnet influences and it's a cry for help.
And that's with Xformers and LowVRam.
I newly installed Automatic1111, and I hear good things about ReForge, so I'll try that soon.
Can I ask what you guys are using, and what you would recommend? This would mean so much to me, seriously.
Why use lowvram with a lot 2080 super ?
I'm still learning how the program works, so I thought that helped and might have been mistaken.
vram and processor are key for good and fast image generation
by using lowvram the program is splitting the model into multiple parts, loading and unloading them into the gpu whenever they're needed. Thus wasting a load of time when you could simply load the model at once.
2080 super are 8gb so --medvram-sdxl should be the right spot.
aka using --medvram only for sdxl model
I use the checkpoint WaiNSFW with most of my initial attempts since I'm getting a lot of the art style I want from it, the kind that looks like a screenshot of an anime. I hear that model, on top of Noob Controlnet, and hires fix, and using multiple LoRAs if I'm using a style or more than one character, can be very demanding. I figured getting a more advanced computer would help with the generating time.
And considering I'm hoping to make enough images to make a webtoon-style comic, generation time and the computer not freezing is really important.
I just want to get this right, so I really appreciate your insight on this. Gaming specs make sense, but shopping with AI in mind is entirely new territory for me.
Not gonna discuss anything NSFW in here, but yes you must have done something wrong because it's not supposed to take 1h to generate one image with a 2080 and without any logs / screenshot of your settings / etc I can't properly help you. Also that would probably be more of a #tech-support kind of discussion
so is Auto1111 support dead ?
Hi folks, maybe it's a question already asked hundred times but can't find a good answer. I want to have AI generated role play heroes. How to generate images using this character? That means I want to retain the silhouette/face/clothes of given character
can anyone help, how can i save the prompts for image generated through flux in forge ui in csv format
hello
Yes, its been outdated for over a year
But you can always ask questions about its use in older models
Alright guys question, my buddy said he would be willing to swap HIS 4070 (not Ti or Super) just a standard 4070 for my 7800xt, I do work in stable diffusion and was wondering would it be worth it to swap? What would I gain in terms of it/s and what would I lose (going from 16gb vram to 12gb)
Going from amd to Nvidia you'd get a performance boost by not having to use zluda or directml, you could upscale etc
How long does a gen take you now in sdxl?
A 1024^2, 25 sample steps takes about 30s
I'm also wondering if the native cuda support will make it run better as I was just getting into animatediff and it's workability is to be desired or I could just be bad at using animatediff
With AMD that is
Well you could expect faster times thats for sure, and you could probably run flux schnell
And I'm assuming the native stable diffusion works better than automatic1111
What
Just use swarmUI, forge or comfy
A1111 is outdated and i have no idea what you mean by native stable diffusion
What I meant is you don't have to jump through all the hoops with nvidia to get it to work like you do with amd
this, the first step for installing a1111 with zluda is update your amd drivers... turns out the newest drivers dont work with zluda...
bilibili's open-sourced this anime video gen model called Anisora. The V2 version, which runs on wan2.1, kills it for anime videos, like, it's really good. But right now, it doesn't work with ComfyUI. I'm really hoping someone can make it compatible!
thanks!!
GitHub address:https://github.com/bilibili/Index-anisora
Hello everyone, I'm a newbie actively learning Stable Diffusion (SD). Your guidance and support would be greatly appreciated!
Good morning.
i have gt 610 at the moment but can upgrade if that's necessary. Not live actually, dude we actually edited the same video for simon, but what we are trying to achieve is a similar quality video in which he only gives us the content and we make the whole video from scratch. Do you think its possible? Most of the people in our industry are using just elevenlabs and heygen right now but the video is easily spottable as one made with ai
Scam
Theres a github somewhere that was convincing quality but its beyond me
check this guy out for example https://www.instagram.com/thevarunmayya/?hl=en
people call him the pioneer lol
how do you find such repos?
Eh just Reddit or Google personally
bro trust me I have been on it for quite some time
everyone calls themselves the best until you see the results
its like finding a needle in a haystack rn
Sorry, WaiNSFW is just the name of the checkpoint, not that you have to do NSFW with it.
So with civitai shutting down does anyone know the status of the mirror/torrent tracker?
Peace be upon you. Can someone help me design a character with AI?
Hello !
Any lora of the meme tung tung sahur ballerina capucina?
There is only one lora and is not all the characters
If the biggest lora site doesn't have it i suppose your outta luck
for generation, my cpu rarely use up more than 15% of the resources
I am rocking an amd ryzen 5600
Some shameless ones that copy the ones from civitai without permission of the lora owners yea
It has a few like tensorart
But compared to civitAI its pretty bad
Once i get onetrainer working properly I won't use public loras anymore
anyone good at python. i have a basic python homework and need help
Hello!
Could someone please share a working download link for the SD 2.1 motion_adapter.safetensors file (v1.1, ~1.8GB)?
I'm looking for the official AnimateDiff v1.1 adapter (not SDXL or Lightning), SHA256:
5fd196988206cfac90bab1dc56cc59fdd12035492f74ff0662eb6eed5113c547
or
45b71fe98efe5f530b825dce6f5049d738e9c16869f10be4370ab81a9912d4a6
The official Hugging Face repo is down/private. If you have a Google Drive, Dropbox, or other trusted mirror, I'd really appreciate it!
(Please, no SDXL or Lightning adapters, only SD 2.1 v1.1!)
Thanks so much in advance!
Hello! I'm trying to learn Realistic Vision v6 and tuning. Currently images are still seeing a lot of noise with the following configuration. I'm wondering if anyone had some advice. Also I've seen recommendations for denoising steps/strength configuration, but I haven't seen those in the UI...
Interface: WebUI
model: realisticVisionV60B1_v51HyperVAE.safetensors (models/Stable-diffusion)
Upscaler: 4x-UltraSharp.pth (I added this to models/ESRGAN folder, but not sure if that's right. Also unsure of how to properly configure upscale)
Sampling method: DPM++ SDE
Scheduling Type: Karras
Sampling steps: 50
CFG Scale: 7
dimensions: 896x896
prompt:fashion photography, cover photo, black and white, female model, facing the camera, shot on Sony A7 III, photorealistic, close-up, monochromatic, flawless, ultra-detailed, hyperrealism --no freckles --chaos 75 --ar 1:2 --style raw --profile rsy6uq9 --stylize 750
negative:(deformed iris, deformed pupils, semi-realistic, cgi, 3d, render, sketch, cartoon, drawing, anime), text, cropped, out of frame, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck
Thanks!
why, with the latest version of the requests package, does it print "ID not 64 characters"?
I think the code needs to be upgraded
I have the requests version 2.32.3
@vapid dove
We really need more mods in here ong
Making a personal archive of civitai, and it might take a while lol. Seeing as sadly the script GPT made to fetch them also fetches irrelevant models.
SD 1.5: 1%|▏ | 1636/111713 [00:00<00:54, 2028.34model/s]
111k models, but a lot of them are irrelevant models like sdxl, flux and the like, so I have to use lora model manager to weed out the non SD 1.5 models now and then lol
Wan video has like 3-400GB of loras, same with hunyuan, so it'll be a few TB for the base model lora's i need lol
Hello guys, I am pretty new to the idea of fine tuning diffusion models. Are there any resources y'all would recommend for me to look at, to get a basic understanding of how to train a model to get my desired output?
Thank you :D
i can't seem to install stable diffusion, problem after problem, currently can't seem to install CLIP
What ui are you using?
stable-diffusion-webui
Ahh a1111 outdated stuff if I'm being honest
I recommend forge or swarmUI/comfyui, in the techsupport channel there's some install guides (pinned messages)
thank you i will try that
If your on amd, definitely the forge guide
Nvidia GPU
12GB, im trying it out with a Tesla P100 I got cheap. I'm new to all this. Got a Xeon 12 core CPU and 32GB RAM
it was supposed to be 16GB but ... typical eBay
Hmmm I'm not sure how fast the Tesla is but you should be able to run XL probably just fine
Maybe flux schnell
I heard its a little slower than a RTX 2060 but with more VRAM so we shall see. I'm not expecting miracles from it to be honest. Just want to try all this and see how i get on. Might get an RTX 3090 or something later on.
Is there an install guide for Linux?
no. but if you ask on the comfy discord, you'll probably find someone that can walk you through it. most custom nodes won't run though, they're written for windows
i'm upgrading my gpu to a 5070ti in a few days, is my current pc setup good enough to not terribly bottleneck the gpu for stable diffusion? 7800x3d and 32gb ram
i need help to be able to run this command in python
## import the libraries(instant)
from diffusers import AutoPipelineForText2Image, DPMSolverMultistepScheduler
import torch
## load the model to cuda(should download the model automatically, time depends on your download speed)
pipe = AutoPipelineForText2Image.from_pretrained('lykon/dreamshaper-xl-lightning', torch_dtype=torch.float16, variant="fp16")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")
## inference time(should take a few seconds or so)
prompt = "portrait photo of muscular bearded guy in a worn mech suit, light bokeh, intricate, steel metal, elegant, sharp focus, soft lighting, vibrant colors"
generator = torch.manual_seed(0)
image = pipe(prompt, num_inference_steps=4, guidance_scale=2, generator=generator).images[0]
image.save("./image.png")
i am getting error after error but it should not be that difficult
i cant find the right information for what version i should have of diffusers, transformer, cuda, pytorch etc
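A quick way to answer the "which versions do I have" part before asking for help is to print what is actually installed. This is a minimal sketch; the package list is an assumption about what the snippet above depends on, not an official requirements list.

```python
from importlib import metadata

# Assumed package names (PyPI distributions the diffusers snippet above
# most likely relies on); adjust the list to your own environment.
PACKAGES = ["torch", "diffusers", "transformers", "accelerate", "safetensors"]


def installed_versions(packages=PACKAGES):
    """Return {package: version or None} without importing the packages."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def format_report(versions):
    """One line per package, ready to paste into a support channel."""
    return "\n".join(
        f"{name}: {version or 'not installed'}" for name, version in versions.items()
    )


if __name__ == "__main__":
    print(format_report(installed_versions()))
```

Pasting that output together with the full error text usually gets a faster answer than describing the error in words.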
Maybe you might share the error you are getting in the #tech-support channel
Which program are you using? forge? comfy?
They usually go in the checkpoints folder
it's just a generic format for matrices. It depends on what's in there
its ok i figured it out. It's working now but my god is the text to image a little crazy haha some wild results. This Tesla P100 is working well too.
which models do you use for realistic human images?
like real humans
that don't look fake
Would it be possible to get a workflow in like comfy that can generate full material assets? Like it can make:
Color Texture
Normal Maps
Metallic Maps
Roughness Maps
Specular Maps
Ambient Occlusion (AO) Maps
Height / Displacement Maps (via tessellation or parallax)
Emissive Maps
Opacity Maps
Anisotropy Direction Maps
Subsurface Maps
Refraction Index Maps
Clear Coat Maps
How much of this would even be possible with all the crazy stuff you can do with comfy UI and could there eventually be a full material generation workflow
who needs vps - 8 gig of ram, 4 cpu and $12 a month
Hi. Anybody online?
@median dome How can we help you ?
Hi. actually im messing around with swarm ui and wan video and it takes a long time to create the video. i was just looking for a way to pass the time^^
are you using wan video?
WAtch youtube, movies, shows, anime. Or play games. Ez.
I'm using wan video
well i felt like chatting. i also like to learn more about ai creation. im just a beginner but its a very interesting topic
Is anyone here familiar with "mimofr"?
I'm trying to figure out what kind of AI he's using for his wizard videos
Trying to create something similar


Hi Guys!!
hello!
Is stable diffusion dead? Are there any other open source image generators?
flux
but why should it be dead? in particular, sdxl and its derivatives are still very much alive
Are there any SD Zluda AMD users here? I feel like generations are abnormally slow. A flux 1000x1200 2 image batch literally estimates 1 hour 30 minute completion time
Illustrious is maybe like 2 minutes for 2 images
I cant generate anything with SDXL since it keeps loading and loading
I don't know if anyone is having similar issues
in the #tech-support channel. you want to talk to @warm junco
what interface are you using to run it in?
I'm using Stable's API to run SDXL in my platform
don't suppose you'd DM me your code so i can look at it?
Hey, sorry just seeing this. I was able to fix it after all, thank you though
If I take a picture which has a white background and I img2img it with 1.0 denoise and controlnet, why on earth does it keep generating a light background no matter what? Like, how does it even work? How would it know it has a light background in the first place? I'm trying to understand how it works, it's starting from pure noise and only has lineart contours to go with
because it has to generate something and it can't create transparency
what do you need help with?
there are two reasons that both contribute to that:
1.) You often do not start at timestep 100% (pure noise), but at some timestep shortly before the 100%
The reason lies in the math of epsilon prediction diffusion (the one most models like SDXL and SD 1.5 are using. Note that SD 3 and Flux are using Flow matching instead)
the reason is that in epsilon prediction the model predicts the noise and not the image. So given a noisy image the model figures out what the noise looks like, then this noise is subtracted from the image
however, if you gave it a pure noise image, then after subtracting the noise the image would be empty. So denoising from 100% noise would not make any sense
that's why you start from, say, 99.9% noise. You use the underlying image (by default just a blank image, which is somewhat gray) for the remaining 0.1%.
This is the reason why images in SD 1.5 often look "grayish", "unsaturated", "medium brightness": the model still reconstructs from a gray image
When you do img2img from white images, the white part often appears in the resulting denoised image too, even though you denoise with 99.9%. The reason is that white has a very high value in the latent and, thus, is still visible through all the noise
2.) In diffusion you mix your signal image with random noise. But what does mixing mean? Basically, you multiply your image with a value called alpha and your random noise with a value called sigma and then you add both together. Alpha and sigma depend on your timestep. Intuitively, you would think that alpha+sigma is always 1, and this is the case, but only for rectified flow matching. In Flux or SD 3 it's very simple. If you are at timestep 50% then this means that you multiply your image with 0.5 and your noise with 0.5 and add both together.
In SD 1.5 or SDXL it's more complicated. Here, alpha and sigma are determined by your noise schedule and they don't have to sum up to 1, nor do they start at 0 and 1. So you might always have some remainder of your input image in your noised image, even if you started at timestep 100% (which you are not doing in epsilon models, see 1.))
btw. this also means that this effect only occurs in SDXL and SD 1.5
if you use Flux/SD3 and use 100% denoise then you should not see anything from the input image
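To put rough numbers on the alpha/sigma point, here is a small sketch. It assumes the "scaled linear" beta schedule with beta_start=0.00085 and beta_end=0.012 over 1000 steps, which are the values commonly found in SD 1.5 scheduler configs; the function names are made up for illustration.

```python
import math


def sd_scaled_linear_alphas_cumprod(num_steps=1000, beta_start=0.00085, beta_end=0.012):
    """Cumulative product of (1 - beta_t) for the assumed 'scaled linear' schedule."""
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    values, running = [], 1.0
    for i in range(num_steps):
        # betas are linear in sqrt-space, then squared
        beta = (lo + (hi - lo) * i / (num_steps - 1)) ** 2
        running *= 1.0 - beta
        values.append(running)
    return values


def epsilon_mix_coeffs(alpha_bar):
    """Epsilon-prediction mix: noisy = alpha * image + sigma * noise."""
    return math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)


def flow_mix_coeffs(t):
    """Rectified-flow mix at time t in [0, 1]: alpha + sigma is always 1."""
    return 1.0 - t, t


if __name__ == "__main__":
    alpha_T, sigma_T = epsilon_mix_coeffs(sd_scaled_linear_alphas_cumprod()[-1])
    print(f"epsilon model, last timestep: alpha={alpha_T:.4f} sigma={sigma_T:.4f}")
    print(f"flow model, t=1.0: alpha, sigma = {flow_mix_coeffs(1.0)}")
```

Under these assumptions the image coefficient at the very last timestep is small but not zero, so a bright input can still leak through, while the flow-matching mix reaches exactly zero image at t=1.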
Hey thanks for the time for the comprehensive explanation mate
Where do model creators go these days to upload their models? As after civitai's latest restrictions due to asshat-visa's demands, appears at least wan and hunyuan model uploads has quite died down there.
Tensor art works but ironically that place has less censorship with many more problems
civitAI has been hit with the no nsfw so the site where borderline (or actually illegal depending on where you live) stuff is hosted gets popular sadly
lol, civitai is far away from "no nsfw"
people are crying cause they can no longer generate deepfake-like porn with their favourite actors and call that censorship
Not NSFW entirely, more "against their will", and other stuff like piss and diapers, that sort of stuff.
Oh definitely, love the payment processors deciding everything nowadays
Even sfw gens aren't allowed of certain people now
Right? Just become worlds most dominating transaction company and change the world /s
Thank god opensource is a thing so the restrictions arent hurting everyone
It's not so much about the open source thing, it's more hurting users wanting to access models, as civitai is the most known, and if they really want their piss, they need to make the models themselves, as civitai will not host them.
Its not about piss but okay
Hi. Anybody on who knows swarm ui?
you can host models on huggingface...
Be cautious: scammers are using curated GitHub repos to impersonate AI developers. Stay sharp out there.
anyone in here focused on the api side of ai-video development?
Anyway to create longer videos with wan? If i extract the last frame and use it to i2v it looks like a strange cut in color and saturation
do you mean making an API or using an API?
Docker + Kubernetes + standard python web dev libs works fine
if you need enterprise-grade stuff this won't do but its fine for most uses even up to small startup size
its not rly worth re-inventing the wheel
Aye, but i've no clue how i'd find specific models there :P It's not like civit which has tags, base model and so on for search filtering :P
Oof, way too late attempt to archive civit's lora's, and so far, like 3-5% of civit's lora's has been wiped.
google search for the model - example: this_model.safetensors
they come up
And what if i wanna see what's new that people has cooked up? :P Can't know the name of what i've never seen nor used x)
you're not gonna see that on civit, either.
Nope, so i've been looking for any alternatives to be popping up, so far none 
hi
Hello
what is the best image generation model currently
best for what?
yeah kinda depends on what your use case is. if you want a good all rounder, probably Flux or HiDream. If your aim is anime, cant go wrong with Illustrious or NoobAI. If you want realism, Flux and Pony seem to dominate that area. And if you want creativity, like unique art styles, Flux, HiDream and NoobAI seem to handle different art styles really well.
In actuality, you can use any image generation model to make any type of image, assuming the checkpoint is fine tuned or trained for it. Some behave better and handle prompts for certain types better than others, and the more popular one is in a specific type of image, the more variety you will have of checkpoints
Dumb question: stable diffusion has no easy-to-use app to generate images, videos, and sound using their models, right?
When is the next upgrade coming? SD is getting far behind...
?? Sd, flux etc don't got their own apps
They got sites maybe
But for local comfy/swarm/forge is as easy as it gets
Swarm being the best out of 3
Imo
And they offer an api for use in said UI's
Google, openai, midjourney etc developed an official UI.
okay but can you run those locally?
no
developing a UI for only their specific params, when open source lets you mess with it how you want, is kinda different, and it's really not necessary in the current ecosystem of open source models to have a Stability-only UI
and correct me if I'm wrong, they do have one, just a website: https://stability.ai/stable-assistant
Hence the question to ask when they are going to upgrade the models to achieve something that aligns with current standards?
sd3.5 large is still pretty solid
Sadly sd 3.5 stock is terribly tuned vs flux.
No one knows. Like they just released a new sound model for creating music with lyrics and all. Ain't as good as suno in sound quality, but it's not bad at all for being free and local.
New "next gen" models only arrive as fast as people discover new ways of doing things/accumulate dataset and word things properly to teach the hallucinative models to be more accurate.
Hi
hello
Hey everyone! I'm an AI engineer focused on generative image pipelines. I've worked with Stable Diffusion (1.5 & SDXL), LoRA fine-tuning for consistent visual styles, and ControlNet for pose/structure control. I've also deployed models via Replicate and Hugging Face. Always open to collabs or contract work, especially if it involves stylized character generation or creative AI tools!
you misunderstood this. Official UIs are a bad thing. They usually lack features and are technically behind. The great thing about open source is that you have plenty of UIs to choose from, all competing against each other and coming up with new features regularly
Flux has been SOTA for a long time now. openai's Sora is quite new and, yes, its prompt understanding is much better than any open source model's. But it's also much newer
in 3D and audio they are, they made SOTA releases this year
if you look at their news they are repeatedly hiring 3D people this is probably their new direction
Stepfun released a 30B DiT and the scaling from 14B Wan to 30B Stepfun was not spectacular
I think this means merely scaling DiT has reached its limits
so everyone needs new methods anyway
Is it me or is civitai barely functional recently?
what current standards are you referring to?
civit seems to have become a bad taste in a lot of people's mouths in the last 48 hours
I agree that creating a 3D environment is the future as it allows to keep coherence, and correct shoot angles, lighting etc as well.
Current standards are better in text writing, for example.
It's the same thing that happens to every software product
They just become bloated and shitty
Civit exists only because it's the central location to flock to. I'm sure everyone would go with the wind if a new player came to town
https://youtu.be/a6xbUBTlf2s?si=4i6Yr7xIOE8RoIga i finally added my voice to my videos!
Hi, I'm a fullstack developer having some fun with image generation for a sideproject, having some issues with Stability API I get this error:
An upstream service tried to fulfill this request, but a required resource was not found,
when using the Google Colab from the main website:
https://platform.stability.ai/docs/api-reference#tag/Generate/paths/~1v2beta~1stable-image~1generate~1ultra/post
How to solve this? I want to use Stability API
How to do this? Invite is invalid it says
Scam
That is a scam link dont join
1326848249579049002, userID @vapid dove another discord scam support link
hey @rain sandal same thing here. I'm developing an app for which Core and Ultra are integral, but both are kicking out those errors. I thought it was my configuration until I checked this link https://stabilityai.instatus.com/
I solved my issue by using a different api key , from a different account , and am no longer mentioning Bearer in the authorization which also seems to help
{"authorization": STABILITY_API_KEY, "accept": "image/*"}
Hope that helps!
Any working Data Scientists here?
hi guys im using sd webui replacer, and every time i generate images, the models SAM and grounding dino are loaded each time. is there anyway to prevent this for faster speeds?
About 90% of the time that I generate a set of images from the same prompt (with Flux Dev), the first generated image is the best of the set (of 4, of 8...). Any reason for that? There can't be
Attempted to download all of the remaining lora's from civit.. 10794/369564, 3% done, 1.5TB so far lol
Good morning. Has anyone here already installed Stable Diffusion on PC? I will need help
Best to start with what you actually need help with :P
Also, SD on pc, what else would you have it on?
(i guess there's also on phones, but that's so super niche barely anyone here does it lol)
I want to fix an error message (I can't show you screenshots so it's hard to describe)
I mean not online version
Aye, i prefer screenies as they are easier to explain than words :P
Well, technically they are "online" as it's webui after all 
sorry but for some obscure reason I can't send any here ><'
You can use channels like #tech-support or #off-topic to share screenshots.
Indeed. Lets hop into TS
Thanks
Preshoting the bots and scammers.
There is no "official external support discord" of any sort. Don't get scammed.
.
So, uh, with SD 3 being a complete failure, what is there to look forward to beyond SDXL/Pony/Illustrious?
I'm trying to create a LoRA of myself using 15 selfies but the results are pretty bad... I used a paid service using the same 15 pictures and the results were decent, so I know the training material is sufficient, theoretically. Any tips?
Or resources
Its pretty good (sd large) but they are working on mostly 3d rn i heard
And music

But there's flux, wan etc that's still being worked on
I just wish there was a more efficient model
Less vram requirements for like Illustrious quality
The new illustrious based models have some pretty advanced natural language prompting and non-anime artistic capabilities. Has there been any advancement in the non/anime SDXL world in this area?
which model do you use?
from my experiments so far, Flux is by far the best for photorealistic images of the own face, but SDXL is a bit better with transferring the face into different styles (anime, comic and so on)
I'm using Google Whisk. I have an image I like, but I want to change only certain characteristics of it, not the main image. Is this not possible in Whisk, and if not, what route must I go?
Scam
is it possible to run LTXV in SD ?
you're confusing SD with whatever client you were using previously to run the SD model.
SD is a type of generation model and so is LTXV
if the question is "Can Automatic1111's stable-diffusion-webui use LTXV model" then the answer is no
Need opinions on how people store their loras for a script i'm working on to sort and de-duplicate models by hashes.
How do you guys store them?
sdxl/funny_cat.safetensor for example?
Or animals/sdxl/funny_cat.safetensor?
Or how do you sort your loras?
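For the de-duplicate-by-hash part, a minimal sketch of one way to do it: hash every model file and group paths that share a hash. The suffix list and function names are assumptions for illustration, not taken from the script being discussed.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so multi-GB models never sit fully in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root, suffixes=(".safetensors", ".ckpt", ".pt")):
    """Group model files under root by content hash; keep groups with more than one file."""
    by_hash = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in suffixes:
            by_hash[sha256_of(path)].append(path)
    return {h: paths for h, paths in by_hash.items() if len(paths) > 1}
```

Because the grouping depends only on file contents, it works the same whether people store files as sdxl/funny_cat.safetensors or animals/sdxl/funny_cat.safetensors.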
Speaking of, if there's any mods active right now, what's the mod role to ping for scams?
Nope, no mods just fruit n maxfield
I sort em by archetype > version > type > lora
So illustrious > v0.1/v1/v2 > animals/style/pose/modifier/character> lucy.safetensors
Bit more work but this way I know what i have and don't have for my models
But since you're archiving everything on civitai I suppose your method is better
If none are present, you can try to @ any active "community guides".

Noted :)
Gonna see if i can make the script have a config.json with the sorting script where they can choose options, easily just replace "1, 2, 3, 4" etc, just simple numbers, with a commented out list below of what each number sorts by
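A rough sketch of how that numbered config could work. Everything here is hypothetical: the number-to-field mapping, the metadata keys and the "_legend" entry are made up for illustration. Plain JSON has no comment syntax, so the legend is stored as an ordinary key instead of a commented-out list.

```python
import json
from pathlib import PurePosixPath

# Hypothetical numbering: which metadata field each config number stands for.
FIELDS = {1: "base_model", 2: "version", 3: "category"}

# Example config.json content; "_legend" replaces comments, which JSON lacks.
CONFIG_TEXT = """
{
  "order": [1, 2, 3],
  "_legend": {"1": "base_model", "2": "version", "3": "category"}
}
"""


def target_path(meta, order, fields=FIELDS):
    """Build folder/.../name.safetensors from the chosen field order."""
    folders = [str(meta[fields[number]]) for number in order]
    return PurePosixPath(*folders, meta["name"] + ".safetensors")


if __name__ == "__main__":
    order = json.loads(CONFIG_TEXT)["order"]
    meta = {"base_model": "illustrious", "version": "v1", "category": "character", "name": "lucy"}
    print(target_path(meta, order))
```

Swapping the numbers in "order" changes the folder nesting without touching the script.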
dog ¯\_(ツ)_/¯
hello
Years ago, yes.
@abstract quarry did you use it?
I'm trying to put image support on it
Like, analyze and generate images
I switched to LMStudio
is Gemma3 multimodal ?
yes
Then there is no option for images with the current setup I have
I suppose I would have to add an extension or something
we're looking for a comfyUI designer to join our agency full-time, long term position, fully remote. anyone interested?
you might wanna offer more details,
making workflows or making use of custom nodes or even making custom nodes
what is your agency for? what does it sell?
compensation?
hardware/cloudgpu provided?
I used sdxl, I'll give flux a shot, thx
Would you guys take interns? And yeah can you give more details
seeing you're UK based,
will payment be in Pounds, Euros or Dollars?
What do you envision to be the workload to be? making workflows 40 hrs a week seems hardly realistic
or is it purely marketing (example, sell a product in an AI environment like a new bag from a brand)
I can give you all the job info on private maybe? don't wanna spam the group... in case they kick me out lol
yeah this can be an option too
Hi,
Does anyone know how to set up the image to video in n8n?
I've set up the text to image model and want to convert that into a video, though I'm not sure how to convert the image into readable binary format
looking to make detailed depth maps from images, to achieve 3d sliced laser engravings... anyone got experience of good models or workflows for this?
I wanna take a current poll of how many people have failed to watch/read ONE PIECE AKA the greatest fictional story ever told. Give me a percentage/number for the sake of history. We will keep a record. Thanks and goodbye~
image
how well will a 2080ti do in SD ??? I am using a 970 gtx ... will i notice a massive upgrade ?
Do you need support
So couldn't someone just get an ai to write an ai image generator model that is only 1GB that can generate pretty much anything. I mean stable diffusion for example is just a bunch of what do you call it code or zeros and ones. I think it's possible if you had a strong enough computer. Youll get hit and misses but its doable. And that goes for any other ai generator. Generates in .01 sec and uses less than 1gb of vram/ram. (an AI model generator)
PMD
do it, I dare you.
that's most likely a scammer who's gonna send you some private discord stuff tricking you into installing malware.
there is no "external support discord".
the support place is #tech-support and is community driven.
how do I prompt for squinting eyes? I always get a frowning face instead of squint in attempt to see better
no wonder he has not answered and is still online for hours
Does animate diff for sdxl automatically reduce quality of generated frames?
@valid river
Ų²ŲØ Ų·ŁŲ² ŁŲ³ŁŲ³ Ų§Ł Ł
translate
I do know animate diff is outdated and doesn't follow frames very well, what gpu do you have?
You might be able to use newer video models
4090
Try looking into WAN2.1 with the causvid lora, high quality video should take like 2-4min then
You'd need to use swarmUI or comfy ui though
listening to this banger
https://open.spotify.com/intl-de/track/4B0AzVyr78p6VK1gbJ0hsh
I don't understand a word
but it sounds aggressive
I think it's dutch
https://huggingface.co/Agents-MCP-Hackathon anyone want to join force with me
Do you mean the model will just provide training code or create a model from scratch? First is definitely possible right now, but the 2nd is not possible at all right now.
A 1B-parameter model at 16-bit precision is going to have 16,000,000,000 bits (0s and 1s). Even current LLMs can barely produce a coherent 100,000 tokens, and they suck at anything too far out of domain.
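The arithmetic behind that figure, as a tiny sketch. It assumes 16 bits per parameter (fp16/bf16); the helper names are made up.

```python
def model_size_bits(num_params, bits_per_param=16):
    """Total number of bits needed to store the weights."""
    return num_params * bits_per_param


def bits_to_gigabytes(bits):
    """Convert bits to decimal gigabytes (8 bits per byte, 1e9 bytes per GB)."""
    return bits / 8 / 1e9


if __name__ == "__main__":
    bits = model_size_bits(1_000_000_000)
    print(bits, "bits =", bits_to_gigabytes(bits), "GB")
```

By the same arithmetic, the 1 GB file from the original question would hold about 500 million 16-bit parameters.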
it doesn't make sense at all. llms can only perform tasks they learned from humans. They can code, but only stuff they learned and they still do a lot of errors. An experienced programmer can definitely write better code than llms
and creating generative image models requires training data and compute power. Writing code is the smallest issue here
I have two questions: Do you have techniques for the AI to take into account all the elements of the prompt? And can we insert a reference image into SD?
reference images can be included with ipadapter. For specific use cases there are also control nets
ipadapter is a software ?
an adapter
it's like an extension that allows image inputs into the diffusion model
there are different adapter models. Some are specialized in transferring style, or faces or content of an image
this exists on some level its called hypernetwork
but they can only create very very small neural networks
hi, everyone
hi
how can i put separations in my prompt for better navigation? i get lost in the prompt sometimes
the usual practice is using a comma and space
,
so youd get:
1girl, retro, goodquality (horrible prompt for non illustrious/pony models)
sorry, i need to be more clear.
i want to separate between groups of prompts.
example:
dynamic perspective, good anatomy, looking at viewer [enter separation] black hair, long hair, brown eyes, [enter separation] vibrant colors, cel shading, detailed background
my goal is not to influence the prompt with the separations. I just want to make it easier for my eyes to navigate the prompt when i make changes
hmm it depends on the UI but in SwarmUI i use <break> to split up the prompt (and tokens)
in forge it was BREAK but I'm unsure
Do you guys think ComfyUI tools like LTX/Wan etc can compete with the quality of Kling?
I'm not saying it can't I actually don't know and wanna hear opinions
what do embeddings do?
on their own they don't do anything they are just some data
ah
im using comfyui so what do i do with those?
i put it in the embeddings folder but idk if i have to do anything else
Wan and Kling have very similar blind winrates so they are around the same quality
only Veo series are significantly better than Wan
oh you are referring to clip embeddings ok
you need to type something in the prompt box to use them
ah okay ty
Interesting good data to have thx
hello everyone
since civitai removed their real people/celebrity loras, does anyone know of an alternative site that hosts them?
I've been searching for days, so far heckall sadly.
Only places i can think of is huggingface, but you gotta nearly be a wordsmith to find out the right combinations of keywords to find someone's model repo of celeb loras
can somebody recommend an Illustrious Lora for expressions?
I want to use it on anime/stylized images
this one is pretty good imo
linked above mb
0.6 works best for me
@atomic mortar nice, thank you so much š¤š¼
Models: 5%|▌ | 19922/369564 [16:14:32<1349:56:34, 13.90s/model
Damn, near 20k models, and think it's closer to 3TB of loras now lol
Also, do you guys know of a git that can safely convert a ckpt to safetensor in a contained environment? As even when converting, maliciously injected ckpt's could be executed while converted to safetensors
Ckpt were pickle files right?, im not sure about contained but i suppose you could use a google colab solution
Aye, downside, is that as you see right above my comment, i got a few 1000 lora's archived on my server, and loads are the old ckpt, so need to have them converted and pruned.
I'm slowly thinking of just running a VM with windows and converting them by taking the auto1111 extension for converting them, dump into GPT and have it make a script for me to run it standalone, and access the lora location and it's folder structure to auto prune all that can be pruned.
if ckpt is the pickle format, then converting that to safetensors is a single line of python code
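A hedged sketch of what that conversion can look like. It assumes PyTorch and the safetensors package are installed, and that the checkpoint is either a flat dict of tensors or nests them under a "state_dict" key; both function names are made up. Loading with weights_only=True asks torch to unpickle only tensors and basic containers instead of arbitrary objects, which addresses the malicious-pickle worry, though running it inside a VM or container is still the more cautious option.

```python
def unwrap_state_dict(checkpoint):
    """Many .ckpt files nest the weights under a 'state_dict' key."""
    if isinstance(checkpoint, dict) and "state_dict" in checkpoint:
        return checkpoint["state_dict"]
    return checkpoint


def convert_ckpt_to_safetensors(src_path, dst_path):
    """Load a pickle checkpoint in restricted mode and re-save it as safetensors."""
    # Imported lazily so the helper above works without torch installed.
    import torch
    from safetensors.torch import save_file

    # weights_only=True refuses to unpickle arbitrary Python objects.
    checkpoint = torch.load(src_path, map_location="cpu", weights_only=True)
    state_dict = unwrap_state_dict(checkpoint)
    # Keep only tensors; save_file expects contiguous tensors.
    # Tensors that share memory may still need cloning before saving.
    tensors = {
        key: value.contiguous()
        for key, value in state_dict.items()
        if isinstance(value, torch.Tensor)
    }
    save_file(tensors, dst_path)
```

A checkpoint that fails to load in this mode contains something other than plain weights, which is a reason to set it aside rather than force it through.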
Oh no digital currency scams again @vapid dove , deliberate attempts to bypass the filter too
does anyone have experience training a lora in kohya? im running one with 50 images and its taking 7 hours per epoch, i cant imagine this is the full performance of 64gb ram with an rtx5080
ram doesn't matter. You have to train on your gpu
7 hours per epoch sounds like you have not enough vram
a new flux model!
and there will be an open weight distilled version of it soon!
Hello everyone!
I'm new to the server! I wanted to ask a question: I'm starting to use Stable Diffusion with automatic1111. I have an Asus RogStrix laptop with a GeForce RTX 4060. Which base model would you recommend using?
I've been testing with epicrealism_naturalSinRC1VAE.safetensors and generating some styles to add more realism, but whenever I make a prompt without specifying clothes, you know what happens... What do you recommend?
Thanks a lot in advance!
with that gpu you probably use sdxl or sd3.5 medium
yeah, lot of custom models are full of porn. Strong base models like flux, sd3.5 or sdxl turbo don't necessarily need custom models. It depends a bit on what you want to generate
Thank you very much, it's for generating realistic images
i thought 16gb was enough. task manager does show 15.4/16 gb, maybe its throttling?
I think realviz is the typical model on sdxl for realism
I don't know what you train.
50 image lora no regularisation
yeah but what rank, what model, what resolution
512 squared px, rank idk, no option in my current kohya installation. sdxl 1.0
oh, sdxl? that should not need much memory
in this case I assume your cuda is not working
you probably train on your cpu
you can check by running the python code
import torch
print(torch.cuda.is_available())
in the same console window where you run kohya
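A slightly longer version of that check, as a sketch: it also reports which GPU and CUDA build PyTorch sees, and it does not crash if `torch` is missing from the environment. The function name `describe_cuda` is made up for this example.

```python
def describe_cuda() -> str:
    """Summarise whether PyTorch can see a CUDA GPU."""
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"
    if not torch.cuda.is_available():
        return f"torch {torch.__version__}: CUDA not available, training would run on the CPU"
    # Device 0 is the first GPU PyTorch can see.
    name = torch.cuda.get_device_name(0)
    return f"torch {torch.__version__}: CUDA {torch.version.cuda} on {name}"


print(describe_cuda())
```

As above, it only tells you something useful when run from the same Python environment that kohya uses.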
but more important: there will be a new flux model
on benchmarks their commercial model performs similar to ChatGPT. I assume that the dev model will be much weaker. But still: no need for ipadapter or redux anymore, we will be able to input conditional images directly in Flux
i thought so too, but task manager says <10% cpu load and 99% gpu
I can just say: sdxl should not need more than a minute per epoch when you train on 512px. If it needs an hour then something doesn't work. Most likely there is a memory peak, but sdxl is relatively small, it shouldn't need that much memory. You can check if it gets better when you train a rank 1 lora with batch size 1
is it possible that im using a faulty version of sdxl? i downloaded the safetensors from ~~civit ~~huggingface
don't think so
Hey everyone. Have you seen voidstomper? on instagram?
I come back with my question, but do you have any techniques for the AI to take into account as much of the information from the prompt as possible?
the AI reads everything in 75 token blocks. it also prioritizes whatever is at the beginning of the prompt more than the end. also, higher cfg = more strict prompt adherence, but its still a bit of a gamble what you get and too high of a cfg will overbake the image.
My problem is that I have a really precise idea of what I want (a few specific details). So my prompt is very precise and detailed. Except it seems like the AI always forgets a few things
thats the nature of AI. its not gonna get the image 100% correct because it doesnt "see" your vision like you do. the things i suggested above will help, but nothing will get it exactly perfect.
As long as I can strive as hard as possible towards what I want, that's fine with me. I will try your advice
you can always share your prompt with chatgpt or another llm, and ask it to segment it into 75 token segments with BREAK separations, that will help the model read your prompt better. and the llm should use logic to ensure it keeps things together that need to be together
since trying to math out the token count yourself is tedious
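If you'd rather script it than ask an LLM, here is a rough sketch of the same idea. It assumes a comma-separated tag prompt and estimates tokens as one per word plus one for the comma, which is only an approximation of what the real CLIP tokenizer produces. The function names are made up for this example.

```python
def estimate_tokens(tag: str) -> int:
    """Very rough token estimate: one per word, plus one for the trailing comma."""
    return len(tag.split()) + 1


def split_prompt(prompt: str, budget: int = 75) -> str:
    """Greedily pack comma-separated tags into chunks of at most `budget` estimated tokens."""
    tags = [t.strip() for t in prompt.split(",") if t.strip()]
    chunks, current, used = [], [], 0
    for tag in tags:
        cost = estimate_tokens(tag)
        # Start a new chunk when the next tag would overflow the budget.
        if current and used + cost > budget:
            chunks.append(", ".join(current))
            current, used = [], 0
        current.append(tag)
        used += cost
    if current:
        chunks.append(", ".join(current))
    return " BREAK ".join(chunks)


# Tiny budget so the split is visible; the real budget would be 75.
print(split_prompt("1girl, blue hair, red shirt, standing in a market", budget=6))
# prints: 1girl, blue hair BREAK red shirt BREAK standing in a market
```

Unlike an LLM, this only splits on commas and never moves tags around, so related tags still need to be written next to each other by hand.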
as for cfg scale, depending on the model you're using to generate the image, i wouldnt go higher than 10 as a general rule. some models will work up to 12 or 13, for anime models they prefer 5-7 with a soft cap of like 9.
Does anyone know what voidstomper uses?
what model do you use?
older clip based models like SDXL have only very weak text understanding.
For best prompt understanding you should use Flux.
Also, flux doesn't have a limit of 75 tokens and you don't need break separations and so on
still, for very complex prompts you might have to use inpainting. Flux's text understanding is not on the same level as ChatGPT's Sora
i didnt say there was a limit of 75, just that it reads the prompt in 75 token blocks. but yeah, flux has much better prompt adherence than most other older models
yes, but that's not true for Flux
I use Elar-a
never heard
Ah you mean the checkpoint ?
yeah thats a new one to me as well
Or the sampling method ?
the checkpoint or the base model
i think they were asking about the base model, so the checkpoint would tell us
(oh, he meant Euler Ancestral)
If it's Euler a, yes. Is it bad ?
And so the checkpoint is WAI-NSFW-illustrious-SDXL
so you're using illustrious base, and you're using Euler A sampler
it only understands these danbooru tags
not entirely true, you're correct about that particular model though
the new illustrious 2.0 base has natural language understanding
and any merges using that base will as well
not to the degree of flux mind you
I use the version 12.0 if it can help
haven't tested it. I found all illustrious and NoobAI models so far quite limited. They don't understand anything that is outside their domain. However, they have extremely good understanding of any weird combinations of these danbooru tags
thats just the version of the checkpoint. i believe their newest version is 14.0, but its still not using a Illustrious 2.0 base
I don't know what is this danbooru tags :/
ive made several merges with 2.0 as a base, and tested both danbooru prompts and natural language prompts, getting relatively similar results.
search for tutorials for how to use illustrious
it's a special model that has to be prompted in a very special way
it still has an anime focused training though, and unless you merge loras for crazy concepts into it, some things will not be understood, especially really flavorful language
I don't use these kind of models so I can't help
https://danbooru.donmai.us/, its the method of "tagging" artwork used by this site. most older illustrious models (including Wai-NSFW-Illustrious SDXL) are trained on images that are "tagged" or referenced by these tags. The end result is a model that understands these better than spoken language. The prompts will look like "1girl, school uniform, blue hair, red eyes, standing, market, outside, daytime" etc etc
The 2.0 base can understand paragraphs of words written like you would normally write them, but its still not got the same level of inference and understanding of newer architecture models like Sora or Flux based models.
Ah ok. It's what I do generaly
Then the things i suggested to help you will work for you. Still wont be perfect, but it will get your closer. Then you just gotta either inpaint what it missed, or press generate til it gets it right
i will say, sampling methods matter, as do schedulers, to the end result. I find that DDIM sampler or Restart is good for images with crazy abstract backgrounds, whereas DPM++2M with Karras scheduler is good for low steps and high detail. Euler and Euler a pair well with DDIM or SGM Uniform, and are pretty creative samplers if you like a little creativity added in.
Of course, the model will play better with some samplers than others, so usually you should START with the recommended settings, and tweak as needed for your taste.
hello everyone
hihi
https://www.twitch.tv/nekomipii playing my favorite game league of legend lets rank up with 198 ms im i getting to gold today ??? playing NA and JP /maid cosplay
i get weird noisy colorful images from my epoch samples, has anybody ran into this issue?
i keep reading "abliterated" model - what does abliteration mean? no dictionary I know of seems to contain this word
assuming you're doing it for an llm
but if you want something less specific:
Abliteration modifies a language model to bypass the built-in refusal mechanisms that prevent it from generating responses to potentially harmful or sensitive prompts. This is achieved by comparing the model's activations on "harmful" and "harmless" prompts to find the direction associated with refusal, then removing that direction from the model, rather than by ordinary fine-tuning
thanks!
Context? Are you training?
Hi, thanks for your response. I'm training a 50 image lora on SD1.5 (SDXL gives good results but it takes 50+ hours where SD1.5 only takes 20 minutes) with 1600 steps. I will send you an example of my results in dm because i cant send it here
If you started with SDXL and swapped to SD1.5, make sure you aren't still using anything intended for SDXL in the workflow like the VAE. That would be the expected result.
Do you have a screenshot of your workflow, or would that be giving away too much information?
I have restarted my entire kohya server before swapping between sdxl and sd1.5
what do you mean by the workflow? just a screenshot of my settings?
Yes, what settings?
I will look them up, one second
Wait, I think I got it working actually, the time drastically went down from 40 hours to only 1:40
Oh geeze lol
This is awesome
I enabled no half vae, because you mentioned something about it, and it seems to work
i also discovered that i am working in the dreambooth tab of kohya and not lora
thanks for your help man!
Glad to help :)
Is there a way to prevent my image from looking crooked?
Improve the prompt probably
Hi everyone
Is it possible to use img2img to spoof real photos? The goal is to get 100 unique photos that look the same from 1 photo. Is this method better than ffmpeg spoofers?
Hi Guys, I am grateful to be here! I have a question: I am trying to morph the face of a photo into a Santa Claus head. The face of the photo should not be generated but taken as it is. Only the outer borders of the face should morph into Santa's head. I experimented with masks and distinct API methods, but I could not get it right. The nearest I could come is preserving the face of the photo while generating some Santa-related features over it. This is great but unstable/unreliable - the goal is to have a process that brings me the desired result as often as possible. Any ideas?
hi guys, I'm finishing recording my course on generating images for architecture with stable diffusion, it's about 25 hours of course. I finished the practical part and started the theoretical one
About the models, I will show their sizes, but I'm a bit out of the news. Should I show any other models besides the ones I chose?
SD1.5
SDXL
SD3.5
FLUX
For face "swapping" i would suggest pulid model with flux which works quite good with non extravagant faces. Newest model would be dreamO which is also good to swap faces.
how can i generate multiple characters on the same image, using different Lora for each?
Regional prompting or using a model like Flux is your best bet. Regional prompting is still a tedious process though, and requires some learning.
x,y,z script? are you using a1111 or comfy?
i am using Automatic1111
maybe with this, you can generate more than one image while varying some parameters, idk if it can be used for LoRA, you have to check
https://www.youtube.com/watch?v=1P-6con4xFQ
oh sorry i meant like multiple characters interacting in one image
maybe this, regional prompt
Latent Couple may work too
Hey guys! Sorry to ask out of nowhere, but I have a favor.
So I've been trying to make a LoRA for like 2 months now and I've learned a lot, but I just can't get any results. Right now, another person is helping me and said the reason I can't make a LoRA is because of captioning problems, and we've been back and forth with many possibilities on what could be the problem of my lack of success, not just captioning. Tbh, I'm just a bit demotivated right now because I've had no results despite trying a lot.
So my favor is: can you let me borrow a set of training images + captions that you have, that you were able to successfully make a LoRA with?
That way, I'll know whether it's a training image/captioning problem, or a parameter settings problem, or something else that's stopping me from creating a LoRA. If you were able to create a LoRA with the images, and I used those same images but i wasnt able to then i can assume that means its a parameter problem or something else and I can cross out one more possible cause from the list of possible causes of my lack of success in lora creating. I just want a taste of success. I feel like I've been heading nowhere despite my efforts, so I just want to know what it's like to successfully create a LoRA once, even if it's with someone else's training images + captioning. Would anyone be able to help me?
(sd 1.5, automatic 1111)
Are you using dreambooth?
if yes i assume thats the problem
try looking into OneTrainer/Kohya
ah thanks for replying. no I am using Kohya and the problem seems to be the captioning, after I solved that with the help of someone, my results are finally consistent and better ! just the face is still very distorted when full body and not close up face xD. man I didnt realize how important captioning was
Ah faces, especially in SD1.5 you need to use adetailer
ohh i will look into this !
why are you using sd 1.5 and not a newer model?
Can somebody help on how to train Flux 1.D Dreambooth models or fine-tunes (not checkpoint merging nor LoRA training) on Kohya_SS? I was looking for tutorials and videos but there are only a limited number of resources available online. I have been researching on the internet for the last 2 weeks but got frustrated so I decided to ask here. And don't recommend me this video, when I started with SD and AI image stuff I used to watch this channel but nowadays he is putting everything behind a paywall. And I'm already paying for GPU rental services so I absolutely cannot pay for patreon premium.
If anyone has resources/tutorials please do share here (at least config.json files which I have to put in Kohya_SS). If anyone knows other methods also please mention them. (Also it is hard to train any model via the Diffusers method and the result isn't that great, that's why I didn't do that.)
Thank You.
since using regionalprompter my CPU is running at around 95%.
SD freezes often while just writing or copy/pasting a prompt
is this a common issue?
Hello Can someone guide as to where I can download and how to install Stable diffusion? Noob here. Ty for your patience.
scam dont click
why would you want to finetune flux? There is not much you cannot do with loras
and yes, please don't pay for tutorials xD
another digital currency scam yay
@vapid dove scammer alert
what's a good safetensors model for buildings/environments?
Can someone recommend me a Lora training guide?
anyone got tips against prompt blurring? Like it mixes stuff from previous parts of the prompt with later parts. An example is I want a character to have blue hair and a red shirt, but stable diffusion makes the character have blue hair and a blue shirt with a red logo on it
man this week has been wild for reports.
@vapid dove
When you use more than one lora in an image, it is normal that either the quality of the image suffers and/or one of the loras loses its effectiveness?
hi
Yes. Think of lora as "patches" on a model. So if they operate on the same "zone" of the model they're gonna fight each other. You can try to tweak their weight in your prompt, use them in multiple passes, use regional prompting, use them along controlnet to constrain them depending on what you're trying to achieve, etc
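A toy illustration of the "patches" picture, in plain Python. It assumes the usual low-rank update, where each LoRA adds `strength * (up @ down)` to a layer's weights, and leaves out the alpha/rank scaling real implementations apply. The matrices and helper names are invented for the example.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight, down, up, strength):
    """Return weight + strength * (up @ down), the low-rank update."""
    delta = matmul(up, down)
    return [
        [w + strength * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy 2x2 layer and two rank-1 LoRAs that both touch the same layer.
base = [[1.0, 0.0], [0.0, 1.0]]
lora_a = {"down": [[1.0, 0.0]], "up": [[2.0], [0.0]]}   # pushes entry [0][0] up
lora_b = {"down": [[1.0, 0.0]], "up": [[-1.0], [0.0]]}  # pushes entry [0][0] down

patched = apply_lora(base, lora_a["down"], lora_a["up"], strength=1.0)
patched = apply_lora(patched, lora_b["down"], lora_b["up"], strength=1.0)
print(patched)  # the two patches partly cancel on the shared entry
```

Both deltas land on the same entry, so the second LoRA undoes half of what the first one did. Lowering one `strength` is the toy equivalent of tweaking a LoRA's weight in the prompt.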
Mixing style loras can have interesting results though, i usually have 4-5 different ones in use
It can be done for sure. But it's tricky at first :p
hi guys! do anyone knows a guide on how to generate 2D sprite sheet using AI?
Hi
hiiii, is what you're looking for to generate each frame individually and compile the sprite sheet later, or have AI generate the whole sheet in one go?
Is it allowed to search for freelancers here?
Hi. I have a quick question. I've been using stable diffusion with Fooocus for a while. But I really love the way Sora works. It gives great results with simple prompts with almost no anomalies or weird outputs. Is there any open source model that works similar to Sora that I can use on my device?
Flux
more like generate the whole sheet
Hi guys, hope everyone's having a great day!
I posted a help request in tech support, if anyone has experience with SD and AMD gpus, please check it out!
I would be really grateful if somebody lent me a hand
dont click, spam link
Scam
The link seems like a regular invite and doesn't have a mask, but is there a chance that if I already clicked it my discord api token is stolen?
when i Inpaint one face out of 2 characters, the inpainted face gets features from the other character... how can i fix that?
i am keeping the same seed and prompts when inpainting.
also i use regional prompter and Adetailer
Nope, they want you to install a remote control client or link a digital wallet
nobody in this discord that is actually trying to help you will send you a link or a DM to join another discord or private message chain, nor will they ask for any of your personal information outside of terminal logs for debugging. good rule of thumb is just dont click anything lol
yeah I just haven't been to large discord servers recently and kinda forgor
Unless its a hyper specific issue with comfyUI or similar but those are large servers too compared to the tiny scam ones
I've sent people to more suitable servers before as we dont get many lora trainers here
thats fair. i havent seen many of those instances. but i also havent been here THAT long lol
Large audience of any sort inevitably attract scammers. If they see someone vulnerable / asking for help of course they'll latch on them. And if they can do it in DM far from anyone's eyes then it's even better for them.
i just immediately report any link shared from someone who has recently shown up, only said "hi" or "hello" and then never interacted with the server in any way beyond that but is suddenly suggesting help lol works for me
case in point lol
Yeaaaaaah ...
anyone know a good model for buildings 'n such? or is this not really a server for that?
all the models I'm tossing into A1111 just keep giving me characters for obvious building prompts
put no human in the prompt
even with a negative prompt asking for no people, characters, etc.
and human or character into the negative prompt
but i dont have a particular model in mind for landscapes or architecture gen
i can look and see if i find one but its not really my thing
people, characters, humans, anime, cartoon, low resolution, blurry, bright colors, daytime, modern office, messy composition, jpeg artifacts, bad lighting, extra limbs, distorted anatomy, text, logos, nsfw
is my negative prompt :(
i know a lot of people are riding the Flux hype train
that's what I'm using currently, and I just got a ton of characters
lol
I saw it on CivitAI and it showed some neat examples of environments and whatnot, so I was like "this looks like the right one!"
https://shakersai.com/ai-tools/images/stable-diffusion/best-architecture-stable-diffusion-models/
according to this article, there are 7 that are recommended here for architecture gen with stable diffusion. i havent used any personally. all should be available on civit
oh, I think I know what I did wrong... I did the thing I normally do with characters, where I put a "high weight name of thing" at the front, and in this case it was "(The Hollow Veil:1.5)"... Veil
i know for a while a lot of people were using Juggernaut XL for realism, and landscapes in the example images looked really cool. but i cant say how much success youll have
I removed that and it seems to be working derp
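A small sketch for catching this kind of thing: it pulls every "(text:weight)" group out of an A1111-style prompt and flags the heavily weighted ones. The 1.3 threshold and the example prompt (apart from the "(The Hollow Veil:1.5)" part) are made up, and nested or escaped parentheses are not handled.

```python
import re

# Matches "(some text:1.5)" style emphasis groups.
WEIGHTED = re.compile(r"\(([^():]+):(\d+(?:\.\d+)?)\)")


def weighted_terms(prompt: str) -> list[tuple[str, float]]:
    """Return (text, weight) pairs for every "(text:weight)" group in the prompt."""
    return [(text.strip(), float(weight)) for text, weight in WEIGHTED.findall(prompt)]


def flag_heavy_terms(prompt: str, limit: float = 1.3) -> list[str]:
    """List the terms whose weight exceeds `limit` (an arbitrary threshold)."""
    return [text for text, weight in weighted_terms(prompt) if weight > limit]


prompt = "(The Hollow Veil:1.5), gothic cathedral, fog, (stone arches:1.1)"
print(weighted_terms(prompt))
print(flag_heavy_terms(prompt))
```

Anything it flags is worth a second look, since a heavily weighted title can pull in whatever its individual words suggest, like a veil.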
could be it
buildings don't really have veils xD
hahahah
god damned it =x.x=
not with that attitude
hahahahahaha
but yeah, I do that because it creates consistency with characters really well xD
"(<Character's Name>:1.5)", it's not as good as training a LoRA, but man, it works really well with some models
and I'm terrified of training a lora
I looked into how it's done, and it seems extremely tedious and slow
uhhh i think its easier than the jargon makes it seem, but i havent done it either. i just got into merging model checkpoints, training a lora is the next step on my journey lol
well yeah
blender's going crazy with them too btw
Currently training LoRa with 250 epochs bcs i only can do a batch size of 1
just 2 hrs 30 mins
and idk if the results are good 
luckily i have a sample every 5 epochs and validation after 10
what kind of LoRa are you making?
also not even training with 1024 resolution lol
character
which? if it isn't a secret
hmmm ok ok
it is my idk 10th time training?
have some safetensors now but i want better results
still have some artifacts (tbh the images i use are pretty bad quality (720p lol))
hopefully you'll get something satisfactory in the end
i hope 
How do I use this? I'm very lost
dont click, thats a scam
Okay. How do I use stable diffusion to generate an image?
Very vague question. What tool do you have currently?
And do you have an nvidia gpu? Or intel/amd?
I have nvidia and I don't have a tool
Can I add a notification sound for sd when I finish img2img?
I have a notification for txt2img
hello
hello
i'm getting started with stream diffusion and i'm wondering if all SD models are compatible..
I'm not sure with stream diffusion but based on the name it probably can handle sdxl at the minimum
Generally, we use forge/swarm/comfy
it's meant to be used in real time
Realtime as in generate a image as you type?
the thing i'm using came with sd-turbo and sdxs.. not sure what those are
it uses tensorrt.. whatever that is.. also it seems i can use lora too
Ah youd need sd turbo
And if you dont have a 4090 (or better) its not gonna function as well apparently
chat gpt says this:
Stable Diffusion v1.5 and v2.1: Fully compatible.
Models based on SD 1.5 or SDXL base (e.g., DreamShaper, RealisticVision): Often compatible, especially if they're not heavily modified.
SDXL: Works with StreamDiffusion, though it may be slower and require more memory.
UNet-compatible models: StreamDiffusion needs access to the underlying UNet structure to do patch-wise inference.
Disregard that advice immediately lmao
Chat gpt is 9/10 times wrong with ai models
hahaha ok
You COULD use a non turbo model but its gonna be lots slower
i love that my use case doesn't mind low fps and low res..
What are you trying to do?
but as i understand.. lower resolution also means lower details up to the point where it affects the content so I might want to stick to 512
a tamagotchi
Exactly, less pixels = less detail
Like a game?
yeah so i'll probably keep res higher and scale it down..
yeah a game but for now if i can just animate a blob creature using control net that will be fine
Could try looking for animating stuff within comfy if you need a gif but youd need to do research
oh i got a way to animate the control net itself.. i'm hoping this is enough to roughly control how my creature moves
i'd be curious to do this in comfy eventually but i got another way to host stream diffusion that i'm more comfortable with for everything else
hmm can't share images here otherwise i'd show you how turbo looks so far with default settings
what does cuda 12.1 do better than cuda 11.8?
also somehow a desktop 5090 doesn't perform as well as a laptop 4090 in my case over here..
Hi everyone! I'm looking for someone experienced in training custom models for Stable Diffusion 1.5.
I already have a LoRA trained on ~20 of my own photos, but I need help with either:
- fine-tuning a clean SD 1.5 base model to match my personal visual identity
- or training a fully custom model compatible with Mage.space / RunPod (.safetensors output)
This is a paid commission. DM me if interested, happy to share more details!
alright so i'm using this atm: https://huggingface.co/IDKiro/sdxs-512-dreamshaper
which lora would be compatible with this?
guys i installed automatic1111 and animediff, generating low quality image to Video takes forever and quality is not that good
any better image or text to video generator?
i heard about this Pinokio browser and flux? anyone here using it and is it better?
Hi guys does anyone know how to use this version of Colab: https://colab.research.google.com/github/Jelosus2/Lora_Easy_Training_Colab
Yes animatediff is the old way to do videos. Search for stable-video, LTXV, wan models.
hey guys does anyone know the difference with sdxl and sd 1.5 is the difference that big ? i heard sd 1.5 distorts face a lot, does that still happen in sdxl ? im curious bout what the biggest upgrade is when generating images in sdxl is compared to sd 1.5
First the native Resolution (512x512 vs. 1024x1024). This helps a lot with details as there are simply more pixels available for objects from your prompt. Would say overall faces and hands are better in sdxl for sure
for real, SDXL's jump to 1024x1024 really boosts detail and structure. faces and hands are noticeably more consistent, especially without needing tons of inpainting
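To put a number on the resolution jump, a quick bit of arithmetic. The 8x downscale factor for the latent grid is the usual value for the SD 1.5 and SDXL VAEs, and the helper names are made up for this example.

```python
def pixel_count(width: int, height: int) -> int:
    """Number of pixels in an image of the given size."""
    return width * height


def latent_size(width: int, height: int, factor: int = 8) -> tuple[int, int]:
    """Size of the latent grid, assuming the usual 8x VAE downscale."""
    return width // factor, height // factor


sd15 = (512, 512)
sdxl = (1024, 1024)

ratio = pixel_count(*sdxl) / pixel_count(*sd15)
print(f"SDXL native resolution has {ratio:.0f}x the pixels of SD 1.5")
print("latent grids:", latent_size(*sd15), "vs", latent_size(*sdxl))
```

So a face that fills a small corner of the frame gets four times as many pixels to work with at SDXL's native size.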
@oblique elk @pale wedge thanks !
anyone got some recommendations on checkpoints for fairly good anime generation? I've been playing with Illustrious-based checkpoints but it doesn't play well with openpose.
I use anything v5 and AnyLoRA
AnyLoRA is a god at lora compatibility and for control net
And anything v5 is just good in general
just Lora designed for SD, or for any?
also curious as to the size difference between checkpoints. Both Illustrious and Pony are around 6gb, while SD is far smaller
its only 3gb, really lightweight, but really good for the amount of memory it takes up
also REALLY fast
I've noticed some are better than others. MiaoMiao for illustrious is ridiculously fast. I'm working with an 8gb vram limit and its still under a minute image gen for 1024x1024 and then upscaling
i have only 4gb vram
and yet anylora is still stupidly fast
under a minute
is there any crossover between SD and SDXL so far as lora use?
Is there any good explanation other than luck of the draw that the same prompt will produce 75% good images one day and 10% good images the next?
variety is a good thing. A model that produces always good outputs will probably also always output the same/similar outputs
There is a specific openpose model for illustrious:
https://civitai.com/models/1359846/illustrious-xl-controlnet-openpose
only thing i'm having issues with on illustrious is that it will ignore a 'dark skin' prompt, even with negative modifiers for pale skin
Try "tan" as tag
are there any XL models that are trained on danbooru artist tags?
or is that more of an illustrious thing?
