#💬|general-chat
i know there is, but how exactly do i make ComfyUI output Images as high quality as Auto1111. It always looks a bit funky or cartoony when using ComfyUI
Which SDXL? Checking on civitai, there's SDXL 1.0, SDXL 1.0 LCM, SDXL Turbo, SDXL Lightning, and SDXL Hyper.
all are sdxl as sdxl is a base model
SDXL 1.0 was the first one and the others are based on it
Okay
you can use any model
What resolution should I aim for? It's not really worth it if I can only make images up to 250x250px.
Sdxl best resolution varies based on the model, but all are at least 1024x1024
But you need a gpu capable of it
With zluda, a lot of amd gpus are capable of working with sdxl reasonably well
Zluda is a godsend for ai on amd
i can do like 1920 x 1080p and even higher res with flux dev
and i only got rtx 3060 12gb
anyone know how to get flux in kohya? like the sd3-flux.1 branch?
Hey does anyone know if there are any open source voice tools that do what RVC does that have been developed recently just seeing what new options are out there
The goyim will never know
correct
there are probably AI tools out there that are decades more advanced than what is open to the public
that the elites can use
how to generate
stare at screen intensely and picture in your mind what you want to see
Hey everyone! I'm new to Stability and was playing around with the API. I'm curious—what are some good negative prompts for image-to-image generation with Stable Diffusion 3?
When I use image-to-image with a prompt, the avatar ends up looking like a completely different person if the strength is above 0.35. I'm trying to keep the person the same and just change the background. Any tips on how to do that? Thanks!
comfyui - sam select node - profit
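Some context on why strength matters so much here: in diffusers-style img2img pipelines, strength decides how much of the denoising schedule is actually run on top of the input image, so higher values leave less of the original face intact. A minimal sketch of that arithmetic, assuming the diffusers convention (the Stability API may map strength differently, and `img2img_steps` is just an illustrative name):

```python
def img2img_steps(num_inference_steps: int, strength: float) -> int:
    """Number of denoising steps actually run on the input image.

    Assumption: follows the diffusers img2img convention, where only the
    last `strength` fraction of the schedule is executed.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return min(int(num_inference_steps * strength), num_inference_steps)


# With 20 steps, strength 0.5 runs half the schedule and strength 1.0
# runs all of it, which effectively ignores the input image.
for s in (0.25, 0.5, 1.0):
    print(s, img2img_steps(20, s))
```

This is also why masking the background (inpainting only that region) tends to work better than lowering strength for the whole image.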
Are there any new or relatively new open source video creation out there
Just not something hopefully that takes a huge amount of comfy knowledge
cogvideox
Is it possible for me to make stable diffusion use a reference image to generate that same character in different poses or do I need more images on that character at different angles?
hi
Hi
/prompt
gm all
Yes, it is possible to use Stable Diffusion with a reference image to generate different poses of the same character, but there are some limitations.
If the goal is to create more accurate or detailed poses, especially from multiple angles, providing several reference images of the character from different angles or perspectives will help the AI generate more consistent and realistic results.
What if I only have one to start with? Can I use that to make more angles using it as a reference image? And am I right that the proper reference image control is img2img? Or have I been doing it completely wrong?
If you only have one image to start with, it is possible to use it as a reference to generate more angles, but the results may not be highly accurate or consistent with the original character’s details.
You are correct that the proper method to use a reference image is through img2img, I think
So what is your bet on whats gonna happen, when ai can make humans redundant and it is in the hands of power hungry megalomaniac elites
Anyone know if there are any OpenPose techniques or extensions or models or whatever to not only pose people, but place objects? Or any way to best achieve that? I am having an absolute pain of a time trying to specifically get a character with "sheathed sword on hip with character grasping handle" and I cannot get it to work for the life of me
There might be a LoRA for the sword thing. And if not, you can probably train one yourself, if you have enough reference images
Yeah I looked around and didn't find a Lora that worked well enough
I came here to ask the same question. Stable Diffusion has "Style", "Character", "Face" and "Object" training but I ideally could use one for Poses.
I'm looking to get back into AI gen. Anyone got a good youtube video to recommend for the best options and possibly installation instructions?
hello everyone, I'm a new student..
Depends on your gpu in part. You have more options with nvidia than with amd. Also depends on your os
Win 10/Linux/ 3060 ti
Hello, I was wondering what models I should run for SD... For example, on CivitAI, which models would have the most checkpoints? Which would be good for generating images the quickest and which should I use for quality?
Also if I am wanting to run an SDXL checkpoint, do I still need to install SDXL Base and Refiner?
I've noticed things like SD 1.5, 1.4 and 2.1, but I do not know the differences. Same with SDXL Turbo/Lightning/Hyper etc...
I think my main concern is downloading the base/refiners. Is this needed if I'm using something like SDNext?
SD 1.5 has the most checkpoints
followed by SDXL
followed by Flux Dev
no other model has a ton of checkpoints and loras on the scale of those three
Lightning/Hyper are about acceleration, and they are in either checkpoint or lora form
SDXL refiner is very rarely used these days
Except possibly pony
okay yeah
But with that one you have to be careful with what you ask of it, to be able to post it here
You also don't need to download the base model to use a checkpoint of it. You just need to download the checkpoints you want to use. @oblique jay
Thank you!
Is this because it is NSFW?
Thank you for this information 🙂
It is capable of it, yes. You can get sfw images out of pony, but you have to be careful to be specific
hello!
What is Flux?
A type of model.
It's significantly different from SD 1.5, SDXL, and Pony
And not everyone can run it, or even a cut down version of it
The full model will max out the vram of most graphics cards, except the highest end cards. Even heavily pruned models are still as fat as sdxl models.
I utilize an A6000 48GB for this, is this still not enough?
A 48gb vram card would qualify as highest end. I was thinking more about consumer grade cards, which mostly cap out at 4090/7900xtx atm. Your card is more of a prosumer card, which is not what I was thinking of.
A heavily pruned model can be as little as 6gb. But it can easily get over 16gb if you want to try to go for the full model. The 48gb card will be able to handle it, but most consumer cards will want to get a pruned model to be able to work.
An fp16 unpruned flux model can be as large as over 30gb, coming close to giving even your 48gb card a hard time, @oblique jay
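The sizes quoted above line up roughly with a back-of-the-envelope estimate: weight size is about parameter count times bytes per parameter. A rough sketch, assuming the Flux.1 transformer has about 12 billion parameters and ignoring the text encoders and VAE, which add several more GB on top (the names below are illustrative only):

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "4bit": 0.5}

# Assumption: roughly 12 billion parameters in the Flux.1 transformer.
FLUX_PARAMS = 12e9


def weight_size_gb(num_params: float, precision: str) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


for precision in BYTES_PER_PARAM:
    print(precision, weight_size_gb(FLUX_PARAMS, precision), "GB")
```

Actual VRAM use during generation is higher than the weight size, since activations and the text encoders also need room.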
anyone work with the qualcomm ai hub version of SD?
Interesting, thank you!
hi
Hello!
Hallo
Hi!
aloha
hello all
therefore drink your prune juice.
👋🏼
what's a good WebUI and model to locally generate text to speech? is there an equivalent to SDXL I can do locally so I don't have to abide by these stupid limitations websites impose?
dunno what the requirements would be, I have 16GB RAM, 2070 Super with 8GB VRAM and for CPU Ryzen 9 3900X
if I could get something on the same level as this
https://elevenlabs.io/
Hey
Can anyone tell me if the image I generated is good enough for the flux dev fp8 model? I can send it in DMs because I don't know if I'm allowed to send it here
Who wants to work as a moder or developer in my project?
Hi
Lots of good options, I doubt raw output would be as good as elevenlabs though. You can try fish speech (extremely fast, good voice cloning, streaming support) or VoiceCraft (bad installation process, slow and no streaming, but great voice cloning)
CosyVoice supports good voice cloning and streaming, is fast, and is even promptable (you can make it have a different pitch, male or female, slow or fast)
Parler-tts only supports prompting but is much better at it.
What are some of the best Open-Source LLMs?
Looking for ~7B
I use an Arc A770 16GB
Probably llama 3.1 for general usage.
I was looking to use it for story generation to make a DND campaign. This would still be good?
It should still work well, but there are more creative finetunes you can try out.
Are finetunes something you can find and implement, or are they something you need to train yourself? Or both
ControlNet Question: I have a word that I typed out in Photoshop using a chunky bold font on a transparent background and exported as a PNG. I'm using this image as guidance in ControlNet using SDXL with the intention of prompting to transform the letters of the text into different materials (e.g. stone, fire, etc.), with the goal of keeping the background black or white. However, I can't seem to get SDXL to generate only the font as a material, and not the background. Instead, the text AND the background become the material that I prompt. Is there a way to separate the text from the background using a specific ControlNet? I've tried all the common ones (canny, depth, sketch, lineart, soft-edge, and scribble) with various weights, but so far, no dice. Thanks much!
Finetuning is basically training the model a bit more to make it perform better in some area. You can make it and other people can make it.
having a transparent background (alpha channel) for the control net image might be an issue
You can try layerdiffuse which will only provide transparent images like you are looking for. However not sure if it works with controlnet.
https://github.com/lllyasviel/LayerDiffuse
@quartz siren @fervent thunder Thank you kindly for the suggestions. I should have added, I've tried a non-transparent image as well. Still not getting the results I'm hoping for. I don't need it to be transparent, but just thought that might be the way to go.
IP adapter, attention couple and regional text conditioning, all using the same depth map
would be a next step
but its not really worth the effort
Hello
Which model does Automatic1111 use? I am led to belive it's XL but idk
oh
TY. Why would it not be worth the effort?
ooooooh, is it just determined by the checkpoint model you use?
because even in the best case scenario SDXL will not do this task particularly well
you could do this with Flux with zero effort, and the quality will be higher too
I was unaware that Flux had functioning ControlNets
I reckon it could do this without control net
having said that, some of the control nets are not too bad
Gotcha. I definitely need a ControlNet to keep consistency with my font and layout. But the Flux ControlNets I've tried so far don't work. I'll have to keep looking. Thank you for the help
https://huggingface.co/XLabs-AI/flux-controlnet-depth-v3
https://huggingface.co/jasperai/Flux.1-dev-Controlnet-Depth
these two apparently, good luck
Great! I'll give those a try. Thanks again
I want to train a lora for oil paintings but I don't want the colors the artist uses to be the only colors the lora uses when making images. Would captioning the entire image, including the colors used, be the best way to address this? I'm basically trying to only train on the art style/textures, not the colors.
just a case of avoiding overfitting
and then lowering lora strength during inference if needed
so no need to caption the colors in the painting?
a lot of people are doing no captions
not saying that is better but try it with no captions first
I tried but I didn't see much of the texture pop in until using 1.5 strength, then it started to look weird. So retraining atm with gpt4 captions.
also training a bit longer.
Anybody have a good image gallery extension for Forge?
I had one for A1111 awhile back but I lost that. I could sort through old images and sort of have a style gallery with it
Search infinite browser
Thanks, I will
That should work just fine, I appreciate it
If you're trying to train a LoRA for oil paintings and you want the model to capture the style (like brushstrokes and textures) without locking it into specific color palettes, captioning the entire image (including colors) might not be the best approach. Mentioning the colors could unintentionally cause the model to associate that style with only those colors, limiting its flexibility.
A better method would be to use minimal captions—or even no captions at all for the first phase—so the model focuses on the style rather than the colors. For example, captioning only “oil painting style” and leaving out any specific color details would guide the model to learn the textures and brush techniques without getting stuck on specific hues.
From there, you may need to train multiple models or stages. Start with a model focused on style and texture, and once that’s working well, you can create new datasets by generating images with different color schemes using simple prompts. With this expanded dataset, you can retrain or fine-tune the model to capture a wider variety of colors. This way, you’ll have control over both the texture and flexibility in color in the final outputs.
Also, as mentioned, if overfitting becomes an issue, try lowering the LoRA strength during inference. Sometimes, running extra training epochs or using more diverse images can help too, but testing with simple prompts at lower epochs can help avoid overfitting early on
@knotty turtle Thanks for the tip! I see what I can make with your tips in mind.
I am running Flux via Stability Matrix on my PC. Can anyone point me in the right direction on how to automate this with Python. I have searched, asked ChatGPT and Google and I did find some example code somewhere but I've lost it now. I just need pointers on how to get started. Thanks
hello
hiiiiiiiiiiii
gm
hello
How do you use this software? Is there a tutorial for it?
hi how to use ai?
Automatic1111 is just a user interface. The actual generating is done by the models you want to use. It doesn't even come with any models when you install it; you have to seek out a checkpoint you want to use with it.
imagine/ tortoises from behind crawl over white sand towards the ocean, at sunset, photographic realism
If I use a seed, there is no point in doing a batch count/size greater than 1, right? Because the seed/prompt combo will always produce the same image
hi 🙂
hello everyone
hello
it will count up, starting from the seed you have set
If its the same seed like 42, same prompt, same guidance scale it will be the same img. If its different prompt or different seed or different guidance scale, you will get a different img.
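To make the two answers above concrete: a fixed seed reproduces the same starting noise, and a batch just steps through consecutive seeds from the one you set. A toy sketch in plain Python, where `random.Random` only stands in for the real latent noise generator and the function names are illustrative:

```python
import random


def batch_seeds(start_seed: int, count: int) -> list[int]:
    """Seeds used for a batch: counts up from the seed you set."""
    return [start_seed + i for i in range(count)]


def fake_noise(seed: int, n: int = 4) -> list[float]:
    """Stand-in for the initial latent noise; same seed gives same values."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


print(batch_seeds(42, 4))
print(fake_noise(42) == fake_noise(42))  # same seed, same noise
print(fake_noise(42) == fake_noise(43))  # next image in the batch differs
```

So a batch with a fixed seed is still useful: only the first image repeats, the rest are new.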
Hey there, getting back into AI for fun, and downloaded Stability Matrix, can it be used directly, or is it more like a "portal" to keep things tidy and up to date and better use WebUI from there ?
I saw that Inference is usable only with ComfyUI from Stability Matrix so... I'm a bit lost now, too much has evolved
I'm looking for a ComfyUI node that iterates through a list of tags in order and outputs the single tag as a string. For example, the node has a textbox with "1girl, 2girls, 3girls" and it sends 1girl to my string combiner node for the 1st generation, then 2girls for the next generation, then 3girls, then back to 1girl. Thanks for any help!
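Not a ready-made node, but the cycling logic itself is small. A minimal sketch in plain Python, assuming the generation counter comes from somewhere else in the workflow (for example an incrementing integer input); `pick_tag` is a hypothetical helper, not an existing ComfyUI node:

```python
def pick_tag(tag_text: str, generation_index: int) -> str:
    """Return one tag from a comma-separated list, wrapping around at the end."""
    tags = [t.strip() for t in tag_text.split(",") if t.strip()]
    if not tags:
        raise ValueError("no tags given")
    return tags[generation_index % len(tags)]


tags = "1girl, 2girls, 3girls"
for i in range(4):
    # Fourth generation wraps back to the first tag.
    print(i, pick_tag(tags, i))
```

Wrapping this in a custom node would mostly be a matter of exposing the text box and the index as inputs and the picked tag as a string output.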
hey all. I want to make a poster for a friend and want to print it in sizes 70cm[W]x100cm[H]. Thing is I don't know where to generate it where it will do a good job but also allow me to suggest images for a poster of this size. Any ideas?
Stability Matrix is just an installer for the webuis.
I would recommend installing the webuis the default way
Hello,
I've been working with Stable Diffusion for a few days to create variations of an existing drawing. The drawing features two characters in a specific situation, and I'd like to place them in different contexts while keeping exactly the same style and linework (for example, the characters could be laughing, dancing, jumping, etc.).
However, I'm struggling to achieve results that maintain the original style and linework perfectly. Is this possible with Stable Diffusion? If so, could you advise me on how to keep these elements consistent while changing the poses and situations?
Thank you very much for your help!
hi
Error code 128 stable diffusion
What is the solution to this error? I have been trying to find a solution for two days.
Can you paste the cmd log in #🤝|tech-support ?
Is Flux 1.1 coming to Schnell?
Maybe, but it didn't even come to dev yet. It's only for pro right now.
sorry, thought I was in #🤝|tech-support
I hate microsoft
I hate how even though I had saved images on an AI generator, they decided to erase the website to build it from the ground up and now a ton of stuff that I had on there is gone permanently because of Microsoft being stupid
guess that should teach me to save every image I like instead of just keeping it saved on Copilot because if Copilot revamps itself again, I might lose everything again
yeah I'm sorry that you lost your image but its worth saving stuff
yall dont allow flux channels?
basically SD3 is the flux channel
these days
Hello everyone! Could someone help me out with more details on how he’s doing this using Stable Diffusion/ComfyUI? I’m especially curious about this part of the video: https://youtu.be/SHmjC7t3fJA?si=vCqwT3uA-SrUmgSV&t=119. How is he making it happen? Where can I find more instructions, like installation, setup, and a quick guide for 'Blocking to Render'? #1292682865028501606
yoyo
i have a question
Are there any startups that are doing basically a MJ competitor using Flux?
Like what Grok is doing
and just building a really good wrapper and web ui around flux
replicate
Hy
yeah but a full featured web ui?
depends what features you mean
would probably just end up looking like mage.space or rundiffusion
like inpainting, style transfer, etc
all in one ui
I think rundiffusion is the best version of this
anyone know how to consume most of the vram while training a flux lora using ai-toolkit?
i did select the 24gb template cause thats the only available there. any suggestion?
i attached a pic, its only consuming 24gb vram out of 48gb machine.
https://discordapp.com/channels/1002292111942635562/1026382406279770152/1292765999644545024
hi
Is it a problem tho ? Just because you have 48gb available does not mean it needs all of that.
Do you notice any loading/unloading into vram slowing things down ?
No it's not a problem. And did not notice any difference. Instead it's very fast. But It would be great to utilise all the vram. like with the AI-toolkit on replicate it's fast.
Flux works only in forge, sdnext and comfyui locally
Hello
wouldn't recommend auto1111, it has annoyances that have yet to be fixed, and forge is just auto made better.
Doesn't break when modelhopping either lol
Hayo
ello bri'ish sarah
yo. I've been working with SDXL for a few weeks and am wondering whether I should try 1.5. What do you think?
Depends, sdxl has better prompt following and usually better imgs but sd1.5 is way faster so easier to experiment and has better control nets I believe.
SD 1.5 is an outright better model for some uses
I have 2.8 it/s on a 2080ti using 850x850 resolution
trying to set up tensorrt workflow
Sd 1.5 will be way faster, and like NeonNinjaAstro said, it can be better too sometimes.
ok I'll give it a try thanks!
I just need 3 more 2080ti and I'm good
Hey. Which room is best for stable-diffusion-webui ?
Given the choice between an RX 6900 XT and a RTX 4070, which would be better for generating images in SD? I heard AMD cards are just way slower.
Hey folks, I've been using Automatic1111 w/ ReActor to try and make a comic strip featuring my son. I haven't touched these things in many, many months. Is there something better I should be using these days? Thank you!
is it just me or is Dreamshaper a REALLY good generalist model
also, I am trying to condense down my several super specific models into a few good generalist models
any recommendations?
also what model is closest to the quality of modern NAI
4070 is probably a better choice.
Forge UI - anyone know how to disable the "checkpoint merge" tab? I do not use it at all. Can't find it in Settings. It's "Hidden UI"; I expected it to be called "HIDE from UI", anyways....
where can I find info as a newbie on comfyui on what commonly used custom node do what, what are the best starter workflows etc?
Is there a channel for lora training related? As i'm struggling quite a bit with training SDXL lora's 
just got into text to video, is stability the best resource right now for generating quality videos
I heard on reddit this is not so
No, CogVideoX is the best open source model for sure now. It's not Kling or Gen3 level but still pretty decent.
Svd was pretty good for its time but pretty outdated now.
you can get 3080ti for the same price, it's 30% faster in SD, and I'd buy a used one
guys i used to work with automatic1111 a long time ago. now trying to start working with SD again. which UI do you suggest for win11? kinda don't like to bump into lots of errors when installing and working 😄
I'm starting to learn comfyUI and already feel like it's the best decision
either works. Just auto1111 being the shittiest :P
If you prefer it, there's forge webui, basically fork of auto, but fixes most of the negligence auto still suffers
I still bounce between comfy and forge as some stuff is better pre-setup for the other :P
Is it possible to do after generatione edits, like upload a image, highlight something like a part, a arm, or head, and regenerate that specific part?
ya thats called inpainting
How does that work?
have a good day
hi
Amd cards are slower because SD is built with CUDA in mind and amd cards have to translate the cuda instructions into something they can work with. Depending on the OS, there are ways of speeding it up significantly(zluda in windows, ROCm in Linux), but they're still just not cuda.
That said, a card like the 6900xt, while slower than nvidia, will still be decently fast. A surprising factor affecting the speed is how much vram the card has, an area where amd has novideo beat. If there's not enough vram to fit the model, it will rely on the much slower system ram or, even worse, the page file. The ideal would be an nvidia card with at least 8gb of vram(if not much more). Second best would be amd with 8-12gb of vram imnsho.
I use a 7900 xtx
And I'm using 6700xt. Mind, I didn't get it specifically for ai, but I figured I'd try it. Has been a surprisingly good experience.
Then again, 12gb of vram is quite helpful
I can run 3 batches of 4 images per batch in 12 minutes, making it about 1 minute per image. Using an xl model at 896x1152/1152x896
An equivalent nvidia gpu can probably do it faster, but this isn't exactly slow.
Yeah I heard good things about Zluda but I wasn't sure. I also heard it got taken down lol. But the data hoarder in me downloaded and kept a copy. I am on windows btw.
I've gone back and forth on AMD and Nvidia cards. Both are fine and have ups and downs imo. Right now I'm on an RTX 3070 with 8gb VRAM. And 8GB feels so bad. Some games need more. That's why I'm going to upgrade. I just really wish AMD cards could use Ansel for screenshots. In the 5 games where that feature is implemented, it is sooo nice.
One thing thats making me lean a little more towards the 4070 is the power efficiency. My computer room is somewhat small and gets hot at times. Less power means less heat, and the 6900 XT scares me about that lol. I think the 4070 uses somewhere like 220w? 6900 XT and RTX 3080 TI both use like 300-350w respectively. From my quick searches anyway. And 4070 is newer so it'll have better resale value if I end up wanting to get an RTX 5070 or 8800 XT or whatever they'll be called lol.
Entirely up to you. There's benefits to both, and downsides to both.
I have questions about the Stability Matrix app and WebUI Forge. Is this a good place to ask?
Is there a way to cartoonize an existing image without completely describing the image in the prompt? I kinda have the problem, that either the image is not cartoonized enough or it ends up being a completely, weird different image.
3070 ti benefits from already being nvidia, but you'll be offloading xl models more often. I can tell you that the 6700xt is decent, able to churn out 12 images at 896x1152/1152x896 with an xl model in 12 minutes. I use it myself. But setting it up requires more work, as the ai tech is made for cuda, an nvidia tech
Hello
Could you explain what "Offloading XL models" means 😅
I'm already using 6600 with ZLUDA and the definition + frequent memory errors is kinda annoying
"offloading model" <=> splitting one model into multiple subparts and then proceeding to load and work with each of those part instead of working with the whole model at once. Needed if you don't have enough vram to load the whole thing but slower as it requires splitting the model, loading some part of it, unloading some part of it then loading another part, then loading another part, then etc
Kinda like baking a cake while still having all your ingredients in your car from the store and having to clean your hands in between every step.... Sure I'll roll with that analogy.
It's much faster to have everything at hand and ready, without having to repeat tedious prep steps all along.
And now I'm hungry, damn it 😄 .
The image definition would benefit more from the 6700XT/6750XT's 12GB no?
Or is 3070 Ti enough
8GB sounds a bit on the lower side of things
Speed isn't really an issue here
With 8gb you should be able to generate (not train tho) anything (for now at least) but it's probably gonna require offloading at some point when generating with heavy models.
It all depends on how much you value your time and how much you're willing to bet on AMD catching up to Nvidia on the cuda scene.
Personally I value my time more than some others and have enough money to bite the bullet and pay the green tax when it comes to building a pc.
You can check some benchmarks to get some approximations of what to expect : https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
I mean, under windows there's zluda for amd, which allows me to get 1 min/picture. And on the Linux side there's ROCm. So it's not like there aren't workarounds for the cuda issue.
Yes there are solutions, not denying that. But they come at the cost of a much harder setup and compatibility issues with some extensions.
And I have no magic crystal orb to foresee if Zluda will be able to keep up with Nvidia in the future. (Sadly) Nvidia did invest a ton more money into their software and have started doing so years ahead of AMD. So team red is playing catch-up right now.
Ugh, I hate how it can't just say ryzen 9 5900x
No, it has to say amd64 family 25 model 33 instead
To be fair, the cpu doesn't play that much of an important role in SD (unless you're forcing it to execute in "cpu mode" obviously). If there's no offloading of any sort, no "cpu mode", etc... the only significant impact of the CPU will be "how fast can I load the model into the GPU" (and even there, there are some asterisks).
Yeah, but when comparing test systems, even if one part isn't actually doing anything, you'd want it to be human readable. The average consumer doesn't know what this family/model system is. I had to look it up myself
Maybe I'll get 6750XT
3070 Ti's VRAM is rather limiting
And 3080 12GB is a tad too out of budget
Oh and used 6700XT is only $200 here holy mama
Yeah, that's the baffling thing about nvidia: they insist on putting just a bit too little vram on their cards.
Didn't know Apple made discrete GPUs but here we are
Jensen Huang is Tim Cook?!?
Lol
Hi
Hello !
Hi everyone - Is there a chat dedicated to people who need help setting up and running Automatic 1111?
What does pony do and how do I use it with flux?
pony / ponyXL is a model (based on SDXL). Flux is also a model.
"Using a model with a model" makes no sense.
You can use either of those in ComfyUI, automatic1111 stable-diffusion-webui, sdnext, SwarmUI, etc.
Are epochs in training just a way to split the load? Or will more steps and fewer epochs make the model better?
Anyone know where to hire people who are really good at inpainting with SDXL models/loras?
hello
need help, i can't run stable diffusion webui on kaggle
well, i can run it, but when i access it locally, the connection is refused
Hello :D,
Hey I need help. I've been trying for like a good month on my own but I just can't manage it. I want to place a specific style on my image. I have this drawing colored and drawn. Then I have a checkpoint and also a lora. I've been looking for ways to transfer just the art style to the drawing without messing up the original at all while also carrying over the original colors. I've tried ipadapter, controlnet, T2I adapter and I just can't figure it out. Can anyone please help? I don't want it generating things that aren't there or changing the image at all
If the image does not contain content that is against the server rules, you could post the drawing/image and the style you want to apply in the general channel with images, so people could try to help
i'm new in about using stable but u can try some configurations
but actually can't run the webui
Thank you I will try that
Herro
can someone please explain to me what controlnet models are?
https://github.com/lllyasviel/ControlNet
Scroll a bit down, ignore the math and look at the examples
i want to use automatic1111 on kaggle but there is no notebook, does someone have one?
easy inpaint and outpaint from 6:00 https://youtu.be/cx7L-evqLPo?si=Lzk6QWYGmY2pnzRT
invoke's canvas is very handy
Hi, @zenith elm
I've hands on experience in inpainting.
I often use Flux + Controlnet for the HD image text guided inpainting.
Could you kindly share more details with me?
Thanks.
sup
hii does anyone know how people make very clean and precise any anime character fan arts using stable diffusion? if you know please dm me. I'm new to ai and have no clue how people make these
how much vram do i need for flux dev and schnell?
without tiling?
you can but it takes a bit of setup
two passes with res-adapter, deep shrink and PAG, with an upscale in between the passes
how much time do flux dev and schnell take for you all per image? 20 iterations
for me it's taking around 25 mins for both..
hi
👋
25 mins means VRAM got full
you need a smaller quant
when the time goes really long like that, its mostly vram issue
but flux can fit on very low vram with good quant
And schnell should only need 4-5 iterations.
ah!
Yeah dev should not take 25 mins, maybe 1-2 minutes on a not too great gpu but definitely not 25 min.
Hi peps
I want to use depth generation for some architectural images.
I'm a customer of openart.ai but there seems to be no function for generating images out of depth. Sketch to image works fine but I want to know what else is possible.
Is there a way to generate images out of depth-images online? Or can I do this only on my PC?
Thanks.
Hi! I'm quite new to AI. I'm working with Stable Diffusion and I have some questions. I am not sure if this is the correct channel; if it isn't, please let me know 🙂
I have realized that the images Stable Diffusion generates (right now I'm using XL) usually don't properly match the prompt I write. Is this common? Are there guidelines to follow when writing prompts?
I have tried using (( )) to add weight to the prompt, but it doesn't seem to work very well
Welcome to AI.
Thanks 😄
Don't think of this as a reliable pipeline or process.
It's more like you're in a casino and you pull the slot machine and sometimes you get something good.
And even if you do get something good it is not 100% reproducible.
What can I say... early days...
mnmn I understand
But it is hard to get something like the position of the arms or the head with precision
any advice for that?
i have 16GB vram , rx 7800xt gpu
you can use openpose control net
can someone please help, what am i doing wrong, why is it taking so long with flux?
i am using comfyui
1m per iteration?
isnt that what controlnet does?
20 for dev and 4 for schnell
ohhh
faster but a little less quality
what guidance do you use on flux schnell?
CFG? 1
why so low?
you can go higher if you want
but it takes twice as long
and you have to use extra nodes to fight the CFG burn
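The "twice as long" part comes from how classifier-free guidance is computed: each step combines an unconditional and a conditional prediction, so the model runs twice. At a scale of 1 the unconditional term cancels out, which is why many UIs can skip that second pass. A minimal sketch of the standard CFG formula on plain lists (the function names are illustrative, and real implementations operate on tensors):

```python
def cfg_combine(uncond: list[float], cond: list[float], scale: float) -> list[float]:
    """Standard classifier-free guidance: uncond + scale * (cond - uncond)."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


def model_calls_per_step(scale: float) -> int:
    """At scale 1 the result equals the conditional prediction, so one call is enough."""
    return 1 if scale == 1.0 else 2


uncond = [1.0, 2.0]
cond = [3.0, 2.0]
print(cfg_combine(uncond, cond, 1.0))  # identical to cond
print(cfg_combine(uncond, cond, 2.0))  # pushed further away from uncond
print(model_calls_per_step(1.0), model_calls_per_step(3.5))
```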
Yeah the same thing Neon Ninja Astro said, Flux dev does not work with normal cfg but distilled cfg and doesn't support negative prompts
I actually recommend trying out Flux.1 de-distilled if you can wait, that supports normal CFG with negative prompts, slower but produces better results imo.
whats distilled cfg?
its called guidance
its a new fake variable
that the teacher model taught the student model
it doesn't relate to anything in the real world its just a virtual label
they taught the model to imitate it
oh?
Yeah Flux.1 dev and Flux.1 schnell are distillations of their closed source Flux.1 pro. Flux.1 pro is the original model, and does support normal cfg and negative prompts. Dev directly does not, they made it have distilled cfg which is an imitation.
need help i need a working notebook from kaggle about auto1111
Not very often, I usually use diffusers.
never heard about that
That's the main python library to load models. I usually need to use python, and diffusers is simpler to run than using comfyui's api.
i keep getting shape mismatch error 😢
Whats the code?
Hello, I haven't used deforum for over a year, since Google colab banned it, have you found a way to use deforum for free? I need to know urgently
Hey fellas, anyone here using fooocus with Jupyter Notebook? Stuck at last step :/
Hey there, can you tell me more details?
hey you can use it when you rent GPU on sites like https://www.runpod.io/
Well, it's not free 😅 I got a time when you could run SB on colab for free for 4 hours
SD*
it cost like 0.3$/hour 😄
Extremely expensive, a 15-second video can take up to 2 hours to make
I made about 8 videos a day, I spent the whole day on the computer
well I didn't try it so I am not gonna argue, but I highly doubt that when an image currently takes me about 15 seconds
btw do you have by any chance experience with fooocus on these sites?
I'll send you a video privately
ah I guess you are right
alright
Hi! what tools are new for animation in stable diffusion?
Rn, CogVideoX is the best open source video gen model.
Is it difficult to learn? Or difficult to install?
Well you can install it with comfy ui, or diffusers. It’s pretty popular now as an open source video gen tool.
Thank you very much! I'll check!
Hello to everyone... looking forward to an amazing experience with you guys.... ok then, best wishes!
is there any guide to install flux locally? got a 16gb nvidia card, im looking for a flux model that can use lora's, im not sure what the difference is between pro and schnell?
nvm i think i got it figured out, im using the dev flux model i think
Hello Managers, my body is ready for Flux Auto canny 
not sure how well flux will follow current canny control nets
but good luck 🙂
hello
gm
I used to create free images in this server before, now it seems like that's gone?
yeah
hey, does anyone know how to set up a grid generation based on a flux checkpoint?
yeah is comfy ui information okay?
yea i'm looking to get a node to use inside Comfy
the way I do it is use KJNodes
he has nodes for adding labels to images, and for joining images
then I copy and paste the Ksampler setup 9 times
thanks i'll look into it
also check out his nodes "widget to string" and "something to string"
to automatically generate the text for the "add label" node
it updates labels based on values in nodes
Hey everyone, is anyone really experienced in Stable Diffusion and could help me?
I am trying to place a product on a background. But I don't want to do Image composite by mask, I actually want the AI to recreate the product. Any idea how to do it so it's as accurate as possible?
That would mean >
I load a product image
remove bg
write a prompt
It generates the product into the background
https://civitai.com/models/419539/botos-background-swapper
I made this some time ago, with some adjustments it might be able to serve your use case
I edit the original picture on top, but you can remove that part and have the AI product as you said it
Hi , i like your workflow, would you be open to work with our team?
Please do pm if so. Looking forward to it
Hey thanks for sharing, how does your workflow handle product recreation? Is the fidelity good?
Cant test rn
I think you should rely on other projects rather than stable diffusion.
You could change background first with pretrained AI models but not with SD.
And then could harmonize that with another model.
In this case you could customize your background as any image you can.
And no need to worry about fidelity and variety, maybe.
Something like this? https://civitai.com/models/546180/geeky-remb
I made a custom Remb node for comfyUI that removes the background of one image/images and layers it on another. You can daisy chain them, adjust the x and y position, animate it a bit with simple 2D animations and etc.
Please I wanna ask if anyone uses MacBook for most of this ai generation and this heavy face swapping software, are they a good options?
Or I should use this heavy omen 16
Whats the difference between 'merging' 2 models together for a generation, and doing 1 pass with 1 model and then another pass with the other model?
Which project for example ? What's the problem with SD ?
+1
when you do 2 passes, 1 with 1 model, and the other with the other model
you are choosing a sigma to stop the first model at
and to start the second model at
they are never acting at the same time
if you want the diffusion step at a particular sigma to be a mixture of the two models then
2 regular passes can't achieve that
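To make the merge vs two-pass difference concrete, here's a toy sketch. It assumes each "model" is just a single number and a "step" moves the sample halfway toward it, so nothing here is a real sampler and all names are made up:

```python
SIGMAS = [1.0, 0.75, 0.5, 0.25]  # toy noise schedule, high to low


def step(weight: float, x: float) -> float:
    """Toy denoising step: move x halfway toward the model's weight."""
    return x + 0.5 * (weight - x)


def two_pass(wa: float, wb: float, x: float, switch_sigma: float):
    """Model A handles sigmas above the switch point, model B the rest."""
    trace = []
    for sigma in SIGMAS:
        if sigma > switch_sigma:
            x = step(wa, x)
            trace.append("A")
        else:
            x = step(wb, x)
            trace.append("B")
    return x, trace


def merged(wa: float, wb: float, x: float, alpha: float = 0.5):
    """Weights are blended once, so every step is a mixture of both models."""
    w = (1 - alpha) * wa + alpha * wb
    trace = []
    for _ in SIGMAS:
        x = step(w, x)
        trace.append("A+B")
    return x, trace


if __name__ == "__main__":
    print(two_pass(0.0, 10.0, 4.0, switch_sigma=0.5))
    print(merged(0.0, 10.0, 4.0))
```

In the two-pass trace no step ever sees both models at once, while the merged run has both contributing at every sigma.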
I use CosXL for it, id say the recreation output is only mediocre, thats why I edit the original picture on top with the workflow
Speaking of merges. Generally speaking will a merger of two models be better at following loras if one has trouble working with them and the other doesnt?
Got it. Thank you.
for the most part merges seem to average almost all aspects
Perfect.
Ok, and why cosxl? what's the difference from sdxl?
It's a great model to edit existing pictures, it's based on SDXL
Hello there, I've already trained my own model, but I'm not satisfied with the result: my model generates beautiful landscapes but the characters are sometimes ugly. Can someone help me find a tutorial for making a style with more control?
Hey Neon, the issue with KJnodes is that I can call a base ckpt name, but there are no options to call the diffusion model (like flux_dev for example) to iterate in the grid. That's my main issue.
I did find this workflow that allows calling diffusion models/unets etc. with qqnodes https://civitai.com/models/635692/flux-xy-grid-dark-mode but I get a "Grid Annotation" issue at the end of the pipeline that doesn't allow me to make the grid, while everything else is working
load diffusion model node, then unet_name widget to string
Oh yea, I see, it might work, thank you!
Hey guys, I'm looking for an experienced comfyui individual (paid job)! 🫡
RuntimeError: Your device does not support the current version of Torch/CUDA!
😿
What's your GPU?
I'm just venting, I am perfectly aware of my GPU and its lack of CUDA
Oh okay. If you're on AMD, I have guides for that
Oh, that would be nice
I had Forge working before until something broke it that I can't figure out so I'm just trying to blank slate
which is proving rather challenging what with things like python stubbornly refusing to exit my system so I can put it back in
Here are all my Guides:
Go for the ZLUDA versions if your GPU has 8gb or more vram.
https://github.com/CS1o/Stable-Diffusion-Info/wiki/Webui-Installation-Guides
oh cool, thanks
wish me luck
NP, for any questions feel free to ask me in #🤝|tech-support
How do you guys remember all the trigger words for your loras? Also is there a way to automatically add the trigger words for a lora?
I am new to Stable Diffusion, is it free to use for converting Images to Video on macOS? (I have Intel, not Apple Silicon)

if stable diffusion large and medium are out, then what about stable diffusion small?
Is it even worth installing S.D with a 3060ti?
Sure its a good card for SD
Only 8gb of vram though :/ are there any samples of the best quality work that a 3060ti can produce?
yeah, any Flux image
with the quants and tiled upscale you can do it
Is there a preferred gui frontend these days?
ok thank you
Automatic1111 for beginners, forge or comfyui for flux usage
You can generate large images by upscaling
So no problems with that
OK, I think I will get it installed
You can follow my install guides.
For Automatic1111 it shows the best config depending on the vram.
https://github.com/CS1o/Stable-Diffusion-Info/wiki/Webui-Installation-Guides
awesome, thank you
Np
Hey, I used to run 1111 and switched to comfy. I was reading comments on reddit and saw a lot of people (in particular ppl with 8gb vram or less) saying they got better gen times with forge. Is this possible or is it just confirmation bias?
not sure
Anyone can help me with product photo ?
I am not looking for an already existing workflow as I already tried every single one of them and those are not the results I am looking for, please dm me.
I'd love to talk with someone experienced to get their point of view. 😇
Is it possible for stable diffusion to automatically make a folder for each prompt? It would be annoying to set up when generating 1 by 1, but when I do batches of 10 or more it would be nice.
Does it work with like %folder-name%-%number% or something? 😅
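Not sure which UI this is about, so here's a plain Python sketch of the folder-per-prompt idea that could sort outputs after the fact. The function names and the `out` root are made up for illustration, not settings from any UI:

```python
import re
from pathlib import Path


def folder_for_prompt(prompt: str, max_len: int = 40) -> str:
    """Turn a prompt into a filesystem-safe folder name."""
    name = re.sub(r"[^a-z0-9]+", "_", prompt.lower()).strip("_")
    return name[:max_len].strip("_") or "untitled"


def output_path(root: str, prompt: str, index: int) -> Path:
    """Build <root>/<folder-name>/<folder-name>-<number>.png for one image."""
    folder = folder_for_prompt(prompt)
    return Path(root) / folder / f"{folder}-{index:04d}.png"


if __name__ == "__main__":
    path = output_path("out", "A cat, on a roof!", 3)
    path.parent.mkdir(parents=True, exist_ok=True)  # create the per-prompt folder
    print(path.as_posix())
```

If you're on A1111, it's worth looking in the saving settings first, since a directory name pattern there may already cover this without any scripting.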
Photo, or generated image?
wsg
Yes its slightly faster and uses less resources
Wassup
I'm considering modding my 2080ti to 22GB. Or upgrading to a 3080ti and modding that.
- So I'll be able to make 1400x1400 images? But is SDXL trained to do that?
- I've noticed that reducing resolution from 1000x1000 to 800x800 improved prompt following a lot...
Gemini Pro says that more vram potentially allows better adherence to instructions. Is that true? Is it significant?
@warm junco
Starting from a real picture, to integrate it in a new background. But i don't want to do some basic image compositing, I want the AI to recreate the product so it has a better blend of light and colors
hi
The 1400x1400 should be fine for SDXL, but there are so many factors that contribute to how an image is generated that it's hard to say how the image will turn out, no matter the VRAM amount. More VRAM helps adherence since it permits higher-resolution models.
(By this higher resolution I mean models that are not optimized for lower-end cards)
Generating at a lower pixel resolution does seem to follow the prompt better, though I am not really sure how this infrastructure works.
Bro please check your dm
More vram doesn't mean better prompt adherence. Like gwolf said, it allows you to use larger models.
It depends on the models.
Sdxl is trained on 1024x1024 but with Upscaling you can go up to 2k, 4k etc
okay thanks guys
actually when I upscale I lose detail, it becomes worse
tried all the best ones
If you used hires fix, try lowering the denoise
Also if you upscale in img2img you can use resize (latent)
only tried normal upscalers
I'll try hires fix
Don't use the Extras tab for upscaling, it can't generate details
I'm in ComfyUI
Ahh okay
Thats.. not going to happen unless you train a lora on your product
Hi. When I share a google drive link with someone and give him the editor role, could he access files other than what I shared with him?
some legend wanna run me through some questions i have abt textual inversion?
ok
I want to train an embedding on my face.
Im technically fairly competent but this is way above my skills.
1:
when i train an embedding do i have to use the model with which im afterwards gonna generate the images?
2:
Can i use any model? I wanna use dreamshaperXL_v21Turbo for the generation.
3:
how? As far as i can understand a1111 cant train with the sdxl models.
I installed OneTrainer and immediately have no idea what im doing. I cant even find where to specify the location where my training images are and the documentation is virtually non-existing.
It feels a bit like a phd in computer science is basically a requirement for playing with anything deeper than the most basic image generation
embeddings train a new token vector on the text encoder (clip) side rather than the diffusion model
you do have to use a specific text encoder while training, and its the one you want to use in inference
i have no idea what that means =)
you don't need to know much but basically stable diffusion has 3 parts
- text encoder(clip), converts text into embeddings which are basically "meaning"
- diffusion model(unet), uses the embeddings above, to generate the image as latents
- vae, to convert the latents above into an actual understandable image
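If it helps, here's the same three-part flow as a toy sketch. Every function is a made-up stand-in (no real CLIP, UNet or VAE involved, and real latents have 4 channels, not 1); the only thing it tries to show is what gets handed to what, and that the latents are 1/8 of the image size:

```python
VAE_SCALE = 8  # SD's VAE maps each latent cell to an 8x8 pixel patch


def encode_text(prompt):
    """Text encoder (CLIP) stand-in: one small vector per word."""
    return [[float(len(word)), float(i)] for i, word in enumerate(prompt.split())]


def denoise(embeddings, width, height, steps=20):
    """Diffusion model (UNet) stand-in: returns a latent grid at 1/8 resolution."""
    bias = sum(vec[0] for vec in embeddings) / max(len(embeddings), 1)
    latents = [[0.0] * (width // VAE_SCALE) for _ in range(height // VAE_SCALE)]
    for _ in range(steps):
        # Each toy step nudges the latents using the text embeddings.
        latents = [[0.5 * (cell + bias) for cell in row] for row in latents]
    return latents


def decode(latents):
    """VAE stand-in: blow each latent cell up to an 8x8 block of pixels."""
    image = []
    for row in latents:
        pixel_row = [cell for cell in row for _ in range(VAE_SCALE)]
        image.extend([list(pixel_row) for _ in range(VAE_SCALE)])
    return image


def generate(prompt, width=64, height=64):
    """text -> embeddings -> latents -> image, in that order."""
    return decode(denoise(encode_text(prompt), width, height))


if __name__ == "__main__":
    img = generate("a cat on a roof")
    print(len(img), "x", len(img[0]))
```

An embedding only touches the first function's side of things, which is why it has to match the text encoder you use later.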
The easiest option is to not train anything, but just use something like IP adapter, which A1111 supports out of the box I believe. This just needs a single good quality image of your face. Try it first; if it does not give you good quality, you might want lora training.
hmmmmmmm okay... Is training a lora less complicated? Do they produce better quality images than embeddings?
Hey are there any sub servers that are using stable diffusion? How can I find them?
explain please?
Slow days.
People are in pain and preoccupied.
"What's next" everyone asks because we all know the way things are can't go on.
LoRA training trains the diffusion model; an embedding only trains on the text encoder side. LoRA training should give better quality.
Quick question for everybody, I remember on Automatic1111 you could install an extension that would show an image wall of your checkpoints and loras, and then you could auto assign preview images to each checkpoint and lora. I'm using Forge now, and it has the "wall" for checkpoints, but I can't figure out how to give them preview images. Does anyone know how? Thanks
Click the little hammer and wrench icon in the top right corner. Then in the bottom center hit replace preview. But it will only use the last generated image, there is no option to select something else.
gm
gm
I think it is possible with multiple ksamplers, playing with the denoise and ending with a restore detail node
Hey, anyone know how to do a good, accurate color match?
I am placing a subject on a new background but the colors look off.
I have tried using colormatch node but the result is off too
Been saving up for a 3090 PC. Any places to get one cheaper? I know there's craigslist but no good deals around locally yet
guys, anyone know why inpaint with brushnet and an sdxl model changes the background color and makes the mask seem so visible?
hey guys, does anyone know of a comfyui node that gives output in a few separate bursts? like for example, it generates 3 images, but instead of returning it as one bundle of 3 images, it outputs it one at a time?
@warm junco I'm gonna have to redo everything with 5.7 I think, with 6.1 it ends up crashing while compiling after throwing a bunch of python errors
ONNX failed to initialize: module 'optimum.onnxruntime.modeling_diffusion' has no attribute '_ORTDiffusionModelPart' Exception Code: 0xC0000005
then a bunch of rocblas.dll stuff
Hi guys. Does anyone use flux with 6GB VRAM? Searching the internet, I keep reading about people who do it and can generate in under 5 min... I don't understand. I have 6GB VRAM (a GTX 1060 card) and my generations take between 11 and 16 min, approximately... 😞
I use flux1-schnell-bnb-nf4.safetensors and the basic workflow found here : https://turboflip.de/flux-1-nf4-for-lowend-gpu-for-comfyui/
What was your GPU again?
Normaly its better to use 6.1
6600 xt
Change is always a constant. The way things are can never go on indefinitely.
Change is the ONLY constant some have said
@warm junco Hey there! Sorry for the ping. I just wanted to let one of the admins know - y'all need a "Programmer" field in the onboarding section where you select your hobbies
I, personally, tackle SD from a software engineering perspective
Any recommendations for image to 3d? I want to mainly use it for anime hair
I don't think good image to 3d exists yet
Probably not, but I want to get a base and then modify it in a 3d program
Hello!
hi
@warm junco I think I got it running now? It's still iffy and the results are questionable, but at least it's not a grey box. I do still have 6.1 and 5.7 HIP SDK both installed as before so I really don't know what happened, nothing was actually changed to be honest
It's literally just smudges but still better than grey
Wait, never mind. It's working now somehow
Okay, good. Let me know in #🤝|tech-support if you get any error again
who the f is Jon Snow
What are the 3 or 4 best Esrgan models for upscaling people??
is there ever gonna be a stable diffusion for generating 3d models?
I am an experienced AI developer with 2 years of expertise in creating innovative solutions, as well as a fresher web app developer skilled in React and Next.js. Additionally, I specialize in building crypto trading bots and offer my services at affordable rates. Let's collaborate to bring your ideas to life with cutting-edge technology
does anyone else have issues with forge getting Killed when using --medvram or --medvram-sdxl?
hey guys, ultimate SD upscale and tiled diffusion, which one is more recommended?
already are a bunch https://stability.ai/stable-3d
two different things that can coexist
video says that they are two different ways to upscale picture, but which one is better?
don't know what video you're talking about.
Tiled diffusion is about splitting the VAE step, which normally runs on the whole picture, into multiple VAE operations that each produce a smaller image, and then stitching things back together to get one big picture. That makes it much easier on vram consumption.
and Ultimate sd upscale is about upscaling
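A minimal sketch of the split-and-stitch idea, only computing where the tiles would go. The tile size and overlap values are arbitrary assumptions, and the function names are made up rather than taken from any extension:

```python
def tile_starts(size: int, tile: int, overlap: int) -> list:
    """Start offsets along one axis so that tiles of `tile` px cover `size` px."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be smaller than the tile size")
    if size <= tile:
        return [0]
    stride = tile - overlap
    starts = list(range(0, size - tile, stride))
    starts.append(size - tile)  # last tile is pushed flush against the edge
    return starts


def tile_boxes(width: int, height: int, tile: int = 1024, overlap: int = 128) -> list:
    """(left, top, right, bottom) boxes; each one would be processed on its own."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in tile_starts(height, tile, overlap)
        for x in tile_starts(width, tile, overlap)
    ]


if __name__ == "__main__":
    boxes = tile_boxes(2048, 2048)
    print(len(boxes), "tiles, first:", boxes[0], "last:", boxes[-1])
```

Only one tile has to sit in vram at a time, and the overlap is there so seams can be blended when stitching back together.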
are there any channels for forge webui
if it's a technical question, just ask it in #🤝|tech-support
do yall know stability matrix??
Hey guys,
I am going to create something like an influencer, and I was wondering what kind of settings/extensions/LoRAs are good for creating it, and what you guys use. I already downloaded the pony model from civitai but I have no idea how I could generate consistent faces. Last year, I managed to generate consistent images using a faceswap, and in the prompt I would input a lookalike celebrity so the result would come out mostly the same. Are there any better options to go about it? Do you guys use the pony model? If you do, could you share some configurations and prompts? Sorry for this mess that I wrote, and thanks for any comments😀
Meshy.ai is the best text to 3D I've found, the models it creates are actually pretty good! Image to 3D is a more difficult problem. Monocular photometry, the most promising method of inferring 3D structure from an image, is still very primitive. Even if it wasn't, there's only so much information that can be gleaned from an image. And if that image is an illustration, then there's essentially no point
I've put a lot of thought into it, and the problem boils down to three sub problems: inferring the structure of what is visible from the image, inferring the structure of what isn't visible, and actually translating this structure into a 3D mesh. If you've got the latent then the last part is relatively simple. Multi-view inference is the only mostly-acceptable way of solving the second problem, and breakthroughs are required to solve the first
Especially for stylized tasks, the only actual way to do this is diffusion (or similar) in a shared embedding space which can simultaneously decode into both 2D and 3D. If you can do this, then an encoder can reverse engineer a latent from an image, with diffusion (and friends) doing the work to synthesize whatever isn't visible. But the processing power required, not to mention the dataset
shudder
Hello
H a p p y C a n a d i a n T h a n k s g i v i n g!
wait... no way to use ssl with stable diffusion server?
my reverse proxy setup only works (on my https site) when the webserver and the stable diffusion server are on the same machine... testing from a client webpage I get a cors error 😦
Someone that has familiarity with the SD api and can test the wordpress plugin I made to see if some features that can be implemented are missing?
even less frequented dc, but coolest of them all, all of dc been pretty quiet tonight, is there some special occasion going on in the whole wide world that I am missing out on lmao
I dont know where to ask this question but is there a way I can run SD locally and have it be used in my discord server?
best upscaler for CGi graphics? Think of old mario 64 promotional silicon graphics style
hello, does anyone know if flux nf4 supports lora??
canadian thanksgiving?
Search for ajabot
What can I add to my SD wordpress plugin? https://fidefix.ddns.net/sd.mp4
is there a consensus on the best photoreal XL model now? i liked Helloworld, but not been updated in quite a while
hello guys, right now I use i2i to do style transfer (with SD1.5). There is some text in my pictures (they contain signs with text). After style transfer, the text becomes blurred and illegible.
Is there any method that can help me keep the text on road signs clear when performing style transfer? Are there any LoRA weights or workflows available for use?
Thx for your advice!
Maybe changing the base model can solve this problem? Does anyone know a base model which does better at text gen?
Hi Thomas from LA here, working on https://WandAI.app, a one-stop AI creativity workspace and community especially designed for non-technical creatives .
We're inviting 100 creatives to join the internal testing, where you can share the challenges you've encountered when using AI tools, brainstorm new ideas, and more. Feel free to jump in!
hello
hey guys
Good morning
One thing, to move Automatic1111 and my comfyUI to an external SSD, what do you recommend?
Would it be just a matter of copying and pasting the folder? Or would I have to install the interfaces on the SSD and transfer the checkpoints, loras, preprocessors, etc.?
yo
hey
i am looking for someone as same age as me to be a fellow developer or any other. wanna be a friend?
This community is dead... I am developing multiple frontends (wordpress web plugins) for textgen and txt2img servers asking people what they like and maybe test functionalities to give feedback, nobody cares, nobody responding... I am asking myself why I am here...
This in multiple ai discord servers.$
They're busy generating cats and ponies.
I would appreciate it if you could send me more active servers related to comfyUI and A1111, on this server they are totally indifferent; it is a waste of time to ask here.
What is the best way to make a stylized photo of yourself?
Like pin up or something?
hi hi
You can try flux or sd3, both are great at text gen but sd3 is bad at humans. Flux will take much more time but give you better quality.
It should technically but it will not be as good as applying Lora at bf16 or fp16(normally)
You can use controlnets and ip adapter. You can also use special techniques like https://github.com/songrise/Artist
how can I make a poster on this ?
I need to make a poster on mineral resources for my college can anyone help?
Just post here.
sd3 is only bad at humans if they are laying down or standing on their head
No, it's basically any pose apart from standing. Even with standing, fingers are usually deformed or messed up.
Is there anyone that understands llm nature? So I'm having issue where the model will not respect the token limit or the stop commands. Sometimes it will start its response by finishing my question in its own little way. Sometimes it will give me a response and then reply as me and keep going on forever. Other times it will give me a nice response and then start spouting gobbledygook and kanji. In almost all cases it just keeps going forever becoming less and less coherent before eventually cutting itself off mid sentence. I feel like it's something to do with prompt templating but I don't know anything about this. I don't think it has anything to do with the ram because I only experienced that issue in certain models.
anyone with a 3090 can screenshare me their generation speeds?
Is there anyone here who works directly for stable diffusion that I could speak to?
I would Like to ask specific questions about copyright and commercial usage rights for the artworks we create.
wassup
why is there a LyCORIS folder if you're forced to move it to LoRA anyways?
Historical reason. An extension used to be required for LyCORIS. Separate functionalities, separate codebase, separate filetype => separate folder. Now LyCORIS support is built into Auto1111 (and most other interfaces).
Guide worked great. Thanks. Any recommendations for models I should get?
hey. I want to do a, like, me. but a cartoon, in the woods with the sun shining, maybe holding a lollipop.
is there a good cartoon model?
Hey, I'm not sure if still relevant but yes my Discord bot does it. Contact me if interested!
I have a weird bright blue eyeshadow that has crept into my generations somehow. It is in every image even with no loras or changing models. Is there a cache I can clear without deleting all my settings or something?
Oh! I'm using forge by the way.
Hey everyone! I'm working on a large-scale 360 equirectangular video project with over 32,000 frames and need some advice. Specifically, I'm looking for help with:
Controlling latent images and noise for smoother transitions
Mapping latent control to a camera track
Tips or best practices when working with equirectangular projection in animation
I have prior experience with A1111, Stable Diffusion, and ControlNet on a similar project from two years ago, but this one’s a bit more complex. Any insights or techniques would be super helpful! I'm happy working with python, json.
I'd really appreciate an explanation of how to use your own noise layer, and how I'd go about building my own transformation program for equirectangular projections in A1111 or another UI. Alternatively, how can I use software like Blender/TouchDesigner to take my rendered image, transform it based on camera tracking data (along with the noise layer), and layer it all with the next frame from my reference video?
is it possible to mirror the UI of Forge/A1111? So the image preview is on the left?
Hello, I am new to stable diffusion. I tried using the prompt "a cube on ground with a light source directly above it". The idea is that the light source would affect the shadow; I don't really want to show the light or anything on top of it. Could someone suggest what prompt I can use?
A question: when generating, loading the model sometimes takes a while and sometimes doesn't. In theory, which is faster for loading, an SSD or an HDD?
Hello
is there a model like chat gpt but offline, to make comments, tooltips, etc. for your code?
free open source?
nvm found some
What models are compatible with fooocus? Is it only the SDXL that gets installed with base fooocus?
Fooocus uses only sdxl
But it can work with pony models too
What is the best way to emulate the faceswap feature from fooocus in Automatic1111?
hello:)
Put the name of an Artist in the Prompt or select a Style Template
All of my attempts to use Flux seem to fail, the preview is a grey pattern shown here. The final image is just all black...what am I doing wrong? I'm using Forgeui and the flux1-dev-bnb-nf4 model. (deleted image since those aren't allowed in this channel)
Reactor extension or IP-Adapter face
hello which a1111 version is most compatible with extensions?
Hi everyone, I would really like to create my own checkpoint from scratch. Could you tell me how to do it? Or even better, a video tutorial? Let me know
my images always come out undetailed, even at 30 steps and 2 clip and 4 million loras, anyone know why?
Hello all- I am looking for help with a project relating to flux and creating seamless images. Right now, it is unable to make images that can be tiled into larger designs. If anyone is up for a project, I am willing to pay for its development.
none of the AIs make truly seamless images - that's something you need a different tool for
the latest
you would need millions of annotated images and gpu server farms to get it done in a lifetime.
training an already existing model is possible tho
hey everyone
Clip 2 won't help in that regard.
It depends on your prompt, your model, your sampler, etc
The more loras you have, the more likely they are to clash with each other if you don't do it carefully.
What Sampling Methods and Upscale methods are you guys using and what do you recommend for Cartoon Style?
please does anyone know how i can set up live face swapping on a gtx 2050
I have been trying to generate Iron Man Prime for years now. Not a single AI image generator can make Iron man prime model 51. Any tips?
could try to make a LoRa off of him yourself using art from the official comics or something? i dont think it would be that hard to do??
In total there are not many images of the prime armor model 51. So as mentioned, either you train it yourself or use some ipadapter composition / style transfer magic with some of the original images
ah ok, thnx for the info. i usually use dpm++2 because euler often has waaaaay too many errors to bother. i usually use pony or juggernaut
hi there, new to stable diffusion, is there a way to use it similar to midjourney where you generate in a private dm?
If i have a reference and i only want something inside of the black how would i set that up in flux?
If you have a moderately good PC (memory and VRAM), you can run it locally, with the utmost privacy
hey man thanks for the reply yea been looking into it some more this evening, have comfyui running, just have to learn how to use it now lol
hi guys
yess i love getting malware !!😍😍
who's a mod here
sounds like a fantastic scam, count me in, i love scams!
We are a commercial company, and we currently have a paid project requirement. We need assistance with editing some images by replacing the heads of different people into specified positions. Could you please let us know if there are professionals available to help us with this task?
there's a #1092446741984444416 forum you should post this in - but this might be better asked in a photoshop discord
😊 thanks
Can you be a little more specific with the work involved? Are you looking to adjust facial characteristics? Also feel free to message me about this I might be able to help!
how do i create an image
Anyone know any resources for sdxl-sdxl base-refiner setups?
gm
im looking for NSFW enthusiasts to chat with, hmu
Who wants to work as a moder or developer in my project?
you lost me at work
hello everyone

Hello All
Hi friends, I generated a pic in stable diffusion using the epicrealism_naturalS model. I liked the face/character it generated. However stable diffusion isn't generating similar images in the img2img tab when I pass in the original image. I didn't even change the prompt, I just want different variations of the same face. Any help here? Been struggling with this for some time now...
I'm looking for somebody to make a character Lora, since as much as I look into it I seem not capable of doing it
or at least somebody to baby-guide me on it
Try FluxGym, doesnt get easier than that...
answered this in the #🤝|tech-support channel
So I just discovered something new. This is pretty fantastic for FLUX users. It's a new FLUX LoRA that makes it Turbo. It uses 8 steps for a fantastic image. It's Alpha, but totally worth testing. https://huggingface.co/alimama-creative/FLUX.1-Turbo-Alpha/commit/b2db8dcbd15fb095cffd8ab530499e47883466e7
does anyone want to help me with my homework
I need to make a poster about a fictional character running for mayor of my hometown
dm me
just go to mage.space, make a free account, generate some images, and make the poster
i'm new to AI, and I want to incorporate it into my creative medium as one of the process... I'm quite confused with the terms. what's the difference between Model vs Assistant / Artisan? Is it Model being run on my own computer and the other is web hosted? If so, what's the minimum requirement to run a computer-noded Model? I'm only researching at the current stage. What would you recommend for a beginner to start off with? Thank you and look forward to digging more about SD.
I used Midjourney previously.
Can I prompt right here in the chat?
an assistant is usually a text LLM (large language model) that you type text to and that talks back to you. a 'model' is an overall term for all AIs. only a few models are small enough to run on your own machine. and you want an nvidia GPU
you're welcome to DM me if you want and I can try to answer more specific questions
the only prompting channels are the #artisan-1 through #1237460438229450772 channels and you'll need to read the information in the #artisan-faq channel first
can anyone help me generate an image correctly? i have the correct checkpoint and lora but it's not coming out like it should.
what interface are you running
Automatic1111 if that's what you're meaning. sorry i'm new
that's what I meant. you might find more people that use Auto1111 in the #🤝|tech-support channel than in this one
Hello
Do you need anything?
nah, there was a bot when I posted that
it's gone now
Ah yeah, it can take some time for us to catch everything. Especially with time zones and jobs 
Hello guys, I am wondering if there is any way to create a different angle (perspective) of an existing photo I have... I need to explain first... I have an existing photo of a real property, and I have a sketch of the building, but the sketch doesn't fit the perspective of the real photo. Is there any chance to transform the real photo to match the sketch of the building but retain colours and objects, and of course generate the missing parts based on the input image?
this is very hard atm, you need to somehow project the LoRA into 3d and then map it back down using NeRF onto 2d based on camera intrinsic qualities
its theoretically possible but we need someone to do the research on it lol
or u can try publishing a paper yourself and become famous at CVPR 😄
guys, i got a rtx 3090, however, its doing like 3.3 iterations per second, while my rtx 3060 did like 1.4 iterations per second.
I thought i would get like 5x the speed?
like, its still good tho
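For reference, a quick sanity check on the two speeds quoted above (3.3 it/s on the 3090, 1.4 it/s on the 3060). This is only arithmetic on those figures; the 25-step run is an assumed example, not a benchmark.

```python
# Speeds as quoted in the chat; everything else is plain arithmetic.
its_3090 = 3.3   # iterations per second on the RTX 3090
its_3060 = 1.4   # iterations per second on the RTX 3060
steps = 25       # assumed example step count

speedup = its_3090 / its_3060
secs_3090 = steps / its_3090
secs_3060 = steps / its_3060

print(f"speedup: {speedup:.2f}x")
print(f"{steps} steps: {secs_3090:.1f}s on the 3090 vs {secs_3060:.1f}s on the 3060")
```

So the quoted numbers work out to roughly 2.4x, not 5x.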
yes I was searching for something, but there's not enough information about it. it's like changing the perspective of an existing image, but keeping the same forms
that's really a photoshop job...
which webui are you using?
Forge
okay make sure its updated, then delete the venv folder.
Also if you upgraded the gpu you may need to reinstall the driver so the webui doesn't think you're still on a 12gb vram card
ill do it now
ty
np
after deleting, should i run environment or run?
run
i did, however, it didnt create a new "venv" folder, idk if thats ok
then something is wrong 😮
hmm
ah, maybe i had connected to a1111 root
"@echo off
set PYTHON=
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS= --opt-sdp-attention
@REM Uncomment following code to reference an existing A1111 checkout.
@REM set A1111_HOME=Your A1111 checkout dir
@REM
@REM set VENV_DIR=%A1111_HOME%/venv
@REM set COMMANDLINE_ARGS=%COMMANDLINE_ARGS% ^
@REM --ckpt-dir %A1111_HOME%/models/Stable-diffusion ^
@REM --hypernetwork-dir %A1111_HOME%/models/hypernetworks ^
@REM --embeddings-dir %A1111_HOME%/embeddings ^
@REM --lora-dir %A1111_HOME%/models/Lora
call webui.bat
"
should i remove everything from webui-user?
mm ok its downloading now, ty
looks okay
not really 🙂
im testing again and speeds for sdxl 25 steps with 1 controlnet doing about 3.5it/s
however, i think this is the normal rate for 3090
yeah it is - photoshop and perspective warp: 10 seconds. battling with comfyUI: 3 months if not more
I think it’s a much more difficult problem since they need to basically generate a different view point of the image with new objects as well. I doubt perspective warp can solve it.
What are you using? You can boost speeds massively with stuff like torch compile, stable-fast (probably the best option?), or OneDiff.
https://platform.stability.ai/docs/api-reference why cant I access this page?
its stuck on loading
he could train a lora on the building - but he'd need more than one shot of it
how do i do that?
Yep you are correct, maybe they can use ctrl x as well which allows for a reference img(the building photo) and a control image(the sketch).
It’s not going to be Lora level but doesn’t require training at least.
If you are using diffusers, the GitHub has example code.
i dont understand sht about code
Ok I would just recommend sdxl lightning models if possible, they should give similarish quality while only requiring 4-8 steps. They will have slightly less varied images tho.
when I add in multiple loras to my prompts I always get this error: Lora not found: WesternCartoonClassicDisney100, andav, shan
anyone ever deal with or fix this?
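One way to narrow this down, as a rough sketch: A1111-style prompts reference LoRAs as `<lora:filename:multiplier>`, and a "Lora not found" message generally means a name in the prompt doesn't match any file in the Lora folder. The helper below is hypothetical (`find_missing_loras` is not part of any webui) and assumes exact, case-sensitive name matching.

```python
import re

# Matches <lora:name> or <lora:name:multiplier>; captures only the name.
LORA_TAG = re.compile(r"<lora:([^:>]+)(?::[^>]*)?>")

def find_missing_loras(prompt, available_names):
    """Return LoRA names used in the prompt that are not in available_names."""
    requested = [m.group(1).strip() for m in LORA_TAG.finditer(prompt)]
    known = set(available_names)  # assumption: exact, case-sensitive match
    return [name for name in requested if name not in known]

prompt = "a castle <lora:WesternCartoonClassicDisney100:0.8> <lora:andav:0.5>"
available = ["WesternCartoonClassicDisney100"]  # e.g. file names without extension
print(find_missing_loras(prompt, available))
```

Any name it prints is worth checking against the actual file names in the Lora folder.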
i actually have multiple drone shots of the area
problem is I dont have the right one, and clients are not able to shoot them again 😄 😄
generative AI is deliberately coded to not make identical images of things it learned about. so first it would need to learn about that specific building, and that requires training a lora, and then you'd need to figure out how to keep it from being random. you might want to try what @quartz siren suggested
is the api page down? https://platform.stability.ai/docs/api-reference can someone please tell?
we don't support the website or API - you'll need to contact customer service
I see...thanks anyway
if you want to DM me, i can try to assist you
yea sure why not 🙂
making art to put art into the loRA to make art easier / faster, never ending struggle lol
What are the 3 best ESRGAN models for upscaling people and faces??
All those upscaling models are more or less the same.. :/
I havent found one that blows all the others away
Help me draw a vampire picture
Totally new to the server. Where do I create images? What models are available? And do you offer loras? Or is this a support discord for people creating art on their home systems? Please tag me with any responses.
you need to read the information here #artisan-faq as far as creating images here goes. other than that, each channel is fairly well named. and if you have technical questions, ask in #🤝|tech-support
hello
does anyone have expertise that can help me recreate a style? it's giving me a hard time
i have an example prompt to follow but my results are not good at all
what sort of style?
Would be great. Thanks. I will dm you.
someone made custom portraits for randomized characters in a game and I would like to recreate it, but my results are not very good even when following his prompting
i'll link to you in the images chat
Newbie here: when I use inpaint to mask something out of an image, I get it to work, but there is always a faint color difference, that I can never fix. For instance, I masked a person out of a desert scene, and the mountains, cacti and sand behind the person came through, the person is gone, but there is a faint ghosting around where I had the mask. What (if anything) can I do to fix that?
hi guys and girls, how can i make a stable diffusion site that generates images with a specific LoRA? im trying to make a SaaS app that sells halloween and christmas decorations on images. I'm having trouble figuring out how to host stable diffusion on my virtual private server.
python django or flask
and then something like diffusers, comfyscript, stable diffusion C++ lib or a pytorch script for the stable diffusion part
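A very rough sketch of that shape, to make it concrete. To stay dependency-free it swaps in Python's built-in `http.server` instead of Django or Flask, and `generate_image` is a placeholder standing in for whatever backend ends up being used. The `/generate` route, the style names and every function name here are assumptions for illustration only.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical style names, one per seasonal LoRA.
ALLOWED_STYLES = {"halloween", "christmas"}

def generate_image(prompt, style):
    """Placeholder for the real backend (diffusers, ComfyUI, etc.)."""
    # Assumption: the real version loads a checkpoint plus the LoRA for
    # `style` and returns encoded image bytes.
    return f"placeholder image: {style} | {prompt}".encode("utf-8")

def handle_generate(payload):
    """Validate a request body and return (HTTP status, response bytes)."""
    if not isinstance(payload, dict):
        return 400, b"bad request"
    prompt = str(payload.get("prompt", "")).strip()
    style = payload.get("style")
    if not prompt or not isinstance(style, str) or style not in ALLOWED_STYLES:
        return 400, b"bad request"
    return 200, generate_image(prompt, style)

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/generate":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            payload = None
        status, body = handle_generate(payload)
        self.send_response(status)
        self.send_header("Content-Type", "application/octet-stream")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def serve(port=8000):
    """Run the sketch locally; call serve() to start it."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

The point is only the split: the web layer validates the request and the generation step sits behind one function, so the backend can be replaced later without touching the routes.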
yeah ok, i am looking for a developer i can ask questions to on Upwork. but none of them have the knowledge of how to do it
is there a stable diffusion service on the web that can host the GBs of stable diffusion model files, so i can just call it from those scripts on my server?
sorry, i have 10 years of experience in laravel and react, but when it comes to AI i'm a noob
ah thanks
hey
gm
Sd 3.5 out!
Hello, joined at the right time 🙂
sd3.5 but prolly still not better than flux or WHAT? 😄
woman lying in grass on announcement post is a nice touch
they could have selected a better gen and not the one with crooked hand
and eyebrow... and more...
how many legs are in a woman laying on the grass
as many as she can fit
The images on the site at least have less of a plastic skin look than flux default outputs
Thats what ive looked for. I dont see any texture on the womans skin?
except for the neck maybe
These to me look a little less plastic than the flux outputs
i kinda like that they didn't tease the model. i can just download it now
pretty bold to use a woman lying on grass as an image for their blogpost
we'll see once people start generating more images
thats somehow true
i wonder why this plastic look even exists
but from what i see, the licensing looks manageable
ye
Let's see if they fixed it, I honestly was not expecting another launch for Stable diffusion
dunno, dall-e 3 has it too
i dont really care about the licence
yes it has it very strong
holy its a 16 gig model
where to try SD 3D guys?
ill wait for the prunes
Ah yes,
Sd 3d
hf spaces
thanks!
thats 2 links to sd 3.5. not sd 3d?
where is the comfy workflow for it?
haha SD 3D
.
better prompt following, lower aesthetics
found it on hugging face but dunno is this correct or not
aesthetics can be trained by preference tuning
prompt following on the other hand...
we'll see
better prompt following would be crazy. Flux mastered it imo
if any staff is here, $0.04 per image w/ 3.5 turbo vs $0.003 per image on schnell is kind of brutal :c
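To put those two prices side by side, a small back-of-the-envelope calculation using only the per-image figures quoted above. The 1,000-image batch is an assumed example.

```python
# Per-image prices as quoted in the chat (USD).
price_sd35_turbo = 0.04
price_schnell = 0.003
images = 1000  # assumed example batch size

ratio = price_sd35_turbo / price_schnell
cost_sd35_turbo = images * price_sd35_turbo
cost_schnell = images * price_schnell

print(f"price ratio: {ratio:.1f}x")
print(f"{images} images: ${cost_sd35_turbo:.2f} vs ${cost_schnell:.2f}")
```

At the quoted rates that is about a 13x difference per image.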
Did they remove the NSFW and why???
cant integrate based on that
flux is too slow so i hope sd 3.5 actually lives up
It's probably about as fast tbh
not mastered but a massive step forward
Flux is insane
4B less parameters, wonder what level of diff that'll make to speed
They're like 3 steps ahead
flux quantized version / hyper models are pretty fast
with good quality
17 seconds per image on default settings on a 4090 with a long prompt
those are the ones i use but still too slow for me
the default release versions are almost unusable even on 3090
they work
unfortunate. They are consuming lots of computing
but your pc has to be totally idle when generating
why is there no comparison with flux 1.1 pro
im running on 3060 with 40 secs per image
if you start doing things, the generation just stalls forever
too scary
Cause SD wants to look good, not bad
what size images
but not the base model *
Flux 1.1 pro is a huge leap up from dev
it was 1024x1024
oh yeah thats true. absolutely forgot about that one
i wonder if a 5090 would be able to run that locally
one funny thing in your bio " has overpowered pc" 😄
i hope so and they decide to release the weights
Well
I lost my pcpartpicker
But it's 64gb of ram, an MSI 4090 and an i5 13600kf
you need that much to run sd 3.5!? it's just 8b params
nice 👍
or will community just optimize the shit out of it like is always done?
No one said you needed that much
does the 4090 come with 24gb vram?
Yes
yes
hello guys
Man im sitting on my lil 3060. However, if ai will continue like this we will need 8090 soon
maybe you will be able to run the smaller model that releases soon
hi
im an artist im 78.6 years young and i like turtles
I wonder why tf they made stable diffusion 3.5 medium when there's sd 3.5 large turbo
They both perform literally as good
oh man ive given up. Its cloud computing for me from now on
except for highly quantized models
So how anti NSFW is the new model?
can anyone rate my picture that i just drew
most of the people here dont respect "real" art. LOL
Also, how is it possible that sd 3.0 is on par with flux schnell?
According to their measurements
Cause that's gotta be skewed in a way

Look at their announcement page and scroll down lol
Is SD 3/3.5 available in auto 1111 yet. Had a break from ai so I'm not up to date again.
i believe you.
i just say its cap
... It released like an hour ago
sd3 == flux schnell. like on what earth
i think similar to flux
Stuck with SDXL forever then? 
Just train a new model
sdxl also is like flux, maybe even stricter
or nsfw unlock loras
Lmao
id say its stricter
yes
that omnigen is good, if it does all those things it promises its a game changer
for me the model was not working and i got errors and i thought wtf but then i noticed i had another ai program also running lol
lol funny that you mention it, that was the ai that was open
lol
But it's been decensored pretty harshly, while there's nothing really conclusive for flux as far as I can see
But obviously I don't know anything about it, I'm just an amateur user
omnigen needs like 24gb vram
So does sd 3.5 doe
but they can still improve it a lot by reducing accuracy
I think
3.5 currently needs 17gb but maybe it's offloading something to the system ram
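A rough way to see where a number like that comes from: weight memory is approximately parameter count times bytes per parameter. The sketch below uses the 8B figure mentioned earlier in the chat and counts weights only. Text encoders, the VAE and activations are not included, so real usage will be higher.

```python
def weight_memory_gb(params_billion, bits_per_param):
    """Approximate memory for the weights alone, in decimal GB."""
    total_bytes = params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 1e9

# 8B parameters, as quoted in the chat; the bit widths are common choices.
for label, bits in [("fp16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(8, bits):.0f} GB")
```

At 16 bits that is about 16 GB for the weights, which is in the same range as the 17gb reported above, and lower-precision versions shrink it proportionally.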
24gb cards are there
I dont really understand how they even managed to create GGUF of flux. Would this be possible for omnigen too? I havent seen a single transformer model which has been quantized to gguf except for flux
also i am running it in the hardest mode
What is Stable Diffusion guys?
yes it would be even easier for omnigen because it's based on the Phi llm
omnigen is an llm that is finetuned so it can also make image tokens
so it works very differently
but it has cool abilities like context and being able to see multiple images.
and adding new capabilities is easy, you just have to make training data. everything else is just learned
That as a single checkpoint? No agentic process?
Hey, can someone tell me what encoder 3.5 needs?
but the quality is like sdxl or sometimes worse
Do you mean it released on auto 1111 an hour ago? Last time I used auto sd3 wasn't available yet.
ill check it out
here is everything you need https://comfyanonymous.github.io/ComfyUI_examples/sd3/
so flux dev makes better images but prompt coherency is worse than sd 3.5?
i have to test to make sure but their data says that the aesthetic quality of flux dev is a bit better.
but they say finetuning might be easier
for sd3
and model is smaller i assume so faster?
flux is hard to finetune, so if the SD 3.5 is easier to finetune, the community models should be on par with flux or even better
should take a few days still I guess xD in any case i see from the cover she can lay on the grass so thats a big step up from 3.0 😄
is SD still releasing models?
people forget that base SDXL and base SD 1.5 suck compared to the popular community finetunes
brother, did you read the announcement
let's not forget SD 2.0 which was so bad that everybody forgot that it even existed
yeah I'm just confused, I thought it was over for SD or something
no, they hired james cameron, took some time off to perfect the model so they didn't release another crappy one like 3.0, and just released a new one
hi if i am using a amd gpu for stable diffusion is it not possible to also make use of my cpu? also amd
film maker right? interesting
Hey folks. Are there any talks going on about training a diffusion model in a similar manner as INTELLECT-1 ? I have been raving about the need for decentralized training for well over 3 years now and it finally seems (?) like it is happening
What's the 3.5 license like? If it's overly restrictive I might just stick with flux.
The Stability AI Community license at a glance
We are pleased to release this model under our permissive community license. Here are the key components of the license:
Free for non-commercial use: Individuals and organizations can use the model free of charge for non-commercial use, including scientific research.
Free for commercial use (up to $1M in annual revenue): Startups, small to medium-sized businesses, and creators can use the model for commercial purposes at no cost, as long as their total annual revenue is less than $1M.
Ownership of outputs: Retain ownership of the media generated without restrictive licensing implications.
idk if that's the same or not
the link says last updated in july, the post says they're releasing their new license

