#💬|general-chat
Depends 100% on model. Sd1.5 like 4-5s 3070ti
And sdxl like koitenshin 14-30s depends on the lora

And resolution
I typically use smaller resolutions that are close to 1024*1024 ( like 882 * 1238 close enough)
So its faster
Mostly 3:2 or 16:9 tho
Because I can, sometimes I generate 4K wallpapers (also no upscaling or detailing required)
1366*768 for 16:9 remember?
Oh yeah but im on phone and i use a silly dropdown
My braincells dont function at 5am
Well 6 now
Well, if they have 1024x576 that would also work, but it'd be lower res.
Nah I use SwarmUI, it's usually a bit more than a normal 1024*1024 pixels
what browser yall using btw
brave and edge
Yandex
Opera, the portable version.
fix yo language bro 😭
some people have trouble wit their browser
ion know using more ram and stuff
Oh
Then use opera
Like guy on the top said
i am but most say firefox is the best dats why im asking
Im still a disgusting chrome user with 20 tabs open on Average
But i totes recommend firefox
If you have +32gb of normal ram it doesn't really matter that much
Firefox hates the ComfyUI interface, Opera does not. Not sure what's going on there.
When I say hate I actually mean "rampant memory leak"
i do but it still lags hella and takes a long time to load
it says Installed RAM 32.0 GB (31.2 GB usable)
Which UI are you using to run the models?
Sohryyy
a1111
with amd radeon rx 6800s
Yeah, no wonder you're taking so long to generate. Using an outdated UI + AMD GPUs. There's no hope for you.
this message appears a lot
damn should i switch to reforge then?
or amd gpu just sucks
If there's a guide to get it working with AMD GPUs, yes.
ill do it tomorrow then cuz there is
Da hell
Forge / ReForge have ComfyUI's memory management + A1111's interface. That's why Forge is faster than A1111
I thought if you don't have an Nvidia GPU you're basically crippled when it comes to SD
There's workarounds now, but you used to be crippled.
yeah thats why theres zluda
whats the difference between the two
forge and reforge i mean
Btw, have intel gpu users appeared there yet?
The dev behind Forge wanted to start changing crap randomly in Forge and the community didn't like it. Thus Reforge was born.
I don't remember what UI it was, but I have seen people using Intel's ARC GPUs.
It should still apply, but you'll have to do more manual stuff
Forge you mean that minecraft fundamental mod
Or what
ill ask CS1o
This is Forge for SD. https://github.com/lllyasviel/stable-diffusion-webui-forge/
Ah
Understandable
anyone know if stablediffusion.cpp can do tiling (repeating textures or horizontally repeating panoramas / scrolling backgrounds, that sort of thing)? ages ago I improvised it the hard way in Python for SD1.x, then saw they added real support
Hi
Hi, is there a way to capture the timing of each inference step during image generation? I can see the timings through the loading bar, but I would like to record them separately.
Does it make sense to calculate (1024x1024) / (Average time of inference steps ) to estimate the average number of pixels generated per second?
I understand that diffusion follows the concept of noising and denoising, but I feel it's more accurate to measure the time for each inference step, take the average, and then compute pixels per second, rather than using the total inference time (for all the inference steps) for this calculation.
there is no stable diffusion program, it's a model. There are dozens of different tools that can run such model. For whatever question you have, you always have to say which tool you use
because, yes, in many tools you will see the time per step in the console, but probably not in every tool
seconds per pixel does not necessarily make sense, because the running time does not grow linearly with the number of pixels; for the attention layers it can grow roughly quadratically. So a twice as large image might need four times as much time to be generated. However, in practice this depends also on your graphics card
Hello I would like to create rule three four content with the app, how can I do it
Generative AI does not work "per pixel". So I'd say it would be a bad unit of measurement.
Secondly, as kaibioinfo pointed out, you would be measuring the inference program (auto1111, forge, comfyui, etc), not the model.
Thirdly speed does not scale with resolution in a perfectly linear fashion.
Also you can already check this page if you want some rough benchmark numbers : https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
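For anyone who wants to record per-step timings separately, here is a minimal sketch in plain Python. It assumes your tool lets you call something once per denoising step (for example a per-step callback); which hook exists depends on the UI or library, so treat the wiring as an assumption. `StepTimer`, `pixels_per_second` and `scaled_time` are hypothetical helper names, and the `exponent=2.0` default only illustrates the "twice the pixels, four times the time" worst case mentioned above, not a measured value.

```python
import time


class StepTimer:
    """Collects wall-clock durations between successive step callbacks."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock      # injectable so the timer can be tested
        self._last = None        # timestamp of the previous mark
        self.durations = []      # seconds spent in each step

    def start(self):
        """Call right before the first inference step."""
        self._last = self._clock()

    def mark(self):
        """Call at the end of every inference step."""
        now = self._clock()
        if self._last is not None:
            self.durations.append(now - self._last)
        self._last = now

    def average(self):
        """Mean seconds per step, or 0.0 if nothing was recorded."""
        if not self.durations:
            return 0.0
        return sum(self.durations) / len(self.durations)


def pixels_per_second(width, height, seconds):
    """The (width * height) / time figure asked about in the chat."""
    return (width * height) / seconds


def scaled_time(base_seconds, base_pixels, new_pixels, exponent=2.0):
    """Rough estimate of time at another resolution.

    exponent=1.0 means linear scaling, 2.0 means quadratic; real
    hardware usually lands somewhere in between (assumption).
    """
    return base_seconds * (new_pixels / base_pixels) ** exponent
```

Call `timer.start()` before sampling and `timer.mark()` from the per-step hook, then read `timer.durations`. On a GPU the work is queued asynchronously, so without a device synchronisation inside the hook the individual numbers can be misleading even when the total is right.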
Yap
"the app" what?
You mean you want to run stable-diffusion models on your pc? What gpu do you have
My mom keeps watching master chef on the tv and I can't use my playstation ts pmo sm 💔💔
Good morning, everyone! How are you doing today?
Just got off work, also morning? Sheesh big timezone difference
yeah you could implement mixture of diffusers method, multidiffusion method or syncdiffusion method in stablediffusion.cpp
if you want to do diffusion model stuff in a language like C++/C/Rust
you might be better off writing code that outputs CUDA kernels directly, than using stablediffusion.cpp
cos you are already doing a more difficult path so you may as well do the fast one
how can I create r34 content
Your images are taking 20 min. you say? You must be running it on CPU. For AMD/Windows I think you have to install something called Zluda. But I wanted to answer as I have a RX 6700, and what I do is to run it on Ubuntu 22.04.5 (dual boot with Windows). I think that's the optimal way for speed.
ye for zluda probably want linux
it looks like you are using stable-diffusion-2-inpainting
ironically you picked the one censored model that this company has ever made
Thank you
Say sorry to him
just report it, don't engage cos it could be a malware bot
(if you reply to them it triggers them to say more and sometimes to direct message you)
the way the bots mostly work is after "talking" in direct message a bit they drop a URL link, and that's where the malware is
@vapid dove we got someone being a big ol baby in general
❤️
eh i just wanted the emojis
i have never heard of this man in my entire life, though he does sound like the previous troll from last time
thanks maxfield, your the best
thanks maxfield
it could just be a troll yeah
I find troll less scary than malware bot
last week we had one that basically insulted all nine generations of mine so it doesn't really faze me anymore
yeah you get used to it
modern internet is kinda more tame than early internet anyway
it used to be a jungle
these kids wouldn't survive a mw2 lobby with their current mindset
I meant more like 1990s rather than 2009
but yeah even 2009 internet was different to now
hi
you are right bro
It's not actually breaking the laws of vram, it's just using vram more efficiently. Width * Height * FPS * Length gives you numbers less than the VRAM capacity.
i havnt played with it
Hi gang, we are organising the best ever multi-modal creative AI HackXelerator with streams for AI music, image, video, fashion, gaming and combinations. The event can be joined online and/or IRL in London, Paris and Berlin. We are supported by the best brands in the business; Central Saint Martins, Station F, Mistral AI, Hugging Face, Luma AI, Vultr, AMD, Nvidia etc.
This event will bring together 100s of creatives and techies to create awesome experiences https://www.kxsb.org/lpb25
Hello everyone!!!
I hope you are doing well!!!
@w361_emp is SCAMMER.
He stole my portfolio.
Be careful!!!
Is there an option to use loras on certain areas of an image? Like, I need to use an orc lora only on the mouth part, but the whole image should be affected by another lora

Inpainting after?
But during generation? Depends on the UI i suppose
Search for "regional prompting"
they added regional lora to comfy at some point
Sorry again @vapid dove but we have another one -_-
hello
do yall know what would be some of the best controlnet models for referencing and making consistent characters? SDXL btw
reference unet if you can find it
Any LLMs out there that references and updates real time data? I am looking into Gemini
A bird trapped in a cage, with the door open but the bird refusing to fly, symbolic of emotional imprisonment, surreal and artistic."
hello folks
is there something that is along the lines of SDXL kinda resource requirement, but also with T5 clip?
had a taste of T5 prompt adherence in SD3.5
it's... addictive
is there like a SD1.5 or SDXL that works with T5?
Hello, how can i use different model with auto1111 text-to-img endpoint?
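One way this is commonly done, sketched under assumptions: the A1111 web UI has to be started with the `--api` flag, and the txt2img request can carry an `override_settings` object whose `sd_model_checkpoint` entry names the checkpoint to use for that call. The base URL and the checkpoint file name below are placeholders, and `build_txt2img_payload` / `txt2img` are hypothetical helper names, not part of the web UI.

```python
import json
import urllib.request


def build_txt2img_payload(prompt, checkpoint, steps=20, width=512, height=512):
    """Build a txt2img request body that asks for a specific checkpoint."""
    return {
        "prompt": prompt,
        "steps": steps,
        "width": width,
        "height": height,
        # Per-request settings override; the checkpoint name must match
        # one that the web UI lists (placeholder value in the usage below).
        "override_settings": {"sd_model_checkpoint": checkpoint},
    }


def txt2img(base_url, payload, timeout=600):
    """POST the payload to the web UI and return the decoded JSON reply."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/sdapi/v1/txt2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Usage would look like `txt2img("http://127.0.0.1:7860", build_txt2img_payload("a cat", "my_model.safetensors"))`; the reply should contain base64-encoded images. Check the `/docs` page of your own install for the exact fields your version accepts.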
🐦 👮
I mean most llms dont real time update. You can however enable web search for more relevant info
Any reliable way to train flux on colab?
What's a good civitAI bounty for a 54 character-pack in SD1.5
Yeah, I specifically want SD1.5 by the way (how much buzz)
I know the others are better but I really like using SD1.5 still.
lmao the 'no robots or apocalyptic themes' part.
done to death
I don't get it, what's with all the hate?
you know that's just going to be pure Streisand effect
tell people not to do that, for so silly a reason, and people are going to be only doing that
Another competition for US people exclusively
not my choice, legal gonna legal
It's ai artwork that's featured on a site, prompted by someone from somewhere
Could you elaborate why it's illegal outside the US
not accepting outside US doesn't mean it's "illegal"
probably just something to do with contract enforcement
I'm with you, my understanding was that if there isn't a cash prize there doesn't need to be a location requirement. But I was told otherwise, and I'm not a lawyer
cash prizes are just fine in other places, but there may be different taxation requirements and the like.
Weird that. But I'm also not a lawyer, anyway goodluck with the competition
That means the cash prize may have different sizes depending on who wins where.
but it's not actually illegal or anything
things can get weird across borders with jurisdiction, different laws, etc
I mean, Fortnite manages with their cash cups lmao
The EU cash cup is actually bigger than the US one
fortnite generates billions of dollars per year, its a massive business
Sure but they're beholden to the same laws
especially since both revolve around a competition
not a sweepstakes.
Not for competition law no
Nevertheless if Linus Tech Tips can do it I'd imagine SD can do it
Those are businesses that are presumably in the same ballpark of revenue
every country they open to probably costs more for legal analysis is my point
so open up only to the big players
US, European Union (single market, laws are harmonized for this sort of stuff), China, and whomever else can be expected to add a pretty sizable number of competitors
I don't think most people would be bothered by the lack of Togolese participation
EU isnt as harmonized as you might think
for stuff like sweepstakes and competitions it sure is
it falls squarely in the commercial purview that the EU was founded upon
I've had to deal with at least some of this myself, I have had a few clients in different EU nations, different nations have different laws
2025-02-28 02:24:01,984 - Inpaint Anything - ERROR - shape mismatch: value tensor of shape [159, 256] cannot be broadcast to indexing result of shape [64, 256]
does anyone know how to fix this error
i'm using automatic1111 with inpaint anything extension
Hi, i have a question.
I want to use stable diffusion directml to color my drawings. When i use it, it usually messes up the hands and changes the art style completely. Does anyone know a way to use img2img to only color line art or add shadings to base color? Without messing up the hands or changing the art style and eyes and such.
Look up depthmapping
Ok thanks
AI has revolutionized the "for exposure" line of work
artists used to require manual labor and creative skill for exposure
now you can just prompt
Yeap, though comfy workflows is okay money
...you can sell comfy workflows...? 😂
Yeah? Some of them can get quite large
But i also do a remote install for them to make it work
aka TeamViewer (best one I ever used anyways)
I charge for my time tbh
And knowledge, not the wormflow perse
"wormflow" I think you've been watching too much Dune
ah makes sense, there's probably an untapped market for making dumb coomers know how to generate unlimited amount of slop till their retinas get burn-in
Eh forge and swarm are pretty straight forward
Things like automated Photoshop, custom nodes, custom workflows now thats business
you'd be surprised at how many creative ways people can screw it up
Im not surprised, its my irl job to guide such people lmao
"it's very slow"
"I picked CPU, is that a problem?"
The best question i had gotten so far was "does the pc need to be on when i connect to the ui from phone" i normally don't shame but man
ah yes hahaha
not to mention ControlNets, LoRAs, all the nitty gritty
like a friend... was putting... hires fix to 1.1x... at 100 steps......
Yeap, that's why I usually shy away from tech support now and usually hit up businesses asking for something
Might eventually steer towards making loras more
it's kinda fun when you get it right
and very gatekept by the fact Kohya_SS makes the other SS look like the friendly guys
it's <input> porn
I don't mind guiding new people making their first image but after that i kick them out of the nest
good idea tbh
I might be too nice sometimes 
hello everyone there
somone can help me on #🤝|tech-support
is animatediff currently working on forge ? I read sometime ago that it wasn't
That comment...just wow.
well have you tried making sense of that thing?? I had to go AdamW with the Cosine with Restarts because anything slightly more complicated than the basics and it just came out wrong
hello stable people
What's better, Forge or SD.Next?
hey guys, where do you upload your gens to show off? it sounds like its just deviantart right now?
Are there any good guides to get me going on my journey to making actually good stuff instead of producing garbage?
So many guides idk which to believe or what is still relevant.
Other Discord servers since this one is biased against nsfw stuff. A single Age Restricted channel wouldn't kill them.
Hey guys, would anyone be interested in joining a multi-modal AI hackathon? You can join online via Discord or IRL (London, Paris, Berlin) #music #art #video #gaming #fashion https://www.kxsb.org/lpb25
CivitAI is a good one but lacks community engagement ish imo
Nsfw channels cause them to be blocked from the discord community discovery hub within discord iirc
Really? Discord sounds like a major pain then.
eh its to keep the platform mostly SFW friendly => more users => more discord income
🚫 IP-infringing content
🚫 Robots or apocalyptic themes (We’re steering away from the “robots take over the world” narrative)
LOL I think I am out of the contest cos I only make R2D2 images
but I am in UK anyway so I was not eligible
its a good point about robots I guess, didn't think of that
biggest issue with discord is the search is only "fuzz" search
and it thinks "Diffusers" and "diffusion" are the same word
similar words come up the same
The only robots i make are certainly not in a position to take over the world 
for some reason R2 was basically used to deliver maps/plans every time
they did it again in the recent star wars, IDK why lol
he's basically a delivery box
imo its because they dont really innovate
since the rogue one movie i basically never touched starwars
yeah I didn't like episode 7 8 9 era
I agree rogue one was great
some of the visuals in 7 8 9 are good though like the salty planet
it's a bit like with Avatar: The Way of Water, IDK if it was the best writing but the water looked rly nice
for me episode 7 8 9 simply doesnt exist
Best movie that disney put out imo
hello everyone there!
I know, I was trying to find some simulation servers (racing/flight/etc) and it just kept popping up the Simulator Games servers (Trucker, Farmer, etc)
yoo does anyone have experience training Lora's for SD? I tried using Flux Ostris on Replicate but it didn't work very well, i'd appreciate some help!
Trying to generate really basic essentially MS Paint drawings. Here's an example: #🏞|general-with-images message
how many of such drawings do you have?
though you said two very different things. SD is not Flux lol
like I tried to make apple pie with pears
dozens haha
ya i mean i tried using Flux but it didn't work, now im asking about SD!
hmm well with that style I am not entirely sure, but I managed to make a style lora with 64 images
but theres still a bit of bias due its smaller data set
Do you think it's possible to achieve an output like the one I posted? I don't know much about this stuff
Hmmmmm i honestly am not sure since i tried that image but the auto tagger didnt recognise most of it
you could try to individually label each image, but again I haven't done an image so minimalistic
try using luca taco's trainers on replicate. his are better
SD 3.5 trains much easier than flux. you want around 30 images if you're making a character - from all sides, different positions, etc. i use between 800 and 1000 for styles
Greeting, I'm kinda in middle of complicated decision whether I should buy a normal pc or try out the recommendation build for stable diffusions. Do you guys have like recommended build for this, I'll let ya'll decide
Good luck getting a 4090
GeForce RTX 4090...
Well yeah it's expensive for one piece
Can anyone help me figure out what model to download? I have never messed around with AI art but want to make 2d pixel art for a game I'm making.
WAI Noobai
Ohh you mean Lora?
Or the checkpoints
Im not too sure what im doing. Do I have to download a specific model of stable diffusion first and then get a lora from civitai? Or can I download straight from civitai?
You'd have better luck just finding something for Pixel Art (a checkpoint) on CivitAI. The real question is, will your system even be able to run any inference.
Ok. I have a 3070ti in my pc. Also, when I find the checkpoint and download it, how do I use it?
Do you have a UI already?
No I dont
Then I recommend following one of the guides in the #🤝|tech-support channel. They're in the pinned messages top right next to the people icon.
Ok, will do. Thank you!
hello
/image_dream prompt: genarate a image of Indian man who dress pants and shirt barrowing money from another man who dress dhothi and topi
Gm guys
I was wondering if anyone had a workflow for creating consistent character with different poses. I also want to integrate my logo in the image like on his shirts. Im using comfyui
Will do! 800 to 1000 is a crazy amount, I have maybe 30-50. Would that not be enough?
hey guys im tryna steal an artists art style and train my own data with their artwork
is there a guide to do exactly that?
when i'm making a style lora, i want a decent amount of examples of all sorts. keep in mind, base models have billions of images and data in them.
yes. first contact an intellectual property lawyer who will explain that what you are doing could land you in hot water with large amount of money owed to the person that owns that style.
a method came out in the last month
to extract training images from loras
I'm trying to spread the word a bit LOL cos it means they can see what you trained on
i said it like that to be clear i have no intention of stealing like that i just want to see how it would come out ill draw my own
I honestly have no idea about the current legal setup regarding this stuff anyway
its probably changing fast
Anyone?
Hello from China
Surprised they let you through "The Great Firewall"
Hi, it's a pretty regularly asked question but mostly it boils down to lora, faceswapping (reactor) and controlnet
I think if you search for "consistent character" in search you'll find better explained answers
source? hella curious about that
was on arxiv in the last week
The only thing I see is reverse engineering prompts.
date might have been earlier then maybe
@tepid finch @tepid finch thx bud
But possible yeah?
Sort of yeah
hi, havent posted here in forever, is it okay to ask here for a quick job offer? i'm sure i could technicaly do it but i'm lazy to look for the prompt and stuff, havent updated my old SD in forever too.
you could. depending on the complexity i might pick it up
atleast i think you can
Hello, I was wondering if something like game UI can be made with Stable Diffusion? With a prompt, will it be accurate enough to recreate something and give different forms? (Let's say I want the same exact style image as this, but instead of a tree, make it a car)?
Stable update for 50 series gpus?
On comfy i had to do some unstable torch build iirc
Additional info?
Thank you friend! I truly appreciate it!
so, which ip adapter for face copy should i choose
oh, nevermind, i think reference only do the job better
I prefer reactor faceswap tbh
and i love all mighty controlnet
does reforge support amdgpu, saw that on stability matrix that it has support but don't say anything about that in github page
I tried to make it work with zluda but it currently freezes the PC at launch xD
Whenever i hear stability matrix i get a ick
So many bugs due matrix, i recommend sticking to a UI outside matrix
I've never had a single issue, but I'm using nVidia.
bruh, that's a shame
But forge and Auto1111 work. So there are options
if zluda didn't work, how about directml?
Didn't test it with directml. But that would be no benefit over forge or Auto1111 with Zluda
hmm, let me test that for you
Which webui do you use currently?
automatic1111
With directml or zluda?
zluda
Ok
I will ask Lshqtiger if he can look into forking reforge for AMD like he did for Auto and forge
Shouldn't be too difficult
wait what, he can?
hmm, lemme try reforge with directml for a while then
Let me know if it works
nope, it doesn't work
Ah damn :/
You can directly train the face you want
yeah, i use reference only in controlnet, work like a charm
can you make animation
guys anyone kind to help me a bit please?
shoot your question in #🤝|tech-support if your stuck robet
or if you have a question dont be afraid to ask
Hello, can I do requests somewhere, that someone can generate an image for me? Or like an external website
Youd probably wanna use the CivitAI website since you start with a few credits and get 25 a day free that you can save up. Depending on the model you use it can get really cheap/expensive credits wise
How do I create images ?
hello all
Either locally or through #artisan-faq , but some online services like civitAI also exist
If you have a gpu with around 6-8gb vram i recommend looking into the #🤝|tech-support pinned messages for the guides from CS1o
Hey, weird question, i intend to run automatic1111 in wsl, and wonder if it has any differences from running on linux native?
#💬|general-chat hello
guys how does pykaso AI website provide so realistic images...How can I do the same locally on my pc?
ys
!generate "a beautiful sunset over the ocean"
free?
I expected more from Rust community
⛱️ 🌇
Jk check #artisan-faq
Hi!
it will take a little extra memory to have comfyui running on WSL inside windows; depending on which model you are going to use (sdxl/flux), your gpu power might make a difference. since you are going to use a1111 i'm guessing you will be running sdxl, so it would solely depend on your gpu vram
are there any prompt generators for Flux? Which will edit your prompt to make it better
anyone using an amd card on windows rn? is it still as hard to use as im reading from year old threads?
with zluda it takes some time to setup but it's pretty smooth afterwards and much much faster than the yee old directml days.
im on AMD, follow the install Guides from the Pinned Messages in #🤝|tech-support
Go for any of the Zluda Guides for the best performance
Hello, new to Stable Diffusion and have no clue what I'm doing. Wish me luck! 🐈‍⬛
good luck!
chatgpt > luck ...makes getting into it so much easier
hello
hello, i'm trying to use regional prompter for the first time, and even if I was able to generate the image I want, the resolution is pretty bad, everything looks blurred, any idea?
Hello everyone!!
hello
HI So quick question i have a 4080 super. i wanna try out stable diffusion and someone gave me a link to huggingface and i see a bunch of options but i dont know what to try

I have a mental disability that affects my memory of things after my brain injury... I have been working toward running SD locally for 2 years... I found some cheap(ish) hardware that will run it, and am eager to get started. I have experience with Midjourney and Dalle but I'm way behind now with so many different choices to make. Is there anyone here that could help me learn which things I must choose to get the functionalities I'm looking for...
I have been trying to do it myself but my poor memory and the speed with which this is all developing, and my limited resources' off-optimal hardware, means when I watch a tutorial on how to do something, by the time I adjust to it the information isn't current or accurate. If someone who knows about running SD locally on an AMD APU (5700G) running Ubuntu and can patiently guide me toward the functionality i need I would be willing to discuss compensation (not rich or anything but, this is a priority for me.) DMs open to anyone interested.
that might be @warm junco
Id do it for free but currently omw to work
how does pykaso.ai create realistic images every single time?
it even has 'amateur shot from iphone' stuff
how can I replicate the same locally?
I am not looking for a quick solution... things take me longer... so if you'd like to talk later, after you are recovered from working, please let me know.
Hello
Hi all , can someone tell me how can i delete Stable Matrix??
question, do I need to download the base model that a certain checkpoint or lora uses to efficiently use it?
for example SD 1.5
Nope any model based on the base model can be used
Hi I'm using SDXL as I have 8gb vram, I need a good method of fixing hands automatically without inpainting, I'm on Comfy Desktop V1 (windows) and I can't find Mesh Graphormer in Comfy Manager. What are my options and what is the best method of fixing hands without the need of inpainting? Also is there a good hand LoRA for SDXL?
There is a mechanism called "embeddings." They are basically pre-canned negatives or positives you can add to the prompt. The face detailer can also fix hands when properly configured. Another route might be to add an OpenPose ControlNet.
what upscalers / hiresfix do you guys use for photorealistic images?
hello guys i would like to create videos from 1 photo for free. Do you know anything about this?
search for Wan 2.1 i2v model
you need a really good gpu and a lot of patience, though
You could also use a online service that gives you some free credits
Results vary though
unfortunately I always find everything paid XD
doing it locally is also not free - you have to pay for electricity
20 cents/hr for electricity consumption isnt the worst ngl
Though im sitting at a 4 person household average consumption as a single person lmao
if your gpu runs 20 minutes to generate 5 seconds of a video then, based on your 20 cents/hr, you would already reach 7 cents per 5s video.
if you assume 40 cent per video with SORA, then this is only 6 times as expensive as your local video generation
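The arithmetic in the two messages above can be checked with a tiny sketch. The inputs are the figures quoted in the chat (20 minutes per clip, 20 cents per hour, 40 cents per hosted clip); the function names are hypothetical, and the per-hour figure is taken at face value as the cost of an hour of generation.

```python
def local_cost_cents(minutes_per_clip, cents_per_hour):
    """Running cost of one clip generated locally."""
    return minutes_per_clip / 60.0 * cents_per_hour


def cost_ratio(service_cents, local_cents):
    """How many times more expensive the hosted clip is."""
    return service_cents / local_cents


local = local_cost_cents(20, 20)   # about 6.7 cents, i.e. roughly 7
ratio = cost_ratio(40, local)      # about 6 times as expensive
print(f"local: {local:.1f} cents per clip, hosted is {ratio:.1f}x that")
```

This only covers running cost; it leaves out the price of the GPU itself and any failed or discarded generations.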
Benefit of local: uncensored
oh, yeah, sure, if you want to generate 5 seconds porn then you can use local models
I find it kind of pathetic that people don't do anything more creative with that technology than boobs
Eh its not only that but sure*
look at civitai when you select video models as filter. Its basically 99% porn
btw. I also don't say anything against local models. I love doing everything locally. I just say: video generation is not as easy and mature as image generation and for novices it might be the easier and maybe even cheaper solution to just use SORA and similar tools
Oh yeah for novices / common rabble definitely i agree
But the moment people try to do more than the guidelines those corpo schmucks set, you're stuck with no way to edit it the way you want
Without paying even more, of course
I mean you can just rent GPU servers from reputable datacenters and then they don't monitor at all
hello
How to make short videos? Is there any video on YT to learn about it, please?
Thanks
So this is using comfyui right?
comfyui or swarmui
@vapid dove any idea when well get a stable diffusion v4 its been over 4 months and 3.5 wasnt that good even at launch
I wasn't super impressed by 3.5 when it launched either. To be honest I didn't even use it much until I joined here. Now I'm hooked. there are some shots it does fantastically well
Kinda wish I'd started using it sooner
i mean yeah its not terrible but it underperformed even flux dev in my tests and nowhere close to larger models like ideogram or imagen thats why iam saying we need something better
to be fair those are like the two biggest closed models
apart from recraft which slides in between the two, in terms of size
I mean considering it takes a lot of money to make something like this and we're getting it for free eh. Compared to early sd 1.5 they have come a long way
ye I've started training models and its rly hard
if you put in a reasonable budget what you get out is like
a 256x256 imagenet model lol
What are you talking about?
/generate
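Modern city park, soft sunlight, lawns and trees, skyscrapers in the background. A white man (30 years old, short light-brown hair, light blue shirt, gray casual trousers, smiling) and a Black man (30 years old, curly black hair, dark green knit sweater, khaki trousers, cheerful smile) stand on a bridge, their shadows merging, with pigeons and a chessboard in the background, illustration style, low-saturation tones.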
d
cup tea
Hey folks. Want to ask, are there SDXL models that use vpred and also work with normal natural language prompting?
Or maybe are there ways to make regular sdxl to be better at bright/dark scenes?
just do img2img on a dark or bright image
it's the most simple way to fix that issue. I never understood all these weird workarounds like noise offset and so on
Hello
if you want a video tutorial for wan setup, here's one: https://www.youtube.com/watch?v=KcYuWRB1_xI
ok I'll give it a try. hmm, isn't img2img more like taking an existing picture and denoising it slightly?
CosXL https://huggingface.co/stabilityai/cosxl is basically just SDXL Base but with the dark/bright range
img2img on black/white/gray does actually work for old XL models, see here for explanation why: https://blog.freneticllc.com/posts/lowfrequency/#how-does-stable-diffusion-work
It's weird getting past the community opinion blinders to see the underlying model for what it is, eh?
Back when SD3 was a baby, even the smallest pretrain version blew my mind with its capabilities (as compared to xl or any other preexisting model).
Was quite frustrating that licensing mess and communications confusion ruined such an incredible model release.
Check out that TensorArt 3.5L-TurboX model btw, ~8 steps and even more amazing than the base, on par or better vs Flux Dev but runs way faster thanks to the turbo.
trying flux with ComfyUI. It starts generating and then says 'reconnecting'. I try to start again and it says 'failed to fetch'
Is it an issue with my RAM?
did you solve this?
ok I found out this was a memory issue. any suggestions on how to run flux with lower memory usage?
GGUF models can work a fair bit lower than the normal models, you'll probably want schnell too if you're resource constrained -- links and relevant usage docs for Flux here: https://github.com/mcmonkeyprojects/SwarmUI/blob/master/docs/Model%20Support.md#black-forest-labs-flux1-models
with limited resources you'll probably want to consider a different model class though, eg SDXL finetunes or the new SD3.5 TurboX
guys, how can I inpaint text onto paper on an image realistically?
Yes, I have the experience about that.... Please DM me
SD3 and SD3.5 are perfectly fine for the way most people use image models
like if you are just making a 1024x1024 image they can quite often give out a more interesting image than Flux, it will at least be more colourful and creative
their issues were more on the technical side with the pos embeds and attn maps not working as well as they should, but this mostly affects inverse-problem stuff like using the model for inpainting, super-resolution, segmentation etc
for just generating images at 1024x1024 both SD3 and SD3.5 are really under-rated at this point
if the image is painterly or has smoke/fog effects they are particularly good versus the others
photoshop comes to mind
there are stronger methods on Arxiv but none released model/code
for example they use control nets or new SDE samplers
Yoooo Guys can anyone of you help me pleaseeeeeeeeeeeeeeeee
my stable diffusion is using only the cpu instead of GPU taking lot of time to generate images
if anyone wants to help me please DM me
could we talk here instead?
yeah sure
I don't use DMs cos in the past people acted weird in DMs
what software is this
is this A1111 software?
yes
as first step could I maybe convince you to change software
i have 3070 ti but still image generation is very slow due to it is using my cpu
okie
what should i use
your graphics card is strong, I rent 3070 ti sometimes
probably SwarmUI
it will have a way to make sure the GPU gets used
hey can we get on a quick call please, if you think i am weird you can disconnect & block me
sorry I can't do DM or calls
too many bad experiences helping people over DM or call
please please from morning onwards it's bugging me
try to follow the install instructions here: https://github.com/mcmonkeyprojects/SwarmUI
and see what happens
okie
as a first step
thanks for the link
neg prompting 'face' but faces are still appearing in the images
any suggestions?
hey OG
hey
can you come to VC
try delaying the negative prompt for some steps
VC in diffuser lounge
Hey Neon do you know any good image to video generators, free online ones or locally installed
@fervent thunder
yeah Wan video is good
on some level Stepfun video could be run locally but not really
https://wan.video/
https://www.wan-ai.org/
which one
IDK just google "Wan video huggingface"
okie thanks
@fervent thunder
swarmui installed
how to make it run on GPU
image generation is slow this swarmui also running on CPU instead of GPU
have it
then look for command line arguments that will make sure the model runs on GPU
v1-5-pruned-emaonly.safetensors downloaded and placed in models file
yeah v1-5-pruned-emaonly.safetensors should be ok
what is the file name that i need to edit
i'm new to this man, sorry for asking so many questions
the problem is I use comfy but I advised you to install swarm 🤔
in comfy it would look like this:
```python
import argparse

# minimal parser setup so this excerpt runs on its own
parser = argparse.ArgumentParser()
vram_group = parser.add_mutually_exclusive_group()
vram_group.add_argument("--gpu-only", action="store_true", help="Store and run everything (text encoders/CLIP models, etc... on the GPU).")
vram_group.add_argument("--highvram", action="store_true", help="By default models will be unloaded to CPU memory after being used. This option keeps them in GPU memory.")
vram_group.add_argument("--normalvram", action="store_true", help="Used to force normal vram use if lowvram gets automatically enabled.")
vram_group.add_argument("--lowvram", action="store_true", help="Split the unet in parts to use less vram.")
vram_group.add_argument("--novram", action="store_true", help="When lowvram isn't enough.")
vram_group.add_argument("--cpu", action="store_true", help="To use the CPU for everything (slow).")
```
in this file https://github.com/comfyanonymous/ComfyUI/blob/master/comfy/cli_args.py
checking
if you want a quick fix to test
these nodes should do it https://github.com/neuratech-ai/ComfyUI-MultiGPU
this will also help work out if you have an issue of multiple CUDA devices
in swarm also we have comfy
bear in mind these nodes have two issues
- they monkey patch
- they interfere with the memory management
bro could help me out quickly i'm in the VC on this server
aha okie but i have no clues about these, i'm very noob to this stuff
issues like this are tricky no matter how much experience someone has lol
cos with local generation you never know what the prior setup of the machine is
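for checking the prior setup, a rough sketch of a sanity check you can run with the same Python environment the UI uses. It assumes the UI is PyTorch-based, and `describe_torch_device` is just a made-up helper name, not something from SwarmUI or ComfyUI:

```python
import importlib.util


def describe_torch_device() -> str:
    """Report whether PyTorch in this environment can see a CUDA GPU."""
    # A missing torch usually means the wrong Python environment is active.
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"
    import torch

    if torch.cuda.is_available():
        # Device 0 is the first GPU PyTorch can see.
        return f"CUDA available: {torch.cuda.get_device_name(0)}"
    # A CPU-only torch build reports no CUDA even with an Nvidia card installed.
    return f"CUDA not available (torch {torch.__version__}), generation will run on CPU"


if __name__ == "__main__":
    print(describe_torch_device())
```

if it reports CUDA as not available on a machine with an Nvidia card, the likely suspects are a CPU-only torch build or a driver problem rather than the UI itself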
thanks for the help buddy, i'm gonna go to sleep and deal with this mess in the morning
okay sure
goodnight
night
Man, apparently nobody wants to see any kind of straightforward image comparison between any set of t2i models anymore on the Stable Diffusion subreddit lol
I used to do them every so often and they were all fairly well received usually, but all recent ones I've done have just been downvoted to 0 with very few comment replies (but thousands of views)
The research channel was deleted?
i wanna make a lora of my pet cat using flux, is flux like good for that, idk about animals specifically
honestly sdxl could work for that too
unless you wanna do crazy stuff
but xl is pretty neat for it too
hmm well i dont make character loras often so im not sure
Flux is so much better than SDXL. It always shocks me if I compare my own face lora in SDXL (which I found great at that time) with that Flux is capable of
However, the big disadvantage of Flux: it trains extremely slowly
it's not a cfg model, so things need much more time to get cooked in
im probably using civitai to train, does that mean i need to mess with some settings?
I'm not sure if civitai is a good place for that. I haven't tried it
but if you pay money anyways I would rather train somewhere else
e.g. on the Flux Pro page itself (there you can directly train the Flux Pro model which gives you probably the best results)
but there are so many other services that offer training as a service and usually they have good predefined settings
soo what should my settings be then to get best results (still using civitai)
Wan2.1 needs its own channel
Flux Pro doesn't rly look that different to me
Flux Ultra does but not the regular non-Ultra Pro
Hey lads, whats the current best img to video that I can run locally?
Hey everyone, not sure if this is the right place to ask, but I’ve got a few questions:
- Is there a way to get Flux 1.1 Pro quality for print-on-demand designs without using the API? Maybe some local setup?
- Would a Mac M1 with 8GB RAM be able to handle it with some crazy hacks?
- Could you finetune models to learn faces and match Flux 1.1 Pro output locally? Any pointers?
Thanks so much
- Dev and Schnell are the only ones with released weights.
- no
- yes, google.
Hello, is this server strictly for stable diffusion discussion or also allows related tools discussion ,such as ComfyUI?
Gotcha, thank you for your answer! 🙌 I was hoping there was some way to train a model (like Dev) to match 1.1 Pro (at least for a subset of images, like pod images) and run it locally on the Mac with some crazy quantization. 🤪
Like, there are websites that generate public images from Flux1.1 pro (like Glif). Can those images be used, like how DeepSeek used OpenAI outputs to retrain another model and get similar results ?
afaik flux dev requires 8gb and 32 to train, if I remember correctly,.
it worked! my cat is fighting a pirate lol
Flux Pro Ultra in "Raw" mode is the only official Flux that can do like, completely grounded SD 3.5 esque realism without any trace of oiliness
It's REALLY different for photorealistic gens than other versions of Flux
I've done what I think you mean (train a Lora on outputs from a different model) at least twice
https://civitai.com/models/633453/zoot-fluxxer-xl
https://civitai.com/models/1204546/zoots-flux-pro-ultrafier-for-kolors
the dataset was higher quality / higher resolution for the Kolors one cause they were ALL Pro Ultra outputs
whereas the SDXL one is mostly Dev outputs
also Kolors just does a lot more with the natural language captions I used
oh yeah there was SD 1.5 version too lol
https://civitai.com/models/639244/zoot-fluxxer-for-sd-15
worked pretty well actually particularly with Ella, cause I just trained it at the exact same resolution as the XL version
Very interesting, thanks for sharing. Results look pretty cool. How did you feel about them?
i think this one worked because its a cat and flux probably has plenty of data about that in their model
I was quite happy with them
combining the XL one with DPO, SPO, and one of my other more general detailer loras can do some pretty cool stuff even just with Base SDXL
https://civitai.com/images/60554049
the Kolors one is my favorite though, just by itself:
https://civitai.com/posts/13806835
Wow, so cool!
hello
Is there an api specific channel?
not sure, with flux dev if you combine enough photography and realism checkpoints and loras then it just looks like photo to me, when the resolution is high enough
Hello, hi everyone
yeah I just meant like
stock stuff
If anybody knows how to make images/videos like this guys, please DM me. I got $$ for ya
https://www.instagram.com/reel/DF0IXLKRbHf/?igsh=d20xd3RzcWVtYWIy
This looks a bit scary
oh for stock I agree yeah
although almost all the big models are bad for photos in their stock form
for some reason photography seems to need models to be pushed towards it a fair bit
what's up guys
any ideas on how to make a less realistic selfie? i don't want professional camera quality and insane lighting
it seems SD struggles with this
Amateur lora / checkpoint
It really depends on the checkpoint bgl
Ngl*
usually you can also prompt for that using keywords like "2006 digital camera image", "amateur photo" and so on
thanks
Out of interest who's rushing out for a RTX 5090? I'm saying that as I need a stronger GPU as I have a 4060 and everything is just crazy at the moment.
If a 4090 isnt enough for you, a 5090 will just get you more vram and if thats the limiting factor i recommend looking into enterprise cards
As 6k for a 5090 is insane
@atomic mortar I would love a 4090 but they're stupid prices, and as a former system builder I don't trust the 2nd hand market. So the only option for me is a new card with a warranty in case it starts smoking. I'm not liking the extra power draw, and Nvidia has just F'D up in this area without proper power design on the card. But new 4090's with a warranty (warranty not transferable to 2nd user) are like rocking horse Sh!t to get hold of without being scalped. I can only afford a FE 5090, and they're just badly designed at the moment. I hope AMD get in the game for AI, or Intel for that matter, just to give competition and bring the pricing down. But it's clear where Nvidia are for the money and that's commercial GPU's for GPU farms, and it feels like the big 2 fingers for people who want local GPU power.
Woops i read it as 4090 but you said 60
Mb
I settled on a 5080 and while a bit slower with video ai it manages
I can always rent a gpu if i need a heavy workload
The 5080 has a lot better power balance, as it has less draw, but the power design is ridiculous
Oh yah definitely, when i first got it i ran a few Wan gens in a row like 3 hrs
And checked for any plastic melt smell
@atomic mortar I'm looking at the A series with 20gb, but am I gonna miss those missing 4GB
@atomic mortar I'm just learning at the moment and will end up doing video at some point, which really lends itself to GPU farms. But the 24GB does seem like a good trade off from 32gb. But if you're really doing stuff you really need a 60gb+ card, which takes you out of consumer, as the better models won't fit. Crazy really, the move is heavily focused on render farms and less on local GPU, but hey that's where the money is
I run wan image to video within 20-30min
Text to video like 9
16gb vram
Unless you go for the extreme llm's you dont need that much
@atomic mortar You know the crazy thing with the 5080 pricing just for a few hundred more you might as well pony up for the 5090, and that's like crazy am I that stupid to buy a poorly designed card. I'm left gobsmacked, but the 5080 is the better option power wise unless you like roasting marshmallows. The only option really is a 3rd party device to do power smoothing for the 5090 at the cost of performance.
Few hundred more? I got a 5080 at near msrp and a 5090 will cost me 4500-6000
Non FE
@atomic mortar where I am the 5080 is around $1500 depending on GPU type non FE. The FE (5090) looks like the best option for me price wise as it's what I can afford, but the risk of using the card outweighs that option, and that's when there's stock. It's easier for me to find the extra $500 than it is to buy a lower grade card and upgrade later. But there's also logic to buying what you need, as these cards will sell 2nd hand. The A series looks attractive too, and I'm wondering what the Blackwell offering will be when they release these Pro cards and whether it's worth the wait. Drawback is fewer cores to work with, but you then have the VRAM for the models to fit, swings and roundabouts.
But realistically, what problem do you have with your current 4060?
Models won't fit + future proof need and 16gb nowadays isn't good and the VRAM will always increase
I was on a 3070ti and waited till i could score one at msrp
If you can wait id wait as sdxl is still pretty dope
@atomic mortar I'm thinking that I'll end up with a 5080 unless I can get a 4090 that I can trust 2nd hand even better new, and the A series is an option too
I wish people who don't know about AI wouldn't talk like they know about it
Ppl acting like Loras haven't existed forever
(Not on this server)
I don't even use SD anymore but loras are like, super basic
so you want moreso just like, explicitly low-quality / candid sort of thing? It's not quite clear what you mean by realistic
Also what version of SD?
People were not fans of this comparison I did on Reddit yesterday for some reason lol
but I thought it demonstrated what I meant by "grounded" pretty well
https://www.reddit.com/r/StableDiffusion/s/saeUfoF5M2
For Medium vs Flux with and without Loras (including a Lora made myself)
this one was just, 'candid amateur selfie of a young man in a park' in 3.5 Medium, if that helps the guy who was asking
Negative of 'ugly, blurry' also
#🏞|general-with-images message
where do i go for help with forge
that isn't true
you are assuming and you have no idea how the 50 series is even built
my comment was based on the fact that he has a 4090 but i misread
yes, but - the 50 series is not the old tech that the 40 series was built on
it's totally different, it has AI on the chip, it generates a FEW pixels and then the AI takes over and predicts the rest
you can't base any idea of what the 50 series does on previous nvidia tech
go to youtube, search on CES 2025 Jensen and watch Jensen's keynote
to the #🤝|tech-support channel
nvm i fixed it
Hello
you can find checkpoint and lora combinations for both these models that are closer to photo
I don't mean that as criticism, what I mean is that some of the big flux checkpoints and popular loras are not as good for photo as some of the less popular combinations
maybe try this UltraReal Fine-Tune with Sony Alpha lora, Boring Reality lora, iphone photo lora
for SD3.5m I have seen some good ones in training discords that I am not sure have been released, but there are some that brought the photo style out rly well
Right, again I just meant like e.g. stock SD 3.5 Medium (and Large presumably) already has a solid fundamental understanding of terms like "editorial", "amateur", "candid", "analog" and such. It's also reliably possible to get rid of depth of field using negatives in them by default. None of that is really the case in Flux without at least loras, or jumping through more hoops to approximate the negative stuff.
oh yeah I agree with all of this
Flux Dev is more overfit
when it first came out I was worried
like If I made a general photo lora for SD 3.5 the dataset would be completely different from the Flux / Kolors ones I've made
Cause it just doesn't really need training on the same things
Doesn't need de-chin, or de-bokeh, and so on
what does de-chin entail lol
I think for the most part you train towards what you need rather than away from what you don't need
so in that sense I would approach both models the same
and train towards photos
Quick question, what method do y'all use for line breaks in your prompts? Any hotkey for that I know of includes enter, which no matter the key combination, begins image generation.
what is the latest stablediffusion image generator version?
3.5
is it the best ?
it doesn't have to be the best, if you like it you can make nice images out of it
it called sdxl 3.5?
what do u prefer sdxl 1.0 or sd 3.5
oh SD 3.5 is definitely stronger than SDXL yeah
just throwing a whole lot of super high res professional photos where a face is very prominent at it, basically
with my Flux one I was basically just trying to brute force Flux in general into looking "not distilled" or "normal" by default
so it was basically just a super diverse dataset of people of all ages, with no single person ever appearing more than once
whereas SD 3.5 really doesn't need that kind of Lora at all, moreso just like, targeted improvements in specific areas kind of thing
I see what you mean yeah
I would go about it a bit differently, and just run through general photo datasets
like not curate them beyond quality control
well I was only concerned about people for that one, not so much environments which seemed "fine enough" to me
I did one specifically for animals too that I never got around to releasing yet, with like accurate captioning for different dog breeds and quite a variety of other species represented also
yeah there is a problem with my method that it is not suitable for smaller lora/lokr/loha
it doesn't have to be full rank finetune cos it can work with a large lokr but it has to be a reasonably big dataset if the dataset is generalist like I was describing
an animal one sounds cool 🤔
yeah I never usually go above 1000ish
it seems to always be enough if I pick the dataset right
and caption it well
I actually don't know settings well for this stuff
I skipped lora/checkpoint training and I've started looking at training new (very small) foundation models instead
its kinda backwards to the order most people learn lol
yeah for sure
you can train lightningDiT for $80 its less than it would cost me to make a flux checkpoint probably
although it might go wrong I guess we will see
I don't know. I tried to make a pen&paper character with Flux and SD 3.5L and was surprised that Flux had much more variety and diversity in its generation. All SD 3.5 characters looked the same. Also, SD 3.5 had big issues to not make him too handsome.
Could also be a skill issue as I'm not so familiar with Sd 3.5 as with Flux
negative prompts work for flux if you use cfg
you just have to use lower cfg values and probably should skip one step at the beginning
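rough sketch of what that means in practice. This is the standard classifier-free guidance formula with plain lists standing in for the model's noise predictions, and the function names and the `skip_steps` parameter are made up for illustration, not taken from any UI:

```python
def cfg_combine(cond, neg, scale):
    """Standard CFG: start from the negative-prompt prediction and push toward the positive one."""
    return [n + scale * (c - n) for c, n in zip(cond, neg)]


def guided_prediction(step, cond, neg, scale, skip_steps=1):
    """Use the plain conditional prediction for the first `skip_steps` steps, CFG afterwards."""
    if step < skip_steps:
        # Negative prompt is ignored here, so the early composition is left alone.
        return list(cond)
    return cfg_combine(cond, neg, scale)
```

with `scale` at 1.0 the negative prediction cancels out completely, which is why negatives do nothing unless cfg is actually above 1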
this sounds like the opposite of the problem most people say they have TBH
there's like a zillion factors that could be at play here though
I don't really use Large at all though, only Medium mostly, for the higher resolution support
and Medium seems to have a fairly different dataset than Large
I do remember that Large was much more finicky about CFG when I tried it, and had a tendency to do weird things with the background
and also the FP8 Scaled version was REALLY noticeably worse than the Q8 GGUF for many outputs
so those could have been possibly relevant things for you
in my testing Flux had a very high variety across seeds, with the right workflow
as well as that, there are ways to estimate log-likelihoods of samples now so
stuff like diversity which used to be qualitative can now be measured
it depends cos SD 3.5L got the tile controlnet and SD 3.5M did not
SD 3.5M does have depth+canny though (from tensor.art)
I would maybe look at merging the new SD 3.5M turbo with the SD 3.5M base
why?
turbo is one of the distill methods that has an adversarial part
so it can raise image quality
doing the merge lets you choose the tradeoff level per block essentially
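as a sketch of the per-block idea, assuming a plain linear interpolation between the two sets of weights. Lists stand in for tensors, and the key naming and `merge_per_block` are hypothetical, real checkpoints name and group their blocks differently:

```python
def merge_per_block(base, turbo, block_alphas, default_alpha=0.5):
    """Linearly interpolate two checkpoints, with one mix ratio per block.

    alpha = 0.0 keeps the base weights, alpha = 1.0 keeps the turbo weights.
    """
    merged = {}
    for name, base_w in base.items():
        # The block is taken to be the key prefix before the first dot (an assumption).
        block = name.split(".", 1)[0]
        alpha = block_alphas.get(block, default_alpha)
        merged[name] = [(1.0 - alpha) * b + alpha * t for b, t in zip(base_w, turbo[name])]
    return merged
```

so you could for example keep early blocks closer to base and push later blocks toward turbo, then compare outputs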
I've been using --xformers instead of --opt-sdp-no-mem-attention, and going back to sdp improved quality significantly with sdxl
not something I've really thought about TBH
would be nice to have though for sure
Hello insights needed
Does conceptual captions have significant chunk of important representation encapsulated for a generalized diffusion model for pre training?
Or is just too erratic high frequency data
I was thinking of using synthetic sets along with sa1b
yeah xformers is like that
CC12M is fine yeah
what is the sa1b for?
sa1b is uncaptioned I think
yeah it has to be fed into llava for captions
would it be detrimental if i drop CC12M and use sa1b jdb diffdb etc
journeyDB is common
what is diffdb?
I don't think there is anything special about CC12M
if you drop it its fine
Hi
SD 1.5. Snapchat amateur quality
I'd like to transform a person's face photo into a cartoon-like character while keeping their recognizable features (just like loverse.ai does).
Questions I have:
- SDXL vs Flux for this specific task - is one clearly superior, or are people just following the hype?
- IP-Adapter configurations - is there a "golden setup" that actually works consistently, or is everyone just guessing?
- Has anyone ACTUALLY created a workflow that matches commercial quality?
- What workflow end-to-end to get same or better results?
I've seen countless tutorials claiming to solve this, but the results never match services like loverse.ai. Who's actually figured this out?
If you've got real insights (not just theories), I'd love to hear them.
Hm id use something like reactor but it really depends on the cartoon style if it works. But are you looking for commercial use or private?
@atomic mortar what is the best way for me to upscale a generated selfie of someone at 512x512 to a higher res while preserving the composition and shit
hires fix or img2img or extras?
hmm it depends if you want to refine it or just upscale
if you just upscale it wont mess with the other stuff but i use SwarmUI so my knowledge doesnt translate over to A1111 well
well i generate it at 512x512, but I want it to be like a photorealistic full-size phone image, so it has to be "refined" not just upscaled
it looks like shit at 512x512
hmm well it seems to me that your using SD 1.5. what gpu do you have?
RTX 3080
hires fix and img2img are basically the same thing
10gb vram
i first off recommend using SDXL then instead of SD1.5
since SDXL gens at 1024x1024
or similar aspect ratios
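if it helps, a small sketch for picking a size near 1024x1024 worth of pixels for a given aspect ratio. Rounding to multiples of 64 is an assumption here (a common habit for SDXL, not a hard rule), and `resolution_for_aspect` is a made-up helper:

```python
import math


def resolution_for_aspect(aspect_w, aspect_h, target_pixels=1024 * 1024, multiple=64):
    """Pick a width/height near `target_pixels` for the given aspect ratio.

    Both sides are rounded to the nearest `multiple`.
    """
    ratio = aspect_w / aspect_h
    height = math.sqrt(target_pixels / ratio)
    width = height * ratio

    def snap(value):
        # Never go below one multiple, otherwise round to the nearest one.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


if __name__ == "__main__":
    print(resolution_for_aspect(16, 9))
```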
the only reason I use SD 1.5 is because im using the realisticvision v6 checkpoint
well, technically it's SD 1.5 Hyper
there are realistic SDXL checkpoints too lol
recommendations?
id honestly look on civitAI to see what matches your preference
since i mostly use illustrious
alright, ill try the SDXL juggernaut checkpoint
seeing as that is the most popular one on civitai
which upscaler do you prefer btw?
well i use ones for anime so
4x_NMKD-Superscale-SP_178000_G
4x_foolhardy_Remacri
I'm trying to generate an image similar to one I saw at civitai, did the same generation settings but im getting a much lower quality result. I think I'm doing something wrong. The model I'm using is https://civitai.com/models/81458/absolutereality?modelVersionId=132760
also I see that it's base model is SD1.5, however maybe I have to use it as a lora? i dont understand
he has steps that arent listed like
txt2img + Hi-Res
hi res fix
and do you have his embedding like "bad dream"?
yep
also isnt txt2img something from the automatic1111 webui?
doesnt hi-res just make the image bigger
it also "refines" but i recommend watching a tutorial for that one since i dont have it open rn
what I find odd is that Stable diffusion's default model is 2+ GB while this model is like 1.98 GB and generates better results. I'm new to this btw so maybe I'm missing a core concept
my intuition tells me that the model I sent before should "run" above sd 1.5 but i'm actually replacing it
what the best ai for generating arts right now
Flux. Although for artistic images SD 3.5 might be a bit more creative
could be many reasons. Maybe they left out the vae (tools like A1111 will use the base vae automatically if the vae is missing) or used a lower floating point precision.
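the precision part is easy to see with rough arithmetic. Sketch below, where the parameter count is only an assumed ballpark for an SD 1.5 checkpoint and not an exact figure:

```python
def checkpoint_size_gb(num_params, bytes_per_param):
    """Approximate on-disk size in GB (10**9 bytes), ignoring file metadata."""
    return num_params * bytes_per_param / 1e9


# Assumed ballpark for an SD 1.5 checkpoint (UNet + text encoder + VAE), not an exact count.
SD15_PARAMS = 1.07e9

fp32_gb = checkpoint_size_gb(SD15_PARAMS, 4)  # 32-bit floats, 4 bytes each
fp16_gb = checkpoint_size_gb(SD15_PARAMS, 2)  # 16-bit floats, 2 bytes each
print(f"fp32: {fp32_gb:.2f} GB, fp16: {fp16_gb:.2f} GB")
```

same weights, half the bytes per weight, so file size says nothing about how good the finetune is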
It's another synthetic dataset on hugging face
Thanks!
Hello!
yeah the one I linked was exactly that, but with SD 3.5 Medium
looks like the post at the other end was deleted though?
not really sure why
Has anyone in here utilized SPAR3D successfully?
ok most of these seem fine
personally I am going MNIST -> CIFAR -> MS COCO -> ImageNet -> LSUN Bedrooms/LSUN Churches
as a progression of models
cos then you have the most stuff to compare it to
Does anyone know what is the best checkpoint for using niji like midjourney ?
The Japanese word "niji" means both "rainbow" and "two o'clock". hmm
I find it hard to understand anime stuff
but maybe pony, illustrious or noob
the anime community clearly don't like Flux, but I am not sure why, because when I look at Flux anime checkpoints they look impressive. There must be something inauthentic about them but I am not sure what it is
because anime isn't suppose to look like flux usually
commercial
hi
The body proportions seem to be more realistic oriented, or simplistic, and also you do require better hardware to run Flux and usually they like stacking a bunch of LoRAs together for their own individual purpose
Furthermore Flux doesn't support anime stuff even remotely out of the box and is kinda hard to train on
Similar reason why anime ppl avoided SDXL until something like Pony came out that just was trained on enough stuff
If you want Niji, you probably should get Niji. I personally am fond of Anim4gine V4 Opt, but a lot of the illustrious models are also really good. If you are just starting out, Noobai or HassakuXL are easier to prompt iirc
but do the anime community inherently have less hardware than other users?
wouldn't hardware be the same across genres
I'm unsure, all I'm saying is that even if it were good at anime, you would have less users compared to SDXL
And the stuff that Flux is genning, at least from a quick view, is basically doable with SD1.5 aside from the text capabilities
I'm not sure Flux has less users
on Civit it has more reactions on its images
Unless you spotted a model that I wasn't able to spot just from a quick look
Anyone know where to find the pre-github-removed reactor?
for example this one
not sure if it is good for anime community or not
Looks decent actually
Doesn't look like it's capable of genning furry or nsfw stuff tho, or there would be at least one post
It also doesn't help that quite a lot of the posted pictures are not even anime but more realism / other art oriented
that might be better though, if it was a good anime checkpoint but was more censored
like it might be good to separate that out
https://civitai.com/models/971952?modelVersionId=1345990 This is for comparison a stabilizer LoRA, a lot of the ppl that actually aim for reactions use one, there is a difference in presentation
I do think the one you posted fell under my radar, I'm going to have to give it a go
Once I clear some space
There haven't really been anime finetunes. Chroma is training right now which looks very hopeful. There's also going to be an anythingflux model.
https://www.reddit.com/r/StableDiffusion/comments/1j4biel/chroma_opensource_uncensored_and_built_for_the/
https://civitai.com/articles/12246
Chroma is modified from Flux Schnell which has the Apache license btw.
It was also trained to get rid of the distillation and re-add cfg
ah I forgot about Chroma that is a big checkpoint yeah
Hello
The models are mostly diffusion transformer backbones with cross self attention right?
What makes them better worse than others? If we exclude diffusion formulation like EDM or ddpm or ddim
Just number of parameters?
rectified flow is the thing to look into
which models specifically?
is this referring to denoising score matching or score based generative models ?
3.5 or flux or chroma
both 🙂
there is stuff like sliced score matching but does not apply here
sliced score matching got solved by Pascal Vincent in denoising autoencoders if im not remembering it wrong, it still needs to calculate jacobians..
but do these models just scale number of parameters or change architecture or training loss semantics entirely?
sliced score matching was Yang Song. DDIM was a different Song (Jiaming Song), both with Stefano Ermon as co-author. It can be done with jacobian-vector products rather than full jacobians, that is still slow though
architectures changed loads
and major training loss changes as well, in particular regarding first v-pred and then rectified flow
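for the rectified flow part, a minimal sketch of the training target as I understand the SD3-style setup: data at t=0, noise at t=1, and the network predicts the straight-line velocity between them. Lists stand in for latents and the function names are made up, other write-ups flip the time direction:

```python
def rectified_flow_sample(x0, noise, t):
    """Straight-line interpolation between data (t=0) and noise (t=1)."""
    return [(1.0 - t) * a + t * e for a, e in zip(x0, noise)]


def velocity_target(x0, noise):
    """The constant velocity along that line, which the network is trained to predict."""
    return [e - a for a, e in zip(x0, noise)]


def rectified_flow_loss(predicted_velocity, x0, noise):
    """Mean squared error between the predicted and target velocity."""
    target = velocity_target(x0, noise)
    return sum((p - v) ** 2 for p, v in zip(predicted_velocity, target)) / len(target)
```

compare that with v-pred or eps-pred, where the target depends on the noise schedule. Here the target is the same for every t, only the input changes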
Is it downloadable ?
didnt yang song write about score based models as SDE as well
yeah
can anyone help me install comfy on amd
Yes! There's a guide written by Cs1o in techsupport pinned messages. If you get stuck i recommend making a post in techsupport
hello
Is there anything like stable diffusion for game art? I want to create sprites but in the past it wasn't consistent.
Hi there
Like pixel sprites or?
Perfect consistency is probably not a thing
But generating assets like tools or items yeah
hmm a longshot maybe but anyone found a decent finetune of the illustrious V1 release? (not 0.1)
where can I find tutorials, articles or anything for lower requirement models and faster generation? Everything I find is always for PonyXL or any other XL model
i have 4gb vram and 16gb ram
You should use forge webui. Its optimised for lower vram
I found this which looks great but the prompt is completely different to the image. I'm trying to generate a similar thing using that prompt and for me, it follows the prompt (using same everything except Controlnet which idk what it is) https://civitai.com/images/11346154
the prompt is a space scifi themed thing but the image it generated in the link is completely different from that
for me it did generate a space themed img
last update was on february 2024, should I use it anyways?
Yep you can also try out ReForge if you want faster updates
Okay, thank you
do I have to place reforge sourcecode in the forge folder?? i dont understand
No
Its a separate install
Yeah pixel sprites. I think Grok can do it.
I remember back in the day, like 2023, I was generating like 4 images at the same time at a very decent speed. I don't remember how was the model called. Maybe it wasn't even SD 1.5
I thought about generating in a very very fast model, then if i like the generated concept, pass it to img2img in a better model
what do you think
the generations were good but the faces were a bit deformed compared to what it is out there right now (in sd 1.5)
Grok is just Flux 😅
basically every model can do pixel sprites. There are also loras for it.
However, the images just look like pixel art, you usually have to apply some post processing to really make them pixelated
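the simplest version of that post processing is snapping the image onto a coarse grid. Sketch only, with the image as a list of rows and `pixelate` as a made-up helper. Real tools usually also reduce the colour palette, which this skips:

```python
def pixelate(image, block):
    """Snap an image (list of rows of pixel values) onto a coarse grid.

    Every `block` x `block` cell is filled with its top-left pixel, i.e. a
    nearest-neighbour downscale followed by a nearest-neighbour upscale.
    """
    return [
        [image[(y // block) * block][(x // block) * block] for x in range(len(row))]
        for y, row in enumerate(image)
    ]
```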
there might be some Aurora model
it went away for a bit and then apparently came back
Anyone use SD reforge? I'm having some troubles here. I just updated SD reforge (Stability matrix). After that all of my generated images looks very ugly and disfigured. From the 'Png Info' I copied the parameters from my previous generated images and sent to text2img. Still getting bad result (I used the same checkpoint, settings, same seed and LoRAs)
would it be possible for you to install comfyui?
I have it. It's kinda complicated so I don't use it.
there are some node packs that make it simpler
like
they can make an image with just 2 nodes
Have you compared the image meta data of the two images before and after the update?
I haven't. But I used the same parameters so I think it'll be the same.
Check it to be sure.
Load the image into the PNG info tab and copy all data into a txt file.
Then do that with the second image and compare
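if comparing by eye is annoying, a small sketch that prints only the lines that differ between the two text dumps. It uses Python's standard `difflib`, and `diff_generation_params` is a made-up helper name:

```python
import difflib


def diff_generation_params(before_text, after_text):
    """Return only the lines that differ between two saved PNG Info dumps."""
    diff = list(
        difflib.unified_diff(
            before_text.splitlines(),
            after_text.splitlines(),
            fromfile="before_update",
            tofile="after_update",
            lineterm="",
            n=0,  # no context lines, only the changes
        )
    )
    # The first two lines are the file headers, "@@" lines are hunk markers.
    return [line for line in diff[2:] if not line.startswith("@@")]
```

lines starting with `-` are from the old image and `+` from the new one. An empty result means the saved parameters really are identical and the change is somewhere else (extensions, defaults, backend)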
does anyone happen to know a platform that rents h100 instances
Thanks I'll try it later.
Ok let me check.
I had a look at Reforge code and
its using a lot of ComfyUI code anyway (they credited Comfy so its not a bad thing)
I also started out in A1111 so I understand why you like the interface
vast.ai, RunPod, CoreWeave, Lambda Labs, Paperspace Gradient, Google Colab (Pro/Pro+), Amazon SageMaker, Microsoft Azure Machine Learning, Google Cloud Platform (GCP) - Compute Engine, Vultr, DigitalOcean, Genesis Cloud, Crusoe Cloud, Fluidstack, Q Blocks, TensorDock
I can't use A1111. Always getting 'jsonmergemodule not found' error lol.
forge and then reforge are like combinations of different things
they all have linked history though
cos both comfyUI and A1111 got their sampling code from K-diffusion
which was made by someone at Stability AI based on a paper by someone at Nvidia
I heard Comfy is the best. It gives great control for generation. I'm thinking of learning it, but whenever I see the workflow, my motivation always breaks. 🤣
thanks!
out of the GUIs yeah
depends what you mean with "best"
it's definitely not the "best ui", it's horrible 😅 but you have full control and it's quite good in memory and speed
and it definitely has support for more models than any other ui
im getting errors constantly in the cmd
```
  File "C:\Users\user\Documents\Stable Diffusion reForged\stable-diffusion-webui-reForge\venv\Lib\site-packages\gradio\blocks.py", line 1429, in process_api
    inputs = self.preprocess_data(fn_index, inputs, state)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\Documents\Stable Diffusion reForged\stable-diffusion-webui-reForge\venv\Lib\site-packages\gradio\blocks.py", line 1222, in preprocess_data
    self.validate_inputs(fn_index, inputs)
  File "C:\Users\user\Documents\Stable Diffusion reForged\stable-diffusion-webui-reForge\venv\Lib\site-packages\gradio\blocks.py", line 1209, in validate_inputs
    raise ValueError(
ValueError: An event handler (save_new_preset) didn't receive enough input values (needed: 13, got: 0).
Check if the event handler calls a Javascript function, and make sure its return value is correct.
Wanted inputs:
    [textbox, dropdown, dropdown, slider, radio, slider, slider, slider, slider, slider, checkbox, radio, radio]
Received inputs:
```
and then a giant random text
Guys, how do I train a Stable Diffusion model based on my own art style?
Has anyone tried the new HunyuanVideo I2V model with ComfyUI yet?
you train a LoRA - lots of tutorials on YouTube

Don't place reForge in the Documents folder.
Move it to a folder on C: that you created yourself. For example
C:\Ai\
do bigger prompts take longer? seems like they'd have to..
conversely it seems like larger models could take longer, but not necessarily.. like how many dimensions do each token or inference level have?
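HD team group photo, athletes wearing sports gear and knee pads, holding thermos cups, showing a health and wellness theme. Brush-written "养生局" ("Wellness Bureau") in the background, traditional Chinese ink-wash style, cinematic lighting, dynamic composition, fine and delicate artistic detail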
hello to all
anyone here speak German?
anyway, I hope to find a nice person who could help me and create an AI picture for me for, well, political comedy
not really a big task for a person who is able to do it (to be honest, I am too dumb for it), but it should look like Emperor Palpatine (Star Wars - already in his dark robe etc.) with the face of the high commissioner named "Ursula von der Leyen"
hello, anyone here know about SillyTavern?
Depends on the question you have ngl
Hello everyone!
Does anyone know what the Enterprise license to use SD 3.5 commercially costs, in case I exceed $1M annual revenue one day?
it's a community server so hmm, I think you're better off reaching out to Stability directly through their contact forms
okay, thank you
was gonna ask for help installing oobabooga on it (amd/windows)
but i gave up
so nvm
go for Ollama, it's much simpler and usually faster (but less tweakable)
Does anybody have any idea how these videos are made? I'm willing to pay if someone can figure out the workflow https://www.tiktok.com/@biogenesis__
someone recommended KoboldCpp to me, is it tweakable and AMD/Windows compatible?
Just an image to video workflow probably. Hunyuan, wan2.1 etc. the real question is do you have the hardware to run it?
What gpu do you have
usually not. Prompt length is fixed for most models. However, some tools might expand the number of tokens if the prompt is above its maximum length
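
To picture why: the CLIP text encoder used by SD 1.5 and SDXL always processes a fixed 77-token window (75 prompt tokens plus start/end markers), so a short prompt gets padded to the same size as a long one. Tools that accept longer prompts typically split them into extra windows. A minimal sketch of that splitting, where `chunk_tokens`, the chunk size default and the `pad_id` value are illustrative assumptions and not any particular UI's code:

```python
def chunk_tokens(tokens: list[int], chunk_size: int = 75, pad_id: int = 0) -> list[list[int]]:
    """Split token ids into fixed-size chunks, padding the last one."""
    if not tokens:
        # An empty prompt still occupies one full (padded) window.
        return [[pad_id] * chunk_size]
    chunks = []
    for start in range(0, len(tokens), chunk_size):
        chunk = tokens[start:start + chunk_size]
        chunk = chunk + [pad_id] * (chunk_size - len(chunk))  # pad to full size
        chunks.append(chunk)
    return chunks


short_prompt = chunk_tokens(list(range(1, 11)))    # 10 tokens
long_prompt = chunk_tokens(list(range(1, 101)))    # 100 tokens
```

Here `short_prompt` is one window and `long_prompt` is two, so the text-encoding cost only goes up when a prompt spills into another window, and that step is small next to the sampling steps anyway.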
haven't tried can't vouch for it but it seems to work with AMD gpus
what upscalers do you guys usually use
Do you happen to know how runpod handles interruptible instances?
as a heuristic how often should i expect it to get interrupted, I could instantiate callbacks to run every x steps
to save progress
how often do the interruptible instances come back online for use
Adetailer got its update so it works on a 50 series gpu
you can get interrupted within seconds or minutes if someone bids more
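
Since an interruption can land at any moment, the save-every-x-steps callback idea is the usual way to cope. A minimal sketch of the pattern, with assumptions labeled: the state is just a step counter in a JSON file, `run_steps` stands in for a real training loop, and `stop_at` only exists to simulate being kicked off the instance. Real training would also save model and optimizer state to storage that survives the instance.

```python
import json
import os
import tempfile


def save_checkpoint(path: str, state: dict) -> None:
    """Write to a temp file, then swap it in, so a kill mid-write leaves the old file intact."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)


def load_checkpoint(path: str) -> dict:
    """Return the saved state, or a fresh one if nothing was saved yet."""
    if os.path.exists(path):
        with open(path) as handle:
            return json.load(handle)
    return {"step": 0}


def run_steps(path: str, total_steps: int, save_every: int, stop_at=None) -> dict:
    """Resume from the last checkpoint and save every `save_every` steps."""
    state = load_checkpoint(path)
    for step in range(state["step"], total_steps):
        if stop_at is not None and step == stop_at:
            return state  # simulated interruption: unsaved progress is lost
        # ... one real training step would go here ...
        state["step"] = step + 1
        if state["step"] % save_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state
```

With `save_every=10`, an interruption at step 37 loses at most the 7 steps since the last save, so picking x is a trade between save overhead and how much work you can afford to redo.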
I don't know how often individual servers come back online as I don't pay attention to individual servers
Hey all, I'm pretty fed up with Black Forest Labs and Flux 1.1 Pro. It seems like they're not bothering to improve the image generator anymore. As it is now, the images are still often wrong, and an incredible number of times the text is incorrect. Plus, it feels like they've shifted their focus to video and inpainting instead, all while keeping prices ridiculously high. I'm also worried they've gone into that startup scaling mode, and everything might go downhill from here, like so many other scaling startups. Anyone know of new startups or alternatives that can match or beat Flux 1.1 Pro's quality for POD images, preferably cheaper or better value?
what's a POD image?
print on demand image
the ones you print on t-shirts, mugs and so on
text is still gonna be iffy since it's just not there yet
though a little Photoshop could fix it
if you run it locally there are some ways you could work around it with some custom Comfy nodes, but cloud based? no idea
there are some text rendering things on arxiv but generally not released or ported to familiar software
Thank you for your answers. Hopefully, someone else can point me to a better alternative 😦
it's not a model that you need, it's a sampling method
Also, because it's kind of super annoying to complain about quality and price at the same time. I think I would be okay paying the crazy prices right now if they were able to ship what I need to ship. Or, I'd rather pay less. But like this, I feel like I'm paying for pricey stuff that also doesn't deliver. Meanwhile, for months they've been focusing just on the other stuff.
for example this https://github.com/AIGText/Glyph-ByT5?tab=readme-ov-file
i just want an api, i don't care about what's behind the api
closed source is really really far behind open source at this point for image stuff
mostly because APIs kinda have to be generalist whereas the best methods are usually highly specific to one task
and secondly because they limit how much compute they will assign each API call
and generally pick an amount not that high
got it! my question was "Anyone know of new startups or alternatives that can match or beat Flux 1.1 Pro's quality for POD images, preferably cheaper or better value?" I will also add "delivered via API". As I said, I don't really care what's behind the API: closed source, open source, nodes, whatever
So I can stop paying them and use an alternative instead that can deliver the same quality cheaper or better quality at the same price
But I'm not sure the alternative exists, that's why I'm asking
are you willing to make and host the API yourself?
not really, i know how to do it, but prefer to avoid deploy, redeploy (when new models available), start/stop gpu server, code, etc
but tyvm for all your answers 🙌
ok I think my conclusion is I don't know of one then
got it tks! it may not exist yet 😦
Why are they holding thermos cups, lol
alright thanks!
it worked, thank you 🙂
Just looking
Is there a way to convert/merge a sd1.5 model to/with an sdXL? I have a great model i made with great lain but would like to upgrade it with the data of an sdXL model
Yeah
O: how?
Any reliable way to train flux on colab? I have 139 compute units from colab (currently not subscribed but those were when i was)
hello
hello, everybody
hiya, i'm trying to get into SD, and don't really know what the best ones to use are. any suggestions?
Civitai erased an image I made without posting and now I have to recreate it from scratch : (
oh no. sounds bad! don't you need any help?
Hi, do you mean what UI? i recommend checking out Forge. Cs1o has a great guide written in the #🤝|tech-support pinned messages
but it depends on the GPU you have though.
ui yes, thank you i will look into it
i have a 4060ti
oh then you can run a big range of models pretty easily
For anime images I do recommend Illustrious or Pony (though Illustrious is better IMO).
For realism, I suppose a realistic SDXL finetune/merge or Flux schnell, but I don't make a lot of those
perfect thank you
4060ti is fine yeah
2080ti or 3060 even are fine as well, I've rented those before
with some offloading and quants you can run things
video is the hard area because those models don't seem to squish down very well
no it's not
you have to retrain. You could use synthetic images for training, though
how the fudge do you even train an SDXL? i previously used dreamscape
what do you mean? which tool to use? I always used kohya's training tool but there are plenty to choose from
have a go with the Kohya nodes maybe
I know they are called Flux Trainer but they added SDXL support
i'm running A1111
no idea what any of this means
you usually use a separate tool for training. If you want a UI you can search for bmaltes kohya_ss
IDK A1111 stuff any more
*bmaltais
I actually used it before the others but I forgot what I knew
same 😅
I liked Fooooooocus but that's dead now apparently
that was where I did inpainting for the first time I was astounded by inpainting
I removed a clock tower or something like that and was shocked that it worked so well
I'm a layman with PC stuff, it's already a miracle I got A1111 running
dont worry, i think i found it? or recreated the art style i liked at least!
I mentioned it so many times here but I always liked InvokeAI
i was struggling for a bit but i think i found all the pieces of the model and loras i used
I find it a bit weird that so many newcomers try auto1111 when InvokeAI is so much more intuitive and easier to install
I myself most often use ComfyUI because it has the newest features first most of the time
probably because the first thing you find on the internet when researching local SD is A1111
yes, but auto111 is dead
it's from sd 1.5 times
since then there are so many forks
it's totally not user friendly though
anyways, it sounds like I should try a different local SD UI to actually start tinkering with SDXL and other stuff
excuse me, what the shit, ComfyUI only needs a zip file extracted and a bat file executed? What sorcery is this?!
ComfyUI is for advanced users tbh, I use Forge as my A1111 fork
is there a way to re-evaluate my prompt text every time I generate, even if it hasn't been changed? Trying to use the random selection syntax, but it's using the cached prompt (thus not changing the random selection) unless I make a change to the prompt.
ComfyUI btw
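
One general way around this kind of caching is to make the random pick depend on an input that changes every run, such as a seed, so the cached result no longer matches. A minimal standalone sketch of the idea, not ComfyUI's own code or API: `resolve_choices` is a made-up helper and the `{a|b|c}` syntax is assumed to be non-nested.

```python
import random
import re

# Matches one non-nested {option|option|...} group.
CHOICE_GROUP = re.compile(r"\{([^{}]*)\}")


def resolve_choices(prompt: str, seed: int) -> str:
    """Replace each {a|b|c} group with one option, chosen deterministically from the seed."""
    rng = random.Random(seed)  # private RNG so the global random state is untouched
    return CHOICE_GROUP.sub(lambda m: rng.choice(m.group(1).split("|")), prompt)


template = "a {red|green|blue} car in the {city|desert}"
first = resolve_choices(template, seed=1)
again = resolve_choices(template, seed=1)
other = resolve_choices(template, seed=2)
```

The same seed always gives the same text (`first == again`), and a new seed can give a new pick, so wiring a per-run seed into whatever node resolves the syntax is what forces a fresh evaluation.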
comfyui is very complicated to use
you can use swarmui that is a bit more user friendly
I would recommend InvokeAI
I'm so deeply setup with comfy though. would be a pain to switch
I'll take a shot at getting invoke running on the side, and maybe move my things over to it if I like it
does invoke have a trainer or sdxl merger?
that would be sick