#💬|general-chat
Domain specific video generation is still video generation
Results of Minecraft image generation in #🏞|general-with-images
nice gan looks good
Hey i think this may be a dumb question. I joined this discord a few months ago and I forgot the website and or where to download this. I meant to try stable diffusion but I haven't gotten around to it till now lol any help pls
Thanks
Yeah 3.5 medium does much better 2MP images than 3.5 large, but above that it starts to break again :/
HI
I'd like to know if someone already created the GGUF Q8 version of SD 3.5 medium (maybe turbo/4steps) also?
I tried, but failed
I am using SD 3.5 medium on comfy UI on a M2 Mac Pro 18 GB. It's generating good images - but want to understand how to up my game :
- How to make sure the model follows my prompt to the t
- What settings do i use ? steps, cfg, sampler, scheduler, denoise ?
I'm sure City 96 might
Have you tried just using a quantized text encoder yet though? The actual 3.5 Medium model is far smaller than the default FP16 encoder is.
could also unload and reload encoders
Does 3.5 have the same kind of licence as XL?
can you run 3.5 on automatic UI?
I'm using this all in one checkpoint: sd3.5_medium_incl_clips_t5xxlfp8scaled.safetensors
Does 3.5 Medium require 9.9GB VRAM or is that just a recommendation? Will it work on 6GB?
where can i get help with stable ai? i am getting grey images
so is 3.5 medium gonna be like the new SDXL?
could be if the community embraces it..
have you played with it yet?
I suppose by toaster you mean 3060 12GB? lol
I think so but I’m not sure if it’s going to be very popular compared to large since that can fit on 8gb vram with quantization and be much better quality.
Haven't really been in the sd scene for a while, but me and my friend wanted to try to train off her artstyle. we got abt 15-20 images for a dataset. would it be better to train a model or a lora/ti? there aren't any specific characters, but lots of stylistic choices such as color, exaggerated limbs, etc that we want to capture, and i've only ever created loras for character styles like ocs
hey guys, im using SD 1.5 and i want to be able to take an image i made, and keep creating that specific character in different ways, on a beach, etc and im wondering how i do that
Is there a really simple way to take an image you made, and keep pushing out new pictures in different scenarios?
I just tried to do this days ago and it's kinda hard since it depends how much time would you invest into that
yeah, like i mean im kinda willing to put some time in but i dont have time to wrap my life around it, but id love to sorta "build a character" for my avatar
i have a great image , just need to figure out how to get SD to read the image and keep making similar images
Hey if anyone has had the same issue as me with SD where loras/adetailer sometimes takes male faces and feminizes them and vice versa when you are making images with more than one person, i commissioned Anzhc to train a YOLOv8 model that detects only male or only female faces
he's put it up on civitAI now so if its any use to anyone here it is:
https://civitai.com/models/293448?modelVersionId=1007485
The short answer is to create a lora with that character, I saw a complete video on how to do it, I save those videos because you never know when you need it, I will pass the link in a min
be useful to get feedback from other ppl, works well for me but until today I was the only one testing it
Create CONSISTENT CHARACTERS for your projects (the tutorial shows workflows for 1,5 and xl) https://www.youtube.com/watch?v=MbQv8zoNEfY&list=PLqvJUJ2nkbont6HjW4nXKgRIsF4Aqh0tM&index=14
thanks ill look into this, i just use the default stable diffusion UI, so id have to start from scratch here, i use auto SD AAAA
because im on AMD
mick has good tutorials, but you don't need to yell about it
been out of the game, what's the current best anime model? flux? pony? MJ/niji? is 1.5 obsolete?
1.5 never goes obsolete. It ages like fine wine, the older it is the better
probably the best for anime also in spite of technically better models that came out since
ty, i guess i'll keep my 1.5 installation around
u shld!
do people mostly use comfy over auto1111 now? did auto stop getting supported?
comfy is the best indeed. A1111 is kind of unsupported... If you want to use A1111 you can try Forge, which is a better fork of A1111
ok awesome thanks for all the help!
download sd3.5_medium_incl_clips_t5xxlfp8scaled right now anyone know if its supported on Forge ui yet? or just comfy ui?
Are you using the new app?
im using forge and loading app right now
nope not working for me maybe cuz i got amd gpu

i still use A1111 because of the automation, a few batch command lines and my AMD card works beautifully, but its definitely not got flow and stuff and is just the standard ui
also, alot of safetensors i download are SD 1.5
try swarmUI then, it'll do all that, and a whole lot more
SD 1.5 is still the top model for both control nets and brushnets so it has a lot of use still
it used to be the only model that had IC-light also, however IC-light for Flux is coming out soon
its possible Hunyuan-DiT's new control nets are stronger but I have not verified that claim yet
it's also the only one that understands a lot of style and artist name tokens correctly
its the last of its kind
a model with training data mostly scraped and not curated
makes it sometimes the most interesting
Hey guys, anybody know a workflow to do svd image to video along with a text prompt of how you want the image video to actually be animated?
the medium or the large?
i dont think my 3060 can handle the large
you can definitely run 4.0 GGUF of large with text encoders offloaded to CPU after text encode is finished
i have no idea about the terms you just said i am a beginner in this area i just use sd as a hobby
don't click the link it is a scam and likely malware
Stability AI do not have support tickets
hi
anyone here got an easy workflow for sd3.5? Mine is somehow not working even tho i used it with sd3.5 l
#free-generations
If iam using python webui what models work with that Web ui ... very new to this .. any help is appreciated
Is it pony model ?
Thanks sorry if I made a mistake
@fervent thunder where was the link p
Crypto scam...
So common, means its profitable, need to get into it
How much ram is auto 1111 supposed to use? Like RAM and not GPU Ram
hello
do you guys find inpaint tool useful to correct images that are faulty in some parts but otherwise good and cool elsewhere?
anyone know of any recent benchmarks for stable diffusion that include enterprise gpus? curious how much faster I could generate with a cloud service than on my 3090
thanks for the advice! swarmui was the easiest install i have ever done and just works
yes, for fixing and composing or creating. Sometimes it is really hard to make one concept in one go, so you make something that serves as a base and then inpaint until you get the image you want. Also you can photobash or collage other reference images into your image, and then with inpaint merge it into the original image.
for example this image https://discordapp.com/channels/1002292111942635562/1004159122335354970/1301291502630338692
I corrected and use several methods like instant-id for the resemblance, or to get the beverage he is holding, or to general correct things
interesting, nice to know that it has some merits, I don't feel like I'm wasting time anymore
instead I feel like I lack skills necessary to do what I want with these tools but that's a matter of time
inpainting is kind of tricky and frustrating, but I think it is the only way to finish some images
Has anyone else noticed that the latest models have a bias toward making noses, cheeks, and ears pink (or reddened, or flushed)?
Hello, where can I find proficient stable diffuser hobbyists in the community, who could help me with realizing a promo video for a project?
Hi
that sounds like you've got a vae issue
post in the #1092446741984444416 forum for this
Hi
It isn't just my images.
i haven't seen the effect. any idea where they're being generated?
if this means Flux then yeah
that's what population average preferences tend towards though
it's probably because there is way too much anime in the data training set, at least if you're talking about flux and flux loras
not sure, flushed look might match what people look for on average
population average tastes are not good LOL
Thank you! I will do so... 🙂
My images are from mage.space or from the HF demos.
The effect was a LOT worse in 3.0, BTW.
we're gonna have a conversation across multiple discords at this rate. you want to DM me and we can talk there?
Good morning, everyone! ☀️ How are we all doing?
hello ☀️
How are you doing?
not bad
☀️ pretty good
Hello
Hello, can anyone point me to the tutorial on installing stable diffusion on a machine with ryzen 5 4550g ?
what graphics card?
hi everyone
hello, can I get help in French with image generation?
Is it different to do img2img and then upscale, versus doing img2img and upscale in the same go, in latent, so there is only one vae decode?
I see many workflows that have everything integrated, like generation, refinement, upscale, but I don't like working like that, I prefer to do those steps when I think the image is good for example
Hey guys, just a little question: I have a little study exercise and for that I need some text-to-image pictures with the topic "big city and jungle". I have to do a little collage with animation. Mood isn't important, but I thought about some kind of "I am legend" type images. The images has to be big for the animations. Would be great, if someone could help me!
G
Anyone have Workflow SD3.5+Reactor?
try using flux
Hi guys. I am a cameraman and editor. I want to learn how to work with SD. I have a Mac Studio M2 Max with 96 GB of RAM at work. I’m trying to find the best setups for generating with ComfyUI on it. Guys, please share your thoughts/experience.
start by watching Scott's ComfyUI tutorial videos https://www.youtube.com/watch?v=AbB33AxrcZo&list=PLIF38owJLhR1EGDY4kOnsEnMyolZgza1x
Is there any plan for fp16 editions of 3.5 models? fp16 performance is 8x on my hardware
the stable diffusion 3.5 large IS fp16. in case you don't have the huggingface space link https://huggingface.co/stabilityai/stable-diffusion-3.5-large/tree/main
Hi guys, I want to know how to sign in the SD in my private server after I add the bot to my private server. Only Sign up here.
How do I easily check what trigger words the Lora I am using has?
I am using ComfyUI
you'll have to go back to where you downloaded the lora, read the page you downloaded it from, and see what trigger words are listed there
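If the download page is gone, some LoRA files carry their training tags inside the file itself. A minimal sketch, assuming the LoRA is a `.safetensors` file and that the trainer wrote an `ss_tag_frequency` entry into the header metadata (kohya-based trainers commonly do, but many files have no such entry). The function names are made up for illustration, and the most frequent tags are only a hint at the trigger words, not a guarantee.

```python
import json
import struct


def read_safetensors_metadata(path):
    """Return the free-form metadata dict stored in a .safetensors header."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is optional; the other keys describe tensors.
    return header.get("__metadata__", {})


def top_training_tags(path, limit=10):
    """List the most frequent training tags, if the trainer recorded them.

    Assumes a kohya-style "ss_tag_frequency" entry: a JSON string mapping
    dataset folder -> {tag: count}. Returns [] when it is absent.
    """
    raw = read_safetensors_metadata(path).get("ss_tag_frequency")
    if not raw:
        return []
    counts = {}
    for tags in json.loads(raw).values():
        for tag, n in tags.items():
            key = tag.strip()
            counts[key] = counts.get(key, 0) + n
    # Highest count first; ties broken alphabetically for stable output.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:limit]]
```

Calling `top_training_tags("my_lora.safetensors")` (a placeholder path) returns an empty list when the file has no recorded tags, in which case the download page is still the place to look.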
Boo! 👻
Boo -_-
can i have some quest tion
!?
i think it's a bit greedy but...
i had some fun time with sd 1.5 before but never tried this
idk if it works and how?
ic ty
Stable video diffusion? Yes it should work, but it’s pretty outdated and not very good now. Depending on how much vram you have, use mochi-1 or CogVideoX.
i think my lap good to use meta stuffs, can you recommend me?
Hello Everyone,
I'm happy to be here and hopefully I'll get my questions answered.
I'm using Stability AI API for my image generation service called Francis AI. I'd like to make a subscription / buy credits as a company but I can't find any form / input that makes it possible, nor in the stripe checkout nor on the UI of the Stability AI client.
If anyone knows or faced this issue before, I'd really appreciate some further information 🙇♂️
Thank you guys in advance and sorry for disturbing peace with my silly question
Why is it so difficult to get the results you want from some Stable Diffusion Lora?
u must be something wrong
what sd version are you using
Im useing pony version
I use pony too and the loras work for the most part
if ur using clip skip 2, try using 1
I don't like to use pony that much for loras, I much much prefer Illustrious XL, it's like pony but better and responds to Lora training way better than pony
how do you train loras for XL
and is that model good for anime
Illustrious XL is great for anime, its like Pony but made for anime people not pony fans. also you can use tools like simpletuner and kohya_ss for Lora training
@upper plinth Mister can i send u one Lora for u to try to use please? This one idk how to make work
yea go ahead
u can send me the link if u got it from civitai too
Also i forget to add but OneTrainer is great for beginners new to SDXL lora training
Meta doesn’t have an open source video gen model. There are lots of options but as I said, I would recommend
Mochi-1: https://github.com/kijai/ComfyUI-MochiWrapper
CogVideoX: https://github.com/kijai/ComfyUI-CogVideoXWrapper
Mochi has definitely the best text to video generation, but CogVideoX has more control.
wow ok. seems a bit slow I'll have to check my rocm version to see if I can get some more speed
make sure your cpu isn't trying to do any of the work
Is one of them capable to run on a GTX 1070 8G (low ram)?
probably not, they all take quite a bit, but you could try cogVideo
Yeah crystalwizard is right, only one that can fit is cogvideox.
Going to be pretty slow as well, probably 5-10mins on a 8gb vram gpu.
Is there a specific model that will fit my config?
I'll take a look. Using SwarmUI. ComfyUI backend
you might ask in the #🐝|swarm-ui channel
There are 2: 5b and 2b.
With quantization, 5b should fit too and be better quality than 2b. 5b also should have more support too.
I would recommend the “fun” models since they support different res video generations, and different length video generation too (less than 6 sec)
i just want to make image btw -==
then use mage.space
wow
Guys , where can i find a lora-based image styling comfy ui template . I want the image to be generated based on the Lora that i select. I could only find a template for using multiple loras.
just right click on all the lora nodes you don't want to use on that workflow and select bypass. then it won't use them
That would only work temporarily , but once I deploy the UI, I plan to pass the style to ComfyUI directly so it can automatically select the appropriate LoRA model based on the input.
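One way to do that without bypassing nodes by hand is to edit the workflow JSON before it is queued. A minimal sketch, assuming the workflow was exported in ComfyUI's API format (a dict of node id to `class_type` and `inputs`) and uses the stock `LoraLoader` node. The `STYLE_TO_LORA` table, the style names and the file names are hypothetical.

```python
import copy

# Hypothetical mapping from a style name sent by your UI to a LoRA file
# that exists in ComfyUI's models/loras folder.
STYLE_TO_LORA = {
    "watercolor": "watercolor_style.safetensors",
    "pixel": "pixel_art.safetensors",
}


def apply_style(workflow, style, strength=1.0):
    """Return a copy of an API-format workflow whose LoraLoader nodes
    point at the LoRA configured for `style`."""
    if style not in STYLE_TO_LORA:
        raise KeyError(f"no LoRA configured for style {style!r}")
    patched = copy.deepcopy(workflow)  # leave the template untouched
    for node in patched.values():
        if node.get("class_type") == "LoraLoader":
            node["inputs"]["lora_name"] = STYLE_TO_LORA[style]
            node["inputs"]["strength_model"] = strength
            node["inputs"]["strength_clip"] = strength
    return patched
```

The patched dict is what would then be queued with ComfyUI (for example as the `prompt` field of a request to its `/prompt` endpoint), so a single-LoRA template can serve every style.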
you probably want to ask about this on the actual comfy discord
I checked and It seems like they dont have an official discord
they do
they very much do - DM'd you the invite link
Thank you!
welcome
Has someone made a "stable diffusion" version with "Stroke-Based Rendering models"? where the A.I with descriptive text is trained on concepts of mountains, rivers, people, and other descriptors, but instead of hallucinating images made with images themselves that "isn't yours" generates a random sensical painting by applying actual public knowledge of painting techniques, brush styles, ways to use the brush to create a truly unique painting?
no. but did you see the database of images spawning just released?
if you've got a couple million dollars, and a datacenter, you could train a model like that
Of course i don't lol.
that's unfortunate. seems like a good project
Also, who's the mods here? Or is fruit here the only "mod"?
@karmic brook Where do i go/who do i dm to report a scam bot? As a account in this server spams phishing links to a discord pretending to be customer service.
Or do i just ping you here with their ID?
you can ping me
What is better (for pony): 1024 resolution without highres fix OR 512px with Highres Fix?
what is your definition of better?
native resolution is 1024, always better for generation, then highres fix it (!?)
in 512 you are losing quality and highresfix won't matter much
you can get SDXL/Pony to work smaller resolutions but you need two things
- reduce Unet temperature with this
https://github.com/Extraltodeus/Stable-Diffusion-temperature-settings
- Res-adapter, which was trained on SDXL to give it 256x256 to 1536x1536 range
https://github.com/bytedance/res-adapter
stable-diffusion-xl-1024-v0-9 supports generating images at the following dimensions:
1024 x 1024
1152 x 896
896 x 1152
1216 x 832
832 x 1216
1344 x 768
768 x 1344
1536 x 640
640 x 1536
For completeness’s sake, these are the resolutions supported by clipdrop.co:
768 x 1344: Vertical (9:16)
915 x 1144: Portrait (4:5)
1024 x 1024: square 1:1
1182 x 886: Photo (4:3)
1254 x 836: Landscape (3:2)
1365 x 768: Widescreen (16:9)
1564 x 670: Cinematic (21:9)
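Picking from the SDXL list above can be automated. A minimal sketch, assuming you want whichever listed size has the aspect ratio closest to the one you ask for; the function name is made up, and the table is just the nine sizes from the message.

```python
# SDXL sizes as listed in the message above, as (width, height).
SDXL_RESOLUTIONS = [
    (1024, 1024), (1152, 896), (896, 1152), (1216, 832), (832, 1216),
    (1344, 768), (768, 1344), (1536, 640), (640, 1536),
]


def closest_sdxl_resolution(width, height):
    """Return the listed (width, height) whose aspect ratio is nearest
    to the requested width/height."""
    target = width / height
    return min(SDXL_RESOLUTIONS, key=lambda wh: abs(wh[0] / wh[1] - target))
```

For a 1920x1080 target, `closest_sdxl_resolution(1920, 1080)` gives `(1344, 768)`; generate at that size and upscale afterwards.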
Pony is SD XL so
this issue got solved by Res-adapter, the range of SDXL is 256x256 to 1536x1536 now
doesn't have to be a specific number any more either
what is res-adapter, some add on?
yeah its called a domain-consistent adapter
it was trained by TikTok
funnily enough
what?
but is something you have to install and takes resources?
its tiny its like the size of a lora
its in comfy manager, the nodes handle it all
you don't need any other download
It seems to prevent double generation on bigger resolutions
But I don't see the point really, better to generate with native resolution and then if you want resize and then upscale
doesn't it seems easier being able to pick any number though
I agree its not a huge deal
Does anyone have ComfyUI-3D-Pack And can you tell me the size of it because I am currently downloading and I have a 1 MB Internet speed and so far I am over 400MB and is still downloading. Downloading a two gigabyte file takes me about a day. So wanting to know the size is important . but The real issue is not the size since it will complete eventually but rather the issue is that I am using a script to download from Github which is supposed to auto resume any download when I get a connection issue and I do get connection issues quite often with large files so I don't know if the repository is being downloaded properly whenever it gets a connection interruption, or If it's not downloading properly I could just be downloading data forever.
did you do a git clone?
is flux 1.1 available to train a lora with yet?
that's pro and no
hello everyone
may i know what exacly is this server for
i also saw a few images
did you people make them?
i think its dev only
and tbh unless its a anime style or something, just stick to SDXL, i been having massive trouble with flux training for my specific character
this is the stable diffusion discord server. if you look in the #🆕|sd3 channel you'll see a lot of images, everyone makes them. it's not just a developer server.
Hello 👋 I'd like some help with setting up or joining a log in for this app, the thing I use is AUTOMATIC1111 and I don't know what the url or login and password are
Please ping me if you have a solution. Thank you
url should be in the cmd that pops up
Could I see where to look?
when u run webui.bat
do you have a model?
need to download a model and place it in models/Stable-diffusion
Could I have help knowing what to download?
also in case i stop responding or u need further help there is a dedicated support channel #🤝|tech-support
Okay might need to get on computer my last question is it available on computer
yes you need a computer if you want to do it local.
You can try without a computer but it will be sort of difficult.
Let me guess. Throw different colored rice in the air and then move the rice corns on the floor until you get the image that you wanted? 🙂
Yes before AI that is exactly how we did it. Especially in North Korea,
yes the usual way for any custom node. im at 1.8GB now. not sure if the download is screwed. i am able to download other custom nodes fine but they are small. i dont know if ComfyUI-3D-Pack is just silly large or my script is screwing up. if anyone can download it to confirm its size please help me confirm it. normally it takes less than 5 minutes for normal people to download 2GB but my internet speed is 1 MB
just tried and it was 198MB
took around 0.05 seconds on 3GB internet
your file is too big
something up with the script yes
Hi!
I'm trying to figure out why I don't have Python Embedded folder in my ComfyUI installation and how to install it
I want to use FaceID, InsightFace, etc.
hello
anyone know what Emad is up to nowadays?
hello
hey thanks i appreciate it
cooking up a new model using some help from reddit, idk if this'll actually work this time but its worth a shot
Im trying to deeply understand Stable Diffusion architecture, but i need somebody to help me understand something.
I understand that the VAE encodes the input image into a kind of "smaller" image, with fewer pixels or spatial regions, but with more channels that encode the formal patterns and therefore allow the decoder to recreate a full size image.
My doubt is in Self Attention and Cross Attention (with the prompt)
How is Self Attention performed? Is each "Pixel/Region" of the latent image treated as a token? If that's the case, how are the Key, Query and Value vectors generated for this region? The initial embedding for a single region should be too small, only composed of 4 values for the 4 channels.
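For scale, the shapes involved can be worked out with a little arithmetic. A rough sketch, assuming the usual SD 1.x numbers (VAE downsampling factor 8, 4 latent channels); the helper name is made up. In the SD 1.x UNet the 4-channel latent first goes through a convolution that widens it (to 320 channels at the top level) before any attention layer. So each token that self attention sees is the wider feature vector for one latent position, not the raw 4 values, and Q, K and V are learned linear projections of that vector. In cross attention, Q comes from those image tokens while K and V come from the text encoder's output.

```python
def latent_token_count(height, width, factor=8, latent_channels=4):
    """Shape of the VAE latent and the number of spatial positions
    (tokens) at the UNet's full latent resolution.

    factor=8 and latent_channels=4 are the usual SD 1.x values (assumed).
    """
    h, w = height // factor, width // factor
    return {"latent_shape": (latent_channels, h, w), "tokens": h * w}
```

A 512x512 image becomes a 4x64x64 latent, which is 4096 tokens at the highest-resolution attention layers and fewer at the downsampled levels.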
any general eta for sd 3.5 support in automatic1111?
anyone in this chat can help me out troubleshoot why after installing rgthree comfyui node still won't work?
Does anybody know of a good lora for design sheets for pony? Like, I wish to get reference sheets of a character, it would be awesome to have it divided in pieces because I want to model it, but I'm okay with different views of a character t-posing
I try searching civitai but even with the filters on there's so much porn and unrelated stuff that damn
And I don't know where else to search
Hello! I was curious whether or not Stable Diffusion 3 has been released and if/where we download it and where to put it in the folders
😄
I finetuned a Lora, but when using it, it made very little difference although the weight is 1. What can be the reason?
Can someone point me in the right direction? I'm planning to train a model using images and a list of tags associated with each image. So far I've only been creating images and listing tags for each one, I haven't really looked ahead to see what to do with that information.
How do I generate images?
Yes sd3 medium released and it was very very bad at humans. Now there’s sd3.5 large and sd3.5 medium.
Sd3.5 medium isn’t that great either imo, but ok. It’s basically base sdxl with better prompt following and better text rendering but worse aesthetic and similar humans, but honestly I prefer finetuned sdxl models.
Sd3.5 large is great, it has much better prompt following, text rendering, and humans (not near flux level but workable, better than sdxl)
It’s a solid alternative to flux, it’s much more creative and knows more art styles, less biases, but worse humans, text rendering, prompt following.
is Flux Automatic 1111?
I think FP8, NF4 and GGUF killed the need for models smaller than 8B or so
I don't believe a111 supports any of the newer sd3.5 or flux models. I would recommend swarm or comfyui or forge.
and yeah neonninjaastro is pretty right, with nf4 you can fit 12b model(flux) in 8gb vram and sd3.5 large(8b) in 6gb vram. You can fit it in even lower vram gpus if you can wait long enough.
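The arithmetic behind those numbers is simple. A back-of-the-envelope sketch, assuming only the diffusion model's weights are counted: it ignores the text encoders, the VAE, activations, and the small overhead that quantization formats such as NF4 add for their scaling constants. The function name is made up.

```python
def weight_gib(params_billions, bits_per_weight):
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30
```

At 4 bits per weight a 12B model needs about 5.6 GiB and an 8B model about 3.7 GiB, versus roughly 22.4 GiB for the 12B model at 16 bits, which is why the quantized versions fit on 8GB and 6GB cards once the encoders are offloaded.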
I thought Stable Diffusion was Automatic 1111?
Automatic1111 is just a UI to load and use models, and so are comfyui, swarmui, forge. Stable Diffusion itself is a series of models by Stability AI (sd1.5, sd2, sd3, sd3.5).
Flux is a series of models by Black Forest Labs (which has many former Stability devs). It currently includes Flux.1 schnell, which is 4-step like sdxl turbo but for flux, and Flux.1 dev, which uses a normal step range like 25. Both are 12b params in size, similar to the sd3/sd3.5 architecture, and open source.
Ohhh thank you for explaining that! So how do I install SD3.5? And download it?
I really appreciate it
You can probably just search a youtube guide to use it. Sd3.5 is very new so none of the guides are outdated most likely.
why i cant generate brown person on flux
like i put "latino brown man"
and i get white dude
well - is EVERY latino guy a dark shade like he's been in the sun too long? sometimes you have to use terms that the AI understands whether you really want to use those terms or not. geotarget the guy. use the name of the city he's from or something. and use a common male name from that area. like, maybe: Juan standing with his arms crossed, leaning against a tree
Try to use more stuff, dark skin, tan, person of color, idk
And use the negative prompts like white person, white skin ...
maybe dont use latino
AI in general struggles with skin color
it is particularly hard to get non-black dark skin girls in 1.5
the models are biased towards white
1.5 was trained on images from LAION - and its images are from websites - and guess what's on those websites? lots of commercially targeted images
i have two gpu's one with vram 12 gb on with 16 gb. is there any possibility to run stable diffusion video using these two. its would be a great help. i am new learner .
I am trying to create images like the following, but they always come out blurry. I don't have any problems with other images with all the same settings. I also sometimes get blurriness with photos of white sand beaches. Is it the brightness? What is it about this prompt? "flat white background, 3D Unreal Engine render of an old green wooden chair"
Does anyone know any other option compared to Dynamic Prompts? I had one installed but kept getting a lot of errors and it was aiming to the dynamic prompts and other people had the same issues so I removed it. But I like the way I could use it and wonder if there is other options?
Post in #🤝|tech-support and show the image you get, and someone can help 🙂
I am using Dynamic Prompts and it works well. I have some really complicated wildcards pointing all over the place. What kind of errors are you getting? I am using sd-dynamic-prompts extension
I kept getting these and it made auto1111 crash.. sometimes I could do 6 - 10 prompts with 20 batch images high-res etc and sometimes only 2-3 and boom and when I did upscale it would do from 5 - 50 images and then crash.. ( can't paste img ) https://imgur.com/a/mdyvifr
Ah, so it looks like a code/installation error, not a usage error
I just installed it from the extension tab and did nothing else besides adding more wildcards?
Did you restart?
Yeah, this was after using it for like a week with this error :3
So I've done many restarts 🙂
Sorry, above my pay grade, wish I could help. Wildcards are amazing. It will be worth it to find a fix
Yeah, I found it to be amazing too and wish to get it running without issues 😄
Was anyone already able to train SD 3.5 on KohyaSS ?
He released a new repo named 3-3.5-FLUX ?
Anyone ?
simple tuner is probably the way to go if you don't want to use pure pytorch
in the simple tuner discord (its called terminus research group) they have made some progress fine tuning SD 3.5 already
I'm not really a big fan of using pre-made scripts though
is it easy to make a transition from SD 1.5 to SDXL (including Pony)?
no not really
almost everything about a workflow changes
really? even when using A1111? 
oh that might be fine
A1111 has barely any features aside from just running the model straight as it is out of the box so it's probably about the same
good - I plan on doing said transition
will search for all the LoRA's I'm missing on XL when I can
you can often keep the same sampler, schedule and CFG settings, which is nice
Hey crew - I wanted to share a new video and blog post about how LLMs assist with auto-patching vulnerabilities.
https://www.dylandavis.net/2024/11/self-healing-code/
I hope you find it useful. Feedback is always welcome!
Is there a way to make python be multithreaded when loading up comfyui?
@still glacier Don't know if you have mod/admin powers. But we got a scam bot/compromised account
Sjjsksaksjsks
Long shot here, but anyone have docker-compose for AUTOMATIC1111?
anyone here that uses controlnet can tell me if using the colored option of depth anything v2 is better than without it? because I'm using it on comfy but I don't see that it's using the colors of the depth map
Is there no mod role or admin role to @mention? No staff role?
nope. what is it you need?
That's some ridiculously basic server functionality that is missing, to be honest.
Anyway, user huskyeth_alp is DMing people with links to another discord server under the guise of it being a problem support server.
this server is set up with right click reporting
not everybody needs a direct communication line to (often volunteer) mods on a popular server. spamming people in DM is against discord's TOS, so report works there too
It rarely works fast enough to keep them from DMing a ton of people, when they could at least be banned immediately from the server. What would mods be volunteering for if nobody can get a hold of them?
At the very least, a server this big should have some kind of modmail/reporting channel given the amount of spammers and scammers that hover around accessible tech communities.
does 3.5 work with a1111?
main character syndrome ahoy.
The mods do a lot. They're not here to serve you specifically
No, theyre here to moderate and keep a community safe. The user that DM'd me has only been in this server for a couple days, has already been posting links to a random server IN the server, and was never banned?
with mod mail, people tend to "editorialize" the reports. Right click report makes things more objective.
i dont see any issue with what you linked. mod wasn't required there
you your all sheep to be fished
Looks to me like their message was a link, before it was edited to say what it was now.
i can confirm that, i told the guy to not go there
^
But the person was never banned, they just started DMing people instead.
not the only one ive seen also, not sure if it was the same guy
english is tough huh
ya this discord is pretty much abandonware
so report the message. mods can see edit history
they do say that but im no rocket surgon
you don't need to write a big essay
Anyway, back to what I'm actually here for. I'm planning to train a model using images and a list of tags associated with each image. So far I've only been creating images and listing tags for each one, I haven't really looked ahead to see what to do with that information. Can someone point me in the right direction?
AI seems to move so fast that I dont really know how to tell if a tutorial from a year ago is even still relevant
this discord doesn't have any mods any more
the basic ideas of how to train a model are still current
it does they're just not on stage. properly managed communities shouldn't have them so show boaty.
no, they don't. those that were are now community guides
you misunderstood that change if you think that means this server is now unmoderated and lawless
your best skill is twisting what people say, you realize that? go back and re-read the original question
weird
Anyone here use comfyui or know a good related discord server?
is foolhardy remacri still a good upscaler to use with sdxl?
@placid hatch thats some random discord, remember the owners of this server dont care if you lose everything, #🏞|general-with-images message
comfy has a discord of his own. we also have a #🧣|comfy-ui channel here
Awesome, thanks! I'll hop in there in a bit.
Hello
do you follow Purz on twitter? I think he might have a discord channel, also lots of videos. The video run on but are detailed.
upscale to 4k, and don't change the face make it real and same and upscale it HD image
can we change this for better results. now the results are very poor.
anyone has a better extension than Openoutpainting?
why Monetico so popular on huggingface?
it's an interesting demonstration of efficient model training. people take notice.
TradingView Premium for Free:
https://www.reddit.com/r/OptionsDayTraders/comments/1gbvh06/tradingview_premium_free_edition_desktop_pc/
hello
I assumed it was some spam bot. Their response had nothing to do with my question.
how do I prompt for "waving hair" without making the person wave lol
can you send the link? I can't find it anywhere
DM sent
wavey
we have nothing to do with runway
👀
a1111 or comfyui for a complete novice who's only experienced with dall e 3?
comfy
actually, install SwarmUI - let it do all the technical stuff, and run comfy inside it. more information in #🐝|swarm-ui, the swarmUI channel
is SWarmUI the most supported webui
it's heavily supported, yes. and we have a channel for it. the developer also has a discord, and he's both here and there just about every day.
it's designed to make your life a whole lot easier. just run comfyUI inside it. scroll down the page and you'll find install instructions
ty
Hello, is it advised to host Stable diffusion on a Google Cloud server?
advised? not really, it's harder than running a local instance and google fights back against it as it costs them money. But it can be done
I see and how much would it cost average if I setup? I guess it depends on how many images I wanna generate
Do I need to download the stable diffusion v1.4 model/checkpoint? or should I ignore that part and only look for LoRAs and checkpoints from Civitai?
Can be done for free with some google colab notebooks. But they tend to "break" whenever google deems them too costly. cf https://stable-diffusion-art.com/automatic1111-colab/
Ignore v1.4, it is ancient and even back then it was quickly outdated by 1.5.
get some sd1.5, sdxl or sd3.5 based models from civitai (sd3.5 is probably too soon to run on colab, I haven't tried), as those are the most used architectures. Therefore that's what tends to get the most support
you can still find some mirrors of it on hugging face
Most, probably not by now as it's quite old. But a significant part for sure.
That's if I want the barebone/raw model I guess but if I'm looking for specifics I believe a Lora+checkpoints on civitai are enough
gm all
SD 1.5 is very old, though
SDXL is much newer and better and has as much support as the old sd 1.5
SD 3.5 and Flux are the two newest and best models, but there are also less tools for them
control nets, brushnets and powerpaint, and IC-light are the main things I still like SD 1.5 for
apparently a big new HunyuanDiT may be coming
didn't realise they are backed by Tencent also
You can host it locally for free if you have a GPU with 4gb+ vram or an Apple MacBook with an M chip
For cloud there are multiple services. Websites directly for generation or cloud GPU hosts like vast.ai where you can rent a GPU and set up Stable diffusion yourself
Anybody know where I can learn more about all this? Especially the technicalities of it all
Hello, whatsupp guys
I see, welp I'm here to ask someone about webui_forge
Can't be me, hope you get help 😄
Thanks, you are new here also?
post your questions here and people can answer them
oh, I see got it, thx
Yes 😅
Well, mostly which UI I should try and focus on? I've tried the stable diffusion from a1111, and just recently found out there are more. And also about something called pinokio..?(?)
how technical would you say you are?
Well, rather... I installed the SD a few years back with the pytorch and bash thing, but I've fallen off of it. However, I've got time on my side and a decent set up to work with 😄
you want to use comfyUI - and if you'd like to make your life a whole lot easier, install swarmUI https://github.com/mcmonkeyprojects/SwarmUI and let it do all the technical stuff, and just run comfy inside it
Perfect! I'll get that set up. Also, I'd like to make AI video, is that possible too? And the models and stuff can still be downloaded on Civit?
comfy just posted in his discord that he's added support for GenMo's Mochi, so yes - but i have no idea what the hardware requirements are. you're going to need a decent amount of vram at the least
That shouldn't be an issue. I'll find his discord 😄
i can DM you the invite if you like
Please do!
sent
@desert dagger When you have a moment could I trouble you for one of those as well?
sent
Thanks!
👍
Anyone got a good tutorial for animatediff?
there are some on Purz's youtube
reported for suggesting having virginity is insulting
is this sarcasm
💀
with him, probably
read the information in the #artisan-faq channel
Can someone help me with installing stable diffusion in windows11?
Hey checkout the pinned messages in #🤝|tech-support for the Guides.
And ask there for any questions
Can someone suggest some good checkpoints ?
Hello - I’m interested in developing rendering techniques with SD. Can anyone recommend any pre-processors for controlnets (comfyUI) which can interpret depth and normal maps (exported from a rendering program).
I’m looking to improve image interpretation and rendering accuracy - while using SD to play with style etc.
Who is using premium gpt 4 dm now
Can you make gifs in forge ui ?
Can we now get the text in the image correctly?
usually
hey guys
i just started using stable diffusion
so just wanted to ask, is it normal to get bad images
on your first run with around 20 steps?
im using the SDXL model and yet getting really bad images
even tried using better prompts
is there something im doing wrong?
yeah
its 512 by 512
i have rtx 3050 6gbvram not sure if thats whats causing this issue
hmm
ok ill try that
Sdxl doesn’t do well at 512, you need to do 1024x1024
Also I would either recommend some sdxl finetune(juggernaut, dreamshaper, realvisxl), maybe the lightning versions(need 4 steps so 6x faster).
You can also try the new sd3.5 medium, the above finetunes are right now definitely better in image quality but sd3.5 should have better prompt following and text rendering, worse image quality tho.
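The resolution advice above can be sketched as a small pre-flight check. The table below only encodes the commonly cited native sizes (512 for SD 1.5, 1024 for SDXL); the function name `resolution_warning` and the "less than half the native pixel count" threshold are hypothetical choices for illustration, not a rule from any UI.

```python
# Native training side length per model family (square, in pixels)
NATIVE_SIDE = {"sd1.5": 512, "sdxl": 1024}


def resolution_warning(model_family: str, width: int, height: int):
    """Return a warning string if the canvas is far below the model's native size, else None."""
    native = NATIVE_SIDE[model_family]
    # Assumed threshold: flag anything under half the native pixel count
    if width * height < 0.5 * native * native:
        return (
            f"{model_family} is usually run near {native}x{native}; "
            f"{width}x{height} is likely to give poor images"
        )
    return None


if __name__ == "__main__":
    print(resolution_warning("sdxl", 512, 512))    # warns
    print(resolution_warning("sdxl", 1024, 1024))  # None
```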
hmm
Creative avatar. Black void
Hey there
One does not become enlightened by imagining figures of light, but by making the darkness conscious-carl gustav jung
you have a point, might be time to put some thought into my avatar
has anyone tried controlnet-union-sdxl? is there any difference between this and separate controlnet models in terms of quality (same, worse or better)? i guess they should be the same, but... if someone tried both, please share your thoughts
3.5 is still far away from red panda, flux pro or Midjourney, hope v4 is going to bridge the gap
post some images in #🏞|general-with-images ?
Could you explain "expands"? Are you talking about outpainting?
can you post 3-4 screenshots for better understanding, or share the link to the video
that is tiktok ai filter
you want something similar with sd?
will this help?
or this
it's video leap https://www.videoleapapp.com/tools/infinite-zoom-ai
that is video leap and that's the app. and that's what is used for those infinite zoom in and out
ah. the images on that clearly say capcut
they have a lot of video tools
go look at the top right corner of the images in that video
you really know how to tell people you don't want help, don't you
hello
Hello
yo guys whats up.......!!!!! good morning
i bet you all are scared of my chat........????
they're all asleep
I need to generate an image of a gryphon standing on a supply crate, and it needs to look like line art/drawing.. anyone recommend a model and/or LORA thats good for that.
hello
its evening
2am here
arizona
turning quite cold now
same here
Greetings, everyone.
Recently I acquired a computer with the necessary specs to run stable Diffusion and I would like to run it locally. I used it a lot back in the day when you could run it using Google Colab for free.
Now, I would like to go back to using it locally. Problem is, I'm an absolute noob at doing that.
Does anyone know a simple guide I could use to set things up?
Check out these guides, i use comfyui with swarmui as the frontend, really liking it.
Used to have automatic1111 but it was limited to what i wanted.
Can i get advice on which model to train to produce the most life like images of my self for my linkedin?
Thank you so much!
Hello! I am trying to add "external captions" to images to train using the fast-stable-diffusion colab. Where can I find more info on annotating correctly?
I think i found it
"Hey everyone! 👋 I’m looking for a fine-tuned Stable Diffusion model that can generate web user interfaces (UI) based on prompts. Does anyone know of any good models specifically trained for UI or web design, or have tips on how to set this up? Thanks in advance!"
nope.
So @karmic brook runs SAI's social-accounts, am I right?
I'm the server owner and verified as staff if you click into my profile
I always thought fruit was an awesome username 🍌 🍎
@karmic brook is good people, she's not going to bite you
Yummy
no, you can't eat her!
🙃
hello, i need your help. i downloaded the ComfyUI tool for stable diffusion. But when i click on it it gives me an error: Connection denied from 127.0.0.1. Can you help me?
127.0.0.1 is your local host. your machine. why is your machine refusing to allow an application to run on your network? turn off your anti-virus
my antivirus: windows defender XD
your system has something running on it that's paranoid, and is refusing to allow comfy to use the local host network port.
might be in your browser's security settings
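Before blaming the antivirus, firewall, or browser, it can help to check whether anything is actually listening on the local port. The sketch below uses only the Python standard library; port 8188 is assumed to be ComfyUI's default and should be changed if the server was started on a different port. A result of "not reachable" suggests the server never started, or started on another port, rather than the browser blocking it.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts
        return False


if __name__ == "__main__":
    port = 8188  # assumption: ComfyUI's default port
    state = "reachable" if is_port_open("127.0.0.1", port) else "not reachable"
    print(f"127.0.0.1:{port} is {state}")
```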
i use google chrome
is there an easy way to rotate an image in automatic1111 for inpainting? if not is there an extension for that?
When i run a batch job with a set seed, it increments the seed from the one I set for each gen. I'm testing wildcards and i want the same seed for each gen. How do i keep it from changing the seed during a batch job?
hello
hello!
You don’t have it set to fixed. You have it set to increment. Change to fixed.
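The difference between the two seed modes can be shown with a toy model. This is not code from any UI; `batch_seeds` and the mode names are hypothetical and only mirror the "fixed" versus "increment" behaviour described above.

```python
def batch_seeds(start_seed: int, batch_count: int, mode: str = "fixed") -> list[int]:
    """Return the seed used for each generation in a batch job."""
    if mode == "fixed":
        # Every generation reuses the same seed, so only the prompt/wildcard changes
        return [start_seed] * batch_count
    if mode == "increment":
        # Each generation gets the previous seed plus one
        return [start_seed + i for i in range(batch_count)]
    raise ValueError(f"unknown seed mode: {mode!r}")


if __name__ == "__main__":
    print(batch_seeds(42, 3, "fixed"))
    print(batch_seeds(42, 3, "increment"))
```

With a fixed seed, any difference between the images in a batch comes from the wildcard substitution alone, which is what you want when testing wildcards.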
Firewall settings may be the issue
is there any way for me to prevent adetailer affecting other parts of the image? when i put an image into img2img that has previously been inpainted and use adetailer on a different part of the image the part of the image that was previously inpainted loses detail.
it seems to affect the whole image but the inpainted parts are the most noticeable
Make sure you set the Inpaint area to "mask only" when you use Inpaint
Then it shouldnt affect the whole image
ok so on adetailer there is a setting called "inpaint only masked 2nd" is that the setting you are talking about? its already on
hmm i think i figured out what the problem is. with adetailer even with denoising strength set to 0 on the img2img settings it is still redrawing the image before adetailer starts
seems like i might have to manually inpaint to avoid this and not use adetailer
No I mean in the img2img inpaint tab
There is a setting called: inpaint area
Oh wait I get it now you have already Inpainted it
right ok, but i will still have to manually inpaint the area right? which defeats the point of adetailer
Yea you can't restrict what adetailer affects.
Best is to manually inpaint the face then
ok thanks man
Np
hi
I checked the firewall, Google Chrome has 2 checkboxes for private and public
How can I see if my browser's the problem?
Just out of curiosity where do you click that it gives you that message?
I went into the firewall settings on the computer
I mean the “connection denied”
here it doesn't let me send photos, I'll send you a private message, wait
Are people using local stable-diffusion installs with Python and GIT, or Cloud GPU's?
I've got SD working locally but on an AMD card with 8 gig VRAM its results are poor.
Make sure you're using zluda. It should help. Check #🤝|tech-support pinned guide for a detailed guide about it.
Yes, I need to try it out, Direct ML is really bad with the memory!
Guys, who will help to generate one image for the track, it is very necessary, please help
You mean you don't have access to image generation?
Yes(((
Is it possible to run Auto 1111 and Swarmui on the same PC at the same time?
I think so. They might share some resources but should be independent of each other now.
I am not sure if SD needs different versions of Pythin for example.
*Python
Probably be the same
Even if they can't, it's pretty simple to reset and get one or the other working again.
This is quite confusing. When I DL a SD 3.5 model safetensor it isn't the model? I have to also download some text encoders and make a call to a huggingface API?
seems ridiculous
well its just a case of matching the format you need with the software you want to use to run it
not sure which format you need since you didn't mention how you want to run it
It says I have to allow gated access, set quantization parameters, set offloading settings
where does it say that
it does a bad job making images anyway
don't really know about that one
for some reason, the images I get on SDNext are awful compared to those on a1111, so I guess Ill wait for 3.5 support there
Have a tremendous weekendfriday.
Comfy is impossible for me with all those windows and connections
Aww youll get used to it. You must or you will perish!
I cant see it
there's too much to have to see...my eyes are bad
sadly we don't really have a well-packaged "easy option" for people that has the full set of model support and features
it essentially doesn't exist yet
Yes...SDNext has a lot of hoops and manual installation to get 3.5 to work
Yes. It will be a bit more time before an easy option like Photoshop emerges but comfy is the closest we have...
or just use forge bleh
If you are a total newcomer I do recommend Forge.
I have been using a11111111 for about 2 years...tried comfy , it was no good with the spaghetti
Yeah so you can try forge. It's like A1111 but a bit better. And A1111 has no support for it anymore from what I gather.
does forge support 3.5? Does comfy without having to DL the encoders etc?
I think it does. It supports Flux for sure
its updated pretty regularly.
and comfy supports everything so yeah...
dats why i reccomend it
it took me almost 6 months to embrace it
but once i did i never go back
i only use forge for casual image gens
All hail! Make $500,000 dollars in 2 minutes doing nothing!
Whats currently the best base model to use with comfy at the moment?
I would like to discuss this:
ago I had some hopes that one of the video generator platforms would implement a software chain similar to the following.
Step 1: Generate image T-pose with AI image generator.
Step 2: Turn 2d image into 3d simple mesh with AI.
Step 3: Auto-rig and animate 3d simple mesh in an open game engine or one such as Unreal Engine or Unity with something similar to Muse animate and Cascadeur (including physics from game engine for water, particle effects, gravity and lighting to give a reference point for the AI). It's important to note that the 3d object itself need not have a detailed or extremely accurate mesh because it won't be viewable in the final visual output. The locations of humanoid/creature body parts and environmental objects are what matter most.
Step 4: Take frames of that animation and use something similar to Control Net to place New AI generated images on each frame to give any art style you want (Realism, Anime, 3d model etc).
Step 5: Since this is in a game engine you can move the camera and objects around at will to get the angle you want. Also object to object's relationship in distance and size will stay the same, so the user can create a more permanent scene and have longer video clips in the same environment, (usually AI clips are 10 or fewer seconds only and the locations are not easily reproducible).
Note for digital infrastructure: (A lot of video data could potentially be turned into "suitless" motion capture data for training a 3d animation tool. Synthetic data from video game assets/characters created inside the game engines would add to the pool. Within the game engines different angles could be recorded simultaneously with the segmented parts of characters and objects auto labeled for training data. The camera angles/ camera movement, the environmental objects, the character/creature type, the characters starting location and ending location, the type of movement (running, jumping, crawling etc) and the positions of all components, could be semi randomized to vastly increase the amount of quality multi camera angle data.)
Yeah flux dev is the best overall like kaibo said(in human anatomy, prompt following, and text rendering) but sd3.5 is a nice alternative too if you want more creative outputs with less biases and more art styles.
Sd3.5 large will also use less vram I believe
I think it would be helpful to have a video game company like unreal or epic games paired with a video generator team.
epic owns artstation. unreal and epic have their own internal teams
a game company plus a video generator group would help the process i mean
Dumb question, downloading the pruned of that, will that checkpoint "just work" if I drag it into my checkpoints in comfy?
@gritty scarab what are your thoughts?
you have to put it in the /comfyUI/models/checkpoints directory - and i suggest you put it in a directory under that where you can find it - and then restart comfy
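The placement step above can be sketched with the standard library. The `models/checkpoints` layout comes from the message above; the function name `install_checkpoint`, the `subfolder` argument, and the example paths are hypothetical, and the ComfyUI root has to be adjusted to wherever it lives on your machine.

```python
import shutil
from pathlib import Path


def install_checkpoint(src: Path, comfy_root: Path, subfolder: str = "sd35") -> Path:
    """Copy a checkpoint file into <comfy_root>/models/checkpoints/<subfolder>/."""
    target_dir = comfy_root / "models" / "checkpoints" / subfolder
    # Create the subfolder if it does not exist yet
    target_dir.mkdir(parents=True, exist_ok=True)
    dest = target_dir / src.name
    shutil.copy2(src, dest)
    return dest
```

For example, `install_checkpoint(Path("Downloads/model.safetensors"), Path("ComfyUI"))` would copy the file to `ComfyUI/models/checkpoints/sd35/model.safetensors`. Restart ComfyUI afterwards so the new file shows up in the checkpoint loader.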
in their current states I would base a fine tune on flux
maybe in a few months there will be an amazing Realvis or Jugger for SD3.5
there are some finetunes here https://huggingface.co/models?other=base_model:finetune:stabilityai%2Fstable-diffusion-3.5-large&sort=trending
however if you need apache license the choice is more like Schnell, Lumina, Auraflow
thanks will take a look at these
there are several tabs - adapters(loras), finetunes, and so on. look at all of them
TBH I spent the whole year thinking a new Pixart with Apache license was coming and now it seems it isn't
cos they made Sana instead
Sorry what I'm asking is the 3.5 will work with base comfy?
I assumed I was on SD 1.5
yes. comfy has support for sd3.5
Thanks!
Last question lads,
STOIQONewrealitySDXL_SD10 Anyone know what version this is?
is that 1.0?
not sure but sometimes there is SDXL 0.9 and SDXL 1.0
it doesn't matter much
it will run using normal SDXL inference methods
it says SDXL in the name
I need that specific version as I am testing a workflow
where did you find it?
There's no SDXL: https://civitai.com/models/161068?modelVersionId=498484
Type: Checkpoint Merge
so they've merged a bunch of checkpoints
and: Base Model
SDXL Lightning
oh these numbers are the version numbers for the Stoiqo people
Does that mean that the checkpoint the workflow is looking for wont be the same?
and lightning models are 'optimized' to run fast
I should just go with the SDXL 1.0 and try.
look at the model card under "details"
and it will say base model
its occasionally wrong on Civit but rarely
if you use that merge, there's no telling what you'll get. if you are testing something, use the base model
there is an issue on Civit sometimes with knowing exactly what is in a checkpoint
Yes I'm doing that, but from what I Can tell there's no checkpoint with the same name, so I'm assuming unless I have that exact "old" checkpoint I just have these to use.
to me it looks like they made a horrendous mess and tried to merge all sorts of models.
just try all of them
is what I would do
that is what you're looking for
I would have agreed but Realvis 5 is a merge model
Testing thank you!
it specifically lists that file name right there in the readme on that huggingface page
I wasn't very clear
what I meant was I would have agreed that training fresh models is better than making merges
however certain merges like Realvis seem good to me
I'm going to check this out, is this it? https://civitai.com/models/895985/flux-devschnell-base-unet-google-flan-fp16nf4-fp32fp8
this checkpoint has FLAN which is kinda controversial
its not clear whether FLAN is a good idea or not
the official download is here https://huggingface.co/black-forest-labs/FLUX.1-dev
Is that the "commercial licensing" you mean?
Looking at this now: https://civitai.com/models/617609/flux1-dev
I don't understand the "- Should use VPS or if you are not afraid of burning your laptop (or if you are rich, it depends on you :slight_smile: )" 2 paragraphs down
Is flux sending telemetry data? Or do they mean the GPU?
yeah, base model usually works best. There are many Loras for Flux out, but quite often its surprising that the base model beats them easily if you just improve the prompt
I really struggle to find good Flux loras on Civit
that actually improve image quality and don't burn the image at high strength
there are some, its just hard to find
Actually last question, do you think it's worth it to upgrade from 1.5 to flux?
depends on if you want what flux can do.
they mean the gpu
no I actually don't
but its up to you
SD 1.5 still has the best control nets, brushnets, IC-light and powerpaint etc
and it is the model that has the most Arxiv papers too
SD15 ages like fine wine.
it really does yeah
tools keep coming out that make SD 1.5 better
there was time to write papers back then - the tsunamis of advances hadn't quite started yet
SDXL is catching up in papers/tools slowly
it really does take time ;)
a lot of the papers are just done on base research models anyway
like there is a DiT just called "DiT"
hello
i just downloaded it from some yt tutorial and everything i try to create looks so bad bro :(((
like if i go to a website find a cool ai photo and then use the exact same prompt mine turns out all distorted and low quality
some1 pls help
could you try to describe what you downloaded
Install Stable Diffusion Locally (In 3 minutes!!)
that was the title of the vid
okay this is A1111 I can't help with that
I use ComfyUI or Diffusers
read the information in #artisan-faq
Well, damn. I just realized where probably much of the completely weird word salad originates from: people running clip interrogator on images and copy pasting bits of that to prompts (and other people in turn copy pasting them further). Just a few (non-ai) images and I'm seeing exactly same pieces of word salad as you find on civitai.
Yes. That’s the legacy 1.5 left us with.
seems that prompting barely matters for image quality anyway
Eh. I consistently find there’s a big difference in results from a decent negative prompt. Positive is then fighting with the text encoder to get the composition and elements I want (in reasonable computing time and without butt chin).
prompt following and image quality are a bit different
there's not really been a prompt engineering technique that really moved the needle on benchmarks so far
Which is why I specified positive and negarive differently. No negative prompt = outright bad image quality in my uses.
feels unlikely since Flux doesn't have a negative and looks very good
That’s the thing. Flux doesn’t look good to me and is ridiculously slow on top of that.
Any wip fine-tune of sd3.5m?
Do you know why when using refiner at any strength, tried switch at .10, 0.50 and 0.90, always corrupts the image at the very end?
The otherwise nice looking character in the preview suddenly turns into odd colors, low res and such
if you're new then InvokeAI is probably the most newcomer-friendly UI for stable diffusion. You also don't want to use SD 1.5, it's super old and looks really bad unless you add 1000 additional tools and custom models to make it look okay-ish
if your gpu is good enough, use Flux. Otherwise you can use SDXL Lightning or similar models. Or you look up tutorials how to run Flux on old hardware, but I'm not sure if you can then use invokeai
Question.
Does SD3.5M [mmdit] even use a value that signifies a set maximum % the model can alter the initial noise?
If so I want to try and apply a higher value however I don't know if rectified flow even uses such a value
(The value is sigma max in SDXL and SD1.5)
What's the angle called that's not from the side but also not from the front? the one in between
not sure which one you mean but the 3 choices are acute, oblique or reflex
i mean where the character is not looking directly at the viewer but diagonally away. i cant find a prompt that ponydiffusion understands
GPT 4o gave this standing with a focused expression, looking diagonally to the left (or right) of the viewer.
however since it's pony this probably won't work
you need an expert for that model
there is likely a tag
hmm ok
Hey everyone,
I'm looking to turn a real-world video into an anime-style video. I've heard that Stable Diffusion can do this, but I'm a bit overwhelmed by the number of models available.
Could you recommend the best Stable Diffusion model for this task? Any tips or tricks would be greatly appreciated!
Thanks in advance!
best bet is to go to civit.ai and see which models you prefer. no such thing as best since its all subjective in what you think looks good
as for videos, tokyo_jab on reddit has a very good method for doing it, most consistent that ive seen so far. he posts a step by step guide on his profile. there's definitely other methods and a huge range of stuff you can try out but if i were to do that, I'd try his method first
Thank you I'll check it out
Besides negative prompts, what more i can do to avoid double hands, body imperfection on SD 1.5?
what's a good custom node for resizing images in comfyui?
is anyone bored?
could someone generate me a pr image of someone crawling into a man sized croissant thats like a bed but all steamy and delicious
lol im not at home and i need this in my life rn
Is everybody using Flux/SD3.5 by now?
Im still too comfortable in sdxl, and i feel like flux and 3.5 are still not very optimized for quickly iterating in 8 to 16gb of vram
Should I switch? 🙂
Hey anyone know any good prompt helpers or generators?
I have like ideas but idk how to prompt for them correctly
that and people just picking up other people's prompts and using terms with no idea what they may or may not do
claude or meta.ai do a very nice job IF you are using flux or sd3.5
use the tool that does the job you want done, the way you want it done
Damn, I only use Pony Diffusion models
um well - pony was captioned a different way, so it may or may not be useful to use the descriptive text from an LLM with it. what you should probably do is look at how individual terms work with your models and learn how they think. start by just giving them simple terms like apple. or pear. or bunny. generate just that one word for a prompt several times and study the images. then give them prompts that are a few words like an apple on a table. or a bunny eating a carrot. learn how they think, how terms work with them, and then you'll have a much easier time crafting good prompts for them
yeah I guess the question is: What are flux strengths over sdxl that would make a switch meaningful?
nothing and everything. flux will make really pretty pictures in a very tight range. if that's what you want, you'll get better results from flux. but if you want a wider range of content and concepts, go to sd3.5 or stick with SDXL
don't switch - just add more tools to your toolbox
not sure what you mean by prompt following?
Does it display what you're asking more precisely?
Yeah thats the question. If you can prompt it more naturally, describing relationships between objects in the composition, etc
for SD3.5 and flux, they use the t5xxl encoder as one of their encoders, and it wants natural language that's rich in descriptive details
and THEY have the comprehension needed to actually use prompts like "a white man wearing a white tee-shirt and a black man wearing a green jacket running away from a small sports car. the car that is fairly far in the distance, in the background, is on fire."
or "on the left of the image is a green dog. on the right of the image is a blue cat. in the middle of the image is a yellow triangle with a frog sitting on it"
Hum... What?
this is not the way to actually get your message across. all you do by doing this is make people angry and make them ignore what you have to say
Also, what are the most recommended methods for Inpainting today? It seems overwhelming.
Ive tried Fooocus inpaint nodes, Differential Diffusion, Controlnet Promax Inpainting
photoshop gen fill ;)
Promax Inpainting seems to have the most coherence, since it recreates the whole image
that is not what in OR out painting should be doing
the AI should read the entire image, but only act on the selected areas
yeah but the trick they do in controlnet inpaints is to use controlnet conditioning to keep the pixels that shouldn't be inpainted
since this is never absolute, some pixels end up changing
yeah, well, just because you can do something doesn't mean you should...
I agree, but the results are quite good
okay
What is your favorite method?
On the topic of SD3.5, is it significantly faster than flux? (which is too slow to be usable to me)
It’s similar speed, sd3.5 large might be slightly faster but it’s not a significant difference. I would probably recommend flux schnell or sd3.5 large turbo.
Sd3.5 medium should be significantly faster but I won’t really recommend it, the flux schnell and large turbo models are faster and better quality.
They are similar but flux is usually slightly better at prompt following. Text should be better too along with humans which are considerably better. But as crystal said, sd3.5 large is more creative and has less biases, knows more art styles as well.
you might consider using SD3.5 turbo
@quartz siren what's wrong with medium?
It’s nice for its size but when you can just use quantized sd3.5 large turbo or flux schnell, why not use them? Both are better at humans, text, and prompt following than medium while being faster.
well sure, it just sounded like you didn't like it for some reason
medium is designed to be artsy, anyway, so it's a really good refiner
Hello! I am new to stable diffusion and have come across this article which talks about the ability to use it/AUTOMATIC1111 and controlnet extensions through Google Colab (https://stable-diffusion-art.com/controlnet/). I am thinking of doing this, but want to make sure that the notebooks on Google colab are safe and the images i create remain private/my own as i want to use it to create my artwork. Does anyone know if these notebooks are legit/private?
I followed the link to this site where the notebooks are able to be accessed https://andrewongai.gumroad.com/l/stable_diffusion_quick_start?_gl=1*1dzux7m*_ga*MTk4Mjg3OTYxNC4xNzMxMTEzNDA3*_ga_YHRX2WJZH7*MTczMTIxMDg1Mi41LjEuMTczMTIxMTA3Ny41OS4wLjA.
Does having more RAM while having 4GB of VRAM still helps to avoid memory out?
For instance, I upgrade my system ram from 16GB to 32GB
You should probably upgrade to 32 GB in any case. It’s cheap and very helpful even outside SD.
👌
you'll have to look at google's TOS. we're not a colab support site
I have been using Flux with Forge UI for the past week and wondered why my images looked so bad compared to Civitai gallery examples, even if I tried to recreate them with the same prompt. Now I know: Forge UI creates terrible looking images compared to ComfyUI. Now that I'm using ComfyUI my images look just as superb as those masterpieces on Civitai. Which is weird because I'm using the same model, clip, text encoder, vae, sampler, scheduler, prompt, steps, cfg and resolution as with Forge UI. Could it be that this is just a bug that will get fixed some time later? I would prefer to use Forge UI because it's way more user friendly. But right now I can't justify it.
hi
It’s almost certainly some generation setting that has a different default value in Forge vs Comfy.
Eg. I used to get really bad looking sdxl / pony generations when I tried them on a cloud service because the UI used the wrong clip skip.
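One way to hunt for the mismatched default is to write down every generation setting from both UIs and diff them. The sketch below is generic; the setting names and values in the example are made up for illustration and are not the actual Forge or ComfyUI defaults.

```python
def diff_settings(a: dict, b: dict) -> dict:
    """Return {key: (value_in_a, value_in_b)} for every key whose values differ.

    A key missing on one side is reported with None for that side.
    """
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


if __name__ == "__main__":
    # Hypothetical values copied by hand from each UI
    forge = {"sampler": "euler", "steps": 20, "cfg": 3.5}
    comfy = {"sampler": "euler", "steps": 20, "cfg": 1.0, "clip_skip": 2}
    for key, (left, right) in diff_settings(forge, comfy).items():
        print(f"{key}: forge={left} comfy={right}")
```

Settings that are easy to overlook, such as clip skip in the example above, are exactly the kind of thing that only shows up when both sides are listed out in full.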
ok i will try that thank you
hoi, do you guys know of a workflow or 2 that upscales and sharpens images without the use of trained models? As the upscale models i have tried to use for sprites just gets too artifacty.
ok thanks, just trying to figure things out
Thousands of companies maintain databases of discovered security vulnerabilities through their bounty programs. These databases, while currently separate, represent a massive collection of exploitable security flaws. The crucial realization is that combining these databases could create a training dataset for AI language models specifically focused on generating remote code execution (RCE) exploits.
When a company puts out a bounty for a security flaw, and one is discovered, they don't just write it down in a database, they patch the flaw.
Secondly, LLM technology in general is available to both people on the penetration side and on the security side.
But more importantly, you skipped the explanation of how someone is going to access and combine the private databases of security flaws. How are they able to access them in the first place?
well you see the internet is like swiss cheese
yeah the real issue is what if a mouse comes and eats its way through
certainly a concern
if its open source then the people on the security side can go and read it and patch it
you can't just skip to the step where you have an rce llm
if its based on open source security flaws those are already patched
but they close the flaw once it is discovered by the bounty hunter
they don't just leave the flaw open
whats your answer or just uh proclaiming our doom
why would Microsoft not close a known flaw?
why do Microsoft pay bounty hunters if they are not going to close them?
hello
again you can't skip to the step where you have a working RCE LLM
exploits don't have patterns they are often very different
besides which, the security side also has access to LLMs
in fact Microsoft has the unreleased prototypes of GPT 5 right now
What's Emad up to nowadays?
I don't understand your point about bounty systems
because the companies close the loopholes found with the bounties
that's not really true though, they tend to be very different
but this is missing the broader point
the security side also has access to LLMs
which they can use to patch their systems
security exploits against popular software are like finding a needle in a haystack and there's no way AI is gonna be able to do it with their current level, when GPT4 is nowadays struggling to understand basic instructions
to be specific
if the future of security is essentially a race between LLMs
then the large firms will win that race
given their larger budgets and compute supply
you're describing organisations that don't have a security focus, and then extrapolating that to organisations that do have a security focus
again you are skipping to the step where you have an RCE LLM
you seem to really want to skip to that step
but I don't agree with that initial premise
there is an upstream/downstream system for security patches, its not the case that every software company would have to do the work themselves
PS5 isn't the most secure system in the world its hilarious that that is your example
applying patches from vendors aint gonna cut it
you're doing the thing I described here: "you're describing organisations that don't have a security focus, and then extrapolating that to organisations that do have a security focus"
I couldn't find clip skip in Forge UI. The quality problems my Flux generations have in Forge UI are two-fold: First, the quality of fine details like hair is really bad. For example hair looks jagged and crooked (instead of straight or curvy and smooth) and there's JPEG-looking artifacts around them. The second issue is the overall appeal. For example a squirrel I made with ComfyUI has a more expressive, cuter face and nice fluffy fur and the overall image looks very professional, whereas the Forge UI generation looks generic and uninteresting and definitely not production ready.
your initial story involved hacking critical banking infrastructure
which is drastically more secure than Sony
Windows isn't the secure part of Microsoft
the secure part of Microsoft is their Azure datacenters, which so far have never been breached
Yeah, some setting is just outright wrong (clip skip doesn't apply to flux - it was just an example of the wide ranging effects such wrong setting can have) or there's a bug. With the same model and settings, all of the frontends produce equally good (and often identical) results.
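A minimal sketch of how one might hunt for the mismatched setting, assuming the generation settings from both UIs have been copied by hand into plain dicts. The keys and values below are hypothetical placeholders, not either UI's actual export format:

```python
# Compare two hand-copied settings dicts and report every key that differs.
# All keys and values here are illustrative assumptions, not real UI defaults.

MISSING = "<missing>"


def diff_settings(a: dict, b: dict) -> dict:
    """Return {key: (value_in_a, value_in_b)} for every key whose values differ."""
    differences = {}
    for key in sorted(set(a) | set(b)):
        value_a = a.get(key, MISSING)
        value_b = b.get(key, MISSING)
        if value_a != value_b:
            differences[key] = (value_a, value_b)
    return differences


# Hypothetical settings as read off each UI.
forge = {"sampler": "euler", "scheduler": "simple", "steps": 20, "cfg": 1.0, "guidance": 3.5}
comfy = {"sampler": "euler", "scheduler": "simple", "steps": 20, "cfg": 1.0, "guidance": 2.5, "denoise": 1.0}

for key, (in_forge, in_comfy) in diff_settings(forge, comfy).items():
    print(f"{key}: forge={in_forge} comfy={in_comfy}")
```

Anything that shows up in the diff, including a setting one UI exposes and the other hides, is a candidate for the quality gap.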
there are companies with very little security focus
but those are vulnerable already so its kinda the same as the current situation
Attackers breached Microsoft Azure by using credential theft through phishing methods and cloud account takeover, targeting mid-level and senior executives for financial fraud and data theft.
this is still skipping to the step where you have a working RCE LLM
phishing is where you trick someone into giving you the password
its not a cybersecurity breach
its a separate category
ya just leave them on the system lol
I might be wrong about this but I was fairly sure there has not been a known software breach into azure data center
its always phishing
I've already given like 5 arguments as to why that isn't the case
I don't really want to keep repeating so I'm going to end it there
as far as LLM's remotely executing code, that won't happen.
The library that reads the LLM though....
ah yeah its phishing again though
Personally whenever I see someone using "attack helicopter" as pronouns or a name, i put them squarely into the "needlessly contrarian" category though. I wouldn't invest too much energy into a discussion with this bad faith actor
same with icloud, many phishing ones but never one via code, as far as I know
I always assume good faith personally
breach is a breach. But yeah, it was social engineering. the most effective form of hacking
yeah I agree there are phishing breaches
I consider phishing breaches a separate category to code breaches
yeah you can't program around someone giving away their credentials
if someone was making the argument that LLMs raise phishing risks, I find that argument a lot more persuasive
it's like if you were to dress like you were getting ready for soultrain, but then acted like people were obnoxious for saying they expect to see you on soultrain.
The whole attack helicopter thing screams "bad faith"
i agree. the social engineering aspect of hacking will soon be automated to great degrees
out of the points I made I think the strongest one was
if security is going to become a battle of LLMs
that's gonna be won by the big firms and not rogue entities
because LLMs are rapidly moving towards requiring a trillion dollar datacenter
If i came into a chat server with the name "Rittenhouse Swagger" what do you think that would say about me?
100% the security landscape is going to shift to a lot of LLM's engaging each other
most big companies have an internal LLM of some sorts now
of varying quality
there are levels of vuln though
the high end vulns that are a risk to critical secure locations tend to be very convoluted and weird
a lot of those rely on old software.
and an LLM isn't required to deploy those at all. Scripts that hunt for open ports on old CMS versions already exist
I also think its important to not oversell how good LLMs are at coding
I see GPT 1o errors all the time
LLMs are good at coding if they're making boilerplate code. If you ask it to do something involved, it'll make up a solution that won't apply
yeah I use them for most boilerplate stuff now particularly scripts
problem with current coding LLMs is they have no code validation in them. The library that loads the LLM can do it, but that's not conducive to writing effective code
if you're suddenly 25x more productive, a 2400% increase, you should aim that productivity at creating a robust module testing framework because OH BOY you're going to need it
would love it if you made a video, if you have a good methodology
I do worry that I am using them wrong
you have attack helicopter 4000 as a name. All your flexing about clients is surely roleplay
to be fair to them, that meme was not as offensive in the early years of the meme
okay thanks
the yellow star of david probably wasn't offensive to people either /godwins
hmm that's godwins law yep
the meme was from Obama era
when things were different
a bit like that green 4chan frog
i basically have zero respect for self proclaimed bigots, and figure they're always grandstanding some fictional version of themselves
if you believe this guy's bragging, you're a chump
still used today /godwins
he didn't make a particularly big claim he just said he is a programmer and claude helped him
I believe that at least
i believe he thinks it has made him more productive but hasn't used it in any code that is really involved yet
I think LLMs made me more productive :shrug:
at the very least for code completion and stuff like wget scripts
you don't have to actually deploy the code they give you if its bad
GPT 4o can clean up a shell script pretty nicely
it can't do anything helpful in the area of statistical modelling or quantitative modelling which is a shame
it doesn't really even have the relevant software or languages in its training data much
and if you try to ask it the abstract math it can't do it in the abstract either
been trying to get LLMs to work with partial differential equations for like a year lol
ah yeah I prefer linux to windows
build it a reference library rag
it's a pain in linux too. it's just the pain people prefer
Yeah using a rag with inference is smart. Your own in house library as a rag is smarter
so you used it for boilerplate code that you would've gotten faster at using regardless. gfy
if that mono code project is what you call productivity improvements, i guess
good example of how LLMs don't replace experience imo
ai didn't produce it. you did using ai to help you with vue.js and library management.
Faster doesn't mean better either. "functional" is not the only goal. manageable is a better one for long term library development
other tools could've given you these productivity boosts too. code templating is a big deal. What AI isn't giving you is a quality boost
that'll come with experience
i'd say that in your project's case, the core library was the MVP prototype already.
you just implemented a library meta made with vue js. 6h is normal for that, and it seems AI just brought you up to the mean
if you worked in the dev houses i had, your code would be under regular review and wasting the company's time. tbh.
lol, implementing sam2 with vue js is impossible in 6 hours?
i doubt you've got the kind of experience that qualifies that statement. your name is "attackhelicopter4000"
companies at that level are stacked with trans or furry software engineers, so it's unlikely you're one of their colleagues when you openly run with an alias like that
we're in 2024. a discord username with it is a social statement
only for a minority
oh geeze, that minority of people tend to be the oppressed minority too 😮
bigots. bigots everywhere
i dont know, but you are saying true /shrug
i mean, nobody has to argue against baseless speculations.
word sensitivity is a new emerging trend
is it though?
not new i suppose, maybe growing would be better
strawman helicopter is a strawman helicopter. one blazing arrow will down that tumbleweed
odd tactic
i tend to not care about what bigots think of my intolerance of them
pretty big leap there out of nowhere
I think the strongest argument was that Microsoft et al. will have LLMs too
to use on the defence side
and they have more compute + budget than the rogue entities
this is stuff like Windows and .NET apps though
the secure and important servers in society aren't running things like this
what it is, is that you're underestimating how secure the more secure systems are
critical secure systems don't have a random .NET service that can be a vuln
what do you think about the fact that so far no one has ever breached an Azure datacentre using code
there aren't patterns to the high level vulns though
high level vulns for critical systems tend to be very convoluted and weird
its true that some are simple yeah
but the other side, the security side, also have access to LLMs that can race through assemblies
to find vulns to patch
its not really possible to hold a proper conversation with you about it because often I make an argument and then you just carry on and don't address it
Is anyone here good at training LoRAs? I need a little help please
its okay to get excited about stuff
the main reason its probably going to work out okay is that most of the "scary" scenarios involve LLMs breaching core key critical servers
and those servers will be secured
your typical person's domestic devices, and most company or government departments will not be okay, however they aren't okay now so there is no change there
kiss your bitcoin goodbye, the plebian internet is doomed, maybe
Could it be that maybe the samplers of my Forge installation are outdated or something? I noticed another quality issue: my images have high sharpness (sigma), for example between skin and blue sky there's a very dark line separating the two regions. I tried to reproduce this image: https://www.reddit.com/r/StableDiffusion/comments/1feibuv/guide_getting_started_with_flux_forge/
I got a different image even with the same seed. And it had all the same quality issues I always get. Could it be a difference between torch231 and torch24? On the other hand, I also noticed an issue with ComfyUI (although that could simply be related to the workflow I'm using): The shapes of the body are over emphasized. The ribs and collarbone for example looked protruding. The skin overall looked computer rendered whereas the skin on my Forge generation looked natural.
Is Stable Diffusion a GPU killer when you use it for 2 hours every day?
hello there, i found a pretty good flux lora script here https://github.com/TheLocalLab/fluxgym-Colab
i wanted to know if you guys have a similar project for a SD1.x model
hello there
I wonder if processing an existing video with generative AI in order to get more realistic details is already something we can use in production.
I.e. providing a movie done in 3D and getting a more realistic version of it: better looking people and vegetation, keeping temporal coherence, much like we have been doing on images for a while.
Does anyone here have experience with this?
Or maybe can point me to a link on this topic?
hasn't killed mine
I'm not sure anything is worse for your GPU
apart from a handful of numerical computing tasks
not any more destructive than playing a game on ultra for those two hours instead
hey guys, hear me out.
Pokemon themed power rangers
problem is people are running stable diffusion for many hours rather than two
and often leaving it running overnight as well
there are extensions that monitor GPU temps and throttle it when it gets too hot but I have no idea if they actually work
they do but at the expense of speed
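A rough sketch of that tradeoff, not how any particular extension works. It assumes an NVIDIA card with `nvidia-smi` on the PATH; the 80 °C limit, the pause lengths, and the `run_jobs` helper are made-up illustration values and names:

```python
import subprocess
import time


def parse_temps(smi_output: str) -> list[int]:
    """Parse one temperature per line, as printed by nvidia-smi in csv,noheader,nounits mode."""
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip()]


def read_gpu_temps() -> list[int]:
    """Ask nvidia-smi for the current temperature (in Celsius) of every GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=temperature.gpu", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_temps(result.stdout)


def cooldown_seconds(temp_c: int, limit_c: int = 80,
                     seconds_per_degree: float = 2.0, max_pause: float = 60.0) -> float:
    """Pause longer the further the GPU is over the (assumed) limit, capped at max_pause."""
    over = temp_c - limit_c
    if over <= 0:
        return 0.0
    return min(over * seconds_per_degree, max_pause)


def run_jobs(jobs, read_temps=read_gpu_temps, sleep=time.sleep) -> None:
    """Run each generation job, then wait if the hottest GPU is over the limit."""
    for job in jobs:
        job()
        pause = cooldown_seconds(max(read_temps()))
        if pause:
            sleep(pause)  # this wait is the "expense of speed"
```

Every second spent in `sleep` is a second not generating, so an overnight batch on a hot card takes correspondingly longer.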
I feel like if the reduction is not too bad it's likely worth it for some folks
it is sensible yeah
sadly many people go the other way and overclock and overvolt for more speed
does that even work for image generation
is the 3090 24gb worth it for 575 euro?
Even if it is second hand?
yooo, at that price?!
yeah I knew you meant second hand
it's a four-year-old card

