#general-chat
Stable diffusion isn't geared for such things. It's a denoising algorithm at its core. Any changes have to add noise to the image and then "denoise it towards a prompt" if that makes sense. So it's fundamentally a destructive process
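A rough sketch of why that is destructive, assuming the standard forward-noising formula from diffusion papers (noisy = sqrt(keep) * image + sqrt(1 - keep) * noise). The function name and the toy pixel values are made up for illustration, and real pipelines work on latents rather than raw pixel lists.

```python
import math
import random

def add_noise(pixels, keep, rng):
    """Blend an image with Gaussian noise.

    keep is the fraction of signal variance retained:
    1.0 returns the image untouched, 0.0 returns pure noise.
    """
    signal_scale = math.sqrt(keep)
    noise_scale = math.sqrt(1.0 - keep)
    return [signal_scale * p + noise_scale * rng.gauss(0.0, 1.0) for p in pixels]

rng = random.Random(0)
original = [0.2, 0.5, 0.9]            # hypothetical pixel values
untouched = add_noise(original, 1.0, rng)
mostly_noise = add_noise(original, 0.1, rng)
print(untouched)      # identical to the input, nothing was destroyed
print(mostly_noise)   # most of the original detail is gone before denoising starts
```

The lower `keep` goes, the more the model has to reinvent from the prompt, which is why large edits tend to change more than you asked for.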
True but then I have to learn Color grading and lighting. There has to be a simpler way to get professional photos
hire a professional
You'd have to learn all that to get good results with stable diffusion anyways. it'll only do what you guide it to
Well that's what I'm asking. If anyone has done this
I could train my own lora model or similar and create an entirely fictional image, but that's the thing. I just want to make the quality better. Not change everything
yeah. that's actual skilled work and is not an automatable thing really.
not yet anyways
you could look into sd1.5 and sdxl tile controlnet models. They keep the original image VERY consistent, but affecting the color grading or fine tunes like you want.. that's not easy because it's trying to maintain the original image
you could also look at IC-Light, which is a whole different model that can relight an image. it's impressive but it's just one type of control over the lighting and you'll still need to know color theory and stuff
https://github.com/lllyasviel/IC-Light check this one. it's not stable diffusion, but it is a model.. E: i'm wrong. it supplements stable diffusion
Hey so when doing latent noise mask to change stuff with inpainting any tips? Can I use it to add stuff that's not there rather than just tweaking already existing content?
yeah with higher denoise values. You can set it to either start with the same image you're inpainting, or start with pure noise and fill it in from there
Lad still doesn't understand how all SD projects queue up prompts.
I've written many nodes for many applications, but please keep telling me I don't know.
Maybe start by understanding what vram and ram is before you start giving advice?
i didn't say you don't know about comfyui. you're obviously a fan of it.
but you don't know about forge ui and were wrong about it. here's a tip, admitting you were wrong allows you to be right next time.
Yes you can upscale. Nudist doesn't know what he's talking about.
Lad still doesn't understand what "many programs" means.
All front ends use the same process of storing an array of prompts when you queue them up. It's crazy you still don't understand this.
It's not my job to explain to you basic understanding of programming.
til an array of strings will over fill your vram
Vram != ram
20 prompts. that'll fill your ram so hard.
And technically it's json not strings.
(tfw json is just a string)
you don't need to explain basics of datatypes or programming to me. I clearly have a better grasp on it than you realise, since.. well.. this whole conversation
I don't know what he's doing he could be generating some 100+ node prompt. And again it's on the browser you have limited space in js.
Regardless it's not vram
Lmfao lad still doesn't understand
Are you the kid that I argued with a week ago who didn't know what an array was?
Nodes, scripts, modules, all the same thing kid.
Is this the best you have?
no. believe it or not but there are many people online. they're not just all one guy
here's a fun fact. a string is just an array of chars
i mean, when you're having a technical discussion, semantics are kind of the whole thing
Lmao, you sound like that kid, who didn't know what an array was. at the very most it's an array of characters
No one says "Pass me plain text string to the API" it's technically json regardless how you want to argue it, do you even develop mate?
Yes you pass json to an API.
this is a really weird argument. No i'm not @dasilva3334181
Before you give bad advice you should learn what you're talking about there mate. You started arguing about comfy first.
i dont think i'm needed here. you got yourself your own little strawman best friend to argue with. if he only had a brain you might have a useful conversation though
the guy isn't using comfy. he's using automatic1111 webui.
All i said was forge doesn't use comfy backend. i don't use comfyui since it has a large attack surface.
i don't argue about it since i'm not versed in it
Cool, I never talked about comfy. I said the process of storing prompts is the same for any application it's processed and stored as an array of prompts, you're the one arguing about some drivel.
You should prob understand this before you start claiming it has anything to do with the vram, as it doesn't
caching 20 prompts (in json format no less) will fill a megabyte , maybe a few megabytes, worth of memory
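A quick way to check that estimate. This is a minimal sketch: the prompt text and the extra fields are invented, and it only measures plain text prompts, not workflows with embedded images.

```python
import json

# Hypothetical queue: 20 prompts of a couple hundred characters each.
prompts = [
    {
        "prompt": "a spaceship over a desert, cinematic lighting, " * 5,
        "negative_prompt": "blurry, low quality",
        "steps": 20,
        "seed": i,
    }
    for i in range(20)
]

payload = json.dumps(prompts)
size_bytes = len(payload.encode("utf-8"))
print(f"{len(prompts)} prompts -> {size_bytes} bytes ({size_bytes / 1024:.1f} KiB)")
```

With text prompts of this size the total comes out in kilobytes, so a megabyte is already a generous upper bound, and either way this is system RAM, not VRAM.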
Even if, that has nothing to do with vram, and again (third time now) I / We don't know what his workflow is.
you've lost the plot and are spiraling. take a breather. this isn't pretty
his workflow is automatic1111. it isn't comfyui
I see a pattern with this lad, when proven wrong he starts to claim the other person doesn't know what he's talking about, then claims some other garbage.
Do you know what an API is? :DDDD
we're well past a socratic teaching method. what are you trying to say?
i've yet to find a point
I'm trying to say you should prob stop giving advice because you haven't been right once.
dont let the door hit you on the way out big gunner
Hey how can I use SD to improve an image?
You can do that with photoshop or any other traditional editor. Those kind of edits are what graphic design has been all about for decades
Hey how can I use SD to make a image better.
"use photoshop"
that's how useless you are
Now we're in the then claims some other garbage. going to do another full circle?
this is the message you're talking about. sorry to ping you fruit, but could you offer wisdom here? @karmic brook
Please pray for my friend here, he's deficient.
are we to be discouraged from offering advice to people here because of this one guy's ego being bruised? I mean, i will if thats desired
He was asking how to upscale or style transfer, not photoshop my dude.
We went full circle, another attempt to act like I don't know what I'm talking about.
goes to SD form, gets told to use photoshop
It's sad that you don't even know what those algorithms in PS are doing...
it should just be up to the model i think?
the current version of clownsharksampler doesn't have the shift stuff in it anymore
this you?
heyoh clownshark. you're a dev that i respect. offer illumination? how much memory would json with 20 prompts in it require ?
Yeah I think it's up to the model
if you want to stay completely outta this topic, i fully understand
thanks for the reply
secret twist the prompts contain all known literary works
the prompt of babel
this fits
yeh i concede. it would definitely fill all system memory in that case
i mean seriously though. If i was wrong about making any photo looking professional without changing the core aspects of the photo, i'd love to be wrong about that. Lay it on me
Again, we don't know his workflow, he could be using base64-encoded images and many of them, just because you think a prompt is just text doesn't mean it is.
Either way it's not vram
wtf is a base 64 encoded image? you mean like how you encode it into an html string and use javascript to display it?
webui isn't compatible with that. it just takes pixel data in its img2img tab.
did you notice when he said he's using automatic's webui? it's a big clue.
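For anyone wondering what base64 encoding actually is: it rewrites binary data as ASCII text so it can travel inside JSON or HTML, at the cost of being about a third larger. A minimal sketch, using placeholder bytes rather than a real image file:

```python
import base64

raw = bytes(range(256)) * 12           # 3072 placeholder bytes, not a real PNG
encoded = base64.b64encode(raw)        # ASCII-safe text form of the same bytes
# A real image would use its own MIME type here, e.g. image/png.
data_uri = "data:application/octet-stream;base64," + encoded.decode("ascii")

print(len(raw), len(encoded))          # the encoded form is 4/3 the size of the raw bytes
decoded = base64.b64decode(encoded)    # decoding restores the original bytes exactly
```

So a prompt payload that carries images this way can get much bigger than one that is text only, but it is still held in ordinary RAM.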
Ok, are digital tools to get an image's tags/prompt off-topic?
images only have tags if the person published them in the meta data. any metadata reader will see that but there are lots of specific ai generation solutions out there.
another way is to use a vision model to describe the image, if there are no tags saved in the metadata.
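A small sketch of what a metadata reader does for PNG files, using only the standard library. The assumption here is that the prompt sits in a plain `tEXt` chunk under a `parameters` keyword, which is how automatic1111's webui is generally understood to save it. Other tools use other keywords or chunk types, and this sketch skips compressed and international text chunks. The sample PNG is built in memory and its prompt is made up.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def make_chunk(kind, data):
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + data)
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)

def read_text_chunks(png_bytes):
    """Return {keyword: text} for every tEXt chunk in a PNG byte string."""
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    found = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(png_bytes):
        length, kind = struct.unpack(">I4s", png_bytes[pos:pos + 8])
        data = png_bytes[pos + 8:pos + 8 + length]
        if kind == b"tEXt":
            keyword, _, text = data.partition(b"\x00")
            found[keyword.decode("latin-1")] = text.decode("latin-1")
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
    return found

# Tiny 1x1 grayscale PNG with a made-up "parameters" entry for demonstration.
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
pixels = zlib.compress(b"\x00\x00")  # filter byte + one gray pixel
sample = (
    PNG_SIGNATURE
    + make_chunk(b"IHDR", ihdr)
    + make_chunk(b"tEXt", b"parameters\x00a man walking his dog, Steps: 20")
    + make_chunk(b"IDAT", pixels)
    + make_chunk(b"IEND", b"")
)

metadata = read_text_chunks(sample)
print(metadata)
```

If the dictionary comes back empty, the image was probably re-encoded or stripped by the site it was downloaded from, and a vision model or tagger is the fallback.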
i'm reluctant to give more specific advice. it might offend people
Of course there are always limits to what can be described. For instance, for an image of a man walking his dog in a park, the description will most likely cover the foreground in more detail and treat the background as just the place the foreground is in. Whether the dog is a chihuahua or a dachshund will end up making more of a difference than whether one of the barely visible people in the background is wearing a backpack, and adding that description could even interfere with the original focus.
It may not exist but a way of satisfying every party would be to set a minimum and maximum number of tags for the image, because I believe a trained AI would naturally strive to describe the most important parts of the image instead of the harder to see ones. By tags I mean in the sense of online image boards, they usually describe their content with tags and so do some AIs. Natural language, to me, is only a last resort when I cannot find tags to describe something clearly.
LLMs like ChatGPT can already analyse images, but like every non-specialised AI it simply doesn't do that specific task as well.
The utility of this would be for when you've got a reference image but the ways you can describe it are way too broad, and the correct terms would never have crossed your mind. Ex: for poses, bird's eye view, actions.
Sorry if I wrote too much, I think slowmode would have gotten me there
pixel data, wtf is base64
Are you asking about image classification?
@green sand the expert guy has got you covered from here my man. you're set now
literally starts an argument with someone, then starts to cry when they are clowned on.
almost everything you've said in the last hour has been either a bad take, or wrong except for this first time where you seem to understand that meta is embedded in the image
^ relevant
Not at all, I'm talking about the non natural language type of prompt, where you put tags like the ones on imageboards
A prompt is just strings run through tokenization, so a "tag" (single word, comma separated?) in what you're talking about would still be run through an NLP tokenizer.
Still, I'm talking about getting a description to something you can't describe
"digital tools to get an images tags/ prompts" I would consider classification, unless I'm missing something in your question.
With all respect, I don't fully grasp what you're asking, you're asking for a tool or tokens, or tags?
Classifications were supposed to be about the image itself and not its contents, no?
Classification can give you information, what information you need is up to you.
If you ask GPT to explain what colors are in a image, it will tell you.
If you ask it how many men walking their dogs it will do that too.
llama3.2-vision is a good option if you want to run local imager
Is it paid?
It's local
i have some questions boyz
I know on-web is wack but it's kind of my only option, so if you have experienced any alternative that's server hosted it'd help me
llama 3 8B sized models can run on most local GPUs
koboldcpp is a good app to use for them. supports vision models too iirc
open router or docker plus runpod
runpod if you want to run the 40B sized versions yeh
running local is ideal. using a service means they get complete distribution rights to your work, due to how service works. they have to store, copy, transmit your data back to you. And instead of taking specific rights, they just take a broad kitchen sink license
Chat gpt
I don't really know what you're asking I assume it's how to classify things in an image.
i'm unsure if llm vision models fall into the category of classification models. but i'm not about to discuss that in depth because.. well, lets just not
https://huggingface.co/blog/vision_language_pretraining as usual it seems to be a complex topic that isn't a simple yes or no answer, and hugging face blog does a good job diving into it
if its a managed service then yeah they can often have bad licenses
barebones platforms like Vast/Runpod are fine they don't take ownership of your stuff
if its for an important commercial thing the normal thing to do is pay a bit more than Vast/Runpod and get an enterprise contract with AWS/Azure/GCP
home GPU would actually cost me more in electricity per hour than a cloud 3060, funnily enough
cloud is the cheaper option, contrary to very popular belief
depends on electrical prices in your area
yeah cos Clown was saying the price for him and its like half
the US has lower prices in general for that which is partly why cloud is less popular there
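The comparison is simple arithmetic, sketched below. Every number is a placeholder and not a quote from any provider or utility, and it ignores what the home hardware cost to buy.

```python
def home_cost_per_hour(system_watts, price_per_kwh):
    """Electricity cost of running a machine for one hour."""
    return system_watts / 1000.0 * price_per_kwh

# Placeholder inputs: swap in your own measurements and prices.
system_watts = 300       # whole PC under load
price_per_kwh = 0.40     # local electricity price, in your currency
cloud_per_hour = 0.10    # rented GPU price per hour, same currency

home = home_cost_per_hour(system_watts, price_per_kwh)
print(f"home: {home:.3f}/h, cloud: {cloud_per_hour:.3f}/h")
print("cloud is cheaper" if cloud_per_hour < home else "home is cheaper")
```

With cheaper electricity the result flips, which matches the point about it depending on prices in your area.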
Hello everyone! There's a LoRA trainer on Civitai, would the result work with Flux Dev GGUF Q8? Or is it only for the original Flux?
yes it will work with GGUF Q8 if you have enough VRAM
with GGUF if you hit the VRAM limit the lora can fall off
you can fix this by merging the lora
you might not need to though
Thank you for the replies! Do you happen to know how much VRAM GGUF Q8 normally takes?
12.7 GB
the VRAM usage is simply the file size https://huggingface.co/city96/FLUX.1-dev-gguf/tree/main
Ah, sure, makes sense. And LoRAs seem to be pretty small, so hopefully 16 GB would be enough for one or two of them. Thank you!
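The 12.7 GB figure can be sanity-checked from the parameter count. This is a back-of-the-envelope sketch that assumes a 12B-parameter model with every weight stored in GGUF's Q8_0 layout (blocks of 32 int8 values plus one 2-byte scale). Real files keep some tensors in other formats, so the listed size differs slightly.

```python
def gguf_q8_0_size_gb(num_params):
    """Estimate Q8_0 size: 34 bytes are stored for every 32 weights."""
    bytes_total = num_params * 34 / 32
    return bytes_total / 1e9

flux_params = 12e9  # assumption: FLUX.1-dev is described as a 12B-parameter model
estimate = gguf_q8_0_size_gb(flux_params)
print(f"{estimate:.2f} GB")
```

Text encoders, the VAE, activations and any LoRAs sit on top of that, so on a 16 GB card the remaining headroom is smaller than the file size alone suggests.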
Thank you very much!
Hello
Hi
I found something that does what I'm talking about it's Clip Interrogator
that works yeah
hello
Hey
what program do ppl use to train AI on specific voices ?
running in fp8 halves the cost of memory too
Is stable diffusion not freemium anymore?
Applio
What do you guys think will be the best uncensored LLM model to run on the new 5090 with its 32GB VRAM?
check back when the card is released
lolol its dropping early Jan
in AI time that like years
rofl
I'd like to see a Llama 3.3 uncensored
Do you guys have any model recommendations for spaceships? I've been using dream shaper and juggernaut and SDXL.
It depends on the style but you could always train a lora if needed.
do scalpers use a webscraper to buy GPUs on launch day or how tf is it done?
I ask not because I want to scalp (I wish I had the money), but because I want to make sure I get one before the scalpers do
I'm not in the US so sadly waiting in line outside a store is outside of my reach
yes exactly that
hello
no they have alternative front ends for power buyers. if you have a customer thats always willing to buy all your stock in one go, you cater to them. the rest of it is just a facade for the common people
I can't wait to pay 2500 dollars for a 5090, or whatever it costs
I'll feel dumb but the speed ups will be massive
The problem is that amazon itself is a scalper
Ur delusional if u think amazon will sell it at MSRP
MSRP is gonna be $2000 max which is not that bad
but amz will hella scalp it. U need to buy from newegg best buy etc
halo
MSRP price is only available on the first 10 seconds of launch
after that, its sold out and everything else is a resale
only US will do MSRP tho bc of import tariffs
for image gen you don't need RTX 5090 that badly
svdq-int4-flux.1-dev.safetensors is only 6.64 GB of VRAM usage
and that's one of the fastest flux methods out there
only a good tensorrt setup might be faster
3060 is plenty for flux
maybe not dev versions actually
schnells and hybrids yea
3060 will kind of work with the dev version
i've generated images with flux-1-dev with 3060
I've rented 3060 for flux before its ok
oh? how i always get bsod when i try to use a dev model
i would guess that a 3060 might have problems training flux dev though
i got comfy to work with it
ya me too i use comfy
i put the --lowvram command line argument
i tried fluxgym and that just gives error 0
I actually do the opposite and put --highvram
but then spam "unload model" nodes
how much of a hit does it take to gen? on this 8 step anime hybrid it takes me 30 sec for 1024 res pics and 1 min for above that
most of the time is spent loading the ~20gb file
once you've spent the day waiting for that to load, it's a little bit longer than an sdxl render I'd say
niceeee. good info to know ty
I was also loading the model from a HD, not an ssd, so 10 time slower
i finally bit the bullet and copied my models folder over to nvme the other day
it's much better
oh god how long did that take
180gb - not that long when you do something else
on 3060 I tend to use distilled versions of SD 1.5 like SD 1.5 hyper rather than anything bigger
some Schnell stuff at 768x768 like Shuttle 3.1 for 2 steps is okay as well
if I'd spent the time watching it copy 180gb, it would have taken forever!!!
sd 1.5 is stable diffusion 1.5
yeah no need to run that, just use xl models
3060 is good for it
illustrious models
for anime
i gave up running sd1.5, i had absolutely no luck with it at all, everything has either 3 legs, 4 arms or 12 fingers
I personally like quite fast generations but people have different tastes on that
thats cos its super early model, you needed alot of prompting know-how and loras
and negatives
can still make nice stuff with it but
impractical
if have the vram
honestly, i prompt better on 1.5 than xl
one of my merges at the time
i really attempted to prompt on xl, but i simply can't feel it
it depends if you use Ella or not
Ella SD 1.5 prompts well
i did last night when sleeping, took like 3hrs with only 10images
so if you not planning on using pc lol
i can send you a ss of settings if needed exard
I like IP adapter a lot
i had reasonable luck using florence 2 for sdxl prompting, I tried it for tagging train images, then take the prompt and put it into generate
not sure if SD 1.5 or SDXL has better IP adapter
florence 2 is awesome yeah I use florence 2 for every model
what is Ella?
Ella is this thing https://github.com/TencentQQGYLab/ComfyUI-ELLA?tab=readme-ov-file
it lets you use T5 with SD 1.5
it makes it prompt better
i had a weird experience with the resolutions, if i asked it to make an arbitrary picture of resolution not on that list, it makes a mutant
out of all the uis i always tend to gravitate back to comfy
oh, that's actually a good thing, i should try it
i had no luck with auto11, but tried it recently when i had more knowledge and got it to work, but now I have the issue, it randomly does not load lora directories
invoke good? havent messed with it
yeah invoke is under-rated probably
i got comfy within krita running last night
super neat
i gravitate back to comfy
invoke is like comfy but more stable and with canvas
downside is less nodes
Anyone use OneTrainer, I don't know where the trigger/activation tag is set for a lora
i hit the step in the middle
i just put my gpu on train for the last 8 hours overnight and got complete garbage
usually to add a trigger you do it while tagging
your dataset
least thats the way i do it
i put the trigger as the first word in the tagging files
it was 18 epochs for 110 images
repeats?
i don't know where that is set, so couldn't tell you
i have 1.1 days of experience using OT
you can open up ur concept
click it and youll see repeats on the rightish
(moving stuff from on drive to another so i cant open onetrainer to send ss rn)
usually i do like 2-4 repeats
and play with the learning rate
.003 is default i think? i move it to .001
Balance
for regularization, i set the balance to (No. reg images) / (training images)
there doesn't seem to be an option to explicitly set them as reg images like in kohya
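To put numbers on epochs, repeats and the reg balance, here is a small sketch. It assumes repeats simply multiply the samples seen per epoch, which is how kohya-style trainers count them; OneTrainer's exact accounting may differ. The 220 reg images are a made-up figure.

```python
import math

def total_steps(num_images, repeats, epochs, batch_size=1):
    """Optimizer steps for a run: each epoch sees every image `repeats` times."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs

def reg_balance(num_reg_images, num_train_images):
    """Balance value as described above: reg images divided by training images."""
    return num_reg_images / num_train_images

print(total_steps(110, 1, 18))   # the 18-epoch, 110-image run with 1 repeat
print(total_steps(110, 3, 18))   # the same run with 3 repeats
print(reg_balance(220, 110))     # hypothetical: 220 reg images for 110 training images
```

Comparing total steps between runs is usually more telling than comparing epoch counts, since repeats and batch size both change what one epoch means.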
oh shit dev isnt crashing and working now
and only 15sec longer
lma9
fantastic
literally 11999mb is being used lmao
Hi
I was dubious when upgrading my pc, should I go for 16 or 32gb of ram, thinking, I've almost never seen my pc use most of 16gb, but erred on the side of caution and went for 32gb. training SD has brought a new meaning to this when i see task manager performance and 15.99/16gb vram and 31/32gb ram being used constantly!
Ella - reads your CLIP text and, using an LLM, outputs prompts sequentially over the timesteps to improve the render generation process
Why do they not just put the abstract at the top of the github page
I'm guessing it uses the page file for the rest of the ram being used!
that should be what the page file is for, increasing your ram when it goes beyond the physical limit
If I use 0 reg images, the ram consumption and speed of processing increases massively
i'm not sure what effect having 0 reg images has
Do you think you could send me a SS of your settings please, I'd like to take you up on that offer.
What might be a good item for gpu manufacturers would be a PCIE dedicated vram board, no processor, just VRAM
From my experience, changing the learning rate is useful for training convolution networks, can this be implemented with stable diffusion, ie.
If I train a model with a high learning rate for the first 10 epochs, then incrementally lower the learning rate every 10 epochs.
Is this a thing?
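What is being described is usually called step decay, and it is a common learning rate schedule in model training generally. A minimal sketch in plain Python follows; the base rate, interval and factor are arbitrary, and whether a given trainer UI exposes this exact schedule is something to check in its settings.

```python
def step_decay_lr(base_lr, epoch, drop_every=10, factor=0.5):
    """Learning rate for an epoch: multiplied by `factor` every `drop_every` epochs."""
    return base_lr * factor ** (epoch // drop_every)

for epoch in (0, 9, 10, 20, 30):
    print(epoch, step_decay_lr(0.001, epoch))
```

PyTorch ships the same idea as `torch.optim.lr_scheduler.StepLR`, with `step_size` and `gamma` playing the roles of `drop_every` and `factor`.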
Should I consider using Comfy ui or forge?
I'm looking for speed. Also I have a 6gb ram gpu and I'm using flux v1
I remember someone saying that Forge automatically detects and switches to low vram, while comfy uses a command line argument in the batch file --lowvram
but i think that comfy is faster but the issue is that its hard to use control net with comfy
it's not user-friendly
at the end of the day, it's the amount of tensors/s that your gpu/cpu can process that determines the speed, not the application, as both applications use the same underlying code, just have a different ui
I've tried a number of different ui's and the speed it renders from a given model is very consistent across different ui's.
it's true that it's a bit more complex to make the control net work in comfy, but you do have a lot of flexibility with how you wire it all up
most ui's should offer a button or option to allow you to load the model into memory before you start the generation queue, this can help.
wsh chat
Hello guys, where can I download stable diffusion
Hi guys
I want to use SD to draw color block textures like early cartoons. Is there any suitable model or lora? I only have 8GB Vram, is it not enough?
I can't paste pictures here, I mean cartoons like Tom and Jerry
I recommend downloading Stability Matrix for an easy way to get into SD. I recommend either the Forge package or SwarmUI package, if you are AMD gpu you will need to check in tech support probably for further assistance.
damn, wish I'd had that advice when I decided to learn SD
I went from AMD linux to getting nvidia cos windows
that's quite a shift
haha. i did the exact same thing
i prefer windows but amd windows drivers are so bad that i took up proton gaming on my desktop for a couple years because of it
it worked out well though because i was genning on my amd card the year that SD came out. automatic1111 had it all set up to work with rocm on arch
also games were getting more frames
Hi everyone!
Is it worth adding captions to my regularization images when training a lora, ie run them through WD14 etc, other than the class word?
I'm guessing that can't be added to OneTrain?
ignore that, i just looked, it's a complete ui
wd1.4 is just an sd2.1 model iirc. it was bad when it first came out too. The gold standard of anime when it released was novel ai's model and the refines based on that. shortly after wd came out, sdxl dropped and the few people using 2.1 moved on
you don't want to bother with refining 2.1. it's not very refineable. doable but something about its architecture made it a lot more difficult to teach anything
you've convinced me completely to not use WD14 again!
back to the question though, is it worth properly captioning regularization images?
Just had a gander at my page file, it's only 30gb...
Hey guys,
What's a good Discord for chat about Local Video AI Models like Mochi, LTX & HunYuanVideo?
banodoco is probably the best
Can you invite me / post a link?
not sure if I can send it here, so sent it in dm's
if you have 32+ ram you dont need to increase or mess with it
the pink plead is gone :C
How to create photos for 16:9 full screen monitors without cut
with stable diffusion
Yeah but it is some extension?
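No extension should be needed for this: it comes down to choosing a width and height with the right ratio. A sketch under two assumptions, that the model is happiest near one megapixel (typical for SDXL) and wants dimensions in multiples of 64. The function name is made up.

```python
import math

def dims_for_aspect(aspect_w, aspect_h, target_pixels=1024 * 1024, multiple=64):
    """Pick a width/height near `target_pixels` with the requested aspect ratio,
    snapped to a multiple that latent-diffusion models usually expect."""
    aspect = aspect_w / aspect_h
    height = math.sqrt(target_pixels / aspect)
    width = height * aspect

    def snap(value):
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)

print(dims_for_aspect(16, 9))
```

Snapping makes the result slightly off 16:9 (1344x768 is 1.75:1), so either upscale and trim a few pixels, or use an exact multiple such as 1024x576 and upscale to 1920x1080 afterwards.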
https://github.com/HM-RunningHub/ComfyUI_RH_OminiControl
Runninghub.ai has developed and open sourced the comfyui node of OminiControl. Everyone is welcome to use it.
In addition, the corresponding comfyui workflow has also been released, which can be edited and run here.
https://www.runninghub.ai/post/1865085524393500674
what is a good model to use for tag generation in Taggui?
Thanks
tried to do 100 epochs overnight with 3 repeats. I managed to do 17 epochs! The computer decided it wanted to sleep and stopped processing.
i'm guessing that I shouldn't minimise the window
yo. Just rented a 4090 for the first time, I need to make some images illustrating songs for a small concert. Not very experienced with Comfy and never used Flux before. My Comfy is in fp16 currently. What exact Flux model do you recommend and maybe some of your favourite loras?
if you can, try to get SVDQuant nodes working, and use their flux model https://github.com/mit-han-lab/nunchaku/blob/main/comfyui/README.md
its by far the best choice for 24GB VRAM and below
Hey how are you guys. I have a very tiny problem which is extremely big for me. I need Stable Diffusion XL Inpainting's finetuning script, but even after months of work, I can't find a working script.
I saw SD v1.5 and SD v2 Inpainting's script which makes random masks and I think that is awesome, but even extensive works on the script trying to convert it for the XL model came up dry. I need help, if someone could?
is this the right place to ask, or would the tech support channel be more suitable?
Hello!!
I'm really excited to join this community! I'm passionate about everything related to AI and always eager to learn more. I use AI to help with projects like book writing and content creation, and I'm looking forward to connecting with others who share the same interests.
sounds like a scam to me
if it's not a scam, you shouldn't be asking to loan people's accounts, that's called phishing.
How do I make an image prompt in this code? I want to be able to load a picture and create a picture that is inspired by the input picture
from diffusers import AutoPipelineForText2Image, DPMSolverMultistepScheduler
import torch
import os
import time

# Image storage folder
IMAGE_FOLDER = './images'

def create_pic(prompt):
    # Load the model
    pipe = AutoPipelineForText2Image.from_pretrained('lykon/dreamshaper-xl-lightning', torch_dtype=torch.float16, variant="fp16")
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
    # Optimizations for vram usage
    pipe.enable_model_cpu_offload()
    pipe.enable_vae_tiling()
    pipe.enable_vae_slicing()
    # Inference (should take a few seconds); pass the seeded generator so runs are repeatable
    generator = torch.manual_seed(0)
    image = pipe(prompt, height=768, width=768, num_inference_steps=8, guidance_scale=2, generator=generator).images[0]
    # Name the file after the current time and save it in IMAGE_FOLDER
    current_time = time.strftime('%H%M%S')
    filename = f"{current_time}.png"
    path_to_save = os.path.join(IMAGE_FOLDER, filename)
    os.makedirs(IMAGE_FOLDER, exist_ok=True)
    image.save(path_to_save)
    print("heer")
    # Return the path the website uses (note: not the same folder as path_to_save)
    return f"D:/setup/Gamla bilder/website/{filename}"

print("saved to ", create_pic("man jumping from sky"))
Only thing i can see looks like filename but "man jumping from sky" seems to fit prompt syntax?
Nvm filenames date based. That must be the prompt
yeah but what I am looking for is being able to load a picture so img2img. So when I write in the textprompt it will use the loaded picture as guidance @modern pagoda
yes so lets say i have a mug i want to use that in the loaded picture, and then in text prompt i say dog holding mug, and it will hold the mug i loaded, ykwim?
It should in theory depending on model
yes but i am pretty sure SDXL supports that
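A hedged sketch of the img2img version of that script, using diffusers' `AutoPipelineForImage2Image` with the same checkpoint. The function names, the default strength and the 768x768 resize are illustrative choices, not tested settings for this model. The heavy imports live inside the function so the small helper can run on its own.

```python
def steps_actually_run(num_inference_steps, strength):
    """img2img skips the early part of the schedule, so only about
    strength * num_inference_steps denoising steps are executed."""
    return min(int(num_inference_steps * strength), num_inference_steps)

def create_pic_from_image(prompt, image_path, strength=0.6, steps=8):
    # Imported here so the helper above works without diffusers installed.
    import torch
    from diffusers import AutoPipelineForImage2Image
    from diffusers.utils import load_image

    pipe = AutoPipelineForImage2Image.from_pretrained(
        "lykon/dreamshaper-xl-lightning", torch_dtype=torch.float16, variant="fp16"
    )
    pipe.enable_model_cpu_offload()

    init_image = load_image(image_path).resize((768, 768))
    # strength near 0 keeps the input almost unchanged; near 1 mostly ignores it.
    result = pipe(
        prompt=prompt,
        image=init_image,
        strength=strength,
        num_inference_steps=steps,
        guidance_scale=2,
    )
    return result.images[0]

print(steps_actually_run(8, 0.6))
```

Plain img2img carries over composition and colors from the input picture rather than a specific object. For "the dog holds the exact mug I loaded", an IP-Adapter style setup may be a closer fit, though that is a guess about the goal.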
lmk if u still looking for someone hosting open source 3d gens
latent diffusion coming to LLMs. so it'll generate entire passages of text at once instead of only the next token at a time. META published a paper on it and the next generation of models will likely be developed this way
Not looking for someone to host stuff just points on how to run locally so I can do it too. I made Tripo run but I'm having trouble installing 3Dpack
wow thanks this is really huge
Yeah it's a generational leap (heh)
there were some funny diffusion language models before they were not too strong but used for some stuff like artificial data creation
yeah it's not a new direction. it's been tried before. but Meta's new model seems to get it done
Meta get the benefit of the doubt because scale yeah
https://github.com/facebookresearch/blt here's the code if you're that sort of monkey
ah nice they gave code that's cool
as far as i can tell the license is permissive too
very good

hi
is it possible to reverse search an image on civitai?
try png info first
load an ai generated image and if it has metadata the png info will pull the prompt
ill try rn
i put the image but nothing shows
just to give context, the images are from this guy https://x.com/arisato_yu
????
that has nothing to do with diffusion
ah okay I haven't read the paper yet I assumed it was one of those diffusion language model things
I don't think so but maybe their API
anyone know of a good stablediffusion finetune for pixel art that i can run on 12gb of ram? I'm trying to train a controlnet on a specific pixel art task and i need to make a good choice of base model
12GB VRAM and you can run flux
can i train controlnets for flux? that's what I've been using for straight up image gen. was trying to use a lora for my task but it doesn't really work for what I'm trying to do
I'm using omnigen on playground (https://omnigenai.org/playground). I launched a generation but it indicates "queue: 36/36" + a timer which increases over and over, slash 6349475.8s. There is also a loading icon which plays over and over. 6349475.8s, that's more than 73 days. This is just crazy. What's wrong here !??
was explained to me that way
i welcome being wrong
yeah, it's a very different technique
but a cool paper, although I have the feeling it's not a theoretical breakthrough but rather an engineering one. Their main recipe is the n-gram encoding, not their fancy entropy based patching
it uses the t5 encoder so yes
can automatic1111 run flux models or i can't yet?
nope
thanks for the info
hi
on forge with some workarounds
its native supported by forge
and gguf models work there too
interesting
Hi guys; where can I go to get a refund for my subscription? I can't get through to support
automatic1111 can't but forge webui, a heavily modified fork of a1111 will run new models exceptionally
i didn't know that, thanks for the info
i might try it out
old extensions won't work as well. there are some compromises with it
i don't really use extensions, the only one i use is tiled vae
i have a thing called "Stability Matrix" that allows me to manage and update a few different UI's, all of them with access to the same models and lora folders
that seem complicated to set up
The matrix app manages it all pretty well for you
you could otherwise do it yourself with symlinks, which isn't so hard
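The symlink approach looks roughly like this. It is a sketch with made-up folder names that runs in a temp directory so it is safe to try. On Windows, creating symlinks needs Developer Mode or an elevated prompt, and `mklink /D` is the command-line equivalent.

```python
import os
import tempfile

def share_models_folder(shared_models, ui_models_link):
    """Make `ui_models_link` a symlink that points at `shared_models`."""
    os.makedirs(os.path.dirname(ui_models_link), exist_ok=True)
    os.symlink(shared_models, ui_models_link, target_is_directory=True)

# Demo in a temp directory; real paths would be your UI install folders.
root = tempfile.mkdtemp()
shared = os.path.join(root, "shared", "models")
os.makedirs(shared)
open(os.path.join(shared, "example.safetensors"), "w").close()

link = os.path.join(root, "some-ui", "models")
share_models_folder(shared, link)
print(os.listdir(link))   # the UI folder now shows the shared files
```

Each UI expects its own subfolder layout, so in practice you link the individual subfolders (checkpoints, loras and so on) rather than one top-level directory.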
looks like forge ui has multidiffusion built in. Which is tiled vae afaik
not sure if that works on forge though
i'll need to try that out later
forge ui seem better than automatic1111 so i think i'll switch soon
thanks for all of the information
"better" in some ways.. but others . .. i often find it being a complete memory hog
then i might not switch if it take that much memory
i don't know if 12gb of vram would be fine
though i could try comfyUi
i heard that it's pretty good
12gb is tight. comfyui might allow more flexibility there, but its more advanced noodling
https://arxiv.org/abs/2406.02507 what is this ? negative guidance? "throw some poop at it. that'll make it better" lol. more involved than that surely, but summing it up that way seems hilarious.
yes its fine
<12g 3060 user
Generation is quick, training however takes the night
if you use comfy/swarm youll have better speeds too imo
is there some damn tutorial to launch the software? i look at stability.ai public releases, it has zero search results for "download". i download some random model, it asks me to download user interface for stable diffusion, i do that, there is not a single .exe file to launch the software....
check out stability matrix, i use it to be able to use multiple uis while sharing the models across uis
easy to use
7za.exe just flashes the cmd black screen and python.exe has some credits screen
it has no errors, it simply closes itself.
yeah, he told to download some other file which doesnt work as well. just flashes the explorer window as if file was opened but closed too quick to display anything
read what hes saying, you arent launching the webui via the .bat
i did. it had no exe files, so i ran the bat file, it simply flashes the explorer as if cmd window was opened but closed too quick to show anything
what bat file are you launching?
webui.bat
if i'm not wrong, the right .bat file to open should be : webui-user.bat
for me, it's the file that i need to open
with automatic1111
it behaves the same
```
@echo off
set PYTHON=
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS=
call webui.bat
```
that's the webui-user.bat
do you have an nvidia gpu or an amd gpu? (or intel. if it's intel, i can't help)
nvidia i think
alright. it says for nvidia to
Edit the webui-user.bat (right click, edit), At the line COMMANDLINE_ARGS= You add: --xformers --no-half-vae
Add the following command there too depending on your GPU Vram. (Check Task Manager ->Performance ->GPU ->Dedicated GPU-Memory)
i use amd so my webui-user.bat is
@echo off
set PYTHON=
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS= --use-zluda --update-check --skip-ort --medvram-sdxl
set HIP_VISIBLE_DEVICES=1
call webui.bat
couldnt find python error
so after the place where it's written : set COMMANDLINE_ARGS=
you should add --xformers --no-half-vae so it should be :
set COMMANDLINE_ARGS= --xformers --no-half-vae
for you
(wait this is the wrong chat)
blender runs fine with python, so idk.
so if you use Nvidia, do you have CUDA downloaded on your pc?
it's for nvidia graphic card
we should switch to tech-support chat because we are talking in the wrong chat right now
(sorry for flooding general)
not needed afaik. in the past it helped because you'd have the newest version of cuda, but drivers and pytorch have come a long way since.
Consider that you wouldn't need to install the entire directx sdk to play dx games. you only need the dll that works with the binary compiled for it. The driver has cuda built in for that end user purpose
yes
welcome
hi happy to be here
Heyy
hi, very happy to be here with you!
Hello. I am new to SD. How do artists get such consistent-looking AI models, for example in Instagram pics? What settings and tools do they use? How can you create an AI model with 100% the same look every time, to use as a model online etc.?
hello! I just start to use SD and happy to be here
For beginners I would always recommend InvokeAI. It is by far the easiest software
i needed an image generator which generates map layouts. the one i found used stable diffusion, so i had to go with it
I use ComfyUI myself, but I would never recommend that for beginners. The same with Forge or Auto111. Yes, they are more intuitive than Comfy, but not by much. InvokeAI might have less plugins and need always a few months more to come with newest features, but it installs automatically, has an intuitive user interface and lots of tutorials
stable diffusion is a model, not a tool
well, it said to download stable diffusion so idk.
its just a model. You could run it in python if you are a programmer, but I assume you are not
you need an application that runs stable diffusion for you
there are plenty to choose from
- comfyui
- swarmui
- auto111
- forge
- invokeai
- ....
invokeai is in my opinion the easiest to use
you can download it here:
https://github.com/invoke-ai/InvokeAI
when you install it, it will ask you which model you want
there you can select stable diffusion
btw. if you want to make battle map layouts, I don't think stable diffusion or image generation in general is the right tool for it... I tried several stable diffusion models for battle maps. I also made a model myself. Yes, they work somewhat, but they are not perfect. I would rather use one of the many tools you find on e.g. steam for it (like dungeon alchemist)
should probably do a bit more research before jumping into something new
asking in a discord channel is "research"
Am I supposed to have the new hook nodes in comfyui? They are beta, but also they have been released 2 weeks ago. I have a stable build, so are they not part of it?
probably need update
Do you have those nodes and do you have a stable build? Of course I updated everything.
I have those nodes but I don't have a stable build I just download the newest of everything as soon as it releases
Is there a guide explaining the different prompt parameters? cuz sometimes I see things like this <0.9> or this (Prompt)
That depends on what interface you're using. ComfyUI and WebUI have a syntax section in their documentation.
k, I'm using forge
Forge should have the same syntax as WebUI
I'll try to look for it
Short answer is <> is used to add loras and () to increase the weight of a part of the prompt. But there's more cool stuff you can do so definitely read the documentation or watch a tutorial.
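to make the two bits of syntax concrete, here's a small python sketch that pulls them out of a prompt string. this is only an illustration, not the parser WebUI or Forge actually use, and the lora name `myStyle` is made up. it assumes the common WebUI forms `<lora:name:multiplier>` and `(text:weight)`; plain `(text)` without a number is a milder boost and isn't handled here.

```python
import re

# <lora:filename:multiplier>; the multiplier is optional and treated as 1.0.
LORA_TAG = re.compile(r"<lora:([^:>]+)(?::(\d+(?:\.\d+)?))?>")
# (some text:1.3) sets an explicit attention weight on that text.
WEIGHTED = re.compile(r"\(([^():]+):(\d+(?:\.\d+)?)\)")


def describe_prompt(prompt: str) -> dict:
    """Return the lora tags and explicitly weighted phrases found in a prompt."""
    loras = [(name, float(w) if w else 1.0) for name, w in LORA_TAG.findall(prompt)]
    weights = [(text.strip(), float(w)) for text, w in WEIGHTED.findall(prompt)]
    return {"loras": loras, "weights": weights}


if __name__ == "__main__":
    print(describe_prompt("a cat, (fluffy fur:1.3), <lora:myStyle:0.8>"))
```

the docs cover the rest (nesting, square brackets, prompt editing), so treat this as a reading aid only.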
Is the t5xxl_fp8 encoder good ?
Hey guys!
Beginner here.
I've managed to get the hang of image generation and modifying those images by inpainting by following some tutorials, but I'd like to alter an image of my pet. And for some reason I struggle to get that right. Either it barely changes anything, it screws up the shapes, or... it turns it into something entirely different...
I was wondering, do you guys know of any (video) guide to do this kind of stuff? For example, I'd like to create something like this:
https://image.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/eae3f1f8-c95e-49cb-b502-e5b88fd76082/original=true,quality=90/00102-3865871024.jpeg
or this:
https://image.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/562f5b5e-e84e-4c2c-862a-67915014e736/original=true,quality=90/41254735.jpeg
Thanks!
(I'm using AUTOMATIC111 if that matters)
Make sure you set the inpaint area to "masked only"
Then set the resolution to a square
That should help
And set the denoise to 0.5 at the start, then adjust
I'm already doing those things. Very little effect unfortunately
Can you show an example in #prompting-help ?
Sure, give me a couple of minutes. Thanks already!
Yes its good
whats the best model?
i finally got everything working lol Im trying a few different ones
There is no "best" model, completely depends on what you want.
flux dev has the best prompt following, humans, and text rendering, however sd3.5 large is a bit worse in the above capabilities but is more creative, knows a bit more knowledge, and knows more art styles.
Pixelwavev3(finetune of flux dev) is great too, has flux dev level capabilities and is more artistic and has more knowledge.
If you want fast models, flux schnell is the best at 1step generation and shuttle3(finetune of flux schnell) is the best at 2-8step generation. Sd3.5 large turbo is good quality at 4-8step but shuttle3 is similar quality at 2step and can do much higher res like 2k.
Ahh I see i just meant general consensus what latest model do yall use
Buut yeah, more dependent on what ur looking for
probably flux/pixelwave or sd3.5 large, I personally just use shuttle3 because it has good detail/quality and incredibly fast since its low-step.
hey any good tutorial for installing stable 3D?
Does anyone know a good upscaler for flux generated images?
Comfyui_TTP_Toolset
i'm on forge ui
(╯°□°)╯︵ ┻━┻
Not sure if I am dumb, but I am using WebUI reForge and I can't see the ControlNet tabs. What am I doing wrong?
"sd_forge_controlnet" built-in extension is checked under the extension tab
and no errors in console
nvm, I have an error in console:
Path E:\SD\extensions\sd-webui-controlnet\annotator\downloads does not exist. Skip setting --controlnet-preprocessor-models-dir
and then
*** Error loading script: controlnet.py
Traceback (most recent call last):
File "E:\Data\Packages\reforge\modules\scripts.py", line 533, in load_scripts
script_module = script_loading.load_module(scriptfile.path)
File "E:\Data\Packages\reforge\modules\script_loading.py", line 13, in load_module
module_spec.loader.exec_module(module)
File "<frozen importlib._bootstrap_external>", line 883, in exec_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "E:\Data\Packages\reforge\extensions-builtin\sd_forge_controlnet\scripts\controlnet.py", line 24, in <module>
from lib_controlnet.controlnet_ui.controlnet_ui_group import ControlNetUiGroup
File "E:\Data\Packages\reforge\extensions-builtin\sd_forge_controlnet\lib_controlnet\controlnet_ui\controlnet_ui_group.py", line 16, in <module>
from lib_controlnet.controlnet_ui.openpose_editor import OpenposeEditor
File "E:\Data\Packages\reforge\extensions-builtin\sd_forge_controlnet\lib_controlnet\controlnet_ui\openpose_editor.py", line 6, in <module>
from annotator.openpose import decode_json_as_poses, draw_poses
ModuleNotFoundError: No module named 'annotator'
Hi. I'm new and i use stable diffusion (text2image) but it's not always really realistic, can i do something to improve it?
Hey, try out a different model
Is your reforge updated?
Also #tech-support is the right channel for error logs
You talking about, [Sampling method] ?
No I mean the checkpoint
Which one do you use?
V1-5-pruned-emaonly.safetensors
I just have this one
Ah okay yes thats the default one. Its over 2 years old and not recommended to use
You get better checkpoints (models) at Civitai.com
Then you drop the model files into the models/stable-diffusion folder.
Make sure the model file is 2gb or larger
I will try that
It's not free?
Its free
"Get access to this Model Version!
The creator of this Checkpoint has set this version to early access, You can download with this Checkpoint by purchasing it during the early access period or just waiting until it becomes public. The remaining time for early access is 3 days, 11 hours, and 10 minutes"
i need to find another one
3 days
That's only for the latest version of some models. You can select another version at the top to download
oh you right excuse me, i am a little nooby
god i hate they added early access
i remember crying the first few days of learning sd
@warm junco that works, thanks
Can I swap the face? Like, put a face on the image I want?
There is an extension called Reactor that can do that
Should I go through an extension and paste the URL, or is there something better and easy ?
Any Upscaler recommendations?
does it work with other models ?
Its a text encoder used for Flux and SD 3.5
Click on extensions, then on Available, then click on Load from.
Then you get a list of extensions
Then search for Reactor
Yep and enable it
thanks
This is so interesting, I think I'm going to have a lot more questions
best course to learn stable diffusion w/ a1111?
I like youtube tutorials but want series all put in one place
so thinking like online courses
@karmic brook can you handle this spammer pls
hello guys I think I can ask here but I need help making a decision for a new PC. I can either go with a 4090 laptop or a 4070 TI Super desktop. Between the two which GPU would you go with for AI? Both are 16GB Vram
laptops aren't designed for heavy graphics work, and you need an nvidia GPU
not a cpu with an integrated gpu
4070 Ti Super is gonna perform better in every regard anyways
Better get a GPU adapter for mine, all to run my local SD client. Before I shell out mula (that's moo-lah) for an hourly cloud GPU rental.
optionally subscribe to me, might get image gens, maybe

for legal purposes this is a joke
Not off-topic. 100% scammer!
It's always people like you who don't see the signs that fall for such things. Go ahead and DM the guy and see what happens.
Hello!
-# this user is under scrutiny by the FBI, do not contact this person as he/she is a direct suspect for a case. for more info, check discords trust and safety policies.
Can someone make a Lora for me
b580 looks goated
animation is possible ?
Does anyone know a good upscaler ?
Anyone did install successfully Trellis thing for image to 3D on Comfy ?
Simple installation stable diffusion with amd
Just paste this in terminal?
sudo apt install git python3.10-venv -y
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui && cd stable-diffusion-webui
python3.10 -m venv venv
??
Only if you are on Linux.
For Windows checkout the install Guides from here:
https://github.com/CS1o/Stable-Diffusion-Info
the guy mentioning stocks is one of the scam bots yeah
need to be careful cos what these discord bots have been doing is sending hyperlinks to malware over DM
they can be persuasive now because they can use Claude etc to do a bit of "conversation"
wonder if can expose the ai with some logical bs lol
Huh ?
what's happening with these bots is real its not people exaggerating
need to be careful
What link are you talking about?
oh they send the link in DM after a bit of conversation
do the ol "say potato" trick over an over
how do you pass or fail the potato test?
you keep asking them to say the word potato, usually works to weed out the fake tinder bots cos it'll be obvious it's a bot, they'll either ignore it repeatedly or be strange
what did u think about only fan IA ?
Hey so I'm using sdxl with comfyui and trying to get the impact pack. it says install failed and I think it's because I need the requirements (seg anything etc), but I'm using a laptop which doesn't have WiFi. is there any way to manually get the dependencies maybe please?
oh I see
this test can be passed by LLMs now though
So annoying lol everything looks great apart from faces
Does anyone uses X/Y/Z Plot Script with Reactor Face Models?
I have the issue that when running with the script, the generated images with later face models look completely different than when I use them separately.
In ReActor I have the setting "Save Original" and "Swap in source image" ticked on, because in some images I got better results, when the swap is in the source image, instead of in the generated image. But I think what is happening is, it takes the previous generated image as a source, instead of using the very first image as a source. So the more swaps happening, the worse it gets.
I never talked to a scam bot, do they try to answer any question? So if you ask them something no normal human would know or that doesn't even have an answer, would they still try, no matter how absurd the situation is? Ask it what happens in The Simpsons episode 456 and if it knows or acts like it knows, it's a bot.
these days they use LLM
I think he has a good point. llms might be vulnerable to this kind of test
Yea but they are instructed to play a role, right? I don't think they can distinguish between reasonable questions and unreasonable questions. Also LLM can't do math, ChatGPT used to fail at basic additions. But now the prompt is preprocessed and math instructions are executed separately.
but let's be honest: just thinking about "is he a bot?" is already the most important step. In the end it doesn't matter if somebody is real or a bot. A real person can send you a scam link, too
doing a Turing test on the scam bot is actually not necessary yeah
cos you can just avoid the humans as well
Ellie was talking about Tinder bots. So in situations where you try to find a real human these kind of test questions might be helpful.
yeah, but even there real humans might be more dangerous than bots
but yeah, it helps to not waste time
I imagine it's extremely frustrating chatting with somebody for long time just to find out he is fake
but again that happens with real humans, too
Well if it takes you a long time the joke's on you
fair enough
there's that thing where if a test gets popular it ends up in the training data
that's why you don't do a specific test but test general behavior
llms can be very naive and dumb
But for me being able to detect AI is important when researching on the internet. I don't even wanna waste 10 seconds on websites where the information is AI generated.
it's easier to trick them than other way around
that's difficult, though, when you cannot interact with the ai
text generated by ai just looks similar to human generated text :/
Not really. I'm now fairly good at it. It's the way the information is structured and the talking points. I only have to scroll through and can tell immediately if it's AI or not.
AI often uses alot of words to not say anything of importance.
I've read blogs sometimes where I only realised near the end it was written by LLM
I sometimes read blog posts with instructions to do something. If it's AI you always see the usual bullet points with super weird topics where you go "what does that have to do with anything?".
amd user here, i want to edit the "webui-user" file but after that it says python not found etc
can you help me ?
I hate the recent news article trend where the article slowly loops itself
I can't tell if it is intentional to cause more wordcount and ads, or if it's just llm runaway
BBC articles did that before AI lol
Hello, I was trying to learn how to make my own Lora, what would be the best way?
hello guys
sitting down and actually making one. The steps are: 1. make a good data set and then 2. figure out which model the lora will be for and then 3. train the lora and test it. <--- with that in mind, research how to make good data sets
It's been a long time since I've been here. I'd like to know what model people are using these days! Is it called Flux.1-Dev?! What are the requirements for using it or if there is another one, what would it be?
It's LLM. It's all fake news anyway. I'm not in that business but from what I understand many such sites use freelancers that would write articles and get a portion of the ad revenue. There is no real editorial process. So it's most likely people that would feed an LLM with info and tell it to write an article. Might even be fully automated at this point where a script scans the internet for hot topics, gathers the information, makes an LLM write an article and then publishes it on multiple news outlets. Pro tip: Do not consume news from main stream media. Find some independent commentators that you can trust and let them gather and digest all the information out there.
which "mainstream media" is using llms for writing it's articles lol
mainstream media are the ones that actually still do something like journalism
I don't know what they're all called, or if they even have names. But there's these news feeds on various websites like your e-mail provider for example. Or any odd website that has these newsreels. I call this mainstream because it's what most people are exposed to. I'm not saying it's all LLM because I don't read this stuff. I'm just saying if an article is weirdly structured, like repeating itself for example, it's most likely LLM. There's probably tools out there that take text as input and tell you the likelihood of it being AI generated.
any word on open-source/open-weights alts to suno and udio? all i could find was riffusion and they seem pretty dead or inactive. in my understanding, at least partly, audio generation is somewhat in a similar domain to image generation and there have been a multum of new models since like sd1.5 yet no word on anything in regards to audio, apart from stable audio, which can't really do music all that well.
stable diffusion 3.5 large
an nvidia gpu with 16gb vram or more is recommended. you can get away with less, down to about 6gb, but at a heavy performance cost. personally i use a 3060 12gb and it's been fine for me, training loras takes anywhere between 1 to 3 hours depending on settings. with flux dev you could get away with genning on the 3060 12gb with the --medvram command line arg to help, but i personally use distilled versions of flux, which run satisfactorily.
I recommend Stability matrix for an easy way to install and manage UIs and some of the pins in the server from CS1o
personally i use illustrious XL models for day to day stuff, lora training and whatnot, then flux anime distilled/newreality flux models for memes,realistic,more detailed stuff
I have a tool on my civit aimed towards beginners in lora training to help them get the right balancing args, if you're interested (or anyone else for that matter) you can click my profile here on discord and follow the civit ai link to browse my tools/models
i'm also back after a long time. i'm reading that forge is the best webui to use and it looks like it has a one click installer as well, i'm trying that. my past experiences have always been riddled with technical difficulties to trying to get stable diffusion to work
thanks, i'll make a note.
hello, live portrait work good ?
i have this error code
RuntimeError: No ffmpeg exe could be found. Install ffmpeg on your system, or set the IMAGEIO_FFMPEG_EXE environment variable.
yes but how? (i'm nooby)
macos
i just need to launch the FFMPEG files?
looks like it just that one file yeah. extract it to a folder outside of the desktop first
then run it
finish
retry doing whatever you were doing that caused the error see if still happen
may have to reboot the device idk i dont use mac
euh ...
RuntimeError: No ffmpeg exe could be found. Install ffmpeg on your system, or set the IMAGEIO_FFMPEG_EXE environment variable.
I have to put it in a special location ?
https://evermeet.cx/ffmpeg/ffmpeg-118022-gca889b1328.7z i download this one
File "/Users/buell/stable-diffusion-webui/extensions/sd-webui-live-portrait/liveportrait/gradio_pipeline.py", line 246, in execute_video
raise gr.Error("Please upload the source portrait or source video, and driving video 🤗🤗🤗")
gradio.exceptions.Error: 'Please upload the source portrait or source video, and driving video 🤗🤗🤗'
prolly need to set the imageio one (which i'm guessing points at the file?) as an environment variable, and that's where i won't be much more help. you should poke around in tech support, two channels below
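for what it's worth, the error names the variable itself: `IMAGEIO_FFMPEG_EXE`. a minimal python sketch of what "setting it" means is below. the helper name is made up, and it assumes the variable should hold the full path to the extracted ffmpeg binary and has to be set before the webui process starts or imports imageio. on macos the file also needs execute permission.

```python
import os
import shutil
from typing import Optional


def point_imageio_at_ffmpeg(explicit_path: Optional[str] = None) -> Optional[str]:
    """Set IMAGEIO_FFMPEG_EXE to a usable ffmpeg binary and return its path.

    Uses explicit_path if given, otherwise whatever `ffmpeg` is on PATH.
    Returns None (and changes nothing) if no executable file is found.
    """
    candidate = explicit_path or shutil.which("ffmpeg")
    if candidate and os.path.isfile(candidate) and os.access(candidate, os.X_OK):
        os.environ["IMAGEIO_FFMPEG_EXE"] = candidate
        return candidate
    return None


if __name__ == "__main__":
    found = point_imageio_at_ffmpeg()
    print("ffmpeg found at:", found)
```

setting it inside python only affects that one process, so for the webui you'd normally export the same variable in the shell or launch script that starts it.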
don't rly need much VRAM these days, 8GB is fine
the fastest version of flux is svdq-int4-flux.1-dev.safetensors which just uses 6.64GB VRAM
if you are less fussed about speed then flux.1-lite-8B-alpha-Q3_K_S.gguf uses 3.74GB
anyway to batch inpaint, but still be able to select the areas you want to inpaint on each picture? (useful for removing watermarks)
news articles repeating what they've said isn't LLM slop. that's adwords in action. and small screens that you can only scroll slowly being a primary form of information consumption.
you have to scroll past ads to look at it all. and more keywords in proper contexts allow you to rank higher in search results.
LLMs aren't the reason it's going on
Yeah, but they are not "mainstream" media. "mainstream media" would be something like New York Times, Washington Post, Fox News (unfortunately) and whatever (I'm not America, so I don't know that much about your newspapers).
But sure, these autogenerated news sites are bullshit and should not be trusted
I always advertise InvokeAI here xD
It's in my opinion the most intuitive webui by far and its also the easiest to install. Forge, in contrast, is still the ugly gradio app we know since auto111. I think the reason its called the "best webui" comes rather from the fact that swarmui and comfyui are soo bad in terms of usability. If you want to install hundreds of plugins and know what you do, Forge or ComfyUI are the way to go. If you are new and still learn and search for a ui which is intuitive and easy to use, I would recommend invokeAI
is training a lora on Illustrious the same as training on pony? How many images should I have on the dataset?
InvokeAI are great yeah their canvas is very good
they are more stable than Comfy also
I like that they give an official docker image too
hi
Can someone explain to me how the Lora zoom slider work? Is it just the Weight that u use as slider?
nvm found it
i installed forge and it works out of the box so i will stay with it for now. the model it comes with however is barely usable, do we have an index of good models?
On Civitai.com You'll find a lot of models
yeah ok i'll try, sadly civitai does not offer resumable downloads or torrents, so whenever a download crashes midway you have to restart over and over and over.
recently update to 4060ti 128gbram... lets go flux XD
im interested, send me
At first I used A1111, but I liked the modularity of ComfyUI more, I'm learning how to create the node chain. I can probably train some models later, currently these models are loras instead of entire checkpoints from what I understand. I followed the hype of the SDXL launch, but my 1060-6gb was not able to support the process.
Previously, in sd 1.5, I put an image and the image name as the CLIP description, currently I verified that for LORA it is exactly the same thing, but divided between an image file and a json with the description, right?
Hello
I saw an image made with a prompt that contains "score_9, score_8_up, score_7_up" and i don't quite understand what it means, could someone explain?
its the weird way Pony was trained
you only use this kind of prompts when you are using the Pony model
i'm using ponyDiffusionV6XL so i guess i should use those but i don't understand what they're used for
they give all their training images a score and then add this score to the caption of the image
K thx
problaby the trainer use "generic tokens names" for their training...
because:
- they're lazy
- to hide the process and avoid copy
i know who are the creator youre looking, ive saw the same prompts XD
astralite completely destroyed the sdxl text encoder with his genius captioning. i don't know why he's so praised. He's not lazy, because he went out of his way to do this broken quality tag approach, which was never needed. Then he brags about how he misaligned the text encoder and acts like he created a whole new base model.
The worst part is his primary customer base on his generation service are under aged kids. and he's held up like a hero for it. The whole community around pony is just so savage. Not a good look for generative ai work. Don't publically admit to anyone that you like Pony because people who know will judge you.
does anyone have any recommendations for running local SDXL on ubuntu?
comfyui has some annoying errors I don't know how to fix
and forge ui doesn't have a linux version at all
forge ui has sh launch files. it's just a python environment. theres no reason it wouldn't work on linux.
I think your problem has more to do with you not knowing your system very well.
linux isn't known for "one button installers" so you won't find them there
comfyui should work with linux environments easily too. The UI wouldn't be the problem
One thing you'll learn quick with linux world is that you can't pass the buck as easy. You've gotta do the leg work. If it doesn't work, that's probably going to be a layer 8 issue
wtf it does?
i didn't find them when i looked
i'll try again later
Does anyone know why sd ultimate upscaler does 3 passes in comfyui? Is this normal? The documentation says nothing about multiple passes and ChatGPT knows jack shit as always.
I guess the second pass is for seam fix but the third one is an enigma.
So I've been out of the loop for a couple of months. Is AI images and video perfect by now? If not I'll check back in summer 2025 or something like that.
It won't be perfect even in 2030
I don't have a precise barometer but these things move faster.
Awww, that's too pessimistic.
Ai moves quick.
Images are kinda good enough already...
Video not so much yet
It's not. It's realistic. It can't possibly be, there's too many difficulties that can't easily be overcome, like lack of good training data and lack of good image captioning as well as ambiguous language. And don't forget that we are way past Moore's law.
Quantum computers!
stuff like that
Computing power is like money, you throw enough of it at a problem and it gets solved.
Quantum computer research, as well as fusion reactor research has been going on for decades. And it will continue for decades before we get anything useful.
Nah. This is the last generation. We'll get the best outcome. This is it folks.
i saw a cool device invented. quantum positioning systems. it cools a bunch of atoms into a bose-einstein condensate state of matter, and then takes a reading from that bunch of atoms, and cycles them again. it's kinda a bazooka sized apparatus, and it can do accurate global positioning with no external devices. so no GPS satellite signals.
These are huge breakthroughs. but also scary. because now warfare machines can't be disrupted with gps signal jamming.
there's hard math problems too that quantum computers might one day solve. like does P=NP or does P!=NP. i'm hoping for the latter because then encryption is still mostly safe for now
I can tell you with 100% certainty that P=NP because the world will not stay the same forever.
prove it and you get a nobel prize that year
they'll likely hold a special ceremony
It's P=NP in the quantum world, but not the concrete world. The big letdown with quantum computing is that you have to get the results out of the quantum world.
Neither is proven yet. But there are suspicions and ideas that both camps have.
proving either will have huge implications either way. I'm hoping that it's not equal
Hey, I'm having a problem when generating images. I'm a beginner in this domain and I can't figure out what resolution to pick: if I set one too high it takes too long, and if I set one too low it looks like garbage, so I can't work out which one I should pick
Also should I use the Hires.fix when generating image and if yes how
How much vram do you have? Higher resolution images will use more vram.
sd15 models have a native resolution of 512x512, but they can generate larger images with hires fix. Essentially doing a small one first, then sizing it up and denoising it some more. But that second step will fill more memory.
sdxl has a native resolution of 1024x1024, 4x as much. So hires fix isn't needed, but sometimes i use it still just to get refinements on the second pass.
You could try loading your model with fp8 precision, which is fine for most cases but could take a little longer to load on the first run because of the extra conversion work. The weights then fill half the memory they would at fp16, so more memory is left over for the image generation.
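the "half the memory" part is just bytes per weight, assuming the model would otherwise sit in fp16 (2 bytes each) versus fp8 (1 byte each). a quick python sketch of the arithmetic is below. the parameter count is a rough placeholder for illustration, not a measured number for any specific checkpoint, and it ignores the text encoders, VAE and activations.

```python
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "fp8": 1}


def weight_memory_gib(num_params: float, precision: str) -> float:
    """Memory taken by the weights alone, in GiB, at the given precision."""
    return num_params * BYTES_PER_PARAM[precision] / 1024**3


if __name__ == "__main__":
    assumed_params = 2.6e9  # placeholder, roughly "a few billion" weights
    for precision in ("fp16", "fp8"):
        gib = weight_memory_gib(assumed_params, precision)
        print(f"{precision}: {gib:.2f} GiB for weights")
```

whatever the real count is, the fp8 number is always exactly half the fp16 one, and that freed space is what the sampler gets to use for bigger images.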
you could also try something like tiled diffusion. Forge UI has this built in as multidiffusion. It does small patches of your image at a time. There are lots of these kind of extra solutions too. SD ultimate upscaler is another one, that'll upscale images using tiles.
I have a 3060 with 12gb of vram
That's plenty but it'll be tight. i'd run sdxl with fp8 mode if i were you, and keep gens around 1 megapixel (1024x1024 or other aspect ratios that have about a million pixels)
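if you want other aspect ratios at roughly the same pixel count, here's a small python sketch of the arithmetic. the function name is made up, and snapping to multiples of 64 is an assumption picked because it's a common safe step for these UIs, not a hard rule.

```python
import math


def size_for_aspect(aspect_w: int, aspect_h: int,
                    target_pixels: int = 1024 * 1024,
                    multiple: int = 64) -> tuple:
    """Width and height near target_pixels for the given aspect ratio."""
    ratio = aspect_w / aspect_h
    # width * height = target_pixels and width = ratio * height
    height = math.sqrt(target_pixels / ratio)
    width = height * ratio

    def snap(value: float) -> int:
        # Round to the nearest multiple, never below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


if __name__ == "__main__":
    for aspect in [(1, 1), (16, 9), (9, 16)]:
        print(aspect, size_for_aspect(*aspect))
```

for sd15 you'd pass `target_pixels=512 * 512` instead, since that's its native size.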
Do you perhaps know any good youtube Channel about stable diffusion
helps to keep your task manager open on your gpu's performance tab. you can see how much memory is being used. if it's maxing out your gpu vram, it'll slow it down to mud speed
Detweiler's comfyui series
https://www.youtube.com/watch?v=AbB33AxrcZo it's older but is still a good primer to the app
i use forge ui these days, but comfyui is a strong tool that allows a lot of flexibility and control
by the way i'm using automatic 1111 web-ui is it good
Just a question/Suggestion : Shouldn't there be a Flux Channel in the "Stable Models"?
flux isn't made by stability, it's made by black forest labs. I doubt they would want to make a channel for their competing model
Oh.I didn't think This server was that exclusive. But that makes sense.
this is just wrong. There is not a single quantum algorithm known that would solve any np hard problem in polynomial time
Qubits r quick!
if you stick to sdxl and sd15, auto1111 is good nuff
don't dm this guy. it's scam bait.
i dont even have to to know it. i can smell it
exactly what i said
We were talking about our expectations of the future. Nothing is fact it's just what I believe.
Marvy the failed scammer everyone. Bad at real business so he had to try scamming, but is bad at that too. Give it up for Marvy everyone!
literally idled on the server for months, just to spam scam invites today.
If you're actually DM'ing this guy, reevaluate your life choices
i was able to use civitai.com for like 10 minutes and now the website isn't loading on my browser, other websites work fine and other browser works fine, is this a known issue?
yeah down today
it was back a half an hour later, maybe just server issues, yeah
yeah. civit is offline more often than they are online
go look at flux and see what you think
perfectionism is still ages away
i am new to coding, developing an star app looking to mingle and learn. hello everyone.
They gotta use their 404 gallery
Hello
It's a board for robotics etc.
anything. low power $250 inference machine
hey can you guys help me out a little? im tryna use a flux model for image gen and im using ae.safetensors vae but im still getting burnt/blank images
don't use adaptive samplers. use euler.
even when i use euler im getting burnt images
denoising = 1, steps = 10+
Does upgrading ram and going from an ssd to an nvme drive make sd faster?
32gb ram is good, more only needed for maybe flux with t5 fp16 encoder
Ok then, cuz I'm on 16gb ram, I'll upgrade
You can also just increase the Windows Pagefile. That speeds up sdxl gens too
But ram upgrade never hurts
That's the file that gets used when your RAM is overfilled. It's like virtual RAM and uses disk space, preferably on an SSD
Yep
Make sure its only enabled for the C drive and not for any other drive.
And then set it to 16000 MB Min and 24000 MB Max.
Then apply and restart the PC.
Also make sure to have at least 15gb free space on C.
unless I got my math wrong its 10% the FLOPs of an A100
in a handheld
its amazing
there might be some fine print though, that the numbers they gave are for Int4/FP4
which is fine, but would be deceptive
Nvidia's favourite marketing trick is to do comparisons where one thing is in FP32/FP16 and the other thing is in FP8/FP4
you have to use CFG=1.
Flux is a cfg-distilled model, so cfg does not work as it does for non-distilled models
you can use cfg, but you should use low values for cfg (1-3) and maybe start cfg not from step 0 but at step 3 or so
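To see why CFG=1 means "no guidance": standard classifier-free guidance mixes two noise predictions as `uncond + scale * (cond - uncond)`, and at scale 1 that collapses to the prompt-conditioned prediction alone. A toy sketch with plain lists standing in for the model outputs; `scale_for_step` and its defaults are hypothetical, just illustrating the "start cfg at step 3" idea:

```python
def apply_cfg(cond, uncond, scale):
    """Standard classifier-free guidance on two noise predictions (plain lists here)."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]

def scale_for_step(step, cfg_scale=2.0, start_step=3):
    """Hypothetical schedule: scale 1 (no guidance) until start_step, then cfg_scale."""
    return cfg_scale if step >= start_step else 1.0

cond = [0.5, -1.0, 2.0]    # toy stand-in for the prompt-conditioned prediction
uncond = [0.0, 0.0, 1.0]   # toy stand-in for the unconditional/negative prediction

for step in range(5):
    scale = scale_for_step(step)
    print(step, scale, apply_cfg(cond, uncond, scale))
```

At scale 1 the unconditional (negative prompt) prediction drops out entirely, which is why a distilled model like Flux is normally run that way.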
yeah its def not for training
if I were them I'd say the FP4 number so I suspect that's what they did
Hello
hi, new here. good 2 everyone
what's the best local AI image generator right now ? last time I was doing this I was using SD1.5
with the stable-diffusion-webui
Your imagination
thx very useful
I got flux running on local (yay) what kind of guides should I look up on how to copy the style of an image I provide?
Right now i'm prompting and getting good images for a scene in my game
But I want to prompt for different scenes with the same style
Hello!
Personally, I use SDXL 1.0 and comfyUI
You don't uninstall stable diffusion. If you don't use the model anymore just delete it.
you can Reinstall the system
Can you use stable diffusion on discord, like midjourney?
greetings all
I think my comfyui is stuck on a style lora. Everything is coming out in the same style no matter what I do. I tried loading sdxl and schnell and got the exact same style. I have deleted all the lora nodes and restarted too.
because its a scam
lmfao they set it up so i could ping @ everyone
so i just pinged and said "this is a scam"
its community based so it might be slow
wdym wallet
like bitcoin?
oh so its like
not even an advanced scam
its just like
give credit card and we double ur money!
they didnt even respond to me
probably because i started to send actual things that i "needed help with"
no no
they use the nfc chip
in your credit card
for processing power!
nfc chip is the thingy mcjig that lets you tap on
yeah i was trying to say that its using that chip that has the processing power of my pet rock as a gpu
hi!
"Hi, I'm an ethical hacker. I specialize in identifying vulnerabilities in systems and helping organizations strengthen their security and I can hack and recover all social media account, recovery of lost funds,unban, game hack hmu for your service
hello
If you get a dog call it Barkolomeo.
Hi y'all, quick one: is there an established set of stable diffusion tools for asset generation that game devs seem to favor? For example, generating isometric characters, etc.?
Real game devs still use human drawn art. If you opt for AI art you are already on the bottom of the barrel and it doesn't really matter what you use.
I'm a coder, not a graphics modeller, so looking for a good solution, doesn't have to be perfect
SD has 3D models pretty sure. Looked chunkier than N64 graphics tho
Well if you just need place holder graphics just use some free assets. There's plenty.
I don't know, used to create my own place holders graphics when experimenting with game ideas.
I've seen a model that can create views of different angles of an object. Although I believe it could only rotate on one axis.
Good afternoon, everyone! How are you all doing?
Hello
Hello everyone,
I am an artist looking to create a model from my existing images so I can quickly generate unique art to use for other projects. I have tried using Upwork to find people who could potentially help me with this, but everyone seems to find it tricky. If anyone has any knowledge on this, please let me know. I would be happy to pay.
Hello
You need to train a LoRA. Probably for either Flux or SD 3.5.
Yeah it's very fast but quality is ehh. The best one is trellis for sure, stable fast is not even close, and trellis is decently fast too, just a bit slower: https://huggingface.co/JeffreyXiang/TRELLIS-image-large
hello
hey best to train a lora on your images
how the fuck do i get the diffusion soundboard
hello everyone !
i have a question , i want to create some dark fantasy images like this one
https://i.pinimg.com/736x/c0/cd/ab/c0cdab73f605dd6ef9b96e66a9201651.jpg
what prompt should i use to achieve that vibe ? i think that image was generated with midjourney
i want to make an image with that kind of texture and vibe
hi
Hey, does anyone here do faceswap with A1111?
can someone tell the differences between SDXL and SD 3.5? Civitai seems to have a lot more models for SDXL available, is there a reason to still consider SD 3.5?
3.5 is the newest model
ah ok, thanks
@stray zinc
No
Guys, I tried to convert a digital image I have into a realistic photo but it takes such a long time (6 minutes so far). Do you guys know how to convert a digital image into a realistic photo in Stable Diffusion?
Did you use img2img?
Try lowering the resolution
No because I haven't found it yet
How did you find it ? @warm junco
I use the local version of Stable-diffusion
Automatic1111 or Forge Webui
How much time does it take to convert a digital image into realistic photo without using img2img ? @warm junco
Idk, which tool do you use?
I use Openart Stable Diffusion AI
Ah okay, then I guess you can't get it faster because its a cloud based service
I just want to say, SD3.5 large is extremely based.
It just does the style I ask it to do, and doesn't do stupid style locking. For that, it is extremely based.
ComfyUI - I have 32gb ram and a 4070 ti super with 16gb vram, should I be able to use flux1-dev with a lora? When I use the Load Flux Lora node in comfy, it gives an error. list index out of range
do i just use the regular load lora node?
You need to use flux dev FP8 or Q8
what are the differences between the two?
Not much
A little output difference
does either one produce any noticeably better results?
maybe Q8
for comfyui you also need the gguf loader nodes
I'm able to generate images using flux1-dev using the regular lora loader - LoraLoaderModelOnly node, will this output be any different if the Load Flux Lora node was used?
idk, im not a heavy comfyui user, try asking in #comfy-ui
is the restriction to fp8/q8 because flux1-dev is using more ram/vram than I have and paging it to HD?
yep with 16gb vram you shouldnt use the fp16 23gb version
the fp8 / q8 is half of it and should work much better
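Back-of-the-envelope version of that, as a sketch. The 12 billion parameter count for flux1-dev is an assumption here (it roughly matches the ~23gb fp16 file mentioned above), and this only counts the main model weights, not the text encoders, VAE, or working memory during generation:

```python
GIB = 1024 ** 3

# Bytes stored per weight for each format
BYTES_PER_PARAM = {
    "fp16": 2.0,
    "fp8": 1.0,
    "q8_0": 34 / 32,  # GGUF Q8_0: 32 int8 weights plus one fp16 scale per block
}

def weights_gib(param_count, precision):
    """Approximate size of the weights alone, in GiB."""
    return param_count * BYTES_PER_PARAM[precision] / GIB

flux_params = 12e9  # assumption: roughly 12 billion parameters
vram_gib = 16       # the 4070 ti super from the question

for precision in BYTES_PER_PARAM:
    size = weights_gib(flux_params, precision)
    verdict = "fits" if size < vram_gib else "does not fit"
    print(f"{precision}: ~{size:.1f} GiB, {verdict} in {vram_gib} GiB of vram")
```

So fp16 overflows a 16gb card before anything else is loaded, while fp8 and Q8 leave a few GiB of headroom.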
Thank you.
It's nicer to know that it's a limitation of my system than thinking it's just broken for me
I can generate images, but it is slow, several minutes
yea its using your pagefile for that because the model doesnt fit into the vram
and that slows it down
do you know if i can use the same clip models, ie clip_l and t5xxl_fp16?
and do i need to use a lora explicitly trained for the fp8/q8 models?
i'm getting a large blob of colour, not a decent image
You can use the same clip encoders yes.
And nope, no special fp8 or q8 loras are needed
Thank you
i think i got it working, i was using the regular k-sampler. As soon as i switched to the X-Sampler, it worked
if i upgrade to 64gb of ram, will this likely be sufficient to run the full flux1-dev perhaps?
it's 7200mhz ddr5 ram, so is pretty fast
Has anyone already created an example/full script to request/generate and receive the output for a generated image? (Python)
this is basically what all the UIs do already. if you look at the nodes in ComfyUI for example, they are just configuring the parameters to be passed when the generate button is pressed. what you are asking isn't really as simple as it sounds. if you are a coder, you can review the git repos for the components that make up the generation configuration you are intending to use
if you are doing remote SD, i'm sure there are scripts for executing that, just look for the api, perhaps there would be examples
yeah its a client-server architecture
someone made a nicer input method called ComfyScript https://github.com/Chaoses-Ib/ComfyScript
it just writes an API call but in a nicer syntax
and some cool alternative modes where you can call nodes one by one
it has a transpiler, that should do what you want
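For the plain-Python route without ComfyScript, here is a minimal sketch against ComfyUI's built-in HTTP endpoints (`/prompt`, `/history/<id>`, `/view`). It assumes a ComfyUI server on the default `127.0.0.1:8188`, a workflow exported from the UI in API format, and PNG outputs; the function names and the `workflow_api.json` filename are made up for this example:

```python
import json
import time
import urllib.parse
import urllib.request

SERVER = "http://127.0.0.1:8188"  # assumption: ComfyUI's default address and port

def build_payload(workflow, client_id="example-client"):
    """Encode the JSON body that the /prompt endpoint expects."""
    return json.dumps({"prompt": workflow, "client_id": client_id}).encode("utf-8")

def queue_prompt(workflow):
    """Submit a workflow (API-format dict) and return its prompt_id."""
    request = urllib.request.Request(
        f"{SERVER}/prompt", data=build_payload(workflow),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["prompt_id"]

def wait_for_history(prompt_id, poll_seconds=1.0, timeout=600):
    """Poll /history until the prompt shows up there, then return its entry."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        with urllib.request.urlopen(f"{SERVER}/history/{prompt_id}") as response:
            history = json.load(response)
        if prompt_id in history:
            return history[prompt_id]
        time.sleep(poll_seconds)
    raise TimeoutError(f"prompt {prompt_id} did not finish within {timeout}s")

def image_urls(history_entry):
    """Build /view URLs for every image listed in a history entry."""
    urls = []
    for node_output in history_entry.get("outputs", {}).values():
        for image in node_output.get("images", []):
            query = urllib.parse.urlencode({
                "filename": image["filename"],
                "subfolder": image["subfolder"],
                "type": image["type"],
            })
            urls.append(f"{SERVER}/view?{query}")
    return urls

def generate(workflow_path, out_prefix="result"):
    """Queue the workflow file, wait for it, and save each output image."""
    with open(workflow_path, encoding="utf-8") as handle:
        workflow = json.load(handle)
    entry = wait_for_history(queue_prompt(workflow))
    saved = []
    for index, url in enumerate(image_urls(entry)):
        path = f"{out_prefix}_{index}.png"  # assumes the outputs are PNG files
        with urllib.request.urlopen(url) as response, open(path, "wb") as out:
            out.write(response.read())
        saved.append(path)
    return saved
```

With the server running, `generate("workflow_api.json")` would queue the job and write `result_0.png` and so on. Polling `/history` keeps the sketch dependency-free; ComfyUI also exposes a websocket for progress updates if you want something more responsive.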
hello
the comfy ui is exposed on a localhost port. can this be hosted on your local network somehow, so it can then be accessed from a client machine? I haven't done network hosting for a few years, so am really quite rusty about this.
look into Secure tunnel services for that option.
or use ipaddress of server:port
Nope as the models need vram and not RAM.
It would still overflow the vram and use ram+pagefile
IDK if its worth doing, at that point its probably worth just switching to pytorch
is this for a server/client both on your local network?
Comfy is not really in a state where it would be a good idea to deploy it
its fine for experimenting or playing with new tools
or using in the way people use photoshop
if you're gonna make a custom server and network setup I think its better to switch to pytorch at that point
is this just for yourself, so you can run a server and access it via another machine on your network?
was !InstantNameOfficial who was asking
the --listen command line option will let you access comfyui from other machines on your local network
lol, sorry bud, lost track of who was asking
yeah --listen does work
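Once the server machine is running comfyui with `--listen`, a quick way to check from the client machine that it is actually reachable. This is just a generic TCP check, nothing comfy-specific, and the function name is made up:

```python
import socket

def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False
```

For example, `is_port_open("192.168.1.50", 8188)` from the client, where `192.168.1.50` is a placeholder for the server's LAN address and 8188 is comfyui's default port. If it returns False, a firewall on the server machine is a likely suspect.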
Comfy has the functionality to be used that way I just don't think its robust enough yet
I thought about running a 3060 on a spare machine I had myself, but ultimately, the spare machine is dell, so they skimped and added exactly zero upgradeability to the pc, so won't accept a powered graphics card.
and of course, the mobo won't accept anything but a dell power supply
ah yeah I had a lot of trouble with trying to upgrade Dell prebuilts
can be very tricky
it was a free pc, so can't grumble.
it makes for a very large oversized paperweight!
if its free that's fine
I know someone who paid multiple k for Alienware
was not a good idea
any prebuilt pc is going to be built as cheaply as possible imo, regardless if it cost £3k+
particularly if it's sold from a large corporation with shiny green badges and glowey fans
the power supply and motherboard were pretty rough yeah
a low quality power supply is the one thing a pc shouldn't have, that's so long as you don't want to be buying a new one in a few years.
I'd be worried about it harming the GPU yeah
i spent about 2k on a pc back in 2012/13, all the asus, corsair parts lasted until i replaced them a couple of months ago. the £600 780ti was msi, that broke numerous times and bust completely after 2 years. The asus graphics card lasted 10 years.
oh this matches my experience very well
I always buy Asus stuff and I have a corsair case still working well
that msi was the first time i didn't go asus, i regret that, lol
msi can be a bit squiffy yeah
it actually cost me about £1100, as I wanted to go the SLI route, for 2x 760's, however the mobo I'd bought was a crossfire, so had to buy a new sli compatible mobo. This proved completely useless as sli 2x760's did nothing. so just brute forced performance buying the 780ti.
gigabyte are usually good also
oh yeah I remember SLI
people used to get dual/quad gpu for gaming but then they stopped
and now they get high gpu count again for llm
it stopped, because it relied on nvidia setting up the game profile for it to work
ah okay I was never sure why it went away