#💬|general-chat
wait if possible get the person to show you the card works at least
older card is ideal for value though that's kinda the point
ye, where I am that card would be three times the price
Seller lives 100km away from me so ill just have it shipped, got buyers protection anyways
Asia or north east europe?
🤞 may your delivery be swift and full of packing beans
Hello all, I am looking to find out what the hardware requirements are to train SDXL from scratch and how much time the training would need
you need a data center, and a couple million dollars
I know. Is there a way to estimate how much memory is required to do full parameter training of SDXL? I know ordinary people can't do it but I am curious about how we can estimate the memory usage for SDXL.
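One rough way to approach the estimate asked about here is to count bytes per parameter for weights, gradients and optimizer state, then multiply by the parameter count. The sketch below assumes roughly 2.6B UNet parameters for SDXL and mixed-precision AdamW at 16 bytes per parameter. It leaves out activations, the text encoders and the VAE, so real usage would be higher and depends on batch size and resolution.

```python
# Back-of-the-envelope memory for full-parameter training (activations excluded).
# Assumption: mixed-precision AdamW keeps these buffers for every parameter.
BYTES_PER_PARAM = {
    "fp16_weight": 2,
    "fp16_gradient": 2,
    "fp32_master_weight": 4,
    "fp32_adam_first_moment": 4,
    "fp32_adam_second_moment": 4,
}

SDXL_UNET_PARAMS = 2.6e9  # assumption: approximate size of the SDXL UNet


def training_state_gb(num_params: float) -> float:
    """Memory in GB for weights, gradients and optimizer state only."""
    total_bytes = num_params * sum(BYTES_PER_PARAM.values())
    return total_bytes / 1e9


print(f"{training_state_gb(SDXL_UNET_PARAMS):.1f} GB before activations")
```

This only says whether the model state fits on a given number of GPUs. It says nothing about how long training takes, which depends on dataset size and throughput.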
i was serious. you can go look up how much it cost SAI to train sdxl when they did so.
Can you actually? SD 1.5 figure is available (either $500k or $600k) but some googling didn't reveal one for SDXL.
you can go look up what is not proprietary information, yes. and after that, anything anyone says is a guess.
If you know the figure, please point it out (or the source for it) instead of making vague "you can totally find this" allusions
sure - go email Stability.AI and ask them for their financial figures
it's not really any of your business
Then you should stop claiming you can look up the cost
just another victim
moral of the story how dare you do what they say to do, the nerve
You can now train an SD 1.5 level model for under $2,000
this was shown in this paper made by Sony: https://arxiv.org/abs/2407.15811
Yeah, SD 1.5 figures are easy enough to find. SDXL is the mystery (and which the claim was made about).
So I'm trying to set up a Comfy workflow that lets me generate images using various Pony models. After generating 2 options, I want to use the Preview Chooser to continue the process into an SD1.5 model that applies a face swap using the IPAdapter. I tried setting something up but I really don't know what I'm doing. Could anyone tweak it or tell me what to do? https://i.imgur.com/Jl4nDMe.jpeg
For a model above SDXL quality, the Würstchen architecture which Cascade is based on was about $30,000 or so, or $60,000 if you want to double that size.
It appeared to be a scam xD, got money back tho
If the seller sells a lot of other "second" hand computer stuff, you might actually be buying third hand and I wouldn't trust that. Be aware that the card might have been reflowed.
before the costs went up across the globe
costs went up?
yeah, think back about a year when suddenly all the gpus were out of stock, and backordered. costs have been rising since
ah well they have done for stuff like 4090, but rental price for H100s has dropped
ya prices have dropped to the lowest ever
Didnt know that, might buy first hand now since i dont wanna risk getting scammed again. Thanks for the advice!
we also have much better triton, tensorRT, CUDA, Pytorch, Nvidia drivers and attention libraries than a year ago
so for the same hardware budget you can get a lot further
Flux dev speed more than doubled this week with the FP4/Int4 release, for example
any help here? very much appreciated
does each model have a specific setting for it to work properly?
sd1. what? your question is confusing
what, specifically, is it you like about fluxgym that you want to do with stable diffusion?
i was using a tutorial for flux lora, and it was pretty straightforward
so are you wanting to train a lora for stable diffusion then?
yes, i have a small dataset and i wanted to train a lora for stable diffusion 1.5
are you planning to generate images with either stablediffusion 1.5 or checkpoints for it?
im not too familiar with the idea of checkpoints to say if it would be helpful. i wanted to get a safetensor so i could generate images later
a checkpoint is a model. a fine tuned checkpoint uses the base model, adds information to it, and tunes it, to create a revision of the model. you can NOT use a lora trained for one model with any other model. if you train a lora for stable diffusion 1.5, then you can only use it with either the base 1.5 model or fine tuned checkpoints created from the 1.5 model. if you decide you want to use SDXL or SD3.5, you have to train loras for them
hmm i see. i want to train a model with stable diffusion 1.5 as its base, using a small dataset of my own, so it returns me a safetensor (that, from what i gather, i can use to generate images in said style later)
i did something similar with flux, and it was a pretty straightforward UI
https://www.youtube.com/watch?v=yvXOKHeZtgs&ab_channel=TheLocalLab used this video as a tutorial
i was not able to find until now something analogous to SD 1.5
you can train any version of stable diffusion that you like. you'll get a .safetensors file. that's a file format that is safe to use, unlike the pickle files it replaced. however, first you have to decide what version of stable diffusion you want to train for, as that is the version you will have to generate images with in order to use it. there are a number of trainers for it; onetrainer is the one a lot of people use but it's not just a point and click. If you want to use sdxl or sd3.5 large, then luca taco has trainers on replicate - but you can't use them for free - you do have to pay replicate for their GPUs
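Since a LoRA only works with the model family it was trained on, it can help to check what a downloaded file says about itself. A .safetensors file starts with an 8-byte little-endian header length followed by a JSON header, which may carry an optional `__metadata__` entry. The sketch below reads that entry with the standard library only. The key name `ss_base_model_version` in the usage comment is an assumption about what some trainers write, and many files have no metadata at all.

```python
import json
import struct


def read_safetensors_metadata(path):
    """Return the optional __metadata__ dict from a .safetensors header."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # The entry is optional, so fall back to an empty dict.
    return header.get("__metadata__", {})


# Hypothetical usage (file name and key are placeholders):
# meta = read_safetensors_metadata("my_style_lora.safetensors")
# print(meta.get("ss_base_model_version", "not recorded"))
```

Only the header is read, so this is quick even on multi-gigabyte checkpoints.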
start by reading this https://www.scottbaker.ca/AI/LoRA-Training
@fervent thunder have you seen this yet? https://pollo.ai/
read it
i was trying to use kohya
but theres so much more stuff compared to the flux
and im sure one of the bajillion things is making it error
personally, i just use luca taco's trainers on replicate and train either sdxl or sd3.5
thanks didn't know this one
should i use sd3 with only 4gb vram?
you're not using anything with only 4 gig vram
If you don't need a card anytime soon you could also wait for the new RTX 5000 series to release. This will lead to significant price drops of the older series. The 5000 series is expected to release early next year.
well sd 1.5 does work i just dont think sd3 would
https://huggingface.co/city96/stable-diffusion-3-medium-gguf/tree/main smallest GGUF here is 1.28GB
you can go smaller than this if you have to
if you want to get more advanced you could convert these two libs from flux to sd3.5, they work well https://github.com/chrisgoringe/mixed-gguf-converter and https://github.com/chrisgoringe/flux_cpu_offload
mixed-gguf-converter uses different quant per block
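To get a feel for why quantized files are so much smaller, and what mixing quant types per block does to the total, here is a rough size estimate. The bits-per-weight figures follow from the block layouts of the Q4_0 and Q8_0 formats (32 weights plus one fp16 scale per block). The parameter counts are made-up round numbers for illustration, and real files also contain metadata and some tensors kept at higher precision.

```python
# Approximate bits per weight:
#   Q4_0: 32 x 4-bit weights + one fp16 scale = 18 bytes per block -> 4.5 bits
#   Q8_0: 32 x 8-bit weights + one fp16 scale = 34 bytes per block -> 8.5 bits
BITS_PER_WEIGHT = {"F16": 16.0, "Q8_0": 8.5, "Q4_0": 4.5}


def estimated_size_gb(num_params, quant):
    """Rough tensor payload size in GB for one quantization type."""
    return num_params * BITS_PER_WEIGHT[quant] / 8 / 1e9


def mixed_size_gb(blocks):
    """blocks: list of (num_params, quant) pairs, one entry per model block."""
    return sum(estimated_size_gb(n, q) for n, q in blocks)


# Illustrative only: a 2B-parameter model split into two equal halves.
print(estimated_size_gb(2e9, "F16"))
print(mixed_size_gb([(1e9, "Q8_0"), (1e9, "Q4_0")]))
```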
i'll try that
i've been using 1.5 for months now but its getting boring
yeah its nice to change model sometimes
yeah GGUF is a huge game changer
you can use even smaller ones than the ones there, sometimes
didnt even know it was a thing
Hello, does anyone know what image generation model ChatGPT 4o uses? What's the most similar model that I would be able to run locally to replicate its capability?
it uses dall-e3
and you'd have to use their API if they have one if you wanted to use it. otherwise, just use stable diffusion 3.5 large
ahhh that's a good point. The API sounds like a good method for what i am doing. And okay thanks, I will try 3.5 large
appreciate it friend
hey guys whats the best video upscaler at this moment?
totallynotavirusLegitvideoupscaler.exe
what is your issue?
yyou asking me what my issue is
i did, yes
Hey!
i wasnt asking
to make ur pc run faster
press windows key + R, type regedit
in regedit, go to Computer\HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shell
delete the shell folder, restart pc
to unlock new feature in windows, make a batch file, type this in it
(colon)loop
start
goto loop
save as fast.bat
double click
Heyo, what're the differences between Stable Diffusion, Stability and Flux by Black Forest Labs
uh
one is stable diff, was is stablity oh and flux is called flux
i think thats right
tired of dealing with a login screen on windows, easy, delete LoginUi.exe from windows system32 folder, then restart, now u dont have to worry about a login screen anymore
please .... dont
just don t do any of that...
Deleting random registry keys is one good way to break your windows.
Deleting .exe from the OS / that windows expect to be there is also a good way to create problems.
There are proper ways to "skip login screen", just make your account password-less and set the "require sign-in" option to never. (more infos easily available through google)
Stability is a company, they made sdxl, sd3, sd3.5, and partnered with runway to create sd1.5.
Stable diffusion is usually either sd1.5 or the series of the models.
Flux is a mmdit model similar to sd3/sd3.5 but is by Black Forest labs which is a different company but made by stability ai devs.
Sd1.5 has the most support followed by sdxl. In the new mmdit models, flux has the most support followed by sd3 and sd3.5
Make sense and which Sampling method do you recommend? Sampling steps
Atm I use DPM++ 2M SDE
Yeah that’s a good scheduler, I usually use that for the sdxl or sd1.5. I only use the lightning models which only need 4-8 steps, so it’s much faster. Quality is roughly the same.
where's the quantized version of 3.5 medium?
any way to run without these? Text Encoder - T5 e4m3fn
Text Encoder - T5XXLFP16
I'm surprised we don't have a channel dedicated to flux
heyy
Flux is not a stability.ai product and this server is a stability.ai one. Nobody is preventing anyone from talking about other models. It would be pointless if not counterproductive. But I wouldn't count on a flux channel popping up anytime soon ^^".
that's all camu ever does here
9 months with Stable Diffusion has been enough to prove one simple fact: I suck at writing prompts 🙂 .
sure. just load all three clips, run them each into their own prompt text field node, and then just give . to t5 as its prompt
does anyone have swarmui experience
we have a #🐝|swarm-ui channel. might ask in there
Does anybody know if forge UI is working with controlnet for poses yet?
controlnet on forge ui is fine for sd15 and sdxl. it's newer support that isn't implemented yet.
Thats great
Thanks man
Anyone has any idea where to download the models for controlnet? No idea where to find it
does anyone have time to help me, with screen capture support?
you need ipadapter
what's the problem?
Ah. I see, so I would have better luck finding a server that is for Flux itself.
Hey lads, anyone with good knowledge of training, I want to train on movie posters to generate new posters (random), anyone have a good tutorial?
I'm just going with dreambooth in auto1111, if anyone has any other suggestions on the newest / best way please let me know.
is anyone in pixaroma's discord by anychance? the invite link is dead
please invite me/send me a new link
tiptoee, through the window, by the windowww, that is where ill be come tiptoeee, through the tulips with meee
yusuke is cool
same name as my bestie
das mai bestie
does anyone know sites where users are able to download checkpoint files and loras, other than civit ai?
Ya
u got, google drive, mega, fboom
all are good file hosters
jk ik wut u meant
jsu wnatd to be annoying
u got tensor.ai
pixAi
huggingface
das bout all ik
buh why not civti, they got quite literally the most
did u get banned or sum
plus all on civiti are free
pix.Ai, not all are downloadable, same with all the other with an exception to huggingface
2 main downsides to civit are download speeds, and they require API token or login for most models
i can neither conform or deny the download speed
as for the login, Ya, u need an
account
not too bad tho
for me, most of the time, my download speeds are pretty consistant
usually around 5mb/s
I don't rly like having to put token as it makes it harder to write the download scripts
its common though to require token, huggingface not requiring token most of the time is the rare thing
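For download scripts, one way to keep the token from getting in the way is to make it optional in a single helper, so the same code path serves hosts that need a login and hosts that don't. This is only a sketch: the URL and environment variable name in the usage comment are placeholders, and whether a given host accepts a bearer header (rather than, say, a query parameter) is an assumption to check against its API docs.

```python
import urllib.request


def build_download_request(url, token=None):
    """Build a GET request, adding a bearer token only when one is given."""
    request = urllib.request.Request(url)
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    return request


# Hypothetical usage (placeholder URL and variable name):
# import os, shutil
# req = build_download_request("https://example.com/model.safetensors",
#                              os.environ.get("MODEL_HOST_TOKEN"))
# with urllib.request.urlopen(req) as resp, open("model.safetensors", "wb") as out:
#     shutil.copyfileobj(resp, out)
```

Reading the token from an environment variable keeps it out of the script itself.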
isn't 5mb/s over an hour to download flux? 🤔
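The arithmetic behind that question, as a quick sketch. It assumes "5mb/s" means megabytes per second and uses roughly 23.8 GB for a full-precision Flux checkpoint, which is an assumed file size. If the speed were megabits per second, the time would be eight times longer.

```python
def download_minutes(size_gb, speed_mb_per_s):
    """Minutes to download size_gb gigabytes at speed_mb_per_s megabytes/second."""
    return size_gb * 1000 / speed_mb_per_s / 60


# Assumed file size of about 23.8 GB at 5 MB/s.
print(f"{download_minutes(23.8, 5):.0f} minutes")
```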
I know this could be a reddit post but I wanted to ask it here in discord. I'm a devops guy and I know Python. However, until yesterday, I knew nothing about stable diffusion or AI in general. What I'm interested in is virtual clothing try-on. I've looked at different models and installed ComfyUI so far but I don't know how to proceed from here. I know there is IDM-VTON and OOTDiffusion but I don't know how to use them. I also know that there are some LoRAs involved to fine tune the experience, and upscaling methods, but I still haven't used them. Are there any guides that you can point me to? What should I do to start?
I'm working with MacBook Air but I can spin up a GPU instance on Google Cloud and install ComfyUI there. It's not a problem with me. The only problem I'm having is to understand how those technologies fit together.
So i installed stable diffusion web UI like 2 years ago, and barely used it since then... i want to get back to it, but since it's been 2 years, there's a ton of new stuff, and i don't know if simply updating the .bat is enough to get most of what i need...
If someone would be able to take the time to explain stuff to me, it would be great! (no rush though, i understand it's a pain to help a nood with their dumb questions)
After 2 years you're better off installing the webui fresh using the guides from the pinned messages in #🤝|tech-support
There you also can ask for any technical questions.
thanks!
nu
u can jus move it to sum where else then install again
i haev 2 stables
i have stable 1.5 for cpu, and then stable 1.7 for gpu xd
ik that sounds odd
but, they actually give diff results
with same settimgs and prompt, sum times i prefer wut the cpu makes, other times, gpu
i'm not talking about the models, just webui stuff
i knows
i think they changed their content policy and some loras get removed.
o
thats weird
wonder wut got removed
or why, rather
probably anything involving children/loli?
prolly
i wouldnt kknow a ting about that
but if that were the case
is that why u wanting to move from civiti ai
so u can have those loras
xd
doesnt it like
really grind ur gears
when ppl bash Ai, and say it makes crap
when they dont even take the time to learn it
so they make crap
ahg, i cant stand those ppl
makes me so frangry
Oh, I think I will ask my question to tech support then.
if like 90% of my dataset are 1400x1799 images, should I train (pony style lora) in that resolution or use a resolution like 896x1152?
you should make all your images the resolution you're going to train
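For the resolution question above, one way to choose a training size is to keep the source aspect ratio, scale to a target pixel area, and snap to a multiple of 64. The sketch below assumes a 1024x1024 target area and a step of 64, which are common choices for SDXL-based models but are assumptions here rather than requirements of any particular trainer.

```python
import math


def training_resolution(width, height, target_area=1024 * 1024, step=64):
    """Scale (width, height) to about target_area pixels, snapped to `step`."""
    scale = math.sqrt(target_area / (width * height))
    new_w = round(width * scale / step) * step
    new_h = round(height * scale / step) * step
    return new_w, new_h


print(training_resolution(1400, 1799))  # (896, 1152)
```

So 896x1152 is essentially the same aspect ratio as 1400x1799, just at a smaller pixel budget.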
anybody use macbook? i need help. when i use an external bluetooth mouse and try to scroll, it keeps bouncing back and isn't smooth
anybody know how to fix this ? pls im desperate
hi can someone help me with this error?ERROR: Could not find a version that satisfies the requirement torch==2.1.2 (from versions: 2.2.0, 2.2.0+cu121, 2.2.1, 2.2.1+cu121, 2.2.2, 2.2.2+cu121, 2.3.0, 2.3.0+cu121, 2.3.1, 2.3.1+cu121, 2.4.0, 2.4.0+cu121, 2.4.1, 2.4.1+cu121, 2.5.0, 2.5.0+cu121, 2.5.1, 2.5.1+cu121)
ERROR: No matching distribution found for torch==2.1.2
Hey guys, looking to generate quick video clips locally, what should I be using?
animatediff is not bad if you get good with it
Is that the one where you post two images and it "lerps" between thm?
it can do that, people also make their image by building up frames individually
its okay, its lower resolution than the big new models but its not that bad especially for social media
I don't do video though personally but I would recommend Banodoco server they specialise in this stuff
there is a working link here I think https://old.reddit.com/r/StableDiffusion/comments/18wm1md/whats_the_current_banodoco_server_discord_url_all/ the guys there make the comfy nodes for most video stuff anyway so its good to check out
Amazing thank you!
no problem
hey, wrong python version installed. Uninstall it. And then delete the venv folder.
Install Python 3.10.11 64bit
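The pip error above lists only torch 2.2.0 and newer, which is consistent with the interpreter being newer than anything torch 2.1.2 was built for. A quick check like the one below can confirm which Python is actually running before reinstalling. The required version (3.10) is taken from the advice in this chat, not from any official compatibility table.

```python
import sys


def python_ok(version_info=sys.version_info, required=(3, 10)):
    """True when the major.minor version matches the one the webui expects."""
    return tuple(version_info[:2]) == required


if not python_ok():
    print(f"Running Python {sys.version_info[0]}.{sys.version_info[1]}, expected 3.10")
```

Run it with the same interpreter the webui launches, otherwise it reports on a different install.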
Thank you I will try it tomorrow and let you know
Hello!
is there any local alternative to Kling Ai? I heard Animatediff is not as good, but Kling is rustling my jimmies
actually i like the site because there are examples of generations with their settings: which model was used, which lora and sampler, and the prompt included with the generated images. testing with the same seed number, i'm usually able to generate the same image. unfortunately i didn't save that information.
I think they got removed due to the tags with 🍑💨, but they were pg13, so not revealing body parts.
so i'm not exactly sure why they got removed.
now i am getting this exit code: 9009
stderr:
Python was not found; run without arguments to install from the Microsoft Store, or disable this shortcut from Settings > Manage App Execution Aliases.
Launch unsuccessful. Exiting
after downloading it from the internet
Have you deleted the venv folder?
restarting after installing python because windows doesn't live update the path variable. also.
it works now thankyou
i have another question because i wanna do image2video but what do i need to do to install it
which folder do i need
v
Крош
BOT
— 18.09.2023 18:19
Image
If you want special privileges on the server, you can get them by voluntarily donating money, which will go toward developing the server.
Problem with Latent Couple in Stable Diffusion API: Overexposure, Artifacts, and Contrast Increase when Using AND Operators
Hi! I'm facing an issue while using Stable Diffusion and the Latent Couple extension through the API. When I add AND operators in the prompt to define different segments, the generated images start to show strong overexposure, artifacts, and increased contrast. The more ANDs I add, the worse these effects become.
Has anyone encountered similar issues with artifacts when sending prompts with ANDs through the API? Could it be that the API interprets AND differently compared to the web interface?
Is there a way to properly define separate prompts for each segment without these effects, maybe with a different formatting approach?
I would really appreciate any advice or alternative approaches to working with segmentation through the API!
Note: I'm sending the image in base64 format.
All discord messages should be written in the format of emails tbh
I don't have api experience, but in automatic, AND is super overcooked and it's better to use things like BREAK, or just not split the prompt at all.
You could also try reducing cfg scale and see if that helps
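To compare the suggestions above through the API, it can help to build the request body in one place so that only the separator or the CFG scale changes between runs. The sketch below only constructs a payload. The field names `prompt`, `cfg_scale` and `steps` follow the AUTOMATIC1111 web UI's `/sdapi/v1/txt2img` endpoint, while the default values are illustrative assumptions, and any extension-specific arguments (such as Latent Couple's region settings) are deliberately left out.

```python
def build_txt2img_payload(segments, separator="AND", cfg_scale=7.0, steps=20):
    """Join prompt segments with a separator and return a request body dict."""
    prompt = f" {separator} ".join(s.strip() for s in segments)
    return {"prompt": prompt, "cfg_scale": cfg_scale, "steps": steps}


segments = ["forest at night", "a fox, orange fur", "a lantern, warm light"]

# Same segments, three variants to compare side by side.
baseline = build_txt2img_payload(segments)
with_break = build_txt2img_payload(segments, separator="BREAK")
lower_cfg = build_txt2img_payload(segments, cfg_scale=4.5)
```

Sending each variant with a fixed seed makes it easier to tell whether the overexposure tracks the number of ANDs or the CFG scale.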
i'm doing a fresh install of the webui, and i clicked on the link provided in the pinned messages, there's more webUI now? Which people here consider the best?
i had the automatic1111 before, but are comfy and other better?
Comfy is powerful but harder to use (although swarm makes it easy again)
Forge is just better auto (faster, automatically chooses the best performing options for your machine instead of having to experiment with a lot of technical options)
so Forge is the best for a beginner? I still want to use different models and LORA and such
Hello there, if I need help with my generations what's the best channel for asking for said help
I don't know whether it's the best but it's a good choice
just install SwarmUI, and then ask for help in the #🐝|swarm-ui channel. Forge isn't that great, and they tend to lag behind providing support for new model releases
this one isn't even one the install guide, what's better about this one?
Swarm wraps around comfy to make it usable for non power users
is stability matrix worth it? It seems to be able to switch between different ui without hassle, but what's the catch?
uuuugh, i hate being indecisive, i don't know which webui to choose
i just want one that's compatible with all models and loras
they are all confusing as hell at first, pick one and learn, comfyui have the fastest support of new things generally speaking
https://x.com/StabilityAI/status/1857079853893238847 "'We are pleased to share that Stable Diffusion 3.5 Large, our most advanced text-to-image model, is now available in
@amazon SageMaker JumpStart.
Starting today, users can securely deploy Stable Diffusion 3.5 Large within a virtual private cloud on SageMaker JumpStart, ensuring data protection and privacy.
By combining our cutting-edge technology with @awscloud’s reach, we’re enabling both startups and enterprises around the world to access this technology to create professional grade images at scale.'"
Can any1 help me? I need stable diffusion to use my nvidia and I cant
SwarmUI is in the list too
All webui install Guides:
https://github.com/CS1o/Stable-Diffusion-Info/wiki/Webui-Installation-Guides
Maybe I should add Invoke too
oh it's like comfyUI, the nodes thing look super complicated to me
explain please
swarm will do all the technical stuff for you. it does have a comfyUI tab so you can edit the workflow if you want to, but you don't need to
wondering if anyone knows how to make a bunch of tiny forest/medieval/fantasy creatures like little fairies, little goblins, etc
I can find loras for 2 people, because of sexual fetishes, but a bunch of tiny fairies seems really hard to find. Anyone know a model or lora that can manage?
Want to use the dedicated gpu not the integrated one
Hey in the pinned messages of #🤝|tech-support You'll find the setup Guides.
Also ask there if you need help setting it up
Hey all, hoping someone can help! I'm trying to find a research paper I saw recently (two days ago) on image blending, I recall one of the authors being from Google. It focused on taking subjects from one image and merging them into another, while matching the style of the destination image. Lost the link! Does anybody have any ideas of what I'm talking about ?
there are tons - if you do a google search on: image blending arxiv <--- that might not get the exact paper but it might get you useful papers
yeah, unfortunately I've been looking for like an hour but no dice. My last resort was writing here in the hope someone else stumbled upon it recently ha
all i remember from google recently was dealing with video
Hi does anyone have experience deploying Stable diffusion in docker? I am developing a final degree project and I need some help and I don't mean using the typical webui like automatic1111.
What's the open source txt2video model right now to use
i think the only place you're going to have a hope of anyone that uses docker seeing your question is in the #🤝|tech-support channel here. you probably want to be asking this on the Banodoco discord
Mochi is definitely by far the best followed by CogVideoX and allegro. You can run it as low as 12gb vram(maybe even 8 now?) but it’s slow and a bit lower quality than full precision.
At full precision(requires 40gb vram gpu at least) it beats gen3, kling1, and Luma according to the text to video leaderboard.
no wonder i stopped using stable diffusion 2 years ago, retrying now and despite using the same prompts as the image i'm trying to reproduce, using lora with instructions and embedding+negative prompts, i still can't get anything i want, i'm so bad at this and i'm losing hours for nothing... i don't get it, i wish someone would just teach me...
and sometimes the thing just completely glitches, like the previews are fine and once the work is done it's a jumbled mess
gonna go to sleep for now
suggestion - start over. the technology now is not what it was 2 years ago. start with SD3.5 large, use swarmUI to install everything with, and just experiment
i reinstalled everything already, started fresh
will try swarmUI tomorrow
but i doubt it will make much of a difference if i can't get the prompt and lora to do what i want :p
Is anyone having problems with Paperspace with A1111?
I wonder whats the best way to go about upscale fix with videos. Im not talking about higher res, but mostly fixing the existing video anatomy which may not be correct, 6 fingers, clothes mixed with skin and such
My current solution is just taking frames every 0.5 seconds upscale fix and interpolate at the end
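The frame-picking step of that approach can be sketched as follows: choose one frame every half second, fix those, and leave the frames in between to interpolation. The frame rate and clip length here are example values, and the extraction and interpolation tools themselves are out of scope.

```python
def keyframe_indices(total_frames, fps, interval_s=0.5):
    """Indices of the frames to fix, one every interval_s seconds."""
    step = max(1, round(fps * interval_s))
    return list(range(0, total_frames, step))


# Example: a 2-second clip at 24 fps.
print(keyframe_indices(48, 24))  # [0, 12, 24, 36]
```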
Yeah that should work well, it kinda works decently for me. I use flux schnell 1 step but you can use sdxl lightning too I believe for fast inference.
Has anyone heard of any Controlnets or IP Adaptors created yet for SD3.5?
Has anyone here had success making QR code images in XL/Pony?
If so, could that someone share the settings he or she used?
I am stuck and my image doesn't look like a QR code at all even with exaggerated settings
I am trying to do it with integrated controlnet in forge
Does SD 3.5 use the same resolution as SDXL? Can 8GB of VRAM run this model?
yes and yes (but with --medvram-sdxl if you're using auto1111)
I'm using SD3.5-Large-Turbo on 12GB of VRAM and it works great. 10 steps in 13 seconds.
i can finally do some good stuff... but man it takes sooooooooo long, my lazy motivated ass will give up again... i wish it would be easier to get what you need and it being consistent, maybe it will be in 2-3 years, lol
plenty of methods out there now to make stuff consistent
you can fix that
how, lol, so many settings, so many models, so many loras, so many settings that can completely change the image, and sometime you get completely different results despite not changing any settings
and when i see these tutorials that are so loooooooong, just for one small thing, it's really discouraging, i can't memorize all that
Does Mochi allow for lora usage? Can it get as accurate to character loras like animatediff?
currently have 24gb vram
No training code for mochi yet, only cogvideox. Also it’s not like animatediff where sd1.5 LoRAs will work with it, you need to use LoRAs trained for CogVideoX or train one yourself. However, both are miles ahead of animatediff.
You can run full cogvideox or mochi-1 with q8 quants in 24gb. Honestly, I would probably recommend CogVideoX right now, it just has a lot more control (trajectory, img2vid, vid2vid, controlnets, LoRAs) than mochi, but mochi is far ahead in pure text to video.
Okay, I will check them both out, thanks
Yep here are the links
Mochi: https://huggingface.co/genmo/mochi-1-preview
cogvideo fun(finetune of cogvideox to support multiple aspect ratios): https://huggingface.co/alibaba-pai/CogVideoX-Fun-V1.1-5b-InP
Cogvideox1.5: https://huggingface.co/THUDM/CogVideoX1.5-5B-SAT
Kijai has great comfyui nodes for them, you can also use diffusers if you like code.
any tips on how to remove details like skin shine and so on? prompts or negative prompt, nothing seems to work, and the detailed shiny skin doesn't fit with the face of my character which is traditional anime style
is 1.5 finetunable yet ? or have to wait for diffusers ?
No I don’t think so, but support should come soon. Only 1.0 is fine tunable but 1.0 Lora’s work in 1.5 as well.
oh damn , gotta try my lora on 1.5
how to apply low denoise inpaint ? or img2img but only for a part of the image
I think you said you had a workflow with very fast img2img in few steps ?
I just used diffusers, it was flux schnell with 1 or 2 steps. Even sdxl lightning or something else could probably work to refine the img.
the new Shuttle Diffusion 3 at 2 steps looks amazing
from the developer for shuttle diffusion - it's based on flux, not stable diffusion. keep that in mind
schnell is better for my preference lol
why? it loses so much
i need a workflow .. low denoise inpaint, sdxl or lightning.
theres tons of inpaint with denoise 1 or img2img, but idk how to do inpaint with low denoise :/
on fooocus though it works great
in comfyui i tried fooocus inpaint but idk how to make it work with low denoise.
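On what "low denoise" means in practice: in img2img and inpaint pipelines such as the ones in diffusers, the strength (denoise) value decides how many of the scheduled steps are actually run on the noised input, so a low value only lightly reworks the masked area. The function below mirrors that calculation in plain Python as a sketch. Whether a given ComfyUI node treats its denoise value exactly the same way is an assumption.

```python
def steps_actually_run(num_inference_steps, strength):
    """Number of denoising steps executed for a given strength in [0, 1]."""
    return min(int(num_inference_steps * strength), num_inference_steps)


print(steps_actually_run(20, 0.25))  # 5 of 20 steps: small, local changes
print(steps_actually_run(20, 1.0))   # 20 of 20 steps: content fully regenerated
```

With few-step models this matters more, since a low strength at 4 steps can leave only a single step to run.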
my local install of a1111 is generating very slowly, normally it didn't, how can I debug what is causing this?
Shuttle Diffusion 3 is simply a Schnell checkpoint
ive been doing this its wild, does big res also, probably going to try a 2 step tiled upscaler
oh nice if it can handle high res that would be good for upscaling
wonder if the jasper or union control nets will still work
I've started using inpainting for upscale rather than tiled
to avoid seams and blend a bit better
when you go past like 12-16 steps the image completely transforms to something else, not quite refine at larger steps like they said
would be interesting to see what they've done there
i wouldn't. schnell just ... no. use dev for that
I agree its better to wait for higher quality, but I found people tend to not like it when I suggest really slow workflows LOL
yes, well - let them trade quality for quantity if they want to
chasing the One Step of Euler dream
Ask in #🤝|tech-support
Also check if you have xformers in the webui-user.bat
;)
hi
Is there a way to change only one face on adetailer in a picture with several faces? If so, can someone give me install links?
Do you guys still use inpaint models when doing inpaint ? Or is it kind of deprecated and we can use just normal generation models for doing so ?
I wouldnt say deprecated just unloved. Quite a few people still use sd 1.5 inpainting because it wasnt really focused on in the training community for sdxl onwards.
brushnet and powerpaint are good
how to generate free images, what model to select
Hello everyone! I would really love some help if I could talk to someone.
I am trying to make consistent images of the same chibi characters in different poses.
Like the puppy I want it in different poses and different environments! I would really love some help I am very new to AI image gen thingies.
Is that even possible to do, and if so where do I start!
Thank you!
Ok but are those inpaint models really worth it ? Personally, I have been struggling for a long time with inpaint when I want to insert some characters or objects in an empty area of the background. At first I thought it was because I don't use an inpaint model, but I finally got good results with a workflow which crops and zooms in on the area concerned. And then I see that it seems only a few guys use those models specially dedicated to inpainting... So that's why I wonder if those models are really worth it... 🤔
they are really bad compared to modern tools
that's not to say they aren't useful
SD 1.5 inpainting models are nowhere near as good as something like powerpaint
more recently the research has moved to LLM-agent inpainting but you should probably try all the older methods first before that
Hello
Hello
what is the "it" they would be worth? they dont cost anything
Hello, does anyone know of a problem where when you try to download a Checkpoint, it appears in the SD menu but you can't use it? Sometimes it doesn't even download, it says it was downloaded but it doesn't appear
Hey try update your extensions, then delete the venv folder and relaunch.
Do you use Civitai Browser+ extension to download them?
hello friends
what version of stable diffusion would you recommend to me? Optimal for functionality maybe
I usually use the batchlinks downloader
What's "LLM-agent inpainting"?
Is automatic1111 outdated? Idk, I just don't see much about it anymore nowadays. Should I switch to another program? Kinda like the interface of automatic1111, I'm just getting used to it
people are still using it, and still asking for support for it in the #🤝|tech-support channel
Fair, thanks!
generally they make an inpainting workflow in diffusers and then hand that to an agent as a tool
you could do it with comfy API also
from there its just standard LLM agent stuff
hello folks, is there any other UI than forge that is compatible with both regional prompter and prevent artifact?
want to use both but the plugin only works with the outdated version of forge ui
could you describe prevent artifact
Is this free
is what free?
Using the image generator or is it credit based
read the information here -> #artisan-faq
You mean inpainting by describing what you want to be inpainted?
no that would just be regular inpainting except an LLM passed your prompt to the diffusion model
its what is known as semi-autonomous agents
where they are not fully autonomous because there is a "human-in-the-loop" as they say
you give the agent the task of looking at the image, working out masking, prompting, and sampler settings, and assessing whether the task is done or if more is needed to be done
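The loop described here can be sketched in a few lines. Everything in it is hypothetical: `plan`, `inpaint` and `is_done` stand in for an LLM call that proposes a mask, prompt and settings, an inpainting tool (for example a diffusers or ComfyUI workflow), and a review step that could be the LLM or a human in the loop.

```python
def agent_inpaint(image, instruction, plan, inpaint, is_done, max_rounds=3):
    """Repeat plan -> inpaint -> review until done or max_rounds is reached.

    plan, inpaint and is_done are hypothetical callables supplied by the caller.
    """
    for _ in range(max_rounds):
        step = plan(image, instruction)   # propose mask, prompt, sampler settings
        image = inpaint(image, **step)    # apply the edit with the diffusion tool
        if is_done(image, instruction):   # assess whether more work is needed
            break
    return image
```

The difference from a single "change the tree leaves to brown" edit is the review step: the agent looks at the result and decides whether to go around again.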
How does that differ from "Change the tree leaves to brown"?
it differs because both the instructions you give to the LLM, and the actions the LLM takes are different
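A rough sketch of the loop described above (look at the image, work out masking, prompt and sampler settings, run the inpaint tool, assess, repeat). Everything in it is hypothetical: `propose_plan`, `run_inpaint` and `task_done` are stand-ins for the LLM calls and for a diffusers or Comfy API inpainting workflow, and the `approve` callback is the "human-in-the-loop" part.

```python
from dataclasses import dataclass


@dataclass
class InpaintPlan:
    mask_box: tuple  # (left, top, right, bottom) region to repaint
    prompt: str
    steps: int
    denoise: float


def propose_plan(task: str, attempt: int) -> InpaintPlan:
    """Hypothetical stand-in for the LLM choosing mask, prompt and sampler settings."""
    return InpaintPlan(
        mask_box=(0, 0, 256, 256),
        prompt=task,
        steps=20 + 5 * attempt,
        denoise=min(1.0, 0.5 + 0.1 * attempt),
    )


def run_inpaint(image: str, plan: InpaintPlan) -> str:
    """Hypothetical stand-in for the inpainting tool handed to the agent."""
    return f"{image}+inpaint[{plan.prompt}]"


def task_done(image: str, task: str, attempt: int) -> bool:
    """Hypothetical stand-in for the LLM assessing the result.

    This stub simply accepts the second attempt.
    """
    return attempt >= 1


def agent_inpaint(image, task, approve=lambda plan: True, max_attempts=3):
    """Plan, ask the human, run the tool, assess; stop when done or rejected."""
    history = []
    for attempt in range(max_attempts):
        plan = propose_plan(task, attempt)
        if not approve(plan):  # human-in-the-loop veto
            break
        image = run_inpaint(image, plan)
        history.append(plan)
        if task_done(image, task, attempt):
            break
    return image, history


result, history = agent_inpaint("tree.png", "change the tree leaves to brown")
print(len(history), "inpaint passes")
```

With regular inpainting you would pick the mask and settings yourself and run one pass; here the agent makes those choices and decides whether another pass is needed.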
#💬|general-chat message if anyone has the answer for this feel free to @ me, thanks
that's probably your answer
Test
this prevent artifact plugin is fixing a bug in A1111/Forge so other platforms don't need it
passed
I'm looking to use Latent Couple + Composable Lora, or if not, then at least Regional Prompter. I installed them and tried, but something is wrong. Anyone know how it works and can help me with my config?
could you switch to comfyui? these are all in the manager for automatic install
@me or someone else?
you, I think it would help
there are a lot of regional nodes these days, several hundred even
i am using forge, and they are both in the manager for automatic install there too, so I think it installed correctly, I just am using it wrong
forge lags behind on support for the new models and auto1111 is apparently no longer a good solution. everything is being developed for comfyUI now
hmm, that sucks for me because i find comfyui unintuitive and don't feel like picking it up atm
however i doubt the problem is that my version is out of date, I suspect something in my prompt isnt done right, or a dropdown menu has an unticked box or some such is the problem, and if I'm right, then someone who has used this can look at my setup and tell me what I missed or did wrong
oh I totally agree comfy is unintuitive
it took me much longer to learn comfy than, say, fooocus
but the sheer scale of features in comfy is unmatched by other UIs
unless someone makes a diffusers UI one day
well, I'll leave the prompt here, in case someone can tell just from that what I did wrong. (This is for Latent Couple with weird sketches, not for Composable Lora with its square regions)
witches outside in a dark forest, magic ritual, pagan
AND massive demon, devil, powerful, red skin, horns
AND pentagram, summoning circle, witch magic
AND witch, witch hat, black robes, short hair, masculine features, 35 years old
I believe the first part should be applied universally, and the next 3 describe prompting and lora use inside the selected regions, and AND separates them for the addon, but something isn't generating, I just get a witch sitting in a circle every time
AND doesn't work like that. you're not writing a script
oh should I use BREAK? Or something else?
what interface are you using?
it's never easy to switch when you like what you've been using, but you'll get the hang of it
idk, but I have a screenshot that apparently Discord doesnt want me to post, replacing upload with "Create Poll"
I am using "Forge" and "latent couple", and a keyboard and mouse, and I don't know what "which interface" is referring to
you have to use the #🏞|general-with-images channel to post images
forge is your interface. there are several that people use: webui, forge, auto1111, comfyUI, the commandline...
ok so Interface = ui or frontend
right now, most of the development for new stuff is going into comfy
UI = User Interface
alright, so we have established i'm using forge and unable to separate prompts with a Latent Couple addon
I should say that the reason for the ANDs in the prompt is because when I filled in the prompt boxes for each shape and pressed "Prompt Info Update", the client added the ANDs itself
yeah. that sort of separation has never worked. that goes back to people thinking they could get around the 77 token limit by using BREAK or AND and stuff. which didn't work then, either.
you would need an interface like nightcafe has built to use multiple prompts - and give different weights to the prompts in each field
hmm, so did it use to work? if not, I'm not sure why the addon would be programmed to do that
I also dont see any buttons or boxes to change to another option for separating prompts
Should I just go to the with pictures forum? it seems like a ss would help a bit
i would just go with the standard weighting that stable diffusion uses - the stuff at the start of the prompt has the most weight, the stuff at the end has the least. put the stuff you don't want to see much of all the way at the end
Right, but my goal is to use either Latent Couple or Regional Prompter because I want to try different prompts for different parts of the image. Latent Couple does it with odd shapes that you sketch out, and Regional Prompter does it with user defined grids, but both of them allow more control over a prompt, both for abrupt changes without prompts bleeding into each other, and for composition of an image. In my case I want to be able to make a large number of small characters, like a goblin army, or a forest full of mushrooms and fairies, so using either of these tools will probably be a lot better for me than merely weighting my normal prompt in a clever way, which is why i'm here, failure to use the prompts cleverly enough and a desire for more direct control
for reference, regional prompter interfaces with controlnet which I know a lot of people use for more control over image composition
Anyone here want to adopt a Chrome extension with a 100k user base? The only condition is to help transition it to manifest v3. There may also be a potential cash bounty for it.
The story is a sad one. The Great Suspender was removed from Chrome as it had tracking/malware; others cloned it, removed the tracking and added some features, and thus The Marvellous Suspender was born. It ran for a good few years, but the repo owner has put it up for adoption as he no longer has the skills or time needed to maintain it and transition it to v3.
I have tried to pull in quite a few people for help but there seems to be little traction. If I had the skills I would simply do it myself, as I rely on this extension way too much, but unfortunately I don't and can't learn fast enough to save it.
This is pretty much a last-ditch effort, but I humbly ask you to simply take a look and see if you are interested in or willing to contribute or adopt.
https://github.com/gioxx/MarvellousSuspender/issues/197#issuecomment-2480824976
post in #1092446741984444416
update: I got Regional Prompter to work. Oddly, when I had "Use BREAK to change chunks" checked, it did not work, but when I unchecked it, BREAK worked to jump between regions again. Weird.
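For anyone landing on the same problem, here is a small sketch of putting the region prompts into one BREAK-separated prompt, the form that ended up working above. `build_regional_prompt` is a hypothetical helper, not part of the extension, and it assumes the first chunk is meant as the shared description and that the number of remaining chunks matches the regions you defined; how the first chunk is actually treated depends on the extension's settings.

```python
def build_regional_prompt(common: str, regions: list[str]) -> str:
    """Join a shared description and per-region prompts with BREAK.

    Hypothetical helper: it only builds the text you would paste into the
    prompt box; region layout is still configured in the extension itself.
    """
    parts = [common.strip()] + [r.strip() for r in regions]
    return " BREAK\n".join(p for p in parts if p)


prompt = build_regional_prompt(
    "witches outside in a dark forest, magic ritual, pagan",
    [
        "massive demon, devil, powerful, red skin, horns",
        "pentagram, summoning circle, witch magic",
        "witch, witch hat, black robes, short hair, masculine features, 35 years old",
    ],
)
print(prompt)
```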
your computer is haunted...
Thx for the directions
Hey, say a certain character doesn't have a whole lot of high quality art. Would generating images using an early epoch of the lora and then feeding high enough quality generations back into the training work ok? or is that bad?
could you try something like supir to repair/restore it first
Hello GM
Hi all, I would like to create my own LoRA on a Pony model on Colab with https://colab.research.google.com/github/hollowstrawberry/kohya-colab/blob/main/Lora_Trainer_XL.ipynb
However, I receive an error that I think concerns the dimensions of the images in the dataset. What requirements must the dataset meet? (To be clear, I am a total beginner)
hello all
Is 840 euros good for a second hand EVGA GeForce RTX 3090 XC3 Ultra Gaming
A reputable online store is selling one right now.
its good depending on your country.
im in balkan and can find them used for 600-800 euro
On marketplaces they are around ~700 euro without any warranty. Also i would have to drive over 120km to buy most of the second hand 3090's xD
So 840 euro sounds more reasonable?
its reasonable. im assuming delivery included
i got mine used + pc for 1000$, i loved it ; )
able to run image generation, video generation, local llm, loras, llm fine tuning, its awesome.
Already got a 7900gre with 16gb but just want one in my homelab xD
Sounds like a good investment
runtime error: torch is not able to use gpu. can an amd gpu not be used with automatic1111?
hi everyone, we are Kling AI
flux1.1 is not open source so people can't even train it
also flux pro 1.1 ultra is possibly a 24B or larger model
quite possibly more
the control nets for dev are apparently costing $40k, per single training run
to train a control net for flux pro 1.1 ultra might cost $100k per run
this is partly the appeal of things like Nvidia Sana
You would need to follow the AMD install Guide from here:
https://github.com/CS1o/Stable-Diffusion-Info/wiki/Webui-Installation-Guides
Preferably with ZLUDA if your GPU is supported
guys i added a detail lora (https://civitai.com/models/411088?modelVersionId=458257) to my comfy ui workflow and all of a sudden the images are nude even if i use proper prompts and negative prompts.
Even when the weight is 0, the images are still inclined to be nude
once i remove or change the lora , the issue goes away
this is common
civit ai is an incredible resource with thousands of free checkpoints and loras
but its a mostly open system
so its common to find a lora which had issues in the training, for example in this case the lora is overfit on NSFW images
That makes sense
By any chance do you know of any lora that fixes details, mostly corrupted faces
Or what other options do i have to fix distorted faces
inpaint on the face or get a model like flux/sd3.5 with 16ch vae
Check your youtube video comments (kling ai fully upgraded). Many people have issues with your product and/or can't cancel.
I only use this lora for SDXL https://huggingface.co/SPO-Diffusion-Models/SPO-SDXL_4k-p_10ep_LoRA
@desert dagger I am in here as well
@tawdry cape @fervent thunder can you guys help Koss with his system questions, please - they're somewhat outside of my skill set
the crib notes - I have ComfyUI set up on an LXC container in Proxmox, I am wondering what the best way to set up a shared drive is to symlink into the image output folder so I can run ComfyUI from any of my machines and have access to the output. I was mulling over using my google drive, but I'm wondering if a network share would suffice
I've given up on docker and given Comfy its own LXC container
works fine for that purpose, just need to make it more usable between devices now
the way i did it when I set up my system for using remote instances of ComfyUI to render images for me was to control the whole flow: when you queue your prompt through the ComfyUI API backend you also get to listen for the event that fires when it's finished, so I would listen for the finished events and just copy the files out of the output folder onto the main server. It's a different way of looking at the same problem, I think
I haven't delved into the ComfyUI API yet, I'm literally 24 hours in. Up to that point I was just running locally on Windows. That's given me an avenue to go and explore
It just seemed that the path of least resistance was a symlink to a networked drive, but that could work better
it's a WebSocket interface that they offer, so you can look for api.js in the code and base it off of that
winner
yeah i mean you could do it that way but it's messy bc you have to manage each instance and the storage for each instance, organize and track what each instance has, and then there's the user experience of having to go into each instance to look at the contents. the whole idea is messy; streamlining it by collecting all the images into one spot and just using the instances to render and report back would be a cleaner way to go imo
yeah, that makes a lot of sense
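A minimal Python sketch of that flow (queue through the API, watch the WebSocket for the finished event, copy the files to one place). The endpoint and message shapes here are assumptions based on ComfyUI's bundled WebSocket API example as I remember it (`POST /prompt`, then an `executing` message whose `node` is null once the prompt is done), so check them against your version. The helper names and paths are made up for illustration.

```python
import json
import shutil
import urllib.request
from pathlib import Path


def build_queue_request(server: str, workflow: dict, client_id: str) -> urllib.request.Request:
    """Build the POST that queues a workflow (assumed endpoint: /prompt)."""
    body = json.dumps({"prompt": workflow, "client_id": client_id}).encode("utf-8")
    return urllib.request.Request(
        f"http://{server}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
    )


def is_finished(raw_message: str, prompt_id: str) -> bool:
    """True if a text WebSocket frame says our prompt has finished executing.

    Assumed shape: {"type": "executing", "data": {"node": null, "prompt_id": ...}}.
    Binary frames (e.g. previews) should be skipped before calling this.
    """
    msg = json.loads(raw_message)
    if msg.get("type") != "executing":
        return False
    data = msg.get("data", {})
    return data.get("node") is None and data.get("prompt_id") == prompt_id


def collect_outputs(output_dir, dest_dir, filenames):
    """Copy finished images from an instance's output folder to the main server folder."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in filenames:
        src = Path(output_dir) / name
        if src.is_file():
            copied.append(Path(shutil.copy2(src, dest / name)))
    return copied


# Example of the check on a message as it might arrive over the socket.
sample = json.dumps({"type": "executing", "data": {"node": None, "prompt_id": "abc"}})
print(is_finished(sample, "abc"))
```

Sending the request (`urllib.request.urlopen`) and reading the socket (a WebSocket client library of your choice) are left out so the sketch stays dependency-free.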
not sure if you're familiar with nodejs/javascript
i could share my ComfyUIApi.js file if you wanna build on top of that
much appreciated. Let me see if I can figure it out first, might have to come back to you later if I get stuck. I think half of my problems with this build have been leaning on other people's work too much and missing the intermediate steps to understand what's actually taken place
that happens to me sometimes when i let chatgpt blindly code for me and im just quickly prototyping copy pasta, and then I get stuck and I'm completely lost and have to take a step back and review everything I supposedly did
I have a question about differences between SD and Flux workflows.
With SD, I like to use either from/to or image to image in order to produce images that are somewhere in between two different prompts. I find that feature very useful to create things that would be difficult to prompt for from scratch.
When I try to do the same thing with Flux dev, I do not have much luck. With either from/to or image to image, the output does not seem to change meaningfully unless I switch from one prompt to the other at step 1 or 2 or set the denoising strength/creativity very high for image to image. At that point the output switches almost completely from the old prompt to the new one. I cannot get the in between images that I am looking for. I am wondering if that is something inherent to how Flux is designed, or if there are settings I can adjust. I have tried the basic things like changing CFG and step counts, but nothing has been particularly successful. The closest I got was doing 100 step generations, but those are way too long for normal use.
I want to effectively compare the results of SDXL, SD 3.5, and Flux, but that is hard when the core part of my workflow doesn't function the way I expect on one of them.
how are you doing img2img?
I have been using the Init Image function in SwarmUI.
I see few effects on the final output until I turn creativity up to ~.8-.9, at which point the image changes. Then there are few additional effects up to 1.0 creativity. That is on Flux.
On SD, I am used to small effects being visible at <0.5 then increasing as I approach 1.0.
Most of my experience using this has been with 1.5 derived models on Auto1111, but I recently decided to try out some of the new tools and models. SD 3.5 seems to work similarly enough to 1.5 with image to image that I can port my workflow pretty easily, but I like some of the results I have gotten from Flux outside of image to image.
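One thing that may help when comparing models: in diffusers-style img2img, the strength setting decides how many of the requested steps are actually run on top of the init image. The sketch below shows only that arithmetic. Treating SwarmUI's "creativity" as the same kind of knob is an assumption, and this does not by itself explain why Flux reacts so late; its noise schedule differs from SD 1.5's, which may be part of it.

```python
def steps_actually_run(num_steps: int, strength: float) -> int:
    """Steps denoised after noising the init image, diffusers img2img style."""
    return min(int(num_steps * strength), num_steps)


# With a 20-step run, low strength leaves only a handful of steps to change anything.
for strength in (0.25, 0.5, 0.75, 1.0):
    print(strength, steps_actually_run(20, strength))
```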
okay, i haven't done any img2img with flux at all, you might want to post on the SwarmUI discord about the issue
Will do, thanks.
local llm's? how fast is it, cause when i use my 4080 and 32gb of ram, my PC screams lol
with 32b q4 it struggles, have to wait 30 seconds for response. for anything below that its super fast.
what model? i use LM Studio myself and just mess around with different models
i use text-generation-webui
models Unholy , command-r are my go to, dolphin models are next.
really the value of open source is in uncensored llms, for the rest closed source gpt-4 is way better.
I mean mistral large 123b, llama 405b, and even llama 70b are considerably better than gpt4 and similar quality to gpt4o (large and 405b are a teeny tiny bit better while 70b is worse).
Running them locally is a bit difficult though, 48gb vram or more is needed, but they are far cheaper than gpt4o (even 405b) in api pricing.
comfy is quite good for running LLMs
sup guys, i have recently finished re-implementing a GPT model in code after learning the transformer architecture, and now im trying to learn about LLMOps (LangChain and all). im not clear what the road map should be tbh, can someone suggest what to do? (I did machine learning with maths, deep learning with maths)
And i would highly appreciate if u guys can suggest me some good projects to practise and hone my skills. 
have you tried llama 3.2 uncensored? I use that one and I swear by it now, really good at staying uncensored imo
i think that's a really high level question, my best suggestion is to ask an LLM that very question, you can even copy and paste it. I think an LLM would offer way better suggestions and ways to practice vs anything we could come up with, plus 4o has the ability to web search now so it could do an internet lookup to find stuff for you
wait did they release the web search feature?
lemme check
not yet released in my country
okay i copied and pasted our chat into 4o and this is what i got:
@real tendon
Embarking on a journey into Large Language Model Operations (LLMOps) is a natural progression after mastering the transformer architecture and implementing GPT models. LLMOps focuses on the deployment, monitoring, and maintenance of large language models in production environments, ensuring they operate efficiently and effectively.
Suggested Learning Roadmap:
- Understand LLMOps Fundamentals:
  - Definition and Importance: LLMOps encompasses the practices and tools required to manage the lifecycle of large language models, including deployment, scaling, monitoring, and maintenance. It ensures models are reliable, efficient, and aligned with business objectives.
  - Key Components:
    - Prompt Engineering: Crafting effective prompts to elicit desired outputs from LLMs.
    - Model Deployment: Strategies for deploying models in various environments.
    - Monitoring and Maintenance: Continuous observation and updating of models to maintain performance and address issues like drift.
- Explore LLMOps Tools and Frameworks:
  - LangChain: A framework for building applications with LLMs through composability.
  - BentoML: A unified model serving framework that simplifies deploying machine learning models.
  - Helicone: An open-source LLM observability platform for logging, monitoring, and debugging AI applications.
  - GPTCache: A semantic cache to store responses from LLM queries, enhancing efficiency.
  - LangFlow: A tool for experimenting and prototyping LangChain flows with a drag-and-drop interface.
- Engage in Hands-On Projects:
  - Conversational Chatbot: Develop a chatbot using an LLM to handle customer inquiries.
  - Text Summarization Tool: Create a tool that summarizes lengthy documents or articles.
  - Sentiment Analysis System: Build a system that analyzes the sentiment of user reviews or social media posts.
  - Question-Answering System: Implement a system that provides precise answers to user queries based on a knowledge base.
  - Content Generation Application: Develop an app that generates content, such as blog posts or product descriptions, using LLMs.
- Study Real-World Case Studies:
  - Analyze how organizations have integrated LLMOps into their workflows, the challenges they faced, and the solutions they implemented.
- Stay Updated with Industry Trends:
  - Follow reputable blogs, attend webinars, and participate in forums focused on LLMOps to keep abreast of the latest developments and best practices.
Recommended Resources:
- Courses:
  - "LLMOps Concepts | From Ideation to Deployment" by DataCamp offers a comprehensive overview of LLMOps, covering the ideation, development, and operational phases of LLM application development.
  - "LLMOps - DeepLearning.AI" provides insights into creating LLMOps pipelines, including data retrieval, model versioning, and deployment strategies.
- Articles:
  - "LLMOps Essentials: A Practical Guide to Operationalizing Large Language Models" by DataCamp discusses the importance of LLMOps and provides practical guidance on managing LLMs in production.
  - "A Guide to Large Language Model Operations (LLMOps)" by WhyLabs offers an in-depth look into the challenges and solutions associated with deploying and maintaining LLMs.
- Open-Source Projects:
  - Explore the "Awesome-LLMOps" GitHub repository by TensorChord, which curates a list of tools and resources related to LLMOps.
By following this roadmap and engaging with the suggested resources, you'll build a solid foundation in LLMOps, enabling you to effectively deploy and manage large language models in production environments.
tysm mate
you might want to look into perplexity I think that's another LLM that offers web search
Hi, everyone! So happy to join this server. I'm an AI/ML engineer focused on open-source models.
I love open-source LLMs and image generation models such as LLaMA, Stable Diffusion and FLUX.
Recently I have been working on fine-tuning FLUX models but have run into some problems.
When I train the model on human images, the dataset doesn't require many images and 5 images give a good result, but when I trained on a logo, I got a bad result.
What is the problem? Is it how I made the captions, or do I need more varied images for the logo?
I'm using ostri/ai-toolkit for fine tuning.
Thanks
No havent tried. Which version of it is uncensored?
Hey So Can i Run stable diffusion smoothly on Mac Mini m4 16GB variant?
can anyone suggest any online resource to learn machine learning
Sure, You can run the stable diffusion model smoothly on 16GB gpu
Ahh Mac has Unified memory so the 16GB acts as the GPU too
I'm just delving into AI, i was confused about whether the 16GB is enough. I'm really on a budget, if i need to go for the 24GB i have to get a loan
No, if you run only stable diffusion model, the 16GB is okay. Now I'm using 12GB GPU😁
It is possible to lora train the model for 16GB
I don't know well about mac os but I trained the stable diffusion model by using my 12GB Gpu
What about text models?
I want to create a chatbox that combines Text and Image generation
Which text model? you mean llama?
I don't really know anything about it... so whatever model works ig?
I'm just hoping that the 16GB variant is good enough to handle these
for the text model, there are several. closed-source like GPT and open-source like llama.
If you use GPT, it doesn't require a GPU and only requires credits for tokens. It means you must pay money. lol
And if you use llama, this model is designed to use the GPU and CPU, so no worry about that.
If you don't use 2 model boss, it is okay
16GB is enough
2 model boss?
Does it run nsfw stuffs?
if you want nsfw content, you must use a model that supports nsfw.
And the general llama model doesn't support nsfw content
I'm not really into AI tech, i was hoping there is software that acts as a medium so all i need to do is create a character, select the model and do the talking
please use the llama dolphin-uncensored model
Ummm, interesting
Could you send me a DM please? Your idea is very interesting, and I have met someone before who had the same idea
Sure
is there any good examples of with/without this lora to see the difference on faces?
There are closed-source LLMs that give you a reasonable amount of daily free uses, like the Gemini models
Yeah, you are right, but closed-source models like Gemini can't support NSFW content and @small vale asked me about that
Ah, got it. You had mentioned ChatGPT (which of course is also not NSFW) so I figured you were just talking about all models.
No, and actually I love the open-source models because I can train and fine-tune them much more easily than closed-source models
Anyway, the llama dolphin(uncensored model) can support all kinds of content
does that work in tagui
good
@surreal perch
hello 🙂
Any tips for getting vines to wrap around an object in A1111? I'm trying to get an overgrown environment background, but the vines are coming out of the concrete in random places and going back into the concrete in random places, which doesn't look right 😡
Is it possible to run stable diffusion 3.5 large using python and running it with my gpu without the need of tokens and paying? I am trying to set it up, been using fooocus before but trying SD 3.5 now and testing if it will work on python
sure. i'll happily promise you that, but i'm not always one to keep my promises
@still glacier you see this?
saw what ?
Should i delete the built in mac apps? like keynote or number ?
Yes diffusers is what you are looking for, minimum vram required is 6gb but 8gb+ is a good idea.
Yes, i have been trying to get it to work for 2 days and always running into trouble. i was trying this "https://huggingface.co/stabilityai/stable-diffusion-3.5-medium" and https://github.com/Stability-AI/sd3.5
do you know of any tutorials that will show me how i can, for instance, type "cat driving car" in a python string and then it will generate? without using any tokens or stuff that costs money
don't use the api, it'll only cost you the amount of electricity your machine uses
I am only aiming at creating low quality pictures like 256x256 with not that many steps, like 5 sec per picture. is it still better to use tokens? It's for a website
you said you didn't want to use stuff that costs money, though?
yeah well thats true XD but i meant like tokens
you could go use mage.space - they have free accounts, and all accounts, free or not, get unlimited generations.
how come its free doesnt it cost money for them with electricity? @desert dagger
cause that's how roi and hunkins want to do business. they do have paid tiers. go look at it
data collection. selling data is a very lucrative business. if you can monitor the habits of lots of traffic, you can sell that data to advertising companies so they can better determine what people want
or how to make people want what they got
they don't collect data, nor do they sell anything other than their ai generative services. perhaps, you might do a little research, before you make posts like this.
Yo fam. Welcome on board
The community is growing rapidly. At some point i thought they engaged platforms like forcefi.io
Oh! You mean that matchmaking platform that helps upcoming projects succeed? But the contrast here is stable diffusion is not a new project lol
That is true tho, am just saying
I know right, maybe they run ads using Adpod to get the massive traction, who knows
That is very likely, I gathered they use AI and on chain data to helps users reach their target audience
Is there anything locally comparable to runwayml?
Good site, but looking for local.
ya sorry, guess i should have added you can use Mochi 1 locally
yeah mochi-1 is local (apache 2.0 license), and you can see it's better than gen3, luma, kling1 in the arena. But if you want to run it at full quality (fp16), you need a 40gb vram gpu and it will take a pretty long time.
You can use quantized versions like q8/fp8 or nf4 to fit in 12gb vram but that will be slow too and worse quality than fp16.
Crazy how mochi is in 2nd place and their discord is so dead
yep, I think right now for local, cogvideox is slightly more popular since the new 1.5 model came out, and 1.0 (1.5 has some of those things too) just has so many things (lora, controlnet, trajectory control, img2vid, different res generation, higher res generation, orbitX loras for 3d/4d) and it's much faster too. But pure text-to-video is clearly far worse than mochi.
Do they have a 24GB model 😄
But the million dollar question: does cog video run on an 8gb GPU? Lol
I can tell you mochi does
yeah it can fit in 24gb vram, q8 can fit
I’ll take really crappy quality as long as 8gb can fit
It runs on 2x less vram than mochi, Mochi is a 10b dit while CogVideoX is a 5b dit. It can fit in 8gb vram as well.
Last question, is it that much worse than fp16 to even use it?
Mostly depends on how long you can wait, but q8 won't be too much worse than fp16. fp8 is going to be considerably faster but lose more quality than q8. To get full quality, you will have to do 200 steps with the model, which will take very long even on a 4090 or something.
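A back-of-the-envelope way to see where those numbers come from: weight memory is roughly parameter count times bytes per parameter. This sketch covers the transformer weights only. The bytes-per-parameter values are idealized (real quantized formats carry some overhead for scales), and the text encoder, VAE and activations come on top, so treat the results as lower bounds; offloading can also let a model run in less VRAM than this suggests.

```python
# Idealized storage cost per parameter for each precision (assumed round values).
BYTES_PER_PARAM = {"fp16": 2.0, "q8": 1.0, "fp8": 1.0, "nf4": 0.5}


def weight_gb(params_billion: float, precision: str) -> float:
    """Approximate weight size in GB: billions of params x bytes per param."""
    return params_billion * BYTES_PER_PARAM[precision]


for name, size in (("Mochi (10b)", 10), ("CogVideoX (5b)", 5)):
    for precision in ("fp16", "q8", "nf4"):
        print(f"{name} {precision}: ~{weight_gb(size, precision):.1f} GB of weights")
```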
i have a question about some code chatgpt wrote: how would i make sure sd 3.5 runs on my gpu and not my cpu? I am not that familiar with python, it's run from the command line
import os
import subprocess
# Path to the folder where the virtual environment is located
folder_path = r"D:\pass\sd3.5"
# Change the current working directory to the specified folder
os.chdir(folder_path)
# Print the current working directory to confirm
print("Current working directory:", os.getcwd())
# Activating the venv in its own subprocess would not carry over to the next call,
# so use the virtual environment's Python interpreter directly instead
venv_python = os.path.join(folder_path, "Scripts", "python.exe")
# Now, run the desired Python script (sd3_infer.py) with the given prompt, model, and GPU support
# Ensure sd3_infer.py is set up to use GPU by setting device='cuda'
command = [venv_python, 'sd3_infer.py', '--prompt', 'cute wallpaper art of a cat', '--model', 'models/sd3.5_medium.safetensors', '--device', 'cuda']
# Run the script with the virtual environment's interpreter
subprocess.run(command, check=True)
Probably not a good idea to use chatgpt for that, sd3.5 is new and chatgpt doesn't have knowledge about it. First off, how much vram do you have and what gpu?
im not aiming to create high res pictures only quick ones with pretty low steps but a 1070 8gb
i have run fooocus on it multiple times and it works ok takes time but still delivers
Then, maybe sdxl lightning is better. Its faster and generates decent imgs. It doesn't have too great prompt following or text in images though.
that's just a script that runs another script. to know if that will run on your gpu or not you must reveal the code for sd3_infer.py
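In the meantime, a quick check you can run inside the same virtual environment to see whether PyTorch can see the GPU at all. `pick_device` is just a hypothetical helper name; `torch.cuda.is_available()` is the real call. If this prints "cpu" on a machine with an NVIDIA card, the installed torch build is likely CPU-only, and no flag passed to sd3_infer.py will change that.

```python
def pick_device() -> str:
    """Report which device PyTorch would be able to use in this environment."""
    try:
        import torch  # imported lazily so the check also works where torch is missing
    except ImportError:
        return "torch not installed"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```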
there are 2b cogvideo also
Ok but does sdxl lightning have an API where i can run it using python?
i do like the idea of cog video + faster than mochi + img2video. I think the only other question I'd have about it is, have you installed it? does it work in ComfyUI without installing custom nodes?
yes the API/library you would use to run sdxl locally would be torch and safetensors, you ask chatgpt to write you code that will use those APIs to generate images using SDXL and it will do that
I tried using stable diffusion 1.4 in python and it looked really bad is sdxl lightning better than that?
yes far better, sd1.4 is really outdated now. Even base sd1.5 is outdated, wait let me give you python code to run sdxl lightning
thanks :D
@median jewel read this: https://chatgpt.com/share/673bae3e-b468-800f-b9c5-c289f6964c8c
first you have to install diffusers, transformers and accelerate (plus torch, if you don't have it yet)
in console you can do this by
pip install diffusers transformers accelerate
if you are using a jupyter notebook like colab/kaggle, then just add ! to the beginning of the installation above
This is code to run an sdxl lightning model, dreamshaper is pretty good in general so it might be a good choice.
## import the libraries(instant)
from diffusers import AutoPipelineForText2Image, DPMSolverMultistepScheduler
import torch
## load the model to cuda(should download the model automatically, time depends on your download speed)
pipe = AutoPipelineForText2Image.from_pretrained('lykon/dreamshaper-xl-lightning', torch_dtype=torch.float16, variant="fp16")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")
## inference time(should take a few seconds or so)
prompt = "portrait photo of muscular bearded guy in a worn mech suit, light bokeh, intricate, steel metal, elegant, sharp focus, soft lighting, vibrant colors"
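# fixed seed so the same prompt gives the same image; pass it to the pipeline so it is actually used
generator = torch.manual_seed(0)
image = pipe(prompt, num_inference_steps=4, guidance_scale=2, generator=generator).images[0]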
image.save("./image.png")
dang you beat me in terms of lines @quartz siren check out that chatgpt link too, even after specifically asking it for a super short script it made it all drawn out when you did it in like less than 15 lines
Where are you guys getting all your info on SD? I didn't even know there was a 3.5
404 Not Found :C
to be fair I did tell it i wanted to load it from a safetensors file
I happened to stumble on it when i searched for stable diff on hugging face
ok thanks !! ill check it out
This is just starting your venv and starting the inference py
works for me, just use what @cedar salmon posted that seems a lot easier to set up
ya lol i sort text 2 image by created date on huggingface
wrong click, not me!
yeah did you mean @quartz siren @tawdry cape
Does anyone here use Roop Unleashed? Why does it take hours to process a 15 second faceswap video?
@cedar salmon did you see the code chatgpt wrote? it didn't even have cuda support. Your example is a lot cleaner and easier to set up, you don't even have to manually download anything
what if i wanted to change a detail like add orange hair to the code you provided would sdxl lightning find it difficult and add orange to other spots?
i think that's one of those things you just have to try it yourself and see
just change the prompt to anything you want. only the part of the code under ## inference time(should take a few seconds or so) is the real inference code; once you load the model and everything, that's the only code you need to run,
if you want orange hair, just change the prompt like this.
prompt = "photo of a man with orange hair, in a green living room"
image = pipe(prompt, num_inference_steps=4, guidance_scale=2).images[0]
image.save("./image.png")
Mm thanks :D
don't need a discord at all to go up the leader board
lol yeah i know but it's weird how they're being featured on CivitAI, a huge site, plus they're number 2 on the leaderboards, and the discord is dead, i guess I'm just out of touch, id expect to see all the generative nerds like me to be on there figuring stuff out haha
Hey friends is anyone planning to attend NeurIPS this year? If so, what is the best way to learn about what side events are going on?
hello people, im new in this place, How is AI used within a virtual machine?
same way as it's used in a physical machine
?
please can anyone create a wallpaper for me?
Isn’t it more satisfying to type in the prompt yourself and see what you come up with?
dang dang dang
I'm using automatic build, with rx 580, and downgraded torch to 1.2, and i need help, i'm downloaded nsfw model, and I have a photo of one girl, and we need to use her image and face to nsfw model
and in one model there is a nsfw and a non-nsfw image, how to indicate to use nsfw with replacement
and the processing lasts about 700 seconds, is that good?
Hello, I am a newcomer and I would like to know how I can use AI to generate photos
I can write direct requirements in trigger words
like "draw a nsfw model and use the face and image of the girl I gave you?"
Hello! which is the best GUI for Stable Diffusion? i;m new into AI stuff..
Hey, you won't get help for nsfw stuff here as its against the rules.
Also 700 seconds is very long.
Do you use DirectML or did you use --skip-torch-cuda-test?
Hey, for starters Auto1111 or Forge webui are very beginner friendly
I think the webui forge and automatic 1111 are the best for the stable diffusion
lol, same opinion
But I prefer the webui forge because it can support the FLUX model
and better ones? even if not for beginners?
I think webui forge
so ComfyUI is not a better option?
There are no better ones as it depends on the use case and the User.
Some people use Comfyui, some SwarmUI but all started with Auto1111 and i like it and forge more than Comfyui
To get started I would try Fooocus. While not supporting flux it helped me understand all the things happening in the background after that I moved on to using comfyui and creating workflows
ComfyUI is a little difficult for a beginner. Do you have any experience with Stable Diffusion?
i just play with comfyui a couple months ago..
I started in A1111 too
Its not as User friendly as the others.
But you have more control over the workflow.
What's your GPU?
1070Ti strix 8Gb + i7 8700k + 32GB Ram
Anyway, it is up on you but I prefer the forge and if I need some workflow, I built the custom python scripts myself.
Then I would go with Forge webui
It gives your card the best performance
yup! agree!!!
I saw a guy on the internet. i really like his work and i want to make wallpapers like that for myself. can i share a link here? idk what he uses.
Also its not one or the other. You can have every webui installed and share the models between them.
So you can test them all if you like
If it is okay for this server's rule, please share
Search " Brecht Corbeel " on facebook. idk what exactly he use and i really want to make images like those.
If possible, please share the image here
are you sure Forge is faster for that card?
do you happen to know why, if that is the case?
i send 2 images
Yep I'm sure as I have a 1080 in my old PC and getting 2it/s in Auto1111 with the same settings and 3it/s on Forge
Forge has memory optimisation and other performance stuff already included. That's why it doesn't need custom commandline args
I thought you share an image of the webui he uses xD
But you can get images like that with every webui
so you can't know by the image what exactly he uses? i mean what type of AI and stuff like that..
you will be able to match whatever speed you get on forge with comfy
this card does not have native FP8 matmul so the best you can do for this card is fitting the entire GGUF DiT in VRAM
which comfy can do
Only if the image contains the meta data. But users and websites can remove this
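For what it's worth, checking for that metadata can be scripted. A minimal sketch, assuming the image is a PNG and the UI wrote its settings into plain tEXt chunks (A1111-style UIs commonly use a `parameters` key, but treat that key name as an assumption). `read_png_text_chunks` is a made-up helper, not part of any library, and it ignores compressed zTXt/iTXt chunks:

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def read_png_text_chunks(data: bytes) -> dict[str, str]:
    """Return all uncompressed tEXt key/value pairs found in a PNG byte string."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    found = {}
    pos = len(PNG_SIGNATURE)
    # Each chunk is: 4-byte length, 4-byte type, data, 4-byte CRC (CRC is not checked here).
    while pos + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            key, _, value = body.partition(b"\x00")
            found[key.decode("latin-1")] = value.decode("latin-1")
        if chunk_type == b"IEND":
            break
        pos += 12 + length
    return found

# Usage (hypothetical file name):
# with open("image.png", "rb") as f:
#     print(read_png_text_chunks(f.read()).get("parameters", "no metadata found"))
```

An empty result only means no text metadata survived; it says nothing about which tool made the image.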
i mean just by your eye :))
Not possible because the output doesn't include special stuff a webui could do.
Every webui or python scripts can generate images like these. With the right model and settings of course
Would say try out Forge and look for good models on Civitai.com
True. But I dont like the node based layout. Also I'm on AMD with my new PC
Forge also supports gguf which is nice
I use Nunchaku personally, these days
its this thing https://github.com/mit-han-lab/nunchaku
We have developed a software in which if you give a news topic to it , it will checks periodically and when any new news arrives it Automatically creates a Image post and a full post article of that news using ai and Email you.
Like you have dubai real estate business.
For a social media daily post on trending topics and news is essential.
So if you set in the software, whenever any news related to dubai real estate come it automatically create post and a post article and delivered to you.
It can be used for various platforms like Instagram, facebook, linkedin, pinterest, twitter, Website Blog ,etc etc
And specially on LinkedIn and Twitter, Posting trending News can grow fast.
Do You Want This Service?
Or Can you help me to find clients?
Thanks
waiting for your reply
there's a #1092446741984444416 for this sort of stuff.
What is the difference between
stable-diffusion-3.5-large and
stable-diffusion-3.5-large-gguf ?
I'm not sure what the GGUF means. If I'm on a Macbook Pro M4, should I download the GGUF and use it as my base model in comfyUI? I'm still learning all of this, apologies if it's a noobish question
think of it as stable-diffusion-3.5-large (one bite) and stable-diffusion-3.5-large-gguf (little chunks) for your gpu to digest. obviously the first one is much faster if you have a GPU that can hold it, but GGUF format allows slightly smaller systems to take it in chunks, loading and offloading
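A rough way to see why the GGUF files are smaller: as far as I understand, they usually hold quantized weights, so each weight takes fewer bits. A back-of-the-envelope sketch, assuming the common ggml block layouts (Q8_0: 32 one-byte weights plus a 2-byte scale, Q4_0: 32 four-bit weights plus a 2-byte scale) and a round 8 billion parameters for SD 3.5 Large. The helper names are hypothetical and the numbers cover the main model only, not the text encoders:

```python
def bits_per_weight(block_bytes: int, weights_per_block: int) -> float:
    """Average storage cost of one weight for a block-quantized format."""
    return block_bytes * 8 / weights_per_block

def file_size_gb(n_params: float, bpw: float) -> float:
    """Approximate size in decimal GB for n_params weights at bpw bits each."""
    return n_params * bpw / 8 / 1e9

N_PARAMS = 8e9                  # assumed round figure for SD 3.5 Large
Q8_0 = bits_per_weight(34, 32)  # 32 bytes of weights + 2-byte scale
Q4_0 = bits_per_weight(18, 32)  # 16 bytes of weights + 2-byte scale

for name, bpw in [("fp16", 16.0), ("Q8_0", Q8_0), ("Q4_0", Q4_0)]:
    print(f"{name}: {bpw} bits/weight, about {file_size_gb(N_PARAMS, bpw):.1f} GB")
```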
okay I see, that helps. thank you
my hot take is i dont think forge is friendly, im bad at filling out forms and i have no idea whats going on, comfy solved that for me
gguf was designed more for linux/windows types so im not sure how it plays with mac, but mac is p powerful.
this is kinda opposite to most people 😄 do you by any chance play music?
comfy is very similar to music DAWs
not well, to me i think its about seeing whats going on
well i agree its a better system, gradio is honestly only a thing made for nerds with no real programming chops to slap together prototypes, im suprised a1111/forge have lasted. but forge is very good at optimising things under the hood, if you excuse the forms
(gradio is the form filling out thing, its like lego for devs who just want to hobble together a proof of concept)
ya cheat code to not have to program the full UI, i know its good, just not really for my brain
well, its a few years on now with people using AI, i agree its time to round the edges off. the only problem is, its still the wild west for people just bundling up someones github and making a pretty css and saying "give money look i made i thing". comfy isnt like that though. i looked into comfy for something else and yeah it turns out that litegraph which it was based on was an audio thing. figures, its a great visual method for describing a manipulation of something in a workflow
i'd say comfy sits nicely between user friendly and not so user friendly its some crap snapchat filter someone was duped into paying for
plus its the most actively worked on of the "big 3"
the new all in one installer they are buttoning up will help many
ruuuned diffusion is a good one too, as is swarmui. theyre not overly complicated but still have good stuff under the hood
yeah people have been asking support about that in other servers. i think an issue is people who already have stuff running arent going to install a helper. which means they'll lose that tech support for a bit.
i know this discord has a few scalpers looking for sidegigs, installing that and helping other people install it might be a decent gig, i know most settled users of AI stuff cant be bothered so thats a wee niche if youre looking
not pointed at you kagi, i mean the 100 or so people who would charge to help you install a1111 or whatever 😄
In forge webui, is there a way to take like a headshot and make it into a full picture?
wghat
FYI to my question earlier or if anyone else was wondering
what is your last question? lol
It was a while back asking if there was any IP adapters or control nets for SD3.5
Hi, is there an area in SD discord oriented to 3D generation?
pretty rare subject on this discord, ide guess animations
My phone screen was damaged badly yesterday and it is no longer connected. There is nothing I can use to touch or see what is on my screen. There is a gmail account I had on that phone and was used nowhere else. Google wont let me reset the password unless I reset it from my phone, which again, I cant because of the screen issue. Google wont let me reset it normally because I didn't add a phone number or a backup email. I dont know of anywhere to ask so I figured some smart person could help me here. I have tried scrcpy, Vysor, AnyMicro, and Wondershare. If anyone can help me or point me in the right direction, that would be greatly appreciated
Whats up everyone i got a budget of $2500ish and looking to get into a gaming pc, any recommendations for what to get,where to buy?send me a dm!
Hi, anyone know to make a consistent character throughout a story(in this case a picture book) based of a reference character/picture? I've tried for quite a while but couldn't get it to be consistent if there are many images and scenarios generated
Are there any ai models that can replace substance designer?
Yes. I think we can use the ipadapter or img2img pipeline and any model is possible but I recommend you FLUX because this model is good for understanding the user prompts and generate the high quality image
In Forge when i try to "Load from" available extensions it gives me errors.
Can you share more info in #🤝|tech-support ?
Please share more detail about that, I can help you about forge
upd, after adding the directml option, i get a bad output
and 8 GB of vram is loaded
Make sure you also include --opt-sub-quad-attention --opt-split-attention --medvram --no-half-vae
DM
Does anyone happen to have any idea if I’d be able to do Ai video on my 4070ti 16gb super Ai gpu and if so would it be worth trying it or would it take forever to do stuff?
I think it is impossible on a 16GB GPU. The smallest GPU required is 24GB.
That’s a shame
I think you can use a cloud GPU service like runpod or vast. they offer cheap GPU servers. a 24GB GPU is $0.4~0.6/hr
Previously it was impossible, but now many models fit in as low as 8gb VRAM.
but to generate AI video, 16GB is impossible. almost all models require more than 24GB
CogVideoX requires just 9gb(can be further lowered with quantization) : https://huggingface.co/THUDM/CogVideoX1.5-5B-I2V
mochi requires 24gb vram to fit properly but you can run it in 8gb vram by using shared ram(will make it very slow though)
Did you test this?
CogVideoX one, yes but I have 16gb vram., @tawdry cape tested mochi
Wow, interesting. I'll try. thanks
If anyone's interested, Seb Kamph's discord is having an image competition 100th for the next two weeks, 5k in civit buzz for the winner. I'll not link, cause if youre that interested you'll find the server 😄
Hello, not sure if this is the best place to ask this. But i built a basic diffusers pipeline and its working with our model and the Lora we are using. But we are having a hard time matching up parameters from A1111 to get similar results. Does anyone have any experience with this stuff?
That’s definitely interesting, will it be insanely slow tho?
What are the sites for those?
Yes 75mins lol, there are nf4 versions which could bring it down to like 30 mins but still pretty long.
75 mins for how long video?
Would it be even worth it?
1sec
I would much more recommend cogvideox. There is just a new Lora that also improves quality now, and has so much more control. Pure text2vid is worse but with the control, you can get better vids than mochi. Will be far faster too, 15-20mins for a 6 sec video.
15-20 mins for 6 seconds is still pretty slow but way better.
Which spec gpu is the 15-20 mins?
An 8gb vram gpu is pretty slow, your 16gb gpu should be much faster. 5-10 mins?
That would be better yeah, It’s a 4070ti super 16gb Ai
The one in my laptop however is only 4050? 6gb
Probably should take around 5 mins, you can use lcm 8steps in just 1-2mins but that will be a bit worse quality.
Could be worth trying then?
I hope the egpu enclosure doesn’t slow the 4070 down too much it’s only thunderbolt 3 and for until I can manage to afford to build a pc to put it in.
Yeah it’s a good idea, I would recommend using kijai’s nodes, he makes really good comfyui nodes for models with lots of optimizations.
I use a1111 or whatever atm, I looked at the one you said but it looks insanely confusing and complicated
Those are just the updates, you don’t really need to care about them. It’s not as complicated as it looks. There are many tutorials in YouTube, you can search them up, just search cogvideox comfyui.
I meant Comfy UI in general looks confusing and insanely complex
A1111 or whatever I think I mostly understand that.
Yeah I understand, same for me, so I try to use diffusers when I can. A1111 is kinda outdated now, doesn’t even support many of the new image gen models I believe.
For diffusers: https://huggingface.co/THUDM/CogVideoX1.5-5B-I2V
Damn is it, I used an easy install version, I don’t understand all those command based installs, they always seem to fail to install properly no matter what I do.
It’s actually surprisingly easy, assuming you have python installed
You can first just do
pip install diffusers accelerate transformers sentencepiece imageio-ffmpeg
And then go to python and do
import torch
from diffusers import CogVideoXImageToVideoPipeline
from diffusers.utils import export_to_video, load_image
prompt = "A little girl is riding a bicycle at high speed. Focused, detailed, realistic."
image = load_image(image="input.jpg")
pipe = CogVideoXImageToVideoPipeline.from_pretrained(
"THUDM/CogVideoX1.5-5B-I2V",
torch_dtype=torch.bfloat16
)
pipe.enable_sequential_cpu_offload()
pipe.vae.enable_tiling()
pipe.vae.enable_slicing()
Inference code is
video = pipe(
prompt=prompt,
image=image,
num_videos_per_prompt=1,
num_inference_steps=50,
num_frames=81,
guidance_scale=6,
generator=torch.Generator(device="cuda").manual_seed(42),
).frames[0]
export_to_video(video, "output.mp4", fps=8)
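Side note on what those numbers give you: the clip length is just the frame count divided by the playback fps. A tiny sketch of that arithmetic using the values from the snippet above (`clip_seconds` is a hypothetical helper, not a diffusers function):

```python
def clip_seconds(num_frames: int, fps: float) -> float:
    """Playback length of a clip: number of frames divided by frames per second."""
    return num_frames / fps

# num_frames=81 and fps=8 are the values used in the snippet above.
print(clip_seconds(81, 8))   # 10.125
# The same frames exported at a higher fps play back faster and shorter.
print(clip_seconds(81, 16))  # 5.0625
```

So changing fps in export_to_video changes the playback speed and length, not how long generation takes; that depends on num_frames and num_inference_steps.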
I don’t know even on the one I install it says something has an update and to go into python.exe and put said command in to update it and it always says the command doesn’t exist
And sometimes if I install extensions it totally breaks the whole thing and I have to make a copy of my backup copy of the whole folder for it and use that instead which doesn’t make sense to me and is irritating and confusing.
Mostly deleting the extension and then the venv folder can fix all issues
Really?
Because old not updated extensions, break the venv folder. And by deleting both the venv gets recreated clean
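If you'd rather script that cleanup than do it by hand, here is a minimal sketch. It assumes the usual A1111-style layout with an `extensions` folder and a `venv` folder directly inside the webui folder; `reset_webui_env` and the example paths are made up for illustration. The webui should rebuild the venv on its next launch, which takes a while because packages are downloaded again:

```python
import shutil
from pathlib import Path

def reset_webui_env(webui_dir: Path, extension_name: str) -> list[Path]:
    """Delete one extension folder and the venv folder; return what was removed."""
    targets = [
        webui_dir / "extensions" / extension_name,  # the broken extension
        webui_dir / "venv",                         # recreated on next launch
    ]
    removed = []
    for target in targets:
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(target)
    return removed

# Usage (hypothetical paths, close the webui first):
# reset_webui_env(Path(r"C:\stable-diffusion-webui"), "some-old-extension")
```

Models and outputs live in other folders, so they are left alone.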
Oh ok thanks I’ll try remember that next time
When using Hyper-SD on fooocus i get a really good result back in 30 seconds, what type are they using? Cause I have tried https://huggingface.co/ByteDance/Hyper-SD But it doesnt give me nearly as good a result or takes forever. Does anyone know?
its not that easy to use hyper
the range of acceptable settings for everything narrows at lower steps
For me thats ok, I am not looking for perfect result. So if i were to say in prompt of fooocus and using hyper-SD "portrait photo of muscular bearded guy in a worn mech suit, light bokeh, intricate, steel metal, elegant, sharp focus, soft lighting, vibrant colors" it will give me a really good result for what i am looking for, but using the link and the code there it either generates really fast with a terrible outcome or is extremely slow and loses the whole point of using hyper-sd
I think try to use a normal model for now
later you can go back and find out what was causing the problem in the hyper workflow
in theory it is merely a case of selecting the right hyper lora for your step count, and setting the CFG that the huggingface page tells you
but if that isn't working then there is likely something else problematic in the workflow
Its prob a good idea but what i have noticed is that when i am running flux1 my ram peaks and then the code stops because it runs out of ram, but when i use fooocus it stays at a decent level and my graphics card just works a lot. When using SD 1.5 and using the code they have on the website it is going really fast but produces bad results, but could i theoretically tweak the settings to get a much better result, is that what you are saying?
flux is huge, but you should be loading it into your vram, not using your ram.
should i just enable cuda in this?"
import torch
from diffusers import FluxPipeline
from huggingface_hub import hf_hub_download
base_model_id = "black-forest-labs/FLUX.1-dev"
repo_name = "ByteDance/Hyper-SD"
# Take 8-steps lora as an example
ckpt_name = "Hyper-FLUX.1-dev-8steps-lora.safetensors"
# Load model, please fill in your access tokens since FLUX.1-dev repo is a gated model.
pipe = FluxPipeline.from_pretrained(base_model_id, token="xxx")
pipe.load_lora_weights(hf_hub_download(repo_name, ckpt_name))
pipe.fuse_lora(lora_scale=0.125)
pipe.to("cuda", dtype=torch.float16)
image=pipe(prompt="a photo of a cat", num_inference_steps=8, guidance_scale=3.5).images[0]
image.save("output.png")
"
yes ram
is this windows
yes
okay, no. you want your GPU to do the work, and that means using your VRAM, not your system ram. that's what your CPU uses. and you really don't want a cpu with an integrated gpu
what hardware do you have?
I said this earlier on other channel: there is an issue that affects windows but not linux where loading a .safetensors file causes a big spike of ram usage which is around 2-3x the size of the file
GTX 1070 gpu
i7 4.2 GHz cpu
48 gb ram 2133 MHz
so not up to date hardware but i can run fooocus pretty ok
seems more likely that what it was, was that you filled your vram and then the ram spike was the data spilling over
no it doesnt even affect my gpu at all if i look in task manager
my ram just goes bananas and gpu is at 10%
what was the vram usage amount at, when the problem happened
Flux will not fit in 8gb vram normally, it requires 24gb vram at the least. You need to use quantization which can allow flux to fit in 8gb vram, let me get the code to use quantization with diffusers.
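The 24gb figure lines up with simple arithmetic on the weights alone. A rough sketch, assuming FLUX.1-dev's transformer has about 12 billion parameters and counting only its weights (text encoders, VAE and activations come on top, so real usage is higher); the helper names are hypothetical:

```python
def weights_gb(n_params: float, bits_per_weight: float) -> float:
    """Memory needed for the weights alone, in decimal GB."""
    return n_params * bits_per_weight / 8 / 1e9

def fits_in_vram(n_params: float, bits_per_weight: float, vram_gb: float) -> bool:
    """True if the weights alone fit; leaves no headroom for anything else."""
    return weights_gb(n_params, bits_per_weight) <= vram_gb

FLUX_PARAMS = 12e9  # assumed parameter count for the FLUX.1-dev transformer

for bits in (16, 8, 4):
    size = weights_gb(FLUX_PARAMS, bits)
    print(f"{bits}-bit: {size:.0f} GB, fits in 8 GB: {fits_in_vram(FLUX_PARAMS, bits, 8)}")
```

By this estimate only the 4-bit case leaves room on an 8 GB card, which is why quantization (or offloading to system RAM, at a big speed cost) is needed.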
the VRAM is 3.6/8 GB
Does anyone know of a stable diffusion model that creates isometric tilesets?
sorry for being a noob at this
Also, flux will take a pretty long time on a 8gb gpu, are you sure you want to use that? You can use hyper/lightning sdxl as well, should give you nice images(considerably worse prompt following, human anatomy, and text) but still decent imgs really fast(just several seconds).
Well im just doing trial and error and trying to find how fooocus can generate that good a result in under 20 seconds, and all the time I am trying all of them and its either shit or takes forever. But would lightning SDXL be a good choice for generating under 20 seconds with 4 iterations?
4steps or 4imgs? I don't know the exact numbers on how fast it will be on your device, but to generate a single img in 4steps, it should take a few seconds. 4imgs should be complete before 20 seconds.
Fooocus uses sdxl under the hood as well, flux will be much better but much slower as well.
Ah ok so using SDXL is a better choice on my device
the code you provided two days ago, that was lightning SDXL right?
Yeah most likely, instead of hyper SDXL models, I would recommend lightning SDXL or DMD2 SDXL. Both usually give better imgs than the hyper variant.
I think I did give some sdxl lightning code a few days ago.
with dreamshaper I believe, thats a pretty nice general model.
but what i dont understand is when i use this code my gpu doesnt really work, is that common on windows?
## import the libraries(instant)
from diffusers import AutoPipelineForText2Image, DPMSolverMultistepScheduler
import torch
## load the model to cuda(should download the model automatically, time depends on your download speed)
pipe = AutoPipelineForText2Image.from_pretrained('lykon/dreamshaper-xl-lightning', torch_dtype=torch.float16, variant="fp16")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")
## inference time(should take a few seconds or so)
prompt = "portrait photo of muscular bearded guy in a worn mech suit, light bokeh, intricate, steel metal, elegant, sharp focus, soft lighting, vibrant colors"
generator = torch.manual_seed(0)
image = pipe(prompt, height=768, width=768, num_inference_steps=3, guidance_scale=2, generator=generator).images[0]
image.save("./image4.png")
and because of that it takes 3-4 minutes to generate a picture
Oh, yeah that uses like 12gb vram so its spilling to shared ram(slowing it heavily), this code should work on 8gb.
## import the libraries(instant)
from diffusers import AutoPipelineForText2Image, DPMSolverMultistepScheduler
import torch
## load the model to cuda(should download the model automatically, time depends on your download speed)
pipe = AutoPipelineForText2Image.from_pretrained('lykon/dreamshaper-xl-lightning', torch_dtype=torch.float16, variant="fp16")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
## some optimizations to lower vram usage
pipe.enable_model_cpu_offload()
pipe.enable_vae_tiling()
pipe.enable_vae_slicing()
## inference time(should take a few seconds or so)
prompt = "portrait photo of muscular bearded guy in a worn mech suit, light bokeh, intricate, steel metal, elegant, sharp focus, soft lighting, vibrant colors"
generator = torch.manual_seed(0)
image = pipe(prompt, height=768, width=768, num_inference_steps=3, guidance_scale=2, generator=generator).images[0]
image.save("./image4.png")
i have a lot of pictures of muscular bearded men now lol
its possible to do flux dev on 8GB but it needs careful managing
is a lot easier to fit SDXL in VRAM so that's likely a better choice for now
my gpu still doesnt do anything, is it because there is a cpu offload?
first delete any other pipe or anything else thats using vram
yeah that was it now it generates much faster, thanks for the help
(and dont' run WoW at the same time on the same computer...)
I have seen videos of people drawing on a website and an ai trying to depict whats on the screen and it updates really fast how is that possible. Is it just constanly running and doesnt need to reboot?
They usually have more powerful gpus and use either sd1.5 models or 1step models.
Anyone have a workflow or suggest one where I can take a character image and recreate it / make new images with it?
it's one step technology. you can do that on the playstation too, martin does that and posts his explorations out to twitter
consistency model is often the one they try these days
for the one step of euler dream
it used to be GANs
SD1.5/SDXL hyper and loras like that can do one step but not so well
Anyone know why this is happening? Every so often, an image gen will take 10+ minutes. I see little use of GPU while looking at task manager. Then next generation, it's back to normal speed. It often happens if I switch something in a controlnet, like choose a new image for openpose or face ID.
it's probably loading stuff into vram
Sure, but it doesn't only happen the first time I generate with certain things selected...I do multi images for IP Adapter, for example. Sometimes it will gen 50 fast, and then suddenly the next one takes 10+ minutes
Just finished...
10:53 to 11:28...half hour to generate an image
next one 10s
9s next one, 10s next one...
now added a new face for FaceID...going to take looks like 10+ minutes
only 9 minutes 37 seconds
this only started happening about 3 days ago
before then, I could change the images...no long generate times
what all did you do 3 days ago, and do you have any errors in your system logs
I looked for errors...don't see anything
Maybe something else updated something that affects A1111. I'll try installing it in pinnokio...supposedly that isolates everything
Pinokio is just an installer
The isolation comes from the webuis venv
I see
well, I have abandoned A1111 for now. Installed SwarmUI...cant figure out how to get IP Adapter installed
tried it on SDNext, but the output was horrific with ipadapter face plus
Got it installed, but when using there's an error now:
File "C:\Swarm\SwarmUI\src\BuiltinExtensions\ComfyUIBackend\DLNodes\ComfyUI_IPAdapter_plus\utils.py", line 161, in insightface_loader
02:45:30.663 [Warning] [ComfyUI-0/STDERR] raise Exception(e)
02:45:30.664 [Warning] [ComfyUI-0/STDERR] Exception: Unable to import dependency onnxruntime.
that error is when I try FaceID PLUS V2
When I try FaceID, another error: File "C:\Swarm\SwarmUI\src\BuiltinExtensions\ComfyUIBackend\DLNodes\ComfyUI_IPAdapter_plus\IPAdapterPlus.py", line 254, in ipadapter_execute
02:48:31.300 [Warning] [ComfyUI-0/STDERR] raise Exception("insightface model is required for FaceID models")
02:48:31.302 [Warning] [ComfyUI-0/STDERR] Exception: insightface model is required for FaceID models
Yea for comfy you need to install the face id whl file manually into the venv
the what into the where?
Iam using swarmUI, not comfy
I see no venv dir in the swarm dir
SwarmUI is just ComfyUi with an UI on top
Hello 👋
Ok, so where do I get all the files needed, and where do I place them?
I dont have time right now to explain it sry
Why cant this be listed in the install instructions?
"Simply click this button, and accept the confirmation prompt, to install IP-Adapter to your ComfyUI backend." Seems like it's not so simple...something else has to be done
I'm looking for a tiny model like SSD-1B that can be used in Google Colab (I'm an IT teacher, my school doesn't like installing anything, it's complicated 😅 ). I'm not seeing a newer version. If you know any tiny model available on Hugging Face, that would be nice ❤️
Hey, you can use any 1.5 model that is 2-4gb or any sdxl model which is 6gb
Or do you mean a already cloud hosted one?
Oh no I would like to use the basic T4 on Google Colab (16 VRAM)
Currently I'm using this model:
from diffusers import StableDiffusionXLPipeline
import matplotlib.pyplot as plt
import torch
pipe_img = StableDiffusionXLPipeline.from_pretrained("segmind/SSD-1B", torch_dtype=torch.float16, use_safetensors=True, variant="fp16")
pipe_img.to("cuda")
It's not really recent (1 year ago)
there are no videos on it, no clear explanation in the install guide, tried what it showed in issues but still no luck.
So you want an sdxl based one then, or can 3.5 medium work too?
I'm not sure if I can run 3.5 ... :S maybe this one: https://huggingface.co/Comfy-Org/stable-diffusion-3.5-fp8
And does your colab support GGUF models?
Yes 🙂
12 hours and counting trying to get this to work
Then here is 3.5 medium gguf:
https://huggingface.co/calcuis/sd3.5-medium-gguf/tree/main
But 3.5 as well as flux need additional text encoders like t5 gguf, clip l and clip g
What do you want to do?
I can try it 🙂 thank you
Get IP Adapter working in some UI. I had been using it in A1111, but suddenly that was taking 30 minutes to generate images. So I tried in SDNExt, but the output was awful. So I then moved to Swarm - but can't get Face Plus V2 or FaceID to work.
I can help to get it working on Auto1111
For swarm and comfyui you need to find a face id WHL file
Yes, but I don't know what those are, where to find them, where to install them
I have it working in A1111, but every few generations suddenly an image gen takes 30 minutes.
Can we move to #🤝|tech-support
There I can help you to not get the 30mins wait time. I'm sure we can find the cause
What does Controlnet tile actually do? Should it actually be called "deblur" or "add detail" controlnet? Is the tile part just the workflow?
Does this treat each tile as a separate image, or does the controlnet actually enforce some global coherence?
guys why the ticket support is asking for wallet address?
Its scam
Please report the username and let us know to ban them
It was a Discord invitation that appeared as soon as I asked you the gfx question, he asked to make a ticket in that discord server with a bot and I was contacted by one called "Admin (Don't DM first" and then a private individual who I reported was called like "samantha" and I don't remember the rest of the name
Yea these scammers are so many its insane
We ban them daily
yo peeps, what's a good model for generating environments? nature and city sceneries and stuff like that
Hey y'all, i remember to see some workflow where you could import a 3D model with some node and use it to influence the generation of images but i can't get my hand either on the custom nodes or workflow or examples that use it.
If any remember or know it, it would be appreciated 🙌
@warm junco i got a 4070 now
which sd should i use now
😅
as in
which is the best ui
Good evening everyone
Hello
Forge if you want to use flux
Or comfyui if you like node based stuff
is it better than A1111
It has flux and in a few weeks maybe sd3.5 support
But auto1111 is better in terms of extension compatibility
ah
but 1 question like
is there only 1 a1111 or different like 1.5 or sdxl
or those are models and i can use any with a1111
automatic 1111 opened all sd models
except the FLUX
I think so too
it is better than other closed source models
does it have restrictions like filters
What is the benefits of FLUX do you think?
You mean NSFW filter?
The FLUX has NSFW filter and until now, I didn't see the FLUX model without NSFW filter
I think at first, the SD had the NSFW filter and some one removed this filter and released
So I think one day, the FLUX model without NSFW filter released
flux can do nsfw out of the box its just not very well trained on it, if you want nsfw models just look on civit there's a ton of resources and models to make it more nsfw friendly
if anything i think sd3.5 has better nsfw training than flux out the box, but if you count it with loras flux is superior
its' not really a filter. It's just been trained to think nipples are smooth.
You throw the right lora at it though....
you mean the FLUX can support the NSFW? If I u se only lora, It can?
you dont know? go to civit and look at loras for flux. not even a search term. Just filter the list to flux and the top results are all porn
You need a lot of vram for flux and its slow
With SDXL/Pony You'll have the most fun
Cyberrealistic XL is better than pony if you're looking for particular results. IMO.
Pony is XL, just broken
It's not a base model. It's just a refine of XL that is misaligned, so it doesn't work as well with XL loras or controlnets
It got popular for the same reason Evony the mobile game got popular. Kids like cartoon boobies.
Yes
pony isn't a base model. if a ui supports sdxl, pony is just a merge/refine of that.
using pony requires clipskip 2 and special tokens in every prompt you've got
Hello All... Looking to join this group and learn more about creating and using AI models for social media
is it possible to use image prompt in SDXL lightning using python? I am trying to insert a photo and put him in a desert
what should i write in the webui.bat apart from --xformers if i have 12gb vram
You can add --no-half-vae
Nothing more needed for 12gb
oke
Hello
ofc, possible
where can i find information about this and how to do it?
DM
hello
Hello
hellO
have a tremendous friday. especially if you are in Korea (south)
if you are in north korea there it is always Monday.
Anyone know if it's better to generate the depth map along with the image, or afterwards in some other application?
Generate an IP three-view of AIDS-prevention
here
Hello
hello
does flux fill work with loras ?
Any idea how to get 2 IP Adapters in a comfy workflow? I want to do like A1111 has, get two face ID adapters in there, each one with a different weight, to blend the faces.
hello
does anyone know of any good stable diffusion models to use for architecture and archviz?
something that gives good, consistent results
I'm interested in video fine-tuning, idk if there's a server for that
you mean a video fine-tuning service? what do you mean by server in your question?
hey, this might not be the place to ask this, but i'm in urgent need of anything that can do the job: i have corporate headshots of my employees (wearing a suit, typical linkedin profile picture) and i need to convert those portraits into pixar-style profile pictures (it's for a project). as an intern i have to do this in the next 3 hours. somebody please guide me to where i can do this. feel free to ask me any questions for more context. ideally the solution should be free to use.
Discord server ..
How many people's images do you need to change?
a small team, maybe 10-ish photos.
how many images do you have for each person?
i have only one as of now, which needs to be converted into the pixar style. but i can surely ask for more. what do you have in mind?
I think you need to fine-tune the image generation model, and I have experience working on a similar project. if you're interested in more, please dm.
And actually, I think there is no free service to complete this work
Have you ever heard of the CogVideo model for video generation? It's the most recently released open-source t2v and i2v model, and it can be fine-tuned
Yes and it's ok
But i do want to have mochi or ltx finetunes instead
Willing to pay to make it possible
Dm please
does anyone have code for an image prompt in python using lykon/dreamshaper-xl-lightning?
Screen Out : the computer is man's best friend, until when ?
explain please
you'll want to set up and familiarize yourself with the diffusers library, if you intend to do stable diffusion generations through code. Diffusers is the best solution. https://huggingface.co/docs/diffusers/using-diffusers/conditional_image_generation
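for anyone following along, here's a minimal sketch of what that could look like with diffusers. It assumes "image prompt" means IP-Adapter (a reference photo guiding a text prompt), that the `h94/IP-Adapter` SDXL weights are what you want, and that the 4-step / low-guidance settings suit this Lightning checkpoint. The function names, the 0.6 scale and the file paths are made up for illustration, so treat it as a starting point rather than a tested recipe.

```python
def lightning_kwargs(prompt, steps=4, guidance=2.0):
    """Collect sampler settings. Few steps and low CFG are assumed
    here because Lightning-style checkpoints are distilled for that."""
    if steps < 1:
        raise ValueError("steps must be >= 1")
    return {
        "prompt": prompt,
        "num_inference_steps": steps,
        "guidance_scale": guidance,
    }


def generate_with_image_prompt(reference_path, prompt, ip_scale=0.6, out_path="out.png"):
    """Generate an image guided by a text prompt plus a reference photo."""
    # Heavy imports live inside the function so the helper above
    # can be used (and checked) on a machine without a GPU.
    import torch
    from diffusers import AutoPipelineForText2Image, DPMSolverSinglestepScheduler
    from diffusers.utils import load_image

    pipe = AutoPipelineForText2Image.from_pretrained(
        "Lykon/dreamshaper-xl-lightning",
        torch_dtype=torch.float16,
        variant="fp16",  # assumption: the repo ships fp16 weights
    )
    pipe.scheduler = DPMSolverSinglestepScheduler.from_config(pipe.scheduler.config)

    # Assumption: the general-purpose SDXL IP-Adapter weights.
    pipe.load_ip_adapter(
        "h94/IP-Adapter", subfolder="sdxl_models", weight_name="ip-adapter_sdxl.bin"
    )
    pipe.set_ip_adapter_scale(ip_scale)  # higher = closer to the reference photo
    pipe.to("cuda")

    reference = load_image(reference_path)
    image = pipe(ip_adapter_image=reference, **lightning_kwargs(prompt)).images[0]
    image.save(out_path)
    return out_path
```

usage would be something like `generate_with_image_prompt("person.jpg", "a man standing in a desert")`. If the face needs to match closely, a face-specific adapter or plain img2img may work better than this.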
Anyone know if James Cameron (StabilityAI board) has made any videos with the AI models yet? would love to watch what he comes up with 🙏 ❤️
Technically, Avatar and its sequel employ a lot of machine-learned systems
but no. not "ai video" as we know it
Ok thanks, hope they are cooking up a model like Sora and get him to make some cool short films or show episodes
(since he is on board I assume he gets early previews of everything from SVD to new sound models)
He probably won't use ai video models in Avatar 3. The guy is fanatical about control over his shots.
?
Hey lads, do you know a consistent way to get usable text using flux? Some of it is good then it just turns to gibberish.
how long is the text you're trying to get it to write? and how complicated are the words?
Because we can't post images in this chat, it's in #🧣|comfy-ui https://discordapp.com/channels/1002292111942635562/1204675216773619752/1309599833958645871
Not really that complicated, it's just quite a few words.
that's what the #🏞|general-with-images channel is for
post the text itself, please
I don't specifically have it, it's all being generated through another ML model, but I'm just looking to see if there's a way to generally get something more cohesive when you have a lot of words.
not really - it's also going to struggle with words with repeated letters such as armadillo, or complicated, fairly long words
Second question then, does anyone know a good way to generate just blank spaces where text SHOULD be, so I can add it in post? I can't seem to get anything to work myself: "blank space for text", "no text", etc.
This is all in flux.
nope. you can try breaking up the text into chunks like this: write "apples oranges bananas" at the top of the image. The background of the image is an abstract image of a banana. write "yummy fruit" at the bottom of the image. <--- sometimes that'll work, sometimes it'll ignore it and just do what it wants
Yeah that's what I'm finding, thanks so much!
works exactly the same way for SD3.5 as flux and 3.5 both use the exact same encoders
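since the text is coming out of another model anyway, the chunking trick above could be scripted. A rough sketch, assuming one short chunk for the top and one for the bottom. `build_text_prompt` is a hypothetical helper, not part of any library, and the model can still ignore the result:

```python
def build_text_prompt(top_text, bottom_text, background):
    """Assemble a prompt that places short text chunks at fixed
    positions, with the background described in between them."""
    parts = []
    if top_text:
        parts.append(f'write "{top_text}" at the top of the image.')
    parts.append(f"The background of the image is {background}.")
    if bottom_text:
        parts.append(f'write "{bottom_text}" at the bottom of the image.')
    return " ".join(parts)


prompt = build_text_prompt(
    "apples oranges bananas",
    "yummy fruit",
    "an abstract image of a banana",
)
print(prompt)
```

keeping each chunk to a few simple words is the point. Longer chunks run into the same gibberish problem.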
if you really REALLY want control over the text, just put it on in photoshop
Yeah that's what I'm planning on doing, but I'm just trying to set blank spaces where text should be with no luck, it's like I want the design without the content.
you can fix that in photoshop - just put a new layer on top of the image, use the fill to fill it with white, make the layer about 40% opaque, type in your text. then adjust the transparency of the white layer more if you need to, or remove sections of it that aren't under the text
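if you'd rather script that than do it by hand, the same idea (a roughly 40% opaque white panel with text on top) can be sketched with Pillow. This is an approximation of the Photoshop steps, not an equivalent. The function names, box coordinates and default font are placeholders:

```python
def blend_with_white(pixel, opacity=0.4):
    """What one RGB pixel becomes under a white layer of the given opacity."""
    return tuple(round(opacity * 255 + (1 - opacity) * c) for c in pixel)


def add_text_panel(in_path, out_path, box, text, opacity=0.4):
    """Lay a semi-transparent white panel over `box` and draw text on it.

    box is (left, top, right, bottom) in pixels.
    """
    # Imported here so blend_with_white works without Pillow installed.
    from PIL import Image, ImageDraw

    base = Image.open(in_path).convert("RGBA")
    # Fully transparent layer the same size as the image.
    overlay = Image.new("RGBA", base.size, (255, 255, 255, 0))
    draw = ImageDraw.Draw(overlay)
    # White panel at the requested opacity (0.4 -> alpha 102).
    draw.rectangle(box, fill=(255, 255, 255, round(opacity * 255)))
    # Opaque black text in Pillow's default font, inset from the corner.
    draw.text((box[0] + 10, box[1] + 10), text, fill=(0, 0, 0, 255))
    Image.alpha_composite(base, overlay).convert("RGB").save(out_path)


print(blend_with_white((100, 150, 200)))
```

e.g. `add_text_panel("gen.png", "final.png", (40, 40, 600, 160), "yummy fruit")`. Raise or lower `opacity` the same way you'd adjust the layer transparency.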
Hello
Hey fam. What's good?
All good fam. By the way do you by any chance know any platform that could help me run a successful ad? I have been able to reach my target audience with my current campaign
Well, maybe you should try Adpod
Okay. Could you tell me more about it?
insane way to self promote
google is your friend
thats what they want you to think
