#💬|general-chat
gm
they won't call it ww3 until its done so we could be in that time right now
The good news is that we'll be able to generate booba while the nuclear echoes are ringing
literally metro 2033 but with an ai waifu chatbot
Suddenly a 12GB Vram model becomes 500 GB
Gotta start saving up quarters in the change jar for 3 B100s
next stop is the blockchain. stabilitys partnered up with nft networks
In 10 years people (or robots) will look back at us and go like, not even 1 Terabyte Vram per card?
we'll never need more than 40gb of vram
imagine if you stopped talking about SD3 for at least 15mins 
no but like, imagine if like, sd3 could do anything
Imagine all the people (that sd3 can generate)
😮
impossible
ok but when sd3 🗿
ooOooooOooOOoo

yoko's blockchain idea is gonna fuck it all up i swear
The 800m parameter model might, who knows
Though I imagine you need at least 4GB not to rely on off-loading
i dont see anything in the white paper about how they're going to take 8b down to 800m . wonder how they'll do it. blockchains?
😮
The SD3 turbo looks very good too. Maybe it can be used for real time generation in games together with an LLM and audio gen for fully custom games
i'd imagine it'll prolly be more useful for non-realtime applications like video generation
Yeah it looks promising for video too
turbo seems really good for generation but it doesn't do well in editing
I mean, 1 step inference looks insane
It will be really good for experimenting with ideas given how fast it is
i hope that a new gen of point and click adventures will come up
I'm sure countless types of games will come up once the tech improves. I gave it a try for fun, but I think local LLMs for 8gb vram are still not there yet to make entertaining characters + custom world
Is there any verified article that talks about why Artists are against AI Img Generator?
What I know from it is because the AI databases are from the unpermitted arts by the artist themselves.. and when some companies use the AI img generator as a commercial use like in the online paid AI img generator, the artists didn't get any affiliation to it (or something like that idk..). (tho I think the company just want to pay their electrical bills and upgrade GPUs)
I think the true reason most people are against AI is because it will take a lot of jobs. Once the AI that does X gets close to human level, some people will go against it, some fields more than others
ultimately AI is the most potentially beneficial and most potentially dangerous thing we've ever created IMO
all comes down to labor protections
does anyone know what version of clip to use with sv3d
the two obvious extremes: unprecedented prosperity, or basically everyone reverts to meat bags roaming the countryside while a handful live it up in castles
Well yea, that too... Companies start using it anyway, cuz they don't need to pay any artist or creator to make something..
Bunkers seem to be the new castles.
ai will create new jobs. There will be new film makers
Bigger room, lower ceiling
guys sd3 release when?
I would love to make a fully featured film if AI gets good enough where that is possible for me to do it all myself
sora will be nutz
what i noticed is nvidia's new initiative. ACE. Avatar Cloud Engine. So npc ai's will be software as a service
single player games that are always online work right guys?
how long till MMO's have no players and it's all bots?
if mmo's have a gold trading market it will have like 70% bots,if u cant trade then it will be like 10 to 20%
Is that really a bad thing? lol. Some bots may be more fun to play with than someone who is ruining the game or griefing
yes it is, its like letting ppl print their own money, they flood the market with cheap stuff and devalue the in game coins
just an always online single player game then. not massively multiplayer
If we live in a star trek society we won't need money anymore
It would make a game like hitman interesting if you could talk to the different NPCs and they would react differently to certain situations
lol startrek still has replicator rations that crew have to save up. biggest reason their future economy is so efficient is transporter tech
the more we understand special relativity though, more it seems we can't go ftl
I think it really broke other people's plans to keep tech releases at a certain pace. I just hope Runway and Stability will release something similar soon
its a deluge of research unfolding right now. anyone in this space isn't counting on slow releases anymore.
the thick one
Is SD3 trained on 1024x1024?
is there any point in going higher than 1024?
1024 is already rough when it comes to Deforum unless you have a beefy machine
Hi all. I'm new here. I read the Pricing for Inpaint, it is shown as $3.00 for credit. Is that 1 instance or for 100 instance? Thanks!
is a 4080 beefy?
I'm pretty sure the sizes of images used in training are variable
yeh, the image will change based on what you set it to, if you set it more wide, the person will most likely laydown
if you set it very thin, it will be more of a standing shot
Yeah, I think image resolution and ratios play a large role in parameters
if you pick a weird number, the person mostly becomes mutilated
If you use the forge-webui deforum extension you'll be able to do larger resolutions no problem, also kohya hires fix was just added to deforum #🎥|animation message
😮
no worries about mutilated stuff now, hehe
Sorry to bother you, I'm trying my hardest to figure out how to upscale and enhance a very small image, but nothing I've come up with has been suitable for my needs so far. Maybe it's just not possible with such a small image and I shouldn't bother?
lol, I get so annoyed, the bodies get so deformed so much lol, and random arms and legs in random places
Yeah, it was really annoying, not anymore though 🙂
I think more resolution should give better quality up to maybe 4k
wait, does this make videos?
https://www.youtube.com/watch?v=P6LlZ-fMhiE
wdym?
deforum is an animation thing yeah
the vid came up with that
That's a pretty old tutorial
i see
supir, cssr, ultimate sd upscale, latent upscale
These should all work unless your image is absolutely tiny, at which point it's worthless and you can just generate a new one entirely
maybe this? it seems as general purpose as something like lora or xformers https://mhamilton.net/featup.html
noob question: when i pass a txt2img image into img2img for slight adjustment, do i need to include the original prompt?
depends on which type of img2img you're using
maybe he meant inpainting
since he said he passes the text2img result into img2img for a slight adjustment
img2img is very vague. It could mean inpainting, upscaling, downscaling, style change
yeah i meant inpainting. sry for the confusion
For most cases yes you need the original prompt, but some methods of inpainting for example with the krita plugin all you need to tell it is what you want to put in the inpainted area
thanks!
auto-sd-paint-ext editing, this plugin? I'm interested with the plugin :v
yooo
I ran the run.bat
it does the installation and downloads some stuff
then it says press any key to continue
nothing happens, it just shuts, any hint?
what should I do Tijmi ..
download python
Checkout my install guide:
#🤝|tech-support message
still doesn't work
morning all!
For what
I have an itch for $2000 for an RTX 4090 if someone could send me the money 🥺
I won't be giving my username for privacy reasons so just give me your card details and I'll handle the rest!
(joke)
Does JuggernautXL have an inpainting model 🤔 I should check
But sadly I really didn't experience any improvement in the inpainting process, sometimes just worse results than just inpainting with the original model I used for the generation, so I stopped using it. But maybe I was missing something.
🤔 I read something about Soft Inpainting in A1111 on Reddit a few minutes ago
It requires making some changes in Automatic1111 to make it work.
Maybe that could make inpainting better?
it is better with the mask blur so inpaint blends better, also more options, but to me really complicated, still haven't learnt it. But I use it just with the default settings and I use 10 pixels for mask blur.
I tried using a ControlNet + inpainting with hephaistos nextgen dpo (a model) and oftentimes it refused to make changes to hair color specifically 🤔 It worked only a few times. I doubt this is a Canny ControlNet issue, as I can change features if I don't inpaint, but say I don't want the rest of the image changed for example.
but the inpainting model is the same
Could also try that!
inpainting big changes or color changes is really hard sometimes. Better to just photobash it or just paint it in the color you want in photoshop or windows paint and then inpaint in SD. So if you want to change the hair color it is better to paint the hair in photoshop and then inpaint it.
Also if you want big changes you need to choose "fill" in Masked Content, and kind of high denoising (0.75). If you want to make a minor change over something that is already there, "original".
But fill is a gamble, it is like generating from scratch. So sometimes it is better to make the change manually and then inpaint with original. But for certain things it works fine.
It depends on the denoising strength and the context SD has (if you use "only masked", instead of "whole picture")
Plenty of tips I will try! Thanks!
with "only masked" you change the context with the "Only masked padding, pixels" setting
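A minimal sketch of the settings discussed above, written as a request body for the A1111 web UI API (the `/sdapi/v1/img2img` endpoint, available when the UI is started with `--api`). The field names and the numeric codes for "Masked content" are recalled from that API and should be checked against the `/docs` page of your own install. The helper name `build_inpaint_payload`, the 0.4 denoise for small changes and the padding of 32 are assumptions for illustration, not values from the chat.

```python
import json

# Assumed mapping of the "Masked content" options to API codes.
MASKED_CONTENT = {"fill": 0, "original": 1, "latent noise": 2, "latent nothing": 3}


def build_inpaint_payload(prompt, big_change, mask_blur=10, padding=32):
    """Hypothetical helper: build an img2img inpainting request body."""
    return {
        "prompt": prompt,
        # "fill" is close to generating from scratch; "original" refines what is there.
        "inpainting_fill": MASKED_CONTENT["fill" if big_change else "original"],
        # 0.75 comes from the chat; 0.4 for small changes is an assumption.
        "denoising_strength": 0.75 if big_change else 0.4,
        "mask_blur": mask_blur,
        "inpaint_full_res": True,  # "Only masked" instead of "Whole picture"
        "inpaint_full_res_padding": padding,  # "Only masked padding, pixels"
        # "init_images" and "mask" (base64 strings) would be added before sending.
    }


payload = build_inpaint_payload("long red hair", big_change=True)
print(json.dumps(payload, indent=2))
```

Nothing is sent here; the dict only shows which knobs map to which advice. A bigger padding gives SD more surrounding context when "only masked" is on.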
no problem!
HOLA!
any way I could like, volunteer to help SD3 somehow?
I'm just so anxious for it to come out
I don't understand why people write prompts like a story. As I understand it, words in prompts just trigger stuff, so that just adds random stuff. Sometimes it makes the image because certain words trigger what is prompted, but I believe the better way to prompt is some kind of tag system, deepbooru style, with some short, concise ideas
I don't know how much grammar SD understands
Also I see huge prompts with a a lot of stuff, much of it doesn't appear in the image, it is funny
It depends on the sd architecture.
with sd1.5 (which is using CLIP) it's usually better to use "word salad" prompts
with SDXL meeeeeeh not so much
yes I use SD XL and most of the time I do very succinct prompts, I don't even use negatives as I believe that may affect the image good or bad, like if you use "bad hands" it will try to not draw hands at all, or it won't really matter
i've been trying to do style transfer with SDXL but all the tutorials seem outdated, does anyone have any recent tutorials on style transfer?
style transfer like from an image? probably better to train a lora
style is harder than subjects, imo
https://github.com/ExponentialML/ComfyUI_VisualStylePrompting
This came out like 4 or 5 days ago but I didn't try it yet
i assume this will only work in comfyui and not automatic1111 right?
i thought to do that at first until i came across what used to be called clipvision, which was advertised to be some kind of style transfer model in controlnet. i havent found any updated tutorials and the main model (clipvision) is now missing from controlnet
It's just a miracle that the paper got a comfy node almost immediately, if you don't like it you can wait and see if someone adapts it for a1111
So bad that the a1111 devs aren't like they used to be at the beginnings. Now everything takes so much time to adapt for A1111 and it seems like the devs don't really care that much anymore anyway.
anyone can tell me what ''basicsr'' is doing? trying to use softedge hed in auto111, and it says this module is missing..appreciate
No module named 'basicsr'
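That error means the Python environment the web UI runs in cannot find a package called `basicsr`; presumably the preprocessor imports it. A minimal sketch for checking this, assuming it is run with the same Python (or venv) that launches the UI. The helper name `is_installed` is made up, and whether a plain pip install is the right fix for a given install is also an assumption.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate the module, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("basicsr"):
    print("basicsr is importable in this environment")
else:
    print("basicsr not found here; one thing to try: python -m pip install basicsr")
```

If this reports the module as present but the UI still complains, the UI is probably using a different Python environment than the one the check ran in.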
a1111 has very good extensions and in some cases are superior to the comfy counterpart, but Stability uses Comfy, and they're working to improve it. Using both would prob be the best choice but I personally enjoy comfy
apparently they will be inviting more people this week
its already Wednesday and nothing so far
i really mean the model release because im renting a machine on lambda for ollama and sdxl rn
after dealing with the openai api i never want to deal with an ai api again lol
eh it wasnt that bad in my view, most of the 'waiting' was for controlnet, which wasnt really the fault of auto1111
regardless of the timing of SD3–every day with open source is a day with gratitude.
never mentioned which april, april 2025?
lmfao
hey i mean pony diffusion is a thing, let’s see someone start building horse diffusion lol
at this rate someone else could have bronco diffusion by the time april 2025 rolls around
SD3 pony will be interesting
oh definitely
Pony3? lol
I wonder if they will have to recaption the dataset, idk what their caption types are
Yea, man, pony is really good, but it's not an all inclusive model. I mean it's not good at everything for now. Regardless that, it's good at prompt understanding for specific images
if its booru-like where its a bunch of adjectives and nouns separated by commas then it will need recaptioning
this is true!
i’d think dataset recaptioning would be a bigger deal for a lot of point releases.
its unfortunate that a lot of pony finetunes and loras are mostly uhhhh, nasty
For photos i dont even know if we'll need finetunes
they look as good as sdxl photo finetunes like dreamshaperXL and juggernautXL
juggernaut v9 is pretty slick
I wonder if more actions and expressions will be in massive finetunes or just separate loras 🤔
because it would make a lot of sense to finetune that
exactly, that's why the model is not good at more abstract art
or logos or illustrations
it shows community bias in a whole new light, don’t it
heh
but guess that the NSFW generators can make some money now with pony
i mean the adult industry is probably already churning out fake websites complete with 100% fake content. it’s a win-win for those folks as they get to start laying claim to being more relatively moralistic in principle.
if they manage to train SD3 with NSFW successfully with detailed and obedient prompts then imagine the possibilities 
I personally don't use AI for corn
with the context that Sora seems to demonstrate, I would imagine their NSFW controls are very robust.
if it will turn out that 8B will only run with 12GB if its like super slow CPU RAM offload even with comfyui (so no matter what it won't fit), then we may still rely on SD3 8B Turbo though as it only needs 4 steps
imagine FastSD or whatever with openvino, generating 4 step super intelligent images
(provided you have the RAM)
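A rough back-of-the-envelope sketch for the VRAM question, assuming memory is dominated by the weights and counting 2 bytes per parameter at fp16 and 1 byte at 8-bit. It ignores the text encoders, the VAE and activations, so real usage would be higher. The parameter counts are the ones mentioned in the chat; the function name is made up.

```python
def weight_memory_gb(num_params, bytes_per_param):
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


for name, params in [("800M", 800e6), ("2B", 2e9), ("8B", 8e9)]:
    fp16 = weight_memory_gb(params, 2)
    int8 = weight_memory_gb(params, 1)
    print(f"{name}: ~{fp16:.1f} GB at fp16, ~{int8:.1f} GB at 8-bit")
```

By this estimate the 8B weights alone are about 16 GB at fp16, which is why a 12GB card would need offloading or lower precision, while the 800M model sits around 1.6 GB.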
dont they just take your prompt and manually modify it at this stage?
well, it’s probably like how Fooocus utilizes GPT-2 with its prompts
the prompt has to be “baked”
unfortunately fooocus mostly adds tags at the end of the prompt
so its for aesthetics
it is, and that’s okay by me as long as I’m using DPM 3m
and juggernaut v9
and a high guidance scale
but guys.. imagine the amazing possibilities once Fooocus gets SD3 support, then we get Fooocus V3 prompt enhancement (SuperPrompt-V1) running on CPU (super fast, don't worry), which MODIFIES your prompt entirely like DALLE3 and Ideogram's Magic Prompt
you basically get DALLE3 offline
im running superprompt on my cpu right now and its very very fast yet still coherent
its just a small 77M model
Brian (stability dev) made it and he might make a comfyui node and a newer model
Did you ever try to win money with AI art? I tried almost everything SFW related (etsy store with posters, game characters etc.) and I had no luck...
so far it has some issues like similar prompts ('gazing at the skyline' appearing for a bunch of portrait shots when rain is in it)
naah
has anyone tried to make “cyber woman with corn” yet using AI?
I use SD for entertainment
to entertain myself or others
and with SD3 having a bunch of artist opt-outs I will feel even happier
me too.
especially how the quality hasn't degraded despite it
absolutely that yeah
I mean for photos its to be expected to remain nearly the same quality
it’s what we’re wanting from AI
Same, but I really like the generative AIs that I want to win money from this passion. You know, making money from what you like to do.
ah yeah I get it
I mean there were a bunch of AI Art competitions where you compete with AI Art only
once there was one where you could win a 4090!
on civitai I think
Yea, I remember those
but yeah SD3 is going to be great if we can run it on 8-12GB (the 8B model)
even if not the 8B Turbo model is there or the 2B model will get heavily finetuned to look way better
but isn't there really any other way for now to win money with AI art? I mean etsy is oversaturated...
yeah
idk I don't want to open that can of worms here
I just want to make either stupid funny imagery or nice looking portraits and masterpieces
Alright
“the board is set, the pieces are moving”
I would really like to make my own comics, but SD is not really reliable for that atm. I mean it's really hard to maintain the same character (with the same clothes, same hair) and it's even harder to make him perform certain actions (like landing a punch into someone).
Guess you need 4-5+ ControlNets at the same time to make something "good".
And I don't have a RTX 4090, so yea
hopefully IPAdapter comes to SD3 so we can get consistent characters
or something similar
and detailed prompting will also help a lot with consistent characters
consistent got a little better with ipadapter + instantid
that right there is 2 controlnets
I've found it's somewhat better to not focus as much on the look during txt2img and get the pose right, fix it in img2img
having said that, yah it's extremely difficult
InstantID requires the use of 2 models at the same time (ipadapter+keypoints) and those 2 take enough VRAM already. And it takes so much time to fix images in img2img...
yup, time consuming, you'll never get it all in 1 gen
I imagine probably 20 gens + inpainting
that's one of the things MJ has been touting, I havent tried it, never used MJ
but for me it's like Dark was saying ,hobby, not trying to monetize
real question is if you can illustrate a whole comic for the price of a 4090
That's right :)))
And then you have to sell the comics, so that you can get your investment back (2000$ for RTX 4090).
hello, can i get some good generic tags for image generation?
As I heard, only the ones into the NSFW AI generations can "monetize" nowadays.
more eyeballs than the superbowl 😄
Eh, I see trying to monetize art generation similar to drop shipping. Yeah there's a market for people to make money off of commissions just like how there's a market for dropshippers, but it's pretty easy to get lost in a sea of people doing the same thing since the bar of entry is pretty low.
is it possible for automatic1111 to run without internet? tried doing this but it keeps saying it needs to install requirements(ive ran and installed the requirements multiple times)
Once you have properly installed everything it is able to be run offline.
not trying to run anything special, is there a reason why it might be continually trying to pull from git as opposed to just running with what it has?
your startup might literally have "git pull" in it, check it out
Because your program is set to update when you open it.
That's true. And you're competing against manual artists as well who can draw specific things better than your controlnets. :)))
Yeah artists will always have more control over the concept. They have their limitations too, but honestly I think artists should pivot to the performance of creating art, similar to how musicians perform songs.
yah, create a base and train their own loras for example
Did you know that during the time period impressionism was kicking off, that the final stages of creating a painting were performed in front of a crowd? The act of varnishing the painting was a spectacle because it was the point where the color exploded on the painting and was fun to watch.
Fun little art history tidbit.
@forest trout @trail lion do you guys know what config file i have to edit to fix this?
webui.bat
right click on it and press on Edit file with notepad
yah might just have to comment it out in the -user one. I'm in linux, but same concept
Both are in the top level of the parent folder
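A small sketch for finding where the startup pulls from git, assuming it is run from the top-level web UI folder. `webui.bat` is named in the chat; `webui-user.bat` and `webui-user.sh` are my reading of "the -user one", and the function name is made up.

```python
from pathlib import Path


def find_git_pull_lines(script_path):
    """Return (line_number, text) for lines of a startup script that run git pull."""
    hits = []
    lines = Path(script_path).read_text(errors="ignore").splitlines()
    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        # Skip lines that are already commented out (batch: REM or ::, shell: #).
        if stripped.lower().startswith(("rem ", "::", "#")):
            continue
        if "git pull" in stripped.lower():
            hits.append((number, stripped))
    return hits


for name in ("webui.bat", "webui-user.bat", "webui-user.sh"):
    if Path(name).exists():
        for number, text in find_git_pull_lines(name):
            print(f"{name}:{number}: {text}")
```

Any line it prints is a candidate to comment out (`REM` in a .bat file, `#` in a .sh file) before running offline.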
hello
inpaint should have a negative mask that prevents SD from getting the context from that area.
So when two objects or subjects are really close and you want to define one, it doesn't try to extend the other one.
Good morning, everyone! How are we all today?
have you tried changing the prompt to remove the other subject and doing masked only? it'll only look at what's in the mask
if you use too much denoise you'll start to destroy the coherence, so it's a game, you have to play with it
yes, for example I was trying to inpaint some woman long hair over a guys head, it kept enlarging the guy's head instead of working with the woman's hair, and I try a negative "man's head" or something
yes I know, but if only we could occlude certain things from SD it won't have a reason to try to create something from it
for example I was inpainting some railing, there was a hand near, so it draw another hand
Also when trying to add another character, if it sees another character in the same context, SD will think it is about that character
sometimes you can also help it along with inpaint sketch
I tried once but I failed, then I learnt it is kind a bothersome task
many of these tasks are, and if they arent worthwhile then giving up is an option
I think just adding a negative mask is an easy solution and tackles many issues with inpainting
just venting, got it
SD gets easily "distracted"
how much does the 16 channel VAE improve our images
4x more detail for each pixel whilst the VAE is decoding?
or better colour accuracy?
I need to see the paper again
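A quick arithmetic sketch of what 16 channels changes, assuming (from memory of the SD3 paper, so worth re-checking) that both the older 4-channel VAE and the 16-channel one downsample by 8x per side. The function names are made up.

```python
def latent_shape(width, height, channels, downscale=8):
    """Shape (channels, h, w) of the latent for an image of the given size."""
    return (channels, height // downscale, width // downscale)


def compression_ratio(channels, downscale=8, rgb_channels=3):
    """How many pixel values are squeezed into one latent value."""
    return (downscale * downscale * rgb_channels) / channels


for label, ch in [("4-channel VAE", 4), ("16-channel VAE", 16)]:
    shape = latent_shape(1024, 1024, ch)
    print(label, shape, f"{compression_ratio(ch):.0f}x compression")
```

Under those assumptions the "4x" is four times as many numbers per latent position (12x compression instead of 48x), not four times per pixel. Whether that shows up as finer detail or better colour is something the paper would have to answer.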
Hey guys, I've created an open-source prompt engineering & prompt chaining toolkit in Javascript - it's kinda like the OpenAI playground but you script in Javascript. I'm looking for a couple people to take through version 0.0.1, get some feedback. Please DM me or @ me if you're interested.
Stability AI CEO Emad Mostaque told staff last week that Robin Rombach and other researchers, the key creators of Stable Diffusion, have resigned https://www.reddit.com/r/StableDiffusion/comments/1bjhjls/stability_ai_ceo_emad_mostaque_told_staff_last/
NGL
Controlnet better
💀
hello everyone I am new and I would like to know how to generate images on this server
you can try to find some algorithms on github and try to run them
ofc, GANs were popular several yrs ago
especially a lota art students use StyleGAN for their graduation thesis
currently, I am still learning a previous Kaggle competition abt stable diffusion
and maybe need to read some paper for further comprehension on it
I hope I can develop an algorithm abt diffusion model in medical imaging
and wish I can contribute sth to my github soon
Stable Video 3D looks so promising, guess the entire gamedev and CG industry will be changed soon. 
stable diffusion is very promising in a lot of fields I think
thanks anyway
and not so many people apply this model to medical imaging atm
so I wanna try it
by learning from a professor
Not enough yet, I can always distinguish an AI-generated artwork from real one
trew
btw, I just lost my job
so I need to publish a SCI paper or a good conference paper for my next job
I don't like teaching, that's why I am fired
good luck with your project
I want to do research so I need my own paper first, then I may have my second, third one
and I need new job that can be helpful for my research
Losing my job is very terrible
I was angry, drunk
but Ik I still have to figure out how to continue my plan and my research
At least, you had a job. I am just living on welfare, in a s-hole country
Good luck, anyway
I lost my job, still have to find the new one
welp, as long as we get SD3
its quite unfortunate though
forbes has been anti stability for a while
teaching is like charity x sacrifice these days, the effort is not appreciated
seems odd that now Robin is leaving right after SAI picked up Esser back from Runway ???? kinda mindblown by this news, not sure what to make of that
I agree
for example, emad wrote this a few days ago
https://x.com/EMostaque/status/1769190852599943582?s=20
ah hell nah wtf
that..changes things
so whatever forbes etc are writing don't trust it blindly
the stable diffusion research team all leaving before its even finished.. weird.
https://sifted.eu/articles/stability-ai-rombach-news sifted has the same story with comment, unless they're also lying and misquoting
it aligns with the time frame of pivoting to crypto partnerships too
wait what?
I just hope the models come out and etc
its just not the brightest future
render chain and anime chain NFT networks
two partnerships announced this week
https://twitter.com/anime_chain/status/1769529635073642663 https://twitter.com/rendernetwork/status/1769791745200460199
i fk'd up, it's called render network, but it's all blockchain/nft/crypto shit
we'll know in a bit, emad is scheduled to appear on a talk soon https://twitter.com/rendernetwork/status/1770471655543886000
to their credit, otoy makes octane, but show me a company that has ever successfully pivoted into nft/crypto
did we not* all learn our lesson from the Folding Ideas video on nfts and daos?
never saw a video. i just paid attention over the last 10 years
doesn't look like anything specific mentioned wrt nfts though, rendernetwork is about compute, no?
identity too
yeah thats what I got from it at face value as well
I guess i'll try not to jump to conclusions yet there
I just shrugged it off until flowwolf started talking about this
my only conclusion is "does SD3 come out either way or not"

😭
i think sd3 will make it out the gate, but the stable is on fire (see what i did there?)
@coral yew 🤗 🫡

this the video you mentioned? i might put it on https://youtu.be/YQ_xWvX1n9g
yes, its an amazing video (the whole channel is great, too)
it goes deep into DAOs and how they're used to grift poor people for spec work
seems that way. hard to find these gems on youtube since the thumbs down button was killed and it's turned into a pure hype/engagement machine
there are add-ons which restore dislikes
not for the whole though
he only puts out a video every few months, but they're bangers, and I feel that NFT/DAO video was basically the final nail in the coffin for the collective consciousness after the (actively dumb) hype over NFTs, not that it was necessarily news, but deep diving into some of the more ridiculous acts is something really everyone should watch
i could individually use thumbs down buttons but nobody else does. just us few users with an extension. so it doesn't matter towards any of the content creation and algorithm boosts
just another one of those circus decisions from google
next one is the chrome manifest change which kills adblock
I mean, it has 14 million views, its made rounds
i'm pretty on board with all of what he's saying. i've long been a crypto enthusiast, but i saw the writing on the wall a lonnnnng time ago. watching roger ver operate the bch network was really telling of how behind the scenes look
i still think vitalik buterin is an animal. love that guy. but the legacy of the network is a whole lot of crime and fraud
yeah, some good technical ideas that unfortunately, when reality of application commences, is generally not good, largely there to enable money laundering and abuse people who cannot understand the complex economics of what they're doing and like to play the lotto
firefox is implementing the new manifest api too so that there's no split in compatibility.. supposedly.
-cough paidoff cough-
firefox will maintain v2 while chrome discards it
here's firefox responding directly https://x.com/firefox/status/1770215127767503163?s=20
i have no doubt that large sums of cash change hands behind the scenes, because why wouldn't it happen?
I mean, git source control uses successive hashes to help provide provenance, closely adjacent to blockchain (but without the eco-destroying hash mining), its a good technical idea, but nft/daos are next level gross bastardization
i do love merkle trees still. i'm not about to get prejudice on core technical ideas like data structures or consensus algorithms.
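A minimal sketch of the "successive hashes" idea, assuming a plain SHA-256 chain over strings. This is a simplified hash chain, not git's actual object format and not a full Merkle tree; the function name and the sample history are made up.

```python
import hashlib


def chain_hashes(entries):
    """Hash each entry together with the previous hash, like a simplified commit history."""
    previous = ""
    hashes = []
    for entry in entries:
        previous = hashlib.sha256((previous + entry).encode("utf-8")).hexdigest()
        hashes.append(previous)
    return hashes


original = chain_hashes(["initial commit", "add readme", "fix typo"])
tampered = chain_hashes(["initial commit", "add malware", "fix typo"])

# Changing one entry changes its hash and every hash after it.
print([a == b for a, b in zip(original, tampered)])  # [True, False, False]
```

That knock-on effect is the provenance property: nobody can quietly rewrite an old entry without every later hash changing, and no mining is involved.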
right, an algorithm is just an algorithm until a person uses it for bad things
algorithms don't kill people. I kill people! (good tshirt idea?)
I'd prefer that SD3 not be the last
But the crypto pivot isn't really a good look
x2, I hope they continue with more advanced models so they can compete with openAI dalle and midjourney
What they did, they invested in a crypto company or something like that?
I tried searching but i got nothing
feels like a dropping the anchor through the hull kind of moment
Any updates on SD3 or a final release of Stable Cascade...?
Open source community projects might be able to produce better models on longer timelines, but I'm not sure that they can do it at the same speed as Stability AI can. The EU wouldn't have been able to place restrictions on GPT4 if it had been open source and the weights easily accessible
F*ck EU restrictions, too bad that even in the US and Russia/China there is censorship/restrictions too
Because a site asking you if it can use cookies is really going to stop big data. smart move EU. Thanks for doing that and adding an annoyance layer of privacy theater to the whole web
They were also trying to kill online privacy for a while with Chat Control, until the ECHR told them to knock it off
realtime automated surveillance of all online communications. yup
in order to protect the kids
they didn't actually knock that off and have a v2 proposal in the mix now
That was when they wanted to ban end to end encryption?
backdoors are how adversaries hack you
the proposal allows it BUT only if every message is scanned for abuse before its encrypted, and then sent to the police if theres a hit
effectively making it moot
Lmao then its the same thing
if its "scanned" before is encrypted what is the point of encryption
i dont plan on abusing children ever but false positives do exist
It would be better if parents raised their children instead of giving them the damn cell phone to talk to strangers so the government has to do something
here kids, go play in this asylum lounge
Might not.
i don't believe that
I think they are using the bot's computing power to calculate new video models.
never underestimate the power of 10,000 nerds on the internet 😆
SD3 is still not released, right? no way to try it?
can post prompts in the #sd3 hashtag on twitter and maybe they might do them. People have been posting dall-e images with prompts and someone with access posts a comparison
sorry i mean the website formerly known as twitter
Hello, inquiry:
Is it possible to feed SD with different photos to get one with different characteristics? I mean, I want to create a warrior and I have a photo of a person with a hat, another with a weapon in his hand and another with a specific gesture and I would like to gather those characteristics in a single photo.
I'd be surprised if we got access this week, they said they're givin out more invites
invites seem to be picked and chosen and signing up for the list doesn't matter. its not a spool.
Yes, that´s possible in Comfy for example. I think it´s called IP-something, others might be more knowledgable here
Thanks, how do I use Comfy?
I could try begging on twitter from Emad but I'm not 3x PhD 500 IQ AI Researcher or Fine-Tuner 😔

us PC gamers may reign over the console plebs, but the ML phds and candidates reign over us 
true..
Install Comfy UI for starters 🙂 Then the exact technique I haven´t done myself yet, so youtube could be your friend here 🙂
node graphs are a learning curve but the view is nice from the top
Thanks!
remove if not allowed ofc, but im looking for someone who is capable in faceswapping onto videos - DM for more info.
once you learn nodegraphs in comfy, you can move onto things like touchdesigner
what is touchdesigner?
node graph for creating visual stuff
ah, I see , thank you
https://youtu.be/0qy8CYrVl20?t=525 feed stuff like this into animatediff would be fun
Looks like brainwashing methods 🙂
This looks interesting as well: https://www.youtube.com/watch?v=gtP1Ae35RqY
researchers have left before, SAI will be fine lmao
well I don't think its going to be that terrible
yeah its pretty wide open on what it can do
currently loading, though I don´t feel like testing it right away 🙂
do you read it? - no, I reddit
how
MJ is based on SD
which upscaler do you use to make 8k instead of 4x-Ultrasharp for anime when you use ultimate SD upscale ?
anyone talking about this yet? https://github.com/ZHO-ZHO-ZHO/ComfyUI-APISR
The quality doesn't look better than other Anime Upscalers tbh.
oh ok
good to know—I hadn’t given it a thorough look, I just saw that it was trending a bit.
another question
I tried to do a lora this morning but I had errors at that moment (https://youtu.be/Un9SHPVAAbE?feature=shared&t=589) it told me that python 3.0 could not launch the link
Or is there a simpler way to make loras ?
For anyone using A1111 / Forge who is familiar with the extensions with AR / static res buttons, I did not like any of the ones available. They all output bad resolutions with no control over rounding precision, rounding to somewhere between 1px and 4px when 64px is ideal in most cases.
I forked the pick of the litter and spent a long time enhancing it.
https://github.com/altoiddealer/--sd-webui-ar-plusplus
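For anyone wondering what "rounding to 64px" means in practice, here is a rough sketch. It is not code from the extension; `snap_resolution` and its parameters are made-up names, and rounding to the nearest multiple is just one reasonable choice.

```python
# Hypothetical helper, NOT taken from the sd-webui-ar-plusplus extension.
def snap_resolution(base: int, ratio_w: int, ratio_h: int, multiple: int = 64) -> tuple[int, int]:
    """Return (width, height) close to ratio_w:ratio_h, both snapped to `multiple`.

    `base` is the target length of the longer side in pixels.
    """
    if ratio_w >= ratio_h:
        width, height = float(base), base * ratio_h / ratio_w
    else:
        width, height = base * ratio_w / ratio_h, float(base)

    def snap(value: float) -> int:
        # Round to the nearest multiple, never dropping below one multiple.
        return max(multiple, int(round(value / multiple)) * multiple)

    return snap(width), snap(height)
```

So a 16:9 request with a 1024px long side would come out with both sides divisible by 64, at the cost of the ratio being slightly off when the exact value does not land on a multiple.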
holy work
^ this kind of spam is only going to deluge more and more. now that stability is partnering with blockchain companies, we're going to see a faux gold rush to capitalize on hype and naivety. sucks.
yo bro you want to get in early on my newly minted dao? 😆
it's going to the moon!
I can imagine entire micro-economies based on what are essentially glorified ComfyUI workflows.
Strapped to, by then, “ancient” A100s

disgusting thought, sort of like how bitcoin mining really is basically digital coal
too bad Nano never took off as an altcoin—the entire network could be powered by a single wind turbine
bitcoin is hitting some kind of halving, where the rewards for each block are cut in half. this is driving the whole market into a frenzy and ponzi , i mean investors are going full salesguy mode
i remember when IRC channels used to go through splits…but no one ever ponzi schemed because of it 😕
I wonder if ai has gotten to a level where it can create whole projects like games, software, and such, i feel like we are close but im not sure.
i miss the days of the internet where the worst thing was an irc split
stop making me feel old 👴
the sage advice of flava flav applies here.
"Don't Don't Don't Don't Believe The Hype"
You have a point
it's all one (very large) comfyui workflow file away !
how so?
half joking, but people already have langchain, you can build "project managers" and "coders" in llms and string them together with image models, etc.
I think we’re going to see a more distinct correlation between creativity and energy efficiency in future MoE subagent workflows
so creativity in models will be more prized as subagent designs popularize more
ask an llm to come up with a product idea, ask an llm to list features it should have, ask an llm to configure a project that will have large epics, then break it down into smaller features, etc, etc., and down to creation of art assets, its just a big complex graph
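A toy version of that "big complex graph", just to make the chaining concrete. Everything here is hypothetical: `plan_project` is a made-up name, and `llm` stands for any function that takes a prompt string and returns text (local model, API wrapper, whatever).

```python
from typing import Callable


# Hypothetical sketch: `llm` is any callable mapping a prompt string to a text reply.
def plan_project(llm: Callable[[str], str], epics_per_idea: int = 2) -> dict:
    """Chain LLM calls: idea -> features -> epics -> small tasks per epic."""
    idea = llm("Come up with a simple product idea.")
    features = llm(f"List the features this product should have: {idea}")
    epics = [
        llm(f"Describe epic {i + 1} for: {idea}\nFeatures: {features}")
        for i in range(epics_per_idea)
    ]
    # Each epic is broken down again, which is where art asset tasks would show up.
    tasks = {
        epic: llm(f"Break this epic into small tasks, including art assets: {epic}")
        for epic in epics
    }
    return {"idea": idea, "features": features, "tasks": tasks}
```

Each step only sees the text produced by the step before it, so the quality of the whole thing depends heavily on the prompts and on checking the outputs in between.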
flava flav has a point... well probably a lot more than that. (heyohhhh)
also a strained neck
I must say i feel like im too invested into ai, but is that a bad thing?
I think we're very close to someone publishing a (simple) game completely created by AI, not just like "hey Llama write Pong in python for me" but like "come up with a game idea that is simple" ... "write down the major features..." "write tasks for creation of all the art assets" and soforth
I just wish others understood that because it will be cool to see the future of ai.
give your llms different system prompts/prompts for specific tasks, let it run the code and fix its own errors, and so on
You can take pride in knowing what others don’t. People are extremely stubborn and for the most part deny a very basic reality: that they need to see to believe.
Seeing doesn’t just mean with the eyes—it also means understanding.
if i were to make a game with ai i would probably use an engine or a simple framework along with ai tools.
I like the idea of basing a game concept on a console format, like a preconfig
a couple weeks ago one of the big boys released a paper I think on having an AI pilot your desktop, like, click on windows, move the mouse around and soforth
but most of this can be done programmatically without a UI at all
humans are very visual, but its extra cruft to actually click on a UI
I would love to see some kind of supergame where you start off playing in 4-bit, OG gameboy style, then as you progress through the game it jumps generation after generation
i dont trust an ai taking control of my pc, if that makes sense.
the VLM models already have grounding, like Kosmos2 or Cog, they can literally identify where specific objects are in frame
this is reasonable
i mean you dont want it clicking on your banking website link and logging itself in with 1Password and sending your money off to some dude in Tunisia
yeah or downloading a virus.
i was on board with the internet long before people thought it was going to be a big deal. once it was, most of that old experience was worthless. /shrug
you mean learning how to setup my modem init and SLIP/PPP isn't applicable anymore?

I want to make Devin create a Virtua Fighter clone, like straight out of 1994 or whatever and then I would make all of the playable characters choo choo train faces
when you can, you'll know AI has truly arrived 😆
yep and then the show me what you got head will descend from the sky and ask me what i got, and I’ll show him and he’ll like it and float awayyyyyy
Windows Copilot?
I don't think that is what I was remembering seeing
but it looks like the same sort of idea
i wonder if there is a tool that can help make games because ive been wanting to make video games but im too dumb to understand code and such, but i also dont have money to throw at api's.
I think you could potentially fine tune Cog or Kosmos2 to learn how to identify buttons from screenshots, then just write a program to click on the buttons, for instance, and then chain that into a larger program using LLMs and such
https://youtu.be/dCer2e0t8r8?t=15 whenever i hear people talkin bout devin
you ask the VLM "where is the close button on photoshop?" and it will spit out a bounding box for where the X button is, in theory, and you build that out behind the scenes, and you could pilot your computer that way with AI, or voice recognition, and so forth
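The "bounding box to click" part is the easy bit. A minimal sketch, assuming the model hands back a normalized box (x1, y1, x2, y2 in the 0..1 range), which is an assumption here and not the documented output format of Kosmos2 or Cog. `bbox_center` and `click_target` are made-up names, and the actual clicking is passed in as a function.

```python
from typing import Callable, Sequence, Tuple


# Hypothetical glue code. The box format (normalized x1, y1, x2, y2) is an
# assumption, not the output format of any specific VLM.
def bbox_center(bbox: Sequence[float], screen_w: int, screen_h: int) -> Tuple[int, int]:
    """Convert a normalized bounding box into the pixel at its center."""
    x1, y1, x2, y2 = bbox
    cx = (x1 + x2) / 2 * screen_w
    cy = (y1 + y2) / 2 * screen_h
    return int(round(cx)), int(round(cy))


def click_target(
    bbox: Sequence[float],
    screen_size: Tuple[int, int],
    click: Callable[[int, int], None],
) -> Tuple[int, int]:
    """Click the center of the box using whatever `click` function is supplied."""
    x, y = bbox_center(bbox, *screen_size)
    click(x, y)
    return x, y
```

In a real setup `click` could be something like `pyautogui.click`, and this is exactly the place to put a confirmation step if you don't trust the model with your mouse.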
not quite there, but close
now ask it how to close vi
Devin in 2024: It works, but it's nowhere near as good as a software engineer
Devin in 2026: You won't get the same quality as a team of software engineers.
Devin in 2028: You still need a software engineer to look over it and give his approval.
Devin in 2029: We need legislation or else new CS majors have nowhere to work!
Folks are seeing the consequences of applying human precedent to a decidedly nonhuman era
cough yann lecun cough
the actual first ai swe probably won't be devin and will be something made for a specific project then released later
*wasn’t
you know there’s some DARPA-level clubhouse shit going on that’s possibly years older than what we’re seeing now.
Yeah, I imagine you're right. We'll probably have 20 dev AIs in a year or two and they'll specialize in different dev tasks
i spoke in future because there isn't one yet. its claims are a lie
I like the guy but I don't think his takes on AI development are accurate
Hasn't he been proven wrong in his predictions by large margins several times already?
I need to watch some of his past interviews and see what went different
Because I watched a few newer ones and they just seem... pessimistic about the rate of progress
Like pessimistic in an unrealistic sense
https://twitter.com/PicoPaco17/status/1768354131880656901 this is my favorite yann moment i've seen recently
My theory is that his idea of intelligence is too narrow—as in he’s got some religious thinking possibly affecting his imagination.
he presents this problem like language can't possibly solve it. but then people with inner monologs show up
How does someone not have an internal dialogue? Is he a buddhist monk lol
He keeps painting the notion that X or Y “isn’t possible” yet, who is he to declare what is and isn’t possible in an age where neuroscience is fueling the most advanced technology developed yet?
some people don't
i mostly dont have inner vision. just flashes of images here and there
mostly a chorus of me like a price is right audience
Huh, that's so odd to me
consider deaf people
I suppose you're right
So they just walk around with silence in their mind like enlightened beings? Isn't that the point of meditation
i dont think no internal audio means a calm mind.
I guess they could technically envision the world in images and sensations
ton of other senses and concepts that could steal attention
drugs provide more context, but i’m not going to advocate for them.
I meditate regularly, and I have experienced true thoughtlessness for seconds at a time, but I envy people who can do that all the time. I can't get my inner monologue to shut up lol
This is speculative but imagination and inner dialogue may be like a spectrum, and I'm definitely on the loud as heck end 😂
Vivid imagination constant music and voiced thoughts
mmh I heard they just kinda "feel" when something must be done, said, etc..
synesthesia is what we’re getting at here
I guess it's like aphantasia where people can't see faces in their heads but can recognize people anyway
there is a lot of sensory overlap, because the pathways in our brains that process our senses decussate and multiplex
totally
Back to the topic of yann lecun, if he has no inner monologue, does that mean he can't really ponder his ideas?
At least on the spot
ehh it means he suppresses his ideas which tells me they manifest as subconscious abstract and he probably reveres them more meaning he probably leans more into his egocentrism
aka stubbornness
(i might be projecting a little there)
It's probably more difficult without an inner voice to think something like "no this idea is trash" or "I shouldn't post this"
I'm a little out of the loop, what are we talking about? 😅
we’re talking about how yann lecun is an enlightened guy and all, but his predictions on AI have been pretty far-off
which generator can i use to make pixel art? without credits
He's made tangible progress to the state of AI so he deserves credit for that. In my opinion I think his long-term visions are just unreliable
Especially when he's put on the spot in an interview since he can't deliberate his thoughts
I think you’re right about that. His instincts are serving him well in the moment, but he doesn’t seem to have much of a future vision.
I think if he took other disciplines and perspectives into account with his thinking, he might have a different approach to the future—i.e. neuroscience
and hell, philosophy
aphantasia is kind of like a spectrum. its not the same as face blindness. there are people who can legitimately not recognize faces. i tell people i have partial aphantasia, but its technically called something else and i can never remember it
if i have vision its for a flash. like a strobe light hit it
I'm not d'
aphantasia happens on account of a specific region in the visiotemporal cortical area going offline
meanwhile, my eyesight and acuity is on point. 41 don't need glasses and can see better than 20/20
This may sound strange, but I think I have aphantasia for cars. I can't remember what my family members' cars look like
Oops, my goodness I keep sending my messages prematurely 😂
where visual qualia is mapped to facial attributes
I'm not 'diagnosed' with anything but I may have hyperphantasia, took a quick test once.
while i don't see stuff, in a moment i hear an indepth description of what i remember
it can happen with any region of the brain encoded to a specific pattern
if it goes off, then your brain loses the ability to process.
simple as that.and sometimes your brain can compensate for it in other ways
my going theory is my visual cortex works on real vision more than inner vision.
there are different layers of visual acuity, with distinct regions where the “math” is computed
all towards the very back of your head
yea we also know the brain is very plastic
there likely was some damage at some point
and the amount of wiring going back and forth is prone to a lot of dynamic growth
could be! probably true for most folks.
Isn't that the same thing?
/thinks back to all the hang overs and knocks ive taken/ hmm could be indeed
We only have inner vision, there is no such thing as real vision
the signal coming from the optic nerve rather than my inner nerves
there’s the visual sense, and then the imagination.
With technicalities aside, ^ basically what Alex said
at some point the visual cortex was like "fuck all that i'm gonna focus on the eyeballs"
phylogenetically speaking, eyesight is the most recently developed sense
and the frontal lobe is the most recently developed (evolved) region in the brain as well.
But you don't see what your optic nerves detect. You only see what your mind interprets the optic nerves' signal as. Not being pedantic, I promise
Clint—very poignant point
thats right. and theres a delay too.
but what i mean is theres signals coming from the world and the signals coming from inside
also—we technically view the world upside down, but our brains actually convert the signal right-side up. and each eye decussates (crosses over) to the other side of the brain.
lowkey we're all australian
ever found your blind spot? remember that xfiles where this assassin used the blindspot?
yeeeea!
hey, if you want to read about something trippy—look up Charles Bonnet Syndrome
some folks can have blind spots that are much bigger, called scotomas
and your brain does the same thing with those as it does your blind spot—it pulls a photoshop-style interpolation effect.
scotomas deez nuts... /shrug
Except with CBS, the scotomas are so large that the brain projects straight up crazy hallucinations
the brain hallucinates all the time
I see what you mean. So for example our imagination (memory centers?) sending signals to the visual cortex vs the optic nerve sending the signal
yes it does, we are always trying to find homeostasis. stability.
most crashes happen in the km near people's homes. i heard that once. cause the world is a reconstruction based on memory, not realtime
fascinating. and creepy.
given how unreliable memory is, yeh
hence the race for stability, lol.
We don't really have a memory. We reconstruct it every time based on a few tidbits
Kind of like neural nets
and whenever you recall a memory, you are degrading the proteins that have encoded the memory very slightly each time. so it’s never technically the same memory.
that one always bakes my noodle.
We're basically biological AI, using synapses and neurons to reconstruct an idea based on how it's wired
sounds like some never ending story shit
the door is open.
hi
hiiiiii
👋
Yes, but I'm not sure if they're trained only on high quality data or how they accomplish what they did in v6, they surpassed even dalle3 by...a lot
it’s trained on everything in the world, probably.
or—they have a fundamentally different approach to the text encoding.
Does that mean midjourney has to pay SD royalties?
no one is paying anyone anything…yet
lol
well, i mean….everyone is paying microsoft
$$$
the SD commercial license means anyone making money off of SD has to pay SAI
There's also the research license but that doesn't allow commercial usage
what I know of is Individual training and prompt-rewriting. There also at least appears to happen some enhancement at the final stages. When you watch a generation procedure it looks like SD, just with the final step the typical MJ look is being applied.
including a style prompt as well it seems
Midjourney hasn't surpassed Dalle nor SD3, it's in the SD3 paper
I think ppl are underestimating Dalle-3 just because they censor literally everything
Surpassed in what exactly?
he mad
when are we gonna get a sd3 discord channel where we can see community gens w new model
i mean yeah talk normally
grifters have not changed in the decades that I’ve spent online
you guys are all the same
i feel sorry for you 😦
bruh what?
this guy is trying to communicate. it’s not working very well. 😦
do you need an SD3 invite?
lol
Maybe he is a midjourney/dall-e spy 🕵️
oh, those are eeeeeverywhere
bro mov2mov extension is so gnarly. it hallucinates stuff that you cant make with img2img
any alternatives to mov2mov?
not yet
anyway to run SD from a batch file?
there are batch scripts out there, but temporal persistence is a general issue right now for what could be awhile.
maybe we could split a video into images, then run each image through a bat file
well if it could use the previous image as a prompt, with the settings of the new image together
every image represents a different input, and even with the same seed you’re going to get a different diffusion for each different input
something else is, like, if the bat file only runs if there is a missing file, u could just delete bad frames and keep running it until you have better temporal persistence
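The "only run for missing files" idea is simple to sketch in Python instead of a bat file. This assumes the video was already split into PNG frames (for example with `ffmpeg -i input.mp4 frames/frame_%04d.png`), and `generate` is a placeholder for whatever actually runs img2img on one frame. `regenerate_missing` is a made-up name.

```python
from pathlib import Path
from typing import Callable, List


# Hypothetical sketch: `generate(src, dst)` stands in for the real img2img call.
def regenerate_missing(src_dir, out_dir, generate: Callable[[Path, Path], None]) -> List[str]:
    """Run `generate` only for source frames that have no output frame yet."""
    src_dir, out_dir = Path(src_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    redone = []
    for src in sorted(src_dir.glob("*.png")):
        dst = out_dir / src.name
        if not dst.exists():
            # Output frame was never made, or was deleted by hand as a bad frame.
            generate(src, dst)
            redone.append(src.name)
    return redone
```

Delete the frames you don't like from the output folder, run it again, and only those get redone. It does nothing for consistency between frames, it just makes the delete-and-retry loop cheap.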
yeah but some prompts are very limiting, so it might narrow stuff down enough
it’s noise
i noticed using a bunch of specific negative prompts severely lowers the diversity of the output
i think if u could just hand delete bad frames and regenerate them, it would probably make stuff a lot more feasible
beats doing it all by hand, its not 1000x productivity increase but even a 50x helps
well actually right now its not even doable at all.
how do u run sd from a script anyway? because all the settings are temporary in the browser
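One way that should work with A1111: start the web UI with the `--api` flag and send the settings as JSON to `/sdapi/v1/txt2img`, so nothing depends on the browser. A minimal sketch using only the standard library; the default address, the helper names (`build_payload`, `decode_images`, `txt2img`) and the chosen settings are assumptions for illustration.

```python
import base64
import json
import urllib.request

# Assumes the A1111 web UI was started with --api on its default local address.
API_URL = "http://127.0.0.1:7860/sdapi/v1/txt2img"


def build_payload(prompt, negative_prompt="", steps=20, width=512, height=512, seed=-1):
    """Collect the settings that would normally live in the browser tab."""
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "steps": steps,
        "width": width,
        "height": height,
        "seed": seed,
    }


def decode_images(body):
    """The API returns images as base64 strings; turn them into raw bytes."""
    return [base64.b64decode(image) for image in body.get("images", [])]


def txt2img(payload, url=API_URL):
    """POST the payload to the web UI and return the generated images as bytes."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return decode_images(json.load(response))
```

Usage would be roughly `images = txt2img(build_payload("a castle", seed=42))` and then writing `images[0]` to a `.png` file. Since the settings are a plain dict, they can be saved to a file and reused per frame.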
Check out RunDiffusion.com.
Hey, i'm pretty stupid with this stuff, i purchased a subscription to midjourney specifically to use a stable diffusion generator to give me images of this building if it was rehabilitated; but since I am too dumb to figure it out i am blowing my fast GPU away
I put it in a server called Dr Toboggans Historical Rehabilitation
i'm totally lost in this server stuff; can someone help me? or tell me if it's even possible?
I'm not even sure what I'm reading here. You bought a MJ sub to use stable diffusion that is free?
yeah pretty dumb, i'm pretty computer illiterate
can it be done tho? i put the picture in the general-with-images hashtag below this one
can i do this on rundiffusion.com?
If you have a fast gpu as you say you can do it on your computer. Look up image to image stable diffusion on youtube and start learning
Don't have to pay anything except your electric bill
YouTube is a good resource for how-to’s on nearly all of the stuff everyone here is using.
No, i have a macbook pro from 2015, it kind of sucks; i think there was like fast gpu points on midjourney; i didn't realize there were other good/free diffusion softwares
But it starts with what you want to do—if you want to do an image to image, then search for that technique particularly.
image to image, thank you, i didn't even know the term to start
Ah, if you have a macbook you probably can just use midjourney then. They do the computing for you since your computer probably can't
AI is sort of a use-case thing right now for folks. People describe it in terms of what it does. i.e. txt2img, img2img, img2vid, vid2splat, etc.
thank you, i'll see if midjourney can do image2image; also should focus on learning a little bit more of this stuff; thank you guys
Just to get you started
The second link is more recent for the current version of midjourney (v6)
Also keep in mind that these models we’re talking about are free on their own—it’s how they get used that the money starts to factor in. Running it yourself? You’ll get pointed in the direction of the code repositories. Weak GPU or no GPU? Then you’re surfing the cloud. It becomes a matter of finding the cheapest way to get the code running virtually.
Approach it like that and you’ll save money
He already bought a midjourney sub so might as well use it
That is true! and Midjourney is a powerful tool in its own right
interesting, yeah i plan on building out a computer in the future; may be easier to run it on that in the future; for now, i have to figure out how to use midjourney to do image2image
thank you guys!
yep…just remember, we are in an age of answers. there are so many out there. you just gotta know what you want to ask. and try using GPT for stuff too if you haven’t started yet, you’d be surprised at how helpful it can be whenever humans aren’t around to help 😛
true, i'm so used to asking a person to "show me", been giving off boomer vibes for some time now
boomer vibes are okay. boomer vibes need to happen in order to evolve into a zoomer (unless you have an evolution stone)
ahhh….pokemon jokes
anyway
will the 800m sd3 loras be compatible with the 8b model?
Maybe eventually through new techniques like x-adapter, but the vanilla versions probably not
can we use Python 3.13.0a5 ?
if they added mmdit layers and froze the others as they added potentially yes, but I'm interested to an answer to this as well, I'd guess no
i've only heard one mention of the 800m param model in the original announcement. nothing in the paper
I'm starting to lose the hype since it's taking so long for the preview invites, ngl 
Unless you pull 10s of millions of dollars of venture capital out from under your couch cushions, I don't think SAI will care very much
Wait let me see if I can- nvm
The hype ain't for us sadly
Yeah it seems that the hype was mostly for investors
They need to do a better job of labeling the hype trains so us bottom feeders know what’s taking us where
you underestimate the bottom feeders ability to entitle themselves
i deserve sd3. i've worked hard for it!
I think it didn't hype anyone, they said few dozen of invites have been sent and to look for #sd3 posts, but after a week looks like only one or two people got invited
Most of the hashtag is other people spamming stuff unrelated to sd3. Just because so much thirsty attention is on it
SAI has to know that more people become impatient every day, so it probably won't be more than a few weeks before mass invites are sent out
nice, i was right lol
800m model is an afterthought
every smaller model was an afterthought
are the parameter counts of the other sizes known? Or just 800m and 8b known so far?
the paper mentions a 2m param model i think
I think most of the people who were invited so far are people who develop things for SD models
Not randoms who put clouds in a bottle or generate booba
what about people who make hourlong youtube videos with hundreds of examples of them with the same pair of glasses wearing every kind of superhero costume you can imagine
and gingers
800m, 2b, 6b and 8b
Sd3 how long will u make my poor heart wait 😥
more than a month
😦
not on launch
will SD4 loras be compatible with SD3
Yes, should work too
hey mates
👋
nice to meet you
I want to prepare datasets for SD3, but will there be anything different from SDXL?
When preparing datasets for SD3 (Shoelace Data 3), compared to SDXL (Shoelace Data XL), there might be some differences based on the specific requirements or characteristics of each dataset.
Here's a general some differences you might encounter.
Data Size: SDXL typically refers to extra-large datasets, implying a larger volume of data compared to SD3. SD3 might be smaller in size, containing a subset of data or focusing on a specific aspect or subset of the larger dataset.
Scope and Coverage: SDXL datasets might cover a broader range of topics, domains, or categories, while SD3 might be more focused on a specific area, industry, or domain.
and so on...
hi ! i m using webui to run Stable diffusion , is there a pc app to run it local ? i didn't find one.
webui is local, it's only using your browser to display stuff
you can disconnect your internet and it will still work
i know ... i m looking for a interpreter that can run SD
??? there s none up to date as i know ... that s why i ask
using webui is very complex for normal users
you mean programs like comfy or Automatic?
have you tried fooocus, i've heard that's easy to use
what do you mean? you can download fooocus and run it locally
i dont code python lol
you don't need any coding experience for fooocus, A1111, or comfyUI
i m using a1111
and what do you dislike about it?
it s to complex for normal user ... not for me lol
oh you mean run an LLM locally to use SD?
what's a normal user?
do you mean you don't want to bother with step size and stuff, isn't fooocus then pretty easy to use?
A1111 you can also just copy settings from others and then go from there and only mess with stuff you know, like prompt
i got to run 3 prog for full ai power ... a chat ai , open interpreter , and webui ... why not make one app ?
oh i havn't heard about open interpreter
i test it ...very powerful but it very dangerous too lol
dangerous how?
you can harm your own pc ... lol ooops be careful with it
open interpreter use python code to execute command in your pc ...it can do research on the web for you , it can start app ... even format a disk ooops
oh dang, yeah i wouldn't use that after my experiences with how bad language models can be lol
i guess you could run SD without a user interface
the problem , is to run SD in a web page use ram for nothing
a browser is not the best interface ...
also a browser is too easy to hack
yes, you might want to look how to run stable diffusion through python code directly
so i ll keep looking but i think some interpreter will soon come wt options wt SD models
sure
like jan.ai , gpt4all , lm studio , ... someone will do it ...who will be first ??? lol
i'm surprised that those LLMs run locally on 6GB VRAM already, havn't kept up to date in a while, might check out some of it as well
i m running SD 1.5 wt controlnet on a gtx1060 ti 6g Vram laptop and it make 688x470 ( wt 3 ctrlnet running ) under 3 min lol
it very powerfull ... i m totaly amazed
it s a bomb lol you dont know what you created in the world of CGI lol
Out of Gpt4all and LM Studio, I would so pick LM Studio. GPT4All is a glitchy mess on my system, and models take forever to load 😕
a agree ... i tested alot of them ...lm studio simply rock lol
multi AI now lol
yes they did that ..you can run 2 or more AI in 1 pc
lol
I wish I was a rich kid who could afford 2 4090s and 128 GB of RAM 😂
it s pure madness
i bought a new laptop last year with 8gb vram to be able to use SD and it's running quickly as well, before I had to use CPU and wait 50 minutes for every image lol
Runs SD3 + a 30-billion parameter LLM on his PC at the same time
Oh God 😭 When I first saw "00:40:00"... This is why I got an RTX 4070 Ti
Depending on the SDXL checkpoint 1024x1024 usually takes around 30-40 seconds
Base SDXL takes around 10-12
@gritty dust , well my friend got a gtx 1650 wt 12g of vram laptop ... i ll try to find a used one soon
i cant run SDXL in 6g of Vram
you could try comfy, i think that uses less vram than Automatic
@stone latch i got a lap ... i cant change my GPU
I have always been repelled by ComfyUI because it looks so convoluted, is it hard to use?
My little brain might not be able to handle it 😅
yeah, buying a desktop was also out of the question for me, so i got a laptop with a 4060
the idea of working wt laptop ... it cheaper than rtx 3060 lol ...no hope for a 4090 at all lol
i downloaded it yesterday, so i am not sure how easy it'll be
Now I'm compelled to try ComfyUI but I wonder if I have disk space...
@wally dm me a link plz
for comfy?
yess
it's in #1080946152318443610
I actually started with InvokeAI first (now known as Invoke) in my baby days of AI image generation, and at the beginning of this year I mostly switched to Automatic
see runing a aI use all the ressource of your pc
A1111 has way more capabilities and settings to tweak... That are easy to tweak
so having laptops let you work wt different ai and event share them with a local server ...yes
@stone latch yes i agree , but i m lookinf for a pc app that can run SD localy ... i dont tink someone did it yet
it s the concept of browser app that bug me
Not a thing yet to my knowledge. I only know web UIs 😕
me too
yeah i also don't know why it's always webui, browsers take up precious ram
There might be a reason for this, but I'm unsure why someone hasn't sat down to write a C# or C++ application for Stable Diffusion 🤔
Dall-E 3 for prompt adherence (currently) + SDXL ControlNet to enhance the image and do inpainting if desired 😌
you can't run Dall-E 3 for free, nor locally though
I did this to generate portraits - Dall-E 3 because I often like how it does male faces, then SDXL to add realism
i got a issue with AI that are not local , i study local ai only lol
True that, but at least Designer lets you use it 15 times 😅
i want to try some workflows where you multiprompt sspecific regions.
promt adherance is less important if you can specify a region in a picture where a thing is supposed to be
my freind pay 30$/month for gpt pro ... i show him the power of local ai ... he fall down lol (chat gpt pro got a limit of 40 question every 3 hours) ... make me laught again
$30?
The subscription is $20 without VAT 😅
Because it's all based off libraries that are written in lower level languages. Dig into the massive pytorch library
you don't know which $
Ah my bad
All the python level code these generators use is just tapping into the API of torch
The actual hard calculations are done at a very low level
ok ... i m not that deep into coding ... i run it without any internet ... it work but controlnet need the web to edit its control lol so it not perfect yet
i use 2 webui ...one for my chat AI and the other for SD ... why 2 webui ...i just dont get it
i can run lm studio as a server for my AI ... but i cant run it with sd
yes ... auto generation of original visual content ... lol that s the plan
that would be a bomb
lol
we can do it via open interpreter ..but it s alot of work
... a guy ! make open interpreter deliver him a pizza at home ... wt a txt line ... it s that powerfull ..
my problem a cant run both at the same time on my pc
you need 12g vram to try it
With comfyui you could just integrate a LLM into the image generation process. It would offload the LLM model after the inference is done to free the resources for SD.
i need 2 laptop
ok good i didn t know , everyone talk about comfy ...i install it tomorrow ..
open interpreter do more then just answering question ... it can run your pc
i need it
most of the APIs for LLMs like openai tend to have a specific command for "keep alive" and if you set them to zero in the call from comfyui, the model will unload after generation
i use only local AI ,
you just have to make sure to call it in the node when you send the request to llama.cpp/python or ollama or whatever local server backend you're using
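To make the keep-alive bit concrete, here is what it could look like with Ollama as the local backend. Its `/api/generate` endpoint accepts a `keep_alive` field, and `0` asks it to unload the model right after the reply. The default address is assumed, the helper names are made up, and the model name in the usage line is just a placeholder.

```python
import json
import urllib.request

# Assumes a local Ollama server on its default address.
OLLAMA_URL = "http://127.0.0.1:11434/api/generate"


def build_request(model, prompt, keep_alive=0):
    """keep_alive=0 asks Ollama to unload the model as soon as the reply is done."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # one JSON reply instead of a stream of chunks
        "keep_alive": keep_alive,
    }


def generate(model, prompt, url=OLLAMA_URL):
    """Send one prompt and return the text, leaving VRAM free for SD afterwards."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_request(model, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["response"]
```

Something like `generate("llama2", "write an image prompt about a warrior")` would then hand back the text and free the memory before the image generation starts. Other backends may use a different setting for this.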
That might be a bit of a problem for a comfy beginner 😬
@narrow kernel i m new in this ..lol
@honest mica exactly
i just discover SD a week ago ... i m amazed
i m a cgi artist ...everyone is freaking out lol
Prepare yourself for a steep learning curve with comfyui. But you will get rewarded with high flexibility.
good
then think sd will steal there job ...they are so wrong
it will make it more creative lol
Yeah, the initial image generation is only the starting point. There's still a lot you can/have to do by yourself.
anyway i ll give you news tomorrow about comfy
Load the portable version from the GitHub and don't make the mistake of going through the installation process.
@honest mica sincely to get excaly what you what from sd it will take you 3h of prompt testing , in 3 h i can do 10 images lol
so ...we are far away to be replace lol
but for creation it blow me off totaly
change backround in a second wow
it speed up the process alot
True
... you all know the expression ...with me , without me ... well with AI , without AI ... lol the new version of it lol
Well, you say you've been using it for a week. Have you heard of controlnets already?
yeah
controlnet is impressive
i tried many options but it's still limited to SD 1.5
still hand and feet problems
need to train it to get better results
there are so many extensions ... learning them all will take a year lol
mastering controlnet will take me a month at least lol ...
ok comfy is a node-based UI ... yes i read about it ... hmmm, need HD space for it
anyway thanks for your advice
Hey all ... I've been off the server for a while so I might have missed it ... I can't find the bot channels to put in a prompt any more ... sorry if this is a noob question ... is that no longer available?
Oh, just saw the update on #1047610792226340935 , nevermind me
10 years ago, who would have thought that AI programs would struggle the most when asked about basic maths and not creative prompts, and that image generation can draw nice organic shapes, but can't do text, straight lines and proportionate repetitive building elements.
@gritty dust .... i need to test lora ... if you give it building references ... it can create a city
lora is a major part of SD
in fact ... SD is the only AI that can be trained locally ... that i know of
#💬|general-chat hello
I have learnt that deep faking people into animated characters is incredibly difficult
Especially really stylistic characters
so then we will get loras for 4 different sizes and none are compatible with each other? Sounds horrible. You will basically never be able to combine multiple loras of your choice because they are probably all for different sizes... or are they compatible with each other?
Is there a general online solution for an actually good quality upscaling of watercolor or oil paintings ?
what does the controlnet do, and where to put the file?
you dont need it to use SD, it's for more control of image generation. Skip it for now until you figure out what it does, then learn how to use it. It is a powerful tool.
hello guys, i want to install the extension DreamArtist-sd-webui-extension. i installed it like the github page says, running the command git clone "https://github.com/7eu7d7/DreamArtist-sd-webui-extension.git" in the folder c:/stable difusion/extension, but i don't have the tab to access the extension when i open Stable Diffusion. could someone help me?
you dont put it in the folder, there's an extensions tab in the actual webpage when you have it open, and a sub-tab called "install from URL". That's where it goes
ok, so i have to paste that in the url "https://github.com/7eu7d7/DreamArtist-sd-webui-extension.git" right ?
yes
thanks a lot
well, did that, but no dream art tab in the ui.... know why?
you have to apply and restart your browser, and make sure it shows under extensions ( the main tab )
done that, still missing
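One thing that may be worth checking, assuming this is the AUTOMATIC1111 webui: it loads extensions from the `extensions` folder (plural) inside its install directory, so a clone into a folder named `extension` would not be picked up. Below is a small sketch for seeing where the clone actually landed. The function names are hypothetical helpers, not part of the webui.

```python
from pathlib import Path


def find_extension(webui_root, name):
    """Return the extension's folder if it sits in <webui_root>/extensions, else None."""
    candidate = Path(webui_root) / "extensions" / name
    return candidate if candidate.is_dir() else None


def list_extensions(webui_root):
    """List the folder names found directly under <webui_root>/extensions."""
    ext_dir = Path(webui_root) / "extensions"
    if not ext_dir.is_dir():
        return []
    return sorted(p.name for p in ext_dir.iterdir() if p.is_dir())
```

Calling `list_extensions` on the webui install folder should show `DreamArtist-sd-webui-extension` if the webui can see it. An extension that is present but still shows no tab may have failed to load, which usually shows up as an error in the webui's console output.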
haiyaaa, whether I need it or not is my decision.. I only wanna know how it does the job
fair enough, answering or not is my decision
that was rude btw
its a lengthy explanation so i dont think anyone here has the time to explain all of it so its better for you to read this whole article https://stable-diffusion-art.com/controlnet/
ok, thanks
it's becoming clear to me now why ai bots will eventually replace search engines. the culture is changing, and so is how people research
Hello folks. I have a question. Does anyone happen to have a tutorial or how2 that explains sensibly how you can host a GPU or alternatives. I have to admit that I'm unfortunately too stupid to understand all the providers' own docs.
Option 1: you give them money, they let you borrow GPU
Option 2: you give store money, you take GPU home
but to answer your question, i dont know of anything other than youtube guides
are you asking how to be paid to lend out your GPU or how to rent a cloud GPU?
when you say host a GPU, that normally means your GPU is being used by others
definitely both. Most of the time it only says €1.50/hr and I wonder if that's the full price or what else is needed.
I've only rented them, the price/hr depends on the service and the hardware, like Vast is really cheap for like a 3090, I can get one normally for like $0.25/hr US
renting out your gpu is like the 2024 version of torrenting, lol
gotta have a good seed ratio 😄
my intention is this: I would like to run Stable Diffusion myself, but my own GPU is far too weak for that...lol... I only have a GTX 970. That's why I wanted to get the necessary GPU online.
yea plus cant really make money with just a 4090 because clusters offer better hardware and prices
Hello - is it publicly known what the enterprise level of SAI entails? (% of revenue? something else?). I emailed the address 7 times (SEVEN) and never got a reply. This is on behalf of investors who are hesitant to back entrepreneurs without first understanding how much SAI will take of the final revenue. I hope this is the right channel to post, thank you.
I imagine you're trying to set up a situation where it's always running and you want to be able to offset the cost of that, I think that will be difficult
tbh, no one is going to rent a 970
no i think he wants to rent a better gpu online because he has a weak one
hah, I think they were saying they can't use the 970, so they're trying to set up a cloud resource for permanent use
he said both, as in renting his out, and renting a cloud one
ok, I misunderstood then. if that's the case, that'll never work
I never said anything about wanting to rent out my own. could be due to my poor English. it is not my native language.
best option is just to rent a cheap cloud gpu when you need to do something. If you're using it for both inference and training, over time that will get expensive, and you should consider buying a better GPU
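The rent-versus-buy trade-off above comes down to simple arithmetic. Here is a rough sketch, using the $0.25/hr rental figure mentioned earlier and a card price that is purely a made-up placeholder. It ignores electricity, storage fees, and resale value, so treat the result as a ballpark only.

```python
def rental_cost(hours, hourly_rate):
    """Total cost of renting a cloud GPU for the given number of hours."""
    return hours * hourly_rate


def break_even_hours(purchase_price, hourly_rate):
    """Hours of rental after which renting has cost as much as buying the card."""
    if hourly_rate <= 0:
        raise ValueError("hourly_rate must be positive")
    return purchase_price / hourly_rate


if __name__ == "__main__":
    rate = 0.25    # USD per hour, the 3090 rental figure quoted in the chat
    price = 800.0  # USD, hypothetical purchase price for illustration only
    hours = break_even_hours(price, rate)
    print(f"Renting matches a {price:.0f} USD purchase after {hours:.0f} hours")
```

If the expected usage is far below the break-even figure, renting on demand stays cheaper; heavy training use pushes the other way.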
no worries, it happens 😉
some people have made free services like kaggle or colab work, but it's definitely a lot of hassle
i need to order a new ssd, a 4090, or rocm needs to come to windows asap. the 7900xtx is great, but still kinda slow in windows, and my linux ssd is decaying
oh man...lol...you confuse me just like the explanations on google cloud colab...lol...it gives me a headache reading it. 🤣
haha, i get too excited about tech
I'm on a 6800xt in linux, works ok for most things, but I do use cloud for training
i'm signing off, going to spend the day in Akiba tomorrow so need to rest 😄

