#💬|general-chat
the usual suspects, comfyui, stable swarm, auto1111, forge. Take a look at the pinned guide in #🤝|tech-support and pick your poison
yeah thats exactly what i was looking for - cheers!
Hello!
Hi
are AMD gpus pretty good for image generation now?
i wanna upgrade my card but i also wanna generate stuff
my vote goes to swarm
for basic generation sure but for advanced stuff youd hit some hacky drivers and limitations
nvidia works pretty out of the box and amd sorta lost the ai race for now
Flux is still overall the best model especially with the flux toolkit controlnets and redux.
Sd3.5 is better for artistic styles though. There is also a small 2b version which is nice.
where do you hit the limitations? video?
or training and stuff like that
from what i understand yeah
A high-resolution photograph of an ancient, towering tree with massive, gnarled roots extending across a vibrant, emerald green forest floor, intricately detailed bark. The tree is adorned with ethereal, glowing bioluminescent flowers and vines, casting a soft, magical light. Perched on a branch is a majestic, white bioluminescent owl with luminous feathers that glint in the golden hues of the setting sun, casting long, dramatic shadows. In the background, a pristine mountain range with snow-capped peaks and a crystal-clear lake reflecting the scene. Ultra-realistic textures and lighting, high dynamic range, 8K resolution, photorealism, and volumetric lighting. Artstation, trending on CGSociety.
drop the artstation and trending on CGSociety - that data is going to bring up a LOT of stuff with a more drawn and painted quality to it. it's mostly injection noise anyway. also drop photorealism as it's a painting term and makes AI's think of painting effects. 8K resolution isn't going to get you anything, it's just noise. but HDR might be a good term to add into that prompt.
just wondering, did you get most of this prompt through a llm?
All of it, lol. I just needed a test prompt, this wasn't meant to be a final image.
phew. i cant imagine someone unironically thinking up a prompt like that naturally
call me stupid but tag based prompting is the best way imo
For flux as well?
i havent messed with flux a lot but natural prompting using words like whimsical, elegant, etc somehow just make me irrationally mad
Tag based prompts with flux works well too, it seems more creative with it imo.
Do you or @desert dagger know a good source for prompt terminology? Words that when added will give nice results or effects? I know this totally depends on the model used. But I have seen a lot of popular words like "whimsical" or "Canon EOS R5" (or some other model). Would be nice to have a list of such words that are a common occurrence in the training data and thus will yield good results.
does tag based prompting work well with illustrious?
yes. its trained on danbooru images a lot and well thats tags only
artist tags on illustrious confuse the hell out of me. is there a file with artist name and style out there somewhere because there aint no way people are memorizing all that
i would like to meet the guy who decided whimsical was a acceptable prompt for a civil discussion. i know some models have a image map
unless they are 😭
i memorized some lmao but theres actually a great source
sec
this COULD contain nsfw but i cant check a gigantic space like this https://apple-puddle-pair.glitch.me/?source=https://files.catbox.moe/jh6pb9.json#space
(used https://civitai.com/articles/8977/epis-embedding-maps-and-graphs "artists 1024" for this one)
if you really zoom in you see a image for a artist style ish
if you find a artist you like i recommend searching that one up on danbooru/gelbooru etc and copy the artist tag into your promptbox
granted for illustrious youd get better results than for base SDXL and if its not a well known artist you might need to use a lora
thanks! ill check it out
it also has a character list (2048 images) that shows supported characters (mostly)
i have about 7000 hours of deep dives into this over the last 2.5 years, so most of what I know is hours of work, one term at a time, with all the different stable diffusion models and all 3 of the flux models - exploring exactly how the models react to and use the tokens. i can give you a link to my spreadsheet if you want, it's got a fraction of that in it but you might find it useful. 99% of it is now learned information i just simply know at this point.
If you have a public spreadsheet, please share the link. I would love to read it.
it's public https://docs.google.com/spreadsheets/d/1bdidA4w5pB2BQMyxhkFu710Nu5bzKM1E-Wh_38ZZlT0/edit?usp=sharing i share it out once in a while. make a copy for yourself unless you like confusion as i update it just about every day - and have been known to totally rearrange the entire thing a couple of times
Is there a base prompt or is the modifier the only prompt?
there's no such thing as a base prompt. if you use a single modifier, like you see on the flux modifiers tab, that is the entire prompt
this is how you find out what the AI thinks about by default when it sees various terms. you give it just that term and generate a few times - i suggest around 10 times - and then study what it did
that's going to give you the top of the bell curve for that term and the most likely data associated with that token
so you get a feel for what it'll pull up when you use that token with other tokens in a prompt
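a rough sketch of that probing routine, in case it helps. it only builds the list of jobs (one term as the whole prompt, about 10 seeds each) and leaves the actual generation to whatever UI or backend you use. `build_probe_jobs`, the job dict layout and the seed range are made up for illustration, they are not from the spreadsheet or any particular tool.

```python
import random

def build_probe_jobs(terms, runs_per_term=10, rng_seed=0):
    """Build one generation job per (term, seed) pair.

    The term by itself is the entire prompt, so the outputs show what
    the model associates with that token on its own.
    """
    rng = random.Random(rng_seed)  # fixed seed so the job list is repeatable
    jobs = []
    for term in terms:
        for _ in range(runs_per_term):
            # Hypothetical job format; adapt it to your own generation backend.
            jobs.append({"prompt": term, "seed": rng.randrange(2**32)})
    return jobs

jobs = build_probe_jobs(["whimsical", "isometric"])
print(len(jobs), "jobs queued")
```

then you look at each term's batch side by side and note what keeps showing up.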
Hmm. But if you use the token "small", you couldn't really tell from results how that would affect other tokens, or could you? I mean intuitively we all know what effect "small" should have, but could you actually deduce it from just the output using that token?
that's a different exploration. first, you want to know what 'small' means at all. then you go down the 2 and 3 word phrase path with that
for a single word prompt you should get content that's all over the place IF that word is not a noun
hi
but all the content should represent the term. you shouldn't get fantasy elves for a term like unconscious
you should get things that are dead, asleep, or represent unconscious in some way
Well, if there's a lot of unconscious fantasy elves in the training data, you will definitely get fantasy elves. I had this issue with "isometric" when using dall-e. It would sometimes generate a grid, because most images in the training data that were captioned with "isometric" were presumably screenshots from CAD software.
ROFL! well then you should see an elf that looked unconscious, not one that's playing a harp or shooting a bow or something
but that's the point - you will get what's at the top of the bell curve, what the most likely data is. and if something like 'unconscious' gives you elves having a feast - that gives you an idea of what the data the AI will retrieve is most likely to be when you use that term. for stuff like 'trending on artstation' you should get random data but all with a more artsy look - as that trending tab on artstation is a specific thing - so what the AI learned when it learned that term is whatever was on that trending page when the LAION crawler indexed it into the database. that's a lot of random, artsy noise
Hi, I’m looking to make a PS2 style image but I am quite new to this. What steps can I take to generate it? Any website or tutorial? I’m doing it from my laptop so hopefully there’s a cloud platform that allows me to make them
thanks but it just allows me to download it or run an app on macos/ios. any way i can download the model and use it on another platform? if so which one?
Probably flux with some lora or sd 3.5. like most fake influencers do
You can definitely use those on civitai itself. If you want to run it locally you need SD 1.5 for this LoRA. You can download and install a UI of your choice.
Anyone here who has takes on what the US govt should be doing about AI can submit up to 15 pages of them via a Request for Information for the development of an Artificial Intelligence Action Plan: "Through this Request for Information (RFI), OSTP and NITRD NCO seek input from the public, including from academia, industry groups, private sector organizations, state, local, and tribal governments, and any other interested parties, on priority actions that should be included in the Plan." https://www.federalregister.gov/documents/2025/02/06/2025-02305/request-for-information-on-the-development-of-an-artificial-intelligence-ai-action-plan
You have until March 15
thank you very much
you're welcome
By the looks of it Trump and co are "all in" on AI.
yep
Why do people think there's a meritocratic future with ai prompting
Like the point of the tool is that it draws really well for you, just have fun with it at least.
Idk if you wanna make money with the prompter setup a list of prompts that generally work like 200 prompts and just have a bot cycle through those 200 prompts on twitter and post 7-8 times a day youll make money pretty fast if you wanna make money off the prompter
did someone tell you they thought they were going to make a living from prompting?
cos this feels like what is known as a straw-man argument
anyone online please
how to prevent generating characters' clothes with wind effect
SD always generates it like wind blows from below
what are you using for a prompt?
hi
/subscribe
Ban this guy
Civitai moderation took down an article a few days ago that I was going through discussing the harm of synthetic images in data sets. I dont understand why civitai would take down the article discussing this paper and they didnt list a reason as to why the article was removed and instead its just gone. It is confusing because now I cant actually prove this paper was floating around civitai not even a month ago because the evidence that it ever existed in that way is gone. You seem to at least try to be honest so maybe you can explain whats going on. Here is the paper Im talking about. https://arxiv.org/pdf/2311.12202 @finite cloak
why are you complaining about civit on this discord instead of on civit's discord? and why would you assume he works for civit?
I was just asking in case theres something actually wrong with the paper or not that would cause moderation I guess I could ask people at civitai but im already in this server man. Idk maybe you can answer the question.
Maybe you could suggest to me someone else who is accessible to ask questions
just got stable diffusion on my comp. My first image blows. just thought Id let yall know.
if your issue is with civit - you need to actually talk to the people that own and run civit. so i would suggest you go to their discord to find them
@drowsy hearth Maxfield
Whos that?
You dont mean the illustrator i would think right?
a co founder of Civit, also now involved with Stability and is here
I think the paper is misleading but that does not mean it should have been removed from Civit
having said that it is likely something else happened to cause the removal
but very afk - and has been all day
ya i sure he is a busy man
yeah, and just starting this job, so all the stuff you have to come up to speed on with a new job as well
What about it is misleading?
we quite regularly have diffusion models with like 50% or more artificial training data these days
Is it pony that has artificial training data?
or upwards of 50%
cuz the one Ive seen discussed the most when it comes to artificial training data is pony
pony doesn't really count because it was not trained right
I see which one are you referring to specifically then
cuz I wanna read about it
also what about pony wasnt trained right because when I use the pony model it seems to just visibly outperform every stable diffusion model.
well it depends on if you test it on its own terms or not
if you benchmark it using standard benchmarks it will not do well
but that's not necessarily fair to the model
what does a standard benchmark test look like?
every study they've done shows that synthetic data is actually cleaner and results in better trained models
All that matters to me is how well it responds to composition prompting
that has to do with your encoders
ive found like 100 that say otherwise but Ive also read that people have gotten away with it
shrug. every lora i've made are 100% synthetic data - an image is an image, it doesn't matter where it came from
good image = good quality basically?
usually. depending on what you mean by good
high quality, clean, general 'good'
From what I understand the model collapse problem has to do moreso with the bias that ai generated content introduces and less to do with the quality of the content
also an ai generated lora isnt going to occupy anywhere near 3% of a dataset thats numbered in the billions
do you actually know what model collapse means?
its a point of no return a model can get to where you cant retrain the model to fix it after it accumulates too much bias.
yeah. if you have an image with good clear details and not a lot of confusing information, it's going to work much better than one that is hard to figure out as a general rule
Like you dont have to patronize or gaslight because I have 5 articles written by the university professors in ml that you love to glaze
you can always retrain a model though
one (or two)more question(s), it doesnt seem like most models handle western style animation/cartoon styles well?
im not sure if i just have been unlucky in finding a good model for it or im just ass 💀
but do YOU recommend any models for something like that?
the article i posted had them attempt to retrain the model and it wasnt possible so by definition there is a way to corrupt them regardless of how uncommon it might be right?
no one's patronizing or gaslighting you? i asked a question
Like if someone indefinitely corrupted an ai model on a super computer then that means its possible to indefinitely corrupt an ai model regardless of how common or easy it might be is kinda how i understand these things
it has to do with how you prompt
no this isn't the case
you can always retrain
how i prompt while training? or in general?
I see it doesnt really make a difference either way. But if they didnt manage to fix it with retraining I assume youd basically have to completely retrain the model with another massive dataset.
Idk I dont have a super computer to test that kinda thing on
the amount of data needed to fix it could be large yeah
how you prompt when generating an image
a lot of things are just about budget
they created LoRA in order to not have to retrain
could you give an example of what i might need to prompt in order to get the style
like quite often when big models come out people say they can't be trained, but with enough budget they can be
do you have a link to an image that is in the style you're after? if so, DM it to me and i'll try to work with you on prompting for it
yeah - but most people don't have thousands of dollars to do that training
they should probably just say that then
The only real problem with using synthetic data for training is if you use junk data. example: use generated images but don't clean them up. they're full of odd strange stuff, and the AI learns that's what it should create. clean them up - you don't have that problem.
the same with data for LLMs - go through it, fact check, make sure it's all correct and accurate - it doesn't matter if you scraped it from the web, you wrote it, or a machine wrote it at that point.
yea like if the synthetic data point is an image made with a diffusion model but you did tiled upscale on it and some kind of post-processing pipeline, it would probably be as good or better than most photos
sweet thanks, give me a sec
its just a fine tune
I always disagreed with Civit ai calling it a separate model
same for shuttle, its just schnell
Thats interesting cuz when I use it. The way it works is kinda like a basemodel were you need a checkpoint to really get it doing stuff.
But it only has like 80,000 images at the same time how exactly does it work cuz Im very confused
its just an SDXL fine tune that everyone went crazy about
there's not even more to it than that
the clips changed but that's not unusual for an SDXL checkpoint as they also can train the clips
it's not a fine tune. it IS a base model. what he did to it destroyed the embeddings.
All I know is it seems to work quite well
and pony apparently has a ton of synthetic data in it and the way it works without more finetuning is a lot like a basemodel and it responds to finetuning like a basemodel too. Thats all I can definitely say Ive experienced with pony
The only difference is that its very performant when compared with its competition and better than alot of its competition in its compositional variety, quality etc.
this is an over-reaction really
people train the text encoders during fine tunes sometimes, and if you train something it is going to change
normally training the text encoders of an existing model is not seen as a new base model
its trained on types of things you generate not surprised you find it good
well yea anime waifu's, femboys and furries are elite as fuck bro. Theyre literally the bees knees.
it's not an over-reaction. it destroyed the embeddings. it's listed as a base model because it is.
Do you still gotta use that silly oops that was trained in? the uptag_quality_1234 crap
Yall seem to dislike pony diffusion
i dont dislike it, i just dont need it
yes - and he goes into a lot of detail why he requires those tags on a page on civit
you never did answer the question of what grade you are in
the goal of a fine tune is to change things though
like RealvisXL changed the weights of SDXL base a lot but I wouldn't call RealvisXL a new base model
not like this. he deliberately destroyed the embeddings
are you using the word destroy to mean change?
https://civitai.com/models/257749/pony-diffusion-v6-xl if you scroll down that page, you can read what he says
i mean obliterate
"Pony Diffusion V6 is a versatile SDXL finetune"
yes. scroll a bit farther and you'll find his discussion about those tags
but what does this mean in practice
when I tried the model I prompted R2D2 in baroque palace with various objects and it made the image
it has had a lot of changes to the clips
and a lot of overfitting combined with forgetting
but it was not fundamentally different
I wanna draw an AI model do you think if I used like a 10 image finetune to just put in my render style then drew only the control nets for the rest of the 70,000 images if I could get an ai model thats really narrowly or even more focused like pony diffusion
and this is his original article https://civitai.com/articles/4248
70,000 linearts is a lot of work Im not unfamiliar but its also just very possible to do and the amount of painting over I have to do with any given ai generation is like 30 seconds to a minute.
i think its pretty cool, nice big tune, successful, but im not allowed to use it cuz i dont know the difference between cartoon and anime
I don't know the difference either TBH
damn boomers
when you train SDXL you don't train on controlnet input images. Not sure if that is what you meant, but if it is, its not the way to go
I have no idea what civitai moderation would be up to. This paper however is one of those "poisoning techniques" which get frequently misinterpreted and abused. In this paper they repeatedly retrain one model on its own outputs, which is an interesting bit of research into the workings of AI models. However it's bound to lead people to go "aha! posting ai images will destroy future ai models because they'll be poisoned by synthetic data!" which, no, is not the case. AI models often benefit from synthetic data, and are often intentionally trained on synthetic data (for example Flux Dev is trained entirely by synthetic data generated by Flux Pro) -- it's specific details within the process of iteratively looping a model's own data onto itself that breaks it down here. Which is a process that never happens "in the wild", as of course you can't have released a model and also have not-yet-trained the model. (Note also that paper was based on SDv1 or 2, a model class that was known to break down with this type of artifacting if you even sneezed at it too loudly, due to a heavy underuse of normalization calls.)
This type of misunderstanding was most famously pushed by the nightshade/glaze authors, who discovered a very similar technique (poison a single precise target model in pretraining, which of course is never possible to achieve "in the wild" cause the model is already trained by the time you can target it), and then scammed artists in the public by claiming "run our magic software to stop ai training on your art", while in reality it would only stop the training of a very specific AI model that has already been trained, and does nothing regarding future models that are still yet-to-be trained and might incorporate that data. It was also tested and confirmed that nightshade/glaze doesn't stop lora training even on the target model, ie it does nothing of practical use.
So to return to your question: I don't know what civitai mods would be doing, but I can imagine exactly the way discussion on that paper would've gone down stupid/misleading/scammy directions and needed to be removed.
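For anyone who wants to see the "loop a model on its own outputs" effect without a GPU, here's a toy sketch. It is NOT the paper's setup: a single Gaussian stands in for the model, "training" is just re-estimating mean and spread from samples the previous generation produced, and `self_training_loop` plus all its parameters are names I made up. With small sample sizes the estimated spread tends to drift downward over many generations, which is the same flavor of feedback loop, just in one dimension.

```python
import random
import statistics

def self_training_loop(generations=50, sample_size=20, seed=0):
    """Repeatedly refit a toy Gaussian 'model' on its own samples.

    Returns the estimated standard deviation after each generation,
    starting with the original 'real data' value of 1.0.
    """
    rng = random.Random(seed)
    mean, std = 0.0, 1.0  # generation 0: stands in for the real data
    history = [std]
    for _ in range(generations):
        # The current model generates the next generation's training set.
        samples = [rng.gauss(mean, std) for _ in range(sample_size)]
        # "Training" here is only re-estimating the two parameters.
        mean = statistics.fmean(samples)
        std = statistics.pstdev(samples)
        history.append(std)
    return history

history = self_training_loop()
print("first std:", history[0], "last std:", round(history[-1], 4))
```

Note what makes it degrade: every generation sees only the previous generation's outputs and no fresh real data, which is the part that doesn't match how released models are actually trained.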
@fervent thunder new toy https://replicate.com/lucataco/dotted-waveform-visualizer
if anyone here uses ADetailer in tandem with Ultimate SD Upscale, do you know how to mitigate the obvious mask that it leaves? ty
The part about it thats interesting to me is that the reason the models seem to have collapsed isnt so much because the data is synthetic but moreso because synthetic data can have a lot of bias. I wouldnt particularly want to have synthetic data be untrainable as I have enough foresight and have used models enough to know theres ways to make it work. Also was the flux dev model a distillation of flux pro? My plan with ai tools and the pet project Im working on has a lot to do with training models on synthetic data. I basically want to make a model like pony thats specialized so specifically that it would essentially create a model development process that couldnt be otherwise easily replicated. I used a model at a studio that was NDA but I didnt sign the nda contract because I was visiting. The model was made by artists taking photos of paris and the model seemed to completely and consistently understand paris with 10,000's of images that shared every minute detail about the city of paris and I want to replicate that in a commercial capacity but I dont understand fully how that was done.
Also the model used for the oasis project looks like it mightve been made in a similar way but I have no idea for sure they havent actually shared alot about it.
do you have a link to the paper for that project or anything?
No I dont this is actually an I met a guy situation so I try not to bring it up to often lmao cuz it sounds insane.
i did a search for it when you mentioned it the first time, as i hadn't heard of it, and found a lot of things with that name that have nothing to do with AI, and then some medical ai stuff
and would like to actually see what it is
you mean the oasis project?
yes. do a google search for "oasis project" and you get all sorts of interesting "that can't possibly be the right thing" hits
I dont know exactly how this model was made though
oh - this is the "generates the world in real time" project
they did that with doom, too
The paris one I mentioned was similar but it would just simulate paris in real time
yeah. this isn't a video model. it's actually generating the world as you do stuff. there were posts on twitter, and even a couple youtube videos
But when I saw it he told me it was nda but I didnt actually sign an nda agreement he just showed it to me anyways because he had his artists make the thing
It's called Google Street View.
there's a lot about it on the net. go search google for "ai generates minecraft in real time"
it's not google street view. it's generating a minecraft world
it doesn't keep coherence very well though
I'm obviously joking.
So, in this Paris model. If you look up at the sky and down again, are you suddenly on a sunny Californian beach?
if you scroll down the decart page they show a short overview of the architecture, it's a latent Diffusion Transformer that takes a continual sliding context input of previous frames and current user input
iirc from looking into that before it uses error-correcting autoregressive logic. ie it keeps generating 1 frame at a time continually, and they trained it with errors in the frame data to force the model to correct any errors it makes to prevent autoregressive self destruction
however it doesn't have much contextual memory - it forgets what you did fairly easily, and generates generic minecraft stuff.
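a tiny sketch of the sliding-context idea, just to show why it forgets. this is not Decart's code: `run_rollout` and `toy_predict` are hypothetical stand-ins, and the "frames" are plain integers instead of latents. the only real point is that the model only ever sees the last `context_len` frames plus the current input, so anything older is gone.

```python
from collections import deque

def run_rollout(predict_next, user_inputs, first_frame, context_len=4):
    """Generate one frame per user input from a sliding window of past frames."""
    # deque with maxlen silently drops the oldest frame once it is full.
    context = deque([first_frame], maxlen=context_len)
    frames = [first_frame]
    for action in user_inputs:
        # The predictor sees only the recent frames and the current input.
        frame = predict_next(list(context), action)
        context.append(frame)
        frames.append(frame)
    return frames

def toy_predict(context, action):
    """Hypothetical stand-in for the model: last frame plus the action."""
    return context[-1] + action

frames = run_rollout(toy_predict, [1, 1, -1, 2], first_frame=0)
print(frames)  # [0, 1, 2, 1, 3]
```

in the real thing the predictor is the diffusion transformer and, per the above, it was trained on deliberately corrupted context frames so it learns to correct its own drift.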
or mars
I imagine with the paris model, if you look up at the sky and back down, you'll end up elsewhere in Paris.
Part of how this model design works is it's heavily trained on a single topic, if it understood locations outside of Paris it'd be a lot harder to keep it coherent and efficient.
Don't look up is the new don't divide by zero.
Leonardo has been working on this post for quite some time now. Must be his Opus Magnum.
Any minute now
The paris model wasnt an interactable you just could prompt it to visualize different areas and it would simulate that space in real time. It was much more consistent because of that I think. Yea I dont think it would make sense to try and make a model by hand thats wide as an ocean shallow as a puddle like the Dall-E because I already have that. I want to make one that understands a specific narrative concept with like 70,000 images that are ai assisted drawings based on my paintings. When I read that paper I didnt interpret it as ai not being able to learn from itself I was actually more interested by the similarity between the generations and their seeds. I thought that it would make sense if an AI model was trained by hand using ai assistance you could likely dig pretty deep without model collapse consequences because it appeared to me that the cause of model collapse had more to do with the AI struggling to expand on composition past its seed. Whats interesting is this is exactly how humans draw as well. You cant actually draw something that isnt loosely based on something youve seen. Ive also done batch generations with different stable diffusion models and noticed that compositional bias has been a progressive issue with the ancient stable diffusion models although they mightve been terrible actually wouldnt replicate the same composition within a prompt right. I want to make something like pony diffusion thats of like 70,000 ai assisted drawings and maybe some fresh ideas with rendering because although rendering is much more biased than composition naturally theres still little subjects here and there like the windmill principle and shape welding. I want to literally make an AI model with AI content its like my ultimate goal with ai bro.
You cant actually draw something that isnt loosely based on something youve seen. < little kids draw stuff they've never seen all the time
It just appears abstract because your memory doesnt pictograph things even if its completely photographic. Kim Jung Gi talks about this if you wanna look that guy up and learn how drawing works or read drawing with the right side of the brain which discusses the neuroscience of drawing you can. Until then your statement is based in ignorance not fact right. When a little kid draws the process is loosely based on their subjective experience Im not saying that everything you draw is a perfect representation of your memory. Its kinda like how AI makes stuff right when people draw they create an abstraction of an idea that they might not even be able to fully consciously recall. AI works in a similar way according to that paper in that you can find an image in its data set that loosely resembles the composition of the generation but the similarities dont go beyond that. When humans draw they make something thats loosely resembles the composition of their memories but doesnt actually perfectly replicate or anything to that matter and can be heavily abstracted to the point that its borderline unrecognizable.
contrary to your belief, you really aren't leonardo DaVinci
Im sorry that it upsets you that humans cant actually fully abstract from the images in their head but its basically just proven at this point right. Drawing with the right side of the brain goes through mri scans and shit going into excruciating detail about the neurology behind drawing. Your drawings genuinely cant capably escape the image database stored in your memory its commonly called "the visual library."
I told you a book you can read to learn about this maybe you could check out kim jung gi stream as well but past that i cant help you
old research i guess, right/left brain "functions" have been debunked.
might use more of a side but its both sides
i propose a simple test, remove left side of brain and see if you can still do right side functions
This is true if you read scott robertsons book for instance youd learn that left brain functions are actually pretty useful. The point of that book in this context doesnt actually have to do with the finality of brain functions in drawings and moreso to do with visual libraries in the brain.
These things are not the same topic despite the name of the book
For instance the way you learn how to draw from imagination is by drawing things you saw a second ago then subsequently drawing things you saw a day ago and you practice drawing further and further back in your memory until you can draw anything from imagination.
Theres some extra nuance to it though because the scott robertson style turnaround studies are generally paired with that
That's not true. You never learned that a mouth full of teeth is frightening. You are born with that "knowledge". There's a lot of visual knowledge that you are born with and have never seen.
Yea your memory doesnt just born out with images of a mouth full of teeth. The way those things evolve is by hammering in biases over and over that cause you to eye track towards things that vaguely resemble a threat. For instance a bushlike shape placed adjacent to a person is going to track your eyes subconsciously. Its not because you were born knowing what a bush is its because the architecture of your brain just interprets all shapes that resonate as bushlike to be threatening. You can read about that too in a book called imaginative realism when it discusses compositional eyetracking. It has to do with predetermined biases towards shapes not because you know what teeth are when youre born you still need to learn what teeth are through formative growth by seeing them.
i have a mouth full of teeth and its not frightening, but there are some concepts people understand/feel that arent taught
Yea but its not that those people are born knowing what teeth are and what they look like man its because the visual cortex is just preprogrammed to interpret simple shapes.
Thats why phobias appear like a fear of holes despite the fact holes arent threatening the fear is there because the simple shapes resemble something that was threatening to our ancient ancestors.
ya that popped into my head, couldn't remember the term for it, some survival instinct
Its not like your brain has prestored pictures it just has a way of interpreting images almost like a clip model
when people were shorter and lived near the water
I didn't say we are born with images of a mouth full of teeth. I said we are born with visual knowledge. You say you can't draw things you haven't seen. Yes you can't draw a lion's mouth, for example, if you haven't seen it. But you could draw something that resembles a mouth full of teeth.
well to be fair the extent of whats stored in the optical region of the brain and how thats understood is fairly shallow. The same goes with visual libraries right. The thing that interests me about AI is that it seems very similar to visual libraries and a huge step in learning how to draw is learning how to reference your visual library so you dont need reference anymore. The only viable and proven way to access the visual library is through turn around and drawing memories progressively further and further back in time until they begin to abstract to the point that even youre not totally consciously aware of where they come from. But the funny thing is if you just scribble on a piece of paper youre actually doing the exact same thing in a way as an artist who draws incredibly well without reference. The key difference between the two is that the artist that draws really well without reference learned how to extrapolate that information in a way that appears realistic. Does this make sense?
So an AI model is kinda like a visual library and a lot of what I read about it at least in a surface level way seems very similar. The only difference is human visual libraries are orders of magnitude larger then AI models with trillions of synapses.
Artists have different ways of drawing. You could draw things by understanding the structure and proportions of a subject, or you could simply visualize it and draw it as you see it.
actually, there are people that have very little brain at all - it's a birth defect - but they can function, and do things. and one was even on a talk show. They can't do, obviously, what someone with a full brain can do - but they can do everything both 'sides' are supposed to do, however in a limited way. And there are people with severe brain damage that recover and can live almost normal lives. The brain is very flexible.
wrong discord
it's not. but we had that discussion last night and you don't wish to listen to the people that program the very AIs you are talking about.
@vapid dove spammer alert
Yea he always gets me confused because I think he's talking about current generative models, but then he says things that don't align with reality even remotely and then I wonder if I misunderstood the subject of the conversation.
you didn't misunderstand. if he actually believes what he's spouting off, I have serious concerns
hes not wrong just maybe not the best choice of words, dont throw the baby out with the bathwater
more like sparse data points that can be reassembled
On some points he's absolutely wrong. He focuses too much on implementation details and misses the bigger picture.
yup that may be
Naw the neurology is already pretty purely understood theres actually a process in the brain that kinda formulates stuff. People have different approaches to learning though but thats moreso because the process of formulating an idea you might draw involves a series of nuanced systems in the brain its still the same series of systems at the end of the day if that makes sense.
for instance someone might learn turnarounds really fast but they need to study gesture for months while someone might be really good at gesture really fast then need to study turnarounds for months. People generally have a proclivity towards certain things but the overarching process of learning how to draw from imagination is pretty cut and dry.
Hey, so I am trying to create a story with AI images, I wanted to understand how I can get multiple character consistency in a scene.
Can anyone please help me with the same?
Only in the gen chat with images :-)
i had an amazing gif you dont understand 😭
Regional prompting does have some success but you probably have a custom character too so it would complicate stuff. Either that or ipadapter/facetools for face consistency
But it's a numbers game
Make me understand lmao, send
--cref is not working, i have created both the characters that I want to put together in one picture but it is really not understanding...
Ah ive seen the gif. Ill visualize it
Oh so I work with midjourney actually...
Here's the neat part. You don't
Especially with online services
yeah its already hit or miss with local stuff 💀
Locally you could do some wack inpainting
i can only imagine how bad itd be on online stuff-
all my inpainting comes out wack 
so I have achieved it before with animated images
I am trying to do the same with realistic images, it is pretty ANNOYING...
But how i generally do it for certain commissions: make the character a transparent png and place it in the scene
Even anime images are pretty inconsistent with random features blended and mixed
honestly i never thought of using two different images to make one
i could probably gen a fuck ton of images, pick out the perfect two, cut them out-
yeah
easy work
Oh yeah, I could actually do that!!!
20 minutes tops in photoshop, im gonna try that sometime
can u ppl recommend me a model to create wide angle images with longer prompt
instead of close up portrait
any model
no
Flux lets you do it natively 
dreamshaper and juggernaut doesnt help me with that
Its your prompt probably. What's the prompt
my laptop cant HANDLE FLUX
Woops was distracted im in the shower
always creating close up images and cant handle much detailed prompt
Meant swarm
$10 says youre right
So I have previously created a music video with AI, I could achieve consistency to some extent along with adding movement to the images using another AI software!
euro 
howd you mix up swarm and flux, there had to have been like
a clash of two thoughts
I've been haunted by my friend chat. Ill send lmao
I am going to try this, lets see
oh bet
glgl
OutOfMemoryError: CUDA out of memory. Tried to allocate 7.98 GiB. GPU
ım getting this error when ı increased the resolution to 16:9
from 1024x 1024
and also when ı tried to use upscale
You shouldn't generate at too large a native resolution
And for upscale you need the tiled VAE extension
Then try 720x1280 and upscale by 1.5 to get FullHD or by 2 for WQHD
Hires steps on 10
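A quick sanity check of the upscale math above, as a rough sketch. The function name is made up for illustration and is not part of any webui:

```python
def upscaled(width, height, factor):
    """Return the output size after upscaling a base resolution by `factor`."""
    return int(width * factor), int(height * factor)


base = (720, 1280)  # portrait base resolution suggested above

print(upscaled(*base, 1.5))  # (1080, 1920) -> Full HD in portrait
print(upscaled(*base, 2))    # (1440, 2560) -> WQHD in portrait
```

Swap width and height for landscape. The point is that the first pass stays small and the upscaler does the rest.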
Hello everyone! I'm a newbie in Stable Diffusion. Who can help me in DMs?
I wanna train a model with my pics, anyone can help me with this?
Hello can somebody tell me how i can attach sora to stable diffusion? i got model of me in replicate
Anyone have any tips for blending style Loras with likeness ones? I find I lose a lot of the likeness when I try and mix with styles.
i recommend following the tutorials in the #🤝|tech-support pinned messages. if you have a decent GPU that is! otherwise i recommend using an online service like civitAI
make image > upload to sora basically? unless you mean you want to make the video on your own pc?
seeing how sora is closed source and cant be run locally i recommend the LTX model for image to video or wait for hunyuan to release its image to video model (this requires a really good pc, 4090 or better)
LTX can be run on a 30xx series though
takes a bit
hope that answers your question. setting up LTX is actually pretty easy 👍 if you got other questions youre free to shoot a dm or reply
Hi
I mute discord tabs and just let every server accumulate red ping numbers
the effect is that pings actually don't do anything to me at all
I did this without modding the client cos that does break terms of service
i was walking outside and got the ping on my phone so it doesnt bother me either but.. why ping in the first place if youre not gonna talk further
tends to be either very old or very young people, or, separately, people with non-native English
cos not every country has commonly used social media that resembles ours
true but i assume if someone is competent enough to download discord. find the stable diffusion server, get into general chat and ping a person hes capable enough to type a response back
well I use discord via browser for example
and if I remember rightly you don't even have to verify email at first
you do nowadays no?
not sure
Hey all! I’m a newbie in SD land. Hope to get more proficient in ai this year before the world ends.
some handy resources do include our tech support channel. in its pinned messages theres a couple of installation guides like for webui-forge or Swarm (both beginner friendly) if your pc can run it
otherwise i recommend the generator on CivitAI
recommending CivitAI generator to newbies is probably a decent idea TBH yeah
i'm a newbie too. i just tried rundiffusion's free 30 min trial as i don't have the hardware to run locally
does civitai's generator allow img2img?
yeah for some reason it uses a bunch more cpu than other tabs
Its an unoptimized site but i think it does? You still use credits, so your free daily generating is limited
No it's just. You just won't understand it will you? All the research you refer to only looked at one form of how most artists draw and paint things. It doesn't mean that all drawings/paintings ever made follow that principle. It doesn't mean you can simply condense the different artistic processes into an algorithm. If anything, you could argue that art is what doesn't work like that default "algorithm" people have in their minds. And then there's another point that you also just don't seem to understand: All the imagery that mankind has produced so far does not contain all the imagery that mankind will produce from hereon. You can't extrapolate mankind from rules, that's not how it works! Even your most advanced image generator will at some point be outdated. Machine learning does not have that evolutionary process because everything beyond the domain learned from the training data is just noise. You cannot feed even the most advanced futuristic super AI with everything mankind has done up until 1900 and it's just magically gonna invent Cubism, Surrealism, Jazz, Rock and Roll etc. Not how it works!
it's part of the discord onboarding. you're taken to various channels and told to do things. they get brought to this channel and told to say hi, they just post hi. it clears the onboarding requirement
fair but why ping someone specifically lmao
bot perhaps? newbie that doesnt' know better? who knows
maybe but its also "can someone help me in dm" i dm and no reply lmao
that could be discord sticking your dm in their spam folder. I didn't think, for the longest time, there even is a spam folder and that people who said they found stuff in theirs were full of it - but i did stumble across it one day. if discord puts DMs in it, discord does not notify you that it did so - so you never know someone dm'd you at all
wait theres a spam folder? where lol i got dm requests on but you get a red dot for that
so i'll send a DM but also tag the person on the discord and tell them I DM'd
and https://support.discord.com/hc/en-us/articles/7924992471191-Message-Requests there's other things for messages that no one knows about apparently
i had an alt that required a freaking email verification every time i tried to add someone as friend
that was agonizing
that sounds painful
switched it to sessions because i couldnt bother doing it 40 times in a row
Its my teams status at work lol
i like to add to the end of it, and if it's software related.....log file or gtfo
😄
at my last place of employment, I was like SME for a buncha legacy stuff, so I made up a guidelines for help request, with steps to gather the required log files for me to help them for various situations. Little did they realize at first, it was lowkey teaching them how to troubleshoot it without my involvement. lol
guys i have a project that i rly want to do but i struggle with generating what i ask for in my prompt
i want to do ai videos of a character just like this guy but not with sonic, with another character https://www.youtube.com/shorts/z0qogHHNRSE?feature=share
if someone can help me realise this it would be great (btw im french so sorry for my english)
you could write your messages in French, then run them through this translator, and use the translation as the message for the AI. https://www.deepl.com/
steamcommunity-20.com sounds legit
Hey how’s it going ?kind new to all this , Was just wondering if anyone here is running stable diffusion on Mac book ?
SD3.6 when
It doesn't invent styles because it doesn't experience my dude
Yes and experience is relative to the subject. Hence why every attempt at creating an AI that emulates human experience is futile.
Yea well I'm just gonna draw an ai anyways
It doesn't seem super deep bro, ai just makes stuff associated with preexisting material in its data set, its not magic
Ahem, that's what I always said. You said you're gonna make an AI that does the same as a human artist. Which is not "just makes stuff associated with preexisting material in its data set".
Funny how you suddenly flip the script.
How do I enable vpred in forge webui?
Huh thats crazy
you sound like one of those artists that still believe that there's a database of small images that the ai mashes up together
you should probably ask this in the #🤝|tech-support channel
Seems like I know everything I need to know about AI models. If making a model that trains solely off of a narrative niche like how pony diffusion aniverse is cracked at making anime and furries. Ima just make a huge dataset of art that looks good and has its fitting tested that responds to the prompts I want it to respond to. Youre not gonna stop me bud
All I gotta do is draw 70,000 storyboards controlnets are so good you can basically plug em in and have the ai make something it wouldnt otherwise make.
just put any style I want in there when I want to paint it
so you can teach university classes now then?
otherwise, no, you really don't
if you have to draw 70K storyboards to get something decent, there's an issue
Im not doing machine learning Im not a ml engineer Im an artist you can train a 70,000 image ai model with kohya at the click of a button
not really cuz i get something decent after like 10 storyboards cuz i can just start by using my stuff to make loras
then once i get up to 70k i can just replace the textual embedding like they did with pony then write checkpoints over my basemodel starting with loras again
hello everyone
hey all! popping in cuz I'm getting started with locally hosted stable diffusion stuff
I was wondering if anyone had a good guide to training a model on your own art? I wanted to put something together using my own art style so I could quickly generate stuff for personal use/fun :)
its not the way to go because the model needs to see bad stuff too
to be able to be responsive to guidance that is guiding it from bad to good
Ill just have the qualities listed in the dataset its not like it would be any different in how it looks from other training libraries. Whether or not something is bad can still be determinate.
that's a bit related to how deepseek v3 works compared to deepseek R1.
v3 has all the info, R1 was trained to make it biased towards better outputs of the things that are in v3.
datasets are just img caption pairs, there's not another place to put more info
good I can put literally any info I want in there with storyboards
if you do a different architecture to base LDM or base DiT then potentially
like there are a lot of weird architectures out there
they add more stuff
I might try out different architectures from how I understand is that the compositional variety is what the fitting process kinda has the most to do with so Ima just build up a image library over the course of my classes. Im already building up a portfolio and its pretty close to the lighting quality you would see in AI so I figured I can just make an AI model that specializes in things I want to be able to prompt.
storyboards arent like a way to finish a painting or anything theyre just a way to jot down an idea really fast in perspective thats why storyboard artists like to emphasize learning gesture because its all about getting the essence of an image out of your head as quickly as possible so the only technicalities that need to be there are the ones that are productive to the storytelling component of the composition.
there are several very good tutorials on youtube for training loras. you just need to create a good data set out of your art first
Can you recommend a specific one?
start with this guy https://www.youtube.com/watch?v=1mEggRgRgfg
ty :)
also - based posted this earlier today https://x.com/BasedLabsAI/status/1888313013276684711
no, but Luna was wanting resources to learn how to create loras
Thank you! I'll check out both of these.
I've been wanting to create a good rapid thumbnailer, hehe
This app allows you to change your outfit using a image
should be able to do that with comfy and a batch node
yes? and?
I also would like to use it just for fun generated images for use in sillytavern, so i dont have to make a bunch of expressions for characters if i dont have the time to lol
I found it cool thought i should share☺️
yeah. you should be able to use comfy for that, too
if it was comfy it attempts to VAE decode first without tiling and then if it hits the VRAM limit it makes a second attempt with tiling
so it was probably vram thing
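The "try the full decode, fall back to tiling on out-of-memory" behaviour described above can be sketched roughly like this. This is not ComfyUI's actual code: the decoder functions are hypothetical stand-ins, and `MemoryError` is used so the example runs without a GPU (with PyTorch the error to catch would presumably be `torch.cuda.OutOfMemoryError`):

```python
def decode_with_fallback(latent, decode_full, decode_tiled, oom_error=MemoryError):
    """Try a full decode first; if it runs out of memory, retry with tiling."""
    try:
        return decode_full(latent)
    except oom_error:
        # The full decode did not fit, so decode in smaller tiles instead.
        return decode_tiled(latent)


# Hypothetical stand-in decoders, only here to demonstrate the control flow.
def fake_full_decode(latent):
    raise MemoryError("pretend the GPU ran out of VRAM")


def fake_tiled_decode(latent):
    return f"decoded {latent} in tiles"


result = decode_with_fallback("latent", fake_full_decode, fake_tiled_decode)
print(result)  # decoded latent in tiles
```

The tiled path is slower, which is why it is only the second attempt rather than the default.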
webui
it was something about the tile size as ı remember
to fix it
but ı dont know the exact value
could you consider switching to comfy or diffusers?
would it be possible to sell amd gpu and buy nvidia?
did you read @warm junco 's AMD gpu guides that are linked in the #🤝|tech-support channel and pinned at the top of it?
ı ve been thinking that too
that was webui guide only
oh no
is Forge webui good
plenty of people seem to like it
it's for more than webui https://github.com/CS1o/Stable-Diffusion-Info/wiki/Webui-Installation-Guides
then that's a good option
Is there an actual list of Artist and Art Styles that are used in SD 3.5 Large
I've searched for the mythical list and have only come up with links that say there are lists, which is not helpful ❌
ChatGPT succeeded in rooting out a Styles List with a link to Github. Multiple Styles with example Images and how to implement the styles - None of them work as advertised. Using latest version of ComfyUI, v0.3.14
Absolutely frustrated.
SDXL has superior support on artists and art styles and very easy to implement in prompts but lacks the improvements implemented in SD 3.5. That would have been a big edge on Flux
Never OOM Integrated
▼
Enabled for UNet (always maximize offload)
Enabled for VAE (always tiled)
does the bottom option work instead of tiled vae
ım getting cuda error again
Please post error log in #🤝|tech-support
that's the big lawsuit attractor though, to train on proprietary names, styles, identities and content
probably will not get a big model doing that again
dont have
There are bigger problems. I've been running tests for about 5 hours. Art styles are hit and miss even using the Github link listing the art styles. 1024x1024 images have a greater chance of rendering as some form of art style - not necessarily the one specified in the prompt.
ComfyUI - Using a 4090 card and have 64 GB sys memory. Even flushing memory in Comfy to eliminate cached info doesn't correct style prompt errors.
You have a cmd window, spitting out an error probably
No errors
there is no tiled vae in forge webui and ı got cuda error again
lowered the native res its fixed
other models, Flux and SDXL, work flawlessly
Wait what res were you trying to generate?
1024x1024
704 to 960
And what model were you using?
realvisxl
3.5 Large.
Im asking the numbers guy lol
what
Sdxl uses 1024x1024
and other fine tuned 3.5 models to compare against
so ı shouldnt edit the width as ı want ?
You can but try to use aspect ratios
1:1 1024x1024
4:3 1152x864
3:2 1248x832
16:9 1344x768
You can swap them too like 2:3 for tall images etc
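If you want a resolution for a ratio that isn't in the list, here's a minimal sketch that keeps the pixel count near 1024x1024 and snaps both sides to a multiple of 64. The function name, the 64-pixel step, and the pixel budget are assumptions for illustration, so the results won't match every entry in the list above exactly:

```python
import math


def resolution_for(aspect_w, aspect_h, target_pixels=1024 * 1024, multiple=64):
    """Pick a width/height near `target_pixels` for the given aspect ratio."""
    ratio = aspect_w / aspect_h
    width = math.sqrt(target_pixels * ratio)
    height = width / ratio

    def snap(value):
        # Round to the nearest multiple, never going below one step.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


print(resolution_for(1, 1))   # (1024, 1024)
print(resolution_for(16, 9))  # (1344, 768)
print(resolution_for(9, 16))  # (768, 1344), the swapped tall version
```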
thx
this isn't a problem this is how it is supposed to be, if they train on art styles then it is not good on legal grounds, for a base model
RuntimeError: mat1 and mat2 shapes cannot be multiplied (154x2048 and 768x320)
controlnet giving error
Please again, use #🤝|tech-support for errors
sry
im trying to run Automatic1111 in open webui does anyone know how to?
Open-webui docs should have a tutorial for that.
thanks
in automatic 1111 i ran the playground 2.5 model but all i got was a static image
Why would people care if I storyboarded lora models I gotta know cuz the more I think about it the more confused I get.
Wouldnt that just be dope as fuck like I could make the lora model do literally whatever I want. Any pose model I could think of I could work up even now in like a few days.
I just want robots with better detachable limbs thatd be a good lora model
you can make a lora do whatever you want, you just gotta train it my guy 👍
Good I was getting debated on like crazy thats wild I gotta do homework for now though ill put some of it towards my detachable robot arms lora
Hello, I would like to discuss cooperation opportunities . How do I contact ?
cooperation with the users or with stabilityAI?
Thats not an issue. Works as intended
The controlnet model you used is not compatible with the selected checkpoint
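Why that error message points at a mismatch: a matrix multiply only works when the columns of the first matrix equal the rows of the second. A tiny sketch using the shapes from the error above (the helper name is made up). The 2048 vs 768 gap is what you would typically expect when an SDXL checkpoint is paired with an SD1.5 controlnet, though that reading is an assumption here:

```python
def can_matmul(shape_a, shape_b):
    """Two 2-D matrices can be multiplied only if A's columns equal B's rows."""
    return shape_a[1] == shape_b[0]


mat1 = (154, 2048)  # shape copied from the error message
mat2 = (768, 320)   # shape copied from the error message

print(can_matmul(mat1, mat2))        # False: 2048 != 768
print(can_matmul((154, 768), mat2))  # True: the inner sizes line up
```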
with stabilityAI🫡
well on their website there could be a contact us page
but i doubt they would cooperate with a one man show
alright~
Hellooo
Does giving bedrooms proper name variations in training like calling it "Courtney's bedroom" make the bedroom backgrounds more consistent in Lora models when I use the proper noun tag?
i think if you give it a unique name so after tagging every image automatically
replace "bedroom" with like "courteyroom"
is better
to avoid existing tags
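Roughly what that tag swap could look like once the auto-tagger has produced comma-separated captions. The captions and the `courtneyroom` tag are made-up examples, and this only touches exact tag matches:

```python
def rename_tag(caption, old_tag, new_tag):
    """Replace one exact tag in a comma-separated caption, leaving the rest alone."""
    tags = [tag.strip() for tag in caption.split(",")]
    return ", ".join(new_tag if tag == old_tag else tag for tag in tags)


# Hypothetical auto-tagged captions for two training images.
captions = [
    "1girl, bedroom, window, sitting",
    "bedroom, night, lamp",
]

renamed = [rename_tag(c, "bedroom", "courtneyroom") for c in captions]
for caption in renamed:
    print(caption)
```

In practice you would read and write the caption .txt files next to each image instead of using a hardcoded list.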
What would happen you think if a sizable Lora or checkpoint was trained on images that are completely narratively continuous with everything having proper nouns?
To the point where the ai model itself is an application of world building
so like flux would? natural prompting and all
gross
also "courtney's bedroom" is four tokens while "bdrm" is 2
So basically if i did my cyberpunk city with robot waifus and femboys by callin everything with proper names and making the whole thing narratively continuous I could theoretically get the most goated ai model at cyberpunk city robot femboys?
I think after observing your ramblings that you mostly dont know what youre talking about
I mean would it though?
I'm not going to entertain your train of thoughts further as were getting nowhere
I mean I just wanna know how far defining the world building goes before its not relevant to do so anymore youve already suggested to me that I could get more consistent stuff and the lora models I have suggest I can get more consistent stuff with proper nouned tags.
No you dont use nouns
Just i have thing, i have images of thing, i call it whatever in my lora and its just that
Im only asking because I have and have made loras with characters and if the characters have a name the image generator tends to get them right more. I just wanna know how far that goes.
Give it a unique tag, thats it
Really, thats it
So unique tags are just useful and only have bigger and bigger payouts the more you use them?
Just look at how other loras have it with their tags
Or even better, follow a tutorial
I mean other loras generally just have the name of the character I haven't seen or heard of a Lora that name bedrooms, streets, cities, characters, cars etc. I'm asking mainly about the latter where you'd obsess over unique tags by naming basically everything
I'm just gonna assume yes for the most part
@finite cloak could you answer my question it's pretty much the only question about ai that I have left before I leave
what
I'm asking how far using unique tags can be used before they stop being useful. Like for instance instead of just naming the character with a lora trigger I name the environments that character occupies as well with things like street names or the Courtney's bedroom analogy I used earlier. I've found having the name of the character in the tags increases the consistency in which that character is portrayed so I was wondering if naming street, rooms and houses for instance are also productive in the same way that naming the character is
webui forge is terribly slow
any idea how to fix
[Low GPU VRAM Warning] To solve the problem, you can set the 'GPU Weights' (on the top of page) to a lower value.
[Low GPU VRAM Warning] If you cannot find 'GPU Weights', you can click the 'all' option in the 'UI' area on the left-top corner of the webpage.
[Low GPU VRAM Warning] If you want to take the risk of NVIDIA GPU fallback and test the 10x slower speed, you can (but are highly not recommended to) add '--disable-gpu-warning' to CMD flags to remove this warning.
You wanna lower the gpu slider in the webui and then it'll go faster
no mainline image generation models have been trained to use tags, only anime models are. Normal models are trained for natural language captions and/or "internet natural" captions -- the latter being when images on the internet are often captioned in a "Some natural description of a scene in plain English here, cameraname, location, year, etc. detail" or similar format, with a small set of "tags" appended on the end like that.
When training a lora and not using anime models, you want to match the way the rest of the model is trained, ie use natural language.
Some people support abusing the tokens of a text encoder, ie replace the name with an obscure specific token in the textenc to try to match it, other people support abusing relevant natural language - for example if training a person, you might give the name of a celebrity that looks similar, just to make it so the model has to learn less unique info, just adapt something it already mostly knows.
But for the most part as long as you're not actively fighting prior knowledge heavy, you can caption however you want, and should absolutely caption with just the normal plain name of whatever you're training.
As for what to name explicitly: name whatever you want to be able to call back easily. If you have several pictures taken on a particular street, and you want to be able to ask the model to generate within that street, name the street in your caption. If you don't care about the specific street, call it "a street".
Be aware that associating content with labels means if you forget that label, the model might skip that content. For example if you have images captioned "Courtney in Courtney's bedroom", and you later generate "Courtney in her bedroom" with that model, it might make up a new bedroom, because it expects "Courtney's bedroom" to only appear when that exact phrase is used (or it might just generate her bedroom correctly anyway. AI does AI things.)
ı lowered to 10k mb from 12k
it just got more slow
Try lowering to half or below half I have like 8k max on mine and use around 2k it's weird but you actually want that as low as you can get it without any problems on your setup
are u on amd
Nvidia graphics with amd cpu
What's your GPU?
6700xt
Ahh
is there tiled vae on forge ?
Yep its called never oom for tiled VAE only at the bottom
Any artist that died over 100 years ago. Painters from the 1800's are beyond copyright protection issues. Even some artists from the early 1900's are open. They are classic artists and SDXL included many of them. No legal or copyright infringements in those cases.
A bigger problem is that applying "Art Styles" is broken. Applying them as noted will not guarantee the style is applied. Something about 3.5 is that it doesn't "flush" an old art style when it's replaced with another (training issue bug ???)
The best chance for them to stick is using a 1:1 ratio. Outside that ratio, things will likely break. I understand this has more to do with it being a training issue.
Solution: Render in SDXL with a solid finetune model - img2img with a finetuned FLUX model for output. That's an extra step, but it works for now.
One more question in the instance that im naming a bedroom or a street is it valuable to focus on getting rotations of that location or could that cause fitting problems?
I.E. a worms eye view, birds eye view, quarter angle and orthographic views of main street.
When you train a lora of a subject you want to have that subject in as many different situations as you can possibly get. You want to separate everything that is not an intrinsic part of the subject from the subject itself. For example if all your photos of a street have snow, the model might not be able to create an image of that street without snow.
In an ideal case the common factor between all your training images is only the subject and nothing else.
That includes abstract things like perspectives as well. Your captioning should also describe everything in the image EXCEPT the subject that you're trying to learn.
hi
is stylegan3 the successor to Neural Style Transfer?
I am newb to this, so I am figuring out a gameplan for what I want to do
I am looking for a model to replicate a certain art style then apply it to anything I want
alright sounds good
swarmUI is good (comfy backend + a good ui) but reforge/ web ui forge is pretty popular
hey, I would like to start using stable diffusion on local but I don't know where to start. Can you help me?
What gpu do you have?
AMD radeon graphics
Hmm a bit more specific? Can you open task manager and i think its the second tab that shows off your gpu and such
i send it #🤝|tech-support
can i still use plugins made for comfy?
like comfyui manager/custom nodes?
specifically this one plugin to integrate comfy with photoshop
i mean i think you can yeah. its a comfyUI backend.
meaning swarm is just run over it
these (i cant decide yet)
https://github.com/AbdullahAlfaraj/Auto-Photoshop-StableDiffusion-Plugin
https://github.com/zombieyang/sd-ppp
well you probably want the second one as it has a comfy tutorial
but seeing how it really wants to use comfymanager (which isnt always the best option from what i heard) you need to install that manually in the comfyui backend in swarm
i think youd want to use git clone method 👍
are you saying comfymanager isnt compatible with swarm
it IS compatible
but you cant download the portable version
you need to install it using the gitclone method
can you explain what a 1:1 ratio is here?
I agree there are old things that no longer have copyright
the thing about flux is its so big
the dataset must be huge
the cost of curating that dataset more could be so high that it would cut into training budget
its not just budget its also time because it was made pretty hastily
1:1 means 1024x1024
so the basic ratio
a portrait (3:4) would be 896x1152
oh in that context yeah i think it means that ish
having said that, 16:9 and 21:9 look drastically better to me in most models
models that are trained for multi-aspect (ie most models other than SD1) tend to pick up a content bias relative to resolution. For example, they often do much better at portrait images of people with a vertical aspect -- because most good portrait images out there are vertical. 21:9 will heavily bias towards cinematic source data, and so naturally does well for things that would look nice in a cinematic setup
yeah I do 21:9 for the cinema bias
cos sci fi movie is essentially the only sort of image I ever make
I use 1:1 tiles for upscaling or inpainting cos I want to maximise model strength there
hey guys, im looking for nsfw diffusion can anyone tell me how to get there
sorry but where is it?
you'll have to just search civit for it
thnx a bunch
Composition fundamentally just isn't the same between portrait and landscapes. I've had it do bisectional compositions in landscapes kinda like a selfie composition and it just looks bizarre. Putting (rule of thirds:1.2) or something of that magnitude gets it to fill space better in landscape compositions but it's still very poor, its also regressive. SD 1.5 seemed to do this kind of thing better than every model.
you're arguing with a man that programs and trains AIs - professionally. you should probably listen to what he's telling you.
Not only that, someone who actively worked on stable-diffusion itself before lmao
yeah. exactly.
I just blocked him by now, low effort ranting / not willing to listen to anyone unless they agree with him
but i get the feeling that Leo would argue with Einstein about relativity, and Schrodinger about quantum mechanics
The stream of low quality users do get tiring sometimes lmao
Not saying I'm the best user either
that's what the banadoco discord, and the L3 discord, are for - to escape to
Haven't heard of those before, can you shoot me a invite in dm?
I didn't argue my guy, anyways I got the information I need about training, I'm just gonna leave. Also important note that knowing how composition works and machine learning are not mutually inclusive. Anyways ima go study. This community's fanbase clearly has a cultural problem. I haven't claimed ai isn't art or whatever, I haven't actually argued at all, you just lash out and get offended at the prospect of hand drawing an ai model for no reason. You keep going with your friends until I get angry. I haven't actually seen you post or engage in even using ai tools on this server either. You may rabble-rouse and cause problems. The reason artists hate this tool isn't because of alex mcmonkey, it's because of people like you. I'm leaving
you haven't looked very hard if you haven't seen my image posts. shrug. and your rant has nothing at all to do with the comment you just replied to, either.
Don't feed him crystal. Hes looking for engagement
not sure about that, really - but maybe he'll go study and find enjoyment in that
But tbh i should post more gens/content but the problem is most requests/commissions i do are nsfw so i cant post most of em lmao
you can always make stuff specifically to post
Or its so hyper specific that i wouldn't want to share it, albeit hilarious ("rukia eating dessert and being disappointed (she paid $40 and it's worse than storebought)")
True true, but ive been working on my workflows to enhance those pictures i dont think much of my own lately
hey all
I am working on creating an LLM, where shall I start?
pytorch docs
pytorch actually has better docs and more readable code than like 90% of the stuff that is based on it
The office granted approval and said it determined the image "contains a sufficient amount of human original authorship in the selection, arrangement, and coordination of the AI-generated material that may be regarded as copyrightable." hmm this sounds like a really big deal
lucky that it went this way I think cos it will set precedent
aye. it's a big deal.
Hey guys!
Sorry for the newb question but I'm new to this and wondering what tutorial you would recommend following if you were to start today. I'm seeing a LOT of tutos on YT but most being a year or more old makes them kind of obsolete i feel. If anyone could point me in the right direction I would greatly appreciate it. I'm looking for photorealistic style. 🙏
heya
How many gpu's do you have and budget lol
the sweet spot for flux is 672x1024 or 1024x672
ive had great results with 1920x1088 as well
anything around 1mp should be pretty fast, just pick an aspect ratio
1:1 1024 x 1024
3:2 1216 x 832
4:3 1152 x 896
16:9 1344 x 768
21:9 1536 x 640
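if you'd rather work these out than memorise them, here's a rough sketch: take a budget of about 1mp, split it by the aspect ratio, and snap each side to a multiple of 64. the 64 step and the 1024x1024 pixel budget are my assumptions, not an official flux rule, and for 3:2 this rounding lands on 1280 x 832 rather than the 1216 x 832 in the list.

```python
import math

def resolution_for(aspect_w, aspect_h, target_pixels=1024 * 1024, step=64):
    """Pick a width/height near target_pixels for the given aspect ratio.

    Each side is rounded to the nearest multiple of `step` (an assumed
    convention, not something taken from the model's documentation).
    """
    ratio = aspect_w / aspect_h
    width = math.sqrt(target_pixels * ratio)
    height = math.sqrt(target_pixels / ratio)

    def snap(x):
        return max(step, round(x / step) * step)

    return snap(width), snap(height)

for name, (w, h) in {"1:1": (1, 1), "4:3": (4, 3), "16:9": (16, 9), "21:9": (21, 9)}.items():
    print(name, resolution_for(w, h))
```

raise or lower `target_pixels` if you want a second-pass size instead of a first-pass one.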
I can confirm that flux is not great at 2mp. You should only do 1920x1088 as a second pass, not as the first pass. Even at 1.5mp you already can see compositional issues. I wouldn't go much higher than 1mp for a first pass.
flux is still pretty good at higher resolutions even on first pass
ofc you will need to adjust other settings like increasing number of steps
1920x1088 is actually my preferred resolution for when I want quick hd gens without having to upscale with another model
Yes for something quick it's definitely good enough. But if you aim for the best result I can't recommend going much above 1mp for a first pass because the structure and composition of large objects and the image as a whole will suffer from it.
hello
it's always going to be better to do your gens at a lower resolution, and only upscale, with an actual upscaling tool, the ones you actually need to be upscaled
does anyone have flux dev FP8 version on a rtx 3060 ti or 8gb vram card + 16g ram to tell me how is the performance?
Flux can generate at 3072x3072 without distortions with certain combinations of checkpoint, lora, sampler and scheduler
well rtx 3060 ti or 8gb vram card are two very different things
cos if its a 4000 series 8gb card then fp8 can work
but 3060 ti doesn't have dedicated hardware matmul for fp8
it can use int8 however
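a tiny sketch of that check, in case it helps. assumption on my side: fp8 matmul hardware starts at NVIDIA compute capability 8.9 (Ada, the 40 series), and the 3060 Ti is Ampere at 8.6. the function name is made up for illustration.

```python
def has_fp8_tensor_cores(compute_capability):
    """True if an NVIDIA compute capability (major, minor) has FP8 matmul hardware.

    Assumption: FP8 tensor cores begin at compute capability 8.9 (Ada).
    """
    return tuple(compute_capability) >= (8, 9)

# RTX 3060 Ti is Ampere (8.6); RTX 40 series cards are Ada (8.9).
print("3060 Ti:", has_fp8_tensor_cores((8, 6)))
print("40 series:", has_fp8_tensor_cores((8, 9)))
```

if you have pytorch installed, `torch.cuda.get_device_capability()` returns the `(major, minor)` pair for your own card. a card without the hardware can usually still load fp8 weights, it just won't get the dedicated matmul speedup.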
thanks, I'm using the "gguf Q4" and I was wondering about the possibility, but it's ok, the one I got is working fine
which card do you actually have at the moment?
it's a pain in the ass to download flux models cus they weigh a lot
rtx 3060 ti 8gb vram, I forgot about the 4000 series xd I know it's different technology, just like the 5000 series
rtx 3060 ti has both int 8 and int 4
to get good performance out of int 4 is fairly advanced and tricky but
you could get int 8 easily using Huggingface Quanto its not complex
this would be a big speed boost
i didnt really notice a huge difference between fp8/q4 other than speed
between FP8 and GGUF Q4 the difference isn't enormous but
Int 4 done in a simple way can lose a lot
GGUF Q4 and Int 4 are not same
good quality Int 4 is possible but it needs a longer process, and sometimes needs custom code to run as well
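to show what "int 4 done in a simple way" loses, here's a toy sketch: naive absmax rounding to int8 and int4 on a handful of made-up weights, then comparing the round-trip error. this is not how GGUF Q4 or Quanto actually work (those use per-block scales and other tricks), it's only the simplest possible version.

```python
def quantize_roundtrip(values, bits):
    """Naive symmetric absmax quantization: float -> signed int grid -> float."""
    qmax = 2 ** (bits - 1) - 1              # 127 for int8, 7 for int4
    peak = max(abs(v) for v in values)
    if peak == 0:
        return list(values)                 # nothing to scale
    scale = peak / qmax
    return [round(v / scale) * scale for v in values]

def mean_abs_error(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

# Made-up example weights, purely for illustration.
weights = [0.013, -0.42, 0.27, 0.9, -0.055, 0.31, -0.76, 0.002]
err8 = mean_abs_error(weights, quantize_roundtrip(weights, 8))
err4 = mean_abs_error(weights, quantize_roundtrip(weights, 4))
print(f"int8 error: {err8:.5f}  int4 error: {err4:.5f}")
```

with one scale for the whole tensor, int4 only has 15 usable levels, so small weights collapse to zero. that's the gap the longer, more careful int 4 processes try to close.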
Thank you very much for the info, I'm fine with the GGUF Q4, I'm using flux to upscale & enhance my XL artworks and its amazing, I can even use loras
yeah if you like it as it is then that's fine
Can anyone recommend me a really good artistic model? I am trying to generate an idea for a tattoo but the prompt isn't quite giving me what I want
have you tried flux dev
I have not
I see quite a few different options on civitai
which one is the right option? 😛
main thing is to learn the difference between dev and schnell
cos they are different
dev takes more steps
its generally seen that dev is higher quality but I don't fully agree with that in all cases
dev is a distilled pro. schnell is a lobotomized dev
designed for speed, not quality
the reason I think schnell can be higher quality is that
during certain distillation methods they involve an object recognition model e.g. DINO
it is usually DINO
and it boosts the distilled model's ability to separate objects from backgrounds, and make coherent objects
sometimes this actually ends up with the distilled model overtaking the teacher in that aspect
at least to my eye, schnell seems to have this effect
and models like Flux Fusion that mix dev and schnell end up looking like dev but with that effect
in my experience, all i ever get out of schnell is ... about what I get out of sd 1.5 without loras. and that's far worse than sdxl. they also don't follow the prompt correctly in most cases.
SD 1.5 base is actually my favourite model in some ways
it has the most crazy creative compositions
have to seed hunt hard, like pick the best out of 1,000 seeds, but you can get nice ideas from SD 1.5 base
my main prompt adherence tip is just tiling, its the way I do it
like different prompts per tile
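rough idea of the bookkeeping for that, as a sketch only: cut the image into overlapping tiles and keep a prompt per tile. the tile size, overlap and prompts here are made-up values, and the actual img2img call per tile depends on whatever UI or pipeline you use, so it's left out.

```python
def tile_boxes(width, height, tile=1024, overlap=128):
    """Return (left, top, right, bottom) boxes covering the image with overlapping tiles.

    Tiles are tile x tile wherever the image is at least that large.
    """
    stride = tile - overlap

    def starts(size):
        if size <= tile:
            return [0]
        positions = list(range(0, size - tile, stride))
        positions.append(size - tile)       # last tile sits flush with the edge
        return positions

    return [(x, y, min(x + tile, width), min(y + tile, height))
            for y in starts(height) for x in starts(width)]

boxes = tile_boxes(1920, 1024)
# Hypothetical prompts: one shared default, overridden for a specific tile.
tile_prompts = {box: "wide mountain landscape, golden hour" for box in boxes}
tile_prompts[boxes[0]] = "snowy peak on the left, golden hour"
for box in boxes:
    print(box, "->", tile_prompts[box])
```

each box would then be cropped, run through the model with its own prompt, and blended back using the overlap region.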
why does the image look different even when i use the same image settings
only difference is the web ui i used
wait my bad wrong chat
hello all, I was wondering if there is another model/service in the idea of Runway Act One and Liveportrait ? Mainly for changing facial expression using 2 videos
Probably because of seed, that should be same too
i agree with you on this - SD 1.5 has a unique data set which i think gave it the best understanding of concepts - but it's not refined enough to get stodgy
SD 1.5 can take way more noise than SDXL for some reason also
if flux is on-topic .. is there any opensource workflow that can do skyboxes using this .. if not same question with recent SD variants
because SDXL uses a refiner to get those extra details in - and noise gets... denoised
oh I mean without the refiner
I've actually never used SDXL refiner LOL
oh. well then you might just as well use SD1.5. that's about the only difference is that refiner
it was trained on a slightly different dataset, but if you're not going to use the refiner, there's not a lot of point in not just using sd 1.5
the refiner works on any model, you could use it on SD 1.5 or use it on Flux even
without refiner SDXL still has some advantages, particularly stronger VAE and hands
SD 1.5 has drastically better prompt adherence because of Ella
I think SDXL is a bit stronger in layout and structure, but SD 1.5 can win on details
details is a bit of a moot point though cos dedicated upscaling models have way more details than either
Use a skybox lora.
sure. and you can use anything as a refiner - which really has nothing to do with the main point. SDXL without the refiner is really not any better quality wise than sd 1.5 - but 1.5 understands various terms in a unique way that was lost when they moved on to trying to give the people what they want - pretty females
I think its better to judge quality from benchmarks than by trying to judge it personally
and SDXL does benchmark a lot higher than SD 1.5
why? the benchmarks can't possibly know what you like or feel is quality
it's just the scientific method
I think sdxl is better than sd1.5 but not by a really large amount like flux/sd3.5. The refiner made things too plasticky imo and didn't really seem that useful.
I think using a low-step variant of sdxl like lightning, ttd, dmd2 is probably the best choice if you want to use sdxl, they are pretty much the same quality as 25+ steps but much faster.
have you tried SDXL as the base model and sd3.5 medium as the refiner?
maybe - i tend to look at software benchmarks about the same way that I look at movie reviewers. especially if those benchmarks are based on random users on the internet voting which image they like best
I still didn’t try sd3.5 medium as a refiner 😄 but sd3.5m as base has grown on me, it’s not too shabby.
human preference benchmarks are the ideal test though
it's a really good refiner - dango deliberately made it more artsy
not in my opinion.
human benchmarks for pizza is that pineapple should never be used
but that only matches up with people that don't like pineapple on pizza
ok there is an alternative
can simply use inverse problems to judge the strength of a diffusion model
e.g. upscaling, inpainting, deblurring and colourisation
in fact I am leaning towards this being the best way overall
that is probably the better set of tests.
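one way to turn that into a number, as a sketch: take a real image, degrade it (downscale, blur, mask), let the model restore it, and score the restoration against the original. PSNR is the simplest such score. which metric to use is my assumption here, nobody above named one.

```python
import math

def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equal-length pixel sequences."""
    if len(reference) != len(restored):
        raise ValueError("inputs must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf                     # identical images
    return 10 * math.log10(max_value ** 2 / mse)

# Made-up pixel values: the original vs. a model's restoration of a degraded copy.
original = [12, 200, 45, 90, 255, 0]
restored = [14, 198, 47, 91, 250, 3]
print(f"{psnr(original, restored):.2f} dB")
```

higher is better, and because the original image is the ground truth there's no voting involved. perceptual metrics would be the obvious next step, since PSNR rewards blurry but safe outputs.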
Human preference is a great test within the confines of understanding what the results mean. All tests are biased. The best test is a series of different tests and noting the biases of each as you go.
For human preference benchmark testing, it's a frequent flaw that they will favor "pretty" generations over "good" generations. So if the prompt is a cat, and the AI gave you a sexy woman with big booba... preference voting says sexy lady model is best!
It's also a frequent issue that noise looks better in photos, and noise that comes from genuine model or sampling errors still "looks pretty" if it's in small doses, and gets preference votes, while other tests would highlight it as a failure.
If you were doing even a narrow test, say for example have models generate a bunch of pretty ladies and see what human votes think are the prettiest... you'll get a ton of votes for Flux Dev. So flux dev is the best at generating pretty ladies? No, it's the best at generating one specific pretty lady over and over. "Fluxgirl" is quite pretty, but it's a major problem/limitation of the model, that would cause it to fail instantly in loss metrics or other tests that compare prompt following on details or source matching.
iirc schnell is a second distillation of pro, rather than from dev - the key practical difference here is being separated makes it difficult to share loras and all, which was the magic that made Turbo so useful for SDXL - you can turbofy anything, the turbo could be applied with a lora itself even.
i was told it was dev - but i'm not about to argue with you on this - i'll take your word for it.
ftr my word on that is not absolute, that is secondhand memory (I recall it being said but not sure where) mixed with analytic consideration (if it was a direct descendent of dev, loras would cross between models fine)
yes, well - you have a lot more information on the subject than I do, i just have one post when it was released saying it was dev. and I have a huge amount of respect for you, even though we clash at times.
i have no idea
we're all in this together then lol
I've sent you a guideline @fervent thunder add upp
is it known which distillation method was used because the different methods seem to have different effects in various ways
e.g. whether a distillation method had an adversarial component seems to make a big difference
dev is "guidance distilled", schnell is "timestep distilled"
dev had a """guidance""" value baked in (akin to CFG, but not CFG, and its value included in an embedder)
schnell just did essentially what SDXL Turbo did. I don't think they published the exact details of schnell's methodology but considering BFL is literally Stability's former research team, I'd be shocked if it was any different than what they did on SDXL Turbo and SD3 Turbo (not much reason to change what worked), so it'd be https://arxiv.org/abs/2403.12015 same as the LADD paper published about sd3 turbo by the same team
LADD in turn being basically just ADD but more of it running in latent space, as the names imply
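for contrast with the baked-in guidance on dev, this is what plain classifier-free guidance does at each sampling step: two model predictions (with and without the prompt), mixed by the CFG scale. sketch only, with plain lists standing in for the noise-prediction tensors.

```python
def cfg_combine(uncond, cond, scale):
    """Classic classifier-free guidance: push the prediction away from the
    unconditional output, toward the prompt-conditioned one."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]

# Stand-in values for one step's unconditional and conditional predictions.
uncond_pred = [0.0, 1.0, -2.0]
cond_pred = [1.0, 2.0, -1.0]
print(cfg_combine(uncond_pred, cond_pred, 6.0))
```

a guidance-distilled model skips the unconditional pass and instead takes the guidance value as an input, which is why it costs one forward pass per step instead of two.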
ah okay thanks, yeah if its the same team then that is likely
I guess then this lora is LADD turbo since it is literally called turbo https://huggingface.co/alimama-creative/FLUX.1-Turbo-Alpha
Fluxgirl?
A lot of people say that Flux dev has low diversity cause it's distilled and a flow matching model. But honestly, diversity over multiple seeds in flux is much higher for me than, say, SDXL
there are ways you can look at log likelihoods directly, can bias the sampling towards them if needed also
and yeah log likelihood distribution for flux is better
regarding prompt adherence, I've seen Flux beat SD3m on GenEval
I think at this point a new generation of models is needed to compete with Flux really
I was very surprised that the dev de-distill efforts that were trying to train the ability to use CFG back into Flux worked, but they worked really well, it can happily use CFG 6 or so 🤔
Easy, what would you require this "tool" to do in your game?
there's a lot of consistency in the appearance of women generated, the easy reference point is look at the chin shape, it's got a rather unique cleft chin in almost all generations unless you actively fight it. Plenty of other consistencies too, just that one's the easiest to verbalize and point out
it's likely a result of the quality-tune data and/or the distillation process
there's plenty of "unwanted baked in features" of the model, strong bokeh on photoreal images is another common one
if you're in the swarmui discord you can find a few rants from baratan about fluxgirl #1243167929646710886 message
alternately here's photo of a woman seeds 1 through 4 as a quick easy reference https://i.alexgoodwin.media/i/misc/40c7ae.png
and just for comparison, same request on SDXL Base 1.0 https://i.alexgoodwin.media/i/misc/01cbbe.png
the model has high diversity in terms of the log likelihood distribution though, which is what is ultimately going to cap your ability to get diverse samples out of it
the things that you are referring to can be changed by even a small lora
I think that sort of criticism was a lot more valid on release day
when it was uncertain whether the model would be possible to train
80% of my gens don't use a lora🗣️ :tbh i love sdxl still
Maybe once i can snipe the 5090 ill move on
To be clear my comment above is an explanation of what "fluxgirl" is, after it was mentioned previously in the context model benchmarking methods
I use fluxdev for 99% of my generations at this point i don't have any problems using it lol
ah okay I see
I also dislike the default day 1 flux look that it launched with
but I use the checkpoints based on dev de-distill with CFG and 2-3 strong photography loras
Day 1 launches are mostly terrible due lack of proper workflows lol
I couldn't run sdxl on release (3min+ gens). Now i can do it easily in 13s
you can see my comments on this discord I actually rly disliked flux at first, and didn't use it for like 2 months
yeah launches are always rough
RTX 5090 doesn't have sage attention apparently
I currently use Segmind Vega https://huggingface.co/segmind/Segmind-Vega
not sure if I recommend it but its SDXL at 0.7B or so
I agree that there are certain faces that appear over and over again - but all other models have the same issue. I don't know if that is a problem of the training data or a problem of dpo. It can be improved with loras or fine-tuning though
more recent dpo papers have said that older dpo methods can overfit hard
or have an issue that is slightly different
where it is a bit like mode collapse in GANs
I'm not that interested in photo portraits, but I kinda think that maybe photo portraits need their own dedicated foundation model
or just a really massive flux lokr
where they try to have dataset have a really broad coverage of photo portrait types and subjects
What UI did you use?
Help!
I used to have ComfyUI on Windows 11.
I switched to Pop Linux (on the same machine with same hardware) and now my workflow does not work.
I installed the latest ComfyUI.
However it is the same workflow.
Now, I get OOM errors on every video generation on the workflow that worked on Windows.
I was running portable ComfyUI on windows, and now regular ComfyUI for Linux.
Has anyone had same issues when switching?
are there custom nodes
I recommend going into #🤝|tech-support but stupid question: did you update any drivers after switching?
Or what gpu do you have? Just mention anything relevant in techsupport
Almost All of them are web ui based.
Hmm chances are no but i recommend posting in #🤝|tech-support if someone else also uses the 1111 version
Personally i dont because its rather outdated
Ah. I completely updated the system. It’s Linux now. I don’t even know if any of the dependencies are the same
i recommend getting the recommended drivers for your system/OS or running it can be unstable
did you follow a Linux install guide?
.
I used chatgpt / deepseek to guide me, and the requirements text. I’m out of options unless someone has had this issue . I presume it’s updated stuff that’s making it either too efficient or something
okay i got to say thats the worst possible guide you could've followed. are you experienced in linux at all?
No
mind me asking why you went for linux?
what is the "anime denoise" preprocessor from controlnet Lineart? I'd say it is for generating images in anime style, but I don't know if that's correct
Hello everyone
joining under our new business discord. we make images, videos and games etc
Dang, that's a nice model you got there
hello
does anyone know how many decimal places are supported for LORA weighting in prompts?
Hello
Hello
hey there!
Sure is, especially for its time, Mr. Etal should be proud of his work
OH MY GOODNESS! FLUX IS 22GB?! IS IT WORTH IT?!
Hello. I have a few questions, can anyone help me with the audio?
yes. Also, just use quantized versions, they are smaller
Well...It's hard for me to find the right version that will give me what I am trying to make
you cannot run 22gb on a consumer gpu anyways
search for a quantized model that fits your gpu
eg. 11gb or 8gb, depending on your vram
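quick back-of-envelope for why those sizes come out the way they do: file size is roughly parameter count times bits per weight. the 12 billion parameter figure for flux is my assumption from public descriptions of the model, and real quantized files carry some extra overhead on top of this.

```python
def weights_size_gib(num_params, bits_per_weight):
    """Approximate size of the model weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 2 ** 30

FLUX_PARAMS = 12e9  # assumption: roughly 12 billion parameters

for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: {weights_size_gib(FLUX_PARAMS, bits):.1f} GiB")
```

this only counts the diffusion model itself. the text encoders and VAE need memory too, so leave headroom when matching a file size to your vram.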
I have a list of Flux model variants and usage tips in the SwarmUI docs here https://github.com/mcmonkeyprojects/SwarmUI/blob/master/docs/Model%20Support.md#black-forest-labs-flux1-models
Thanks bro, Appreciate it~
depends what you are doing cos Flux wins overall for standard text-to-image or img-to-img
but it will lose to smaller models that were designed from scratch for specific tasks
for various forms of image manipulation or unusual conditioning input methods for example
base SDXL is still rly nice for cityscapes 🙂
certain things like that it can do about as well as the fine tunes
can someone who uses stable diffusion get in touch with me?
You can ask your question here, I used a translator.
I want to make a custom portrait but I can't do it, can you help me?
There are not many specialists in Turkey, can you come if I come to voice chat?
hi
I am not very good at teaching the first steps
my advice would be to try to download SwarmUI and run their basic workflow
you could also try easydiffusion for an easier option although I have not tested it
if that is too tricky (which is understandable) then I would maybe suggest using a paid service online, I think the easiest out of anything is probably to ask chatgpt to make an image
these are the links to the things I mentioned
https://github.com/mcmonkeyprojects/SwarmUI
https://github.com/easydiffusion/easydiffusion
can u tell me how to use stable diffusion ?
If I share the broadcast, we can look together. ?
sorry I can't help more
Thank you, I'll try what you said.
I really recommend watching a youtube video first and dont use their guide but use a guide from #🤝|tech-support pinned messages
But before you do, what gpu do you have?
So i can say if you can run it or you're wasting your time before you invest time into it
Should be possible, since you're a beginner i recommend following the tutorial from cs1o. Either web ui reforge OR SwarmUI
SwarmUI lets you get used to advanced features as you learn. Forge just lets you use basics mostly
And on CivitAI you can get nice "stable diffusion models" either anime or realistic finetunes/merges
Personally for anime i recommend illustrious wainsfw and use any of their example images as a way to start making good looking images
Hey everyone! 👋
I’ve been exploring AI-generated art for a few weeks now, mainly for concept designs in a sci-fi interactive game I’m working on.
I’m trying out multiple models, and stumbled on this server!
I’ve been struggling to get consistent character designs and keep the art style uniform with all the models I used.
how do you all handle this challenge? Any tricks for keeping AI-generated characters looking the same across multiple images?
Loras mostly but if its just face, facetools/reactor
Some realistic models it helps to use a name in the prompt but its hella inconsistent
#ask Is Hunyuan the best video generator on ComfyUI?
My computer uses Rtx 3060 12GB, can it run text to video?
Can you please tell me more about that
I'd love to see what you've done with it
oh i dont use it as i mostly gen commisions for people
u plug a jack into the front I/o of ur puter
and windows be like
hmmm, what device did u plug in, front audio?
im like, what do u think?? ;d
well yeah headsets and speakers dont carry a load of extra chips
if its unsure its better to ask than to assume its a speaker while its a headset
xd
i use computers all day, every day. they suck. they are terrible and i wont stop using them
i think itd be cool if hardware could exist for PCs that can measure the ohms of an audio device, to in a sense autonomously decide how to drive the headset or device u plug into the audio jack, and if u wanted to choose manually, u can jus go to settings
instead of like, throwing like 3 different windows at u, to choose urself, before anything is output
are you saying you're addicted?
there are lots of addictions that are very real and not recognized by the medical profession as such (who's gonna go to a doctor to try to get over a chocolate addiction for example?)
there is a lot of debate about it yeah
even in DSM-5 they did include it, just not as a recognised condition, they put it in the section for more research
Im sure he is wherever he is off to now
did stability ever release the creative upscaler
don't think so
how do you 'borrow' an upwork account?
For comfyui if I wanted to get the prompt from an exported png (with embedded prompts) would I just grab the image header in python?
Hello
yeah
that works
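a minimal sketch of doing that with only the standard library, walking the PNG chunks and collecting the tEXt ones. assumption: ComfyUI stores its data under the keywords `prompt` and `workflow` as JSON text, so check the keys you actually get back. `load_comfy_prompt` is a name I made up.

```python
import json
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def read_png_text_chunks(data: bytes) -> dict:
    """Return {keyword: text} for every tEXt chunk in a PNG byte string."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            keyword, _, text = body.partition(b"\x00")
            chunks[keyword.decode("latin-1")] = text.decode("latin-1")
        if chunk_type == b"IEND":
            break
        pos += 12 + length                  # 4 length + 4 type + data + 4 CRC
    return chunks

def load_comfy_prompt(path):
    """Read the assumed 'prompt' chunk from a PNG file and parse it as JSON."""
    with open(path, "rb") as f:
        meta = read_png_text_chunks(f.read())
    return json.loads(meta["prompt"])
```

if you already have Pillow around, `Image.open(path).info` gives you the same text chunks as a dict with less code. this version skips CRC checks and compressed text chunks to stay short.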
hello, i have a laptop rtx 3050 with 4gb vram, is it fine to use stable diffusion? i already tried, i cant render anything, its just like a green screen
no. you need at least 8gig vram
https://huggingface.co/hum-ma/SDXL-models-GGUF/tree/main can fit in 2GB okay
not on that laptop
there is a trick where you run it headless so the screen goes black while it generates image
but then you can use full VRAM
but it should be fine without that anyway
Hi there
hello
So what kinda stuff do you guys generate
I’ve made a Lora of myself once that was okay but that was like a year ago and now I’m lost with automatic1111
I forgot how to use a1111 I did do it a bit
I just tried to jump back where I left off and so much is different now
I feel like a1111 is basically out of commission
gregorian chants
does anyone know where I can download a cheatsheet of prompts for stable?
Personally, I like to occasionally listen to a group of old guys with grey beards gather round and yell dragon words at me
ABBA
Yes try using Forge Webui, it works with your GPU
hiya
What's currently the best/easiest way to do good faceswaps? seems like every time i google it a new extension or library is suggested heh
and half seem to be not updated for like 10+ months
FaceFusion
will have a look
How can I generate QR codes?
Hiiii
Just joined 😗😗😗
I’ve a question.. I’ve tried my first ever Lora training, using my selfies as dataset, around 7 pictures 😌😌
The lora “works” as in it can kinda generate my likeness but only from front view. Hmmm is there any tips or reads I can consume to learn how dataset curating works n best practices?
which model is this?
smaller models will "generalise" less so they are more likely to need to be explicitly shown everything during training
i.e. to get back angles they would be more likely to be shown back angles
the larger models like flux, SD 3.5L etc have better generalising ability so if you show them only a front angle they can "infer" what the back angle would look like
Ah I’m using SDXL juggernaut xl
The face gets distorted sometimes from front view but its surely kinda working lol
It would fail if I prompt for look left look right side view lol
Hey Stable Diffusion community!
I've seen those funny Photoshop requests floating around where people ask artists to add them into a vacation photo they missed. It got me thinking... is there a way to automate this kind of request, similar to how Pika's new "addition" feature works for video? I'm specifically looking for a solution for photos, not video.
My current understanding is that one approach would be to inpaint a "stand-in" figure into the photo, then use a face-swapping technique to replace that figure's face with the person's photo. However, this seems like a somewhat manual process.
I'm wondering if anyone has developed or knows of any ComfyUI workflows (or any other automated methods) that could streamline this? Ideally, something that could potentially be batch-processed.
Any pointers, ideas, or even just brainstorming would be greatly appreciated! Thanks in advance!
I like SDXL a lot but its very finicky to fine tune compared to modern models
I would recommend using a large lycoris lokr in simple tuner
for SDXL
but that is not simple
oh im using onetrainer
was using kohya but i hit an OOM issue on my 8gb 3070
LOL
lycoris.. hmm
time to google
ah I don't know much about the low vram stuff cos I use cloud
do you have like base tips on dataset itself?
example, i provide like 10 pictures of myself in selfie
is that bad?
yeah my tip for dataset is
get a bunch of image quality assessment models, and an image embedding model
and download a big dataset from somewhere like huggingface
take one image that you like, and use embeddings to search for the most similar images in the dataset
then use image quality assessment models to filter out low quality
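the shape of that pipeline as a sketch, assuming you've already run some embedding model and some quality scorer over the dataset and saved the numbers. both models are left out on purpose, and the names, scores and thresholds below are placeholders.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def select_training_images(query_embedding, candidates, top_k=30, min_quality=0.5):
    """candidates: list of (name, embedding, quality_score) tuples.

    Step 1: keep the top_k images most similar to the reference image.
    Step 2: drop anything the quality scorer rated below min_quality.
    """
    ranked = sorted(candidates,
                    key=lambda c: cosine_similarity(query_embedding, c[1]),
                    reverse=True)
    return [name for name, _, quality in ranked[:top_k] if quality >= min_quality]

# Toy 2-d embeddings and made-up quality scores, just to show the flow.
dataset = [
    ("a.png", [1.0, 0.0], 0.9),
    ("b.png", [0.0, 1.0], 0.9),
    ("c.png", [0.9, 0.1], 0.2),
    ("d.png", [0.8, 0.2], 0.8),
]
print(select_training_images([1.0, 0.0], dataset, top_k=3))
```

on a real dataset you'd want a vector index instead of a full sort, but the logic is the same.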
10 pictures is fine for flux, for SDXL it is on the low side but it might be ok
if the method I gave was too complex then just try to manually get like 30 high quality images if you can
so meaning, i try to take 30 pictures of myself?
do i have to be in different expressions? close up and full body?
if my goal is to learn my face.
for lora what you need to do is use images that are close to the style and structure that you want your output images to be
so if you want images that are full body it is not good to train a lora only on close up face portrait
it could still work but its better to have a closer match between training image and desired output image
so like if your goal is to make full body images then train on full body images
i see
ah i was thinking in the way that once the lora learns my 'face'
it will know to transplant it to any body and pose
LMAO
yeah its a really common mistake with loras
it can still work sometimes, especially with strong models
but its just a lot better to train it close to the image you want
Hello
hi all, wanted to ask if anyone knows if the stabletuner discord is still around and kicking or not!
hello fellas!
o7
No?
the github is archived. the discord is probably not any more live than the github is - and the discord invite on the archived github is invalid
Are there any other similar tools available that are more recent?
what are you specifically trying to do?
Fine tune SD
a bit more detail on what you're trying to accomplish, please. there are hundreds of things that might work but are probably not quite what you want. i need more information before i can recommend anything
Sure, yeah. This is a work project so I can only share so much under NDA but we’re looking to fine tune SD with control net-like mechanism for highly controlled image generation
I appreciate any guidance you can give on the matter; it seems as though most tools in this area are fairly old and somewhat deprecated
ah. you're on the wrong discord for this. you need to be on the L3 discord and you need to talk to several people there. starting with @fervent thunder
i can dm you an invite if you like
I would appreciate that very much, thank you 🙂
Hello
sent
Thanks!
happy valentines day everyone :3 💜
happy valentines
for reference the developer of StableTuner, after publishing that project, got hired at Stability AI, worked there for a couple years, then left, joined BFL, works there now. Probably quite a few more life events between when he last updated that project and now lol.
There's tons of other SD trainer software options, like kohya's scripts, OneTrainer, etc.
Yeah it’s an old repo 😅 I was hoping there was a better option than something 4-ish years old and archived
Wait who made stabletuner? oh, that dude!
Hey guys
I want to replicate something like this -> https://www.youtube.com/shorts/mhSMgSFdYsM
How are people able to generate consistent characters?
Do they train a lora for the characters or how does it usually work?
it was me
How are you liking it over at BFL then 
hmm
are there like "light weight" versions of sdxl?
i love SD1.5 run speed on my 8gb vram
but SDXL images looks nice out of the box.
the only channels you can generate in are the artisan channels and there are only 4 of them. read the information in #artisan-faq first
yea segmind vega is good
its my main model
i see
gonna give it a try
hahaha
i'm mostly ok with sdxl , juggernaut
and it fits within my 8gb vram
but
if im trying to train lora...
Do you train your own LoRAs, neon?
yo
the video seems to be using face swapping, it works pretty well
hey guys
im having some trouble with connecting AUTOMATIC1111 stable diffusion running on docker to my owui also running on docker
kinda blocked at this part
well the bit where you enter the API
I'm not really into lora training