#💬|general-chat


quartz seal
#

comfy-ui?

still glacier
#

the usual suspects, comfyui, stable swarm, auto1111, forge. Take a look at the pinned guide in #🤝|tech-support and pick your poison

quartz seal
sand basin
#

Hello!

solemn plume
#

Hi

deep narwhal
#

are AMD gpus pretty good for image generation now?

#

i wanna upgrade my card but i also wanna generate stuff

#

my vote goes to swarm

atomic mortar
#

nvidia works pretty out of the box and amd sorta lost the ai race for now

quartz siren
#

Flux is still overall the best model especially with the flux toolkit controlnets and redux.

Sd3.5 is better for artistic styles though. There is also a small 2b version which is nice.

deep narwhal
#

or training and stuff like that

atomic mortar
#

from what i understand yeah

lucid bobcat
#

A high-resolution photograph of an ancient, towering tree with massive, gnarled roots extending across a vibrant, emerald green forest floor, intricately detailed bark. The tree is adorned with ethereal, glowing bioluminescent flowers and vines, casting a soft, magical light. Perched on a branch is a majestic, white bioluminescent owl with luminous feathers that glint in the golden hues of the setting sun, casting long, dramatic shadows. In the background, a pristine mountain range with snow-capped peaks and a crystal-clear lake reflecting the scene. Ultra-realistic textures and lighting, high dynamic range, 8K resolution, photorealism, and volumetric lighting. Artstation, trending on CGSociety.

desert dagger
atomic mortar
lucid bobcat
atomic mortar
#

phew. i cant imagine someone unironically thinking up a prompt like that naturally

#

call me stupid but tag based prompting is the best way imo

lucid bobcat
atomic mortar
#

i havent messed with flux a lot but natural prompting using words like whimsical, elegant, etc somehow just make me irrationally mad

quartz siren
#

Tag based prompts with flux work well too, it seems more creative with it imo.

lucid bobcat
deep narwhal
#

does tag based prompting work well with illustrious?

atomic mortar
deep narwhal
#

artist tags on illustrious confuse the hell out of me. is there a file with artist name and style out there somewhere because there aint no way people are memorizing all that

atomic mortar
deep narwhal
#

unless they are 😭

atomic mortar
#

sec

#

if you really zoom in you see an image for an artist style ish

#

if you find an artist you like i recommend searching that one up on danbooru/gelbooru etc and copy the artist tag into your promptbox

#

granted for illustrious youd get better results than for base SDXL and if its not a well known artist you might need to use a lora

deep narwhal
#

thanks! ill check it out

atomic mortar
#

it also has a character list (2048 images) that shows supported characters (mostly)

desert dagger
# lucid bobcat Do you or <@407561236339752981> know a good source for prompt terminology? Words...

i have about 7000 hours of deep dives into this over the last 2.5 years, so most of what I know is hours of work, one term at a time, with all the different stable diffusion models and all 3 of the flux models - exploring exactly how the models react to and use the tokens. i can give you a link to my spreadsheet if you want, it's got a fraction of that in it but you might find it useful. 99% of it is now learned information i just simply know at this point.

lucid bobcat
desert dagger
lucid bobcat
desert dagger
#

this is how you find out what the AI thinks about by default when it sees various terms. you give it just that term and generate a few times - i suggest around 10 times - and then study what it did

#

that's going to give you the top of the bell curve for that term and the most likely data associated with that token

#

so you get a feel for what it'll pull up when you use that token with other tokens in a prompt
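The single-term probing described above can be sketched as a small batch plan. This is a minimal sketch, not anyone's actual workflow: `build_probe_jobs`, `run_probe`, and the `generate(prompt, seed)` callable are hypothetical names, and you would swap `generate` for whatever your UI or API exposes. The only things taken from the chat are "prompt with just the term" and "around 10 generations".

```python
def build_probe_jobs(terms, runs_per_term=10, base_seed=1000):
    """Return one job per (term, seed). The prompt is the bare term, nothing else."""
    jobs = []
    for term in terms:
        for i in range(runs_per_term):
            # A different seed per run, so the batch samples the model's default
            # associations for the term instead of repeating one image.
            jobs.append({"prompt": term, "seed": base_seed + i})
    return jobs


def run_probe(jobs, generate):
    """Group outputs by term. `generate` is a hypothetical callable(prompt, seed)."""
    results = {}
    for job in jobs:
        output = generate(job["prompt"], job["seed"])
        results.setdefault(job["prompt"], []).append(output)
    return results


if __name__ == "__main__":
    # Stand-in generator that only returns a file name (assumption for the demo).
    def fake_generate(prompt, seed):
        return f"{prompt.replace(' ', '_')}_{seed}.png"

    jobs = build_probe_jobs(["unconscious", "trending on artstation"])
    for term, outputs in run_probe(jobs, fake_generate).items():
        print(term, len(outputs))
```

Keeping the outputs grouped by term makes it easy to look at all ten side by side and judge what the model pulls up by default for that token.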

lucid bobcat
desert dagger
#

for a single word prompt you should get content that's all over the place IF that word is not a noun

native pasture
#

hi

desert dagger
#

but all the content should represent the term. you shouldn't get fantasy elves for a term like unconscious

#

you should get things that are dead, asleep, or represent unconscious in some way

lucid bobcat
desert dagger
#

but that's the point - you will get what's at the top of the bell curve, what the most likely data is. and if something like 'unconscious' gives you elves having a feast - that gives you an idea of what the data the AI will retrieve is most likely to be when you use that term. for stuff like 'trending on artstation' you should get random data but all with a more artsy look - as that trending tab on artstation is a specific thing - so what the AI learned when it learned that term is whatever was on that trending page when the LAION crawler indexed it into the database. that's a lot of random, artsy noise

deep salmon
#

Hi, I’m looking to make a PS2 style image but I am quite new to this. What steps can I take to generate it? Any website or tutorial? I’m doing it from my laptop so hopefully there’s a cloud platform that allows me to make them

rough coyote
#

Hello, I want to ask which ai is this guy using man 😭

deep salmon
atomic mortar
lucid bobcat
fallow shard
#

Anyone here who has takes on what the US govt should be doing about AI can submit up to 15 pages of them via a Request for Information for the development of an Artificial Intelligence Action Plan: "Through this Request for Information (RFI), OSTP and NITRD NCO seek input from the public, including from academia, industry groups, private sector organizations, state, local, and tribal governments, and any other interested parties, on priority actions that should be included in the Plan." https://www.federalregister.gov/documents/2025/02/06/2025-02305/request-for-information-on-the-development-of-an-artificial-intelligence-ai-action-plan

You have until March 15

lucid bobcat
#

you're welcome

low moon
fallow shard
#

yep

drowsy hearth
#

Why do people think there's a meritocratic future with ai prompting

#

Like the point of the tool is that it draws really well for you, just have fun with it at least.

#

Idk if you wanna make money with the prompter, set up a list of prompts that generally work, like 200 prompts, and just have a bot cycle through those 200 prompts on twitter and post 7-8 times a day. youll make money pretty fast if you wanna make money off the prompter

fervent thunder
warped heart
#

anyone online please

#

how to prevent generating characters' clothes with wind effect

#

SD always generates it like wind blows from below

desert dagger
little flower
#

hi

patent temple
#

/subscribe

desert dagger
lucid bobcat
#

Ban this guy

drowsy hearth
#

Civitai moderation took down an article a few days ago that I was going through discussing the harm of synthetic images in data sets. I dont understand why civitai would take down the article discussing this paper and they didnt list a reason as to why the article was removed and instead its just gone. It is confusing because now I cant actually prove this paper was floating around civitai not even a month ago because the evidence that it ever existed in that way is gone. You seem to at least try to be honest so maybe you can explain whats going on. Here is the paper Im talking about. https://arxiv.org/pdf/2311.12202 @finite cloak

desert dagger
drowsy hearth
#

I was just asking in case theres something actually wrong with the paper or not that would cause moderation I guess I could ask people at civitai but im already in this server man. Idk maybe you can answer the question.

drowsy hearth
glossy tulip
#

just got stable diffusion on my comp. My first image blows. just thought Id let yall know.

quartz swan
#

thats so real

#

share it with the class

#

dont be shy

desert dagger
cedar salmon
#

@drowsy hearth Maxfield

glossy tulip
drowsy hearth
#

You dont mean the illustrator i would think right?

cedar salmon
#

a co founder of Civit, also now involved with Stability and is here

fervent thunder
#

I think the paper is misleading but that does not mean it should have been removed from Civit
having said that it is likely something else happened to cause the removal

desert dagger
cedar salmon
#

ya im sure he is a busy man

desert dagger
#

yeah, and just starting this job, so all the stuff you have to come up to speed on with a new job as well

fervent thunder
#

we quite regularly have diffusion models with like 50% or more artificial training data these days

drowsy hearth
#

or upwards of 50%

#

cuz the one Ive seen discussed the most when it comes to artificial training data is pony

fervent thunder
#

pony doesn't really count because it was not trained right

drowsy hearth
#

cuz I wanna read about it

fervent thunder
#

if I remember rightly Janus Pro

#

it is autoregressive rather than diffusion

drowsy hearth
#

also what about pony wasnt trained right because when I use the pony model it seems to just visibly outperform every stable diffusion model.

fervent thunder
#

well it depends on if you test it on its own terms or not

#

if you benchmark it using standard benchmarks it will not do well

#

but that's not necessarily fair to the model

drowsy hearth
#

what does a standard benchmark test look like?

desert dagger
drowsy hearth
#

All that matters to me is how well it responds to composition prompting

desert dagger
drowsy hearth
desert dagger
quartz swan
#

good image = good quality basically?

desert dagger
quartz swan
drowsy hearth
#

also an ai generated lora isnt going to occupy anywhere near 3% of a dataset thats numbered in the billions

desert dagger
drowsy hearth
desert dagger
drowsy hearth
#

Like you dont have to patronize or gaslight because I have 5 articles written by the university professors in ml that you love to glaze

fervent thunder
quartz swan
#

im not sure if i just have been unlucky in finding a good model for it or im just ass 💀

#

but do YOU recommend any models for something like that?

drowsy hearth
desert dagger
drowsy hearth
#

Like if someone indefinitely corrupted an ai model on a super computer then that means its possible to indefinitely corrupt an ai model regardless of how common or easy it might be is kinda how i understand these things

desert dagger
fervent thunder
#

you can always retrain

quartz swan
drowsy hearth
# fervent thunder you can always retrain

I see it doesnt really make a difference either way. But if they didnt manage to fix it with retraining I assume youd basically have to completely retrain the model with another massive dataset.

#

Idk I dont have a super computer to test that kinda thing on

fervent thunder
#

the amount of data needed to fix it could be large yeah

desert dagger
fervent thunder
#

a lot of things are just about budget

desert dagger
quartz swan
fervent thunder
#

like quite often when big models come out people say they can't be trained, but with enough budget they can be

desert dagger
desert dagger
fervent thunder
#

they should probably just say that then

desert dagger
#

The only real problem with using synthetic data for training is if you use junk data. example: use generated images but don't clean them up. they're full of odd strange stuff, and the AI learns that's what it should create. clean them up - you don't have that problem.

#

the same with data for LLMs - go through it, fact check, make sure it's all correct and accurate - it doesn't matter if you scraped it from the web, you wrote it, or a machine wrote it at that point.

fervent thunder
#

yea like if the synthetic data point is an image made with a diffusion model but you did tiled upscale on it and some kind of post-processing pipeline, it would probably be as good or better than most photos

drowsy hearth
#

is pony diffusion a base model?

#

or just a really big fine tune

fervent thunder
#

same for shuttle, its just schnell

drowsy hearth
#

But it only has like 80,000 images at the same time how exactly does it work cuz Im very confused

fervent thunder
#

its just an SDXL fine tune that everyone went crazy about

#

there's not even more to it than that

#

the clips changed but that's not unusual for an SDXL checkpoint as they also can train the clips

desert dagger
drowsy hearth
#

All I know is it seems to work quite well

#

and pony apparently has a ton of synthetic data in it and the way it works without more finetuning is a lot like a basemodel and it responds to finetuning like a basemodel too. Thats all I can definitely say Ive experienced with pony

#

The only difference is that its very performant when compared with its competition and better than alot of its competition in its compositional variety, quality etc.

fervent thunder
cedar salmon
drowsy hearth
cedar salmon
#

and the TE is trained on danbooru tags

#

so it ends up pretty different

desert dagger
cedar salmon
#

Do you still gotta use that silly oops that was trained in? the uptag_quality_1234 crap

drowsy hearth
#

Yall seem to dislike pony diffusion

cedar salmon
#

i dont dislike it, i just dont need it

desert dagger
desert dagger
fervent thunder
#

like RealvisXL changed the weights of SDXL base a lot but I wouldn't call RealvisXL a new base model

desert dagger
fervent thunder
#

are you using the word destroy to mean change?

desert dagger
desert dagger
cedar salmon
#

"Pony Diffusion V6 is a versatile SDXL finetune"

desert dagger
#

yes. scroll a bit farther and you'll find his discussion about those tags

fervent thunder
#

but what does this mean in practice
when I tried the model I prompted R2D2 in baroque palace with various objects and it made the image

#

it has had a lot of changes to the clips

#

and a lot of overfitting combined with forgetting

#

but it was not fundamentally different

drowsy hearth
# fervent thunder are you using the word destroy to mean change?

I wanna draw an AI model. do you think if I used like a 10 image finetune to just put in my render style, then drew only the control nets for the rest of the 70,000 images, I could get an ai model thats really narrowly focused, or even more focused, like pony diffusion?

desert dagger
drowsy hearth
#

70,000 linearts is a lot of work Im not unfamiliar but its also just very possible to do and the amount of painting over I have to do with any given ai generation is like 30 seconds to a minute.

cedar salmon
#

i think its pretty cool, nice big tune, successful, but im not allowed to use it cuz i dont know the difference between cartoon and anime

fervent thunder
#

I don't know the difference either TBH

cedar salmon
#

damn boomers

fervent thunder
finite cloak
# drowsy hearth Civitai moderation took down an article a few days ago that I was going through ...

I have no idea what civitai moderation would be up to. This paper however is one of those "poisoning techniques" which get frequently misinterpreted and abused. In this paper they repeatedly retrain one model on its own outputs, which is an interesting bit of research into the workings of AI models. However it's bound to lead people to go "aha! posting ai images will destroy future ai models because they'll be poisoned by synthetic data!" which, no, is not the case. AI models often benefit from synthetic data, and are often intentionally trained on synthetic data (for example Flux Dev is trained entirely by synthetic data generated by Flux Pro) -- it's specific details within the process of iteratively looping a model's own data onto itself that breaks it down here. Which is a process that never happens "in the wild", as of course you can't have released a model and also have not-yet-trained the model. (Note also that paper was based on SDv1 or 2, a model class that was known to break down with this type of artifacting if you even sneezed at it too loudly, due to a heavy underuse of normalization calls.)

This type of misunderstanding was most famously pushed by the nightshade/glaze authors, who discovered a very similar technique (poison a single precise target model in pretraining, which of course is never possible to achieve "in the wild" cause the model is already trained by the time you can target it), and then scammed artists in the public by claiming "run our magic software to stop ai training on your art", while in reality it would only stop the training of a very specific AI model that has already been trained, and does nothing regarding future models that are still yet-to-be trained and might incorporate that data. It was also tested and confirmed that nightshade/glaze doesn't stop lora training even on the target model, ie it does nothing of practical use.

So to return to your question: I don't know what civitai mods would be doing, but I can imagine exactly the way discussion on that paper would've gone down stupid/misleading/scammy directions and needed to be removed.
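The "retrain one model on its own outputs" loop can be illustrated with a toy that has nothing to do with diffusion: fit a Gaussian, sample from the fit, refit on those samples, repeat. This is only a sketch under stated assumptions. It is not the paper's setup, `self_training_loop` and `real_fraction` are made-up names, and the 1D Gaussian stands in for a model purely for illustration. With a small sample size and a fully closed loop the fitted spread tends to drift toward zero over many generations, while mixing in fresh samples from the original distribution each round works against that drift.

```python
import random
import statistics


def self_training_loop(n_samples=20, generations=50, real_fraction=0.0, seed=0):
    """Refit a Gaussian on its own samples, generation after generation.

    real_fraction is the share of each generation's data drawn from the
    original N(0, 1) "real" distribution instead of from the current fit.
    Returns a list of (mean, std) pairs, starting with the original (0.0, 1.0).
    """
    rng = random.Random(seed)
    mean, std = 0.0, 1.0
    history = [(mean, std)]
    n_real = round(real_fraction * n_samples)
    for _ in range(generations):
        # Synthetic data: sampled from the current fitted "model".
        samples = [rng.gauss(mean, std) for _ in range(n_samples - n_real)]
        # Fresh data: sampled from the original distribution.
        samples += [rng.gauss(0.0, 1.0) for _ in range(n_real)]
        mean = statistics.fmean(samples)
        std = statistics.pstdev(samples)  # maximum-likelihood estimate of the spread
        history.append((mean, std))
    return history


if __name__ == "__main__":
    closed = self_training_loop(real_fraction=0.0)
    mixed = self_training_loop(real_fraction=0.5)
    print("closed loop final std:", closed[-1][1])
    print("half fresh data final std:", mixed[-1][1])
```

The point of the toy is the same as the message above: the damage comes from the closed loop, not from the data being synthetic as such.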

desert dagger
nova lily
#

if anyone here uses ADetailer in tandem with Ultimate SD Upscale, do you know how to mitigate the obvious mask that it leaves? ty

drowsy hearth
# finite cloak I have no idea what civitai moderation would be up to. This paper however is one...

The part about it thats interesting to me is that the reason the models seem to have collapsed isnt so much because the data is synthetic but moreso because synthetic data can have a lot of bias. I wouldnt particularly want to have synthetic data be untrainable as I have enough foresight and have used models enough to know theres ways to make it work. Also was the flux dev model a distillation of flux pro? My plan with ai tools and the pet project Im working on has a lot to do with training models on synthetic data. I basically want to make a model like pony thats specialized so specifically that it would essentially create a model development process that couldnt be otherwise easily replicated. I used a model at a studio that was NDA but I didnt sign the nda contract because I was visiting. The model was made by artists taking photos of paris and the model seemed to completely and consistently understand paris with 10,000's of images that shared every minute detail about the city of paris and I want to replicate that in a commercial capacity but I dont understand fully how that was done.

#

Also the model used for the oasis project looks like it mightve been made in a similar way but I have no idea for sure they havent actually shared alot about it.

desert dagger
drowsy hearth
desert dagger
#

and would like to actually see what it is

desert dagger
drowsy hearth
#

I dont know exactly how this model was made though

desert dagger
#

oh - this is the generates the world in real time project

#

they did that with doom, too

drowsy hearth
#

The paris one I mentioned was similar but it would just simulate paris in real time

desert dagger
#

yeah. this isn't a video model. it's actually generating the world as you do stuff. there were posts on twitter, and even a couple youtube videos

drowsy hearth
#

But when I saw it he told me it was nda but I didnt actually sign an nda agreement he just showed it to me anyways because he had his artists make the thing

lucid bobcat
desert dagger
desert dagger
#

it doesn't keep coherence very well though

lucid bobcat
drowsy hearth
#

According to decart it has some kinda image diffusion

#

in the minecraft ai

lucid bobcat
#

So, in this Paris model. If you look up at the sky and down again, are you suddenly on a sunny Californian beach?

finite cloak
#

if you scroll down the decart page they show a short overview of the architecture, it's a latent Diffusion Transformer that takes a continual sliding context input of previous frames and current user input

#

iirc from looking into that before it uses error-correcting autoregressive logic. ie it keeps generating 1 frame at a time continually, and they trained it with errors in the frame data to force the model to correct any errors it makes to prevent autoregressive self destruction
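The "train it with errors in the frame data" idea can be sketched at the data-preparation level. This is a toy under loud assumptions and not Decart's code: `make_training_pairs` is a hypothetical helper, frames are plain lists of floats standing in for latents, and Gaussian noise stands in for whatever corruption a real pipeline would use. What it shows is only the shape of the trick: the sliding context of previous frames is corrupted, while the target frame stays clean, so a model trained on these pairs has to learn to recover from imperfect history.

```python
import random


def make_training_pairs(frames, actions, context_len=4, noise_std=0.1, seed=0):
    """Build (noisy context, action, clean target) examples from a frame sequence.

    frames:  list of frames, each a list of floats (toy stand-in for latents)
    actions: list of user inputs, one per frame
    """
    rng = random.Random(seed)
    pairs = []
    for t in range(context_len, len(frames)):
        clean_context = frames[t - context_len:t]
        # Corrupt only the context, imitating the errors the model would
        # feed back to itself when generating one frame at a time.
        noisy_context = [
            [value + rng.gauss(0.0, noise_std) for value in frame]
            for frame in clean_context
        ]
        pairs.append({
            "context": noisy_context,
            "action": actions[t],
            "target": list(frames[t]),  # the target is left uncorrupted
        })
    return pairs


if __name__ == "__main__":
    toy_frames = [[float(i)] * 3 for i in range(10)]
    toy_actions = ["forward"] * 10
    examples = make_training_pairs(toy_frames, toy_actions)
    print(len(examples), examples[0]["target"])
```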

desert dagger
#

however it doesn't have much contextual memory - it forgets what you did fairly easily, and generates generic minecraft stuff.

finite cloak
#

I imagine with the paris model, if you look up at the sky and back down, you'll end up elsewhere in Paris.
Part of how this model design works is it's heavily trained on a single topic, if it understood locations outside of Paris it'd be a lot harder to keep it coherent and efficient.

lucid bobcat
#

Don't look up is the new don't divide by zero.

lucid bobcat
#

Leonardo has been working on this post for quite some time now. Must be his Opus Magnum.

#

Any minute now

drowsy hearth
# finite cloak I imagine with the paris model, if you look up at the sky and back down, you'll ...

The paris model wasnt an interactable you just could prompt it to visualize different areas and it would simulate that space in real time. It was much more consistent because of that I think. Yea I dont think it would make sense to try and make a model by hand thats wide as an ocean shallow as a puddle like the Dall-E because I already have that. I want to make one that understands a specific narrative concept with like 70,000 images that are ai assisted drawings based on my paintings. When I read that paper I didnt interpret it as ai not being able to learn from itself I was actually more interested by the similarity between the generations and their seeds. I thought that it would make sense if an AI model was trained by hand using ai assistance you could likely dig pretty deep without model collapse consequences because it appeared to me that the cause of model collapse had more to do with the AI struggling to expand on composition past its seed. Whats interesting is this is exactly how humans draw as well. You cant actually draw something that isnt loosely based on something youve seen. Ive also done batch generations with different stable diffusion models and noticed that compositional bias has been a progressive issue with the ancient stable diffusion models although they mightve been terrible actually wouldnt replicate the same composition within a prompt right. I want to make something like pony diffusion thats got like 70,000 ai assisted drawings and maybe some fresh ideas with rendering because although rendering is much more biased than composition naturally theres still little subjects here and there like the windmill principle and shape welding. I want to literally make an AI model with AI content its like my ultimate goal with ai bro.

desert dagger
#

You cant actually draw something that isnt loosely based on something youve seen. < little kids draw stuff they've never seen all the time

drowsy hearth
# desert dagger >You cant actually draw something that isnt loosely based on something youve see...

It just appears abstract because your memory doesnt pictograph things even if its completely photographic. Kim Jung Gi talks about this if you wanna look that guy up and learn how drawing works or read drawing with the right side of the brain which discusses the neuroscience of drawing you can. Until then your statement is based in ignorance not fact right. When a little kid draws the process is loosely based on their subjective experience Im not saying that everything you draw is a perfect representation of your memory. Its kinda like how AI makes stuff right when people draw they create an abstraction of an idea that they might not even be able to fully consciously recall. AI works in a similar way according to that paper in that you can find an image in its data set that loosely resembles the composition of the generation but the similarities dont go beyond that. When humans draw they make something that loosely resembles the composition of their memories but doesnt actually perfectly replicate or anything for that matter and can be heavily abstracted to the point that its borderline unrecognizable.

desert dagger
drowsy hearth
# desert dagger contrary to your belief, you really aren't leonardo DaVinci

Im sorry that it upsets you that humans cant actually fully abstract from the images in their head but its basically just proven at this point right. Drawing with the right side of the brain goes through mri scans and shit going into excruciating detail about the neurology behind drawing. Your drawings genuinely cant capably escape the image database stored in your memory its commonly called "the visual library."

#

I told you a book you can read to learn about this maybe you could check out kim jung gi stream as well but past that i cant help you

cedar salmon
#

old research i guess, right/left brain "functions" have been debunked.

#

might use more of a side but its both sides

#

i propose a simple test, remove left side of brain and see if you can still do right side functions

drowsy hearth
#

These things are not the same topic despite the name of the book

#

For instance the way you learn how to draw from imagination is by drawing things you saw a second ago then subsequently drawing things you saw a day ago and you practice drawing further and further back in your memory until you can draw anything from imagination.

#

Theres some extra nuance to it though because the scott robertson style turnaround studies are generally paired with that

lucid bobcat
drowsy hearth
# lucid bobcat That's not true. You never learned that a mouth full of teeth is frightning. You...

Yea your memory isnt just born with images of a mouth full of teeth. The way those things evolve is by hammering in biases over and over that cause you to eye track towards things that vaguely resemble a threat. For instance a bushlike shape placed adjacent to a person is going to track your eyes subconsciously. Its not because you were born knowing what a bush is its because the architecture of your brain just interprets all shapes that resonate as bushlike to be threatening. You can read about that too in a book called imaginative realism when it discusses compositional eyetracking. It has to do with predetermined biases towards shapes not because you know what teeth are when youre born you still need to learn what teeth are through formative growth by seeing them.

cedar salmon
#

i have a mouth full of teeth and its not frightening, but there are some concepts people understand/feel that arent taught

drowsy hearth
#

Thats why phobias appear like a fear of holes despite the fact holes arent threatening the fear is there because the simple shapes resemble something that was threatening to our ancient ancestors.

cedar salmon
#

ya that popped into my head, couldn't remember the term for it, some survival instinct

drowsy hearth
#

Its not like your brain has prestored pictures it just has a way of interpreting images almost like a clip model

cedar salmon
#

when people were shorter and lived near the water

lucid bobcat
drowsy hearth
# lucid bobcat I didn't say we are born with images of a mouth full of teeth. I said we are bor...

well to be fair the extent of whats stored in the optical region of the brain and how thats understood is fairly shallow. The same goes with visual libraries right. The thing that interests me about AI is that it seems very similar to visual libraries and a huge step in learning how to draw is learning how to reference your visual library so you dont need reference anymore. The only viable and proven way to access the visual library is through turn around and drawing memories progressively further and further back in time until they begin to abstract to the point that even youre not totally consciously aware of where they come from. But the funny thing is if you just scribble on a piece of paper youre actually doing the exact same thing in a way as an artist who draws incredibly well without reference. The key difference between the two is that the artist that draws really well without reference learned how to extrapolate that information in a way that appears realistic. Does this make sense?

#

So an AI model is kinda like a visual library and a lot of what I read about it at least in a surface level way seems very similar. The only difference is human visual libraries are orders of magnitude larger than AI models with trillions of synapses.

lucid bobcat
desert dagger
# cedar salmon i propose a simple test, remove left side of brain and see if you can still do r...

actually, there are people that have very little brain at all - it's a birth defect - but they can function, and do things. and one was even on a talk show. They can't do, obviously, what someone with a full brain can do - but they can do everything both 'sides' are supposed to do however in a limited way. And there are people with severe brain damage that recover and can live almost normal lives. The brain is very flexible.

#

wrong discord

desert dagger
#

@vapid dove spammer alert

lucid bobcat
desert dagger
cedar salmon
#

hes not wrong just maybe not the best choice of words, dont throw the baby out with the bathwater

#

more like sparse data points that can be reassembled

lucid bobcat
cedar salmon
#

yup that may be

drowsy hearth
#

for instance someone might learn turnarounds really fast but they need to study gesture for months while someone might be really good at gesture really fast then need to study turnarounds for months. People generally have a proclivity towards certain things but the overarching process of learning how to draw from imagination is pretty cut and dry.

minor vapor
#

Hey, so I am trying to create a story with AI images, I wanted to understand how I can get multiple character consistency in a scene.
Can anyone please help me with the same?

quartz swan
#

aw 😔

#

cant send gifs

night gladeBOT
atomic mortar
quartz swan
atomic mortar
#

But it's a numbers game

atomic mortar
quartz swan
#

bet dms

#

its perfect euro

minor vapor
atomic mortar
#

Ah ive seen the gif. Ill visualize it

minor vapor
#

Oh so I work with midjourney actually...

atomic mortar
#

Here's the neat part. You don't

quartz swan
#

yeah that kind of sums it up

atomic mortar
#

Especially with online services

quartz swan
#

yeah its already hit or miss with local stuff 💀

atomic mortar
#

Locally you could do some wack inpainting

quartz swan
#

i can only imagine how bad itd be on online stuff-

quartz swan
minor vapor
#

so I have achieved it before with animated images

#

I am trying to do the same with realistic images, it is pretty ANNOYING...

atomic mortar
#

But how i generally do it for certain commissions. Make the character a transparent png and place it in the scene

quartz swan
#

thats ai for you

#

annoying lmao

atomic mortar
quartz swan
#

i could probably gen a fuck ton of images, pick out the perfect two, cut them out-

#

yeah

#

easy work

minor vapor
quartz swan
#

20 minutes tops in photoshop, im gonna try that sometime

pliant hornet
#

can u ppl recommend me a model to create wide angle images with longer prompt

#

instead of close up portrait

quartz swan
#

any model

pliant hornet
#

no

atomic mortar
pliant hornet
#

dreamshaper and juggernaut doesnt help me with that

atomic mortar
quartz swan
atomic mortar
pliant hornet
#

always creating close up images and cant handle much detailed prompt

atomic mortar
#

Meant swarm

quartz swan
quartz swan
#

craazzzzzyyyyy

minor vapor
#

So I have previously created a music video with AI, I could achieve consistency to some extent along with adding movement to the images using another AI software!

quartz swan
#

euro laugh

#

howd you mix up swarm and flux, there had to have been like

#

a clash of two thoughts

atomic mortar
#

I've been haunted by my friend chat. Ill send lmao

minor vapor
quartz swan
pliant hornet
#

OutOfMemoryError: CUDA out of memory. Tried to allocate 7.98 GiB. GPU

#

im getting this error when i increased the resolution to 16:9

#

from 1024x1024

#

and also when i tried to use upscale

warm junco
#

And for upscale you need the tiled VAE extension

#

Then try 720x1280 and upscale by 1.5 to get FullHD or by 2 for WQHD

#

Hires steps on 10
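
The upscale arithmetic above can be checked with a short sketch. It assumes "720x1280" means a 16:9 frame that is 1280 wide and 720 tall; `upscaled_size` is a hypothetical helper name, not a setting in any UI.

```python
def upscaled_size(width: int, height: int, factor: float) -> tuple[int, int]:
    """Return the output size of a hires/upscale pass for a given base size."""
    return round(width * factor), round(height * factor)


base = (1280, 720)  # assumption: the 720x1280 above is a 16:9 frame, 1280 wide
print(upscaled_size(*base, 1.5))  # (1920, 1080), i.e. Full HD
print(upscaled_size(*base, 2))    # (2560, 1440), i.e. WQHD
```

Generating at the smaller base size and upscaling afterwards is what keeps the first pass inside VRAM.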

fervent oracle
#

Hello everyone! I'm a newbie in Stable Diffusion. Who can help me in DMs?

devout rover
#

I wanna train a model with my pics, anyone can help me with this?

frigid quartz
#

Hello can somebody tell me how i can attach sora to stable diffusion? i got model of me in replicate

crimson tulip
#

Anyone have any tips for blending style Loras with likeness ones? I find I lose a lot of the likeness when I try and mix with styles.

atomic mortar
atomic mortar
#

seeing how sora is closed source and cant be run locally i recommend the LTX model for image to video or wait for hunyuan to release its image to video model (this requires a really good pc, 4090 or better)

#

LTX can be run on a 30xx series though

#

takes a bit

#

hope that answers your question. setting up LTX is actually pretty easy 👍 if you got other questions youre free to shoot a dm or reply

atomic mortar
#

Love it when people randomly ping without responding

fervent thunder
#

I mute discord tabs and just let every server accumulate red ping numbers
the effect is that pings actually don't do anything to me at all

#

I did this without modding the client cos that does break terms of service

atomic mortar
fervent thunder
#

tends to be either very old or very young people, or, separately, people with non-native English

#

cos not every country has commonly used social media that resembles ours

atomic mortar
#

true but i assume if someone is competent enough to download discord, find the stable diffusion server, get into general chat and ping a person, hes capable enough to type a response back

fervent thunder
#

well I use discord via browser for example

#

and if I remember rightly you don't even have to verify email at first

atomic mortar
#

you do nowadays no?

fervent thunder
#

not sure

#

Hey all! I’m a newbie in SD land. Hope to get more proficient in ai this year before the world ends.

atomic mortar
#

otherwise i recommend the generator on CivitAI

fervent thunder
#

recommending CivitAI generator to newbies is probably a decent idea TBH yeah

deep salmon
#

i'm a newbie too. i just tried rundiffusion's free 30 min trial as i don't have the hardware to run locally

#

does civitai's generator allow img2img?

fervent thunder
#

not sure

#

I feel like probably

#

I can't open that site it lags too hard for me

deep salmon
#

yeah for some reason it uses a bunch more cpu than other tabs

atomic mortar
#

Its an unoptimized site but i think it does? You still use credits, so your free daily generating is limited

outer oriole
lucid bobcat
# drowsy hearth Naw the neurology is already pretty purely understood theres actually a process ...

No it's just. You just won't understand it will you? All the research you refer to only looked at one form of how most artists draw and paint things. It doesn't mean that all drawings/paintings ever made follow that principle. It doesn't mean you can simply condense the different artistic processes into an algorithm. If anything, you could argue that art is what doesn't work like that default "algorithm" people have in their minds. And then there's another point that you also just don't seem to understand: All the imagery that mankind has produced so far does not contain all the imagery that mankind will produce from hereon. You can't extrapolate mankind from rules, that's not how it works! Even your most advanced image generator will at some point be outdated. Machine learning does not have that evolutionary process because everything beyond the domain learned from the training data is just noise. You cannot feed even the most advanced futuristic super AI with everything mankind has done up until 1900 and it's just magically gonna invent Cubism, Surrealism, Jazz, Rock and Roll etc. Not how it works!

desert dagger
atomic mortar
desert dagger
atomic mortar
#

maybe but its also "can someone help me in dm" i dm and no reply lmao

desert dagger
# atomic mortar maybe but its also "can someone help me in dm" i dm and no reply lmao

that could be discord sticking your dm in their spam folder. I didn't think, for the longest time, there even is a spam folder and that people who said they found stuff in theirs were full of it - but i did stumble across it one day. if discord puts DMs in it, discord does not notify you that it did so - so you never know someone dm'd you at all

atomic mortar
#

wait theres a spam folder? where lol i got dm requests on but you get a red dot for that

desert dagger
#

so i'll send a DM but also tag the person on the discord and tell them I DM'd

atomic mortar
#

oh no way i found it now what

#

thats neat lol

desert dagger
atomic mortar
#

that was agonizing

desert dagger
#

that sounds painful

atomic mortar
#

switched it to sessions because i couldn't be bothered doing it 40 times in a row

atomic mortar
abstract latch
#

i like to add to the end of it, and if it's software related.....log file or gtfo

#

😄

#

at my last place of employment, I was like SME for a buncha legacy stuff, so I made up guidelines for help requests, with steps to gather the required log files for me to help them in various situations. Little did they realize at first, it was lowkey teaching them how to troubleshoot it without my involvement. lol

verbal yew
#

guys i have a project that i rly want to do but i struggle with generating what i ask in my prompt

#

if someone can help me realise this it would be great (btw im french so sorry for my english)

desert dagger
cedar salmon
vapid solstice
#

Hey, how’s it going? Kinda new to all this. Was just wondering if anyone here is running stable diffusion on a MacBook?

cedar salmon
#

SD3.6 when

drowsy hearth
lucid bobcat
drowsy hearth
#

It doesn't seem super deep bro, ai just makes stuff associated with preexisting material in its data set, its not magic

lucid bobcat
#

Funny how you suddenly flip the script.

burnt copper
#

How do I enable vpred in forge webui?

desert dagger
desert dagger
drowsy hearth
#

All I gotta do is draw 70,000 storyboards. Controlnets are so good you can basically plug em in and have the ai make something it wouldnt otherwise make.

#

just put any style I want in there when I want to paint it

desert dagger
#

otherwise, no, you really don't

#

if you have to draw 70K storyboards to get something decent, there's an issue

drowsy hearth
drowsy hearth
#

then once i get up to 70k i can just replace the textual embedding like they did with pony then write checkpoints over my basemodel starting with loras again

outer oriole
#

hello everyone

compact thicket
#

hey all! popping in cuz I'm getting started with locally hosted stable diffusion stuff

#

I was wondering if anyone had a good guide to training a model on your own art? I wanted to put something together using my own art style so I could quickly generate stuff for personal use/fun :)

fervent thunder
#

to be able to be responsive to guidance that is guiding it from bad to good

drowsy hearth
pine path
fervent thunder
drowsy hearth
fervent thunder
#

if you do a different architecture to base LDM or base DiT then potentially

#

like there are a lot of weird architectures out there

#

they add more stuff

drowsy hearth
#

I might try out different architectures. From how I understand it, compositional variety is what the fitting process has the most to do with, so I'ma just build up an image library over the course of my classes. I'm already building up a portfolio and it's pretty close to the lighting quality you would see in AI, so I figured I can just make an AI model that specializes in things I want to be able to prompt.

#

storyboards aren't a way to finish a painting or anything, they're just a way to jot down an idea really fast in perspective. That's why storyboard artists like to emphasize learning gesture: it's all about getting the essence of an image out of your head as quickly as possible, so the only technicalities that need to be there are the ones that are productive to the storytelling component of the composition.

desert dagger
compact thicket
long holly
#

Have anyone tried this

desert dagger
desert dagger
compact thicket
compact thicket
#

I've been wanting to create a good rapid thumbnailer, hehe

long holly
#

This app allows you to change your outfit using a image

desert dagger
desert dagger
compact thicket
#

I also would like to use it just for fun generated images for use in sillytavern, so i dont ahve to make a bunch of expressions for characters if i dont have the time to lol

long holly
desert dagger
pliant hornet
#

[tiled vae]: the input size is tiny and unnecessary to tile.

#

how to fix this one

fervent thunder
#

if it was comfy it attempts to VAE decode first without tiling and then if it hits VRAM limit it makes a second attempt with tiling

#

so it was probably vram thing

pliant hornet
#

it was something about the tile size as i remember

#

to fix it

#

but i dont know the exact value

fervent thunder
#

could you consider switching to comfy or diffusers?

pliant hornet
#

no

#

they're causing problems with amd gpu

fervent thunder
#

would it be possible to sell amd gpu and buy nvidia?

desert dagger
pliant hornet
pliant hornet
#

oh no

#

is Forge webui good

desert dagger
pliant hornet
#

yes i checked it out

#

forge webui seems to fix my issues

desert dagger
copper forge
#

Is there an actual list of Artist and Art Styles that are used in SD 3.5 Large

I've searched for the mythical list and have only come up with links that say there are lists, which is not helpful ❌

ChatGPT succeeded in rooting out a Styles List with a link to Github. Multiple Styles with example Images and how to implement the styles - None of them work as advertised. Using latest version of ComfyUI, v0.3.14.
Absolutely frustrated.

SDXL has superior support on artists and art styles and very easy to implement in prompts but lacks the improvements implemented in SD 3.5. That would have been a big edge on Flux

pliant hornet
#

Never OOM Integrated

Enabled for UNet (always maximize offload)
Enabled for VAE (always tiled)

#

does the bottom option work instead of tiled vae

#

im getting cuda error again

atomic mortar
fervent thunder
pliant hornet
copper forge
# fervent thunder that's the big lawsuit attractor though, to train on proprietary names, styles, ...

There are bigger problems. I've been running tests for about 5 hours. Art styles are hit and miss even using the Github link listing the art styles. 1024x1024 images have a greater chance of rendering as some form of art style - not necessarily the one specified in the prompt.

ComfyUI - Using a 4090 card and have 64 GB sys memory. Even flushing memory in Comfy to eliminate cached info doesn't correct style prompt errors.

atomic mortar
copper forge
#

No errors

pliant hornet
#

there is no tiled vae in forge webui and i got cuda error again

#

lowered the native res, its fixed

copper forge
#

other models, Flux and SDXL, work flawlessly

atomic mortar
copper forge
#

1024x1024

pliant hornet
#

704 to 960

atomic mortar
pliant hornet
#

realvisxl

copper forge
#

3.5 Large.

atomic mortar
#

Im asking the numbers guy lol

pliant hornet
#

what

atomic mortar
copper forge
#

and other fine tuned 3.5 models to compare against

pliant hornet
atomic mortar
#

You can swap them too like 2:3 for tall images etc

pliant hornet
#

thx

fervent thunder
pliant hornet
#

RuntimeError: mat1 and mat2 shapes cannot be multiplied (154x2048 and 768x320)

#

controlnet giving error
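
One possible reading of that error, offered as an assumption because the thread never says which ControlNet was loaded: the text features are 2048 wide, which is what SDXL-family checkpoints (such as the realvisxl mentioned earlier) produce, while the layer expects 768-wide features, which is what SD 1.5 ControlNets use. The sketch below only reproduces the shape check; `matmul_shape` is a hypothetical helper, not a PyTorch function.

```python
def matmul_shape(a: tuple[int, int], b: tuple[int, int]) -> tuple[int, int]:
    """Return the result shape of a @ b, or raise if the inner dimensions differ."""
    if a[1] != b[0]:
        raise ValueError(
            f"shapes cannot be multiplied ({a[0]}x{a[1]} and {b[0]}x{b[1]})"
        )
    return a[0], b[1]


# Shapes copied from the error message above.
text_features = (154, 2048)  # assumption: 2048-wide features from an SDXL checkpoint
projection = (768, 320)      # assumption: a layer from an SD 1.5 ControlNet

try:
    matmul_shape(text_features, projection)
except ValueError as err:
    print(err)

# With matching families the inner dimensions agree and the product is defined.
print(matmul_shape((154, 768), projection))  # (154, 320)
```

If that reading is right, the fix would be to pick a ControlNet built for the same model family as the checkpoint.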

pliant hornet
#

sry

tall matrix
#

im trying to run Automatic1111 in open webui does anyone know how to?

atomic mortar
tall matrix
#

in automatic 1111 i ran the playground 2.5 model but all i got was a static image

drowsy hearth
#

Why would people care if I storyboarded lora models? I gotta know cuz the more I think about it the more confused I get.

#

Wouldnt that just be dope as fuck? Like I could make the lora model do literally whatever I want. Any pose model I could think of I could work up even now in like a few days.

#

I just want robots with better detachable limbs thatd be a good lora model

atomic mortar
drowsy hearth
bright flame
#

Hello, I would like to discuss cooperation opportunities . How do I contact ?

atomic mortar
warm junco
warm junco
bright flame
atomic mortar
#

well on their website there could be a contact us page

#

but i doubt they would cooperate with a one man show

bright flame
#

alright~

fervent thunder
#

Hellooo

drowsy hearth
#

Does giving bedrooms proper name variations in training like calling it "Courtney's bedroom" make the bedroom backgrounds more consistent in Lora models when I use the proper noun tag?

atomic mortar
#

replace "bedroom" with like "courtneyroom"

#

is better

#

to avoid existing tags

drowsy hearth
#

To the point where the ai model itself is an application of world building

atomic mortar
#

so like flux would? natural prompting and all

#

gross

#

also "courtney's bedroom" is four tokens while "bdrm" is 2

drowsy hearth
atomic mortar
#

I think after observing your ramblings that you dont know what you're talking about mostly

atomic mortar
#

I'm not going to entertain your train of thoughts further as we're getting nowhere

drowsy hearth
atomic mortar
#

Just i have thing, i have images of thing, i call it whatever in my lora and its just that

drowsy hearth
atomic mortar
#

Really, thats it

drowsy hearth
atomic mortar
#

Or even better, follow a tutorial

drowsy hearth
#

I'm just gonna assume yes for the most part

drowsy hearth
#

@finite cloak could you answer my question it's pretty much the only question about ai that I have left before I leave

finite cloak
#

what

drowsy hearth
# finite cloak what

I'm asking how far unique tags can be pushed before they stop being useful. For instance, instead of just naming the character with a lora trigger, I name the environments that character occupies as well, with things like street names or the Courtney's bedroom example I used earlier. I've found having the name of the character in the tags increases the consistency with which that character is portrayed, so I was wondering if naming streets, rooms and houses is productive in the same way that naming the character is

pliant hornet
#

webui forge is terribly slow

#

any idea how to fix

#

[Low GPU VRAM Warning] To solve the problem, you can set the 'GPU Weights' (on the top of page) to a lower value.
[Low GPU VRAM Warning] If you cannot find 'GPU Weights', you can click the 'all' option in the 'UI' area on the left-top corner of the webpage.
[Low GPU VRAM Warning] If you want to take the risk of NVIDIA GPU fallback and test the 10x slower speed, you can (but are highly not recommended to) add '--disable-gpu-warning' to CMD flags to remove this warning.

drowsy hearth
finite cloak
# drowsy hearth I'm asking how far using unique tags can be used before they stop being useful. ...

no mainline image generation models have been trained to use tags, only anime models are. Normal models are trained for natural language captions and/or "internet natural" captions -- the latter being how images on the internet are often captioned: "Some natural description of a scene in plain English here, cameraname, location, year, etc. detail", or a similar format with a small set of "tags" appended on the end like that.

When training a lora and not using anime models, you want to match the way the rest of the model is trained, ie use natural language.
Some people support abusing the tokens of a text encoder, ie replace the name with an obscure specific token in the textenc to try to match it, other people support abusing relevant natural language - for example if training a person, you might give the name of a celebrity that looks similar, just to make it so the model has to learn less unique info, just adapt something it already mostly knows.
But for the most part as long as you're not actively fighting prior knowledge heavy, you can caption however you want, and should absolutely caption with just the normal plain name of whatever you're training.

As for what to name explicitly: name whatever you want to be able to call back easily. If you have several pictures taken on a particular street, and you want to be able to ask the model to generate within that street, name the street in your caption. If you don't care about the specific street, call it "a street".

Be aware that associating content with labels means if you forget that label, the model might skip that content. For example if you have images captioned "Courtney in Courtney's bedroom", and you later generate "Courtney in her bedroom" with that model, it might make up a new bedroom, because it expects "Courtney's bedroom" to only appear when that exact phrase is used (or it might just generate her bedroom correctly anyway. AI does AI things.)
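
The captioning advice above can be sketched as a tiny caption builder. This is only an illustration: `build_caption` and its parameters are hypothetical names, and the caption wording is made up for the example.

```python
from typing import Optional


def build_caption(subject: str, generic_place: str,
                  place_name: Optional[str] = None, details: str = "") -> str:
    """Build a natural-language caption; name the place only if you want to recall it later."""
    place = place_name if place_name else generic_place
    caption = f"{subject} in {place}"
    return f"{caption}, {details}" if details else caption


# Named location: the model can later be asked for this exact room by this exact phrase.
print(build_caption("Courtney", "a bedroom", place_name="Courtney's bedroom",
                    details="sitting on the bed, warm evening light"))

# Location not worth recalling: keep it generic.
print(build_caption("Courtney", "a street"))
```

The same phrase then has to be reused at generation time, otherwise the model may make up a new room, as noted above.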

pliant hornet
#

it just got more slow

drowsy hearth
pliant hornet
#

are u on amd

drowsy hearth
warm junco
pliant hornet
warm junco
#

I guess with zluda then, And which model do you use?

#

Or do you use DirectML?

pliant hornet
#

i was the guy with 341789134578913547893514789 nick

#

zluda

warm junco
#

Ahh

pliant hornet
#

is there tiled vae on forge ?

warm junco
#

Yep, it's called Never OOM; for tiled VAE only, use the option at the bottom

pliant hornet
#

it aint working i guess

#

getting cuda error

warm junco
#

It works

atomic mortar
copper forge
# fervent thunder this isn't a problem this is how it is supposed to be, if they train on art styl...

Any artist that died over 100 years ago. Painters from the 1800's are beyond copyright protection issues. Even some artists from the early 1900's are open. They are classic artists and SDXL included many of them. No legal or copyright infringements in those cases.

A bigger problem is that applying "Art Styles" is broken. Applying them as noted will not guarantee the style is applied. Something about 3.5 is that it doesn't "flush" an old art style when it is replaced with another (training issue or bug ???).
The best chance for them to stick is using a 1:1 ratio. Outside that ratio, things will likely break. I understand this has more to do with it being a training issue.

Solution: Render in SDXL with a solid finetune model - img2img with a finetuned FLUX model for output. That's an extra step, but it works for now.

drowsy hearth
#

I.E. a worms eye view, birds eye view, quarter angle and orthographic views of main street.

lucid bobcat
#

In an ideal case the common factor between all your training images is only the subject and nothing else.

#

That includes abstract things like perspectives as well. Your captioning should also describe everything in the image EXCEPT the subject that you're trying to learn.

frail mist
#

hi

#

is stylegan3 the successor to Neural Style Transfer?

#

I am newb to this, so I am figuring out a gameplan for what I want to do

#

I am looking for a model to replicate a certain art style then apply it to anything I want

brisk ferry
#

whats the recommended UI these days

#

auto is dead and it seems comfy's the new meta

atomic mortar
ebon forge
#

hey, I would like to start using stable diffusion on local but I don't know where to start. Can you help me?

ebon forge
#

AMD radeon graphics

atomic mortar
#

Hmm a bit more specific? Can you open task manager and i think its the second tab that shows off your gpu and such

ebon forge
brisk ferry
atomic mortar
brisk ferry
atomic mortar
#

i mean i think you can yeah. its a comfyUI backend.

#

meaning swarm is just run over it

atomic mortar
#

well you probably want the second one as it has a comfy tutorial

#

but seeing how it really wants to use comfymanager (which isnt always the best option from what i heard) you need to install that manually in the comfyui backend in swarm

#

i think youd want to use git clone method 👍

brisk ferry
#

are you saying comfymanager isnt compatible with swarm

atomic mortar
#

it IS compatible

brisk ferry
#

oh ok

#

thank you

atomic mortar
#

but you cant download the portable version

#

you need to install it using the gitclone method

fervent thunder
#

I agree there are old things that no longer have copyright

#

the thing about flux is its so big

#

the dataset must be huge

#

the cost of curating that dataset more could be so high that it would cut into training budget

#

its not just budget its also time because it was made pretty hastily

atomic mortar
#

so the basic ratio

#

a portrait (3:4) would be 896x1152
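
The 896x1152 figure above is consistent with a common rule of thumb: keep roughly the pixel count of 1024x1024 and snap each side to a multiple of 64. That rule is an assumption here (the thread does not state it), and `resolution_for_aspect` is a hypothetical helper. Because of the snapping, the result only approximates the requested ratio.

```python
import math


def resolution_for_aspect(aspect_w: int, aspect_h: int,
                          base: int = 1024, multiple: int = 64) -> tuple[int, int]:
    """Pick a width and height near base*base pixels for the given aspect ratio."""
    area = base * base
    width = math.sqrt(area * aspect_w / aspect_h)
    height = math.sqrt(area * aspect_h / aspect_w)

    def snap(value: float) -> int:
        # Round to the nearest multiple, never below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


print(resolution_for_aspect(3, 4))   # (896, 1152), the portrait size quoted above
print(resolution_for_aspect(4, 3))   # (1152, 896), swapped for landscape
print(resolution_for_aspect(16, 9))  # (1344, 768)
```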

fervent thunder
#

is that what he meant?
oh ok

#

the models are way stronger in 1:1 yeah

atomic mortar
#

oh in that context yeah i think it means that ish

fervent thunder
#

having said that, 16:9 and 21:9 look drastically better to me in most models

finite cloak
#

models that are trained for multi-aspect (ie most models other than SD1) tend to pick up a content bias relative to resolution. For example, they often do much better at portrait images of people with a vertical aspect -- because most good portrait images out there are vertical. 21:9 will heavily bias towards cinematic source data, and so naturally does well for things that would look nice in a cinematic setup

fervent thunder
#

yeah I do 21:9 for the cinema bias
cos sci fi movie is essentially the only sort of image I ever make

#

I use 1:1 tiles for upscaling or inpainting cos I want to maximise model strength there

opaque epoch
#

hey guys, im looking for nsfw diffusion can anyone tell me how to get there

opaque epoch
#

sorry but where is it?

desert dagger
opaque epoch
#

thnx a bunch

drowsy hearth
# finite cloak models that are trained for multi-aspect (ie most models other than SD1) tend to...

Composition fundamentally just isn't the same between portrait and landscapes. I've had it do bisectional compositions in landscapes kinda like a selfie composition and it just looks bizarre. Putting (rule of thirds:1.2) or something of that magnitude gets it to fill space better in landscape compositions, but it's still very poor and it's also regressive. SD 1.5 seemed to do this kind of thing better than every other model.

desert dagger
atomic mortar
atomic mortar
#

I just blocked him by now, low effort ranting / not willing to listen to anyone unless they agree with him

desert dagger
#

but i get the feeling that Leo would argue with Einstein about relativity, and Schrodinger about quantum mechanics

atomic mortar
#

The stream of low quality users do get tiring sometimes lmao

#

Not saying I'm the best user either

desert dagger
atomic mortar
#

Haven't heard of those before, can you shoot me a invite in dm?

drowsy hearth
# desert dagger you're arguing with a man that programs and trains AIs - professionally. you sho...

I didn't argue, my guy. Anyways, I got the information I need about training, I'm just gonna leave. Also, important note: knowing how composition works and knowing machine learning are not mutually inclusive. Anyways, I'ma go study. This community's fanbase clearly has a cultural problem. I haven't claimed ai isn't art or whatever, I haven't actually argued at all, you just lash out and get offended at the prospect of hand drawing an ai model for no reason. You keep going with your friends until I get angry. I haven't actually seen you post or engage in even using ai tools on this server either. You may rabble-rouse and cause problems. The reason artists hate this tool isn't because of alex mcmonkey, it's because of people like you. I'm leaving

desert dagger
atomic mortar
#

Don't feed him crystal. Hes looking for engagement

desert dagger
atomic mortar
#

But tbh i should post more gens/content but the problem is most requests/commissions i do are nsfw so i cant post most of em lmao

desert dagger
atomic mortar
#

Or its so hyper specific that i wouldn't want to share, albeit hilarious ("rukia eating dessert and being disappointed (she paid $40 and it's worse than storebought)")

atomic mortar
slender hawk
#

hey all

final remnant
#

I am working on creating an LLM, where shall I start?

fervent thunder
#

pytorch docs

#

pytorch actually has better docs and more readable code than like 90% of the stuff that is based on it

fervent thunder
#

The office granted approval and said it determined the image "contains a sufficient amount of human original authorship in the selection, arrangement, and coordination of the AI-generated material that may be regarded as copyrightable."

hmm this sounds like a really big deal

#

lucky that it went this way I think cos it will set precedent

vagrant trail
#

Hey guys!
Sorry for the newb question but I'm new to this and wondering what tutorial you would recommend following if you were to start today. I'm seeing a LOT of tutorials on YT but most being a year or more old makes them kind of obsolete, I feel. If anyone could point me in the right direction I would greatly appreciate it. I'm looking for photorealistic style. 🙏

neon perch
#

heya

atomic mortar
fast sage
#

For flux what should the latent size be? 1024?

#

What about 512x768?

desert dagger
fervent thunder
ripe rain
ripe rain
lucid bobcat
ripe rain
#

1920x1088 is actually my preferred resolution for when I want quick hd gens without having to upscale with another model

lucid bobcat
scenic trellis
#

hello

desert dagger
#

it's always going to be better to do your gens at a lower resolution, and only upscale, with an actual upscaling tool, the ones you actually need to be upscaled

mystic siren
#

does anyone have flux dev FP8 version on a rtx 3060 ti or 8gb vram card + 16g ram to tell me how is the performance?

fervent thunder
#

Flux can generate at 3072x3072 without distortions with certain combinations of checkpoint, lora, sampler and scheduler

fervent thunder
#

cos if its a 4000 series 8gb card then fp8 can work

#

but 3060 ti doesn't have dedicated hardware matmul for fp8

#

it can use int8 however

mystic siren
#

thanks, I'm using the "gguf Q4" and I was wondering about the possibility, but it's ok, the one I got is working fine

fervent thunder
#

which card do you actually have at the moment?

mystic siren
#

it's a pain in the ass to download flux models cos they weigh a lot

#

rtx 3060 ti 8gb vram, I forgot about the 4000 series xd I know it's different technology, just like the 5000 series

fervent thunder
#

rtx 3060 ti has both int 8 and int 4
to get good performance out of int 4 is fairly advanced and tricky but
you could get int 8 easily using Huggingface Quanto its not complex

#

this would be a big speed boost

ripe rain
#

i didnt really notice a huge difference between fp8/q4 other than speed

fervent thunder
#

between FP8 and GGUF Q4 the difference isn't enormous but
Int 4 done in a simple way can lose a lot

#

GGUF Q4 and Int 4 are not the same

#

good quality Int 4 is possible but it needs a longer process, and sometimes needs custom code to run as well

mystic siren
fervent thunder
#

yeah if you like it as it is then that's fine

final tusk
#

Can anyone recommend me a really good artistic model? I am trying to generate an idea for a tattoo but the prompt isn't quite giving me what I want

fervent thunder
#

have you tried flux dev

final tusk
#

I see quite a few different options on civitai

#

which one is the right option? 😛

fervent thunder
#

main thing is to learn the difference between dev and schnell

#

cos they are different

#

dev takes more steps

#

its generally seen that dev is higher quality but I don't fully agree with that in all cases

desert dagger
#

designed for speed, not quality

fervent thunder
#

the reason I think schnell can be higher quality is that
during certain distillation methods they involve an object recognition model e.g. DINO
it is usually DINO
and it boosts the distilled model's ability to separate objects from backgrounds, and make coherent objects
sometimes this actually ends up with the distilled model overtaking the teacher in that aspect

#

at least to my eye, schnell seems to have this effect

#

and models like Flux Fusion that mix dev and schnell end up looking like dev but with that effect

desert dagger
fervent thunder
#

SD 1.5 base is actually my favourite model in some ways
it has the most crazy creative compositions

#

have to seed hunt hard, like pick the best out of 1,000 seeds, but you can get nice ideas from SD 1.5 base

#

my main prompt adherence tip is just tiling, its the way I do it

#

like different prompts per tile

civic cradle
#

why does the image look different even when i use the same image settings

#

only difference is the web ui i used

#

wait my bad wrong chat

viscid perch
#

hello all, I was wondering if there is another model/service in the idea of Runway Act One and Liveportrait ? Mainly for changing facial expression using 2 videos

quartz siren
desert dagger
fervent thunder
#

SD 1.5 can take way more noise than SDXL for some reason also

signal nest
#

if flux is on-topic .. is there any opensource workflow that can do skyboxes using this .. if not same question with recent SD variants

desert dagger
fervent thunder
#

I've actually never used SDXL refiner LOL

desert dagger
#

it was trained on a slightly different dataset, but if you're not going to use the refiner, there's not a lot of point in not just using sd 1.5

fervent thunder
#

the refiner works on any model, you could use it on SD 1.5 or use it on Flux even

#

without refiner SDXL still has some advantages, particularly stronger VAE and hands

#

SD 1.5 has drastically better prompt adherence because of Ella

#

I think SDXL is a bit stronger in layout and structure, but SD 1.5 can win on details

#

details is a bit of a moot point though cos dedicated upscaling models have way more details than either

desert dagger
fervent thunder
#

I think its better to judge quality from benchmarks than by trying to judge it personally

#

and SDXL does benchmark a lot higher than SD 1.5

desert dagger
fervent thunder
#

is just scientific method

quartz siren
#

I think sdxl is better than sd1.5 but not by a really large amount like flux/sd3.5. The refiner made things too plasticky and didn't really seem that useful imo.

I think using a low-step variant of sdxl like lightning, TDD, dmd2 is probably the best choice if you want to use sdxl, they are pretty much the same quality as 25+ steps but much faster.

fervent thunder
#

yeah TDD is what I use

#

but sometimes doing long gens of SDXL is also nice

desert dagger
desert dagger
# fervent thunder is just scientific method

maybe - i tend to look at software benchmarks about the same way that I look at movie reviewers. especially if those benchmarks are based on random users on the internet voting which image they like best

quartz siren
fervent thunder
#

human preference benchmarks are the ideal test though

desert dagger
desert dagger
#

human benchmarks for pizza is that pineapple should never be used

#

but that only matches up with people that don't like pineapple on pizza

fervent thunder
#

ok there is an alternative
can simply use inverse problems to judge the strength of a diffusion model
e.g. upscaling, inpainting, deblurring and colourisation

#

in fact I am leaning towards this being the best way overall
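
The inverse-problem idea above can be sketched in a few lines: degrade a known image, let each model restore it, and score the restoration against the original. PSNR is just one stand-in metric chosen for this sketch, and the pixel values below are made-up toy data, not results from any model.

```python
import math


def psnr(reference: list[float], restored: list[float], max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(reference) != len(restored) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((r - x) ** 2 for r, x in zip(reference, restored)) / len(reference)
    return math.inf if mse == 0 else 10 * math.log10(max_value ** 2 / mse)


reference = [10, 200, 30, 120]  # toy "ground truth" pixels
model_a = [12, 198, 31, 119]    # hypothetical restoration from model A
model_b = [40, 160, 60, 90]     # hypothetical restoration from model B

# Higher PSNR means the restoration is closer to the original.
print(psnr(reference, model_a) > psnr(reference, model_b))  # True
```

Unlike preference voting, this kind of score has a ground truth to compare against, which is the appeal of the approach.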

desert dagger
finite cloak
# fervent thunder human preference benchmarks are the ideal test though

Human preference is a great test within the confines of understanding what the results mean. All tests are biased. The best test is a series of different tests and noting the biases of each as you go.

For human preference benchmark testing, it's a frequent flaw that they will favor "pretty" generations over "good" generations. So if the prompt is a cat, and the AI gave you a sexy woman with big booba... preference voting says sexy lady model is best!

It's also a frequent issue that noise looks better in photos, and noise that comes from genuine model or sampling errors still "looks pretty" if it's in small doses, and gets preference votes, while other tests would highlight it as a failure.

If you were doing even a narrow test, say for example have models generate a bunch of pretty ladies and see what human votes think are the prettiest... you'll get a ton of votes for Flux Dev. So flux dev is the best at generating pretty ladies? No, it's the best at generating one specific pretty lady over and over. "Fluxgirl" is quite pretty, but it's a major problem/limitation of the model, that would cause it to fail instantly in loss metrics or other tests that compare prompt following on details or source matching.

finite cloak
# desert dagger dev is a distilled pro. schnell is a lobotomized dev

iirc schnell is a second distillation of pro, rather than from dev - the key practical difference here is being separated makes it difficult to share loras and all, which was the magic that made Turbo so useful for SDXL - you can turbofy anything, the turbo could be applied with a lora itself even.

desert dagger
finite cloak
#

ftr my word on that is not absolute, that is secondhand memory (I recall it being said but not sure where) mixed with analytic consideration (if it was a direct descendant of dev, loras would cross between models fine)

desert dagger
ocean mist
#

hey guys!!

#

how can this tool help me with my game?

solar trout
ocean mist
willow bear
fervent thunder
#

e.g. whether a distillation method had an adversarial component seems to make a big difference

finite cloak
# fervent thunder is it known which distillation method was used because the different methods see...

dev is "guidance distilled", schnell is "timestep distilled"
dev had a """guidance""" value baked in (akin to CFG, but not CFG, and its value included in an embedder)
schnell just did essentially what SDXL Turbo did. I don't think they published the exact details of schnell's methodology but considering BFL is literally Stability's former research team, I'd be shocked if it was any different than what they did on SDXL Turbo and SD3 Turbo (not much reason to change what worked), so it'd be https://arxiv.org/abs/2403.12015 same as the LADD paper published about sd3 turbo by the same team

#

LADD in turn being basically just ADD but more of it running in latent space, as the names imply

fervent thunder
#

ah okay thanks, yeah if its the same team then that is likely
I guess then this lora is LADD turbo since it is literally called turbo https://huggingface.co/alimama-creative/FLUX.1-Turbo-Alpha

abstract quarry
#

Fluxgirl?
A lot of people say that Flux dev has low diversity cause it's distilled and a flow matching model. But honestly, diversity over multiple seeds in flux is much higher for me than, say, SDXL

fervent thunder
#

there are ways you can look at log likelihoods directly, can bias the sampling towards them if needed also
and yeah log likelihood distribution for flux is better

#

regarding prompt adherence, I've seen Flux beat SD3m on GenEval
I think at this point a new generation of models is needed to compete with Flux really

#

I was very surprised that the dev de-distill efforts, which were trying to train the ability to use CFG back into Flux, worked. They worked really well: it can happily use CFG 6 or so 🤔

atomic mortar
finite cloak
#

it's likely a result of the quality-tune data and/or the distillation process

#

there's plenty of "unwanted baked in features" of the model, strong bokeh on photoreal images is another common one

fervent thunder
#

the model has high diversity in terms of the log likelihood distribution though, which is what is ultimately going to cap your ability to get diverse samples out of it
the things that you are referring to can be changed by even a small lora

#

I think that sort of criticism was a lot more valid on release day

#

when it was uncertain whether the model would be possible to train

atomic mortar
#

80% of my gens don't use a lora🗣️ :tbh i love sdxl still

#

Maybe once i can snipe the 5090 ill move on

finite cloak
#

To be clear, my comment above is an explanation of what "fluxgirl" is, after it was mentioned previously in the context of model benchmarking methods
I use fluxdev for 99% of my generations at this point i don't have any problems using it lol

fervent thunder
#

ah okay I see

#

I also dislike the default day 1 flux look that it launched with

#

but I use the checkpoints based on dev de-distill with CFG and 2-3 strong photography loras

atomic mortar
#

Day 1 launches are mostly terrible due to a lack of proper workflows lol

#

I couldn't run sdxl on release (3min+ gens). Now i can do it easily in 13s

fervent thunder
#

you can see my comments on this discord I actually rly disliked flux at first, and didn't use it for like 2 months

#

yeah launches are always rough

#

RTX 5090 doesn't have sage attention apparently

#

not sure if I recommend it but its SDXL at 0.7B or so

abstract quarry
#

I agree that there are certain faces that appear over and over again - but all other models have the same issue. I don't know if that is a problem of the training data or a problem of dpo. It can be improved with loras or fine-tuning though

fervent thunder
#

more recent dpo papers have said that older dpo methods can overfit hard
or have an issue that is slightly different
where it is a bit like mode collapse in GANs

#

I'm not that interested in photo portraits, but I kinda think that maybe photo portraits need their own dedicated foundation model

#

or just a really massive flux lokr

#

where they try to make the dataset have really broad coverage of photo portrait types and subjects

atomic mortar
#

What UI did you use?

broken bronze
#

Help!
I used to have ComfyUI on Windows 11.
I switched to Pop Linux (on the same machine with same hardware) and now my workflow does not work.
I installed the latest ComfyUI.
However it is the same workflow.
Now, I get OOM errors on every video generation on the workflow that worked on Windows.
I was running portable ComfyUI on windows, and now regular ComfyUI for Linux.
Has anyone had same issues when switching?

fervent thunder
#

are there custom nodes

atomic mortar
#

Or what gpu do you have? Just mention anything relevant in techsupport

atomic mortar
#

Almost All of them are web ui based.

Hmm chances are no but i recommend posting in #🤝|tech-support if someone else also uses the 1111 version

#

Personally i dont because its rather outdated

broken bronze
atomic mortar
#

did you follow a Linux install guide?

broken bronze
#

.

broken bronze
atomic mortar
#

okay i got to say that's the worst possible guide you could've followed. are you experienced in linux at all?

broken bronze
#

No

atomic mortar
#

mind me asking why you went for linux?

wet grotto
#

what is the "anime denoise" preprocessor from ControlNet Lineart? I think it is for generating images in anime style, but I don't know if that's correct

split flame
#

Hello everyone sunsmile joining under our new business discord. we make images, videos and games etc

hushed quarry
sleek blade
#

hello
does anyone know how many decimal places are supported for LORA weighting in prompts?

copper tree
#

Hello

frigid wing
#

Hello

ocean mist
#

hey there!

finite cloak
final tusk
#

OH MY GOODNESS! FLUX IS 22GB?! IS IT WORTH IT?!

sour falcon
#

Hello. I have a few questions, can anyone help me with the audio?

abstract quarry
final tusk
abstract quarry
#

you cannot run 22gb on a consumer gpu anyways

#

search for a quantized model that fits your gpu

#

eg. 11gb or 8gb, depending on your vram
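
Back-of-the-envelope arithmetic for picking a quantization level. This is only a sketch under stated assumptions: Flux dev is taken as roughly 12 billion parameters, only the transformer weights are counted (text encoders, VAE and activations need room too), and the 2 GiB `headroom_gib` default is an arbitrary illustrative margin, not a measured figure.

```python
def weight_size_gib(num_params, bits_per_weight):
    """Approximate size of the raw weights in GiB at a given precision."""
    return num_params * bits_per_weight / 8 / (1024 ** 3)


def fits(num_params, bits_per_weight, vram_gib, headroom_gib=2.0):
    """True if the weights plus an assumed safety margin fit in the given VRAM."""
    return weight_size_gib(num_params, bits_per_weight) + headroom_gib <= vram_gib


FLUX_PARAMS = 12e9  # assumption: roughly 12 billion parameters

for label, bits in [("bf16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: {weight_size_gib(FLUX_PARAMS, bits):.1f} GiB")
```

Real quantized files run a little larger than this because they also store per-block scales, so treat the numbers as a lower bound when comparing against your card.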

fervent thunder
#

depends what you are doing cos Flux wins overall for standard text-to-image or img-to-img
but it will lose to smaller models that were designed from scratch for specific tasks

#

for various forms of image manipulation or unusual conditioning input methods for example

fervent thunder
sour falcon
#

can someone who uses stable diffusion get in touch with me?

fervent thunder
sour falcon
sour falcon
surreal sandal
#

hi

fervent thunder
# sour falcon I want to make custom portrait but I can't do it, can you help me?

I am not very good at teaching the first steps
my advice would be to try to download SwarmUI and run their basic workflow
you could also try easydiffusion for an easier option although I have not tested it
if that is too tricky (which is understandable) then I would maybe suggest using a paid service online, I think the easiest out of anything is probably to ask chatgpt to make an image
these are the links to the things I mentioned
https://github.com/mcmonkeyprojects/SwarmUI
https://github.com/easydiffusion/easydiffusion

surreal sandal
#

can u tell me how to use stable diffusion ?

sour falcon
fervent thunder
#

sorry I can't help more

sour falcon
#

Thank you, I'll try what you said.

surreal sandal
#

hi

#

any help

atomic mortar
#

But before you do, what gpu do you have?

#

So i can say if you can run it or you're wasting your time before you invest time into it

surreal sandal
#

i have 3080 RTX

#

@atomic mortar

atomic mortar
#

Should be possible, since you're a beginner i recommend following the tutorial from cs1o. Either web ui reforge OR SwarmUI

#

SwarmUI lets you get used to advanced features as you learn. Forge just lets you use basics mostly

#

And on CivitAI you can get nice "stable diffusion models" either anime or realistic finetunes/merges

#

Personally for anime i recommend illustrious wainsfw and use any of their example images as a way to start making good looking images

tired tusk
#

Hey everyone! 👋

I’ve been exploring AI-generated art for a few weeks now, mainly for concept designs in a sci-fi interactive game I’m working on.

I’m trying out multiple models, and stumbled on this server!
I’ve been struggling to get consistent character designs and keep the art style uniform with all the models I used.

how do you all handle this challenge? Any tricks for keeping AI-generated characters looking the same across multiple images?

atomic mortar
#

Loras mostly but if its just face, facetools/reactor

#

Some realistic models it helps to use a name in the prompt but its hella inconsistent

fossil hedge
#

#ask Is Hunyuan the best video generator for ComfyUI?

My computer uses Rtx 3060 12GB, can it run text to video?

tired tusk
#

I'd love to see what you've done with it

atomic mortar
tired tusk
#

oh okay

#

well, thanks either way. I'll def check it out 😄

gritty lava
#

u plug a jack into the front I/o of ur puter

#

and windows be like

#

hmmm, what device did u plug in, front audio?

#

im like, what do u think?? ;d

atomic mortar
#

well yeah headsets and speakers dont carry a load of extra chips

#

if its unsure its better to ask than to assume its a speaker while its a headset

gritty lava
#

xd

atomic mortar
#

i use computers all day, every day. they suck. they are terrible and i wont stop using them

gritty lava
#

i think itd be cool if hardware could exist for PCs, that can measure the ohms of an audio device, to in a sense, autonomously decide how to drive the headset or device u plug into the audio jack, and if u wanted to choose manually, u can jus go to settings

#

instead of like, throwing like 3 different windows at u, to choose urself, before anything is outputted

desert dagger
fervent thunder
#

not sure computer addiction is real

#

it didn't get recognition in DSM-5

desert dagger
fervent thunder
#

there is a lot of debate about it yeah

#

even in DSM-5 they did include it, just not as a recognised condition, they put it in the section for more research

hushed quarry
versed yoke
#

did stability ever release the creative upscaler

fervent thunder
#

don't think so

desert dagger
#

how do you 'borrow' an upwork account?

fast sage
#

For comfyui if I wanted to get the prompt from an exported png (with embedded prompts) would I just grab the image header in python?
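
Pretty much, yes. A minimal sketch, assuming the export stores its graph as PNG `tEXt` chunks under the keys `prompt` and `workflow` (that is how ComfyUI's default image saving is understood to behave, but check your own files). The chunk layout itself comes from the PNG format; the helper names `read_png_text_chunks` and `load_comfy_prompt` are made up for this example, and the CRC is skipped rather than verified.

```python
import json
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data: bytes) -> dict:
    """Return {keyword: text} for every tEXt chunk in a PNG byte string."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    texts = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            # tEXt data is "keyword", a NUL byte, then Latin-1 text.
            keyword, _, text = body.partition(b"\x00")
            texts[keyword.decode("latin-1")] = text.decode("latin-1")
        if chunk_type == b"IEND":
            break
        pos += 12 + length  # skip length + type + data + CRC
    return texts


def load_comfy_prompt(path):
    """Parse the assumed 'prompt' chunk as JSON, or return None if it is absent."""
    with open(path, "rb") as handle:
        chunks = read_png_text_chunks(handle.read())
    raw = chunks.get("prompt")
    return json.loads(raw) if raw is not None else None
```

If Pillow is already installed, `Image.open(path).info` exposes the same text chunks as a dict, which is shorter than parsing the file by hand.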

vital prism
#

Hello

granite thorn
#

hello, i have a laptop rtx 3050 with 4gb vram, is it fine to use stable diffusion on it? i already tried, i can't render anything, it's just like a green screen

desert dagger
fervent thunder
#

https://huggingface.co/hum-ma/SDXL-models-GGUF/tree/main can fit in 2GB okay

fervent thunder
#

there is a trick where you run it headless so the screen goes black while it generates image

#

but then you can use full VRAM

#

but it should be fine without that anyway

#

Hi there

#

hello

#

So what kinda stuff do you guys generate

#

I’ve made a Lora of myself once that was okay but that was like a year ago and now I’m lost with automatic1111

#

I forgot how to use a1111 I did do it a bit

#

I just tried to jump back where I left off and so much is different now

#

I feel like a1111 is basically out of commission

fervent thunder
#

So what kinda music do you guys like

#

Anyone like Radiohead? Strokes?

ripe rain
#

gregorian chants

sinful shore
#

does anyone know where I can download a cheatsheet of prompts for stable?

fervent thunder
#

Personally, I like to occasionally listen to a group of old guys with grey beards gather round and yell dragon words at me

low moon
#

ABBA

warm junco
honest solar
#

hiya

low moon
#

Peak AI atm.

severe ruin
#

What's currently the best/easiest way to do good faceswaps? seems like every time i google it a new extension or library is suggested heh

#

and half seem to be not updated for like 10+ months

severe ruin
#

will have a look

quick linden
#

How can I generate QR codes?

ruby parcel
#

Hiiii

#

Just joined 😗😗😗

#

I’ve a question.. I’ve tried my first ever Lora training, using my selfies as dataset, around 7 pictures 😌😌

#

The lora “works” as in it can kinda generate my likeness but only from front view. Hmmm are there any tips or reads I can consume to learn how dataset curating works n best practices?

fervent thunder
#

smaller models will "generalise" less so they are more likely to need to be explicitly shown everything during training

#

i.e. to get back angles they would be more likely to be shown back angles

#

the larger models like flux, SD 3.5L etc have better generalising ability so if you show them only a front angle they can "infer" what the back angle would look like

ruby parcel
#

Ah I’m using SDXL juggernaut xl

#

The face gets distorted sometimes from front view but its surely kinda working lol

#

It would fail if I prompt for look left look right side view lol

fair crystal
#

Hey Stable Diffusion community!

I've seen those funny Photoshop requests floating around where people ask artists to add them into a vacation photo they missed. It got me thinking... is there a way to automate this kind of request, similar to how Pika's new "addition" feature works for video? I'm specifically looking for a solution for photos, not video.

My current understanding is that one approach would be to inpaint a "stand-in" figure into the photo, then use a face-swapping technique to replace that figure's face with the person's photo. However, this seems like a somewhat manual process.
I'm wondering if anyone has developed or knows of any ComfyUI workflows (or any other automated methods) that could streamline this? Ideally, something that could potentially be batch-processed.

Any pointers, ideas, or even just brainstorming would be greatly appreciated! Thanks in advance!

fervent thunder
#

I would recommend using a large lycoris lokr in simple tuner

#

for SDXL

#

but that is not simple

ruby parcel
#

oh im using onetrainer

#

was using kohya but i ran into an OOM issue on my 8gb 3070

#

LOL

#

lycoris.. hmm

#

time to google

fervent thunder
#

ah I don't know much about the low vram stuff cos I use cloud

ruby parcel
#

do you have like base tips on dataset itself?

#

example, i provide like 10 pictures of myself in selfie

#

is that bad?

fervent thunder
#

yeah my tip for dataset is
get a bunch of image quality assessment models, and an image embedding model
and download a big dataset from somewhere like huggingface
take one image that you like, and use embeddings to search for the most similar images in the dataset
then use image quality assessment models to filter out low quality
10 pictures is fine for flux, for SDXL it is on the low side but it might be ok

#

if the method I gave was too complex then just try to manually get like 30 high quality images if you can
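
The embedding-search-then-quality-filter recipe above could be sketched like this. All of it is illustrative: the embeddings and quality scores are assumed to be precomputed by whatever embedding and image-quality models you choose, the `candidates` record layout is made up for the example, and the `top_k` and `min_quality` defaults are arbitrary.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def curate(query_embedding, candidates, top_k=30, min_quality=0.5):
    """Pick the images most similar to the query, then drop the low-quality ones.

    candidates: list of dicts with 'path', 'embedding' and 'quality' keys
    (hypothetical layout; 'quality' is a score from an assessment model).
    """
    ranked = sorted(
        candidates,
        key=lambda c: cosine_similarity(query_embedding, c["embedding"]),
        reverse=True,
    )
    nearest = ranked[:top_k]
    return [c["path"] for c in nearest if c["quality"] >= min_quality]
```

Because the quality filter runs after the similarity cut, you may end up with fewer than `top_k` images; raising `top_k` is the simple way to compensate.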

ruby parcel
#

do i have to be in different expressions? close up and full body?

#

if my goal is to learn my face.

fervent thunder
#

for lora what you need to do is use images that are close to the style and structure that you want your output images to be

#

so if you want images that are full body it is not good to train a lora only on close up face portrait

#

it could still work but its better to have a closer match between training image and desired output image

#

so like if your goal is to make full body images then train on full body images

ruby parcel
#

i see

#

ah i was thinking in the way that once the lora learns my 'face'

#

it will know to transplant it to any body and pose

#

LMAO

fervent thunder
#

yeah its a really common mistake with loras

#

it can still work sometimes, especially with strong models

#

but its just a lot better to train it close to the image you want

bronze quiver
#

Hello

wet tinsel
#

hi all, wanted to ask if anyone knows if the stabletuner discord is still around and kicking or not!

mint lava
#

hello fellas!

ruby parcel
#

o7

atomic mortar
#

No?

desert dagger
wet tinsel
desert dagger
wet tinsel
#

Fine tune SD

desert dagger
# wet tinsel Fine tune SD

a bit more detail on what you're trying to accomplish, please. there are hundreds of things that might work but are probably not quite what you want. i need more information before i can recommend anything

wet tinsel
#

I appreciate any guidance you can give on the matter; it seems as though most tools in this area are fairly old and somewhat deprecated

desert dagger
#

i can dm you an invite if you like

wet tinsel
fleet bough
#

Hello

desert dagger
wet tinsel
gritty lava
#

happy valentines day everyone :3 💜

untold pewter
#

happy valentines

finite cloak
wet tinsel
hushed quarry
warm hull
#

Hey guys

#

How are people able to generate consistent characters?

#

Do they train a lora for the characters or how does it usually work?

karmic brook
hushed quarry
ruby parcel
#

hmm

#

are there like "light weight" versions of sdxl?

#

i love SD1.5 run speed on my 8gb vram

#

but SDXL images looks nice out of the box.

desert dagger
#

the only channels you can generate in are the artisan channels and there are only 4 of them. read the information in #artisan-faq first

fervent thunder
#

its my main model

ruby parcel
#

oh

#

it's not SD?

fervent thunder
#

its sdxl based

#

it looks similar to SDXL in side by side comparisons

ruby parcel
#

i see

#

gonna give it a try

#

hahaha

#

i'm mostly ok with sdxl , juggernaut

#

and it fits within my 8gb vram

#

but

#

if im trying to train lora...

#

Do you train your own LoRAs, neon?

royal yoke
#

yo

untold pewter
stuck gulch
#

hey guys

#

im having some trouble with connecting AUTOMATIC1111 stable diffusion running on docker to my owui also running on docker

#

kinda blocked at this part

#

well the bit where you enter the API
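
One common cause, offered as a guess since the exact setup isn't shown: inside a container, `localhost` points at that container, not at the one running AUTOMATIC1111. A small sketch for checking a candidate base URL before pasting it into the other app. The assumptions are that the web UI was started with `--api` (and `--listen` so it accepts connections from outside its own container), that it is on the default port 7860, and that `host.docker.internal` resolves in your Docker setup; on a shared Docker network the service name would be used as the host instead.

```python
import json
import urllib.request


def a1111_base_url(host="host.docker.internal", port=7860):
    """Build the base URL another container would use to reach the web UI."""
    return f"http://{host}:{port}"


def list_models(base_url, timeout=5.0):
    """Ask the web UI API for its checkpoints; raises if the API is unreachable."""
    with urllib.request.urlopen(f"{base_url}/sdapi/v1/sd-models", timeout=timeout) as resp:
        return [model["title"] for model in json.load(resp)]
```

If `list_models(a1111_base_url())` returns checkpoint names when run from inside the other container, that same base URL is the value to enter in the API field.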

fervent thunder