#🆕｜sd3

1 messages · Page 67 of 1

sterile pendant
#

and it then becomes statistically significant

scarlet wharf
#

left is AYS kolors workflow, right is the original workflow by kijai. adding AYS seems to give more adherence to the prompt

craggy crest
#

sure. but not very useful for the individual person. the only thing that really is useful to them is whether the results they personally get are what they want or not

edgy kelp
#

Reading this link with my tiny eyes it looked like "Im gay dot org"

alpine summit
hallow lion
#

Manual man, we need a manual. XD

sterile pendant
# craggy crest sure. but not very useful for the individual person. the only thing that really ...

play around on the site, you'll quickly see why it's such a powerful tool. it pits two random models against each other while only showing you the prompt. you vote L/R or tie, then it shows you afterwards which was which. the whole point is to test how flexible a model is. on the rankings page, you can click the stats button to see how a model fares against other specific models. Oh and the prompts are random stuff people ask for (you can prompt for something as well i think, but ive never tried)

so basically, when you throw a bunch of random prompts at a model, you get a much more realistic rating of how well the model handles various types of concepts vs the cherrypicked BS that most papers show where they have some ultra fine crafted prompt that happens to work really well with a model, since they know the data/captions that were used to train it.
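
(A minimal sketch of the kind of blind L/R/tie tally described above, not taken from the site itself. It assumes votes are stored as `(left model, right model, outcome)` tuples and counts a tie as half a win for each side; the `tally` helper and the `model_a`/`model_b`/`model_c` names are hypothetical.)

```python
from collections import defaultdict


def tally(votes):
    """Return {model: win_rate}; a tie counts as half a win for each side."""
    score = defaultdict(float)
    games = defaultdict(int)
    for left, right, outcome in votes:
        games[left] += 1
        games[right] += 1
        if outcome == "L":
            score[left] += 1.0
        elif outcome == "R":
            score[right] += 1.0
        elif outcome == "tie":
            score[left] += 0.5
            score[right] += 0.5
        else:
            raise ValueError(f"unknown outcome: {outcome!r}")
    return {model: score[model] / games[model] for model in games}


# Hypothetical votes: each entry is one blind comparison on a random prompt.
votes = [
    ("model_a", "model_b", "L"),
    ("model_b", "model_a", "tie"),
    ("model_a", "model_c", "R"),
    ("model_c", "model_b", "L"),
]
rates = tally(votes)
for model, rate in sorted(rates.items()):
    print(f"{model}: {rate:.2f}")
```

With only a handful of votes the rates swing a lot; the point made in the chat is that many random prompts are needed before the ranking means anything.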

alpine summit
craggy crest
sterile pendant
#

you should see some of the random ass prompts people throw at it, it's actually kind of rare to see cliche waifus and stuff.

#

people are undereducated and the average reading/writing level is that of an eleven year old

low stone
#

All of this is until 8b is out, so if you could get on that. 🙂

craggy crest
craggy crest
low stone
#

I'll also drop more sugar on you, I've been using your aam xl model a lot. It's really great as a general model as well, able to handle an impressive amount of concept beyond just "anime".

foggy cloak
#

First impressions matter a lot, SD3 got off on the wrong foot

edgy kelp
#

Imagine if they keep hyping it like the 2B and in the end it's not the same as they are using in the API but it's a neutered one

low stone
craggy crest
lavish osprey
#

I like Neta Art

edgy kelp
lavish osprey
#

in general most models built on top of Animagine are pretty great

sterile pendant
lavish osprey
#

AAM XL in particular is built on top of both Animagine (indirectly) and Dreamshaper. Another reason why it's able to go Turbo without too much quality loss.

low stone
sage burrow
#

@lavish osprey I'm wondering if my glif (that I remixed from another) has anything built in somehow that isn't mentioned? I looked through it all, and it's only SD3 and Claude helping with prompting. But somehow the images come out better than just straight up SD3 on my own computer.
To see any settings, you can just hit remix. But more likely it's something glif has built in to just make everything better, OR, is claude the awesome and it's all about prompting?

craggy crest
lavish osprey
#

on your computer it's 100% sd3 medium, since we didn't release large yet.

bitter hearth
#

Hello waow

craggy crest
sterile pendant
edgy kelp
craggy crest
craggy crest
bitter hearth
uncut river
#

promise promise
dogs that makes lots of sound
typically dont bite

edgy kelp
craggy crest
bitter hearth
bitter hearth
craggy crest
edgy kelp
bitter hearth
edgy kelp
sterile pendant
sterile pendant
#

i will

craggy crest
sterile pendant
#

i wont

finite osprey
bitter hearth
edgy kelp
#

Dunning Freddie Kruger effect

craggy crest
bitter hearth
#

I have no clue what you guys are on about, seems very dumb

edgy kelp
bitter hearth
#

Like the people I see in games arguing who is right thomas

bitter hearth
edgy kelp
finite osprey
#

Sharing is caring

bitter hearth
craggy crest
edgy kelp
#

Ball rate 10/10

bitter hearth
edgy kelp
#

Imagine doing this in the house of a vegan, they won't be able to eat

sterile pendant
# bitter hearth You will

nah, bros acting like he's dr. jenkins from starship troopers saying "its afraid..." like he can magically read an ai's mind or something and trying to hit people with the git gud spiel lol...

craggy crest
finite osprey
sterile pendant
craggy crest
craggy crest
bitter hearth
craggy crest
edgy kelp
#

Fishy situation I reckon

craggy crest
#

@bitter hearth

sterile pendant
bitter hearth
foggy cloak
craggy crest
edgy kelp
low stone
foggy cloak
finite osprey
#

I just realized you can go negative in the positive prompt. Like I can write "None of them are outside." and it worked.
You all probably already knew that, but anyway

bitter hearth
sage burrow
reef urchin
#

Only their cat has 4gb of vram. They themselves have more than enough.

low stone
lavish osprey
craggy crest
bitter hearth
low stone
edgy kelp
low stone
lavish osprey
edgy kelp
bitter hearth
low stone
lavish osprey
bitter hearth
#

Booba

edgy kelp
lavish osprey
bitter hearth
lavish osprey
lavish osprey
bitter hearth
#

thomas I dunno

craggy crest
lavish osprey
bitter hearth
#

Try it on 2b
My prompt is "A shining metallic ball with 2 arms open lying on grass"

lavish osprey
low stone
lavish osprey
craggy crest
# low stone

when you've spent too much time laying in the grass

edgy kelp
bitter hearth
edgy kelp
#

Damn

bitter hearth
#

runs away

edgy kelp
#

New York Times Opinion: Is This Too Suggestive For SAI?

low stone
uncut river
#

lol! does this count ... ?

bitter hearth
edgy kelp
#

Because hands are under

craggy crest
edgy kelp
#

That grass looks like a... yeah lol I was beat to that

low stone
craggy crest
bitter hearth
#

How sd learned hands: it grows them

uncut river
#

the side view made me rotate the image, was more hoping for a front view with flat solid green wall of grass

edgy kelp
craggy crest
low stone
edgy kelp
#

Imagine a mummy during that covid toilet rolls shortage

bitter hearth
low stone
edgy kelp
bitter hearth
#

Mummy ball?

low stone
#

Neither sd3 8b nor kolors is understanding "barren shelves". It's giving me the opposite. Maybe he's angry about too many choices.

edgy kelp
#

100% sure OpenAI will make a special denoiser (or call it what you like) for AI generated images and train the next Dall-E on insane AI generated images but will be sure to edit every one with some automation in order to avoid artifacts

sage burrow
bitter hearth
low stone
edgy kelp
# bitter hearth

That's my local supermarket after I bought all the toilet paper (I have a hyperactive gut)

low stone
edgy kelp
#

BRB gotta photograph myself in mummy bandages while I show despair in an empty supermarket (I have the budget)

bitter hearth
edgy kelp
#

Empty shelves is a banned phrase from the SAI datasets because it's too naughty

low stone
edgy kelp
mortal mesa
#

govt intervention, cant show empty shelves

low stone
# edgy kelp No I was joking

I know, but a dedicated Lora of people shouting why at the skies in grocery stores would probably find a niche market on civit

bitter hearth
edgy kelp
mortal mesa
#

go to walmart and take pictures for your dataset

low stone
edgy kelp
mortal mesa
bitter hearth
uncut river
#

so ... real life photo model for clothing branding, seems an easy task for ai.

custom design NOEDEL brand outfits. not for sale yet!

alpine summit
uncut river
#

Off-grey stretchy combi outfit, top with half-long stretch skirt. Good for both parties or casual activities. Not for sale yet. Noedel brand.

#

sleeves not included

alpine summit
uncut river
#

Hey Goo Goo Gage, I hope you get banned for life

craggy crest
#

y'all are such good friends

uncut river
#

well crystalwizard, I certainly hope you missed that video which has been removed. was there for too long.

sage burrow
#

So how come Ella over the various Claude or Ollama ones?

uncut river
#

I don't care about the hands, but how can I make SD3 stop using double ll ? It's Noedel, not Noedell

uncut river
#

no, sd3 should do it. back in the days using sd15 i first generated an image, then photoshopped the text over it, then did a refinement with img2img

#

SD3 should do text better, it's not like I'm asking for a full poem in correct layout

#

maybe I should ...

#

btw, this is my character Caitlin. Always wearing custom designed glasses, cuz she wants and can afford it.

craggy crest
uncut river
#

I just want a working imitation of M$ Word Art (tm) inside the SD3 medium model. perhaps too much to ask for a 2003 technology?

#

😄

#

sorry, that was bordering trolling. Nevermind, I switched to generating realistic images of Caitlin anyway!

uncut river
#

lol, no. ascii is 1963 tech

bitter hearth
bitter hearth
mortal mesa
uncut river
#

no, what is that?

mortal mesa
#

lots of graphics and text nodes, its quite good

uncut river
#

Caitlin, expensive glasses for each day or outfit. Cute looks but a calculated cold character, who loves her work and does not like standing still. Her study in behavior management really helps her at the facility.

devout schooner
#

For all the talk of censorship in SD3, I find it draws women who are randomly straight up naked (but disturbingly without nipples or like anything other than continuous skin downstairs) quite often, even when I didn't prompt for it at all. Not gonna post any examples cause they're all creepy, looks like burn victims or something lol

mortal mesa
#

yes

uncut river
#

damn, it seems to be hard to get a real photo style with purple eyes from sd3

uncut river
#

it seems they have not removed all nsfw from the dataset and trained from scratch, but instead evolved existing base models which do include nsfw images in the dataset. I think sd3 is just trained not to show the more explicit stuff that is hidden within it.

mortal mesa
#

if nipple reroute to smooth plastic

bitter hearth
#

does anyone have a guide to CFG and Steps numbers for SD3

uncut river
#

maybe it's just me, but I think sd3 makes better sexy images when putting stuff like (naked, nude, explicit:1.1) in the NEG prompt

#

yes, go for low CFG

bitter hearth
#

there was a complex discussion on here a few days ago about negative prompts
they might not do much

uncut river
#

like between 3.8 and maybe up to 6 typically around 4.0 to 4.4 or maybe just keep it on 4.375

bitter hearth
#

I am not sure as I am just now barely starting to test SD3

#

ah okay thanks

#

yeah 2 was too low and 7 too high
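
(A minimal sketch of why the CFG number matters, using the standard classifier-free guidance formula. The two short lists are toy stand-ins for the model's unconditional and prompt-conditioned predictions, and `apply_cfg` is a hypothetical helper, not a ComfyUI or SD3 function.)

```python
def apply_cfg(uncond, cond, cfg_scale):
    """Classifier-free guidance: push the prediction away from the
    unconditional output, toward the prompt-conditioned one."""
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]


# Toy stand-ins for the two model predictions at one denoising step.
uncond = [0.0, 1.0]
cond = [1.0, 3.0]

# Values mentioned in the chat: 2 (too low), around 4 (typical), 7 (too high).
for scale in (2.0, 4.0, 7.0):
    print(scale, apply_cfg(uncond, cond, scale))
```

At a scale of 1 the result is just the conditioned prediction; larger values exaggerate the difference between the two, which is one plausible reason high CFG looks harsh or overcooked.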

uncut river
#

and if you want to go crazy, don't let the usual limits stop you

#

btw, I only know sd3 medium running locally

#

im not sure, but I think the architecture and prompting for the larger models differ

bitter hearth
#

I am using a huggingface space

#

but the only problem is it doesn't say the sampler

#

I might have to do it properly in comfy to know what the actual full settings are

uncut river
#

hm, i think samplers are overrated

#

I just use Euler for about everything

bitter hearth
#

its mostly that they either converge or not

#

and need a different amount of steps

mortal mesa
#

many samplers dont ever converge by design, kinda why i like DPM_Adaptive it picks out how many steps it needs by itself to converge

uncut river
#

Caitlin back at work, as happy as she can get! (she loves serious work...). though maybe sd3 turned her a bit asian. Oh well, it's all about behavior for Caitlin, the good looks are just her trademark.

bitter hearth
#

I've always used DPM++ 2M Karras, DPM++2S a Karras or DPM++ SDE 2M Karras

#

cos that gets you an ancestral one and an SDE one

#

as well as just DPM++ 2M Karras

#

Interesting

uncut river
#

very similar prompt in SD15 - RealisticVisionV60B1 - for reference

#

you don't see it, but there was color bleeding all over, so had to cherry pick this one. SD15 might be able to give nice(r) results, but at the cost of some rejects

craggy crest
#

prompt: hdr photograph, head and shoulder shot, a man,1960s hippy

bitter hearth
bitter hearth
craggy crest
uncut river
craggy crest
bitter hearth
#

sadly the bartender triggered the anatomy problem

#

ok so putting man in the negative is the way to go

torn wharf
#

my friend challenged me to train this barbie bimbo instagram model on sd3 and said it was impossible and couldn't be done. i'd had difficulty getting people's likenesses but i thought i could show him up. so i think i trained sd3 to do this megan millions girl using pics scraped off her instagram. it does a lot of selfies mostly but it works.

#

||https://ibb.co/album/F5cDXw|| may be slightly pg13. these are all from sd3 with my trained lora. she's not my type but i love giving a good ol "in yo face" to haters.

#

if i could do this with 100 images over 100 epochs. minute per epoch. i believe in fine tuners

edgy kelp
bitter hearth
low stone
bitter hearth
uncut river
torn wharf
edgy kelp
torn wharf
#

i was joking a bit lol. i have no idea what i'm doing. i'll try with less images.

Those results are VERY cherry picked i should say too. All the underlying model problems are still there

bitter hearth
#

depends on the character or whatever

#

if you need more or less images thomas

edgy kelp
#

I think the average instagram model should be easy enough to train on 10 images, you'd have more difficult time with videogame aliens characters

craggy crest
desert garnet
edgy kelp
#

Well... if you have to train nudity on SD3 maybe you might need some thousands of images for a lora, as Cat with 99999 gb vram implied

bitter hearth
#

DO NOT make us use our vram

#

thats just asking for it

torn wharf
#

sd3 training is more efficient or maybe i'm just stupid. I can barely do batches of 2 on sdxl but on sd3 i can do batches of 10

#

with room to spare
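
(A rough sketch of the arithmetic behind these numbers. It combines the "100 images over 100 epochs, minute per epoch" figures with the batch sizes quoted here; it is an assumption that the batch size of 10 was the one used for that run, and `training_plan` is a hypothetical helper, not part of any trainer.)

```python
import math


def training_plan(num_images, epochs, batch_size, minutes_per_epoch=1.0):
    """Estimate optimizer steps and wall-clock time for a LoRA run."""
    # The last batch of an epoch may be partial, hence the ceiling.
    steps_per_epoch = math.ceil(num_images / batch_size)
    return {
        "steps_per_epoch": steps_per_epoch,
        "total_steps": steps_per_epoch * epochs,
        "total_minutes": epochs * minutes_per_epoch,
    }


# Batch of 10 (the SD3 figure above) vs batch of 2 (the SDXL figure above).
print(training_plan(100, 100, batch_size=10))
print(training_plan(100, 100, batch_size=2))
```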

edgy kelp
#

Do not make me come here and use my VRAM, yeah?!

bitter hearth
#

🫣

torn wharf
#

like, totally

edgy kelp
#

I have no idea either

edgy kelp
#

BALLS

bitter hearth
torn wharf
bitter hearth
#

oh goodness!

#

ok so it can make a bartender correctly
but only if the bartender is batman saying hello

craggy crest
bitter hearth
#

lol yeah

#

art deco in the prompt does that

#

if you cherry pick you get much better results, this one is fantastic

#

its very inconsistent

bitter hearth
#

hello

#

hmm the image quality goes up if you stop prompting for super heroes

#

same prompts but with bartender instead of batman or superman

#

the fashion

craggy crest
odd basalt
hazy kestrel
placid swallow
#

can someone DM me the 3.1 2b plz

bitter hearth
#

if you cant I think its a skill issue

placid swallow
#

no way jose I copied all the skillz i needed from huggingface

craggy crest
sage burrow
#

A narwhal anthro 😉

bitter hearth
hazy kestrel
bitter hearth
#

"inverted colors" in the prompt

alpine summit
hallow lion
#

those are some magnificent cats

hallow lion
# bitter hearth

hey its the sd3 wizard from the promo banner, he's back with a gooder license and the promise of an even gooder model to fix and rectify the universe. SD3.5 will be great just like SD1.5 just like poetry- it will rhyme.

regal hollow
#

how to use?

hallow lion
#

thats a very vague question...

#

if u mean sd3 then u got the API and/or comfyui but youll have to dig a bit for the models since civitai banned them

regal hollow
#

i want to use it like midjourney in discord

hallow lion
#

then go to the midjourney channel

regal hollow
#

i cant?

#

nono i want to use sd, but in a way that's usable from discord

hallow lion
#

well there is Artisan

regal hollow
#

thank u

finite fractal
#

At dusk, a muscular man riding a bicycle at 120 KM/H on the highway, dramatic lighting, intense motion blur, dynamic pose, cinematic atmosphere, high-speed action, detailed muscles, realistic style

sacred jewel
bitter hearth
#

I am bored... so its time for more balls

torn wharf
#

balls!

bitter hearth
#

No one ballingsadcat

hazy kestrel
bitter hearth
placid belfry
#

made with the medium opensource version

bitter hearth
hazy kestrel
bitter hearth
sullen moss
bitter hearth
hazy kestrel
bitter hearth
mortal mesa
mortal mesa
mortal mesa
edgy kelp
bitter hearth
sage burrow
#

There seems to be 3 versions of sd3 for dl. 1 w/o clips, the 10gb one w clips, and the 15gb one w clips. Does the w/o clips version require less vram? Does the 10gb one require less vram than the 15gb one?

muted dove
#

The largest includes clips and the T5. You can load everything individually, or all together. The different sizes just give you flexibility on which parts you want to use/load.

bitter hearth
#

I like SD3 now

#

didn't really at first

#

but making stormtroopers invade 17th Century France has been fun

sage burrow
#

I love it when software "shopping" is free 😄

sterile pendant
#

And it will take up the same amount of storage space as if you downloaded the largest sd3 checkpoint that contains all three encoders with it

static cedar
#

one tree during winter, reflection from lake, all white --v 6.0

sterile pendant
#

oh and dont worry about the fp16 version of the t5, it's mostly pointless. you could run a million A/B blind tests and would likely see them both within margin of error of each other, in terms of voting

#

if you have the ram/vram for it, sure, go for it

bitter hearth
sage burrow
sage burrow
bitter hearth
#

ah okay that's good

#

midjourney is the best at knowing artists I think

edgy kelp
#

Dall-E 3 very likely knows even more, but the prompt expansion might fugg them up... and not to talk about the filters lol

bitter hearth
#

I don't know anything about anime but this is my sci fi anime attempt

#

not even sure which anime that style comes from

limpid thunderBOT
#

Thank you for using comcom analytics.
"comcom analytics" supports all community managers (moderators and server owners) by stats, visualization, and analytics.

If you have any questions, feel free to ask us!

sterile pendant
hallow lion
#

the one thing emad delivered on: it is PC resource friendly.

#

Thanks emad. Say thanks kids.

sterile pendant
#

well i'd also give a HUGE shoutout to comfyanon and all the others that work on comfyui. that's where the real performance comes from. his system for automatically handling model offloading at various steps and stages of the process, is what keeps the vram usage down.

hallow lion
#

yes comfy is amazing, it is as fast as foooocus but without the quality loss

cobalt moon
#

6GB also run just fine

#

although may be a little bit slower by community standard

#

( 40 sec per 28 steps 1024x1024 )

hallow lion
#

i started with 4GB now that was slower than standards

#

took about 2-3 mins for 512

cobalt moon
hallow lion
#

4-5 for sdxl 1024

sterile pendant
cobalt moon
#

lmao

hallow lion
#

🤔

cobalt moon
#

it is a Laptop 4050 so yeah

sterile pendant
#

if you know specific resolutions you like to use, i highly advise going the tensorRT route for experimenting. like if you don't need to use cnets, loras, ipa, etc, you can get like 75% reductions in generation time. it's great for exploring (they might be able to work some of those features? not sure, never tested to actually see)

#

like for me, the only resolutions i use are 1024/1024, 1152/896 and 1344/768, or reversed

bitter hearth
#

I like ultrawide

#

more than I even use square LOL

odd basalt
sterile pendant
# bitter hearth I like ultrawide

yeah, but the problem is that most models are not trained for extreme aspect ratios like that. sure, they'll work sometimes, but think of the dataset used to train the models. there is likely VERY little content involving things like 21:9 aspect ratios. 16:9 is likely the widest they naturally want to go, which aligns pretty well with 1344x768 (still roughly 1 megapixel, so well within a typical model pixel range, and also, both numbers are divisible by 64/32/16/8 evenly). i use 1344/768 a lot because if you do a NN latent upscale by 1.5, it puts it just a hair over 1080p and is super easy to crop/resize to size
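
(A quick sketch that checks the numbers in this message: pixel count, divisibility, and the size after a 1.5x upscale. `describe` is a hypothetical helper; the three resolutions are the ones listed in the chat.)

```python
def describe(width, height, upscale=1.5, multiple=64):
    """Summarise a generation resolution the way the message reasons about it."""
    return {
        "megapixels": width * height / 1_000_000,
        "aspect": width / height,
        "divisible": width % multiple == 0 and height % multiple == 0,
        "upscaled": (int(width * upscale), int(height * upscale)),
    }


for w, h in [(1024, 1024), (1152, 896), (1344, 768)]:
    print((w, h), describe(w, h))
```

For 1344x768 this gives about 1.03 megapixels, an aspect of 1.75 (16:9 is about 1.78), and 2016x1152 after the 1.5x upscale, which is slightly larger than 1920x1080 and so can be cropped or resized down.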

deft briar
#

Create a highly realistic and dynamic image of the Indian cricket team celebrating their victorious moment after winning the Champions Trophy. The scene should capture the exhilaration and joy of the players as they celebrate on the cricket field. Use vivid colors and sharp details to portray the players in their blue uniforms, some holding the trophy high, others embracing, and some jumping in joy. Include elements like confetti raining down, fireworks in the sky, and a jubilant crowd in the background. The expressions on the players' faces should reflect pure happiness, pride, and excitement. Ensure the setting is a well-lit stadium, with bright floodlights, a lush green pitch, and the Champions Trophy prominently displayed. The image should evoke a sense of triumph and national pride, making the viewers feel the energy and emotion of this historic win.

Specific Details:

Players' Emotions: Capture various emotions like shouting with joy, tears of happiness, and players lifting each other in celebration.
Team Unity: Show the players in a close group, arms around each other, symbolizing team spirit and camaraderie.
Trophy Display: Ensure the Champions Trophy is clearly visible, being held by the team captain or a group of players, reflecting the significance of the win.
Background Elements: Include a cheering crowd, waving Indian flags, and banners with congratulatory messages, adding to the festive atmosphere.
Action Shots: Some players could be shown spraying champagne or doing victory laps around the field.

proven cipher
#

when i try to create an image with SD3 1368px * 2048px it just fills top and bottom with nonsense and process a square. Is there any workaround ? SDXL works fine in comparison.

signal shuttle
proven cipher
sage burrow
signal shuttle
proven cipher
signal shuttle
sage burrow
signal shuttle
sage burrow
low stone
signal shuttle
low stone
#

This is with cfg 4. Seems harsh lighting.

signal shuttle
edgy kelp
#

I generally think that's an issue of prominent synthetic data in the pretraining

low stone
#

Euler. Heunpp2 is massively slower and doesn't yield better quality for me. I'm doing 50 steps though, which did make a difference.

#

I'm using comfy's workflow that he published.

#

It the regular one

bitter hearth
low stone
#

Not the regular one.

proven cipher
#

with sdxl i got a lot better results

low stone
proven cipher
cobalt moon
#

today I just try SD3

#

on my another 4050 laptop

proven cipher
signal shuttle
low stone
#

Based on comfy's workflow, I've found the optimal res to be 1024x1280 or 1024x1344. I don't use anything else anymore.

proven cipher
#

i need something like 1368 x 2048 as smartphone backgrounds

#

i will try with 1024x1344

low stone
#

Yeah, if you need higher, you'll have to use upscaling methods. Sd3 won't do full HD resolutions directly.
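
(A minimal sketch of the "generate smaller, then upscale" route for the 1368x2048 wallpaper target. It assumes a budget of roughly 1 megapixel and sides that are multiples of 64, both taken from remarks earlier in the chat; `plan_base_resolution` is a hypothetical helper, and the sizes it suggests are not the same as the 1024x1280 / 1024x1344 preferred above.)

```python
import math


def plan_base_resolution(target_w, target_h, pixel_budget=1024 * 1024, multiple=64):
    """Pick a base size near the pixel budget with the target's aspect ratio,
    plus the upscale factor needed to cover the target."""
    aspect = target_w / target_h
    ideal_h = math.sqrt(pixel_budget / aspect)
    ideal_w = ideal_h * aspect
    # Snap both sides to the nearest multiple (assumed latent-friendly size).
    base_w = max(multiple, round(ideal_w / multiple) * multiple)
    base_h = max(multiple, round(ideal_h / multiple) * multiple)
    # Scale so both sides reach the target; the overshoot gets cropped.
    scale = max(target_w / base_w, target_h / base_h)
    return base_w, base_h, scale


base_w, base_h, scale = plan_base_resolution(1368, 2048)
print(base_w, base_h, round(scale, 3))
```

For this target the sketch suggests generating at 832x1280 and upscaling by about 1.64x, then cropping the extra height.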

proven cipher
#

i get heavily nonsense with SD3, idk xD (same prompt as told above)

low stone
#

I use the workflow on the woman picture on that page, at 50 steps

hallow lion
edgy kelp
#

Do not make me come here and use my VRAM, yeah?

lucid swift
sage burrow
edgy kelp
#

Dude's Ballz (TM) are amazing, I guess it's not a neutered cat

mortal kite
#

I have to say tho that all the low vram users I think limit the model. Can we really get an excellent model if it has to fit in 4GB?

bitter hearth
#

I put mask in the prompt to avoid face issues
and it decided to do a lace mask

#

its quite clever at adapting to the theme

#

when I put friendly witch in the prompt it added flowers and a vase to also make the background more friendly

#

hmm same prompt but now they have covid mask

sage burrow
hallow lion
#

Allegedly. Meow.

sage burrow
#

The catch with the API (and glif and huggingface) is that we won't be able to add checkpoints and loras to it when they come out. I've been eyeing up some cloud systems!

#

via glif SD3 large with Claude helping

#

A werewolf sphinx lol

#

brb, Imma post this on some conspiracy groups 😉

urban arch
sage burrow
#

A Cthulhu centaur lol

sterile pendant
sterile pendant
foggy cloak
bitter hearth
#

I tend to use hidiffusion or deep shrink for generations like that

sterile pendant
#

Point was that an entry level card of the current generation is on par with a top of the line card from the 20xx gen

#

But it also meant I got a ton of usage out of it before needing to upgrade again, so there's that

mortal mesa
#

mmm its better than some, i rock a 2080 TI

#

waiting on 50 series pricing, when i get up off the floor ill probably get a used 3090

sage burrow
#

Hmmm, cloud computing options seem to be less than monthly computer payments, for a computer which will be obsolete in 2 years, hmmm

torn wharf
#

not obsolete. the 2080 is 6 years old and still useable. just ancient

#

obsolete has a specific meaning

#

lots of old tech still has uses

mortal mesa
#

landfills!

torn wharf
#

e waste is a useless thing

bitter hearth
#

Give me that useless and obsolete card sadcat

mortal mesa
#

who doesnt want more land

sterile pendant
# torn wharf lots of old tech still has uses

Yeah exactly. But I'm holding out for probably a 5070. I'm not THAT big of an AI junkie, but I still game quite a bit. I don't care about shit like 4k 360fps style gaming though, so the xx70 model cards are more than enough

silver sluice
sterile pendant
#

I'm assuming the 5070 will have 16gb vram, so it's good enough

torn wharf
#

i game tons so i'm always getting a new gpu. AI was the cause of me switching from AMD to Nvidia. I probably should've switched around the 1080 generation instead. Nvidia really started to shine so hard then.

mortal mesa
sterile pendant
#

I have a 7900xt in our other PC, it's dawgpoop for AI related stuff, or at least it was the last time I tried it last fall

torn wharf
#

new card every couple of years but i try to find purpose for my old cards and don't just landfill them. at my old house i'd have them on my wall but they were ugly. i might make a knolling case for them next

mortal mesa
#

thanks i learned a new word today

torn wharf
#

was actually going to get the 7900 on launch week but it was a paper launch in Canada. I couldn't find any places anywhere between edmonton and vancouver that got any in stock. amd fucked around that launch hard. that was the other major factor that made me switch to nvidia

torn wharf
mortal mesa
#

nice

sterile pendant
#

Bruh the term knolling case is so redundant... Almost every case you'll ever see uses 90deg angles, even if there are curves, the base is flat or the object being held is kept perpendicular to the surface the case is on

low stone
#

Sd3 2b. I asked for inside the brain of Cookie Monster.

sage burrow
#

Never landfill computers! in my city there's this for eg., as well as another~~ one~~about ten or so which donates them to people too broke to buy their own computer. Probably similar in every city https://www.rebootcanada.ca/#:~:text=Supporting reBOOT Canada is simple,virtually anywhere in the country.

reBOOT Canada provides computer equipment, training and technical support to charities, non-profits and people with limited access to technology.

desert garnet
mortal mesa
#

gotta save room for the wind turbines, they are HUGE

sage burrow
bitter hearth
#

Becky is one step ahead of me sadcat

sage burrow
mortal mesa
#

used to pick computers on garbage days and Frankenstein them into something, dont see much on the curbs like that now

torn wharf
sage burrow
bitter hearth
#

I think.. it might be from 2019

sage burrow
bitter hearth
edgy kelp
bitter hearth
edgy kelp
#

Haha

low stone
#

A colorful, swirling vortex of half-eaten cookies and crumbs inside a fuzzy blue brain. Chaotic synapses shaped like chocolate chips firing erratically. Dark, shadowy corners filled with forgotten vegetables. Frenetic thought bubbles containing jumbled letters spelling "COOKIE" in various fonts. Tiny, worried-looking Sesame Street characters trying to navigate through the cookie debris. Flashing neon signs reading "EAT" and "MORE" scattered throughout the brain tissue. A distant, echoing laugh track playing in the background. Cracked mirrors reflecting distorted images of cookies and milk. Pulsing veins carrying streams of cookie dough instead of blood.

bitter hearth
wide pagoda
bitter hearth
#

I always use GPT-4o for prompts its pretty good

#

his prompt kinda looks AI

edgy kelp
#

I think SD3 should work best with AI generated prompts as its training captions were made with CogVLM

low stone
mortal mesa
bitter hearth
#

AI prompts, what's next ? AI images?? sadcat

edgy kelp
#

Ah darn, these new things made by THE DEVIL, in my times we had Dall-E 3!

torn wharf
#

for a lot of people this means using an LLM

rich tartan
#

How to use?

kindred pumice
#

Hi

edgy kelp
torn wharf
#

i believe you're stuck in magical thinking and don't have any evidence to support your hypothesis. cogvlm captions don't lead a model to understand LLMs better than people prompts.

You have a skewed understanding of what natural language is.

edgy kelp
#

I mean I'm not entirely sure, there are many systems that recognize "synthetic" sentences and long forms, but that's another thing, when I said it works best with AI generated prompt I didn't mean necessarily in contrast to prompts written by humans but rather "it's the intended way"

torn wharf
#

natural language was intended. not a stronger compatibility to LLMs. LLMs can produce tag style prompts too

winged grail
#

hey im having problems installing ReActor, it's not showing up when i installed it. Any way you guys could help me out?

torn wharf
# winged grail hey im having problems installing ReActor, it's not showing up when i installed ...

console probably says an error, something about no insightface installation. this is tricky on windows and i often elect to use a precompiled version of insight face. precompiled are often not ideal since that's how viruses can easily spread, but in this case insightface is popular so it's kind of easy to find a reliable one. https://github.com/Gourieff/sd-webui-reactor?tab=readme-ov-file#viii-for-windows-users-if-you-still-cannot-build-insightface-for-some-reasons-or-just-dont-want-to-install-visual-studio-or-vs-c-build-tools---do-the-following

the docs have a wicked troubleshooting section

GitHub

Fast and Simple Face Swap Extension for StableDiffusion WebUI (A1111 SD WebUI, SD WebUI Forge, SD.Next, Cagliostro) - Gourieff/sd-webui-reactor

bitter hearth
#

I actually think OpenAI did the right thing by providing an LLM that was finetuned to prompt their model
I wish every publisher of a diffusion model did that

torn wharf
#

After seeing what omost can do, i'm certain that using LLMs to prompt the model is the future. But thats for reasons outside of "cogvlm captions mean it understands LLMs better than something human written". I mean, LLMs were trained on human material to begin with

winged grail
#

yeah its the insightface

torn wharf
bitter hearth
#

the original cogvlm is not too smart though
its not like GPT-4o
there are patterns in its captions

#

like repeated mistakes etc

winged grail
#

@torn wharf i sent u a dm

bitter hearth
torn wharf
# bitter hearth

you see that ninja scrolls is getting a remastered theatrical release?

bitter hearth
#

I'm afraid I know nothing at all about anime

sage burrow
#

I was in best buy today picking up plugs. Stopped by the computer department for fun. I looked at the specs of the ones on display and said out loud "is that it?!"

The person working there explained that the second any high end computers are released they are bought immediately. There's a limited amount of gpus apparently.

This was in Vancouver, Canada, a rather large city.

Not sure how true that is but Nvidia owner is buying a few more yachts I think lol

bitter hearth
#

all I know about anime I learnt from looking at stable diffusion generations lol

#

yeah nvidia is most valued stock right now

sharp stag
#

Hey, need someone's help with gen. that has stable installed on a local machine

Long story short im away at work and will be home in like a month. Need someone with "stable" running locally (non XL, can be makeayo). Had been bored and wrote a prompt in my spare time. Wanted to see how well it performs, but it needs to be run locally since it's long/weighted and i won't get the same results with cloud based gen. (like never). Side note it's nsfw (nothing hardcore or lolicon). I would send both positive and negative privately. Thank you

torn wharf
torn wharf
#

no fires yet thats later in the year

sage burrow
torn wharf
sage burrow
torn wharf
#

sort of built mine. Bought a second hand alienware area 51 pc and i've replaced most of the core components at this point. went from amd threadripper to an intel alderlake

#

this one was a 3 paragraph long prompt instead of one i wrote. its okay too i guess. still balls.

#

ice cold

mortal mesa
sacred jewel
torn wharf
mortal mesa
hazy kestrel
low stone
hazy kestrel
low stone
bitter hearth
#

top is SD3 and bottom is Kolors

#

not a fair comparison since SD3 is a base model

#

but I hope it can get to Kolors' level

low stone
bitter hearth
#

wow nice

#

looks festive

#

this is sd3

#

sometimes it does great

low stone
#

Kolors is excellent. Sadly it doesn't know Cookie Monster or Elmo particularly well

low stone
low stone
gusty trail
#

I managed to train it with my own script. It just replaces the encoder with glm

bitter hearth
gusty trail
bitter hearth
#

wow thanks this is awesome

#

will try to make some kolors loras

#

what is kolors

#

in dumb cat terms sadcat

edgy kelp
#

Also in BALLS terms please

bitter hearth
#

anything becomes cool if you add 4k background on it lmao

edgy kelp
#

So many balls to witness

#

I'm going emotional

bitter hearth
#

you might not be able to handle this one @edgy kelp

edgy kelp
#

🥲

bitter hearth
#

sadcat not round enough

edgy kelp
#

Oh no, unballed

#

Sad stuff

bitter hearth
#

ball factory waow

edgy kelp
#

I work there 🫡

#

Notice also that the emoticons are BALLS

bitter hearth
#

true, everyone loves using balls pretty much

edgy kelp
#

thomas If-you-know-what-I-mean

runic tusk
bitter hearth
#

sadcat beans are horrible

mortal kite
bitter hearth
torn wharf
vapid radish
pseudo owl
# bitter hearth were you able to hires

should be possible, you just need to enable sliced vae decoding and vae tiling. i have no idea how to do that with fooocus or comfy but in diffusers its pretty easy.
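A minimal sketch of what that could look like in diffusers, assuming the pipeline exposes an `AutoencoderKL` as `pipe.vae` (true for SDXL-style pipelines; not checked here for every model). The function name and the `model_id` argument are placeholders, and dtype/device choices are left out on purpose.

```python
def load_pipeline_with_vae_savings(model_id: str):
    """Load a diffusers pipeline and turn on sliced + tiled VAE decoding.

    `model_id` is whatever Hugging Face repo id or local path you use;
    nothing here is specific to one model.
    """
    # Imported lazily so this sketch can be imported without the heavy deps.
    from diffusers import DiffusionPipeline

    pipe = DiffusionPipeline.from_pretrained(model_id)

    # Sliced decoding: decode a batch one image at a time instead of all at once.
    pipe.vae.enable_slicing()

    # Tiled decoding: decode each image in overlapping tiles, which keeps
    # peak VRAM down at hires resolutions.
    pipe.vae.enable_tiling()
    return pipe
```

Tiling can leave faint seams or small tone shifts between tiles, so it is worth comparing against an untiled decode at a size that still fits in memory.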

runic tusk
craggy crest
bitter hearth
#

delicious toothpaste makes you smile after eating the pizza right

runic tusk
#

@bitter hearth

low stone
#

now with cat meatballs

bitter hearth
low stone
hazy kestrel
low stone
bitter hearth
sacred jewel
torn wharf
strange grotto
#

anyone tried this?

#

sd3m finetune

#

trained on 50000+image

sacred jewel
bitter hearth
#

We could run it real quick

sacred jewel
bitter hearth
#

"An ancient castle draped in ivy, looks even more majestic under the setting sun"

#

Not sure it is worth running

#

Should be free on Tensor

sacred jewel
pseudo owl
bitter hearth
sage burrow
torn wharf
sage burrow
torn wharf
#

so then everyone in line would be like "why would you ask!?"

bitter hearth
#

I come back and so many balls

torn wharf
brittle nexus
brittle nexus
bitter hearth
low stone
#

auraflow

low stone
hallow lion
#

so many balls

hallow lion
sage burrow
#

at least he didn't mix pineapple and anchovies 😉

torn wharf
#

sounds kinda good

fallen pier
alpine summit
alpine summit
junior peak
alpine summit
bitter hearth
cobalt moon
#

hmm, one 25-step 1024x1024 image took me 20 minutes on my 2GB VRAM setup

#

it is painful ofc

hallow lion
#

You are patient, sensei.

bitter hearth
#

why would you not use lightning in that situation? LOL

brittle dragon
#

A futuristic, abstract representation of a human brain fused with circuit boards and digital neurons, set against a deep space background. The brain should be partially translucent, revealing pulsing energy and data streams within. Incorporate vibrant, electric blue and purple hues to represent cognitive activity. Add subtle, glowing lines connecting various parts of the brain, symbolizing neural networks. The overall shape should resemble the letter "C" for CogniZone. The style should be sleek, high-tech, and slightly ethereal, conveying the concept of advanced artificial intelligence and cognitive computing.

alpine summit
alpine summit
low stone
dull star
#

is this auraflow

low stone
low stone
dull star
#

oh, I understand

#

I made a mistake

#

this is absolutely sd3

muted dove
#

Not bad for an early beta with regular updates promised.

fleet meteor
low stone
bitter hearth
#

guys the heun ones seem good

#

much better than dpm ones

#

ok I did more trials

#

euler heun heunpp2 dpmpp_2m uni_pc uni_pc_bh2 were good

#

with

#

sgm_uniform or simple

#

however ddim_uniform gave more "baked" results

#

which sometimes was fun

#

heun heunpp2 were better for realistic people than euler overall
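To repeat a comparison like this, one option is to enumerate every sampler/scheduler pair and render each with the same seed and prompt. The sketch below only builds the list of pairs; the names are copied from the messages above (they look like ComfyUI identifiers, which is an assumption), and the actual generation call is left out.

```python
from itertools import product

# Names as written in the chat; ddim_uniform is left out because it was
# reported to give more "baked" results.
SAMPLERS = ["euler", "heun", "heunpp2", "dpmpp_2m", "uni_pc", "uni_pc_bh2"]
SCHEDULERS = ["sgm_uniform", "simple"]


def sampler_grid(samplers=SAMPLERS, schedulers=SCHEDULERS):
    """Return every (sampler, scheduler) pair to try with a fixed seed and prompt."""
    return list(product(samplers, schedulers))


if __name__ == "__main__":
    for sampler, scheduler in sampler_grid():
        print(f"{sampler} + {scheduler}")
```

Keeping the seed and prompt fixed across the grid makes it easier to attribute differences to the sampler rather than to noise.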

dull star
#

sometimes images look like something out of ideogram

low stone
muted dove
low stone
#

Compared to Kolors which was midjourney, so it often has a very over stylized look to it (I still love it too)

muted dove
#

I'm surprised nobody has tried "girl laying in the grass" yet 😄

low stone
muted dove
low stone
edgy kelp
#

AuraFlow's text encoder is a "pile-t5-xl", I assume that being trained on The Pile dataset it can understand and learn NSFW, for anyone interested

low stone
edgy kelp
fleet meteor
low stone
#

the model itself is 16 gigs. This is what Lykon means when he talks about the 8b sd3 being larger than most people can deal with.

#

that said, it's all one big file right now. I don't know what the possibilities are in terms of breaking that out into image and text encoder components at some point.

edgy kelp
#

AuraFlow model has a 6.8B Transformer but has ONLY the big text encoder (T5), I think if you use the 8B SD3 with only the clips you'd have a different "scale" of GPU use
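For a rough sense of scale, raw weight size is about parameter count times bytes per parameter. The sketch below assumes 2 bytes per parameter (fp16 or bf16), which is a guess about how these checkpoints are stored, and it counts only the transformer, not the text encoder or VAE that a single-file checkpoint would also contain.

```python
def weights_size_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate size of raw weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    # 6.8B is the AuraFlow transformer size quoted in the chat.
    print(weights_size_gb(6.8e9))
    # 8B is the larger SD3 size discussed in the chat.
    print(weights_size_gb(8e9))
```

Activations, the text encoder, and the VAE come on top of this, so actual VRAM use during generation is higher than the weight size alone.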

low stone
cobalt moon
#

why not use both

low stone
#

Because clip is awful.

#

When you have a t5 llm there, there's no reason to use anything else.

#

The clip is only when you're trying to shoehorn it into small cards.

edgy kelp
#

I think if you don't use the clips you won't be able to use 99% of the Loras though

#

Not sure though

low stone
#

If the LoRAs are trained on t5 then it's not a problem

edgy kelp
#

That's what I meant

#

I think most people won't train loras on the T5

#

But I have no idea haha

#

We'll see

low stone
#

There's always going to be a market for small card technology, but there's little progress if we keep holding onto outdated stuff.

alpine summit
cunning lintel
# low stone Because clip is awful.

Still not convinced of that, models with clip seem to know more, both styles/artists and obscure characters. Of course might just be the training of new models on synthetic stuff combined with current vlms being pretty bad at describing style and artists being stripped. Still feels like a regression sadly.

strange grotto
#

aura

fleet meteor
low stone
cunning lintel
#

Yeah, i hope things like IP-adapter will allow for style transfer in the future

sterile pendant
# cunning lintel Still not convinced of that, models with clip seem to know more, both styles/art...

and you're not going to get them again from a major company's public release, due to the wild wild west heydays already being over with. lawsuits and threats of lawsuits galore shut that shit down fast. from here on out, newer models are only going to contain what they are legally allowed to contain and won't be able to include things like named people and artists without their consent. so basically, it will mostly just be a bunch of copyrightless datasets, and any artists that are okay with their stuff being used will have to opt IN and not opt OUT now. so if you want stuff like that again, people will have to risk potentially being sued to train loras/models with stuff they want, until governments step in and regulate that part as well (won't be long, two years tops for pretty much all modern countries).

dull star
#

hope it improves

verbal epoch
#

AuraFlow

mortal kite
craggy crest
mortal kite
#

tired of every single thing on the internet being an argument

craggy crest
craggy crest
mortal kite
#

ugh bye

low stone
# mortal kite They could have given a 4b though

I say this sincerely, but I think this is where sd3 medium will shine. They apparently considered a 4b but it was decided it wouldn't benefit enough people, so 2b and 8b are the ones they're focusing on. I think once sd3 medium is "fixed" as per their press release, it'll be really great.

alpine summit
cunning lintel
low stone
cunning lintel
#

Noticed the same, claude is way more creative

low stone
#

gpt4o just kind of expands slightly on what you type. whereas claude adds all sorts of elements including text banners and signs that really enhance things.

#

which sd3/aura are both really good at.

craggy crest
cunning lintel
craggy crest
mortal mesa
#

dalle etc, not that i like it

low stone
#

🙂

mortal mesa
#

its Ella

mortal mesa
#

ok let me navigate around the glaring issues of SD3 and make some stuff

craggy crest
mortal mesa
#

i can just look at stuff on my computer, no scrolling

#

orange you glad im here

alpine summit
low stone
mortal mesa
low stone
mortal mesa
low stone
lavish osprey
#

Pixart 800m uses kv compression. It's essentially like a much larger architecture, but compression has drawbacks

mortal mesa
mortal mesa
frail shoal
bitter hearth
#

Blue girl, city background
Surely

mortal mesa
# frail shoal prompt ?

A dreamlike, ethereal portrait of Stability AI, its digital form dissolving into swirling clouds of iridescent gas. Glowing blue lines pulse through its translucent body, as if infused with an otherworldly energy. Its face, a blend of human and machine features, appears serene yet intense, with eyes that seem to hold the secrets of the universe within their depths. The surrounding environment is distorted, with buildings and landscapes warped into impossible shapes, reflecting the AI's ability to manipulate reality itself.

torn wharf
mortal mesa
#

lol pretty much

bitter hearth
torn wharf
#

monstrous covid

frail shoal
mortal mesa
#

are those SD3? you're getting cool mirrored patterns, i like it

#

Loads Sacred Geometry LoRA

frail shoal
low stone
#

could get it to do all the text quite right.

bitter wadi
#

Wen SD3 ? cheems

mortal mesa
hazy kestrel
hallow lion
#

Oh noes... Emad. you OK?

low stone
torn wharf
mortal mesa
#

Prompt challenge, wordsmith this into something usable: text of "always coming from take me down" reflecting the text of "never going to give you up"

dull star
#

that sounds impossible

mortal mesa
#

yes 100's of gens later

#

i mean maybe not but ya haha

dull star
#

wanting an offline ideogram comes at a price

torn wharf
torn wharf
hallow lion
#

Maybe not safe.

dull star
#

idk how they didn't think of this oversight

#

but hey, its a free model that's in like 0.1 state

#

an actual free model with apache 2 license

#

I didn't test too much but 2 character facial expressions work better

like I ask for the person on the left to be scared or crying and the one on the right to be shouting and angry, it gets it right

torn wharf
#

likely they used the prompts used to make ideogram images as the captions, so in training those unrelated captions will learn that cat, when that cat has nothing to do with what's being captioned.

dull star
#

this is sooooo ideogram its painful

#

but I still like how it looks though

hallow lion
#

ideogram sues SAI coz why not XD

torn wharf
#

SAI didn't make it

dull star
#

SAI making an apache 2 model?

#

even older models are at least openrail++

torn wharf
#

it was made by the guy who brought loras to image models

dull star
#

yeah its wild

torn wharf
#

as far as i could tell, ideogram's terms don't limit using images as base model training

hallow lion
#

we went out took photos of real world stuff and trained our model on that.

mortal mesa
#

the guy that implemented lora wanted to try to train a mmdit from scratch, it worked they released - the story

dull star
#

its crazy

torn wharf
#

it's not just stability's mmdit architecture though. they modified it to be more efficient

mortal mesa
lucid swift
dull star
#

hopefully

torn wharf
#

yeah it's v.01 and that's an obvious training data fix

cunning lintel
#

I saw the cat thing, tooo sad

torn wharf
#

oh i mean v0.1

cunning lintel
#

But pretty hard to get and even if you manage it's only sometimes :p

torn wharf
#

i am looking into running it, but it's nearly 7b parameters and i don't think comfy is optimized to use it in 16gb of vram

cunning lintel
torn wharf
#

it's a rushed out v0.1 release yeah. They can change that. I'm sure stability had many versions of sd3 and other models that would've counted as 0.1 versions, but never released them. This guy released his and it's the ugly side of the development process that many people aren't used to

mortal mesa
#

the we made something that works stage

cunning lintel
#

fall even released a 16 channel vae, it's surely coming

mortal mesa
#

i made some bad cats on a 2080 TI, or i think it all fell back to CPU

torn wharf
mortal mesa
cunning lintel
#

Funny when SAI seemed to be close to falling apart and rushed 2b, it seemed open generative AI for images was going to be a black hole, now new models are planned or released in various places 🎉 (and SAI seems back in business)

torn wharf
#

starbucks isn't even that great at coffee anymore. least the ones in my podunk hillbilly region.

torn wharf
#

probably a little bit of social engineering from competitors

cunning lintel
#

i feel if all was fine 2b would never have been released as it was, but we will never know

torn wharf
#

there's a wide spectrum of diverse situations between "falling apart" and "all is fine"

mortal mesa
#

they started pumping expectations too early

torn wharf
#

spectrums?! too woke tooo woke

hallow lion
#

They were trembling...

torn wharf
#

seems like an over dramatic metaphor. they were hung over after emad partied too hard

hallow lion
#

it was a solid scale 7 on the richter scale

#

lol

#

emad is SAI SAI is Emad, he is the mascot

torn wharf
#

hangovers take a couple days to get over

hallow lion
#

i hope they will recover. whatever the future, their contribution to open ai was awesome

#

Emad was the vision man tho

#

without vision its hard

hallow lion
#

You weekended the wrong way

cunning lintel
hallow lion
#

its like Disney without Lucas.

#

Maybe SAI will end up like that too without Emad

#

just hopeless attempts at cash grabbing but no real vision or integrity

#

i hope not tho but ye

mortal mesa
#

they will be joining the WEF soon

hallow lion
#

Someone email Musk

#

Musk is the eternal whimsical clueless billionaire who can save us all coz pockets deep enough

#

Help me Papa Elon, you're my only hope.

mortal mesa
#

is trust and safety trustworthy and safe if they hide things like that

#

.1 shift 6 CFG, were glitching

hallow lion
#

glitch art - never fails.

lavish osprey
# dull star wanting an offline ideogram comes at a price

No idea what's going on with model creators training on Ideogram outputs lately (isn't pixart also trained on Ideogram data?)

Not only does it borderline violate the ToS (sure, you can claim you downloaded them from HF, but come on), but no way in hell is an Ideogram image better than a real photo or art, and its prompt following and text accuracy are never going to be better than what a VLM can caption from a real image (that can also caption very small text).

And forget doing that with a 16ch vae. Sure, pixart and auraflow use SDXL vae, but even that is good enough to not be a bottleneck compared to synth data training that hasn't been at least refined.

mortal mesa
#

FOMO

lavish osprey
#

that being said, this is the #sd3 channel, please keep it on topic if you can 😄

lavish osprey
#

it's probably just very cheap, since you only have to scrape, no need to (re)caption (which is very expensive on large data)

#

MJ/Ideogram images come from prompts, so you can get free image+caption (again, violating the ToS)

#

there are a bunch of datasets like that on HF

#

making models is very expensive. I wonder why some companies ask you to pay for commercial use over a revenue threshold (or not, like Kolors)

craggy crest
#

maybe. they seem to have hidden their TOS page

brittle nexus
hot dawn
#

against my expectations, the non-attention parameters of SD3 seem to be more important for training than the attention blocks. even though I froze them earlier in training, it still seems most of the learning is there. It's possible that whatever censorship they did was semi hardcoded in the conditional pathway to remap the embeddings away from certain areas, though I doubt it since characters and styles are also primarily learned in the non-attention blocks.
if there are any vram savings to be made for finetuning SD3, it might be best to stick to training the non-attention parameters, which are probably much smaller too
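One way to try the "train only the non-attention parameters" idea is to freeze by parameter name. This is a sketch, not the script used above: it assumes attention weights can be recognised by the substring `attn` in their names, which should be checked by printing `named_parameters()` for the actual model, and the function name is made up for illustration.

```python
def train_only_non_attention(model, marker: str = "attn"):
    """Freeze parameters whose name contains `marker`; leave the rest trainable.

    `model` is anything exposing `named_parameters()` the way a
    torch.nn.Module does. Returns (trainable_names, frozen_names) so the
    split can be inspected before training starts.
    """
    trainable, frozen = [], []
    for name, param in model.named_parameters():
        is_attention = marker in name
        # requires_grad_ is the in-place torch.Tensor switch for gradient tracking.
        param.requires_grad_(not is_attention)
        (frozen if is_attention else trainable).append(name)
    return trainable, frozen
```

Only parameters that still require gradients then need to be handed to the optimizer, which is where optimizer-state VRAM savings would come from.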

#

potentially training with those non-attention components frozen from the start would give different results, perhaps the easier training went there but it may not be ideal

#

"roman soldiers in formation, covered in mud and blood" did get much worse with the non-attentions only, possibly because there were only a few sketchy examples of roman soldiers in the dataset, and most of the learning has gone into those components, good and bad

lavish osprey
# hot dawn against my expectations, the non-attention parameters of SD3 seem to be more imp...

It's possible that whatever censorship they did was semi hardcoded in the conditional pathway to remap the embeddings away from certain areas
That's just a silly rumor.
AuraFlow has the same issue, right? It's in the "llm nature" of the architecture. At 8b params it scales very well, 2b is an attempt of having the same tech run on local hardware.
(Also not sure why Aura fails it too, since it's almost 7b params)

hot dawn
#

it seemed a bit iffy since I did see your post about how the issues with the grass pose were there earlier

lavish osprey
#

yeah but that just covered up nudity, did nothing to anatomy

#

if anything, some biased filter on the dataset might have amplified the issue

#

but nothing done in SFT under my watch had any effect on that

hot dawn
#

there's not much nudity in my dataset, but the ability to generate nude people seems to have been learned in the non-attention blocks as well

lavish osprey
#

also 8b was trained on the same "filtered" dataset and has no issues with this use case

#

can make it flawlessly

#

you can try it yourself on API

hot dawn
#

I suspect 2B can learn that pose if focused in the dataset. I've always found that lying poses are the hardest to finetune and didn't assume it was a conspiracy by SAI that they didn't work well in the base model

lavish osprey
#

(even if I still think that WHEN sd3m manages to get this prompt right, it's the best)

runic tusk
#

She's got a little something in her "pocket".

lavish osprey
#

the hands are wrong, but look at the details. Native gen, no upscaling

#

SFT on sd3m was done very nicely

#

too bad the base wasn't perfect

hot dawn
#

I didn't caption text in my dataset so I'm surprised this wasn't broken 😅

brittle nexus
#

Part of the problem was including feet as sex organs

lavish osprey
#

but this at least ensures that simple pictures come out amazing and that the model is a very (very) good refiner

hot dawn
#

there's no women laying on the ground in the grass in my dataset, the model just worked it out from general pose examples

lavish osprey
lavish osprey
#

it's gonna be a mess when we release Large 😄

mortal mesa
#

give it a name and refer to it like that, Gigantor

hot dawn
#

using base SD3M as a refiner

hot dawn
lavish osprey
bitter hearth
#

No balls in sight sadcat

lavish osprey
#

I usually use 0.15 denoise with sd3m

#

but to fix text you need to go harder

#

0.15 denoise on sd3m is equal to about 0.4 on sdxl

hot dawn
#

that was switching back to the base model from step 10 of 28

brittle nexus
lavish osprey
hot dawn
#

using a negative prompt for the first stage

lusty oyster
#

Just asking for a piece of advice should I use SDXL or SD3

#

For better images

bitter hearth
#

Yes

hot dawn
#

depends what you're creating, SD3 is amazing at a lot of stuff, but bad at people from the waist down

lavish osprey
lavish osprey
#

use a SDXL finetune for the rest

#

SDXL turbo finetunes are also crazy fast

lusty oyster
lavish osprey
#

4 steps of dreamshaper xl turbo-lightning for gen + 3 for upscaling + 0.15 denoising (or 8/28 steps) of sd3m for refining

brittle nexus
bitter hearth
#

Lykon do you know if prompting for "8" or "eight" is better or something

lavish osprey
bitter hearth
hot dawn
#

Lykon adding to my wishlist of stuff it would be great to know from SAI, it's not clear if the first timestep (1000) should be finetuned or not. I think in SD1.x it wasn't which led to greyness problems, but after that I never really kept up

lavish osprey
#

models can't count sequentially

#

they count "at a glance"

#

and even humans have issues past 4

bitter hearth
lavish osprey
#

by the way, this is also one of the reasons why hands are a hard problem

#

and why a huge ass 8b model is better at them compared to its small-brain 2b cousin

hot dawn
lavish osprey
#

nah

#

hands are just very hard in general. Too many visual permutations

hot dawn
#

with attention / non-attentions. Think I've found the cause of my sometimes all-white images. Potentially merging layer by layer could find the problem

lavish osprey
#

these are all "valid" hands

#

but they all look very different

hot dawn
#

yeah I've been paying more attention to hands in photos and have realized how utterly insane hands are

#

so often a finger just can't be seen due to being bent the right way

lavish osprey
#

most models that are decent at hands basically overfit on small data and fewer hand positions

#

or take the shortcut of using only anime style or only realistic style

#

(or are huge ass like 8b)

hot dawn
#

well SD1.x was fighting against the VAE not even being able to encode and decode hands below a certain size threshold without introducing new lines etc, so I was optimistic a more powerful VAE would lead to a huge improvement