#🆕|sd3
the 2B not the 8B
The other night my subconscious just came outta no where and told me "hey you know how Jason Reitman directed the new Ghostbusters movies. Go look up Ivan Reitman director of the originals." Literally that just popped into my mind and I checked and they're father son
Brains are cool
the 2B has had its aesthetic fine tuning but the 8B has not yet
yeah you'll need clips. the t5 and clipL, load them with the dual clip loader and set it to flux
I wasn't even thinking of ghostbusters
where do I download them?
you can use the ones from there
they are the same
thank you
i'd go with the fp8 version of the t5 encoder though
the one thing i dont do
Ok, well, here is SD3 2B on the astronaut:
A bit of a catastrophe
that's what I was saying
if you don't think that SD3 is better for photography
then it's not gonna convince you
where does the t5 go?
Yes, well, 8 fingers, mutant hands, an audience fresh from a Cthulhu horror movie. We have our own ideas on photorealism for sure
anatomy issues are to do with DiT not VAE
the way to prove that is to VAE encode and decode photographs
and what you will find is that the SD3 anatomy problems are not in the VAE
i think for the purposes of discussion, you'd select the best photos sd3 could produce and compare those.
I honestly could not care then, because the image is completely unusable, and you can wax poetic all day and night on its VAE and it won't improve that image
while it's not consistent right now, it highlights differences in architecture
https://civitai.com/models/636355?modelVersionId=725298
What do you think of this Lora? I have mixed feelings. These are the tests with different weights: https://civitai.com/posts/5464381
at the end of the day, it doesn't matter who "wins and loses" the internet argument. The real world differences are real and will be apparent to anyone that can see them
If you tell me that image I showed is BETTER than the others, well.... ok then
facts don't care about feelings
a VAE can't improve an image
VAE encode and decode are both destructive processes
they make the image slightly worse
this can also be proven by doing a VAE encode and decode on an actual photo
i'm not bothering with loras until authors figure it out. whats the comfy converted weights? why doesn't 0.2 have them? i go get xlabs loras and theres two versions of each. their text guide lists different ones. i try to load loras in comfy. no work. it's all a catastrophe. lora training authors rushed this stuff out to get street cred and now the files that are out there are a compatibility disaster zone
Stare at a waterfall for 30 seconds - then stare at a dark, blank space - and the space seems to fall as well!
lossy is a good word in pixel contexts too
Look, this talk was all about how SD3 produces better photorealistic images than MJ and Flux, and you persisted in this with some claim the reason was the VAE. What I see is a complete disaster of an image, but that's just me. So talking all day about how better its VAE is, for me, is .... absurd.
the vae has nothing to do with the structure
thats a great one. i also love where you stare at a monitor screen of all weird hues, then swap to a black and white image and it'll appear in full color until you dart your eyes
SD3 8b ❤️
sd3 produces extreme photorealism - but the model is fucked up and yes, the images it produces are a disaster
Flux is able to do similar degree of photorealism. It's just hard to prompt for it
Tell that to him. He said SD3's photorealistic images were better, and then said the reason was the VAE.
I only asked for an opinion on the tests I’ve done... ??? But anyway, thanks
sooo not photo real then
As an art medium, I believe that SD3 medium is a hugely successful tool. It matters not what quintile of what quotient, which mote of SD3 makes it successful as an artist's medium
the model is unfinished and basically unusable in its current form
that's my opinion. i think the effect of each is subtly working. but overall...
But!
it makes some pretty details
ice didn't want to say that SD3 produces better images. He said that it can produce an extremely high degree of photorealism on the fine details
you may want to unblock me so you can hear me better
The discussion was with Neon
my daily flux email spam 🙂
When it comes to photorealism - there is an element of visual psychology which needs to be considered - and in SD3, what do we need to tweak in order to conform to human visual psychology?
produce feet and legs?
Rick Salmon (famed photographer) heavily relies on psychovisual dynamics in his work
The realism in a photograph is underpinned by (correct fingers/limbs/toes) saturation; contrast; and sharpness
if i take a complete trash photo on a $3 toy camera is it still photorealistic, yes
How can SD3 be prevailed upon to better exemplify human psychovisual traits?
It’s not better or more realistic. What you have is greater sharpness; it’s been trained to have a shallower depth of field, which makes the overall feeling more natural and realistic. But if we remove that, what’s left is this other thing (hopefully they’ll release a version that competes with Flux)
Psychologically, SD3 2B should stop producing this, so I won't feel... traumatized:
Me-ness good!!!
😄
i know that name. my design teacher was all about him and his one liner pro tips for composition. we were learning scene composition at the time and he was all about how we can take wisdom from other professionals for design rules. i dug up a few of these one liners to show how great they are.
- The name of the game is to fill the frame.
- Dead center is deadly.
- When you think you are close, get closer.
- The camera looks both ways. Adapted from Freeman Patterson's "The Camera Points Both Ways."
- Expose for the highlights.
- Use your camera like a spaceship. (A Dick Zakia philosophy.)
- Light illuminates, shadows define.
- Backlight = shoot tight.
- Make pictures, just don't take pictures.
- See eye to eye - shoot eye to eye.
- Take the darn flash off the camera.
But look at it this way: it has a 16ch VAE and that skin texture is perfection. Definitely the best image possible
Textual embeddings made using A1111 can fix the anatomy problems of SD3 medium
The funny thing is that SD3 8B struggles too with this as seen here:
I tested this at Glif for any curious to try themselves
What would be interesting would be a website listing all of the various AI txt2img, but then listing what each specializes in, and what each is terrible at.
every author is going to submit their work as the expert at everything
in the clip folder
along with the clipL file as well
and this is the node and way you'll use it
at this point I am not sure if you really get the difference between VAE and DiT
because you keep making criticisms that apply only to DiT
ignoring LoRA and only referring to established commercial models, I could tell you that easily, but I might not cover all special use-cases
Or maybe we have different criteria on photorealistic images
no because a VAE cannot make changes to anatomy
if you VAE encode and decode photos you will understand
differences in VAEs with these models pertain to colours and very fine (high frequency) details
something like anatomy is known as a low frequency detail
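A rough way to picture the low vs high frequency split being described, as a minimal sketch only. A 1D row of pixel values stands in for an image, a moving average stands in for the low frequency part (overall structure, which is where anatomy lives), and the residual is the high frequency part (fine texture). The function names and the toy signal are made up for illustration, and this is not how any VAE works internally.

```python
def low_pass(signal, radius=2):
    """Moving average: keeps the coarse shape (low frequencies)."""
    out = []
    for i in range(len(signal)):
        lo = max(0, i - radius)
        hi = min(len(signal), i + radius + 1)
        window = signal[lo:hi]
        out.append(sum(window) / len(window))
    return out


def split_frequencies(signal, radius=2):
    """Return (low, high) so that low[i] + high[i] == signal[i]."""
    low = low_pass(signal, radius)
    high = [s - l for s, l in zip(signal, low)]
    return low, high


# Toy row of pixels: a broad step (structure) plus alternating texture.
structure = [0.0] * 8 + [10.0] * 8
texture = [0.5 if i % 2 == 0 else -0.5 for i in range(16)]
row = [s + t for s, t in zip(structure, texture)]

low, high = split_frequencies(row)
```

In this toy, `low` still carries the step from dark to bright while `high` carries only the small alternating wiggle. Damaging `high` softens texture but cannot move the step, which is the sense in which fine detail and overall structure are separate concerns.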
I don't think so. If someone says 'more photorealistic image' or 'better photorealistic image' then I look at the image. Period.
i think the thing here is that for him, the goalposts were shifted long ago from a discussion of the vae and what it brings to the model to "sd3 is bad and you just don't get it" but for you, you're still on the original discussion about midjourney not having a sota vae yet
but we were talking about VAEs?
No, YOU said it was more photorealistic BECAUSE of the VAE
no
really?
it was in the context of a conversation about VAEs
want me to scroll up and quote you?
the point I was making was that the image is more realistic in terms of the VAE
see it's not actually a matter of having a discussion. he has to win. there's no compromise to build a mutual empathetic understanding. no grok
you then jumped to comparing things that don't pertain to a VAE
too adversarial. no empathy
Sure?
cos there are misunderstandings
i see human drama everywhere especially in moments of irrational discussion
Ok
oh i know. the misunderstandings tend to be forced through by some.
I already explained that I was referring to the VAE
No, you said the reason was the VAE
you're assuming incorrectly what I was saying
see what i mean by irrational. he's insisting he knows what you meant more than you could
Whatever
sometimes misunderstandings are manufactured
also regarding the hundreds of images
this is an example of one of the sites I used:
https://images.flrty.li/
there are about 200 comparisons there
"trash 3$ toy camera" included in the prompt haha
ok but you're stirring the pot a bit here
there's not rly drama
Am I imagining things, or is Flux better at limbs and fingers?
perhaps. good check. will go make breakfast
I could not say. He has been on my blocked list for a while
one of my pet peeves is people who force through misunderstandings. i often see it when one starts speaking towards what someone else meant. BIG pet peeve. but yeah. perhaps you're right
I can tell you guys had some kind of argument in the past
Nah. I found him the opposite of helpful. Everything came down to, and this came up not only with me: this is all wrong and you are at fault.
nah, 2 people right in their own ways clashing ideas
and backed this up with rudeness, which for me is a kiss of death
death. 🙄
"The text is above the page!" is often said of literature. For a photographer it would read "The photo is without the camera!" And I hope that's not too contradictory 🙂
Here is Flux Dev NF4 with the astronaut. Aced it as expected:
clean the camera lens, it's blurry
I didn't really explain very well before
that I also think SD3 has bad anatomy (it does in my images too)
its the fine details that I like, for photographic styles
Sometimes I just hate the amount of blur in the gens 
like come on my phone can do better
Do you get the idea that at SAI - the SD3 team 'never' spoke to the BFL team at all?! 😉
they did cos Emad said that he helped the team transfer out
must be the suburbia silicon valley millionaire neighborhood to have a classroom that ritzy
thats no inner city school
(There was a little irony there ...!)
Make your own pinhole camera
speaking of photography
https://huggingface.co/fal/AuraFlow-v0.3 focused on photographs
anything noticeable
damnit. all day that song in my head now
oh wow the next auraflow
Can anybody tell me why this w/f will not work? TIA
I think thats it, can't think of any other iconic songs that would have something to do with balls rolling around or something

hmm that song def over rides flow rida
wait a second what does that song have to do with balls rolling around
Lots of DOF 
sorry its too hard to read
maybe try and load SDXL base and refiner separately?
can't stop the ball I guess nothing much
I haven't really seen them together that often like that
who knows this one
Not bad actually, and an image I can deeply sympathize with, having taken apart and salvaged many a legacy lens
but wait... it wasn't a rock. it was rock BALLL //organ solo//
My personal fav is the Mamiya-Sekor 55mm f/1.4
help im a rock - Frank Zappa
Starship - we built this city on rock and balls
Personally my take on SD3 vs Flux for what people were talking about is, SD3 struggles with anatomy due to obvious undertraining (bad results look identical to early epoch samples from Lora training), but I believe it was trained on significantly more images that were actually photographs and not synthetic than Flux. Just based on their default aesthetics respectively
maybe some Jerry Lee Lewis
wtf thats an actual song cx
yes its either that or its that flux was fine tuned in a way that narrowed its sampling distribution
i.e. that flux started out "better" for photography than the final flux version
and the final fine tune pushed it more towards its current style
not rly sure which
I think it also has serious guardrails which may have unintended effects on the output
yeah quite possibly
i made this one ez
let me give you an example. If I ask for 'comic-style', then regardless of the specific comic style I ask for, such as 'realistic comic-style', it all looks something like this. Big bold lines, two-tone shading and coloring, and minimal fine details. No amount of prompt engineering will change that.
It's weird how they decided to fully include female nipples in Flux, but clearly undertrained them such that they're really strange and melting looking
Like what's the point lol
maybe it could be distillation issue also
However, if I drop the CFG down to about 1.5.... suddenly it starts to 'remember' some styles, including realistic comic styles.
more tonality, fine pen lines for shadows and details, etc.
you using pro?
Now this is homebaked Dev
to be able to modify the CFG in Pro I'd need to use up my Fal credits, and my interest was not that high
this is the best
The other Pro offerings don't really give you that level of control
no amount of prompting changes anything.
Sure, you can use Pro in other places, but not to the point of custom resolution, or CFG
biting my tongue here. i think trying to explain distillation was one reason he blocked me
lower CFG raises image diversity in all models
so this is consistent
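For anyone following along, the usual classifier-free guidance (CFG) formula can be sketched like this. It is a minimal illustration on plain lists of numbers standing in for the model's two noise predictions; the function name and the toy values are made up. Distilled models such as Flux Dev take guidance as an input rather than running two passes, so treat this as the general idea only.

```python
def cfg_combine(uncond, cond, scale):
    """Standard CFG: start from the unconditional prediction and move
    toward (and past) the conditional one by a factor of `scale`."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


# Toy predictions (hypothetical numbers, not real model output).
uncond = [0.0, 1.0]
cond = [1.0, 3.0]

print(cfg_combine(uncond, cond, 1.0))  # equals cond
print(cfg_combine(uncond, cond, 1.5))  # mild push past cond
print(cfg_combine(uncond, cond, 7.0))  # strong push, prompt dominates
```

At scale 1.0 the result is just the conditional prediction. Higher scales exaggerate the difference between the two predictions, which pulls samples harder toward what the prompt most strongly implies and leaves less room for variety.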
ah okay
Wow! Text coherence?! It took me about 10 tries with Flux NF4 to get a long string of text correct!
batch of 4, one came out good, and hammered guidance
Yes, but that it got that much right is a huge win. The only other generator that could do that is Ideogram
memories of ocarinas
I find Pro does get text better usually but other than that it's really not a big improvement over Dev
@torn wharf I did actually try to create a ballz one awhile back, was with SD3 though https://glif.app/@LadyLalita/glifs/clyd621a10000yha2zalqjbb3
Love these ones
I prefer pro over Dev, but I usually try for painted art styles, which is probably relevant.
I don't like Schnell at all anymore though 😦
When I was post-processing using Photoshop, 25 steps in NF4 produced a "quilted pattern" in the processed picture.
I have upped iterations to 35 to try and deal with that quilting
just to add i also use fp16 T5 with the NF4, i have an unscientific belief that i like it better
Yes, t5xxl_fp16 rocks
Schnell is great simply to rough-out an idea - and decide whether to expend some "real time" on making a good copy
https://imgsli.com/Mjg4MDk4
comparison flux nf4 v2 vs gguf v8
Carmen Curlers
Why oh why this mat1 and mat2 anomaly?
You and me, which is why I made an embedded model with the T5 fp16
noice. did that use my lora?
Roundish monsters? ROUNDISH MONSTERS?
i dig it
Looking for bts ai models please! I found a few but only 2 that looks like Jin and jimin. I need the other members
What is BTS?
A kpop group. I just need the Lora mode for the rest of them
The first and foremost place would be Civit
That’s where i found the two I’m using. Not the rest.
That was your best try, if not, your best bet might be training one yourself, or enlisting the help of someone to do it for you
i know this is off topic but i just want to go on record that nobody caught these free bonus points at all
I want free ice-cream
I haven't gotten to the point of being able to get glif to use loras yet
I don't think you can, since although you can build a Comfy setup, you are still confined to the models they offer
sd3 and flux know mad ballz somewhat already. my lora just helps it along, and probably not even in the best way. it's a REALLY bad dataset still
His model is so hard to find a good one to use
OK so I used to volunteer at a male teen residential treatment center.... and I said balls one day referring to a qigong exercise... So ever since then, anytime there may be teen males in the audience I use SPHERES instead ROFL ROFL ROFL
civit has some for sd15. this is #sd3 though so i dont think we can help you much. i only ever load 1.5 models for animatediff playing
It was the most epic beavis and butthead moment ever... "she said BALLS"
what!? and take away the joy of balls?
You can enjoy your own balls 😉
one of my favorite things about this decade has been B&B made a comeback
Whatever happened to the SD3.1 thing? Still no word?
It does? I’ll have to look it up later
two weeks
civitai doesn't have them?
https://youtu.be/Na3y1DI8bQI for scientific referencing
This clip from Beavis and Butt-Head Do the Universe on Paramount+ is where I drop the line to add that edit.
Just for Jin and jimin. Those are the best ones I could find
We are distracted enough with Flux for now, that they have time to perfect SD3 😄
yeah. stability shouldn't rush it lol. we're satiated
oh boy...
remember hearing that 16 rank lora would do nothing to a 12B model? so many experts came out to claim that immediately when lora testing was first happening
Wow, I'm surprised. You can try just typing in their names into a major checkpoint model. Most would know them by now I think.
I was honestly surprised too. They're probably the biggest act on the planet right now
Tried
Just found a jungkook one.
It amazes me that nobody has come close to Michael Jackson famous since, but BTS could be the ones. They still got a lifetime ahead of them
Suga (warning that site does explicit images far too often) https://civitai.com/models/139045/bts-suga
Yeah I saw that one Tbh doesn’t really look like him at all to me
Been using this one a lot https://civitai.com/models/139391/jin
people got too much respect for BTS to ponify them
they've somehow built a huge fanbase of non toxics
Tbh it just shows a normal photo of him. Not really showing the model they created
Tbh a lot of fans are like that. Which gives the fandom a bad name
yeh. lots of celeb loras don't do anything creative for examples. just generate the captions and post those
young people dont usually toxify till later in life
Does this help at all? https://civitai.com/models/9006/k-pop-boy
Not really that’s good for a ooc lol
"jaded" is what i've heard . takes a while for the minerals to saturate in an turn someone's heart to stone
Option C. create your own lora. Civitai has a lora creator on there. Just grab 200 images of each...
Wait that’s how you do it? It keeps giving me a pth to create
toxic fans i think are people who are obnoxiously obsessed without self awareness too. not just jaded people either. toxicity comes in many forms. swifties will attack people who like another star for example.
200 images for one person? wuhwuhhhhh
i thought i was doing 50 was tons
people making ones on flux with 12
That's how I do it anytime I want something specific.
You could also try the pony model and see if it has the data. If you do that, make sure to prompt SFW, trust me on that
A lot of toxicity comes with kpop fandoms not just bts. I met a few for blackpink and exo
OK perhaps I overdo it lol.
I’ll have to try that later thanks
Flux?
Sorry I’m new to the whole creating Lora’s 😂
i dont know what blackpink is but some people seem REALLY angry about it.
yeah i've seen it a lot. "Fan" comes from "Fanatic" so that's a thing to consider https://en.wikipedia.org/wiki/Fanaticism#/media/File:Eugène_Delacroix_-_The_Fanatics_of_Tangier_-_WGA06195.jpg
Because you lack Skillz?
JUST KIDDING 😛
Worked fine for me... first try...
oh wait that makes me think you can also use img2img workflows. On your own computer, or glif (just cause glif is so easy)
oh man wait till you hear about flux
Blackpink is a kpop girl group
tbh it's also a really hot color scheme
Those sorts of comfy workflows give me nightmares about dozens of orange boxes
this is flux when i prompt BTS on stage. is it accurate?
long way to go before they're Michael Jackson famous. hundreds of millions of fans is nothing to make fun of, but it aint BILLIONS of fans
Bts has 7 members lol and not really doesn’t look a thing like them lol good try though
is this a good one? It's Flux Dev
do you think anybody that's a fan of bieber or swift or bts or all these world tour acts these days, do you think any of them even consider that those idols are nothing compared to the King
Elvis? He is still alive, you know?
MJ is too, but legally owns Elvis's estate. Son in law shit
wait i guess legally both are dead
Pretty good lol
the estate is some hedge fund and they were just prevented from selling a house
you'll want flux dev then. Seems to know BTS really well already
I created this one using aieasypic
How do I use it?
come on guys they're really phoning in those heart hands here. pop em with conviction.
They do that sometimes lol
just came out 2 weeks ago but there's already a ton of different ways to use it on your home GPU. Becky linked the hugging face space up a few. there are other sites that offer flux use like Gliphy
Gliffy?
Jliffy
far left has everything everywhere all at once vibes
i dont want to be the one that says they all look the same..... but i feel like they all look the same
I’ll have to try it
flux does money suits though
I don't know what they look like, but I get the impression this is dead wrong
Did this one with a free anime one for Jin but can’t remember what it was lol
Try Duc Haitens Pony models
I was a clown as a kid growing up with the boy bands and pop queens in US. Britney. Aguilera. Timberlake. Backstreet boys. Everybody. Yeeeaaah.
I'd do karaoke of it and air band shows
Flux can't count to 7....
Do a landscape one it might help for that sorta prompt. Flux is trying to compose them all in the pixels it has too
I grew up on those too. NSYNC was my jam, along with spice girls 🤭
Yeh Yeh it's all good shit. Venga boys too
BT the producer had wicked good solo albums
Will it work with a lora model? Plus i don’t remember what the prompt i used was for the photo lol
With AI art, you just have to keep trying stuff until it works. Or head over to midjourney, find a good BTS image and borrow their prompt 😉
I have mid journey but I tried to use a prompt and it says I need to pay
Oh I didn't mean to use MJ. I just mean to browse others' images
Now I’m confused 😂
Hi confused I am iceycold
2 Weeks, or as soon as we sort out hands.
Hi. Don't be confused it's almost Friday.
Better if they take time and get all tools ready to deploy and don't talk about it till it's all good to drop then drop it and talk about it.
Like BFL but they didn't give tooling with the release, so it's a chance to upstage
Yes. Learn from Black Forest. No hype. Then drop the goods out of clear blue sky.
Also, BFL just dropped it all and then haven't really talked about it
😏
6 Months silence... then ladies frolic in grass!
Kill hype culture. Murder it. Death to hype.
Yeah
The death of hype shady
Hype killed star wars
I agree. It's not entirely dead but.....
First few episodes of Mandalorian, Andor and Rogue One are probably the best new star wars we got.
then Kenobi came and said Well Hellow Thear.
hype kills a LOT of creative projects. Hype is the mind killer
And it should have been awesome but it was not so awesome
Kenobi worked well yeh. flash in the pan
you didn't like it? yeh there are polarized opinions about it i suppose lol
acolyte looks like a lot of dune vibes and i'm all hardly done digesting dune yet. that one is heavy duty cinema
i haven't got to it yet
please watch the first version
ive seen all versions many times
yup
XD XD XD
David Lynch. H.R. Giger
Weirding modules for days.
Directors Cut lynch version lol. i wish. always was rumors. people would claim to have it. then the internet came along and was like "that shit doesn't exist you liars!"
I watched Inland Empire just the other day
i love lynch but inland was too ugh
Sir Patrick Stewart as Gurney Halleck wadddup. I like Thanos doing it too but lets be serious. He's no Knight
But I think it was just the crappy early digital cam technology
if Inland was shot on 35mm I would eat it up
Heineken!?!?!!
i'm really basic so gimme the bud light lime. drink that like juice. but if i want to enjoy an ale i'll get one i can't see through and has more calories than a loaf of bread
that might count as a stout
tellin u
$3 tall boys of faxe you're good to go
10% evil viking.
i didnt know flux1 worked with dpmpp at first
works with bosh3 too if you caught the node for it
havent heard of a sampler bosh3, is it new?
me neither
if you search the server for "comfyui-ode" you'll find a lot of different times they get talked about
hmm interesting, so its kinda like dpm adaptive
think @ drhead made them
this was the kpop prompt i used earlier though bosh3. ymmv. i like that finger artifact
neat... i'll wait for it to come out of alpha version
at higher steps, it'll take a long time to work, but it works very accurately (in theory). there are other samplers in there too that work slower even. you can really make your gpu grind on images with them
slower on account of it being so damned accurate! (in theory)
quality looks good
a lot of kpop groups used that pose a lot
oh yeah i know i'm prompting for it, but this sampler got it to generate the rare heart hand duet
unipc works on flux too yeh
it's a crapshoot still for what works and doesn't on the forge side of things. euler and dpm2++
what type of negative prompt do i use to fix the fingers??
Forge is madness.
@sage burrow if you wanna use flux1-dev-bnb-nf4-v2.safetensors https://huggingface.co/lllyasviel/flux1-dev-bnb-nf4/tree/main
good prompt theme. madness. balls. ball madness. mad ballz!
You can never go wrong with balls.
balls save fresh water reservoirs. true story. you drinking water? maybe balls were dunked in it. just sayin
https://youtu.be/YjDH-26Ksac balls are life
In Los Angeles, a remarkable sight unfolds as a truck releases countless black balls into a sprawling water reservoir. At first glance, it might seem like an attempt to create the world's largest ball pit, but this spectacle serves a far more critical purpose – ensuring the safety of the city's drinking water supply. These black spheres play a c...
space... madness
its a mess but i liked the idea
spaaaace
MAAAADNESSS
https://images-ng.pixai.art/images/orig/c24f5687-32e5-4f15-b998-e2baa6544ba3 oh dear lol turned him into a wolf 🤣
Madness ?
This is...
(not sparta, it's me out of free credits to gen images)
Word in Ideogram discord is that the new model 2.0 will roll out publicly anytime now
people mimicking AI video seems legit
love that. thats awesome. the noodles
will smith did it with his spaghetti video too.
Didn't expect AI to improve drastically💀.
Video Source:
https://youtu.be/UQmgKIWFnHc?si=nUCL-YcBs0-H8NRv
https://youtu.be/Zfk8mcRECiM?si=nE5Sagp5EgzZezjF
Now I am become ball, destroyer of cubes
one more. flux goes hard
wow that is actually very nice 🙂
Would it be too political to do a ball series on mpox?
thinking people don't conflate illnesses with politics, that just failure from the get go
balls are nice.
Goose wtf
Flux pro 🙂
You sure? I can run it on 8. Takes 10 mins tho lol
Have you tried nf4 yet?
I lied, takes 210 seconds for me right now. What’s Nf4?
Impossible because mine can. It actually uses like 6 gigs on Forge lol
That's not too bad. Nf4 is a version that makes Flux take half the time, or even less.
https://huggingface.co/duuuuuuuden/flux1-nf4-unet/tree/main is this what I’m lookin at?
when you thought it was settled https://huggingface.co/city96/FLUX.1-dev-gguf
dodododododoodoododo do the ball part dodododododo dodo dodooodoo
i remember when they tried to introduce humans to the show but the square didn't fit and no one liked it. bigger than new coke
and i guess there's some code in kohya to allow training to be split into parts and fit into vram
Best 70 Sitcom. the Ball Train.
Id call this an img2img flux pro fail, but some people might feel it's a great success 😄 @torn wharf
F R I D A Y.
Modelo Especial FTW
Flux found a woman in my word soup I'm not able to create an image prompt from that text input, it seems to be a series of characters that don't form a coherent phrase or description. If you meant to provide an image request, please rephrase the input into a brief and descriptive phrase that includes keywords for (an adjective, type of image, framing/composition, subject, subject appearance/action, environment, lighting situation, details of the shoot/illustration, visuals aesthetics and artists), I'll be happy to assist you in creating an image prompt.
Naked girl
So for anyone interested, Aura Flow 0.3 is out, the third model in a bit over a month. It is still in beta, so keep that in mind. Here is the link: https://huggingface.co/fal/AuraFlow-v0.3 Just download the main file into the Checkpoints folder of Comfy. Here is an image with simple workflow.
Damn. I just went to bed.
Remind me tomorrow or hit me up via private dm 👊
so you telling me you cheated? jk 🤣
Flux pro via glif, with gpts' help, I cheated big time 😉
Do the Paris Olympics do medals for downright creativity?!
Including Face Detailer? With me it gets stuck at mat1 and mat2 shapes ... ?
Ordinarily it works just fine - yet when I add-in the Face Detailer - I get the above error?
the way you lay it out is kinda the worst case scenario
in terms of how confusing it is
Change the noodles from spline to straight?
its difficult to give advice that would apply to every workflow
but mostly
space it all out a lot more, across several screens
and consider bus nodes
also sometimes I have multiple loaders instead of loading something once and then stretching noodles all the way
a big thing is to choose the horizontal and vertical position of nodes on the basis of whatever would produce the least line crosses
It works perfectly well without Face Detailer, but stops when Face Detailer is unmuted. I agree that it is not attractively laid out! 😦
... still waiting for Aura.Flow to d/load ...
Big Barrel
Any body used these at all? https://huggingface.co/marduk191/Auraflow0.3_collection/tree/main
#artisan-1 asd
^..^<
SwarmUI SD3
does dev nf4 version support lora yet?
It's like coming home from an IMAX theater and sitting down to watch a black-and-white TV.
Nope
Art is cool on a B&W TV (imho) 😉
But SD3 does stop me using all my favourite artists in the prompt 😦
i see ok, there's been comfyui update yesterday or something and some ppl are saying nf4 might support lora after that
There are 5 Dev LoRAs on Civitai
But full Dev takes a month of Sundays!!!
im curious on the development with nf4... regular dev takes too long to render on my system
I just don't really follow this format much because I use the standard one.
schnell nf4 takes 15 seconds about, and dev nf4 takes around 1min few seconds
I'm getting used to it
On my 8GB VRAM, 64GB RAM setup, nf4 takes 35 steps and one image every 2 minutes 10 seconds
Flux.Dev can take up to 30 minutes only at 20 steps
30 mins... 👀
But Flux.Dev can take advantage of LoRAs
Yes, VRAM too little ...
If I don't use 35+ steps on nf4 a strange "quilting pattern" can be seen ...
I'm eager to try the new ODE Samplers in ComfyUI - and AuraFlow 0.3
This is the same image with SAG+PAG off
and with SAG+PAG on
im liking the comfyui new updates
UI looks good
Yup. No hangups 🤷♂️
maybe instead of using facedetailer
write the process out manually with impact pack nodes
all the detailers are just inpainting with some combination of Yolo, Grounded DINO and Segment Anything
Guys do you know why gguf q4 flux, which is about the size of sdxl, takes the same amount of s/it as the full fp16 version of flux?
like, shouldn't it be faster?
don't llms get faster with smaller quants?
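Some back-of-the-envelope numbers on what a smaller quant buys, as a minimal sketch. It takes the 12B parameter count mentioned earlier in the channel and ignores the per-block scale data that real quant formats add, so actual files are somewhat larger. The function name is made up for illustration.

```python
def weight_gigabytes(num_params, bits_per_weight):
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


PARAMS = 12e9  # 12B parameters, per the earlier discussion

for name, bits in [("fp16", 16), ("fp8 / q8", 8), ("nf4 / q4", 4)]:
    print(f"{name}: ~{weight_gigabytes(PARAMS, bits):.0f} GB")
```

One common explanation for the unchanged s/it (an assumption here, not something measured in this chat) is that quantized weights get dequantized on the fly before the math runs, so the smaller file saves memory rather than compute. LLM token generation tends to be limited by memory bandwidth, which is why smaller quants help speed there.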
the otter (?) 10x better than the robot

the otter is better yeah
The auraflow before this one wasn't worth the price, so I'm more than hesitant to try.
Has comfy figured out how to make the nf4 install easier yet?
ugly
Yo... Whatever City96 did with the GGUF loader addon last night... Omg... I'm using the dev q4_1 model and can generate images at 4 megapixels, without oom errors and while getting 15 sec/it, all on an 8gb gpu
Regular sized images in the 1mp range are around 3 sec/it
Uglier

machines are taking over the girls
can you smell the sdxl in these
very good composure and sweet girl
I got it up and running with the help of @errant dust in under 12 minutes
Install bitsandbytes ver 0.43.3 first
Make sure you have CUDA 12.4
Did all that, kept getting errors; after an hour I finally got it to work. Then accidentally hit install the next morning and destroyed it 😦
So now I'm waiting on a method that works without 100 errors on the way 😄
This is SD1.5
turns out you get good color grading if you just put R2D2 as the IP adapter images
YMMV. It is slower for me than NF4 on 1k images.
NF4 v1. v2 is a snail as it exceeds the confines of my 8GB
Might, but I get a massive slowdown on NF4 when I exceed 1k anyhow
nevertheless, will check it out. the diff in speed is not huge between gguf and NF4 at 1k. NF4 is clearly faster for me, but once it tries anything larger than 1k, even 1280 x 1280, it slows to a crawl. Will see how GGUF model handles it. I tried them all last night. Well, all the Q4 and Q5 models
NF4 was indeed faster for me also
Where did all the SAI representatives go? Ever since FLUX appeared in this channel, none of them have posted anything.
"appeared" 😂
So is it possible to run cascade via comfy? How?
yup
pretty easy, just need the unets from the huggingface page in your comfy/models/unet folder, etc
need to clean up the description etc but got it up now
I'm getting really tired of that faded look Flux seeks to default to. If I were to ever print and sell one, everyone would ask if I was running low on printer ink.
What they can do? Send us to another discord won't help them. What can they say?
add vibrant to your prompt
So this is interesting, same with sd I wonder?
update it, it's fixed. i'm literally running the q8 gguf now at 4 sec/it and that's pretty much the same speed i was running nf4v2. this is an 8gb gpu. i'm even doing 2048x2048 images at like 16-18 sec/it. also make sure you aren't using the force clip device node anymore
This is going to take some time!
nf4 or fp16?
Full Flux.1 Dev
you need the RES4LYF repo to be able to use those exact nodes
rtx 2080 FE and this is with Q8
hmmmmm. 2000 civit buzz and i can train a lora for flux. so i go look in the bounties to see what i can do. TMNT lora for 100k? wtf.... okay.. oh. they want TMNT from the 80s in porn scenes.
The bounty board at civit is disgusting shit. That's outta hand.
Oooh, did they add flux to their training models now? 😄
yeah but i got no buzz . and earning it means participating.
it's supposed to be tiger sized, but there's nothing to compare the size to. so you just have to pretend it's a 400lb cat
click on any image, hit the remix button, click on the claim 25 buzz...but don't generate, or you lose some buzz. daily. Or buy some for, $5, costs $2 to create a lora apparently.
I have lots of buzz because I was going to create a bounty before, but everyone told me just use controlnet after I posted it.
Way above my skill level. Does all this work on Mage or glif? 😄
you'd have to ask Roi about it for mage, they'd have to implement it
nope
I'm glad to see he apparently doesn't have bad breath 😉
Yet... 😉 😄
The power of MENTOS
So only $2 per Flux lora creation now... I've been saving up my buzz (actually I was saving it ever since sd3 just in case lol). Time to experiment!!!
and the benefit of using their service is, you can use up to about 1000 images last I remember!
$2 is all it costs for 2000 buzz? hm
maybe i don't have to build a pornographic 90s ninja turtles dataset after all
only problem with online lora creators like civit is
they set the learning rate super high so that most loras don't fail
this is fine but it means it over-fits
so the lora result might kinda lock you into that subject a bit
but that's ok cos most people seem to prefer that type of lora anyway
Min purchase is $5 for 5000 though.
That is my plan 😄 Though, ty, I'll change the auto settings for my next one and compare.
I created several with SDXL before and they turned out pretty awesomely. Though I probably have higher standards now, so we'll see 😄
a lot of the time people use Lora to create a specific character or a movie scene
so its ok if the Lora makes the model forget quite how to make teapots like it used to
fuck. no deal then. it's not even an evenly divisible amount. thats how they getcha
/chronically cheap/
2 loras for $5 seems less expensive than the other methods?
civit's allows you to define the lr. looks like the default is 5e-4
i'm just being bombastic. i might use it yet.
Your skill levels are MUCH higher than mine for such things though, so you might be able to do it for less using Vast?
Quickly making loras before they decide to make Flux ones more expensive!
i'm out here winging it most times. but i'm also a big believer that most of the skill is in the training dataset building. settings are a lot more forgiving than people think. i lean towards UI's i'm comfortable with and can affect easily. convenience.
like you know how kids will dive off the 20m tall tower at the pool, even though they're not olympic divers? meeting the challenge is a big step, but it doesn't make them into contenders. thats the skill level i think i'm at. i'm just running and leaping off and then trying to land in a non bellyflop position
civit looks appealing to me to try once or twice, because it's a stupid easy interface from what i can gather
i've also been poking around at this library from nvidia, "transformer engine". if i can get it working on windows or a docker, can potentially train flux in fp8
people are paying for loras?
I'll do them for free
just give me the source material
hungry ahh mfs tryina monetize everything
greed is good. coffee is for closers
Now to figure out where to use my new loras. Mage doesn't have Dev yet (last I looked), and Glif doesn't add loras (quite yet).... Also, I don't think Dev loras work with nf4?
https://www.youtube.com/watch?v=VVxYOQS6ggk educate yourself
just use prodigy
learning rates is so 2022
prodigy has never cranked out a shit lora for me
do you know why rickrolling is a thing? because the memes that have the most staying power come from the 80s. like wallstreet.
prodigy often is just adamw that finds a learn rate in the first 20min of training and then stabilizes. running adamw at that stable lr is no different.
yh its a dynamic scheduler
it's also optimized for unets and not MMDiT networks
its not the only one of its class, but it is the best one Ive tested
oic
yh the same stuff that worked for 1.5 is prolly not gonna work for flux. I am trying to learn how to train flux after all
Looks at Prodigy pricing............ $$$$$$$$$
at this point in the game, you're better off sticking to adamw for flux and sd3. the adaptive ones will just find a lr and stabilize at it. just use that lr instead
its not a website
its a training method
so that ppl dont have to set a learning rate manually
So you don't need to dl some overpriced software?
noooo. kohya is free
and it comes with the scheduler
if yall can help me find a tut on flux loras I can gladly train whatever yall want free
I mastered dreambooth and loras for 1.5
flux is new tech
rip prodigy ||https://vimeo.com/144850907||
Kohya is for the rich who can afford 16gb GPUs 😄
(or it can be used on a cloud too though)
ugh. attitude. noted.
My computer budget was for 8gb GPU only lol
loras take like 30mins top
uh huh. you're flexing and it's tacky
It's all the same shit: curve fitting, derivatives through back propagation and setting the million knob directions to be scaled by the learning rate per step. Prodigy is good shit, you just mostly hear about other methods because people are scared of change and the sunk cost fallacy that goes along with them having spent hundreds to thousands of hours perfecting their settings and workflows.
Do you mind several hundred N SFW furry images? LOL
OMG
hit the nail on the head
do you want them all in one lora or different ones?
Well, to compound the difference, I could not run V2 at all faster than say plain Dev. Only V1
what sunk cost? it's a tick box.
test it on sd3 yourself. it finds a lr and stabilizes. plug that lr into adamw and give it a similar warmup on the same training seed. a similar network will be produced.
I started reading the prodigy paper it sounds good
thanks for this
or have you tested prodigy and invested too much time into it to investigate this?
alls im sayin is, prodigy works. its made to avoid overtraining. so if you wanna train 2000 steps or 10,000 steps, it will adjust the lr so that it wont overtrain
and...it works. ive trained about 80 loras
always cooked to perfection 🧑🍳
I just load the .json on kohya
select the source folder and click a button. no more
prodigy can very much still over train. done it a few times with it on xl and 1.5 models
other options mitigate that. less steps. decay. etc
and 
I will add this though, after testing them all at length at 1280 x 1280, only the GGUF models were actually doing them in usable times. NF4 quickly becomes impossible. V1 or V2
I say remotely usable, meaning 6 mins
nf4 on my system is 30 second gens.
Your tests are qualified by your system specs and your skill level it looks like.
as opposed to.... 20 for NF4
page file swapping is a biiish
NF4 1280
loaded partially 5323.275 5859.856831550598 0
100%|███████████████████████████████████████████████████████████████████████████████████████████| 35/35 [20:46<00:00, 35.62s/it]
Requested to load AutoencodingEngine
Loading 1 new model
Prompt executed in 1335.04 seconds
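Rough sanity check on that log, as a sketch only: the progress bar shows 35 steps at 35.62 s/it and the run reports 1335.04 seconds total. The split between sampling and "everything else" (model load, VAE decode) is inferred here by subtraction; the log itself does not break it down.

```python
# Numbers copied from the log above; the "overhead" figure is an inference, not logged.
steps = 35
sec_per_it = 35.62
total_sec = 1335.04

sampling_sec = steps * sec_per_it          # time spent in the sampling loop
overhead_sec = total_sec - sampling_sec    # assumed: loading, VAE decode, etc.

print(f"sampling: {sampling_sec:.1f}s ({sampling_sec / 60:.1f} min)")
print(f"other:    {overhead_sec:.1f}s")
```

So nearly all of the time is the sampling loop, which fits the page file swapping theory better than a slow load or decode.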
how much vram is this?
I've trained dozens and dozens of loras with datasets ranging from dozens to hundreds of images. Used every major type of training method. Prodigy is extremely hard to f up, rarely over trains beyond recovery (picking an earlier save point) unless you have a bad dataset or screw the settings up (don't use network dropout ANYWHERE else other than in the prodigy settings and never go above maybe 3-5%) and usually settles quicker than a lot of the other methods
both. One with a mix of everything, then a few with specific themes. It'll take me awhile to organize them though!
But starting with just one, with a specific theme would probably be good.
Wait, you said half an hour?!!!!!! How much vram do you have?
24giggies. its a 3090
ya I can do that np. I am curious to test flux after all
and am gonna do some loras of mine as well
I just need to learn the settings for flux first since its all new
it works best on larger datasets in my opinion. my poorly built ball dataset is one that prodigy hates dealing with. i heard once that the authors intended it more for anime, but i've never encountered problems with it for that
loras are usually quick to train, its dreambooth that takes like 3hrs for me
also, network drop out is a tool i use to generalize my loras a lot better. maybe that's why i have difficulty with prodigy sometimes. thats a good thing to mention. thanks
like these were my settings for the last lora i trained, used cosine with no restarts and no warmup. lora turned out fine
the key is to make absolutely sure you dont have network dropout in any of the other settings
I did a checkpoint with everydreamtrainer2 once... it has a dataset of 15 images LOL. Took 4 hours haha
I'm trying to learn Huggingface Accelerate
cos it lets you split stuff across multiple GPUs
I find good deals at night time for Swedish 8 GPU servers cos no one wants them lol
where is network dropout in the prodigy settings?
in onetrainer or kohya?
either. i'm agnostic to the interface as long as it's an interface
in kohya, you have to add it to the line with the betas and whatnot
in onetrainer, you click the options
this screen?
Lack of balls and sd3 today 
that goes to the screen where i don't see any network dropout options, the one i just replied to
people still use sd3?
this is in the lora tab
set that to zero
yes. why wouldn't they?
I mean, if they're fixated on porn then no. but other people are still using it. i thought at first i wouldn't ditch flux but sd3 succeeds in some situations
there are other settings that are technically "dropout" like settings, like noise and whatnot, but i'm not doing a deepdive on it all right now
SD3 is awesome if you want particular artist's styles. Flux isn't as good at that.
flux is infinitely better
oh and weight decay = dropout in the prodigy settings
this is the key right? but the one you just showed me is the only place network dropout is.
In kohya it's only in one place too and i can't find anything in the prodigy paper https://arxiv.org/pdf/2306.06101
Other than that.. Questionable
I went back to SDXL
but now I throw up to 50 images of R2D2 into IP adapter to get different looks
read what i just said and scroll up to my first screenshot. it's weight decay. basically does the identical thing that network dropout does. randomly forgets some percentage of connections
i'll keep using network dropout i think. if it's incompatible with prodigy then i'll just avoid prodigy
I have yet to see that... but give me about half an hour, my civitai lora experiment is running lol
weight decay is not basically the same thing as network drop outs.
for prodigy, it is
remember that this is still only base model flux
once we get finetuned flux.... ooh la la
consequences will never be the same 😏
this is why people have issues with prodigy, because they have a bunch of other stuff hammered into them
if we're just making things up, then i don't think we're having an honest technical conversation. i'm trying to legitimately ask you for accurate information on the technical side of things, and you're giving me loose "its basically the same" explanations.
forget i asked.
prodigy is made specifically so you dont have to twiddle a million knobs. just load the damn .json and train!
I can send yall the .json if yall need it
I can only imagine how many loras people will have out about 1 week from now!
Claude Monet
"read what i said" is also very abrasive to me. We can all read here. I think that's apparant.
i posted my balls lora to reddit and the stable diffusion community mass reported me for spamming. my 2nd appeal just went through and my account finally got reinstated
don't post to reddit.
Ty 🙂
look in the code, and network dropout is an artificial thing hacked into the algorithms
not just for SD overall but for comfy too
and diffusers reddit community doesn't rly even exist
all the code is an artificial thing
Lmao
i'm not sure what that means
it just willy nilly nixes some random weights at each step and wasn't actually a part of the original research
yes. the name is self explanatory. it drops out parts of the network
this is something I always worried about with the training scripts like kohya
that they could just put some random things in
prodigy uses something more along the lines of alpha=(-group["weight_decay"] * lr)) to more properly decay the weights
thats from adafactor, but prodigy is likely the same
i don't feel like pecking through a bunch of code to fully explain it, but it's better than just randomly killing entire connections
since the algorithm is already adaptive and all
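For anyone following along, a toy sketch of the two things being argued about. This is not kohya, OneTrainer, or Prodigy code; `decoupled_weight_decay` and `dropout` are hypothetical helper names and the numbers are made up. Decoupled (AdamW-style) weight decay shrinks every weight by a small factor each step, which is what the `alpha=(-group["weight_decay"] * lr)` form quoted above works out to if it is applied as an in-place add of the weight onto itself. Dropout instead zeroes a random subset of activations for that step and rescales the survivors.

```python
import random

def decoupled_weight_decay(weights, lr, weight_decay):
    """AdamW-style decay: every weight shrinks by the same small factor."""
    factor = 1.0 - lr * weight_decay
    return [w * factor for w in weights]

def dropout(activations, p, rng):
    """Inverted dropout: zero each activation with probability p, rescale the rest."""
    keep = 1.0 - p
    return [a / keep if rng.random() >= p else 0.0 for a in activations]

rng = random.Random(0)
values = [0.5, -1.0, 2.0, 0.25]   # made-up example values

decayed = decoupled_weight_decay(values, lr=1e-3, weight_decay=0.01)
dropped = dropout(values, p=0.5, rng=rng)

print("decayed:", decayed)   # every value slightly smaller in magnitude
print("dropped:", dropped)   # dropped entries are exactly 0.0, survivors scaled up
```

One is a smooth, deterministic pull toward zero on the weights; the other is random masking during the forward pass. Whether they end up with a similar regularizing effect on a given LoRA is an empirical question this sketch does not answer.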
Better than the ones I've seen on civitai I hope 😄
yeah and 99/100, commonly circlejerked scripts are just specialized for people making their furry anime pron with some ultra specific dataset size and expected GPU vram for things like batch count and whatnot
caption shuffling and caption dropout are artificial hacks just added in later too. so is cosine scheduling. cfg is a hack added after the fact. i'm still hung up on that point, like why even make it?
Here
That's literally the last prodigy .json you will ever need
at least for 1.5
well i'm not going to bother trying to convert you over or anything, you do you. prodigy works great, you just have to use it right.
like any tool. tools have proper uses and work for some and don't for others. it's why there are many different kinds of wrenches for the same job.
for instance, there are probably a dozen variables in onetrainer/kohya that literally end up doing the same thing: scaling the learning rate. like an easy one is network dimensions vs alpha. divide one by the other and boom! that's essentially your scale factor. if it turns out to be like 0.1, then whatever learning rate you're using is now going to be lr*0.1. oh and another big key thing is that for prodigy, make sure to set all learning rates to 1.0. not sure if it automatically ignores the settings or not(hopefully), since it handles it all
now you fuck with batch count
and boom
another 0.25 scaler
on top of the 0.1, so it's now 0.1 * 0.25 * lr
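The arithmetic above, as a minimal sketch. The `alpha / rank` multiplier is the standard LoRA scaling; treating it as a plain multiplier on the learning rate is the simplification used in the messages above, and with an Adam-style optimizer the two are not strictly identical. `lora_scale` is a hypothetical helper, the base learning rate is made up, and the 0.25 batch factor is the poster's claim taken as given.

```python
def lora_scale(network_alpha, network_dim):
    """Standard LoRA scaling: the low-rank update is multiplied by alpha / rank."""
    return network_alpha / network_dim

base_lr = 1e-4                                          # made-up example value
scale = lora_scale(network_alpha=4, network_dim=40)     # alpha / dim
batch_factor = 0.25                                     # extra scaler claimed above

effective = base_lr * scale * batch_factor
print(f"alpha/dim scale: {scale:.2f}")
print(f"effective step size: {effective:.2e}")
```

The point is only that several settings stack multiplicatively, so changing alpha or dim without touching the learning rate still changes how hard each step pushes.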
weight decay is actually one aspect i've really tried to read up on before. https://arxiv.org/abs/1711.05101
i've not found anything to suggest that it's incompatible with network drops or caption drops or rank drops. they're all just different tools to affect the network in different ways. here's a paper on network dropout btw. not just a "hack" but actual real research (which in many cases is just hacking with academic funding) https://www.cs.toronto.edu/~rsalakhu/papers/srivastava14a.pdf
yeah, that's the setting you use with prodigy. network dropout isn't the same
You might recognize a couple of those names on it
and that paper is old as dirt now
right. kinda my point. network dropout wasn't just added after the fact
if you knew, why'd you say that?
you can tell it is old because ilya's email address isn't ilya@imrichaf.bomb.com
wow the original dropout paper was Ilya? that's cool
network dropout has always been a thing, been working with AI for over a decade now(simple shit like teaching a network to win a super mario level).
decay ends up working out better in the long run for shit like loras because you aren't teaching mario that he has to move to the right, not the left
within a single pass of the dataset, while training a lora, the network already has a pretty good understanding of which way the neurons need to go
that's the whole point of having back propagation: you don't have to brute force shit now
this is weird because now you're saying network dropout has been around all along and you have lots of experience, but earlier you said this which we just established was not entirely correct.
i got experience galore too, just not in ML. I would craft algorithms for game bots and i did a ton of database normalisation and i lived in flow charts, diagrams, and vim. i'm no summer child. i'm not going to hold any of that up as a reason why what i'm saying is correct though. i rely on source info and manpages for that usually.
network dropout isn't normally included in the official model training papers for architectures, that's why i said that
neither is prodigy
you're confused what prodigy actually does though
it's adamw with an adaptive learn rate. same as adafactor and dadapt. i think it's a fork of dadapt. all basically the same gist, just using different algorithms to figure out their deltas.
we can sit here arguing about this all day, but i've got shit to do. like i said earlier, you do you and think what you want. at any rate, if you use prodigy and want dropout, you only use weight decay with it and do not use regular dropout with it. you might be able to set weight decay to 0 and use network dropout, but i don't think the algorithm likes it as much, so i don't recommend it
ultimately a good adaptive algorithm shouldn't break with network drops. it should be entirely decoupled from that and when the network drops, the adaptive learn rate figures it out. That might be tough though because how do you calculate loss with the network drop before you get there. That consideration might be the root of why adaptive learn rates don't play well with dropout
its not rly worth arguing about, its just a case of going through the papers again
when i'm trying to figure out something, the last thing i want is a magical faith explanation.
I rely on faith a lot but i also like an accurate lens to peer through that veil with.
"just a hack added later" 🙄
sorta like http
also in my experience network dropout is valuable and not worth giving up. weight decay even works well with it. i use that combo with adamw often. i've stopped using so much prodigy and dadapt because when i look at the tensorboard graphs, the lr just levels out at one value
then it's just adamw with a static lr and a little bit of overhead to calculate the non delta adapting every step
I wasn't going to spend 15min typing up a rant about it. Network dropout is kind of an OG carryover from early ML before we started using shit like back propagation. It was there for things like teaching Mario to run and jump toward the right and to prevent getting stuck because of some bad cluster of neurons in the network. When the dataset is known, you just back propagate everything and don't need to rely on vanilla dropout like that since it takes derivatives of every step backwards to figure out exactly what combination of +/- to apply to all the knobs to fit the n dimensional data curve and can then scale the learning rate by that factor as well. You really aren't going to end up with as many "stuck" nodes like you would if you were blindly brute forcing the actual training, that would then benefit from completely deleting the connection. We already know how wide and deep the lora will be, so it's also not going to balloon out of control either
So using network dropout for loras usually just makes shit train longer to keep plugging the deleted holes vs decay handling it more elegantly
yeah this is true
but every time I replaced advice I saw on reddit/discord/youtube with what the papers say, it has gone well LOL
a great example is the samplers and schedulers
which sampler to use and for what schedule is discussed in massive detail over the course of like 100 papers
well gossip vs developed and tested theories, im not surprised
has similarities
VERY LARGE
You see a new model came out, so we are all using it instead now lol
Has anyone tried using it via colab + lora?
I mean flux
SD3 and Flux are kinda the only two in their class
cos they are the only models that have all 3 of the new meta:
- rectified flow instead of diffusion
- big 16 channel VAE
- DiT instead of Unet
- Can someone test my LoRA Flux bc it seems to me that it will be Very Bad (I am writing here bc for flux you need a prompt like sd 3)
It makes sense I guess
auraflow lacks the 16chan yet but i trust they'll get to that. The other two aspects are a BIG deal yet
yeah I seem to remember either from reddit or discord that both the auraflow and pixart models in the future will have 16 channel vae
rectified flow is great cos we don't need the SDE samplers any more
its a lot faster to use ODE
well, unless you use the super slow ODE samplers of course
isn't ODE sampler still in alpha tho?
euler, heun, DPM++ 2M, UniPC are all ODE
oh huh
euler a, DPM++ 2M SDE are SDE, for example
well im using euler as we speak with flux schnell
what about dpmpp 2m
oh nvm you mentioned that
there's only really one rule
and that's no SDE samplers on rectified flow models
so long as you don't break that rule its ok
well its either euler or dpmpp 2m depending on schnell and dev
dpmpp 2m is a lot better
euler is mathematically the worst possible sampler
it still does ok but its as simple as possible
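A toy illustration of "as simple as possible", not a diffusion sampler: Euler and Heun integrating dx/dt = -x from t=0 to t=1 with the same number of steps. The function names and the test ODE are chosen for this sketch only; real samplers step over a noise schedule and call the model where this calls `f`.

```python
import math

def euler(f, x, t0, t1, steps):
    """First-order Euler: one derivative evaluation per step."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        x = x + h * f(t, x)
        t += h
    return x

def heun(f, x, t0, t1, steps):
    """Second-order Heun: Euler predictor, then average the two slopes."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        k1 = f(t, x)
        k2 = f(t + h, x + h * k1)
        x = x + 0.5 * h * (k1 + k2)
        t += h
    return x

f = lambda t, x: -x            # toy ODE with exact solution x(t) = exp(-t)
exact = math.exp(-1.0)

euler_err = abs(euler(f, 1.0, 0.0, 1.0, steps=10) - exact)
heun_err = abs(heun(f, 1.0, 0.0, 1.0, steps=10) - exact)
print(f"euler error: {euler_err:.5f}")
print(f"heun error:  {heun_err:.5f}")
```

Heun lands much closer at the same step count, but it evaluates `f` twice per step, so on a real model each step costs roughly two forward passes. That trade is why Euler can still be a reasonable pick for few-step models.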
yeah but euler is recommended for schnell
you don't have to follow the recommendation though
that's the fun part of the models
this is schnell with euler
most of the crazy samplers I use like Bogacki–Shampine are not "recommended"
let me try that same seed with dpmpp 2m
deis/beta
