#sd3
When you say SDXL, you mean the Base model or a finetuned version?
Sdxl
hyper flux and flux dev (not schnell) can even kind of generate gifs, which is really amazing
What? How? Do tell!
the prompt is: "A consistent side by side four frame image showing consecutive stills from a looping gif moving from left to right. The gif is of a {what the gif is about}"
Nothing else, just a plain prompt.
WOW... OK so then I guess I can crop each panel and paste it into a GIF? I must try it
Yeah the only problem is that sometimes flux might format it differently like this.
try 1024x1024 res? it might just not work at complicated characters, i didn't test them yet.
I think you're right. My subject is too complex
It starts out with the four panels but then it goes off the rails as it converges
try this: "A consistent side by side four frame image showing consecutive stills from a looping gif moving from left to right. The 4 frame very consistent gif is of {prompt} There are 4 frames in the gif image and each is separated by a black line."
This helps me do a bit more complicated characters.
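A minimal sketch of the crop-and-assemble step, assuming Flux really did return four equal-width panels in one row (it won't always, as noted above). The function names are made up for illustration, and the GIF writer assumes Pillow is installed; the black separator lines are not trimmed.

```python
def build_gif_prompt(subject: str) -> str:
    """Fill the four-frame prompt template quoted above with a subject."""
    return (
        "A consistent side by side four frame image showing consecutive stills "
        "from a looping gif moving from left to right. "
        f"The 4 frame very consistent gif is of {subject} "
        "There are 4 frames in the gif image and each is separated by a black line."
    )


def panel_boxes(width: int, height: int, panels: int = 4):
    """Return (left, top, right, bottom) crop boxes for equal-width panels."""
    step = width // panels
    return [(i * step, 0, (i + 1) * step, height) for i in range(panels)]


def strip_to_gif(src_path: str, dst_path: str, panels: int = 4, ms_per_frame: int = 150):
    """Crop a side-by-side strip into frames and save them as a looping GIF."""
    from PIL import Image  # Pillow, imported lazily so the helpers above need no extras

    strip = Image.open(src_path).convert("RGB")
    frames = [strip.crop(box) for box in panel_boxes(*strip.size, panels)]
    frames[0].save(
        dst_path,
        save_all=True,
        append_images=frames[1:],
        duration=ms_per_frame,  # milliseconds per frame
        loop=0,  # 0 means loop forever
    )
```

Something like `strip_to_gif("flux_output.png", "loop.gif")` would do the whole thing; both file names are placeholders.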
My original prompt and the adapted prompt into your suggestion
Nice, I got this too, just a man laughing while water is pouring on his head and the same format.
Sdxl
Florence2/Flux img2img
with and without LongClipL
Wow! I have so much trouble with teeth sometimes.
So, I created a single huge checkpoint that includes Flux.1 Dev, the fp16 T5XXL encoder and the CLIP_L encoder as well as the VAE... 32GB compatible with the Checkpoint Loader Simple core node and a standard ksampler.
I was surprised it worked at all
Here is a result from it.
i did a bunch of that grafting in my preferred t5
then all the other versions came out, GGUF etc.
Any news on 3.1 being testable? It's been some time now.
no, but they got a new CTO last month
url?
So what do you then use for Clip and VAE at all?
if it works in the checkpoint loader node then you just take clip and VAE from that node
"No File"
2 weeks probably
Built in to the Load Checkpoint Simple
I know, somehow Civit futzed the upload 😦
Re-uploading now.
I have a 1Gb/s FiOS but the upload is going at 200Mb/s 😦
Why is CivitAI sooooo painfully slow with large files?
too many merges of the same models being uploaded per second)
2 weeks
gib prompt 
impasto oil painting, dark depressing fantasy art, a medieval knight hugging with arms around a large angry squid, running through town,
lol 31 gigz
32.6 to be exact
It makes life MUCH simpler for me... two nodes replace like 7 other nodes before.
I always use T5XXL fp16, full Flux.1 Dev and the VAE so... works for me and others have also liked it so far.
My bad... it's even bigger
I wonder if there's a node to make checkpoints that can let you use ggufs for the transformer and t5. You could cut that file size in half and have 99.9% the accuracy for the bf16 versions. Would save a shitload of sysmem as well
thanks I love these
so easy for cloud to just do 1 download
actually on second thought, scratch that. you'd still need the gguf addon and the addon would have to be expanded to include hooks into comfy to detect that it needs to be in gguf mode. would probably take a bit of tinkering to save a few node's worth of clicks
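Rough arithmetic behind the "cut that file size in half" idea, as a sketch only. It counts just the transformer, takes the 12B parameter count mentioned elsewhere in this chat, and assumes Q8_0 stores 32 int8 weights plus one fp16 scale per block (8.5 bits per weight). Real files also carry metadata and some tensors left unquantized.

```python
# Bits stored per weight. The q8_0 figure assumes 34 bytes per block of 32 weights.
BITS_PER_WEIGHT = {"bf16": 16.0, "q8_0": 8.5}


def size_gb(params: float, bits_per_weight: float) -> float:
    """Approximate on-disk size in gigabytes (10**9 bytes) for a given precision."""
    return params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    flux_params = 12e9  # assumption: the "12B" figure quoted in this channel
    for name, bits in BITS_PER_WEIGHT.items():
        print(f"{name}: {size_gb(flux_params, bits):.2f} GB")
```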
the typical prompt for Flux Dev
for comparison, an SDXL image from earlier
I wonder if it uses the correct clip L
Apparently there was a reddit thread about a clip L finetune
Im not sure anyone realizes flux has its own trained version of clip L.........
They didn't exactly announce it or draw any attention to it
Just kinda stuck it in the diffusers version and said nothing about it but it's there and it's better
Idk, they never said, but it's different and you get better results with it
Presumably it was frozen and the weights trained against it
So def makes sense to use that over anything else imo
It's def clip L still
The differences are small
They tuned it but didn't overhaul it completely or anything
Yep
In the text encoder folder
You'll see the hash is diff from. The openai one everyone is incorrectly using
They def should have something on their main page about it
I know even comfyanon was using the wrong one lol
Cool yeah you're good then
Yeah def
You want your stuff to get the best possible visibility
Why work hard as hell to make an incredible model and not spend 30 sec telling ppl to download the correct text encoder lol
Tons of images ppl are posting are degraded cuz of using the wrong te
yeah, no doubt
lol
check out those keys
my own
yeah pretty nuts what this thing is capable of with a bit of tuning
or what it's capable of doing right straight out of the box
i'm pretty tempted to slap together a pair of 3090s and go nuts with training experiments
yeah, no doubt
the blur shit though... ugh
i don't get blur ...
managed to pretty much completely get rid of it
unless i want it
pull up dev, drop in a prompt, and stick the word yesteryear at the start of the prompt
it's like writing a song where the entire thing is a drum breakdown with guitars building up to more buildup lol
pro is overfit for fantasy, dogs, women, and anime catgirls. dev suffers with the same thing - for both of them, if there's no clear subject, you're going to get a subject that is one of those 4 things, and sometimes a mixture of 2 or more
schnell doesn't have nearly the problem with that
lol, damn
Yeah and bokeh to hell and back
sure, i'm walking through latent space, finding out where things are at
It basically reduces flux to a single subject model when it does that
Yep good stuff to check out for sure
have found some interesting effects. some interesting areas.
generating directly at these resolutions
no inpainting or upscaling
the ceiling is really high for this thing
Xenolithic rock
and some really strange terms that i have no idea why it knows, but it does. this is the term Xanthate - and it looks almost exactly like what the chemical actually is
can confirm, i'm a chemist, that does look like it lol
now why in the world does it know that, but NOT know centaur?
one second, let me run that
i can't tell if it's smart or glitchy
first one reminds me of this
just a tiny bit
Does anyone have a good Flux workflow with LoRAs and upscaling that they would be willing to share? Or even a link to a page with a good workflow? I could really use the help. Everything I have found is either way too basic or way too over the top.
there should be some in the workflows channel on the L2 discord
L2 Discord? do you have a link for that?
i'll DM it to you
can't post discord links here
what happens if you add the phrase "Yoshiyuki Tomino anime" to the start of the prompt for this one?
will do once i figure out wtf that one was lol
Because there are tons of science images in the dataset. Almost every wiki page for elements or compounds has some kind of image usually. Centaurs are much more likely to be hit and miss because they are mostly illustrated by artists and BFL probably went the safe route of only(mostly) using whitelisted artists
not these...fukcin around with too many experimental custom nodes with halfass implementations, just gonna cause confusion
plain flux dev?
with a realism lora i'm working on
well you're doing great
thx
there's one thing unrealistic about that last image lol
though i guess maybe she was patient enough to just stand there during a snow squall for an hour or two
wearing layers lol
Anyone have a random prompt generator w/f for ComfyUI at all?
DM'd you
I have d/loaded the 32Gb Flux1.DevSingleFile checkpoint by YFG - Flux.Dev output (4.2 s/it) and Flux.DEVSF output (5.7s/it)
I think the larger ckpt just about wins - although there's very little in it!
Florence2/Flux.Dev.SingleFile img2img
I'm getting a "No module named dyanmicprompts.samplers" error
Are you using a dynamic prompt capable node?
Like the ImpactWildcardProcessor?
What I gave you is just a dynamic prompt.
So I could use something in the Impact Nodes?
Let me see ...
It works with ImpactWildcardProcessor
How to get it to reference that list of prompts at all?
They're in the Wildcard folder ...
OK, it's working
Random Automobiles (Mobius checkpoint)
With flux?! If so, AWESOME!!!
Dynamic prompts are independent of the model type.
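To show why that's model-independent: the expansion happens on the prompt text before any model sees it. Here is a toy sketch of the `{a|b|c}` variant syntax only. It is not how ImpactWildcardProcessor or the dynamicprompts package is implemented, and it skips `__wildcard__` file lookups entirely.

```python
import random
import re

# Matches an innermost {...} group, i.e. one with no nested braces inside it.
_VARIANT = re.compile(r"\{([^{}]*)\}")


def expand(template: str, rng: random.Random) -> str:
    """Replace each {a|b|c} group with one randomly chosen option."""
    while True:
        match = _VARIANT.search(template)
        if match is None:
            return template
        choice = rng.choice(match.group(1).split("|"))
        template = template[: match.start()] + choice + template[match.end():]


if __name__ == "__main__":
    rng = random.Random(1234)  # fixed seed so a run can be repeated
    print(expand("photo of a {red|blue|green} {sedan|pickup truck|{vintage|modern} coupe}", rng))
```

The resulting plain string can be handed to any text encoder, which is why the same wildcard lists work across SD1.5, SDXL and Flux.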
Here i thought sd3 onwards didn't accept weights and dynamic prompts?!
I still use them 🤷🏻‍♂️
Time to dig out my old prompts!
First thing I do with every new model...go through all my old prompts
Weights didn't seem to make any difference; perhaps I'll have to ramp them up
girl
I added the clip_l i have been using since SD3. Seems to be fine. I saw a mention of using ViT Clip_L yesterday instead ... Have to look further.
Do you have any side by sides you've done? I am still in bed. 🤭 But will look into it as soon as I have coffee
Sd1.5
How do I use the two encoders from the Flux.1-Dev Repo on Huggingface? Unless they got compiled as CLIP (sorry I do not know the proper terminology) Comfyui compatible checkpoints. I do not know how to use diffusers via Python or otherwise. I have vaguely used them before but I forget how.
Thanks
She is a bit necky
Different than what? It is the same as the SD3 Medium one.
hash is different though
it could just be re-uploaded and it is otherwise the same
what I would say is we need to make sure it is definitely a quality improvement rather than just having an effect that is the same as changing the seed
One of Flux's many foibles - long neck
🤣
Moebius checkpoint using Searge SDXL nodes
Ok so i downloaded the versions (encoders that is) linked in the comfyanonymous Huggingface that he links to in his samples page. I assumed he compiled them from the BlackForestLabs flux repo diffusers.
@dusky thistle do you have any links to this info and a rabbit hole I can dive into? I am always after the highest quality so now you have me intrigued 🤭
gonna switch to the one of the official huggingface space just in case anyway
does anyone happen to know if you need auth token for wget to their huggingface?
in general github does not need auth token and civit does or does not depending on whether the uploader ticked a box requiring login
Flux knows what my old room looked like back in the 90s
Flux is gated... so yeah, you need to use API I think.
or some sort of token mechanism
ok yeah will need token then
it's not
those are the wrong version
https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/text_encoder/model.safetensors this is what you want to use... just name it clip_l_flux.safetensors or something
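On the token question: a sketch of fetching that file with only the standard library, assuming the repo is gated for your account and you have a Hugging Face access token. The `resolve` form of the URL serves the raw file where `blob` serves the web page. I have not run this against the live endpoint, so treat the redirect handling as an assumption.

```python
import shutil
import urllib.request

# Same file as the "blob" link above, rewritten to the raw-download "resolve" form.
CLIP_L_URL = (
    "https://huggingface.co/black-forest-labs/FLUX.1-dev"
    "/resolve/main/text_encoder/model.safetensors"
)


def build_request(url: str, token: str) -> urllib.request.Request:
    """Attach a Hugging Face access token as a bearer header."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def download(url: str, token: str, dest: str) -> None:
    """Stream the response to disk so a large file never sits fully in memory."""
    with urllib.request.urlopen(build_request(url, token)) as response:
        with open(dest, "wb") as out:
            shutil.copyfileobj(response, out)
```

Usage would be along the lines of `download(CLIP_L_URL, my_token, "clip_l_flux.safetensors")`, where `my_token` is whatever token you created in your account settings.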
CLIP_L from ComfyUI HF Repo
Ahh... okie... let me go play. Thanks as always...
ViT-L/14 replacement for CLIP_L
and now for my testing prompt: CLIP_L first (the one from ComfyUI repo)
try having it write a full sentence of text on something
that's one of the better coherence comparisons
the weights were surely trained against the frozen clip L i just linked you to
does anyone know good nodes to use kolors these days
ViT-L/14
Is it better? Looks better to me...
better to test with a lot of text
a cat holding a sign that says "i'm not sure if we've ever been using the correct version of clip L"
try something like that
you just can't tell very easily from an image like that
there's too many things that end up being subjective
text is right or wrong
Just downloaded the Clip_L from BFL HF repo as proposed... here are the hashes for the ComfyAnon version vs the BFL version if you want to know
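For anyone who wants to reproduce a hash comparison like this, a minimal sketch using only Python's standard library. The function names and any file names you pass in are placeholders.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so multi-GB checkpoints don't need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file(path_a: str, path_b: str) -> bool:
    """True only when both files are byte-for-byte identical."""
    return sha256_of(path_a) == sha256_of(path_b)
```

Matching hashes prove the bytes are identical. Differing hashes only prove the bytes differ (a re-save or a different storage precision is enough), not that one file gives better images.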
OK here goes...
CLIP_L ComfyAnonymous Repo
oh no
bfl repo has both clip_l and t5 in bf16 as opposed to comfy's fp16
but the t5 is in diffusers format, no?
the comfyanon one is the sd3 one
it's the same thing, just split into a couple parts
you can merge them all together and use in comfy
i'm gonna guess something's not ideal with your sampling/scheduling/shift
Ah ok... how can I do that?
lemme see if I still have the script for it
Here is my workflow (the key parts)
And here is the ViT-L/14 version
ViT_L/14 is certainly getting long text better, no?
This is the repo for the finetuned ViT-L/14 for Flux
I can't find the script but this model matches the sha256 of mine
https://huggingface.co/city96/t5-v1_1-xxl-encoder-bf16/blob/main/model.safetensors
how did you verify that they are the same?
I just looked at the keys
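Looking at the keys doesn't require loading the model. As I understand the safetensors layout, a file starts with an 8-byte little-endian header length followed by a JSON header that lists each tensor's dtype, shape and byte offsets. A sketch under that assumption, with made-up function names:

```python
import json
import struct
from collections import Counter


def read_safetensors_header(path: str) -> dict:
    """Return {tensor_name: {"dtype", "shape", "data_offsets"}} without reading weights."""
    with open(path, "rb") as handle:
        (header_len,) = struct.unpack("<Q", handle.read(8))
        header = json.loads(handle.read(header_len).decode("utf-8"))
    header.pop("__metadata__", None)  # optional free-form metadata, not a tensor
    return header


def dtype_summary(path: str) -> Counter:
    """Count tensors per stored dtype, e.g. how many are BF16 vs F16 vs F32."""
    return Counter(entry["dtype"] for entry in read_safetensors_header(path).values())


def key_differences(path_a: str, path_b: str) -> list:
    """Tensor names present in one file but not the other."""
    keys_a = set(read_safetensors_header(path_a))
    keys_b = set(read_safetensors_header(path_b))
    return sorted(keys_a ^ keys_b)
```

Identical keys and shapes say the architecture matches. They don't say the values match, which is what the hash or an actual tensor comparison is for.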
I'm confused cos Clown said the hash is different
I'm talking about t5xxl, not clipl
yeah the t5xxl you can grab whichever
hashes will be different between fp16 and bf16, yes
technically bf16 should be a bit more accurate in both cases (clip and t5)
but the differences are tiny
it's pretty visible
i don't know if they've retuned it or what
there's not exactly any discussion or mention of it anywhere
i just grabbed their stuff day one and didn't look into it any further
a few ppl have done comparisons and gotten significantly better results with the one from the flux repo
I'm honestly not sure cause I've been using their checkpoints from the start too
never bothered checking fp16
what i can say is with cascade, i can't even see a pixel level difference between bf16 and fp32
but here it looks pretty different with their official one vs the sd3 one comfy links to
Downloading now... let's see which one I got. I bet I got the one from ComfyAnonymous repo.
you can just load the model you got with comfy and it should show precision iirc
Is that a node that displays this? Or on the console?
in the console as it's being loaded
one way to think of it is like what language and accent you grow up learning
someone who grew up in minnesota will still be able to understand someone who grew up in ireland
all three above using the t5xxl from ComfyAnonymous Repo.
Will test now with the one from city96
but it's better to use the language and accent they became familiar with as they grew up
wrong language = wrong file entirely
wrong accent/slang/whatever = gonna get reasonably close, but things will be a bit off
what flux model is that? i get this with flux.1 dev taken from their repo:
whatever encoder they trained their model on is the one you should use
finetuning clip doesn't really make a lot of sense imo unless you then freeze them and tune the weights again
download t5 from google's repo and extract the text encoder layers if you want to be 100% sure you're getting the best of the best x)
Flux.1 Dev from repo. Using Diffusers Loader... BFL repo that is... my seed is 1234. 16:9 A:R, Euler/Beta/3.5 Guidance
My current CLIP folder ...
yeah there's no visible diff from their t5 and the google one
LOL I wish I could have a nice full clip folder like that
on cloud I always start from empty
sadly diffusers doesn't support euler beta, but with flowmatcheuler, seed 1234, 3.5 guidance
just get a sacrificial 4090 lol
smear some syphilitic electrons on it to break it in
I can't, cloud is too addictive
today I can see 8x A40 45GB for under $2 per hour
its such crazy value
Not sure what you mean by "it's not compatible" because clearly mine is working... unless I missed something?
no, I'm using diffusers python library, not comfyui diffusers loader.
that's gotta add up
Gotcha
btw this is the new finetuned clip, right?
25 hours a day for me
Yeah, someone posted it on Huggingface... I linked it above...
was looking at it earlier too but it's kinda hard to say if it's obviously better
it just doesn't make any sense to tune a text encoder after the weights unless you're fuckin around
yeah, so far it is just different, not sure if better.
and here are the two versions of T5XXL ... ComfyAnon on the left, City96/BFL version on the right
the best thing to do is to have like 10 different things where there's a clear right way for it to be generated, and a clear wrong way
then spam out at least 10 seeds for each
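That protocol is easy to lay out as a job list, so every encoder gets exactly the same prompts and seeds. A sketch with hypothetical names; the right/wrong call per image is still a human judgment (for example "is the sign text spelled correctly").

```python
from collections import defaultdict
from itertools import product


def build_test_grid(prompts, encoders, seeds):
    """One job per (prompt, encoder, seed) so every encoder sees identical inputs."""
    return [
        {"prompt": prompt, "encoder": encoder, "seed": seed}
        for prompt, encoder, seed in product(prompts, encoders, seeds)
    ]


def pass_rates(judged):
    """judged: iterable of (encoder, passed) pairs, one per generated image."""
    totals = defaultdict(lambda: [0, 0])  # encoder -> [passes, runs]
    for encoder, passed in judged:
        totals[encoder][0] += int(passed)
        totals[encoder][1] += 1
    return {encoder: passes / runs for encoder, (passes, runs) in totals.items()}
```

With 10 prompts, 2 encoders and 10 seeds that is 200 images, which is about the scale where a small consistent difference stops looking like seed noise.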
seed now 9999
Above is CLIP_L from BFL and T5XXL from BFL
Flux1.Dev / Flux.Dev.LARGEFILE
what clip L did you download?
Slightly less detail in the LARGE FILE version!!!
https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/text_encoder/model.safetensors this is the official one
are you using that
t5xxl_fp16/clip_l
I only see a few pixels difference...
kohya's sd-scripts is working really well for me
Realistically, its not got a lot going for it
OneTrainer appears to support it now as well - i had horrible results with my one test, but i'm sure i screwed something up
there is a new sampling method on arxiv for faces
that no one has ported anywhere yet
it might be amazing combined with existing stuff like ip adapter
What do I rename this to at all?
Photoshop says this is the pixel diff between them
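The same check can be done without Photoshop. A sketch that counts differing pixels in two raw RGB buffers of equal size; the function name is made up. With Pillow installed, `Image.open(path).convert("RGB").tobytes()` is one way to get such a buffer.

```python
def count_differing_pixels(rgb_a: bytes, rgb_b: bytes, channels: int = 3) -> int:
    """Count pixels where any channel differs between two same-sized raw buffers."""
    if len(rgb_a) != len(rgb_b):
        raise ValueError("images must have the same dimensions and channel count")
    return sum(
        rgb_a[i : i + channels] != rgb_b[i : i + channels]
        for i in range(0, len(rgb_a), channels)
    )


def max_channel_delta(rgb_a: bytes, rgb_b: bytes) -> int:
    """Largest absolute per-channel difference (0 means the images are identical)."""
    if len(rgb_a) != len(rgb_b):
        raise ValueError("images must have the same dimensions and channel count")
    return max((abs(a - b) for a, b in zip(rgb_a, rgb_b)), default=0)
```

A nonzero count with a tiny max delta points at rounding-level differences. A large delta in a few regions points at the two runs actually diverging.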
The trouble with art - both look pleasingly good!
I renamed mine "clip_l-BFL.safetensors"
So there's no right one and one wrong one
i renamed it to clip_l_flux.safetensors myself
https://arxiv.org/abs/2408.02078
Yeah, for me it has to be a glaring quality difference ... otherwise it is somewhat subjective.
with text encoders, there are right and wrong ones
the weights are trained to interpret the results of the text encoder
if you use a different text encoder, it's going to be working with something it wasn't trained on
the difference will usually be small, often be subjective, but it will be worse as a whole
if you want to see how big of a deal it can be in a more obvious scenario, try training a lora where you train the encoders separately (which you should, with sdxl for example)
Can you share your side by sides showing the differences you mention? I see no clear differences in favor of CLIP_L from BFL vs the version from ComfyAnonymous repo.
Maybe your tests are more obvious than mine. I only see minor differences but not yet always better differences
i don't have em handy
other ppl ran tests a while ago
but you don't need tests for it
try training a lora where you train the unet/transformer/weights first
then, in a second run, train the text encoder
So how can I judge otherwise? I mean, I have not seen the major differences you are suggesting 😦
then, do it the other way: train the text encoder only, then train the weights against that
the results are quite a bit better
Running with clip_I_BFL
well it's not the way any model is trained to train the text encoder after the model
it's not really a different text encoder though, it's the same text encoder in different precision, the differences will be minor at best, it's like a rounding error
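To put a number on "rounding error": fp16 keeps 10 mantissa bits and 5 exponent bits, bf16 keeps 7 mantissa bits and 8 exponent bits. A stdlib-only sketch that rounds a value through each format; the bf16 helper is hand-rolled (round to nearest even on the top 16 bits of a float32) and ignores NaN and infinity.

```python
import struct


def to_fp16(x: float) -> float:
    """Round-trip through IEEE half precision (raises OverflowError above ~65504)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_bf16(x: float) -> float:
    """Round-trip through bfloat16 by keeping the top 16 bits of the float32 pattern."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)  # round to nearest, ties to even
    bits &= 0xFFFF0000  # drop the low 16 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


if __name__ == "__main__":
    for value in (0.1, 1.0, 3.14159):
        print(value, to_fp16(value), to_bf16(value))
```

For values in fp16's range, fp16 lands closer to the original; bf16 trades that precision for float32's range. Either way a single weight moves by well under one percent, so this alone says nothing about whether the two CLIP_L files hold the same weights.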
idk, i never saw a difference between bf16 and fp32 that's even close to that
if they didn't tune them somehow, then something weird is going on
it's not impossible that for some crazy reasons i don't understand this new tuned version really will lead to better results across the board, but i wouldn't trust that without hundreds of test images
neither clip_l nor t5xxl were trained along with the flux transformer, it's using the default models from other repos
ya know what... i might just go manually recast the openai clip L to bf16 and see if that looks identical
clip_I / clip_I_BFL (both Flux1.Dev.SingleFile)
make sure you don't just cast everything tho, some of the weights are in fp32, I mean in fp32 even in the fp16 model
yeah putting Deepfake in the name of your Arxiv paper is wild
I actually didn't like the results in the paper
but I am very optimistic because these are the results before IP adapter has been applied on top
Her left earring larger and brighter in the clip_I_BFL version
that's what is expected - that's where clip_l's strengths lie... in the fine details, the ambient and background stuff, the artsy stuff
yeah i'll just recast everything to bf16, for both versions
quick n lazy and if they really are different it should be obvious
it was published at the beginning of august, when "the word deepfake is bad" sentiment hadn't really gained much traction. but with the california stuff in such high profile, now you don't want to even think that.
deepfake in the title... omg lol
that word has had a really bad reputation in my social circles for a while at least
DeepFriedMarsBar
texas state fair
TexasDeepStateFair
yeah. they'll fry anything and sell it at the snack bars there
LARGE one has better defined circles on her dress/chest area IMO. Or whatever those black round things are lol
Yes, when it comes to arty output - each choice has its strengths - there is never really a right one and a wrong one
Its better testing on the photoreal
where you seeing these prices? i'll prolly have to go throw some training jobs on that at some point, that's perfect for fp16 loras
ye that's the thing, local diffusion inference can be done locally but I would be pushed onto cloud for inference of other types of model, or most types of training, anyway
oh, small gpu?
SingleFile and clip_I_BFL
I accidentally selected two CLIP_Ls in the Dual Clip Loader instead of T5 and CLIP_L and got this LOLOLOL
SD3 uses clip_l, clip_g, and t5xxl - flux just uses clip_l and t5xxl. I wonder what you'd get if you switched out flux's clip_l with sd3's clip_g
hmmm
Error...
Trying that now ...
I love how even botches look good now
some of my favourite images I made are full-on botches
wut? it worked at all?
i wonder if it just used the t5 you had selected before
or nightmare fuel perhaps
Clip_g - Size Mismatch error!!!
drat. well, worth a shot
🥳
Hmmm... I cleared the caches...
True...
That's a lot of cheese that error generates!!!!!
Ok, well, they are the exact same size in number of bytes, and I get identical results, whether I use the one downloaded from BFL's Hugging Flux page or the SD3 page.
What about the one in the ComfyAnonymous HF Repo?
BFL and Comfy one are different SHA256
Couldn't say. I always used the ones provided in the official Hugging pages. Ex:
The only clip I use that is different is the 32-bit T5 clip in GGUF format
As per Clownshark, the one from BFL is a slightly better choice as it is different SHA256 than the one from SD3
Well, I will make 3 images, and if I see a difference by even a pixel, I will look. My first sample is identical.
I'll try one of those ultra detailed ones in that TomsGuide article
My use of the 32-bit and 19GB T5 file is questionable. Meaning it is unclear there is any real benefit
19GB T5? Oh boy, I want that one LOL.
Can you link the link here?
They load separately. I mean, if I can use it with my pitiful 8GB Vram, I am sure you can handle it
OK so I then assume the t5 is loaded, used, dumped, Clip_l loaded, used, dumped, Flux loaded, used, dumped?
I thought all had to be in VRAM at once. Or are T5 and CLIP loaded in regular RAM?
I must assume so, and even then, the GGUF is loaded in blocks, not the entire file. Otherwise it would be quite impossible for me to use it with this laptop 4060
That is the power of the GGUF models
Heh, reminds me of that famous scene from Close Encounters
Spielberg was in full genius mode then. Don't know if you know the story, but while he was filming Jaws, every night he was writing Close Encounters in his trailer. One night he shows the 2/3 done script to Richard Dreyfuss, and he was so blown away and enthusiastic that Spielberg rewrote the character (originally in his 50s) so Dreyfuss could take the role.
I did not know that.
Yeah, you'd think filming the first blockbuster in history would be enough, but no, the guy has to be writing a new brilliant script at the same time.
I'd like to think the sign has little paws for people to put their cats behind them and take pictures
:P
well It could be just a sign on the ground but that works >.>
make a little sign like that and put a bowl of food behind it
so when the cat is eating you see him holding it lmao
Any news about SD3 2B refresh?
In a couple of weeks
The Matrix, 100 years before
oh my... 🤤
in a fortnight
Honestly, with how models like flux have shown how well quantization works with dit models, even all the way down to Q4 or Q5, along with fp8, int8, etc, I'd almost bet that SAI will put out a much larger model like 6b or 8b now and just give a bunch of directions and options for quantizations so people can use the version that works with their hardware
I doubt it. SAI might be done for good. The silence is very loud IMO
Transitional periods do that, they might have started a new model from scratch to compete with flux
Remember Blockbuster? Remember RIM/Blackberry? Remember [insert blank] ?
well, new CEO, new CTO, trust and safety person gone, there is action
i cant wait to play with the new model while i wait ill go play Concord with the boys
awh i cant i already got my refund
it's not dead, it's just a transitional period
when exactly did this new CTO and CEO take office?
CTO last month, CEO like 3-4 months ago, new board members etc.
very stable
Diffusing the situation?
well, the point was more: were they part of the botched release, or trying to fix it, or moved on and will try to best forget it.
Yeah, it will come back out in six months as a f2p game with overly aggressive mtx and BP, and still flop again lol... That or they cave to the degenerate gooner crowd and put waifu characters and skins in the game
Myeah, say what you will, but the 2B model has serious issues
they did have waifus, it's just that they were the 600 pound inflation fetish type of waifus
Not as stable as a system update from Nintendo.
and even without being hung on NSFW, I found my images with people completely useless
Ahh I didn't know that. Never played it or even watched a trailer for it, but I heard a ton about it in the news
well it was a soulless overwatch copy with generic characters and it wasn't free2play so that killed it
they might think they have to do this, but this isn't true. I like SD3 much better in many situations than any other available model
What issues?
How the heck did the AI figure this out? WOW
oh heck yes!! Imagine the parties
Cascade?
tuned flux
cascade remains the best for artistic styles so far, but the photo abilities of flux are next level
just needs some tuning
If you look close, you will notice the grass is not the right shade of green
Is that what it is?
This has SOOO much potential for signage and memes
I didn't say whether it was true or false, I was just speculating and throwing ideas out. Sd3 is a solid model in a lot of ways, but since the vast majority of people are just going to use dev/schnell, SAI will need a model that can actually compete to draw users back. Even huggingface was completely blown away at how quick flux took off.
her hair is combed against the part in an unnatural and unconvincing way
You're kidding...
the main issue i've noticed with flux that sd3 also has is a grid-like pattern of lighting artifacts at resolutions beyond 1024x1024
one, flux is actually useful for most things, whereas 2b just has no idea wtf to do most of the time and makes images that are so flawed they're unsalvageable
two, flux is much, much easier to train
i haven't had that experience
nor any real major success with training like i have already had with flux
put a lot of time into it... it's just so stiff
flux is just a lot bigger and/or better trained
flux has zero issues with hands, 2b is a disaster, just as one example
and text
Absolutely. And the devs who left and reported they were instructed to not deliver or train a good model and to stop development that was well underway. They were all lying
yeah no doubt that whole release was a messsssss
and they deliberately were told not to deliver a good model
honestly, if they'd just release 8b, they'd be fine
but they gotta do it before the ecosystem develops fully around flux
good controlnets apparently cost like $40k to train
it's gonna be hard to convince ppl to bankroll that for a lateral move from one model to something that's at a similar level
No, that is not what the devs who left said
so if they wait too much longer, those tools will exist for flux and not for sd3
and it'll stay that way, maybe for good
yeah idk... matt3o was going to train a higher resolution clipvision/ipadapter for sdxl
and then funding got pulled cuz of flux
You think Flux happened by accident?
sai would have to help make it happen, if they release their updated sd3 after a lot of good cnets etc drop for flux
yeah i don't think they'd say to trash their reputation on purpose lol
i think the whole thing was a trainwreck of bad communication
yea, i think 4b was his
Sorry, let me extend that phrase. The SD3 4B model HE WAS TRAINING
plug got pulled for whatever reason
I wonder who these Flux devs are. They are awfully clever. Stability should try to hire them
The 4B model was not cut because of money
It was cut because of a change of direction
The SD guys who founded BFL and made flux and who left Stability, were university researchers. They did not leave because of lack of money either. In fact, the very fact they produced Flux, from scratch in a handful of months shows how ready they were.
mama told me they stole the model from sai and published it under diff name
lol
It is not a little telling that after having all the good public open source projects to be delivered to the community being cancelled, and only a broken 2B model, that they deliver a monster 12B model
It is saying "That broken 2B model you guys got? Wasn't our idea. We were against it and hated the idea of a gated 8B model to just let you wish. Here is a 12B model, the best we can do. Go crazy."
And a few weeks later they close a monster deal with Elon Musk
Side?
pony side
It is the absolute opposite for me. SD3 is good, flux is great... no question. Nothing at all fragile in my experience.
too late ... unless it is orders of magnitude better than flux, why would anyone spend all that time, money and energy on a SD3 that is just similar to Flux?
Variety is the spice of life. But honestly I'd enjoy playing around with SD3 Large/ultra if it weren't so overpriced
right now, the only commercial getting my hard-earned pez is Ideogram 2.0
that's the issue
right now, the big advantage sai potentially has is developing a small model that can do most of what flux can do
but it's gotta be able to outperform the quantized stuff
BTW, this LoRA is quite nice: https://civitai.com/models/705130/claude-monet-flux?modelVersionId=788725
my hair irl 
In my 20s, my hair had that weird pyramid look like one of the Jesus images. No matter what i did, it would spread to reach the ends of my shoulders in waves
wait a min thats not anime
Oh crap. My Bad. Worse: the people have clothes on
(lowers head in shame)
By itself, Flux knows Monet... but not his style ...
yeah
I liked when my hair was at shoulder length
not quite there
I figured you were hairy all over... but maybe you're one of those Sphynx cats
Not quite. Actually, not even close... There are things Photoshop does that no other program does.
If you mean SAVE AS JPEG, sure, MS PAINT can do that.
There are also licensing constraints. AWS has their own Slack/Teams/Zoom thing as does Verizon... they all suck dirty ass but hey, someone HAS to buy it because they can't use Microsoft or whatever due to licensing or conflicts
Nah, it's not... It likely has quite a few major changes in the architecture and was also likely trained from scratch. You aren't a part of either dev team,
Don't spread BS like that lol
Testing waifus != Knowing how they're training a model
Again, making some images with an API doesn't mean you know anything about what's going on behind the scenes
Leads me to a question I've had for a bit:
Are there any such things as Easter Eggs in any Diffusion model these days? Would be awesome to have that. I miss Easter Eggs.
so you had a local copy of the model and intimate knowledge of all of the training methods/scripts/dataset/etc to be able to boldface lie and say that BFL's flux models are SD3, just "bigger?"
you're trippin bro... calm down lol
so you used an API...
if you didn't have direct access to the files and data, without being gated, you were using an (A)pplication (P)rogramming (I)nterface to use them...
that you had to access...
Yeah, you're describing an API... Your comfy client would be granted access to the files for use, that weren't locally accessible
And being faster!
So stop lying and acting like you're in some kind of "know" about the actual things/training processes that went into the models behind the scenes and stop claiming that Flux is just SD3 but bigger and that they just took it and ran with it lol... It's BS and making up that kind of BS does more harm to the community than good.
I am not paying a single gram of attention (1 cubic hair length of toes in the air for americans) to discussions in this channel anymore, have more ugly cat 
Well you can keep deluding yourself however you want, but it's ultimately up to others reading these messages to formulate their own opinions. I'm just calling it out and dispelling BS rumors/conspiracies as I see them...
Flux != SD3, so stop saying that.
well you sure seem hot and bothered by it, but you can't powertrip here and it likely drives you nuts now (saw someone on the subreddit mention that you're a mod there now and that you've been powertripping or something)
Ollama Bin Cattin'
calling someone out on lying isn't power tripping. you're just embarrassed that you got called out. so again, stop lying and trying to act like you're in the know of how these models are made and stop claiming flux is just a bigger sd3.
touché
where's the alleged proof that flux = sd3...
and i already proved that you didn't have access to the models/training/etc etc
cool, salty devs make false claims all the time
well i saw the mod drama on reddit and i can confirm it was juicy
i didn't really read any of it, but i know it was a thing
so here are two images, same prompt and seed. The first is plain Flux, and the second is with the Claude Monet LoRA.
not bad, i like it
that's pretty cool.
surprised how well it works even on all the shit in the background
mimics traditional art where you desaturate subjects further away and all
Yes, it is a well-done LoRA. I was impressed. https://civitai.com/models/705130/claude-monet-flux?modelVersionId=788725
donald trump is fighting with narendra modi and both have punching gloves
On twitter? or on live TV?
on facebook
not good they both are fighting with each other
that was yesterday,today they kiss
this is more powerful
add that modi has hit trump's face badly and now it is bleeding and his teeth are broken
Sd1.5
CivitAI LoRAs for Flux just exploded. In the category of Style alone there are already thousands of LoRAs. I don't say thousands as a euphemism for a lot. I mean literally thousands.
I managed to hide the seams lol
this is 3 images
seams happen in the background so the first pass uses depth maps to blur the shit out of the background
now the seams are gone
wow there are thousands?
that's amazing
No joke. I was trying to see them and got tired of the endless flow. So to simplify, I asked it to show me all LoRAs with the letter A, in the category of Style
it told me thousands
I was quite shocked
I had no idea
This isn't the sum of all the types, just kinky, or anything. No I asked only for the category Style
it was ENDLESS
I tried to skip flux and focus on SDXL but I think flux is too good to skip
cos it clearly trains well
And this is after only a single month. Less if you mean the time it has been trainable as it is now
I was thinking
if the model gets smart enough with finetunes and tooling
we might just not need negatives
after all negatives do divide your speed in half
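For context on the "speed in half" point: with classifier-free guidance, a negative prompt means the model is evaluated twice per step, once on the positive conditioning and once on the negative, and the two predictions are mixed. A minimal sketch of that, assuming a hypothetical `model(x, prompt)` callable that returns a list of floats (not any real library's API):

```python
def guided_step(model, x, positive, negative=None, cfg_scale=1.0):
    """One denoising prediction with optional classifier-free guidance.

    `model` is a hypothetical stand-in: any callable (x, prompt) -> list of floats.
    """
    cond = model(x, positive)  # first model call, always needed
    if negative is None or cfg_scale == 1.0:
        # No negative branch: one model call per step.
        return cond
    uncond = model(x, negative)  # second model call, this is the extra cost
    # Standard CFG mix: push the prediction away from the negative branch.
    return [u + cfg_scale * (c - u) for c, u in zip(cond, uncond)]
```

Without a negative (or with the scale at 1.0) the second call is skipped, which is where the roughly 2x saving comes from.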
Myeah, there are things you might want to be able to control. Background blur, or 'bokeh', is one
there's nodes I like that need negative
SAG blurs the subject of the image and pushes the model away from blurry subject
PAG does something similar but with degraded subjects
but if the model is smart enough we can get by without these
Yes, you can try a positive with camera lens references, but Flux has been known to add blur even in non-photo scenarios
IP adapter takes negatives as well
is probably a better example
of negatives that are not prompts
yeah checkpoints will have to fine tune in more depth of field
I assume it will be the same situation as with SDXL where Jugger funded by rundiffusion.com and Realvis funded by Mage.Space are dominant
But this also shows just how appreciative the community is for the gift of this incredible 12B model. A greater demonstration of love I cannot think of
yeah lora is legit the best way to show appreciation
it kinda shows that people were hungry for something too
exactly
All the quantization models, the LoRAs and so on really highlight how much it is appreciated
I'm looking at some of the character loras
it's learning faces better than any model I've seen
and it doesn't even have a proper IP adapter yet, it's just straight out of the model
When it came out, the naysayers were desperate to paint it as a bit of a unicorn: untrainable, requiring an insane amount of memory, etc. Now look at it
it's a bit of a complex issue
there are still some costs to it being distilled, namely the worse guidance and the lower image diversity
and then the second issue that the base model was slightly overtrained on one aesthetic
but since the image quality is very high it is still a useful model for now
I think I'm staying on SDXL and waiting for the next Pixart for now
was a good sign that they are backed by Nvidia now
A wise decision. It also protects you
паз лейбл
I started using TCD lora
it's a bit like lightning, it's really good
it's the latest one, as far as I know, possibly hyper came slightly after
TCD can do both low steps as well as high, which is cool
yeah I worked out today that with SDXL you get at most 5 passes
before the VAE compression kills you
but 5 passes is loads, that could be 5 different models
oh i don't mean within one wf
i just mean, in general
good to use multiple models, multiple samplers, multiple methods, multiple trainers, etc
ah yeah that's what I thought you meant
I started doing composite so each panel could be a different model like this image
just have to use nodes that help hide seams
multidiffusion, tiled ksampler or mixture-of-diffusers help
I always hate when people post Ultimate Upscale images and you can see the grid lines
best thing to do is use overlapping gradient masks with differential diffusion
seams are impossible with that tiled approach
oh thanks will try that
my method still had seams it just hid them under a plant lol
lol
yeah the idea is to do something like... for example, say you are upscaling to something that is 1024x1536
you'd have two 1024x1024 tiles
half would be solid white masked, the other half of it would be a gradient to black
then you flip it around and do the other side
so the total mask value is 1.0 across the board and no more no less
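The mask layout described above can be sketched like this. It is a minimal illustration, assuming a 1024x1536 canvas covered by two 1024x1024 tiles that overlap by 512 px along the long axis; the function names are made up for the example, and only the 1D weight profile along the tiled axis is computed (each weight would be repeated across the other axis to get the full mask).

```python
def tile_mask_profiles(tile=1024, overlap=512):
    """Per-row mask weights for two overlapping tiles (hypothetical helper).

    The first tile is solid white (1.0) outside the overlap and fades
    linearly to black (0.0) across it; the second tile is the same
    profile flipped around.
    """
    if not 1 < overlap <= tile:
        raise ValueError("overlap must be in (1, tile]")
    ramp = [1.0 - i / (overlap - 1) for i in range(overlap)]
    first = [1.0] * (tile - overlap) + ramp
    second = first[::-1]
    return first, second


def total_coverage(first, second, overlap=512):
    """Sum both masks on the full canvas; every row should come out at 1.0."""
    tile = len(first)
    total = [0.0] * (2 * tile - overlap)
    for i, w in enumerate(first):
        total[i] += w
    offset = tile - overlap  # where the second tile starts on the canvas
    for i, w in enumerate(second):
        total[offset + i] += w
    return total


first, second = tile_mask_profiles()
coverage = total_coverage(first, second)
print(len(coverage), min(coverage), max(coverage))
```

Because the two ramps are mirror images, their weights add up to 1.0 at every row of the overlap (up to float rounding), so no row is blended more or less than any other.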
These are all Flux, and all without severe bokeh
ah this sounds great
I've been trying to learn noise injection properly cos I've been doing it "wrong" the whole time
doing noise injection using k-sampling, with correct sigmas, is better than what I do
if you add noise in pixel space and VAE encode it doesn't quite work properly
blepping showed me on the civit discord. VAE encode cannot make a latent with high enough values
Ahh yeah... I've always done that stuff in latent space so I hadn't checked on that, good to know
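A rough sketch of what noise injection in latent space looks like, assuming the k-diffusion style convention where the noised latent is `latent + sigma * noise` (flow models such as Flux use a different mix, so this is only the SD1.5/SDXL-style case). The latent here is a plain list of floats standing in for a real tensor, and the sigma of 14.6 is an illustrative value near the top of a typical SD schedule, not something taken from this conversation.

```python
import random
import statistics


def inject_noise(latent, sigma, rng):
    """Add Gaussian noise scaled by sigma directly to latent values.

    `latent` is a flat list of floats standing in for a latent tensor;
    `rng` is a random.Random instance so results are reproducible.
    """
    return [x + sigma * rng.gauss(0.0, 1.0) for x in latent]


rng = random.Random(0)
clean = [0.5] * 10000
noisy = inject_noise(clean, 14.6, rng)
# The spread of the noised latent scales with sigma.
print(statistics.pstdev(noisy))
```

At high sigma the latent values spread far wider than anything an encoded image produces, which fits the point above that adding noise in pixel space and then VAE encoding can't reach the same range.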
Sd1.5
Florence2/Flux img2img with SD3 input images (SD3 first, then Flux)
Florence2/Flux img2img
TEST
Florence2/Flux img2img
Too many ads, didn't make it past the first paragraph
excellent, so SD3 8B weights are never seeing the light of day
welp, time to wait for more flux painting loras
Try that instead. What I do when I encounter this problem of too many ads is I save it to my Pocket account, which is free by the way, and it removes all the ads
You mean more than the thousands that are already available?
That wasn't a euphemism. There are literally thousands already available
At least on Civit
It came across that SAI aren't completely sure of the way forward; and have a touchy-feely attitude to the way forward?!
There was a certain amount of reserve in the tone of the article.
One interesting tidbit was that they won't add SD3 Medium, but rather SDXL instead
that is notable yeah
Make your own
Or just flux them down the toilet
If ClipDrop is 8B weights, then its photorealistic output is phenomenal
good idea, I'm gonna train my own 8B MMDiT
Bc most people only want to read articles with drama, so it's always spun that way.
It's the lack of drama in the tone of the article which is more surprising - perhaps the journos are aware of SD3's history of false dawns?!
