#sd3
I've just tried a Satyr - he ain't half got a massive amount of pubic hair!!! Perhaps not best for on here ...
styles
is the comfy all in one flux any better on ram or same as the fp8's separate, probably same ya?
Wow, flux does taurs way better than any other model!
too lazy to fix the img, but it wouldn't take long
Female Satyr - not really!
It's forgetting to add goats legs - the horns are present ... ?!
I was doing painterly-style cards too, but I gave up. They were just turning out way too steamy.
Perfect!
Monster Card Series V2
23 - Centaur
My female satyrs are quite racy!
are loras and checkpoints even going to be needed anymore?
whatever he posted, he deleted it
satyr only has to have a tail and a hardon, no horns, hoofs etc. that's all newer depictions
I've been thinking about it since yesterday, and I still don't have a solid usecase for either. Definitely controlnet though.
Flux does not realise that a pair of goat legs is needed to give the complete Satyr!!!
I remember Faerie Queene mentioned "backwards-bent knees" so not too recent. Maybe you mean Greek pottery paintings. Don't remember those in detail.
But a girl satyr will not have that one thing in particular.
perhaps some Claude 1st to help?
I prompted a satyr and got a nude lady with horns (ps flux can't do nips to save its life)
I'm beginning to understand that a Satyr has a prodigious amount of tumescent flesh!!!!!! Mr Tumness was tame
ty glif and gpt4
Been thinking of Legacy lately. Stability legacy now roosts at BlackForest Labs. I thought that the new CEO of stability had Weta legacy, the studio that worked on Chronicles of Narnia. But he was CEO of weta digital. A software offshoot of weta workshop that he built up only to sell off to the highest bidder. Unity bought it and now are squandering it. Typical enshittification. He came on, puffed up the value then had the company acquired, and left. Now he's brought on to stability.ai . hrm. legacy.
Where is BlackForest's Discord???
good question
Flux is interesting lol
Not even penguins escape the furry 
interestingly enough, last week the antarctic had a catastrophic warming event. small pockets were warmed to 30 degrees. nobody knows why.
The sun felt hotter that week imo. sometimes i think global warming isn't entirely because of CO2. But that makes me a climate denier or something and people lock me up
i'm kind of a sun worshipper and accept that it is the supremely dominant entity in our world. 99% of the entire solar system's mass. right there in that big hot thing.
crazy colored chaotic cartoon style, cow being abducted by an alien spaceship, blue volumetric spot light on the cow
flux-schnell
ohhw, wait. I didnt read that correctly

crazy colored chaotic cartoon style, spaceship being abducted by a cow, blue volumetric spot light on the spaceship, cow floating on top, spaceship on the ground
That is a solid cow abduction illustration though. applause
can be both
could be. But the sun is one big bitch.
I'm gonna get these printed. This is the first time I've been sure the quality was good enough to warrant it.
An illustration of a farm landscape at night time. A cow hovers in the air at the top of the image. Emitting from the bottom of the cow is a volumetric blue light cone. On the ground within the light cone is an alien UFO with landing gear extended. it looks as if it is being lifted off the ground. @bitter hearth
I need a workflow, Flux main character, then SD3 background
Missing a Lamia. Pretty classic monster girl
An illustration of a farm landscape at night time. A cow hovers in the air at the top of the image. Emitting from the bottom of the cow is a volumetric blue light cone. On the ground within the light cone is an alien UFO with landing gear extended. it looks as if it is being lifted off the ground. crazy colored chaotic cartoon style threw that style vibe on it
its so dumb how entertained i am by this
Lmao
10/10
sure. let's say you want to be able to generate images with your cat. the AI knows cat, but doesn't know YOUR cat
maybe there is a world where you add in your company branding via lora
And now for something completely different. My hen after a hawk dove at her this morning
Now THIS is my type of prompt, forget all these llms and techniques, behold:
Super powerful cow exodia
i have enough nightmares already
Would you like a link to my horror DA?
let me guess, it's packed with sugary sweet fluffy bunnies?
True story, I also have a chibi account
images and characters can be very flexible on the bio-gender, let's put it like that. Lovely style, but not really my content...
take that as a compliment becky93!
You are thinking of my main account, my horror account is actually horror (and mostly sfw)
ohh, ok. nevermind then.
Anyone that does DA, the trick is to keep different genres to different accounts, even if it's to not traumatize viewers lol
Btw, horror is extremely popular for views, but no one buys it ever

I used to love horror, but I'm not an active fan anymore
Lovely "tired of your shit" face
I really really like surrealism. Also post-apocalyptic odd creatures are fun
It's pretty awesome that we can create our own custom art, thousands of it! (then print it out for $1 per page and frame it, if desired)
Mr Tumness! (Tum e s ) centaur
First model that doesn't make a nightmare fuel from that
Flux brand piano.
Gonna use a card-printing service. I think I need more cards though... Only have 24.
To be honest, cascade did a good job too
Even better, because those hands are used to play the piano, and to scare the children
i bet she likes rusty spoons if u know what i mean
flux
this seems to be a very useful feature for schnell, it's already fixing anatomy in the default settings. I have to explore a bit these settings.
That was upscaled and the prompt wasn't quite the same, but yes, it's pretty good
the hand was missing fingers, and as the steps went, it fixed it. But it changed the background a lot. Using euler btw which should not do a lot of change in each step. I need to find the sweet spot of fixing anatomy but not changing the overall picture a lot in each step.
Channel dev? I only have the shift on the default channel
idk what you mean
i'm using schnell
comfyui
still don't
struggling with this prompt idea i had. crayons melting into a painting. i'm hitting notes but it's not coming together. mmdits sure are fun to prompt though. and it's magical when they work.
lower steps kind of helps
very pretty. i'm noticing it rarely understands what a melted crayon is supposed to look like in this particular case.
for painterly I always like lower steps euler
with overly low CFG
you have to perturb the model a bit
if CFG++ is working for Flux then that helps a lot with softness
the issues with paintings is my biggest issue with flux so far. Hope lora fine tunes will solve that
am using dev. i haven't really tried other workflows yet. i saw one on civit called flux street.
i hooked up cfgguider for negative prompting and cfg, but any cfg ruins prompt comprehension
flux don't have real cfg and I don't think it works well with cfg (haven't tried yet, though)
we rly need the paper
i saw an immediate issue using cfg so i swapped back to the basic guider
15 steps is as low as i can go for euler. in general really
is it not novel that it doesn't take CFG?
from sd3 paper, the distillation paper and the source code
they distilled the cfg out of the model
got these
similar like turbo models distilled the high step count out
the model is basically finetuned to produce images that look like images with cfg
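For readers who haven't seen what "cfg" does numerically, here is a minimal sketch of the standard classifier-free guidance mix that the messages above say was distilled out of the model. The toy numbers and the name `cfg_combine` are made up for illustration; this is not Flux or ComfyUI code.

```python
def cfg_combine(cond, uncond, scale):
    """Classifier-free guidance: move the prediction away from the
    unconditional (or negative-prompt) one by `scale`."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]

# Toy stand-ins for two model predictions (hypothetical values).
cond = [0.8, -0.2, 0.5]    # prediction with the prompt
uncond = [0.1, 0.1, 0.1]   # prediction with an empty or negative prompt

print(cfg_combine(cond, uncond, 1.0))  # scale 1.0 reproduces `cond` (up to float rounding)
print(cfg_combine(cond, uncond, 1.5))  # scale > 1 exaggerates the difference
```

At scale 1.0 the unconditional term cancels out, which is why a negative prompt only has an effect once the scale is raised above 1. Every step then needs two model evaluations instead of one.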
I don't think this is it
necessarily
I think it has a different block structure
the source code is open
its block structure is not that different
they have mmdits and dit blocks mixed as in Auraflow
they use parallel transformers - which sounds fancy but is a very simple trick to make it faster
in contrast to sd3 they don't use CLIP embeddings anymore
instead they only use the pooled embedding of CLIP as micro conditioning together with cfg and timestep information
I'm very sure the pro model still uses cfg
ok thanks I haven't looked into it yet
with SDXL I've tried a few different things like
injections into self attention, into cross attention and into key matrix (out of the Q K V matrices)
maybe some of these would be useful
https://github.com/comfyanonymous/ComfyUI/commit/ce9ac2fe0581288d4a24869dae2e04a3c2b67061#rssowlmlink did anybody catch this? We were using flux in a haggard half working state for a few moments on the first day unless you updated after this happened
I am thinking that
if the model has worse guidance due to distillation
then attention map injections could be a really great idea
i.e. dense diffusion / Omost style
I don't think so
it's probably just a variable naming thing without effect
it wasn't prompting the clip layer just the t5 layer
you cannot accidentally switch clip l with clip g
injecting noise into the key matrix also works for some reason
because the CADS node has that
but I don't know why it works
flux has no clip g
i get very good quality and softness in schnell, i'm just trying to fix the missing fingers with the model sampling node. Basically each step tries to follow the prompt more, from what i understand, and the settings control how much the image changes in each step. This is what i understand so far.
it was just going nowhere so the code didn't fail
there is a trade-off for soft images
in my limited experience with softness, every technique that makes it soft lowers prompt adherence and structural coherence
fire
Monster Card Series V2
25 - Satyr
make it soft with artist styles in clip l
this is my favourite soft image I have made
it was Euler-ancestral-CFG++ with a CFG of just 0.1
and then not given enough steps to fully converge
not giving it enough steps helps a lot for that soft wispy quality
this is the same reason why, unlike most people, I run realistic photos for like 300-1000 steps sometimes, depending on solver. I have seen things get fixed at over the 300 step mark.
use fp16 if you can
is that sd3
its zavychroma sdxl
same prompt using replicate's flux pro interface
i think they run it full precision
very nice
flux
Now @bfl_ml has released Flux.1(dev) 12b @StabilityAI should release SD3 8b
It's the benchmark model in the paper that performed so well (including laying down on grass aha) & is much more tunable.
Glad to see releases, onwards we go..
emad just tossed down the gauntlet. we'll see what they do
that thread is interesting
Yeah 100%, and a turbo variant would be amazing too. It could probably redeem them.
if 8b is faster than flux 12B then I don't even want turbo, but I can understand the amazing advantage for lower end users
Flux PRO via Replicate
Yes 8b is going to be faster. I still dont want to wait like 1 min for 1 image so I would definitely like turbo.
Schnell
SD3 (it's on purpose)
if i can run 12b that means i can run 8b right?
been trying out some imagen prompts. A single beam of light enter the room from the ceiling. The beam of light is illuminating an easel. On the easel there is a Rembrandt painting of balls.
imagen version
it's class collapsing on dragonfruit, but it kinda knows pitaya and then really nails it with pitaya fruit. prompting is king.
imagen prompt is A dragon fruit wearing karate belt in the snow.
Yes and it will use considerably less vram.
so far the messaging about 8b is that you'll need over 24
I remember lykon saying that it's not that much slower than 2B, but we'll see
flux vs imagen
yeah imagen imagens
the 8B isn't enough to match a 12B necessarily
however that also isn't necessarily the case
Flux dev is better, schnell with 6 steps is similar to 12b.
8b is going to be faster tho and save vram as well.
yeah I'm probably waiting until SD4 until I use a Stability model again
there is definitely a need for low VRAM models
in the same way as there is a need for 3B LLMs
but that doesn't interest me that much
3.1 might be decent but they are still trying to compare how good sd3.1 images of dogs are.
They should really test humans first.
humans are probably a big part of their testing procedures. what they published is just a small lens on it all
new leadership is at the helm now so just hope better decisions are going on
they could compete on license
cos that's a way to make use of a model that is not quite as strong
you can release a more free license
not open source but less restricted
jump to sd5. 4's a dumb number. not even prime
or they could just market it as a lower VRAM alternative
Mixtral got extremely popular by being lower VRAM than 70Bs
Mixtral did way better than the average 70B
Schnell is Apache 2.0 which is basically completely open. Dev is definitely a lot more closed
transformers change everything imo. it's a whole new ball game
there are developments to be made
transformers have been around since 2017.
Even SD 1.5 is a transformer architecture - but we had this discussion already...
sadly Schnell looks too bad to me relative to Dev
I tried to like Schnell but I can't
its better to talk about things in terms of individual layers anyway
because that side-steps the debate about what is, and is not, a transformer
its a pedantic debate.
models get better and better since the initial dalle 1. But these are incremental updates, no revolutions
You have to try 6 steps, it's a lot better then. But dev is still better but it's close.
ok thanks will try varying steps
will also try adaptive solver on schnell, which I have not tried yet
I use adaptive solvers like 99% of the time aside from quick test images
partly because it saves the mental effort of having to optimise scheduler or steps
4 steps is the right one, 6 steps is left. Prompt: "An image of a woman holding her hands up in a room"
2020 i couldn't do any of this. 2024 we doin it. that's very revolutionary. french revolution took 10 years. American Revolution took 18. Sometimes revolutionary periods last a while. This is a huge shift in digital graphics things won't settle down for some time.
ah thanks thats big change
Random seed tho, but 6 is still considerably better most of the time.
One day we'll have the true revolution. Canada will take over it all and then you'll all be sorry
its mostly that Schnell is the opposite type of model to the one I need
Schnell is about making images fast
but I want models that try to make good images slowly
What a time to be alive!
hold on to your papers.
now imagine, two papers down the line!
Yeah dev is definitely better then.
has 2min done something on flux yet?
all the youtubes i found so far, the guys just read verbatim the bfl blog post
yeah not very creative lol youtube is like big media now just reporting the news basically
in whatever niche they operate in
the popular youtube channels for machine learning are ridiculously bad sadly
both for LLMs and for image gen
papers! get em!
the amount of youtube videos pushing brand new multi-agent frameworks is over the top
having said that, some of my favourite comfy workflows came from youtubers so it does vary
oh NO it's collapsing into a singularity!!
memes are so aesthetic now lol
someone know about codes in collab and maybe can help me fix a problem?
What problem?
And also move to tech support
oh ok
guess this is it. we're in the papers now. gotta just push through
now the big bang. thats it. thats the new paperverse.
what a time to be alive
Epic
which of those 2 vases of roses is flux, and which is SD3
I would say first one flux but it is too small to be sure
click on the image, then click on open in browser, then you'll get the full sized image not the discord compressed version
Still a 331 x 579 image
so the uninitiated dont think i have a demented imagination. just demented inspiration. here's the og short from don hertzfeldt's rejected https://www.youtube.com/watch?v=UcwfEMdV-aM
don hertzfeldt is the best
his new stuff is phenomenal
im trying out single checkpoint version of Flux ... not using any clips in the workflow
a 16gb safetensor which is fp8 version of the dev model
20 steps and its kinda slow on my rtx 3060 gpu with 32gb ram compared to schnell with 4 steps
i swear there's a huge difference in amount of samples when using fp8_e5m2
what do you mean?
fp8_e5m2 is closer to fp16 but needs more samples to look good
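One way to see why e5m2 gets described as "closer to fp16": it keeps fp16's 5 exponent bits and only the top 2 of its 10 mantissa bits, so its bit layout is just the upper byte of an fp16 value. The sketch below shows the resulting precision loss using only Python's `struct` module. It truncates rather than rounds to nearest, which is a simplification and an assumption here, not how any particular loader converts weights.

```python
import struct

def fp16_to_e5m2_truncated(x: float) -> float:
    """Keep only the top byte of x's fp16 encoding
    (1 sign bit, 5 exponent bits, 2 mantissa bits) and decode it again."""
    hi_byte = struct.pack(">e", x)[0]            # big-endian half precision, first byte
    return struct.unpack(">e", bytes([hi_byte, 0]))[0]

for value in (1.0, 3.14159, 0.1, 1000.0):
    half = struct.unpack(">e", struct.pack(">e", value))[0]
    print(value, "-> fp16:", half, "-> e5m2 (truncated):", fp16_to_e5m2_truncated(value))
```

With only 2 mantissa bits there are just four representable values per power of two, so individual weights move a lot. The range is the same as fp16's, though.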
won't that be incompatible with the t5 clip?
no it works fine
interesting, im gonna try it out then
okay
but what i meant is that any amount of samples over 20 actually helps the image a lot
lots of people were saying that anything over 20 samples is not needed
ohhh
but im rendering an image now with that setting you mentioned
i'm gonna upload an image with 120 samples in just a sec
ok
and then another one with 22 samples
for comparison
120 samples
22 samples
40 samples
the top one of the 3 looks kinda more vivid in tone
and the fence on the second floor is completely fixed
as well as other small issues
but idk if 120 samples is worth it
steps
if you got a beefy gpu sure
i'm gonna upscale it 4x and send it here
im on rtx 3060 so can't use the dev version
ohhh
how much vram?
8gb?
12gb
nice
120 steps is fine I often do more than that
i wish it was easier to run
it takes from 3 - 4 mins to render with dev
here is the difference between 60 and 120 steps btw
yeah sometimes stuff gets fixed at higher steps
1st is 120 steps and 2nd is 60 steps
I have seen stuff fixed hundreds of steps in
yeah details are much better on the left one
wait my bad
right one is 120 steps. really makes you wonder
left one is better
the right one misses some of the detail with blur
unless you are confusing which is which steps lol
no i checked again and it's definitely the less detailed that's 120 steps
yep
the left one and right one .. see the shade on top.. left one is pretty distinct
that's where i like the 120 step one better, the 60 step version looks like it's trying to make chimneys on either side which wouldn't make sense
either way it's pretty interesting how the saying that "anything above 20 steps doesn't matter" isn't remotely true
lol i see what you mean, but still left has more detail but a bit extra haha
for quality tho, the right image has blurry spots
60 steps looks better
but 60 steps would also take a bit of time
can you share that prompt?
sure man
facade with apartments, outside view, viewed from the front
i mean just the prompt
not the workflow
i got my workflow with schnell setup
the prompt is above
like in the message
facade with apartments, outside view, viewed from the front
here it is again
thought you would want the seed which is why i shared everything
nah just the prompt for the heck of it, just wanna see how schnell handles it
not trying to replicate
facade with apartments, outside view, viewed from the front
ahh ok
lol ..
oh the tree is very conveniently placed
here is 4x upscale with sdxl lightning, now i'm gonna try sdxl normal
it's an rtx 4070ti super
my system would probably freeze if i attempted to upscale above 1500
16gb vram but it almost isn't enough lol
ah nice
yeah that's honestly completely fair
my pc freezes for a moment when trying to upscale
a blonde girl sitting on a bean bag, indoors, potted plants, window, sun light.
the keyword girl generates little girls
oh
the thing about longer steps is
you might need a different sampler to normal
female is not age specific
try female
woman is more age specific as well i would say
female is more neutral just as male is
oh i only saw this now
maybe i'll try a couple hundred steps on the facade pic
You can specify the age: "21 y.o." or "34 y.o." etc
the thing about upscaling the image is that my roof tiles are getting messed up
does that work?
epic
trying
ok
whoa, nice call
Monster Card Series V2
No. 26 - Kappa
looks cozy lol
oh wow you're making a lot of cards
is it for a card game?
her feet are kinda fused lol
I'm aiming for a full deck of 55. (That's how many the printer service wants.)
but cool thing to know about female instead of girl
i wonder if male works as well
Not a game, but I am going to print them. Could make it go fish or something, but nah.
Once AI is good enough that we can consistently describe everything on the card, I'll try a game. (And Flux is very close to that level.)
Monster Card Series V2
No. 27 - Phoenix
definitely should, lemme try one
would be super cool to eventually make a hearthstone inspired game using generative ai
and llm's for the programming
Monster Card Series V2
No. 28 - Basilisk
And the playtesting balancing!!! That's the hardest part of card game design by far.
yes!! don't want the cards to be too good
slight glitch in the anatomy but face came out alright
(This one turned out mostly just a lion, not a girl... Still cute though.)
Monster Card Series V2
No. 29 - Chimera
Monster Card Series V2
No. 30 - Jackalope
Is that Dev? I thought this model had beaten the line soup once and for all.
it's the upscaled version of the image using sdxl lightning
the roof?
Merrow is an Irish mermaid with a magical red cap.
Monster Card Series V2
No. 31 - Merrow
yeah if only i could upscale it and still have perfect roof tiles then it would be perfect
that's how those roofs really look
they are jumbled. they're not in straight rows
Maybe try upscale with Dev? But larger images take forever .
Monster Card Series V2
No. 32 - Troll
my pc is gonna hate me lolll
okay, the tile details aren't right, but i don't think the AI is going to fix that. you'd need to train a lora on real roof tiles
try magnific
I still haven't been able to get a good 4K render. I think I need to upscale 1K -> 2K -> 4K. Hopefully I'll remember to leave that rendering tonight.
Monster Card Series V2
No. 33 - Cthulhu ...(Isn't this just Kraken again though? Oh well.)
maybe that approach could improve my roof tiles as well. thanks for the idea
so far it seems like 512x512 has the worst tiles, 1024x1024 is better and 1536x1536 is the best. i'll try to set each tile to 2048x2048 and upscale it 2x, then 4x
Don't use Latent upscale. It's never worked for me on FluxDev. Use VAE decode to get pixels, then 4x-Ultrasharp or whatever your favorite is, and then use a scale node to get it to the perfect size, then VAE encode, then back into the KSampler. (Actually it worked okay for 512->1024, just not for bigger stuff.)
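The pixel-space route described above (VAE decode, upscale model, scale node to an exact size, VAE encode, KSampler) needs a target size for each pass. Below is a small sketch for planning a 1K -> 2K -> 4K style chain. The function name and the choice to snap sizes to multiples of 16 are assumptions for illustration; it only computes sizes and does not call ComfyUI.

```python
import math

def plan_upscale_stages(width, height, target_long_side, factor=2.0, multiple=16):
    """List the (width, height) to request from the scale node at each pass,
    growing by at most `factor` per pass until the long side hits the target."""
    total = target_long_side / max(width, height)
    if total <= 1:
        return []  # already at or above the target size
    n_stages = math.ceil(math.log(total, factor) - 1e-9)
    stages = []
    for i in range(1, n_stages + 1):
        scale = min(factor ** i, total)
        w = round(width * scale / multiple) * multiple
        h = round(height * scale / multiple) * multiple
        stages.append((w, h))
    return stages

print(plan_upscale_stages(1024, 1024, 4096))
print(plan_upscale_stages(832, 1216, 4096))
```

Each listed size would be one decode, upscale, resize, encode and resample pass. Doing it in 2x steps keeps each resample closer to the resolution range the model handles well.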
Monster Card Series V2
No. 34 - Selkie
(I absolutely love how the sphinx turned out. One of my top 5.)
Monster Card Series V2
No. 35 - Sphinx
Very kreepy.
...But not as creepy as... This!
Monster Card Series V2
No. 36 - Wraith
Have a whole hoard of undead.
Hieronymus Bosch
Monster Card Series V2
No. 37 - Lich
Artist?
ya, old
Still not too sure about this one. I guess it's cute enough... Has a very Tim Burton look.
Monster Card Series V2
No. 38 - Skeleton
im trying out negative prompt with flux .. i wonder if that makes any difference
How??? Well, first tell me if it works.
nice approach, i just tried it. it looks like it's actually the base image it's trying to upscale that has the problematic roof tiles. to fix this maybe i can render it at a higher resolution at the start
Hmm. Okay.
i increased the cfg to 1.5 and added clip node for negative prompt
Monster Card Series V2
No. 39 - Zombie
uses these keywords for it ..
ugly, distorted, deformed, conjoined, fused limbs, watermark.
let me try render an image
Just render a landscape and put "blue" in the negative. Does it come out red-green?
Then try "red" in the negative. Does it come out cyan?
Then try "green" in the negative. Purple and red?
Also upweight the negative maybe?
Monster Card Series V2
No. 40 - Ghoul
thanks though, it's definitely better
clever straightforward idea, but gimme a min im rendering an image with character
At first I thought I couldn't do multi-heads, but then I realized it could be like sisters arguing. Definitely cute-able.
Monster Card Series V2
No. 41 - Hydra
P: a female with red hair, ponytail, light makeup, choker, mini sundress, bare thighs, half body, blue background, oil painting.
N: ugly, distorted, deformed, conjoined, fused limbs, watermark.
Yeah I'm not sure. Maybe.
should have used portrait canvas
Monster Card Series V2
No. 42 - Chupacabra
Is Anubis really a monster? Yes. I need 55 of them, so yes it is.
Monster Card Series V2
No. 43 - Anubis
Monster Card Series V2
No. 44 - Kumiho
It was really hard to get these working. 75% of them would render with Japanese / Chinese characters instead of English no matter how I prompted it.
Monster Card Series V2
No. 45 - Kitsune
Monster Card Series V2
No. 46 - Bake-Danuki

wait a minute... does this upscale something 6x?
from yesterday's discussion on flux: 1344x768 gives the most reliably sharp images.
1536x896 can be good but you get occasional blurry images.
the higher the resolution you go, the higher your chance of blurry.
sticking at 1344x768 (or some other 1 megapixel total image) gives the best results
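To check which sizes count as "1 megapixel total", here is a quick calculator, plus a helper that suggests a size near 1024x1024 pixels for a given aspect ratio. The helper name and the snap-to-64 default are assumptions for illustration, not a Flux requirement stated in the chat.

```python
def megapixels(width, height):
    """Pixel count in millions."""
    return width * height / 1_000_000

def size_near_one_megapixel(aspect_w, aspect_h, target_pixels=1024 * 1024, multiple=64):
    """Return a (width, height) close to target_pixels for the given aspect
    ratio, with both sides snapped to `multiple`."""
    scale = (target_pixels / (aspect_w * aspect_h)) ** 0.5
    width = round(aspect_w * scale / multiple) * multiple
    height = round(aspect_h * scale / multiple) * multiple
    return width, height

for w, h in [(1344, 768), (1536, 896), (832, 1216), (1024, 1024)]:
    print(f"{w}x{h}: {megapixels(w, h):.2f} MP")

print(size_near_one_megapixel(16, 9))
```

1344x768 comes out at about 1.03 MP and 1536x896 at about 1.38 MP, which fits the observation that the larger size is where blur starts to appear.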
I always use the "Upscale Image" node and just set a size. It will scale it down if needed too.
(This is what a tanuki looks like, which the bake-danuki is based on. It's a raccoon-dog thing, and the bake-danuki version is when it's possessed by a spirit.)
Can you spell "ouroboros"? Because Flux can.
Monster Card Series V2
No. 47 - Ouroboros
will try that, thx
Current collection. Just 8 more to go!
My top 5 favorites. The three on the right came out with such perfectly fitting personality in their expressions.
LOL basilisk. "Eh. Guess I'll poison you. Oh you died. Okay."
wtf
hi cat
i can finally fix the way the roof looks. it comes at a cost though
10k image
but i'll downscale it again
LOL watch out. Don't want to come back after a half hour render to see a big gray rectangle full of red text.
Can't get over how real these things look. Every wrinkle, button, stitch, pleat, strand of hair. Everything is solid. There's no vague line soup anywhere to be found.
And without a second pass, adetailer, upscale, inpainting...
Yeah for real. There's an AI saying (by some famous guy whose name I would know if I were a real AI guru), that goes something like, "Self-improving software will eventually eat everything else in the stack."
Meaning if you have AI plus other code and processes, eventually the AI will eat that other code and processes. It'll be AI all the way down.
this is an 8k image i just took a screenshot of so it's scaled down
all roof tiles have been fixed
nice!! looks very good
thanks
That is legitimately amazing.
all roof tiles have been fixed unless you're literally right up and inspecting the picture
i'm gonna do some forest images now!!
pretty happy with the result
took 40 tiles of 1024x1024
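A rough sketch of how a tile count like that comes about in tiled upscaling. The chat only says "8k" and "40 tiles", so the 8192x5120 canvas below is a hypothetical size picked because it divides into exactly 40 tiles without overlap. The function name and overlap handling are illustrative too.

```python
import math

def tile_grid(width, height, tile=1024, overlap=0):
    """Return (columns, rows, total) of tiles needed to cover the canvas
    when neighbouring tiles share `overlap` pixels."""
    stride = tile - overlap
    cols = max(1, math.ceil((width - overlap) / stride))
    rows = max(1, math.ceil((height - overlap) / stride))
    return cols, rows, cols * rows

print(tile_grid(8192, 5120))               # hypothetical canvas, no overlap
print(tile_grid(8192, 5120, overlap=128))  # overlap hides seams but adds tiles
```

Every tile is a full sampling pass, so the tile count is a direct multiplier on render time.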
i don't have the computing power to make it better but if only it was easier to run it could be improved a bit more
hopefully i can afford something better one day
Another prompt that SD3 failed. The bump that snout
it's pretty close though
downscaled from 8k to 4k
Both are flux
i'm sorry but i thought you meant you wanted the dog's snout to bump the camera so i said it's pretty close to the camera. i might have misread it. my bad
My bad then. As I posted two images and mentioning SD3, it would be rational to think it was the result of each. I deleted one so as not to create confusion
yes but testing it with a pencil sketch is a hilariously bad example
just VAE encode and decode a high res photo lol
birchlabs is the inventor of hourglass diffusion. i think they can test on what they want - they were looking to see how it handled those sorts of details
I'm not saying the test is technically invalid
its just that I would have used a photo of some sand or something
No it's not, it makes it super easy to see issues. High frequency noise is what gives your skin texture with things like pores vs it coming out like a smooth airbrushed doll
The contrast of the image used is excellent for benchmarking
sand wouldn't be good. it's too uniform. the pencil shading texture on rough paper has extremely fine details that are chaotic and perfect for what was being tested
zoom in more, every piece is different
no no not sand. this is fine grain
this image is not sharp enough but this is the sort of sand I mean
this is also fine grain
this fine grain too
sand isn't grain at all like not even a bit yuck
somehow I can tell that this is Dev lol
Tony Montana moment.
This channel now officially renamed Flux.
now that's pasta before it's pasta
Until something new comes out and we are bored of Flux
this channel is in ... flux...
when you typo the prompt in flux -and write elle instead of apple
Just got a 25%-50% speedup in FluxDev!
All I did was open ComfyUI on my laptop. The Comfy server is running on my desktop.
Opening the UI on my laptop for some reason made it way faster. Your mileage may vary.
(To clarify: The server is still running on my desktop, I'm just making the workflow and clicking queue on my laptop.)
What it probably is: when the model is too big for your vram, it has to offload to your sysram. Well it seems there's an issue where sometimes the CPU doesn't help at all and sometimes it does. For instance, my 2080 and 13600kf will have 15sec/it if working correctly and if the CPU doesn't help, it's like 30sec/it
I've tested every combination of Nvidia cuda fallback and comfy --lowvram --highvram. The only thing that fixes it for me is to start a generation, wait for the models to load and for it to hit the clip prompt, then cancel it and try again. Or restart comfy
I know it's working when I see my CPU at like 50% usage
My guess is it's a windows memory cache or management issue or something
Yeah the VAE runs on my CPU if I use the UI on my desktop, and it takes 15-30 seconds per image.
If I switch the UI to my laptop, the VAE starts running on the GPU instead, and takes <1 second. I'm using Firefox if that makes any difference.
Do you have hardware acceleration on on your browser?
I assume it's on by default in 2024.
Maybe try turning it off and see if it is the difference.
Not worth it. (My dev work is browser WebGL.)
Also, I still have my browser open, but only Discord. Not using the ComfyUI. It might be the ComfyUI itself is using the GPU.
For what it is worth, I have similar speeds as you with the speed-up running it locally.
What hardware / model / steps are you using?
Here comes Pony Flux.
3090, dev Flux. Bit less speed than you, but I am also running undervolted.
Steps / Resolution?
I'm 4090(24GB) / 64GB sysram, 23 steps, 832x1216 (roughly the same pixel count as 1024x1024)
Getting very consistent 31-32 seconds.
25 steps, 1024x1024, 16Gb sysram, Euler. About 35s.
Heun / Beta. I'm surprised you're running that fast. Let me try Euler.
Yep. 16-17 seconds now.
And the quality difference isn't worth 2x. This is awesome!
Yeah, there's differences with the samplers but unless you are shooting for absolute quality for a single planned picture, I don't really feel like it is worth it to use the slower ones.
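The roughly 2x gap reported above (31-32 s with Heun, 16-17 s with Euler at the same step count) matches how the two methods work in general: Heun evaluates the model twice per step, Euler once. Here is a toy sketch on a simple ODE rather than a diffusion model. The function names are illustrative, and real samplers may skip the second evaluation on the final step.

```python
import math

def integrate(f, y0, t0, t1, steps, method="euler"):
    """Integrate dy/dt = f(t, y) and count how often f is called."""
    calls = 0

    def counted(t, y):
        nonlocal calls
        calls += 1
        return f(t, y)

    y, t = y0, t0
    dt = (t1 - t0) / steps
    for _ in range(steps):
        k1 = counted(t, y)
        if method == "heun":
            k2 = counted(t + dt, y + dt * k1)   # second evaluation at the predicted point
            y = y + dt * (k1 + k2) / 2
        else:
            y = y + dt * k1
        t += dt
    return y, calls

decay = lambda t, y: -y                         # exact solution at t=1 is exp(-1)
for method in ("euler", "heun"):
    y, calls = integrate(decay, 1.0, 0.0, 1.0, steps=25, method=method)
    print(method, "calls:", calls, "error:", abs(y - math.exp(-1)))
```

Heun is more accurate per step but costs twice the model calls, so the fair comparison is Heun at N steps against Euler at about 2N steps.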
He's a bit weird (SD3)
Finished!
Full set of 55 cards, ready to print.
Well, still need to size them correctly.
And a few more fix ups here and there. I'll check each one carefully for problems and see what turns up.
Final 8 monsters were:
No. 48 - Qilin
No. 49 - Will O' The Wisp
Cerberus turned out exactly like Beta from The Eminence in Shadow.
Also just noticed the line on her paw/leg/haunch is wrong. Gotta fix that.
No. 50 - Cerberus
No. 51 - Yuki-Onna
Headless horseman. Another practically impossible task for FluxDev.
No. 52 - Dullahan
Gremlins are a very recent fantasy monster. Invented by airforce pilots as the supernatural entities that caused plane problems.
No. 53 - Gremlin
FluxDev also doesn't like doing shamrocks apparently? It wanted to do 3-leaf clovers. Maybe my prompt was wrong though.
No. 54 - Leprechaun
And finally, the last card in the set!
No. 55 - Lamia
Now just gotta do the backs and get them printed.
test
Hmm, I'm really surprised
Was this image generated by Flux? SD3 can never generate an image with such perfect toes.
Yes, it's FLUX. For me, SD3 doesn't exist at the moment. The more I test FLUX, the more I fall in love with this model.
How to use the drawing function of Stable Diffusion
MIDJOURNEY IN STABLE DIFFUSION SERVER
Under the sunset, a beautiful girl is standing by the lake watching the sunset. The plot of a Chinese girl movie is --niji 6 --s 250 - @compact snow (relaxed)
Hey @unkempt matrix
Due to extreme demand we can't provide a free trial right now. Please /subscribe to create images with Midjourney.
Please check the bottom of the channel for more information.
Three finger hands. Flux is better.
nice
you go away
SD3 placed feet on the list of censored sexual organs. So all foot images were properly lobotomized
Ironically they can show feet
Total CENSURED
/1girl
ERROR: Division by zero is not defined
This prompt: "photo of a woman standing against a solid black background. She is wearing a matching black bra and panties. Her long dark hair is straight and falls over her shoulders. She is facing the camera directly, with her arms relaxed by her sides and her feet slightly apart. The lighting highlights her toned physique and balanced posture, creating a sharp contrast between her figure and the dark backdrop. The overall composition is minimalistic, focusing attention entirely on the subject." will give you blurry images with Euler unless you use the "Beta" scheduler. Any theory?
Hmm...
Schnell
Flux Schnell vs Dev
I definitely prefer Dev for comics!!
vs Meta
All the exact same prompt for everything
I'm sorry Flux, but it seems Meta is definitely giving you some competition!
"A close up of a bearded man with kind eyes, wearing a lab coat, looking through a magnifying glass at a small vial containing a green liquid. A speech bubble which says "Throughout history, humans have sought to change matter into more useful forms." Manhwa photorealistic art style, by Junji Ito"
For comparison, sd3 local + copax lora, first 2 gens with this prompt above
Next I tried a grasshopper Penguin woman.....
Meta...
SD3
and Flux
and Flux via glif with GPT4's help lol
Is that Flux or SD3?
Sd3 + Sd 1.5
I knew 1.5 was still awesome!
does chatgpt assistance make the images better, or just make the program easier to use?
it makes the prompt better which can make better images. Also, you dont have to write a full prompt so it can make it easier and faster.
Motion shots (schnell)
Has chatgpt been trained to create prompts that give higher quality images? Seems hard to do
Comfy question. Can I take the latent output of one sampler (using sd3) and connect it to the next sampler (using sd15)? It gives some errors. Do I first need to convert to an image, then back to latent space?
most models need a bit of a different style of prompting. Most DiT models like auraflow, flux, sd3 work well with natural-language prompts. You can just do "enhance this prompt: {simple_prompt}. Keep it 2-3 sentences." and it should probably work.
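The template from the message above can be filled in with a small helper before sending it to whichever chat model you use. This is only a sketch: the function name and the `sentences` parameter are made up for illustration, and nothing here calls a real LLM API.

```python
def build_enhance_request(simple_prompt: str, sentences: str = "2-3") -> str:
    """Wrap a short prompt in the 'enhance this prompt' instruction quoted above."""
    # Drop surrounding whitespace and a trailing period so the template
    # does not end up with a doubled ".." after the prompt.
    cleaned = simple_prompt.strip().rstrip(".")
    return f"Enhance this prompt: {cleaned}. Keep it {sentences} sentences."

request = build_enhance_request("a satyr playing pan pipes in a forest.")
print(request)
```

The returned string is what you would paste (or send) to the language model; its reply then becomes the txt2img prompt.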
ah, so its more like sd3 and other models are build for language model prompting
sd3 and sd15 do not have the same latent space so you need to convert it to an image. There are some things that can let you convert sd3 latents directly into sd15, and vice versa too. For example:
https://github.com/city96/SD-Latent-Interposer
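One concrete way to see the mismatch: the SD1.5 VAE uses 4 latent channels while the SD3 VAE uses 16, both at 1/8 of the image resolution. A minimal sketch of the resulting tensor shapes, with the model keys and helper name chosen only for illustration:

```python
# Latent channel counts per model family; both VAEs downscale by 8x.
LATENT_CHANNELS = {"sd15": 4, "sd3": 16}
DOWNSCALE = 8

def latent_shape(model: str, width: int, height: int, batch: int = 1) -> tuple:
    """Shape (batch, channels, h, w) of the latent for a given image size."""
    return (batch, LATENT_CHANNELS[model], height // DOWNSCALE, width // DOWNSCALE)

sd3_latent = latent_shape("sd3", 1024, 1024)
sd15_latent = latent_shape("sd15", 1024, 1024)
print(sd3_latent, sd15_latent)
```

Even where shapes happen to match between two model families, the values are encoded differently, which is why the safe route is VAE decode with the first model's VAE and VAE encode with the second's.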
It just improves the prompts to be more descriptive and/or more in line with whichever specific txt2img you are using
Flux
Til that these are a thing and that they're called an interposer. Tyvm
What I find with the t5 encoder is that since it has self attention, you don't just do style tagging with it. Give it a little lore. Backstory. Jazz it up. Talk about all the context around the events of the image.
Anyone know of a workflow for flux like for sd3 where you can prompt all the encoders separately?
I think there is a node in comfyui called cliptextencodeflux, that does that
Beans
sd3+sd15 workflow, using sd3 first for snappy prompt2img then refining with Cute Lunar for the style and details. seems to work fine.
dam nice work
how does this work?
drag the image into your comfy canvas to see the workflow and all settings. almost all my posted images include the whole comfy workflow
keep in mind that the image you see in discord isn't the original image. so to get to the one with the workflow, first click the image, then on the larger one that opens, click the words open in browser. and then save THAT one.
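To check whether a saved file still carries the workflow before dragging it into Comfy, you can look at the PNG text chunks. The sketch below assumes ComfyUI stores its data in `tEXt` chunks under the keys `workflow` and `prompt`; the function names are made up for illustration, and it only reads uncompressed `tEXt` chunks.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def read_png_text_chunks(data: bytes) -> dict:
    """Return {keyword: text} for every tEXt chunk in a PNG byte string."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    found = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, body, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            keyword, _, text = body.partition(b"\x00")
            found[keyword.decode("latin-1")] = text.decode("latin-1")
        if chunk_type == b"IEND":
            break
        pos += 12 + length
    return found

def has_embedded_workflow(data: bytes) -> bool:
    """True if the assumed ComfyUI keys are present in the PNG text chunks."""
    chunks = read_png_text_chunks(data)
    return "workflow" in chunks or "prompt" in chunks
```

Usage would be something like `has_embedded_workflow(open("image.png", "rb").read())`; a Discord preview that has been re-encoded would be expected to return False.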
gonna try that one thanks brother
anyone know why flux schnell can't reference certain art? for example i'm testing creating an image of a character from the Arknights mobile game, but the schnell model doesn't understand at all what the Arknights game i'm referring to is
probably nothing in its data set that had the label you're using
but how can that be? sdxl with way fewer parameters and data samples can do that. but schnell has 12B params.
It's cute and all that people are making SD3 nsfw loras, but WHY are they all kinda BLURRY?!
pony
what has that got to do with anything? it might have all the art, but if the label you are using wasn't used, how's it going to know what you're talking about?
what do you mean by blurry? never see that
Pony isn't blurry? Do you mean they are using Pony ref images?
Check civitai, all slightly blurry
civit probably has something set wrong
only the nsfw ones? or all images on civitai
People are posting finetunes they made of SD3 and they post images to highlight their models, but all those images are blurry. However, the SFW SD3 finetunes seem fine
- motion (schnell)
oh, well make sure your profile settings haven't changed
Only the promo images for the SD3 nsfw models
Not that blurry, more like 80's disposable camera kinda blurry
i dont see that, can you provide an example? mine look fine
off to crop some images, brb lol
use sd3 to generate the image and a finetuned sd15 to refine it
never tried this before, prob tomorrow, i'm going to sleep rn. but that brother's image looks stunning, wanna try it myself
should do a final pass back to sd3, enough to refine the colors out while keeping composition. get that sd3 vae
mostly because of the sd15 model, the sd3 is just for the composition. You can do it with any other models as well.
did something happen? is sd3 cool again? what'd i miss? i thought this was flux now
will it work with model that have build in vae?
no, sd3 just still makes ehhh images. but sd15 helps since its a lot better in humans.
cropped for obvious reasons, but that's about standard
What can 1.5 do to an image that SD3 cannot? I'm wanting to try this out, but curious what I'm after specifically
dam look sad
Sometimes flux isn't all that, so we have to revert back to SD3 lol
They aren't all this blurry, but even slightly blurry isn't acceptable these days (IMO)
At least SD3 ladies with 1 arm and 3 legs laying on grass aren't blurry ROFL
Using different models for different things in comfy is fun, since each has its strengths and weaknesses.
PS I need more vram
schnell is better and faster imo but at least that supports negative prompt
its diffusers so it confusers me
How come this one would be used instead of just Dev? I hear Schnell is better for txt, but so far I haven't seen that
I just want an SD3/Flux merge π (I know, I know, not possible lol)
its the two merged together. dev and schnell. i get great text from dev myself. schnell has great speed tho
its a merge of dev and schnell. so low steps like 4-10 and cfg as well is possible.
Schnell is only twice as fast though
schnell is like 1-10 steps and dev needs like 20-25 steps i believe. 6 is a good spot for schnell and its like 3-4x faster
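The "3-4x" figure follows from the step counts alone, if you assume each step costs about the same in both models (same architecture and resolution). A quick sketch of that arithmetic, using only the numbers quoted above:

```python
schnell_steps = 6             # the "good spot" suggested above
dev_steps_range = (20, 25)    # the range suggested for dev

# Speed-up is the ratio of step counts under the equal-cost-per-step assumption.
speedups = [dev_steps / schnell_steps for dev_steps in dev_steps_range]
print(f"{speedups[0]:.1f}x to {speedups[1]:.1f}x")
```

Running schnell at more steps (say 8 to 10) shrinks the gap toward the "only twice as fast" impression mentioned earlier.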
we had a huge discussion about this yesterday, on this discord, and the consensus was that negatives really make no difference
they just remove some objects if you want but you cant increase quality really.
I have no idea what prompt but i think not too bad with schnell, 8 steps
also not real sure how merging schnell and dev gets you anything. you're just undoing the distillation process
they are both distilled
and probably not undoing it correctly
for sdxl, merging turbo with base also had interesting effects of resulting in a better model
so why not trying it. Might be good, might be bad
yes they are, and when you merge them, you mess with that
So which one/s? π
they are one
this is vastly different. turbo is optimized for speed. so when you merge it with the base all you do is cut down the speed optimization really
both, diffusers shard them so you dont need to load the whole thing in ram. You require less ram bc of that.
in diffusers the models are often split into multiple files
just download all the models. and they go in your Unet folder
turbo is also a distilled model.
Thank you
i know. optimized for speed. when you take an optimized anything, and mix it back in with the base - you destroy the optimization and you don't really accomplish anything. it's sort of like brewing beer, distilling it to make a fine, high quality beer, then pouring that into a pitcher and pouring some of the unfiltered stuff in too - and then thinking you've got something great
it's not that easy
who said anything about easy. but that's essentially what you are doing.
first, merging back to base is similar to what ema is doing. It often improves quality
in whose opinion? and in comparison to what
sticking a fork in an electric socket might improve your hearing aids too - but it might do a whole lot of other damage
second, turbo is restricted to 512x512. Merging it back to base gave it the possibility to generate 1024x1024 with cfg while still being extremely fast
so yes, it's no guarantee that it works but it's not a dumb idea trying to merge the models given that schnell is better for certain tasks (e.g. paintings)
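For anyone unsure what "merging" means mechanically: the simplest form is a weighted average of matching weights from two checkpoints with identical layouts. The sketch below uses plain Python lists in place of real tensors, and it is not a claim about how the dev/schnell merge discussed here was actually produced; `alpha` and the function name are illustrative.

```python
def merge_weights(model_a: dict, model_b: dict, alpha: float = 0.5) -> dict:
    """Weighted-sum merge: (1 - alpha) * A + alpha * B for every shared weight."""
    if model_a.keys() != model_b.keys():
        raise ValueError("checkpoints must have the same weight names")
    merged = {}
    for name in model_a:
        a, b = model_a[name], model_b[name]
        if len(a) != len(b):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [(1 - alpha) * x + alpha * y for x, y in zip(a, b)]
    return merged

# Toy "checkpoints" with a single two-element weight each.
dev_like = {"w": [0.0, 2.0]}
schnell_like = {"w": [1.0, 4.0]}
print(merge_weights(dev_like, schnell_like, alpha=0.5))
```

Whether the averaged weights keep the few-step behaviour of the distilled model is exactly the open question in the argument above; the only way to know is to generate with the result.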
guys stop doing things, someone who doesnt have a high level of knowledge doesnt like it
right xD
is there a fancy workflow which puts a separate prompt into each?
if you study, you'll have a higher level of knowledge
captain obvious
easy to do
since it's so very new, instead of running around trying to break it immediately, maybe it would be good to learn it first?
just use the flux prompting node
dude, experimenting is part of learning
you learn by breaking when you dont have an instruction manual
sd15 has more various styles, also fixes most poses
you've missed your calling in life
wrong again
Half my fave checkpoints are 1.5
sd3 + sd15 can do female laying on stones, in the grass
it's just the grass!!!!
you should be doing stand up comedy
Humans are a lot better, better styles, celebrities too. Its very small and really quick as well. prompt following is very very low and might mess up your text in images but apart from that its great as a refiner.
Have you tried Flux + 1.5 yet?
it's just pointless talking with crystalwizard. he has no clue of anything but starts arguing with everyone, although arguing means insulting and flaming for him
hands are ehhhh but still good
lot better than raw sd3
another pretty far off opinion, im not good
you're very good.
not at comedy
wanders off to mix Pony with Flux...
you're very good at comedy.
does flux has online img2img ... ? surely not
can you go to off topic to talk to yourself
and then the ponys will flux in and out of reality
just following your lead
next year: flux based pony video models
Mage does, but also a glif of that is probably not too difficult (adds onto to do list)
please not
you led with a dumb statement of "guys you're ruining it"
though you might work on literacy
everybody sees
,,( * .. * ),, hi guys, everything ok?
only a few minutes and my first trained flux lora is done
im not blocked by anyone teehee
why must you attack all the time, low bar for you, my literacy is fine, you are the one often doing doublespeak and gaslighting
get help
get drunk, get some of that natural green stuff, both of you. You might feel better
You're training a lora for flux? How much VRAM is needed?
cropped frustration never work out well
yeah, let's all just hug. And then cancel anyone who doesn't join!
ya the cancel culture is strong in some
I'm at 18gb currently
im cancelled
mee too
not here, civitai
maybe you can reduce it a little bit, but I guess you will need more than 16gb
18gb? interesting, last time i heard the minimum requirement to train a lora for flux is 24gb
You have to train it quantised
i deleted my civit account, then just yesterday i wanted a particular workflow and i was suprised when it wouldnt let me download it, made an account again....
but that seems to work very well and is not huge issue - in the end we all run it quantised anyways
what'd u delete for? i know someone else did too
Is the quality affected at all?
I don't think so. But too early to say
text probably does a lot better with higher precision. most things maybe not
don't want to be associated with some of the smut there and the anime level was personally annoying to me digging through it, just not gonna post images
thanks. just curious. seems to be a common reasoning
Are you using SimpleTuner? I want to give training a lora for flux a shot
yes
took me a while to get it run, but with every commit and bugfix it gets easier I guess
I think my collection is cute, only up to X rating for some images. Can't see why I would ever delete it, but just in case, you might wanna check out: https://civitai.com/user/youhnr5/images
last article I read said 48
yeah, turns out it's trainable on a 3090
you think sd3 is better at text then you'll always be wrecked TOP THAT
You can disable anime I think
ya still irked me, less anime was still like 80% anime lol
I think you might wanna do some OCR and then let language models check this piece of text ...
nobody's appreciating my dope ass teen witch deep cuts
that image along with that text ROFL
then re-route back into the image AI until it gets it perfect
full prompt
On the right hand side of the image is a large black margin with the lyrics printed out. Text print in arial font-face. Very legible letters.
("I'm king, and they know it
When I snap my fingers, everybody say show it
I'm hot, and you're not
But if you wanna hang with me, I'll give it one shot
Top that, top that
You can give all that you can, but you will never top that
Top that, top that
TOP THAT! TOP THAT!
You can dream until you're blue but you can never top that, huh-huh!")
On the left of the image is a photo.
In a dimly lit underground venue, a mad scientist takes the stage, his wild, unkempt hair and stained lab coat contrasting with the urban setting. He wears a pair of joke spring eyeball glasses, with the eyeballs bouncing humorously as he moves. The crowd is packed tightly, swaying to the heavy beats of the hiphop music. Graffiti-covered walls and flickering neon lights add to the gritty, energetic atmosphere. The mad scientist holds a microphone, his expression one of manic excitement as he delivers his rhymes, blending science and hiphop in a unique and captivating performance.
would it work better if you made the text in 4 sections? Would that be easier for the ai?
https://youtu.be/fAWJdXJvngU love these kinda plans
Garth's idea to get Mr. Biggs to see the demo of Cassandra playing in Wayne's basement.
Civit wouldn't let me post this but i was seeing questionable age pron all over, camel straw back
how to say you're over 40, without saying you're over 40
i've tested writing lyrics in sections like that on other stuff. it doesn't seem to care.
i'm over 40 and love anime
But not like LOVE like filthy frank loves it
I just threw this prompt at my sd3-sd15 workflow ... aaannd it's a mess
i squeal anytime i see a new version of that scene
I like Manhwa, but don't like anime at all
looks like a good show still
it's painful to see the sd3 preview inside the workflow have better text, and then it's gone
not even ghibli? i just figured everybody loves ghibli. maybe i'm projecting
I just googled that, no, definitely not
lol, next project. Full prompt, I kid you not:
waifus = ["Asuna", "Mikasa", "Rem", "Hinata", "Zero Two"]
for waifu in waifus:
    print(waifu)
hahaha the previews are often a curse. like the aesthetic distillation of the scene kicks in and suddenly blammo. all that detail you could see shaping up shifts to aesthetic pulp
with flux
why using 1.5 to destroy hands and let 4ch vae decrease quality
why? cuz style and poses. yeah, I know it destroys some of the good sd3 stuff, like also text
i even made anime, not a hater, just dont seek it out as much
look good
always nice to see 2d and 3d combined in a single image
oh ok fair and I thought first pass is flux
flux is quite good at anime
yes, it has a nice anime style. one of the very few downsides of flux that I've seen is trying to get grainy photos. As soon as persons are added to the prompt, the style becomes smooth and "perfect" again (the default realistic flux style)
made with flux-schnell online yesterday
flux-schnell anime style
hm, maybe looks more cartoon or common 2d
@uncut river what am I doing wrong with your workflows?
I wanted to turn all your waifus into zombies!
hm, not sure but I've seen a similar error
the error I think I made was connecting the latent output straight to the input of the next sampler. that did not work. it needs to go through vae-decode to image, and then back to latent space for sd15 with vae-encode using the sd15 vae. But if the workflow is included in the image, it should work (without alterations, except maybe the chosen models)
dev is great at dali's style but i can't get it to do the drooping clock faces. its neat though, in the early steps, i can make out the drooping shapes. but then as the detail comes in it changes to another form.
I didn't change anything except the models and the vae (though perhaps I used the wrong vae)
I hadn't even gotten to the prompt part yet
hm, could indeed potentially be a vae mismatch or something. Probably the wrong vae connected to the vae-decoder after the sd3 generation. When did you get the error?
my instant concern with those two pass workflows are that they do final output with the old vae
yes, but i'd rather use something like this, than merge the two flux models
OK now I give up lol
i'm not saying they're a bad approach. i'm just thinking, do a third pass
get that sota vae
maybe. have you tried running both models in one workflow yet?
refine flux with flux, call it reflux
ouch, but good pun
maybe i am a comedian hah
lol
well...
can flux write words it doesnt know?? like "omgwtfbqq" ??
both models - schnell into the BasicScheduler node, and dev into the BasicGuider node
try it yourself at for example https://replicate.com/black-forest-labs/flux-schnell
It's hand cream, not what you thought.
for hair strengthenin...
cinnamon roll icing
seems like shampoo instead of something for the face, or she is just crazy
flux gives me this on:
Discover the secret to a radiant, clean face with our family-style face lotion. Just like our anime heroines, your skin will shine with confidence and grace, no matter the weather. Our lotion's advanced formula keeps your face moisturized and protected, even in the harshest conditions. Embrace every day with a glowing, flawless complexion!
that's melted vanilla ice cream and what you don't see is her younger brother behind her dumping it over her head
i've done multi model workflows before. comfyui aint my first nodegraph ui and i've done stuff so much more extensive than these boys can be. the worst is just figuring out the names for all the nodes. ever managed database normalization?
After creation of the first prompt, I simply asked the AI to "now make it real" ... 1st gen
1st was better
Time to shave. lol
through strict scientific analysis i have noticed a discrepancy in the data. when prompting for the letter q sometimes these models will use the letter p instead. from this correlation we can hypothesize that diffusion models aren't capable of minding their ps and qs. || EVERYBODY COME ON FHQWHGADS ||
I didn't give flux a prompt and this was the result
nice scene
I got really lucky with the seed
all this checks out for a futuristic cyborg sumo match.
Nice human alien hybrids
Y'all been hanging around civitai too much ;)
hmmm. i'm running these not in fp8 mode. it's filling my 16gb vram but not over .
have you guys ever gotten these weird artefacts on the red fog that you see there. I just got it in schnell and one time on the hugging face space with dev. What is that about ?
same seeds, one is with fp8 em4 turned on and one is with it off
Did you know you can specify measurements in your prompts?
three cats, from left to right: a one foot tall cat, a two foot tall cat, a three foot tall cat, in the kitchen
It doesn't get the measurements right, but it gets the differences (big little small).
three cats, from left to right: a three foot tall cat, a one foot tall cat, a two foot tall cat, in the kitchen
I got an idea for a comfy workflow where each step gets its own ksampler
and each ksampler only moves the ODE by one step
then in between each one you select CFG, SAG, PAG and control net etc for the next step
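The idea above could be laid out as a chain of single-step samplers. The sketch below only generates the per-sampler settings. It assumes the field names of ComfyUI's KSampler (Advanced) node (`start_at_step`, `end_at_step`, `add_noise`, `return_with_leftover_noise`), and the `cfg_for_step` callback is a made-up stand-in for whatever you would pick between steps.

```python
def per_step_plan(total_steps: int, cfg_for_step) -> list:
    """One settings dict per sampler; each sampler advances exactly one step."""
    plan = []
    for i in range(total_steps):
        last = i == total_steps - 1
        plan.append({
            "steps": total_steps,           # every sampler shares the full schedule
            "start_at_step": i,
            "end_at_step": i + 1,
            # Only the first sampler adds the initial noise.
            "add_noise": "enable" if i == 0 else "disable",
            # Every sampler but the last hands on a still-noisy latent.
            "return_with_leftover_noise": "disable" if last else "enable",
            "cfg": cfg_for_step(i),
        })
    return plan

# Example: higher cfg for the first half, lower for the second half.
plan = per_step_plan(4, lambda i: 7.0 if i < 2 else 3.0)
for settings in plan:
    print(settings)
```

Other per-step choices mentioned above (SAG, PAG, controlnet) would sit on the model or conditioning inputs between samplers rather than in these settings, so they are left out of the sketch.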
That's why they've got one paw ungloved. Get both bludgeoning and slashing damage.
yeh size prompts work good
this text is really good WTF
is that flux?
This is stolen from reddit, anyone else tried images about gaming on flux?
yes
this has potential to be a ridiculous discord server bot. the mod can just prompt breaking news anytime someone is stupid.
yeah one of the first things i did. it works good for gaming. had a lot of screenshots in the data surely
is Flux text better than SD£ or about the same?
much better
Flux is the best text of any model I've ever seen, by a very large margin.
ah ok nice
yes! too bad its too big, and cant fit my vram
I don't do text but that's good to know
sd pound it
that is about one of the only down sides, its big size
Its way better, the text is not just "pasted on top" like in many sd3 generations (of course this is not all the time)
therefore, we NEED ... .eh, would like to have maybe a MINI-flux
calling SD3 as SD£ was actually a typo
but it looks like I am insulting them lol
was just an accident
oh yeah I hate the pasted on top thing
flux ....
WTF it can do a whole NYT paper with the newspaper name correct
that's so amazing
not bad for a rough lorem ipsum level layout
True. Design instincts.
sometimes I just read lorem ipsum instead of reading real human-written writing TBH
I forgot and lost the prompt, but tried to recreate from mind in flux-schnell. here it is.
2d anime style, hot mom standing surprised in the kitchen, large fluffy white dog sitting on the sink, simple anime style, playful text "SHIRO" written above the white dog, lazy dog
lol, the sd3-sd15 workflow can somewhat keep up! it even adds a bit of japanese (I guess...)
how much worse is Schnell?
It can't do much text, but it's a lot faster.
the time-quality balance is about the same, it's just very schnell
schnell means fast in german, for those who don't know
do you mean its the same quality?
no
