#๐๏ฝsd3
1 messages ยท Page 90 of 1
I'm going to remix my renix to add sale options etc lol
"You are a professional Amazon fake product JSON generator, your task is to generate a JSON for details about an Amazon product that doesnt exist, from a prompt provided by a user. Anything is possible and the products are not bound by limitations, so follow user instructions, not create toys or replicas of a product unless stated. The stars should not be floats, only integers in string, and they should be dependant on the product type. Make sure to mention the color in the image prompt. The product name should not be more than 10 words long. You must follow the JSON structure and only respond with the JSON and nothing else.
Here is an example:
INPUT: batman sofa
OUTPUT: {
"product_name": "Batman Cozy Flip-Out Sofa - 2-in-1 Convertible Sofa to Lounger",
"product_stars": "4",
"product_price": "39.99",
"brand": "Batman Furniture",
"color": "Black",
"about_1": "Superhero-Inspired Design: Shaped like the iconic Batmobile, this sofa brings the excitement of Gotham City right into your living room.",
"about_2": "High-Quality Materials: Upholstered in durable, easy-to-clean black faux leather with yellow accents.",
"about_3": "Comfortable Seating: Features plush cushioning for a comfortable seating experience for up to three people.",
"image_prompt": "a black batman sofa, there is a batman logo on the sofa, and batman heads are used as headrests, the sofa has batman sculptures"
}"
Needless to say, flux had some help
Math but words instead
My brain 
Flux dev really does super well with flags
Prompt: "A flag for Banana Land in 4:3 aspect ratio, featuring a horizontal tricolor design with green, yellow, and blue stripes. The green represents the lush rainforests, the yellow symbolizes banana plantations, and the blue represents the Caribbean Sea. In the center of the flag, a white circle contains a stylized banana bunch crossed with a machete, symbolizing agriculture and revolution. Above the emblem, five small white stars in an arc represent hope for democracy and the five main regions of the country. The flag has a subtle ripple effect to convey the nation's tumultuous history and ongoing changes."
OK so looking into it more (and asking people), it's more like flux+htnl+plus save as image, than flux managing that much text consistency
Boys gone fishing! SD3 via SwarmUI
Thanks but no need for workflow. I mean it failed in just about every single aspect of the prompt except for some of the text. But to be fair, Flux is failing to produce any text most of the time, regardless of whether Dev or the many quantized versions.
The prompt is actually interesting as it seems to produce the most astonishingly diverse results from the image generators. Which is curious as the prompt is not lacking in detail.
Ideogram was the one with the best text, unsurprisingly, but it actually flubbed the text 2 out of 4 times
Yes, I thought Flux was almost as good as Ideogram; but after dozens of efforts, I am less than impressed!!! ๐
Flux fails better at longer strings of text
It does do some things better, but those are measured only, IMHO, in terms of pure prompt adherence
Don't get me wrong, it is not a matter of one better. Ideogram has left me gnawing my knuckles more than once, and when I subscribed, you might find over a dozen renders (with 4 samples each) in my private area.
Perhaps all generators fail at long strings of text?
Flux is the crown prince, but Ideogram is king for now
I'v elooked at clouds that way is a lyric?
Joni Mitchell - Both Sides Now
SD3 in SwarmUI
I mean, let's be honest: Flux can produce some amazing and precise images, but it doesn't really have the 'artistic freedom' of its rivals. Filling in the blanks. I say young man at writing desk, plus text, plus lush fantasy forest with valiant knight and female mage casting spell, and its results are accurate, but not terribly inspiring. Some rivals will really put on a show like this:
SD3 makes this scene quite lush, and less than double transparency.
It isn't a knock per se. Each has its strengths
Flux tends to retain the double transparency look.
I tested SD3 Large, and whle the text was a clusterf***, the overall result was fun
SD3 and your prompt - more like a composite than a double transparency (bad text!)
Flux at least gets hands right 90% of the time now , SD3 no
Flux Dream Diffusion - very good transparency
A captivating image featuring a man seated at a desk, diligently working with a pen in his hand, his silhouette barely visible against the backdrop of a mesmerizing vibrant fantasy scene. The double exposure technique reveals a lush green forest, a powerful knight, and a powerful female mage casting a dangerous spell, all seamlessly blended together. The words "Imagination is your mind set free" are elegantly written in large bold letters on the side.
I "borrowed" this ๐
I get so many less bad hands in Flux I don't even think about the problem anymore
That's the new checkpoint? I tried it a coupe of times but got nothing that told me use it
Flux understands the transparency aspect. SD3 prefers to ignore it!
Flux doesn't know what defibrillator paddles are tho ๐
It works ... I haven't compared it to any other ... on my carousel (so to speak) are Q8-GGUF, nf4, Schnell and Dream Diffusion
I must admit it has me super super curious about the imminent release of Ideogram 2.0, apparently already in active testing
Any text generating AI app which somes up 90% of the time will be a game-changer! Watch Adobe or Canva bid big bucks for it (that's if they cannot develop their own technology for the same!)
here, if anyone wants to play around with SD3 Large for free: https://glif.app/@gliffyglif/glifs/clv488uy10000djtrx70u03no
Well, one I have not tried, thoug will today, is Google's new one by Deep Mind, Imagen 3
Ideogram 2.0
SD3 Large looks a lot like Dice_AI's Gold and Platinum SD3 checkpoints
Their 'paper' is ridiculous, and is just one more to show why THEIRS is the obvious best
Bear in mind, SD3 did the same, and so did Flux
It's all 'puff'
PR is big bucks - get there first - get there fast - and ring-fence your customers!!!
Meh, I will give you my two cents: wihtout questioning the integrity of the tests per se, if you only test what you want, and you happen to 'win' that match, well, you cannot lose.
Yes, large lies told often enough "become reality" - I'm not saying that PR is lying?
Flux's results were through the roof right? per their testing. I cam imagine that 'impressionist oil painting' and variants thereof, did not come up in any of those tests. lol
Just spin spin spin and see who falls into your web
It's just an example, and I can come up with dozens. but the point is clear. You cannot lose a comparison test if you don't test that type or prompt
Flux the more I use it the more I can see holes ... their text was initially impressive, but disappoints.
Prompt coherence at first glance was good, but after many uses, it seems to fade?!
Is that through a paid subscription?
Flux is brilliant. I won't say otherwise. Just brilliant. And it can be developed into LoRAs. I certainly cannot do that with Ideogram, Dall-E 3 or whathaveyou
not even. It is in testing
I will wait on the LoRAs - my PC would spend a month of Sundays making a chicken-feed LoRA ๐
So is Flux Pro you know. At least in theory. There are back doors.
Do you have a freebie glif flux-pro?
I plan to try Civit's tool. Apparnetly it does Flux too
yes, but I have better
etc
I think I'll wait - @hexed dirge is the non-pareil master of LoRAs - just hope he'll throw his hat back in the ring after exiting Civitai?!
I'm on Flux-Pro ... prompt = streamline moderne, oklahoma dustbowl ,depression, anxiety, shriek, shout, frail children of dust, peruvian arpillera andrea kowch nychos sthenjwe luthuli rogerio novais inuyasha ellis barnes marci mcdonald coles philips elizabeth catlett lugeja karl parsimagi terror fear anger rage homeless hungry
Yeah, I'm not a fan of such word salads. There is no actual rhyme or reason to them and you are just throwing random words to see if they produce anything.
I once believed that
Since the advent of t5, there is a certain understanding of mood and intent discoverable in the phrasing and grammar
The psychology of vocabulary and phrasing is ever more integral to prompting, signifying and shaping intent
Sure, ten of those words are synonyms: anger, anxiety, fear, etc. And the image will reflect it. But there is no specific visual being aimed at
This though "kowch nychos sthenjwe luthuli rogerio novais inuyasha" is just nonsense
Flux.Pro is wild - a higgledy-piggledy stash of disjointed images - no thematic coherence - good quality images nthough!
Andrea Kowch - amazing American artist; nychos has his own inimitable anime style; sthenjwe luthuli and rogerio novais make great afro-american art; and inuyasha the most feminine of anime images ...
I'm not saying the words don't exist, or names
Its how I test a checkpoint - just how deep and wide its net has trawled imagery - see if it picks up on these niche artists
Then test each one individually.
... if I had the wealth of limitless time ...
How else would you know if any name actually had an impact on the image otherwise?
It is not hard. Take a simple prompt, and say "in the style of"
Yes, but I'm so busy otherwise - if I get a slack day ๐
It's fine. And by all means just do whatever makes you smile. I was simply saying I was not a fan of word salads. But I have certainly indulged, though differently. Back when Midjourney 3 was king, it was common to just throw in the most obscure phrases such as "eternal weight of life's burdens" and just hit re-run as it came up with one mesmerizing image after another.
At MJ, one of my best prompts was "Stock Photo" - the results were/are amazing
eternal weight of life's burdens - just entered that into MJ ...
๐
MJ6 eternal weight of life's burdens
Very cool prompt - may I use this freely at all?
With some small scripting you can clearly see, who was 'fogotten' in Flux and who not:
A SD3 version is avail on civitai. The Flux version sits still here on HD.
bnb-nf4
imagine a parrot
Off topic sorta, but sd3 made the images lol
https://glifsite.replit.app/websites/PlayfulJaguar.html
Cute
Can anyone explain what distilled cfg is and why it affects so much art prompts?
Flux?
Yup... Flux.1 Dev
distilled usually means the model was made smaller for efficiency, and as a result lost some of it's variety. It gets faster to good-ish images, but there are less to find.
In principle it should be possible to do some finetuning while doing the distilling training, but im not sure if that is done
im sure they could make a flux-schnell-anime-distilled, flux-schnell-photo-distilled, etc
but there are already loras, so why would they do that
CFG is normally a technique for improving prompt coherence in diffusion. You basically run the diffusion twice: first generating an image with the positive prompt, then generating an image with the negative prompt (or just empty prompt). Then you take the difference of both images and add it on top of the generated image.
Normally, this would make disturbed images, because you cannot just take the difference between to images and add it. But in diffusion the process consists of many intermediate steps, and as more you reach the final image (the noise gets less and less) as more do postivie and negative prompt image look similar, such that the difference becomes 0 at some point
I mean, you know that, you use CFG yourself a lot
thing is: CFG has several disadvantages:
1.) it only works if you run the image generation in many steps. If you only have 1-3 steps, then CFG cannot work. That's why Turbo Models usually have to work without CFG
2.) it makes the generation slower, because you basically generate two images in each step, so you need twice as much time.
however, CFG seems to be necessary for Diffusion models to work. You cannot train a diffusion model that generates an image from noise in one step (that would be then something like a GAN - but GANs are known to be non-creative and can only produce very limited outputs)
after you have trained a diffusion model, however, you can use it to "distill" a new model that works without CFG
as Noedel said: distillation is usually used to make models smaller. But you can also use it to let models work with fewer steps (=turbo models like SDXLTurbo or Flux Schnell)
or to make a model work without CFG (like Flux-Dev)
That's said, you can still use cfg with flux (there are comfyui workflows for it), it just doesn't work as well and it doesn't respond so good to negative prompts
hello I need help. ForgeUI was updated tu support gguf. But I get "You do not have CLIP state dict!"
i dont think the gguf files have the tencs in them. forge has the drop down menu to add those . you need to choose which t5 and clip l you want to load
yeah that'll happen on older cards. 40 series have specialised hardware for dealing with fp casting. the hopper transformer engine. without that it's a pretty expensive operation to deal with the weights.
I'm with 4070 plain 12GB
weird. that should have hopper transformer engine and not slow down so much from quants. i'm not entirely familiar with gguf though. code for it may not tie into optimized libraries. casting weights to fp8 before gguf benefitted a lot from hopper.
Lol, they are just words. What is fun is seeing the creative results, and also showing that, at least with MJ, you don't actually need that stream of consciousness to yield ever original and creative results. It has been a longstanding feature of it. I mean, I entered simply 'chess and technology' into MJ4 at the time. Now think for half a sec and what kind of imagery it might come out with. and I will now show one of the images. WIth prompt visible so you can see
i have 16gb so i use nf4 and fp8. they're both pretty snippy. fits into 12.3gb though just shy
how much system ram?
64gb
you're set there. you got it fast with low timings?
Completely bonkers but also "how on earth did it think of such a mad idea?"
But sure, this was image AI 2 years ago
I get 1.6 with fp8 and 2.39 with gguf Q8
i think it's cool he asked. Prompts are the one part of these generations that we could potentially own the copyright on as a human creator. Not if we make them using LLM or if they're derived from other people's prompts though. Still, it's a big domain of copyright and asking is only polite and fair.
by the way flux in forge UI is something else entirely. It's way faster than comfyui for me. I couln't even run gguf or nf4 in comfy some days ago, but now in forge ui i can, and it's very fast. I'm even using sdxl in nf4. It speeds it a lot and worth it using for upscaling and inpainting now. I don't seem to notice any big quality changes in sdxl nf4
i mena memory timings. i'm just thinking that since you're at that edge of shared vram swapping, tightening up your ram timings may be a hot thing. like if xmp isn't on or you're running a default memory profile instead of the optimized 16ms timings . something like taht
oh ok
I was actually trying to come up with a logo for my YT channel "Chess & Tech", hence the simple text. I then told it I wanted a logo and even added, at a friend's goading, "in the style of Louis XIV's court"
Even today, that is simply not the sort of stuff I can get out of Flux, no matter how I prompt it
each is heading in their own direction
try schnell, it listens to prompts more. Maybe a dev+schnell merge could be better.
look, Flux does great, and I am not even saying its output is ugly. Just that this sort of illustration creativity is not its strength
I can't get nice results from schnell personally
I find no need with the release of the GGUF and NF4 builds of Dev
yeah I feel the same
why not fp8?
fp8 what? Plain Dev?
also the one in forge
I have 8GB Vram. it won't fit so becomes a snail
if you have cpu-z you can check timings this way.
which also , i'm realizing my bios must be haggard cause that aint my good memory profile
couple power outages recently so it might've reverted to fail safe settings or somet
I would feel okay using midjourney images as input to IP adapter
or for canny edge
Yes, they just released an amazing web UI in fact. It has been getting rave reviews
are the big paid services worth using
just midjourney and ideogram now?
or are there others
leonardo maybe?
thanks cos there was no way I was using the discord
The Discord is super practical, so not sure why that would be an issue
we're memory twinsies!
I'm kinda joking, it became a meme to hate on the discord interface
serving up models via discord is not that bad TBH its literally better than ChatGPT's GUI
there is the need, i have seen that schnell has more prompt adherance
its a worse interface than comfyui imo
yes. how many it/s do you get?
I can barely use the comfyui website at all
EDIT: meant civit ai not comfy
i don't believe using a chatbot as a UI is a "fine" interface paradigm and imo it's just a fad. chat bot assitants will continue but they'll move away from behaving like an app itself, and will instead move towards shwoing you how its done in an app
wait i meant civit ai
I would challenge that, but it is not important. If you are happy with your results then that is all that matters
not sure. 3-4s per iteration? i don't really care about it/s or s/it. i look at total generation time. 1024 images in 30 steps are typically 30-40 seconds
thanks
i guess my guessing math was wrong. thats more like 1-2 second
for what its worth I don't really think the noodles are a great interface either
I'm going back to pytorch lol
yes it's less than 2
they're a great interface for backend management. node graphs are really powerful fot things like shaders or video editing
I have not tried on local. I gave the plain Flux Pro result as it was faster for me
yeah I think for people doing creative work, node graphs are decent
this is dev local
Yes, I imagined.
for people just wanting a prompt to an image and some detailing tools, i think comfyui doesn't hold up. there's a reason the webui is stil teh vastly more popular project
I guess swarm will be good for those
swarm is a nice bridge
let me see what SD3 Large gives
breathtaking a stunning artist depiction of an alluring sexy phoenix girl with huge burning pastel wings in all the colors, inside a golden cage, looks a the viewer with a sorry face, demanding to be rescued. The scene is depicted with a sense of wonder and sadness, aethereal light shining on the girl and the cage . award-winning, professional, highly detailed, RAW candid cinema, 16mm, color graded portra 400 film, remarkable color, ultra realistic, textured skin, remarkable detailed pupils, realistic dull skin noise, visible skin detail, skin fuzz, dry skin, shot with cinematic camera
first one is dev, the other 3 schnell+dev merge. I think it's better at prompt adherance*
I was disappointed with diffusers
documentation is not great and its quite rigid
i should get one of these schell dev merges. they look worth testing balls out on
oh sweet its published now?
I think so
not 100% sure
people are using it anyway.
they may have pulled it early
this is a big step. coudl lead to less custom nodes for simple jobs. too many custom nodes being required is a security problem.
https://docs.comfy.org/essentials/execution_model_inversion_guide
https://github.com/comfyanonymous/ComfyUI/pull/2666
"merged" hurray
I didn't knew that one. Great! I really wished this feature for long time
they had to invert the entire execution model to get it there. so now comfyui starts at the back of the graph, works it's way to the front , and then executes the path it figured out. (i think. i probably got that all wrong). It should allow for a ton more creative node graph work to be achieved
conditional graphs would be nice. i've kind of thought too, like spreadsheet programs have, comfyui could benefit from multiple sheets in one workflow.
a lot of potential yeah
dataset building workflows. like ones that use ip adapters to create a plethora of images organized into folders to be used for training. That'll get crazy too cause now you can run a classifier on that process so that it'll keep making a single concept until it works. you could even set it up to generate many images with many different ipadapter settings, and then choose the best of the batch
This one is pretty good
Frazetta lora starts looking good.
i do like that one a lot too. like it's classical technology like bronze etching. early lithography kinda work.
could get it laser cut / hydro cut and have a keychain metal logo to. its an effective design
thanks for the data! ;D
Twas my pleasure
To be fair, it was for this exact purpose of a LoRA that I went about obtaining it
@hexed dirge you understand strong design! kind of people that will benefit most from AI tools. those who just want a model to make a logo for them won't be able to achieve what you do with your eye for design.
Though I had no idea you'd be the brilliant executor
thanks
The colors, shading, pen style, it is really special
tell me it wasn't just a random prompt you tried and you had a vision you were going for ? XD i wanna fangirl more (not a girl but thats the vibe i have rn)
Super impressed
I think it's already in Flux as so many things. You often don't need a lot of data and steps to uncover these styles
I will fanball
You never know.... It has the core info on people and whatnot, but it needs the LoRA to transform that into this
that's what i suspect to kai. flux is 12B parameters. these guys have 33m in funding. they have some world class consultants on their team. they're not phoning in the dataset with anything less than substantial. I do believe the beastie boys are within flux too. It knows something about them.
It's just the aesthetic distillation sort of galvanized it all with very strong guidance.
hmm, fanballs sound like they could get violent.

It is not that those are bad, ugly, or anything of the kind. It is just that Frazetta was a very famous pulp comic artist with a very recognizable comic signature style
Yes. I should try telling Frazetta
let me try
IMO, Carl Barks is THE definitive comic artist of the world. It's just Disney didn't allow him to name his work at all for decades so he's barely recognized. He basically defined Anime with his donald duck comics
comics go back waaaay before fraz
Not shitting on Fraz just saying
well Donald Duck seems to work
the problem is that the CLIP model knows who Franz Frazetta is, but it seems T5 doesn't know him
i've never been able to get other duckberg characters out of any model. Donald has always been a strong one. No launchpad, no darkwing, sometimes theres a little scrooge in there.
that's why Flux has such a problem to make his arts
Here s an example with his actual work from a comic series he illustrated
I assume that they just labeled their datasets automatically using some captioning tools. But these tools then not write "art by Frazetta", but instead write "comic illustration"
nice little comic strip about a true legend of comics
he definitely had his own ideas.... lol

haha, yeah, I haven't used most of these images
flux too seem having its own ideas
I think they would just destroy the base model lol

I mean, some of it is so chaotic, it would be hard to get anything comprehensible into the LoRA, but some is eminently usable such as:
yes you need a lora for that
Firefly and PS do an incredibly recover job though
I had it fix tghe contrast and colors and remove the text bubble
took a few seconds
yeah, I do similar: first train on the good images, then use the trained model to improve the bad images
but it's still tedious work
Compare the two. Before and after:
it absolutely is. And is why I gave you the 'raw dump'. I simply had not had the time to start the process, so had nothing else to give you. My load of Frazetta works was only very recently acquired
btw can you get images of homer simpson like a real man? I get only simpson images every prompt I try
Heh, but it takes more then reshade (and yeah, I know the reference) to remove the text bubble
Someone said the same thing about sd3 how midjourney could do that well and couldn't get it on other models
Photoshop is incredibly good with this sort of thing, but you still have to sit down, circle the text window, and look at what it proposes as a possible fill
sdxl can manage it
slider loras can help for these edge cases
syntehtic data!?!?! what do you wanna do! REINFORCE ARTIFACTS AND MODE COLLAPSE?! /s ||its funny because this is the over reaction i always see about synthetic data sets||
https://replicate.com/ostris/flux-dev-lora-trainer/train quality of life updates now in place
do they fixed their vae bug that messed up all images? X_x
AIDS machine?
and drunk.
gguf loads so quick and loras work with it
now... do I need to update something specific, or just connect the LoRA
will say this much: Flux is really strong with logos
the addon itself and comfyui (just in case you are on a specific commit that broke it)
loras work again in comfyui in general
I have the quantized K models working fine
even if you use fp8
it completely ignores the damn company name, but the image and circle shape are exactly as requested
circular logo of company called "Chess & Tech", chess and technology, circuit board, elegant, creative, HQ
You might say I did not tell it specifically to add text, but unfortunately, even when I do, I have seen it ignore such requests more often than I'd like
I love flux Dev
second time's a charm. Added the brand, but logo is much less interesting. Such is life
At home drawing pictures Of mountain tops With him on top Lemon-yellow sun Arms raised in a V And the dead lay in pools of maroon below
no deads tho
heh hehehe
Jim Lee style
๐คญ๐คทโโ๏ธ
cant get any ultra instinct gokus but there's this . this guy is coo
feel like he's about to rage on the red army
Robert's got a quick hand, He'll look around the room, but won't tell you his plan, He's got a rolled cigarette, Hanging out his mouth, he's a cowboy kid, yeah, He found a six-shooter gun In his dad's closet, and with a box of fun things, I don't even know what, But he's coming for you, yeah, he's coming for you, all the other kids with the pumped up kicks, You better run, better run outrun my gun, Photorealistic style with a balanced depth of field, adding depth and dimension to the image
which one's that?
Inkpunk Flux LoRA
Be careful out there... watch out for Shark Flora ๐
@sacred jewel your images are getting better deaily! ๐
You are VERY kind.. .thanks for the nice words.. .you are ALWAYS an inspiration to us all...
Is that the new RING Doorbell?
Can confirm that GGUF models are completely compatible with LoRAs
I had complained that this image started with the words "impressionist oil painting of..." and got this:
GGUF pure, produces the same thing roughly, I checked to avoid any possible mistakes.
Without changing a single setting, I added a LoRA with GGUF based on 'impressionism'. All else was identical. And lo and behold:
Workflow is in images for any interested as per usual
I threw in another based on Caravaggio and it yielded:
Which is great news since it means the VRAM crippled who benefit from GGUF models can also enjoy the full benefits of LoRAs
Is that Josh Hartnett?
Heh. I'm still plesantly surprised
With commercial models like MJ or DE3 to a lesser degree, you get spoiled into expecting a certain level of artistic output, so it really grated on me that even the most innocuous and basic ones were censored
I was getting exceptional Norman Rockwell back in 2022 already
So seeing photorealistic style images from the words "impressionist oil painting" was a bit of a "are you kidding me?" moment
So what is GGUF? DId I miss a version? ๐ (works with Dev loras?
It is a binary format that drops data in exchange for compactness while trying to maintain fidelity
I believe it came from the Llama C++ project...
How are the images, compared wtih Schnel,l Dev , and Pro?
It produces more accurate and better results than NF4
I often get near mirror images of Dev
Pro is a differnet beats
beast
Some here have stated they LOVE the results and it really shines on small VRAM setups... I am not one of the testers as Dev works for me
and since I cannot do an apples to apples comparison, cannot be judged
GGUF works really really well
I cannot mass produce images at super high speeds, granted, but one of the main caveats of Pro and NF4 is the lack of compatibillity with LoRAs
in fact, even though people were (wrongly) yelling that Flux of any flavor would never work with LoRAs, this was later picked up again with the quantized builds like NF4
GGUF seems to have not gotten the memo
i.e. it works with the Flux LoRAs just fine
which is a boon since pure Dev is out of the question on this poor laptop
Oh good, I eed to test my loras more, without using up civitae buzz!
You have the same oe I do though right? I just set it to run, clean the house a bit, set another to run, go do errands, set another to run, take a nap ROFL
Actually I usually multitask in another window and it takes 1/2 hour that way ROFL
20 people have downloaed each of my Flux Dev loras!!
Yes, as I recall we were two peas in a pod in terms of hardware
You can actually just queue up as many asd you like while it is generating in the BG... and then walk away or sleep on it.
Lately, I just use a text randomizer node and a a lot of my prompts and just queue up 200 inferences and go to bed.
In the morning, I have a bunch of new fresh art ๐
I think you're way ahead of some. I have seen posts here from folks with 20 series cards and even someone with a 1080 ๐ฒ
Maybe so, but I meant you and me specifically with a laptop 4060
Gotcha...
And that 4060 is a laptop 4060 which is even less than the "number model" would suggest.
When I said it works on my system I meant with the most simple of workflows ROFL. Though now I'm tempted, are you willing to share your flux one with the prompt randomizer? I love my SD3 one, wake up in the morning to so many images!
My own laptop has an A3000 GPU and it is painful... BUT still MUCH better than my MacBook Pro which is an older M1 which SUCKS DIRTY ARSE with any diffusion model... even TURBO is slow.
There's glif, mage, and HF...
and fluxpro.art
Sure, I'll share mine...
Let me know if it is too confusing. I'll hit you up via DM to not clutter this thread too much.
Are you creating dozens of loras? ๐
I plan to try CivitAI's LoRA generator to give it a spin
Oh crap, forgot that I was waiting on that darn countdown timer on that site lol. No lora support there though
hahaha... Only done ONE Flux LoRA and three SD3 LoRAs ...
None of the Flux Pro offerrings support LoRAs afaik. Yet.
I don't count Twitter's which is not a LoRA, just an uncensored one, well, other than the obvious
It does work, I have made a buch, with each model. I'm going to try a real method as soon as I have some free time to compare
Glif will sooon, I'm waiting
I have been pleasantly surprised at how well Flux handles the concept of logos in general
got a few prime candidates for a replacement of my VERY old one made with MJ4
Heh. Night owl here. Even though I'm an hour ahead if not mistaken. 11:40 here in Rio de Janeiro
7:48 here
west coast?
Cool. I think the furthest west I ever lived was in Wisconsin for college
Furthest east would be Germany
um vizinho...
Oh boy, what a moron. The guy who posted a Frazetta LoRA must have scanned dual page images and left the divider mark.
Mora aonde?
RJ
Olha. Copa aqui
Nรฃo tรฃo perto. Vila da Penha
Ufa, bom, tb nรฃo chega a ser distante como Rondรดnia..... kkkk
(he lives in same city)
Wisconsin doesn't count as west! ๐
Flux is pretty good at wirewrapping, and fingers!
@dusky thistle "Elon Musk eating hotpot in China: Musk is sitting at a traditional Chinese hotpot table, dressed in casual clothes, smiling, and focused on the hotpot. The table is filled with various hotpot ingredients like sliced meat, vegetables, and tofu. The hotpotโs broth is steaming. The background features a Chinese-style restaurant with calligraphy or traditional decorations on the walls. Natural light illuminates the scene, creating a warm dining atmosphere."
Here is the image you requested.
Here is the image you requested.
City96 updated his GGUF nodes to now support gguf versions of t5! Shit works really well. Q8 t5 and Q8 flux can both load in 32gb ram without rolling over into the pagefile during the initial loading phase
He put all the quants on his HF as well
It is west of NY or Rio. ๐
BALLS NOICE
anyone play with this yet? https://github.com/unity-research/IP-Adapter-Instruct
Wow!
A Wet One! DD-Platinum checkpoint from @bitter hearth
Huge dependency conflicts! https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts
A little. I got really bad quality results ๐
Flux dev
damn that goes hard
Very nice, how did you get the Dali-esque look?
Certainly a page out of his color palette
SD3. I know this is the flux channel but... lol
I promoted picasso for the sun and Dali for the moon
Why tf won't autocorrect learn the word prompt?
I had Flux churning 30 logos based on a prompt, to revive the corpse of my YT channel. ๐ It did not do badly at all. Some things it got consistently right. Shape, color, and even the chess and circuitry. Some was hit and miss, notably the two-word brand name with an ampersand separating.
the last was not its biggest issue. I'd say about 50%, maybe less, got the text right
and I am forgiving its curious 'addons'
still, I got a good half dozen serious candidates to choose from, so cannot complain
Your YT Channel, obregado?
Yeah, I got involved in another job which completely wiped out any time I had for it, so it literally was left for dead for 18 months. Not a big channel of course, or the decision would have been harder, but oddly the subscribers increased 80% in that period, god knows why or how.
Hi, I'm Albert, and in this channel I will cover chess in all its flavors and variety. I have been a player for over 30 years, and a chess writer with thousands of published articles to my name.
I am now returning to active play. So you will find a series called Study Chess with Me, in which I invite you to join me on the journey to share and ...
Anyhow, the logo was more a picture than a proper logo, and was made with MJ4. Fun, creative, and I patched on some half-assed brand name with my non-existent graphic editing skills at the time.
Sadly. YT logos and whatnot are so small, this became more of a colorful blob at some sizes
Awwww. Well, chess is fun. And can be enjoyed at any level, but I won't preach here. Has been basis of my livelihood on and off (mostly on) for decades
Writing (mostly), editing, promoting, and the list goes on
Pro had done the best text for me
The prompt, for transparency's sake, was: "circular logo of company called "Chess & Tech" with its name clearly visible, chess and technology, green circuit board, elegant, creative, HQ"
Here are a couple of successful ones and some of the misses, some of which were really odd. The last one I may do some work in PS to see how it looks. Aside from the weird C and T, it looks nice.
the last one is interesting: it seems to use the king as a T
It does, but I'm unsure about it as a rule. The odd C and T make if very crowded, which may flop as it shrinks in size
Cascade
How tf do you run that in comfy? (Without the 100 installation steps)?
The w/f is in these png pictures - try it - and that Clownshark, it's his w/f
The 50 missing nodes even after I hit install custom nodes button ๐ฆ
SD was easy to install, cascade looks impossible ๐ฆ
You must d/load these nodes made by Clownshark - you also need ComfyrollStudio, WAS Node Suite, and Impact Pack https://github.com/ClownsharkBatwing/RES4LYF
It's the cascade install that seems impossible
Be warned - it can take up to 9 minutes/image using 8Gb VRAM/64Gb RAM
But they're fine images all the same
So the same as flux dev ๐
I hope mage or glif gets cascade soon, my computer is too slow lolol
Cascade is in the past - it used to be on here - but SAI re-rented the space on the server - so they left!
You can always play around with Aura Flow
cascade is too big for me, but try the workflows from https://comfyanonymous.github.io/ComfyUI_examples/stable_cascade/
Flux dev is better then anything i used, i mean JUST look at that map of Europe!
Its not even the main focus of the image and still got it right
uhm
Prototype for a vacuum cleaner?
Do you keep space-juice in that?!
The cow that jumped over the moon "charcuterie version?!?!?"
oh yeah, I see the cow!
dON qUIXOTE? oR dONKEY HAUGHTY??? ๐
meanwhile sd15 still produces nice waifus
Superlative
Is the w/f in the png at all?
Windows Strips it when I copy/paste the image ... I don;t drag and drop the original. So no... But I will post the workflow.
afk
google imagen 3
surrealistic landcape with dark monster with big teeth, horror, eerie, uncanny, liminal spaces, molten clock, 1940ies, realistic dark lighting
flux-schnell
surrealistic landcape with dark monster with big teeth, horror, eerie, uncanny, liminal spaces, molten clock, 1940ies, (realistic dark lighting:0.8)
sd15 custom merge
Sd 1.5
Whose nose is that?
her arm disappears
Thank you for using comcom analytics.
"comcom analytics" supports all community managers (moderators and server owners) by stats, visualization, and analytics.
If you have any questions, feel free to ask us!
Your dashboard
Help
Support server
Other languages
en: help
ja: help Japanese
iphone 25
The upcoming world future (Horror theme)
What happens when it reaches 0?
AuraFlow0.3
I don't think it's an extension. it's own python project and should have it's own venv. not sure why dependcencies are conflicting.
@vital crag i was hoping to play with the face swap aspect, but it's sd3 so people are mostly bad subjects to swap to
This is their requirements.txt:
einops
numpy
Pillow
safetensors==0.3.3
torch==2.3.1
transformers==4.42.3````
can't see a conflict there, provided it has its own venv as iceycold said
if you're trying to mix it with the rest of the ecosystem that's a different matter
Native res gen from my SD 1.5 model ZootVision:
AuraFlow requires torch 2.4 and cu124
Trorch 2.3.1 for XFormers
ah thanks
you were right in that case
its gotta be compatible with the actual image model lol
Thanks. It just has a lot of native training at 1024px and higher basically, I started it essentially as an experiment to see if you assemble an SD 1.5 model that could just sort of pretend to be XL by iteratively training Loras at native 1024px and merging each one into the previous version of the model lol. Turned out to work great. Probably around 40,000 images worth of data in there on top of my baseline as of the most recent version I released
CivitAI trainer made this very easy, can do 1000-image Loras at 1024 / batch size four and have em back in just a few hours
wow yeah that's amazing
Probably hardware requirements was why no one did this early on I guess, like it wouldn't have been possible if I couldn't just pay some buzz to run it all on Civits crazy enterprise hardware lol
they're turning out to be a good choice for Flux Loras too I'm finding, can use settings I'm pretty sure like nobody or almost nobody can run locally, in terms of dim / res etc
Are there any Auraflow0.3 TensorRT setups? Or even FluX?
god damn it's hard to not switch to flux with all these gguf optimizations.
I've released three Flux Loras so far despite not being actually able to run it locally in any reasonable amount of time lol
Just use them and train them all online
no idea if comfy ui supports torch compile but torch compile gives a 4x boost in speed(depends on gpu) but takes like 10 minutes to compile first.
I have torch and cuda compiled together via pytorch.org
python projects don't mix. it's one of the biggest headaches with python. dependency hell.
I don't think we're ever going to escape python at this rate. We might as well just all wear satin and grow horns so we fit in.
well there's a second, almost entirely separate, machine learning community for C/C++
AuraFlow0.3
yeah potentially could drop their own "torch" implementation and that would really start an avalanch
no, not install cuda or torch, the link you gave is correct, that is torch compile but you need to compile the flux model. In diffusers all you have to do is
pipe.unet = torch.compile(pipe.unet, mode="max-autotune")
or pipe.transformer if its transformer type model.
I wouldn't really know how to do this?!
Go to Python ... import diffusers ... then pipe.unet = torch.compile(pipe.unet, mode="max-autotune")
and start ComfyUI?
torch is not defined
just do
import torch
before you do torch compile
wait this is auraflow?
looks amazing for auraflow
that's really great news given how nice auraflow license is
yeah I believe comfy ui has a pretty different way of loading and inferencing models but torch compile should work.
if you can, go to your comfy ui repository(not githubs) and go to comfy/ldm/flux/model.py
https://github.com/comfyanonymous/ComfyUI/blob/master/comfy/ldm/flux/model.py
and basically in line 144(on your comfy ui repository not githubs) edit it so it becomes
@torch.compile
def forward(self, x, timestep, context, y, guidance, control=None, **kwargs):
bs, c, h, w = x.shape
add the @torch.compile part thats all. For other models like auraflow it should work too if you add it right before the forward function.
yeah image quality increased so images definitely look a lot better but sadly prompt following and text rendering decreased by a pretty considerable amount. Still a very nice model.
as far as I know its the best apache-style license model
Schnell is better imo(faster and better image quality, prompt following, text rendering) but auraflow is definitely great.
which is faster BTW
Schnell or the more compressed dev quants?
Another one
Schnell is way way faster. Quantization only slightly improves the speed. Depends on how many steps you do with schnell but its like 5x faster or so.
but schnell's quality is the worst
AuraFlow0.3
so i guess people are able to train schnell with this adapter thing now? thought it was imposssssible
looks like someone came up with a novel way
i've heard rumors that AF team are going to reevaluate after 0.3 and figure out new directions. Flux really shook up their future vision and now they're wondering if they're redundant. might pivot to other research to crack this new model open since training AF is more expensive than they were anticipating (speculation)
The person showed me a screenshot of something simo said in a discord chat, but of course i can't find it now and i couldn't vouch for it's credibility in the first place. I haven't seen any official indication that AF is deprecated
Not sure how it got up there in one piece, but em kay...
Flaming ballz
just prompt?
PLUS a Flux.1 Dev LoRA called "rbcharc0al"
thanks.
@bitter hearth
Love the sun on this. Great colors
So I put the new AI by Google through some of my basic prompts, and in some it was a homerun, but not all. It is all the same a new player and exceptionally powerful image AI for sure.
It is very capable of handling text, even long text, but not revolutionary. I gave it a tough assignment and it nailed it in one of the four samples it gave me
Very impressive
it completely fumbled the storm ina teacup in such a bad way I was really taken aback
I tried three timesd with some prompt adjustments
Got the text ok, but the ship and sea seem sort of pasted above, and not an organic part of the image
I asked it for a satirical comic of an artist at his easel painting a scene with chess, and it was the best result I have seen to date
Who's training lora for flux?
The world map on a cup of cappuccino was just bad.
I also asked it for the 'impressionist oil painting of two young men playing chess ina park, and it did fine' Nothing special
Not sure why they all look almost Victorian
but ok. I also asked for two stylized cartoon images of the very same topic, and it did well
Lastly I asked for logos, a topic I was working hard with Flux. It did fine and it was elegant, but not terribly useful. Might be small sample size but I did ask for several
For whatever reason, they all came out as if poorly lit in a dark room
me,
but I train locally using SimpleTuner
drinking alcohol bad
so drink apple juice
good 
Is this from a particular artist?
a lora
I know, but based on....?
well ...... I just create a style inspired by sci-fi comics with a long description, some time ago in MJ. After that I used to train SDXL Model and then Flux
But Flux images comes out 100 times better than SDXL and also original MJ
Not a surprise. I mean we have never had such a powerful base to work with
so imagine how amazing the LoRAs can be
still, they seem consistent in style, so I wondered what artist or comics the LoRA was emulating
I have a 3090
so 24GB
maybe but 12GB no way
surprising - I never expected that you trained on synthetic data
?
oh yes. I don't use copyrighted images
I mean that's a good thing. It just surprising as the art style feels so consistent and unique
I'd like to publish them with a disclaimer not using for porn but in the end I think they'll flood me that way
let's see if it work
I loved this one for sdxl
haha, bow and arrow - impossible for sdxl
also the lora is 18mb, so probably only rank 1 or 2
amazing what Flux makes out of a rank 2 lora
