#sd3
a lot of people find it to be good nuff. I find diffusion models to be superior with the fine grain details. SDXL and others. Flux showed up and absolutely obliterated the current landscape in that regard.
nice thing about topaz is it's ready to go for a print shop to drop into their tech pipeline. it's not the best way to go about it though, and since we're at the bleeding edge, why rely on legacy solutions?
because it's not a legacy solution, and they are fairly bleeding edge - they update their models constantly - and they are industry standard
and most people are not doing anything that requires even that much post production
adobe just put video into firefly btw
they're a commercial product. cutting edge is their best hope. Since they don't release their R&D openly, the bleed they have internally is cut and not used by us. Potentially worked into a future cutting edge product.
It's not diffusion though. it's still a gan. They've really made one that suits studios well, but it's not bleeding edge capable
not used by you, but used by a very large number of members of the community, and GANs are actually better in a lot of ways than diffusion models
the nature of open source is nightly releases. or pull requests merging in. the "cutting edge" is the stable release
and? i don't think the question had anything to do with 'bleeding edge' just 'how can i successfully upscale flux, i can't get it to work in comfy'
i know why they're used. i use a gan to boost the image before i do the diffusion denoise on it. they do well. but ultimately, you zoom in on those highest details, it's recognizably topaz
not everyone wants to be on the bleeding edge - they just want it to work
you assume a lot about what i'm saying and where i'm coming from and what i use and how i used it. please stop following me around countering everything i say
yeh i covered that one chief
i'm not following you around - and you jumped into the conversation not me
so you're countering me
[#general-chat message] just matching your energy.
i was giving funnyguys advice to listen to that uses "free" tools. not proprietary $200 tools. wasn't any need to argue any of that nonsense and chaos.
i dont know what purpose there is in having a fanboy discussion about a 3 figure priced tool, when you could talk about its capabilities instead
and it's good advice, it's just not necessarily going to do him a lot of good - and i did suggest the free image upscaler on capcut's magic tools page as well
new features don't really matter at all, when diffusion gets the highest zoom details better
well perhaps you could trouble shoot his workflow then, as he isn't successfully getting that
i think he got it covered. experimentation is a process that has to be done. i'm a big believer in allowing people to fail until they don't
he wouldn't be asking for help after failing enough to get frustrated, if he didn't need help
most people just give up
my advice would be to care less about the original composition. It was generated seconds before anyways. it's not sacred. it doesn't need to be precisely exact. Generally close to it is nice though
i mean, those people, if i try to help them they tend to give up too
?? i can understand why you might not feel like helping people if they just blow you off and give up anyway
but that seems a little odd
if i see someone struggling with a new thing but persisting, i wouldn't say "try doing something else instead". I actually respect the hell out of the struggle
the struggle is real
see there you go, acting like you've got me all figured out. I just have a very different way of "helping" people than you might
why is it that you have to, so frequently, drop a negative comment and direct it at someone?
dont try to understand me. you're jumping to all the wrong conclusions. i'll tell you right now, you should just stop. Sometimes, people do need to be told when trying something else is a good idea.
i'm not trying to understand you - just have a neutral, non-argumentative conversation with you.
i never said i don't like helping people. so, you forced me into this position by saying "i understand that you dont..."
or maybe you just took it that way.
right. because there was another take away
sure there is. there are as many different ways to take a statement as there are human beings on the planet
those are really cool
Thanks...
On the large one, it is the exact same prompt, but just 1920x1080 and 50 steps... Sometimes I get lucky straight out of Flux
"the offense you took to my words isn't valid" is all i'm hearing. I've heard other people act that way before too. I don't think it's valid. but it's negative to point that out lol.
No one wants this drama. upscalers shouldn't have this much controversy. Diffusion models beat gan approaches in fine grain detail. that's just facts.
i'm out
oh and finally messing around with the max and base shift (as if I know what the heck I am doing)
might be what you're hearing, isn't what I said. there isn't any drama - you stated some good technical information, we discussed it. you're apparently angry for some odd reason.
[#general-chat message] you tend to be hostile towards me off the line. these moments persist long past you've forgotten them.
now, i'm out
depends on the scale
and if you've trained a lora for it
i can generate directly at 1920x2560 tbh
no upscale step or tiling
With the same level of detail as 1MP? I've tried generating at ~2MP a handful of times and I always got blurry output. I really like 4k+ generations, so tiled upscale works better for me. The Mixture of Diffusers method produces some nice results, but it's about 50% slower than SD Ultimate Upscale, unfortunately.
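For anyone wondering what "tiled" means in practice: the large canvas is cut into overlapping boxes that are each denoised at a size the model handles well, then blended back together. The sketch below only computes the boxes. The 1024 px tile size and 128 px overlap are illustrative assumptions, and `tile_boxes` is a made-up helper, not the code used by SD Ultimate Upscale or Mixture of Diffusers.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def tile_boxes(width: int, height: int, tile: int = 1024, overlap: int = 128) -> List[Box]:
    """Cover a width x height canvas with overlapping tile-sized boxes."""
    stride = tile - overlap

    def starts(size: int) -> List[int]:
        # One tile is enough when the canvas is no larger than a tile.
        if size <= tile:
            return [0]
        positions = list(range(0, size - tile, stride))
        positions.append(size - tile)  # last tile sits flush with the edge
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


boxes = tile_boxes(3840, 2160)
print(len(boxes), "tiles, last one at", boxes[-1])
```

With these assumed settings a 3840x2160 image needs 15 tiles, which is one reason tiled passes take a while.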
Wow, SDXL sucks beyond belief!!!!
I was too busy using SD3 and Flux I'd forgotten how utterly horrible SDXL was/is. Fortunately there are hundreds of SDXL loras...
I guess I've been spoiled by Flux!
Prob with 20 or so loras..
never learned how to use SDXL?
SDXL's anatomy is atrocious! We are talking SD3 ladies on grass level lol
yup. never learned how to use sdxl
I thought I had, but then today I thought I'd try to create an owl lady riding a horse.....
I've always used loras or checkpoints with SDXL....
ah. so you were using base sdxl?
A pretty terrible SDXL checkpoint. Well terrible for what I'm trying to do. I'm going to resort to the checkpoints I'm familiar with and talk them into what I need lol
here's a starting point
grab the full sized version and the workflow's in it
I also tried SDXL without any loras or checkpoints, and it reminded me of SD3 laying on grass images
well it's a base model...
Thank you very much
Why was everyone laughing at SD3 in that case?!
Never used a LoRA until I tried one a couple days ago with Flux.
i'll DM you
speaking of SD3 - here's a good SD3 workflow for you to play with
Now SD3 I fortunately haven't had many problems with.
You never used SDXL loras?! I have an entire HD full! (but none for women lololol)
Some of the best loras and Checkpoints are SDXL ones
a lora for men works well if you use a female name in the prompt and the pronoun she
I'm realizing that instead of just trying to remember which images contain which workflows, that I maybe should organize them, save them, and put them into folders!!!!!
How do you all organize your workflows so you can find them easily?
like that
Very detailed too!
All I need is an owl lady riding a black horse, with a white bg. You can guess how they keep turning out lol
That is some fantastic organization right there.
I should do more of that with my workflows... My images are pretty organized though.
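One way to stop relying on memory for which image holds which workflow is to pull the embedded workflow out of each PNG and save it as its own JSON file. A minimal sketch using only the standard library, assuming the image was saved by ComfyUI with the workflow stored in a PNG `tEXt` chunk under the keyword `workflow` (that keyword is an assumption here, and `export_workflow` is an illustrative helper name):

```python
import json
import struct
from pathlib import Path
from typing import Dict, Optional

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data: bytes) -> Dict[str, str]:
    """Collect the tEXt chunks of a PNG as {keyword: text}."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks: Dict[str, str] = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk is: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            keyword, _, text = body.partition(b"\x00")
            chunks[keyword.decode("latin-1")] = text.decode("latin-1")
        if chunk_type == b"IEND":
            break
        pos += 12 + length
    return chunks


def export_workflow(png_path: Path, out_dir: Path) -> Optional[Path]:
    """Write the embedded workflow (if any) to out_dir/<image name>.json."""
    raw = read_png_text_chunks(png_path.read_bytes()).get("workflow")  # assumed keyword
    if raw is None:
        return None
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / (png_path.stem + ".json")
    target.write_text(json.dumps(json.loads(raw), indent=2), encoding="utf-8")
    return target
```

Running `export_workflow` over an image folder into per-model subfolders (for example `workflows/flux/`, `workflows/sdxl/`) would give the "organized by model" layout without touching the images themselves.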
recent research has found that a highly organized lifestyle is associated with an increased risk of dementia
as is picking your nose
i have to stay organized, it's too easy to get lost and confused otherwise
confusion helps build new neural pathways and keep your mind fresh as you age
recent studies have found the researchers are out of good topics to get research funding for
so does reading upside down, writing backwards, and doing other unusual, out of the norm things
however, confusion, and especially frustrated confusion, are especially neuroprotective
yes, well - when it turns to irritation and frustration, it's lost its beneficial effects
(and i get plenty of frustration from other things)
also - always boil your tap water before you filter, then drink it
describe for me what an owl lady looks like
Not organizing my workflows sorta worked, until we now have 4 models to use. Even just organizing it all by model would help.
that's not the description of an owl lady
i've found it's easier when you just mash the keyboard when you save a workflow
the length and content of consonants vs vowels can help indicate the mood and style of the workflow
Learning a few model versions each day just has to help as well. What was that most recent faster one called? lol
and in place of organizing files, i've found it's better to just shit them all into an external hard drive and keep buying new ones when they fill up
plus, it's neuroprotective
is Seagate giving you a kickback for this promotion
poke
So far Flux is the best at separating an owl lady and a horse, and giving a white bg...
also, if you save workflows with random file names every time you realize it's an important one
statistically, the odds of finding that important workflow become pretty good
especially if you scatter it across a lot of different folders
becky - describe for me what an owl lady looks like
The lighting
made a custom node so i dont have to change the prompt much during lora testing, might add it to the custom node library in the manager at some point if others think it would be helpful
Yeah you need to train a Lora though
is SD3 new? better than XL?
It is newer than XL, but has some "issues".
Definitely
flux is really good at the tiny details
you can get good tiny details on SDXL like that but only if you downscale from a larger image
Friday has arrived.
Reminds me of syd Mead style.
Oh yes
Nope, not who I was thinking of
He did Blade Runner and Tron visuals, but it's not like those.
A vector-style illustration of a male and female Polo Ralph Lauren bear playing pickleball, wearing an athletic uniform, vector style
"numbered step-by-step drawing from sketched pencil outline to drawing of an owl on a tree branch"
(flux schnell)
That looks fantastic. If you don't mind I am curious to know what you used for a prompt for that one?
A chaotic doodle that forms the shape of a child curled up in a corner. The entire drawing looks like random, messy scribbles, but upon closer inspection, the scribbles come together to reveal the shape of the child and the corner they are in. Around the child, the chaotic thoughts are part of the scribbles, with erratic lines and abstract shapes blending into the figure, symbolizing the mental chaos. There is no pencil effect in this version, making the doodles pure and simple without any textured shading.
...
A dynamic, angled view of an RGB-lit gaming keyboard and mouse, placed on a desk setup. The perspective is diagonal or low, giving a dramatic and futuristic feel, with the lighting from the RGB colors (blues and purples) standing out against a dark background. The keyboard and mouse appear large and prominent, with reflections and shadows creating depth. The surrounding elements like the monitor are subtly visible in the background, adding to the sense of space and modernity. The composition is sleek, minimalist, and optimized for an elite gaming setup, with a cool, high-tech atmosphere.
DallE Theme of the Day!
Is there anything available that is capable of extracting prompt data and or tags for an image? Also is there anything that can take an already created image and describe it properly to create a new prompt? Even for NSFW content?
Ollama and/or Florence2
Are those custom nodes?
I'm not sure about tags, but for descriptions the best local model I've tested (and run on my gpu) is minicpm-v:8b-2.6-q8_0 on ollama, but i think florence is better
was gonna say florence
Florence2 img2img then save the prompts generated to text file
Do you know if there is a link to a workflow that has that already working?
Try my Ollama setup - it can generate separate text files/prompt - or join them all into one massive text file
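The "separate text files or one massive text file" idea is simple to reproduce outside ComfyUI too. A minimal sketch, assuming the captions have already been generated and are held in a dict of image name to caption; `save_captions` and the `all_prompts.txt` file name are illustrative choices, not something taken from the workflow in the PNGs:

```python
from pathlib import Path
from typing import Dict


def save_captions(captions: Dict[str, str], out_dir: Path,
                  joined_name: str = "all_prompts.txt") -> Path:
    """Write one .txt per image plus a single file with every caption."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for image_name, caption in captions.items():
        # a.png -> a.txt, holding just that image's caption
        single = out_dir / (Path(image_name).stem + ".txt")
        single.write_text(caption.strip() + "\n", encoding="utf-8")
    joined = out_dir / joined_name
    lines = [caption.strip() for caption in captions.values()]
    joined.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return joined
```

Keeping one caption per line in the joined file makes it easy to feed back in as a prompt list later.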
This is Ollama img2img
I updated ComfyUI and my Florence2/Flux workflow broke!
My Ollama w/f does work - it is embedded in the two PNGs above ...
Thank you. Does it matter which SDXL models are used?
No, if its only to harvest the prompts.
But you can use SD3, Mobius - if you can get it to work with Flux - let me know!
If you send me your broken flux workflow I might be able to make some changes and get that working.
OK, I'll keep that in mind. Thanks
What folder do I put that Ollama model in?
Thank you...
I believe it's this one... not sure because I use a large prompt file and randomly pull a prompt. so it could be this one or the next one that was pulled.
(((Ultra HD quality details, realism, ideal))), typography dark moody atmosphere, zentangle In the black void, a white porcelain female face with geometric gold patterns on it, navy blue background. Intricate, abstract, monochromatic, patterned, meditative, highly detailed, dramatic, mysterious, the moody atmosphere of navy blue, the soft misty atmosphere of Russ Mills, the gritty tones of navy blue of Dave McKean's collage work, dreamy and Whimsical magical realistic illustration style navy gray with pink elements of dripping paint - hyperrealism, surrealism. Extreme close-up, poster, fashion, 3D rendering, illustration, typography, dark fantasy, graffiti, painting, film, photo,
Not only it's Friday. It's the 13th. Well dang.
It's like the final boss of Fridays I guess.
Custom nodes
Go to ollama.com and it'll tell you how to setup ollama and d/load some llava models
Ollama img2img
I'm back had to pick up my son from school. Do you recommend a particular model?
Llava 3
This one?
Yes
kewl... Also I may have a way to get that florence2 workflow working. I found your workflow in an earlier image.
I updated this morning so Ill see if mine is broken too. Once I get it set up.
As I said, my Florence2/Flux w/f broke when I recently updated ComfyUI
Hopefully it won't be too long B4 its fixed
Ollama
.5
Lol I just realized the workflow I found of yours wasn't Florence2 Img2Img. Looks like its Text2Img
img2img
Does this Ollama LLM need to run as a separate server or does it just run right inside ComfyUI with no external setup?
separate app running in background
Ollama is software that can run many different LLMs
Initial onetime setup is explained at ollama.com (which is external); and then when using ComfyUI it self-installs
Ok so I got it installed but it doesn't have a UI How do I get a model downloaded into it?
At ollama.com go to Models and select whichever model u like and d/load
llama3-8b-text-q6 is good
llava:7b-v1.6-vicuna-q2
so it looks like it does a fresh install every time I want to use a new model?
Everytime u d/load a new model it takes a little time. But once d/loaded, it is loaded into ComfyUI instantaneously each time
So just d/load the one Model File for now
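Because Ollama runs as its own background app, it can also be queried directly over its local HTTP API, which is a quick way to check that a model actually receives the image before wiring it into ComfyUI. A minimal sketch, assuming Ollama is listening on its default address `http://localhost:11434` and that a vision-capable model (the `llava` tag is used here as an example) has already been downloaded; a text-only model will not look at the image. `build_payload` and `describe_image` are illustrative names, not part of Ollama or any custom node.

```python
import base64
import json
import urllib.request
from pathlib import Path

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local address


def build_payload(image_bytes: bytes, model: str = "llava",
                  prompt: str = "Describe this image as a text-to-image prompt.") -> dict:
    """Build the JSON body: images are sent as base64 strings."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for one complete JSON reply
    }


def describe_image(image_path: Path, **kwargs) -> str:
    """Send the image to the local Ollama server and return its description."""
    body = json.dumps(build_payload(image_path.read_bytes(), **kwargs)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Calling `describe_image(Path("input.png"))` only works while the Ollama app is running; the returned text can then be edited before it goes to the txt2img side.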
Ollama img2img
the one you suggested earlier llava-llama3 Is that uncensored?
I don't know ... but you can always try?!
I mean, I have some excellent vanilla (SFW) prompts; yet some unprompted bare breasts do appear from time to time
I ended up going with https://ollama.com/mannix/llama3.1-8b-abliterated
It seems to be working.
@noble coyote Does the Florence2 i2i Workflow work in the same way where it also needs an external install to run?
No, install into Custom_Nodes direct from Github
There is a Florence2 node, and a Florence-2 node - its this last one that I use
Ollama img2img
Whats mobius?
A checkpoint https://civitai.com/models/490622/mobius
It will redefine your reality
So with the ollama workflow. It doesn't seem to see the actual image. It still wants a description provided by me and guidance for what I would like the outcome to be. I'm not sure if I am missing something but it just doesn't seem to see the provided image.
Does anyone know which one of these is better for NSFW content?
none
Any Suggestions?
Looks like Joycaption can do nsfw but I never tested
Is that a custom node?
dolphin-llama3 does quite well
dolphin-mixtral is another well received uncensored one
Cute
This is my Ollama "seed phrase"
Ollama img2img (technically img2txt, then txt2img)
Joy caption
Joy caption Joy tagger
I've tested both extensively
They don't work very well for sfw though, which is funny
Anybody got an Ollama img2img setup using Flux.Dev at all?
stand by
So sorry... mine is Florence 2
I deleted the custom nodes for Ollama Vision img2img
I keep getting some 4 bit/quantisation but need 16 bit/quantisation error (I think?!)
Yes, trying to marry Ollama to Flux gets me 4 is expected but 16 received error (let me find its exact wording)
Cool!
Now that Man Utd have beaten S'oton 3 - 0 - I'm going to have a very nice Saturday afternoon!!!
Dunno what's wrong with Florence2 or TensorOps - cannot get them to load!!!
Am I missing something? Civitai just doubled their price for a faster and 1/10th the amount of epochs flux lora training version; then offered it for half price. ?????
In most cases I've only ever found epochs 5 and 10 to be any good, and 1 to be the worst.
1st pass Flux/Ollama
What is wrong with this Ollama/Flux workflow?
curious what models you using with Ollama cuz there are thousands ๐
cool just looking for better flavors maybe, atm using Llama3.1-storm.8b, seems significantly better than base llama3.1, just wondering others experiences
Waiting for this inpainting to be available in Comfy. https://huggingface.co/alimama-creative/FLUX.1-dev-Controlnet-Inpainting-Alpha
Also here's a shiny anime card.
can't tell if that is AI or a photo
And now the world will never know.
Here's the best 4K I've been able to get. I just can't get rid of that banding / grid pattern no matter what I try. But it's not super pronounced. I officially have this as my new wallpaper until we get a better model.
I'm just floored by its flower accuracy. I asked for:
...a bright covering of variety, including poppies, sunflowers, bluebonnets, daisies, marigolds, and others.
No, but it was a finetune from Civitai plus the Hyper-SD lora. Generated in 3 passes of 6 steps each, around 1 minute on my 4090 if I remember right.
Pretty bad results at just 6 steps. But I gen at 6 steps, sharpen it, then denoise 0.63 or so at another 6 steps, upscale with a model and sharpen again, then do a final denoise at 0.41 to 0.57 for 6 steps. So 18 steps total. The sharpening helps add detail like texture.
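Written out as data, the three-pass recipe looks like the sketch below. It is only a restatement of the message for reference, not the actual workflow: the first pass is assumed to run at full denoise, and 0.49 is simply the midpoint of the quoted 0.41 to 0.57 range. `Stage` and `total_steps` are illustrative names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Stage:
    name: str
    steps: int = 0                   # 0 for non-sampling stages
    denoise: Optional[float] = None  # None for non-sampling stages


PLAN: List[Stage] = [
    Stage("initial generation", steps=6, denoise=1.0),  # full denoise is an assumption
    Stage("sharpen"),
    Stage("second pass", steps=6, denoise=0.63),
    Stage("upscale with a model"),
    Stage("sharpen"),
    Stage("final pass", steps=6, denoise=0.49),  # midpoint of 0.41 to 0.57, an assumption
]


def total_steps(plan: List[Stage]) -> int:
    """Sum the sampling steps across all stages."""
    return sum(stage.steps for stage in plan)


print(total_steps(PLAN))  # 18
```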
ah nice that sounds good
yeah I use that smart sharpen from Comfy Essentials, sharpening between passes helps a lot
I'm re-running the workflow now. I was doing an initial 1920x1080 render, then upscaling it to 3840x2160. It's taking 159 seconds per image, so about 2 and a half minutes.
For comparison, here's my previous desktop from Cascade, which was about a 30 second 4k native render if I remember right. The overall impression is great, but the reflections are wrong, and the flowers are just sort of nothing weirdness.
ah yeah I haven't got into cascade yet
It doesn't have prompt accuracy, and it can't do details, but it can do very large images. I haven't used it in a while now. Just Flux.
Here was my SDXL wallpaper before we had Cascade. Definitely no detail, and the reflections are subtly off for the mountains, and either missing or off for the near shore.
But each of these was the absolute best 4K I could get out of a given model after a huge amount of experimentation and cherry picking.
that's rly nice yeah
these two are my best high res images, using SDXL
they have comfy workflow attached if you want it
method was resadapter, gradual deep shrink and perturbed attention guidance
Yeah those nothing-details work really well in a jungle setting. They could be vines or leaves or moss. The panels on the robot don't really work though.
yeah its a bit messed up
if I remember rightly it had FreeU node with high S values
which makes more fine details but messes it up a bit
that effect where the panels are a bit wobbly
I got a lot of missing red nodes. But SDXL always messes up R2's panels. Flux was the first model I ever saw make them solid (if not exactly movie-accurate).
yeah flux does good R2
there's kinda 2 flavours, clean and dirty
... An epic and symbolic portrayal of Laplace's Demon, standing triumphantly as the Vanquisher of Chaos. The demon is depicted as a wise and powerful figure, exuding an aura of omniscience and control over the universe. In one hand, it holds a scroll or a celestial map, symbolizing its ability to predict the future by knowing all forces and particles. Surrounding the demon is a swirling vortex of chaotic elements like flashes of lightning and fragmented celestial bodies, being subdued and ordered under its will. The demon itself is calm and composed, with glowing eyes and an ethereal presence, while the chaotic elements appear tamed in the background. The overall scene evokes a sense of control over the unpredictable, combining cosmic imagery with an air of philosophical depth.
... A tilt-shift effect scene featuring 3D cartoon people in a room, running around in a state of panic. The characters are exaggerated with comical expressions and dynamic poses, emphasizing the chaos. The room is filled with various office items scattered around, adding to the frantic atmosphere. In the center, a large speech bubble with bold, capitalized text reads 'STARBOARD IS DOWN'. The overall style is playful and cartoonish, with vibrant colors and a miniature, diorama-like appearance due to the tilt-shift effect.
... "Abstract impressionist painting with heavy impasto paintstrokes and thick textures. Dark cosmic colors. A painting vaguely depicting the Earth, hovering in a pitch black void. Accenting faint color streaks and ribbons. Undifferentiated shapes and indistinguishable matter. Faint blurs and digital grunge effect."
Help?! Why is this so slow?! [#sd3 message]
How good IS SD3 for logo?
DallE Theme of the Day
Via Mage (so schnell)
They should have hired me to make the images
it's decent but unless you are using the 8b api, I would probably recommend flux instead.
Schnell doesn't believe in male fairies it seems
Thank you. Is it possible for you to give me some samples?
for sd3 or flux?
With flux if you Can
Something in the gaming niche if possible
Thank you. Looks better than SD3
This is just schnell, dev and pro are even better!
I like it
I have tested with SD3. But it never comes out with great results
It doesn't write all the letters
Maybe a bad prompt on my part
You have to reroll a lot sometimes
Are u using flux?
Not yet. I use SD3 with api as i dont have a pc with GPU
Took 6 tries. That's schnell tho, dev is better
wow this is an amazing study
seems like small differences affect it a lot
couple of things I noticed is I liked higher step ones more, and the background trees mostly only came back with high guidance
Yeah, I also noticed in some cases, details come and go depending on step count.
its very strange
I did a similar study with shift, sadly I forget to save stuff
it would alternate between good and bad image sometimes
Yeah, that is likely my next test is the whole shift thing.
I long for the day that we just ... wait for it....
haha yeah
we're still pretty far from that
I did a test of 1000 steps once:
definitely not worth it for most images
if I remember rightly this is guidance 1.2-1.4 and shift 0-0.3
Flux/Ollama - img2txt (Ollama) - txt2img (Flux)
SD3 originals Ollama/Flux img2img
Ooo. It definitely has credit cards in the training data. I wonder, does that mean it has the potential to leak credit card numbers? (I asked for the name John Smith and the number 1234 5678 9123, which I realize in retrospect was not a long enough number to ask for.)
This thing's text accuracy still amazes me though. These aren't cherry picked. I've only generated two credit card images. (Gonna stop though. Nobody likes thinking about credit cards.)
nlo.com redirects to telepathy.com.
Which model is this? I tried to make some wheels with juggernaut XL but it cant make the lug nuts for some reason
Pixel art style illustration inspired by a Street Fighter 2 screen. A teddy bear character, designed with exaggerated features like oversized button eyes and a stitched mouth, is performing a dynamic spinning move resembling Zangief's Typhoon. The teddy is wearing a colorful wrestling costume with a cape. Opposite, a human character in a vibrant blue judo gi is standing with legs apart, one arm extended as if launching a hadouken-like projectile, with a glowing hand effect. The background is a cozy sofa room with pixelated details like a sofa, a coffee table, a lamp, and plush toys as spectators. The scene includes pixel art health bars, scores, and a timer at the top to resemble a classic fighting game screen.
I'm working on an art piece right now and ran into a problem: FLUX doesn't understand the skin of older people
Prompt: A close-up image of an elderly grandmother's hands, showcasing the detailed texture of her wrinkled, aged skin. The hands are weathered with deep lines, visible veins, and worn knuckles, symbolizing years of hard work and life experience. The lighting is soft and natural, highlighting the creases and folds, bringing out the warmth and wisdom in the hands. The background is blurred, ensuring the focus remains entirely on the hands, capturing the essence of age, history, and quiet strength.
Flux
Dalle
Nor does Dall-E per that image
It looks nothing like an old person's hands
Working on an algorithm for Mangled Merge Flux. I'm doing things old school until I or someone else can figure out the more advanced Dare/Ties/Della methods. Only 20 loras in so far, but seeing some promising results. Still a little rough around the edges though.
Dev on the left and Mangled Merge on the right.
Does anyone know what could cause this at the edges of an image? Also have an idea for a possible solution?
40 loras merged. Some outputs seem better, some seem worse. Needs more experimentation. Dev on the left, Mangled Merge on the right.
those are printers marks - odd that you'd be getting that on a generated image
I read it might possibly be because the original image is not divisible by 16. This only happens with img2img by the way. but ya, I thought it was a bit weird and interesting.
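If the divisible-by-16 theory is right (it is only a guess in this thread), the cheap test is to crop or resize the img2img input so both sides are multiples of 16 before it goes into the sampler. A small sketch of the arithmetic; `snap_to_multiple` and `snapped_size` are illustrative helpers:

```python
from typing import Tuple


def snap_to_multiple(value: int, multiple: int = 16) -> int:
    """Round down to the nearest multiple, but never below one multiple."""
    return max(multiple, (value // multiple) * multiple)


def snapped_size(width: int, height: int, multiple: int = 16) -> Tuple[int, int]:
    """Return the largest size not exceeding (width, height) with both sides divisible."""
    return snap_to_multiple(width, multiple), snap_to_multiple(height, multiple)


print(snapped_size(1920, 1080))  # (1920, 1072)
```

Note that plain 1080p is itself not a multiple of 16 in height (1080 / 16 = 67.5), so a 1920x1080 source would be trimmed to 1920x1072 under this scheme.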
flux dev tends to do that for me, maybe 2-5% of the time
might be a sampler thing. nature of the mmdit network and the edge of math for dpm2++ . iirc it happens more with that sampler
Revealed in a-Dream: SD3.5 has 8 billion paramours!!!
howdy yall, so i have been importing specific tools from custom nodes and ive been getting "name 'WHATEVER' is not defined". are the definitions not in the same .py documents where i get the codes to copy and paste from? o.O
Here is the image you requested.
o.O i requested an image? also, thanks for the help, but i cant read what that language is in after the title
Ollama/Flux img2txt - txt2img. Ollama input = What is this an image of? Make an image which adds a face, head, hands and arms.
[The input images are vintage fashion mannequins.]
Some output remained headless; so I utilised Generative Fill in Photoshop to produce head and hair.
The good thing about Ollama is that it is img2txt first - so you can modify the text prompt on the fly - which then becomes txt2img.
With Florence2, editing the prompt is only possible by stopping the process; modifying the prompt generated in the text file; and starting a new txt2img implementation of Florence2.
A group of wild boars and a group of hyenas are confronting each other from left to right. During the day, the atmosphere is tense and panoramic ,in the style of Pixar, in the style of Disney --ar 16:9 - @sturdy tiger (fast)
in how many weeks?
Someone call the moderators, or soon the inappropriate proposals will start. LOL
mods shouldn't be needed for that one. its the same post in every single channel here, verbatim, from a user that just joined.
The server is wide open to abuse when that can happen. it's only a matter of time before all the other ants find out that this place doesn't have mod controls during certain parts of the day
#imagine car
Imagine A muscular man stands confidently, covered in dust and sweat, wearing a fitted, worn-out t-shirt that highlights his toned arms. He holds an object casually in his right hand, with a rugged military-style jeep parked behind him in a dusty, outdoor setting. The camera angle is from a low perspective, looking slightly upwards, which emphasizes his imposing figure and strong posture. His hair is styled in medium curls, giving a rugged and natural look that complements his intense demeanor. In the background, a hazy mountain range stretches into the distance, adding to the scene's rugged atmosphere. The lighting accentuates his determined expression as he gazes off to the side, creating an intense and gritty vibe.
There is no bot that will do this for you...
Y
No idea, just run your own ai
there is a bot, if you want one, in the artisan channel
In comfyui, if you set flux guidance to 1.0 and then use a regular cfg of like 4 or 5, it really does allow a negative prompt and doesn't destroy the image. Saw some example on diffusers for "true cfg" and looked through the code and it's easily doable in comfy doing what I just said. Obviously, it doubles inference times though.
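For reference, this is the usual classifier-free guidance mix that the regular cfg value controls, sketched on plain lists of floats rather than real latents. It is the textbook formula, not code lifted from ComfyUI or diffusers, and `cfg_combine` is an illustrative name. The negative prompt only enters through the `uncond` prediction, which needs its own model pass per step, hence the doubled inference time. Flux's own "guidance" value is, as far as I understand it, fed to the model as an input and costs no extra pass.

```python
from typing import List, Optional


def cfg_combine(cond: List[float], uncond: Optional[List[float]], cfg: float) -> List[float]:
    """Classifier-free guidance: uncond + cfg * (cond - uncond)."""
    if cfg == 1.0 or uncond is None:
        # At cfg 1.0 the formula collapses to cond, so the negative pass is not needed.
        return list(cond)
    return [u + cfg * (c - u) for c, u in zip(cond, uncond)]


print(cfg_combine([1.0, 2.0], [0.0, 1.0], 5.0))  # [5.0, 6.0]
```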
My next experiment is going to be doing some hybrid approach with something like dynamic/automatic cfg where it starts out with guidance 1.0 and CFG 5.0 and then transitions back to something like 3.5 guidance and 1.0 cfg. Might need to make a custom node to handle it though
I just saw the reddit post
not sure how they pulled it off
I wish they wrote an explanation lol
literally do what i said, just set flux guidance to 1.0 and cfg to like 5
in comfy. probably works in forge as well
those two images are the same seed btw, 30 steps, dev
wow that's it? such a simple fix all this time
rly weird how simple it is yeah
cos those other methods mostly have side effects, dynamic thresholding, autoCFG etc
I use tonemapwithrescaleCFG personally, on every image
right, but they probably aren't having people set guidance to 1.0. but that's why i was saying i want to try some kind of transitional approach like what dynamic/automatic cfg do, but have it go from guidance 1.0 cfg 5.0 -> guidance 3.5 cfg 1.0
figuring out the ideal falloff curves might be a bitch though
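As a starting point for experimenting with falloff curves, here is the simplest possible one: a straight linear ramp from (guidance 1.0, cfg 5.0) to (guidance 3.5, cfg 1.0) over the first part of the run, then holding the end values. The linear shape and the `transition_end` fraction are assumptions for illustration only, and `transition_schedule` is a hypothetical helper, not an existing node.

```python
from typing import List, Tuple


def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def transition_schedule(steps: int,
                        start: Tuple[float, float] = (1.0, 5.0),
                        end: Tuple[float, float] = (3.5, 1.0),
                        transition_end: float = 0.5) -> List[Tuple[float, float]]:
    """Per-step (guidance, cfg) pairs ramping from start to end."""
    ramp_steps = max(1, round(steps * transition_end))
    schedule = []
    for i in range(steps):
        t = min(1.0, i / ramp_steps)  # reaches 1.0 once the ramp is finished
        schedule.append((lerp(start[0], end[0], t), lerp(start[1], end[1], t)))
    return schedule


for step, (guidance, cfg) in enumerate(transition_schedule(10)):
    print(step, round(guidance, 2), round(cfg, 2))
```

Swapping `lerp` for an ease-out or a sigma-based curve is where the real tuning would happen.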
i know people were hackily doing the perpneg stuff a while back, but it had drawbacks probably related to guidance as well
but f me, flux can actually do art this way now...
perpneg is nice on SDXL sometimes, it acts a bit differently
interestingly perpneg combines with CLIPNegPip (you can weight a token negative)
so if you use perpneg but the tokens in the negative have negative weight, you get a rly strong positive effect
i'm too lazy for all that diminishing returns gains type stuff, i just use automatic cfg or dynamic cfg lol
automatic cfg wins for sheer laziness yeah its great
the reason I have to hand tweak it is I like to use a ton of PAG, SAG and SEG to different blocks
then a really heavy anti-burn setup is needed to counter all of that
this is for SDXL though, flux doesn't have attention guidance nodes yet
automatic cfg can do all that with the advanced profiles, iirc
oh yeah I saw that in the repo but I never learnt how
would probably be ideal for me
like the "excellent attention" preset does PAG
on top of the automatic stuff and boost
yeah I need to learn that part of the repo
at least for my images, PAG, SAG and SEG have been the biggest quality boost in the last year
here's another example of it working, gawt dayum
wow i even forgot to change an to a when i changed the prompt, oops
wow yeah it really works
it does have a couple of downsides to manage, because more guidance generally gave better compositions
and secondly because higher resolutions sometimes needed more guidance
maybe scheduling helps
this is an absolutely basic workflow, using simple and euler
just loaders, modelsamplingflux, cliptextencodeflux and ksamplers
about to try it with my hyper merge 16 steps. i also want to see how 1:1 cfg vs guidance really are. it might be that guidance 3.5 is actually closer to something like cfg 7 or something
I always forget the name but a paper found the best steps to apply negatives
and it was less than the whole steps
the hyper lora looks good yeah
yeah which is what boost mode on automatic cfg does
for SDXL I actually use TCD and two different hyper loras now lol
4-8 steps are great
negatives really only matter early on, basically, where the largest area under the sigma curve is before it goes into "refining details" steps
where the curve falls off
yeah the composition kinda gets locked in
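A toy sketch of the "negatives only matter early" idea: use the usual CFG mix for the first chunk of steps, then fall back to the positive prediction alone. The 30% cutoff and the function names are assumptions for illustration, not numbers from the paper and not what automatic cfg's boost mode does internally:

```python
def cfg_combine(cond, uncond, scale):
    """Standard classifier-free guidance mix: uncond + scale * (cond - uncond)."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]

def guided_prediction(cond, uncond, step, total_steps, scale=5.0, neg_fraction=0.3):
    """Apply the negative (uncond) branch only during the first neg_fraction of steps.

    cond / uncond stand in for the model's positive and negative predictions,
    shown here as plain lists of floats. neg_fraction is a hypothetical knob.
    """
    if step < int(total_steps * neg_fraction):
        return cfg_combine(cond, uncond, scale)
    # Later steps skip the negative pass entirely (equivalent to cfg 1.0).
    return list(cond)
```

Skipping the negative pass in the later steps also saves one model evaluation per step, which is where the speedup from this kind of scheme comes from.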
its quite funny in the beta sampler paper
they made charts of what steps different sized details appear in
and they found that almost nothing happens in the middle steps
there's this weird gap
yep, mostly just broad color shifts. i use beta a lot
yeah for SDXL I always use beta now
same
the only exception is when the composition is having trouble
then I try karras, and then SGM uniform
this mostly happens trying to stretch the resolution too much
I know this may seem like a truly stupid question, but, how do you know which one should be Clip L weights, and which one should be T5 weights. I only have one Clip on this custom workflow. With no negative prompt at all. Can I set them both to T5?
Also I have a 4090 and cannot for the life of me get the fp16 weights to work. This stuff makes my head spin.
just tested with automatic cfg and dynamicthreshold, it works with both. but with automaticcfg, you have to disable boost or it turns into a mess
here's another example. note how it broke in the middle, but when using automatic cfg, it fixes it on the right. definitely sticking to using automatic with things(boost off because it breaks things)
whenever lykon says so
FWIW fp16 weights work on my 2070
more than -1 and less than 10,000
yeah, when you turn off flux guidance, you basically turn off flux
What would you call this style?
Vintage illustration doesn't get me this look exactly
Vintage illustration would have been my best guess too
A small, fluffy cat with white and gray fur is sitting on the surface of the moon, gently nibbling on a fish. The fish is glowing slightly, with a silvery-blue hue, as it floats in the moon's low gravity. In the background, Earth is visible, casting a soft blue glow across the moon's cratered landscape. The sky is pitch-black, dotted with bright stars, adding to the surreal and tranquil scene.
naive minimalist?
ohhh that sounds nice
Anyways, I figured it out
For some reason my schnell only liked it when I used flat colors in the prompt as well as vintage illustration
Kind of weird but I don't care
This is pretty nice, they made flux use a normal guidance scale. Results seem pretty good: https://huggingface.co/nyanko7/flux-dev-de-distill
How can I generate an image from text?
By chance, does anyone here use lm studio because I need some help figuring out how to make it use my GPU and not CPU? Other programs don't have a problem with it.
Idk much about lm studio but I know a decent amount about what it uses (llama.cpp). You just have to set n gpu layers to -1 or like 99
Can you DM me and then like put that into dummy words
This tutorial introduces what LM Studio is and shows you how to install and run LM Studio to chat with different models.
I tried looking but it's outdated or something. There's no section like that, actually, the whole UI Is just different
Yeah that's what I'm coming from
I didn't like that Jan forced gguf so I switched to this, but I don't see any gpu acceleration option anywhere
Unlike Jan where it let me and everything was always lightning quick, this takes forever to load a simple sentence off of my cpu
There's one thing in all of settings talking about gpu but it doesn't do anything because lm studio isnt picking up my gpu or its broken?
Like I have the whole cuda stuff updated and I was using Jan a few hours ago
Do u have nVidia GPU?
3060 8GB, which is supported by lm
Use GeForceExperience Console to pair LM Studio with GPU
How do I get there?
have they fixed the questionnaire yet?
NVIDIA GeForce Experience Auto Update Drivers and Optimise Game Settings
Well here's the thing. I've found some weird part of the settings that had like something to do with CUDA and llama.cpp, so I updated that and the cpu, and then also in settings for the model I turned off keep memory in case that works but I don't think it does. And so basically, the program recognizes my gpu so I don't think I have to mess around with nvidia settings as this is clearly an app issue. Also my gpu drivers are updated
But still wont use my gpu at all
Yeah, which is why I'm debating going back, but I love the easy-to-use model discover page and also that I'm pretty sure I'm not limited to gguf like on Jan. But if they just removed the gpu acceleration settings then I definitely will
What has your testing shown? They only show 2 examples
you got nvidia card? cuda needs it. zluda is dead now too.
Oh should've deleted it. I found help and got the issue fixed, thanks though
cool. also, koboldcpp was better imo, but lm studio does work well in one small package
it just rambles on about wanting to know the name of your first born son, and what you had for breakfast today
i see
wild link ty for sharing
i'm going to wait for there to be a version that's separated files. i don't want dozens of t5 copies on my drive. but it is on my list of things to test and explore this week
It's only the fp16 version of the dit model, no t5, clips or vaes I believe. Sadly can't fit it in my gpu.
BFL published Flux.1 Dev in fp16. The 23gb file has t5 and clip l inside of it.
The stock weights when pruned are 16gb. every 23gb version is over 7gb of duplicated data
was so confusing finding which civit version was what
lots of different combinations on there
different quants confuse the decision surely
the various FP8 speedups are what I had the most trouble with
turns out there's two versions of FP8 and some of them need a specific version
fp8 e4m3fn vs fp8 e5m2
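For anyone wondering what the difference actually is: e4m3fn has 4 exponent bits and 3 mantissa bits (more precision, less range), e5m2 has 5 exponent bits and 2 mantissa bits (more range, less precision). A small sketch that works out the largest finite value of each from the bit layout; the helper name is made up, and this only does the arithmetic, it doesn't touch any real checkpoint:

```python
def fp8_max_finite(exp_bits, man_bits, ieee_like):
    """Largest finite value of an 8-bit float with the given layout.

    ieee_like=True  : top exponent code is reserved for inf/NaN (e5m2).
    ieee_like=False : "fn" style, no infinities; only the single pattern with
                      all-ones exponent and all-ones mantissa is NaN (e4m3fn).
    """
    bias = 2 ** (exp_bits - 1) - 1
    if ieee_like:
        max_exp_field = 2 ** exp_bits - 2   # all-ones exponent is inf/NaN
        max_mantissa = 2 ** man_bits - 1    # every mantissa pattern is usable
    else:
        max_exp_field = 2 ** exp_bits - 1   # all-ones exponent is still usable
        max_mantissa = 2 ** man_bits - 2    # all-ones mantissa there is NaN
    return 2.0 ** (max_exp_field - bias) * (1 + max_mantissa / 2 ** man_bits)

print("e4m3fn max:", fp8_max_finite(4, 3, ieee_like=False))  # 448.0
print("e5m2   max:", fp8_max_finite(5, 2, ieee_like=True))   # 57344.0
```

So a file saved as one format is not interchangeable with the other, which is why a loader or speedup path that expects one specific format won't just accept the other.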
||mfw i got away with calling you Shirley||
Airplane is good
i thought that e5m2 was determined not suitable for flux purposes at all
something about the way the data model aligns
maybe, the comfy FP8 speedups require the fp8 e4m3fn one
I've switched to this pytorch script which works differently to both fp8 e4m3fn and fp8 e5m2
it quants different parts of the model differently https://github.com/aredden/flux-fp8-api
neat looking
as far as I know this is the fastest script, it matches Fal.AI's speed if used on H100
flux is way too slow I want to find all the speedups lool
sadly I don't like the hyper loras, that might have been good. hopefully lightning or TCD is coming
try this: https://huggingface.co/DarkMoonDragon/TurboRender-flux-dev
seems like a pretty nice alternative(supposedly more detail and requires 4-6 steps and even better detail at 10-15)
thanks a lot that seems promising
Inpainting Controlnet with Flux!!!
Workflow. ComfyUI now supports the inpainting controlnet from AliMama https://huggingface.co/alimama-creative/FLUX.1-dev-Controlnet-Inpainting-Alpha/discussions/1
This was just a proof-test. Generated at 6 steps for the initial render and the inpaint.
i dont think seams like that show proof of concept. i guess its a workflow that executes to the end though
don't know if it will load on discord
but this is the img-to-vid for Cogvideo that came out today
It's proof the controlnet is supported (and compatible with finetunes, with Differential Diffusion, and with the Hyper-SD 8-step lora), not proof of quality.
I am trying with less harsh settings now.
oh no
yeah cogvideo img2vid is nice, this is a pretty good video it made. (original img was low res)
that worked well yeah
got to do the cropping outside really cos it squishes a bit. It pulls off simple pans nicely
It can do seamless inpainting for sure, but I am getting very unstable results. I don't understand inpainting nodes in Comfy I think.
yeah nice, its taking forever though since there are 10 people waiting in the queue.
cogvideox is really amazing, definitely best open source model in image2video and I tried it with an actual good image(made by flux) and it honestly blew my mind.
quality-wise it seems very good, but the movements are so unnatural
wdym by unnatural?
she's too steady
oh yeah thats true, but i didn't really specify what she should do in the prompt so its understandable
its a big step up yeah
working out what sort of video works best with it is an important step i think
and definitely the dimensions cos it squishes 1024x1024
i believe it said 720*480 works best, its nice that flux can generate at that res.
yeah its not a bad res
its quite good timing TBH that we got flux for this
given that flux is more flexible with res
Here are a couple of Demonic looking Warriors I created tonight. Some img2img with AI captions, then took the caption and put it through text to image with some LoRAs to get these.
Did you get cogvideox running on windows? Compiling DeepSpeed requires Intel's OneCCL, which is closed-source and does not have a windows binary available as far as I can tell.
Edit: Nevermind manually installed older version of deepspeed with windows support. Let's hope that resolves everything... Requirements install is progressing!
And finished. Let's see if we can run inference.
Weights downloading now...
Reinstalled Torch and running now.
Okay, no open-source image-to-video yet. I kept failing to make i2v work, then finally found this note "(Coming Soon)" next to the 5b-I2V model link.
Image to video from the (currently) closed-weights HF demo.
how to use
3.5L
SD 3.5?
beach
sd3.5 large
unfortunately, 8b parameters worth wasted on this
all images posted are pre-finetuning (whether it be safety or aesthetic)
Not that bad, try some humans(something like 3 people holding a sign laying in grass, schnell could do it). Doesn't beat flux but could give solid competition, not sure what your prompt is but this is schnell 15 steps, not even dev.
prompt: A photograph of a complicated notebook filled with formulas and a post it note stuck on it that says "there are so many latents left to diffuse" written out in black ink and "Diffusion Sells", high quality, 8k, real, incredible detail.
the parameter advantage really shows with the clarity of the text
and possibly its also more trained
80 steps on dpmpp_2m is the sd3.5L default
woman lying in grass
cannot test multi-subject composition, please tell lykon to start the bot again
safety tuning does that to you
What is this, do you have early access?
a sort of early access, yes
any info about status?
extremely early access (started 1w ago) points in a good direction, unless they start the aesthetic tuning now
oh, so needs more cooking, nice
always need more time in the gpus, never enough time
I think it will be a big deal, if they release training script and don't plan to distill or something else
more time
Yea, thanks for sharing!
I like the aesthetic, flux(even schnell) seems to be better but its not done training so it could be a great alternative and its not distilled.
8b parameters yet lose to a GAN-wannabe
I wish they'd release 8b instead of 2b this time, people could just quantize it to make it use a small amount of vram.
wow so SD 3.5 is real
that grass picture is workable, its good enough to refine with realvis
aesthetic seems to be more pleasant than flux, in my opinion
yeah this is more realistic than base flux
some flux fine tunes are more realistic but not the base
looks very great for pre-aesthetics tuning btw...
or I have the wrong understanding of how it should look
hope it will also share sd3m's ability to generate unique faces
one must hope lykon retains the style through finetuning
yea!
imho the less aesthetics tuning the better, aesthetics tuning gets you flux, push out all aesthetics but the one you want, i'm not a fan
aesthetics tuning doesn't necessarily mean no realism
but SAI/SD3 shows a sign of life again!
yeah great, the style is nice but it duplicated left and forgot diffusion sells(assuming you are using my prompt)
sd3.5 large is showing promise
They probably have to make a comparable base model, but it would be cool to get a pre-tuning version too
If you want to nitpick, your image isn't of a "complicated notebook", it's of a textbook with printed text. A notebook would be handwritten notes, so it's not like your prompt was followed any better either.
sd3.5L's generation at 621x621 btw
or it got downscaled somehow...
downscaled, native is 1024x
My hope for SD3 is to be reasonable at prompts but look much less AI/plastics/smoothed than flux
did you generate it with sd3.5 large? and yeah the notebook part could be considered wrong.
also to have negatives without tricks
and it should be a bit faster!
when 8b was offered in api i always used to comment about how difficult it was to get different styles, but compared to flux, it was brilliant in hindsight :p
the 8B version of SD3 was always good I think
I don't use API models but if that model was open sourced I would switch to it
2B was where the problems were
these landscape ones are all great especially the shark
I never minded it that much, seeing it as teaser (and gliff offered lots of gens for free), but when the "real" thing turned out 2b and the teased thing never was released, i lost interest. I hope this time there will be weights, after all this time, only actually seeing weights is believing for me
yeah need the weights for sure
they should have just dropped the SD3 8B weights as soon as the controversy began TBH
and then they could monetise SD 3.5 Pro or something
but its easy to forget the company nearly went under this year
Flux with normal cfg produces much nicer imgs imo, you need good negative prompts tho
I don't have the exact one now but it was something like: "distorted, deformed, low quality, average quality, bad quality, night, ugly,"
Prompt was: "a woman laying in grass, sunny"
Aaaa where is iiiiiit
@pseudo owl @bitter hearth another finding I forgot to mention yesterday when I was talking about using negatives with flux is to set the flux guidance to 1.1 and then the cfg to whatever. Kind of like when using cascade where the one stage uses 1.1. My guess is it helps avoid holes or poles(zeros or infs) or plain identities in calculations somewhere, but I'd have to analyze the code more. All I know is that tiny difference helped, but it still seemed to work well into the 1.5 range. Keep in mind, I was using automaticcfg(boost off) with things
ask lykon
never more!
I can't quite understand. SD 3.5 has already been released, but it's again limited by some bot? What's the point of such promotion?
they have to have terrible marketing, its traditional at this point
it might as well not exist if you can't download it from huggingface
yeah but its nice to have previews like this to know its coming
for budgeting purposes
Sd3
Very pretty, what Lora is this?
It has been released? Where? How can we try it?
some people got early access
Where was this reported?
I have a feeling we are only going to get a 'fixed' SD3 Medium, but here's to hoping I'm wrong
doubtful, I think they are gonna release the 8b model, if not then they lose to flux
I have no such expectations. Still, what I'd really like to know is how reliable that claim of 3.5 is.
they're obviously gonna release something, june 12 was their last release and it flopped so 3.5 in a month or so doesn't seem unlikely, just curious if it's open weights