#๐Ÿ†•๏ฝœsd3

1 messages ยท Page 128 of 1

craggy crest
#

sd 3.5 large

lavish sparrow
#

"Kingslayer"

bitter hearth
#

#๐Ÿ†•๏ฝœsd3 panorama view of a carnival, lit at night, beautiful night background with Ferris wheel, stall, roller coaster

craggy crest
craggy crest
icy drift
icy drift
#

Why do the drone shots have drones in them though. Also these buildings are pretty wonky.
Probably because of the low resolution on my PC. ๐Ÿ˜•

dull star
#

with stg or no?

#

or is this not ltxv?

errant dust
#

I don't have that issue at all. Quite the contrary. I get contrasty punchy images

#

Sounds like an issue in your setup. Either nodes or Python.

brittle nexus
bitter hearth
#

I agree, SD 3.5 has more vibrant colours than flux in my opinion

muted dove
muted dove
buoyant mesa
#

Kohya_ss does not support Lora training for SD3.5 yet, right?

cunning lintel
#

SD 3.5L def is still nice, it's like flux's uneducated but creative cousin, bit dumber but really tries to make up for it ๐Ÿ˜‰

mortal mesa
devout schooner
#

What were your prompts here?

devout schooner
errant dust
#

Show me a prompt

pseudo owl
devout schooner
#

I have some outputs saved at home that show this IIRC, I'll find them later

errant dust
#

Any prompt. Even the seed number. I will run it and see if they match.

#

Easiest and best way to troubleshoot

mortal mesa
devout schooner
#

It's not a bug, it's definitely just a stylistic difference lol
I'm quite sure my Comfy setup is totally fine

errant dust
#

The thing is, I use a very wide variety of prompts, art, design, photorealism, and so on, and have yet to see a single instance that remotely matches your description

lavish sparrow
#

i have no idea what the actual prompts were

#

i just press generate at this point

errant dust
#

Speaking of LLMs, I downloaded and tested one of the quantized versions of Llama 3.3 (30GB file) since my machine has a total of 32GB RAM, and was shocked at how good it was. Truly shocked. I fed it a favorite, asking to describe a scene as if for a role-playing game, and it was absurdly good.

#

Since talk is cheap, here is my query and its reply.

#

Write me a description for a role-playing game in which the player has just entered a small cave. In it are two goblins arguing over a small chest of treasure in the center.

#

As you step into the small cave, the cool darkness envelops you, a stark contrast to the warmth and brightness of
the outside world. The air inside is stale, heavy with the scent of damp earth and mold. Your eyes adjust slowly
to the dim light that filters through the entrance, illuminating the rough stone walls and the uneven floor. Ahead
of you, in the center of the cave, lies a small chest adorned with rusted locks and bands, its wood surface worn
smooth by time and handling.

Two goblins, their wiry frames tense with agitation, stand on either side of the chest, their argument filling the
space with high-pitched voices and animated gestures. They seem oblivious to your presence, too engrossed in their
dispute over the treasure that supposedly lies within the chest. One goblin, slightly taller than the other,
brandishes a crude dagger, its point glinting faintly in the dim light. The shorter goblin, undaunted by the
threat, sneers and spits back, baring yellowed teeth.

The cave around them is sparse, with jagged rocks jutting out from the walls and a scattering of pebbles and dust
covering the ground. A small, dying torch, its flame flickering wildly, casts eerie shadows on the walls, making
it seem as though the very darkness itself moves and twists in response to the goblins' quarrel.

Your entrance has gone unnoticed by the arguing pair, offering you a moment of surprise and the opportunity to
decide how to proceed.

#

This was quite literally my first query, not some result after multiple tries or the like.

errant dust
#

Q3_K_S

#

it was all I could run. lol

#

but in light of the output, there are no complaints either

mortal mesa
#

ya im still in "testing" mode heh

errant dust
#

Out of curiosity I reran the exact prompt, and while some things are the same (a few phrase bits), it made multiple and quality changes too.

#

I know these models have a very wide range of uses, even programming is supposed to be top notch, but I like to test the basics too. It must live up to the ANLP after all.

lavish sparrow
#

i'm trying the exaone by lg, i think it's pretty damn good too, to be honest

#

Scene:  You cautiously step into the narrow, dimly lit entrance of a hidden cave, the musty air thick with the scent of damp earth and ancient moss. The flickering light from your torch casts eerie shadows across the rough stone walls, revealing intricate, gnarled formations that hint at eons of silent passage. Ahead, the cave opens slightly, revealing a small, circular chamber where the flickering torchlight illuminates a scene of unexpected chaos. 

The Encounter: 
In the heart of this chamber, two goblins are locked in a heated argument, their small, green forms towering over a modest chest nestled in the center. One goblin, clad in mismatched armor with a crude sword clutched tightly, is red-faced and gesturing wildly, clearly agitated. Its counterpart, perhaps slightly more cunning, is dressed in ragged cloaks, clutching a crude dagger and trying to reason with its compatriot through frantic gestures and shrill shouts. 

Treasure Chest:  The chest in question, though unassuming, gleams faintly with gold and jewels peeking through its worn leather bindings. The treasure within could provide a significant boost to your adventurer's coffers, but navigating the situation requires finesse and perhaps a bit of cunning. 

Choices Ahead:  

    Intervene:  Approach the arguing goblins and attempt to mediate the dispute, perhaps offering a bribe or a clever negotiation to claim the chest without bloodshed.
    Stealth Approach:  Move silently around the chamber, observing the goblinsโ€™ behavior for an opportune moment to snatch the chest when their attention is diverted.
    Confrontation:  Charge into the fray, using your combat skills to overpower both goblins and claim the treasure by force.
     

The path you choose will set the tone for your journey ahead, potentially unlocking new alliances, enemies, or hidden paths within this labyrinthine network of caves. What will you do? ```
errant dust
lavish sparrow
#

what are you running it on?

errant dust
#

In this case, I simply run it in CMD

#

i.e.

lavish sparrow
#

what hardware ๐Ÿ™‚

errant dust
#

ollama run llama3.3:70b-instruct-q3_K_S

#

Ohhhhh.... lol. New laptop I got for BF on a deal ($1070 after tax): i13900-HX (32 threads), 32GB, RTX 4070, Cherry KB

lavish sparrow
#

that looks like mostly run in ram then -> must've taken a while ot get that answer

errant dust
#

it did, but wasn't back breaking either

#

Since unlike images where you wait until the end, here I got the text output in order, as it was rendered

#

word by word, line by line

lavish sparrow
#

marco-o1 is REALLY good for prompt enhancing, if you add a reasoning block (it's actually a reasoning model)

lavish sparrow
lavish sparrow
#

and time to say goodnight. sd3.5L one of the best models to play around with โค๏ธ

craggy crest
craggy crest
lavish sparrow
lavish sparrow
#

the coffee went bad

lavish sparrow
buoyant lark
#

prompt:A beautiful woman, black hair, modern dress, sitting on a sofa

lavish sparrow
buoyant lark
#

Ah, lol, which group is the right one?

lavish sparrow
#

i actually have no idea

buoyant lark
#

OK, thanks all the same

uneven storm
errant dust
toxic jewel
#

prompt = (
"A beautiful Asian woman, wearing stylish and slightly sexy clothes, "
"sitting in a cozy cafe, holding a cup of coffee. "
"Photorealistic, highly detailed, natural lighting."
)

errant dust
#

prompt = (
"A user who doesn't waste space asking for free images, knows how to read, wearing stylish and slightly sexy clothes, "
)

remote holly
fleet meteor
#

Nice lighting! ๐Ÿ‘

#

Textures in sd 3.5 are also very good

craggy crest
bitter hearth
#

can GPT O1 make a nice prompt or would it be terrible

devout schooner
# errant dust The thing is, I use a very wide variety of prompts, art, design, photorealism, a...

here's a good example of what I meant
here neither the original SD 3.0 or SD 3.5 Medium really came out particularly good in terms of overall composition and coherency (both would need at least probably upscaling and a second denoise pass), but the stylistic difference is quite obvious
first pic is the original SD 3.0, second one is SD 3.5 Medium, seed was the same
prompt was
a still from a movie, featuring an elderly man dressed in a traditional wizard's attire. He is sitting in a dimly lit, ornate library filled with rows of wooden bookshelves that stretch into the background. The man has a long, flowing white beard and is wearing a large, pointed, black hat that casts a shadow over his face. His robe is also black, made of a heavy, textured fabric that adds to the mystical appearance. He holds an open, large book in his lap, and his hands are resting gently on its pages, suggesting he is deeply engrossed in reading or studying. The room is illuminated by a warm, golden light, which enhances the rich, dark tones of the wood and the man's attire. Several candles in brass holders are placed on wooden tables around the room, their flames flickering softly. Behind the wizard, a large, intricate astronomical globe is mounted on the wall, adding a sense of ancient knowledge and mysticism to the scene. The overall atmosphere is one of quiet reverence and intellectual curiosity, typical of a scholar's sanctuary. The image captures a moment of quiet contemplation and magical study.
Again this definitely isn't a "bug", this is just what SD 3.5 thinks certain prompts should look like for whatever reaon, in contrast to the original SD 3.0
but I'm not sure what actual part of the prompt it is exactly making it think the whole room should appear foggy / hazy

#

my only guess is maybe 3.5 had more training on like classical oil paintings and similar traditional art than the original 3.0
and that's unintentionally bleeding into the image sometimes when certain keywords are hit

fleet meteor
devout schooner
errant dust
#

well, here is what I get with SD3.5

fleet meteor
#

๐Ÿ˜

lavish sparrow
#

i dont'care about realism that much... xD

devout schooner
# fleet meteor It looks amazing

yeah, that sort of thing is the main reason I like SD3.5
Flux can't generate images that look like that unless you use Pro Ultra 1.1 in "Raw" mode (which seems to be I guess just their actual full model without any distillation, like even the original "Pro" seems to be distilled in comparison)

lavish sparrow
devout schooner
fleet meteor
fleet meteor
devout schooner
lavish sparrow
#

"spaghetti universe taking over reality"

fleet meteor
#

Yeah i noticed xd, but its text2img

devout schooner
fleet meteor
#

It seems it doesnยดt support wide resolutions

devout schooner
fleet meteor
#

well time to try

devout schooner
devout schooner
# fleet meteor well time to try

i wouldn't use the FP8
GGUF Q8 will be better quality if you can't run the FP16 for some reason
but even that is only like 5GB

#

FP16 T5 is the issue for most people really
just using the GGUF Q8 or FP8 T5 encoder can be enough to get you going in a lot of cases

#

while still running the FP16 version of the actual transformer model

errant dust
devout schooner
errant dust
#

Yes

fleet meteor
#

๐Ÿ’€ ๐Ÿ’€ ๐Ÿ’€

errant dust
#

I usd no LoRA

devout schooner
# errant dust Yes

well again I did say you "have to use Pro 1.1 Ultra in Raw mode" to get similar results to my 3.5 Gen
like if I actually try it on Ultra Raw (only gonna do one cause it costs like a million dollars per gen lol), I get this:

errant dust
#

I did not use Flux Pro.

fleet meteor
devout schooner
#

they're in Manager if you don't have them

fleet meteor
devout schooner
# errant dust I did not use Flux Pro.

right, yeah, I was saying their most expensive API only model is the only one that can reliably do photographic stuff without any hint of distillation

#

getting random cartoon gens from Flux Dev and such on the other hand is definitely a thing though yeah

errant dust
#

A random cartoon by Flux Dev

devout schooner
# errant dust

yeah that's about the best Dev can do
it's got that noticeable CGI-like crispiness everywhere though
particularly on the moss and stuff

fleet meteor
#

I wonder if dev has that problem only because it is a distilled model , and also if it could be "fixed" with finetunes

errant dust
#

It is much lower resolution than the image you posted

#

which is 1800 x 2600

devout schooner
#

that's my whole point lol, Dev looks like all distilled models do, overall

toxic bone
#

you throw a rank 16 lora trained for 100 steps at flux and it'll blast cartoons out fine

#

trained with 512 x 512 images in fp8 mode

devout schooner
#

best example of this is, what people jokingly refer to as "Flux Girl" should really just be called "Distillation Girl"
cause if you generate literally any image of a woman with SD 3.5 Large Turbo specifically, you get results that could easily be mistaken for a Flux Dev gen without Loras

#

the sameface is clearly a result of distillation in general, basically

toxic bone
#

you guys using comfyui? it doesn't make the knob fiddling very easy. on forge i dialed in a couple settings, prompted for a cartoon dog, and got a cartoon dog

#

flux isn't that hard to crack through the distillation

#

i get it. it's difficult to load an entirely new worksheet for every situation, but maybe it's just a problem with your node graph and not the model

devout schooner
#

like I've released numerous Flux Dev loras myself lol, I'm well versed in it

toxic bone
devout schooner
#

so still I think misinterpreting what I said

toxic bone
#

the picture was labelled "a random cartoon" and posted in context of flux making cartoons.. so i guess there was hidden context i wasn't aware of

devout schooner
#

again my entire point boils down to "Dev looks like exactly what it is" lol, nothing else

errant dust
toxic bone
#

i just wanted to clear the misconception up that flux is incapable of cutting through distillation aesthetic

devout schooner
toxic bone
# devout schooner no lol

dude is a troll. i'll ignore for future reference. seems he's an AP that wants to derail discussions

devout schooner
#

maybe this is some kind of language barrier thing

#

not sure

errant dust
#

??

toxic bone
#

not you. you know who i mean

errant dust
#

Anyhow, here is your reply Zoot:

#

ZootAllures91 โ€” Today at 3:17 PM
right, yeah, I was saying their most expensive API only model is the only one that can reliably do photographic stuff without any hint of distillation
getting random cartoon gens from Flux Dev and such on the other hand is definitely a thing though yeah

devout schooner
#

yeah

#

the last part of that what I meant was

#

it can be more random

#

than Ultra Raw ever will be

#

it doesn't know what "photo" means as a single keyword as reliably

#

basically

#

a separate point from the appearance of the images themselves

errant dust
#

YMMV, but I had zero issues with the beach image it produced.

#

As to SD3.5 Medium (I was using SD Large ass I did not have it installed), here is the output on your prompt:

devout schooner
errant dust
devout schooner
# errant dust

yeah this one is pretty good, or better at least
could just be a seed thing
there's still definitely some varying keyword behaviour between 3.0 and 3.5 though

errant dust
#

I would hope so since the original had so many weird issues it was bound to have an effect on non human anatomy images

#

But I only use 3.5L as a rule anyhow

devout schooner
#

Also here's a good example of Flux Dev's sometimes odd prompt behavior I just did real quick on CivitAI
1girl is MASSIVELY weighted towards Anime, in Flux Dev
so much so that it almost always overpowers everything else

errant dust
#

what is '1girl

#

?

#

some cartoon character?

bitter hearth
#

its a booru tag

devout schooner
# errant dust what is '1girl

it's a general tag meaning "one female person / character / etc" basically
often used in the SD 1.5 days for tagging stuff

errant dust
#

the whole point of T5 is NLP, not the ancient single word prompting of the past

#

Complaining the AI cannot understand some weird name such as 1girl is not really much of a complaint to my eyes

devout schooner
bitter hearth
#

the Pony crowd like to prompt that way

errant dust
bitter hearth
#

its a thing from Japan

#

it means image board in Japanese

#

and they post tagged anime images there

devout schooner
errant dust
#

ok, a fair point but if 1girl is used almost exclusively in an anime setting and you way you want a photo of an anime girl.....

#

As to the Pony crowd..... cry me a river

devout schooner
# bitter hearth the Pony crowd like to prompt that way

I train all my Loras concatenating Florence 2 Large "more detailed" mode natural language captions with Booru tags from wd-eva02-large-tagger-v3 right after them, in the same file
best approach I've found
cause each captioner tends to get stuff the other doesn't
so the combination is better than eiteher alone

eternal steppe
#

flux with dynamic thresholding is still my go-to

#

๐Ÿคทโ€โ™‚๏ธ

errant dust
#

Well, as far as I'm concerned it does a nice job in photorealism, though to be fair I quite literally never do any photorealism with them

#

Here is a higher res image with that beach prompt

eternal steppe
eternal steppe
#

It's so pointless to argue anything if you can just generate an image with one model then do a second pass with any other model of any style

#

Flux Redux helps with this magnificently combined with loras

bitter hearth
#

you lose the flux VAE though

#

if you do that

eternal steppe
bitter hearth
#

if you do a workflow that first pass fluv dev and second pass SDXL
your image exits via the SDXL VAE

eternal steppe
#

Then you can just re-encode it. Again.

#

This changes nothing lmao

#

If vae was an issue, you could literally just make a vae encodedecode loop

#

and loop hundreds of ksamplers

bitter hearth
#

you can't undo the quality loss from the SDXL vae though

eternal steppe
#

The main issue at hand with diffusers-based transformers is temporal consistency

#

The changing and/or loss of data based on denoise value and steps

#

I don't think vae decode is relevant in terms of loss quality

bitter hearth
#

if you try img-to-img with SDXL, lots of k-samplers at 0.5 denoise
the image quite quickly gets destroyed

#

it isn't able to fix the damage

eternal steppe
#

Well yeah, because you're denoising half of the image for the model fo inference upon.

#

Have you tried preventing it via other methods?

#

Canny, Depth controlnets

#

IPadapters

bitter hearth
#

was still the same issue

eternal steppe
#

I use a very low denoise combined with flux redux to maintain quality

#

original

#

brocolli in the clouds

#

its raining legumes

bitter hearth
#

if you try 50 ksamplers in a row you will see what I mean
the image is so damaged its not gonna be saved

eternal steppe
#

redux itself is a style model

#

it applies the image into conditioning

toxic bone
#

anime image boards have been ready for datasets with captions for years. those people have had taxonomical tagging down to a science for the entire 21st century. nobody tags image collections like anime enjoyers

eternal steppe
#

That's what I use in combination with lower denoise

eternal steppe
#

But ik what you guys are talkin bout

#

and i aint going down that rabbit hole

toxic bone
#

danbooru is one i avoid. so much extreme smut there

eternal steppe
#

I do not judge people for their interests, just actions.

toxic bone
#

there's many sites that have used image tagging and it's not specifically a smut thing

eternal steppe
#

Rocket Fish.

toxic bone
#

and it was done long before any ai classifiers existed. WD tagger was trained on those datasets

eternal steppe
#

I'm literally just letting flux make its own images.

#

Using an LLM.

#

I gave an LLM a sysprompt, and a constantly changing user prompt with a seed value.

bitter hearth
#

you can throw WD tagger into the captioning mix if you want yeah
what Sana does is use four captioning models
and then it selects the caption with the highest clip score for the image
and uses that caption

eternal steppe
#

Sadly for me I just can't use SANA properly rn.

#

I can inference it but it's vae outputs on intel arc aren't good

bitter hearth
#

ah I don't know intel arc too well

eternal steppe
#

A lot better supported than it used to be

#

I'm glad I bought the A770 at the time

#

16gb of vram and 4060-level speeds

#

good enough for me

bitter hearth
#

intel CPU can be okay as well

#

with Onnx or OpenVino

eternal steppe
#

or IPEX*

#

As that also exists for them

#

Nice leaves.

toxic bone
#

for gamers it truly is a golden age of hyper graphics

eternal steppe
#

The universe has decided to play an elaborate game of hide and seek with the moon, and as it's now in a never-ending cycle of falling off its axis due to an infinite loop, time itself has begun to unravel, creating ripples through the fabric of existence. The sky is painted in hues of violet and silver, with occasional patches of neon green swirling about like psychedelic storms. Meanwhile, giant mutant lobsters have taken over New York City, transforming it into a sprawling underwater metropolis teeming with bioluminescent flora

#

lol one of the prompts

toxic bone
#

3 manufacturers all putting out budget level cards that can smash FHD resolutions on any game on the market. its a good time to be a gamer

eternal steppe
#

I should be using gemini flash 2.0 from openrouter, they got it as a free model

#

probably get better prompts

toxic bone
#

phi-4 is out now too

eternal steppe
#

Is it? I wonder how good it is.

#

phi 4 14b

#

ok

#

it competes with 70b models

#

ok

toxic bone
eternal steppe
#

I'd been a good frontend that supports it then

#

IPEX OLLAMA defo wont

#

hmm

toxic bone
#

i use koboldcpp for llms but i dont know if it has phi4 support or not

eternal steppe
#

wait

#

it says the arch it uses is phi3

#

phi3 itself is supported

bitter hearth
#

ah thanks I didnt know about IPEX

eternal steppe
#

IPEX is how I'm running ComfyUI and LLMs locally

#

And now with Xess2 and XeLL

#

Intel's really starting to pick up the pace on competition

bitter hearth
#

I've started doing JAX stuff

#

TPUs scale a bit better for total TFLOPs/$ than Nvidia at the moment

toxic bone
#

i trust that intel is going to create some ML instruction sets directly on their future CPUS. i was going to boost to a rocket lake or whatever. but then those power issues showed up. going to wait for the next big generation of their CPUs with some serious improvements. my alderlake works good nuff for now

eternal steppe
#

come to danny

toxic bone
#

would ipex use my i7's integrated gpu very well you figure?

bitter hearth
#

all the cloud servers I use have been AMD

eternal steppe
#

Ok.

#

Phi-4 works on ollama 0.3.6

#

So

#

it works on intel

bitter hearth
eternal steppe
#

Welp no need to use anything else

bitter hearth
#

https://github.com/rupeshs/fastsdcputhis is a good openVINO SD implementation

toxic bone
#

the integrated gpu is a gpu. separate from the cpu cores

eternal steppe
#

For individuals on

#

nvidia

#

use svdquant

fleet meteor
eternal steppe
#

same guys who worked on the dc-ae used for sana

#

SVDQuant is a super-performant quantization

toxic bone
#

i just want to load an llm on something other than my nvidia, so that i can use it in tandem with image gen

#

i got a 4080 for images

eternal steppe
#

Then use llama cpp

#

lmao

#

or

#

ollama cp

#

u

#

and they would both work

toxic bone
#

yeah so no benefit to using integrated gpu then. just go with cpu

eternal steppe
#

Not really no. You'll get more FLOPs out of the cpu alone

#

Otherwise using your GPU would be more sensical

eternal steppe
#

You can use ollama with OLLAMA_KEEP_ALIVE="0"

#

in order to load a model, generate and unload after

#

phi-4 is nuts

toxic bone
#

good to know xD

bitter hearth
#

svdquant is what I currently use yeah

eternal steppe
#

I can't use it myself

bitter hearth
#

its on the tensorRT model optimiser roadmap
so it will improve soon

#

I don't have a GPU so I just use cloud

eternal steppe
#

openrouter

#

free models

#

๐Ÿคทโ€โ™‚๏ธ

bitter hearth
#

there's always gonna be limits on free stuff

#

rate limits etc

#

and lower speeds

eternal steppe
#

In the mystical city of Quagmire, where skyscrapers were made entirely of licorice and pigeons sported top hats adorned with tiny monocles, the citizens lived in an uproar of chaos and calm. One day, a peculiar event unfolded: Mr. Puddingpot, the local wizard with a penchant for mismatched socks and rainbow-colored waffles, decided to invent time-traveling croquet mallets. The invention was powered by the synchronized chirping of rare golden geeseโ€”each bird had their own distinct jingle tune that resonated only when they saw a full moon through a kaleidoscope lens. Meanwhile, in the corner of Puddingpot's laboratory, Lady Zinniaโ€”a dragonfly with an extraordinary collection of porcelain teacups and a slight allergy to rhinestonesโ€”was knitting scarves out of spider silk laced with fairy dust for her upcoming tea party with the whispering willow trees. As fate would have it, Sir Whifflebottom, a knight whose armor was perpetually sticky from jellybean juice spills, had just stumbled upon a magical map that only appeared in rainbows on Thursdays when painted by left-handed penguins. This extraordinary day promised to bring even more strange and

#

Phi-4 is good at prompting.

bitter hearth
#

thats good yeah

eternal steppe
#

I asked for a "random things" prompt.

bitter hearth
#

I like to use florence 2 first

#

then second LLM improves

eternal steppe
#

I actually use wd-14 with underscores removed

#

lmao

#

but since im currently using flux

#

i actually don't need to make an image2image prompt

#

redux with clip vision does that for me

bitter hearth
#

I haven't tried redux yet it seems good

eternal steppe
#

I can test an image for you if wanted

bitter hearth
#

its ok I've seen a lot of examples

eternal steppe
#

๐Ÿ‘

lavish sparrow
devout schooner
#

(this is an actual photograph of Madison Beer to be clear, not an AI generated image to begin with)

#

so you can see how eyes that aren't necessarily even as large in the overall image would eventually become kinda weird
when training a whole model with the XL VAE
as is the case in base SDXL

eternal steppe
#

nope nvm

#

even that has differences between each gen

bitter hearth
#

most people its a small issue cos they don't have many k-samplers

#

but I like weird long complex workflows
and I get limited by it

#

spamming dozens of k-samplers is what I'd like to do, will be possible one day

eternal steppe
#

Well crap. Ipex-llm was updated and now it broke my xpu support

#

lol

bitter hearth
#

lol yeah stuff breaks all the time

devout schooner
bitter hearth
#

yes with flux, using a lot of colour match nodes in between, I was able to use like 20 VAE cycles once

lavish sparrow
fleet meteor
#

Do anyone knows if last sd forge update broke something? Im getting 31.33 seconds per iteration in flux and I was getting 4-6s/i

toxic bone
#

forge doesn't get many updates too often. the issue might be elsewhere.

I went to update my nvidia drivers last week, and it installed this new thing to replace geforce experience. "nviida app" .. it's a complete hog and i hate it. i've had so many issues since. Overlay is turned off but it's still attaching some container process to everything. It's crap and is not ready for them to deploy. But i had to get it to get the new driver so whatever. fuck.

fleet meteor
fleet meteor
#

I think Iยดve downloaded the wrong fp8 version, im downloading the 17gb one this time

fleet meteor
#

Its fixed!

craggy crest
remote holly
turbid grotto
#

why bro brought midjourney bot here

remote holly
#

hyper realistic cinematic photo, candid, black and white, professional photo, fashion, a woman with dark hair standing leaning against a wall blowing a light transparent smoke, she is wearing a white t-shirt with a long black coat, long exposure, in a dark setting, low light, dynamic shadows, volumetric light, 8k, detailed, masterpiece

lavish sparrow
remote holly
lavish sparrow
# remote holly i love it

the only human hands involved in this image is an idiot creating a workflow so he doesn't have to actually prompt ...

#

my workflow also does well with actual prompts...

gleaming wigeon
#

The

vestal wadi
#

top

rapid pivot
remote holly
silver bridge
#

So how is SD 3.5 looking? And u guys use large turbo, large or medium? Iโ€™m considering using it

pseudo owl
# silver bridge So how is SD 3.5 looking? And u guys use large turbo, large or medium? Iโ€™m consi...

Sd3.5 large is pretty great, a good alternative to flux dev.

Turbo is good too but requires 4+ steps and I think shuttle3(flux schnell finetune) surpasses it since it can do same if not better quality at 2steps and can do 2k images.

Medium is probably the sota base model for its size but I think the low-step models are better choices since you can just use quantization with them for speed-ups and lower vram usage. Shuttle3 is probably faster then medium.

errant dust
#

These are large powerful models. Qwen is 72B and might take the throne of best image describer (for local installs)

#

Their Qwen 2.5 LLM is comparable to Llama 3.3 in quality

#

Heated race between Meta and Alibaba on that front

silver bridge
pseudo owl
silver bridge
pseudo owl
errant dust
#

I assume it accepts the various quantized models?

#

GGUF, IQxXXS, etc

pseudo owl
errant dust
#

The interface will download and install the models?

sturdy pine
#

is comfy UI completely separate from 3.5 medium, large, and large turbo? I just installed it and it never gave me a chance to choose

lucid swift
sturdy pine
#

thx

lucid swift
# sturdy pine thx

allso the images on that page are workflows. so you can downdle them and drag and drop them into comfyui to get the workflwos that are deskibed

#

and you shuld probably use the fp8 checkpoint it uses less vram

sturdy pine
#

Thx. I think I'm finally getting it now. Glad that link actually says where to put everything

lucid swift
#

yes its very helpful i also used it xD

craggy crest
cunning kraken
#

Amara, an anthropomorphic owl with a serene expression, prepares matcha in a traditional Japanese tea house. She wears a simple kimono with floral patterns and carefully whisks the bright green tea in a ceramic bowl. The room is minimalistic, with tatami mats, sliding shoji doors, and a small zen garden visible outside. Delicate steam rises from the bowl, and the atmosphere is tranquil, with soft sunlight filtering through bamboo trees. The scene captures a moment of calm precision and elegance.

lavish sparrow
remote holly
#

Linear complexity ๐Ÿคฏ

#

I not will be suprised if one day we reach log(n) complexity

lavish sparrow
#

@remote holly that's what one of the path of exile dev's figured how to do with screen space reflections

#

doing ray-casting (path tracing) with log(n) complexity

#

so for each sample point the next point will be cheaper. he held a talk about it. when i was watching my mind was like "poooof"

#

raytracing becoming CHEAPER for each increase in resoluation instead of more expensive.

lavish sparrow
#

getting pretty reliable text out here ๐Ÿ˜ฎ

remote holly
bitter hearth
#

non-quadratic attention is what Sana does also

bitter hearth
#

is there an optimal size for a flux image?

#

yeah 384x384 in my experience

lavish sparrow
remote holly
bitter hearth
remote holly
bitter hearth
#

well its fast

#

quality I am not sure

#

depends how well it trains

dry wave
bitter hearth
#

yeah I guess breakthrough was too strong of a term when its possible that it is worse LOL

#

the research on VAE speedups might be more valuable

#

I struggle with VAE encode/decode times when doing something like 4k SDXL with 4 steps

#

feels weird when VAE encode/decode time is longer than the Unet time

#

for non-linear attention the Mamba stuff might be more promising, the Dimba, Dim or ZigMa models

dry wave
#

its like Stable Cascade. Sounds good in the beginning, but comes with a lot of flaws

#

if you train your own custom finetune you usually don't have to finetune the vae. I bet, however, you would have to do that for Sana

#

their vae has a much higher compression factor that probably won't work well with custom finetunes

#

(same happened for StableCascade with their stage b model)

bitter hearth
#

I feel like there is a middle ground between the SD 1.5, SDXL and Flux VAEs and their TAESD versions
even going a bit faster for a bit of quality loss would be welcome sometimes

#

the TAESD versions are just too bad I can't get them to work a lot of the time

#

I wasn't aware that you had to fine tune the VAE for Cascade fine tunes, that does sound pretty bad

#

I skipped stable Cascade cos I didn't like the aesthetics

#

if we could get just one photography/realism checkpoint for Sana that would be good maybe
my bigger issue with Sana is that it is not that much faster than Schnell

#

particularly the Schnell finetune Shuttle Diffusion

#

which happily does 2560x2560 in 2 steps

pseudo owl
#

Yeah I think vaes can definitely be improved. This is reducio vae, sota quality and compresses 4096x times:https://huggingface.co/microsoft/Reducio-VAE
For reference, mochi-1 compresses 384x and Hunyuan does 512x. Reducio vae is similar if not better quality and does 4096x. Limited to 1sec sadly.

bitter hearth
#

we're gonna need it for images if diffusion is going to be 1 step

dry wave
#

that's video.

#

of course you can compress videos stronger than images

pseudo owl
dry wave
#

you can also compress images arbitrarily strong, but is it worth it? At some point if you compress your image too strong you end up with a GAN like model which lacks any creativity and cannot be finetuned anymore

pseudo owl
#

For fast real-time generations, probably a good idea. Vae needs to be probably lightweight too though.

dry wave
#

I don't know. You need a strong compression cause attention is quadratic. Maybe we have to overcome the transformer architecture instead

bitter hearth
#

in the LLM world I think they got further with some of the non-quadratic attention stuff
they got at least a 7B or something

#

I'm in the "focus on scaling" camp mostly, I think in the short term we just need to keep on scaling transformers
can distill from the larger ones

craggy crest
dry wave
#

they have nice benchmarks showing they are as good as attention. But just in their benchmark. As soon as you leave their narrow benchmarks attention outperforms this stuff clearly

#

what works, however, are hybrid architectures where you mix global and local approaches. They show better performance while dropping only very slightly in quality. However, this approach don't solve the scaling issue, it just makes it more efficient but quadratic runtime is still quadratic runtime

bitter hearth
#

hybrid architectures don't get rid of the quadratic scaling yeah

#

its really bad scaling if we want sequence lengths to get several orders of magnitude higher

craggy crest
limpid thunderBOT
#

Last 7 days <Dec 10 2024> โ†’ <Dec 16 2024>

  • Member counts
  • 346039 โ†˜ 346018 โ†˜ 346017 โ†˜ 346003 โ†— 346018 โ†˜ 346000 โ†˜ 345993
  • Action members
  • 0 โ†’ 0 โ†’ 0 โ†’ 0 โ†’ 0 โ†’ 0 โ†— 70
  • Message members
  • 0 โ†’ 0 โ†’ 0 โ†’ 0 โ†’ 0 โ†’ 0 โ†— 58
  • Reaction members
  • 0 โ†’ 0 โ†’ 0 โ†’ 0 โ†’ 0 โ†’ 0 โ†— 25
    More details
Summary | comcom Analytics

comcom analytics ใฏใ€Discord ใพใŸใฏ Slack ไธŠใง้‹ๅ–ถใ•ใ‚Œใฆใ„ใ‚‹ใ‚ณใƒŸใƒฅใƒ‹ใƒ†ใ‚ฃใ‚’ๅˆ†ๆžใƒปใƒขใƒ‹ใ‚ฟใƒชใƒณใ‚ฐใงใใ‚‹ๅฎŒๅ…จ็„กๆ–™ใฎใƒ€ใƒƒใ‚ทใƒฅใƒœใƒผใƒ‰ใงใ™ใ€‚็พๅœจใ€ใƒ‘ใƒ–ใƒชใƒƒใ‚ฏใซฮฒ็‰ˆใ‚’ๆไพ›ใ—ใฆใ„ใพใ™ใ€‚

lavish sparrow
#

รค

#

I'm never going to outprompt my LLM again ๐Ÿฅฒ

lavish sparrow
lavish sparrow
#

"i think i'm lost"

worn zealotBOT
#

Author: @pine loom
MotionPrompt: say hello
Command: </animate:1181475143340539995>
Background: from video
Model: V3-beta
Explore more features at viggle ai

Viggle AI

Bring your characters to life with AI. From professional motion capture to viral memes, discover endless ways to create with Viggle.

lavish sparrow
lavish sparrow
glossy bloom
unkempt mango
remote holly
remote holly
#

i can't wait hunyuan svd quant

turbid grotto
remote holly
#

20 steps

#

is like generating 2 images with flux

turbid grotto
#

also, flux takes only ~100s for me on 3060

remote holly
#

q8 or fp8 ?

#

i think i could do with more frames , maybe 3s

#

i have not enough memory for decode

turbid grotto
civic trail
errant dust
# lavish sparrow

WHat was the prompt? Was this by LLM? If so which one and what was asked?

#

Imagen 3 by DeepMind is updated

lavish sparrow
#

the output:

"T5": "A surreal dreamscape where gravity and space warp in defiance of the ordinary. Jagged, floating islands with archways that lead nowhere and staircases spiraling infinitely upwards or downwards dominate the scene. Colors are vivid and unnatural, bleeding into one another, creating a mesmerizing yet disorienting effect. Figures float effortlessly, their forms merging and separating as if woven from light and shadow, embodying the fluidity of non-Euclidean space.",
"CLIPG": "non-Euclidean, surreal, floating islands, infinite stairs, vivid colors, ethereal figures",
"CLIPL": "Surreal floating landscapes with warped geometry and ethereal figures in vivid, merging colors.",
"ARTSTYLE": "Surrealism, digital art"}```
#

i noticed SD3.5L and M really like it when the clipg and clipl aren't as long. so i have a LLM output various fields ->
reasoning: to get its bearings. It also helps getting rid of stuff that should not go into a prompt
T5: obvious
clipg: shortest prompt, just keywords
clipl: short stentences, basically a summary
artstyle: i add these in front of clipg and clipl, so there is an actual art style used

#

the LLM model i use is ExaOne, i'm really impressed with it

errant dust
#

what size? I was not terribly impressed with its text, but admit it had some curious cultural habits with its creative output

dry wave
#

why do people always use different text for clipl and clipg ;_;

remote holly
lavish sparrow
#

if you still have that parameter in there, it's TERRIBLE for real

remote holly
#

is possible to ask a reasonning for describe an urbex video ?

lavish sparrow
#

i'm not sure if vlm models are really suited to reasoning

#

and exaone isn't a reasoning model per se -> it's smart enough to do it when instructed, which helps

remote holly
#

i mean use vlm output as an input for reasonning model

buoyant hedge
#

@lavish sparrow that's really cool!

lavish sparrow
remote holly
#

yeah , i will try with marco-o1 in ollama

lavish sparrow
#

marco o1 is surprisingly capable

#

if you want to use it in comfyui, you have to so some extra stuff

remote holly
#

i will use for sure, can you split reasoning part and take only the final ?

lavish sparrow
#

yes, that's the thing i've done, actually!

remote holly
#

json format ?

lavish sparrow
#

set the format to json, and add in the system prompt what should go in what fields

pseudo owl
# remote holly i mean use vlm output as an input for reasonning model

probably wouldn't recommend that, just using something like qwen2vl/minicpmv2.6/pixtral should work well for describing images in a specific format.

No need for reasoning honestly like marco o1, llama 3 8b might be even better if used properly.

All format issues can be super easily solved by few-shot, just give it examples and you will get identical formats.

lavish sparrow
#

so i let it reason in the reason tag, but i can discard it afterwards, but the intelligence of those models comes from having reasoning available to them in the first place.

pseudo owl
#

I kinda dislike marco o1, its just a cot like finetune which even llama 3.2 1b/3b can do zero shot. The only open source model that actually seems to improve performance with reasoning is QwQ 32b(qwen) imo but thats more for coding.

remote holly
# pseudo owl probably wouldn't recommend that, just using something like qwen2vl/minicpmv2.6/...

is use minicpm-v2.6 and qwen 2 vl with this system prompt : You are an AI assistant expert in writing very precise and quality urbex video descriptions from images of places and spaces. When the user gives you an image representing a place or a space you write a description of a video following this template:

A urbex video filmed in the first person in [insert a short description of the place], the operator moves forward, [insert your description with details], the lighting is [insert a description on the lighting of the place], hyper realistic, fluid, masterpiece, amateur video, tilted camera

Instructions for generating video descriptions:

  1. Do not contradict the image of the first frame.

  2. Start with a concise description of the scene in a single paragraph, focusing on the key actions in order.

  3. Include detailed character and object movements, focusing on specific, observable actions.

  4. Provide essential background details to set the context, highlighting elements that enhance the mood without overloading the description.

  5. Mention camera angles and movements that define the visual style of the scene, keeping it concise.

  6. Specify lighting and colors to establish tone, ensuring they complement the action and setting.

  7. Make sure the description reflects the type of source, such as live action or animation, clearly and naturally.

#

it works very great but in think i can do better results

pseudo owl
remote holly
#

i will try give an example

errant dust
bitter hearth
#

they're not the same as what O1 did cos it wasn't cot

#

it was reinforcement learning apparently

limpid thunderBOT
#

Thank you for using comcom analytics.
"comcom analytics" supports all community managers (moderators and server owners) by stats, visualization, and analytics.

If you have any questions, feel free to ask us!
Your dashboard
Help
Support server

Other languages
en: help
ja: help Japanese

granite pasture
#

help Chinese

limpid thunderBOT
#

Thank you for using comcom analytics.
"comcom analytics" supports all community managers (moderators and server owners) by stats, visualization, and analytics.

If you have any questions, feel free to ask us!
Your dashboard
Help
Support server

Other languages
en: help
ja: help Japanese

uneven storm
# remote holly hunyuan video gguf q8 with rtx 3060

very nice, i saw that the distelled versions of hunyan lost was heabily degraded but this seems reeally nice, how long does it take for a vid to gen, at what fps and for how many seconds can you get it, also which ui are you using? i do allot of image stuff but never video so curious

#

i also have 3060 so would be good info ๐Ÿ™‚

remote holly
#

@uneven storm i have an rtx3060 12gb. You can generate 2s(50 frames) of video with classic resolution and 20 steps , it takes me ~10mn to generate one video.

I am using hunyuan-q8 with the new native nodes in comfyui

uneven storm
remote holly
uneven storm
#

tyvm

unkempt compass
remote holly
#

yeah the comfyui-gguf nodes have a support for hunyaun video , adn fastHunyuan works only with h100 gpu i think

unkempt compass
remote holly
#

i dont know, researchers test always they work on h100 gpu

unkempt compass
#

And about a working workflow for GGUF Hunyuan?

remote holly
#

just use comfyanonymous huyuan workflow and replace model loader to unet gguf loader

#

this one

unkempt compass
muted dove
remote holly
#

In a surreal, neon-lit dreamscape, a young woman stands surrounded by thick, swirling clouds of electric blue mist. Her sharp silhouette contrasts with the glowing haze, creating a sense of otherworldly wonder. She wears an oversized black coat adorned with glowing neon stars, which pulse with an ethereal, blue-white light, casting faint, shimmering reflections on her surroundings. The stars appear to hover slightly above the coatโ€™s fabric, as if suspended by magic, radiating a celestial glow. Her face is illuminated softly by the ambient light, and her expression is calm yet enigmatic, as she gazes directly into the distance. The atmosphere is immersive and futuristic, as the dynamic fog continues to drift and curl through the scene, giving the impression of a cosmic, dreamlike environment. The contrast of the deep shadows and vibrant neon hues evokes a moody, cyberpunk-inspired aesthetic.

civic trail
civic trail
turbid grotto
craggy crest
bitter hearth
#

lots of conditioning that is like control net or pullid

#

and they included seg control net which is one of the better but rarer ones

craggy crest
nocturne flint
#

็”Ÿๆˆๅธฆ้›ชๅฑฑใ€ๆƒ…ไพฃใ€็ป“ๅฉš่ฏ็š„ๅ…ƒ็ด ็š„ๆผซ็”ป้ฃŽๅ›พ็‰‡

remote holly
night oar
#

Hello

civic trail
remote holly
#

i have the prompt for image version , i will send you

#

i dont find the prompt sorry

#

but is used these keywords :


a pretty woman with dark black hair , standing with a stylish pose in front of a red background with snow on the ground and smoke ,

she wears a large blue wool coat with long blue wool pants and a large blue hat ,

reflections , detailed shadows , bright and dynamic light ,

8k, detailed, masterpiece```
lavish sparrow
remote holly
#
hyper realistic cinematic photo, artistic photo, artistic fashion style, long exposure, bokeh effect, cinematic view, fashion, close angle, bright colors, professional photo, a pretty woman with dark black bob hair, floating in the air in a photo room with a red background, she is wearing a large orange wool coat with long orange wool pants and a large orange hat, reflections, detailed shadows, bright and dynamic light, 8k, detailed, masterpiece
#

with sd3.5l

lavish sparrow
#

??? XD

remote holly
bitter hearth
#

you can send your Flux, Cascade, SD 1.5 or SDXL latents directly to SD 3.5 with this https://github.com/city96/SD-Latent-Interposer

#

particularly for SD 1.5 it lets you avoid using the small VAE

errant dust
#

Midjourney is having a special for this month, with UNLIMITED IMAGES using their smallest subscription. They said the reason is they have too many servers that are running idle, so made this announcement. They literally state that (IMHO it means the competition from free purveyors like Flux, and the sheer strength of said models, is the cause)

#

They call it the Relaxathon, meaning set it from Fast image generation to Relaxed, but due to huge server availability there is little to no wait time using this theoretically slower mode

#

Again, this is in their very own announcement

bitter hearth
#

its quite slow in relaxed mode as far as I know

errant dust
#

So you are trying it in this Relaxathon and new conditions? Ok, good to know

bitter hearth
#

no I don't use closed source personally
its just what I have heard

errant dust
#

But you heard this now, when they declared that their servers are currently running idle

#

Still good to know

#

This is the announcement for those interested:

#

Hey. we got a ton of servers for our models next year, and they're currently sad and idle. So for the remainder of the year we're doing a holiday RELAX-ATHON. Relax mode is now available to all members (including basic) with near-zero wait time.

More specifically this applies to
Standard jobs, variations, sref, permutation jobs

It does not apply to:
Upscales, weird jobs, editor, outpainting, retexture

We've also increased the queue sizes and maximum simultaneous jobs
You can now do up to 12 simultaneous relax mode jobs
You can now have 24 jobs in your waiting queue
(these numbers may change based on server availability)

This is a pretty huge amount of free computing power and we hope you can use it over the holidays to have a lot of fun and take some crazy aesthetic journeys.

Happy holidays! โค๏ธ

#

Sounds also llike MJ7 is on the way

#

I have not tested even the new release of Imagen 3 frankly, which is free and came out yesterday.

#

I still may bite for fun.

errant dust
#

They added different ratios, so now it is not locked into 1:1 images

bitter hearth
errant dust
rapid pivot
errant dust
#

oh... no, just website

rapid pivot
rapid pivot
#

I can't generate Santa using a jetpack, 0/10 sadcat

errant dust
#

Yeah? I never tried. Let me see what happens

rapid pivot
#

Yeah they tried to "diversify" things on the older models and got backlash for it

errant dust
#

Worked for me

rapid pivot
#

Maybe only there

errant dust
#

Well, I said Imagen, not Gemini, so cannot comment

rapid pivot
rapid pivot
errant dust
#

As a rule I always get rid of the middleman

#

๐Ÿ™‚

rapid pivot
#

But because of that it might be getting blocked like the old models

errant dust
#

Like Copilot and Dalle3.

rapid pivot
rapid pivot
errant dust
#

That's not a very grounded opinion IMHO. Preference won't make it true.

#

Oh, and nor in my country. I use a VPN

#

I mean, all models, open source or closed source are made by companies big and small. There is no magic sauce in creating a model that makes the open source inherently better. You can create tools and LoRAs and so on, but the base model? The sheer cmpute alone is outside the geenral community's reach for now.

#

This is in contrast to open-source software projects directly developed by the community

rapid pivot
errant dust
#

Imagen 3?

rapid pivot
#

I just updated the page and worked

#

Yeah

#

I got an image from it

errant dust
#

Huh. It wasn't a couple of months ago

#

I got messages about availability soon blablabla

rapid pivot
#

I went to the faq and saw Brazil there and was like wtf

#

Went back and it worked waow

#

The UI in that site is pure suffering on mobile

#

@lavish sparrow

errant dust
#

Solid artwork though

pseudo owl
#

Hunyuan can gen 20sec vids now with ctx window and not using extra vram. Will use extra time obviously though.

bitter hearth
#

was in the early Gemini days

bitter hearth
#

pink spaghetti noodle in ComfyUI

fleet meteor
fleet meteor
#

Too bad its closed source but at least its free (no, im not gonna subscribe or upgrade to it XD)

bitter hearth
#

looks nice

fleet meteor
#

๐Ÿคฃ

pseudo owl
#

Flux Schnell 1step, 1024x512

remote holly
fleet meteor
bitter hearth
#

you use the VAE of the final model

remote holly
#

new vae , built in stg and image compression !

bitter hearth
#

wow nice

#

LTX is good

remote holly
#

yeah

#

i love they listen the community and add the image compression and built in stg

#

i am looking forward controlnet and lora support

#

ltxv with controlnets have the potential to be the new animatediff with DiT

remote holly
#

anatomy is not good but i love colors

#

is it possible to correct the hands with inpainting ?

pseudo owl
pseudo owl
remote holly
#

realy amazing !

#

the movement and camera are better

#

no crazy deformations with stg

pseudo owl
#

Yeah it got a nice 5 finger hand, and itโ€™s actually faster too. Previous ltxv already took like 15sec on a 4090 with stg. This one is 15% faster I believe

remote holly
#

what is this new option :
Additional initial latent optional input to chain latents for high res generation.

remote holly
#

4s/it for 5s and 30 steps !

hallow lion
craggy crest
remote holly
lavish sparrow
lavish sparrow
#

nomnomnom

#

"Let me devour your heart"

pseudo owl
lavish sparrow
short haven
#

blood microchips

lavish sparrow
#

remember to water your plants

dull star
lavish sparrow
lavish sparrow
#

time for the little monster to sleep

lavish sparrow
#

and morning coffee, important too

lavish sparrow
upbeat girder
civic trail
lavish sparrow
lavish sparrow
#

abstract rabbits xD

#

"i'm tired, boss"

#
"T5": "Abstract swirling flames dance across a canvas, interwoven with ethereal, flowing lines representing voices and emotions. Shades of deep reds and oranges contrast with cool blues and purples, symbolizing bittersweet desire and peace. Interlaced figures, barely discernible, embody connection and eternal devotion, their forms merging into a harmonious chaos that speaks of burning passion and eternal union.",
"CLIPG": "flames, ethereal lines, red-orange, blue-purple, intertwined figures, voices, eternity",
"CLIPL": "Abstract flames and flowing lines symbolize burning passion and eternal love, merging figures in a vibrant, bittersweet palette.",
"ARTSTYLE": "Abstract Expressionism"}``` the fact that sd3.5 L even gets this close, is 100% amazing...
toxic bone
#

"The fact that..." [opinion]

lavish sparrow
#

the fact that i'm getting images i'm getting entertained by -> i 100% do not care other people might not like it. i like it. that's all that matters to me. why would anyone's opinion matter in the first place? but i know there's people in this channel that might like it too.

lavish sparrow
lavish sparrow
lavish sparrow
craggy crest
craggy crest
turbid grotto
bitter hearth
#

I saw a comment once saying
instead of waiting for Comfy wrapper, just go to the tool where its at
and I've started doing that lol

turbid grotto
#

Yea, I should learn

#

but later agony

#

What plans do Stability have? Will sd3.5 be the last img model?

bitter hearth
#

its ok everyone has skill issue
not sure what's gonna happen with Stability

#

feels like video is the focus of everywhere for a while

finite osprey
#

I wish sd3.5 would hit 2mpx natively : (

pseudo owl
#

Yeah when will stability cook a sota video model, same with Black Forest labs and theyโ€™ve been promising it for a very very long time.

bitter hearth
#

they took a lot of VC money so the funding is there

pseudo owl
finite osprey
#

2nd passes still gives artifacts for me, but probably skill issues. Unless you talk about upscaling which I dont like

bitter hearth
#

they trained it on flux de-distilled

#

it can go to 2560x2560 even

#

I agree shuttle is probably the best model out there currently, for quality per FLOP

finite osprey
#

do they run on a 3070 ?

bitter hearth
#

on linux yeah

#

shuttle-3-diffusion-Q4_K_S.gguf - 6.88 GB will fit

finite osprey
#

will try

bitter hearth
#

SVDquant would be the best thing for you
its currently tricky to use
they are gonna make it easier apparently

finite osprey
#

#soon

#

Thanks @bitter hearth goodjob

bitter hearth
#

I recognise the Shuttle look lol

finite osprey
#

The fact shuttle is written on his arm is a giveaway

bitter hearth
#

lol didnt even see

lavish sparrow
bitter hearth
#

Shuttle beats dev quite regularly

#

it can't do every style/subject but when it works its great

finite osprey
#

what are its weaknesses?

bitter hearth
#

it goes painterly sometimes

#

I always want photographic

#

also it is not a soft model by any means

#

which is both a pro and a con

lavish sparrow
bitter hearth
#

very nice green

finite osprey
fleet meteor
finite osprey
stiff cypress
#

So I have SD forge....so how to can use sd3?

elder urchin
#

Can we train loras on SD3/SD3.5?

#

Also, is it worth it go from SD2 to 3/3.5 for anime?

civic trail
pseudo owl
craggy crest
elder urchin
#

I want loras for specific characters

#

I doubt the base model can run all of the characters I want on good quality

craggy crest
craggy crest
frail shoal
frail shoal
#

and sd3.5m does shitty composition on it's own

craggy crest
#

have you tried the GGUFs that were put out, out of curiosity?

frail shoal
frail shoal
craggy crest
frail shoal
craggy crest
#

well, we'll wait and see what they release

craggy crest
#

SD 3.5 large

summer ginkgo
#

Whitfield Diffie (SD3.5 Large Turbo).. turned him into father time... a bit ๐Ÿฅฒ๐Ÿ˜‚ hf space

fervent dragon
sweet quartz
rapid pivot
#

Don't move

sage burrow
craggy crest
#

SD 3.5 large

hallow lion
#

Rathorse. A sing of our times.

sage burrow
frail shoal
#

sd3.5 large turbo + medium

#

anyone finetuning sd3.5 medium ?

#

also have people started using more sd3.5 now, or most of them are still on flux

bitter hearth
#

I don't wanna check myself cos Civit's site lags everything up

frail shoal
#

jsut saying that flux loras had exploded when it came out, and it seems not that much dor sd3.5, there is almost nothing

bitter hearth
#

oh I thought you were asking a question rather than making a rhetorical point

sage burrow
#

It's true though, barely any 3.5 loras in comparison

frail shoal
frail shoal
sage burrow
#

There's some stuff that sd3 can do that flux cannot

frail shoal
#

seeing the ton of loras for flux, i was tempted to switch again to it lol

sage burrow
# frail shoal like ?

Anything nsfw, but also melting clocks was extremely difficult to create loras for with flux!

frail shoal
#

flux has something special though, like symmetric details. idk how they achieved that

sage burrow
#

Fortunately they are both free, and run locally ๐Ÿ˜„

bitter hearth
#

most 20B DiTs have good structure

frail shoal
sage burrow
#

Anyone made a Hunyuan Video lora yet? ๐Ÿ™‚

craggy crest
pseudo owl
craggy crest
frail shoal
sage burrow
craggy crest
#

here

#

start there - those are all style loras

pseudo owl
#

The cool thing with hunyuan is since its also like a text-to-image model, you can train it on images too and it can actually gen videos of that character/style. Image training is a lot cheaper and faster.

sage burrow
#

<-- got distracted by Hunyuan Video lol

craggy crest
#

just sayin'

sage burrow
frail shoal
sage burrow
craggy crest
bitter hearth
#

there is 3.5m turbo now also

craggy crest
#

but i'm making style loras. what sort of lora are you looking for?

bitter hearth
#

https://old.reddit.com/r/StableDiffusion/comments/1hieril/introducing_tensorarts_stable_diffusion_35_medium/

frail shoal
craggy crest
pseudo owl
craggy crest
#

they just released it yesterday

#

so it's in beta still

pseudo owl
#

nice, there was no free demo of hunyuan before and this one is unlimited๐Ÿ˜„

frail shoal
craggy crest
#

it's low res, keep that in mind. they will put in high res, but not till it's out of beta and that probably won't be free to all accounts

craggy crest
frail shoal
#

i would need some time to sort them

craggy crest
sage burrow
frail shoal
#

i never trained before

sage burrow
craggy crest
frail shoal
craggy crest
#

why don't you see if you can crete what @frail shoal is after. and i'll do a lora from a smaller dataset if it doesn't work

pseudo owl
craggy crest
#

yeah. i made several videos right after it released, they worked well

frail shoal
#

although i must warn you, there are a lot of women. not all, some dragons and a lot of fighting scenes, but a lot of women still

sage burrow
craggy crest
#

otherwise you might get just their stomach

frail shoal
craggy crest
frail shoal
#

my images are not all in 1024 though, they have different aspect ratios. would i need to crop them before ?

frail shoal
craggy crest
sage burrow
frail shoal
sage burrow
craggy crest
#

if they are all the same AR, then you can have issues if you try to generate somethign that's not that AR

sage burrow
#

So, anyone know a good free video upscaler?

craggy crest
#

on the other hand, you might get a better training if they are all the same AR

craggy crest
frail shoal
#

i cropped like a 100 and inpainted some details into things that were needed. but it takes a lot of time. i also turned a guy character into a woman, because i really liked that one lol

craggy crest
#

on their magic tools page

sage burrow
#

Hunyuan Video is decent at sword swinging characters btw! Better than still images last o tried

analog bolt
#

Is there any fancy way of upscaling with SD 3.5 (I'd like to get 4k image), or do I just generate standard 1k image and then pass through SD upscaler?

analog bolt
bitter hearth
#

yeah if you can split the tiles up
then run a captioning model, control net and IP adapter on each tile, you can get much better results

bitter hearth
#

LOL

#

I'll make a workflow for you later if you want

analog bolt
#

I've used one workflow from da internets with basically default setup (just removed inpainting) and the upscale produces artifacts.
I guess that's because it's using 512 tiles, not 1024 which are default for sd 3.5l?

bitter hearth
#

SD 3.5L like 1024 yeah

analog bolt
#

Setting tile size to 1024 did get rid of the artifacts, but there's clear chessboard pattern for brigthness

bitter hearth
#

was there controlnet

analog bolt
#

workflow should be included in image

finite osprey
#

Trying dropping the setting to around 0.10 in the upscale toolbox

analog bolt
#

denoise?

bitter hearth
#

when denoise is low it means it only changes the image a little bit

analog bolt
#

that was a question to @finite osprey. And I did reduce denoise to 0.1 and grid is still there.

finite osprey
#

Not quite sure then maybe you should post your upscaler screenshot here so that people can help more

frail shoal
#

Are there any fancy flow samplers that work with sd3 .5 in comfyui ?

bitter hearth
#

yeah the clown stuff all works https://github.com/ClownsharkBatwing/RES4LYF
in stock Comfy samplers, Euler ancestral and DPM++ 2SA also work

#

as far as I understand

#

the thing that needs to be done for an SDE/ancestral sampler to work with rectified flow models, is for the SDE to be variance preserving rather than variance exploding

analog bolt
#

What's with the bot constantly pinging alextitanov and mayz? annoying

craggy crest
remote holly
#

structured prompts work with T5 xl ?

#

prompt ```
<subject>
fashion presentation, professional video
</subject>

<style>
fashion, long exposure, professional photo, vhs, hyper realistic, cinema
</style>

<actor>
a woman with black bobbed hair
</actor>

<clothes>
a white shirt, a very long black navy blue velvet coat, black wide pants, a blue shark-shaped cap, blue shoes
</clothes>

<action>
the woman is standing holding her hat
</action>

<background>
a bright room, a black background with a starry sky
</background>

<camera>
the camera films the woman from above
</camera>

<extra>
dynamic shadows, light reflections, fashion, professional video, filming, cinematic view, hyper realistic, 4k, masterpiece, smooth, detailed and sharp faces
</extra>

remote holly
#

i love the colors and aesthetics but the hands ruin everything

#

prompt :

fashion presentation, professional video
</subject>

<style>
fashion, long exposure, professional photo, vhs, hyper realistic, cinema
</style>

<actor>
a woman with black bobbed hair
</actor>

<clothes>
a large black tee-shirt dress with blue bubble patterns, black boots with blue arrow patterns, a blue velvet hat
</clothes>

<action>
the woman is standing holding her hat over opaque pink water in space
</action>

<background>
a dark space background with stars and the planet earth in the distance
</background>

<extra>
reflections, dynamic shadows, light reflections, fashion, cinematic view, hyper realistic, 4k, masterpiece, smooth, detailed and sharp faces
</extra>```
heavy oriole
frail shoal
fervent dragon
muted dove
bitter hearth
#

they mean context window
but I am not sure how it was done

#

if it was just passing the last frame as a new initial frame that doesn't rly count

#

its still useful to do that but its different

turbid grotto
#

did anyone manage to install svdquants in comfy on windows?

muted onyx
#

Is it okay to talk about the safeguard on StableDiffusion 3.5 Medium?
I have set pipe.safety_checker = None but there is still significant safeguarding

bitter hearth
muted onyx
pseudo owl
# muted dove How did you get a longer video? What's "CTX window"?

not an expert, Kijai is the one who implemented it, but basically the steps are divided into segments(If you do 40steps, 20sec video with 5sec segments, then 10steps per segment) and there is an overlap factor so the next segment "remembers" what happened before.

It should be a bit better then passing last frame as next frame as neon said and the passing last frame technique won't even really work since hunyuan has no i2v yet.

craggy crest
#

I'm surprised no one's released in image 2 vid for hunyuan yet

pseudo owl
craggy crest
#

that'll be fantastic if they meet that.

#

controlnet would just be an extra bonus

muted dove
pseudo owl
muted dove
#

I use those nodes (comfy version just gives me blurry output), but I don't see anything that explains how to create a longer video. Increasing the frames just takes forever and crashes my PC...but that's another issue.

#

I'm using a 4090

bitter hearth
pseudo owl
bitter hearth
#

really important paper

#

PipeFusion is what I was looking at currently, it should stack with that

muted onyx
pseudo owl
muted onyx
pseudo owl
bitter hearth
#

maybe try thesehttps://github.com/CompVis/stable-diffusion/issues/331#issuecomment-1562198856 https://stackoverflow.com/questions/77616823/stablediffusionpipeline-always-producing-empty-picture-probably-because-of-saf

bitter hearth
craggy crest