#🆕｜sd3 | Stable Diffusion | Page 127

dusky thistle Nov 27, 2024, 12:12 AM

#

yeah, this.

#

and this.

#

it's not a matter of taste, samplers are either better or worse

#

more accurate or less accurate

craggy crest Nov 27, 2024, 12:20 AM

#

dusky thistle it's not a matter of taste, samplers are either better or worse

the terms better and worse are comparison terms that make something different to every single person. so are the terms accurate and less accurate. comparison terms. who decides what 0 for accurate is? maybe what I consider accurate and what you do are nothing close to the same - you won't be satisfied with less than 32 decimal places and i'm fine with just 4

pseudo owl Nov 27, 2024, 12:23 AM

#

Yikes I forgot how good mochi is, especially for an open model.

halcyon yarrow Nov 27, 2024, 12:29 AM

#

pseudo owl Yikes I forgot how good mochi is, especially for an open model.

you generated all those yourself?

pseudo owl Nov 27, 2024, 12:31 AM

#

halcyon yarrow you generated all those yourself?

These are generated from genmo, the official website. According to an employee, website uses 200steps, full precision, and an upscaler. Uses the open model.

I obviously can’t run the full precision one so I use genmo.

halcyon yarrow Nov 27, 2024, 12:31 AM

#

pseudo owl These are generated from genmo, the official website. According to an employee, ...

lol that's ridiculous

bitter hearth Nov 27, 2024, 12:31 AM

#

craggy crest the terms better and worse are comparison terms that make something different to...

this is a really good point
essentially the way the field seems to deal with this is by putting all of those things inside a box of "human preferences"

halcyon yarrow Nov 27, 2024, 12:32 AM

#

no wonder it looks so good, thx. i was wondering what they were doing and why my stuff looks like garbo and theirs looks so pristine

#

so it's not that they're generating at 2x the resolution, they're upscaling. i tried generating at 2x the resolution of 480x840 and my stuff crashed after 90 minutes, that made me sad lol

bitter hearth Nov 27, 2024, 12:33 AM

#

and then having a subsection of the paper/presentation/docs that deals with the "human preferences" benchmarks
separate to the objective metrics

#

I actually think the preference studies are more valuable a lot of the time, they are just rarer as they are expensive to make

pseudo owl Nov 27, 2024, 12:33 AM

#

halcyon yarrow no wonder it looks so good, thx. i was wondering what they were doing and why my...

Yeah quantization drops quality considerably. Even fp32 to bf16 drops quality. And yeah they use an upscaler.

craggy crest Nov 27, 2024, 12:34 AM

#

i'm still of the opinion that all of their benchmarks are worthless. for LLMs, if you want to know whether it's thinking as well as a human, give it the same tests you give humans - give it the mensa test. for AI image generators, all that matters is whether the person using it likes what it's creating or not

bitter hearth Nov 27, 2024, 12:34 AM

#

for art yeah but for example for inpainting you might wanna have a technical metric

#

or for something like "how well does this canny control net match the input image"

halcyon yarrow Nov 27, 2024, 12:36 AM

#

pseudo owl Yeah quantization drops quality considerably. Even fp32 to bf16 drops quality. A...

there's a very specific prompt someone used and posted their video for it on Genmo's discord, I tried to recreate it locally and I tried a bunch of settings, even res_3s and 5s, and I never managed to hit the level of coherency it had

#

this is two of the 'best' ones

pseudo owl Nov 27, 2024, 12:37 AM

#

halcyon yarrow there's a very specific prompt someone used and posted their video for it on Ge...

They probably use some sort of llm enhancer too, I didn’t ask anything about that though.

halcyon yarrow Nov 27, 2024, 12:37 AM

#

i think there is an option on the website to level up your prompt right? like a toggle switch maybe?

craggy crest Nov 27, 2024, 12:37 AM

#

bitter hearth for art yeah but for example for inpainting you might wanna have a technical met...

the point though is that what each person thinks is best isn't what other people think is best. so while there can be suggestions of how to achieve X or Y or whatever, they are just that. suggestions.

pseudo owl Nov 27, 2024, 12:38 AM

#

Hmm not sure, let me check.

#

Nope, they have the option to use mochi or their legacy model(before mochi was released). And seed

bitter hearth Nov 27, 2024, 12:39 AM

#

most of the ML tasks that come up on discord are ones with qualitative/subjective objectives so yeah

#

with statistical models or quant models its more common to have something be outright correct or incorrect

craggy crest Nov 27, 2024, 12:40 AM

#

halcyon yarrow i think there is an option on the website to level up your prompt right? like a ...

you could accomplish almost the same thing by having meta.ai just revise your prompt

craggy crest Nov 27, 2024, 12:40 AM

#

bitter hearth with statistical models or quant models its more common to have something be out...

1+1=2(sometimes)

halcyon yarrow Nov 27, 2024, 12:42 AM

#

okay the prompt was:

A close-up cinematic shot of the protagonist’s face, rain dripping from their determined expression under the glow of neon city lights. Their eyes, sharp and focused, reflect the faint blue hue of a glowing blade they grip tightly in the foreground. Raindrops streak across the lens in slow motion, with subtle steam rising from the wet ground below, framing the intense moment in vivid detail.
and I remember loading it and I got a negative prompt out of the file:
Flat composition, overexposed lighting, unrealistic or cartoonish details, harsh shadows, pixelation, grainy textures, unnatural movements, static framing, cluttered background, artificial colors, poor depth of field, low resolution, jittery camera motion, lack of cinematic contrast, dull expressions, poorly blended lighting effects, unbalanced composition, unconvincing water effects, stiff posture, and unrealistic reflections.

But I'm on the website and I'm seeing there is no field for negative prompt.... so i think it's fair to say it was indeed rewritten and had the negative prompt injected

bitter hearth Nov 27, 2024, 12:43 AM

#

I don't like it when they rewrite prompt

dusky thistle Nov 27, 2024, 12:44 AM

#

bitter hearth this is a really good point essentially the way the field seems to deal with thi...

when it comes to sampling, it pretty much just verifies that they didn't fuck something up with their analysis imo

#

there's nothing to be gained from inaccuracy, the only debate is about how much you lose, really

halcyon yarrow Nov 27, 2024, 12:44 AM

#

here's what genmo.ai came up with

#

or wait maybe he ran it locally? bc i see its an mp4 file and its only 1 second, genmo.ai wouldn't have made that file

bitter hearth Nov 27, 2024, 12:45 AM

#

dusky thistle there's nothing to be gained from inaccuracy, the only debate is about how much ...

that is the case currently yeah that humans prefer more accurate sampling on average
but we should check that result every now and then lol

#

can't take it for granted

dusky thistle Nov 27, 2024, 12:46 AM

#

yeah, i presume that's why they do it, to verify that there's nothing screwball

cunning schooner Nov 27, 2024, 12:46 AM

#

I have a SD3.5M Diffusers question, if anyone can help, greatly appreciated.

I can only seem to find FlowMatchEulerDiscreteScheduler and FlowMatchHeunDiscreteScheduler for SD3.5M but they both give a brighter result than the Euler from comfy results. Any idea why my comfy results are prettier (deeper, less bright overall and slightly better contrast)?
I'm using Euler + simple for both

pipe.scheduler = FlowMatchEulerDiscreteScheduler.from_config(
    pipe.scheduler.config,
    timestep_spacing="trailing",
    shift=3.0
)
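
For anyone comparing schedules between the two UIs, here is a minimal sketch of what the `shift` value does to the sigma schedule. It assumes the static time-shift formula `shift * sigma / (1 + (shift - 1) * sigma)`, which as far as I know is the form flow-matching samplers use. The helper names `shift_sigma` and `shifted_schedule` are made up for illustration, and the linear base schedule is a simplification, so this is a way to eyeball the effect, not a copy of either implementation.

```python
def shift_sigma(sigma: float, shift: float) -> float:
    """Static time shift: moves sigmas toward the high-noise end when shift > 1."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


def shifted_schedule(num_steps: int, shift: float) -> list[float]:
    """Simplified linear sigmas from 1.0 downward (assumption), then shifted."""
    base = [1.0 - i / num_steps for i in range(num_steps)]
    return [shift_sigma(s, shift) for s in base]


if __name__ == "__main__":
    # Compare an unshifted schedule with shift=3.0 side by side.
    for plain, shifted in zip(shifted_schedule(5, 1.0), shifted_schedule(5, 3.0)):
        print(f"{plain:.3f} -> {shifted:.3f}")
```

If both UIs print the same sigmas for the same step count and shift, the brightness difference is probably coming from somewhere else (text encoding, VAE decode, precision).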

bitter hearth Nov 27, 2024, 12:48 AM

#

there are so many differences between comfy and diffusers

#

in terms of the underlying system

#

its not worth trying to get the exact same output

cunning schooner Nov 27, 2024, 12:50 AM

#

No, not the same per se, but it seems my diffusers output is always slightly too bright and I can't find the right settings to tweak.
I only see Euler and Heun for example, and switching to trailing barely does anything

halcyon yarrow Nov 27, 2024, 12:56 AM

#

genmo's website, positive prompt

#

the cool thing is the large model does run on my machine even if it is slower, i think that other guy was just lucky with his seed bc the reference video still looks better than this one

craggy crest Nov 27, 2024, 1:20 AM

#

halcyon yarrow okay the prompt was: > A close-up cinematic shot of the protagonist’s face, rain...

kling

#

just say 'no' to negative prompts

halcyon yarrow Nov 27, 2024, 1:25 AM

#

yeah it nailed it really well

craggy crest Nov 27, 2024, 1:26 AM

#

halcyon yarrow yeah it nailed it really well

ran your prompt through SD3.5 large, then animated it in kling

halcyon yarrow Nov 27, 2024, 1:26 AM

#

halcyon yarrow here's what genmo.ai came up with

i feel the format fits better in landscape but nonetheless i still like this one over Kling's even

craggy crest Nov 27, 2024, 1:26 AM

#

negative prompts - are a bad idea

bitter hearth Nov 27, 2024, 1:30 AM

#

got news you will really like

#

there is a paper which seems to have solved negative prompts

#

they found a way to automate making a negative prompt that is way more optimal than normal ones

#

and it can boost image quality on any image apparently

#

https://arxiv.org/abs/2411.05473

craggy crest Nov 27, 2024, 1:51 AM

#

halcyon yarrow yeah it nailed it really well

i like her eye movement near the end

craggy crest Nov 27, 2024, 1:51 AM

#

bitter hearth they found a way to automate making a negative prompt that is way more optimal t...

the only solution for negative prompts is to get rid of them totally

halcyon yarrow Nov 27, 2024, 1:52 AM

#

im not a fan of negative prompts, i'd rather do without them entirely, but i do see the value for people who use them, sometimes there's no other way than to use a negative prompt

bitter hearth Nov 27, 2024, 1:52 AM

#

its like turning a toy boat by pushing the back rather than pushing the front

craggy crest Nov 27, 2024, 1:54 AM

#

bitter hearth its like turning a toy boat by pushing the back rather than pushing the front

i wouldn't have a problem with it if they were used correctly, but most of the time they are used by people that don't know what they are for, they are used incorrectly, the results are not what the individual wants, and then they're hunting people down to demand tech support and refusing to listen

bitter hearth Nov 27, 2024, 1:54 AM

#

yeah the massive Civit negatives I know what you mean

spark quail Nov 27, 2024, 3:17 AM

#

so ur tellin me my word pasta of bad fingers, mutated anatomy, etc is awful eh? hah yeah right!

spark quail Nov 27, 2024, 3:18 AM

#

craggy crest kling

also this has gotta be the highest quality output of anything AI ive seen so far

low sinew Nov 27, 2024, 3:58 AM

#

is there sd3 for a1111?

craggy crest Nov 27, 2024, 4:48 AM

#

low sinew is there sd3 for a1111?

i think a1111 has sd3.5 support

gentle mango Nov 27, 2024, 5:49 AM

#

do i install flux just like any other model for automatic1111?

dusky thistle Nov 27, 2024, 6:56 AM

#

bitter hearth and it can boost image quality on any image apparently

very interesting

cunning schooner Nov 27, 2024, 11:08 AM

#

Hi everyone! I'm using Stable Diffusion 3.5 with Diffusers, but I'm a bit confused about the scheduler and sampler options. I only seem to find FlowMatchEulerDiscreteScheduler and FlowMatchHeunDiscreteScheduler. SwarmUI and ComfyUI have more choices like SGM, DDM, DPM++, etc. When I try using schedulers from SDXL, I get messed up images.
What are my actual options for schedulers and samplers with SD 3.5 and Diffusers? Thanks!

cunning schooner Nov 27, 2024, 1:45 PM

#

Why are images with the same settings, always brighter with Diffusers compared to Comfy? (same seed etc, both EulerDiscrete with simple sampler)

dry wave Nov 27, 2024, 2:11 PM

#

one really significant difference I noticed between diffusers and comfy is the T5 prompt length

#

For Flux-dev the T5 prompt should be 512 tokens in length

#

but ComfyUI seems to use 256 instead

#

it's an error in comfyui, or maybe a feature cause they might have noticed it works better? I don't know

halcyon yarrow Nov 27, 2024, 2:49 PM

#

SD35L fp8 using controlnet now

bitter hearth Nov 27, 2024, 3:15 PM

#

dry wave For Flux-dev the T5 prompt should be 512 tokens in length

oh thanks a lot, I need to look into this
really don't want it to be splitting my prompt like that

dry wave Nov 27, 2024, 3:16 PM

#

I don't know... I find the Flux default behaviour also strange. Having a, say, 80 token prompt but using 512 tokens without proper attention masking
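
To make the padding point concrete, here is a small sketch of what "an 80 token prompt in a 512 token window" looks like. The token ids are stand-ins, `pad_with_mask` is a hypothetical helper, and `pad_id=0` is an assumption. It only shows how many positions are padding and what a proper attention mask would mark as ignorable.

```python
def pad_with_mask(token_ids, max_length, pad_id=0):
    """Truncate or pad to max_length and return (ids, attention_mask).

    pad_id=0 is an assumption for illustration.
    mask is 1 for real tokens and 0 for padding.
    """
    ids = list(token_ids)[:max_length]
    n_pad = max_length - len(ids)
    mask = [1] * len(ids) + [0] * n_pad
    return ids + [pad_id] * n_pad, mask


prompt = list(range(1, 81))            # 80 stand-in token ids
ids, mask = pad_with_mask(prompt, 512)
real_fraction = sum(mask) / len(mask)  # share of positions that carry prompt text
print(len(ids), sum(mask), real_fraction)
```

Without the mask, all 512 positions are attended to, so most of what the image model sees for a short prompt is padding embeddings.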

bitter hearth Nov 27, 2024, 3:18 PM

#

I just wish everything was explicit and documented

#

the attention masking issue is an interesting one yeah, cos in LLM world people do tend to use attention masking properly when fine tuning T5

dry wave Nov 27, 2024, 3:19 PM

#

I guess in Flux they don't do it cause its a "waste of compute". Instead, Flux is using the empty tokens as registers

#

however, I wonder if it does affect quality and introduce artifacts in the generation

#

in particular if you vary between long or short prompts

bitter hearth Nov 27, 2024, 3:22 PM

#

I think its possible yeah
been trying to read about that but its hard to get a definitive answer
Sana used Gemma 2 2B which is a fantastic model for its size, might be a good trend

dry wave Nov 27, 2024, 3:23 PM

#

dunno

#

decoder-only architectures are not good for image generation

bitter hearth Nov 27, 2024, 3:23 PM

#

T5 doesn't do so well for other languages, whereas the Chinese models that tend to use GLM, or Sana with Gemma, do okay with translating prompts

fossil pagoda Nov 27, 2024, 3:23 PM

#

halcyon yarrow SD35L fp8 using controlnet now

Did you make the blur one work yet?

bitter hearth Nov 27, 2024, 3:23 PM

#

when it comes to architecture my opinion is Omnigen is the way to go

#

or similar like Meta Chameleon

dry wave Nov 27, 2024, 3:24 PM

#

the problem with Gemma and other decoder-only structures is that their task is to predict the next tokens, so they model the embedding in a way that achieves this task

#

so if you give it a prompt "A cat with black fur", then it will try to autocomplete the sentence, adding certain token probability distributions to the embeddings like ("playing", "sleeping", "eating", whatever a black cat is doing)

halcyon yarrow Nov 27, 2024, 3:26 PM

#

fossil pagoda Did you make the blur one work yet?

i tried all 3, depth canny and blur, i liked canny best, blur seems really hard to control

dry wave Nov 27, 2024, 3:26 PM

#

in Sana they try to get around this problem by adding a prefix to the image prompt like "The following is an image caption. Summarize and describe the image of this caption"

bitter hearth Nov 27, 2024, 3:27 PM

#

decoder only models still often underperform yeah
Kaggle competitions are often still won by simple ensembles of flan-t5-xxl and DeBERTa-V3-large
not stuff like GPT 4

fossil pagoda Nov 27, 2024, 3:27 PM

#

halcyon yarrow i tried all 3, depth canny and blur, i liked canny best, blur seems really hard ...

Yeah I couldnt get anything useful out of the blur one

halcyon yarrow Nov 27, 2024, 3:28 PM

#

i was thinking blur was like redux too

#

i noticed the default canny settings for the workflow of 0.1/0.8 are too vague, I had better luck with 0.1/0.2 instead
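
A rough sketch of why lowering the second number picks up more edges, assuming the two values are the low/high hysteresis thresholds on a 0-1 gradient magnitude. This is a 1D toy (`hysteresis_1d` is a made-up helper, and the magnitudes are invented), not the actual preprocessor: pixels at or above `high` are kept outright, and pixels between `low` and `high` are kept only if they connect to a kept pixel.

```python
def hysteresis_1d(mags, low, high):
    """Toy 1D hysteresis: keep strong pixels plus weak ones connected to them."""
    n = len(mags)
    keep = [m >= high for m in mags]
    # Spread from strong pixels to the right through weak (>= low) neighbours.
    for i in range(1, n):
        if not keep[i] and mags[i] >= low and keep[i - 1]:
            keep[i] = True
    # Then spread to the left.
    for i in range(n - 2, -1, -1):
        if not keep[i] and mags[i] >= low and keep[i + 1]:
            keep[i] = True
    return keep


mags = [0.05, 0.15, 0.5, 0.15, 0.05, 0.15, 0.9]  # invented gradient magnitudes
default_edges = hysteresis_1d(mags, 0.1, 0.8)
tighter_edges = hysteresis_1d(mags, 0.1, 0.2)
print(sum(default_edges), sum(tighter_edges))
```

With the high threshold at 0.8, only the very strongest gradient seeds an edge, so mid-strength contours drop out of the control image.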

bitter hearth Nov 27, 2024, 3:28 PM

#

dry wave in Sana they try to get around this problem by adding a prefix to the image prom...

I see what you are saying, next token prediction is problematic for creating embeddings for DiTs/Unets, relative to masked word prediction

#

cos with masked word prediction its not specifically trying to predict the word at the end
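
One related, concrete difference (not the whole story about training objectives) is the attention pattern. A minimal sketch, with hypothetical helper names: in a decoder-only model each token can only attend to earlier tokens, while an encoder like T5's lets every token see the whole prompt.

```python
def causal_mask(n):
    """Decoder-only style: row i may attend to columns 0..i."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


def bidirectional_mask(n):
    """Encoder style: every token may attend to every other token."""
    return [[1] * n for _ in range(n)]


tokens = ["A", "cat", "with", "black", "fur"]
causal = causal_mask(len(tokens))
full = bidirectional_mask(len(tokens))

# How many tokens each position can see under each scheme.
causal_visible = [sum(row) for row in causal]
full_visible = [sum(row) for row in full]
print(causal_visible, full_visible)
```

Under the causal mask the embedding for "cat" is computed before "black fur" has been seen, which is one reason early-token embeddings carry less of the full prompt.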

dry wave Nov 27, 2024, 3:29 PM

#

yes

bitter hearth Nov 27, 2024, 3:29 PM

#

its the same with text classifiers really, even now

dry wave Nov 27, 2024, 3:30 PM

#

I'm also not sure if we really need better text models. T5 is quite powerful already

bitter hearth Nov 27, 2024, 3:30 PM

#

I wish one of the big companies trained a bigger and better version of DeBERTa-V3-large

dry wave Nov 27, 2024, 3:30 PM

#

the issue is rather that the latent space of T5 is really huge and there is not enough training data for models like Flux to really learn from this latent space

bitter hearth Nov 27, 2024, 3:30 PM

#

for the task it is currently doing in Flux, T5 is easily good enough yeah

#

I think they could easily get 2-3 more years out of simply scaling DiTs

#

cos we are only 6 months in to that trend really

#

in the long run as hardware improves something like omnigen makes more sense in terms of scaling

#

the end game for transformers, just one giant transformer that does everything, across modalities

halcyon yarrow Nov 27, 2024, 3:50 PM

#

@dry wave @bitter hearth I found a project yesterday that essentially uses qwen as ollama would

https://github.com/ZHO-ZHO-ZHO/ComfyUI-Qwen/tree/main

here's the workflow:

https://github.com/ZHO-ZHO-ZHO/ComfyUI-Qwen/blob/main/QWEN2 WORKFLOWS/QWen2.5 %2B Flux.1 Dev Story【Zho】.json

I think this could be a good starting point to integrate the new QWENVL version, especially since it supports Qwen2-7B-Instruct, which is the exact model the other project uses

#

that node also outputs tokenizer, I'm thinking loading the model was half the battle that solves that piece

#

I was looking at this part of the code from the qwen2vl+flux project I think I would need to turn this piece into a node:

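def process_image(self, image):
    # Single-turn chat message: the image plus a fixed caption request.
    message = [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image},
                {"type": "text", "text": "Describe this image."},
            ]
        }
    ]
    # Render the chat template to a plain string; tokenization happens below.
    text = self.qwen2vl_processor.apply_chat_template(message, tokenize=False, add_generation_prompt=True)

    with torch.no_grad():
        inputs = self.qwen2vl_processor(text=[text], images=[image], padding=True, return_tensors="pt").to(self.device)
        # The project's Qwen2-VL wrapper returns hidden states plus a mask marking the image tokens.
        output_hidden_state, image_token_mask, image_grid_thw = self.qwen2vl(**inputs)
        # Keep only the image-token hidden states, reshaped to (1, num_image_tokens, hidden_dim).
        image_hidden_state = output_hidden_state[image_token_mask].view(1, -1, output_hidden_state.size(-1))

    return image_hidden_state, image_grid_thw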

so the node would input image from load image, and Qwen2 from that ModelLoader and then perform this operation to return the image_hidden_state

bitter hearth Nov 27, 2024, 4:08 PM

#

oh yeah I've been using qwen in comfy already there's quite a few node packs that do that

#

its useful, having an llm in comfy in node form

halcyon yarrow Nov 27, 2024, 4:10 PM

#

yeah it does produce a noticeable boost in quality but i think the whole shtick of qwen2vl is that rather than doing this:
input prompt > LLM > output text > Clip text encode
it can do this:
input image > VLM > output conditioning > merge with clip text encode

#

considering qwen2 is multimodal, using it only as an LLM and not taking advantage of its VLM capabilities is underusing it

bitter hearth Nov 27, 2024, 4:14 PM

#

you can't do this without training a new model to sit in between qwen and the Dit

pseudo owl Nov 27, 2024, 4:28 PM

#

dry wave I'm also not sure if we really need better text models. T5 is quite powerful al...

I mean gemma 2b is a better option, in the Sana paper, it outperforms it in everything basically. Its faster, uses less vram, and is better quality.

halcyon yarrow Nov 27, 2024, 4:31 PM

#

bitter hearth you can't do this without training a new model to sit in between qwen and the Di...

they already did that, it's a file called t5_embedder.pt part of the source

bitter hearth Nov 27, 2024, 4:38 PM

#

halcyon yarrow they already did that, it's a file called t5_embedder.pt part of the source

I am not sure what you are referring to

halcyon yarrow Nov 27, 2024, 4:40 PM

#

this thing dude: https://github.com/erwold/qwen2vl-flux/tree/main

I'm still thinking of how I can use that in comfy w/o running their code lol

bitter hearth Nov 27, 2024, 4:45 PM

#

wow these examples in their technical report are by far the best style transfer I have seen

#

with the exception of training checkpoints and loras

pseudo owl Nov 27, 2024, 4:58 PM

#

How many control tools does cogvideox get lol, https://github.com/PKU-YuanGroup/ConsisID
Example vids
Left pic(original face), Right vid is generated

#

bitter hearth Nov 27, 2024, 5:00 PM

#

pseudo owl How much control tools does cogvideox get lol, <https://github.com/PKU-YuanGroup...

this one is really good

#

I didn't realise cog was this good

#

I dismissed it initially cos it couldn't make R2D2 move forward, but that's a niche thing, I underestimated it

craggy crest Nov 27, 2024, 5:03 PM

#

bitter hearth I didn't realise cog was this good

;) it's euler all over again

bitter hearth Nov 27, 2024, 5:03 PM

#

lol yeah

pseudo owl Nov 27, 2024, 5:03 PM

#

original pic left, above right is ConsisId(uses cogvideox), below right is sota closed source at this task(vidu)

bitter hearth Nov 27, 2024, 5:05 PM

#

I prefer cog there

pseudo owl Nov 27, 2024, 5:08 PM

#

bitter hearth I prefer cog there

Vidu even has a paper, and quality seems really amazing but is closed source. Nice that open source is catching up.

#

https://arxiv.org/abs/2405.04233

bitter hearth Nov 27, 2024, 5:10 PM

#

yeah it may well be that they have caught up to sora now
when sora leaked this week, a few of the people who used it said that it was worse than kling/minimax

#

and so mochi is close to kling/minimax

#

so maybe close to sora

#

I feel the sora museum flythrough was particularly good but maybe heavy cherry picking was done there

pseudo owl Nov 27, 2024, 5:11 PM

#

bitter hearth I feel the sora museum flythrough was particularly good but maybe heavy cherry p...

which vid? mochi is amazing at drone flythroughs, better than closed source even imo.

bitter hearth Nov 27, 2024, 5:12 PM

#

https://www.youtube.com/watch?v=GcWucj5uexQ

YouTube

Kieran Gilmurray

Museum Fly-Through with Sora: Mind-Blowing!

Explore a stunning museum fly-through with Sora, an open AI, as your guide. This immersive experience will leave your mind-blown. I am convinced that education should be accessible to everyone, not just the rich, so allowing everyone to see the best of humanity is key. See the power of generative AI in action and discover new perspectives on ar...


pseudo owl Nov 27, 2024, 5:16 PM

#

ok let me try something similar with mochi. one impressive thing with sora is the 1min videos though, even closed source models don't reach that long. cogvideox does have a way to extend videos better than just extending with img2vid though, but quality is a bit lower

remote holly Nov 27, 2024, 5:33 PM

#

I don't know how it's possible to reach 1 min of video

remote holly Nov 27, 2024, 5:37 PM

#

pseudo owl original pic left, above right is ConsisId(uses cogvideox), below right is sota ...

I just found a new model that does the same https://youtu.be/PhlgC-bI5SQ?si=dKDq8ubIbgJ2smoy

YouTube

SHENGHAI YUAN

Introducing ConsisID: Identity Preserving Text to Video

Bring your identity to life with just your words. If you can describe it, you can now visualize it. Introducing Identity-Preserving Text-to-Video Generation, powered by ConsisID. Discover more at https://github.com/PKU-YuanGroup/ConsisID.


craggy crest Nov 27, 2024, 5:42 PM

#

remote holly I don't know how it's possible to reach 1 min of video

final frame technique

turbid grotto Nov 27, 2024, 6:27 PM

#

svdquants is a magic

#

a bit lower quality but more than 2x speedup

#

schnell from 21s down to 8s and vram consumption under 8gb

#

on rtx 3060

bitter hearth Nov 27, 2024, 6:28 PM

#

wow nice

#

its great yeah

#

I am not sure on Ada/Hopper if it is faster or slower than FP8-fast with torch.compile, or TensorRT
am currently testing that sort of thing
but for RTX 3060 SVDQuant is certainly the best choice

turbid grotto Nov 27, 2024, 6:31 PM

#

super cool, finally some speeds

#

would be even cooler to have dev + 8 step lora in svdq

#

and sd3.5

bitter hearth Nov 27, 2024, 6:35 PM

#

I didn't know it would work on 3060
```
We currently support only NVIDIA GPUs with architectures sm_86 (Ampere: RTX 3090, A6000), sm_89 (Ada: RTX 4090), and sm_80 (A100). See this issue for more details.
```

#

from here https://github.com/mit-han-lab/nunchaku

#

its good to see that it will work on more GPUs

turbid grotto Nov 27, 2024, 6:40 PM

#

bitter hearth I didn't know it would work on 3060 ```We currently support only NVIDIA GPUs wit...

thankfully, 3060 has compute capability 8.6 (sm_86)
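
A quick way to check a card against that README list, as a sketch: the supported set below is copied from the quote above (sm_80, sm_86, sm_89) and may well be out of date, and `is_supported` is a made-up helper. If PyTorch is installed, `torch.cuda.get_device_capability()` returns the `(major, minor)` tuple to pass in.

```python
# Architectures named in the quoted README text; treat as a snapshot, not current truth.
SUPPORTED_SM = {(8, 0), (8, 6), (8, 9)}


def is_supported(capability):
    """capability is a (major, minor) compute capability tuple, e.g. (8, 6)."""
    return tuple(capability) in SUPPORTED_SM


cards = {
    "RTX 3060 (sm_86)": (8, 6),
    "RTX 4090 (sm_89)": (8, 9),
    "A100 (sm_80)": (8, 0),
    "H100 (sm_90)": (9, 0),
}
for name, cap in cards.items():
    print(name, "supported" if is_supported(cap) else "not listed")
```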

bitter hearth Nov 27, 2024, 6:45 PM

#

ah nice

#

funnily the H100 doesn't

remote holly Nov 27, 2024, 7:14 PM

#

turbid grotto svdquants is a magic

Yeah, I was impressed when I saw the article, I saw that it could work with models like cogvideoX or mochi, I'm looking forward to cogvideoX support personally

#

but the roadmap didn't seem to have changed for weeks

#

I hope to see progress soon because it would be a shame not to go all the way, it looks really promising, I haven't seen anyone talking about it yet

turbid grotto Nov 27, 2024, 7:23 PM

#

remote holly I hope to see progress soon because it would be a shame not to go all the way, i...

Everyone missed it, I have been waiting for comfy support since release...
To be fair, comfy team had tons of work with day 1 support and developing app

#

But still, this needs attention, maybe worth making a post on the stablediffusion subreddit...

remote holly Nov 27, 2024, 7:24 PM

#

yeah

#

imagine the svd quant with ltxv lol

turbid grotto Nov 27, 2024, 10:55 PM

#

remote holly imagine the svd quant with ltxv lol

realtime videogame filter sponging

remote holly Nov 27, 2024, 10:55 PM

#

catwhaaa

#

Shaders on steroid

rapid pivot Nov 28, 2024, 1:12 AM

#

Compiling shaders please wait

low stone Nov 28, 2024, 1:12 AM

#

remote holly imagine the svd quant with ltxv lol

#

sd3 large turbo -> ltxvideo

errant dust Nov 28, 2024, 1:21 AM

#

turbid grotto svdquants is a magic

What is SVDquants? Link?

#

For any interested, this was posted in Ideogram Discord today:

mno — Today at 6:27 PM
Hi , Here is our Black Friday Sale :
50% off Ideogram Basic: just $4 for the first month or $42 for the first year.
50% off priority credit top-ups: only $2 for 100 priority credits that roll over beyond billing cycles, as long as your subscription remains active.
Use code BlackFriday at checkout. Taxes may apply. Offer ends Dec 5, 11:59 PM PT.

#

https://ideogram.ai/pricing

low stone Nov 28, 2024, 1:50 AM

#

#

more sd3 large turbo

remote holly Nov 28, 2024, 6:55 AM

#

low stone

Woah it's really smooth

low stone Nov 28, 2024, 6:56 AM

#

remote holly Woah it's really smooth

#

yeah, sd3 turbo is making good starter images and it's really running with it

remote holly Nov 28, 2024, 7:11 AM

#

Amazing

#

ltxv was trained on compressed videos. if you want to solve the problem of static videos in i2v, you can compress the reference image https://www.reddit.com/r/StableDiffusion/s/YHoknAp5vK

throttlekitty's comment on "Playing with the new LTX Video model, p...


#

The image needs to be blurred

halcyon yarrow Nov 28, 2024, 2:00 PM

#

SD35L

turbid grotto Nov 28, 2024, 2:52 PM

#

errant dust What is SVDquants? Link?

original repo, however, I failed to install that due to lack of knowledge:
https://github.com/mit-han-lab/nunchaku

But this fork was successful:
https://youtu.be/P46Cjw0MSzo?si=RpY7tySTzfo0AK_t
https://github.com/newgenai79/nunchaku

GitHub

GitHub - mit-han-lab/nunchaku: SVDQuant: Absorbing Outliers by Low-...

SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models - mit-han-lab/nunchaku

YouTube

NewGenAI

Flux on 8GB VRAM? Witness the Magic of Lightning-Fast Image Generat...

Nunchaku / SVDQuant
https://github.com/mit-han-lab/nunchaku

Installation guide
https://drive.google.com/file/d/1qtr00-PusMrbdNz5mBs7bCh_THg5VufG/view?usp=sharing


GitHub

GitHub - newgenai79/nunchaku: SVDQuant: Absorbing Outliers by Low-R...

SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models - newgenai79/nunchaku

twilit matrix Nov 28, 2024, 3:13 PM

#

Help me design a logo. It needs to include "Aether Link", be 320*320 pixels, and be under 3 MB

errant dust Nov 28, 2024, 3:14 PM

#

turbid grotto original repo, however, I failed to install that due to lack of knowledge: https...

Thanks. I guess the question is whether this retains quality of Dev, or is deeply lossy like fp8

turbid grotto Nov 28, 2024, 3:23 PM

#

errant dust Thanks. I guess the question is whether this retains quality of Dev, or is deepl...

indeed lossy, checkpoint is about 6gb - comparable to quant 4 in size

errant dust Nov 28, 2024, 3:41 PM

#

That would have a lot to do with the ability to work with less RAM as well as render speeds

turbid grotto Nov 28, 2024, 3:43 PM

#

their quantization method works with the encoder too, so it doesn't take much memory either

errant dust Nov 28, 2024, 3:45 PM

#

The thing is, I'm not really very open to big quality sacrifices. For me the entire point of a large model like SD3.5L or Flux is the large model quality. Once you start butchering that in the name of speed you also begin to drop to the quality of smaller models. If I wanted that, I'd just use a smaller model to begin with, and not some dumbed down large model.

turbid grotto Nov 28, 2024, 4:30 PM

#

errant dust The thing is, I'm not really very open to big quality sacrifices. For me the ent...

3.3x speedup for dev 20steps is crazy
From 100s to 28s
Maybe not that necessary if you have 3090\4090 but for me it is a huge deal, and it still does hands and smart so it is better than sdxl despite similar sizes
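
For the timings quoted in this thread, the arithmetic is simple enough to sketch. The numbers are the ones posted here (schnell 21s to 8s, dev 100s to 28s on an RTX 3060); `speedup` is just a made-up helper.

```python
def speedup(before_s: float, after_s: float) -> float:
    """How many times faster the new run is than the old one."""
    return before_s / after_s


schnell = speedup(21, 8)   # timings as reported above
dev = speedup(100, 28)
print(f"schnell: {schnell:.2f}x, dev: {dev:.2f}x")
```

By these wall-clock numbers dev comes out a little above the quoted 3.3x, which may just be rounding or a different measurement (per-step time vs the full run).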

errant dust Nov 28, 2024, 4:44 PM

#

I have a laptop 4060

#

I also know what I can get from all the models. I don't mass produce images. And when I make one, I want that quality I see. I can wait a bit to get the best

turbid grotto Nov 28, 2024, 4:46 PM

#

that is fine
I like experimenting

errant dust Nov 28, 2024, 4:46 PM

#

oh so do I, and I love trying the new tools, controlnets and more. Not to mention models and LoRAs

#

and I love SD3.5L too. Make no mistake. We may be talking Flux here, but they are all super cool

turbid grotto Nov 28, 2024, 4:49 PM

#

Same, I have been playing with 3.5m a lot

#

However I don't see any success with finetuning yet, which makes me think Stability has to step in

remote holly Nov 28, 2024, 4:51 PM

#

I think it's very helpful for video models

#

Because current quants cause a big quality loss

#

I read the paper and they said you don't need to train a lora again for the quants

#

The lora works with svdquant

lucid swift Nov 28, 2024, 5:15 PM

#

bitter hearth ```https://arxiv.org/abs/2411.05473```

that is an interesting idea. you put your positive prompt in the negative, let it generate, and then describe this image in the actual negative
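
The loop as described in that message can be sketched like this. It follows only the summary given here, not the paper itself: `generate(positive, negative)` and `caption(image)` are hypothetical callables you would wire to your own pipeline and captioner, and using an empty positive prompt in the first pass is my assumption.

```python
def generate_with_auto_negative(prompt, generate, caption):
    """Two-pass generation with an automatically derived negative prompt."""
    # Pass 1: the positive prompt goes in the negative slot (empty positive is an assumption).
    anti_image = generate("", prompt)
    # Describe what the model produced when steered away from the prompt.
    negative_text = caption(anti_image)
    # Pass 2: normal generation, with that description as the actual negative.
    final_image = generate(prompt, negative_text)
    return final_image, negative_text


# Stand-in functions so the sketch runs without any model.
calls = []

def fake_generate(positive, negative):
    calls.append((positive, negative))
    return f"image(pos={positive!r}, neg={negative!r})"

def fake_caption(image):
    return "blurry flat photo"

image, negative = generate_with_auto_negative("a cat with black fur", fake_generate, fake_caption)
print(negative)
```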

civic trail Nov 28, 2024, 5:33 PM

#

pseudo owl Nov 28, 2024, 5:37 PM

#

errant dust I also know what I can get from all the models. I don't mass produce images. And...

Yes, it's an increase in quality as well compared to bnb4bit and gguf4bit, seemed somewhat similar in quality to fp8, sometimes better. Uses less vram than gguf4bit and bnb4bit and is way faster. Supports lora loading as well, which gguf4bit/bnb4bit don't.

civic trail Nov 28, 2024, 6:30 PM

#

a608766da69ae44be34a0f6d72fc354ce5046233581ff29bd275cc372be6528b.png

halcyon yarrow Nov 28, 2024, 7:37 PM

#

@errant dust do you use ComfyUI?

errant dust Nov 28, 2024, 9:11 PM

#

I do

#

Sorry, was looking into BF deals on the horizon to buff the laptop

#

for which I already have a buyer

#

(my old one that is)

#

for better rendering and video editing

#

There are some exceptional deals right now. Will see if the one I have highlighted can be beat

#

HP Newest Victus 16.1 Ryzen 7 RTX 4070 Premium Gaming Laptop, 16.1" FHD 144Hz, AMD Ryzen 7 8845HS (Beats i7-1355U), NVIDIA GeForce RTX 4070, 64GB DDR5 RAM, 2TB SSD, HDMI, Wi-Fi 6, Windows 11 Pro for $1266

#

which is right at the limit of my budget but fine

craggy crest Nov 28, 2024, 9:31 PM

#

hint: use "small person" in your prompt

halcyon yarrow Nov 28, 2024, 10:04 PM

#

errant dust I do

I think if you’re in the hunt for a good deal on a laptop then do what I did. If you live in the states use best buy to look through the returned inventory. I got lucky I bought a laptop rated quality: fair when really I’d consider it excellent

For $1400 it’s i9 4070 FHD. It came with 1tb and 16gb but I’ve since upgraded it to 6tb and 32gb, msrp on it was like $2200

errant dust Nov 28, 2024, 10:15 PM

#

Actually, I found a more attractive deal already. And this one is it. It has a 20-core CPU, 4070, 32GB Ram (this last is a minimum requirement for my work), but also three magic words: "Cherry Mechanical Keyboard"

#

Only 1TB SSD, but I can add 1 I have spare anyhow. That much is easy

#

As I live off my writing more than anything, the KB is just magic to my ears.

#

For any curious: https://www.dell.com/en-us/shop/dell-laptops/g16-gaming-laptop/spd/g-series-16-7630-laptop/useghbts7630hcwn

Dell

Dell G16 Gaming Laptop - Intel Gaming Laptop with NVIDIA GPU | Dell...

Play your favorite games on a Dell 16 inch gaming laptop with the latest Intel alongside a powerful NVIDIA GPU. View on Dell.com.

halcyon yarrow Nov 28, 2024, 10:19 PM

#

Yeah sometimes it’s small things like that which turn into dealbreakers, for me I require a numeric pad won’t get a laptop without one

errant dust Nov 28, 2024, 10:20 PM

#

Believe it or not, I actually place my wireless mechanical KB on top of my laptop (my current one) for my writing. Heh.

#

It's a sight

halcyon yarrow Nov 28, 2024, 10:21 PM

#

Checking your laptop it’s newer cpu, same GPU, more ram, but I’d never get it unless there’s an option to upgrade the display, QHD is way too sucky

errant dust Nov 28, 2024, 10:22 PM

#

I will disagree. I had a 4k laptop and hated it. The pixel size is way too small natively, so you either increase the scaling a ton, or reduce resolution, in which case: why bother?

halcyon yarrow Nov 28, 2024, 10:22 PM

#

Sorry I’m mistaken, QHD is fine, FHD is not fine lol

errant dust Nov 28, 2024, 10:22 PM

#

ahhhh

#

This is QHD

halcyon yarrow Nov 28, 2024, 10:23 PM

#

And yes I agree, my previous laptop was 4K and I did not enjoy it, I did enjoy the touch screen tho I do miss that

#

Yeah my current laptop is QHD, mental derp

errant dust Nov 28, 2024, 10:24 PM

#

I also do a ton of video editing, so the RAM is not optional. The minute you have enough moving parts, clips, sound, and so on, insufficient RAM can be crippling

#

(Just YT stuff, but you'd be amazed at how much RAM the editors demand once you start micromanaging it enough)

halcyon yarrow Nov 28, 2024, 10:26 PM

#

Same goes for AI, 32gb helps a lot for making images quickly

errant dust Nov 28, 2024, 10:27 PM

#

I well know. 🙂 The 4070 is dual purpose. It also serves to run some of the large NNs I use in chess

#

Mechanical KB in laptop. Ahhhhhh...

#

lol

halcyon yarrow Nov 28, 2024, 10:31 PM

#

The difference between the 4070 and the next tier up being 16gb of vram is really expensive too in a laptop. Like it costs at least $1000 more for any laptop with 16gb vram. You can’t get one for under $2200 that was way out of my budget so I settled for 4070

errant dust Nov 28, 2024, 10:31 PM

#

Yeah, the 4080 has 12, and costs a nifty 500 more, and let's not even mention the 4090s with 16

halcyon yarrow Nov 28, 2024, 10:31 PM

#

Considering it’s Black Friday maybe you could get a 16gb one for real cheap? You should def check best buy they might even have that cherry keyboard one as a return

errant dust Nov 28, 2024, 10:32 PM

#

As much as I love the idea, it is simply outside my scope

#

well, I will absolutely be open to whatever comes my way on BF

#

not pulling trigger yet. Just lining up the prime options

#

I live in Brazil, but have family coming from US in ten days and staying here for a month, so....

halcyon yarrow Nov 28, 2024, 10:36 PM

#

Go to best buy.com and look for the open box items in their area and check that inventory

#

https://www.bestbuy.com/site/searchpage.jsp?_dyncharset=UTF-8&browsedCategory=pcmcat748300666044&id=pcat17071&iht=n&ks=960&list=y&qp=category_facet%3DLaptops~abcat0502000&sc=Global&st=pcmcat748300666044_categoryid%24cat00000&type=page&usc=All%20Categories

raw path Nov 29, 2024, 1:28 AM

#

I farted

limpid thunderBOT Nov 29, 2024, 3:12 AM

#

Thank you for using comcom analytics.
"comcom analytics" supports all community managers (moderators and server owners) by stats, visualization, and analytics.

If you have any questions, feel free to ask us!
Your dashboard
Help
Support server

Other languages
en: help
ja: help Japanese

errant dust Nov 29, 2024, 3:57 AM

#

errant dust For any curious: https://www.dell.com/en-us/shop/dell-laptops/g16-gaming-laptop/...

@halcyon yarrow I ended up finding this for $1078 (tax included)

bitter hearth Nov 29, 2024, 4:15 AM

#

that seems nice yeah

bitter hearth Nov 29, 2024, 4:54 AM

#

I feel like with GGUF smaller hardware is now viable

#

I went from renting L40s to renting 3060s

errant dust Nov 29, 2024, 12:30 PM

#

You gave me the idea though. I found it in the Dell Outlet store which has some insane BF discount coupons (BLACKFRIDAY17).

errant dust Nov 29, 2024, 4:08 PM

#

The fact the Outlet also does not charge tax made a big diff

craggy crest Nov 29, 2024, 6:48 PM

#

errant dust You gave me the idea though. I found it in the Dell Outlet store which has some ...

are you in austin?

errant dust Nov 29, 2024, 7:04 PM

#

Rio de Janeiro

craggy crest Nov 29, 2024, 7:39 PM

#

errant dust Rio de Janeiro

there's a dell outlet down there?

errant dust Nov 29, 2024, 9:16 PM

#

Dell Outlet is a part of the Dell website and store

#

Just go to:

#

https://outlet.us.dell.com/GDOOnline/Online/InventorySearch?c=us&cs=28&l=en&s=dfb&buid=11&brandid=2801

spiral geyser Nov 29, 2024, 9:18 PM

#

errant dust Rio de Janeiro

Hola

errant dust Nov 29, 2024, 9:18 PM

#

Select the options you want, put the item in the Checkout and apply the coupon BLACKFRIDAY17. the price will be severely reduced AND have no tax appended

mortal mesa Nov 29, 2024, 9:23 PM

#

Brazil has a huge economy, if a north america company with international reach isnt in Brazil it should be considered a failure on the international front

errant dust Nov 29, 2024, 9:23 PM

#

Dell exists in Brazil just fine. The problem is that all electronics are severely marked up

#

Here in Brazil the machine I paid 1078 dollars for would come out to some 3500-4000

spiral geyser Nov 29, 2024, 9:24 PM

#

errant dust Here in Brazil the machine I paid 1078 dollars for would come out to some 3500-4...

Wow

errant dust Nov 29, 2024, 9:26 PM

#

If you can find it at all. Top end tech is a rarity. But it makes sense. The markup is not even taking into account relative incomes

#

It isn't a big issue for me though. Aside from just bringing things back myself, I have family regularly staying here. Such as next week, heh.

#

Fam coming in from US that is

bitter hearth Nov 30, 2024, 12:43 AM

#

its good to get a GPU I think

#

cloud is too tiring

#

huggingface and civit download speeds are not always the same speed on one day as on the next day 🤔

#

but getting private storage like google drive or backblaze is tricky cos it takes money that could have gone to GPUs

craggy crest Nov 30, 2024, 2:51 AM

#

errant dust Dell Outlet is a part of the Dell website and store

okay. cause the only dell outlet i was aware of was the store next to the dell plant in austin

bitter hearth Nov 30, 2024, 2:51 AM

#

is texas becoming a big tech place now like california is?

#

I don't follow US stuff too closely but I heard that

craggy crest Nov 30, 2024, 2:53 AM

#

bitter hearth is texas becoming a big tech place now like california is?

has been for years

#

bitter hearth Nov 30, 2024, 2:55 AM

#

ah okay

crisp pine Nov 30, 2024, 7:09 AM

#

help

opal sedge Nov 30, 2024, 7:47 AM

#

help

lean totem Nov 30, 2024, 11:37 AM

#

Help me generate a colored icon that represents the generation of asset inventory tables and the printing of asset labels

short thicket Nov 30, 2024, 3:07 PM

#

2024-11-30_False_Flux1-dev_Flux1-dev_184947620677641_0_deis_beta_30_5.0_1.0_44.png

#

2024-11-30_False_Flux1-dev_Flux1-dev_847078589792104_0_deis_beta_30_5.0_1.0.png

short thicket Nov 30, 2024, 4:52 PM

#

2024-11-30_False_Flux1-dev_Flux1-dev_1021068366819026_0_deis_beta_30_5.0_1.0.png

#

2024-11-30_False_Flux1-dev_Flux1-dev_1021068366819032_0_deis_beta_30_5.0_1.0.png

#

2024-11-30_False_Flux1-dev_Flux1-dev_1021068366819032_0_deis_beta_30_5.0_1.0_09.png

#

2024-11-30_False_Flux1-dev_Flux1-dev_1021068366819032_0_deis_beta_30_5.0_1.0_14.png

#

2024-11-30_False_Flux1-dev_Flux1-dev_1021068366819035_0_deis_beta_30_5.0_1.0_02.png

#

2024-11-30_False_Flux1-dev_Flux1-dev_1021068366819040_0_deis_beta_30_5.0_1.0.png

short thicket Nov 30, 2024, 6:18 PM

#

That's Mangled Merge Flux V1 + a lora I am in the middle of training for macro photography/etymology

mortal kite Nov 30, 2024, 6:29 PM

#

anyone use flux outpaint much? I'm not sure I am understanding how to use it correctly

short thicket Nov 30, 2024, 6:29 PM

#

2024-11-30_False_Flux1-dev_Flux1-dev_1021068366819043_0_deis_beta_30_5.0_1.0_01.png

mortal kite Nov 30, 2024, 6:32 PM

#

flux doesn't seem to like outpainting only in a single direction or something 😄

short thicket Nov 30, 2024, 6:49 PM

#

2024-11-30_False_Flux1-dev_Flux1-dev_1021068366819045_0_deis_beta_30_5.0_1.0_08.png

#

2024-11-30_False_Flux1-dev_Flux1-dev_1021068366819045_0_deis_beta_30_5.0_1.0_11.png

short thicket Nov 30, 2024, 7:23 PM

#

Entomology sorry

#

working on it now. In the meantime...

2024-11-30_False_Flux1-dev_Flux1-dev_1021068366819045_0_deis_beta_30_5.0_1.0_16.png

2024-11-30_False_Flux1-dev_Flux1-dev_1021068366819045_0_deis_beta_30_5.0_1.0_19.png

2024-11-30_False_Flux1-dev_Flux1-dev_1021068366819046_0_deis_beta_35_4.0_1.0.png

2024-11-30_False_Flux1-dev_Flux1-dev_1021068366819046_0_deis_beta_35_4.0_1.0_01.png

2024-11-30_False_Flux1-dev_Flux1-dev_1021068366819046_0_deis_beta_35_4.0_1.0_04.png

#

2024-11-30_False_Flux1-dev_Flux1-dev_1021068366819046_0_deis_beta_35_4.0_1.0_05.png

#

This one is with my lora at 100%

"a macro photograph, a chubby lady that looks like a lady bug and she looks stylish and rich"

2024-11-30_False_Flux1-dev_Flux1-dev_1021068366819046_0_deis_beta_35_4.0_1.0_06.png

#

This is the base model alone with the lora at 0%. Working on 45%.

2024-11-30_False_Flux1-dev_Flux1-dev_1021068366819046_0_deis_beta_35_4.0_1.0_07.png

#

I didn't see much of a change at 45% but here is 70% lol. I'm gonna try 80.

2024-11-30_False_Flux1-dev_Flux1-dev_1021068366819046_0_deis_beta_35_4.0_1.0_10.png

#

80%

2024-11-30_False_Flux1-dev_Flux1-dev_1021068366819046_0_deis_beta_35_4.0_1.0_11.png

#

Looking for that sweet spot. Here's 93%. LOL

2024-11-30_False_Flux1-dev_Flux1-dev_1021068366819046_0_deis_beta_35_4.0_1.0_15.png

#

getting close at 93.9%

2024-11-30_False_Flux1-dev_Flux1-dev_1021068366819046_0_deis_beta_35_4.0_1.0_19.png

#

yeah, It's interesting to see the transition though.

short thicket Nov 30, 2024, 8:23 PM

#

2024-11-30_False_Flux1-dev_Flux1-dev_1021068366819047_0_deis_beta_35_4.0_1.0_03.png

civic trail Nov 30, 2024, 9:44 PM

#

08d407d024b1e90b9f60dc32a8c06e67e0def82cfd72a6f7d020ec4473657275.png

#

short thicket Nov 30, 2024, 10:45 PM

#

2024-11-30_False_Flux1-dev_Flux1-dev_218324694065017_0_deis_beta_35_4.0_1.0.png

short thicket Nov 30, 2024, 11:19 PM

#

2024-11-30_False_Flux1-dev_Flux1-dev_860176923366204_0_deis_beta_35_4.0_1.0.png

grim crest Dec 1, 2024, 5:37 PM

#

craggy crest has been for years

austin texas is. the rest of texas has been the electronics components and semiconductor center for 60 years. see texas instruments and mouser.

grim crest Dec 1, 2024, 5:40 PM

#

short thicket

craggy crest Dec 1, 2024, 6:36 PM

#

grim crest austin texas is. the rest of texas has been the electronics components and semic...

dallas is where all the major computer companies are at. Austin is where Dell and IBM are/were - and steve jackson games

grim crest Dec 1, 2024, 6:36 PM

#

there is a lot more going on in Austin than IBM/DELL.

#

also oracle and apple

#

HPE is in dallas, austin, houston

neon hawk Dec 1, 2024, 7:08 PM

#

Create an image for a clothing store called branded_items that should show silhouettes of clothing frozen in ice cubes.

mortal mesa Dec 1, 2024, 7:16 PM

#

IBM is in NY though

#

always were

pseudo owl Dec 1, 2024, 7:43 PM

#

New shuttle 3.1 is great, has more detail than 3.0 for sure. This is 4 steps, 2048x1024 with just Euler discrete

mortal mesa Dec 1, 2024, 7:52 PM

#

look closer, its kinda not good

#

the lines, ive had them pop out for unknown reasons also

weary crystal Dec 1, 2024, 8:00 PM

#

1448x1448 was the recommendation for maximal generating size

#

Not sure, outpainting would be a option too

mortal mesa Dec 1, 2024, 8:06 PM

#

look at the sky when you open in browser

#

shuttle 3/3.1

pseudo owl Dec 1, 2024, 8:13 PM

#

This is natively generated, I didn't upscale it.

mortal mesa Dec 1, 2024, 8:13 PM

#

4096x4096 expect problems

pseudo owl Dec 1, 2024, 8:14 PM

#

It can generate as high as 2048x2048 but you get weird artifacts, shuttle3/3.1 is better at native generation at higher-res.

mortal mesa Dec 1, 2024, 8:14 PM

#

my shuttle starting point is 1536x1024

pseudo owl Dec 1, 2024, 8:24 PM

#

Yeah also shuttle needs really low steps so its very fast since its a finetune of schnell, 4 is usually a good spot.

mortal mesa Dec 1, 2024, 8:24 PM

#

i think i found a comfyui bug maybe, didnt look if known, but like with the shuttle models if i gen on one and switch to the other it crashes, i think hitting the unload buttons prevents that maybe

weary crystal Dec 1, 2024, 8:33 PM

#

Well i tried the shuttle model 7 steps 11 steps 60 steps and 100 steps. I love how all steps above ten work as refiner

mortal mesa Dec 1, 2024, 8:46 PM

#

this is a workflow i modified for shuttle 3, i love it, doesn't really give the same results with 3.1, almost seems like it needs more steps now, its a handful of 2 steps that ends in a 8MP image

📎 Shuttle-2step-Upscale.json

#

pseudo owl Dec 1, 2024, 9:05 PM

#

I mean not too bad for 1step with Flux.1 schnell, euler discrete, 2048x1024

pseudo owl Dec 1, 2024, 10:32 PM

#

mortal mesa this is a workflow i modified for shuttle 3, i love it, doesn't really give the...

Yeah shuttle 3.1 seems to need 4+ steps, shuttle3.0 works with as low as 1step

craggy crest Dec 2, 2024, 3:19 AM

#

#

#

#

bitter hearth Dec 2, 2024, 6:10 AM

#

mortal mesa

nice, I use the same TTP_Toolset workflow I just arrange it differently LOL

#

it works the same, I just replaced the tiling method with ComfyUI_SimpleTiles, added ColorPallete from ComfyUI-APQNodes, some post-processing from ComfyUI_LayerStyle and the new depth map thing

#

I removed the texture detailer section which was probably a mistake

supple socket Dec 2, 2024, 6:27 AM

#

hey.. Do you guys have advice on how I can extract patterns from images? I mean, which techniques should I use for it? For example, in this image I liked the t-shirt pattern

bitter hearth Dec 2, 2024, 6:29 AM

#

you are lucky because the best tool for this came out recently: https://github.com/kaibioinfo/ComfyUI_AdvancedRefluxControl
if you scroll down to "Masked Conditioning Images" you can see he used it to extract the floral pattern from the clothes
it's essentially exactly what you are asking for

supple socket Dec 2, 2024, 6:31 AM

#

bitter hearth you are lucky because the best tool for this came out recently ```https://github...

you are hero

bitter hearth Dec 2, 2024, 6:32 AM

#

no problem 🙂

supple socket Dec 2, 2024, 6:34 AM

#

bitter hearth no problem 🙂

ahh i only use with flux right

bitter hearth Dec 2, 2024, 6:40 AM

#

yeah

supple socket Dec 2, 2024, 6:48 AM

#

bitter hearth yeah

this hurts my brain 😄

bitter hearth Dec 2, 2024, 6:49 AM

#

same

#

so slow

supple socket Dec 2, 2024, 6:49 AM

#

i couldn't even manage to run

#

i got weights error for sigclip

bitter hearth Dec 2, 2024, 6:50 AM

#

what GPU do you have

supple socket Dec 2, 2024, 6:50 AM

#

bitter hearth what GPU do you have

3x 3090

#

bitter hearth Dec 2, 2024, 6:53 AM

#

ah yeah 3090 isn't great it can't take full advantage of FP8 matmul

supple socket Dec 2, 2024, 6:55 AM

#

bruuh

#

i wonder is there any sd alternatives

bitter hearth Dec 2, 2024, 6:55 AM

#

would try downloading clip vision again

supple socket Dec 2, 2024, 6:55 AM

#

but idk the keyword for research

bitter hearth Dec 2, 2024, 6:55 AM

#

just search for diffusion

supple socket Dec 2, 2024, 6:56 AM

#

bitter hearth would try downloading clip vision again

i downloaded from manager

bitter hearth Dec 2, 2024, 6:56 AM

#

would recommend not using manager

supple socket Dec 2, 2024, 6:56 AM

#

bitter hearth would recommend not using manager

ah

bitter hearth Dec 2, 2024, 6:57 AM

#

and just using git and wget

supple socket Dec 2, 2024, 6:58 AM

#

bitter hearth and just using git and wget

now im trying to download with wget sigclip

bitter hearth Dec 2, 2024, 6:59 AM

#

also try several different sites

#

sadly it is very common for people to upload broken models

#

or to mess with the model in some way e.g. pruning

#

or heavily quantising without telling you

#

its kinda hard to get pure versions of models sometimes
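
One way to catch a broken or silently modified download, as a rough sketch: hash the file locally and compare it with the SHA-256 shown on the download page, when the host publishes one. The function names below are made up for illustration, and the expected hash is whatever you copy from the page yourself.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-GB checkpoints do not need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare the local hash with a hex digest copied from the download page."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A mismatch only tells you the file differs from the published one (truncated download, re-quantised or pruned upload); it does not say which copy is the "pure" one.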

supple socket Dec 2, 2024, 7:01 AM

#

i downloaded all models from huggingface but still same error

#

bruh

bitter hearth Dec 2, 2024, 7:02 AM

#

not sure, would ask on comfy discord or banodoco

craggy crest Dec 2, 2024, 7:38 AM

#

#

#

rain surge Dec 2, 2024, 8:25 AM

#

craggy crest

I like the style, looks a bit like book cover you could have on the old days glorp

bitter hearth Dec 2, 2024, 8:26 AM

#

really nice artstyle yeah

rain surge Dec 2, 2024, 8:28 AM

#

Really want to try SD3, but too lazy to setup another docker image, and it's seems not compatible with current ComfyUI setup i have (despite being latest) glorp

bitter hearth Dec 2, 2024, 8:30 AM

#

its rough setting up docker instances yeah

#

TBH most workflows are not model agnostic, its the reason why I have only used 3 models ever (SD 1.5 -> SDXL -> Flux)

supple socket Dec 2, 2024, 10:34 AM

#

is there any custom nodes to join images like top bottom left right

craggy crest Dec 2, 2024, 1:23 PM

#

supple socket is there any custom nodes to join images like top bottom left right

Explain please

rapid moon Dec 2, 2024, 3:00 PM

#

i was told that 3.5 large will fit on a 4090, but it can't load the model due to insufficient memory. why is that? is there something special to do?

rapid moon Dec 2, 2024, 3:39 PM

#

beware of this "supporter", it's a thief

gusty trail Dec 2, 2024, 4:21 PM

#

supple socket is there any custom nodes to join images like top bottom left right

I have a custom node that does that for a specific purpose. But it only joins images on the right and bottom

rapid moon Dec 2, 2024, 5:33 PM

#

anyone have python example how to run sd3.5large on 4090?

supple socket Dec 2, 2024, 5:44 PM

#

gusty trail I have custom node to do that on specific purpose. But it only join images on ri...

I've just created one for myself that joins on each edge

supple socket Dec 2, 2024, 5:47 PM

#

craggy crest Explain please

Do you own crystools

craggy crest Dec 2, 2024, 6:00 PM

#

supple socket Do you own crystools

i have it installed, if that's what you mean?

supple socket Dec 2, 2024, 6:01 PM

#

craggy crest i have it installed, if that's what you mean?

Your name

craggy crest Dec 2, 2024, 6:02 PM

#

supple socket Your name

oh. nope. not the developer

turbid grotto Dec 2, 2024, 6:19 PM

#

How is sd3.5 finetuning right now? Is it beginning to get figured out?

supple socket Dec 2, 2024, 6:22 PM

#

can i use 2 comfy on same gpu

#

and if i can how will it handle

gusty trail Dec 2, 2024, 6:30 PM

#

Why would you use two comfy? You could just use another tab

rapid moon Dec 2, 2024, 7:06 PM

#

do u have a code example?

#

i only see comfyUI

#

i dont use windows, i need it in python

#

i use sd commercially with a license, that's why i need it in code. where can i find that backend?

#

oh its in python also... didnt know that.. thanx! will try that

supple socket Dec 2, 2024, 7:18 PM

#

yea cuz queue using 100% of gpu

rapid moon Dec 2, 2024, 7:20 PM

#

ill just call the functions directly

#

i also run rest, anyway, ill take a look

#

ill probably call it from another python

#

blaaahhhhh RuntimeError: operator torchvision::nms does not exist

#

ill work it out... thanx!

#

no thats version stuff.. ill reinstall packages...

rapid moon Dec 2, 2024, 7:55 PM

#

@halcyon yarrow everything is perfect, you helped me alot! Thanx!

pseudo owl Dec 2, 2024, 9:24 PM

#

Oh nice, a way to enhance quality(more detail, diverse, and overall better) of mochi-1/open-sora/svd: https://junhahyung.github.io/STGuidance/

Spatiotemporal Skip Guidance for Enhanced Video Diffusion Sampling

real terrace Dec 2, 2024, 9:26 PM

#

I'm looking for a flux model that won't OOM my system and work decent in quality/speed, for 12 GB VRAM

bitter hearth Dec 2, 2024, 9:27 PM

#

this would be fine flux.1-lite-8B-alpha-Q4_K_S.gguf https://huggingface.co/city96/flux.1-lite-8B-alpha-gguf/tree/main

#

offload text encoders before loading DiT

real terrace Dec 2, 2024, 9:31 PM

#

bitter hearth this would be fine ```flux.1-lite-8B-alpha-Q4_K_S.gguf``` ```https://huggingface...

great ty! I was testing one you shared for another user and it took ages to load and generate, and then it literally went OOM in Comfy (didn't generate)

#

I'll test those ones, thanks

bitter hearth Dec 2, 2024, 9:31 PM

#

the memory management in Comfy has never worked for me on any GPU

#

not sure if I have always had a setting wrong

#

but its pretty rough

#

depends on the situation
TCD works well with eta and only 8 steps

#

for the most part eta is for 40 steps plus

#

I also think flux gets harmed by it

#

I think people who are not making photos like the results but for image quality I think it is a negative

#

oh there's no point in taking into account recommendations

#

yeah I remember

mortal mesa Dec 2, 2024, 10:05 PM

#

yup head exploding https://jixiaozhong.github.io/Sonic/ https://jixiaozhong.github.io/Sonic/results/generated_videos/6.png_female_long.wav_audio_compress.mp4

Sonic: Shifting Focus to Global Audio Perception in Audio-driven Po...

▶ Play video

bitter hearth Dec 2, 2024, 10:24 PM

#

I don't follow video but that's the best sound sync I have seen

#

might be okay yeah

#

wouldn't do it personally but it could work

#

0 eta before 40 steps

#

is the main thing

dusky thistle Dec 2, 2024, 10:26 PM

#

depends a bit on the model

#

and the step type

bitter hearth Dec 2, 2024, 10:27 PM

#

if you are able to schedule it then you also want the eta to decay

#

but that might be too complex for this project

dusky thistle Dec 2, 2024, 10:27 PM

#

flux can handle low steps with a bit of noise better

#

2m handles noise less well at low steps than 2s

#

yeah it's good to keep it real low for anything with only 20 steps, if not zero

#

30 is kinda where SDE starts to clearly become better than ODE

bitter hearth Dec 2, 2024, 10:29 PM

#

there is a bit of nuance especially for flux
you can do a little bit at low steps

dusky thistle Dec 2, 2024, 10:29 PM

#

40+, unleash the noise for sure

bitter hearth Dec 2, 2024, 10:29 PM

#

I don't know why but TCD lora with TCD sampler
can do high eta at like 8 steps and it goes fine
something very different is happening there

#

99% of my images have been with TCD, until flux era

#

its the acceleration lora with the highest complexity score

#

which is why I like it

#

its kinda not relevant now we have Shuttle Diffusion doing like 2560x2560 in 3 steps

dusky thistle Dec 2, 2024, 10:35 PM

#

yea that's def not enough steps for 2m

#

you can kinda get away-ish with res_2s at 10-15

bitter hearth Dec 2, 2024, 10:35 PM

#

are you willing to use turbo lora

#

it is compatible with other loras

#

bear in mind 15 steps of res_2s takes the time of 30 steps of res_2m
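
Rough sketch of that arithmetic, under an assumption not spelled out in the chat: res_2s is taken to be a two-stage single-step sampler (two model evaluations per step), and res_2m a multistep sampler that reuses the previous evaluation (one per step). The function name and the per-step counts are illustrative, not taken from the sampler code.

```python
def model_evals(steps: int, evals_per_step: int) -> int:
    """Total number of model (denoiser) calls for one sampling run."""
    return steps * evals_per_step


# Assumed costs: 2 evaluations per step for res_2s, 1 per step for res_2m.
res_2s_cost = model_evals(15, 2)
res_2m_cost = model_evals(30, 1)
print(res_2s_cost, res_2m_cost)
```

Under those assumptions both runs make the same number of model calls, which is why they take about the same time.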

craggy crest Dec 2, 2024, 10:42 PM

#

bitter hearth Dec 2, 2024, 10:42 PM

#

lol

craggy crest Dec 2, 2024, 10:44 PM

#

else (cross.eyes.=="true");

bitter hearth Dec 2, 2024, 10:45 PM

#

I think its an improvement

#

eta = 0.20 at 30-40 steps might be a bit rough

#

its good to experiment anyway

dusky thistle Dec 2, 2024, 10:46 PM

#

with the res samplers you can generally get away with that once you get to 30 steps

#

espec with flux

#

tbh the most important thing might be how many steps you run with a low noise level, as opposed to the total number of steps

#

insofar as mutations and crazy shit are concerned

#

if you cut the eta down after a certain number of steps it can be helpful

#

so like... if you're running 30 steps, steps 0-10 might have eta 0.5,

#

11-20 eta 0.2

#

21+ 0
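
That stepped schedule can be written out as a per-step list. This is only a sketch: `eta_schedule` and its arguments are hypothetical names, and how a given sampler node actually accepts per-step eta values is not shown here.

```python
def eta_schedule(total_steps, segments, default=0.0):
    """Build a per-step eta list from (last_step_inclusive, eta) segments.

    Segments must be sorted by their last step. Any step past the final
    segment gets `default`.
    """
    etas = []
    for step in range(total_steps):
        eta = default
        for last_step, value in segments:
            if step <= last_step:
                eta = value
                break
        etas.append(eta)
    return etas


# The example from the chat: steps 0-10 at 0.5, 11-20 at 0.2, 21+ at 0.
etas = eta_schedule(30, [(10, 0.5), (20, 0.2)])
print(etas)
```

The point of the shape is that the late, low-noise steps run with no added noise at all, whatever the total step count is.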

#

this would start at eta = 0.5 and follow that plot

#

and then it should fill in the rest with 0s

#

np

#

yeah thats in the newest version

#

its in the manager now too fyi

#

though honestly i think it's good to handle it with git cuz you can git reset --hard so easily to hop around versions

#

i don't do funny stuff with requirements.txt so you wont break dependencies doing that with my repo

bitter hearth Dec 2, 2024, 10:57 PM

#

it's kinda easier to use git anyway if you have docker

#

cos it just goes in the provisioning script

#

if you are updating or installing anything on a live server after provisioning script ends that's a mistake

#

main cloud advice I have is don't pay for storage

#

it works out better to download each time

bitter hearth Dec 2, 2024, 11:35 PM

#

civit download is so slow that it would be kinda hard to hit 1TB

mortal mesa Dec 2, 2024, 11:37 PM

#

i think there are also other types of bandwidth rules, my speed drops for chunks of time, then it's back to zippy for a bit

#

its either i get full saturation or 3mbit

#

ya so have i, but i have had extreme slow downs also. i dont know why, them, me, isp, who knows

rain surge Dec 3, 2024, 2:05 AM

#

turbid grotto How is sd3.5 finetuning right now? Is it begins figuring out?

Bit late, but better late than never,
There is a good guide
https://stabilityai.notion.site/Stable-Diffusion-3-5-fine-tuning-tutorial-11a61cdcd1968027a15bdbd7c40be8c6

Stability.ai on Notion

Stable Diffusion 3.5 Large Fine-tuning Tutorial

Interested in generating images like this from SD3.5 Large?

turbid grotto Dec 3, 2024, 2:38 AM

#

rain surge Bit late, but better late than never, There is a good guide https://stabilityai....

thanks, I had success with loras but I don't see big finetunes on civitai and some people say it is hard to tune too

dusky thistle Dec 3, 2024, 2:54 AM

#

sd35m, euler vs gauss-legendre SDE

rain surge Dec 3, 2024, 3:00 AM

#

turbid grotto thanks, I had success with loras but I don't see big finetunes on civitai and so...

SD3 Medium may be much easier to finetune performance wise

craggy crest Dec 3, 2024, 4:11 AM

#

unkempt warren Dec 3, 2024, 6:44 AM

#

#Midjourney

dusky thistle Dec 3, 2024, 8:30 AM

#

Wf should be embedded but yeah it's from the new version

#

https://en.wikipedia.org/wiki/Gauss–Legendre_method

Gauss–Legendre method

In numerical analysis and scientific computing, the Gauss–Legendre methods are a family of numerical methods for ordinary differential equations. Gauss–Legendre methods are implicit Runge–Kutta methods. More specifically, they are collocation methods based on the points of Gauss–Legendre quadrature. The Gauss–Legendre method based on s points ha...

muted dove Dec 3, 2024, 10:53 AM

#

I used Flux Redux and used a ComfyUI workflow as one of the 2 input images 😄

bitter hearth Dec 3, 2024, 11:26 AM

#

haha the noodles

#

that's so cool

#

legendary version of gaussian noise sounds cooler than what Gauss–Legendre actually is LOL

pseudo owl Dec 3, 2024, 12:39 PM

#

Amazing new text to video model, definitely better than mochi-1 and probably sota. It can even do text very accurately: https://huggingface.co/tencent/HunyuanVideo

tencent/HunyuanVideo · Hugging Face

bitter hearth Dec 3, 2024, 12:39 PM

#

are there samples

pseudo owl Dec 3, 2024, 12:40 PM

#

The size though, lol. 13b params and 8b text encoder

pseudo owl Dec 3, 2024, 12:40 PM

#

bitter hearth are there samples

Yes

#

https://aivideo.hunyuan.tencent.com

bitter hearth Dec 3, 2024, 12:41 PM

#

ah nice

pseudo owl Dec 3, 2024, 12:44 PM

#

They have GitHub examples too, the beginning samples are things that mochi can do, but in the end the quality is really amazing

https://github.com/tencent/HunyuanVideo

GitHub

GitHub - Tencent/HunyuanVideo

Contribute to Tencent/HunyuanVideo development by creating an account on GitHub.

bitter hearth Dec 3, 2024, 1:58 PM

#

its Hunyuan so it has potential

#

I don't follow anime stuff, is Illustrious the new big thing?

#

there's this NoobAI thing too

#

I think its Illustrious-based but not sure

#

it has V-pred and ZTSNR which is awesome

#

oh awesome there's a paper thanks

#

I'd like to try Pony and Illustrious for photorealistic sci-fi and fantasy
I'd have to hide my workflows for those so I don't get criticised for using Pony LOL

#

yeah 2MP is good for SDXL

#

does remix mean img to img

#

stock SD3.5 does mostly look better than stock flux yeah

pseudo owl Dec 3, 2024, 2:13 PM

#

Did you check the ghost example and the text painting example, mochi has issues with large amounts of motion where it has distortion.

A nice thing is that according to their paper, with a shifting scheduler and high shift, you can get good quality with even 10 steps.

bitter hearth Dec 3, 2024, 2:15 PM

#

ah okay yeah

#

I've been working on exactly that
this is for flux but will do for SD 3.5 as well

#

its not done yet it needs better depth map engine and more control nets

#

depth maps get way better than this but IDK how yet

#

the downstairs did okay

#

but the depth map broke for the back wall

#

and the side walls are going funny cos my depth map is not smooth

#

the gradient has lines and it thinks the lines are a physical feature

#

canny is way stronger yeah

#

the reason I prefer depth is that I want it to creatively add stuff

#

but only a little bit

#

yeah that helps as well

bitter hearth Dec 3, 2024, 4:24 PM

#

seen people refining flux with SD35L

#

interesting

#

when I am doing long generation workflows I always end with SD 1.5 unet, sent to SD3M VAE using latent interposer

#

i made this complex system where loras are categorized by an LLM and then an LLM picks the best categories for a given prompt

this is a great idea

#

the first part I'm saying I upscale with SD 1.5

#

do you know this thing https://github.com/city96/SD-Latent-Interposer

#

it lets you pass latents around

#

the crucial thing is to only use 16 channel VAEs

#

with this you can send SD 1.5, SDXL and Cascade images to SD3M VAE

#

also lets you get the image from Flux into the older models without using their VAE encode

bitter hearth Dec 3, 2024, 4:47 PM

#

its an improvement

rapid moon Dec 3, 2024, 7:07 PM

#

@halcyon yarrow just wanted to say thanks again! it is now in our production, and that's just WOW! it runs about 2x faster on a 4090 than on an A100 !!

#

i work at evmux.com , part of our service includes ai images for streams

#

and yeah, its on the backend now

#

i connect to comfy websocket for the progress and result and call the comfy api on /prompt by rest

#

basically like their UI is doing, but on the server side

#

yeah, works great

#

currently im just waiting for the 'executed' event, in the future will expand

#

im not saving, im using the preview and sending it as the result, i dont want images to be saved on the server

#

that shouldnt happen... we are running on many data centers

#

and losing it is not a big deal if its super rare, it will just retry
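
The "wait for the result, retry if it gets lost" idea could be sketched roughly like this. `submit` is a hypothetical coroutine standing in for the real request plus waiting on its result; the function name, timeout and attempt count are made-up values, not the production code.

```python
import asyncio


async def generate_with_retry(submit, attempts=3, timeout=60.0):
    """Await submit() up to `attempts` times; return its result or None."""
    for _ in range(attempts):
        try:
            result = await asyncio.wait_for(submit(), timeout=timeout)
        except asyncio.TimeoutError:
            continue  # the result never arrived in time, try again
        if result:  # a falsy result is treated as a failed run
            return result
    return None
```

A caller would pass a zero-argument coroutine function, so every attempt starts a fresh request.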

#

exactly

#

    # Needs: import asyncio, json, traceback
    async def handle_message(self, msg):
        data = json.loads(msg)
        # Only 'executed' events carry the output we are waiting for.
        if data.get('type') != 'executed':
            return
        data = data['data']
        fut = self.waiting_requests.get(data['prompt_id'])
        if fut is None:
            return
        try:
            output = data['output']
            # Audio flows report under 'audio', image flows under 'images'.
            key = 'audio' if 'audio' in output else 'images'
            fut.set_result(output[key][0]['filename'])
            asyncio.create_task(self.clear_request(data['prompt_id']))
        except Exception:
            print(traceback.format_exc())
            # Report failure only if no result was set before the error.
            if not fut.done():
                fut.set_result(False)

#

and also added the stable audio, maybe we will add it to

#

yeah, and then i see it in /api/view?filename={filename}&type=temp
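
A minimal sketch of building that preview URL safely, assuming the `filename` and `type` query parameters shown above. The helper name and the host/port in the usage note are placeholders.

```python
from urllib.parse import urlencode


def preview_url(base_url: str, filename: str, folder_type: str = "temp") -> str:
    """Build the URL for fetching a generated file by name."""
    # urlencode escapes spaces and other unsafe characters in the filename.
    query = urlencode({"filename": filename, "type": folder_type})
    return f"{base_url.rstrip('/')}/api/view?{query}"
```

For example, `preview_url("http://127.0.0.1:8188", name)` would point at a local server, if that is where it listens.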

#

i just call it as the ui does, with randomizing seed

#

yeah, copied it from the request in the browser and using it as template changing seed and prompt

#

    async def draw(self, prompt, neg_prompt, width, height):
        # Start from the workflow template copied from the browser request.
        request = create_draw_request(self.client_id)
        # Node ids '3', '6' and '7' are specific to this workflow.
        request['prompt']['3']['inputs']['width'] = width
        request['prompt']['3']['inputs']['height'] = height
        request['prompt']['6']['inputs']['text'] = prompt
        request['prompt']['7']['inputs']['text'] = neg_prompt
        res = await self.do_post('/api/prompt', request)
        # handle_message resolves this future when the 'executed' event arrives.
        fut = Future()
        self.waiting_requests[res['prompt_id']] = fut
        return res['prompt_id'], fut

#

create_draw_request is the template
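
A guess at what a template function like `create_draw_request` might look like. The trimmed workflow, the sampler node id `'5'` and the seed range are all hypothetical; a real template would be the full request copied from the browser.

```python
import copy
import random

# Hypothetical, heavily trimmed workflow; node '5' is an assumed sampler node.
TEMPLATE = {
    "prompt": {
        "3": {"inputs": {"width": 1024, "height": 1024}},
        "5": {"inputs": {"seed": 0}},
        "6": {"inputs": {"text": ""}},
        "7": {"inputs": {"text": ""}},
    }
}


def create_draw_request(client_id: str) -> dict:
    """Return a fresh copy of the template with a random seed."""
    request = copy.deepcopy(TEMPLATE)  # never mutate the shared template
    request["client_id"] = client_id
    request["prompt"]["5"]["inputs"]["seed"] = random.randrange(2**32)
    return request
```

The deep copy matters once several requests are in flight, since each caller edits its own nested dicts.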

#

i currently have workflow to keep 2 models in the vram, the sd3l and the stable audio, pretty cool

#

we need speed so we just use 1 flow on each 4090, loading the model takes time

#

We just need to generate pics so actually its the same flow

craggy crest Dec 3, 2024, 7:47 PM

#

@halcyon yarrow you did good :)

#

yes

pseudo owl Dec 3, 2024, 11:22 PM

#

Couldn't get Hunyuan vid to work with low vram, but other people's gens in the Banodoco discord are amazing. It's also better at nsfw than mochi/cogvideox from their testing.

#

and these are with fp8 quantized and lower-res versions if I'm not mistaken.

#

This is what it produced with the will smith eating spaghetti prompt, and this is again with the fp8 quantized and lower-res versions. Doesn't know who will smith is, but the spaghetti eating is near-perfect. Even with quantization, it produces better quality than full quality Mochi-1 from genmo with an upscaler.

Hope some more optimized code comes soon.

pseudo owl Dec 3, 2024, 11:44 PM

#

16gb lol but I didn’t try with low vram, maybe that could work

rapid pivot Dec 4, 2024, 12:08 AM

#

pseudo owl This is what it produced with the will smith eating spaghetti prompt and this is...

It's always a gamble to click on ai videos of people eating

#

sadcat

pseudo owl Dec 4, 2024, 12:31 AM

#

rapid pivot It's always a gamble to click on ai videos of people eating

It knows Donald trump at least

rapid pivot Dec 4, 2024, 12:35 AM

#

sadcat

rapid pivot Dec 4, 2024, 12:35 AM

#

pseudo owl It knows Donald trump at least

What is this model called

pseudo owl Dec 4, 2024, 12:37 AM

#

rapid pivot What is this model called

New open source video gen model that came today: https://huggingface.co/tencent/HunyuanVideo

The only problem is that it’s very slow locally since it’s not fully optimized yet and is 13b params with an 8b llm text encoder. Requires 24gb vram right now, but probably will go down.

rapid pivot Dec 4, 2024, 12:37 AM

#

To generate longer videos do you need more vram or only take longer?

summer bloom Dec 4, 2024, 12:38 AM

#

Using what method

pseudo owl Dec 4, 2024, 12:40 AM

#

rapid pivot To generate longer videos do you need more vram or only take longer?

5sec is the real limit but it should be theoretically possible to extend it, Kijai the goat made a way to extend CogVideoX vids from 6sec to as long as you want.

But short answer, no, not right now.

You could also probably extend with image2vid but that’s not out yet, I believe it was the end of this month in their plan.

rapid pivot Dec 4, 2024, 12:44 AM

#

pseudo owl 5sec is the real limit but it should be theoretically possible to extend it, Kij...

So it would just take longer if it was possible

#

I was wondering if the AI, as it generates more frames, has to keep what came before in its memory or something

bitter hearth Dec 4, 2024, 1:10 AM

#

a lot of methods to extend AI video are just starting a second run through the model

#

in those situations it has no "memory" of what came before, just a keyframe it was given
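
A toy sketch of that "second run from a keyframe" approach. `generate_clip` is a hypothetical stand-in for a real video model call, and frames are just strings so the chaining is visible.

```python
def generate_clip(prompt, keyframe=None, num_frames=5):
    """Hypothetical stand-in for one independent run of a video model."""
    first = keyframe if keyframe is not None else f"{prompt}:start"
    # Every later frame depends only on this run's inputs, not earlier runs.
    return [first] + [f"{first}>{i}" for i in range(1, num_frames)]


def extend_video(prompt, segments=3, num_frames=5):
    """Chain independent runs, passing only the last frame forward."""
    video = generate_clip(prompt, num_frames=num_frames)
    for _ in range(segments - 1):
        clip = generate_clip(prompt, keyframe=video[-1], num_frames=num_frames)
        video.extend(clip[1:])  # drop the duplicated keyframe
    return video
```

Nothing except `video[-1]` crosses between runs, which is why motion and identity can drift from one segment to the next.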

sacred jewel Dec 4, 2024, 1:58 AM

#

pseudo owl It knows Donald trump at least

It also knows fork and spaghetti 😛

bitter hearth Dec 4, 2024, 2:24 AM

#

I think its too early for video really

craggy crest Dec 4, 2024, 3:25 AM

#

#

#

#

craggy crest Dec 4, 2024, 4:28 AM

#

#

#

#

sullen moss Dec 4, 2024, 10:04 AM

#

I no longer have time to test everything, let alone conduct full-scale, in-depth experiments.

#

There are now more video generators than image generators.

muted dove Dec 4, 2024, 10:28 AM

#

I'm having too many problems trying to install the required packages to get this to work (mainly triton), so I gave up 😦

#

Not sure if that's because it's limited in the Windows portable version...?

sage burrow Dec 4, 2024, 10:59 AM

#

does anyone know if sd3.5 large knows japanese?

sage burrow Dec 4, 2024, 11:36 AM

#

#

pseudo owl Dec 4, 2024, 12:42 PM

#

bitter hearth I think its too early for video really

Honestly, I think it’s kind of the exact opposite. We have so many options, and great models.

CogVideoX is like sd1.5 of the video models, with so much support and control(Lora’s, dimensionx, control net, fun models, multiple aspect ratios, rf-inversion, rf-edit, image2vid, trajectory, consisid)

Ltxv is crazy fast, and now we know some hacks to make it great quality as well. It takes like 10sec to generate a 5sec video.

Also, mochi is already sd3.5 large prompt following level and generates excellent videos. Mostly for realistic and cartoonish style.

Hunyuan vid seems to be trained on many styles; surprisingly for a video model, it knows lots of art styles/anime styles/cartoon styles, and is excellent quality.

#

Hunyuan video is surprisingly flexible too, it was trained on 5sec videos and a specific res, but it can do lower res, and even 10sec videos

This is a pretty dramatic gen 😆

remote holly Dec 4, 2024, 12:54 PM

#

pseudo owl Hunyuan video is surprisingly flexible too, it was trained on 5sec videos and a ...

This model is amazing, I hope one day there is a way to reduce memory usage and increase inference speed

#

I can't wait to see ltx v paper to discover what they do for the optimization

pseudo owl Dec 4, 2024, 12:58 PM

#

remote holly I can't wait to see ltx v paper to discover what they do for the optimization

Yeah same, I’m also waiting for ltxv1.0.

There was a trick to increase quality, by encoding the image with a specific video encoder and decoding so it had a video film grain. That would improve quality considerably for Ltxv.

remote holly Dec 4, 2024, 12:58 PM

#

pseudo owl Yeah same, I’m also waiting for ltxv1.0. There was a trick to increase quality,...

Is that the new ltg paper ?

pseudo owl Dec 4, 2024, 1:00 PM

#

remote holly Is that the new ltg paper ?

No that was just a trick the ltxv team found since Ltxv was trained on such videos mostly with that specific video film grain. So it results in better quality.

remote holly Dec 4, 2024, 1:00 PM

#

Ho , that's smart

#

i tried stg in comfyui with ltx , the quality increase is amazing

#

but i can't do img2vid

pseudo owl Dec 4, 2024, 1:14 PM

#

Why?

remote holly Dec 4, 2024, 1:14 PM

#

i dont know how to modify nodes for i2v

#

but you can probably do i2v without issues

muted dove Dec 4, 2024, 1:45 PM

#

I probably followed the same instructions. I got as far as trying to install the Triton wheel, but I keep getting the same error: ERROR: triton-3.1.0-cp39-cp39-win_amd64.whl is not a supported wheel on this platform.
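
One common cause of "not a supported wheel on this platform" is a mismatch between the wheel's Python tag (`cp39` means CPython 3.9) and the interpreter pip runs under. Whether that is the cause here is a guess; this small check compares the two. The helper names are made up, and the `cp` prefix assumes CPython.

```python
import sys


def wheel_python_tag(wheel_name: str) -> str:
    """Extract the Python tag (e.g. 'cp39') from a wheel filename."""
    # Wheel names end with: ...-{python tag}-{abi tag}-{platform tag}.whl
    return wheel_name[: -len(".whl")].split("-")[-3]


def interpreter_tag() -> str:
    """Tag of the interpreter running this script, e.g. 'cp311' (CPython assumed)."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"


if __name__ == "__main__":
    wheel = "triton-3.1.0-cp39-cp39-win_amd64.whl"
    print("wheel built for: ", wheel_python_tag(wheel))
    print("this interpreter:", interpreter_tag())
```

It only tells you something if it is run with the same Python that ComfyUI itself uses.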

bitter hearth Dec 4, 2024, 1:45 PM

#

doing the setup for stuff is always super hard

#

I almost never manage to get everything setup to reach maximum flux speed on a server
I've only managed it like twice

muted dove Dec 4, 2024, 1:46 PM

#

I have CUDA and VS installed, the path is set for cl.exe, but still no luck.

bitter hearth Dec 4, 2024, 1:48 PM

#

12.6 is quite a spicy CUDA version, I mostly see 12.1 for diffusion stuff

#

it might be okay though

#

sage is big boost sometimes

#

mostly for video

#

much better yeah

#

fp32 files need double vram
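
The "double" follows from bytes per parameter. A back-of-the-envelope sketch for the weights alone (activations, text encoder and VAE come on top); the 13B figure is only the parameter count mentioned earlier in the chat, used as an example.

```python
# Bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "fp8": 1}


def weight_gib(num_params: float, dtype: str) -> float:
    """Approximate size of the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    for dtype in ("fp32", "fp16", "fp8"):
        print(f"13B params in {dtype}: {weight_gib(13e9, dtype):.1f} GiB")
```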

#

filled vram

muted dove Dec 4, 2024, 2:06 PM

#

Trying, but it's slow going... Building wheel for flash_attn (setup.py) ... -

#

Does it do i2v as well then?

#

What?!?!?!?!

#

https://tenor.com/view/1for-all-1for-all-dnd-deerstalker-pictures-dungeons-and-dragons-faint-gif-21535512

Tenor

#

https://tenor.com/view/ticking-clock-saw-time-is-running-time-flies-by-gif-21902124

Tenor

bitter hearth Dec 4, 2024, 2:13 PM

#

good small details versus good composition

#

is the video good

remote holly Dec 4, 2024, 2:19 PM

#

😔

bitter hearth Dec 4, 2024, 2:26 PM

#

pseudo owl Honestly, I think it’s kind of the exact opposite. We have so many options, and ...

I mostly meant cost-wise
quality-wise I think mochi and Hunyuan are there now

#

Flux used to be slow for me but with Nunchaku and 2-step Shuttle, Flux is fast now
but with video I'd go back to being slow again

pseudo owl Dec 4, 2024, 3:03 PM

#

bitter hearth I mostly meant cost-wise quality-wise I think mochi and Hunyuan are there now

Yeah that’s an issue but it’s kinda understandable because you basically have to generate like a 100 frames for a 5 second video. I think we definitely need low step models now, like even 8step would be amazing.
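
The arithmetic behind that, as a tiny sketch. The frame rates and the 40-step baseline are assumptions chosen for illustration, not figures from any specific model.

```python
def frames_needed(seconds: float, fps: int) -> int:
    """Number of frames in a clip of the given length."""
    return round(seconds * fps)


def speedup_from_fewer_steps(steps_before: int, steps_after: int) -> float:
    """Rough speedup if time scales linearly with the number of sampling steps."""
    return steps_before / steps_after


if __name__ == "__main__":
    print(frames_needed(5, 20), "frames at 20 fps")
    print(frames_needed(5, 24), "frames at 24 fps")
    print(speedup_from_fewer_steps(40, 8), "x fewer model evaluations")
```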

pseudo owl Dec 4, 2024, 4:48 PM

#

Nice, you made it work with low vram? I have to try it with that. It looks decent with that old retro look.

muted dove Dec 4, 2024, 5:31 PM

#

3.5hrs later...I GIVE UP!!

error: command 'C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.6\\bin\\nvcc' failed with exit code 4294967295
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for flash_attn
  Running setup.py clean for flash_attn
Failed to build flash_attn
ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (flash_attn)

pseudo owl Dec 4, 2024, 5:43 PM

#

Hunyuan is clearly trained on a lot of anime😆

#

I’m on my phone right now, and my pc is at my home so can’t run it right now locally.

#

Yeah it’s probably trained on a large amount of anime videos, it seems to beat any other video model in its diversity of styles.

Mochi is decent at anime but has very little styles, CogVideoX isn’t too great with anime.

Ltxv with stg guidance btw is pretty good too and really fast.

#

Prompt?

#

It has a different prompting style than most other models, it actually likes concise 1-2 sentences abt the vid.

#

And some Ltxv i2v examples with the video compression strategy and stg, it’s surprisingly good.

dull star Dec 4, 2024, 6:18 PM

#

pseudo owl And some Ltxv i2v examples with the video compression strategy and stg, it’s sur...

this is great

#

are you using fp8 or fp16 T5?

remote holly Dec 4, 2024, 7:26 PM

#

pseudo owl And some Ltxv i2v examples with the video compression strategy and stg, it’s sur...

Where you find the i2v workflow ?

#

How you added the stg ?

#

It's like skip layer guidance for video models

dull star Dec 4, 2024, 7:40 PM

#

the difference is massive

#

cfg vs stg

remote holly Dec 4, 2024, 7:48 PM

#

dull star

how many steps ?

dull star Dec 4, 2024, 7:49 PM

#

30 iirc

#

I love ltxv cause its super fast

#

not real time fast but still very very fast

remote holly Dec 4, 2024, 7:49 PM

#

yeah , i love also

#

i get this with stg

#

20 steps

dull star Dec 4, 2024, 7:56 PM

#

I love hunyuan but I don't want to generate MSDOS level resolutions just to run it

remote holly Dec 4, 2024, 7:57 PM

#

yeah

remote holly Dec 4, 2024, 8:14 PM

#

when you use stg it modifies the ltxv model : https://github.com/logtd/ComfyUI-LTXTricks/issues/6

GitHub

STG Image to Video · Issue #6 · logtd/ComfyUI-LTXTricks

Thank you for supporting STG so quickly! One question, I can't get it to work with image to video. The output video has nothing to do with the image, only the prompt. Have you any idea how to g...

dull star Dec 4, 2024, 8:19 PM

#

24GB

#

but im still impatient

#

so even if the quality is better I can't wait

remote holly Dec 4, 2024, 8:26 PM

#

you can resolve with "ltx add latent guide" node

brittle nexus Dec 4, 2024, 8:31 PM

#

remote holly when you use stg it modify the ltxv model : https://github.com/logtd/ComfyUI-LTX...

It's weird to say permanently. It modifies the loaded model, and it goes back to normal once you clear the loaded model

remote holly Dec 4, 2024, 8:32 PM

#

brittle nexus It weird to say permanently, it modifies the loaded model, it will back to norma...

my bad , sorry

brittle nexus Dec 4, 2024, 8:39 PM

#

If you use the "modify LTX model" node, even in another workflow, it will break I2V unless you clear the loaded model properly or restart Comfy. You may use the "ltx add latent guide" node to make it work again, but you will be using the modified model even if your current workflow doesn't use the modify model node.

remote holly Dec 4, 2024, 8:45 PM

#

why i get this error : `The expanded size of the tensor (17) must match the existing size (18) at non-singleton dimension 3. Target sizes: [1, 128, 1, 17, 23]. Tensor sizes: [128, 1, 18, 23]` ?

brittle nexus Dec 4, 2024, 8:54 PM

#

remote holly why i get this error : `The expanded size of the tensor (17) must match the exis...

one of the inputs is not compatible with the other, we need to see the workflow
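
A sketch of how a 17-vs-18 mismatch like that can come from two inputs with different pixel sizes. It assumes LTX-Video latents are 32x smaller than the pixels in each spatial dimension; the concrete sizes are hypothetical values picked to reproduce the numbers in the error.

```python
SPATIAL_FACTOR = 32  # assumed pixels per latent cell along each axis


def latent_grid(width: int, height: int) -> tuple[int, int]:
    """Return (rows, cols) of the latent grid for a frame of this size."""
    return height // SPATIAL_FACTOR, width // SPATIAL_FACTOR


def check_match(video_size, image_size):
    """Compare the latent grids of the video latent and the guide image."""
    a, b = latent_grid(*video_size), latent_grid(*image_size)
    return a == b, a, b


if __name__ == "__main__":
    # (width, height) pairs: an empty video latent vs. an encoded input image.
    print(check_match((736, 544), (736, 576)))
```

If the two grids differ, resizing the image to the video's width and height before encoding is the usual first thing to try.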

#

According to my first tests this perturbed attention for LTX is game changing in I2V

pseudo owl Dec 4, 2024, 9:19 PM

#

brittle nexus According to my first tests this perturbed attention for LTX is game changing in...

wow that looks amazing! you got it to work with perturbed?

brittle nexus Dec 4, 2024, 9:21 PM

#

pseudo owl wow that looks amazing! you got it to work with perturbed?

The perturbed is what you call STG, right?

pseudo owl Dec 4, 2024, 9:23 PM

#

brittle nexus The perturbed is what you call STG, right?

stg is different, but its actually similar(same?) thing as skip layer guidance from stability according to comfyanon.

brittle nexus Dec 4, 2024, 9:26 PM

#

pseudo owl stg is different, but its actually similar(same?) thing as skip layer guidance f...

In the LTX Tricks page says STG but the nodes are those, that's why i'm confused. But yes, i made this node to work with i2v

brittle nexus Dec 4, 2024, 9:45 PM

#

With and without

pseudo owl Dec 4, 2024, 9:45 PM

#

brittle nexus With and without

yeah with clearly seems much more realistic and better

brittle nexus Dec 4, 2024, 9:48 PM

#

pseudo owl yeah with clearly seems much more realistic and better

From what I understood from a previous post you also used it, right? The Javier Bardem one

#

The workflow is too big since i'm doing 4 videos at times and combining them but you just need this node

pseudo owl Dec 4, 2024, 10:00 PM

#

Lots of nice example workflows for ltxv including with stg: https://github.com/logtd/ComfyUI-LTXTricks/tree/main/example_workflows

brittle nexus Dec 4, 2024, 10:37 PM

#

📎 LTX_Pertubed_x4.json

brittle nexus Dec 4, 2024, 11:05 PM

#

You can't do side by side with i2v anyway. It changes the model for both

#

Once you use the "modify model" node, you need to flush the model or restart comfyui

brittle nexus Dec 4, 2024, 11:36 PM

#

goodjob

remote holly Dec 4, 2024, 11:42 PM

#

brittle nexus one inputs are not compatible with the other, we need to know the workflow

i dont get errors anymore , is this right ?

📎 ltxv-i2v-ltg.json

brittle nexus Dec 4, 2024, 11:44 PM

#

Without and With

remote holly Dec 4, 2024, 11:55 PM

#

i get more consistency with ltg

young blade Dec 5, 2024, 12:04 AM

#

brittle nexus Dec 5, 2024, 12:11 AM

#

remote holly i dont get errors anymore , is this right ?

Looks correct (except the height and width were inverted, you should connect those nodes' inputs), I don't know about the performance of the fp8 clip or using only 20 steps

remote holly Dec 5, 2024, 12:12 AM

#

thanks , i corrected this

turbid grotto Dec 5, 2024, 12:26 AM

#

https://huggingface.co/SG161222/RealVis_Medium_1.0b
sd3.5 medium finetune

SG161222/RealVis_Medium_1.0b · Hugging Face

remote holly Dec 5, 2024, 12:27 AM

#

ltg seems to get more motion on video

turbid grotto Dec 5, 2024, 1:06 AM

#

anatomy is better but still not good enough, also it works with higher cfg now

fleet meteor Dec 5, 2024, 3:30 AM

#

dull star

Any idea if stg uses more vram? I wanna try it

young blade Dec 5, 2024, 4:18 AM

#

#

making lora's is kinda addicting

craggy crest Dec 5, 2024, 7:02 AM

#

@bitter hearth you are being watched

slim oxide Dec 5, 2024, 7:05 AM

#

girl with white skirt

buoyant mesa Dec 5, 2024, 10:45 AM

#

hallo, i am new to Stable diffusion 3.5

#

is it possible to train a Lora for it (i know you can for 1.5 and XL)

short thicket Dec 5, 2024, 2:35 PM

#

BTW. I have a little over 10k images in my dataset now. I have a few last things to wrap up this weekend with it and then I'm gonna fine tune next week.

short thicket Dec 5, 2024, 3:00 PM

#

I have heard of it, but haven't messed around with it. It has a color palette option right? I think that's pretty cool.

bitter hearth Dec 5, 2024, 3:47 PM

#

short thicket I have heard of it, but haven't messed around with it. It has a color palette op...

if you like color palette someone made a node for flux that works https://github.com/AIPOQUE/ComfyUI-APQNodes

#

its cool cos it didn't require training

#

they found colors the model already knows then they convert your color palette to the nearest ones
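
The "snap to the nearest known color" step could look roughly like this. It is only an illustration of the idea, not the node's actual implementation; the color list and the plain RGB distance are assumptions.

```python
# Hypothetical list of colors the model is assumed to already understand.
KNOWN_COLORS = {
    "red": (255, 0, 0),
    "teal": (0, 128, 128),
    "navy": (0, 0, 128),
    "gold": (255, 215, 0),
}


def nearest_color(rgb):
    """Name of the known color closest to `rgb` (squared RGB distance)."""
    def dist2(name):
        return sum((a - b) ** 2 for a, b in zip(rgb, KNOWN_COLORS[name]))
    return min(KNOWN_COLORS, key=dist2)


def palette_to_names(palette):
    """Map every palette entry to its nearest known color name."""
    return [nearest_color(rgb) for rgb in palette]
```

The resulting names could then go straight into the prompt, which fits the "didn't require training" point.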

#

lol nice

ancient radish Dec 5, 2024, 4:07 PM

#

anyone using sd forge? they have rly good upscale when generating image but i cant upscale any of my older pics good with img2img/extras. whats the current method on forge?

short thicket Dec 5, 2024, 4:24 PM

#

Nice! Yeah, I'm not really trying to train on millions of images or anything. I like the smaller yet more selective type for now. I will build up on it over time. Maybe go for another fine tune when I have 30k images.

brittle nexus Dec 5, 2024, 4:27 PM

#

ancient radish anyone using sd forge? they have rly good upscale when generating image but i ca...

img2img, script, sd upscale

ancient radish Dec 5, 2024, 4:28 PM

#

brittle nexus img2img, script, sd upscale

tried that it takes like 4 seconds and doesnt add any quality, idk not working for me

short thicket Dec 5, 2024, 4:30 PM

#

No that's pretty cool. Is there a link to the original? I'd like to look into it more.

bitter hearth Dec 5, 2024, 4:35 PM

#

demoncore is the best I have tried

#

ok thanks will try it

brittle nexus Dec 5, 2024, 4:41 PM

#

ancient radish tried that it takes like 4 seconds and doesnt add any quality, idk not working f...

It is working normally here

ancient radish Dec 5, 2024, 4:42 PM

#

brittle nexus It is working normally here

got it working cheers, half my upscalers aren't working for some reason, but the same ones did when i used on generated image

#

¯_(ツ)_/¯

brittle nexus Dec 5, 2024, 4:52 PM

#

https://www.reddit.com/r/StableDiffusion/comments/1h781s6/security_alert_crypto_mining_attack_via/

From the StableDiffusion community on Reddit: ⚠️ Security Alert: Cr...

Explore this post and more from the StableDiffusion community

young blade Dec 5, 2024, 5:05 PM

#

anyone recall a website or hugging space that allows you to upload a short video animation and then an image, and it will try to animate your image in the same way the video is?

craggy crest Dec 5, 2024, 5:27 PM

#

buoyant mesa is it possible to train a Lora for it (i know you can for 1.5 and XL)

yes. it's very easy to train for

pseudo owl Dec 5, 2024, 6:27 PM

#

Bruh hunyuan text to video is completely uncensored. It can probably compete with ponyxl in that department.

#

What the hell did Tencent train it with lol

dull star Dec 5, 2024, 7:10 PM

#

pseudo owl Bruh hunyuan text to video is completely uncensored. It can probably compete wit...

woahhh

low stone Dec 5, 2024, 7:12 PM

#

brittle nexus Without and With

wow that's a major improvement

remote holly Dec 5, 2024, 7:25 PM

#

Can i run t5xxl fp16 with 12gb vram ?

#

I have 16gb of vram

mortal mesa Dec 5, 2024, 8:16 PM

#

i do with an 11gb card in comfy

brittle nexus Dec 5, 2024, 8:41 PM

#

I have 12gb

remote holly Dec 5, 2024, 8:53 PM

#

I dont see any difference between fp8 and fp16 with ltxv

turbid grotto Dec 5, 2024, 8:58 PM

#

is it possible to run hunyuan with 12gb vram?

#

no I am not asking due to certain message before

pseudo owl Dec 5, 2024, 9:00 PM

#

turbid grotto is it possible to run hunyuan with 12gb vram?

@halcyon yarrow ran it in 8gb vram so probably yes, not sure

turbid grotto Dec 5, 2024, 9:01 PM

#

pseudo owl @halcyon yarrow ran it in 8gb vram so probably yes, not sure

thanks

turbid grotto Dec 5, 2024, 9:47 PM

#

gonnabegood

brittle nexus Dec 5, 2024, 9:56 PM

#

LTX

craggy crest Dec 5, 2024, 10:44 PM

#

sd3.5 large to create the image, kling to animate it

#

really liked how the final effect came out

brittle nexus Dec 5, 2024, 11:31 PM

#

sullen moss Dec 5, 2024, 11:31 PM

#

https://www.theverge.com/2024/12/4/24312352/openai-sora-o1-reasoning-12-days-shipmas

The Verge

OpenAI’s 12 days of ‘shipmas’ include Sora and new reasoning model

OpenAI has 12 days of Christmas planned.

buoyant mesa Dec 6, 2024, 12:31 AM

#

craggy crest yes. it's very easy to train for

with what?

craggy crest Dec 6, 2024, 12:48 AM

#

to put it to the test

mortal mesa Dec 6, 2024, 12:49 AM

#

smoke on the water

#

sounds like you need to be in a relationship with an AI model

craggy crest Dec 6, 2024, 12:50 AM

#

mortal mesa sounds like you need to be in a relationship with an AI model

only if the model is grok ;)

mortal mesa Dec 6, 2024, 12:51 AM

#

Noooo!

turbid grotto Dec 6, 2024, 2:17 AM

#

Richard, how did you manage to run hunyuan video? I can't even load text encoder with 12gb 😦

brittle nexus Dec 6, 2024, 2:24 AM

#

turbid grotto Richard, how did you manage to run hunyuan video? I can't even load text encoder...

I did, with those settings:

#

craggy crest Dec 6, 2024, 2:37 AM

#

turbid grotto Richard, how did you manage to run hunyuan video? I can't even load text encoder...

it needs around 18gig vram at least, i believe.

turbid grotto Dec 6, 2024, 2:51 AM

#

brittle nexus I did, with those settings:

thanks, but sadly, I can't go past "(Down)Load HunyuanVideo TextEncoder" node, it just goes out of vram. Is there a way to offload to cpu or force using shared memory?

brittle nexus Dec 6, 2024, 2:51 AM

#

turbid grotto thanks, but sadly, I can't go past "(Down)Load HunyuanVideo TextEncoder" node, i...

How much ram do you have?

turbid grotto Dec 6, 2024, 2:51 AM

#

craggy crest it needs around 18gig vram at least, i believe.

flux fp16 needs 24gb but I can run it with offloading

turbid grotto Dec 6, 2024, 2:51 AM

#

brittle nexus How much ram do you have?

32gb

brittle nexus Dec 6, 2024, 2:52 AM

#

turbid grotto 32gb

Same as me. It's not the problem.

craggy crest Dec 6, 2024, 2:53 AM

#

brittle nexus Same as me. It's not the problem.

are you using Comfy? or something else?

brittle nexus Dec 6, 2024, 2:54 AM

#

craggy crest are you using Comfy? or something else?

Comfy.

craggy crest Dec 6, 2024, 2:54 AM

#

brittle nexus Comfy.

you might ask in the #🧣｜comfy-ui channel, or on his discord

turbid grotto Dec 6, 2024, 3:11 AM

#

I'll wait for wrapped to mature, for now

bitter hearth Dec 6, 2024, 4:30 AM

#

I've had loads of trouble on 32GB DRAM servers
I'm saying DRAM not VRAM here
would recommend 64GB DRAM

#

if you can, then 128GB, but 64GB is mostly ok

mortal mesa Dec 6, 2024, 5:06 AM

#

they killed Richard

craggy crest Dec 6, 2024, 6:47 AM

#

craggy crest Dec 6, 2024, 11:00 PM

#

from comfy's newest announcement: ComfyUI now natively supports masking and scheduling of LoRA and model weights. This works for all LoRAs and models - as long as the LoRAs and models are not of conflicting architectures. The full blog post is here: https://blog.comfy.org/masking-and-scheduling-lora-and-model-weights/

Comfy Org Blog

Masking and Scheduling LoRA and Model Weights

As of Monday, December 2nd, ComfyUI now supports masking and scheduling LoRA and model weights natively as part of its conditioning system. It is compatible with all models. The update also has an extensive ModelPatcher rework and introduction of wrappers and callbacks to make custom node implementations require less hacks,

patent acorn Dec 7, 2024, 12:29 AM

#

craggy crest from comfy's newest annoucement: ComfyUI now natively supports masking and sched...

i thought that was implemented already 🤔

craggy crest Dec 7, 2024, 12:34 AM

#

patent acorn i thought that was implemented already 🤔

they just made the announcement today - so maybe you saw a beta test something?

mortal mesa Dec 7, 2024, 12:41 AM

#

it was but you needed nodes, this is native support

patent acorn Dec 7, 2024, 12:55 AM

#

yeah custom nodes make things possible

turbid grotto Dec 7, 2024, 1:46 AM

#

I can run HunyuanVideo now but all I get is black video, anyone had similar issue?

bitter hearth Dec 7, 2024, 7:41 AM

#

patent acorn i thought that was implemented already 🤔

its been doable in various ways for a long time
this is essentially another way of doing it

remote holly Dec 7, 2024, 9:47 AM

#

i just tested sd3.5M, it's really great

remote holly Dec 7, 2024, 10:37 AM

#

what is the best upscale model ?

remote holly Dec 7, 2024, 11:38 AM

#

bitter hearth Dec 7, 2024, 12:04 PM

#

remote holly what is the best upscale model ?

closed source is the topaz gigapixel one
open source typical ones are StableSR, CCSR and SUPIR

#

there is a newer one that slightly beats SUPIR maybe can't remember the name

#

but SUPIR has been slightly broken into its components by Kijai, for ComfyUI so I think that is still a better choice

#

if you wanted the image to not change as much then openmodeldb is the place to go

remote holly Dec 7, 2024, 12:06 PM

#

thanks very much for the information ! @bitter hearth

pseudo owl Dec 7, 2024, 6:14 PM

#

bitter hearth there is a newer one that slightly beats SUPIR maybe can't remember the name

I think this one right: https://huggingface.co/InstantX/InstantIR

bitter hearth Dec 7, 2024, 6:31 PM

#

pseudo owl I think this one right: https://huggingface.co/InstantX/InstantIR

thanks yeah that was it

remote holly Dec 7, 2024, 6:42 PM

#

pseudo owl I think this one right: https://huggingface.co/InstantX/InstantIR

thanks !

sullen moss Dec 7, 2024, 7:38 PM

#

https://github.com/kijai/ComfyUI-HunyuanVideoWrapper/issues/55

GitHub

All black output · Issue #55 · kijai/ComfyUI-HunyuanVideoWrapper

I was able to get this working this morning, and generated a few videos. But suddenly out of the blue, all of my outputs are solid black. Why would this be? I'm not getting any error codes or a...

#

Has anyone else faced this problem? If so, how did you resolve it?

turbid grotto Dec 7, 2024, 9:01 PM

#

sullen moss https://github.com/kijai/ComfyUI-HunyuanVideoWrapper/issues/55

same, don't know solution yet

turbid grotto Dec 8, 2024, 1:02 AM

#

sullen moss Has anyone else faced this problem? If so, how did you resolve it?

installing sage attention solved it

turbid grotto Dec 8, 2024, 1:58 AM

#

I am able to do 41 frames at 512x320 with 35steps in 290 seconds on rtx3060 gonnabegood
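
For comparing runs like this one, the numbers reduce to seconds per step and per frame. A small sketch; it treats the 290 seconds as pure sampling time, which is an assumption since model loading and VAE decode may be included.

```python
def seconds_per_step(total_seconds: float, steps: int) -> float:
    """Average wall-clock time per sampling step."""
    return total_seconds / steps


def seconds_per_frame(total_seconds: float, frames: int) -> float:
    """Average wall-clock time per generated frame."""
    return total_seconds / frames


if __name__ == "__main__":
    print(f"{seconds_per_step(290, 35):.2f} s/step")
    print(f"{seconds_per_frame(290, 41):.2f} s/frame")
```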

#

wish svdquants worked with this model

#

would be even faster

bitter hearth Dec 8, 2024, 2:00 AM

#

it may well work

turbid grotto Dec 8, 2024, 2:11 AM

#

it is actually great even at 512x320 fp8, we need video upscale now

bitter hearth Dec 8, 2024, 2:16 AM

#

nvidia put svdquant on their tensorRT roadmap too so there could be another speedup there

turbid grotto Dec 8, 2024, 2:17 AM

#

hope not only for 50 series

bitter hearth Dec 8, 2024, 2:19 AM

#

I would say that it should be possible to make TensorRT engine for that on older GPUs
but its Nvidia so you never know thomas

turbid grotto Dec 8, 2024, 2:20 AM

#

TRT works super fine with sdxl on 3060 at least, but nvidia is nvidia

#

Yea, I can tell, hunyuan did not filter training data 😂

toxic bone Dec 8, 2024, 6:05 AM

#

could just be that for the optimization to work, it needs hardware support. Like they made the raytracing exclusive to RTX cards and people were like "Ughghhn monopoly and shenanigans!" but then they pushed a version that worked on GTX and without the hardware support it just wasn't good.

Sometimes instruction sets on the chip are crucial to make a routine performant. The optimization might not be effective if it's done on the fly in software.

Sort of like how fp8 isn't slow on ada cards, but it is on older models. That's because the hopper transformer engine with its hardware fp8 support facilitates it.

remote holly Dec 8, 2024, 7:58 AM

#

SVDQuants are finally available in comfyui ! 🤯 https://www.reddit.com/r/StableDiffusion/s/dRiI1yR1R0

From the StableDiffusion community on Reddit

Explore this post and more from the StableDiffusion community

turbid grotto Dec 8, 2024, 7:59 AM

#

remote holly SVDQuants are finaly avaible in comfyui ! 🤯 https://www.reddit.com/r/StableDiff...

gonnabegood

bitter hearth Dec 8, 2024, 7:59 AM

#

awesome

remote holly Dec 8, 2024, 8:26 AM

#

doesn't work waow

#

missing nodes

bitter hearth Dec 8, 2024, 8:28 AM

#

git clone?

remote holly Dec 8, 2024, 8:29 AM

#

yess

bitter hearth Dec 8, 2024, 8:29 AM

#

hopefully will be sorted out soon if so

remote holly Dec 8, 2024, 8:29 AM

#

i copied the node files to custom nodes but it doesn't work

#

you need to install with anaconda

#

it doesn't make sense, you have to install nunchaku in a conda environment but how is comfyui supposed to find it?

turbid grotto Dec 8, 2024, 11:35 AM

#

sadcat

sullen moss Dec 8, 2024, 11:49 AM

#

turbid grotto installing sage attention solved it

mortal mesa Dec 8, 2024, 9:37 PM

#

304x256 16 frames Hunyuan video is about all i can do 😂

turbid grotto Dec 8, 2024, 11:09 PM

#

sullen moss

did you install triton correctly? agony

pseudo owl Dec 9, 2024, 12:10 AM

#

mortal mesa 304x256 16 frames Hunyuan video s about all i can do 😂

Use block swap, you can do 49 frames 1280x720 at 12-13gb vram. Will be slower though.

dull star Dec 9, 2024, 12:20 AM

#

Switti

#

#

pseudo owl Dec 9, 2024, 12:30 AM

#

dull star Switti

Switti is so fast and pretty decent, I would say sdxl quality, slightly better?

toxic bone Dec 9, 2024, 2:40 AM

#

really good impressionist paintings

turbid grotto Dec 9, 2024, 4:07 AM

#

dull star Switti

works in comfy?

opal quarry Dec 9, 2024, 4:32 AM

#

#

turbid grotto Dec 9, 2024, 5:27 AM

#

opal quarry

hmm great hands but doesn't look like Flux, what model is that?

rain current Dec 9, 2024, 7:04 AM

#

turbid grotto hmm great hands but doesn't look like Flux, what model is that?

pin stable diffusion? pinterest.com/pin/864902303412978889

cursive frigate Dec 9, 2024, 8:50 AM

#

pseudo owl I think this one right: <https://huggingface.co/InstantX/InstantIR>

@bitter hearth I thought SUPIR and InstantIR were upscale models like
Why are those two so much bigger? Am I missing something?

bitter hearth Dec 9, 2024, 8:52 AM

#

diffusion ones are larger

cursive frigate Dec 9, 2024, 8:54 AM

#

@bitter hearth are there any new great upscale models that are better than the ones in my screenshot. mainly for photo quality and sometimes anime stuff

bitter hearth Dec 9, 2024, 8:55 AM

#

the Nomos2 ATD

#

the input needs to be very high quality

#

so sometimes the other ATD models are better

cursive frigate Dec 9, 2024, 9:00 AM

#

I'll give it a try. Just got everything running on Ubuntu. Got a new 4tb nvme, pulled out the windows nvme, installed ubuntu, put the windows nvme back in so no bootloader or grub files ended up on windows disk and I can now use either one. Got docker running for open-webui, installed ollama, got ComfyUI running. And I have to say everything runs so much better on linux.

#

@bitter hearth Anyways, have a good night. I'm off to bed, gotta go to work in about 5 hours.

bitter hearth Dec 9, 2024, 9:03 AM

#

night

charred wraith Dec 9, 2024, 12:02 PM

#

yo

sullen moss Dec 9, 2024, 6:47 PM

#

https://www.youtube.com/watch?v=OY2x0TyKzIQ&ab_channel=MarquesBrownlee

YouTube

Marques Brownlee

This Video is AI Generated! SORA Review

SORA generates videos. This is the first review.

pseudo owl Dec 9, 2024, 7:40 PM

#

sullen moss https://www.youtube.com/watch?v=OY2x0TyKzIQ&ab_channel=MarquesBrownlee

Yeah sora finally released, lot of weird restrictions though like i2v can’t have humans?

Honestly, open source is surprisingly close to it and MiniMax is comparable to it(better in physics but sora has better aesthetic)

mortal mesa Dec 9, 2024, 7:50 PM

#

sora is probably a monster that isnt worth the power bills to run, its the openai way

#

quantity not quality

sullen moss Dec 9, 2024, 7:57 PM

#

We’ll see how it goes. For me, DALL-E 3 is still an unmatched generator. Almost a year has passed, and no model has come close to its understanding of complex scenarios and concepts. The only and biggest downside is its strict censorship. I’m hoping for the growth of decentralized AI—it's the only way regular users can compete with big corporations.

#

heh

#

#

😂

remote holly Dec 9, 2024, 9:12 PM

#

😭

#

I hate these restriction policies

pseudo owl Dec 9, 2024, 9:15 PM

#

sullen moss We’ll see how it goes. For me, DALL-E 3 is still an unmatched generator. Almost ...

Hmm flux/sd3.5 large should be miles ahead even in prompt following, not even close, can you provide an example prompt that dalle3 does well but not flux/sd3.5 large?

#

dalle3 is completely outdated, and has very large biases too that are even worse than flux

sullen moss Dec 9, 2024, 9:20 PM

#

Everyone is entitled to their own opinion.

pseudo owl Dec 9, 2024, 9:21 PM

#

sullen moss Everyone is entitled to their own opinion.

Oh yeah, aesthetic-wise everyone has different preferences, but I'm talking about text, prompt following, image quality, styles, and humans, which is where both models are better.

sullen moss Dec 9, 2024, 9:23 PM

#

Have you generated a lot in DALL-E?

pseudo owl Dec 9, 2024, 9:25 PM

#

sullen moss Have you generated a lot in DALL-E?

alright amount, I don't use it now often but a few weeks ago, I generated 2-3 imgs per day and compared to other models sometimes too.

sullen moss Dec 9, 2024, 9:26 PM

#

pseudo owl alright amount, I don't use it now often but a few weeks ago, I generated 2-3 im...

I’ve generated around 100,000 images.

toxic bone Dec 9, 2024, 9:27 PM

#

flux prompting is on par with Dall-e imo. And then you can make it comprehend what you mean with a lora. i can't speak towards sd3.5 as much. there's also omost which uses an LLM to form a regional prompt structure and generate complex images.

local solutions are not only as good, but they give you more tools to leverage. The level of control you have is far superior.

sullen moss Dec 9, 2024, 9:28 PM

#

I’ve seen what DALL-E 3 is capable of if you dive deeper into its dark subconscious.

toxic bone Dec 9, 2024, 9:28 PM

#

lol it has no subconscious

dull star Dec 9, 2024, 9:28 PM

#

its probably a form of expression

pseudo owl Dec 9, 2024, 9:28 PM

#

sullen moss I’ve seen what DALL-E 3 is capable of if you dive deeper into its dark subconsci...

dalle3 uses an llm to enhance your prompt, maybe that's the reason why you think it's better? use something similar with flux and you will definitely get better results than dalle3 for sure.

dull star Dec 9, 2024, 9:29 PM

#

yeah it absolutely does use an LLM to change your prompt

toxic bone Dec 9, 2024, 9:29 PM

#

expressions that anthropomorphise models show a fundamental misunderstanding of the tools.

dull star Dec 9, 2024, 9:29 PM

#

I prefer flux (pixelwave especially) over DALLE3 any day of the week

#

DALLE3 is good for like... powerpoint presentations I guess

#

it must have some style or substance in it if you prompt for it, but I just don't like the images it produces

#

and besides its a service, I like offline

toxic bone Dec 9, 2024, 9:31 PM

#

even the simplest aux tools like inpainting are unavailable

dull star Dec 9, 2024, 9:31 PM

#

bruh

toxic bone Dec 9, 2024, 9:31 PM

#

hell you can't even adjust the number of steps

pseudo owl Dec 9, 2024, 9:32 PM

#

The new sora is uh interesting, from orabazes

#

Hunyuan T2V which is open source(not even full quality, the fp8 lower quality version and less frames which makes it lower quality)

toxic bone Dec 9, 2024, 9:38 PM

#

can it really be called a new sora with only 3 second attention span?

pseudo owl Dec 9, 2024, 9:39 PM

#

toxic bone can it really be called a new sora with only 3 second attention span?

No it can do more, it was just less frames to save vram. and that was an old video, lots of improvements came and now its possible to make high res videos in just 12gb vram.

toxic bone Dec 9, 2024, 9:40 PM

#

i'm sure it can do more, but each 3 seconds will be a separate attention context

#

point is, sora seemingly has no attention seams. it's in another class and it's hard to compare models with that short attention span to it

sullen moss Dec 9, 2024, 9:42 PM

#

pseudo owl Hunyuan T2V which is open source(not even full quality, the fp8 lower quality ve...

I finally installed it too, and I’m genuinely surprised. Yes, the quality doesn’t quite reach corporate models, but the complete freedom is incredibly satisfying. I’m amazed that such an uncensored model was released as open-source.

pseudo owl Dec 9, 2024, 9:43 PM

#

toxic bone i'm sure it can do more, but each 3 seconds will be a sepearate attention contex...

No, the normal generation is 5s like sora on the plus plan, you can just do less frames to save vram at the cost of quality. Not required now mostly, lots of optimizations have come.

It even works with vids as long as 10sec but 5sec is the most stable (without separate gens)

toxic bone Dec 9, 2024, 9:44 PM

#

that limitation is put there for payment purposes. it's artificial. Sora can do many minute long videos, even if they're not offered on the plan you have.

#

i get it though. sora is the brand name. everyone wants to be sora. so you call it "the new sora" or "the sora killer"

pseudo owl Dec 9, 2024, 9:46 PM

#

toxic bone that limitation is put there for payment purposes. it's artificial. Sora can do...

Limited to 1minute according to the paper, not more, but yeah everyone calls new video models sora killer because basically sora started the hype.

turbid grotto Dec 9, 2024, 9:46 PM

#

pseudo owl Hmm flux/sd3.5 large should be miles ahead even in prompt following, not even cl...

I think too that Dalle is still smarter than flux or sd3.5, and not just for a little

toxic bone Dec 9, 2024, 9:47 PM

#

so provide an example

#

if there's not a little, that implies theres a lot. so it should be easy

turbid grotto Dec 9, 2024, 9:48 PM

#

toxic bone so provide an example

I have none, it is just from my experience

toxic bone Dec 9, 2024, 9:48 PM

#

figures

pseudo owl Dec 9, 2024, 9:48 PM

#

turbid grotto I think too that Dalle is still smarter than flux or sd3.5, and not just for a l...

Can you explain though, I tested both on several different prompts like blue dog/image-grid and more. Even on benchmarks, they outperform dalle3.

bitter hearth Dec 9, 2024, 9:59 PM

#

Flux Dev beats Dalle by 0.01 in the GenEval benchmark
that's within margin of error

#

this is from the Playground v3 paper

turbid grotto Dec 9, 2024, 9:59 PM

#

pseudo owl Can you explain though, I tested both on several different prompts like blue dog...

I found it understands complex/crazy concepts better than anything else, but as there are conflicting opinions, I should retest

bitter hearth Dec 9, 2024, 10:03 PM

#

found a much better table
from the Sana paper
this time Dalle 3 and Flux Dev are exactly even on GenEval, and Dev is 0.5 points higher in DPG

mortal mesa Dec 9, 2024, 10:05 PM

#

to me its not an apples to apples comparison, Dalle should be thought of as a complex workflow, not a single model

sullen moss Dec 9, 2024, 10:06 PM

#

These are all synthetic tests. I never pay attention to them. Any tool needs to be deeply tested to understand its hidden capabilities and limits.

bitter hearth Dec 9, 2024, 10:22 PM

#

you can make your own benchmark dataset if you want
it's quite tricky though as you would want it to be well balanced in terms of the distribution of entities, attributes, spatial relations and non-spatial relations across the prompt dataset
and then you would want to run at least 1k prompts on it
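
A rough sketch of what "well balanced" could look like in practice, assuming the four categories above get equal shares of a 1k prompt set. The templates, object list and the `build_prompt_set` helper are all made up for illustration, a real benchmark would need a far larger vocabulary and a scoring step on top:

```python
import random
from collections import Counter

# The four aspects mentioned above; equal shares is an assumption.
CATEGORIES = ["entity", "attribute", "spatial_relation", "non_spatial_relation"]

# Hypothetical templates and vocabulary, for illustration only.
TEMPLATES = {
    "entity": ["a photo of a {a} and a {b}"],
    "attribute": ["a {color} {a}"],
    "spatial_relation": ["a {a} to the left of a {b}"],
    "non_spatial_relation": ["a {a} chasing a {b}"],
}
OBJECTS = ["dog", "cat", "bicycle", "teapot", "robot"]
COLORS = ["red", "blue", "green", "yellow"]


def build_prompt_set(total: int = 1000, seed: int = 0) -> list:
    """Build `total` prompts with the same number of prompts per category."""
    if total % len(CATEGORIES) != 0:
        raise ValueError("total must be divisible by the number of categories")
    rng = random.Random(seed)  # fixed seed so the set is reproducible
    per_category = total // len(CATEGORIES)
    prompts = []
    for category in CATEGORIES:
        for _ in range(per_category):
            a, b = rng.sample(OBJECTS, 2)  # two distinct objects
            template = rng.choice(TEMPLATES[category])
            # Unused keyword arguments are ignored by str.format.
            text = template.format(a=a, b=b, color=rng.choice(COLORS))
            prompts.append({"category": category, "prompt": text})
    rng.shuffle(prompts)
    return prompts


prompts = build_prompt_set(1000)
print(Counter(p["category"] for p in prompts))
```

With a vocabulary this small the set will contain many duplicate prompts, so treat it only as a shape for the balancing logic.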

toxic bone Dec 9, 2024, 11:15 PM

#

was funny to me when there was a benchmark model people were all about called "chad" or something. It was trained on tons of gigachad images, so people were like "this prompt works better because my classifier model rates it higher" but it was just because it looked more gigachadish... which is far from realistic

#

https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/1831 this one it got really popular for a quick minute before people realized what it was actually ranking towards

GitHub

SD Chad - Stable Diffusion Aesthetic Scorer · AUTOMATIC1111 stable-...

SD Chad - Stable Diffusion Aesthetic Scorer Have been using SD to create art for the last month, finding a template that works across prompt, seed, settings, and then creating 100s of images from i...

pseudo owl Dec 9, 2024, 11:17 PM

#

Some sora vids generated by orabazes, so far quality is pretty good but MiniMax might be a better option especially since sora 1024p with 10sec is 200$ price

craggy crest Dec 10, 2024, 12:16 AM

#

pseudo owl Some sora vids generated by orabazes, so far quality is pretty good but MiniMax ...

you also can't generate videos with people in them if you don't pay for that 200 dollar plan - Theoretically Media did a very nice review of it earlier on his YouTube

cursive frigate Dec 10, 2024, 7:03 AM

#

Anyone doing anything interesting in ComfyUI these days?

wind basalt Dec 10, 2024, 9:25 AM

#

Hello all, thought I would check in, wondering if the newest SD is able to render hands/fingers in front of faces without messing up the face behind the fingers?

noble phoenix Dec 10, 2024, 9:29 AM

#

/fingers

still lark Dec 10, 2024, 10:44 AM

#

mr

compact spruce Dec 10, 2024, 12:24 PM

#

Hello guys. I run a dev agency and I'm developing a Flux workflow to create headshot images, which I will deploy as a web app so that we can monetize it. After I talked about that idea, a lot of people have been messaging me and asking me to build web apps for them, which I think is a great way to earn money from your workflow.
I think a lot of people here have cool workflows and want to turn them into a web app, launch it to thousands of users and make money, but don't know how to build it. If you are thinking about it and need any help, let me know, I will support nicely :).

#

for example we are deploying this workflow onto our server

buoyant mesa Dec 10, 2024, 1:45 PM

#

great thing wish you best luck!
But to use flux you need a licence but i guess you know that

compact spruce Dec 10, 2024, 2:35 PM

#

buoyant mesa great thing wish you best luck! But to use flux you need a licence but i guess y...

Yup, we are using the flux.schnell

#

flux.dev is non-commercial license so we do not use that

sullen moss Dec 10, 2024, 3:43 PM

#

https://blog.google/technology/research/google-willow-quantum-chip/

Google

Meet Willow, our state-of-the-art quantum chip

Our new quantum chip demonstrates error correction and performance that paves the way to a useful, large-scale quantum computer.

#

🤔

buoyant mesa Dec 10, 2024, 3:46 PM

#

can you use loras for SD 3.5 in comfyUI yet?
I can't get it to work
can somebody help pleassseeee

errant dust Dec 10, 2024, 5:26 PM

#

Is there a GGUF version of the Flux tools?

#

Fill/Redux

pseudo owl Dec 10, 2024, 5:54 PM

#

Mmaudio with Hunyuan t2v is pretty amazing, both are open source

remote holly Dec 10, 2024, 5:58 PM

#

pseudo owl Mmaudio with Hunyuan t2v is pretty amazing, both are open source

Whoa ! I was looking for a vid2audio generator

ancient radish Dec 10, 2024, 6:10 PM

#

https://www.runninghub.cn/post/1837756005856821249

very clean character sheet but i cant get it to work, if anyone interested in trying it and gets it working maybe they can give a hand xd

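Multi-angle consistent realistic character (basic edition) - RunningHub ComfyUI workflow

ComfyUI Workflow - Multi-angle consistent realistic character (basic edition). This workflow generates consistent multi-angle characters in a realistic style. 1. Customize your male/female character, including gender, age, nationality, hair, clothing type and other custom options. 2. Upload a reference pose image. 3. Describe the background; if you don't need one, enter simple background. 4. Select 1 to generate a female, 2 for a male; the default upscale is 1.5x. Enjoy! Author's WeChat: xdtr6688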

errant dust Dec 10, 2024, 6:32 PM

#

pseudo owl Hmm flux/sd3.5 large should be miles ahead even in prompt following, not even cl...

That's easy enough: A large pizza with a world map made from pizza toppings.

toxic bone Dec 10, 2024, 9:02 PM

#

you actually don't need a commercial license to use the outputs of flux dev. you only need the license to offer the model on a service.

#

i know you fall into the latter category, i'm just clarifying

pseudo owl Dec 10, 2024, 10:07 PM

#

remote holly Whoa ! I was looking for a vid2audio generator

Yeah mmaudio is truly amazing, and with hunyuant2v, this thing is ridiculous

split bramble Dec 10, 2024, 10:28 PM

#

pseudo owl Yeah mmaudio is truly amazing, and with hunyuant2v, this thing is ridiculous

Need some more examples of this, pretty neat!

toxic bone Dec 10, 2024, 10:30 PM

#

pseudo owl Yeah mmaudio is truly amazing, and with hunyuant2v, this thing is ridiculous

there's a busker guy in the city i grew up in who fiddles in a vader costume. It's extreme. i've loved him for years and always throw some coin in his hat

#

https://www.youtube.com/watch?v=dVKwiJkc8OI he's still at it looks like

YouTube

Colton Phillips

Darth Fiddler Still Rips It Up In 2021

darth fiddler is a street artist in victoria bc

▶ Play video

lucid swift Dec 10, 2024, 11:05 PM

#

pseudo owl Mmaudio with Hunyuan t2v is pretty amazing, both are open source

wow this is cool!

bitter hearth Dec 11, 2024, 12:24 AM

#

new model dropped weights
https://github.com/lehduong/OneDiffusion?tab=readme-ov-file

GitHub

GitHub - lehduong/OneDiffusion

Contribute to lehduong/OneDiffusion development by creating an account on GitHub.

pseudo owl Dec 11, 2024, 12:27 AM

#

bitter hearth new model dropped weights https://github.com/lehduong/OneDiffusion?tab=readme-o...

Nice, 3d multi view thing looks pretty interesting and seems very small.

bitter hearth Dec 11, 2024, 12:45 AM

#

ooh it takes SEGs

#

the SD 1.5 SEG control net was really nice

#

acts a bit like tile for layout but it lets the model be a bit more free

#

I think a model like this is what will dethrone Flux next year

#

a big transformer with lots of control stuff

cursive frigate Dec 11, 2024, 6:26 AM

#

@bitter hearth I still can't figure out custom sigmas 😉 ,seems to be a bit above my skill level. I also can't really find anything online that refers to them or how to use them.

civic trail Dec 11, 2024, 8:33 AM

#

bitter hearth Dec 11, 2024, 9:29 AM

#

cursive frigate @bitter hearth I still can't figure out custom sigmas 😉 ,seems to be a b...

this is what a linear schedule of custom sigmas for Flux would look like:
1.0, 0.8, 0.6, 0.4, 0.2, 0.0,
its just a list of numbers that goes down to zero
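
A minimal sketch of how a list like that could be generated, assuming plain Python. The `linear_sigmas` name is made up here, and how the list gets passed to the sampler depends on which custom-sigmas node you use:

```python
def linear_sigmas(steps: int, start: float = 1.0, end: float = 0.0) -> list:
    """Return steps + 1 evenly spaced sigma values going from start down to end."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    # Round to hide float noise such as 0.19999999999999996.
    return [round(start + (end - start) * i / steps, 6) for i in range(steps + 1)]


# Five steps reproduces the list above.
print(", ".join(str(s) for s in linear_sigmas(5)))
```

Non-linear schedules are the same idea with a different spacing rule, the list still has to start high and end at zero.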

remote holly Dec 11, 2024, 12:51 PM

#

https://github.com/kijai/ComfyUI-MMAudio

GitHub

GitHub - kijai/ComfyUI-MMAudio

Contribute to kijai/ComfyUI-MMAudio development by creating an account on GitHub.

dry wave Dec 11, 2024, 4:14 PM

#

pseudo owl Yeah mmaudio is truly amazing, and with hunyuant2v, this thing is ridiculous

how are you generating these videos? On the hunyuan page they say you need at least 45 GB VRAM. Do you use an external gpu, or can you quantize hunyuan to use <24GB?

pseudo owl Dec 11, 2024, 4:47 PM

#

dry wave how are you generating these videos? On the hunyuan page they say you need at le...

Yes you can use fp8 quantization and even block swap to fit in 12gb vram.

dry wave Dec 11, 2024, 4:47 PM

#

nice, should try it. Is there a tutorial anywhere for comfyui?

bitter hearth Dec 11, 2024, 4:48 PM

#

keep an eye out for FP4/Int4 stuff
cos that's getting more common now (its half the VRAM and up to twice the speed of FP8)
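
The "half the VRAM" part is just bits-per-weight arithmetic. A rough sketch, assuming a hypothetical 13 billion parameter model and counting the weights only (activations, text encoder and VAE are not included, and the speed claim is not modelled at all):

```python
def weight_gib(n_params: float, bits_per_weight: int) -> float:
    """Memory taken by the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 1024**3


N_PARAMS = 13e9  # hypothetical parameter count, for illustration only

for name, bits in [("bf16", 16), ("fp8", 8), ("fp4/int4", 4)]:
    print(f"{name}: {weight_gib(N_PARAMS, bits):.1f} GiB")
```

Real quantized checkpoints usually keep some layers in higher precision and store scale factors, so actual files come out a bit larger than this estimate.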

remote holly Dec 11, 2024, 4:57 PM

#

I can't run mmaudio , i have torch issues

bitter hearth Dec 11, 2024, 4:58 PM

#

I had torch issues with SD 1.5 today lol

#

its so hard juggling dependencies and docker containers

remote holly Dec 11, 2024, 4:59 PM

#

Cannot import ComfyUI-MMAudio module for custom nodes: No module named 'torch.nn.attention'

#

what is the module torch.nn.attention and which version has this module ?

pseudo owl Dec 11, 2024, 5:59 PM

#

dry wave nice, should try it. Is there a tutorial anywhere for comfyui?

Not sure abt tutorial but here is the node with some example workflows, https://github.com/kijai/ComfyUI-HunyuanVideoWrapper

dry wave Dec 11, 2024, 6:01 PM

#

thx

remote holly Dec 11, 2024, 6:35 PM

#

remote holly `Cannot import ComfyUI-MMAudio module for custom nodes: No module named 'torch.n...

resolved, mmaudio works only with torch and torchaudio version 2.5.1 and CUDA 12.4
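
For anyone hitting the same import error, a quick way to check what is actually installed before touching the node. This is a small sketch using only the standard library; `torch.nn.attention` only exists in newer PyTorch releases, and the helper names here are made up:

```python
import importlib.util
from importlib import metadata
from typing import Optional


def has_module(name: str) -> bool:
    """True if `name` can be located (parent packages get imported to look it up)."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package is missing or is not a package.
        return False


def installed_version(dist: str) -> Optional[str]:
    """Installed version of a distribution, or None if it is not installed."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


for dist in ("torch", "torchaudio"):
    print(dist, installed_version(dist))
print("torch.nn.attention available:", has_module("torch.nn.attention"))
```

Run it with the same Python interpreter that ComfyUI uses, otherwise it reports on the wrong environment.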

toxic bone Dec 11, 2024, 6:40 PM

#

nodes requiring specific versions of dependencies. that's pretty sussy considering how vulnerable packaging infrastructure is.

wasn't it a video node that people were excited about that was the first big security breach on comfyui?

#

the latest was a specific PyPI package that was used on a hundred different nodes

remote holly Dec 11, 2024, 6:57 PM

#

yeah it's annoying, i'm lucky because i don't use a lot of custom nodes

pseudo owl Dec 11, 2024, 7:24 PM

#

toxic bone nodes requireing specific versions of dependencies. that's pretty sussy conside...

Kijai's nodes are 100% trustworthy, he makes the first and usually best nodes for every new video model (CogVideoX, mochi, hunyuan) and nearly everyone uses his nodes for video models.

remote holly Dec 11, 2024, 7:39 PM

#

I finally generated the audio

brittle nexus Dec 11, 2024, 7:48 PM

#

Analyzing the two images, is there a clear winner?

toxic bone Dec 11, 2024, 7:49 PM

#

pseudo owl Kijais nodes are 100% trustable, he makes the first and usually best nodes for e...

yeh but the infrastructure around it isn't.

running free scripts natively out of excitement is never a smart decision

#

Also, Kijai, whoever they are, is just an online account. Anyone could get access to it. It could always be compromised

civic trail Dec 11, 2024, 8:12 PM

#

lucid swift Dec 11, 2024, 8:28 PM

#

bitter hearth new model dropped weights https://github.com/lehduong/OneDiffusion?tab=readme-o...

wow that is cool. but i wish that they would have model offloading. and the option to run the vlm on the cpu with llama.cpp or something. because nobody has so much vram. but i will try to caption it by hand for now. whatever that means

toxic bone Dec 11, 2024, 9:46 PM

#

great thing about weights is that anyone can take the code and reform it to use the weights the way they want to

bitter hearth Dec 11, 2024, 9:53 PM

#

really don't think its worth CPU offloading, or running models on CPU

#

would be better to pick smaller models that fit in your VRAM

mortal mesa Dec 11, 2024, 10:11 PM

#

anything can be compromised, probably best to stay away from computers in general

#

the app you're typing on was compromised

bitter hearth Dec 11, 2024, 10:25 PM

#

there's been seven cases of malware this year in the ecosystem, have been keeping count

#

I keep it sandboxed in docker containers on cloud personally

lucid swift Dec 11, 2024, 10:27 PM

#

bitter hearth would be better to pick smaller models that fit in your VRAM

we are talking about captioning an image, which is very fast, so it's worth it to run on the cpu. the main model should of course be on the gpu. also offloading works quite well.

pseudo owl Dec 11, 2024, 10:30 PM

#

lucid swift we are talking about captioning a image that is very fast so its worth it to run...

You can probably just use florence2/moondream2. You don't need the captioner, it just makes it more automated instead of hand writing captions.

Florence2/moondream2 are really fast and good quality and use very little vram (4gb or less?), definitely better options than molmo/llava.

bitter hearth Dec 11, 2024, 10:34 PM

#

lucid swift we are talking about captioning a image that is very fast so its worth it to run...

its hard to know what speeds people are used to
it might be fast relative to things you have done before

lucid swift Dec 11, 2024, 10:40 PM

#

bitter hearth its hard to know what speeds people are used to it might be fast relative to thi...

it will take a few seconds

lucid swift Dec 11, 2024, 10:42 PM

#

pseudo owl You can probably just use florence2/moondream2. You don't need the captioner, it...

yes the ui is just a bit confusing. and i am not 100% sure if i am using it correctly

bitter hearth Dec 11, 2024, 10:53 PM

#

lucid swift it will take a few seconds

ok but this is no good when my image gen is under 1 second, for example

lucid swift Dec 11, 2024, 11:20 PM

#

bitter hearth ok but this is no good when my image gen is under 1 second, for example

its not. the image generation takes like 30s on a 3090

bitter hearth Dec 11, 2024, 11:37 PM

#

oh for this new model yeah I just mean for what I have done before

#

I wouldn't find a few seconds that fast

toxic bone Dec 12, 2024, 12:25 AM

#

mortal mesa anything can be compromised, probably best to stay away from computers in genera...

there are levels of risk and levels of compromise. nothing is black and white like that.

#

i don't install custom discord hacks for instance. i have the app installed but i haven't loaded it for many months. i typically use discord in a chrome tab

winged seal Dec 12, 2024, 6:22 AM

#

Haven't been in here in a while lol

#

decided I would pop in

toxic bone Dec 12, 2024, 7:21 AM

#

Awww ❤️ After i popped in on you lol. The guitar mastah.

frail lagoon Dec 12, 2024, 8:02 AM

#

can SD3 be used in a1111/forge yet?

lavish sparrow Dec 12, 2024, 8:14 AM

#

I still like sd3.5

#

#

#

#

#

remote holly Dec 12, 2024, 10:10 AM

#

lavish sparrow

Sd3.5 large ?

lavish sparrow Dec 12, 2024, 10:10 AM

#

yeah 🙂

remote holly Dec 12, 2024, 10:10 AM

#

It's really good for styles and colors

lavish sparrow Dec 12, 2024, 10:11 AM

#

yeah, i like the aesthetics of sd3.5 way better than flux

remote holly Dec 12, 2024, 10:11 AM

#

Yeah by far

lavish sparrow Dec 12, 2024, 10:13 AM

#

and i used some new tricks i picked up from one of the capabilities of ollama and an idea by qwq (LLM model)

#

make the LLM output json, and give it a dedicated field for reasoning when interpreting a prompt, but make another field the output prompt: it'll listen REALLY well

#

and with sd3.5L having the proper formatting for t5, clipl and clipg really helps too

#

if clipl goes over 75 tokens the image goes wonky fast

remote holly Dec 12, 2024, 10:14 AM

#

oh, i never tried reasoning models

lavish sparrow Dec 12, 2024, 10:15 AM

#

first, qwq is actually really amazing, but i figured, why not try with something like mistral, or any other small models

#

and even that works, if they have a little bit of "private" space where they can lay out things for themselves before actually making the prompt, it seems to help a lot

#

example from mistral-small-22b:

```json
{"T5": "A desolate battlefield strewn with discarded armor, each piece lined with glinting silver, lies amidst cyanide-tainted earth that shimmers in an eerie blue hue. The ground is scorched and barren, devoid of life except for the remnants of a destructive dance. Swords and shields are scattered haphazardly, as if left behind by an army in haste. The sky above is darkened by a churning storm, with lightning striking down ruthlessly, casting long, ominous shadows across the bleak landscape.",
"CLIPG": "battlefield, silver armor, cyanide ground, scattered weapons, dark storm",
"CLIPL": "A grim battlefield littered with silver-lined armor and discarded weapons on cyanide-tainted earth, under a threatening stormy sky.",
"Artstyle": "Dark Romanticism"}
```

#

before, that part that got in the front would end up in parts of the prompt that would be submitted, so now it really gives better visual only representations
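
A small sketch of the receiving side of that trick, assuming the LLM reply is a single JSON object. The `Reasoning` field name and the `split_reply` helper are hypothetical (the chat example only shows the four prompt fields), and the reasoning text below is made up for illustration:

```python
import json

PROMPT_FIELDS = ("T5", "CLIPG", "CLIPL", "Artstyle")
REASONING_FIELD = "Reasoning"  # hypothetical name for the "private" scratch field


def split_reply(raw: str):
    """Parse the LLM reply and keep the reasoning apart from the prompt fields."""
    data = json.loads(raw)
    missing = [key for key in PROMPT_FIELDS if key not in data]
    if missing:
        raise KeyError(f"reply is missing fields: {missing}")
    reasoning = data.get(REASONING_FIELD, "")
    # Only the prompt fields go on to the text encoders.
    prompts = {key: data[key] for key in PROMPT_FIELDS}
    return reasoning, prompts


# Shortened stand-in for a model reply.
reply = json.dumps({
    "Reasoning": "Illustrative only: keep the scene purely visual.",
    "T5": "A desolate battlefield strewn with discarded armor.",
    "CLIPG": "battlefield, silver armor, cyanide ground, scattered weapons, dark storm",
    "CLIPL": "A grim battlefield littered with silver-lined armor.",
    "Artstyle": "Dark Romanticism",
})

reasoning, prompts = split_reply(reply)
print(sorted(prompts))
```

The point of the design is that the model's scratch text lands in its own field and gets dropped, so it never leaks into what is sent to T5, CLIP-G or CLIP-L.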

remote holly Dec 12, 2024, 10:28 AM

#

interesting, i will try it and share the results

lavish sparrow Dec 12, 2024, 10:32 AM

#

pull my image for the systemprompt and nodes used tho, you need some formatting tricks to get it to work properly

#

and unfortunately qwq fails at making json if the systemprompt is too big

buoyant mesa Dec 12, 2024, 4:43 PM

#

frail lagoon can SD3 be used in a1111/forge yet?

in forge yes, but controlnet and so on don't work, so i use it in ComfyUI

devout schooner Dec 12, 2024, 6:01 PM

#

lavish sparrow yeah, i like the aesthetics of sd3.5 way better than flux

I prefer Medium personally, both because it supports higher initial generation resolutions, and also because it can actually work properly in a traditional hi-res-fix context
Like e.g. SD 1.5 was perfectly capable of genning at 512 and then refining the image at 1024 once upscaled with ESRGAN or whatever
But for some reason 3.5 Large cannot do that (or really any sort of img2img), it creates noticeable artifacting at any denoise strength
3.5 Medium however goes back to behaving like you'd expect in that context
So it's just better IMO

#

Only thing I don't like about 3.5 is that both Large and Medium lean towards extremely desaturated grayish kind of images, as though "smoke, fog, haze" was always in the positive prompt or something
In contrast 3.0 didn't have that problem at all

lavish sparrow Dec 13, 2024, 12:01 AM

#

i don't feel smoke, fog, haze are a part of 3.5l