#✨｜sdxl | Stable Diffusion | Page 151

nimble heart Oct 20, 2023, 11:09 AM

#

let me bypass the mi200 checks too

vital ermine Oct 20, 2023, 11:10 AM

#

Yeah, the dev said that, but now he says to use some other optimizers over his Prodigy because of how large XL is. Sadly, we don't have access to the ones he suggested.

fierce hollow Oct 20, 2023, 11:10 AM

#

worst case gpu just self combusts 🤣

soft bone Oct 20, 2023, 11:11 AM

#

vital ermine Yeah, the dev of it said that but now says to use some others over his Prodigy d...

i decided on prodigy after watching this
https://www.youtube.com/watch?v=QpWacUWeqbE&ab_channel=kasukanra

YouTube

kasukanra

Kasucast #18 - Stable Diffusion: Training SDXL 1.0 Art Style (Finet...

#sdxl #ComfyUI #LoRA #runpod #dadaptation #prodigy #stablediffusion #style #styletraining

This is an SDXL 1.0 training log for art style. However, the workflow is also interchangeable with SD1.5. I document my thought process, experiments, mistakes, and analysis of quantitative and qualitative results. Hopefully, this can be a good starting gui...

nimble heart Oct 20, 2023, 11:12 AM

#

okay this problem I'm too dumb to fix

#

well it didnt even compile before so progress?

vital ermine Oct 20, 2023, 11:13 AM

#

soft bone i decided on prodigy after watching this https://www.youtube.com/watch?v=QpWacUW...

Yeah, been subbed to him for a long while now. His is all anime while mine is realism so a lot just never translated well.

soft bone Oct 20, 2023, 11:13 AM

#

https://civitai.com/models/167991/loving-vincent

and i trained this with prodigy

Loving Vincent - v1.0 | Stable Diffusion LoRA | Civitai

The revival of Van Gogh's style from the film Loving Vincent is finally on XL! This model serves as a style-converter. I recommend NOT using other ...

vital ermine Oct 20, 2023, 11:14 AM

#

that's cool

soft bone Oct 20, 2023, 11:15 AM

#

it outperforms my 1.5 full dreambooth version

#

by a lot

peak dove Oct 20, 2023, 11:15 AM

#

Trying to add TensorRT as an extension in A1111 and getting this error: "AssertionError: extension access disabled because of command line flags". Found the answer: DO NOT USE --share and/or --listen in the .bat file when installing TensorRT!

vital ermine Oct 20, 2023, 11:16 AM

#

Oh, I remember you in here using that and van Gogh-ing everything, lol

soft bone Oct 20, 2023, 11:16 AM

#

lolol yes and the cats

vital ermine Oct 20, 2023, 11:16 AM

#

yes

#

It was good

soft bone Oct 20, 2023, 11:17 AM

#

vital ermine Oct 20, 2023, 11:17 AM

#

I never have much luck training loras on any SD version but DB just works for me

#

catshrek, lol

soft bone Oct 20, 2023, 11:18 AM

#

does db even run with XL?

vital ermine Oct 20, 2023, 11:19 AM

#

yes

#

The only option is to extract to LoRA. LyCORIS says it is easy to make it extract to the LoCon that I want, but it doesn't do it.

#

rustic garnet Oct 20, 2023, 11:28 AM

#

lora extraction only makes sense if you train the same layers as in lora

vital ermine Oct 20, 2023, 11:29 AM

#

Oh, it makes a whole lot of sense, and according to the people I talk to it actually does give better results. The reason people say they don't do it is that it takes longer.

#

for me that is the only way any of my loras have been made

rustic garnet Oct 20, 2023, 11:30 AM

#

it does not make sense if you train db on all layers and then only extract a subset of them

#

freeze everything except the attention layers when you want to extract lora

#

for locon you would also train the conv layers in the resnet

#

in general extracting loras afterwards can be more parameter efficient, as you can set the lora rank dynamically. But it has to be done right
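For readers following along, the extraction described here is commonly done by taking the difference between a fine-tuned weight and the base weight and keeping only its largest singular components, which is also why the rank can be chosen after training. A minimal NumPy sketch under that assumption, for a single 2-D weight matrix; `extract_lora` and its arguments are illustrative names, not the API of Kohya's scripts or LyCORIS:

```python
import numpy as np

def extract_lora(w_base, w_tuned, rank):
    """Approximate (w_tuned - w_base) with a low-rank product up @ down."""
    delta = w_tuned - w_base
    # Truncated SVD gives the best rank-k approximation of the weight change.
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    rank = min(rank, len(s))
    up = u[:, :rank] * s[:rank]   # (out_features, rank), columns scaled by singular values
    down = vt[:rank, :]           # (rank, in_features)
    return up, down

rng = np.random.default_rng(0)
w_base = rng.normal(size=(64, 32))
# Simulate a fine-tune whose weight change is exactly rank 4.
w_tuned = w_base + rng.normal(size=(64, 4)) @ rng.normal(size=(4, 32))

up, down = extract_lora(w_base, w_tuned, rank=4)
err = np.linalg.norm((w_tuned - w_base) - up @ down)   # close to zero for this toy case
up2, down2 = extract_lora(w_base, w_tuned, rank=2)
err2 = np.linalg.norm((w_tuned - w_base) - up2 @ down2)  # larger: rank 2 is too small here
```

Layers that were trained but are not covered by the extracted LoRA are simply lost, which is the point made above about training only the layers you plan to extract.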

vital ermine Oct 20, 2023, 11:32 AM

#

Yes, and that's why I have been trying to get that as an extraction option, but they are just not doing it, saying Kohya can add it and that it is easy. Well, if the dev says it is easy, who are any of us to say it isn't? All I know is that for XL it hasn't been done.

rustic garnet Oct 20, 2023, 11:32 AM

#

it is easy, just a few lines of python code

vital ermine Oct 20, 2023, 11:32 AM

#

exactly what he said too

#

Kohya just hasn't done it for some odd reason

#

I prefer Locon over lora

soft bone Oct 20, 2023, 11:44 AM

#

encanto trained well

#

vital ermine Oct 20, 2023, 11:48 AM

#

probably XL had an idea about it

soft bone Oct 20, 2023, 11:49 AM

#

nay i checked

#

vital ermine Oct 20, 2023, 11:50 AM

#

every image was totally different?

#

yep, it has a general idea

soft bone Oct 20, 2023, 11:50 AM

#

vital ermine every image was totally different?

vital ermine Oct 20, 2023, 11:51 AM

#

what's up with emma watson?

soft bone Oct 20, 2023, 11:52 AM

#

im not using "animation, disney, pixar, 3d, animated, cgi" or any styling words at all. just subject line and token. XL knows how to do pixar kinda but i make sure to keep the concept separate from that data

vital ermine Oct 20, 2023, 11:53 AM

#

btw, I am training using your info but I had to make it 2 epochs.

soft bone Oct 20, 2023, 11:53 AM

#

vital ermine what's up with emma watson?

my generation grew up with her in our favorite movies, hard not to love her

vital ermine Oct 20, 2023, 11:54 AM

#

soft bone Oct 20, 2023, 11:54 AM

#

how many imgs?

#

oh i gladly train up to 2 hours nowadays

vital ermine Oct 20, 2023, 11:54 AM

#

546

#

I had far more luck with locon in 2.1 than a lora so let's see if this trains after a bit

#

I hate how the images on civit are so small now. Someone mentioned that to me last month and yeah, it's a pita now.

#

#

far too small to really investigate

#

good, now to test

soft bone Oct 20, 2023, 12:06 PM

#

caption this

vital ermine Oct 20, 2023, 12:06 PM

#

fingers crossed

#

interesting

soft bone Oct 20, 2023, 12:08 PM

#

vital ermine Oct 20, 2023, 12:10 PM

#

#

How do I tell?

soft bone Oct 20, 2023, 12:13 PM

#

lately i just eliminate the most problematic checkpoints in the XY plot until the last 2-3 and then choose by preference

vital ermine Oct 20, 2023, 12:13 PM

#

I am thinking one of the last five.

#

how do I tell which ones are the problems?

soft bone Oct 20, 2023, 12:14 PM

#

ones with the most artifacts

vital ermine Oct 20, 2023, 12:14 PM

#

oh, I never see those

soft bone Oct 20, 2023, 12:14 PM

#

in that first one you sent, the guy's torso is glitching in half

vital ermine Oct 20, 2023, 12:15 PM

#

what I see is skeletons so that is too similar to the original

#

that first one all by itself IS base XL

#

the one on the left by itself is base XL

#

no lora

soft bone Oct 20, 2023, 12:18 PM

#

ah i see

#

vital ermine Oct 20, 2023, 12:20 PM

#

I may need to train for longer

#

more steps

soft bone Oct 20, 2023, 12:21 PM

#

freckles!

#

i may need to train a general pixar model

vital ermine Oct 20, 2023, 12:23 PM

#

I am going to train for 1500 more steps

soft bone Oct 20, 2023, 12:24 PM

#

vital ermine Oct 20, 2023, 12:25 PM

#

1092

lilac wren Oct 20, 2023, 12:34 PM

#

what hardware and settings do you guys use to train SDXL?

soft bone Oct 20, 2023, 12:36 PM

#

3090

#

settings are all over the place depending on dataset

indigo carbon Oct 20, 2023, 12:46 PM

#

vital ermine Oct 20, 2023, 12:50 PM

#

yeah, this is a failure

soft bone Oct 20, 2023, 12:54 PM

#

#

im surprised at this flexibility considering its only trained on one movie and no styling prompts are needed at all. no regularization either

#

vital ermine Oct 20, 2023, 1:01 PM

#

looks like 300 is it but all of the faces are screwed

#

#

The first one is 300 more steps over the last highest

#

It's XL, I live with janky, jacked-up trainings.

wet nacelle Oct 20, 2023, 1:43 PM

#

Do we know of any wire extension for comfy that would allow me the user to actually put a pin on a part of a wire that connects the nodes?

#

I want to just orientate the wire to my will.

#

#

vital ermine Oct 20, 2023, 1:47 PM

#

wet nacelle

Nope

wet nacelle Oct 20, 2023, 1:47 PM

#

vital ermine Nope

Do you think there is one out there?

#

vital ermine Oct 20, 2023, 1:52 PM

#

wet nacelle Do you think there is one out there?

Nope. We really need something to handle the mess of wires

wet nacelle Oct 20, 2023, 1:53 PM

#

vital ermine Nope. We really need something to handle the mess of wires

The only thing we have is the straight line one right?

vital ermine Oct 20, 2023, 1:55 PM

#

wet nacelle The only thing we have is the straight line one right?

The built-in ones, and it really messes with me when I am building. I am used to something very different, where if I grab a node the wire follows, so if I slide the node right the noodle/wire goes left and I can see where it goes. Right now it is 100% crap. Another issue is that just clicking the wire doesn't tell you where it is connected most of the time.

wet nacelle Oct 20, 2023, 1:56 PM

#

vital ermine the ones in built, and it really messes with me when I am building. I am used t...

ah okay

vital ermine Oct 20, 2023, 1:57 PM

#

peak dove Oct 20, 2023, 2:22 PM

#

ComfyUI SDXL Sytan's w/flow DynavisionXL model

vital ermine Oct 20, 2023, 2:23 PM

#

peak dove Oct 20, 2023, 2:24 PM

#

vital ermine Oct 20, 2023, 2:25 PM

#

wet nacelle Oct 20, 2023, 2:28 PM

#

vital ermine Oct 20, 2023, 2:32 PM

#

icy brook Oct 20, 2023, 2:32 PM

#

https://civitai.com/models/170188/aether-bubbles-and-foam-lora-for-sdxl

Aether Bubbles & Foam - LoRA for SDXL - v1.0 | Stable Diffusion LoR...

This is A e t h e r B u b b l e s & F o a m - a bath foam and soap bubble texture based subject and object oriented LoRA trained on animals, pe...

#

It's up!

#

vital ermine Oct 20, 2023, 2:33 PM

#

#

wet nacelle Oct 20, 2023, 2:40 PM

#

vital ermine Oct 20, 2023, 2:41 PM

#

#

glass notch Oct 20, 2023, 2:46 PM

#

Is it normal for images to not look that great with SDXL compared to 1.5 upscaled? Here are my settings: absurdres, high quality, top quality, a colorful parrot flying in a mangrove jungle
Negative prompt: easynegative, worst quality
Steps: 40, Sampler: DPM++ 2S a Karras, CFG scale: 7, Seed: 1338516404, Size: 1344x768, Model hash: e6bb9ea85b, Model: sd_xl_base_1.0_0.9vae, Clip skip: 2, ENSD: 31337, Refiner: sd_xl_refiner_1.0 [7440042bbd], Refiner switch at: 0.8, Version: v1.6.0. I am running lowvram (due to 6GB) no half vae and xformers arguments

vital ermine Oct 20, 2023, 2:46 PM

#

#

@glass notch your prompt

#

I did not use anything but the positive prompt from what you posted

glass notch Oct 20, 2023, 2:50 PM

#

It's clear that it's easier to prompt concepts but from your own example it does seem that the quality I got is to be expected. Thanks for doing the test

vital ermine Oct 20, 2023, 2:52 PM

#

Welcome

#

#

#

wet nacelle Oct 20, 2023, 3:56 PM

#

crisp owl Oct 20, 2023, 3:57 PM

#

Freyja

wet nacelle Oct 20, 2023, 4:00 PM

#

crisp owl Oct 20, 2023, 4:00 PM

#

Freyja after acquiring Brisingamen, yet unable to find joy again

indigo carbon Oct 20, 2023, 4:09 PM

#

wet nacelle

I never trusted a banking agency ever since Sberbank did what they did

#

Those MFs made Kandinsky

wet nacelle Oct 20, 2023, 4:11 PM

#

indigo carbon Those MFs made Kandinsky

That's fucking wacky as hell

indigo carbon Oct 20, 2023, 4:12 PM

#

wet nacelle That's fucking wacky as hell

It is, it also came out of nowhere.. they just decided to pour money into training a pixel diffusion model

wet nacelle Oct 20, 2023, 4:12 PM

#

indigo carbon It is, it also came out of nowhere.. they just decided to pour money into traini...

And an LM right?

indigo carbon Oct 20, 2023, 4:12 PM

#

Though that model had image blending capabilities somehow

wet nacelle Oct 20, 2023, 4:12 PM

#

indigo carbon Oct 20, 2023, 4:13 PM

#

wet nacelle And an LM right?

Unsure, they're sketch af

#

I'm also not sure if Kandinsky can blend images due to being pixel diffusion or maybe because it uses ViT-14 or something like that

wet nacelle Oct 20, 2023, 4:14 PM

#

indigo carbon Unsure, they're sketch af

https://www.reuters.com/technology/russias-sberbank-releases-chatgpt-rival-gigachat-2023-04-24/

Reuters

Russia's Sberbank releases ChatGPT rival GigaChat

Russian lender Sberbank said on Monday it had released technology called GigaChat as a rival to ChatGPT, initially in an invite-only testing mode, joining the artificial intelligence chatbot race.

#

not much came out from it from what I can tell.

indigo carbon Oct 20, 2023, 4:15 PM

#

Yeah, I hate them. They made a model that can blend images and never explained what causes it to be able to do so

#

Oh that's it. "Image encoder: ViT-bigG-14-laion2B-39B-b160k"

#

So it can have image input due to having both a text encoder and an image encoder?

rustic garnet Oct 20, 2023, 4:32 PM

#

indigo carbon I'm also not sure if Kandinsky can blend images due to being pixel diffusion or ...

because they are trained on clip image embeddings - it's as easy as that

rustic garnet Oct 20, 2023, 4:34 PM

#

indigo carbon So it can have image input due to having both a text encoder and an image encode...

SDXL has that, too, by the way. But they always train on text and clip embedding at the same time. Thus, providing only images but no text is probably less effective in SDXL than in Kandinsky

#

I'm not so interested in image blending, so I haven't tested whether image blending in SDXL works as well as in Kandinsky
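As a rough illustration of why a model conditioned on CLIP image embeddings can blend images: two embeddings can be interpolated into one vector before being handed to the model as conditioning. The sketch below uses random stand-in vectors; the vector size and the `blend_embeddings` helper are assumptions for illustration, not Kandinsky's actual code:

```python
import numpy as np

def blend_embeddings(emb_a, emb_b, weight=0.5):
    """Linearly mix two embeddings after normalizing, then renormalize."""
    a = emb_a / np.linalg.norm(emb_a)
    b = emb_b / np.linalg.norm(emb_b)
    mixed = (1.0 - weight) * a + weight * b
    return mixed / np.linalg.norm(mixed)

rng = np.random.default_rng(0)
emb_cat = rng.normal(size=1280)        # stand-in for an image embedding
emb_painting = rng.normal(size=1280)   # stand-in for a second image embedding

blended = blend_embeddings(emb_cat, emb_painting, weight=0.5)

# At weight 0.5 the result is equally similar to both inputs.
sim_cat = float(blended @ (emb_cat / np.linalg.norm(emb_cat)))
sim_painting = float(blended @ (emb_painting / np.linalg.norm(emb_painting)))
```

This only works well if the model has been trained to generate from such embeddings on their own, which is the distinction being drawn above between Kandinsky and SDXL.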

half ivy Oct 20, 2023, 4:36 PM

#

looking for masking tool for auto, asking for a friend o0

wet nacelle Oct 20, 2023, 4:36 PM

#

#

strong copper Oct 20, 2023, 4:50 PM

#

wet nacelle Oct 20, 2023, 5:08 PM

#

cyan crown Oct 20, 2023, 5:13 PM

#

wet nacelle https://www.reuters.com/technology/russias-sberbank-releases-chatgpt-rival-gigac...

I use chatgpt, but I also like local Vicuna 13B

wet nacelle Oct 20, 2023, 5:14 PM

#

cyan crown I use chatgpt, but I also like local Vicuna 13B

k

#

#

lilac wren Oct 20, 2023, 5:42 PM

#

13188-1980921765-a_woman__lora_LeahLora_v4_SDXL_1__circle_lenses_dressed_in_mini_skirt_and_red_tank_top_cell_Shaded_amazing_anime_scenery_wal.png

wet nacelle Oct 20, 2023, 5:45 PM

#

#

cyan crown Oct 20, 2023, 5:54 PM

#

wet nacelle

is it a Lora?

wet nacelle Oct 20, 2023, 5:55 PM

#

cyan crown is it a Lora?

No. It's base sdxl 1.0

#

cyan crown Oct 20, 2023, 5:55 PM

#

prompt?

lilac wren Oct 20, 2023, 6:13 PM

#

lapis gale Oct 20, 2023, 6:20 PM

#

252191-524893542-comic_cool_cow_._graphic_illustration_comic_art_graphic_novel_art_vibrant_highly_detailed.jpg

#

#

252141-618968407-a_perfect_birthday_cake_for_a_heavy_metal_computer_scientist_programmer_grimdark_exquisite_details.jpg

#

just a bit of an SDXL variety dump 🙂

251927-2518358369-Surrealist_art_the_delorean_time_machine_in_a_1960s_hippie_community_._Dreamlike_mysterious_provocative_symbolic_intri.jpg

lilac wren Oct 20, 2023, 6:28 PM

#

icy brook Oct 20, 2023, 6:28 PM

#

crisp owl Oct 20, 2023, 6:45 PM

#

indigo carbon Oct 20, 2023, 6:54 PM

#

rustic garnet SDXL has that, too, by the way. But they always train on text and clip embeddin...

No, SDXL won't have image input unless you use it with IPAdapter

#

Or ControlNet?

#

Blending is definitely not a capability without IPAdapter, many people have tested and come to that conclusion

#

@rustic garnet are you referring to the CLiP_vision as an image encoder? Because that's not really a part of SDXL.. SDXL just has 2 text encoders, no image encoder according to HF

#

With using CLiP_vision, SDXL can have image input, I only tested this with 2 inputs for blending and it wasn't coherent at all.. it was able to blend the 2 images only one out of like 30 times

#

Haven't tested that method with 1 input, so idk how that behaves

rustic garnet Oct 20, 2023, 7:08 PM

#

indigo carbon @rustic garnet are you referring to the CLiP_vision as an image encoder? ...

clip vision is part of clip which is part of sdxl

indigo carbon Oct 20, 2023, 7:09 PM

#

rustic garnet clip vision is part of clip which is part of sdxl

it doesn't work though

rustic garnet Oct 20, 2023, 7:09 PM

#

indigo carbon With using CLiP_vision, SDXL can have image input, I only tested this with 2 inp...

yes, as said, sdxl is not trained on clip pooled only input. That's why it's not as good for that. But this is a problem of the training not the model itself

indigo carbon Oct 20, 2023, 7:10 PM

#

the outputs with image input using CLiP vision have nothing to do with the input

rustic garnet Oct 20, 2023, 7:10 PM

#

you could fine-tune sdxl on that - but honestly, ipadapter is the better solution anyways

lapis gale Oct 20, 2023, 7:11 PM

#

spelling needs a bit of work, but the cat pumpkins are nice

252187-3445351136-Flat_papercut_style_sign_saying__sample_some_text__a_cute_anime_style_woman_with_a_big_halloween_pumpkin_evil_s.jpg

vale eagle Oct 20, 2023, 7:11 PM

#

https://github.com/PixArt-alpha/PixArt-alpha

GitHub

GitHub - PixArt-alpha/PixArt-alpha: Fast Training of Diffusion Tran...

Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis - GitHub - PixArt-alpha/PixArt-alpha: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synth...

indigo carbon Oct 20, 2023, 7:11 PM

#

rustic garnet you could fine-tune sdxl on that - but honestly, ipadapter is the better solutio...

IPAdapter causes the quality to degrade by a lot most of the time, I have plenty of experience with that thing

#

it can blend images, but again, nothing like the quality you normally get out of SDXL

vale eagle Oct 20, 2023, 7:14 PM

#

PixArt Alpha used a dalle3-like structure and it might be open-sourced soon.

wet nacelle Oct 20, 2023, 7:15 PM

#

indigo carbon Oct 20, 2023, 7:16 PM

#

vale eagle PixArt Alpha used a dalle3-like structure and it might be open-sourced soon.

when showcasing they said "can be as good as SDXL and even midjourney" which is a huge red flag since SDXL itself already is better than midjourney

wet nacelle Oct 20, 2023, 7:16 PM

#

lusty wolf Oct 20, 2023, 7:18 PM

#

Hair is clumpy? I think it is FreeU?

wet nacelle Oct 20, 2023, 7:19 PM

#

indigo carbon Oct 20, 2023, 7:21 PM

#

lusty wolf Hair is clumpy? I think it is FreeU?

does FreeU even improve anything on SDXL? I never tested it

vale eagle Oct 20, 2023, 7:22 PM

#

FreeU does help in some cases but not a general solution

lusty wolf Oct 20, 2023, 7:22 PM

#

indigo carbon does FreeU even improve anything on SDXL? I never tested it

It does, but it's very finicky: better eyes, face and hands, and lots of detail...

indigo carbon Oct 20, 2023, 7:24 PM

#

idk, the images I generate already have exceptional quality imo, I'll try it, but I doubt it'll improve anything

cyan crown Oct 20, 2023, 7:26 PM

#

XL_More_ART Lora does a good job improving image quality

#

glass notch Oct 20, 2023, 7:37 PM

#

tribal lantern Oct 20, 2023, 7:37 PM

#

vale eagle PixArt Alpha used a dalle3-like structure and it might be open-sourced soon.

kinda different though, dalle and sdxl have a unet, this one has a transformer structure. the similarity with dalle is the t5 text encoder and automatic captioning to generate better captions. but all in all pixart seems more innovative than dalle-3. unless the dalle paper leaves out lots of details, it seemed kind of ordinary to me: just dalle++, more training, better data, and go with it. even sdxl did more, with two mixed clip encoders, aesthetic score, size in training data. might be because llms are openai's core (and that makes it all the weirder they kept unet vs transformers)

cyan crown Oct 20, 2023, 7:40 PM

#

ionic dragon Oct 20, 2023, 7:42 PM

#

Suggest a cool concept to train a lora on?

cyan crown Oct 20, 2023, 7:43 PM

#

if you were Italian I would like Dylan Dog comics

stone fossil Oct 20, 2023, 7:49 PM

#

Hackerman SD XL 1.0: https://civitai.com/models/170538?modelVersionId=191617

Chalkify SD XL 1.0: https://civitai.com/models/170557?modelVersionId=191642

Hackerman SD XL 1.0 - Hackerman SD XL 1.0 | Stable Diffusion LoRA |...

Introducing Hackerman SD XL 1.0, the LoRa that transcends ordinary transformations. Join the digital revolution, as Hackerman SD XL 1.0 unleashes t...

Chalkify SD XL 1.0 - Chalkify SD XL 1.0 | Stable Diffusion LoRA | C...

Introducing Chalkify SD XL 1.0, a revolutionary LoRa (Text to Image) model designed to immerse your images in the tender and enchanting world of so...

cyan crown Oct 20, 2023, 7:54 PM

#

glass notch Oct 20, 2023, 7:54 PM

#

ionic dragon Suggest a cool concept to train a lora on?

Cuneiform tablets

#

wet nacelle Oct 20, 2023, 7:59 PM

#

eternal fog Oct 20, 2023, 8:07 PM

#

There's been a few recent updates to both ComfyUI and the IPAdapter Nodes and it's sorted out the memory efficiency a lot.

I used to constantly go over the VRAM limit with 10GB and get slowdowns, now it's so much better. I can even run multiple controlnets with it and get no slowdowns.

cyan crown Oct 20, 2023, 8:14 PM

#

#

wet nacelle Oct 20, 2023, 8:25 PM

#

indigo carbon Oct 20, 2023, 8:26 PM

#

eternal fog There's been a few recent updates to both ComfyUI and the IPAdapter Nodes and it...

also AITemplate is about to be officially supported in ComfyUI

eternal fog Oct 20, 2023, 8:27 PM

#

The custom node broke for me a bit ago, so I just stopped trying

indigo carbon Oct 20, 2023, 8:27 PM

#

modules and code from the previous AITemplate repo for ComfyUI have been salvaged and are slowly being implemented into a new one made by Comfy and FizzleDorf

cyan crown Oct 20, 2023, 8:28 PM

#

hoary saddle Oct 20, 2023, 8:42 PM

#

indigo carbon modules and code from the previous AITemplate repo for ComfyUI had been salvaged...

which AIT nodes are you using now, the legacy set or the newest release in the manager?

indigo carbon Oct 20, 2023, 8:43 PM

#

hoary saddle which AIT nodes are you using now, the legacy set or the newest release in the m...

The old ones

#

With pure txt2img I'm still using a commit from a month ago, that's the fastest on my 4070ti

fierce hollow Oct 20, 2023, 8:44 PM

#

I can't get the new ones working at all 🥲

hoary saddle Oct 20, 2023, 8:44 PM

#

gotcha

#

fizzledorf i assume

indigo carbon Oct 20, 2023, 8:45 PM

#

hoary saddle fizzledorf i assume

What about him?

hoary saddle Oct 20, 2023, 8:45 PM

#

#

from the legacy branch in manager

indigo carbon Oct 20, 2023, 8:45 PM

#

hoary saddle

Yeah, that's the old one

#

Both the new one and the old one use the modules I made though 😛

hoary saddle Oct 20, 2023, 8:46 PM

#

cool, gonna remove this new one and pull the old back in

#

oh, sweet

fierce hollow Oct 20, 2023, 8:46 PM

#

oh I did get the new one working now after all, had to compile aitemplate manually... but the git patch still doesn't apply so not sure what's that about

indigo carbon Oct 20, 2023, 8:47 PM

#

fierce hollow oh I did get the new one working now after all, had to compile aitemplate manual...

Comfy said the patch will be included in upcoming ComfyUI versions by default

fierce hollow Oct 20, 2023, 8:48 PM

#

I assume that's already happened then(?)

rustic garnet Oct 20, 2023, 8:48 PM

#

tribal lantern kinda different though, dalle and sdxl have a unet, this one has a transformer s...

unet has transformers, too 😉
I'm not so sure if using transformers only is really an advantage...

indigo carbon Oct 20, 2023, 8:48 PM

#

fierce hollow I assume that's already happened then(?)

Nope, not yet afaik

#

It will eventually though

fierce hollow Oct 20, 2023, 8:49 PM

#

well the patch just says

error: patch failed: comfy/ldm/modules/attention.py:91
error: comfy/ldm/modules/attention.py: patch does not apply
error: patch failed: comfy/ldm/modules/diffusionmodules/openaimodel.py:370
error: comfy/ldm/modules/diffusionmodules/openaimodel.py: patch does not apply

indigo carbon Oct 20, 2023, 8:49 PM

#

And that'll be what everyone will probably use until exDiffusion comes around

fierce hollow Oct 20, 2023, 8:49 PM

#

so I thought it has nothing to change in there, but I guess it's probably because it was made for some previous commit

indigo carbon Oct 20, 2023, 8:50 PM

#

fierce hollow so I thought it has nothing to change in there, but I guess it's probably becaus...

It's not compatible with any commit, Comfy needs to do some stuff on his end for that

#

And he will, eventually

fierce hollow Oct 20, 2023, 8:50 PM

#

right

indigo carbon Oct 20, 2023, 8:51 PM

#

It should be as fast as the old commits are, so that's probably what everyone will use for a while

#

Well, until people figure out how to make optimized kernels instead of engines for diffusion

#

Much like happened with LLaMa

fierce hollow Oct 20, 2023, 8:55 PM

#

yeah umm I still don't see that happening, openai released their dall-e paper though which is nice

#

shows that a good model should probably be trained on something better than laion captions for starters

indigo carbon Oct 20, 2023, 8:58 PM

#

fierce hollow yeah umm I still don't see that happening, openai released their dall-e paper th...

How does making optimized kernels have anything to do with OpenAI?

fierce hollow Oct 20, 2023, 8:58 PM

#

no just thinking that maybe let's get a solid model or something before making it super fast

#

much like everybody started doing stuff with llms only after llama was released

vale eagle Oct 20, 2023, 8:59 PM

#

no

indigo carbon Oct 20, 2023, 8:59 PM

#

fierce hollow no just thinking that maybe let's get a solid model or something before making i...

SDXL isn't a solid model? I know it only does text input, but it pretty much masters it

vale eagle Oct 20, 2023, 9:00 PM

#

The next model is a better model

#

You can't wait for the best model to do stuff

fierce hollow Oct 20, 2023, 9:00 PM

#

compared to dall-e 3, it could certainly use some work

#

(inb4 somebody posts 'but look at this cool image, dalle can't do it', but then dall-e draws 3 people with exact specified shirt colors)

indigo carbon Oct 20, 2023, 9:01 PM

#

fierce hollow compared to dall-e 3, it could certainly use some work

I often compare SDXL to DALL-E 3, SDXL is usually way better with the quality of the images. The reason DALL-E 3 has a better understanding of language is due to using a better encoder

fierce hollow Oct 20, 2023, 9:03 PM

#

it's still t5 (how many params remains a mystery), their captions are just better

indigo carbon Oct 20, 2023, 9:03 PM

#

fierce hollow it's still t5 (how many params remains a mystery), their captions are just bette...

It's not just the captions, the model itself has some kind of LLM to handle inputs

fierce hollow Oct 20, 2023, 9:04 PM

#

if you look at the paper they sort of mostly explain that, basically user = dumb, so they process the prompts to match the ones the model was trained on

vale eagle Oct 20, 2023, 9:04 PM

#

user's prompt -> gptv4 -> descriptive prompt (which matches the format the model was trained on)
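The flow sketched in the message above can be written as a small wrapper that sends the short user prompt through a language model before it reaches the image model. This is only a sketch of the idea: `upsample_prompt`, `REWRITE_TEMPLATE`, and `fake_llm` are hypothetical names, and `fake_llm` is a stand-in so the example runs without any real API.

```python
from typing import Callable

# Hypothetical instruction template; the wording is an assumption.
REWRITE_TEMPLATE = (
    "Rewrite the following image request as one detailed, descriptive caption:\n"
    "{prompt}"
)

def upsample_prompt(user_prompt: str, llm: Callable[[str], str]) -> str:
    """Turn a short user prompt into a caption-style prompt via a language model."""
    instruction = REWRITE_TEMPLATE.format(prompt=user_prompt.strip())
    return llm(instruction).strip()

def fake_llm(instruction: str) -> str:
    """Stand-in for a real model call: wraps the request in caption-like detail."""
    request = instruction.splitlines()[-1]
    return f"A detailed, well-lit photograph of {request}, sharp focus, natural colors"

descriptive = upsample_prompt("  a colorful parrot in a mangrove jungle ", fake_llm)
```

In a real setup the `llm` argument would call an actual language model, and `descriptive` would be passed to the image model in place of the raw user prompt.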

indigo carbon Oct 20, 2023, 9:04 PM

#

The model is also a pixel diffusion model, which is losing points in my book

whole kettle Oct 20, 2023, 9:05 PM

#

Well you kind of have to read tokens.json to really understand what to prompt it with yeah.

vale eagle Oct 20, 2023, 9:05 PM

#

It is a latent diffusion model

indigo carbon Oct 20, 2023, 9:06 PM

#

vale eagle It is a latent diffusion model

DALL-E 3? Nope, you can easily tell it's pixel diffusion by looking at the graininess of smaller details

tribal lantern Oct 20, 2023, 9:06 PM

#

thanks, got confused because the pixart paper claimed to be new in having transformers instead of unet. Whether it's better or worse remains to be seen.

vale eagle Oct 20, 2023, 9:06 PM

#

fierce hollow Oct 20, 2023, 9:06 PM

#

if you make your prompt too long you can see the vae bleeding through...

indigo carbon Oct 20, 2023, 9:07 PM

#

Well, if DALL-E 3 is latent diffusion it's definitely not a good one at that; the most common symptom of pixel diffusion is graininess - which DALL-E 3 has

tribal lantern Oct 20, 2023, 9:08 PM

#

What irks me about these new text-encoder-heavy models (pixart now, but also deepfloyd/imagen) is that the text encoder is larger than the latent model.

fierce hollow Oct 20, 2023, 9:08 PM

#

you can offload/quantize the encoder so it's really not that huge of a deal
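To make the quantization idea concrete, here is the simplest form of it: storing weights as int8 with a single absmax scale, which cuts memory to a quarter of float32 at the cost of some rounding error. This is a toy sketch with assumed helper names (`quantize_int8`, `dequantize`) and it assumes the weights are not all zero; libraries used for real text encoders apply more refined schemes.

```python
import numpy as np

def quantize_int8(weights):
    """Map float weights to int8 using one absmax scale for the whole tensor."""
    scale = np.abs(weights).max() / 127.0   # assumes at least one nonzero weight
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float32 weights from the int8 values."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)

q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
max_err = float(np.abs(w - w_hat).max())   # bounded by about half a quantization step
```

Offloading is the complementary trick: since the text encoder only runs once per prompt, it can sit in system RAM and be moved to the GPU only while encoding.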

indigo carbon Oct 20, 2023, 9:09 PM

#

Speaking of, does DALL-E 3 have an image encoder?

tribal lantern Oct 20, 2023, 9:09 PM

#

that was my first thought as well, but since it's not done yet, maybe it's not so simple

vale eagle Oct 20, 2023, 9:10 PM

#

Highly descriptive captions are the key to improving prompt following.

fierce hollow Oct 20, 2023, 9:10 PM

#

I don't know about pixart but imagen encoder can be quantized with bitsandbytes (the results are terrible but that's a separate matter I guess)

indigo carbon Oct 20, 2023, 9:12 PM

#

The main weakness of SDXL is the lack of an image encoder, so I'm assuming future versions of SD will also have image conditioning capabilities?

vale eagle Oct 20, 2023, 9:13 PM

#

I don't see image encoder mentioned in Dalle3 paper.

indigo carbon Oct 20, 2023, 9:13 PM

#

vale eagle I don't see image encoder mentioned in Dalle3 paper.

Damn, this means DALL-E 3 won't have image input without the assistance of something like IPAdapter

vale eagle Oct 20, 2023, 9:13 PM

#

But Dalle3 is highly connected to GPT4V, which is able to accept image input

tribal lantern Oct 20, 2023, 9:13 PM

#

main weakness i'd say is prompt following, image input is nice to have too of course

tribal lantern Oct 20, 2023, 9:14 PM

#

vale eagle But Dalle3 is highly connected to GPT4V, which is able to accept image input

is it though?

indigo carbon Oct 20, 2023, 9:14 PM

#

tribal lantern main weakness i'd say is prompt following, image input is nice to have too of co...

That's just CLiP being a bottleneck

tribal lantern Oct 20, 2023, 9:14 PM

#

Maybe clip's a bottle neck

vale eagle Oct 20, 2023, 9:14 PM

#

GPT4V handles user input and makes the prompt for Dalle3

tribal lantern Oct 20, 2023, 9:14 PM

#

maybe it's a strength too for styles/aesthetics

#

dalle-3 goes out of its way to follow weird/contradicting prompts for me leading to ugly images it seems

indigo carbon Oct 20, 2023, 9:16 PM

#

vale eagle But Dalle3 is highly connected to GPT4V, which is able to accept image input

Gluing together DALL-E 3 and gpt 4 is just a method of censoring the model without choking the training

tribal lantern Oct 20, 2023, 9:16 PM

#

sdxl's understanding of prompts is beautifully abstract in a way

vale eagle Oct 20, 2023, 9:16 PM

#

And Idea2Img makes use of GPT4V to take the image input and make a prompt for SDXL

#

indigo carbon Oct 20, 2023, 9:18 PM

#

I'm still stumped on the idea of DALL-E 3 being a latent diffusion, it has such a pixel diffusion look to it

wet nacelle Oct 20, 2023, 9:18 PM

#

indigo carbon Oct 20, 2023, 9:21 PM

#

Anyways, I think SDXL will have a better understanding of language if the encoder won't be CLiP, and switching from CLiP to something else will also open the opportunity to also have an image encoder

rustic garnet Oct 20, 2023, 9:27 PM

#

indigo carbon Anyways, I think SDXL will have a better understanding of language if the encode...

argh, no, it doesn't

#

this has really NOTHING to do with the text encoder

#

in fact, using CLIP is the best thing you can do if you want to easily use input images

wet nacelle Oct 20, 2023, 9:27 PM

#

rustic garnet argh, no, it doesn't

https://tenor.com/view/i-know-your-ip-address-we-know-your-ip-address-im-outside-your-house-meme-memes-gif-23579728

Tenor

rustic garnet Oct 20, 2023, 9:28 PM

#

if you want a model that can take images as input you have to train it so that it also accepts images as input (at least in a certain % of the cases)

nimble heart Oct 20, 2023, 9:28 PM

#

lotta new diffusers use t5 which is apache licensed

rustic garnet Oct 20, 2023, 9:28 PM

#

SAI haven't done this, probably they thought image inputs are not important anyways

indigo carbon Oct 20, 2023, 9:29 PM

#

nimble heart lotta new diffusers use t5 which is apache licensed

the only ones I've seen do that is PixArt

rustic garnet Oct 20, 2023, 9:29 PM

#

and DeepFloyd

nimble heart Oct 20, 2023, 9:29 PM

#

thought dalle did too?

#

used t5xxl

indigo carbon Oct 20, 2023, 9:30 PM

#

DeepFloyd is pixel diffusion, making it somewhat irrelevant

nimble heart Oct 20, 2023, 9:30 PM

#

https://cdn.openai.com/papers/dall-e-3.pdf

nimble heart Oct 20, 2023, 9:30 PM

#

dalle 3 uses t5 it seems

indigo carbon Oct 20, 2023, 9:31 PM

#

so T5 is the way, huh?

#

does that prevent a model from also having image input? because DALL-E 3 doesn't have image input it seems

nimble heart Oct 20, 2023, 9:31 PM

#

indigo carbon the only ones I've seen do that is PixArt

pixart is hilarious. the text encoder is 10x the size of the actual model

#

No idea. I just briefly skimmed the paper to confirm the T5 thing

#

I imagine you could just use a non-zero latent same as other diffusion models

indigo carbon Oct 20, 2023, 9:33 PM

#

nimble heart pixart is hilarious. the text encoder is 10x the size of the actual model

that encoder seems to do a better job than CLiP though

fierce hollow Oct 20, 2023, 9:33 PM

#

has pixart uploaded their models yet? I can only find links to t5 and vae, and the hf space is dead

indigo carbon Oct 20, 2023, 9:33 PM

#

I think since encoders are simpler than UNets they can be quantized, no?

nimble heart Oct 20, 2023, 9:33 PM

#

indigo carbon that encoder seems to do a better job than CLiP though

yea I just find it funny that it's the complete backwards approach to SD with a small encoder and a phat unet

vale eagle Oct 20, 2023, 9:34 PM

#

fierce hollow has pixart uploaded their models yet? I can only find links to t5 and vae, and t...

haven't yet. Might be soon

nimble heart Oct 20, 2023, 9:34 PM

#

fierce hollow has pixart uploaded their models yet? I can only find links to t5 and vae, and t...

they have a python script to run their pixart alpha model on GitHub as of like 2 days ago but I couldn't make it work

#

also no safetensors format or diffusers pipeline so sus

indigo carbon Oct 20, 2023, 9:35 PM

#

one red flag about PixArt, they say it's good at photorealism, but wtf about all the other stuff?

rustic garnet Oct 20, 2023, 9:35 PM

#

tribal lantern thanks, got cofused because the pixart paper claimed to be new in having transfo...

yes. they use diffusion transformers instead of unet. I just say, the unet is ALSO a transformer. A unet is basically a combination of transformers with convolutional residual networks. The transformers in the unet work on the latent pixel space (thus, they are expensive) and the convolution is necessary to get the spatial relationship between the pixels right. Diffusion transformers instead split the image into large blocks and then use transformers on these blocks instead of on the latent pixels. This is cheaper. The question is, though, whether this is also better. I somehow doubt that.
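A rough sketch of the cost argument above, comparing one attention token per latent pixel with one token per block. The 128x128 latent (a 1024px image through an 8x VAE) and the 2x2 block size are illustrative assumptions, not figures quoted in the chat, and `token_count` / `attention_pairs` are hypothetical helper names.

```python
def token_count(height, width, patch=1):
    """Number of attention tokens when a (height x width) latent is cut into patch x patch blocks."""
    if height % patch or width % patch:
        raise ValueError("latent size must be divisible by the patch size")
    return (height // patch) * (width // patch)


def attention_pairs(tokens):
    """Self-attention compares every token with every token, so cost grows with tokens squared."""
    return tokens * tokens


# Assumed sizes for illustration only.
LATENT = 128   # latent height/width for a 1024px image with an 8x VAE
PATCH = 2      # assumed block size for the transformer

per_pixel = token_count(LATENT, LATENT, patch=1)
per_block = token_count(LATENT, LATENT, patch=PATCH)

print("tokens per latent pixel:", per_pixel)
print("tokens per block:", per_block)
print("attention cost ratio:", attention_pairs(per_pixel) // attention_pairs(per_block))
```

Under these assumptions, quartering the token count cuts the pairwise attention work by a factor of 16. That says nothing about image quality, which is the open question raised in the message.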

nimble heart Oct 20, 2023, 9:36 PM

#

once they actually make a safetensors version and it gets a diffusers pipeline it'll be easy to compare

#

it's licensed under AGPL which is cool

fierce hollow Oct 20, 2023, 9:37 PM

#

indigo carbon one red flag about PixArt, they say it's good at photorealism, but wtf about all...

if their example images are to be believed it's good at a lot of styles

whole kettle Oct 20, 2023, 9:37 PM

#

rustic garnet yes. they use diffusion transformers instead of unet. I just say, the unet is AL...

It probably is if sorted correctly. Group abstraction is very useful.

nimble heart Oct 20, 2023, 9:37 PM

#

fierce hollow if their example images are to be believed it's good at a lot of styles

i never believe examples

tribal lantern Oct 20, 2023, 9:38 PM

#

using something like t5 seems excessive to me, a text to image model shouldn't need the capabilities of something like t5 (it can do text to text, eg translation, that's crazy if there's proper captioning in one language only), it just needs to understand things like how tokens are related and how x inside z is interpreted

nimble heart Oct 20, 2023, 9:38 PM

#

could be an nvidia "the 4060 gets twice the performance as the 3060" example

rustic garnet Oct 20, 2023, 9:39 PM

#

indigo carbon that encoder seems to do a better job than CLiP though

as mentioned earlier: a good text encoder doesn't help you if your captions are bad. The good thing about CLIP is that it is extremely robust even with bad captions. The key is to use good captions with a good text encoder, and that is only possible if you improve the captions using some powerful LLM and multimodal models (like BLIP or LLaVA)

nimble heart Oct 20, 2023, 9:39 PM

#

maybe one of you guys will get it working
https://github.com/PixArt-alpha/PixArt-alpha

GitHub

GitHub - PixArt-alpha/PixArt-alpha: Fast Training of Diffusion Tran...

Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis - GitHub - PixArt-alpha/PixArt-alpha: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synth...

rustic garnet Oct 20, 2023, 9:39 PM

#

but I agree, now that we have these good multimodal models we could replace clip

nimble heart Oct 20, 2023, 9:40 PM

#

but yea kinda funny. The mid size T5 with a tiny diffusion model

indigo carbon Oct 20, 2023, 9:40 PM

#

I just looked at the examples, they also have the graininess I was talking about

tribal lantern Oct 20, 2023, 9:40 PM

#

pixart is trained on almost nothing though, can't help wonder what happens if more images are fed to it

nimble heart Oct 20, 2023, 9:41 PM

#

sometimes small highly curated datasets can do better than a billion bits of trash

indigo carbon Oct 20, 2023, 9:41 PM

#

tribal lantern pixart is trained on almost nothing though, can't help wonder what happens if mo...

yeah, maybe if we'll get an SDXL2 that's trained with T5 it won't have that graininess to it?

nimble heart Oct 20, 2023, 9:41 PM

#

how curated their set actually is though idk

vale eagle Oct 20, 2023, 9:41 PM

#

tribal lantern pixart is trained on almost nothing though, can't help wonder what happens if mo...

It was trained in multiple stages

nimble heart Oct 20, 2023, 9:42 PM

#

also it uses the SD 1.5 vae which is funny

#

i mean if it's not broke dont fix it I guess

fierce hollow Oct 20, 2023, 9:43 PM

#

I feel like dall-e has a really good vae compared to any sd model, it's evident when trying to feed the images into sd - like half the details are lost

indigo carbon Oct 20, 2023, 9:43 PM

#

fierce hollow I feel like dall-e has a really good vae compared to any sd model, it's evident ...

Dall-e suffers from the graininess syndrome it seems

nimble heart Oct 20, 2023, 9:43 PM

#

idk im a bit sus of pixart. They constantly bring up carbon emissions and hardly ever mention inference performance/quality

fierce hollow Oct 20, 2023, 9:44 PM

#

all the models have some grain to them, idk why you keep saying that

rustic garnet Oct 20, 2023, 9:44 PM

#

fierce hollow I feel like dall-e has a really good vae compared to any sd model, it's evident ...

when I feed images in to SD VAE I have to zoom in to see any difference...

whole kettle Oct 20, 2023, 9:44 PM

#

nimble heart idk im a bit sus of pixart. They constantly bring up carbon emissions and hardly...

Don't hate the player hate the game.

indigo carbon Oct 20, 2023, 9:44 PM

#

fierce hollow all the models have some grain to them, idk why you keep saying that

SDXL doesn't have nearly as much graininess

nimble heart Oct 20, 2023, 9:45 PM

#

whole kettle Don't hate the player hate the game.

what does that even mean in this context

tribal lantern Oct 20, 2023, 9:46 PM

#

for any model, using it is being able to tell whether it's good, cherry picked images say nothing, deepfloyd seemed promising, but i never managed to get anything remotely decent out of it that wasn't similar to "[subject] holding a sign with the text 'wtfbbq this is next-level'"

nimble heart Oct 20, 2023, 9:47 PM

#

also deepfloyd OOM's on my 24gig card...

#

it follows prompts decently well but the end result looks like hot garbage

indigo carbon Oct 20, 2023, 9:48 PM

#

so maybe it could be possible to train something like SDXL with T5 and quantize the encoder? that seems like a logical way to go

nimble heart Oct 20, 2023, 9:48 PM

#

why quantize the encoder

indigo carbon Oct 20, 2023, 9:48 PM

#

because it's 6B

rustic garnet Oct 20, 2023, 9:48 PM

#

tribal lantern for any model, using it is being able to tell whether it's good, cherry picked i...

yeah, I agree 😦

nimble heart Oct 20, 2023, 9:48 PM

#

T5 is small enough to run on 8G cards isnt it?

indigo carbon Oct 20, 2023, 9:49 PM

#

indigo carbon because it's 6B

4.3B*

nimble heart Oct 20, 2023, 9:49 PM

#

just swap the encoder and unet from ram->vram

rustic garnet Oct 20, 2023, 9:49 PM

#

in my opinion quantizing the encoder would totally make sense. As Aliquip mentioned: the T5 model is way too heavy for the image caption problem anyways

nimble heart Oct 20, 2023, 9:50 PM

#

quantizing hurts performance so quickly it wouldnt make sense tbh

#

maybe 8bit would work?

rustic garnet Oct 20, 2023, 9:50 PM

#

at least 8bit quantization wouldn't hurt much I guess
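A back-of-the-envelope sketch of why 8-bit comes up here: weight memory for the encoder at different precisions. It uses the 4.3B parameter count quoted in the chat, treated as approximate, and counts weights only (no activations, no quantization overhead). `weight_gib` is a hypothetical helper name.

```python
def weight_gib(num_params, bits_per_weight):
    """Memory needed for the weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 2**30


# Parameter count as quoted in the chat; treat it as approximate.
T5_ENCODER_PARAMS = 4.3e9

for label, bits in (("fp32", 32), ("fp16", 16), ("int8", 8), ("int4", 4)):
    print(f"{label}: {weight_gib(T5_ENCODER_PARAMS, bits):.1f} GiB")
```

With that count, fp16 weights come to roughly 8 GiB and int8 to roughly 4 GiB, so on an 8 GB card the 8-bit encoder leaves room for something else while fp16 does not. Whether 8-bit hurts prompt following is a separate question this arithmetic cannot answer.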

nimble heart Oct 20, 2023, 9:50 PM

#

yea

#

I found 13B 8bit substantially outperforms 30B 4bit on my card

rustic garnet Oct 20, 2023, 9:50 PM

#

uh, interesting

#

I always read the opposite

indigo carbon Oct 20, 2023, 9:50 PM

#

so a model that has 8bit T5 and a UNET like SDXL would be a good idea?

nimble heart Oct 20, 2023, 9:50 PM

#

rustic garnet I always read the opposite

actually try it in Oobabooga or something

rustic garnet Oct 20, 2023, 9:51 PM

#

I mean, SAI tried T5 for SDXL and they decided for CLIP

nimble heart Oct 20, 2023, 9:51 PM

#

could also be that all the 30B models are older which doesnt exactly help

nimble heart Oct 20, 2023, 9:51 PM

#

rustic garnet I mean, SAI tried T5 for SDXL and they decided for CLIP

2 clips lol

#

the 2 clip thing is kinda weird to me

rustic garnet Oct 20, 2023, 9:51 PM

#

so I guess that T5 might give better text understanding but this doesn't mean the images look better

nimble heart Oct 20, 2023, 9:51 PM

#

wonder if it was to make it more compatible with 1.5 style prompting?

vale eagle Oct 20, 2023, 9:51 PM

#

2 clips actually good

rustic garnet Oct 20, 2023, 9:52 PM

#

nimble heart the 2 clip thing is kinda weird to me

it totally is. I guess they didn't care and just chose what worked best

nimble heart Oct 20, 2023, 9:52 PM

#

they tested this all in the bots

#

so im guessing the new and old clip working together acted to sorta boost people constantly 1.5 prompting the bots

#

T5 probably needs a totally different prompt structure so people had shit results

rustic garnet Oct 20, 2023, 9:52 PM

#

nimble heart wonder if it was to make it more compatible with 1.5 style prompting?

I assume that CLIP-L is better at certain artists and styles which were removed from the CLIP-G training data 🙈

rustic garnet Oct 20, 2023, 9:53 PM

#

nimble heart T5 probably needs a totally different prompt structure so people had shit result...

this is also possible...

vale eagle Oct 20, 2023, 9:53 PM

#

but yeah, prompting style also affects the result. People are dumb compared with an LLM

indigo carbon Oct 20, 2023, 9:53 PM

#

rustic garnet I mean, SAI tried T5 for SDXL and they decided for CLIP

I wonder why. T5 seems like a logical choice from what I'm reading

tribal lantern Oct 20, 2023, 9:53 PM

#

nimble heart T5 probably needs a totally different prompt structure so people had shit result...

this is a huge factor for the bot, doubt it was all decided based on the bot, but already the bot sometimes feels sooo bad to me when old prompts fail

rustic garnet Oct 20, 2023, 9:53 PM

#

why?

#

CLIP is a multimodal model

#

makes sense for a text to image model

#

T5 is a pure text model that has never seen any image and was never trained on image captions

nimble heart Oct 20, 2023, 9:55 PM

#

i suppose that'd mean if they went T5, things like unclip/clipvision wouldnt work

#

I guess you could always interrogate with clip then just feed the text into T5?

rustic garnet Oct 20, 2023, 9:55 PM

#

(and yes, I totally agree that training on a text corpus makes sense to get a model that is better in text understanding. I just say that CLIP is not totally stupid)

nimble heart Oct 20, 2023, 9:55 PM

#

wouldnt be ideal though

indigo carbon Oct 20, 2023, 9:56 PM

#

nimble heart I guess you could always interrogate with clip then just feed the text into T5?

you'll lose the subjects in that process

#

to achieve image input you'd need the conditioning to have the images

nimble heart Oct 20, 2023, 9:57 PM

#

regardless of what they do next with text encoding I just hope they ditch the refiner lol

#

I cant say I've used it once in the last month

indigo carbon Oct 20, 2023, 9:58 PM

#

nimble heart regardless of what they do next with text encoding I just hope they ditch the re...

same, the refiner seems useless for the most part

#

just extra params

tribal lantern Oct 20, 2023, 9:58 PM

#

hmm, refiner....

vale eagle Oct 20, 2023, 9:58 PM

#

It is kinda interesting. You could use GPT4 to read the image, create prompt and feed it into sdxl to get the required result

nimble heart Oct 20, 2023, 9:58 PM

#

main problem is the refiner butchers high frequency details

#

like it can make some structures look better but the image almost looks lower res as a result

indigo carbon Oct 20, 2023, 9:59 PM

#

vale eagle It is kinda interesting. You could use GPT4 to read the image, create prompt and...

you will lose the subject in the image input like that

vale eagle Oct 20, 2023, 9:59 PM

#

no

tribal lantern Oct 20, 2023, 9:59 PM

#

often the refiner is a skip, then i use Fooocus with all the defaults, and am amazed how nice the results are

rustic garnet Oct 20, 2023, 10:00 PM

#

in my opinion, Figure 3 in the PixArt alpha paper shows nicely why their method might work so well

indigo carbon Oct 20, 2023, 10:00 PM

#

vale eagle no

if you feed it an image of let's say: a dog, then use the output as a prompt- it WON'T be the same dog

vale eagle Oct 20, 2023, 10:00 PM

#

It wouldn't be same via vae

nimble heart Oct 20, 2023, 10:00 PM

#

rustic garnet in my opinion, Figure 3 in the PixArt alpha paper shows nicely why their method ...

yea when it gets a diffusers pipeline it might be worth trying out in SD.Next

indigo carbon Oct 20, 2023, 10:01 PM

#

vale eagle It wouldn't be same via vae

though it WOULD be the same via image conditioning

nimble heart Oct 20, 2023, 10:01 PM

#

rustic garnet

so smaller dataset of "refined captions"

rustic garnet Oct 20, 2023, 10:02 PM

#

yeah, they should refine LAION

vale eagle Oct 20, 2023, 10:02 PM

#

they have

rustic garnet Oct 20, 2023, 10:02 PM

#

and as they write: if your captions are well aligned, you need less data to train

nimble heart Oct 20, 2023, 10:03 PM

#

yea kinda why you can train a Lora with just like 100 or so hand-captioned images

indigo carbon Oct 20, 2023, 10:03 PM

#

rustic garnet yeah, they should refine LAION

so, a better dataset will produce a model with better language understanding without depending on CLiP?

#

is that what the DALL-E 3 paper is about? I haven't read it yet

nimble heart Oct 20, 2023, 10:03 PM

#

that's the pixart paper

indigo carbon Oct 20, 2023, 10:04 PM

#

nimble heart that's the pixart paper

https://cdn.openai.com/papers/dall-e-3.pdf

nimble heart Oct 20, 2023, 10:04 PM

#

Ah I meant kaibo's screenshot

indigo carbon Oct 20, 2023, 10:04 PM

#

I was talking about this

tribal lantern Oct 20, 2023, 10:04 PM

#

rustic garnet

i'm hoping captions is 90% of what makes models better 🙂

vale eagle Oct 20, 2023, 10:05 PM

#

tribal lantern i'm hoping captions is 90% of what makes models better 🙂

dalle3 and pixart used a similar approach for captioning

#

I think dalle3 is doing better on this

indigo carbon Oct 20, 2023, 10:06 PM

#

indigo carbon https://cdn.openai.com/papers/dall-e-3.pdf

according to this paper the understanding of language can be substantially improved by training on highly descriptive generated image captions.

nimble heart Oct 20, 2023, 10:06 PM

#

pixart is a tiny (ish) model by comparison though. it might be a better approach for local inference

#

hence why it's also AGPL licensed

mellow tendon Oct 20, 2023, 10:07 PM

#

I never had much luck the few times I tried to prompt PixArt.

nimble heart Oct 20, 2023, 10:08 PM

#

a high quality model built specifically for easy training and local inference licensed under AGPL is a winning combo if it actually turns out to not be garbage

#

but for now we wait for a diffusers pipeline

vale eagle Oct 20, 2023, 10:08 PM

#

Screenshot_2023-10-21-06-08-09-363_cn.wps.moffice_eng.jpg

#

pixart tried LAION but chose another dataset for training (Table 1)

rustic garnet Oct 20, 2023, 10:12 PM

#

vale eagle they have

did they? I heard that multiple times, but never read about that anywhere

vale eagle Oct 20, 2023, 10:13 PM

#

LAION-LLaVa is the refined dataset

rustic garnet Oct 20, 2023, 10:13 PM

#

yes

#

but I mean LAION themselves

nimble heart Oct 20, 2023, 10:13 PM

#

vale eagle pixart tried LAION but chose another dataset for training (Table 1)

fair tbh. the sd 1.5 tuning ecosystem is so big cause they're all just undoing laion-isms

#

also waifus but beyond that its other datasets

vale eagle Oct 20, 2023, 10:15 PM

#

rustic garnet but I mean LAION themselves

They evaluated those datasets in Table 1 and chose to use SAM-LLaVA

rustic garnet Oct 20, 2023, 10:15 PM

#

yes, but they created the captions for LAION and SAM themselves

#

what I meant: did LAION ever come up with the idea of refining their captions? I think pseudo said that once, but I never found evidence for that

vale eagle Oct 20, 2023, 10:18 PM

#

rustic garnet what I meant: did LAION ever come up with the idea of refining their captions? I...

oh, I understand what you mean. I don't know anything about LAION itself

indigo carbon Oct 20, 2023, 10:27 PM

#

it seems that T5 isn't limiting PixArt from having image input, they were able to make ControlNets for it

rustic garnet Oct 20, 2023, 10:28 PM

#

indigo carbon it seems that T5 isn't limiting PixArt from having image input, they were able t...

the text encoder has nothing to do with that...

indigo carbon Oct 20, 2023, 10:28 PM

#

rustic garnet the text encoder has nothing to do with that...

the params of the model didn't say anything about an image encoder though

rustic garnet Oct 20, 2023, 10:28 PM

#

the control net is its own image encoder

#

a control net is a separate network

#

that takes the control net image as input

indigo carbon Oct 20, 2023, 10:29 PM

#

it won't be able to blend images it seems

rustic garnet Oct 20, 2023, 10:29 PM

#

not if they haven't trained for it

nimble heart Oct 20, 2023, 10:31 PM

#

indigo carbon it seems that T5 isn't limiting PixArt from having image input, they were able t...

controlnets but no diffusers pipeline smh

tribal lantern Oct 20, 2023, 10:34 PM

#

the code uses diffusers and follows a similar structure/api

#

doesn't seem hard to incorporate into diffusers

nimble heart Oct 20, 2023, 10:35 PM

#

at least make a safetensors file

tribal lantern Oct 20, 2023, 10:35 PM

#

they seem to really do their best to blend into the existing generative ai eco system

nimble heart Oct 20, 2023, 10:36 PM

#

yea they just made the HF page two days ago so im not expecting miracles

#

but diffusers is listed on the "todo" so 100% just wait for that instead of trying to hack their inference code into a UI

#

hypothetically it should just work™️ on sd.next then

indigo carbon Oct 20, 2023, 10:37 PM

#

I compared some of their showcased results with SDXL, SDXL seems to be better in most cases

#

so better language understanding or not, SDXL still takes the cake

#

though I don't doubt SDXL would be even better if it was trained in the way they trained that model

#

so maybe we'll get an "SDXL2" or even an "SD3" that will have better language understanding

nimble heart Oct 20, 2023, 10:44 PM

#

wonder how pixart does with dark latents

soft bone Oct 20, 2023, 11:08 PM

#

upbeat summit

its interesting how XL still has this split ground perspective problem. idk how it can master reflections but cant keep the ground level. happens to me constantly

nimble heart Oct 20, 2023, 11:11 PM

#

oh hell it's the kung fury hacker dude

#

Kung Fury 2 apparently debuts November 17th

weary yacht Oct 21, 2023, 12:46 AM

#

what's the largest SDXL images you guys have made?... I'm trying a 3440x1440 right now

weary yacht Oct 21, 2023, 2:19 AM

#

yeah... 3440x1440 is a no go

eternal fog Oct 21, 2023, 2:27 AM

#

lmao

#

That's more normal lol

indigo carbon Oct 21, 2023, 2:32 AM

#

rustic garnet T5 is a pure text model that has never seen any image and was never trained on i...

I've looked into it, it should be possible to have a model that uses T5 as a text encoder, while having something different as an image encoder

#

PixArt doesn't have an image encoder due to trying to be as efficient as possible in training, but they theoretically could

#

This might be true of DALL-E 3 as well

#

In the case of DALL-E 3, the language understanding doesn't come from the text encoder though; according to the paper, it comes from the dataset they trained it on

eager onyx Oct 21, 2023, 2:57 AM

#

programmer

hoary saddle Oct 21, 2023, 3:23 AM

#

did someone mention a few days ago, a website where you can upload a dozen or so images and it will make a lora for you?

nimble heart Oct 21, 2023, 3:57 AM

#

weary yacht what's the largest SDXL images you guys have made?... I'm trying a 3440x1440 rig...

with a 2-pass approach I've done 3840x2160.

just one pass it's like 1080p tops

weary yacht Oct 21, 2023, 3:57 AM

#

two pass?

nimble heart Oct 21, 2023, 3:58 AM

#

what the automatic1111 UI calls "high res fix"

weary yacht Oct 21, 2023, 3:58 AM

#

you mean when you make a lower res image then use AI to upscale and add detail?

#

or just upscale and enhance?

nimble heart Oct 21, 2023, 3:59 AM

#

feed the text2img result into an img2img

#

so the former I suppose
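A small sketch of how the first-pass size for such a two-pass run could be picked: keep the target aspect ratio, stay under a pixel budget, and round down to a multiple of 8. The 1024x1024 budget and the multiple-of-8 rounding are assumptions for illustration, and `first_pass_size` is a hypothetical helper, not part of any UI.

```python
import math


def first_pass_size(target_w, target_h, max_pixels=1024 * 1024, multiple=8):
    """Pick a text2img size with roughly the target aspect ratio that fits the pixel budget."""
    # Never scale up: a small target is generated directly at its own size.
    scale = min(1.0, math.sqrt(max_pixels / (target_w * target_h)))
    # Round down so the result stays within the budget and on the grid the model expects.
    w = max(multiple, int(target_w * scale) // multiple * multiple)
    h = max(multiple, int(target_h * scale) // multiple * multiple)
    return w, h


target = (3840, 2160)
w, h = first_pass_size(*target)
print("first pass (text2img):", w, "x", h)
print("second pass (img2img) upscale factor:", round(target[0] / w, 2))
```

The second pass would then upscale the first result to the target size and run img2img over it, which is the "former" option described above.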

weary yacht Oct 21, 2023, 3:59 AM

#

yeah, I'm mostly just stressing my hardware out to see what it'll do

nimble heart Oct 21, 2023, 3:59 AM

#

4k is like 7 seconds per it for me lol

#

so I only do like 15 samples

weary yacht Oct 21, 2023, 4:00 AM

#

what's your GPU?.. I was getting about 7s/it for 3440x1440

nimble heart Oct 21, 2023, 4:00 AM

#

7900 XTX

weary yacht Oct 21, 2023, 4:01 AM

#

i generated a 3440x1440 image but it failed at the decode phase

nimble heart Oct 21, 2023, 4:02 AM

#

use tiled

weary yacht Oct 21, 2023, 4:02 AM

#

vae decode tiled?

#

I'm going to attempt a 50-step 3440x1440 with tiled decoding and see if it works, or if a red text box pops up and cusses me out again
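For anyone wondering what tiled decoding does: a minimal sketch of how a large latent can be cut into overlapping tiles so each decode fits in memory. The tile size (64) and overlap (8) are made-up illustration values, not the UI's actual defaults, and `tile_starts` / `tile_boxes` are hypothetical helpers; blending the overlaps after decoding is left out.

```python
def tile_starts(length, tile, overlap):
    """Start offsets of tiles along one axis, with the last tile flush against the edge."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be non-negative and smaller than the tile size")
    if length <= tile:
        return [0]
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    # Shift the final tile back so it ends exactly at the edge.
    starts.append(length - tile)
    return starts


def tile_boxes(width, height, tile=64, overlap=8):
    """(left, top, right, bottom) boxes in latent coordinates covering the whole latent."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in tile_starts(height, tile, overlap)
        for x in tile_starts(width, tile, overlap)
    ]


# 3440x1440 pixels through an 8x VAE gives a 430x180 latent.
latent_w, latent_h = 3440 // 8, 1440 // 8
boxes = tile_boxes(latent_w, latent_h)
print(len(boxes), "tiles, each at most 64x64 latent pixels")
```

Each tile is decoded on its own, so peak memory depends on the tile size rather than the full 3440x1440 frame, at the cost of more decode calls.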

nimble heart Oct 21, 2023, 4:24 AM

#

jeeze try it out on like 5 steps first make sure it decodes

weary yacht Oct 21, 2023, 4:54 AM

#

boom.. 50 steps 3440x1440

#

12 minutes, roflmao

#

there is so much wrong with it too like how planets and even a sun are sitting on the ground, but that wasn't really the point

nimble heart Oct 21, 2023, 5:32 AM

#

weary yacht 12 minutes, roflmao

after the first tile decode and it builds the kernels subsequent ones are faster

slender coral Oct 21, 2023, 6:00 AM

#

Got a prompt question: when prompting, everything I get looks extremely new, almost like a 3D render for some things. How I change that is to use "old" and "dirty", which works 80% of the time, but then I get something like an animal that I want to look like a natural-looking animal, and "dirty" makes it dirty. Suggestions on prompts to fix this?

crisp owl Oct 21, 2023, 6:02 AM

#

photograph of
cinematic photo of
portrait shot of
digital photo of
movie still of

hoary saddle Oct 21, 2023, 6:23 AM

#

made an image gallery from the ComfyUI output folder, no more searching for an image from 2 weeks ago

#

Screenshot_2023-10-21-01-24-32-46_cbf47468f7ecfbd8ebcc46bf9cc626da.jpg

#

mobile friendly 🙂

weary yacht Oct 21, 2023, 6:30 AM

#

nimble heart after the first tile decode and it builds the kernels subsequent ones are faster

I meant the whole image took 12 minutes doing that high of a resolution. But that resolution was mostly just a proof of concept, I'll never actually have a use for images of that resolution and if I did, I'd do it in a more organized manner, starting off probably with 20 step SD1.5 stuff I could upscale and feed through img2img after I have a good starting point

nimble heart Oct 21, 2023, 6:39 AM

#

1.5 is slower than XL after 768x768

hoary saddle Oct 21, 2023, 6:46 AM

#

https://github.com/Smuzzies/comfyui_image_gallery

GitHub

GitHub - Smuzzies/comfyui_image_gallery

Contribute to Smuzzies/comfyui_image_gallery development by creating an account on GitHub.

weary yacht Oct 21, 2023, 6:48 AM

#

yeah, but you can start out with a smaller image, and use it for img2img when you create one that has the basic look that you want, and you can upscale before plugging that into img2img and make bigger images based off that smaller one.. the benefit of doing the smaller ones is you can generate a lot of them, often several at a time, and reach a starting point faster to build from

#

if I do a 1920x1080 image, and there are imperfections, it requires a lot of subsequent effort with inpainting to correct, and all of that may be avoided by starting out with an image already close to what you want

nimble heart Oct 21, 2023, 7:58 AM

#

bonus image

shy kelp Oct 21, 2023, 8:35 AM

#

nimble heart bonus image

prompt and model?

nimble heart Oct 21, 2023, 8:35 AM

#

base XL no refiner

#

prompt uhhhh

#

if you use comfyui you can just drag the image onto the canvas and see the workflow

shy kelp Oct 21, 2023, 8:36 AM

#

im completely new to sd

#

ill google comfyui and basexl

nimble heart Oct 21, 2023, 8:36 AM

#

grainy deep ocean footage of a an monstrous tentacle woman in the abyssal depths below

nimble heart Oct 21, 2023, 8:36 AM

#

shy kelp ill google comfyui and basexl

base xl is just SDXL

#

like the normal model

#

base version

shy kelp Oct 21, 2023, 8:41 AM

#

oh gotcha

#

just made this as one of my first uh...pictures? idk what to call them

vital ermine Oct 21, 2023, 10:19 AM

#

I found the one I was talking about, which has been dead for a long time now; I left their Discord when they said it was dead. I had their icon in my mind and just found it: https://github.com/devilismyfriend/StableTuner I never heard of SimpleTuner before.

GitHub

GitHub - devilismyfriend/StableTuner: Finetuning SD in style.

Finetuning SD in style. Contribute to devilismyfriend/StableTuner development by creating an account on GitHub.

lusty wolf Oct 21, 2023, 1:02 PM

#

Just happy my Comfy is working again...

lilac wren Oct 21, 2023, 2:09 PM

#

@vital ermine Alice in Wonderland and Madmax?

#

and ghost in the shell 🙂

glass notch Oct 21, 2023, 2:19 PM

#

vital ermine

Is that Rutger Hauer from Blade Runner in the background?

limber citrus Oct 21, 2023, 3:11 PM

#

The quality is amazing 👍🏻

tepid sinew Oct 21, 2023, 3:28 PM

#

How do I make pics?

glass notch Oct 21, 2023, 3:45 PM

#

vital ermine

This could be a superhero whose power is to replace and repair public water utilities

vital ermine Oct 21, 2023, 3:47 PM

#

glass notch This could be a superhero whose power is to replace and repair public water util...

Shit, lol

south horizon Oct 21, 2023, 4:33 PM

#

vale eagle

nice

south horizon Oct 21, 2023, 4:33 PM

#

vital ermine

pretty good conan

lusty wolf Oct 21, 2023, 4:34 PM

#

Chalk Lora from @stone fossil

vale eagle Oct 21, 2023, 4:34 PM

#

Testing my new fine tune

mellow tendon Oct 21, 2023, 7:02 PM

#

vale eagle

Did you use the same prompt? If you wanted the prompt and model I used it is here: https://civitai.com/images/3082642?period=AllTime&sort=Newest&view=categories&modelVersionId=190677&modelId=169671&postId=713401

vale eagle Oct 21, 2023, 7:03 PM

#

no

#

I used llava to describe the image and generate the prompt

indigo carbon Oct 21, 2023, 7:38 PM

#

I took a look at DALL-E 3's paper, they do indeed use T5

steady grove Oct 21, 2023, 8:23 PM

#

indigo carbon I took a look at DALL-E 3's paper, they do indeed use T5

they've given out access to their paper? i thought anyone who is allowed to see it has to sign an NDA

indigo carbon Oct 21, 2023, 8:24 PM

#

steady grove they've given out access to their paper? i thought anyone who is allowed to see...

It was leaked and after they noticed it was leaked it popped up on their website to keep their cool

steady grove Oct 21, 2023, 8:24 PM

#

https://cdn.openai.com/papers/dall-e-3.pdf oh they've put it out finally.

indigo carbon Oct 21, 2023, 8:25 PM

#

steady grove https://cdn.openai.com/papers/dall-e-3.pdf oh they've put it out finally.

Yep

#

Uses T5

steady grove Oct 21, 2023, 8:25 PM

#

time to pour some extra strong coffee

steady grove Oct 21, 2023, 8:25 PM

#

indigo carbon Uses T5

yeah i figured they did. it's pretty good

indigo carbon Oct 21, 2023, 8:25 PM

#

steady grove yeah i figured they did. it's pretty good

Why didn't SDXL use T5?

steady grove Oct 21, 2023, 8:25 PM

#

/shrug

targeting home gpus probably

indigo carbon Oct 21, 2023, 8:26 PM

#

T5 can be quantized, that's no excuse

steady grove Oct 21, 2023, 8:26 PM

#

would a t5 trained model work with a 3070?

indigo carbon Oct 21, 2023, 8:26 PM

#

Yes, easily. An 8-bit T5 won't hurt it

steady grove Oct 21, 2023, 8:27 PM

#

also i think they like openclip because they have the license to it. t5 is a restrictive license isn't it?

vale eagle Oct 21, 2023, 8:27 PM

#

indigo carbon Why didn't SDXL use T5?

It is too big

indigo carbon Oct 21, 2023, 8:27 PM

#

vale eagle It is too big

It can easily be quantized with minimal loss

#

I'm almost sure OpenAI aren't using full precision on T5

#

Even 8bit would be enough to make it run on about the same scale as CLiP

steady grove Oct 21, 2023, 8:30 PM

#

i'm sure if it were easy we'd see more researchers doing stuff with it. there's probably big caveats. t5 has been out for a long while

#

it's very impressive too

#

people don't just ignore that. there's gotta be a reason why

indigo carbon Oct 21, 2023, 8:31 PM

#

steady grove i'm sure if it were easy we'd see more researchers doing stuff with it. there's...

we do see that though

steady grove Oct 21, 2023, 8:31 PM

#

kandinsky never used it either right?

indigo carbon Oct 21, 2023, 8:31 PM

#

steady grove kandinsky never used it either right?

Kandinsky uses ViT_14 or something like that

vale eagle Oct 21, 2023, 8:32 PM

#

At that time, people were used to CLIP-L style prompts. I think with prompts like that, T5 might not give good results.

steady grove Oct 21, 2023, 8:32 PM

#

lots of other models coming out but i only see google and other big proprietary ai companies using it. must be a lot of licensing issues tied to it

modern kraken Oct 21, 2023, 8:32 PM

#

It really is the simple things that hook me.

steady grove Oct 21, 2023, 8:32 PM

#

vale eagle At that time, people were used to CLIP-L style prompts. I think with prompts l...

people slipped right into prompting dalle 3 so easily. dalle 2 too. it's clearly a better layer for prompting

indigo carbon Oct 21, 2023, 8:33 PM

#

DALL-E 3 prompts very easily and it uses T5

vale eagle Oct 21, 2023, 8:33 PM

#

prompts don't go directly into dalle 3

steady grove Oct 21, 2023, 8:33 PM

#

you can punch sdxl style prompts into dalle3 and it'll do fine

#

has better comprehension too

vale eagle Oct 21, 2023, 8:34 PM

#

user prompt -> gpt -> descriptive prompt -> dalle 3 T5

steady grove Oct 21, 2023, 8:34 PM

#

That's if you're using it through chatgpt. there are other interfaces

indigo carbon Oct 21, 2023, 8:34 PM

#

vale eagle user prompt -> gpt -> descriptive prompt -> dalle 3 T5

the GPT part is just to censor the model, it isn't improving anything

steady grove Oct 21, 2023, 8:35 PM

#

the api for it is wide open now

#

"open"

vale eagle Oct 21, 2023, 8:35 PM

#

Dalle 3 was trained on descriptive prompts. They use gpt to generate that style of prompt to get the most out of the model

steady grove Oct 21, 2023, 8:36 PM

#

using gpt to rewrite a prompt won't give it better comprehension. you can't just throw gpt4 at sdxl and get dalle results

#

t5 is the core reason why dalle prompting is so good

vale eagle Oct 21, 2023, 8:37 PM

#

steady grove using gpt to rewrite a prompt won't give it better comprehension. you can't jus...

Actually, you could get better results using a gpt4 refined prompt

indigo carbon Oct 21, 2023, 8:37 PM

#

steady grove t5 is the core reason why dalle prompting is so good

yep, I wonder if future SD versions will use T5

steady grove Oct 21, 2023, 8:38 PM

#

vale eagle Actually, you could get better results using a gpt4 refined prompt

sure, but that's subjective. prompt comprehension won't improve though.

#

you can actually score prompt comprehension through a variety of methods

vale eagle Oct 21, 2023, 8:38 PM

#

prompt comprehension doesn't come only from T5

indigo carbon Oct 21, 2023, 8:39 PM

#

vale eagle prompt comprehension doesn't come only from T5

it's also from the dataset, but I'd say SDXL's dataset is way more than enough to make a model better than DALL-E 3 when trained with T5

#

maybe SAI didn't use T5 because quantizing wasn't a thing when they began training SDXL?

#

because if quantized properly, T5 can have performance close to CLiP

nimble heart Oct 21, 2023, 8:45 PM

#

doesnt DFIF use T5 already?

#

they're pretty familiar with T5 so I assume the reasons for going clip on XL was more than just the extra gig of vram

indigo carbon Oct 21, 2023, 8:46 PM

#

nimble heart doesnt DFIF use T5 already?

it was also a pixel diffusion model though, and was extremely performance heavy, so IDK if they already quantized T5 there

nimble heart Oct 21, 2023, 8:47 PM

#

pixel like there's no VAE?

#

maybe that's why the images look like garbage lol

indigo carbon Oct 21, 2023, 8:47 PM

#

nimble heart pixel like there's no VAE?

it doesn't diffuse latents, it diffuses pixels. that's kinda dumb as of now

#

so no, no VAE
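
To put rough numbers on pixel versus latent diffusion, here is a minimal sketch. It assumes the usual Stable Diffusion style VAE layout (8x downsampling per side, 4 latent channels) and a 1024x1024 RGB target. The function names are made up for illustration and are not from any library.

```python
def pixel_values(height: int, width: int, channels: int = 3) -> int:
    """Number of values a pixel-space model denoises per image."""
    return height * width * channels


def latent_values(height: int, width: int, downsample: int = 8,
                  latent_channels: int = 4) -> int:
    """Number of values a latent-space model denoises per image.

    Assumes an SD-style VAE: 8x downsampling per side, 4 latent channels.
    """
    return (height // downsample) * (width // downsample) * latent_channels


pixels = pixel_values(1024, 1024)    # 3,145,728 values
latents = latent_values(1024, 1024)  # 65,536 values
print(pixels, latents, pixels // latents)  # prints: 3145728 65536 48
```

Under those assumptions the latent model works on about 48x fewer values per step, which is the usual argument for diffusing latents instead of pixels.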

indigo carbon Oct 21, 2023, 8:49 PM

#

nimble heart they're pretty familiar with T5 so I assume the reasons for going clip on XL was...

I don't know, either they couldn't quantize T5 or they had their own reason not to use it, because evidently T5 is the better text encoder

nimble heart Oct 21, 2023, 8:49 PM

#

the double-upscaling sucks ass. pixel could maybe work with a better method for that

#

though maybe using unedited images directly instead of the VAE makes it noisier

steady grove Oct 21, 2023, 8:50 PM

#

i did a dozen generations with df and decided it wasn't feasible as a tool

pure crystal Oct 21, 2023, 8:51 PM

#

indigo carbon Oct 21, 2023, 8:51 PM

#

nimble heart the double-upscaling sucks ass. pixel could maybe work with a better method for ...

we are already past that point, latent diffusion is the superior method as of now. but that has nothing to do with the fact that they had no reason not to use T5

#

maybe performance, but we can fix that via quantization with minimal loss as of now

nimble heart Oct 21, 2023, 8:58 PM

#

apparently T5 by itself uses > 12GB

indigo carbon Oct 21, 2023, 8:58 PM

#

nimble heart apparently T5 by itself uses > 12GB

also LLaMa, but that's full precision

nimble heart Oct 21, 2023, 8:58 PM

#

if you quantize T5 down to 4bit it's going to destroy the quality

#

and 8bit would still be like 7 gigs

#

which is as much as the entire XL pipeline right now

indigo carbon Oct 21, 2023, 8:59 PM

#

nimble heart if you quantize T5 down to 4bit it's going to destroy the quality

not if quantized properly, look at LLaMa for instance; even when 4bit, it can do stuff just as good as full precision

nimble heart Oct 21, 2023, 9:00 PM

#

I've literally made my own 4bit quants and they suck ass

#

even 6bit can be sketch

#

if you compare with what the full fp16 model does it totally destroys the outputs

indigo carbon Oct 21, 2023, 9:01 PM

#

you didn't do it properly then.. it wouldn't make sense that it barely affects LLaMa and destroys T5

#

or maybe degradation rate is higher with 4.3B models?

nimble heart Oct 21, 2023, 9:02 PM

#

tf you mean I didnt do it properly. have you actually used higher than 4bit on meaningful models?

#

if you only ever use 4bit gptq then they seem nice until you try the same model at 8bit+

steady grove Oct 21, 2023, 9:03 PM

#

my figuring is if it was as good as hype says it is, it would've caught on by now. i used the 4bit llama too and it descended into garbage after two prompts. couldn't make it work at all.

#

24gb doesn't seem to be enough for a llm that doesn't dissolve into gibberish with the slightest bit of context

nimble heart Oct 21, 2023, 9:03 PM

#

if you have ooba I have a bunch of EXL2 quants on my huggingface including a 4bit https://huggingface.co/Beinsezii/MythoMax-L2-13b-EXL2
There's absolutely 0 comparison between 4 and 8 bit

steady grove Oct 21, 2023, 9:03 PM

#

maybe i'm doing it wrong, okay, but if it were simple to deploy and usable, people would be using it. That's how i see things.

indigo carbon Oct 21, 2023, 9:04 PM

#

nimble heart if you only ever use 4bit gptq then they seem nice until you try the same model ...

I did try 8bit, it was barely any different than 4bit

indigo carbon Oct 21, 2023, 9:04 PM

#

steady grove maybe i'm doing it wrong, okay, but if it were simple to deploy and usable, peopl...

people use 4bit quantized LLaMa all the time

steady grove Oct 21, 2023, 9:04 PM

#

i hear that okay and i see them using it, but i feel like they're struggling to use it and are pretending it's all good

#

comes to a certain point where i might as well just write things myself

nimble heart Oct 21, 2023, 9:05 PM

#

people use 4bit because they have to

#

most dudes are still on 8 or 12 gigs of VRAM

indigo carbon Oct 21, 2023, 9:05 PM

#

nimble heart people use 4bit because they have to

it works very nicely though, if inferred properly; people rarely complain

nimble heart Oct 21, 2023, 9:06 PM

#

i mean what's there to complain about?

#

the alternative is using cpu offload with transformers and getting 0.2 it/s

steady grove Oct 21, 2023, 9:06 PM

#

um, there's a HIGH bar to use LLM. a level of technical know how. people complain constantly about that. there's a huge sea of newbs wishing they could use LLMs of any kind and they're all rabbling

indigo carbon Oct 21, 2023, 9:06 PM

#

nimble heart the alternative is using cpu offload with transformers and getting 0.2 it/s

LLaMa speed isn't measured by it/s

nimble heart Oct 21, 2023, 9:06 PM

#

so having 4bit fit entirely in vram and running @ 5 it/s is a godsend to them

#

tokens whatever

#

same thing

steady grove Oct 21, 2023, 9:07 PM

#

it's hard to find good advice for ooba booga because of how many newbs are trying to make their own e gf

#

maybe you filter out all the noise but i assure you there are complaints

indigo carbon Oct 21, 2023, 9:08 PM

#

so you're saying SAI doesn't use T5 due to performance, eh?

steady grove Oct 21, 2023, 9:08 PM

#

no. i think it's more about compatibility and ease of deployment

nimble heart Oct 21, 2023, 9:08 PM

#

if you really want T5 you can always use DFIF

steady grove Oct 21, 2023, 9:09 PM

#

that includes performance, but also helping people implement niche software libraries

indigo carbon Oct 21, 2023, 9:09 PM

#

look, a model that uses T5 is going to release soon, we'll see how that does

nimble heart Oct 21, 2023, 9:09 PM

#

pixart?

indigo carbon Oct 21, 2023, 9:09 PM

#

yeah, that uses T5

nimble heart Oct 21, 2023, 9:09 PM

#

they already have an inference script on their github

#

go try it

#

I'm just waiting for the diffusers pipeline

#

so it'll work in sd.next

#

https://github.com/PixArt-alpha/PixArt-alpha#-dependencies-and-installation

steady grove Oct 21, 2023, 9:13 PM

#

i disregarded pixart when i first heard about it, because they were bragging about how few carbon emissions it cost to train. i think all of that is just dumb poppycock nonsense. minimizing the carbon footprint of a single project isn't going to do jack all. we need to plant trees.

Anyone trying to brag about their carbon footprint are scammers like recycling companies are, so i tend to lose trust when i see it

#

it might be good, but they're using carbon footprint to hype it, so i don't think it has any legs

nimble heart Oct 21, 2023, 9:15 PM

#

optimistically, if it's as easy to tune as they say it is and runs well locally then it could be a success if the images arent garbage.

AGPL license nice.

#

but the fact that they bring up carbon emissions 5x as often as inference quality is sus

steady grove Oct 21, 2023, 9:17 PM

#

its out now? oooo. worth a look, but i'm a giant cynic about carbon footprint obsessed tech projects

#

oh no no weights yet

#

and Inference requires at least 23GB of GPU memory.

lusty wolf Oct 21, 2023, 9:21 PM

#

Something pretty for a change... 😜

steady grove Oct 21, 2023, 9:22 PM

#

sdxl always struggles on teeth really bad i've noticed. worse than hands ever were. Cool image though. life and death

noble shoal Oct 21, 2023, 9:25 PM

#

steady grove sdxl always struggles on teeth really bad i've noticed. worse than hands ever w...

What do you mean with "struggles on teeth really bad"?

steady grove Oct 21, 2023, 9:26 PM

#

no canines. all incisors. sdxl LOVES incisors

noble shoal Oct 21, 2023, 9:26 PM

#

steady grove no canines. all incisors. sdxl LOVES incisors

100%

lusty wolf Oct 21, 2023, 9:28 PM

#

Cheers...

vale eagle Oct 21, 2023, 9:31 PM

#

#

noble shoal Oct 21, 2023, 9:34 PM

#

lusty wolf Cheers...

Cheers

indigo carbon Oct 21, 2023, 9:38 PM

#

nimble heart apparently T5 by itself uses > 12GB

this is odd. T5 has 4.3B params, how does it take that much VRAM?

#

the entire SDXL model has more params than that and it doesn't take as much

nimble heart Oct 21, 2023, 9:38 PM

#

different architecture

#

look on pixart's HF. the t5 encoder is literally like 16GB of just weights
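
The size argument comes down to parameters times bytes per parameter. A back-of-the-envelope sketch, assuming the 4.3B parameter count quoted in this chat (not independently verified) and counting weights only, with no activations, cache, or framework overhead:

```python
GIB = 2 ** 30  # bytes per GiB


def weight_gib(num_params: float, bits_per_param: int) -> float:
    """Memory for the weights alone; ignores activations and runtime overhead."""
    return num_params * bits_per_param / 8 / GIB


T5_PARAMS = 4.3e9  # figure quoted in the chat, treated here as an assumption

for label, bits in [("fp32", 32), ("fp16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{label:>5}: {weight_gib(T5_PARAMS, bits):.1f} GiB")
```

With that parameter count this gives roughly 16 GiB at fp32, 8 GiB at fp16, 4 GiB at 8-bit, and 2 GiB at 4-bit. A checkpoint of about 16GB would be consistent with fp32 storage, though that is a guess about how the file is saved. Real quantized formats also store scales and other metadata, so actual files come out somewhat larger than this estimate.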

indigo carbon Oct 21, 2023, 9:39 PM

#

nimble heart look on pixart's HF. the t5 encoder is literally like 16GB of just weights

yeah, and they didn't even release the UNET yet, the HF has just the text encoder and the VAE

steady grove Oct 21, 2023, 9:39 PM

#

noble shoal Cheers

single artwork for famous song gotta have teeth - obie trice

nimble heart Oct 21, 2023, 9:41 PM

#

so considering one of the main points of XL was fitting on 8gig cards I think t5 was automatically off the table

indigo carbon Oct 21, 2023, 9:42 PM

#

nimble heart so considering one of the main points of XL was fitting on 8gig cards I think t5...

fair enough I guess. so the reason T5 wasn't used is the encoder itself being as big as the UNET

nimble heart Oct 21, 2023, 9:42 PM

#

bigger

indigo carbon Oct 21, 2023, 9:42 PM

#

nimble heart bigger

SDXL UNET is over 6B, a little smaller

nimble heart Oct 21, 2023, 9:42 PM

#

even if you halved the size with a quant without destroying the inference quality t5 would still be bigger than XL

steady grove Oct 21, 2023, 9:42 PM

#

bigger in bits not unrelated parameters in different architecture.

nimble heart Oct 21, 2023, 9:42 PM

#

in terms of gigabytes

indigo carbon Oct 21, 2023, 9:43 PM

#

yeah, I see

steady grove Oct 21, 2023, 9:43 PM

#

some pentiums reached 4ghz. they are not faster cpus than modern i3's

nimble heart Oct 21, 2023, 9:43 PM

#

both clips combined are like one gig I think

steady grove Oct 21, 2023, 9:44 PM

#

how does df's t5 implementation work? i was running it on my pc and it didn't need 24gb

nimble heart Oct 21, 2023, 9:44 PM

#

maybe an "SDXXL" targeting 16 or 24gb minimum would work

nimble heart Oct 21, 2023, 9:44 PM

#

steady grove how does df's t5 implementation work? i was running it on my pc and it didn't n...

dfif used > 24 gigs for a single image when I last tested

#

unless there's improvements in diffusers now

steady grove Oct 21, 2023, 9:45 PM

#

nimble heart dfif used > 24 gigs for a single image when I last tested

maybe the two step helped me out. i only got 16gb

#

made a dozen or so images when it dropped public.

#

was slow though

nimble heart Oct 21, 2023, 9:45 PM

#

steady grove maybe the two step helped me out. i only got 16gb

maybe load t5 first, encode, unload t5, diffuse?
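
The load, encode, unload, diffuse idea can be illustrated with a toy tracker. This is only a sketch: no real GPU API is used, `VramBudget` is a hypothetical helper, and the sizes (16 GB text encoder, 7 GB diffusion pipeline) are loose figures borrowed from this chat.

```python
from contextlib import contextmanager


class VramBudget:
    """Toy tracker of how many GB are resident at once (not a real GPU API)."""

    def __init__(self):
        self.current_gb = 0.0
        self.peak_gb = 0.0

    @contextmanager
    def loaded(self, size_gb):
        # "Load" a model, record the new peak, and always "unload" on exit.
        self.current_gb += size_gb
        self.peak_gb = max(self.peak_gb, self.current_gb)
        try:
            yield
        finally:
            self.current_gb -= size_gb


def peak_all_resident(text_encoder_gb, diffusion_gb):
    """Peak usage when both models sit in memory together."""
    budget = VramBudget()
    with budget.loaded(text_encoder_gb), budget.loaded(diffusion_gb):
        pass  # encode and diffuse with everything loaded
    return budget.peak_gb


def peak_sequential(text_encoder_gb, diffusion_gb):
    """Peak usage when the text encoder is freed before diffusion starts."""
    budget = VramBudget()
    with budget.loaded(text_encoder_gb):
        pass  # encode the prompt; only the small embedding is kept afterwards
    with budget.loaded(diffusion_gb):
        pass  # run denoising on the saved embedding
    return budget.peak_gb


print(peak_all_resident(16.0, 7.0))  # prints: 23.0
print(peak_sequential(16.0, 7.0))    # prints: 16.0
```

In this toy model the sequential peak is set by the largest single component rather than the sum, which would explain a run fitting on a smaller card at the cost of extra load time.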

indigo carbon Oct 21, 2023, 9:45 PM

#

nimble heart maybe an "SDXXL" targeting 16 or 24gb minimum would work

I don't see that happening, the idea is to make a solid model that can run on most stuff.. it seems like T5 is automatically defeating that purpose

steady grove Oct 21, 2023, 9:46 PM

#

when nvidia releases the 5080 it'll be good right? that'll come with 30gb right??? /padme

nimble heart Oct 21, 2023, 9:46 PM

#

the 3080 had 10 gigs and the 4080 was gonna have 12 before they renamed it

indigo carbon Oct 21, 2023, 9:47 PM

#

steady grove when nvidia releases the 5080 it'll be good right? that'll come with 30gb right?...

if they'll continue the path they're going in they are going to cuck the VRAMs bandwidth again

nimble heart Oct 21, 2023, 9:47 PM

#

if you just want vram and nothing else you can get an A770 16gb for like $300

steady grove Oct 21, 2023, 9:47 PM

#

no i like speed too

#

i've considered older cards though. might still yet

nimble heart Oct 21, 2023, 9:48 PM

#

7900 XTX 24gb for $950. outperforms a 3090 when on equal playing field

indigo carbon Oct 21, 2023, 9:48 PM

#

they literally used 128bit VRAM on almost the entire 4000 series, that's stupid

#

idk if they'll do that to 5000 series

steady grove Oct 21, 2023, 9:48 PM

#

gets you banned in counterstrike though (not that i play i just think it's hilarious and typical amd driver moment)

nimble heart Oct 21, 2023, 9:49 PM

#

play better games, ez

steady grove Oct 21, 2023, 9:49 PM

#

i grow weary of amd. was using them for a long while. 4080 is my first nvidia gpu tbh

nimble heart Oct 21, 2023, 9:49 PM

#

my XTX absolutely demolishes VR games

indigo carbon Oct 21, 2023, 9:50 PM

#

steady grove i grow weary of amd. was using them for a long while. 4080 is my first nvidia g...

unironically, the 4080 and above are the only 4000 series cards they didn't fully cuck

nimble heart Oct 21, 2023, 9:50 PM

#

running everything at 150% of my headset's resolution

steady grove Oct 21, 2023, 9:50 PM

#

yeh when it works it works. Linux drivers were so superior when i used amd. gained 15fps in a lot of games when i had my vega64

steady grove Oct 21, 2023, 9:50 PM

#

indigo carbon unironically, the 4080 and above are the only 4000 series cards they didn't full...

i haven't bought a budget card for a long while. enthusiast and prosumer options only

nimble heart Oct 21, 2023, 9:50 PM

#

with Proton on Linux I can play Devil May Cry 5 @ 8k Ultra and still get 80fps

steady grove Oct 21, 2023, 9:51 PM

#

nimble heart with Proton on Linux I can play Devil May Cry 5 @ 8k Ultra and still get 80fps

yeah proton and other options, lutris, all that, it's so good

nimble heart Oct 21, 2023, 9:51 PM

#

shit's cracked

#

the mesa ray tracing isn't that good yet though

steady grove Oct 21, 2023, 9:51 PM

#

you can get some nice fps boosts on windows too if you use dxvk wrappers on old games

steady grove Oct 21, 2023, 9:51 PM

#

nimble heart the mesa ray tracing isn't that good yet though

there are trade offs yeh

nimble heart Oct 21, 2023, 9:51 PM

#

for cyberpunk I still switch to windows

#

mesa right now is like 1/5th the speed

#

apparently if you compile it from git it's "up to" like 1/2 speed

steady grove Oct 21, 2023, 9:52 PM

#

nimble heart for cyberpunk I still switch to windows

cyberpunk's new patch is 🤌

nimble heart Oct 21, 2023, 9:52 PM

#

still have to play that I need to finish other games first

indigo carbon Oct 21, 2023, 9:52 PM

#

SDXL's quality is certainly good, but it doesn't follow the prompt as much as T5 models do

#

maybe SAI'll make something that replaces CLiP? T5 isn't the solution for what they're going for

steady grove Oct 21, 2023, 9:53 PM

#

indigo carbon SDXL's quality is certainly good, but it doesn't follow the prompt as much as T5...

agreed. luckily with sdxl we have so many other options to control the end product other than prompts. it's a boon.

steady grove Oct 21, 2023, 9:53 PM

#

indigo carbon maybe SAI'll make something that replaces CLiP? T5 isn't the solution for what t...

they'll probably make a new openclip model. or somehow tie one of their LLM's into openclip

#

what i love about stability is they are all in on researching this stuff to run on consumer hardware instead of corporate hardware

nimble heart Oct 21, 2023, 9:54 PM

#

dfif in shambles

steady grove Oct 21, 2023, 9:54 PM

#

credit to others who are contributing to that effort too of course, but sai seem to be leading the pack here

steady grove Oct 21, 2023, 9:55 PM

#

nimble heart dfif in shambles

dfif was always just a research model and i think they got a ton of good results from it

indigo carbon Oct 21, 2023, 9:55 PM

#

steady grove what i love about stability is they are all in on researching this stuff to run ...

yeah, I bet they'll figure out eventually to make SD have a better language understanding

nimble heart Oct 21, 2023, 9:55 PM

#

I think they should make a dfif2 tbh

#

could be their "sdxxl" for absolute highest quality at cost of all your vram

steady grove Oct 21, 2023, 9:57 PM

#

#

i'd never be able to get that from sdxl. that's dalle

indigo carbon Oct 21, 2023, 9:57 PM

#

steady grove i'd never be able to get that from sdxl. that's dalle

what was the prompt? I kinda wanna try

steady grove Oct 21, 2023, 9:58 PM

#

a fluffy cat on their back, playing with a computer mouse as if it was a real live mouse

#

#

a fluffy cat pouncing on a computer mouse like it was a real live mouse

the way it understands prompts is phenomenal for real

#

it just sucks that it needs corporate datacenter level computation

#

you can see in his eyes, he wants to eat

indigo carbon Oct 21, 2023, 10:01 PM

#

steady grove a fluffy cat on their back, playing with a computer mouse as if it was a real li...

I get this. I'd say the quality of SDXL is better but the prompt following isn't there

steady grove Oct 21, 2023, 10:02 PM

#

yeah the quality of renders compares very well. especially if you're a skilled operator

indigo carbon Oct 21, 2023, 10:03 PM

#

steady grove yeah the quality of renders compares very well. especially if you're a skilled ...

honestly SDXL's quality is better imo, but idk. this is just my opinion

bright valley Oct 21, 2023, 10:04 PM

#

It's a lot different that's for sure

#

But once you get the hang of how it works you can definitely control it a little better

steady grove Oct 21, 2023, 10:04 PM

#

Yeah often, i prompt knowing that sdxl isn't going to get the prompt very well. i'm just throwing stuff out there to sort of nudge it towards what i want

#

sd1.5 did that even more so. prompt salads i use extensively on that side

bright valley Oct 21, 2023, 10:05 PM

#

Like I made this render with SDXL and there's 30 different anime in it, and I think it nailed just about all of them, through prompting alone #🎥｜animation message

indigo carbon Oct 21, 2023, 10:06 PM

#

steady grove yeah the quality of renders compares very well. especially if you're a skilled ...

also the speed of SDXL kinda makes up for it, even more so when using AIT

glad grove Oct 21, 2023, 10:06 PM

#

try with something harder like a "1girl praying inside a dark temple with a golden buddha statue with 16 arms in background" this one took me like 140 images to get on sdxl

indigo carbon Oct 21, 2023, 10:07 PM

#

takes me less than 30s to generate an entire batch like this

bright valley Oct 21, 2023, 10:07 PM

#

I don't use things like 1girl on SDXL

#

that's kinda exactly what I mean

#

that works grat on 1.5 but certainly not XL

#

great

steady grove Oct 21, 2023, 10:08 PM

#

glad grove try with something harder like a "1girl praying inside a dark temple with a gold...

1girl is a booru porn board tag. you won't get that in sdxl until someone trains it in

#

it's on sd15 because novel ai did all that expensive work and everyone stole it

glad grove Oct 21, 2023, 10:08 PM

#

first try on dalle

indigo carbon Oct 21, 2023, 10:09 PM

#

glad grove first try on dalle

quality isn't there though, but it followed it perfectly

glad grove Oct 21, 2023, 10:09 PM

#

wonder if you could expand the scene with outpainting, idk if sd would understand the details and add them properly

steady grove Oct 21, 2023, 10:10 PM

#

microsoft likely found a sanitized booru tag dataset to train with. likely are investing millions into data set building

indigo carbon Oct 21, 2023, 10:11 PM

#

steady grove it's on sd15 because novel ai did all that expensive work and everyone stole it

4chan did that, right? people hacked NAI and leaked them

steady grove Oct 21, 2023, 10:12 PM

#

indigo carbon 4chan did that, right? people hacked NAI and leaked them

someone leaked em. i don't know who. everyone who used it after that stole it

#

sd15 is a poisoned well. a lot of garbage happened in its early development before the popularity kicked up

south horizon Oct 21, 2023, 10:16 PM

#

#

#

south horizon Oct 21, 2023, 11:02 PM

#

#

steady grove Oct 22, 2023, 12:17 AM

#

if mona lisa were a hot valley girl

half cedar Oct 22, 2023, 2:26 AM

#

half cedar Oct 22, 2023, 2:26 AM

#

half cedar

Bing

crisp owl Oct 22, 2023, 2:44 AM

#

Bing can make some great images at times, as Dalle3 can, but they both are so grainy if blown up to anything beyond quick viewing size

half cedar Oct 22, 2023, 2:55 AM

#

Sdxl

high skiff Oct 22, 2023, 3:09 AM

#

Wanted to give a little update on my realism LoRA progress. Here are some new examples of what it looks like now

Top left is mine, top right is RealisticVisionXLV2, bottom left is Realism Engine, and bottom right is Real Stock Photo

Current dataset is only 90 images and not trained too well. Working on the 500 images version with very meticulous tagging. Also experimenting with some new papers in training with the goal to get much higher fidelity, and much better brightness control

#

It's being trained to mimic the look of properly color graded professionally photographed portraits and various other image subjects

crisp owl Oct 22, 2023, 4:50 AM

#

"hypertile" in recent comfyui commit 🤔

nimble heart Oct 22, 2023, 5:07 AM

#

yea it's broken

#

supposed to help 1.5 models scale like XL does I believe

#

but it always errs out

crisp owl Oct 22, 2023, 5:08 AM

#

ah

nimble heart Oct 22, 2023, 5:08 AM

#

it tiles the first unet attention or something

#

so it doesn't blow up at high res

#

so it should make a sorta more linear XL performance curve instead of just super exponential
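
Why tiling attention helps can be shown with simple counting. This is not HyperTile's actual code, just a sketch that assumes self-attention cost grows with the square of the token count and that the token grid splits evenly into square tiles. The function name and the 128x128 grid are illustrative.

```python
def attention_pairs(height: int, width: int, tiles_per_side: int = 1) -> int:
    """Query-key pairs scored by self-attention over a height x width token grid.

    With tiles_per_side > 1, the grid is split into equal tiles and attention
    runs inside each tile independently (the sides must divide evenly).
    """
    if height % tiles_per_side or width % tiles_per_side:
        raise ValueError("grid sides must be divisible by tiles_per_side")
    tokens_per_tile = (height // tiles_per_side) * (width // tiles_per_side)
    num_tiles = tiles_per_side ** 2
    return num_tiles * tokens_per_tile ** 2


full = attention_pairs(128, 128)      # one 16,384-token attention
tiled = attention_pairs(128, 128, 2)  # four 4,096-token attentions
print(full // tiled)  # prints: 4
```

Without tiling, doubling each side of the grid multiplies the pair count by 16. If the tile size is held fixed instead, the count grows only in proportion to the number of tiles, which is the "more linear" behaviour described above.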

crisp owl Oct 22, 2023, 5:10 AM

#

Could be cool, does say still testing, so maybe it'll get perfected soon-ish

nimble heart Oct 22, 2023, 5:10 AM

#

and in the future it'll maybe expand to other areas of the model

clever verge Oct 22, 2023, 5:58 AM

#

high skiff Wanted to give a little update on my realism LoRA progress. Here are some new ex...

Great job! Your images feature more pronounced shadow details compared to the stock photo (and the other generated ones as well).

vital ermine Oct 22, 2023, 6:03 AM

#

high skiff Oct 22, 2023, 6:10 AM

#

clever verge Great job! Your images feature more pronounced shadow details compared to the st...

Stock photo is a SDXL realism fine tune

I specifically hand sourced and tagged the data set images for the training. My specific goal with this project is to increase the realism, diversity, and most importantly, the lighting / camera dynamic range of SDXL when doing photographic images

#

I can share more information about it tomorrow, I have a horrifically bad headache at the moment, and I'm off to sleep

clever verge Oct 22, 2023, 6:12 AM

#

high skiff Stock photo is a SDXL realism fine tune I specifically hand sourced and tagged ...

Here's a sample from my latest LoRA.

high skiff Oct 22, 2023, 6:13 AM

#

Has very nice fine details. My LoRA wasn't trained fully enough to really pick up on fine details

Yours has a bit more of a painterly look mixed in with the realism which is a nice aesthetic

clever verge Oct 22, 2023, 6:16 AM

#

I've mixed in some post process grain and LUT adjustments but it's subtle.

heavy zinc Oct 22, 2023, 6:45 AM

#

Hi all, Do I use sdxl? I have Sd 1.6

uncut fiber Oct 22, 2023, 7:02 AM

#

depends on the model you are using. If the file is about 6GB and has XL in the name, most probably yes

noble shoal Oct 22, 2023, 7:07 AM

#

clever verge Here's a sample from my latest LoRA.

She has 2 different skin colors and 3 legs, but the details are quite nice 👍

clever verge Oct 22, 2023, 7:09 AM

#

noble shoal She has 2 different skin colors and 3 legs, but the details are quite nice 👍

Yeah, legs got bit messed up on the gen. Skin I see what you say but not sure if it's a normal tan or not 😄

nimble heart Oct 22, 2023, 7:09 AM

#

3 legs == 2 vagonyas

clever verge Oct 22, 2023, 7:11 AM

#

Normally when you have as light skin as she has I'd say it's normal to be white on the non-sun side or do you see anything else that I have missed?

nimble heart Oct 22, 2023, 7:11 AM

#

her left arm is tanned like she hangs it out the window while driving

#

compared to her right arm and first two legs

steady grove Oct 22, 2023, 7:12 AM

#

nimble heart Oct 22, 2023, 7:13 AM

#

cookie monster always scared me

steady grove Oct 22, 2023, 7:17 AM

#

#

crumbs alllllll over his kb what a mess

noble shoal Oct 22, 2023, 7:27 AM

#

steady grove Oct 22, 2023, 7:30 AM

#

cookie thing

#

#

dalle totally got what i was prompting for

lapis gale Oct 22, 2023, 7:56 AM

#

warning, cookie overload

252210-2101827286-anime_artwork_a_cookie_monster_made_of_blue_cookies_._anime_style_key_visual_vibrant_studio_anime_highly_detailed.jpg

#

cuuuuute

252210-1982147459-pixar_style_cute_big_eyes_furry_a_cookie_monster_made_of_blue_cookies.jpg

noble shoal Oct 22, 2023, 8:04 AM

#

indigo carbon Oct 22, 2023, 9:18 AM

#

do you see it?

noble shoal Oct 22, 2023, 9:37 AM

#

Who in their right mind has not commented this download out of the Automatic1111 code? smh

nimble heart Oct 22, 2023, 9:38 AM

#

it only does that if your models folder is empty

fierce rivet Oct 22, 2023, 9:39 AM

#

Hi, newbie here. quick question with regards to the image dimension when generating with sdxl checkpoint in Sd webui. Do I keep it at 512 x 768 and then upscale it by x2, or just generate it at 1024 x 1536 without upscaling?

indigo carbon Oct 22, 2023, 9:39 AM

#

noble shoal Who in their right mind has not commented this download out of the Automatic11...

the A1111 webui is kinda outdated

#

not nearly as efficient as ComfyUI

noble shoal Oct 22, 2023, 9:40 AM

#

indigo carbon the A1111 webui is kinda outdated

No shit. I am a hardcore comfyui fanboy. Just want to check out BLIP2 captioning.

#

But loading this antique checkpoint....

nimble heart Oct 22, 2023, 9:41 AM

#

fierce rivet Hi, newbie here. quick question with regards to the image dimension when generat...

1024x1536 will probably work okay-ish on XL you'll just have to seedhunt a bit more.

#

512x768 is actually too small for XL
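
SDXL was trained around a 1024x1024 pixel budget, so one way to pick a size is to keep the aspect ratio and aim for roughly the same area. A minimal sketch, assuming a one-megapixel target and sides snapped to multiples of 64; both are assumptions rather than an official rule, and `sdxl_size` is a hypothetical helper:

```python
import math


def sdxl_size(aspect_w: int, aspect_h: int,
              target_area: int = 1024 * 1024, step: int = 64):
    """Pick (width, height) near target_area for the given aspect ratio.

    target_area and step are assumptions: roughly one megapixel, with both
    sides snapped to multiples of 64.
    """
    aspect = aspect_w / aspect_h
    height = math.sqrt(target_area / aspect)
    width = height * aspect

    def snap(value: float) -> int:
        # Round to the nearest multiple of step, never below one step.
        return max(step, round(value / step) * step)

    return snap(width), snap(height)


print(sdxl_size(2, 3))  # prints: (832, 1280)
print(sdxl_size(1, 1))  # prints: (1024, 1024)
```

For the 2:3 portrait shape of 512x768, this lands on 832x1280, which sits between the too-small 512x768 and the oversized 1024x1536.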

fierce rivet Oct 22, 2023, 9:43 AM

#

nimble heart 512x768 is actually too small for XL

Thank you

indigo carbon Oct 22, 2023, 9:45 AM

#

also I think SAI are working on a new encoder, I looked at things mentioned in the SDXL release and Emad said something about a future SD3.0 being entirely different

nimble heart Oct 22, 2023, 10:27 AM

#

emad speculates on a lot of things. we'll only see when the time comes

vital ermine Oct 22, 2023, 10:52 AM

#

vast ridge Oct 22, 2023, 11:03 AM

#

vital ermine Oct 22, 2023, 11:06 AM

#

#

#

noble shoal Oct 22, 2023, 11:19 AM

#

Made a test training a lora on text. Getting interesting results so far.

vital ermine Oct 22, 2023, 11:35 AM

#

lilac wren Oct 22, 2023, 11:44 AM

#

noble shoal Oct 22, 2023, 11:52 AM

#

#

vital ermine Oct 22, 2023, 12:04 PM

#

02033-1178499679-apocalyptic20dramatic20shot20of20a20futuristic20cybernetic20exo-suit20ninja20stealth20sneaking20cinematic20detailed20background20hyperdetai.png

noble shoal Oct 22, 2023, 12:10 PM

#

vital ermine Oct 22, 2023, 12:20 PM

#

lusty wolf Oct 22, 2023, 12:21 PM

#

#

vital ermine Oct 22, 2023, 12:34 PM

#

#

half cedar Oct 22, 2023, 12:53 PM

#

noble shoal

"I served with your father in the cookie wars"

vital ermine Oct 22, 2023, 1:00 PM

#

thorny frost Oct 22, 2023, 1:41 PM

#

hi guys! Any model recommendation to generate landscapes?

vital ermine Oct 22, 2023, 1:41 PM

#

thorny frost Oct 22, 2023, 1:42 PM

#

XL model I mean

cyan crown Oct 22, 2023, 1:52 PM

#

thorny frost hi guys! Any model recommendation to generate landscapes?

with good prompts SDXL base model is awesome

vital ermine Oct 22, 2023, 1:54 PM

#

cyan crown Oct 22, 2023, 1:56 PM

#

strong copper Oct 22, 2023, 1:57 PM

#

final stylization

cyan crown Oct 22, 2023, 2:05 PM

#

wet nacelle Oct 22, 2023, 2:33 PM

#

#

SDXL 1.0 base is still very good guys.

#

#

@vital ermine How many Loras do you have in the works?

wet nacelle Oct 22, 2023, 2:34 PM

#

cyan crown

rad

vital ermine Oct 22, 2023, 2:37 PM

#

wet nacelle @vital ermine How many Loras do you have in the works?

2 and one I want to start work on in a week or so

wet nacelle Oct 22, 2023, 2:37 PM

#

vital ermine 2 and one I want to start work on in a week or so

rad

vital ermine Oct 22, 2023, 2:38 PM

#

About to release this one if this training works right but buckets are not playing nice

noble shoal Oct 22, 2023, 2:47 PM

#

wet nacelle Oct 22, 2023, 2:50 PM

#

noble shoal Oct 22, 2023, 2:57 PM

#

#

wet nacelle Oct 22, 2023, 3:20 PM

#

#

#

#

#

#

and it's gone

noble shoal Oct 22, 2023, 3:38 PM

#

wet nacelle

noble shoal Oct 22, 2023, 3:39 PM

#

wet nacelle and it's gone

Sorry, was a thumbnail

wet nacelle Oct 22, 2023, 3:39 PM

#

meat couches

#

#

#

indigo carbon Oct 22, 2023, 4:54 PM

#

vital ermine Oct 22, 2023, 5:25 PM

#

half cedar Oct 22, 2023, 5:29 PM

#

Sdxl -> sd1.5 dreambooth -> Pika -> After Effects

sweet wyvern Oct 22, 2023, 5:54 PM

#

what's the minimum spec for the XL refiner?

crystal gazelle Oct 22, 2023, 6:09 PM

#

Does anyone know a proper tutorial to download SDXL, I've tried so many and every time I get some cmd error

rustic garnet Oct 22, 2023, 6:12 PM

#

try invokeai, that's quite user friendly

wet nacelle Oct 22, 2023, 6:22 PM

#

#

steady grove Oct 22, 2023, 6:36 PM

#

crystal gazelle Does anyone know a proper tutorial to download SDXL, I've tried so many and ever...

command line terminal work isn't always intuitive. The tutorials might be telling you commands that work on their system, but not yours.

There's another app that is easier to manage this stuff with. Stability Matrix. Package manager for a lot of different UI's for you to try out. https://github.com/LykosAI/StabilityMatrix/releases/download/v2.5.5/StabilityMatrix.exe

crystal gazelle Oct 22, 2023, 6:37 PM

#

@steady grove Will it allow me to generate AI images from prompt

steady grove Oct 22, 2023, 6:38 PM

#

matrix won't. it's an installer for various UI's like automatic1111, sd.next, Fooocus

#

one of those will do prompts to images if you've got the hardware for it

wet nacelle Oct 22, 2023, 6:38 PM

#

#

Too tall

#

He fell

#

https://youtu.be/vacJSHN4ZmY?si=ehNv1ME5aGfhtcZO

YouTube

Tikendrajit Rabha

Ini the beninging. (in the the beginning)

How hard to start from the beginning

▶ Play video

noble shoal Oct 22, 2023, 7:11 PM

#

#

#

cyan crown Oct 22, 2023, 7:33 PM

#

wet nacelle Oct 22, 2023, 7:37 PM

#

cyan crown Oct 22, 2023, 7:38 PM

#

wet nacelle Oct 22, 2023, 7:45 PM

#

cyan crown Oct 22, 2023, 7:46 PM

#

wet nacelle

can you provide your prompt ?

wet nacelle Oct 22, 2023, 7:47 PM

#

cyan crown can you provide your prompt ?

Pos: vhs camcorder footage of bladerunner Japanese town

Neg: black and white (cartoon), 3d, render, low res, low resolution, ((text)), ((watermark)), ((logo)), tongue out, ugly, masculine, vibrant, .com, ((tanlines)), (( ososedki.com))

cyan crown Oct 22, 2023, 7:47 PM

#

wet nacelle Pos: vhs camcorder footage of bladerunner Japanese town Neg: black and white (c...

thank you

wet nacelle Oct 22, 2023, 7:47 PM

#

yuppers

primal vault Oct 22, 2023, 7:49 PM

#

I just released a small update for Searge-SDXL to version 4.3 on CivitAI and on Github that adds support for FreeU v2 in addition to FreeU v1. It also adds some more FreeU presets.

Searge-SDXL: EVOLVED v4.3 - Optimized Workflow for ComfyUI - 2023-1...

Searge-SDXL: EVOLVED v4.3 Version 4.x is here. I made a convenient install script that can install the extension and workflow, the python dependenc...

cyan crown Oct 22, 2023, 7:53 PM

#

high skiff Oct 22, 2023, 7:55 PM

#

@noble shoal how does your LoRA work for making things out of text? My research group has a couple people researching text performance for SDXL, and one person who is doubting how good SDXL could ever be for text

I'd love to see what else your text LoRA can do, or even play around with it if you'd be so kind

#

From what I have seen so far, I'm quite impressed to say the least

half cedar Oct 22, 2023, 8:08 PM

#

noble shoal Oct 22, 2023, 8:10 PM

#

high skiff From what I have seen so far, I'm quite impressed to say the least

Thank you. Well, my one man research group has carefully captioned 98 photos with text in them. I included {ObjectInPicture} with the text "{Text shown in the image}" on it in every caption. I might be hallucinating but i think overall text coherency improved. One or two words usually get nailed instantly. I managed up to 6 word sentences in my tests. So yeah, i guess if the dataset is captioned well enough, SDXL has no problems with text.
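
[Editor's sketch] The caption pattern described above can be written as a small template function. This is only an illustration of the stated pattern, not the author's actual tooling: the function name `build_caption` and the sample objects and texts are hypothetical.

```python
def build_caption(object_in_picture: str, text_in_image: str) -> str:
    """Fill the pattern: {ObjectInPicture} with the text "{Text shown in the image}" on it."""
    return f'{object_in_picture} with the text "{text_in_image}" on it'


# Hypothetical samples, not taken from the real 98-image dataset.
samples = [
    ("a wooden sign", "OPEN"),
    ("a coffee mug", "World's Best Boss"),
]

captions = [build_caption(obj, txt) for obj, txt in samples]
for caption in captions:
    print(caption)
```

Keeping the quoted text in the same position in every caption is the part of the approach the message emphasizes; the rest of each caption would still need to be written by hand.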

#

high skiff Oct 22, 2023, 8:11 PM

#

It's quite incredible how fast SDXL seems to pick up on concepts with around 90 images of properly tagged data

noble shoal Oct 22, 2023, 8:12 PM

#

high skiff It's quite incredible how fast SDXL seems to pick up on concepts with around 90 ...

high skiff Oct 22, 2023, 8:12 PM

#

My realism LoRA is 90 images (working on a new much better 500 image version), and it makes a monumental difference compared to even the best realism finetunes out there

#

#

Mine is top left in all three

#

#

It's trained specifically for much better lighting, foreground/focus/background separation, and overall DSLR dynamic range compression

#

It's also trained to work with painfully simple prompts

#

"a portrait photograph of a black woman with blonde hair wearing a green suit at dusk in front of a shop"

#

Unfortunately, I'm only on my phone right now so I don't have any more examples, but I've probably tested at least 80 comparisons

noble shoal Oct 22, 2023, 8:16 PM

#

Oh, maybe this makes also a difference, but i am unsure. My training images have a maximum resolution of only 768x512. This also lets me create images at this resolution and then upscale them.

noble shoal Oct 22, 2023, 8:19 PM

#

high skiff

high skiff Oct 22, 2023, 8:20 PM

#

noble shoal Oh, maybe this makes also a difference, but i am unsure. My training images have...

that's a very interesting approach

#

my training images are 4k-8k+ lol

It doesn't matter much right now, but the final version of my LoRA should be able to handle absurd detail levels

wet nacelle Oct 22, 2023, 8:21 PM

#

noble shoal Oct 22, 2023, 8:42 PM

#

high skiff my training images are 4k-8k+ lol It doesn't matter much right now, but the fin...

8k? Wow, i don't think that this is really needed. Weirdly enough, my text lora is spitting out some very nice people too.

high skiff Oct 22, 2023, 8:43 PM

#

noble shoal 8k? Wow, i don't think that this is really needed. Weirdly enough, my text lora ...

for the type of training I am gonna be doing, 4k is gonna be necessary. For now, they are all being downsampled to 1024x1024 and its equivalent aspect ratios
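
[Editor's sketch] Downsampling to 1024x1024 and similar-area resolutions at other aspect ratios is usually handled by aspect-ratio bucketing. The following is a minimal sketch of one common way to pick a bucket, assuming a pixel budget of 1024*1024, sides in multiples of 64, and widths between 512 and 2048. Those limits and the function name `nearest_bucket` are assumptions for illustration; real training scripts differ in their details.

```python
import math


def nearest_bucket(width, height, target_area=1024 * 1024, step=64):
    """Return the (w, h) bucket whose aspect ratio is closest to the image's.

    Each candidate keeps w * h at or below target_area, with both sides
    being multiples of `step`.
    """
    aspect = width / height
    candidates = []
    for w in range(512, 2048 + 1, step):
        # Largest multiple of `step` that keeps the area within budget.
        h = (target_area // w) // step * step
        if h > 0:
            candidates.append((w, h))
    # Compare ratios in log space so wide and tall errors are treated alike.
    return min(candidates, key=lambda wh: abs(math.log((wh[0] / wh[1]) / aspect)))


print(nearest_bucket(4096, 4096))
print(nearest_bucket(6000, 4000))
```

With these assumptions a square source lands in the 1024x1024 bucket, while a 3:2 source lands in a wider bucket of about the same pixel count.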

noble shoal Oct 22, 2023, 8:45 PM

#

radiant tartan Oct 22, 2023, 8:52 PM

#

for OpenPose ControlNet on SDXL I've tried a few OpenPose models and none seem to work. No errors, the image just never comes close to the pose. Do I need that 5 GB OpenPose model? I tried the smaller one and no luck.