#💬|general-chat

hot kettle
#

I guess it doesn't work like that with the full model

karmic cedar
#

you know how you walk into a very basic grocery store and it’s got aisles of redundant boxes, all with very clever marketing designs yet comprised of the same refined substances, etc.? it’s not like a higher end grocery, where there are more varieties of produce, more varieties of food in general…that’ll be the trade-off between the various parameter builds of SD3 that get released

#

in terms of aesthetics

#

(theory)

distant swift
# hot kettle I guess it doesn't work like that with the full model

The model needs to be trained for both image and text conditioning to be able to do image encoding without controlnets or IP-Adapters, but the SD3 arch specified in the paper uses two CLIP models and a T5 model simultaneously, all for text encoding, and has no separate image encoder

hot kettle
karmic cedar
#

yes

static cape
#

let's wish for some news to arrive this week

karmic cedar
#

yeah i’m just being a pessimist, i love being wrong because it means progress was made where i didn’t think there would be

hot kettle
karmic cedar
#

yeah—just trying to imagine how that will translate visually

#

i think it will lead to a lot of syntactic homogenization

#

(which is…regrettably…also in the best interests of high profile marketers, etc.)

hot kettle
#

(not fact checked)

karmic cedar
#

that would be sweet

distant swift
#

I'm pretty sure people will start making quantized versions of the biggest SD3 model, similar to what was done with Llama 2, so that most people end up using the same model

hot kettle
distant swift
#

Though quantizing diffusion models isn't nearly as simple as LLMs

karmic cedar
#

when you consider the differences between a quantized text model, low and high bit

#

quality-wise

#

what stands out?

#

because what stands out in a text model will translate visually

hot kettle
karmic cedar
#

probably

#

unless they don’t have the compute lol

hot kettle
sage reef
red lynx
#

Omg what's with these new accounts today

karmic cedar
#

how about tech support?

sage reef
#

oh maybe you meant for training? i misread, cause i know for inference, t5 is optional

distant swift
sage reef
#

yea

#

it would be cool if the architecture is modular in a way that we can plug any variant of t5 we want and not just the one they are using

hot kettle
#

I still can't get over the fact that we came full circle back to GANs with turbo models

#

How is that even still diffusion

sage reef
#

i personally dont want speed, i want quality :3

hot kettle
sage reef
#

😮

#

well low vram is always nice yes

distant swift
sage reef
#

my god 1060 is so old right? when was that released

hot kettle
karmic cedar
#

context is what makes a field of red a rothko painting

sage reef
#

well for example, the flan t5 pruned is only like 2.5GB, so very compact and still works nice

hot kettle
#

If they use a pretrained one and the features are similar for both I'm sure it'd work

distant swift
#

I remember a model called Pixart being able to use different variations of T5

karmic cedar
#

pixart is still being updated afaik

#

good model

sage reef
#

well i tried pixart sigma, but cant get it to work with flan t5, so rip

red lynx
karmic cedar
#

there’s an llm2vec model now that can take mistral7b and turn it into a text encoder lol

distant swift
#

Cool, it's similar to SD3 arch wise, right? Transformer diffusion model with T5 text encoder

sage reef
#

but wait, the first one was alpha right? so they skipped pixart beta, or pixart gamma? straight to sigma? lol

hot kettle
#

So not all encoders will work

karmic cedar
#

it also suggests we might see techniques applied inversely to diffusion models so that they are more adaptable to this framework

hot kettle
karmic cedar
sage reef
#

PixArt Infinity

karmic cedar
#

i’ll premiere it on my fake youtube game comedy show series called “What’s With Those Latents?”

red lynx
#

It is quite coherent

sage reef
#

im still waiting for them to release some version of stable audio, musicgen is nice, but very limited in some aspects

distant swift
hot kettle
#

I wonder if some other big diffusion player will join the 'market' in the near future

#

Hopefully opensourced too

sage reef
#

the more competition the better

#

but yea hopefully open source :3

distant swift
#

Pretty sure SAI said SD3 is their final diffusion model

karmic cedar
#

it seems that the most proprietary and advanced platforms are being engineered to exist in a very narrow market, i.e. multimedia professionals who already have the equivalent of seed money from their hollywood success

distant swift
#

Well, that's what was said last time I checked in here

karmic cedar
#

it’s their final T2I model yes

sage reef
#

why would they make it final? they dont want money anymore?

hot kettle
karmic cedar
#

they have set themselves on a direct course with many different industries that are already reeling from other AI-related developments

sage reef
#

cosxl is very cool, im using it to edit some pics, very cool results, but sometimes the output is a bit blurry it seems

hot kettle
sage reef
#

stable confusion 🙂

hot kettle
#

Stable disfunction

ornate flame
#

unstable corruption

sage reef
#

and sd3 will have cosxl or editing capabilities too, so that's gonna be awesome as well

hot kettle
#

Stoic scattering

ornate flame
#

sd3 might never have a public release

sage reef
#

no it will

distant swift
#

SAI and the others will probably start working on multimodals now that we pretty much have a model for most stuff (3D, image, audio, etc..)

ornate flame
#

a lot has been changing at SAI recently, they might just abandon public model weight releases

sage reef
#

yea multimodal is the way it's going these days

hot kettle
sage reef
#

or maybe happemad is gonna release some greatness in the future

hot kettle
sage reef
#

haha

ornate flame
#

and I doubt he can just buy H100s with his own pocket

sage reef
#

😦

#

im sure Emad is a rich lad

hot kettle
#

I'm still waiting for the Mixture of mambas cosine scheduler adversarial contrastive slicing diffusion

#

Some paper names these days seem really absurd

sage reef
#

yo you saying the word mixture, like the mixture of experts in LLM, maybe we will have mixture of diffusions (MoD) where one expert is good at hands (assuming you want to generate something with hands), another expert is with face, etc :3 lol, prob not...

hot kettle
#

While it'd seem intuitive, MoE don't actually seem to divide the task the way humans would

#

It seems rather random, but improves the speed, so we just go with it

sage reef
#

on the downside... MoE models are usually huge in size... so imagine something similar for image generators, i dont think it would be great for even folk with 24gb vram cards LOL

hot kettle
sage reef
#

yea

hot kettle
#

But for MoE only 14b params should be used at once

#

So who knows, if no other new architecture comes along maybe we'll get MoD
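
The "only part of the params is used at once" point can be sketched with a toy top-2 router. This is a minimal illustration, not any real model's code: the eight experts are stand-in functions, the router scores are made-up numbers, and `route_top_k` / `moe_forward` are hypothetical names.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a short list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k=2):
    """Return [(expert_index, weight)] for the k highest-scoring experts.

    Weights are a softmax over the selected scores only, so they sum to 1.
    """
    top = sorted(range(len(router_scores)),
                 key=lambda i: router_scores[i], reverse=True)[:k]
    weights = softmax([router_scores[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, experts, router_scores, k=2):
    """Run only the selected experts and mix their outputs by router weight."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_scores, k))


# Eight toy "experts": expert i just multiplies its input by (i + 1).
experts = [lambda x, s=s: s * x for s in range(1, 9)]
scores = [0.1, 2.0, -1.0, 0.3, 1.5, 0.0, -0.5, 1.0]  # assumed router output

chosen = route_top_k(scores)
out = moe_forward(1.0, experts, scores)
print("experts used:", [i for i, _ in chosen], "of", len(experts))
```

All eight experts have to sit in memory, but each input only pays the compute cost of two of them, which is why the file size and the per-step cost of an MoE model can be so far apart.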

sage reef
#

idk... i mean considering we have all these cool toys in 2024, and at the speed tech is moving... my goodness, imagine what we will have in just couple years from now.. but hopefully nvidia starts releasing cards with at least 32GB vram... cmon bruh

#

i mean we dont have the final vram confirmation for 5090 right? it's all rumours i think

hot kettle
#

I ain't spending 2K for a new GPU

sage reef
#

ye the other thing is for the image generators to somehow use less vram, using some new algos

hot kettle
#

But yeah, the next couple of years, especially if no law against open-source models from small companies gets passed, are going to be crazy

sage reef
#

one day, we will be able to make super mario 64 by clicking just one button happemad

hot kettle
#

Maybe AI based game engines? Ai physics, rendering etc

sage reef
#

yea idk

#

i mean technically it's possible

#

we already have these image to 3d models

hot kettle
#

Yeah, also all nerfs and what nots

#

And we already use DLSS and fsr to improve performance

sage reef
#

next we need a neural network trained on level design, and it can generate a level with 3d assets for you, and you prompt just what style or whatever you wanna see in the level :3

hot kettle
hot kettle
#

Too late to explore the earth, too early to live on mars, but just in time to experience ai technology exploding

sage reef
#

instead of image 2 image, you have game style 2 game style, so you input mario 64 and it will give you something similar design wise, 3d platformer :3 , or maybe you just textually prompt: in the style of mario 64

hot kettle
#

We'll get brainwaves to image

sage reef
#

lol

hot kettle
#

Especially with the new stability ai research

sage reef
#

one day we will have ai brain surgery: you prompt: fix brain and it fixes the brain 🙂

hot kettle
#

Man I wish I was a part of this major ai development

sage reef
#

welp, time for me to shave... and im lazy af

hot kettle
#

Just use stable razor

sage reef
#

haha

#

that's actually a cool product name

hot kettle
#

That's gonna be stability ai merch rebrand after they drop ai

low moon
#

So what do the lucky ones have that layman peasants lack that they were chosen to test this out?

hot kettle
clear oyster
#

which sampling method should dreamshapers use?

low moon
#

If all you get out of bed for is to make money might as well not get up. I welcome our AI overlord and they can have all our jobs and economy.

hot kettle
#

What

low moon
#

the ultra tall super empty skinny towers of NY tell the story

#

We are moving towards the Star Trek future folks

#

no more money

clear oyster
low moon
#

4 hour work week?

#

try no hour work week

clear oyster
low moon
#

i dont think so

#

i wish tho haha

#

i never smoked weed

hot kettle
low moon
#

she

clear oyster
low moon
#

and no im not training checkpoints

#

tho i did some loras

#

i think they work

sage reef
#

stable coin? she is using dodge coin obviously

hot kettle
#

That's why she doesn't have access

sage reef
#

dodge coin is very unstable

low moon
#

Yeah :/

#

We all should buy gold and silver. Maybe AI won't take that from us. XD

sage reef
#

buy? how bold of you to assume i have money... 😦

hot kettle
#

She doesn't know about stable mining...

low moon
#

Well no one has money.

sage reef
#

stable miner is the new minecraft game

low moon
#

For those who have it, it's mostly tied up in empty glass towers and other legalities.

#

Stonks and "Art" and off shore areas. Etc.

#

So the top 0.01% is busy stressing over whatever they have so even they can't enjoy it. Really.

#

Of course the party line says otherwise.

#

Everyone is happy on Facebook.

#

even the oil barons smell their end and are scaling back their absurd megaprojects.

sage reef
#

there will be blood

low moon
#

Well there is

#

you could argue we are in the middle of WW3

#

no one labeled it as such tho

sage reef
#

history will label it in 20 years

low moon
#

uhuh

sage reef
#

i'll use my stable fork

low moon
#

hope its made of gold

#

or silver

sage reef
#

it's made of diamonds

low moon
#

mm

clear oyster
clear oyster
clear oyster
clear oyster
sage reef
#

did you just ping me 3 times

#

sigh

#

we are just having a laugh :3 (i think)

karmic cedar
#

use your stable spoons ya’ll

#

the mind can bend

low moon
#

there is no spoon

karmic cedar
#

^^

sage reef
#

@karmic cedar your about me says you are machine learning researcher? you publish papers? 😮 or just student/learning type

karmic cedar
#

just a learning type! 😛

sage reef
#

ah

#

maybe you will give us the next stable diffusion after sd3 🙂

karmic cedar
#

i’m mostly an experimental artist

#

but i have a neuroscience background

sage reef
#

cool

karmic cedar
#

so i approach AI with that in mind

#

pun intended

sage reef
#

yea mind, neural network, it all makes sense now :3

karmic cedar
#

yeah it all runs together :_D

#

i love that people from different backgrounds can relate to AI based on its logic and structure

sage reef
#

im working on a neural network from scratch, so im also learning i guess

karmic cedar
#

nice!

#

transformer-based?

sage reef
#

wow easy there cowboy... when i said from scratch and learning... it's really from scratch... and learning.. as in... i just started 🙂
so technically im learning the inner workings of neural nets, like simple concepts and moving forward, no idea what it will turn
out in the end. i ultimately want to create some sort of domain-to-domain network, so like maybe GAN or idk.. like you input something
and it outputs something within the same domain, so either image to image or music to music or idk.. but not like text to image, then
again depending where this journey takes me, i might try that too, but for now want to learn mostly image to image stuff, cause i want
to combine it with my other research field, with digital signal processing and combine it 🙂

karmic cedar
#

nice!

#

i’m fascinated by fMRI2img

teal pagoda
#

Anyone knowing the launch date of SD3?

sage reef
#

april 26 happemad

teal pagoda
#

yea, sure

sage reef
#

i should probably stop saying this date cause people might start believing it and then when it turns out it's not, im gonna get like 100 pings

sage reef
#

haha

#

on the other hand, i hope it's not april 26, cause then people will think im working for SAI lol

hot kettle
sage reef
#

yea not easy stuff :3

hot kettle
#

Only thing that somewhat worked from my models was SR

red lynx
hot kettle
sage reef
#

i think GANs started in 2014 if i recall

hot kettle
#

Well yeah, but with current rate of progress that's archaic technology

#

And easy to write from scratch using torch/tensorflow

sage reef
#

oh im not using torch or anything, im literally doing it from scratch, limiting myself only to numpy for example and some plotting i guess, i really want to learn the inner workings

hot kettle
#

I think I wrote a evolutionary neural net from scratch in c# a while ago

sage reef
#

nice

hot kettle
#

But I'd not dare to even try implementing SGD with just numpy

sage reef
#

haha

#

it's not that bad :3
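
To back up the "it's not that bad" claim, here is a minimal sketch of per-sample SGD fitting a line, written in plain Python (numpy would only vectorize the same update). The data, learning rate and the name `sgd_linear_fit` are assumptions chosen for illustration, not anything posted in the chat.

```python
import random


def sgd_linear_fit(xs, ys, lr=0.1, epochs=100, seed=0):
    """Fit y = w * x + b with stochastic gradient descent, one sample per update."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the samples in a fresh random order each epoch
        for i in order:
            err = (w * xs[i] + b) - ys[i]
            # Loss for one sample is 0.5 * err**2, so
            # dLoss/dw = err * x and dLoss/db = err.
            w -= lr * err * xs[i]
            b -= lr * err
    return w, b


# Noise-free toy data generated from y = 3x + 0.5 on [-1, 1].
xs = [i / 10 - 1.0 for i in range(21)]
ys = [3.0 * x + 0.5 for x in xs]

w, b = sgd_linear_fit(xs, ys)
print(f"w = {w:.4f}, b = {b:.4f}")
```

A real network swaps the two hand-derived gradients for backpropagation through each layer, but the update rule stays the same: parameter minus learning rate times gradient.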

hot kettle
#

Yeah except every source for anything ai related gives me some crazy equations with characters I've never seen lol

sage reef
#

well i got help from wiki math when it comes to symbols im not familiar with, i know most of them

hot kettle
#

Well good luck with that but I'll stick to joining torch blocks together

#

And even with that I barely understand anything from the last few years

sage reef
#

i spent a lot of time reading technical research papers and sometimes they provide pseudo code within the paper and that helps piece the whole thing together and then i implemented it myself, for some of the projects (non ai projects), so you can learn, but some stuff is a bit too convoluted and the paper wont help you, unless they maybe decide to also release some code on github to learn from, but yea...

hot kettle
#

Yeah I try to read every bigger paper that comes out and some older ones I find interesting

#

But outside of the general mechanics they describe in text, most of the pseudo code or math equations don't make me understand it better

sage reef
#

that's why i decided to start from scratch or just math (numpy) cause i really wanna grasp it, so i then have total understanding and control of what im doing and i know what im doing... as opposed to.. here are these legos... and make something... but how did they make the lego itself :3

#

and of course i take notes and comment my code a lot

hot kettle
#

Well I understand those most basic basics, but I doubt I could code them without further reading

sage reef
#

im actually super crazy when it comes to commenting code haha, i spend like paragraphs just on one line of code sometimes, cause i need to remind myself what this is doing and how it can be used if you alter it or whatnot

#

or if i write a custom function

hot kettle
#

I make a comment every 200 lines if not more

#

Unless I specify the tensor sizes inside model parts

#

Or to segment my code

sage reef
#

i remember during one project, i was stuck implementing a research paper, and i remembered i did something very similar and even commented it on another project and that saved me... and i completed the project

#

cause i had to understand the logic

#

im also working on a 3d game engine, got the renderer part (but not complete), the physics (but not everything), and now doing animations, but that is a pain in the thingy

hot kettle
#

I never made an engine but I did make some game-ish projects, but that is way easier

sage reef
#

im a programmer, so i like to try all sorts of projects 🙂

hot kettle
#

I do some programming for fun but wouldn't call myself a programmer

narrow badger
#

hey

charred mesa
sage reef
#

mhm

#

you know it

sage reef
loud solar
#

Hello World!

narrow badger
sage reef
#

ok if you are replying, you are not a bot :3

narrow badger
#

just couldnt run stabilityai/stable-video-diffusion-img2vid-xt

sage reef
#

ah

narrow badger
#

maybe i find answers here

sage reef
#

the xt version takes a lot of vram if i recall

hot kettle
narrow badger
#

its been like 5 minutes

#

but its still on 0%

#

i have 4070super

#

is it normal

sage reef
#

i think you mean img2vid tho? and not img2img?

narrow badger
#

img2vid

sage reef
#

4070 super how much vram is that? i dont know all the models

narrow badger
#

wow it says 3.5hours left

#

its my gpu

loud solar
sage reef
#

yikes

#

but wait, how many frames are you generating?

loud solar
#

Hopefully not more than 25 🙂

sage reef
#

lol

#

gpu gonna explode

loud solar
#

Model can't handle more ...

sage reef
#

yea

narrow badger
rugged mirage
#

are you definitely using your gpu? sounds like you might be using cpu/downloading some models from the workflow for the first time if it's hours

narrow badger
#
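from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video
from PIL import Image

print("-------------------- START -------------------")

# No dtype or device is given, so this loads in float32 and stays on the CPU.
pipeline = DiffusionPipeline.from_pretrained("stabilityai/stable-video-diffusion-img2vid-xt")
print("Pipeline loaded")

image_path = "image.png"
image = Image.open(image_path)
print("Image opened")

# The call returns an output object; the frames of the first video are in .frames[0].
result = pipeline(image)
print("Image passed to model")

# The output object has no .save(); write the frames with diffusers' export_to_video helper.
export_to_video(result.frames[0], "output_video.mp4", fps=7)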


#

here is the code

rugged mirage
#

Im guessing that downloads the models from that link, that can take ages currently

narrow badger
#

first time just running it without ui

rugged mirage
#

show the output not the code

sage reef
#

is your pipeline using cuda?

rugged mirage
#

and check in task manager or nvidia-smi or whatever if the gpu is being used
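
A minimal sketch of what these replies are getting at, assuming a CUDA build of PyTorch plus a recent diffusers with accelerate installed: load the weights in half precision and let diffusers manage GPU placement. `StableVideoDiffusionPipeline`, `load_image` and `export_to_video` are diffusers names; `pipeline_settings` and `animate_image` are made up for this example, and the resolution, `decode_chunk_size`, frame count and fps are assumptions that will likely need tuning per card.

```python
def pipeline_settings(cuda_available):
    """Pick device and precision; half precision is only used on the GPU."""
    if cuda_available:
        return {"device": "cuda", "dtype_name": "float16", "variant": "fp16"}
    return {"device": "cpu", "dtype_name": "float32", "variant": None}


def animate_image(image_path, output_path="output_video.mp4"):
    """Turn one still image into a short clip with Stable Video Diffusion."""
    # Heavy imports live inside the function so the helper above works without them.
    import torch
    from diffusers import StableVideoDiffusionPipeline
    from diffusers.utils import export_to_video, load_image

    settings = pipeline_settings(torch.cuda.is_available())
    print("Running on:", settings["device"])  # "cpu" here explains multi-hour estimates

    pipe = StableVideoDiffusionPipeline.from_pretrained(
        "stabilityai/stable-video-diffusion-img2vid-xt",
        torch_dtype=getattr(torch, settings["dtype_name"]),
        variant=settings["variant"],
    )
    if settings["device"] == "cuda":
        # Keeps only the active sub-model on the GPU to lower peak VRAM use.
        pipe.enable_model_cpu_offload()

    # Assumed input size; the image is simply resized without cropping.
    image = load_image(image_path).resize((1024, 576))

    # Decoding a few frames at a time trades speed for lower memory use.
    frames = pipe(image, num_frames=25, decode_chunk_size=2).frames[0]
    export_to_video(frames, output_path, fps=7)
```

Calling `animate_image("image.png")` would then write `output_video.mp4`. The first run still has to download the weights, so a slow start on its own does not prove the GPU is idle; checking `nvidia-smi` while it runs settles that.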

sage reef
#

@loud solar why is he looking me like that? :3

#

but nice stuff man

loud solar
loud solar
sage reef
#

i recommend you use comfy, it's very well optimized for memory and image2vid works nice in there

#

or forge i guess

loud solar
#

Comfy is better if you don't need to learn everything 🙂

sage reef
#

nah you dont need to learn a lot :3

narrow badger
#

can you guys provide me a doc or a link? that would help a lot

#

actually im tryna implement ai in my app so i need a proper api

rugged mirage
#

you can use comfy and forge in an api too

narrow badger
#

thats why i tried it on python

loud solar
#

just copy model and that's it ...

sage reef
narrow badger
#

for api services which one do you guys suggest

sage reef
#

i never used api services personally, they are very limited

narrow badger
#

oh why

loud solar
#

Same here

rugged mirage
#

if your workflow is simple then yeh the way you are doing it with diffusers is best for API, if it's more complex then you can turn comfy into API

sage reef
#

im gonna stay with comfy forever ❤️

narrow badger
#

i just need to animate images in a very simple ways

#

no need configurations etc

sage reef
#

so you just want the easiest approach, in which case comfy is not for you

narrow badger
#

but all the examples on internet are about UIs

narrow badger
#

just a simple python code to run this model

#

i could not find any

sage reef
#

the only problem with the easiest approach is that it's most likely not gonna give you the best results, for example people literally have comfy workflows specifically tuned for svd to work and give nice results, which is not just the svd part, so yea.. you prob wont get the best results

narrow badger
#

i see

#

then i will check them apis

#

i love you guys thx a lot

sage reef
#

well good luck and always ask if you need help

narrow badger
#

thxx

gusty oriole
loud solar
#

Nicer to have some serious questions 🙂

sage reef
#

as opposed to not serious questions

gusty oriole
sage reef
#

what up doc

loud solar
sage reef
#

come on,.. that is serious haha

loud solar
#

Got fresh brewed tea ... that's serious ^^

sage reef
#

tea for two

#

i mean the question is serious, but the answers are not... for example april 26 🙂

loud solar
#

Nobody here can give the answer so how serious can be the question?

sage reef
#

well to be fair, not everyone is up to date with news perhaps, so technically speaking, they could have missed some actual news from SAI in which they did announce an actual date, and so the people asking sd3 when kinda makes sense in that regard

#

but yea most are just meming at this point... cause sd3 is a myth :3 lol

loud solar
#

But even if somebody working for SD would be here ... I don't think he would know ...

sage reef
#

i can only assume the lead dude knows, cause he gives the final order for release, the devs dont know when it will be released, they are just devs working on it

karmic cedar
#

And let’s not forget who the devs report to

loud solar
#

But the lead dude has other problems than chatting here 🙂

sage reef
#

exactly

loud solar
#

But he can DM me 😄

sage reef
#

we are just the peanut gallery for the lead dude, he can't possibly be bothered by us down here

loud solar
#

I still offer to watch the killswitch ^^

sage reef
#

"look at these peasants waiting for sd3", he said calmly

trail lion
#

look at these people thinking they are owed timelines and status updates

sage reef
#

right

loud solar
#

Maybe we should start a crowdfunding for earlier release? 🙂

sage reef
karmic cedar
sage reef
#

im actually curious what Emad is gonna cook

karmic cedar
#

him joining microsoft is poetic for the most part

#

i hope he works towards his own dreams while facilitating his role there

loud solar
karmic cedar
#

i mean what if that was a clever name for a new diffusion model

#

since diffusion models are technically entangled spaghetti

sage reef
#

@honest mica hey just have a question for you. i know you trained the CCTV loras, and that is technically a concept i guess? i want to try to train a concept lora, so wondering what are the best practices when training for a concept rather than just some object or thing? do concepts need something special? either parameter wise, amount of picture wise, or captions wise, or idk.. any tips? :3

steep timber
#

what is the best model for converting a pixelated image to a realistic one?

sage reef
#

SUPIR

#

unless you mean like an actual pixel style image, in which case, it can be almost any realistic model, cause you are doing image to image.

#

but if by pixelated you mean low degraded quality image to restored version, then SUPIR should help with that

#

and if the realistic model doesnt work, you can force it further with a lora like realistic slider, and put the slider to max strength and it should convert it

#

i did this to convert some anime pics to real and vice versa

karmic cedar
#

magic image refiner could also do the trick

sage reef
karmic cedar
#

i believe it is a comfy workflow—let me try to source a link for ya

sage reef
#

thx

karmic cedar
#

chiggity-check

#

it be a cog.

obsidian viper
#

hi, does anybody have a reference to a good image to video comfyui workflow?

karmic cedar
#

they have lots

sage reef
#

so wait im confused... cog? but can it be used in comfy or no?

karmic cedar
#

i’m gonna say no

#

i’m looking more closely at it

#

sorry for the confusion

#

it is a pretty good controlnet sandwich though

steep timber
karmic cedar
#

oh hi WizardLM 2 8x22B holy $@^#$&^

sage reef
sage reef
karmic cedar
#

crazy performance looks like

steep timber
#

i want a model that converts pixel art to realistic

#

that model looks like just an upscaler

sage reef
karmic cedar
#

I enjoy Fooocus for its ease of use and consistency

sage reef
#

yea cause i wasnt sure what you meant by the word pixelated

steep timber
#

yea, sorry

#

language skill issue 🤓

sage reef
#

it's ok 🙂

karmic cedar
#

all good here

sage reef
#

ye so you can use almost any realistic model, combine with optional lora like realistic slider, and then use image 2 image with strong denoise (cause it needs to convert it, so maybe 0.50 or above) and perhaps controlnet canny or whatever, depending if you care about the details
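
A small sketch of why the denoise value matters so much here. As far as I know, this mirrors how the diffusers img2img pipelines pick their starting step from `strength`; other UIs may compute it differently, and `img2img_schedule` is a made-up helper name.

```python
def img2img_schedule(num_inference_steps, strength):
    """Return (steps_skipped, steps_run) for an img2img pass.

    The input image is noised to the level of the first step that runs,
    so a higher strength means more noise and more freedom to change it.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    steps_run = min(int(num_inference_steps * strength), num_inference_steps)
    steps_skipped = max(num_inference_steps - steps_run, 0)
    return steps_skipped, steps_run


for strength in (0.25, 0.5, 0.75, 1.0):
    skipped, run = img2img_schedule(30, strength)
    print(f"denoise {strength}: skips {skipped} steps, runs {run}")
```

At low values only the last few steps run, which is enough to clean up texture but not enough to turn pixel art into a photo. That is why something around 0.50 or above, with a controlnet to hold the composition, is the usual starting point for a style conversion.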

karmic cedar
#

you could find a really complex workflow and then go Doc Brown on a DeLorean in ComfyUI

sage reef
#

sometimes even simple workflows can work too, no need to go extra crazy 🙂

karmic cedar
#

oh yeah, a little canny can go a long away.

#

way, even?

pearl ocean
#

Forge is all you really need

nova zodiac
#

but adetailer, perturbed attention guidance, regional prompter, and controlnet all working awesome

pearl ocean
nova zodiac
#

connect net or control net?

nova zodiac
pearl ocean
pearl ocean
sage reef
#

you need to go to university to become a prompt engineer

nova zodiac
#

hell even just asking the bot on civitai for a prompt gives you a half decent starting point to work from

sage reef
#

i hope you didnt take what i said actually seriously... lol, i meant it as a joke considering people call it "prompt engineering" and engineering is usually related to university studies :3

nova zodiac
#

the /s was dropped

sage reef
#

it's all good fam

next dawn
#

Virtual assistant available here ✌🏻

sage reef
#

virtual? im only interested in actual real assistants, sorry

karmic cedar
#

when is DeepMind going to start playing Vampire Survivors and can I watch the stream plz

#

wow Cascade does sausages really well

#

is that because it’s a german architecture?

sage reef
#

based on what i understood, Cascade is basically Würstchen v3, or however you spell that in german and it seems that word means sausage, so i guess it makes sense? lol

karmic cedar
#

are you serious

#

i legit did not put that together

sage reef
#

haha

worn aspen
#

I feel dumb, but browsing the cascade sub, I'm looking for conversations and information, but it's mostly images. Is this an image generation bot channel thing like mid journey? I'm not seeing any indication in the rules channels.

karmic cedar
#

speaking of midjourney what’s with all the mid finetunes claiming to match v6 performance lately?

#

small spoiler: they do not

shell tendon
sage reef
#

for sure

heavy lark
shell tendon
#

yep. unfortunate.

heavy lark
#

i was using it for its superior clipvision ability, but now that ipadapter for sdxl has been revamped, that's now way better than cascade was.

shell tendon
#

if you don't use enough steps, you have leftover noise. but if you have too many steps, you have oversampling noise of some kind.

#

generally, the number of steps that lie between those two areas is zero

shell tendon
#

too bad that's not a reality

pearl ocean
nova zodiac
pearl ocean
#

🍻

warped shard
#

hmm quick question, anyone know where .pt files go?

#

well.. where to find it

#

dropped it in the embeddings folder but cant seem to find out where to use it

trail lion
# warped shard hmm quick question, anyone know where .pt files go?

first make sure you have the right kind of model loaded. ie. a 1.5 model with an embedding trained for 1.5. if you have something like 2.1 or sdxl it wont be compatible. secondly, look for a tab called "textual inversion" on the main page of automatic1111, and you should see the embedding there

#

the .pt files go under stable-diffusion-webui/embeddings
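
The same step as a tiny script, for anyone who prefers that to dragging files around. It assumes a default automatic1111 layout where the install folder is called `stable-diffusion-webui`; `install_embedding` is a made-up helper, not part of the webui.

```python
import shutil
from pathlib import Path


def install_embedding(pt_file, webui_root):
    """Copy a textual-inversion .pt file into <webui_root>/embeddings.

    Returns the path of the copied file.
    """
    src = Path(pt_file)
    if src.suffix != ".pt":
        raise ValueError(f"expected a .pt file, got {src.name}")
    target_dir = Path(webui_root) / "embeddings"
    target_dir.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    return Path(shutil.copy2(src, target_dir / src.name))
```

For example, `install_embedding("my_style.pt", "stable-diffusion-webui")` copies the file into place. The embedding should then show up under the textual inversion tab after a refresh, provided the loaded checkpoint matches the base model the embedding was trained for.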

warped shard
trail lion
#

you click on it and it puts a tag in your prompt

sage reef
warped shard
nova zodiac
karmic cedar
#

I’m going to finetune a lora of just action heroes busting through doors with guns and call it Embed This

sage reef
#

do it

karmic cedar
#

nah

warped shard
#

but uh if anyone knows how to use the embedding lmk

ornate flame
karmic cedar
#

“Democracy”

honest mica
thorny echo
#

Has anyone been able to make pixel art by using an image as its style?

thorny echo
shell tendon
#

np 🙂

thorny echo
#

Just to double check it goes in the Lora folder right?

shell tendon
#

yup

#

err

#

yes

#

that is a lora

#

had to check real quick cuz there's also a checkpoint that's kinda similar

thorny echo
#

Doesn't seem to be showing up in the lora tab. I am putting it in D:\sd.webui\webui\models\Lora

shell tendon
#

might need to hit the refresh arrow

karmic cedar
#

damn lora how u tune so fine

sage reef
#

lora? barely knew her

thorny echo
shell tendon
#

restart it then

thorny echo
#

yep It shows up now 👍

sage reef
wide tendon
#

do you think in the future we won't hire clothing models?

small cloak
sage reef
#

huh

fervent thunder
#

What is a good way to turn 3d to anime

#

The opposite is usually much easier

warm bane
#

Are there people that are considered to be experts in use of AI now? Not only image related but also other stuff? I have an interesting idea and was wondering if you guys could point me to 'experts' 😄

#

Like basically all the creative stuff like image generation, LoRAs, music, animation (Most important is image generation but you get what I mean)

nova zodiac
#

I mean theres levels of experts - there are those that are aware of the wide variety of tools, there are those that know how to put those into a workflow to combine em, and then theres those that know how to build em

warm bane
nova zodiac
warm bane
#

most of the stuff doesn't even have proper explanation and/or even if you get to that point, there are always minor issues and it's so hard to understand why

unkempt hatch
#

which model would you use for inpainting faces?

shell tendon
#

yeah the documentation situation as a whole is horrendous

#

most of the time i've spent learning this shit was wasted combing through bad or nonexistent or wrong documentation

unkempt hatch
#

e.g. I have a class picture of my students. One parent has requested their son's face be removed (e.g. a celebrity kid and they don't want that shared). I could inpaint and change his ID without wrecking the picture

warm bane
warm bane
#

I have so many random basic questions all the time, I don't even know where to start 😄

#

But then if I ask something simple and/or have an issue that's clearly not even my fault, a lot of people I talk to would be condescending

#

and I also really want to set things up but dont wanna bother someone 24/7 so idk

#

very frustrating dynamic

proper locust
#

Baby Iron Man

loud solar
#

YaY! My 5k ASUS coupon arrived 🙂

pearl ocean
#

is EpicrealismXL the most realistic model??

nova zodiac
#

I might be biased though

pearl ocean
nova zodiac
#

but use that, dpm++ 2m karras, 3cfg, 832x1216, 30 steps, no hires fix, and pag scale of 0.6, adaptive scale 0.6

stone walrus
#

This used to be easy, but i guess it changed. What syntax am i supposed to use to generate an image now? I tried # stable-cascade and it doesn't respond

stone walrus
#

why do i see other images that look recently generated?

nova zodiac
#

because they would have been made with a different generator and copied in

gray vapor
#

I'm working on a workflow for AI architectural renderings. So far my results are very promising and already very useful in my practice for design choices. My first goal was to have control over the materials in specific elements in the view, also avoiding prompt bleeding, and I have achieved it with the use of ipadapter+attention masking plus regional prompting/regional sampler over color masked exports.

My next goal is harder and has to do with being able to generate multiple renderings from different points of view while keeping consistency in the materials. Using the same references in ipadapter helps but the material is not exactly the same, details appear at different places etc. Geometry consistency is obvs achieved with controlnets.

#

For this I have considered these strategies:

    1. Using the og color masks to cut the parts out of the first generated image and using those cutouts as references for ipadapter in the next views, with the plus model and a high strength. The problem is that, for small elements, these cutouts can be quite small or far from a square ratio. This could be solved with upscaling, but that seems too inefficient. Also this wouldn't help with the position of specific details, only with the general material color and texture.
    2. Given that I have the og geometry, somehow transform the parts of the first image and place them in the correct spatial location in the next view, then use this as a latent in an img2img setting (but it would have a lot of "non filled parts").
    3. my fav: AnimateDiff, which already tries to solve a problem of temporal consistency. I could export a video orbiting around the space, generate a full video maybe at low steps or low res (for efficiency) and then only choose the frames that are interesting to me to continue denoising and upscaling. I like this idea but it also seems inefficient. I wonder if there is a way to "hack" the motion information from the module to use it directly without generating every frame in the middle. Also, having access to the geometry, maybe I could export the accurate motion vectors directly without relying on preprocessing.
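For strategy 1, the "small or far from square" cutout problem could be softened by cropping a square box around each mask region before upscaling. A minimal sketch, assuming the color mask has already been turned into a 0/1 grid; `square_crop_box` is a hypothetical helper, not part of any ipadapter tooling:

```python
def square_crop_box(mask, pad=0):
    """Return (left, top, right, bottom) of a square box around the nonzero pixels.

    `mask` is a list of rows of 0/1 values. The tight bounding box is grown to
    a square on its shorter side, then shifted so it stays inside the image.
    If the element is larger than the image's shorter side, the square is
    capped at that size and may clip the element.
    """
    height, width = len(mask), len(mask[0])
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(width) if any(row[x] for row in mask)]
    if not rows:
        return None  # empty mask, nothing to crop

    top, bottom = rows[0], rows[-1] + 1
    left, right = cols[0], cols[-1] + 1

    # Square side: the longer edge of the tight box plus optional padding.
    side = min(max(bottom - top, right - left) + 2 * pad, height, width)

    # Center the square on the element, then clamp it to the image bounds.
    cx, cy = (left + right) / 2, (top + bottom) / 2
    new_left = int(min(max(cx - side / 2, 0), width - side))
    new_top = int(min(max(cy - side / 2, 0), height - side))
    return (new_left, new_top, new_left + side, new_top + side)


# A thin, wide element (1 px tall, 4 px wide) in a 6x8 image.
mask = [[0] * 8 for _ in range(6)]
for x in range(1, 5):
    mask[2][x] = 1
box = square_crop_box(mask)
```

The square crop brings in some surrounding context, which is a trade-off: the reference keeps a usable aspect ratio, but a bit of the neighboring material comes along with it.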

I'm relatively new to SD and therefore I'm sure there are a lot of other ways to tackle this problem of point of view consistency. I'm really looking forward to hearing about your ideas.

#

thx in advance 🙂

pearl ocean
#

Imagine using an A.I pin

hallow nexus
pearl ocean
hallow nexus
#

not just for portraits, they are excellent because they have many varieties of faces and postures. Obviously it depends a lot on the prompt and the samplers you use (I recommend DPM++ 2M SDE Karras, or DPM++ 3M SDE Exponential)

pearl ocean
#

lol, I got like 6 models already downloaded for some reason XD

thorn quarry
#

HI

#

BABY

#

what are you doing

#

sb

fair pewter
#

im honestly mad

#

interesting

thorn quarry
#

amazing

#

no you are not a honestly mad

wet dawn
#

hi

ionic crown
#

Are these bots not gonna work no more or what

#

Somebody clarify pls

loud solar
trail lion
#

forget the bots, no way they're coming back

bleak matrix
#

Good morning, everyone! How are we all today?

onyx hedge
#

Genshin Impact

worn aspen
static cape
#

While we are waiting for SD3... is there any way to use Pixart-Sigma with ComfyUI, Swarm or Forge?

heavy lark
static cape
pine fiber
static cape
#

300 token limit and color / prompt adherence sounds good.

pine fiber
#

It’s pretty bad from what I tested

#

but the model was pretty small iirc

vernal saffron
#

Hope the reason SD3 is taking time isn’t because they’re retraining without any copyrighted data.

trim magnet
#

comedian eh

pine fiber
#

why do u assume they trained with copyrighted data in the first place

vernal saffron
#

I have to be able to generate pictures of Mario taking a shit 🚽

#

That’s my benchmark

karmic cedar
pine fiber
#

thats a bad thing why?

karmic cedar
#

it’s not a bad thing

#

we already know SD3 is the end of T2I for Stability devs

pine fiber
#

you think they will stop after that? why

karmic cedar
#

Stability has declared SD3 to be their last T2I model.

pine fiber
#

I didnt know that, strange

#

wonder why then

karmic cedar
#

it has a lot to do with Emad leaving, but it could also be some other factors involved as well.

pine fiber
#

yeah well I wouldnt know enough about that

karmic cedar
#

lots of politics involved there

pine fiber
#

I dont know if people will adopt sd3 anyway

#

the inference times seem really bad

#

maybe with turbo

karmic cedar
#

I’m certain they will, which is part of the complexity of the issue for sure.

pine fiber
#

but it probably costs a ton more to train

karmic cedar
#

cloud GPUs will step in on the timing part.

#

those will begin to take on more commercial representation as more investments are made

#

but prices will rise

#

the same logic Apple uses to deduce that consumers are okay with having a monthly plan for their smartphone as opposed to owning it outright is going to be the same logic used for cloud GPU, etc.

pine fiber
#

personally I dont think diffusion transformers are very good right now.. I dont see the same kind of generalisation LLMs have in these DiT models

karmic cedar
#

it’s all about attention guidance now

#

and structure

worn aspen
#

Is forge better than comfy regarding memory management and avoiding out of memory errors? I like comfy and have no desire to go back to an A1111 style UI, unless memory management improvements are significant.

karmic cedar
#

i think forge is memory king atm iirc

pine fiber
karmic cedar
#

that’s also why i say structure

#

with more controlnets for different types of data

#

etc.

#

or rather, not data but syntactic detail

worn aspen
#

Am I stubborn for liking the comfy interface above all other considerations?

karmic cedar
#

nah not at all 😛

#

it’s a cool interface.

#

it represents the actual workflows that these models all use and that’s like operating a steam engine almost 😛

pine fiber
#

I like it but it gets spaghetti

karmic cedar
#

^

worn aspen
#

I have no need to generate images, but it's fun, and comfy keeps me interested. I suppose I do need to try forge though.

pine fiber
karmic cedar
#

not really—not to my standards at least!

pine fiber
#

I cant think of a good example

karmic cedar
#

they’re just not that granular…yet.

#

but they can be with enough code. 😄

pine fiber
#

just like, does this piece of clothing make sense, should this thing look like this, etc

karmic cedar
#

text encoding precision is also a key factor…obviously

#

text encoding is sort of like how the number of pixels in a raster window are defined

#

in my simplistic view lol

pine fiber
#

its counter intuitive to me because I assume smaller prompts would work better for some reason

karmic cedar
#

consider how much better SD 1.5 images tend to look when they use ELLA

pine fiber
#

less stuff for the model to get wrong

karmic cedar
#

which is just really nice text encoding for the most part

#

right right

pine fiber
#

thats true

#

I just thought DiTs would be able to recognise their own mistakes more often but that doesnt seem like the case

karmic cedar
#

perhaps there’s more potential for them to down the line, but the code doesn’t seem to support that function as much as it’s theorized at the moment.

pine fiber
#

yeah

karmic cedar
#

we have an instinctive tendency to approach models holistically, which is good, but we’ve managed to make older stuff shine more just by building in new functionality to preexisting architectures. this is going to continue to be a powerful thing since the sky’s the limit with creativity and AI.

#

and how it gets extended. it’s like digital putty.

#

IMO

pine fiber
#

I agree

#

thats why I was on the fence about sd3 being "good enough" when 1.5 and sdxl are still getting better every day

karmic cedar
#

i’m just getting caffeinated this morning so i’m already on my AI soapbox

#

I think it’ll be a really nice, polished sports car of a model. But we’ve got Honda Civics already that have plenty of potential for mileage. That’s how I see it. 😛

#

like Sora—that’s a lamborghini for sure.

pine fiber
#

we get a sports car when we need an off roader

karmic cedar
#

lol

#

oh….we don’t get the sports cars

#

hollywood gets those lol

#

j/k

#

not j/k

#

what’s really going to be interesting is when smartphones and other devices start to carry localized LLMs. for example the current iphone can run Mistral 7B

#

and others

#

When those types of models begin driving other functions of the device, that’ll be a game changer. Even local image diffusion will be a thing, the likes of Apple could even have their own proprietary diffusion algorithms baked into a future release

reef wing
#

Sd3 api released

karmic cedar
#

😮

reef wing
#

Open weights with stability membership soon according to twitter

karmic cedar
#

the 26th is seeming more realistic now

#

or that range of dates

shell tendon
karmic cedar
charred mesa
reef wing
#

On X

charred mesa
#

whoops

mortal delta
charred mesa
#

no

#

we'll get people invited to Stable Assistant where you can use SD3

#

and also we'll get the models themselves in the future

#

(model files + code)

mortal delta
#

So now we just play the waiting game?

charred mesa
#

like we have been all this time tbh

#

lol

#

but this means we're finally closer

#

they've been promising API "soon" for 3 weeks

mortal delta
charred mesa
#

yup

#

but all that matters is that we WILL get the models that we can use offline and etc

#

even if its like 2-3 weeks away

mortal delta
#

I cant stand being impatient, it feels like ive been waiting for ever.

sudden ruin
#

I wonder why people are so impatient, yes a few weeks is a lot of time in the AI world, but sometimes it feels like the only reason to live for some people is to complain about SD3 not being released already

charred mesa
#

well yeah

#

people on the internet LITERALLY have nothing else to do

sudden ruin
#

Poor souls

mortal delta
charred mesa
#

see

#

now I do admit that announcing it so early was a stupid mistake

#

its been almost 2 months since they announced it

#

like thats the worst possible way to hype something up

rugged mirage
#

I hope they create a #sd3 channel soon

charred mesa
#

^

#

exactly

#

it would make sense

frigid wolf
#

Am I reading that announcement right, sounds like paid membership will be required for SD3 model weights even for personal/noncommercial usage?

charred mesa
#

what

#

no

#

for non-commercial no

#

only $20 for commercial usage and that's it

rugged mirage
frigid wolf
#

Okay good, it's not overly clear in the announcement

mortal delta
rugged mirage
#

so you have to pay for membership even if you have 10 viewers and get 2 cents from youtube ads?

honest mica
#

We aim to make the model weights available for self-hosting with a Stability AI Membership in the near future. kinda sus

charred mesa
#

yeah they wrote it in a weird way lol

#

but model weights will come

mortal delta
#

will they be open sourced by chance?

charred mesa
#

the code yes, the models will have that license where its noncommercial

charred mesa
#

but everything offline available

mortal delta
charred mesa
#

if you dont make revenue, like its just for free you're fine

#

I mean I would not have the heart to make these images for money

#

I do these for fun

rugged mirage
#

Id like to make some videos where I can get at least some ad revenue if anything goes viral, but kind of a nonstarter if you have to pay every month even at first when you are making 3 cents

hasty badge
#

from my tests, sd3 is laughably bad 😢 could they have some pre pre alpha in the api... that's hard to believe as well

charred mesa
# steel dome Why not?

well unlike these ai comic artists, I barely put in effort, and even with all the opted out artists there are a bunch of artists or studios that had their style left in so that would also make me feel guilty

hasty badge
#

it can't even get text without errors while that was supposed to be the big thing

charred mesa
#

if you put in a lot of effort and draw over it a lot and stuff then sure, I get it, you'd want some revenue for it cause you actually put in effort

frigid wolf
#

@charred mesa same, I love seeing these models as open and free as possible as it fuels research and innovation. Like supir, omg that is incredible, especially for restoring old photos.

mortal delta
polar roost
frigid wolf
#

There's just so much cool stuff that wouldn't exist if SD wasn't open

graceful spade
#

Hi guys

#

Are there any steps you can suggest to change expressions in a video using stable diffusion (not faceswap to a different person)? Like, my video has a bored face and I want to turn it into singing.

rugged mirage
#

inpaint a different expression

graceful spade
#

can you please tell me more how to do it, if you can please

rugged mirage
#

idk what you use for your video, but most of the approaches have an initial image you can provide

karmic cedar
#

temporal video editing hasn’t really become a thing…yet

#

it’s getting there

#

but not quite

graceful spade
#

can't we do it now? just try?

#

😐

mortal delta
#

So just to make sure sd3 is free thru api at the moment but cant be used to profit off of? is that right?

karmic cedar
#

it’s being made available via the developer API, yea

mortal delta
karmic cedar
#

no clue

#

i think you need to have a membership at the very least

mortal delta
#

Interesting...

karmic cedar
#

they must be having some interesting internal conversations

steel dome
#

So you will also need a membership to use the models on your machine? (comfyui/invoke/a1111)

rugged mirage
#

there is a cost for sd3 in the api docs at least

#

so wouldnt think it's free

real zodiac
frigid wolf
#

Would definitely be nice for stability to clarify this.

pine fiber
#

the only time you need a paid membership is for commercial usage and its $20 a month

charred mesa
#

^

pine fiber
#

dark your pfp looks like netero

steel dome
#

Lol, i guess

pine fiber
#

well if you thought you could use a million dollar model commercially for free thats basically stealing

#

I think $20 a month is a good compromise

charred mesa
#

NETERO??????

pine fiber
#

yes

steel dome
#

I still think there should be like a threshold that makes it "commercial use" kind of like Unity3d does

#

If you do less than X money, it's free

rugged mirage
#

yeah I wish the $20 a month was at least only for if you do make more than $20 (or a bit more even)

#

though I guess they are unlikely to come at you if you are making $10 but still

mortal delta
trim magnet
steel dome
rugged mirage
#

eh, if you are just someone trying to make tiktoks or youtube videos, and you pay for a year trying to make it $240 while you make $0.34 back, on top of all the other stuff you use it's not great

mortal delta
rugged mirage
#

especially since you also need a bunch more hardware, electricity etc. to generate compared to paying slightly more for sora or whatever

sudden ruin
#

I remember times when you could buy stuff and simply own it

mortal delta
steel dome
timid bloom
#

x20 cost of SDXL?

#

NICE.

#

xD

rugged mirage
#

gpt costs a lot more to run, like you can never run it at home at that speed, so it being the same price, while gpt4 needs no hardware purchase and runs on their hardware, shows how much better of a deal gpt4 is

mortal delta
timid bloom
#

try with fp8

#

working nice on A1111 with 8gb 3070

trim nymph
#

what are the sd 3 costs via api? can someone let me know real quick

mortal delta
timid bloom
#

A1111 now supports fp8 mode

mortal delta
timid bloom
#

so if you download new version it should be right there in options under optimization
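Rough arithmetic on why fp8 helps on an 8gb card. A sketch only, assuming about 2.6 billion parameters for the SDXL UNet (an approximate figure that is not from this chat) and counting weights alone; activations, the VAE and the text encoders need extra memory on top:

```python
def weight_gib(num_params, bytes_per_param):
    """Memory taken by the weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


UNET_PARAMS = 2.6e9  # assumed, approximate SDXL UNet parameter count

fp16_gib = weight_gib(UNET_PARAMS, 2)  # 2 bytes per weight
fp8_gib = weight_gib(UNET_PARAMS, 1)   # 1 byte per weight

print(f"fp16 weights: {fp16_gib:.2f} GiB, fp8 weights: {fp8_gib:.2f} GiB")
```

Halving the bytes per weight halves the weight footprint, which is where the headroom on smaller cards comes from.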

#

no idea

#

never used

mortal delta
#

I guess ill research it then.

rugged mirage
#

forge generally has lower requirements than A1111 so definitely try it over A1111 if that's your issue

mortal delta
rugged mirage
#

you probably cant but 500 means their server is crashing from the request, so either it's a problem on their end or you are sending broken data - ill-formatted or something

steel dome
pine fiber
#

oh

#

20x is crazy

#

scaling laws my ass

#

sounds like issue on their end

timid bloom
#

yeah

#

its on dalle 3 price level

pine fiber
#

it looks bad tbh

#

the anatomy is really messed up for some reason?

timid bloom
#

no idea, didnt test it out

rugged mirage
#

so exactly what I guessed and they've even told you what it is

#

you've put 'your account' instead of a token, presumably copying the documentation directly instead of actually registering, getting a token and putting it where it tells you
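A small guard against exactly this mistake, as a sketch. `build_auth_headers` and the placeholder list are hypothetical, and the Bearer-style `Authorization` header is an assumption, so check the API docs for the exact header format:

```python
# Strings that look like they were copied straight from documentation.
PLACEHOLDERS = {"", "your account", "your-api-key", "YOUR_API_KEY"}


def build_auth_headers(api_key):
    """Build request headers, refusing keys that are obviously placeholders."""
    if api_key is None or api_key.strip() in PLACEHOLDERS:
        raise ValueError("replace the placeholder with the key from your account page")
    # Assumed header shape; the real API may expect something different.
    return {"Authorization": f"Bearer {api_key.strip()}"}


headers = build_auth_headers("sk-example")
```

Failing locally with a clear message is easier to debug than a generic 500 coming back from the server.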

balmy rune
#

When can we expect the weights?

timid bloom
#

soon TM

#

xD

pine fiber
balmy rune
#

I mean the model is essentially not released for as long as it's on the API.

rugged mirage
#

my guess is next month

#

are you sure you didnt close/unclose some bracket or quote in your request

hasty badge
steep arrow
#

I think we should focus on SDXL Ella implementation and getting text gen working in SDXL.

If SD3 is going to be this heavy, Stability is not considering local usecase

rugged mirage
#

hm fair enough then

pine fiber
hasty badge
#

500 generally means internal server error

rugged mirage
steep arrow
rugged mirage
#

most servers arent that well configured, and can return a 500 in a ton of cases

woven panther
pine fiber
woven panther
#

I fixed the comfy native ELLA node btw

#

it never worked properly before today 😛

steep arrow
steep arrow
pine fiber
#

yeah because the ella models are trained on t5 arent they? bigger is always better imagine using llama 3 would be great

woven panther
#

I have made a wrapper node for LaVi bridge as well, it's far worse than ELLA and there's no SDXL model for it either :/

steep arrow
#

Ella 1.5 > Composition Adapter > SDXL
And then additional images for style control is really ace.

Unbelievably controllable on SDXL, and the CLIP seems to work better when composition weighting is involved.

pine fiber
#

cool very nice

#

I would try it if I wasnt running amd

carmine herald
#

I stopped messing with Stable diffusion for a few months and came back and now everyone's talking about ponies, what's going on

sudden ruin
#

Pony is the go to anime Checkpoint right now if im not mistaken

carmine herald
#

What makes it so much better?

sudden ruin
carmine herald
#

a terrifying location

sudden ruin
#

Only nice people in there dogsmile

drifting knot
#

Hi everyone, I've been wondering whether it's possible to generate the exact same person in different pictures, like some Tom (who is not real or a celebrity). For example I want to create pictures where Tom is cooking or walking a dog.
How do I do it? Do I need to describe Tom in the prompt or use some special seed?

trail lion
#

There's a guy in the images room showing off the preview, fwiw

pine fiber
sudden ruin
#

Im not the right Person to ask

pine fiber
#

right XD

ornate flame
#

I don't like the wording of the announcement

trail lion
#

It will take effort, but it can be done

drifting knot
#

Is it possible to get it from default model, without training lora?

#

like any model from civitai

trail lion
#

Yeah, with the controlnet method, but Lora will be more flexible

rain aurora
#

(Masterpiece), (Best quality), (Ultra HD), (Super detail), (Whole body :1.2), 1 girl, Chibi, cute, smile, flowers, outdoors, holding the camera, sitting on the roof looking out into the distance, with mountains in the background, amber, warm yellow, sunset, artistic sense, Quadratic style, white clothes,

trail lion
#

Eww, 1.5 prompt

charred mesa
#

hehehe

#

opposite of natural prompting

rugged mirage
#

tbf that's more natural to me, because that's how I search in google or whatever, and not writing prose with a bunch of obviously useless filler words

sage reef
charred mesa
#

well I got to try the SD3 api

#

and its almost over

#

only 25 credits, 4 credits per SD3 Turbo image

#

and thats not a bundle of 4 images or anything

#

just ONE image

#

so I guess I'll either wait for stable assistant or weights lolll

sage reef
#

they need to make money before making it open weights, i guess it makes sense, but yea...

cerulean kraken
#

So.. as just a normal user that wants to try SD3, any easy way I can use this API?

karmic cedar
#

Some comfy words from the source

sage reef
#

cant wait

cerulean kraken
#

Is there like a simple website I can go try this on?

sage reef
#

so he said few weeks from now huh... that kinda passes my estimate of April 26 it seems :3

#

so maybe May 10

charred mesa
#

so you have to figure it out

cerulean kraken
#

Nothing I can just input my API key into and go off to the races?

rugged mirage
#

so he says today (always API first, then a few weeks later weights), a few = 2+ so likely early may, possibly mid may

sage reef
#

my new estimate is May 10 happemad

charred mesa
#

4 credits per one SD3 Turbo image

#

and 6.5 credits per one SD3 image
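Quick arithmetic on how far the 25 credits go at those prices. The numbers are the ones quoted in this chat; the dictionary keys are just illustrative labels:

```python
import math

FREE_CREDITS = 25
CREDITS_PER_IMAGE = {"sd3-turbo": 4, "sd3": 6.5}  # per single image


def images_affordable(credits, cost_per_image):
    """Whole images you can generate before the credits run out."""
    return math.floor(credits / cost_per_image)


turbo_images = images_affordable(FREE_CREDITS, CREDITS_PER_IMAGE["sd3-turbo"])
sd3_images = images_affordable(FREE_CREDITS, CREDITS_PER_IMAGE["sd3"])
print(turbo_images, sd3_images)  # 6 3
```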

cerulean kraken
sage reef
#

so how does the turbo version compare quality wise to base sd3?

charred mesa
charred mesa
#

could NOT test it cause I dont feel like spending 10$

sage reef
#

understandable

honest spear
#

so sd3 will require a membership for commercial use?

charred mesa
#

yes

cerulean kraken
#

nodders technically required for non-commercial use too but it's a free membership in that case

charred mesa
#

thats only for the api

cerulean kraken
# charred mesa thats only for the api

that's not what the post says 🤷 "In keeping with our commitment to open generative AI, we aim to make the model weights available for self-hosting with a Stability AI Membership in the near future."

rugged mirage
#

as far as I know sdxl also requires membership for commercial use

charred mesa
#

they cannot be this stupid

charred mesa
#

you sure you don't mean sdxl turbo

rugged mirage
#

nothing has changed there

#

also when you download most of their models through huggingface (I think including sdxl but idk anymore) you agree to the same non-commercial clause

cerulean kraken
honest spear
charred mesa
honest spear
#

well, whatever, I'll check sd3 licence at launch, if it's bad I'll just stay on sdxl or other open rail models that come out

rugged mirage
#

I guess Ill worry about it later, I do hope this helps them keep existing at least

#

because currently the chances of ever getting sd4 are kind of grim

cerulean kraken
charred mesa
#

10 times out of 10

#

guaranteed! 👍

rugged mirage
#

I guess at least if someone trains using the code and architecture from scratch, but not their weights they can make it fully open

woven panther
charred mesa
#

we have pixart-sigma which is openrail++, but that's 0.6B

#

similar prompt adherence, no text capabilities at all and somewhat cooked images (ESPECIALLY FOR PHOTOS)

sterile raven
#

Let me know when someone makes a UI for the SD3 API

wheat anchor
#

will sd3 not allow fine-tunes to be uploaded to e.g. hf and civitai because of its license?

charred mesa
#

naaah

cerulean kraken
charred mesa
#

it will

wheat anchor
#

hm

charred mesa
#

its just that the finetuned models will have to be licensed the same

wheat anchor
#

:/

charred mesa
#

Also

#

Textual Inversions may come back into fashion

wheat anchor
#

at least we now know why there was drama with people leaving stability ai

charred mesa
#

those were researchers

wheat anchor
#

ok...?

charred mesa
#

and Emad

#

emad makes sense

sage reef
charred mesa
#

cause emad left because of some openness reasons or whatever

#

I forgot the exact reason

wheat anchor
#

and that is exactly what i refer to?

rugged mirage
#

I mean, they've been too open anyway, it was impossible for it to last - it was basically burning VC money to give us free shit and that can only last so long before they run out of people giving them money

wheat anchor
#

look up enshittification

rugged mirage
#

sort of

wheat anchor
#

its the definition of it

rugged mirage
#

no, the definition of it is closer to milking users to increase profits

vague pond
rugged mirage
#

this is literally burning VC money for the users

charred mesa
#

drhead do you think Textual Inversion will make a comeback

#

you can train on 24GB and it may work for all 4 models

#

a dev said that SDXL Textual Inversions may work on SD3

wheat anchor
#

hopefully meta will join the diffusers game too and scare all commercial solutions like they did with llama2 and are gonna do again with llama3 next month

charred mesa
#

that'd be sick

vague pond
rugged mirage
#

llama hasnt scared commercial solutions, it's been really cool for people but hasnt scared openai or anthropic one bit

wheat anchor
#

look up mistrals latest open model

rugged mirage
#

but still none of it scares them as it can never really quite catch up

#

tho yeh surprisingly close

steep arrow
charred mesa
#

if only there would be another openrail++ t2i besides Pixart-Sigma

sage reef
#

pixart too cooked for me

charred mesa
#

^

#

sadly

wheat anchor
charred mesa
#

but it has great potential

sage reef
#

yea potential is there

wheat anchor
#

if you go back the chain to its root

vague pond
#

like, as far as I know textual inversion is one of the safer ways to do "quality" alignment. when i was trying to make soyjak faces on one of the furry models, I noticed that all of the preference alignment loras I looked at made the outputs into just glossy sparkly generic stuff, and the negative embedding for boring images made higher quality soyjaks which is what I wanted

rugged mirage
wheat anchor
#

also I didnt know Command R+ is rated higher than some gpt4 versions lol

sage reef
#

i wanted to try Emu Edit, but it's not open weights 😦

rugged mirage
#

yes it's the first one that got higher than early gpt4s

wheat anchor
#

I can only run Command R @ 4bits

#

R+ is just too slow :/

steep arrow
charred mesa
#

both emad (former stability ""dev"") and Lykon are saying that weights will come

sage reef
#

yea similar but cosxl has some blurryness problem idk or it kinda deforms the output

charred mesa
#

So to clarify, once you are done finalizing the architecture, will the model be released where people can download it for free for personal use?

Maybe even before, it's not up to me to decide.

steep arrow
sage reef
#

but cosxl technically is the "test" before sd3 edit model i guess, so hopefully the sd3 edit model will be better

charred mesa
#

yeah cosxl edit was mid

steep arrow
#

Even cosxl base was mid, I was hoping for different clip coherency.

Turns out IpAdapter Composition does way more to help things than a model switch did.

charred mesa
#

^

sage reef
#

then again... i didnt try to combo cosxl edit with sdxl refiner.. maybe it fixes the output

charred mesa
#

oh I was hoping for more than a blurry finetune with just contrast being increased

vague pond
#

I'm not concerned one bit about the non-commercial license tbh, I already am used to releasing my finetunes as non-commercial. I make those models for people to use, not for people to throw on some expensive cloud service or for people to put into some paywalled low-effort merge.

steep arrow
charred mesa
#

^ same

sage reef
#

yea refiner i only used at the very start of sdxl, then finetunes came and it was kinda pointless

charred mesa
wheat anchor
rugged mirage
#

I used the refiner in like the first 2 weeks of sdxl, and I think maybe once in january

charred mesa
#

yeah

steep arrow
#

I just used hires multipass from the get go. I only played with the refiner right at release

charred mesa
#

the refiner was deemed useless like 2-3 months after usage

#

if not less

vague pond
#

they probably should have made the refiner go over the last 300-400 timesteps instead

steep arrow
#

A bunch of the finetunes integrated refiner training stuff for a while too. I don't really pay too much attention once I get things working

rugged mirage
#

the 2nd wave of finetunes were already exclusively saying to run without refiner

charred mesa
#

exactly

vague pond
#

literally all you have to do is change the timestep selection from sampling [0, 1000) to [0, 200)

#

and since you're only training 1/5th of the timesteps the model should pick up on things about 5x as fast since it will only ever need to learn high frequency details
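The timestep restriction described above can be sketched as a change to the sampling range. This is a toy stand-in for the timestep-sampling step of a training loop, not code from any particular trainer; `sample_timesteps` is a hypothetical name:

```python
import random


def sample_timesteps(batch_size, low=0, high=1000, rng=random):
    """Draw one integer timestep per sample, uniformly from [low, high)."""
    return [rng.randrange(low, high) for _ in range(batch_size)]


rng = random.Random(0)

# Regular training: any timestep in [0, 1000).
full_range = sample_timesteps(8, 0, 1000, rng)

# Refiner-style training: only the low-noise end, [0, 200).
refiner_range = sample_timesteps(8, 0, 200, rng)

# Fraction of the schedule being trained: 200 / 1000 = 1/5.
fraction = 200 / 1000
```

Every training step then lands in the low-noise range, which is the reasoning behind the roughly 5x estimate; that figure is an expectation, not a measured result.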

eternal ledge
#

It's also just tedious to manage/publish 2 files rather than one 🤷‍♂️ and help confused users who applied them in the wrong order and so on

vague pond
#

plus the refiner model is also differently structured:

"attention_head_dim": [
  6,
  12,
  24,
  24
],
"block_out_channels": [
  384,
  768,
  1536,
  1536
],
"transformer_layers_per_block": 4,
"up_block_types": [
  "UpBlock2D",
  "CrossAttnUpBlock2D",
  "CrossAttnUpBlock2D",
  "UpBlock2D"
],

vs the base:

"attention_head_dim": [
  5,
  10,
  20
],
"block_out_channels": [
  320,
  640,
  1280
],
"transformer_layers_per_block": [
  1,
  2,
  10
],
"up_block_types": [
  "CrossAttnUpBlock2D",
  "CrossAttnUpBlock2D",
  "UpBlock2D"
],
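
To make the differences easy to see, the two fragments above can be compared key by key. A sketch using only the four keys quoted here (the real config files contain more); `differing_keys` is a hypothetical helper:

```python
refiner = {
    "attention_head_dim": [6, 12, 24, 24],
    "block_out_channels": [384, 768, 1536, 1536],
    "transformer_layers_per_block": 4,
    "up_block_types": ["UpBlock2D", "CrossAttnUpBlock2D",
                       "CrossAttnUpBlock2D", "UpBlock2D"],
}
base = {
    "attention_head_dim": [5, 10, 20],
    "block_out_channels": [320, 640, 1280],
    "transformer_layers_per_block": [1, 2, 10],
    "up_block_types": ["CrossAttnUpBlock2D", "CrossAttnUpBlock2D", "UpBlock2D"],
}


def differing_keys(a, b):
    """Map each shared key whose values differ to its (a, b) value pair."""
    return {k: (a[k], b[k]) for k in a if k in b and a[k] != b[k]}


diff = differing_keys(refiner, base)
for key, (refiner_value, base_value) in diff.items():
    print(f"{key}: refiner={refiner_value} base={base_value}")

# The refiner has four resolution levels where the base has three.
levels = (len(refiner["block_out_channels"]), len(base["block_out_channels"]))
```

All four quoted keys differ, so refiner weights can't be loaded into the base architecture or the other way around.
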
rugged mirage
steep arrow
#

Same reason cascade is not popular, people do not like to load a million things to do one thing.

rugged mirage
#

I feel like most workflows require you to load a ton of things and people deal with it

sage reef
#

i like cascade more for the image remixing part 🙂

rugged mirage
#

if it was enough better people would've lived with it

steep arrow
vague pond
steep arrow
#

Look at civit rollout of cascade models, how was that ever going to make sense for sharing finetunes? lol

rugged mirage
#

the a111 was really off yeh true

vague pond
#

I think Diffusers is the only one that implemented it correctly