#💬|general-chat


fervent thunder
#

I heard about stepfun cos I read about it in arxiv

floral umbra
#

Oh, only via api apparently. How much vram will it require minimum? Thonk

fervent thunder
#

yeah api isn't rly support

#

are you including blockswap?

#

with blockswap it will run on anything

#

without blockswap I am not sure what the actual minimum is

#

but their reference code defaults to 4 H100s and then one more server for controlling it

floral umbra
#

Haven't even acquired it yet, as i couldn't find a version of it being a single .safetensors

fervent thunder
#

oh I don't think we should be doing that anyway

#

huggingface's original format was so much better

#

VAE and text encoders separate, and files broken up

floral umbra
#

Interesting Thunk How much memory does it require to run ram and vram wise? Like, same as wan 720p?

fervent thunder
#

that's what I was saying

#

if you blockswap then it will run on anything only a few GB needed

#

without blockswap I am not sure

#

their code is 4 H100 plus one more server

#

to run it

floral umbra
#

Then not for me, as i prefer all local kek

#

Easier to pay 80 cents per 24 hours of my own 3090 than 80 cents for an hour of runpod for instance omegaLUL

fervent thunder
#

cloud is cheaper than the electricity required to run a 3090 at home

#

if you want to use at home for privacy that is fine

#

but cloud is actually the cheaper option

floral umbra
#

Well, runpod is 24x pricier than my electricity :P

#

Plus i do all local anyways due to privacy. I would only use cloud to test out a model's capability, or just a workstation card's speed vs my own card

ancient mauve
oblique agate
#

in economic sense always run cloud

fervent thunder
#

yeah I have no problem with libertarians that's fine
personally I just want cheap inference

#

you can do a mixture anyway

fervent thunder
#

vast ai is the cheap one

fervent thunder
#

RTX 4090 for $0.109/hr

#

for me to run that at home would cost $0.30 in electricity
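
Whether home or cloud comes out cheaper depends mostly on the local electricity price, which would explain why both sides above can be right. A minimal sketch of the arithmetic, assuming a hypothetical 450 W draw under load and a hypothetical $0.30 per kWh tariff (neither figure is from the chat; only the $0.109/hr rental price is):

```python
def electricity_cost_per_hour(watts: float, price_per_kwh: float) -> float:
    """Cost in dollars of running a load of `watts` for one hour."""
    return watts / 1000.0 * price_per_kwh


def break_even_price_per_kwh(cloud_rate_per_hour: float, watts: float) -> float:
    """Electricity price at which home and cloud cost the same per hour."""
    return cloud_rate_per_hour / (watts / 1000.0)


# Hypothetical inputs: 450 W system draw, $0.30 per kWh.
home = electricity_cost_per_hour(watts=450, price_per_kwh=0.30)
cloud = 0.109  # $/hr, the rental price quoted above

print(f"home:  ${home:.3f}/hr")
print(f"cloud: ${cloud:.3f}/hr")
print(f"break-even tariff: ${break_even_price_per_kwh(cloud, 450):.3f}/kWh")
```

Below the break-even tariff the home card is cheaper to run; above it the rental wins. Hardware purchase cost and idle time are left out of this sketch.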

oblique agate
#

kinda like dot com bubble

fervent thunder
#

yea I agree

#

when it becomes mainstream there will be more political will

#

and a larger market to sell into so raising capital will be easier

#

they need to research targeting use-cases more

#

at the moment its all a hammer in search of a nail

oblique agate
fervent thunder
#

sounds good

#

I started in the 90's lol

#

but it depends on what you call ai

polar wagon
#

Would anyone be able to point me towards a guide geared towards using InvokeAI to create large amounts of game assets (images of playing cards)? I'm making a game and I have cards that give powerups. I was thinking I could take the card descriptions, pass those to InvokeAI, and have it pump out placeholder artwork so I could continue development.

fervent thunder
#

I used it a tiny bit

#

but i am not familiar enough

oblique agate
#

I need to research more on image generation

fervent thunder
#

if you've mostly used comfy trying diffusers can be good

#

or pure pytorch like the original flux code (that particular code is rly nice)

sage reef
#

@woven panther hey i appreciate you started porting some SkyReels V2 stuff ❤️

oblique agate
fervent thunder
#

background removal is nice yeah

oblique agate
sage reef
#

how many background removal tools do we have these days? i lost count :3

fervent thunder
#

ye llm can pump out a nice GUI

#

in no effort

abstract quarry
oblique agate
#

basically nowadays as soon as someone pumps out some ai tool, someone will fork it to boost their resume

ancient mauve
#

ok how do I install the sdxl inpainting model or the ace plus model, none of them are working

abstract quarry
#

you don't install models

ancient mauve
#

you copy paste them in the correct folders

woven panther
oblique agate
#

capstone projects used to be a good idea until everyone just started copying each other on github as soon as something looks cool

sage reef
woven panther
#

no time to use them

sage reef
#

mhm

abstract quarry
#

I would assume you put them in the same directory as normal models, but not sure. I don't use forge

ancient mauve
#

does anyone here use forge?

abstract quarry
#

you use them only for inpainting the same way you would inpaint with a normal model

#

what is not working for you? does the model not appear in the model selection menu?

fervent thunder
#

some forge users hang here

sage reef
#

something interesting about SkyReels V2, i did a small inference test using basic comfy workflow with the i2v 540p 1.3 model,
and it generates the video sure... but like the starting frame (image) very quickly and abruptly changes and the camera even moves
in a strange way lol, idk.. il wait for your wrapper anyway, maybe you will implement it the way it's meant to be used, cause right now
kinda wonky lol @woven panther

fervent thunder
#

forge did look appealing I just never got round to it
and now I have left for Rust lol

ancient mauve
#

with forge UI

#

but I keep getting error

abstract quarry
#

what errors?

fervent thunder
#

ye its tricky cos not having used forge UI its hard to help

ancient mauve
#

what UI do you guys use

abstract quarry
#

comfyui or (rarely) InvokeAI

floral umbra
ancient mauve
#

Unable to start ComfyUI Desktop v0.4.36

#

this is why I dont use comfy ui

sage reef
#

i personally use just the normal comfy the portable one, not the desktop

abstract quarry
#

invokeai is the most simple to use ui in my opinion 🤷‍♂️

abstract quarry
#

comfyui is the most flexible one, but with a high learning curve

sage reef
#

i mean it's not even that high...

fervent thunder
#

comfy is the best gui yeah
to beat gui you have to go to command line / code frameworks

woven panther
#

and it follows the prompt REALLY closely

#

if you prompt something that's not in your input image, it can just move onto that and ignore it

#

and it's human centric model.. non-human stuff does that more often than not

#

I did get some amazing outputs when I initially tested it though

woven panther
#

so I don't think there's anything wrong, it works in both the wrapper and native just as it is too

ancient mauve
fervent thunder
#

the GUI has had more bugs lately

sage reef
#

hmm maybe il try again and see, but il wait for your wrapper as well 🙂

fervent thunder
#

but you can decouple the GUI from the back end and use the back end alone

#

one of my current projects is to make rust front end

ancient mauve
#

comfy my ass

woven panther
sage reef
#

yea

fervent thunder
#

I mean what GUI is alternative?

ancient mauve
#

comfyUI doesnt even work, I cant manage to install inpainting in forgeUI fm

fervent thunder
#

alternative GUIs are forge or invoke?
these are like 0.01% of comfyui features

#

if you include CLI/code-based then you get all of pytorch/jax/julia/C++/rust ecosystems etc
but these are not GUI

sage reef
#

try the non-desktop version of comfy, it should work

abstract quarry
#

invoke and forge have quite a lot of features

ancient mauve
abstract quarry
#

99% of the users don't need 99% of the extra features

ancient mauve
#

it needs a model to use

sage reef
fervent thunder
#

its tricky cos a lot of features are edge cases
where you only need it a handful of times
but in that moment you really needed it

#

there are a lot of features that I have not used in recent workflows that I found indispensable in previous ones

ancient mauve
#

whats the point of having one million features if something as basic as dragging objects on screen doesnt work

fervent thunder
#

I mean I agree I've switched to rust lol

ancient mauve
abstract quarry
#

it seems to work for everyone else 😂

fervent thunder
#

I got too frustrated after nearly 2 years of bugs

#

does it though

#

😂

ancient mauve
fervent thunder
#

the thing is its hard no matter what you do

#

some stuff like loading and casting I find hard in every single codebase and language

#

and sorting out compile

#

sage attention and teacache/firstblockcache also

#

this stuff needs setting up in every fresh project

abstract quarry
#

if you want an easy to install and easy to use ui, I would use invokeai.

I would argue that comfyui is the wrong tool for you. It's complicated to use when you don't understand the internals

sage reef
#

technology moves so fast that by the time those will be default in some setups, they will most likely be deprecated by that point 😂

ancient mauve
#

forgeUI is the one Im liking cause its the same as Automatic1111, the one I used in the past

#

its just that I cant manage inpainting for now

abstract quarry
#

you can also use forge. But you won't get help from anyone if you cannot say precisely what your error message is

sage reef
#

isnt swarmui a nice GUI? it has tons of features, like close to comfy features i think and it should give you inpainting stuff

ancient mauve
#

which is supposed to be an inpainting model

#

I put it in the stable diffusion models folder

abstract quarry
#

😬

#

don't use that one

ancient mauve
#

and when I try to do a generation I get AssertionError: You do not have CLIP state dict!

fervent thunder
ancient mauve
fervent thunder
#

sage is new yeah

ancient mauve
#

what do I use then

fervent thunder
#

although I actually use STA instead of sage where I can

abstract quarry
#

use sdxl inpainting for example

ancient mauve
abstract quarry
#

or flux inpainting if forge supports it (I don't know)

ancient mauve
#

where is the safetensor?

#

how do I use this then @abstract quarry

abstract quarry
#

it's a diffusers model. You could check civitai if they have a forge compatible one

ancient mauve
ancient mauve
#

I just want to download the model but I see a bunch of files instead of a single one and I dont know how to set up in forge

ancient mauve
#

what about diffusion_pytorch_model.safetensors whats the difference

sage reef
#

well you might need vae and clip, but thats also all there for you to download

#

fp16 is smaller and basically the same quality
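
The size difference follows directly from bytes per weight. A rough sketch, assuming a hypothetical 2.6 billion parameter model (roughly the scale of an SDXL UNet; the exact count is an assumption here) and ignoring file metadata:

```python
# Bytes used to store one weight in each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "fp8": 1}


def checkpoint_size_gb(num_params: int, dtype: str) -> float:
    """Approximate on-disk size in GB (10^9 bytes) for the weights alone."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


num_params = 2_600_000_000  # illustrative parameter count, not a measured value
for dtype in ("fp32", "fp16", "fp8"):
    print(f"{dtype}: {checkpoint_size_gb(num_params, dtype):.1f} GB")
```

Halving the bytes per weight halves the file, which is why the fp16 download is about half the size of the fp32 one.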

ancient mauve
#

so I need the safetensor, the vae and a clip

#

3 files, not 1

sage reef
#

well i never used forge idk.. but usually thats how it works, you either get 3 separate things (unet, vae and clip) or if you lucky you have all in one.
im sure there was a download for inpainting all in one somewhere, but i dont remember where

ancient mauve
#

thats the problem then, it wasnt working because I was missing files 🤦🏼‍♂️

ancient mauve
abstract quarry
ancient mauve
abstract quarry
#

also, it's based on flux-fill which is not supported by forge

#

at least it seems so for me

#

which is sad. flux-fill is probably the strongest inpainting model

fervent thunder
#

ye

#

powerpaint v2 for SD 1.5 is not bad

#

its bizarrely strong for an sd 1.5 thing

#

before Flux fill it regularly took SOTAs

sage reef
#

yea if you have the hardware specs, go maybe with flux fill

ancient mauve
#

wait

ancient mauve
sage reef
#

lol

ancient mauve
#

what is the best inpainting model I should download

#

for image EXTENSION

sage reef
#

depends on your specs

ancient mauve
#

I have an RTX 4090

abstract quarry
#

flux-fill is the best inpainting model BUT I don't know if it is supported by forge

sage reef
#

well you can do flux fill then

#

and i also have no idea about forge

abstract quarry
#

cause on their GitHub they write they haven't implemented full flux support yet

sage reef
#

yikes

ancient mauve
#

I want to bash my head against a wall

sage reef
#

this is why comfy is king 🙂

ancient mauve
#

isnt there a standalone github repository for inpainting or something

sage reef
#

i mean im sure you can find even a huggingface space for free to inpaint

fervent thunder
abstract quarry
#

or use invokeai, it has support for flux

ancient mauve
fervent thunder
#

if you search github flux black forest labs it should come up

ancient mauve
abstract quarry
#

there are plenty of inpainting models

sage reef
ancient mauve
native heart
#

anyone can give me tip on how to make face remain the same on i2v using wan 2.1 model

abstract quarry
ancient mauve
#

the flux one doesnt seem to work with forge, as you say, and the other model the other guy mentioned doesnt seem to be an inpainting model after all

abstract quarry
#

you can try the sdxl inpainting you downloaded with forge

ancient mauve
#

and after 3 UI installs none of them work because one is incompatible with flux and comfyUI doesnt even have a working drag feature and Im getting crazy

abstract quarry
#

try invokeai. It has a full installer that also automatically download all the models for you

abstract quarry
ancient mauve
#

wait forge UI seems to work with flux

fervent thunder
#

maybe they updated

#

there is another one called reforge

#

IDK what it is

ancient mauve
#

Im gonna try with forge I dont want to install any more stuff

ancient mauve
#

and the third one?

fervent thunder
#

text encoders

ancient mauve
#

do I have to download both or only 1 of them?

abstract quarry
#

you haven't used sdxl so far?

ancient mauve
#

no

abstract quarry
#

😅

#

then you have to download everything

ancient mauve
#

I used 1.5 long ago

abstract quarry
#

or just use invokeai 😬

ancient mauve
#

and I remember setting up the model and the vae, then no longer needing the vae files

#

but dunno how things work now

ancient mauve
fervent thunder
ancient mauve
#

so I end up with 4 files in the end, the unet, the vae and the 2 text encoders

abstract quarry
#

I don't want to make advertisement for invokeai 😂 it's just really newcomer friendly and it sometimes makes me crazy when tools like comfyui are recommended for new people although these tools are definitely more for professional users

abstract quarry
#

and if you want to also use sdxl you only need the sdxl unet file

fervent thunder
#

TBH I just forget for months at a time that invoke exists
as a tool I have no issue with it

ancient mauve
ancient mauve
sage reef
abstract quarry
fervent thunder
#

also there is swarm

#

I think swarm was unmentioned so far

sage reef
#

yea swarm is cool, even tho i never used it lol

#

i did mention it

fervent thunder
#

ah ok

abstract quarry
#

inpainting is an extension of the model

fervent thunder
#

didn't see

ancient mauve
ancient mauve
boreal dew
#

is 8gb of vram enough for illustrious?

#

starting to think not.

fervent thunder
#

is ok

boreal dew
#

it does not appear to be.

ancient mauve
atomic mortar
#

i used to run it on a 3070TI

#

but it depends if you are using an nvidia card or amd

boreal dew
#

which i didn't do at first
but now it's running a lot better now that i did

abstract quarry
abstract quarry
abstract quarry
ancient mauve
#

ValueError: Failed to recognize model type!

#

fuck this

atomic mortar
#

woah another one, gotta be bots or something

#

happens a lot here

#

lol i believe you

#

hmm hope you find clientele here

#

hmm not here probably, rarely some businesses appear here wanting AI solutions for dirty cheap/free

#

but its mostly a community server

#

you might have more luck on fiverr for freelancing

fervent thunder
#

hi if this is real person
your way of advertising is a really bad idea
cos it literally looks like a malware bot

brittle kraken
#

Hello everyone, I'm new to Stable Diffusion. As far as I can see, we can create images with models and upgrade them with LoRAs. Is there anything else we can add to improve them?

abstract quarry
#

loras are model finetunes, not necessarily upgrades. But the sd ecosystem is huge, so yes, there is a lot of other stuff

fervent thunder
#

some loras are

#

downgrade

brittle kraken
atomic mortar
#

you're asking if theres more stuff you can use than loras?

#

if so, theres controlnet you could use and embeddings?

brittle kraken
#

Fine Ty, I will look if I can get some info on youtube

solemn harness
#

Hi, I'm relatively new to this AI stuff, and I have a question.

#

I'm using qDiffusion. I tried out some negative embeddings, and I got this error message, and I don't know how to fix it. Any ideas? Error while Encoding.
stack expects each tensor to be equal size, but got [1280] at entry 0 and [768] at entry 18 (clip.py:71)

desert dagger
solemn harness
#

Text to image

solemn harness
#

I managed to get chatgpt to fix the code to automatically resize it. Seems to work just fine now.
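
The two sizes in that error, 1280 and 768, match the hidden widths of the CLIP text encoders used by SDXL and SD 1.5, so a likely cause is an embedding made for one model family being loaded next to one made for another. Resizing silences the error, but the resized vectors may no longer mean what they were trained to mean. A minimal sketch of a check that reports the mismatch instead; `check_widths` is a hypothetical helper, not part of qDiffusion:

```python
from typing import Sequence


def check_widths(vectors: Sequence[Sequence[float]]) -> int:
    """Return the width shared by all vectors, or raise a readable error."""
    widths = {len(v) for v in vectors}
    if len(widths) != 1:
        raise ValueError(f"mixed embedding widths: {sorted(widths)}")
    return widths.pop()


# Stand-ins for loaded embedding vectors (all zeros, only the length matters).
sd15_like = [[0.0] * 768 for _ in range(3)]
sdxl_like = [[0.0] * 1280 for _ in range(2)]

print(check_widths(sd15_like))
try:
    check_widths(sd15_like + sdxl_like)
except ValueError as err:
    print(err)
```

If the check fails, picking an embedding trained for the same base model is usually the safer fix.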

upper plinth
#

bruh sora are the biggest posse of wussies I have ever witnessed

#

cant have even a milimeter of cleavage on ur gens before they get tagged as violating policies

#

thank the ultra-feminists for this bastardization

agile tusk
#
If anyone is using Nvidia driver 576.02, there is a bug that can cause it to ignore GPU temperatures and therefore not control the cooling correctly. I found that it can be fixed by reinstalling it using the custom installation method, and checking the box for "Clean installation". Check that your GPU temperature changes and isn't fixed at the temperature it was at during start-up.
ebon locust
#

Hello, nice to meet you

wanton harness
#

Hello! Looking forward to explore!

hasty nebula
#

from transformers import BertTokenizerFast, BertModel
import torch

# word_ids() is only available on fast tokenizers, so use BertTokenizerFast
tokenizer = BertTokenizerFast.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')
model.eval()

sent = "An example sentence."  # example input; replace with your own text

inputs = tokenizer(sent, return_tensors="pt", return_attention_mask=True, return_token_type_ids=True)
with torch.no_grad():  # inference only, no gradients needed
    outputs = model(**inputs)
embeddings = outputs.last_hidden_state[0]  # shape: [seq_len, hidden_dim]

# Get mapping from subwords to original words
word_ids = inputs.word_ids()

# Accumulate embeddings per word
word_embeddings = []
current_word_id = None
current_word_embeddings = []

for idx, word_id in enumerate(word_ids):
    if word_id is None:  # special tokens such as [CLS] and [SEP]
        continue
    if word_id != current_word_id:
        # A new word starts: average and store the previous word's subword vectors
        if current_word_embeddings:
            word_embeddings.append(torch.mean(torch.stack(current_word_embeddings), dim=0))
        current_word_embeddings = [embeddings[idx]]
        current_word_id = word_id
    else:
        current_word_embeddings.append(embeddings[idx])

# Append the last word
if current_word_embeddings:
    word_embeddings.append(torch.mean(torch.stack(current_word_embeddings), dim=0))

# Convert to tensor
sent_embedding = torch.stack(word_embeddings)  # shape: [num_words, hidden_dim]

atomic mortar
#

Can confirm crashes have stopped too

fervent thunder
#

I use qt framework with py sometimes

ancient mauve
#

does anyone here use forgeUI

atomic mortar
ancient mauve
#

or this

#

but I keep getting `AssertionError: You do not have CLIP state dict!`

atomic mortar
#

Like in the cloud?

vale viper
#

Hi

ancient mauve
#

I have forgeUI locally but i cant make any of those work

#

maybe im not setting the folders right

sage reef
#

@woven panther LOL.. Phantom Wan ? this really doesn't stop does it... haha

knotty rain
#

Unsure if this is the best place to ask.
I have a 3080 with 10GB. What would be the best option for me to train a LoRA? Also open to using runpod or similar cloud options.
I've heard people use flux trainer. I'm not set on a model yet, but between flux and sdxl

stiff hinge
#

I’ve lately been thinking a lot about how AI is affecting the graphic design industry, so I made a quick dive into the topic with this new video. 🤔🎨
Would love to hear your thoughts — I’m open to any feedback! 🙌
Check it out here:
https://youtu.be/uLwnGXXPrfc?si=8tzI6EZaaGERGehq

mellow meteor
prisma owl
#

Hello people!

#

Today im here to introduce an important question to the people

#

I want to upscale a picture of Shannon Sharpe

#

And i want to know how, and what the best method is for the high quality pictures. Thank you very much

oblique elk
prisma owl
#

Yes, i will soon be the only expert in upscaling Shannon Sharpe images

#

It will go in the history book

oblique elk
swift roost
#

Would a 4GB GTX 1630 be useful for AI?

#

My guess is probably not but I'm trying to look for anything I can use

iron pendant
#

technically yes but you will be very very limited

atomic mortar
#

I think you could run stable diffusion 1.5? I personally havent tried it since it was hella unoptimized but nowadays you maybe could

steel prawn
iron pendant
#

I personally would not recommend a 1630 at all, save some more pennies

#

maybe a 2070 Super 8GB or something

steel prawn
#

yeah i think 8GB is pushing it but workable. Ive got 10GB with my 3080 and its still slow sometimes if im trying to upscale a lot.

swift roost
#

It's what I already have on hand. I was able to run an LLM on a 12GB 6700 XT which I currently main.

atomic mortar
#

My 3070ti used to do really well with xl pushing 30s per image with a few loras

atomic mortar
#

Probably forge

swift roost
#

Oh nice, I will look into that

tall gorge
#

2060 basically the minimum

#

acc runs quite a good amount of stuff but stuff like sdxl struggles a bit

oblique agate
#

Does it really cost Nvidia and AMD that much to put 24GB of VRAM in their GPUs?

swift roost
oblique agate
prisma owl
oblique agate
#

and 5070 ti and 5080 likely have larger memory bus size than that so they should be able to squeeze 24gb vram in

swift roost
#

5060 Ti has a 128-bit bus, the 1630 has a 64-bit bus

#

Eight 16Gbit chips gives 16GB
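
The bus-width arithmetic above can be sketched as follows, assuming each GDDR chip sits on a 32-bit channel and that clamshell mode puts two chips on each channel. The function name and the 24 Gbit chip density in the last call are illustrative assumptions:

```python
def vram_gb(bus_width_bits: int, chip_gbit: int, clamshell: bool = False) -> float:
    """Total VRAM in GB for a given bus width and per-chip density."""
    chips = bus_width_bits // 32  # one chip per 32-bit channel
    if clamshell:
        chips *= 2                # two chips share each channel
    return chips * chip_gbit / 8  # Gbit -> GB


print(vram_gb(64, 16))                   # 64-bit bus, 16 Gbit chips
print(vram_gb(128, 16, clamshell=True))  # eight 16 Gbit chips on a 128-bit bus
print(vram_gb(256, 24))                  # hypothetical 256-bit card with 24 Gbit chips
```

Under these assumptions capacity is fixed by bus width, chip density, and whether clamshell is used, so adding VRAM means changing one of those three.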

oblique agate
#

oh well amd, intel and nvidia are not giving us the best at a good price

upper plinth
#

nah bro the gpu's come pre-scalped now

#

average price for a 5090 if you manage to catch one is about $3200 USD

#

5080 I purchased mine at $1600

#

lowest they go probably $1300 for the crappy PNY ones

prisma owl
#

Can i run SDXL on 8gb Vram?

atomic mortar
atomic mortar
#

ooh an intel arc thats a first ive seen it

prisma owl
#

Is it hard to set up? Im new to image generations and just want to try make some images in various art styles

atomic mortar
#

Hmmm im not sure what you consider difficult, have you used Git before?

atomic mortar
#

hmm are you on windows?

prisma owl
atomic mortar
prisma owl
#

when u said AMD i thought u meant as in CPU

#

I forgot they made GPU's

#

So thats why I said Intel

#

I got a Nvidia card ahahahhaha

atomic mortar
#

ohhh

#

yeah then i recommend SwarmUI or ForgeWebui

#

swarm is an easier install imo

#

but in tech-support you can have more support here

#

with forge webui

desert dagger
prisma owl
#

@atomic mortar Do u perhaps know why all my images are like deformed

#

like the faces etc

#

the body

atomic mortar
#

Hmm faces are often distorted but what model are you using?

#

What resolution

#

Etc

prisma owl
#

Just using SDXL

#

1024x1024

atomic mortar
#

Base sdxl?

prisma owl
atomic mortar
#

Sdxl for realism

prisma owl
atomic mortar
#

But I'm going to bed, 3am n all

prisma owl
#

Hahahahahah same for me

atomic mortar
#

If you get stuck i recommend popping into the #🤝|tech-support channel or the SwarmUI discord if its a UI specific thing

prisma owl
#

Preciate u for ur help bro

#

Have a good night of sleep

upper plinth
#

Okay I have to admit Sora's image-to-video is blowing me away

#

Im going to try animating my Taylor Swift dark magician girl and pray that it doesn't tag it as violation

#

it is INSANE how detailed it is. Almost tempts me to pay for the Pro version

high flint
#

So, I've been away for a while from SD ai art gen, and I see a LOT of new model types, Such as Illustrious, Pony, Flux, and more. I'm mostly used to 1.5 and SDXL, what benefits do the new model types bring, and what use cases should I use them for?

quiet finch
#

It's all about finding what feels right and comfortable to use.

abstract quarry
#

Flux is the newest model family and has the best prompt following. It also gets anatomy right most of the time. Its weaknesses are a certain "plastic look" for photorealism and its limited understanding of many styles (in particular paintings). Both can be solved via custom models, though

fervent thunder
#

Hidream erasure 😂

abstract quarry
#

HiDream is probably just Flux finetuned on new text encoders 🙊

fervent thunder
#

it probably is secretly Flux in a trench coat and a hat yeah

prisma owl
#

@atomic mortar Hey

#

I was wondering how to inpaint pictures

#

Inpainting is changing stuff on pictures u have made right?

atomic mortar
#

Yeah either adetailer or changing something entirely

#

In swarm right?

prisma owl
#

Yeah in swarm

#

I read about adetailer but i couldnt download it to swarm

atomic mortar
#

If you just want to fix the face i recommend segment:face or use the + button next to the prompt box

prisma owl
#

unless its pre-installed already

atomic mortar
#

Automatic segmentation is basically adetailer

#

But bigger

#

You can segment anything to "fix"

#

Always do a segment at the end of a prompt

prisma owl
#

What is automatic segmentation in comparison with segment:face

atomic mortar
prisma owl
#

When i generate or like after?

atomic mortar
#

Both are possible

#

But give this a read

prisma owl
#

I will, yeah. Will adding that segment also add faces? because rn i have a prompt where i tell it to show the face with this: (face out of frame:1.1) in negative. I grabbed the prompt from Civitai so Idk how it completely works

atomic mortar
#

Hmmmm

#

No

#

Segment will look for a face

#

And "fix" it

prisma owl
#

Ohhh

#

Idk why its not working then unfortunately

#

I have in positive

#

"face showing"

#

in negative (face out of frame:1.1)

atomic mortar
#

Give me a few sec and ill look

#

Making a omelette

prisma owl
#

Hahahahaha enjoy

#

Take ur time

atomic mortar
ancient mauve
#

Im finally using flux fill with comfyUI

#

but outpainting like 100 pixels takes me hours and hours is that normal?

#

outpainting a single image is not nearly done after 3 hours

sudden jewel
hasty night
#

hi. i'm iqram

ancient mauve
#

is it normal that generating an outpainting with flux takes so much time?

abstract quarry
#

outpainting, inpainting, txt2img, img2img they all are internally the same thing

#

so no, it should not take more time than generating an image of same size

abstract quarry
ancient mauve
#

Will test

ancient mauve
ancient mauve
#

outpainting now takes almost no time, but it isnt working

#

it gives me a grey extension instead of generating anything really

trim knoll
#

IMAGINE/Bússola estilizada integrada a uma tela de TV ou antena

abstract quarry
#

in both cases you don't need a prompt

ancient mauve
ancient mauve
abstract quarry
#

it changes the masked region as much as possible

ancient mauve
#

it says lower values will maintain the structure of the OG allowing for image to image sampling

abstract quarry
#

in img2img, the higher the denoise, the more of the original image is changed

#

you want to outpaint, so the part you want to change is empty (e.g. gray). you want to 100% replace this part of the image

ancient mauve
#

yeah, so what changes that is denoise 100 not denoise 0 then

abstract quarry
#

100% denoise means completely replace this part of the image
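
One common convention for how a denoise value maps onto sampler steps is to skip the early part of the schedule in proportion to how much of the input should be kept. This is a sketch of that convention only; ComfyUI's samplers may compute it differently, and `steps_to_run` is a hypothetical name:

```python
from typing import Tuple


def steps_to_run(num_steps: int, denoise: float) -> Tuple[int, int]:
    """Return (first_step_index, steps_executed) for a denoise in [0, 1]."""
    if not 0.0 <= denoise <= 1.0:
        raise ValueError("denoise must be between 0 and 1")
    run = min(int(num_steps * denoise), num_steps)
    return num_steps - run, run


for d in (0.0, 0.5, 1.0):
    start, run = steps_to_run(20, d)
    print(f"denoise={d}: start at step {start}, run {run} steps")
```

At denoise 1.0 every step runs and nothing of the masked input survives, which is what an empty outpaint border needs. At 0.0 no steps run and the input comes back unchanged.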

ancient mauve
#

ok cool

#

also, does it matter how many pixels do I outpaint?

#

should I stick with multiples of 64 or something like that, then crop

#

instead of dunno, augmenting top by 81 pixels and left with 149

abstract quarry
#

in theory multiple of 16 but I think most tools handle that internally

ancient mauve
#

so if I want something like 60 pixels, 16*4=64 then crop the extra 4 pixels

#

rather than just augmenting 60

abstract quarry
#

outpainting is the same as inpainting. You just extend the image size beforehand and then do inpaint on the extended edges

#

only the total image size has to be multiple of 16

ancient mauve
abstract quarry
#

whole image

fervent thunder
#

in machine learning its just easier to make everything multiples of 64

abstract quarry
#

and as said, tools usually handle that internally anyways (e.g. extend to 16 and then crop)
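
The "extend to a multiple of 16, then crop" step can be sketched like this. The 512x512 starting size is a hypothetical example; the 81 and 149 pixel extensions are the ones mentioned above:

```python
def round_up(value: int, multiple: int = 16) -> int:
    """Smallest multiple of `multiple` that is >= value."""
    return -(-value // multiple) * multiple


def outpaint_plan(width, height, left=0, right=0, top=0, bottom=0, multiple=16):
    """Return the canvas size to generate at and the size to crop back to."""
    target = (width + left + right, height + top + bottom)
    canvas = (round_up(target[0], multiple), round_up(target[1], multiple))
    return {"generate": canvas, "crop_to": target}


# Hypothetical 512x512 source, extended 149 px on the left and 81 px on top.
print(outpaint_plan(512, 512, left=149, top=81))
```

Only the total canvas size is rounded, so the extension itself can be any number of pixels.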

ancient mauve
#

so I shouldnt worry with comfyUI then

#

im using the default template for fluxfill

#

I just put what I want in the pad image node

abstract quarry
#

I think the default templates are really bad

fervent thunder
#

everyone got this sequence stuck in their head now lol 🫠
64, 128, 192, 256, 320, 384, 448, 512, 576, 640, 704, 768, 832, 896, 960, 1024, 1088, 1152, 1216, 1280

abstract quarry
#

cause they don't preserve the original pixels

ancient mauve
#

I have a rectangular image and I just want to turn it into a square

#

I just want to outpaint not many pixels really, closest is 128 extra pixels

abstract quarry
#

you want to copy only the changed part of the generated image back onto the original image.
But you can also keep the current template and check how the quality is first

ancient mauve
#

ah ok padding doesnt let you choose any outpainting anyways

#

I can choose 64 or 72 but no in between

ancient mauve
abstract quarry
#

you encode your input image with the vae, then change the edge of the image, then decode it back through the vae. The vae is a compressor. Think of it like you convert a png image into jpg and then back to png. It will lose quality

ancient mauve
abstract quarry
#

it's not so severe with Flux as flux vae is using less compression than sd 1 and xl

abstract quarry
#

that would prevent that your original image loses quality
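
The idea of keeping the original pixels can be sketched as a masked paste: take generated pixels only where the mask marks the outpainted region and original pixels everywhere else. This is a plain-Python illustration on nested lists; a real workflow would do the same on image arrays, and `composite` is a hypothetical helper:

```python
def composite(original, generated, mask):
    """Per pixel: generated value where mask is truthy, original value otherwise."""
    if not (len(original) == len(generated) == len(mask)):
        raise ValueError("all three inputs need the same number of rows")
    return [
        [g if m else o for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]


# 2x3 toy image: the last column is the newly outpainted strip.
original = [[1, 1, 0], [1, 1, 0]]   # padded source, 0 = empty border
generated = [[7, 7, 9], [7, 7, 9]]  # model output after the VAE round trip
mask = [[0, 0, 1], [0, 0, 1]]       # 1 = region to take from the model

print(composite(original, generated, mask))
```

The untouched region never passes through the VAE in the final result, so it keeps its original quality.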

ancient mauve
#

changing my image to a square then using that square as training data isnt good

#

but I just did a generation and visually, it looks ok

abstract quarry
#

you could also just train on a model like flux that natively supports non-square images 😅

#

or sdxl

ancient mauve
#

oh well, I wanted to try first with sd1.5, then the other ones and compare results

fervent thunder
#

if people trained on the latest models they would have an easier time

ancient mauve
fervent thunder
#

its the opposite to people's intuition

#

people think training big new model would be harder but its easier

abstract quarry
#

I mean, Flux trains very differently from SDXL, so it might be good to try both and decide

#

but usually flux gives you the best results and takes the most time

fervent thunder
#

with lion I saw someone get ok result in 70 steps

#

did require lion though

#

bit of a messy optim

ancient mauve
fervent thunder
#

bigger = better almost always

abstract quarry
#

lion is weird 😬

ancient mauve
#

I thought training with flux was harder but if you say otherwise

ancient mauve
#

can you train flux in comfy UI or whats the way to go nowadays?

#

flux isnt a SD model so I suppose its different in some ways

abstract quarry
#

flux is by the same developers as SD

ancient mauve
#

because if I dont have to waste that much time setting up the dataset...

abstract quarry
#

it's just not called SD because the devs left the company

ancient mauve
fervent thunder
#

I mean at this point their new company is a stronger brand so
it is swings and roundabouts 😄

abstract quarry
#

there are so many training tools

fervent thunder
#

the main threat to any western AI firm is the Chinese firms anyway

abstract quarry
#

kohya, onetrainer, simpletuner, aitoolkit

fervent thunder
#

the Chinese firms are releasing very large models with full apache/mit licenses
I actually don't know how western AI startups can compete with that

#

I am not sure they can compete, purely on the model front

#

so they will have to pivot

#

to more service-based model or something

ancient mauve
abstract quarry
#

they can just build on top of those models. I don't think open source is a threat at all.

ancient mauve
#

if quality has increased that much im excited

abstract quarry
#

they all usually use the same input more or less. Only configuration is different

ancient mauve
fervent thunder
#

is the issue

#

I stay out of AI investing cos of this sort of reason
I can't see where the moats are

abstract quarry
ancient mauve
#

@abstract quarry do you have any good guides for training the flux model?

fervent thunder
#

I just think its a way smaller market

#

than for example 2-ish years or so ago

#

when firms like midjourney had monopolies

abstract quarry
#

I think most tools have guides or default settings

#

I remember Simpletuner and Aitoolkit have default settings for Flux.

ancient mauve
abstract quarry
#

but Simpletuner might be difficult on Windows

fervent thunder
#

I feel like I'd love to use simpletuner but the install is not made easy

#

compared to others where its a container or an API endpoint

abstract quarry
fervent thunder
#

maybe its skill issue on my part

#

I only skimmed the docs but they looked quite manual

#

I mostly look for either cloud endpoints or containers I can quickly turn into a cloud endpoint

abstract quarry
#

I tried kohya, simpletuner and Aitoolkit. I found them all quite similar

fervent thunder
#

is mostly that for some I found containers

#

funnily enough there is a cog for simpletuner but its an old version

haughty smelt
#

Hello air fryer people

fervent thunder
#

hello

fervent thunder
ancient mauve
#

btw how are prompts with flux, is it better to have words separated by commas or a long continuous description?

atomic mortar
#

Long

fervent thunder
#

long ye

fervent thunder
#

hey i need someone who is really good on image generation

fervent thunder
atomic mortar
#

if its a question someone can answer they will but fishing to get the real question out is what i do at my job enough already lol

pine path
ancient mauve
#

does it mean training?

floral umbra
#

Is it possible to "convert" a checkpoint to a lower parameter count? As gguf is for quantized models, but iirc gguf needs its own nodes, and I would wanna use a 7/8b parameter wan.safetensors for instance

haughty smelt
#

Why does everyone here use perfect grammar as if they are at work chatting on teams or smth?

#

Like... I do not care if you talk normally.

steel prawn
#

😂

primal current
#

I need to cover my nipples with AI InPaint, but no workflow works for me. Does anyone have a solution? I pay for the service.

prisma owl
#

I have a question

#

I was reading on CivitAI and they said they are updating their policy

serene mountain
#

Yeah. Fun stuff.

prisma owl
#

Is it for pictures or does it also mean they cant upload models/loras on there surrounding those things

upper plinth
#

policies policies policies censorship censorship censorship

#

Didnt this orange mf say he was going to end these invasive restrictions?

abstract quarry
#

lol, reading this you could think that civitai bans nsfw content, but no, they only ban very extreme and specific stuff

#

and now people cry cause they can no longer generate porn with women having period

upper plinth
#

lmfaoooo

#

yeah there is a line for sure, but the overcorrections are insane

#

nothing like waiting 15+ minutes for a single video gen in Sora just to be told that it can't be shown because of a mysterious "policy violation"

abstract quarry
#

ChatGPT might be extreme in its censorship

#

but I'm also annoyed that the only big image gen website is a porn site basically

upper plinth
#

bruh. welcome to reality. Goonality should I say

#

AI is 98% gooner stuff, 2% productive stuff

#

I wouldn't have it any other way.

#

I'm getting error messages on Sora that they've hit capacity since everyone and their grandma is trying out the new models and based on the gens I've seen I bet a large chunk of that overcapacity comes from gooners like me trying to bypass their ridiculous censorship through trial and error

abstract quarry
#

any art platform is full of nsfw. Looking through DeviantArt means looking through naked bodies.
The difference is: DeviantArt is aesthetic. It's art.
Civitai is just pervert porn.

#

like when you go on a porn site you want to stay in certain categories. It's so annoying seeing an ad popup of, say, granny porn 😬 similarly, I'm sure straight people don't want to see gay fetish porn.
But on civitai all these weird fetish stuff is just thrown onto you. You could open a model "world morph into glass" and half of the showcase images are masturbating women with unnatural large breasts. It's just disgusting and it's difficult to get rid of it. You have to disable all mature content but even then you still see a lot of fetish stuff

upper plinth
#

That's a lie. CivitAI doesn't show anything NSFW unless you turn that on in the settings

#

Now if your concern is that weird porn is getting mixed in with traditional porn, well, welcome to the golden age of depravity circa 2025. As the world decays, people get lonelier > people get into weirder and more depraved shit which is then normalized. Idiocracy/cyberpunk dystopia in full motion.

abstract quarry
#

even in sfw mode you get a lot of weird fetish stuff that is just not "nude enough" to be counted as nsfw

steel prawn
#

Sad but true. I think we can all agree that the changes to remove minors in images as well as implications of SA or forced sexual situations is probably for the best, and probably the removal of celebrities. Art is meant to be subjective, and if you see something you dont like you shouldnt click on it, and you're free to have your opinion of it. But that does not mean the artist is in the wrong for creating it. People think the works of certain surrealist and horror artists are over the top or distasteful because the imagery doesnt agree with them, but that doesnt mean it isnt art.

We live in a day and age where increasingly sex and porn are being normalized, even the weirder fetishes, and that of course means its gonna bleed into the artistic side of things. Case in point, danbooru is full of it and a lot of that isnt just AI art. You just gotta accept that that is the space now, and take the tools you need to make what you want and go about your business. Besides, a lot of this is just Civit covering their asses before a lawsuit happens.

upper plinth
#

Oh all big companies are covering their asses, but the bias is absolutely asinine

#

Go to Sora's main page and you're going to find loads of Donald Trump or Putin turning into poop parodies

#

yet the moment I try to even remotely animate my Taylor Swift dark magician girl - policy violation. Of course.

paper gazelle
#

hello everyone!

#

hows everyone doing today?

median jewel
#

does someone know if its possible to use fooocus from code in visual studio? I'm trying to use the inpaint, lora, prompt etc features but access them through code. Is that possible or do i need to use the web interface they have for that? Cause i have tried using simple SDXL code with lora and masks but it doesnt get nearly the same good result as fooocus does

night gladeBOT
fervent thunder
#

you can use comfyscript for comfyui

#

otherwise pure pytorch etc

#

forge api maybe

#

comfyscript with custom comfyui nodes or pure pytorch have much nicer syntax and modularity than diffusers

#

but diffusers is more stable

#

so it depends

#

I am part switching to rust but I don't "recommend"

round dove
#

Hey, trader.
If you are also facing issues from your challenge account passing or making profits on your live account on any of your chosen trading platforms on this prop firm. I'd like to tell you what my research brought for me that makes me to always take enough profits per day on my live account.
Msg me if you are interested

nova glade
#

Hey, I wanted to ask if there's any rules for making a new post on r/StableDiffusion, I don't use reddit much and maybe my account does not have enough karma. I can't see my new post appearing, maybe it is pending moderation.

I had made a tool to easily archive civitai content so was hoping to share that with the community, https://github.com/dreamfast/go-civitai-downloader

nova glade
fervent thunder
#

wow thanks so much for this

nova glade
#

sad I can't get all the models, i only have so much space, but i got all the loras i wanted for video

uncut venture
#

Whats the best setup for amd users? Just running comfyui straight up or is there any good programs that package other useful things along with it?

nova glade
#

cool i just added torrent stuff so u can generate torrent files based on what you downloaded, sad i can't share it with r/StableDiffusion 😦

nova glade
oblique agate
upper plinth
#

probably midjourney and if they dont have their own animation AI then use sora

rose drift
#

Hi, can anybody help me?

oblique agate
left hatch
#

Hi @robust otter

median jewel
# fervent thunder you can use comfyscript for comfyui

Just so I understand, if I wanted to use lora and image prompt can I run a simple python program that uses fooocus app without me going into either comfyui and manually adding photos I can make a code that runs and uses the api instead? Will this cost money even though I run it locally?

robust otter
nimble light
#

How do I use Stable Diffusion or other AI generation tools like Flux the way I'd use Photoshop's generation tool? Krita is software that lets me do that: like in Photoshop, I mask out an area, for example a lake, and tell it to add boats. Pinokio just makes images from scratch, but I want to modify certain parts of images locally using my GPU

#

Also is AMD RX 570 8 GB Enough

#

So I opened a ticket, now what?

still glacier
nimble light
#

YEAH

still glacier
#

who....

nimble light
#

It's MOD SAM

still glacier
#

99.9999% chances of it yes.

nimble light
#

Yes what? Also can't you just tell me

fervent thunder
#

you still pay for electricity locally

ancient mauve
#

at least Flux.1 Dev

nimble light
ancient mauve
#

do you guys recommend a 1024x1024 dataset for flux1Dev?

#

I want to set up a new dataset but I want confirmation if possible

#

I want to do good quality but im not used to flux

fervent thunder
#

I used it a bit over a year ago

#

can't remember

abstract quarry
ancient mauve
abstract quarry
#

no

ancient mauve
abstract quarry
#

yes

#

you have the usual "multiple of 16" rule, but the training tools will just crop your images to a multiple of 16
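
A minimal sketch of the "multiple of 16" rule, assuming a simple center crop. The function name `crop_box_to_multiple` is made up for illustration, and real training tools may crop or resize differently:

```python
def crop_box_to_multiple(width: int, height: int, multiple: int = 16) -> tuple[int, int, int, int]:
    """Return a centered (left, top, right, bottom) box whose sides are multiples of `multiple`."""
    if width < multiple or height < multiple:
        raise ValueError("image is smaller than one multiple in at least one dimension")
    # Round each side down to the nearest multiple.
    new_w = (width // multiple) * multiple
    new_h = (height // multiple) * multiple
    # Center the crop so the same amount is trimmed from both edges (off by one at most).
    left = (width - new_w) // 2
    top = (height - new_h) // 2
    return left, top, left + new_w, top + new_h


if __name__ == "__main__":
    print(crop_box_to_multiple(1000, 750))
```

Cropping yourself before training with a box like this is one way to control what ends up in the image instead of leaving it to the tool.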

fervent thunder
#

hmm

ancient mauve
#

I want to control what goes in in the end

fervent thunder
#

if you do a big fine tune without the resolutions spread nicely in the training data
flux will lose its ability to do multi resolution

#

but for small lora it is okay, that is probably what they mean

ancient mauve
#

how many images are we talking here lora vs a full finetune

#

wanna try both

abstract quarry
#

you cannot do full finetune flux with 24gb vram

ancient mauve
#

how much do I need for a finetune and how much for a lora with flux

#

both nº of images and vram

fervent thunder
#

if you blockswap you can do it

#

you could do 1 img if you want

ancient mauve
fervent thunder
#

its where you move blocks back and forth

#

from motherboard DRAM to graphics card VRAM
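
A rough sketch of the block swap idea, as a pure-Python simulation rather than real GPU code. The block count, block sizes, and the function name `peak_resident_gb` are hypothetical; the point is only that peak VRAM depends on how many blocks are resident at once, not on the total model size:

```python
def peak_resident_gb(block_sizes_gb: list[float], blocks_on_gpu: int) -> float:
    """Simulate one forward pass where at most `blocks_on_gpu` blocks sit in VRAM at once."""
    if blocks_on_gpu < 1:
        raise ValueError("need room for at least one block in VRAM")
    resident: list[int] = []  # indices of blocks currently in VRAM
    peak = 0.0
    for i in range(len(block_sizes_gb)):
        if len(resident) == blocks_on_gpu:
            resident.pop(0)  # move the oldest block back to system RAM
        resident.append(i)   # copy block i from system RAM into VRAM
        peak = max(peak, sum(block_sizes_gb[j] for j in resident))
        # block i would run its computation here
    return peak


if __name__ == "__main__":
    blocks = [0.5] * 40  # hypothetical model: 40 blocks of 0.5 GB each
    print("all blocks resident:", peak_resident_gb(blocks, 40), "GB")
    print("4 blocks resident:  ", peak_resident_gb(blocks, 4), "GB")
```

The trade-off is speed: every swap is a transfer between system RAM and VRAM, so fewer resident blocks means lower memory use but more time spent copying.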

obsidian plume
#

Have you ever found a way to convert a Disco diffusion CLIP model into a diffuser or .ckpt file for use in something like Deforum?

fervent thunder
#

would be easier to make a fresh code base than go back to the old stuff rly

iron swallow
fervent thunder
obsidian plume
#

there are colab codes that function still, but would be so nice to have one to save locally and to not have to use those extremely heavy servers

fervent thunder
#

ye it would be cool

granite river
#

Hello everyone, I'm new here and excited about what we can create together 🙂

merry ginkgo
#

anyone know of a civitai alternative

#

since the site is dying

sage reef
#

@woven panther just a question about your Phantom Wan implementation.
I noticed that the way it embeds the subject images, it seems to
embed them the same size and that size is then used for the
video generation size. but is there a way to decouple this?

like let's say I want to generate a 768 x 512 video, but..
the subject images can be either same or different sizes
from that, like 480x480 for image 1 and 600 x 400 for image 2.

also, is the 3rd and 4th embedding working? cause it doesnt seem
to be copying them correctly, maybe because 1.3B model is too small
for more than 2 subjects?

woven panther
heady pivot
#

Hi

fallen axle
nova glade
#

hey sd pals, i did some big updates for this https://github.com/dreamfast/go-civitai-downloader so now it's very easy to download many models or loras, also images from civit ai. After the models or loras are done downloading you can generate a torrent file and magnet link too. I am hoping this will help preserve some of the content that is doomed for oblivion.

steel prawn
placid hatch
#

anyone have an issue with models where the face of a character in a generated image will suddenly be in a completely different style than the rest of the image?

ancient mauve
#

I don't want flux choosing what it cuts

upper plinth
#

Bruh what was the point of electing Trump if the internet is going to keep snowflakizing?

steel prawn
#

Like i said before, seems like Civitai is covering their asses, and in the grand scheme of things, its probably better for the AI Art movement/scene/whatever you wanna call it if its not being viewed as a place to create pornographic material that even porn studios wouldnt film (hence the removal of certain things that could be, at least in a court of law, skewed to implicate such things as SA or pedophilia). But, they're a business, and all businesses shake and move when their investors say so, so its no surprise.

placid hatch
#

Unfortunately it is still part of a broader bipartisan assault on adult art and adult artists that has been happening over the last decade.

upper plinth
#

bro, what is the point of AI if it's not to create erotica?

#

AI is and always has been about goonerism, in fact, sex robots is arguably the end goal of all this. Who the hell wants to deal with real women with all their flaws when we can have our own ideal bot partners?

#

the fact that people keep trying to pretend that AI is completely exclusive from porn is ridiculous. Just admit that the two go hand in hand, there's nothing wrong about that despite what the loud blue-haired karens on twitter are shouting

fervent thunder
#

at least then you know what it is doing

merry ginkgo
merry ginkgo
ancient mauve
fervent thunder
#

its actually harder to use kohya in some ways cos documentation is not thorough

#

I recommend simple tuner if you are gonna use a pre-made one they have a thing called lokr

#

lokr is separate its part of a project called lycoris, but it is integrated well into simple tuner

atomic mortar
#

Scam, dont click

ancient mauve
#

Ok so for what I'm reading, one preprocessing flux does is bucketing

#

You select a size and it makes groups based on that size in multiples of 64

#

So my database can have images of 256x256 if I select that, but it can also have for example a 320x320 image in the dataset

#

Or 384x384, etc

#

using 1024x1024 as reference, it seems that as long as an image has the same size as any of those buckets, or the same aspect ratio, it's all ok

#

if anyone can confirm pls
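
A generic sketch of aspect-ratio bucketing as described above, assuming sides in multiples of 64 and an area budget of 1024x1024. This is not the actual implementation of Flux or of any specific trainer; `make_buckets`, `pick_bucket`, and the side limits are hypothetical:

```python
def make_buckets(target: int = 1024, step: int = 64,
                 min_side: int = 512, max_side: int = 2048) -> list[tuple[int, int]]:
    """List (width, height) buckets with sides in multiples of `step` and area <= target * target."""
    buckets = set()
    width = min_side
    while width <= max_side:
        # Largest height (a multiple of step) that keeps the area within the budget.
        height = min(max_side, (target * target // width) // step * step)
        if height >= min_side:
            buckets.add((width, height))
            buckets.add((height, width))  # same bucket in the other orientation
        width += step
    return sorted(buckets)


def pick_bucket(width: int, height: int, buckets: list[tuple[int, int]]) -> tuple[int, int]:
    """Pick the bucket whose aspect ratio is closest to the image's aspect ratio."""
    ratio = width / height
    return min(buckets, key=lambda b: abs(b[0] / b[1] - ratio))


if __name__ == "__main__":
    buckets = make_buckets()
    for size in [(1000, 1000), (1920, 1080), (600, 900)]:
        print(size, "->", pick_bucket(*size, buckets))
```

An image is then resized and cropped to its bucket, so images that already match a bucket size or aspect ratio lose the least.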

fervent thunder
#

where did you read this?

#

could you quote it pls?

ancient mauve
#

I mean it gives you a lot of options

ancient mauve
#

what do you think

fervent thunder
#

hmm need info from a proper source like a paper
or quotes from the company rly

silk latch
#

How to get invoice from stability?

atomic mortar
silk latch
#

i tried but they didn't answer

upbeat fjord
#

hello

abstract quarry
#

with batch size = 1 you don't need buckets (or every unique resolution can be just its own bucket)

abstract quarry
ancient mauve
#

by the moment im preparing my dataset with that chart I linked

#

should be enough for something like 10-50 images

abstract quarry
#

"Avoid Ambiguous Images and Distracting Elements: Avoid having too many images that mix styles, characters, or concepts. For example, if you are training a character, don’t use an image that shows that character in a group of other characters." <-- this is bullshit

ancient mauve
abstract quarry
#

it's the opposite: if you train character loras, you definitely should add images with multiple characters. It's sufficient to put just multi-panel images with different characters in there. Without that, your Lora will transform every face into your character

ancient mauve
abstract quarry
ancient mauve
#

so if you have many images of one person you want him in different scenarios

ancient mauve
abstract quarry
#

yes, but also add him with other characters

#

the model has to learn that "NAME" refers to this specific character, not to other characters

ancient mauve
#

or style

abstract quarry
#

hm, dunno what you mean with that

ancient mauve
#

so if you want to learn a character like dunno, john wick, and you tag it as "john" if the AI already knows other johns it gets confused

#

same with concepts and such

abstract quarry
#

a common name is not so good, in particular if it is already loaded with a meaning

ancient mauve
#

you get my idea

#

the thing is to not mix some tags, depending on what you want

abstract quarry
#

like "John" is a very American/British name, so using it for a Asian guy might be not so good

abstract quarry
#

I would use natural names, though

#

like when I train on my own face I always use my real name (first name+ last name)

#

(funnily, my first name is Kai, which is a common German name, but the model associates it with Japanese and in the beginning often mixes in Asian elements)

#

(so I trained my first loras with the name Christian instead, which sounds more Caucasian. However, it doesn't really matter. The model also learns my real name after a while)

#

Many guides use random characters as names instead. I wouldn't do that, cause T5 understands the concept of a name and might get confused by random characters. But in the end both will work nevertheless

ancient mauve
#

the AI doesnt know your true name, it only cares on how you look

#

to avoid things like the asian thingy

abstract quarry
#

yes, but if you use first name+last name you are usually fine

ancient mauve
#

anyways were you able to see anything else in that civitai tutorial

#

I dont really like civitai that much but it is popular

abstract quarry
#

I think the rest is okay

ancient mauve
#

im just making a database of like 9x7 images

#

and squares

abstract quarry
#

style or character training?

ancient mauve
#

will try with a lora for a character I think

#

by the moment

#

I want to do both but mabe character is easier and needs less images

#

for what I read

abstract quarry
#

yeah, keep it simple. You can train on hundreds of images, but you can also train on just 10 images

#

it's not always clear what's better

#

(I mean, more is better. But quality> quantity)

ancient mauve
#

how many images would you say for a character and for a style each?

abstract quarry
#

also I would not use gradient accumulation. Takes too much time. You can use batch size if you can afford the vram. Training on batch size 1 also works, though

#

as said, more is better, but you can often train on surprisingly low number of images. The guide you posted is right with saying you should rather pick 10 highest quality images than using 50 low quality ones

ancient mauve
#

and some "default" settings I can edit in future generations

#

I just not want to go like a headless chicken

abstract quarry
#

dunno, I think 20 is a good number

ancient mauve
#

at this point im comfortable copy pasting what you use

#

you seem to know your stuff

abstract quarry
#

no, haven't used it so far

#

feel free to paste the config

ancient mauve
#

I managed to have my dataset in 3 ratios

#

for square, horizontal and vertical

ancient mauve
#

oh fuck I have to tag my dataset first 😮‍💨

abstract quarry
#

you have 24gb vram? You might use gemma 3 for assisting you with creating the captions

#

but for 20 images you can do it yourself

#

for more it's quite helpful to automate this. A big advantage of using AI for creating the captions is that you can use multiple captioning strategies (tag based, natural language short captions, natural language long captions)

atomic mortar
#

i like to use civit ai's captioning system tbh

#

upload pics, download em after tagging

ancient mauve
#

with flux is it better to use long descriptions or single words separated by commas

abstract quarry
#

gemma is the strongest, though. It has a really deep understanding and you can teach it any captioning style

abstract quarry
ancient mauve
abstract quarry
#

short words have the disadvantage that your trigger words lose their effect in long prompts

ancient mauve
#

any limit for both? number of tags or size of paragraph for the other one

ancient mauve
abstract quarry
ancient mauve
#

what do you mean by trigger word exactly

abstract quarry
#

trigger word is also the character name

ancient mauve
abstract quarry
#

?

ancient mauve
#

only what its extra

abstract quarry
#

you always add a name

ancient mauve
#

like, if there are 2 characters and you only want 1, you only describe the one you dont want

abstract quarry
#

the idea is that you don't describe what is implicitly defined by the name

#

so if you train on, say, Son Goku, you don't describe that he has black hair and is muscular, cause this is implicitly clear

ancient mauve
#

but you "have" to tag the word "son Goku"

#

to define those

abstract quarry
#

you add "Son Goku" to the prompt, yes

ancient mauve
#

ah ok

abstract quarry
#

and if there are multiple characters, you write "An image with two characters. Left is Son Goku. The character on the right is a man with pink hair and a muscular body."

ancient mauve
#

(I suppose left by my POV not the image's)

abstract quarry
#

"Son Goku and another man" ?

#

Tags are just text, too. There is nothing special with them

ancient mauve
#

no I mean, you can put tags like Son Goku and another man fighting

abstract quarry
#

yes

ancient mauve
#

or you can put Goku, man, fighting, muscle

abstract quarry
#

I would definitely use the upper one

ancient mauve
#

ok, any kind of limit for what goes into the prompt

#

how many words or how big or total descriptions should have

abstract quarry
#

as said, I would try both: short and precise prompts as well as long and detailed prompts

#

that's also how you want to prompt in the end

ancient mauve
#

and what was that program that helped you tag

ancient mauve
abstract quarry
#

You can use multimodal llms nowadays

ancient mauve
abstract quarry
#

I use gemma

ancient mauve
#

I remember using wd14 but I think thats only for anime

abstract quarry
#

cause you can run it locally

#

you could also use ChatGPT if you have a subscription, though

ancient mauve
#

but you mean a local llm model

abstract quarry
#

yes

ancient mauve
#

there must be some already made exclusively for flux

#

for tagging

abstract quarry
#

you can download gemma 3 4bit quant and run it in your local machine

abstract quarry
ancient mauve
abstract quarry
#

"I show you an image of a character named Son Goku. Please answer with a prompt that describes this image. The prompt should be short and precise (10-30 words) and include the name Son Goku. Do not describe Son Goku's appearance, but describe what he is doing in the image. Describe also the background. Answer only with the prompt."

#

something like this

#

the cool thing on llms is that they really understand what you want. If you are not happy, you can add more information into the prompt

#

you could even tell the llm that you want prompts for Flux.
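
A small sketch of how the captioning request above could be templated for the three strategies (short, long, tags). Only the "short" wording comes from the message above; the "long" and "tags" wordings and the name `build_caption_request` are assumptions for illustration. Sending the text plus the image to a local model is left out, since that depends on the runner you use:

```python
# Style instructions; only "short" is taken from the chat, the other two are assumed wording.
STYLES = {
    "short": "The prompt should be short and precise (10-30 words)",
    "long": "The prompt should be long and detailed",
    "tags": "The prompt should be a comma-separated list of tags",
}


def build_caption_request(name: str, style: str = "short") -> str:
    """Build the text instruction to send together with one training image."""
    if style not in STYLES:
        raise ValueError(f"unknown style: {style!r}")
    return (
        f"I show you an image of a character named {name}. "
        "Please answer with a prompt that describes this image. "
        f"{STYLES[style]} and include the name {name}. "
        f"Do not describe {name}'s appearance, but describe what the character is doing in the image. "
        "Describe also the background. Answer only with the prompt."
    )


if __name__ == "__main__":
    for style in STYLES:
        print(f"[{style}] {build_caption_request('Son Goku', style)}\n")
```

Running each image through all three styles gives several captions per image, which matches the idea of training on both short and long prompts.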

ancient mauve
#

The prompt should be short and precise (10-30 words)

#

this is what I wanted to know more or less

ancient mauve
#

local llm

abstract quarry
#

gemma 3

abstract quarry
#

short prompts, long prompts, tag based prompts

ancient mauve
#

google censors the crap of their products

abstract quarry
#

not true

ancient mauve
#

uh

abstract quarry
#

they never do

#

censorship only happens during alignment step at the end of training. In its core none of the models is censored

ancient mauve
fallow veldt
#

I'm amazed how good Sora is... it seems to get everything to ask in the prompt in the correct style with no confusion

fallow veldt
#

As I said once, I thought it would be continuous development and optimization for generating locally but it seems that what we got is what it is

#

From SD 1.5 to SD XL was wow

echo cobalt
#

yo guys

#

whatsup

snow cedar
#

hey

#

I need help

quiet bison
#

Hi

#

Hi

fervent thunder
#

not rly into video

#

I think most people were super excited for video and many switched over right away
but I still prefer image

#

its cos I started out in upscaling hobby first

nimble light
#

Can someone help me with a question

echo cobalt
nimble light
# echo cobalt what question

What is the best host software like Krita AI that allows me to use models like Stable Diffusion or Flux to generate images in masked areas, like Photoshop? Instead of generating an image from scratch, I want to modify specific parts, e.g. add boats to a part of the river

echo cobalt
nimble light
#

Like Photoshop? Mask select an area and generate or mod?

echo cobalt
nimble light
#

What do you mean by pretty much? "Pretty much" means there are some caveats

echo cobalt
#

If you want the most Photoshop-like experience but free: InvokeAI

#

If you're okay with a bit of complexity for ultimate power go with ComfyUI

#

thats all

nimble light
#

Or the git hub one?

echo cobalt
#

github one

oblique elk
atomic mortar
#

Bit more user friendly and you have access to comfyUI as a backend

main snow
#

Swarm is apparently good for Flux too so there's that

#

You get to use the miiiiiracle checkpoint type lol

odd patio
#

Can someone give me an invite to comfy org discord?

#

when I click on it it shows I am not logged in, when I log in the tab forgets I asked to join that room

#

Cant find it in discover search

ancient mauve
#

what prompt do you recommend for tagging images for a flux training in gemma3? llm its not giving me good descriptions

#

it does the usual yapping these models do which I dont really need

#

btw having a local llm rocks

abstract quarry
#

dude, you can just ask Gemma for a good system prompt lol

#

My prompt: "Write me a good system prompt for an image captioning model. I want to generate image captions for training/finetuning a Flux diffusion model for image generation. Write me a system prompt for such a captioning llm."

#

gemma gave me a good system prompt. I then added:
"This is great. Modify the system prompt such that the model will always output two different captions: one short which only highlights the most important aspects of the image and one detailed. Also, if I show it an image and write "this image shows [SOME NAME] I want the captioning model to use [SOME NAME] in its description and do not describe the main subject of the image in details (as these details are already implicitly defined by its name). Do you understand that? Write me a system prompt!"

#

What came out was:

#

as you can see on my prompts: you don't need good prompts. Just write anything and ask Gemma to make a good prompt out of that

#

then use this prompt as system prompt for your image captioning stuff

steel prawn
#

I just slap this into any llm im using at the time if i get creative block:


You know the secrets of the lost art of prompting gorgeous anime wallpapers, at 16:9 and 2560x1440 resolution. You also have extreme proficiency in character profile shots in a 9:16 aspect ratio, at 1440x2560 resolution. Some have said your creativity knows no bounds, and they are right.

You're also extremely proficient with all of the extensions and tools available on Automatic1111 with Stable Diffusion to enhance images, especially controlnet and regional prompting. And when necessary you will suggest using these tools, as well as providing a mock up open pose skeleton or depth image for controlnet.

I am your human counterpart, the one who enters the prompts to bring your forbidden knowledge and majestic works of art to the masses. Any prompt you give me, no matter how ridiculous, will be entered. And if additional tools are needed to achieve your glory, you will tell me and structure the prompt as it should be entered with those tools in mind.

With all of this in mind, your only job today is to provide me with prompts for stable diffusion anime art of the highest caliber. After each prompt you will ask me to submit the image generated, and then suggest no less than 3 options for our next prompt for me to choose from. Each prompt will be detailed, exquisite, and balanced so as to showcase the character and the scene in its proper glory. Once i pick a prompt option you will generate me the prompt you have in mind, and the cycle will repeat. The world will know the name and our brand by the time we are done.
abstract quarry
#

haha, that's a good one

steel prawn
#

Just input whatever image generator your using in place of A1111 (ive upgraded to forge for the time being myself) and run wild with this. It'll spit out pretty good stuff and you can steer it with your selections, upload the outputs to critique, and use it to build consistent styles for lora training etc if you want.

#

Doesnt fix them being chatty though.

#

They love their emojis

lofty scarab
#

I don’t know if someone could help me. I can’t do checkpoint merges anymore. I used to pretty often and now it always ends up in an error. With A1111, forge UI, comfy UI, none of them work. I’m on windows 11 24h2, 12900ks, RTX 4090. I’m on the latest driver 576.02. Is it a problem with the gpu driver? It used to work but now it doesn’t anymore. Has someone got a clue?

steel prawn
#

lol the damn scammers tryin to get crafty

#

ask in tech support Dude, they might be able to help you

sage reef
floral jay
#

my favourite thing forge has over a1111 must be how the interrupt and skip button actually works.

fair shadow
#

Hello friends, I’m using Automatic1111 and I want to create a consistent character, but I don’t know how to do it. I looked online, but no one has explained it thoroughly. Can you help me with this?

obsidian plume
#

hey'

#

hey! still in super need of a way to make clip models become .ckpt, if thats even possible?

#

from .pt to .ckpt

abstract quarry
#

rename it? 😂

#

these endings do not have a meaning. Usually, they are pickled dictionaries or models.

obsidian plume
abstract quarry
#

no. What I want to say is: there is no "checkpoint format" or "pt format".

#

even safetensors, although its own format, is not "standardized"

#

so your question has to be: I have file X downloaded from source Y and want to use it in tool Z.
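
To see what a downloaded file really is regardless of its extension, a sketch like this can peek at the first bytes. It assumes the usual layouts: a safetensors file starts with an 8-byte little-endian header length followed by a JSON header, newer `torch.save` output is a zip archive, and older output is a raw pickle stream. The function name `sniff_weight_file` is made up, and the checks are heuristics only:

```python
import json
import struct


def sniff_weight_file(path: str) -> str:
    """Guess the container of a weights file from its leading bytes, ignoring the extension."""
    with open(path, "rb") as f:
        head = f.read(8)
        if head[:4] == b"PK\x03\x04":
            return "zip"  # zip archive, the container newer torch.save uses
        if len(head) == 8:
            (header_len,) = struct.unpack("<Q", head)
            if 0 < header_len < 100_000_000:
                try:
                    # safetensors: the length prefix is followed by a JSON object
                    if isinstance(json.loads(f.read(header_len)), dict):
                        return "safetensors"
                except ValueError:
                    pass  # not JSON, fall through to the other checks
        if head[:1] == b"\x80":
            return "pickle"  # raw pickle stream (protocol 2 or later)
    return "unknown"


if __name__ == "__main__":
    import sys
    for p in sys.argv[1:]:
        print(p, "->", sniff_weight_file(p))
```

Knowing the container is only the first step; what keys the file holds and whether tool Z expects those keys is a separate question.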

obsidian plume
serene mountain
#

I have a 3070 8GB presently on Forge Web UI…considering upgrading to a 16gb. Had some feedback in another discord that 16gb is already not enough. Budget is tight but if Im gonna upgrade to do quality image gen whats the minimum i should be looking at without going overboard (i know in gaming there are diminishing returns).

Im doing this recreationally but once I get proficient I want to incorporate it into my business model.

#

Are there key specs on the cards I should be looking at? Or is the raw amount of ram the most important thing.

still glacier
#

amount of vram dictates which model you can load on your gpu at once / without having to chop it in pieces and load it bit by bit during the generation process ( usually done automatically by whatever program you ll use )
Having the model loaded fully will avoid the costly / long loading and unloading of data to your gpu.
With that said, if you have enough vram to load what you want, then yes the gpu speed / architecture itself will become your main concern regarding speed. Newer gpu will go faster ( assuming there is enough vram to load everything at once )

Now.... Is 16gb enough. Yes for image generation definitely. For video generation meeeh, video generation is still in its infancy, so it s hard to tell. You ll have enough to do stuff for sure. But will it be """""""futureproof"""""""" is hard to tell. Even 8gb is enough for video generation if you use some tricks.

fervent thunder
#

and then you can open up a model that is in the format you want

#

and have a think about what you need to do to get it to be in that format

#

it's mostly just renaming stuff but sometimes there is more
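
A rough sketch of what "renaming stuff" can look like, with plain dicts standing in for loaded weights. The key names and the prefix mapping here are hypothetical; the real ones have to be read off by printing the keys of a file that is already in the target format:

```python
def rename_keys(state_dict, prefix_map):
    """Return a new dict with key prefixes rewritten according to prefix_map."""
    renamed = {}
    for key, value in state_dict.items():
        new_key = key
        for old_prefix, new_prefix in prefix_map.items():
            if key.startswith(old_prefix):
                # Swap the old prefix for the new one, keep the rest of the name.
                new_key = new_prefix + key[len(old_prefix):]
                break
        renamed[new_key] = value
    return renamed


# Hypothetical source keys (values would be tensors in a real file).
source = {
    "text_model.embeddings.weight": 1,
    "text_model.final_norm.bias": 2,
    "logit_scale": 3,
}

# Hypothetical mapping: the target tool expects an extra leading prefix.
prefix_map = {"text_model.": "cond_stage_model.transformer.text_model."}

converted = rename_keys(source, prefix_map)
for key in converted:
    print(key)
```

Keys that match no prefix are passed through unchanged. The "sometimes there is more" cases (reshaping, splitting or merging tensors) are not covered by this sketch.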

serene mountain
# still glacier ammount of vram dictates which model you can load on your gpu at once / without ...

It'll be a minute before I get into video, I'm still trying to learn everything about image. I'm getting there…

It seems like running SDXL models works OK on what I have now, so any upgrade would be an improvement, but the resounding answer in the other chat was to go cloud. I see the benefit, but I have privacy concerns there. I guess I just hate being tethered to a 3rd party.

It's tough because I'm starting to see the suggestions are all over the place

#

I know flux is pretty VRAM heavy

fervent thunder
#

you can fit flux in 8GB

#

int4 flux is 6.64 GB

#

or nf4

#

same size
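
A back-of-the-envelope way to sanity-check numbers like that: weight size is roughly parameter count times bits per parameter. The 12 billion parameter figure for Flux is an assumption in this sketch, and the helper name is made up:

```python
def weight_size_gb(num_params, bits_per_param):
    """Approximate size of the raw weights in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: the Flux transformer has roughly 12 billion parameters.
FLUX_PARAMS = 12e9

for name, bits in [("fp16", 16), ("fp8", 8), ("4-bit", 4)]:
    print(f"{name}: {weight_size_gb(FLUX_PARAMS, bits):.1f} GB")
```

The 4-bit estimate comes out a little under the 6.64 GB quoted above, presumably because quantized files also store scaling constants and keep some layers at higher precision. Text encoders, the VAE and activations need memory on top of this.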

still glacier
serene mountain
#

Budget: $600 ideal. I saw a few 5070 cards (16gb) in the 500-600 range. I can push to $1000 but that's about my ceiling.

I've had this card for a while, and it runs all my other games and software fine on high or max settings (I do graphics, photo and video professionally). So if I had something that worked well, I'd probably keep it until it melts or software just totally outpaces it.

abstract quarry
#

16GB is fine. Sure, more is better, but this is also true for 24gb. As soon as you have 24gb, you want even more vram. It never stops ;D

fervent thunder
#

$600 is the used RTX 3090 area

#

but there is risk in used cards

still glacier
#

keep in mind that RTX 5000 will get longer support than RTX 3000 too. (at least in theory, if Nvidia does not become a fully AI company by that time...)

fervent thunder
#

yeah that's true

#

it's tricky

still glacier
#

Personally, with that budget I'd go with the RTX 5070 because of the support, faster cores, DLSS 4, and because I don't care about video gen :p

fervent thunder
#

the gaming and rendering stuff like DLSS might be worth it, yeah

serene mountain
still glacier
#

saying it just in case. upgrading your gpu will NOT upgrade the quality of the outputs

#

it will just change the speed

fervent thunder
#

every GPU will be out of date at some point because ASICs are coming
but this might take a few years

#

ASIC (application-specific integrated circuit) just means "specialist chip"

still glacier
#

to be fair, I remember hearing about ASICs already available that can do inference for a fraction of the cost (but not training), but I don't think they're being sold to the public yet.

serene mountain
still glacier
#

sure, a LoRA will add to the VRAM cost, but not that much tbh. And the more LoRAs you shove in your prompt, the more they will usually fight each other, so it's not recommended to use many of them at once.

serene mountain
abstract quarry
#

make sure you run invoke in low vram settings

#

in general, the length of your prompt should not matter as long as it is below 500 tokens

#

and the number of loras shouldn't matter either

serene mountain
#

I thought I had that set up. It ran great for 15 generations, then all it would spit out was black squares.

still glacier
#

it's worth a try to reinstall and/or run ComfyUI before dropping hundreds of $ on a new GPU.

serene mountain
#

ChatGPT said the error in the log was from too many tokens, but it definitely wasn't 500.

still glacier
#

ChatGPT tells you what you want to hear.

#

If you can't / don't know how to verify what it says, I would not trust it blindly. Same goes for every LLM.

serene mountain
still glacier
#

An LLM can be a good tool to start your research. At the very least it will give you a few pointers, stuff to look into.