#💬|general-chat


manic wraith
#

Try with reForge! The GitHub is by "panchovix"

tawny drum
#

ty for the advice

sacred lake
#

does anybody have tutorial how to install stable diffusion on amd? (i have rx5700xt)

desert dagger
warm junco
#

that card can use sdxl

#

i used sdxl on a gtx1080

#

with auto1111, but forge and reforge are faster

tawny drum
#

I always get a memory error tho 🤔

#

Wish I understood this stuff more waaa

restive tusk
#

hello

fervent thunder
#

RTX2080 and especially the ti version aren't even that bad

#

I think you could run SD 1.5 on only like 50MB of VRAM if you used BK-SDM-Tiny and GGUF Q4

unborn hedge
#

Wan is pretty cool

cedar salmon
#

its Wantastic

errant monolith
#

Hello

covert trellis
#

ᴀʟʟ ᴛʜᴀᴛ'ʟʟ ʀᴇᴍᴀɪɴ, ᴡɪʟʟ ʙᴇ ᴀᴜᴛᴏɴᴏᴍᴏᴜꜱ
ʙᴇɪɴɢꜱ ᴅᴇᴄɪᴅᴇᴅ ɪɴ ᴏᴜʀ ɪᴍᴀɢᴇ

warm junco
autumn gyro
#

I appreciate the fast response! ❤️
Is there a youtube video that can perhaps elaborate more?
Don't want to annoy you with tons of questions!

sharp socket
sharp knot
#

General question: How do Stable Diffusion whizzes feel about the new OpenAI image generation in terms of quality? I never got too deep into customizing SD or using the likes of LoRAs. Is this just a case of commercial products catching up to what can be accomplished locally with a bit of know-how?

humble iris
#

what's the best adetailer I should use in Comfy for Illustrious (including realistic style)?

twilit forge
#

how to upscale shit like hires fix but in comfyui?

abstract quarry
tawny drum
#

Opinions are subjective 🤔

runic cargo
#

heyy everyone where can i make request? i cannot see any channel for it

#

hey?

abstract quarry
#

As usual there is no paper about it so we can just guess. It's very likely something different from sd and midjourney

#

they claim that their images are natively generated by chatgpt

#

something similar was already done by Meta, but results were not that good

#

also DeepSeek did something similar already

#

but so far the conclusion was that diffusion works better for images than auto regression

#

so either they heavily improved on that, or they used some kind of mixed architecture where auto regression builds up visual tokens that are decoded via diffusion

#

I could also imagine it's something similar to Omnigen but built on top of ChatGPT

#

that's two different things

#

yes, you can do similar stuff with control nets and loras and so on

#

actually it's very likely they trained their model on such tasks

#

the other question is if the technique behind it is completely new

#

if it's just an auto regression on visual tokens then no, such methods already exist for some time

#

the reason why Stable diffusion, Flux, PixArt and so on are not doing that is because the results are usually much worse than diffusion based methods

#

however the big advantage of generating images natively in an llm such as ChatGPT is that you probably have the best possible prompt understanding

#

and you can discuss with the ai and improve an image step by step

#

it's very likely not something completely new, but it's probably very different from what Midjourney, Flux and Stable Diffusion are doing

#

it's very likely also a MUCH larger model

fervent thunder
#

like half the imagenet leaderboard is autoregressive now, I think it's a proven architecture for img

spark root
#

are 10 images enough for training a realistic character LoRa?

fervent thunder
#

yeah or even one

bronze cove
#

Looking for some advice, feel like I'm hitting a wall in my learning process as I try to advance my skills with stable diffusion.

The problem - I want to generate high quality, fantasy, full body character art. However, when I render a full body image with a detailed outfit, I get a low quality face, eyes, mouth, teeth, etc.

Things I've tried, but don't seem to work (maybe user error)

-Inpaint to improve face - sometimes works, hit or miss, usually causes problems.

-Increase image resolution when generating for higher detail - Causes body deformation. (probably because Illustrious doesn't like going above 1024x)?

-Codeformer - makes things look uglier no matter what settings I seem to use

-GFPGAN - makes things look uglier no matter what settings I use.

-Adetailer - usually makes things look uglier no matter what settings i use.

-Mosaic - Tried to create a half body image for higher detail and then expand the image with mosaic - doesn't stay true to the original when expanding at all

-outpainting - Same as Mosaic, produces bad results that don't match the original image

-Controlnet - can't figure out how to install the models to make it work.

Getting frustrated, keep watching videos of people claiming such and such extension or script or tool is amazing, but when I try to learn to use the said thing, I get bad results.

I'm considering using a different model other than Illustrious but I'm already fairly invested in Illustrious, I like the 3d semi-realistic anime look, and I already learned how to prompt and use loras with illustrious and don't feel like trying another model atm.

Any advice, tips, words of encouragement would be appreciated XD.


fervent thunder
#

these days we have such a good trick where you spin some video models round and round to get lots of extra views of your lora subject

warm junco
bronze cove
#

It tends to make my character's face look worse sadly.

warm junco
#

inpainting works too and is very easy,
img2img inpaint.
set inpaint area to "masked only"
mask the face, set denoise to 0.5
set the resolution to 1024x1024
hit generate
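
For anyone who would rather script the steps above than click through the UI, here is a minimal sketch of the request body. It assumes the A1111 web UI was started with `--api` and that its `/sdapi/v1/img2img` endpoint accepts these field names (worth checking against the `/docs` page of your own install). The helper name `build_face_inpaint_payload`, the padding value, and the prompt text are made up for illustration.

```python
import base64
import json


def build_face_inpaint_payload(image_bytes: bytes, mask_bytes: bytes, prompt: str) -> dict:
    """Build a JSON-serialisable body for an img2img inpaint request."""

    def encode(raw: bytes) -> str:
        # The API expects images as base64 strings.
        return base64.b64encode(raw).decode("ascii")

    return {
        "prompt": prompt,
        "init_images": [encode(image_bytes)],
        "mask": encode(mask_bytes),       # white pixels = area to repaint (the face)
        "inpaint_full_res": True,         # assumed to correspond to "masked only"
        "inpaint_full_res_padding": 32,   # assumed padding around the mask, in pixels
        "inpainting_fill": 1,             # assumed to mean "original" masked content
        "denoising_strength": 0.5,
        "width": 1024,
        "height": 1024,
    }


# Placeholder bytes; in practice read the PNG files from disk.
payload = build_face_inpaint_payload(b"<image png>", b"<mask png>", "detailed face, sharp eyes")
print(json.dumps(payload)[:80])
```

The payload would then be POSTed to the local web UI; that part is left out so the sketch runs without a server.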

bronze cove
#

Yeah, I've tried all of these before 😭

warm junco
bronze cove
#

sure, give me a sec

spark root
#

When captioning images for LoRa training, should I use natural language, for example:
a blonde-haired girl, wearing a black dress and jewelry. She has brown eyes, makeup...
or this:
1girl, looking at viewer, blonde hair, brown eyes, jewelry, black dress...

rose stirrup
#

is there a specific channel to ask about lora training?

viral wren
#

hi

#

how to generate image can anybody tell me?

abstract quarry
# fervent thunder like half the imagenet leaderboard is autoregressive now I think its proven arch...

I disagree ;D
1.) imagenet images are super small resolutions. Yes, for very small resolutions autoregressive might work, but it gets worse and worse the more you increase the resolution. Autoregressive needs discrete visual tokens to work and this will just degrade the performance
2.) maybe we look at different rankings, but from what I see all leading methods are hybrid methods and use diffusion AND autoregression
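
Besides the quantization loss mentioned in point 1, sequence length is a second reason resolution is hard for autoregressive models. A rough sketch of the arithmetic, assuming a tokenizer where each discrete token covers a 16x16 pixel patch (the factor 16 is an assumption, real tokenizers vary):

```python
def visual_token_count(width: int, height: int, downsample: int = 16) -> int:
    """Number of discrete tokens if each token covers a downsample x downsample patch."""
    return (width // downsample) * (height // downsample)


for side in (256, 512, 1024, 2048):
    print(f"{side}x{side}: {visual_token_count(side, side)} tokens")
```

Doubling the side length quadruples the number of tokens the model has to predict one after another.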

gritty nest
#

Hello

hasty hornet
#

Anyone got \ seen workflow to generate 2d spritesheets? Ideally based on already existing image\s

abstract quarry
visual canopy
#

Hello. New here. I've recently installed Stability Matrix. Messed around a little bit, tbh though, I'm quite lost. Any good channel to get some help? Are the VC's active here?

hasty hornet
# abstract quarry Flux is quite smart for such kind of tasks. You probably can train a lora on exi...

models can gen sprite sheets, but there are issues, main one - I can't really get as high res as I would like, ideally I would want like 512x512 per image, for each animation that would be like what, at least 12 images per animation, there isn't a model which can do that, it also needs to be proper consistent sequence, which again, basic tools cannot achieve, img2img trying for luck ain't ideal

#

I remember there was a tool which allowed to "move" some parts of the image and AI figures out other stuff, how to make it look decent, that would help, but I don't remember which tool was it lol

#

512x512 might be overkill, but even 256x256 is still a lot + again, inconsistency
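
To put numbers on the resolution problem, here is a quick sketch of how large a single sheet gets for the 12-frame animation mentioned above. The 4-column grid layout is an assumption picked for illustration.

```python
import math


def sheet_size(frames: int, frame_px: int, columns: int) -> tuple[int, int]:
    """Return (width, height) in pixels of a grid sprite sheet with square frames."""
    rows = math.ceil(frames / columns)
    return columns * frame_px, rows * frame_px


for frame_px in (256, 512):
    width, height = sheet_size(frames=12, frame_px=frame_px, columns=4)
    print(f"12 frames at {frame_px}px: {width}x{height}")
```

At 512px per frame the sheet is already well beyond the roughly 1024px range most of the models discussed here are comfortable with, which matches the complaint above.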

hasty hornet
abstract quarry
#

smooth animations are difficult... I wonder if the inpainting model could be used for that.

runic nimbus
#

Is there an up to date tutorial on setting this up?

#

I dont remember how and i got new laptop

hasty hornet
hasty hornet
runic nimbus
#

The web ui i think?

#

For the image gen

hasty hornet
#

there are probably thousands of them at this point

#

if not these you need to be more specific 🤔

runic nimbus
#

Oh alr, lemme look

#

ye its the webui

#

ty

brittle rose
#

Some Arab Metal fans are arrogant, despicable and provocative, laughing at me just because I use AI and still they didn't manage to make one song like my songs. To think that using AI is trivial while there are some songs that I spend 6 days to make them. Strangely enough, I sent my songs to several Western metal groups and they didn't underestimate my songs. As AI professionals we need to spread awareness about AI to fight conspiracy and bullying against human development.

runic nimbus
#

in case for stable diffusion webui by automatic, where do i config (and ig how) for 4080 laptop version

#

also what sampling methods are good

night gladeBOT
#

What is the best strategy for buying small caps on the Ibovespa (IBOV)?
No data source is currently selected. Please choose a data source from the dashboard and try again.

fast sage
#

What's currently the best way to train Flux LoRAs?

sage reef
#

@woven panther hey I see you converted the AccVideo to fp8, but my question is, can that still somehow work with 12GB vram? or
would I need more? i would really love to test it. I hope someone makes some quants of it too

woven panther
#

if you mean you need to use GGUF.. then someone would need to make those, I mean I can do that too but it's somewhat laborious to do all the GGUF variants

sage reef
#

i can do it if you tell me how to do it, like is there a script/tutorial i could follow, i just want maybe q4 km for now to test

#

also, is the workflow literally the same as a normal Hunyuan t2v, but just with like low steps? if you can share the workflow would be nice 🙂

#

would be awesome if their technique is adapted to Wan2.1 also and the i2v variants

exotic sphinx
#

I'd try adetailer / inpainting at diff denoise levels and with diff prompts, then upscale the result

#

And then maybe run it over another low denoise pass

#

Then you can downscale from that

#

Essentially if you want higher resolution, you prolly should start out with an intended resolution, upscale, then you can tile the resulting image and run a diffusion thing over that at low denoise

#

Can't post my previous pics for that but ye it's what I sometimes do for better background

#

I sometimes make the model at each stage different to try and capitalize on different strengths

#

But it's also very finicky to then preserve the strength of the first model
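
The "tile the resulting image" step described above can be sketched like this. It only computes the overlapping crop boxes; each crop would then go through a low-denoise img2img pass and be blended back. The tile size of 1024 and overlap of 128 are assumptions, not values from the discussion.

```python
def tile_boxes(width: int, height: int, tile: int = 1024, overlap: int = 128):
    """Return (left, top, right, bottom) boxes that cover the image with overlapping tiles."""
    if overlap >= tile:
        raise ValueError("overlap must be smaller than the tile size")
    step = tile - overlap

    def starts(size: int) -> list[int]:
        if size <= tile:
            return [0]
        positions = list(range(0, size - tile, step))
        positions.append(size - tile)  # last tile sits flush with the far edge
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


boxes = tile_boxes(2048, 2048)
print(len(boxes), boxes[0], boxes[-1])
```

The overlap gives room to feather the seams when the processed tiles are pasted back together.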

terse bronze
#

Hey guys

#

Good morning

winged jay
#

I need a workflow in comfyUI that will take father and mother photo as input and generate image of their future child (child should have facial feature of father and mother) if anyone could help it would be great.

sage reef
#

@woven panther sorry to bother you, just wanted to say I got AccVideo working (your fp8 version) 😮
it does it nicely in 5 steps indeed, crazy. i think i saw on author's github, they are working on the image 2 video version too (in coming weeks) , that would be very nice as well.

sage reef
#

wow awesome stuff! il test it later 😮

sage reef
#

ok gguf q4 ks works nicely! and quality is still very good! i didnt test Lora cause i personally don't care too much about it as an "adapter", but it's cool to have as an option i guess.

#

i can't wait for them to release the i2v version

#

btw, is it possible to quantize the Nvidia Cosmos models? like just the 7B maybe?

fervent thunder
#

will quant fine yeah

river nacelle
#

That’s really amazing!

torn cobalt
#

Looking for OCR developers for a big project
if you are interested in the small test project with the budget of $50 plz ping me with your previous relevant experiences

brittle rose
#

Anyone here who didn't fall into the trap of the OpenAI Ghibli style? That's not even Ghibli style despite the adherence to facial features.

abstract quarry
#

There are so many Ghibli loras out there 🤷‍♂️

#

Although the new OpenAI image gen might be impressive on the technical side, all the showcases can also easily be done with existing open source methods. It's just hyped by people who never heard about anything other than Midjourney and DALL-E

#

except for info graphics, they are extremely good

fervent thunder
#

when I read about style stuff the conclusion was that style is not really a clear thing you can separately benchmark cos, while most styles only change stuff like colour, texture and lighting, some styles also change objects or layout

#

like if you imagined an art deco building the shape changed

abstract quarry
#

reminds me a bit of the "Greg Rutkowski" style xD

#

like it was for some time the most often used style prompt

#

and he was pissed off that so many people "copy his style" and many news media wrote about it

#

but what they didn't get was that almost none of these AI images really had any greg Rutkowski style even if it was part of the prompt

#

"in the style of Greg Rutkowski" was basically a way to tell the text encoder: "give me a high quality digital painting with fantasy elements"

#

it was some basic prompt element like "masterpiece, award-winning, highres"... boah, I get nostalgic thinking back to the time where we had to put all this bullshit into the prompt xD

fervent thunder
#

yeah they never actually looked like Greg Rutkowski
a bit like how they had Octane render or 8k

#

I still use SD 1.5 lol I have to do searches for good tokens regularly

covert osprey
#

hey

woeful gulch
#

Hi people 🙂

hybrid harbor
#

hi cute patootie

bright inlet
#

Yo guys

#

Do you know how to fix hands on the image?

fervent thunder
#

can somebody tell me how to convert a pic to ghibli, everybody is on this trend, only i am not able to generate it for my self

pine path
#

they're mostly using gpt 4o. They might have added a filter due to copyright which will make it harder to do.

fickle bridge
#

hi

thorny nimbus
#

hello all

tribal crown
#

hey guys, any way i can a make a wan character twerk ? any loras for this? thanks

atomic mortar
#

What are you bringing to the table other than an idea? Just curious

exotic tangle
#

Hello everyone. I used to use an app called Mon_AI on my phone, it was "powered" by stable diffusion, and it let me create an unlimited number of images per day. It was really great, and probably the best free AI art generator I've used. What is the best way to use stable diffusion for free on my laptop? The app worked great, but I couldn't help but feel like it would be better to use stable diffusion on my PC. Any suggestions?

#

nope, and it's less of an issue and more of a curiosity

daring ginkgo
#

hi :) I am making an AI-based pixel art game with a custom model related to stable diffusion - it runs in-game so you can create new creatures yourself! it is on steam (a 'coming soon' page) over here if you are interested in wishlisting https://store.steampowered.com/app/3614730/Monster_Pod_Quest/
sorry for self promo! couldnt find it mentioned in the rules or a dedicated place, so thought i would message but keep it short waow

sage reef
#

@woven panther just saw on GitHub that AccVideo team is also working on Wan too 😮
We really need some of these accelerators/faster versions so this is gonna be awesome!

verbal osprey
#

still waiting for forgeui to work on 5000 series cards

atomic mortar
#

5000 series owner here, with forge you're gonna be waiting a while

dense sluice
#

Hello! I have created a cartoonish character that resembles an animal, and I want to dress it up. I want to make it anthropomorphic with clothes, but I don't know how. I tried SDXL + IPAdapter, and FLUX + PULID, but I didn't get a satisfactory result. I want the face to be the same, but it keeps changing. Is there another way to try? Or is there no other way than to train Lora for my character?

verbal osprey
atomic mortar
#

And well you can keep forge around since you can set the model paths in the swarm settings

#

So when it updates you can switch back but maybe after trying you dont want to👀

verbal osprey
atomic mortar
#

Flux, wan, etc is possible

atomic mortar
#

But my 5080 does flux schnell in 4s per image

verbal osprey
atomic mortar
#

Oh thats built in

verbal osprey
#

And the other is a faceswapper. I forget the name of it

atomic mortar
#

Ah reactor

#

Yes possible

verbal osprey
#

Reactor, yes.

atomic mortar
#

Also one click install

verbal osprey
atomic mortar
#

Hmm i do recommend checking the swarm discord for the 5000 fix if its not native already. I cant send it rn bc im leaving for work

verbal osprey
atomic mortar
#

Like

#

If its a feature that makes sense? Mcmonkey fixes/adds it in no time

#

And you can comfortably use the latest models

#

Like Wan/hunyuan without a big hassle

verbal osprey
atomic mortar
#

Yeah you should check on the github > docs

#

Its a LOT of good stuff

verbal osprey
# atomic mortar Its a LOT of good stuff

Is it generally considered to be the best generator if you don't want to mess with comfy? I like the flexibility of comfy, but I prefer to just use an interface that just works and not have to hunt for workflows and all of that. It gets really messy and confusing.

atomic mortar
#

I only used comfy like twice because i wanted my own custom workflow, otherwise you don't need to use it at all, swarm takes care of it

verbal osprey
atomic mortar
#

But i prefer darkmode

mellow otter
#

Is this stable ronaldo discord or some random other person idk

light whale
#

Hi, I have been trying for several weekends to give a GTA style to my uploaded photos in Automatic1111, unfortunately without the expected results. I think I am making some serious mistake during the SD setup. I tried using ControlNet and checkpoints from gtavstyle, but without satisfactory results. For the sake of completeness: I want a GTA effect that preserves the resemblance to the uploaded photo (object or person). Maybe I need something like Midjourney for this? Or do I need to install an IPAdapter for SD?
Please give me at least a little help to point me toward solving the problem and learning the methodology for creating such styles with img2img. Thank you very much and best regards

abstract quarry
#

you have to give more information WHAT is not working

#

controlnets + style lora should work to keep the overall scene the same. Alternatively, you can use IPAdapter instead (or additionally) to controlnets

#

if the faces are not accurately transferred, you need something like FaceID or IPAdapter

#

Alternatively: you can also try my Reflux plugin for ComfyUI that works really well for changing the style of an image without changing its content

#

see examples in https://github.com/kaibioinfo/ComfyUI_AdvancedRefluxControl
set the downsampling_factor to 2 or 3 and then use "GTA-style artwork, gta 5 style . Satirical, exaggerated, pop art style, vibrant colors, iconic characters, action-packed" as prompt. At least for me that was enough to get any image transferred into GTA style without controlnets

fervent thunder
#

lol yeah your reflux node is still one of the best things out there

#

there are a handful of other reflux nodes that do some interesting things as well

#

https://github.com/yichengup/Comfyui_Flux_Style_Adjust https://github.com/yichengup/Comfyui_Redux_Advanced
similarity_threshold and noise_level in these two nodes are cool

abstract quarry
#

yeah, I need a free weekend to make a new update ;_;

fervent thunder
#

there's probably more tricks with redux that haven't been discovered, its a weird tool
saw someone interpolating the encoder embeddings and it looked cool

abstract quarry
#

I really hoped I could find a "style-extract" but I totally failed with that xD

#

interpolating between two images or what do you mean?

#

the code makes zero sense and I think his examples are all cherry picked

#

however, that was months ago. Maybe newer versions of his plugin work better but I somehow doubt it

#

ah... he now also writes this in his github page: "Note: Sorry, there is no basis for this weight separation, I just want to try it out for learning. [...] This is an inaccurate node."
so at least he is transparent about it now

fervent thunder
#

they called it "redux travel" and it was using redux embeddings apparently
it was on Banodoco server

#

LOL ah okay

#

noise_level in the other one was good when I tried it

#

it looked a bit more natural

abstract quarry
#

yeah, you can do that. It works also with normal prompts. I once made a video of myself transforming into a werewolf by just interpolating between a prompt describing myself and a prompt describing a werewolf. Worked great.
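
The interpolation idea above boils down to blending two conditioning vectors frame by frame. A minimal sketch with plain lists standing in for the embeddings (real prompt embeddings are tensors, and this is not the actual workflow used for the werewolf video; the function names are made up):

```python
def lerp_embeddings(start: list[float], end: list[float], t: float) -> list[float]:
    """Linear blend: t=0 returns start, t=1 returns end."""
    return [(1.0 - t) * a + t * b for a, b in zip(start, end)]


def interpolation_schedule(start: list[float], end: list[float], frames: int) -> list[list[float]]:
    """One blended embedding per video frame, evenly spaced from start to end."""
    if frames < 2:
        raise ValueError("need at least two frames")
    return [lerp_embeddings(start, end, i / (frames - 1)) for i in range(frames)]


human = [0.0, 2.0]     # toy stand-in for the "myself" prompt embedding
werewolf = [1.0, 0.0]  # toy stand-in for the "werewolf" prompt embedding
for frame in interpolation_schedule(human, werewolf, 3):
    print(frame)
```

Each blended vector would be used as the conditioning for one generated frame, with the seed kept fixed so only the prompt changes.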

fervent thunder
#

hmm okay yeah will try that one

abstract quarry
fervent thunder
#

oh is that what it does?
yeah it looked okay lol

abstract quarry
#

maybe I find some time this weekend to add some stuff and go over the issues and merge requests xD

fervent thunder
#

CADS does something similar and I like CADS so I guess that makes sense

#

it gives more varied/interesting images for SD 1.5 or SDXL

#

sadly no flux support

abstract quarry
#

hm, with Flux its difficult

fervent thunder
#

the double blocks are weird yeah

#

when they mix prompt and image together

abstract quarry
#

nah, if I think about it.... it shouldn't matter

#

you can just add the noise on the T5 embeddings
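
A rough sketch of what "adding noise on the embeddings" could look like, loosely following the CADS idea of noising the conditioning heavily early in sampling and not at all near the end. The schedule shape, the `tau1`/`tau2` thresholds, and `noise_scale` are assumptions here, and the rescaling step that the published method reportedly applies afterwards is left out. Plain lists stand in for the T5 embedding tensor.

```python
import math
import random


def cads_gamma(t: float, tau1: float = 0.6, tau2: float = 0.9) -> float:
    """Share of the clean condition kept at time t (t=1 is pure noise, t=0 is the final image)."""
    if t <= tau1:
        return 1.0
    if t >= tau2:
        return 0.0
    return (tau2 - t) / (tau2 - tau1)


def noise_condition(embedding, t, noise_scale=0.25, rng=None):
    """Mix the embedding with Gaussian noise according to the schedule above."""
    rng = rng or random.Random()
    gamma = cads_gamma(t)
    return [
        math.sqrt(gamma) * value + noise_scale * math.sqrt(1.0 - gamma) * rng.gauss(0.0, 1.0)
        for value in embedding
    ]


toy_embedding = [1.0, 2.0, -0.5]
print(noise_condition(toy_embedding, t=0.95, rng=random.Random(0)))  # early step: mostly noise
print(noise_condition(toy_embedding, t=0.30, rng=random.Random(0)))  # late step: unchanged
```

Whether this transfers cleanly to Flux is exactly the open question in the conversation, so treat it as something to experiment with rather than a known recipe.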

fervent thunder
#

T5 and Clip_L work differently, right? in Flux
yeah maybe this will work 🤔

#

I'll try it today, on the stock flux repo

abstract quarry
#

yes, but CLIPL is.... I don't know how much it does in the end

fervent thunder
#

not a lot lol

abstract quarry
#

CLIPL is only used for the AdaNorm Layers

#

the noise idea is nice... also for the Reflux

fervent thunder
#

I often run flux using a node that disables clip-l anyway

abstract quarry
#

like I could imagine you could add noise on different areas of your image to "lower their importance" and use that also to add some kind of focus "which part of the image should be preserved mostly"

fervent thunder
#

yeah in my experience injecting noise in random places of models can be good lol

#

with SD 1.5 or SDXL noising the attention maps can get you really high detail sometimes

#

it has side effects

abstract quarry
#

hm... I miss the UNET a bit...

#

the layers in the unet had relatively clear roles you could influence

fervent thunder
#

I've only used SD 1.5 in 2025 pretty much
I prefer it

abstract quarry
#

in mmdit it seems unclear which layer is doing what

fervent thunder
#

yeah I did a long sweep of loads of Flux block combinations on a big server once
wasted a fair amount of money on it
there is no really clear trend

#

the double blocks do go from coarse to fine, or the other way around, can't remember which

#

but individual blocks are not salient

abstract quarry
#

such that I can at least understand how the prompt tokens behave in different layers

#

and if flux even understands certain tokens or not

fervent thunder
#

ah yeah I came across that on github searches 😂
I need to try it, this could be useful yeah

#

I wish both comfy and diffusers made it easier to get attention maps

abstract quarry
#

It was a pain to get them from diffusers xDDD

fervent thunder
#

yeah I've avoided diffusers all this time and it seems to have paid off

#

cos after 18 months I can finally just write pytorch

abstract quarry
#

😥
I liked the idea of having a unified framework. But the API is really not well designed

#

in particular for Flux I have to say: the original code by BFL is just extremely clean and simple

#

probably because they had to write from scratch instead of just adding another layer of code on already existing codebases

fervent thunder
#

yeah that codebase is so nice

#

I don't know where autoregressive models are going to go
they seem to tend to use HF transformers

#

or just pure pytorch

abstract quarry
#

do they? I thought its often the other way around and transformers is adapting to their stuff. But I have no clue

fervent thunder
#

not sure, I went through a bunch of github repos of stuff like llamagen or infinity VAR
it seems to vary though

#

they're all under-trained but we have a bunch of them

abstract quarry
#

I don't believe in pure AR methods for image generation xD

#

I bet even OpenAI is using diffusion under the hood

fervent thunder
#

the pure ones don't look too good yeah
hybrid probably best

abstract quarry
#

yes, hybrid is the future

fervent thunder
#

like either a diffusion head, or the ones where the AR process mimics a diffusion process

abstract quarry
#

something like StableCascade with an AR method as first stage

fervent thunder
#

that sounds good yeah

abstract quarry
#

because what they found out in the end is that diffusion works great, but it can get better if you do not try to diffuse an image in one generation but instead split it up in smaller chunks and generate them sequentially

fervent thunder
#

we have an absolutely massive one, Lumina mGPT 34B
if you are willing to wait hours for your 4k image 😂

abstract quarry
#

however, I think AR makes only sense if you integrate the image generation into an existing llm

#

because only then you get the full power of "conversational image generation" we see in ChatGPT

fervent thunder
#

yeah I talked to Janus Pro and other AR image models and their text conversations are not so good

faint timber
#

@fervent thunder What are attention maps? Where would I inject noise in a SD1.5 workflow.

polar cave
#

Hey, does anybody have experience in creating hyper realistic pictures when a lora is already trained? I tried with flux and stable diffusion but have some problems some times

vale barn
#

Hey there

modest haven
#

Hi

sage reef
fervent thunder
faint timber
#

I've never seen those maps. Is there a preview node for attention maps? What would I plug into the preview?

fervent thunder
#

no this is what we were saying, that its annoyingly hard to make them in comfyui and diffusers

abstract quarry
#

it's not that these maps are important for daily use anyways. You can use them to interpret models and try to understand which layers might be doing what

#

for example it was easy to see with attention maps that certain padding tokens are used as registers and are important, while most padding tokens are probably useless
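
For the "what are attention maps" question: an attention map is the softmax-normalised score table between queries (for cross-attention in an image model, the image patches) and keys (the prompt tokens). A toy sketch in plain Python, with made-up 2-dimensional vectors standing in for real activations:

```python
import math


def attention_map(queries: list[list[float]], keys: list[list[float]]) -> list[list[float]]:
    """softmax(Q K^T / sqrt(d)): one row per query, one column per key."""
    dim = len(keys[0])
    rows = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        peak = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows


patches = [[1.0, 0.0], [0.0, 1.0]]             # two toy image patches
tokens = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # three toy prompt tokens
for row in attention_map(patches, tokens):
    print([round(w, 3) for w in row])
```

Reading one column across all patches and reshaping it to the image grid gives the heatmap people usually mean by "the attention map for a token".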

fervent thunder
#

some tools manipulate attention maps for various reasons

sage reef
#

@woven panther i see you slowly implementing some VACE support ❤️

#

il wait for a workflow

woven panther
#

Yeah I have to sleep but it does work

sage reef
#

nice

#

the model file is in fp32? seems big in size

woven panther
#

Tested control with depth and pose, with or without reference image, video extension and outpainting so far

#

Pretty impressive first impression

sage reef
#

awesome !

#

fantastic work

#

VACE has so many abilities it's crazy

woven panther
#

The base model is in fp32 and the vace model (in the same file) in bf16

#

It does use almost twice the VRAM of the normal 1.3B
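
For a rough feel of why the fp32 file looks big, here is back-of-the-envelope arithmetic for weight storage only. It ignores activations, the text encoder, and the VAE, so actual VRAM use will be higher; the parameter counts are the nominal 1.3B and 14B figures from the conversation.

```python
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp16": 2, "fp8": 1}


def weight_size_gb(params_billion: float, dtype: str) -> float:
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * BYTES_PER_PARAM[dtype]


for params in (1.3, 14.0):
    for dtype in ("fp32", "bf16", "fp8"):
        print(f"{params}B in {dtype}: ~{weight_size_gb(params, dtype):.1f} GB")
```

Halving the bytes per parameter halves the file, which is why the fp8 and GGUF conversions mentioned earlier matter so much for 12GB cards.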

sage reef
#

they have a huge guide on how to use it lol

#

with the preprocessing

#

and this is just the preview version, they will release the actual 1.3B and 14B versions soon

#

if 1.3B is twice VRAM, 14B gonna be huge for most people i guess idk

#

man April just started, we get VACE, i also found out about TripoSG for 3D asset stuff 😮 (il try it tonight)
plus we have AccVideo stuff, etc,

fervent thunder
#

AccVideo is a bit rough but the others are good

#

its hard to make video distills

#

there was a closed source paper that did a good one but they used something like 256 H100 to smooth out the gradient updates

#

which is too excessive for open source

gritty adder
#

Hi guys yall might have seen but my friend and I created Rem—a dream journaling app where you can easily record, analyze, and share your dreams on https://lucidrem.com

vital charm
#

Hi all, just wondering if anyone has a recommendation for an Illustrious model for general use (toony not realistic). I'm using Illustrious Toon Mix but sometimes its just too cute, so i need something more general purpose

atomic mortar
#

Spam

atomic mortar
sage reef
#

@woven panther I see you are still implementing VACE stuff 🙂
just wondering in the current state of your implementation, is anything ready to test
with a workflow you could provide or you are still getting it ready for a workflow?

woven panther
abstract quarry
#

2x as much VRAM sounds like it's not realistic to use the larger wan model... :/

woven panther
#

then you can use a single reference image, with or without the control signal, but main thing is you need to either put your image on a white canvas or remove its background, so that it's white, or it won't work

woven panther
fervent thunder
#

you can always blockswap

#

if you accept the blockswapping slowdown then suddenly you can run way larger things

atomic mortar
brisk mirage
#

Is SD working on an answer to OpenAI's new Image Gen capabilities? Would like to see the enhanced image gen on SD models

solar wraith
#

Hey guys is there any tool / library from which we can call any phone number .

Note:- this is needed for voice bot. ( Once call is connected to number we can enable the voice bot)

atomic mortar
#

Guess you're in the wrong place

solar wraith
#

See I want to implement voice bot but how to implement it until you call a mobile number

atomic mortar
#

So you want to make something that could do that but stop before you get to that point?

atomic mortar
outer crescent
#

Does anyone here use audio narration software on webpages? I'm fixing up a bit of css/html and if anyone is active and does, it'd be great to know what works and what doesnt

atomic mortar
outer crescent
#

thanks though!

#

my current system lets a person using a narrator skip the tedious taskbar, but if they go back. wait, nevermind, you can go back using the browser tools instead of narration. Thanks for letting me sound the thought!

rain flint
#

Alright, after constantly having issues before, now that I have a new PC, I want to get a fresh install going for Stable diffusion. What is the best install guide to follow for Win 11 using a web UI? (used automatic1111 before)

fallow veldt
#

What is chatGPT using for this Ghibli style image? I briefly entered the generation page and saw that it uses SDXL Juggernaut and Flux, or at least you can choose (correct me if I'm wrong). So it uses that, but I guess the prompt is enhanced by chatGPT, also I guess the img2img is enhanced as it keeps just the right amount of detail mostly, what is it?

#

I was wondering if they had some kind of ControlNet or it is just a img2img, as detail and sometimes text stays

abstract quarry
#

if the model is large enough you don't need control nets for such tasks

#

Flux for example is extremely good in such kind of things if you let it generate multi-panel images

#

it's just that Flux cannot use previous images as conditioning for new images. Otherwise you could probably do similar things with Flux

viral valley
#

Could someone help me import a diffuser into Inpaint Anything on A1111? I downloaded an inpainting model from Civit but I don't know where to put it or how to get Inpaint Anything to recognize it

#

The directions on the inpaint anything github seem to give directions to download one from a repo you know the name of, but not a local file you have on your machine.

delicate sedge
#

Hi

atomic mortar
#

A1111 while nice is outdated

upper geyser
#

Guys, whats the best way to put my face on another photo? to make it realistic like faceswap?

atomic mortar
#

so when inpainting you select that one but A1111 is a distant memory for me so i cant help you much there

rain flint
atomic mortar
#

personally i like swarm a lot

atomic mortar
#

though swarm also has a extension for this

abstract quarry
#

there is ipadapter for SDXL. For Flux you can use FaceID

viral valley
atomic mortar
#

hmm must be the extension specifically that's acting up, because i used to be able to select them just fine as a main checkpoint

#

though if you dont mind me asking

#

why inpaint anything?

viral valley
#

segment anything for masking

atomic mortar
#

hmm

#

ahh i see it uses a llama model too

#

yeah i see what you want now

viral valley
#

for example, I don't have dreamshaper, but I seem to have a dreamshaper model available in inpaint anything.

atomic mortar
#

hmm i dont see any documentation being written on the page other than "its compatible"

#

so one would assume it looks in the stable-diffusion folder. cant help ya here man sorry

viral valley
#

from inpaint anything sd webui repo on github: The inpainting model, which is saved in HuggingFace's cache and includes inpaint (case-insensitive) in its repo_id, will also be added to the Inpainting Model ID dropdown list.

If there's a specific model you'd like to use, you can cache it in advance using the following Python commands (venv/bin/python for Linux and MacOS):
#

The model diffusers downloaded is typically stored in your home directory. You can find it at /home/username/.cache/huggingface/hub for Linux and MacOS users, or at C:\Users\username\.cache\huggingface\hub for Windows users.
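
A rough sketch of where that cache folder ends up, assuming the usual Hugging Face defaults (the `HF_HUB_CACHE` and `HF_HOME` environment variables override the location if they are set). The helper name `default_hf_hub_cache` is made up for this example and is not part of any library:

```python
import os
from pathlib import Path

def default_hf_hub_cache():
    # HF_HUB_CACHE, if set, points directly at the hub cache directory.
    explicit = os.environ.get("HF_HUB_CACHE")
    if explicit:
        return Path(explicit)
    # Otherwise the hub cache is the "hub" folder under HF_HOME,
    # which defaults to ~/.cache/huggingface.
    hf_home = os.environ.get("HF_HOME")
    base = Path(hf_home) if hf_home else Path.home() / ".cache" / "huggingface"
    return base / "hub"

print(default_hf_hub_cache())
```

Listing that folder should show one `models--<owner>--<name>` directory per cached repo, which is probably why a model you never downloaded yourself can still show up in the dropdown.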

#

but I would like to not limit the model for use only in inpaint anything (sometimes I send the mask to img2img) but I also don't want 2 copies of it taking up twice as much space

#

also I didn't get the model from HuggingFace I got it from CivitAI, so.. I guess it's possible it has a repo there.. hmm

#

Nevermind. You can configure the extension to look in another directory.

viral valley
#

hmm, I pointed it at the models folder where I keep all of them, but it bugged out.. I had to redownload the sam model segment anything uses, which created a file in the folder, but the segmented image is like a thumbnail and doesn't show the full image..

#

... I didn't read it correctly. you can only change the segment anything model folder.

#

oh. omg I'm dumb. there's another tab in inpaint anything next to cleaner and controlnet inpaint for webui inpaint, where all the models are listed. forget I mentioned it

fallow veldt
median beacon
#

how to make image?

deep narwhal
#

anyone manage to scoop up a 9070 or 9070xt and get it to work on windows?

warm junco
safe shale
#

hello

craggy rune
#

hi

fervent thunder
zealous flume
#

Can anyone help me with getting flux gym installed locally on Mac?

fading spruce
#

Hello everyone. I am new here and new to ai in general, but I have set myself a challenge. I have been watching on as my friends have started side hustles like drop shipping and automated news programs or social media channels. I wanted to start a monetized project of my own, and I decided that I would finally publish the comic book stories that I had been dreaming up for years, and I would use ai to do it. I am planning to have my story analyzed by one bot that will turn the descriptions into prompts for stable diffusion to make the art from, then compile/format it into a comic and sell it on Amazon KDP. I have Stable diffusion with the AUTOMATIC1111 UI, but I don't know how to get the art to turn out right (additionally I will need consistency of character design). I tried it out myself after watching some how-to videos and got some wonky results, and when I asked the ai I plan on using to convert my stories to prompts (Claude AI), the results its prompts and settings got were disturbing to say the least (I asked it to try and reproduce Garfield the cat. The images came back with two torsos attached like parts of an ant's body, three eyes and five or more legs). For consistent comic/manga art does anyone have advice on what I need to learn, add or download/incorporate?

craggy rune
#

im praying there will be an sd3.5 detailer

fervent thunder
#

there are some occasionally, cogview 3 unet version had one

#

and I saw some normalising flow model refiner

agile lion
#

Hello all

fervent thunder
#

hello

craggy rune
fervent thunder
#

ah ok I don't know impact pack very well

#

it looks like a good node pack its just that I don't want the SEGS thing

#

but impact pack is mostly about SEGS really lol

#

I just sort of blend a bunch of gradients together with photoshop blends instead, and then turn to masks

fervent thunder
#

Hi guys :sadcat: **I wish to hire** someone that can upgrade my Real Estate Renders for my client, I have a 3D model of the project, and real life photographs from onsite, client wants to showcase the house as finished to be able to list :diffusionhand:

finite geyser
#

Hi!

runic pasture
#

hi all

brittle rose
#

turned out the Grok3 has more adherent filter face ID than Sora

fervent thunder
#

ye

#

Grok's one is autoregressive apparently

twilit forge
#

what sampler/scheduler do you guys use for wan 2.1 img2vid generation

hot tendon
#

I'm having trouble getting good results with image-to-image in Stable Diffusion. My outputs are very low quality the eyes, facial details, and overall sharpness are really lacking. But I see other users getting amazing results. What could I be doing wrong or missing?

atomic mortar
#

They also do a "hiresfix" on images they post a lot

hot tendon
#

Thanks! I’ve heard of Hires.fix but I haven’t used After Detailer yet Is it a separate extension or part of a specific workflow? And how exactly do you use it for img2img?

atomic mortar
hot tendon
#

I think it's related to ControlNet using a reference image and transforming it into a different style with AI without changing the main structure. I’ve installed the extension, but it doesn’t work well for me. The results are very poor the face and eyes often come out distorted or missing. Any idea what I might be doing wrong?

atomic mortar
#

Hmm your using controlnet for image to image?

#

What are you trying to do?

hot tendon
atomic mortar
#

Like specifically

hot tendon
atomic mortar
hot tendon
#

Okay I am gonna send with my details

grave knot
#

Hello everyone, great to be here to learn first to contribute later.

#

I have started to create images with comfyUI, can anyone tell me where to find a model to generate 3D comics?

celest osprey
#

Hello

sage reef
#

@woven panther how did you get the original VACE wan model from 7.15gb to like 1.47gb 😮
is it really equivalent and i can use your smaller version? cause that would be really nice if i can avoid the extra space

#

il check your workflow tonight

upper plinth
#

HEEEEEEEEEEEEEEEEEEEEEEELP

#

I got a 5080 and now stable diffusion doesnt work

#

someone said I need to update to a different version of pytorch

#

does someone know the easiest way to get this fixed

upper plinth
#

Bro these GPUs are so overpriced and they don't even have backwards support out of the box

#

Nvidia can eet a phatt dikk

atomic mortar
#

hmm the problem is its a completely different architecture

#

better ye

upper plinth
#

well yeah, you make a new architecture, charge up the arse for it, and don't even plan on helping out the folks who were using your previous architecture

#

I HATE this monopoly these mfs run

woven panther
lethal zephyr
#

Hi quick question does Stable diffusion still offer video generation API’s, image-to-video, and image generations? Also is there any restrictions or can you simply follow the API docs provided to get access to the API’s?

fervent thunder
#

https://platform.stability.ai/pricing

#

its got an interesting aspect that they have 3D models on there as well as video and image with controlnets
so you could make a 3D model, turn to image with controlnet then turn to video, with just one API

unreal verge
#

Am I weird for preferring both Flux and SD3.5 together?

#

Both have their flaws for sure, but using both of them really provides me interesting images to generate.

fervent thunder
#

its not weird its good to combine models

#

I end with SD 1.5 or SD 2.1 for every image I make, for example

lethal zephyr
fervent thunder
#

no problem

#

I recommend looking at 3D stuff, the latest Gemini Pro can write working blender scripts to make use of 3D models

#

you can do a quick render and then go back to diffusion model to finish it

unreal verge
#

Well... I ended up getting banned from the Midjourney discord.

#

I probably shouldn't have said "4o Image is so good that I make multiple accounts to generate more image."

#

And look where it ended up.

#

Oof.

fervent thunder
#

yea they have to ban for that, its probably in discord terms

#

discussing unlawful activity etc

unreal verge
#

Well I learned my lesson, I'll keep my mouth shut next time.

fervent thunder
#

not rly a fan of piracy cos
I have to pay more for things due to the pirates

unreal verge
#

I am mixed toward piracy, piracy should always be a last resort. Otherwise, I pay for my stuff. But I can see where people are coming from when they pirate.

fervent thunder
#

to me last resort things are like food, water, housing, medicine etc
media and software not so much

atomic mortar
#

sometimes piracy is the only way when companies decide to abandon a product (old game) but refuse to re release it shrug not advocating for it but i get why

#

or when you bought software and the newer version just sucks while the old one worked better. so you download the old one etc

fervent thunder
#

that's a description of the motivation "why do they want to do the thing"
rather than an ethical judgement "is it morally right for them to do the thing"

vestal sand
#

Hello

unreal verge
#

Hi.

serene hazel
#

Anyone know of any good tool to use for branding / infographs / asset creation?

I got logo, colors and stuff ready for a project. Tryna find a tool that i can use as a template to create banners, infographs etc.

errant bronze
#

I am your partner in AI-powered business transformation. My mission is to bring innovative, AI led solutions, to your business problems, through a personalised human led approach. Delivering excellence for clients and customers with demonstrable results and measurable return on investment.

If you are looking for AI engineer, I 'd like to discuss with you.
Thanks

faint timber
#

Supposedly, ChatGPT 4o can do that.

atomic mortar
#

for your use i recommend Canva

#

for banners infographs, flyers etc

#

it has AI integrated but its tool is really nice regardless

open umbra
#

Is Kohya_ss still the standard for making lora? I've made a couple character lora successfully but I've noticed it hasn't been updated significantly in a while. Is it because people use something else now or it just doesn't need updates?

#

Oh, looks like after a while there was one last week and before that September. I could have sworn I didn't see any before 😅

#

Still curious is it still the standard to use?

atomic mortar
#

Hmm i don't see why not. For What model you wanna make a lora?

open umbra
#

All sorts of models really. I'm still new to training so I'm pretty ignorant on what other people use

#

Heard kohya was good and after fiddling with the sliders after failed attempts I get pretty solid results now

#

Just weren't sure if it was the best trainer or not

#

I haven't watched any videos just been YOLOing AI stuff so it's been a journey 😅

wraith dagger
#

where can i post the stuff i create and ask how to make them better and stuff

sage reef
#

based on Wan 😮

woven panther
#

yeah, but doesn't seem to have control support so not too interesting for me personally

sage reef
#

yea i understand, will see what happens

abstract quarry
#

when new models arrive, there might be differences and bugs between different training methods

#

but if you train models that are out since many months then it probably doesn't matter

#

just use the training tool you are familiar with

floral umbra
#

Almost tempted to use my spare 1050TI as 4GB extra vram for comfy to offload to lol. As i hit 24GB vram and 64GB ram cap way too damn often lol

#

Or, hell.. just checked, and i can just get a used 3060 12GB for 220, so with the offloading, i will essentially have 36GB vram kek

summer sedge
atomic mortar
#

you want to make a style or a character lora?

summer sedge
#

character. and, like i said, i can only really start with a single picture as my data set. i heard some time ago that it's possible.

atomic mortar
#

hmm maybe with IP adapter? but for a lora you generally need over +50 images minimum

summer sedge
#

maybe i can slowly get there... what's an IP adapter?

atomic mortar
#

hmm it has many uses such as style transfer, etc but im not familiar with it since i like to make a style from the get go

summer sedge
#

well, i am using A1111, and, even though i always wanted, i never really got around to experimenting with lora training

#

so, i am mostly wondering if i should use some extension

atomic mortar
#

Well heres the thing, theres one extension but its not recommended as its outdated and even broken for the newer models iirc?

#

what GPU do you have before i recommend some stuff

summer sedge
#

nvidia 16 gb, rather solid i think

atomic mortar
#

hmm what generation? 30, 40 or 50 series?

summer sedge
#

rtx 4070 ti super

atomic mortar
#

hmm you could train locally but it would use your pc for a solid while

#

but since you wanna try 1 image lora's its better to run locally imo

#

oh Cs1o is typing

#

👀

summer sedge
#

locally sounds right

#

so, no extension then?

#

the inbuilt tab?

warm junco
atomic mortar
#

id say avoid at all costs yeah sillius

summer sedge
#

i seem to remember there is a kohya extension?

atomic mortar
#

its a stand alone tool

#

any extension is probably a "fan made" extension and probably not as up to date

summer sedge
#

hmm okay... is the inbuilt A1111 training implementation lacking somehow? or should i just start there, then?

atomic mortar
#

kohya_ss

#

or onetrainer

summer sedge
#

okay, i will look into it, ty

atomic mortar
#

theres also guides available for that iirc

#

but for the other stuff:
Its either outdated, broken or complex to figure out for beginners

summer sedge
#

oh, that makes the decision easier

#

are people even still using A1111 or has everyone switched to comfy?

atomic mortar
#

Mostly forge, Swarm or comfy yeah, some people use Fooooocus (more or less o's idk)

#

people who use A1111 mostly gotten a guide on youtube from a few years ago

summer sedge
#

yeah well, i'm not so up to date...

atomic mortar
#

i mean it still works

warm junco
#

I'm still using Auto1111, performs good as I dont do much with flux

#

Forge is good too but some extensions won't work there

summer sedge
#

can't say i miss anything in A1111 either tbh...

#

anyway, i will look at kohya, thanks

abstract quarry
#

training from one image works

#

but I would recommend to try ipadapter first

#

training from one image often means: you overfit like crazy, use this model to generate multiple new images, start again with a larger training set

#

with ipadapter you can directly generate more training data

#

5 training images are a much better start

upper plinth
#

holy guacamole

#

I just got A1111 running with my 5080

#

worth every cent of this pre-scalped card

#

generation speed is INSANE compared to 3090

#

Also thumbs up CS1o for still using A1111

#

I'm never switching to anything else if I can help it. This interface to me is golden

#

Flux is good for images with text and quick concept arts

#

1.5 is still king of details IMO. Also the easiest to train

devout quest
#

:cowdance:

still ivy
left depot
#

GM everyone

How to generate images here?

nocturne prawn
#

heyy

#

anyone here into AI/DS

summer sedge
#

One more thing: are there (up to date) integrations of kohya and ip adapter into comfyUI?

whole prism
#

Hey guys,
I need to run DiffusionPen and it requires Stable Diffusion v1.5 from runwayml.
Can anyone help me with this.

whole prism
earnest knot
#

hi

vagrant silo
#

hi all

finite copper
#

Hello everybody

sage reef
#

@woven panther hey thx for supporting SkyReels-A2 and providing the converted weights too,
will check it tonight. Is there a workflow available?

woven panther
sage reef
#

no problem ❤️ take your time, amazing work as always!

#

im assuming infer.py is the actual workflow pipeline, so maybe the answer is there somewhere :3

noble edge
#

hello, while generating images i got a problem and i was hoping i could get some help! i try to generate 2 characters, capella and crusch from rezero. In the anime Capella is riding Crusch. I managed to get the positions right and to keep the characters from merging with each other, but every single time, Crusch is on top while capella is below. I want the opposite! Can somebody please help me fix this problem and make capella the one above?

verbal osprey
#

Has anyone here used swarmui and sd.next? Wondering which one I would prefer more. I really like automatic1111 and forge.

kind dawn
#

Hello

hearty kettle
#

anyone know how to get stab matrix working with 12.8

#

i keep running into issues

#

manage to fix some but not all

woven panther
sage reef
#

cool, will try all that tonight, thx 🙂

fervent thunder
#

they tend to get recommended as easy install methods

#

but if they are for install why do they not resemble install scripts 🤔

hearty kettle
#

they are just painted differently

#

thats what makes it easier in a sense

#

more organized

fervent thunder
#

1. download Docker
2. docker pull aidockorg/comfyui-cuda
3. docker run -d -p 5000:5000 --name comfyui aihub/comfyui
4. http://localhost:5000

after that you have docker setup with comfyui and automatic model and node installs

#

its isolated from your system so it is safer for security as well

#

and if the comfy install gets messed up you can remake fresh docker container with one command

desert dagger
fervent thunder
#

IDK cos they have this other computer aspect going on
```
Architecture
Pinokio takes inspiration from how traditional computers work.

Just like how a computer can do all kinds of things thanks to its comprehensive architecture, Pinokio as a virtual computer is a comprehensive platform for running and automating anything you can imagine with AI.

File System: Where and how Pinokio stores files.
Processor: How pinokio runs tasks.
Memory: How pinokio implements a state machine using its built-in native memory.
Script: The programming language that operates pinokio.
UI: The UI (user interface) through which users access apps.
```

craggy rune
fervent thunder
#

can m4 pro run stable diffusion also new amd 9070 smoothly without lag hang stutter melting?or ngreedia only option for SD

#

even xt will do

#

or not?

#

9070 xt?

frail prism
#

I wonder how OpenAI's newest image generation works. It's clearly more complicated than just processing a prompt through some weights.

zealous flume
#

Does anyone here have experience with Lora Training in Flux Gym? I've been working at it and keep running into an issue where it says my training is complete after like a minute. No sample images or anything.

simple widget
solid spindle
#

i have a huge prompt list in the pos. prompt textfield and want to add them one after another to see what effect each one has.
how can i leave the other prompts in the textfield without them influencing anything... how do i exclude or ignore these prompts without copy pasting and such workarounds?

#

Wait it seems i can use # per line ?!
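
Whether `#` works as a per-line comment depends on the UI and version, so this is only a sketch of what such comment stripping amounts to, in case you ever need to do it yourself before pasting a prompt. The function name `active_prompt` is made up for this example:

```python
def active_prompt(text):
    # Keep only non-empty lines that do not start with '#',
    # then join them into a single comma-separated prompt.
    kept = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            kept.append(line.rstrip(","))
    return ", ".join(kept)

prompt = """masterpiece, best quality
# cinematic lighting
portrait of a knight
# rain, fog
"""
print(active_prompt(prompt))
```

Removing the `#` from one line at a time then lets you see what that single line adds.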

upper plinth
#

How good are Flux loras, any examples?

#

I want someone to show me a Flux taylor swift to see if it beats SD 1.5

atomic mortar
upper plinth
#

Show pictures

#

Of celebrity so I can judge

atomic mortar
#

filter by flux 1s or 1d dev

#

and thne

#

then by SD 1.5

upper plinth
#

1.5 still better

atomic mortar
#

?

upper plinth
#

For details

#

This gonna make a lot of people mad, but image generation peaked at 1.5

#

At least for NSFW

humble iris
#

I just tried Deep Cache for SDXL (ComfyUI) and it makes generation 3x faster and I like the image even more.. Are there any other speedup methods?

atomic mortar
#

if i were to create the stuff i do on 1.5 its so much more messing around

upper plinth
#

I've used flux for some personal projects, but I'm not convinced it's good at porn

atomic mortar
#

extra upscaling, adetail, refiner etc

upper plinth
#

Exactly you have to use a lot of extra tools but overall I'm still blown away with how much more flexible and uncensored it is

#

I just don't see that with the newer models and I don't see that with trainability

#

But if someone can prove me wrong, I wanna see what celebrity trained Lora on flux looks like. Doesn't even have to be NSFW.

#

Civitai is just full of anime

atomic mortar
#

wrong filters then

#

pinged in the other general

#

bc no images here

scarlet smelt
#

Hi guys, what is actually the best efficient AI for generative pics in open source ? And/or model ? Thanks a lot

unreal verge
#

Will there be a Stable Diffusion 4?

fervent thunder
#

probably but I suspect their main product is gonna be something different

#

they keep hiring VFX people

#

and if you look at current VFX software its very much doing things that could be replaced by transformers

vital charm
#

Does anyone know which types of SD models do a good job with text? I've never really tried it.

Bonus if it can do non-Latin alphabet characters!

vital charm
#

If the answer is "Stable Diffusion 3", then I don't think I have access to that, unless I'm wrong

abstract quarry
#

I would say none of the diffusion models is particularly good at text, but Flux is probably best

#

there is also a clip checkpoint for flux that surprisingly improves text a bit

vital charm
#

oh ok, thank you

fervent thunder
#

something launched recently with glyph by t5 v3

frank river
#

Hey guys what hosted sites (free or paid) do the best image generation with a character reference? I'm trying to get my face, but in an action scene, that sort of thing. I tried with ChatGPT 4o but it's not getting the face right. Is the only real way to do it using SD or flux locally?

sage reef
#

@woven panther thx for providing the Wan2.1 fun reward loras! Just a question, are these only for the InP model
or can also work with the Control model? Cause I mainly use the Control model.

woven panther
sage reef
#

ah ok ty

#

also, from personal testing, which is better, MPS or HPS?

#

i mean i guess il try both :3

full pollen
#

Hi everyone I am currently in the research part of a video I am making "AI: Creating or Killing Art?". It would be a big help if I could get some of your answers on these three questions : 1. Can AI be a tool for artists? 2. What does it mean for something to be art? 3. Is AI "art" theft? Thank you in advance!

sage reef
#

for fun, il answer those questions:

  1. I would say it is a tool, just like a calculator is a tool, or a music sampler is a tool to make music with libraries (instead of hiring a real orchestra for example, etc)

  2. what is art? I have no idea, but whatever it is, it is completely subjective from person to person. You know the saying "one man's crap is another man's gold" (or something like that). But the point here is that because it is in fact subjective, then what does it matter how it was created? AI or not. I can dislike or like an AI-only art piece just the same way as I can dislike or like a non-AI art piece. so what does it matter?

for example: what do i care if you composed a piece of music with only tools and AI, im interested in the "actual thing im listening to", so what do I
care how you did it, i just like the music. and the same logic applies to art pieces.

  3. i mean theft in what way tho? copying a style? i didnt know there was a copyright attached to styles. you can be inspired by a style, idk...
    if that is theft, then so many other things are theft too, which would be ridiculous to go into in detail.

But to answer the main question of the title: creating or killing art, i would say creating for sure. again, the judge is the end user that looks at it and if they like it or not, aka, subjective.

full pollen
#

Thank you for your input!

lament cradle
#

Create a smiley for me

sage reef
#

🙂

silk saffron
proud dawn
#

Any good upscalers? Please help

regal storm
#

hey yall whats the best starter tutorial for learning how to edit images of myself with stable diffusion? i know nothing and theres 100+ youtube tutorials vids. im wondering which ones explained it best for you

#

any specific youtuber or guide online?

summer goblet
#

where can i find discord of unstable diff?

floral umbra
#

Oh boi, gonna train my first video lora, tested with one few seconds clip, was a success, and now i'm preparing a few thousand frames kek So far 2 clips, 1700 frames total, and it still remains to have 6-8 clips. So might take a good few consecutive days to train lol. Sadly i can't train with runpod and the like because i still can't figure out how to use it eugh

floral umbra
#

@vapid dove Scammer above.

icy crypt
#

hello

nova tulip
#

hello

wanton beacon
median swift
#

Hello

vast pilot
#

Hello 🚘

vale lark
#

hey everyone can anyone guide me how to use the image generation

floral umbra
turbid depot
#

o/

burnt bridge
#

hi!

lethal zephyr
#

If someone has ever used the image-to-video endpoint and knows how to get a proper video URL please let me know!!

main snow
#

man, i wish making loras wasn't the equivalent of stick n ball torture without the stick n balls

#

soul sucking process

golden magnet
#

kinda want to find a model that provides a straight front view toward the character

main snow
molten tangle
#

So how's SDXL consistency nowadays ? Can you generate the same character on and on ?

main snow
main snow
molten tangle
main snow
#

pony just kinda feels outdated atp, only upside is due to being older has more loras

molten tangle
#

Like a character render

main snow
#

plus you can remove the background in seconds anyway

#

by just using photopea

summer sedge
#

is forge a continuation of A1111? it doesn't just look virtually the same, it gets its extension list from A1111, too... are they actually compatible?
i'm trying to figure out whether i should finally leave A1111 behind and switch to comfyUI, swarmUI or forge.

summer sedge
#

and what about swarmUI? is it built on top of comfyUI and contains all its features? no need for standalone comfy?

tepid burrow
#

Went on the image arena to try the Dev HiDream. Did about 80 runs and i will say some of it is inflated, because it was put up against terrible models like 60%-70% of the time. But i do think its pretty good. Probably better than Flux pro if i had to rate it.

kindred heath
#

Do someone know how to create nsfw images with ai?like the works on pixiv

#

Is there some websites can do this?how do the nsfw creators on pixiv do?

desert dagger
sleek shuttle
#

is there a channel where i can post pictures of sand

teal nova
#

hello

abstract quarry
#

it feels like they tried to do "more is better" instead of "less is smarter"

abstract quarry
#

2 CLIP, T5, Llama3

#

which else?

#

oh, and it uses mixture of experts X_x

abstract quarry
#

nah, the architecture looks insanely inefficient

#

I don't have the feeling they really thought about how to make a good image model, but instead just threw everything into it they found and added a lot of money they had lying around anyways 🤷‍♂️

sleek shuttle
#

shouldve given the money to me

#

i can draw a little

#

i have a cat

obsidian drum
#

i have 2 cats

fervent thunder
#

using other hidden states than the pooled layer helps for Flux as well
someone on comfy discord tried it

#

using unpooled clip l

hasty badge
# abstract quarry I don't have the feeling they really thought about how to make a good image model...

After having seen the post from ostris on twitter (https://x.com/ostrisai/status/1909415316171477110) showing that apart from llama none of the text encoders even come close to understanding the prompt and their influence is erratic at best (https://x.com/ostrisai/status/1909415316171477110), this model makes no sense at all. It wouldn't surprise me if the main reason for better prompt following is that it has just seen better captioned images during training. SD3 attempted to balance the text encoders, flux chose to favor t5 and only use the low-information pooled clip embedding, but this one seems to have just gone full steam ahead, destroying the effects of the other text encoders in the process. So weird, makes no sense at all. Can't help but wonder what their data + an architecture like lumina2.0 would have resulted in. MOE is nice though, but seeing this, i wonder whether it even works well in this model.

main snow
vast topaz
#

how do i prompt here?

main snow
#

Here? Ya don't lol, you do it in whatever app ya using or on Civitai website

fervent thunder
#

if one text encoder contributes most of the magnitude then removing it will result in a very low magnitude vector even in cases where it is working well
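
A toy numeric version of that point, with made-up numbers and assuming the encoder outputs are simply concatenated (the vectors and their names are illustrative, not taken from any real model):

```python
import math

def norm(v):
    # Euclidean length of a vector.
    return math.sqrt(sum(x * x for x in v))

large_encoder = [8.0, -6.0]   # norm 10, stands in for the dominant encoder
small_encoder = [0.3, 0.4]    # norm 0.5, stands in for a weaker-scaled encoder

combined = large_encoder + small_encoder          # concatenation
ablated = [0.0, 0.0] + small_encoder              # dominant encoder zeroed out

print(f"combined norm: {norm(combined):.3f}")
print(f"ablated norm:  {norm(ablated):.3f}")
```

The ablated vector is about 20 times shorter, so the model sees an input far outside what it was trained on regardless of how useful the small encoder's content is.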

sharp glade
#

Hi quick question, regardless of price does work with stable diffusion benefit more from a 285K or a 9950X/X3D?

warm junco
sharp glade
#

but what difference does it make

sleek shuttle
sleek shuttle
abstract quarry
#

there is just no reason to use so many text encoders together. llama should be enough. Maybe also CLIP-L for some styling improvements

#

also they not only used llama, but they also added each individual llama layer

#

so they used 4 text encoders + 30 variants of the llama text encoder

#

this is just an insane waste of parameters X_x

sleek shuttle
#

this is fucking awesome out of context

#

"also they not only used llama, but they also added each individual llama layer" 😭 😭 😭

#

gimme a car with a llama layer

#

3 levels of protection

hasty badge
twin girder
#

hi

meager jasper
#

what do you guys use for image captioning? Joycaption? I am trying to generate a batch of prompts from images

sleek shuttle
#

ms paint

pulsar venture
#

Hi, do you know someone that could help me to fix comfyui, even paid?

atomic mortar
#

pls explain there whatsp

pulsar venture
#

but no one help me

abstract quarry
meager jasper
#

@abstract quarry what about on runpod? is a 4080 SUPER fast enough

main snow
#

i do too just use paint too lol, quickly add a speech buble or just type wtv

abstract quarry
#

yeah, think so. You have to quantize it a lot, though

#

llms are extremely fast as long as they run in gpu

#

it only gets a bit slow if you have to run in on cpu (or partly on cpu)

#

so you might use the 12b version. That should fit into your vram easily
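
Back-of-the-envelope numbers for why quantization matters here, assuming a 16 GB card and counting only the weights (context cache, activations and any vision encoder need extra room on top). The helper name is made up for this example:

```python
def weight_memory_gb(params_billion, bits_per_weight):
    # Memory for the weights alone: parameters * bits / 8 bytes, in GB (1e9 bytes).
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for bits in (16, 8, 4):
    print(f"12B model at {bits}-bit: ~{weight_memory_gb(12, bits):.0f} GB")
```

So a 12B model does not fit unquantized at 16-bit, is tight at 8-bit, and leaves plenty of headroom at 4-bit.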

sleek shuttle
#

SEAGULLS

gritty patio
#

Does (NVIDIA ONLY) on Pinokio mean I can only use an nvidia gpu on that AI?

deep axle
#

I build with make and stability.
Offer it as a service.
I build automations.
And open for jobs

topaz elbow
sharp ridge
#

Hello

topaz elbow
dawn sedge
#

how is stable diffusion on AMD gpu with windows 11?

desert dagger
pearl sand
#

Is the latest version of Flux_dev model possible in forge?

#

I am getting t5 state errors, and my PC crashes.

devout hull
plush hedge
pseudo robin
floral umbra
#

DAmnit, can't find much info at all on how lora training works at all for wan eugh

#

As i see people feed it videos, but it uses more vram, and don't know if i can just feed it videos as frames, or if it needs to be video files for it to work lol

sudden jewel
#

Hey y'all! Joined a few hours ago
I've been scouring through the internet for a more thorough understanding of the various samplers, which led me here, one of the biggest servers discussing diffusion image gen
For now, I've this question:
From what I searched, diffusion is basically solving ODEs, and there are many ways to do that. "Higher order" solvers achieve a higher accuracy at the cost of more computation per step, but should be able to converge to the answer in fewer steps total.
Then why is it that, say on SDXL, euler (first order) or dpm++ 2m (2nd order) can generate nice looking images in as few as ~20 steps, but stuff like dpm++ 3m (3rd order) or ipndm (4th order) needs >30 steps to get images without large chunks of artifacts?
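
A minimal sketch of the "higher order = more accurate per step" premise, on a toy ODE (dy/dt = -y) rather than an actual diffusion sampler. It compares a first-order Euler solver against a second-order Heun solver; the function names here are illustrative and not taken from any sampler library, and it does not try to explain the dpm++ 3m / ipndm behaviour asked about above.

```python
import math

def euler(f, y0, t0, t1, steps):
    """First-order solver: one evaluation of f per step."""
    h = (t1 - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        y += h * f(t, y)
        t += h
    return y

def heun(f, y0, t0, t1, steps):
    """Second-order solver: two evaluations of f per step."""
    h = (t1 - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h, y + h * k1)
        y += 0.5 * h * (k1 + k2)
        t += h
    return y

def decay(t, y):
    # Toy ODE dy/dt = -y with exact solution y(t) = exp(-t).
    return -y

exact = math.exp(-1.0)
for steps in (5, 10, 20):
    err_euler = abs(euler(decay, 1.0, 0.0, 1.0, steps) - exact)
    err_heun = abs(heun(decay, 1.0, 0.0, 1.0, steps) - exact)
    print(f"{steps:>2} steps  euler={err_euler:.2e}  heun={err_heun:.2e}")
```

On this smooth toy problem Heun with 5 steps (10 function evaluations) lands closer to the exact value than Euler with 10 steps (also 10 evaluations). Real samplers run on a learned, non-smooth model with a noise schedule, so this only illustrates the textbook claim.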

abstract quarry
#

with CTRL+M you can mute them, with CTRL+B you can bypass them

#

mute = the node is inactive
bypass = the node is inactive but will output its inputs without doing something

#

bypassing makes sense for stuff like lora loading or applying control nets where your input is a model and the output is a modified model. With bypassing the node still works but it just doesn't do anything

#

for disabling entire paths, it is sufficient to mute the output node at the end of a path. Comfyui will only run nodes whose outputs are used for something

#

so if a certain path ends in a "save image" or "image preview" node you can simply mute this image saving node and the whole path will not be run
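
A rough sketch of that idea as a toy graph walk. This is not ComfyUI's actual executor, just an illustration under the assumption that execution starts at output nodes and follows inputs backwards; the node names and the `nodes_to_run` helper are made up for the example, and it only models muting output nodes.

```python
def nodes_to_run(graph, output_nodes, muted=frozenset()):
    """Return the nodes reachable backwards from every unmuted output node.

    graph maps a node name to the list of node names it takes input from.
    """
    needed = set()
    stack = [n for n in output_nodes if n not in muted]
    while stack:
        node = stack.pop()
        if node in needed:
            continue
        needed.add(node)
        stack.extend(graph[node])
    return needed

# Hypothetical workflow with two paths that share the sampler.
graph = {
    "load_checkpoint": [],
    "prompt": ["load_checkpoint"],
    "sampler": ["load_checkpoint", "prompt"],
    "save_image": ["sampler"],
    "upscale": ["sampler"],
    "save_upscaled": ["upscale"],
}
outputs = ["save_image", "save_upscaled"]

print(sorted(nodes_to_run(graph, outputs)))
print(sorted(nodes_to_run(graph, outputs, muted={"save_upscaled"})))
```

With `save_upscaled` muted, `upscale` drops out of the set while the shared `sampler` still runs for `save_image`.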

cedar salmon
#

if you put things in groups you can right click the title bar and mute/bypass the whole group, much handy

fervent thunder
#

hey

fallow veldt
mortal bison
#

Sup

fervent thunder
#

I stopped using mutes or bypasses because sometimes nodes would have an effect anyway

#

they really weren't supposed to but its not rare to get nodes coded outside of the intended Comfy way

#

most common issue when this happens is what they call Monkey Patching

#

thought about making a node that only shifts the noodle traintracks but otherwise keeps it the same 🤔

#

it would be resilient to those bugs

nocturne sand
#

hey, anybody know where I need to go for help with hunyuanvideo?

tacit narwhal
#

Runninghub open source UNO ComfyUI plugin
https://github.com/HM-RunningHub/ComfyUI_RH_UNO
Functions and features
Support flux-dev-fp8 and flux-schnell-fp8
Support flux-dev and flux-schnell running bf16 on 24g gpu, everyone can use it on local 4090
Detailed bar
Real-time display to query and track
Local model loading. Does not force a huggingface download of the model, more friendly to the CN environment

spare jungle
#

thanks

deep axle
umbral lotus
#

Is there a location for feature suggestions/requests? 🤔

barren solar
#

How do I use this without Discord? I'm very confused. I'm paying for a membership but can't figure out how to use it.

eager current
#

How you guys do your controlnet models? Make preprocessor folders and put them in there? so a Controlnet/Canny/ for the canny models? and so on?

atomic mortar
rustic schooner
placid hatch
#

still no 9070xt support in rocm 6.4

dawn sedge
#

is there a way to always auto-detect size in automatic1111?

limpid shore
#

ZLUDA + experimental ROCm support

ruby garnet
#

Has anyone got Hidream working in comfyUI? I tried running it but it just wont work correctly. I left it on overnight only to see the generation was only 17% done in the morning which is insane. I thought maybe it had to finish downloading something but I get no errors when running it and task manager does show full utilization of my 4090.

steady hearth
#

Hello all

#

I am looking to learn some basics of stable diffusion, can anyone recommend some courses? I found a few on Udemy. At this time I am mostly interested in generating text to images for anime and fantasy style artworks for making a TCG

abstract oak
#

Hi, I am creating original PUZZLE GAME for PC, Mobile and Board game as well. I don't have Nvidia GPU and found to run SD on my PC difficult. Will you generate few pictures for me?? Result should be buildings in simple cartoon style for the game menu. Game itself is in 3D...

atomic mortar
#

To get a style and concept you're happy with, because if i were to hop on it would be longer than just a few

desert dagger
abstract oak
#

Thank you for the advice. I will try that. I already tried some cloud AI before, such as leonardo.ai, but it always generated an isometric view even when I prompted for a front view, not an isometric view. Bing/Create worked much better for me, but I miss having more controls there.

sweet flint
#

hi

#

anyone know any good AI that's free for a formal portrait for CV n stuff?

abstract quarry
#

how about using a real photo? I would never invite somebody to an interview if I find out his photo is AI generated

fervent thunder
#

putting an anime AI avatar on your CV is not recommended

fierce prism
#

what would be the best ui for me to use aside for comfy? it looks too complicated for me

sterile plinth
#

Anyone know if making stereoscopic 180 deg vr videos with Wan2.1 is possible?

pearl sand
#

Why tf is Flux so blurry agony

fervent thunder
#

not sure TBH

#

would have been better if not so blurry

pearl sand
#

for you 2?

fervent thunder
#

yeah

#

you can get a bit less blur if you put wide image

#

and low guidance

#

when I say low guidance I mean like 1.4

pearl sand
#

I'll try ...

proud yew
#

hi

sage cloud
#

gn

abstract quarry
#

it's an opinion from someone who used all four tools 🤷‍♂️

rustic schooner
#

some body is smoking something and they aint sharing it

hasty hornet
#

gonna respect ban speedrunners 🫡

rustic schooner
#

are you in need of a mental hospital

hasty hornet
#

although it seems like we can't report anymore, hm

#

nvm, we can, I'm blind

#

The application did not respond...interesting, interesting

rustic schooner
#

needs a new AI by Tesla

desert dagger
#

@vapid dove your bot is out of control

lunar sable
#

well hello there kind stranger, does thou want to striketh a deal

#

drugs

#

50cent per kilogram

#

croatia

#

tomorrow

#

ill pay travel costs

#

well then thats a deal!

#

great doing business

#

no prob dude

#

you gotta leave now tho or youll miss the flight

#

😉

#

dont check your walls

summer sedge
#

Is it just me or can't I work in and queue multiple tabs in Forge? And do I really need to zoom in all the way every time I need to place 1 pixel to extend the inpaint mask region for necessary context? Those are actually deal breakers for me. Can't really tell what advantages Forge has over A1111. Integrated extensions (controlnet etc) aren't really required. Went back to A1111...

drowsy timber
#

guys did anyone get the "Torch is not able to use GPU" error when installing sd?

#

zero coding experience btw

floral umbra
# drowsy timber zero coding experience btw

No coding experience needed. Just knowledge how to find incompatible modules.

Look in the cmd window.

Most likely you installed just "torch" and not torch with cuda compiled.

drowsy timber
#

so i redid everything following the correct guide

#

now i get a different error

floral umbra
#

Do pip uninstall torch torchaudio torchvision
Then go here, https://pytorch.org/, scroll down, select cuda 12.6, copy the pip command, and paste onto your cmd
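
A small check you can run afterwards, as a sketch: it only uses `torch.version.cuda`, `torch.cuda.is_available()` and `torch.cuda.get_device_name()`, and the `describe_torch` helper name is made up here. Run it with the same Python (venv) that the web UI uses, otherwise it reports on a different install.

```python
def describe_torch():
    """Report whether the installed torch build can see a CUDA GPU."""
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"
    cuda_build = torch.version.cuda  # None when torch was built without CUDA
    if cuda_build is None:
        return f"torch {torch.__version__} was built without CUDA (CPU-only or non-CUDA build)"
    if not torch.cuda.is_available():
        return f"torch {torch.__version__} (CUDA {cuda_build}) found no usable GPU"
    name = torch.cuda.get_device_name(0)
    return f"torch {torch.__version__} (CUDA {cuda_build}) sees {name}"

print(describe_torch())
```

If it reports a build without CUDA, the plain "torch" wheel got installed and the reinstall step above is the likely fix on an NVIDIA card.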

#

~~ i have to uninstall mine constantly because it overwrites the torch dev i use~~ lol

drowsy timber
#

cant post images in here

floral umbra
#

Nope, but you can in the channel right below

drowsy timber
#

rocBLAS error: Cannot read C:\Program Files\AMD\ROCm\6.2\bin/rocblas/library/TensileLibrary.dat: No such file or directory for GPU arch : gfx1031

rocBLAS error: Could not initialize Tensile host:
regex_error(error_backref): The expression contained an invalid back reference.

#

that's the error i get

#

yeah im already in tech support

iron jasper
#

hey

deep axle
cosmic barn
#

Hi

ocean garden
#

hello

floral umbra
#

ahh.. heck.. 128 rank lora training, 25 epochs, one save per epoch, 1.2GB per lora, and output folder for this round is now at 86GB tech_jesus

fervent thunder
#

wtf

steel frost
#

anyone know if Hidream is more or less graphic intensive than flux?

#

nevermind

fervent thunder
#

more intensive

#

it rly needs SVDQuant or some other method

#

I mean you could stick it in Quanto and ask for Int4 and see what comes out the other side 🤔

upper plinth
#

duckers to me it got so bad I bought a whole SSD dedicated to AI stuff

#

These models and checkpoints and pytorches and all this crap takes up massive amounts of storage space

lunar sable
#

HOY

rain yew
#

hey my friends

atomic mortar
#

but ngl i could do with less if i cleaned up once in a while

upper plinth
#

same im just way too lazy

#

but hey if my motherboard came with 4 M.2 slots I might as well use them and get my money's worth.

atomic mortar
#

hell yeah, im still waiting for video ai opensource to be a bit more stable before i tackle on another nvme

upper plinth
#

local video ai is a fantasy

#

have you seen sora's newest video gens

atomic mortar
#

Hmm wan is pretty decent if you do small stuff

upper plinth
#

no way you can run that without tens of thousands of dollars worth of GPU power

atomic mortar
#

Hm for now

#

One day

#

We said the same thing about image gen and llm's

upper plinth
#

true

atomic mortar
#

And ngl, simple animation is really doable in wan already

upper plinth
#

wan?

#

and can you do nsfw?

atomic mortar
#

Yeah lol

upper plinth
#

I wanna see examples, last I used was AnimatedDiff

#

which was kinda midd

atomic mortar
#

Hmm check on civitAI, videos tab > wan2.1

#

Or models > sort by wan

upper plinth
#

can you run that shit on a 5080?

atomic mortar
#

Yes! I run the 480p model no problem

upper plinth
#

whoa. Im gonna try it

atomic mortar
#

I could try the 720p but eh the 480p takes 8-9min

upper plinth
#

what GPU u got

atomic mortar
#

For 81frames

#

5080 lmao

upper plinth
#

ooh

atomic mortar
#

I have to say i run it in swarm & not the nightly comfyui version

#

I messed in some torch files to get it to work

upper plinth
#

did you have to juryrig your swarm to work with 50 series?

#

I see

atomic mortar
#

Hmm but theres a guide in the swarm discord on how to get it to work

upper plinth
#

textbook case of dependency hell with this whole compatibility stuff

#

They usually do work but theres always this one mf component in my case flash-attention that isnt yet supported or requires some ridiculous workaround

atomic mortar
#

Oh if you can wait till tomorrow after my work (it's 00:03 rn, i'm home at 6pm ish) i can send you my comfy portable

#

Oh i dont use teacache or flash attention

#

But my comfy works straight out of the box, it gives you an error you gotta click ok on though because torch audio is broken due to my messing around

upper plinth
#

I use flash-attention for LoRA and Dreambooth

atomic mortar
#

Waiting till swarm offically supports it before i update

#

Ohh lora training

upper plinth
#

I dont use image gen without being able to train my own subjects

#

I hate genning randoms

atomic mortar
#

I mostly make known Characters

#

Dreambooth for lora training eh?, that works on 5080's?

#

Or not yet

upper plinth
#

There are people on github claiming to get it to work

atomic mortar
#

Hmm i got it mostly working on kohya_ss

upper plinth
#

and then you go into the flashattention/xformer dev forums and they say there is not

#

I havent used kohya in a while but last time I checked dreambooth sucked ass for it, I only used it for loras

atomic mortar
#

I had to get a nightly version of torch, torch attention etc

upper plinth
#

And I understand not everyone likes dreambooth, but once you do a Dreambooth+Lora of a character you never go back. It's literally perfection

atomic mortar
#

And messing in the configs was annoying because it kept replacing files here n there

upper plinth
#

yup

#

its nightmarish

atomic mortar
#

For now ill stick with civitAI trainer lmao

upper plinth
#

free?

#

and does it do dreambooth?

atomic mortar
#

Nah just loras

#

Never used dreambooth

upper plinth
#

for 1.5 it was heaven

#

I dont recommend dreambooths for anything beyond 1.5 bc as far as I know they never worked well

atomic mortar
#

Ahh I'm a illustrious main

#

5s for a image with a small lora

upper plinth
#

Im old-school. 1.5 still holds up for photorealism all these years later.

#

For anything other than photoreal you got Pony, XL etc

atomic mortar
#

Less then a second for 1.5 gens with no loras

upper plinth
#

When it comes to waiting 1.5secs vs. 5 secs it makes no difference to me

atomic mortar
#

Flux also gets me 6s gens

upper plinth
#

Im more about quality over quantity

#

If I can't train it, it's garbage - thats my motto

atomic mortar
#

Same ngl but running refiners etc at a decent speed is nice

#

Went from 27 steps to 40+100 refinement steps and still come out on top in speed

upper plinth
#

it is nice if im making a logo or concept art

drowsy timber
#

is there a fix for lora and checkpoint previews not showing?

gleaming spire
mortal shuttle
#

Hi! I'm looking for a community where people actively train their own models — like LoRA, DreamBooth, or full fine-tunes using kohya_ss or SDXL.
I want to learn from others who are doing real training, sharing configs, logs, tips, and maybe even datasets.
Is this the right place for that, or can you point me somewhere more focused on workflows?

fervent thunder
#

but uh
I would recommend reading arxiv instead if you want to keep up on methods
there is like an entire line of research that goes through stuff like AlignProp and Adjoint Matching that the online training community didn't really adopt

fervent owl
fervent thunder
#

could you switch to comfyui?

#

I don't know forge but there are forge people here who could help if you switched to forge also

mortal shuttle
fervent thunder
#

AI Toolkit discord and Onetrainer discord are good

#

simple tuner discord is the best but might have stopped taking new members

#

huggingface discord does cover these topics but tends to be low on activity for these topics

#

I'm not actually aware of a kohya discord, this feels odd cos it's the most popular trainer

abstract quarry
#

I only used kohya and simple tuner so far. For simple tuner I'm not sure if it is supported for windows, though

fervent thunder
#

they get categorised as preference optimisation methods, and you are right that that is what they are used for most of the time
but the reward model can be anything, it doesn't have to be trained on human preference
as a dumb but funny example they used a classifier that counts strawberries as the reward model
and the resulting fine tune would fill every image with strawberries
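
A toy sketch of why that happens, assuming a made-up "generator" where each of 16 image cells has its own probability of containing a strawberry, and the reward is the expected strawberry count. This is plain gradient ascent on that reward, not AlignProp or Adjoint Matching; every name in it is hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def finetune_on_reward(logits, steps=200, lr=1.0):
    """Gradient ascent on reward = expected strawberry count = sum(sigmoid(z))."""
    logits = list(logits)
    for _ in range(steps):
        for i, z in enumerate(logits):
            p = sigmoid(z)
            logits[i] = z + lr * p * (1.0 - p)  # d(reward)/dz for this cell
    return logits

# Hypothetical starting point: each cell has a 5% chance of a strawberry.
start = [math.log(0.05 / 0.95)] * 16
tuned = finetune_on_reward(start)

print("expected strawberries before:", round(sum(sigmoid(z) for z in start), 2))
print("expected strawberries after: ", round(sum(sigmoid(z) for z in tuned), 2))
```

Nothing in the reward says "only where strawberries make sense", so the optimum is simply every cell filled, which is the same failure the strawberry fine tune showed.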

abstract quarry
#

I can only tell you: don't listen too much to youtube videos that tell you "the perfect parameters" or "the only way to finetune". There is one particular guy, for example, who spammed a lot of videos and he doesn't really have a clue what he is talking about ^^°

#

in the end you have to experiment a lot

#

cause the result very much depends on your training data

fervent thunder
#

ye I haven't watched AI youtube for about 6 months

#

I got tired of the mind blown emoji

abstract quarry
#

and parameters might change with every training data 🤷‍♂️ Although, I would argue that the data itself is much more important than your settings anyways

fervent thunder
#

this one 🤯

#

on youtube everything has to be 🤯

abstract quarry
fervent thunder
#

apparently Meta just grid searches params with brute force

#

there's not enough reason to avoid grid search yet
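
For anyone who hasn't seen it, grid search is just "try every combination and keep the best". A minimal sketch, where `fake_score` is a hypothetical stand-in for "train with these settings and evaluate", and the parameter names and values are made up:

```python
from itertools import product

def grid_search(score_fn, grid):
    """Try every combination in grid and return (best_params, best_score)."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical stand-in for a real training run plus evaluation.
def fake_score(learning_rate, rank):
    return -abs(learning_rate - 1e-4) * 1e4 - abs(rank - 32) / 32

grid = {"learning_rate": [5e-5, 1e-4, 2e-4], "rank": [16, 32, 64]}
best_params, best_score = grid_search(fake_score, grid)
print(best_params)
```

The cost is the product of the list lengths (9 runs here), which is why it only works when you have the compute to brute force it.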

#

Blender youtube is also bad as I found out this month

abstract quarry
fervent thunder
#

ah yea it can't add a new subject matter that is true

#

I read that at the moment the LLMs or diffusion models outsmart the reward models eventually every time as well

#

"reward hacking"

#

like the famous "aesthetic" reward model gets hacked by making the images brown with wavy lines

verbal island
#

gm

fervent thunder
#

hi

#

maybe I am overselling reward training TBH
it works really well sometimes like the SPO loras for SD 1.5, they really transform that model

shell falcon
#

Hello !

patent remnant
#

hi which robot is available to create image now

pearl sand
#

When you download all these things for SDXL get it setup then you find out SORA...

#

generations are way better and it understands what I want

#

highly recommend

abstract quarry
#

sdxl is an old and much smaller model 🤷‍♂️

fervent thunder
#

you have the weights of SDXL and you don't have the weights of SORA

#

given enough compute you can make a better image with SDXL

pearl sand
fervent thunder
#

I'm not rly a fan of prompting anyway

#

I think its better to split an image up

#

into tiles or segments
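
A sketch of the tile part, assuming square tiles with a fixed overlap so seams can be blended afterwards. The `tile_boxes` helper and the sizes are made up for illustration; it only computes the crop boxes and leaves generation and blending out.

```python
def tile_boxes(width, height, tile, overlap):
    """Return (left, top, right, bottom) boxes that cover the whole image."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be non-negative and smaller than tile")
    stride = tile - overlap

    def starts(size):
        if size <= tile:
            return [0]
        positions = list(range(0, size - tile, stride))
        positions.append(size - tile)  # last tile sits flush with the edge
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]

for box in tile_boxes(2048, 1024, tile=1024, overlap=128):
    print(box)
```

Each box can then be cropped, processed on its own, and pasted back, with the overlap region used to hide the seams.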

trail gust
#

Hello party people

clever river
#

hello

fervent thunder
#

Hi

mellow meteor
#

huh

verbal raven
#

hello

glacial glade
#

not sure if this is the right channel to ask, but does openpose work with forgeUI? I've been trying to get it to do something but the generation tends to completely ignore the openpose controlmap

late sonnet
#

Hey just checking out SD.

forest panther
#

Hello everyone

sage reef
lean field
#

hey guys, what is the best chat for me to go in to ask for help?

steep swift
#

Hi, I'm new here in Discord. I create visual art and lo-fi hiphop without using AI as well. I've been using Stable Diffusion since the beginning of this year and I am still a newcomer in this field. Nice to meet you all! 🙂

lean field
#

hey guys, can someone suggest some easy youtube videos to help me learn stable diffusion please

floral umbra
#

@vapid dove Spammer above

floral umbra
#

Also, when one reinstalls windows, i guess the best course of action is to wipe venv to avoid hitting previous windows install bugs and incompatibility Thunk

glacial glade
#

I asked one question yesterday and got 4 people dming me a link to a server telling me to create a ticket for my query, is that normal

#

well I just found out the name is blocked by automod, so that might be somewhat telling

wet grotto
#

guys, do you have another site like Vast.AI for me to run SD? I'm having problems adding extensions on Vast, started last week

warm junco
haughty smelt
#

where do i download

atomic mortar
#

Before you download it's handy to know before you do

#

To see if you can run it or not

wet grotto
#

I will show you

haughty smelt
#

not much but my bday is a month away i told my mom so its a start

atomic mortar
haughty smelt
#

oh ok

atomic mortar
haughty smelt
#

swarmUI got it thanks

atomic mortar
#

But swarm is pretty beginner friendly and supports most image gen models

#

After that you might wanna hop over to civitAI and check the models tab, look for a checkpoint that looks nice for what you wanna make