#🆕｜sd3 | Stable Diffusion | Page 129

lavish sparrow Dec 23, 2024, 8:01 PM

#

sage burrow Dec 23, 2024, 10:30 PM

#

It does create extremely good loras fortunately 😉

craggy crest Dec 23, 2024, 11:40 PM

#

#

#

#

#

#

#

mortal mesa Dec 23, 2024, 11:53 PM

#

looks good, it captures that three-dimensional, made-of-something look

dull star Dec 23, 2024, 11:56 PM

#

craggy crest

I wonder which surgery is tomorrow

craggy crest Dec 24, 2024, 12:55 AM

#

civic trail Dec 24, 2024, 2:37 PM

#

civic trail Dec 24, 2024, 6:57 PM

#

craggy crest Dec 25, 2024, 4:11 AM

#

muted onyx Dec 25, 2024, 6:57 AM

#

I do respect your taste, but is there a fine tuned model for more realistic human

sage burrow Dec 25, 2024, 9:36 AM

#

sage burrow Dec 25, 2024, 12:28 PM

#

lost birch Dec 25, 2024, 12:48 PM

#

Live-action version of Crayon Shin-chan

rapid pivot Dec 25, 2024, 4:23 PM

#

sage burrow

sadcat

#

Hello beeeeckyyy been a while

craggy crest Dec 25, 2024, 4:35 PM

#

muted onyx I do respect your taste, but is there a fine tuned model for more realistic huma...

sure. use DucHaiten's pony models. pony no score is probably your best bet

civic trail Dec 26, 2024, 2:32 PM

#

craggy crest Dec 26, 2024, 5:32 PM

#

fleet meteor Dec 26, 2024, 7:31 PM

#

#

almost

hallow lion Dec 26, 2024, 9:37 PM

#

fleet meteor

an effort was made

sage burrow Dec 27, 2024, 2:35 PM

#

rapid pivot Hello beeeeckyyy been a while

@rapid pivot @amin_06894 it has! Did you get your vram yet? lololol

rapid pivot Dec 27, 2024, 2:38 PM

#

sage burrow @rapid pivot @amin_06894 it has! Did you get your vram yet? lololol

sadcat if only Santa were real

sage burrow Dec 27, 2024, 3:06 PM

#

Is it just me, or does Hunyuan Video do better hands than most still image creators?

sage burrow Dec 27, 2024, 3:06 PM

#

rapid pivot sadcat if only Santa were real

I just use Mage lol

#

also sdxl on my own system

proven lantern Dec 28, 2024, 1:45 AM

#

cool

sage burrow Dec 28, 2024, 2:08 AM

#

rapid pivot sadcat if only Santa were real

craggy crest Dec 28, 2024, 3:21 AM

#

craggy crest Dec 28, 2024, 3:49 AM

#

#

rapid pivot Dec 28, 2024, 8:28 AM

#

sage burrow

I regret everything I said

#

agony

muted dove Dec 28, 2024, 9:56 AM

#

rapid pivot sadcat if only Santa were real

https://tenor.com/view/stoning-stone-monty-python-thot-life-of-brian-gif-12291021

Tenor

craggy crest Dec 28, 2024, 6:23 PM

#

bitter hearth Dec 29, 2024, 8:20 AM

#

it is so hard to learn SD

kindred stone Dec 29, 2024, 8:29 AM

#

sage burrow

craggy crest Dec 29, 2024, 8:34 AM

#

civic trail Dec 29, 2024, 9:59 PM

#

limpid thunderBOT Dec 29, 2024, 11:45 PM

#

Thank you for using comcom analytics.
"comcom analytics" supports all community managers (moderators and server owners) by stats, visualization, and analytics.

If you have any questions, feel free to ask us!
Your dashboard
Help
Support server

Other languages
en: help
ja: help Japanese

wanton slate Dec 30, 2024, 12:48 AM

#

"Couple holding hands on rural hilltop, watching apocalyptic sky filled with violent aurora borealis and magnetic storms, burning city in background, windswept landscape, dramatic lighting, 8K, photorealistic, cinematic framing"

#

doh

odd notch Dec 30, 2024, 3:24 AM

#

MGM Grand Las Vegas on 11 hectares land area designed by Veldon Simpson, capturing the entire edifice in a single shot from a distance of 100 meters, through a soft-focus lens, bathed in warm Sunlight, modern architecture, rtx lighting, cloudy sky

#

"MGM Grand Las Vegas on 11 hectares land area designed by Veldon Simpson, capturing the entire edifice in a single shot from a distance of 100 meters, through a soft-focus lens, bathed in warm Sunlight, modern architecture, rtx lighting, cloudy sky"

#

help

lavish sparrow Dec 30, 2024, 2:55 PM

#

lavish sparrow Dec 30, 2024, 4:14 PM

#

#

lavish sparrow Dec 30, 2024, 5:16 PM

#

#

#

lavish sparrow Dec 30, 2024, 11:16 PM

#

lavish sparrow Dec 30, 2024, 11:37 PM

#

#

"Lord of the API's" -> I like the wifi staff

muted dove Dec 31, 2024, 3:40 PM

#

#

#

muted dove Dec 31, 2024, 4:08 PM

#

#

#

civic trail Dec 31, 2024, 5:27 PM

#

young blade Jan 1, 2025, 4:04 AM

#

craggy crest Jan 1, 2025, 5:07 AM

#

neon imp Jan 1, 2025, 4:27 PM

#

I'll post my full findings soon, along with the relevant code additions for ai-toolkit and kohya_ss, but I am fairly certain I've discovered mass-scale misalignment of the text encoders across the most popular training tools. Here are some before/after tests from multiple character and style LoRAs, with exactly the same settings aside from the added parameters that ensure proper alignment of the text encoders with the u-net. I know this is a large claim with huge implications. That said, I would not be sharing it if I did not 100% believe it to be true.

bitter hearth Jan 1, 2025, 5:14 PM

#

neon imp Posting my full findings soon and the relevant additions of code for ai-toolkit ...

will this affect flux

neon imp Jan 1, 2025, 5:27 PM

#

bitter hearth will this affect flux

Yes. As a matter of fact, the bottom two rows on the first image are examples of improved training stability with Flux Dev.

While the other images highlight the more drastic improvements to SD3.5 Large training as a whole.

Going to test 3.5 medium and Schnell next but I need to finish documenting and get this fix out to the community today.

bitter hearth Jan 1, 2025, 5:38 PM

#

okay thanks

craggy crest Jan 1, 2025, 7:24 PM

#

neon imp Posting my full findings soon and the relevant additions of code for ai-toolkit ...

the text encoders aren't even in sync with themselves

#

neon imp Jan 1, 2025, 7:28 PM

#

craggy crest the text encoders aren't even in sync with themselves

Very true. While StabilityAI did document the settings in the config for each of the three text encoders when the model was first posted, it seems to have been overlooked by the creators of these training scripts.

craggy crest Jan 1, 2025, 7:30 PM

#

neon imp Very true, while StabilityAI did document the text encoders in the [config for e...

trust me, it wasn't overlooked. It's far more likely that training an encoder is expensive and they just used what was there

#

#

neon imp Jan 1, 2025, 7:41 PM

#

craggy crest trust me, it wasn't overlooked. It's far more likely an issue of training an enc...

You would be shocked. I am not training the text encoders at all. I am defining their parameters for proper alignment between the text encoders and the u-net. There is no noticeable difference in compute resources. Style LoRA training starts to take at lower step counts, and there are clear improvements, with far less deformed features and better color depth.

From my tests this seems to be a universal misalignment issue. The results hold across various character and style LoRAs at different ranks, double-checked with both ai-toolkit and kohya_ss as well as 3.5L and Flux Dev.

craggy crest Jan 1, 2025, 7:43 PM

#

neon imp You would be shocked. I am not training the text encoders at all. I am defining ...

part of it might just be ai-toolkit and kohya_ss - have you done the same tests with luca's dreambooth trainer?

#

he has a dreambooth for flux, and he has one for sd 3.5 large

neon imp Jan 1, 2025, 7:49 PM

#

craggy crest part of it might just be ai-toolkit and kohya_ss - have you done the same tests ...

I have never heard of luca's dreambooth trainer, and Googling did not provide any concrete results. Is that the name of the GitHub author?

craggy crest Jan 1, 2025, 7:51 PM

#

neon imp I have never heard of `luca's dreambooth trainer`, and Googling did not provide any co...

oh boy do i have a bunch of things for you to explore :) https://replicate.com/lucataco this is his main repo on replicate. he's got all SORTs of stuff there, including both of his dreambooth trainers

#

just scroll all the way to the bottom and slowly scroll back up, he's got tons of stuff

#

click on anything, and then look across the top, you'll find a link to it on his github repo

neon imp Jan 1, 2025, 7:59 PM

#

The full code doesn't seem to be shown and runs through a paywalled api.

It is possible that this or any other individual user is taking the extra effort to define these parameters. I do think this is a known issue, and if it is a known issue, some are keeping it secret behind paywalls.

That fundamentally goes against my personal views on the technology as a whole.

craggy crest Jan 1, 2025, 7:59 PM

#

neon imp The full code doesn't seem to be shown and runs through a paywalled api. While...

just click on the github icon on the top of the page and go to his github repo

#

the entire purpose of this is to make sure whether the issue is the trainers - ai-toolkit and kohya_ss - or if it's something else.

neon imp Jan 1, 2025, 8:06 PM

#

craggy crest the entire purpose of this is to make sure whether the issue is the trainers - a...

Yes, I understand. Sorry for the confusion. I mostly avoid commercial services such as Replicate and was not aware it was also posted on GitHub. I will run some tests, but I do not see it taken into account in the code itself.

craggy crest Jan 1, 2025, 8:08 PM

#

neon imp Yes, I understand. Sorry for the confusion I mostly avoid commercial services su...

dreambooth is stability.AI's trainer - so you should be able to see if it's the 3rd party scripts, or if it's something else.

#

neon imp Jan 1, 2025, 8:17 PM

#

craggy crest dreambooth is stability.AI's trainer - so you should be able to see if it's the ...

Fully aware of dreambooth. Just haven't run it since SD1.5. Thank you, I will look into whether this was implemented and, if not, make the required adjustments and re-run some tests. Still going to publish my findings so far today.

craggy crest Jan 1, 2025, 8:23 PM

#

neon imp Fully aware of dreambooth. Just haven't ran it since SD1.5. Thank you, will look...

i've been using luca's implementations of it for the loras i train, it works well - so it should be a good way to test what you're seeing with the other scripts

#

#

lavish sparrow Jan 1, 2025, 9:26 PM

#

toxic bone Jan 1, 2025, 10:08 PM

#

craggy crest dreambooth is stability.AI's trainer - so you should be able to see if it's the ...

dreambooth was actually published by Google. They've been reluctant to publish open source research since then because of what was done with the techniques they demonstrated

#

https://dreambooth.github.io/

craggy crest Jan 1, 2025, 10:09 PM

#

the idea is for him to run his tests on something other than just the two scripts that seem to have issues, and to determine whether it's those trainers or something else is going on. for that, dreambooth is a good option

toxic bone Jan 1, 2025, 10:12 PM

#

I don't think reference code for it was ever published by stability ai either. It was just a community member that published code to github. Huggingface has been the biggest source for reference code afaik

craggy crest Jan 1, 2025, 10:23 PM

#

toxic bone I don't think reference code for it was ever published by stability ai either. ...

https://github.com/JoePenna/Dreambooth-Stable-Diffusion/tree/main

GitHub

GitHub - JoePenna/Dreambooth-Stable-Diffusion: Implementation of Dr...

Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) by way of Textual Inversion (https://arxiv.org/abs/2208.01618) for Stable Diffusion (https://arxiv.org/abs/2112.10752). Tweaks focuse...

toxic bone Jan 1, 2025, 10:26 PM

#

a few points on your Google-fu "gotcha" attempt.

1) is not reference code
2) is not maintained by stability ai
3) was published before Penna was hired by Stability
4) Joe don't work here no mo.

Hope i don't get banned for disagreeing with you on something you're wrong about.

Dreambooth is probably not even what the guy should use. LoRA would be a better approach for their needs. But i'm not commenting towards that. I just correct things when I know better

craggy crest Jan 1, 2025, 10:48 PM

#

toxic bone few points on your gooogle fu "gotcha" attempt. 1) is not reference code 2) i...

all i did was give you another repo to look at, where someone else had done some implementation, in case you might have been interested. cripes you have a bad attitude. and if you look at what he's doing, he's testing trainers.

toxic bone Jan 1, 2025, 11:18 PM

#

history informs my interactions with you. In the past, while you were moderator of /r/stablediffusion, you kicked me from that server after similar disagreements here, on another server. You remember that don't you?

#

If i have a bad attitude maybe you should ask yourself "what have I done to this person?"

craggy crest Jan 2, 2025, 12:05 AM

#

toxic bone history informs my interactions with you. In the past, while you were moderator...

you earned everything that's ever happened to you - and probably a whole lot more.

#

don't start. i'm done talking to you.

toxic bone Jan 2, 2025, 12:06 AM

#

oh okay. So you're sociopathic and can't recognize that others disagreed with your arbitrary kicks, leading you to no longer be a moderator of that server or subreddit. but alright. got it.

craggy crest Jan 2, 2025, 12:06 AM

#

assume whatever you like.

toxic bone Jan 2, 2025, 12:07 AM

#

not an assumption. conversation with the admins of that server and subreddit confirmed why you were removed. because of kicking me all those times for no reason.

craggy crest Jan 2, 2025, 12:11 AM

#

toxic bone not an assumption. conversation with the admins of that server and subreddit con...

there were valid reasons. however, assume whatever you like.

#

and just keep on posting

toxic bone Jan 2, 2025, 12:13 AM

#

"valid reasons" being we had disagreed about something mundane on here and i usually back it up with facts.

when i asked the other mod there why i had been kicked repeatedly from that other server he looked into it and found no valid reason. I imagine when he asked you, you told him some sing song story and they disagreed with your validation. I doubt you told him "he was arguing with me on another discord server".

craggy crest Jan 2, 2025, 12:19 AM

#

toxic bone "valid reasons" being we had disagreed about something mundane on here and i usu...

depends on which 'admin' you're referring to - no, there was no discussion

#

toxic bone Jan 2, 2025, 12:26 AM

#

@viral plaza pinging you sorry. ... but... i mean... seriously.
https://discord.com/channels/1031106063837184021/1308975746529890344
This topic was the last time i got kicked from the /r/stable server. None of the kicks were ever explained to me. I only found out because i noticed the server icon was no longer in my list.

I dont like being gaslit like this didn't happen so i have to address it.

Another community i'm part of had a member talk to sandcheezy about you which was illuminating as well.

craggy crest Jan 2, 2025, 12:28 AM

#

always nice to know people are spreading lies behind other people's backs. you do realize that alex rarely pays attention to any discord but his own? he's somewhat busy, you should try pinging him there.

#

#

#

#

toxic bone Jan 2, 2025, 12:36 AM

#

craggy crest always nice to know people are spreading lies behind other people's backs. you d...

I did. Thats why you got removed as moderator from their community. ❤️

craggy crest Jan 2, 2025, 2:47 AM

#

#

#

#

#

#

lavish sparrow Jan 2, 2025, 9:33 AM

#

turbid grotto Jan 2, 2025, 10:35 AM

#

lol, sd3.5 large lora easily training on 12gb gpu

#

under 8gb vram at 1024 with offloading 0.5
7.70s/it which is almost 3 times faster than flux

#

but it might converge slower, or I don't have correct settings yet

#

I am training at lr 0.002 💀
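To put the reported speed in perspective, here is a rough back-of-the-envelope sketch. The 7.70 s/it figure comes from the message above; the 3x factor rounds up "almost 3 times faster", and the step count is a purely hypothetical value chosen for illustration.

```python
# Rough wall-clock estimate from the reported iteration speed.
SECONDS_PER_IT_SD35 = 7.70  # reported above for the SD3.5 Large LoRA run
FLUX_SLOWDOWN = 3.0         # assumption: "almost 3 times faster" rounded to 3x
STEPS = 1000                # hypothetical step count, not from the chat


def training_hours(seconds_per_it: float, steps: int) -> float:
    """Convert a per-iteration time into total training hours."""
    return seconds_per_it * steps / 3600


sd35_hours = training_hours(SECONDS_PER_IT_SD35, STEPS)
flux_hours = training_hours(SECONDS_PER_IT_SD35 * FLUX_SLOWDOWN, STEPS)

print(f"SD3.5 Large: {sd35_hours:.2f} h, Flux (implied): {flux_hours:.2f} h")
```

A faster iteration only helps if convergence does not slow down by the same factor, which is the caveat raised in the message above.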

viral plaza Jan 2, 2025, 11:14 AM

#

toxic bone @viral plaza pinging you sorry. ... but... i mean... seriously. https:...

Since crystal wants to gaslight you about it here I'll go ahead and post for you and anyone else that cares:

Crystalwizard not only silently kicked nuuideas from the r/sd discord repeatedly, but deleted the logs of having done so from our internal mod logs. I only discovered this when nuuideas asked about it and I dug through the discord audit logs and managed to find this out. I asked about it at a time when crystal was otherwise active, they didn't reply within the span of about a day, and I spoke with the other mods and we mutually agreed to remove crystal from the team, as not only was this far from the first issue with their activity as a moderator, but the fact that they were deleting data from mod logs also indicated that past reported incidents we had no proof of were potentially true as well. After removal, Crystal left not only r/SD but other discords as well, without ever saying anything, at least to me. I think they spoke to cheeze at one point after?

(Also to be clear, no, as best I can tell, crystal had no valid reason to kick nuuideas at all, they just had a disagreement on some random technical point and crystal would rather exert authority than let themself lose an argument on the internet or something)

desert garnet Jan 2, 2025, 1:25 PM

#

that what happens when u add crazy ppl as mods

dry wave Jan 2, 2025, 1:46 PM

#

doesn't surprise me and probably nobody else who read messages from crystalwizard 😂

craggy crest Jan 2, 2025, 4:44 PM

#

mortal mesa Jan 2, 2025, 8:27 PM

#

name change incoming, the truth is spreading

hallow lion Jan 2, 2025, 8:35 PM

#

He mistreated so many people including me.

mortal mesa Jan 2, 2025, 8:46 PM

#

don't you know its assault if you disagree or have factual information

toxic bone Jan 2, 2025, 8:51 PM

#

viral plaza Since crystal wants to gaslight you about it here I'll go ahead and post for you...

Thank you for taking time to address this. It seems a little insane that he's in full on denial mode.

happy 2025. 🥳

hallow lion Jan 2, 2025, 8:57 PM

#

I wish discord would literally just make a person completely invisible for you when you block them. They did send me a survey asking if I like the block system on here and I suggested it. Hopefully this is a much needed change they will implement soon. I think most people would prefer if blocks worked this way.

#

He probably has the record tho on being blocked by most people. :))

mortal mesa Jan 2, 2025, 8:58 PM

#

by far

toxic bone Jan 2, 2025, 9:00 PM

#

i've always had issues blocking people. can't follow conversations. him specifically tends to be very sycophantic , so many will engage with him when he has his behavior facade up. I just give people notes. I wish the notes would show next to someone's name though. Or could give people custom colors.

hallow lion Jan 2, 2025, 9:05 PM

#

He is strange... sometimes he acts almost normal but then has this other side. There is something going on with him for sure.

toxic bone Jan 2, 2025, 9:06 PM

#

It's rude to talk about someone in the 3rd person when they are right there. I just wanted to point that out. Not a criticism. 😉

hallow lion Jan 2, 2025, 9:10 PM

#

probably unrelated but where is 4GB VRAM cat!???? XD

#

(for the new guys who don't know, there used to be an active user here named "Cat with 4GB Vram (send help)")

mortal mesa Jan 2, 2025, 9:59 PM

#

@bitter hearth allo

foggy cloak Jan 2, 2025, 10:09 PM

#

What's everyone's go to flux model?

bitter hearth Jan 2, 2025, 10:11 PM

#

base FP8 with turbo lora

mortal mesa Jan 2, 2025, 10:11 PM

#

shuttle 3

lavish sparrow Jan 2, 2025, 11:21 PM

#

hallow lion probably unrelated but where is 4GB VRAM cat!???? XD

this i can answer for you tho

#

he lost his account 😢

#

he goes by @rapid pivot now.

#

hallow lion Jan 2, 2025, 11:26 PM

#

awww

rapid pivot Jan 3, 2025, 12:41 AM

#

hallow lion awww

sadcat

rapid pivot Jan 3, 2025, 12:43 AM

#

mortal mesa @bitter hearth allo

sadcat wa

#

#

hallow lion Jan 3, 2025, 12:53 AM

#

rapid pivot

You're alive. ^_^

#

sad about your other account then.

#

catwhaaa

#

I guess the 4GB cat didn't get their VRAM. :/

rapid pivot Jan 3, 2025, 12:55 AM

#

It only gets worse as time goes by

#

agony

desert garnet Jan 3, 2025, 2:03 AM

#

rapid pivot It only gets worse as time goes by

why u lost your acc? sadcat

hallow lion Jan 3, 2025, 2:15 AM

#

oh hai steven segal

desert garnet Jan 3, 2025, 2:17 AM

#

hai there

toxic bone Jan 3, 2025, 2:35 AM

#

desert garnet why u lost your acc? sadcat

Probably the same reason that i got kicked repeatedly from other servers. He argued some technical stuff with someone who has an army of sockpuppets ready to report them

desert garnet Jan 3, 2025, 2:43 AM

#

toxic bone Probably the same reason that i got kicked repeatedly from other servers. He ar...

still safe here aint no mfking redditor can touch me

#

https://tenor.com/view/reddit-soylent-soy-soyface-nerd-gif-11917812235578842485

Tenor

toxic bone Jan 3, 2025, 2:45 AM

#

desert garnet still safe here aint no mfking redditor can touch me

i quit reddit long ago. also quit x too. not sure where to find good info feeds anymore. I still use those sites but i dropped the accounts

desert garnet Jan 3, 2025, 2:53 AM

#

toxic bone i quit reddit long ago. also quit x too. not sure where to find good info feeds...

i just use 4chan,better guides there

toxic bone Jan 3, 2025, 3:01 AM

#

i've never had any idea how to browse 4chan. its like in no order at all

bitter hearth Jan 3, 2025, 3:04 AM

#

yeah IDK how to navigate 4chan

#

apparently there is a lot of diffusion stuff on there

#

but I am not sure if it is good advice or not

toxic bone Jan 3, 2025, 3:08 AM

#

yeah i have nothing against it. i just dont know how to digest info there

rapid pivot Jan 3, 2025, 9:45 AM

#

desert garnet why u lost your acc? sadcat

Same as the other 5

#

thomas

#

Phone number cancelled, not enough patience to go through recovery and blah blah

bitter hearth Jan 3, 2025, 3:53 PM

#

but why 5 phone numbers cancelled

bronze blade Jan 3, 2025, 4:17 PM

#

can AUTOMATIC1111 use SD3.5 gguf?

rapid pivot Jan 3, 2025, 4:21 PM

#

bitter hearth but why 5 phone numbers cancelled

well not all accounts were lost phone numbers

#

I don't pay for mobile data, sometimes I want one for emergencies or whatever thomas

#

so they get cancelled eventually

#

if discord gets stuck in login for me for whatever reason I just literally create another account its not a thing I care that much about lmao

bitter hearth Jan 3, 2025, 4:26 PM

#

oh this sounds fine

#

I thought it was shenanigans

rapid pivot Jan 3, 2025, 5:15 PM

#

im lazy waow

remote holly Jan 3, 2025, 5:21 PM

#

#

photo realistic, a pretty woman (with dark red bob hair wearing a black suit with a dark red tie and a long black coat with long black palazzo pants with vertical red stripes) a determined look, standing in a white room, dynamic shadows, volumetric light, long exposure, sun rays, cinematic view, bokeh effect, fashion advertisement, Dior

#

wraith axle Jan 4, 2025, 3:45 PM

#

Photo via Kodak portra 160, young pretty girl, 20 years old. She has blonde hair, blue eyes, pale skin. Split into four images, Shot of different angles, white background --style raw --v 6.0

frail shoal Jan 4, 2025, 4:11 PM

#

#

bitter hearth Jan 5, 2025, 4:12 PM

#

nice shadows

prisma valley Jan 6, 2025, 12:44 PM

#

We need an easy simple, non-technical local Lora training like a FluxGym (Is there one?) and we need DreamShaper, Juggernaut, EpicRealism and Realvis versions of SD 3.5 or better in 2025 🙏🏽

bitter hearth Jan 6, 2025, 12:56 PM

#

100 steps of lion

rapid pivot Jan 6, 2025, 1:10 PM

#

bitter hearth 100 steps of lion

turbid grotto Jan 6, 2025, 2:20 PM

#

prisma valley We need an easy simple, non-technical local Lora training like a FluxGym (Is the...

RealVis is on the way!
https://huggingface.co/SG161222/RealVis_Medium_2.0b

SG161222/RealVis_Medium_2.0b · Hugging Face

#

has anyone tried it?
https://github.com/lehduong/OneDiffusion

GitHub

GitHub - lehduong/OneDiffusion: Official implementation of OneDiffu...

Official implementation of OneDiffusion paper. Contribute to lehduong/OneDiffusion development by creating an account on GitHub.

lucid swift Jan 6, 2025, 2:47 PM

#

turbid grotto has anyone tried it? https://github.com/lehduong/OneDiffusion

i have tried it

#

its ok

wind talon Jan 6, 2025, 3:12 PM

#

is there a tiled controlnet for sd 3.5m?

turbid grotto Jan 6, 2025, 4:01 PM

#

lucid swift i have tried it

I wanted to try it so much but there is no easy way to use it locally agony

sage burrow Jan 6, 2025, 6:51 PM

#

prisma valley We need an easy simple, non-technical local Lora training like a FluxGym (Is the...

That runs with only 8gb GPU 😉

#

You can install and run your own civitai locally?! Unfortunately the lora training aspect won't work though. https://github.com/civitai/civitai

lucid swift Jan 6, 2025, 7:50 PM

#

turbid grotto I wanted to try it so much but there is no easy way to use it locally <:agony:10...

i did test it locally

#

its not worth it. omnigen is better

bitter hearth Jan 6, 2025, 9:16 PM

#

omnigen didn't train on enough aesthetic data

#

it could improve maybe

stable snow Jan 7, 2025, 3:12 AM

#

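A 3D-style image with red as the main tone; the greeting on it reads "大吉大利" (good luck and great prosperity), surrounded by decorative icons: gold ingots, red envelopes, fireworks, plum blossoms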

rain current Jan 7, 2025, 1:33 PM

#

Which one do you prefer?

toxic bone Jan 8, 2025, 1:05 AM

#

2 cause i like kodak tones more than fuji tones. totally personal preference though. in the west we're all about the warmer colors

sage burrow Jan 8, 2025, 1:36 AM

#

left looks like one of those disposable cameras from the 80s! Or you ran out of printer ink 😄

rapid pivot Jan 8, 2025, 2:41 AM

#

I just don't like her legs on left

#

It looks like a doll

#

Now you can't unsee thomas

sage burrow Jan 8, 2025, 3:46 AM

#

We are all art critics! 😄

cursive frigate Jan 8, 2025, 6:40 AM

#

Does anyone know any good sources for documentation for ComfyUI and Custom Node creation that I can feed into something like ChatGPT or a local LLM that will allow me to talk to it and have it help me either create my own custom nodes or at least help me put together better workflows for specific use cases?

uneven storm Jan 8, 2025, 7:15 AM

#

cursive frigate Does anyone know any good sources for documentation for ComfyUI and Custom Node ...

comfy_ollama

#

#

toxic bone Jan 8, 2025, 7:45 AM

#

if anyone here that still has respect for crystalwizard then here's some interesting reading for you to see on the comfy org discord server. #1319770970868945057 message

#

catch it before he deletes it

cursive frigate Jan 8, 2025, 7:47 AM

#

toxic bone if anyone here that still has respect for crystalwizard then here's some interes...

It says I don't have access to the link

#

I guess I don't have access to that channel for some reason

toxic bone Jan 8, 2025, 7:48 AM

#

i thought discord would let you join servers through a link like that. it's the public comfy org server. i can't post the link here because they block discord invite links

#

DM'd you it

cursive frigate Jan 8, 2025, 7:50 AM

#

Thanks

#

So what is he getting at. I guess could I get a TL;DR of how this convo started. It seems like Comfy is now monitoring created nodes?

toxic bone Jan 8, 2025, 7:54 AM

#

They have a public registry of nodes now. Anyone can apply to have a node on it. By default, manager will only install nodes that are registered

#

it's a step in the right direction. but he's flipping out because he has to change a few configuration files to install a custom node

#

you can still git clone directly into /custom_nodes/ folder
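A minimal sketch of that manual route, assuming a standard ComfyUI checkout with a `custom_nodes` folder. The helper name, repository URL and install path are placeholders for illustration, and the command is only built, not executed.

```python
from pathlib import Path


def clone_command(repo_url: str, comfy_root: str) -> list[str]:
    """Build the git command that clones a node repo into custom_nodes."""
    # Derive the folder name from the last URL segment, minus any ".git".
    repo_name = repo_url.rstrip("/").removesuffix(".git").split("/")[-1]
    target = Path(comfy_root) / "custom_nodes" / repo_name
    return ["git", "clone", repo_url, str(target)]


# Placeholder URL and install path, purely for illustration.
cmd = clone_command("https://github.com/example-user/example-node.git", "ComfyUI")
print(" ".join(cmd))
```

Passing `cmd` to `subprocess.run(cmd, check=True)` would perform the clone; any Python dependencies of the node pack would still need to be installed separately.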

#

personally i find it very exposing of his absolute lack of expertise and professional decorum

cursive frigate Jan 8, 2025, 7:57 AM

#

He definitely has a strong opinion. I can see both sides of that coin.

#

It is a good direction for sure. but maybe there should be a time delay for some legacy nodes, or at least an audit and conversion timeframe for older pre-existing nodes.

toxic bone Jan 8, 2025, 7:58 AM

#

This has been coming for a while. The registry wasn't just launched today empty

#

it's just fully deployed now

#

https://blog.comfy.org/p/launching-comfyui-registry

cursive frigate Jan 8, 2025, 8:00 AM

#

I wish I knew enough to be able to make my own custom nodes. I would take a shot at getting on the registry and see how it goes.

#

I think its a good thing.
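For anyone in the same position, a custom node is usually just a small Python class. The sketch below follows the commonly used ComfyUI convention (`INPUT_TYPES`, `RETURN_TYPES`, `FUNCTION`, `CATEGORY`, plus a `NODE_CLASS_MAPPINGS` dict); the node itself, its name and its category are hypothetical, and registry publishing requirements are not covered here.

```python
class InvertBrightness:
    """Hypothetical example node: inverts an image with values in [0, 1]."""

    @classmethod
    def INPUT_TYPES(cls):
        # Declares the input sockets shown on the node.
        return {"required": {"image": ("IMAGE",)}}

    RETURN_TYPES = ("IMAGE",)  # one output socket
    FUNCTION = "invert"        # name of the method that gets called
    CATEGORY = "examples"      # menu location, arbitrary here

    def invert(self, image):
        # The return value is a tuple matching RETURN_TYPES.
        return (1.0 - image,)


# Mappings exported from the node pack's __init__.py so the node is discovered.
NODE_CLASS_MAPPINGS = {"InvertBrightness": InvertBrightness}
NODE_DISPLAY_NAME_MAPPINGS = {"InvertBrightness": "Invert Brightness (example)"}
```

Inside ComfyUI the `image` argument would be a tensor; the arithmetic is written so that it also works on a plain float, which makes the class easy to try outside the UI.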

toxic bone Jan 8, 2025, 8:02 AM

#

I always complain about the security risks of having 5 dozen custom node packs installed, but what i truly hate most about it is the dependency problems. Nodes overwriting each other's dependencies in the virtual environment is an ongoing problem. This could help to alleviate that issue among others
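The dependency problem can be illustrated with a small sketch. The node pack names and version pins below are made up, and only exact `==` pins are compared; real requirement files can use ranges and extras that this does not handle.

```python
from collections import defaultdict

# Hypothetical pinned requirements for two made-up node packs.
NODE_PACK_REQUIREMENTS = {
    "node_pack_a": ["numpy==1.26.4", "opencv-python==4.9.0.80"],
    "node_pack_b": ["numpy==2.1.0", "pillow==10.4.0"],
}


def find_conflicts(requirements: dict[str, list[str]]) -> dict[str, dict[str, str]]:
    """Return packages that different packs pin to different versions."""
    pins: dict[str, dict[str, str]] = defaultdict(dict)
    for pack, lines in requirements.items():
        for line in lines:
            name, _, version = line.partition("==")
            if version:  # only exact pins are considered in this sketch
                pins[name.strip().lower()][pack] = version.strip()
    return {
        name: by_pack
        for name, by_pack in pins.items()
        if len(set(by_pack.values())) > 1
    }


conflicts = find_conflicts(NODE_PACK_REQUIREMENTS)
print(conflicts)
```

In a shared virtual environment, whichever pack installs last wins for `numpy` here, which is the overwriting behaviour described above.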

#

It can also lead to a standard library. Something i herald often.

cursive frigate Jan 8, 2025, 8:02 AM

#

Ya this could be great for conflicting nodes for sure

toxic bone Jan 8, 2025, 8:03 AM

#

in the past, there was already a list. nodes that were recognized by the manager. but now it's public and anyone can submit to it

#

much more standardized and tied to secure practices

bitter hearth Jan 8, 2025, 9:16 AM

#

the registry was in response to malware yeah

#

but it will also help with the dependency issues

proven pecan Jan 8, 2025, 10:29 AM

#

toxic bone i thought discord would let you join servers through a link like that. it's the...

Is it public? I can't find it.

toxic bone Jan 8, 2025, 7:51 PM

#

www.comfy.org has a link but i'll dm you the invite @proven pecan

torpid marlinBOT Jan 8, 2025, 8:18 PM

#

toxic bone Jan 8, 2025, 8:22 PM

#

https://en.wikipedia.org/wiki/Quantum_mind

could use traditional search much easier. LLMs don't need to replace every single task. traditional software still excels in most arenas

Quantum mind

The quantum mind or quantum consciousness is a group of hypotheses proposing that local physical laws and interactions from classical mechanics or connections between neurons alone cannot explain consciousness, positing instead that quantum-mechanical phenomena, such as entanglement and superposition that cause nonlocalized quantum effects, inte...

#

it's funny to me that people are heralding all the capabilities of gpt, like counting the letters in strawberry now, when a simple string operation could do that already for a half century
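The string operation in question is a one-liner in most languages. In Python, for example:

```python
word = "strawberry"
r_count = word.count("r")  # str.count returns the number of non-overlapping matches
print(r_count)  # 3
```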

#

LLMs are certainly a breakthrough. I don't believe we're anywhere near AGI though. They may become more generalized in their use, but general intelligence they are not. It's all Theranos-level hype

mortal mesa Jan 8, 2025, 8:27 PM

#

toxic bone if anyone here that still has respect for crystalwizard then here's some interes...

thats the person on here that induced me to misgender Comfyanon

toxic bone Jan 8, 2025, 8:36 PM

#

seems to thrive on faux expertise and flexing

rapid pivot Jan 8, 2025, 8:37 PM

#

mortal mesa thats the person on here that induced me to misgender Comfyanon

what is it thomas

toxic bone Jan 8, 2025, 8:38 PM

#

I dm'd you the link to it chaos. everyone should join comfy org discord server. there's movers and shakers there

#

also, mandatory linking needed now. https://www.youtube.com/watch?v=ZG_k5CSYKhg

YouTube

UPROXX Indie Mixtape

Faith No More - Epic (Official Music Video)

Faith No More - "Epic" (Official Music Video) from the album 'The Real Thing' (1989)

🔔 Subscribe to UPROXX Indie Mixtape and ring the bell to turn on notifications: https://uproxx.it/mrln2hd

✅ Subscribe to the newsletter for weekly music recommendations in your inbox: http://indiemixtape.com

🎧 Stream the official Topsify playlist: https://lnk...

▶ Play video

mortal mesa Jan 8, 2025, 8:41 PM

#

rapid pivot what is it thomas

https://www.youtube.com/watch?v=Hc31HotThA0

YouTube

Latent Space

AI Engineering for Art - with comfyanonymous

Full show notes: https://www.latent.space/p/comfyui

Happy new year friends! Thanks for all the love on the Latent Space Live and 100th Episode End of Year recap. Your support has boosted us 30 places in the Podcast charts, and that always helps us book great guests and organize more industry events for you! We don't say this enough but thank yo...

▶ Play video

rapid pivot Jan 8, 2025, 8:41 PM

#

ai video im scared sadcat

#

52 minutes of interview agony

#

I have watched so many interviews these past few months why you do this to me sadcat

sage burrow Jan 8, 2025, 11:29 PM

#

rapid pivot 52 minutes of interview agony

The various LLMs can give video summaries or cliff notes 😉

uneven storm Jan 9, 2025, 1:39 AM

#

brittle nexus Jan 9, 2025, 2:09 AM

#

Google image is bizarrely good

#

#

#

sage burrow Jan 9, 2025, 4:20 AM

#

Those are amazing!

#

flux tried 😄

#

SD3 large gave it a try as well (don't count the fingers!)

torpid marlinBOT Jan 9, 2025, 7:04 AM

#

bronze ivy Jan 9, 2025, 7:12 AM

#

Hello everyone

muted dove Jan 9, 2025, 7:19 AM

#

He was replaced.

flat raft Jan 9, 2025, 10:55 AM

#

glass rail samples

turbid grotto Jan 9, 2025, 11:12 AM

#

does anyone know a flux finetune that makes it more unique? I want to get rid of this style as it has become too generic

#

https://github.com/welltop-cn/ComfyUI-TeaCache this thing works on rtx3060 - 1.9x speedup

GitHub

GitHub - welltop-cn/ComfyUI-TeaCache

Contribute to welltop-cn/ComfyUI-TeaCache development by creating an account on GitHub.

cunning lintel Jan 9, 2025, 1:58 PM

#

muted dove He was replaced.

Pff that's marketing too, in reality it's like this 🤡

placid plover Jan 9, 2025, 3:31 PM

#

A group of 8 realistic cats taking a selfie together. The cats have human-like expressions, they are all standing close together in a friendly pose, resembling a group photo. The background shows a blurred indoor setting with other people. The lighting is natural with soft shadows, creating depth and realism

sage burrow Jan 9, 2025, 11:34 PM

#

placid plover A group of 8 realistic cats taking a selfie together. The cats have human-like e...

Sd3.5 large, medium, turbo

rapid pivot Jan 9, 2025, 11:43 PM

#

sage burrow Sd3.5 large, medium, turbo

https://tenor.com/view/meow-cat-gif-27161697

Tenor

#

So many friends

craggy crest Jan 10, 2025, 4:24 AM

#

rapid pivot https://tenor.com/view/meow-cat-gif-27161697

you still stuck with just 4gig vram?

bitter hearth Jan 10, 2025, 4:35 AM

#

sage burrow Sd3.5 large, medium, turbo

there is a medium turbo too on huggingface

#

made by tensor art

rapid pivot Jan 10, 2025, 4:52 AM

#

craggy crest you still stuck with just 4gig vram?

Yes

#

sadcat

bitter hearth Jan 10, 2025, 4:56 AM

#

4GB of VRAM is fine you can run flux.1-lite-8B-alpha-Q3_K_S.gguf in headless mode

#

it's 3.74GB so it will fit

toxic bone Jan 10, 2025, 5:50 AM

#

only leaves 0.26GB for generation though. better off using an sd15 refine then
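
(The headroom figure above is plain subtraction. A rough sketch follows, assuming the whole 3.74GB file ends up resident in VRAM and that nothing else, such as the desktop or display, is using the card. Real usage will differ, which is why running headless was suggested.)

```python
def vram_headroom_gb(total_vram_gb: float, model_file_gb: float) -> float:
    """Memory left for activations and latents after the weights are loaded."""
    return total_vram_gb - model_file_gb

# Figures taken from the messages above: a 4GB card and a 3.74GB GGUF file.
headroom = vram_headroom_gb(4.0, 3.74)
print(f"{headroom:.2f} GB left for generation")
```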

bitter hearth Jan 10, 2025, 6:02 AM

#

is not an issue

#

so long as you are running headless

#

your screen goes black while image is generating

#

and it works ok

#

LOL IDK if people would like this advice though, they might not like their screen going black

sharp moth Jan 10, 2025, 6:42 AM

#

dog

toxic bone Jan 10, 2025, 7:13 AM

#

Screens off? what are you, some kinda luddite hippy communist?

bitter hearth Jan 10, 2025, 8:33 AM

#

lmao

#

just close your eyes when image is generating and then you won't know

hallow lion Jan 10, 2025, 8:48 AM

#

at least it's not a blue screen (of death)

sage burrow Jan 10, 2025, 11:00 AM

#

Cloud services or mage are so affordable now, don't need more than 4gb vram at home 😄 @rapid pivot

bitter hearth Jan 10, 2025, 11:08 AM

#

ye pretty much

hallow lion Jan 10, 2025, 11:40 AM

#

He's a local cat, doesn't hang out in the cloud.

bitter hearth Jan 10, 2025, 11:53 AM

#

ah okay that's fine

rapid pivot Jan 10, 2025, 12:25 PM

#

bitter hearth 4GB of VRAM is fine you can run flux.1-lite-8B-alpha-Q3_K_S.gguf in headless mod...

Now try running that on a 2018 amd card

bitter hearth Jan 10, 2025, 12:40 PM

#

oh no

rapid pivot Jan 10, 2025, 12:45 PM

#

sadcat

sage burrow Jan 10, 2025, 3:10 PM

#

Both glif and mage.space are pretty awesome

civic trail Jan 10, 2025, 9:29 PM

#

proven pecan Jan 12, 2025, 5:04 PM

#

neon imp Fully aware of dreambooth. Just haven't ran it since SD1.5. Thank you, will look...

I saved this comment and now I'm wondering what is left of it?

neon imp Jan 12, 2025, 5:10 PM

#

proven pecan I saved this comment and now I'm wondering what is left of it?

Haven't had time to experiment more and change things, but I have shared my ai-toolkit configs and have had many others say they have seen improvements with the changes. I saw ai-toolkit was updated last week but I haven't touched anything since I was getting such great results before.

I have been very busy with both work and getting my Project Odyssey 2 video finished before the deadline but I have uploaded an absolutely amazing 3.5 L negative detail LoRA that outshines everything I had done before (with no changes to the dataset or other settings) so I am convinced there is something there. Wish I had more time to dive in but have published my "best guess" at the time to the cause as an article on Civitai.

#

This starts as an SD3.5 base render, and each frame is a decrease in lora strength by 0.01 (since it's a negative lora). The video ping-pongs back down to a loop.
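
(A small sketch of the per-frame strength schedule described above: start at 0, step down by 0.01 each frame, then reverse so the clip loops. The function name and the frame count are hypothetical, and the actual render call per frame is not shown.)

```python
def pingpong_strengths(n_steps: int, step: float = 0.01) -> list[float]:
    """Strengths going 0 -> -n_steps*step and back, for a seamless loop."""
    down = [round(-i * step, 4) for i in range(n_steps + 1)]
    # Reverse the ramp, dropping both endpoints so no frame repeats at the loop seam.
    up = down[-2:0:-1]
    return down + up

schedule = pingpong_strengths(5)
print(schedule)
```

Each value in `schedule` would be used as the lora strength for one frame, with seed and prompt held fixed.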

bitter hearth Jan 12, 2025, 5:17 PM

#

neon imp This starts as a SD3.5 base render and each frame is a decrease in lora strength...

how do you make these?

proven pecan Jan 12, 2025, 5:19 PM

#

neon imp Haven't had time to experiment more and change things but I have shared my ai-to...

Ok, thanks.

neon imp Jan 12, 2025, 5:25 PM

#

bitter hearth how do you make these?

I will be posting a full article explaining the process and the various improvements I have found with what I call negative reinforcement training, and I will share my dataset, etc. I am currently spreading myself a little thin. The idea is to train on what you don't want and use the resulting lora with a negative strength value to force things toward a conceptually opposite latent vector value.

I first found this by accident when training a SD2.1 textual embedding to try to make images I could feed into "point-e" by OpenAI to make 3d point clouds. But I had not gotten results as stable as this with 3.5 until the recent changes I have been discussing.
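
(A toy sketch of what a negative strength does to the weights, assuming the usual LoRA formulation where the effective weight is `W + strength * (B @ A)`. The 2x2 matrices are made-up numbers for illustration, not anything from the actual training run.)

```python
def matmul(X, Y):
    """Plain-Python matrix product for small lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def apply_lora(W, A, B, strength):
    """Return W + strength * (B @ A); a negative strength subtracts the learned update."""
    delta = matmul(B, A)
    return [[w + strength * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

W = [[1.0, 0.0], [0.0, 1.0]]  # toy frozen base weight
B = [[1.0], [2.0]]            # toy LoRA up-projection (rank 1)
A = [[0.5, -1.0]]             # toy LoRA down-projection (rank 1)

pos = apply_lora(W, A, B, 0.5)   # pushes toward what the lora was trained on
neg = apply_lora(W, A, B, -0.5)  # pushes the same distance the other way
print(pos, neg)
```

With the lora trained on low-detail images, the negative setting moves the weights away from that look, which is the effect described above.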

bitter hearth Jan 12, 2025, 5:38 PM

#

neon imp I will be posting a full article explain the process and various improvements I ...

okay thanks
I was wondering what the conceptual opposite of high detail like this looks like 🤔

#

is it like something extremely smooth or blurry?

neon imp Jan 12, 2025, 5:44 PM

#

bitter hearth okay thanks I was wondering what the conceptual opposite of high detail like thi...

My first dataset was the seed images used for COCO CLIP R-Precision evaluations. You can find it at the bottom of OpenAI's point-e GitHub page.

The images resemble something between early CAD and 2000s video game renders with simple flat colors and minimal details. I find at times some pixel art or anime loras can also have somewhat of this effect to some degree and always recommend doing a negative test with loras you find in the wild because you never know.

bitter hearth Jan 12, 2025, 5:45 PM

#

okay thanks this makes sense yeah

#

funnily enough negative schnell lora does this a bit

neon imp Jan 12, 2025, 5:46 PM

#

I have been testing with the de-distilled versions of Dev and Schnell and have had some limited results so far but want to get them to a better place before sharing.

bitter hearth Jan 12, 2025, 5:47 PM

#

lol I love this dataset
it's funny that it worked

an_orange_bike_leaning_on_a_pole_in_the_snow..png

#

yeah it could work on de-distilled

neon imp Jan 12, 2025, 5:49 PM

#

The Dev one I recently did works but only from 0 to -0.25, and then things just get crazy in the images. I'll prob share it soon but I've got to get back to my PO2; today's my last day before getting back to the office Monday.

rapid pivot Jan 12, 2025, 5:49 PM

#

bitter hearth lol I love this dataset its funny that it worked

I'm printing and framing on my wall

bitter hearth Jan 12, 2025, 5:54 PM

#

maybe this would make a good negative lora for flux

craggy crest Jan 12, 2025, 9:27 PM

#

cursive frigate Jan 14, 2025, 7:56 AM

#

I just put this in the ollama chat... I think this is an interesting conversation starter....

I kind of had a runaway thought.... hear me out.

So they say AI on regular computing with LLMs and such is way different from Quantum Computing, and that is 100% true. However, why not give the best AI access to Quantum Computing data through RAG or a knowledge base and see if AI can help advance Quantum Computing?

It's probably already been thought of, but I haven't heard anyone mention it, so I figured I would put it out into the ether.

#

Any takers?

viral plaza Jan 14, 2025, 12:41 PM

#

cursive frigate I just put this in the ollama chat... I think this is an interesting conversatio...

looks like nvidia's already on it https://developer.nvidia.com/blog/enabling-quantum-computing-with-ai/

lavish sparrow Jan 15, 2025, 1:41 PM

#

analog dome Jan 15, 2025, 1:59 PM

#

keywords?

lavish sparrow Jan 15, 2025, 6:01 PM

#

analog dome keywords?

who, me?

#

#

lavish sparrow Jan 15, 2025, 6:32 PM

#

lavish sparrow Jan 15, 2025, 7:13 PM

#

#

lavish sparrow Jan 15, 2025, 8:27 PM

#

#

#

#

#

#

so yeah. LLM + sd3.5L is just the way to go.

tropic vine Jan 15, 2025, 8:36 PM

#

Hello, I have an issue with diffusion models on a new computer
it's with an RTX 4090, when testing it with flux-dev, it seems to take forever to generate an image, several long minutes
what do you think I might be missing?

lavish sparrow Jan 15, 2025, 8:45 PM

#

#

"A million miles away"

lavish sparrow Jan 15, 2025, 9:37 PM

#

"Bury the light"

#

"Stained, brutal calamity"

#

"several species of small furry animals grooving together in a cave with a Pict"

lavish sparrow Jan 15, 2025, 10:32 PM

#

#

#

#

#

bitter hearth Jan 15, 2025, 11:32 PM

#

lavish sparrow

wow these are incredible, which model is this?

rapid pivot Jan 15, 2025, 11:47 PM

#

#

thomas

lavish sparrow Jan 15, 2025, 11:58 PM

#

bitter hearth wow these are incredible, which model is this?

Obviously sd3.5L

bitter hearth Jan 16, 2025, 12:07 AM

#

okay nice

analog dome Jan 16, 2025, 12:59 AM

#

lavish sparrow who, me?

yes you

sage burrow Jan 16, 2025, 1:46 AM

#

tropic vine Hello, I have an issue with diffusion models on a new computer it's with an RTX ...

Patience 😉

bitter hearth Jan 16, 2025, 2:02 AM

#

tropic vine Hello, I have an issue with diffusion models on a new computer it's with an RTX ...

for Flux Dev 1024x1024
RTX 4090 should be at least 3.6it/s for FP8 and 4.5it/s for SVDQuant
this is using sage attention and torch.compile for the FP8
with the Flux Turbo lora, that uses 8 steps, you should be getting an image in around 2 seconds
if you are getting slower than that then something is up and needs fixing
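
(A quick check of the "around 2 seconds" figure, using only the numbers quoted above. This covers the sampling steps only; text encoding, VAE decode and model loading are assumed to be extra.)

```python
def seconds_per_image(steps: int, its_per_second: float) -> float:
    """Sampling time for one image at a given iteration rate."""
    return steps / its_per_second

fp8 = seconds_per_image(8, 3.6)    # 8-step Turbo lora at the FP8 rate
svdq = seconds_per_image(8, 4.5)   # same, at the SVDQuant rate
print(f"FP8: {fp8:.2f}s, SVDQuant: {svdq:.2f}s")
```

If a run takes minutes instead, the rate is far below these figures, which usually points at a setup problem rather than the GPU itself.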

craggy crest Jan 16, 2025, 3:46 AM

#

rapid pivot Jan 16, 2025, 3:46 AM

#

sadcat moss man

craggy crest Jan 16, 2025, 3:57 AM

#

rapid pivot sadcat moss man

he comes from here

lavish sparrow Jan 16, 2025, 5:15 AM

#

analog dome yes you

in that case, the keywords sent to the llm: alternate species, monsterification,hybrid,, mist and fire creature:, nature, anime ->

```
{
"T5": "In a mystical forest under a twilight sky, a hybrid creature emerges from swirling mist and flickering flames. Part beast, part ethereal being, it combines elements of various natural forms—sharp, clawed limbs intertwined with delicate, flowing plant-like appendages. Its body is enveloped in a cloak of smoke that dances like living flames, casting an otherworldly glow. The creature's eyes glow intensely, reflecting the fiery and misty elements around it. The scene is vibrant yet eerie, with deep greens and fiery oranges blending seamlessly, capturing the essence of both nature and monstrous transformation. The anime style renders this creature with exaggerated, fluid movements and expressive features, emphasizing its hybrid and fantastical nature.",
"CLIPG": "hybrid creature, mist, fire, anime style, forest, glowing eyes, plant-animal fusion",
"CLIPL": "A vibrant, anime-styled hybrid creature blending plant and beast features, enveloped in mist and flames, glowing eyes, amidst a mystical forest.",
"ARTSTYLE": "Anime",
"NEGATIVE": "photorealistic, mundane textures, dull colors"}```

#

i'm actually using all different parts of the clip
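
(A minimal sketch of how a reply like the one above could be split into one prompt per text encoder. The key names come from the posted output. The function name is hypothetical, and wiring the strings into the CLIP-L, CLIP-G and T5 encoders depends on the frontend, so it is left out.)

```python
import json

REQUIRED_KEYS = ("T5", "CLIPG", "CLIPL", "NEGATIVE")

def split_prompts(llm_reply: str) -> dict[str, str]:
    """Parse the LLM's JSON reply and keep one prompt string per encoder."""
    data = json.loads(llm_reply)
    missing = [k for k in REQUIRED_KEYS if k not in data]
    if missing:
        raise ValueError(f"LLM reply is missing keys: {missing}")
    return {k: data[k].strip() for k in REQUIRED_KEYS}

# Shortened stand-in for an LLM reply, same shape as the one posted above.
reply = (
    '{"T5": "A hybrid creature emerges from mist and flames.", '
    '"CLIPG": "hybrid creature, mist, fire", '
    '"CLIPL": "anime hybrid creature in a mystical forest", '
    '"ARTSTYLE": "Anime", '
    '"NEGATIVE": "photorealistic, dull colors"}'
)
prompts = split_prompts(reply)
print(prompts["CLIPG"])
```

Validating the keys first helps because LLMs sometimes return malformed or incomplete JSON.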

#

thursdays

muted dove Jan 16, 2025, 10:55 AM

#

lavish sparrow

First output from Cosmos

lavish sparrow Jan 16, 2025, 11:00 AM

#

Cosmos is video model? Looks wicked tho

toxic bone Jan 16, 2025, 11:08 AM

#

lavish sparrow Cosmos is video model? Looks wicked tho

it's a "world" model, but yeah it creates videos. seems neat

toxic bone Jan 16, 2025, 11:08 AM

#

muted dove First output from Cosmos

frozen bolt. not very world accurate but i LOVE the effect

muted dove Jan 16, 2025, 11:09 AM

#

toxic bone frozen bolt. not very world accurate but i LOVE the effect

Another one

#

A bit janky

toxic bone Jan 16, 2025, 11:10 AM

#

yeah but it's free jank so yippee kiyay

pseudo owl Jan 16, 2025, 12:58 PM

#

toxic bone it's a "world" model, but yeah it creates videos. seems neat

Yeah idk why they named it a world model when its architecture is just for video gen. Hunyuan is clearly much better at t2v, but Cosmos is faster and the control is pretty nice for sure: you can have a 9-frame context and you can input multiple images in the middle/end/beginning.

sullen moss Jan 16, 2025, 1:08 PM

#

Hello, everyone. Any fresh updates from the SAI team? I’ve noticed that SD 3.5 turned out to be a largely underwhelming model, with very little community activity on Civitai.

lavish sparrow Jan 16, 2025, 7:06 PM

#

lavish sparrow Jan 16, 2025, 7:06 PM

#

sullen moss Hello, everyone. Any fresh updates from the SAI team? I’ve noticed that SD 3.5 t...

personally, i don't think it's underwhelming, but with the release of flux a few weeks earlier, people lost interest really fast

lavish sparrow Jan 16, 2025, 7:34 PM

#

" winnie the poo, style of warhammer 40k"

#

#

#

"being alone doesn't scare me"

#

#

lavish sparrow Jan 16, 2025, 8:32 PM

#

#

#

#

lavish sparrow Jan 16, 2025, 9:12 PM

#

lavish sparrow Jan 16, 2025, 9:59 PM

#

#

urban arch Jan 16, 2025, 10:23 PM

#

lavish sparrow `"several species of small furry animals grooving together in a cave with a Pict...

I'm old enough to have actually heard that particular Pink Floyd song.

lavish sparrow Jan 16, 2025, 10:24 PM

#

urban arch I'm old enough to have actually heard that particular Pink Floyd song.

i haven't heard it either, it was a prompt someone suggested to test on my llm setup

urban arch Jan 16, 2025, 10:25 PM

#

lavish sparrow i haven't heard it either, it was a prompt someone suggested to test on my llm se...

It's the title of a VERY old Pink Floyd song.

mortal mesa Jan 16, 2025, 11:41 PM

#

ahh good Syd Barrett Floyd

#

thank you for the idea, gonna try some Astronomy Domine prompts

lavish sparrow Jan 17, 2025, 4:32 PM

#

lavish sparrow Jan 17, 2025, 5:04 PM

#

3.5 absynth finetune, not bad

bitter hearth Jan 17, 2025, 5:05 PM

#

the absynth method is cool yeah

#

negative loras

molten anvil Jan 17, 2025, 11:37 PM

#

What is the best way to create photorealistic images with SD3.5? My experiments so far are giving me plasticky/cartoony photos. Any ideas would be most appreciated. (Particularly with the Turbo model.)

pseudo owl Jan 18, 2025, 12:01 AM

#

molten anvil What is the best way to create photorealistic images with SD3.5? My experiments ...

Some photorealistic Lora is the best choice probably, I don’t really like their turbo model, flux schnell seems better in terms of realism/overall and sd3.5 turbo lacks detail too imo.

This is what I got with sd3.5 turbo: “Polaroid, amateur photograph, a woman”

Not too shabby but you can clearly see hair is weird and white borders.

#

Above is 4 steps, this is Schnell with 1 step only and it looks much better (although the white borders are still there)

turbid grotto Jan 18, 2025, 3:57 AM

#

molten anvil What is the best way to create photorealistic images with SD3.5? My experiments ...

turbo seems to be smooth but normal version is just on another level of realism

strange ermine Jan 18, 2025, 6:10 PM

#

give me an image that looks like this

#

give me a picture of an envelope

lavish sparrow Jan 18, 2025, 6:14 PM

#

rapid pivot Jan 18, 2025, 6:24 PM

#

lavish sparrow

Making me want to use sd3 for more backgrounds

#

:<

lavish sparrow Jan 18, 2025, 6:40 PM

#

computer virus

hallow lion Jan 18, 2025, 7:19 PM

#

lavish sparrow Cosmos is video model? Looks wicked tho

is it better than hunyuan tho?

lavish sparrow Jan 18, 2025, 7:20 PM

#

hallow lion is it better than hunyuan tho?

i would not know, i never make vid

hallow lion Jan 18, 2025, 7:21 PM

#

Just asking coz hunyuan is pretty tremendous for a local model so is it worth looking at other models for now...

#

if hunyuan gets image to vid that's like game changing

buoyant mesa Jan 18, 2025, 7:46 PM

#

where can i train sd3.5 Loras? you can't do it in Kohya_ss so far as i know?

bitter hearth Jan 18, 2025, 8:56 PM

#

hallow lion is it better than hunyuan tho?

Cosmos VAE is a lot better yeah

proven pecan Jan 18, 2025, 9:07 PM

#

lavish sparrow computer virus

computer worm

turbid grotto Jan 18, 2025, 10:41 PM

#

buoyant mesa where can i train sd3.5 Loras ....you cant do it in KohySS so far as i know?

OneTrainer can do it - very low requirements. However, SimpleTuner should be better as the creator is still messing with sd3.5m

pseudo owl Jan 19, 2025, 12:47 AM

#

hallow lion is it better than hunyuan tho?

Hunyuan is miles better at text to video in quality. Cosmos does have a few benefits like 7b one is a bit faster, and vae is more efficient like Neon said. Also has image to video which is very useful.

Quality wise, hunyuan is much better and comparable to closed source stuff, while cosmos is considerably behind at img2vid and text2vid. But it’s pretty controllable at least.

#

It is the best open img2vid so far right now

bitter hearth Jan 19, 2025, 1:10 AM

#

I'm not saying the Cosmos VAE is more efficient, I'm saying it's higher quality

sage burrow Jan 19, 2025, 1:45 AM

#

hallow lion if hunyuan gets image to vid thats like game changing

Still holding my breath waiting!! 😄

sage burrow Jan 19, 2025, 1:45 AM

#

buoyant mesa where can i train sd3.5 Loras ....you cant do it in KohySS so far as i know?

Civitai. About $2 per.

rapid pivot Jan 19, 2025, 3:11 AM

#

#

adorbable

#

#

brazen lake Jan 20, 2025, 11:13 AM

#

dog

#

#dog

muted dove Jan 20, 2025, 12:04 PM

#

bitter hearth I'm not saying the Cosmos VAE is more efficient I'm saying its higher quality

Can it be used with other models?

bitter hearth Jan 20, 2025, 1:26 PM

#

sadly using a VAE on a different model never rly works
it would have to be retrained

fathom path Jan 21, 2025, 12:05 AM

#

rapid pivot Jan 21, 2025, 9:59 PM

#

lavish sparrow Jan 21, 2025, 11:05 PM

#

lavish sparrow Jan 21, 2025, 11:39 PM

#

deepseek-r1-32b + sd3.5L is a nice combo indeed ^^

cedar oyster Jan 22, 2025, 2:35 AM

#

a long sword, simple color, no one, game icon, 2D animation style, white background

fathom path Jan 22, 2025, 5:05 AM

#

lavish sparrow deepseek-r1-32b + sd3.5L is a nice combo indeed ^^

R1 just kicked Marco's a$$ 😍 it's insanely good

lavish sparrow Jan 22, 2025, 5:40 AM

#

I'd have to get the 7b r1 for fair comparison tho..

sage burrow Jan 22, 2025, 5:50 AM

#

lavish sparrow Jan 22, 2025, 10:19 AM

#

lavish sparrow Jan 22, 2025, 10:50 AM

#

fathom path Jan 22, 2025, 10:54 AM

#

lavish sparrow I'd have to get the 7b r1 for fair comparison tho..

I did and it's doing a lot better.

lavish sparrow Jan 22, 2025, 10:55 AM

#

yeah, i suppose it does. but marco o1 wasn't bad for its size at all

fathom path Jan 22, 2025, 10:57 AM

#

Yeah, I've been using it since it was released, and the CoT is really good.

lavish sparrow Jan 22, 2025, 10:57 AM

#

rofl xD nice 4th of july

fathom path Jan 22, 2025, 11:03 AM

#

lavish sparrow rofl xD nice 4th of july

Good stuff 👌. Sd35 is probably the most creative model since pixart

lavish sparrow Jan 22, 2025, 11:03 AM

#

it's the smartest by a long shot

fathom path Jan 22, 2025, 11:05 AM

#

Yeah. Flux becomes limited very quickly unfortunately

lavish sparrow Jan 22, 2025, 11:05 AM

#

#

muted dove Jan 22, 2025, 11:55 AM

#

#

#

#

#

#

#

#

#

#

muted dove Jan 22, 2025, 2:38 PM

#

#

muted dove Jan 22, 2025, 3:02 PM

#

#

muted dove Jan 22, 2025, 3:21 PM

#

real terrace Jan 22, 2025, 9:50 PM

#

Hi, I haven't tried flux for a while, are there some good light models for 12 GB VRAM? I got OOM when running Flux

split bramble Jan 22, 2025, 9:55 PM

#

real terrace Hi, I haven't try flux for a while, is there some light good models, for 12 GB V...

I run Flux Dev on 12GB without issue.

real terrace Jan 23, 2025, 1:15 AM

#

split bramble I run Flux Dev on 12GB without issue.

This encouraged me to try it again, but it still stops my youtube video from time to time

turbid grotto Jan 23, 2025, 2:59 AM

#

real terrace This encouraged me to try it again, but it still stops my youtube video from time t...

use quantized version of model and encoder to save vram and ram

#

q4 is the lowest you can go, I think. And it looks fine still
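
(A back-of-the-envelope sketch of why quantizing helps on 12GB. The parameter count of roughly 12B for Flux dev and the bits-per-weight figures for the GGUF formats are assumptions used for illustration, and the text encoders, VAE and activations are not counted.)

```python
def weights_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB."""
    return n_params * bits_per_weight / 8 / 1e9

N_PARAMS = 12e9  # assumption: roughly 12B parameters for Flux dev

# Assumed bits per weight: fp16 = 16, Q8_0 = about 8.5, Q4_0 = about 4.5.
for name, bpw in [("fp16", 16.0), ("Q8_0", 8.5), ("Q4_0", 4.5)]:
    print(f"{name}: ~{weights_gb(N_PARAMS, bpw):.2f} GB")
```

Under these assumptions only the 4-bit variant leaves comfortable room on a 12GB card, which fits the advice above.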

#

I am running it at 12gb too, also able to use controlnet

#

svdq can give ~3x speed improvements with the quality of q4 but it has the worst comfy integration. I only managed to make it work outside comfy

#

on the other hand, there is teacache, which can give a 2x improvement for flux dev in comfy; however, there won't be noticeable improvements at lower steps

real terrace Jan 24, 2025, 2:41 AM

#

turbid grotto use quantized version of model and encoder to save vram and ram

uhm ok! that's why I asked about improvement in the models for 12 GB

real terrace Jan 24, 2025, 2:42 AM

#

turbid grotto q4 is the lowest you can go, I think. And it looks fine still

so flux dev quantized?

#

And I didn't know encoders were quantized as well

#

I was using the ones that came out when it was released

turbid grotto Jan 24, 2025, 3:14 AM

#

real terrace so flux dev quantized?

I don't entirely understand the question.
Flux we have is distilled from full model but it can be further quantized in a smart way to reduce memory requirements and not get too big of a loss.

#

also, there is an 8b-parameter variant of flux but I am not sure if it's worth using over sd3.5l for now

real terrace Jan 24, 2025, 4:08 AM

#

turbid grotto I don't entirely understand the question. Flux we have is distilled from full mo...

I mean which models, like these ones? https://civitai.com/models/647237/flux1-dev-gguf-q2k-q3ks-q4q41q4ks-q5q51q5ks-q6k-q8

rapid pivot Jan 24, 2025, 4:58 AM

#

turbid grotto Jan 24, 2025, 6:02 AM

#

real terrace I mean which models, like these ones? https://civitai.com/models/647237/flux1-de...

yes

uneven storm Jan 24, 2025, 7:43 AM

#

real terrace uhm ok! that's why I asked about improvement in the models for 12 GB

If you don't want to suffer heavy quality loss on the quantized flux models, you can use Flux dev with wavespeed and xformers on comfy for a smooth 1.62s/it

#

it's what i do and i use a 12gb 3060

turbid grotto Jan 24, 2025, 8:29 AM

#

uneven storm If you dont want to suffer heavy quality lost on the quantized flux models you c...

xformers needs to be installed separately?

#

1.62s/it sounds super good

uneven storm Jan 24, 2025, 8:29 AM

#

can do --xformers in command line arg but i installed manually

#

but wavespeed node is what makes it fast

#

with a 3060 you won't be able to use the compile+ node, only the block cache

turbid grotto Jan 24, 2025, 8:31 AM

#

Thanks!!
I will try it

uneven storm Jan 24, 2025, 8:32 AM

#

np

real terrace Jan 24, 2025, 3:46 PM

#

uneven storm If you dont want to suffer heavy quality lost on the quantized flux models you c...

I use it on Ubuntu

#

but ty

#

and I have AMD card

tulip wadi Jan 24, 2025, 3:47 PM

#

real terrace and I have AMD card

AMD card specs

real terrace Jan 24, 2025, 3:47 PM

#

tulip wadi AMD card specs

RX 6700

real terrace Jan 24, 2025, 4:02 PM

#

turbid grotto yes

But what about this?

pseudo owl Jan 24, 2025, 11:11 PM

#

Some realistic 1step gens with Flux Schnell, no loras or anything.

rapid pivot Jan 24, 2025, 11:23 PM

#

pseudo owl Some realistic 1step gens with Flux Schnell, no loras or anything.

what happened to the witcher man

#

no one played his games anymore, had to retire x.x

hallow lion Jan 24, 2025, 11:30 PM

#

Winnie the chinese president lora

#

So you guys know about tiananmen square regardless of this hush vee pee enn thing right?

#

google it lee!

#

You almost did it

#

Give it another try

#

😉

#

Trump's got you

#

It's now or never

#

Ceausescu PCR

#

I grew up in Romania under that fker. If we can do it so can you.

#

catwhaaa

#

https://tenor.com/view/halo-halo-halo-hallo-hello-damskoo-gif-25142480

Tenor

#

Alo Alo Beijing.

turbid grotto Jan 25, 2025, 12:37 AM

#

real terrace But what about this?

gguf does support lora

summer ginkgo Jan 25, 2025, 12:41 AM

#

hallow lion Alo Alo Beijing.

Why no tibetan flag emoji? 😭

rapid pivot Jan 27, 2025, 3:20 AM

#

remote holly Jan 27, 2025, 11:35 AM

#

someone is trying to fine-tune sd3.5m to make it look like Illustrious: https://huggingface.co/AngelBottomless/Illustrious-sd3.5m-fails

AngelBottomless/Illustrious-sd3.5m-fails · Hugging Face

remote holly Jan 27, 2025, 11:53 AM

#

i've never seen these finetunes on DiT models

bitter hearth Jan 27, 2025, 1:23 PM

#

next Pony is DiT

#

on auraflow

#

things will change a lot when GB200 NVL72 comes out

#

it's gonna unlock quite a lot of new abilities in terms of training

#

the biggest issue with training is that there is clearly a benefit from large batch sizes in terms of training quality
but to use very large batch sizes at high speed is not easy- the issue is the communication between the GPUs
GB200 NVL72 goes a long way towards fixing that because it puts 72 big Nvidia GPUs in one pod
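
(A small sketch of the batch-size arithmetic behind this point, assuming ordinary data-parallel training. The per-GPU batch of 8 is a made-up number. The same effective batch can come from many GPUs syncing once per step or from fewer GPUs accumulating gradients over several passes, which takes longer.)

```python
def effective_batch_size(per_gpu_batch: int, num_gpus: int, grad_accum_steps: int = 1) -> int:
    """Samples contributing to one optimizer step in data-parallel training."""
    return per_gpu_batch * num_gpus * grad_accum_steps

big_pod = effective_batch_size(per_gpu_batch=8, num_gpus=72)
small_node = effective_batch_size(per_gpu_batch=8, num_gpus=8, grad_accum_steps=9)
print(big_pod, small_node)
```

Both reach the same batch, but the 8-GPU setup needs nine forward and backward passes per update, and any multi-GPU setup still has to exchange gradients, which is the communication cost mentioned above.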

#

I think the issue with anime models is dataset quality though rather than compute, at the moment

#

but how you build a high quality anime dataset out to the tens of millions of images I do not know

#

it's 1000x easier with photographic stuff because to some extent all photographs above a minimum quality level are usable, whereas anime has to be very specific styles and content

finite hollow Jan 27, 2025, 3:09 PM

#

have there been many changes in the last 3 months? i didn't watch the news kinda

#

i can't seem to find improved pictures in any of the channels here sadly

remote holly Jan 27, 2025, 5:53 PM

#

@bitter hearth I didn't know that, thank you for the explanations. I'm really looking forward to seeing these finetunes come to DiT models, because the prompt understanding they offer will be a game changer. I don't really like this tag system, I prefer to use natural language

sullen moss Jan 27, 2025, 6:24 PM

#

https://huggingface.co/deepseek-ai/Janus-Pro-7B

deepseek-ai/Janus-Pro-7B · Hugging Face

bitter hearth Jan 27, 2025, 6:50 PM

#

yeah that would be the big advantage of a DiT anime model, the prompt following

rapid pivot Jan 27, 2025, 6:57 PM

#

One day we'll have anime models that aren't dumb

#

agony

bitter hearth Jan 27, 2025, 7:00 PM

#

Janus-Pro-7B is nice
bear in mind it's 384x384 and is not one of the fast types of autoregressive

sullen moss Jan 27, 2025, 8:50 PM

#

bitter hearth Janus-Pro-7B is nice bear in mind it's 384x384 and is not one of the fast types o...

So far, I don’t understand the hype around this model. But I’ll keep an eye on it to see what it might turn into.

bitter hearth Jan 28, 2025, 7:01 AM

#

fairly sure the hype is just over the brand name deepseek and not because people actually want a 384x384 autoregressive model

hallow lion Jan 28, 2025, 10:42 AM

#

mods

#

get him!

muted dove Jan 28, 2025, 10:49 AM

#

#

brittle nexus Jan 28, 2025, 6:54 PM

#

Sorry for the dumb question, but can this cheaper and more efficient training method used by deepseek help img2img models?

dry wave Jan 28, 2025, 7:31 PM

#

there is no new efficient training method

#

training on artificial data can help speed up training - PixArt is already doing that. I would still prefer large scale tuning though

pseudo owl Jan 28, 2025, 7:33 PM

#

brittle nexus Sorry for the dumb question but this cheaper and efficient training method used ...

Yeah kinda, this is a very, very cheap moe diffusion model by Sony. It’s not bad for the size, but everyone still uses something like flux Schnell/hyper for speed

https://github.com/SonyResearch/micro_diffusion

GitHub

GitHub - SonyResearch/micro_diffusion: Official repository for our ...

Official repository for our work on micro-budget training of large-scale diffusion models. - SonyResearch/micro_diffusion

#

This is not a moe architecture, but it also trains very fast (21x faster) and is better than normal dits at similar sizes. It’s more of a demo than an actual usable model but has a lot of potential: https://github.com/hustvl/LightningDiT

lavish sparrow Jan 28, 2025, 9:08 PM

#

errant dust Jan 28, 2025, 11:55 PM

#

https://www.livescience.com/technology/artificial-intelligence/deepseek-stuns-tech-industry-with-new-ai-image-generator-that-beats-openais-dall-e-3

livescience.com

DeepSeek stuns tech industry with new AI image generator that beats...

Chinese AI lab DeepSeek has released a new image generator, Janus-Pro-7B, which the company says is better than competitors.

remote holly Jan 29, 2025, 2:51 PM

#

the svdQuant project seems abandoned, that's sad

brittle nexus Jan 29, 2025, 2:52 PM

#

dry wave there is no new efficient training method

https://www.tomshardware.com/tech-industry/artificial-intelligence/deepseeks-ai-breakthrough-bypasses-industry-standard-cuda-uses-assembly-like-ptx-programming-instead

Tom's Hardware

DeepSeek's AI breakthrough bypasses industry-standard CUDA, uses Nv...

Dramatic optimizations do not come easy.

dry wave Jan 29, 2025, 2:53 PM

#

I know they do a lot of engineering and optimization like training on fp8. It's not a new method, though.

#

you can optimize every training setup and everyone is doing that already, although some teams are definitely better in this than others

bitter hearth Jan 29, 2025, 11:58 PM

#

remote holly the svdQuant project seems abandoned, that's sad

they had an update 6 days ago

craggy crest Jan 30, 2025, 3:32 AM

#

errant dust https://www.livescience.com/technology/artificial-intelligence/deepseek-stuns-te...

not so stunned - most of the people that have been talking about it have also been saying that they're not impressed

errant dust Jan 30, 2025, 12:55 PM

#

Not stunned by superior quality. Stunned that they released it

#

It wasn't on anyone's radar

bitter hearth Jan 30, 2025, 12:59 PM

#

it was a surprise yeah

#

there's a new Lumina too

#

https://huggingface.co/Alpha-VLLM/Lumina-Image-2.0

#

cunning lintel Jan 30, 2025, 1:39 PM

#

bitter hearth there's a new Lumina too

Now that's a nice surprise too 🥳

bitter hearth Jan 30, 2025, 1:54 PM

#

I think this might be really good yeah

muted dove Jan 30, 2025, 2:11 PM

#

https://github.com/comfyanonymous/ComfyUI/issues/6648

GitHub

Add support for Lumina-Image-2.0 · Issue #6648 · comfyanonymous/Com...

Feature Idea https://huggingface.co/Alpha-VLLM/Lumina-Image-2.0 Existing Solutions No response Other No response

cunning lintel Jan 30, 2025, 2:28 PM

#

It feels a bit like auraflow while trying a few prompts in their gradio space, in that it at times seems to be rather rigid in following prompts. fun 🙂

bitter hearth Jan 30, 2025, 3:04 PM

#

that was likely their previous model

dull star Jan 30, 2025, 3:27 PM

#

yes the space is using Lumina-Next-SFT

cunning lintel Jan 30, 2025, 3:37 PM

#

The one linked from https://github.com/Alpha-VLLM/Lumina-Image-2.0 (http://47.100.29.251:10010/) should be the new one?!


dull star Jan 30, 2025, 3:38 PM

#

yes

#

if you use that and not the one from huggingface then you were right

bitter hearth Jan 30, 2025, 4:08 PM

#

oh sorry this is correct
I assumed you went from their huggingface

#

violet escarp Jan 30, 2025, 8:14 PM

#

errant dust https://www.livescience.com/technology/artificial-intelligence/deepseek-stuns-te...

it doesn't seem that impressive on its own in terms of quality. It could be nice to finetune though. Lumina is interesting, but it seems to struggle with complex poses from what I've seen like sd3 2b. It might need more parameters to learn properly.

bitter hearth Jan 30, 2025, 8:19 PM

#

yeah 2B is rough for DiT

#

Sana 1.5 just dropped at 4.5B or so

#

might have potential

violet escarp Jan 30, 2025, 8:21 PM

#

Sana is different

#

it depends on the channel count for the vae and compression

#

there was some math I saw that's a bit over my head. The main takeaway I got from it is that a higher-channel VAE is harder to train with, so it needs more parameters. Sana also uses higher compression, so it doesn't need as high a parameter count.

the DOF ratio between the input vector dim size and the model dim matters a lot here
a 16(64)-channel VAE needs a DOF ratio larger than 32x at least
so you need at least a 2048 hidden dim
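
The arithmetic in that last note can be sketched as follows. This is only a rough reading of it: the assumption here is that "16(64)" means a 16-channel latent grouped into 2x2 patches, so each token carries 16 * 2 * 2 = 64 values, and that "DOF ratio" means hidden dim divided by that token size. The function name and the `patch_size` default are made up for illustration.

```python
def min_hidden_dim(vae_channels: int, patch_size: int = 2, dof_ratio: int = 32) -> int:
    """Smallest hidden dim that meets the quoted ratio (illustrative only)."""
    # Values per token after patchifying the latent (assumed interpretation).
    token_dim = vae_channels * patch_size * patch_size
    return token_dim * dof_ratio


sixteen_channel = min_hidden_dim(16)  # 16 * 2 * 2 = 64 values per token, times 32
four_channel = min_hidden_dim(4)      # same rule applied to a 4-channel VAE
print(sixteen_channel, four_channel)
```

Under the same assumptions a 4-channel VAE would only call for a hidden dim of 512, which is consistent with the point that higher-channel VAEs want bigger models.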

bitter hearth Jan 30, 2025, 8:27 PM

#

better VAEs are harder to train with yeah

#

it's harder to get the DiT training to converge

ashen abyss Jan 31, 2025, 2:38 AM

#

https://tenor.com/bxGTC.gif

Tenor

bitter hearth Jan 31, 2025, 7:11 AM

#

dice?

cunning lintel Jan 31, 2025, 6:37 PM

#

SD3.5L Things

#

Other 3.5L Things

still pike Feb 1, 2025, 2:13 AM

#

Please draw a picture with the whole screen full of smiley-faced oranges and apples

lucid swift Feb 1, 2025, 5:16 AM

#

still pike Please draw a picture with the whole screen full of smiley-faced oranges and apples

#

craggy crest Feb 1, 2025, 6:07 AM

#

errant dust It wasn't on anyone's radar

look at the timing - it was released as a repost after the tiktok ban, along with a massive amount of hype, targeted right at where it hit - the stock market

#

not that it did much in the long run but it sure did damage for a few hours

#

bitter hearth Feb 1, 2025, 7:16 AM

#

I could tell when people hadn't tried the demo, for Janus-Pro-7B

fiery wharf Feb 1, 2025, 7:18 AM

#

the only fake hype was openai lying about inflated costs to train LLMs just to milk investors. that's why the stocks went down. Sam Altman's bs finally caught up to him

bitter hearth Feb 1, 2025, 7:19 AM

#

I think Deepseek themselves did not hype Janus-Pro-7B yeah
I think some journalists found it and wanted to make an exciting article

#

the Janus-Pro-7B paper makes it clear several times that it's not going to be amazing image quality and that it's just a base for future models

#

it's only 384x384 after all

#

if you want a nice autoregressive image model that is out now you can use Infinity, Switti or the CoT version of Show-O

mortal mesa Feb 1, 2025, 7:23 AM

#

i liked the image classification or whatever you call it on the 1b, but don't really need it

bitter hearth Feb 1, 2025, 7:24 AM

#

yeah the image understanding was what it was really about

#

it improved that a lot, for that class of model

#

these are the other models I mentioned if anyone wants to try them https://huggingface.co/FoundationVision/Infinity/tree/main https://huggingface.co/yresearch/Switti https://huggingface.co/ZiyuG/Image-Generation-CoT

#

they look good already, in their current form

silent iris Feb 1, 2025, 7:56 AM

#

dreamy merlin Feb 1, 2025, 7:58 AM

#

Hello, I have a hard time running Stable Diffusion Large on an A4500 with 20GB VRAM; it always runs out of VRAM. If I use fp8 it is runnable, but how do I run it without quantization? I heard someone was able to run it on even smaller VRAM.
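
A back-of-the-envelope estimate shows why fp8 can fit on 20GB while 16-bit weights do not. The parameter counts below are approximate assumptions (roughly 8.1B for the SD3.5 Large transformer and roughly 4.7B for the T5-XXL text encoder), and the estimate covers weights only, not activations, the CLIP encoders, or the VAE.

```python
GIB = 2**30


def weights_gib(params_billion: float, bytes_per_param: int) -> float:
    """Memory needed just to hold the weights, in GiB."""
    return params_billion * 1e9 * bytes_per_param / GIB


# Approximate parameter counts, assumed for illustration.
TRANSFORMER_B = 8.1
T5_XXL_B = 4.7

transformer_16bit = weights_gib(TRANSFORMER_B, 2)
transformer_fp8 = weights_gib(TRANSFORMER_B, 1)
with_t5_16bit = transformer_16bit + weights_gib(T5_XXL_B, 2)

print(f"transformer, 16-bit: {transformer_16bit:.1f} GiB")
print(f"transformer, fp8:    {transformer_fp8:.1f} GiB")
print(f"transformer + T5, 16-bit: {with_t5_16bit:.1f} GiB")
```

If these numbers are in the right ballpark, the 16-bit transformer and T5 cannot both sit in 20GB at once, so running unquantized would mean keeping only one component on the GPU at a time (for example with CPU offloading such as `enable_model_cpu_offload()` in diffusers), at the cost of speed.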

lavish sparrow Feb 1, 2025, 8:37 AM

#

shopping for groceries in 2026

lavish sparrow Feb 1, 2025, 8:57 AM

#

lavish sparrow Feb 1, 2025, 9:13 AM

#

shy leaf Feb 1, 2025, 5:09 PM

#

scam link

turbid grotto Feb 1, 2025, 6:53 PM

#

Does anyone have info about Stability? How are they?

silent iris Feb 1, 2025, 8:30 PM

#

pretty good

#

I would say

#

the best you can do is to add ''Detailed environment'' or ''Detailed background environment'' to the prompt; that should give you a high quality image every time

lavish sparrow Feb 1, 2025, 8:36 PM

#

hallow lion Feb 2, 2025, 12:02 PM

#

turbid grotto Does anyone have info about Stability? How are they?

They're stable.

turbid grotto Feb 2, 2025, 3:06 PM

#

hallow lion They're stable.

Hope they're diffusion too

fathom path Feb 2, 2025, 8:52 PM

#

bitter hearth Feb 3, 2025, 3:24 AM

#

they made a 3D model

#

so there's at least something going on

hallow lion Feb 3, 2025, 11:26 PM

#

WAIT!

#

WHERE IS EMAD!

#

WHERE IS THE EMAD EMOJI!!!!????

#

WHAT HAVE YOU DONE?

#

fiery wharf Feb 3, 2025, 11:30 PM

#

hallow lion WHAT HAVE YOU DONE?

is getting replaced by the james cameron emoji

hallow lion Feb 3, 2025, 11:32 PM

#

https://tenor.com/view/james-cameron-gif-27197031

Tenor

#

That is not a real dollar bill or I would go to jail.

analog folio Feb 4, 2025, 5:10 AM

#

lavish sparrow

absolutely beautiful

turbid grotto Feb 4, 2025, 2:03 PM

#

hallow lion WHERE IS THE EMAD EMOJI!!!!????

agony

terse aspen Feb 5, 2025, 1:06 PM

#

Apple, engraving in the style of Dürer

cunning lintel Feb 5, 2025, 11:21 PM

#

wise pewter Feb 6, 2025, 7:41 AM

#

Please draw a picture with the whole screen full of smiley-faced oranges and apples

cursive frigate Feb 6, 2025, 7:52 AM

#

I forgot where to put bbox files in comfyui. Can anyone help me with that?

runic tusk Feb 6, 2025, 11:07 AM

#

cursive frigate I forgot where to put bbox files in comfyui. Can anyone help me with that?

ComfyUI/models/ultralytics/bbox
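
For anyone scripting this, a small sketch of copying a detector file into that folder. The helper name and the example file name are hypothetical, and the demo runs against a throwaway temp directory rather than a real ComfyUI install.

```python
import shutil
import tempfile
from pathlib import Path


def install_bbox_model(model_file: Path, comfy_root: Path) -> Path:
    """Copy a bbox detector file into <comfy_root>/models/ultralytics/bbox."""
    dest_dir = comfy_root / "models" / "ultralytics" / "bbox"
    dest_dir.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    dest = dest_dir / model_file.name
    shutil.copy2(model_file, dest)
    return dest


# Demo in a temporary directory with a placeholder file (hypothetical name).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "ComfyUI"
    source = Path(tmp) / "example_bbox.pt"
    source.write_bytes(b"placeholder")
    installed = install_bbox_model(source, root)
    relative = installed.relative_to(root).as_posix()
    copied_ok = installed.is_file()

print(relative)
```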

tawny orbit Feb 6, 2025, 11:15 AM

#

wise pewter Please draw a picture with the whole screen full of smiley-faced oranges and apples

One message removed from a suspended account.

craggy crest Feb 6, 2025, 4:23 PM

#

tawny orbit One message removed from a suspended account.

https://www.deepl.com/en/translator this is an excellent translator


neon imp Feb 7, 2025, 7:05 AM

#

Really happy with my latest 3.5M fine-tune!

mild wind Feb 7, 2025, 9:42 AM

#

please generate a picture showing a house

lucid swift Feb 7, 2025, 3:38 PM

#

wise pewter Please draw a picture with the whole screen full of smiley-faced oranges and apples

lucid swift Feb 7, 2025, 3:40 PM

#

mild wind please generate a picture showing a house

mental moth Feb 7, 2025, 3:42 PM

#

please generate 4 pictures with a Tiger

lucid swift Feb 7, 2025, 3:59 PM

#

lucid swift Feb 7, 2025, 4:01 PM

#

mental moth please generate 4 pictures with a Tiger

#

#

this is the lumina 2.0 model

sick pendant Feb 7, 2025, 5:00 PM

#

generate 4 pictures to Halloumi cheese

heady bluff Feb 7, 2025, 5:43 PM

#

please generate an image of apple

dull star Feb 7, 2025, 6:18 PM

#

The lumina model is okay.

#

The point is that it's Apache-2, but unlike auraflow it's much smaller

turbid grotto Feb 7, 2025, 6:23 PM

#

lucid swift this is the lumina 2.0 model

hi, how does it compare to sd3.5m in speed and quality?

lapis relic Feb 7, 2025, 6:28 PM

#

please Generate a living white shrimp with the cephalothorax organs clearly visible

lucid swift Feb 7, 2025, 6:40 PM

#

dull star The point is that it's Apache-2, but unlike auraflow it's much smaller

being smaller is also an advantage because you can train it more easily

#

i can't generate more images because i am no longer on an ai computer, i am just on a steamdeck now

dull star Feb 7, 2025, 6:51 PM

#

yeah I hope we'll get a painting lora or something

craggy crest Feb 7, 2025, 7:25 PM

#

lapis relic please Generate a living white shrimp with the cephalothorax organs clearly visi...

read the information in #artisan-faq <--- that channel

pseudo owl Feb 7, 2025, 7:30 PM

#

turbid grotto hi, how does it compare to sd3.5m in speed and quality?

Roughly similar quality to the sd3.5m base model (not a finetuned one like absynth though). The cool thing is that you can add system prompts.

turbid grotto Feb 7, 2025, 10:55 PM

#

pseudo owl Roughly similar quality to sd3.5m base model(not a finetuned one like absynth th...

I heard that it is a lot slower than sd3.5m, is this true?

pseudo owl Feb 7, 2025, 11:02 PM

#

turbid grotto I heard that it is a lot slower than sd3.5m, is this true?

yeah it's decently slower than sd3.5m right now but I believe that's just an optimization issue.

cunning lintel Feb 8, 2025, 12:09 AM

#

SD3.5L and Lumina 2.0

An_anthropomorphic_black_cat_dressed_in_a_tailored_emerald-green_robe_sits_gracefully_in_a_high-backed_ch-PROMPT_IN_METADATA_0.png

A_striking_anthropomorphic_black_cat_posed_regally_in_an_ornate_chair_by_a_grand_arched_window_wears_a_da-PROMPT_IN_METADATA_1.png

#

I'd love a bigger lumina 2.0; its style and details fall a bit short, but its prompt following (not apparent in such simple prompts) is really next level. As it is now I think it'll just be an interesting curiosity.

violet escarp Feb 8, 2025, 1:31 AM

#

cunning lintel SD3.5L and Lumina 2.0

It doesn't even have to be as big as 3.5L. 3B could be pretty nice. I'm pretty sure sd3 and flux are kinda inefficient since they use fused transformers. I know someone who wants to finetune lumina 2.0 as well.

violet escarp Feb 8, 2025, 1:53 AM

#

they might add extra parameters to lumina and train like that

turbid grotto Feb 8, 2025, 4:12 AM

#

pseudo owl yeah it's decently slower than sd3.5m right now but I believe that's just an optim...

thanks. Seems like comfy support is already there gonnabegood

craggy crest Feb 8, 2025, 6:44 AM

#

#

civic trail Feb 8, 2025, 5:27 PM

#

tacit lodge Feb 11, 2025, 8:24 AM

#

A boy who has just woken up sits on his bed, outside the window is a gray overcast sky, anime scene

#

id:guide

#

#1237460438229450772 A boy who has just woken up sits on his bed, outside the window is a gray overcast sky, anime scene

sullen moss Feb 11, 2025, 10:34 AM

#

Flux. LOL

meager yew Feb 11, 2025, 1:33 PM

#

Generate an image with a blue bird, that has three claws on its wings, it is flying in the sky.

native olive Feb 11, 2025, 3:44 PM

#

"cinematic wide shot, 21:9 aspect ratio, 1920s Jinan Railway Station, steampunk atmosphere, baroque architecture with Chinese elements, crowds in Republican-era clothing, steam locomotive emitting smoke, golden hour lighting with volumetric rays, Kodak Ektachrome film simulation, intricate historical details, hyperrealistic textures"

hallow lion Feb 11, 2025, 8:15 PM

#

sullen moss Flux. LOL

This is how expert advisors see the market.

#

Go Dugtogs!

craggy crest Feb 11, 2025, 9:20 PM

#

#

#

lucid swift Feb 13, 2025, 9:47 PM

#

did stability ever release the creative upscaler?

craggy crest Feb 14, 2025, 12:18 AM

#

lucid swift did stability ever release the creative upscaler?

don't remember hearing anything about that releasing

lucid swift Feb 14, 2025, 1:48 AM

#

craggy crest don't remember hearing anything about that releasing

daim

pine forge Feb 14, 2025, 2:02 AM

#

@ hensen

#

Please draw a picture with the whole screen full of smiley-faced oranges and apples

wild ravine Feb 14, 2025, 6:57 PM

#

"A beautiful and enchanting humanoid nine-tailed fox spirit from ancient Chinese mythology, blending human elegance with mystical fox traits. She has long, flowing silver-white hair with golden highlights, and her eyes are sharp, intelligent, and glowing with a magical aura. Her nine luxurious tails fan out behind her, shimmering with ethereal energy. Her face is delicate and serene, with a hint of otherworldly charm. She wears a traditional Chinese robe adorned with intricate patterns, standing gracefully in a misty, ancient landscape surrounded by bamboo forests, flowing rivers, and distant mountains. The atmosphere is dreamlike, with soft moonlight illuminating the scene, evoking a sense of mystery and fantasy."

craggy crest Feb 14, 2025, 7:57 PM

#

wild ravine "A beautiful and enchanting humanoid nine-tailed fox spirit from ancient Chinese...

You need to read the information in #artisan-faq

astral sigil Feb 15, 2025, 6:24 AM

#

The back of a woman on a cliff.

hallow lion Feb 15, 2025, 10:03 AM

#

astral sigil The back of a woman on a cliff.

uhm ok...

lucid stag Feb 15, 2025, 11:58 AM

#

A glowing portal to another dimension, starry sky background, sci-fi style, dimensional gate, city nightscape outside the gate, fantasy world inside the gate, 4K HD, cyberpunk style, glowing particles, futuristic, --v 5 --ar 1:1

frail shoal Feb 15, 2025, 6:06 PM

#

craggy crest Feb 15, 2025, 9:41 PM

#

#

dark gust Feb 19, 2025, 2:32 AM

#

Whatever happened to SD 3.5 medium controlnet?

woeful terrace Feb 19, 2025, 6:32 AM

#

opps

#

oops

bitter hearth Feb 19, 2025, 5:46 PM

#

dark gust Whatever happened to SD 3.5 medium controlnet?

oh they came out

#

you can use them now if you want

#

https://huggingface.co/tensorart

tensorart (TensorArt Studios)

#

they made a turbo lora as well

devout schooner Feb 19, 2025, 6:19 PM

#

Original SD3 (left) vs SD3.5 Medium (right), on the "Juggernaut XL Model Card Lady Prompt". 30 steps / DPM++ 2M SGM Uniform / CFG 4.5 / same seed for both. Eyes are focused oddly in the original SD3 one but the overall image is aesthetically way closer to what I'd want
It's a somewhat unfortunate recurring trend I've found after using both for a while now
SD 3.5 Medium is definitely compositionally way more coherent for photographic gens (moreso for non-closeup full body stuff) but it trends far more towards a sort of fake airbrushed look aesthetically than the original SD3 did
A model with OG SD3 aesthetics but 3.5 Medium coherence would be essentially perfect lol

bitter hearth Feb 19, 2025, 6:29 PM

#

SD3M was more photographic yeah

abstract egret Feb 20, 2025, 2:13 AM

#

Prompt:(minimalist logo design), (granular texture), (fading gradient), (data visualization elements), muted color palette,
clean background, geometric shapes, symbolic metaphor, (calm and rational mood), high detail, 4k, Negative Prompt:complex patterns, 3D render, glossy effects, neon colors, handwritten fonts, chaotic composition

craggy crest Feb 20, 2025, 8:09 AM

#

devout schooner Original SD3 (left) vs SD3.5 Medium (right), on the "Juggernaut XL Model Card La...

Sd 3.5 medium is more artsy

remote holly Feb 20, 2025, 9:09 AM

#

where are medium controlnets !?

steel remnant Feb 20, 2025, 9:25 AM

#

sdsad

#

Prompt:(minimalist logo design), (granular texture), (fading gradient), (data visualization elements), muted color palette,
clean background, geometric shapes, symbolic metaphor, (calm and rational mood), high detail, 4k, Negative Prompt:complex patterns, 3D render, glossy effects, neon colors, handwritten fonts, chaotic composition

tribal monolith Feb 20, 2025, 1:01 PM

#

prompt:A hyper-realistic portrait of a young man with delicate facial features, holding a cup of coffee in a cozy café. His hands are elegantly positioned, with natural-looking fingers and realistic skin texture. A newspaper with the headline "AI Revolution" is visible on the table beside him, with sharp and readable text. The café background has warm lighting and a blurred effect for depth.

pseudo owl Feb 20, 2025, 1:11 PM

#

tribal monolith prompt:A hyper-realistic portrait of a young man with delicate facial features, ...

Here’s your image

dim sigil Feb 22, 2025, 11:21 AM

#

PROMPT: Create hyper-realistic background image designed for use in a video. The scene features professional-grade lighting with a warm, inviting atmosphere. A sleek modern desk is positioned to the right, camera left angle, complementing the overall aesthetic. The background has a blurred yellow neon effect, adding depth and cinematic appeal. The composition is clean, with no people present, ensuring a seamless integration into video production.

muted cargo Feb 22, 2025, 12:32 PM

#

prompt : an image showcasing how there is no image generation bot active in this channel

rain current Feb 22, 2025, 12:39 PM

#

sd3.5-large

mossy prawn Feb 22, 2025, 4:55 PM

#

#artisan-1 and

dry wave Feb 22, 2025, 7:07 PM

#

rain current sd3.5-large

looks good. My own results with sd3 large so far are rather disappointing. What's the trick to make it look so good?

dull star Feb 22, 2025, 11:08 PM

#

rain current sd3.5-large

sd3-l or sd3.5-l?

rain current Feb 22, 2025, 11:43 PM

#

3.5L, sorry

craggy crest Feb 23, 2025, 12:19 AM

#

SD 3.5 large

craggy crest Feb 23, 2025, 12:20 AM

#

dry wave looks good. My own results with sd3 large so far are rather disappointing. What'...

what all are you doing with it so far? i give each of the encoders their own prompt, written to what they use best - and a fairly simple workflow.

#

you're welcome to play around with it if you want to

📎 three-encoder-workflow.json
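
The attached workflow is a ComfyUI graph and is not reproduced here. As a rough equivalent outside ComfyUI, the diffusers `StableDiffusion3Pipeline` call accepts `prompt`, `prompt_2` and `prompt_3`, which go to CLIP-L, CLIP-G and T5 respectively. The helper name and the example prompt texts below are made up; the idea of short tag-like text for CLIP and full sentences for T5 is an assumption about what "written to what they use best" means.

```python
def per_encoder_prompts(clip_l: str, clip_g: str, t5: str) -> dict[str, str]:
    """Map one prompt per text encoder onto the SD3 pipeline's keyword names."""
    return {
        "prompt": clip_l,    # CLIP-L
        "prompt_2": clip_g,  # CLIP-G
        "prompt_3": t5,      # T5-XXL
    }


kwargs = per_encoder_prompts(
    clip_l="portrait photo, elderly fisherman, overcast light",
    clip_g="weathered fisherman on a dock, muted colors, 35mm photo",
    t5=(
        "A portrait photograph of an elderly fisherman standing on a wooden "
        "dock under an overcast sky, looking directly at the camera."
    ),
)
print(sorted(kwargs))

# With a loaded pipeline this would be passed as: image = pipe(**kwargs).images[0]
```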

tardy imp Feb 23, 2025, 1:55 AM

#

prompt : beans

past cipher Feb 23, 2025, 4:39 AM

#

tardy imp prompt : beans

https://tenor.com/view/have-a-beautiful-day-fart-smoke-gif-11988036401361324488

Tenor

dry wave Feb 23, 2025, 7:51 AM

#

craggy crest what all are you doing with it so far? i give each of the encoders their own pro...

I already do that. So far the results are just inferior to Flux, though. Last time I tried to make a dnd character with it and while Flux gave diverse results sd3.5l gave me the exact same face all the time (exact opposite of what people say)

#

also, I thought the big advantage of sd3.5l would be negative prompts - but it has issues with them

craggy crest Feb 23, 2025, 7:55 AM

#

dry wave I already do that. So far the results are just inferior to Flux, though. Last ti...

flux does fantastic: women, dogs, animecat girls, and fantasy. if that's what you want, it'll do it. you want something else? you're not going to get it to do what you want without a huge battle and probably resorting to breaking it

dry wave Feb 23, 2025, 9:24 AM

#

it's also good with complex prompts

#

but I'm still experimenting. Haven't found the sweet spot for sd3.5 yet

bitter hearth Feb 23, 2025, 9:33 AM

#

it worked better in the official demo on huggingface than in Comfy and I am not sure why

wild adder Feb 24, 2025, 4:10 AM

#

Replace the background of this pair of shoes so that they are sitting on a wooden floor

dusky thistle Feb 25, 2025, 2:23 AM

#

#

SD35M

#

bongsampled RES_3S with pseudoimplicit guidance

stone wasp Feb 25, 2025, 1:09 PM

#

prompt : beans

muted cargo Feb 25, 2025, 2:28 PM

#

prompt : no generation robot available here, check out #artisan-faq

bitter hearth Feb 25, 2025, 5:39 PM

#

I didn't like the subject matter but I liked the technicals a lot
this is rly impressive

#

the way that green guy flies up at the start is good

#

it's hard to make the video models do movement directly towards camera

dusky thistle Feb 27, 2025, 6:40 AM

#

maiden pasture Feb 27, 2025, 10:09 AM

#

circle0624

jagged gate Feb 28, 2025, 1:12 AM

#

tribal thistle Feb 28, 2025, 3:17 AM

#

A long-haired girl leaning on a mailbox, standing on a busy 1940s Shanghai street, with a few pedestrians walking and vendors setting up stalls on both sides, grayscale, high resolution, slightly blurred background.

jagged gate Feb 28, 2025, 3:49 AM

#

past cipher Feb 28, 2025, 3:51 AM

#

jagged gate

Are you making assets for a game? Because a lot of these look like they would fit right in with a puzzle game.

dusky thistle Feb 28, 2025, 7:38 AM

#

dusky thistle Feb 28, 2025, 8:11 AM

#

#

#

#

#

#

#

all bongsampled sd35 medium

dusky thistle Feb 28, 2025, 10:01 AM

#

jagged gate Feb 28, 2025, 12:41 PM

#

#

#

#

granite cove Feb 28, 2025, 4:29 PM

#

bear

#

Original SD3 (left) vs SD3.5 Medium (right), on the "Juggernaut XL Model Card Lady Prompt". 30 steps / DPM++ 2M SGM Uniform / CFG 4.5 / same seed for both. Eyes are focused oddly in the original SD3 one but the overall image is aesthetically way closer to what I'd want
It's a somewhat unfortunate recurring trend I've found after using both for a while now
SD 3.5 Medium is definitely compositionally way more coherent for photographic gens (moreso for non-closeup full body stuff) but it trends far more towards a sort of fake airbrushed look aesthetically than the original SD3 did
A model with OG SD3 aesthetics but 3.5 Medium coherence would be essentially perfect lol

dusky thistle Feb 28, 2025, 4:43 PM

#

dusky thistle Feb 28, 2025, 4:43 PM

#

granite cove Original SD3 (left) vs SD3.5 Medium (right), on the "Juggernaut XL Model Card La...

got examples?

#

proven pecan Feb 28, 2025, 8:25 PM

#

icy drift Feb 28, 2025, 9:50 PM

#

First try. Amazing.

icy drift Feb 28, 2025, 10:41 PM

#

The physics lighting is amazing. Very UFO.

jagged gate Mar 1, 2025, 7:09 AM

#

icy drift Mar 1, 2025, 12:05 PM

#

Tried to make a fox playing with a butterfly, but forgot to change my prompt. "A plasma orb UFO full of lightning hovers slowly above a rustic barn at night. The light from the plasma orb UFO illuminates the scene with a silvery glow. The plasma orb UFO flies away in a flash."
Poor guy. Zapped by a butterfly ufo.

bitter hearth Mar 1, 2025, 12:57 PM

#

icy drift Tried to make a fox playing with a butterfly, but forgot to change my prompt. "A...

lol this is amazing

tidal oasis Mar 1, 2025, 7:55 PM

#

1

ancient plume Mar 2, 2025, 5:51 AM

#

A close-up of a glowing, fiery Sun with bright orange and yellow flames swirling on its surface. Solar flares shooting out, creating a mesmerizing effect. Space in the background with small distant planets visible.

amber nest Mar 2, 2025, 8:49 AM

#

fervent tapir Mar 2, 2025, 1:09 PM

#

A serene and introspective scene of a young adult sitting cross-legged on a cozy bed in a softly lit room. The person is holding a leather-bound notebook in one hand and a pen in the other, deeply focused on writing. Their expression is thoughtful, with a slight smile, as if recalling vivid dreams. The room is warm and inviting, with soft morning light streaming through sheer curtains. A cup of steaming tea sits on a nightstand nearby, and a few books are scattered on the bed. The atmosphere is peaceful and reflective, emphasizing the act of self-discovery and mindfulness. The art style is realistic with soft, dreamy lighting, capturing the quiet beauty of the moment.

queen edge Mar 3, 2025, 1:47 AM

#

Intricate dragon and phoenix embracing a candle flame, traditional Chinese ink painting style, gold and crimson colors, flowing ribbon with company name

devout schooner Mar 3, 2025, 1:51 AM

#

craggy crest Sd 3.5 medium is more artsy

I've assumed that was the case mostly
let me tell you though, from the perspective of someone who has actually very very very extensively tried to train Loras, originally for SD3 a bit and now more recently for SD 3.5 Medium (and still is)
it's not, in fact, "easy to train" in any way shape or form relative to SDXL (or Kolors)
"easy to train" would mean I could mindlessly use the exact same UNET LR 1.0 / TE LR 1.0 / Cosine Scheduler / Prodigy optimizer settings for literally any dataset and they would be 100% guaranteed to produce desirable results every single time without fail no matter what (as is the case for all UNET based models)
and it'd also mean that the extremely annoying exploding gradient thing wouldn't be a problem that existed at all (as it also wasn't in any way for UNET-based models)
TLDR Basically from the perspective of an end user / "finetuner", DiT as a general architecture seems rather flawed in all honesty, as in practice you only notice the numerous blatant downsides (can't do normal hi-res-fix in the way people have come to expect, is limited to very mediocre sampler / scheduler combos, and so on and so on and so on); you do not notice any of the upsides of the architecture that (supposedly) exist joeshrug

craggy crest Mar 3, 2025, 1:53 AM

#

devout schooner I've assumed that was the case mostly let me tell you though, from the perpectiv...

SD 3.5 - not SD 3

#

SD3-2b-medium was released as an unfinished beta - it is missing a lot of the fine tuning that normally goes into a model, and was only released to assure the community that SAI is still interested in being open source

#

SD 3.5 has all the fine tuning, and we worked very hard to make sure it was very easy to train

devout schooner Mar 3, 2025, 1:58 AM

#

craggy crest SD 3.5 has all the fine tuning, and we worked very hard to make sure it was very...

I think you missed my overall point
e.g. left / first is stock Flux Dev, middle / second is stock Kolors, right / third is Kolors with a photo Lora I trained (on the same seed)
the Flux output is arguably significantly less sensible composition wise, and certainly the ONLY place it has any kind of real advantage is in being rendered with a 16-channel VAE
a hypothetical Kolors with a 16-channel VAE would make all versions of Flux and all versions of SD 3 / SD 3.5 look like absolute jokes comparatively speaking in terms of output-quality-to-overall-ease-of-use-and-resource requirements

#

a company that makes a model that in practical terms functions EXACTLY like SDXL, but with a 16-channel VAE and a better text encoder
WILL make a kravillion dollars day one
is all I'm saying

past cipher Mar 3, 2025, 2:01 AM

#

devout schooner a company that makes a model that in practical terms functions EXACTLY like SDXL...

NAIv4 apparently uses a 16-channel VAE, the same one FLUX does

craggy crest Mar 3, 2025, 2:02 AM

#

devout schooner a company that makes a model that in practical terms functions EXACTLY like SDXL...

naw, people are in ruts. if they want something that's exactly like SDXL... they'll just use SDXL

devout schooner Mar 3, 2025, 2:03 AM

#

devout schooner a company that makes a model that in practical terms functions EXACTLY like SDXL...

(additionally you don't want to see the SD 3.5 Medium outputs for this prompt because it's almost completely impossible to not have her fingers be melty noise weirdness)
I like the photographic realism of all versions of SD3, compared to Flux
but the weird, weird noise issues it has even in 3.5 are just super annoying
but again it's not really about SD 3.5 in particular
it's about the supposed advantages of DiT as an actual architecture not actually being visible in any way in any model that anyone has ever released

#

in practical terms, at least

devout schooner Mar 3, 2025, 2:05 AM

#

craggy crest naw, people are in ruts. if they want something that's exactly like SDXL... they...

re-read what I said, I guess, I don't think you got my overall point
which is that ALL DiT models have perceivable downsides relative to UNET models
but no real perceivable upsides
and then less so my point was that (as DiT models go) SD 3 / 3.5 (both) are "explodier" than others
by a lot

craggy crest Mar 3, 2025, 2:07 AM

#

devout schooner re-read what I said, I guess, I don't think you got my overall point which is t...

of course they do. but you said "a model that in practical terms functions EXACTLY like SDXL, but with a 16-channel VAE and a better text encoder, WILL make a kravillion dollars day one" and i'm saying that people are in ruts. those that like sdxl will just use sdxl. and everyone is already getting into a rut with the other shiny toys. by the time someone does that, and no one's likely to now, no one will even look at it

devout schooner Mar 3, 2025, 2:19 AM

#

craggy crest of course they do. but you said "a model that in practical terms functions EXACT...

I agree the community is full of "dragon chasers" who constantly wait for the "next new thing" while basically barely trying new things that are actually released

past cipher Mar 3, 2025, 2:19 AM

#

craggy crest of course they do. but you said "a model that in practical terms functions EXACT...

People are still making models and custom nodes specifically for SD1.x/2.x architecture. People will always look at different models.

devout schooner Mar 3, 2025, 2:23 AM

#

i'm just saying like, from a practical third-party training perspective and inference perspective
absolutely no extant DiT model actually has any architecture-specific advantages that are visible regardless of what advantage they might have on paper
in particular the supposed better support for multi resolution seems like absolute fiction in practice
because the whole image just getting ugly artifacts even if the composition might be perfect, when going outside the trained resolution range, is far less easy to "fix" than just re-rolling a seed if you get like an extra foot or something, and just generally way less preferable as an outcome
and also because you could already just go ahead and train UNET models at whatever res you wanted, even if it was beyond their original training res

#

basically what I meant was, assuming people WOULD actually use it, hypothetically, the practical manner in which SDXL functioned from an inference and training perspective was completely perfect
so the "perfect model" would in theory be one that was not in any way different in those regards
but just had a better VAE and stronger text encoder

#

the officially supported samplers for DiT are a notable pain point too
absolutely nobody would ever use Euler SGM Uniform or DPM++ 2M SGM Uniform if they didn't have to
because they're just really not very good in comparison to e.g DPM++ SDE Normal or DPM++ 3M SDE Exponential or what have you

#

ComfyUI hacks to make the Ancestral ones work help a lot in that regard but it's still not perfect

#

so that just again seems like a design flaw

dusky thistle Mar 3, 2025, 3:57 AM

#

dusky thistle Mar 3, 2025, 3:57 AM

#

devout schooner ComfyUI hacks to make the Ancestral ones work help a lot in that regard but it's...

try bongsampling

#

https://github.com/ClownsharkBatwing/RES4LYF


#

#

#

SD35 medium

#

#

#

devout schooner Mar 3, 2025, 4:24 AM

#

dusky thistle try bongsampling

I'll look at it
how's speed?

dusky thistle Mar 3, 2025, 4:26 AM

#

devout schooner I'll look at it how's speed?

Anything in the multistep menu (res_2m, 3m, deis etc) runs at the same speed as euler

#

Res_2s is the same speed as the old dpmpp_sde and is pretty special with bongmath on

#

#

#

#

devout schooner Mar 3, 2025, 4:49 AM

#

craggy crest of course they do. but you said "a model that in practical terms functions EXACT...

one other thing I should mention (again, as one of the few people I think who has actually very painstakingly trained the same datasets over and over and over again on both the original SD3 and SD 3.5 Medium just as I'm the sort of person who actually enjoys fiddling with this sort of thing)
they have BIG issues picking up the likeness of single subjects with any remotely obvious training settings
which is what most people are going to check first

it's not impossible to get good results, but you're literally basically limited to the CAME optimizer (AdamW and Prodigy seem to be total dead ends for single subjects for reasons that aren't really clear to me at the moment)
and also training as Dora instead of Lora (at a low "factor", no higher than 2 - 4) with 64 Dim / 32 Alpha is pretty much a necessity in particular for photographic datasets as far as single subjects go
and lastly due to the annoyingly-rigid-and-not-actually-better-in-any-visible-way way that resolution works in DiT models, to avoid artifacting you kinda have to (I'm referencing SD 3.5 Medium again specifically, here) train at a BASE resolution of 1440x1440 with images that are all equal to or higher than that resolution in the first place and bucketing enabled to sort them properly
simply to avoid severe degradation of base model knowledge

figuring out literally all of that by myself by training the same lora about a zillion times over was the only way I was eventually able to get this pretty accurate Sydney Sweeney likeness, for example

the overwhelming majority of people will never go to the lengths I did, they will just immediately throw a model in the garbage if it doesn't perfectly and predictably learn the likeness of XYZ single-subject with very obvious default settings with absolutely no potential for "exploding gradient" whatsoever (as is the case for all UNET based models)

so that's again what I really meant by "easy to train", no DiT model comes anywhere remotely close in that regard (not even Flux, because degradation of base model knowledge is still a huge issue there and you also typically need about 2x as many steps as any UNET model did to get good results)
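For anyone who wants the recipe above in one place, here is a rough sketch of it as a settings object. The field names are made up for illustration and are not the config keys of any particular trainer. The `alpha / dim` scale is the usual LoRA convention, and the image filter assumes "equal to or higher than that resolution" means total pixel area, which is a guess at what was meant.

```python
from dataclasses import dataclass


@dataclass
class LoraRecipe:
    # Field names are hypothetical, not any trainer's real config keys.
    optimizer: str = "CAME"
    network_type: str = "DoRA"
    dim: int = 64
    alpha: int = 32
    base_resolution: int = 1440
    bucketing: bool = True

    @property
    def scale(self) -> float:
        # Usual LoRA convention: the learned update is scaled by alpha / dim.
        return self.alpha / self.dim


def usable_images(sizes, base_resolution):
    """Keep (width, height) pairs with at least base_resolution^2 pixels.

    Assumption: "equal or higher" is judged by pixel area, not per side.
    """
    min_pixels = base_resolution * base_resolution
    return [(w, h) for (w, h) in sizes if w * h >= min_pixels]


recipe = LoraRecipe()
print(recipe.scale)
print(usable_images([(1440, 1440), (1024, 1024), (1920, 1280)],
                    recipe.base_resolution))
```

With 64 Dim / 32 Alpha the update scale works out to 0.5, and the 1024x1024 image is the only one of the three that gets dropped.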

devout schooner Mar 3, 2025, 4:50 AM

#

dusky thistle Anything in the multistep menu (res_2m, 3m, deis etc) runs at the same speed as ...

nice, i'll have a look
cool gens btw Prayge

past cipher Mar 3, 2025, 5:02 AM

#

dusky thistle https://github.com/ClownsharkBatwing/RES4LYF

Currently have RK-Sampler, so I'll give this a shot too. Thanks for the link.

dusky thistle Mar 3, 2025, 5:39 AM

#

guide image

#

output (WF embedded)

#

#

they will take longer as you go from stuff like 2s to 3s to 4s, but the quality will sometimes go up spectacularly

#

with medium i like using stuff like res_3s and res_5s

#

adding a bongmath implicit step will make it take longer but can also really improve things

#

#

#

dusky thistle Mar 3, 2025, 6:10 AM

#

#

bitter hearth Mar 3, 2025, 7:24 AM

#

devout schooner one other thing I should mention (again, as one of the few people I think who ha...

I totally agree with most of what you are saying
giant 16 channel Kolors would be amazing
its easier said than done though
a lot of the recent models are designed around what is easy to train rather than what is good to use once it is finished

#

and so models are being made to handle less and less variance over time

#

this does allow them to train easier and get bigger, but then when you use them the sampling is more restricted

dusky thistle Mar 3, 2025, 7:35 AM

#

#

fallen marsh Mar 3, 2025, 10:36 AM

#

👍

dapper stream Mar 3, 2025, 1:50 PM

#

/generate

violet escarp Mar 3, 2025, 8:37 PM

#

craggy crest of course they do. but you said "a model that in practical terms functions EXACT...

y'all are blaming the wrong thing. It's a problem with the 16-channel VAE. There are actually downsides to using it, with the most notable being that it bloats the needed parameter count for the model to learn effectively. It's why Flux is so big. It's why sd3 medium learns so slowly and has mangled anatomy.

gusty trail Mar 3, 2025, 9:05 PM

#

violet escarp y'all are blaming the wrong thing. It's a problem with the 16-channel VAE. There...

lumina 2.0 also uses the flux 16 channel vae and it is 2.6B

violet escarp Mar 3, 2025, 9:09 PM

#

and it also learns slowly. It's not as bad as sd3 though since it uses more efficient arch

#

Also more efficient than Flux, but better arch isn't enough to make up for Lumina's size

#

I heard Lumina team is going to release a bigger version though eventually

#

The efficient arch being that it doesn't use fused transformers which sd3 and flux do use btw

#

Flux also wastes 3b on encoding timestep embedding

dry wave Mar 3, 2025, 9:17 PM

#

VAE channel count has nothing to do with model size

#

you only have it in the input and output, that doesn't matter
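A quick back-of-envelope check of the "only in the input and output" point. This sketch counts the parameters of a linear patch embedding and the matching output projection for 4 vs 16 latent channels. The patch size of 2 and hidden size of 1536 are illustrative assumptions, not values taken from any model card, and real models may structure these layers differently.

```python
def io_params(latent_channels, patch_size, hidden_size):
    """Parameters in the patch-embed and un-embed projections only."""
    patch_dim = latent_channels * patch_size * patch_size
    embed = patch_dim * hidden_size + hidden_size    # weights + bias, latent -> hidden
    unembed = hidden_size * patch_dim + patch_dim    # weights + bias, hidden -> latent
    return embed + unembed


four_ch = io_params(4, patch_size=2, hidden_size=1536)
sixteen_ch = io_params(16, patch_size=2, hidden_size=1536)
print(four_ch, sixteen_ch, sixteen_ch - four_ch)
```

Under those assumptions the difference is on the order of 0.15M parameters, which is negligible next to a multi-billion-parameter backbone. It says nothing about whether a 16-channel latent is harder to learn, which is the separate question raised above.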

#

it might be true that training on a larger vae takes more time, though, as it preserves more fine details which are often hard to learn. But I don't think that this is the reason why models take more time to train

#

I mean, the main reason why Flux is taking so much time to train is probably that it is not a CFG model

devout schooner Mar 3, 2025, 11:44 PM

#

dusky thistle

gonna try these now
would you say CFG for the SDE-alike ones kinda "scales" in the same way it originally did? Like for example I would usually run DPM++ SDE GPU Normal at around CFG 5.0
or DPM++ 3M SDE GPU Exponential at around CFG 4.0
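For context on what the CFG number does: it is the multiplier in the standard classifier-free guidance mix of the unconditional and conditional predictions. This is the textbook formula written over plain lists, not code from ComfyUI or RES4LYF, and samplers may apply extra scaling around it.

```python
def cfg_combine(uncond, cond, scale):
    """Classifier-free guidance: push the prediction away from uncond toward cond."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


# scale = 1.0 gives back the conditional prediction unchanged
print(cfg_combine([0.0, 1.0], [1.0, 1.0], 1.0))
# a higher scale exaggerates whatever differs between the two predictions
print(cfg_combine([0.0, 1.0], [1.0, 1.0], 5.0))
```

So going from CFG 4.0 to 5.0 is a linear change in how hard the prompt difference is amplified. Whether a given sampler tolerates that is an empirical question, which is what is being asked here.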

devout schooner Mar 3, 2025, 11:50 PM

#

devout schooner gonna try these now would you say CFG for the SDE-alike ones kinda "scales" in t...

started with CFG 5.0 with 2S
latent preview looks pretty good so far
I'd say it's actually slightly faster than an SDXL gen with DPM++ SDE at the same resolution / step count FWIW
using 3.5 Medium
at least on my machine

dusky thistle Mar 3, 2025, 11:51 PM

#

yeah should be fine

#

i usually do cfg around 5.5 with medium

#

devout schooner Mar 4, 2025, 12:06 AM

#

bitter hearth I totally agree with most of what you are saying giant 16 channel Kolors would b...

I guess the gist of my point was again it doesn't really seem in practice like any existing newer model is "better" specifically because of being DiT and having XYZ more parameters than any given older UNET model
the improved text encoders and higher quality VAEs seem to do the overwhelming majority of the heavy lifting
and then there's various factors that come off as straight-up regressions in practice with DiT
like the whole "the image just immediately begins to artifact randomly when you go outside the training range" thing
so if your max is just 2MP like on SD 3.5 Medium you kind of have to train loras at that resolution to begin with just to have at least some hi-res-fix headroom when coming up from a generation at a more standard lower res (because the artifacting problem doesn't happen in reverse, e.g. it can scale down fine seemingly, just not up)
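The headroom argument is easy to put numbers on. This sketch treats the model's usable range as a simple pixel budget, which is a simplification, and takes 1440x1440 as that budget for SD 3.5 Medium because that is the training resolution mentioned earlier in the conversation.

```python
import math


def megapixels(width, height):
    return width * height / 1_000_000


def max_upscale(width, height, max_pixels):
    """Largest uniform scale factor that keeps the image within max_pixels."""
    return math.sqrt(max_pixels / (width * height))


budget = 1440 * 1440
print(megapixels(1440, 1440))
print(max_upscale(1024, 1024, budget))
```

Starting from a 1024x1024 generation, that budget leaves room for roughly a 1.4x hi-res-fix pass and no more, which is why a lora trained only at around 1MP has nowhere to go.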

devout schooner Mar 4, 2025, 12:26 AM

#

BongSample 2S definitely very nice

#

is there any particular one you recommend using for half-strength-denoise hi-res-fix passes? E.G. typically I would tend to do DPM++ 2M Simple at 0.5 denoise strength and the same CFG of like 5.0, for hi-res-fix on an image generated with DPM++ SDE GPU Normal

devout schooner Mar 4, 2025, 12:54 AM

#

devout schooner BongSample 2S definitely very nice

multistep 2M seems to work actually for hi-res-fix, sort of the same way
similar result (arguably better even) for the second pass with it, but a bit faster

dusky thistle Mar 4, 2025, 1:56 AM

#

yeah multistep is pretty good for when you want the image to stay similar

#

you might like some of the guide stuff too, that can help with that

#

i made a little summary of some of the new functionality here

📎 intro_to_clownsampling.json

#

includes ultracascade too so you might see a couple red nodes on the left side (https://github.com/ClownsharkBatwing/UltraCascade)


dusky thistle Mar 4, 2025, 2:15 AM

#

#

#

#

#

#

#

#

dusky thistle Mar 4, 2025, 3:36 AM

#

#

dusky thistle Mar 4, 2025, 3:57 AM

#

bitter hearth Mar 4, 2025, 4:47 AM

#

devout schooner I guess the gist of my point was again it doesn't really seem in practice like a...

tricky, because using a strong VAE well likely requires a stronger model
DiT scales better with data and compute, and DiT has better self-attention

#

some of the issues in this conversation were more to do with rectified flow loss, and it's possible to have DiTs that don't have that

#

e.g. Pixart Sigma or Flag DiT

devout schooner Mar 4, 2025, 4:49 AM

#

dusky thistle i made a little summary of some of the new functionality here

nice, i'll have a look, thanks

bitter hearth Mar 4, 2025, 4:49 AM

#

I wish big Pixart came but it did not come out

#

Pixart team became Sana

devout schooner Mar 4, 2025, 4:50 AM

#

devout schooner nice, i'll have a look, thanks

i'm assuming "steps_to_run" isn't important right, for the beta sampler, just leave it at -1?

devout schooner Mar 4, 2025, 4:51 AM

#

bitter hearth Pixart team became Sana

I don't think I understand the point of the SANA VAE, it's much slower and more resource intensive than the Flux / SD3 ones but also worse in quality to my eye

bitter hearth Mar 4, 2025, 4:52 AM

#

its worse quality but its a lot faster and less resource intensive

devout schooner Mar 4, 2025, 4:52 AM

#

bitter hearth its worse quality but its a lot faster and less resource intensive

it's not though, it's like a 1.5GB file

#

it uses more memory and the encodes / decodes are a lot slower

bitter hearth Mar 4, 2025, 4:54 AM

#

are you including the diffusion time

#

its over 100x faster than flux

devout schooner Mar 4, 2025, 4:57 AM

#

bitter hearth are you including the diffusion time

i'm talking about like
in ComfyUI
just the actual like, "write image to file" decode, when the image is done
was WAY slower with Sana when I tried it
than the same kind of decode with the 16-channel SD3 or Flux VAE is
and the initial load is a bit longer too of course because like I said the physical file is much bigger than the SD3 / Flux VAE files

bitter hearth Mar 4, 2025, 5:01 AM

#

oh I see

#

yeah if you don't include the diffusion time or the diffusion model vram then Sana is slower and more vram-heavy

#

to just decode a latent with the vae

devout schooner Mar 4, 2025, 5:05 AM

#

bitter hearth to just decode a latent with the vae

yeah that's what I meant
the inference was pretty fast
but the VAE by itself in a vacuum is very slow

bitter hearth Mar 4, 2025, 5:05 AM

#

there are some niche areas of machine learning where you only use a VAE

#

so yeah for those it could be worse

past cipher Mar 4, 2025, 5:06 AM

#

bitter hearth there are some niche areas of machine learning where you only use a VAE

Just say "image classifier" next time.

bitter hearth Mar 4, 2025, 5:06 AM

#

LOL

#

I read that some people use VAE encode/decode to store images

#

is a cool idea

#

although if you were gonna do that, the greater compression ratio of Sana might be good
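A rough comparison of how many numbers each latent holds for the same image. The figures assumed here are 8x spatial downscaling with 16 channels for the SD3 / Flux style VAE, and 32x with 32 channels for Sana's autoencoder. Treat both as assumptions to check against the model cards. This counts raw values only and ignores dtype, quantization, and reconstruction quality.

```python
def latent_values(width, height, downscale, channels):
    """Number of scalar values in the latent for a width x height image."""
    return (width // downscale) * (height // downscale) * channels


def compression_ratio(width, height, downscale, channels):
    """RGB pixel values divided by latent values."""
    return (width * height * 3) / latent_values(width, height, downscale, channels)


print(compression_ratio(1024, 1024, downscale=8, channels=16))   # SD3 / Flux style
print(compression_ratio(1024, 1024, downscale=32, channels=32))  # Sana style
```

Under those assumptions a 1024x1024 image compresses 12x in the first case and 96x in the second, so the Sana latent is 8x smaller, at whatever cost in decode quality and speed.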

devout schooner Mar 4, 2025, 5:09 AM

#

bitter hearth so yeah for those it could be worse

I'd have to recheck, but it didn't seem like the overall experience was much faster than SD 3.5 Medium
possibly just due to Gemma
maybe with a GGUF it'd be better

bitter hearth Mar 4, 2025, 5:10 AM

#

Gemma is faster than T5 though

devout schooner Mar 4, 2025, 5:11 AM

#

bitter hearth Gemma is faster than T5 though

whatever I had to use when I tried was not faster to load / encode in ComfyUI than the Q8_0 quant of T5 is

#

it could be a node code issue

#

not sure really

bitter hearth Mar 4, 2025, 5:12 AM

#

oh a Q8_0 quant of T5 encoder is indeed slightly smaller than Gemma encoder apparently

devout schooner Mar 4, 2025, 5:14 AM

#

again it could be that the "ExtraModels" code for "Gemma Loader" is just slow in some way relative to the City96 GGUF loader also
I don't know

bitter hearth Mar 4, 2025, 5:15 AM

#

its normal for Q8_0 T5 to be smaller than Gemma

#

its just that you are comparing a quant version to an unquant version

#

which is not a fair comparison

devout schooner Mar 4, 2025, 5:16 AM

#

bitter hearth its just that you are comparing a quant version to an unquant version

yeah that's why I said maybe a quant would help, initially

#

but there wasn't one that worked with that whole ExtraModels node system I don't think

bitter hearth Mar 4, 2025, 5:21 AM

#

I had a look and there are two other implementations of Sana

#

the official one, and one by the SVDQuant team, both are in Diffusers though

devout schooner Mar 4, 2025, 5:27 AM

#

in other news it seems like Clownshark stuff makes Camera Lady prompt work a lot better, so far 👀

#

in SD 3.5 Medium that is

#

normally her fingers kinda just melt

#

SLG helps but it's usually too high contrast for this one for some reason
so getting a good result without it is pretty cool

bitter hearth Mar 4, 2025, 5:29 AM

#

yeah the noise scaling is better

#

it makes SD 3.5 and flux look nicer

devout schooner Mar 4, 2025, 6:06 AM

#

bitter hearth it makes SD 3.5 and flux look nicer

came out great
this is why I'm hesitant to give up on SD 3.5 Med and keep fiddling with it though lol
it's technically capable of like perfect photo gens
they're just tricky to get out of it

gusty trail Mar 4, 2025, 6:29 AM

#

https://huggingface.co/THUDM/CogView4-6B


#

new 6B model with 16c vae and apache license

past cipher Mar 4, 2025, 6:42 AM

#

gusty trail https://huggingface.co/THUDM/CogView4-6B

No, just no. The requirements are too high

| Resolution | enable_model_cpu_offload OFF | enable_model_cpu_offload ON | enable_model_cpu_offload ON, Text Encoder 4bit |
| --- | --- | --- | --- |
| 512 * 512 | 33GB | 20GB | 13GB |
| 1280 * 720 | 35GB | 20GB | 13GB |
| 1024 * 1024 | 35GB | 20GB | 13GB |
| 1920 * 1280 | 39GB | 20GB | 14GB |
| 2048 * 2048 | 43GB | 21GB | 14GB |

bitter hearth Mar 4, 2025, 6:47 AM

#

you could use cloud

#

my first CogView4 image

devout schooner Mar 4, 2025, 6:50 AM

#

gusty trail https://huggingface.co/THUDM/CogView4-6B

why do these new models keep comparing themselves to SD3 Medium
instead of 3.5 Medium
seems kinda sus

past cipher Mar 4, 2025, 6:52 AM

#

devout schooner why do these new models keep comparing themselves to SD3 Medium instead of 3.5 M...

They also don't compare VRAM usage at specific resolution sizes. That's my issue with most of them.

devout schooner Mar 4, 2025, 6:54 AM

#

bitter hearth my first CogView4 image

what was the prompt for this one

bitter hearth Mar 4, 2025, 6:55 AM

#

devout schooner what was the prompt for this one

Cinematic movie still of a majestic dragon resting in a dense, misty forest. The dragon’s scales glisten under soft, diffused light filtering through towering ancient trees. Mist swirls around its massive wings, and glowing embers float in the air, hinting at its fiery breath. The scene is captured in a dramatic wide-angle shot with rich cinematic lighting, deep shadows, and a shallow depth of field, evoking a sense of awe and realism. Ultra-detailed textures, realistic foliage, and a filmic color palette enhance the immersive atmosphere.
but I pressed prompt enhance button as well

past cipher Mar 4, 2025, 6:57 AM

#

bitter hearth ```Cinematic movie still of a majestic dragon resting in a dense, misty forest. ...

https://tenor.com/view/sml-bowser-junior-alright-thanks-for-the-idea-thanks-for-the-idea-thank-you-for-the-suggestion-gif-23934441


bitter hearth Mar 4, 2025, 6:57 AM

#

you can calculate VRAM estimates btw

#

they won't be 100% accurate but

#

VRAM use is highly linked to the parameter count
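A minimal version of that estimate: parameter count times bytes per parameter gives the memory for the weights alone. The helper name is made up for illustration. Activations, the text encoder, the VAE, and framework overhead all come on top, so read the result as a floor rather than a prediction.

```python
def weight_gib(params_billion, bytes_per_param):
    """Approximate memory for model weights only, in GiB."""
    return params_billion * 1e9 * bytes_per_param / 2**30


print(round(weight_gib(6, 2), 2))    # 6B parameters in bf16 / fp16
print(round(weight_gib(6, 1), 2))    # 8-bit weights
print(round(weight_gib(6, 0.5), 2))  # 4-bit weights
```

For a 6B diffusion model that is roughly 11 GiB at 16-bit precision, before counting anything else in the pipeline.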

devout schooner Mar 4, 2025, 7:04 AM

#

I got this for a quick 25-step gen with SD 3.5 Medium, just using Euler Ancestral Beta at CFG 6.5, no fancy Clownshark stuff

bitter hearth Mar 4, 2025, 7:09 AM

#

its nice yeah

devout schooner Mar 4, 2025, 7:11 AM

#

bitter hearth its nice yeah

it's not really a great prompt to test I don't think, neither your gen nor mine is really "better" than this one I just did with Base SDXL and two loras lol

bitter hearth Mar 4, 2025, 7:12 AM

#

I know I don't rly know why you wanted it lol

devout schooner Mar 4, 2025, 7:12 AM

#

bitter hearth I know I don't rly know why you wanted it lol

just thought I'd try it I guess

#

try camera lady on it a close-up photograph of a young woman holding a vintage camera in front of her face. She is looking directly at the viewer with a serious expression on her face, as if she is taking a photo. The camera is silver in color and has a large lens attached to it. The woman has long dark hair and is wearing a black top. The background is blurred, so the focus is on the camera and the woman's face. The lighting is soft and natural, highlighting her features.

devout schooner Mar 4, 2025, 7:16 AM

#

devout schooner it's not really a great prompt to test I don't think, neither your or my gen are...

I think Kolors "wins" actually
as is not uncommon I find lol

bitter hearth Mar 4, 2025, 7:16 AM

#

the demo broke for me sadly

#

it just keeps going 120 seconds plus

#

it's a common bug with Gradio demos

gusty trail Mar 4, 2025, 7:21 AM

#

Kolors is underrated.

devout schooner Mar 4, 2025, 7:24 AM

#

gusty trail Kolors is underrated.

yeah
great lora results from it too
whoever that guy was that said it "needed" Chinese prompting was just wrong
I captioned this Lora only in English (as I don't know how to read, write, or speak Chinese lol) with JoyCaption and it worked great:
https://civitai.com/models/1204546/zoots-flux-pro-ultrafier-for-kolors

gusty trail Mar 4, 2025, 7:28 AM

#

I trained this model with all English captioning. https://civitai.com/models/602580/kolors-openkolors-v24-multiple-style-general-kolors-model It is able to produce decent Chinese understanding with fine-tuning in English. The embedding space between Chinese and English is different. Even if you mess up the English captions, it still produces good Chinese prompt adherence. (I found that out when I trained with mismatched color prompts in English.)

devout schooner Mar 4, 2025, 7:29 AM

#

gusty trail I trained this model with all english captioning. https://civitai.com/models/602...

yeah that's what I figured

#

it'd have to be kind of separate like that

#

my Lora also doesn't have a text encoder component at all anyways since I wasn't going to try to train ChatGLM obviously, it's just the UNET part
which seems to be all you need

#

I actually don't think a ComfyUI node that could load a Kolors Lora with some sort of ChatGLM part even exists lol
nor am I sure you even could train a Lora like that

#

now that I think about it

bitter hearth Mar 4, 2025, 7:48 AM

#

I would still recommend chinese prompting but I don't think you need it yeah

#

wow OpenKolors looks great, thanks for this

gusty trail Mar 4, 2025, 7:50 AM

#

Personally, I would say it is much better than the official one in most cases.

bitter hearth Mar 4, 2025, 7:51 AM

#

yeah for general use its better

#

I really love the base Kolors style but base Kolors is artistic

#

OpenKolors looks better for photography

devout schooner Mar 4, 2025, 8:26 AM

#

bitter hearth OpenKolors looks better for photography

I really should release my Kolors photo Lora, I keep meaning to lol
it's pretty good IMO
a photograph of a woman standing at a formal event. She is a young woman with a light skin tone, striking green eyes, and a slender, athletic build. She has blonde hair styled in loose, wavy layers that fall just below her shoulders, with a subtle, elegant updo at the back. She has a natural, glowing complexion and wears a sophisticated makeup look featuring a nude lipstick, subtle blush, and well-defined eyebrows. She is dressed in a luxurious, off-the-shoulder, white satin gown with a deep V-neckline, accentuating her cleavage. The gown is made of a silky, shimmering fabric that catches the light, giving it a radiant appearance. She wears an elaborate, multi-layered necklace adorned with large, sparkling diamonds that cascade down her chest, complementing the neckline of her dress. The necklace is paired with matching diamond earrings. The background is a vibrant red wall with abstract geometric patterns in shades of red and white, creating a bold, modern aesthetic.
base Kolors is first one, Lora @ 0.7 strength is second one
same seed and everything
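For readers wondering what "Lora @ 0.7 strength" changes numerically: in the usual formulation the low-rank update is added to each base weight matrix, scaled by `strength * alpha / rank`. The sketch below does that on tiny nested lists. It is the generic LoRA merge and not the code path of any specific ComfyUI loader, which may handle DoRA or other variants differently.

```python
def apply_lora(weight, down, up, alpha, strength):
    """Return weight + strength * (alpha / rank) * (up @ down).

    weight: out_features x in_features
    down:   rank x in_features
    up:     out_features x rank
    """
    rank = len(down)
    scale = strength * alpha / rank
    merged = [row[:] for row in weight]  # copy so the base weights stay untouched
    for i in range(len(up)):
        for j in range(len(down[0])):
            delta = sum(up[i][r] * down[r][j] for r in range(rank))
            merged[i][j] += scale * delta
    return merged


base = [[1.0, 0.0], [0.0, 1.0]]
down = [[1.0, 2.0]]      # rank 1
up = [[1.0], [0.0]]
print(apply_lora(base, down, up, alpha=1, strength=0.5))
```

At strength 0 the base weights come back unchanged, and the update grows linearly with strength, so 0.7 simply applies 70% of what the lora learned.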

bitter hearth Mar 4, 2025, 8:29 AM

#

its more realistic yeah

devout schooner Mar 4, 2025, 8:30 AM

#

bitter hearth its more realistic yeah

gets the prompt way more accurately overall too

#

this is with Lora but the prompt translated to Chinese
which interestingly I guess kinda still works
even though the Lora is not captioned in Chinese
not quite as good though, it misses e.g. She wears an elaborate, multi-layered necklace adorned with large, sparkling diamonds that cascade down her chest, complementing the neckline of her dress

bitter hearth Mar 4, 2025, 8:33 AM

#

that helps as well yeah if the prompt adherence improves

devout schooner Mar 4, 2025, 8:35 AM

#

bitter hearth that helps as well yeah if the prompt adherence improves

I mean I wasn't going to bother Google Translating all of my made-with-Joy-Caption-and-then-manually-edited captions lol
since I wouldn't know if the Chinese caption result was even good
so seemed to make more sense just to do it in English and improve the (already pretty good overall) English support in the model

bitter hearth Mar 4, 2025, 8:37 AM

#

it might take a lot of compute though for the english side to catch up

#

is the issue

devout schooner Mar 4, 2025, 8:40 AM

#

bitter hearth it might take a lot of compute though for the english side to catch up

I mean this 1000-image Lora was enough to make it like, very significantly better than pretty much any realistic SDXL model in terms of prompt adherence
I was even able to teach it ~2 full on NSFW concepts within that 1000 images lol

#

the results for photographic stuff specifically aren't really much if any better on stock Kolors when translated to Chinese anyways, when I've tested that before

bitter hearth Mar 4, 2025, 8:46 AM

#

ah okay I had read the chinese side was better but maybe it is not

devout schooner Mar 4, 2025, 8:50 AM

#

bitter hearth ah okay I had read the chinese side was better but maybe it is not

it seems to be more important for sort of smaller things
like I guess Kolors is supposed to be able to do some text output but as far as I can tell it works much much better in Chinese
so things like that

#

another one lol
a high-resolution photograph featuring a young woman of Asian descent with a radiant smile, standing in front of a classic white Porsche sports car. She is wearing a shimmering silver bikini top that accentuates her medium-sized breasts and white, frayed denim shorts that are unbuttoned, revealing her toned abdomen. Her long, wavy black hair cascades over her shoulders, and she accessorizes with a delicate necklace, a wristwatch, and several bracelets on her left wrist. In the foreground, she is pointing a black handgun directly at the camera, creating a sense of excitement and boldness. The background features a clear blue sky, tall palm trees, and a desert landscape, suggesting a warm, sunny location, possibly California or Arizona. The car's sleek, glossy surface reflects the bright sunlight, adding to the vividness of the scene. The overall mood is playful and adventurous, with the woman exuding confidence and a sense of fun. The image captures a moment of high energy and boldness, blending elements of fashion, adventure, and a classic car aesthetic.

gusty trail Mar 4, 2025, 8:58 AM

#

devout schooner it seems to be more important for sort of smaller things like I guess Kolors is ...

It is the first open source model able to produce Chinese characters. It is not very good at generating English characters

devout schooner Mar 4, 2025, 9:15 AM

#

devout schooner another one lol `a high-resolution photograph featuring a young woman of Asian d...

this one came out even better lol
Kolors is like the Flux of UNET models for hands, generally
it doesn't mess them up very often
even without this lora

devout schooner Mar 4, 2025, 9:31 AM

#

gusty trail It is the first open source model able to produce chinese characters. It is not ...

I bet it could learn English text generation pretty well though since the text encoder is already a lot better
maybe I'll try a Lora for that sometime too
even on SDXL it sort of works to just have a bunch of images where the text is all accurately described, in a Lora, to teach it to generate at least short stuff

gusty trail Mar 4, 2025, 9:32 AM

#

Of course it could

#

I am just more focused on Chinese characters

dry wave Mar 4, 2025, 10:08 AM

#

devout schooner I guess the gist of my point was again it doesn't really seem in practice like a...

In general I agree with you. I always said the unet architecture is not as bad as many people think and I had a lot of discussions with people who didn't understand that a dit architecture is just simpler but not fundamentally different from unet. I do think that Flux sometimes has an amazing understanding and logic in its generation so maybe there is something in the mmdit, though. This becomes apparent if you let it generate multi-part images ("give me a technical sketch of a building on the left and the very same building as photography on the right").
Regarding resolution: SD 3.5 just has a very shitty resolution handling. I wouldn't say that resolution is a general problem, it's just SD.

#

but I think the problem is: we never have fair benchmarks where different architectures are compared on exactly the same training data. So it's never clear if model A is better because of its architecture or its training data or its parameter size

bitter hearth Mar 4, 2025, 10:52 AM

#

the 1k version of Sana removed the positional embeddings and it went ok

#

positional embeddings seem to be the main unet dit difference

#

they added them back in for the 2k to 4k sana though

dry wave Mar 4, 2025, 11:39 AM

#

yes. But I think positional embeddings make total sense. Why let the unet learn them itself?

bitter hearth Mar 4, 2025, 1:00 PM

#

I agree, I don't use Sana, I wish Sana worked because I care mostly about speed

#

but I can't get adequate quality out of it

#

there are some resolution flexibility advantages from not having pos embeds

#

but I don't think that matters much because Flux goes to 8k without tiling (e.g. in the CLEAR paper)

#

there was some fast 3x3 conv model on arxiv last year but its like SD 1.5 quality at best

dry wave Mar 4, 2025, 2:09 PM

#

bitter hearth there are some resolution flexibility advantages from not having pos embeds

actually, resolution is the biggest disadvantage of conv

#

like you can usually increase resolution in pure transformer models without everything falling apart (see Pixart for example)

#

while convolution models get a lot of artifacts when increasing resolution (double heads, elongated necks and so on)

candid latch Mar 4, 2025, 2:41 PM

#

How do I use this? Good morning xd

bitter hearth Mar 4, 2025, 3:37 PM

#

dry wave like you can usually increase resolution in pure transformer models without ever...

when it goes well yeah but then you get models like SD3.5L that are way less resolution flexible than SDXL for example