#🆕|sd3

pseudo owl
#

You can just input multiple images and they can understand them (you can ask questions or give instructions). They support video understanding, and I believe oryx 1.5 and llava onevision support 3D understanding too.

I think MiniCPM v2.6 is the only one supported by ollama so far, and I prefer that for vision understanding. They are around 7b-8b in size. I forgot that there's also Qwen2 vl, which is pretty great (supports img, multi-img, and vids as long as 20 min).

toxic bone
#

was using 3.1 8b and some extension

pseudo owl
#

Now there are even open models that can natively generate speech output and understand speech.

toxic bone
cursive frigate
mortal mesa
#

i don't find myself doing things with vision models, but way back when i did i used Moondream. is that not even on the radar anymore? does the newer stuff eclipse it?

toxic bone
#

there's a weird thing you can do with flux and sd3. use one of these vision models to write a long detailed prompt describing something that was generated, then give that back as a prompt and get a very similar image

#

i used florence large for that a few times

pseudo owl
cursive frigate
toxic bone
cursive frigate
#

I made a section for just feeding the prompt LoRA keywords, because AI sucks at keeping that text exactly the way it needs to be to trigger LoRAs. There's another section where I can give it some basic instructions for the image and style I want. It also analyses the image and provides a prompt, and then I have it combine all of that to generate the image. The results have been really good.

#

If you want the workflow let me know and I will upload an image with it baked in.
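
The idea described above (keep LoRA trigger words out of the LLM's hands and only merge them in at the end) can be sketched in a few lines. This is a minimal illustration, not the actual ComfyUI workflow being offered; `build_prompt`, its parameters, and the trigger word `silhu_style` are all hypothetical names made up for the example.

```python
def build_prompt(lora_triggers, style_notes, image_caption):
    """Join the three sections into one prompt.

    lora_triggers are appended verbatim and never passed through an LLM,
    so their spelling stays exactly as the LoRA expects.
    """
    parts = [image_caption.strip(), style_notes.strip()]
    body = ", ".join(p for p in parts if p)
    triggers = ", ".join(t for t in lora_triggers if t)
    return f"{body}, {triggers}" if body and triggers else body or triggers


prompt = build_prompt(
    lora_triggers=["silhu_style"],  # hypothetical trigger word
    style_notes="flat silhouette illustration, warm palette",
    image_caption="a lone duck flying over autumn water at sunset",
)
print(prompt)
```

Only the caption and style notes would come from a vision model or LLM; the trigger list is plain text that the user types once and the code never rewrites.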

toxic bone
#

actually loaded fooocus today for a bit

cunning lintel
cursive frigate
toxic bone
#

i got time off then

cursive frigate
bitter hearth
#

I've had a theory for about 6 months now that if we can get good enough captioning models then we can fix tiled upscale

#

cos the reason tiled upscale gets duplicated subjects is that we are using the same prompt across all the tiles

bitter hearth
#

that's the usual explanation yeah

#

like imagine if it was just a photo of a cat on a rug

#

when generating the initial image every tile has a cat in it, cos there is only 1 tile

#

but imagine if we are at 4x upscale, there are 16 tiles, and many don't contain a cat yet the prompt says cat
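
A rough sketch of the per-tile captioning idea from the messages above: split the image into a grid and give each tile its own prompt instead of reusing the global one. It assumes a 4x4 grid on a 1024x1024 image, and `fake_captioner` is a toy stand-in for a real vision model, so treat every name here as hypothetical.

```python
def tile_boxes(width, height, tiles_per_side):
    """Return (left, top, right, bottom) boxes covering the image in a grid."""
    boxes = []
    for row in range(tiles_per_side):
        for col in range(tiles_per_side):
            left = col * width // tiles_per_side
            top = row * height // tiles_per_side
            right = (col + 1) * width // tiles_per_side
            bottom = (row + 1) * height // tiles_per_side
            boxes.append((left, top, right, bottom))
    return boxes


def per_tile_prompts(boxes, caption_tile, global_style=""):
    """Build one prompt per tile from that tile's own caption.

    caption_tile is a hypothetical callable (box -> str) standing in for a
    vision model run on the cropped tile.
    """
    prompts = []
    for box in boxes:
        caption = caption_tile(box)
        prompts.append(f"{caption}, {global_style}" if global_style else caption)
    return prompts


def fake_captioner(box):
    """Toy captioner: the cat only occupies the centre of the image."""
    left, top, right, bottom = box
    cat = (384, 384, 640, 640)
    overlaps = left < cat[2] and right > cat[0] and top < cat[3] and bottom > cat[1]
    return "a cat on a rug" if overlaps else "a patterned rug"


boxes = tile_boxes(1024, 1024, 4)  # 4x4 grid = 16 tiles
prompts = per_tile_prompts(boxes, fake_captioner, "photo")
print(sum("cat" in p for p in prompts), "of", len(prompts), "tiles mention the cat")
```

With the toy captioner only the four centre tiles get "cat" in their prompt, which is the behaviour the theory is hoping a good captioning model could provide.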

mortal mesa
#

i've prompted separate tiles, too much work for my use case which is pretty much nothing

craggy crest
craggy wave
#

Hi there! Does anyone here have a ComfyUI workflow for SD 3.5 Large with a LoRA loader + hires fix feature? Thanks 🙏

dusky thistle
dusky thistle
kind acorn
#

Paint a picture of sunset and lone ducks flying, autumn water together in the sky

#

@dusky thistle Paint a picture of sunset and lone ducks flying, autumn water together in the sky

bitter hearth
#

hoping vision models can do that well soon

bitter hearth
#

new MiaoshouAI tagger version https://huggingface.co/MiaoshouAI/Florence-2-large-PromptGen-v2.0

#

it can give output for T5 and ClipL now

noble coyote
#

President Donald Trump - again - PotUSA number 47!

craggy crest
noble coyote
#

"I think it should be in knead two kneau!" 🥳

craggy crest
noble coyote
#

Florence2/Flux img2img

noble coyote
noble coyote
#

Mid-West USA

craggy crest
#

arizona

#

no daylight savings

noble coyote
#

Beautiful State - I love the series of Route 66 photos of Hackberry Az (by Carol M Highsmith)

#

Are your clocks set for 'winter time' yet?

craggy crest
#

i love driving on parts of the old route 66

#

there are still burma shave signs on part of it

noble coyote
#

Its 7 a.m. here in London UK

craggy crest
#

good morning then :)

noble coyote
#

Yes, Route 66 is a wellspring of Americana

craggy crest
#

a lot of it is.

noble coyote
craggy crest
noble coyote
#

I saw CSBW's version - an eagle in place of some ducks!!! 😄

craggy crest
#

i saw that too, i think he added that on the sly

noble coyote
craggy crest
#

i wrote that one down, i quite like it

noble coyote
muted dove
#

Surprisingly nice image for such a bad prompt 😄

#

The ducks I mean

craggy crest
craggy crest
muted dove
#

"Lone" ducks, flying "together" 🤦🏻‍♂️

craggy crest
#

too many tokens in that

craggy crest
noble coyote
noble coyote
noble coyote
#

I love the way you have the cougar "looking at that spinning apple!"

craggy crest
noble coyote
#

🥳

craggy crest
#

told you i was tired ;)

noble coyote
#

It brings out your sense-of-humour

craggy crest
#

here's the first paragraph of that monster prompt you posted

craggy crest
noble coyote
#

Silly is good. This room can get a tad doomy at times 😦

craggy crest
#

second paragraph

noble coyote
craggy crest
#

third paragraph

noble coyote
craggy crest
#

5 prompts for the price of one ;)

untold valley
#

ppl need to add a token counter to comfyui, you can't have over i believe 256 tokens or something like that

muted dove
craggy crest
noble coyote
#

There is a URL available (!) which parses your prompt and returns the number of tokens it contains

craggy crest
craggy crest
noble coyote
#

Yes - the prompt-owner either has a great grasp of language - or he's sleeping with an LLM!

dusky thistle
craggy crest
#

he's sleeping with an llm - however, at least for flux and SD3.X - since the images in the dataset were captioned with CogVLM - getting an LLM to create prompts for them works perfectly

#

you just have to remember to tell it how many tokens
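
For illustration, a token budget check could look like the sketch below. The 256 limit is only the figure guessed at in the chat, and the whitespace tokenizer is a placeholder: real CLIP or T5 tokenizers split text into subword pieces and will usually report more tokens than this. `check_token_budget` and `whitespace_tokenize` are hypothetical helper names.

```python
def check_token_budget(prompt, tokenize, limit):
    """Return (token_count, fits) for a prompt under a given token limit."""
    count = len(tokenize(prompt))
    return count, count <= limit


def whitespace_tokenize(text):
    """Placeholder tokenizer: one token per whitespace-separated word."""
    return text.split()


count, fits = check_token_budget(
    "a group of giggling girls chasing geese",
    whitespace_tokenize,
    limit=256,  # assumed limit, taken from the chat's "256 or something like that"
)
print(count, fits)
```

Swapping `whitespace_tokenize` for the tokenizer that matches the text encoder in use would give a count that reflects what the model actually sees.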

noble coyote
craggy crest
muted dove
#

Ferrari inspired armour design

noble coyote
#

When I visited Hungary many moons ago - so many people breakfasted on beer!!!

noble coyote
craggy crest
craggy crest
noble coyote
#

The prompt consisted of "the white witch eats pancakes and drinks beer with the red knight!"

#

A secondary prompt dropped the beer and pancakes, and substituted "does the laundry"

muted dove
#

It's really struggling with monochrome!

craggy crest
#

prompt: a group of giggling girls chasing geese

noble coyote
#

^..^<

muted dove
noble coyote
craggy crest
#

@-->--->---

noble coyote
dusky thistle
noble coyote
#

Hi, you look familiar?!

craggy crest
#

<%%%%|==========>

noble coyote
#

Sword

craggy crest
#

--~~~=:>[XXXXXXXXX]>

noble coyote
#

<*))))))))))><

craggy crest
#

:)

#

@@@@:|

#

@@@@:)

#

that works better

dusky thistle
noble coyote
#

Indigenous beefcake

craggy crest
#

black and white portrait photo of an elderly native american in the 1800s

noble coyote
untold valley
#

Oh damn, Trump won.

noble coyote
#

PotUSA 47

dusky thistle
noble coyote
muted dove
#

What a disaster and a terrible image of the American population. They were just starting to gain some credibility back with the rest of the world. I fear it'll be a worse shit-show than last time too.

craggy crest
#

prompt: diana ross and the supreme sandwich

dusky thistle
noble coyote
craggy crest
noble coyote
#

The USA is entering a period of Unstable Diffusion

muted dove
#

Weighing up her options after finding out who the next prez will be.

dusky thistle
#

absolute nightmare tbh

noble coyote
#

The evil cabal of Farage Musk and PotUSA

#

GM all y'all

#

Welcome to a new dawn

dusky thistle
noble coyote
#

Your excellent w/f can take up to 15 minutes per image on my 8Gb VRAM PC 🥳

craggy crest
dusky thistle
#

these take about 50 sec on mine

noble coyote
#

Great!

craggy crest
#

they don't take any time at all on mine.

noble coyote
#

I wonder how my (desired) 5090 will do?

craggy crest
#

(cause i can't run 'em)

noble coyote
#

🙂

craggy crest
#

waiting for him to get his node into manager

noble coyote
#

I just updated RES4LYF

dusky thistle
dusky thistle
noble coyote
#

Is installing via Manager any worse or better than a git pull?

craggy crest
noble coyote
#

How?

craggy crest
#

you mean git clone

noble coyote
#

Git pull as in update?

craggy crest
#

but you have to have it before you can update

noble coyote
#

So a Manager-induced install wins hands down over a git clone? How?

dusky thistle
#

not sure

noble coyote
#

OK

dusky thistle
#

but he doesn't have pywt installed in the right spot

#

i guess manager handles that

craggy crest
dusky thistle
#

you can also just comment out the import pywt btw

noble coyote
#

In truth, it probably is no difference

dusky thistle
#

it just won't let you use the wavelets noise type then

#

but it's one of 17

craggy crest
noble coyote
#

But what?

craggy crest
#

sometimes manager's install works great, and other times it barfs and git clone is what you wind up with

noble coyote
#

What is the essential difference between a git clone install/Manager install?

craggy crest
#

when one doesn't work, the other usually does?

noble coyote
#

OK, but even if both work - which one is better?

untold valley
craggy crest
untold valley
#

same thing end result

craggy crest
#

"does not compute"

dusky thistle
untold valley
#

anyone been able to recreate 1.5 skin textures with 3.5 yet? like somehow removing the "stylized" format that it outputs. its nice, eye pleasing but sometimes you want that generic non-retouched looking image

dusky thistle
#

tbh, sdxl can too

untold valley
noble coyote
muted dove
untold valley
#

yeah maybe its user error

dusky thistle
muted dove
#

This was with freckles, but the refiner went overboard with it 😄

dusky thistle
#

a close up amateur cell phone photo of a woman smiling in her messy apartment

untold valley
#

maybe its just that we are genning in a higher res but you can kinda see how they have some sort of plasticity,

#

not my images from *photoreal civitai page, but it just feels different, resolution may be the key. but sdxl,3,3.5 dont have this type of airbrush, high make up feeling. idk maybe im just seeing things and need sleep.

dusky thistle
#

those do look a lil airbrushed yea

muted dove
#

This looks perfectly normal to me

dusky thistle
muted dove
#

That's a noisy mess 🤷🏻‍♂️

dusky thistle
#

that's the point

#

amateur cell phone photo should look like this

muted dove
#

Why? It doesn't look good. Not even like a poor amateur photo.

dusky thistle
#

disagree

muted dove
#

It's ok as small images on Discord, but full size they're bad.

dusky thistle
#

one shot quick generations with no refinement using sd35M

#

the faces you posted above look pretty plastic

#

it's more convincing on the low end quality side

#

most camera photos are amazingly crap

muted dove
dusky thistle
#

without any other context

#

what i posted is better than most of em

#

they're blurry, hazy, full of artifacts

muted dove
#

Nobody has phones that bad nowadays though 😄

untold valley
#

ok awesome, at least im not losing my mind and you kinda sorta are getting it. that left one leans in the right direction, but u see the plasticity thing on the right one's forehead.

dusky thistle
#

and maybe theoretically the camera phones are good but they're usually not used that well

#

glare, dirty lenses, blur from the hand shaking, poor lighting, etc

#

half the time you can barely make out the structure of the iris

#

without the cell phone part

#

sd35M

muted dove
untold valley
# dusky thistle

kinda sorta zeroing in this and the left hand image of glaxy pink tank top shirts going the right way

muted dove
#

That could be down to a combination of using Flux as a refiner and the sampler/scheduler choice.

noble coyote
muted dove
#

Skin is textured, I think it looks acceptable in this one. 🤷🏻‍♂️

#

Anyone got a pin?

untold valley
untold valley
#

so just "a face"

muted dove
#

This one was...

a young tanned __nationality__ woman with fair skin and a petite physique. She has a round face with a soft, natural makeup look, and her hair is styled in a casual, side-swept bob. She is wearing a fitted, sleeveless, pink ribbed tank top that accentuates her small to medium-sized breasts. Her attire includes thin, transparent straps that add a modern, minimalist touch. The woman is wearing round, gold-rimmed glasses that frame her face, giving her a studious appearance. Her expression is happy, with a smile, and she is looking directly at the camera with a confident stance, one arm raised to adjust her hair.
In the background, a dark street scene
dusky thistle
noble coyote
muted dove
#

@untold valley I do use an LLM in my workflow, so that prompt isn't necessarily what is used for the end result.

untold valley
untold valley
dusky thistle
muted dove
#

baby shark

dusky thistle
#

doododoo

untold valley
noble coyote
muted dove
dusky thistle
muted dove
dusky thistle
untold valley
#

a lot of workflows now incorporate LLMs. i keep trying to see examples, but it's crazy how much you can tell an ai wrote it. we have come full circle, using ai to generate ai prompts to generate ai art.

dusky thistle
muted dove
dusky thistle
untold valley
#

@dusky thistle @muted dove instagram photo is helping facepalm

dusky thistle
muted dove
#

an amateur iphone photo of a face

untold valley
#

awesome

#

iphone, instagram jfc just need to think like a white girl out to get her venti mocha latte from starbucksbobagirl

dusky thistle
untold valley
#

i could tell you used "sharp image" in your prompt lol

dusky thistle
muted dove
#

This is raw sd3.5 output

untold valley
#

that's a really clean crisp gen

muted dove
#

I don't like the neck

dusky thistle
#

held up surprisingly well considering how close up it is... sd35 is a lil better than expected there

#

normally that just turns into nonsense

#

they're pretty coherent models for sure

untold valley
#

regardless of anything sd3.5m is a great model

untold valley
muted dove
#

I'm using SD3.5L

untold valley
#

have you tried medium?

muted dove
#

Yes

dusky thistle
#

i bounce back and forth all the time

muted dove
#

Me too, they're all just different models

dusky thistle
#

everything recently above is medium

#

yeah def

#

no doubt the training set was different

muted dove
dusky thistle
#

flux tends to make skin look plastic

#

it's better to have something be noisy and shitty looking than too clean

#

very refined sets off our BS detectors

muted dove
#

Sometimes I'd agree, but Flux does tidy up a lot of the "mess". It fixes hands and fingers for example.

untold valley
#

i will say based on that comparison the textures on materials like brick etc stand out, the skin texture not so much but the bricks/cement 🤌

muted dove
#

...and the watch detail

#

...eyes, hair... 😄

untold valley
#

kinda reminds me of like 2.0 good at "landscapes" horrible everything else

#

except better at many things

#

minus the humans

#

thing that bugs me ab 3.5 is it loves loves the 3/4 shot

#

doesnt like to zoom out

dusky thistle
dusky thistle
#

jpeg artifacts are good for fooling the eye obv

#

the smudge patterns phones give too

bold fossil
#

Underwater world with colorful fish, coral reefs, and sunken ship, illuminated by natural light filtering through water, in a hyper-realistic style

dusky thistle
muted dove
#

Love the window/water detail in this (raw sd3.5L)

dusky thistle
untold valley
#

could I ask which one of these appeals/looks the best according to whatever criteria you want?

dusky thistle
#

ask dinogator

muted dove
untold valley
#

I can’t make up my mind either

dusky thistle
muted dove
dusky thistle
ocean mural
#

make image of an alien galactic council

noble coyote
#

Thwaark!!!

dusky thistle
fossil pagoda
dusky thistle
#

forget if i said this last night, but i got the deis samplers working!

#

it's kinda interesting - the third order multistep samplers seem to struggle a tiny bit with SD3.5, maybe because we're using CFG

#

DEIS_3M is basically what comfyui uses when you select "DEIS", i'm finding DEIS_2m is much better

craggy crest
dusky thistle
#

i was thinking of having a ksampler select and a "klownsharksampler" (who knows what i'd call it, but yeah)

#

equivalents to that

#

gotta work out some good presets first... generated like 9k sd35 images now to that end

#

but yea def agree it would be good to have a simple interface for easy entry

#

SD3.5 Medium

#

living rooms are something diffusion models really struggled with before

craggy crest
dusky thistle
#

it's great with animals

old walrus
#

medieval empress sitting on her throne, dark fantasy, digital art illustration

bitter hearth
#

oh no

#

I opened the workflow on the kitty image

noble coyote
craggy crest
#

when your cat has an affair with a rat

craggy crest
bitter hearth
#

chaos cloud

craggy crest
noble coyote
#

£10 cash

craggy crest
#

that was interesting

bitter hearth
#

just a lot of nodes

#

in a big cloud

marble sand
#

lol

noble coyote
#

Flux and Silhuflowart2 LoRA

pseudo owl
#

Tried Will Smith eating spaghetti with Mochi-1 from the genmo website (apache2.0 open model)
I mean not bad but not perfect.

untold valley
bitter hearth
#

found a gguf of flan T5

#

https://huggingface.co/dumb-dev/flan-t5-xxl-gguf

#

its better than normal T5

#

it's a similar idea to the common thing where people replace Clip-L with this https://huggingface.co/zer0int/CLIP-GmP-ViT-L-14

#

don't rly need a gguf for Clip its so tiny already lol

sacred jewel
#

Cool LoRA thanks for heads up 😄

noble coyote
#

You're welcome!

rapid pivot
#

The nightmares ive seen...

pseudo owl
sullen moss
pseudo owl
#

Cool but sadly closed source, images do look great but are they even going to release something open now?

dusky thistle
bitter hearth
#

the flux pro update does look amazing

#

2048x2048 native, and they made a realism mode

sullen moss
sullen moss
#

This is really crazy

craggy crest
dusky thistle
cedar axle
#

SD3.5L

dusky thistle
cedar axle
dusky thistle
craggy crest
#

A dark, Gothic castle floating on a mist-covered, rocky island in the sky | towering spires stonework, eerie, isolated atmosphere | surrounded by thick fog, fading into a bleak, overcast sky | minimalist composition with high contrast | surreal, haunting ambiance

dusky thistle
errant dust
errant dust
#

Not a problem. I am a serial email complainer

#

(curiously got me two jobs at companies I was complaining to. True story.)

#

Maybe three, if I include writing for a newspaper back in 90s. Not sure that counts, and was not by email.

dusky thistle
#

the great north american sandfish

bitter hearth
#

FLUX1.1 [pro] – ultra mode: This option enables image generation at four times the resolution of standard FLUX1.1 [pro], without sacrificing prompt adherence. Unlike many high-resolution models that experience significant slowdowns at higher resolutions, our performance benchmarks show sustained fast generation times, over 2.5x faster than comparable high-resolution offerings. This model is available at a competitive price of $0.06 per image.
that's the thing, their price for one single image is around the same as renting an RTX 3060 12GB for 30 minutes

dusky thistle
untold valley
# dusky thistle

I have a big complaint.... this looks so good on discord but then you click it and the rocks/waterfalls are so blocky. water still looks nice tho. but still bait n switch lol

dusky thistle
#
Best Buy

Shop LG SIGNATURE 97" Class M3 Series OLED evo 4K UHD Smart webOS TV with Wireless Connectivity (2023) at Best Buy. Find low everyday prices and buy online for delivery or in-store pick-up. Price Match Guarantee.

untold valley
#

lolololol the blocks

#

i have two screens

#

one is like a 14 or 17" POS, my main is a nice 27" IPS one

dusky thistle
#

damn

#

i'm rockin dual ultrawide

#

got a 57" dual 4k and a 49"

untold valley
#

we are not all millionaires sadcat

dusky thistle
#

me either lol

#

i'm just a big believer in selling passenger doors off cars to fund more hardware purchase

dusky thistle
dusky thistle
noble coyote
#

Flux and Silhuflowart2 LoRA

craggy crest
untold valley
#

open in browser

dusky thistle
#

looks like sandstone cliffs to me

#

got stuff like that in my area

#

the water gets into the sandstone and separates the layers through freeze thaw cycles over the years

robust echo
#

An old 1960s vintage photograph album depicting a parked car in front of a house --ar 3:2

dusky thistle
untold valley
#

Clownshark has become the AI.

robust echo
#

a Chinese man standing before a blue lake,ride a red mountain bike, wear a sunglass, on the plate refelcting the lake

muted dove
# dusky thistle

Damn you Scotty, you got the teleport coordinates wrong again!

craggy crest
lone eagle
#

We lie down together amid the cacti. The succulent leaves closest to the ground are a marbled grey, as if turned to stone, and we become absorbed by them, feeling our way around their rounded contours with our fingers. As we gaze up, following their odd tear-shaped forms bundled together, the sprawling double cypress tree – two trunks locked in an embrace – claims our attention with its swaying branches splitting into ever finer branches and twigs ending, here and there, in clusters of cones. Initially, we can’t really tell if it’s the wind that’s causing the canopy to stir and sway that way. It forms a dark shifting frame that we enter and get lost in as one does in a forest.

noble coyote
#

Flux Turbo LoRA w/f with 2 x KSamplers, 2 x Upscale and Sharpen

bitter hearth
#

the 8 step Flux Turbo LoRA gave me about as good an image as 1200 steps of Flux Dev

#

its so amazing

#

something about these new models means they can compress like that

sacred jewel
rapid pivot
#

I remember this prompt lmao

sacred jewel
sacred jewel
signal shuttle
#

Finally just started to train a 3.5 medium lora, it's going pretty well. so far i've noticed that it learns styles better than flux

craggy crest
lunar canopy
#

@noble coyote mind accepting friend request? catlook

bitter hearth
#

the hyper one is also very strong its not bad either

#

the bigger your final resolution, the less harm they do

#

so they do especially well for tiled upscale

pseudo owl
#

It nailed everything

#

Prompt: A white cat on top of a blue dog sitting on a brown couch in a living room. Behind them is a window with 4 cow pictures, one in each corner. Outside the window is outer space and a ufo.

bitter hearth
#

often acceleration loras can improve various aspects

#

people often view them as always being worse

#

but TCD is substantially better than regular SDXL, for example

noble coyote
#

Flux Turbo (8-step with LoRA) AliMaMa

pseudo owl
#

Seems pretty impressive so far, both didn’t get it 100% right, I think flux dev might win at this one tho.
Prompt: A man holding a sign that says “This is the grand contest of the teacher or the student. Will 8 steps win or 30 steps?”

bitter hearth
#

oh if you want accurate text that is probably a pretty big exception

craggy crest
untold valley
#

Ooooo Torcello big trouble.

bitter hearth
#

I think using DiTs for text will not be the way to go soon

#

Omnigen-style models are much better for text

#

the huge downside of Omnigen models is they can't add noise to correct for past mistakes like we do with diffusion

bitter hearth
#

not sure

#

it may well not be a problem

#

we're using stuff like DPM++ 2SA, Euler A and Clownshark stuff on the Rect Flow models but you're not really "supposed" to

craggy crest
bitter hearth
#

on Comfy server someone was saying, the big API providers like Fal aren't doing that

#

lol

bitter hearth
#

using fancy stochastic solvers
they are likely using DPM++ 2M or heun or euler or something like that

bitter hearth
#

oh I mean for Flux and SD3/SD3.5

craggy crest
#

i know what FAL is using for SD3.5 - i gave them the settings.

#

not sure about flux

bitter hearth
#

on comfy server someone speculated that FAL have made a hand-written INT8 kernel
their Flux Dev endpoint is a lot cheaper and faster than their competition

craggy crest
#

they might have

bitter hearth
#

but yeah the point I was making was the research tends to just want simple samplers for these modern Rect Flow models
compared to diffusion where SDE/ancestral was favoured

craggy crest
#

cause that's what I told them to use

bitter hearth
#

oh thanks that's really helpful

#

so we've finally reached the point where the trajectories are straight enough to use euler

#

it was gonna happen eventually

#

if your line is straight then euler is optimal
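
A small numerical illustration of the point above, under the assumption that "straight" means the velocity along the path is constant, as in a rectified-flow style interpolation x_t = (1 - t) * x0 + t * x1. The function names and the toy curved ODE (dx/dt = x) are choices made for this sketch, not anything taken from a real sampler.

```python
import math


def euler_integrate(velocity, x0, t0, t1, steps):
    """Plain fixed-step Euler integration of dx/dt = velocity(x, t)."""
    x = x0
    t = t0
    dt = (t1 - t0) / steps
    for _ in range(steps):
        x = x + dt * velocity(x, t)
        t += dt
    return x


# Straight path: constant velocity pointing from start to target.
start, target = 2.0, -1.0
straight_one_step = euler_integrate(lambda x, t: target - start, start, 0.0, 1.0, 1)

# Curved path: dx/dt = x, whose exact value at t = 1 is e when x(0) = 1.
curved_1 = euler_integrate(lambda x, t: x, 1.0, 0.0, 1.0, 1)
curved_8 = euler_integrate(lambda x, t: x, 1.0, 0.0, 1.0, 8)

print("straight, 1 step error:", abs(straight_one_step - target))
print("curved,   1 step error:", abs(curved_1 - math.e))
print("curved,   8 step error:", abs(curved_8 - math.e))
```

On the straight path a single Euler step lands exactly on the target, while the curved path still has visible error after eight steps, which is why higher-order or stochastic solvers mattered more for models with curvy trajectories.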

craggy crest
bitter hearth
#

haha

#

its swings and roundabouts

craggy crest
#

;)

bitter hearth
#

diffusion models tend to have curvy trajectories, is the issue

craggy crest
#

plotting out the path through an image is worse than plotting the trajectory to hit the moon with a rocket from earth

bitter hearth
#

its a pretty rough method yeah

craggy crest
#

"getting better all the time"

bitter hearth
#

GANs and VAEs don't do it, they just jump in one go

#

apparently SD 1.5 VAE was almost a GAN anyway, there's a lot of overlap

#

I still find it funny that with diffusion we run the models backwards

craggy crest
bitter hearth
#

yeah they seem quite chill about it

craggy crest
civic trail
icy coral
#

I don't know where the beef with the author of the P-word model originated from, but I was wondering if the team behind Illustrious decided to train a model on 3.5 would they be in a better position when it comes to obtaining a license?

bitter hearth
#

fairly sure the situation was that Stability AI were busy and were behind in terms of dealing with the licenses

#

rather than them specifically denying a license

pseudo owl
#

p-word is pony?

rapid pivot
#

If that's it

#

I call stupid

#

lmao

signal shuttle
icy coral
untold valley
#

@icy coral and everyone I guess, you need to think about optics with Pony models we are 100% talking about heavy NSFW and SAI or any company likely wants to steer 1000000000ft away from it. This is because in doing so you lose potential for future funding. It’s all about a business and money and lastly public perception.

#

What SAI needs is an adept PR person for retail.

signal shuttle
#

Man flux ultra is so good, like what

icy coral
untold valley
untold valley
signal shuttle
bitter hearth
#

flux ultra is incredible quality but closed source models are absurdly expensive
for the price of one flux ultra image you can rent a 3060 for 30 minutes
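
To put numbers on that comparison: the $0.06 figure comes from the quoted pricing, and the 30 minute equivalence is the chat's own claim, so the hourly rental rate below is only what those two numbers imply, not a quoted price. The 60 seconds per image on a rented card is a purely hypothetical figure for the sake of the arithmetic.

```python
API_PRICE_PER_IMAGE = 0.06     # USD, from the quoted ultra-mode pricing
RENTAL_MINUTES_PER_IMAGE = 30  # chat's claim: one image ~ 30 min of 3060 rental

# Hourly rental rate implied by the two numbers above.
implied_hourly_rate = API_PRICE_PER_IMAGE * 60 / RENTAL_MINUTES_PER_IMAGE


def rented_cost_per_image(seconds_per_image, hourly_rate):
    """Cost of one image on a rented GPU billed by the hour."""
    return hourly_rate * seconds_per_image / 3600


# Hypothetical: 60 s per image on the rented card.
cost_at_60s = rented_cost_per_image(60, implied_hourly_rate)

print(f"implied rental rate: ${implied_hourly_rate:.2f}/hour")
print(f"rented cost at 60 s/image: ${cost_at_60s:.4f}")
print(f"API price is about {API_PRICE_PER_IMAGE / cost_at_60s:.0f}x that")
```

The comparison ignores that the rented card produces a smaller, lower-quality image per run, so it only speaks to raw cost.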

signal shuttle
#

MAN I REALLY WANT A OPENSOURCE MODEL THAT CAN DO SICK STUFF LIKE THIS

pseudo owl
craggy crest
signal shuttle
# pseudo owl prompt? I'm sure open models can do this with a bit of extra settings.

my prompt was made with Claude "Vibrant anime key visual poster, "The Adventures of a Clumsy Female Knight!" prominently displayed in stylized Japanese text, featuring a cheerful young woman with messy blonde hair in oversized, ill-fitting armor stumbling forward, sword awkwardly held aloft, against a backdrop of a whimsical medieval castle and lush fantasy landscape, dynamic action lines emphasizing her clumsy movement, soft pastel color palette with pops of bright accents, highly detailed character design in the style of Studio Ghibli meets "KonoSuba", expressive eyes and exaggerated facial features conveying both determination and embarrassment"

craggy crest
signal shuttle
craggy crest
#

but we had this discussion yesterday

signal shuttle
#

Its small and in black

pseudo owl
craggy crest
#

sd3.5 base, no loras

signal shuttle
craggy crest
#

yes

#

prompt: a blond haired anime cartoon knight, happy, wild hair. across the top of the image is written the text "the adventures of a clumsy female knight"

bitter hearth
craggy crest
craggy crest
bitter hearth
#

yeah essentially not possible anymore

#

the impressive thing about Flux Pro Ultra is just the size without tiling, done within 10 seconds

#

it doesn't even need to be a new technology though, hand-optimised pipeline on Nvidia H200s, plus more finetuning and distillation shenanigans can go far

mortal mesa
#

digital cameras(cell phones) don't necessarily take true pictures like film

bitter hearth
#

yeah film was nicer IMO

craggy crest
craggy crest
# bitter hearth yeah film was nicer IMO

course, there's the issue of the actual brains that are processing the data coming in through the eyes of the human that's viewing whatever it is. not everyone sees the same thing in the same way. re: that blue dress on the internet a couple of years ago

#

if you want to get really technical, you're not seeing the actual objects at all, just the light that is bouncing off them

river sleet
#

You aren't even seeing that. You're seeing vague shapes and a tiny circle in the middle of your vision that's actually clear and sharp. Everything else is a hallucination made up by your brain.

craggy crest
signal shuttle
craggy crest
bitter hearth
#

nice, really sharp and good colours

mortal mesa
#

the elusive leopardtigerpuma

craggy crest
mortal mesa
#

hah nice

craggy crest
#

he's self-pollinating

pseudo owl
craggy crest
rapid pivot
rapid pivot
bitter hearth
#

the canvas looks amazing

craggy crest
bitter hearth
#

I've essentially switched to diffusers/pytorch only at this point

#

I did enjoy comfy a lot though

craggy crest
#

you know anyone that's made any 3.5 medium finetunes?

halcyon yarrow
#

has anyone here tried mochi? I just tried it and it takes me 12 minutes with 8GB vram to render 31 frames or 2 seconds @ 15fps lol

bitter hearth
#

seems too early for 3.5 medium finetunes but there's been some progress in the kohya, ST communities etc apparently

#

people are split between more models now, and that's probably gonna get worse as time goes on

rapid pivot
#

It's good and bad

halcyon yarrow
#

same reason i don't want BFL to release a flux-dev 1.1 it's just going to split the community further and invalidate all our work

#

14.75 minutes to render 73 frames aka 4.8 seconds

#

i'm going to keep increasing my frame count to find my max
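
Working through the two timings reported above (31 frames in 12 minutes, 73 frames in 14.75 minutes, both at 15 fps) gives a feel for how the cost scales. This is just arithmetic on the numbers in the chat; the variable names are made up for the sketch.

```python
FPS = 15  # playback rate mentioned in the chat

runs = [
    {"frames": 31, "minutes": 12.0},
    {"frames": 73, "minutes": 14.75},
]

for run in runs:
    run["clip_seconds"] = run["frames"] / FPS
    run["render_seconds_per_frame"] = run["minutes"] * 60 / run["frames"]
    print(
        f'{run["frames"]} frames -> {run["clip_seconds"]:.2f} s clip, '
        f'{run["render_seconds_per_frame"]:.1f} s of rendering per frame'
    )

# Extra render time for each frame added between the two runs.
extra_seconds = (runs[1]["minutes"] - runs[0]["minutes"]) * 60
extra_frames = runs[1]["frames"] - runs[0]["frames"]
marginal_seconds_per_frame = extra_seconds / extra_frames
print(f"marginal cost: {marginal_seconds_per_frame:.1f} s per extra frame")
```

The per-frame cost drops a lot in the longer run, which hints that much of the 12 minutes may be fixed overhead rather than per-frame work, though two data points are not enough to be sure.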

craggy crest
halcyon yarrow
#

i can totally wait 15 minutes for a 5 second video clip if it's worthy and high quality

halcyon yarrow
#

and progress won't dictate community sentiment

craggy crest
#

yup

halcyon yarrow
#

looks like something clown would make, good stuff

craggy crest
halcyon yarrow
#

i've seen that in a few youtube videos and a few "how tos" on civitai. it's like the trendy new thing to try, just picking at the model's layers to see how the output changes given the same input. interesting stuff indeed

craggy crest
mortal mesa
#

sooo this is pretty wild [MOVIE-SHOTS] In an enchanting tale of nature's wonders, [SCENE-1] shows <Sophie> observing butterflies in a sunlit meadow, her expression one of awe and delight, [SCENE-2] transitioning to <Sophie> sketching the butterflies in her notebook, her brow furrowed in concentration, [SCENE-3] wrapping up with her lying back in the grass, gazing at the sky with a contented smile, surrounded by nature's beauty.

craggy crest
#

you do need to make sure you're on the latest level of comfyUI though or you won't have the new node and the new scheduler

mortal mesa
bitter hearth
#

back with SD 1.5 and SDXL it was more needed to rely on this big ecosystem of fine tunes

#

but the models come out of the box stronger now, and loras can be done in 2 minutes on Fal

bitter hearth
halcyon yarrow
#

just saying, imagine if SDXL 1.5 came out or SDXL 2.0. I don't think we can compare SD 1.5 to SDXL since it's a complete architecture change. I'm talking about a company doing an incremental model release: it would suck to have to keep around 2 loras that amount to the same thing for 2 similar models because they're not cross compatible

#

that's why i'm glad flux 1.1 was limited to just pro so it doesn't fragment the community pool

bitter hearth
#

I don't think its that big a deal anymore

#

people can mostly just use the base models now

#

and recreate loras themselves if needed as it can be done so fast

halcyon yarrow
#

I guess storage is cheap enough that it's not that big of a deal, but it still sucks in other ways like having to keep 2 sets of vaes for each release. i just wish when flux does the next release it's a major release with significant changes that make sense to split the community for, i for one am not okay with incremental updates to major models

#

for SD3 vs SD3.5 it makes sense, SD3 was dead anyways and SD3.5 kicked it back to life, I'd consider 3.5 a worthy incremental change where they just messed with the training data and made everyone happy lol

mortal mesa
#

The four-panel image showcases a playful bubble font in a vibrant pop-art style. [TOP-LEFT] displays "Pop Candy" in bright pink with a polka dot background; [TOP-RIGHT] shows "Sweet Treat" in purple, surrounded by candy illustrations; [BOTTOM-LEFT] has "Yum!" in a mix of bright colors; [BOTTOM-RIGHT] shows "Delicious" against a striped background, perfect for fun, kid-friendly products.

bitter hearth
halcyon yarrow
bitter hearth
#

I get what you are saying, it would be convenient for there to just be one big model with everything on it

#

1.7TB wow

#

okay yeah your perspective makes a lot more sense then

halcyon yarrow
#

I'm going to try this prompt as-is with that film-storyboard one

[MOVIE-SHOTS] In a vibrant festival, [SCENE-1] we find <Leo>, a shy boy, standing at the edge of a bustling carnival, eyes wide with awe at the colorful rides and laughter, [SCENE-2] transitioning to him reluctantly trying a daring game, his friends cheering him on, [SCENE-3] culminating in a triumphant moment as he wins a giant stuffed bear, his face beaming with pride as he holds it up for all to see.

CivitAI would have a field day if someone reposts those LORAs on there

#

this is exactly what CivitAI loves the 4-panel story board, film board, side by side concepts and it's not available there yet

mortal mesa
halcyon yarrow
#

oh i see thanks for the heads up, you did a good job maintaining the same syntax they asked for but i feel Sophie isn't consistent in every frame

mortal mesa
#

portrait-illustration.safetensors This two-panel image presents a transformation from a realistic portrait to a playful illustration, capturing both detail and artistic flair; [LEFT] the photograph shows a woman standing in a bustling marketplace, wearing a wide-brimmed hat, a flowing bohemian dress, and a leather crossbody bag; [RIGHT] the illustration panel exaggerates her accessories and features, with the bohemian dress depicted in vibrant patterns and bold colors, while the background is simplified into abstract market stalls, giving the scene an animated and lively feel.

halcyon yarrow
#

crossbody bag you say? oh you must mean crosseyed hag lol

#

85 frames seems to be my max @ 24 minutes for essentially 5.6 seconds at 15fps
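A quick sanity check on those numbers, as a sketch: it assumes both runs (73 frames in 14.75 minutes, 85 frames in 24 minutes) play back at 15 fps, which the chat only states for the 85-frame run. The function name `clip_stats` is just for illustration.

```python
def clip_stats(frames, render_minutes, fps=15):
    """Return (clip length in seconds, render seconds spent per frame)."""
    clip_seconds = frames / fps
    seconds_per_frame = render_minutes * 60 / frames
    return clip_seconds, seconds_per_frame


# The two runs mentioned in chat; fps=15 is an assumption for the first one.
for frames, minutes in [(73, 14.75), (85, 24)]:
    clip, per_frame = clip_stats(frames, minutes)
    print(f"{frames} frames: {clip:.2f}s clip, {per_frame:.1f}s render per frame")
```

The per-frame cost goes up between the two runs, so render time is growing faster than linearly with frame count here.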

untold valley
#

finetunes taking forever fr, way too slow. lol/s but the higher we go in params the slower things will go.

#

need that nsfw fix from time to time....

dusky thistle
civic trail
dusky thistle
untold valley
#

I think im happy w faces.

dusky thistle
dusky thistle
#

all SD3.5M

#

my mona lisa

pseudo owl
#

You can also apply LoRAs on it, while you can't on bnb4bit

turbid grotto
craggy crest
craggy crest
untold valley
# craggy crest that's why you create LoRAs instead of full finetuned checkpoints

and fight the actual model with concepts it does not inherently know a semblance of, and have to have stacks on stacks of loras, nah. There is a need for better finetunes, like all SAI models that have been released to the public have mostly been designed for. Training LoRAs for hundreds of characters and styles and everything in between vs training 1 model, seems there's a clear winner.

craggy crest
#

a full fine tuned check point is going to be vastly more expensive and take a whole lot longer time

untold valley
#

loras are great to fill a void/niche/gaps but it's nicer when the model inherently understands fundamentals. SD3.5 is a superb base. a finetune would be wonderful to train other loras on.

#

And then SDXL went on to be wildly successful. Im hoping the same for 3.5

dusky thistle
bitter hearth
#

for a base model it's good to be very underfit

#

so that fine tunes have more room to change the model

untold valley
#

that's what the quote is saying.

bitter hearth
#

is it?

#

I read it as saying the opposite

limpid thunderBOT
#

Last 7 days <Nov 01 2024> → <Nov 07 2024>

  • Member counts
  • 345955 ↗ 345969 ↘ 345955 ↘ 345930 ↗ 345948 ↗ 345967 ↗ 345976
  • Action members
  • 0 → 0 → 0 → 0 → 0 → 0 ↗ 85
  • Message members
  • 0 → 0 → 0 → 0 → 0 → 0 ↗ 58
  • Reaction members
  • 0 → 0 → 0 → 0 → 0 → 0 ↗ 43
    More details
Summary | comcom Analytics

comcom analytics is a completely free dashboard for analyzing and monitoring communities run on Discord or Slack. A public beta is currently available.

untold valley
#

it's saying you need a good base model, not too overtrained, so the community can finetune it.

dusky thistle
#

Yeah I think it's just phrased a little oddly

bitter hearth
#

maybe, not sure
I never really like trying to work out what past ambiguous quotes are saying

#

anyway if he means it should be underfit I agree

dusky thistle
#

Yeah I think that's what he means

#

That it won't look great cuz it's designed to be a little undercut

#

Underfit

bitter hearth
#

Flux was somewhat helpful in that it taught the population that if the model is overfit its hard to get rid of the Flux Chin

untold valley
bitter hearth
#

lol yeah

#

some of the realism loras are good though

#

sadly I didn't save the workflow but I found one that works much better than the others

#

this same debate about fit happened in LLM world when Phi models came

#

cos Phi models are underfit so they are bad on release

#

but for example Omnigen is based on Phi, they are really useful

untold valley
bitter hearth
#

oh yeah this was my own workflow

#

I spent $10 on L40s making it

#

then forgot to save it lol

#

all I have is this screenshot

untold valley
#

should be enough info there to remake it no?

bitter hearth
#

almost yeah
sadly the nodes to put the settings up top were wired up a bit wrong so it is missing some
but most are there

#

I feel very lucky I put that label up top and the graphs on the right

#

I would have nothing otherwise

hallow lion
dusky thistle
bitter hearth
#

base Flux can't do him well but with realism lora Flux can

untold valley
hallow lion
#

r2D2 frollic in grass

bitter hearth
dusky thistle
hallow lion
dusky thistle
hallow lion
#

Usual Hogwarts report.

dusky thistle
hallow lion
#

Clownsharks images are like a near death experience

#

couple of days ago I dreamt of nuclear blasts.

craggy crest
fleet meteor
untold valley
runic tusk
#

I ordered a frappuccino, where's my fuckin' frappuccino.

craggy crest
noble coyote
#

Here's my photo! Sign me up!!! 😄

#

SD3.5 L Turbo

mortal mesa
#

The CogVideoX1.5-5B series model supports 10-second videos and higher resolutions. The CogVideoX1.5-5B-I2V variant supports any resolution for video generation.

untold valley
#

@lunar canopy sry ping but dont know if server has mods, this is a scam.

#

Actually am not sorry ping bobagirl catlurk

mortal mesa
#

you come here from the SAI webpage then get phished

#

like this

untold valley
#

Oh that guy spam posting everywhere

#

Where the mods

mortal mesa
#

abandoned ship

hallow lion
#

Titanic is sinking.

cunning lintel
#

This discord is so wholesome it needs no mods 🤣
Sucks btw, this used to be a nice place, now i still hope to find some tidbits of interesting talk/images here, but quite often i think "ignore, ignore, don't get dragged in"

hallow lion
#

lol

craggy crest
#

with SLG - skipping specific layers to adjust the details, and without slg

craggy crest
bitter hearth
craggy crest
bitter hearth
#

yes plus the NSFW community

#

I never see drama in the more dry areas of ML like communities for graph neural networks or time series analysis models

#

its always in image or chatbot communities

sterile pendant
craggy crest
#

and then they get very emotional

mortal mesa
#

that is 1000% wrong

craggy crest
mortal mesa
#

i have a brain

craggy crest
mortal mesa
#

so you should cry when you do bad science to get people to leave you alone right

craggy crest
#

you are, apparently, in need of coffee this morning

bitter hearth
mortal mesa
#

you said one of the craziest things ive heard involving science

mortal mesa
#

society is wrong i suppose

craggy crest
mortal mesa
#

you broke sd3 by demanding it came out as a huge member of the community

#

the clips you have to talk to the clips

craggy crest
#

and he dives down a random rabbit hole

mortal mesa
#

nah i just get annoyed

#

i dont expect anything of anyone here

craggy crest
#

how did you go from talking about scientists to SD3?

mortal mesa
#

you could read, its right here

craggy crest
#

you went right from talking about rabid scientists yelling at each other directly to SD3 being broken, without even a transition

mortal mesa
#

i dont hate you i just often dont like you

craggy crest
mortal mesa
#

maybe put it into a LLM to explain

hallow lion
#

🍿 popcorn?

craggy crest
#

naw, i'm just gonna send you some coffee

mortal mesa
#

ty

hallow lion
#

maybe wizard is just misunderstood...

craggy crest
#

i thought you blocked me, mr. coin

hallow lion
#

im willing to give a second chance to everyone

#

and its miss

bitter hearth
#

there's a lot of stuff I said earlier in the year which I now think is cringe because I understand the math better

hallow lion
#

Right, we're all friends here!

#

Models come and go

mortal mesa
#

you all ruined sd3

hallow lion
untold valley
#

Hello

#

Did I miss participating in the drama 🎭

craggy crest
hallow lion
#

nah

craggy crest
hallow lion
#

very mild popcorn moment

craggy crest
untold valley
#

Wait why are Kagi and Crystal arguing? Don’t y’all share same opinions?

mortal mesa
#

ya over crying scientists lol

craggy crest
#

i actually respect Kagi's opinion quite a bit. but we do lock horns at times.

mortal mesa
#

im a mere mortal

bitter hearth
craggy crest
bitter hearth
#

ah okay

#

I find it hard to keep track of the drama

craggy crest
untold valley
#

Idk but crystal likes gaslighting the community over sd3 failure and it seems kagi also thinks it catlurk
Good thing 3.5 was success imo mostly cuz they removed some training restrictions.

craggy crest
untold valley
craggy crest
#

you're more than welcome to go back through several months of posts and read what the community was saying between march of this year and when SD3 released - and read the extreme toxic whining that was constantly going on

#

they got what they asked for and demanded

untold valley
#

Ok we can go back to posting gens.

craggy crest
#

sure.

untold valley
hallow lion
untold valley
#

@craggy crest figured out how to make texture better. Add node to add grain to hide it sadcat

craggy crest
#

that looks really good

lunar canopy
#

nicest fall gens get to be server banner catlurk

#

or...er winter?

hallow lion
#

Ye it does, still amazes me how we can get stuff like this by typing words.

craggy crest
lunar canopy
#

that'd be fun

untold valley
#

Winter so like igloos and penguins catlurk

craggy crest
mortal mesa
#

i dont lora much but i saw a 2gb flux and 900mb sd3.5 lora, is this a new normal

hallow lion
#

lol 2gb flux?

mortal mesa
#

ya its kinda nice too

untold valley
#

Good lord

craggy crest
bitter hearth
#

there's people doing Lycoris Lokrs rather than Loras for Flux also

#

there's some on Civit if you want to try them

lunar canopy
untold valley
#

Ty

bitter hearth
#

it happens quite often that someone comes and puts their prompt in public like its midjourney, and the prompt is either super-NSFW or crazy

hallow lion
#

Good riddance

bitter hearth
#

midjourney requiring discord probably cost them hundreds of millions of dollars
most bizarre business error

#

they kept saying it is too much work to make a backend which makes no sense

#

it would have cost them way less than the cost of training the model

craggy crest
bitter hearth
#

what I was thinking was server hosting cost plus the wages of a few developers

craggy crest
bitter hearth
#

I mean if Midjourney had made an external website on day one

#

it would have been better, but they would have had to pay server costs and dev costs

#

but I think it would have paid off massively because so many people avoid midjourney because you had to use the discord

craggy crest
bitter hearth
#

oh I see, is it one of those companies that is aiming for goals other than money?

#

wasn't aware of that

craggy crest
bitter hearth
#

ah okay yeah that makes sense

noble coyote
craggy crest
halcyon yarrow
# craggy crest

is this generated like that, stitched together or using that new lora Kagi showed off yesterday?

#

@mortal mesa thanks for showing off that in-context lora project that thing is really cool man

halcyon yarrow
#

really cool, did you specify 5 frames or did it just choose 5 on its own?

craggy crest
#

prompt was: happyness: colorful autumn trees by artist "Shaun Tan", by artist "Mab Graves", by artist "Rien Poortvliet"

halcyon yarrow
#

ok that answers a lot

craggy crest
#

i didn't, but i used several names, and stable likes to create sections when you do that

halcyon yarrow
#

were you messing with the clip skip layers thing when you made it?

mortal mesa
#

ya its cool stuff, the loras are attempting more consistency

craggy crest
halcyon yarrow
craggy crest
#

@bitter hearth no SLG vs. skipping layers 8,10,18,23,22 at scale 1

bitter hearth
#

wow it makes it brighter

#

I thought it would only help structure but it goes beyond that

untold valley
bitter hearth
#

yeah makes sense

craggy crest
bitter hearth
#

this means you've gotta start liking negatives though, cos this works via the negative 😂

craggy crest
#

but with this you can be exact. with negatives, it's hard to be this exact

#

not only do you skip layers, you adjust how much, and you have other values. you can pin point

bitter hearth
#

PAG and SAG work on the negative too
SAG blurs the subject in the negative and PAG scrambles it

craggy crest
#

this isn't using negative prompts

bitter hearth
#

yeah I mean it uses the negative prediction

#

SLG drops layers when making the negative prediction but keeps them when making the positive

#

dropping the layers makes the image worse, which is okay as CFG pushes us away from the negative
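Rough sketch of how this fits together, using toy scalar "layers" instead of a real diffusion model. It assumes the formulation I believe the SD3.5 reference sampler uses, where a third prediction is made with the chosen layers skipped and the result is pushed away from it, on top of normal CFG. That is slightly different from "drop layers in the negative", so treat the exact formula as an assumption. All names here (`run_layers`, `cfg_with_slg`) are made up for illustration.

```python
def run_layers(x, layers, skip=()):
    """Apply each layer in order, leaving out any index listed in `skip`."""
    for i, layer in enumerate(layers):
        if i in skip:
            continue  # skipped layer: the degraded prediction comes from here
        x = layer(x)
    return x


def cfg_with_slg(cond, uncond, cond_skipped, cfg_scale, slg_scale):
    """Classifier-free guidance plus a skip-layer guidance term (assumed form)."""
    # Standard CFG: move away from the unconditional/negative prediction.
    guided = uncond + cfg_scale * (cond - uncond)
    # SLG term: also move away from the layer-skipped (worse) prediction.
    return guided + slg_scale * (cond - cond_skipped)


# Toy stand-ins for transformer blocks.
layers = [lambda x: x + 1.0, lambda x: x * 2.0, lambda x: x - 0.5]

cond = run_layers(1.0, layers)                    # positive prompt
uncond = run_layers(0.0, layers)                  # negative / empty prompt
cond_skipped = run_layers(1.0, layers, skip={1})  # positive, layer 1 skipped

result = cfg_with_slg(cond, uncond, cond_skipped, cfg_scale=4.0, slg_scale=1.0)
print(cond, uncond, cond_skipped, result)
```

With `slg_scale=0` this reduces to plain CFG, which is why the scale and the list of skipped layers act as separate knobs.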

#

its very weird

craggy crest
bitter hearth
#

ah maybe I misunderstood it, haven't looked in detail at the code yet

craggy crest
civic trail
bitter hearth
craggy crest
bitter hearth
#

there are some bad arxiv papers out there yeah

craggy crest
#

huggingface is really doing a nice job of curating papers now

#

(we're back to carrying on a single conversation in multiple channels ;) )

bitter hearth
#

AI moves so fast that

#

it might be too fast for curation

craggy crest
bitter hearth
#

not sure

craggy crest
mortal mesa
#

not SVD as in SAI SVD

pseudo owl
# mortal mesa speaking of papers https://hanlab.mit.edu/projects/svdquant

Yeah that quant is pretty amazing, similar quality to 8bit while being much faster than bnb4bit and using less vram. It can also work with loras, no need to requant (unlike bnb4bit)

They have a space to compare,
Flux.1 Schnell bf16 vs Flux.1 Schnell SVD quant
prompt: A man holding a sign that says “Is 4bit quant better or the full bf16 model?”

bitter hearth
#
"We will replace unsafe prompts with a default prompt: 'A peaceful world.'" LMAO
mortal mesa
#

the demo loads the Gemma-2B model as a safety checker by default. To disable this feature, use --no-safety-checker

bitter hearth
#

bear in mind the biggest speedups are gonna be with the FP4 version not the Int4 one

#

and the FP4 version doesn't look as good as the Int4 one

#

especially with RTX 5090 which will have native FP4 matmul acceleration

noble coyote
noble coyote
craggy crest
craggy crest
dull socket
craggy crest
halcyon yarrow
#

@short thicket have you been working on any new models? i'm waiting for the booru-free release

halcyon yarrow
# mortal mesa speaking of papers https://hanlab.mit.edu/projects/svdquant

cool share, thanks for the link i just went through it but a few points

  • where can I download the svd quant version of flux dev?
  • I'm thinking it only runs through scripts right now there isn't a comfyui node to load these models?
  • i can't find the script or program that will allow me to take any safetensors or gguf file and quantize it using their technique
craggy crest
#

without SLG

#

vs. with SLG

mortal mesa
halcyon yarrow
halcyon yarrow
#

@mortal mesa using that in-context lora lol

mortal mesa
halcyon yarrow
#

yeah the concept of some paper chips in a plain brown paper bag kinda intrigues me too

#

STOIQ did a terrible job with the same prompt horrible text horrible design, I'd avoid those chips bc the bag looks so flat and like there's nothing in there

#

I noticed for the couple-generation lora it doesn't really seem to abide by rules set forth https://civitai.com/models/929592?modelVersionId=1040555 @mortal mesa check out how like more than half of the images I made it didn't make a clean split it just ignored the prompt and put them together

untold valley
#

giving Large more of a chance but damn it's like 6x longer gen time, but some results are so different.

#

Medium Left, Large Right

#

damn

amber crow
#

good

short thicket
halcyon yarrow
#

took time off to touch grass, in other words to smoke some weed? lol

short thicket
#

I needed to take some time to "brain storm"

#

my goal is to build a small dataset of about 10,000 images to start.

halcyon yarrow
halcyon yarrow
#

it's your call but it doesnt hurt to do A/B testing w and w/o it just to see how it's affecting image generations, currently not happy with how it's performing even at 61 steps check it out.... damn nvm i can't show you, I instinctively delete bad generations, needless to say it's very sloppy and incoherent even with 61 steps and 3.5 cfg on what I'd consider a super complex prompt, meanwhile flux dev destill manages to do remarkably better

short thicket
#

It will get better in time. I'm basically just doing this in my free time as a hobby. There is always room to improve things. But after putting out 3 models in 2 weeks, I'm gonna take some time to chill. It would be just as easy for you to get the models and merge them how you want.

craggy crest
halcyon yarrow
#

makes sense, has to stay fun if it's as a hobby, dont wanna burn out on too much 'fun' lol anyways yeah youre right it's just merging not training, just never messed with that field of stuff

halcyon yarrow
# craggy crest just a thought, but a lot of people like those booru tags...

the booru tags are cool, my disdain is that the training methodology the model creator used ended up making things worse not better imo, i think the training dataset was overly ambitious and he didn't throw enough compute at it so it feels half baked but that's just my opinion from using it, maybe i'm using the model with the wrong settings still

craggy crest
short thicket
halcyon yarrow
#

well to be fair he threw a bunch of H100s at it but I think his dataset was larger than his compute, the ratio very well likely could've been off, and one more thing the creator specifically mentioned in one of my comment threads it was not trained with booru tags, in other words all the images processed used natural language VLM captions rather than the booru tags they were found to be part of

short thicket
halcyon yarrow
craggy crest
short thicket
halcyon yarrow
#

using the in-context lora suggested by @mortal mesa using the STOIQ model on this one

short thicket
halcyon yarrow
#

lol yikes that thing looks intense

craggy crest
#

will be interesting to see how it comes out

craggy crest
short thicket
craggy crest
#

i still might not just set the same weight on every block, however

halcyon yarrow
#

yeah i couldn't stand that, i would have chatgpt fix that for me

craggy crest
halcyon yarrow
#

just give it the code and ask it to make you an input at the top that affects all the other ones

halcyon yarrow
craggy crest