#🆕|sd3

devout schooner
#

there's definite bleed of unrelated data in 3.5 Medium

errant dust
#

not sure which is which, but the first image is clearly the better one

#

The second one looks like a cartoon

devout schooner
#

unless you really believe that everything should randomly be super grey and dull and very clearly have elements of literal paintings mixed in even when that wasn't asked for or desired in any way

#

neither base is "perfect" by any means but the left side one (3.5 Medium) looks WAY worse than the right side one (3.0 Medium)
the 3.5 one is a poorly resolved mess that looks more like the inside of a piece of particle board wood than any kind of metal

bitter hearth
#

it's hard to describe, but there is this thing like "scratchy details" that can show up in a variety of models, and SD 3.5 gets it quite easily
if you used SD 1.5 or SDXL with PAG then it's basically the thing that goes away when you increase PAG

#

I kinda understand what you mean by cartoon because the colours went bolder on the one on the right and some people prefer muted colours

#

I used to have the same tastes, but I recently started using CIELCh colors and I appreciate bolder colours more now

errant dust
bitter hearth
#

when they test people's personal taste about images what they find is that people's personal taste varies so so much

#

so IDK how much its worth debating matters of taste

#

its partly why I thought the big "battle" between SD 3.5 and Flux was a bit silly

#

if you preferred robust+clear object outlines and a stylized aesthetic then you would prefer Flux, if you were not bothered by some degenerated object outlines and you preferred a high detail natural aesthetic then you would prefer SD 3.5

#

there were like 100+ arguments about SD 3.5 vs Flux in the community but it just comes down to that taste difference really

devout schooner
# errant dust No, it doesn't. And it is you who fails to understand over and over my point abo...

I mean I strongly disagree but ok
if I scrounge up more examples I'll post them
this is IMO a very real problem SD 3.5 Medium (and to some extent Large) has even when the prompt mentions absolutely nothing about light
it aggressively wants to make everything dull and gray (and painterly), all the time
and make the entire image appear as though seen through inexplicable, excessive amounts of fog or smoke
in a way that is not at all desirable

devout schooner
# bitter hearth if you preferred robust+clear object outlines and a stylized aesthetic then you ...

putting everything we're talking about here aside I think it can't be stated enough that in fact Flux basically single-handedly invented and originated the "plastic skin CGI look"
for all realistic outputs
that all models since have done
stock Flux will never ever produce an image that looks like this 3.5 Medium one (meaning, just like, normal) under any circumstances
for any prompt like
a close-up portrait photograph of a young woman's face, focusing on her facial area from the eyebrows down to the upper lip. She is 18yo and has freckles. Her skin has a smooth, glossy texture from her makeup. She has dark brown eyes, framed by thick, dark, well-groomed eyebrows. Bold red eyeshadow extends from the inner corner of her eyes to the outer corners.

#

but revisionists want to insist nowadays that the way Flux looks has been how all models look
and that it's some kind of unavoidable problem
which obviously it is not

#

this is a big part of why I'm not particularly amped about HiDream
needing to use the full 17 billion parameter version to even just get normal-looking realism is a bit silly

#

and API-only models have definitely taken into account what I'm saying
like Reve does not produce CGI-esque Fluxy realism
nor does Ideogram
and so on
it's just this recent stream of extremely similar-to-Flux open source ones that seem to be trained by people who think being utterly incapable of authentic realism is desirable

mortal mesa
#

surface hasnt been scratched with HiDream yet, but ya its a chonker, still might have some pleasant surprises

bitter hearth
#
early JuggernautXL though lol
#

did that look super early on

#

I never rly found stock Flux useable, as I was saying the other day I never jumped from RealvisXL to flux until RealvisSchnell came out

#

and a bunch of photography+lighting loras

#

I also refined each flux image with RealvisXL or Realvis for SD 1.5

split bramble
#

I don't know if I've seen RealvisSchnell.

bitter hearth
#

best things always have zero hype
its like universal rule

#

there are better checkpoints these days though, at the time it was unrivaled IMO

#

it loses to properly de-distilled checkpoints as an example

devout schooner
devout schooner
#

HiDream looks more like Flux Schnell (at best) here

#

literally nobody wants shit that looks like this though
this is the number one complaint everyone has always had about Flux
and numerous loras were trained specifically to get rid of it (including by me lol)
like nobody is looking at this stuff and being like "yep, perfect"
I don't know who is convincing all these recent model trainers that this ridiculous not-actually-realistic-at-all bokehmaxxed style is desirable to anyone

#

like just make an open source model that looks normal but also doesn't have any amount of weird ass coherency and noise resolve issues
why is this so hard KEKW

mortal mesa
#

data issue

devout schooner
# mortal mesa data issue

it's probably partially that but it's definitely also caused by people deciding that their model NEEDS to be 12B to 17B parameters for some unexplained, unclear reason
and then aggressively distilling and DPOing them down as a result of that to make them runnable for average users

#

there's no way any of these models NEED that many parameters
like the practical difference between SD 3.5 Medium and Full HiDream DEFINITELY doesn't amount to a "14.5 billion parameter" difference
not even remotely close

mortal mesa
#

got a lot of bang for the buck with quantity for a while now, and ya I'd hope it shifts

devout schooner
devout schooner
# mortal mesa got alot of bang for the buck with quantity for a while now, and ya ide hope it ...

eh
I still just don't really get why people are so aggressively pushing HiDream
I even got into a truly bizarre argument with the simpletuner guy yesterday where he posited that LLAMA not being a "Chinese" text encoder was somehow a significant contributor to the reason
I'm still not even sure what exactly he really meant by that
or how that would plausibly be relevant to any average user lol

bitter hearth
#

going from flux to flux turbo to schnell does the same

#

something about distillation makes you lose soft lighting and blur

dry wave
#

HiDream could have been the model that replaced Flux (which is only available distilled).
But for me the model is way too parameter inefficient

#

I disagree with the claim you would not need many parameters, though.

#

I was sceptical myself first, but Flux is just so much smarter than smaller models

bitter hearth
#

snapchat made a 0.3B image model that looks about as nice as any to me

#

oh it won't be as smart yeah

dry wave
#

looking nice does not mean being smart

#

look at the ChatGPT model to see what you can reach with enough parameters

bitter hearth
#

I know what you mean by Flux being smarter yeah
it fixes stuff during inpaints and upscales etc
that does seem to require more parameters

dry wave
#

I would like to have a new flux with stronger text encoder and more efficient parameter use (e.g. adaln parameter sharing)

bitter hearth
#

I've been quite happy with tiny models for tiled upscale, like this one https://huggingface.co/cqyan/hybrid-sd-224m its SD 1.5 but squished to 224m so like a two-thirds size reduction

#

IDK if I am that bothered about parameter efficiency like
SVDquant Flux FP4 with 8-step turbo lora is rly fast on 5090 servers

#

so like without even mentioning B200s it can be fast on domestic 5090s even

dry wave
#

it's not so much about speed but about using parameters where they help most

#

like one big jump from sd1.5 to sdxl was that they removed the transformers in the first block because - surprise - they haven't done anything, and put them into the middle block

bitter hearth
#

yeah that first block in SD 1.5 is a big pain cos it makes SD 1.5 slower than SDXL at high res

#

which is crazy

#

when I use SD 1.5 I always use Modified Shifted Window Multi-head Self-Attention cos it fixes that issue
but then that makes it no longer work easily with torch.compile
its a mess

#

there is a "fix" which is to make custom CUDA kernel for tiled SD 1.5, which is kinda one of my current projects lol

turbid crane
turbid grotto
bitter hearth
#

I asked comfy about supporting it but no dice

turbid grotto
#

understood!

devout schooner
devout schooner
hallow lion
#

So is this the hidream channel for the time being?

#

We have a new kid on the block

jagged gate
queen edge
#

Close-up view of a corner connection for stacked shelves (thin galvanized steel rectangular tube). The top surface of the lower shelf corner has small metal blocks welded to form a square locating pocket or fence. The upper shelf has a 10cm tall square spacer foot welded underneath its corner. This foot fits neatly inside the locating pocket/fence on the lower shelf. Show the 10cm separation created by the spacer foot. Detailed, metallic, industrial design, 3D render.

bitter hearth
#

but when I go from 0.2B pruned SD to 30B stepfun I do see an increase in "smartness"

#

even though they are fundamentally the same type of network

#

the 15,000% increase in parameter count gives some benefits

#

having said that, its amazing how well 0.2B keeps up with 30B

#

0.2B pruned SD is perfectly fine for tiled upscale and other tasks like that
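
For scale, the 0.2B-to-30B jump quoted above works out to roughly a 150x ratio, or about a 14,900% increase, which the rounded 15,000% figure approximates. A tiny sketch of that arithmetic (the helper name is made up for illustration; the parameter counts are the ones quoted in the chat):

```python
def percent_increase(small: float, large: float) -> float:
    """Percentage increase going from `small` to `large`."""
    return (large - small) / small * 100.0


# Parameter counts in billions, as quoted in the chat.
pruned_sd = 0.2
stepfun = 30.0

ratio = stepfun / pruned_sd
increase = percent_increase(pruned_sd, stepfun)
print(f"{ratio:.0f}x larger, {increase:,.0f}% increase")
```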

dusky thistle
safe creek
#

Is the training set for SD3 or SDXL disclosed?

bitter hearth
#

sadly no

dusky thistle
raven fern
#

@dusky thistle hahaha it's been a long while since i was kinda active on this server, im happy to see you are still doing some clownshark pics 🙂

#

did you try some with HiDream?

dusky thistle
raven fern
#

😮

#

nice

dusky thistle
#

gotta get all the attention masking working so i'll have to modify the model code

raven fern
#

i actually never tried HiDream yet, will try it tonight, it seems most people use the Dev version?
should I go with that one?

dusky thistle
#

no idea

#

i'm trying full atm

raven fern
#

yea i will hopefully upgrade my PC this summer so I can enjoy all the good stuff instead of relying on quants or compromises

icy drift
icy drift
#

YES. Absolutely nailed it. Paws are so perfect. (This is the best of maybe 20; kept cancelling run halfway through based on preview.) Just realized the basketball is wrong though...

#

Huh. Looks like it almost never gets a basketball right.

#

Great at soccer balls. A little iffy overall.

icy drift
#

It usually seems like just a slightly better version of Flux.
But then you give it a 12-constraint prompt like this and it's just like, "Yeah, I got all that." And it nails all 12 constraints every single time (with accurate hands). It really is way more powerful / intelligent than Flux, just not more knowledgeable.
A male elf with braided silver hair and green skin is wearing a purple toga, standing on the shell of a giant turtle. In his left hand, the elf is holding an intricately etched golden staff. In his right hand, the elf is holding a slice of meat lover's pizza. The scene is cinematic moodily lit under an overcast sky.

#

By default, it's super consistent seed-to-seed; always the same pose. But you can use the normal partial-denoise two-pass to get as much creative variation as you want.
I'm trying hires 2x now, at 2496 height which is > 2048 the original repo maxed out at.

#

Aww. Errored out. No >2048 resolution maybe?

#

LOL rifle spear cartoon? Hires 1.5x no obvious artifacting, but hard to tell because of the print texture. Gotta specify the style.
I'm amazed at the fingers and toes. I need to try a hands-and-feet prompt.

#

It probably hurts the performance that we can't do separate prompts for the different text encoders, but the ComfyUI native nodes definitely fixed the banding problem that the custom node had.

Hmm. That is some jaw-dropping texture detail, although the overall luma / chroma is wonky. Lemme see if I can fix it.

#

Is Flux actually better at architecture solidness / symmetrical objects?

#

Getting too close to the original pose again though.

raven fern
#

@icy drift but the true test are tcg cards haha 🙂

icy drift
raven fern
#

yea

icy drift
#

Details on staff, necklace, and leather obviously AI. 2nd-pass needs 1.0 CFG or colors will blow out.

#

The preview shows this weird ping-pong effect like the model keeps trying to shake things up during generation. I haven't read the technical report if there is one, so I don't know how / if it differs from Flux.

#

That's one shiny card.

#

Printable center design no problem.

#

Got background art.

icy drift
#

I give up. It just can't do reflections. It may have been trained on synthetic reflection data that was wrong.

#

It's the best teeth-brushing model yet though.

#

I think its main real power is subject-inclusion constraint following. You can add tons of stuff into an image, and it will all show up.
It can't seem to do before/after same-person generations either. This model just has a really vague / fuzzy understanding of identities, materials, and structures.
Hmm. So why is it so good with hands (and presumably limbs, from other people's tests)?

dry wave
sand osprey
#

Generate a cartoon image of a rabbit

rustic bramble
#

Anyone worked with mamba backbones for diffusion?

quartz hamlet
#

High-detail map of North Africa, Morocco highlighted including Western Sahara, vintage parchment background, deep red and gold tones, cinematic lighting, ultra-realistic, no text, 16:9 aspect ratio

bitter hearth
#

which is even faster

dry wave
#

and works really badly in my opinion

bitter hearth
#

lol I remember you didn't like sana yeah

#

it makes an ok potato

dry wave
#

I think linear attention is not so important/critical for image generation. Maybe I'm lacking imagination, but why would you want to scale up diffusion to millions of tokens? I can only see two scenarios. First, generating huge resolution images like 8k. But honestly, before coming up with a better attention mechanism, I would rather come up with a smarter upscaling technique. Second, having a long conversation with multiple images (like omnigen or the current chatgpt). The latter might be interesting, but I think it's currently more realistic to have a strong diffusion method that can be conditioned on one or very few images; that would still be possible with normal quadratic attention

#

(I also don't think that mamba and Co have a future in large language models to be honest. Might be wrong here, but my intuition is rather that memory models are the future)

bitter hearth
#

oh generating to huge resolutions is like

#

the only thing I do lol

#

there is a project called CLEAR that nearly linearised Flux attention using a sliding window

#

it gives a 600% speed boost or so at 8k

#

flux is really nice at those higher resolutions especially if you can pass the 16k mark

#

its like getting a model from the future

#

the lighting goes so nice and soft

dry wave
#

even then I would first generate in low res with quadratic attention and then upscale to higher res with something else

#

although sliding window is not "linear attention" to me xD But yeah, its almost linear in time I get that

bitter hearth
#

yeah it was not quite linear time

#

so if the resolution got high enough it would still start to scale in an unfriendly way

#

but it got pretty high
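
A rough back-of-the-envelope sketch of why a sliding window gets close to linear while full attention does not. It assumes one image token per 16x16 pixel patch (8x VAE downsampling plus 2x2 patching, as commonly described for Flux) and a hypothetical fixed budget of 4096 keys per query; neither number is taken from the CLEAR project. It only counts query-key pairs, so measured speedups will be smaller than these ratios.

```python
def image_tokens(side_px: int, px_per_token: int = 16) -> int:
    """Token count for a square image (assumed 16 px per token side)."""
    per_side = side_px // px_per_token
    return per_side * per_side


def full_attention_pairs(n_tokens: int) -> int:
    """Every query scores every key: quadratic in the token count."""
    return n_tokens * n_tokens


def window_attention_pairs(n_tokens: int, window: int) -> int:
    """Every query scores at most `window` keys: linear once n_tokens > window."""
    return n_tokens * min(n_tokens, window)


WINDOW = 4096  # hypothetical key budget per query, for illustration only

for side in (1024, 2048, 4096, 8192):
    n = image_tokens(side)
    full = full_attention_pairs(n)
    local = window_attention_pairs(n, WINDOW)
    print(f"{side}px: {n} tokens, full/windowed pair ratio = {full / local:.0f}x")
```

The ratio grows with resolution because the full-attention count grows with the square of the token count while the windowed count grows only linearly.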

dry wave
#

damn, it's always hard to google for flux stuff xD

bitter hearth
dry wave
#

this looks really cool

#

but the memory consumption probably still makes it hard to generate more than 2k with it X_x

bitter hearth
#

yeah

#

gotta use those thailand h100s

dry wave
#

"The reason for these phenomenons is that pre-trained DiTs, such as FLUX, rely heavily on local features to manage token relationships. To validate this, we visualize attention maps in Fig. 4 and observe that most significant attention scores fall in the local area around each query."

#

that's exactly what I hate at these DiT architectures X_x

#

why did they not simply add both, sliding window attention and global attention, in a 90:10 ratio or so

bitter hearth
#

if you want a more modern / smarter version
thunderkittens team did sliding tile attention kernel for hopper

violet escarp
bitter hearth
#

the quality hit is fairly big yh

rustic bramble
bitter hearth
#

there's been all sorts of attempts to stop that quadratic scaling

#

I don't think it would help for me to link to a specific paper you need to read the whole literature really

#

like including RNN and GRU

#

its a frontier-level topic so its not rly something that can be distilled down into a small summary

rustic bramble
#

Hmm

#

I thought that mamba was the fastest sequence modeling framework...

bitter hearth
#

fastest is a loaded term cos

#

its hard to compare apples to apples

#

remember there are also non-deep methods

#

naive bayes, XGBoost, random forest etc

#

or just like standard panel data multivariate regression difference-in-differences models

#

or something like a dynamic stochastic general equilibrium model where you are essentially parametrising partial differential equations

dry wave
#

Mamba got a lot of attention when it was published, but it turned out it just doesn't work as well as transformers. Nobody is using mamba-only architectures anymore. However, there are mamba-transformer hybrids in use

#

but there are a million ways of speeding up transformers.

#

methods like mamba try to find a completely new architecture to solve the problem. But there are also just tweaks and fixes that can speed up everything sufficiently

#

for example: most of the time you don't need to attend everything to everything. So instead of removing all your attention layers from the model and replacing them with, e.g. linear attention, mamba, whatever, you could also just replace half of them or 80% of them. You could also use a unet like architecture and use transformers only in the middle layer (where they are cheap) and use other methods like neighbourhood attention in all other layers. There are so many ways of speeding up things which are not explored yet
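
A minimal sketch of the "replace only some of the attention layers" idea, assuming a plain stack of layers where every fifth one keeps global attention and the rest use a local window. The layer count, window size, and function names are hypothetical, and the cost model only counts query-key pairs.

```python
def layer_plan(num_layers: int, global_every: int) -> list[str]:
    """Mark every `global_every`-th layer as global attention, the rest local."""
    return [
        "global" if (i + 1) % global_every == 0 else "local"
        for i in range(num_layers)
    ]


def relative_cost(plan: list[str], n_tokens: int, window: int) -> float:
    """Attention-pair cost of `plan` relative to an all-global stack."""
    full = n_tokens * n_tokens
    local = n_tokens * min(window, n_tokens)
    total = sum(full if kind == "global" else local for kind in plan)
    return total / (full * len(plan))


# 20 layers, 4 of them global: 80% of the attention layers are replaced.
plan = layer_plan(num_layers=20, global_every=5)
cost = relative_cost(plan, n_tokens=16384, window=1024)
print(plan.count("global"), "global layers, relative cost", cost)
```

Under these assumptions most of the remaining cost sits in the few global layers, which is why keeping a handful of them is comparatively cheap.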

bitter hearth
#

transformers keep getting faster and faster with 6d parallelism, kernel search methods, better distilling etc

#

so the need for mamba or linear attn is less and less

#

hybrids can be ok yeah

icy drift
sullen moss
icy drift
# sullen moss HiDream ?

With Flux. I can't get HiDream alone to give me clear images at all for some reason. I'm using the recommended settings, and the quality is just bad.

sullen moss
icy drift
# sullen moss

How does it handle architecture? (E.g. banister railings.)

sullen moss
icy drift
# sullen moss Give me a prompt to test.

Just finished typing up a prompt and testing now with new settings.
The photo is taken from the bottom of a winding staircase in the interior of a suburban home. At the top right of the image, a woman is standing at the top of the stairs, leaning over the railing and looking down. The scene is a brightly lit interior.

#

Look at her wonky fingers and optical-illusion infinite stairs with super crazy disjointed banisters everywhere. It always comes out like this.

#

Here it is cleaned up with Flux at 0.61 denoise. I should've used 0.67 for architecture, but it did its best to fix the mess. (Of course, Flux can't actually follow this prompt if you try it just straight up. It can't get the camera position right.)
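
A minimal sketch of what a denoise value like 0.61 or 0.67 means in practice, assuming the common img2img convention where the input image is noised part of the way and only the last fraction of the schedule is actually run. The function name and the 50-step count are illustrative, and ComfyUI's own handling of `denoise` may differ in detail.

```python
def partial_denoise_steps(num_steps: int, denoise: float) -> tuple[int, int]:
    """Return (skipped_steps, executed_steps) for a partial-denoise pass."""
    executed = min(int(num_steps * denoise), num_steps)
    skipped = max(num_steps - executed, 0)
    return skipped, executed


for strength in (0.61, 0.67, 1.0):
    skipped, executed = partial_denoise_steps(50, strength)
    print(f"denoise={strength}: skip {skipped} steps, run {executed}")
```

A higher denoise value runs more of the schedule, so the second model has more freedom to rebuild structure such as banisters, at the cost of drifting further from the first pass.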

sullen moss
icy drift
sullen moss
icy drift
# sullen moss sora

Wow that's great prompt adherence (same architecture problems, but I remember it being crazy fast). What version of Sora is this? Where can I get it?

sullen moss
#
Sora

Transform text and images into immersive videos. Animate stories, visualize ideas, and bring your concepts to life.

icy drift
#

Oh LOL nevermind. I was thinking Sana. I don't care about closed source stuff.

bitter hearth
#

need enterprise deal to get access to the good sora sadly

#

love sana so much

icy drift
#

Okay, then full critique on that trash. The banister sweeping in from the top-left suddenly ends mid-air, clipped off by the right banister. She has no hands.

icy drift
sullen moss
bitter hearth
#

ye its not flux level at all

#

it can do text

#

but I could not get "the" or "jedi"

icy drift
bitter hearth
icy drift
#

But Flux can do full book covers too, with title and author title, in whatever font you want, with amazing kerning and composition.

bitter hearth
#

I think we need separate foundation models for text anyway really

#

cos it feels like a very specialist task

#

I like having separate models for stuff

#

I don't make big images often these days but when I used to, I would chain together like 20 different models, some as regions and some as upscale passes

sullen moss
icy drift
sullen moss
#

Dream

icy drift
# sullen moss Dream

How is the leather on her belt so consistent / solid looking? I get a blobby mess every time. 😦

icy drift
knotty axle
jaunty basin
#

Comic-style wide shot of a dark alley, with the polar bear and the little man in the black hoodie locked in a tense face-off. The bear stands on two legs, wearing a red scarf and brown newsboy cap. The man is frozen in place as the bear’s massive shadow stretches over him, cast by stark moonlight. Mood: fatal finality. Use silhouetted framing with sharp light-dark contrast, emphasizing the bear’s dominance and the gravity of the moment. Let the moonlight carve out dramatic shapes in the alley, heightening the cinematic tension.

surreal nova
#

Generate an image…. What the hell am I doing? How do I create images on here?

#

Or is this just a chat for people to talk about?

dry wave
#

there is a bot for generating images in a different channel, but it's not free. This discord is about open source models, so people generate the images on their own graphics cards.

lucid roost
#

"how to draw a cartoon elephant, step-by-step guide, 6 panels, each step building progressively -- step 1: simple head shape -- step 2: add trunk and eyes -- step 3: connect body to head -- step 4: draw legs and tail -- step 5: complete the full body with outlines -- step 6: fully colored elephant in soft pastel colors, clean vector art, minimal background, kids tutorial style, high resolution"

polar coral
#

"A normal-looking public charging port (at an airport/mall) with a hidden microchip inside. A close-up shot shows the chip's LED faintly glowing as soon as a user connects their phone, with a 'Data Transfer' animation flashing on the phone screen. In the background, binary code streams and a hacker's hand is partially visible."

digital valley
#

dream

#

Dream

#

(8k game sprite), (front view), pixel art, office worker,
(stressed face), (messy hair), (glowing computer screen reflection on glasses),
(untucked shirt), (coffee stain on tie), (holding smartphone under desk),
(cubicle background), flat shading, muted colors,
(comedy elements: tiny cactus with "F**k Work" sign),
art by Scott Benson, inspired by "Don't Starve"

pale blade
#

In the first illustration, another column of characters needs to be added to the left of the main one. The new column should have empty slots for boarding crew. Then the whole interface needs to be aligned.

quick lava
#

generate image

gusty wing
craggy crest
craggy crest
cinder junco
icy drift
#

In a futuristic white room, on the right an android is floating in a green tank, and on the left a woman is standing at a control panel. The green tank is a tall cylinder of aquarium glass filled with fluid. The android inside the tank is floating weightless above the floor, with head tilted back with eyes closed, and with arms spread. On the left, the woman standing at the control panel is facing left and looking down at the panel. Her hands are on the control panel. The woman at the control panel is wearing a white lab coat. The control panel is a blue and white futuristic LCD screen. The room is futuristic and white, with panels and lines. The scene is brightly lit.
Yep, that's some absolutely amazing composition and constraint following from the prompt. Blows Flux out of the water in that regard. It can't do the solid-looking architectural structures (like the wobbly lines on the vent overhead or the mangled lines on the screen / panel / her fingers), but the prompt adherence is just so useful. It just needs low-denoise hires with Flux to fix the mistakes.

crimson hull
#

In a futuristic white room, on the right an android is floating in a green tank, and on the left a woman is standing at a control panel. The green tank is a tall cylinder of aquarium glass filled with fluid. The android inside the tank is floating weightless above the floor, with head tilted back with eyes closed, and with arms spread. On the left, the woman standing at the control panel is facing left and looking down at the panel. Her hands are on the control panel. The woman at the control panel is wearing a white lab coat. The control panel is a blue and white futuristic LCD screen. The room is futuristic and white, with panels and lines. The scene is brightly lit.

crimson hull
wet kayak
#

Hello @Team,

I’m currently working on generating a Toy Starter Pack-style image using the following API:
"https://api.stability.ai/v2beta/stable-image/generate/sd3"

However, I’m not getting results that align with the attached reference image. I’ve included the prompt in the appropriate section and tested with various "seed" values, but the output still doesn’t meet expectations.

Could you please advise:
If this is the correct API to use for achieving this specific style?
If there are particular parameters or configurations I should be adjusting to improve the results?

Your guidance would be greatly appreciated.

craggy crest
craggy crest
tulip fern
#

Can someone help me with this?

muted cargo
#

sd3 is trained for 1024x1024

tulip fern
tulip fern
dry wave
#

dunno what schedule type automatic means, but rectified flow works simply with the Euler Sampling method

tulip fern
dry wave
#

Euler and normal are the settings in comfyui

dry wave
#

Sampling method: Euler
Schedule Type: Simple
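
A minimal sketch of why plain Euler is enough for a rectified-flow model, assuming the convention that a noisy sample is `(1 - sigma) * clean + sigma * noise` and the model predicts the velocity `noise - clean`. The toy velocity function below stands in for the network, and the evenly spaced schedule is a simplification, not the exact "Simple" scheduler of any UI.

```python
def linear_sigmas(steps: int) -> list[float]:
    """Evenly spaced noise levels from 1.0 (pure noise) down to 0.0 (clean)."""
    return [1.0 - i / steps for i in range(steps + 1)]


def euler_sample(velocity, x: list[float], sigmas: list[float]) -> list[float]:
    """Integrate dx/dsigma = velocity(x, sigma) with explicit Euler steps."""
    for s_cur, s_next in zip(sigmas[:-1], sigmas[1:]):
        v = velocity(x, s_cur)
        x = [xi + (s_next - s_cur) * vi for xi, vi in zip(x, v)]
    return x


# Toy stand-in for the model: the exact velocity toward a known target.
target = [0.25, -1.0, 3.0]


def toy_velocity(x: list[float], sigma: float) -> list[float]:
    return [(xi - ti) / sigma for xi, ti in zip(x, target)]


noise = [1.5, 0.3, -0.7]
result = euler_sample(toy_velocity, noise, linear_sigmas(8))
print(result)
```

Because the ideal path from noise to image is a straight line, each Euler step stays on it; with a real network the predicted velocity is only approximate, which is where the step count starts to matter.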

tulip fern
#

I'm still getting the same image as before

weary crystal
icy drift
#

HiDream E1 working! 🙂 (E1 only works with 768*768 as far as I know, so output is smaller. At 1024, it will shift the image. No problem to upscale / hires / low-denoise.)
Change the painting on the canvas into a tuna fish.

#

It did its best with my terrible sketch.
Change the rough sketch style to a clean lineart style with coloring and shading.

#

I had to reroll and modify the prompt 3 times, and it still missed one of the apples. Maybe if I had said "4 apples" instead.
Change the apples' materials from gold to transparent crystal.

icy drift
#

I think Omnigen might still be better though.

bitter hearth
#

omnigen is still good sometimes yeah

serene whale
#

a beautiful and powerful mysterious sorceress, smile, sitting on a rock, lightning magic, hat, detailed leather clothing with gemstones, dress, castle background

dull star
#

Mods?

#

Scam

#

I can't ping all the mods at once

#

they removed that option

#

uhhh @spark grove ig

lilac plinth
#

Make moustache narrow and wide with slightly curved ends, Make eyes positioning in correct direction simultaneously natural, Mir anees

craggy crest
icy drift
#

Make it a rainy day, photoreal style.
Nailed it in one! 🙂 New IC-Edit lora with Flux-Fill.

#

Give the kitten sunglasses, photoreal style.

#

Underwater.

weary crystal
mortal kite
#

My harddrive can't handle all these models

dry wave
#

I waited for a lora for flux-fill 😀 do you have a link?

tacit ermine
#

Can anyone tell me how to modify parts of an image through prompt in stable diffusion

weary crystal
tacit ermine
#

I'm new to discord

weary crystal
craggy crest
azure thorn
#

Help me generate a pink butterfly with a black background.
Only pink and black images

craggy crest
#

Hot dog!

deft locust
#

How do I generate an image?

#

Is this free?

craggy crest
lucid glade
#

Hot dog!

final lynx
#

can i run sd3 on a rx580?

stark plume
#

Make a cartoon big mango with human eyes

real terrace
#

if it works it would be really slow I guess

violet escarp
#

@spark grove

#

it's one of those fake steam bots again

spark grove
sage burrow
sage burrow
blazing spoke
#

Make a realistic pakistan PIA Aeroplane

restive wigeon
#

<dem_form> <img1_style> a biomechanical humanoid creature with tusks and extended tongue, bust portrait, in the exact rendering style of the second image, cinematic shadows, dark metallic skin, surreal alien armor, inspired by H.R. Giger, highly detailed, photoreal 3D style, atmospheric lighting, monochrome tones

craggy crest
#

You cannot generate images in this channel

craggy crest
dry talon
#

pig image

woeful prawn
#

a 185cm high sexy man wears transparent sexy underwear under the sunshine

turbid grotto
jagged gate
#

Hidream

rugged yacht
#

generate cartoon image with girl

runic tusk
#

No.

harsh fjord
#

how to use this

runic tusk
remote holly
#

sd3.5 doesn't deserve to be forgotten by the community

cinder elk
#

global automotive manufacturing facility, robotic assembly lines, quality control inspection, international collaboration, high-tech production environment, workers in clean uniforms, cinematic lighting, panoramic wide aspect ratio

proven pecan
jade minnow
#

A very pretty Chinese girl with a smile on her face and a nice figure, wearing a purple dress

warped prairie
#

A cheerful, illustrated poster featuring a variety of wild animals engaging in humorous and chaotic parenting moments, like a koala dropping its baby, a hamster eating its young, and a black bear sleeping through parenting duties, all surrounded by hand-drawn floral borders and playful typography that says “There Are Moms Way Worse Than You”, pastel color palette, children’s book illustration style, flat vector aesthetic, clean white background --ar 4:5 --v 6.0 --style raw --s 250

craggy crest
frail shoal
icy drift
#

Settings made a big difference for Bagel. Times are on my 4090.
A trading card from a trading card game. The title of the card at the top says: "GOBLIN FIEND". The card art shows a green goblin with red eyes holding a knife. Under the art is a text panel. The text panel has the text: "The goblin fiend loves the taste of fried foods and does not like vegetables." At the bottom-right corner of the card is a number panel. The number panel has an icon of a sword, and the number 7. The card is polished and well designed, with highly detailed art and text in a clear, crisp font.

#

(Bagel with bad settings. Made the difference between SDXL accuracy vs. better than Flux Dev accuracy.)

weary crystal
#

Guess the number of steps is more important; these were made with 26

#

38 Seconds with Chroma v30 unlocked.

#

SD3.5 Medium 15.6 Seconds

icy drift
#

Yeah the steps are more relevant unless you're using my same hardware...
My example gens were 50 steps each. The point was just to compare Bagel's speed to the speed of other models. I've never heard of Chroma, and that's some impressive prompt following. Easily on par with Flux. I'll check it out.

#

Oh chroma is just flux nvm. Just gives me solid black images in Comfy.

violet escarp
#

it's modified from Flux schnell

#

it's a different arch. It prunes schnell from 12b to 8.9b and corrects a mistake related to padding tokens. You need a different workflow for it.

turbid grotto
#

it also does not need CLIP-L and is still in the 512px pretraining phase, so there is potential

craggy crest
sand badger
#

A cheerful, illustrated poster featuring a variety of wild animals engaging in humorous and chaotic parenting moments, like a koala dropping its baby, a hamster eating its young, and a black bear sleeping through parenting duties, all surrounded by hand-drawn floral borders and playful typography that says “There Are Moms Way Worse Than You”, pastel color palette, children’s book illustration style, flat vector aesthetic, clean white background --ar 4:5 --v 6.0 --style raw --s 250

#

A cheerful, illustrated poster featuring a variety of wild animals engaging in humorous and chaotic parenting moments, like a koala dropping its baby, a hamster eating its young, and a black bear sleeping through parenting duties, all surrounded by hand-drawn floral borders and playful typography that says “There Are Moms Way Worse Than You”, pastel color palette, children’s book illustration style, flat vector aesthetic, clean white background --ar 4:5 --v 6.0 --style raw --s 250 ¯_(ツ)_/¯

mental raptor
#

where do you get stable diffusion, preferably a gui version?

craggy crest
# mental raptor where do you get stable diffusion, preferably a gui version?

https://civitai.com/models/878387/stable-diffusion-35-large - this is the model. you'll need to install something to run it in. i recommend you install https://github.com/mcmonkeyprojects/SwarmUI

icy drift
swift saffron
#

An Asian man with a receding hairline and a round face, wearing a white shirt inside and a gray-blue jacket outside, holding a wooden dragon-headed cane. He is 190 cm tall and stares at the cane arrogantly. This is a 2D game concept map, with obvious strokes, transparent colors, delicate clothing materials, and shiny leather shoes. --cref https://s.mj.run/CxufHB5Qy1s --cw 100 --ar 9:16

rocky geode
#

is it possible to run a 22 GB Flux dev when we only have a 16 GB GPU? (maybe by using system RAM? got 64 GB of it)

#

Thanks for the speedy answer!

weary crystal
#

don't click on the server

rocky geode
#

ha ?

#

I didn't click on it

#

you have good moderation here =)

weary crystal
#

Good for you. Lately scammers appear whenever questions are asked, telling you to join a Discord where they need your login, etc.

rocky geode
#

So I shouldn't share things here?

weary crystal
#

Back to your question. There are some quantised Flux dev models in fp8 that are smaller. You will not be happy with the run time if the model gets pushed into system RAM.
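
Rough arithmetic on why the fp8 files are smaller: weight size scales with bytes per parameter. This is a minimal sketch, assuming Flux dev has roughly 12 billion parameters in its transformer and counting the weights only (text encoders, VAE, and activations are ignored). The helper name `weight_size_gb` is made up for illustration.

```python
def weight_size_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate size of the weights alone, in GB (1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


FLUX_DEV_PARAMS = 12e9  # assumption: roughly 12B parameters in the transformer
GPU_VRAM_GB = 16        # the 16 GB card from the question

sizes = {
    "fp16/bf16": weight_size_gb(FLUX_DEV_PARAMS, 2),  # 2 bytes per weight
    "fp8": weight_size_gb(FLUX_DEV_PARAMS, 1),        # 1 byte per weight
}

for name, size in sizes.items():
    verdict = "fits" if size < GPU_VRAM_GB else "spills into system RAM"
    print(f"{name}: ~{size:.0f} GB of weights, {verdict} on a {GPU_VRAM_GB} GB card")
```

Under those assumptions the 16-bit weights land in the same ballpark as the "22 gigs" file from the question, while fp8 halves that. "Fits" is optimistic here, since the text encoders and activations still need memory on top of the weights.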

#

Well, tech-support would be a good channel for these questions. Scammers are everywhere 😦

rocky geode
#

oh i'm ok with fp8 vae ones

#

♥ Lords, thanks for your rescue ♥ I'll try to be careful with what I ask!

weary crystal
rocky geode
#

promise, I'll Wireshark-log them ^^

icy drift
sullen moss
#

Flux Kontext

icy drift
#

Sure... I'll try the Kontext Dev version when available. I'll be BLOWN AWAY if it can do any of the following at all:

  1. follow literally even one simple instruction like: "make his shoulders wider", or "make her look off to the right", or "make the dog cross its paws".
  2. preserve outfits from references without changing sleeve length, adding panels, moving seams, changing button count, etc.
  3. do simple / basic UI edits like, "add a sword icon in the bottom-right corner", or "make the title font at the top larger", or "add a raised border around the red button on the bottom right"
    I am expecting the model to struggle immensely with anything other than the tasks listed in the paper, and I'm expecting it to need exact-word prompting to achieve those.
cunning lintel
#

Tried the first, no joy on the pro tagged version

#

But who knows, maybe ControlNet pose will be added and you can twist and turn your char any way you want to, that doesn't seem that big of a leap

cunning lintel
#

Still, the way it preserves details is amazing, original photo "add a roswell alien riding the zebra" "make the alien hold one arm up, as in a greeting". And all gens actually keep looking like a photo, by far the biggest win for me, not the faux 3d render photo look so often seen.

#

So i thought style transfer would be really good too, nope, very hit or miss, hit for common digital art styles, but when using simple line doodles or charcoal stuff, style was pretty much ignored 😢

icy drift
cunning lintel
#

maybe it's the prompting (might help to add a little style to the prompts), $.04/image is too much to experiment a lot for me

icy drift
cunning lintel
#

using this style create: a towering Lizardfolk mercenary whose scales are fused with veins of obsidian and reinforced with magi-tech plating. His eyes glow with internal arcane energy. He wears heavy brass pauldrons enchanted for durability and carries a powerful sonic disruptor gauntlet. He's gruff, pragmatic, and focused solely on the highest bidder.

#

it's something, maybe i just want too much :p

#

the first one was my first try and seeing it keep the effect of white border, image over it, i thought, wow, that's good

craggy crest
devout schooner
devout schooner
# devout schooner I finally got around to actually releasing an SD 3.5 Medium lora after spending ...

relevant training notes I guess if anyone cares (in Kohya-ian terms):

  • CAME optimizer
  • no text encoder training
  • 0.0001 model learning rate
  • Cosine With Restarts Scheduler set to 3 restarts
  • "Noise Offset" at 0.2
  • Multires Noise Iterations and Multires Noise Discount disabled entirely
  • Dim 64, Alpha 32, Dora model type, "factor" set to 2
  • Batch Size 1 but with 5 gradient accumulation steps (as an alternative to a regular batch size of 5, which I imagine would have a similar effect)
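
For anyone wondering why batch size 1 with 5 accumulation steps should behave like a real batch of 5, here is a toy sketch in plain Python. It is not Kohya code: the model `y = w * x`, the data, and the `grad_mse` helper are all made up for illustration. It only shows that averaging five single-sample gradients gives the same gradient as one batch of five.

```python
def grad_mse(w, xs, ys):
    """Gradient of the mean squared error for the toy model y = w * x."""
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]
w = 0.5

# One real batch of 5 samples.
full_batch_grad = grad_mse(w, xs, ys)

# Batch size 1, accumulated over 5 steps. Each micro-batch gradient is
# divided by the number of accumulation steps before being summed.
accum_steps = 5
accumulated_grad = 0.0
for x, y in zip(xs, ys):
    accumulated_grad += grad_mse(w, [x], [y]) / accum_steps

print(full_batch_grad, accumulated_grad)
```

The match is exact only for the plain gradient. In a real trainer the two setups can still drift apart slightly, for example through layers that depend on batch statistics, so "similar effect" is probably the right way to put it.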
#

also I think I generated all the samples with DPM++ 2S Ancestral SGM Uniform @ CFG 5.5, in Comfy

dry wave
#

why do you even use noise offset?

#

it's a weird hacky technique full of potential errors that is not even necessary for rectified flow matching models

devout schooner
#

I can't speak to anything other than the results lol

#

the only "training guide" ever released for SD 3.5 anything was basically nonsense in my extensively tested opinion

#

anything other than Dora is basically useless
any normal Adam optimizer is basically useless
I've never gotten vaguely good results for SD 3.5 Medium with anything other than CAME and low-factor Doras joeshrug

#

it doesn't train anything remotely like any other model ever

modern rover
#

create an icon like this photo

lucid jasper
#

Draw a statue of an anime god : "raise the level alone" contrast photo with highlights, without background

opaque oyster
#

#artisan-1 - PA realistic standing image of Lord Kalabhairava, the fierce form of Lord Shiva. He is depicted with a terrifying yet divine expression, with three eyes glowing like fire. His complexion is dark as a stormy night, adorned with garlands of skulls and serpents. He stands powerfully in a cremation ground, surrounded by blazing fires and spirits. He holds a trident, a drum (damaru), a noose, and a skull bowl in his four hands. His hair is matted and flies wildly, crowned with a crescent moon. His feet are adorned with golden anklets, and he wears tiger skin. A dog stands loyally beside him. The atmosphere is mystical, with storm clouds and divine light behind him, capturing the essence of time and death. Style: Hyper-realistic, high detail, divine and intimidating aura, traditional Hindu iconography.

weary oxide
#

PA realistic standing image of Lord Kalabhairava, the fierce form of Lord Shiva. He is depicted with a terrifying yet divine expression, with three eyes glowing like fire. His complexion is dark as a stormy night, adorned with garlands of skulls and serpents. He stands powerfully in a cremation ground, surrounded by blazing fires and spirits. He holds a trident, a drum (damaru), a noose, and a skull bowl in his four hands. His hair is matted and flies wildly, crowned with a crescent moon. His feet are adorned with golden anklets, and he wears tiger skin. A dog stands loyally beside him. The atmosphere is mystical, with storm clouds and divine light behind him, capturing the essence of time and death. Style: Hyper-realistic, high detail, divine and intimidating aura, traditional Hindu iconography.

buoyant mesa
dull locust
#

prompt
"AI educational teaser: classic mechanical gears (a symbol of legacy systems) slowly transform into digital structures. First they morph into the glowing layers of a simple neural network (3 layers with blue and green lights), then they evolve into a complex Deep Learning architecture (with hundreds of connected red, yellow, and blue lights).

echo pond
ebon bloom
#

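prompt
Theme park landscape, a lime crystal sculpture as the centerpiece installation, teardrop-shaped lime trees suspended in a transparent glass greenhouse, a curved glass walkway encircling a pale green reflective pool, modern minimalist architecture made of glass and acrylic, sunlight refracted through prisms into rainbow light spots, soft-focus fresh color tones, isometric composition, by Nendo studio --ar 16:9 --v 6.0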

dusky thistle
#

SD35M

#

got a lot of upgrades to style transfer going

dusky thistle
dusky thistle
dusky thistle
dusky thistle
copper whale
# dusky thistle

i like this one, this kinda reminds me of the starry night of vincent van gogh, hope u generate more of thisss :>

lilac plank
#

Realistic image of a porsche

craggy crest
copper whale
# craggy crest

soo stunning...like a glowing fantasy world come to life broo

craggy crest
hasty plank
#

Modern Style Bedroom Interior Scene with Contemporary Decorative Style, Bed, Cabinet, Nightstand, Table, Greenery

open heart
#

a lone survivor senses danger lurking behind shattered glass.
Shot concept: Grit, suspense, and post-apocalyptic atmosphere.
#AIart #Cinematic #SurvivorScene #PostApocalyptic

devout schooner
tardy niche
#

bear

#

can someone tell me how to make images pls

runic tusk
#

Nice scam.

urban arch
#

Nah, the really good ones are the ones that actually sound real.

open heart
devout schooner
# copper whale awesome bro

Thanks
I've got a big detailer one trained solely on hi res Flux Pro Ultra outputs I'm gonna release soon too
Stock Medium on Left, with Dora on right
Same seed / prompt / sampling settings / etc

simple flame
#

create a friendly, cute, white and round robot assistant that resembles eve from wall e, depict her from different angles

brisk brook
#

Create a photorealistic and realistic image with a resolution of 3840x2160 in a cyberpunk style. In the foreground, depict a very beautiful, slender woman with a short haircut, who is half Asian and half Caucasian. She wears thin, tight-fitting clothing with the inscription "Xaero," through which the outlines of her nipples are visible. In the background, show a megacity with dark tones accented by blue, pink, and purple colors, and a cyberpunk-style sports car parked nearby. Please generate 3 different variations of this image. The image should have photographic realism, with detailed lighting, textures, and atmosphere typical of high-end cyberpunk visuals

cunning lintel
short thicket
#

Works alright for me. Cheers.

copper whale
rapid sparrow
#

PA realistic standing image of Lord Kalabhairava, the fierce form of Lord Shiva. He is depicted with a terrifying yet divine expression, with three eyes glowing like fire. His complexion is dark as a stormy night, adorned with garlands of skulls and serpents. He stands powerfully in a cremation ground, surrounded by blazing fires and spirits. He holds a trident, a drum (damaru), a noose, and a skull bowl in his four hands. His hair is matted and flies wildly, crowned with a crescent moon. His feet are adorned with golden anklets, and he wears tiger skin. A dog stands loyally beside him. The atmosphere is mystical, with storm clouds and divine light behind him, capturing the essence of time and death. Style: Hyper-realistic, high detail, divine and intimidating aura, traditional Hindu iconography.

cunning lintel
# copper whale i think this is just fine bro maybe u have to explore more? i guess

The intent/prompt was to use the style of a source image (the same i used in the post i replied to), sadly the model hardly followed it, only the color scheme a bit. Other trickery might or might not work (i had some success adding stuff to real photos) but style reference isn't something i managed to make the dev version do. (and it was what i looked most forward to 😞 )

fresh plover
#

Create a 1990s realistic portrait featuring Mexican American singer Selena Quintanilla with long dark hair and bangs, she's wearing red lipstick she's smiling

spark quail
#

PA realistic standing image of Lord Kalabhairava, the fierce form of Lord Shiva. He is depicted with a terrifying yet divine expression, with three eyes glowing like fire. His complexion is dark as a stormy night, adorned with garlands of skulls and serpents. He stands powerfully in a cremation ground, surrounded by blazing fires and spirits. He holds a trident, a drum (damaru), a noose, and a skull bowl in his four hands. His hair is matted and flies wildly, crowned with a crescent moon. His feet are adorned with golden anklets, and he wears tiger skin. A dog stands loyally beside him. The atmosphere is mystical, with storm clouds and divine light behind him, capturing the essence of time and death. Style: Hyper-realistic, high detail, divine and intimidating aura, traditional Hindu iconography.

muted cargo
#

short name for a channel holding all the information related to this Discord's channel bot.
🤖 : beep boop beep, there you go #artisan-faq

vast crag
#

Chinese ink painting of the Red Cliffs battlefield at dusk, towering red cliffs with crashing waves (‘乱石穿空,惊涛拍岸’), ruined ancient fortifications in the distance, a young General Zhou Yu (周瑜) in silk headscarf and feather fan (‘羽扇纶巾’), standing beside Lady Xiao Qiao, romantic yet heroic atmosphere, misty river reflecting moonlight, fusion of historical grandeur and poetic melancholy, Song Dynasty landscape style.

muted cargo
#

...

sly escarp
#

“Are there open-source virtual try-on projects I could help with or test?”

raven elm
#

kaleidoscope sucked into a kaleidochromic vortex --s 750 --v 7.0 --raw - Remix (Strong)

lyric iron
#

Close-up professional corporate man headshot, modern business portrait. The subject's head and shoulders are tightly framed, filling most of the image. Focus is sharply on the face, particularly the eyes, with a shallow depth of field blurring the background.
Lighting: Three-point studio lighting setup optimized for a close-up. A soft, diffused key light directly or slightly to the side of the face to minimize harsh shadows. A fill light to subtly illuminate the shadow areas under the chin and nose. A hair light or rim light from behind to add a subtle highlight along the hair and separate the subject from the background.
Background: Smooth, solid, neutral dark gray or deep blue background, completely out of focus to ensure maximum attention on the subject.
Camera & Style: Simulated DSLR photography with a high-quality portrait lens (e.g., 85mm equivalent). The image should have ultra-detailed facial features, realistic skin texture (without excessive smoothing), and professional, neutral color grading suitable for business use. The overall feel should be confident, approachable, and trustworthy.

copper whale
soft hamlet
#

Create a hyper-realistic 8K resolution cinematic poster of Mobile Legends: Bang Bang featuring 5 characters: Layla (with her cannon), Dyrroth (in a fierce battle stance), Harley (casting a magic card), Esmeralda (with her cosmic scythe and flowing cloak), and Akai (spinning with his bamboo staff). The scene should be dark and dramatic, with intense rim lighting, glowing particle effects, lens flares, and smoke in the background. Position the characters in a powerful triangular composition on a fantasy battlefield with magic energy storms and ruins. Each hero must look dynamic and battle-ready, with ultra-detailed armor and realistic facial expressions. Add cinematic color grading and film grain for a movie-poster look. Include the title ‘Mobile Legends: Bang Bang’ in bold metallic lettering at the bottom center. Aspect Ratio: 16:9. Full movie-poster tone, highly detailed, epic fantasy style."

tardy prism
dusky thistle
#

Probably was zavy

deft grove
#

sakura, white, pink --ar 9:16 --sref 2121577414

vernal yew
#

Convert to anime

fresh plover
#

A 1990s-style self-portrait of 27-year-old Jennifer Lopez, with long, dark wavy hair and soft bangs. She wears bold red lipstick and is styled like Mexican American Tejano singer Selena Quintanilla. The photo has a warm, vintage studio portrait vibe, with soft lighting and a nostalgic 90s glamour aesthetic.

surreal anvil
#

Generate a black and white portrait of my face, shot from a close-up, overhead angle, with my head facing forward. I used a 35mm lens and 10.7K 4HD quality.

Proud expression, water droplets on my face. Background with deep black shadows: only my face is visible, and it looks ultra-sharp. Aspect ratio of 4:3, with a 1/5 depth of field effect.

copper whale
runic tiger
#

#▶|stable-video-diffusion a divine digital painting of Lord Krishna as Radha Ramana sitting beneath a blooming kadamba tree on a carved stone bench, Radha resting gently against his shoulder, Krishna wearing a saffron‑yellow silk dhoti and peacock‑feather crown, softly playing the flute, Radha in a pastel pink and turquoise lehenga with jasmine garlands around her braid, lotus‑filled pond glimmering behind them, morning mist and golden rays filtering through leaves, peacocks and deer in the background, tranquil Vrindavan atmosphere, ultra‑detailed devotional art, cinematic soft lighting, peaceful romantic mood, high‑resolution

torpid marlinBOT
#

how make ai photo
No data source is currently selected. Please choose a data source from the dashboard and try again.

sly escarp
#

@civic latch yes I have the specifications/ details of the logo

dusky thistle
dusky thistle
dusky thistle
dusky thistle
dusky thistle
dusky thistle
dusky thistle
tender oak
#

golden retriever dog

dusky thistle
#

Here is the image you requested.

#

Here is the image you requested.

#

Here is the image you requested.

dusky thistle
dusky thistle
dusky thistle
dusky thistle
raven elm
#

house in the woods resemblance of a castle but more like a home

violet escarp
#

@spark grove spammer scammer

spark grove
violet escarp
#

@spark grove again

craggy crest
#

1: that prompt is too long and 2: you can't generate in this channel

dawn cargo
#

Ultra-realistic photogrammetry 3-D globe named Gloxus, 16-k resolution earth texture, micro-topographic detail on every mountain ridge and river delta, continents carved from obsidian-black basalt with razor-sharp displacement maps, iridescent neon-cyan ocean currents swirling under a thin glass layer, holographic magenta circuit-veins mapping city lights across landmasses, subtle cyan grid lat/long lines hovering 2 mm above surface, cinematic rim-light from a cool white sun at 45°, micro-scratches and fingerprint smudges on glossy protective dome, shallow depth of field f/1.4, 32-bit HDR, octane render quality, ray-traced reflections, photoreal shadows, ultra-sharp 200 mm lens, clean black studio background, --ar 16:9 --cfg 12 --steps 40 --sampler DPM++ 2M Karras --vae kl-f8-anime2 --no text, watermark, logo, frame

cinder junco
# dusky thistle Here is the image you requested.

Shame on you. You’ve picked on this poor, helpless bot and now, somewhere on the Indian subcontinent, there is a web page where this image is captioned as an attractive young woman in a business suit. I hope you think about the suffering you have caused.

pulsar oak
dusky thistle
copper whale
languid terrace
#

Anime-style third-year student with spiky hair, wearing a tank top and shorts, dramatically leaping towards a basketball hoop placed on a mountain summit, sweat droplets flying, exaggerated wind effects, vibrant sunset colors with pink and orange clouds, stylized rocky terrain, action comic book shading, inspired by 'Slam Dunk' artwork

raven elm
spark quail
#

how the heck do we counter-report someone

#

this king is one of the few keepin this channel alive

#

okay 😭

fresh yoke
#

Hello im new here diffusionhand

regal vault
#

Photorealistic full-body portrait, eye-level shot, sharp focus on subject: A beautiful, energetic 22-year-old Vietnamese woman, exuding confidence and strength. Her skin is glowing with perspiration, highlighting her active state. She is clad in sleek, form-fitting athletic wear (e.g., a tight sports top or tank top and high-waisted leggings) that accentuates her toned physique and prominent bust. The fabric, slightly damp with sweat, clings closely to her body, subtly emphasizing her natural contours and definition beneath. She is captured mid-movement or pausing in a modern, well-equipped gym, with blurred fitness equipment, bright mirrors, and a motivating atmosphere in the background. Her expression is focused and determined, yet radiating a youthful vitality. Natural gym lighting with subtle highlights on her skin and the sheen of sweat. Captured with exquisite detail and sharpness, showcasing natural tones in a realistic photographic style, akin to a professional shot on a high-end DSLR (e.g., Canon EOS R5 with a 70-200mm f/2.8 lens), ISO 400, 1/160s shutter speed, and f/3.2 aperture. Shallow depth of field, drawing all attention to the woman. True-to-life colors. Aspect ratio 9:16.

worldly gulch
#

I would like you to depict the lettering "Il volo di Crà"; I picture it resting on the shore of an island, gently lapped by the sea. I would like the characters that make it up to look as if they were carved into rocks and lightly covered with vegetation.

raven elm
solar grail
#

expand this

rigid marten
open heart
teal ingot
#

Hi

raven elm
craggy crest
#

@spark grove spammer alert

spice pine
#

ocean,beach,young girl

dark plume
#

I wanna generate this kind of image, can someone help me

slate glacier
#

Anyone use Stable Diffusion to segment?

craggy crest
junior cloud
thick aurora
#

guys, how can I run sd3.5 on Forge? I believe I'm doing something wrong, because it doesn't generate an image and it tilts my Colab when I use it =/

errant dust
#

So any opinions on Krea yet?

ruby prawn
#

my first generation with generate/ultra

sullen moss
# errant dust So any opinions on Krea yet?

I recently tested it locally. I can't say I'm overly impressed with this model. It's very noisy, in my opinion. You could say it's just another fine-tuned model, nothing more.

errant dust
#

Krea took Flux Dev Raw and then did their own post-training. This blog entry details it

#

So calling it a finetune is not wrong, but it goes quite a bit deeper all the same

#

I will add that my initial images with Krea are very nice and are more interesting to me than vanilla Flux dev. Is it the best overall Flux? I haven't come to any conclusions. My other fav was/is Pixelwave. I tried other attempts but they were inevitably not very interesting. This is all without Loras of course

#

I still really love SD 3.5 FWIW. They each have their strengths and weaknesses. For actual text, Flux is in a class of its own for open local models.

#

SD 3.5 Large of course.

#

Of course, Flux's biggest strength is precisely its flexibility to be finetuned or have Loras

radiant quiver
#

hey is there an easy tutorial on how to train a lora

gritty turtle
#

if anyone needs help with lora training let me know

torn marsh
#

Flux-Krea

torn marsh
#

diptych of two identical images as a split screen featuring the same character: a young woman from jrpg game. On the right she looks at viewer. On the left she's wearing a straw hat

dry wave
dry wave
#

it has more issues with anatomy than Flux Dev, but on the other hand it's much more diverse in styles

#

did you try to use author prompts with Flux Krea?
Flux Dev never responded to them. Flux Krea, however, can roughly imitate styles you name (similar quality to SDXL)

#

in Flux Dev everything always looks the same style. Flux Krea allows you to use different styles in your prompt without needing a Lora

#

For painterly stuff, I prefer PixelWave more than Krea, but Krea comes close. I think finetuning Krea will give better results than finetuning Flux Dev on styles

raven elm
#

Old stone alley, mossy banyan, flower-filled balcony. Natural, vibrant, cinematic. Miyazaki style, 32K UHD.

long needle
zinc delta
#

hey is SD3 dead? no updates for almost a year?

dry wave
#

there is the Flux ecosystem by the guys who initially developed SD. There is HiDream, there is Wan (can generate videos and images) and there is now also Qwen Image

zinc delta
#

I know

#

I still have SD3 in my app

#

and today I refactored it

#

have more models etc

#

and removed it

#

it was the worst model honestly

twilit pollen
#

Is there anyone looking for dev?

craggy crest
errant dust
#

Qwen image?

#

HiDream is nice but it is really sluggish

#

I am not at all convinced it is worth the effort

#

SD3.5 is super cool

#

But even Flux can be complained about in terms of updates. Krea is really just a different post-trained Flux

#

The commercials have not been idle either: Imagen, Mid7 and Ideogram have all been pulling ahead in some aspects

errant dust
#

So miraculously two accounts spamming the exact same crap. Brilliant.

#

On a relevant note, I did generate some images with Qwen Image and it is quite good. Good text adherence too.

#

A lot slower than Flux, and too early to say whether the vanilla is an improvement or not over Flux Krea

devout schooner
# zinc delta hey is SD3 dead? no updates for almost a year?

a lot of people REALLY weren't happy about the recent "safety policy" update with regards to "core" models at least
especially in light of the fact that SD 3.5 was mostly uncensored
they didn't literally DPO-tune female nipples out of the model the way Flux did KEKW

errant dust
#

I think he meant the SD3 in general and was not suggesting SD3.0 specifically

#

Anyhow, I did some testing, very light, of the new HiDream and Qwen Image in terms of models with text

#

Qwen really is king of correct text, but it also sacrifices a lot to achieve it IMHO. The default imagery is much less inspired, and the fonts are downright boring. It never deviates or produces anything fun looking, which is likely solid if you are trying to put together some ad or banner. Hopefully it is more readily tunable and new tweaked models will emerge on Civitai

#

HiDream's text ability seems about on par with Flux and is definitely more interesting visually. Albeit... it botches long words a LOT

#

Here is an example of ultra correct Qwen:

#

I will point out that Qwen is by far the most accurate portrayer of chess pieces. It gets them right each and every time

#

Others, including Flux or SD35, can be a bit creative at times

#

FWIW, I ran the prompt multiple times with varying samplers and steps. Qwen really does not gain any improvement beyond 25 steps. You can see the occasional micro diff, but never anything warranting it to be called an improvement

#

This is hidream

#

another to illustrate. Flux is no better with this text

#

On the other hand, Qwen is incredibly strong at making logos

#

and not merely because of text accuracy

#

Which is usually not a big deal since logos don't usually have major texts

#

I threw Qwen a bit of a curveball with the request for a logo for Chess & Tech, round, with a design based on a circuit board and.... Louis XIV

#

Not bad at all

hard lion
#

A luminous ‘Digital Giant Tree’ stands at the heart of a futuristic city, its trunk entwined with flowing data chains forming a ‘2019-2025’ timeline. The canopy spreads into a massive ecological dome shaped like the number ‘6.’ AI drones perch on its branches like birds, while roots connect to an underground 5G network. The ground features transparent solar panels and dandelion-inspired smart streetlights. A river glows with quantum computing projections, and humans interact with nature in a holographic garden. Cyberpunk lighting blends with forest mist, rendered in a surrealist style

south wharf
errant dust
#

Is that supposed to be inspired from Ancient Rome or some other antique civilization?

raven elm
silent hinge
#

Hello

brave apex
silent hinge
#

Hello

craggy crest
raven elm
#

The cafe

zealous sierra
#

SD 3.5 L, Dreamy Aesthetics

frail shoal
frank haven
#

/create: big red mouse

zealous sierra
#

Neon Rev - Electric Denim Girl.

viral moon
#

Am I able to use Stable Diffusion to make image to gif (while keeping the transparent background)?

craggy crest
raven elm
raven elm
errant dust
craggy crest
errant dust
#

I had understood, but in case he felt stymied by its inability to do so, I was offering up solutions.

serene fiber
dry wave
#

stable diffusion cannot do transparency cause the vae has no alpha channel. So SubtleOne is right: you need an extra tool to remove the background
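
To give an idea of what that extra step does, here is a toy sketch in plain Python. It assumes the image was generated on a flat, known background colour and simply keys that colour out into an alpha channel. Real background removers typically use a segmentation model instead; the function name `key_out_background` and the `tolerance` value are made up for illustration.

```python
def key_out_background(pixels, bg, tolerance=30):
    """Turn RGB pixels into RGBA, making pixels close to `bg` fully transparent."""
    out = []
    for r, g, b in pixels:
        # Sum of per-channel differences from the background colour.
        dist = abs(r - bg[0]) + abs(g - bg[1]) + abs(b - bg[2])
        alpha = 0 if dist <= tolerance else 255
        out.append((r, g, b, alpha))
    return out


# One row of an image: two near-white background pixels and one red subject pixel.
row = [(255, 255, 255), (250, 252, 255), (200, 30, 30)]
rgba = key_out_background(row, bg=(255, 255, 255))
print(rgba)
```

The output frames carry the alpha channel that the VAE cannot produce, so they can then be assembled into a transparent GIF with whatever tool you prefer.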

raven elm
chilly igloo
#

#🆕|sd3 Manga style, black-and-white ink, dramatic contrast, cinematic angles. Sequential panels, consistent characters, tense horror-thriller mood. Silent library, frustrated writer, masked killers, surreal ending. Each page shows panels with continuous story flow.


Page 1 – Library
2 panels: vast empty modern library, tall shelves, rows of tables; closer view of books and dust in silence.

Page 2 – Writer
6 panels: close-up of man (30s) writing furiously; pen in hand; wide shot alone at table with books; messy scribbled handwriting; crumpled paper; shadowed angry face.

Page 3 – Intruders
4 panels: library doors open, masked men enter; close-up of cold eyes; killers moving between tables; man tapping desk, killer behind.

Page 4 – First Kill
3 panels: disruptive man tapping; killer grabs his hair; throat slashed, blood on table.

Page 5 – Girl
4 panels: young woman gasps; killer covers her mouth; “shhh” gesture at Keep Silence poster; silenced pistol shot, she collapses.

Page 6 – Writer’s Rage
3 panels: writer slams fist; killer behind with knife; suspenseful knife over him.

Page 7 – Break
4 panels: writer rips page; killers vanish, library empty; writer breathing heavy; fist smashing wall.

Page 8 – End
3 panels: glass door shatters; writer crushed under shards; close-up of shard with “Do Not Disturb, Keep Silence.”


craggy crest
#

you can't generate in this channel AND you can't give the AI a script and expect it to create a movie or something

errant marsh
#

girl

#

#🆕|sd3 A little girl holding a cup of coffee, smiling and facing me

upper rivet
raven elm
humble kelp
#

??

charred vale
#

A little girl holding a cup of coffee, smiling and facing me

upper rivet
raven elm
#

Kim jung gi style

scarlet canopy
craggy crest
scarlet canopy
#

What channel do I choose and how do I start writing the prompt because I tried # and I also tried /

scarlet canopy
craggy crest
raven elm
split bramble
rustic crown
#

The terrifying office of China's cattle and horse employees

true jay
#

Yo guys where to generate images
I’m new

runic tusk
#

You don't. Unless you want to pay. Or you do it on your own computer.

solemn lintel
raven elm
faint rock
narrow hawk
# raven elm

What type of prompt are you using? These are gorgeous

gleaming swift
#

1

deft badge
#

Where do you generate the images? Cloud or your Pcs?

livid rose
#

I generate on my PC, then post the results.

drifting hull
weary crystal
raven elm
final fjord
#

what's going on

raven elm
bitter hearth
errant dust
#

Any thoughts on the monster new release?

#

Probably impossible to run locally for now, but still the biggest OS image generator to date in terms of sheer size

#

It is MOE though, so maybe I will be wrong

#

Right now my fav local model is that new Flux out of the box. The big Qwen was ok, but didn't wow me

#

"The Largest Image Generation MoE Model: This is the largest open-source image generation Mixture of Experts (MoE) model to date. It features 64 experts and a total of 80 billion parameters, with 13 billion activated per token, significantly enhancing its capacity and performance."
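
Rough arithmetic on those quoted numbers, as a sketch of why MoE helps with speed more than with memory. The 80 billion and 13 billion figures come from the quote above; the 2 bytes per parameter (bf16) is an assumption, and only weights are counted.

```python
TOTAL_PARAMS = 80e9    # from the quote: total parameters
ACTIVE_PARAMS = 13e9   # from the quote: parameters activated per token
BYTES_PER_PARAM = 2    # assumption: weights stored in bf16

# Every expert has to be available, so memory follows the total count.
weights_gb = TOTAL_PARAMS * BYTES_PER_PARAM / 1e9

# Compute per token follows the active count instead.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"~{weights_gb:.0f} GB of weights to hold")
print(f"~{active_fraction:.0%} of the parameters used per token")
```

So under these assumptions the model would still need far more memory than a consumer GPU offers, even though each token only touches a fraction of it. Quantisation or offloading would change the memory figure, not the ratio.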

#

"Our model can effectively process very long text inputs, enabling users to precisely control the finer details of generated images. Extended prompts allow for intricate elements to be accurately captured, making it ideal for complex projects requiring precision and creativity."

cunning lintel
#

In a way, it's what i hoped to see after SDXL, what SD3 and later models were supposed to be, it follows prompts and doesn't override styles with pre-baked crap

#

But it also has more errors, i tried it out on tencent site, let it create 4 gens, a few are always plain unusable bad, 3 arms like bad, but others are nice. And there's variety in outputs

errant dust
#

Fal has it to test, but it is pay to use, which is fair, except I cannot imagine myself paying to use it when there are literally a number of free private ones such as Nano Banana, nevermind ones I can run on my own machine like Flux Krea or SD3.5 L.

cunning lintel
errant dust
#

I went there and they wanted me to sign up with WeChat

#

which i do not have

#

and am certainly not going to install for this

cunning lintel
#

there's three tabs at the top, the last is email

errant dust
#

ok, so your impressions are that the results do not match the hype

cunning lintel
cunning lintel
errant dust
#

ok, I entered, let me try something simple, but offbeat

cunning lintel
#

it's just well, it has issues (the foot became a paw, double trident, but it's also the first model where the bull and god is actually seemingly made of water)

#

Things like this i never managed with qwen and hardly with flux

#

and it understands "In a warm, sun-drenched Japanese classroom, a bright-eyed, cherry-blossom-haired schoolgirl named Sakura playfully twists a lock of her hair between her lip and nose, creating a makeshift mustache that makes her giggle uncontrollably, as her friends look on in amusement, by renowned anime artist, Hirohiko Araki."

errant dust
#

well, for a comic rendition of a Gnoll with a sword, it actually did a decent job.

#

A powerfully built gnoll, resembling an upright hyena, covered in short brown fur spotted in darker brown. It wears a short kilt and a hardened leather apron adorned with metal links and spikes. The gnoll holds a short sword, ready for a fight, with a 3/4 body view, showing its full body. Rendered in a classic 80s comic book style with strong, defined linework, and detailed rendering of textures and shadows.

#

It is not really 80s comic book style, but nevertheless solid details

#

it also nailed the 3/4 body view

#

for the graphics assets of my game, I tend to use Nano for its repeatability in style as well as unlimited free use

#

(a significant deal)

#

though for the starting asset Flux Krea has been great too

#

This however was better to be honest

#

How many generations can you get? I assume there is a daily cap

#

or weekly...

cunning lintel
#

I haven't really used imagegen a lot recently, I just never liked the look of new models. Flux with lora and sd3.5 were the last I actually enjoyed using. Hunyuan 3 is exciting to use like those, it feels like a throwback to styles/textures in a good way

errant dust
#

Nano is not Imagen

cunning lintel
#

Flux krea was a disappointment to me, the real krea had nicer outputs

#

I actually never used nano, only imagen.

errant dust
#

Try Nano. Aside from being the top rated text to image generator on LMArena, it has unique editing abilities no one else has

#

Editing with it is done via prompt, but let me show you what I mean by unique

#

Here is a plain jane image, not made by Nano

#

simple enough, sunrise, pirate ship, etc.

#

Now I tell Nano: change the image to a sunset

#

Just that, no masking or anything else

#

It's insane

cunning lintel
#

Yup that's good, i kinda stopped being interested in this editing after flux-edit

#

the pro version was nice, the dev version abysmal

errant dust
#

You can ask it to take a person, or even those cartoons, and tell it to raise the arm, have him turn his head to the right, and it will all be perfect

#

fur, ears, everything

#

as if telling the model to move around for the next photo

#

Anyhow, that is why Nano is overall king for now

#

overall, not necessarily in each individual thing

#

to be clear

cunning lintel
#

afraid i tried it on wrong thing, i tried to transfer style, that didn't go well :p

#

but yeah, haven't looked back since... i understand it actually can generate images too, which is nice

errant dust
#

So suppose I like Hunyuan's core image. I could use it as the starting point and then have Nano make the modifications

#

Nano is top rated on LMArena as I mentioned

#

I assume you know what LMArena is

cunning lintel
#

yup, aware of it 🙂

errant dust
#

so what are the limits in Hunyuan public use? Do you know?

cunning lintel
cunning lintel
#

just a random prompt i remembered models struggled with to make look natural, hunyuan does a good job

Atmospheric wide shot in a dense, ancient forest under dappled sunlight. Large, incredibly adhesive spiderwebs stretch high between gnarled trees, their thick, glistening strands shimmering as they catch the light. A wild deer (doe) is visibly ensnared, its body tangled in the sticky webbing. Nearby, a young woman struggles against the webs, her clothing and hair tightly bound, her face showing distress and the sticky strands clinging to her. Eerie shadows. Highly detailed dark fantasy illustration.

#

nano banana was kinda iffy: at first not much web, and when i asked it to make it more entangled i got a zombie 🤡

#

Where is SD 4 (i guess never, new sai doesn't seem big on open models or even new model dev for individuals (as opposed to enterprises)) 😢 Maybe it's because what i've seen/used first, but the SAI models have that something special (style/textures i just call it) newer models just haven't captured. hunyuan kinda seems to have as well, but it's early days.

muted cargo
#

Sincere question. What would you expect SD4 to be ? What do you expect from it ?

errant dust
#

Expect? Or want?

#

For me, 4 things:

  1. Easily trained for LoRAs. Flux has an iron grip on this right now, and it is a big deal IMHO.
  2. Stronger text handling. It can handle 2-3 words ok most of the time, but it is now lagging quite a bit behind its peers.
  3. A larger more powerful model.
  4. And please tone down the nanny police. Efforts to control such things are not only wasted, since it is literally the first thing targeted by others for removal, but it invariably has detrimental effects on general image production. It need not overtly allow sexual content, but nor should it feel like a 1950s movie censorship board.
#

Just my 2 cents

#

I really like SD3.5 L FWIW, but I tend to use Flux Krea for more consistent results and style. I can ask SD for an 80s comic books style, and it will deliver, but even with plenty of details, it is all over the place in the results. It is why I mentioned LoRAs. Someone is bound to want something that it doesn't handle well, anyhow.

muted cargo
#

I don't see 4) happening anytime soon for any model released by any company. This kind of usage is bad press for the large audience. Moreover the easier it is to do this kind of content, the easier it is to abuse it. Add to that all the legal issues and stuff going on such as ID restriction getting introduced in some countries for that kind of content... And yeah... They pretty much have to do that kind of policing.

#

Otherwise yeah you pretty much expect it to catch up with others.

dry wave
#

I mean, Flux is more or less SD4. I wouldn't expect a new successor of Stable Diffusion as all people who developed SD are now developing Flux

#

I think the reason why basically all new models, including Flux, have big issues with styles is because they are using T5 or other text-based models instead of the CLIP as in SD, and because they are trained on synthetic captions

#

SD 1.5 and SDXL were trained on ALT tags, so the image captions often contained hints regarding the style

#

newer models use VLLMs to caption the image, but VLLMs usually don't capture stylistic nuances. They know the difference between a "cartoon" and a "photography", but they barely understand differences between certain art styles. When they generate captions, the captions focus on the content and not on the style. Models are trained on these captions and never learn how to describe these styles via prompt

#

thats probably the reason why models like Flux can easily learn (via lora) a lot of different styles, but its hard or even impossible to reach these styles via prompt engineering

#

at least thats my theory 🤷‍♂️

#

unfortunately, styles are also a thing all the big companies are not interested in. Styles are often associated with specific authors, and everyone fears copyright issues. Furthermore, if you want to make money with image generation, you want to target the advertisement industry. For this, you don't need art styles

cunning lintel
# muted cargo Sincere question. What would you expect SD4 to be ? What do you expect from it ?

I'm afraid the answer is like "a better horse", i know what we have now, what i like and don't like but no idea what's possible.

But the reason I mention SAI models is that compared to others their outputs always felt less artificial, with more fine details and textures, instead of the overly smooth AI look. (after SDXL, i feel 3.0 (the API version) still did this well, but in 3.5 it suffered a bit, some styles just became much worse or flat out impossible, it felt more like exploiting clip's knowledge as opposed to having the model actually trained on them). Maybe SAI has a really really good data set, better than what other models have been trained on (maybe cause it's older it just has less synthetic data).

Anyway, what i would hope for is much, much better prompt following (also when things are off the beaten path), but not at the cost of style or variety, like many recent models. So good prompt following, wide range of style and fine details/textures. "Promptable" just by using references, both images (like ideogram) and codes/hashes (artists seem a no-go anyway), my dream would be to throw some images at it, extract a style hash that's a merge of styles in those images, kinda like a lora but instant. And detailed as in make the creature in ref style a, the other creature in a blend of a and b, the background in style c. I suppose that's already beyond simple image-gen and close to current instruction models, just also for style not just subject please.

On top of that consistency, which again would probably mean an instruction like model, that allows consistent subjects and environments in various styles / perspectives / angles.

cunning lintel
# dry wave at least thats my theory 🤷‍♂️

That's been my thought as well, and after that it's often "why not use that info from captions to make the model learn styles in a better way, including a way to get it out of the model". Obviously it's not that simple 😉

errant dust
#

It is a lot simpler than made out, or all the other image generators would struggle just as badly. As to the devs who made Flux, well, there are more than those devs in the world. Whether or not Stability will actually develop a new model is entirely up to them, but producing a subpar model, relative to the existing ecosphere, with a list of reasons why it is subpar isn't going to cut it. There is no shortage of competition nowadays.

dry wave
#

I don't say SAI cannot make another model. My point is that such a model would have the name "stable diffusion" but it would be made by an entirely different development team. From my perspective, a true successor of stable diffusion xl/3 would be another flux model

#

cause for me it's the people who count, not the brand name. That said, it doesn't matter much anyways. I'm also happy to work with models like Qwen. The days where all good open models for image generation came from SAI are long over anyways

#

(although, sadly, most of the newer models like Qwen were mostly just copy&pasted SD3 or Flux models )

cunning lintel
#

I feel data matters too, and wouldn't be surprised if SAI liberally scraped the internet and/or used screencaps for their models where newer models simply scraped ideogram/dalle/flux/MJ, ie already not the most finely detailed ai-slop.

#

I think we'll never know what a new SAI model would be like, in the end it doesn't matter a whole lot where new models come from, though i have to admit it feels a bit iffy so much is from china, i'd like some western biased models too :p

dry wave
#

of course they scraped the internet. But I would be surprised if flux is not just using the same data

cunning lintel
#

You sure would think so, same team and all 😉 And yet, flux seems to have less knowledge, but i've also read it's the result of preference optimization, or the distillation. SAI outputs just usually appear less AI to me. Then again, i've convinced myself flux is good with hands cause they trained tons of hands to the point where children got adult hands ;p who knows what other optimizations were done.

errant dust
#

I think the massive inFLUX (pun intended) of Chinese open source models is centered around two things:

  1. It is the best way to get non-Chinese to use them. After all, if these were some models locked behind some Chinese ChatGPT equivalent, the use would be a fraction of what it might be.
  2. There is a massive US vs China war on the AI front, and their efforts are very much in the good graces of the CCP. If you look at just the number of papers published on AI, China is actually ahead of the US in the last 12 months.
#

I mean frankly, it is much the same with local LLMs

#

The best right now is hands down the Qwen models by Alibaba

#

it isn't even close anymore

#

In fact, Qwen3 80b MOE was just released and it is an absolute beast

#

The commercial models by the US are still king of the hill by a good margin though. ChatGPT5 is no.1, and then it is between Claude and Gemini. So the Open Source front is where they have the most chances to shine

fathom prawn
#

I have a question: what model or lora would be closest to this art style?

gaunt scarab
#

A photorealistic masterpiece, shot on Arri Alexa, cinematic it, shaking his head furiously. The camera is handheld, with a slight, almost imperceptible shake, enhancing the raw emotion color grading. A bald dwarf who is an exact likeness of Cristiano Ronaldo is sitting on the floor of a bedroom of the scene. The visual style must match the provided reference images, focusing on gritty realism, deep shadows, and des. His face is a mask of visceral anguish and pure rage, with hyper-detailed, wet skin and realistic tears streaming down hisaturated colors.

#

a bald dwarf who is an exact likeness of Cristiano Ronaldo/imagine prompt: a cute pink chick with big muscles, wearing green kung fu clothes, 3d anime style, ultra realistic, cinematic lighting, standing in a dojo

vapid radish
#

I have been experimenting with upscaling Hunyuan 3 images with Qwen as I think 1024x1024 is way too low res.

full minnow
#

I'd like to ask whether anyone here uses the Chinese models, and whether they turn out to be censored or crippled in some way.

stoic salmon
#

Create a modern scientific laboratory scene with clean white counters, chemical storage cabinets, safety signage (like PPE reminders and hazard symbols), and realistic lab equipment such as microscopes, beakers, and fume hoods. Include subtle lighting and a slightly dramatic tone to suggest a challenge or escape room atmosphere. The layout should be modular and clear, suitable for overlaying interactive hotspots or puzzle elements.

faint vault
#

create a future robot on a new earth

tranquil vector
#

hi can i use SD3 to edit the color grading in a photo i upload?

devout schooner
#

I really hope someone deep dives into what the heck happened with the SD 3 / 3.5 arch one day
this is SD 3.5 Large on the left and SD 3.5 Large Turbo on the right, same seed, same prompt
I never believed any of the issues even with the original SD 3.0 had anything at all to do with "censorship" but like rather
there's definitely some really weird deeper technical issue that caused distilling 3.5 Large into 3.5 Large Turbo to actually significantly IMPROVE the coherency and pixel resolve (and almost completely eliminate the strange edge artifacting problem) as opposed to the opposite (and no one will deny this is the case if they actually do enough seed-to-seed comparisons between the two, I promise)
there's numerous questions to be asked here no one has ever answered to date joeshrug

dry wave
#

turbo variants are often "better". Same happens for SDXL. I think they do some dpo with the distilling

unique sigil
#

How to generate image?

raven elm
potent inlet
#

I installed SD 3.5 Large but ran into an error. I think my method is wrong, could someone point me to the correct way to download and install it?

potent inlet
languid blaze
#

Photorealistic rendering of the letters why4e, make the letters readable but broken, like the wreckage of a spaceship, with a dark, gloomy space background, traces of a dying explosion

calm thistle
#

/ generate promt: dark contrast noir photo realism with detective and ufo

#

photograph of [object], [details], [environment], professional photography, 50mm lens, f/1.8, natural lighting, high resolution, sharp focus, detailed texture

jagged gate
jade lion
severe prism
#

𝙃𝙚𝙡𝙡𝙤

astral lantern
#

animation

torpid marlinBOT
#

how to generate images with prompts?
No data source is currently selected. Please choose a data source from the dashboard and try again.

willow oar
#

paint a rabbit

summer ginkgo
#

🐇

thorny stream
#

Paint a crab

summer ginkgo
#

🦀

warm hollow
#

paint a rabbit

south creek
#

Draw an orc fighter

bitter hearth
#

draw me

chilly storm
#

/genrate test

summer ginkgo
#

[F]

noble sparrow
#

how to create image with an existing image?

errant dust
#

Probably use something like Nano Banana and tell it the modifications you would like

#

All is quiet on the image front I guess. for my own ends, for pure creation I think Hunyuan has the edge (though it is impossible to run locally, being much too big).

#

For editing Nano is best

#

at least for large transformations

livid rose
#

@gilded stone Instead of supplying empty latents to the KSampler, use a vae encode node to convert your reference image to latents. Connect those latents to the Ksampler. Then lower the denoise to 0.5-0.8. Your original image will shine through, modified by your prompt.
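A toy sketch of why a lower denoise lets the original image shine through. It assumes the linear (rectified-flow style) interpolation between a clean latent and Gaussian noise, and it assumes the starting noise level simply equals the denoise value. Real samplers map denoise onto a noise schedule, so treat this as an illustration of the idea rather than what the KSampler literally computes. `partially_noise` is a hypothetical helper, not a ComfyUI function.

```python
import random

def partially_noise(latent, denoise, seed=0):
    """Blend a clean latent with Gaussian noise.

    denoise = 0.0 keeps the original latent untouched,
    denoise = 1.0 replaces it with pure noise (same as starting from an empty latent).
    """
    if not 0.0 <= denoise <= 1.0:
        raise ValueError("denoise must be in [0, 1]")
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in latent]
    return [(1.0 - denoise) * x + denoise * n for x, n in zip(latent, noise)]

reference = [0.8, -0.3, 1.2, 0.1]  # stand-in for VAE-encoded image values
for d in (0.5, 0.8, 1.0):
    start = partially_noise(reference, d)
    print(f"denoise={d}: original weight {1.0 - d:.1f}, start latent {start}")
```

At 0.5 half of the reference survives into the starting point, at 0.8 only a fifth, which matches the suggested range: low enough to keep the composition, high enough for the prompt to change things.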

quick pelican
#

i've been doing some analysis/testing of sd3.5 large over the past months, it seems something really nasty happens at MMDiT block 35. best i can tell, block 35 has the strongest influence on making the speckled greebled texture that's common with outputs. maybe the growing values also have something to do with the poor quality? (i'm not studied enough in the math at play to make a full evaluation of it)

(attached img: this graph is a single step (0 of 24) of sd3.5 large under bf16 unet)
under fp16 unet, which is what comfyui runs as default with the sd3.5_large.safetensors file, l2 hits inf
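For context on why the L2 norm can hit inf under fp16 but not bf16, here is a small sketch of the two formats' ranges. The format widths (5 exponent / 10 mantissa bits for fp16, 8 / 7 for bf16) are standard. The activation values are made up for illustration and are not taken from the graph.

```python
def max_finite(exponent_bits, mantissa_bits):
    """Largest finite value of an IEEE-style binary floating point format."""
    max_exponent = 2 ** (exponent_bits - 1) - 1
    return (2 - 2 ** -mantissa_bits) * 2.0 ** max_exponent

FP16_MAX = max_finite(5, 10)  # 65504.0
BF16_MAX = max_finite(8, 7)   # roughly 3.39e38

def sum_of_squares_overflows(values, limit):
    """True if the sum of squares (the quantity under the L2 square root) exceeds the format's range."""
    return sum(v * v for v in values) > limit

# Hypothetical activations: a handful of values around 300 is already enough.
activations = [300.0] * 4
print("fp16 max:", FP16_MAX, "overflows:", sum_of_squares_overflows(activations, FP16_MAX))
print("bf16 max:", BF16_MAX, "overflows:", sum_of_squares_overflows(activations, BF16_MAX))
```

bf16 trades mantissa precision for the same exponent range as fp32, so growing values stay finite there while the same values saturate to inf in fp16.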

quick pelican
#

(@devout schooner, you mentioned you were interested in some analysis on sd3)

sacred shard
# quick pelican i've been doing some analysis/testing of sd3.5 large over the past months, it se...

For some reason, I'm not convinced yet that that is the primary or direct cause of the speckled visual effect on outputs.

If that is the case, and if it was easy to isolate this issue, we would likely have had a new custom model available by now that could counteract this issue. I agree, I've noticed a strange effect with that specific latent. However, it's difficult to say with confidence what the exact cause is, given the complexity of these algorithms.

sacred shard
#

@quick pelican

quick pelican
#

I partially drew that conclusion from using the skip layer guidance sd3 node, where 35 had the biggest reduction in that pattern, I'd have to post some examples of it.

I have other test situations where I've seen higher quality results, like using 1344sq resolution, or disabling bias on various modules. I can post some of those if you're interested

sacred shard
# quick pelican I partially drew that conclusion from using the skip layer guidance sd3 node, wh...

It's plausible that the cause is latent 35, all I'm saying is that I'm not yet convinced it's the most probable cause. Also, it makes sense to me that skipping the first and last stages of a diffusion process would cause quality improvements, although it should be worse on average given the greater noise present.

I'd be interested in learning more about this as I'm actively doing research in this area still, but we should really have a real conversation, at least for a moment if that would be fine with you.

#

I don't doubt it's involved, but I'd be curious to know what other tests would determine its primary cause, I suspect the cause could be in the algorithm itself rather than in a latent and the issue is only exacerbated by block 35.

sacred shard
#

@quick pelican

quick pelican
#

yeah, you can dm me or continue here. I've been busy/symptomatic lately, apologies for the delay in posting more info

hallow lion
#

Imagine out of the blue stable diffusion drops SD4. XD

quick pelican
#

there's certainly a new framework they could use
https://arxiv.org/abs/2510.02300
i wonder if it'd prove to be better than flow matching in large datasets/models

rustic bramble
#

anyone know how to get u_t ^{theta} from flux api ?

#

or flux mini

#

im trying to setup post training experiments

upbeat girder
bitter hearth
#

hi

sacred shard
#

@rustic bramble In the past, in a conversation about machine learning, I found a lot of the people I was talking with couldn't give me a good explanation of how back-propagation works. What is your level of technical knowledge around neural nets?

#

It depends on what you're using. What library are you using? Are you using TensorFlow? Pytorch? Are you a PhD student or is this for your day job?

#

Are you using PyTorch? Tensorflow? Keras? Do you want U theta to be a weight in the net? A gradient?

rustic bramble
#

and not score parametrization

sacred shard
#

It's also not that surprising that most people you talk to aren't able to give you technical details on how neural net training works. Neural net training is highly abstract and complicated, I've seen a lot of PhD students and people of that level struggle with it.

#

If you're talking about a system of autonomous ODEs, I have also studied that, but if you're talking about the use of ODEs in a neural network I am not familiar with that idea.

sacred shard
#

@rustic bramble

dry wave
#

lol, is he a bot?

dry wave
#

Alexander. Sometimes it's hard to say if someone is a bot or just not a native speaker 😅

sacred shard
sacred shard
#

@dry wave

#

@dry wave Yeah, you obviously pretend you didn't see this, dude.

muted cargo
#

let's not go that way.

sacred shard
#

@muted cargo Bruh

urban arch
#

@Mods?

pure yacht
hallow lion
#

oh noes... i remember when this place was so active lol

#

Emad do something! yeah, yeah i know he doesnt work there anymore

ancient folio
snow echo
calm parcel
#

I'm trying sd3-Turbo on my AI Platform for simple picture generation. I want to keep it cost effective for after the rollout in April. I am getting wolves with three ears, two-headed bunnies, dragons that are breathing fire not from the correct end. Is there a more ... responsive version?

calm parcel
#

Okay ... all 20 pictures were complete failures. I'll switch to a different model, perhaps their flagship version, been around and tested longer. But this needs to not be a thing.

fierce heath
serene lantern
#

A woman reclines across a slab of warm stone near the edge of an abandoned quarry at night. Her upper body rests on one elbow, legs bent with grace. Her bare skin catches scattered beams of light that fall from a distant industrial lamp. A long piece of translucent fabric runs beneath her, catching subtle folds of shadow.

Technical Notes:

Lens: 50mm, aperture f/2.0

Lighting: Hard backlight + diffused fill from below

Camera Angle: Side-profile at ground level

Color Tone: Cool with amber accents

Atmosphere: Quietly mysterious, cinematic

barren rock
#

Hi someone help me pls?

#

I downloaded forge webui but when i want to generate an image i get an error like this

barren rock
#

Yeah i know that. But i don’t know how to solve this problem

runic tusk
#

Hence the use of Google to read and apply potential troubleshooting steps and solutions.

barren rock
runic tusk
#

You don't need to be smart, you need to read the links and do what they do to see if it works.

runic tusk
#

Start with the first one, then go down from there if it doesn't work. This is how basic troubleshooting works. You try something, learn something, try a different thing if it doesn't work.

#

Literally everyone can do it.

#

Nobody is born with the solutions in their brain already.

#

I believe in you.

barren rock
#

My English is bad, it would take tons of time to read the links, my brain is already burning😩

calm lava
#

I had this problem and decided to hire help from fiverr. Otherwise I would have been at it for months trying to troubleshoot. I watched the guy dealing with dozens and dozens of errors of various kinds. Had I not hired, I could have probably eventually gotten it working, but it was about getting it working in 6 hours vs 6 weeks or 6 months

#

6 hours because i had him install a whole bunch of things very specific. the basics would probably be a lot less

#

The downside is I didn't learn how myself, but I also work full time and have a bunch of other things going on so for me it's about time

summer ginkgo
hallow lion
# barren rock

Use Forge Neo... Old forge man is no longer updating the stuff.

#

Activate your Windows, dammit! XD

cinder socket
#

Hi! I train FLUX/SDXL face LoRAs for your realistic AI influencers. Happy to share workflows, tips, and examples 😊
https://www.behance.net/gallery/243708697/Stable-AI-Influencer-Private-Flux-Face-LoRA

I create realistic AI influencers and stable AI identities. What I create: AI influencers for Instagram, TikTok, and X (Twitter). Digital personas for Patreon and OnlyFans. Long-term AI characters for branding and monetization. Realistic AI faces for lifes...

shut talon
#

a tree

ebon roost
#

A dog

hallow lion
#

Anyone else has the feeling that if sd3 was a good model it would have been very good?

lament rampart
#

Hello guys, please help me out with these images. I have a very low-end PC, so I can't run Stable Diffusion locally, and I don't even know how to write a high-quality prompt for this kind of image. If anybody knows, please help me out, I've been stuck on this for the last two months.

long jasper
tough viper
marsh lintel
livid rose
north tide
#

oh doesn't the cyberdoll tongue have like two halves?

#

or the lack of that line?

#

hmm.. cannot recall

marsh lintel
livid rose
marsh lintel
eternal jasper
#

blue moon

rugged bear
#

Can I create something here now?

muted cargo
hallow lion
#

moderate pls

cunning nebula
#

#🏞|general-with-images Cinematic, surrealist medium shot of a vintage 1970s cream-white sedan partially submerged in deep, dark teal water. Thousands of fresh, vibrant flowers—primarily ranunculus, dahlias, and baby's breath in shades of peach, soft pink, cream, and pops of orange—are overflowing from the car's windows and hood, floating out onto the rippling water surface. The lighting is moody and ethereal, featuring a soft misty glow from the background and shimmering reflections on the water. High-grain film photography style, 35mm lens, shot on Kodak Portra 400. Deep shadows, realistic water textures, melancholic yet beautiful atmosphere, hyper-detailed floral petals and rusted chrome accents

marsh lintel
near sierra
#

#🆕|sd3 generate a classic hyper realistic image of a burger eating a man

#

hi

marsh lintel
marsh lintel
urban arch
#

I don't know they're right.
I can't say they're wrong.

faint sand
faint sand
#

Yo my bro how do you use sd?

muted cargo
# summer ginkgo

not gonna lie I thought you were one of those bots at first :p

muted cargo
# faint sand Yo my bro how do you use sd?

I mean ... what are you looking for exactly? Can help on the details but it's hard to tell with such a vague question.
To use sd either you install it locally, you use this server #artisan-1 /2/3 channels or use one of the trusted online services to either generate stuff or rent a gpu to run your own client on it.

summer ginkgo
urban arch
#

It's what's for dinner.

faint sand
#

Okay man thanks for the help 🙂

grand wyvern
snow prairie
livid rose
#

Of course it is, rule #7.

silent cedar
#

Where bot?

tidal garnet
#

Generate an anime picture, just a test.

#

:(

fair kettle
#

/generate an human

summer ginkgo
#

^ hooman

north rain
#

/generate a bright brown alien face

dreamy palm
#

⁠generate an human

warped lodge
#

generate a bright brown alien face

#

/generate a bright brown alien face

north rain
#

/generate blue alien head on white background --square

#

/generate alien landscape scribble, 8k, highly detailed, best quality -malformed hands

raven elm
#

Found a solid free alternative: draw.freeforai.com. It's a web-based SaaS, so no login needed. Best part? Completely unlimited and free with no watermarks. Perfect for when you just want to whip up an image fast

potent prawn
#

Dawn of the Divine Archer

south pendant
upper delta
#

/generate alien landscape scribble, 8k, highly detailed, best quality -malformed hands

south pendant
#

🙊

sturdy python
#

can I take it for free jk 😝

urban arch
#

That would certainly prove it's not spam. 😁

covert crown
#

Generate a cartoon picture

summer ginkgo
waxen sonnet
#

:/imagine prompt: a cyberpunk cat with neon lights, wearing sunglasses, 8k, hyper-detailed --ar 16:9,

hallow zenith
#

/generate imaginative photo of cryptoman dark blue on white background NeoBlessVerseGenerate

hallow zenith
#

/generate imaginative photo of man dark blue on white background NeoBlessVerseGenerate

jolly drum
#

Most people generate locally here

#

I recommend using cloud services if you don't have a GPU, such as CivitAI or any of the ComfyUI integrated services thru API

#

Though those also cost money unfortunately

soft lantern
#

/generate image of donald trump dancing with benjamin netanyahu

jolly drum
soft lantern
#

i was joking lol

warm zealot
#

Positive Prompt:
A magnificent ancient oak tree, intricate bark textures, glowing moss and tiny bioluminescent mushrooms growing on the trunk, soft volumetric sunlight piercing through dense green leaves, cinematic sunbeams, floating dust particles and magical spores in the air, macro photography perspective, hyper-detailed, 8k resolution, photorealistic, masterpiece, depth of field, sharp focus on tree bark, rich textures, award-winning nature photography.

Negative Prompt:
(worst quality, low quality:1.4), blurry, smooth texture, plastic look, deformed, out of focus, artificial, cartoon, drawing, illustration.