#🆕|sd3
1 messages · Page 131 of 1
not sure which is which, but the first image is clearly the better one
The second one looks like a cartoon
I really think you're being intentionally disingenous at this point 
you keeps skirting around every single one of the actual points I'm making
unless you really believe that everything should randomly be super grey and dull and very clearly have elements of literal paintings mixed in even when that wasn't asked for or desired in any way
neither base is "perfect" by any means but the left side one (3.5 Medium) looks WAY worse than the right side one (3.0 Medium)
the 3.5 one is a poorly resolved mess that looks more like the inside of a piece of particle board wood than any kind of metal
its hard to describe but there is this thing like "scratchy details" that can come in a variety of models and SD 3.5 can get it quite easily
if you used SD 1.5 and SDXL with PAG then its basically the thing that goes away when you increase PAG
I kinda understand what you mean by cartoon because the colours went bolder on the one on the right and some people prefer muted colours
I used to have the same tastes I recently started using CIELCh colors and I appreciate bolder colours more now
No, it doesn't. And it is you who fails to understand over and over my point about light. I'm a photographer. Light is everything. Candle light, overcast, or sunlight will completely change the colors, the contrast and so on.
when they test people's personal taste about images what they find is that people's personal taste varies so so much
so IDK how much its worth debating matters of taste
its partly why I thought the big "battle" between SD 3.5 and Flux was a bit silly
if you preferred robust+clear object outlines and a stylized aesthetic then you would prefer Flux, if you were not bothered by some degenerated object outlines and you preferred a high detail natural aesthetic then you would prefer SD 3.5
there were like 100+ arguments about SD 3.5 vs Flux in the community but it just comes down to that taste difference really
I mean I strongly disagree but ok
if I scrounge up more examples I'll post them
this is IMO a very real problem SD 3.5 Medium (and to some extent Large) has even when the prompt mentions absolutely nothing about light
it aggressively wants to make everything dull and gray (and painterly), all the time
and make the entire image appear as though seen through inexplicable, excessive amounts of fog or smoke
in a way that is not at all desirable
putting everything we're talking about here aside I think it can't be stated enough that in fact Flux basically single-handedly invented and originated the "plastic skin CGI look"
for all realistic outputs
that all models since have done
stock Flux will never ever produce an image that looks like this 3.5 Medium one (meaning, just like, normal) under any circumstances
for any prompt like
a close-up portrait photograph of a young woman's face, focusing on her facial area from the eyebrows down to the upper lip. She is 18yo and has freckles. Her skin has a smooth, glossy texture from her makeup. She has dark brown eyes, framed by thick, dark, well-groomed eyebrows. Bold red eyeshadow extends from the inner corner of her eyes to the outer corners.
but revisionists want to insist nowadays that the way Flux looks has been how all models look
and that it's some kind of unavoidable problem
which obviously it is not
this is a big part of why I'm not particulaly amped about HiDream
needing to use the full 17 billion parameter version to even just get normal-looking realism is a bit silly
and API-only models have definitely taken into account what I'm saying
like Reve does not produce CGI-esque Fluxy realism
nor does Ideogram
and so on
it's just this recent stream of extremely similar-to-Flux open source ones that seem to be trained by people who think being utterly incapable of authentic realism is desirable
surface hasnt been scratched with HiDream yet, but ya its a chonker, still might have some pleasant surprises
```early JuggernautXL though lol
did that look super early on
I never rly found stock Flux useable, as I was saying the other day I never jumped from RealvisXL to flux until RealvisSchnell came out
and a bunch of photography+lighting loras
I also refined each flux image with RealvisXL or Realvis for SD 1.5
I don't know if I've seen RealvisSchnell.
best things always have zero hype
its like universal rule
there are better checkpoints these days though, at the time it was unrivaled IMO
it loses to properly de-distilled checkpoints as an example
Turbo or Lightning XL models always had that issue (as well as any merge that was ever trained on AI generated images erroneously captioned as "photos")
but not to the same extent as Flux
I really haven't formed an opinion on it
all I have discerned is that the Dev version often looks laughably bad even in comparison to stock Flux Dev on the same seed and prompt
that's the verbatim "Juggernaut Lady" prompt lol, HiDream on the left, Flux Dev on the right
HiDream looks more like Flux Schnell (at best) here
literally nobody wants shit that looks like this though
this is the number one complaint everyone has always had about Flux
and numerous loras were trained specifically to get rid of it (including by me lol)
like nobody is looking at this stuff and being like "yep, perfect"
I don't know who is convincing all these recent model trainers that this ridiculous not-actually-realistic-at-all bokehmaxxed style is desirable to anyone
like just make an open source model that looks normal but also doesn't have any amount of weird ass coherency and noise resolve issues
why is this so hard 
data issue
it's probably partially that but it's definitely also caused by people deciding that their model NEEDS to be 12B to 17B parameters for some unexplained, unclear reason
and then aggressively distilling and DPOing them down as a result of that to make them runnable for average users
there's no way any of these models NEED that many parameters
like the practical difference between SD 3.5 Medium and Full HiDream DEFINITELY doesn't amount to a "14.5 billion parameter" difference
not even remotely close
got alot of bang for the buck with quantity for a while now, and ya ide hope it shifts
BTW using ClownsharkBatwing's exponential samplers doesn't necessarily help with the greyness problem in 3.5 Medium vs 3.0 Medium
but it does DRASTICALLY improve the resolve of lines, in general
much cleaner
eh
I still just don't really get why people are so aggressively pushing HiDream
I even got into a truly bizarre argument with the simpletuner guy yesterday where he posited that LLAMA not being a "Chinese" text encoder was somehow a significant contributor to the reason
I'm still not even sure what exactly he really meant by that
or how that would plausibly be relevant to any average user lol
ah yeah turbo and lightning always seemed to lose some softness or blur ability
going from flux to flux turbo to schnell does the same
something about distillation makes you lose soft lighting and blur
HiDream could have been the model that replaced Flux (which is only available distilled).
But for me the model is way too parameter inefficient
I disagree with the claim you would not need many parameters, though.
I was sceptical myself first, but Flux is just so much smarter than smaller models
snapchat made a 0.3B image model that looks about as nice as any to me
oh it won't be as smart yeah
looking nice does not mean being smart
look at the ChatGPT model what you can reach with enough parameters
I know what you mean by Flux being smarter yeah
it fixes stuff during inpaints and upcales etc
that does seem to require more paramaters
I would like to have a new flux with stronger text encoder and more efficient parameter use (e.g. adaln parameter sharing)
I've been quite happy with tiny models for tiled upscale, like this one https://huggingface.co/cqyan/hybrid-sd-224m its SD 1.5 but squished to 224m so like a two-thirds size reduction
IDK if I am that bothered about parameter efficiency like
SVDquant Flux FP4 with 8-step turbo lora is rly fast on 5090 servers
so like without even mentioning B200s it can be fast on domestic 5090s even
it's not so much about speed but about using parameters where they help most
like one big jump from sd1.5 to sdxl was that they removed the transformers in the first block because - surprise - they haven't done anything, and put them into the middle block
yeah that first block in SD 1.5 is a big pain cos it makes SD 1.5 slower than SDXL at high res
which is crazy
when I use SD 1.5 I always use Modified Shifted Window Multi-head Self-Attention cos it fixes that issue
but then that makes it no longer work easily with torch.compile
its a mess
there is a "fix" which is to make custom CUDA kernel for tiled SD 1.5, which is kinda one of my current projects lol
hi, does it require external script to run? no comfy compatibility?
I asked comfy about supporting it but no dice
understood!
Flux Pro Ultra shows what it can really actually do well IMO
but the distilled versions (Dev / Schnell) are so aesthetically DPOed to hell that it kinda noticeably harms prompt adherence a lot of times I've found
even in comparison to like SD 3.5 Medium sometimes
which of course uses the same text encoder ultimately
it does have very clean "resolve" yeah
I'm not sure that's related to parameter count entirely though
Close-up view of a corner connection for stacked shelves (thin galvanized steel rectangular tube). The top surface of the lower shelf corner has small metal blocks welded to form a square locating pocket or fence. The upper shelf has a 10cm tall square spacer foot welded underneath its corner. This foot fits neatly inside the locating pocket/fence on the lower shelf. Show the 10cm separation created by the spacer foot. Detailed, metallic, industrial design, 3D render.
parameter count and "abilities" are a super loose link yeah
but when I go from 0.2B pruned SD to 30B stepfun I do see an increase in "smartness"
even though they are fundamentally the same type of network
the 15,000% increase in parameter count gives some benefits
having said that, its amazing how well 0.2B keeps up with 30B
0.2B pruned SD is perfectly fine for tiled upscale and other tasks like that
Is the training set for SD3 or SDXL disclosed?
sadly no
@dusky thistle hahaha it's been a long while since i was kinda active on this server, im happy to see you are still doing some clownshark pics 🙂
did you try some with HiDream?
yeah, just tested and confirmed res4lyf is working with it
gotta get all the attention masking working so i'll have to modify the model code
i actually never tried HiDream yet, will try it tonight, it seems most people use the Dev version?
should I go with that one?
yea i will hopefully upgrade my PC this summer so I can enjoy all the good stuff instead of relying on quants or
compromises
I recommend full version. It's the only one with negative prompt, same size and works fine at 30 steps just like dev. Download the fp8 model from Comfy's HF then run in fp8_e4m3fn_fast if supported.
YES. Absolutely nailed it. Paws are so perfect. (This is the best of maybe 20; kept cancelling run halfway through based on preview.) Just realized the basketball is wrong though...
Huh. Looks like it almost never gets a basketball right.
Great at soccer balls. A little iffy overall.
It usually seems like just a slightly better version of Flux.
But then you give it a 12-constraint prompt like this and it's just like, "Yeah, I got all that." And it nails all 12 constraints every single time (with accurate hands). It really is way more powerful / intelligent than Flux, just not more knowledgeable.
A male elf with braided silver hair and green skin is wearing a purple toga, standing on the shell of a giant turtle. In his left hand, the elf is holding an intricately etched golden staff. In his right hand, the elf is holding a slice of meat lover's pizza. The scene is cinematic moodily lit under an overcast sky.
By default, it's super consistent seed-to-seed; always the same pose. But you can use the normal partial-denoise two-pass to get as much creative variation as you want.
I'm trying hires 2x now, at 2496 height which is > 2048 the original repo maxed out at.
Aww. Errored out. No >2048 resolution maybe?
LOL rifle spear cartoon? Hires 1.5x no obvious artifacting, but hard to tell because of the print texture. Gotta specify the style.
I'm amazed at the fingers and toes. I need to try a hands-and-feet prompt.
It probably hurts the performance that we can't do separate prompts for the different text encoders, but the ComfyUI native nodes definitely fixed the banding problem that the custom node had.
Hmm. That is some jaw-dropping texture detail, although the overall luma / chroma is wonky. Lemme see if I can fix it.
Is Flux actually better at architecture solidness / symmetrical objects?
Getting too close to the original pose again though.
@icy drift but the true test are tcg cards haha 🙂
I couldn't find any YuGiOh in its training data. 😕
Very good with text prompt adherence.
yea
Details on staff, necklace, and leather obviously AI. 2nd-pass needs 1.0 CFG or colors will blow out.
The preview shows this weird ping-pong effect like the model keeps trying to shake things up during generation. I haven't read the technical report if there is one, so I don't know how / if it differs from Flux.
That's one shiny card.
Printable center design no problem.
Got background art.
I give up. It just can't do reflections. It may have been trained on synthetic reflection data that was wrong.
It's the best teeth-brushing model yet though.
I think it's main real power is subject-inclusion constraint following. You can add tons of stuff into an image, and it will all show up.
It can't seem to do before/after same-person generations either. This model just has a really vague / fuzzy understanding of identities, materials, and structures.
Hmm. So why is it so good with hands (and presumably limbs from other peoples' tests)?
it's basically flux and then they added a lot of text encoders in a very inefficient way
生成一只兔子的卡通形象
Anyone worked with mamba backbones for diffusion?
High-detail map of North Africa, Morocco highlighted including Western Sahara, vintage parchment background, deep red and gold tones, cinematic lighting, ultra-realistic, no text, 16:9 aspect ratio
ye we have linearised dit also
which is even faster
and works really bad in my opinion
I think linear attention is not so important/critical for image generation. Maybe I'm lacking imagination, but why would you want to scale up diffusion to millions of tokens? I can only see two scenarios: first, generating huge resolution images like 8k. But honestly, I think before you come up with a better attention mechanism, I would rather come up with a smarter upscaling technique. second, having a long conversation with multiple images (like omnigen or the current chatgpt). The latter might be interesting, but I think its currently more realistic to have a strong diffusion method that can be conditioned on one or very few images; that would be still possible with normal quadratic attention
(I also don't think that mamba and Co have a future in large language models to be honest. Might be wrong here, but my intuition is rather that memory models are the future)
oh generating to huge resolutions is like
the only thing I do lol
there is a project called CLEAR that nearly linearised Flux attention using a sliding window
it gives a 600% speed boost or so at 8k
flux is really nice at those higher resolutions especially if you can pass the 16k mark
its like getting a model from the future
the lighting goes so nice and soft
even than I would first generate in low res with quadratic attention and then upscale to higher res with something else
although sliding window is not "linear attention" to me xD But yeah, its almost linear in time I get that
oh I think it does do this
yeah it was not quite linear time
so if the resolution got high enough it would still start to scale in an unfriendly way
but it got pretty high
this looks really cool
but the memory consumption makes it probably still hard to generate more than 2k with it X_x
"The reason for these phenomenons is that pre-trained
DiTs, such as FLUX, rely heavily on local features to man-
age token relationships. To validate this, we visualize atten-
tion maps in Fig. 4 and observe that most significant atten-
tion scores fall in the local area around each query."
that's exactly what I hate at these DiT architectures X_x
why did they not simply add both, sliding window attention and global attention, in a 90:10 ratio or so
if you want a more modern / smarter version
thunderkittens team did sliding tile attention kernel for hopper
I think that's the conv in the model
linear attention is really bad
the quality hit is fairly big yh
Hmmmmm Aren't transformers inherently quadratic? Is this recent work? I was reading about vision mamba/ jamba hybrid backbones from last year
there's been all sorts of attempts to stop that quadratic scaling
I don't think it would help for me to link to a specific paper you need to read the whole literature really
like including RNN and GRU
its a frontier-level topic so its not rly something that can be distilled down into a small summary
fastest is a loaded term cos
its hard to compare apples to apples
remember there are also non-deep methods
naive bayes, XGBoost, random forest etc
or just like standard panel data multivariate regression difference-in-differences models
or something like a dynamic stochastic general equilibrium model where you are essentially parametrising partial differential equations
Mamba got a lot of attention when it was published, but it turned out it just doesn't work as good as transformers. Nobody is using mamba-only architectures anymore. However, there are mamba-transformer hybrids in use
but there are million ways of speeding up transformers.
methods like mamba try to find a completely new architecture to solve the problem. But there are also just tweaks and fixes that can speed up everything sufficiently
for example: most of the time you don't need to attend everything to everything. So instead of removing all your attention layers from the model and replacing them with, e.g. linear attention, mamba, whatever, you could also just replace half of them or 80% of them. You could also use a unet like architecture and use transformers only in the middle layer (where they are cheap) and use other methods like neighbourhood attention in all other layers. There are so many ways of speeding up things which are not explored yet
transformers keep getting faster and faster with 6d parallelism, kernel search methods, better distilling etc
so the need for mamba or linear attn is less and less
hybrids can be ok yeah
With Flux. I can't get HiDream alone to give me clear images at all for some reason. I'm using the recommended settings, and the quality is just bad.
How does it handle architecture? (E.g. banister railings.)
Give me a prompt to test.
Just finished typing up a prompt and testing now with new settings.
The photo is taken from the bottom of a winding staircase in the interior of a suburban home. At the top right of the image, a woman is standing at the top of the stairs, leaning over the railing and looking down. The scene is a brightly lit interior.
Look at her wonky fingers and optical-illusion infinite stairs with super crazy disjointed banisters everywhere. It always comes out like this.
Here it is cleaned up with Flux at 0.61 denoise. I should've used 0.67 for architecture, but it did its best to fix the mess. (Of course, Flux can't actually follow this prompt if you try it just straight up. It can't get the camera position right.)
Yep. Look at her fingers and the crazy stairs.
sora
Wow that's great prompt adherence (same architecture problems, but I remember it being crazy fast). What version of Sora is this? Where can I get it?
Oh LOL nevermind. I was thinking Sana. I don't care about closed source stuff.
need enterprise deal to get access to the good sora sadly
these are sana
love sana so much
Okay, then full critique on that trash. The banister sweeping in from the top-left suddenly ends mid-air, clipped off by the right banister. She has no hands.
Yeah, that's what I remember from Sana. Nowhere near HiDream or Flux sadly.
🤪
ye its not flux level at all
it can do text
but I could not get "the" or "jedi"
Definitely. Nothing out there can match 4o's text right now.
is ok for sci fi
But Flux can do full book covers too, with title and author title, in whatever font you want, with amazing kerning and composition.
I think we need separate foundation models for text anyway really
cos it feels like a very specialist task
I like having separate models for stuff
I don't make big images often these days but when I used to, I would chain together like 20 different models, some as regions and some as upscale passes
Flux?
Dream
How is the leather on her belt so consistent / solid looking? I get a blobby mess every time. 😦
#🏞|general-with-images pikachu
Comic-style wide shot of a dark alley, with the polar bear and the little man in the black hoodie locked in a tense face-off. The bear stands on two legs, wearing a red scarf and brown newsboy cap. The man is frozen in place as the bear’s massive shadow stretches over him, cast by stark moonlight. Mood: fatal finality. Use silhouetted framing with sharp light-dark contrast, emphasizing the bear’s dominance and the gravity of the moment. Let the moonlight carve out dramatic shapes in the alley, heightening the cinematic tension.
Generate an image…. What the hell am I doing? How do I create images on here?
Or is this just a chat for people to talk about?
yes
there is a bot for generating images in a different channel, but it's not for free. This discord is about open source models, so people generate the images on their own graphic cards.
"how to draw a cartoon elephant, step-by-step guide, 6 panels, each step building progressively -- step 1: simple head shape -- step 2: add trunk and eyes -- step 3: connect body to head -- step 4: draw legs and tail -- step 5: complete the full body with outlines -- step 6: fully colored elephant in soft pastel colors, clean vector art, minimal background, kids tutorial style, high resolution"
"A normal-looking public charging port (at an airport/mall) with a hidden microchip inside. A close-up shot shows the chip's LED faintly glowing as soon as a user connects their phone, with a 'Data Transfer' animation flashing on the phone screen. In the background, binary code streams and a hacker's hand is partially visible."
dream
Dream
(8k game sprite), (front view), pixel art, office worker,
(stressed face), (messy hair), (glowing computer screen reflection on glasses),
(untucked shirt), (coffee stain on tie), (holding smartphone under desk),
(cubicle background), flat shading, muted colors,
(comedy elements: tiny cactus with "F**k Work" sign),
art by Scott Benson, inspired by "Don't Starve"
Нужно в первой иллюстрации добавить ещё одну колонку с персонажами слева от основной. В новой колонке должны быть пустые слоты для абордажников. Затем весь интерфейс нужно выровнять.
generate image
#🏞|general-with-images make a smart room light automation design --aspect 9:16
In a futuristic white room, on the right an android is floating in a green tank, and on the left a woman is standing at a control panel. The green tank is a tall cylinder of aquarium glass filled with fluid. The android inside the tank is floating weightless above the floor, with head tilted back with eyes closed, and with arms spread. On the left, the woman standing at the control panel is facing left and looking down at the panel. Her hands are on the control panel. The woman at the control panel is wearing a white lab coat. The control panel is a blue and white futuristic LCD screen. The room is futuristic and white, with panels and lines. The scene is brightly lit.
Yep, that's some absolutely amazing composition and constraint following from the prompt. Blows Flux out of the water in that regard. It can't do the solid-looking architectural structures (like the wobbly lines on the vent overhead or the mangled lines on the screen / panel / her fingers), but the prompt adherence is just so useful. It just needs low-denoise hires with Flux to fix the mistakes.
In a futuristic white room, on the right an android is floating in a green tank, and on the left a woman is standing at a control panel. The green tank is a tall cylinder of aquarium glass filled with fluid. The android inside the tank is floating weightless above the floor, with head tilted back with eyes closed, and with arms spread. On the left, the woman standing at the control panel is facing left and looking down at the panel. Her hands are on the control panel. The woman at the control panel is wearing a white lab coat. The control panel is a blue and white futuristic LCD screen. The room is futuristic and white, with panels and lines. The scene is brightly lit.
Hello @Team,
I’m currently working on generating a Toy Starter Pack-style image using the following API:
"https://api.stability.ai/v2beta/stable-image/generate/sd3"
However, I’m not getting results that align with the attached reference image. I’ve included the prompt in the appropriate section and tested with various "seed" values, but the output still doesn’t meet expectations.
Could you please advise:
If this is the correct API to use for achieving this specific style?
If there are particular parameters or configurations I should be adjusting to improve the results?
Your guidance would be greatly appreciated.
I recommend gliff
Can someone help me with this?
output resolution is too low.
sd3 is trained for 1024x1024
Okay ill change
Its still same
dunno what schedule type automatic means, but rectified flow works simply with the Euler Sampling method
Should I change it to anything else?
Euler and normal are the settings in comfyui
Sampling method: Euler
Schedule Type: Simple
Nah its the same
I'm still gettting image same as previous
Is this auto1111? if so i am not sure if it supports sd3.5.
HiDream E1 working! 🙂 (E1 only works with 768*768 as far as I know, so output is smaller. At 1024, it will shift the image. No problem to upscale / hires / low-denoise.)
Change the painting on the canvas into a tuna fish.
It did its best with my terrible sketch.
Change the rough sketch style to a clean lineart style with coloring and shading.
I had to reroll and modify the prompt 3 times, and it still missed one of the apples. Maybe if I had said "4 apples" instead.
Change the apples' materials from gold to transparent crystal.
I think Omnigen might still be better though.
omnigen is still good sometimes yeah
a beautiful and powerful mysterious sorceress, smile, sitting on a rock, lightning magic, hat, detailed leather clothing with gemstones, dress, castle background
Mods?
Scam
I cant ping mods at once
they removed that option
uhhh @spark grove ig
Make moustache narrow and wide with slightly curved ends, Make eyes positioning in correct direction simultenously natural, Mir anees
you can't generate in this channel.
Make it a rainy day, photoreal style.
Nailed it in one! 🙂 New IC-Edit lora with Flux-Fill.
Give the kitten sunglasses, photoreal style.
Underwater.
tried hidream e1 with a slightly alternated prompt:
Editing Instruction: change the moustache into a short bushy Toothbrush moustache, wearing a jester hat. Target Image Description: young man
@icy drift will try the kitten image an your prompt next 🙂
My harddrive can't handle all these models
I waited for a lora for flux-fill 😀 do you have a link?
Can anyone tell me how to modify parts of an image through prompt in stable diffusion
Well different options. One of the oldest and trusted methods would be inpainting. So masking the region you want to change and prompt what you needed.
Most recent hidream e1 was released where you simple could write a prompt without masking which editing operations. But this might change more than only the specific region.
Could you please tell me where to do on this server like changing the image with a prompt
I'm new to discord
Well this is a server mostly with people who use stable diffusions models either local or with the paid api / artisan channels.
So there is no free bot etc. just some help to install stable diffusions models either front ends etc. to run on own hardware. Talk about prompt optimization, share art, share news.
you can generate in the artisan channels. read the information in #artisan-faq
Help me generate a pink butterfly with a black background.
Only pink and black images
read the information in #artisan-faq
Hot dog!
Well... if you run the model on your own machine, all you oay us electricty
Hot dog!
can i run sd3 on a rx580?
Make a cartoon big mango with human eyes
I don't think so, I owned a rx480 and SD 1.5 and SD XL was kind of a deal and I had to run it on Linux...
if it works it would be really slow I guess
Use an the original image as an image reference. Then talk about the change over and over again in the prompt for the new image.
I recommend a gguf version of the sd3. Or a lot of patience. large swap file too. I run it on a 4060
Make a realistic pakistan PIA Aeroplane
<dem_form> <img1_style> a biomechanical humanoid creature with tusks and extended tongue, bust portrait, in the exact rendering style of the second image, cinematic shadows, dark metallic skin, surreal alien armor, inspired by H.R. Giger, highly detailed, photoreal 3D style, atmospheric lighting, monochrome tones
You can not generate imsges in this channel
pig image
a 185cm high sexy man wears transparent sexy underwear under the sunshine

Hidream
generate cartoon image with girl
No.
how to use this
You don't. This isn't a bot channel for image creation. There are paid services for that here.
sd3.5 doesn't deserve to be forgoten by community
global automotive manufacturing facility, robotic assembly lines, quality control inspection, international collaboration, high-tech production environment, workers in clean uniforms, cinematic lighting, panoramic wide aspect ratio
Same, though it can be a struggle (tbh I have a love-hate rlation with it)
A very pretty Chinese girl with a smile on her face and a nice figure, wearing a purple dress
A very pretty Chinese girl with a smile on her face and a nice figure, wearing a purple dress
A cheerful, illustrated poster featuring a variety of wild animals engaging in humorous and chaotic parenting moments, like a koala dropping its baby, a hamster eating its young, and a black bear sleeping through parenting duties, all surrounded by hand-drawn floral borders and playful typography that says “There Are Moms Way Worse Than You”, pastel color palette, children’s book illustration style, flat vector aesthetic, clean white background --ar 4:5 --v 6.0 --style raw --s 250
Settings made a big difference for Bagel. Times are on my 4090.
A trading card from a trading card game. The title of the card at the top says: "GOBLIN FIEND". The card art shows a green goblin with red eyes holding a knife. Under the art is a text panel. The text panel has the text: "The goblin fiend loves the taste of fried foods and does not like vegetables." At the bottom-right corner of the card is a number panel. The number panel has an icon of a sword, and the number 7. The card is polished and well designed, with highly detailed art and text in a clear, crisp font.
(Bagel with bad settings. Made the difference between SDXL accuracy vs. better than Flux Dev accuracy.)
About 30 seconds for Chroma....
Guess the amount of steps is more important these where made with 26
38 Seconds with Chroma v30 unlocked.
SD3.5 Medium 15.6 Seconds
Yeah the steps are more relevant unless you're using my same hardware...
My example gens were 50 steps each. The point was just to compare Bagel's speed to the speed of other models. I've never heard of Chroma, and that's some impressive prompt following. Easily on par with Flux. I'll check it out.
Oh chroma is just flux nvm. Just gives me solid black images in Comfy.
it's modified from Flux schnell
it's a different arch. It prunes schnell from 12b to 8.9b and corrects a mistake related to padding tokens. You need a different workflow for it.
it also does not need clip l and still in 512px pretraining phase, so there is a potential
this is the SD3 channel, should probably take the flux discussion to #💬|general-chat
A cheerful, illustrated poster featuring a variety of wild animals engaging in humorous and chaotic parenting moments, like a koala dropping its baby, a hamster eating its young, and a black bear sleeping through parenting duties, all surrounded by hand-drawn floral borders and playful typography that says “There Are Moms Way Worse Than You”, pastel color palette, children’s book illustration style, flat vector aesthetic, clean white background --ar 4:5 --v 6.0 --style raw --s 250
A cheerful, illustrated poster featuring a variety of wild animals engaging in humorous and chaotic parenting moments, like a koala dropping its baby, a hamster eating its young, and a black bear sleeping through parenting duties, all surrounded by hand-drawn floral borders and playful typography that says “There Are Moms Way Worse Than You”, pastel color palette, children’s book illustration style, flat vector aesthetic, clean white background --ar 4:5 --v 6.0 --style raw --s 250 ¯_(ツ)_/¯
where do you get stable diffusion, preferably a gui version?
https://civitai.com/models/878387/stable-diffusion-35-large - this is the model. you'll need to install something to run it in. i recommend you install https://github.com/mcmonkeyprojects/SwarmUI
Yeah I got it working with a workflow from Civitai, then experimented with a bunch of settings. For accuracy I'm sticking with HiDream+llama8. I use Flux for hi-res to fix HiDream's bad architecture, and my 6-step merge works better for that than chroma. It was still interesting to try though.
An Asian man with a receding hairline and a round face, wearing a white shirt inside and a gray-blue jacket outside, holding a wooden dragon-headed cane. He is 190 cm tall and stares at the cane arrogantly. This is a 2D game concept map, with obvious strokes, transparent colors, delicate clothing materials, and shiny leather shoes. --cref https://s.mj.run/CxufHB5Qy1s --cw 100 --ar 9:16
is it possible to run a flux dev with 22 gigs insted we only have 16 gpu ? (may be by vram ? got 64 of thems)
! thank for this speed answer !
pretty sure it is scam
don't click on the server
Good for you. Lately scammer appear whenever questions are ask to join a discord and they need you login etc.
i may not have to share things here ?¿
Back to your question. There are some flux dev quantisation models with fp8 that are smaller. You will not be happy with the run time if the model will be pushed towards the ram
Well tech-support would be a good channel for these question. Scammers are everywhere 😦
oh i'm ok with fp8 vae ones
♥ Lords thanks for your rescue ♥ i'll try to be carefull with what i ask !
Ask whatever you want to know but if people DM you or send you links that sounds fishy.... it is fishy 🙂
promess, i'll wireshark logs them ^^
Not quite dice-like, huh.
Sure... I'll try the Kontext Dev version when available. I'll be BLOWN AWAY if it can do any of the following at all:
- follow literally even one simple instruction like: "make his shoulders wider", or "make her look off to the right", or "make the dog cross its paws".
- preserve outfits from references without changing sleeve length, adding panels, moving seams, changing button count, etc.
- do simple / basic UI edits like, "add a sword icon in the bottom-right corner", or "make the title font at the top larger", or "add a raised border around the red button on the bottom right"
I am expecting the model to struggle immensely with anything other than the tasks listed in the paper, and I'm expecting it to need exact-word prompting to achieve those.
Tried the first, no joy on the pro tagged version
But who knows, maybe conrolnet pose will be added and you can twist and turn your char anyway you want too, that doesn't seem that big of a leap
Still, the way it preserves details is amazing, original photo "add a roswell alien riding the zebra" "make the alien hold one arm up, as in a greeting". And all gens actually keep looking like a photo, by far the biggest win for me, not the faux 3d render photo look so often seen.
So i thought style transfer would be really good too, nope, very hit or miss, hit for common digital art styles, but when using simple line doodles or charcoal stuff, style was pretty much ignored 😢
I was being all huffy to not get my hopes up. But my hopes were up anyway. Oh well.
maybe it's the prompting (might help to add a little style to the prompts), $.04/image is too much too experiment a lot for me
Yeah I'll wait until I can run it in Comfy. Sometimes with great prompting and lots of re-rolls you can find a useful thing that a new model can do, that no old models can do.
using this style create: a towering Lizardfolk mercenary whose scales are fused with veins of obsidian and reinforced with magi-tech plating. His eyes glow with internal arcane energy. He wears heavy brass pauldrons enchanted for durability and carries a powerful sonic disruptor gauntlet. He's gruff, pragmatic, and focused solely on the highest bidder.
it's something, maybe i just want too much :p
the first one was my first try and seeing it keep the effect of white border, image over it, i thought, wow, that's good
just get the gguf and run that
I finally got around to actually releasing an SD 3.5 Medium lora after spending 80 years experimenting with training lol (actually it's a Dora, specifically, not that it matters)
for the art style of Tim Jacobus (guy who did all the original Goosebumps cover art)
https://civitai.com/models/1635408/stable-diffusion-35-medium-art-style-tim-jacobus
relevant training notes I guess if anyone cares (in Kohya-ian terms):
- CAME optimizer
- no text encoder training
- 0.0001 model learning rate
- Cosine With Restarts Scheduler set to 3 restarts
- "Noise Offset" at 0.2
- Multires Noise Iterations and Multires Noise Discount disabled entirely
- Dim 64, Alpha 32, Dora model type, "factor" set to 2
- Batch Size 1 but with 5 gradient accumulation steps (as an alternative to a regular batch size of 5, which I imagine would have a similar effect)
also I think I generated all the samples with DPM++ 2S Ancestral SGM Uniform @ CFG 5.5, in Comfy
why do you even use noise offset?
it's a weird hacky technique full if potential errors that is not even necessary for rectified flow matching models
0.2 gives better results than nothing for SD 3.5 Medium
and better results than the standard 0.1 (or 0.03) people would typically use for SDXL
I can't speak to anything other than the results lol
the only "training guide" ever released for SD 3.5 anything was basically nonsense in my extensively tested opinion
anything other than Dora is basically useless
any normal Adam optimizer is basically useless
I've never gotten vaguely good results for SD 3.5 Medium with anything other than CAME and low-factor Doras 
it doesn't train anything remotely like any other model ever
create a icon like this photo
Draw a statue of an anime god : "raise the level alone" contrast photo with highlights, without background
#artisan-1 - PA realistic standing image of Lord Kalabhairava, the fierce form of Lord Shiva. He is depicted with a terrifying yet divine expression, with three eyes glowing like fire. His complexion is dark as a stormy night, adorned with garlands of skulls and serpents. He stands powerfully in a cremation ground, surrounded by blazing fires and spirits. He holds a trident, a drum (damaru), a noose, and a skull bowl in his four hands. His hair is matted and flies wildly, crowned with a crescent moon. His feet are adorned with golden anklets, and he wears tiger skin. A dog stands loyally beside him. The atmosphere is mystical, with storm clouds and divine light behind him, capturing the essence of time and death. Style: Hyper-realistic, high detail, divine and intimidating aura, traditional Hindu iconography.
PA realistic standing image of Lord Kalabhairava, the fierce form of Lord Shiva. He is depicted with a terrifying yet divine expression, with three eyes glowing like fire. His complexion is dark as a stormy night, adorned with garlands of skulls and serpents. He stands powerfully in a cremation ground, surrounded by blazing fires and spirits. He holds a trident, a drum (damaru), a noose, and a skull bowl in his four hands. His hair is matted and flies wildly, crowned with a crescent moon. His feet are adorned with golden anklets, and he wears tiger skin. A dog stands loyally beside him. The atmosphere is mystical, with storm clouds and divine light behind him, capturing the essence of time and death. Style: Hyper-realistic, high detail, divine and intimidating aura, traditional Hindu iconography.
What tool did u use if I may ask?
prompt
"تیزر آموزشی هوش مصنوعی: چرخدندههای مکانیکی کلاسیک (نماد سیستمهای قدیمی) به آرامی به ساختارهای دیجیتالی تبدیل میشوند. ابتدا به لایههای نورانی یک شبکه عصبی ساده (3 لایه با نورهای آبی و سبز) تغییر شکل میدهند، سپس به یک معماری پیچیده Deep Learning (با صدها نور قرمز-زرد-آبی متصل) تکامل مییابند.
`prompt
主题公园景观,青柠水晶雕塑作为核心装置,透明玻璃温室中悬浮水滴形青柠树,弧形玻璃步道环绕浅绿色反光水池,现代极简风格建筑由玻璃与亚克力构成,阳光透过棱镜折射彩虹光斑,柔焦清新色调,等轴视角构图,by Nendo工作室 --ar 16:9 --v 6.0
i like this one, this kinda reminds me of the starry night of vincent van gogh, hope u generate more of thisss :>
Realistic image of a porsche
soo stunning...like a glowing fantasy world come to life broo
thanks :) took me a few tries, i kept getting a mushroom top as a head
Modern Style Bedroom Interior Scene with Contemporary Decorative Style, Bed, Cabinet, Nightstand, Table, Greenery
a lone survivor senses danger lurking behind shattered glass.
Shot concept: Grit, suspense, and post-apocalyptic atmosphere.
#AIart #Cinematic #SurvivorScene #PostApocalyptic
https://civitai.com/models/1701368 Did a multi-appearance realistic fantasy "hellhound" creature archetype Dora for SD 3.5 Medium
Nice scam.
Nah, the really good ones are the ones that actually sound real.
🚨 Limited-Time Offer!
Heart of Steel by Blacke Marlin is now FREE on Kindle until June 27!
Enter a dieselpunk world of sky missions, sabotage, and dark secrets.
Don’t miss your chance to grab this gripping novella.
https://www.amazon.com/dp/B0FF3C5YX3?ref_=ast_author_mpb
#dieselpunk #scifi #kindle #freebook #fiction
awesome bro
Thanks
I've got a big detailer one trained solely on hi res Flux Pro Ultra outputs I'm gonna release soon too
Stock Medium on Left, with Dora on right
Same seed / prompt / sampling settings / etc
create a friendly, cute, white and round robot assitant that resembles eve from wall e, deptic her from different angles
Create a photorealistic and realistic image with a resolution of 3840x2160 in a cyberpunk style. In the foreground, depict a very beautiful, slender woman with a short haircut, who is half Asian and half Caucasian. She wears thin, tight-fitting clothing with the inscription "Xaero," through which the outlines of her nipples are visible. In the background, show a megacity with dark tones accented by blue, pink, and purple colors, and a cyberpunk-style sports car parked nearby. Please generate 3 different variations of this image. The image should have photographic realism, with detailed lighting, textures, and atmosphere typical of high-end cyberpunk visuals
So, with kontext dev out, tried the same images and prompt with dev (i used wavespeed online, haven't set it up locally).... uf, not the results i hope for.
Hopefully this ages like milk, and even next week it's shown kontext is amaaaazing
Works alright for me. Cheers.
i think this is just fine bro maybe u have to explore more? i guess
PA realistic standing image of Lord Kalabhairava, the fierce form of Lord Shiva. He is depicted with a terrifying yet divine expression, with three eyes glowing like fire. His complexion is dark as a stormy night, adorned with garlands of skulls and serpents. He stands powerfully in a cremation ground, surrounded by blazing fires and spirits. He holds a trident, a drum (damaru), a noose, and a skull bowl in his four hands. His hair is matted and flies wildly, crowned with a crescent moon. His feet are adorned with golden anklets, and he wears tiger skin. A dog stands loyally beside him. The atmosphere is mystical, with storm clouds and divine light behind him, capturing the essence of time and death. Style: Hyper-realistic, high detail, divine and intimidating aura, traditional Hindu iconography.
The intend/prompt was to use the style of a source image (the same i used in the post i replied to), sadly the model hardly followed it, only the color scheme a bit. Other trickery might or might not wotk (i had some success adding stuff to real photo's) but style reference isn't something i managed to make the dev version do. (and it was what i looked most forward to 😞 )
Create a 1990s realistic portrait featuring Mexican American singer Selena Quintanilla with long dark hair and bangs, she's wearing red lipstick she's smiling
PA realistic standing image of Lord Kalabhairava, the fierce form of Lord Shiva. He is depicted with a terrifying yet divine expression, with three eyes glowing like fire. His complexion is dark as a stormy night, adorned with garlands of skulls and serpents. He stands powerfully in a cremation ground, surrounded by blazing fires and spirits. He holds a trident, a drum (damaru), a noose, and a skull bowl in his four hands. His hair is matted and flies wildly, crowned with a crescent moon. His feet are adorned with golden anklets, and he wears tiger skin. A dog stands loyally beside him. The atmosphere is mystical, with storm clouds and divine light behind him, capturing the essence of time and death. Style: Hyper-realistic, high detail, divine and intimidating aura, traditional Hindu iconography.
short name for a channel holding every informations related to this discord's channel bot.
🤖 : beep boop beep, there you go #artisan-faq
Chinese ink painting of the Red Cliffs battlefield at dusk, towering red cliffs with crashing waves (‘乱石穿空,惊涛拍岸’), ruined ancient fortifications in the distance, a young General Zhou Yu (周瑜) in silk headscarf and feather fan (‘羽扇纶巾’), standing beside Lady Xiao Qiao, romantic yet heroic atmosphere, misty river reflecting moonlight, fusion of historical grandeur and poetic melancholy, Song Dynasty landscape style.
...
“Are there open-source virtual try-on projects I could help with or test?”
kaleidoscope sucked into a kaleidochromic vortex --s 750 --v 7.0 --raw - Remix (Strong)
Close-up professional corporate man headshot, modern business portrait. The subject's head and shoulders are tightly framed, filling most of the image. Focus is sharply on the face, particularly the eyes, with a shallow depth of field blurring the background.
Lighting: Three-point studio lighting setup optimized for a close-up. A soft, diffused key light directly or slightly to the side of the face to minimize harsh shadows. A fill light to subtly illuminate the shadow areas under the chin and nose. A hair light or rim light from behind to add a subtle highlight along the hair and separate the subject from the background.
Background: Smooth, solid, neutral dark gray or deep blue background, completely out of focus to ensure maximum attention on the subject.
Camera & Style: Simulated DSLR photography with a high-quality portrait lens (e.g., 85mm equivalent). The image should have ultra-detailed facial features, realistic skin texture (without excessive smoothing), and professional, neutral color grading suitable for business use. The overall feel should be confident, approachable, and trustworthy.
it's fine bruhh just keep doung yo best and eventually things will follow thruuu
Create a hyper-realistic 8K resolution cinematic poster of Mobile Legends: Bang Bang featuring 5 characters: Layla (with her cannon), Dyrroth (in a fierce battle stance), Harley (casting a magic card), Esmeralda (with her cosmic scythe and flowing cloak), and Akai (spinning with his bamboo staff). The scene should be dark and dramatic, with intense rim lighting, glowing particle effects, lens flares, and smoke in the background. Position the characters in a powerful triangular composition on a fantasy battlefield with magic energy storms and ruins. Each hero must look dynamic and battle-ready, with ultra-detailed armor and realistic facial expressions. Add cinematic color grading and film grain for a movie-poster look. Include the title ‘Mobile Legends: Bang Bang’ in bold metallic lettering at the bottom center. Aspect Ratio: 16:9. Full movie-poster tone, highly detailed, epic fantasy style."
thats cool, whats that checkpoint?
Probably was zavy
sakura, white, pink --ar 9:16 --sref 2121577414
A 1990s-style self-portrait of 27-year-old Jennifer Lopez, with long, dark wavy hair and soft bangs. She wears bold red lipstick and is styled like Mexican American Tejano singer Selena Quintanilla. The photo has a warm, vintage studio portrait vibe, with soft lighting and a nostalgic 90s glamour aesthetic.
Generate a black and white portrait of my face, shot from a close-up, overhead angle, with my head facing forward. I used a 35mm lens and 10.7K 4HD quality.
Proud expression, water droplets on my face. Background with deep black shadows: only my face is visible, and it looks ultra-sharp. Aspect ratio of 4:3, with a 1/5 depth of field effect.
tis coolll,,how u do this man?
#▶|stable-video-diffusion a divine digital painting of Lord Krishna as Radha Ramana sitting beneath a blooming kadamba tree on a carved stone bench, Radha resting gently against his shoulder, Krishna wearing a saffron‑yellow silk dhoti and peacock‑feather crown, softly playing the flute, Radha in a pastel pink and turquoise lehenga with jasmine garlands around her braid, lotus‑filled pond glimmering behind them, morning mist and golden rays filtering through leaves, peacocks and deer in the background, tranquil Vrindavan atmosphere, ultra‑detailed devotional art, cinematic soft lighting, peaceful romantic mood, high‑resolution
how make ai photo
No data source is currently selected. Please choose a data source from the dashboard and try again.
@civic latch yes I have the specifications/ details of the logo
golden retriever dog
Here is the image you requested.
Here is the image you requested.
Here is the image you requested.
Here is the image you requested.
house in the woods resemblance of a castle but more like a home
@spark grove spammer scammer
purged
@spark grove again
1: that prompt is too long and 2: you can't generate in this channel
Ultra-realistic photogrammetry 3-D globe named Gloxus, 16-k resolution earth texture, micro-topographic detail on every mountain ridge and river delta, continents carved from obsidian-black basalt with razor-sharp displacement maps, iridescent neon-cyan ocean currents swirling under a thin glass layer, holographic magenta circuit-veins mapping city lights across landmasses, subtle cyan grid lat/long lines hovering 2 mm above surface, cinematic rim-light from a cool white sun at 45°, micro-scratches and fingerprint smudges on glossy protective dome, shallow depth of field f/1.4, 32-bit HDR, octane render quality, ray-traced reflections, photoreal shadows, ultra-sharp 200 mm lens, clean black studio background, --ar 16:9 --cfg 12 --steps 40 --sampler DPM++ 2M Karras --vae kl-f8-anime2 --no text, watermark, logo, frame
Shame on you. You’ve picked on this poor, helpless bot and now, somewhere on the Indian subcontinent, there is a web page where this image is captioned as an attractive young woman in a business suit. I hope you think about the suffering you have caused.
Here is the image you requested.
u kidding man hahha this is hilarious!
Anime-style third-year student with spiky hair, wearing a tank top and shorts, dramatically leaping towards a basketball hoop placed on a mountain summit, sweat droplets flying, exaggerated wind effects, vibrant sunset colors with pink and orange clouds, stylized rocky terrain, action comic book shading, inspired by 'Slam Dunk' artwork
how the heck do we counter-report someone
this king is one of the few keepin this channel alive
okay 😭
Hello im new here 
Photorealistic full-body portrait, eye-level shot, sharp focus on subject: A beautiful, energetic 22-year-old Vietnamese woman, exuding confidence and strength. Her skin is glowing with perspiration, highlighting her active state. She is clad in sleek, form-fitting athletic wear (e.g., a tight sports top or tank top and high-waisted leggings) that accentuates her toned physique and prominent bust. The fabric, slightly damp with sweat, clings closely to her body, subtly emphasizing her natural contours and definition beneath. She is captured mid-movement or pausing in a modern, well-equipped gym, with blurred fitness equipment, bright mirrors, and a motivating atmosphere in the background. Her expression is focused and determined, yet radiating a youthful vitality. Natural gym lighting with subtle highlights on her skin and the sheen of sweat. Captured with exquisite detail and sharpness, showcasing natural tones in a realistic photographic style, akin to a professional shot on a high-end DSLR (e.g., Canon EOS R5 with a 70-200mm f/2.8 lens), ISO 400, 1/160s shutter speed, and f/3.2 aperture. Shallow depth of field, drawing all attention to the woman. True-to-life colors. Aspect ratio 9:16.
vorrei che rappresentassi una scritta "Il volo di Crà"; la immagino adaggiata sulla riva di una isola, leggermente lambita dal mere. I caratteri che la compongono vorrei che fossero come scolpiti su degli scogli e leggermente ricoperti di vegetazione.
cool girl
expand this
shesh, baddieee ❤️
Hi
Howl's Moving Castle
@spark grove spammer alert
I wana generate this kind of image some one help me
Anyone use Stable Diffusion to segment?
start with "2d cel-shaded cartoon" as the first part of your prompt, and then go into detail what you want the cartoon to be
will this do?
guys, how can I run sd3.5 on Forge? I belive I'm doing something wrong, because don't generate image and tilt my Colab when I use it =/
So any opinions on Krea yet?
my first generation with generate/ultra
I recently tested it locally. I can't say I'm overly impressed with this model. It's very noisy, in my opinion. You could say it's just another fine-tuned model, nothing more.
Krea took Flux Dev Raw and then did their own post-training. This blog entry details it
So calling it a finetune is not wrong, but it goes quite a bit deeper all the same
I will add that my initial images with Krea are very nice and are more intresting to me than vanilla Flux dev. Is it the best overall Flux? I haven't come to any conclusions. My other fav was/is Pixelwave. I tried otehr attempts but they were inevitably not very interesting. This is all without Loras of course
I sitll really love SD 3.5 FWIW. They each have their strengths and weaknesses. FOr actual text, Flux is in a class of its own for open local models.
SD 3.5 Large of course.
Of course, Flux's biggest strength is precisely its flexibillity to be finetuned or have Loras
hey is there a easy tutorial on how to train a lora
if anyone need help in lora training let me know
Flux-Krea
diptych of two identical images as a split screen featuring the same character: a young woman from jrpg game. On the right she looks at viewer. On the left she's wearing a straw hat
I find it MUCH better than Flux Dev
I think this is the important point. Flux Krea seems to be less dpo-ed and overfitted than Flux Dev
it has more issues with anatomy than Flux Dev, but on the other side its much more diverse in styles
did you tried to use author prompts with Flux Krea?
Flux Dev never responded on them. Flux Krea, however, can roughly imitate styles you name (similar quality as SDXL)
in Flux Dev everything always looks the same style. Flux Krea allows you to use different styles in your prompt without needing a Lora
For painterly stuff, I prefer PixelWave more than Krea, but Krea comes close. I think finetuning Krea will give better results than finetuning Flux Dev on styles
Old stone alley, mossy banyan, flower-filled balcony. Natural, vibrant, cinematic. Miyazaki style, 32K UHD.
🔥
hey is SD3 dead? no updates for almost a year?
I would say the time where SD was the dominating open source solution for image gen are long over. Good news is, there are so many new models out on the market
there is the Flux ecosystem by the guys who initially developed SD. There is highdream, there is Wan (can generate videos and images) and there is now also Qwen image
I know
I have still SD3 at my app
and today I refactored it
have more models etc
and removed it
it was the worst model honestly
Is there anyone looking for dev?
SD3 is now SD3.5 and it's not dead
Qwen image?
HiDream is nice but it is really sluggish
I am not at all convinced it is worth the effort
SD3.5 is super cool
But even Flux can be complained about in terms of updates. Krea is really just a different post-trained Flux
The commercials have not been idle either: Imagen, Mid7 and Ideogram have all been pulling ahead in some aspects
So miraculously two accounts spamming the exact same crap. Brilliant.
On a relevant note, I did generate some images with Qwen Image and it is quite good. Good text adherence too.
A lot slower than Flux, and too early to say whether the vanilla is an improvement or not over Flux Krea
a lot of people REALLY weren't happy about the recent "safety policy" update with regards to "core" models at least
especially in light of the fact that SD 3.5 was mostly uncensored
they didn't literally DPO-tune female nipples out of the model the way Flux did 
I think he meant the SD3 in general and was not suggesting SD3.0 specifically
Anyhow, I did some testing, very light, of the new HiDream and Qween Image in terms of models with text
Qwen really I king of correct text, but it also sacrifices a lot to achieve it IMHO. The default imagery is much less inspired, and the fonts are downright boring. It never deviates or produces anything fun looking, which is likely solid if you are trying to put together some ad or banner. Hopefully it is more readily tunable and new tweaked models will emerge on Civitai
HiDream's text ability seems about on par with Flux and is definitely more intreesting visually. Albeit it... it botches long words a LOT
Here is an example of ultra correct Qwen:
I will point out that Qwen is by far the most accurate portrayer of chess pieces. It gets them right each and every time
Others, including Flux or SD35, can be a bit creative at times
FWIW, I ran the prompt multiple times with varying samplers and steps. Qwen really does not gain any improvement beyond 25 steps. You can see the occasional micro diff, but never anything warranting it to be called an improvement
This is hidream
\
another to illustrate. Flux is no better with this text
On the other hand, Qwen is incredibly strong at making logos
and not merely because of text accuracy
WHich is usually not a big deal since logos don't usually have major texts
I threw Qwen a bit of a curveball with the request for a logo for Chess & Tech, round, with a design based on a circuit board and.... Louis XIV
Not bad at all
A luminous ‘Digital Giant Tree’ stands at the heart of a futuristic city, its trunk entwined with flowing data chains forming a ‘2019-2025’ timeline. The canopy spreads into a massive ecological dome shaped like the number ‘6.’ AI drones perch on its branches like birds, while roots connect to an underground 5G network. The ground features transparent solar panels and dandelion-inspired smart streetlights. A river glows with quantum computing projections, and humans interact with nature in a holographic garden. Cyberpunk lighting blends with forest mist, rendered in a surrealist style
Is that supposed to be inspired from Ancient Rome or some other antique civilization?
你好
Hello!
你好
The cafe
SD 3.5 L, Dreamy Aesthetics
/create: big red mouse
Neon Rev - Electric Denim Girl.
Am I able to use Stable Diffusion to make image to gif (while keeping the transparent background)?
stable diffusion just creates the image. it'll depend on the interface you run it in whether you can have transparency or not, and export it out as a .gif with transparency
There are plenty of free online tools to remove extraneous background to transparency. FOr most use cases they are perfectly fine. If you want or need really detailed work, then experience and a tool such as Photoshop of Affinity Designer are the way to go for now.
i know, but @viral moon was asking if he could use stable diffusion to make transparent gif's
I had understood, but in case he felt stymied by its inability to do so, I was offering up solutions.
stable diffusion cannot do transparency cause the vae has no alpha channel. So SubtleOne is right: you need an extra tool to remove the background
#🆕|sd3 Manga style, black-and-white ink, dramatic contrast, cinematic angles. Sequential panels, consistent characters, tense horror-thriller mood. Silent library, frustrated writer, masked killers, surreal ending. Each page shows panels with continuous story flow.
Page 1 – Library
2 panels: vast empty modern library, tall shelves, rows of tables; closer view of books and dust in silence.
Page 2 – Writer
6 panels: close-up of man (30s) writing furiously; pen in hand; wide shot alone at table with books; messy scribbled handwriting; crumpled paper; shadowed angry face.
Page 3 – Intruders
4 panels: library doors open, masked men enter; close-up of cold eyes; killers moving between tables; man tapping desk, killer behind.
Page 4 – First Kill
3 panels: disruptive man tapping; killer grabs his hair; throat slashed, blood on table.
Page 5 – Girl
4 panels: young woman gasps; killer covers her mouth; “shhh” gesture at Keep Silence poster; silenced pistol shot, she collapses.
Page 6 – Writer’s Rage
3 panels: writer slams fist; killer behind with knife; suspenseful knife over him.
Page 7 – Break
4 panels: writer rips page; killers vanish, library empty; writer breathing heavy; fist smashing wall.
Page 8 – End
3 panels: glass door shatters; writer crushed under shards; close-up of shard with “Do Not Disturb, Keep Silence.”
you can't generate in this channel AND you can't give the AI a script and expect it to create a movie or something
You can't create directly here on discord.
You can do that on your own stable diffusion env on your computer. 🙂
??
一个小女孩端着咖啡,微笑着面对着我
You can't generate image directly here.
Kim jung gi style
you can not generate in this channel
What channel do I choose and how do I start writing the prompt because I tried # and I also tried /
What channel do I choose and how do I start writing the prompt because I tried # and I also tried /
first of all, did you read the information in #artisan-faq
The terrifying office of China's cattle and horse employees
Yo guys where to generate images
I’m new
You don't. Unless you want to pay. Or you do it on your own computer.
i'll tell you,
1
Where do you generate the images? Cloud or your Pcs?
I generate on my PC, then post the results.
before it was possible to generate with discord right? can you tell me why its not possible now.
I guess you mean generate for free. Otherwise the #artisan-faq artisan channels are still there. In the beginning and for testing purposes when new models appear there where some beta channels open and for free.
But without the purpose of beta testing giving away expensive GPU Calculation Time for free does not sound like a good business model
que sucede
Any thoughts on the monster new release?
Probably impossible to run locally for now, but still the biggest OS image generator to date in terms of sheer size
It is MOE though, so maybe I will be wrong
Right now my fav local model is that new Flux out of the box. The big Qwen was ok, but didn't wow me
"The Largest Image Generation MoE Model: This is the largest open-source image generation Mixture of Experts (MoE) model to date. It features 64 experts and a total of 80 billion parameters, with 13 billion activated per token, significantly enhancing its capacity and performance."
"Our model can effectively process very long text inputs, enabling users to precisely control the finer details of generated images. Extended prompts allow for intricate elements to be accurately captured, making it ideal for complex projects requiring precision and creativity."
To me the new model is complicated 😵💫
In a way, it's what i hoped to see after SDXL, what SD3 and later models were supposed to be, it follows prompts and doesn't override styles with pre-baked crap
But it also has more errors, i tried it out on tencent site, let it create 4 gens, a few are always plain unusable bad, 3 arms like bad, but others are nice. And there's variety in outputs
Fal has it to test, but it is pay to use, which is fair, except I cannot imagine myself paying to use it when there are literally a number of free private ones such as Nano Banana, nevermind ones I can run on my own machine like Flux Krea or SD3.5 L.
it's free here https://hunyuan.tencent.com/modelSquare/home/play/d3d6bl42c3m83jodcrrg?modelId=289&from=open-source-image-zh-0 (sign up with email in the third tab)
I went there and they wanted me to sign up with WeChat
which i do not have
and am certainly not going to install for this
there's three tabs at the top, the last is email
ok, so your impressions are that the results do not match the hype
i like it, it's the first model where things look nice again after 3.0 / 3.5 🙂 But i'm a sucker for fine textures, half my prompts use the word etching or parchment
ok, I entered, let me try something simple, but offbeat
it's just well, it has issues (the foot became a paw, double trident, but it's also the first model where the bull and god is actually seemingly made of water)
Things like this i never managed with qwen and hardly with flux
and it understands "In a warm, sun-drenched Japanese classroom, a bright-eyed, cherry-blossom-haired schoolgirl named Sakura** playfully twists a lock of her hair between her lip and nose, creating a makeshift mustache that** makes her giggle uncontrollably, as her friends look on in amusement, by renowned anime artist, Hirohiko Araki."
well, for a comic rendition of a Gnoll with a sword, it actually did a decent job.
A powerfully built gnoll, resembling an upright hyena, covered in short brown fur spotted in darker brown. It wears a short kilt and a hardened leather apron adorned with metal links and spikes. The gnoll holds a short sword, ready for a fight, with a 3/4 body view, showing its full body. Rendered in a classic 80s comic book style with strong, defined linework, and detailed rendering of textures and shadows.
It is not really 80s commic book style, but nevertheless solid details
it also nailed the 3/4 body view
for the graphics assets of my game, I tend to use Nano for its repeatability in style as well as unlimited free use
(a significant deal)
though for the starting asset Flux Krea has been great too
This however was better to be honest
How many generations can you get? I assume there is a daily cap
or weekly...
I haven't really used imagegen a lot recently, I just never liked the look of new models. Flux with lora and sd3.5 were the last I actually enjoyed using. Hunyan 3 is exciting to use like those, it feels like a throwback to styles/textures in a good way
Nano is not Imagen
Flux krea was a disappointment to me, the real krea had nicer outputs
I actually never used nano, only imagen.
Try Nano. Aside from being the top rated text to image generator on LM Arean, it has unique editing abilities no one has
Editing with it is done via prompt, but let me show you what I mean by unique
Here is a plain jane image, not made by Nano
simple enough, sunrise, pirate ship, etc.
Now I tell Nano: change the image to a sunset
Just that, no masking or anything else
It's insane
Yup that's good, i kinda stopped being interested in this editting after flux-edit
the pro version was nice, the dev version abysmal
You can ask it to take a person, or even those cartoons, and tell it to raise the arm, have him turn his head to the right, and it will all be perfect
fur, ears, everything
as if telling the model to move around for the next photo
Anyhow, that is why Nano is overall king for now
overall, not necessarily in each individual thing
to be clear
afraid i tried it on wrong thing, i tried to transfer style, that didn't go well :p
but yeah, haven't looked back since... i understand it actually can generate images too, which is nice
So suppose I like Hunyuan's core image. I could use it as the starting point and then have Nano make thge modifiucations
Nano is top rated on LMArena as I mentioned
I assume you know what LMArena is
yup, aware of it 🙂
so what are the limits in Hunyuan public use? DO you know?
nano, for made out of water, hunyuan wins
Nope, i fully expect to hit a wall anytime soon
just a random prompt i remembered models struggled with to make look natural, hunyan does a good job
Atmospheric wide shot in a dense, ancient forest under dappled sunlight. Large, incredibly adhesive spiderwebs stretch high between gnarled trees, their thick, glistening strands shimmering as they catch the light. A wild deer (doe) is visibly ensnared, its body tangled in the sticky webbing. Nearby, a young woman struggles against the webs, her clothing and hair tightly bound, her face showing distress and the sticky strands clinging to her. Eerie shadows. Highly detailed dark fantasy illustration.
nana banana kinda iffy first not much web, when i asked made it more entangled i got zombie 🤡
Where is SD 4 (i guess never, new sai doesn't seem big on open models or even new model dev for individuals (as opposed to enterprises)) 😢 Maybe it's because what i've seen/used first, but the SAI models have that something special (style/textures i just call it) newer models just haven't captured. hunyuan kinda seems to have as well, but it's early days.
Sincere question. What would you expect SD4 to be ? What do you expect from it ?
Expect? Or want?
For me, 4 things:
- Easily trained for LoRAa. Flux has an iron grip on this right now, and it is a big deal IMHO.
- Stronger text handling. It can handle 2-3 words ok most of the time, but it is now lagging quite a bit behind its peers.
- A larger more powerful model.
- And please tone down the nanny police. Efforts to control such things are not only wasted, since it is literally the first thing targeted by others for removal, but it invariably has detrimental effects on general image production. It need not overtly allow sexual content, but nor should it feel like a 1950s movie censorship board.
Just my 2 cents
I really like SD3.5 L FWIW, but I tend to use Flux Krea for more consistent results and style. I can ask SD for an 80s comic books style, and it will deliver, but even with plenty of details, it is all over the place in the results. It is why I mentioned LoRAs. Someone is bound to want something that it doesn't handle well, anyhow.
I don't see 4) happening anytime soon for any model released by any company. This kind of usage is bad press for the large audience. Moreover the easier it is to do this kind of content, the easier it is to abuse it. Add to that all the legal issues and stuff going on such as ID restriction getting introduced in some countries for that kind of content... And yeah... They pretty much have to do that kind of policing.
Otherwise yeah you pretty much expect it to catch up with others.
I mean, Flux is more or less SD4. I wouldn't expect a new successor of Stable Diffusion as all people who developed SD are now developing Flux
I think the reason why basically all new models. including Flux, have big issues with styles is because they are using T5 or other text-based models instead of the CLIP as in SD, and because they are trained on synthetic captions
SD 1.5 and SDXL were trained on ALT tags, so the image captions often contained hints regarding the style
newer models use VLLMs to caption the image, but VLLMs usually don't capture stylistic nuances. They know the difference between a "cartoon" and a "photography", but they barely understand differences between certain art styles. When they generate captions, the captions focus on the content and not on the style. Models are trained on these captions and never learn how to describe these styles via prompt
thats probably the reason why models like Flux can easily learn (via lora) a lot of different styles, but its hard or even impossible to reach these styles via prompt engineering
at least thats my theory 🤷♂️
unfortunately, styles are also a thing all the big companies are not interested in. Styles are often associated with specific authors, and everyone fears copyright issues. Furthermore, if you want to make money with image generation, you want to target the advertisement industry. For this, you don't need art styles
I'm afraid the answer is like "a better horse", i know what we have know, what i like and don't like but no idea what's possible.
But the reason I mention SAI models is that compared to others their outputs always felt less artificial, more fine details and textures, instead of overly smoother AI look. (after SDXL, i feel 3.0 (the API version) did this still well, but in 3.5 it suffered a bit, some styles just became much worse or flat out impossible, it felt more like exploiting clip's knowledge as opposed to having the model actually trained ion them). Maybe SAI has a really really good data set, better than what other models have been trained on (maybe cause it's older it just has less synthetic data).
Anyway, what i would hope is ,much, much better prompt following (also when things are off the beaten path), but not at the cost of style or variety, like many recent models. So good prompt following, wide range of style and fine details/textures. "Promptable" by just by using references, both images (like ideogram) and codes/hashes (artists seem a no-go anyway), my dream would be throw some images to it, extract a style hash that's a merge of styles in those images, kinda like a lora but instant. And detailed as in make the creature in ref style a, the other creature in a blend of a and b, the background in style c. I suppose that's already beyond simple image-gen and close to current instruction models, just also for style not just subject please.
On top of that consistency, which again would probably mean an instruction like model, that allows consistent subjects and environments in various styles / perspectives / angles.
That's been my thought as well, and after that it's often "why not use that info from captions to make the model learn styles in a better way, including a way to get it out of the model". Obviously it's no that simple 😉
It is a lot simpler than made out, or all the other image generators would struggle just as badly. As to the devs who made Flux, well, there are more than those devs in the world. Whether or not Stability will actually develop a new model is entirely up to them, but producing a subpar model, relative to the existing ecosphere, with a list of reasons why it is subpar isn't going to cut it. There is no shortage of competition nowadays.
I don't say SAI cannot make another model. My point is, that such a model would have the name "stable diffusion" but it would be made by entirely different development team. From my perspective, a truly successor of stable diffusion xl/3 would be another flux model
cause for me its the people who count, not the brand name. That's said, it doesn't matter much anyways. I'm also happy to work with models like Qwen. The days where all good open models for image generations came from SAI are long over anyways
(although, sadly, most of the newer models like Qwen were mostly just copy&pasted SD3 or Flux models )
I feel data matters too, and wouldn't be surprised if SAI liberally scraped the internet and/or used screencaps for their models where newer models simply scraped ideogram/dalle/flux/MJ, ie already not the most finely detailed ai-slob.
I think we'll never know what a new SAI model would be like, in the end it doesn't matter a whole lot where new models come from, though i have to admit it feels a bit iffy so much is from china, i'd ike some western biased models too :p
of course they scraped the internet. But I would be surprised if flux is not just using the same data
You sure would think so, same team and all 😉 And yet, flux seems to have less knowledge, but i've also read it's the result of preference optimization, or the distillation. SAI outputs just usually appear less AI to me. Then again, i've convinced myself flux is good with hands cause they trained tons of hands to the point where children got adult hands ;p who knows what othewr optimizations were done.
I think the massive inFLUX (pun intended) of Chinese open source models is centered around two things:
- It is the best way to get non-Chinese to use them. After all, if these were some models locked behind some Chinese ChatGPT equivalent, the use would be a fraction of what it might be.
- There is a massive US vs China war on the AI front, and their efforts are very much in the good graces of the CCP. If you look at just the number of papers publlished on AI, China is actually ahead of the US in the last 12 months.
I mean frankly, it is much the same with local LLMs
The best right now is hands down the Qwen models by Alibaba
it isn't even close anymore
In fact, just now Qwen3 80b MOE was just released and it is an absolute beast
The commercial models by the US are still king of the hill by a good margin though. ChatGPT5 is no.1, and then it is between Claude and Gemini. So the Open Source front is where they have the most chances to shine
I have a question what model or lora could be closes for this art style ?
A photorealistic masterpiece, shot on Arri Alexa, cinematic it, shaking his head furiously. The camera is handheld, with a slight, almost imperceptible shake, enhancing the raw emotion color grading. A bald dwarf who is an exact likeness of Cristiano Ronaldo is sitting on the floor of a bedroom of the scene. The visual style must match the provided reference images, focusing on gritty realism, deep shadows, and des. His face is a mask of visceral anguish and pure rage, with hyper-detailed, wet skin and realistic tears streaming down hisaturated colors.
a bald dwarf who is an exact likeness of Cristiano Ronaldo/imagine prompt: a cute pink chick with big muscles, wearing green kung fu clothes, 3d anime style, ultra realistic, cinematic lighting, standing in a dojo
I'm not sure, I have made about 90 images for free with Hunyuan 3 today and I am still going.
I have been experimenting with upscaling Hunyuan 3 images with Qwen as I think 1024x1024 is way to low res.
想問有沒有人是用中國的模型 會不會有被閹割之類的狀況呀
I'd like to ask if anyone has used the Chinese model and if they have been castrated or something like that.
There are difficulties in generating R18 content.
Create a modern scientific laboratory scene with clean white counters, chemical storage cabinets, safety signage (like PPE reminders and hazard symbols), and realistic lab equipment such as microscopes, beakers, and fume hoods. Include subtle lighting and a slightly dramatic tone to suggest a challenge or escape room atmosphere. The layout should be modular and clear, suitable for overlaying interactive hotspots or puzzle elements.
create a future robot on a new earth
hi can i use SD3 to edit the color grading in a photo i upload?
I really hope someone deep dives into what the heck happened with the SD 3 / 3.5 arch one day
this is SD 3.5 Large on the left and SD 3.5 Large Turbo on the right, same seed, same prompt
I never believed any of the issues even with the original SD 3.0 had anything at all to do with "censorship" but like rather
there's definitely some really weird deeper technical issue problem caused it to be the case that distilling 3.5 Large into 3.5 Large Turbo actually significantly IMPROVED the coherency and pixel resolve (and almost completely elimated the strange edge artfacting problem) as opposed to the opposite (and no one will deny this is the case if they actually do enough seed-to-seed comparisons between the two, I promise)
there's numerous questions to be asked here no one has ever answered to date 
turbo variants are often "better". Same happens for SDXL. I think they do some dpo with the distilling
How to generate image?
I installed SD 3.5 Large but run in error, I think my method is wrong, could some point to the correct way to download and install?
try asking in #🤝|tech-support
Ok thank you for your reply and sorry for the wrong post
Photorealistic rendering of the letters why4e, make the letters readable but broken, like the wreckage of a spaceship, with a dark, gloomy space background, traces of a dying explosion
/ generate promt: dark contrast noir photo realism with detective and ufo
photograph of [object], [details], [environment], professional photography, 50mm lens, f/1.8, natural lighting, high resolution, sharp focus, detailed texture
You have to look for nsfw variants then. Check encoders and so on. At least I found things for Wan but haven't found any practical use for it with my limitations.
𝙃𝙚𝙡𝙡𝙤
animation
how to generate images with propmts?
No data source is currently selected. Please choose a data source from the dashboard and try again.
paint a rabbit
🐇
Paint a crab
🦀
paint a rabbit
Draw an orc fighter
draw me
/genrate test
[F]
how to create image with an existing image?
Probably use something like Nano Banana and tell it the modifications you would like
All is quiet on the image front I guess. for my own ends, for pure creation I think Hunyuan has the edge (though it is impossible to run locally, being much too big).
For editing Nano is best
at least for large transformations
@gilded stone Instead of supplying empty latents to the KSampler, use a vae encode node to convert your reference image to latents. Connect those latents to the Ksampler. Then lower the denoise to 0.5-0.8. Your original image will shine through, modified by your prompt.
i've been doing some analysis/testing of sd3.5 large over the past months, it seems something really nasty happens at MMDiT block 35. best i can tell, block 35 has the strongest influence on making the speckled greebled texture that's common with outputs. maybe the growing values also have something to do with the poor quality? (i'm not studied enough in the math at play to make a full evaluation of it)
(attached img: this graph is a single step (0 of 24) of sd3.5 large under bf16 unet)
under fp16 unet, which is what comfyui runs as default with the sd3.5_large.safetensors file, l2 hits inf
(@devout schooner, you mentioned you were interested in some analysis on sd3)
For some reason, I'm not convinced yet that that is the primary or direct cause of the speckled visual effect on outputs.
If that is the case, and if it was easy to isolate this issue, we would likely have had a new custom model available by now that could counteract this issue. I agree, I've noticed a strange effect with that specific latent. However, it's difficult to say with confidence what the exact cause is, given the complexity of these algorithms.
@quick pelican
I partially drew that conclusion from using the skip layer guidance sd3 node, where 35 had the biggest reduction in that pattern, I'd have to post some examples of it.
I have other test situations where I've seen higher quality results, like using 1344sq resolution, or disabling bias on various modules. I can post some of those if you're interested
It's plausible that the cause is latent 35, all I'm saying is that I'm not yet convinced it's the most probable cause. Also, it makes sense to me that skipping the first and last stages of a diffusion process would cause quality improvements, although it should be worse on average given the greater noise present.
I'd be interested in learning more about this as I'm actively doing research in this area still, but we should really have a real conversation, at least for a moment if that would be fine with you.
I don't doubt it's involved, but I'd be curious to know what other tests would determine its primary cause, I suspect the cause could be in the algorithm itself rather than in a latent and the issue is only exacerbated by block 35.
@quick pelican
yeah, you can dm me or continue here. I've been busy/symptomatic lately, apologies for the delay in posting more info
Imagine out of the blue stable diffusion drops SD4. XD
there's certainly a new framework they could use
https://arxiv.org/abs/2510.02300
i wonder if it'd prove to be better than flow matching in large datasets/models
We introduce Equilibrium Matching (EqM), a generative modeling framework built from an equilibrium dynamics perspective. EqM discards the non-equilibrium, time-conditional dynamics in traditional diffusion and flow-based generative models and instead learns the equilibrium gradient of an implicit energy landscape. Through this approach, we can a...
anyone know how to get u_t ^{theta} from flux api ?
or flux mini
im trying to setup post training experiments
hi
Are you asking about neural networks?
This is not the correct forum to ask that, you should go to a programming community or a dedicated Flux forum for better advice and expertise.
@rustic bramble In the past, in a conversation about machine learning, I found a lot of the people I was talking with couldn't give me a good explanation of how back-propagation works. What is your level of technical knowledge around neural nets?
It depends on what you're using. What library are you using? Are you using TensorFlow? Pytorch? Are you a PhD student or is this for your day job?
Are you using PyTorch? Tensorflow? Keras? Do you want U theta to be a weight in the net? A gradient?
by utheta i mean just the vectorfield representation of the ODE
and not score parametrization
I don't really know what you're talking about. Are you working through a research paper or something?
I am familiar with the mathematics of gradient descent and back propagation in the context of neural net training, but I don't know what "vector field representation of the ODE" means.
It's also not that surprising that most people you talk about aren't able to give you technical details on how neural net training works. Neural net training is highly abstract and complicated, I've seen a lot of PhD students and people of that level struggle with it.
If you're talking about a system of autonomous ODEs, I have also studied that, but if you're taking about the use of ODEs in a neural network I am not familiar with that idea.
@rustic bramble
lol, is he a bot?
Alexander. Sometimes it's hard to say if someone is a bot or just not a native speaker 😅
This is your own epistemic bias at work, my guy. And the reason you believe it sounds like AI nonsense is because you've trained your perception to flag certain lexical patterns as synthetic.
let's not go that way.
@muted cargo Bruh
@Mods?
oh noes... i remember when this place was so active lol
Emad do something! yeah, yeah i know he doesnt work there anymore

I'm trying sd3-Turbo on my AI Platform for simple picture generation. I want to keep it cost effective for after the rollout in April. I am getting wolves with three ears, two headed bunnys, dragons that are breathing fire not from the correct end. Is there a more ... responsive version?
Okay ... all 20 pictures were complete failures. I'll switch to a different model, perhaps their flagship version, been around and tested longer. But this needs to not be a thing.
A woman reclines across a slab of warm stone near the edge of an abandoned quarry at night. Her upper body rests on one elbow, legs bent with grace. Her bare skin catches scattered beams of light that fall from a distant industrial lamp. A long piece of translucent fabric runs beneath her, catching subtle folds of shadow.
Technical Notes:
Lens: 50mm, aperture f/2.0
Lighting: Hard backlight + diffused fill from below
Camera Angle: Side-profile at ground level
Color Tone: Cool with amber accents
Atmosphere: Quietly mysterious, cinematic
Hi someone help me pls?
I dowloaded forge webuii but when i want to generat image i get a error like this
Yeah i know that. But i don’t know how to solve this problem
Hence the use of Google to read and apply potential troubleshooting steps and solutions.
I’m not that smart as you i need more help
You don't need to be smart, you need to read the links and do what they do to see if it works.
Which one?
Start with the first one, then go down from there if it doesn't work. This is how basic troubleshooting works. You try something, learn something, try a different thing if it doesn't work.
Literally everyone can do it.
Nobody is born with the solutions in their brain already.
I believe in you.
My English is bad, it would take a tons of time to read the links, my brain is already burning😩
I had this problem and decided to hire help from fiverr. Otherwise I would have been at it for months trying to troubleshoot. I watched the guy dealing with dozens and dozens of errors of various kinds. Had I not hired, I could have probably eventually gotten it working, but it was about getting it working in 6 hours vs 6 weeks or 6 months
6 hours because i had him install a whole bunch of things very specific. the basics would probably be a lot less
The downside is I didn't learn how myself, but I also work full time and have a bunch of other things going on so for me it's about time
Try the pinned guides in #🤝|tech-support
Use Forge Neo... Old forge man is no longer updating the stuff.
Aktivald a windozd bazdmeg! XD
Hi! I train FLUX'SDXL Face lora for ur realistic AI influencers. Happy to share workflows, tips, and examples 😊
https://www.behance.net/gallery/243708697/Stable-AI-Influencer-Private-Flux-Face-LoRA
a tree
A dog
Anyone else has the feeling that if sd3 was a good model it would have been very good?
Hello guys please help me out with these images I have a very low PC I can't run local stable diffusion and I do not even know how to write this image high quality prompt please if anybody knows please help me out this it's sucks me from last two months please
Hi
I understand how frustrating that can be. If your PC can’t handle local Stable Diffusion, I can help you generate high-quality results using optimized workflows or cloud options no heavy setup needed on your side.
You can share the images or idea you have, and I’ll assist with professional prompts and output. Let’s get you unstuck.
oh isnt cyberdoll tongue has like two halves?
or the lack of that line?
hmm.. cannot recall
blue moon
Kann ich jetzt hier was erstellen
#artisan-faq or use online services from reputable websites or create stuff locally
moderate pls
#🏞|general-with-images Cinematic, surrealist medium shot of a vintage 1970s cream-white sedan partially submerged in deep, dark teal water. Thousands of fresh, vibrant flowers—primarily ranunculus, dahlias, and baby's breath in shades of peach, soft pink, cream, and pops of orange—are overflowing from the car's windows and hood, floating out onto the rippling water surface. The lighting is moody and ethereal, featuring a soft misty glow from the background and shimmering reflections on the water. High-grain film photography style, 35mm lens, shot on Kodak Portra 400. Deep shadows, realistic water textures, melancholic yet beautiful atmosphere, hyper-detailed floral petals and rusted chrome accents
I don't know they're right.
I can't say they're wrong.
Yo m’y bro how do you use sd?
not gonna lie I thought you were one of those bots at first :p
I mean ... what are you looking for exactly ? Can help on the details but it s hard to tell with such a vague question.
To use sd either you install it locally, you use this server #artisan-1 /2/3 channels or use one of the trusted online services to either generate stuff or rent a gpu to run your own client on it.
Everyone does lol
It's what's for dinner.
Okay man thanks for the help 🙂
stop bringing politics in this buddy
Is that a rule?
Of course it is, rule #7.
Where bot?
/generate an human
generate an human
generate a bright brown alien face
#🏞|general-with-images generate a bright brown alien face
/generate a bright brown alien face
/generate blue alien head on white background --square
/generate alien landscape scribble, 8k, highly detailed, best quality -malformed hands
Found a solid free alternative: draw.freeforai.com. It's a web-based SaaS, so no login needed. Best part? Completely unlimited and free with no watermarks. Perfect for when you just want to whip up an image fast
Dawn of the Divine Archer
/generate alien landscape scribble, 8k, highly detailed, best quality -malformed hands
can I take it for free jk 😝
That would certainly prove it's not spam. 😁
Generate a cartoon picture
:/imagine prompt: a cyberpunk cat with neon lights, wearing sunglasses, 8k, hyper-detailed --ar 16:9,
/generate imaginative photo of cryptoman dark blue on white background NeoBlessVerseGenerate
Doesn't work like that here
/generate imaginative photo of man dark blue on white background NeoBlessVerseGenerate
And now
Most people generate locally here
I recommend using cloud services if you don't have a GPU such a civitAI or any of the comfyUI Integrated services thru API
Though those also cost money unfortunately
Appreciate it
/generate image of donald trump dancing with benjamin netanyahu
That doesn't work like that here
i was joking lol
正向提示词 (Positive Prompt):
A magnificent ancient oak tree, intricate bark textures, glowing moss and tiny bioluminescent mushrooms growing on the trunk, soft volumetric sunlight piercing through dense green leaves, cinematic sunbeams, floating dust particles and magical spores in the air, macro photography perspective, hyper-detailed, 8k resolution, photorealistic, masterpiece, depth of field, sharp focus on tree bark, rich textures, award-winning nature photography.
反向提示词 (Negative Prompt):
(worst quality, low quality:1.4), blurry, smooth texture, plastic look, deformed, out of focus, artificial, cartoon, drawing, illustration.