#sd3
My prompts are usually very vague, I do like "style, character, place, doing this"
same thing with streets and alleyways, if you don't specify a geographic location you tend to get the same background
A single piece of information for everything
well the models can't read your mind, so they go with the law of averages
Plug a neuralink and run SD4 
ya that's what i'm saying, it feels like the average from the set is very specific and not as varied as the other models, but that's probably what they keep pointing to with 2b training
reminds me of this:
https://www.reddit.com/media?url=https%3A%2F%2Fi.redd.it%2F0joyb4oqx2j21.jpg
from way back showing composite average faces by ethnicities
hi
that's why it feels like you have to tell it every nuance of the image
that's the whole point, it CAN mostly handle that
lol
otherwise, it will go to a de-facto law of averages version
I guess I find its average boring
well blame the internet and the dataset then lol
lol
aztec for rawr
maybe try adding in random tags like (facing forward | turned to the side). i can't remember the exact format for comfyui, but it will randomly pick between the options. it might be that they need to be [a | b], maybe it's curly {}, dunno, but you'll know when it works. just test a prompt like "a (red|blue) ball" but try the different () {} []
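A minimal sketch of what that kind of random option picking does, assuming the curly-brace form `{a|b}` (the message above is unsure which bracket style the UI expects, so the delimiter here is an assumption). `expand_choices` is a hypothetical helper, not part of any UI:

```python
import random
import re

# Matches the innermost {a|b|c} group (no nested braces inside it).
_GROUP = re.compile(r"\{([^{}]*)\}")


def expand_choices(prompt: str, rng: random.Random) -> str:
    """Replace each {a|b|c} group with one randomly chosen option."""
    while True:
        match = _GROUP.search(prompt)
        if match is None:
            return prompt
        options = [opt.strip() for opt in match.group(1).split("|")]
        choice = rng.choice(options)
        prompt = prompt[:match.start()] + choice + prompt[match.end():]


if __name__ == "__main__":
    rng = random.Random()  # unseeded, so each run can pick differently
    for _ in range(3):
        print(expand_choices("a {red|blue} ball, {facing forward|turned to the side}", rng))
```

Running it a few times on "a {red|blue} ball" is the same test suggested above: if the colour changes between runs, the syntax is being picked up.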
maybe, but again, you have to not be lazy and need to specify things more. which is a good thing. in previous models, they didn't comprehend as well, so you got a lot more image variance between seeds. i know people meme about it hard, but it really is a git gud issue as one of the devs said
think of what a full body portrait photo means inside your visual thoughts
99% of people will think the same thing: facing the camera for a professional portrait photo
I'm not saying it's wrong, I'm saying it doesn't like to vary unless told, which is different and good, just cumbersome when you have to tell it every single thing to change
blow the prompt up with an LLM then. i've seen good results with even prompts as large as 500 words (in theory, you can go all the way up to 10k words, but that would just drown everything out)
for a simple prompt direction like that, that's exactly what you want. it's on you to then add in the details and specifics to flatten the curve out more, which will give you far more variance
jfc, i can't believe i just called visual thoughts "head movies"
brain scenes 
My Datsun!
Gaga
I'm currently testing DALL-E with more lenient filters and realize that the gap is very large between SD3 2b and the current DALL-E. This is very disheartening
Dall-E is more artistic. And it's very good with anime and Artists' styles
SD3 is more realistic
SD will be better
SDXL is good for ART and everything. And with some finetuned models it's miles ahead of SD3
DM me
It's not about realism, but about understanding and constructing complex concepts, especially when it comes to the anatomy and expressions of characters, as well as their interactions
one undertrained model vs trillion dollar model
I find 8b api way wayy more fun to use than anything I'd get from dalle, 1.5 or sdxl 
the newest jugger is good
juggernaut XI preview
I made a bunch of sci fi cities with it
just gotta wait for juggernaut SD3 
yeah wow

Dall-E using GPT for that. It can be achieved with SD3 as well.
tbf sd3 is better and bigger than dalle, it's just that ppl suffer from skill issues
not sure that sd3 is bigger
No, because SD removes all vulgar content from its datasets, while DALL-E used petabytes of pornographic content for training
SDXL vs SD3 (Right)
dalle could be something crazy like 40b
If u need rly good, non-shiny realistic traditional looking ART, SDXL and Dalle-3 are still the best models for that.
sd3 makes better steven seagals
they banned Seagal from dalle

Where are the 4GB of VRAM?
macho Seagal
Why do people think that DALL-E is trained the same way as SDXL? DALL-E 3 is like SDXL's best and favorite refined version with every favorite LoRA baked in. It's a look.
Try getting a shitty image from DALL-E. You can't. It's always artistic and beautiful. It caters to a look. I am not sure that's very flexible.
Meanwhile, SD models are ALWAYS a great base to start from ...
If you like DALL-E , stop using SD3 immediately. SD3 2B might be lacking in some areas but it's still far more "normal" and flexible than other models. In [my] opinion for [my tastes].
There are things where SD3 is better than Dall-E3. So I'll keep using both for different needs.
haha you funny guys
SD3 is better than dalle at watercolor, SD3 produces more realistic watercolor paintings + it's great at photorealism and digital art style
Dall-E 3 vs SD3 (Right).
SD3 is superior when it comes to a realistic watercolor look.
Prompt: Real watercolor painting of a blueberry cake slice with a mint leaf on top
try with this prompt : Real watercolor painting of a girl sleeping on the floor next to a giant cake slice

Well.... Of course dalle will do better here. I was just comparing the watercolor looks.
Dalle-3 x SD3 (right)
Real watercolor painting of a girl sleeping on the floor next to a giant cake slice
the cake is a lie
dalle doesn't look watercolor
they are turning humans into cheesecakes
Real watercolor painting of a girl sleeping on the floor next to a blueberry cake slice on a plate with mint leaf on top of it
Dalle-3 x SD3 (right)
american flag cake
Have you ever actually done watercolor painting? Most people just think it ends up looking like simple shapes with little blending and few colors used, but it can be extremely detailed if you want it to be
But I'm not about to preach art literacy at 9am lol
yes depends on how wet the tip is
hehe that's what she said
It depends on a bunch of factors more than just that, but at least you now understand the tip of the iceberg
you can also use a big wet brush if u wanna make it look messy
Juggernaut has moved to Pixart i think.
When SD3 hits, it hits well...
SD3 really really likes 5 legged horses... oof..
SDXL x SD3 x Dall-E 3
Prompt: Real watercolor painting of a cat sleeping on the floor next to a blueberry cake slice on a plate with mint leaf on top of it
ugh. Hands
6 
yes it always makes five fingered hands too, it's obsessed with 5 of things
wut
I tried for a snake with many heads didn't go well
It must of been you, cuz sd3 don't make no mistakes yo
For sure, I didn't use 500 words from chat gpt ultra extreme llama 90b to describe the image that's why it's skill issue on my part
Yeah this one is even worse... typically 5 tho... 4 is rare
You need to up your skills
git gud! 
Either that horse is really short or he's kneeling.
FOO RRD, my favorite truck brand.
Do elephants on the back
now the dogs are too tall
They are on top of stuff!
I don't know what's going on the left side
Says someone who is using comfyui.... I'm not using it unfortunately, have to use limited HF spaces for it. There aren't many options to make ur generations super good.
just because it's complex doesn't mean it's "super good"...
Batman's ashy...needs to lotion up.
It's all salt cx
looks like snow
The pic generated with ComfyUI will obviously be better. So...
Minor detail...
Salt out of this world
Lmao
hey look it's the Twitter belt
This woman lying on grass picture was too sexy not to set spoiler tagging on.
A potato will outrun A1111 and ComfyUI... I am sure of it... Potato3.0
I get the feeling ComfyUI is mainly used by people that think they know what they are doing. It really feels like the trend right now - there are math guys that really know what's going on, and 80% of the AI community thinks they know what's going on "oh add this to your prompt" "oh use this node in comfy" when it's just all random BS LOL
Oh COME ON!!!!
constant conjecture and hearsay
Anyone can use ComfyUI; it's really not that difficult.
It's like a cactus
he got one of those fungal infections
I am using one of them newfangled LLM things...
ComfyUI vs HF space SDXL
You can see how much better actor Chris Tucker is
Wish version
My hardware is limited, so I have to use spaces and sites for generations.
Stop being Poor
Skill issue.
This is a JUDGEMENT FREE zone brother... (or sister)
license plate
It's okay to be poor, we rich are the yin, you're the yang, we need you!
Elephant consumed
Bloo
Your thing really loves to double those trunks
maybe put elephant trunk in negative
or turn up cfg
Neyitiri
and legs...
Leech Cat
4.5 cfg did not improve things ๐ฆ
What is your negative
luma+sd3 
luma is pretty cool
@desert garnet have you used luma on Steven Seagal gens yet 
i'm out of free gens
Holy moly... we're almost there...
another one
Yeah, but it takes an eternity to generate 1 vid. Waiting for gen-3 now, will be out in a few days.
it was faster on the first day, i guess they nerfed it
How long do they take
i have one that's been stuck for 1hr
Actual process: 1 minute, but you can buy this premium service! 
The demand is very high atm, it took me like 5hrs to generate 1 vid.
That's pretty looong 
4 hrs in queue already...
Gaga!
NVIDIA GB200s are ExpenSiVo
Yes, it's the draconian queue
hmm all 3 text encoders kinda seem to ruin the image, is t5 alone the best?
awesome frame, was that in the prompt?
I wonder if the AI would understand Vader and/or Anakin stating his opinion about sand
And know the reference
Maybe that one model that uses a full 2B llm for text encoding knows?
inpainting isn't quite as powerful as I remember with SDXL. with SDXL it would always make the new inpainted object fit perfectly into the scene. It seems a lot more finicky with SD3 to "hide the mask" and it doesn't always just fit it in
It wasn't perfect with SDXL either
The best state of inpainting I think was SD1.5 with inpaint controlnet, you could turn any finetune into a great inpaint-capable model without changing anything about the model's capabilities
I usually had really good success with it. Anyway, at least I figured out how to do it in Stable Swarm UI, watched a 90 minute tutorial vid last night on how to use Stable Swarm
Personally I prefer just ComfyUI
it's probably more because the terms constantly change between UIs. Some UIs call it "add object", some call it "unsampler prompt" or something.
StableSwarm just adds a questionably convenient UI layer on top of Comfy
Nope... (but then again, I did not check the LLM result... let me check)
Yeah, one of the reasons why I use Comfy, it uses technical terms for everything so there is no confusion once you learn the basics
I just don't know ComfyUI that well, there are a lot of nodes, plus in Stable Swarm, if I do go to the Comfy tab, it doesn't seem to always equal what Stable Swarm is doing (even if i say "import from generate tab") and pressing the 'Queue image' button in the comfy tab doesn't seem to do anything
Probably because of the way StableSwarm is written
Here is the prompt generated by the LLM
Best quality, masterpiece, Epic, cover-art, : The image is a captivating scene set within the enchanting realm of Renaissance, where a beautiful girl dressed in exquisite garments, adorned with delicate embroidery, gazes towards an ancient castle perched atop distant hills. Alongside her are two playful kittens and a loyal companion, a friendly dog by her side. The sun sets upon the horizon as soft clouds gather above, casting a mystical glow upon the landscape below. Intricate brass elements accentuate the setting, while fine details of nature merge seamlessly with human creations. Every stroke is executed in the mesmerizing style reminiscent of J.C. Leyendecker, his signature artistry captured through vivid colors and intricate patterns. The composition exudes a sense of wonderment, inviting viewers to immerse themselves into a world where fantasy intertwines with reality.
Anyways ComfyUI seems harder than it actually is, this is all the parts of a basic workflow:
1) Loading the AI model into the program
2) Positive prompt and negative prompt
3) Empty latent image (a format that lets the AI understand an image) for the AI to generate into
4) The node that runs the AI, with the same settings as normal A1111
5) Turning the latent into a normal picture
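Those five parts can be sketched as a node graph in the dict layout that ComfyUI's API-format workflow export uses, as far as I understand it. The node class names below are the stock ComfyUI ones; the checkpoint filename, the sampler settings, and the `dangling_links` checker are placeholders and assumptions for illustration:

```python
def build_basic_workflow(positive: str, negative: str, seed: int = 0) -> dict:
    """Build a basic txt2img graph; links are [source_node_id, output_index]."""
    return {
        # 1) load the model (outputs: 0 = MODEL, 1 = CLIP, 2 = VAE)
        "1": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": "example_model.safetensors"}},  # hypothetical file
        # 2) positive and negative prompt
        "2": {"class_type": "CLIPTextEncode",
              "inputs": {"text": positive, "clip": ["1", 1]}},
        "3": {"class_type": "CLIPTextEncode",
              "inputs": {"text": negative, "clip": ["1", 1]}},
        # 3) empty latent image to generate into
        "4": {"class_type": "EmptyLatentImage",
              "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
        # 4) the sampler node that actually runs the model
        "5": {"class_type": "KSampler",
              "inputs": {"model": ["1", 0], "positive": ["2", 0],
                         "negative": ["3", 0], "latent_image": ["4", 0],
                         "seed": seed, "steps": 28, "cfg": 4.5,
                         "sampler_name": "euler", "scheduler": "normal",
                         "denoise": 1.0}},
        # 5) turn the latent into a normal picture and save it
        "6": {"class_type": "VAEDecode",
              "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
        "7": {"class_type": "SaveImage",
              "inputs": {"images": ["6", 0], "filename_prefix": "basic"}},
    }


def dangling_links(workflow: dict) -> list:
    """Return (node_id, input_name) pairs whose link points at a missing node."""
    missing = []
    for node_id, node in workflow.items():
        for name, value in node["inputs"].items():
            if isinstance(value, list) and len(value) == 2 and value[0] not in workflow:
                missing.append((node_id, name))
    return missing
```

For inpainting, the idea is that only node "4" changes: the empty latent is swapped for an image that has been loaded and encoded into latents.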
you don't have to draw masks you can use object recognition models or face recognition models
hey where u got this model? i tried to use pony too but this turned out ugly each fricking output
Latent/Checkpoint/+Clip and -Clip/VAE in/KSampler/VAE out/Save
You replace that input "empty latent image" with this, loading an image with a hole in it and turning it into latents
Visit the examples page:
https://github.com/comfyanonymous/ComfyUI_examples
All kinds of examples to draw from.
ok..but does that mean I need to draw the mask with an external program? Kind of a hassle..
No
This one, personally I use DPO variant because it's a very slightly better version when it comes to understanding prompts: https://civitai.com/models/288584
comfy can draw masks for you
you just need to give it a recognition model
Ok, I'd have to look up some tutorials on the tube on Comfy. Still, seems overly complicated to me for the things I ever want to do
It's really not. It's just different.
Also remember you have to add the whole following text to the positive prompt with Pony models (it's because of a training error sadly, next version just score_9 will be enough)
score_9, score_8_up, score_7_up, score_6_up, score_5_up, score_4_up
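If you don't want to paste that by hand every time, a tiny helper can do it. This is only a sketch: `with_score_tags` is a made-up name, and putting the tags at the very start of the prompt is an assumption here.

```python
PONY_SCORE_TAGS = "score_9, score_8_up, score_7_up, score_6_up, score_5_up, score_4_up"


def with_score_tags(prompt: str) -> str:
    """Prefix a positive prompt with the Pony score tags, without doubling them."""
    prompt = prompt.strip()
    if prompt.startswith(PONY_SCORE_TAGS):
        return prompt  # tags already present, leave the prompt alone
    if not prompt:
        return PONY_SCORE_TAGS
    return f"{PONY_SCORE_TAGS}, {prompt}"


if __name__ == "__main__":
    print(with_score_tags("elf girl in a forest, watercolor"))
```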
if you inpaint a lot it's way easier to not have to draw the mask
comfy is actually easier in a few ways like that
I've used node based tools before like in Blender...and most music applications are similar.
You grab a workflow example, drop it right into Comfy to load it, then go from there.
thx
In the image load node, you draw your mask and feed that into an Inpaint VAE encode node.
no...
apparently it's a topless beach
This extension lets you right click on a Load Image node to draw mask or basic colors on it: https://github.com/ManglerFTW/ComfyI2I
If you want to play around with other extensions, ComfyUI can easily run nodes for stuff like object detection
And you can automate inpainting different things on your own
Which is essentially exactly how the face and hand detailer nodes work.
They do a little detection magic, draw a box for inpainting, and then go to town.
Anyways, I once made this schizo inpaint pipeline. First, the nodes on the top left can auto-downscale an image if it's too big, because I felt like doing this manually using math nodes instead of getting an extra extension for it (later it turned out I already had one). I used lineart controlnet so that the model is more or less aware of the image composition even at the very start of the generation, which helps avoid the issue where the inpainted area doesn't fit, and reference controlnet so that it automatically copies the art style of the original image perfectly
The results are very nice, as messy as it looks here
what's better? at the end or start
I believe it was trained with score tags at the start
I recommend joining Pony Diffusion Discord server, or at least reading the original Pony readme, they have some extra instructions and such
What are your results? Also maybe let's move to DMs to stop spamming here, and to not have slow mode
xd ok
make sure to use the correct sampler, Pony is very picky about it
added "framed" to prompt
we could very easily see what's wrong if you give an image with png info
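For anyone wondering what "png info" is: generation settings are usually embedded as text chunks inside the PNG file itself. Below is a minimal stdlib sketch that reads plain `tEXt` chunks. It skips CRC checks and ignores compressed (`zTXt`) and international (`iTXt`) chunks. Which key a given UI writes is an assumption on my part (commonly `parameters` for A1111-style UIs, `prompt` and `workflow` for ComfyUI), so check the keys you actually get back.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data: bytes) -> dict:
    """Return the tEXt key/value pairs found in raw PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    texts = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, body, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            key, _, value = body.partition(b"\x00")
            texts[key.decode("latin-1")] = value.decode("latin-1")
        if chunk_type == b"IEND":
            break
        pos += 12 + length  # skip length + type + body + CRC
    return texts


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as handle:
        for key, value in read_png_text_chunks(handle.read()).items():
            print(f"{key}: {value[:200]}")
```

Note that many chat apps and image hosts strip these chunks on upload, so the original file is needed.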
you have to get the prompting exactly right for that model
pony is a weird combination where
the fine tune training was done very wrong in many ways
but because the dataset is so massive, with decent tagging
it still came out as a very useful model in some ways
Still. Fooocus has the best inpainting tools I've ever used. Even in Stable Swarm I can't really get an "add this object and blend it with what's already there" working quite yet. Maybe I'll figure it out. It usually wants to replace the masked area entirely with new stuff
I am thinking perhaps the "Unsampler Prompt" does this but have to test it out
in Fooocus you could choose to add new item and it would layer it over the top of stuff, etc. Seemed way easier to deal with
it looks like the "trick" to having it layer more "appropriately" is if you use the same seed, at least the underlying image stays the same inside the mask...Well anyway, experimenting
Does SD3 have inpainting?
people have ported llly's code to comfy UI nodes. you could jigger that up to work inside stable swarm
Good-ish
that black dressed girl is having a hard time on that tree 
I'm experimenting with a 2 sampler workflow
generating a base composition first, with high shift and few steps
then dial in details with img2img using a different seed
I'm addicted to pure black background
help
the seed doesn't really matter towards the quality or success of that process. it just provides initial noise. Your second img2img pass will start with different pixels so it effectively has different initial noise already.
most likely not bigger than DALL-E: DALL-E 1 is a 12B parameter model and OpenAI doesn't know how to make reasonably sized models
I swear I've seen the same seed resulting in a burned image sooner... I just tested this with the spacecat and of course it does not happen when I want to demonstrate it...
heh. I know the feels. the seed really only determines the arrangement of noisy pixels at the start of generation though. It's easy to get stuck into confirmation traps when there's so many settings that affect everything.
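A toy illustration of that point, using Python's stdlib `random` as a stand-in for the real generator (actual UIs draw latent-space tensors with their own RNG, so this shows the principle only). `initial_noise` is a hypothetical helper:

```python
import random


def initial_noise(seed: int, width: int, height: int) -> list:
    """Return a height x width grid of Gaussian noise determined only by the seed."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(width)] for _ in range(height)]


if __name__ == "__main__":
    a = initial_noise(42, 4, 4)
    b = initial_noise(42, 4, 4)
    c = initial_noise(43, 4, 4)
    d = initial_noise(42, 8, 8)
    print("same seed, same size  ->", a == b)                # identical starting noise
    print("different seed        ->", a == c)                # different starting noise
    print("same seed, bigger size->", a[1] == d[1][:4])      # rows no longer line up
```

The seed carries no quality setting of its own. It fixes the starting noise, and the same seed at a different resolution already produces a different noise layout.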
It did not happen because I added an upscale to 1048x1048. This is refining with the same seed and not upscaling:
Hmm, so you still don't believe me that the seed only affects the initial noise positions. Alright. Not my problem to have.
good luck with your testing
I did not... guys you all need to chill... look at the first picture and compare it to the second one... you see a difference? The difference is the seed and the upscale. so if I change the seed or upscale I don't get that burned effect. That is all I am saying.
oh ok. good luck then. sounds like you're figuring it all out fine.
more people need to play minecraft to understand what seed really is lol
the number itself isn't a setting, it's more that specific seed with those specific parameters
Aw shit, here we go again.
This looks nuts
Ppl should stop complaining and do more art... they are beginning to sound like a broken record... never seen so much of it, only in the SD community... it's crazy, and on top of that SD told everyone there was not gonna be NSFW yet they act surprised about it... crazy... I agree
People who make money on NSFW stuff are definitely affected by this. Lol. I mean, if u rly wanna keep doing nudes, there are tons of SDXL finetunes for that, they produce rly HQ realistic photos. My only cry for SD3 is not understanding most of the artists' styles and real 80s Anime. The rest is all good. The model is more than usable.
Ah, and the anatomy problems as well...
LMAOOOOO
what happened with the custom one it looks 240p 
or is that not sd3
ooh
I'm dumb runs away
pixart sigma finetune on midjourney images. Results got better
A dark ghostly ice man, reminiscent of a White Walker, stands on a snowy mountain top in the midst of an ice and snow storm. He has long white hair, glowing eyes, and is swinging a long white glowing ice sword, attacking the camera. He also wields a glowing spear. His light armor is made of roots and leaves. The scene is set in darkness, illuminated by moonlight with cinematic and volumetric lighting, but with less backlighting than before. The composition is epic and photorealistic, with a bokeh blur effect. The atmosphere is mystical and menacing, capturing the essence of Nature's mystical beings and the devil spawn (masterpiece, 4k, octane render, volumetric lighting, perfect lighting, perfect picture, best quality)
?
I'm giving you a hard time because you use a dall-e prompt and a character from game of thrones
That's like certain death in some circles
i just used the prompt someone else posted in this channel with sd3. I wanted to compare the results with sigma. Last time i used another finetune. Now there is a new one finetuned on midjourney images, hence the post
in the sand castle kicking club maybe?
I like Dall-e myself, not so sure about Midjourney. Seems like something you have to use on Discord is a major NO from me no matter how good it may be. I now want to try that prompt for myself however.
sd3 2b did not do a good job last time with this one, however 8b from the api nailed it
seems 2b might have some oversaturation issues
Do Darth Vader eating a sandwich while contemplating the absurdity of his own existence.
sure
best of 4
You really don't get that whole contemplation feeling however with 2b
At least you have a whiskey glass with water in it.
the poor man can't eat his BK Chicken with that mask on
You have the little tin cup of flowers in the 2nd image. Which maybe says " I am ok with my life right now"
the mask is messing up his speech as usual
what is it with sandwiches? yesterday, it was barbarians looking for sandwich thieves, today, darth is inviting you all to lunch
Darth Vader eating a sandwich while contemplating the absurdity of his own existence (masterpiece, 4k, octane render, volumetric lighting, perfect lighting, perfect picture, best quality)
could not manage to get the contemplating part
2nd one has Warhol eating a burger vibes
first one is from midjourney finetune. I switched for the other gens because it gave me giant sandwiches every time lol
He is thinking hard in these
beautiful
that's funny because he has a mask and he can't eat with the mask -_-
(darth vader using a lightsaber to spread mayo on a piece of bread)
can't even do a fork laying down
I'm not into NSFW and ppl don't really make money on it... unless they are pron bottling on pron hub, and that's even more disturbing because of the c- exploit complaints, and the abuse of NSFW is what has ppl wanting to petition for it to stop... I'm almost positive it's illegal too... glad SD has made it a safe model... they actually said that it won't be good at anatomy, it's more of a photography/landscape model... the data sheet shows everything... it's a great model once u understand it...
i did say that the issue affects everything, not just humans
Looks more like Leonardo AI generator
i can't get it to keep the burger in the bag, but this is so much closer to the video setting. i love prompting sd3. it always surprises me.
doesn't seem to know the classic heinz bottles so well though and leans towards modern consumerism big bottles
Here's what my custom bot said >>>> The metadata extracted from the image includes the following details:
• sRGB: 0
• Gamma: 0.45455
• DPI: (95.9866, 95.9866)
This metadata provides some information about the color space (sRGB), gamma correction, and resolution (DPI), but it does not contain specific details about the tool or software used to create the image. Therefore, it does not indicate whether Leonardo AI was used to generate this image.
holy you want some burger with that ketchup there anni?
Heinz Kernzz day fuct eviecckot burot!
day fuct it up
That is pretty good. Dall.e3 with an upscale still does it the best I think.
You have to keep CFG to 5 or less or yes, you will get high saturation. Unfortunately higher CFG also gives you better prompt following. So I say we should have a post-image step to adjust saturation and correct it in the image later
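A rough sketch of what such a post-image saturation step could look like, using only the stdlib `colorsys` module on a list of RGB tuples. A real pipeline would run this on the decoded image through an imaging library; the 0.8 factor is just an example value, not a recommendation.

```python
import colorsys


def scale_saturation(pixels, factor):
    """Scale HSV saturation of (r, g, b) tuples with channels in 0..255."""
    out = []
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        s = max(0.0, min(1.0, s * factor))  # clamp so saturation stays valid
        nr, ng, nb = colorsys.hsv_to_rgb(h, s, v)
        out.append((round(nr * 255), round(ng * 255), round(nb * 255)))
    return out


if __name__ == "__main__":
    oversaturated = [(255, 20, 20), (10, 240, 30), (128, 128, 128)]
    print(scale_saturation(oversaturated, 0.8))  # pull saturation down by 20%
```

Hue and brightness are left alone, so grays stay gray and only the colour intensity is pulled back.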
I figured as much, and chatgpt strips metadata as well I noticed, or it should have at least grabbed the ss metadata
Is it really confirmed that an improved version of SD3 is coming? Like SDXL came after SDXL 0.9?
no
Love the mood
inteeeresting. Using multiple ksamplers is helping to get around the bad anatomy
dalle3 is the best, but censored. Now that you said it, i should test it now with that prompt.
i tried now with a new sigma finetune
Ooooh getting closer to overcoming the infamous laying in grass meme xD
Anyone want to test this workflow out with me? It's working kinda well
With higher CFG
side shot please
what do you mean by that? laying on her side?
Yeah, lying straight isn't the issue. It's the other poses that are
I'm anti Bing but try Dalle XL, it's uncensored and on hugging face or git hub... way better
oh sure, gimme a sec
SD3 versions with same seed?
It's still around, ppl just forget how to do it...
First try after adding "side shot" to prompt
ah okay, gimme a sec
Yes it does, h3ll I still go the long way sometimes and not use AI at all...
These are with FlashSD3, almost similar to the ones u generated.
going to cut out the first image cuz it's kinda nsfw, but guess I gotta find a way to put her belly down hmmm
Facts, that's why I don't pay for AI... I draw and the only program I pay for is adobe cloud, PS is all u need
I only use SD, procreate and adobe programs, that's it
And Lightroom for my photography
U can edit them in Lightroom and use camera raw
Yes, it's doing really well with that.
But I wanted a seed that was chosen. The chosen seed number 1477598
Sunday Funday! What's new with SD3 today? I saw kohya has dreambooth scripts up and runnin
hmm yeah, getting other poses is way more difficult... Could probably go crazy with the prompt to fool it, but that'd defeat the point of this anyways
Draw me like one of those french ladies
oooh that gives me ideas, thanks xD
hahaha sofa is OP
make the sofa out of grass
LOL
that broke it for some reason xD
Ppl will think it's real if u don't say that it's AI, with some VSCO filters it will be more realistic.
Same seed and the 14456634 batch from start of it, i promise..it makes perfect hands
consistently creates 3 fingered hands though
Uncanny Valley
Awwww so close to greatness
Going multiple ksamplers starting from 512, 768, then 1024 is helping to overcome some "safety" measures haha
Should also mention that the first pass uses this bottom textencode to trick SD3 into making a "safe for work" image, and then throwing the actual "laying in the grass" prompt into step 2 and 3
Feels like you could use pony for the first sampler then?
my life would be much easier with an rtx 4090, but my limited specs are forcing me to stick to just SD3 to avoid hour long workflows xD
avoid the ancestral samplers
well, if that's the effect you want, they are very good for that
i'm working on a sampler/scheduler comparison chart
i'll share them with you at least
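For anyone wanting to build a similar chart, the job list is just every sampler/scheduler pair with everything else held fixed. A small sketch: the names in the example are illustrative picks from the ComfyUI dropdowns, and the step/CFG defaults are placeholders, not tested recommendations.

```python
from itertools import product


def comparison_grid(samplers, schedulers, seed, steps=28, cfg=4.5):
    """Return one job per sampler/scheduler pair, keeping seed, steps and CFG fixed."""
    return [
        {"sampler": sampler, "scheduler": scheduler,
         "seed": seed, "steps": steps, "cfg": cfg}
        for sampler, scheduler in product(samplers, schedulers)
    ]


if __name__ == "__main__":
    jobs = comparison_grid(
        samplers=["euler", "dpmpp_2m"],                # example names only
        schedulers=["normal", "sgm_uniform", "karras"],
        seed=1234,
    )
    for job in jobs:
        print(job)
```

Keeping the seed and prompt identical across the grid is what makes the hands (or anything else) comparable between cells.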
thanks for this, I was just writing a system prompt for this (not nearly as exhaustive as yours) and was scrolling up for clip_l/clip_g/t5xxl examples to feed in a few-shot style
we're not using negative prompts anymore for SD3, am I right ?
God, SD3 just has trouble with some words ugh
from all comments and educated guesses I've read recently, and confirmed by SAI staff, seems like SD3 was not trained on negative prompts
the negative prompt debate was confusing
It's crazy how good the vision module of gpt-4o is
i never use negative prompts
a lot of inpainting and had to do the text myself:
try synonyms for those words?
I'm sure I could improve it if I ran it through an upscaler it might fix the inpainting mistakes
if you increase your steps, it does help the text
SD3 sucks at doing "worms eye view" like this
oh ..derp I think I was only at 20 steps, my bad
no, it just sucks at the word "Pathetic" no matter what
it's also very powerful to ask GPT to rewrite the prompt and include some elements it didn't identify all the way or that you need to add to the composition
then try: deplorable,
feeble,
heartbreaking,
miserable,
pitiful,
woeful
so, basically, really verbose and expressive prompts are the key to enable SD3'
no
tbh, I'm really getting more improved images than my puny manual prompts
maybe I'm just bad at prompting SD3 in general..
she needs to say "Pathetic" , it's a meme
fetishes are weird. why do people insist on revealing them to the world? can't we go back to a time when people would keep that stuff to themselves?
probably a steep ask
that's fine, I'm just saying there's no 'key'.
skill issue?
how is this not the correct meme?
this is the correct format 100%
I want to try the regional prompting in Stable swarm but I have no idea how to use it. The prompt itself indicates you can include keywords but I am not seeing any effect yet
They have enhanced the meme with woman looking down over her breasts at you, it's funnier
no boobs is why. they're really important to some
because it's funnier
it's like she's peeking at you instead, and can barely even stand looking at you
please don't talk about that disgusting show
created by hedonistic disgusting people
hey chat 
The Boys is hedonistic for parodying modern society and satirising superhero tropes, but Pony XL is fine. Weird
pony makes elf girls, good model

draugrs looking good on unreal engine 5
put some long hair on them
Can you gradient hair lmao, very colorful and stuff
try with 80s glam rock wig
Rolling bones?
Sorry for poor quality but recently my dad has been sending me photos of fake products generated by AI (obvious to me not to him) hah. So I decided to show him that it's pretty easy so I made water and beach sand flavored lays using sd3!
xD
try clown hair
if anyone's using the ODE solver node we just made changes to it to fix the broken progress bars and the overall problem that none of the fixed step solvers worked at all this entire time. it'll also work properly on non-flow (non-SD3/Lumina) models.
wtf is that book of a prompt
LLM gen?
i figured, aint no way i'm writing a 500 word prompt
Here are some images I did with SD3:
Cigarette ad.
would be cooler if there was a tiny snail on it
now try making them laying on grass
Thank you.
I tried to add a snail on the leaves and I think it did a good job:
I think the second one is great too.
Oh dang, SD3 is actually going to be my new dnd item generator. img2img is highly recommended for gloves though
knew it. Can't even draw a leaf on grass. Dead model
Goliaths. I wanted to create guideline compositions based on MOSFET/GPU design schematics, but there's no controlnet support in ComfyUI for SD3?
only diffusers controlnets are available
I tried making leaves on grass, but I struggled a bit with the prompt. Here what it generated:
lol it's all good, I'm just memeing. They're high quality crispy images all around
pick the right sampler/scheduler pair, get a normal looking hand. pick the wrong sampler/scheduler pair, however ...
My friend that hand on the left ain't normal unless this apple is extremely poisonous to touch
Reagan.
Heheh
Swords!
that's Tifa...
sounds perverted
why is the morning person green
they are zombies, they didn't have coffee yet
suggesting it should be censored? aren't we past that since it's been revealed that's from pretraining?
that was somewhat tongue in cheek given your comments earlier about fetishes
it's fun to mock the problem but let's be accurate about what the problem is. misinformation is poison
tell that to reddit
can't tell anything to reddit
LOL
Do you send it to SDXL or is it trained on Gaga?
Face changed with Retake
Generated without Lora
This is straight out of Sdxl
Ok, and with only sd3?
Opendalle fine tuned SDXL
Ok, but what is made with SD3?
Oh wait... I thought I'm posting in SDXL channel
Lol. My bad...
No, you've been posting several times in SD3 channel lol. Not the first time we've seen you posting Gaga's here
Ah, okey. I was thinking we had our first female celebrity trained in sd3 for a while
far more normal than the hand on the right
What's the right sampler/scheduler pair?
Is there a lora training method for SD3?
I think people are attempting it, but I don't believe it is working yet
or if it will ever work..
Hope it works, important for the future development for SD3
prompt: a green haired boy eating a blue apple - t5XXl encoder only, no negative prompt
prompt: a green haired boy eating a blue apple - clip_g encoder only, no negative prompt
prompt: a green haired boy eating a blue apple - clip_l encoder only, no negative prompt
prompt: a green haired boy eating a blue apple - all 3 encoders at the same time, same prompt, no negative prompt
Skinwalkers
and just how far out into lala land clip_l went
Huh, what was it spitting out?
Lizzie Borden
just weird random shapes, I could never get it to draw a raw speaker
like it would draw speaker "boxes" if the cone wasn't visible, it could never draw an actual speaker CONE
The ones which are physically impossible, even with juice, are extremely popular for some reason. I figure the reason that people like exaggerated aspects of the physical body is that it's just something that one never sees in person. Sort of like superhero comics, not realistic proportions at all.
like this : this is all SD3. I could never get SDXL to do this:
Oil paintings make everything more classy, including chocolate dipped bacon ...
Wendy's bacon shake...
i like the baconator but let's be real that thing will put you to sleep
Hmm, I did it all with just one prompt, but now I wonder if I can talk about each panel set separately
in Stable Swarm it kind of looks like you might be able to make image layers, that would be useful for adding subjects in the foreground if I could figure it out
love this
the top 2 always look amazing, then the bottom two look like the artist got bored/ran out of time
I think the SD3 is horrible for any kind of angle. How are you doing low angle?
"a comic illustration: a tall muscular brunette woman standing in a forest. The camera is looking up at her from ground level in a worm eye view. "
and then just keep genning til it gets it right...
I will try. Ty
ordering some at the bar
So, character consistency, how is that done in SD3? I'm guessing the usual, describe the character in detail, the colour of their clothing, and perhaps give them a name?
I get perfect hydras with SD1.5, but not with SD3 <pout>, they are still awesome though
make it do loss
the day that CAD lost me forever as a reader. i'll always remember it. "Wtf was that comic? Out!" I think i was so upset i stopped reading Penny Arcade too
I've come back to PA a few times though. They're good guys. They've changed though.
I've discovered anime art. With lykon's sdxl checkpoint with the same stuff in it as refiner.
very wide pic :3
just like people generating big booba, very popular.. "for some reason" 
aww that's a cute one
thanks
and what did you conclude overall?
from what i am looking at personally, i like the results from steps 15 and up along with cfg 2 to like 5
nice
Loss??
| ||
|| |_
assuming by PA you mean PixArt, i saw some news on reddit, they are working on the next model with Nvidia?
Stability also works with Nvidia, and AMD
yea i saw similar conclusions from other people too
i did some tests recently with like TCG cards that have a lot of text, it seems higher steps helps to get better more coherent numbers and text on the cards, even tho it's still gibberish lol, so it really depends on the type of thing you want to generate
you can easily overcook an image.
the more steps you have, the more passes the AI will make, and the more details it'll put in. and that gets to point of no return very rapidly
modified HEAVILY by the sampler you are using (and other values)
an xy plot is the best way to determine optimal settings
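A minimal sketch of what such an xy plot sweeps over, assuming the two axes are step count and cfg and that the seed stays fixed so only those two values change between cells. The function name `xy_grid` and the example values are hypothetical, not part of ComfyUI or any other tool mentioned here.

```python
from itertools import product


def xy_grid(steps_values, cfg_values, seed=0):
    """Return one settings dict per (steps, cfg) cell, all sharing the same seed."""
    return [
        {"steps": steps, "cfg": cfg, "seed": seed}
        for steps, cfg in product(steps_values, cfg_values)
    ]


# Example axes loosely based on the ranges discussed above (illustrative only).
grid = xy_grid(steps_values=[15, 20, 28, 35], cfg_values=[2.0, 3.5, 5.0], seed=1234)
for cell in grid:
    print(cell)
```

Each dict would then be fed to whatever generation call you use, and the resulting images laid out with steps on one axis and cfg on the other.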
cfg is too high drop it to 3.5 at least
DMing you
his GF arrived
"Ollama-up baby!" The problem with a lot of Generative AI - it has all the answers - just that too few know how to formulate the questions!!! Get gud skilz ! ! !
Ollama indeed!
Try Jan.ai - free LLM generator for prompts and more. Ollama, ChatGPT (via Quality of Life nodes), Anthropic's Claude 3.5 Sonnet etc.
"Get Gud Skilz!!!" ™
The model acts like an ancestral or SDE sampler where adding or removing steps will cause the image to change. SD3 usually hits hard diminishing returns around 35-40 steps
Which is why the suggested default step count is 28
If you want to get micro variations in an image, add in some commas (, , , ,) in the prompt one by one
But leave the steps and everything else the same
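A rough sketch of that trick, assuming the extra commas are appended to the end of the prompt (the message above does not say where they go). The helper name `comma_variants` is made up for illustration.

```python
def comma_variants(prompt, max_commas=4):
    """Yield the prompt with 0..max_commas trailing ' ,' tokens appended."""
    for n in range(max_commas + 1):
        yield prompt + " ," * n


# Same prompt, same seed and steps assumed elsewhere; only the commas change.
variants = list(comma_variants("a green haired boy eating a blue apple"))
for v in variants:
    print(repr(v))
```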
on a side note, comfyui has a new sampler type that seems to work really well with SD3. it's the eulercfg++ but you have to use the alternative mode option of it. obviously, use it with sgm_uniform sigmas
it's based on https://arxiv.org/abs/2406.08070
Classifier-free guidance (CFG) is a fundamental tool in modern diffusion models for text-guided generation. Although effective, CFG has notable drawbacks. For instance, DDIM with CFG lacks invertibility, complicating image editing; furthermore, high guidance scales, essential for high-quality outputs, frequently result in issues like mode collap...
you use it like this, obviously, finish hooking up the other stuff needed
yeah this is pretty damn good
showing a side by side of eulercfg++ vs euler, everything else the same
just using florence-2 for the prompt off an old generation because i'm lazy
yeah i'm finding some really nice improvements myself
i'm going to keep tinkering with it, but out of the dozen or so side-by-sides i've done so far, it's won every time for me
yeah i should throw dpm++ 2m in as well to compare all three
How to get SamplerEulerCFG++? I upgraded to the latest comfyui but cannot find it
scroll up a handful of posts, i showed an image with the setup
you have to use it with the custom sampler
dpmpp_2m vs eulercfg++ with my weakass sd3 lora
still just can't get the loras to do much of anything
dpm++2m on the right
dpmpp_2m, cfg++, euler
basically the same as euler
huge improvement there. it's a shit image, and suddenly great
yeah ive seen a few where it's drastically different than the other samplers
like even shifted the art style
yeah
very cool... what i like the most is how much it improved the coherence
got two major issues with sd3 myself: coherence, and how the hell do we train it?
dunno, i've mostly been working with pixart sigma lately and doing refinements with sdxl
i wonder how many tens of thousands of watermarks are in their training set
left is base, right is my most recent test lora
i manually shop out all watermarks in my training sets, i don't leave any bullshit in them
if it's art related, almost every single image would likely have either a watermark or signature. have you ever seen a painting that didn't have a signature on it in the bottom right or left corner usually? by definition, the fact that the models try to add them is more correct than if it didn't
but i feel you, we dont want them lol
yeah, just never used a finetune or a base that had them so consistently in the image
they probably didnt do as much random cropping and whatnot and actually trained with the full images
mostly cascade myself^
yeah, prolly
here's one where it changed the style up, but in this case, nails the prompt far better
i mean look at those black and orange flames, glorious
yeah i think this eulercfg++(alternative mode) is going to be my new go-to
Yeah agreed
By far the biggest thing is the improvement in coherence. There's a lot less separated object crap going on. Anatomy is still horrible but in general there's a substantial improvement
i kinda wanna try it out with sdxl, since i can test both modes of it there
Oh, yeah
id have to look at the code to see what all it's doing differently between regular and alternative, but obviously the alternative mode isn't doing w/e ancestral-like stuff in it
I've installed the CFGpp node but I cannot find SamplerEulerCFG++ at all?
it's part of stock comfyui now, you have to just update comfyui
looks like the regular mode is broken, i've tried it with every other sigma type as well. cfg is only 4.5. oh it works fine if cfg is set to 1 though
this is with sdxl
Hmm... I did three minutes ago... not there. Will try again
cfg1.0
did you do an update all?
Yes.... that's why I said I did
Something is goofy with my symlinks. maybe that's it
FETCH DATA from: C:\Applications\StableDiffusion\ComfyUI\ComfyUI\custom_nodes\ComfyUI-Manager\extension-node-map.json [DONE]
JSON file does not exist.
Update ComfyUI
Cmd('git') failed due to: exit code(1)
  cmdline: git stash
  stderr: 'error: 'models/checkpoints/put_checkpoints_here' is beyond a symbolic link
fatal: Unable to process path models/checkpoints/put_checkpoints_here
Cannot save the current worktree state'
The JSON error might have to do with the standalone Windows version? Who knows, will try later.
Figured it out, the put_checkpoints_here file was missing from the symlink
All good, have access to the CFG++ now to play with and still somehow mess up the art
yeah im thinking the cfg range with this eulercfg++ node is mapped weird or something. images all the way down at 1.0 cfg come out completely normal and if i bring it up even to 3-5, it starts to behave like you've got your cfg set to 8 or something(in terms of almost burning the image with too much contrast)
maybe this will be like cascade where you use 1.0/1.1 for cfg
either way, i'll have to play with it more to find the sweetspots for it
the link you posted earlier did say it was better at lower cfgs ... though it did say it is supposed to be better across "all scales". just now getting a chance to catch up on things
has there been any good updates in the past few days other then this cfg++?
Are you using the regular or alternate version? This is CFG 8 on "alternate", regular is bad off the bat
/help
Oh from the actual paper "The resulting process, which we call CFG++, works with a small guidance scale, typically λ ∈ [0.0, 1.0]" so yeah it likes low cfg. Not positive, but I think comfyui's cfg 1.0 is actually zero, so 1-2 might be the right range for this sampler
If not, then the range would just be 0<x<=1
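For anyone wondering why the usable scale shifts, here is a toy sketch of one Euler step with plain CFG next to a CFG++-style step, written on scalars so it runs without any libraries. It assumes the reading of the paper where the guided prediction is used as the denoised target but the re-noising direction comes from the unconditional prediction. This is not ComfyUI's actual code, all function names are hypothetical, and how ComfyUI's cfg value maps onto the paper's λ is still an open question in this chat.

```python
def guided(uncond, cond, scale):
    """Classifier-free guidance mix: uncond + scale * (cond - uncond)."""
    return uncond + scale * (cond - uncond)


def euler_step(x, sigma, sigma_next, denoised_uncond, denoised_cond, scale):
    """Plain Euler step: the guided prediction sets both the target and the direction."""
    denoised = guided(denoised_uncond, denoised_cond, scale)
    d = (x - denoised) / sigma
    return x + d * (sigma_next - sigma)


def euler_cfgpp_step(x, sigma, sigma_next, denoised_uncond, denoised_cond, scale):
    """CFG++-style step: jump to the guided prediction, re-noise along the unconditional direction."""
    denoised = guided(denoised_uncond, denoised_cond, scale)
    d_uncond = (x - denoised_uncond) / sigma
    return denoised + d_uncond * sigma_next


# Toy numbers only: a scalar "latent" and made-up model predictions.
args = dict(x=1.0, sigma=2.0, sigma_next=1.0, denoised_uncond=0.2, denoised_cond=0.6)
for scale in (0.0, 1.0, 3.0):
    print(scale, euler_step(scale=scale, **args), euler_cfgpp_step(scale=scale, **args))
```

In this toy the two steps agree when the scale is 0 or when the conditional and unconditional predictions match, and they drift apart as the scale grows, which would fit the burned look people report at higher cfg with this sampler.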
CFG 3 is not bad, trying 2.0 next. Or, going to bed because I'd have to be up in less than five hours
So maybe ranges in the 1-3 are like regular samplers in the 2-8 range or something. As you get closer to the cap, it starts to burn
I think so, CFG 8 was burned bad, but oddly worked well on my old workflow (where CFG 4 was blocky)
it's not an anime girl but eulercfg++ def seems to follow better
of course after i say it's not an anime girl it makes anime girls
Wow!!! This CFG++ is really making a difference - REALLY!!! Incredible!!! Just drop the CFG from 8 to about 2.5 or 3 before you start!!!
1st effort CFG++, 2nd effort no CFG++
Nice!
@lavish osprey making sure you don't miss this discussion
And the text has never been crisper!!! Still garbled, but so much classier!!
ill make some more test animations like this but with a tighter range now that i know what to expect. like from 1-3 but with smaller increments
CFG++
Look! Readable text - not just the odd time - almost every time!!!
CFG++
seems CFG++ does not work with SD3 TensorRT, tried but no luck
that is utterly horrifying
Get friendly with an LLM! "Get Gud Skilz!" ™ SD3 including CFG++
You may be doing overkill if you implement CFG++ with a ReFINER. CFG++ really improves text!!!
here's another cfg++(alternative) example with SD3
it definitely looks like it's 1-2 for the "normal" range
though conveniently, if you're trying to do more posterized illustrations, with less shading, cranking it up higher actually helps quite a bit, without ruining the whole image
i'll do another one in a realistic style
These are a great help
no problem, figure if im doing it for myself, might as well share some results
On a side note, I was thoroughly pleased to see there was a text overlay node in one of the common packs I'd had installed. Makes it easy to overlay the concatenated cfg: and whatever the selected value for the cfg is for the frame
Close to a person sitting down! I had to blur what was going on in the background because his trousers were pulled down
"a portrait of a young woman with a dramatic makeup look. She has long blonde hair styled in loose curls and is wearing a large flower crown on her head. The crown is made up of pink and orange flowers and green leaves. The woman is also wearing a gold necklace and earrings. She is looking directly at the camera with a serious expression on her face. The background is black, making the woman and her crown stand out. The overall mood of the image is dark and mysterious."
it really is looking like the sweetspot for cfg++ and sd3 is around 1-3
well more around 2-3, but it depends on what you're going for
i'll do a couple more before the morning chaos begins aka the kids get up
Amazing work RX808, CFG++ really does work super well with SD3
thanks, and also just an fyi, it's a completely default workflow, nothing special, just model and prompts straight into the samplercustom, 30 steps
also with cfg++?
Yes
where are you hooking up the SamplerCFG++ node in your workflow?
i showed a screenshot of it earlier
#sd3 message @green shoal
there, just hook everything else up you need
"a portrait of a young woman with long pink hair. She is wearing a futuristic outfit with a black and silver design on her face and neck. The outfit appears to be made of metal and has multiple layers of armor. The woman has a serious expression on her lips and her eyes are looking directly at the camera. Her hair is styled in loose waves and cascades over her shoulders. The background is black, making the woman and her outfit stand out. The overall mood of the image is dark and edgy."
thanks! no modelsampling to adjust the shift ?
you can pipe that in if you want, the default for ksamplers is 3.0 anyways
everyone's favorite, let's see if cfg++ helps it at all lol:
All you have to do is to update comfyui to get the node?
ja
okey, ty
and if youre going to use it, id recommend using the alternative mode and keeping your cfg between 1-3 or 4, 2 is looking like a pretty decent sweetspot
looking like a biblically accurate depiction of an angel, well seraphim
2 in 1
god i need a new gpu for this pc. this 2080 is showing its age. we have a 7900xt downstairs, but well you know how SD is with AMD, it probably has the same it/s as this card does lol
Alright last one for the morning:
"a digital illustration of a futuristic landscape with a red planet in the background. The planet is a bright red color and appears to be glowing with a bright orange glow. The sky is filled with stars and planets, and the horizon is visible in the distance. In the foreground, there is a rocky landscape with several tall buildings and structures scattered across it. The buildings appear to be made of stone and have multiple towers and spires. The structures are of different sizes and shapes, and they are arranged in a way that creates a sense of depth and dimension. The landscape is barren and rocky, with patches of grass and shrubs scattered throughout. The overall mood is dramatic and awe-inspiring."
pcm lora doesn't work with the sampler unfortunately
holy sh
did you try it with both modes?
kohya ss supports SD3 finetuning now (full model)
Yep
anybody got any decent results with SamplerEulerCFG++ regular ? it's either blurry blobs or pixelated burns for me
it seems like you can't go above 1.0 with it without it blowing up
oh yeah I gonna do my captioning real quick ( although Kohya did say that it is buggy )
oh and that was with sdxl, with SD3, it doesn't work i don't think
it's probably ancestral/sde based with regular mode, which don't work with sd3
A little cropping got rid of the feet above her arms and the stumps at the end of legs ๐
Nvm, it's something weird after I updated comfyui...........I shouldn't have updated
Before and after comfy updated...great
they might have fixed something under the hood with loras that an addon was using a work around for
I've got these versions:
Python version: 3.11.8
pytorch version: 2.3.1+cu121
Error. No naistyles.csv found. Put your naistyles.csv in the custom_nodes/ComfyUI_NAI-mod/CSV directory of ComfyUI. Then press "Refresh".
comfy_extras.chainner_models is deprecated and has been replaced by the spandrel library.
Same errors before I updated comfyui, but different outputs and output was working before updating it.
yo... wtf kinda data they got in this dataset... i accidentally typo'd 0.1 instead of 1.0 for cfg and got this
looks like some kind of science experiment documentation or something with all the timestamping at the bottom
unexpectedly good compared to SDXL?
although no one really uses 0.1 CFG
right, but going below 1.0 cfg almost always results in some weird feverdream shit, but this is just nightmare fuel
looks like having a stroke visually
What python and pytorch versions do you have?
Ok, then I know python isn't the issue. What about pytorch?
I think most popular SD applications (ComfyUI, A1111) use the latest? But when they download PyTorch it's usually a build tied to a specific CUDA version for the GPU
downloading through PyPI (pip install torch) won't do that, if I'm not wrong
and also 12.1 and 11.8
i have the same versions, prob isnt a version problem with those 2
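A small sketch for comparing versions like the ones pasted above. A PyTorch version string such as `2.3.1+cu121` carries the CUDA build as a suffix after the `+`, so splitting it makes the two parts easy to compare between installs. The helper name `split_torch_version` is hypothetical.

```python
def split_torch_version(version):
    """Split '2.3.1+cu121' into ('2.3.1', 'cu121'); the tag is None when absent."""
    base, _, tag = version.partition("+")
    return base, (tag or None)


print(split_torch_version("2.3.1+cu121"))
print(split_torch_version("2.3.1"))
```

On a machine with PyTorch installed, the string to feed in is `torch.__version__`, and `torch.version.cuda` reports the CUDA version the build was compiled against.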
Okey, it's just weird that first version of comfyui works great for me, but when I update it fucks up the generation :/
you can launch it?
yes
It's not any issues with workflow. not even a basic ksampler works
when was the last time you hard reinstalled comfy?
might be worth doing a fresh install in another folder and double checking
time to fuck my captioning of my entire SD3 dataset
today, since I got the same issue when I updated comfyui the other day
So it worked when comfyui was download from portable without update, but stops working (bad gens) when I update it.
I know it got updated for the new CFG++ euler sampler, might have had under the hood updates that changed things ... you're def beyond my ability to help though, sorry ...
have you played with settings to see if you can get the new "bad" gens to line up closer to what your expecting?
I will say I'm running my comfy directly out of a stable swarm install
but the error points to the custom node
quick google search later https://github.com/comfyanonymous/ComfyUI/issues/3463
looks like it's your universal styler custom node
"Error. No naistyles.csv found" when running comfyui, how to fix it? Since importing a production flow about clay and installing the missing nodes, comfyui can start normally, but this er...
Right now I can only compare to my last generation which has larger workflow. But I think it still looks different, Would you be able to see if there's a difference? (workflow included)
id see above first.
Shouldn't be the issue since it was already that way before comfyui updated though. But I'll try to fix it again
one is a file that was made on the 14th one is a generation from today with the same workflow ... its prob a custom node
I just notice that it's the pcm_stoschastic lora that doesn't work for me in the new comfyui update
ah im still running without loras atm so that make sense.
digging the non-euclidean geometry in this one lol
WTB a lora or hopefully a finetune model that knows how a bow is used
Want To Buy ? Cx
bows are cool why they never work
yeah, sorry old gamer slang
did some more training of my gpt
went with elvish ranger in a vaporwave style
good outputs but the bows are killing me
I play poe, that's the only meaning for WTB!
in another game it also means willing to blow 
think it might have been the same on craigslist before the personals section was removed
Who needs sd3 when you have cascade
Grotesque Pikachu creature, taking a selfie, burning football stadium, sky filled with alien ships, eerie green lighting, chaotic crowd fleeing, otherworldly creatures descending, intense smoke everywhere, neon electric bolts, unsettling facial expression
Or what it looks like
Quick question for the model: Where the heck is all that water for the waterfall coming from? LOLOL
" [my gpt] " ??
Do tell
I've trained a gpt to output a T5, G, and L encoder prompt
is it a single prompt or did you require a series of conversations to train it to do that?
did a session of training in the builder to get it tuned
Its a full custom gpt built in the my gpts section of ChatGPT
is it shareable?
this should be a link
https://chatgpt.com/g/g-SrU9OKDp7-prompt-research-assistant
its not perfect, but its a good starting point.
if you dont use a separate g or l prompt you can always just use the T5XXL prompt as the main.
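A minimal sketch of that fallback, assuming one prompt slot per SD3 text encoder (T5-XXL, CLIP-G, CLIP-L) and that an empty CLIP slot just reuses the T5 prompt. The class and field names are hypothetical and do not correspond to any ComfyUI node or to the custom GPT's output format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EncoderPrompts:
    """One prompt per text encoder; the CLIP prompts are optional."""
    t5xxl: str
    clip_g: Optional[str] = None
    clip_l: Optional[str] = None

    def resolved(self):
        """Return a prompt for every encoder, reusing the T5 prompt where none is given."""
        return {
            "t5xxl": self.t5xxl,
            "clip_g": self.clip_g or self.t5xxl,
            "clip_l": self.clip_l or self.t5xxl,
        }


# Only the T5 and CLIP-L prompts are supplied here; CLIP-G falls back to T5.
prompts = EncoderPrompts(
    t5xxl="an elvish ranger drawing a bow in a neon-lit forest, vaporwave style",
    clip_l="elvish ranger, bow, vaporwave",
)
print(prompts.resolved())
```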
JESUS CHRIST you're my hero... been looking for this... where can I steal this golden nugget?
cool dude thanks for sharing ill play with it in a bit, i was talking to @craggy crest the other day and he was demonstrating how much better output is when you split the prompts
posted a link
See it...
check out that replied comment you can see the depth and color range is greatly improved when you let each encoder focus on it's own thing
yeah ... i got the idea to do it from him. then used his base foundation of what each encoder wants to start with
oh wow it all comes back around, good stuff thx again for sharing
well the idea that im too lazy to come up with 3 different prompts at least XD
how were you able to do that with free gpt?
i pay
so you are paying for gpt to let everyone use for free?
Yup... so do I ... both UI and API... worth it for the power and ease of use.
(disclaimer, I use it a lot for my day to day job, not just the hobby)
no i pay for gpt plus which means I get to play in the builder and have higher message limits.
Comfy and Stable Swarm, etc, IMO really are too far behind in Inpainting powers, Fooocus is just so much better at it. I think Comfy has a Fooocus inpainting node? Without such powerful inpaint, honestly things are a bit more useless
damn, that's amazing ... thanks for that im gonna play with it
Why are you now happy with Foooocus? why Use Comfy if you like Foooocus?
Foooocus always inpaints and makes things look like they belong in the scene...Stable swarm default inpaint still you get blur and some seams
Um because Foooocus doesn't support SD3 yet so I have to use inferior products
have you played with claude 3.5 btw? as a paid gpt member i feel kind of offended now that there's something better (as far as coding) than openai's system, like it's remarkably better too not just slightly better
Comfyui is in no way inferior to Foooocus, Comfyui is really good if you knew how to use it, Foooocus is more friendly to less tech-savvy people
it IS inferior with inpainting
Should make this into a Yu-Gi-Oh card
i pay for gpt because i use it to help run a dnd campaign. thought about playing around with it soon but its a pain in the ass to flip all of the campaign's information that has been generated in one ecosystem to the other
i know because i went from claude to chatgpt when 4 came out.
Claude 3.5 is really good at understanding different concepts, i pay for it and use it to generate prompts, and its far better at understanding my instructions
Results of a test...
The one with the taxicabs was from a set of prompts from your trained GPT
The other one is from my own asking the general GPT-4o to write me separate CLIP-G, -L and T5 prompts
cody does it cost you for ppl to be using that gpt prompt?
or are there caps on how many ppl can use?
no, they made using 4o free for everyone it just has a tight usage limit
ahh sweet thanks man
ppl are gonna use it heavily
I mean how can this even be useful? Inpainting a soda. Stable Swarm..you always get fuzzies
yeah the robot one in its suggested prompts kinda jank for me. the main thing is that you can back and forth with it to get it better
i use it more for ad-hoc programming like just random stuff I need to code rather than relying on it for an entire built up system but I could understand how you're already deeply entrenched into one system and for your use case it doesn't make sense to switch back and forth, for me i have no issue switching back and forth, whoever gives me the best answer wins personally
if you did this in Fooocus you'd never even tell. And it's not the settings, I tried them all, its special magic that Fooocus does
almost like it does a AI content fill after the inpainting to make everything seamless
if i inpaint i always go back to sdxl then come back to sd3 to upscale
BTW correction: I meant "not" not "now" :;
that might work - because the issue I was having is if I go back to Fooocus just to inpaint I would lose detail right, but i guess if you come back to sd3 and upscale you could get it back
Are you using the official ComfyUI example workflow for inpainting or you built your own?
honestly i really like invoke for inpainting. haven't used it in a bit and im sure fooocus is just as good of a program. its just a nice all in one package that has a clean ui and a really nice universal canvas
This wasn't comfy, it is Stable Swarm (which uses Comfy backend)
Also keep in mind, there are no specific inpainting models yet for SD3... that I know of... and Differential Diffusion I have not tried yet if it works with SD3 as it does for SDXL
So, in ComfyUI, you are using a proper inpainting model for SDXL?
ChatGPT is not loading for me right now... anyone else? My sessions to it are just spinning
I heard that Comfy does have a Fooocus inpainting style node, though. But I don't know if it would support SD3 or not
I love how it JUST GIVES YOU the meat. No fluffy explanations ... lol. Trying it now. GPT is working here again
awesome thank you! are the generated prompts saved so a possible database could be made?
do you or anyone know how to reference a trained GPT like yours or others if using the ChatGPT API?
wanted to do a side by side of mine then yours
same seed
think we are both pretty much on the same path
mines not super trained for it but lets see
Source image and result from SD3
what look like if you just put all in one prompt

