#🏞|general-with-images
i dont use open pose much, but i dont think you need t2i for that
It makes it significantly less heavy and makes it faster
just preprocess the image to get the sketch/pose/depth/normal/etc , then turn off the processor, load in the processor result in cnet, turn on model (NOT processor) and generate
fair enough
so performance for you is better using t2i rather than just an open pose model w/ no preprocessor?
huh i might need to go back and read more on it..
lol we are talking about the same thing 😄
..I misunderstood my bad
t2i is a little different, i think. its a preprocessor if im not mistaken
Nah
It's like regular models, but lighter and faster. Only runs once throughout the entire generation, while regular runs once per iteration
or at least, there are t2i preprocessors. maybe ive just got my controlnet folder all messed up
both run throughout, assuming start and end steps are set that way
also, i highly recommend the dw open pose full, it handles hands, fingers, and facial features as well as basic pose
depth and normal maps are really good too
This was a pre-made, not one I made
got ya
(also you can take the image you generate with that pose, load it into controlnet, run dw open pose full pre processor, and then you can edit hands,fingers etc on the pose if you want more control over everything)
confused about what
Why this error seems so random
ALMOST got it... theres just some random demon ghost faces here and there, but it didnt mess up the character at all
yeah if you wanna send the log ill take a peek at it
I'll send it whenever I get it again...
Like using regular cn models absolutely wrecks speed
you should try miaoshou assistant
helps with memory release
might help, might now lol
might not*
I have uh..memreduct
never heard of that one
I've had it for a while
I just have to wait for that error
I 100% need to go lay down, I'm tired af
haha have a good one
It's also 4am lmao
Good night folks
Good to hear some do it with 4GB!
ok finally
got high denoise upscale without creating a mosaic or completely changing the image, in fact it seemed to put more emphasis on my prompts.
i also was doing this with a halfway edited image, and it went ahead and fixed the stuff i was going to do in a photo editor 😄
is this not the way to go ?
i cant make her grab the sword at all
no matter how much i change denoise etc
GAH
how can people be using a white background like that.. it is a war crime on the eyes
theres several ways you can do this. try adding to positive and negative prompt, (negative open hand etc) (positive clenched fist, gripping, hold weapon, etc). also try using control net with inpaint model and upload that as controlnet image, set denoise to 0.7-0.9, whole image checked, mask the hand, lama inpaint preprocessor.
or what will probably be faster is just do it in gimp or something. 2 layers. one layer with original, second layer (generate image again, same settings same seed, negative sword.) take your original image and put it as top layer, add alpha channel, erase where the hand should be (revealing second layer with hand) , merge layers, low denoise inpaint or img2img. or just completely do it all in one layer, wouldnt be too difficult with alpha channel, erase, clone , then run through img2img to clean it up.
also, what are your inpaint settings? masked only? whole picture? original fill or latent?
wtf, why does it decide it wants to cooperate now
i changed to fill, and denoise of 0,75
but im sure ive done that alrdy
something else that can help, try setting batch to 4+, and maybe turn on extra seed and crank the variation up to like .25+, if you arent getting results you want then throwing in some random noise can help.
trying it now
wait, what do u mean random noise?
gah
maybe 1024 is just not enough for inpainting
sorry poorly worded, i just meant that enabling extra (setting by seed) can introduce more variation in results. also 1024 is plenty for inpaint
i see
hands are also something sd struggles with. so, control net is recommended here. and, adetailer with a hand model is helpful as well. both will help avoid those mangled ass hands youre getting
so if i use only masked, does it use 1024x in that little area i marked?
no, only masked is going to only focus on the area you masked, + taking into account padding and blur. (theres some tricks here where you can do things like mask a very tiny or multiple tiny dots around masked area to have the model focus on a larger area as it inpaints only what you have masked), and whole picture the model will look at the entire picture as it inpaints what is masked. honestly youre really shooting yourself in the foot by not using controlnet while you inpaint though..
honestly, to get a good result using ONLY inpainting and none of the things im recommending, you need to add more to prompt and generate large batches because its going to come down to luck 🙂
alright ill try to understand controlnet then
and, this one is just opinion, but i find it easier to inpaint or do any editing at lower resolution than 1400+ that youre using. for one its going to be quicker. two, the results wont be as crisp but thats what upscaling is for anyway. you just want it to be mostly right before you do that
its not as challenging as it sounds! tons of good videos on yt that are like 5-10 minutes long that will get you started
using sd without controlnet is like riding a bicycle with no handlebars. gamechanger.
alright, inpaint controlnet pixel perfect
sigh, now upscaling doesn't work for me, this is not my day
i might have to go back to comfy
Bwahaha
I guess this explains why I haven't seen any more sd3 pics from these people I followed.
for some reason the neural network often thinks that what I want is not an animal, but an object
/ogurt packaging design, Hourglass shape special-shaped box, cute, creative shape ,mattetexture,round,goldandblack,elegant,
that frog has a tushie!
Guys I need help
it helps him float
yes, one sec
there u go
Trying it now
ooooo those are gorgeous
does anyone know why controlnet would just be ignored, even if it's enabled?
Models not downloaded/ in the right directory?
no, they're in the right directory, and the little box is enabled
one sec
i was thinking i probably had to like...guide it by putting in a specific prompt
normally i don't have to, it just...does it
I set it to balanced just now, maybe that'll help
That seemed to do it
read what your console says
I can't read this very well
please send me an any AI stable diffusion generated demon, I need it for demonstration
can you rotate camera more?
No control over the animation possible ...
oh
I am frustrated
SVD ... you give an image and get an animation ...
i see on civitai full rotate camera
Sounds like DeForum ...
I brought it into Krita, and fixed a bunch of things, and still working on things, and I have come to a part I am just not good at.
i can use svd on my 4 gb vram card?
He has no nips, and idk how to draw nips. e-e Anyone got any magical lora that adds them?
Give it a try ... I don't really know ...
You can use the pic in the USA that way ... 😄
Maybe try inpainting?
24gb?
Missing 24GB?
🥳
just started learning stable diffusion last night
its amazing the things u can do with it
but it's so complex, so much things to learn
kinda overwhelming
that was the best i could pull off so far
Well ... learning is part of the fun!
I'm learning over 2 years now ...
i made an apple to test the speed on my s/o's computer with forge... it goes fast
legit only used the word 'apple' and it came out great 😆
🤣
An apple with instruction ...
I'm not even using sdxl so that logo came out great, didn't change my prompt either
Good nite!
i got curious
Man cyberpunk
yeah good luck with requesting images when the bot is down
model for extension not work...
Guys!
its something new
This
to
Finally figured out what was wrong
It was doing this, couldn't figure out why.
pressed to generate something 20 min ago and it's still going
is that normal ?
img2img
Depends on how high you set the upscale to
I just started an upscale of x2 on top of running ultra upscaler x4, says it'll take 20 mins
XD
i'm doomed
in, out
for comparison: 0.5 denoise exponential, then karras at 0.4/0.45/0.5 denoise
you can change that in settings : live preview 😄
okay what i've got here is badass as hell
i don't think i've seen anyone do this... (though I can't possibly be the first to think of it) ^^^
in, out
may i ask what model did you use?
with 0.45 and 0.5 denoise with karras. total joke by comparison
this is juggernaut but i'm doing some fancy/insane tricks with the scheduler in comfyui
using res_momentumized as the sampler
thats cool, and juggernaut can be insanely good at times if done right.
yeah
the model isn't too important here but yeah it's def a very good one
this is with 50% unsampling/resampling with karras
0.5 denoise with exponential, and then 50% unsampling/resampling with exponential. so yeah, i really do have something here
iterative unsampling/resampling via a sine wave sigma scheduler
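For anyone comparing the two schedulers named above, here is a minimal sketch of the standard Karras and exponential sigma schedules. The `sigma_min` and `sigma_max` values are assumed, roughly SD1.5-like numbers, and this is not the sine wave scheduler described in the chat, just the two baselines it is being compared against.

```python
import math

def karras_sigmas(n, sigma_min, sigma_max, rho=7.0):
    """Karras-style schedule: interpolate linearly in sigma ** (1 / rho)."""
    lo = sigma_min ** (1 / rho)
    hi = sigma_max ** (1 / rho)
    return [(hi + i / (n - 1) * (lo - hi)) ** rho for i in range(n)]

def exponential_sigmas(n, sigma_min, sigma_max):
    """Exponential schedule: interpolate linearly in log(sigma)."""
    lo = math.log(sigma_min)
    hi = math.log(sigma_max)
    return [math.exp(hi + i / (n - 1) * (lo - hi)) for i in range(n)]

if __name__ == "__main__":
    # Assumed example range; real values depend on the model.
    for k, e in zip(karras_sigmas(11, 0.029, 14.6), exponential_sigmas(11, 0.029, 14.6)):
        print(f"karras={k:8.4f}  exponential={e:8.4f}")
```

Both lists start at `sigma_max` and end at `sigma_min`; they differ in how quickly they drop in between, which is one plausible reason the same denoise value can look so different between the two.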
Do you need to connect hundreds of lines each time?
its real?
yeah
I thought you had to be a scientist to create something there
i do have a chem phd but i'm pretty sure it doesn't help me at all with this
the thing that makes it hard is the lack of documentation
if there were nice written guides that actually explained what each option did with an example, it'd be easy
i try simple extension for animation and it not work
errors...
is so hard to start something
extension with a1111?
yes
the best thing to do is use someone else's workflow in comfy
animatediff
also, i remember you saying you had low vram... comfy uses less vram than a1111, that was why i tried it
also you saw my poll about video cards?
i didn't, link?
been really swamped with work the last few days so i prolly missed a lot on here
can i buy an image from someone that knows how to mess with stable diffusion ? if yes, where ?
i have 3 days to create something but i cant get it right
what are you trying to make
cast my vote (4090)
man i feel bad for the person with no gpu at all
ouch
in, out
scorpion
maybe post here what you need and maybe someone will help?
does anyone know of a zora lora? Like Zora from zelda
i dont have the resources to train one, nor the knowledge
for sd1.5 i presume?
haven't seen one, but it's not too hard to do
the guides are a mess
here's what you need to do... get 30 or 40 images of zora together with the most diverse angles, lighting, poses, gender, etc possible
with a consistent size, prolly 512x512 since you're talking sd15
no bad quality ones, it's better to have a small set than crap ones thrown in
Like i understand the file stuff, like 1.png, 1.txt with 1.txt having a bunch of the image information
like tags
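The `1.png` / `1.txt` pairing described above can be sanity-checked before training. This is a small sketch using only the standard library; the list of image extensions and the function name are assumptions for illustration, not something any particular trainer requires.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to whatever the dataset uses.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}

def missing_captions(dataset_dir):
    """Return names of images that have no same-named .txt caption file."""
    folder = Path(dataset_dir)
    return sorted(
        p.name
        for p in folder.iterdir()
        if p.suffix.lower() in IMAGE_EXTS and not p.with_suffix(".txt").exists()
    )
```

Calling `missing_captions("path/to/dataset")` returns an empty list when every image has a tag file next to it.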
since im not on my laptop, i probably do have the resources for it...
onetrainer?
it's easier and uses less vram and is also faster
that's what i use
focus on the data set first
get that and i or someone else can show you what to do next
don't worry about the naming either
alr, i'll do that while i wait for onetrainer
in, out... those eyes 
most of these are game screenshots, would that work?
if that's the look you want, yes
but again... diverse backgrounds, outfits, everything
ideally you want to make it so every single aspect of the image changes except for whatever makes a zora look like a zora
..yeah i dont think i'm gonna be able to do that.
it doesn't have to be perfect
these are all of mipha, it's gonna have like no variation
but if you have, say, zora after zora that only shows up swimming in water, the lora will have a hard time producing a zora on land
yeah, you'll want to get more
nice excuse to play the game more haha
haha yeah, i'm just googling these though
yeah you prolly want to fire up your switch and get your own screencaps
if you're going for the in-game look
nah, i'm going for like... the ability for it to blend in with almost anything. Like if i were to throw it into an anime style i'd get anime style, but throwing it into realism i'd get semi-realism... I know that aint happenin' but still
i might have to run around and get screen caps... i'd just have to charge my switch up.
it's doable, but yeah, you'll want to start with a diverse set if you want to go in that direction
you'd probably have to do some img2img work to create a synthetic dataset
but starting with some in-game stuff would help ya
you can sometimes use a lora that's "stiff" and inflexible to generate just enough new data to train another one that's more flexible
ahh
Made this by mistake, and I love it
oooh
okay so for the training and stuff, does it have to be 512x512?
can they be a bit bigger?
in order to get full body it'd have to be bigger
Amazing
...oh what does this mean
oh yeahhhh
looks like a preset is missing
best you could do is maybe 768x512
cool
yeah the resolutions are so so important
if you want to see why, try generating "a woman walking on the beach" with 512x512, 768x512 and then 768x768
mutant city with 768x768
oh, yeah I know the difference...it does a lot with it
if you make it too big, you get a mutant
if you make it too small, it looks bad
the size changes a lot of stuff
yep
when you train a lora, you're just basically refining a model that was already trained
and those were trained primarily on 512x512, and a bit on 768x512, not so great at many others
okay so i found like...11 images and resized them.
I would boot up my switch but it's charging right now
cropping is fine, downscaling with lanczos is fine, but upscaling you wanna avoid if at all possible
it's better to pad the image by outpainting than to upscale
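The crop / downscale / pad advice above can be written down as a small decision helper. This is only a sketch of the rule of thumb from the chat: the function name and return format are made up for illustration, and the actual resizing (for example Lanczos in an image editor) or outpainting still has to be done separately.

```python
def plan_resize(width, height, target=512):
    """Pick a no-upscale plan for bringing an image to target x target.

    Returns (action, size), where size is the image size before the final
    center crop to target x target.
    """
    short_side = min(width, height)
    if short_side < target:
        # Too small: extend the canvas (outpaint) instead of upscaling.
        return ("pad", (max(width, target), max(height, target)))
    scale = target / short_side
    scaled = (round(width * scale), round(height * scale))
    if scale == 1:
        return ("crop", scaled)
    # Larger than needed: downscale so the short side hits target, then crop.
    return ("downscale+crop", scaled)
```

For example, a 300x400 screenshot comes back as a `"pad"` plan with a 512x512 canvas, while a 1024x1024 image comes back as `"downscale+crop"`.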
...what if the image is smaller than 512x512?
like i dont know how to use outpainting
Yeah but they use so many big words and tech terms that it just goes right over my damn head
outpainting is the same as inpainting, except on the outside of what you got, instead of the inside
"You need to activate the schmorgus setting inside the yufidoo..."
😆
Yeah, hold on i'll do some reading
it might actually help me
oh nice okay that wasn't that bad.
i dont know what im doing in order to make this work, but i just kinda...made it bigger i guess
it counts.
probably not.
....
that definitely doesn't count
why can't i throw it through an upscaler exactly?
it still looks good...
if it looks good enough to you, then it's fine
okay so i have my images
this is just a test, so i'm not worried about if it's 100% good or not...
the page gives no instructions on how to use this. love it
ah wait there it is
it says i still need to do the txt file thing
I don't see where
this is what i use
...yeah i dont think I'm cut out for this. I'm reading but retaining absolutely nothing.
It's going right through my brain.
that's what happened with me pretty much every time i read it
just get that program installed
it's pretty easy once you have that

you divided by zero and created a singularity
I have done everything recommended, and it just refuses to work. It's only doing it with this model, so maybe it's just refusing to work
Lmao
i restarted forge
like i closed and reopened it
and it still happened yeah, but i'd just restart each time
I did that, and it didn't work
i guess make a bug report on the github for it with a copy paste of your console?
I am restarting again, to see if it works
Okay, it's working again, but no idea what caused that
Does anyone know how to make the first image as good as the second one?
is it possible to take this scythe and enhance it ? like this glow around more detailed, everything sharp and high res ?
im trying for 2 days, can't get it right
tried different models, idk what im doing wrong lol
Here is the image you requested.
Just upscale basically? 
Maybe little bit of inpainting for more details
how do you use inpainting
I play totk too
Hi Everyone. Not sure if this is the right channel for this. I'm looking for a stable diffusion / MidJourney professional who can assist with a project on digitally altering images of socks. I have PNG images and 3D files of the socks. The goal is to take these images, keep the socks unchanged, and completely transform the model and background to a design of my choice. I would also love to learn this process. If anyone is skilled in these techniques and is open to collaboration and teaching, please DM me. I've attached an example of the final result we want. Thanks!
a 25-year-old friendly-looking man sitting behind a desk in a futuristic studio, wearing a yellow hoodie. window background, smooth, soft, ultra-sharp, detailed, looking straight forward, centered in the image, straight, front-facing a camera.
Here is the image you requested.
hi friends
Beep boop!
Here we say: Moin! 🙂
Here is the image you requested.
That reminds me of a friend who takes Polaroid pictures on parties and writes jokes under them ...
Which webui are you using?
i use forge
never used it, but it looks ~same as a1111, so...there's img2img section => there should be inpainting
You mask the section you want and let AI change it...there are some settings, making it a bit more complex, might want to go through docs or some vids about it
oh and you will need inpainting model, usually models have 2 versions - base and inpainting version
a1111 is what i use im pretty sure
i just thought it was called forge

okay so i gotta download some stuff
at least that was the case with 1.5x models, idk what's up with SDXL and if it can do inpainting or need something more
it can, but not as well as sd15
my new fav denoising schedule
gorgeous
thank you 🙂
you might like that denoising schedule
workflow embedded
that noise scheduler is giving me some of my best results ever
ty, gonna check
Using the noise thingy?
no, not yet 😄
How to install stable diffusion in low end pc
How much VRAM?
A web service to create could be a better idea
Leonardo.AI is giving some free tokens every day ... https://www.craiyon.com/ should be free
Realtime Generation @ leonardo.ai is for free and interesting for learning, too
so, the image upscaler node doesn´t work even if installed. Already tried "try fix", yet didn´t work either
Take your time 🙂
It just doesnt work, so what´s got time to do with it? 😄
Create beautiful artwork using the power of AI. Enter a prompt, pick an art style and watch WOMBO Dream turn your idea into an AI-powered painting in seconds.
Freemium
Learn ... 😄 Just kidding ... had a problem with the SUPIR and stopped working on that.
had to update everything, now it´s working 🙂
still doesn't work though, something with controlnet missing 😀
even though it installed tons of c-net stuff during the update
@nimble mason impressive stuff. new release from pixart, sigma (versus their old alpha). they have a free space to try it out. https://huggingface.co/spaces/artificialguybr/Pixart-Sigma
I couldn't get sdxl to do this with any amount of prompt expansion and trying various models. this is impressive stuff. I think we're about to start seeing an explosion of new models that use T5 llm models as part of the render pipeline like SD3, Ella, and now this.
I run it offline with comfyui-extra models
there, you can run T5 at fp16, 8-bit and 4-bit, with no conversion
Have a workflow handy for it that you can drop on here?
Yeah, 24 gigs
oh lol
then the one on the repo is perfectly fine: https://github.com/city96/ComfyUI_ExtraModels
its quite easy to install
only problem is (with ELLA as well), is the amount of time T5 needs to load in
its quite slow
but yeah with 24 gigs you can run T5 at fp16
@jovial tiger just wanted to tell you that this and ELLA aren't good at text btw
but they are good for what you are usually doing
complex scenes
I wonder if you can do regional prompting cause of the close integration with comfyui
I mean it has conditioning right there for you to concat/combine and whatever
@cyan shoal awesome, thanks. going all over the place to download stuff. yet again. 🙂
yeah, i tried one of those as well and it was already better. still downloading. can't wait to see what my llm expanded prompts do with it
it's not perfect, but it's a large step up.
certain actions are still not going to be there, but I'll take any leg up at this point.
^
From reddit thread: orange cat wrapped in white bandages and black dog wrapped in red bandages sitting on a bench on top of a hill filled with round stones, photo, cinematic
cool! ⚔️
@cyan shoal what resolution settings are you using for the empty latent?
I keep trying to use my own and it says not good. what's the best way to get hi-res with this?
epic
sure, but on the demo, you can do 1920x1080 for instance.
I tried making an empty latent with 1920x1080 and it refused.
I guess I'll try the usual upscale methods. have you tried samplers other than the default euler?
there are only 4 model types rn available to the public
256px, 512px, 512-DMD, 1024px
then there are 2 remaining models that are not available yet: 2K and 4K
you could try kohya's deep downsample
or just generic highresfix maybe
so the guy probably used kohya's deep downsample or something else
but in 8-bit or fp16
wow. he's even holding the skulls i had in there.
I guess I'm happy for now. I'm using 1.67 ratio, which is close to 16:9. the output is amazing, so I won't fiddle any more.
these sampler settings give really good output
gigantic robot reindeer dwarves tiny santa who is looking up at it, swirling snow, ethereal christmas lights,,ultra highres, High detail RAW Photo, , dslr, film grain, ultra detailed, 8k, masterpiece, hyper realistic, photorealistic, photograph, sharp focus
wow: orange cat with white hat sitting on a park bench next to a black dog wearing a blue scarf and rasberry beret @nimble mason
Is that beret the kind you find in a second-hand store?
you know what? it is!
this is kind of nuts.
I select cpu for the t5 model, and once it's loaded, it only uses 3 gigs of vram. and it's no slower than loading the whole thing on the gpu instead.
and their model isn't censored either.
A man in a rugged helmet grapples with a towering, anthropomorphic Cheeto in a dimly lit living room, as if straight out of a surrealist painting. The camera captures the scene from a low angle, highlighting the absurdity and drama of their wrestling match.
think im using upscale wrong, getting outputs like this
you need a second ksampler with a 0.5 denoise after the upscale latent.
do i plug in the same stuff for model, +ve and -ve prompts?
correct.
just that the latent input is from your upscale latent node instead of the empty latent from the beginning.
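The wiring described above (same model and prompts, but the second sampler takes its latent from the upscale node) can be sketched as a ComfyUI API-format graph built in Python. The node class names and input fields are written as I understand them and should be checked against a workflow exported from your own install; the sizes, step count, cfg and sampler choice are assumed example values.

```python
import json

def two_pass_workflow(ckpt_name, positive, negative, seed=0):
    """Sample at base size, upscale the latent, then resample at 0.5 denoise."""
    def ksampler(latent_node, denoise):
        # Both passes share the same model and the same +ve / -ve conditioning.
        return {
            "class_type": "KSampler",
            "inputs": {
                "model": ["1", 0], "positive": ["2", 0], "negative": ["3", 0],
                "latent_image": [latent_node, 0],
                "seed": seed, "steps": 20, "cfg": 7.0,
                "sampler_name": "dpmpp_2m", "scheduler": "karras",
                "denoise": denoise,
            },
        }

    return {
        "1": {"class_type": "CheckpointLoaderSimple", "inputs": {"ckpt_name": ckpt_name}},
        "2": {"class_type": "CLIPTextEncode", "inputs": {"text": positive, "clip": ["1", 1]}},
        "3": {"class_type": "CLIPTextEncode", "inputs": {"text": negative, "clip": ["1", 1]}},
        "4": {"class_type": "EmptyLatentImage",
              "inputs": {"width": 512, "height": 512, "batch_size": 1}},
        "5": ksampler("4", 1.0),   # first pass starts from the empty latent
        "6": {"class_type": "LatentUpscale",
              "inputs": {"samples": ["5", 0], "upscale_method": "nearest-exact",
                         "width": 1024, "height": 1024, "crop": "disabled"}},
        "7": ksampler("6", 0.5),   # second pass starts from the upscaled latent
    }

if __name__ == "__main__":
    # "example.safetensors" is a placeholder checkpoint name.
    print(json.dumps(two_pass_workflow("example.safetensors", "an apple", ""), indent=2))
```

A decode and save step would still need to be attached to node `7`; it is left out here to keep the focus on the latent wiring.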
Oooo I remember hearing they were working on it
Those images look great
Same T5 files as before with alpha?
yep
I did notice that if you made a reeeeally complicated prompt, it needed more steps. so 50 instead of 30 for res_moment
but it did it
Nice
Yeah the main issue i remember with alpha was either censorship or under training or both
Had a pretty limited vocabulary
What it knew it was very good at though
it's definitely not censored.
I tried both main uncensored angles and it did both
only catch is that there's no upscale. @cyan shoal mentioned that 2k and 4k versions of the model will be released at some point.
so with a 1.67 ratio, it does 1280x768 which when the prompt is adhering so well, is fine
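A quick check of the arithmetic behind those numbers, using only the standard library. The helper name is made up for illustration.

```python
from math import gcd

def aspect(width, height):
    """Return (reduced_width, reduced_height, decimal_ratio) for a resolution."""
    g = gcd(width, height)
    return (width // g, height // g, round(width / height, 3))

print(aspect(1280, 768))   # (5, 3, 1.667)
print(aspect(1920, 1080))  # (16, 9, 1.778)
```

So 1280x768 is exactly 5:3, which is a little narrower than a true 16:9 frame.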
That's really good to hear re: censorship
I don't even care about making that type of content but when it's censored as hell it really does affect its ability to generate tons of peripheral stuff properly
lol @ them dropping a pickle checkpoint 🤣
i'll use it but jeez what a way to ensure large numbers of people will use it without hesitating
regarding res momentumized... def use the samplercustom version, those extra options (especially the noise sampler and sigmas) make a really big difference
i'm gonna see if i can come up with a better schedule for denoising in general
The perlin sampler is nuts for crisp details in most cases
For noise, uniform is often better than gaussian espec in combo with the perlin sampler
In the foreground, a meticulous mechanic, clad in protective garb, wielding a powerful welder, strikes a focused pose amidst a shower of sparkling arcs, adding intricate details to the colossal robot's metallic body, while towering skyscrapers rise imposingly in the background, emphasizing the immense scale; the scene is captured with a long exposure, creating a breathtakingly detailed and realistic image in shades of grey and blue, capturing the gritty essence of the mechanical realm.
so far it's not limited by 77 tokens
is this a new model?
new image checkpoint, but more importantly, throws CLIP out the window and uses a real llm instead.
wtf how do you have that sampler
there's an extra samplers node in comfy
yeah it keeps adding letterboxes for some reason
thanks I found it, gonna try it out
sometimes it's better, sometimes not.
it's one of the few samplers that seems to work with this pixart thing though.
and looks better than the default euler.
yeah I tried a bunch of samplers
i'm getting good results at 30 steps with res_m, difficult prompts look better at 50.
wow
Res momentumized is the most interesting sampler I've found so far and it's not even close
That doesn't mean "best" but in many cases it is
yeah, i get great results with 20 steps dpmpp_2m with another 20 0.5 denoise for most stuff. but if you don't care about how long things take, then it can be better than the usual higher quality ones like dpmpp_sde_*
hah yeah. for the first image to load from nothing, t5 takes minutes to load into system ram.
well that's one downside for SD3 already 🤔
I mean its not generation speed, but still
might tick people off
minutes??
yeah...
took about 5 seconds for me
but once it's loaded, then generations after that are quick.
what size t5 are you using? the one i got off the recommended site is 20 gigs.
yeah, 2x 10 gig for me
well, once it's cached it's fast.
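As a rough back-of-the-envelope for the fp16 / 8-bit / 4-bit options mentioned earlier: if the 20 gig download is stored in fp32 (an assumption, it is not stated in the chat), the weight sizes at other precisions follow directly. This only counts weights, not activations or loader overhead.

```python
# Bytes per parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "8-bit": 1.0, "4-bit": 0.5}

def estimate_sizes(checkpoint_gb, stored_as="fp32"):
    """Estimate weight size in GB at each precision from the size on disk."""
    params_billions = checkpoint_gb / BYTES_PER_PARAM[stored_as]
    return {name: params_billions * b for name, b in BYTES_PER_PARAM.items()}

if __name__ == "__main__":
    for name, gb in estimate_sizes(20).items():
        print(f"{name}: ~{gb:g} GB")
```

Under that assumption fp16 comes out around 10 GB, which would fit the remark that a 24 gig card can hold it.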
are you loading off a HDD?
nvme top of the line alienware.
it's not a drive speed thing, it's a processing thing
wtf
yeah it's seriously just a few seconds for me
it's probably doing an md5 hash the first time it's loading it.
I'm doing it across 3 different machines.
so i'm going through that first initial load 3x.
are you just using the standard sdxl vae?
yes
i'm using theirs, but I tried both and I can't see a difference
link to theirs by chance?
maybe they're the same? idk
they actually mention sdxl vae...
so i think it's the same
? I don't understand what you wrote there. 🙂
nice prompt
oh, the linux command
cheeto man is going down.
wasn't sure if you have that on your system or not
i have wsl running on mine so i use that sometimes with the chaos of SD resulting in lots of models from different sources with different names being the same giant file
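The "same giant file under different names" problem can also be checked without a Linux shell. The original command is not shown in the log, so this is an independent sketch in Python; it uses SHA-256 rather than md5, and the function names are made up for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def file_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks so multi-gigabyte models are not read into RAM at once."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def find_duplicates(folder):
    """Return groups of file names under folder whose contents are identical."""
    by_hash = defaultdict(list)
    for p in Path(folder).rglob("*"):
        if p.is_file():
            by_hash[file_digest(p)].append(p.name)
    return sorted(sorted(names) for names in by_hash.values() if len(names) > 1)
```

Hashing 20 gig files takes a while, so it is worth pointing this at one models folder at a time.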
k cool
I think i was just renaming stuff to make sure i knew it was the new one.
a pink frog sitting on top of a green cat
clearly a bald green cat.
exactly
lol
even has lil cat ears
hey, this is exciting
i'm glad you or whoever noticed sigma was released brought it up
that had completely fallen off my radar
leave it to the clown
yeah it was obviously not trained on text as much
to make the very first prompt test break it
not even close
yeah text is worthless with it
its not like ELLA is any better
so now here's the other q... what's the compatibility situation like with loras and controlnets? i'm guessing zero? and how hard would that be to address
based on the tests you've shared, it certainly seems this is worth a closer look by the community
0.78 ratio works best
I tried just loading a regular checkpoint with this t5 thing and of course, no go.
so what about a mech punching a hole in a building?
I'm trying to smush these 2 together, but they won't go. 🙂
yeah i'd imagine the architecture is different
other thing too: when we're talking about prompt adherence, res can be a problem, i think in part cuz the schedulers we have are usually too aggressive with the sigma schedule
you guys love this game too, right?
damn havent seen it in years
looks burnt with cfg = 6
way better than sdxl, but not like ideogram which is the last image.
I'm using cfg 5.5
so i tried ultimate SD upscale, but it gives me 4 different images instead
wrong tile size for one
if you're using sdxl, tile = 1024
or some other native sdxl resolution, i usually use whatever my latent size was originally
my empty latent is 512x512 though
which is also the wrong size
you want 1024x1024 as your default
sdxl wasn't trained on 512
there's some resolutions for sdxl
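A small helper for picking one of those sizes. The list below is the set of SDXL resolutions that is commonly passed around in the community, included here as an assumption rather than an official spec; the function just picks the entry whose aspect ratio is closest to what you asked for.

```python
import math

# Commonly cited SDXL sizes (assumed list, roughly one megapixel each).
SDXL_SIZES = [
    (1024, 1024), (1152, 896), (896, 1152), (1216, 832), (832, 1216),
    (1344, 768), (768, 1344), (1536, 640), (640, 1536),
]

def closest_sdxl_size(width, height):
    """Return the listed size whose aspect ratio best matches width:height."""
    target = width / height
    return min(SDXL_SIZES, key=lambda wh: abs(math.log((wh[0] / wh[1]) / target)))

print(closest_sdxl_size(512, 512))    # (1024, 1024)
print(closest_sdxl_size(1920, 1080))  # (1344, 768)
```

The same size is then a reasonable starting point for the tile size in a tiled upscale, as suggested above.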
Ouch
Changed the prompt a bit to: a pink frog sitting on the head of a green cat and ELLA gave me this
that last one looks like lora fuel lol
side view of an anthropomorphic muscular green cat is pulling a wagon along a sidewalk on a residential street. There is a smiling anthropomorphic pink frog wearing a racing helmet in the wagon.
does it understand left/right/top/bottom?
This is where fine tunes come in. Ella's ability to use existing fine tuned models is a pretty big plus.
I have a command for it working on gremlin
whoa, check this out...
a race car driving on the left side of the freeway against traffic in detroit during a thunderstorm
that is the left side, or appears to be for that image
a lil mushy looking
So the answer is yes, but takes some seeds and there's some subject bleed, so it might take a bunch of seeds before you get a perfect one
ooo great timing, the readme was updated with some great info https://github.com/Extraltodeus/sigmas_tools_and_the_golden_scheduler
looks like ass (messing around with schedulers now) but hey, left side, and i'm pretty sure that's against traffic
that is definitely against traffic
i wonder what kind of noise sigma was trained on? the usual shit, or pyramid?
Hah looks good.
still need to figure out what scheduler/sampler/noise type works well for this obv
but some really good signs already for prompt understanding
also, iirc one thing pixart was throwing around was that their models would be more trainable...?
it can actually do rain... most sdxl models do the effect of rain but don't show it streaking through the air
effect on a surface that is
Whimsical hand-painted watercolors: Vividly depict a cheerful red cat, its fur raising in the gentle breeze, perched to the right beside a serene blue frog atop a dainty mushroom, with a dreamy forest backdrop of soft pastel hues and gentle lighting, creating a delightful and peaceful scene.
Huh. The image prompt adherence went way up when put through prompt expansion first
wrong side, but who cares, great image
is that the prompt that went into T5?
or the one that went into your LLM
what's the expanded prompt?
oh k
a race car driving on the left side of the freeway against traffic in detroit during a thunderstorm
I'll try it through this
Detroit rainstorm, nighttime, dramatic lighting. A sleek race car speeds on the left side of the soaked freeway, defying traffic with its brilliant red body aglow, towering city skyscrapers beyond, creating a breathtaking, high-speed silhouette.
@nimble mason
Looks awesome
nice, nice
also try cfg rescaling at around 0.8
I need to change up my command to do llm expansion instead of raw. Looks like it really benefits
now can we get it to show traffic on the other side too? view of the freeway from a bit farther back
yeah
have it spit out the expanded prompt too when it generates so we can learn from what it understands and what it doesn't
the rescale node in comfy?
RescaleCFG?
yes
def helps with the burnt look
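For context on why rescaling helps with the burnt look: the idea, as published for CFG rescale, is to pull the standard deviation of the guided prediction back toward that of the conditional prediction, then blend with the plain CFG result. The sketch below works on flat lists of floats and is only an illustration of that formula; it is not the node's actual implementation, which operates on tensors inside the sampler.

```python
from statistics import pstdev

def rescale_cfg(cond, uncond, cfg_scale, multiplier=0.8):
    """Apply classifier-free guidance, then rescale its std toward cond's std."""
    # Plain CFG: push away from the unconditional prediction.
    x_cfg = [u + cfg_scale * (c - u) for c, u in zip(cond, uncond)]
    # High cfg inflates the spread of values; this factor undoes that.
    factor = pstdev(cond) / pstdev(x_cfg)
    # multiplier=0 keeps plain CFG, multiplier=1 uses the fully rescaled result.
    return [multiplier * (x * factor) + (1 - multiplier) * x for x in x_cfg]
```

With `multiplier=0.8`, as suggested above, most of the over-amplified contrast is removed while a little of the plain CFG result is kept.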
Who needs round exhaust pipes anyway? 🤷 How much vram does it need? Just want to know if i should bother looking at it.
dpmpp_2s_a and karras
that's uniform noise... this is gaussian
pyramid... yuck
power noise
tried setting the t5 type to fp16 and to load via gpu... pow, comfy crashed
#sdxl #ComfyUI #comfyui #inpainting #stabilityai #stablediffusion3 #stablediffusion #SD3
I joined StabilityAI in April 2024. Thanks for all the channel support!
This is a video about the SD3 available on the Stability Discord server. I try out all sorts of prompts and experiment with SD3's new capabilities.
More information about SD3: https:/...
@jovial tiger FINALLY
@jovial tiger you said you were using t5 with fp16...?
supreme/exp
res/exp
all exponential scheduler with gaussian noise: dpmpp_3m_sde_gpu, dpmpp_2s_a, dpmpp_2m
new forza?
i remember it didn't take very long for someone to publish a finetune on civitai with alpha... i don't remember how big of a diff there was, but i have it on my HDD
In a cinematic, high-contrast noir-style digital painting, a scene unfolds on a stormy night in Detroit where a sleek, aerodynamic race car hurtles down the left side of a rain-slicked freeway. The car, a masterpiece of engineering, is painted a deep, glossy black, accented with stripes of iridescent silver that catch the intermittent light from the storm above. Its headlights slice through the heavy downpour, casting eerie beams that reflect off the wet asphalt and the rain-drenched vehicles it narrowly avoids. The oncoming traffic, a mélange of startled drivers in mundane sedans and trucks, flash their headlights in confusion and alarm. Overhead, the sky is a tumultuous canvas of rolling dark clouds and sudden, jagged flashes of lightning, illuminating the scene in brief, dramatic bursts. Each lightning strike highlights the car’s aggressive motion against the natural flow, emphasizing the danger and chaos of its path. The surrounding environment is a blur of towering billboards advertising local Detroit haunts and neon signs flickering spasmodically, struggling against the storm.
wow
awesome
we're gonna have a lot of fun with this 😄
hope there's a way to train controlnets for it
haha jesus, just the first minute of this sd3 video has me blown away. he flashes insane images real fast by the screen, every one is amazing
if this is anything like it looks like so far i'd gladly pony up for some h100 time if needed
oh really
what you test?
what are these tools
its free for all?
my earbuds batteries died and my wired headphones busted so i don't have sound right now
all via discord?
i remember emad saying comfyui would be getting an upgrade and/or new tools
He is showing 4 minutes of a bot channel that none of us has access to. I feel like he wasted my time with that.
pretty annoying tbh that not one regular on their official SD discord has access to their SD3 discord bot, lol
his first SD3 prompt, on pixart-sigma: a wide lens cinematic rear shot of a young male dressed in futuristic minmal brown and dark green sci-fi armor and ragged brown cape overlooking a high cliff, looking down at a large army of desert warriors
idk its still weird to me how inferior these preview images sometimes look compared to lykon's images
admittedly, lykon did use highresfix
Today i test SD3: A cat
so it does improve image quality a lot
bird's eye view of a legion of angry shouting Spartan warrior batmans armed with shields and speers. chaos, debris, confusion, anger, blood, gritty, dirty, mid-action, god rays, yellow smoke,
yeah upscaling def isn't working like it does with sdxl
guess we do need to wait there
sd3 hands seem pretty borked.
unless tiling does something
a full body character design of a female puppeteer, short blonde hair, modern streetwear clothing of white jacket, black shirt, and tattered distressed dark blue jeans, alexander mcqueen fashion, arms raised in manipulating fashion, various futuristic sleek androids of different sizes being controlled by her, background workshop with different synthetic organs floating in large tube containers
what is it sd3?
that's another sd3 prompt
sd3 did it better, but the hands in his video examples were even worse
are you still using res or are you using anything else differently?
res. 50 steps. all the other samplers came out very muddy for me
"steps": 50,
"cfg": 5.5,
"sampler_name": "res_momentumized",
"scheduler": "karras",
huh, i've found res to be muddier so far than ancestral dpmpp_2s_a
a woman standing in a kitchen clasping her hands together behind her back
legit first time i've seen any model do this
not even held together, but still
they're BEHIND not beside
how many steps and scheduler for 2s_a?
ok i just did side by side and the composition of the 2s_ancestral was better
both were clear
running a set of 3 with 2s now
I'm starting to think some of this is just seed based.
both are sharp, but every now and then a random seed will be more blurry/muddy than the others.
ahh
man, watching this video he touches on safety, saying that if someone can do an image of a large container ship crashing into a bridge, that would be bad and effectively should be banned. rage at the clouds for people who think like this. intentionally nerfing models.
that's why i'll never get a robot punching a building with sd3.
Captured in a soft, watercolor-style portrait, a woman gazes directly at the viewer with a gentle smile. Her hands are clasped behind her back, concealed by the flowing fabric of her floral summer dress. The light wash of colors and the fluid brush strokes accentuate her calm demeanor and the subtle twist of her body, suggesting a casual, yet thoughtful stance. The delicate play of light and shadow around her form subtly alludes to the hidden gesture of her hands, adding a touch of mystery to her relaxed pose.
yeah fuck that
are we going to ban photoshop then? cuz i sure as hell could photoshop that. jeez
ideogram doesn't think this way, i believe
it's literally in their terms of service "we won't restrict what images people want to make" and they legit don't.
while on here we're told any amount of blood, or even just a cake made out of meat is too violent/disturbing (despite being at worst, PG-13 imagery, maybe even PG)
ideogram spits out prompts that talk about cannibals and gore and bicycles made out of "human meat and bones"
hah
yeah i did the sd3 monster stabbing a rat, and it did it
so far every sd3 prompt i throw at this pixart, it's doing a really good job
sd3 is better, but pixart is certainly better than ella as far as image quality
sd3 doesn't seem to have the ability to put things in certain places if it's just one subject.
only relative to other objects.
really wants to do this
Another sd3 prompt in pixart-sigma: top down wide camera angle aerial rear view of a kpop male adventurer assassin wearing dark techwear fashion in the style of alexander mcqueen with white and teal accents, flowing robes and hood, in a dynamic upside down falling pose holding on to the railing of a sci-fi futuristic greco-roman space elevator, over a huge sprawling aerial city in the shape of a lotus petal surrounded by water on all sides, a mega structure of a towering babel-like tower space elevator in teh center reaching into the heavens, falling downward in the dusk sky during golden hour, split toning, sunset dusk, obscured by clouds, atmospheric perspective, in the style of painterly ink
neither were the sd3 shots.. none were upside down
are you using these settings too
Illustrated in the style of a modern graphic novel, a race car is dramatically rendered in bold, angular strokes as it navigates against traffic under a thunderous sky on a Detroit freeway. The artwork is characterized by stark contrasts between the dark, ominous sky and the bright, artificial lights from the car and surrounding traffic. The race car, depicted in hues of fiery red and jet black, cuts through the scene with a palpable sense of urgency, its lines sharp and aggressive. Rain slashes across the panels in jagged lines, adding to the sense of speed and danger. Oncoming cars are simplified into geometric shapes, their headlights glaring against the night, adding to the overall tension. The background features high-rise buildings and overpasses, drawn in exaggerated perspectives to enhance the depth and chaos of the urban environment. Lightning forks across the sky in stark white flashes, illuminating the scene in brief, dramatic moments that highlight the reckless bravery of the race car driver.
cfg = 6 here
pixart-sigma
and CFG 5
all exponential
karras, ancestral, cfg=5, 50 steps
In a photorealistic style, a race car depicted in sharp detail drives the wrong way against traffic on a Detroit freeway during a severe thunderstorm. The car, a model of precision and speed, sports a lustrous red finish with sleek black accents that gleam under the storm’s intermittent illumination. Each raindrop is captured as it pelts the meticulously crafted surface of the car, creating a texture of crystal-like beads that stream across its body. The storm above is a dramatic spectacle, with heavy, roiling clouds unleashing torrents of rain that turn the freeway into a reflective mirror of chaos and motion. The headlights of oncoming cars, a mix of whites and yellows, create a disorienting array of lights that challenge the race car’s daring maneuver. In the background, the cityscape of Detroit looms, its familiar landmarks obscured and muted by the heavy downpour, with only the occasional glow of a distant streetlight or the flashing of a neon sign providing a sense of place and time.
really fn good for a base
back to res for this one
res looks better, but I think that's a seed issue.
welp. ak-47
I tried setting mine to auto/auto and now it's using 2 cpu cores and has been for 5 minutes. just sitting there.
processing.
pixart-sigma / 2x upscaling with sdxl ai creator checkpoint
0.4 denoise
looks great
no category on civitai yet for pixart sigma
try that scheduler of mine for refining/upscaling
i was getting pretty good results with that
granted, i did only try res with the settings from the workflow last night
but setting the multiplier at 0.10 or 0.15 or so was pretty good
even 0.05 did a lot to clean up the van gogh nuke image
the 1.5x upscale with 0.5 denoise seems to always be the sweet spot. actually more prompt following since i did say batmans.
I have to run away for a while, but I'll have a look tomorrow.
yeah, in general, karras at 0.45 or 0.5, and exponential at 0.5
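A small sketch of the upscale arithmetic being discussed, assuming a latent model whose image dimensions should be multiples of 8 and a 1024x1024 base image (both are assumptions, not stated in the chat). `upscaled_size` and `REFINE_PRESETS` are made-up names; the denoise values are the ones mentioned above.

```python
from typing import Dict, Tuple

# Denoise values mentioned in the chat, keyed by scheduler (illustrative table).
REFINE_PRESETS: Dict[str, float] = {
    "karras": 0.45,
    "exponential": 0.5,
}


def upscaled_size(
    width: int, height: int, factor: float, multiple: int = 8
) -> Tuple[int, int]:
    """Scale a resolution by `factor`, snapping each side to a multiple of 8."""

    def snap(value: int) -> int:
        snapped = int(round(value * factor / multiple)) * multiple
        return max(multiple, snapped)

    return snap(width), snap(height)


base = (1024, 1024)  # assumed base resolution
target = upscaled_size(*base, factor=1.5)
for scheduler, denoise in REFINE_PRESETS.items():
    print(f"{scheduler}: upscale {base} -> {target}, denoise {denoise}")
```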
any idea?
looks like red text on a grey background 😄 Just had that as well with Clownshark´s workflow, couldn´t solve it so far like you know 🙂
Dang ... it worked yesterday 😄
comforting...
Robot love
Yeah ... share love! ❤️
moofi you use dream like sd 1.5?
yep, dreamlike photoreal 2.0
not work for me
wdym?
error
Hey guys is there anyone here that i can send a 16:9 photo and they outpaint it to 3440x1440 because i have a amd gpu and that isnt supported by stable diffusion
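For the outpainting request, the padding works out with simple arithmetic: scale the 16:9 image to the target height, then split the leftover width between the two sides. A minimal sketch, assuming a 1920x1080 source (the actual source size isn't given) and a hypothetical helper name `outpaint_padding`:

```python
from typing import Tuple


def outpaint_padding(
    src_w: int, src_h: int, dst_w: int, dst_h: int
) -> Tuple[int, int, int, int]:
    """Return (scaled_width, scaled_height, left_pad, right_pad) for a
    horizontal outpaint that keeps the source centered."""
    # Resize so the source fills the target height exactly.
    scale = dst_h / src_h
    scaled_w = round(src_w * scale)

    extra = dst_w - scaled_w
    if extra < 0:
        raise ValueError("target is narrower than the scaled source")

    # Split the new width evenly; any odd pixel goes to the right side.
    left = extra // 2
    right = extra - left
    return scaled_w, dst_h, left, right


scaled_w, scaled_h, left, right = outpaint_padding(1920, 1080, 3440, 1440)
print(f"resize to {scaled_w}x{scaled_h}, then outpaint {left}px left and {right}px right")
```

So a 16:9 image becomes 2560x1440 first, and the model only has to invent 440 pixels on each side.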
i download dream like model but it give me error
workflow?
haven´t tested in A1111, so I couldn´t tell, it´s working in Easy Diffusion + Comfy though
I don´t think that´s really the reason
my gpu not support something in this model
What GPU do you have?
1050 ti
I had it running on a GTX 1660
is better
nah
but i cant fix it with any arguments
sdxl ai creator checkpoint?
also standard 1.5 pruned model very creative but with more artefacts
you can always use those for input images on SDXL
And do hires fix along
I want to try to restore my sleep using a neural network
pixart-sigma: In a chilling apocalyptic vision, a menacing Flying Spaghetti Monster, an ominous shape with eyes on stalks, looms overhead as a dark cloud against deep-hued, storm-filled skies threatening to unleash a deluge of delicious meatballs and tomato sauce upon the diminutive figures below, its body a writhing tangle of pasta, the entire scene illuminated by an otherworldly light that casts long shadows in this macabre vision of armageddon.
"pleasant" atmosphere, yet I would work on the face a bit 🙂
@nimble mason
try restore my dream but sd igore part promt
Hedge-hog (slightly shape-edited in PS)
i-Gore? 😄
you like pink color)
not in particular, it´s simply the series with the prompt, containing cyan + pink 🙂
Ideogram
my dream tonight:
from the slightly open door of the house you can see a running man, who is being chased by people on the street among the trees of a dark winter park at night, the lights do not shine
they caught him and started cutting him
💀
saw nightmare tonight 😃
Can you recognize him?
@jovial tiger
This isnt true, I can use the bot right now 
This isn't true? Was I following you on twitter?
Im saying its not true that the bot isnt working right now
I never said the bot isn't working.
The tweet does tho
I said it was turned off for some of the original testers, and now it's been opened up to a new set of people
which is true
There might be multiple servers idk
Dude, there's only so many major releases I can handle per day. 🙂
can someone help me to transform this image
into this
tried so much and i just give up at this point
if anyone can replicate those things, im happy to pay
@shut sinew Feel like trying this one out on SD3? This is what it looks like with pixart-sigma. prompt: Cinematic, low-angle shot of a menacing cyborg shark with sleek, metallic body, glowing red eyes, razor-sharp teeth, and advanced technological enhancements, emerging from the dark, murky waters of a neon-lit lagoon, illuminated by vibrant pink, blue, and green hues reflecting off the rippling surface, casting an eerie glow on the shark's gleaming exterior, as terrified people on jet skis, with panic-stricken faces and flailing limbs, desperately attempt to escape the looming threat, their vehicles leaving trails of churning water and neon reflections in their wake, set against a backdrop of a futuristic, dystopian cityscape with towering skyscrapers and flickering holographic advertisements, all witnessed from a dramatic, underwater perspective.
What do you want transformed about it exactly?
the 2nd image is enhanced, a lot of details, bigger boobs, nice face with ADetailer probably, but the details are there in high resolution
doesnt seem to do anything on pixart
middle
Dm’d you
2 years ago ... ... ... 😄
can you help me with the next prompt: a photograph of a creature, with the neck and upper body of a giraffe that is retractable, instead of having hind legs it has a large reptilian tail, standing on its two legs and tail, its habitat is the jungle border in the African savannahs, it is grazing, the sun comes from the upper left side.--ar 2:3
😄
thank you! i know it doesn´t look as much a giraffe but i had to give it a try
Can you help me with the next prompt. It generates a realistic photo about a creature, which is an impressive blend of divine and earthly elements. It has a large humanoid body with imposing musculature, crowned with curved buffalo horns that radiates a sense of power in a majestic enchanted forest. His large, majestic wings gracefully reflect the celestial light as they unfold. Masterfully, in her hands she controls fire and makes flames dance at will in the place where ancient trees and shadows move to the beat of ancestral magic. Her intense and penetrating gaze reveals the wisdom of ancestral beings, embodying strength, magic and majesty -- ar 3:2.
@languid pebble @nimble mason After updating everything for making your file work, I cannot get IPAdapter+ working anymore. This is what it shows in the shell:
they changed the name of the opt 🙂
just click on the weight
