#✨|sdxl
No idea, works fine for me. Start with the example workflow (in the Aura repo), but if that doesn't work it's going to be a local issue.
That's good, but there's something screwing it up in your workflow. Now build up on that basic workflow, until something breaks...or it works 😄
i m confused 🙂
it maaaay be a thing that it doesnt want 1532x1024
yeah, i think it hates not squares
id assume its like most 1024 models where they dont like being pushed too far beyond 1 megapixel. if you want a 2:3 portrait ratio like that, try something like 832x1216
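A rough sketch of that advice: snap whatever aspect ratio you want to the nearest roughly 1 megapixel size. The bucket list below is the commonly cited set of SDXL resolutions, which is an assumption on my part and not something from this chat; `nearest_bucket` is a made-up helper name.

```python
import math
from typing import List, Tuple

# Commonly cited SDXL resolutions, all close to 1 megapixel.
# Assumption: this list is not taken from the chat or from the Aura repo.
SDXL_BUCKETS: List[Tuple[int, int]] = [
    (1024, 1024), (1152, 896), (896, 1152), (1216, 832), (832, 1216),
    (1344, 768), (768, 1344), (1536, 640), (640, 1536),
]

def nearest_bucket(width: int, height: int) -> Tuple[int, int]:
    """Return the bucket whose aspect ratio is closest to width:height."""
    target = math.log(width / height)
    # Compare log ratios so landscape and portrait are treated symmetrically.
    return min(SDXL_BUCKETS, key=lambda wh: abs(math.log(wh[0] / wh[1]) - target))

print(nearest_bucket(2, 3))        # a 2:3 portrait request
print(nearest_bucket(1532, 1024))  # the size that was giving trouble
```

Whether a given model actually behaves better on these exact sizes is something to test per model.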
I'd recommend this node, the sizes should work
comfyroll also has a pretty good one like that as well
Pixart Sigma base + SDXL refiner and bucket upscaler
Definitely overrated - looks like from all the promo pictures "that it's good at text!!!" It Ain't! Join the FAL Discord if need be!
It's an early beta, but my very first image from it was this #✨|sdxl message
I mean it's good - is it exceptional at all?
Is any early beta release?
Good pictures - still not the visual acuity of SD3!
It has potential...but it's probably too large for most
I've been using it online ... its OK, I don't think remarkable ... but that's just me 🥳
I played around with SD3 a lot and tbh just don't like the look most of the time. It has this Crysis screenshot look to it oftentimes
Plus it's only good at landscapes and close up portraits. Going outside those boundaries it's not as good as guided SDXL imo
Someone said that "SD3 The ReMix" is in the making?!
((((SD4))))
I'm into ipiv's AnimateDiff Master of Motion lately ... it's fun!
Master of Motion? Can you link?
Learn how to create morphing animations with ComfyUI! From setup to fine-tuning settings for stunning results. Plus, discover tricks to speed up generation.🚀✨
1️⃣IPIV's Morph Workflow: https://bit.ly/3wvoP04
2️⃣List of Necessary files/mo...
makes sense, yeah i just see that a lot in images around here
it's distinctive and def worse off for it
Oh he changed it "Note
PAG may produce striped "noise", setting sigma_end to 0.7 or higher may reduce striped patterns."
Used to specify sde/a samplers, guess it's all of them
I'll have to keep that in mind, but I don't really use it anymore. I use an attention preset in automaticcfg that also does pag
gotcha
i think the issue mostly manifests when ppl have it cranked naively
it's usually accompanied by a vague blown out look to some parts of the face
anyone know why comfyui keeps giving me mlp generations when using pony v6 xl? (in a1111 im getting humans with same prompt)
hmm i think i figured it out, comfyui uses -2 clip skip
but it didnt change anything 😐
wow really neat
Thanks \o/ Should actually revisit that workflow again
Is Ella a cut above? Is it like Ollama at all?
It gives sd1.5 t5 llm level prompt understanding instead of just clip. I don't know how well it'll work with animatediff but it's worth a shot.
This is ella refined with kolors. Even single subject images benefit because you can use longer llm (gpt4 or Claude) enhanced prompts.
Really? I didn't see anything about this on the page for it
Also, using comfyui, is it possible to set the denoise for different parts of the image to different values? I'm trying to do img2img but I don't want some parts of the image changed as much as others
Also pony does not know what a poncho is
I'm going nuts because pony keeps giving my character glasses even though that's not in the prompt, what the hell?
Yes, you would need to set a mask for each area then chain the nodes so that it batches the masks together and generates for each mask at the different denoise ratios and then combine the images back on the original.
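The last part of that (combining the per-mask results back onto the original) can be sketched in plain Python. This is only the compositing step under simple assumptions: images are small grayscale grids of floats, and each `result` stands in for an img2img pass you already ran at its own denoise value. `composite_by_masks` is a hypothetical helper, not a ComfyUI node.

```python
from typing import List, Sequence, Tuple

Image = List[List[float]]  # grayscale image as rows of floats in [0, 1]

def composite_by_masks(original: Image,
                       passes: Sequence[Tuple[Image, Image]]) -> Image:
    """Paste each (mask, result) pair onto a copy of the original.

    mask values: 1.0 means take the result pixel, 0.0 means keep what is there.
    """
    out = [row[:] for row in original]
    for mask, result in passes:
        for y, row in enumerate(out):
            for x in range(len(row)):
                m = mask[y][x]
                row[x] = m * result[y][x] + (1.0 - m) * row[x]
    return out

original = [[0.0, 0.0]]
low_denoise = ([[1.0, 0.0]], [[0.5, 0.5]])   # left pixel: lightly changed pass
high_denoise = ([[0.0, 1.0]], [[0.9, 0.9]])  # right pixel: heavily changed pass
print(composite_by_masks(original, [low_denoise, high_denoise]))
```

Soft (feathered) masks work the same way, since the blend is linear in the mask value.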
all you have to do is look at the pony prompts
something went wrong lol
guys is there such thing as the ReActor but for short form videos?
totally didn't prompt for that. lol for some reason it likes to double up on dragons
It's a trap. He's hunting vegans.
Lol, hahahaha
Well, Portland has always been a bit weird
this model is refusing to do anime and it's a little disturbing.
that's an interesting effect. the details are nice too
yeah, I was thinking the same
Taste my Gom Jabbar.
A close-up portrait of a puffin with its beak open, holding several fish in its mouth, set against the dark background of ocean depths. The scene is captured from behind and illuminated in the style of studio lighting, highlighting intricate details on both the bird's face and tropical fish scales
Draw an American man and control a puppet dog with strings
A spiritual creature
A puppy
🐉
I started using sharksampler 🙂
also I use this as my latent for everything now
do you know if there is a better one?
I want a latent that makes small details
velvet noise is so spiky
that it kinda works
and I only denoise at 90%
not sure, there's a few different things out there
that one is pretty good
i always denoise to the finish line myself but mileage may vary
I found that it depends on the noise
this is my current one
maybe for this I should go all the way
but some of the other noises for special effect
they lose the special effect if you denoise too much
a certain way of doing pink noise, denoised to 90%, basically adds colourful clouds and smoke and fog
its cool
noise more like this
and it gives you special effects like this
I am pretty sure we still have the sampling 100% wrong
spent the last week reading some books about solving ODEs
I am finding that DPM++ 2M at 150 steps
is easily losing to even just Bosh3
and DPM++ 2M is normally one of the best
SD3 seems different
yeah
ODEs kinda level off in gains pretty early
those high step counts are where SDEs shine
I'm worried that SD3 is stiffer than we thought
and so we should all actually be on implicit methods
there is an old method where
each step it switches between explicit linear multistep and backward differentiation formula
depending on the local stiffness
I think we need that
interesting, what's it called
LSODA
the explicit linear multistep in this case is Adams
there is a fourth order implicit Adams in the ODE sampler node
I think this is probably the best we currently have that I can find
but it will be slow because it can't switch
yeah it might have been brought somewhere
ok so I just ran RK4 to the same tolerance as Bosh3 and it performed much worse
this is very bad news because
that means stiffness
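For anyone wondering what stiffness does to an explicit solver, here is a toy illustration. It uses the linear test equation y' = -lam * y, which is an assumption chosen for simplicity and not the diffusion ODE; backward Euler stands in for the implicit (BDF-style) side since it is the first-order BDF method.

```python
def explicit_euler(lam: float, h: float, steps: int, y0: float = 1.0) -> float:
    """Integrate y' = -lam * y with forward (explicit) Euler."""
    y = y0
    for _ in range(steps):
        y = y + h * (-lam * y)
    return y

def implicit_euler(lam: float, h: float, steps: int, y0: float = 1.0) -> float:
    """Integrate y' = -lam * y with backward (implicit) Euler.

    The implicit update y_next = y + h * (-lam * y_next) can be solved
    in closed form for this linear ODE: y_next = y / (1 + h * lam).
    """
    y = y0
    for _ in range(steps):
        y = y / (1.0 + h * lam)
    return y

# The true solution decays to zero. With h * lam = 2.5 the explicit
# method oscillates and grows, while the implicit one decays.
LAM, H, STEPS = 50.0, 0.05, 20
print("explicit:", explicit_euler(LAM, H, STEPS))
print("implicit:", implicit_euler(LAM, H, STEPS))
```

An explicit method only stays stable here if the step is small enough, which is the cost that a switching solver like LSODA tries to avoid by moving to an implicit formula when the problem gets stiff.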
No, it's the best trade off between time and quality. That's why it's one of the most commonly used
For actual higher quality, SDE samplers tend to be superior most of the time, but the trade off is that they are slower
oh I agree
my preference for SDXL is to split the sigmas and at least the first bit is done with SDE
then ODE to finish
sadly no SDE on SD3 though
As long as they use the same noise schedule, it's usually fine to do that. Just watch out for samplers that do different things on the first and last step
Some samplers do an euler last step iirc
But yeah sd3 is very different
So you can't use all the same samplers with it
if I used less steps then splitting the sigmas would be really bad
because the sampler has to start from scratch
but since I like to generate at like 300 steps or more
the switching cost is relatively small
TBH I have to slowly go through the code of everything in my workflow
cos I don't know exactly what each part is doing
like yeah doing euler or heun on the last 1-2 steps
I've seen that
its hard to know exactly what's going on
You can also do stuff like having an overlap between samplers. Like let's say you're doing 100 steps. Sampler A might go to 50/100, but sampler B can always start at 40 and go the rest of the way to 100. Those ten steps of overlap can sometimes smooth things out if you're shifting sampling methods
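A minimal sketch of that overlap idea as list slicing on a sigma schedule. The schedule here is a placeholder linear ramp, and `split_with_overlap` is a hypothetical helper; how sampler B gets back up to the noise level at step 40 (for example by re-noising A's output) is left out.

```python
from typing import List, Tuple

def split_with_overlap(sigmas: List[float], switch_step: int,
                       overlap: int) -> Tuple[List[float], List[float]]:
    """Split a schedule so sampler B repeats the last `overlap` steps of A.

    `sigmas` has one more entry than the number of steps, because
    step i moves from sigmas[i] to sigmas[i + 1].
    """
    if not 0 <= overlap <= switch_step < len(sigmas):
        raise ValueError("need 0 <= overlap <= switch_step < len(sigmas)")
    first = sigmas[: switch_step + 1]         # steps 0 .. switch_step
    second = sigmas[switch_step - overlap:]   # from (switch_step - overlap) to the end
    return first, second

# Placeholder schedule for 100 steps; real schedules are not linear.
schedule = [1.0 - i / 100 for i in range(101)]
a, b = split_with_overlap(schedule, switch_step=50, overlap=10)
print(len(a) - 1, "steps for sampler A;", len(b) - 1, "steps for sampler B")
```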
you can just change the noise scaling/eta in most cases as you go
that's what i did with the "clownsampler" node and a few others... you can schedule eta to be an arbitrary value at each step
I released V4 of Mangled Merge XL if anyone would like to mess around with it. 🙂
https://civitai.com/models/447902/mangled-merge-xl?modelVersionId=645241
Still can't seem to get a horsefly on the back of a horse...
Makes for some interesting horses though
Juuuust over the line with the 2D/Realism bias.
fruit syrup
thanks this is a great tip
gonna try this
I have to learn how momentum sampling works
because momentum sampling lets the second sampler carry on from the previous one's work
it might not work on SD3 though (might be random elements)
I bet those headless dinosaurs can't wait until evolution kicks in for them
Yeah it can help a lot in some scenarios because the second sampler doesn't have to have the same seed even and the overlap helps blend any jarring composition changes if the sampler switch happens at a low percentage of the way through diffusion. Usually most primary forms are all blocked out by ~30% of the way through though. The momentumized samplers are pretty good as well like RES, but yeah, sd3 doesn't really like anything that does any sort of ancestral/SDE sampling
yeah this sounds good
I don't wanna spend much more time optimising sampling
because what I have found is
at longer steps the difference between methods sort of goes away
since I use high step count I don't have to optimise sigmas as much either
I will try Clownshark's Clownsampler as well because that lets you dynamically switch from SDE solver to ODE solver LOL
I already use his Sharksampler 😄
Sharksampler is coloured noise injection
although I tend to do it manually because I split the sigmas so much, I figured I may as well inject some coloured noise before next sigma starts
white cat
that's only gonna matter with anything that has each step depend on the last one somehow (ancestral etc)
overlapped steps can absolutely smooth things out - i've got another little hack that can do that in another way, the "eulers_mom" option which basically uses the euler method to project removing a bit more noise at each step
if you want more detail, skipping a step somewhere can be pretty interesting
terrifying
I know, right?! 🤣
anyone have a good solid IP adapter + controlnet workflow that yields good results for combining 2 images to make something new while maintaining some elements from both original images?
Tried in kolors
Nice! Haven't played much with the new models but the fact that it can do that has me interested. Sadly I'm only running a 3060 with 8g vram so i'm limited. Can it run on 8g?
No idea, i use a cloudgpu, it has a nasty llm that's huge https://github.com/kijai/ComfyUI-KwaiKolorsWrapper https://github.com/MinusZoneAI/ComfyUI-Kolors-MZ
Damn, 16g. Of course it can make a horsefly on a horse. LOL... but it didn't get the macro photograph or size factor 😆 . I'm sure with the right prompt it would be possible though.
I kinda like the ability to make really weird looking creatures though.
Didn't prompt for that 😉 But i love weird mixes too, sdxl is best for that, mashing the embedding weirdly together is lost in these newer models :/
Yeah easily with comfyui. I run it just fine on a 2080 with 8gb vram
I might look into it. I wonder if anyone is training on it. At this point it seems like there are so many options
Just make sure to use the quantized 8bit text encoder
The comfyui wrapper addon has instructions on how to use the Q8 text encoder
how does it compare to pixart? I know a lot of people are training on that architecture now.
Visually? Far better than pixart sigma. Prompt adherence? About the same, but kolors does seem to have some concept bleeding if you make things too complex. Usually with colors, like you name a bunch of things in a scene and describe more than a few colors in it
At least that's what I've noticed
But all models will break like that if you go too crazy on specific details
o_o
looks like comfyanon added in a node for the union model to pick which mode you want. hell yeah
before, people could get results a lot of the time, but it wasn't perfect since the cnet was missing the vital extra information tensor responsible for picking the mode. probably was just defaulting to auto internally, but that might bug out in different circumstances
seems to work alright, enjoy the android fish scientist waifu 🤣
send workflow fren!
@hoary saddle oh it's just a standard workflow. i just took a couple of images i made in the past and ran them through depthanythingv2 and anyline, to demo using two different union cnet modes at the same time. the only thing you need to do is put the setunioncontrolnettype nodes in between and pick which mode you want it in
managed to somewhat duplicate it 🙂 tx
confrontation between Patrick Bateman and Anton Chigurh, manga style, scribble art, black and white, serious, showdown, japanese text in the middle saying the final showdown, angry, screaming, furious, final boss fight, samurai in the background, dead bodies in the background, in the style of Yoji Shinkawa
Not sure we want "confrontation" 🤔
controlnet union (auto)
Native Americans seeing Columbus's 3 ships arriving for the first time
draw: al aqsa mosque
hey
Future Photo of John Paul Jones Memorial, with a new bronze donald trump statue addition on top.
((best image quality、8K、masterpiece:1.3))、1 girl、smile、Full body Esbian、slim face、cute woman、(dark brown hair)、super detailed face、fine eyes、double eyelid、blurred background、slim face、in the hustle and bustle of the city_Super long brown hair、big breasts、full body shot、((sitting on a chair, night view terrace background:1.3)),Hair with loose waves on the inside、Wearing a satin blouse, ultra-thin fabric、smooth、See-through fabric:1.3、gold necklace_Larger earrings、((shy smile:1.3)),full body shot、Standing gracefully,
That's a bizarre way to type commas, my foreign friend
Thanks 🙂
how did you get so much detail
I have a couple of detailer/upscaler workflows
One for t2i and one for i2i
These are t2i
They're on my Patreon
ok thanks
I will join and check it out some time
I haven't gone through the different patreons yet that post workflows
That'd be great! It's only in the "Creators Lounge" tier, so choose the right one 🙂
Wondering how 2000 piece jigsaws may look 😄
LOL I spent the last week making star wars images but these are so much better
the best I could do so far was this
I forgot how I made it, maybe kolors
I haven't learnt yet how to do massive detail
It was with kolors, I remember you posting it...possibly in another Discord 🙂
ah thanks
I'm sure kolors just stole midjourney images etc
but they give that style very well
maybe even a bit too much
also he's meant to be firing a laser instead of looking at a brass pipe
A bit less "crazy"
yeah a bit less is good
Anyone played with DMD2 sdxl? So far it's insane oO
What is it?
It runs at around 1fps at 1024x1024
And quality is completely on par with other sdxl models
Tbh one upscales raw sdxl outputs anyway, and so far I don't see a difference in quality
It's mad with auto queue, it's basically realtime
8.35 FID though
Sorry, but what does that mean? 😛
ah its a measure of image score don't worry
this does look cool I will try it
I am mostly the opposite kind of guy though I aim for 30-60 minute generation time instead of 1 second
the other end of the quality-speed trade off
I know what you mean, I come from VFX, sometimes one frame can take a day depending on what one does. I don't mind long gen times as long as it's worth it
But since I upscale and post pro normal sdxl images anyway, I don't really care about being pixel perfect on the initial image
When I can just iterate in realtime, it's kinda cool
yeah its not actually worth it yet
my workflow needs more developing
Installed it on my laptop with a 3080Ti, still crazy fast oO
wow that looks nice
Are you talking about DmD2?
Yep
I mean it works with all the loras and normal stuff, but iterating prompts in realtime while still having a very good base is kinda cool
Just started playing with it.
For a first pass, it works remarkably well, and the times are record-breaking
Absolutely! My way of thought, I run all standard gens through upscalers and buckets anyway, so having this is very nice
These look very good. I suppose that is not a basic workflow?
It's a basic workflow, now I am applying upscale with SDXL Lightning, but those images are a batch of 6, with the basic flow
Cool. I will figure it out... This is a SDXL mix still with 4 steps
- sdxl upscale (x1.5)
😛
Sorry, rookie question. Do you feed the latent into a second ksampler that is connected to an SDXL model? How much denoise?
@high spear Thank you so much. I'm trying on my own... But failing, that's why I ask. See what is happening. Like it's jiggling the image...
Reminds me of the random architecture in dreams and your brain will be like "this is completely normal"
Haha true : D
Still really cool though, I like that pic
Sometimes can't resist to break stuff, just love abstract high frequency stuff for a change to "normal" images
Yeah I feel you there, but I'm a big fan of surrealism and absurdism. Normal stuff is too boring
what are you doing at work today.......
must be bored hah
curse you Comfy and your fun tool
DMD2 - 2688x1536 ~ fast
What model is this? so many objects and particles! epic! 👏
Which one of these did you download bro?
This is much more about the workflow than the model.
its nothing new, but still good. lightning merges are better imo and are similar speeds
finally managed to get inpainting to work with the new union promax controlnet. i guess it just wants the image to actually have black where it needs to be inpainted. also, it doesn't work with gradients, so no differential diffusion, but from my testing, it's right on par with it
but for real, it's pretty f'in rock solid in my limited testing so far
well follow what i did here or manually paint black in mspaint and import that
so far I do all my inpainting with SEGS detailer
which has the downsides of YOLO's limitations
it is too censored to know stormtrooper or gun/blaster/laser
the one drawback im seeing though for union promax vs differential diffusion (all i use, deleted all my inpaint models because they aren't needed with diff-diff) is that it's image only really and doesn't seem to operate with regular latent masking. so it will be great for taking already made images and inpainting, but if you're trying to do inpaints in latent space between ksamplers, it's useless, unless you want a bunch of vae artifacts from constantly encoding and decoding
didn't know people inpainted latents
thats the only real way to do it on complex workflows. encode->decode is a lossy process
if it's a presaved image, you only want to vae encode once. if you're generating the image from scratch and plan on inpainting, keep it in latent space and only decode at the very end
hmm yeah thanks
I am definitely losing too much quality
I also kinda learnt today that clownshark-style noise injections lower your quality a fair bit
for photorealistic
👋 .
Oh and another thing about what I said: you can still pull off vae decoded previews at various inpainting steps to see how it's going, but when going from ksampler to ksampler, you want to stay in latent space
There's a latent bridge node that works really well for doing previews and masking between stages, it saves you a lot of the hassle
Forget which common pack it's in
Probably impact or inspire or comfyroll, can't remember but I'll check
its ok i have seen that one before
👋 .
Is SDXL strictly for generating at a minimum of 1024, or can it do it lower resolutions just as effectively?
I use everytime 768/768 .If I like a picture, I increase the resolution
I'm not sure if it depends on the model but I've generated some decent stuff at 64x64 testing for pixel art.
Really nice! Would probably be $$$ with something like https://civitai.com/models/229213/extremely-detailed-no-trigger-slidersntcaixyz
v2.0 Fixed negative. Range -1 to 1. No trigger required Fork this and train your own slider at https://sliders.ntcai.xyz/sliders/app/loras/e3d5252c...
UltraPixel
Thanks bro, it worked!
Set/Get and Bypass nodes make workflows so much more organized. Dream come true for OCD.
yes, as I hate noodles with a passion.
they drive me insane and make it so hard to track anything down
Though I use everythinganywhere as a lot of the time I could not get those to see each other.
What can I be doing differently?
Replacing the woman with Willem Dafoe
No I mean generation wise. This prompt is random from some prompt generator, and I know its long, but why does it just make gibberish in the form of a photo
nightmare for those that work solely in the API 🙂 way too many lines of python
but it does look much nicer
Yeah, plus it causes massive headaches for tracking shit down when you have complex workflows. It's good for beginners and simple workflows though
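For context on the API complaint: a workflow in API format is a flat dict of nodes keyed by id, so every extra routing node is another entry to carry around. A hedged sketch follows, modelled on the example scripts that ship with ComfyUI as far as I remember them; the node ids, values and the `build_prompt_request` helper are made up, and the default address 127.0.0.1:8188 is an assumption about your setup.

```python
import json
import urllib.request

def build_prompt_request(workflow: dict,
                         server: str = "127.0.0.1:8188") -> urllib.request.Request:
    """Wrap an API-format workflow in a POST request for the /prompt route."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        f"http://{server}/prompt",
        data=body,  # supplying data makes urllib send a POST
        headers={"Content-Type": "application/json"},
    )

# A single node in API format. Links such as ["4", 0] mean "output 0 of
# node 4"; those other nodes are omitted, so this is not a complete graph.
workflow = {
    "3": {
        "class_type": "KSampler",
        "inputs": {
            "seed": 1, "steps": 20, "cfg": 7.0,
            "sampler_name": "euler", "scheduler": "normal", "denoise": 1.0,
            "model": ["4", 0], "positive": ["6", 0],
            "negative": ["7", 0], "latent_image": ["5", 0],
        },
    },
}

request = build_prompt_request(workflow)
print(request.get_method(), request.full_url)
# To actually queue it against a running server:
# urllib.request.urlopen(request)
```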
I had chatgpt shorten the prompt significantly and I think the results are better
That was after 6 failed attempts where the woman grew triple the amount of arms
Well that's the basic workflow that can get images to completion and get you 99% of the way to what you want. You can inject controlnets between the prompts and ksampler if you want more control over composition. Or ipadapter.
Other than that, Google and YouTube will be your best friends for learning
But I want it to be fast. I have an rtx 3060 so I thought everything would've been super quick anyways non-turbo wise, but it wasn't so I've been using this and if I give more than 6 words, it just freaks out
Ok apparently if you remove illustration it makes them naked
I don't like that I get horrible anatomy but then make a typo "hat in cat" and get this masterpiece
LOOK AT THIS
look, i don't want to be that asshat, but there are hundreds of people that have already spent exorbitant amounts of time creating tutorials for beginners. this isn't something that can easily be explained in a few replies. i highly recommend you do some basic googling and youtubing.
I have, I've watched so many tutorials and use the exact same settings and get these weird photos generated
well in those videos, they'd also likely suggest using different models other than base models
I tried before and got really weird things with their recommended settings
But can you explain why there's such a huge quality gap between those two images I just sent? Even at the same length of prompt and formatting, they are wildly different
if you want speed, look into lightning or hyper models. they are 1024x1024 based models and will run just fine with 8gb of vram. the 8 step ones are the best tradeoff between speed and quality
I haven't tried it in the api. Time to test it out and see what you mean.
I'll look into it I guess
https://huggingface.co/ByteDance/Hyper-SD is a good start
you can use the lora with other sdxl models
no, it's a lora that augments a model. https://civitai.com/models/139562/realvisxl-v40?modelVersionId=361593 is a really good model as well. and i'd avoid 1 step loras and models, they are almost always garbage
I have that installed but I don't like how slow it is
are you really using the settings from that screenshot?
Me?
well learn delayed gratification then or get on meds for ADHD
But how come it's so slow on a 3060? I saw someone say either it was a similar card but with like 6gb of vram or it was like a 20 something and they did it so fast
With like 30+ steps too
oh yeah forgot they had a lightning version of it
"Use Lightning models with DPM++ SDE Karras / DPM++ SDE sampler, 4-6 steps and CFG Scale 1-2"
These are the recommended lightning settings. dpmpp_sde karras with 1-2.5 cfg and then 4-10 steps
Oh that's turbo
read what i just said
No I read that but I wrote the turbo one to you by accident
But I'm still within both
and since it's a lightning based model, change your resolution to 1024x1024
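Those recommendations can be written down as a quick sanity check. The step and CFG ranges are the ones quoted just above; the "well below 1024x1024" threshold of half a megapixel and the helper name `check_lightning_settings` are my own assumptions.

```python
from typing import List

# Ranges quoted in the chat for Lightning models; not an official spec.
STEP_RANGE = (4, 10)
CFG_RANGE = (1.0, 2.5)
NATIVE_PIXELS = 1024 * 1024

def check_lightning_settings(steps: int, cfg: float,
                             width: int, height: int) -> List[str]:
    """Return a list of warnings; an empty list means nothing looks off."""
    warnings = []
    if not STEP_RANGE[0] <= steps <= STEP_RANGE[1]:
        warnings.append(f"steps={steps} is outside {STEP_RANGE[0]}-{STEP_RANGE[1]}")
    if not CFG_RANGE[0] <= cfg <= CFG_RANGE[1]:
        warnings.append(f"cfg={cfg} is outside {CFG_RANGE[0]}-{CFG_RANGE[1]}")
    # Assumed threshold: flag anything under half of the native pixel count.
    if width * height < NATIVE_PIXELS // 2:
        warnings.append(f"{width}x{height} is far below 1024x1024")
    return warnings

print(check_lightning_settings(steps=30, cfg=7.0, width=512, height=512))
print(check_lightning_settings(steps=6, cfg=1.5, width=1024, height=1024))
```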
Why it so bad lol
youre using 512x512
HOLY
Wait do I make seed fixed and then just add something new to the prompt and it stays consistent?
fixed seed yes
I dont think it really did but still impressive
every seed will be different
would recommend getting off lightning models
unless it really has to be fast
it is pretty ugly in api format. Good thing I set my api up *before * I did all that. 🙂
It does trust
I am a very impatient person when it comes to 1AM ai image generation
ultrapixel is scary to install, damn it's huge with lots of dependencies
wasn't sure comfy was gonna restart or my gfx card drivers still work
one of the top rated models is a lightning model https://imgsys.org/
it's second only to realvisxl 4.0
Ok I changed it back to blue sweater and it looks like the first girl again
holy crap tho, ultrapixel does make nice results on the control net
Yo whats the prompt to the 2nd image smuz
I wanna see how this model makes that
Wish I could reverse google search this to find my future gf 😭
well youre not going to find a future partner sitting around making virtual waifus all day...
No I also make dogs and cats
Anyway
Why this?
Little random error
I made seed +1 and it works, does the seed like run out of uses?
same seed, it doesn't create it again because it's already created
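A toy model of what is being described, assuming the behaviour is simply "same inputs, so the cached result is reused". Nothing here is ComfyUI code; `generate` is a stand-in that produces seeded noise instead of an image.

```python
import random
from typing import Dict, List, Tuple

_cache: Dict[Tuple[str, int], List[float]] = {}
runs = 0  # counts how many times the "expensive" generation really ran

def generate(prompt: str, seed: int) -> List[float]:
    """Toy stand-in for image generation: seeded noise, cached by its inputs."""
    global runs
    key = (prompt, seed)
    if key not in _cache:
        runs += 1
        rng = random.Random(seed)  # same seed gives the same starting noise
        _cache[key] = [rng.gauss(0.0, 1.0) for _ in range(4)]
    return _cache[key]

first = generate("blue sweater", seed=42)
again = generate("blue sweater", seed=42)  # nothing changed, so nothing re-runs
other = generate("blue sweater", seed=43)  # new seed, so it runs again
print("real runs:", runs, "| same result reused:", first == again)
```

So a seed never runs out of uses; queueing identical inputs just has nothing new to compute.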
Ok another question get ready
set the seed to random
What nodes do I add to upscale my images?
yes but this is testing just a prompt on its own
for a big comfy workflow lightning/hyper/turbo don't really work that well because they have that really limited useable CFG range
the issue is there is no room for IP adapters or control nets to work
THATS SO COOL
true, i don't like working with models that have strict 1-2cfg ranges. also, yeah, cnets/ipa become extremely difficult to use.
I don't know the science of this but they also break noise injection effects
well a lot of things are based on step counts
the SD3 2B also doesn't like noise injection
but I hope the final SD3 8B will
when you only have like 4 steps, that means each step is walking through MASSIVE amounts of the noise schedule
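To put rough numbers on that, here is a sketch using the Karras et al. schedule formula. The sigma range (0.03 to 14.6) is an assumed, typical-looking range and not taken from any specific model in this chat, and the final step down to sigma = 0 is left out.

```python
from typing import List

def karras_sigmas(n: int, sigma_min: float = 0.03, sigma_max: float = 14.6,
                  rho: float = 7.0) -> List[float]:
    """n noise levels from sigma_max down to sigma_min (Karras et al. spacing)."""
    hi = sigma_max ** (1.0 / rho)
    lo = sigma_min ** (1.0 / rho)
    return [(hi + i / (n - 1) * (lo - hi)) ** rho for i in range(n)]

def largest_step_ratio(sigmas: List[float]) -> float:
    """Biggest factor by which sigma shrinks in a single step."""
    return max(a / b for a, b in zip(sigmas, sigmas[1:]))

for n in (4, 30):
    ratio = largest_step_ratio(karras_sigmas(n))
    print(f"{n} noise levels: largest single-step sigma ratio = {ratio:.2f}")
```

With only a handful of levels, one step has to cover an order of magnitude of noise, which fits the observation that anything tuned per step behaves differently on few-step models.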
but speed models have their time and place. they are fun for prototyping and then resampling with a regular model using much higher step counts
yeah some people just have the opposite goal to me
and want a decent image in a few seconds
I am looking more for a workflow that makes the image over the course of like an hour
waste of power
it is yeah
there are really strong diminishing returns on things because the actual model precision only goes so far
and each pixel still only ends up with 256 different potential values per RGB channel
for a single pass yeah
but within that hour
you can fit like 3-4 additional passes with refiners
and then 2-3 upscalers
i mean whatever floats your boat and all, but still, it's a waste of power and time. that same hour could be spent creating dozens of other potential cool things
@tawdry current I followed one of the tutorials, took a little long to figure out how to get the manager thing, but I did and now I get good upscaled images.
Huge improvement between these two as well
Thats screenshotted when I zoomed in, not full image btw
Besides the # of toes and fingers, this looks awesome
Yo Im actually addicted to this
Yo last question
If I was to do this hypothetically in comfyui or another local web app for sd, could I just install everything, then turn off internet and run it to save battery in like a car setting?
you can turn off external internet but you need internal network
ComfyUI is a web UI
so it uses a local host address
if you use something like Diffusers then you can avoid that
Can you dumb that down
I tried asking chatgpt but it started contradicting itself and then said something about a tomato I don't know. I think that it needs a connection to just run the stuff locally as like a website kinda ui, but it doesn't actually need internet
No I got no idea
This is scary cause of how real it is
Bruh... Wi-Fi usage on a laptop would be like <1% of the power being used to generate the images. I don't think they are even a single watt and it wouldn't surprise me if they were in the 100 milliwatt range. Basically, it would be the least of your concerns for saving battery
Well the question was more like aimed towards could I use this in a car
With a bad connection or no connection at all
The answer is yes
So then how does the whole 127.0.0.1 thing work without internet?
You can test it right now by putting your laptop into airplane mode
That IP is your localhost
Aka your PCs own virtual self address to itself
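If you want to see that for yourself, this small standard-library sketch starts a throwaway server on 127.0.0.1 and talks to it from the same machine. It has nothing to do with ComfyUI itself; the port is picked by the OS and `loopback_roundtrip` is just a demo helper. It should behave the same in airplane mode, because loopback traffic never leaves the machine.

```python
import socket
import threading

def loopback_roundtrip(message: bytes) -> bytes:
    """Send bytes to a local echo server over 127.0.0.1 and return the reply."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    def serve_once() -> None:
        conn, _ = server.accept()
        with conn:
            conn.sendall(conn.recv(1024))  # echo back whatever arrived

    worker = threading.Thread(target=serve_once)
    worker.start()
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(message)
        reply = client.recv(1024)
    worker.join()
    server.close()
    return reply

print(loopback_roundtrip(b"hello from the same machine"))
```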
\o/
Create a vibrant and detailed festival poster with the following elements:
1. Background
- Dominant color: Deep red with a subtle pattern resembling woven straw or traditional textile.
- Large bold text at the top: "FESTIVAL" in dark red with a slight shadow effect for depth.
2. Main Character
- Central figure: A smiling farmer wearing a traditional Vietnamese conical hat (nón lá) and an orange-brown traditional outfit.
- Position: Slightly tilted forward with arms crossed, holding a bundle of golden rice stalks.
3. Supporting Elements
- Surrounding the main character: Abundant golden rice stalks with detailed grains and leaves.
- Green swallows with dynamic wings, positioned to the left and right of the character.
- Geometric shapes: A bar chart with upward trending green bars and a cogwheel, both integrated into the design around the character’s feet.
4. Text and Labels
5. Details and Style
- Art style: Vibrant and semi-realistic with a touch of traditional Vietnamese art.
- Colors: Rich, warm tones with bright highlights to emphasize the festive and agricultural theme.
- Lighting: Soft, natural lighting to enhance the realism of the illustration.
- Texture: Slight grain to simulate a printed poster.
6. Additional Elements
- Behind the character: A stylized mountain range in green and yellow hues, blending into the background.
- Foreground elements: Parts of a gear and a foot wearing a traditional Vietnamese slipper, partially covered by rice stalks.
Anyone have any recommendations for some models and/or workflows for outpainting with SDXL?
Looks wonderful any plans to release it?
ultrapixel is a cool project
Hey hey people!
I wanted to generate some landscape images with SDXL, and so I found a nice picture on civitai and tried to replicate it as a base to go from, so I loaded the metadata from it into SD WebUI, downloaded all the models and LORAs used, but still mine looks horrible in comparison. I didn't expect 1:1 reproducibility, but it doesn't even look like the same style at all. Any idea what is wrong?
Theirs:
https://civitai.com/images/11496743
Mine:
/HelloWorld
Maybe.
It was supposed to be pencil artwork. Not eating pencil...
If you have at least a 12GB GPU, yes.
welp it freezes for me on stage C
i guess back to Kolors
too bad coz it doesnt max out my vram
it just craps up
Hand and nails!
That is indeed an excellent hand.
It's a Kolors hand
Kolors is very good AND FAST. It's the SD3 we never got but deserved.
Yeah, looks like it but there's still hope
Real open source to the rescue!
Kolors is so distinctive
I had a feeling it was Kolors as soon as I saw it
its kinda cool to have a model that goes in a different direction
Forgive me...
Say hi to Reimi.
old sd15 generation on the left. result of being passed through my 16 channel unet on the right
runtime about 3-4 sec
You gave them souls.
Hey, not sure where to ask. What would be the best approach to try to apply a "pixel art" effect to a real image? I have a portrait picture of myself and I want to make it "pixel art".
I am trying with Pixel Art XL Lora (https://civitai.com/models/120096/pixel-art-xl?modelVersionId=135931) but I am getting very poor results. I am doing img2img with 0.2/0.3 denoising strength, prompt including only the lora activator (<lora:pixel-art-xl-v1.1:1>). I have tested multiple combinations of sampling methods, sampling steps and also cfg scale. But I get either completely different faces or just the same picture with some "pixels" (like a super-high quality pixel art).
Any tips/suggestions on how to get this? Ideally I am looking for something like this: https://i.kickstarter.com/assets/044/139/979/c10a1e2f4769063ae87b79167d31ed4b_original.png?fit=scale-down&origin=ugc&width=680&sig=NLWqti%2Fxyd3wfDxLvB7NrKCqOElhzz2xwilywW9wKH0%3D
I haven't tried to do pixel art myself, but you may need better prompting in addition to what you have, to invoke the full pixel art style of the base model as well as the Lora. Stuff like 8-bit etc
Most of these models have pixel art training even without a Lora
well, the thing is I want to create pixel art from an existing image. I know how I can create pixel art from a prompt. Just not sure if (using SD) I can "apply" a style like pixel art to an existing image (using img2img).
Img2img needs just as much prompt coercion as an image from scratch.
Just make sure to only include adjectives about the style and not about the subject
Kolors does a good job with pixel art with 8-bit and 'pixel art' as well.
if you train the model beforehand on your face then it should be simple
Unfortunately I don't want it only for myself. It's more like a feature I want to provide to my users. So they can "pixelize" their profile pictures.
you can try IpAdapter but for real person portraits I never found it accurate enough
A peaceful forest morning with sunlight filtering through the trees, illuminating a young girl. Her hair shines brightly under the gentle morning light, creating a sense of tranquility and beauty in the scene. --version 6 --quality 1 --chaos 0 --stylize 100 --aspect 3:4
Nice, what model?
No idea! 🤣
I've been changing model with almost every image, sorry.
Haha all good : D
Tbh I rarely play with new models. The last one I tried is https://civitai.com/models/432244/forrealxl
Which is very decent, but have to give it more time
a cat
does anyone know why this would be happening? the colors are way too neon...
That's usually a cfg related issue, but there are a ton of other factors that can cause that.
Looks a little deep fried, can be because your steps or cfg numbers are off.
I'm a little more familiar with 1.5, but you generally want your steps between 15 and 60 and your cfg between 5 and 12, ideally somewhere in the middle of those ranges. A number that is too low or too high can cause a deep fried output.
hi would it be possible to run sdxl on an rtx 3060m? (6gb)
Yes should be possible might be a tad slow. I was using a 2060 mobile for a long while
Sdxl seems to put out the same kind of image every single time. is there a way to have it change things up without having to change the prompt every single time? im using styles but still get the same thing: same angle, same expression, with slight differences. help plz.
Hi. I'm doing some tests involving image to image workflows in ComfyUI where I want to generate urban architectural images and I keep having this issue where I get really splotchy or "dirty" looking textures, especially on wall surfaces that are supposed to be reasonably clean. Does anyone of you have some insight as to why this happens that you can share? Or can someone point me to some tutorial/resources where I can investigate how to improve my setups in this regard? Help is appreciated! 🙂
Part one, there's a positive
Truthfully I just need to merge like 4 of these

Big Cat eating burger
Can anyone tell me which version ControlNet has evolved to by now? And how do I use it together with SDXL?
Thanks!
Reminds me of King Tut and Cat Woman from the 1966 Batman show, except Tut is wearing an Egyptian batman costume.
👋 .
#1237459938901491852 there is a waterfall in the middle of a Blue green to pink gradient landscape, volumetric lighting,beautiful fantasy cave scene,fantasy matte painting,cute,beautiful fractal ice background,cascading iridescent waterfalls,pillars of ice background,floating waterfalls,dream scenery art,ice sculpture,cotton candy trees,pink waterfalls,mystical fantasy landscape,neon light and fantasy,fantasy landscape,3 d virtual landscape painting,frozen waterfall,photoshop water art,fantasy beautiful,crystal forest,lost in a dreamy fairy landscape,colorful otherworldly trees,beautiful fantasy painting,fantasy realm,matte fantasy painting,aqua volumetric lights,epic dreamlike fantasy landscape,volumetric lighting. fantasy,detailed matte fantasy painting,fantasy landscape painting,3d landscape
Custom LUTs + 12 exposure step tonemapping
the contrast is really high, looks good
new toy
New audio-reactive geometry system!
[Architecture - Archaic / Ancient - AI]
TouchDesigner audio-reactive geometry system + SD/WP parameter configuration files [3]
Music by @welovephoenix
You can access these project files, plus many more experiments, tutorials, and systems, through: https://linktr.ee/uisato
0:00 – Renaissance
0:27 – Ancient Greek
0:54 – Ancient Egypt
#touchdesigner #animation #stablediffusion
#1237459938901491852 taiga
1 year of sdxl
Guys, what is the best optimizer with LoRA creation?
AdamW8bit, Adafactor, Dadaptation?
I recently read some papers on this
it doesn't seem to matter too much
man, It's a nightmare, I've tried so many times to get a good result...
Can I share with you my dataset and profile.json (Kohya_ss)?
If you could please take a look and tell me if I've done something wrong, I'm trying my best to follow good advice on the net, but nobody agrees with anybody.
I'm trying to train a pony model, but it screws up every time; the results are not clean even though it was trained at 1024²...
Look :
is there a lora or model for ultrawide?
those look really damn nice!
made a fancy lora for DND Elementals :3
you trained it yourself?
yeah
Thanks to DoRA I can do a bit more complex things without the need for 2k datasets
Do you train with Kohya ss locally?
@boreal bough
Until a few hours ago, I was using a train profile I'd made from data I'd found here and there, but the results were... Questionable.
I've just taken this profile from Civit.ai, it looks much better, what do you think of these settings? I'd like to have the opinion of someone who makes good LoRA.
yeah. until OneTrainer finishes the DoRA implementation I'm currently stuck on kohya
are you on super low vram?
cause your setup looks like it
you're making a lot of tradeoffs to save vram
bucket_reso_steps = 64
cache_latents = true
cache_latents_to_disk = true
caption_dropout_rate = 0.3
caption_extension = ".txt"
clip_skip = 2
dynamo_backend = "no"
enable_bucket = true
epoch = 200
full_bf16 = true
gradient_accumulation_steps = 1
gradient_checkpointing = true
huber_c = 0.1
huber_schedule = "snr"
keep_tokens = 1
learning_rate = 0.0008
loss_type = "l2"
lr_scheduler = "constant"
lr_scheduler_args = []
lr_scheduler_num_cycles = 1
lr_scheduler_power = 1
max_bucket_reso = 2048
max_data_loader_n_workers = 0
max_grad_norm = 1
max_timestep = 900
max_token_length = 225
max_train_epochs = 500
max_train_steps = 6425
mem_eff_attn = true
metadata_author = "Caith"
metadata_description = "DoRA Trained on 257 images. "
metadata_title = "Fairy (for Autism Pony)"
min_bucket_reso = 512
min_snr_gamma = 5
min_timestep = 100
mixed_precision = "bf16"
network_alpha = 1
network_args = []
network_dim = 32
network_module = "networks.lora_fa"
no_half_vae = true
noise_offset = 0.0375
noise_offset_type = "Original"
optimizer_args = []
optimizer_type = "AdamW"
output_dir = "X:/AI/MODELS/lora/pony/character/dnd/fairy"
output_name = "fairy_v1"
pretrained_model_name_or_path = "X:/AI/MODELS/checkpoints/pony/autismmixSDXL_autismmixPony.safetensors"
prior_loss_weight = 1
resolution = "1024,1024"
sample_prompts = "X:/AI/MODELS/lora/pony/character/dnd/fairy\\prompt.txt"
sample_sampler = "euler_a"
save_every_n_epochs = 10
save_model_as = "safetensors"
save_precision = "float"
seed = 12345
shuffle_caption = true
text_encoder_lr = 0.0008
train_batch_size = 8
train_data_dir = "X:/DATASET/fairy"
unet_lr = 0.0008
xformers = true
is my current setup
This is ideal for a 3090/4090 on a windows machine
(clip_skip = 2 is for pony models only - so clip_skip = 1 is correct if you do sdxl)
can be used with a minimum of around 16gb vram
where you'd have to adjust batch size, and learning rate only.
(learning rate / batch size = 0.0001)
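A minimal sketch of that rule of thumb, assuming it means the learning rate should scale linearly with batch size so that learning_rate / train_batch_size stays at 0.0001 (which matches learning_rate = 0.0008 at train_batch_size = 8 in the config above). The helper name `scaled_learning_rate` is made up for illustration and is not part of kohya_ss.

```python
# Ratio implied by the config above: learning_rate = 0.0008 at train_batch_size = 8.
# Assumption: the rule is a plain linear scaling of learning rate with batch size.
LR_PER_SAMPLE = 0.0001


def scaled_learning_rate(batch_size: int, lr_per_sample: float = LR_PER_SAMPLE) -> float:
    """Return the learning rate that keeps learning_rate / batch_size constant."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return lr_per_sample * batch_size


if __name__ == "__main__":
    # Print candidate settings for smaller batch sizes on lower-VRAM cards.
    for bs in (1, 2, 4, 8):
        print(f"train_batch_size = {bs} -> learning_rate = {scaled_learning_rate(bs):.4f}")
```

In the config above unet_lr and text_encoder_lr are set to the same value as learning_rate, so they would presumably be scaled the same way.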
No, 12GB, i have a 3080Ti
oh. yeah. that does count as super low in terms of training x_x
:\
Meaning? If you can explain a little I'm lost
• training unet only
• using adafactor
• low batch size
all of those impact your final quality, but save a lot of vram
isn't adafactor better than adamW8bit? And I don't know why, but with these settings I can't enable adamW8bit because of a bug with LR warmup: the field is greyed out so I can't set it to 0, and when I try to train I get an error saying LR warmup has to be set to 0
adamW would be ideal. adamW8bit makes a lot of tradeoffs already.
adafactor is fine as well - just more things that can go wrong, since it's super easy to overfit with it, resulting in loras where the output gets eerily close to the training images. For example faces always pointing in the same direction, or skin color of faces not matching the rest of the body
emphasis on can go wrong
it can be solved with the right settings, and fixing issues in the dataset. also min snr 5 helps A LOT for this
so top improvements for you would be:
• use min snr 5
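For context, a rough sketch of what "min snr 5" does to the loss, assuming it refers to Min-SNR weighting with gamma = 5 (the `min_snr_gamma = 5` entry in the config above) and a noise-prediction objective. The function name is illustrative and the trainer's actual implementation may differ in details.

```python
def min_snr_weight(snr: float, gamma: float = 5.0) -> float:
    """Per-timestep loss weight min(SNR, gamma) / SNR for noise-prediction training.

    Timesteps with SNR <= gamma keep weight 1.0; low-noise timesteps with a
    very high SNR are down-weighted so they do not dominate the loss.
    """
    if snr <= 0:
        raise ValueError("snr must be positive")
    return min(snr, gamma) / snr


if __name__ == "__main__":
    # Illustrative SNR values only, not taken from a real noise schedule.
    for snr in (0.1, 1.0, 5.0, 50.0, 500.0):
        print(f"snr = {snr:>6} -> weight = {min_snr_weight(snr):.3f}")
```

The idea is that nearly clean timesteps, where the model can easily memorize fine detail from the training images, contribute less to the total loss.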
your network dim (rank) of 32 is perfect for this type of lora.
I've just realized that the WD14 caption does a much better job too. How do you formulate your prompts for images? Is it like this? :
sebl4rdpony_v1, solo, looking at viewer, shirt, black hair, 1boy, white shirt, male focus, parted lips, food, blurry, black eyes, blurry background, facial hair, portrait, beard, realistic
I have a really really aggressive blacklist, which removes like 95% of all booru tags, and only leaves those that exist in natural language. (and I use: SmilingWolf/wd-convnext-tagger-v3)
https://github.com/jhc13/taggui
this tool is perfect for tagging stuff quickly, with various taggers
Hmmm k
trigger word, <tags>
or if I'm fancy about it:
trigger word, <caption>, <tags>, <background>
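A small sketch of how that caption layout could be assembled in a script, assuming the tagger output is already a list of strings. The blacklist contents and the `build_caption` helper are hypothetical; the real blacklist mentioned above is not shown in the chat.

```python
def build_caption(trigger, tags, blacklist=frozenset(), caption="", background=""):
    """Join the parts in the order: trigger word, caption, tags, background.

    Tags found in the blacklist are dropped; empty parts are skipped.
    """
    kept = [tag for tag in tags if tag not in blacklist]
    parts = [trigger]
    if caption:
        parts.append(caption)
    parts.extend(kept)
    if background:
        parts.append(background)
    return ", ".join(parts)


if __name__ == "__main__":
    # Hypothetical blacklist of booru-only tags; adjust to taste.
    blacklist = {"1boy", "solo", "male focus"}
    tags = ["solo", "looking at viewer", "shirt", "black hair", "1boy", "beard"]
    print(build_caption("sebl4rdpony_v1", tags, blacklist, background="blurry background"))
```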
and what if I have both sebl4rdpony_v1 and sebl4rdpony_v2?
If I call sebl4rdpony_v2, is there a risk that it triggers sebl4rdpony_v1 too?
in A1111? no
I do that myself xD so I'd know
🤣
it's always a worry when you've done dozens, and 99% are failures.
it gets better with time. I'm currently at a roughly 80% success rate
for me it's 80% failures
tbf, adafactor is harder to get right. especially when you cant train the clip models. so thats not all on you.
if you ever get a better gpu, you'll be real damn good at it, since you actually understand how to train the unet properly 🤣
Sometimes you don't know why either, like my head (my avatar): 10 crappy photos, sometimes with people around, blur, bad lighting, and it comes out better than other setups that were much better prepared Oo
most finetunes on civit are just glorified clip loras, that also happened to train unet a bit
I bought a 3080ti yesterday, because my poor 2070S (without BF16) was clearly not made for it x)
AHHH. right you have a 3080
you can enable full bf16 training
that saves A LOT of vram
maybe enough for you to do clip training as well. you should def try it
yeah, in same conditions, but in 768² instead of 1024², 10-15 more hours on train, imagine...
if you also train clip, it cuts down training time by about 4 times
so an 8hour lora, trains in 2 instead
my 2070S was around 20-30s /it, horrible
3080ti with full bf16 : 1.60~2s/it (1024)
I don't regret my purchase, and I couldn't afford a 3090 anyway. not to mention the hellish power consumption, and the roar of the Concorde aircraft.
yeah. I bought two desktop fans, and frankensteined them on top of my 3090, in my training rig 🤣
that fan sound was hell before that
Dude, look my 2070S xD :
2x chromax black, from Noctua
before that :
Of course, I imagine you've also been through the process of making a fan curve on Afterburner that will save your ears the most.
hahaha
I imagine it must change everything too. People don't realize the crappy fans they put on our graphics cards.
2x beQuiet xD
0 audio. it runs in sub 30db, meaning you wont hear it unless there's absolute quiet.
also temps in the 50 degrees, instead of 88
yeah. the standard built-in fans cost around 1€ to manufacture x_x
I hate how exactly thats where they save money
So very much, and they make us pay the price for this crap
It looks like you fried the text encoder
"unet only" :' )
he fried the unet
thats fine. what your image showed is a typical issue if you use too high a learning rate on the clip models. you dont have that issue though
fuck this pony in my trigger words...
what do you suggest guys: if I want to train a pony model, is ponyRealism_v21MainVAE good for this?
i'm trying with adamW this time
these are ultrapixel with cascade. just don't use the safetensors they provide, they're fake safetensors and contain potentially malicious pickles
i converted them to real safetensors and forked the repo with the necessary changes to ensure it's safe
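A quick way to sanity-check a file like that, as a sketch: a real safetensors file begins with an 8-byte little-endian header length followed by a JSON header, whereas a renamed pickle or zip-based torch checkpoint does not. The function name and the size cap below are assumptions of this sketch, and passing the check only says the container format looks right, not that the file is trustworthy.

```python
import json
import struct
from pathlib import Path

# Sanity cap for this sketch: refuse to read absurdly large headers.
MAX_HEADER_BYTES = 100_000_000


def looks_like_safetensors(path) -> bool:
    """Heuristic: True if the file starts with a plausible safetensors header."""
    p = Path(path)
    size = p.stat().st_size
    with p.open("rb") as f:
        prefix = f.read(8)
        if len(prefix) < 8:
            return False
        # First 8 bytes: unsigned 64-bit little-endian length of the JSON header.
        (header_len,) = struct.unpack("<Q", prefix)
        if header_len == 0 or header_len > size - 8 or header_len > MAX_HEADER_BYTES:
            return False
        raw = f.read(header_len)
    try:
        header = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return False
    # The header is a JSON object mapping tensor names to dtype/shape/offsets.
    return isinstance(header, dict)


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        verdict = "looks like safetensors" if looks_like_safetensors(name) else "NOT safetensors"
        print(f"{name}: {verdict}")
```

Run it with the downloaded file paths as arguments; anything flagged as not safetensors should not be loaded with a pickle-based loader.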
that explains it 🥲
I remember you talking about your improvements to cascade. was just hoping you found such a good workflow for sdxl, cause of the sdxl chat xD
well, there's a sdxl checkpoint node floating around in that WF somewhere lol
not being used though
however, i kinda wonder if the concepts behind this model could be applied with sdxl/kolors
it can even do 1440x5120 no prob
time to churn out an infinite supply of wallpapers
the only issue I have with the compressed latent methods (including cascade, ultrapixel, kohya deep shrink and hi-diffusion)
is they reach a much higher resolution but at that new resolution their detail level is lower than what is typical for that resolution
depends what you mean by detail tbh
the complexity will be lower, but the resolution of the detail can be comparable with a finetuned version of stage B
in many cases that's for the best imo... myself included, we all have a tendency to spit out images that are too busy sometimes cuz those intricate details/complexity is addictive lol
working very well with me. With the worst dataset xD