#✨|sdxl
I'd do 3 images:
base, complex math junk, just increasing the weight normally
with the latter being either text weight in prompt or the unclip weight
yeah, watch it be the same, lol
not the same cause of the concat but I wouldn't be surprised if they're close
well I can say that with each new thing I added I did it incrementally. it's not like I hit the ground running with any of it
add it with .1 weight or something, then move it up slowly
ah yes the slowly boiling a frog to death method
I always do the opposite and start at 1.0 then move it down
speaking of upsampling, this is the best, but it needs a lot of power for large images. And of course a good model, but there are too many. I posted an 8x today
I've taken that approach
what the fuck is that
are you latent rotating?
is that like a very manual tiling formula or something?
it rotates, flips, and resizes the images, then blends them. It is not my work, I'm just trying to understand how it works
try searching for ChaiNNer, it is from there. Default Full TTA
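A minimal sketch of the rotate / flip / upscale / blend idea described above (usually called test-time augmentation, TTA), assuming a grayscale image stored as nested lists. `nearest_2x` is only a stand-in for a real upscaling model, and none of these names are ChaiNNer's actual API.

```python
from typing import Callable, List

Image = List[List[float]]  # grayscale image as rows of pixel values


def rot90(img: Image) -> Image:
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def rot(img: Image, k: int) -> Image:
    """Rotate clockwise by k quarter turns."""
    for _ in range(k % 4):
        img = rot90(img)
    return img


def flip_h(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def nearest_2x(img: Image) -> Image:
    """Placeholder upscaler (assumption): nearest-neighbour 2x."""
    out = []
    for row in img:
        wide = [p for p in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def upscale_with_tta(img: Image, upscale: Callable[[Image], Image]) -> Image:
    """Upscale all 8 flip/rotation variants, undo each transform, average."""
    results = []
    for flipped in (False, True):
        for k in range(4):
            variant = rot(flip_h(img) if flipped else img, k)
            up = rot(upscale(variant), 4 - k)  # undo the rotation first
            if flipped:
                up = flip_h(up)                # then undo the flip
            results.append(up)
    h, w = len(results[0]), len(results[0][0])
    return [
        [sum(r[y][x] for r in results) / len(results) for x in range(w)]
        for y in range(h)
    ]
```

With a real model the 8 outputs differ slightly, so averaging them smooths out orientation-dependent artifacts at roughly 8x the compute cost. That cost is the "needs a lot of power" part.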
wait it's all in pixel space?
hmm. how would that work out? never really tried it
that sounds like a really complicated way to do one of those emulator pixel scaling algorithms
interdasting
does it work on more complicated images at all
lines are easy to do if you smear the fuck outta everything else
i think the images are not the limit, only a proper model for it, and some time or a powerful GPU
it can resize really easily: load image, upscale and save image.
you can send me a small image, because of HW limitations on my side
i am testing it because a pal needs help with print quality.
define 'small'
This is the lyrics from Wind of Change by Scorpions
like 720p?
512x512? @nimble heart
i can measure it. But I probably can't do 8x. Maybe it's better to download it and test it yourself?
Going to test something
idk there's an SD 1.5 image i was trying to make 4k. native is 1080 I shrunk it by 50% so it hopefully doesn't light your pc on fire
let's see, here's one without the math things
with them
I'll try something else. also, I realize I would need to up the weights of things in order to get a more accurate gauge of things
damn running a 4x on a 1080p image is rough
only using 3 gigs of vram though so idk
taking years
It should due to the higher clock but for anything to the gpu more buffer is better
I'd rather save 30 seconds on a long code compile than .02 seconds copying the latent to the gpu on an SD render
ppl with a 4090 could have it done in an eyeblink
without math or adjusting weights
Training it is much worse
I don't give two shits about generating images it is the training I am all about
- imo 3d vcache is overpriced currently. For the price of a 7900X3D you can basically get a 7950X
first one here is with the math. then second is bumping up the initial clipvision weight by just shy of 2.5x
well, for me it will be an X3D just 79 or 8k depends what is next year
so the stuff I'm doing definitely brings something to the image
i bought my 7900X before the 3d came out so I guess the choice was easy lol
first and third aren't doing much
Well, AMD has said that going forward X3D will be for all of them so we will see. All about the chiplets now.
well it is noticeable that crown isnt best for 8x resize
crown?
Oh
yea I used realesr for my actual one cause its a lot more aggressive with gradients
but in eye is some reflection
I am having a hard time training on a person. Styles are not an issue with me but the person seems better but nothing what I trained
1080p -> realesr 4x -> unwindowed sinc 1024 lobe 50% downscale
looks okayish
but definitely pushing sd 1.5 outputs lol
least without tiling
to fully see it, one has to open it in the browser.
Trying 4x ultrasharp
i think these turned out nice
yea I used realesr over ultrasharp because its better with flat lines
ultrasharp's good for high frequency details like photoreal images
if you side-by-side you can see the ultrasharp one you have smoothes out less stars and stuff but the linework on the helmet is way blurrier
yes stars are nice, would be nice to have some mask and resize different parts differently 🙂
I prefer 4x_nmkd to ultrasharp
least with XL it doesnt matter as much because you can straight up img2img @ 3840x2160 so the upscaler choice is just for mild detail enhancements
I have nmkd superscale idk if that's the same
but ultrasharp did better as a base you denoise on top of
for XL
ahh yeah. I'll sometimes do 2x to 2x with ultrasharp first
so sample 1368x768 -> ultrasharp 4x -> sample 3840x2160 @ 30% denoise
is what i usually do
least for the few 4k images i posted in here
my gecko phone wallpaper was just regular bicubic scale I think
check color of this version @nimble heart
I love burning images
the tattoos and hair are ringing really bad
it is not resized 😄
how do you make those comment boxes in comfy like how it's c key in unreal
double click search 'note'
unless you mean the big groups, that's just right click 'add group'
groups are cancerous so most people dont use them
yep, group thing
if you like snag one pixel anywhere you do the equivalent to your workflow as moving an image in a Microsoft Word document
so what was run?
just studio lite colors only
idk what studio lite is
one of mine in unreal 
looks like a bubba'd unsharp mask
it is actually 1x broadcast to studio lite, a pytorch based implementation. it doesn't matter what it does, it may not only resize but also denoise, remove JPG artifacts and so on.
This changing color on some pth based model rules
lazy eye
:)))
lol sdxl does that a lot
make everything a side profile so they cant have lazy eyes
7s/it oofie
just bokeh everything so you only have to render one thing correctly
I also do that lol. "out of focus background"
so long as the subject turns out okayish rest doesnt matter
@nimble heart have you tried lexica?
Don't know what that is
better
I always post-process using my own filters I wrote for GIMP
2xlexicaSwinIR
but really nice, one can apply it twice. Nice stars
blurry helmet again
one cant have everything
you notice the linework a lot more on a 4k screen than the stars
im not super worried about it anyways now that XL is out and I can direct render 4k
I finally replaced my wallpaper
yes xl is great for large renders, I can probably only afford Full HD, which is enough for me.
1.5 I could direct render 1080p which is what that midna image is. Maybe with an nvidia gpu + xformers you could direct render 4k @ like 60s/it lol
XL you can sometimes do 1920x1080 natively without any upscaling
and get not-gross results
768p compositions are better though
i did such wallpapers directly but not handy as wallpaper actually 😄
Prompt: Lyrics from Africa by Toto
1920x1080 without any upscaling
runs pretty well, 1.21 it/s
yes i did it as well native
?
oh lol you literally sent that as I was uploading mine so I missed it
would prefer 1088 probably will play with it, and see
Because you need to render at something divisible by 64.
no you dont
latent space is 1/8th of pixel space so it needs to be divisible by 8
that's why all the latent image sliders step by 8 in comfyui and auto1111
well encoder divides each side by 8
no
1024
You get greater incoherence at steps not divisible by 64.
results prove it
I only render in multiples of 7
ah yes 504x504
are you remixing my midna wallpaper I spent like a whole day on into 80's B movie posters
yes
aight
hide rick rolls in every image you post on here using that method
good idea.
could be done discreetly using the mask input with a solid mask. put him in an out of the way place
like a where's waldo, but it's rick astley
alright, I'll do 2 with the same seed number and what not. with and without the pre-sampler upscale
a 2 pass will always wreck a single pass
direct 4k may be possible but you'd have to cherrypick it
I did that for a bit with 1.5 metadata in another server because someone said my prompts were trash, so I upscaled everything I posted with a 0% denoise with the rickroll lyrics as the prompt just so it changed the metadata of the image.
@nimble heart simply i believe in 64 🙂
this is beauty
1920x1080 cleanly turns into a 240x135x4 tensor so im not sure what else would cause it to be "less coherent"
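A quick sketch of the arithmetic behind that claim, assuming the usual Stable Diffusion VAE layout discussed above (each side divided by 8, 4 latent channels). `latent_shape` is a hypothetical helper for illustration, not a ComfyUI or A1111 function.

```python
def latent_shape(width: int, height: int, factor: int = 8, channels: int = 4):
    """Return (latent_width, latent_height, channels) for a pixel size.

    Raises ValueError when a side does not divide cleanly by `factor`.
    """
    if width % factor or height % factor:
        raise ValueError(f"{width}x{height} is not divisible by {factor}")
    return (width // factor, height // factor, channels)


print(latent_shape(1920, 1080))   # 1080 is a multiple of 8...
print(1080 % 64 == 0)             # ...but not a multiple of 64
print(latent_shape(1920, 1088))   # 1088 is a multiple of both
```

So 1080 satisfies the divisible-by-8 requirement but not a stricter divisible-by-64 rule. Whether 64 actually improves coherence is the open question in the thread; the code only shows which sizes fit which rule.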
I'd probably have to use canny edge to make it more subtle lol
was in A1111 it isnt dead at all
That is so damn hard to do
the actual ComfyUI source code for turning an image into a latent shows it just rounds to the nearest 8
yes i believe it is
I wonder why SD refuses to do the vulcan salute?
its a hate symbol
ffs, lol
was trained out during diversity conditioning
Hahahaha
I made my own so I could see how it works
he looks so miserable
I would be too if I had a giant hand
no idea how
tell it to stop being a stupid idiot and make him bigger
big tall handsome Spock, broad sexy shoulders, thick masculine neck, instagram model, realistic hdr 4k 8k HD nikon photoreal make him hot please
I told it large hand in the neg it did this
now he's even more depressed
like he had to put Kirk down via fingergun after he was infected with a brain parasite
LOL
i should pitch that to netflix
see it worked exactly as intended. now he's bigger
whoa
pen is as well I believe
lingeri3 is too lol
wth?
poop isn't banned
ohhhhhhhhh
so at least we have that going for us
400 iq idea for making good vulcan hands
generate a normal image
inpaint out the 6th finger
hehehe
I am trying to figure out why it makes them giant hands?
I love the one with the old car about to run him over
Look at his hands, ffs
so much emphasis on the hands it pulls them closer to the camera
like if you add lots of handy words to your prompt it does that
in 1.5 as well
weird
probably does it for other things
try like nose facing camera, vulcan nose, nose in frame or something and watch it just stretch into the middle
I don't go down those alleys.
make spock into a real boy
maybe the handgun is a Vulcan
so are you training a vulcan salute lora or something
I guess I better, shit
its one of those insect burgers
try making spocks head small
because that's a thing in art: small head is a big person, big head is a small person
Is that one of those quantum computers?
Im trying to do giant spock
basic setup
(i am new to using this)
what are those three boxes in left upper corner @shy kelp ?
oh 2 checkpoints, didnt know. And lora make sense
@hardy cipher can I dm you?
why not? I remember they exist now. forgot about them and left a bunch of people hanging. so rude
checkpoint and two loras
love the colors
What a lovely resolution
I need to go loooooonger
the subject isn't duplicated so you're already doing better than most
I have set my target width to 256. Don't know if that does the trick
I guess 256 * 4096 is equivalent megapixels to 1024*1024 so it's actually somewhat feasible
I guess i need to go longer
can you? I thought the max size was 4096
max size for what?
I guess you could always override the node's slider
edge length
4096 pixels
so a 512 latent
one of these is not like the others
he just has a slower metabolism
are they wearing pink panties on their chests?
lol that one purple guy trying to be involved
blue guy on the bottom left in the second picture is the best
he's all cocky cause he's half human and doesn't have those black orb eyes
9632x280 (with a highres fix)
Then i need to overwrite the comfyui max width
you're within your rights to do that
Ok let me check. I hope its "MAX_RESOLUTION"
I had an inference but no real output at 8x131072. "Best" i could do was 26.206 x 96
If you click on open in browser, you might see the fresh salad, tomatoes, onions and grilled burger patty.
it's very high quality
ghost busters, star trek, tron, lotr, dune, 2001
hmm, what has happened in this hires upscale 
(the upscale latent is plugged up, wires don't show because it's over the save image)
increase denoise strength by 1 or 2
why does it feel like the sdxl bot is nerfed?
especially the anime one.
anime style*
so quick as well as low quality generation
:/
also for some reason some styles just refuse to do the generation or take a long/very short time to do so
I don't know
because its a bot,if u want high quality and near perfect pics install local version
Introducing Woolify, a cutting-edge LoRa (Text to Image) model designed to infuse the warmth and texture of wool into any picture. With Woolify, yo...
Solved for good now, borrowed some code from efficiency nodes. It's done client side as my initial code did.
right on. I rarely if ever make things like sampling and scheduling external to the nodes themselves. but I do it with seeds a lot
It just rewrites any node type you have that use noise_seed so it works with whatever
But you have control over what nodes types you "overwrite"
for comfy when 
Sampler is probably not needed now since I mostly stopped using the refiner but when you do I find it handy to only have to change it once.
I like Subway
the flow grows 
Where does the node types come from @shy kelp?
those are for unreal sadly https://www.unrealengine.com/marketplace/en-US/product/electronic-nodes
upscaling 🙂 You learned a lot. Me, on the other hand, I'm trying to find a good balance with all the models I have, about 50, and waiting for each turn, which takes say 90 seconds, then correcting a bit, or changing to a model I don't know...
yeah im too comfortable on this one model and i need to use others
plus there is probably new shit out that i am not paying attention to lol
sigh, my model merge is a failure, after all 😦
have to check out all the block merges to see where it went wrong 😢
who's she?
why does it being a bot affect the generation output :\
i know it's a bot but how does it being a bot make the output worse?
what's the prompt here?
ask @visual glade 🙂
3 images as input with blip interrogators, ipadapter, clipvision and controlnet. not even modifying prompts for these
i forgot how much you love not typing prompts 😛
no, I'm all about prompts
deliberate was one of my favorite 1.5 models for that reason
i found out you could duplicate nodes by holding alt and clicking but i expected it to be a shortcut to destroy a node wire lol
let's hope it works out, guys
that snow looks so cozy 
quality frogs
I like these frogs tbh
messed with some old watercolor style images I did in 1.5. did some sdxl upgrades
at least 2 moons there
now with ipadapter and clipvision updating old concepts works great
those ones might work out well
a woman with a man are going forward to happy tomorrows.
A1111-exactly what i imagine 🙂
yummers
2001, blade runner, the cell, dune, the fall, the fountain, ghostbusters, hero, the grand budapest hotel
for the 9 images above: prompt "a woman", ip adapter source image:
I quite silently dropped the LoRa for this woolen like stuff so if you missed it here is the link: https://civitai.com/models/146424?modelVersionId=163063
Give it a spin and let me know what you think.
Big thanks @dapper dragon
i always seem to generate buildings with lights on in them and can't get rid of it 
prompt: a w00len woman, lora: woolify, ip adapter (same as above), sdxl styler movie: the fall
Might not give the desired result, but i managed it with "abandoned" Pos: A beautiful fantasy artwork showing a little (abandoned:1.4) town on a river at night, illuminated only by moonlight, nobody is home, it is dark | Neg: shallow depth of field, DOF, bokeh, cartoon, drawing, painting, naked, Light bulbs, Tube lights, Lamps, Torches, Lighters, Candles, Fire, Glow sticks
Ok, its more a ghost town now
Good but prompt like so: w00len, yourprompthere.
That should enhance it properly.
And its zero not a o 0 / O
this is w00len, a woman:
(fist lora strength 2, second strength 1)
ty but i might have to hunt through my old gens and see if i can find it, it's just not going away no matter what but i do remember i was able to constantly generate with it off so 
Time for jeans. Challenge accepted
this with some clip vision. all you need
looks like a wes anderson kids movie
are you using same model?
Is this using a noise lora? Or did you photoshop it after to get it so dark?
Just image level adjustments node after the VAE Decode
IRL photography you have to time this perfectly, just after sunrise or before sunset, when the dimness of twilight matches the brightness of the artificial lighting. 🙂
Yes use 1.
this house light 
So... I've been using an SD 1.5 generated landscape as my desktop background for ages now. I just decided to remake it today with SDXL. I was born in Alaska, so it just makes me feel at home.
But, is it strange that these feel more true to my memory than photos I can download online? There's something about the fact that... This is a landscape no one has ever seen. There was no photographer standing here, wondering about color, touching it up later. It's... untouched.
But also not real? Hmm... Probably stuff that belongs in the philosophy section though.
fluffy pig
nice fidelity!
w00len, a hyper realistic macro close up picture of a ((Pumba made out of multi color wool and yarn)), using the colors red, black and yellow, bokeh woolen multiple color jungle backdrop
It is really creepy how it mixes the pine and birch trees though. 😨
Guess who just trained a LoRA with 0 Learning rates...and no i don't mean Kappa Neuro
How high can sdxl go on resolution without deformities/twinning/etc. Without upscaling and hiresfix obviously
Every multiplication of length and width that has a product of 1.048.576
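A small sketch that lists such sizes, assuming "1.048.576" means 1,048,576 pixels (1024 x 1024) and, as an extra assumption, that both sides should be multiples of 64. `exact_megapixel_sizes` is a made-up helper name.

```python
def exact_megapixel_sizes(total: int = 1024 * 1024, step: int = 64):
    """List (width, height) pairs whose product is exactly `total`,
    with both sides being multiples of `step`."""
    sizes = []
    for width in range(step, total // step + 1, step):
        if total % width == 0:
            height = total // width
            if height % step == 0:
                sizes.append((width, height))
    return sizes


for w, h in exact_megapixel_sizes():
    print(f"{w}x{h}")
```

Because 1,048,576 is a power of two, only power-of-two sides hit the product exactly, which includes the 256 x 4096 case mentioned earlier. Common near-matches like 1344 x 768 land close to the same pixel count without being exact.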
It depends on the subject. I tried a 4k landscape earlier, and it wasn't absurd, just... A little odd.
Native 4k SDXL render (no upscale, no hires fix).
It still got the water below the horizon, mountains in the middle, and sky on top. No stacking layers of sky, at least.

the like one time i come back in the chat after however long cx
yooo! killen it! <3
Will there be any multimodal interfaces in the future that use LLaMa and SDXL at the same time? I feel like that would be the next step
There's tons out there. I am literally running llama2 and SDXL on my PC right now. Llama2 writes some pretty good prompts. It never runs out of ideas. There's also a project called LLaVA that uses Llama and CLIP to give Llama pseudo-vision.
Also, that multimodality might seem patchwork, but it's really not too different from how human multimodality works. Our vision and language processing aren't "unified". We process vision and language very separately* at the bottom, then integrate them at a high level.
*probably
Huh.. like, prompting isn't the issue atm.. SDXL is by far the best at turning text to images, but it doesn't have the ability to get instructions for editing images, and all complex text and image input stuff. I think if SDXL and LLaMa could be inferred together we could achieve perfect image blending and or text interpretation
You want SDXL to "think" intelligently the way llama2 does. I get it.
IPA is a step in that direction, but it loses a bunch of SDXL's insane quality in the process
Yeah, we NAILED the quality of the images, now the next step should be capabilities
In the same way that learning coding makes an LLM vastly better at logic, theory of mind, and creativity; I expect learning visuals would make it better at something too. 🙂 It's beyond my ability to create, and there are no open source projects that really meet this criteria.
SDXL is way more than enough when it comes to the images and speeds themselves. I just think it would be insane to have an inference where you can give an image then give instructions on how to change the image, what to blend the image with, etc.. there was actually something called "IPA" that released recently that allows it to do something similar, but it makes SDXL lose much quality
So if there would be an SDXL+LLaMa inference, it could potentially enable SDXL to do all those things
How does ipadapter cause the image to lose quality?
Honestly, idk. It's really experimental and chaotic when you don't provide text (which providing text isn't ideal for image blending)
IPA also doesn't support the new optimizations like AIT
So does controlnet, and so does high CFG. The more you control SDXL, the more glitches you get, especially on small faces.
Which also isn't ideal.. I think the right approach shouldn't be modifying SDXL's components, it should be implementing other components (such as LLaMa)
I disagree. The "glue" connecting these can't be hard-coded algorithms. It would have to be an AI at least as advanced as SDXL and Llama.
I just set up a blip model to interrogate the image going into ipadapter on my latest flow. Then send that to clip g
It works
I don't see SAI training something from scratch just to give it more capabilities. There must be another solution
Tesla tried originally integrating many neural nets with heuristic code. It hit an absolute limit, and couldn't reach human-level driving performance. Running AI full-stack, from input to output, worked.
In our own brain, the connection between language and vision is not something simple. It's the same kind of intelligent network that the rest of the brain is made out of.
That could be a decent way to blend the images, but not instructively modify them
I haven't written a prompt in days
Have it combine with an empty text box for adding to it if I want. Then through a styler
So you think SAI is going to be making a new type of model that has those capabilities? Because I don't see that happening any time soon..
Well I don't know why you'd need a new model to do more. I don't think monolithic all in one models are the way to go anyway. At least not in most cases
I do remember hearing about something called "CM3LEON" that has those capabilities and it's made by Facebook, but they are awfully vague about it and said it's going to be closed source
Idk if the only way to achieve instructive image editing is making a new model from scratch though. Maybe the way is to combine LLM with diffusion and something like IPA, but idk man.. we'll find out eventually
Super multimodality.
https://arxiv.org/pdf/2309.05519.pdf
You use SD and Llama as an initialization base, but you train one giant "LLM" on top of it. You only need 1% parameter adjustment over those base models to start seeing evidence of a model that "understands" language and vision together.
Interesting. Y'all think SAI would be doing something similar to this?
I don't think they've announced anything like that. Last I heard was their stable audio release. (Haven't tried it out yet.)
This paper released 3 days ago, so they might not have started yet if they're planning on it. 🙂
BTW it used Vicuna, Stable Diffusion, and ZeroScope. Definitely something SAI could tackle with their own toolset.
I took a look at that paper and it does seem like they're talking about inferring diffusion with LLMs to achieve crazy multimodal like this. Idk how close we are to having something like that available as an open source
Using AI to "glue" together these amazing models like SDXL and Llama2. It doesn't take much glue, and the results are as powerful as SDXL and Llama2 combined. (Hopefully!)
They might release their code. The paper is still a WIP.
Awesome! Speeds with that shouldn't be too problematic also, since we have optimizations like AIT and OneFlow just for stuff like that
No, actually, to run SDXL and Llama2 together on my system, I put SDXL on the GPU and Llama2 on the CPU. Not sure how consumer-friendly this would be... But I'll just have to hope for now.
You have no idea how fast OneFlow can be. Stay tuned, it might get implemented very soon
J
I heard OneFlow also decreases VRAM usage, but idk about that
star trek style
4K tall generation (no upscaling / hires).
in pixel sizes?
1024*4096
He has nice necklace, ball bearing 😄
holy cannoli
probably, gonna see
"highly detailed portrait shot of a woman" + 40% control depth + movie styler (nothing else):
the depth map works much better than ip adapter to bring out the movie style, as it doesn't give that much information as ip adapter
so at least for testing it works better
goodness, this stable audio model is legit
have you tried it out?
it's about a million times better than the extension in a1111 I messed with a while back
I realize this isn't the channel to discuss such things. just saying
You have my attention.
Make original music and sound effects using artificial intelligence, whether you’re a beginner or a pro.
you can make free 45 second mp3s
now don't expect it to be creating on point symphony recordings or anything. that won't happen for at least a year or two
That's highly interesting. Has someone already asked: "When A1111?" 😅
is it going to be released like that? I actually have no idea. there's a channel specifically for it though
"Keep an eye out for upcoming releases from Harmonai, including open-source models based on Stable Audio and training code to allow you to train your audio generation models."
oh nice. well I didn't get that far. but made a "psychonautic masterfully mastered rap beats that goes harder than rocks and hotter than the sun"
those spider-man, star wars, ghostbusters and star trek options need some work. looks nice, but not as the movie:
now we just have to wait for the image to audio models
Can't wait to hear Steven Seagal mumbling My heart will go on with a deathmetal backing track
very cool. I'd like to hear him sing in a barbershop quartet
just asked that channel about the image to audio models. gonna be a clueless noob in there
whoa. this is going to be a huge time sink. ugh
I saw, well...until open source models are released there isn't a huge community working on such implementations like image to audio
If they release it.....4-5 days
true enough. it's pretty incredible tbh because every other model I've seen was very primitive. not knocking them, just weren't quite there
I tried to train a model on bach midi. and well, it trained. but let me tell you, it did not make beautiful music
IPAdaptor for Audio 🤯
dude, multimedia projects
Audio2Audio
audio to audio, audio to image, image to audio, audio to neuralink
OpenPose Dance Moves to Audio
then we just have to wait for the touch and smell models
Prompt: The smell of Steven Seagal after running for 10km. Neg: Flowers
Oh no. I already imagined the same text problems as our beloved text 2 image model has.
I guess for now we have to stick with image generation. And if an audio model gets released we all need a 4090
is star trek blocked in sdxl? this is what i get from "star trek borg queen":
that'd be weird if it was. but I have no idea about anything
I get an old lady with my mix model
that one gets closer, but it doesn't seem that sdxl really knows what a borg queen looks like:
the wires are from my depth model. and that is the closest thing to a borg queen in that image
no wonder that my movie styler has problems, if the model misses everything about star trek
portrait shot of a vulcan woman
That's definitely not the U.S.S Enterprise, or is it?
well someone was posting spock earlier so there must be either a workaround or a lora or something
lol
captain picard:
captain picard captured by the borg - sorry, but that is not just leaving those images out - they fed in some stupid stuff for those key words as well
remixing some of my old sd1.5 wins
now that my flow has made prompts obsolete I can just plug new images in and go
will remove star trek from my movie styler - not worth it that way
yeah, that sucks. and also pretty weird
star wars also just gives some fan art.
darth vader in front of his stormtrooper army:
interesting
still "movie scene from Darth Vader in front of his stormtrooper army"
movie scene of a star wars x-wing fighting the death star
those random results of fan art or off topic stuff don't work for a movie styler. star wars will be removed too
I cant get Chewbacca to play drums.
do you think this captures the style of spider-man into the spider-verse?
this is the prompt - how could i improve it?
"a woman in style of 'Spider-Man Into the Spider-Verse', mix of realism and stylization, dynamic camera angles, comic book-style, vibrant color palette, exaggerated character expressions, urban landscapes, graffiti-inspired visuals, onomatopoeic comic book sound effects, motion blur"
whoa
do a binary search and figure out what word it is
Meaning what? Is there a list of disallowed content somewhere?
just test half at a time till you figure out the word thats banned lol
Wouldn't it be way faster to just not post???
bro if you wanna figure out why your message is blocked thats how ur gonna figure it out
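The "test half at a time" approach, as a minimal sketch. It assumes exactly one word triggers the filter and that the filter judges each word on its own. `is_blocked` is a hypothetical stand-in for "post it and see if it gets removed", not a real bot API.

```python
from typing import Callable, List, Optional


def find_blocked_word(
    words: List[str], is_blocked: Callable[[str], bool]
) -> Optional[str]:
    """Binary-search for the single word that makes the message blocked."""
    if not is_blocked(" ".join(words)):
        return None  # nothing in the message is blocked
    lo, hi = 0, len(words)  # the culprit is somewhere in words[lo:hi]
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_blocked(" ".join(words[lo:mid])):
            hi = mid  # culprit is in the first half
        else:
            lo = mid  # otherwise it must be in the second half
    return words[lo]


# Example with a made-up filter that blocks the word "banana".
def fake_filter(text: str) -> bool:
    return "banana" in text.split()


message = "a photo of a banana on a table".split()
print(find_blocked_word(message, fake_filter))
```

For a message of n words this takes about log2(n) test posts instead of n, which is why it beats trying one word at a time.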
https://github.com/NExT-GPT/NExT-GPT just found source. I can't figure out if it's finished or a webui yet
what did you prompt for?
photo of a clear (transparent:1.4) (translucent:1.3) (evil:1.2) worm alien creature in a dark organic cave, evil face, sting, intricate veins, wet, horror, inner biological cell
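For anyone new to the `(word:1.4)` notation in that prompt: a simplified sketch that pulls out the explicit weights. It assumes flat, non-nested `(text:number)` groups only and ignores the escaping and nesting rules real UIs implement, so treat it as an illustration, not as the A1111 or ComfyUI parser.

```python
import re
from typing import List, Tuple

# Matches "(some text:1.4)" with no nested parentheses inside.
WEIGHT_RE = re.compile(r"\(([^():]+):(\d+(?:\.\d+)?)\)")


def explicit_weights(prompt: str) -> List[Tuple[str, float]]:
    """Return (text, weight) pairs for every explicit weight in the prompt."""
    return [
        (match.group(1).strip(), float(match.group(2)))
        for match in WEIGHT_RE.finditer(prompt)
    ]


prompt = (
    "photo of a clear (transparent:1.4) (translucent:1.3) (evil:1.2) "
    "worm alien creature in a dark organic cave"
)
for text, weight in explicit_weights(prompt):
    print(f"{text}: {weight}")
```

Anything outside parentheses keeps the default weight of 1.0, so here only three words are emphasized.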
But my model is something.....special, i guess
i love it
Thank you
Its for the plumbus
LOL
Or it's just ultra greasy
Noice
yea, i like those ugly bastards a lot
Is this still my prompt?
need to set up ipadapter with the song model
The man she tells you not to worry about
Binpba Poulbes looks like its located in a cooling tower. What a lovely place.
yeah, I noticed the pretty background as well
Very nice eyebleach. Very nice
What did you expect? 😅
REAL BITCOIN POG
What up SDXL fam. I finally took some time to figure out how to get my xformers updated correctly to have sdxl work.. I took a random prompt from civitai for the juggernaut model and the step preview looked on par with what it should be
but the output turns out like this
looks like VAE error, also could be an A1111 issue, most people use ComfyUI as of now
I'm sure you can make a workflow for it, if it exists in A1111 there is no reason you won't be able to make some sort of workflow for it
ComfyUI also allows for modern optimizations like AIT, but that's way too hardcore if you just installed Comfy
der alte fritz:
I'm just trying to run Deforum videos with sdxl lol. Either comfy has that ability or it doesn't. Deforum is way too complex to "build it out" in comfy if it isnt there
I'll stick to SD V1.5 for now
that's not how ComfyUI works. it doesn't have "tabs" like A1111 has, you are the one that makes the workflow, making the possibilities pretty much infinite
and yes there are nodes for deforum
oh then definitely setting up comfyui. Thanks for the help friend!
the only disadvantage Comfy has imo is it doesn't look fancy after you finished making a workflow.. we need some kind of fancy frontend made that comes with nodes you can add to comfy to allow it to modify certain parts of the workflow while maintaining a simple look
Saving Private Ryan
Fun with Dick and Jane
i've been mixing. a lot
Amélie, or Le fabuleux destin d'Amélie Poulain
does this look like the correct style?
The colors and style come close, yes. Well i guess the realism can be improved.
the REAL teenage mutant ninja turtle
Intense
my goal is to get the style, so i can apply it to different content. like in that example (last one is the source):
Are you doing it only with prompting?
100% VAE issue. Had it set to vae8000 and needed to switch to automatic
the source is canny, depth or ip adapter, the style is just prompting
its go time
this is the same one (the youngest) with matrix:
or ex machina:
mad max witch in paris:
Nice work so far 👍. Are you releasing it on your GitHub?
yes, it's already on there, but not with all movies and the latest small prompt fixes. it's in my node pack and in my workflow.
Cool. Will check it out tomorrow. I am going to bed. Gute Nacht 😉
gute nacht :)
that bottom pick has me a bit flustered
Spock / Leonard Nimoy lora?
that shirt design - haha nice
He still has to exercise, lol
I have been fighting with this one A LOT
Never have issues with styles but people seem to not work for me
the base SDXL Leonard Nimoy doesn't look like Leonard Nimoy at all heh
What made this sort of work, not happy with it though, is I did 1 epoch and all repeats
never tried that before
cool 🙂 the likeness has definitely improved by a lot
I used this from my 1.5 success to learn about chars in lora and not really having any luck
back then it was DB so I tried DB now and success but I want this as a lora not a db turned into a lora
An issue I seem to have in comfy is this in the lower right I can't neg away without a HUGE value, and sometimes not at all.
who did it best?!
3
2 is a bit of a smoothskin
yep
none of which i'd want to encounter tho 😮
2 is probably the best model out of the 3, it's a lot more creative 🙂
this is what i meant with creativity -> i feel protovision is lacking in creativity and just made "generic evil thing"
it has one of the best if not simply the best image quality right now, so i'm just trying to get some creativity back in there
fck. my cursed lyrics. i HAVE to do this, and scar my eyes probably
my mission to add creativity to protovision is probably a success. To celebrate, have a 3 image comparison of the cursed lyrics. warning, you cannot unsee.
I need 3 separate images to celebrate
one more serving of cursed lyrics, coming up ❤️
I need your help in creating a vision. 3 images that have impacted your life
like this, for example
more of the cursed lyrics, who did it best, #1 #2 or #3
protovision doesn't like my upscaler:
well that's weird
which upscaler and renoise are you using?
my own solution in my workflow.
doesn't happen with default sdxl model
dynavision doesn't break the image, but also turns it green or blue for some reason
self-made upscaler?
I've noticed that with some checkpoints, upscaling tossed extra lips on, or with NSFW images adds extra nipples or belly buttons and such
colors with sdxl default model:
colors are the same even before upscaling
protovision and dynavision turn it green. prompt includes vintage grading, so the default model is closer to the prompt
perhaps the vae to re-encode the image
i know socalguitarist has a built-in vae, that might be different
yes, maybe vae if it's different from v0.9
juggernaut does normal colors:
speaking of vae... uhhh how do i force tiled vae usage in comfy?
my hires usually breaks saying out of memory and then switches to tiled vae but why can't i just use tiled vae from the start 
there is a special tiled vae node afaik
don't know if it is a default node or in some of the popular packs
Believe it's default. It now has the option also to select the tile size
it's default and it's hardwired into the code that you default to it if your vram gets maxed
512 sounds like a lot 
hmm lowest i can input is 320 
more nimoy than nimoy
either his hands are huge or his ears or it just isn't right. With DB it just works, so I might switch to that and see if extraction will work
have you messed with the weights at all?
yep
ever go nuts and put the weight into the negative for a lora?
yes, of course
whoa
Does anyone know if there's currently a node in Comfy that will allow me to easily read wildcards from a text file?
I want to do something like thing = [cat, dog, man] and replace those in something like A {thing} in a scene with xyz
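For anyone following along, here is a minimal sketch of that wildcard idea in plain Python. It is not a ComfyUI node. `load_wildcards` and `fill_template` are made-up names, and the one-option-per-line text file is an assumption about the format:

```python
import random
import re
from pathlib import Path


def load_wildcards(path):
    """Read one option per line from a text file, skipping blank lines."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def fill_template(template, wildcards, rng=random):
    """Replace each {name} with a random option from wildcards[name]."""
    def pick(match):
        name = match.group(1)
        options = wildcards.get(name)
        # Leave unknown placeholders untouched instead of failing.
        return rng.choice(options) if options else match.group(0)

    return re.sub(r"\{(\w+)\}", pick, template)
```

With a hypothetical `thing.txt` containing `cat`, `dog` and `man` on separate lines, `fill_template("A {thing} in a scene with xyz", {"thing": load_wildcards("thing.txt")})` would give a prompt with one of the three picked at random.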
you know, that terrible fan art lora seems like it'd be great for that
that's gandalf's evil inbred cousin
LOL
Damn vampires.
I like it when pictures are on the wall in these weird renders
yeah, not even asked for
I'm sure someone will come up with a method for blowing those images up and rendering them on their own
remember "this person isn't real" from a couple years ago? that was some mind blowing stuff. and there'd so often be that secondary person/demon all deformed on the side
soon the ai is going to get too good to gift us with such things
Am I going insane or missing something? I thought text was in the core.
9 suns in the sky
looking for #1100170312106127410 ?
I love bots
sent the image through an interrogator and used the result to make a lovely theme song for it
Any reason to use Hires upscaling instead of SD Upscale in Img to Img?
It's bugging or something I'm not sure...it's taking a ridiculous amount of time to produce even one image
Sorry do you mind explaining more in detail, the lower the tile size the less VRAM but Longer the generation? Am I correct?
Yeah, that's essentially how it works. But I was being sarcastic earlier. Smaller tiles would very likely degrade the image significantly
Lol okay thank you so much, I bow to your superior skill! hahaha (seriously tho sick generations my man). Another question, is there a correlation with the tiled VAE and SDXL being trained on 1024 or can you use 512 tiled VAE
If you have VRAM limitations, a tiled VAE at 512 will help you, as it splits your image processing into 512-pixel chunks.
Slower, but it lets you continue without OOM.
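To put rough numbers on the tile size trade-off, here is a simplified sketch that just counts tiles. It assumes the tile size is measured in image pixels and ignores the overlap and blending a real tiled VAE would do at the seams. `tile_boxes` and `tile_count` are hypothetical helper names:

```python
import math


def tile_boxes(width, height, tile=512):
    """Return (left, top, right, bottom) boxes that cover the image in tiles."""
    boxes = []
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            # Clamp edge tiles so they never extend past the image.
            boxes.append((left, top, min(left + tile, width), min(top + tile, height)))
    return boxes


def tile_count(width, height, tile=512):
    """Number of tiles needed without building the list of boxes."""
    return math.ceil(width / tile) * math.ceil(height / tile)
```

For a 2048x2048 image, 512 tiles need 16 passes and 320 tiles need 49, so smaller tiles mean less memory per pass but more passes overall.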
If you don't use a tiled VAE node, Comfy will automatically detect if you're gonna OOM and switch itself to tiled processing. But it takes additional time to detect that
just put a tiled vae node where that happens to give yourself back a few seconds
no sense in waiting for it to detect that
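The fallback behaviour described here can be pictured roughly like this. It is a sketch of the general pattern only, not ComfyUI's actual code. `decode_full` and `decode_tiled` are hypothetical callables, and `MemoryError` stands in for whatever out-of-memory error the real backend raises:

```python
def decode_with_fallback(latent, decode_full, decode_tiled):
    """Try the full decode first; fall back to the tiled decode on out-of-memory."""
    try:
        return decode_full(latent), "full"
    except MemoryError:
        # The failed attempt has already cost time before the fallback starts,
        # which is the wait you skip by using the tiled path from the beginning.
        return decode_tiled(latent), "tiled"
```

Calling the tiled path directly skips the failed first attempt entirely.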
yeah, I made my own vae loaders to try and better understand how that stuff works
but it only ever happens at the tail end of a long run of things
always at the end, a stupid upscale or something
Yeah my last vae node, after upscale, I have swapped to a tiled vae
if I make the mistake of leaving in a regular vae I'll usually just close the program and restart because better than sitting there for what seems like several minutes
thank you @visual glade for adding the badge with IDs, i just found it and am very appreciative! really helpful with all the python scripting, just wish I'd found it earlier LOL
the badge with ids?
or whoever is responsible (maybe creator of the manager addon)
manager thing
ltdrdata's work
ah, well kudos to that person and comfy anyway cause he's awesome 😉
My MiDaS node stopped working, any idea what I can do?
face fixer
pls how
found new waifu, don't think she's happy tho...
This is my face fix process I use
I don't know. I like goofed faces
isn't there an adetailer alternative for comfy?
probably, but for the rare times I need it, this does what I need
there are at least 2 different face fixer things I believe
does this actually work well
would it be able to fix this
dufuq 49 lora launcher?!
this is the perfect face in my opinion
hahaha weird, loads like that in mine 😆
was gonna say, who the hell uses 49 at a time 😅
what's up with these tools not having regular inputs and outputs?
talking about mine?
yeah. I'm not trying to knock it. I just don't understand the why part
cause seems like it'd be simpler to stay consistent with other lora loaders
im not sure what you mean
nevermind. anyway, fixed her face
so real
In my opinion, it's in two parts. One, for the creator's own purposes, such as the simple ease of attaching a single point, but you have to stay within their nodes. So your flow consists purely of "efficiency node" nodes.
And two, to reduce the amount of noodles everywhere, but that's now mitigated with the ability to hide the noodles
I do get that, but let's consider lora loaders for a second. you can stack them vertically, line them up horizontally. I can think of few reasons why they'd need to be separated. so no real need to run some weird pipe that can only go to one other node etc.
unsure on anime type, but probably. Just still goes off your prompts/models/etc.
Works well enough on my realism images when I need it
just make 3 or 5 stacks
and have regular inputs and outputs
but whatever makes people happy. doesn't matter to me
Yeah I opt for using the Comfyroll lora nodes. Normal connections, plus the on/off switches, so I can keep them inline
I just made my own with a few loaders in one and switches
but similar concept to those
@crisp owl so am i just out of luck
because inpainting doesnt fix the faces
and i dont know what to do to fix them myself
did you try the facefix node I showed?
I screenshot so you can just copy over my settings
worth a try.
Or you can just keep mashing images till you get a good one 🤷🏻‍♂️
i can barely see the settings
Is this a custom stacker, or is it available on ComfyUI?
its a custom module
Available?
all done
The top image will be the actual finished image. All the others are the separate processes going on, like "cropped_refined", which would show you a preview of the zoomed-in face it touched up, etc.
You can just throw a preview on each of the outputs to see what they all show.
But the top is the finalized image
bueno
except weird clip idk
what that is
ill see what i can do abt that
oh nvm
thats just hard shadows
that arent good
If you wanna keep some minor details from one vs the other, can just toss into a photo edit program and mask out the one vs the other
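What a photo editor does with that mask is roughly a per-pixel blend. Here is a minimal sketch on plain nested lists, assuming two same-sized grayscale images and a mask with values between 0.0 and 1.0. `composite` is a made-up name for illustration:

```python
def composite(image_a, image_b, mask):
    """Blend two same-sized grayscale images: mask 1.0 keeps A, 0.0 keeps B."""
    return [
        [a * m + b * (1.0 - m) for a, b, m in zip(row_a, row_b, row_m)]
        for row_a, row_b, row_m in zip(image_a, image_b, mask)
    ]
```

Paint the mask at 1.0 over the bits you want from the first image, 0.0 where you want the second, and anything in between mixes the two.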
yea
thanks again dude
you saved me from a year of headache
as I said, when you need it, it works well enough haha
I'm sure there's other methods, but that was one i got setup working well for me