#ControlNet & T2I-Adapter Support

191 messages · Page 1 of 1 (latest)

uncut sphinx
#

Creating a new thread to discuss adding ControlNet support to InvokeAI

cerulean tangle
#

So I hacked up a depth2image node that looks pretty similar to the code you've linked:
https://gist.github.com/Kyle0654/57c337f7c005662b98a53f4e1ed7a960

But I'm not sure what the "correct" way to do this is. @graceful bronze indicated that this is probably not the right way to do this (IIRC because the user-submitted pipelines may be untested/unsafe). I know that the models need actual model management as well (so we can cache them on CPU, move them to/from GPU, etc.). Not sure how we do all that though.

I do like how simple the code ends up being though, so I'm hoping we can get somewhere in the middle, so it's easy to add new features 🙂

uncut sphinx
cerulean tangle
#

That looks really easy to use. Bet you could take that sample code near the end and prototype a node for it quickly.

uncut sphinx
# uncut sphinx ControlNet support PR from takuma104 just got merged into diffusers: https://gi...

And the haofanwang diffusers-based ControlNet repo referenced above (https://github.com/haofanwang/ControlNet-for-Diffusers) is being redone to reflect the takuma104 merger.

cerulean tangle
#

(though arguably the canny part would be a separate node)

uncut sphinx
uncut sphinx
uncut sphinx
# cerulean tangle (though arguably the canny part would be a separate node)

Canny edge detection is treated as a pre-processing step in both the diffusers example code and the lllyasviel/ControlNet repo. Skimming the lllyasviel repo usage examples, it looks like for controlled inference the control images are treated similarly for any of the ControlNet models -- first transformed by a preprocessor, then run through the identical block of Stable Diffusion code. For instance, diffing the lllyasviel repo examples gradio_canny2image.py and gradio_pose2image.py, essentially the only difference is in the control image preprocessing.
So having a node for each preprocessing method but only one node for actually applying controlnet inference makes sense to me.
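
A rough sketch of that split, with one small preprocessor per "node" feeding a single shared ControlNet step. Everything named here (the `PREPROCESSORS` registry, `run_controlnet`, the toy `threshold` preprocessor, the list-of-lists image type) is a hypothetical stand-in for illustration, not InvokeAI or diffusers code:

```python
from typing import Callable, Dict, List

# Toy image type: a 2D grid of grayscale values in 0..255 (hypothetical stand-in).
Image = List[List[int]]
Preprocessor = Callable[[Image], Image]

PREPROCESSORS: Dict[str, Preprocessor] = {}


def register(name: str) -> Callable[[Preprocessor], Preprocessor]:
    """Register a preprocessor under a name, one per 'node'."""
    def wrap(fn: Preprocessor) -> Preprocessor:
        PREPROCESSORS[name] = fn
        return fn
    return wrap


@register("none")
def passthrough(image: Image) -> Image:
    # For control images that were already preprocessed externally.
    return image


@register("threshold")
def threshold(image: Image, cutoff: int = 128) -> Image:
    # Stand-in for a real detector such as Canny or HED.
    return [[255 if v >= cutoff else 0 for v in row] for row in image]


def run_controlnet(control_image: Image, model_name: str, prompt: str) -> dict:
    # The single shared step: it never needs to know which preprocessor ran.
    return {
        "model": model_name,
        "prompt": prompt,
        "control_size": (len(control_image), len(control_image[0])),
    }


def controlled_generate(image: Image, preprocessor: str, model_name: str, prompt: str) -> dict:
    control_image = PREPROCESSORS[preprocessor](image)
    return run_controlnet(control_image, model_name, prompt)
```

Adding a new preprocessing method then only means registering one more function; the inference step stays untouched.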

hidden zenith
#

I'm currently using ComfyUI Node based interface for ControlNet and it works very well and makes a lot of sense. It doesn't come with preprocessing nodes though, so I had to implement a canny node myself.
I'm pretty sure it has been discussed here right?

uncut sphinx
# hidden zenith

Really useful to see this! Can you post a higher rez screenshot?

uncut sphinx
# hidden zenith

So for the version without a canny node, the input control image had already been run through a canny edge transform?

hidden zenith
#

And yes in case you don't have the node, you need to precompute a canny image and upload it.
All the filters (Canny, HED, Normal etc.) are already implemented in the official ControlNet gradio repo.
And canny is just a one liner in opencv anyway.

#

I'm very new here, is there a discussion on what node interface might be used in Invoke?

uncut sphinx
hidden zenith
#

BTW if you already have your invokeai environment set up, you can just clone ComfyUI; it works out of the box if you want to try it.

uncut sphinx
uncut sphinx
# hidden zenith I'm very new here, is there a discussion on what node interface might to be used...

And another good Invoke node discussion on developer forums, "Node Use Cases: What do you want to do with nodes?"
https://discord.com/channels/1020123559063990373/1049107548264992779
Side note: is there any way to keep threads in our Discord developer-forums from disappearing? If there are no new posts these threads disappear from my view (they no longer show up in the left sidepanel in the subtree under the developer-forums channel). The only way I've been finding them again is by searching for terms relevant to that thread and wading through the search results till I find one from the thread I'm interested in. Or copying the thread link outside of Discord. Both seem pretty clunky. I'm pretty new to Discord, is there an easy way that I'm missing?

uncut sphinx
visual solar
cerulean tangle
#

hrm not going to be as trivial to make a controlnet node I guess. Something about kern not implemented for half. But the code at least tries to run

uncut sphinx
# visual solar Similar to ControNet, there is also T2I-Adapter (https://github.com/TencentARC/T...

Looks like the diffusers PR for adding T2I-Adapter support went in last night: https://github.com/huggingface/diffusers/pull/2555

#

ControlNet & T2I-Adapter Support

#

ControlNet and T2I-Adapter seem similar enough that I'm adding T2I-Adapter to this discussion...

uncut sphinx
# cerulean tangle hrm not going to be as trivial to make a controlnet node I guess. Something abou...

I've got the diffusers ControlNet example at https://huggingface.co/docs/diffusers/main/en/api/pipelines/stable_diffusion/controlnet working on the same virtual env that I'm using for InvokeAI (mainly need to ensure that diffusers >= 0.14.0 is installed).
Initially the example wasn't working, I was getting a similar error to yours:
"LayerNormKernelImpl" not implemented for 'Half'
Turned out that the example was trying to run on my CPU, which didn't like fp16. Tried first replacing fp16 with fp32, which allowed example to work on CPU, but very slowly (~0.12 it/s). Then put fp16 back and pushed to GPU -- just changed this pipeline call:
pipe = StableDiffusionControlNetPipeline.from_pretrained(....)
to
pipe = StableDiffusionControlNetPipeline.from_pretrained(....).to("cuda")
and it's working well, getting ~10it/s
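
A minimal sketch of the rule of thumb from that debugging session: half precision only when a GPU is available, full precision on CPU. The helper name `pick_device_and_dtype` is made up for illustration, and it returns plain strings so it runs without PyTorch installed:

```python
from typing import Tuple


def pick_device_and_dtype(cuda_available: bool) -> Tuple[str, str]:
    """Return (device, dtype name) that avoids fp16 kernels on CPU."""
    if cuda_available:
        # fp16 is fast on GPU and halves memory use.
        return "cuda", "float16"
    # Some CPU kernels (e.g. LayerNorm) are not implemented for half precision.
    return "cpu", "float32"
```

With PyTorch, `cuda_available` would presumably come from `torch.cuda.is_available()`, the dtype name would map to `torch.float16` or `torch.float32` for `torch_dtype=`, and the device string would go to `.to(...)` as in the fix above.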

#

@cerulean tangle are you working on nodes for ControlNet? Anyone else? If there's nobody already working on it, I was thinking of taking a swing at it this weekend... I'm only looking at the backend, from diffusers integration up to wrapping as a node.

cerulean tangle
uncut sphinx
# cerulean tangle Please have at it. I'd actually prefer someone else try it and get some feedback...

Understood -- I'm following the PR discussion at https://github.com/invoke-ai/InvokeAI/pull/2902 to figure out when to switch over.

#

I'm going to switch over to refactor/nodes-on-generator branch and further branch from there.

mortal elk
#

I think the PR is close to being safe to merge into main. It's actually been approved, but I think @cerulean tangle should make the decision when to hit the merge button.

uncut sphinx
uncut sphinx
#

MultiControlNet support is now merged into diffusers main.
I've tested passing multiple ControlNet models to the same diffusers StableDiffusionControlNetPipeline instance and seems to be working well.

hidden zenith
#

Hello! Are there plans to support ControlNets from the main UI or will it be a Nodes feature?

steep idol
#

Both

#

We are aiming for a powerful set of tools for controlnet on the canvas

uncut sphinx
#

I've been working on adding InvokeAI backend support for ControlNet, based off the recent Generator refactor. Very close to having a working barebones version, but still hitting some Tensor mismatch errors. If I'm still having problems by end of this weekend, I'm hoping I can get some core dev team support for a little pair programming to figure this out.

#

Here's my current test script to give you a sense of how I'm proposing integration with invokeai.backend.generator

import os

import cv2
import torch
from PIL import Image

from diffusers.models.controlnet import ControlNetModel
from invokeai.backend.generator import Txt2Img
from invokeai.backend.model_management import ModelManager

canny_image = cv2.imread("/test_images/input/canny_vermeer.png")
canny_image = Image.fromarray(canny_image)

# using invokeai model management for base model
model_config_path = os.getcwd() + "/../configs/models.yaml"
model_manager = ModelManager(model_config_path)
base_model = model_manager.get_model('stable-diffusion-1.5')

# for now using diffusers model.from_pretrained to load ControlNetModel
canny_controlnet = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16).to("cuda")

# all default params except control_model
txt2img_canny = Txt2Img(base_model, control_model=canny_controlnet)

# all default params except for control_image
outputs = txt2img_canny.generate(prompt="old man",
                                control_image=canny_image)  
generate_output = next(outputs)
out_image = generate_output.image
out_image.save("/test_images/output/canny_controlnet_testout.png")
uncut sphinx
#

original "Girl With A Pearl Earring" by Vermeer

uncut sphinx
#

Left: Canny edge detection applied to Vermeer image
Right: Using draft PR for ControlNet support, with prompt = 'old man' and control_image = canny_vermeer.png

cerulean tangle
#

Looks like I need to fill out the cv nodes 🙂

#

Have you checked out the branch where I've been trying to convert nodes to use latents? I was trying to think through how ControlNet would be added (and if there are other things like it in the future, how that might work): https://github.com/invoke-ai/InvokeAI/blob/kyle0654/node_latents/invokeai/app/invocations/latent.py

Feels like at least control_image should come in as a parameter. But I don't know if we want to just keep adding things onto a "text to image" node, or if "text to image with controlnet" should be its own node?

undone axle
cerulean tangle
#

Yah. I'm worried about how much that grows over time. And it kind of just puts us right back where we started with prompt2image

undone axle
#

Does it make sense to use both an init image and a controlnet for one image? What about multiple controlnets? I don't know, but it would be fun to be able to try it.

#

It might also be a good idea to have an alternative simple prompt node, which does the same thing without all the extra inputs.

final sinew
#

@uncut sphinx great does your current implementation also support controlnet-pose (pose transfer)?

uncut sphinx
honest token
uncut sphinx
# honest token Best to avoid using the original 5.71GB models. Identical results can be achieve...

Ah, I was wrong about where the controlnet models are coming from.
For this first draft of InvokeAI ControlNet support, I'm relying on diffusers loading, for example ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny"). Which is NOT the same as https://huggingface.co/lllyasviel/ControlNet, rather it's https://huggingface.co/lllyasviel/sd-controlnet-canny. And the file it's downloading and caching on my local drive is diffusion_pytorch_model.safetensors. Though it renames it as a long hex number -- not sure if that's for checksum or git versioning purposes? File size is 1.45 GB, so maybe it's fp32 instead of the 723 MB fp16 file at https://huggingface.co/webui/ControlNet-modules-safetensors? So not as compact, but definitely smaller than the 5.71 GB versions at https://huggingface.co/lllyasviel/ControlNet
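
A quick back-of-the-envelope check of that fp32 guess. It assumes every weight in the file is stored at the same precision, that file header overhead is negligible, and that the sizes quoted are decimal MB/GB:

```python
BYTES_PER_FP32 = 4
BYTES_PER_FP16 = 2

# Size reported above for the cached diffusion_pytorch_model.safetensors.
fp32_file_bytes = 1.45e9

# If the file is fp32, this is roughly how many parameters it holds.
approx_params = fp32_file_bytes / BYTES_PER_FP32

# The same parameters stored as fp16 would take half the space.
expected_fp16_bytes = approx_params * BYTES_PER_FP16

print(f"~{approx_params / 1e6:.0f}M parameters")
print(f"expected fp16 size: ~{expected_fp16_bytes / 1e6:.0f} MB")  # about 725 MB
```

That lands within a few MB of the 723 MB fp16 file, which is consistent with the 1.45 GB download being the same weights in fp32.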

mortal elk
#

@uncut sphinx When you use from_pretrained() on any of the HuggingFace models it will build a cached version of the model in which the data object filenames are replaced with hex numbers that are pointed to by symlinks. A few things to be aware of:

  1. When you download the model, you have the option of passing a revision parameter to from_pretrained(). Most (but not all) models include a revision of fp16, which lets you get the half-precision version of the model. If at different times you request both the fp16 and fp32 versions, they will exist side-by-side in the cached model directory in separate directories under snapshots/. Best not to rely in any way on the structure of the cached directory, because it's been known to change. You can get the available model versions by going to the huggingface models page and looking at the branches in the "Files and versions" tab. Some models have EMA vs non-EMA versions, for example.
  2. HuggingFace by default puts its cached models into the ~/.cache/huggingface/hub directory. A long time ago we made the decision to have InvokeAI move the cache into the ~/invokeai/models/hub directory so that all the models would be in one place and users had more visibility into what was eating up gigabytes of their disk space. To be consistent with this convention, you need to pass cache_dir=global_cache_dir("hub") as one of the parameters to from_pretrained. global_cache_dir() is importable from invokeai.backend.globals
  3. When you do from_pretrained() with a repo_id, the HuggingFace client code always pings the server to see if there is an update to the cached version. This results in an annoying "Downloading 100%" message. You can avoid this by specifying local_files_only=True, but this is problematic. If you figure out how to quench the message, let me know.
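
The three points above could be gathered into one place roughly like this. It is only a sketch: `controlnet_load_kwargs` is a made-up helper, and the cache path is passed in by the caller (in InvokeAI it would presumably be `global_cache_dir("hub")`). The keys `cache_dir`, `revision`, and `local_files_only` are the `from_pretrained()` parameters discussed in the list:

```python
from pathlib import Path
from typing import Any, Dict, Optional


def controlnet_load_kwargs(
    cache_dir: Path,
    revision: Optional[str] = None,
    local_files_only: bool = False,
) -> Dict[str, Any]:
    """Build keyword arguments to pass on to from_pretrained()."""
    kwargs: Dict[str, Any] = {
        # Keep downloads in the application's own models directory.
        "cache_dir": str(cache_dir),
        # True skips the server check, but fails if the model is not cached yet.
        "local_files_only": local_files_only,
    }
    if revision is not None:
        # e.g. "fp16", for repos that publish a half-precision branch.
        kwargs["revision"] = revision
    return kwargs
```

Usage would then look something like `ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny", **controlnet_load_kwargs(cache_path, revision="fp16"))`, assuming that repo actually has an fp16 branch.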
uncut sphinx
mortal elk
uncut sphinx
uncut sphinx
#

So, I've now downloaded 36 different ControlNet models. Any others I should be testing? (Please I hope not)

##############################################

lllyasviel sd v1.5, ControlNet v1.0 models

##############################################
"lllyasviel/sd-controlnet-canny",
"lllyasviel/sd-controlnet-depth",
"lllyasviel/sd-controlnet-hed",
"lllyasviel/sd-controlnet-seg",
"lllyasviel/sd-controlnet-openpose",
"lllyasviel/sd-controlnet-scribble",
"lllyasviel/sd-controlnet-normal",
"lllyasviel/sd-controlnet-mlsd",

#############################################

lllyasviel sd v1.5, ControlNet v1.1 models

#############################################
"lllyasviel/control_v11p_sd15_canny",
"lllyasviel/control_v11p_sd15_openpose",
"lllyasviel/control_v11p_sd15_seg",

"lllyasviel/control_v11p_sd15_depth", # broken

"lllyasviel/control_v11f1p_sd15_depth",
"lllyasviel/control_v11p_sd15_normalbae",
"lllyasviel/control_v11p_sd15_scribble",
"lllyasviel/control_v11p_sd15_mlsd",
"lllyasviel/control_v11p_sd15_softedge",
"lllyasviel/control_v11p_sd15s2_lineart_anime",
"lllyasviel/control_v11p_sd15_lineart",
"lllyasviel/control_v11e_sd15_shuffle",
"lllyasviel/control_v11p_sd15_inpaint",
"lllyasviel/control_v11u_sd15_tile",
"lllyasviel/control_v11e_sd15_ip2p",
"lllyasviel/control_v11p_sd15_scribble",

#################################################

thibaud sd v2.1 models (ControlNet v1.0? or v1.1?)

##################################################
"thibaud/controlnet-sd21-openpose-diffusers",
"thibaud/controlnet-sd21-canny-diffusers",
"thibaud/controlnet-sd21-depth-diffusers",
"thibaud/controlnet-sd21-scribble-diffusers",
"thibaud/controlnet-sd21-hed-diffusers",
"thibaud/controlnet-sd21-zoedepth-diffusers",
"thibaud/controlnet-sd21-color-diffusers",
"thibaud/controlnet-sd21-openposev2-diffusers",
"thibaud/controlnet-sd21-lineart-diffusers",
"thibaud/controlnet-sd21-normalbae-diffusers",
"thibaud/controlnet-sd21-ade20k-diffusers",

##############################################

ControlNetMediaPipeface, ControlNet v1.1

##############################################
"CrucibleAI/ControlNetMediaPipeFace", # SD 2.1?
["CrucibleAI/ControlNetMediaPipeFace", "diffusion_sd15"], # SD 1.5
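
The list above mixes plain repo ids with one `[repo, subfolder]` pair. A small sketch of how a test harness might normalize both shapes before loading; `split_model_spec` is a hypothetical helper, and passing the second element as `subfolder=` to `from_pretrained()` is an assumption about how that pair is meant to be used:

```python
from typing import List, Optional, Tuple, Union

ModelSpec = Union[str, List[str]]


def split_model_spec(spec: ModelSpec) -> Tuple[str, Optional[str]]:
    """Return (repo_id, subfolder) for either a bare id or a [repo, subfolder] pair."""
    if isinstance(spec, str):
        return spec, None
    if len(spec) != 2:
        raise ValueError(f"expected [repo_id, subfolder], got {spec!r}")
    repo_id, subfolder = spec
    return repo_id, subfolder
```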

steep idol
#

honestly, if all these work, i think we're in good shape

#

@atomic eagle once nodes api is merged in, is there any reason @uncut sphinx couldn't be noding away?

atomic eagle
#

can't think of any - i'm very happy to have a 1-on-1 to clarify any questions as well

steep idol
#

@uncut sphinx is there a way to test this?

#

would love to poke at it today, and am happy to (try) to help taking on nodifying it

uncut sphinx
steep idol
#

Nodes api pr is getting merged in soon so that blocker should be a non issue. Would love to catch up - will have to be once these kids go down

#

Or sometime this weekend

uncut sphinx
uncut sphinx
# steep idol would love to poke at it today, and am happy to (try) to help taking on nodifyin...

The current draft PR should be usable: https://github.com/invoke-ai/InvokeAI/pull/3156
Or the branch it's based on: https://github.com/invoke-ai/InvokeAI/tree/feat/controlnet_backend
Although I haven't rebased to main in several days, so maybe I should bring it up to date.
Also I don't think there's an example usage script in there -- I'll clean up a test script and add it in this evening.
I've also made changes that I haven't committed yet, though they shouldn't change basic usage.

atomic eagle
steep idol
#

Just got kids down, they may intrude but am good to chat if you both are! @atomic eagle @uncut sphinx

steep idol
#

👌

#

guess we could meet in the discord server voice channel?

atomic eagle
#

sure

atomic eagle
#

@uncut sphinx we're in the voice chat if you are online

uncut sphinx
atomic eagle
uncut sphinx
#

ControlNet in node UI !
It's only partial implementation and hacky right now, but working.

steep idol
#

The most beautiful old man with a pearl earring I ever did see

dusky prawn
#

Hey there! Just jumping in to say that you guys are doing a fantastic job!
I have a question, though. Is Controlnet planned to only work through the node interface or are you guys also thinking about having it elsewhere? I guess I'm imagining that having control net work as a kind of stamp in the canvas would be amazing.
Like in the case of pose control, you could just position your character inside your selection box and inpaint. It would be crazy!

atomic eagle
steep idol
#

@uncut sphinx - how are we looking on controlnet readiness to merge? (and for anyone paying attention to this thread, latest sneak peek/update)

coarse quail
#

There is some confusion with the color based type marking in this image. Control and conditioning outputs are both the same cyan. Also the input to collect is grey but it's taking cyan wires and the control input on T2L is grey for collection, but can't it take control wires directly (ie Cyan)?
It looks like we might be running out of colors. Could shapes be used in addition to colors (ie squares, triangles) to prevent overlap?

steep idol
#

would say that inputs/outputs design is not fully final

#

i think the cyan is being used because both are passing in an input+model

coarse quail
#

I know tone does not come across in text very well, and I know the design is not final, so this isn't meant to be accusatory or anything.
Those 2 input types (conditioning from compel vs control net) should never be plugged into each other. If we are running out of colors, and the idea is to clearly mark types graphically, then something more like shapes and colors might be helpful to avoid overlap.
I actually think grey is a good color for "takes any input type" for something like collect that can take different things, but it should then change to match what is being used for input and output. So the collect node starts grey and then once you plug something in, the circles on both ends change to match.

steep idol
#

Yes, my response was more of a "yep, that's true and valid feedback. these aren't intended to be final, so things will change in that direction"

#

There has been almost no styling done on the nodes editor yet, it's pretty much just 'functional UI'

uncut sphinx
uncut sphinx
steep idol
#

tend to think if it's a solid/stable foundation, getting it in now and handling enhancements through a diff PR would be optimal

uncut sphinx
#

For TextToLatents-based controlnet support, should be "feature-complete" by Saturday

steep idol
#

🚀

#

just trying to understand in what context it's "incomplete" rn

uncut sphinx
atomic eagle
coarse quail
#

My suggestion would be to use a few shapes and stick to fewer, more distinctive colors. In the above shots, there are several shades of blue/green being used that are so close that people might not be able to distinguish them (model, seamless, and scheduler on the T2L for example)

wispy crest
#

Moving the label inside the handle might help a little bit too? It’d make the color a bit more supplementary and give you more room to do other stuff like patterned background colors, or different font/background color combos.

uncut sphinx
steep idol
#

@uncut sphinx just got my env set up to start testing this

#

How does one load an image into the preprocessor node? It doesn't seem like there's an upstream node, but tapping the Image icon etc. doesn't prompt for an upload. Wondering if that's just a firefox issue w/ node editor ui

#

Second question - ControlNet models don't seem to autopopulate as a dropdown in the node editor. Do these need to be manually typed in, or is it because I'm missing models? What are the standard pre-reqs to use this?

uncut sphinx
uncut sphinx
# steep idol Second question - ControlNet models don't seem to autopopulate as a dropdown in ...

Right now there is no autopopulate, it's free text entry. The name of any controlnet models hosted on huggingface should work. Here's my current list of popular models that I copy/paste from:

##############################################

lllyasviel sd v1.5, ControlNet v1.0 models

##############################################
lllyasviel/sd-controlnet-canny
lllyasviel/sd-controlnet-depth
lllyasviel/sd-controlnet-hed
lllyasviel/sd-controlnet-seg
lllyasviel/sd-controlnet-openpose
lllyasviel/sd-controlnet-scribble
lllyasviel/sd-controlnet-normal
lllyasviel/sd-controlnet-mlsd

#############################################

lllyasviel sd v1.5, ControlNet v1.1 models

#############################################
lllyasviel/control_v11p_sd15_canny
lllyasviel/control_v11p_sd15_openpose
lllyasviel/control_v11p_sd15_seg

lllyasviel/control_v11p_sd15_depth

broken, instead use:

lllyasviel/control_v11f1p_sd15_depth
lllyasviel/control_v11p_sd15_normalbae
lllyasviel/control_v11p_sd15_scribble
lllyasviel/control_v11p_sd15_mlsd
lllyasviel/control_v11p_sd15_softedge
lllyasviel/control_v11p_sd15s2_lineart_anime
lllyasviel/control_v11p_sd15_lineart
lllyasviel/control_v11p_sd15_inpaint

lllyasviel/control_v11u_sd15_tile

problem (temporary?) with huggingface "lllyasviel/control_v11u_sd15_tile"

suggestion for now is to replace with:

lllyasviel/control_v11f1e_sd15_tile
lllyasviel/control_v11e_sd15_shuffle
lllyasviel/control_v11e_sd15_ip2p
lllyasviel/control_v11f1e_sd15_tile

#################################################

thibaud sd v2.1 models (ControlNet v1.0? or v1.1?)

##################################################
thibaud/controlnet-sd21-openpose-diffusers
thibaud/controlnet-sd21-canny-diffusers
thibaud/controlnet-sd21-depth-diffusers
thibaud/controlnet-sd21-scribble-diffusers
thibaud/controlnet-sd21-hed-diffusers
thibaud/controlnet-sd21-zoedepth-diffusers
thibaud/controlnet-sd21-color-diffusers
thibaud/controlnet-sd21-openposev2-diffusers
thibaud/controlnet-sd21-lineart-diffusers
thibaud/controlnet-sd21-normalbae-diffusers
thibaud/controlnet-sd21-ade20k-diffusers

#

Should I pre-populate? Is there a way to prepopulate a node port but also allow free text entry for other controlnet models?

#

@steep idol are you testing with Text2Latents node from latent.py or Text2Image node from generate.py? Text2Latents is the most up-to-date, I'm actually not sure if ControlNet with Text2Image will run properly with the latest updates.

uncut sphinx
#

Here's a screenshot of current usage in Node UI, with result and intermediate canny edge detection images in gallery.

#

Another thing to be aware of: if you haven't pre-loaded a ControlNet model, it will download and cache the first time you use it. But there's currently no warning of this on the client, so while downloading, the Node UI will appear to be stuck while executing. If you're running the InvokeAI server in a terminal you can see the download progress there.
And even if you have pre-loaded a ControlNet model, some of the image preprocessors have their own internal models that also need to be downloaded and cached if they haven't been used before.

atomic eagle
uncut sphinx
atomic eagle
#

omg so cool

uncut sphinx
# atomic eagle omg so cool

Glad it's working for you! I'm going offline for the next 20-ish hours, but look forward to any feedback once I return.

atomic eagle
uncut sphinx
#

Thanks! FYI next thing I plan to do is get it working with LatentsToLatents node (for Img2Img). Need to refactor in TextToLatents so LatentsToLatents will inherit the ControlNet functionality.

coarse quail
#

definitely going to need that list to autopopulate for full release.

#

Is there a need to cover the CN1.0 models? I believe CN1.1 is a strict upgrade.

#

Don't forget to test the brand new 'reference' model, especially since it might cause a UI challenge since it's the first preprocessor-only controlnet model.

steep idol
#

Getting an error on Depth model controlnet (1.1)
UserWarning: Mapping deprecated model name vit_base_resnet50_384 to current vit_base_r50_s16_384.orig_in21k_ft_in1k.

#

Having issues in general getting this to generate :/

#

Think it might be just my local install or something I'm not doing right

#

Yeah none of it is generating for me.

#

(normal txt2img)

coarse quail
#

BTW just checking but do you have nodes for passing in control images directly (ie no preprocessor, like open poses generated externally)?

steep idol
#

Yep

graceful kindle
#

You can pass image directly to controlnet node then

steep idol
coarse quail
#

Looks like the reference ControlNet is getting a bunch of custom settings.

steep idol
#

Yeah, same with HED:
processed_image = hed_processor(image,\nTypeError: HEDdetector.__call__() got an unexpected keyword argument 'safe'\n', … }

#

output is image going to image, perhaps settings are getting passed in as args somehow?

atomic eagle
steep idol
#

Preprocess > controlnet (image > image connectors)

uncut sphinx
# steep idol ```processed_image = midas_processor(image,\nTypeError: MidasDetector.__call__()...

I see what's going wrong -- for both of the "unexpected keyword argument" errors it's a mismatch between controlnet_aux package versions. That's my bad. For development I've been using current main from https://github.com/patrickvonplaten/controlnet_aux. But for deployment in pyproject.toml I've pinned controlnet_aux to v0.0.3, the latest stable release. And intended to comment out anything that was relying on controlnet_aux > 0.0.3, but I missed a few things.
I will fix and push PR ASAP. If you need a more immediate fix, installing latest controlnet_aux from github repo should fix as well.
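
For anyone hitting the same kind of "unexpected keyword argument" error across package versions, one defensive pattern is to pass only the keyword arguments the installed callable actually accepts. This is a generic sketch, not what the PR does (the PR pins the package version); `OldDetector` and `NewDetector` are invented stand-ins for two versions of a preprocessor:

```python
import inspect
from typing import Any, Callable


def call_with_supported_kwargs(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
    """Call fn, dropping keyword arguments its signature does not accept."""
    params = inspect.signature(fn).parameters
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        # fn takes **kwargs, so everything can be passed through.
        return fn(*args, **kwargs)
    supported = {k: v for k, v in kwargs.items() if k in params}
    return fn(*args, **supported)


class OldDetector:
    # Older release: no 'safe' option.
    def __call__(self, image):
        return ("old", image)


class NewDetector:
    # Newer release: accepts 'safe'.
    def __call__(self, image, safe=False):
        return ("new", image, safe)
```

The trade-off is that a dropped option is silently ignored, so pinning a known-good version is still the more predictable fix.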

broken parrot
#

is there a tag where i can checkout controlnet in invoke?

atomic eagle
#

only functional in the node editor right now

uncut sphinx
steep idol
uncut sphinx
steep idol
#

💯

#

sweet

#

Trying to test out, running into some issues getting my instance up and running.

#

100% positive it's something local

#

But can't test until I resolve

#

Yep. I had another instance open 😏

steep idol
#

Hm, now im getting compel errors w/ conjunction 😕

#

ah must have gotten upgraded and still on 1.1.5 🤦‍♂️

#

ok - after getting through all of my local instance challenges, I am sad to report I am still getting error on Midas

\\midas\\vit.py", line 145, in forward_flex\n x = x + pos_embed\nRuntimeError: The size of tensor a (2167) must match the size of tensor b (2073) at non-singleton dimension 1\n', … }

#

HED works though.

uncut sphinx
uncut sphinx
steep idol
#

Think it was a 1280x768

steep idol
#

It’s failing at the Midas node, though, but Midas is outputting to controlnet set to 1.1 depth

#

I’ll give it a try with 512x512

steep idol
#

@uncut sphinx - 512 x 512 works!

uncut sphinx
#

Testing addition of ControlNet support to LatentsToLatents (latent-based nodes for Img2Img):

steep idol
#

So good haha

uncut sphinx
steep idol
#

Yeah it’s a super powerful combo

uncut sphinx
steep idol
#

Yeah you’d get flagged for that non-controlnet image as NSFW!

atomic eagle
coarse quail
coarse quail
atomic eagle
#

inferring the type will be kinda tricky, will have to traverse the graph in the UI. but there's a nice graph library that i'm using for some of that kind of thing which can do the hard part of that for us.

uncut sphinx
# coarse quail speaking of can it handle that model since it's so different from the other Cont...

Regarding support for "reference-only" ControlNet mode.
The InvokeAI ControlNet support I've been working on does not yet support reference-only mode. I definitely want to add it to InvokeAI. Reference-only does not have the same kind of pluggable model that other ControlNet modes do. From browsing the code, it looks very different. There is actually already a PR pending to add to diffusers: https://github.com/huggingface/diffusers/pull/3435. Which implements the reference-only feature but without the rest of ControlNet, making it easier to understand (at least for me). Any opinion on whether an InvokeAI implementation should include it under ControlNets or separate out as its own thing?

steep idol
#

Separate it

#

I think we’ll use it on its own to power some things on the canvas

#

And I think as a UX pattern, would be confusing - although I can see an argument for folks saying it deviates from what is known in auto

mortal elk
steep idol
uncut sphinx
# steep idol <@938391908814835722> is https://github.com/invoke-ai/InvokeAI/pull/3405 still t...

Yep that's the latest ControlNet PR. @atomic eagle or @graceful kindle, could you take a look?
Recent modifications include pinning to the newly released controlnet_aux v0.0.4, reinstating the Zoe depth preprocessor node, and adding a Mediapipeface preprocessor node. Also, thanks to a great session with @atomic eagle, we got polymorphic input ports on nodes working. So now the control input port on TextToLatents can take either a single ControlField input or a list of ControlFields. So a single ControlNet can connect directly to TextToLatents without going through a Collect node.
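
In code terms, a polymorphic control port roughly amounts to normalizing "one item or a list of items" into a list before use. The sketch below is illustrative only: this `ControlField` dataclass and its fields are simplified stand-ins, not the actual InvokeAI class:

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class ControlField:
    # Hypothetical, simplified stand-in for the real field type.
    control_model: str
    control_weight: float = 1.0


def as_control_list(
    control: Optional[Union[ControlField, List[ControlField]]],
) -> List[ControlField]:
    """Accept nothing, a single ControlField, or a list, and always return a list."""
    if control is None:
        return []
    if isinstance(control, ControlField):
        return [control]
    return list(control)
```

Downstream code can then loop over the result without caring whether a Collect node was involved.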

atomic eagle
steep idol
#

I gave it my 👍

#

I don't know what other reviews are needed before it can be merged.

uncut sphinx
# atomic eagle did you get to double-checking our changes don't cause bigger issues?

I haven't seen any new issues.

There are problems with some node connections disappearing from the UI on reload, even though reloaded graph still executes as if the connections are there. But prior to our changes I was already seeing that with ControlNet nodes, image preprocessor nodes, and sometimes Noise nodes.

But I do need to go over the code changes for polymorphic node inputs again. And I need to test more diverse graphs, I've mostly been testing ones that include ControlNet. Going to try to do a polymorphic node input that has nothing to do with ControlNet...

atomic eagle
#

Some extra tests on different types of inputs/outputs would be great - both valid polymorphics and invalid need to be tested (eg a list of int should not work when connected to a poly string input)
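
A sketch of what those valid/invalid cases might look like as a value-level check. The real validation presumably happens on port types in the graph rather than on values, and `poly_accepts` is an invented name:

```python
from typing import Any


def poly_accepts(item_type: type, value: Any) -> bool:
    """True if value is an item_type or a list whose elements are all item_type."""
    if isinstance(value, list):
        # Note: an empty list passes, since there is nothing to contradict the type.
        return all(isinstance(v, item_type) for v in value)
    return isinstance(value, item_type)


# Valid: a single string, or a list of strings, into a poly string input.
print(poly_accepts(str, "canny"))
print(poly_accepts(str, ["canny", "depth"]))

# Invalid: a list of int, or a bare int, into a poly string input.
print(poly_accepts(str, [1, 2]))
print(poly_accepts(str, 3))
```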

uncut sphinx
atomic eagle
#

yeah baby!

#

great work @uncut sphinx

uncut sphinx