#nodes


sinful forge
#

I think what happened is I wrote a bunch of "how to use the side-by-side script" in the nodes PR, but that never got reflected in the repo (because the new stuff was meant to replace the old stuff)

rough halo
#

would the project be generally receptive to including invoke-new and its current temporary web assets in the official docker containers? I'm happy to continue building my own if not, but I don't think the changes would be enormous

#

(and crucially, it wouldn't affect anyone who didn't explicitly choose to change the endpoint)

broken blaze
#

i think with the timeframe we're looking at completing nodes in, it'd be a distraction to not just finish out nodes and then get docker containers updated

rough halo
#

fair enough 🙂

#

(and I'm curious what kinda timeframe that is 🙂 )

broken blaze
#

soon™️

upbeat prism
#

we've got txt2img, img2img, node editor and basic basic gallery functionality working. canvas will be the big kahuna.

broken blaze
#

and we've gotta port LoRA.

hoary pecan
#

Reading tests was super helpful btw, starting to get a feel for things. @broken junco if you refresh some of the docs today, I can happily follow-up with some suggestions/edits when I have time to go deeper.

#

Has there been any talk of like replace all “Invocation” with “Node” across the codebase? Not sure if that’s a perfect suggestion (and it’s not at all necessary) but might make mental modeling easier for newcomers.

broken blaze
#

Naming debates are pretty common. 😛

sinful forge
broken junco
hoary pecan
#

Yeah no worries! I’ll take semi-detailed notes as I learn stuff and can share them whenever.

low furnace
sinful forge
low furnace
sinful forge
#

Uh.. I guess not. With contention that may be difficult.

#

Unless you have per-session compute or something

low furnace
#

contention meaning many different writers racing to modify the queue?

sinful forge
#

Then you could just lock around a session

#

Yah at scale I wouldn't want to lock the queue around a delete

#

I generally try to follow a lockless approach where possible

low furnace
#

okay, yeah, that makes a lot of sense. thank you!

rough halo
#

Might be the ugliest MVP I've ever made, but it's clearly some amount of V 😄

#

fragile discord frontend talking to fragile invokeai client talking to unfinished invokeai nodes. wcpgw 😄

#

(if anyone fancies pitching in with de-fragiling it, I'd be very happy to add you to the (almost entirely empty) discord server I'm testing this, so you get free image generation on my GPU 😉 )

rough halo
#

hmm, I just noticed that my essentially-idle invoke-new.py is using ~15% CPU

rough halo
#

going by strace it looks like it's hard-looping on something

rough halo
#

poking at the code it all seems like it's just on uvicorn and fastapi to DTRT with the asyncio loop :/

rough halo
#

```
   Ordered by: cumulative time
   List reduced from 15388 to 10 due to restriction <10>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   4488/1    0.078    0.000   17.298   17.298 {built-in method builtins.exec}
        1    0.000    0.000   17.298   17.298 /invoke-new.py:1(<module>)
        1    0.000    0.000   17.298   17.298 /invoke-new.py:8(main)
        1    0.000    0.000   13.875   13.875 /usr/src/InvokeAI/lib/python3.10/site-packages/invokeai/app/api_app.py:148(invoke_api)
        1    0.000    0.000   13.874   13.874 /usr/local/lib/python3.10/asyncio/base_events.py:613(run_until_complete)
        1    0.015    0.015   13.873   13.873 /usr/local/lib/python3.10/asyncio/base_events.py:589(run_forever)
    13153    0.261    0.000   13.858    0.001 /usr/local/lib/python3.10/asyncio/base_events.py:1832(_run_once)
    13153    0.063    0.000    7.374    0.001 /usr/local/lib/python3.10/selectors.py:452(select)
    13153    7.289    0.001    7.289    0.001 {method 'poll' of 'select.epoll' objects}
    13325    0.041    0.000    6.123    0.000 /usr/local/lib/python3.10/asyncio/events.py:78(_run)
```
looking at profiling output it's hitting epoll *super* hard
#

polling 700+ times a second doesn't seem sane 😄

sinful forge
#

Feel free to improve it 🙂

The processor also runs another thread, but that should block on a queue pull, so I don't think it spins.

rough halo
sinful forge
#

Ah, could be. I don't think it worked right unless there was at least a small loop, but if you can find a way to get it working that'd be great.

#

I'm still not super familiar with the async patterns in Python 😣

rough halo
#

(also apologies, I didn't mean the "doesn't seem sane" thing as a criticism of the work being done here, I'm going into this with no experience of uvicorn or fastapi and trying to figure out what's going on 🙂

#

ok, so changing that to 0.1 has dropped CPU usage below 1%, so I guess now I should check if I can still talk to the API

#

yep, seems to work!

#

I will send a PR in a moment

#

@sinful forge apologies again, I didn't need to be snarky, I could have just done the work first and put the PR up. I'm also fairly new to asyncio and it has been a painful thing to learn

sinful forge
#

Does it work with any lower number? Worry being that the longer delay may cause issues. (Needing a delay seems really weird either way)

rough halo
#

the delay is because self.__queue.get() can't be allowed to block (otherwise asyncio would be stuck across all coroutines)

#

ultimately in any async/await you are always going to have something, somewhere, polling in a loop

#

I am like 99.5% not an expert in this codebase, but I would be surprised if taking 100ms to pick up an event from this queue, vs taking 1ms, would make any kind of difference other than reducing the polling load

#

I think that class is used for invocations of the graph nodes? In which case these are jobs that are likely to be taking a good chunk of time doing image generation or upscaling or whatever, which suggests to me that this doesn't need to be a hyper-performant polling loop
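
A minimal sketch of the polling pattern described above, assuming a thread-safe `queue.Queue` that is drained from inside a coroutine. `drain_queue`, `handle`, `interval` and `max_items` are illustrative names, not the ones used in invoke-new.py; the 0.1 second default mirrors the value that dropped CPU usage below 1%.

```python
import asyncio
import queue
from typing import Callable, Optional


async def drain_queue(
    q: queue.Queue,
    handle: Callable[[object], None],
    interval: float = 0.1,
    max_items: Optional[int] = None,
) -> int:
    """Poll a thread-safe queue without ever blocking the event loop."""
    handled = 0
    while max_items is None or handled < max_items:
        try:
            item = q.get_nowait()  # returns immediately or raises queue.Empty
        except queue.Empty:
            # Nothing queued: yield to other coroutines for `interval` seconds.
            await asyncio.sleep(interval)
            continue
        handle(item)
        handled += 1
    return handled
```

A longer `interval` trades pickup latency for fewer wakeups. An alternative that avoids the sleep entirely is to run the blocking `get()` in a worker thread, for example via `loop.run_in_executor`.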

upbeat prism
#

Fwiw, on my gpu, one step at 512x512 takes about 30ms. I don’t mind waiting an extra 70ms max for my request to get started

rough halo
#

woot merged. thanks!

chrome bobcat
#

Systems can also wait on a lock, which the OS implements however it fits, for events to come in.

#

See also: condition variables.

#

But that article doesn't discuss spurious wakeup, so keep that in mind as well.
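
For reference, a small sketch of the condition-variable approach using Python's `threading.Condition`. `EventInbox` is a made-up class for illustration and is not part of the InvokeAI codebase. The predicate passed to `wait_for()` is what guards against spurious wakeups.

```python
import threading
from collections import deque
from typing import Optional


class EventInbox:
    """Blocking hand-off between threads built on a condition variable."""

    def __init__(self) -> None:
        self._items: deque = deque()
        self._cond = threading.Condition()

    def put(self, item) -> None:
        with self._cond:
            self._items.append(item)
            self._cond.notify()  # wake one waiting consumer

    def get(self, timeout: Optional[float] = None):
        with self._cond:
            # wait_for() re-checks the predicate after every wakeup, so a
            # spurious or stolen wakeup simply goes back to waiting.
            if not self._cond.wait_for(lambda: len(self._items) > 0, timeout):
                raise TimeoutError("no event arrived in time")
            return self._items.popleft()
```

The waiting thread sleeps inside the OS primitive rather than polling, so an idle consumer costs no CPU.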

tulip sluice
#

@violet sleet just tested the PR .. was doing something similar myself just now

#

works as intended

#

but i think we need to rethink some stuff here

#

Add a model loader node that outputs three values - Text Encoder, UNet, VAE
Feed the Text Encoder to the Compel Node instead of the Model
Feed the UNet to the Text To Latent instead of Model
Feed the VAE to the Latent instead of the Model

#

but for now .. this is exactly how it should work

violet sleet
tulip sluice
#

or is it already merged?

violet sleet
#

I haven't PR'd it, as it's not ready yet

#

I made a PR with the compel part, as that works fine now

tulip sluice
#

it works perfectly

#

tagging @upbeat prism coz we've been talking about this all morning ..

violet sleet
#

#dev-chat message

#

Also about model/vae/clip loading talk:
#dev-chat message

tulip sluice
#

ah yes

#

i remember that now

broken junco
# tulip sluice <@522941968452419584> we good to merge that PR?

If you're talking about https://github.com/invoke-ai/InvokeAI/pull/3235, then no, this isn't ready to merge. It was developed to address a specific issue with users who try to attach a checkpoint VAE to a diffusers model, and as a side effect I implemented VAE-only loading. After the convo this morning I decided to generalize this to load and cache arbitrary parts of models, but there needs to be some more work done to support the caching.

violet sleet
#

I'm thinking about a separate 'Simple compel' node that doesn't support loras, but works as you said (and maybe with only one prompt)

sinful forge
broken blaze
tulip sluice
broken blaze
#

Would tend to lean towards generally accepting what Kyle says re nodes

#

🙃

violet sleet
#

I think that in this situation a check could be implemented so that a node can only be started on a machine that has the model
But that's for the future anyway, for now we need to do the basics)

tulip sluice
#

so need to figure a way to make that ux better

violet sleet
#

Also, a question about my PR - are we going to keep supporting the legacy blend syntax?

sinful forge
sinful forge
#

But I feel like I recall those components not really being very mix and match

#

Though there may be interesting use cases for mixing and matching

#

The CLI uses "defaults" to let you specify a parameter once and then keep using it

hasty bane
violet sleet
#

Yep, but it's still supported in the code
I didn't copy this code into my PR

upbeat prism
#

any model nodes should be providing string identifiers of the models (and any serializable config). then the service handles the rest.

tulip sluice
tulip sluice
#

right now even though the model is loaded and the individual components are set, it is still reloading them

sinful forge
#

Passing them through edges like that is not going to work well, unless you're just passing an identifier to help it load later.

upbeat prism
#

to elaborate on the broader context for any nodelings - each node may be handled by a different worker (machine or thread). the inputs and outputs need to be serialisable and transferrable over network.

this means that we cannot send a whole model or model component directly between nodes. in such a distributed system, we can expect that every worker has the same models available, but not necessarily loaded or cached.

so in the invoke() method of a node, if it needs a model to do its business, it should ask the model manager for it by ID and/or type. the model manager will synchronously prepare the model (maybe it is loaded or just retrieved from a cache) and then the business logic can be executed.

the proposed updates to the model manager service will support this - the API is TBD.

in the mean time, if a node needs a model, it should handle loading it itself. you'll need to deal with the inefficient loading for now. once the model manager service is up and running, you'll be able to access it via the context object provided to the invoke() method.

(the exception is standard full SD models as the model manager already handles these)
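
A toy sketch of the identifier-passing idea above, using only the standard library. `ModelRef`, `ToyModelManager`, `load_model_node` and `decode_node` are hypothetical names; the real model manager API is still TBD. The point is that only JSON-serializable identifiers cross an edge, and the node asks the manager for the actual object at invoke time.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ModelRef:
    """Serializable pointer to a model component; this is what travels on an edge."""
    name: str
    part: str  # e.g. "unet", "vae", "text_encoder"


class ToyModelManager:
    """Stand-in for the model manager service: load on first request, then cache."""

    def __init__(self, loader):
        self._loader = loader  # hypothetical callable: ModelRef -> loaded object
        self._cache = {}
        self.load_count = 0

    def get(self, ref: ModelRef):
        if ref not in self._cache:
            self.load_count += 1
            self._cache[ref] = self._loader(ref)
        return self._cache[ref]


def load_model_node(name: str) -> str:
    """Emits only identifiers, serialized as JSON."""
    return json.dumps({"vae": asdict(ModelRef(name, "vae"))})


def decode_node(inputs_json: str, manager: ToyModelManager) -> str:
    """Resolves the identifier at invoke time instead of receiving a model object."""
    ref = ModelRef(**json.loads(inputs_json)["vae"])
    vae = manager.get(ref)
    return json.dumps({"type": "image", "decoded_with": vae})
```

Because the cache lives inside the manager, a worker that already holds the component skips the load, while a different worker can still resolve the same identifier on its own.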

hollow marlin
#

ControlNet in node UI !
It's only partial implementation and hacky right now, but working.

tulip sluice
violet sleet
#

maybe it's a bit of a stupid question, but - how do I run the default graph (text_to_image) in the cli?)

upbeat prism
#

you must provide the full field name for each parameter

#

txt2img --prompt "cat" --steps 30 | img2img --prompt "dog" --steps 30 | show_image

violet sleet
#

found, but it's t2i)

upbeat prism
#

ah, that is what you were looking for , i misunderstood

tulip sluice
tulip sluice
#

is this that or did you create an entirely new graph for controlnet?

hollow marlin
tulip sluice
#

awesome .. let me know when you have a pr up .. would love to try .

hollow marlin
tulip sluice
#

getting the preprocessors nodes in too?

violet sleet
#

i think it might be done with the conditioning fields from my pr in the future

#

as separate node

hollow marlin
violet sleet
#

or maybe even create new field type, not sure what fits better

chrome bobcat
#

So did we make a tag for dev but not unstable?

upbeat prism
#

ya pre-nodes tag is the last main that has a fully working ui

#

err, maybe i am misunderstanding

hasty bane
# tulip sluice need to look into caching the individual elements

Looks nice. 2 questions:

  1. You are setting the models independently in two places (load model and text to latent). Should load model be outputting to an input on text to latent in some way? Maybe just a string or id so it knows to grab the loaded model. I noticed the model loader is in progress so maybe this is temp.
  2. Latent to image doesn't seem to have size input or settings, just outputs. Are they being implicitly passed through via latent to image from the noise node? Might cause confusion if so (looks like things coming from nowhere).
upbeat prism
violet sleet
#
  1. it's still in progress, for example this is my draft:
#

@tulip sluice also - we need to choose: if we import loras in the compel node, then unet should come in, with clip+unet as output
if we don't load loras in the compel block, then it's like you've done

#

@upbeat prism is it possible to create a node with a variable number of parameters (which would be an array input parameter)?
so for example if you enter a value in the last field - another one is added at the end
or with +- buttons

violet sleet
#

@hollow marlin could you look at dm?)

upbeat prism
violet sleet
#

node to load loras

#

so you do smth like

name: lora1 strength: 0.7 (-)
(+)

->

name: lora1 strength: 0.7 (-)
name: lora2 strength: 0.5 (-)
(+)
runic karma
#

loras should be loaded, like they are now, only just-in-time during the SD inference loop

violet sleet
#

i see it working without loading the unet
it's like:

ModelField:
    name: str
    loras: List[Tuple[str, float]]
#

we only say which loras to apply on future unet loads

#

comfyui implements it the same way, but they have a list of patches to weights

#

we can do it the same way, but unlike our implementation with a hook, it requires making a copy of the unet every time and we can't cache it
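
A rough sketch of the "record the loras now, apply them at load time" idea, using the `ModelField` shape from the draft above. `with_lora` is a hypothetical helper, and a tuple is used instead of `List` here only so the field stays immutable and hashable; no weights are loaded or patched.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class ModelField:
    name: str
    loras: Tuple[Tuple[str, float], ...] = ()  # (lora name, strength) pairs


def with_lora(model: ModelField, lora_name: str, strength: float) -> ModelField:
    """Return a new field with one more LoRA recorded; no weights are touched."""
    return replace(model, loras=model.loras + ((lora_name, strength),))
```

Chaining `with_lora(with_lora(base, "lora1", 0.7), "lora2", 0.5)` leaves `base` unchanged, so the unpatched unet can stay cached and the list is only consumed when the unet is actually used.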

rough halo
#

hmm, should the image upload API be working? I'm getting some weird errors when I use python, so I figured I'd at least start with curl and compare some pcaps of the two, but it's not working with curl either:

```
{"detail":[{"loc":["body","file"],"msg":"field required","type":"value_error.missing"}]}
```
#

I added -L to the example from the docs, because otherwise it doesn't follow the redirect to /uploads/ and therefore doesn't even get this far

violet sleet
rough halo
#

huh

#

facepalm I had Content-Type: multipart-form-data instead of multipart/form-data

#

well, at least now I can figure out why my python version isn't working 😄
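
For anyone hitting the same thing from Python, a hedged sketch of building a multipart upload body by hand with the standard library. `encode_multipart` is a hypothetical helper; the field name `file` is taken from the error message above, and everything else (filename, content type) is an assumption.

```python
import uuid
from typing import Dict, Tuple


def encode_multipart(
    field: str,
    filename: str,
    data: bytes,
    content_type: str = "application/octet-stream",
) -> Tuple[Dict[str, str], bytes]:
    """Build headers and body for a single-file multipart/form-data upload."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    # Note the slash: "multipart/form-data", plus the boundary parameter.
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return headers, head + data + tail
```

HTTP client libraries normally generate this header for you when given a file object, so setting `Content-Type` manually is usually what introduces the mistake.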

tulip sluice
violet sleet
tulip sluice
#

Yeah tack it on after the model loader individually and feed that into the sampler directly

#

Does it need to go through prompting ? Don’t think so right ?

violet sleet
#

Lora modifies unet and text_encoder
Prompt goes through text_encoder
Sampler uses unet

tulip sluice
#

Yeah..so we do model — Lora — feed the tokenizer from model to compel and feed the text encoder from Lora to compel … then feed unet directly from model to sampler and the positive and negative conditioning will go in from compel

#

And we’ll need a merge Lora node so we can merge multiple Lora’s before the text encoder is fed to compel

violet sleet
#

Yep, but I don't think that we need to separate tokenizer and text_encoder

tulip sluice
#

Sounds like a plan

violet sleet
tulip sluice
#

No use case ?

#

So model will have 3 outputs ? Text encoder / unet / vae ?

#

Text encoder being txt encoder plus tokenizer

violet sleet
#

Yes, that's why I named it clip in my draft

tulip sluice
#

How are you passing them ? Just the name and individually load them ?

violet sleet
#

I don't know how to pass them correctly, so I created some magic (give me a sec, I'll find it)

#

#dev-chat message

#

If we assume that it's all added to invoke, then yes - i just use the name

#

And an array of lora names that need to be applied on use

tulip sluice
#

Are they getting cached correctly ?

violet sleet
#

Right now nothing works correctly

tulip sluice
#

There are functions to call individual parts of a model

violet sleet
#

They load full model

tulip sluice
#

In the model manager

#

Even if it’s already cached ?

violet sleet
#

They just call get_model and take the part you need from the resulting object

sinful forge
#

Make sure it's not called load_model when it's not actually loading (just getting metadata).

#

And remember nothing stays in memory between nodes (as a rule, though in practice it may to improve performance)

violet sleet
tulip sluice
#

it will send you the vae of the model in memory

#

if its already loaded

#

so if you load the model in the Load Model node

#

then you can just call this back when u need the vae later

#

pass out the model name from the Load Model

#

and use that as reference in the VAE decoder to call just the vae of the loaded model

sinful forge
#

But that model is not guaranteed to stay in memory between nodes

violet sleet
tulip sluice
#

which seems to be how it works?

violet sleet
#

So we asked lstein to implement real separate loading

tulip sluice
#

hmm

sinful forge
violet sleet
#

Also - why load the full model if you're only interested in the unet? Loading just that part can save loading time

sinful forge
#

I mean in some future a node scheduler that intelligently batches nodes by model might be cool, but even in that case a crash between nodes breaks it

violet sleet
#

And loading the tokenizer might be a lot faster than loading the full model

sinful forge
#

Yah, I thought the functions on the model manager just loaded and cached the components?

violet sleet
#

Right now it calls get_model and then takes vae/text_encoder/... from the result

#

And for our use case - yes, they should do this at the component level
It's pretty simple in code until you remember checkpoint models

#

As a checkpoint model can only be loaded in full

tulip sluice
#

we are converting the models now anyway.. so after the first-time conversion, we can do it the diffusers way

#

right now i think they're only saved in memory.. we can do that to disk if it isnt doing that already

violet sleet
#

There are different ways - for example we can force-convert new models
So all models will be on disk in diffusers format (why doesn't diffusers use tar or zip to make models easy to move/exchange? -_-)

rough halo
#

is session_complete supposed to be sent after all the invocations are done? I was just looking again at test.html and I noticed it's subscribing to it, but afaics it's never triggered

#

which potentially makes it non-trivial to tell when all the nodes are done

sinful forge
#

Uh... That or something like it. But it's probably less tested than everything else.

rough halo
#

oh, it may actually not be a thing at all, afaict the only place session_complete shows up in the repo is in test.html 😄

hollow marlin
#

Testing ControlNet with a Canny edge detection image processing node:

upbeat prism
#

too freaking cool @hollow marlin !

hollow marlin
#

Thanks, it's getting closer!

rough halo
#

(that's the output of the Get Session API endpoint)

upbeat prism
rough halo
#

Afaict only the load_image executed. Maybe I grabbed a commit where things are a bit broken?

upbeat prism
#

very possible

violet sleet
#

and in this case it's int in real

upbeat prism
#

depending on where the error occurs, you may need to inspect the invocation_error WS event the client receives

#

sorry it's so in flux at the moment. probably a few weeks til things settle down. you all are early adopters 🙂 thanks for hacking on it

rough halo
#

No apologies needed, this is an awesome project 😁

#

I am subscribing to invocation_error and nothing is coming through. I’ll play around some more

upbeat prism
#

I’m afk for the evening - if you’ve not got it figured out by the time I’m back I’ll try your same graph, didn’t think to do that at first

violet sleet
#

one-line fix, so I don't think that this needs separate PR)
//or you can set indexes in key instead of option value itself, so you can use Number(e.target.value) as index

violet sleet
#

@rough halo I think you saw the same images because, without the ESRGAN model downloaded, it silently leaves the image unchanged)
but you can see console for message about it:
>> ESRGAN is disabled. Image not upscaled.

rough halo
# violet sleet <@347003695495512096> I think you saw same images because without downloaded esg...

From what I can see, it doesn't go beyond the load_image - I get an invocation_started for load_image:
gnubert-dreambot-backend-invokeai-1[11676]: Received packet MESSAGE data 2["invocation_started",{"graph_execution_state_id":"81c2c574-ac7d-4b97-925f-ccc65e0c504c","node":{"id":"888413fd-0d60-4a0a-88ed-a37fcf2bfb38","type":"load_image","image_type":"uploads","image_name":"b818af89-8fa4-4cfd-b37a-fb0c7276fe75_1682526229.png"},"source_node_id":"0","timestamp":1682526229}]
and then an invocation_complete for load_image: gnubert-dreambot-backend-invokeai-1[11676]: Received packet MESSAGE data 2["invocation_complete",{"graph_execution_state_id":"81c2c574-ac7d-4b97-925f-ccc65e0c504c","node":{"id":"888413fd-0d60-4a0a-88ed-a37fcf2bfb38","type":"load_image","image_type":"uploads","image_name":"b818af89-8fa4-4cfd-b37a-fb0c7276fe75_1682526229.png"},"source_node_id":"0","result":{"type":"image","image":{"image_type":"uploads","image_name":"b818af89-8fa4-4cfd-b37a-fb0c7276fe75_1682526229.png"},"width":1280,"height":1006},"timestamp":1682526229}] and then nothing else

violet sleet
#

just look for messages in the chrome dev console)

rough halo
#

invokeai itself logs nothing else after it is told to start the graph: gnubert-invokeai-1[11676]: INFO: 172.20.0.8:48584 - "PUT /api/v1/sessions/81c2c574-ac7d-4b97-925f-ccc65e0c504c/invoke HTTP/1.1" 202 Accepted

violet sleet
#

I always read the stacktraces there))

rough halo
#

I'm calling into invokeai's API from a python bot, not through the web UI 🙂

#

I don't have much more time to work on this today though, but I'll be poking at it some more tomorrow

violet sleet
#

ok, but now the only thing I can think of is that maybe you subscribed not to the graph you created, but to something old

rough halo
#

so the response I get from InvokeAI after I post the graph starts like this: InvokeAI response: {'id': 'dc65e223-7526-4b58-87d7-505a2358b18f', 'graph': {'id': '8c6a6f43-1189-43f6-95ff-0cd62421951d', 'nodes':... and I'm subscribing to the first id there, so the one ending b18f - that's the one I call "PUT /api/v1/sessions/dc65e223-7526-4b58-87d7-505a2358b18f/invoke HTTP/1.1" 202 Accepted and emit subscribe with {"session":"dc65e223-7526-4b58-87d7-505a2358b18f"}

violet sleet
#

and then
PUT /api/v1/sessions/dc65e223-7526-4b58-87d7-505a2358b18f/invoke?all=true
?

rough halo
#

huh, apparently I stopped doing the ?all=true facepalm

#

I'll try and get some time to add that back in later and rebuild. Thanks!
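
A small sketch of the two calls discussed above, built with the standard library and not sent anywhere. The path, the `all=true` query parameter and the `{"session": ...}` subscribe payload come from the messages above; `build_invoke_request`, `subscribe_payload` and the base URL are assumptions for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_invoke_request(base_url: str, session_id: str, run_all: bool = True) -> Request:
    """Prepare the PUT that starts a session; run_all adds ?all=true."""
    url = f"{base_url}/api/v1/sessions/{session_id}/invoke"
    if run_all:
        # Without this, only the next ready node appears to execute.
        url += "?" + urlencode({"all": "true"})
    return Request(url, method="PUT")


def subscribe_payload(session_id: str) -> dict:
    """Payload emitted with the socket 'subscribe' event for this session."""
    return {"session": session_id}
```

Subscribing before sending the invoke request avoids missing early events such as `invocation_started`.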

upbeat prism
#

https://github.com/invoke-ai/InvokeAI/pull/3261 resize and scale latents nodes. these allow us to use hires fix.

but there is an open question:

  • latents are a special case of tensor with a particular dimensionality
  • the noise node generates latents of a particular size and we allow the size to be provided as pixels, and the pixel value is floor-divided by 8 to determine the quantity of the width and height dimensions of the tensor

should the resize node also accept w/h in pixels and floor-divide by 8? or, should ALL latents related invocations deal in the actual quantity of the w/h dimensions, and rely on the user to multiply by 8?

i think we should go with pixels for latents, and if we do generalized tensor nodes in the future, let those use the actual quantities. thoughts?
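
To make the arithmetic concrete, a tiny sketch of the pixel-to-latent conversion. The function names are hypothetical; the factor of 8 and the floor division are as described above.

```python
from typing import Tuple

LATENT_SCALE_FACTOR = 8  # image pixels per latent cell along each axis


def pixels_to_latent_size(width_px: int, height_px: int) -> Tuple[int, int]:
    """Floor-divide pixel dimensions to get the latent tensor's w/h."""
    return width_px // LATENT_SCALE_FACTOR, height_px // LATENT_SCALE_FACTOR


def latent_to_pixel_size(width: int, height: int) -> Tuple[int, int]:
    """Inverse mapping: the decoded image is 8x larger per axis."""
    return width * LATENT_SCALE_FACTOR, height * LATENT_SCALE_FACTOR
```

Floor division is lossy for sizes that are not multiples of 8: a 1280x1006 request maps to a 160x125 latent, which decodes to 1280x1000.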

GitHub

this resize/scale latents is what is needed for hires fix
also remove unused seed from t2l

violet sleet
#

i'm thinking - if we create a graph for basic txt2img/img2img and call it, would we need to divide width and height by 8 before passing the values from user input (width, height) to the noise inputs?

#

I think we could forget to divide/multiply somewhere here))

upbeat prism
#

agree

#

so if we use pixels for latents everywhere, we can all continue to think in pixels, which seems less confusing

sinful forge
#

but it's potentially less accurate, which may make future use cases more complicated to implement

#

low-level nodes don't necessarily need to be as easy to use

upbeat prism
#

Unless something really big changes, latents is always going to refer to a tensor with specific dimensions that results in an image with dimensions 8 times larger than the tensor dims

#

so i think this is the right choice, then we can offer the generalized tensor nodes in the future

runic karma
#

oh i just read the code. um. has that been tested? i’m unsure that the vae latents can simply be interpolated to resize them like that

upbeat prism
upbeat prism
#

the use case for this is 'hires fix', where we need to resize latents

upbeat prism
#

or, more straightforward "upscaling" in latent space

rough halo
#

would y'all be interested in having a couple of invocations to 1) grab an image from a URL, 2) resize an image?

#

I'm doing those in my code at the moment so I can do img2img, but I'd be happy to have a go at porting that over to invoke?

runic karma
#

@upbeat prism ahh, great. sorry for the noise, i saw something that looked weird and thought i should call it out

upbeat prism
#

np, i certainly do not know what i'm doing in a meaningful way here

hollow marlin
#

Is there currently a way to save/load graphs built in the Node UI?

upbeat prism
upbeat prism
#

I think we are naming the nodes in a confusing way. For example, latents to image should be VAE Decode. Thoughts?

violet sleet
#

also, I think we can add a flag for tiled decoding

broken blaze
#

I think, while more accurate, Latents To Image is a simpler thing to teach someone than "VAE Decoding"

upbeat prism
#

node life is the real life yo

broken blaze
#

node streets

broken junco
#

@sinful forge I'm doing a rewrite of the model manager now in order to satisfy the use case that @upbeat prism raised of being able to load individual parts of a model (such as the unet) and mix and match them dynamically. In doing so I've generalized the RAM caching mechanism so that transformers models, such as CLIP, are cached as well as whole diffusers and parts of diffusers. I want to make this system work across different machines in a distributed environment. What do I need to do to make the model manager a first class service that will interoperate correctly?

I'm also going to start emitting events from the model manager that the UI can display. I was thinking to generate model_requested, model_retrieved_from_cache, model_loaded, model_uncached , and model_load_error events. Is this the right way to do it?

sinful forge
# broken junco <@178692966318080000> I'm doing a rewrite of the model manager now in order to s...

Awesome!

So to make a proper service, you need to define an ABC class, then an implementation of the class for local usage. The ABC lets someone create their own version for their own use case.

Regarding events, who is the consumer and context of the events? Is it the session? The node? Everyone using the service? (Probably not the last one, since you don't want to know that another user is using a particular model). My gut approach would be to have model_loading and model_loaded events, scoped to a node. The caching stuff is a detail of the particular implementation. I'd log that information (or send it to a tracking system) since it'd be very useful data, but I don't know if I'd send the user events about it.
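
A minimal sketch of that shape: an abstract base class plus a local implementation that emits node-scoped `model_loading` / `model_loaded` events and keeps caching as an internal detail. All class, method and parameter names here are hypothetical, not the actual InvokeAI service interface.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict


class ModelManagerBase(ABC):
    """Interface nodes code against; deployments can swap the implementation."""

    @abstractmethod
    def get_model(self, model_id: str, node_id: str):
        """Return a ready-to-use model for the given node."""


class LocalModelManager(ModelManagerBase):
    """Single-machine implementation with a simple in-memory cache."""

    def __init__(self, loader: Callable[[str], object], emit: Callable[[str, dict], None]):
        self._loader = loader  # assumed callable that really loads the model
        self._emit = emit      # assumed event sink: (event name, payload)
        self._cache: Dict[str, object] = {}

    def get_model(self, model_id: str, node_id: str):
        payload = {"node_id": node_id, "model_id": model_id}
        self._emit("model_loading", payload)
        if model_id not in self._cache:  # cache hits stay an internal detail
            self._cache[model_id] = self._loader(model_id)
        self._emit("model_loaded", payload)
        return self._cache[model_id]
```

A distributed deployment would subclass `ModelManagerBase` with its own loading and caching strategy while nodes keep calling `get_model` unchanged.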

broken junco
violet sleet
#

nodes already use different models and when you implement partial loading they will use only parts of models

broken junco
#

With respect to utilization of the GPU, the current system is very conserving of GPU VRAM and moves models out of GPU as soon as they are no longer in active use. So essentially you get one model at a time in the GPU. An alternative I discussed with @upbeat prism is to wrap a context manager around this process such that you could lock models into the GPU with a context, and therefore have multiple ones in GPU at the same time. Not sure if this is desirable or not and maybe a future feature?
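
A toy sketch of what such a context manager could look like, with reference counting so that nested or concurrent contexts keep a model resident until the last one exits. `ToyGpuCache` and `locked` are hypothetical names, and the "move to GPU" / "offload" steps are only simulated with a dictionary.

```python
from contextlib import contextmanager


class ToyGpuCache:
    """Tracks which models are pinned 'in GPU' by active contexts."""

    def __init__(self):
        self._pins = {}  # model name -> number of active contexts

    def resident(self):
        return sorted(self._pins)

    @contextmanager
    def locked(self, name):
        # The first pin is where a real implementation would move the model to the GPU.
        self._pins[name] = self._pins.get(name, 0) + 1
        try:
            yield name
        finally:
            self._pins[name] -= 1
            if self._pins[name] == 0:
                del self._pins[name]  # last user gone: offload from the GPU
```

With `with cache.locked("unet"), cache.locked("vae"):` both models stay resident for the duration of the block and are released afterwards, even if the body raises.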

broken junco
violet sleet
#

//i think i still need google translate sometimes...)

upbeat prism
#

model_loading , model_loaded and model_load_error events would be nice for user feedback, and I'd expect to receive those events during the execution of a node's invoke() method. Thanks @broken junco !

upbeat prism
violet sleet
#

Has no one experienced crashes with vae decode in wsl?)
not always, just in some generations

#

at the least, i think i'll try updating the cuda version as it's "a bit" old 😄

> nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Fri_Feb__8_19:08:26_Pacific_Standard_Time_2019
Cuda compilation tools, release 10.1, V10.1.105
rough halo
# upbeat prism Sure! we’ll want to expose the various resize interpolation modes and stuff like...

I haven't done the interpolation modes yet because it wasn't immediately obvious how to have an enum parameter, but I've just pushed up a first pass PR to see if I'm doing this even close to correct: https://github.com/invoke-ai/InvokeAI/pull/3296

GitHub

This is currently completely untested - I'm more looking to get some reviews to make sure I'm doing this right, particularly the TypeScript portions, which I am not at all familiar with.
Tw...

violet sleet
rough halo
#

Nice, thanks

violet sleet
#

is there no image to latent node, or am I missing something? Oo

upbeat prism
rough halo
#

Ok back to dumb questions - I see y’all posting screenshots of using the nodes in the UI - how do I get that to work? I’d like to test my PR for downloading/resizing images, but it sure would be easier to do that via the web UI on my gaming PC rather than keep rebuilding the docker container and shoving it through my server’s automation jensen_parrot

#

(The only parts of invoke_new.py I’ve seen in the web UI are test.html and docs)

violet sleet
hollow marlin
#

Initial port of ControlNet support from generate-based nodes to latents-based nodes:

broken blaze
#

With Control model on Text to Latents, I'm supposing this doesn't support multi-controlnet yet?

#

This is a bit different than the miro board we talked through, so just trying to understand the progression to this form of inputs

hollow marlin
broken blaze
#

nice! that will extract 'control model' out from text to latents at that point?

hollow marlin
chrome bobcat
#

Does this/will this support posing models?

hasty bane
#

That's a good question. So far all you've posted are tests with canny.

violet sleet
#

canny is just a preprocessor, not a controlnet

#

so, as I understand it, you can pass a pose image directly to controlnet and it will work

hasty bane
#

right but so far only canny input has been shown.

broken blaze
#

If I’m understanding what’s been done by Gregg in our discussions, you will have the pose preprocessor and model usage but not openpose (pose editing)

#

We’ll need to build a UI that allows for pose editing and control

hollow marlin
# broken blaze If I’m understanding what’s been done by Gregg in our discussions, you will have...

Yeah I've only worked with the OpenPose still image pre-processor, not tried anything fancy for pose editing. I'm planning to have most of the standard ControlNet pre-processors ready to go as nodes by tomorrow. Just making sure I've got MultiControlNet support working with TextToLatent first -- getting that working may affect how pre-processor outputs are handled, and I'd rather deal with that code churn on one pre-processor (Canny) than all of them.

broken blaze
#

https://justsketch.me/ i stumbled on this and think its definitely overkill, but something like it would be pretty awesome on top of the canvas

hollow marlin
broken blaze
#

We won't be able to use the OpenPose editor itself - not a permissive license

chrome bobcat
#

So much for the Open part of OpenPose!

hasty bane
#

there are a ton of sites for making open pose images. Would be nice to have in invoke, but not sure it needs to be in the initial release. For example https://app.posemy.art/ (I personally prefer sites that have a model over the skeleton as I have trouble visualizing)

#

One thing for backend: open pose files should be in their own folder, not in i2i input folder.

broken blaze
#

The challenge is ensuring that the skeleton/pose is modeled on what the controlnet model was trained on.

#

It can't just be "any" pose ui

#

Alternatively, could create a new model and train it on something else.

chrome bobcat
#

There are some other models for ControlNet, too.

tulip sluice
upbeat prism
#

@sinful forge i see we are using ImageType.INTERMEDIATE for eg CropInvocation. these will often be results, though. I'm not sure intermediates really makes sense to have

upbeat prism
#

Oh, cool - pydantic validators get previously validated fields in their context. this means we can have nodes that have fields that depend on others, for example, resize latents could have a mode of "explicit" | "scale". if explicit, we can require a width and height. if scale, we can require factor. not saying this is a good idea for this particular node, just an example
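
A minimal sketch of the dependent-field idea above, written in plain Python rather than pydantic so it runs without extra packages. `ResizeLatentsParams`, its field names and the error messages are hypothetical; in pydantic the same checks would sit in a validator that reads the already-validated `mode` field.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResizeLatentsParams:
    """Hypothetical parameters for a resize-latents node (illustrative only)."""

    mode: str  # "explicit" or "scale"
    width: Optional[int] = None
    height: Optional[int] = None
    factor: Optional[float] = None

    def __post_init__(self) -> None:
        # Which fields are required depends on the value of `mode`.
        if self.mode == "explicit":
            if self.width is None or self.height is None:
                raise ValueError("mode 'explicit' requires width and height")
        elif self.mode == "scale":
            if self.factor is None:
                raise ValueError("mode 'scale' requires factor")
        else:
            raise ValueError(f"unknown mode: {self.mode!r}")
```

Constructing `ResizeLatentsParams(mode="explicit", width=64)` raises, because `height` is missing for that mode.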

#

reaaaaally would like to see better validation errors than 422 tho. sending an issue for you kyle

sinful forge
upbeat prism
sinful forge
#

Apparently default 422 should look like this

{
    "detail": [
        {
            "loc": [
                "path",
                "item_id"
            ],
            "msg": "value is not a valid integer",
            "type": "type_error.integer"
        }
    ]
}
upbeat prism
#

im blind, that is in the body of the 422 response

#

its kinda confusing, there are 3 behaviors when validation fails:

  • 422 responses (a node field fails validation)
  • 202 on invoking a session, but an error is caught during edge validation (eg InvalidEdgeError)
  • 500 error (something within the invoke() method fails)
#

i think those are all of them... when the 202 occurs, nothing makes its way up to the API layer

upbeat prism
broken blaze
sinful forge
upbeat prism
sinful forge
#

maybe I forgot to add the root-level validator

upbeat prism
#

i assume this is because the imagefield is Optional

#

err, because it is a union with none

#

we can enforce that the field requires input (either direct or via connection) in the UI, but we would need to add schema customisation for that - there doesn't seem to be any distinction between eg Union[str, None] and Optional[str] in the generated schema

sinful forge
#

Yah those are the same thing

#

we'd need our own type hint or something on the Field to indicate that scenario (value MUST be provided either directly or via connection)

upbeat prism
#

Ok, np. I've already got the UI set up to support ConnectionKind (input, direct) and ConnectionRequirement (always, optional, never). I think a combination of these two covers all possible cases

#

I'll be revamping the UIConfig class at some point before release to have a much more comprehensive set of customisations that the UI understands

violet sleet
#

Small suggestion since you're talking about connections - maybe we can make lists accept multiple connections?
so it will be easier with controlnet for example

upbeat prism
violet sleet
#

it feels strange to create a collect node for a single controlnet input)

upbeat prism
#

im working on the nodes editor in my spare time, the overall migration is the priority right now, so on the UI side i'm not able to address these things just yet

#

do you know how comfyUI handles this situation?

violet sleet
#

they just chain it

#

controlnet -> controlnet -> sampler

upbeat prism
#

@sinful forge what if the collect node kinda functionality was baked in to all fields? if the field type is an array, it can auto-collect. if not, it only accepts a single input?

upbeat prism
violet sleet
#

not 100% sure, but as far as I know each node just adds its info to a list

upbeat prism
#

gotcha

violet sleet
#

I'm also still not sure what's better
as chaining looks... easier to understand? as you have 1 input
and with multiple inputs you need to check where each one comes from and whether you forgot something or not

broken blaze
#

processing serially is a pattern that makes no sense.

#

Me and Gregg talked through what is usable to an end-user - a collection node or the ability for multiple outputs to collect on a valid "collection enabled" input seem to be the only things that are sound UX

hollow marlin
#

My ideal would be having node ports with effectively their own collect functionality, so if three controlnets had their ControlNetInfo connected to the same control_info port on TextToLatents it would collect them into a List when the Node is executed.

#

Oops, I think I repeated what @broken blaze just said 🙂

broken blaze
#

🙂

sinful forge
#

Strong preference for this to be an explicit node

violet sleet
#

from a logic perspective I also think this is good
but I'm thinking about the situation when you have a big graph with multiple controlnets
so, you need to check where each input comes from and count whether you connected them all

sinful forge
#

You can always do some UI magic on top of it if you want to hide that, but handling it implicitly as part of edges would be a headache

hollow marlin
#

I'm okay with a Collect node, but I see stuff built with ReactFlow where input ports can take connections from multiple output ports and it cuts down on visual clutter when this is a common pattern.

sinful forge
#

Again, you can do that as UI on top of the graph

hollow marlin
#

Yeah I have no idea how hard it is to implement in the graph execution...

violet sleet
#

then you need to do so in every client (web, cli, ???)

#

else it's some kind of logic inconsistency between clients

sinful forge
#

It would make edge validation even more of a pain than it already is (and execution, due to value preparation).

violet sleet
#

also - how is the collection node implemented? as for now we can only input a constant count of fields

#

and this makes me think about a "too many inputs to connect" error from a hidden collection node 😄

sinful forge
# violet sleet also - how implemented collection node? as for now we can input only constant co...

It handles iterations as well.

Roughly, validation is mostly type-based. The first connection (input or output) can be anything, then the next connection (to the other side) must match the item type.

At execution time... well, without describing the entire system, the collect node has to be "prepared". It can't be prepared until all parent nodes are complete (and have produced results). Then all the results are collected into a list, which is produced as the output of the collect node.

It's one of a few special case nodes in the graph (the others being iterate and graph nodes).
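
A toy sketch of the preparation rule described above: a node runs only once all its parents have produced results, and a collect node turns its parents' results into a list. The graph layout (`kind`, `parents`, `fn` keys) and `run_graph` are hypothetical and much simpler than the real graph execution code.

```python
from typing import Any, Dict


def run_graph(nodes: Dict[str, dict]) -> Dict[str, Any]:
    """Execute a toy graph; each node waits until all of its parents are done."""
    results: Dict[str, Any] = {}
    pending = set(nodes)
    while pending:
        progressed = False
        for node_id in sorted(pending):
            node = nodes[node_id]
            # A node cannot be prepared until every parent has a result.
            if any(parent not in results for parent in node["parents"]):
                continue
            inputs = [results[parent] for parent in node["parents"]]
            if node["kind"] == "collect":
                # Collect: gather all parent results into one list output.
                results[node_id] = list(inputs)
            else:
                results[node_id] = node["fn"](*inputs)
            pending.discard(node_id)
            progressed = True
        if not progressed:
            raise ValueError("graph has a cycle or a missing parent")
    return results


EXAMPLE_GRAPH = {
    "a": {"kind": "value", "parents": [], "fn": lambda: 1},
    "b": {"kind": "value", "parents": [], "fn": lambda: 2},
    "items": {"kind": "collect", "parents": ["a", "b"]},
    "total": {"kind": "value", "parents": ["items"], "fn": sum},
}
```

Here `items` is only evaluated after both `a` and `b` have finished, and `total` receives the collected list as a single input.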

violet sleet
#

I don't know the code, but from my perspective it looks like a painful place for type checking (it clearly disables it here)
and with auto-aggregated fields you know that the value or its items must be of this type

sinful forge
#

it's the X -> list[X] handling that would have to be done everywhere that would be painful

#

(where list[X] -> list[X] would also be valid)
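
To make the X -> list[X] point concrete, here is a hedged sketch of what an edge type check might look like with and without implicit collection. `edge_is_valid` and the `auto_collect` flag are hypothetical names, not the project's actual validation code.

```python
from typing import Any, List, get_args, get_origin


def edge_is_valid(output_type: Any, input_type: Any, auto_collect: bool = False) -> bool:
    """Check whether an output of one type may connect to an input of another."""
    # Exact match is always fine; this also covers list[X] -> list[X].
    if output_type == input_type:
        return True
    # With implicit collection, X -> list[X] must be accepted as well.
    if auto_collect and get_origin(input_type) is list:
        args = get_args(input_type)
        return len(args) == 1 and output_type == args[0]
    return False
```

With `auto_collect=False` only exact matches pass, which is the behaviour an explicit collect node keeps for every other node type.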

violet sleet
#

yep, so what's the difference? you just check the type twice

sinful forge
#

plus preparation would get more complex, since that would also have to be done at preparation time for every node type (instead of just for collect nodes)

violet sleet
#

let me describe the view from outside:
I see it like this:
graph is created
it's modified somehow
start is called, and validation happens here
after that - any other work is done knowing that the graph is already validated

#

i think i understand a bit - i need to read code about moving result from output to input

sinful forge
hollow marlin
#

Is the intent still to have CollectInvocation working for v3.0?

sinful forge
#

It already works (though I guess not in the UI?)

hollow marlin
#

Yep I meant in the UI too. Hmm, guess I should try it again, think I last tested in the UI a week ago. It's still denyListed in the UI.

upbeat prism
#

node editor ui is wip, ive only cranked it out in spare time. collect needs special handling in the UI, not done yet

hollow marlin
#

I've pushed a new branch to the InvokeAI repo with my latest work on ControlNet support: https://github.com/invoke-ai/InvokeAI/commits/feat/controlnet-nodes
PR to follow shortly. This has support for ControlNet integrated into TextToLatent nodes, and also includes 11 ControlNet preprocessor nodes for ControlNet v1.1 preprocessors. There are 3 other "experimental" ControlNet v1.1 processors I'm missing, but plan to include soon.

#

Thanks @violet sleet for template of how to support new connection types in the typescript!

broken blaze
#

BOOM

violet sleet
#

The only thing - I'm not sure about combining the preprocessor with the controlnet model

hollow marlin
#

Aiming to get PR out this weekend.

violet sleet
#

Can there be a situation when we already have a preprocessed image and just want to load the controlnet model? Without preprocessors

hollow marlin
hollow marlin
violet sleet
#

So, I saw it as a controlnet node with an input image
And we can connect a preprocessed image or the output from any preprocessor node to it

#

Or even use preprocessor itself without controlnet)

broken blaze
#

This is great for now

violet sleet
#

I'm just asking - maybe I don't know the problems)

#

And if this for future - ok

broken blaze
#

I think its possible a user just wants the preprocessed image, but the question would be "for what"

#

And the answer to that is "no reason" right now - The only ingestion element is for processing w controlnet

#

Eventually, we could create a node for the "output preprocessed only", if its needed

#

We ought to sprint to getting Controlnet + Multicontrolnet working with the path designed here by Gregg, sounds like the only blocker for multi-controlnet is on the output side of the multi-controlnet aggregation node?

#

(are all of the other processing nodes in the PR?)

#

yep, looks like it

violet sleet
#

just to be clear what I mean:

hollow marlin
# violet sleet So, i saw it like controlnet node with input image And we can connect here prepr...

Originally I implemented it with preprocessors that just do the preprocessing and pass the image to a version of TextToLatent* that had control_image, control_model, control_weight, etc fields. But that TextToLatent* as a UI node was getting very unwieldy once I started adding multiple controlnet support. I'm still banking on fixes to CollectInvocation (or me figuring out the hacky version) to send multiple ControlFields to single port as a collection / array / list / tuple / whatever.

sinful forge
#

Hrmm... the only thing I'm starting to worry about is the number of parameters on TextToLatent

violet sleet
#

not so big... I remember at the beginning it was bigger)

sinful forge
#

e.g. seamless also feels like something that's optional, and as something optional should probably be represented separately (maybe a pluggable list of additional things to do? or something?)

#

Progress images also seems like it should be more of a session setting than a node option

violet sleet
#

comfy for example has a sampler and an extended sampler

#

so, we can create generator with less parameters

hollow marlin
# violet sleet just to be clear what I mean:

Yep I think that's a totally valid alternative. One thing I liked about the preprocessor/control node combo is that the preprocessor used, at least for 99+% of use cases, can pre-determine which ControlNet model to use (or restrict to a few alternatives). Haven't implemented that yet but was part of the motivation for the combo.

violet sleet
#

and extended with all

upbeat prism
#

Of course needs backend support

hollow marlin
#

Some of this comes down to aesthetics and ease-of-use, more nodes vs bigger nodes. If subgraph collapsing gets implemented in node UI this may not be so much an issue.

violet sleet
sinful forge
#

each can have parameters, but they take a latent and output a latent (or whatever happens at each step)

violet sleet
#

ideally I see for this case - a textbox with hints for standard models, but values not forced to the list of standard models

upbeat prism
# sinful forge Yah I was thinking something node-like, but it's just linear and you put them in...

We already have a specific use case for intercepting the inference process at a specific point - the symmetry feature - it mirrors the latents at a certain point. Also I’d love to be able to do prompt interpolation, switch to a different model halfway thru, change cfg/strength at certain points, and so on. Keyframes afford all of that. Need the pipeline to do more with the step callback, probably needs diffusers support?
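
A rough sketch of the keyframe idea, assuming a per-step hook that may replace the latents. Everything here is hypothetical: `run_steps` stands in for the denoising loop (the halving is a placeholder for a real denoise step), latents are a flat list of floats, and `mirror` imitates the symmetry feature on that list.

```python
from typing import Callable, Dict, List

Latents = List[float]
Keyframe = Callable[[Latents], Latents]


def mirror(latents: Latents) -> Latents:
    """Mirror the first half of the latents onto the second half."""
    n = len(latents)
    half = latents[: (n + 1) // 2]
    return half + half[: n // 2][::-1]


def run_steps(latents: Latents, steps: int, keyframes: Dict[int, Keyframe]) -> Latents:
    """Run a fake denoising loop, applying any keyframe registered for a step."""
    for step in range(steps):
        # Placeholder for one real denoising step.
        latents = [value * 0.5 for value in latents]
        # A keyframe intercepts the process at a specific step.
        if step in keyframes:
            latents = keyframes[step](latents)
    return latents
```

Prompt interpolation or a model switch would be other keyframe functions keyed by step; whether the real pipeline can expose such a hook is the open question raised above.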

violet sleet
#

can't imagine how it could look in nodes %_%

#

but sounds cool, reminds me of word switching in a1111 prompts

#

@upbeat prism do you think it's possible to do textbox with list-helper values in node?

upbeat prism
upbeat prism
violet sleet
#

in controlnet model input
hint user default models, but allow input anything else too

upbeat prism
violet sleet
#

yep

upbeat prism
#

Sure. Need to provide ui hints in the invocation. I will update the uiconfig class soon to accommodate more stuff like this

violet sleet
#

@hollow marlin then your thing looks ok)

upbeat prism
#

Like type: list-with-free-text

hollow marlin
# violet sleet good think, but in theory - what if user wants to try some new controlnet custom...

Right now loading the ControlNet model relies on diffusers model.from_pretrained(). And the plan is to move it over to the new InvokeAI model management stuff from @broken junco .
And ControlNet model must be specified in text entry box. But I'd rather have a dropdown with the "standard" ControlNet models. And maybe a model override field where people can enter whatever they want. Or does reactflow have dropdown+freetext combo?

violet sleet
#

not by default, but sounds like we can do it)

hollow marlin
#

Once again I typed before reading the previous two minutes of conversation and repeated what others were saying 😊

violet sleet
#

I think combined preprocessor-controlnet models are ok too
at least converting from these nodes to the separate nodes I asked about is easy
if some bad cases are found

sinful forge
#

we might have duplicate code among nodes by combining, right? e.g. we'll want a canny node for other reasons anyway.

hollow marlin
#

@upbeat prism Would you have time this weekend or early next weekend for a video chat to help with getting CollectInvocation working or alternatively help me figure out how to get array/list/collection output from a HackyCollectInvocation (hacky part is it has a separate input port for each entry that is output in the collection)

violet sleet
upbeat prism
#

Just let me know what you need and we can figure out how to make it 👍

hollow marlin
sinful forge
#

Getting the node composition (inputs, outputs, granularity) correct is pretty important. It'll be hard to change that sort of thing later.

violet sleet
#

I see it as:

  1. combined - a bit cleaner, can give user hints about models
  2. separated - can use preprocessors separately or directly load already preprocessed image
#

@sinful forge what's better from your point of view?

sinful forge
#

The problems in #1 can be solved by UI and/or somewhat by subgraphs

#

I generally err toward greater flexibility, given nodes are for advanced users. Then solve any usability issues with UI/CLI.

violet sleet
#

...then it's second option?

violet sleet
#

to be honest - I'm also more for option 2, but I see that these two options are easily switchable and the first option has some benefits for users with a bit lower skill

broken blaze
#

But I’m confident it is, and that we just need to align on a pattern for how we'll handle inevitable change in the future.

sinful forge
#

It's backward-compatibility. You don't want to break backward-compatibility in the future

broken blaze
#

We’re going to have to.

sinful forge
#

we should really be versioning node interfaces probably

broken blaze
#

Inevitable.

#

I agree with that

sinful forge
#

If you break back-compat you'll break everyone's libraries, yet again

violet sleet
#

i think @hollow marlin and I still not sure what option to select 😄

broken blaze
sinful forge
#

You may eventually have to support old version deprecation and graceful failure, but you should try your hardest to support API back-compat

#

so in general, parameter addition is fine over time, but parameter removal or changes will be breaking changes (i.e. new version)
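
The rule above can be sketched as a small compatibility check between two versions of a node interface. `is_breaking_change` and the name-to-type-name dictionaries are hypothetical; the sketch also assumes any added parameter has a default, since a new required parameter would break existing callers too.

```python
from typing import Dict


def is_breaking_change(old: Dict[str, str], new: Dict[str, str]) -> bool:
    """Compare two node interfaces given as {parameter name: type name}."""
    for name, type_name in old.items():
        if name not in new:
            return True  # parameter removed
        if new[name] != type_name:
            return True  # parameter type changed
    # Only additions remain, which are treated as backward compatible.
    return False
```

A `True` result would mean the node needs a new interface version rather than an in-place edit.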

broken blaze
#

Right - I hear you on that.

#

That’s different than “we can’t make changes”. We should figure out how we’re managing versioning

sinful forge
#

yah but it's also true that the nodes you make now you should plan on supporting for a long time

#

and as we've explored things, the general pattern that seems to be emerging is to favor fine-grained and minimal/single functionality nodes

broken blaze
#

I'm all for that, but am also in favor of actually shipping something to users who have been waiting for 3.0 for nearing 3 months.

#

🙂

sinful forge
#

e.g. for above I would vote for #2. But if adding controlnet in the UI, I'd probably make it look like #1 (even though it would create 2 nodes)

broken blaze
#

I am also for #2, but for the most part it's already been built that way

#

I can see where an experimental approach to messing with pre-processed images before passing into the t2l node would be cool, but theoretically you could have a distinct "preprocessing node" with an image output that isn't built for controlnet.

#

I'm just hesitant to add more work to something that is so close to actually solving problems that people actually have, over solving for the less than .1% who are going to mess with preprocessed images (which, I'll add, is of purely theoretical value as far as I can tell)

sinful forge
#

I mean the same argument could have been made around outputting latents instead of images

#

and it's not really hard to split something like this into two (or more) nodes

#

plus, canny is used in inpainting, so better not to have two implementations

broken blaze
#

What’s done is done

#

Alright - So your proposal is to split into a preprocessing node that feeds into the control node?

#

Are you willing to help with the work on either that, or the collection/aggregation that Gregg is currently working on?

sinful forge
#

collection already works

#

it just needs UI as far as I know

broken blaze
#

He's having issues on the output side

sinful forge
#

and only needs UI for the node editor

upbeat prism
#

im fixing collection UI this morning

broken blaze
#

If I understood correctly.

sinful forge
#

and I thought node UI was 3.1, so ¯_(ツ)_/¯

broken blaze
#

Right, but controlnet is potentially 3.0

upbeat prism
#

its kinda proving essential to building out the nodes features

broken blaze
#

(so node design for the actual graph will be needed regardless of whether UI is done or not)

hollow marlin
broken blaze
#

Thanks Gregg

#

Do we have a plan ironed out for versioning yet?

#

I recall seeing some sketches for a class that would be used, but not sure if that was actually rolled out anywhere.

violet sleet
#

wow, so many preprocessors
unfortunately something's wrong with the Zoe preprocessor for me (ah, it's available only in main)

violet sleet
broken blaze
#

I think Kyle was proposing versioning each node.

sinful forge
#

Node interfaces are part of the web API, which would be most sensitive to API changes.

hollow marlin
violet sleet
#

no, I mean ZoeDepth was added only in main
in the last release - 0.0.3 - it doesn't exist

tulip sluice
# violet sleet I see it as: 1) combined - a bit cleaner, can give user hints about models 2) se...

It needs to be separated imo for a ton of reasons.

  1. The preprocessed image is actually quite independent of what control model is being used. For example, a preprocessed canny image can be used with any single canny model out there. So if I wanna do two nodes with two different canny models, then if they are not separated, I am preprocessing the canny image twice. Instead we should just be processing it once and plugging that result into the control net loader.

  2. The preprocessor will remain consistent through iterations (unless the user wants it otherwise) .. so it is not necessary to calculate it each time. This is much easier to achieve if a node already has a result, which it won't have if it's part of the control net, coz that's where the variables are likely to change.

  3. There are some different types of control net models that use the same kind of input -- such as scribble / softedge. It feels redundant to do the same hed map or pidinet map for them across three different control net models if we are using multicontrolnet.

  4. As for @hollow marlin mentioning the block was MultiControlNet ...we'll need to create a new node for merging Controlnet's .. and this merge node will have two inputs per control net => control net image and control net model + inbuilt attribute for control net weight .. then we pump these into an array and feed it to the pipeline as it expects it to be? Thoughts gregg?
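
Points 1 to 3 above come down to running the preprocessor once and reusing its output. A minimal sketch, with strings standing in for images and hypothetical names throughout (`canny_preprocess`, `ControlInput`, `build_controls`, the model names):

```python
from dataclasses import dataclass
from typing import List

# Counts how often the (fake) preprocessor actually runs.
CALLS = {"canny": 0}


def canny_preprocess(image: str) -> str:
    """Stand-in for a real canny preprocessor; strings play the role of images."""
    CALLS["canny"] += 1
    return f"canny({image})"


@dataclass(frozen=True)
class ControlInput:
    image: str
    model: str
    weight: float = 1.0


def build_controls(image: str, models: List[str]) -> List[ControlInput]:
    """Preprocess once, then feed the same result to every control model."""
    processed = canny_preprocess(image)
    return [ControlInput(image=processed, model=model) for model in models]
```

With a combined preprocessor-plus-controlnet node, the same setup would run the preprocessor once per model instead of once in total.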

hollow marlin
#

Here's an example of the revised (option #2 above), separated image processor and controlnet nodes. Still some cleanup to do, but it works.

hollow marlin
violet sleet
#

Merge can be done in different approach, but I still think about chaining))
As then there no point with multiple inputs and you easily can monitor what connected

#

And it's look minimalistic for 1 controlnet)

tulip sluice
#

Pardon my shitty handwriting

violet sleet
#

Oh... I'm totally against this option 😄

tulip sluice
#

haha why? 😄

violet sleet
#

A lot of nodes
In a chain - no extra nodes are used
And a collect node, as far as I know, collects unlimited inputs

tulip sluice
#

how would this be done with less nodes?

#

coz you will inevitably have to load all the required components

#

i thought of a single collection / merge node

violet sleet
#

In a chain - every next node adds its controlnet info to the array
In a collection node - all controlnets connect to the collection node and the output is an array
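
The two wiring styles can be compared with a tiny sketch in which strings stand in for controlnet info. Both function names are hypothetical.

```python
from typing import List, Optional


def chain_controlnet(name: str, upstream: Optional[List[str]] = None) -> List[str]:
    """Chained style: each node appends its own entry to the list it received."""
    return [*(upstream or []), name]


def collect(*inputs: str) -> List[str]:
    """Collect style: one node gathers every connected input into a list."""
    return list(inputs)
```

`chain_controlnet("pose", chain_controlnet("canny"))` and `collect("canny", "pose")` yield the same list, so the choice is about graph layout and validation rather than about what reaches the sampler.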

tulip sluice
#

or will it be set to .. lets say 5?

#

oooh

violet sleet
#

I don't know anything about it)

tulip sluice
#

so we feed x number of control nets to a collection node

#

and then we get an array based on that

#

but the thing is .. the array needs to be in a specific format

#

coz the pipeline takes images / models / weights in different arrays
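
A sketch of the reshaping step mentioned here: turning a collected list of per-controlnet entries into the separate image, model and weight lists. `ControlEntry` and `to_parallel_lists` are hypothetical names, and strings stand in for real images and models.

```python
from typing import List, NamedTuple, Tuple


class ControlEntry(NamedTuple):
    image: str
    model: str
    weight: float


def to_parallel_lists(
    entries: List[ControlEntry],
) -> Tuple[List[str], List[str], List[float]]:
    """Split [(img, model, weight), ...] into three index-aligned lists."""
    if not entries:
        return [], [], []
    images, models, weights = zip(*entries)
    return list(images), list(models), list(weights)
```

The node consuming the collection would do this just before calling the pipeline, so the collect node itself can stay a plain list maker.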

violet sleet
#

What?

#

We will read it before pipeline in node of course

tulip sluice
#

do we have a collection node for this already?

violet sleet
tulip sluice
#

nothing nvm .. i found it. 🙂

violet sleet
#

It just collects inputs into a list

tulip sluice
#

yep. i was wondering if we had base template that merges variant inputs .. but saw just now that the current collection based nodes we have are just list makers

violet sleet
#

So, collection node might be like your option, but with not only 2

tulip sluice
#

i dont know if we can do dynamic inputs right now .. can we?

violet sleet
#

I'm thinking more about this - when we have a lot of inputs, we need to check every one to understand that everything is ok)

tulip sluice
violet sleet
#

Also we already talked about chaining in prompts

tulip sluice
#

yep

#

is your issue with chaining only that it makes for a lot of nodes?

#

or is there any other technical issue you foresee?

violet sleet
#
  1. i think it's easy to check that everything is ok in a chain
  2. creating a collect node even when you have only 1 controlnet... looks a bit strange)
tulip sluice
#

so as long as it is a control net output .. we dont have to use a collection for a single node

violet sleet
#

About this variant - which color will the ui use then?
Array or controlnet

tulip sluice
#

so a user accidentally doesnt feed a random array

#

also coz the control net's array is definite .. with [img, model, weight]

hoary pecan
hasty bane
#

If you need matrix stuntwork let me know. I have lots of experience playing with Numpy array manipulation (and should be applicable to CuPy too I think since they share most syntax).

tulip sluice
#

@hollow marlin the pypi release of controlnet_aux doesnt have zoe yet.. should i install it directly from patricks repo?

violet sleet
#

Still thinking about multiple inputs (figuring out which comes from where) and the hacks to make it work both with a collection and without, plus a special node
Whereas chaining does all of this with plain nodes, without additional features, and is totally type safe) (the collection node is not so typed)

Chained:

  • Easy implement, easy use
  • sometimes not so easy to arrange nodes

Collect:

  • ?
  • requires a special hardcoded node, maybe special handling for one input, the user needs to create this node(?), all connections come to one place and make a mess if there are more than 2-3, requires special support from the ui
broken blaze
sinful forge
#

A bit late (putting kids to bed), but the collect node can accept multiple inputs of the same type and put them in a list

hoary pecan
sinful forge
#

The nodes interface is then for the advanced "non-programmer" user. The pro who wants to do complex or experimental things with the building blocks.

hoary pecan
#

Extension developer is advanced definitely. If I make a Dynamic Prompts extension, a normal user should be able to use it. That’s totally solvable in UI like you said though. If my extension can include some graph(s) of normal usage.

sinful forge
#

Yah, normal nodes user still needs to understand how things get put together, but doesn't need to know all the ins and outs of Python, typing, caching between GPU and CPU, etc.

loud helm
#

Wanted to throw my opinion in here. I don't think requiring chaining is a good idea for building an item array, if we are worried about nodes becoming messy then just do what Blender did and add a "group" node and people can organize their nodes that way.

sinful forge
#

Versus the user of the other interfaces, which are a more bespoke experience

hoary pecan
#

But don’t let me pull y’all off the actual topic. Just was curious about that point.

violet sleet
sinful forge
#

Node granularity I believe was one of the original topics for this chat 😛

broken blaze
#

We had decided on collections - when did that change?

sinful forge
#

I already have the collect node implemented (including unit tests!) and it's type-safe for the node system (type is enforced upon edge connection). It's just not implemented in the UI yet.

#

Now, a merge node may make sense in the case where you have a single output from two or more inputs, and want to e.g. control weighting on inputs. Then I could see merge chains making sense.

loud helm
#

Are there already plans for a node similar to the "group" node in blender?

violet sleet
#

As i see it's subgraph?

sinful forge
#

Yah there's a graph node already

#

And a graph library

#

Probably needs more fleshing out for the UI though

#

(it works in the CLI today)

tulip sluice
#

or just layout ?

tulip sluice
#

we'll have layout options and layout groups in the UI

loud helm
#

I mean where you can create a new "group" type node, and then go into it and add nodes inside of it and create inputs/outputs to/from the group node

tulip sluice
#

so once it is setup, doing that would be simple

tulip sluice
loud helm
#

you're familiar with Blender?

tulip sluice
loud helm
#

okay cool, you know what I mean.

tulip sluice
#

yep i do

#

ideally thats how we wanna do complex nodes so it becomes easier for people to build them and share as extensions

#

but that'll require more work to be done

#

currently we're trying to get all the basic nodes in

#

so theres as much flexibility as possible

#

and the grouping will mostly be a front end thing

loud helm
#

whats the biggest blocker for moving forward on this project? as in; are we more limited on work hours from contributors or does the situation lean more toward a case of the work being non-trivial?

#

asking because I may contribute depending on the situation.

tulip sluice
#

this is what you want

#

and our UI framework supports it

#

we just need to implement it

tulip sluice
#

there's no real blocker. we just ported the entire backend so we're bringing the current UI to parity to work with the new backend

#

which means the other tabs -- like generation and canvas

#

that is nearly there .. @upbeat prism is pushing it hardcore.

Once we're done with it, we'll shift focus on to the nodes editor

broken blaze
#

"why isnt it done yet" = more people 🙂

tulip sluice
#

are you a front end dev? @loud helm

loud helm
tulip sluice
#

see something that needs doing? just do it

#

im sure we'll find a way to incorporate it into the bigger picture of the release

#

but if you're doing something, just check in with us

#

so we can double check that you arent doing something that'll need changes later.

#

backend is python .. front end is react .. redux for state management .. react flow for the nodes UI

#

backend serves an API via FastAPI

#

and we do sockets for stuff that needs it

loud helm
broken blaze
#

And, frankly, we've only just gotten to a point where it wouldn't be toe-steppin

loud helm
#

Ahh, I see.

broken blaze
#

Nodes is a move towards making it easier to jump in - Code has been a bit tightly coupled and required a bunch of file traversing to get things done in the past. Now... nodes

#

And for the large majority, we're close to having the front-end fully ported - @upbeat prism is wrapping up the canvas migration, and I think we've then got non-nodes ControlNet UI to do and maybe some model management/config stuff and... then we're pretty close to 3.0 🚀

loud helm
#

Are there plans to add more standard editing capabilities to the canvas?
Because I was considering making a small Photoshop plugin that dispatches to the InvokeAI backend.
Because then I could use layers, brushes, gradients, blending, magic wand, etc.
Not that I'm expecting Invoke to ever really fully support all of that, just wondering how far the current plans go for Invoke's canvas.

tulip sluice
#

we do want to a bridge

broken blaze
#

THAT SAID

#

I'd love to have most of the stuff you actually would use PS for in the canvas

#

I think 90% of PS is bloat

#

and not necessary for ai-oriented workflow

tulip sluice
#

but yes. if possible, we'd love to have some of the functionality in the canvas directly

#

and use PS only for the stuff that is too hard in a browser

#

theres two things here: 1. create a stand alone plugin. 2: create a bridge

#

which one are you looking to do?

#

the first i presume?

loud helm
#

I wouldnt say that so much of PS is useless for AI oriented workflow.
I've found that AI responds quite well when you are able to communicate concepts to it better. Particularly with inpainting, if you go beyond simple solid colors and do things like blends and smears and simple shading/blending work then it really gives better results when you want really specific stuff.

broken blaze
#

Don't disagree with that

#

But thats the 10%

#

Neural filters, the full array of effects, layer properties, etc.

#

Not necessary for the AI generation workflow

loud helm
#

whats the 90%? The licensing for patented color wheels?

broken blaze
#

Lol, those pantone colors ain't gonna sell themselves

loud helm
#

lmao

tulip sluice
#

the good thing is with the new invoke you get an api .. when you launch invoke new.. you can go to localhost:9090/docs to get access to all endpoints

#

which you can directly use in the plugin
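
For a plugin author, one way to discover those endpoints programmatically is to read the OpenAPI schema that FastAPI serves next to `/docs`. This is a sketch: it assumes the app keeps FastAPI's default `/openapi.json` location and the `localhost:9090` address mentioned above, and `list_endpoints` and `fetch_schema` are hypothetical helper names.

```python
import json
from typing import Any, Dict, List
from urllib.request import urlopen

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def list_endpoints(schema: Dict[str, Any]) -> List[str]:
    """Return 'METHOD /path' strings from an OpenAPI schema dictionary."""
    endpoints = []
    for path, operations in sorted(schema.get("paths", {}).items()):
        for method in sorted(operations):
            # Path items may hold non-method keys such as "parameters".
            if method in HTTP_METHODS:
                endpoints.append(f"{method.upper()} {path}")
    return endpoints


def fetch_schema(base_url: str = "http://localhost:9090") -> Dict[str, Any]:
    """Download the schema; assumes the default FastAPI /openapi.json route."""
    with urlopen(f"{base_url}/openapi.json") as response:
        return json.load(response)
```

With the server running, `list_endpoints(fetch_schema())` would print what the `/docs` page shows, in a form a plugin can iterate over.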

loud helm
#

as far as the plugin goes, from what I read about manifest v5 , it looks like you have full access to XHR and websockets so it should be able to just directly commune with the invoke API

broken blaze
#

the real power is in the type of functionality you would want to have as a digital ai sketchbook. get color in, blend it around, do the "photobashing" type workflows - but ALSO with controlnet, you can get in there and rough in your image and have it stay true to your work

tulip sluice
broken blaze
#

get some stylus pressure in there for folks with tablets and voila.

loud helm
#

yea I use a wacom tablet and its pretty frustrating to use with invoke tbh

broken blaze
#

yeah

#

think the gaps are fixable

loud helm
#

HTML5 input events should allow to fix the gaps yea

broken blaze
#

💯

#

pressure-sensitive brush, maybe a few more brush types,

#

I also have had this concept for a while of ai-oriented brushes

#

some kinda noise+color brushes

loud helm
#

maybe the ability to change keybinds so I can use middle mouse for panning like im used to...

#

lol

broken blaze
#

That's one that is killer

#

Should just make it that way period

#

😛

loud helm
#

that brush idea is a good one.

tulip sluice
broken blaze
#

Or, sooner. You know. People. 🙃

loud helm
#

maybe the ability to have like an inpainting brush so you can give it a prompt for a kind of texture or something and set a blending multiplier and then paint a texture onto the image?

broken blaze
#

can also see tech limitations on that

#

but there's multi-diffusion which has yet to be implemented

#

and there would be some form of brush element to that.

broken blaze
tulip sluice
#

diffusers has support for them afaik

broken blaze
#

yeah - just need to get it into the canvas

#

the long list

#

😛

tulip sluice
#

its good to have a long list though

#

my real dread is when theres nothing to do

#

then theres nothing to keep you motivated

broken blaze
#

luckily we have so much to do - i feel so alive 💀

tulip sluice
#

yep lol .. id rather have a 100 things that are pending than zero stuff to do

#

@hollow marlin seems like the branch is still on the version where the processor and the cnet are together .. ping me up when you push the separated changes. ill give it a shot then

#

also i couldnt find a control aux release that has the new changes in it .. unless im missing some obvious link somewhere

#

@loud helm if you end up building a PS plugin via invoke, do share with us. would be happy to follow progress.

loud helm
#

if I do then id want to contribute it to the project, theres no way id maintain it forever and I hate doing wasted work lol.

tulip sluice
# loud helm if I do then id want to contribute it to the project, theres no way id maintain ...

if it's robust enough and performs all basic modes, i think we can integrate it as a part of invoke .. adding any new features should be a relatively simple process afterwards even if you don't maintain it ..

it'll need to be setup to work both with oss and commercial though .. effectively if people want local, they launch invoke in the BG or if they want to use the power of the cloud, they use it through the commercial subscription ..That'd be the way to go.

But @broken blaze thoughts on this would go further.

broken blaze
tulip sluice
#

yep

#

coz running PS and invoke in the backend for a lot of systems can be taxing

#

so a lot of people might prefer a cloud option

broken blaze
#

right

tulip sluice
#

if you do make one, please use react ...i dont want to deal with vanilla js

broken blaze
#

Neapolitan js tho

#

🤤

loud helm
#

my docker compose comment is a separate thing from the PS plugin btw.
I'm talking about a docker compose script for containerizing the running of the whole invoke stack.
So you dont have to deal with any setup or have actual stuff installed on your system folders or program files.

broken blaze
#

We had a docker template at one point

tulip sluice
broken blaze
#

Yeah

loud helm
#

Also:
PS plugins only support restricted HTML for UI and restricted JS for logic.

tulip sluice
#

you'll still be using the core block elements from the PS script but you can make it real fancy if you want to

#

all new plugins should be done via UXP anyway

#

coz then you get easy way to deploy it to the adobe plugin marketplace

#

incase you didnt already know

loud helm
#

How much of the functionality would NOT be just IO to/from Invoke backend API in response to direct user input?

tulip sluice
#

basically any entry that needs to be fed to the api needs to be generated before hand

loud helm
#

for mask gen I would just use the currently active selection marque, like the magic wand selection area.

tulip sluice
#

you'll need a python file to generate the masks in the way they are needed

#

for example

#

and then feed that to the api

#

you could probably do that with JS directly but ive never really made masks in js .. i cant see why it wouldnt be possible though

loud helm
#

if by masks were talking about blocks of pixel data with an alpha field then yea you can use ByteArray in JS

tulip sluice
#

ep

#

yep

#

we avoid the alphas.. just fill with black and white instead
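
A minimal sketch of the "fill with black and white" idea, assuming the selection arrives as flat RGBA bytes. The function name `alpha_to_bw_mask` and the threshold are hypothetical, and whether white or black marks the area to regenerate is an assumption to check against the API docs.

```python
def alpha_to_bw_mask(rgba: bytes, threshold: int = 128) -> bytes:
    """Turn flat RGBA pixel data into a single-channel black/white mask.

    Pixels whose alpha is >= threshold become white (255), the rest black (0).
    The white-means-selected convention is an assumption for illustration.
    """
    if len(rgba) % 4 != 0:
        raise ValueError("RGBA data length must be a multiple of 4")
    # Every 4th byte, starting at index 3, is the alpha channel.
    return bytes(255 if alpha >= threshold else 0 for alpha in rgba[3::4])
```

The result has one byte per pixel, so a 512x512 selection yields 262144 bytes that can be wrapped in whatever image format the endpoint expects.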

#

theres a plugin by abdul alfaraj ..?

#

that does this but with auto1111's api

loud helm
#

yea

#

I saw something about that

tulip sluice
#

ive used it .. works pretty well

#

a little stiff on the ui but gets the job done

loud helm
#

maybe the plugin could like... request the HTML for its UI from the invoke backend directly? if that were possible then invoke wouldn't need to change the plugin in order to keep it in sync with the current development of invoke itself.

#

like invoke could send the HTML and then a description of what elements bind to which API endpoints

tulip sluice
#

i also feel UI and logic should always be independent

loud helm
#

well yea, you could basically just give an ID to the HTML elements and then send a JSON structure that maps the backend API endpoints to have a set of inputs & outputs which are the element IDs. then in the JS the plugin can read the json description and just hook onto those element ids, right?

#

I mean thats a simplified statement ofc. but essentially it should be possible.

#

We do something similar at the company I work for, where we send over JSON schemas to describe the UI for customer analytics stuff
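
A rough sketch of that element-id binding idea, written in Python only to show the data shape (the plugin itself would do this in JS). The JSON layout, the endpoint path and the element ids below are all made up for illustration and are not part of the Invoke API.

```python
import json

# Hypothetical description the backend could send alongside the HTML.
BINDINGS_JSON = """
{
  "endpoints": [
    {
      "path": "/example/generate",
      "inputs": ["prompt-box", "steps-slider"],
      "outputs": ["result-image"]
    }
  ]
}
"""


def index_bindings(raw_json):
    """Build a lookup of element id -> list of (endpoint path, role)."""
    description = json.loads(raw_json)
    table = {}
    for endpoint in description["endpoints"]:
        for role in ("inputs", "outputs"):
            for element_id in endpoint.get(role, []):
                table.setdefault(element_id, []).append((endpoint["path"], role))
    return table
```

With a table like this, the plugin only needs generic code that walks the element ids and attaches handlers, so new endpoints would not require a plugin update.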

broken blaze
#

the only thing that's going to be hard is that there will need to be some logic baked into the plugin that knows what operation it needs to do

#

unless the assumption is always 'inpainting'

loud helm
#

thats what enums are for!

tulip sluice
#

and etc

broken blaze
#

ah, fair.

loud helm
#

yea, and except the radio checkbox is actually a tab panel in the UI

#

so its not ugly

tulip sluice
#

but options are mostly shared

#

so maybe a radio group ain't too bad

loud helm
#

Hey, so that localhost:/docs page you said invoke has.
Does invoke host that page under normal deployment?
Because its giving me a 404 on mine, but I am running it in a docker container so it might not be deployed the same.

tulip sluice
#

you need to run invoke-new and then go to localhost:9090/docs

#

localhost:9090 is the endpoint host

#

and docs is just a place where fast api documents everything for you

sinful forge
#

I think there's also /redoc which should be cleaner (but not interactive)
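
A small sketch for poking at those pages from a script. It assumes the server is reachable at localhost:9090 as described above and that FastAPI's default routes (`/docs`, `/redoc`, `/openapi.json`) have not been changed; the helper names are hypothetical.

```python
import json
from urllib.request import urlopen

BASE_URL = "http://localhost:9090"  # host and port as mentioned in the chat


def api_doc_urls(base_url=BASE_URL):
    """Return the FastAPI default documentation URLs for a server."""
    base = base_url.rstrip("/")
    return {
        "swagger": f"{base}/docs",         # interactive docs
        "redoc": f"{base}/redoc",          # cleaner, not interactive
        "schema": f"{base}/openapi.json",  # raw OpenAPI schema
    }


def list_endpoints(base_url=BASE_URL, timeout=5.0):
    """Fetch the OpenAPI schema and return the sorted endpoint paths.

    Needs a running server; raises URLError if nothing is listening.
    """
    with urlopen(api_doc_urls(base_url)["schema"], timeout=timeout) as response:
        schema = json.load(response)
    return sorted(schema.get("paths", {}))
```

If `list_endpoints()` fails with a 404 or a connection error, that points at the deployment (for example the container not running invoke-new) rather than at the plugin code.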

hollow marlin
hollow marlin
upbeat prism
#

@hollow marlin Did you get the Collect node working (not in the UI - I know that doesn't work yet)

tulip sluice
tulip sluice
tulip sluice
#

@hollow marlin im getting a tensor size mismatch

#

The size of tensor a (96) must match the size of tensor b (64) at non-singleton dimension

#

at this

#
down_block_res_samples, mid_block_res_sample = self.control_model(
    latent_control_input,
    timestep,
    encoder_hidden_states=torch.cat([conditioning_data.unconditioned_embeddings,
                                     conditioning_data.text_embeddings]),
    controlnet_cond=control_image,
    conditioning_scale=control_scale,
    return_dict=False,
)
hollow marlin
tulip sluice
#

i thought that was the reason

#

but it didnt work with 512x512 either

#

i presumed it might be control net 11 but not that either

hollow marlin
#

Try setting noise to 512x512

tulip sluice
#

ah wait let me give it a shot

#

@hollow marlin yep that works

hollow marlin
#

In LatentToText I need to make sure control_image gets resized to match noise (or rather 8x the noise latent width & height).
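
A tiny sketch of the size rule described above, assuming the factor of 8 between latent and pixel dimensions. The helper names are hypothetical and this is not the actual fix that was pushed.

```python
LATENT_SCALE = 8  # pixels per latent cell, per the "8x" mentioned above


def control_image_size(latent_width, latent_height, scale=LATENT_SCALE):
    """Pixel size the control image must have for a given noise latent size."""
    return latent_width * scale, latent_height * scale


def needs_resize(image_size, latent_size):
    """True when the control image does not match the noise latent."""
    return tuple(image_size) != control_image_size(*latent_size)
```

For example, a 64x64 latent needs a 512x512 control image, so a 96-wide latent paired with a 64-wide one is exactly the kind of pairing that produces the tensor size mismatch quoted earlier.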

tulip sluice
#

@upbeat prism in the ImageField component, is there a way to know which node it is being used in?

violet sleet
#

What do you need this for? o_o

#

are you thinking about modifications spreading to other inputs?

hollow marlin
#

@tulip sluice Pushed a fix for resizing control_image to match noise latent, to feat/controlnet-nodes

hollow marlin
#

FYI I'm away from keyboard rest of the day, I'll try to rejoin conversation this evening

hoary pecan
#

Does anyone have a few examples of using nodes in CLI? I’ve only been using UI and want to switch to CLI for a bit of a faster inner loop, but I’m not sure what CLI docs I can trust 😂

upbeat prism
tulip sluice
broken blaze
hoary pecan
#

Ah so there's no way to just pass a JSON graph or something?

broken blaze
#

I think UI is probably easier than CLI, but that's just me 😛

#

no its piped commands

#

noise --seed 10 | t2l --prompt "an old man" | l2l --prompt "an old dog" --strength 0.8 --link -2 noise noise | l2i | show_image

#

for example

#

the UI seems easier than text, but who am i to say

#

suppose if autocomplete works you might be a keyboard warrior

upbeat prism
hoary pecan
#

Yeah that's what I think I'll do

tulip sluice
upbeat prism
#

probably not the best way but that will work for now

hollow marlin
low furnace
# tulip sluice the man you need to talk to about docker is <@1039312982179590144> ... I know z...

catching up after a few days off Discord, lol 😅 @loud helm I have a pretty good docker-compose setup in my fork and hope to get around (Soon™) to contributing it back to the project. The entire docker setup could definitely use a refresh. Hopefully just after 3.0 ill get it done. take a look here if interested, though i've not updated this in many weeks and things have evolved since. https://github.com/ebr/InvokeAI/blob/feat/docker/docker/docker-compose.yml.

loud helm
low furnace
#

also, have you tried the docker setup on a rocm system? i have no AMD GPU and no idea how that works (in terms of container runtime accessing the GPU), or how to test

#

(haven't looked into it rather)

loud helm
#

I have an RTX4090 so I also havent used AMD lol

hoary pecan
#

@upbeat prism - Do you want feedback/suggestions for nodes UI at this point? Trying to find the sweet spot of too early vs. too late 😁

broken blaze
#

@hoary pecan - he's said he'd love ideas on what it ought to look like, vs feedback on what it is rn

#

(because it's at a good point where we know it can be better, just need to make it better)

hasty bane
#

Is there going to be a general prompt node (pos and negative as one)? There is a single prompt/compel node and you have to use 2, which might cause confusion. At the least there needs to be a way to label them so you remember which is which.

hallow vault
#

Hello! Not sure it's yet time for feedback but I wish to report a small issue in the Nodes Editor UI, which probably will be fixed during its development. 🙂

hoary pecan
#

Two biggest pieces of feedback on UI so far is:

  • remove node really should be delete key, not backspace. I find myself slightly missing the text box, hitting backspace and deleting the node instead of text. Also Ctrl+backspace on Windows deletes a whole word, and if there’s no words left in the text box, it deletes the node (if I’m spamming Ctrl+backspace, this might just be a bug I can fix at some point 😅)
  • Once the set of possible nodes grows just a bit more, changing the list will probably feel a lot better as a sidebar that can be pinned/unpinned like gallery and parameters.
hasty bane
#

Delete should have confirmation to prevent accidents.

tulip sluice
#

the node editor is very barebones in terms of ux at the moment.

#

but keep the ideas coming in .. will help us build it the way users might feel most natural

broken junco
#

@sinful forge Will you have time to complete the review on the configuration system? https://github.com/invoke-ai/InvokeAI/pull/3340 I would like to work on the installation system now, and it will be easier if the configuration changes are in.

GitHub

Application-wide configuration service
This PR creates a new InvokeAIAppConfig object that reads application-wide settings from an init file, the environment, and the command line. There is also a ...

sinful forge
broken junco
#

Thanks so much. I sympathize...

peak silo
#

Just a quick question: is anyone using the nodes on a networked machine? I am trying to. I had to add host: true to the config; it all says it's connected and the web UI loads fine, but every time I invoke, a "server error" pops up in the web UI with no error message in the CLI, which just says it's loading the model and then nothing. I know no bug reports till it's done, but just wanted to check

upbeat prism
peak silo
#

@upbeat prism Ok will check when i am back in the office

upbeat prism
peak silo
#

@upbeat prism Thanks for the help, worked it out, somehow I managed to have an older version of nodejs installed on the server

upbeat prism
#

huh, that should not have caused a problem with anything

peak silo
#

@upbeat prism maybe it was the way it was installed or a corrupted file, but removing nodejs v16 off the server and installing nodejs v18 LTS fixed the issue I was having

upbeat prism
broken junco
#

@sinful forge In an invocation node, how do you tell the difference between an input that accepts a simple value, such as the width in the noise node, and one that only takes its input from another node, such as noise in the t2l node? I'm trying to figure out how the graph editor knows to create an orange input dot and a numeric textfield for width and a magenta dot but no textfield for noise.

sinful forge
#

@upbeat prism would have to speak to the UI. I'm pretty sure it uses the field types in the schema to generate UI. On the backend, any fields that can't provide a default value should be marked Optional, and there's some additional schema you can set to indicate that they are required (they can be None upon creation, but must have a connection before execution). There's not a great way to represent that in Pydantic unfortunately 😑

tulip sluice
#

if you go to InputFieldComponent.tsx it'll show you what element is being created based on what type of parameter it is

#

For the noise node .. it outputs a latent field

#

and the latent field component is defined here in InputFieldComponent.tsx

#

in the component you can see there is nothing ..so it returns nothing .. in terms of what needs to go for editing it

#

and the colors of the sockets are defined in constants.ts => so when the latent field is the output, the component generated is null and the socket generated is pink

#

so if you create a new output type, you need to create a new frontend component for it .. one time job

upbeat prism
# broken junco <@178692966318080000> In an invocation node, how do you tell the difference betw...

To elaborate on general sequence of events:

  • Backend generates the OpenAPI schema in memory and serves it on localhost:9090/openapi.json
  • UI, on start up, fetches the schema
  • UI parses the schema, looking for nodes and building node templates from the schema. We can't easily use the OpenAPI schema directly - so the node templates are built as a more accessible representation of the schema.
  • The node template includes things like the name and description of the node, but also its fields and return value. The fields are parsed into an inputs array, and the return value (output) is expanded into an outputs array. Some intermediary "types" are assigned to each input and output.
  • When you add a node, the template for that node is retrieved and a UI component is built and rendered based on the template. Each input/output has a different UI component based on its "type".
  • At the same time, a simple object is added to the UI's internal state to hold the value of all fields in that node
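
The schema-to-template step in the list above can be sketched roughly like this. The real UI does this in TypeScript; this Python version only shows the shape of the transformation. Picking nodes by an `Invocation` name suffix and skipping the `id` and `type` fields are assumptions for the example, and outputs are left out.

```python
def build_node_templates(schema):
    """Turn an OpenAPI-style schema dict into simplified node templates."""
    templates = {}
    definitions = schema.get("components", {}).get("schemas", {})
    for name, definition in definitions.items():
        # Assumption: node schemas can be recognised by their name.
        if not name.endswith("Invocation"):
            continue
        inputs = [
            {
                "name": field,
                "type": spec.get("type", "unknown"),
                "description": spec.get("description", ""),
            }
            for field, spec in definition.get("properties", {}).items()
            if field not in ("id", "type")  # bookkeeping fields, not user inputs
        ]
        templates[name] = {
            "title": definition.get("title", name),
            "description": definition.get("description", ""),
            "inputs": inputs,
        }
    return templates
```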

Every node is rendered as the same top-level node UI component. Then within that UI component, things like the handle colors and input areas are conditionally rendered based on the node template.

There's a good amount of missing logic in how the node templates are built and how they are used to generate the UI. For example, there needs to be more logic for some fields to limit if they can be connected-to or not, if they should be disabled in certain situations, and so on.

Additionally, there is a function called isValidConnection() that is used to validate connections. Right now, you can pretty much connect anything to anything - that's because the validation logic is skipped by an early return true. I've done this for now because the validation doesn't handle all cases properly yet.
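
For illustration, here is what a non-trivial connection check could look like once the early return is removed. This is a Python sketch of the idea, not the UI's `isValidConnection()`; the two rules (matching types, one incoming edge per input) are assumptions.

```python
def is_valid_connection(source_type, target_type, target_already_connected=False):
    """Decide whether an output handle may be wired to an input handle."""
    # Assumed rule 1: an input accepts at most one incoming edge.
    if target_already_connected:
        return False
    # Assumed rule 2: the field types on both ends must match exactly.
    return source_type == target_type
```

Real validation would also need the special cases mentioned above, such as fields that must not be connected at all.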

broken junco
# upbeat prism To elaborate on general sequence of events: - Backend generates the OpenAPI sche...

Thanks for the detailed explanation. So now I understand that there is no magic flag somewhere that distinguishes fields that can only receive values from edges from those that can be set using textfields, and that this is determined on a case-by-case basis by looking at the field type. I am working on the CLI again trying to make the inline help messages more useful. Currently the help makes it look like every parameter can be entered at the command line. For example:

invoke> t2l -h
options:
  -h, --help            show this help message and exit
  --link LINK LINK LINK, -l LINK LINK LINK
                        A link in the format 'source_node source_field dest_field'. source_node can be relative to history (e.g. -1)
  --link_node LINK_NODE, -ln LINK_NODE
                        A link from all fields in the specified node. Node can be relative to history (e.g. -1)
  --positive_conditioning POSITIVE_CONDITIONING
                        Positive conditioning for generation
  --negative_conditioning NEGATIVE_CONDITIONING
                        Negative conditioning for generation
  --noise NOISE         The noise to use
  --steps STEPS         The number of steps to use to generate the image
  --cfg_scale CFG_SCALE
                        The Classifier-Free Guidance, higher values may result in a result closer to the prompt
  --scheduler {ddim,ddpm,deis,lms,pndm,heun,heun_k,euler,euler_k,euler_a,kdpm_2,kdpm_2_a,dpmpp_2s,dpmpp_2m,dpmpp_2m_k,unipc}
                        The scheduler to use
  --model MODEL         The model to use (currently ignored)
  --seamless SEAMLESS   Whether or not to generate an image that can tile without seams
  --seamless_axes SEAMLESS_AXES
                        The axes to tile the image on, 'x' and/or 'y'

It looks like the positive and negative conditioning, and noise, can be set on the command line, but they have to be inputs from other nodes. The --steps argument can be set, but it looks the same as the others.

#

I fooled around a little bit last night and changed the help message to this:

Generates latents from conditionings.
INPUT FIELDS:
  cfg_scale                 ConstrainedFloatValue     The Classifier-Free Guidance, higher values may result in a result closer to the prompt
  model                     str                       The model to use (currently ignored)
  negative_conditioning     ConditioningField         Negative conditioning for generation
  noise                     LatentsField              The noise to use
  positive_conditioning     ConditioningField         Positive conditioning for generation
  scheduler                 Literal                   The scheduler to use
  seamless                  bool                      Whether or not to generate an image that can tile without seams
  seamless_axes             str                       The axes to tile the image on, 'x' and/or 'y'
  steps                     ConstrainedIntValue       The number of steps to use to generate the image

OUTPUT FIELDS:
  height                    int                       The height of the latents in pixels
  latents                   LatentsField              The output latents
  width                     int                       The width of the latents in pixels

And immediately began to wonder whether I could automatically change fields like scheduler to indicate they are settable from the command line as --scheduler.
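
One way to sketch that idea: decide per field type whether to show a `--flag` or a "link only" marker. The set of link-only types below is an assumption based on the field types that appear in this conversation, and `describe_fields` is a hypothetical helper, not part of the CLI.

```python
# Assumption: these field types can only be fed by a link from another node.
LINK_ONLY_TYPES = {"ConditioningField", "LatentsField", "ImageField"}


def describe_fields(fields):
    """Format (name, type_name, description) tuples as aligned help rows."""
    rows = []
    for name, type_name, description in sorted(fields):
        how = "link only" if type_name in LINK_ONLY_TYPES else f"--{name}"
        rows.append(f"{name:<24}{type_name:<24}{how:<16}{description}")
    return rows
```

Calling `describe_fields` with the t2l inputs would mark `noise` and both conditioning fields as link only, while `steps` and `scheduler` would show `--steps` and `--scheduler`.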

#

BTW, are the dimensions of latents measured in pixels?

chrome bobcat
#

A too-late question here. What if v1 of the CLI uses the same linear model as the UI's non-nodes interface and sticks to the basics of txt2img and img2img and txt2img2img instead of implementing full nodes exposed to the user?

#

So you keep syntax that's similar/identical to the 2.3.x CLI but under the hood it builds and links nodes together.

#

Just thinking of a quick[er] way to get 3.x out the door with a CLI.

broken blaze
upbeat prism
chrome bobcat
#

How many lixels to get to the center of your image?

upbeat prism
# chrome bobcat A too-late question here. What if v1 of the CLI uses the same linear model as th...

I think the issue here is that the nodes CLI is elegantly auto-generated from the node schemas, so writing a CLI in the linear style is actually a lot of work.

What we should do is define library graphs for the basics and then recommend using them. The library graphs allow you to expose specific ins and outs, hiding away all the internal connections (like conditioning and noise nodes).

The t2i library graph already does this and is a good example. @broken junco

upbeat prism
broken junco
hoary pecan
broken blaze
#

Clean organization

#

Think it’d be pretty easy to define a category on each node

hoary pecan
#

I was looking at setting it up and I was like… hey wait a minute I’ve seen UI like this before… wait that’s the same control! Haha

hollow marlin
broken blaze
#

Will be even more with community nodes :p

upbeat prism
#

Already have added tags to nodes, but they just aren't used yet

peak silo
#

Ok I feel silly for asking, as I try to work things out or search for answers before I ask, but how do you get a start image in to this box ( I must be missing something simple)

broken blaze
#

you can either connect a result image (node output) into the Image input, or drag an image onto the image icon

#

UI hasn't been fleshed out quite yet, but will include an upload function

peak silo
#

I have tried to drag the image into the box, but that just uploads it to the upload area of the gallery

upbeat prism
#

may need to use a library instead of the native APIs to fix it

peak silo
#

Ok, I'm currently on a windows machine: chrome and edge both show invalid upload when dragging from the gallery, but firefox does work right

#

on ubuntu Firefox works correctly too, but chrome gives the invalid upload, now I know I am not going crazy

#

@upbeat prism thanks for the info, I will just use firefox for now

upbeat prism
hollow marlin
peak silo
#

just a question is the range node random order output?, I did 5-100 in 5 step increments, instead of 5 10 15 20 25 etc... it went 10 60 50 75 45 etc...

upbeat prism
# peak silo just a question is the range node random order output?, I did 5-100 in 5 step in...

known issue related to how range/iterate nodes work internally. the images are not produced in the same order as the iteration output array.

the iterate nodes internally "expand" the graph downstream of the iterate node. so you give it range -> iterate -> txt2img -> img2img, and instead of the single txt2img -> img2img, it splits off to X txt2img -> img2img branches, where X is the size of the range.

so its like a map operation. thing is, this is a graph, and graph branches are not ordered like arrays. so there is no guarantee of the order of the X branches.

end result - you will get all the values back, but they won't be in order (unless by coincidence).
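
A toy model of the behaviour described above, to show why the order is lost and what restoring it would require. The shuffle stands in for unordered branch completion, and carrying an `index` on each branch is purely an assumption for illustration; the current implementation does not do this.

```python
import random


def expand_iterate(values):
    """One downstream branch per iterated value, tagged with its position."""
    return [{"index": i, "value": v} for i, v in enumerate(values)]


def run_branches(branches, seed=0):
    """Pretend to execute branches; completion order is not guaranteed."""
    completed = branches[:]
    random.Random(seed).shuffle(completed)  # simulate unordered execution
    return [
        {"index": b["index"], "output": f"image(steps={b['value']})"}
        for b in completed
    ]


def restore_order(results):
    """Re-sort results by the position they had in the iterate output."""
    return sorted(results, key=lambda r: r["index"])
```

Without something like the `index` tag, the only ordering information left on the results is their creation time.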

#

its tricky to solve this within the graph execution logic. you need some kind of flow control outside the graph to manage it. this does not exist in our implementation

for simple iterations, we are considering using the existing queuing capabilities of the backend (currently unused) or adding some simple queuing to the UI

peak silo
broken junco
upbeat prism
#

What I mean is that you’d need to have some context to know how the results should be ordered.

Currently they are ordered by creation time stamp. Ordering by anything else would require a lot of brittle logic (outputs would have to be somehow grouped based on a common upstream iterate node, and then sorted by the field that iterate node connects to, then its output list used as the source of truth for sorting).

#

Much better would be to process the iterated nodes in the right order, but I understand this to be a tricky solve

broken junco
#

Maybe you could pass a sort hint along with the metadata?

upbeat prism
#

My intuition tells me that would work but be a total dirty hack, leaving us with a really messy database and image metadata format

hoary pecan
#

Maybe I’m misreading, but a Sort node?

peak silo
#

@upbeat prism you are probably aware, but images are not getting resized to window size in txt to image, so chopping off large images at bottom and moving the arrow around

peak silo
#

I found in txttoimagetabmain changing first height from 100% to 100vh stopped it, but I am sure that's not the right place to change it

peak silo
tulip sluice
#

merged

devout parcel
#

How to invoke the latest cli that is, scripts/invoke-cli.py with node invocation ?

broken blaze
#

Hey @devout parcel 👋

#

The CLI in its legacy format is indeed being deprecated with Nodes.

devout parcel
# broken blaze The CLI in its legacy format is indeed being deprecated with Nodes.

How can a user use a text2latent invocation with the current CLI implementation, then?

```
`invokeai` command will now launch a new command-line client that can
be used by developers to create and test nodes. It is not intended to
be used for routine image generation or manipulation.
```

Currently users run `invokeai`, after which there is an active command prompt where they provide a prompt and other info
broken blaze
#

I believe that @broken junco is currently in the process of updating the command line interface, but I'm not entirely sure of the current recipe for getting it running

#

However, I do know that the nodes interface is a bit harder to work with through the CLI

#

Just to better understand - What's the goal for building with the CLI? Most of the capabilities/functionality you're looking to do shouldn't require an explicit arg to be created on the CLI itself

#

@violet sleet probably can also share more of his thoughts on how to proceed with your work

devout parcel
broken blaze
#

That will be handled by the model manager

devout parcel
#

We have also updated to use Onnx Execution Provider which takes advantage of the CPU for the inference pipeline, not sure on how the model rewrite works, will discuss with the team and check here

broken blaze
#

There will be some documentation/updates here, but I think we'll be able to better handle ONNX models once this is merged in and discussed. I anticipate us sharing a path forward here in the next day or two.

devout parcel
#

Thanks for the support and understanding, we do expect a clear path once the documentation/update is ready 👍

devout parcel
#

Hi @broken blaze any updates here

broken blaze
#

The merge is here - you can probably parse through it a bit to see where things have changed, but you'll want to follow the patterns established in the model_management and models folder

#

@violet sleet - With ONNX, would the ONNX model be a new BaseModelType, or just alter the pipeline? I'm not as familiar

violet sleet
#

I think it would be a big change, since all generation logic currently expects to get a UNet2DConditionModel
The last time I tried something like this, I thought of adding a field such as executor/provider to the model loader node,
with values:
Torch
TensorRT
ONNX_CPUExecutionProvider
ONNX_CUDAExecutionProvider

and handling these values inside the model loading function
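
A sketch of what such a field could look like, using the four values listed above. The `Executor` enum and `describe_load_plan` are hypothetical; the only external names relied on are ONNX Runtime's `CPUExecutionProvider` and `CUDAExecutionProvider` provider strings. Nothing is actually loaded here.

```python
from enum import Enum


class Executor(str, Enum):
    """Hypothetical executor/provider field for a model loader node."""

    TORCH = "Torch"
    TENSORRT = "TensorRT"
    ONNX_CPU = "ONNX_CPUExecutionProvider"
    ONNX_CUDA = "ONNX_CUDAExecutionProvider"


# ONNX Runtime execution provider names for the two ONNX options.
_ONNX_PROVIDERS = {
    Executor.ONNX_CPU: "CPUExecutionProvider",
    Executor.ONNX_CUDA: "CUDAExecutionProvider",
}


def describe_load_plan(executor):
    """Return which backend (and providers) a loader would use."""
    executor = Executor(executor)  # accepts the enum member or its string value
    if executor is Executor.TORCH:
        return {"backend": "torch"}
    if executor is Executor.TENSORRT:
        return {"backend": "tensorrt"}
    return {"backend": "onnxruntime", "providers": [_ONNX_PROVIDERS[executor]]}
```

Keeping the choice in one field means the model loading function is the only place that branches on the backend, at the cost of that function growing longer.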

#

in this case, no new model types are introduced
but model loading becomes "a bit" long 😄

//of course it's possible to do this without the field, by loading the model in the format it was saved in and selecting the ONNX provider based on the device currently selected in Invoke, but I initially thought of this when I tried to compile a TensorRT model)

broken blaze
#

Ah I see - so TRT and ONNX both share compatibility issues w/ LoRAs, TIs, etc?

violet sleet
#

ONNX a bit less, but in general, yes
we currently implement LoRA via torch hooks
but for ONNX/TRT we need to add the LoRA weights into the model and run the patched model
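
A dependency-free sketch of what "adding the LoRA weights into the model" means for a single layer, assuming the usual low-rank form where the patched weight is W + scale * (up x down). Plain nested lists stand in for tensors; a real implementation would do this on the exported model's weights, and the function names here are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, up, down, scale=1.0):
    """Return weight + scale * (up @ down) without modifying the inputs."""
    delta = matmul(up, down)
    return [
        [w + scale * d for w, d in zip(weight_row, delta_row)]
        for weight_row, delta_row in zip(weight, delta)
    ]
```

Because the result is an ordinary weight matrix, the patched model runs with no hooks, which is why this approach fits backends that cannot intercept layers at runtime.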

broken blaze
#

You don't think that deserves organizing it into a different model type?

violet sleet
#

I think initially we can do that to see how it works and, if possible, merge it back into the general model
if not, leave it as a separate type

broken blaze
#

aye

violet sleet
#

but as I see now:
compel will fail
ti/lora - can't be applied
t2l/l2l - probably crash(unsure, need to check code)

broken blaze
#

what steps would @devout parcel and team take to implement ONNX?

violet sleet
#

give me some time I'll check code a bit

broken blaze
#

If you give them a step by step of what to work through, I think that'd be very helpful for them since we don't have docs yet

violet sleet
#

I'm more worried that our generation code is fully based on torch, so even if they load an ONNX model it won't run

broken blaze
#

right - they'd need to write a new t2l node for the ONNX pipeline, right?

violet sleet
#

to be clear, we need to rewrite the current generation code too
right now I create the generation pipeline in a hacky way, but in reality we don't use the pipeline

#

but yes - currently they need to implement new t2l/l2l

hoary pecan
#

It’s the bottom of my list to build out more though.

broken blaze
#

I get that- SAMHQ is hot

hoary pecan
#

I’m conflicted lol - DeepFloyd vs SAMHQ

tardy nova
#

Hi @broken blaze, continuing the discussion with @devout parcel: we have submitted a PR for an ONNX model pipeline, designing a new model-type-based pipeline structure, in PR#3380 https://github.com/invoke-ai/InvokeAI/pull/3380. We are looking at the changes made for the new pipeline with node-based inference. Using the CLI, we try to get user input to select a PyTorch- or ONNX-based inference pipeline. So we would like to know whether the CLI node-based pipeline is ready; if so, please point us to the node-based documentation and a sample CLI invocation to test. If the CLI-based pipeline is not ready yet, can you share some pointers on how the node structure is being implemented for the GUI?


broken blaze
#

@violet sleet can likely comment but I don’t believe we’d recommend using CLI to make that designation.

The node structure is being implemented as a visual graph based editor. Users would select the model in the “model loader” node, the first node in a graph

violet sleet
#

I have no idea how to use the CLI with the current node backend))
As for the ONNX implementation, I decided to look at it after we are done with the main logic, so not until the first 3.0 version
But the general problems are: the current logic is designed to work with torch classes, and we haven't yet implemented pipeline loading in the model manager; all logic is done separately in nodes (most of the generation code is still combined in a pipeline class, but that's because not everything is rewritten yet, and this pipeline is initialized in a hacky way with only the unet)
//honestly, I have no idea how to implement features such as clip skip in ONNX

violet sleet
#

@hoary pecan @tardy nova I read your code while trying to understand "what needs to be done to add something like onnx in our architecture"

after an hour, I found that i had a basic skeleton and... in 4 more hours had something very functional.

it's still draft, but I think I have gotten the majority of the integration work completed for ONNX, including other support nodes (there are likely some things you can assist with) - thanks for your contributions thus far, I got acquainted with ONNX from the PRs you submitted.

I'd love for you to take a look at the PRs and get involved

upbeat prism
# tardy nova Hi @broken blaze in continuation with @devout parcel discussion...

Hi @tardy nova and @devout parcel ,

Generally, to understand how the CLI and GUI work, I suggest you review the docs in https://github.com/invoke-ai/InvokeAI/tree/main/docs/contributing to understand how the nodes engine works in broad strokes.

Correctly implemented nodes (aka "invocations") will automatically show up in the GUI nodes editor. For the other application tabs (the "linear" tabs: Text to Image, Image to Image and Canvas), we need to manually create UI elements to allow the user to interact with different nodes.

The node editor automatically generates a graph from the node GUI, but for the linear tabs, we need to manually create the graph.

Ideally, you do not need to worry about the GUI side, and the process by which the nodes are parsed and UI templates created is fairly involved.

The CLI is autogenerated from the nodes as well, but as you can imagine, working with graph structures on a CLI can be tedious. It's much simpler to work using the GUI.

You can help us to help you by clarifying your goals in working with and on InvokeAI:

  • Do you intend to build on top of InvokeAI, use InvokeAI, or something else (eg only contribute to it)?
  • If you intend to build on it, how do you plan to interface with it (eg CLI, GUI, programmatically)? The more specifics the better.

We understand you need ONNX support and have already made a substantial PR. Thank you for this, and we apologise for it sitting there so long. We are in the middle of a large migration effort to a new architecture, as you may know. Thanks for your patience.

broken blaze
tardy nova
#

Sure, thank you @upbeat prism for the documentation and the heads up on the node structure. I will go through the information shared. @violet sleet thanks for the PR. It is great; I will have a close look at it and help with further development where possible.

peak silo
#

Sorry to post this here, I didn't want to post in the invoke-chat. Is anyone else having issues on main with output images not being made? The latents are getting made, but nothing is going into the image folder, and then I get 404 errors in the console because the image is in the db but not in the folder. (Just making sure it's not a me issue)

broken blaze
#

It could be both a "you" issue but also an "invoke" issue 🙃

#

Is this a clean install or a migrated install? Clean db?

peak silo
#

Migrated but clean db

broken blaze
#

you might see if you can go into settings, change your log level, and then try it again to see what's erroring out

peak silo
#

the thing that is throwing me is uploads go to outputs folder and can be retrieved
fine

broken blaze
peak silo
#

"Traceback (most recent call last):
File "/home/invokeuser/InvokeAI/invokeai/app/services/processor.py", line 70, in __process
outputs = invocation.invoke(
File "/home/invokeuser/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/invokeuser/InvokeAI/invokeai/app/invocations/latent.py", line 436, in invoke
vae_info = context.services.model_manager.get_model(
File "/home/invokeuser/InvokeAI/invokeai/app/services/model_manager_service.py", line 224, in get_model
model_info = self.mgr.get_model(
File "/home/invokeuser/InvokeAI/invokeai/backend/model_management/model_manager.py", line 434, in get_model
model_path = model_class.convert_if_required(
File "/home/invokeuser/InvokeAI/invokeai/backend/model_management/models/vae.py", line 92, in convert_if_required
return _convert_vae_ckpt_and_cache(
File "/home/invokeuser/InvokeAI/invokeai/backend/model_management/models/vae.py", line 149, in _convert_vae_ckpt_and_cache
checkpoint = torch.load(weights_path, map_location="cpu")
File "/home/invokeuser/venv/lib/python3.10/site-packages/torch/serialization.py", line 791, in load
with _open_file_like(f, 'rb') as opened_file:
File "/home/invokeuser/venv/lib/python3.10/site-packages/torch/serialization.py", line 271, in _open_file_like
return _open_file(name_or_buffer, mode)
File "/home/invokeuser/venv/lib/python3.10/site-packages/torch/serialization.py", line 252, in __init__
super().__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: '/home/invokeuser/userfiles/models/core/convert/se-vae-ft-mse'
"

#

however that folder and its file are there

broken blaze
#

Ah ha!

#

It's because it's a relative path

#

There's a fix in the works on that - It may have been merged. When did you last pull main?

peak silo
#

about 30mins ago

broken blaze
#

May not have been then! 🙂

#

You're sure you have se-vae-ft-mse, not sd-vae-ft-mse?

peak silo
broken blaze
#

Possibly, yes

#

There may be a few 'relative' things in flight

#

but you might try pulling it to see

peak silo
broken blaze
#

ah, lol

peak silo
#

definitely a relative path issue, but only with the vae option in models.yaml. I tested by changing from vae: models/core/convert/sd-vae-ft-mse to the full path vae: /home/invokeuser/userfiles/models/core/convert/sd-vae-ft-mse and now there is no error
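
A minimal sketch of the behaviour this points at, assuming relative entries in models.yaml are meant to be resolved against the root directory seen in the traceback rather than the current working directory. `resolve_model_path` is a hypothetical helper for illustration, not InvokeAI's actual code:

```python
from pathlib import PurePosixPath


def resolve_model_path(root: PurePosixPath, configured: str) -> PurePosixPath:
    """Return an absolute path for a models.yaml entry (hypothetical helper).

    Absolute entries are returned unchanged; relative entries are joined
    onto the root directory instead of depending on the working directory.
    """
    path = PurePosixPath(configured)
    return path if path.is_absolute() else root / path


# Root directory taken from the traceback above.
root = PurePosixPath("/home/invokeuser/userfiles")

# Both spellings of the vae entry end up at the same absolute location.
relative = resolve_model_path(root, "models/core/convert/sd-vae-ft-mse")
absolute = resolve_model_path(
    root, "/home/invokeuser/userfiles/models/core/convert/sd-vae-ft-mse"
)
print(relative)
print(absolute)
```

With resolution like this, the relative and the full-path form of the vae setting would behave the same.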

peak silo
tulip sluice
still wing
#

Will the nodes support the safety checker? Noticed that the model is not loaded in the TextToLatents node

broken blaze
#

@violet sleet - do you know how/when the safetychecker is loaded in 3.0?

violet sleet
#

there is no code for the safety checker currently, as it's not a priority
but generally I think it will be added as one more output from the model loader node

#

and then safety checker node to run it on generated image

still wing
#

Alright, safety checker as its own node makes sense. Thanks

upbeat prism
violet sleet
#

it's like vae
can be provided and run separately

upbeat prism
#

ok cool. why would we want it on the model loader node then?

tulip sluice
devout parcel
#

Even after initializing the backend with script\invoke-web.py

I can see only this node. L2T, Prompt and others aren't available

broken blaze
upbeat prism
devout parcel
#

Trying with the latest main only; there are similar messages in the dev server console too.

unborn onyx
#

Meta question: are ppl working on nodes familiar with the feature set of both Blender node editor and Comfy? so much to learn, from a conceptual viewpoint

#

like nested subgraphs ("groups" or "recipes") are an important building block for composition and being able to create a community around sharing node graphs

broken blaze
#

Yes.

#

Node Editor is in "alpha/experimental" state - It is not the final UI/UX and purely for folks to poke/prod at nodes in advance of the full editor release.

#

Many features that will be supported do not currently exist, and as the developer of it has called out - "There are no mitigations for footgunning currently in place"

sinful forge
unborn onyx
#

awesome! good to know

#

another small thing that I find super useful for auto-documentation purposes is color-coded types for the I/O ports

#

vae/number/text/...

upbeat prism
upbeat prism
#

anyways, the whole thing needs to be redone before release, so we'll find a better solution for this

devout parcel
#

@violet sleet Facing issues with the setup in this branch: https://github.com/invoke-ai/InvokeAI/pull/3562#issuecomment-1651577818. Model installation also fails in this branch. Could you please add the steps to follow for this?

GitHub

Note: this branch based on #3548, not on main
While find out what needs to be done to implement onnx, found that I can do draft of it pretty quickly, so... here it is)
Supports LoRA and TI.
As exam...

upbeat prism
opal arch
#

I offer @violet sleet all the credit. I'm currently merging main into the branch

devout parcel
opal arch
#

@devout parcel feat/onnx should now be up to date with Main again

devout parcel
opal arch
#

oh weird, I'll make sure nothing is sitting stale on my local

opal arch
#

fixed @devout parcel

rare maple
#

I have been struggling with input and output types while trying to write my xyGrid node. #1133465385182699582 . What I wanted to do was have the XYCollector take in X,Y as list[Any], then output a single list[Any] that was a product of the input lists and could be passed to the iterate node, but for the life of me I couldn't get it to work. I also tried Union[int, list[int]], the same for float, and other combinations. Basically I was trying to accept int and float on the same node inputs so I could create an output array without worrying about the input types. Eventually it would need to accept all the types that can be passed as part of a generation. In the end I just converted everything to strings and then back again after the iterate node. This works but seems really inefficient. Any suggestions, or is this kind of thing coming in the future of nodes?
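
A rough sketch of the string workaround described here, in plain Python and outside of any node classes. It assumes the X and Y values are JSON-serializable (int, float, str); `xy_pairs` and `decode_pair` are hypothetical names, not part of InvokeAI:

```python
import json
from itertools import product


def xy_pairs(xs, ys):
    """Encode every (x, y) combination as one JSON string.

    Strings are a single type, so the resulting list can be passed
    through something that only accepts list[str].
    """
    return [json.dumps([x, y]) for x, y in product(xs, ys)]


def decode_pair(item):
    """Recover the original (x, y) values, with their types, from one item."""
    x, y = json.loads(item)
    return x, y


# Mixed int/float on X, strings on Y.
pairs = xy_pairs([7.5, 10], ["euler", "ddim"])
for item in pairs:
    print(decode_pair(item))
```

JSON keeps int, float and str distinct on the way back, which avoids hand-written per-type conversion, though it is still a round trip through strings.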

broken junco
#

Proof-of-principle "universal translator" node for prompts.

#

PR 4072

broken blaze
#

This came up on the US copyright office's webinar around limited accessibility for AI

broken junco
#

It uses a pool of translation services. How it's actually working is pretty opaque, as the API goes to a regional server (mine is located in Ontario) and then "something happens" on the server side. The list of supported translation services is here: https://pypi.org/project/translators/

There is some control for which translation services will be used. For example, you can exclude China-based services.

upbeat prism
peak silo
#

is there currently a way to have a list option in a node? like what model loader node has but for other files?

rare maple
# upbeat prism There's some runtime type checking that needs to be a bit more sophisticated to ...

I have workarounds for now so no real rush. I was more checking that I hadn't missed something obvious. I have a working version with labels on the grid and consistent grid layout, just need to do some code tidying then will release it for public consumption.

No rush from my side, it's much better to get the refactoring done right than to rush something in just for me.

'Union[Any|list[Any]]' was what I originally tried to use to collect the x and y items. In my mind an ideal would be a custom version of iterate that could take in the x and y item arrays, create a product array, and then output the x and y items on the other side. That would replace what takes 7 nodes now with just 1, with no loss of functionality. Then something similar could be done on the collect and grid production side of things. I am confident there is an ideal way of doing this that isn't necessarily what is going on in my head 😂

spark jetty
#

Great work on the nodes GUI. Since nodes are supported now, consider nested nodes or trees of them; take inspiration from behavior trees in a game engine.

upbeat prism
upbeat prism
rare maple
dark mango
dark mango
upbeat prism
dark mango
rare maple
rare maple
#

Just updated my image to grid nodes #1133465385182699582 - it now supports correctly ordered XY Grids with labels. Feedback or suggestions are most welcome.

dark mango
rare maple
dark mango
#

Will do! thanks