Invoke Backend (Node-Based Backend)

misty cedar Sep 27, 2022, 7:35 AM

#

Starting a thread to discuss and share progress. Current progress is in branch invoker-framework, and can be run with python scripts/invoke.py with --api for the API version. Please no pull requests to the branch without prior discussion with @misty cedar - things are stabilizing, but there are still some major components to complete that may involve refactors. Until then, enjoy a preview of what's there!

#

To-do:
(rough list, probably incomplete)
✅ Image management. Images in outputs need to be references to something on an image manager, which should manage load/save/caching of images. The idea being that we can retain N images in-memory for pipeline usage, then load the rest from disk as-needed.
✅ Context history: Provide a way to get previous contexts
✅ Context save/load
⬛ Remove unexecuted nodes from a context
✅ Socket.io: I need to prototype this again.
✅ Events/signals: Need to define some standardized signal format and then hook signals everywhere in the invocations. This will probably be something like { "context_id": "...", "invocation_id": "...", ... }.
✅ API: There's no API yet! Well, there's one POST that lets you run a full JSON graph, but I need to build an API. Need everything else in place first though.
⬛ Image metadata. No idea how I'm going to pipe that in, and I don't feel like we settled on what image metadata should even look like when you could have a giant node graph generate images, and lots of input images.
⬛ Iteration. I've been leaving iteration to the end to avoid dealing with control flow. I'm thinking I'll do loop unwrapping or something, but not sure if that happens up-front (even all the way up to the UI) or dynamically at run-time (e.g. see an iteration node and generate all the iterations from its links).
⬛ Disable some nodes in the API (e.g. you don't want show_image popping up images on the host machine if you're accessing it remotely).
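
A minimal sketch of the image management idea from the list above (keep N images in memory, load the rest from disk as needed). The `ImageCache` class, its method names, and the use of raw bytes in place of real image objects are all assumptions for illustration, not the code in the branch:

```python
from collections import OrderedDict
from pathlib import Path
import tempfile


class ImageCache:
    """Hypothetical cache: at most `max_items` images in memory, all on disk."""

    def __init__(self, directory: Path, max_items: int = 2):
        self.directory = directory
        self.max_items = max_items
        self._memory = OrderedDict()  # name -> bytes, oldest entry first

    def save(self, name: str, data: bytes) -> None:
        (self.directory / name).write_bytes(data)  # always persist to disk
        self._remember(name, data)

    def get(self, name: str) -> bytes:
        if name in self._memory:
            self._memory.move_to_end(name)  # mark as most recently used
            return self._memory[name]
        data = (self.directory / name).read_bytes()  # cache miss: load from disk
        self._remember(name, data)
        return data

    def _remember(self, name: str, data: bytes) -> None:
        self._memory[name] = data
        self._memory.move_to_end(name)
        while len(self._memory) > self.max_items:
            self._memory.popitem(last=False)  # evict the least recently used

    def in_memory(self) -> list:
        return list(self._memory)
```

Evicting the least recently used image is just one possible policy; a pipeline might instead prefer to pin the images that pending nodes still need.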

#

Newest tonight:

The cli now supports command pipes (e.g. txt2img --prompt "a photo of a cat eating sushi" | upscale | show_image).
There is now a context manager to manage context objects (in-memory currently, no history, and not super clean).
All invocation (node) processing is done on another thread. Currently a single node at a time - I'll leave the problem of multiple nodes at a time for someone in the future (since it gets really complicated with all the nodes and their needs/performance implications). Invocations can be awaited with a simple wait function call on the context.
Eventing in-app is fixed. The event system should easily support eventing via external services as well, but I'll not be proving that out with v1. It should be pretty straightforward to get socket.io hooked up again (though I'm sure I'll run into something weird).
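
To illustrate the command pipes mentioned above, here is a rough sketch of how a piped command line could be turned into a node graph. `parse_pipeline` is a hypothetical helper, and the link shape (including `to_node_id`) is an assumption; the real CLI may do this quite differently:

```python
import shlex


def parse_pipeline(command: str) -> dict:
    """Turn 'a --x 1 | b | c' into a {"nodes": [...], "links": [...]} graph."""
    tokens = shlex.split(command)
    # Split the token list into one segment per "|"-separated command.
    # Limitation: a quoted "|" inside an argument is also treated as a pipe.
    segments, current = [], []
    for token in tokens:
        if token == "|":
            segments.append(current)
            current = []
        else:
            current.append(token)
    segments.append(current)

    nodes, links = [], []
    for index, segment in enumerate(segments, start=1):
        if not segment:
            raise ValueError("empty command in pipeline")
        node = {"id": str(index), "type": segment[0]}
        args = segment[1:]
        # Pair up "--name value" arguments; a trailing unpaired flag is ignored.
        for name, value in zip(args[0::2], args[1::2]):
            node[name.lstrip("-")] = value
        nodes.append(node)
        if index > 1:
            # Link every node to the one before it, matching all fields.
            links.append({
                "from_node_id": str(index - 1),
                "from_field": "*",
                "to_node_id": str(index),
                "to_field": "*",
            })
    return {"nodes": nodes, "links": links}
```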

#

Once I've got the core done, I'll clean up old code. At some point after that, it'd be great to rebase on development (or more realistically, create a brand new branch and manually migrate my changes, since they mostly sit on top of the current code right now, and a manual copy would be less hassle).

Things I could use help on after that (or leading up to it, depending on the task):

Building invocations (nodes)! I've done txt2img, img2img, upscale, gfpgan, and some image utilities. There's a lot left to do though. They're pretty straightforward though 🙂
Breaking apart the Generate class (or at least the function calls). Ideally we have a function call per invocation. I've done what I can here, but wanted to keep my changes minimal until we work on merging efforts.
More flexible configuration. I'd really like to set up configuration to parse to a well-defined object and utilize that during service definition. It'd be awesome if configuration could set up a machine to just run as a queue processor, for example, with the context manager replaced with remote calls to the hosting service.

Probably more things I'm forgetting.

terse hill Sep 27, 2022, 8:01 AM

#

misty cedar Once I've got the core done, I'll clean up old code. At some point after that, i...

Can you show features of nodes you already have and which ones you plan to do next? I’m trying to design them properly.


misty cedar Sep 27, 2022, 8:17 AM

#

/ldm/dream/app/invocations in the branch contains all the current invocations. They're made available in the CLI automatically, and you can also read the OpenAPI doc when running the API to get the full schema for each.

I believe @tame remnant has at least prototyped reading the OpenAPI doc to generate nodes from.

Each node also has a stable type name, so if you wanted custom UI for a particular node, that should be possible.

#

We can also add additional properties to fields that the UI could pick up and utilize, if necessary. Preferably we minimize that to not mix too many UI details in the backend code, but it's possible.

tame remnant Sep 27, 2022, 8:30 AM

#

@terse hill There are two ways to make a Node UI: automatically generated from the code that describes the invocation, and hand-crafted. Right now all nodes are generated automatically

terse hill Sep 27, 2022, 8:43 AM

#

misty cedar `/ldm/dream/app/invocations` in the branch contains all the current invocations....

can't check yet, I broke everything. Conda doesn’t start, although I installed it several times according to the manual. I’ll figure it out later.

tame remnant Sep 27, 2022, 8:58 AM

#

I've got my UI to work with the current changes to invoker-framework, no need to change any code there. I'd like to submit a PR with a simple_prompt.py invocation - that ok?

#

I'm slowly working out parsing and inferring UI from the OpenAPI doc. Without any UI details in the invocation I need a lot of logic but it's certainly doable. If I could just add a uiHint dict like discussed, it becomes sooo much simpler. What if we have an Invocation file, and then another file that describes its UI? I will still get things working to infer almost everything tho.

#

re: TextToImageInvocation & ImageToImageInvocation - should model and progress_images be there? I feel like those are system settings that don't need to be in any node

#

One UI hint I think is required is which parameters should have a handle to be linkable. For example, if I want 'prompt' to require an input (maybe this isn't how it should be but just for an example), the schema needs to indicate this somehow.

#

If the Field is optional, the schema shows no default value. I can use this, but then what if we have some other field which is optional but also does not require a link? I think there needs to be a flag for fields that require input links

#

Hmm. Ctrl+C no longer kills the script properly.

#

I mean invoke.py --api

#

Another thing that is needed in the invocation somewhere - for string types, do I show a small HTML input element or a multi-line textarea. I'm going to update my fork's invocations with the minimum UI hinting needed. I'll make it an additional property of the field called ui and you can give me your feedback, Kyle

misty cedar Sep 27, 2022, 9:55 AM

#

Can't sleep 😣. Yah I bet the threading killed Ctrl+c. Need to work that out (some todos in the code about that).

There are only really string types (text field), enum types (drop-down), and numeric (some with range requirements, mostly just minimums). Getting a good auto-UI setup may be a pain, but will make it easier for contribution later.

All non-required parameters are linkable. And required parameters can even be linked (but you have to provide a value directly as well).

Model is in there specifically since I want to support multiple models in the same node graph. There's a discussion thread discussing how it can be done. You can hide it for now, but I want to plan on it ☺️.

Progress images has always been a flag. I don't know how useful it will be in new UI, but I was going to add events for it (or at least step events like exist today).

#

I mostly am trying to avoid more nodes so I don't have more to refactor as I continue changing things around. I'll hopefully stabilize the code too, but I have some field handling and results serialization work to do, and that could be disruptive.

tame remnant Sep 27, 2022, 10:21 AM

#

Ok gotcha re: model, that will be sweet

#

in the react UI i have progress images as system-wide. but I realize it's also a node thing. I'll give nodes an additional settings tab, based on props like model and progress images, so you can manage per node

#

im just working out how to parse the numeric stuff now. javascript only has 'number' as a type, it's kinda hilarious. you can't get a float or int or anything reasonable. just number.

#

besides that I can generate the UI from the invocation now, with some assumptions. some light hinting makes the UI much nicer tho

#

mind on invoke stuff keep ya up? hopefully you're fast asleep now tho

#

width and height need pydantic's multiple_of=64. seed needs ge=0 and le=4294967295 (numpy's max)

#

and I believe cfg scale is gt=0

rocky pollen Sep 27, 2022, 10:52 AM

#

I think the question around nodes and parameters is an important one to ask - should parameters that don’t make sense except in the context of a specific node live outside the node or in it?

#

I wonder if, since front end is getting generated automatically, if it makes sense to have a “required” set of parameters displayed, and then a secondary set of “optional” parameters progressively disclosed

tame remnant Sep 27, 2022, 11:01 AM

#

while the frontend can be automatically generated, it doesn't necessarily need to be - we can provide handcrafted UIs for modules where appropriate

#

but i do think it is pretty freaking cool to have it auto generate

rocky pollen Sep 27, 2022, 11:17 AM

#

hell yeah it is haha.

tame remnant Sep 27, 2022, 12:11 PM

#

Sweeeet, UI auto-generated from the invocations, processing working

#

few things need fixing but it works

misty cedar Sep 27, 2022, 2:44 PM

#

tame remnant width and height need pydantic's `multiple_of=64`. seed needs `ge=0` and `le=429...

Does it strictly have to be a multiple?

Is numpy max a constant somewhere?

elfin mural Sep 27, 2022, 6:49 PM

#

In numpy, yes. And the image dimension numbers must be mod 64 or things actually break.
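
For reference, 4294967295 is 2**32 - 1, the largest unsigned 32-bit integer; in NumPy the same value should be available as `np.iinfo(np.uint32).max`. A small standard-library sketch of the two constraints discussed above, with hypothetical helper names (the invocations themselves would express this through pydantic field constraints instead):

```python
SEED_MAX = 2**32 - 1  # 4294967295, the largest unsigned 32-bit integer


def check_dimension(value: int, multiple: int = 64) -> int:
    """Reject widths/heights that are not a positive multiple of `multiple`."""
    if value <= 0 or value % multiple != 0:
        raise ValueError(f"{value} is not a positive multiple of {multiple}")
    return value


def check_seed(seed: int) -> int:
    """Reject seeds outside the 0..SEED_MAX range."""
    if not 0 <= seed <= SEED_MAX:
        raise ValueError(f"seed must be between 0 and {SEED_MAX}")
    return seed


def round_down_to_multiple(value: int, multiple: int = 64) -> int:
    """A UI could snap a typed value down to the nearest valid dimension."""
    return (value // multiple) * multiple
```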

red yew Sep 28, 2022, 6:59 AM

#

fix for
PermissionError: [Errno 13] Permission denied: '/results'

https://github.com/diffubik/InvokeAI/commit/ee8a043d023ee981e0c92969b984878bc8faff00

fix for ctrl-c

https://github.com/diffubik/InvokeAI/commit/136c529c56067c1e6717d76a662d4bcd4a86a0c8

tame remnant Sep 28, 2022, 7:00 AM

#

red yew fix for ```PermissionError: [Errno 13] Permission denied: '/results'``` https:...

Thanks!

tame remnant Sep 28, 2022, 7:29 AM

#

@misty cedar Here is generate.py Invocation with what I believe to be minimal functional UI metadata: https://github.com/psychedelicious/stable-diffusion/blob/react-flow-test/ldm/dream/app/invocations/generate.py how does that look to you

red yew Sep 28, 2022, 11:16 AM

#

Generating: 100%
/usr/bin/xdg-open: line 881: www-browser: command not found
...
xdg-open: no method available for opening '/tmp/tmp3xtjpaxg.PNG'

I can't find the invocation for xdg-open (I assume it is just some generic open call) - does anyone know where it is?

tame remnant Sep 28, 2022, 11:22 AM

#

I believe it's in ldm/dream/app/invocations/image.py @red yew

#

    def invoke(self, services: InvocationServices, context_id: str) -> Outputs:
        image = services.images.get(self.image.uri)
        if image:
            image.show()  # <----- here

red yew Sep 28, 2022, 12:24 PM

#

I'm a big fan of putting these sorts of developer options behind envvars

#

we could do sth like

if image and os.environ.get('INVOKE_SHOW_IMAGE') == '1':
  image.show()

#

much lower effort than adding it to cli args

#

that's how I re-added the low gpu patch back to the invoke branch https://github.com/diffubik/InvokeAI/commit/a4d5851580765cbd45cb2db2a7b8c04e1e946b55

tame remnant Sep 28, 2022, 12:30 PM

#

I think the intention is ShowImageInvocation not a developer option, it's a node you use when you want to display an image

red yew Sep 28, 2022, 12:30 PM

#

oh I see

#

I was just testing the sample API call from the /docs endpoint

#

I see it now

tame remnant Sep 28, 2022, 12:32 PM

#

I guess it couldn't actually show the image on your system, based on your initial question?

red yew Sep 28, 2022, 12:32 PM

#

yes - for some reason PIL was looking for a browser

#

or rather xdg-open was

#

but I don't care about that step anyway

tame remnant Sep 28, 2022, 12:34 PM

#

Probably something to look into for us. Are you on a non-standard linux distro or something?

red yew Sep 28, 2022, 12:35 PM

#

regular arch via ssh though

#

maybe because no X is available

tame remnant Sep 28, 2022, 12:37 PM

#

That'll be the issue. We'll just need to have some error handling for situations like this.
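
One possible shape for that handling, as a sketch only: skip the viewer when there is no display to open it on. The function names and the DISPLAY / WAYLAND_DISPLAY check are assumptions, not what the branch does:

```python
import os
import sys


def can_show_images(platform: str = sys.platform, env=None) -> bool:
    """Guess whether an image viewer can be opened on this machine."""
    env = os.environ if env is None else env
    if platform.startswith("linux"):
        # Without an X11 or Wayland display (e.g. plain ssh) there is
        # nothing to open a viewer on.
        return bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))
    return True  # assumption: other platforms have a desktop session


def show_if_possible(image, platform: str = sys.platform, env=None) -> bool:
    """Call image.show() only when a display seems available."""
    if image is None or not can_show_images(platform, env):
        return False
    image.show()
    return True
```

The platform and environment are parameters only so the check can be exercised without a real display.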

red yew Sep 28, 2022, 2:21 PM

#

```json
{
  "nodes": [
    {
      "id": "1",
      "type": "txt2img",
      "prompt": "A photo of a cat eating sushi"
    }
  ],
  "links": [
  ]
}
```
ok this returns a 200 with an `{ "id": ... }`

#

but how do I get the actual image?

elfin mural Sep 28, 2022, 2:41 PM

#

Keep in mind that there's no 'xdg-open' on some platforms (no 'xdg' anything, for that matter)

red yew Sep 28, 2022, 2:42 PM

#

yeah the image.show() call feels a bit out of place

misty cedar Sep 28, 2022, 3:15 PM

#

Guessing you're checking out my branch? I just added image storage yesterday, and it's still pretty rough draft (was feverish all day).

The show image node is mostly for debug use, but also figured it would be nice for CLI usage. It's supposed to use the system's image viewer to display, and saves the image to a temp location before displaying. Not sure how that works on different systems though. I have a note in the invocation to not expose it for web UI ☺️

misty cedar Sep 28, 2022, 3:20 PM

#

tame remnant @misty cedar Here is `generate.py` Invocation with what I believe to b...

I believe there's a title value on Field that can be set - could that be used for display name?

Why requires connection? I think even image nodes won't require a connection soon.

We should probably have a UI constants enum somewhere with all the strings defined. That would help with consistency/refactoring/preventing typos.

red yew Sep 28, 2022, 4:00 PM

#

what's the correct http request to get the png image data from a context?

misty cedar Sep 28, 2022, 4:24 PM

#

There isn't one yet

#

(trying to keep the Todo list updated at the top of this thread)

red yew Sep 28, 2022, 4:47 PM

#

let's pin the message 🙂

#

so I guess we will have an endpoint /api/v1/context/:id?

#

and /api/v1/context/:id/image.png ?

misty cedar Sep 28, 2022, 5:10 PM

#

There's either no pinning in forums or I don't have the ability to pin.

Something like that. I'll expose the contexts at a /context uri. Images will either be under /images as a root, or /context/{id}/nodes/{id}/results/ or something like that.

#

Pretty sure I need a better name than "context" though. "session" is the closest I can think of, but it doesn't seem correct (and "job" isn't correct, since the idea is that you can continue adding to it)

elfin mural Sep 28, 2022, 5:47 PM

#

@rocky pollen; are you able to pin #1024222732642177055 message ? I cannot...

rocky pollen Sep 28, 2022, 5:47 PM

#

Can try! Was just looking to see

rocky pollen Sep 28, 2022, 5:48 PM

#

misty cedar **To-do:** (rough list, probably incomplete) ✅ Image management. Images in outpu...

elfin mural Sep 28, 2022, 5:48 PM

#

@rocky pollen you rock

rocky pollen Sep 28, 2022, 5:48 PM

#

Ok looks like perms need to be looked at

#

I think contributors should now have pin perms

misty cedar Sep 28, 2022, 5:50 PM

#

Yep I'm seeing pin/unpin now 🙂

#

Also edited the messages so the whole to-do list is its own message

rocky pollen Sep 28, 2022, 6:02 PM

#

Pseudo project solution ✅

tame remnant Sep 28, 2022, 8:37 PM

#

misty cedar I believe there's a `title` value on Field that can be set - could that be used ...

Yeah I saw the title value on Fields - that's better.

Requires connection tells the UI to add a connection handle to that field and not attempt to display the value of it.

So would we need to extend Field into InvocationField and add ui to it, defining types and such?

misty cedar Sep 28, 2022, 8:38 PM

#

I think just having an enum with standard values is probably fine

tame remnant Sep 28, 2022, 8:40 PM

#

Just as a reference for users then? or can it provide editor hinting / runtime error messages

misty cedar Sep 28, 2022, 8:41 PM

#

UI_SHOW_SCROLLBAR: "true" or something

tame remnant Sep 28, 2022, 8:43 PM

#

Oh, gotcha.

#

I think I can generate a reference automatically from the TS types, either way I'll work on what you suggested and some documentation soon

misty cedar Sep 29, 2022, 3:30 AM

#

@tame remnant This might be useful for UI. Can use more specific fields that seem to include more info: https://pydantic-docs.helpmanual.io/usage/schema/#json-schema-types

misty cedar Sep 29, 2022, 5:54 AM

#

Alright contexts can now save (and load). Required some rather significant refactoring. Hopefully it should still be compatible if you've been using the openapi schema to generate UI.

misty cedar Sep 29, 2022, 6:58 AM

#

I've added an initial contexts API. Expected usage is that you create a context (with or without a graph definition), which returns the context id (but doesn't execute the graph). You then have the option to append more nodes/links (you can also do this after execution). You'll then "invoke" the graph (either a single node at a time, or everything). This will update the context as it invokes (I haven't hooked up signaling yet).

Some notes:

There's no way to delete a context. I'm not really clear how we'd handle that - should we delete all associated image results? What if you used them in other contexts? Without loading all contexts I have no way to know (I'm not using a database for contexts).
There's no way to "reset" a context. Once you execute it, you can't go back. Similar reasons - how should we handle results? I think you'd probably want to just copy your graph and create a new context from it if you want to "reset" it.
There is no way to remove a node from a context (same story).
You can't link from a new node to any other existing nodes, for the same reason.

Seems like I should probably make it possible to remove existing nodes as long as they're unexecuted...
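
The lifecycle described above (create, append, invoke one node or everything, and only ever remove unexecuted nodes) could be modelled roughly like this. It is an in-memory sketch with made-up names, it does not model the restriction on linking new nodes into existing ones, and it is not the actual contexts API:

```python
class Context:
    """Hypothetical stand-in for a context: a growing graph plus results."""

    def __init__(self, nodes=None, links=None):
        self.nodes = {}    # node id -> node dict
        self.links = []    # (from_node_id, to_node_id) pairs
        self.results = {}  # node id -> result, present once executed
        for node in nodes or []:
            self.add_node(node)
        for link in links or []:
            self.add_link(*link)

    def add_node(self, node: dict) -> None:
        if node["id"] in self.nodes:
            raise ValueError(f"duplicate node id {node['id']}")
        self.nodes[node["id"]] = node

    def add_link(self, from_id: str, to_id: str) -> None:
        self.links.append((from_id, to_id))

    def remove_node(self, node_id: str) -> None:
        # Results may already be used elsewhere, so executed nodes stay.
        if node_id in self.results:
            raise ValueError("executed nodes cannot be removed")
        del self.nodes[node_id]
        self.links = [link for link in self.links if node_id not in link]

    def invoke_next(self, run) -> bool:
        """Run one unexecuted node whose upstream nodes are all done."""
        for node_id, node in self.nodes.items():
            if node_id in self.results:
                continue
            upstream = [src for src, dst in self.links if dst == node_id]
            if all(src in self.results for src in upstream):
                inputs = {src: self.results[src] for src in upstream}
                self.results[node_id] = run(node, inputs)
                return True
        return False

    def invoke_all(self, run) -> None:
        while self.invoke_next(run):
            pass
```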

red yew Sep 29, 2022, 10:01 AM

#

it seems like this is developing into a virtual machine to some extent

#

is the goal that the node-based backend is the only http service for invokeai?

tame remnant Sep 29, 2022, 11:07 AM

#

Still works mostly, but I'm not sure I understand the new way outputs work. Say I want an output with an ID or name of "my_string". It looks like I now need to create an output class extending BaseInvocationOutput, which has an explicit type, and give it the variable 'my_string'. Previously, I could just add an output in the invocation definition. Can I still do that somehow?

#

e.g. before, this worked:
```python
# (nested inside the SimplePromptInvocation class)
class Outputs(BaseInvocationOutput):
    my_string: str

def invoke(self, services: InvocationServices, context_id: str) -> Outputs:
    return SimplePromptInvocation.Outputs.construct(
        my_string=self.my_string
    )
```

#

now, I need to make a MyStringInvocationOutput.py to accomplish the same thing (as far as I can tell) which is far less flexible

elfin mural Sep 29, 2022, 3:33 PM

#

red yew it seems like this is developing into a virtual machine to some extent

"state machine", I think

#

In any case, a Directed Graph (whether or not, I don't know)

misty cedar Sep 29, 2022, 3:53 PM

#

tame remnant now, I need to make a MyStringInvocationOutput.py to accomplish the same thing (...

I wasn't able to deserialize the outputs without a discriminator field (the type). I couldn't generate those automatically from inner classes without some brittle, complicated code. I realized that we only really had image outputs (and soon prompt outputs), and there wasn't much value in a single output type per invocation class.

You can still make new output classes wherever makes sense if they're needed. They should be able to derive from another output type and override the type field.

Fancy automatic stuff just didn't work out. 😣
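
To show why the `type` discriminator matters for deserialization, here is a standard-library sketch (dataclasses rather than pydantic, so only the idea carries over). `OUTPUT_TYPES`, `register`, `UpscaledImageOutput`, and the field names are invented for the example:

```python
from dataclasses import asdict, dataclass

OUTPUT_TYPES = {}  # maps the "type" discriminator value to its class


def register(cls):
    """Class decorator: remember which class a discriminator value names."""
    OUTPUT_TYPES[cls.type] = cls
    return cls


@register
@dataclass
class ImageOutput:
    uri: str
    type: str = "image"


@register
@dataclass
class UpscaledImageOutput(ImageOutput):
    # Derives from another output type and overrides the type field.
    scale: int = 2
    type: str = "upscaled_image"


def deserialize(data: dict):
    """Pick the output class from the discriminator, then build it."""
    cls = OUTPUT_TYPES[data["type"]]
    return cls(**data)
```

Without the `type` value in the serialized data there would be no reliable way to tell which of the two classes a saved result belongs to.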

#

And yah it's kind of an interactive directed graph (I use graphlib to help determine graph validity and execution order).
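
A small sketch of what that graphlib usage can look like (Python 3.9+). The `execution_order` helper and the link tuples are assumptions about the shape of the data, not the branch's code:

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(node_ids, links):
    """Return node ids in a valid execution order, or raise ValueError.

    `links` is an iterable of (from_node_id, to_node_id) pairs.
    """
    sorter = TopologicalSorter({node_id: set() for node_id in node_ids})
    for from_id, to_id in links:
        sorter.add(to_id, from_id)  # to_id depends on from_id
    try:
        return list(sorter.static_order())
    except CycleError as error:
        # error.args[1] holds the list of nodes that form the cycle.
        raise ValueError(f"graph has a cycle: {error.args[1]}") from error
```

A graph that sorts without a `CycleError` is acyclic, which covers the validity check, and the resulting order is one in which every node runs after the nodes it links from.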

red yew Sep 29, 2022, 4:07 PM

#

will the node-based backend supersede backend/server.py?

misty cedar Sep 29, 2022, 4:17 PM

#

I believe that's the plan - maintaining two backends would be a pain (it's already difficult enough to maintain an API and CLI). The frontend can be adapted to it though so it operates similar to the current frontend.

tame remnant Sep 29, 2022, 8:16 PM

#

misty cedar I wasn't able to deserialize the outputs without a discriminator field (the type...

we'll have string, integer, float, boolean and image outputs

#

and many nodes will have multiple outputs

misty cedar Sep 29, 2022, 8:21 PM

#

That's fine. Outputs will just need a type now

#

Might be useful to make some standard base types and utilize multiple inheritance to re-use some standard fields like image, prompt, etc.

#

I was also considering doing a history lookup to try to fill all parameters, but that can probably happen in the application layer too

tame remnant Sep 29, 2022, 8:25 PM

#

How do I make a node with multiple outputs of the same type?

misty cedar Sep 29, 2022, 8:25 PM

#

Just make a new output class with multiple fields

#

(or derive from a base one like ImageOutput and add some extra fields)

tame remnant Sep 29, 2022, 8:26 PM

#

alright so I will make base output types for each type now and make an invocation using em all to show you and make sure im doing it right

misty cedar Sep 29, 2022, 8:39 PM

#

on a positive note, I think things are starting to stabilize

#

not too happy with how the images output/save, but I think it'll be okay

tame remnant Sep 29, 2022, 8:41 PM

#

Ok, so this is my invocation: https://github.com/psychedelicious/stable-diffusion/blob/react-flow/ldm/dream/app/invocations/test_invocation.py and here is my output https://github.com/psychedelicious/stable-diffusion/blob/react-flow/ldm/dream/app/invocations/test_invocation_output.py

#

I get it now, last night when I looked at it I was confused

misty cedar Sep 29, 2022, 8:48 PM

#

Specifying a default isn't necessary on outputs.
I think the ui hints for entire invocations should be in the schema_extra part instead of as a field (fields should just be used for inputs): https://pydantic-docs.helpmanual.io/usage/schema/#schema-customization (at the bottom of the page)
You're also free to put these both in the same file. My current understanding of Python is that you should group like functionality in a single file (as a module)

tame remnant Sep 29, 2022, 8:49 PM

#

parses fine still just need to remove the type field from outputs

#

Thanks, I hadnt gotten around to reviewing the pydantic schema customization section yet

misty cedar Sep 29, 2022, 8:50 PM

#

ah right

#

maybe I can customize the schema generation to not include type in the output

#

(though maybe it's helpful to you for ui?)

#

it's super cool that these just automatically work in the UI ♥

tame remnant Sep 29, 2022, 9:01 PM

#

i can just filter out 'type', it's a reserved property anyways right?

#

yeah haha i love it

#

I already filter out 'id' and 'type' from the invocation itself

misty cedar Sep 29, 2022, 9:02 PM

#

really interested if a "single node at a time" interactive mode (like the CLI) could work out in the UI

tame remnant Sep 29, 2022, 9:09 PM

#

like the processing pauses while you choose what to do next?

misty cedar Sep 29, 2022, 9:10 PM

#

Kind of. My thought was it could be like the current UI - except all the stuff on the left (parameter entry) would be parameters for the current node

tame remnant Sep 29, 2022, 9:10 PM

#

(i havent used the cli yet)

misty cedar Sep 29, 2022, 9:10 PM

#

and maybe you could click a bubble next to the input to automatically fill in from previous nodes

#

you should check it out 🙂

tame remnant Sep 29, 2022, 9:10 PM

#

i should

misty cedar Sep 29, 2022, 9:10 PM

#

there's also an API now for adding a single node with links

#

with some cool linking options like this: from_node_id: X, from_field: *, to_field: *

#

the * makes it match up all fields it can (by type and name)
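
Roughly what that wildcard expansion could look like, as a sketch: given the output fields of the source node and the input fields of the target node (name mapped to a type name), keep only the pairs that agree on both. `expand_wildcard_link` and the field dictionaries are hypothetical:

```python
def expand_wildcard_link(from_node_id, from_fields, to_fields):
    """Expand from_field "*" / to_field "*" into concrete links.

    `from_fields` and `to_fields` map field name -> type name. A field is
    linked only when the name exists on both sides with the same type.
    """
    return [
        {"from_node_id": from_node_id, "from_field": name, "to_field": name}
        for name, type_name in from_fields.items()
        if to_fields.get(name) == type_name
    ]
```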

tame remnant Sep 29, 2022, 9:12 PM

#

Hmm. My thinking for recreating the current UI using nodes is we just enforce static arrays of nodes and links, and if you want more than that, you go to the node UI. You are talking about something between the two then, haven't considered it

misty cedar Sep 29, 2022, 9:12 PM

#

Yah I built it for the CLI and then figured the UI could also use that for a simplified mode

#

And you could jump back and forth between the full graph

#

"interact from here" on the graph could bring you to the single node UI

tame remnant Sep 29, 2022, 9:18 PM

#

sounds awesome

#

and maybe tricky hehe

misty cedar Sep 29, 2022, 9:25 PM

#

maybe shrugs

#

would be a good way to expose new functionality for free in the simpler UI though

tame remnant Sep 29, 2022, 9:28 PM

#

so like you do your processing in simple UI and click "I wanna do more" and it plops you into the node editor with the node view of what you have been using just a moment ago, am i understanding

misty cedar Sep 29, 2022, 9:30 PM

#

Yah pretty much. The CLI is just building a node graph on a context behind the scenes and executing it every command

#

just appends to the current last node in history

tame remnant Sep 29, 2022, 9:31 PM

#

itll take me some time to wrap my head around it but it sounds like we have a super badass tool in the works

misty cedar Sep 29, 2022, 9:31 PM

#

no reason you couldn't go back in history and pick a previous node to branch off of

tame remnant Sep 29, 2022, 9:34 PM

#

so interestingly, the frontend state management library just diffs prev state from new state and updates like that

#

i think you can save the diffs

#

and get free undo/redo

#

that kinda automatically takes care of going back thru history without needing the server to do anything,

#

maybe not tho

misty cedar Sep 29, 2022, 9:34 PM

#

Well... undo isn't really a thing once you've executed a node

#

super ugly, but might help get the point across

#

not sure if the layout/flow would feel nice that way, but basically, do one operation (left panel), see result (middle), select next operation. Then the next operation controls replace the left panel and you continue iterating that way

tame remnant Sep 29, 2022, 9:36 PM

#

oh haha yeah that gets the point across, i wasn't even able to imagine what you were suggesting before that image lol

misty cedar Sep 29, 2022, 9:36 PM

#

and I guess if you stick on the same one maybe it just keeps chaining off the previous node

rocky pollen Sep 29, 2022, 9:37 PM

#

So if we ask ourselves how people might use this

#

Are they working on one concept at a time - e.g. exploring prompts and then stemming off of that for img2img loops etc

#

I.e., is this “project based”

misty cedar Sep 29, 2022, 9:40 PM

#

so far I've had two different usages:

Iterating on a concept to figure out how to reliably create things in a style (this may just be generation parameters, or it might be a chain of operations)
Creating things I've already figured out how to create (e.g. I know how to make portraits in GTA style, so I'm just replacing the "subject" at the beginning, then running the generate + upscale + etc.)

#

and for either of those, it usually involves some seed exploration/repetition

tame remnant Sep 29, 2022, 10:07 PM

#

The use case for me that is impossible in other UI types is “make a spaghetti junction of connections and tell it to do the thing 100 times and see what crazy stuff comes out”

#

Or the more traditional artistic workflow of slowly iterating on a single work

rocky pollen Sep 29, 2022, 10:09 PM

#

I think thats a novelty if it’s not something that integrates into workflows

tame remnant Sep 29, 2022, 10:13 PM

#

arent we in the business of creating totally new stuff?

#

im taking inspiration from Modular Synthesizers: https://www.youtube.com/watch?v=6JeZR13dLLI

#

this is a graph where the objects being passed around are voltages

rocky pollen Sep 29, 2022, 10:15 PM

#

I mean yes, but new things that help people solve problems

tame remnant Sep 29, 2022, 10:15 PM

#

anything can feed to anything else, and it just keeps going as long as you let it

rocky pollen Sep 29, 2022, 10:15 PM

#

Otherwise I’ve found we won’t have many users of said “things” 🙂

tame remnant Sep 29, 2022, 10:16 PM

#

i'll argue that the generation of novelty is one of the key components of an expressive and useful universe

#

but that's maybe not so relevant 😛

rocky pollen Sep 29, 2022, 10:18 PM

#

I imagine that the ideal solution would allow for immense novelty AND be useful to pros looking for a better workflow

tame remnant Sep 29, 2022, 10:18 PM

#

exactly

#

absolutely agree, and we can do both here for sure

rocky pollen Sep 29, 2022, 10:18 PM

#

Right - the latter requires more thought on the UX and problems faced by “workflow”

#

The former just needs more nodes and unlimited flexibility

tame remnant Sep 29, 2022, 10:20 PM

#

yeah, glad to have people like you helping out on the UX side else what i would make would end up looking like the video

misty cedar Sep 29, 2022, 10:23 PM

#

I think the power in our solution is that the power users can generate novel things (either through the node graph, through new nodes in code, or a combination) and then share those with more common users

#

e.g. some of the huge upscale solutions right now are just "split up the picture, upscale each part, tape it back together" - that could be done with the node graph
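
The "split up the picture" step boils down to computing tile rectangles, which a graph could then feed to an upscale node one tile at a time. A sketch with hypothetical helpers, using (left, upper, right, lower) boxes; a real solution would likely also overlap the tiles to hide seams:

```python
def tile_boxes(width: int, height: int, tile: int) -> list:
    """Cover a width x height image with tiles of at most tile x tile."""
    boxes = []
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            right = min(left + tile, width)
            bottom = min(top + tile, height)
            boxes.append((left, top, right, bottom))
    return boxes


def scale_box(box: tuple, factor: int) -> tuple:
    """Where an upscaled tile gets pasted back in the upscaled image."""
    return tuple(value * factor for value in box)
```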

rocky pollen Sep 29, 2022, 10:25 PM

#

💯

misty cedar Sep 29, 2022, 10:25 PM

#

maybe let people PR "recipes" or something

rocky pollen Sep 29, 2022, 10:25 PM

#

That or a community site for sharing them

misty cedar Sep 29, 2022, 10:25 PM

#

where they've selected inputs to a large graph to expose in simpler UI, and outputs that matter

rocky pollen Sep 29, 2022, 10:26 PM

#

Is the “novelty” seeker an explorer of sorts?

#

And they’re feeding that back to “settlers” that figure out how to use those new things in their workflows

misty cedar Sep 29, 2022, 10:28 PM

#

something like lexica for this would be cool

rocky pollen Sep 29, 2022, 10:28 PM

#

They just got 5m in funding lol

misty cedar Sep 29, 2022, 10:29 PM

#

lol

rocky pollen Sep 29, 2022, 10:32 PM

#

I don’t know exactly how they monetize but they’ve got a lot of data so probably something like prompt data mining

#

Maybe that’s just me being pessimistic 🙂

misty cedar Sep 29, 2022, 10:41 PM

#

Yah any large amount of data is valuable. And if it gets enough traffic even just advertising opportunity is valuable

tame remnant Sep 30, 2022, 4:06 AM

#

@misty cedar I'm trying to get the app to do hot reloading, in api_app.py :

    config = uvicorn.Config(
        "ldm.dream.app.api_app:app",
        host = "0.0.0.0",
        port = 9090,
        loop = loop,
        reload=True,
        reload_dirs=['ldm/dream/app/invocations'])

#

the terminal output says it is watching and will reload on changes:

INFO:     Will watch for changes in these directories: ['/Users/spencer/Documents/Code/stable-diffusion/ldm/dream/app/invocations']

but when I make changes to say an invocation file, it does not reload, not even if I am patient

misty cedar Sep 30, 2022, 4:14 AM

#

You'd have to reload almost everything to pick up the new ones

#

Anywhere that does a from invocations import *

#

(or any path to the invocations)

#

And probably split code out of API app

tame remnant Sep 30, 2022, 4:16 AM

#

I guess that's what I want to do - as I'm fiddling around with the invocations and the UI, have the server reload every time I make a change

misty cedar Sep 30, 2022, 4:16 AM

#

And set OpenAPI not to cache the result

tame remnant Sep 30, 2022, 4:16 AM

#

It also works without the reload_dirs but I was trying to be more specific in case the directory setting wasn't recursive.

#

So uvicorn isn't the thing I want to hot reload then I guess

misty cedar Sep 30, 2022, 4:18 AM

#

I have no idea. Due to the way the invocations are discovered I'm not sure what stuff would need reloading and what weirdness you'd run into

tame remnant Sep 30, 2022, 4:18 AM

#

i can just use a shell script

#

kill and restart

#

ty

#

unfortunately due to how the threading is set up my approach isn't working, I think the signal indicating the process has died is never emitted. something like that.

misty cedar Sep 30, 2022, 4:43 AM

#

Oh yah I haven't fixed the shutdown yet 😣

#

Sorry, been running through a debugger most of the time

misty cedar Sep 30, 2022, 5:30 AM

#

Okay it should gracefully shutdown now

#

complains about something on API shutdown, but it at least shuts down

misty cedar Sep 30, 2022, 7:07 AM

#

woohoo, I got socket.io prototyped on FastAPI. No time to do the full implementation tonight though x.x.

tame remnant Sep 30, 2022, 7:16 AM

#

misty cedar woohoo, I got socket.io prototyped on FastAPI. No time to do the full implementa...

You are moving plenty fast mate

#

That's awesome though! and thanks for cracking at the shutdown thing

#

what debugger are you using?

#

little shell script to restart it on change works now 🙂

#

cool util I found to handle it all - entr. Pipe it a list of files to watch and it does the rest: ls ldm/dream/app/invocations/*.py | entr -r python scripts/invoke.py --api
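For anyone without entr installed, roughly the same kill-and-restart loop can be sketched in plain Python by polling file modification times. This is only an approximation of what entr does; the function names are invented for the example, and the command and directory are the ones from the one-liner above:

```python
import subprocess
import time
from pathlib import Path
from typing import Dict, List


def snapshot(directory: str, pattern: str = "*.py") -> Dict[str, float]:
    """Map each matching file in the directory to its last-modified time."""
    return {str(p): p.stat().st_mtime for p in Path(directory).glob(pattern)}


def changed_files(before: Dict[str, float], after: Dict[str, float]) -> List[str]:
    """Files that were added, removed, or modified between two snapshots."""
    return sorted(p for p in before.keys() | after.keys()
                  if before.get(p) != after.get(p))


def watch_and_restart(command: List[str], directory: str, interval: float = 1.0) -> None:
    """Run the command, restarting it whenever a watched file changes."""
    seen = snapshot(directory)
    proc = subprocess.Popen(command)
    try:
        while True:
            time.sleep(interval)
            current = snapshot(directory)
            if changed_files(seen, current):
                seen = current
                proc.terminate()  # relies on the server shutting down cleanly
                proc.wait()
                proc = subprocess.Popen(command)
    finally:
        proc.terminate()
```

Calling `watch_and_restart(["python", "scripts/invoke.py", "--api"], "ldm/dream/app/invocations")` would then block and restart the API on every change.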

misty cedar Sep 30, 2022, 7:22 AM

#

I'm using vscode. Not sure what it uses for a debugger, but it works really well.

tame remnant Sep 30, 2022, 7:23 AM

#

nice

#

I switched back to sublime after years of vscode, it's a bit more effort to get things set up with it :/

#

The people who made sublimetext have a really nice graphical git client called SublimeMerge, it makes most of the git stuff understandable for me.

misty cedar Sep 30, 2022, 7:26 AM

#

Ah nice. I added a few extensions to vscode for git

#

But I'm also used to the git cli...

tame remnant Sep 30, 2022, 7:29 AM

#

I'll get there eventually, the GUI on this makes a lot of the operations clearer. Resolving conflicts is really smooth too.

rocky pollen Sep 30, 2022, 9:23 AM

#

Ok I thought i was dense because vscode has been giving me all kinds of hell with git lol

tame remnant Sep 30, 2022, 1:13 PM

#

@misty cedar may i request friendly operationIds on the API, I think they are being auto-generated now: e.g. "invoke_context_api_v1_contexts__context_id__invoke_put"

#

The openapi-generator project you linked is great btw, thanks. I went ahead and wrote my own methods as an exercise to help me appreciate what is needed to do it right, then saved myself the pain and let the generator make everything 😅

#

yeah, they are being auto-generated, i see where they can be specified (in the router decorators i think is the right term)

#

Tried out the API and it works a treat! So if I want to just immediately invoke, should I create a context, wait for 200, and then invoke it?

misty cedar Sep 30, 2022, 2:50 PM

#

Yah I haven't put a ton of work into the API docs, was just fleshing out functionality. It seemed to fill out the title for operations nicely though.

Yah I wasn't sure if I should add a query parameter to "invoke now" when creating from a graph. The pattern would be:

create context
subscribe with socket.io (not in yet)
invoke context
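A tiny in-memory sketch of why the order matters: if you subscribe only after invoking, you can miss events. `FakeInvoker` and its methods are stand-ins invented for this example, not the real API, and the `"event"` key is an assumption on top of the `context_id`/`invocation_id` signal shape from the to-do list:

```python
import uuid
from collections import defaultdict
from typing import Callable, Dict, List


class FakeInvoker:
    """In-memory stand-in for the API; every name here is hypothetical."""

    def __init__(self) -> None:
        self.contexts: Dict[str, dict] = {}
        self.listeners: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def create_context(self, graph: dict) -> str:
        context_id = uuid.uuid4().hex
        self.contexts[context_id] = graph
        return context_id

    def subscribe(self, context_id: str, callback: Callable[[dict], None]) -> None:
        self.listeners[context_id].append(callback)

    def invoke(self, context_id: str) -> None:
        # Pretend each node runs in order and emits one event when it finishes.
        for node in self.contexts[context_id]["nodes"]:
            event = {
                "context_id": context_id,
                "invocation_id": node["id"],
                "event": "invocation_complete",
            }
            for callback in self.listeners[context_id]:
                callback(event)


api = FakeInvoker()
received: List[dict] = []

# 1. create context, 2. subscribe, 3. invoke
cid = api.create_context({"nodes": [{"id": "1", "type": "txt2img"},
                                    {"id": "2", "type": "upscale"}]})
api.subscribe(cid, received.append)
api.invoke(cid)
```

With the real backend, step 2 would be a socket.io subscription and steps 1 and 3 would be HTTP calls.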

tame remnant Sep 30, 2022, 8:05 PM

#

misty cedar Yah I haven't put a ton of work into the API docs, was just fleshing out functio...

I mean that the generated schema autogenerates the "operationIds" and produces really long names like "invoke_context_api_v1_contexts__context_id__invoke_put". You can pass an operation_id argument to the route decorator, e.g. in context.py:

@context_router.post('/',
    operation_id = 'createContext', # <---
    responses = {
        400: {'description': 'Invalid json'}
    })
async def create_context(

and then the schema uses that as the operationId. Requesting this bc the openapi-generator generates the API code and types based on the operationId.
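FastAPI's route decorators take this as the `operation_id` keyword. If typing each name by hand gets tedious, one option is to derive a camelCase id from the handler's function name. The helper below is just an illustration of that naming rule (its name is made up), not something that exists in the branch:

```python
def to_operation_id(function_name: str) -> str:
    """Turn a snake_case handler name into a camelCase operationId."""
    head, *rest = function_name.strip("_").split("_")
    # Upper-case only the first letter so acronyms inside a part survive.
    return head + "".join(part[:1].upper() + part[1:] for part in rest if part)


examples = {name: to_operation_id(name)
            for name in ("create_context", "invoke_context", "get_context")}
```

The result could then be passed as `operation_id=to_operation_id("create_context")` on the decorator, which gives the generator short method names like `createContext`.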

#

see https://fastapi.tiangolo.com/advanced/path-operation-advanced-configuration/#openapi-operationid

Path Operation Advanced Configuration - FastAPI

FastAPI framework, high performance, easy to learn, fast to code, ready for production

#

also curious of your opinion of socketio for handling the communication - it seemed really easy and effective to me but this is my first rodeo so maybe just HTTP is better? dunno

misty cedar Sep 30, 2022, 9:15 PM

#

Naw sockets for notifications are good. Polling is generally bad.

#

socket.io seems to be less maintained, but maybe it's just really stable?

#

I looked at websockets last night, but we'd have to build channels and stuff, which is a pain

#

I don't know if socketio has a backend if you were to scale-out though =/

#

I've used signalr, but that needs .net for hosting

tame remnant Sep 30, 2022, 9:36 PM

#

I got the impression that socketio was widely used in massive applications

#

There are a lot of different backend implementations/bindings for it, flask-socketio is the simplest one for flask I could find. there is the more agnostic python-socketio as well tho. both can use message queues and that stuff (not that I understand what I'm talking about 😛 )

#

Glad to hear i made a reasonable choice w/ the server i wrote, was kinda concerned I just grabbed something that looked nice but had issues

misty cedar Oct 1, 2022, 6:20 AM

#

Alright, socket.io is in. Events are all defined in /ldm/dream/app/services/events.py. If you want to look at usage (and easier to understand events), run the API, visit /static/test.html, and press the test button 🙂

#

I also added a temporary endpoint for getting images. It uses a query parameter though, which I am not happy with. I'll need to do some work to fix that though.

#

Getting real close though. API needs some cleanup and I need to figure out iteration/join

#

and lots of code cleanup to remove the flask server I had built x.x

tame remnant Oct 1, 2022, 8:37 PM

#

"Context" sounds really dry next to "Invocation". Isn't "Ritual" a cool word? A ritual is a series of invocations, executed with certain parameters, in a certain order, with a certain intent, but with some uncertainty in the result.

misty cedar Oct 1, 2022, 8:54 PM

#

I keep leaning toward "Session". I originally was using context since I was going to pass it down to the invocations when they ran. That created a lot of issues though (since it also owned them, and Python really hates circular references), so I rearchitected things, but never changed the name.

#

Invocation made sense, since in addition to being a cool name, it describes what the object does.

#

(and on-brand)

tame remnant Oct 1, 2022, 9:00 PM

#

Agree Session makes sense and fits more than Context, which sounds kinda technical

misty cedar Oct 1, 2022, 9:01 PM

#

You can also continue editing the graph after running it, which is in-line with a session

#

#1025874933445824512 message

#

made a new thread to talk about iteration

#

I want to make sure it's actually useful before I spend a lot of time on it

#

especially since it doesn't mesh well with how things are currently set up =/

#

(and also because iteration and metadata are really the only big areas missing... and if neither of them is super useful, then I can clean up and we can start integrating!)

misty cedar Oct 1, 2022, 9:26 PM

#

okay context is renamed to session everywhere

misty cedar Oct 1, 2022, 9:44 PM

#

I've also removed all of the flask and dependency-injection backend stuff

misty cedar Oct 1, 2022, 10:09 PM

#

okay and image urls are much nicer now

tame remnant Oct 1, 2022, 10:39 PM

#

really shaping up, awesome work

#

regarding invocation versions - i am thinking that the invocations themselves may change over time. say I contribute an invocation for cool thing X, and then later I add a feature to it or whatever. when I load my session, i need to load the right invocation

#

i dunno if there is a way to make the invocations a module based on a git repo, something like that... but then we are getting into diy package manager territory

misty cedar Oct 1, 2022, 10:44 PM

#

As long as they're loaded before the API indexes everything, you can add more invocations from anywhere. It just looks for subclasses of the base invocation
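A minimal sketch of that kind of discovery, using Python's built-in `__subclasses__()`. The class names here are placeholders rather than the real ones in the branch:

```python
from typing import List


class BaseInvocation:
    """Placeholder base class; the real one lives in the backend."""

    def invoke(self):
        raise NotImplementedError


def all_subclasses(cls: type) -> List[type]:
    """Collect direct and indirect subclasses that have been imported so far."""
    found: List[type] = []
    for sub in cls.__subclasses__():
        found.append(sub)
        found.extend(all_subclasses(sub))
    return found


class TextToImage(BaseInvocation):
    pass


class Upscale(BaseInvocation):
    pass


class FancyUpscale(Upscale):
    pass


# Anything defined (imported) before this line is picked up.
index = {cls.__name__ for cls in all_subclasses(BaseInvocation)}
```

A class imported after the index is built would not appear in it, which is why invocations need to be loaded first, and also part of why hot reloading is awkward.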

tame remnant Oct 1, 2022, 11:05 PM

#

the issue is when you have a session that used invocation X version Y, but you have since updated and now invocation X has version Z with breaking changes

#

do we need to have subclasses and a version matcher?

#

if major version is the same, there are no breaking changes and it will still function as it did previously, but if major version is different, you need to load the same invocation version somehow

misty cedar Oct 1, 2022, 11:23 PM

#

Is it worth that effort? Same thing with models to an extent - there might be a limit to how much we can/should track.

tame remnant Oct 1, 2022, 11:29 PM

#

might not be worth it. can we easily get a hash-like representation of the invocation file? then we can at least say "This session appears to use a different version of the Cool Thing Invocation and may not function as expected."
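One simple way to get such a fingerprint is to hash the file's bytes. This is only a sketch; the function name and the 12-character truncation are arbitrary choices for the example:

```python
import hashlib
from pathlib import Path


def file_fingerprint(path: str, length: int = 12) -> str:
    """Short, stable fingerprint of a source file's exact contents."""
    digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
    return digest[:length]
```

The fingerprint could be stored with the session when it is saved and compared on load. It only covers the one file, so any edit to it, even whitespace, counts as a different version.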

#

I think a version number in schema_extra is easy enough to do. Checking can happen in the UI - "This Session uses Invocation version X, but version Y is installed. It may not work as expected." I can't imagine the CLI really using sessions much, so it wouldn't affect that client much - or am I mistaken?
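The check itself could look something like this, assuming versions are dotted strings such as "1.2.0" and that only a major-version change counts as breaking. The function names are invented for the example:

```python
from typing import Optional


def parse_major(version: str) -> int:
    """Major component of a dotted version string such as '1.2.0'."""
    return int(version.split(".")[0])


def version_warning(name: str, saved: str, installed: str) -> Optional[str]:
    """Return a warning when the major versions differ, otherwise None."""
    if parse_major(saved) == parse_major(installed):
        return None
    return (f"This Session uses {name} version {saved}, "
            f"but version {installed} is installed. "
            "It may not work as expected.")


message = version_warning("Upscale", "1.2.0", "2.0.0")
```

The UI could run this for every invocation in a loaded session and show whatever messages come back.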

misty cedar Oct 1, 2022, 11:40 PM

#

It wouldn't really reuse them

#

Could add a version, but things below the invocation could also change behavior, and that's tough to track

#

Git hash could work, but local changes would break that

#

And if you're sharing what you've made, then it probably shouldn't be expected to work for someone else unless you're using the same version of the master branch

tame remnant Oct 1, 2022, 11:57 PM

#

misty cedar And if you're sharing what you've made, then it probably shouldn't be expected t...

suppose that's fair

#

I'm just trying to poke holes to ensure we plan for edge cases and future development

misty cedar Oct 2, 2022, 12:05 AM

#

Yah. I mean, being able to share the basic generator related things like prompt, cfg, steps is super useful even between tools/branches. I collaborated with someone who was using the automatic UI successfully that way

tame remnant Oct 2, 2022, 12:12 AM

#

absolutely

#

once we have the mvp working, we will gain a better perspective of which other features are needed

misty cedar Oct 2, 2022, 12:15 AM

#

I think I'll rebase on development sometime soon. I may add a context class for invocations to utilize, but otherwise I think it's all about ready to go.

rocky pollen Oct 2, 2022, 3:58 AM

#

tame remnant "Context" sounds really dry next to "Invocation". Isn't "Ritual" a cool word? A ...

I actually love it but that’s just my vote for being “on brand” 😜

elfin mural Oct 2, 2022, 4:12 AM

#

@misty cedar; I ran python scripts/invoke.py --api and loaded 'http://localhost:9090/static/test.html', hit the button, and it works. Cool. What other tricks can it do? 🤣

misty cedar Oct 2, 2022, 4:13 AM

#

Haha. Try running without the API flag and be amazed by the new (automatically generated) CLI

#

The API is documented at /docs (or /redoc) too (except for the signals... Not a great way to document those)

#

I haven't written many nodes for it, but it's super easy to add functionality to. Just have to write the one file for the invoker and it automatically works in the CLI and Web API

misty cedar Oct 3, 2022, 4:58 AM

#

Alright, I've "rebased" on current development (I branched from development into a new branch then merged into that - way easier given how far behind I was).

invoke-development

Please try that branch out. I'm still stuck on 3.8.5 until I do an environment rebuild, and that will probably break me for a day or so I assume. Sounds like some people were able to run it fine though, so hopefully there aren't any real changes needed.

If it works fine, I think it's ready for contribution 🙂

tame remnant Oct 3, 2022, 6:58 AM

#

woohoo!

#

some issues:

on the CLI, pressing arrow keys etc inserts control characters, I guess we want readline?
txt2img --prompt "a cute dog" | show_image | upscale generates and shows the pupper, but upscale fails:

  File "/Users/spencer/Documents/Code/stable-diffusion/ldm/dream/app/invocations/upscale.py", line 23, in invoke
    image_list     = [[self.image.get(), 0]],
AttributeError: 'ImageField' object has no attribute 'get'

Ran txt2img --prompt "a cute dog", that worked, then ran txt2img --prompt "a cute dog" | show_image | upscale and got:

File "/Users/spencer/Documents/Code/stable-diffusion/ldm/dream/app/services/invocation_session.py", line 126, in add_invocation
    from_node = self.invocations[node_id]
KeyError: '-1'

tame remnant Oct 3, 2022, 1:36 PM

#

Ok, have the UI creating sessions and invoking!

#

I expect a common use pattern is to load & connect nodes in the UI, invoke, then remove some nodes, invoke, add some nodes, invoke, change links, invoke, and so on. As far as I understand, this requires a new session each time I remove a node or change links. I suppose I can just keep appending, but then the session state kinda loses its sync with the UI state - the session will have a lot of extra nodes and links. Does it make sense to have an API method to replace a session's nodes and links entirely?

#

General use pattern:

upon loading the UI, either resume/load an existing session from a list/session library or create a new one (and here I'm really tempted again to call Sessions "Rituals", the user library is the "Grimoire" and community preset library the "Arcaneum"...)
While user is adding and connecting nodes, not much (nothing?) is sent to the server. When they click Invoke, the graph is sent and immediately processed. So again here the action that would make sense is "set-invocations-and-invoke-all", overwriting the session's nodes and links with the payload.
While that is processing, user decides to pause, make changes, and resume. For this, I think the action would be "set-invocations-and-invoke-from-specific-node-or-link", something like that. I'm not sure how pausing at a certain point, modifying future nodes/links, and then resuming from there works on the back end....
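To make the "overwrite the session's nodes and links with the payload" idea concrete, here is a hypothetical sketch. The `Session` class, the link tuple layout, and `replace_graph` are all assumptions for illustration and do not reflect the actual backend:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Assumed link layout: (from_node, from_field, to_node, to_field)
Link = Tuple[str, str, str, str]


@dataclass
class Session:
    nodes: Dict[str, dict] = field(default_factory=dict)
    links: List[Link] = field(default_factory=list)

    def replace_graph(self, nodes: Dict[str, dict], links: List[Link]) -> None:
        """Swap in a whole new graph, rejecting links to unknown nodes."""
        for from_node, _, to_node, _ in links:
            for node_id in (from_node, to_node):
                if node_id not in nodes:
                    raise KeyError(f"link references unknown node {node_id!r}")
        # Only mutate once the payload has been validated.
        self.nodes = dict(nodes)
        self.links = list(links)


session = Session()
session.replace_graph(
    {"1": {"type": "txt2img"}, "2": {"type": "upscale"}},
    [("1", "image", "2", "image")],
)
```

Validating before assigning means a bad payload leaves the previous graph untouched, so the UI and session stay in sync either way.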

rocky pollen Oct 3, 2022, 2:20 PM

#

I think that's right - (Also, I'm also a fan of some opinionated language... 🙃)

#

Grimoire would be a very easy title for our creative guide...

#

From a UI perspective, one might build out and "Save" an invocation, or "Invoke" it. If a user were to Invoke, while that specific invocation has been executed, the UI should retain its node layout for modification/editing, and UI would have the outputs available for use in the next invocation

#

There will also probably need to be a "Clear" canvas button which removes all current nodes and resets to default state

terse hill Oct 3, 2022, 3:00 PM

#

I also can't yet understand how we can work with different instances of nodes. For example, three "generate" nodes in some places at the same time — possible? How about a few prompts? What if the number of nodes increases to 20-30? Should we limit the total number? Should we add a warning about long generation times, or tell approximate creation times?

All I can imagine is a linear path of nodes with image outputs on some of them, different versions of images.

rocky pollen Oct 3, 2022, 4:04 PM

#

My thought would be that N+1 generation nodes would output images in the sequence generated (mvp) and potentially handle displaying as a grid (future date)

Multiple prompts - Either
A) could be concatenated if they fed into the same input
B) if we want to keep a 1 output / 1 input node connector constraint, the prompt node could have an input that accepts text and appends the text input in that node to the text that is being input

misty cedar Oct 3, 2022, 4:21 PM

#

@tame remnant I fixed upscale/restore (I think - my environment is pretty broken at the moment). Forgot to convert it to use the image service 🙂

#

Seems from the discussion like it might be useful to have an "invocation" that's separate from the defined graph. I'm not yet sure how that meshes with continuing a previous run though (e.g. chaining from previous nodes so you don't have to run everything again to e.g. upscale).

#

I currently don't have any support for connections where the receiver is a List of N. I considered it, but there are lots of questions, like if order matters, what collections to support, etc.

The graph might currently let you connect multiple outputs to one input. If it does, that's a bug 🙂
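A check for that case could be as small as counting incoming links per input. The link tuple layout `(from_node, from_field, to_node, to_field)` is an assumption made for this sketch, not the real data model:

```python
from collections import Counter
from typing import List, Tuple

Link = Tuple[str, str, str, str]  # (from_node, from_field, to_node, to_field)


def overconnected_inputs(links: List[Link]) -> List[Tuple[str, str]]:
    """Inputs (node, field) that receive more than one incoming link."""
    counts = Counter((to_node, to_field) for _, _, to_node, to_field in links)
    return sorted(target for target, n in counts.items() if n > 1)


links = [
    ("1", "image", "3", "image"),
    ("2", "image", "3", "image"),  # second link into the same input
    ("1", "image", "4", "image"),  # one output feeding two inputs is fine
]
problems = overconnected_inputs(links)
```

Adding a link could be refused whenever this list would become non-empty.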

tame remnant Oct 3, 2022, 9:04 PM

#

misty cedar @tame remnant I fixed upscale/restore (I think - my environment is prett...

Thanks!
