# Scheduling, cancellation, and streaming AI responses

cobalt salmon

oh, I guess I had something bad in the message? heck

trying again:

I'm working on an app which works similarly to AI dungeon. I want it to support streaming in responses chunk-by-chunk, and I want to support cancellation.

Here's a simplified version of how I'm doing it now. This is a lot of code. And I'd have to repeat a lot of this for whatever other parts of the app I want to support generation and cancellation. And again, this is simplified. This is missing auth checks, status guards, regeneration, and other things. I want to try to simplify this.

Some alternatives I considered:

  1. Have the entrypoint be an action instead of a mutation, and await it on the frontend. This would remove the need for a status variable, but only while the client is awaiting that function. Since generations can take minutes, I want the pending state to survive browser refreshes and the like. Most of the boilerplate still remains with this approach anyway.
  2. Scheduled action cancellation. From my understanding of the docs, cancelling a scheduled function doesn't stop one that's already running; the stream would continue running and updating the DB. They read "cancelled actions won't run any other scheduled actions", but they don't mention that for mutations/queries run during actions. Plus, unless the action itself can get its own status (I couldn't find a way to do that), I still need some way for the action to know to abort the completion stream.
  3. Workflows. To allow cancellation, I would want several .step() calls per AI response chunk, but I read that workflows have to be deterministic, so to my understanding that wouldn't work. It also doesn't look like they support fetch() yet anyway (?).
  4. Generalize this into a generic "completion" concept, where other models (like the chapter in this example) would point to a completions table where their related content is stored and generated. All generatable things would go through this completions interface. I can't think of a way to make this abstraction without compromising on the flexibility I need. The type of response and the manner in which it's processed differ between parts of the app in a way that can't be parameterized (easily, if at all).
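
To make the cancellation part concrete, here's a minimal sketch of the loop I keep rewriting, pulled out into a helper. This is plain TypeScript, not a Convex API: `streamWithCancellation`, `StreamHooks`, `appendChunk`, and `isCancelled` are all names I made up for illustration. The assumption is that inside an action the two hooks would wrap something like `ctx.runMutation` and `ctx.runQuery`, and that each part of the app would pass its own hooks instead of sharing a table.

```typescript
// Hypothetical helper (not part of any library): drains a stream of text
// chunks, checking a cancellation flag before persisting each one.
type StreamHooks = {
  // Persist one chunk. Assumed to wrap a mutation call in the real app.
  appendChunk: (chunk: string) => Promise<void>;
  // Report whether the user asked to stop. Assumed to wrap a query call.
  isCancelled: () => Promise<boolean>;
};

type StreamResult = "completed" | "cancelled";

async function streamWithCancellation(
  chunks: AsyncIterable<string>,
  hooks: StreamHooks,
): Promise<StreamResult> {
  for await (const chunk of chunks) {
    // Poll the flag once per chunk; the chunk that arrives after a
    // cancellation request is dropped rather than written.
    if (await hooks.isCancelled()) {
      return "cancelled";
    }
    await hooks.appendChunk(chunk);
  }
  return "completed";
}
```

Returning early from `for await` also calls the iterator's `return()` if it has one, so an async-generator source gets a chance to run its cleanup. The cost is one extra read per chunk, and it still doesn't remove the status bookkeeping around it.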

So my question is whether I can simplify this, generalize this, or if there's something I missed / misunderstood in reading the docs. Thanks in advance!

Gist

mighty yew

Following