Streaming without realtime database updated. | Convex Community | Page 1

near dagger Apr 13, 2024, 6:03 PM

Is there a way to stream an OpenAI response directly to the frontend app without going through the realtime database? The reactive database approach consumes a lot of bandwidth, which isn't ideal for us.

solar oak Apr 13, 2024, 8:00 PM

sorry about kapa.ai not being helpful. there is currently no way to stream data out of an action directly. i've worked on this and it's not done yet. the only ways to get intermediate data out of an action are to store it with a mutation or send it to some other server with a fetch (which does support streaming requests and responses)

tacit needle Apr 13, 2024, 8:05 PM

Mitigations for a lot of data writes I can think of:

1. Stream it one line or sentence at a time rather than per token.
2. Stream it from the API using a Next.js API route, then write the result to the DB at the end (but accept that you might never write it if the request gets interrupted). You can use the streamed response as ephemeral data on the client until the message comes down after being written, like an optimistic update. Note: you can't see the stream on multiple clients or after a page refresh in this case.
3. Paginate the data from the frontend, so each query page isn't reloading as much data when the fast-changing data updates. You could even have a separate subscription / query on the "in-progress" message so it isn't in the default query fetching path.
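
As a rough illustration of the first mitigation, here is a minimal sketch that buffers streamed tokens and only releases text at sentence boundaries, so each DB write carries a whole sentence instead of a single token. `SentenceBuffer` and its methods are hypothetical names made up for this example, not part of Convex or the OpenAI SDK, and the boundary rule (punctuation followed by whitespace, or a newline) is an assumption that may need tuning for real model output.

```typescript
// Hypothetical helper: accumulates streamed tokens and releases whole sentences.
class SentenceBuffer {
  private pending = "";

  // Add one token; returns any complete sentences that are now ready to write.
  push(token: string): string[] {
    this.pending += token;
    const sentences: string[] = [];
    // Assumed boundary: ., ! or ? followed by whitespace, or a newline.
    const boundary = /[.!?]\s+|\n/;
    let match = boundary.exec(this.pending);
    while (match !== null) {
      const end = match.index + match[0].length;
      sentences.push(this.pending.slice(0, end));
      this.pending = this.pending.slice(end);
      match = boundary.exec(this.pending);
    }
    return sentences;
  }

  // Call once when the stream ends to release the trailing partial sentence.
  flush(): string {
    const rest = this.pending;
    this.pending = "";
    return rest;
  }
}

// Usage: each entry in `writes` stands in for one DB write.
const tokens = ["Hello", " world", ".", " How", " are", " you", "?", " Fine"];
const buffer = new SentenceBuffer();
const writes: string[] = [];
for (const token of tokens) {
  const ready = buffer.push(token);
  for (const sentence of ready) {
    writes.push(sentence);
  }
}
const tail = buffer.flush();
if (tail.length > 0) {
  writes.push(tail);
}
```

In this toy run, eight tokens collapse into three writes. How much bandwidth that saves in practice depends on sentence length and on how many subscribers re-read the document after each write.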

#3 is related to concepts from https://stack.convex.dev/queries-that-scale

Queries that scale

As your app grows from tens to hundreds to thousands of users, there are some techniques that will keep your database queries snappy and efficient. I’...

near dagger Apr 14, 2024, 12:34 AM

Thanks @tacit needle! Already tried #1, but it's still taking a lot of bandwidth. I tried #3 too; unfortunately it was affecting UX, which required more frontend work to fix. Before trying #2 I wanted to confirm with you: as @solar oak mentioned that they are already working on this, do we have an estimate of when it could be ready?

tacit needle Apr 15, 2024, 12:35 AM

I can't remember the timeline on streaming from HTTP actions. @spark dock might have a better idea. But to be clear, this would not be streaming from a regular action / mutation. You'd also want to store the result in the DB alongside it (maybe writing the incremental values, or just the whole value at the end), and use an index to filter that result out of the client's query until it's done.

spark dock Apr 15, 2024, 4:36 AM

No timeline on streaming http actions. I vaguely remember it being somewhat non-trivial when we discussed it so it'll take a moment before we figure this out. But as Ian said you can probably try this with a different http service. The important part is to save the result to the DB at some regular interval.
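
A minimal sketch of "stream every chunk, save at a regular interval", under stated assumptions: `PeriodicPersister`, `send`, and `persist` are hypothetical names for this example. `send` stands in for whatever writes to the client (for instance an HTTP response stream on a separate service) and `persist` stands in for a DB write such as a mutation. Timestamps are passed in explicitly so the logic is easy to follow and test.

```typescript
// Hypothetical sketch: forward every chunk to the client, but persist the
// accumulated text at most once per `intervalMs`, plus once at the end.
class PeriodicPersister {
  private text = "";
  private dirty = false;
  private lastPersistAt: number;
  private intervalMs: number;
  private send: (chunk: string) => void;
  private persist: (fullText: string) => void;

  constructor(
    intervalMs: number,
    send: (chunk: string) => void, // stand-in for writing to the client stream
    persist: (fullText: string) => void, // stand-in for a DB write
    startMs: number,
  ) {
    this.intervalMs = intervalMs;
    this.send = send;
    this.persist = persist;
    this.lastPersistAt = startMs;
  }

  // Called for every chunk from the model; `nowMs` is the current time.
  onChunk(chunk: string, nowMs: number): void {
    this.text += chunk;
    this.dirty = true;
    this.send(chunk);
    if (nowMs - this.lastPersistAt >= this.intervalMs) {
      this.save(nowMs);
    }
  }

  // Called when the stream ends, so the final text is always written.
  finish(nowMs: number): void {
    if (this.dirty) {
      this.save(nowMs);
    }
  }

  private save(nowMs: number): void {
    this.persist(this.text);
    this.lastPersistAt = nowMs;
    this.dirty = false;
  }
}

// Usage: five chunks arriving 100 ms apart, persisted at most every 250 ms.
const sent: string[] = [];
const persisted: string[] = [];
const persister = new PeriodicPersister(
  250,
  (chunk) => { sent.push(chunk); },
  (fullText) => { persisted.push(fullText); },
  0,
);
const chunks = ["a", "b", "c", "d", "e"];
for (let i = 0; i < chunks.length; i++) {
  persister.onChunk(chunks[i], (i + 1) * 100);
}
persister.finish(500);
```

Here the client sees all five chunks while the DB receives two writes (`"abc"` and then `"abcde"`). If the request is interrupted before `finish`, only the last interval's worth of text is missing from the DB.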

near dagger Apr 15, 2024, 1:43 PM

okay! thank you 🙂

valid mirage Apr 16, 2024, 4:30 AM

I've also been looking at this, as my bandwidth has been eaten up by this. Number 2 seems like what I'll go for.

tacit needle Apr 16, 2024, 10:53 PM

We have streaming from http actions working internally on a branch as of today. No guarantee on when it'll ship, but it's progress!

valid mirage Apr 17, 2024, 5:26 AM

Awesome to hear that it's in the pipeline. One of the reasons you guys are awesome is your awareness of generative AI, the vector DB, etc.

coarse wedge Apr 30, 2024, 11:22 PM

@tacit needle @spark dock will streaming be live soon? I am revamping the Convex code and thinking of going with the 2nd approach, but if HTTP action streaming is coming soon then I can wait. Thanks

spark dock May 1, 2024, 12:45 AM

In progress! I'll check with the team and get back to you.

coarse wedge May 1, 2024, 2:57 AM

Great THANKS

spark dock May 1, 2024, 3:05 AM

There are some edge cases we're working through. Hopefully in a few short weeks so we can make sure it's working well.

tacit needle May 14, 2024, 11:54 PM

@near dagger @valid mirage @coarse wedge we now have http response streaming - so you can stream directly to a client and only periodically write to the db (or not at all). Check it out: https://news.convex.dev/announcing-convex-1-12/

And sample code: https://github.com/sshader/streaming-chat-gpt/blob/sshader-streaming/convex/http.ts

Convex News

Announcing Convex 1.12

We’ve had a busy month, and we have a bunch of different improvements to share!

Support for Svelte, Bun, and Vue!

We have a few more logos under our quickstarts section – we've added guides for Svelte, Bun, and Vue including our first community-maintained client library!

HTTP action response streaming

GitHub

streaming-chat-gpt/convex/http.ts at sshader-streaming · sshader/st...

An example of streaming ChatGPT via the OpenAI v4.0 node SDK. - sshader/streaming-chat-gpt

coarse wedge May 15, 2024, 6:39 AM

Letss gooo

@tacit needle great work, thank you so much

rocky thistle May 15, 2024, 8:01 AM

tacit needle: @near dagger @valid mirage @coarse wedge we now have ht...

I haven't had a chance to get into the llama-farm example yet, but does its streaming work differently, or the old way, since it was released before 1.12?

tacit needle May 15, 2024, 8:07 AM

It's using the normal flow - it writes to the DB at the end of sentences/clauses, since the user request isn't being piped all the way to the worker. I wrote this post on the implementation: https://stack.convex.dev/implementing-work-stealing

Implementing work stealing with a reactive database

Implementing "work stealing" - a workload distribution strategy - using Convex's reactive database.

near dagger May 16, 2024, 7:10 AM

That's supercool! Thanks @tacit needle & team convex 🙂

valid mirage Jun 10, 2024, 3:48 AM

Yuss, awesome! Been waiting for this!!

tacit needle Jun 10, 2024, 5:03 AM

Sarah recently wrote a post on it: https://stack.convex.dev/ai-chat-with-http-streaming
and a quick video of using the ai npm library:
https://www.youtube.com/watch?v=kP0HYN6NpA0

AI Chat with HTTP Streaming

By leveraging HTTP actions with streaming, this chat app balances real-time responsiveness with efficient bandwidth usage. Users receive character-by-...

YouTube

Convex

Build your own ChatGPT in 5 Minutes

Convex recently launched its support for the Vercel AI SDK, so we wanted to show off what you could do with it. Sarah goes over setting up your own AI Chat bot and how to use Hono for CORS middleware.
