#Issue streaming response


finite hazel

Hi all,

I have the following API route:

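use GuzzleHttp\Client;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\Route;
use Illuminate\Support\Str;
use Symfony\Component\HttpFoundation\StreamedResponse;

Route::post('/chat/new', function (Request $request) {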
    $str = $request->get('message');

    $yourApiKey = getenv('OPENAI_API_KEY');

    $search = trim($str);

    $url = 'https://api.openai.com/v1/chat/completions';

    $headers = [
        'Content-Type' => 'application/json',
        'Authorization' => 'Bearer ' . $yourApiKey,
    ];

    $body = [
        'model' => 'gpt-3.5-turbo',
        'max_tokens' => 1024,
        'temperature' => 0,
        'messages' => [
            ['role' => 'system', 'content' => "
                    You are a very enthusiastic AI that loves to answer questions. You'll end every reply with Best Wishes.
                "],
            ['role' => 'user', 'content' => "
                    Here is my question: '{$search}'
                "],
        ],
        'stream' => true
    ];

    $client = new Client();

    $response = new StreamedResponse(function () use ($client, $url, $headers, $body) {
        $guzzleResponse = $client->post($url, [
            'headers' => $headers,
            'body' => json_encode($body),
            'stream' => true,
        ]);

        $buffer = '';

        while (!$guzzleResponse->getBody()->eof()) {
            $buffer .= $guzzleResponse->getBody()->read(1024);

            if (preg_match('/\n/', $buffer)) {
                $chunks = preg_split('/\n/', $buffer);

                for ($i = 0; $i < count($chunks) - 1; $i++) {
                    $chunk = $chunks[$i];

                    // Strip the SSE "data: " prefix; non-JSON lines such as "[DONE]" decode to null
                    $data = json_decode(Str::replace('data: ', '', $chunk), true);

                    if ($data === null) {
                        continue;
                    }

                    echo json_encode($data) . "\n";
                }

                $buffer = $chunks[count($chunks) - 1];
            }

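            // ob_flush() raises a notice when no output buffer is active, so check first
            if (ob_get_level() > 0) {
                ob_flush();
            }
            flush();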
        }
    });

    $response->headers->set('Content-Type', 'text/event-stream');
    $response->headers->set('Cache-Control', 'no-cache');
    $response->headers->set('Connection', 'keep-alive');

    return $response;
});

The frontend consumes it like this:

async function fetchStreamedJson(url, data) {
    const response = await fetch(url, {
        method: 'POST',
        body: JSON.stringify(data),
        headers: {
            'Content-Type': 'application/json',
        },
    });

    if (!response.ok) {
        throw new Error(response.statusText)
    }

    // This data is a ReadableStream
    const dataObj = response.body
    if (!dataObj) {
        return
    }

    const reader = dataObj.getReader()
    const decoder = new TextDecoder()
    let done = false

    let lastMessage = ''

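    while (!done) {
        const {value, done: doneReading} = await reader.read()
        done = doneReading
        // stream: true keeps multi-byte characters that are split across chunks intact
        const chunkValue = decoder.decode(value, { stream: !done })

        console.log({ chunkValue })
        lastMessage = lastMessage + chunkValue
    }

    return lastMessage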
}
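
A side note on the reader loop above: a single read() is not guaranteed to line up with one JSON line, so a small line buffer may help once chunks do arrive incrementally. This is only a minimal sketch, assuming the route keeps emitting one JSON object per line; `createLineParser` is a hypothetical helper name, not part of any library.

```javascript
// Hypothetical helper: buffers streamed text and returns one parsed object
// per complete line. Assumes newline-delimited JSON, as echoed by the route.
function createLineParser() {
    let buffer = ''

    return function push(text) {
        buffer += text
        const lines = buffer.split('\n')
        // The last element is either '' or an incomplete line; keep it for the next call.
        buffer = lines.pop()

        const parsed = []
        for (const line of lines) {
            const trimmed = line.trim()
            if (trimmed === '') continue
            try {
                parsed.push(JSON.parse(trimmed))
            } catch (err) {
                // Skip lines that are not valid JSON rather than aborting the stream.
            }
        }
        return parsed
    }
}
```

Inside the while loop this could be used as `for (const obj of push(chunkValue)) { ... }`, with `push` created once before the loop.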

For some reason it's sending the entire response at the end instead of as each chunk comes in, so I end up with something like:

"{"id":"chatcmpl-7DuMPZSHbnVlQmi4GPxDS9SI8ctIB","object":"chat.completion.chunk","created":1683549269,"model":"gpt-3.5-turbo-0301","choices":[{"delta":{"role":"assistant"},"index":0,"finish_reason":null}]}
{"id":"chatcmpl-7DuMPZSHbnVlQmi4GPxDS9SI8ctIB","object":"chat.completion.chunk","created":1683549269,"model":"gpt-3.5-turbo-0301","choices":[{"delta":{"content":"Hello"},"index":0,"finish_reason":null}]}
{"id":"chatcmpl-7DuMPZSHbnVlQmi4GPxDS9SI8ctIB","object":"chat.completion.chunk","created":1683549269,"model":"gpt-3.5-turbo-0301","choices":[{"delta":{"content":"!"},"index":0,"finish_reason":null}]}
" (but lots more)

What am I doing wrong?
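
For reference, each chunk in the output above carries its piece of the reply in choices[0].delta.content, so the full message could probably be assembled along these lines once the chunks arrive one at a time. This is a rough sketch only; `appendDelta` is a made-up name, and it assumes every chunk has the shape shown in the sample output.

```javascript
// Hypothetical helper: appends the text carried by one chat.completion.chunk object.
function appendDelta(message, chunk) {
    const choice = chunk && Array.isArray(chunk.choices) ? chunk.choices[0] : undefined
    const content = choice && choice.delta ? choice.delta.content : undefined
    // The first chunk only carries the role, so it contributes no text.
    return typeof content === 'string' ? message + content : message
}

// Trimmed-down copies of the three chunks from the sample output.
const chunks = [
    { choices: [{ delta: { role: 'assistant' }, index: 0, finish_reason: null }] },
    { choices: [{ delta: { content: 'Hello' }, index: 0, finish_reason: null }] },
    { choices: [{ delta: { content: '!' }, index: 0, finish_reason: null }] },
]

const message = chunks.reduce(appendDelta, '')
console.log(message) // prints "Hello!"
```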
wild yacht

I would probably not spend time rebuilding what already exists. Why not use an existing package for this?

Sorry, I can't look into your issue any further right now.