# Why doesn't it stream the tokens?

solemn urchin
#

This is the code I used:
```py
import requests
import sseclient  # pip install sseclient-py
import json

url = "http://127.0.0.1:5000/v1/chat/completions"

headers = {
    "Content-Type": "application/json"
}

history = []

while True:
    user_message = input("> ")
    history.append({"role": "user", "content": user_message})
    data = {
        "mode": "chat-instruct",
        "stream": True,
        "messages": history
    }

    stream_response = requests.post(url, headers=headers, json=data, verify=False, stream=True)
    client = sseclient.SSEClient(stream_response)

    assistant_message = ''
    for event in client.events():
        payload = json.loads(event.data)
        chunk = payload['choices'][0]['message']['content']
        assistant_message += chunk
        print(chunk, end='')

    print()
    history.append({"role": "assistant", "content": assistant_message})
```

I got the code from https://github.com/oobabooga/text-generation-webui/wiki/12-‐-OpenAI-API#python-chat-example but I just can't figure out why it isn't streaming the tokens. Right now it prints the whole reply at once. Also, how would I add a system prompt to this without ruining the history feature?

solemn urchin
#

I'm using the streaming version of the example, but it doesn't work.

oak thistle
#

did you use the -u flag?
python -u something.py

solemn urchin
#

No, should I use that?

oak thistle
#

yes

#

Without the -u flag, Python buffers stdout, so the tokens get clumped together and printed in one go.
The -u flag disables that buffering.
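
If you'd rather not depend on the flag, flushing on every print should have a similar effect. A minimal sketch, assuming the chunks come one at a time out of the SSE loop in the script above; `print_stream` is a made-up helper name for illustration, not something from the API or any library:

```python
import sys


def print_stream(chunks, out=sys.stdout):
    """Write each chunk as soon as it arrives and return the full message."""
    assistant_message = ''
    for chunk in chunks:
        # flush=True pushes the text out right away instead of waiting
        # for a newline or a full buffer (similar in effect to python -u).
        print(chunk, end='', file=out, flush=True)
        assistant_message += chunk
    # Finish the line once the stream is done.
    print(file=out, flush=True)
    return assistant_message
```

In the original loop this boils down to changing `print(chunk, end='')` to `print(chunk, end='', flush=True)`.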

solemn urchin
#

I'm running it from Visual Studio right now, but I'll run it from the terminal.

solemn urchin
#

Do you know how I would implement a system prompt using this api (without messing up the history)?

oak thistle
#

Well, I tried appending a system role message, and the model just ignores it lol

#

But you can put your system prompt in your instruction template and then select it in the request with "instruction_template": "something".
The downside is that you can't modify the prompt on every request 🫠
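
Roughly, both options could look like this. It's a sketch, not a tested recipe: `build_payload` is a hypothetical helper, `"something"` is just the placeholder template name from above, and whether the model actually respects a system role message depends on the template (as said, it got ignored for me). The point is only that the system prompt is added per request and never stored in `history`:

```python
def build_payload(history, system_prompt=None, instruction_template=None):
    """Build the request body without modifying the stored history."""
    # Shallow copy, so inserting below does not change the caller's list.
    messages = list(history)
    if system_prompt:
        # Prepended for this request only; it is never saved into history.
        messages.insert(0, {"role": "system", "content": system_prompt})
    payload = {
        "mode": "chat-instruct",
        "stream": True,
        "messages": messages,
    }
    if instruction_template:
        # Assumed: the name of a template that already holds the system prompt.
        payload["instruction_template"] = instruction_template
    return payload
```

Then in the loop you'd use `data = build_payload(history, instruction_template="something")` instead of building `data` by hand.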

solemn urchin
#

I'll try it out