bram
February 22, 2024, 11:50am
Hi!
I was wondering whether it is possible to ‘stream’ the generated answer, as one can do with the OpenAI API.
My goal is to use the Weaviate generate feature in a FastAPI endpoint, like so:
@app.get("/summary")
async def get_topic_summary(topic: str):
    return StreamingResponse(sample_summary(), media_type='text/event-stream')
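For context, `StreamingResponse` accepts an async generator. A minimal sketch of what a `sample_summary` implementation could look like (the body here is purely illustrative, with a fixed token list standing in for a real streaming LLM source):

```python
import asyncio


async def sample_summary():
    # Hypothetical body: yield the answer token by token as
    # server-sent-event chunks; in a real app the tokens would come
    # from the LLM's streaming API instead of this fixed list.
    for token in ["This ", "is ", "a ", "test."]:
        yield f"data: {token}\n\n"
        await asyncio.sleep(0)  # yield control so chunks can be flushed
```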
With OpenAI, one can do this as follows (from their docs):
stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Say this is a test"}],
    stream=True,
)
for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
I could not find documentation about this anywhere. Thanks in advance!
Hi @bram !
This is not possible.
For example, in Verba, our team used OpenAI directly in order to stream the generation to the frontend:
try:
    completion = await asyncio.to_thread(
        openai.ChatCompletion.create, **chat_completion_arguments
    )
    system_msg = str(completion["choices"][0]["message"]["content"])
except Exception:
    raise
return system_msg
async def generate_stream(
    self,
    queries: list[str],
    context: list[str],
    conversation: dict = None,
) -> Iterator[dict]:
    """Generate a stream of response dicts based on a list of queries and list of contexts, and includes conversational context
    @parameter: queries : list[str] - List of queries
    @parameter: context : list[str] - List of contexts
    @parameter: conversation : dict - Conversational context
    @returns Iterator[dict] - Token response generated by the Generator in this format {system:TOKEN, finish_reason:stop or empty}.
    """
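As a rough sketch of that pattern (this is illustrative glue code, not Verba's actual implementation): a blocking token iterator, such as the text pieces pulled from an OpenAI `stream=True` response, can be wrapped into an async generator that emits the `{system: TOKEN, finish_reason: ...}` dicts described in the docstring, which FastAPI can then consume:

```python
import asyncio
from typing import AsyncIterator, Iterable


async def stream_tokens(token_source: Iterable[str]) -> AsyncIterator[dict]:
    # Hypothetical glue code: wrap a blocking token iterator (e.g. the
    # text pieces from an OpenAI stream=True response) into the
    # {"system": token, "finish_reason": ...} dict format above.
    iterator = iter(token_source)
    while True:
        # run the blocking next() call in a worker thread so the event
        # loop stays free while waiting for the next token
        token = await asyncio.to_thread(next, iterator, None)
        if token is None:
            yield {"system": "", "finish_reason": "stop"}
            break
        yield {"system": token, "finish_reason": ""}
```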
Let me know if this helps.
Thanks!
bram
February 22, 2024, 1:01pm
Hi @DudaNogueira ,
Thanks, I will have a look!