bram
February 22, 2024, 11:50am
Hi!
I was wondering whether it is possible to ‘stream’ the generated answer, as one can do with the OpenAI API.
My goal is to use the Weaviate generate feature in a FastAPI endpoint, like so:
@app.get("/summary")
async def get_topic_summary(topic: str):
    return StreamingResponse(sample_summary(), media_type='text/event-stream')
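For context, `StreamingResponse` accepts an async generator. A minimal sketch of what a `sample_summary` implementation could look like (the body here is purely illustrative, with a fixed token list standing in for a real streaming LLM source):

```python
import asyncio


async def sample_summary():
    # Hypothetical body: yield the answer token by token as
    # server-sent-event chunks; in a real app the tokens would come
    # from the LLM's streaming API instead of this fixed list.
    for token in ["This ", "is ", "a ", "test."]:
        yield f"data: {token}\n\n"
        await asyncio.sleep(0)  # yield control so chunks can be flushed
```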
With OpenAI, one can do this as follows (from their docs):
stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Say this is a test"}],
    stream=True,
)
for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
I could not find documentation about this anywhere. Thanks in advance!
Hi @bram !
This is not possible.
For example, in Verba, our team used OpenAI directly in order to stream the generation to the frontend:
try:
    completion = await asyncio.to_thread(
        openai.ChatCompletion.create, **chat_completion_arguments
    )
    system_msg = str(completion["choices"][0]["message"]["content"])
except Exception:
    raise
return system_msg
async def generate_stream(
    self,
    queries: list[str],
    context: list[str],
    conversation: dict = None,
) -> Iterator[dict]:
    """Generate a stream of response dicts based on a list of queries and list of contexts, and includes conversational context
    @parameter: queries : list[str] - List of queries
    @parameter: context : list[str] - List of contexts
    @parameter: conversation : dict - Conversational context
    @returns Iterator[dict] - Token response generated by the Generator in this format {system:TOKEN, finish_reason:stop or empty}.
    """
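As a rough sketch of that pattern (this is illustrative glue code, not Verba's actual implementation): a blocking token iterator, such as the text pieces pulled from an OpenAI `stream=True` response, can be wrapped into an async generator that emits the `{system: TOKEN, finish_reason: ...}` dicts described in the docstring, which FastAPI can then consume:

```python
import asyncio
from typing import AsyncIterator, Iterable


async def stream_tokens(token_source: Iterable[str]) -> AsyncIterator[dict]:
    # Hypothetical glue code: wrap a blocking token iterator (e.g. the
    # text pieces from an OpenAI stream=True response) into the
    # {"system": token, "finish_reason": ...} dict format above.
    iterator = iter(token_source)
    while True:
        # run the blocking next() call in a worker thread so the event
        # loop stays free while waiting for the next token
        token = await asyncio.to_thread(next, iterator, None)
        if token is None:
            yield {"system": "", "finish_reason": "stop"}
            break
        yield {"system": token, "finish_reason": ""}
```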
Let me know if this helps.
Thanks!
bram
February 22, 2024, 1:01pm
Hi @DudaNogueira ,
Thanks, I will have a look!