Text2vec_openai redundancy via multiple providers?

Hi team,

We use the text2vec_openai vectorizer with our Weaviate instance, pointing directly at the default OpenAI base URL. Over the past few months there have been multiple API outages/disruptions that have impacted our production workloads, and we’d like to mitigate this going forward.

Are there any formal recommendations from Weaviate on how to address this?

  1. One option we explored is that, through a proxy like LiteLLM, we can define fallback embedding endpoints (same model, different provider, such as OpenAI → Azure OpenAI) and seamlessly handle disruptions or exceptions. However, calling LiteLLM requires passing an Authorization Bearer token header, as demoed here: Embeddings - /embeddings | liteLLM, which the current text2vec_openai doesn’t seem to support (see the sketch after this list).

  2. Alternatively, could Weaviate provide the ability to define a backup vectorizer internally? Obviously this would be subject to some validation, or at least the understanding that the primary and backup vectorizers would have to use the exact same model and configuration (dimensions, etc.).
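For context on option 1, here is a minimal sketch (Python, using the openai client; the localhost URL and sk-1234 key are placeholder values) of calling a LiteLLM proxy’s OpenAI-compatible embeddings endpoint directly. The client sends the key as an Authorization: Bearer header, which is the header text2vec_openai would need to forward:

from openai import OpenAI

# Placeholder values: a LiteLLM proxy running locally, protected by master_key sk-1234
client = OpenAI(
    base_url="http://localhost:4000/v1",
    api_key="sk-1234",  # sent as "Authorization: Bearer sk-1234"
)

response = client.embeddings.create(
    model="text-embedding-3-large",
    input="The quick brown fox jumps over the lazy dog",
)
print(len(response.data[0].embedding))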

The path of least resistance seems to be #1 for us. Would it be possible for Weaviate to expose a text2vec_openai configuration option so we can pass optional headers to the vectorizer?

Secondly, I understand from the discussion at What is the process for changing vectorizer model - #7 by DudaNogueira that the vectorizer configuration is not mutable. Assuming all we want to do is switch to a different provider while keeping the exact same model (e.g. change the base URL and, hopefully, pass optional headers as described above), is there a way we can do that without having to create an entirely new collection from scratch and migrate?

hi there @D3x !!

That’s an interesting use case :smiley:

I believe option 2 is more complex. Adding a fallback mechanism would require code that feels more appropriate for a proxy :thinking:

Especially when there are tools that can do this better, like LiteLLM.

One thing that I understand could be possible, even as a “break the glass” scenario, is allowing the vectorizer to be changed. Something like a developer mode that you can enable, and then change it.

The main reason the vectorizer is not mutable is to serve as a guardrail, to avoid it being changed wrongfully or with the expectation that it would re-vectorize the entire dataset. Hopefully this will be possible in the future with async vectorization. :wink:

I have played around with LiteLLM and the API key seems to be optional.

Here is what I came up with:

model_list:
  - model_name: text-embedding-3-large
    litellm_params:
      model: openai/text-embedding-3-large
      api_key: os.environ/OPENAI_API_KEY
  - model_name: azure-text-embedding-3-large 
    litellm_params:
      model: azure/text-embedding-my-deployment
      api_base: os.environ/AZURE_API_BASE
      api_key: os.environ/AZURE_AI_API_KEY

#general_settings:
#  master_key: sk-1234 # [OPTIONAL] if set all calls to proxy will require either this key or a valid generated token

router_settings:
  fallbacks: [{"text-embedding-3-large": ["azure-text-embedding-3-large"]}]

Now I run litellm pointing to this config file:

litellm --config config.yaml

and can run curls without a bearer token:

curl --request POST \
  --url http://localhost:4000/v1/embeddings \
  --header 'content-type: application/json' \
  --data '{"model":"text-embedding-3-large","input":"The quick brown fox jumps over the lazy dog"}'

Also, if you want to set a bearer token to protect your endpoint, you can set it as the OpenAI token on client instantiation, like so:

import weaviate
from weaviate.classes.init import Auth

openai_key = "sk-1234"  # use the same master_key defined in the litellm config

# Weaviate forwards this header to the vectorizer endpoint as the bearer token
headers = {
    "X-OpenAI-Api-Key": openai_key,
}

client = weaviate.connect_to_weaviate_cloud(
    cluster_url=weaviate_url,                     # `weaviate_url`: your Weaviate URL
    auth_credentials=Auth.api_key(weaviate_key),  # `weaviate_key`: your Weaviate API key
    headers=headers,
)

That, coupled with pointing your collection vectorizer to LiteLLM, will make Weaviate send sk-1234 to LiteLLM just as it would send it to OpenAI.
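For reference, a minimal sketch of that last part, assuming the v4 Python client and reusing the client from above (the collection name and proxy URL are placeholders):

from weaviate.classes.config import Configure

# Placeholder collection name; base_url points the vectorizer at the LiteLLM proxy
client.collections.create(
    "Articles",
    vectorizer_config=Configure.Vectorizer.text2vec_openai(
        model="text-embedding-3-large",
        base_url="http://localhost:4000",  # instead of the default api.openai.com
    ),
)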

Let me know if that helps!

Thanks @DudaNogueira for the details!

The part that I was missing was that the X-OpenAI-Api-Key header passed into Weaviate is ultimately passed through the text2vec_openai vectorizer as an Authorization Bearer token to whatever base_url we set. It is a little unintuitive, but it should work for our use case. I’ll continue down this path and hope that this will keep our services up when the OpenAI APIs (inevitably) go down again.

And just to confirm: as of now, there is no way to update an existing collection with a vectorizer base_url change without migrating to a brand new collection with an updated vectorizer configuration, correct?


hi @D3x !!

That’s right. You cannot change the vectorizer conf of a collection.

However, you can override that at query time by passing it as a header: X-Openai-Baseurl
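For example (a minimal sketch; the proxy URL is a placeholder), you can pass it alongside the API key header on connection:

import weaviate
from weaviate.classes.init import Auth

client = weaviate.connect_to_weaviate_cloud(
    cluster_url=weaviate_url,
    auth_credentials=Auth.api_key(weaviate_key),
    headers={
        "X-OpenAI-Api-Key": openai_key,               # forwarded as the bearer token
        "X-Openai-Baseurl": "http://localhost:4000",  # overrides the collection's configured base URL
    },
)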

And thank you, as I have just found out that this is not documented :grimacing:

Thanks!!!


Amazing, thanks for X-Openai-Baseurl! I was dreading having to migrate an entire collection :sweat_smile:
