Best practice for fast embedding with OpenAI (or similar performance)?

Hi there!

We’ve been integrating Weaviate with LangChain into our agentic application, and the performance of the vector searches has been great. Lately, though, I’ve noticed a bottleneck coming from the embedding step: even very simple queries take around 1.5-2 seconds to complete.

We’re using the langchain-weaviate library to integrate Weaviate and currently rely on it to build the vector store like this:

    WeaviateVectorStore.from_documents(
        documents,
        client=self.client,
        embedding=self.embeddings_model,
        index_name=index_name,
        uuids=uuids
    )

This creates the Weaviate vector store without a configured vectorizer; instead, we use our own embedding model to run near_vector searches.
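
For context, the query path then looks roughly like this (a simplified sketch with our own names; `user_message` and `vector_store` are placeholders from our app code, not library internals):

    # We embed the query text ourselves via the OpenAI embeddings model,
    # then hand the resulting vector to the LangChain store, which runs
    # a near_vector search against Weaviate.
    query_vector = self.embeddings_model.embed_query(user_message)      # round trip to OpenAI
    docs = vector_store.similarity_search_by_vector(query_vector, k=4)  # near_vector search in Weaviate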

I’ve noticed that this adds roughly 0.5-0.8s of latency just to embed a short message, because every query requires a round trip to the OpenAI embeddings API endpoint.
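
To see where the time goes, I timed the embedding step separately from the end-to-end search (rough, hypothetical snippet, averaged over a few runs):

    import time

    start = time.perf_counter()
    _ = self.embeddings_model.embed_query("a short user message")
    print(f"embedding only: {time.perf_counter() - start:.2f}s")     # ~0.5-0.8s for us

    start = time.perf_counter()
    _ = vector_store.similarity_search("a short user message", k=4)
    print(f"embed + search: {time.perf_counter() - start:.2f}s")     # the Weaviate search itself is fast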

What’s the best practice with Weaviate to mitigate this latency while still maintaining high-quality embeddings for similarity search?

Any insights or advice would be highly appreciated!

Thanks in advance.

Kind regards,
Jasper