How to get the Vector Store from Split Documents and Embeddings

Description

I would like help using the Weaviate v4 Python client correctly to vectorise split documents with an embedding model.

Currently the only way I can get the vectors is to use WeaviateVectorStore from langchain_weaviate, but the problem is that it creates a new collection in the cluster every time the code is executed, and the resulting vector store is returned in the vectorstore variable.

What I want is for it not to create a collection every time it vectorises, but just to return the vector. I’ve tried deleting the code in the dependency where it creates the collection automatically, but it still creates a collection.

Server Setup Information

  • Weaviate Server Version: 1.24.13
  • Deployment Method: binary
  • Multi Node? Number of Running Nodes: 1

Any additional Information

Below is the code I’m using

vectorstore = WeaviateVectorStore.from_documents(client=self.client, documents=self.documents_split, embedding=self.embedding)

Could anyone suggest an alternative approach or a solution?

Hi @eduardobuzzi !!

Welcome to our community :hugs:

I believe I have just the recipe you are looking for.

This recipe uses the latest LangChain integration with Weaviate to ingest 2 different PDF files, and lets you use both LangChain and Weaviate directly to filter, generate, query, etc.

I believe you will need to specify an index_name and always initialize with that same index_name, like this:

db = WeaviateVectorStore.from_documents([], embeddings, client=client, index_name="MyCoolCollection")
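As a rough sketch of that idea (not taken from the recipe): the helper below attaches to the collection if it already exists and only calls from_documents on the first run, so repeated executions keep using one collection. The names get_vectorstore and store_cls are hypothetical and introduced only for illustration; store_cls is there so the helper can be tried without a running cluster. It assumes the v4 client's collections.exists method and the WeaviateVectorStore constructor arguments client, index_name, text_key and embedding.

```python
def get_vectorstore(client, embedding, documents=None,
                    index_name="MyCoolCollection", text_key="text",
                    store_cls=None):
    """Return a vector store bound to one fixed collection name.

    Hypothetical helper: `store_cls` defaults to WeaviateVectorStore and is
    only a hook for swapping in a stand-in class.
    """
    if store_cls is None:
        # Imported lazily so the function can be defined without the package.
        from langchain_weaviate import WeaviateVectorStore
        store_cls = WeaviateVectorStore

    if client.collections.exists(index_name):
        # The collection is already there: attach to it, ingest nothing.
        return store_cls(client=client, index_name=index_name,
                         text_key=text_key, embedding=embedding)

    # First run: create the collection under the fixed name and ingest.
    return store_cls.from_documents(documents or [], embedding,
                                    client=client, index_name=index_name,
                                    text_key=text_key)


# Usage (assumes a local Weaviate instance and an existing `embeddings` object):
#   import weaviate
#   client = weaviate.connect_to_local()
#   db = get_vectorstore(client, embeddings, documents_split)
#   client.close()
```

When no index_name is passed, the integration appears to pick a fresh collection name on each call, which would explain a new collection showing up on every run.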

Let me know if this helps :slight_smile:

Hello @DudaNogueira,

Unfortunately, I think this GitHub link has been removed.

Hi!

Sorry about that.

Our team has moved some recipes around.

Check here:

Then navigate to integrations/llm-frameworks/langchain/loading-data/langchain-simple-pdf.ipynb

Let me know if this helps.

Thanks!