Need help combining Weaviate with LangChain

I have a use case where each user will have many documents. No user should be able to access any other user's documents. Each user can also select which of their own files to query.
I am using the Weaviate Python client and LangChain (RetrievalQAWithSourcesChain).

First I tried to create a single class “Data” with the properties “content” and “source”, so that a user could filter the data using the “source” property. But this method has a problem: even after filtering, the user is able to access other users' files.

Then I tried another method: one class per user, where the user class has a “data” field that links to the “Data” class.
Below is the schema.

    {
        "classes": [
            {
                "class": username,
                "description": f"Class for user {username}",
                "properties": [
                    {
                        "name": "username",
                        "description": "Username of the user",
                        "dataType": ["text"],
                    },
                    {
                        "name": "data",
                        "description": "Data associated with the user",
                        "dataType": ["Data"],
                    },
                ],
            },
            {
                "class": "Data",
                "description": "Documents/data in the system",
                "vectorizer": "text2vec-openai",
                "moduleConfig": {"text2vec-openai": {"model": "ada", "type": "text"}},
                "properties": [
                    {
                        "name": "content",
                        "description": "The content of the paragraph",
                        "dataType": ["text"],
                        "moduleConfig": {
                            "text2vec-openai": {
                                "skip": False,
                                "vectorizePropertyName": False,
                            }
                        },
                    },
                    {
                        "name": "source",
                        "description": "The link to the document",
                        "dataType": ["text"],
                    },
                ],
            },
        ]
    }

I am using the code below to create a vectorstore.

vectorstore = Weaviate(client, user, "data{ ... on Data { source content }}", attributes=["data { ... on Data { source } }"], embedding=embed)

  1. I am getting the error below:
KeyError: 'data{ ... on Data { source content }}'
  2. How can I retrieve specific data using the “source” from the user class? Is filtering a good approach?

Can anyone help me with this? Thanks in advance.

Hi @ananthan-123 !

Welcome to our community :hugs:

This is a great use case for the new multi-tenancy feature in Weaviate. Each user becomes a tenant of the same class, and each tenant's data is isolated from every other tenant's.
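As a rough illustration of what that could look like with the Weaviate Python client directly (outside LangChain), here is a minimal sketch. It assumes a Weaviate server and a v3 Python client recent enough to support multi-tenancy, reuses the “Data” class from the question, and uses the username as the tenant name. The helper names `data_class_definition`, `setup_tenants` and `search_for_user` are hypothetical.

```python
from typing import Any, Dict, List


def data_class_definition() -> Dict[str, Any]:
    """Class definition for "Data" with multi-tenancy switched on."""
    return {
        "class": "Data",
        "description": "Documents/data in the system",
        "vectorizer": "text2vec-openai",
        # Each tenant's objects are stored separately from the others.
        "multiTenancyConfig": {"enabled": True},
        "properties": [
            {"name": "content", "dataType": ["text"]},
            {"name": "source", "dataType": ["text"]},
        ],
    }


def setup_tenants(client: Any, usernames: List[str]) -> None:
    """Create the class and register one tenant per user (assumed naming)."""
    # Imported lazily so the schema helper works without the client installed.
    from weaviate import Tenant

    client.schema.create_class(data_class_definition())
    client.schema.add_class_tenants(
        class_name="Data",
        tenants=[Tenant(name=name) for name in usernames],
    )


def search_for_user(client: Any, username: str, query: str, limit: int = 4) -> Any:
    """Run a semantic search that only sees the given user's tenant."""
    return (
        client.query.get("Data", ["content", "source"])
        .with_near_text({"concepts": [query]})
        .with_tenant(username)
        .with_limit(limit)
        .do()
    )
```

With this layout, objects would also be written with the same tenant name (for example `client.data_object.create(..., tenant=username)`), so a query for one tenant cannot return another tenant's documents.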

However, multi-tenancy is not yet supported in LangChain.
I have started a PR and Issue here for that:

With that said, filtering is a possible solution.

Creating one class per user is not the best approach. It results in a separate vector space for every user, which makes it hard to scale.
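With a single “Data” class, the filter has to restrict results both to the current user and to the files that user selected. A minimal sketch of building such a `where` filter follows. It assumes an extra `owner` text property on “Data” (not in the original schema) holding the username, and the helper name `build_user_filter` is hypothetical. It may also be worth checking the tokenization of `owner` and `source`: if they are tokenized by word, an `Equal` filter can match on individual tokens rather than on the whole value, which could be one reason other users' files leaked through.

```python
from typing import Any, Dict, List


def build_user_filter(owner: str, sources: List[str]) -> Dict[str, Any]:
    """Build a Weaviate `where` filter for one owner and their selected sources."""
    if not sources:
        raise ValueError("at least one source must be selected")

    # Hypothetical `owner` property: restricts results to the current user.
    owner_clause = {"path": ["owner"], "operator": "Equal", "valueText": owner}

    # One Equal clause per selected file, combined with Or.
    source_clauses = [
        {"path": ["source"], "operator": "Equal", "valueText": source}
        for source in sources
    ]
    if len(source_clauses) == 1:
        sources_clause = source_clauses[0]
    else:
        sources_clause = {"operator": "Or", "operands": source_clauses}

    # Both conditions must hold: right owner AND one of the chosen sources.
    return {"operator": "And", "operands": [owner_clause, sources_clause]}
```

The resulting dictionary has the same shape as the `where_filter` passed to `get_relevant_documents`, so it could be supplied there once the objects carry an `owner` value.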

This is how you would get relevant documents with filtering in the current LangChain, extracted from here:

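from uuid import uuid4

from weaviate import Client
from langchain.docstore.document import Document
from langchain.retrievers.weaviate_hybrid_search import WeaviateHybridSearchRetriever

texts = ["foo", "bar", "baz"]
metadatas = [{"page": i} for i in range(len(texts))]

client = Client("http://localhost:8080")

# The constructor arguments were lost in the post; the values below are
# assumptions (a unique index name, "text" as text key, "page" as metadata).
retriever = WeaviateHybridSearchRetriever(
    client=client,
    index_name=f"LangChain_{uuid4().hex}",
    text_key="text",
    attributes=["page"],
)

# Add each text as its own document, carrying its page number as metadata.
for i, text in enumerate(texts):
    retriever.add_documents(
        [Document(page_content=text, metadata=metadatas[i])]
    )

# Only return documents whose "page" property equals 0.
where_filter = {"path": ["page"], "operator": "Equal", "valueNumber": 0}

output = retriever.get_relevant_documents("foo", where_filter=where_filter)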

Output: [Document(page_content='foo', metadata={'page': 0})]

Let me know if this helps :slight_smile: