URGENT: Filtering Retrieval Search in Weaviate Based on Tenant-Specific Uploaded Files

I am currently working on a critical project that involves implementing a multi-tenant system using Weaviate for RAG, and I’m faced with a challenge related to filtering retrieval searches based on certain uploaded files for each tenant.

Specifically, I would like to understand how I can configure Weaviate so that retrieval searches are restricted to specific files uploaded by a tenant. Are there specific query parameters or configurations that I need to consider to make this work with LangChain?

For example, if a tenant has 10 files and I want to search only 5 of them, how can I do that?

Any guidance or examples on how to achieve this would be greatly appreciated. I want to ensure that the retrieval search results are scoped to the relevant files associated with the selected tenant in the system.

Hi @AmanAda,
I am not an expert in LangChain, but I can speak to the Weaviate side.

Overall, you don’t need to configure Weaviate in any special way. You only need a property such as file_name (or some other identifier) on each object, and you can filter on it using contains_any.

Get the tenant

First, as part of your query, you need to specify the tenant you are searching on. With the Python client, this looks like this:

my_collection = client.collections.get("MyCollectionName")

# Get the specific tenant's version of the collection
my_tenant = my_collection.with_tenant("tenant_name")

See more here.

Filter (contains_any) on files

Then, you need to run a query on the tenant with a contains_any filter that lists all the files you want to search. Here is an example with the Weaviate Python client:

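from weaviate.classes.query import Filter

# Filter.by_property is the v4 client syntax; early v4 betas wrote wvc.Filter("file_name")
# Only objects whose file_name matches one of the listed files are searched
response = my_tenant.query.near_text(
    query="search term here",
    filters=Filter.by_property("file_name").contains_any(["file1", "file2", "file3"]),
    limit=4
)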
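
To make the effect of the filter concrete for the "10 files, search only 5" case from the question, here is a small plain-Python sketch. It does not call Weaviate at all; it only emulates what contains_any does to the candidate set before the vector search ranks anything. The chunk data, the contains_any helper, and the file names are all hypothetical, and the exact matching behaviour in Weaviate also depends on the property's tokenization setting, which is not modelled here.

```python
# Plain-Python illustration only: this emulates the filter's semantics
# and does not call Weaviate. All names and data here are hypothetical.

def contains_any(value, allowed):
    """Return True if value equals at least one entry in allowed."""
    return value in set(allowed)

# Ten hypothetical files for one tenant, one chunk per file for brevity.
chunks = [
    {"file_name": f"file{i}", "text": f"chunk from file{i}"}
    for i in range(1, 11)
]

# The five files the search should be restricted to.
selected = ["file1", "file2", "file3", "file4", "file5"]

# Only chunks whose file_name is in the selected list remain as candidates.
candidates = [c for c in chunks if contains_any(c["file_name"], selected)]
candidate_files = sorted({c["file_name"] for c in candidates})
print(candidate_files)
```

In the real query, the same list of selected file names would be passed to contains_any, and near_text would then rank only the objects that pass the filter.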