I am currently working on a critical project that involves implementing a multi-tenant system using Weaviate for RAG, and I’m faced with a challenge related to filtering retrieval searches based on certain uploaded files for each tenant.
Specifically, I would like to understand how I can configure Weaviate to allow retrieval searches that are filtered to specific files uploaded by a tenant. Are there specific query parameters or configurations that I need to consider to work with langchain?
For example, if a tenant has 10 files and I want to search only 5 of them, how can I do it?
Any guidance or examples on how to achieve this would be greatly appreciated. I want to ensure that the retrieval search results are scoped to the relevant files associated with the selected tenant in the system.
Hi @AmanAda,
I am not an expert in langchain, but I can speak from a Weaviate level.
Overall, you don’t need to configure Weaviate in any special way. You only need a property such as file_name (or some other identifier), and you can filter on it using contains_any.
Get the tenant
First, as part of your query, you need to specify the tenant you are searching on. With the Python client, this looks like this:
my_collection = client.collections.get("MyCollectionName")
# Get the specific tenant's version of the collection
my_tenant = my_collection.with_tenant("tenant_name")
See more here.
Filter (contains_any) on files
Then, you need to run a query on the tenant with a filter using contains_any, which should contain all the files you want to search on. Here is an example with the Weaviate Python client:
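import weaviate.classes as wvc

# Keep only objects whose file_name matches any of the listed files
# (Filter.by_property is the v4 Python client syntax)
response = my_tenant.query.near_text(
    query="search term here",
    filters=wvc.query.Filter.by_property("file_name").contains_any(["file1", "file2", "file3"]),
    limit=4,
)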
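For the scenario in the question (a tenant with 10 files, of which only 5 should be searched), the list passed to contains_any can be built from the user's selection. Below is a minimal sketch in plain Python that does not talk to Weaviate at all: select_files and the local contains_any function are hypothetical helpers, and the file names and chunks are made-up data. It only illustrates which objects such a filter would leave eligible for the search, assuming file_name is stored as the exact file identifier.

```python
def select_files(tenant_files, requested):
    """Return the requested file names that the tenant actually owns, in request order."""
    owned = set(tenant_files)
    return [name for name in requested if name in owned]


def contains_any(value, allowed):
    """Local stand-in for the filter: True if value equals any allowed entry."""
    return value in set(allowed)


# Hypothetical data: one tenant with 10 uploaded files, one chunk per file.
tenant_files = [f"file{i}" for i in range(1, 11)]
chunks = [{"file_name": name, "text": f"chunk from {name}"} for name in tenant_files]

# The user picks 5 files; "file99" does not belong to the tenant and is dropped.
requested = ["file1", "file2", "file3", "file4", "file5", "file99"]
selected = select_files(tenant_files, requested)

# Only chunks from the selected files remain eligible for the search.
eligible = [c for c in chunks if contains_any(c["file_name"], selected)]

print(selected)
print(len(eligible))
```

The resulting selected list is what you would pass as the argument to contains_any in the query on the tenant.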