I’m building a QA chat experience tailored to a specific user’s data using a RetrievalQAChain from LangChain. Let’s say the user has many “events” that I want the chat to be able to answer questions about. I create embeddings from these with OpenAI and store them in Weaviate. That works, but I only want the chat to answer questions about the logged-in user’s “events”.
I see two ways to do that:
- Create a Weaviate Class for every user and store their embeddings in their Class. Always perform the similarity search on the user’s Class.
- Use a single Weaviate Class for “events” but include a userId Property on each data object and filter on that userId when performing a similarity search.
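For the second option, the restriction could be expressed as a Weaviate-style "where" filter attached to the similarity search. A minimal sketch, assuming the property is named userId as above. The helper user_filter is hypothetical, and the value key is an assumption that depends on the Weaviate version (valueText in newer releases, valueString in older ones):

```python
# Hypothetical helper: builds a Weaviate-style "where" filter that limits
# a search to one user's data objects in a shared "events" Class.
def user_filter(user_id: str) -> dict:
    return {
        "path": ["userId"],      # the Property added to every data object
        "operator": "Equal",
        # Assumption: key name varies by Weaviate version
        # (valueText in newer releases, valueString in older ones).
        "valueText": user_id,
    }


print(user_filter("user-123"))
```

With the first option no filter is needed; the search simply targets the user's own Class.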
Both approaches seem reasonable to me right now. My one concern is whether the presence of other users’ data objects in the single “events” Class would affect the similarity search results even with the filter on the userId Property. Aside from that, the differences seem to be mostly in how they would be implemented (both seem easy enough tbh).
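To make the concern concrete, here is a toy sketch in plain Python (not Weaviate, and not its actual index). It uses made-up 2-D vectors and exact cosine search, and contrasts filtering before ranking with filtering after ranking. All names and data here are hypothetical. Under exact search, filtering first gives the same result as a per-user Class, while filtering afterwards lets other users' objects crowd out matches:

```python
import math

# Made-up data: two users sharing one "events" collection.
EVENTS = [
    {"id": "a1", "userId": "alice", "vector": [1.0, 0.0]},
    {"id": "a2", "userId": "alice", "vector": [0.6, 0.8]},
    {"id": "b1", "userId": "bob", "vector": [0.9, 0.1]},
    {"id": "b2", "userId": "bob", "vector": [0.0, 1.0]},
]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def top_k(query, objects, k):
    # Exact (brute-force) nearest neighbours by cosine similarity.
    ranked = sorted(objects, key=lambda o: cosine(query, o["vector"]), reverse=True)
    return ranked[:k]


def search_filter_first(query, user_id, k):
    # Restrict to the user's objects, then rank. With exact search this is
    # the same as searching a Class that only holds this user's objects.
    allowed = [o for o in EVENTS if o["userId"] == user_id]
    return [o["id"] for o in top_k(query, allowed, k)]


def search_filter_last(query, user_id, k):
    # Rank everything, then drop other users' objects. Their objects can
    # take up slots in the top k, so fewer results may come back.
    hits = top_k(query, EVENTS, k)
    return [o["id"] for o in hits if o["userId"] == user_id]


query = [1.0, 0.0]
print(search_filter_first(query, "alice", 2))
print(search_filter_last(query, "alice", 2))
```

So the answer to my question probably depends on which of these the filtered search in Weaviate resembles, and on how its approximate index behaves under a filter, which is worth confirming in the Weaviate docs.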
Does anyone have any comments/suggestions/considerations regarding either approach?
Thanks!