I’m looking to build a RAG workflow for an internal chat application, specifically for file uploads.
Essentially, the workflow could be either of:
- User/Tenant uploads a file into a thread - vectorise it and then query the vectors based on the user’s query
- User/Tenant uploads a file to an “assistant” which can be queried by other users
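To make the two cases concrete, here is a rough sketch of how I picture the scoping. All names here (`Scope`, `UploadTarget`, `partition_key`) are hypothetical and not tied to any Weaviate or OpenAI API; the key format is just an assumption for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    THREAD = "thread"        # file is only visible inside one thread
    ASSISTANT = "assistant"  # file is shared with other users via an assistant


@dataclass(frozen=True)
class UploadTarget:
    tenant_id: str
    scope: Scope
    scope_id: str  # thread id or assistant id, depending on scope

    def partition_key(self) -> str:
        # Hypothetical naming scheme for whatever unit ends up isolating
        # the vectors (a collection, or a partition inside one).
        return f"{self.tenant_id}__{self.scope.value}__{self.scope_id}"


target = UploadTarget(tenant_id="acme", scope=Scope.THREAD, scope_id="t1")
key = target.partition_key()
```

Whichever isolation unit turns out to be right, the same key could be used to route both the upload and the later query.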
My question is about architecture. I was looking at OpenAI’s vector databases, and they’re essentially what I’m looking for, but I would like to isolate the RAG component from what they offer, which is why I’m looking at Weaviate.
Thus the questions:
- Would a collection per thread make sense? Or is some form of partitioning possible within a collection, and would that run into performance issues?
- Threads are ephemeral, and my intention is to expire these threads’ vector stores on a regular basis. I was looking at using collection properties to record when each store should expire, and then using those to prune collections as needed.
Is anyone running a similar workflow, and do you have any suggestions?
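For the expiry part, this is roughly the pruning logic I have in mind, as a minimal sketch. It assumes I can read back a last-used timestamp per thread store; `expired_stores` and the store names are hypothetical, and the actual delete call against Weaviate is deliberately left out.

```python
from datetime import datetime, timedelta, timezone


def expired_stores(last_used, ttl, now=None):
    """Return the names of stores whose last-used time is older than ttl."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - ttl
    # Sorted so a scheduled job processes stores in a stable order.
    return sorted(name for name, ts in last_used.items() if ts < cutoff)


# Example: a 7-day TTL evaluated at a fixed point in time.
now = datetime(2024, 1, 10, tzinfo=timezone.utc)
last_used = {
    "thread_a": datetime(2024, 1, 1, tzinfo=timezone.utc),
    "thread_b": datetime(2024, 1, 9, tzinfo=timezone.utc),
}
to_prune = expired_stores(last_used, timedelta(days=7), now)
```

A scheduled job would then loop over `to_prune` and drop whatever unit holds each thread's vectors.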