USE CASE: I have a collection of web pages as HTML files that will be the basis of my LLM-based chatbot project. I will handle the chunking of the documents myself. My experience so far has been limited to using the OpenAI API to create embedding vectors from text chunks and write them to a file. Then I used that same service to create the embedding vector of the user input before executing a query. To execute a query, I would do a linear search through the monolithic target embeddings file, using a cosine similarity function to rank and select the top matches.
I am now switching to a cloud-based Weaviate instance. I have created my production cluster (not sandbox) and have done the quickstart tutorial using my cluster URL and API key (admin, read/write). It works fine.
I am hoping to find a code sample that shows me how to use my cloud-based Weaviate cluster to do what I was doing before with the OpenAI embeddings API and my cosine similarity function. Except now, of course, I want to use the preconfigured vectorizer modules that are part of my Weaviate cluster, and to use Weaviate's cloud-based similarity search to select the top matches.
Is there a good sample that will show me quickly how to:
- Set up and configure my cluster (e.g., create a schema, if that is necessary)
- Attach metadata to each chunk that I store in my Weaviate cluster. I need to attach the source URL of the HTML file the chunk came from, along with a few other fields.
- Vectorize a chunk and store it in my cluster using the JS/TS client
- Vectorize a user query and then run a similarity search, including how to use the most common similarity search parameters (e.g., maximum number of matches, minimum similarity score)
- Any performance or optimization tips that would help me
A good sample that shows me how to do these operations would really help get me up to speed quickly and save me a lot of time compared to poring over many pages of the API reference.
Also, please list the names of any Medium authors who write good articles on using a Weaviate cloud cluster.
Hi @Robert_Oschler!! Welcome to our community!
Have you seen our quickstart?
Apart from the chunking, it will give you a pretty straightforward path to get started.
Also, we have some recipes in this repo:
Those recipes will give you working examples of different features.
Notice that Weaviate will take care of most of those steps (vectorizing the chunk, vectorizing the user query, searching, etc.) for you. For that, you just need to pass your OpenAI key when initializing the client.
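For example, here is a rough, untested sketch using the v2 TypeScript client (`weaviate-ts-client`) and the `text2vec-openai` module. The class and property names (`DocChunk`, `chunkText`, `sourceUrl`, `title`) are just placeholders, so adapt them to your own data:

```ts
import weaviate, { ApiKey } from 'weaviate-ts-client';

const client = weaviate.client({
  scheme: 'https',
  host: 'your-cluster-id.weaviate.network', // your WCS cluster URL, without https://
  apiKey: new ApiKey('YOUR-WEAVIATE-API-KEY'),
  headers: { 'X-OpenAI-Api-Key': 'YOUR-OPENAI-API-KEY' }, // lets Weaviate call OpenAI for you
});

// One-time setup: a class (schema) whose objects are vectorized by text2vec-openai.
await client.schema
  .classCreator()
  .withClass({
    class: 'DocChunk',
    vectorizer: 'text2vec-openai',
    properties: [
      { name: 'chunkText', dataType: ['text'] },
      { name: 'sourceUrl', dataType: ['text'] },
      { name: 'title', dataType: ['text'] },
    ],
  })
  .do();

// Store a chunk with its metadata; Weaviate embeds chunkText via OpenAI on insert.
await client.data
  .creator()
  .withClassName('DocChunk')
  .withProperties({
    chunkText: 'the chunk text...',
    sourceUrl: 'https://example.com/page.html',
    title: 'Example Page',
  })
  .do();
```

Because the class is configured with the `text2vec-openai` vectorizer, Weaviate also embeds your query text automatically at search time, so you never call the OpenAI embeddings endpoint yourself.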
Let me know if you need further assistance. We are here to help!
Thanks!
Thanks. That sample looks pretty good, so I will go through it. Regarding chunking, any tips on a good size for a typical chunk in the “Goldilocks zone” (not too large, not too small)?
That’s great that Weaviate handles the vectorizing. I was using OpenAI for my embeddings work in my linear-file trials, so I already have an API key, and that gives me confidence because I know their embedding endpoint works well. Hopefully they are not having the performance issues I have been having lately with ChatGPT Plus (network errors, slow generation).
Regarding LLM usage: I want to handle that myself, because I have custom code for orchestrating the prompt “massaging” I do with the selection results from the RAG content-selection operation. In other words, I will be using Weaviate for the pre-filtering step, where I take a user input (query) and do a search to get the N best matches that I then pass to the LLM for the text-completion step, and for that I will be using the Gemini 1.5 Pro API. Is there anything I need to know about doing that? I briefly scanned the quickstart you just mentioned, and it appears to assume that Weaviate will be handling the LLM interaction, which I don’t want. That is why I’m asking this question.
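To make it concrete, this is roughly the retrieval half I have in mind, adapted from your sketch above (my own untested adaptation; the `distance` threshold and `limit` here stand in for my old minimum-similarity and top-N parameters):

```ts
// Retrieval only: Weaviate vectorizes the query text and returns the N best chunks.
const result = await client.graphql
  .get()
  .withClassName('DocChunk')
  .withFields('chunkText sourceUrl _additional { distance }')
  .withNearText({
    concepts: ['the user query goes here'],
    distance: 0.25, // max cosine distance; acts like a minimum-similarity cutoff
  })
  .withLimit(5) // maximum number of matches
  .do();

// These chunks go into my own prompt assembly, which I then send to Gemini 1.5 Pro.
const topChunks = result.data.Get.DocChunk;
```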
Hi!
For chunking, unfortunately, there is no one-size-fits-all answer.
We have some nice documentation here that may give you some background:
One thing you could do, to handle the chunking and get fine control over the generation step, is to leverage a tool like LangChain.
I happened to update a recipe today that can be used for that:
Notice that you can use the different LLMs that LangChain supports for generating the answers, which will give you more control over the documents you want to use for that generation.
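As a quick, hedged example of the chunking side (assuming the LangChain JS package and its RecursiveCharacterTextSplitter; note that its sizes are measured in characters, not words):

```ts
import { RecursiveCharacterTextSplitter } from 'langchain/text_splitter';

// ~1000 characters is very roughly 150-200 words of English text.
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 200, // ~20% overlap between consecutive chunks
});

// `pageText` would be your HTML file with the markup already stripped out.
const chunks: string[] = await splitter.splitText(pageText);
```

You can then insert each chunk into Weaviate with its source URL, as in the earlier sketch.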
Let me know if that helps.
Replying to help others who are looking for a good starting point for chunking parameters. From the Weaviate chunking tutorial:
For search with fixed-size chunks, if you don’t have any other factors, try a size of around 100-200 words, and a 20% overlap.
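And in case it saves anyone a few minutes, here is my own naive TypeScript take on that advice (a sketch of mine, not from the tutorial):

```ts
// Naive fixed-size chunker: ~150-word chunks with 20% (30-word) overlap.
function chunkWords(text: string, chunkSize = 150, overlap = 0.2): string[] {
  const words = text.split(/\s+/).filter(Boolean);
  const step = Math.max(1, Math.floor(chunkSize * (1 - overlap)));
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += step) {
    chunks.push(words.slice(i, i + chunkSize).join(' '));
    if (i + chunkSize >= words.length) break; // last window already covers the tail
  }
  return chunks;
}
```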