Retrieving “Adjacent” Chunks for Better Context

In my RAG application I use Flowise to upsert a text file, which is then chunked into documents of a fixed chunk size. The embeddings for these chunks are stored in Weaviate. I ask a question via Flowise, which sends the question’s embedding to Weaviate as a query. Weaviate retrieves the text chunk with the closest vector similarity to my question. All good. But how can I then query Weaviate to also retrieve the two text chunks which are located before and after that chunk in the original file? The idea is to use small chunks in the VDB for precision, but to selectively bring wider context back into the LLM for better reasoning.

Hi Mike!

This is not possible directly. That said, the query should already bring back those chunks if they are relevant enough to your question, and since you have small chunks you should be able to pass a larger number of them forward to the generation step.

I believe you want to tweak the chunk size and the overlap so that each chunk has enough context before and after.
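For illustration, overlap just means each chunk repeats the tail of the previous one, so boundary context is shared. A minimal sketch in plain Python (in Flowise this is simply the splitter’s chunk size and overlap settings, so you would not write this yourself):

```python
def chunk_with_overlap(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into fixed-size chunks, each repeating the last `overlap`
    characters of the previous chunk so neighbouring context is shared."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```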

Have you seen this material we have on that subject?

Let me know if this helps.

Hi Duda,

Let me try to explain more clearly. I want to use small chunks because they give more specific similarity search results – increasing the chunk size is therefore not an option, and adjusting overlap does not solve the problem. Whenever I get a ‘hit’ on cosine similarity for a chunk, I want to retrieve that chunk and also the two chunks which are contiguous to it. So, for example, let’s say I upload a book which is chunked and embedded at paragraph level. When I query the VDB and my query matches a particular paragraph, I would ideally like to retrieve the whole page of the book, not just the matching paragraph. But I don’t have the concept of a page, so as a work-around I want to get the paragraphs immediately before and after the matching paragraph. Let’s say I query the VDB and get a hit on paragraph “B”. I would also want to retrieve paragraphs A and C – even though they may not match at all on cosine similarity – I want the whole block of text A-B-C.

I guess what this comes down to is how Weaviate stores the original text; whether contiguous chunks of text have a numbering system which reflects the order in which they were upserted, so that if the three paragraphs A, B and C were numbered 1, 2 and 3, then paras A and C could perhaps be retrieved as para 2-1=1 and para 2+1=3. Vectara does something similar to this: “x sentences before and x sentences after” a matching sentence.
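To make the idea concrete, here is a rough sketch of that kind of upsert-time numbering with the Weaviate Python client (v4). The Chunk collection and the chunk_index / doc_id properties are purely illustrative names, not something Weaviate provides by default:

```python
import weaviate

paragraphs = ["Paragraph A ...", "Paragraph B ...", "Paragraph C ..."]

client = weaviate.connect_to_local()          # assumes a local Weaviate instance
chunks = client.collections.get("Chunk")      # illustrative collection name

# Upsert the paragraphs in order, recording each one's position in the source document.
with chunks.batch.dynamic() as batch:
    for i, paragraph in enumerate(paragraphs):
        batch.add_object(properties={"text": paragraph, "doc_id": "my-book", "chunk_index": i})

client.close()
```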

One thing that could be worth trying is doing two queries.

The first query will get the top X objects.

When you chunk your documents, you save the position of each chunk as a property.

Based on the top X chunks, you then do a second query to also select the chunks at positions -1 and +1 relative to each of them.

The downside of this approach is that, for generating the answer, you would have to consume the generative API yourself at code level, as you would need to “re-augment” your context window.
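A rough sketch of what those two queries could look like with the Weaviate Python client (v4), assuming the chunks were stored with the illustrative chunk_index and doc_id properties from the sketch above (this is not an official recipe, just one way to wire it up):

```python
import weaviate
from weaviate.classes.query import Filter

client = weaviate.connect_to_local()
chunks = client.collections.get("Chunk")  # illustrative collection with chunk_index / doc_id

# First query: plain similarity search for the top X chunks (requires a configured vectorizer).
hits = chunks.query.near_text(query="What does the author say about X?", limit=3)

# Second query: for each hit, fetch the chunks stored immediately before and after it.
for hit in hits.objects:
    idx = hit.properties["chunk_index"]
    neighbours = chunks.query.fetch_objects(
        filters=(
            Filter.by_property("doc_id").equal(hit.properties["doc_id"])
            & Filter.by_property("chunk_index").greater_or_equal(idx - 1)
            & Filter.by_property("chunk_index").less_or_equal(idx + 1)
        ),
        limit=3,
    )
    ordered = sorted(neighbours.objects, key=lambda o: o.properties["chunk_index"])
    context = " ".join(o.properties["text"] for o in ordered)
    # `context` is the A-B-C block; you would now pass it to the LLM yourself.

client.close()
```

The last step is the “re-augment at code level” part: you assemble the wider context string yourself and send it to the generative model, instead of relying on the built-in retrieval-to-generation wiring.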

Let me know if this helps.

Thanks, Duda, but I’m not sure I follow. Do you mean that Weaviate does not store the ordinal number of the chunks as they are upserted? Or are you saying that the matching chunk does have a stored ordinal number and that I should therefore:

  1. Query Weaviate and get the chunk that matches by cosine similarity;
  2. Query Weaviate for that matching chunk’s ordinal number x; and
  3. Use that ordinal number to query Weaviate for the (possibly non-matching) chunks of ordinal number x-1 and x+1?

(To make things simple let’s just assume that Top_k=1 – only one matching chunk to be retrieved, plus the chunk before and after it)

I just had this very same thought today and Googled it and this post came up first. I sort of figured Weaviate would have a way to do this. Perhaps in the past 4 months they have developed something.

But I was thinking of a workaround. In my class I have a property called “docId”, which contains information about the embedding relative to its identifiers in my local filesystem. In this docId is a delta, which is the sequential number of the chunk relative to its parent document. So, for example, if each chunk was a paragraph, the first paragraph would have the delta 0, the following paragraph would have the delta 1, and so on.

What this means is that if I do a cosine similarity search and retrieve a chunk with a high score, I could use its docId to retrieve the sequential chunk before it (if one exists) and the one after (if one exists) using a filtered query. I believe this is exactly what you want to do.
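A minimal sketch of that kind of filtered query (Weaviate Python client v4; the “mybook#12” docId format, where the trailing number is the delta, is a made-up stand-in for whatever scheme you actually use):

```python
import weaviate
from weaviate.classes.query import Filter

def neighbour_doc_ids(doc_id: str) -> list[str]:
    """Given e.g. 'mybook#12', return ['mybook#13', 'mybook#11'] (no '#-1' for the first chunk)."""
    parent, delta = doc_id.rsplit("#", 1)
    d = int(delta)
    ids = [f"{parent}#{d + 1}"]
    if d > 0:
        ids.append(f"{parent}#{d - 1}")
    return ids

client = weaviate.connect_to_local()
chunks = client.collections.get("Chunk")  # illustrative collection storing a docId property

# `hit` is the high-scoring chunk from the similarity search; fetch its neighbours by docId.
hit = chunks.query.near_text(query="my question", limit=1).objects[0]
adjacent = chunks.query.fetch_objects(
    filters=Filter.by_property("docId").contains_any(neighbour_doc_ids(hit.properties["docId"]))
)
block = [hit] + list(adjacent.objects)  # the matching chunk plus whichever neighbours exist

client.close()
```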

What solution did you eventually come up with?

Found it! Small-to-Big RAG Retrieval

Hi!

Oh… that’s interesting!

Thanks for sharing :slight_smile:

Just a follow up to this discussion.

In addition to implementing my own Semantic Chunking strategy (Using gpt-4 API to Semantically Chunk Documents - #166 by SomebodySysop) as well as Small-to-Big chunk retrieval for better chunk context (Advanced RAG 01: Small-to-Big Retrieval | by Sophia Yang, Ph.D. | Towards Data Science),

I also deployed the “Deep Dive” strategy (RAG is failing when the number of documents increase - #5 by SomebodySysop - API - OpenAI Developer Forum). Essentially, I take the top 50 (or even 100) cosine similarity search results returned by Weaviate and rate each chunk based upon its relevance to the actual question asked. I rate one chunk at a time, which ensures the best model response. I then return the highest-rated chunks together as context to the model for a complete answer.

I’m using OpenAI’s new text-embedding-3-large embedding model.

Not only is this process faster than I thought it was going to be (since each API call only returns a single rating number, in my case 0-10), but it is also far less expensive than I imagined (especially with the new gpt-4o-mini and gemini-1.5-flash models).

This works amazingly well. It turned out to be the key to resolving my issues with getting “comprehensive” responses.
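A rough sketch of what that per-chunk rating loop can look like with the openai Python package (the prompt wording, the score threshold and the variable names are illustrative, not the exact implementation described above):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def rate_chunk(question: str, chunk: str, model: str = "gpt-4o-mini") -> int:
    """Ask the model for a single 0-10 relevance rating of one chunk."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system",
             "content": "Rate how relevant the passage is to the question on a scale of 0 to 10. "
                        "Reply with the number only."},
            {"role": "user", "content": f"Question: {question}\n\nPassage: {chunk}"},
        ],
    )
    return int(response.choices[0].message.content.strip())

# `question` and `top_chunks` (the top 50-100 similarity hits) come from the retrieval step.
ratings = [(rate_chunk(question, chunk), chunk) for chunk in top_chunks]
ratings.sort(key=lambda pair: pair[0], reverse=True)
context = "\n\n".join(chunk for score, chunk in ratings if score >= 7)  # keep only the highest-rated chunks
```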

Hope this helps someone else!

I am doing exactly this. However, you have to do it programmatically, and it requires that you store the ordinal numbers in your objects’ property metadata. So, when you retrieve a chunk, you can then also retrieve the preceding and following chunks.

I created a method in PHP called getComprehensiveData() which executes this process. Here is my ChatGPT chat on the subject: https://chatgpt.com/share/ee2bce35-0828-442e-b595-6a922caec14c

In my metadata, these “ordinal” numbers are referred to as “docIds”.

If you haven’t yet implemented something like this, give it a try. It works beautifully in providing more context to your retrieved chunks.

Haven’t tested it out, but how about you add a new reference property “adjacentChunks”, add references to the chunks that you want to set as related, and then also return the content of the referenced objects.

See here: Manage relationships with cross-references | Weaviate - Vector Database
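An untested sketch of what that could look like with the Weaviate Python client (v4), following the cross-reference docs linked above; the Chunk collection, the adjacentChunks reference and the sample paragraphs are just the names suggested in this thread, and a vectorizer module is assumed so that near_text works:

```python
import weaviate
from weaviate.classes.config import Property, DataType, ReferenceProperty
from weaviate.classes.query import QueryReference

client = weaviate.connect_to_local()

# One-time setup: a Chunk collection whose objects can reference other Chunk objects.
client.collections.create(
    "Chunk",
    properties=[Property(name="text", data_type=DataType.TEXT)],
    references=[ReferenceProperty(name="adjacentChunks", target_collection="Chunk")],
)
chunks = client.collections.get("Chunk")

# Insert three contiguous chunks and link the middle one to its neighbours.
chunk_a_uuid = chunks.data.insert({"text": "Paragraph A ..."})
chunk_b_uuid = chunks.data.insert({"text": "Paragraph B ..."})
chunk_c_uuid = chunks.data.insert({"text": "Paragraph C ..."})
chunks.data.reference_add(from_uuid=chunk_b_uuid, from_property="adjacentChunks", to=chunk_a_uuid)
chunks.data.reference_add(from_uuid=chunk_b_uuid, from_property="adjacentChunks", to=chunk_c_uuid)

# At query time, pull the referenced neighbours back alongside the matching chunk.
result = chunks.query.near_text(
    query="my question",
    limit=1,
    return_references=QueryReference(link_on="adjacentChunks", return_properties=["text"]),
)
hit = result.objects[0]
neighbour_texts = [ref.properties["text"] for ref in hit.references["adjacentChunks"].objects]

client.close()
```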

So, it appears that Weaviate has addressed this issue more directly:
