Facing maximum context length exceeded issue during vectorizing

Description

I am trying to vectorize a 1500-page PDF document (4 MB) into Weaviate. I am using a mix of static and dynamic chunking strategies that creates chunks of no more than 1000 words each, with a 200-word overlap; chunking this document produced 648 chunks.
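The overlap chunking is conceptually like this (a minimal sketch, assuming plain whitespace word splitting; our real pipeline also applies dynamic rules on top of this):

def chunk_words(text, max_words=1000, overlap=200):
    # slide a window of max_words over the word list,
    # stepping forward by (max_words - overlap) words each time
    words = text.split()
    step = max_words - overlap
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), step)]

After chunking, I use the code below to vectorize the chunks and push them to Weaviate.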

for row in processedText:
    chunkOrder = row[0]        # sequence number of the chunk
    documentContent = row[1]   # actual chunk content

    otherDocDict = {}
    otherDocDict['businessApplicationNumber'] = appNumber   # not important; alphanumeric code, 10-12 characters long
    otherDocDict['applicationName'] = appName               # not important; a text string, 20-30 characters long
    otherDocDict['documentContent'] = documentContent
    otherDocDict['chunkOrder'] = chunkOrder

    print("The current chunk order is:", chunkOrder)
    otherDocUuid = otherDocCollection.data.insert(otherDocDict)   # otherDocCollection is our Weaviate collection
    otherDocUuids.append(otherDocUuid)                            # a list to keep track of all uploaded object uuids

    wv_Collection_Add_Ref(otherDocCollection, otherDocUuid, sysUuid, 'hasSystem')   # creating reference here
    wv_Collection_Add_Ref(sysCollection, sysUuid, otherDocUuid, 'hasOtherDocs')     # creating reference here, bi-directional
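wv_Collection_Add_Ref is our own helper; roughly, it does this (a sketch against the v4 cross-reference API; exact parameter shapes vary a little across 4.x client versions):

def wv_Collection_Add_Ref(collection, from_uuid, to_uuid, ref_property):
    # add a cross-reference on ref_property from one object to another
    collection.data.reference_add(
        from_uuid=from_uuid,
        from_property=ref_property,
        to=to_uuid,
    )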

Important configuration: we are using the Azure OpenAI API with the text-embedding-3-small embedding model.

It runs fine up to chunkOrder 647, but chunk 648 (the last chunk) fails with the error: Object was not added! Unexpected status code: 500, with response body: {'error': [{'message': "update vector: connection to: Azure OpenAI API failed with status: 400 error: This model's maximum context length is 8192 tokens, however you requested 15326 tokens (15326 in your prompt; 0 for the completion). Please reduce your prompt; or completion length."}]}.

Now, as I said, none of our chunks is more than 1000 words long, so I got suspicious and checked the word count and token count of chunk 648. It contains 983 words and, per tiktoken, about 2500 tokens, so it is well under the 8192-token limit. To check further, I changed the code so that only chunk 648 gets vectorized (all the other chunks are skipped), and it works without any issue!
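This is how I counted (a minimal sketch; chunk648 stands for the content of that chunk):

import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # the encoding used by text-embedding-3-small
print(len(chunk648.split()))       # word count: 983
print(len(enc.encode(chunk648)))   # token count: ~2500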

In light of this, my question is: does Weaviate bunch up requests and send them to the vectorizer at once? Or is it retaining context from previous requests, and that is why the context size is getting out of hand?
What is the fix for this issue? Is batching the requests going to solve it? (Roughly what I sketch below.)
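For reference, this is the kind of batching I have in mind (a sketch with the v4 client, reusing the names from my code above):

with otherDocCollection.batch.dynamic() as batch:
    for row in processedText:
        batch.add_object(properties={
            'businessApplicationNumber': appNumber,
            'applicationName': appName,
            'documentContent': row[1],
            'chunkOrder': row[0],
        })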

Please let me know if you have any questions.

Server Setup Information

  • Weaviate Server Version: 1.24.6
  • Deployment Method: k8s using EKS
  • Multi Node? Number of Running Nodes: 2
  • Client Language and Version: Python 3.9.7; weaviate 4.5.0


Hi @SMukherjee!! Welcome to our community! :hugs:

This is indeed strange.

Each chunk should be sent individually, and it should not retain any information from previous content, so I am not sure why this last chunk is getting this error.

When you print the chunk before passing it to the client, you see the expected content, right?

We should get a feature that will allow us to have a close look into the payload being sent. The idea is that, when a more verbose log level is enabled, that information will be printed to the stdout logs.

For now, a "hacky" way is to set the base URL to something different, like so:

import weaviate
from weaviate import classes as wvc

# assuming an already-connected v4 client, e.g.:
# client = weaviate.connect_to_local()

client.collections.delete("PayloadInspect")
collection = client.collections.create(
    "PayloadInspect",
    vectorizer_config=wvc.config.Configure.Vectorizer.text2vec_openai(
        # send the vectorizer's requests to a request-capturing endpoint instead of the real API
        base_url="http://webhook/6849fef3-d146-46a8-b6ca-e76ca6cdcbe7"
    ),
)

Now you can run your ingestion and check the exact payload being sent, as the requests now go to a webhook app that will log every incoming request.

The insertion will probably fail (the webhook cannot return a vector), but you can capture the request.
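If you wrap the insert, the run keeps going while the payloads are captured (a sketch; UnexpectedStatusCodeError is what the v4 client raises for non-2xx responses):

from weaviate.exceptions import UnexpectedStatusCodeError

for row in processedText:
    try:
        collection.data.insert({'documentContent': row[1], 'chunkOrder': row[0]})
    except UnexpectedStatusCodeError:
        # expected: the webhook endpoint does not return a vector
        pass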

Let me know if this helps.