Batch import silently fails

I’m using the following code to import data into a collection:

with collection.batch.dynamic() as batch:
    for data_row in [
        {
            "filename": "feeds.pdf",
            "chunk": content,
            "chunk_n": 1,
        },
        ...
    ]:
        uuid = batch.add_object(
            properties=data_row,
        )
        print(uuid)

Some of the objects didn’t get ingested. I noticed this by comparing the expected number of documents with the actual number of documents in the database.
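
(For reference, I got the actual count with something along these lines using the v4 client's aggregate API; a rough sketch, exact usage may vary:)

actual_count = collection.aggregate.over_all(total_count=True).total_count
print(actual_count)  # actual number of objects stored in the collection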

After some debugging, I figured out it’s because some of the chunks were too big for the embedding model.

Is there a way to know whether each object was ingested correctly during a batch import? No error was raised.

Hi @Luka_Secerovic!!

You need to inspect the end result of your batch.

For example, consider this code:

import weaviate

client = weaviate.connect_to_local()
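# two objects with deliberately different vector lengths; the second one will fail to import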
objects = [
    {"text": "object 1", "vector": [1,2,3]},
    {"text": "object 2", "vector": [1,2,3,4,5]},
]
client.collections.delete("Test")
collection = client.collections.create("Test")

with collection.batch.dynamic() as batch:
    for obj in objects:
        batch.add_object(
            properties={"text": obj["text"]}, 
            vector=obj["vector"]
        )

The second object will not be imported, because the two objects have vectors of different lengths.

After running this code, you should see the following in the logs:

{'message': 'Failed to send 1 objects in a batch of 2. Please inspect client.batch.failed_objects or collection.batch.failed_objects for the failed objects.'}

Now, in collection.batch.failed_objects we have:

[ErrorObject(message='inconsistent vector lengths: 5 != 3', object_=_BatchObject(collection='Test', vector=[1.0, 2.0, 3.0], uuid='1d64911a-b270-42a6-9346-ecbda0275e9d', properties={'text': 'object 1'}, tenant=None, references=None, index=0, retry_count=0), original_uuid='1d64911a-b270-42a6-9346-ecbda0275e9d')]
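
So, in your case, after the with block exits you can check programmatically whether anything failed, along these lines (a minimal sketch using the attributes shown in the ErrorObject above):

failed = collection.batch.failed_objects
if failed:
    print(f"{len(failed)} objects failed to import")
    for err in failed:
        print(err.message)             # reason the object was rejected
        print(err.object_.properties)  # the properties of the failed object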

Let me know if that helps!

Thanks!