[Question] Unable to store custom vectors in local instance

Hi there, I’m currently following the steps in this tutorial exactly [Bring your own vectors | Weaviate - Vector Database] to store custom vectors using insert_many. I can see the embeddings are being produced; however, once I store them, I am unable to retrieve them using the queries below (the vector field returned is empty, {}). Am I missing something? Thank you!

import weaviate
client = weaviate.connect_to_local()
course_collection = client.collections.get('AgentName')

Hi @Jennifer_Jordache,

Welcome to our community! :slightly_smiling_face:

The lines you shared are indeed correct for that. However, it is hard to tell where the issue lies without looking at your code.

Could you please share your code, including the method that imports the data?

Thank you & Have a lovely weekend


Thanks so much @Mohamed_Shahin! I’ve pasted my code below. I was doing some troubleshooting and it seemed like whenever I individually inserted the objects in the for loop (using .insert instead of .insert_many), I was able to store and retrieve the vectors correctly. Is it possible that my usage of insert_many is nulling out the vectors? Also I’d love to look at the definition of insert_many but am having trouble finding it :sweat_smile:

def store_in_weaviate(self, document_name, address, structured_text):
        docname = os.path.split(address)[1]
        # Copy the original file
        self.filecopy(address, f"{self.course_content_store}/{document_name}/{docname}")

        embeddings_data_objs = list()

        # skills will unnest and become property, with the agent name being the collection
        for page in structured_text:
            page_number = page["page"]
            text_df = page[
            ]  # with possible columns 'text', 'heading', 'summary', 'clean'
            embeddings = page["embeddings"]
            summary_embeddings = page["summary_embeddings"]

            # store the text in the vector db
            for i, row in text_df.iterrows():
                # store the clean text
                properties = {
                    "document_name": document_name,
                    "page_number": page_number,
                    "chunk_number": i,
                    "raw_text": row["text"],
                    "heading": row["heading"],
                    "encoded_text": row["clean"],
                    "clean_text": row["clean"],
                    "is_summary": False,
                }
                embeddings_data_objs.append(
                    wvc.data.DataObject(properties=properties, vector=embeddings[i])
                )
                # now store the summary
                properties = {
                    "document_name": document_name,
                    "page_number": page_number,
                    "chunk_number": i,
                    "raw_text": row["text"],
                    "heading": row["heading"],
                    "encoded_text": row["summary"],
                    "clean_text": row["clean"],
                    "is_summary": True,
                }
                embeddings_data_objs.append(
                    wvc.data.DataObject(
                        properties=properties, vector=summary_embeddings[i]
                    )
                )

        course_collection = client.collections.get(self.collection_name)

Hey @Jennifer_Jordache,

Awesome, you spotted it! It happens to me too, little things like that :sweat_smile:.

Here is the definition of the insert_many function:

def insert_many(
    objects: Sequence[Union[Properties, DataObject[Properties, Optional[ReferenceInputs]]]],
) -> BatchObjectReturn:
    """Insert multiple objects into the collection.

    Arguments:
        objects:
            The objects to insert. This can be either a list of `Properties` or `DataObject[Properties, ReferenceInputs]`.
            If you didn't set `data_model` then `Properties` will be `Data[str, Any]` in which case you can insert simple dictionaries here.
            If you want to insert references, vectors, or UUIDs alongside your properties, you will have to use `DataObject` instead.

    Raises:
        If any unexpected error occurs during the batch operation.
        If a property is invalid. I.e., has name `id` or `vector`, which are reserved.
        If every object in the batch fails to be inserted. The exception message contains details about the failure.
    """

Additionally, here is the client repo, so you can read more: Weaviate Python Client

As for .insert_many itself, it should not null out the vectors as long as they are passed in correctly.

I would personally inspect embeddings[i] to make sure it is not null or incorrectly formatted before adding it to the DataObject. I would also add logging to verify that the vectors are correctly passed to DataObject and insert_many.
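One way to narrow this down is a round trip: insert with insert_many, then read one object back while explicitly asking for its vector. The sketch below assumes a v4 Python client collection handle (the result of client.collections.get(...)) and a list of DataObject instances built as in your loop. The helper name insert_and_read_back is made up for illustration. Note that include_vector=True has to be passed on the query, otherwise the returned vector field is empty even when a vector is stored. I can't tell from the thread whether your queries set it, so treat that as something to rule out rather than a diagnosis.

```python
from typing import Any, Dict, Sequence


def insert_and_read_back(collection: Any, objects: Sequence[Any]) -> Dict[str, Any]:
    """Insert objects with insert_many, then fetch the first one with its vector.

    `collection` is assumed to be a Weaviate v4 collection handle and
    `objects` a list of DataObject instances (hypothetical helper).
    """
    result = collection.data.insert_many(objects)

    # Per-object failures are reported on the return value, not raised.
    if result.has_errors:
        for index, error in result.errors.items():
            print(f"object {index} was not inserted: {error}")

    if not result.uuids:
        return {}

    # result.uuids maps each input index to the UUID of the stored object.
    first_index = min(result.uuids)
    fetched = collection.query.fetch_object_by_id(
        result.uuids[first_index], include_vector=True
    )
    return fetched.vector if fetched is not None else {}
```

If this returns an empty dict for objects that went in through insert_many but a populated one for objects that went in through insert, the difference is on the write side; if both come back populated, the query is the more likely culprit.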
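A minimal sketch of that kind of check, in plain Python so it can run before anything touches Weaviate. The function name find_bad_vectors and the expected_dim parameter are my own, not part of the client; pass whatever dimension your embedding model produces.

```python
import math
from typing import Any, List, Sequence


def find_bad_vectors(vectors: Sequence[Any], expected_dim: int) -> List[str]:
    """Return one message per vector that is missing, malformed, or the wrong size."""
    problems: List[str] = []
    for i, vec in enumerate(vectors):
        if vec is None:
            problems.append(f"{i}: vector is None")
            continue
        try:
            # Fails for nested lists, strings, or other non-numeric content.
            values = [float(x) for x in vec]
        except (TypeError, ValueError):
            problems.append(f"{i}: not a flat sequence of numbers ({type(vec).__name__})")
            continue
        if len(values) != expected_dim:
            problems.append(f"{i}: expected {expected_dim} dimensions, got {len(values)}")
        elif not all(math.isfinite(v) for v in values):
            problems.append(f"{i}: contains NaN or infinite values")
    return problems
```

Calling find_bad_vectors(embeddings, expected_dim) and find_bad_vectors(summary_embeddings, expected_dim) once per page, and logging the result before building the DataObject list, would show whether the vectors are already malformed before they reach insert_many.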

I would recommend you read through:

I hope this helps, and I wish you a happy Sunday!

Happy Coding :partying_face:


Thank you again for the reply and for the links, super helpful!

Just wanted to clarify that I still haven’t figured out why insert_many results in null vectors while using insert in a for loop populates the vectors correctly. I’ve decided not to use insert_many anymore because of this, but if you end up having any ideas on why this might be the case, I would appreciate it! I’m also not using threading, so it’s quite confusing why insert_many is behaving this way. Either way, thank you for the help, especially on a weekend!