OpenAI Vectorizer failing to reach embeddings endpoint

Description

I am creating a collection as follows:

    self.collections[self.test_case_collection_name] = self.client.collections.create(
        name=self.test_case_collection_name,
        vectorizer_config=Configure.Vectorizer.text2vec_openai(
            model=self.embedding_model_name,
            base_url=self.embedding_model_base_url,
        ),
        properties=[
            wvc.config.Property(
                name="test_case_id",
                data_type=wvc.config.DataType.TEXT,
                skip_vectorization=True,
                index_filterable=True,
            ),
            wvc.config.Property(
                name="test_case_content",
                data_type=wvc.config.DataType.TEXT,
            ),
        ],
    )

where embedding_model_base_url is the endpoint of an inference service running on the same K8s cluster as this Weaviate deployment.

I then have the following function (serving as a wrapper for the built-in Weaviate insert functionality):

    def insert_items(
        self,
        collection_type: CollectionType,
        items: Union[str, List[str]],
        item_ids: Union[str, List[str]],
        item_intents: Union[str, List[str]] = None,
        additional_properties: Dict[str, Any] = None
    ) -> None:
        items = as_list_str(items)
        item_ids = as_list_str(item_ids)
        
        if item_intents is not None:
            item_intents = as_list_str(item_intents)
            if len(items) != len(item_intents):
                logger.error(
                    f"Unable to insert items as the number of items differs from the number of item intents")
                return

        if len(items) != len(item_ids):
            logger.error(
                f"Unable to insert items as the number of items differs from the number of item IDs")
            return
        
        if collection_type == CollectionType.TEST_CASES:
            collection_name = self.db_name + "_test_cases"
            content_field = "test_case_content"
            id_field = "test_case_id"
        elif collection_type == CollectionType.TEST_PLANS:
            collection_name = self.db_name + "_test_plans"
            content_field = "test_plan_content"
            id_field = "test_plan_id"
        elif collection_type == CollectionType.TEST_SETS:
            collection_name = self.db_name + "_test_sets"
            content_field = "test_set_content"
            id_field = "test_set_id"
            intent_field = "test_set_intent"
        else:
            logger.error(f"Unknown collection type: {collection_type}")
            return

        collection = self.collections.get(collection_name)
        if collection is None:
            logger.error(f"Collection {collection_name} not found")
            return

        with collection.batch.dynamic() as batch:
            for i, (item, item_id) in enumerate(zip(items, item_ids)):
                try:
                    properties = {
                        id_field: item_id,
                        content_field: item
                    }
                    
                    if collection_type == CollectionType.TEST_SETS and item_intents is not None:
                        properties[intent_field] = item_intents[i]
                    
                    if additional_properties:
                        properties.update(additional_properties)
                    
                    batch.add_object(properties=properties)

                    if batch.number_errors > 0:
                        raise Exception("Batch errors detected")

                except Exception as e:
                    failed_objects = collection.batch.failed_objects
                    logger.error(f"Failed objects: {failed_objects}, Error: {str(e)}")
                    break

        logger.info(f"{len(items)} item(s) added to {collection_name} collection")

I then try to use the insert_items function to add approximately 3000 records to the database. Upon initial startup of the Weaviate server, with a freshly instantiated collection, the insertion appears to work: I can watch the logs of the external inference server and see that requests are being processed successfully.

However, at a certain point before all ~3000 records are inserted, the batches start to fail and I get the following errors on the client:

2025-04-28 17:11:59,690 - weaviate-client - ERROR - {'message': 'Failed to send all objects in a batch of 144', 'error': "WeaviateBatchError('Query call with protocol GRPC batch failed with message Deadline Exceeded.')"}
2025-04-28 17:11:59,691 - weaviate-client - ERROR - {'message': 'Failed to send 144 objects in a batch of 144. Please inspect client.batch.failed_objects or collection.batch.failed_objects for the failed objects.'}
2025-04-28 17:12:03,152 - weaviate-client - ERROR - {'message': 'Failed to send all objects in a batch of 192', 'error': "WeaviateBatchError('Query call with protocol GRPC batch failed with message Deadline Exceeded.')"}
2025-04-28 17:12:03,153 - weaviate-client - ERROR - {'message': 'Failed to send 192 objects in a batch of 192. Please inspect client.batch.failed_objects or collection.batch.failed_objects for the failed objects.'}
2025-04-28 17:12:03,168 - vectordb - ERROR - Failed objects: [], Error: Batch errors detected
2025-04-28 17:12:10,167 - weaviate-client - ERROR - {'message': 'Failed to send all objects in a batch of 192', 'error': "WeaviateBatchError('Query call with protocol GRPC batch failed with message Deadline Exceeded.')"}
2025-04-28 17:12:10,167 - weaviate-client - ERROR - {'message': 'Failed to send 192 objects in a batch of 192. Please inspect client.batch.failed_objects or collection.batch.failed_objects for the failed objects.'}
2025-04-28 17:15:03,181 - weaviate-client - ERROR - {'message': 'Failed to send all objects in a batch of 192', 'error': "WeaviateBatchError('Query call with protocol GRPC batch failed with message Deadline Exceeded.')"}
2025-04-28 17:15:03,183 - weaviate-client - ERROR - {'message': 'Failed to send 192 objects in a batch of 192. Please inspect client.batch.failed_objects or collection.batch.failed_objects for the failed objects.'}
2025-04-28 17:18:03,201 - weaviate-client - ERROR - {'message': 'Failed to send all objects in a batch of 48', 'error': "WeaviateBatchError('Query call with protocol GRPC batch failed with message Deadline Exceeded.')"}
2025-04-28 17:18:03,202 - weaviate-client - ERROR - {'message': 'Failed to send 48 objects in a batch of 48. Please inspect client.batch.failed_objects or collection.batch.failed_objects for the failed objects.'}

Note: when I uninstall and re-install Weaviate (using Helm, including deleting the underlying data volume), I can repeat the above process and add at least some items to the DB before hitting the error again.

I see no particular errors in the server logs.

This is strange, because it is clearly not an issue with my inference server (I can continue to make unrelated requests to the inference server after Weaviate stops working). So the issue is something along the lines of "the Weaviate server cannot reach the inference server at all via the provided endpoint." This is quite unusual, since the deployment starts out seemingly fine and only halts after some time. It is also worth mentioning that I don't think this is a data size / bandwidth problem, as A) the amount of data being vectorized is not large, B) the node the server is deployed on has ample resources, and C) once inserts stop working, I cannot insert any entries, not even a single one.

What I find even stranger is that the following function (which also uses the vectorizer, of course) keeps working after the insert function ceases to:

    def query(self, query: str, collection_type: CollectionType, limit: int = 5, target_vector: str = "content") -> Dict[str, Any]:
        if collection_type == CollectionType.TEST_CASES:
            collection_name = self.test_case_collection_name
            id_field = "test_case_id"
            content_field = "test_case_content"
            vector_name = None  # No named vectors for test cases
        elif collection_type == CollectionType.TEST_PLANS:
            collection_name = self.test_plan_collection_name
            id_field = "test_plan_id"
            content_field = "test_plan_content"
            vector_name = None  # No named vectors for test plans
        elif collection_type == CollectionType.TEST_SETS:
            collection_name = self.test_set_collection_name
            id_field = "test_set_id"
            content_field = "test_set_content"
            # Choose the appropriate named vector based on target_vector parameter
            vector_name = "content_vector" if target_vector == "content" else "intent_vector"
            intent_field = "test_set_intent"
        else:
            logger.error(f"Unknown collection type: {collection_type}")
            return {}

        group_by = wq.GroupBy(
            prop=id_field,
            objects_per_group=1,
            number_of_groups=limit,
        )

        collection = self.collections.get(collection_name)
        
        # For collections with named vectors, we need to specify which vector to use
        if vector_name:
            response = collection.query.hybrid(
                query=query,
                target_vector=vector_name,
                group_by=group_by,
                return_metadata=wq.MetadataQuery(score=True, distance=True),
            )
        else:
            # For collections without named vectors, use the default query
            response = collection.query.hybrid(
                query=query,
                group_by=group_by,
                return_metadata=wq.MetadataQuery(score=True, distance=True),
            )

        output = {id_field: [], content_field: []}
        
        # Add intent field to output if we're querying test sets
        if collection_type == CollectionType.TEST_SETS:
            output[intent_field] = []
        
        for grp_name, grp_content in response.groups.items():
            item_id = grp_content.objects[0].properties[id_field]
            item_content = grp_content.objects[0].properties.get(content_field)

            output[id_field].append(item_id)
            output[content_field].append(item_content)
            
            if collection_type == CollectionType.TEST_SETS:
                item_intent = grp_content.objects[0].properties.get(intent_field)
                output[intent_field].append(item_intent)

        return output

Server Setup Information

  • Weaviate Server Version: 1.30.0
  • Deployment Method: k8s + Helm
  • Multi Node? No. Number of Running Nodes: 1
  • Client Language and Version: go/1.22.0
  • Multitenancy?: No

Any additional Information

Hi @Cam_Quilici !

Welcome to Weaviate Community!

It seems that you are encountering OpenAI timeout issues, which can happen when inserting large datasets. I noticed that the failed objects, which could contain the specific error code, are not being displayed correctly. You may use the code below to see the specific failure that caused the issue for each failed object:

    def insert_items(
        self,
        collection_type: CollectionType,
        items: Union[str, List[str]],
        item_ids: Union[str, List[str]],
        item_intents: Union[str, List[str]] = None,
        additional_properties: Dict[str, Any] = None
    ) -> None:
        items = as_list_str(items)
        item_ids = as_list_str(item_ids)
        
        if item_intents is not None:
            item_intents = as_list_str(item_intents)
            if len(items) != len(item_intents):
                logger.error(
                    f"Unable to insert items as the number of items differs from the number of item intents")
                return

        if len(items) != len(item_ids):
            logger.error(
                f"Unable to insert items as the number of items differs from the number of item IDs")
            return
        
        if collection_type == CollectionType.TEST_CASES:
            collection_name = self.db_name + "_test_cases"
            content_field = "test_case_content"
            id_field = "test_case_id"
        elif collection_type == CollectionType.TEST_PLANS:
            collection_name = self.db_name + "_test_plans"
            content_field = "test_plan_content"
            id_field = "test_plan_id"
        elif collection_type == CollectionType.TEST_SETS:
            collection_name = self.db_name + "_test_sets"
            content_field = "test_set_content"
            id_field = "test_set_id"
            intent_field = "test_set_intent"
        else:
            logger.error(f"Unknown collection type: {collection_type}")
            return

        collection = self.collections.get(collection_name)
        if collection is None:
            logger.error(f"Collection {collection_name} not found")
            return

        with collection.batch.dynamic() as batch:
            for i, (item, item_id) in enumerate(zip(items, item_ids)):
                properties = {
                    id_field: item_id,
                    content_field: item
                }
                
                if collection_type == CollectionType.TEST_SETS and item_intents is not None:
                    properties[intent_field] = item_intents[i]
                
                if additional_properties:
                    properties.update(additional_properties)
                
                batch.add_object(properties=properties)
                
                # You can monitor errors during insertion and break if needed
                if batch.number_errors > 10:  # Set your threshold
                    logger.error("Too many errors during batch import, stopping")
                    break

        # Check for failed objects AFTER the context manager exits
        failed_objects = collection.batch.failed_objects
        if failed_objects:
            logger.error(f"Failed to import {len(failed_objects)} objects")
            for failed in failed_objects:
                logger.error(f"Failed to import object with error: {failed.message}")
        
        logger.info(f"{len(items)} item(s) added to {collection_name} collection")

To add, are these failures intermittent, or do they consistently happen after X number of objects have been inserted?

Looking forward to your response!

Thank you for your response. When I try to insert with this code, I get the following message:

2025-04-29 09:38:59,953 - weaviate-client - ERROR - {'message': 'Failed to send all objects in a batch of 1', 'error': "WeaviateBatchError('Query call with protocol GRPC batch failed with message Deadline Exceeded.')"}
2025-04-29 09:38:59,954 - weaviate-client - ERROR - {'message': 'Failed to send 1 objects in a batch of 1. Please inspect client.batch.failed_objects or collection.batch.failed_objects for the failed objects.'}
2025-04-29 09:38:59,968 - vectordb - ERROR - Failed to import 1 objects
2025-04-29 09:38:59,968 - vectordb - ERROR - Failed to import object with error: WeaviateBatchError('Query call with protocol GRPC batch failed with message Deadline Exceeded.')

I don’t see how this could be an OpenAI timeout issue. As I mentioned, Weaviate seems to have no issue reaching the Vectorizer endpoint when running the query function.

To add, are these failures intermittent, or do they consistently happen after X number of objects have been inserted?

I wouldn’t say they are “intermittent.” The insert function consistently stops working after the deployment has been up for a few minutes.

The query function has never stopped working.

To re-emphasize, for some reason the insert function (which acts essentially as a “wrapper” for the dynamic batch import functionality in Weaviate) cannot reach the OpenAI endpoint! I am trying to figure out why this is.

Looking at the Weaviate logs, it doesn’t seem to be complaining about anything in particular:

time="2025-04-29T15:02:14Z" level=debug msg="building bloom filter took 1.457577ms\n" action=lsm_precompute_disk_segment_build_bloom_filter_primary build_git_commit=b7b7715 build_go_version=go1.22.12 build_image_tag=v1.30.0 build_wv_version=1.30.0 class=Abc_test_sets index=abc_test_sets path=/var/lib/weaviate/abc_test_sets/ogrOO4lPUYqf/lsm/property__id/segment-1745938791196437717_1745938861151397709.db shard=ogrOO4lPUYqf took=1.457577ms
time="2025-04-29T15:02:14Z" level=debug msg="replacing compacted segments took 2.183711ms" action=lsm_replace_compacted_segments_blocking build_git_commit=b7b7715 build_go_version=go1.22.12 build_image_tag=v1.30.0 build_wv_version=1.30.0 class=Abc_test_sets index=abc_test_sets path_left=/var/lib/weaviate/abc_test_sets/ogrOO4lPUYqf/lsm/property__id/segment-1745938791196437717.db path_right=/var/lib/weaviate/abc_test_sets/ogrOO4lPUYqf/lsm/property__id/segment-1745938861151397709.db segment_index=0 shard=ogrOO4lPUYqf took=2.183711ms
time="2025-04-29T15:02:14Z" level=debug msg="building bloom filter took 708.161µs\n" action=lsm_precompute_disk_segment_build_bloom_filter_primary build_git_commit=b7b7715 build_go_version=go1.22.12 build_image_tag=v1.30.0 build_wv_version=1.30.0 class=Abc_test_sets index=abc_test_sets path=/var/lib/weaviate/abc_test_sets/ogrOO4lPUYqf/lsm/property_test_set_id/segment-1745938791196558567_1745938861159752229.db shard=ogrOO4lPUYqf took="708.161µs"
time="2025-04-29T15:02:14Z" level=debug msg="replacing compacted segments took 1.21991ms" action=lsm_replace_compacted_segments_blocking build_git_commit=b7b7715 build_go_version=go1.22.12 build_image_tag=v1.30.0 build_wv_version=1.30.0 class=Abc_test_sets index=abc_test_sets path_left=/var/lib/weaviate/abc_test_sets/ogrOO4lPUYqf/lsm/property_test_set_id/segment-1745938791196558567.db path_right=/var/lib/weaviate/abc_test_sets/ogrOO4lPUYqf/lsm/property_test_set_id/segment-1745938861159752229.db segment_index=0 shard=ogrOO4lPUYqf took=1.21991ms
time="2025-04-29T15:02:14Z" level=debug msg="building bloom filter took 431.487µs\n" action=lsm_precompute_disk_segment_build_bloom_filter_primary build_git_commit=b7b7715 build_go_version=go1.22.12 build_image_tag=v1.30.0 build_wv_version=1.30.0 class=Abc_test_sets index=abc_test_sets path=/var/lib/weaviate/abc_test_sets/ogrOO4lPUYqf/lsm/property_test_set_id_searchable/segment-1745938791196706451_1745938861168560103.db shard=ogrOO4lPUYqf took="431.487µs"
time="2025-04-29T15:02:14Z" level=debug msg="replacing compacted segments took 1.706459ms" action=lsm_replace_compacted_segments_blocking build_git_commit=b7b7715 build_go_version=go1.22.12 build_image_tag=v1.30.0 build_wv_version=1.30.0 class=Abc_test_sets index=abc_test_sets path_left=/var/lib/weaviate/abc_test_sets/ogrOO4lPUYqf/lsm/property_test_set_id_searchable/segment-1745938791196706451.db path_right=/var/lib/weaviate/abc_test_sets/ogrOO4lPUYqf/lsm/property_test_set_id_searchable/segment-1745938861168560103.db segment_index=0 shard=ogrOO4lPUYqf took=1.706459ms
time="2025-04-29T15:02:14Z" level=debug msg="building bloom filter took 911.307µs\n" action=lsm_precompute_disk_segment_build_bloom_filter_primary build_git_commit=b7b7715 build_go_version=go1.22.12 build_image_tag=v1.30.0 build_wv_version=1.30.0 class=Abc_test_sets index=abc_test_sets path=/var/lib/weaviate/abc_test_sets/ogrOO4lPUYqf/lsm/property_test_set_intent/segment-1745938791196822264_1745938861177715881.db shard=ogrOO4lPUYqf took="911.307µs"
time="2025-04-29T15:02:14Z" level=debug msg="replacing compacted segments took 952.697µs" action=lsm_replace_compacted_segments_blocking build_git_commit=b7b7715 build_go_version=go1.22.12 build_image_tag=v1.30.0 build_wv_version=1.30.0 class=Abc_test_sets index=abc_test_sets path_left=/var/lib/weaviate/abc_test_sets/ogrOO4lPUYqf/lsm/property_test_set_intent/segment-1745938791196822264.db path_right=/var/lib/weaviate/abc_test_sets/ogrOO4lPUYqf/lsm/property_test_set_intent/segment-1745938861177715881.db segment_index=0 shard=ogrOO4lPUYqf took="952.697µs"
time="2025-04-29T15:02:14Z" level=debug msg="building bloom filter took 1.11399ms\n" action=lsm_precompute_disk_segment_build_bloom_filter_primary build_git_commit=b7b7715 build_go_version=go1.22.12 build_image_tag=v1.30.0 build_wv_version=1.30.0 class=Abc_test_sets index=abc_test_sets path=/var/lib/weaviate/abc_test_sets/ogrOO4lPUYqf/lsm/property_test_set_content/segment-1745938791196845744_1745938861190070685.db shard=ogrOO4lPUYqf took=1.11399ms
time="2025-04-29T15:02:14Z" level=debug msg="replacing compacted segments took 1.15799ms" action=lsm_replace_compacted_segments_blocking build_git_commit=b7b7715 build_go_version=go1.22.12 build_image_tag=v1.30.0 build_wv_version=1.30.0 class=Abc_test_sets index=abc_test_sets path_left=/var/lib/weaviate/abc_test_sets/ogrOO4lPUYqf/lsm/property_test_set_content/segment-1745938791196845744.db path_right=/var/lib/weaviate/abc_test_sets/ogrOO4lPUYqf/lsm/property_test_set_content/segment-1745938861190070685.db segment_index=0 shard=ogrOO4lPUYqf took=1.15799ms
time="2025-04-29T15:02:14Z" level=debug msg="building bloom filter took 1.645299ms\n" action=lsm_precompute_disk_segment_build_bloom_filter_primary build_git_commit=b7b7715 build_go_version=go1.22.12 build_image_tag=v1.30.0 build_wv_version=1.30.0 class=Abc_test_sets index=abc_test_sets path=/var/lib/weaviate/abc_test_sets/ogrOO4lPUYqf/lsm/property_test_set_intent_searchable/segment-1745938791196970988_1745938861207507887.db shard=ogrOO4lPUYqf took=1.645299ms
time="2025-04-29T15:02:14Z" level=debug msg="replacing compacted segments took 1.547193ms" action=lsm_replace_compacted_segments_blocking build_git_commit=b7b7715 build_go_version=go1.22.12 build_image_tag=v1.30.0 build_wv_version=1.30.0 class=Abc_test_sets index=abc_test_sets path_left=/var/lib/weaviate/abc_test_sets/ogrOO4lPUYqf/lsm/property_test_set_intent_searchable/segment-1745938791196970988.db path_right=/var/lib/weaviate/abc_test_sets/ogrOO4lPUYqf/lsm/property_test_set_intent_searchable/segment-1745938861207507887.db segment_index=0 shard=ogrOO4lPUYqf took=1.547193ms
time="2025-04-29T15:02:14Z" level=debug msg="building bloom filter took 5.756373ms\n" action=lsm_precompute_disk_segment_build_bloom_filter_primary build_git_commit=b7b7715 build_go_version=go1.22.12 build_image_tag=v1.30.0 build_wv_version=1.30.0 class=Abc_test_sets index=abc_test_sets path=/var/lib/weaviate/abc_test_sets/ogrOO4lPUYqf/lsm/property_test_set_content_searchable/segment-1745938791196996175_1745938861245194905.db shard=ogrOO4lPUYqf took=5.756373ms
time="2025-04-29T15:02:14Z" level=debug msg="replacing compacted segments took 2.889829ms" action=lsm_replace_compacted_segments_blocking build_git_commit=b7b7715 build_go_version=go1.22.12 build_image_tag=v1.30.0 build_wv_version=1.30.0 class=Abc_test_sets index=abc_test_sets path_left=/var/lib/weaviate/abc_test_sets/ogrOO4lPUYqf/lsm/property_test_set_content_searchable/segment-1745938791196996175.db path_right=/var/lib/weaviate/abc_test_sets/ogrOO4lPUYqf/lsm/property_test_set_content_searchable/segment-1745938861245194905.db segment_index=0 shard=ogrOO4lPUYqf took=2.889829ms
time="2025-04-29T15:02:15Z" level=debug msg="received HTTP request" action=restapi_request build_git_commit=b7b7715 build_go_version=go1.22.12 build_image_tag=v1.30.0 build_wv_version=1.30.0 method=GET url=/v1/nodes
time="2025-04-29T15:02:16Z" level=debug msg="received HTTP request" action=restapi_request build_git_commit=b7b7715 build_go_version=go1.22.12 build_image_tag=v1.30.0 build_wv_version=1.30.0 method=GET url=/v1/nodes

Hi @Cam_Quilici!!

Do you have any resource usage readings? We have some dashboards for Prometheus and Grafana here: weaviate-local-k8s/manifests/grafana-dashboards at main · weaviate/weaviate-local-k8s · GitHub

Indeed, those logs look normal.

Also, it is important to note that the dynamic batch will calculate its size according to the latency of the last batch. This may overwhelm the server in low-latency scenarios (a dedicated vectorizer in the same cloud/k8s cluster, for example).

A good alternative here is creating a fixed-size batch and tweaking those values.
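
For example, something along these lines (a sketch; the batch size and concurrency values are only starting points to tune for your setup):

    # Fixed-size batching: the client sends a constant number of objects per
    # request instead of growing the batch based on the previous batch's latency.
    with collection.batch.fixed_size(
        batch_size=50,          # objects per request; tune for your vectorizer
        concurrent_requests=2,  # parallel requests; keep low on a single node
    ) as batch:
        for item, item_id in zip(items, item_ids):
            batch.add_object(properties={id_field: item_id, content_field: item})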

On top of that, ASYNC_INDEXING can help too, as it will take its time to index the recently created vectors while still having horsepower to ingest the data.

It is also interesting to check some other env vars, for example GOMEMLIMIT: Environment variables | Weaviate

But as you mentioned, for those 3k objects, Weaviate performance shouldn't be an issue.

Have you tried stress testing that vectorizer? Is it able to vectorize those 3k objects at the expected rate?

So I tried increasing GOMEMLIMIT as well as enabling ASYNC_INDEXING, and these did not solve the issue.

Additionally, using a fixed batch size does not work.

There is something seriously strange happening that is preventing the client from reaching the OpenAI endpoint, and I really cannot find out what that is for the life of me.

My solution (for now) was to just create a “custom” vectorizer (that is, a separate OpenAI client within my wrapper class that generates embeddings) and then use this to add the embeddings to the DB.
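
Roughly, the workaround looks like this (a minimal sketch; the client setup and the embed helper are simplified versions of what is in my actual wrapper class):

    from typing import List

    from openai import OpenAI

    # A separate OpenAI-compatible client pointed directly at the inference
    # service (most self-hosted servers ignore the API key).
    oai = OpenAI(base_url=embedding_model_base_url, api_key="unused")

    def embed(texts: List[str]) -> List[List[float]]:
        resp = oai.embeddings.create(model=embedding_model_name, input=texts)
        return [d.embedding for d in resp.data]

    # Insert with precomputed vectors so Weaviate never calls the vectorizer
    # module itself.
    vectors = embed(items)
    with collection.batch.fixed_size(batch_size=100) as batch:
        for item, item_id, vec in zip(items, item_ids, vectors):
            batch.add_object(
                properties={id_field: item_id, content_field: item},
                vector=vec,
            )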

Would like to get this solved eventually, but it seems too cryptic at the moment.

I suspect the timeout from Weaviate to your custom inference model is, at least at some point, bigger than 50 seconds, which is the default for the MODULES_CLIENT_TIMEOUT env var.

But that would probably bubble up in the server logs.

Can you try increasing that env var?

Note that whenever you find yourself tweaking Weaviate's default timeouts, it is probably because you need to allocate more resources, or you may have a network latency issue.
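
Also worth noting: the "Deadline Exceeded" in your logs is the Python client's own per-call gRPC timeout, which can be raised independently of the server-side settings when the connection is created (a sketch, assuming the v4 client; the values are illustrative):

    import weaviate
    from weaviate.classes.init import AdditionalConfig, Timeout

    client = weaviate.connect_to_local(  # or connect_to_custom for a k8s service
        additional_config=AdditionalConfig(
            timeout=Timeout(init=30, query=120, insert=600)  # seconds
        )
    )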

I was hopeful that this could be the solution, but alas no.

I increased MODULES_CLIENT_TIMEOUT to 10m and also increased the --read-timeout and --write-timeout arguments to 600s.

After some further investigation, it seems that the batch upload of my 2575 records is halted at basically the same point on each failure. I added a tqdm progress bar to track the items as they are uploaded, and it consistently gets stuck here:

Inserting items to vector db:  67%|███████████████████████████████████████████████████████████████████████▏                                  | 1728/2575 [01:26<00:25, 33.70it/s]

So far, I think we have deduced some important things:

  1. This is not an issue with the actual inference server because when I create a “custom” vectorizer, I have no issues
  2. This is not a timeout issue since all the relevant timeout variables are sufficiently large
  3. The upload breaks at the same point each time I upload these particular 2575 records
  4. Once this happens, I have to completely restart the Weaviate deployment before insertions work again
    • Weirdly enough, despite the insert functionality being broken, the Vectorizer can still be reached on a query call

Thinking through these things, I suspect that something is wrong (perhaps a bug? perhaps negligence on my part?) with the batch uploading of data (note: I did try a fixed batch size, and that did not work either).

What further investigation can I do?

If we could create an MRE (minimal reproducible example) that I could run on my end, it would help.
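
Something along these lines would be a good starting point (a sketch; the collection name, model name, and endpoint are placeholders for your setup):

    import weaviate
    from weaviate.classes.config import Configure, DataType, Property

    client = weaviate.connect_to_local()  # adjust to how you reach your cluster

    # Start from a clean collection each run.
    if client.collections.exists("MreTest"):
        client.collections.delete("MreTest")

    # Same vectorizer setup as the failing collection, pointed at the
    # in-cluster inference service (placeholder URL and model name).
    col = client.collections.create(
        name="MreTest",
        vectorizer_config=Configure.Vectorizer.text2vec_openai(
            model="my-embedding-model",
            base_url="http://my-inference-service.default.svc:8000",
        ),
        properties=[Property(name="content", data_type=DataType.TEXT)],
    )

    # Insert ~3000 synthetic objects and note the point where it stalls.
    with col.batch.fixed_size(batch_size=100) as batch:
        for i in range(3000):
            batch.add_object(properties={"content": f"synthetic test document {i}"})

    print(f"{len(col.batch.failed_objects)} failed objects")
    client.close()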

You only see this issue specifically with text2vec_openai pointed at a custom inference model, right? If you use OpenAI itself, or Ollama, etc., there are no issues?

Also, make sure to set the log level to trace and monitor for any unusual logs while the ingestion is taking place.

The text2vec-openai module has some rate-limit handling, so I am not sure whether that could be a factor.

How are you running that inference model? We have a module for KubeAI, by the way :slight_smile:

Yeah, this must be some sort of bug related to rate-limiting. That’s the only plausible explanation given the circumstances.

Right now, I have a valid workaround so I’m going to stop spending time debugging this. I may get back to this sometime next week.

It is kind of disappointing that this doesn't just "work."

Yeah, I feel you!

But worry not, we will prevail!

Happy coding :slight_smile:

Hey @Cam_Quilici - this definitely shouldn't happen - would you be willing to test this with a custom build? Then I would add more logging to the vectorizer and we could try to narrow down the problem - feel free to ping me on Slack.

@Dirk Pinged, thanks.