Description
I’m using the Weaviate TypeScript client on Node.js 21.7.1.
I use the batch import logic (batcher.withObject() followed by batcher.do()) to loop through a JSON dataset and load every record into the vector database, which calculates embeddings with the OpenAI vectorizer.
The dataset keeps growing: new entries are added while the old ones are kept. Given that, I don’t want to re-calculate the vector embeddings for records I have already imported (a waste of time and money). So I used Weaviate’s generateUuid5 to generate a deterministic id for each object, assuming that when the batch import encounters an existing id it would skip that object. However, that doesn’t seem to be the case.
I can tell because no exception is thrown, and the objects show a new creationTimeUnix and lastUpdateTimeUnix after every import run:
 {
    class: 'BlogPosts',
    creationTimeUnix: 1710143025495,
    id: '8881dee0-9ad3-5940-96c9-b00ebcdf487d',
    lastUpdateTimeUnix: 1710143025495,
    properties: {
Any ideas on how to do a batch import that skips objects already in the collection, so their embeddings aren’t re-calculated?
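To make the goal concrete, here is a minimal sketch of one possible workaround: check which ids already exist and only hand the missing ones to the batcher. It assumes the batch call overwrites objects whose id already exists, which would match the changing timestamps shown above. The names ImportObject, ExistsFn and selectNewObjects are hypothetical helpers, not part of the Weaviate client, and the wiring shown in the trailing comments is an assumption about how the client's existence check would be plugged in.

```typescript
// Hypothetical minimal shape of an object to import; adapt to the real types.
interface ImportObject {
  id: string; // deterministic id, e.g. produced by generateUuid5
  class: string;
  properties: Record<string, unknown>;
}

// Hypothetical callback that answers "is this id already stored?".
type ExistsFn = (className: string, id: string) => Promise<boolean>;

// Splits candidates into objects that still need importing and objects
// that are already stored and can be skipped.
async function selectNewObjects(
  candidates: ImportObject[],
  exists: ExistsFn,
): Promise<{ toImport: ImportObject[]; skipped: ImportObject[] }> {
  const toImport: ImportObject[] = [];
  const skipped: ImportObject[] = [];
  for (const obj of candidates) {
    if (await exists(obj.class, obj.id)) {
      skipped.push(obj);
    } else {
      toImport.push(obj);
    }
  }
  return { toImport, skipped };
}

// Assumed wiring with the TypeScript client (not verified here):
//   const exists: ExistsFn = (className, id) =>
//     client.data.checker().withClassName(className).withId(id).do();
//   const { toImport } = await selectNewObjects(candidates, exists);
//   for (const obj of toImport) batcher = batcher.withObject(obj);
//   await batcher.do();
```

This costs one existence check per object, so for a large dataset it may be worth fetching the stored ids in bulk instead and filtering against that set locally.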
Server Setup Information
I’m using the Weaviate cloud hosted service.