Issue (Cache Invalidation?) with Bulk Deletion of Objects using Filters

dhanshew72 · October 16, 2025, 11:26pm

Description

I’m attempting to remove records using a delete_many call like below:

...
db_filter = Filter.by_property("file_id").equal("the_file_id")
collection.data.delete_many(where=db_filter)
...

The delete succeeds, and a subsequent fetch_objects call with the appropriate filter no longer returns the record. However, the record occasionally pops back up in hybrid searches and partially re-enters the database. I’ve been able to reproduce this a few times and haven’t found a solution. I’m deleting at most 1000 objects per call.
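One way to narrow down where a deleted record resurfaces is to run the delete and then check both a filtered fetch and a filtered hybrid search in one step. The sketch below assumes a Python client v4 collection handle that is already scoped to the right tenant. The helper name delete_and_check, the query text, and the limit of 10 are hypothetical choices for illustration.

```python
import time


def delete_and_check(collection, where, query_text, wait_s=0.0):
    """Delete by filter, then look for leftovers via fetch and hybrid search.

    Assumptions: `collection` is a weaviate-client v4 collection handle
    (tenant-scoped when multi-tenancy is enabled) and `where` is a Filter.
    """
    # verbose=True asks for per-object results in addition to the counts.
    result = collection.data.delete_many(where=where, verbose=True)

    # Optional pause before reading back.
    if wait_s > 0:
        time.sleep(wait_s)

    # Run the same filter through both query paths and compare.
    fetched = collection.query.fetch_objects(filters=where, limit=10)
    hybrid = collection.query.hybrid(query=query_text, filters=where, limit=10)

    return {
        "matched": result.matches,
        "deleted": result.successful,
        "failed": result.failed,
        "left_in_fetch": [obj.uuid for obj in fetched.objects],
        "left_in_hybrid": [obj.uuid for obj in hybrid.objects],
    }
```

With the filter from the post, this could be called as delete_and_check(collection, db_filter, "some query text"). A non-empty left_in_hybrid alongside an empty left_in_fetch would match the symptom described above. If both lists are empty right after the delete and fill up again later, something is probably writing the objects back.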

What I’ve tried

Removing by UUID after looking up all the records
Upgrading Weaviate to the latest version (1.33). I saw this reported for 1.24 in "Can not delete all objects", but judging by release dates my version seems fine
Verifying the delete by calling fetch_objects after a sleep call; the record still came up in subsequent queries
Setting consistency_level to ALL on every query

Any suggestions/ideas would be appreciated.

Server Setup Information

Weaviate Server Version: 1.33.0 (happened on 1.25.0 & 1.24.26 as well)
Deployment Method: ECS
Multi Node? Number of Running Nodes: 5
Client Language and Version: Python with 4.10.4
Multitenancy?: Yes

Any additional Information

I upgraded from 1.24 to 1.25 but still saw the issue. I then jumped to 1.33 (not on Kubernetes); the upgrade looked fine after some smoke tests, but I still see this issue on deletes.

maryannc · October 16, 2025, 11:51pm

Hi @dhanshew72 ,

Good Day!

Welcome to Weaviate Community!

It seems that your experience matches known issues with deletion consistency in Weaviate, especially in replicated or high-concurrency environments. Setting the deletion resolution strategy to either DeleteOnConflict or TimeBasedResolution should help resolve your case.

This is how you can update your deletionStrategy:

from weaviate.classes.config import Reconfigure, ReplicationDeletionStrategy

# Use the collection you want to update
articles = client.collections.use("Article")

# Update the deletion strategy
articles.config.update(
    replication_config=Reconfigure.replication(
        deletion_strategy=ReplicationDeletionStrategy.TIME_BASED_RESOLUTION  # or DELETE_ON_CONFLICT
    ),
)

Hope this helps.

dhanshew72 · October 17, 2025, 12:07am

I’m curious how this would resolve the issue. The data seems to come back on lookup after a deletion, and I can verify that the deletion itself works. I only have one replica, so this looks like the cache and the file system being out of sync. Is uploading identical data a problem in earlier versions of Weaviate?

maryannc · October 17, 2025, 12:45am

Updating the deletionStrategy setting should help, as the default value is NoAutomatedResolution. With this setting, deletion conflicts are not treated as a special case: if an object is deleted on one replica but still exists on another, it could potentially be restored. You can find more details on this behavior here.
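To make the difference between the three strategies concrete, here is a toy model of a single conflict. It is not Weaviate's implementation: the function name, the integer timestamps, and the two-replica setup are assumptions for illustration. The DeleteOnConflict and TimeBasedResolution branches reflect my reading of the documented behavior (the delete always wins, or the more recent operation wins).

```python
from enum import Enum


class DeletionStrategy(Enum):
    NO_AUTOMATED_RESOLUTION = "NoAutomatedResolution"
    DELETE_ON_CONFLICT = "DeleteOnConflict"
    TIME_BASED_RESOLUTION = "TimeBasedResolution"


def object_may_survive(strategy: DeletionStrategy, deleted_at: int, updated_at: int) -> bool:
    """Toy model: replica A recorded a delete at `deleted_at`, while
    replica B still holds a copy last written at `updated_at`.

    Returns True if the object can end up present after repair.
    """
    if strategy is DeletionStrategy.NO_AUTOMATED_RESOLUTION:
        # The delete gets no special treatment, so the surviving copy
        # can be propagated back to the replica that deleted it.
        return True
    if strategy is DeletionStrategy.DELETE_ON_CONFLICT:
        # The delete always wins, whatever the order of operations.
        return False
    # TIME_BASED_RESOLUTION: the more recent operation wins.
    return updated_at > deleted_at
```

None of this applies when the replication factor is 1, because there is no second replica to disagree with.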

In addition to the deletionStrategy, it’s considered best practice in a multi-node setup to ensure that replication settings, such as asynchronous replication, are properly configured. More on that can be found here.

Lastly, the latest version of the Weaviate Python client is v4.17.0, while you are on 4.10.4. It’s recommended to keep both the client and server versions up to date to benefit from the latest fixes and improvements. Release notes here.

Let me know if you’d like help updating the configuration or client version.

dhanshew72 · October 17, 2025, 12:59am

To clarify, I have a replication factor of 1, does that impact it?

maryannc · October 17, 2025, 2:22am

I see. Thanks for clarifying. Yes, if the replication factor is 1 for your collections, then there is no replication happening. Could you confirm what the sleep value was set to? You could try increasing it.

If you are targeting a multi-node setup (5 nodes), it would be best to check the replication settings as recommended for data consistency. Also, please note that the replication factor cannot be updated since v1.32, as replica movement has been implemented. You can read about replication factor here: Cluster Architecture | Weaviate Documentation
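To confirm the replication factor from the client side, the collection configuration can be read back. This is a minimal sketch assuming a Python client v4 collection handle; the helper name replication_summary is hypothetical.

```python
def replication_summary(collection):
    """Report the configured replication factor of a collection.

    Assumption: `collection` is a weaviate-client v4 collection handle.
    """
    replication = collection.config.get().replication_config
    return {
        "factor": replication.factor,
        # With a factor of 1 there is a single copy of each shard,
        # so replica-level conflicts cannot occur.
        "replicated": replication.factor > 1,
    }
```

For the setup in this thread, the result would be expected to show a factor of 1 and replicated as False.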

dhanshew72 · October 17, 2025, 4:50pm

The sleep value is currently the default of 90 seconds. It doesn’t seem to be related to that: the deletes are working, but the data comes up again in subsequent searches after the delete has happened.

I’m thinking it’s a caching/file system issue where data isn’t fully deleted or marked as deleted properly.

dhanshew72 · October 22, 2025, 6:11pm

Tested this out. Unfortunately, it didn’t solve my issue. I think I found a pattern: if the file is a duplicate, i.e. you delete it and then re-upload it, the record pops back up. Not sure if that’s the exact cause. The only workaround I have is updating the record to empty out all its fields, which has had some success.

dhanshew72 · October 23, 2025, 6:59pm

Oh my, I figured it out. I had a weird condition in my code that would re-upload objects that could’ve been deleted. That’s on me, thanks for the help.
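The thread does not show the offending condition, so the following is only a generic illustration of one way to guard an ingestion path against re-uploading deleted data. Everything here is hypothetical: the DeletedFileRegistry class, the in-memory set, and the record shape with a file_id key. A real pipeline would need to persist the registry.

```python
class DeletedFileRegistry:
    """Remembers which file_ids were deleted on purpose (in memory only)."""

    def __init__(self):
        self._deleted = set()

    def mark_deleted(self, file_id):
        self._deleted.add(file_id)

    def allow_reupload(self, file_id):
        # Call this when a file is intentionally uploaded again.
        self._deleted.discard(file_id)

    def is_deleted(self, file_id):
        return file_id in self._deleted


def filter_uploads(records, registry):
    """Drop records whose file_id was deleted and not explicitly re-allowed."""
    return [r for r in records if not registry.is_deleted(r["file_id"])]
```

The idea is to call mark_deleted right after a successful delete_many and to pass every upload batch through filter_uploads before inserting. A stray code path can then no longer bring deleted objects back silently.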

