After uploading 600K objects (30 hours), I have to face the fact that all the data has been deleted from my Weaviate DB. I have no control over this. When I query the data, only 100 records are left in my database. Oops. Everything is gone.
Here are some details I noticed:
- I run Weaviate in a Docker container on Windows 10, and the setup looks fine from my side. I cannot upload logs because only JPG attachments are allowed.
- During the upload, the folder size of my local data directory went up to 3.8GB, then down to 2.3GB, up again to 3.6GB, and now it is 109MB >>> 100 records/objects left.
- The weaviate docker container was running all the time, no interruptions or stops.
- The upload never stopped, I did not see any errors and I used the upload parameters carefully (small batch size).
- The entire upload looked fine. When VS Code finished and terminated my script, all data was gone immediately afterwards.
- This has happened 3 times now with different settings. I always thought I had made some kind of mistake. But now I can say: this is systematic behaviour.
Any help would be greatly appreciated. I think I have reached the point of no return. Why is all the data deleted when the upload is finished? Well. Maybe someone has an idea.
Many thanks, Patrik.
Can you share with us the code you run for your upload?
(You don’t need to share the details of your data, but it would be good to see what you call and how.)
Also, can you share with us your
Can you run the following code and share with us what you get?
(Note: please change YourCollectionName to the name of your collection.)
result = (
print("Object count: ", result["data"]["Aggregate"])
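A count check of this kind might look like the sketch below. It assumes the v3 Python client's aggregate query builder and its response layout; the helper name count_objects and the client variable are hypothetical and do not come from this thread.

```python
def count_objects(client, collection_name: str) -> int:
    """Return how many objects one collection holds (v3-style aggregate query).

    `client` is assumed to be an already connected Weaviate v3 Python client.
    """
    result = (
        client.query
        .aggregate(collection_name)
        .with_meta_count()
        .do()
    )
    # Assumed response shape:
    # {"data": {"Aggregate": {<collection_name>: [{"meta": {"count": N}}]}}}
    return result["data"]["Aggregate"][collection_name][0]["meta"]["count"]
```

Calling count_objects(client, "YourCollectionName") once during the upload and once after it would show whether the count really drops or simply never grows past a fixed number.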
Were you able to solve this?
Let me know if we can help you.
It was the uuid5() function that was causing the problems. Since I removed it completely, the upload has worked fine. I still don’t quite understand it. Anyway, I can finally move on and start doing what is nice to do with a vector database.
This kind of bug is hard to debug. But this old trick helped me overcome my bias and objectively check each line for possible sources of error. I started with a few lines of working code and added more complexity as I tested line by line.
Many, many thanks for getting in touch and have a nice day. Patrik.
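Sounds like the generate_uuid5 function was generating the same UUID for multiple objects because it received the same input for each of them. Objects that share a UUID overwrite one another instead of being added, which would explain why only about 100 were left.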
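To illustrate the effect, here is a minimal sketch that uses Python's standard uuid.uuid5 as a stand-in for Weaviate's helper, whose details may differ. The records, the "category" field, and the dict acting as a database are all hypothetical and are not taken from the original upload script.

```python
import uuid


def make_id(identifier) -> str:
    """Deterministic UUID: the same input always yields the same UUID."""
    return str(uuid.uuid5(uuid.NAMESPACE_DNS, str(identifier)))


# Hypothetical data: 1,000 records, but only 100 distinct "category" values.
records = [{"row": i, "category": f"cat-{i % 100}"} for i in range(1000)]

# A dict keyed by UUID stands in for the database:
# writing to an existing key overwrites the earlier object.
store_by_category = {}
for rec in records:
    store_by_category[make_id(rec["category"])] = rec

# Deriving the UUID from a value that is unique per record avoids the collisions.
store_by_row = {}
for rec in records:
    store_by_row[make_id(rec["row"])] = rec

print(len(store_by_category))  # 100
print(len(store_by_row))       # 1000
```

If deterministic IDs are wanted, the input to the UUID function needs to be unique per object, for example the whole record or a primary key rather than a field with repeating values.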
Glad to hear you’ve resolved it!