Hello folks and @DudaNogueira
I have a Parquet file with embeddings stored in it. The raw size of the Parquet file is 44 GB, but after inserting it into the DB, the footprint on the persistent data path is only 40 GB.
My persistent data path is set to an NFS share mounted on the Weaviate DB server, which runs as a single node. My expectation was that everything would be stored there after insertion. To my surprise, the size of the directory after insertion is smaller than the file size, and I have not enabled PQ or BQ compression.
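For anyone who wants to reproduce the comparison, here is a minimal sketch of how the two sizes can be measured. It only sums apparent file sizes under a directory and is not how Weaviate itself reports usage. The function names are illustrative, and both paths are supplied on the command line because I am not assuming any particular layout.

```python
import os
import sys


def dir_size_bytes(root: str) -> int:
    """Sum the apparent sizes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks so nothing is counted twice
            try:
                total += os.path.getsize(path)
            except OSError:
                pass  # file vanished or is unreadable; ignore it
    return total


def to_gb(n_bytes: int) -> float:
    """Convert bytes to decimal gigabytes (1 GB = 10**9 bytes)."""
    return n_bytes / 1e9


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python compare_sizes.py <parquet_file> <persistence_dir>
    parquet_file, data_dir = sys.argv[1], sys.argv[2]
    print(f"parquet file : {to_gb(os.path.getsize(parquet_file)):.2f} GB")
    print(f"data path    : {to_gb(dir_size_bytes(data_dir)):.2f} GB")
```

Note that this reports apparent file sizes, which can differ from what `du` shows for allocated blocks, so the two numbers should be taken with the same tool before comparing them.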
Questions:
- Does Weaviate use local server cache storage plus persistent data storage in some split, such as 70/30? I ask because I can see some data written to the local cache.
- How does Weaviate store its vectors inside a collection? Does it perform any compression by default? In my configuration, PQ and BQ are disabled, so I am wondering whether Weaviate applies some quantization or data compression technique to reduce the footprint.
Any pointers on how collections are stored and retrieved would be appreciated.