How is storage footprint reduced after inserting vectors into Weaviate

Adi_Sra_Ga · October 24, 2024, 6:16am

I have a parquet file with embeddings stored in it. The raw size of the parquet file is 44GB, but after inserting the data into the DB, the footprint on the persistent data path is just 40GB.

My persistent data path is set to an NFS share mounted on the Weaviate DB server, which runs as a single node. My expectation was that everything would be stored after insertion. To my surprise, the size of the directory after insertion is smaller than the file size, and I have not enabled any PQ or BQ compression.

Questions:

Does Weaviate use local server cache storage plus persistent data storage, split something like 70/30? I ask because I could see some data written to the local cache.
How does Weaviate store its vectors inside the collection? Does it perform any compression by default? In my configuration PQ and BQ are disabled.

So I am wondering whether Weaviate applies some sort of quantization or data compression technique to get a smaller footprint.

Any pointers on how the collections are stored and retrieved would be appreciated.

DudaNogueira · November 1, 2024, 2:40pm

hi @Adi_Sra_Ga !!

Sorry for the delay here

I am assuming that this parquet already includes the vector, right?

There are some possible explanations here.

For example, Weaviate uses 32-bit floats for storing vectors, so if you have 64-bit float arrays in the parquet file, that could explain things.
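
To get a feel for how much the float width alone can matter, here is a rough back-of-the-envelope sketch. The vector count and dimensionality are made-up numbers for illustration (they are not from your dataset), and the calculation only covers the raw vector values, not indexes, object properties, or parquet's own encoding and compression.

```python
import struct

def raw_vector_bytes(num_vectors: int, dimensions: int, bytes_per_value: int) -> int:
    """Size of the raw vector values only (no index, metadata, or file overhead)."""
    return num_vectors * dimensions * bytes_per_value

# Standard sizes of IEEE 754 single and double precision floats
FLOAT32_BYTES = struct.calcsize("<f")  # 4 bytes
FLOAT64_BYTES = struct.calcsize("<d")  # 8 bytes

# Hypothetical dataset, chosen only for illustration
num_vectors = 10_000_000
dimensions = 768

size_f64 = raw_vector_bytes(num_vectors, dimensions, FLOAT64_BYTES)
size_f32 = raw_vector_bytes(num_vectors, dimensions, FLOAT32_BYTES)

print(f"as 64-bit floats: {size_f64 / 1e9:.1f} GB")
print(f"as 32-bit floats: {size_f32 / 1e9:.1f} GB")
```

The same vectors take half the space as 32-bit floats, so a source file that holds 64-bit values can easily be larger than what ends up on disk.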

Also, your parquet file may have some extra padding, spaces, etc.
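
If you want to compare the two footprints on equal terms, a small helper like the one below can sum up the file sizes under a directory. This is just a sketch: `directory_size_bytes` is a name I made up, and it reports apparent file sizes, which can differ from what `du` shows on some filesystems.

```python
import os

def directory_size_bytes(path: str) -> int:
    """Sum the apparent sizes of all regular files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            # Skip symlinks so a linked file is not counted twice
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total
```

You could call it on your persistent data path and compare the result with `os.path.getsize()` of the parquet file.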

Let me know if this helps.
