Server specs and setup for production

Hi Weaviate community,

I want to set up a Weaviate vector DB for a small production environment. What server specs and setup would you recommend to handle peak usage of up to 100 queries per second?
Object count will probably stay below 1M for a while.
Would a single Docker setup be sufficient for that, or should I be looking at Kubernetes (seems overkill)? What about CPU and RAM recommendations?

These posts give a good overview, but I’m looking for answers on how many requests a single node can handle with which hardware.

Hi @olaf-ho :wave:

What is the dimensionality of your data? For now, let’s assume it’s in the ~100d ballpark. In general, pretty much any setup should be able to handle 100 queries per second (qps) on 1M 100d objects.

For reference, see the figure below: every AWS and GCP machine tested achieved over 100 qps on the SIFT1M dataset (1M objects, each 128-dimensional), even single-threaded.

Hi @zainhas, I’m using OpenAI’s text-embedding-ada-002, which has 1536 dimensions if I understand correctly.

What would you recommend in terms of CPU and RAM? I’ll probably choose GCP for hosting.

n2-standard-4 or n2-standard-8 for the best performance/cost ratio.

Hi @etiennedi @zainhas, any suggestions for Azure VMs?

Hi @Kavali_Kranthi_Kumar !

One nice way to calculate the resource usage is to just ask Verba:

https://verba.weaviate.io/

That will give you an estimate of the memory consumption. With that in hand, you can select an appropriate VM size from your cloud provider.
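For a quick sanity check without any tool, here is a rough back-of-the-envelope sketch. It assumes float32 vectors (4 bytes per dimension) and a 2x overhead factor for the HNSW index, which is a commonly quoted rule of thumb and not an exact figure. The function name and parameters are made up for illustration, and the result ignores object payloads, the OS, and any other processes.

```python
def estimate_vector_memory_gib(num_objects, dimensions,
                               bytes_per_value=4, overhead_factor=2.0):
    """Rough RAM estimate for holding vectors plus index in memory.

    Assumptions (not measured): float32 values and a flat 2x overhead
    factor for the vector index.
    """
    raw_bytes = num_objects * dimensions * bytes_per_value
    return raw_bytes * overhead_factor / 1024**3


# 1M objects with 1536-dimensional embeddings (text-embedding-ada-002)
print(round(estimate_vector_memory_gib(1_000_000, 1536), 1))  # 11.4
```

Under these assumptions, 1M objects at 1536 dimensions come out at roughly 11.4 GiB, so pick a VM with comfortable headroom above that and verify against real memory usage once data is loaded.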

:slight_smile: