Autoscaling and RPS issues

Hi Team,

We are currently on the HA $75 base plan. During load testing, we noticed that at 15 RPS, response times spike from a P90 of 500ms to a maximum of 3 seconds. At 50 RPS, around 20% of requests time out, and any higher RPS makes our app unusable.

How can we handle high RPS without sacrificing usability? These issues at low RPS levels (below 1K) are concerning and make us hesitant about using Weaviate dedicated hosting.

Is this expected behavior? Our P75 is 300ms, so it doesn’t seem like our queries are inherently costly. It appears to be an infrastructure issue.

Can someone help me understand this?

Thank you.

Hi @Chase_Norton !

For any support on our cloud, the best place to ask is by sending an email to

With that, we can identify your cluster and take a closer look at it. You will also get a quicker response. Sorry for the delay here.

For critical deployments, or deployments that require performance at scale, we advise reaching out to our team so we can better understand your requirements.

To increase RPS, you need horizontal scaling:

One thing to check is the replication factor of your collections. Also check the resources allocated to each node, and whether the nodes are using roughly the same amount of resources.
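As a rough sketch of the replication check, assuming the v4 Python client (`weaviate-client`), something like the following could list the collections whose replication factor is below a threshold. The helper names `under_replicated` and `fetch_configs`, the `min_factor=2` default, and the `WEAVIATE_URL` / `WEAVIATE_API_KEY` environment variable names are illustrative assumptions, not part of any official tooling.

```python
import os
from typing import Mapping


def under_replicated(configs: Mapping[str, object], min_factor: int = 2) -> list[str]:
    """Return the names of collections whose replication factor is below min_factor.

    `configs` maps collection name -> config object exposing
    `replication_config.factor` (the shape the v4 client is assumed to return).
    """
    names = []
    for name, cfg in configs.items():
        if cfg.replication_config.factor < min_factor:
            names.append(name)
    return sorted(names)


def fetch_configs(cluster_url: str, api_key: str):
    """Fetch full collection configs from a Weaviate Cloud cluster."""
    # Imported lazily so under_replicated() works without the client installed.
    import weaviate
    from weaviate.classes.init import Auth

    client = weaviate.connect_to_weaviate_cloud(
        cluster_url=cluster_url,
        auth_credentials=Auth.api_key(api_key),
    )
    try:
        # simple=False requests the full config, which includes replication settings.
        return client.collections.list_all(simple=False)
    finally:
        client.close()


if __name__ == "__main__":
    # Hypothetical environment variable names; adjust to your setup.
    configs = fetch_configs(os.environ["WEAVIATE_URL"], os.environ["WEAVIATE_API_KEY"])
    for name in under_replicated(configs):
        print(f"{name}: replication factor below 2")
```

A collection with a replication factor of 1 lives on a single node, so adding nodes alone would not spread its read load. The threshold of 2 is only a starting point for the check.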

Please, reach us at and we’ll figure that out and take a look at your metrics.