Memory Leak with S3 Backup Module: Memory Usage Remains High After Backups

Description

We have the backup-s3 module enabled, and after a few daily backups the memory usage increases and is never released. We have already set the environment variable GOMEMLIMIT to 4GiB, but it does not seem to have any effect (a quick check against the metrics endpoint is sketched after the variable list below). We have only one collection, with a shard count of 3 and a replication factor of 2.

We are also attaching a screenshot of seven days of pod memory usage:

These are the environment variables set on the pods:

AWS_FORCE_PATH_STYLE : false
BACKUP_S3_BUCKET : weaviate-backups
BACKUP_S3_ENDPOINT : *******
BACKUP_S3_USE_SSL : false
CLUSTER_DATA_BIND_PORT : 7001
CLUSTER_GOSSIP_BIND_PORT : 7000
CLUSTER_JOIN : weaviate-headless.prod.svc.cluster.local.
DEFAULT_VECTORIZER_MODULE : none
ENABLE_MODULES : backup-s3
GOGC : 100
GOMEMLIMIT : 4GiB
PERSISTENCE_DATA_PATH : /var/lib/weaviate
PROMETHEUS_MONITORING_ENABLED : true
PROMETHEUS_MONITORING_GROUP : false
QUERY_MAXIMUM_RESULTS : 100000
RAFT_BOOTSTRAP_EXPECT : 3
RAFT_BOOTSTRAP_TIMEOUT : 600
RAFT_JOIN : weaviate-0,weaviate-1,weaviate-2
REINDEX_VECTOR_DIMENSIONS_AT_STARTUP : false
TRACK_VECTOR_DIMENSIONS : true
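
Since Prometheus monitoring is enabled, something like the sketch below can show whether the Go heap itself stays within GOMEMLIMIT, or whether the growth is happening outside the Go heap (where GOMEMLIMIT has no effect). This assumes the standard go_memstats_* runtime metrics are exposed; 2112 is the default monitoring port, and the hostname is a placeholder:

import re
import requests

# Rough check against one pod's metrics endpoint: if go_memstats_heap_inuse_bytes
# stays at or below GOMEMLIMIT while the pod RSS keeps growing, the growth is
# happening outside the Go heap and GOMEMLIMIT cannot help.
# Assumptions: default monitoring port 2112, placeholder hostname.
METRICS_URL = "http://weaviate-0.weaviate-headless.prod.svc.cluster.local:2112/metrics"

def read_metric(text: str, name: str) -> float:
    # Parse a single unlabeled sample from the Prometheus text exposition format.
    match = re.search(rf"^{re.escape(name)}\s+(\S+)$", text, flags=re.MULTILINE)
    return float(match.group(1)) if match else float("nan")

body = requests.get(METRICS_URL, timeout=10).text
heap_in_use = read_metric(body, "go_memstats_heap_inuse_bytes")
runtime_sys = read_metric(body, "go_memstats_sys_bytes")
print(f"Go heap in use:                {heap_in_use / 1024 ** 3:.2f} GiB")
print(f"Memory held by the Go runtime: {runtime_sys / 1024 ** 3:.2f} GiB")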

Server Setup Information

  • Weaviate Server Version: 1.27.0
  • Deployment Method: K8S
  • Multi Node? Number of Running Nodes: 3
  • Client Language and Version: Python V3
  • Multitenancy?: disabled

Any additional Information

Here is the schema configuration:

{
  "class": "Books",
  "invertedIndexConfig": {
    "bm25": {
      "b": 0.75,
      "k1": 1.2
    },
    "cleanupIntervalSeconds": 60,
    "stopwords": {
      "additions": null,
      "preset": "en",
      "removals": null
    }
  },
  "multiTenancyConfig": {
    "autoTenantActivation": false,
    "autoTenantCreation": false,
    "enabled": false
  },
  "properties": [
    {
      "dataType": [
        "text"
      ],
      "indexFilterable": true,
      "indexRangeFilters": false,
      "indexSearchable": true,
      "name": "text",
      "tokenization": "word"
    },
    {
      "dataType": [
        "number"
      ],
      "indexFilterable": true,
      "indexRangeFilters": false,
      "indexSearchable": false,
      "name": "bookId"
    }
  ],
  "replicationConfig": {
    "asyncEnabled": false,
    "deletionStrategy": "DeleteOnConflict",
    "factor": 2
  },
  "shardingConfig": {
    "actualCount": 3,
    "actualVirtualCount": 384,
    "desiredCount": 3,
    "desiredVirtualCount": 384,
    "function": "murmur3",
    "key": "_id",
    "strategy": "hash",
    "virtualPerPhysical": 128
  },
  "vectorIndexConfig": {
    "bq": {
      "enabled": false
    },
    "cleanupIntervalSeconds": 300,
    "distance": "cosine",
    "dynamicEfFactor": 8,
    "dynamicEfMax": 500,
    "dynamicEfMin": 100,
    "ef": -1,
    "efConstruction": 128,
    "filterStrategy": "sweeping",
    "flatSearchCutoff": 40000,
    "maxConnections": 32,
    "pq": {
      "bitCompression": false,
      "centroids": 256,
      "enabled": false,
      "encoder": {
        "distribution": "log-normal",
        "type": "kmeans"
      },
      "segments": 0,
      "trainingLimit": 100000
    },
    "skip": false,
    "sq": {
      "enabled": false,
      "rescoreLimit": 20,
      "trainingLimit": 100000
    },
    "vectorCacheMaxObjects": 1000000000000
  },
  "vectorIndexType": "hnsw",
  "vectorizer": "none"
}

Hi @David_Mane!!

Welcome to our community :hugs:

There is an open issue for this, and our team has it on its radar.

One option to mitigate this is to tweak the cpuPercentage used for backups.
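
In case it's useful, here is a rough sketch of how that can be passed when triggering a backup through the REST API; the hostname and backup id are placeholders, and CPUPercentage caps how much CPU the backup work is allowed to use:

import requests

# Rough sketch: create an S3 backup while capping the CPU the backup may use.
# CPUPercentage goes in the optional "config" object of POST /v1/backups/{backend}.
# The hostname, port, and backup id below are placeholders for your setup.
WEAVIATE_URL = "http://weaviate.prod.svc.cluster.local:8080"

response = requests.post(
    f"{WEAVIATE_URL}/v1/backups/s3",
    json={
        "id": "daily-backup-2025-01-01",   # any unique backup id
        "config": {
            "CPUPercentage": 40,           # lower value = less CPU pressure, slower backup
        },
    },
    timeout=30,
)
response.raise_for_status()
print(response.json())  # backup status; poll GET /v1/backups/s3/{id} for progress

A lower percentage trades backup speed for less pressure on the node while the backup runs.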

Let me know if this helps!

Hi @DudaNogueira ,

Thank you for the quick response and for confirming that the issue is on the team’s radar. I’ll look into tweaking the cpuPercentage setting as a mitigation and see if it improves the situation.

I appreciate the support and will keep an eye on updates regarding this issue. Thanks again for the suggestion!
