Hi @Charlie_Chen,

Thanks for the detailed information. I've gone through it carefully. This is a classic memory-pressure scenario in Weaviate, and there are several factors at play here.
Understanding the Problem
1. Memory Architecture Mismatch
You have:

- Container Memory Limit: 3000 MiB (3 GiB)
- GOMEMLIMIT: 2500 MiB
- Actual Usage: ~2.64 GiB at idle
The issue is that GOMEMLIMIT only controls Go’s heap memory, not the total process memory. Weaviate also uses:
- Off-heap memory for vector indexes (HNSW graphs)
- Memory-mapped files for LSM storage
- OS page cache
- gRPC buffers and other system overhead
With 300K objects, your data structures (especially LSM compactions and vector indexes) need additional headroom that simply isn’t available.
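As a rough back-of-the-envelope sketch (assuming 768-dimensional float32 vectors — adjust for your embedding model; this is rule-of-thumb arithmetic, not Weaviate's exact internal layout):

```python
objects = 300_000
dims = 768            # assumed dimensionality; adjust for your embedding model
vectors_mib = objects * dims * 4 / 2**20   # float32 = 4 bytes per dimension
print(f"~{vectors_mib:.0f} MiB")  # ~879 MiB for raw vectors alone,
                                  # before HNSW links, LSM segments, or GC headroom
```

So the vectors alone consume a large share of the 2.5 GiB heap budget before any index or compaction overhead is counted.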
2. LSM Compaction Pressure
Your logs show compaction being skipped due to OOM:
```
"msg":"skipping compaction due to memory pressure"
```
This is critical because:

- LSM stores accumulate uncompacted segments
- This leads to increased memory usage over time
- Delete operations are particularly expensive, as they create tombstones that need compaction
- Without compaction, the database becomes progressively slower and more memory-hungry
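To see why skipped compactions compound, here is a toy model of segment accumulation (deliberately simplified — not Weaviate's actual LSM implementation):

```python
def segments_after(flushes, compaction_enabled, merge_threshold=4):
    """Toy LSM model: each flush adds a segment; compaction merges them."""
    segments = 0
    for _ in range(flushes):
        segments += 1
        if compaction_enabled and segments >= merge_threshold:
            segments = 1  # compaction merges all segments into one
    return segments

print(segments_after(100, compaction_enabled=True))   # stays bounded: 1
print(segments_after(100, compaction_enabled=False))  # grows linearly: 100
```

With compaction skipped, every flush adds a segment that must stay addressable, so memory use grows with write volume rather than staying bounded.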
3. The Restart Clue
When you deleted the problematic class and restarted, memory dropped to 519 MiB. This confirms the issue is data-related, not a memory leak.
Solutions
Increase Memory Allocation
Recommended configuration:
```yaml
resources:
  limits:
    cpu: 3
    memory: 6Gi        # doubled from 3Gi
  requests:
    cpu: 500m
    memory: 6Gi
env:
  - name: GOMEMLIMIT
    value: "5200MiB"   # ~85% of 6Gi
  - name: GOGC
    value: "50"        # more aggressive GC (consider lowering from the default 100)
```
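The 85% figure can be sanity-checked with a quick calculation (the ~15% off-heap headroom is a rule of thumb, not a hard Weaviate requirement):

```python
limit_mib = 6 * 1024                 # container limit: 6 Gi = 6144 MiB
gomemlimit = int(limit_mib * 0.85)   # leave ~15% for off-heap memory
print(gomemlimit)  # 5222 MiB, rounded down to 5200 in the config above
```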
Why this helps:
- Provides adequate headroom for LSM compactions
- Allows vector index operations to complete
- Enables proper garbage collection cycles
Alternative: Optimize for Lower Memory
If increasing memory isn’t an option, try these optimizations:
1. Adjust Vector Index Settings
```python
class_config = {
    "vectorIndexConfig": {
        "efConstruction": 64,   # lower from the default 128
        "maxConnections": 16,   # lower from the default 64
        "ef": -1,               # use dynamic ef
    }
}
```
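As a rough illustration of why maxConnections matters (rule-of-thumb arithmetic, assuming ~2 × maxConnections 4-byte neighbor ids per object on the base layer — not Weaviate's exact graph layout):

```python
def hnsw_links_mib(objects, max_connections):
    # ~2 * maxConnections neighbor ids (4-byte ints) per object on layer 0
    return objects * 2 * max_connections * 4 / 2**20

print(f"{hnsw_links_mib(300_000, 64):.0f} MiB")  # default maxConnections: ~146 MiB
print(f"{hnsw_links_mib(300_000, 16):.0f} MiB")  # reduced maxConnections: ~37 MiB
```

Lowering maxConnections trades some recall for a substantially smaller graph; benchmark against your query workload before committing to it.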
2. Enable Async Indexing
```yaml
env:
  - name: ASYNC_INDEXING
    value: "true"
```
3. Reduce Batch Sizes
```python
# Python client v4: use fixed-size batching to cap the batch size
# (with client.batch.dynamic() the client sizes batches automatically)
with client.batch.fixed_size(batch_size=50) as batch:
    ...  # your operations
```
4. Implement Rate Limiting
Add delays between batch operations to give compaction time to run:
```python
import time

for batch in batches:
    process_batch(batch)
    time.sleep(0.5)  # allow compaction and other background tasks to catch up
```
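If a fixed delay is too blunt, you could retry failed batches with exponential backoff instead — a minimal sketch, where `process_batch` and `batches` are the same placeholders as in the snippet above:

```python
import time

def run_with_backoff(batches, process_batch, base_delay=0.5, max_retries=5):
    """Retry each batch with exponentially growing pauses between attempts."""
    for batch in batches:
        for attempt in range(max_retries):
            try:
                process_batch(batch)
                break  # batch succeeded; move to the next one
            except Exception:
                time.sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
        else:
            raise RuntimeError("batch failed after retries")
```

This way healthy batches run at full speed, and the pauses only kick in when the server is actually pushing back.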
Let me know if you need help implementing any of these solutions or if you have questions about the memory breakdown!
-Chaitanya