Hi @DudaNogueira!
My configuration and steps:
- Server: standalone
- CPU limit: 40
- Vector type one length: 768
- Vector type two length: 1024
- Index: HNSW
- Consistency level: ONE
docker-compose:

```yaml
version: '3.4'
services:
  weaviate:
    image: semitechnologies/weaviate:1.25.25
    ports:
      - 8081:8080
      - 50052:50051
    volumes:
      - ./test_data:/data
      - ./backups:/tmp/backups
    restart: on-failure:0
    environment:
      TOMBSTONE_DELETION_CONCURRENCY: '4'
      DISABLE_LAZY_LOAD_SHARDS: 'true'
      HNSW_STARTUP_WAIT_FOR_VECTOR_CACHE: 'true'
      STANDALONE_MODE: 'true'
      AUTOSCHEMA_ENABLED: 'false'
      QUERY_MAXIMUM_RESULTS: 10000
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/data'
      DEFAULT_VECTORIZER_MODULE: 'none'
      CLUSTER_HOSTNAME: 'node1'
      ENABLE_MODULES: 'backup-filesystem,backup-s3'
      BACKUP_FILESYSTEM_PATH: '/tmp/backups'
      LIMIT_RESOURCES: 'true'
      ASYNC_INDEXING: 'true'
      LOG_LEVEL: 'debug'
      PROMETHEUS_MONITORING_ENABLED: 'false'
      GOMAXPROCS: 40
      GOGC: 90
      PERSISTENCE_HNSW_MAX_LOG_SIZE: 4GiB
    deploy:
      resources:
        limits:
          cpus: '40.0'
```
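For reference, the client connects through the remapped host ports from this compose file; a minimal sketch, assuming the v4 `connect_to_local` helper:

```python
import weaviate

# Host ports 8081 (HTTP) and 50052 (gRPC) are mapped
# to the container's default 8080/50051.
client = weaviate.connect_to_local(port=8081, grpc_port=50052)
```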
Collection creation code:

```python
import weaviate

client.collections.create(
    name=some_name,
    properties=[
        weaviate.classes.config.Property(
            name=SOME_CATEGORY,
            data_type=weaviate.classes.config.DataType.TEXT,
        ),
        weaviate.classes.config.Property(
            name=SOME_PROP,
            data_type=weaviate.classes.config.DataType.TEXT,
        ),
    ],
    vectorizer_config=[
        weaviate.classes.config.Configure.NamedVectors.none(
            name=WV_VECTOR_TYPE_ONE,
            vector_index_config=weaviate.classes.config.Configure.VectorIndex.hnsw(
                # quantizer=weaviate.classes.config.Configure.VectorIndex.Quantizer.pq(segments=8, training_limit=100000),
                distance_metric=weaviate.classes.config.VectorDistances.COSINE,
                ef=320,
                ef_construction=320,
                max_connections=100,
            ),
        ),
        weaviate.classes.config.Configure.NamedVectors.none(
            name=WV_VECTOR_TYPE_TWO,
            vector_index_config=weaviate.classes.config.Configure.VectorIndex.hnsw(
                # quantizer=weaviate.classes.config.Configure.VectorIndex.Quantizer.pq(segments=8, training_limit=100000),
                distance_metric=weaviate.classes.config.VectorDistances.COSINE,
                ef=480,
                ef_construction=480,
                max_connections=120,
            ),
        ),
    ],
)
```
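For the compressed runs, the quantizer line above was uncommented; a sketch of that variant for the first named vector (the segments value shown here is just one of the values I tried):

```python
# Same create() call, but with the PQ quantizer enabled.
vector_index_config=weaviate.classes.config.Configure.VectorIndex.hnsw(
    quantizer=weaviate.classes.config.Configure.VectorIndex.Quantizer.pq(
        segments=128,           # varied across runs: 6, 128, 256, ...
        training_limit=100000,  # codebook trains once this many vectors are imported
    ),
    distance_metric=weaviate.classes.config.VectorDistances.COSINE,
    ef=320,
    ef_construction=320,
    max_connections=100,
)
```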
Search query code:

```python
from weaviate.classes.query import Filter, MetadataQuery

SEARCH_THREADS = 4
QUERY_TIMEOUT_SEC = 60

collection = client.collections.get(SCHEMA)
if CONSISTENCY_LEVEL:
    collection = collection.with_consistency_level(CONSISTENCY_LEVEL)

response = collection.query.near_vector(
    near_vector=vectors[VECTOR],
    limit=LIMIT,
    filters=Filter.by_property(filter_property).equal(filter_value)
        if filter_property and filter_value
        else None,
    return_metadata=MetadataQuery(distance=True),
    return_properties=properties,
    target_vector=VECTOR,
)
```
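The SEARCH_THREADS = 4 setting means queries are issued concurrently, roughly like this (a sketch; `run_one_query` and `query_vectors` are placeholders for my benchmark code):

```python
from concurrent.futures import ThreadPoolExecutor

from weaviate.classes.query import MetadataQuery

def run_one_query(vec):
    # One near-vector search against the target named vector.
    return collection.query.near_vector(
        near_vector=vec,
        limit=LIMIT,
        target_vector=VECTOR,
        return_metadata=MetadataQuery(distance=True),
    )

# query_vectors: placeholder list of query embeddings.
with ThreadPoolExecutor(max_workers=SEARCH_THREADS) as pool:
    results = list(pool.map(run_one_query, query_vectors))
```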
Importing was done in batches (batch size = 5000). After importing, I waited until the async indexing "queue size" reached 0.
With the same import and the same wait for the queue to drain, search in the plain configuration (no compression) is an order of magnitude faster than with compression.
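In case the exact flow matters, the import and the queue wait looked roughly like this (a sketch; `objects` is a placeholder, and I'm assuming `vector_queue_length` from the verbose nodes API is the right way to read the queue size):

```python
import time

# Import in fixed-size batches of 5000 objects with per-object named vectors.
with collection.batch.fixed_size(batch_size=5000) as batch:
    for obj in objects:  # placeholder iterable
        batch.add_object(
            properties=obj["properties"],
            vector={
                WV_VECTOR_TYPE_ONE: obj["vector_one"],
                WV_VECTOR_TYPE_TWO: obj["vector_two"],
            },
        )

# With ASYNC_INDEXING enabled, wait for the vector indexing queue to drain.
def queue_is_empty() -> bool:
    nodes = client.cluster.nodes(collection=some_name, output="verbose")
    return all(
        shard.vector_queue_length == 0
        for node in nodes
        for shard in node.shards
    )

while not queue_is_empty():
    time.sleep(5)
```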
Search on 5 million vectors:
[screenshot: results without any compression]
[screenshot: results with PQ, segments: 128]
I also noticed that segments=6 gives better results than, for example, segments=128 or 256. But even with the best option (segments=6), search is still about 2x slower than without quantization.
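For completeness: PQ can also be enabled on an existing named vector after import via config.update, so the codebook trains on data that is already there; a sketch, assuming the v4 Reconfigure API:

```python
from weaviate.classes.config import Reconfigure

# Hypothetical: switch on PQ for one named vector after the data is imported.
collection.config.update(
    vectorizer_config=[
        Reconfigure.NamedVectors.update(
            name=WV_VECTOR_TYPE_ONE,
            vector_index_config=Reconfigure.VectorIndex.hnsw(
                quantizer=Reconfigure.VectorIndex.Quantizer.pq(
                    segments=6,  # the value that worked best in my tests
                ),
            ),
        ),
    ],
)
```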