Hi team,
I’m experimenting with named self-provided vectors and multi-vector search in Weaviate, but I’ve run into a problem.
I have a case where some of my documents don’t contain all of the named vectors. When I query with near_vector and TargetVectors.manual_weights(...), those documents are skipped entirely if they are missing one of the vectors. My expected behavior is that they should still be searchable and simply get a 0 contribution for the missing vector(s).
Minimal reproducible code
import weaviate
from weaviate.classes.config import Configure, Property, DataType
from weaviate.classes.tenants import Tenant
from weaviate.classes.query import TargetVectors, MetadataQuery

# 1. Connect
client = weaviate.connect_to_local(port=9203)

# 2. Create collection with 3 named vectors
if client.collections.exists("sample"):
    client.collections.delete("sample")
client.collections.create(
    name="sample",
    properties=[
        Property(name="text", data_type=DataType.TEXT),
        Property(name="info", data_type=DataType.TEXT),
        Property(name="desc", data_type=DataType.TEXT),
    ],
    vector_config=[
        Configure.Vectors.self_provided(name="vector1", vector_index_config=Configure.VectorIndex.hnsw()),
        Configure.Vectors.self_provided(name="vector2", vector_index_config=Configure.VectorIndex.hnsw()),
        Configure.Vectors.self_provided(name="vector3", vector_index_config=Configure.VectorIndex.hnsw()),
    ],
    multi_tenancy_config=Configure.multi_tenancy(enabled=True)
)

# 3. Add tenant
collection = client.collections.get("sample")
collection.tenants.create([Tenant(name="tenantA")])
tenant_collection = collection.with_tenant("tenantA")

# 4. Insert data with missing vectors in some docs
v = [0.2355] * 1024
tenant_collection.data.insert(
    properties={"text": "First text", "mvs": "mvs1"},
    vector={"vector1": v, "vector2": v}
)
tenant_collection.data.insert(
    properties={"text": "Second text", "mvs": "mvs2"},
    vector={"vector1": v, "vector2": v, "vector3": v}
)
tenant_collection.data.insert(
    properties={"text": "Third text", "mvs": "mvs3"},
    vector={"vector1": v}
)

# 5. Query across vector1 + vector2
response = tenant_collection.query.near_vector(
    near_vector={
        "vector1": v,
        "vector2": v
    },
    limit=20,
    target_vector=TargetVectors.manual_weights({
        "vector1": 30,
        "vector2": 30
    }),
    return_metadata=MetadataQuery(distance=True)
)
for o in response.objects:
    print(o.properties, o.metadata.distance)
Problem
- Documents that don’t contain vector2 are not included in the results at all.
- My expectation: such documents should still participate in the search, with their missing vector treated as a 0 score contribution (instead of being excluded).
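
To make that expectation concrete, here is a minimal plain-Python sketch of the scoring I have in mind. It is not Weaviate's scoring code: cosine_distance, combined_distance and the toy 8-dimensional vectors are illustrative names and data of my own, and I am assuming that manual weights combine the per-vector distances as a weighted sum.

```python
import math


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def combined_distance(doc_vectors, query_vectors, weights):
    """Weighted sum of per-vector distances (hypothetical helper).

    A named vector that the document lacks adds 0 to the total
    instead of excluding the document from the results.
    """
    total = 0.0
    for name, weight in weights.items():
        if name not in doc_vectors:
            continue  # missing vector: contributes 0
        total += weight * cosine_distance(doc_vectors[name], query_vectors[name])
    return total


# Toy data: "Third text" has no vector2, mirroring the repro above.
v = [0.2355] * 8
q = [0.1, 0.9] * 4
docs = {
    "First text": {"vector1": v, "vector2": v},
    "Third text": {"vector1": v},
}
weights = {"vector1": 30, "vector2": 30}
query = {"vector1": q, "vector2": q}

scores = {
    title: combined_distance(vectors, query, weights)
    for title, vectors in docs.items()
}
for title, score in scores.items():
    print(title, score)
```

With this literal reading, "Third text" stays in the result set, but its combined distance is smaller than that of "First text" because it has fewer terms in the sum. A real fallback might instead want to assign a fixed penalty distance to a missing vector; I would be fine with either as long as the document is not dropped.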
Question
- Is this the intended behavior?
- If yes, is there a config option or roadmap feature to allow “graceful fallback” for missing vectors (treat as 0 contribution instead of excluding the doc)?