Description
For near_text search, you can provide multiple query vectors for the same target vector:
Is something like this also supported for hybrid search? The search would require an array of query texts and an array of query vectors.
My use case is that I’m splitting a long document into chunks, and then want to search the vector database by each chunk using a hybrid search. The results should be combined using a “join strategy”, as described here: Multiple target vectors | Weaviate
hi @RisingOrange !
Welcome to our community 
The document you linked is in fact using collection.query.near_vector.

I believe this is what you want:
from weaviate.classes.query import HybridVector

jeopardy = client.collections.get("JeopardyQuestion")
response = jeopardy.query.hybrid(
    query="California",
    max_vector_distance=0.4,  # Maximum threshold for the vector search component
    vector=HybridVector.near_vector(
        vector=[v1, v2]  # v1 and v2 are the query vectors (lists of floats)
    ),
    alpha=0.75,
    limit=5,
)
Let me know if this helps!
Thanks, that’s helpful!
I tried the code on weaviate 4.13.2 and got “Providing lists of lists has been deprecated. Please provide a dictionary with target names as keys and lists of numbers as values.”
I modified it and now have this version, which works:
from weaviate.classes.query import HybridVector

# split_query and generate_embeddings are my own helpers (not shown here)
chunks = split_query(query_text)
chunk_vectors = generate_embeddings(chunks)

response = collection.query.hybrid(
    query=query_text,
    target_vector="corpus_vector",
    vector=HybridVector.near_vector(
        # Dictionary of target vector name -> list of query vectors
        vector={"corpus_vector": chunk_vectors},
    ),
    max_vector_distance=similarity_threshold,
    alpha=0.75,
    limit=5,
)
However, I think the results would be different if the collection.query.hybrid function accepted multiple text queries instead of just one query string. Splitting the query text, running a BM25F query for each chunk, and then combining the results would give a different result than running a single BM25F query on the whole query text, right?
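To make the "one query per chunk, then combine" idea concrete, here is a minimal client-side sketch. It assumes you have already run one collection.query.hybrid call per chunk and collected the returned object IDs in rank order. The combination step uses reciprocal rank fusion, which is my choice for illustration and not one of Weaviate's built-in join strategies; the function name fuse_ranked_results, the constant k=60, and the sample IDs are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def fuse_ranked_results(
    ranked_lists: Sequence[Sequence[Hashable]], k: int = 60, limit: int = 5
) -> List[Hashable]:
    """Combine several ranked ID lists with reciprocal rank fusion."""
    scores: Dict[Hashable, float] = defaultdict(float)
    for ranked in ranked_lists:
        # Each list contributes 1 / (k + rank) to every ID it contains.
        for rank, object_id in enumerate(ranked, start=1):
            scores[object_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); ties are broken by ID for determinism.
    ordered = sorted(scores, key=lambda object_id: (-scores[object_id], str(object_id)))
    return ordered[:limit]


# Hypothetical per-chunk results: in practice each inner list would hold the
# object IDs returned by one hybrid query for one chunk.
per_chunk_results = [
    ["a", "b", "c"],
    ["b", "d", "a"],
    ["b", "a", "e"],
]
fused = fuse_ranked_results(per_chunk_results, limit=3)
print(fused)
```

An object that ranks well for several chunks rises to the top here, which is why this can order results differently from a single query over the whole text.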
For the BM25 phase of the search, the query text boils down to tokens.
So
query = "This query has multiple tokens"
will be the same as
query = [
    ["this", "query"],
    ["has", "multiple", "tokens"]
]
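A small sketch of that point, assuming a simplified word-style tokenizer (lowercase, keep runs of ASCII letters and digits). The word_tokens helper is hypothetical and only approximates what happens server-side; the actual tokenization depends on how each property is configured.

```python
import re
from collections import Counter
from typing import List


def word_tokens(text: str) -> List[str]:
    """Lowercase the text and keep runs of ASCII letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


query = "This query has multiple tokens"
chunks = ["This query", "has multiple tokens"]

# Bag of tokens for the whole query versus the chunks taken together.
whole_query_tokens = Counter(word_tokens(query))
chunk_tokens = Counter(token for chunk in chunks for token in word_tokens(chunk))

# The two bags are equal: splitting the text does not change the token set.
print(whole_query_tokens == chunk_tokens)
```

Under this assumption, splitting the text only changes the result if each chunk is sent as its own query and the per-chunk result lists are combined afterwards.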
Let me know if this helps!