Description
Server Setup Information
-
Weaviate Server Version: 1.32.4-1e414fd.amd64
Weaviate Client: Python Client 4.16.6
Docker Image: semitechnologies/weaviate:1.32.4-1e414fd.amd64
Deployment Method: One Docker container for Weaviate and a separate Docker container for the embedding model.
Any additional Information
I have a Weaviate collection with a vector field. I inserted the data using insert_many, and also tried plain insert. The SigText gets vectorized. However, a search returns the same results every time, regardless of the query. I checked the SigTextVector vector field and it is identical for every row, regardless of the SigText, which is why the query doesn't work. I have called the embedding model directly and it returns different results depending on the input. I checked the log for my embedding model, and Weaviate is sending ‘Sig Codes’ as the prompt to embed every time, not the value of SigText. Can you please let me know what I am doing wrong?
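The check described above (whether every stored vector is the same) can be sketched as follows. This is a minimal sketch, not the exact code that was run: `all_vectors_identical` is a hypothetical helper, and the commented usage assumes the v4 Python client's `fetch_objects(include_vector=True)`, which exposes named vectors on `obj.vector`.

```python
import math


def all_vectors_identical(vectors, tol=1e-9):
    """Return True if every vector matches the first one element-wise (within tol)."""
    if len(vectors) < 2:
        return False  # nothing to compare against
    first = vectors[0]
    return all(
        len(v) == len(first)
        and all(math.isclose(a, b, abs_tol=tol) for a, b in zip(v, first))
        for v in vectors[1:]
    )


# Hypothetical usage against a live collection (assumes a connected `client`):
# objs = client.collections.get("SigCodes").query.fetch_objects(
#     include_vector=True, limit=10
# ).objects
# print(all_vectors_identical([o.vector["SigTextVector"] for o in objs]))

print(all_vectors_identical([[0.1, 0.2], [0.1, 0.2], [0.1, 0.2]]))  # True
print(all_vectors_identical([[0.1, 0.2], [0.3, 0.4]]))              # False
```

A result of True on real rows with different SigText values would match the symptom reported here.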
Here is the vector field:
vector_config=[
    # Set another named vector with the "text2vec-openai" vectorizer
    Configure.Vectors.text2vec_openai(
        name="SigTextVector",
        source_properties=["SigText"],  # (Optional) Set the source property(ies)
        vector_index_config=Configure.VectorIndex.hnsw(  # (Optional) Set vector index options
            ef=500,  # Search list size
            ef_construction=400,
            distance_metric=VectorDistances.COSINE,
            filter_strategy=VectorFilterStrategy.SWEEPING,  # or ACORN TODO: What should this be set to if anything?
            # Need to define M for the HNSW index, which is the number of connections per node
        ),
        base_url="http://vllm-mxbai-embed-large-v1:8000",  # Use the internal Docker container port since this will be docker to docker communication
        model="/models/mxbai-embed-large-v1",
    )
]
Here is the query:
from weaviate.classes.query import MetadataQuery

sigCodeCollection = client.collections.get("SigCodes")
testSearch = sigCodeCollection.query.near_text(
    query="ONCE A WEEK",  # The model provider integration will automatically vectorize the query
    limit=5,
    target_vector="SigTextVector",
    return_metadata=MetadataQuery(
        distance=True  # Include distance in the results
    ),
)
print(f"Test Search Results: {testSearch}")
for obj in testSearch.objects:
    # Single quotes inside the f-string so it parses on Python versions before 3.12
    print(f"Sig Code: {obj.properties['sigCode']} - Sig Text: {obj.properties['sigText']} - Distance: {obj.metadata.distance}")
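One detail that can be checked from the two snippets: the vector config names the source property `SigText`, while the query reads `sigText` and `sigCode` from the returned objects. The sketch below does a case-sensitive comparison between the configured source properties and the stored property names. `unmatched_source_properties` is a hypothetical helper, and whether such a mismatch explains the behavior here is an assumption, not something confirmed.

```python
def unmatched_source_properties(source_properties, stored_properties):
    """Return configured source properties with no exact (case-sensitive) match."""
    stored = set(stored_properties)
    return [name for name in source_properties if name not in stored]


# Names taken from the snippets above: the config uses "SigText",
# while the query results are read with "sigCode" and "sigText".
configured = ["SigText"]
stored = ["sigCode", "sigText"]

print(unmatched_source_properties(configured, stored))  # ['SigText']
```

Against a live collection, the stored names could presumably be read from `collection.config.get().properties` rather than typed by hand.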
Here is the output:
Sig Code: APP - Sig Text: APPLICATION - Distance: 0.5655122995376587
Sig Code: 5T - Sig Text: TAKE 5 TABLETS - Distance: 0.5655122995376587
Sig Code: U - Sig Text: UNITS - Distance: 0.5655122995376587
Sig Code: 2PU - Sig Text: INHALE 2 PUFFS - Distance: 0.5655122995376587
Sig Code: 2GTTS - Sig Text: INSTILL 2 DROPS - Distance: 0.5655122995376587
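Identical distances for every hit are what cosine distance gives when all stored vectors are equal, which is consistent with the SigTextVector field being the same for each row. A small illustration with made-up vectors (the numbers are hypothetical, not taken from the collection):

```python
import math


def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


query = [0.2, 0.9, 0.1]         # made-up query embedding
stored = [[0.7, 0.1, 0.6]] * 5  # five rows sharing one made-up vector

distances = [cosine_distance(query, v) for v in stored]
print(len(set(distances)) == 1)  # True
```

Whatever the query is, five equal stored vectors produce five equal distances, so the ranking carries no information.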