I have a collection that contains both vectorized and non-vectorized fields (set with skip_vectorization=True). When I call collection.data.replace, the object will only update if I change at least one of the vectorized fields. If I perform a replace changing only non-vectorized fields, no update happens.
Server Setup Information
Weaviate Server Version: 1.25.6 (local) & 1.25.7 (WCS)
import weaviate
from weaviate.util import generate_uuid5
from weaviate import classes as wvc
client = weaviate.connect_to_local()
client.collections.delete("Test")
collection = client.collections.create(
    "Test",
    properties=[
        wvc.config.Property(name="vectorized", data_type=wvc.config.DataType.TEXT, skip_vectorization=False),
        wvc.config.Property(name="non_vectorized", data_type=wvc.config.DataType.TEXT, skip_vectorization=True),
    ],
)
# now we insert an object
collection.data.insert({"vectorized": "this should be vectorized", "non_vectorized": "this should not be vectorized"}, uuid=generate_uuid5("example1"))
# now we replace the object
collection.data.replace(properties={"non_vectorized": "changing here"}, uuid=generate_uuid5("example1"))
# now we get the object
print(collection.query.fetch_objects().objects[0].properties)
# outputs:
# {'vectorized': None, 'non_vectorized': 'changing here'}
Please let me know if this code is close to what you have, or how I can reproduce the issue.
Thanks for the prompt reply. I am out of office hours now, so I will have to try to create a reproducible sample tomorrow. However, the main difference I can see between our implementations is that I call replace with the entire object, even if some of the fields didn’t change. In your case that would be:
collection.data.replace(properties={"non_vectorized": "changing here", "vectorized": "this should be vectorized"}, uuid=generate_uuid5("example1"))
I can’t confirm right now that this will reproduce it but maybe you could give it a try? Otherwise I will try tomorrow.
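For reference, the `'vectorized': None` in the output above is consistent with replace overwriting the whole object, so any property left out of the call is dropped. Here is a minimal sketch of that difference using plain dicts rather than the Weaviate client. `replace_object` and `update_object` are hypothetical names used only for illustration, and the model is a simplification, not a description of the server's internals:

```python
def replace_object(new_properties, schema_keys):
    """Model a full replacement: properties missing from the call become None."""
    return {key: new_properties.get(key) for key in schema_keys}


def update_object(stored, new_properties):
    """Model a partial update: properties missing from the call are kept."""
    return {**stored, **new_properties}


schema_keys = ["vectorized", "non_vectorized"]
stored = {
    "vectorized": "this should be vectorized",
    "non_vectorized": "this should not be vectorized",
}
change = {"non_vectorized": "changing here"}

replaced = replace_object(change, schema_keys)
updated = update_object(stored, change)

print(replaced)  # the omitted "vectorized" property is gone
print(updated)   # the omitted "vectorized" property is preserved
```

This is why passing the full object to replace, as in the line above, is the fairer comparison. If only some fields need to change, `collection.data.update` is the client call intended for merging rather than overwriting.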
OK, this turned out to be more awkward to narrow down than I expected, and it seems to require some rather odd specifics that may point to some other underlying issue.
Here is a reproducible example:
import weaviate
from weaviate.util import generate_uuid5
from weaviate.classes import config as wvc
client = weaviate.connect_to_local(
    headers={"X-OpenAI-Api-Key": "<key>"}
)
# Create the collection and explicitly set a vectorizer
client.collections.delete("Test")
collection = client.collections.create(
    "Test",
    vectorizer_config=(
        wvc.Configure.Vectorizer.text2vec_openai(
            model="ada",
            model_version="002",
        )
    ),
    properties=[
        wvc.Property(name="non_vectorized", data_type=wvc.DataType.TEXT, skip_vectorization=True),
        wvc.Property(name="vectorized_text", data_type=wvc.DataType.TEXT),
        wvc.Property(name="vectorized_array", data_type=wvc.DataType.TEXT_ARRAY),
    ],
)
uuid = generate_uuid5("example1")
# Insert the new object
data = {"non_vectorized": "Original Text", "vectorized_text": "Original Text", "vectorized_array": []}
collection.data.insert(properties={**data}, uuid=uuid)
# Replacing a non-vectorized property on its own does not work
replace_data = {**data, "non_vectorized": "I Changed"}
collection.data.replace(properties=replace_data, uuid=uuid)
print(collection.query.fetch_objects().objects[0].properties)
# Replacing either vectorized property at the same time works
replace_data = {**data, "vectorized_text": "I Changed", "non_vectorized": "I Changed"}
collection.data.replace(properties=replace_data, uuid=uuid)
print(collection.query.fetch_objects().objects[0].properties)
replace_data = {**data, "vectorized_array": ["I Changed"], "non_vectorized": "I Changed"}
collection.data.replace(properties=replace_data, uuid=uuid)
print(collection.query.fetch_objects().objects[0].properties)
Here is what makes this strange:
If I don’t explicitly set a vectorizer_config, this issue does not occur.
If I don’t have the text array in my schema, this issue does not occur.
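When trying the variations above, it can be easier to check each replace programmatically than to compare the printed dicts by eye. A small sketch of such a check, in plain Python with no Weaviate dependency; `mismatched_properties` is a hypothetical helper name, and the sample `fetched` dict below stands in for what the failing case would return if the replace were silently ignored:

```python
def mismatched_properties(sent, fetched):
    """Return the names of properties whose fetched value differs from what was sent."""
    return sorted(key for key, value in sent.items() if fetched.get(key) != value)


# Stand-in data mirroring the first replace in the reproduction.
sent = {
    "non_vectorized": "I Changed",
    "vectorized_text": "Original Text",
    "vectorized_array": [],
}
fetched = {
    "non_vectorized": "Original Text",
    "vectorized_text": "Original Text",
    "vectorized_array": [],
}

stale = mismatched_properties(sent, fetched)
print(stale)
```

In the reproduction, `sent` would be `replace_data` and `fetched` would be `collection.query.fetch_objects().objects[0].properties`, read right after each replace call. An empty list means the replace took effect.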