Description
Hi everyone,
I’m getting started with Weaviate, so you’ll have to excuse any (likely) obvious errors on my part.
I’ve followed the guide on setting up a collection with my own vectors: a collection consisting of two properties (name and image) and the corresponding CLIP embeddings of both properties, which I prepared ahead of time (as multimodal vectorizers are not available on WCS).
import weaviate.classes.config as wc

client.collections.create(
    name="EmojiDB",
    properties=[
        wc.Property(name="name", data_type=wc.DataType.TEXT),
        wc.Property(name="image", data_type=wc.DataType.BLOB),
    ],
    vectorizer_config=wc.Configure.Vectorizer.none(),
    generative_config=wc.Configure.Generative.openai(),
)
Populating the database seemed to work properly, using the corresponding string for text and a base64-encoded string for image, except for an error that showed up repeatedly:
ERROR:asyncio:Exception in callback PollerCompletionQueue._handle_events(<_UnixSelecto...e debug=False>)()
handle: <Handle PollerCompletionQueue._handle_events(<_UnixSelecto...e debug=False>)()>
Traceback (most recent call last):
...
I’m using Google Colab for testing, so I assumed this might be more of a compute-related warning.
I went ahead and performed a vector search with query.near_vector and managed to retrieve objects whose name is somewhat similar to the one I prompted (using the same CLIP vectorizer as for the embeddings I provided).
Yet I’m unable to access the image attribute, which is nowhere to be seen on the QueryReturn response object:
QueryReturn(objects=[Object(uuid=_WeaviateUUIDInt('ddfbc272-c9fe-4a61-a0fb-061e958a8fbb'), metadata=MetadataReturn(creation_time=None, last_update_time=None, distance=0.7795770168304443, certainty=None, score=None, explain_score=None, is_consistent=None, rerank_score=None), properties={'name': 'white down pointing backhand index'}, references=None, vector={}, collection='EmojiDB'), Object(uuid=_WeaviateUUIDInt('61a18bec-b56f-46f0-8c43-57e3fc8c89fa'), metadata=MetadataReturn(creation_time=None, last_update_time=None, distance=0.7795770168304443, certainty=None, score=None, explain_score=None, is_consistent=None, rerank_score=None), properties={'name': 'white down pointing backhand index'}, references=None, vector={}, collection='EmojiDB'), Object(uuid=_WeaviateUUIDInt('9e4a3158-7048-4a7b-8339-3979621b1add'), metadata=MetadataReturn(creation_time=None, last_update_time=None, distance=0.7795770168304443, certainty=None, score=None, explain_score=None, is_consistent=None, rerank_score=None), properties={'name': 'white down pointing backhand index'}, references=None, vector={}, collection='EmojiDB'), Object(uuid=_WeaviateUUIDInt('949dbd44-aa89-47fa-8157-f85ce778463d'), metadata=MetadataReturn(creation_time=None, last_update_time=None, distance=0.7795770168304443, certainty=None, score=None, explain_score=None, is_consistent=None, rerank_score=None), properties={'name': 'white down pointing backhand index'}, references=None, vector={}, collection='EmojiDB'), Object(uuid=_WeaviateUUIDInt('f68156cc-6300-4724-8869-1e352325f94a'), metadata=MetadataReturn(creation_time=None, last_update_time=None, distance=0.7863796949386597, certainty=None, score=None, explain_score=None, is_consistent=None, rerank_score=None), properties={'name': 'sign of the horns'}, references=None, vector={}, collection='EmojiDB')])
The vector attribute is also empty, which doesn’t seem to be the expected behavior.
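For reference, here is a minimal sketch of a query that asks for the missing fields explicitly. It assumes (worth checking against the client docs for 4.5.x) that the v4 Python client leaves out blob properties unless they are named in return_properties, and only returns vectors when include_vector=True is set. The helper name search_with_image is hypothetical; collection is assumed to be the object returned by client.collections.get("EmojiDB").

```python
from typing import Any, Optional, Sequence


def search_with_image(
    collection: Any,
    query_vector: Sequence[float],
    limit: int = 5,
    return_metadata: Optional[Any] = None,
) -> Any:
    """Near-vector search that explicitly requests the blob property and the vector.

    `collection` is assumed to be client.collections.get("EmojiDB").
    `return_metadata` would typically be
    weaviate.classes.query.MetadataQuery(distance=True); it is passed through as-is.
    """
    return collection.query.near_vector(
        near_vector=list(query_vector),
        limit=limit,
        # Assumption: blob properties are only returned when named here.
        return_properties=["name", "image"],
        # Assumption: without this flag, `vector` comes back empty.
        include_vector=True,
        return_metadata=return_metadata,
    )
```

If these assumptions hold, each returned object should carry the base64 string under properties["image"] and a non-empty vector.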
I went to the WCS Query App to take a look at the database, and it actually does output a record containing the encoded image attribute (although it only fetches a single record…):
{ Get { EmojiDB { name image } } }
I’m a bit confused as to why these discrepancies are occurring. Does the raised error have something to do with the image not loading correctly?
Perhaps the dataset is somewhat bulky? It holds a total of 2749 images of 30x30 pixels.
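One way to rule out a corrupted image is to decode the base64 string (either the one sent during import or the one shown in the Query App) and check that it still starts with the PNG file signature. This is a minimal sketch using only the standard library; the helper names encode_image and is_valid_png_b64 are hypothetical.

```python
import base64

# The fixed 8-byte signature every PNG file starts with.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def encode_image(raw: bytes) -> str:
    """Encode raw image bytes the same way the import script does."""
    return base64.b64encode(raw).decode("utf-8")


def is_valid_png_b64(b64_string: str) -> bool:
    """Return True if the string is valid base64 that decodes to PNG-looking bytes."""
    try:
        raw = base64.b64decode(b64_string, validate=True)
    except (ValueError, TypeError):
        # binascii.Error is a ValueError subclass: malformed base64 lands here.
        return False
    return raw.startswith(PNG_SIGNATURE)
```

This only checks the header, not the full image, but it is usually enough to tell whether the blob made the round trip intact.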
Below is what the populating script looks like:
import base64

emojiDB = client.collections.get("EmojiDB")
try:
    with emojiDB.batch.dynamic() as batch:
        for index, rows in df.iterrows():
            # Read each image and encode it as a base64 string for the BLOB property
            img_path = f"emojisFolder/{index}.png"
            with open(img_path, "rb") as file:
                poster_b64 = base64.b64encode(file.read()).decode("utf-8")
            collection_object = {
                "name": df.iloc[index, 1],
                "image": poster_b64,
            }
            vector = emojis_embeddings[index]
            batch.add_object(
                properties=collection_object,
                vector=vector,
            )
    if len(emojiDB.batch.failed_objects) > 0:
        print(f"Failed to import {len(emojiDB.batch.failed_objects)} objects")
finally:
    failed_objs_a = client.batch.failed_objects
    failed_refs_a = client.batch.failed_references
    client.close()
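A side note on the query output above: several results share the same name and distance but have different UUIDs, which may mean the import ran more than once and created duplicates. If so, one option is to derive a deterministic UUID per row and pass it as the uuid argument of batch.add_object, so that re-running the script overwrites objects rather than adding new ones. The sketch below uses the standard library; the helper name emoji_uuid and the key format are hypothetical.

```python
import uuid


def emoji_uuid(index: int, name: str) -> str:
    """Build a stable UUID from the row index and emoji name.

    The same (index, name) pair always maps to the same UUID, so it can be
    passed as batch.add_object(..., uuid=emoji_uuid(index, name)).
    """
    return str(uuid.uuid5(uuid.NAMESPACE_URL, f"emoji/{index}/{name}"))
```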
Hopefully this is enough information for someone to spot my error(s).
Server Setup Information
- Weaviate Server Version: Weaviate Cloud Services
- Deployment Method: Cloud WCS Instance
- Multi Node? Number of Running Nodes: 1
- Client Language and Version: weaviate_client-4.5.7-py3-none-any