Hey! Can anyone help me with a problem I’m facing with batch insertion of objects into a WCS collection? For some reason, the objects aren’t being imported, even though everything seems to be correct. The schema and insertion code are pasted below, along with the output of the print statements I added.
Schema:
from weaviate.classes.config import Property, DataType, Configure

# Define text collection schema
client.collections.create(
    "RSATextCollection",
    properties=[
        Property(name="text", data_type=DataType.TEXT),
        Property(name="source", data_type=DataType.TEXT),  # Add source metadata
        Property(name="timestamp", data_type=DataType.DATE),  # Add timestamp metadata
    ],
    vectorizer_config=Configure.Vectorizer.none(),
)

# Define image collection schema
client.collections.create(
    "RSAImageCollection",
    properties=[
        Property(name="image_path", data_type=DataType.TEXT),
        Property(name="source", data_type=DataType.TEXT),  # Add source metadata
        Property(name="timestamp", data_type=DataType.DATE),  # Add timestamp metadata
    ],
    vectorizer_config=Configure.Vectorizer.none(),
)
Insertion code:
from datetime import datetime
from weaviate.util import generate_uuid5

# Upload text embeddings with metadata
text_collection = client.collections.get("RSATextCollection")
with text_collection.batch.dynamic() as batch:
    for idx, (text, embedding) in enumerate(zip(text_data, text_embeddings_np)):
        if idx % 10 == 0:
            print(idx, text, embedding.shape)  # Progress output every 10 objects
        obj_uuid = generate_uuid5(text)
        batch.add_object(
            properties={
                "text": text,
                "source": pdf_path,
                "timestamp": datetime.now().isoformat(),
            },
            uuid=obj_uuid,
            vector=embedding.tolist(),
        )

# Upload image embeddings with metadata
image_collection = client.collections.get("RSAImageCollection")
with image_collection.batch.dynamic() as batch:
    for idx, (image_path, embedding) in enumerate(zip(image_files, image_embeddings_np)):
        if idx % 10 == 0:
            print(idx, image_path, embedding.shape)  # Progress output every 10 objects
        obj_uuid = generate_uuid5(image_path)
        batch.add_object(
            properties={
                "image_path": image_path,
                "source": pdf_path,
                "timestamp": datetime.now().isoformat(),
            },
            uuid=obj_uuid,
            vector=embedding.tolist(),
        )
Output:
0 SESSION ID: (512,)
10 Security Researcher (512,)
20 Attendees should note that sessions may be audio- or video-recorded and may be published in various (512,)
30 Most common attack techniques in K8S (512,)
40 Architecture (512,)
50 Namespaces (512,)
60 #RSAC (512,)
70 Node Components (512,)
80 everything (512,)
90 K03: Overly Permissive RBAC Configurations (512,)
100 Kubernetes (512,)
110 Compromised images / registry (512,)
120 Supply Chain Attacks (512,)
130 RCE (512,)
140 A04:2021 (512,)
150 A09:2021 (512,)
160 Exec into container (512,)
170 #RSAC (512,)
180 27 (512,)
190 Access cloud resources (512,)
200 #RSAC (512,)
210 34 (512,)
220 credential (512,)
230 36 (512,)
240 CoreDNS poisoning (512,)
250 Resource Hijacking (512,)
260 Resource Quotas and Limits (512,)
270 runAsNonRoot, runAsUser, runAsGroup (512,)
280 43 (512,)
290 Istio, Linkerd, Consul, etc (512,)
300 #RSAC (512,)
310 ClusterRole (512,)
320 Protect Cluster Against Privilege Pods (512,)
330 psp-default (512,)
340 jsPolicy, etc (512,)
350 Where to Log: (512,)
360 Enable mTLS authentication on the kubelet’s HTTPS endpoint (512,)
370 Akri (512,)
380 images for vulnerabilities (512,)
390 limit communication between your services and monitor network traffic (512,)
400 Hardening control plane components and lock down kubelet (512,)
0 data_pdf/image_0.jpeg (512,)
10 data_pdf/image_10.jpeg (512,)
20 data_pdf/image_20.png (512,)
30 data_pdf/image_30.png (512,)
40 data_pdf/image_40.png (512,)
50 data_pdf/image_50.png (512,)
60 data_pdf/image_60.png (512,)
70 data_pdf/image_70.png (512,)
80 data_pdf/image_80.png (512,)
90 data_pdf/image_90.jpeg (512,)
100 data_pdf/image_100.jpeg (512,)
110 data_pdf/image_110.png (512,)
120 data_pdf/image_120.png (512,)
130 data_pdf/image_130.png (512,)
140 data_pdf/image_140.png (512,)
150 data_pdf/image_150.png (512,)
160 data_pdf/image_160.png (512,)
170 data_pdf/image_170.jpeg (512,)
180 data_pdf/image_180.png (512,)
190 data_pdf/image_190.jpeg (512,)
200 data_pdf/image_200.png (512,)
210 data_pdf/image_210.jpeg (512,)
220 data_pdf/image_220.png (512,)
230 data_pdf/image_230.jpeg (512,)
240 data_pdf/image_240.jpeg (512,)
250 data_pdf/image_250.jpeg (512,)
260 data_pdf/image_260.png (512,)
However, nothing is being inserted and the collections are still empty even after running this code. Any help would be greatly appreciated! If any more context or code is required, I’d be happy to provide it.
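
One thing that may be worth checking: batch imports in the v4 Python client do not raise on per-object errors, so rejected objects can go unnoticed unless the failure list is inspected. The sketch below is only a diagnostic idea, not a confirmed fix. It assumes the collection object exposes `batch.failed_objects` once the `with ... batch.dynamic()` block has exited, and that each entry carries a `message` attribute. The helper names `rfc3339_now` and `summarize_batch_failures` are hypothetical. A possible culprit is the timestamp: Weaviate's `date` type expects RFC 3339 strings with a timezone offset, while `datetime.now().isoformat()` produces a naive string without one.

```python
from datetime import datetime, timezone


def rfc3339_now() -> str:
    """Return the current UTC time as an ISO 8601 / RFC 3339 string with an explicit offset."""
    return datetime.now(timezone.utc).isoformat()


def summarize_batch_failures(collection, limit: int = 5) -> list[str]:
    """Print and return error messages from the last batch run on `collection`.

    Assumption: `collection.batch.failed_objects` is a list of error entries,
    each with a `message` attribute, populated after the batch context exits.
    """
    failed = collection.batch.failed_objects
    # Fall back to str(entry) if an entry has no `message` attribute.
    messages = [str(getattr(err, "message", err)) for err in failed[:limit]]
    print(f"{len(failed)} object(s) failed to import")
    for msg in messages:
        print("  ", msg)
    return messages
```

Calling `summarize_batch_failures(text_collection)` right after the text upload block (and likewise for `image_collection`) should show whether the server rejected the objects and why. If the messages point at the date format, swapping `datetime.now().isoformat()` for `rfc3339_now()` in the `properties` dict would be the next thing to try.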