I need to export vector embeddings and their associated metadata from my Weaviate instance into TSV files suitable for visualization on TensorFlow Projector. The Projector requires two files:
- A TSV file of vectors, where each line represents a vector.
- An optional TSV file of metadata, where each line represents the metadata corresponding to the vectors.
Here’s the format they expect:
Vectors TSV Example:
0.1\t0.2\t0.5\t0.9
0.2\t0.1\t5.0\t0.2
0.4\t0.1\t7.0\t0.8
Metadata TSV Example:
Pokémon\tSpecies
Wartortle\tTurtle
Venusaur\tSeed
Charmeleon\tFlame
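To make the target format concrete, here is a minimal sketch of the writing side only, using plain Python and no Weaviate calls. The names `sanitize` and `write_projector_tsv` are hypothetical helpers, and `rows` is assumed to be an iterable of `(vector, properties)` pairs. As far as I understand the Projector, the vectors file has no header row, and the metadata file has a header row only when it has more than one column, so the sketch follows that rule.

```python
def sanitize(value):
    """Collapse tabs and newlines so each field stays inside one TSV cell."""
    if value is None:
        return ""
    return " ".join(str(value).split())


def write_projector_tsv(rows, fields, vectors_path, metadata_path):
    """Write one vector line and one metadata line per (vector, properties) pair.

    rows   : iterable of (vector, properties) pairs (assumed shape)
    fields : metadata column names, in output order
    Returns the number of rows written.
    """
    count = 0
    with open(vectors_path, "w", encoding="utf-8", newline="\n") as vf, \
         open(metadata_path, "w", encoding="utf-8", newline="\n") as mf:
        # Header row only for multi-column metadata; the vectors file has none.
        if len(fields) > 1:
            mf.write("\t".join(fields) + "\n")
        for vector, props in rows:
            vf.write("\t".join(str(float(x)) for x in vector) + "\n")
            mf.write("\t".join(sanitize(props.get(f)) for f in fields) + "\n")
            count += 1
    return count
```

Joining fields by hand rather than using the `csv` module avoids quote characters being added around text that contains quotes. I have not verified how the Projector handles quoted fields, which is why the sketch avoids producing them.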
I have a collection named Airarabia_faqs_en in my Weaviate instance, with the following properties:
- content (TEXT)
- category (TEXT)
- url (TEXT)
- title (TEXT)
The vectors are generated and stored using the text2vec-openai vectorizer.
Could someone provide a step-by-step guide or script (preferably in Python) to extract these vectors and metadata from Weaviate and save them into the required TSV format?
I am using Weaviate locally with Docker.
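For context, this is roughly the reading side as I picture it. It is an untested sketch that assumes the v4 `weaviate-client` Python package and a Docker setup exposing the default ports (8080 for HTTP, 50051 for gRPC). `pick_vector`, `rows_from_collection`, and `fetch_rows` are hypothetical helper names. I am also assuming the vector comes back either as a plain list or as a dict keyed by vector name, with `"default"` for an unnamed vector. Each yielded pair would then be written out as one line per file in the format described above.

```python
def pick_vector(vector, name="default"):
    """Return a flat list from a plain list or a dict of named vectors (assumed shapes)."""
    if isinstance(vector, dict):
        return list(vector[name])
    return list(vector)


def rows_from_collection(collection, fields, vector_name="default"):
    """Yield (vector, properties) pairs, keeping only the requested fields."""
    # iterator() pages through every object; include_vector=True asks for embeddings too.
    for obj in collection.iterator(include_vector=True):
        props = obj.properties or {}
        yield pick_vector(obj.vector, vector_name), {f: props.get(f) for f in fields}


def fetch_rows(collection_name="Airarabia_faqs_en",
               fields=("title", "category", "url", "content")):
    """Connect to a local Weaviate and return all rows as a list."""
    import weaviate  # assumption: weaviate-client v4 is installed

    client = weaviate.connect_to_local()  # assumption: default local ports
    try:
        collection = client.collections.get(collection_name)
        return list(rows_from_collection(collection, fields))
    finally:
        client.close()
```

Calling `fetch_rows()` should return every object in the collection along with its embedding. If the collection uses named vectors, `vector_name` would need to match the configured name.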