How to Export Vectors and Metadata to TSV for TensorFlow Projector

I need to export vector embeddings and their associated metadata from my Weaviate instance into TSV files suitable for visualization on TensorFlow Projector. The Projector requires two files:

  1. A TSV file of vectors, where each line represents a vector.
  2. An optional TSV file of metadata, where each line represents the metadata corresponding to the vectors.

Here’s the format the Projector expects:

Vectors TSV Example:

0.1\t0.2\t0.5\t0.9
0.2\t0.1\t5.0\t0.2
0.4\t0.1\t7.0\t0.8

Metadata TSV Example:

Pokémon\tSpecies
Wartortle\tTurtle
Venusaur\tSeed
Charmeleon\tFlame

I have a collection named Airarabia_faqs_en in my Weaviate instance, with the following properties:

  • content (TEXT)
  • category (TEXT)
  • url (TEXT)
  • title (TEXT)

The vectors are generated and stored using the text2vec-openai vectorizer.

Could someone provide a step-by-step guide or script (preferably in Python) to extract these vectors and metadata from Weaviate and save them into the required TSV format?

I am using Weaviate locally with Docker.

Hi @ROHAN_BALKONDEKAR,

Weaviate makes it easy to read all objects with vectors.
You need to use the iterator on your collection, like this:

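import weaviate

client = weaviate.connect_to_local()  # assumes a local Docker instance on default ports
collection = client.collections.get("YourCollectionName")

# include_vector=True returns each object's vector alongside its properties
for item in collection.iterator(include_vector=True):
    print(item.properties)
    print(item.vector)  # in the v4 Python client: a dict keyed by vector name
client.close()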

Here are the docs on how to read all data.

From this, you should be able to figure out how to save it to a TSV file.
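
If it helps, here is a minimal sketch of that last step. It is not an official recipe. The helper names (`clean`, `write_projector_tsv`, `export_collection`) and the output file names are made up for illustration. It assumes the v4 Python client, a local instance reachable through `weaviate.connect_to_local()`, and a single unnamed vector per object stored under the `"default"` key. Adjust that key if your collection uses named vectors. Tabs and newlines inside property values are collapsed to spaces, because they would otherwise break the row layout.

```python
import re


def clean(value):
    """Collapse tabs/newlines to spaces so a value stays inside one TSV cell."""
    return re.sub(r"\s+", " ", "" if value is None else str(value)).strip()


def write_projector_tsv(rows, fields,
                        vectors_path="vectors.tsv",
                        metadata_path="metadata.tsv"):
    """Write (properties, vector) pairs to two TSV files; return the row count.

    rows   -- iterable of (properties_dict, vector_list) pairs
    fields -- property names to use as metadata columns
    """
    count = 0
    with open(vectors_path, "w", encoding="utf-8", newline="") as vec_file, \
         open(metadata_path, "w", encoding="utf-8", newline="") as meta_file:
        # The Projector expects a header row only for multi-column metadata.
        if len(fields) > 1:
            meta_file.write("\t".join(fields) + "\n")
        for properties, vector in rows:
            vec_file.write("\t".join(str(x) for x in vector) + "\n")
            meta_file.write(
                "\t".join(clean(properties.get(f)) for f in fields) + "\n"
            )
            count += 1
    return count


def export_collection(collection_name, fields):
    """Stream every object of a collection into the two TSV files."""
    import weaviate  # imported here so the helper above works without it

    client = weaviate.connect_to_local()
    try:
        collection = client.collections.get(collection_name)
        rows = (
            # Assumption: one unnamed vector per object, keyed "default".
            (item.properties, item.vector["default"])
            for item in collection.iterator(include_vector=True)
        )
        return write_projector_tsv(rows, fields)
    finally:
        client.close()


# Example call for the collection in the question (needs a running instance):
# export_collection("Airarabia_faqs_en", ["title", "category", "url"])
```

I left `content` out of the example call because long text makes the Projector labels hard to read, but you can add it to the list of fields. Both files are written in the same loop, so row N of the metadata always matches row N of the vectors.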