How to encode frequencies of cross references?

Description

I have successfully built my schema that has:

  • class: Person
  • class: Book
  • crossref: person > has > book

and populated the db with some data. How can I record that someone has multiple copies of the same book? E.g., ‘John Smith’ has 4 copies of ‘Bible’. Obviously there should be only one ‘Bible’ object of the class ‘Book’, so this is different from saying ‘John Smith’ has four books: ‘bible’, ‘time traveller’, ‘general relativity’, and ‘god father’. How do I store the ‘frequency’ 4?

Can this be done by adding 4 cross reference links between ‘john smith’ and ‘bible’?

Alternatively, I saw the object data type in Weaviate, and I suppose I could have a schema like this, but I don’t know how to form my queries, if this schema makes sense…

{
    "class": "Person",
    "properties": [
        {
            "dataType": ["text"],
            "name": "last_name",
        },
        {
            "dataType": ["object"],
            "name": "hasBook",
            "nestedProperties": [
                {"dataType": ["Book"], "name": "bookCrossref"},
                {"dataType": ["number"], "name": "copies"}
            ],
        }
    ],
}

Server Setup Information

  • Weaviate Server Version: 3.24
  • Deployment Method: docker
  • Client Language and Version: python 3.10

Here’s an update. I wrote some code to test idea 1 (adding duplicate cross references) and can confirm that, whether or not this is intended by the dev team, duplicate cross references are allowed.

import weaviate
from weaviate.util import generate_uuid5

weaviate_url="http://localhost:8080/"

client = weaviate.Client(weaviate_url)

class_definitions = [
    {
        "class": "Book",
        "properties": [
            {"name": "title", "dataType": ["text"]},
            {"name": "ownedBy",
                "dataType": ["Person"],
                "description": "The person who owns this book",
             }
        ],
    },
    {
        "class": "Person",
        "description": "A person",
        "properties": [
            {"name": "name", "dataType": ["text"]},
            {"name": "gender", "dataType": ["text"]},
            {
                "name": "hasBook",
                "dataType": ["Book"],
                "description": "The books this person has",
            },
        ],
    },
]
client.schema.create({"classes": class_definitions})

#add a person
uuid_person = generate_uuid5("ID_John Smith")
client.data_object.create(data_object={"name":"John Smith", "gender":"male"}, class_name="Person", uuid=uuid_person)

#add two books
uuid_book1 = generate_uuid5("ID_BOOK_Romeo and Juliet")
client.data_object.create(data_object={"title":"Romeo and Juliet by Shakespeare"}, class_name="Book", uuid=uuid_book1)
uuid_book2 = generate_uuid5("ID_BOOK_Hamlet")
client.data_object.create(data_object={"title":"Hamlet by Shakespeare"}, class_name="Book", uuid=uuid_book2)

obj=client.data_object.get_by_id(uuid_person, class_name="Person")
obj=client.data_object.get_by_id(uuid_book1, class_name="Book")

#add crossreferences
client.data_object.reference.add(
    from_class_name="Person",
    from_uuid=uuid_person,
    from_property_name="hasBook",
    to_class_name="Book",
    to_uuid=uuid_book1
)

client.data_object.reference.add(
    from_class_name="Person",
    from_uuid=uuid_person,
    from_property_name="hasBook",
    to_class_name="Book",
    to_uuid=uuid_book1,
)

client.data_object.reference.add(
    from_class_name="Person",
    from_uuid=uuid_person,
    from_property_name="hasBook",
    to_class_name="Book",
    to_uuid=uuid_book2,
)

# add reverse cross ref
client.data_object.reference.add(
    from_class_name="Book",
    from_uuid=uuid_book1,
    from_property_name="ownedBy",
    to_class_name="Person",
    to_uuid=uuid_person,
)

client.data_object.reference.add(
    from_class_name="Book",
    from_uuid=uuid_book1,
    from_property_name="ownedBy",
    to_class_name="Person",
    to_uuid=uuid_person,
)

client.data_object.reference.add(
    from_class_name="Book",
    from_uuid=uuid_book2,
    from_property_name="ownedBy",
    to_class_name="Person",
    to_uuid=uuid_person,
)


cref= "hasBook { ... on Book { title } }"
rs1=client.query.get('Person', ['name','gender', cref]).do()

cref= "ownedBy { ... on Person { name } }"
rs2=client.query.get('Book', ['title', cref]).do()

print("end")

Inspecting object rs1 shows this:

Inspecting object rs2 shows this:

Would appreciate it if someone could confirm that this indeed works as expected…
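
If the duplicates are kept, the ‘frequency’ could be recovered on the client side by counting repeated entries in the cross reference list. Below is a minimal sketch of that idea. `sample_rs1` is a hypothetical dictionary in the shape the v3 client returns for a `Get` query, not the actual output of the run above, and `count_copies` is a helper name made up for this example.

```python
from collections import Counter

# Hypothetical response in the shape the v3 client returns for a Get query.
# The values are illustrative only, not the real output of the run above.
sample_rs1 = {
    "data": {
        "Get": {
            "Person": [
                {
                    "name": "John Smith",
                    "gender": "male",
                    "hasBook": [
                        {"title": "Romeo and Juliet by Shakespeare"},
                        {"title": "Romeo and Juliet by Shakespeare"},
                        {"title": "Hamlet by Shakespeare"},
                    ],
                }
            ]
        }
    }
}


def count_copies(result, class_name="Person", ref_prop="hasBook", key="title"):
    """Map each person's name to a Counter of the titles they reference."""
    counts = {}
    for obj in result["data"]["Get"][class_name]:
        # The reference property may be missing or None when there are no links.
        refs = obj.get(ref_prop) or []
        counts[obj["name"]] = Counter(ref[key] for ref in refs)
    return counts


copies = count_copies(sample_rs1)
for title, n in copies["John Smith"].items():
    print(f"{title}: {n}")
```

Counting by title merges two different books that happen to share a title, so requesting `_additional { id }` inside the reference and counting by id would probably be safer.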

Hi @Z_Z !

This is something I was not aware of :joy:

As we are in the first iteration of the nested objects, for now, there are some constraints:

As of 1.22, object and object[] datatype properties are not indexed and not vectorized.
Future plans include the ability to index nested properties, for example to allow for filtering on nested properties and vectorization options.

I will play around with this code.
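
In the meantime, since nested properties cannot be filtered on yet, one possible alternative (a sketch only, not something tested in this thread) is an intermediate class that holds the count as a regular `int` property and links to both the person and the book. The class and property names below (`Ownership`, `owner`, `ofBook`, `copies`) and the helper `copies_at_least` are hypothetical. The client calls are left as comments so the snippet runs without a server.

```python
# Hypothetical link class: one object per (person, book) pair.
# Assumes the Person and Book classes already exist in the schema.
ownership_class = {
    "class": "Ownership",
    "description": "One person's holding of one book, with a copy count",
    "properties": [
        {"name": "copies", "dataType": ["int"]},    # regular property, so it can be filtered
        {"name": "owner", "dataType": ["Person"]},  # cross reference to the person
        {"name": "ofBook", "dataType": ["Book"]},   # cross reference to the book
    ],
}


def copies_at_least(n):
    """Build a v3-style where filter matching Ownership objects with copies >= n."""
    return {"path": ["copies"], "operator": "GreaterThanEqual", "valueInt": n}


where_filter = copies_at_least(2)
print(where_filter)

# With a running instance and the v3 client, usage might look like:
# client.schema.create_class(ownership_class)
# client.query.get("Ownership", ["copies", "ofBook { ... on Book { title } }"]) \
#     .with_where(where_filter).do()
```

The trade-off is one extra object per person and book pair, in exchange for a count that can be updated in place instead of adding or removing duplicate references.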

Also, quick question: is there any reason not to use the Weaviate Python client v4? It’s far superior to the v3 client you are using.

Thanks!

It’s the effort required for the migration and lack of resources in my company … we will do it eventually but we started a project before v4 was introduced and the underlying schema and code associated with it have got very complex.

Thanks