Weaviate mutates the case of document meta properties

moruga123 · April 1, 2024, 8:56pm

Description

If I create a collection like so:

import weaviate
import weaviate.classes as wvc
import weaviate.classes.config as wvcc

# `client` is a weaviate.WeaviateClient instance created earlier
client.connect()
try:
    collection = client.collections.create(
        name='lowercase',
        properties=[
            wvcc.Property(name="CONTENT", data_type=wvcc.DataType.TEXT, skip_vectorization=False, index_searchable=True),
            wvcc.Property(name="URL", data_type=wvcc.DataType.TEXT, skip_vectorization=True, index_searchable=True),
        ],
        vector_index_config=wvc.config.Configure.VectorIndex.hnsw(), # https://weaviate.io/developers/weaviate/manage-data/collections
        vectorizer_config=wvc.config.Configure.Vectorizer.text2vec_transformers(
            passage_inference_url="http://t2v-transformers-passage:8080",
            query_inference_url='http://t2v-transformers-query:8080'
        )
    )
finally:
    # always release the connection, even if creation fails
    client.close()

then Weaviate mutates the property names to ‘cONTENT’ and ‘uRL’ when I later retrieve documents from the collection. Why is this?

The collection name itself also gets mutated to ‘Lowercase’. However, the following command will still work:

collection = client.collections.get('lowercase')

So, are collections treated as case insensitive?
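The renaming described above is consistent with a simple first-letter rule: the collection name has its first character uppercased, and each property name has its first character lowercased. Below is a minimal sketch of that observed rule. The helper names `normalized_collection_name` and `normalized_property_name` are hypothetical and not part of the Weaviate client, and the rule is inferred only from the outputs reported in this post.

```python
def normalized_collection_name(name: str) -> str:
    """Uppercase only the first character, as observed for 'lowercase' -> 'Lowercase'."""
    return name[:1].upper() + name[1:]


def normalized_property_name(name: str) -> str:
    """Lowercase only the first character, as observed for 'CONTENT' -> 'cONTENT'."""
    return name[:1].lower() + name[1:]


# Names taken from the collection definition above
print(normalized_collection_name("lowercase"))
print(normalized_property_name("CONTENT"))
print(normalized_property_name("URL"))
```

If this rule holds, 'lowercase' and 'Lowercase' normalize to the same collection name, which would explain why client.collections.get('lowercase') still works. That is an inference from the behavior seen here, not a statement about general case insensitivity.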

Server Setup Information

Weaviate Server Version: 1.24.2
Deployment Method: docker
Multi Node? Number of Running Nodes: 1
Client Language and Version: python, 4.5.3

moruga123 · April 1, 2024, 9:13pm

If I pass property names through Weaviate, e.g. collection.query.hybrid(…, query_properties=['CONTENT']), it finds the CONTENT property without a problem, even though the property prints as ‘cONTENT’.

However, it would be better if the properties were left alone, so that when I export document metadata, I can access them using whatever case they had originally, before the documents were indexed in Weaviate.
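Until the names are left untouched, one possible workaround at export time is to map the returned keys back to their original spelling. The sketch below assumes that only the first letter of each property name is changed, as seen in this thread. `restore_property_names` and `original_names` are hypothetical names for illustration, not Weaviate API.

```python
from typing import Any, Dict, Iterable


def restore_property_names(
    properties: Dict[str, Any], original_names: Iterable[str]
) -> Dict[str, Any]:
    """Return a copy of `properties` keyed by the original property names.

    Assumes the server only lowercased the first character of each name.
    Keys with no known original spelling are kept unchanged.
    """
    # Map the stored spelling (first letter lowercased) back to the original.
    stored_to_original = {n[:1].lower() + n[1:]: n for n in original_names}
    return {stored_to_original.get(k, k): v for k, v in properties.items()}


# Example dict shaped like the properties returned for one object
returned = {"cONTENT": "content", "uRL": "http://weaviate.io", "other": 1}
print(restore_property_names(returned, ["CONTENT", "URL"]))
```

Note that this mapping is ambiguous if two original names differ only in the case of their first letter (for example "Url" and "url"), since both would be stored under the same key.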

DudaNogueira · April 4, 2024, 7:51pm

Hi @moruga123 !

Those are really interesting findings. As a convention, we suggest PascalCase for the collection name and property names that start with a lowercase letter, but enforcing those conventions can add unnecessary friction to the developer experience (DX).

I have consolidated your code here for better reproducibility (and removed unnecessary parts):

import weaviate
from weaviate import classes as wvc
client = weaviate.connect_to_local()
client.collections.delete("lowercase")
lowercase = client.collections.create(
    name='lowercase',
    properties=[
        wvc.config.Property(name="CONTENT", data_type=wvc.config.DataType.TEXT, skip_vectorization=False, index_searchable=True),
        wvc.config.Property(name="URL", data_type=wvc.config.DataType.TEXT, skip_vectorization=True, index_searchable=True),
        wvc.config.Property(name="URL_2", data_type=wvc.config.DataType.TEXT, skip_vectorization=True, index_searchable=True),
        # this doesn't work:
        #wvc.config.Property(name="URL:2", data_type=wvc.config.DataType.TEXT, skip_vectorization=True, index_searchable=True),
    ],
    vectorizer_config=None
)
print(lowercase.config.get().properties)
# insert some document
# MAKE SURE YOU HAVE AUTOSCHEMA_ENABLED: 'false' in your docker compose
lowercase.data.insert({
    "CONTENT": "content", "URL": "http://weaviate.io"
})
# let's fetch our objects
query = lowercase.query.fetch_objects()
# NOTHING HERE
print("NOTHING HERE: ", query.objects[0].properties.get("CONTENT"))
# CONTENT IS IN FACT HERE
print("CONTENT IS IN FACT HERE: ", query.objects[0].properties.get("cONTENT"))

Also, I have found some discussions about this here:

On top of that, I exported the created collection and imported using the v3 client, so this all happens on the server side:

lowercase_schema = lowercase.config.get().to_dict()
lowercase_schema["class"] = "v3class"
clientv3 = weaviate.Client("http://localhost:8080")
clientv3.schema.create_class(lowercase_schema)
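# `v3class` was never defined; read the new collection back through the v4 client instead
print(client.collections.get("v3class").config.get().properties)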

and got the same results. I also agree here:

However, it would be better if the properties were left alone…

While this friction is easy to avoid for the collection name, it is not for property names, as you pointed out.

I believe the best course of action here is to raise an issue on GitHub:
