Based on this page: Vectorizers and Rerankers | Weaviate - vector database
Unless specified otherwise in the schema, the default behavior is to:
- Only vectorize properties that use the `text` data type (unless skipped)
- Sort properties in alphabetical (a-z) order before concatenating values
- … and so on
Is there a way to sort the properties based on our pre-defined order when vectorizing?
The reason is that we are using, for example, the Hugging Face model multi-qa-MiniLM-L6-cos-v1, which comes with the following note:
Note that there is a limit of 512 word pieces: Text longer than that will be truncated. Further note that the model was just trained on input text up to 250 word pieces. It might not work well for longer text.
So we would like to prioritize some fields so that they come before the truncation point in case we hit the limit. Thanks!
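
To make the concern concrete, here is a minimal Python sketch of what I think happens. It is not Weaviate's actual code: the property names and values are hypothetical, the helper functions `concatenate` and `truncate` are made up for illustration, and plain whitespace splitting stands in for the model's real word-piece tokenizer.

```python
from typing import Dict, List, Optional

# Illustrative only: NOT Weaviate's implementation.
# Whitespace tokens approximate word pieces; all names are hypothetical.


def concatenate(props: Dict[str, str], order: Optional[List[str]] = None) -> str:
    """Join property values, alphabetically by name unless an order is given."""
    names = order if order is not None else sorted(props)
    return " ".join(props[name] for name in names)


def truncate(text: str, limit: int) -> str:
    """Keep only the first `limit` whitespace-separated tokens."""
    return " ".join(text.split()[:limit])


article = {
    "title": "Custom property order",
    "summary": "Why ordering matters",
    "body": "filler " * 10,  # a long, low-priority field
}

limit = 6  # tiny stand-in for the model's 512 word-piece limit

# Alphabetical order puts "body" first, so it uses up the whole budget.
default_text = truncate(concatenate(article), limit)

# A predefined order keeps the fields we care about inside the budget.
custom_text = truncate(concatenate(article, ["title", "summary", "body"]), limit)

print(default_text)
print(custom_text)
```

With alphabetical ordering, the long `body` field fills the entire budget and `title` and `summary` are cut off; with the predefined order they survive truncation. That second behavior is what we are hoping to configure.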