Separate transformers inference APIs per class

Hi! I want Weaviate to create my embeddings for me, but I want to host the models myself and specify a different transformers model per class. I think that would require a separate inference endpoint per class, similar to how the Hugging Face module lets you configure the endpointURL. Is it possible?

Hi @fredespi !! Welcome to our community :hugs:

Sorry for the delay here :slight_smile:

Unfortunately this is not possible :thinking:

I have brought this use case to our team, so we can think about how to make this more flexible.

But for now, this is not possible. Sorry :frowning:

This is actually a very interesting use case. Do you have some more context @fredespi?

Might be good for us to look into this use case.

@bobvanluijt one possible solution I was thinking about, and have brought to our team, is for text2vec-transformers to pass the class name with the payload.

This would allow a custom inference container to use different models.

Another option is to allow setting the base URL of the inference API per class, so each class could use a different inference model.
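To make the first idea a bit more concrete, here is a rough sketch of how a custom inference container could pick a model if the payload carried the class name. The `class_name` field, the class names, and `resolve_model` are all hypothetical and only for illustration; text2vec-transformers does not send such a field today.

```python
# Hypothetical mapping from Weaviate class name to a model identifier.
MODEL_BY_CLASS = {
    "ArticleEn": "sentence-transformers/all-MiniLM-L6-v2",
    "ArticleMulti": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
}
DEFAULT_MODEL = "sentence-transformers/all-MiniLM-L6-v2"


def resolve_model(payload: dict) -> str:
    """Pick a model for one vectorization request.

    Assumes the request body carries a hypothetical "class_name" field;
    falls back to DEFAULT_MODEL when the field is missing or unknown.
    """
    class_name = payload.get("class_name")
    return MODEL_BY_CLASS.get(class_name, DEFAULT_MODEL)
```

The container would then load (or look up an already loaded) model by the returned identifier before computing the vector.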

What do you think?

Hi, sure. We serve customers in various markets, and their data is in various languages. We want to host the embedding models ourselves in order to have full control over costs, language support, updates, etc. We also want to be flexible when it comes to assigning models to collections of data. This is what we would like to do: when a customer wants to create a collection, we pick the model that best fits their data. We create the collection in Weaviate and then index their data using that model. But when we query the data, we have to keep track of which embedding model we originally used for that collection. It would be convenient if that was baked into the collection, so that the collection itself knows which embedding model/server to use.
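For illustration, the bookkeeping described above might look roughly like this on the application side. This is only a sketch: `EmbeddingBackend`, `BackendRegistry`, the collection name, and the URL are made-up names, not part of Weaviate or its client.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddingBackend:
    """Which self-hosted model (and server) a collection was indexed with."""
    model_name: str
    inference_url: str


class BackendRegistry:
    """Remembers the embedding backend chosen for each collection."""

    def __init__(self) -> None:
        self._by_collection: dict[str, EmbeddingBackend] = {}

    def register(self, collection: str, backend: EmbeddingBackend) -> None:
        # Refuse to silently switch models: vectors from different
        # models are not comparable, so a change means re-indexing.
        existing = self._by_collection.get(collection)
        if existing is not None and existing != backend:
            raise ValueError(f"{collection} is already bound to {existing.model_name}")
        self._by_collection[collection] = backend

    def backend_for(self, collection: str) -> EmbeddingBackend:
        # Raises KeyError for collections that were never registered.
        return self._by_collection[collection]


registry = BackendRegistry()
registry.register(
    "CustomerA",  # hypothetical collection name
    EmbeddingBackend("multilingual-model", "http://t2v-multilingual:8080"),  # hypothetical
)
```

At query time the application would call `registry.backend_for("CustomerA")` to find the server that must embed the query, which is exactly the extra state we would prefer the collection to carry itself.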

Hi, I am interested in a similar use case. Following up on the thread to see if there is any update on this feature. We are interested in hosting multiple transformer containers so we can assign different embedders to different collections based on the customer use case.

Thank you.

hi @pc10 !!

You can now specify a different inference_url per collection, like so:

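```python
import weaviate
from weaviate.classes.config import Configure

# Assumes a local Weaviate instance; adjust the connection for your deployment.
client = weaviate.connect_to_local()

collection = client.collections.create(
    "DemoCollection2",
    vectorizer_config=[
        Configure.NamedVectors.text2vec_transformers(
            name="title_vector",
            source_properties=["title"],
            # URL of the inference container serving this collection's model
            inference_url="https://webhook.site/ec8436b5-4a54-4705-b8e0-95f50b81b9f6",
        )
    ],
    # Additional parameters not shown
)

client.close()
```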