Need help to use my own vectorizer and generative model

Sibgat-Ul · July 6, 2024, 5:21pm

I would like to know how to use a locally run/downloaded vectorizer model to create embeddings and vectors.

I could not find the appropriate docs for that.

bobvanluijt · July 6, 2024, 6:15pm

These are the docs you’re looking for (pick based on your model of choice):

It basically runs two containers, one with Weaviate, and another one with the models

DudaNogueira · July 8, 2024, 12:02pm

Also, the code for the transformers inference model container is here:

So if you have a private model, you can use that same endpoint to call your own model, keeping it compatible with the text2vec-transformers module/integration.
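To make the "same endpoint" idea concrete, here is a minimal sketch of a stand-in inference server built only on the Python standard library. It assumes the contract is a `POST /vectors` request with a JSON body containing `text`, answered with `text`, `vector` and `dim`, plus `/.well-known/live`, `/.well-known/ready` and `/meta` probes; that is my reading of the inference container, so please check it against the actual code before relying on it. `toy_embed`, `build_vector_response`, `make_handler` and the `/meta` payload are hypothetical placeholders, and `toy_embed` should be swapped for your private model.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable, List


def build_vector_response(payload: dict, embed: Callable[[str], List[float]]) -> dict:
    """Build the JSON body for POST /vectors (response shape is an assumption)."""
    text = payload["text"]
    vector = [float(x) for x in embed(text)]
    return {"text": text, "vector": vector, "dim": len(vector)}


def make_handler(embed: Callable[[str], List[float]]):
    """Create a request handler class bound to the given embedding function."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Liveness/readiness probes answer 204 No Content (assumed contract).
            if self.path in ("/.well-known/live", "/.well-known/ready"):
                self.send_response(204)
                self.end_headers()
            elif self.path == "/meta":
                # Placeholder metadata; the real container reports model config.
                self._send_json(200, {"model": "my-private-model"})
            else:
                self._send_json(404, {"error": "not found"})

        def do_POST(self):
            if self.path.rstrip("/") != "/vectors":
                self._send_json(404, {"error": "not found"})
                return
            length = int(self.headers.get("Content-Length", 0))
            try:
                payload = json.loads(self.rfile.read(length))
                body = build_vector_response(payload, embed)
            except (ValueError, KeyError, TypeError) as exc:
                self._send_json(400, {"error": str(exc)})
                return
            self._send_json(200, body)

        def _send_json(self, status: int, obj: dict) -> None:
            data = json.dumps(obj).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)

    return Handler


def toy_embed(text: str) -> List[float]:
    """Hypothetical stand-in for a private model: two trivial features."""
    return [float(len(text)), float(text.count(" "))]


# To serve it (blocks forever), something like:
#   HTTPServer(("0.0.0.0", 8080), make_handler(toy_embed)).serve_forever()
```

Weaviate's text2vec-transformers module would then be pointed at this server through its `TRANSFORMERS_INFERENCE_API` setting, the same way it is pointed at the stock inference container.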

Let us know if this helps

Sibgat-Ul · July 8, 2024, 1:36pm

Thanks a lot. I did study these docs earlier, but I wanted to actually write the code myself and not use a Docker container for now.

I came up with this solution for now (though it does not feel like the Weaviate way).

Here is my model’s code:

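from sentence_transformers import SentenceTransformer
import weaviate.classes as wvc

# Local embedding model; `client` is an already-connected Weaviate client.
model = SentenceTransformer('all-MiniLM-L6-v2', device='cuda')

# No vectorizer module: vectors are supplied at insert time.
research_docs = client.collections.create(
    name='ResearchDocs',
    vectorizer_config=wvc.config.Configure.Vectorizer.none(),
)

# search_web and collapse_results are helpers defined elsewhere (not shown).
results = search_web(query="Deep fake detection", search_engine="both", max_results=2)
print("Search results:", len(results['results']))

docs = collapse_results(results)
print("Docs:", len(docs))

# Embed every document locally, one vector per document.
emb = model.encode(docs).tolist()

# Pair each search result with its embedding.
wvc_dataObjects = list()
for i, (d, e) in enumerate(zip(docs, emb)):
    wvc_dataObjects.append(
        wvc.data.DataObject(
            properties={
                "title": results['results'][i]['title'],
                "content": results['results'][i]['content'],
            },
            vector=e
        ),
    )

research_docs.data.insert_many(wvc_dataObjects)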

I will try out the Docker way too, though.

Thanks, and sorry for the late reply.

Sibgat-Ul · July 8, 2024, 1:37pm

Is there any way to make both of them run on a single container?

DudaNogueira · July 8, 2024, 6:33pm

Hi!

If you don’t want to use a vectorizer on docker, you will have to do what we call “bring your own vectors”. And that’s what you are doing

More on that here: Bring your own vectors | Weaviate - Vector Database

The downside of this approach is that you are in charge of vectorizing and re-vectorizing your content when one of its “vectorizable” properties (skip:false) changes.
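As a rough illustration of that re-vectorizing duty, here is a minimal sketch assuming the v4 Python client's `collection.data.update(uuid=..., properties=..., vector=...)` and a model with a sentence-transformers style `encode`. The helper name `update_and_revectorize` and the `text_key` parameter are hypothetical, and which property gets embedded is your own choice.

```python
def update_and_revectorize(collection, model, uuid, new_properties, text_key="content"):
    """Update an object's properties and store a fresh vector for its new text.

    `collection` is assumed to be a Weaviate v4 collection handle and `model`
    anything exposing encode(text) -> array with .tolist(), such as a
    SentenceTransformer. `text_key` names the property that gets embedded.
    """
    # Re-embed the changed text ourselves, since no vectorizer module is set.
    vector = model.encode(new_properties[text_key]).tolist()
    # Send the new properties and the new vector in the same update call.
    collection.data.update(uuid=uuid, properties=new_properties, vector=vector)
    return vector
```

If the update is sent without a new vector, the stored vector would keep describing the old text, which is the drift this helper is meant to avoid.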

Also, you will not be able to use nearText, as Weaviate doesn’t know how to vectorize the query.

So you will need to vectorize the query yourself and use nearVector with your query vector instead of nearText. This also applies to hybrid (where you pass your own vector alongside the query text) and so on.
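A minimal sketch of what that looks like with the v4 Python client, assuming `collection.query.near_vector(near_vector=..., limit=...)` and `collection.query.hybrid(query=..., vector=..., alpha=..., limit=...)`. The helper names are hypothetical, and `model` is assumed to be the same model that embedded the stored objects. As far as I understand it, hybrid still takes the query text for its keyword half, with your vector supplied next to it.

```python
def near_vector_search(collection, model, query_text, limit=3):
    """Embed the query locally, then run a pure vector search."""
    query_vector = model.encode(query_text).tolist()
    return collection.query.near_vector(near_vector=query_vector, limit=limit)


def hybrid_search(collection, model, query_text, limit=3, alpha=0.5):
    """Hybrid search: the text feeds keyword scoring, the vector feeds vector scoring."""
    query_vector = model.encode(query_text).tolist()
    return collection.query.hybrid(
        query=query_text,     # used for the keyword (BM25) part
        vector=query_vector,  # used for the vector part, since Weaviate cannot embed it
        alpha=alpha,          # 0 = keyword only, 1 = vector only
        limit=limit,
    )


# Example wiring (needs a running Weaviate instance and the model from above):
#   research_docs = client.collections.get("ResearchDocs")
#   response = near_vector_search(research_docs, model, "Deep fake detection")
#   for obj in response.objects:
#       print(obj.properties["title"])
```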

Let me know if this helps
