Weaviate custom Retriever

Description

Hello, here is a short description of what I want to do.
In my program I created a Collection to hold any kind of data or files, with these properties:
"file_name": wc.DataType.TEXT,
"file_type": wc.DataType.TEXT,
"file_version": wc.DataType.TEXT,
"splitter_method": wc.DataType.TEXT,
"splitter_args": wc.DataType.TEXT,
"type": wc.DataType.TEXT,
"url": wc.DataType.TEXT,
"uuid": wc.DataType.UUID,
"version": wc.DataType.TEXT,
"page_content": wc.DataType.TEXT_ARRAY,
"metadata": wc.DataType.TEXT_ARRAY,

I also use t2v_transformers from cr.weaviate.io.
I want to create a retriever from “page_content” (vectorized) and “metadata” (not vectorized) without using another embedding system and without creating a new collection through the Weaviate abstractions.
Can I do that?

Server Setup Information

  • Weaviate Server Version: 1.25.1
  • Deployment Method: docker - embedded
  • Client Language and Version: python
  • Multitenancy?: idk

Any additional Information

I can already insert the data into a Weaviate DB; I just want to build a retriever from the TEXT_ARRAY of an object in this DB with a specific UUID5.
I would also like to ask whether Weaviate has any tutorial on how to build a VectorStore and retriever without the LangChain abstraction.
If you have any other questions, just ask me.

Hi @yuri_Golfeto !! Welcome to our community :hugs:

Not sure I understood your first issue :thinking:

For the second, using Weaviate directly AND using langchain, I have written a nice recipe here:

If you use this approach, you can probably retrieve your data the way you want (maybe solving the first issue?)
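
For reference, a retriever over an existing collection does not strictly need LangChain. The sketch below assumes the v4 Python client (suggested by the wc.DataType usage in the question) and a collection whose vectorizer is already configured, so near_text can embed the query on the server. The function name retrieve, the object_id key and the collection name in the comments are illustrative assumptions, not part of any recipe.

```python
from typing import Any


def retrieve(collection: Any, query: str, k: int = 4) -> list[dict]:
    """Return the k objects closest to `query` as plain dicts.

    `collection` is expected to behave like a Weaviate v4 collection handle,
    e.g. the result of client.collections.get("Collection_ingestor").
    """
    # near_text lets the collection's own vectorizer embed the query,
    # so no separate embedding model is needed on the client side.
    response = collection.query.near_text(query=query, limit=k)
    return [
        {"object_id": str(obj.uuid), **obj.properties}
        for obj in response.objects
    ]


# Hypothetical usage against a local instance (not executed here):
#   import weaviate
#   client = weaviate.connect_to_local()
#   try:
#       docs = retrieve(client.collections.get("Collection_ingestor"), "my question")
#   finally:
#       client.close()
```

Because the function only touches collection.query.near_text and response.objects, it can be exercised with a stand-in object before pointing it at a real instance.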

Could you please elaborate on your first issue?

Let me know if this helps :slight_smile:

Thanks!

This isn’t quite what I want, but it works well; I just changed the class.
Let me explain it in a better way.
My idea is to have a “master” Collection called Collection_ingestor.
Why?
Reason: I want to create more than one Vector Store, with many types of documents and different types of text split.
With the properties I sent before, I just inserted all of a document’s splits into Page_Content, of type Text_ARRAY.
This didn’t work because vectorization is applied to the object as a whole; it does not vectorize each part separately.
Now, can you send me the Weaviate HybridSearch recipe? This recipe is 100% easier to understand.
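
The vectorization issue described above (a TEXT_ARRAY property is embedded as part of a single object, not split by split) is usually avoided by storing one object per split. A minimal sketch, assuming the field names from the original schema; chunks_to_objects and the chunk_index property are hypothetical, and the deterministic IDs come from the standard library's uuid.uuid5 rather than from any Weaviate helper.

```python
import uuid


def chunks_to_objects(file_name: str, splitter_method: str, chunks: list[str]) -> list[dict]:
    """Build one insertable record per chunk so each chunk gets its own vector."""
    records = []
    for index, chunk in enumerate(chunks):
        # Deterministic ID: the same file, splitter and chunk position always
        # yield the same UUID, which keeps re-ingestion reproducible.
        name = f"{file_name}:{splitter_method}:{index}"
        records.append(
            {
                "uuid": str(uuid.uuid5(uuid.NAMESPACE_URL, name)),
                "properties": {
                    "file_name": file_name,
                    "splitter_method": splitter_method,
                    "chunk_index": index,   # hypothetical extra property
                    "page_content": chunk,  # a single TEXT value, not TEXT_ARRAY
                },
            }
        )
    return records


records = chunks_to_objects("report.pdf", "recursive", ["first split", "second split"])
print(len(records), records[0]["properties"]["page_content"])
```

Each record could then be written with the v4 client's collection.data.insert(properties=..., uuid=...), and file_name or splitter_method could later serve as filter fields so that one collection holds several logical vector stores.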

One more question: why can’t I create two WeaviateVectorStore.from_documents instances
on the same Weaviate connection? Or how do I do it?

Hi!

There are some recipes on hybrid search directly in Weaviate here:

As you are using LangChain, you can use it through that integration.

You can create as many vector stores as you want.

Each time you pass the index_name, for Weaviate, it refers to a collection.

So, for example, creating two vector stores:

db_collection_a = WeaviateVectorStore.from_documents(docs, embeddings, client=client, index_name="CollectionA")

db_collection_b = WeaviateVectorStore.from_documents(docs, embeddings, client=client, index_name="CollectionB")

You can pass the same client; just change the index_name.

Also, if you don’t want to ingest data, simply pass an empty docs list, like so:

db_collection_a = WeaviateVectorStore.from_documents([], embeddings, client=client, index_name="CollectionA")

let me know if this helps :slight_smile:

Hello Duda, that helped me a lot.
Let me show what I want to do.


Is there any way to use two vector stores as a single retriever, like MultiRetrievalQAChain but on the Weaviate side?
The reason I ask about the Weaviate side: the Weaviate repository from LangChain is easier to understand, more efficient, and makes things better and more trustworthy.

You can only query one collection at a time in Weaviate.

If you want to query over multiple contents, you will need to add them to the same collection, and filter out the contents you want using filters.

Not sure if that would be possible with LangChain (meaning LangChain would query the different collections and then merge the results).
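
Merging results from several collections on the client side is one way to approximate the multi-collection retrieval discussed above. The sketch below uses reciprocal rank fusion (RRF), a generic technique swapped in for illustration rather than something Weaviate or LangChain is known to provide for this case; rrf_merge and the constant k=60 are assumptions. Each input list is assumed to be ranked best-first already, for example the object IDs returned by one query per collection.

```python
from collections import defaultdict


def rrf_merge(ranked_lists: list[list[str]], k: int = 60, top_n: int = 5) -> list[str]:
    """Fuse several best-first ID lists into one ranking via reciprocal rank fusion."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            # Items near the top of any list contribute more to the fused score.
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); ties fall back to the ID for determinism.
    ordered = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    return ordered[:top_n]


collection_a_hits = ["a1", "a2", "shared"]  # hypothetical IDs from CollectionA
collection_b_hits = ["shared", "b1"]        # hypothetical IDs from CollectionB
print(rrf_merge([collection_a_hits, collection_b_hits]))
```

RRF works on ranks only, so it sidesteps the question of whether scores from two different collections are comparable.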

Let me know if this helps!

Thanks!