I’m trying to ingest data into Weaviate: a mix of plain text and other formats, such as PDF, that I convert to text batches using “unstructured”.
I’m basically following what is described in “ingesting PDF”, but I suppose I’m missing something or doing something wrong.
If I query the data as follows:
response = coll.query.bm25(
    query="metal oxide",
    limit=2,
    return_metadata=MetadataQuery(distance=True),
)
I get a result, while using:
res = coll.generate.near_text(
    query="metal oxide",
    limit=2,
    # return_metadata=MetadataQuery(distance=True),
    single_prompt="Summarize {coll_name}, use a maximum of 20 words.",
)
I get nothing.
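One thing that can be checked independently of the server: as far as I understand, the curly-brace placeholders in single_prompt are filled in from the properties of each returned object, so {coll_name} would have to be a property name. The snippet below is only a sketch of that check; the helper names are hypothetical and the property list (just "files") is an assumption based on the property mentioned in this post, not the full schema.

```python
from string import Formatter


def prompt_placeholders(prompt: str) -> set[str]:
    """Return the names used as {placeholders} in a prompt template."""
    # Formatter().parse yields (literal, field_name, format_spec, conversion);
    # field_name is None for plain text segments, so those are skipped.
    return {name for _, name, _, _ in Formatter().parse(prompt) if name}


def missing_properties(prompt: str, properties: list[str]) -> set[str]:
    """Return placeholders that do not match any known property name."""
    return prompt_placeholders(prompt) - set(properties)


# Assumed property list: only "files" is known from this post.
prompt = "Summarize {coll_name}, use a maximum of 20 words."
print(missing_properties(prompt, ["files"]))
```

If the returned set is not empty, the prompt refers to something that is not a property of the collection, which may be worth ruling out before looking at the vectorizer or generative module configuration.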
I would like to perform semantic search + RAG on the property “files” (DataType.TEXT_ARRAY), which contains the text batches extracted with partition_pdf from unstructured.
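To make "text batches" concrete, here is a minimal sketch of how extracted text can be grouped into a list of strings suitable for a TEXT_ARRAY property. It is an illustration under assumptions, not the exact ingestion code: batch_texts and max_chars are hypothetical, and the commented-out lines assume the elements returned by partition_pdf expose their content as .text.

```python
def batch_texts(texts: list[str], max_chars: int = 1000) -> list[str]:
    """Group short text pieces into batches of at most max_chars characters.

    A single piece longer than max_chars is kept as its own batch.
    """
    batches: list[str] = []
    current = ""
    for text in texts:
        text = text.strip()
        if not text:
            continue  # skip empty pieces
        candidate = f"{current}\n{text}" if current else text
        if current and len(candidate) > max_chars:
            # The current batch is full: store it and start a new one.
            batches.append(current)
            current = text
        else:
            current = candidate
    if current:
        batches.append(current)
    return batches


# Hypothetical usage with unstructured (file name is a placeholder):
# from unstructured.partition.pdf import partition_pdf
# elements = partition_pdf(filename="example.pdf")
# files = batch_texts([el.text for el in elements])

print(batch_texts(["aaa", "bbb", "cc"], max_chars=7))
```

The resulting list of strings is what would be stored in “files” for one object.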
The schema is the following: