Hello, Weaviate noob here. I created a vectorizer using LangChain and it works fine. It also persists to the cloud after creation. But suppose that at some later date I want to get the vectorizer back without rebuilding it. I believe the line would be:
I am far from being a LangChain expert. On top of that, I am also a Weaviate noob myself, as I recently joined Weaviate, hehehe.
What I could find is that, while you need to specify text_key when instantiating the main class, passing it to from_documents() has no effect. It is hardcoded to text here:
text_key is the property where the text will be stored:
I am not sure that hardcoding text in from_texts() is a good thing, because it ties every from_documents() and from_texts() import to that property name, leaving no other option when importing content.
So now it depends on how you imported your data (if you used from_documents, your text_key will be text).
So here is something that has worked for me:
from langchain.vectorstores import Weaviate
import weaviate
# assuming docs, embeddings, and the required dependencies are already set up
WEAVIATE_URL = "http://localhost:8080"
db = Weaviate.from_documents(docs, embeddings, weaviate_url=WEAVIATE_URL, by_text=False, index_name="MyIndex")
# later, you can reconnect to the existing index without rebuilding it:
client = weaviate.Client(WEAVIATE_URL)
db = Weaviate(client=client, index_name="MyIndex", text_key="text")
db.similarity_search_by_text(query="health")
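If you are not sure which text_key an existing index was created with, one option is to look at the class definition and list its text-typed properties. Below is a minimal sketch of that idea, not an official LangChain or Weaviate utility. The helper name text_properties is hypothetical, and example_schema is a hand-written dict shaped like a Weaviate class definition (a "properties" list whose items have "name" and "dataType"). With the v3 Python client I would expect to get the real dict from client.schema.get("MyIndex"), but treat that as an assumption and check it against your client version.

```python
from typing import Any, Dict, List


def text_properties(class_schema: Dict[str, Any]) -> List[str]:
    """Return the names of properties whose dataType includes "text" or "string".

    Hypothetical helper: it only inspects a dict shaped like a Weaviate
    class definition and never talks to a server.
    """
    names = []
    for prop in class_schema.get("properties", []):
        data_types = prop.get("dataType", [])
        if "text" in data_types or "string" in data_types:
            names.append(prop["name"])
    return names


# Hand-written example, NOT real output from a Weaviate instance.
example_schema = {
    "class": "MyIndex",
    "properties": [
        {"name": "text", "dataType": ["text"]},
        {"name": "source", "dataType": ["text"]},
        {"name": "page", "dataType": ["number"]},
    ],
}

# Any of these names is a candidate for text_key; data imported with
# from_documents() should show up under "text".
candidates = text_properties(example_schema)
print(candidates)
```

If "text" is among the candidates, passing text_key="text" to Weaviate(...) as in the snippet above should match what from_documents() stored.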