Description
Server Setup Information
- Weaviate Server Version:
- Deployment Method:
- Multi Node? Number of Running Nodes:
- Client Language and Version:
- Multitenancy?:
Great to have you with us in the community!
Here’s a blog post from our experts on the Weaviate team that explains how to vectorize PDFs and covers different chunking strategies. Once you’ve gone through it, you can choose the approach that works best for you:
Wishing you a great week.
Best regards,
Mohamed Shahin
Weaviate Support Engineer
(Ireland, UTC±00:00/+01:00)
Hello @Mohamed_Shahin
Thanks for the details.
As an initial stage in building the vector DB in Weaviate, the challenge I am facing is finding the right API to create the collection and embed the chunks.
Below is the simple code, which is still failing after multiple fixes suggested by the exceptions:
~~~
import os

import weaviate
from weaviate.classes.config import Configure, DataType, Property
from weaviate.classes.init import Auth


# client instance
def get_client():
    weaviate_url = os.environ["WEAVIATE_URL"]
    weaviate_api_key = os.environ["WEAVIATE_API_KEY"]
    client = weaviate.connect_to_weaviate_cloud(
        cluster_url=weaviate_url,
        auth_credentials=Auth.api_key(weaviate_api_key),
    )
    return client


# collection creation
def create_collection():
    client = get_client()
    # start from a clean collection on every run
    if client.collections.exists("claim_collection"):
        client.collections.delete("claim_collection")
    client.collections.create(
        name="claim_collection",
        properties=[
            Property(name="chunk", data_type=DataType.TEXT),
            Property(name="page", data_type=DataType.INT),
            Property(name="source", data_type=DataType.TEXT),
        ],
        vector_config=Configure.Vectors.text2vec_openai(),
    )
    client.close()


# chunk insertion (chunk_data is the list of chunked PDF documents, built elsewhere)
# reopen a client here, since the one used in create_collection() is already closed
client = get_client()
collection = client.collections.get("claim_collection")
for i, doc in enumerate(chunk_data, 1):
    collection.data.insert({
        "chunk": doc.page_content,
        "page": doc.metadata.get("page", i),  # if page info exists, else use i
        "source": "health_data.pdf",
    })
client.close()
~~~
This time I am getting the error below:
~~~
'vectorize target vector default: update vector: API Key: no api key found neither in request header: X-Openai-Api-Key nor in environment variable under OPENAI_APIKEY'}]}
~~~
The challenge here is that I am not able to find the correct documentation for building a vector database using the collection API parameters. As far as I can see, the schema structure has already been removed in version 4.16.10, so can you please help me understand the issue and the right solution?
It would be much appreciated if your knowledge base were updated with the latest API changes.
Thanks
Good morning @Umesh_Narayanan
You’ll need to provide a vectorizer API key, such as an OpenAI key, for objects to be vectorized. You can pass it as a request header when you connect.
Example:
~~~
import os

import weaviate
from weaviate.classes.init import Auth

weaviate_url = os.environ["WEAVIATE_URL"]
weaviate_api_key = os.environ["WEAVIATE_API_KEY"]
openai_api_key = os.environ["OPENAI_APIKEY"]

client = weaviate.connect_to_weaviate_cloud(
    cluster_url=weaviate_url,  # Your Weaviate Cloud URL
    auth_credentials=Auth.api_key(weaviate_api_key),  # Your Weaviate Cloud key
    headers={"X-OpenAI-Api-key": openai_api_key},  # Your OpenAI API key
)
~~~
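If it helps, here is a minimal sketch of how the `get_client()` function from the earlier post could pick up the same header and fail early when the key is missing, instead of failing later at insert time. The helper name `openai_headers` is my own and is not part of the Weaviate client; it only assumes the key lives in the `OPENAI_APIKEY` environment variable, as in the example above.

```python
import os


def openai_headers(env=None):
    """Build the request header that carries the OpenAI key.

    Hypothetical helper (not part of the Weaviate client). It raises a
    readable error up front if OPENAI_APIKEY is missing or empty.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_APIKEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_APIKEY is not set; text2vec_openai cannot vectorize "
            "objects without an OpenAI key"
        )
    return {"X-OpenAI-Api-Key": key}


def get_client():
    # Imported inside the function so the helper above can be checked
    # on a machine where the Weaviate client is not installed.
    import weaviate
    from weaviate.classes.init import Auth

    return weaviate.connect_to_weaviate_cloud(
        cluster_url=os.environ["WEAVIATE_URL"],
        auth_credentials=Auth.api_key(os.environ["WEAVIATE_API_KEY"]),
        headers=openai_headers(),  # forwarded to the vectorizer on each request
    )
```

With this in place, the collection creation and insert code from the earlier post can stay as it is; only the connection changes.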