Description
Is there a way to set the API version for the Azure OpenAI vectorizer in the v4 Python client? I tried
passing OPENAI_API_VERSION, but Weaviate still seems to default to:
/embeddings?api-version=2024-02-01
…but our Azure OpenAI instance uses 2023-05-15.
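For context, here is a minimal sketch of the two request URLs being compared. It assumes the standard Azure OpenAI REST layout; the resource and deployment names are placeholders, the helper `azure_embeddings_url` is hypothetical, and this is not necessarily how Weaviate builds the URL internally.

```python
from urllib.parse import urlencode


def azure_embeddings_url(resource_name: str, deployment_id: str, api_version: str) -> str:
    """Build an Azure OpenAI embeddings endpoint for a given api-version.

    Assumes the standard https://<resource>.openai.azure.com host layout.
    """
    base = f"https://{resource_name}.openai.azure.com"
    path = f"/openai/deployments/{deployment_id}/embeddings"
    return f"{base}{path}?{urlencode({'api-version': api_version})}"


# Placeholder resource and deployment names, for illustration only.
observed = azure_embeddings_url("my-resource", "my-embedding", "2024-02-01")
wanted = azure_embeddings_url("my-resource", "my-embedding", "2023-05-15")
print("observed:", observed)
print("wanted:  ", wanted)
```

The only difference between the two is the `api-version` query parameter, which is the value I am trying to control.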
Server Setup Information
- Weaviate Server Version: 1.26.1
- Deployment Method: Docker
- Multi Node? Number of Running Nodes: Single
- Client Language and Version: Python weaviate_client-4.7.1
- Multitenancy?:
Any additional Information
weaviate:
  image: cr.weaviate.io/semitechnologies/weaviate:latest
  ports:
    - 8080:8080
    - 50051:50051
  environment:
    QUERY_DEFAULTS_LIMIT: 25
    AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
    PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
    AZURE_APIKEY: ${AZURE_OPENAI_API_KEY}
    OPENAI_API_TYPE: 'azure'
    ENABLE_MODULES: 'text2vec-azure-openai, text2vec-openai'
    DEFAULT_VECTORIZER_MODULE: 'none'
    OPENAI_API_VERSION: ${AZURE_OPENAI_API_VERSION}
    AZURE_ENDPOINT: ${AZURE_OPENAI_ENDPOINT}
    AZURE_OPENAI_BASE_URL: ${AZURE_OPENAI_BASE_URL}
    AZURE_OPENAI_RESOURCE_NAME: ${AZURE_OPENAI_RESOURCE_NAME}
    AZURE_OPENAI_API_VERSION: ${AZURE_OPENAI_API_VERSION}
    AZURE_OPENAI_EMBEDDING_DEPLOYMENT: ${AZURE_OPENAI_EMBEDDING_DEPLOYMENT}
    CLUSTER_HOSTNAME: 'node1'
  volumes:
    - weaviate_data:/var/lib/weaviate
class WeaviateStore:
    def __init__(self, settings: Settings):
        self.settings = settings
        self.client: weaviate.WeaviateAsyncClient = None
        self.class_name = settings.WEAVIATE_CLASS_NAME

    async def initialize(self):
        if self.client is None:
            connection_params = ConnectionParams.from_params(
                http_host=self.settings.WEAVIATE_HOST,
                http_port=self.settings.WEAVIATE_PORT,
                http_secure=False,
                grpc_host=self.settings.WEAVIATE_HOST,
                grpc_port=self.settings.WEAVIATE_GRPC_PORT,
                grpc_secure=False,
            )
            self.client = weaviate.WeaviateAsyncClient(
                connection_params=connection_params,
                additional_headers={
                    "X-Azure-Api-Key": self.settings.AZURE_OPENAI_API_KEY,
                },
            )
            await self.client.connect()
            await self._create_or_update_schema()

    async def _create_or_update_schema(self):
        try:
            collection = self.client.collections.get(self.class_name)
            logger.info(f"Collection '{self.class_name}' already exists. Updating configuration.")
            await self.client.collections.delete(self.class_name)
        except UnexpectedStatusCodeError as e:
            if e.status_code != 404:
                logger.error(f"Unexpected error when checking for existing collection: {str(e)}")
                raise
        try:
            properties = [
                Property(name="content", data_type=DataType.TEXT),
                Property(name="metadata", data_type=DataType.OBJECT, nested_properties=[
                    Property(name="source", data_type=DataType.TEXT),
                    Property(name="document_type", data_type=DataType.TEXT),
                    Property(name="npi", data_type=DataType.TEXT),
                ]),
                Property(name="npi", data_type=DataType.TEXT),
            ]
            vectorizer_config = Configure.Vectorizer.text2vec_azure_openai(
                resource_name=self.settings.AZURE_OPENAI_RESOURCE_NAME,
                deployment_id=self.settings.AZURE_OPENAI_EMBEDDING_DEPLOYMENT,
                base_url=self.settings.AZURE_OPENAI_ENDPOINT
            )
            logger.info(f"Creating new collection '{self.class_name}'")
            await self.client.collections.create(
                name=self.class_name,
                properties=properties,
                vectorizer_config=vectorizer_config
            )
            logger.info(f"Successfully created collection '{self.class_name}'")
        except UnexpectedStatusCodeError as e:
            logger.error(f"Failed to create collection: {str(e)}")
            raise
        except Exception as e:
            logger.error(f"Unexpected error during schema creation/update: {str(e)}", exc_info=True)
            raise