Hey everyone,
we are planning to use Weaviate as the vector database for our internal RAG system. Since our infrastructure runs on Google Cloud, we are struggling to deploy Weaviate as a Cloud Run service: Cloud Run only allows one exposed port per service. All our attempts to work around this have failed. We tried deploying two containers, one for REST and one for gRPC, sharing the same volume, and then accessing them via separate http_endpoint and grpc_endpoint values, but that broke down due to concurrency problems. Is there any way to make this work without adding the complexity of a proxy service? Are we missing an obvious solution?
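For context, this is roughly the kind of connection we were attempting with the v4 Python client. The two hostnames below are hypothetical placeholders for the two Cloud Run services (one for REST, one for gRPC), so treat this as a sketch of the attempted setup rather than a working configuration:

import weaviate
from weaviate.classes.init import Auth

# Two separate Cloud Run services, one serving REST and one serving gRPC
# (both hostnames and the API key are placeholders)
client = weaviate.connect_to_custom(
    http_host="weaviate-rest-xxxxx.a.run.app",
    http_port=443,
    http_secure=True,
    grpc_host="weaviate-grpc-xxxxx.a.run.app",
    grpc_port=443,
    grpc_secure=True,
    auth_credentials=Auth.api_key("<api-key>"),
)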
We are aware that deploying Weaviate on Kubernetes (GKE) would be an option, but for now we want to stick to Cloud Run. We are also aware of the managed Weaviate Cloud offering, but we want to manage the database ourselves.
Thanks in advance!
I managed to get a first workaround in place by adding an Envoy proxy. However, I now have the same host and port for HTTP and gRPC, and the pydantic validation in connect_to_custom fails because of that. Is there any way to disable this validation? Apart from that, it seems to work.
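One thing I might try next is instantiating the client explicitly via ConnectionParams instead of going through the connect_to_custom helper. I have not verified whether this runs into the same pydantic check, so the following is only a sketch (hostname and API key are placeholders):

import weaviate
from weaviate.connect import ConnectionParams

# Same host and port for HTTP and gRPC, both routed through the Envoy proxy
# (hostname and API key are placeholders)
client = weaviate.WeaviateClient(
    connection_params=ConnectionParams.from_params(
        http_host="weaviate.mydomain.com",
        http_port=443,
        http_secure=True,
        grpc_host="weaviate.mydomain.com",
        grpc_port=443,
        grpc_secure=True,
    ),
    auth_client_secret=weaviate.auth.AuthApiKey("<api-key>"),
)
client.connect()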
Hi @cluel01 !!
Welcome to our community 
I have never seen this tried before, to be honest, so I am not sure exactly how it plays out.
But before we get to the client pydantic issue, can you make sure the gRPC port is serving?
You can test that with grpcurl:
# let's test our gRPC connection
❯ wget https://raw.githubusercontent.com/grpc/grpc/master/src/proto/grpc/health/v1/health.proto
❯ grpcurl -d '{"service": "Weaviate"}' -proto health.proto grpc.weaviate.mydomain.com:50051 grpc.health.v1.Health/Check
{
"status": "SERVING"
}
Let me know if you can share how you did it, as I would love to try that too!
Thanks!
I am facing a similar issue. My setup (in Terraform) is as follows:
# Weaviate deployment on Cloud Run
resource "google_cloud_run_v2_service" "weaviate" {
  name     = "weaviate"
  location = var.region
  project  = var.project_id
  ingress  = "INGRESS_TRAFFIC_ALL" # TEMPORARY; TODO restrict to Cloud Run and Cloud Functions

  template {
    service_account = google_service_account.weaviate_sa.email

    containers {
      image = "semitechnologies/weaviate:1.23.7"

      ports {
        container_port = 443
      }

      resources {
        limits = {
          cpu    = "1000m"
          memory = "2Gi"
        }
        startup_cpu_boost = true
      }

      # Basic Weaviate configuration
      env {
        name  = "PERSISTENCE_DATA_PATH"
        value = "/var/lib/weaviate"
      }
      env {
        name  = "AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED"
        value = "false"
      }
      env {
        name  = "AUTHENTICATION_APIKEY_ENABLED"
        value = "true"
      }
      env {
        name  = "AUTHENTICATION_APIKEY_USERS"
        value = "default_user"
      }
      env {
        name = "AUTHENTICATION_APIKEY_ALLOWED_KEYS"
        value_source {
          secret_key_ref {
            secret  = google_secret_manager_secret.weaviate_api_key.secret_id
            version = "latest"
          }
        }
      }

      # Enable OpenAI vectorizer modules
      env {
        name  = "DEFAULT_VECTORIZER_MODULE"
        value = "text2vec-openai"
      }
      env {
        name  = "ENABLE_MODULES"
        value = "text2vec-openai"
      }

      # OpenAI API configuration
      env {
        name = "OPENAI_APIKEY"
        value_source {
          secret_key_ref {
            secret  = google_secret_manager_secret.openai_api_key.secret_id
            version = "latest"
          }
        }
      }

      # PostgreSQL persistence configuration
      env {
        name  = "PERSISTENCE_PROVIDER"
        value = "postgresql"
      }
      env {
        name  = "PERSISTENCE_POSTGRESQL_HOST"
        value = google_sql_database_instance.main_db.public_ip_address
      }
      env {
        name  = "PERSISTENCE_POSTGRESQL_PORT"
        value = "5432"
      }
      env {
        name  = "PERSISTENCE_POSTGRESQL_DATABASE"
        value = google_sql_database.weaviate_database.name
      }
      env {
        name  = "PERSISTENCE_POSTGRESQL_USER"
        value = google_sql_user.weaviate_user.name
      }
      env {
        name = "PERSISTENCE_POSTGRESQL_PASSWORD"
        value_source {
          secret_key_ref {
            secret  = google_secret_manager_secret.weaviate_db_password.secret_id
            version = "latest"
          }
        }
      }

      env {
        name  = "ENABLE_GRPC"
        value = "true"
      }
      env {
        name  = "GRPC_PORT"
        value = "443"
      }
      env {
        name  = "GRPC_SECURE"
        value = "true"
      }

      volume_mounts {
        name       = "cloudsql"
        mount_path = "/cloudsql"
      }
    }

    volumes {
      name = "cloudsql"
      cloud_sql_instance {
        instances = [google_sql_database_instance.main_db.connection_name]
      }
    }
  }
}
When I try to connect, I get:
client: weaviate.WeaviateClient = weaviate.connect_to_weaviate_cloud(
    cluster_url=weaviate_url,
    auth_credentials=weaviate.classes.init.Auth.api_key(weaviate_api_key),
    additional_config=weaviate.classes.init.AdditionalConfig(),
)
weaviate.exceptions.UnexpectedStatusCodeError: Meta endpoint! Unexpected status code: 502, with response body: None.
I had everything working with version 1.22. Version 1.23 with gRPC (as required by the clients) is what I switched to today.
When trying to connect using
client: weaviate.WeaviateClient = weaviate.connect_to_custom(
    http_host=weaviate_url,
    http_port=443,
    http_secure=True,
    grpc_host=weaviate_url,
    grpc_port=443,
    grpc_secure=True,
    auth_credentials=weaviate.classes.init.Auth.api_key(weaviate_api_key),
)
I get the validation error described above.
gRPC health check:
grpcurl \
-import-path . \
-proto health.proto \
-d '{"service": "Weaviate"}' \
-H "Authorization: Bearer XXXXXXXX" \
weaviate-XXXXX:443 \
grpc.health.v1.Health/Check
{
"status": "SERVING"
}
I would also try combining it with some nginx (are all gRPC requests sent with the application/grpc content-type header?)
Hi!
Considering the error message, it seems the client was not able to reach the meta endpoint.
My guess is that Weaviate was not up and running by the time the client reached out (hence the 502 / Bad Gateway).
Can you try hitting this same endpoint directly over REST?
Here is how:
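A quick sketch with the Python requests package works; it simply issues a GET against the /v1/meta REST endpoint (the URL and API key below are placeholders for your own deployment):

import requests

# Hit the /v1/meta REST endpoint directly
# (URL and API key are placeholders)
resp = requests.get(
    "https://weaviate-XXXXX.a.run.app/v1/meta",
    headers={"Authorization": "Bearer <your-api-key>"},
    timeout=10,
)
print(resp.status_code)
print(resp.json())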
So maybe add a sleep to allow time for Weaviate to be up and running?
Also, as I said earlier, this is not a common scenario. As you scale, running in Cloud Run / serverless will probably not give you the reliability you need.
Let me know if this helps!
Thanks!