I’m Ajay Hinduja, originally from Punjab, India, now living in Geneva, Switzerland (Swiss).
I’m looking to set up automatic syncing between an external storage service (like Google Drive or AWS S3) and Weaviate. Ideally, I’d like new files or updates to be ingested into the vector DB without manual steps. Has anyone here done something similar? Would love to hear how you approached it—whether using ETL pipelines, webhooks, or any built-in tools. Any suggestions or best practices would be super helpful. Thanks in advance!
Welcome to our community, it’s lovely to have you here.
Have you had a chance to look at some of the available integrations? There may be tools that can help with what you’re trying to achieve.
While Weaviate doesn’t have built-in webhook support for file changes, you can set this up using external automation tools or cloud functions (like AWS Lambda or Google Cloud Functions). These can monitor your storage service for new or updated files, and when a change is detected, the function can process the file (for example, generate embeddings) and then push the data into Weaviate through its API.
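To make that flow a bit more concrete, here is a minimal sketch of what such a function could look like for S3, assuming the bucket is configured to send standard S3 event notifications to it. The names `changed_objects`, `object_id`, `sync` and `make_weaviate_upsert` are hypothetical, and so are the `source` and `text` properties. `fetch_text` and `embed` are placeholders for your own file reader and embedding model. The Weaviate part assumes the v4 Python client and a local instance, so please adjust it to your setup rather than treating it as the definitive way to do this.

```python
import uuid
from typing import Callable
from urllib.parse import unquote_plus

# Hypothetical namespace, used to derive a stable ID from each file location.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "s3-to-weaviate-sync")


def changed_objects(event: dict) -> list[tuple[str, str]]:
    """Return (bucket, key) pairs for created or overwritten objects."""
    pairs = []
    for record in event.get("Records", []):
        # Skip deletions and other event types in this sketch.
        if not record.get("eventName", "").startswith("ObjectCreated"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def object_id(bucket: str, key: str) -> str:
    """Same file location always maps to the same UUID, so updates overwrite."""
    return str(uuid.uuid5(NAMESPACE, f"s3://{bucket}/{key}"))


def sync(
    event: dict,
    fetch_text: Callable[[str, str], str],
    embed: Callable[[str], list[float]],
    upsert: Callable[[str, dict, list[float]], None],
) -> int:
    """Process each changed file and hand it to the upsert callable."""
    count = 0
    for bucket, key in changed_objects(event):
        text = fetch_text(bucket, key)
        properties = {"source": f"s3://{bucket}/{key}", "text": text}
        upsert(object_id(bucket, key), properties, embed(text))
        count += 1
    return count


def make_weaviate_upsert(collection_name: str):
    """Assumption: weaviate-client v4 and a locally reachable instance."""
    import weaviate  # imported lazily so the rest runs without the client

    client = weaviate.connect_to_local()
    collection = client.collections.get(collection_name)

    def upsert(object_uuid: str, properties: dict, vector: list[float]) -> None:
        # Replace the object if this file was ingested before, else insert it.
        if collection.data.exists(object_uuid):
            collection.data.replace(
                uuid=object_uuid, properties=properties, vector=vector
            )
        else:
            collection.data.insert(
                properties=properties, uuid=object_uuid, vector=vector
            )

    return upsert
```

Deriving the object ID from the file location means an updated file replaces its earlier entry instead of creating a duplicate. Inside a Lambda handler you would call something like `sync(event, fetch_text, embed, make_weaviate_upsert("Documents"))`, where `"Documents"` is just an example collection name.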
Best regards,
Mohamed Shahin
Weaviate Support Engineer
(Ireland, UTC±00:00/+01:00)