@DudaNogueira I thought I would create a new thread for this issue rather than hijacking New OpenAI Embedding Models - #21 by SomebodySysop
The problem is that once a Weaviate server is configured with an OpenAI vectorizer using the new `text-embedding-3-large` model with `dimensions` set to 1024, hybrid queries fail after a server restart with a `vector search: vector lengths don't match: 1024 vs 3072` error.
I was able to replicate this issue on CodeSandbox, using Weaviate v1.23.10 and Python client v4.4.4.
Steps to reproduce
- https://codesandbox.io/p/sandbox/interesting-morse-hgvggd
- Sign-in using SSO of choice
- Open setup.py and query.py and update line 16 in each with an OpenAI API key. As you edit, CodeSandbox will "seamlessly fork" the sandbox into your own private copy. If the URL does not change, you may have to go back to the CodeSandbox dashboard, go to My drafts, and open the newly created sandbox.
- Go to the top left corner and select the "Restart Devbox" option. This should trigger sandbox initialization. Wait for the container to start and the `pip install -r requirements.txt` job to complete.
- Open a new terminal in the center bottom pane.
- Run the following in sequence:

```shell
docker compose down -v
docker compose up -d
python setup.py
python query.py
```
Note the following:
- setup.py creates a collection and inserts a single object
- The single object we stored in Weaviate has a vector length of 1024, indicating the vectorizer is working properly
- We can fetch that object from Weaviate, confirming that the inserted object is persisted
- We can run a hybrid query against Weaviate
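For reference, the relevant part of setup.py looks roughly like this (a sketch using the Python client v4 API; the collection name, property, inserted text, and key placeholder are illustrative, and it requires the local docker compose instance to be running):

```python
import weaviate
import weaviate.classes.config as wc

# Connect to the local docker compose instance; the OpenAI key is passed
# as a header so Weaviate can call the embedding API on our behalf.
client = weaviate.connect_to_local(
    headers={"X-OpenAI-Api-Key": "YOUR-OPENAI-API-KEY"}
)

# Vectorize with the new OpenAI model, truncated to 1024 dimensions.
client.collections.create(
    "Docs",
    vectorizer_config=wc.Configure.Vectorizer.text2vec_openai(
        model="text-embedding-3-large",
        dimensions=1024,
    ),
    properties=[wc.Property(name="text", data_type=wc.DataType.TEXT)],
)

# Insert a single object; the vectorizer embeds it on ingest,
# producing the 1024-length vector noted above.
client.collections.get("Docs").data.insert({"text": "hello world"})
client.close()
```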
- Now run:

```shell
docker compose restart
python query.py
```
All we've done here is restart the Weaviate container. Notice that we can still fetch the inserted object (see the output above the exception), but the hybrid query now fails with the vector-length mismatch error.
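For context, the failing call in query.py is essentially the following (a sketch; the collection name, search text, and key placeholder are illustrative, and a running server plus a valid OpenAI key are required):

```python
import weaviate

client = weaviate.connect_to_local(
    headers={"X-OpenAI-Api-Key": "YOUR-OPENAI-API-KEY"}
)
docs = client.collections.get("Docs")

# Fetching the stored object still succeeds after the restart...
print(docs.query.fetch_objects(limit=1).objects)

# ...but the hybrid query now fails with:
#   vector search: vector lengths don't match: 1024 vs 3072
print(docs.query.hybrid(query="hello", limit=1).objects)
client.close()
```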