Weaviate OpenAI Embedding Models

Do we have any models for the text2vec-openai embedding module with a token limit greater than 8192?

The message I'm getting:
weaviate.exceptions.UnexpectedStatusCodeException: Create class! Unexpected status code: 422, with response body: {'error': [{'message': "module 'text2vec-openai': wrong OpenAI model name, available model names are: [ada babbage curie davinci text-embedding-3-small text-embedding-3-large]"}]}.

"moduleConfig": {
                "generative-openai": {},
                "text2vec-openai": {
                    "model": "?????",
                }
            },

Hi @spark!!

By default, if you do not provide a model, it will use ada.

However, you can use any of the supported models, as stated in the error message:

  • ada
  • babbage
  • curie
  • davinci
  • text-embedding-3-small
  • text-embedding-3-large

Notice that with the last two you can also specify the dimensions.
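
For example, a class definition that pins the model and dimensions could look like the sketch below (v3 Python client; the class name, property, and URL are placeholders I made up, not values from this thread):

    import weaviate

    client = weaviate.Client("http://localhost:8080")  # placeholder URL

    class_obj = {
        "class": "Document",  # hypothetical class name
        "vectorizer": "text2vec-openai",
        "moduleConfig": {
            "generative-openai": {},
            "text2vec-openai": {
                "model": "text-embedding-3-large",
                "dimensions": 1024,  # only the text-embedding-3-* models accept this
            },
        },
        "properties": [
            {"name": "content", "dataType": ["text"]},
        ],
    }

    client.schema.create_class(class_obj)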

You can find more information on this here:

Let me know if that helps!

Thanks!

I totally understand, @DudaNogueira,
but could you please help me with the issue I'm facing? I was using the default.

{'error': [{'message': "update vector: connection to: OpenAI API failed with status: 400 error: This model's maximum context length is 8192 tokens, however you requested 9655 tokens (9655 in your prompt; 0 for the completion). Please reduce your prompt; or completion length."}]}

{'error': [{'message': "update vector: connection to: OpenAI API failed with status: 400 error: This model's maximum context length is 8192 tokens, however you requested 9745 tokens (9745 in your prompt; 0 for the completion). Please reduce your prompt; or completion length."}]}

All of OpenAI's embedding models currently max out at 8192 tokens. Some open-source embedding models support larger context windows, but I'd suggest chunking your data; you'll (probably) get better performance that way too.
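
As a rough illustration, here is a minimal token-based chunker (a sketch assuming the tiktoken package; the 512-token size and 50-token overlap are arbitrary starting points, not recommendations):

    import tiktoken

    def chunk_by_tokens(text, max_tokens=512, overlap=50):
        # Tokenizer used by OpenAI's current embedding models.
        enc = tiktoken.get_encoding("cl100k_base")
        tokens = enc.encode(text)
        chunks = []
        step = max_tokens - overlap
        for start in range(0, len(tokens), step):
            # Each slice stays well under the 8192-token embedding limit.
            chunks.append(enc.decode(tokens[start:start + max_tokens]))
        return chunks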

Could you please guide me in this regard?
@DudaNogueira @JK_Rider

Here’s a quick guide on chunking which should help out: A Guide to Chunking Strategies for Retrieval Augmented Generation (RAG) — Sagacify.

Hi @spark!!

As @JK_Rider mentioned, the issue is passing too much context.

If you see this when vectorizing (which seems to be the case, considering the “update vector” part of the log), it is probably because your chunks are too big to fit in that context window.

However, if you see this while generating, you are probably passing too many objects (limit=X) to the generation step.
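
For instance, with the v3 Python client, lowering with_limit() shrinks the prompt sent to the generative module (a sketch; the class, property, query text, and URL are placeholders):

    import weaviate

    client = weaviate.Client("http://localhost:8080")  # placeholder URL

    response = (
        client.query
        .get("Document", ["content"])  # hypothetical class and property
        .with_near_text({"concepts": ["chunking strategies"]})
        .with_generate(grouped_task="Summarize these documents in one paragraph.")
        .with_limit(3)  # fewer retrieved objects -> smaller generation prompt
        .do()
    )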

Here is some other content on chunking. As you will soon discover, there isn’t a “one size fits all” approach, as it will depend on a lot of requirements.

And also this video on advanced RAG techniques:

Thanks!

By the way, we have an upcoming webinar on this topic:

Chunking

Live workshop
Wednesday, August 28th
9am PDT, 12pm EDT, 6pm CEST

Since the subject is chunking, my two cents: Using gpt-4 API to Semantically Chunk Documents - #166 by SomebodySysop - API - OpenAI Developer Forum
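
The idea there, roughly, is to let a chat model propose the split points. A minimal sketch (assuming the openai Python package; the model name, prompt, and “---” delimiter are my assumptions, not that thread's approach verbatim):

    from openai import OpenAI

    oai = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    def semantic_chunk(text):
        # Ask the model to mark topic boundaries with a delimiter we can split on.
        response = oai.chat.completions.create(
            model="gpt-4o",  # placeholder; any capable chat model
            messages=[
                {"role": "system", "content": "Split the user's text into self-contained "
                    "sections at topic boundaries. Separate sections with a line containing only '---'."},
                {"role": "user", "content": text},
            ],
        )
        content = response.choices[0].message.content
        return [part.strip() for part in content.split("---") if part.strip()]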
