Use of generative-openai moduleConfig

I am getting ready to launch a new cluster with an updated schema. I do primarily NearText queries.

Original Schema Config

  "moduleConfig": {
        "generative-openai": {
            "model": "gpt-4"
        },
        "text2vec-openai": {
            "model": "ada",
            "modelVersion": "002",
            "type": "text",
            "vectorizeClassName": true
        }
    },

New Schema Config

   "vectorizer" => "text2vec-openai",
   "moduleConfig" => [
      "text2vec-openai" => [
            "vectorizeClassName" => true,
            "model" => "text-embedding-3-large",
            "dimensions" => 3072,
            "type" => "text",
            "vectorizeClassName" => false
        ],
   ],

My question is: Should I still include the generative-openai element?

Hi!

If you are using it, or plan on using it, it’s good to enable it on the server (that’s a really small overhead) and also enable it while creating the collection.

Otherwise, for now, you would need to reindex your data just to enable or change the generative module later (we know, that reindex shouldn’t really be necessary).

Hopefully, in the near future, you will be able to change (or enable/disable) the generative module without needing to reindex the collection.

Let me know if this helps :slight_smile:
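For reference, on the server side modules are enabled via the `ENABLE_MODULES` environment variable. A minimal Docker Compose sketch (the service name and API-key wiring are assumptions; adjust to your deployment):

```yaml
# docker-compose.yml excerpt (sketch): enable both the vectorizer and the
# generative module server-side. Module names here must match the ones
# you reference in the collection's moduleConfig.
services:
  weaviate:
    environment:
      ENABLE_MODULES: "text2vec-openai,generative-openai"
      OPENAI_APIKEY: ${OPENAI_APIKEY}  # assumption: key passed in via env
```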

Given this schema, how do I add it?

Here is an example:

{
  "classes": [
    {
      "class": "Document",
      "description": "A class called document",
      ...,
      "moduleConfig": {
        "generative-openai": {
          "model": "gpt-3.5-turbo",  // Optional - Defaults to `gpt-3.5-turbo`
          "resourceName": "<YOUR-RESOURCE-NAME>",  // For Azure OpenAI - Required
          "deploymentId": "<YOUR-MODEL-NAME>",  // For Azure OpenAI - Required
          "temperatureProperty": <temperature>,  // Optional, applicable to both OpenAI and Azure OpenAI
          "maxTokensProperty": <max_tokens>,  // Optional, applicable to both OpenAI and Azure OpenAI
          "frequencyPenaltyProperty": <frequency_penalty>,  // Optional, applicable to both OpenAI and Azure OpenAI
          "presencePenaltyProperty": <presence_penalty>,  // Optional, applicable to both OpenAI and Azure OpenAI
          "topPProperty": <top_p>,  // Optional, applicable to both OpenAI and Azure OpenAI
        },
      }
    }
  ]
}
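Since the `resourceName` and `deploymentId` keys only apply to Azure OpenAI, a non-Azure config can be as small as this (sketch; omitting `model` entirely falls back to the default `gpt-3.5-turbo`):

```json
"moduleConfig": {
  "generative-openai": {
    "model": "gpt-3.5-turbo"
  }
}
```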

So, to be clear, my new schema with both generative-openai and “text-embedding-3-large” becomes

$schema = [
    "class" => "SolrAI",
    "description" => "Class representing the SolrAI index",
    "vectorizer" => "text2vec-openai",
    "moduleConfig" => [
        "generative-openai" => [
            "model" => "gpt-3.5-turbo",
        ],
        "text2vec-openai" => [
            "model" => "text-embedding-3-large",
            "dimensions" => 3072,
            "type" => "text",
            "vectorizeClassName" => false,
        ],
    ],
…

I do not use Azure and choose the default options for everything else.

Just trying to make sure I get this right. The first time.

Also, for further clarification: the generative-openai moduleConfig only applies when you have a "generate" statement in your query, and not to a plain "NearText" query. Correct?

It’s kind of the opposite :slight_smile:

When you enable it on your server and define a generative module for your collection, it will expose generative functions, so you can then generate content based on your data.

Also, whenever you have a vectorizer defined for your collection, you can do nearText, because Weaviate will be able to vectorize not only the data you bring in, but also the query, so it can compare the distances between your query vector and the vectors of your indexed objects.
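Put differently: nearText only needs the vectorizer, while the generate block is what requires the generative module. A GraphQL sketch (the SolrAI class is from this thread; the `content` property and the search text are placeholders):

```graphql
{
  Get {
    SolrAI(
      nearText: { concepts: ["example search terms"] }  # needs text2vec-openai
      limit: 3
    ) {
      content  # placeholder property
      _additional {
        generate(
          groupedResult: { task: "Summarize these results" }  # needs generative-openai
        ) {
          groupedResult
          error
        }
      }
    }
  }
}
```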

:slight_smile: