OpenAI LLM not able to understand a field's meaning

I have created a class with the following properties and ingested around 13K objects.

# ===== define collection =====
class_obj = {
    "class": "ProductCatalogNumeric",
    "vectorizer": "text2vec-openai",  # If set to "none" you must always provide vectors yourself. Could be any other "text2vec-*" also.
    "moduleConfig": {
        "text2vec-openai": {},
        "generative-openai": {}  # Ensure the `generative-openai` module is used for generative queries
    },
    "properties": [
        {
            "name": "title",
            "dataType": ["text"],
            "description": "Name of product that we are selling in our marketplace"
        },
        {
            "name": "jpin",
            "dataType": ["text"],
            "description": "Jpin represent unique identifier for every product"
        },
        {
            "name": "price",
            "dataType": ["number"],
            "description": "The selling price of the product"
        },
        {
            "name": "margin",
            "dataType": ["number"],
            "description": "Percentage margin we earn after selling the product"
        },
    ],
}

client.schema.create_class(class_obj)

But when I ask the question below, the response says the data/field is not present.

This is my query:

response = (
    client.query
    .get("ProductCatalogNumeric", ['title','jpin','price','margin'])
    .with_near_text({"concepts": ["basmati rice"]})
    .with_generate(grouped_task="Tell me average price of all basmati rice products. And show calculation ? ")
    .with_limit(100)
    .do()
)

print(json.dumps(response, indent=4))

And this is the OpenAI LLM response:

{
    "data": {
        "Get": {
            "ProductCatalogNumeric": [
                {
                    "_additional": {
                        "generate": {
                            "error": null,
                            "groupedResult": "To calculate the average price of all basmati rice products, we need to first gather the prices of all the products listed. Since the prices are not provided in the given data, we cannot calculate the average price."
                        }
                    },
                    "jpin": "JPIN-1304444084",
                    "margin": 0.064957,
                    "price": 121.03000000000002,
                    "title": "Daawat Devaaya Basmati Rice, 1Kg Pack"
                },
                {
                    "_additional": {
                        "generate": null
                    },
                    "jpin": "JPIN-1304511345",
                    "margin": 0.0241038235,
                    "price": 2580,
                    "title": "Gauri Rozana Steamed Basmati Rice, 30Kg Bag"
                },
                {
                    "_additional": {
                        "generate": null
                    },
                    "jpin": "JPIN-1304351614",
                    "margin": 0.0638388,
                    "price": 113.20000000000002,
                    "title": "Daawat Heritage Platinum Basmati Rice, Classic, 1Kg Pack"
                },
                {
                    "_additional": {
                        "generate": null
                    },
                    "jpin": "JPIN-1304472302",
                    "margin": 0.063287,
                    "price": 182.56,
                    "title": "Daawat Traditional Basmati Rice, 1Kg Pack"
                },
}

Hi @bhupendra_singh, welcome to the Weaviate forum.
Have you tried using grouped_properties?
You can find an example in the docs: generative search

response = (
    client.query
    .get("ProductCatalogNumeric", ['title','jpin','price','margin'])
    .with_near_text({"concepts": ["basmati rice"]})
    .with_generate(
        grouped_task="Tell me average price of all basmati rice products. And show calculation ? ",
        grouped_properties=["title", "price"],  # <== the list of properties to pass to the LLM
    )
    .with_limit(100)
    .do()
)

Btw, I haven’t tested whether grouped_task also accepts number properties.
I hope that is not what is causing the issue.

Side note - on limit

Btw, if you set the limit to 100, the query returns the 100 nearest results, whether or not all of them are relevant.

What do I mean by that?
If your database has 8 objects related to rice, vector search will return those 8 rice objects first and then keep going with the next closest objects. So the 100th object could be pasta or something similar, which could be considered close because it is also a food/carb product.

You could use autocut (see an example of autocut in our docs), which returns a group of similar objects, and if there is a drop in quality of results, then it cuts off the rest. This way you have a better chance for Weaviate to only use the most relevant group of objects for your generative task.

You can swap

.with_limit(100)

for

.with_autocut(1) # returns the first group of similar objects
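
To illustrate the idea of cutting at a drop in quality, here is a toy sketch in plain Python. It is not Weaviate's actual autocut implementation, which is not shown in this thread and may differ. The distances are made up, and `cut_at_first_jump` and its `min_jump` threshold are hypothetical names used only for this example.

```python
def cut_at_first_jump(distances: list[float], min_jump: float = 0.1) -> list[float]:
    """Keep results up to the first gap between neighbours larger than `min_jump`.

    `distances` must be sorted ascending (nearest result first).
    """
    kept = distances[:1]
    for previous, current in zip(distances, distances[1:]):
        if current - previous > min_jump:
            break  # quality dropped sharply: discard this result and the rest
        kept.append(current)
    return kept


# Made-up distances: four close "rice" hits, then less related products.
distances = [0.10, 0.11, 0.12, 0.13, 0.35, 0.36, 0.37]
rice_only = cut_at_first_jump(distances)
print(rice_only)
```

With a fixed limit, all seven results would be passed on to the generative task. Cutting at the jump keeps only the first tight group.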

I hope this helps.

Hi @bhupendra_singh,
I’ve just checked with the team. Generative modules currently only use text properties for generative tasks (both single_prompt and grouped_task).

A workaround would be to convert the price property to a string.

        {
            "name": "price",
            "dataType": ["text"],
            "description": "The selling price of the product",
        },
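
With the schema changed like that, the values also need to be sent as strings at ingestion time. A minimal sketch of that conversion, assuming each product is a plain dict before import. The helper name `to_text_properties` and the two-decimal formatting are my own choices, not something the client requires.

```python
def to_text_properties(product: dict, fields: tuple = ("price",)) -> dict:
    """Return a copy of `product` with the given numeric fields rendered as text."""
    converted = dict(product)  # shallow copy so the original stays numeric
    for field in fields:
        # Two decimals hides float noise such as 121.03000000000002.
        converted[field] = f"{product[field]:.2f}"
    return converted


# Sample object taken from the response earlier in the thread.
product = {
    "title": "Daawat Devaaya Basmati Rice, 1Kg Pack",
    "jpin": "JPIN-1304444084",
    "price": 121.03000000000002,
    "margin": 0.064957,
}
text_product = to_text_properties(product)
print(text_product["price"])
```

The converted dict is what you would pass to your import code instead of the original one. If margin should also be visible to the LLM, it would need the same treatment, with more decimal places.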

Btw, I would be careful about relying on LLMs for calculations. They sometimes hallucinate, so the numbers you get back might be inaccurate.
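
One way to avoid that is to do the arithmetic yourself on the query results. A minimal sketch, assuming the response has the same shape as the one posted above. The `response` dict here is a trimmed copy of that output with only the fields the calculation needs.

```python
from statistics import mean

# Trimmed copy of the query response shown earlier in the thread.
response = {
    "data": {
        "Get": {
            "ProductCatalogNumeric": [
                {"title": "Daawat Devaaya Basmati Rice, 1Kg Pack", "price": 121.03000000000002},
                {"title": "Gauri Rozana Steamed Basmati Rice, 30Kg Bag", "price": 2580},
                {"title": "Daawat Heritage Platinum Basmati Rice, Classic, 1Kg Pack", "price": 113.20000000000002},
                {"title": "Daawat Traditional Basmati Rice, 1Kg Pack", "price": 182.56},
            ]
        }
    }
}

products = response["data"]["Get"]["ProductCatalogNumeric"]
# float() accepts both numbers and numeric strings, so this still works
# if price is stored as text.
prices = [float(p["price"]) for p in products]
average_price = mean(prices)
print(f"Average price over {len(prices)} products: {average_price:.2f}")
```

The LLM can then be used for the wording of the answer, while the numbers come from your own code.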