Problem with Q&A using a local vectorization model (text2vec-transformers)

I’m having some problems with a Q&A system using the text2vec-transformers model running locally. I’m using Weaviate version 1.23.9.

When I ask a question about a processed and vectorized document, it responds properly, giving me the correct answer using the correct context (I can track which context it answers with).

I also set a limit of 3 contexts per question to build the answer.

The problem is that when I reboot the system and ask the same question, Weaviate doesn’t find any context for the response, or it finds only one context where, before the reboot, it found 3 contexts for the same question.

I checked the Weaviate database and all the information is still there. I also checked the vectors and they are the same, and the creationTimeUnix and lastUpdateTimeUnix haven’t changed.

I don’t know what is happening.

This is the code that retrieves the context from Weaviate:

def get_vectors(prompt, technology):
    try:
        response = (
            client.query
            .get("Context", ["content", "technology", "document", "titleTotal"])
            # Semantic search on the prompt; only hits within a
            # distance of 0.65 are returned.
            .with_near_text({
                "concepts": [prompt],
                "distance": 0.65
            })
            # Restrict the search to the selected technology.
            .with_where({
                "path": ["technology"],
                "operator": "Equal",
                "valueText": str(technology)
            })
            .with_additional(["id"])
            .with_limit(3)
            .do()
        )
        return response
    except Exception as ex:
        print(ex)

Hi @garcia.e

That’s strange.

Do you see any difference in the data being returned by your query?

Can you reproduce this on the latest 1.25 version? And finally, could you provide a notebook with end-to-end code? That would help a lot in understanding this issue.
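To make that comparison easier, one option is to also request the distance via `with_additional(["id", "distance"])` and log the hits before and after the reboot; a minimal sketch with a hypothetical helper:

```python
def summarize_hits(response, class_name="Context"):
    """Flatten a GraphQL response into (id, distance) pairs.

    Hypothetical helper: assumes the query requested
    .with_additional(["id", "distance"]) so each hit carries both fields.
    """
    hits = ((response or {}).get("data", {})
                            .get("Get", {})
                            .get(class_name, [])) or []
    return [(h["_additional"]["id"], h["_additional"].get("distance"))
            for h in hits]
```

Running this on the response from both runs should show whether the same objects come back with different distances, or disappear entirely.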

Thanks!

Hello, I didn’t see any difference in the data.
I tried the latest version of Weaviate, but the problem still happens.

The code is simple:

I pass the prompt to Weaviate to search for the contexts that best fit it, using the code in the first post.
Then I send the prompt together with the contexts retrieved from Weaviate to OpenAI to generate the answer.

Our data schema is the following:

document_schema = {
  "class": "Context",
  "description": "test",
  "vectorIndexType": "hnsw",
  "vectorIndexConfig": {
    "distance": "dot",
  },
  "vectorizer": "text2vec-transformers",
  "moduleConfig": {
    "text2vec-transformers": {
      "vectorizeClassName": True
    },
    "qna-openai": {
      "model": "text-davinci-002",
      "maxTokens": 350, 
      "temperature": 0.5,  
    }
  },
  "properties": [
    {
      "name": "content",
      "description": "context text",
      "dataType": [
        "text"
      ],
      "tokenization": "word",
      "moduleConfig": {
        "text2vec-transformers": {
          "skip": True,
        }
      },
      "indexInverted": True
    },
    {
      "name": "titleTotal",
      "description": "titleTotal",
      "dataType": [
        "text"
      ],
      "tokenization": "word",
      "moduleConfig": {
        "text2vec-transformers": {
          "skip": True,
        }
      },
      "indexInverted": True
    },
    {
      "name": "title",
      "description": "title",
      "dataType": [
        "text"
      ],
      "tokenization": "word",
      "moduleConfig": {
        "text2vec-transformers": {
          "skip": True,
        }
      },
      "indexInverted": True
    },
    {
      "name": "document",
      "description": "document name",
      "dataType": [
        "text"
      ],
      "indexInverted": True
    },
    {
      "name": "page",
      "description": "page",
      "dataType": [
        "text"
      ],
      "indexInverted": True
    },
    {
      "name": "technology",
      "description": "technology id",
      "dataType": [
        "text"
      ],
      "indexInverted": True
    }
  ],
  "invertedIndexConfig": {
    "indexTimestamps": False,
    "indexNullState": False,
    "indexPropertyLength": False
  },
}
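For reference, with this config text2vec-transformers only embeds the properties that don’t set `skip: True` (plus the class name, since `vectorizeClassName` is True). A quick sketch to list them from the schema dict:

```python
def vectorized_properties(schema):
    """Return the property names that text2vec-transformers will embed,
    i.e. those whose moduleConfig does not set skip: True."""
    names = []
    for prop in schema.get("properties", []):
        cfg = prop.get("moduleConfig", {}).get("text2vec-transformers", {})
        if not cfg.get("skip", False):
            names.append(prop["name"])
    return names
```

With the schema above this returns `["document", "page", "technology"]`, so content, titleTotal and title never influence the vectors.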

Thanks!