AWS Bedrock invalid model identifier issue

Description

I have the following code:

import weaviate
from weaviate.classes.config import Configure

from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_aws import BedrockEmbeddings
from langchain.document_loaders import PyPDFLoader
from langchain_weaviate.vectorstores import WeaviateVectorStore

headers = {
    "X-AWS-Access-Key": "my_access_key",
    "X-AWS-Secret-Key": "my_secret_key",
}

client = weaviate.connect_to_local(headers=headers)

try:

    print('Client is ready? ', client.is_ready())

    client.collections.delete("MPVirtual")

    client.collections.create(
        "MPVirtual",
        vectorizer_config=Configure.NamedVectors.text2vec_aws(
            name="MPVirtual",
            region="sa-east-1",
            service="bedrock",
            model="amazon.titan-embed-text-v2:0"
        ),
        generative_config=Configure.Generative.aws(
            region="sa-east-1",
            service="bedrock",
            model="amazon.titan-text-express-v1"
        )
    )

    text_splitter = RecursiveCharacterTextSplitter(chunk_size=400, chunk_overlap=50)
    embeddings = BedrockEmbeddings()

    loader = PyPDFLoader("my_pdf_file.pdf", extract_images=False)
    docs = loader.load_and_split(text_splitter)
    print(f"GOT {len(docs)} docs for PDF")

    db = WeaviateVectorStore.from_documents(docs, embeddings, client=client, index_name="MPVirtual")

    collection = client.collections.get("MPVirtual")

    response = collection.aggregate.over_all(total_count=True)
    print(response)

    response = collection.aggregate.over_all(group_by="source")
    for group in response.groups:
        print(group.grouped_by.value, group.total_count)

    object = collection.query.fetch_objects(limit=1).objects[0]
    print(object.properties.keys())

finally:
    client.close()

When I run the script, the following error occurs:


Client is ready?  True
GOT 66 docs for PDF
ERROR:root:Error raised by inference endpoint: An error occurred (ValidationException) when calling the InvokeModel operation: The provided model identifier is invalid.
Traceback (most recent call last):
  File "/my_folder/IA/mpvirtual/teste.py", line 44, in <module>
    db = WeaviateVectorStore.from_documents(docs, embeddings, client=client, index_name="MPVirtual")
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/my_folder/IA/mpvirtual/.venv/lib/python3.11/site-packages/langchain_core/vectorstores/base.py", line 852, in from_documents
    return cls.from_texts(texts, embedding, metadatas=metadatas, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/my_folder/IA/mpvirtual/.venv/lib/python3.11/site-packages/langchain_weaviate/vectorstores.py", line 487, in from_texts
    weaviate_vector_store.add_texts(texts, metadatas, tenant=tenant, **kwargs)
  File "/my_folder/IA/mpvirtual/.venv/lib/python3.11/site-packages/langchain_weaviate/vectorstores.py", line 165, in add_texts
    embeddings = self._embedding.embed_documents(list(texts))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/my_folder/IA/mpvirtual/.venv/lib/python3.11/site-packages/langchain_aws/embeddings/bedrock.py", line 178, in embed_documents
    response = self._embedding_func(text)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/my_folder/IA/mpvirtual/.venv/lib/python3.11/site-packages/langchain_aws/embeddings/bedrock.py", line 159, in _embedding_func
    raise e
  File "/my_folder/IA/mpvirtual/.venv/lib/python3.11/site-packages/langchain_aws/embeddings/bedrock.py", line 144, in _embedding_func
    response = self.client.invoke_model(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/my_folder/IA/mpvirtual/.venv/lib/python3.11/site-packages/botocore/client.py", line 569, in _api_call
    return self._make_api_call(operation_name, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/my_folder/IA/mpvirtual/.venv/lib/python3.11/site-packages/botocore/client.py", line 1023, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.ValidationException: An error occurred (ValidationException) when calling the InvokeModel operation: The provided model identifier is invalid.

I copied the names of the models from this link:

The user has the necessary permissions.

I really believe I'm doing something wrong. Can you help me solve this problem? I'm just starting out in this world of vector databases, RAG, and GenAI.

Server Setup Information

  • Weaviate Server Version: 1.27.0
  • Deployment Method: Local
  • Multi Node? No
  • Client Language and Version: Python 3.11
  • Multitenancy? No

Hi @taigofranca !!

Welcome to our community :hugs:

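I believe the issue is that, by default, BedrockEmbeddings from LangChain reads its credentials from ~/.aws/credentials, while Weaviate accepts them in the request headers. The traceback points the same way: the exception is raised inside langchain_aws (embed_documents), not by Weaviate's own vectorizer. Also note that BedrockEmbeddings() in your script is created without a model_id, so it does not use the model you configured on the collection.

So you need to make sure that you pass a working LangChain embeddings object, like so: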

from langchain_aws import BedrockEmbeddings
embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v2:0", credentials_profile_name="my_cool_aws_profile")
result = embeddings.embed_query("This is a test")
print(result)
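
As a rough sketch of that check, you could wrap the test call in a small helper so that a bad model ID or missing credentials fails with a readable message before any documents are loaded. The names check_embeddings and FakeEmbeddings are hypothetical (they are not part of LangChain or Weaviate); the fake class is only there so the sketch runs without AWS access. In real use you would pass your BedrockEmbeddings object instead.

```python
from typing import Protocol, Sequence


class SupportsEmbedQuery(Protocol):
    """Anything exposing embed_query(text), as LangChain embeddings do."""

    def embed_query(self, text: str) -> Sequence[float]: ...


def check_embeddings(embeddings: SupportsEmbedQuery, probe: str = "This is a test") -> int:
    """Embed one probe string and return the length of the resulting vector.

    Hypothetical helper: raises RuntimeError with a readable message if the
    call fails or the returned vector is empty.
    """
    try:
        vector = embeddings.embed_query(probe)
    except Exception as exc:  # e.g. botocore's ValidationException
        raise RuntimeError(f"Embeddings object is not usable: {exc}") from exc
    if not vector:
        raise RuntimeError("Embeddings object returned an empty vector")
    return len(vector)


class FakeEmbeddings:
    """Stand-in used only so this sketch runs without AWS access."""

    def embed_query(self, text: str) -> list[float]:
        return [0.0, 0.1, 0.2]


print(check_embeddings(FakeEmbeddings()))  # prints 3
```

If this raises, the problem is on the LangChain/AWS side and not in the Weaviate collection configuration.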

I was able to run a basic insert/generate with the code below, but it uses only Weaviate (no LangChain) and relies on the credentials you pass when instantiating the client:

from weaviate.classes.config import Configure

client.collections.delete("DemoCollection")

collection = client.collections.create(
    "DemoCollection",
    vectorizer_config=[
        Configure.NamedVectors.text2vec_aws(
            name="title_vector",
            region="sa-east-1",
            source_properties=["title"],
            service="bedrock",
            model="amazon.titan-embed-text-v2:0",
        )
    ],
    generative_config=Configure.Generative.aws(
        region="sa-east-1",
        service="bedrock",
        model="amazon.titan-text-express-v1"
    )    
)

collection.data.insert({"title": "This is a test"})
obj = collection.generate.fetch_objects(
    include_vector=True, 
    limit=1, 
    single_prompt="Translate {title} to Portuguese"
).objects[0]
print(obj.generated)
print(len(obj.vector.get("title_vector")))
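
Since this path depends on the headers given when the client is created, one way to keep Weaviate and LangChain on the same credentials is to build those headers from the same profile in ~/.aws/credentials. The sketch below is only an illustration: weaviate_aws_headers is a hypothetical helper, and it assumes the profile stores static keys (aws_access_key_id and aws_secret_access_key). Profiles based on SSO, role assumption, or session tokens are not handled.

```python
import configparser
from pathlib import Path


def weaviate_aws_headers(
    profile: str = "default",
    credentials_file: Path = Path.home() / ".aws" / "credentials",
) -> dict[str, str]:
    """Build the X-AWS-* headers from one profile of an AWS credentials file.

    Hypothetical helper; assumes the profile holds static access keys.
    """
    parser = configparser.ConfigParser()
    # read() returns the list of files it managed to parse
    if not parser.read(credentials_file):
        raise FileNotFoundError(f"Could not read {credentials_file}")
    if not parser.has_section(profile):
        raise KeyError(f"Profile {profile!r} not found in {credentials_file}")
    section = parser[profile]
    return {
        "X-AWS-Access-Key": section["aws_access_key_id"],
        "X-AWS-Secret-Key": section["aws_secret_access_key"],
    }


# Usage (commented out because it needs a real credentials file and a running Weaviate):
# import weaviate
# headers = weaviate_aws_headers("my_cool_aws_profile")
# client = weaviate.connect_to_local(headers=headers)
```

The same profile name can then be given to BedrockEmbeddings through credentials_profile_name, so both sides authenticate as the same user.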

Let me know if this helps!

Hi @DudaNogueira !!

Your suggestion worked. I just had to specify the region as well, since I switched from the Titan model to the Cohere model:

embeddings = BedrockEmbeddings(model_id="cohere.embed-multilingual-v3", credentials_profile_name="avenger_thor_profile", region_name="us-east-1")

Thanks for your help.
Hugs.
