KeyError: 'moduleConfig' Error Message

I am getting a KeyError: 'moduleConfig' error message when using Weaviate with LangChain. Weaviate is running in a Docker container.

This is the code generating the error.

client = weaviate.connect_to_local()
embeddings = OpenAIEmbeddings()
db2 = WeaviateVectorStore.from_documents(docs_3, embeddings, client=client)

Here is my docker-compose.yml

This post on the LangChain GitHub about the same issue recommended removing DEFAULT_VECTORIZER_MODULE: 'multi2vec-clip' from the docker compose file. I did that and restarted the container, but I am still getting the error.

Any suggestions?

hi @kihumban !!

Welcome to our community :hugs:

We have a nice recipe for langchain here, that I crafted myself, check it out:

If you could share the full running example here, as well as the docker compose as text, I can try reproducing that.

The recipe I linked above has all you need to use all LangChain features with Weaviate.

Let me know if I can help you on that :slight_smile:

Thanks!

@DudaNogueira thank you for sharing the recipe. I will check it out. This forum does not allow me to attach any file other than images. It flags text for having more than 2 hyperlinks. What is the best way of sharing the code and the docker file?

You can paste the docker and code here too :slight_smile:

I am basically working on this Langchain Weaviate library example. Here is the code. The last expression is the one generating the error.
############
from langchain_community.document_loaders import TextLoader
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import CharacterTextSplitter
import weaviate
from langchain_weaviate.vectorstores import WeaviateVectorStore
from langchain.chains import RetrievalQAWithSourcesChain
from langchain_openai import OpenAI

loader = TextLoader("data/state_of_the_union.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs_3 = text_splitter.split_documents(documents)
embeddings = OpenAIEmbeddings()

client = weaviate.connect_to_local()
db2 = WeaviateVectorStore.from_documents(docs_3, embeddings, client=client)
##############

######################
Initially I had DEFAULT_VECTORIZER_MODULE set to 'multi2vec-clip'. However, I removed it based on the advice in this thread. Apparently the issue is caused by a conflict between declaring embeddings = OpenAIEmbeddings() and having DEFAULT_VECTORIZER_MODULE set to 'multi2vec-clip'. However, updating my docker compose file to set DEFAULT_VECTORIZER_MODULE to 'none' did not resolve the issue.

Here is the LangChain thread regarding the error

Can you remove the DEFAULT_VECTORIZER_MODULE altogether?

Also, can you copy the docker compose here as text? Otherwise it's really hard to copy and paste from the image, as it gets all messed up :grimacing:

Have you looked into the recipe I pasted?

It should give you all the steps to start using Langchain with Weaviate the proper way.

Thanks!

I removed DEFAULT_VECTORIZER_MODULE and I am still getting the same error. However, I have worked through the code in the recipe you shared and that is working OK. The only difference is that the recipe uses Embedded Weaviate rather than a stand-alone server, as in my code. I am not sure if that has an impact. I will modify the recipe to try with client = weaviate.connect_to_local().

The forum platform does not allow me to copy the docker compose file. Every time I do that it flags the post with the following message.

Hi!

As long as you have a working weaviate client (connect to embedded/local/cloud) it should work accordingly.

You can paste the docker compose here, like so:

---
services:
  weaviate:
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: cr.weaviate.io/semitechnologies/weaviate:1.27.0
    ports:
    - 8080:8080
    - 50051:50051
    volumes:
    - weaviate_data:/var/lib/weaviate
    restart: on-failure:0
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      DEFAULT_VECTORIZER_MODULE: 'none'
      ENABLE_API_BASED_MODULES: 'true'
      CLUSTER_HOSTNAME: 'node1'
volumes:
  weaviate_data:
...

I have tried pasting the docker compose here but it does not work. I keep getting the error message I indicated above.

Oh, I mean, the content of the docker compose file.

Feel free to reach out to me on our slack. I would love to help you there too!

I ran the langchain-simple-pdf-multitenant.ipynb recipe using a local stand-alone server instead of embedded and ran into the same error. The only section of the code that changed was the initialization of the client:

import os
import weaviate

client = weaviate.connect_to_local(
    headers={
        # Replace with your OpenAI key
        "X-OpenAi-Api-Key": os.environ.get("OPENAI_API_KEY"),
    }
)

Here is my docker compose:

services:
  weaviate:
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: cr.weaviate.io/semitechnologies/weaviate:1.26.4
    ports:
    - 8080:8080
    - 50051:50051
    volumes:
    - weaviate_data:/var/lib/weaviate
    restart: on-failure:0
    environment:
      CLIP_INFERENCE_API: 'http://multi2vec-clip:8080'
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      ENABLE_API_BASED_MODULES: 'true'
      ENABLE_MODULES: 'multi2vec-clip,text2vec-openai,generative-openai'
      CLUSTER_HOSTNAME: 'node1'
  multi2vec-clip:
    image: cr.weaviate.io/semitechnologies/multi2vec-clip:sentence-transformers-clip-ViT-B-32-multilingual-v1
    environment:
      ENABLE_CUDA: '0'
volumes:
  weaviate_data:
...

Here is the expression and the associated error dump:

db = WeaviateVectorStore.from_documents(docs, embeddings, client=client, index_name="WikipediaLangChainMT", tenant="brazil")

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[11], line 1
----> 1 db = WeaviateVectorStore.from_documents(docs, embeddings, client=client, index_name="WikipediaLangChainMT", tenant="brazil")

File /Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/langchain_core/vectorstores/base.py:835, in VectorStore.from_documents(cls, documents, embedding, **kwargs)
    833 texts = [d.page_content for d in documents]
    834 metadatas = [d.metadata for d in documents]
--> 835 return cls.from_texts(texts, embedding, metadatas=metadatas, **kwargs)

File /Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/langchain_weaviate/vectorstores.py:487, in WeaviateVectorStore.from_texts(cls, texts, embedding, metadatas, tenant, client, index_name, text_key, relevance_score_fn, **kwargs)
    475 attributes = list(metadatas[0].keys()) if metadatas else None
    477 weaviate_vector_store = cls(
    478     client,
    479     index_name,
   (...)
    484     use_multi_tenancy=tenant is not None,
    485 )
--> 487 weaviate_vector_store.add_texts(texts, metadatas, tenant=tenant, **kwargs)
    489 return weaviate_vector_store

File /Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/langchain_weaviate/vectorstores.py:167, in WeaviateVectorStore.add_texts(self, texts, metadatas, tenant, **kwargs)
    164 if self._embedding:
    165     embeddings = self._embedding.embed_documents(list(texts))
--> 167 with self._client.batch.dynamic() as batch:
    168     for i, text in enumerate(texts):
    169         data_properties = {self._text_key: text}

File /Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/weaviate/collections/batch/client.py:179, in _BatchClientWrapper.dynamic(self, consistency_level)
    177 self._batch_mode: _BatchMode = _DynamicBatching()
    178 self._consistency_level = consistency_level
--> 179 return self.__create_batch_and_reset()

File /Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/weaviate/collections/batch/client.py:135, in _BatchClientWrapper.__create_batch_and_reset(self)
    133 def __create_batch_and_reset(self) -> _ContextManagerWrapper[_BatchClient]:
    134     if self._vectorizer_batching is None or not self._vectorizer_batching:
--> 135         configs = self.__config.list_all(simple=True)
    137         vectorizer_batching = False
    138         for config in configs.values():

File /Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/weaviate/collections/collections/sync.py:286, in _Collections.list_all(self, simple)
    265 def list_all(
    266     self, simple: bool = True
    267 ) -> Union[Dict[str, CollectionConfig], Dict[str, CollectionConfigSimple]]:
    268     """List the configurations of the all the collections currently in the Weaviate instance.
    269 
    270     Arguments:
   (...)
    284             If Weaviate reports a non-OK status.
    285     """
--> 286     return self.__loop.run_until_complete(self.__collections.list_all, simple)

File /Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/weaviate/event_loop.py:40, in _EventLoop.run_until_complete(self, f, *args, **kwargs)
     38     raise WeaviateClosedClientError()
     39 fut = asyncio.run_coroutine_threadsafe(f(*args, **kwargs), self.loop)
---> 40 return fut.result()

File /Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/concurrent/futures/_base.py:456, in Future.result(self, timeout)
    454     raise CancelledError()
    455 elif self._state == FINISHED:
--> 456     return self.__get_result()
    457 else:
    458     raise TimeoutError()

File /Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/concurrent/futures/_base.py:401, in Future.__get_result(self)
    399 if self._exception:
    400     try:
--> 401         raise self._exception
    402     finally:
    403         # Break a reference cycle with the exception in self._exception
    404         self = None

File /Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/weaviate/collections/collections/async_.py:304, in _CollectionsAsync.list_all(self, simple)
    285 """List the configurations of the all the collections currently in the Weaviate instance.
    286 
    287 Arguments:
   (...)
    301         If Weaviate reports a non-OK status.
    302 """
    303 _validate_input([_ValidateArgument(expected=[bool], name="simple", value=simple)])
--> 304 return await self._get_all(simple=simple)

File /Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/weaviate/collections/collections/base.py:76, in _CollectionsBase._get_all(self, simple)
     74 assert res is not None
     75 if simple:
---> 76     return _collection_configs_simple_from_json(res)
     77 return _collection_configs_from_json(res)

File /Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/weaviate/collections/classes/config_methods.py:327, in _collection_configs_simple_from_json(schema)
    323 def _collection_configs_simple_from_json(
    324     schema: Dict[str, Any]
    325 ) -> Dict[str, _CollectionConfigSimple]:
    326     return {
--> 327         schema["class"]: _collection_config_simple_from_json(schema) for schema in schema["classes"]
    328     }

File /Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/weaviate/collections/classes/config_methods.py:250, in _collection_config_simple_from_json(schema)
    242 def _collection_config_simple_from_json(schema: Dict[str, Any]) -> _CollectionConfigSimple:
    243     return _CollectionConfigSimple(
    244         name=schema["class"],
    245         description=schema.get("description"),
    246         generative_config=__get_generative_config(schema),
    247         properties=_properties_from_config(schema) if schema.get("properties") is not None else [],
    248         references=_references_from_config(schema) if schema.get("properties") is not None else [],
    249         reranker_config=__get_rerank_config(schema),
--> 250         vectorizer_config=__get_vectorizer_config(schema),
    251         vectorizer=__get_vectorizer(schema),
    252         vector_config=__get_vector_config(schema, simple=True),
    253     )

File /Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/weaviate/collections/classes/config_methods.py:88, in __get_vectorizer_config(schema)
     86 def __get_vectorizer_config(schema: Dict[str, Any]) -> Optional[_VectorizerConfig]:
     87     if __is_vectorizer_present(schema) is not None and schema.get("vectorizer", "none") != "none":
---> 88         vec_config: Dict[str, Any] = schema["moduleConfig"].pop(schema["vectorizer"])
     89         try:
     90             vectorizer = Vectorizers(schema["vectorizer"])

KeyError: 'moduleConfig'

Is this some kind of bug? It seems to be looking for a property called "moduleConfig" in the collection object. But when I look at the collection object configuration (below), there is no "moduleConfig" property.

<weaviate.Collection config={
  "name": "WikipediaLangChainMT",
  "description": null,
  "generative_config": {
    "generative": "generative-openai",
    "model": {}
  },
  "inverted_index_config": {
    "bm25": {
      "b": 0.75,
      "k1": 1.2
    },
    "cleanup_interval_seconds": 60,
    "index_null_state": false,
    "index_property_length": false,
    "index_timestamps": false,
    "stopwords": {
      "preset": "en",
      "additions": null,
      "removals": null
    }
  },
  "multi_tenancy_config": {
    "enabled": true,
    "auto_tenant_creation": true,
    "auto_tenant_activation": true
  },
  "properties": [],
  "references": [],
  "replication_config": {
    "factor": 1,
    "async_enabled": false
  },
  "reranker_config": null,
  "sharding_config": null,
  "vector_index_config": {
    "quantizer": null,
    "cleanup_interval_seconds": 300,
    "distance_metric": "cosine",
    "dynamic_ef_min": 100,
    "dynamic_ef_max": 500,
    "dynamic_ef_factor": 8,
    "ef": -1,
    "ef_construction": 128,
    "flat_search_cutoff": 40000,
    "max_connections": 32,
    "skip": false,
    "vector_cache_max_objects": 1000000000000
  },
  "vector_index_type": "hnsw",
  "vectorizer_config": {
    "vectorizer": "text2vec-openai",
    "model": {
      "baseURL": "https://api.openai.com",
      "model": "ada"
    },
    "vectorize_collection_name": true
  },
  "vectorizer": "text2vec-openai",
  "vector_config": null
}>
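
One way to narrow this down is to look at the raw schema rather than the parsed collection object. The traceback fails inside list_all, which parses every collection on the instance, so the collection that lacks moduleConfig may not be the one printed above. Below is a minimal sketch using only the standard library. It assumes the server is reachable at http://localhost:8080 (the port mapped in the docker compose); fetch_schema and classes_missing_module_config are hypothetical helper names, and the check only approximates the condition visible at line 87 of config_methods.py in the traceback.

```python
import json
from typing import Any, Dict, List
from urllib.request import urlopen


def fetch_schema(base_url: str = "http://localhost:8080") -> Dict[str, Any]:
    """Fetch the raw schema JSON from Weaviate's REST endpoint /v1/schema.

    The base URL is an assumption based on the port mapping in the compose file.
    """
    with urlopen(f"{base_url}/v1/schema") as response:
        return json.load(response)


def classes_missing_module_config(schema: Dict[str, Any]) -> List[str]:
    """Return names of classes that declare a vectorizer but have no moduleConfig.

    These are the classes for which schema["moduleConfig"] would raise
    KeyError in the code path shown in the traceback.
    """
    return [
        cls["class"]
        for cls in schema.get("classes", [])
        if cls.get("vectorizer", "none") != "none" and "moduleConfig" not in cls
    ]


# Hand-written sample for illustration only; not taken from a real server.
sample_schema = {
    "classes": [
        {
            "class": "WithConfig",
            "vectorizer": "text2vec-openai",
            "moduleConfig": {"text2vec-openai": {}},
        },
        {"class": "NoVectorizer", "vectorizer": "none"},
        {"class": "Suspect", "vectorizer": "multi2vec-clip"},
    ]
}
print(classes_missing_module_config(sample_schema))  # ['Suspect']
```

Against a live server, classes_missing_module_config(fetch_schema()) would list the collections worth inspecting.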

Hi!

This example in our recipes uses text2vec-openai and generative-openai

And those modules are enabled:

ENABLE_MODULES: 'multi2vec-clip,text2vec-openai,generative-openai'

So it should work. :thinking:

Have you restarted your docker so the changes take effect?

You can call:

docker compose up -d

I have all those modules enabled in my docker compose as previously indicated. I have restarted the docker several times, and the outcome has been consistently the same - KeyError: 'moduleConfig'.

Below is my docker compose as it currently is. I just ran the program and got the same error.

services:
  weaviate:
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: cr.weaviate.io/semitechnologies/weaviate:1.26.4
    ports:
    - 8080:8080
    - 50051:50051
    volumes:
    - weaviate_data:/var/lib/weaviate
    restart: on-failure:0
    environment:
      CLIP_INFERENCE_API: 'http://multi2vec-clip:8080'
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      ENABLE_MODULES: 'multi2vec-clip,text2vec-openai,generative-openai'
      ENABLE_API_BASED_MODULES: 'true'
      CLUSTER_HOSTNAME: 'node1'
  multi2vec-clip:
    image: cr.weaviate.io/semitechnologies/multi2vec-clip:sentence-transformers-clip-ViT-B-32-multilingual-v1
    environment:
      ENABLE_CUDA: '0'
volumes:
  weaviate_data:
...

Hi! This is strange.

I have used your docker compose, and was able to run this:


import weaviate.classes as wvc

# Assumes `client` is an already connected Weaviate client (see above).
client.collections.delete("Kihu")
collection = client.collections.create(
    "Kihu",
    vectorizer_config=wvc.config.Configure.Vectorizer.text2vec_openai(),
    generative_config=wvc.config.Configure.Generative.openai(),
)
collection.data.insert({"text": "This is a test"})
# Fetch the object back with a generated translation and its vector.
obj = collection.generate.fetch_objects(single_prompt="translate {text} to spanish", include_vector=True).objects[0]
print("Generation", obj.generated)
print("Vector", obj.vector)

Let me know if you can run this code.

Thanks!

Also, you can check what are the installed modules in a server with:

client.get_meta().get("modules")

The modules are there. See below.

{'generative-anthropic': {'documentationHref': 'https://docs.anthropic.com/en/api/getting-started',
  'name': 'Generative Search - Anthropic'},
 'generative-anyscale': {'documentationHref': 'https://docs.anyscale.com/endpoints/overview',
  'name': 'Generative Search - Anyscale'},
 'generative-aws': {'documentationHref': 'https://docs.aws.amazon.com/bedrock/latest/APIReference/welcome.html',
  'name': 'Generative Search - AWS'},
 'generative-cohere': {'documentationHref': 'https://docs.cohere.com/reference/chat',
  'name': 'Generative Search - Cohere'},
 'generative-databricks': {'documentationHref': 'https://docs.databricks.com/en/machine-learning/foundation-models/api-reference.html#completion-task',
  'name': 'Generative Search - Databricks'},
 'generative-friendliai': {'documentationHref': 'https://docs.friendli.ai/openapi/create-chat-completions',
  'name': 'Generative Search - FriendliAI'},
 'generative-mistral': {'documentationHref': 'https://docs.mistral.ai/api/',
  'name': 'Generative Search - Mistral'},
 'generative-octoai': {'documentationHref': 'https://octo.ai/docs/text-gen-solution/getting-started',
  'name': 'Generative Search - OctoAI'},
 'generative-openai': {'documentationHref': 'https://platform.openai.com/docs/api-reference/completions',
  'name': 'Generative Search - OpenAI'},
 'generative-palm': {'documentationHref': 'https://cloud.google.com/vertex-ai/docs/generative-ai/chat/test-chat-prompts',
  'name': 'Generative Search - Google PaLM'},
 'multi2vec-palm': {'documentationHref': 'https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-multimodal-embeddings',
  'name': 'Google PaLM Multimodal Module'},
 'reranker-cohere': {'documentationHref': 'https://txt.cohere.com/rerank/',
  'name': 'Reranker - Cohere'},
 'reranker-voyageai': {'documentationHref': 'https://docs.voyageai.com/reference/reranker-api',
  'name': 'Reranker - VoyageAI'},
 'text2vec-aws': {'documentationHref': 'https://docs.aws.amazon.com/bedrock/latest/userguide/titan-embedding-models.html',
  'name': 'AWS Module'},
 'text2vec-cohere': {'documentationHref': 'https://docs.cohere.ai/embedding-wiki/',
  'name': 'Cohere Module'},
 'text2vec-databricks': {'documentationHref': 'https://docs.databricks.com/en/machine-learning/foundation-models/api-reference.html#embedding-task',
  'name': 'Databricks Foundation Models Module - Embeddings'},
 'text2vec-huggingface': {'documentationHref': 'https://huggingface.co/docs/api-inference/detailed_parameters#feature-extraction-task',
  'name': 'Hugging Face Module'},
 'text2vec-jinaai': {'documentationHref': 'https://jina.ai/embeddings/',
  'name': 'JinaAI Module'},
 'text2vec-octoai': {'documentationHref': 'https://octo.ai/docs/text-gen-solution/getting-started',
  'name': 'OctoAI Module'},
 'text2vec-openai': {'documentationHref': 'https://platform.openai.com/docs/guides/embeddings/what-are-embeddings',
  'name': 'OpenAI Module'},
 'text2vec-palm': {'documentationHref': 'https://cloud.google.com/vertex-ai/docs/generative-ai/embeddings/get-text-embeddings',
  'name': 'Google PaLM Module'},
 'text2vec-voyageai': {'documentationHref': 'https://docs.voyageai.com/docs/embeddings',
  'name': 'VoyageAI Module'}}
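
To compare this output against the ENABLE_MODULES value in the docker compose, a small helper can list any module that was requested but is not reported by the server. This is a minimal sketch: missing_modules is a hypothetical helper name, and the reported dict is a shortened, hand-copied subset of the output above.

```python
from typing import Any, Dict, List


def missing_modules(enable_modules: str, reported: Dict[str, Any]) -> List[str]:
    """Return modules named in ENABLE_MODULES that the server does not report."""
    requested = [name.strip() for name in enable_modules.split(",") if name.strip()]
    return [name for name in requested if name not in reported]


# Value of ENABLE_MODULES from the docker compose posted in this thread.
enable_modules = "multi2vec-clip,text2vec-openai,generative-openai"

# Shortened subset of the client.get_meta().get("modules") output shown above.
reported = {
    "generative-openai": {"name": "Generative Search - OpenAI"},
    "text2vec-openai": {"name": "OpenAI Module"},
    "text2vec-cohere": {"name": "Cohere Module"},
}

print(missing_modules(enable_modules, reported))  # ['multi2vec-clip']
```

The result is the same with the full output above, since no multi2vec-clip key appears in it. Whether that is related to the KeyError is not established here.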

Were you able to run that code?