Environment
- Weaviate: Local installation (please specify version if known)
- Client: Python client (please specify version)
- Vectorizer: text2vec-ollama
- Generative: ollama (llama3.1:8b)
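Since the exact versions are not filled in above, here is a minimal sketch for collecting them. It assumes the v4 Python client, where `client.get_meta()` returns a dict with a `version` key; the helper name `collect_versions` is hypothetical.

```python
from importlib import metadata


def collect_versions(client):
    """Return the client and server versions for a bug report.

    Assumption: `client` is a connected v4 weaviate client whose
    get_meta() returns a dict containing a "version" key.
    """
    try:
        # Version of the installed Python package, if it is installed.
        client_version = metadata.version("weaviate-client")
    except metadata.PackageNotFoundError:
        client_version = "unknown"
    server_version = client.get_meta().get("version", "unknown")
    return {"client": client_version, "server": server_version}
```

With a live connection this would be called as `collect_versions(weaviate.connect_to_local())`.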
Issue Description
When attempting to use the generate.near_text() method on a Chunk collection, I receive the following error:
weaviate.exceptions.WeaviateQueryError: Query call with protocol GRPC search failed with message panic occurred: interface conversion: interface {} is []interface {}, not []string.
However, the regular query.near_text() method works perfectly fine on the same collection with the same query parameters.
Schema Configuration
Document Collection
import weaviate.classes.config as wc

client.collections.create(
    name="Document",
    description="A document with metadata and elements.",
    properties=[
        wc.Property(name="file_name", data_type=wc.DataType.TEXT),
        wc.Property(name="file_extension", data_type=wc.DataType.TEXT),
        wc.Property(
            name="elements",
            data_type=wc.DataType.OBJECT_ARRAY,
            nested_properties=[
                wc.Property(name="type", data_type=wc.DataType.TEXT),
                wc.Property(name="text", data_type=wc.DataType.TEXT),
                wc.Property(name="page_number", data_type=wc.DataType.INT),
                wc.Property(
                    name="structured_data",
                    data_type=wc.DataType.OBJECT,
                    nested_properties=[
                        wc.Property(name="format", data_type=wc.DataType.TEXT),
                        wc.Property(name="headers", data_type=wc.DataType.TEXT_ARRAY),
                        wc.Property(
                            name="rows",
                            data_type=wc.DataType.OBJECT_ARRAY,
                            nested_properties=[
                                wc.Property(name="zone_tarifare", data_type=wc.DataType.TEXT),
                                wc.Property(name="tarif", data_type=wc.DataType.TEXT),
                                wc.Property(name="zile_ore", data_type=wc.DataType.TEXT),
                            ]
                        ),
                        # ... additional nested properties
                    ]
                ),
            ]
        )
    ],
    vectorizer_config=wc.Configure.Vectorizer.text2vec_ollama(
        api_endpoint='http://host.docker.internal:11434',
        model='nomic-embed-text',
    ),
    generative_config=wc.Configure.Generative.ollama(
        api_endpoint='http://host.docker.internal:11434',
        model='llama3.1:8b',
    ),
)
Chunk Collection
client.collections.create(
    name="Chunk",
    description="A chunk of a document, with metadata and references to its parent document.",
    properties=[
        wc.Property(name="chunk_id", data_type=wc.DataType.TEXT),
        wc.Property(name="chunk_type", data_type=wc.DataType.TEXT),
        wc.Property(name="embedding_text", data_type=wc.DataType.TEXT),
        wc.Property(name="embedding_keywords", data_type=wc.DataType.TEXT_ARRAY),
        wc.Property(name="topic_keywords", data_type=wc.DataType.TEXT_ARRAY),
        wc.Property(
            name="entity_mentions",
            data_type=wc.DataType.OBJECT_ARRAY,
            nested_properties=[
                wc.Property(name="type", data_type=wc.DataType.TEXT),
                wc.Property(name="text", data_type=wc.DataType.TEXT),
                wc.Property(name="category", data_type=wc.DataType.TEXT),
            ],
        ),
        wc.Property(name="cross_references", data_type=wc.DataType.TEXT_ARRAY),
        wc.Property(
            name="relations",
            data_type=wc.DataType.OBJECT_ARRAY,
            nested_properties=[
                wc.Property(name="subject", data_type=wc.DataType.TEXT),
                wc.Property(name="predicate", data_type=wc.DataType.TEXT),
                wc.Property(name="object", data_type=wc.DataType.TEXT),
                wc.Property(name="relation_type", data_type=wc.DataType.TEXT),
            ],
        ),
    ],
    references=[
        wc.ReferenceProperty(name="fromDocument", target_collection="Document"),
    ],
    vectorizer_config=wc.Configure.Vectorizer.text2vec_ollama(
        api_endpoint='http://host.docker.internal:11434',
        model='nomic-embed-text',
    ),
    generative_config=wc.Configure.Generative.ollama(
        api_endpoint='http://host.docker.internal:11434',
        model='llama3.1:8b',
    ),
)
Reproduction Code
import weaviate
from weaviate.classes.query import QueryReference

client = weaviate.connect_to_local()
try:
    chunk_collection = client.collections.get("Chunk")

    # Example task
    task = "How much for parking ticket?"
    keywords = "parking ticket"  # Extracted via KeywordExtractor

    # This works fine - returns results
    response = chunk_collection.query.near_text(
        query=keywords,
        limit=2,
        return_references=QueryReference(
            link_on="fromDocument",
            return_properties=["file_name", "elements"]
        )
    )
    print("Query succeeded, got objects")

    # This fails with the interface conversion error
    response = chunk_collection.generate.near_text(
        query=keywords,
        limit=2,
        grouped_task=task,
    )
    print(response.generated)
finally:
    client.close()
Expected Behavior
The generate.near_text() method should successfully generate a response based on the retrieved chunks, similar to how the regular query.near_text() works.
Actual Behavior
The generate query fails with:
weaviate.exceptions.WeaviateQueryError: Query call with protocol GRPC search failed with message panic occurred: interface conversion: interface {} is []interface {}, not []string.
Additional Context
- The error suggests an internal type conversion issue where Weaviate expects a string array but receives a generic interface array
- This might be related to the complex nested schema structure with OBJECT_ARRAY and TEXT_ARRAY properties
- The issue only occurs with the generate operation, not with regular queries
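One way to test the hypothesis above would be to run the grouped generate call once per property and see which ones fail. The sketch below assumes the v4 client's `grouped_properties` argument, which limits the properties passed to the generative module. The helper name `find_failing_properties` is hypothetical, and whether this actually isolates the panic is an assumption that would need checking against a running server.

```python
def find_failing_properties(collection, query, task, property_names):
    """Run the grouped generate call once per property and record failures.

    Assumption: `collection` is the object returned by
    client.collections.get("Chunk") and its generate.near_text()
    accepts `grouped_properties`.
    """
    results = {}
    for name in property_names:
        try:
            collection.generate.near_text(
                query=query,
                limit=2,
                grouped_task=task,
                grouped_properties=[name],  # only this property is used for the prompt
            )
            results[name] = "ok"
        except Exception as exc:  # expected to be WeaviateQueryError in practice
            results[name] = f"failed: {exc}"
    return results
```

For the Chunk collection, `property_names` could be something like `["embedding_text", "embedding_keywords", "topic_keywords", "cross_references"]`. If only the TEXT_ARRAY entries fail, that would point at the array handling rather than at the nested objects.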
Questions
- Is this a known issue with complex nested schemas and the generate functionality?
- Are there any workarounds to use generate with collections that have OBJECT_ARRAY properties?
- Could this be related to how Ollama integration handles the nested data structures?
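As a possible stopgap, since query.near_text() works, the generation step could be done on the client side by sending the retrieved text straight to Ollama. This is only a sketch: the helper names are hypothetical, the prompt layout is my own choice and not what Weaviate builds internally, and the `http://localhost:11434` endpoint assumes Ollama is reachable from the machine running the script.

```python
import json
import urllib.request


def build_grouped_prompt(task, texts):
    """Combine the task and the retrieved chunk texts into a single prompt."""
    context = "\n\n".join(f"[{i}] {text}" for i, text in enumerate(texts, start=1))
    return f"{task}\n\nContext:\n{context}"


def generate_with_ollama(prompt, endpoint="http://localhost:11434", model="llama3.1:8b"):
    """Send one non-streaming request to Ollama's /api/generate endpoint."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    request = urllib.request.Request(
        f"{endpoint}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


# Usage with the collection from the reproduction code (not executed here):
# response = chunk_collection.query.near_text(
#     query=keywords, limit=2, return_properties=["embedding_text"]
# )
# texts = [obj.properties["embedding_text"] for obj in response.objects]
# print(generate_with_ollama(build_grouped_prompt(task, texts)))
```

This bypasses the generative module entirely, so it does not explain the panic; it only keeps the retrieval and generation steps usable in the meantime.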
Any help or insights would be greatly appreciated!