How does Elysia handle collections that have already been pre-chunked?

Orly_Mugwaneza · October 9, 2025, 6:11am

I’m using Elysia with a Weaviate collection where documents are already pre-chunked before ingestion.

Since Elysia typically uses a chunk-on-demand (post-chunking) approach—storing full documents and creating chunks at query time—I’d like to know:

- How does Elysia handle collections that are already pre-chunked? Does it use the existing chunks, ignore them, or re-chunk/merge them?
- Is there a recommended setup for working with pre-chunked data?
- Can I avoid the latency from post-chunking by leveraging my pre-chunked data or adjusting Elysia’s configuration?

Any guidance on the best way to handle this would be appreciated.

DudaNogueira · October 9, 2025, 10:20pm

Hi @Orly_Mugwaneza!

Welcome to our community!

Elysia does not currently detect or use pre-chunked data. It implements a chunk-on-demand approach where it creates its own chunked collection at query time, regardless of whether your documents are already chunked.

When you query a collection, Elysia evaluates whether chunking is needed based on the content field size. If the mean token count exceeds 400 tokens and the display type is “document”, it triggers chunking.
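As a rough illustration of that decision rule, here is a minimal sketch. It is not Elysia's actual code: the function names are hypothetical, and whitespace-separated words stand in for tokens, which will differ from whatever tokenizer Elysia really uses.

```python
from statistics import mean

TOKEN_THRESHOLD = 400  # mean token count quoted above


def approx_token_count(text: str) -> int:
    # Assumption: whitespace-separated words approximate tokens.
    return len(text.split())


def needs_chunking(contents: list[str], display_type: str) -> bool:
    """Return True when the rule described above would trigger chunking."""
    if not contents:
        return False
    mean_tokens = mean(approx_token_count(c) for c in contents)
    return display_type == "document" and mean_tokens > TOKEN_THRESHOLD
```

If this reading of the rule is right, a collection whose pre-chunked objects average well under 400 tokens would not trigger chunking at all, but that is worth verifying against your own data.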

The chunking process:

- Creates a separate collection named ELYSIA_CHUNKED_{collection_name.lower()}__
- Chunks full documents using sentence-based chunking (5 sentences per chunk by default)
- Stores chunks with references back to the original full documents
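The steps above can be sketched roughly as follows. This is only an illustration under assumptions, not Elysia's implementation: the regex sentence splitter, the function names, and the dictionary keys (content, chunk_index, source_document_id) are all hypothetical. Only the collection naming pattern and the default of 5 sentences per chunk come from the description above.

```python
import re


def chunked_collection_name(collection_name: str) -> str:
    # Naming pattern as described above.
    return f"ELYSIA_CHUNKED_{collection_name.lower()}__"


def sentence_chunks(doc_id: str, text: str, sentences_per_chunk: int = 5) -> list[dict]:
    """Split text into groups of sentences, each pointing back to its source document."""
    # Assumption: a naive split on ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks = []
    for start in range(0, len(sentences), sentences_per_chunk):
        chunks.append({
            "content": " ".join(sentences[start:start + sentences_per_chunk]),
            "chunk_index": start // sentences_per_chunk,
            "source_document_id": doc_id,  # hypothetical back-reference field
        })
    return chunks
```

Each chunk keeps an identifier for its parent document, which is the general idea behind storing chunks with references back to the full documents.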

Also keep in mind that Elysia is in its early stages, so this behavior may change later or gain new features.

Feel free to add your feature request at Elysia’s repo: GitHub - weaviate/elysia: Python package and backend for the Elysia platform app.

Let me know if this helps!

Thanks!
