Beever Atlas: an open-source LLM Wiki on Weaviate (3-tier schema for chat memory)

Hey everyone :waving_hand:

We just open-sourced Beever Atlas, an LLM Wiki for team chat. It distills Slack / Discord / Microsoft Teams / Mattermost / Telegram conversations into a structured, browsable wiki backed by a three-tier Weaviate schema, a shape that turns out to fit conversational corpora where chunk-first RAG falls over.

Apache 2.0, runnable in 3 commands (make demo).

Why I’m posting this here

Most “RAG on chat” projects throw messages into a single vector collection and call it done. We tried that first and it didn’t work — the same fact gets restated across dozens of threads, the meaningful structure lives outside the message body, and retrieval surfaces noise instead of signal.

What we ended up with on the Weaviate side is a layered design I haven’t seen documented elsewhere, and I’d love feedback from people who’ve thought about this more than I have.

The three-tier schema

| Tier | Class | What's in it | Why it's separate |
| --- | --- | --- | --- |
| Conversation | Channel | High-level summary per channel/server, with topics[] and participants[] rollups | Cheap to retrieve when the question is “what is this team working on?” |
| Topic | Topic | One vector per coherent thread of discussion; aggregates messages, decisions, mentions | Lets multi-message threads get retrieved as a unit instead of sliced into chunks |
| Atomic | Fact | One vector per extracted fact, with subject, predicate, object, confidence, source_message_ids[] | The grounded-citation layer; every Q&A answer cites Facts back to source messages |
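
To make the tiers concrete, here is a minimal sketch of the Fact class using the Weaviate v4 Python client; Channel and Topic follow the same pattern. The property names come straight from the table above (plus the valid_from / valid_until fields used for temporal filtering further down), while the vectorizer config is just one reasonable setup, not necessarily the exact definition shipped in the repo.

```python
# Minimal sketch of the Fact tier as a Weaviate v4 collection.
# Property names mirror the table above; the vectorizer choice is illustrative.
import weaviate
from weaviate.classes.config import Configure, DataType, Property

client = weaviate.connect_to_local()

client.collections.create(
    name="Fact",
    properties=[
        Property(name="subject", data_type=DataType.TEXT),
        Property(name="predicate", data_type=DataType.TEXT),
        Property(name="object", data_type=DataType.TEXT),
        Property(name="confidence", data_type=DataType.NUMBER),
        Property(name="source_message_ids", data_type=DataType.TEXT_ARRAY),
        Property(name="valid_from", data_type=DataType.DATE),
        Property(name="valid_until", data_type=DataType.DATE),
    ],
    vectorizer_config=Configure.Vectorizer.text2vec_openai(
        model="text-embedding-3-small"  # matches the embedding model mentioned below
    ),
)

client.close()
```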

The pipeline that populates these three tiers is a six-stage Google ADK agent chain: preprocessor → fact extractor → entity extractor → cross-batch validator → relationship extractor → persister. The cross-batch validator is the stage that keeps the Fact tier from getting noisy.
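
For anyone who hasn't used ADK: the chain is essentially a SequentialAgent wrapping six LlmAgent stages. Here's a stripped-down sketch of that shape; the stage instructions and the model string are placeholders, not the real prompts.

```python
# Stripped-down sketch of a six-stage extraction chain with Google ADK.
# Stage names follow the post; instructions and model are placeholders.
from google.adk.agents import LlmAgent, SequentialAgent

STAGES = [
    ("preprocessor", "Clean and batch raw chat messages; drop join/leave and bot noise."),
    ("fact_extractor", "Extract subject/predicate/object facts with confidence scores."),
    ("entity_extractor", "Extract people, projects, and systems mentioned in the batch."),
    ("cross_batch_validator", "Check new facts against previously persisted ones; merge duplicates, flag contradictions."),
    ("relationship_extractor", "Extract relationships between entities and topics."),
    ("persister", "Emit validated facts, topics, and channel summaries for persistence."),
]

pipeline = SequentialAgent(
    name="atlas_extraction_pipeline",
    sub_agents=[
        LlmAgent(
            name=stage_name,
            model="gemini-2.0-flash",  # placeholder model
            instruction=instruction,
            output_key=stage_name,     # each stage's output lands in session state
        )
        for stage_name, instruction in STAGES
    ],
)
```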

How retrieval picks a tier

A small LLM-classifier query router picks which tier(s) to hit per question (rough sketch after the examples):

  • “What is this team working on?” → Channel summaries
  • “What did Alice say about the Aurora migration?” → Topic + Fact
  • “What changed about Aurora over the past quarter?” → Fact tier with temporal filters on valid_from / valid_until
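
Roughly, the routing plus a single tier query looks like the sketch below (Weaviate v4 Python client). The route() heuristic is a keyword stand-in for the LLM classifier, the date cutoff is a placeholder, and distance metadata is requested so results can be merged afterwards.

```python
# Sketch of the router + per-tier query with the Weaviate v4 client.
# route() is a keyword stand-in for the small LLM classifier described above.
from datetime import datetime, timezone

import weaviate
from weaviate.classes.query import Filter, MetadataQuery


def route(question: str) -> list[str]:
    """Stand-in for the LLM classifier that labels a question with tier names."""
    q = question.lower()
    if "working on" in q:
        return ["Channel"]
    if "changed" in q or "quarter" in q:
        return ["Fact"]
    return ["Topic", "Fact"]


def ask(question: str) -> dict:
    client = weaviate.connect_to_local()
    try:
        results = {}
        for tier in route(question):
            # Temporal questions add a filter on the Fact tier's validity window.
            filters = None
            if tier == "Fact" and "quarter" in question.lower():
                cutoff = datetime(2026, 1, 19, tzinfo=timezone.utc)  # placeholder window start
                filters = Filter.by_property("valid_from").greater_or_equal(cutoff)
            results[tier] = client.collections.get(tier).query.near_text(
                query=question,
                limit=10,
                filters=filters,
                return_metadata=MetadataQuery(distance=True),  # needed for merging below
            ).objects
        return results
    finally:
        client.close()
```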

The cross-tier merging happens app-side today, which is one of the things I’d love feedback on (see below).
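
For the curious, the kind of merge I mean is roughly the following: normalise near_text distances within each tier, then interleave by score and truncate. Illustrative only, not the exact code in the repo, and it assumes the queries requested distance metadata as in the sketch above.

```python
# Illustrative app-side merge: normalise per-tier distances, interleave, truncate.
def merge(results: dict[str, list], limit: int = 10) -> list[tuple[str, object]]:
    scored = []
    for tier, objects in results.items():
        if not objects:
            continue
        dists = [o.metadata.distance for o in objects]
        lo, hi = min(dists), max(dists)
        for o in objects:
            # Lower distance is better; normalise to a 0..1 score within the tier.
            score = 1.0 if hi == lo else 1.0 - (o.metadata.distance - lo) / (hi - lo)
            scored.append((score, tier, o))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(tier, obj) for _, tier, obj in scored[:limit]]
```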

This is the whole reason we call it an LLM Wiki and not RAG — the wiki has structure that flat vector retrieval can’t serve on its own.

Demo

git clone https://github.com/Beever-AI/beever-atlas
cd beever-atlas
make demo

That stands up the full stack under docker-compose with a public Wikipedia corpus pre-loaded, and you can ask grounded questions against the Weaviate-backed wiki in <5 minutes.

There’s also a 16-tool MCP server so Claude Code and Cursor can query the same memory directly — happy to expand on that if anyone’s curious.

What I’d love feedback on

  1. The three-tier vs. single-collection tradeoff. Has anyone tried something similar and found a tier was redundant, or that a fourth tier paid for itself?
  2. Cross-tier retrieval. Right now we query each tier independently and merge in app code. Would a Weaviate-side cross-reference / multi-class query be faster?
  3. Embedding model choice for the Fact tier. We’re using text-embedding-3-small. Anyone tried smaller / domain-tuned models on extracted-fact corpora?

Docs: Beever Atlas (repo linked at the top of this post).
(Disclosure: I’m a maintainer — Beever AI Limited, Toronto. We released this on April 19, 2026.)