llm-wiki

Core Extraction (Thesis & Synthesis)

The LLM Wiki pattern replaces standard, de-contextualized Retrieval-Augmented Generation (RAG) with an LLM that incrementally builds and maintains a persistent, compounding wiki of structured markdown files. Standard RAG systems query raw documents in isolation and re-derive insights from scratch on every request. This pattern instead compiles each incoming source once: it integrates the new data, resolves contradictions, and maintains bi-directional links. The analytical heavy lifting happens up front at ingestion, so later queries draw on a pre-synthesized network of information.
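The compile-once idea can be sketched roughly as follows. This is a minimal illustration, not the pattern's actual implementation: the `ingest` function, the hash-based `manifest`, and the `synthesize` callable (a stand-in for whatever LLM call produces page updates) are all hypothetical names introduced here.

```python
import hashlib
from typing import Callable, Dict

# Hypothetical stand-in for the LLM step: given a source text and the current
# wiki, it returns {page_name: text_to_append}. Not part of the original note.
Synthesizer = Callable[[str, Dict[str, str]], Dict[str, str]]


def ingest(
    source_name: str,
    source_text: str,
    wiki: Dict[str, str],
    manifest: Dict[str, str],
    synthesize: Synthesizer,
) -> bool:
    """Compile a source into the wiki once; skip it if already compiled.

    Assumption: a content hash per source is enough to detect that the
    source was already ingested. Returns True if the wiki was updated.
    """
    digest = hashlib.sha256(source_text.encode("utf-8")).hexdigest()
    if manifest.get(source_name) == digest:
        return False  # unchanged source: no re-derivation needed

    # The expensive synthesis happens here, at ingestion, not at query time.
    for page, addition in synthesize(source_text, wiki).items():
        wiki[page] = (wiki.get(page, "") + "\n" + addition).strip()

    manifest[source_name] = digest
    return True
```

Queries would then read from `wiki` directly, so the cost of synthesis is paid per source rather than per question.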

The architecture separates immutable raw sources from the LLM-owned wiki directory. A versioned schema contract (such as CLAUDE.md or AGENTS.md) governs how the LLM may edit content. Standard operations are structured ingestion, user querying, and periodic lint checks that resolve orphans and conceptual gaps. Because the LLM does the upkeep, the maintenance cost of the knowledge base stays near zero for the human, who can focus on curation, exploration, and synthesis.
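One piece of the lint step, finding orphans, might look like the sketch below. It assumes pages link to each other with `[[Page Name]]` wikilinks and that an `index` page is an entry point that never counts as an orphan; neither detail is specified in the source, and the function names are illustrative only.

```python
import re
from pathlib import Path
from typing import Dict, FrozenSet, List

# Assumption: links use [[Target]], [[Target|alias]] or [[Target#heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:[|#][^\]]*)?\]\]")


def find_orphans(
    pages: Dict[str, str],
    roots: FrozenSet[str] = frozenset({"index"}),
) -> List[str]:
    """Return names of pages that no other page links to."""
    linked = set()
    for name, text in pages.items():
        for target in WIKILINK.findall(text):
            target = target.strip()
            if target != name:  # a self-link does not rescue a page
                linked.add(target)
    return sorted(n for n in pages if n not in linked and n not in roots)


def load_pages(wiki_dir: str) -> Dict[str, str]:
    """Read every markdown file under wiki_dir, keyed by file stem."""
    return {
        p.stem: p.read_text(encoding="utf-8")
        for p in Path(wiki_dir).rglob("*.md")
    }
```

A periodic job could call `find_orphans(load_pages("wiki"))` and hand the result to the LLM to link or merge; detecting conceptual gaps would need a separate, less mechanical check.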

Source Grounding

Source Document: llm-wiki.md
Exact Citation: Local Reference

Semantic Connections

Links:
- Concept Anchors:
  - LLM Wiki: A design pattern for managing personal knowledge by using an LLM to incrementally build and maintain a persistent, compounding folder of markdown files.
  - Retrieval-Augmented Generation: A framework that retrieves raw content chunks at query time to generate answers, without persistent accumulation of insights.
  - Schema Contract: A governance document defining folder structures, metadata standards, and LLM editing rules for a wiki.
- Relational Connections:
  - LLM Wiki challenges Retrieval-Augmented Generation by compiling and interlinking insights once at ingestion rather than re-deriving them at query time.
  - Schema Contract guides the operations of LLM Wiki by locking the LLM into strict editing and indexing constraints.

Inquiry & Speculation

At what scale do text-based index files like index.md become too slow or costly, so that moving to a vector database is justified?
How do we handle file conflicts and transaction safety when multiple human editors and agent sub-routines commit to the same Git repository?
Can we expose local hybrid search engines (like qmd) to LLM agents as native Model Context Protocol (MCP) tools for high-speed indexing?

Garth Schwer
