
Karpathy's LLM Knowledge Base System: Full Breakdown of His CLAUDE.md Schema | WebEdge

Karpathy published an LLM Wiki Pattern — a methodology for building persistent, compounding knowledge bases with LLM agents. We break down the exact three-layer architecture, CLAUDE.md schema, and the three core operations: Ingest, Query, Lint.

11 April 2026 · 7 min read


WebEdge team

A log.md excerpt, with each operation recorded as a level-two header:

```
## [2026-04-05] query | Question Topic
## [2026-04-07] lint | Weekly health check
```

This format allows Unix tool processing: grep "^## \[" log.md | tail -5 retrieves the last 5 operations. The log is also how the LLM (and the user) understand the wiki's history and activity.
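The same retrieval can be done programmatically. A minimal Python sketch matching the grep pattern above (the in-memory sample stands in for a real log.md):

```python
import re

def last_operations(log_text: str, n: int = 5) -> list[str]:
    """Return the last n operation headers from a log.md string.

    Matches the same lines as the article's `grep "^## \\[" log.md`.
    """
    headers = re.findall(r"^## \[.*$", log_text, flags=re.MULTILINE)
    return headers[-n:]

# Demo on a tiny in-memory log; real usage would read open("log.md").read().
sample = (
    "## [2026-04-05] query | Question Topic\n"
    "Answered from a wiki page\n"
    "## [2026-04-07] lint | Weekly health check\n"
    "Stale claims flagged\n"
)
print(last_operations(sample))
```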

CLI Tooling and qmd

Karpathy recommends building CLI tools to help the LLM operate efficiently at scale. He identifies search as the most critical capability. His recommendation: qmd — a local markdown search engine with hybrid BM25/vector search and LLM re-ranking, all on-device. It provides both a CLI interface and an MCP server interface, allowing the LLM agent to invoke search directly during query processing.

Obsidian as the Front-End

The gist describes an Obsidian-native workflow for managing raw sources:

  • Obsidian Web Clipper — browser extension for one-click web-to-markdown capture
  • Image handling: configure Obsidian attachment folder to raw/assets/, bind "Download attachments for current file" hotkey (e.g. Ctrl+Shift+D) for local image storage. Note: LLMs must view images separately after reading the text file.
  • Obsidian Graph View — visualizes wiki connectivity, identifies hub pages and orphans
  • Marp — markdown slide deck format with Obsidian plugin support
  • Dataview — Obsidian plugin that queries page frontmatter to generate dynamic tables from YAML metadata
  • Git repository — the entire wiki is a version-controlled markdown repo with built-in collaboration
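Dataview's core idea — turning YAML frontmatter into tables — can be sketched in plain Python. The field names and page paths below are illustrative, not prescribed by the gist:

```python
def read_frontmatter(page: str) -> dict:
    """Parse simple `key: value` frontmatter delimited by --- lines."""
    lines = page.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # closing delimiter ends the frontmatter block
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta

# Demo: two hypothetical wiki pages with illustrative metadata fields.
pages = {
    "wiki/competitor-a.md": "---\nstatus: live\nupdated: 2026-04-01\n---\n# Competitor A\n",
    "wiki/old-note.md": "---\nstatus: stale\nupdated: 2025-11-02\n---\n# Old note\n",
}
table = {name: read_frontmatter(text) for name, text in pages.items()}
print(table["wiki/old-note.md"]["status"])
```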

Why This Works (The Maintenance Argument)

Karpathy makes a sharp observation about why knowledge bases typically fail: not because reading is hard, but because maintenance is. Updating cross-references, keeping multiple pages consistent, flagging stale claims — these are exactly the tedious, systematic tasks that LLMs excel at and humans avoid.

He connects the pattern to Vannevar Bush's 1945 Memex concept — a vision of personal, curated knowledge with "associative trails between documents." Bush's vision failed because it lacked a solution to the maintenance problem. LLMs solve that problem.

The human's role in this system: curator, analyst, questioner. The LLM's role: everything else.

Enterprise Application with Claude API

The LLM Wiki pattern is the right conceptual foundation for enterprise knowledge management systems built on the Claude API. Direct applications:

  • Competitive intelligence: Ingest competitor press releases, product changelogs, analyst reports — the wiki maintains live entity pages per competitor, with the Lint operation flagging when claims become outdated
  • Internal expertise base: Every project, every decision, every lesson learned — structured, searchable, cross-referenced, and maintained automatically
  • Customer-facing agents: Support agents that don't just answer questions but file valuable Q&A pairs back into the knowledge base, compounding over time
  • Research synthesis: Analysts who need to track a domain over months rather than isolated sessions
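As one concrete shape the Lint operation could take, here is a sketch that flags pages whose `updated` frontmatter date has gone stale — the field name and threshold are assumptions for illustration, not part of Karpathy's schema:

```python
from datetime import date

def stale_pages(metadata: dict, today: date, max_age_days: int = 90) -> list:
    """Flag pages whose `updated` field is older than max_age_days.

    `metadata` maps page path -> frontmatter dict. `updated` is an
    illustrative field name holding an ISO date string.
    """
    stale = []
    for path, meta in metadata.items():
        updated = meta.get("updated")
        if updated is None:
            stale.append(path)  # no date recorded: treat as stale
            continue
        if (today - date.fromisoformat(updated)).days > max_age_days:
            stale.append(path)
    return stale

meta = {
    "wiki/competitor-a.md": {"updated": "2026-04-01"},
    "wiki/old-note.md": {"updated": "2025-11-02"},
}
print(stale_pages(meta, today=date(2026, 4, 11)))
```

A real Lint pass would go further (checking cross-references and claim consistency), but date-based staleness is the cheapest check to automate.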

The critical architectural insight: the CLAUDE.md schema is the variable. Karpathy is explicit: "The directory structure, the schema conventions, the page formats, the tooling — all of that will depend on your domain." Every element is optional and modular. The schema is co-developed with the LLM, tailored to the specific business domain.

This is exactly what we build at WebEdge — Claude API integrations where the schema evolves with the client's domain and the knowledge base compounds over time.

Summary

Karpathy's LLM Wiki pattern is one of the clearest and most practical methodological contributions to LLM agent design in 2026. It addresses a real architectural problem in traditional RAG and proposes a concrete, implementable solution.

Key components to remember: three-layer architecture (sources → wiki → schema), three operations (Ingest, Query, Lint), two special files (index.md, log.md), and CLAUDE.md as the operational configuration center that you co-develop with the LLM.

If you want to build a system like this for your business using the Claude API, reach out.



WebEdge

We specialise in building custom AI solutions, automation systems and web products for growth-oriented companies in Lithuania. GDPR-compliant, EU-hosted.

Get in touch

Ready to implement AI in your business?

Book a free 30-min call — we'll show you what to automate first in your business process.
