向野而深的博客

Karpathy's LLM Wiki Explained

LLM

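A pattern for building personal knowledge bases with large language models (LLMs).

This is an idea document, designed to be copied and pasted into your own LLM agent (such as OpenAI Codex, Claude Code, or OpenCode / Pi). Its main goal is to convey the core idea; the implementation details are worked out between you and your agent.

Karpathy's original [llm.md](https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f)

Original text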



# LLM Wiki

A pattern for building personal knowledge bases using LLMs.

This is an idea file, designed to be copied and pasted into your own LLM agent (e.g. OpenAI Codex, Claude Code, OpenCode / Pi, etc.). Its goal is to communicate the high-level idea; your agent will build out the specifics in collaboration with you.

## The core idea

Most people's experience with LLMs and documents looks like RAG: you upload a collection of files, the LLM retrieves relevant chunks at query time, and generates an answer. This works, but the LLM is rediscovering knowledge from scratch on every question. There's no accumulation. Ask a subtle question that requires synthesizing five documents, and the LLM has to find and piece together the relevant fragments every time. Nothing is built up. NotebookLM, ChatGPT file uploads, and most RAG systems work this way.

The idea here is different. Instead of just retrieving from raw documents at query time, the LLM **incrementally builds and maintains a persistent wiki** — a structured, interlinked collection of markdown files that sits between you and the raw sources. When you add a new source, the LLM doesn't just index it for later retrieval. It reads it, extracts the key information, and integrates it into the existing wiki — updating entity pages, revising topic summaries, noting where new data contradicts old claims, strengthening or challenging the evolving synthesis. The knowledge is compiled once and then *kept current*, not re-derived on every query.

This is the key difference: **the wiki is a persistent, compounding artifact.** The cross-references are already there. The contradictions have already been flagged. The synthesis already reflects everything you've read. The wiki keeps getting richer with every source you add and every question you ask.

You never (or rarely) write the wiki yourself — the LLM writes and maintains all of it. You're in charge of sourcing, exploration, and asking the right questions. The LLM does all the grunt work — the summarizing, cross-referencing, filing, and bookkeeping that makes a knowledge base actually useful over time. In practice, I have the LLM agent open on one side and Obsidian open on the other. The LLM makes edits based on our conversation, and I browse the results in real time — following links, checking the graph view, reading the updated pages. Obsidian is the IDE; the LLM is the programmer; the wiki is the codebase.

This can apply to a lot of different contexts. A few examples:

- **Personal**: tracking your own goals, health, psychology, self-improvement — filing journal entries, articles, podcast notes, and building up a structured picture of yourself over time.
- **Research**: going deep on a topic over weeks or months — reading papers, articles, reports, and incrementally building a comprehensive wiki with an evolving thesis.
- **Reading a book**: filing each chapter as you go, building out pages for characters, themes, plot threads, and how they connect. By the end you have a rich companion wiki. Think of fan wikis like [Tolkien Gateway](https://tolkiengateway.net/wiki/Main_Page) — thousands of interlinked pages covering characters, places, events, languages, built by a community of volunteers over years. You could build something like that personally as you read, with the LLM doing all the cross-referencing and maintenance.
- **Business/team**: an internal wiki maintained by LLMs, fed by Slack threads, meeting transcripts, project documents, customer calls. Possibly with humans in the loop reviewing updates. The wiki stays current because the LLM does the maintenance that no one on the team wants to do.
- **Competitive analysis, due diligence, trip planning, course notes, hobby deep-dives** — anything where you're accumulating knowledge over time and want it organized rather than scattered.

## Architecture

There are three layers:

**Raw sources** — your curated collection of source documents. Articles, papers, images, data files. These are immutable — the LLM reads from them but never modifies them. This is your source of truth.

**The wiki** — a directory of LLM-generated markdown files. Summaries, entity pages, concept pages, comparisons, an overview, a synthesis. The LLM owns this layer entirely. It creates pages, updates them when new sources arrive, maintains cross-references, and keeps everything consistent. You read it; the LLM writes it.

**The schema** — a document (e.g. CLAUDE.md for Claude Code or AGENTS.md for Codex) that tells the LLM how the wiki is structured, what the conventions are, and what workflows to follow when ingesting sources, answering questions, or maintaining the wiki. This is the key configuration file — it's what makes the LLM a disciplined wiki maintainer rather than a generic chatbot. You and the LLM co-evolve this over time as you figure out what works for your domain.
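
As a rough illustration of the three layers, here is a minimal scaffolding sketch. The directory names (`raw/`, `raw/assets/`, `wiki/`), the stub text, and the `scaffold` function are all assumptions made for this example; the document deliberately leaves the exact layout to you and your agent.

```python
from pathlib import Path

# Hypothetical layout: these names are illustrative assumptions,
# not something the original document prescribes.
LAYOUT_DIRS = ["raw", "raw/assets", "wiki"]
SCHEMA_FILE = "AGENTS.md"  # or CLAUDE.md, depending on the agent

SCHEMA_STUB = """# Wiki schema

- raw/ holds immutable sources: read them, never modify them.
- wiki/ holds LLM-maintained markdown pages.
- wiki/index.md catalogs every page; wiki/log.md is append-only.
"""

STARTER_FILES = [
    (SCHEMA_FILE, SCHEMA_STUB),
    ("wiki/index.md", "# Index\n"),
    ("wiki/log.md", "# Log\n"),
]


def scaffold(root: Path) -> list[Path]:
    """Create the three layers under `root`; return the paths that were created."""
    created = []
    for name in LAYOUT_DIRS:
        path = root / name
        if not path.exists():
            path.mkdir(parents=True)
            created.append(path)
    for rel, text in STARTER_FILES:
        path = root / rel
        if not path.exists():  # never overwrite an existing file
            path.write_text(text, encoding="utf-8")
            created.append(path)
    return created
```

Calling `scaffold(Path("my-kb"))` a second time returns an empty list, so it is safe to re-run on an existing knowledge base.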

## Operations

**Ingest.** You drop a new source into the raw collection and tell the LLM to process it. An example flow: the LLM reads the source, discusses key takeaways with you, writes a summary page in the wiki, updates the index, updates relevant entity and concept pages across the wiki, and appends an entry to the log. A single source might touch 10-15 wiki pages. Personally I prefer to ingest sources one at a time and stay involved — I read the summaries, check the updates, and guide the LLM on what to emphasize. But you could also batch-ingest many sources at once with less supervision. It's up to you to develop the workflow that fits your style and document it in the schema for future sessions.
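
One way to support this flow is a small helper that tells the agent which raw files have not been ingested yet, and whether any previously ingested file has changed (which would violate the "raw sources are immutable" rule). This is only a sketch under stated assumptions: the JSON manifest, its location outside `raw/`, and the function names are hypothetical and not part of the original pattern.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of the file contents, used to detect modifications."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def load_manifest(manifest_path: Path) -> dict[str, str]:
    """Hypothetical manifest: maps a source's relative path to its digest."""
    if manifest_path.exists():
        return json.loads(manifest_path.read_text(encoding="utf-8"))
    return {}


def scan_sources(raw_dir: Path, manifest_path: Path) -> tuple[list[str], list[str]]:
    """Return (new, changed): sources not yet ingested, and ingested sources
    whose contents no longer match the recorded digest."""
    known = load_manifest(manifest_path)
    new, changed = [], []
    for path in sorted(p for p in raw_dir.rglob("*") if p.is_file()):
        rel = path.relative_to(raw_dir).as_posix()
        if rel not in known:
            new.append(rel)
        elif known[rel] != file_digest(path):
            changed.append(rel)
    return new, changed


def mark_ingested(raw_dir: Path, manifest_path: Path, rel: str) -> None:
    """Record a source as ingested once the wiki has been updated for it."""
    known = load_manifest(manifest_path)
    known[rel] = file_digest(raw_dir / rel)
    manifest_path.write_text(
        json.dumps(known, indent=2, sort_keys=True), encoding="utf-8"
    )
```

The manifest is assumed to live outside `raw/` so that it is not itself reported as a new source.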

**Query.** You ask questions against the wiki. The LLM searches for relevant pages, reads them, and synthesizes an answer with citations. Answers can take different forms depending on the question — a markdown page, a comparison table, a slide deck (Marp), a chart (matplotlib), a canvas. The important insight: **good answers can be filed back into the wiki as new pages.** A comparison you asked for, an analysis, a connection you discovered — these are valuable and shouldn't disappear into chat history. This way your explorations compound in the knowledge base just like ingested sources do.

**Lint.** Periodically, ask the LLM to health-check the wiki. Look for: contradictions between pages, stale claims that newer sources have superseded, orphan pages with no inbound links, important concepts mentioned but lacking their own page, missing cross-references, data gaps that could be filled with a web search. The LLM is good at suggesting new questions to investigate and new sources to look for. This keeps the wiki healthy as it grows.
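
Two of these checks are mechanical enough to script: orphan pages and links to pages that do not exist yet. The sketch below assumes Obsidian-style `[[Page]]` wikilinks that use bare page names (optionally with `#heading` or `|alias`), and it ignores `index` and `log` as link sources since the index links to everything. The function name and these conventions are assumptions; judgment-based checks such as contradictions and stale claims still need the LLM.

```python
import re
from pathlib import Path

# Matches [[Page]], [[Page#Heading]], [[Page|alias]]; group 1 is the page name.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def lint_links(wiki_dir: Path, ignore=("index", "log")):
    """Return (orphans, missing).

    orphans: page names with no inbound links from other content pages.
    missing: {target name: [pages that link to it]} for links with no page.
    """
    pages = {p.stem: p for p in wiki_dir.rglob("*.md")}
    inbound = {name: set() for name in pages}
    missing: dict[str, set[str]] = {}
    for name, path in pages.items():
        if name in ignore:
            continue  # the index links to every page, so it does not count
        text = path.read_text(encoding="utf-8")
        for match in WIKILINK.finditer(text):
            target = match.group(1).strip()
            if target == name:
                continue  # self-links do not count as inbound
            if target in pages:
                inbound[target].add(name)
            else:
                missing.setdefault(target, set()).add(name)
    orphans = sorted(n for n, src in inbound.items() if not src and n not in ignore)
    return orphans, {t: sorted(s) for t, s in sorted(missing.items())}
```

The `missing` result doubles as a to-do list of concepts that are mentioned but lack their own page.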

## Indexing and logging

Two special files help the LLM (and you) navigate the wiki as it grows. They serve different purposes:

**index.md** is content-oriented. It's a catalog of everything in the wiki — each page listed with a link, a one-line summary, and optionally metadata like date or source count. Organized by category (entities, concepts, sources, etc.). The LLM updates it on every ingest. When answering a query, the LLM reads the index first to find relevant pages, then drills into them. This works surprisingly well at moderate scale (~100 sources, ~hundreds of pages) and avoids the need for embedding-based RAG infrastructure.
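
In the pattern described here the LLM maintains `index.md` itself, but a script can regenerate a baseline or serve as a cross-check. The sketch below assumes one subdirectory per category (for example `entities/`, `concepts/`), a leading `# Title` heading on each page, no YAML frontmatter, and that the first non-heading line is usable as the one-line summary. All of these conventions, and the function names, are assumptions for illustration.

```python
from pathlib import Path


def page_summary(path: Path) -> tuple[str, str]:
    """Return (title, summary): the first '# ' heading and the first
    non-empty, non-heading line. Assumes the page has no frontmatter."""
    title, summary = None, ""
    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("#"):
            if title is None and line.startswith("# "):
                title = line[2:].strip()
            continue
        summary = line
        break
    return title or path.stem, summary


def build_index(wiki_dir: Path) -> str:
    """Render a catalog grouped by category subdirectory."""
    lines = ["# Index", ""]
    for category in sorted(p for p in wiki_dir.iterdir() if p.is_dir()):
        pages = sorted(category.glob("*.md"))
        if not pages:
            continue
        lines += [f"## {category.name.capitalize()}", ""]
        for page in pages:
            title, summary = page_summary(page)
            rel = page.relative_to(wiki_dir).as_posix()
            lines.append(f"- [{title}]({rel}): {summary}")
        lines.append("")
    return "\n".join(lines)
```

Diffing this output against the LLM-maintained index is a cheap way to spot pages that were created but never cataloged.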

**log.md** is chronological. It's an append-only record of what happened and when — ingests, queries, lint passes. A useful tip: if each entry starts with a consistent prefix (e.g. `## [2026-04-02] ingest | Article Title`), the log becomes parseable with simple unix tools — `grep "^## \[" log.md | tail -5` gives you the last 5 entries. The log gives you a timeline of the wiki's evolution and helps the LLM understand what's been done recently.

## Optional: CLI tools

At some point you may want to build small tools that help the LLM operate on the wiki more efficiently. A search engine over the wiki pages is the most obvious one — at small scale the index file is enough, but as the wiki grows you want proper search. [qmd](https://github.com/tobi/qmd) is a good option: it's a local search engine for markdown files with hybrid BM25/vector search and LLM re-ranking, all on-device. It has both a CLI (so the LLM can shell out to it) and an MCP server (so the LLM can use it as a native tool). You could also build something simpler yourself — the LLM can help you vibe-code a naive search script as the need arises.
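
For a sense of what a "naive search script" could look like, here is a minimal sketch that ranks pages by raw keyword counts with a small boost for matches in the file name. It is not BM25, does not use embeddings, and is unrelated to how qmd works; the scoring weights and function names are arbitrary assumptions.

```python
import re
from collections import Counter
from pathlib import Path

TOKEN = re.compile(r"[a-z0-9]+")


def tokenize(text: str) -> list[str]:
    """Lowercase and split into ASCII alphanumeric tokens."""
    return TOKEN.findall(text.lower())


def search(wiki_dir: Path, query: str, limit: int = 5) -> list[tuple[int, str]]:
    """Return up to `limit` (score, relative path) pairs, best first."""
    terms = set(tokenize(query))
    results = []
    for path in wiki_dir.rglob("*.md"):
        counts = Counter(tokenize(path.read_text(encoding="utf-8")))
        score = sum(counts[t] for t in terms)
        # Arbitrary boost: a query term in the file name counts as 3 hits.
        score += 3 * len(terms & set(tokenize(path.stem)))
        if score > 0:
            results.append((score, path.relative_to(wiki_dir).as_posix()))
    results.sort(key=lambda r: (-r[0], r[1]))
    return results[:limit]
```

Something this simple is mainly useful as a stopgap between "the index file is enough" and adopting a proper search tool.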

## Tips and tricks

- **Obsidian Web Clipper** is a browser extension that converts web articles to markdown. Very useful for quickly getting sources into your raw collection.
- **Download images locally.** In Obsidian Settings → Files and links, set "Attachment folder path" to a fixed directory (e.g. `raw/assets/`). Then in Settings → Hotkeys, search for "Download" to find "Download attachments for current file" and bind it to a hotkey (e.g. Ctrl+Shift+D). After clipping an article, hit the hotkey and all images get downloaded to local disk. This is optional but useful — it lets the LLM view and reference images directly instead of relying on URLs that may break. Note that LLMs can't natively read markdown with inline images in one pass — the workaround is to have the LLM read the text first, then view some or all of the referenced images separately to gain additional context. It's a bit clunky but works well enough.
- **Obsidian's graph view** is the best way to see the shape of your wiki — what's connected to what, which pages are hubs, which are orphans.
- **Marp** is a markdown-based slide deck format. Obsidian has a plugin for it. Useful for generating presentations directly from wiki content.
- **Dataview** is an Obsidian plugin that runs queries over page frontmatter. If your LLM adds YAML frontmatter to wiki pages (tags, dates, source counts), Dataview can generate dynamic tables and lists.
- The wiki is just a git repo of markdown files. You get version history, branching, and collaboration for free.
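
If you want a frontmatter overview outside Obsidian, the idea behind the Dataview tip can be approximated with a short script. This sketch reads only flat `key: value` pairs between the opening and closing `---` lines; it is not a YAML parser and does not implement Dataview's query language. The field names in the usage note are hypothetical.

```python
from pathlib import Path


def read_frontmatter(path: Path) -> dict[str, str]:
    """Parse flat 'key: value' frontmatter; return {} if there is none."""
    lines = path.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}  # unterminated block: treat as no frontmatter


def frontmatter_table(wiki_dir: Path, fields: list[str]) -> str:
    """Render a markdown table with one row per page that has frontmatter."""
    header = "| page | " + " | ".join(fields) + " |"
    rule = "|" + " --- |" * (len(fields) + 1)
    rows = [header, rule]
    for path in sorted(wiki_dir.rglob("*.md")):
        meta = read_frontmatter(path)
        if not meta:
            continue
        cells = [meta.get(field, "") for field in fields]
        rows.append(f"| {path.stem} | " + " | ".join(cells) + " |")
    return "\n".join(rows)
```

For example, `frontmatter_table(Path("wiki"), ["updated", "sources"])` would list each page with its `updated` and `sources` values, assuming your schema asks the LLM to write those keys.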

## Why this works

The tedious part of maintaining a knowledge base is not the reading or the thinking — it's the bookkeeping. Updating cross-references, keeping summaries current, noting when new data contradicts old claims, maintaining consistency across dozens of pages. Humans abandon wikis because the maintenance burden grows faster than the value. LLMs don't get bored, don't forget to update a cross-reference, and can touch 15 files in one pass. The wiki stays maintained because the cost of maintenance is near zero.

The human's job is to curate sources, direct the analysis, ask good questions, and think about what it all means. The LLM's job is everything else.

The idea is related in spirit to Vannevar Bush's Memex (1945) — a personal, curated knowledge store with associative trails between documents. Bush's vision was closer to this than to what the web became: private, actively curated, with the connections between documents as valuable as the documents themselves. The part he couldn't solve was who does the maintenance. The LLM handles that.


## Note

This document is intentionally abstract. It describes the idea, not a specific implementation. The exact directory structure, the schema conventions, the page formats, the tooling — all of that will depend on your domain, your preferences, and your LLM of choice. Everything mentioned above is optional and modular — pick what's useful, ignore what isn't. For example: your sources might be text-only, so you don't need image handling at all. Your wiki might be small enough that the index file is all you need, no search engine required. You might not care about slide decks and just want markdown pages. You might want a completely different set of output formats. The right way to use this is to share it with your LLM agent and work together to instantiate a version that fits your needs. The document's only job is to communicate the pattern. Your LLM can figure out the rest.


(To be continued)


