Distiller
Transform one-off AI coding sessions into structured, reusable knowledge assets. Instead of "use once, forget once", extract underlying logic, reusable patterns, and pitfall guides — building compound interest on development experience.
Why This Matters
Most valuable technical knowledge lives in conversations that get closed and forgotten. A debugging breakthrough at 2 AM, an architectural insight during pair programming, a hard-won understanding of a framework quirk — all gone the moment the chat window closes. Distiller exists because knowledge should compound, not evaporate.
Core Principles
- Extract, don't interpret — Faithfully capture what actually happened. Inventing rationale or over-generalizing beyond what the content supports produces unreliable notes that erode trust over time.
- Score by confidence — Not all extractions are equal. Each point gets a confidence level (high/medium/low) so the user knows what to trust immediately vs. what needs verification.
- Scene-specific depth — A bug retrospective needs root-cause chains; a project wrap-up needs architecture decision records. Generic "summarize everything" misses the dimensions that matter most in each context.
- User retains control — AI proposes, user confirms. Every key decision (scene selection, storage location, create vs. append) requires explicit user approval because the user understands their knowledge base better than any model can.
- Zero external dependencies — Pure skill, no CLI tools or package installs required. The search script (scripts/search_distiller.py) uses Python standard library only; if Python is unavailable, a built-in Grep fallback covers the same functionality.
Resource Loading Strategy
This skill follows a three-level progressive disclosure model to stay lightweight:
| Level | What | When to Load |
| ----- | ---- | ------------ |
| Always in context | This SKILL.md body (~200 lines) | Automatically loaded on skill trigger |
| Load on demand | Scene template (references/scene-*.md) | Read only the ONE matched scene file during extraction (Phase 4.1) |
| Load on demand | references/tags-taxonomy.md | Read during extraction (Phase 4.1) to select proper tags |
| Load on demand | assets/distiller-frontmatter.md | Read during extraction (Phase 4.1) for the output skeleton |
| Execute without loading | scripts/search_distiller.py | Run as subprocess in Phase 3.2; if Python unavailable, use Grep fallback described in Phase 3.2 |
Only load what the current phase requires. Reading all reference files upfront wastes context.
Workflow
The workflow has five phases. Each phase groups logically related actions.
Phase 1: Setup
1.1 Initialize Memory
Scan the project root for a .distiller/ directory. If it does not exist, create it with the following files and initial contents:
.distiller/
├── profile.json → {}
├── history.jsonl → (empty)
├── search-index.jsonl → (empty)
└── index.md → # Distilled Knowledge Index\n\n_No entries yet._
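The initialization step can be sketched in Python as follows. This is a minimal illustration, not the skill's actual implementation: the function name `init_memory` and the `SEED_FILES` mapping are hypothetical, and the seed contents are copied from the layout above. Existing files are left untouched so a repeated run does not wipe stored history.

```python
from pathlib import Path

# Seed contents copied from the layout above (names here are illustrative).
SEED_FILES = {
    "profile.json": "{}",
    "history.jsonl": "",
    "search-index.jsonl": "",
    "index.md": "# Distilled Knowledge Index\n\n_No entries yet._",
}


def init_memory(project_root: str) -> Path:
    """Create .distiller/ and its seed files without overwriting existing data."""
    memory_dir = Path(project_root) / ".distiller"
    memory_dir.mkdir(parents=True, exist_ok=True)
    for name, content in SEED_FILES.items():
        target = memory_dir / name
        if not target.exists():  # never clobber a returning user's data
            target.write_text(content, encoding="utf-8")
    return memory_dir
```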
1.2 Load Preferences
Read .distiller/profile.json. Stored values (preferred storage path, frequent tags, scene history) pre-fill later choices so returning users answer fewer questions. On first run, proceed with defaults.
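One way to load preferences with first-run defaults is sketched below. The key names in `DEFAULT_PROFILE` are assumptions made for illustration; the actual schema of profile.json is not specified in this document.

```python
import copy
import json
from pathlib import Path

# Hypothetical key names; the real profile.json schema may differ.
DEFAULT_PROFILE = {
    "preferred_storage": None,
    "frequent_tags": [],
    "scene_history": {},
}


def load_profile(memory_dir: str) -> dict:
    """Return stored preferences layered over defaults."""
    path = Path(memory_dir) / "profile.json"
    try:
        stored = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        stored = {}  # first run or unreadable file: fall back to defaults
    if not isinstance(stored, dict):
        stored = {}
    profile = copy.deepcopy(DEFAULT_PROFILE)  # avoid sharing mutable defaults
    profile.update(stored)
    return profile
```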
1.3 Resolve Interaction Language
Detect the user's language from their latest message:
- Contains Chinese characters → interact in Chinese
- Otherwise → interact in English
All user-facing prompts in subsequent phases MUST use this language.
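The detection rule above can be approximated with a single regular expression. This sketch assumes "Chinese characters" means the CJK Unified Ideographs basic block plus Extension A; the function name is hypothetical.

```python
import re

# Assumed definition of "Chinese characters":
# CJK Unified Ideographs Extension A (U+3400-U+4DBF) and the basic block (U+4E00-U+9FFF).
_CJK = re.compile(r"[\u3400-\u4dbf\u4e00-\u9fff]")


def interaction_language(message: str) -> str:
    """Return 'zh' if the message contains any Chinese character, else 'en'."""
    return "zh" if _CJK.search(message) else "en"
```

Note that these code point ranges also cover Japanese kanji, so a message in Japanese would be classified as Chinese under this rule.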
Phase 2: Scope & Scene
2.1 Ask Scope
Ask the user which content to distill:
| Option | Description |
| ------ | ----------- |
| Current chat | Analyze this conversation only |
| All of today's chats | Scan agent-transcripts/ for today's .jsonl files |
| Specific files/dirs | User provides file or directory paths |
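For the second option, a possible way to collect today's transcripts is shown below. The document does not say how "today" is determined, so this sketch assumes the file's modification time in local time; the function name is hypothetical.

```python
from datetime import date, datetime
from pathlib import Path
from typing import List, Optional


def todays_transcripts(transcripts_dir: str, today: Optional[date] = None) -> List[Path]:
    """List .jsonl files under transcripts_dir whose mtime falls on `today`."""
    today = today or date.today()
    root = Path(transcripts_dir)
    if not root.is_dir():
        return []
    return sorted(
        p
        for p in root.rglob("*.jsonl")
        # Assumption: "today's chat" means last modified today (local time).
        if datetime.fromtimestamp(p.stat().st_mtime).date() == today
    )
```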
2.2 Analyze & Detect Scene
Read the scoped content and auto-detect the best-fit scene by matching content against signal patterns defined in each scene reference file:
| Scene | Reference File | Typical Signals |
| ----- | -------------- | --------------- |
| Project Wrap-up | references/scene-project-wrapup.md | Architecture decisions, module design, deployment, lessons learned |
| Bug Retrospective | references/scene-bug-retrospective.md | Error traces, debugging sessions, root cause analysis, fix verification |
| Code Refactoring | references/scene-refactoring.md | Code restructuring, design patterns, performance optimization |
| Tech Learning | references/scene-tech-learning.md | New framework/tool exploration, API discovery, concept learning |
| Universal | references/scene-universal.md | No scene exceeds threshold, or mixed content |
Select the scene with the highest match score. If none exceeds the threshold, fall back to Universal. Also detect the source language (en or zh) for the document body.
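The selection rule (highest score wins, Universal when nothing exceeds the threshold) might look like the sketch below. The signal keywords, the scene identifiers other than bug-retrospective, the substring-count scoring, and the threshold value are all assumptions for illustration; the real signal patterns live in the scene reference files.

```python
from typing import Dict, List, Tuple

# Hypothetical signals; the authoritative patterns are in references/scene-*.md.
SCENE_SIGNALS: Dict[str, List[str]] = {
    "project-wrapup": ["architecture", "deployment", "lessons learned"],
    "bug-retrospective": ["traceback", "root cause", "stack trace", "reproduce"],
    "refactoring": ["refactor", "design pattern", "restructure"],
    "tech-learning": ["tutorial", "getting started", "api reference"],
}
THRESHOLD = 1  # assumed: a scene must match more than one signal


def detect_scene(text: str, threshold: int = THRESHOLD) -> Tuple[str, int]:
    """Return (scene, score); fall back to 'universal' if no score exceeds threshold."""
    lowered = text.lower()
    scores = {
        scene: sum(signal in lowered for signal in signals)
        for scene, signals in SCENE_SIGNALS.items()
    }
    best = max(scores, key=scores.get)  # ties resolve to the first scene listed
    if scores[best] <= threshold:
        return "universal", scores[best]
    return best, scores[best]
```

Returning the score alongside the scene gives the confirmation step something concrete to show the user.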
2.3 Confirm Scene
Present the detected scene and confidence to the user. Ask them to confirm or override — they may know their intent better than signal matching can infer.
Phase 3: Storage & Dedup
3.1 Ask Storage Target
Ask the user where to save, pre-filling from profile.json if available:
| Option | Path |
| ------ | ---- |
| Obsidian vault | User specifies vault path |
| Project local | docs/distiller/ in current project root |
| Global knowledge base | ~/.agents/distiller/ |
3.2 Search Existing Docs
Search the chosen storage directory for documents with related topics or tags. This prevents knowledge fragmentation — related insights belong together.
Primary method — Run scripts/search_distiller.py:
python scripts/search_distiller.py <keywords> --dir <storage_path> --distiller-dir <project_root>/.distiller
Fallback — If Python is unavailable (command fails or not installed), use the Grep tool directly:
- Search for the distilled tag in .md files: Grep pattern="^ - distilled$" path=<storage_path> glob="*.md"
- Among matches, search for topic keywords: Grep pattern="<keyword1>|<keyword2>" path=<storage_path> glob="*.md"
- If both match the same file, it is a related document
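For reference, the fallback's matching rule (a file is related when it carries the distilled tag and mentions at least one topic keyword) can be written out in Python. This is only a sketch of the rule, not the contents of scripts/search_distiller.py; the function name is hypothetical, and the tag pattern is loosened to tolerate any indentation before the list marker.

```python
import re
from pathlib import Path
from typing import List

# Assumption: the tag appears on its own line as a YAML list item, any indentation.
_TAG_LINE = re.compile(r"^\s*- distilled$", re.MULTILINE)


def related_docs(storage_path: str, keywords: List[str]) -> List[Path]:
    """Return .md files that have the distilled tag and mention any keyword."""
    if not keywords:
        return []
    # Keywords are treated as literal text and matched case-sensitively.
    topic = re.compile("|".join(re.escape(k) for k in keywords))
    hits = []
    for md in sorted(Path(storage_path).rglob("*.md")):
        text = md.read_text(encoding="utf-8", errors="ignore")
        if _TAG_LINE.search(text) and topic.search(text):
            hits.append(md)
    return hits
```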
Present findings and offer:
- Create new — if no closely related docs found
- Append to existing — if a related doc exists (show filename and topic)
User confirms the choice.
Phase 4: Extract & Review
4.1 Extract
Now read the matched references/scene-*.md file. Follow its extraction dimensions, prompt template, and output structure. Apply confidence scoring to each extraction point:
- high — Explicitly stated in source, well-supported, directly actionable
- medium — Reasonably inferred from context, may need verification
- low — Weak signal, speculative, worth noting but needs validation
Format confidence inline: **[high]** The race condition occurs because...
Language rule: Frontmatter keys and values are always in English. Document body (section headers + content) follows the detected source language.
Use the frontmatter skeleton from assets/distiller-frontmatter.md. For tag selection, consult references/tags-taxonomy.md.
4.2 User Review
Present the complete distilled document to the user. Ask them to confirm, revise specific sections, or request re-extraction with different parameters. The user's review is the most important quality gate — no document gets written without their approval.
Phase 5: Persist
Once the user approves:
- Write the document to the chosen path. Filename: YYYY-MM-DD-<topic-slug>.md
- Update .distiller/profile.json — Increment scene frequency, update preferred storage, refresh frequent tags
- Append to .distiller/history.jsonl — {"date": "...", "scene": "...", "topic": "...", "tags": [...], "output_path": "...", "scope": "...", "lang": "..."}
- Append to .distiller/search-index.jsonl — {"file": "...", "title": "...", "keywords": [...], "timestamp": "..."}
- Regenerate .distiller/index.md — Rebuild full index from history.jsonl, grouped by scene then date
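The filename, append, and index steps could be sketched as below. All function names are hypothetical. The slug convention (lowercase ASCII, hyphen-separated) and the layout of the regenerated index are assumptions; only the YYYY-MM-DD-<topic-slug>.md pattern, the history.jsonl fields, and the scene-then-date grouping come from this document.

```python
import json
import re
from datetime import date
from pathlib import Path


def topic_slug(topic: str) -> str:
    """Lowercase, hyphen-separated ASCII slug (an assumed convention)."""
    return re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")


def output_filename(topic: str, day: date) -> str:
    return f"{day.isoformat()}-{topic_slug(topic)}.md"


def append_jsonl(path: Path, record: dict) -> None:
    """Append one JSON object per line; earlier lines are never rewritten."""
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")


def render_index(history_path: Path) -> str:
    """Rebuild the index body from history.jsonl, grouped by scene then date."""
    grouped: dict = {}
    for line in history_path.read_text(encoding="utf-8").splitlines():
        if line.strip():
            entry = json.loads(line)
            grouped.setdefault(entry["scene"], {}).setdefault(entry["date"], []).append(entry)
    out = ["# Distilled Knowledge Index", ""]
    for scene in sorted(grouped):
        out.append(f"## {scene}")
        for day in sorted(grouped[scene], reverse=True):  # newest first
            for entry in grouped[scene][day]:
                stem = Path(entry["output_path"]).stem
                out.append(f"- {day}: [[{stem}]] {entry['topic']}")
        out.append("")
    return "\n".join(out)
```

Because frontmatter titles are always in English, an ASCII-only slug is usually sufficient; a topic with no ASCII letters or digits would produce an empty slug and would need separate handling.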
Output Specification
See assets/distiller-frontmatter.md for the complete frontmatter skeleton and field reference.
Key rules:
- All frontmatter fields and values in English (including title, tags, aliases)
- tags use hierarchical /-separated format per references/tags-taxonomy.md
- Base tag distilled always present
- Body language follows source (en or zh)
- Use [[filename]] wiki-links for cross-references between distilled docs
- Confidence shown inline per extraction point as **[high]**, **[medium]**, or **[low]**
Example Output (Bug Retrospective, Chinese body)
---
title: "Race Condition in Async Task Queue"
date: 2025-07-12
aliases:
- "async queue deadlock"
- "task scheduler race"
tags:
- distilled
- scene/bug-retrospective
- tech/python
- topic/concurrency
- difficulty/advanced
scene: bug-retrospective
source_scope: current-chat
confidence_summary: high
lang: zh
---
## Bug 画像
**[high]** 在并发提交 >50 个任务时,队列偶发死锁。现象为 worker 线程全部阻塞在 `queue.get()`,CPU 降至 0%。
## 根因链
**[high]** 表面症状:任务不再被消费
→ 直接原因:`queue.put()` 在锁内调用
→ 底层原因:生产者持有 Lock A 的同时等待队列空间(需要 Lock B)
→ 根因:经典 AB-BA 死锁,两把锁的获取顺序不一致
Quality Checks
When evaluating whether a distilled document meets quality standards, verify these points:
| Check | Pass Criteria |
| ----- | ------------- |
| Frontmatter completeness | All required fields present, all values in English |
| Tag compliance | Includes distilled + one scene/* + at least one tech/* + one difficulty/* |
| Confidence coverage | Every extraction point has an inline **[high/medium/low]** marker |
| Faithfulness | No claims that go beyond what the source content supports |
| Scene fit | Extraction dimensions match the declared scene template |
| Actionability | A developer reading this 6 months later can apply the knowledge without the original context |
These checks are useful both for the user reviewing a document (Phase 4.2) and for evaluating the skill itself during iteration. If setting up formal evaluation, place test prompts and assertions in evals/evals.json following the standard skill-creator schema.
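The tag compliance check lends itself to automation. The sketch below is one possible reading of the pass criteria, assuming "one scene/*" and "one difficulty/*" mean exactly one; the function name is hypothetical.

```python
from typing import List


def tag_violations(tags: List[str]) -> List[str]:
    """Return a list of tag compliance problems; an empty list means the check passes."""
    problems = []
    if "distilled" not in tags:
        problems.append("missing base tag: distilled")
    # Assumption: "one" in the pass criteria means exactly one.
    if sum(t.startswith("scene/") for t in tags) != 1:
        problems.append("need exactly one scene/* tag")
    if not any(t.startswith("tech/") for t in tags):
        problems.append("need at least one tech/* tag")
    if sum(t.startswith("difficulty/") for t in tags) != 1:
        problems.append("need exactly one difficulty/* tag")
    return problems
```

Applied to the tags from the example output above, the function returns an empty list.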
Resources
Loaded on demand — read only when the current phase requires them:
- references/scene-project-wrapup.md — Project wrap-up extraction template (read in Phase 4.1)
- references/scene-bug-retrospective.md — Bug retrospective extraction template (read in Phase 4.1)
- references/scene-refactoring.md — Code refactoring extraction template (read in Phase 4.1)
- references/scene-tech-learning.md — Tech learning extraction template (read in Phase 4.1)
- references/scene-universal.md — Universal extraction template, fallback (read in Phase 4.1)
- references/tags-taxonomy.md — Hierarchical tag taxonomy (read in Phase 4.1)
- assets/distiller-frontmatter.md — Obsidian-compatible frontmatter skeleton (read in Phase 4.1)
- scripts/search_distiller.py — Search existing distilled documents by keyword/tag (execute in Phase 3.2)