Back to skills
extension
Category: Content & MediaNo API key required

py_mnn_kb

Local vector knowledge base with GraphRAG retrieval (vector + BM25 + knowledge graph). Use this skill when the user mentions: "查知识库", "加入知识库", "记住这个", "save to KB", "add to knowledge base", "query knowledge base", "记录一下", or similar intent to store or retrieve private knowledge.

personAuthor: jakexiaohubgithub

py_mnn_kb — MNN Knowledge Base Skill

Local GraphRAG knowledge base backed by SQLite + MNN embeddings. Fully compatible with Android OfflineAI RAG database format.


Setup

1. Install dependencies

pip install -r requirements.txt

2. Configure

cp config.example.json config.json
# Edit config.json: set llm_api.api_key, optionally change default_name

Key fields in config.json:

| Field | Default | Description | |---|---|---| | knowledge_base.default_name | default | KB used when --kb is omitted | | knowledge_base.storage_dir | assets/knowledge_bases | Where DB files are stored | | llm_api.api_key | (required for query+LLM) | OpenAI-compatible API key | | graph_ner.custom_dict_path | assets/example_terms.json | Domain terminology for NER |

3. First run (auto-downloads embedding model)

python scripts/py_mnn_kb.py status

On first use, Qwen3-Embedding-0.6B-MNN-int4 (~400 MB) is auto-downloaded into assets/.


Tools

kb_build — Build / append knowledge base from files

Indexes a directory of documents. Runs in background; returns immediately. Check progress with kb_status.

Parameters: | Name | Type | Required | Description | |---|---|---|---| | dir_path | string | yes | Directory path to index (recursive) | | kb_name | string | no | KB name (default: value of default_name in config.json) |

Returns: { status, command, kb_name, pid, files, message }

Supported formats: .txt .md .pdf .docx .pptx .xlsx .csv .html .json .jsonl

CLI:

python scripts/py_mnn_kb.py build ./my_docs/ --kb my_kb
python scripts/py_mnn_kb.py build ./my_docs/          # uses default KB name

Trigger phrases: "加入知识库", "索引这个目录", "build KB", "index these files"


kb_note — Insert a text note directly into the knowledge base

Embeds and stores a free-form text snippet. Synchronous. Refused while build is running.

Parameters: | Name | Type | Required | Description | |---|---|---|---| | text | string | yes | Text content to store | | kb_name | string | no | KB name (default: default_name) | | title | string | no | Optional title, stored as source label |

Returns: { status, kb_name, chunks_added, elapsed_sec }

CLI:

python scripts/py_mnn_kb.py note "Q1 roadmap: focus on modules A and B" --kb my_kb
python scripts/py_mnn_kb.py note "$(cat meeting.txt)" --kb my_kb --title "Weekly meeting"

Trigger phrases: "记住这个", "记录一下", "加个笔记", "save this", "remember this"


kb_query — Retrieve relevant chunks (RAG retrieval)

Runs vector + BM25 + GraphRAG fusion and returns the top-N context string. The agent appends this context to its prompt — no LLM call is made inside this tool. Synchronous. Refused while build is running.

Parameters: | Name | Type | Required | Description | |---|---|---|---| | prompt | string | yes | Query question or keywords | | kb_name | string | no | KB name (default: default_name) |

Returns: Multi-document context string, e.g.:

Document1 [ID:42 source:manual.pdf]:
Deployment has three steps...

Document2 [ID:55 source:notes.md]:
...

CLI:

python scripts/py_mnn_kb.py query "NAND筛选核心流程" --kb my_kb --no-llm
python scripts/py_mnn_kb.py --output json query "产品路线图" --kb my_kb

Agent usage pattern:

context = kb_query("用户的问题", kb_name="my_kb")
# Then: f"Based on the following context:\n{context}\n\nQuestion: {user_question}"

Trigger phrases: "查知识库", "查一下", "知识库里有没有", "search KB"


kb_status — Check build progress or last build result

No KB initialization needed. Always returns instantly.

Parameters: | Name | Type | Required | Description | |---|---|---|---| | kb_name | string | no | (informational only, does not affect result) |

Returns:

  • While building: { status: "building", progress: 0-100, message }
  • After success: { status: "ok", message, stats: { chunks_added, elapsed_sec, ... } }
  • After failure: { status: "error", error }
  • Not yet run: { status: "idle", message }

CLI:

python scripts/py_mnn_kb.py status

Trigger phrases: "构建进度", "build status", "知识库建好了吗"


Workflow Examples

A · User uploads files → auto-index

User: "把这些文档加入知识库"
Agent → save files to temp dir
      → kb_build(dir_path=tmp_dir, kb_name="my_kb")   # returns immediately
      → "已开始后台构建,用 kb_status 检查进度"

B · User dictates a note → insert

User: "记住:STAR2000 低温写性能提升 8%"
Agent → kb_note(text="STAR2000 低温写性能提升 8%", kb_name="my_kb", title="技术发现")
      → "已保存到知识库 my_kb"

C · User asks a question → KB-assisted answer

User: "NAND 筛选核心流程是什么?"
Agent → context = kb_query("NAND 筛选核心流程", kb_name="my_kb")
      → append context to LLM prompt → generate answer

D · Check if build finished before querying

Agent → st = kb_status()
      → if st["status"] == "building": tell user to wait
      → else: proceed with kb_query(...)

Notes

  • kb_build is incremental append — re-running on the same directory adds only new content
  • kb_note and kb_query are blocked (return status: building) while a build is running
  • --output json on any CLI command returns machine-parseable JSON on stdout
  • KB name default is used when --kb is omitted; configure default_name in config.json