返回 Skill 列表
extension
分类: 开发与工程无需 API Key

RAG

构建、优化和调试 RAG 管道,包括分块策略、检索调优、评估指标以及生产监控。

person作者: ivangdavilahubclawhub

When to Use

User wants to implement, improve, or troubleshoot Retrieval-Augmented Generation systems.

Quick Reference

| Topic | File | |-------|------| | Pipeline components & architecture | architecture.md | | Implementation patterns & code | implementation.md | | Evaluation metrics & debugging | evaluation.md | | Security & compliance | security.md |

Core Capabilities

  1. Architecture design — Select embedding models, vector DBs, and chunking strategies based on requirements
  2. Implementation — Write ingestion pipelines, query handlers, and update logic
  3. Retrieval optimization — Tune top-k, reranking, hybrid search parameters
  4. Evaluation — Build test datasets, measure recall/precision, diagnose failures
  5. Production ops — Monitor quality drift, set up alerts, debug degradation
  6. Security — PII detection, access control, compliance requirements

Decision Checklist

Before recommending architecture, ask:

  • [ ] What document types and volume?
  • [ ] Latency requirements (real-time chat vs batch)?
  • [ ] Update frequency (how often do docs change)?
  • [ ] Access control needs (who can see what)?
  • [ ] Compliance constraints (GDPR, HIPAA, SOC2)?
  • [ ] Budget (managed vs self-hosted, embedding costs)?

Critical Rules

  • Never skip access control — Filter at retrieval time, not after
  • Always overlap chunks — 10-20% prevents context loss at boundaries
  • Evaluate before optimizing — Build eval dataset first, then tune
  • Same embedding model — Query and documents must use identical model
  • Monitor similarity scores — Dropping averages signal drift or issues
  • Plan for deletion — GDPR erasure requires re-embedding capability

Common Failure Patterns

| Symptom | Likely Cause | Fix | |---------|--------------|-----| | Wrong docs retrieved | Query too vague, poor chunks | Query expansion, smaller chunks | | Relevant doc missed | Not indexed, low similarity | Check ingestion, hybrid search | | Hallucinated answers | Context too short | Increase top-k, better reranking | | Slow responses | Large chunks, no caching | Optimize chunk size, cache embeddings | | Inconsistent results | Non-deterministic reranking | Set seeds, use stable sorting |