RAG Chatbot Skill
Instructions
-
Setup
- Create Python virtual environment
- Install dependencies:
pip install fastapi uvicorn pydantic openai qdrant-client asyncpg langchain-text-splitters tenacity python-dotenv - Create
.env.examplewith:OPENAI_API_KEY,QDRANT_URL,QDRANT_API_KEY,NEON_DATABASE_URL,EMBED_MODEL,CHAT_MODEL
-
Data model
- Neon tables:
documents(id, path, checksum, meta jsonb)chunks(id, doc_id, content, meta jsonb)sessions(id, user_id, prefs jsonb)
- Qdrant collection
book_chunkswith payload:doc_path,module,week,tags,heading_path
- Neon tables:
-
Ingestion
- Walk markdown glob, parse frontmatter
- Split chunks (by headings and tokens)
- Embed via OpenAI, upsert to Qdrant
- Store doc/chunk metadata in Neon
- Track checksum for incremental ingest
-
Query endpoints
/health- healthcheck/ingest- trigger ingestion/query- semantic search + LLM answer with sources/query/selected- bypass vector search, inject user-selected text as context
-
Ops
- Configure CORS for Docusaurus origin
- Add logging with request IDs
- Include safety: max tokens, fallback answers, latency budget
Examples
# Query endpoint structure
@app.post("/query")
async def query(request: QueryRequest):
# 1. Embed query
# 2. Search Qdrant
# 3. Build context from chunks
# 4. Call OpenAI with context
# 5. Return answer + sources
pass
# Run server
uvicorn main:app --reload --port 8000
Definition of Done
- FastAPI runs locally with env sample; healthcheck ok
- Ingestion populates Qdrant + Neon with at least sample doc; idempotent reruns succeed
- Query endpoints return grounded answers with cited headings
- Selected-text mode echoes user selection
- README snippet for running server and triggering ingest
Scan to join WeChat group