# Cloud Edge Writing
Build cloud-edge collaborative intelligent scientific writing systems.
## When to Activate

| Trigger | Priority |
|---------|----------|
| Cloud-edge AI application | HIGH |
| OpenVINO local inference integration | HIGH |
| RAG knowledge base system | HIGH |
| SSE streaming interaction design | MEDIUM |
| Scientific writing assistant | MEDIUM |
## The Three-Layer Model
Layer 1: Patterns → Fully abstractable (workflow, protocol, architecture)
Layer 2: Configuration → Parameterizable (LLM config, model paths, hardware)
Layer 3: Environment → Prerequisites checklist (drivers, API keys, models)
See references/layers/ for detailed guidance on each layer.
## Core Workflow

### Phase 1: Environment Preparation (Layer 3)
Goal: Validate all prerequisites before coding.
Option A: Interactive Setup Wizard (Recommended)
# One-click setup with guided prompts
python scripts/setup-wizard.py
The wizard will:
- Check Python version
- Install dependencies (cloud/edge)
- Create config files from templates
- Guide API key configuration
- Check model download status
Option B: Manual Setup
# 1. Check environment
python scripts/check-environment.py
# 2. Install dependencies
pip install -r templates/requirements-cloud.txt # Cloud only
pip install -r templates/requirements-edge.txt # Edge only
# 3. Configure environment variables
cp templates/.env.template .env
# Edit .env with your API keys
Checklist:
- [ ] Python ≥ 3.10
- [ ] OpenVINO runtime installed (for edge)
- [ ] NPU driver ready (for edge)
- [ ] LLM API key configured (DeepSeek/OpenAI/etc.)
- [ ] Model files downloaded (for edge)
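Part of the checklist above can be verified programmatically. The following is a minimal sketch, not the contents of `scripts/check-environment.py`: the function name `check_environment` and the environment variable `LLM_API_KEY` are hypothetical, and the NPU driver and model files are not covered because checking them depends on the platform and the model layout.

```python
import importlib.util
import os
import sys


def check_environment(env=None, min_python=(3, 10)):
    """Return a dict mapping each checked item to True or False."""
    env = os.environ if env is None else env
    return {
        # Checklist item: Python >= 3.10
        "python_version": sys.version_info[:2] >= min_python,
        # Checklist item: OpenVINO runtime (its Python package imports as `openvino`)
        "openvino_installed": importlib.util.find_spec("openvino") is not None,
        # Checklist item: LLM API key; LLM_API_KEY is a hypothetical variable name
        "llm_api_key_set": bool(env.get("LLM_API_KEY", "").strip()),
    }


if __name__ == "__main__":
    # Print the results in the same checkbox style as the checklist.
    for item, ok in check_environment().items():
        print(f"[{'x' if ok else ' '}] {item}")
```

Passing `env` explicitly makes the check easy to exercise without touching the real environment.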
Model Download:
# Check model status and get download links
python scripts/download-models.py
See references/layers/layer3-environment.md for full details.
### Phase 2: Configuration (Layer 2)
Goal: Generate parameterized config for your deployment.
# Generate config from template
cp templates/config-template.yaml config.yaml
# Edit config.yaml with your settings
Key Parameters:
| Parameter | Example | Description |
|-----------|---------|-------------|
| llm.base_url | https://api.deepseek.com | LLM API endpoint |
| llm.api_key | sk-xxx | API key |
| llm.model | deepseek-v4-flash | Model name |
| edge.general_model | Qwen3-8B-ov-npu | Edge general model |
| edge.translate_model | HY-MT1.5-1.8B-int4-ov | Edge translate model |
See references/layers/layer2-config.md for full config reference.
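As an illustration of how the key parameters could be checked before startup, here is a small sketch. It assumes `config.yaml` has already been parsed into a nested dictionary (for example with a YAML library) and that the dotted names in the table map to nested keys. The helper names `lookup` and `missing_keys` are hypothetical and are not taken from `scripts/validate-config.py`.

```python
# Dotted keys taken from the Key Parameters table.
REQUIRED_KEYS = [
    "llm.base_url",
    "llm.api_key",
    "llm.model",
    "edge.general_model",
    "edge.translate_model",
]


def lookup(config, dotted_key):
    """Walk a nested dict along a dotted key; return None if any part is missing."""
    node = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def missing_keys(config, required=REQUIRED_KEYS):
    """List the required keys that are absent or set to an empty string."""
    return [key for key in required if lookup(config, key) in (None, "")]


# Example values copied from the table above.
example_config = {
    "llm": {
        "base_url": "https://api.deepseek.com",
        "api_key": "sk-xxx",
        "model": "deepseek-v4-flash",
    },
    "edge": {
        "general_model": "Qwen3-8B-ov-npu",
        "translate_model": "HY-MT1.5-1.8B-int4-ov",
    },
}

print(missing_keys(example_config))
```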
### Phase 3: Cloud Setup (RAG + LLM)
Goal: Build the cloud-side RAG knowledge base and QA pipeline.
Architecture: See references/architecture/cloud-rag-pipeline.md
Components:
- Document Parser — PDF → text extraction
- Chunking Engine — Semantic splitting with overlap
- Index Builder — JSONL-based summary index
- Retrieval Pipeline — Two-stage: filter → compress → answer
- LLM Integration — Streaming API calls with retry
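To make the chunking and indexing components concrete, the sketch below splits text into overlapping windows and serializes the result as JSONL. It deliberately swaps in a fixed-size character window for the semantic splitting described above, and it stores raw chunk text rather than summaries. The field names `doc_id`, `chunk_id`, and `text` are hypothetical.

```python
import json


def chunk_text(text, size=200, overlap=50):
    """Split text into windows of `size` characters that share `overlap` characters."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def to_jsonl(doc_id, chunks):
    """Serialize chunks as JSONL: one JSON object per line."""
    lines = [
        json.dumps({"doc_id": doc_id, "chunk_id": i, "text": chunk}, ensure_ascii=False)
        for i, chunk in enumerate(chunks)
    ]
    return "\n".join(lines)


print(to_jsonl("paper-001", chunk_text("abcdefghij", size=4, overlap=2)))
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of some duplicated text in the index.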
Scaffold:
cp templates/fastapi-scaffold.py cloud/app.py
# Customize for your domain
### Phase 4: Edge Setup (OpenVINO Worker)
Goal: Build edge-side inference with OpenVINO acceleration.
Architecture: See references/architecture/edge-inference-pattern.md
Components:
- Model Loader — OpenVINO GenAI pipeline initialization
- Worker Process — Subprocess with stdin/stdout JSON protocol
- Task Router — Grammar check / Polish / Translate dispatch
- Stream Generator — Token-by-token streaming output
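The worker, router, and stream generator can be pictured with a line-oriented loop like the one below. This is only a sketch of a stdin/stdout JSON protocol under assumed message shapes: the fields `task`, `text`, and `type` are hypothetical, and the placeholder handler splits on whitespace where a real worker would stream tokens from the OpenVINO pipeline.

```python
import io
import json


def grammar_check(text):
    """Placeholder handler: a real worker would stream model tokens here."""
    yield from text.split()


# Task router: maps a task name to a token generator (names are assumptions).
HANDLERS = {"grammar-check": grammar_check}


def handle(request):
    """Turn one request into a stream of JSON-serializable events."""
    handler = HANDLERS.get(request.get("task"))
    if handler is None:
        yield {"type": "error", "message": f"unknown task: {request.get('task')}"}
        return
    for token in handler(request.get("text", "")):
        yield {"type": "token", "text": token}
    yield {"type": "done"}


def serve(stdin, stdout):
    """Read one JSON request per line; write one JSON event per line."""
    for line in stdin:
        if not line.strip():
            continue
        for event in handle(json.loads(line)):
            stdout.write(json.dumps(event) + "\n")
            stdout.flush()  # flush per event so the parent process sees tokens promptly


# Demo with in-memory streams instead of a real subprocess.
demo_out = io.StringIO()
serve(io.StringIO('{"task": "grammar-check", "text": "He go to school."}\n'), demo_out)
print(demo_out.getvalue(), end="")
```

In a real worker, `serve(sys.stdin, sys.stdout)` would be the entry point, with the parent process writing requests to the subprocess pipe and reading events back line by line.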
Scaffold:
cp templates/edge-worker-scaffold.py edge/worker.py
# Customize prompts and models
### Phase 5: Frontend Integration
Goal: Build unified UI with SSE streaming support.
Architecture: See references/architecture/sse-protocol.md
Components:
- Cloud Page — RAG query, two-stage status, answer panel
- Edge Page — Editor, task buttons, diff view, device stats
- SSE Client — EventSource parsing with stage tracking
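In the browser the SSE client would rely on `EventSource`, but the parsing it performs can be mirrored in Python, which is handy for testing the streaming endpoints from a script. The sketch below follows the basic server-sent events framing (`event:` and `data:` fields, events separated by a blank line, lines starting with `:` treated as comments). The event name `stage` and the payload shape are assumptions; the actual protocol is described in `references/architecture/sse-protocol.md`.

```python
def parse_sse(lines):
    """Yield (event, data) tuples from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; events without data are dropped.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is ignored
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)


# Hypothetical stream: a stage update followed by a two-line default message.
sample = [
    "event: stage\n",
    'data: {"stage": "filter"}\n',
    "\n",
    ": keep-alive\n",
    "data: Hello\n",
    "data: world\n",
    "\n",
]
for name, payload in parse_sse(sample):
    print(name, repr(payload))
```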
Scaffold:
cp templates/vue-page-template.vue src/pages/CloudPage.vue
# Customize for your UI
### Phase 6: Deployment Validation
Goal: Verify end-to-end functionality.
# Validate config
python scripts/validate-config.py
# Run health checks
curl http://localhost:8011/api/health
curl http://localhost:5000/api/health
# Test cloud RAG
curl -X POST http://localhost:8011/api/cloud/chat/stream \
-H "Content-Type: application/json" \
-d '{"question": "test query", "language": "zh"}'
# Test edge inference
curl -X POST http://localhost:8011/api/edge/grammar-check/stream \
-H "Content-Type: application/json" \
-d '{"textContent": "He go to school."}'
See templates/deployment-checklist.md for full validation steps.
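The health checks above can also be scripted. The following sketch uses only the standard library and assumes that a healthy service answers the `/api/health` endpoint with HTTP 200; the `opener` parameter is a hypothetical hook that lets the function be tested without a running service.

```python
import urllib.error
import urllib.request

# The two health endpoints used in the curl commands above.
HEALTH_URLS = [
    "http://localhost:8011/api/health",
    "http://localhost:5000/api/health",
]


def is_healthy(url, timeout=3.0, opener=urllib.request.urlopen):
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts, and HTTP error statuses all count as unhealthy.
        return False


def report(urls=HEALTH_URLS, opener=urllib.request.urlopen):
    """Map each URL to its health status."""
    return {url: is_healthy(url, opener=opener) for url in urls}
```

Calling `report()` with both services running should map each URL to `True`; a `False` entry points at the service to inspect first.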
## Quick Reference Card
┌─────────────────────────────────────────────────────────┐
│ CLOUD EDGE WRITING WORKFLOW │
├─────────────────────────────────────────────────────────┤
│ │
│ Phase 1: ENVIRONMENT │
│ ├─ python scripts/setup-wizard.py (Recommended) │
│ │ OR manually: │
│ ├─ python scripts/check-environment.py │
│ ├─ pip install -r requirements-*.txt │
│ ├─ cp .env.template .env → fill API keys │
│ └─ python scripts/download-models.py │
│ │
│ Phase 2: CONFIGURATION │
│ ├─ cp config-template.yaml config.yaml │
│ ├─ Set: LLM endpoint, model paths, hardware │
│ └─ python scripts/validate-config.py │
│ │
│ Phase 3: CLOUD (RAG) │
│ ├─ Document → Chunk → Index (JSONL) │
│ ├─ Two-stage retrieval: filter → compress → answer │
│ └─ FastAPI + SSE streaming │
│ │
│ Phase 4: EDGE (OpenVINO) │
│ ├─ Load models: Qwen3(NPU) + HY-MT(CPU) │
│ ├─ Worker subprocess: stdin/stdout JSON │
│ └─ Tasks: grammar, polish, translate │
│ │
│ Phase 5: FRONTEND │
│ ├─ Vue 3 + Tailwind + SSE client │
│ ├─ Cloud page: RAG query + stage status │
│ └─ Edge page: editor + diff + device stats │
│ │
│ Phase 6: VALIDATE │
│ ├─ Health checks on all services │
│ └─ End-to-end test: query → answer │
│ │
└─────────────────────────────────────────────────────────┘
---
## Reference Files
| Category | Location | Contents |
|----------|----------|----------|
| **Architecture** | `references/architecture/` | Cloud RAG, Edge inference, SSE, Worker patterns |
| **Layers** | `references/layers/` | Pattern, Config, Environment layer guides |
| **Decisions** | `references/decisions/` | Cloud-edge split decision tree |
| **Examples** | `examples/scenarios/` | Research writing setup, custom RAG domain |
| **Templates** | `templates/` | Config, code scaffolds, requirements, checklist |
| **Scripts** | `scripts/` | Setup wizard, environment check, model download, config validation |
---
## Cloud-Edge Split Summary
| Dimension | Cloud | Edge |
|-----------|-------|------|
| **Compute** | High (long context RAG) | Low (single paragraph) |
| **Privacy** | Low (public papers) | **High** (user writing stays local) |
| **Model** | Remote API (DeepSeek) | Local OpenVINO (Qwen3 + HY-MT) |
| **Hardware** | Cloud GPU | Intel NPU + CPU |
| **Tasks** | Paper retrieval, QA | Grammar, polish, translate |
| **Latency** | Network + long inference | Local low latency |
See `references/decisions/cloud-edge-split.md` for decision framework.
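The Tasks row of the table suggests a simple routing rule. The sketch below is a hypothetical illustration of that rule, not the decision framework itself; the task identifiers and the `route` function are assumptions.

```python
# Task sets derived from the "Tasks" row of the split summary table.
EDGE_TASKS = {"grammar", "polish", "translate"}
CLOUD_TASKS = {"paper_retrieval", "qa"}


def route(task):
    """Return "edge" or "cloud" for a task name, following the split summary."""
    if task in EDGE_TASKS:
        return "edge"  # user writing stays on the local device
    if task in CLOUD_TASKS:
        return "cloud"  # long-context RAG runs against the remote API
    raise ValueError(f"unknown task: {task}")


for name in ("polish", "qa"):
    print(name, "->", route(name))
```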
---
## Integration Notes
- Combine with `code-review-guardian` for code quality checks
- Use `safe-refactoring` when modifying existing implementations
- Use `test-driven-debugging` for troubleshooting
---
## Limitations
- Requires Intel hardware with NPU for optimal edge performance
- Edge models need manual download (several GB)
- LLM API keys must be provisioned separately
- MinerU API token needed for advanced PDF parsing
- Not suitable for real-time collaborative editing scenarios
---
## License
MIT License - Copyright (c) 2026 KIDZ
See [LICENSE](LICENSE) for full text.