Category: Development & Engineering
No API key required

cloud-edge-writing

Author: KKIDZZ (ModelScope)

# Cloud Edge Writing

Build cloud-edge collaborative intelligent scientific writing systems.

## When to Activate

| Trigger | Priority |
|---------|----------|
| Cloud-edge AI application | HIGH |
| OpenVINO local inference integration | HIGH |
| RAG knowledge base system | HIGH |
| SSE streaming interaction design | MEDIUM |
| Scientific writing assistant | MEDIUM |


## The Three-Layer Model

Layer 1: Patterns      → Fully abstractable (workflow, protocol, architecture)
Layer 2: Configuration → Parameterizable (LLM config, model paths, hardware)
Layer 3: Environment   → Prerequisites checklist (drivers, API keys, models)

See references/layers/ for detailed guidance on each layer.


## Core Workflow

### Phase 1: Environment Preparation (Layer 3)

Goal: Validate all prerequisites before coding.

Option A: Interactive Setup Wizard (Recommended)

# One-click setup with guided prompts
python scripts/setup-wizard.py

The wizard will:

  • Check Python version
  • Install dependencies (cloud/edge)
  • Create config files from templates
  • Guide API key configuration
  • Check model download status

Option B: Manual Setup

# 1. Check environment
python scripts/check-environment.py

# 2. Install dependencies
pip install -r templates/requirements-cloud.txt   # Cloud only
pip install -r templates/requirements-edge.txt    # Edge only

# 3. Configure environment variables
cp templates/.env.template .env
# Edit .env with your API keys

Checklist:

  • [ ] Python ≥ 3.10
  • [ ] OpenVINO runtime installed (for edge)
  • [ ] NPU driver ready (for edge)
  • [ ] LLM API key configured (DeepSeek/OpenAI/etc.)
  • [ ] Model files downloaded (for edge)
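Part of this checklist can be probed from Python. The sketch below is not the contents of `scripts/check-environment.py`; it is a minimal illustration that covers only the Python version, whether the `openvino` package is importable, and whether an API key is set. The variable name `LLM_API_KEY` is an assumption, so substitute whatever name your `.env` uses. The NPU driver and model files are not checked here.

```python
import importlib.util
import os
import sys


def check_environment(env=None, min_python=(3, 10)):
    """Return a dict mapping each probed checklist item to True/False."""
    env = os.environ if env is None else env
    return {
        # Tuple comparison: (3, 11, ...) >= (3, 10) is True.
        "python_version": sys.version_info >= min_python,
        # find_spec returns None when the package cannot be imported.
        "openvino_installed": importlib.util.find_spec("openvino") is not None,
        # LLM_API_KEY is a hypothetical name; use the one from your .env file.
        "llm_api_key": bool(env.get("LLM_API_KEY", "").strip()),
    }


for item, ok in check_environment().items():
    print(f"[{'x' if ok else ' '}] {item}")
```

The `env` parameter exists so the check can be exercised with a plain dict instead of the real process environment.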

Model Download:

# Check model status and get download links
python scripts/download-models.py

See references/layers/layer3-environment.md for full details.


### Phase 2: Configuration (Layer 2)

Goal: Generate parameterized config for your deployment.

# Generate config from template
cp templates/config-template.yaml config.yaml
# Edit config.yaml with your settings

Key Parameters:

| Parameter | Example | Description |
|-----------|---------|-------------|
| `llm.base_url` | `https://api.deepseek.com` | LLM API endpoint |
| `llm.api_key` | `sk-xxx` | API key |
| `llm.model` | `deepseek-v4-flash` | Model name |
| `edge.general_model` | `Qwen3-8B-ov-npu` | Edge general model |
| `edge.translate_model` | `HY-MT1.5-1.8B-int4-ov` | Edge translate model |

See references/layers/layer2-config.md for full config reference.
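A quick way to catch an incomplete `config.yaml` is to look up each dotted parameter in the loaded config. The sketch below assumes the dotted names map onto nested mappings (for example `llm.base_url` becomes `llm: {base_url: ...}`), which the parameter table suggests but does not confirm. It works on a plain dict, and the helper names are illustrative rather than taken from `scripts/validate-config.py`.

```python
REQUIRED_KEYS = [
    "llm.base_url",
    "llm.api_key",
    "llm.model",
    "edge.general_model",
    "edge.translate_model",
]


def get_dotted(config, dotted_key):
    """Walk a nested dict along 'a.b.c'; return None if any part is missing."""
    node = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def missing_keys(config):
    """List required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not get_dotted(config, key)]


example = {
    "llm": {
        "base_url": "https://api.deepseek.com",
        "api_key": "sk-xxx",
        "model": "deepseek-v4-flash",
    },
    "edge": {"general_model": "Qwen3-8B-ov-npu", "translate_model": ""},
}
print(missing_keys(example))  # ['edge.translate_model']
```

If PyYAML is installed, the dict could come from `yaml.safe_load` on the file contents.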


### Phase 3: Cloud Setup (RAG + LLM)

Goal: Build the cloud-side RAG knowledge base and QA pipeline.

Architecture: See references/architecture/cloud-rag-pipeline.md

Components:

  1. Document Parser — PDF → text extraction
  2. Chunking Engine — Semantic splitting with overlap
  3. Index Builder — JSONL-based summary index
  4. Retrieval Pipeline — Two-stage: filter → compress → answer
  5. LLM Integration — Streaming API calls with retry
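To make the chunking and indexing components concrete, here is a minimal sketch. It swaps in fixed-size character windows with overlap in place of semantic splitting, and it uses the first 80 characters of each chunk as a placeholder summary. The field names (`doc_id`, `chunk_id`, `summary`, `text`) are assumptions, not the skill's actual index schema.

```python
import io
import json


def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into character windows that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def write_index(chunks, doc_id, out):
    """Write one JSON object per line (JSONL) to a file-like object."""
    for i, chunk in enumerate(chunks):
        record = {
            "doc_id": doc_id,
            "chunk_id": i,
            "summary": chunk[:80],  # placeholder summary, not a real abstract
            "text": chunk,
        }
        out.write(json.dumps(record, ensure_ascii=False) + "\n")


buffer = io.StringIO()
write_index(chunk_text("abcdefghij", chunk_size=4, overlap=2), "doc-1", buffer)
print(buffer.getvalue())
```

Writing to a file-like object keeps the sketch usable with both an in-memory buffer and a file opened with `open(path, "w", encoding="utf-8")`.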

Scaffold:

mkdir -p cloud   # cp fails if the target directory does not exist
cp templates/fastapi-scaffold.py cloud/app.py
# Customize for your domain
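The retrieval flow can be expressed as a generator that reports each stage as a Server-Sent Events frame. This sketch reads "two-stage" as two retrieval stages (filter, then compress) followed by answer generation, which is an interpretation of the component list. The event names (`stage`, `token`, `done`), the keyword-overlap filter, and the echo-style answer are all placeholders for illustration.

```python
import json


def sse_event(event, data):
    """Format one SSE frame: an event line, a data line, then a blank line."""
    return f"event: {event}\ndata: {json.dumps(data, ensure_ascii=False)}\n\n"


def answer_stream(question, index, top_k=2):
    """Yield SSE frames for filter, compress, and answer steps."""
    # Stage 1 (filter): keep records whose summary shares a word with the question.
    words = set(question.lower().split())
    hits = [r for r in index if words & set(r["summary"].lower().split())][:top_k]
    yield sse_event("stage", {"name": "filter", "hits": len(hits)})

    # Stage 2 (compress): shrink each hit to a short snippet.
    context = [r["text"][:120] for r in hits]
    yield sse_event("stage", {"name": "compress", "snippets": len(context)})

    # Answer: a real system would stream LLM tokens; this stub echoes the context.
    for snippet in context:
        yield sse_event("token", {"text": snippet})
    yield sse_event("done", {})


index = [
    {"summary": "OpenVINO inference on NPU", "text": "OpenVINO runs models locally."},
    {"summary": "Cooking pasta", "text": "Boil water."},
]
for frame in answer_stream("How does OpenVINO work", index):
    print(frame, end="")
```

In a FastAPI app, a generator like this could be returned through `StreamingResponse(..., media_type="text/event-stream")`.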

### Phase 4: Edge Setup (OpenVINO Worker)

Goal: Build edge-side inference with OpenVINO acceleration.

Architecture: See references/architecture/edge-inference-pattern.md

Components:

  1. Model Loader — OpenVINO GenAI pipeline initialization
  2. Worker Process — Subprocess with stdin/stdout JSON protocol
  3. Task Router — Grammar check / Polish / Translate dispatch
  4. Stream Generator — Token-by-token streaming output
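The worker, router, and stream generator can be sketched as a loop over JSON lines. The message shape below (`task`, `text` in; `type`, `text` out), the task names, and the prompt prefixes are assumptions, and `fake_generate` stands in for a real OpenVINO GenAI pipeline so the example runs without any model files.

```python
import io
import json


def fake_generate(prompt):
    """Stand-in for a local model: yields the prompt back word by word."""
    for word in prompt.split():
        yield word + " "


# Hypothetical task names and prompt prefixes.
PROMPTS = {
    "grammar": "Correct the grammar: ",
    "polish": "Polish this text: ",
    "translate": "Translate this text: ",
}


def emit(stdout, event):
    """Write one JSON event per line and flush so the parent sees it promptly."""
    stdout.write(json.dumps(event, ensure_ascii=False) + "\n")
    stdout.flush()


def serve(stdin, stdout, generate=fake_generate):
    """Read one JSON request per line; stream token events, then a done event."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        try:
            request = json.loads(line)
            prompt = PROMPTS[request["task"]] + request["text"]
        except (json.JSONDecodeError, KeyError, TypeError) as exc:
            emit(stdout, {"type": "error", "message": repr(exc)})
            continue
        for token in generate(prompt):
            emit(stdout, {"type": "token", "text": token})
        emit(stdout, {"type": "done"})


demo_out = io.StringIO()
serve(io.StringIO('{"task": "grammar", "text": "He go to school."}\n'), demo_out)
print(demo_out.getvalue())
```

Run as a subprocess, the same loop would be started with `serve(sys.stdin, sys.stdout)`. Keeping every message on a single line is what lets the parent process read events with a plain `readline()`.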

Scaffold:

mkdir -p edge   # cp fails if the target directory does not exist
cp templates/edge-worker-scaffold.py edge/worker.py
# Customize prompts and models

### Phase 5: Frontend Integration

Goal: Build unified UI with SSE streaming support.

Architecture: See references/architecture/sse-protocol.md

Components:

  1. Cloud Page — RAG query, two-stage status, answer panel
  2. Edge Page — Editor, task buttons, diff view, device stats
  3. SSE Client — EventSource parsing with stage tracking
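The SSE client's job, parsing frames and tracking stages, is shown below in Python for consistency with the other examples. In the browser, `EventSource` handles the frame parsing itself. The event names (`stage`, `token`) and payload fields are assumptions about the protocol; `references/architecture/sse-protocol.md` is the authority on the real ones.

```python
import json


def parse_sse(lines):
    """Yield (event, data) pairs from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # a blank line ends the current frame
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):  # comment line, often used as keep-alive
            continue
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Strip a single leading space after the colon, if present.
            data.append(value[1:] if value.startswith(" ") else value)


def track(lines):
    """Collect the stage names seen and the concatenated answer text."""
    stages, answer = [], []
    for event, data in parse_sse(lines):
        payload = json.loads(data)
        if event == "stage":
            stages.append(payload["name"])
        elif event == "token":
            answer.append(payload["text"])
    return stages, "".join(answer)


stream = [
    "event: stage\n", 'data: {"name": "filter"}\n', "\n",
    ": ping\n",
    "event: token\n", 'data: {"text": "Hi"}\n', "\n",
    "event: token\n", 'data: {"text": "!"}\n', "\n",
]
print(track(stream))  # (['filter'], 'Hi!')
```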

Scaffold:

mkdir -p src/pages   # cp fails if the target directory does not exist
cp templates/vue-page-template.vue src/pages/CloudPage.vue
# Customize for your UI

### Phase 6: Deployment Validation

Goal: Verify end-to-end functionality.

# Validate config
python scripts/validate-config.py

# Run health checks
curl http://localhost:8011/api/health
curl http://localhost:5000/api/health

# Test cloud RAG
curl -X POST http://localhost:8011/api/cloud/chat/stream \
  -H "Content-Type: application/json" \
  -d '{"question": "test query", "language": "zh"}'

# Test edge inference
curl -X POST http://localhost:8011/api/edge/grammar-check/stream \
  -H "Content-Type: application/json" \
  -d '{"textContent": "He go to school."}'

See templates/deployment-checklist.md for full validation steps.
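The health checks above can also be scripted. The sketch below uses only the standard library and treats an HTTP 200 response as healthy, which is an assumption about what the `/api/health` endpoints return. The `fetch` parameter is there so the logic can be exercised without running services.

```python
import urllib.error
import urllib.request

HEALTH_URLS = [
    "http://localhost:8011/api/health",
    "http://localhost:5000/api/health",
]


def http_status(url, timeout=3.0):
    """Return the HTTP status code, or None if the service is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # connection refused, DNS failure, timeout, etc.


def check_services(urls=HEALTH_URLS, fetch=http_status):
    """Map each URL to True when it responds with HTTP 200."""
    return {url: fetch(url) == 200 for url in urls}
```

Calling `check_services()` with the services running should return `True` for both URLs; any `False` entry points at the service to inspect first.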


## Quick Reference Card

┌─────────────────────────────────────────────────────────┐
│           CLOUD EDGE WRITING WORKFLOW                   │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  Phase 1: ENVIRONMENT                                   │
│  ├─ python scripts/setup-wizard.py (Recommended)        │
│  │   OR manually:                                       │
│  ├─ python scripts/check-environment.py                 │
│  ├─ pip install -r requirements-*.txt                   │
│  ├─ cp .env.template .env → fill API keys               │
│  └─ python scripts/download-models.py                   │
│                                                         │
│  Phase 2: CONFIGURATION                                 │
│  ├─ cp config-template.yaml config.yaml                 │
│  ├─ Set: LLM endpoint, model paths, hardware            │
│  └─ python scripts/validate-config.py                   │
│                                                         │
│  Phase 3: CLOUD (RAG)                                   │
│  ├─ Document → Chunk → Index (JSONL)                    │
│  ├─ Two-stage retrieval: filter → compress → answer     │
│  └─ FastAPI + SSE streaming                             │
│                                                         │
│  Phase 4: EDGE (OpenVINO)                               │
│  ├─ Load models: Qwen3(NPU) + HY-MT(CPU)               │
│  ├─ Worker subprocess: stdin/stdout JSON                │
│  └─ Tasks: grammar, polish, translate                   │
│                                                         │
│  Phase 5: FRONTEND                                      │
│  ├─ Vue 3 + Tailwind + SSE client                       │
│  ├─ Cloud page: RAG query + stage status                │
│  └─ Edge page: editor + diff + device stats             │
│                                                         │
│  Phase 6: VALIDATE                                      │
│  ├─ Health checks on all services                       │
│  └─ End-to-end test: query → answer                     │
│                                                         │
└─────────────────────────────────────────────────────────┘



---

## Reference Files

| Category | Location | Contents |
|----------|----------|----------|
| **Architecture** | `references/architecture/` | Cloud RAG, Edge inference, SSE, Worker patterns |
| **Layers** | `references/layers/` | Pattern, Config, Environment layer guides |
| **Decisions** | `references/decisions/` | Cloud-edge split decision tree |
| **Examples** | `examples/scenarios/` | Research writing setup, custom RAG domain |
| **Templates** | `templates/` | Config, code scaffolds, requirements, checklist |
| **Scripts** | `scripts/` | Setup wizard, environment check, model download, config validation |

---

## Cloud-Edge Split Summary

| Dimension | Cloud | Edge |
|-----------|-------|------|
| **Compute** | High (long context RAG) | Low (single paragraph) |
| **Privacy** | Low (public papers) | **High** (user writing stays local) |
| **Model** | Remote API (DeepSeek) | Local OpenVINO (Qwen3 + HY-MT) |
| **Hardware** | Cloud GPU | Intel NPU + CPU |
| **Tasks** | Paper retrieval, QA | Grammar, polish, translate |
| **Latency** | Network + long inference | Local low latency |

See `references/decisions/cloud-edge-split.md` for decision framework.
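The Tasks row of the table amounts to a simple dispatch rule. The sketch below routes by task name only; the task identifiers are hypothetical, and the full decision framework in `references/decisions/cloud-edge-split.md` may weigh other dimensions such as privacy and compute.

```python
# Hypothetical task identifiers derived from the Tasks row of the table.
EDGE_TASKS = {"grammar", "polish", "translate"}
CLOUD_TASKS = {"paper_retrieval", "qa"}


def route(task):
    """Return 'edge' or 'cloud' for a known task; raise for anything else."""
    if task in EDGE_TASKS:
        return "edge"
    if task in CLOUD_TASKS:
        return "cloud"
    raise ValueError(f"unknown task: {task!r}")


print(route("polish"))  # edge
print(route("qa"))      # cloud
```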

---

## Integration Notes

- Combine with `code-review-guardian` for code quality checks
- Use `safe-refactoring` when modifying existing implementations
- Use `test-driven-debugging` for troubleshooting

---

## Limitations

- Requires Intel hardware with NPU for optimal edge performance
- Edge models need manual download (several GB)
- LLM API keys must be provisioned separately
- MinerU API token needed for advanced PDF parsing
- Not suitable for real-time collaborative editing scenarios

---

## License

MIT License - Copyright (c) 2026 KIDZ

See [LICENSE](LICENSE) for full text.