Back to skills
extension
Category: OtherNo API key required

Aegis Protocol

Self-healing stability monitor for AI agents - 5 core checks + 15 extended checks, auto-recovery, health scoring

personAuthor: ankechenlab-nodehubclawhub

Aegis Protocol 🛡️

Self-Healing Stability Monitor for AI Agents

Version: 0.12.6
Author: Dream
License: MIT


Features

  • 20-Dimension Monitoring: System, services, AI agent, security, maintenance
  • Auto Recovery: Terminate stuck sessions, restart services, compact context
  • Health Scoring: Quantified health score (0-100)
  • Healing Memory: Record and learn from recovery strategies
  • Result Caching: 5-minute TTL for reduced system calls

Commands

aegis-protocol init

Initialize configuration file

aegis-protocol status

Show system health summary

aegis-protocol check

Run full health check

aegis-protocol heal

Execute automatic recovery

aegis-protocol config

View current configuration


Usage Examples

# Initialize
python3 aegis-protocol.py init

# Check health
python3 aegis-protocol.py check

# Auto recover
python3 aegis-protocol.py heal

# View config
python3 aegis-protocol.py config

Configuration

File: ~/.openclaw/workspace/.watchdog-config.json

{
  "thresholds": {
    "sessionTimeoutMinutes": 60,
    "pm2RestartAlert": 50,
    "diskUsagePercent": 90,
    "memoryUsagePercent": 95,
    "contextUsagePercent": 80
  }
}

Monitoring Dimensions

| Category | Checks | |----------|--------| | System | CPU, Memory, Disk, Zombies, FD, Connections | | Services | PM2, Nginx, Docker, Cron | | AI Agent | Sessions, Context, Tasks, Loops | | Security | SSL, Updates, Git | | Maintenance | Backup, Cleanup, Network |


Health Score

| Score | Status | |-------|--------| | 90-100 | Excellent | | 70-89 | Good | | 50-69 | Warning | | 0-49 | Critical |


Testing

# Unit tests
python3 -m pytest tests/ -v

# Coverage
python3 -m pytest tests/ --cov=aegis_protocol -v

Coverage: 82%
Tests: 20+ passing


Quality Metrics

| Metric | Value | |--------|-------| | Type hints | >90% | | Test coverage | 82% | | Documentation | 100% | | Bugs | 0 |


Version History

v0.7.0 (2026-04-05)

  • Result caching with 5-minute TTL
  • Type hints >90% coverage
  • Exception classification (4 types)
  • 20-dimension monitoring
  • Health scoring system

Aegis Protocol - The Never-Sleeping Guardian 🌀