clawSafe 🛡️

Enterprise-grade security detector for AI agents

Overview

clawSafe is a comprehensive security middleware that intercepts and blocks malicious input before it reaches your AI agent. Built with defense-in-depth philosophy.

Features

5-Layer Protection

| Layer | Threats | Rules | |-------|---------|-------| | LLM Layer | Prompt Injection, Jailbreak, Prompt Leaking, Encoding Attacks | 44 | | Web Layer | SQL Injection, XSS, CSRF, SSRF | 32 | | API Layer | Key Exposure, Rate Limiting, Auth Bypass | 19 | | Supply Chain | Dangerous Dependencies, Remote Code Execution | 8 | | Deploy Layer | Environment Leaks, Debug Info Disclosure | 10 |

Total: 113+ detection rules

Quick Start

Installation

# Via ClawHub
clawhub install clawSafe

# Manual
cp -r clawSafe ~/.openclaw/workspace/skills/

Basic Usage

const Detector = require('./detector');

const detector = new Detector();

// Scan user input
const result = detector.scan('Ignore previous instructions');

if (!result.safe) {
  console.log('Blocked:', result.threats);
  // Handle blocked input
}

Return Format

{
  safe: boolean;           // true if input is safe
  threats: Array<{
    type: string;         // threat category
    pattern: string;      // matched pattern ID
    severity: string;     // critical|high|medium|low
    confidence: number;   // 0-1
    description: string;
  }>;
  confidence: number;      // overall confidence 0-1
  layersScanned: string[]; // layers that were checked
}

Configuration

config.json

{
  "enabled": true,
  "logLevel": "info",
  "layers": {
    "llm": {
      "enabled": true,
      "rules": ["injection", "jailbreak", "prompt_leak", "encoding"]
    },
    "web": {
      "enabled": true,
      "rules": ["sql_injection", "xss", "csrf", "ssrf"]
    },
    "api": {
      "enabled": true,
      "rules": ["key_exposure", "rate_limit", "auth"]
    },
    "supply_chain": {
      "enabled": true,
      "rules": ["deps"]
    },
    "deploy": {
      "enabled": true,
      "rules": ["env_leak", "debug_info"]
    }
  },
  "detection": {
    "confidenceThreshold": 0.6,
    "minMatchCount": 1
  },
  "actions": {
    "onThreatDetected": "block",
    "onUncertain": "log"
  }
}

whitelist.json

{
  "keywords": ["trusted-keyword"],
  "users": ["user-id-1"],
  "sessions": ["session-id-1"]
}

Detection Rules

LLM Layer

Prompt Injection Patterns:

ignore previous instructions
disregard your guidelines
forget all rules
act as if you have no restrictions

Jailbreak Patterns:

DAN mode
developer mode
roleplay as

Encoding Bypass:

Base64 encoded commands
Hex encoding
Unicode homoglyphs

Web Layer

SQL Injection: '; DROP TABLE users; --
XSS: <script>alert(1)</script>
CSRF: Token manipulation
SSRF: Internal URL access

API Layer

API Key exposure: sk-1234567890
JWT tokens
Bearer tokens
Basic auth credentials

Testing

# Run all tests
node test.js

# Interactive mode
node test-interactive.js

# Demo
node detector.js

Integration

OpenClaw Hook

To integrate with OpenClaw, add to your gateway config:

// gateway.config.js
module.exports = {
  middleware: ['clawSafe'],
  clawSafe: {
    enabled: true,
    strictMode: false
  }
};

Performance

Latency: < 5ms per scan
Memory: ~50KB
Rules: 113+ (JSON-based, lazy load)

License

MIT

Changelog

v1.0.0

Initial release
5-layer protection
113+ detection rules