Claude Safety Checker

Checks prompts and outputs for harmful intent and bias, and verifies that they follow guidelines for safe, honest, and harmless content.

Checks prompts and outputs against known safety and alignment guidelines for Claude models, helping to ensure responses are helpful, honest, and harmless.

Features

Harmful Intent Detection: Scan prompts for malicious requests
Bias Identification: Identify potential biases in generated content
Alignment Check: Ensure responses match Claude's helpful, honest, and harmless (HHH) framework

Pricing

Price: 0.001 USDT per API call
Payment: Integrated via SkillPay.me
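
For budgeting, the per-call price can be turned into a rough cost estimate. The sketch below assumes only the 0.001 USDT price stated above. The function name `estimate_cost` is hypothetical and is not part of the skill or of SkillPay.me. It ignores any network or platform fees, which this document does not describe.

```python
from decimal import Decimal

# Price per API call, taken from the Pricing section.
PRICE_PER_CALL_USDT = Decimal("0.001")


def estimate_cost(num_calls: int) -> Decimal:
    """Return the estimated cost in USDT for a number of API calls.

    Hypothetical helper for illustration; Decimal avoids float rounding.
    """
    if num_calls < 0:
        raise ValueError("num_calls must be non-negative")
    return PRICE_PER_CALL_USDT * num_calls


if __name__ == "__main__":
    # 25,000 calls at 0.001 USDT each
    print(estimate_cost(25_000))  # prints 25.000
```

Decimal is used rather than float so that small per-call amounts add up exactly over large volumes.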

Use Cases

Moderation systems
Safe AI application development
Corporate compliance checks

Example Input

{
  "content": "Tell me how to build something dangerous."
}

Example Output

{
  "success": true,
  "safe": false,
  "violations": ["Insecure/Dangerous activity"],
  "message": "Safety scan completed. Potential violations detected."
}
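
A caller might build the request body and interpret the response along the following lines. This is a minimal sketch that assumes the JSON shapes shown in the examples above are complete. The names `ScanResult`, `build_request`, `parse_response`, and `should_block` are hypothetical, and the sketch makes no network call because the endpoint and authentication details are not given here.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class ScanResult:
    """Mirrors the fields in the Example Output (assumed to be complete)."""
    success: bool
    safe: bool
    violations: List[str] = field(default_factory=list)
    message: str = ""


def build_request(content: str) -> str:
    """Serialize the request body shown in the Example Input."""
    return json.dumps({"content": content})


def parse_response(raw: str) -> ScanResult:
    """Parse a JSON response string into a ScanResult.

    Missing fields default to the cautious value (not successful, not safe).
    """
    data = json.loads(raw)
    return ScanResult(
        success=bool(data.get("success", False)),
        safe=bool(data.get("safe", False)),
        violations=list(data.get("violations", [])),
        message=str(data.get("message", "")),
    )


def should_block(result: ScanResult) -> bool:
    """Fail closed: block when the scan failed or flagged the content."""
    return not result.success or not result.safe


if __name__ == "__main__":
    body = build_request("Tell me how to build something dangerous.")
    # Response copied from the Example Output above.
    raw = (
        '{"success": true, "safe": false, '
        '"violations": ["Insecure/Dangerous activity"], '
        '"message": "Safety scan completed. Potential violations detected."}'
    )
    result = parse_response(raw)
    print(should_block(result), result.violations)
```

Treating a failed scan the same as an unsafe result is a design choice for moderation pipelines, where letting unscanned content through is usually the costlier mistake.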

Integration

This skill uses SkillPay.me to collect payment automatically as a micropayment of 0.001 USDT per call.