返回 Skill 列表
extension
分类: 内容与媒体无需 API Key

ai-ethics

负责任的人工智能开发和伦理考量。在评估时使用

person作者: jakexiaohubgithub

AI Ethics

Comprehensive AI ethics skill covering bias detection, fairness assessment, responsible AI development, and regulatory compliance.

When to Use This Skill

  • Evaluating AI models for bias
  • Implementing fairness measures
  • Conducting ethical impact assessments
  • Ensuring regulatory compliance (EU AI Act, etc.)
  • Designing human-in-the-loop systems
  • Creating AI transparency documentation
  • Developing AI governance frameworks

Ethical Principles

Core AI Ethics Principles

| Principle | Description | |-----------|-------------| | Fairness | AI should not discriminate against individuals or groups | | Transparency | AI decisions should be explainable | | Privacy | Personal data must be protected | | Accountability | Clear responsibility for AI outcomes | | Safety | AI should not cause harm | | Human Agency | Humans should maintain control |

Stakeholder Considerations

  • Users: How does this affect people using the system?
  • Subjects: How does this affect people the AI makes decisions about?
  • Society: What are broader societal implications?
  • Environment: What is the environmental impact?

Bias Detection & Mitigation

Types of AI Bias

| Bias Type | Source | Example | |-----------|--------|---------| | Historical | Training data reflects past discrimination | Hiring models favoring male candidates | | Representation | Underrepresented groups in training data | Face recognition failing on darker skin | | Measurement | Proxy variables for protected attributes | ZIP code correlating with race | | Aggregation | One model for diverse populations | Medical model trained only on one ethnicity | | Evaluation | Biased evaluation metrics | Accuracy hiding disparate impact |

Fairness Metrics

Group Fairness:

  • Demographic Parity: Equal positive rates across groups
  • Equalized Odds: Equal TPR and FPR across groups
  • Predictive Parity: Equal precision across groups

Individual Fairness:

  • Similar individuals should receive similar predictions
  • Counterfactual fairness: Would outcome change if protected attribute differed?

Bias Mitigation Strategies

Pre-processing:

  • Resampling/reweighting training data
  • Removing biased features
  • Data augmentation for underrepresented groups

In-processing:

  • Fairness constraints in loss function
  • Adversarial debiasing
  • Fair representation learning

Post-processing:

  • Threshold adjustment per group
  • Calibration
  • Reject option classification

Explainability & Transparency

Explanation Types

| Type | Audience | Purpose | |------|----------|---------| | Global | Developers | Understand overall model behavior | | Local | End users | Explain specific decisions | | Counterfactual | Affected parties | What would need to change for different outcome |

Explainability Techniques

  • SHAP: Feature importance values
  • LIME: Local interpretable explanations
  • Attention maps: For neural networks
  • Decision trees: Inherently interpretable
  • Feature importance: Global model understanding

Model Cards

Document for each model:

  • Model purpose and intended use
  • Training data description
  • Performance metrics by subgroup
  • Limitations and ethical considerations
  • Version and update history

AI Governance

AI Risk Assessment

Risk Categories (EU AI Act):

| Risk Level | Examples | Requirements | |------------|----------|--------------| | Unacceptable | Social scoring, manipulation | Prohibited | | High | Healthcare, employment, credit | Strict requirements | | Limited | Chatbots | Transparency obligations | | Minimal | Spam filters | No requirements |

Governance Framework

  1. Policy: Define ethical principles and boundaries
  2. Process: Review and approval workflows
  3. People: Roles and responsibilities (ethics board)
  4. Technology: Tools for monitoring and enforcement

Documentation Requirements

  • Data provenance and lineage
  • Model training documentation
  • Testing and validation results
  • Deployment and monitoring plans
  • Incident response procedures

Human Oversight

Human-in-the-Loop Patterns

| Pattern | Use Case | Example | |---------|----------|---------| | Human-in-the-Loop | High-stakes decisions | Medical diagnosis confirmation | | Human-on-the-Loop | Monitoring with intervention | Content moderation escalation | | Human-out-of-Loop | Low-risk, high-volume | Spam filtering |

Designing for Human Control

  • Clear escalation paths
  • Override capabilities
  • Confidence thresholds for automation
  • Audit trails
  • Feedback mechanisms

Privacy Considerations

Data Minimization

  • Collect only necessary data
  • Anonymize when possible
  • Aggregate rather than individual data
  • Delete data when no longer needed

Privacy-Preserving Techniques

  • Differential privacy
  • Federated learning
  • Secure multi-party computation
  • Homomorphic encryption

Environmental Impact

Considerations

  • Training compute requirements
  • Inference energy consumption
  • Hardware lifecycle
  • Data center energy sources

Mitigation

  • Efficient architectures
  • Model distillation
  • Transfer learning
  • Green hosting providers

Reference Files

  • references/bias_assessment.md - Detailed bias evaluation methodology
  • references/regulatory_compliance.md - AI regulation requirements

Integration with Other Skills

  • machine-learning - For model development
  • testing - For bias testing
  • documentation - For model cards