返回 Skill 列表
extension
分类: 数据与分析无需 API Key

Goodhart's Law

Activate when: our KPI is going up but the real outcome isn't improving; people seem to be gaming the metric; we're about to tie bonuses or promotions to a n...

person作者: deciqaihubclawhub

Goodhart's Law

Overview

Goodhart's Law: when a metric controls behavior, people optimize the metric rather than the underlying goal. Formulated by economist Charles Goodhart (1975) on UK monetary policy; sharpened by Marilyn Strathern (1997): "When a measure becomes a target, it ceases to be a good measure." Four failure mechanisms (Manheim & Garrabrant 2018): Regressional, Extremal, Causal, Adversarial. Countermeasure is always multi-metric + audit + rotation.

Composes with feedback-loops, principal-agent, okr-goal-setting, survivorship-bias.

When to Use

  • A KPI is being introduced or its weight is increasing in performance evaluation
  • A metric is "improving" without corresponding improvement in the underlying goal
  • People are visibly optimizing for a number rather than the work it was meant to track
  • Algorithmic optimization is producing outcomes the designers didn't intend
  • Resource allocation is driven by a single composite score or ranking

Not when: metric and goal are identical; stakes too low for gaming; metric is purely descriptive with no reward/punishment; question is which metric to use, not whether the measurement-reward system is sound.

Coaching Novices (Adaptive Front Door)

  • Engine mode: user has a concrete metric or system → run The Process directly.
  • Coach mode: user is unfamiliar or has no concrete case → guide step by step.

In Coach mode, respond one step at a time. Each [WAIT] is a hard stop — output only that step's question, then stop.

  1. One-line: before relying on a metric to control behavior, predict how people will game it — choose the system that survives that prediction.
  2. Check fit: if the metric is purely descriptive (no reward attached), Goodhart's law doesn't apply yet.
  3. Elicit the specific metric and the underlying goal: what's being measured? What's the actual outcome you care about?

[WAIT — do not advance until user responds]

  1. One question at a time: proxy gap? How would a clever agent game this? Which Goodhart category? What countermeasure fits?

[WAIT — do not advance until user responds]

  1. Close: name the gaming-resistant design (multi-metric, audit, rotation, paired-constraint) + monitoring schedule.

[WAIT — do not advance until user responds]

The Process

Step 1 — State metric and goal: metric being targeted / underlying goal / current proxy-goal correlation / who is measured / stakes.

Step 2 — Predict the gaming: list ≥3 ways to game the metric with minimum effort on the goal. If you can't list 3, you haven't thought hard enough.

Step 3 — Categorize mechanism:

| Mechanism | Test | |---|---| | Regressional | Is there noise that optimization will push into? | | Extremal | Does metric-goal correlation break at extremes? | | Causal | Is the metric a symptom, not a cause? | | Adversarial | Will agents actively game with intelligence? |

Step 4 — Choose countermeasure: Regressional → constrain range. Extremal → paired constraint metrics. Causal → closer-to-causation metric + direct audit. Adversarial → multi-metric + randomized audits + rotation. Step 5 — Design the system: primary metric / constraint metric(s) / audit mechanism (sampled direct goal observation) / rotation schedule / separation of measure-for-control from measure-for-diagnosis / gaming-detection threshold. Step 6 — Schedule re-evaluation: independent goal measurement (how/when/who) / drift threshold / retirement criteria / owner.

Output template

# Goodhart-Robust Design: <metric>
Metric: | Underlying goal: | Correlation: | Who measured: | Stakes:
Gaming vectors (≥3):
Mechanism: Regressional / Extremal / Causal / Adversarial
Primary metric: | Constraint metric(s): | Audit: | Rotation: | Separation: | Gaming threshold:
Goal measurement (independent): | Drift threshold: | Retirement criteria: | Owner:

→ Method in Action: Goodhart 1975 (M3) and Strathern 1997 (RAE)

Pack: Goodhart's Law Patterns

| Domain | Common gaming | Defense | |---|---|---| | Sales quotas | Sandbagging, channel stuffing, end-of-quarter discounts | Multi-period averaging; quality metrics; clawback | | Hospital wait targets | Ambulance parking, patient reclassification | Outcome audits; paired metrics; randomized inspection | | Standardized testing | Teaching to test, curriculum narrowing | Sample-based assessment; multi-measure; reduce single-test stakes | | Algorithmic engagement | Clickbait, outrage, misinformation | Multi-objective optimization; quality + harm constraints |

Applying It Well

  • All metrics are proxies — narrower than the goal. Pre-commit to gap analysis before deployment.
  • Gaming is rational under measurement pressure. Fix the system, not the people.
  • Rotation and audit are the only durable defenses. Plan metric retirement at design time.

→ Primary sources: references/sources.md

Common Rationalizations

| Fake move | Reality | |---|---| | [D] "If you can't measure it, you can't manage it" | Often false. Judgment, trust, and direct observation are also valid management tools. | | [D] "Our metric is well-defined; it won't be gamed" | Precision invites precise gaming. Basel II capital ratios were well-defined — extensively gamed. | | [D] "Our people wouldn't game the metric" | Goodhart's law is structural; individual virtue is insufficient in aggregate. | | [D] "We just need a better metric" | Often the issue is any single metric under pressure; fix is multi-metric + audit. | | [D] "We've used this metric for years" | Long use = more time for gaming to mature. Tenure is a warning, not an endorsement. | | → Add [O] entries here after each real use — paste the actual failure pattern | What went wrong and why |

[D] = designed upfront | [O] = observed in real use. [O] entries are more valuable.

Red Flags

  • Metric tied to high-stakes rewards or punishments
  • Metric "improves" without obvious improvement in the underlying goal
  • People being measured can already articulate ways to game it
  • Single metric is the primary evaluation tool, no audit or paired-constraint

Verification

  • [ ] Goal underlying the metric specifically named
  • [ ] Proxy gap explicitly described; ≥3 gaming vectors listed
  • [ ] Goodhart mechanism category identified
  • [ ] Countermeasure design (multi-metric, audit, rotation) in place
  • [ ] Independent goal measurement scheduled with owner and retirement date

Part of deciqAI Knowledge Skills — open-source thinking skills that make rigor executable for AI agents. Built by deciqAI · https://deciqai.com · Contributions welcome — see the template at the repo root.