Foundations of Goodhart's Law

The sentence haunting every boardroom

"When a measure becomes a target, it ceases to be a good measure."

Marilyn Strathern, 1997 — restatement of the principle articulated by Charles Goodhart in 1975.

You may have seen this sentence on an open-plan office wall, quoted by a VP of Engineering, or tweeted by an OpenAI researcher. But behind its almost Taoist phrasing lies one of the most widely violated principles in modern management, and one of the most dangerous blind spots of the AI era.

Because today, you no longer optimise your metrics alone. Your sales reps optimise them. Your algorithms optimise them. Your LLMs optimise them. All of them, silently, in parallel, without coordination. And all of them, unknowingly, turn your measure into a target — and therefore into a lie.

This module teaches you to see that drift before it consumes your business.

Goodhart's original formulation (1975)

The economist Charles Goodhart, then an advisor at the Bank of England, observed during the 1970s that every time the UK central bank adopted a monetary aggregate (M0, M1, M3…) as a policy target, that aggregate started behaving strangely — drifting from the economic quantity it was supposed to represent.

His original formulation, in a 1975 paper ("Problems of Monetary Management: The U.K. Experience"), is more technical:

"Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes."

This is the econometric version of the law. Marilyn Strathern, an anthropologist at Cambridge, restated it in 1997 in its aphoristic form — the one that went viral.

Why the measure degrades: 4 mechanisms

graph TD
    A[Measure becomes target] --> B[Pressure on the actor]
    B --> C[Mechanism 1<br/>Incomplete specification]
    B --> D[Mechanism 2<br/>Adverse selection]
    B --> E[Mechanism 3<br/>Local optimisation]
    B --> F[Mechanism 4<br/>Explicit gaming]
    C --> G[Proxy drift]
    D --> G
    E --> G
    F --> G
    G --> H[The measure no longer reflects reality]

1. Incomplete specification ("proxy gap")

No metric ever fully captures the value it claims to measure. Revenue does not measure profitability; NPS does not measure real loyalty; number of meetings does not measure productivity.

As long as the metric is passively observed, the gap between the proxy and the real value stays stable. As soon as the metric becomes a target, actors optimise the proxy, not the value, and the gap widens sharply.

2. Adverse selection

A target attracts actors who know how to optimise it — not those who create value. Reward sales reps only on the number of meetings booked? You will attract and retain the best meeting-bookers, not the best closers.

3. Local optimisation (Campbell's law)

Donald T. Campbell, an American social psychologist, stated a sister law in 1976:

"The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor."

Actors reallocate their energy toward the proxy at the expense of everything else. The score goes up while the value collapses.
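The reallocation described above can be made concrete with a toy model. The sketch below is an illustration, not an empirical claim: the functions `proxy` and `true_value`, and their shapes (linear for the proxy, a geometric mean for the value), are assumptions chosen only to show a score rising while the value it stands for falls.

```python
def proxy(effort_on_metric: float) -> float:
    # Assumed: the measured count (e.g. meetings booked) grows linearly
    # with the share of effort spent on the measured activity.
    return 10 * effort_on_metric


def true_value(effort_on_metric: float) -> float:
    # Assumed: real value needs both the measured activity and everything
    # else (preparation, follow-up), modelled here as a geometric mean.
    effort_on_rest = 1.0 - effort_on_metric
    return 10 * (effort_on_metric * effort_on_rest) ** 0.5


# Candidate splits of one unit of effort: 0%, 10%, ..., 100% on the metric.
allocations = [i / 10 for i in range(11)]

best_for_value = max(allocations, key=true_value)  # what the business wants
best_for_proxy = max(allocations, key=proxy)       # what the target rewards

for label, share in (("value-optimal", best_for_value),
                     ("proxy-optimal", best_for_proxy)):
    print(f"{label}: effort on metric={share:.1f}, "
          f"proxy={proxy(share):.1f}, value={true_value(share):.1f}")
```

Under these assumptions the actor who maximises the proxy puts all effort into the measured activity: the proxy doubles compared with the balanced split, and the modelled value drops to zero.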

4. Explicit gaming

The extreme case: cheating. CRM massaging, lead inflation, fake signatures, fake clicks, mislabelled AI content. Gaming means that the target has become the main objective and the underlying value has become secondary.

The Manheim & Garrabrant taxonomy (2018)

In a foundational paper, "Categorizing Variants of Goodhart's Law", David Manheim and Scott Garrabrant (the latter of the Machine Intelligence Research Institute) distinguish four variants of the law. This taxonomy is now standard in the AI alignment community.

| Variant | Mechanism | Sales/business example |
| --- | --- | --- |
| Regressional Goodhart | The proxy is noisy; selecting on the proxy partly selects on noise, not signal | Hiring the sales reps with the highest test score, when a top score is often partly noise |
| Causal Goodhart | Correlation is confused with causation; you optimise a variable that does not actually cause the outcome | Sending more emails because email volume correlates with revenue |
| Extremal Goodhart | The proxy/value relationship holds on the observed range, not at the extremes | Pushing NPS to 90 when customer behaviour beyond 70 has never been observed |
| Adversarial Goodhart | A smart agent (human or AI) actively optimises the proxy | A rep pushing through €1 deals to inflate the deal count |

This taxonomy matters: each variant requires a different defence. Treating Causal Goodhart as Adversarial Goodhart means applying the wrong remedy.
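The regressional variant is the easiest one to reproduce numerically. The following sketch assumes a deliberately simple model that is not in the paper's text: true skill is standard normal, and the test score is skill plus independent normal noise. The function name `simulate_selection` and all parameter values are hypothetical.

```python
import random
import statistics


def simulate_selection(n_candidates=10_000, top_k=100, noise_sd=1.0, seed=0):
    """Pick the top_k candidates by noisy score; return (mean score, mean skill)."""
    rng = random.Random(seed)
    candidates = []
    for _ in range(n_candidates):
        skill = rng.gauss(0.0, 1.0)               # the real value (assumed)
        score = skill + rng.gauss(0.0, noise_sd)  # the proxy: skill plus noise
        candidates.append((score, skill))
    # Selecting on the proxy: keep the highest test scores.
    selected = sorted(candidates, reverse=True)[:top_k]
    mean_score = statistics.mean(score for score, _ in selected)
    mean_skill = statistics.mean(skill for _, skill in selected)
    return mean_score, mean_skill


mean_score, mean_skill = simulate_selection()
print(f"mean test score of hires: {mean_score:.2f}")
print(f"mean true skill of hires: {mean_skill:.2f}")
```

The selected group is genuinely better than average, but its true skill is systematically lower than its scores suggest, because selecting the highest scores also selects the luckiest noise. The defence here is statistical (better or repeated measurement), which is different from what an adversarial case would call for.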

The Lucas parallel (econometrics)

At almost the same time as Goodhart, economist Robert Lucas (Nobel 1995) articulated the Lucas critique (1976): econometric parameters estimated on past data change when economic policy changes. In other words, you cannot steer an economy on the assumption that agents will keep behaving as before, because they adapt to the policy.

Goodhart, Lucas, Campbell: three convergent statements of one deep principle — human systems (and now AI systems) are reflexive. They internalise the measure into their behaviour.

The cobra effect: the legendary metaphor

As the story goes, during British rule in India the governor of Delhi wanted to reduce the cobra population, so he introduced a bounty for each dead cobra brought to the administration.

At first it worked: locals killed wild cobras. Then it derailed: they started breeding cobras at home in order to kill them and collect the bounty. The government scrapped the bounty, and the locals, with no further use for their farmed cobras, released them.

Final cobra population in Delhi: higher than before the bounty.

This is what economist Horst Siebert later called the cobra effect (Der Kobra-Effekt, 2001): the counterproductive outcome of a poorly designed incentive. It is an instance of Goodhart at the scale of public policy.

The 5 trigger conditions

Goodhart's law does not fire automatically. It needs five combined ingredients:

  1. A quantified metric (mathematically optimisable).
  2. A smart actor (human or AI) with optimisation capacity.
  3. Stakes for that actor (pay, status, survival, objective function…).
  4. A gap between proxy and real value (every metric has one).
  5. A long-enough time window for the actor to learn the proxy.

Remove any one of these five ingredients and the Goodhart drift does not fire, or fires only weakly.

This grid is worth three management PhDs: before laying down a KPI, ask which of the 5 ingredients you could remove to neutralise Goodhart.

Three operational corollaries

Corollary 1 — There is no perfectly aligned metric

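Every metric is a proxy. Several independent proxies are better than one supposedly "perfect" proxy, because their errors partly cancel out and because gaming one of them tends to show up as a divergence from the others. This is the principle of triangulation.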
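The noise-cancelling half of this corollary can be checked with a small simulation. The sketch below assumes each proxy equals the true value plus its own independent normal noise; that independence is the key assumption, and the gaming-detection benefit is not modelled here. The function name `mean_value_of_selected` and the parameters are hypothetical.

```python
import random


def mean_value_of_selected(n_proxies, n_candidates=20_000, top_k=200, seed=0):
    """Rank candidates by the average of n_proxies noisy proxies,
    then return the mean true value of the top_k."""
    rng = random.Random(seed)
    ranked = []
    for _ in range(n_candidates):
        value = rng.gauss(0.0, 1.0)
        # Each proxy observes the true value plus independent noise (assumed).
        proxies = [value + rng.gauss(0.0, 1.0) for _ in range(n_proxies)]
        composite = sum(proxies) / n_proxies
        ranked.append((composite, value))
    ranked.sort(reverse=True)
    top = ranked[:top_k]
    return sum(value for _, value in top) / len(top)


single = mean_value_of_selected(n_proxies=1)
triple = mean_value_of_selected(n_proxies=3)
print(f"selected on 1 proxy:   mean true value = {single:.2f}")
print(f"selected on 3 proxies: mean true value = {triple:.2f}")
```

Selecting on the average of three independent proxies picks candidates with a higher true value than selecting on any single one. If the proxies shared the same source of error, the benefit would shrink accordingly.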

Corollary 2 — Secrecy delays Goodhart

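A metric not communicated to actors stays a good proxy for longer, because actors cannot deliberately optimise what they cannot see. That is why some bonus designers avoid revealing the full formula. The effect is a delay, not immunity: actors eventually infer the metric from the rewards it produces.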

Corollary 3 — The stronger the optimiser, the faster Goodhart strikes

This is the AI version of the law. A human might take six months to learn to game a KPI; an RL agent might take six hours; an LLM can surface an exploitable loophole in seconds. The orders of magnitude are illustrative, but the direction is not: the speed of Goodhart drift scales with the optimisation power of the agent. The AI alignment community calls this reward hacking.
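A rough way to see why optimisation power matters is to treat it as a search budget: the number of candidate actions an agent can try before keeping the one with the best proxy score. The sketch below is a toy model under that assumption, not a model of RL agents or LLMs; `proxy_value_gap`, the noise model, and the budgets are all hypothetical.

```python
import random


def proxy_value_gap(search_budget, trials=2_000, seed=0):
    """Average (proxy - true value) of the action an optimiser keeps after
    trying search_budget candidates and picking the best proxy score."""
    rng = random.Random(seed)
    total_gap = 0.0
    for _ in range(trials):
        best_proxy, best_value = float("-inf"), 0.0
        for _ in range(search_budget):
            value = rng.gauss(0.0, 1.0)          # real value of the action
            proxy = value + rng.gauss(0.0, 1.0)  # what the metric reports
            if proxy > best_proxy:
                best_proxy, best_value = proxy, value
        total_gap += best_proxy - best_value
    return total_gap / trials


gaps = {budget: proxy_value_gap(budget) for budget in (1, 10, 100)}
for budget, gap in gaps.items():
    print(f"search budget {budget:>3}: average proxy-value gap = {gap:.2f}")
```

With a budget of one there is no optimisation and the average gap is close to zero. As the budget grows, the kept action looks better and better on the proxy, and a growing share of that apparent improvement is not real value.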

Conclusion: a principle to internalise

Goodhart is not an occasional side effect. It is a fundamental law of adaptive systems under pressure. Like natural selection, it is always running, wherever optimisation exists.

In the era where you orchestrate humans, no-code automations and LLMs on the same objectives, Goodhart is your #1 risk. The next module dives into the precise psychological mechanisms that trigger it in humans.