
‘Noise: A Flaw in Human Judgment’ can help us all

Not just for the judges in billowy silk robes.

November 16, 2021

Noise: A Flaw in Human Judgment by Daniel Kahneman, Olivier Sibony, and Cass R. Sunstein has sounded an alarm. These experienced authors have identified a significant, widespread problem, unwanted variability in judgments that should be identical, and offer some pragmatic solutions for reducing it.

Noisy errors can do a lot of damage. Business meetings routinely waste time and arrive at misguided decisions. Job interviews don’t predict future performance accurately enough. Numerous medical diagnostics for female patients are error-prone. Criminal sentences for similar offenses vary by 25%, and juries assess damages with wild inaccuracy. Under the best conditions, 1 in 600 fingerprint identifications is a false positive!

In Noise: A Flaw in Human Judgment, a judgment or decision is a rating or ranking assigned to a particular case. It doesn’t have to be a number, and it may integrate various data into an overall call. So much depends on which “judge” is assigned to the case that the outcome becomes a kind of lottery. Domains we’d think are formulaic and disciplined end up showing wide variation in judgments of what should be identical cases.

What’s wrong with this picture?

“Anywhere there is judgment, there is noise, and probably more than you’d expect,” the authors state. They separate bias from the overall error of a system, and what remains is called “system noise,” which is distinct from something like systemic bias. And that’s their point: system noise is an independent and additive source of error. That means repeated errors don’t cancel out. Missing the mark in one direction is not fixed by missing again in the other direction. Instead, you have just made two bad decisions.
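To make “independent and additive” concrete, here is a minimal sketch of the standard mean-squared-error decomposition, in which overall error equals bias squared plus noise squared. The function name and the sentencing figures are hypothetical illustrations, not data from the book, and the sketch assumes a known “true” value for the case, which real judgments often lack.

```python
from statistics import mean, pstdev

def decompose_error(judgments, true_value):
    """Split the error of a set of judgments into bias and noise.

    bias  = average error (how far off the judges are collectively)
    noise = standard deviation of the errors (how much judges disagree)
    mse   = mean squared error, which equals bias**2 + noise**2
    """
    errors = [j - true_value for j in judgments]
    bias = mean(errors)
    noise = pstdev(errors)  # population standard deviation
    mse = mean([e ** 2 for e in errors])
    return bias, noise, mse

# Hypothetical sentences (in months) from five judges for the same case,
# with an assumed "correct" sentence of 30 months.
bias, noise, mse = decompose_error([30, 42, 36, 48, 24], true_value=30)
print(f"bias={bias:.2f} noise={noise:.2f} mse={mse:.2f}")
print(f"bias^2 + noise^2 = {bias ** 2 + noise ** 2:.2f}")

# Two judges who miss by the same amount in opposite directions:
# the average error (bias) is zero, but the overall error is not.
bias2, noise2, mse2 = decompose_error([40, 20], true_value=30)
print(f"bias={bias2:.2f} noise={noise2:.2f} mse={mse2:.2f}")
```

In the second example the two misses average out to zero bias, yet the mean squared error stays at 100. That is the sense in which a miss in one direction is not fixed by a miss in the other.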

Maybe the worst thing about noise is that different judges respond to the same cases without sharing a common ranking. And although individual judges often have predictable and stable patterns of error, like being consistently harsh or lenient, there can also be “occasion noise,” when a judge sees the exact same case twice but decides it differently. The effect of an irrelevant feature, such as the defendant resembling the judge’s daughter or the local sports team having just won, is a type of occasion noise.

Why hasn’t noise been studied before?

Analysts look to biases to explain bad decisions because causal explanations are satisfying. Only a statistical view of the world enables us to see noise. So noise receives much less attention than bias because the statistical view is so uncommon.

Organizations maintain an “illusion of agreement” by ignoring or suppressing evidence of divergence among experts, who are assumed to agree because they use shared norms, professional doctrine, and specific methods. Examples include underwriting, criminal sentencing, wine-tasting, essay-grading, and book and movie-reviewing. In most fields, a judgment may never be evaluated against an actual value.

How to help keep the noise down

The main takeaway from Noise: A Flaw in Human Judgment is that we should be doing more to prevent our own judgment errors and those of our organizations. The authors propose good decision hygiene, which is about reducing the impact of unspecified noise, just like washing your hands kills unspecified germs.

- The goal of judgment is accuracy, not individual expression. Use algorithmic and rule-based evaluation or decision guidelines.
- Think statistically and take the “outside” view of the case. Anchor the prediction in statistics rather than in a causal story.
- Structure judgments into several independent tasks. Use structured interviews, diagnostic guidelines, and the Mediating Assessments Protocol, in which a complex decision is broken down to ensure that each measure is made independently.
- Resist premature intuitions. This requires discipline in how you sequence information. Only create a holistic judgment when the profile of assessments is complete. The feeling of confidence that comes with forming an overall impression of the case is not helpful if your aim is accuracy.
- Obtain independent judgments from multiple judges, then consider aggregating them. Favor relative judgments and relative scales by making pairwise comparisons on a standard scale. In a less noisy world, “overt disagreements would be both more frequent and more constructively resolved.”
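
The last item, aggregating independent judgments, can be illustrated with a small simulation. This is a sketch under stated assumptions rather than anything taken from the book: each judge's error is modeled as an independent draw from a normal distribution with no bias, and the function name, panel sizes, and noise level are hypothetical.

```python
import random
from statistics import mean, pstdev

def simulate_noise(panel_size, trials=5000, noise_sd=10.0, seed=0):
    """Estimate the noise left after averaging a panel of judges.

    Each judge's error is an independent draw from a normal distribution
    with mean 0 (no bias) and standard deviation noise_sd (assumed).
    Returns the standard deviation of the panel averages across trials.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    panel_averages = []
    for _ in range(trials):
        panel = [rng.gauss(0.0, noise_sd) for _ in range(panel_size)]
        panel_averages.append(mean(panel))
    return pstdev(panel_averages)

single_judge = simulate_noise(panel_size=1)
panel_of_nine = simulate_noise(panel_size=9)
print(f"noise with one judge:   {single_judge:.2f}")
print(f"noise with nine judges: {panel_of_nine:.2f}")
```

Under these assumptions, averaging nine independent judgments leaves roughly a third of the noise of a single judge, since the noise of an average shrinks with the square root of the number of judgments. The benefit depends on independence: judges who confer before deciding share their errors, and averaging removes less.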

Rules are expensive to create because identifying the right one is tricky, but they eliminate discretion and reduce the role of personal judgment. Simple rules and simple models, such as linear models built on limited data, outperform judges because they are noise-free. The subtlety and complexity of our judgments are mostly noise! Standards, by contrast, grant discretion. The noise that discretion creates can be reduced by aggregating judgments and by using a mediating assessments protocol.

Decision guidelines decompose a complex decision into several easy sub-judgments on predefined dimensions. This helps constrain discretion and promote homogeneity in judgments and diagnostics, which can reduce noise and improve decisions. Unlike algorithms or rules, guidelines do not eliminate the need for judging.

Error, bias, and the components of noise

The appendices of Noise include a step-by-step guide for conducting a noise audit at your organization, a well-sorted checklist of biases with example quotes for debiasing in real time, and instructions on how to apply the “outside view” to get a statistical perspective, so you don’t make nonregressive predictions.

Noise: A Flaw in Human Judgment is not pessimistic about the magnitude of errors our human systems make. Instead, it’s a pragmatic call to action about how algorithms, guidelines, and rules can make our judgments more accurate. Heed the call.

AIPT Science is co-presented by AIPT and the New York City Skeptics.