AI Foundations

⏱ About 15 min · 15 XP

Where Bias Comes From

When people hear that an AI system is 'biased,' they often imagine a programmer who deliberately wrote in something unfair. That almost never happens. Bias in AI is usually not malicious — it is structural. It enters through the data, through the design choices made before a line of code is written, and through how a system is actually used in the world. Understanding where bias comes from is the first step to doing something about it.

Source One: The Data

An AI system learns from examples. If those examples reflect an unfair world, the system learns to replicate that unfairness, and can lock it in even after society has moved on. Consider a hiring algorithm trained on ten years of résumés from a company where most senior employees were men. The algorithm learns that a 'successful candidate' looks like the people who were historically hired. It may penalize signals associated with women, even signals that have nothing to do with job performance. Amazon built and then abandoned this kind of résumé screener: as was widely reported in 2018, the company scrapped the tool after discovering that it downgraded résumés containing the word 'women's' (as in 'women's chess club').

Data bias also arises from who is missing. Medical AI trained predominantly on data from patients of European ancestry may perform poorly for patients of other backgrounds, not out of any intent, but because those patients were underrepresented in the training set.

Historical Bias vs. Representation Bias

Historical bias: the training data reflects real-world inequities that existed in the past. The AI learns those patterns and perpetuates them.

Representation bias: certain groups appear too rarely in the data for the model to learn accurate patterns about them.

Both cause harm, through different mechanisms.
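Representation bias is the easier of the two to check mechanically. The sketch below is a minimal illustration, assuming each training example carries a group label. The function names, the group labels 'A', 'B' and 'C', the counts, and the 10% cutoff are all hypothetical choices made for this example, not an established standard.

```python
from collections import Counter

def group_shares(group_labels):
    """Return each group's share of the dataset as a fraction between 0 and 1."""
    counts = Counter(group_labels)
    total = len(group_labels)
    return {group: count / total for group, count in counts.items()}

def underrepresented_groups(group_labels, min_share=0.10):
    """List groups whose share falls below min_share (a hypothetical cutoff)."""
    shares = group_shares(group_labels)
    return sorted(group for group, share in shares.items() if share < min_share)

# Made-up training set: 850 examples from group A, 120 from B, 30 from C.
training_groups = ["A"] * 850 + ["B"] * 120 + ["C"] * 30

shares = group_shares(training_groups)
flagged = underrepresented_groups(training_groups)
print(shares)
print(flagged)
```

A count like this says nothing about historical bias: a group can be well represented in the data and still carry labels that reflect past unfairness.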

Source Two: Design Choices

Bias can enter before any data is touched. The choices that frame a problem shape what the system can and cannot do.

Choice of objective: What are you optimizing for? A recidivism prediction tool (used in US courts to estimate how likely a defendant is to re-offend) was optimized to predict re-arrest. But re-arrest is not the same as re-offense: it also reflects policing patterns, which are themselves uneven. By choosing 're-arrest' as the target variable, the designers baked in existing disparities.

Choice of who is in the room: Teams that are demographically homogeneous often fail to anticipate how a system will affect groups different from themselves, not because they are careless, but because they lack that lived experience.

Choice of what counts as success: A facial recognition system evaluated mostly on light-skinned faces can look great on aggregate metrics. If the team does not disaggregate performance by subgroup, it never sees the disparity.
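The gap between re-arrest and re-offense can be made concrete with simple arithmetic. The sketch below assumes, purely for illustration, two groups with the same true re-offense rate but different chances that an offense leads to an arrest. Every number is made up, the function name is hypothetical, and the model deliberately ignores wrongful arrests.

```python
def observed_rearrest_rate(reoffense_rate, arrest_prob_given_offense):
    """Share of people re-arrested, assuming arrests only follow real offenses."""
    return reoffense_rate * arrest_prob_given_offense

# Assumption: both groups re-offend at the same true rate.
true_reoffense_rate = 0.30

# Assumption: an offense is twice as likely to end in arrest
# in the heavily policed neighborhood.
rate_heavily_policed = observed_rearrest_rate(true_reoffense_rate, 0.60)
rate_lightly_policed = observed_rearrest_rate(true_reoffense_rate, 0.30)

# A model trained on re-arrest labels sees the first group as riskier,
# even though the underlying behavior is identical by construction.
label_ratio = rate_heavily_policed / rate_lightly_policed
print(rate_heavily_policed, rate_lightly_policed, label_ratio)
```

Under these assumptions the labels make one group look about twice as risky as the other, and a model that fits the labels well would faithfully reproduce that gap.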

Aggregate Accuracy Can Hide Group-Level Failure

A model that is 95% accurate overall can still be 99% accurate on the majority group and only 70% accurate on a minority group, as long as the minority group makes up a small share of the test set. Reporting only the overall number makes the system look better than it is for the very people it is most likely to fail. Always ask: accurate for whom?
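Disaggregating accuracy takes only a few lines. The sketch below is one minimal way to do it, assuming predictions, true labels, and group labels are available as parallel lists. The function name, the group names, and the evaluation data are hypothetical, and the made-up counts give an overall figure of about 96% rather than exactly 95%.

```python
from collections import defaultdict

def accuracy_by_group(y_true, y_pred, groups):
    """Return (overall accuracy, {group: accuracy}) for three parallel lists."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    overall = sum(correct.values()) / sum(total.values())
    per_group = {group: correct[group] / total[group] for group in total}
    return overall, per_group

# Made-up evaluation set: 900 majority-group rows, 100 minority-group rows.
# The model gets 891 of the 900 majority rows right and 70 of the 100 minority rows.
y_true = [1] * 1000
y_pred = [1] * 891 + [0] * 9 + [1] * 70 + [0] * 30
groups = ["majority"] * 900 + ["minority"] * 100

overall, per_group = accuracy_by_group(y_true, y_pred, groups)
print(f"overall: {overall:.3f}")
for group, accuracy in per_group.items():
    print(f"{group}: {accuracy:.3f}")
```

The overall figure is pulled toward the majority group simply because that group supplies most of the rows, which is why the per-group numbers need to be reported alongside it.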

Source Three: Deployment Context

A system that performs well in testing can cause harm when deployed in a different context, or when used by people in ways the designers did not anticipate. A medical diagnosis AI trained on patients at a large research hospital may be tested and validated in that setting. When it is deployed in a rural clinic with older imaging equipment and a different patient population, its accuracy may drop significantly, with no warning.

Deployment bias also arises from automation bias: the tendency for humans to over-trust AI outputs and stop applying their own judgment. A radiologist who sees an AI mark a scan as 'low risk' may spend less time scrutinizing it, even when the AI is wrong.

Match each type of bias to its correct description.

Terms

Historical bias
Representation bias
Objective bias
Automation bias

Definitions

Certain groups appear too rarely in training data for the model to learn accurate patterns
The AI learns patterns from data that reflects past societal inequities
Humans over-trust AI outputs and reduce their own critical scrutiny
The chosen optimization target does not accurately represent the real goal


Amazon's résumé-screening AI penalized candidates who mentioned 'women's chess club.' Which source of bias does this most directly illustrate?

A facial recognition system is 94% accurate overall but only 68% accurate for people with dark skin tones. A team reports only the 94% figure. What critical mistake are they making?

Trace the Bias

  1. Choose one of these scenarios:
       A. A loan approval AI trained on historical approval data from the 1990s.
       B. A content moderation AI tested only on English-language posts before being deployed globally.
       C. A school grading AI that flags essays as low-quality if they use non-standard dialects of English.
  2. For your chosen scenario, identify which of the three bias sources (data, design, deployment) is at work, or whether more than one applies.
  3. Write a short paragraph explaining the mechanism: how exactly does the bias enter, and which group is likely to be harmed?
  4. Then propose one concrete change that could reduce the bias at the data stage, design stage, or deployment stage.