Bias Detective
This lesson is your investigation. You have spent eight lessons building a toolkit: you know what bias is, how it enters AI through data and design, what representation gaps look like, how fairness is measured, and who shares responsibility. Now you will put that toolkit to work. You are a bias detective. A fictional city's AI system has been deployed, and your job is to find what is wrong — and propose what to do about it.
The Case File
Metroville City Council recently deployed an AI system called FAIR-ASSIST to help route social services (food assistance, job training referrals, housing support, and family counseling) to residents who apply online. The system was built by a tech company using five years of past case-worker decisions as training data. Here is what is known about FAIR-ASSIST:
- The training data came from case files in which human case workers decided who to approve for each service.
- Case workers were predominantly from one socioeconomic background and received limited training on implicit bias.
- The system uses features including: applicant's zip code, length of the application text, device type used to apply (mobile vs. desktop), primary language of the application, and previous use of city services.
- The city's population is 40 percent Spanish-speaking, but only 8 percent of the training data came from Spanish-language applications.
- The system was launched after three months of testing on English-language applications only.
- There is currently no feedback channel for applicants to report if they believe a decision was wrong.
- Application decisions are final and are not reviewed by a human case worker.
FAIR-ASSIST routes social services in Metroville. It was trained on biased caseworker decisions, uses proxy variables, under-represents Spanish speakers, was not tested on diverse language groups, and has no appeal or review process.
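To put a number on the representation gap in the case file, you can compare a group's share of the training data with its share of the population. The short Python sketch below does that with the two figures given above (40 percent and 8 percent). The function name `representation_ratio` is made up for this lesson, and the comparison assumes that applicants roughly mirror the city's population, which the case file does not confirm.

```python
# Figures taken from the FAIR-ASSIST case file.
population_share = 0.40  # share of Metroville residents who are Spanish-speaking
training_share = 0.08    # share of training data from Spanish-language applications


def representation_ratio(training_share: float, population_share: float) -> float:
    """Training-data share divided by population share.

    A value of 1.0 would mean the group appears in the training data
    in proportion to its share of the population.
    """
    return training_share / population_share


ratio = representation_ratio(training_share, population_share)
print(f"Representation ratio: {ratio:.2f}")
```

A ratio of about 0.2 means Spanish-language applications appear in the training data at roughly one fifth of the rate you would expect if the data matched the city's population.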
Part 1 — Identify the Bias Sources
Bias Detective: Part 1 — Find the Sources
- Read the FAIR-ASSIST case file carefully. Then complete all four steps in writing.
- Step 1: List every source of bias you can identify in FAIR-ASSIST. For each source, name the type of bias (historical bias, representation gap, proxy variable, measurement bias, design choice, etc.) and explain in one to two sentences why it is a bias problem.
- Step 2: For each feature FAIR-ASSIST uses (zip code, application length, device type, application language, previous service use), explain whether it might act as a proxy for a protected characteristic. Which features concern you most, and why?
- Step 3: FAIR-ASSIST was tested on English-language applications only. What specific risks does this create for Spanish-speaking applicants? Name at least two.
- Step 4: The system has no human review and no appeal process. Using what you know about the shared responsibility for fairness, explain how each responsible party failed here: the engineers, the tech company that built the system, and the city government.
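One way to think about Step 2 is to ask how unevenly a feature is distributed across groups. The Python sketch below is only an illustration: the eight application records are invented, not taken from Metroville, and the helper `feature_rate_by_group` is a hypothetical name. It shows how you might check whether device type differs between Spanish-language and English-language applications.

```python
from collections import defaultdict

# Hypothetical applications, invented for illustration only.
applications = [
    {"language": "es", "device": "mobile"},
    {"language": "es", "device": "mobile"},
    {"language": "es", "device": "mobile"},
    {"language": "es", "device": "desktop"},
    {"language": "en", "device": "mobile"},
    {"language": "en", "device": "desktop"},
    {"language": "en", "device": "desktop"},
    {"language": "en", "device": "desktop"},
]


def feature_rate_by_group(rows, group_key, feature_key, feature_value):
    """For each group, return the fraction of rows whose feature equals feature_value."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for row in rows:
        group = row[group_key]
        totals[group] += 1
        if row[feature_key] == feature_value:
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


mobile_rates = feature_rate_by_group(applications, "language", "device", "mobile")
print(mobile_rates)
```

In this made-up sample, three of four Spanish-language applications come from mobile devices, compared with one of four English-language applications. A gap like that would suggest the feature carries information about the group, so a model could use it as a stand-in for the group even if language were removed. Real data would be needed to know whether such a gap exists in Metroville.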
Part 2 — Measure the Impact
Even without real data, you can reason about what fairness metrics would likely show. Use what you know about representation gaps and proxy variables to make educated predictions.
Bias Detective: Part 2 — Measure the Impact
- Step 1: Predict — without real data, but using your knowledge from this module — whether FAIR-ASSIST would satisfy demographic parity, equal opportunity, and predictive parity when comparing English-speaking and Spanish-speaking applicants. Explain your reasoning for each metric.
- Step 2: Imagine you are allowed to run one fairness test before the city shuts the system down for review. Which metric would you test first, and why? What data would you need to run that test?
- Step 3: Describe one group of Metroville residents (besides Spanish speakers) who might also experience unfair outcomes from FAIR-ASSIST based on the features it uses. Explain why.
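If you did have data, the three metrics in Step 1 could be computed along the lines of the Python sketch below. It uses the common definitions: demographic parity compares approval rates, equal opportunity compares approval rates among people who truly qualify, and predictive parity compares how often an approval goes to someone who truly qualifies. Your module may word these slightly differently. All counts are invented, the helper names are hypothetical, and the sketch assumes a trustworthy "eligible" label exists for every applicant.

```python
def make_rows(group, counts):
    """Expand {(eligible, approved): n} counts into one record per applicant."""
    rows = []
    for (eligible, approved), n in counts.items():
        rows.extend(
            {"group": group, "eligible": eligible, "approved": approved}
            for _ in range(n)
        )
    return rows


# Hypothetical outcomes, invented for illustration only.
records = make_rows("en", {(True, True): 4, (True, False): 1,
                           (False, True): 1, (False, False): 4})
records += make_rows("es", {(True, True): 2, (True, False): 3,
                            (False, True): 0, (False, False): 5})


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def fairness_rates(rows, group):
    """Compute the per-group rates that the three fairness metrics compare."""
    members = [r for r in rows if r["group"] == group]
    approved = [r for r in members if r["approved"]]
    eligible = [r for r in members if r["eligible"]]
    true_positives = [r for r in members if r["approved"] and r["eligible"]]
    return {
        # Demographic parity compares this rate across groups.
        "selection_rate": safe_ratio(len(approved), len(members)),
        # Equal opportunity compares this rate across groups.
        "true_positive_rate": safe_ratio(len(true_positives), len(eligible)),
        # Predictive parity compares this rate across groups.
        "precision": safe_ratio(len(true_positives), len(approved)),
    }


en_rates = fairness_rates(records, "en")
es_rates = fairness_rates(records, "es")
print("en:", en_rates)
print("es:", es_rates)
```

With these invented counts, the two groups differ on all three rates, and not always in the same direction: the Spanish-language group has a lower approval rate and a lower true positive rate but a higher precision. That is one reason the choice of metric in Step 2 matters. Notice also that two of the three rates depend on the "eligible" label, so where that label comes from is part of the investigation.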
Part 3 — Fix the System
Bias Detective: Part 3 — Recommend Improvements
- You have been asked to write a short report to the Metroville City Council recommending changes to FAIR-ASSIST. Your report must include:
- 1. A summary of the top three bias problems you found (two to three sentences each).
- 2. Three specific technical changes the development team should make before redeployment. Be concrete: not just 'get better data' but exactly what data, from whom, and how.
- 3. Two organizational or policy changes the city and the tech company should make — things beyond the code itself.
- 4. A fairness metric you recommend the city track and publish publicly every six months after redeployment.
- 5. A final paragraph explaining why the residents who are most likely to be harmed by FAIR-ASSIST are also the residents who most need the services it is supposed to provide — and why that makes getting this right especially urgent.
FAIR-ASSIST was trained on five years of past case-worker decisions. Why does that make the training data a bias risk?
FAIR-ASSIST uses an applicant's zip code as one of its features. Why can zip code be a fairness problem?