Disparate Impact
Also known as: adverse impact, disparate impact analysis, 80 percent rule
- A legal and algorithmic fairness concept where a neutral-seeming policy or model disproportionately harms a protected group, measured by the four-fifths rule requiring each group’s selection rate to be at least 80% of the highest group’s rate.
Disparate impact occurs when a neutral-looking rule or algorithm produces significantly worse outcomes for a protected group compared to others, even without any discriminatory intent.
What It Is
If you’ve ever wondered why a hiring algorithm or credit scoring model keeps getting flagged as “unfair” despite treating every applicant through the same formula, disparate impact is the reason. The concept describes a situation where an apparently equal process leads to unequal results across demographic groups — not because anyone designed it that way, but because the process itself has a built-in blind spot.
Think of it like a standardized test that asks questions only about experiences common to one culture. The test doesn’t say “exclude group X.” But because the questions draw on a narrow range of life experience, one group consistently scores lower. That gap between groups is the disparate impact.
The legal foundation traces back to Title VII of the Civil Rights Act of 1964, and the Supreme Court formalized the doctrine in Griggs v. Duke Power Co. (1971). In employment law, regulators measure disparate impact using the four-fifths rule (also called the 80% rule). Under the EEOC's Uniform Guidelines, the selection rate for any protected group must be at least 80% of the highest group's selection rate. If your company hires 60% of male applicants, the female hire rate needs to be at least 48% (60% × 0.8) to pass this threshold.
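The four-fifths arithmetic is simple enough to sketch in a few lines of Python (group names and rates below are invented for illustration):

```python
def four_fifths_ratio(selection_rates):
    """Ratio of the lowest group's selection rate to the highest group's.

    A result below 0.8 indicates adverse impact under the four-fifths rule.
    """
    rates = selection_rates.values()
    return min(rates) / max(rates)

# Hypothetical example: 60% of male applicants hired vs. 45% of female applicants.
ratio = four_fifths_ratio({"male": 0.60, "female": 0.45})
print(f"{ratio:.2f}")  # 0.75
print(ratio >= 0.8)    # False -> adverse impact flagged
```

Note that the rule compares the lowest rate against the highest, so it generalizes naturally beyond two groups.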
When fairness metrics like demographic parity, equalized odds, and calibration get compared — as in discussions about choosing the right fairness criterion — disparate impact sits in the background as the legal standard that started it all. Demographic parity is, mathematically, a stricter cousin of the four-fifths rule: it asks for equal selection rates across groups rather than rates within 80% of each other. Understanding where disparate impact ends and statistical parity begins helps you pick the right fairness criterion.
In machine learning, disparate impact gets measured through the same ratio concept. According to Fairlearn Docs, the demographic_parity_ratio() function returns a value between 0 and 1, where 1.0 means perfect parity and anything below 0.8 indicates adverse impact under the four-fifths threshold. This makes the jump from legal compliance to algorithmic auditing direct.
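To make the computation concrete, here is a hand-rolled sketch of the same min/max selection-rate ratio over raw predictions. This is illustrative only, not Fairlearn's implementation; the library's actual function is fairlearn.metrics.demographic_parity_ratio, which takes predictions alongside a sensitive_features argument. The data below is invented:

```python
from collections import defaultdict

def demographic_parity_ratio_sketch(y_pred, sensitive_features):
    """Min/max ratio of per-group selection rates.

    y_pred: iterable of 0/1 predictions.
    sensitive_features: iterable of group labels, aligned with y_pred.
    """
    selected = defaultdict(int)
    total = defaultdict(int)
    for pred, group in zip(y_pred, sensitive_features):
        total[group] += 1
        selected[group] += int(pred)
    rates = [selected[g] / total[g] for g in total]
    return min(rates) / max(rates)

# Group "a" is selected 3 times out of 4; group "b" once out of 4.
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
ratio = demographic_parity_ratio_sketch(y_pred, groups)
print(f"{ratio:.2f}")  # 0.33 -> well below the 0.8 threshold
```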
How It’s Used in Practice
The most common place you’ll encounter disparate impact is during fairness audits of classification models — screening tools for resumes, loan applications, insurance claims, or medical referrals. A data science team trains a model, evaluates accuracy, then runs a disparate impact check before deployment. The check splits predictions by demographic group and compares selection rates. If the ratio drops below the 80% threshold, the team either adjusts the model (through techniques like threshold tuning or re-weighting training data) or documents a valid business justification.
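One of the mitigation techniques mentioned above, threshold tuning, can be sketched with made-up scores and group labels. The example gives each group its own decision cutoff purely to show the mechanics; whether group-aware thresholds are appropriate is itself a legal and policy question:

```python
from collections import defaultdict

def selection_rates(scores, groups, thresholds):
    """Per-group selection rate when each group can have its own cutoff."""
    sel, tot = defaultdict(int), defaultdict(int)
    for score, g in zip(scores, groups):
        tot[g] += 1
        sel[g] += int(score >= thresholds[g])
    return {g: sel[g] / tot[g] for g in tot}

scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.35, 0.2]
groups = ["A"] * 4 + ["B"] * 4

# Uniform 0.5 cutoff: A selects 3/4, B selects 1/4 -> ratio 1/3, fails.
uniform = selection_rates(scores, groups, {"A": 0.5, "B": 0.5})
print(f"{min(uniform.values()) / max(uniform.values()):.2f}")  # 0.33

# Lowering B's cutoff to 0.34: both groups select 3/4 -> ratio 1.0, passes.
tuned = selection_rates(scores, groups, {"A": 0.5, "B": 0.34})
print(f"{min(tuned.values()) / max(tuned.values()):.2f}")  # 1.00
```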
Outside of ML, HR departments and legal teams apply the same four-fifths rule when reviewing hiring and promotion data. The EEOC uses it to investigate employment discrimination complaints without needing to prove deliberate bias.
Pro Tip: Run the four-fifths ratio on your model’s outputs before anyone asks you to. If you discover a ratio below 0.8 during development, you have time to investigate root causes — maybe a proxy feature in the training data, maybe a threshold set too aggressively. Finding it yourself is much cheaper than a regulator finding it in production.
When to Use / When Not
| Scenario | Use | Avoid |
|---|---|---|
| Auditing a hiring or lending classifier for group fairness | ✅ | |
| Comparing selection rates across protected demographic groups | ✅ | |
| Choosing between fairness metrics for a high-stakes decision system | ✅ | |
| Evaluating individual prediction explanations (why one person was rejected) | | ❌ |
| Measuring fairness when no protected group data is available | | ❌ |
| Assessing content recommendation quality without demographic segmentation | | ❌ |
Common Misconception
Myth: Disparate impact means someone intentionally discriminated against a group. Reality: The entire point of disparate impact is that intent doesn’t matter. A perfectly well-meaning algorithm can still fail the four-fifths test if its outputs produce unequal rates across groups. The legal doctrine specifically targets outcomes, not motives — which is exactly why it became so relevant to automated decision systems that have no “intent” at all.
One Sentence to Remember
Disparate impact measures the gap between what a system does to different groups, not what it was designed to do — and the four-fifths rule gives you a concrete number to check against before that gap becomes a legal or ethical problem.
FAQ
Q: How is disparate impact different from disparate treatment? A: Disparate treatment requires proving intentional discrimination against a specific group. Disparate impact focuses on unequal outcomes regardless of intent, making it the standard most relevant to algorithmic systems.
Q: Does passing the four-fifths rule mean a model is fair? A: Not necessarily. The four-fifths rule is a minimum threshold, not a guarantee of fairness. A model can pass this test while still failing other metrics like equalized odds or calibration across groups.
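The answer above can be made concrete with a toy example (all numbers invented): two groups with identical selection rates, but opposite true positive rates.

```python
def rates(y_true, y_pred):
    """Selection rate and true positive rate for one group."""
    selection = sum(y_pred) / len(y_pred)
    preds_for_qualified = [p for t, p in zip(y_true, y_pred) if t == 1]
    tpr = sum(preds_for_qualified) / len(preds_for_qualified)
    return selection, tpr

# Group A: the two qualified applicants are the two selected.
sel_a, tpr_a = rates([1, 1, 0, 0], [1, 1, 0, 0])
# Group B: same 50% selection rate, but the qualified applicants are rejected.
sel_b, tpr_b = rates([1, 1, 0, 0], [0, 0, 1, 1])

print(sel_a == sel_b)  # True -> four-fifths ratio is 1.0, a clean pass
print(tpr_a, tpr_b)    # 1.0 0.0 -> equalized odds fails completely
```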
Q: Can disparate impact apply to AI systems outside hiring? A: Yes. Lending, insurance underwriting, healthcare triage, criminal risk assessment, and any automated decision affecting people’s access to opportunities or services can be evaluated for disparate impact.
Sources
- EEOC: Uniform Guidelines on Employee Selection Procedures - Official Q&A explaining the four-fifths rule and adverse impact standards
- Fairlearn Docs: Common Fairness Metrics - Documentation of demographic parity ratio and related ML fairness measurements
Expert Takes
Disparate impact translates a legal threshold into a statistical test. The four-fifths rule defines a minimum ratio between group selection rates — drop below it, and the burden of proof shifts to whoever deployed the system. What makes this useful for ML practitioners is its simplicity: one ratio, one threshold, one clear signal that something in the pipeline needs investigation before you start debating which fairness metric fits best.
When you’re building a fairness audit into your model evaluation pipeline, disparate impact analysis belongs right after your standard performance metrics. Calculate the selection rate ratio per protected group, compare it against the threshold, and flag anything that falls short. The practical challenge isn’t the math — it’s getting clean demographic labels for your test set and deciding what counts as a “selection” in your specific use case.
Regulatory pressure on algorithmic fairness is increasing across every industry that touches consumer decisions. Organizations that build disparate impact checks into their development cycle now are positioning themselves ahead of compliance requirements still catching up to the technology. The four-fifths rule gives teams a defensible, well-established benchmark — one that regulators already understand and courts already recognize.
The four-fifths rule offers a clean threshold, but clean thresholds can create false comfort. A system that barely passes the test isn’t meaningfully fairer than one that barely fails it. The deeper question is what happens to the people on the wrong side of any automated decision — and whether reducing systemic disadvantage to a single ratio can truly capture the lived experience of being filtered out by a process no one can see, challenge, or appeal.