AI Infrastructure, Safety & Ethics

Algorithmic Fairness

Definition

Algorithmic fairness is the field that formalizes what it means for an algorithm to treat people equitably. Multiple competing formal definitions exist, including: demographic parity (equal positive prediction rates across groups), equalized odds (equal true positive and false positive rates across groups), calibration (predicted probabilities are equally accurate across groups), and individual fairness (similar individuals receive similar predictions). These definitions often conflict mathematically—it is impossible to simultaneously satisfy all fairness criteria when base rates differ between groups (the impossibility theorem). Choosing which fairness criterion to optimize is a value judgment that depends on the domain, stakes, and legal framework.
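The competing definitions above can be made concrete in a few lines. This is a minimal sketch with illustrative function names (not a library API) showing that the same predictions can satisfy demographic parity while violating equalized odds:

```python
# Two of the competing fairness criteria, computed on toy binary predictions.

def demographic_parity_gap(y_hat, groups):
    """|P(Y_hat=1 | A=a) - P(Y_hat=1 | A=b)| for two groups."""
    rates = {}
    for g in set(groups):
        preds = [p for p, gg in zip(y_hat, groups) if gg == g]
        rates[g] = sum(preds) / len(preds)
    a, b = rates.values()
    return abs(a - b)

def equalized_odds_gap(y_true, y_hat, groups):
    """Largest gap in TPR or FPR between the two groups."""
    gaps = []
    for label in (1, 0):  # label=1 -> TPR comparison, label=0 -> FPR comparison
        rates = []
        for g in sorted(set(groups)):
            sel = [p for p, t, gg in zip(y_hat, y_true, groups)
                   if gg == g and t == label]
            rates.append(sum(sel) / len(sel))
        gaps.append(abs(rates[0] - rates[1]))
    return max(gaps)

# Same predictions, different verdicts under different criteria:
groups = ["A"] * 4 + ["B"] * 4
y_true = [1, 1, 0, 0, 1, 0, 0, 0]
y_hat  = [1, 0, 1, 0, 1, 1, 0, 0]  # 50% positive rate in both groups
print(demographic_parity_gap(y_hat, groups))      # 0.0 -> parity satisfied
print(equalized_odds_gap(y_true, y_hat, groups))  # 0.5 -> odds violated
```

The final two lines illustrate the conflict the impossibility theorem formalizes: equal positive-prediction rates do not imply equal error rates when the groups' base rates differ.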

Why It Matters

Understanding algorithmic fairness is essential for any team deploying AI in domains that affect people's lives—hiring, lending, healthcare, criminal justice, housing, and education. The choice of fairness metric has profound real-world implications: optimizing for demographic parity in loan approval may require approving some high-risk applications from underrepresented groups; optimizing for equalized odds ensures equal error rates but may produce different approval rates across groups. These are not purely technical decisions—they embed value judgments about acceptable tradeoffs between accuracy and equity that require stakeholder input and often legal guidance.

How It Works

Fairness evaluation workflow: (1) identify the relevant protected attributes for the use case and legal context; (2) compute the model's predictions separately for each demographic group; (3) calculate fairness metrics—demographic parity ratio (approval rate for group A / approval rate for group B; target >= 0.8 per the 80% rule), equalized odds (difference in TPR and FPR between groups; target < 0.05), calibration gap (difference in calibration error between groups); (4) interpret results in the legal and ethical context; (5) apply fairness-aware training techniques if thresholds are violated. Fairlearn, AI Fairness 360, and Aequitas provide standardized fairness measurement toolkits.
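Steps (2) and (3) of the workflow can be sketched in plain Python. The function name, threshold defaults, and return format below are illustrative assumptions, not a standard API; Fairlearn and AI Fairness 360 provide production-grade equivalents:

```python
# Sketch of workflow steps (2)-(3): per-group predictions, then the three
# metrics with the targets stated above (dp_ratio >= 0.8, eo_gap < 0.05).

def fairness_report(y_true, y_score, groups, cutoff=0.5):
    gs = sorted(set(groups))  # assumes exactly two groups
    y_pred = [1 if s >= cutoff else 0 for s in y_score]

    def rate(preds):
        return sum(preds) / len(preds)

    # (3a) Demographic parity ratio: min approval rate / max approval rate.
    approval = [rate([p for p, g in zip(y_pred, groups) if g == grp])
                for grp in gs]
    dp_ratio = min(approval) / max(approval)  # target >= 0.8 (80% rule)

    # (3b) Equalized odds: largest TPR or FPR gap between groups.
    gaps = []
    for label in (1, 0):
        r = [rate([p for p, t, g in zip(y_pred, y_true, groups)
                   if g == grp and t == label]) for grp in gs]
        gaps.append(abs(r[0] - r[1]))
    eo_gap = max(gaps)  # target < 0.05

    # (3c) Calibration gap: difference in per-group mean absolute score error.
    cal = [sum(abs(s - t) for s, t, g in zip(y_score, y_true, groups)
               if g == grp) / groups.count(grp) for grp in gs]
    cal_gap = abs(cal[0] - cal[1])

    return {"dp_ratio": dp_ratio, "eo_gap": eo_gap, "cal_gap": cal_gap}

report = fairness_report(
    y_true=[1, 0, 1, 0, 1, 1, 0, 0],
    y_score=[0.9, 0.2, 0.8, 0.4, 0.7, 0.6, 0.1, 0.3],
    groups=["A", "A", "A", "A", "B", "B", "B", "B"],
)
# Here both groups get equal approval and error rates, so dp_ratio = 1.0
# and eo_gap = 0.0; only the calibration gap is nonzero.
print(report)
```

Steps (4) and (5), interpretation and mitigation, are deliberately left out: they depend on the legal context and on retraining choices that no metric function can make for you.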

Algorithmic Fairness Metrics

Demographic Parity — ✓ Pass
  Criterion: P(Ŷ=1|A=0) = P(Ŷ=1|A=1)
  Group A: 68% · Group B: 71% · Δ 3%

Equalized Odds — ✗ Fail
  Criterion: TPR and FPR equal across groups
  Group A: 82% · Group B: 61% · Δ 21%

Calibration — ✓ Pass
  Criterion: P(Y=1|Ŷ=p) = p for all groups
  Group A: 79% · Group B: 78% · Δ 1%

Real-World Example

A recidivism risk assessment tool used in pre-trial detention decisions showed equal overall accuracy across racial groups (77% for both Black and white defendants) but a significant equalized odds violation: the false positive rate (incorrectly classifying a non-recidivist as high-risk) was 45% for Black defendants vs. 23% for white defendants. Though the tool appeared "equally accurate," it was roughly twice as likely to incorrectly detain Black defendants. This disparity—documented in the COMPAS analysis by ProPublica—became a landmark case study in the importance of choosing fairness metrics that reflect the actual harm distribution of errors rather than just aggregate accuracy.
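The disparity is easy to reconstruct numerically. The counts below are made up to match the reported rates (they are not the actual COMPAS data); the point is that overall accuracy hides which group bears the cost of errors:

```python
# Hypothetical samples of 100 non-recidivists (true label = 0) per group,
# chosen to mirror the reported false positive rates.

def false_positive_rate(flags):
    """Share of true non-recidivists incorrectly flagged high-risk."""
    return sum(flags) / len(flags)

flags_black = [1] * 45 + [0] * 55   # 45 of 100 wrongly flagged high-risk
flags_white = [1] * 23 + [0] * 77   # 23 of 100 wrongly flagged high-risk

fpr_b = false_positive_rate(flags_black)  # 0.45
fpr_w = false_positive_rate(flags_white)  # 0.23
print(f"FPR ratio: {fpr_b / fpr_w:.2f}")  # prints "FPR ratio: 1.96"
```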

Common Mistakes

  • Using a single fairness metric as a complete fairness evaluation—different metrics capture different types of fairness violations; use multiple metrics
  • Treating fairness as a post-hoc fix—many fairness issues require data collection and model design changes that are difficult to apply after the model is trained
  • Conflating statistical fairness with ethical fairness—a model can satisfy mathematical fairness criteria while still producing harmful outcomes in context
