AI Bias
Definition
AI bias refers to systematic errors in model outputs that disadvantage particular groups based on characteristics like race, gender, age, disability, or socioeconomic status. Bias enters AI systems at multiple stages: (1) historical bias—training data reflects past human prejudices (e.g., historical hiring data reflects past discrimination); (2) representation bias—certain groups are underrepresented in training data; (3) measurement bias—proxy features (zip code, name) encode demographic information; (4) label bias—human annotators' subjective judgments reflect cultural biases; (5) aggregation bias—a model trained on combined data performs poorly on specific subgroups. Bias is not always intentional but can cause serious harm regardless of intent.
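Measurement bias in particular can be probed directly: if a proxy feature predicts the protected attribute well, it can leak demographic information into a model that never sees that attribute. A minimal sketch in plain Python, using made-up zip-code prefixes and group labels (all names and numbers here are hypothetical):

```python
from collections import Counter

# Hypothetical records: each has a zip-code prefix and a (normally unobserved)
# demographic group. If the proxy alone predicts the group well, it can leak
# demographic information into a model that never sees the group directly.
records = [
    ("941", "A"), ("941", "A"), ("941", "A"), ("941", "B"),
    ("606", "B"), ("606", "B"), ("606", "B"), ("606", "A"),
]

def proxy_leakage(records):
    """Accuracy of predicting group from the proxy alone (majority vote per proxy value)."""
    by_proxy = {}
    for proxy, group in records:
        by_proxy.setdefault(proxy, []).append(group)
    correct = 0
    for groups in by_proxy.values():
        _, count = Counter(groups).most_common(1)[0]
        correct += count  # majority-vote guesses that come out right
    return correct / len(records)

print(proxy_leakage(records))  # 0.75: the zip prefix predicts group 75% of the time
```

A leakage score well above the base rate of the largest group is a signal that the feature deserves the kind of scrutiny the real-world example below gives to zip codes.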
Why It Matters
AI bias has caused documented real-world harm: healthcare algorithms prioritized spending on white patients over equally sick Black patients, facial recognition systems misidentified Black individuals at significantly higher rates, and credit scoring models charged higher interest rates to minority borrowers with equivalent risk profiles. Regulatory frameworks increasingly require bias testing for high-stakes AI applications in hiring, lending, criminal justice, and healthcare. Beyond legal risk, biased AI systems erode user trust, create reputational damage, and may cause direct harm to vulnerable populations. Every team deploying AI in consequential domains must systematically evaluate and mitigate bias.
How It Works
Bias evaluation methodology: (1) identify protected attributes relevant to the use case (race, gender, age, disability); (2) gather disaggregated evaluation data with demographic labels; (3) compute performance metrics separately for each demographic group; (4) apply fairness metrics (demographic parity, equalized odds, calibration) to measure disparity; (5) identify the most severe disparities and trace them to their source (training data, features, labels, objective); (6) apply mitigations (resampling, reweighting, adversarial debiasing, fairness constraints); (7) re-evaluate after mitigation. IBM AI Fairness 360 and Fairlearn provide Python toolkits for bias measurement and mitigation.
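Steps 2 through 4 can be sketched without any toolkit; the arrays below are made-up stand-ins for real evaluation data (Fairlearn's MetricFrame provides the same disaggregation with many more metrics):

```python
# Hypothetical evaluation data with demographic labels (step 2).
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]   # ground-truth outcomes
y_pred = [1, 0, 1, 1, 0, 1, 1, 0, 0, 0]   # model decisions
group  = ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]

def by_group(values, groups):
    """Bucket values by demographic group (step 3: disaggregation)."""
    out = {}
    for v, g in zip(values, groups):
        out.setdefault(g, []).append(v)
    return out

def selection_rate(preds):
    return sum(preds) / len(preds)

# Step 3: per-group selection rates.
rates = {g: selection_rate(p) for g, p in by_group(y_pred, group).items()}
# rates: {"A": 0.6, "B": 0.4}

# Step 4: demographic parity difference = largest gap in selection rates.
dp_diff = max(rates.values()) - min(rates.values())  # ~0.2

# Equalized odds compares error rates instead; here, true-positive rate per group.
def tpr(truths, preds):
    positives = [(t, p) for t, p in zip(truths, preds) if t == 1]
    return sum(p for _, p in positives) / len(positives)

pairs = by_group(list(zip(y_true, y_pred)), group)
tprs = {g: tpr([t for t, _ in v], [p for _, p in v]) for g, v in pairs.items()}
# tprs: {"A": 1.0, "B": 0.5} -- group B's qualified cases are approved half as often
```

On this toy data the model selects group A more often and misses half of group B's qualified cases, exactly the kind of disparity step 5 would then trace back to its source.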
Types of AI Bias
The five categories introduced in the definition (historical, representation, measurement, label, and aggregation bias) differ in where bias enters the pipeline, so mitigation must target the right stage.
Mitigation: audit training data, measure fairness metrics across demographic subgroups, document models with model cards, and monitor production data for drift.
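Of the mitigations listed in the methodology, reweighting is the simplest to show. Below is a minimal sketch of Kamiran-and-Calders-style reweighting (AIF360 ships an implementation of this idea) with made-up counts:

```python
from collections import Counter

# Hypothetical training examples as (group, label) pairs: group A is mostly
# labeled positive, group B mostly negative -- a correlated, biased dataset.
data = [("A", 1)] * 6 + [("A", 0)] * 2 + [("B", 1)] * 2 + [("B", 0)] * 6

def reweigh(data):
    """Weight each (group, label) cell so group and label look statistically
    independent to the learner: w = P(group) * P(label) / P(group, label)."""
    n = len(data)
    g_count = Counter(g for g, _ in data)
    y_count = Counter(y for _, y in data)
    gy_count = Counter(data)
    return {
        (g, y): (g_count[g] / n) * (y_count[y] / n) / (gy_count[(g, y)] / n)
        for (g, y) in gy_count
    }

weights = reweigh(data)
# Underrepresented cells like (A, 0) get weight 2.0; overrepresented cells
# like (A, 1) get weight 2/3. Weighted counts of every cell become equal.
```

Training with these per-example weights removes the group-label correlation without altering any individual record, which is why reweighting is often tried before more invasive mitigations such as adversarial debiasing.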
Real-World Example
A major bank tested its automated loan approval model across demographic groups and found that applications from Black and Hispanic applicants were approved at 23% lower rates than those from comparable white applicants, even after controlling for financial factors. Investigation identified two bias sources: the zip code feature encoded the racial composition of neighborhoods (a de facto redlining proxy), and the employment stability feature penalized gig economy workers, who are disproportionately minority. Removing the zip code feature and redesigning the employment stability feature reduced the disparity to 3%, within statistical noise, while keeping overall model accuracy at 94% of the original.
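A disparity like this is often expressed as a selection-rate ratio; US enforcement practice commonly screens with the "four-fifths rule," which flags any group approved at less than 80% of the highest group's rate. A sketch with hypothetical counts matching the 23% gap described above:

```python
# Hypothetical approval counts illustrating the disparity described above:
# a 23% lower approval rate corresponds to a selection-rate ratio of 0.77.
approvals = {"white": (770, 1000), "black": (593, 1000)}  # (approved, applicants)

def disparity_ratios(approvals, reference):
    """Each group's approval rate divided by the reference group's rate."""
    ref_approved, ref_total = approvals[reference]
    ref_rate = ref_approved / ref_total
    return {g: (a / n) / ref_rate for g, (a, n) in approvals.items()}

ratios = disparity_ratios(approvals, "white")
# The four-fifths screen flags any group whose ratio falls below 0.8.
flagged = [g for g, r in ratios.items() if r < 0.8]
# flagged: ["black"] -- a 0.77 ratio fails the four-fifths screen
```

A ratio below 0.8 does not itself prove discrimination, but it is the kind of signal that triggered the bank's deeper feature-level investigation.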
Common Mistakes
- ✕ Testing for bias only on aggregate performance metrics: bias is a group-level phenomenon that requires disaggregated evaluation by demographic group
- ✕ Assuming a model is unbiased because no protected attributes were used as features: proxy variables (zip code, name, employment gaps) still encode demographic information
- ✕ Treating bias mitigation as a one-time fix: bias can re-emerge as data distributions shift, so ongoing monitoring is required
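The first mistake is easy to demonstrate: with made-up predictions, a respectable aggregate accuracy can coexist with a subgroup the model nearly always gets wrong.

```python
# Hypothetical predictions: group A is the majority and easy for the model;
# group B is small and the model fails on it, yet the aggregate looks fine.
y_true = [1, 0] * 8 + [1, 1, 0, 0]
y_pred = [1, 0] * 8 + [1, 0, 1, 1]
group  = ["A"] * 16 + ["B"] * 4

def accuracy(truths, preds):
    return sum(t == p for t, p in zip(truths, preds)) / len(truths)

overall = accuracy(y_true, y_pred)  # 0.85 -- looks acceptable in aggregate

per_group = {}
for g in set(group):
    idx = [i for i, gi in enumerate(group) if gi == g]
    per_group[g] = accuracy([y_true[i] for i in idx], [y_pred[i] for i in idx])
# per_group: A scores 1.0 while B scores only 0.25 -- invisible in the aggregate
```

Because group B is only 20% of the evaluation set, its near-total failure moves the headline number by just a few points, which is exactly why disaggregated reporting is non-negotiable.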
Related Terms
Algorithmic Fairness
Algorithmic fairness defines formal mathematical criteria for measuring and achieving equitable treatment across demographic groups in AI decision systems—including demographic parity, equalized odds, and individual fairness.
Fairness Metrics
Fairness metrics are quantitative measures that evaluate how equitably an AI system treats different demographic groups—providing the mathematical foundation for detecting and reporting bias in model predictions.
Responsible AI
Responsible AI is a framework of organizational practices and principles—encompassing fairness, transparency, privacy, safety, and accountability—that guide how teams build and deploy AI systems that are trustworthy and beneficial.
AI Ethics
AI ethics is the field that examines the moral principles and societal responsibilities governing the development and deployment of AI systems—addressing fairness, accountability, transparency, privacy, and the broader human impact of algorithmic decision-making.
Disparate Impact
Disparate impact occurs when an AI system produces significantly different outcomes for different demographic groups—even without explicitly using protected attributes—creating legal liability under anti-discrimination law regardless of intent.