AI Bias: Types, Causes, How to Test for It, and What the Law Requires

AI bias is not one thing: there are several distinct types, each with different causes, different tests, and different legal implications. This is a governance guide for enterprise AI teams.

Key Takeaways

AI bias is not a single phenomenon — it encompasses data bias (training data that underrepresents or misrepresents groups), algorithmic bias (model architecture that amplifies disparities), and measurement bias (metrics that fail to capture relevant dimensions of performance equally across groups).
The most legally significant form of AI bias is disparate impact — where an apparently neutral AI system produces outcomes that disproportionately disadvantage a protected group. Disparate impact is actionable under anti-discrimination law without proof of discriminatory intent.
Fairness testing requires specifying a fairness metric — demographic parity, equalised odds, calibration — before testing. Different metrics capture different dimensions of fairness and in many cases cannot all be satisfied simultaneously. The choice of metric is itself a governance and values decision.
EU AI Act Annex IV requires that technical documentation for high-risk AI includes the results of tests conducted to assess performance and the measures taken to address bias. Bias testing is a legal compliance requirement, not merely a best practice.
The bias testing process: (1) identify demographic groups relevant to the AI's use case, (2) obtain test data with demographic labels, (3) specify the fairness metric appropriate to the context, (4) test the model against that metric, (5) document results and threshold analysis, (6) implement remediation if thresholds are exceeded, (7) re-test post-remediation and monitor in production.
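
"For reference only. This article does not constitute legal, regulatory, financial, or professional advice. For specific guidance, consult a qualified professional."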

The types of AI bias

Historical bias in training data: AI models learn from historical data that reflects historical discrimination. If historical lending data shows lower approval rates for certain demographic groups — not because those groups are less creditworthy but because they faced discriminatory lending practices — an AI trained on that data will learn to reproduce those patterns. The model is not explicitly discriminating; it is faithfully learning the patterns in the data. But those patterns encode historical injustice, and the model perpetuates it.

Representation bias: training datasets that underrepresent certain groups produce models that perform worse for those groups. Facial recognition systems trained primarily on light-skinned male faces perform significantly worse on dark-skinned female faces. Medical AI trained on datasets skewed toward particular demographics produces less accurate diagnoses for underrepresented groups. Representation bias is particularly acute for AI serving populations that have historically been excluded from research and data collection.
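One way to make representation bias visible is to report accuracy per group rather than as a single aggregate figure. The sketch below is a minimal illustration under stated assumptions: the function name `accuracy_by_group`, the group labels "A" and "B", and the toy data are all hypothetical, and real evaluations would need far larger samples per group.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: accuracy} for parallel lists of labels, predictions, and group tags."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {g: correct[g] / total[g] for g in total}


# Hypothetical toy data: group "A" has 8 test examples, group "B" only 4.
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1]
groups = ["A"] * 8 + ["B"] * 4

per_group = accuracy_by_group(y_true, y_pred, groups)
gap = max(per_group.values()) - min(per_group.values())
print(per_group)  # {'A': 0.875, 'B': 0.5}
print(gap)        # 0.375
```

Overall accuracy here is 9 out of 12, or 75%, which hides the fact that the model is right only half the time for the smaller group.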

Proxy variable bias: AI models that do not explicitly use protected characteristics may use proxy variables that are correlated with those characteristics. Postcode can be a proxy for race in markets with residential segregation. Educational institution can be a proxy for socioeconomic background. Even without any intention to discriminate, a model that uses these variables will reproduce the disparate outcomes they proxy. Proxy variable bias is particularly difficult to detect because the model appears neutral — it is only by testing outcomes across demographic groups that the discrimination becomes visible.
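Testing outcomes across demographic groups can be sketched as a comparison of selection rates. The example below is illustrative only: the function names, the group labels "X" and "Y", and the decisions are hypothetical, and the 0.8 review threshold is an assumption that echoes the four-fifths rule of thumb from US employment guidance rather than anything this article prescribes.

```python
def selection_rates(decisions, groups):
    """Share of positive decisions (1 = approved, 0 = rejected) per group."""
    totals, positives = {}, {}
    for decision, group in zip(decisions, groups):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + decision
    return {g: positives[g] / totals[g] for g in totals}


def impact_ratio(rates):
    """Lowest group selection rate divided by the highest."""
    highest = max(rates.values())
    if highest == 0:
        raise ValueError("no group received a positive decision")
    return min(rates.values()) / highest


# Hypothetical decisions from a model that never sees the protected attribute.
decisions = [1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1]
groups = ["X"] * 8 + ["Y"] * 8

rates = selection_rates(decisions, groups)
ratio = impact_ratio(rates)
REVIEW_THRESHOLD = 0.8  # assumed threshold, for illustration only
print(rates)                     # {'X': 0.75, 'Y': 0.375}
print(ratio)                     # 0.5
print(ratio < REVIEW_THRESHOLD)  # True: flag for review
```

The group labels are used only at test time. The model itself can remain blind to them, which is exactly the situation in which proxy effects would otherwise go unnoticed.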

Feedback loop bias: AI systems that learn from their own outputs create feedback loops that can amplify initial biases. A recidivism prediction model that predicts high risk for certain groups influences parole decisions; the resulting incarceration confirms the prediction in subsequent training data; the model becomes more confident in its (biased) predictions. Feedback loop bias is particularly concerning for AI systems that continuously update from production data.
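The amplification dynamic can be illustrated with a toy simulation. This is a deliberately simplified sketch, not a model of any real system: two hypothetical groups have the same true incident rate, scrutiny is allocated according to each group's share of past records, and incidents are only recorded where scrutiny is applied. The `sharpness` parameter is an assumption that controls how strongly scrutiny concentrates on the higher-scored group.

```python
def simulate_feedback(records, true_rate=0.3, checks_per_round=100,
                      rounds=10, sharpness=2.0):
    """Return each group's share of recorded incidents at the start of every round.

    Both groups share the same true_rate, so any drift in the recorded
    shares comes from where checks are sent, not from real behaviour.
    """
    records = dict(records)  # copy so the caller's dict is not mutated
    history = []
    for _ in range(rounds):
        total = sum(records.values())
        shares = {g: n / total for g, n in records.items()}
        history.append(shares)
        # Scrutiny concentrates on groups with a larger share of past records.
        weights = {g: s ** sharpness for g, s in shares.items()}
        weight_sum = sum(weights.values())
        for g in records:
            checks = checks_per_round * weights[g] / weight_sum
            # Incidents are only recorded where checks actually happen.
            records[g] += checks * true_rate
    return history


start = {"A": 55, "B": 45}  # hypothetical: a small initial imbalance in records
amplified = simulate_feedback(start, sharpness=2.0)
proportional = simulate_feedback(start, sharpness=1.0)
print(amplified[0]["A"], amplified[-1]["A"])
print(proportional[0]["A"], proportional[-1]["A"])
```

With `sharpness=1.0` the initial imbalance persists unchanged even though the groups behave identically. With `sharpness=2.0` group A's share of recorded incidents grows every round, which is the self-confirming pattern described above.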

The fairness metric decision

Choosing a fairness metric is a values decision disguised as a technical one. The most commonly used metrics are: demographic parity (equal positive outcome rates across groups, for example equal loan approval rates regardless of demographic group); equalised odds (equal true positive and false positive rates across groups, so that creditworthy applicants are equally likely to be correctly approved and non-creditworthy applicants are equally likely to be incorrectly approved); and calibration (among people who receive a given score, the probability of the outcome is the same regardless of group: if 40% of individuals assigned a particular risk score default, that should hold in every group). These metrics often cannot all be satisfied simultaneously. Mathematical impossibility results show that when the underlying outcome rates differ between groups and the model is not a perfect predictor, a model cannot be calibrated and satisfy equalised odds at the same time. The choice of which metric to optimise for is therefore a decision about which dimension of fairness the organisation prioritises.
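A small worked example can show how two of these metrics come apart. The sketch below is illustrative, with hypothetical function names and toy data: here `y_true = 1` means the applicant would repay and `y_pred = 1` means the model approves. Calibration is left out because it requires model scores rather than binary decisions.

```python
def group_rates(y_true, y_pred, groups):
    """Per-group positive rate, true positive rate, and false positive rate."""
    stats = {}
    for g in sorted(set(groups)):
        rows = [(t, p) for t, p, grp in zip(y_true, y_pred, groups) if grp == g]
        preds_for_pos = [p for t, p in rows if t == 1]  # predictions where truth is 1
        preds_for_neg = [p for t, p in rows if t == 0]  # predictions where truth is 0
        stats[g] = {
            "positive_rate": sum(p for _, p in rows) / len(rows),
            "tpr": sum(preds_for_pos) / len(preds_for_pos) if preds_for_pos else float("nan"),
            "fpr": sum(preds_for_neg) / len(preds_for_neg) if preds_for_neg else float("nan"),
        }
    return stats


def max_gap(stats, key):
    """Largest difference between any two groups on the given rate."""
    values = [s[key] for s in stats.values()]
    return max(values) - min(values)


# Hypothetical data: group A has a base rate of 0.5, group B of 0.25.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]
groups = ["A"] * 8 + ["B"] * 8

stats = group_rates(y_true, y_pred, groups)
print(max_gap(stats, "positive_rate"))  # 0.0  (demographic parity holds)
print(max_gap(stats, "tpr"))            # 0.25 (equalised odds does not)
print(max_gap(stats, "fpr"))
```

Both groups are approved at the same 50% rate, so demographic parity is satisfied, yet the true positive and false positive rates differ between groups. Which of these gaps matters more is the governance decision, and the code cannot answer it.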
