In our last blog post, we unpacked what fairness in AI means, why it matters, and why technical fixes alone cannot solve the deeper social and structural challenges at play. Still, just because fairness isn’t the whole answer doesn’t mean it isn’t a necessary part of the solution. Improving how AI systems treat individuals and groups, especially in high-stakes areas, is a critical step toward reducing harm and moving toward more equitable outcomes. That’s why in this second post we shift our focus to the practical: how is fairness actually being pursued in AI systems today?

As with explainability, which we dealt with in a prior blog post, there is no silver bullet for eliminating unfair and discriminatory bias and ensuring fairness in AI systems (Mahoney et al., 2022). With the increasing use of AI systems in sensitive sectors like education, hiring, health care or criminal justice, there is an urgent need to address algorithmic bias and ensure fairness. Accordingly, a growing number of approaches, methods and processes have emerged to achieve this.

The first step to mitigating unfair bias and improving fairness in AI systems is not technical measurement of bias or direct mitigation techniques, but establishing what fairness actually entails in a given context. This requires defining what constitutes a fair outcome, who the relevant stakeholders are, what type of data is relevant and whose perspectives are being prioritised. Since what counts as fair varies across social, cultural and institutional contexts, asking these questions and defining the scope of an AI system is crucial for ensuring fairness. Impact assessment frameworks, such as the UNESCO Ethical Impact Assessment (EIA), emphasise this by encouraging developers and stakeholders to explicitly identify the values and principles at stake in a given AI system, including what fairness should mean in that particular context.

The next step in mitigating unfair bias and improving fairness in AI systems is identifying and evaluating a given system in terms of its fairness. This can be done with a wide range of methods, some as straightforward as asking end-users and affected groups about the performance of a given system. Other methods involve auditing the data to check, for example, whether the training data accurately represents the population the system is targeting or whether the labelling is done correctly (Mahoney et al., 2022; Ferrara, 2023). Moreover, the model itself can be tested by cross-checking performance metrics like accuracy, precision or recall across different subgroups (Mahoney et al., 2022; Binns, 2018). A model’s fairness can also be quantified directly with mathematical metrics such as demographic parity, equalised odds or individual fairness (Ferrara, 2023). This involves, for example, assessing whether the model’s positive outcomes are equally distributed across groups or whether different groups have equal true positive rates.
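To make this concrete, here is a minimal sketch (in Python, using made-up toy arrays) of how selection rates and true positive rates could be compared across two groups: the gap in selection rates corresponds to a demographic parity difference, while comparing true positive rates is at the core of equalised odds checks. The data and helper names are illustrative assumptions, not taken from any particular fairness toolkit.

```python
import numpy as np

# Hypothetical toy data: true labels, model predictions and a sensitive
# attribute (group "A" vs. group "B"). In practice these would come from
# a held-out evaluation set.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0, 0, 0])
group  = np.array(["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"])

def selection_rate(y_pred, mask):
    """Share of positive predictions within a subgroup (basis of demographic parity)."""
    return y_pred[mask].mean()

def true_positive_rate(y_true, y_pred, mask):
    """Recall within a subgroup (basis of equalised odds / equal opportunity)."""
    positives = mask & (y_true == 1)
    return y_pred[positives].mean() if positives.any() else np.nan

for g in np.unique(group):
    mask = group == g
    print(f"Group {g}: selection rate = {selection_rate(y_pred, mask):.2f}, "
          f"TPR = {true_positive_rate(y_true, y_pred, mask):.2f}")

# Demographic parity difference: gap in selection rates between the groups.
rates = [selection_rate(y_pred, group == g) for g in np.unique(group)]
print("Demographic parity difference:", max(rates) - min(rates))
```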

Alongside identifying unfair bias and measuring fairness, there is a wide range of technical tools and approaches for actively mitigating bias and promoting fairness in AI systems. These mitigation techniques can be broadly grouped into three categories:

The first approach, pre-processing, involves improving or correcting the training data. This can, for instance, mean oversampling underrepresented groups or generating synthetic data to counter imbalances (Mahoney et al., 2022; Ferrara, 2023). An often-proposed pre-processing solution is to limit the use of sensitive features. While this makes sense in many cases, it usually falls short of achieving a fairer outcome, as proxy variables can also carry hidden bias: features like ZIP codes, shopping habits or education history can all stand in for origin, class or gender (Mahoney et al., 2022; Mehrabi et al., 2021).
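As a rough illustration of the pre-processing idea, the following sketch oversamples an underrepresented group until both groups are equally represented in the training data. The arrays and the `oversample_minority` helper are hypothetical; in practice, one would more likely rely on established rebalancing or synthetic-data techniques rather than naive duplication.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical imbalanced training set: group "B" is underrepresented.
X = rng.normal(size=(100, 3))
group = np.array(["A"] * 80 + ["B"] * 20)

def oversample_minority(X, group, minority="B"):
    """Duplicate minority-group rows (sampling with replacement) until both
    groups are equally represented. A simple stand-in for more sophisticated
    rebalancing or synthetic-data generation."""
    minority_idx = np.where(group == minority)[0]
    majority_idx = np.where(group != minority)[0]
    extra = rng.choice(minority_idx, size=len(majority_idx) - len(minority_idx), replace=True)
    keep = np.concatenate([majority_idx, minority_idx, extra])
    return X[keep], group[keep]

X_bal, group_bal = oversample_minority(X, group)
print({g: int((group_bal == g).sum()) for g in np.unique(group_bal)})  # {'A': 80, 'B': 80}
```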

The second approach is in-processing, which can involve adjusting the weights in the model, employing so-called adversarial debiasing or introducing regularisation terms to penalise unwanted bias or an overemphasis on sensitive features (Mahoney et al., 2022; Mehrabi et al., 2021).
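A minimal sketch of the regularisation idea, assuming a plain logistic regression trained by gradient descent: a penalty on the squared gap between the groups’ average predicted scores is added to the usual loss, so the optimiser trades a little accuracy for a smaller gap. The synthetic data, the penalty weight `lam` and the training loop are illustrative assumptions, not a reference implementation of any specific method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training data with a binary sensitive attribute s.
n, d = 200, 5
X = rng.normal(size=(n, d))
s = rng.integers(0, 2, size=n)  # sensitive group membership (0 or 1)
y = (X[:, 0] + 0.5 * s + rng.normal(scale=0.5, size=n) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.zeros(d)
lam = 1.0   # strength of the fairness penalty (a tunable assumption)
lr = 0.1

for _ in range(500):
    p = sigmoid(X @ w)
    # Standard logistic-loss gradient.
    grad = X.T @ (p - y) / n
    # Fairness penalty: squared gap between the groups' mean predicted scores
    # (a simple demographic-parity style regulariser).
    gap = p[s == 1].mean() - p[s == 0].mean()
    dgap_dw = (X[s == 1].T @ (p[s == 1] * (1 - p[s == 1])) / (s == 1).sum()
               - X[s == 0].T @ (p[s == 0] * (1 - p[s == 0])) / (s == 0).sum())
    grad += lam * 2 * gap * dgap_dw
    w -= lr * grad

p = sigmoid(X @ w)
print("Mean score gap after training:", p[s == 1].mean() - p[s == 0].mean())
```

Raising `lam` pushes the gap closer to zero at the cost of some predictive accuracy, which is exactly the trade-off such regularisation terms make explicit.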

Lastly, post-processing approaches adjust the outputs of AI models to remove bias and ensure fairness. This can be done by, for example, adjusting classification thresholds, using methods that tweak model decisions to achieve equalised odds, or giving favourable outcomes to unprivileged groups (Mahoney et al., 2022; Mehrabi et al., 2021).
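For example, a simple post-processing adjustment might replace a single decision threshold with per-group thresholds chosen so that both groups receive positive outcomes at roughly the same rate. The scores, groups and target rate in the sketch below are made up for illustration; methods aiming at equalised odds would instead pick thresholds based on group-specific error rates.

```python
import numpy as np

# Hypothetical model scores and sensitive groups on a validation set.
rng = np.random.default_rng(0)
scores = rng.uniform(size=200)
group = rng.choice(["A", "B"], size=200)

# Instead of one global cut-off, choose per-group thresholds so both groups
# end up with (roughly) the same selection rate.
target_rate = 0.30
thresholds = {g: np.quantile(scores[group == g], 1 - target_rate)
              for g in np.unique(group)}

decisions = np.array([scores[i] >= thresholds[group[i]] for i in range(len(scores))])
for g in np.unique(group):
    print(g, "selection rate:", decisions[group == g].mean().round(2))
```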

All these technical methods provide essential tools for mitigating bias, but fairness isn’t something that can be programmed into existence. While technical methods like audits for unfair bias, regularisation techniques or post-processing corrections are essential, they cannot achieve the goal of improving fairness on their own. Instead, work on fairness has to take into account the broader societal dynamics of how these systems are developed, who profits from them, how we interact with them, and what place and power we ascribe to them in our societies. That is why fairness must be treated not just as a technical goal but as an ethical and political one.

This calls for broad collaboration among stakeholders and diverse participatory approaches that actively involve marginalised communities not only in the development of these technologies but also in their governance (Leslie et al., 2024). It also means making fairness not just an add-on or a bonus, but an essential requirement in AI development. This entails standard-setting and legal frameworks that establish fairness as a sine qua non by mandating fairness checklists, auditing and documentation, as well as institutional oversight and human-in-the-loop processes (Alvarez et al., 2024; Leslie et al., 2024).

Ultimately, achieving fairness in AI means challenging how, why and for whom we build systems. It is not just about making systems work for most. It’s about making them work for everyone.

 

References

Alvarez, J. M., Colmenarejo, A. B., Elobaid, A., Fabbrizzi, S., Fahimi, M., Ferrara, A., Ghodsi, S., Mougan, C., Papageorgiou, I., Reyero, P., Russo, M., Scott, K. M., State, L., Zhao, X. & Ruggieri, S. (2024). Policy advice and best practices on bias and fairness in AI. Ethics And Information Technology, 26(2). https://doi.org/10.1007/s10676-024-09746-w.

Binns, R. (2018, January 21). Fairness in Machine Learning: Lessons from Political Philosophy. PMLR. https://proceedings.mlr.press/v81/binns18a.html.

Ferrara, E. (2023). Fairness and Bias in Artificial Intelligence: A Brief Survey of Sources, Impacts, and Mitigation Strategies. Sci, 6(1), 3. https://doi.org/10.3390/sci6010003.

Leslie, D., Rincon, C., Briggs, M., Perini, A., Jayadeva, S., Borda, A., Bennett, S., Burr, C., Aitken, M., Katell, M., Fischer, C., Wong, J. & Garcia, I. K. (2024). AI fairness in practice. The Alan Turing Institute. https://doi.org/10.5281/zenodo.10680527.

Mahoney, T., Varshney, K. R. & Hind, M. (2019). AI Fairness: How to Measure and Reduce Unwanted Bias in Machine Learning (1st ed.). O’Reilly Media, Inc.

Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K. & Galstyan, A. (2021). A Survey on Bias and Fairness in Machine Learning. ACM Computing Surveys, 54(6), 1–35. https://doi.org/10.1145/3457607.

 

Further Reading/Watching/Listening:

Books & Articles:

Barocas, S., Hardt, M. & Narayanan, A. (2023). Fairness and Machine Learning: Limitations and Opportunities. MIT Press.

Benjamin, R. (2019). Race after technology: Abolitionist Tools for the New Jim Code. Polity.

Eubanks, V. (2019). Automating inequality: How High-Tech Tools Profile, Police, and Punish the Poor. Picador.

O’Neil, C. (2016). Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. Crown.

 

Videos & Podcasts:

“AI Bias and Fairness” by Ava Soleimany at MIT.
 Watch on YouTube

“How I’m fighting bias in algorithms” by Joy Buolamwini, TED Talk.
 Watch on YouTube

More Than Just Math