Document: Measuring and Mitigating Bias in Machine Learning
Title: Measuring and Mitigating Bias in Machine Learning
Bookmark URL: https://docserv.uni-duesseldorf.de/servlets/DocumentServlet?id=69373
URN (NBN): urn:nbn:de:hbz:061-20250423-080540-7
Collection: Dissertations
Language: English
Document type: Academic theses » Dissertation
Media type: Text
Author: Duong, Manh Khoi [Author]
Contributors: Prof. Dr. Conrad, Stefan [Referee]; Prof. Dr. Leuschel, Michael [Referee]
Document type (extended): Dissertation
Dewey Decimal Classification: 000 Computer science, information & general works
Description: Machine learning has become increasingly prevalent in recent years and is used in a variety of applications, including support for decision-making processes. It can serve as a tool to predict outcomes from historical data; in practice, this ranges from predicting academic performance to calculating credit risk scores for potential borrowers. However, applications involving personal data can have harmful consequences if the machine learning models used are biased against certain groups of people. To prevent discrimination against subpopulations, fairness is a central concern.

On March 13, 2024, the European Parliament adopted the Artificial Intelligence Act (AI Act), which aims to regulate the use of AI in the European Union. One of the concerns of the AI Act is the fairness of AI systems: discrimination prohibited by European Union or national law is equally prohibited in AI systems. Since most of the rules are expected to come into force on August 2, 2026, with the rules for high-risk AI systems applying even earlier, research on making machine learning models fairer and more responsible has become indispensable. In this dissertation, we explore gaps in the literature on fairness in machine learning and propose novel methods to fill those gaps relevant to the AI Act.

The thesis begins by presenting works that emerged from a research project called Responsible Academic Performance Prediction (RAPP), which aimed to develop a responsible AI platform for predicting academic performance. These works highlight the importance of preventing discrimination in machine learning models. Recognizing that unfair predictions often stem from biased input data, we focus on this root cause and propose methods that mitigate discrimination in the data itself. While similar methods already exist in the literature, ours differ in that they can handle any type of discrimination, such as intersectional discrimination or discrimination against non-binary groups. This is pivotal, as most prior methods can only deal with binary groups and mitigate discrimination between one privileged and one unprivileged group. In reality, however, populations can be partitioned into more than two groups, and discrimination can occur among any of them. A common workaround for making earlier methods handle more than two groups is to merge several groups into one; this, however, further marginalizes already underrepresented groups and ignores the discrimination they face. Our methods overcome this limitation and prevent such groups from being ignored.

This is achieved by introducing a new fairness-agnostic framework, FairDo, that can be used with any fairness metric. The framework is flexible, allowing users to define their own fairness metrics and objectives for optimizing the data (illustrative sketches of both ideas follow this record). It also provides an option to address privacy concerns: synthetic data can be used when optimizing the data for fairness. With FairDo, certain statistical fairness properties of the data can be fulfilled, taking a step towards satisfying the AI Act. The thesis further contributes to different aspects of the introduced framework and provides a comprehensive evaluation of the methods in the respective papers. In line with good scientific practice, we have made our research reproducible and accessible by publishing all of our methods and experiments on GitHub. In particular, our fairness framework, FairDo, is additionally available on PyPI and comes with a documentation page (https://fairdo.readthedocs.io/en/latest/).
License: This work is licensed under a Creative Commons Attribution 4.0 International License.
Faculty / Institution: Faculty of Mathematics and Natural Sciences » Department of Computer Science » Databases and Information Systems
Document created on: 23.04.2025
Files last modified on: 23.04.2025
Application for doctorate submitted on: 13.11.2024
Date of doctorate: 07.04.2025
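
The abstract stresses that fairness must be measured across more than two groups, including intersectional ones formed from several protected attributes. The following is a minimal sketch of how such a multi-group statistical parity measure could look. It is an illustration only, not FairDo's actual API; the function name `max_statistical_disparity` and the toy data are hypothetical.

```python
# Hypothetical sketch of a multi-group fairness metric (NOT FairDo's API).
# Statistical parity is measured as the largest absolute gap in
# positive-outcome rates across all observed groups, so intersectional
# and non-binary groups are handled uniformly instead of being merged.
import pandas as pd


def max_statistical_disparity(df: pd.DataFrame,
                              protected: list[str],
                              label: str) -> float:
    """Largest pairwise gap in positive-outcome rates across groups.

    Groups are all observed combinations of the protected attributes,
    so no group has to be folded into a binary privileged/unprivileged
    split. A value of 0 means all groups receive positive outcomes at
    the same rate.
    """
    rates = df.groupby(protected)[label].mean()
    return float(rates.max() - rates.min())


if __name__ == "__main__":
    # Toy data: two protected attributes yield intersectional groups.
    data = pd.DataFrame({
        "gender":    ["f", "f", "m", "m", "nb", "nb", "f", "m"],
        "ethnicity": ["a", "b", "a", "b", "a",  "b",  "a", "b"],
        "hired":     [1,   0,   1,   1,   0,    0,    1,   1],
    })
    print(max_statistical_disparity(data, ["gender", "ethnicity"], "hired"))
```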
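The fairness-agnostic data-optimization idea described in the abstract can likewise be sketched generically: any metric with the shape above is plugged in as the objective, and the data is modified until the objective improves. The greedy removal loop below is a deliberately simple stand-in under that assumption; it does not reproduce FairDo's own optimizers, and `greedy_fair_subset` is a hypothetical name.

```python
# Hypothetical sketch of fairness-agnostic data optimization (NOT FairDo's
# implementation). Any callable mapping a DataFrame to a disparity score
# can serve as the objective; the loop greedily drops the row whose
# removal shrinks that score the most.
from typing import Callable

import pandas as pd

Metric = Callable[[pd.DataFrame], float]


def greedy_fair_subset(df: pd.DataFrame, metric: Metric,
                       max_removals: int = 50) -> pd.DataFrame:
    """Remove up to `max_removals` rows, one at a time, as long as each
    removal strictly lowers the fairness objective (lower = fairer)."""
    current = df.copy()
    for _ in range(max_removals):
        best_idx, best_val = None, metric(current)
        for idx in current.index:
            val = metric(current.drop(index=idx))
            if val < best_val:
                best_idx, best_val = idx, val
        if best_idx is None:  # no single removal improves fairness
            break
        current = current.drop(index=best_idx)
    return current


if __name__ == "__main__":
    data = pd.DataFrame({
        "group": ["a", "a", "b", "b", "b"],
        "hired": [1,   1,   0,   0,   1],
    })

    # Objective: gap in positive rates between groups (lower is fairer).
    def gap(d: pd.DataFrame) -> float:
        return abs(d[d.group == "a"].hired.mean()
                   - d[d.group == "b"].hired.mean())

    print(greedy_fair_subset(data, gap))
```

Composing the two sketches, `greedy_fair_subset(data, lambda d: max_statistical_disparity(d, ["gender", "ethnicity"], "hired"))` would return a subset of the toy data with a smaller worst-case rate gap; swapping in a different metric changes the notion of fairness without changing the optimizer, which is the fairness-agnostic principle the abstract describes.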