Information Theoretic Measures for Fairness-aware Supervised Learning
This page is dedicated to updates on the NSF-funded project "CRII: CIF: Information Theoretic Measures for Fairness-aware Supervised Learning".
Personnel
Missouri University of Science & Technology (NSF CRII: CIF 00090275), 10/01/2024–05/31/2026
- PI: Mohamed Nafea, Assistant Professor, Computer Engineering, Missouri S&T
- Graduate Research Assistant: Sokrat Aldarmini, PhD student, Computer Engineering, Missouri S&T
University of Detroit Mercy (NSF CRII: CIF 2246058), 06/01/2023–11/30/2024
- PI: Mohamed Nafea, Assistant Professor, ECE Department, University of Detroit Mercy
- Graduate Assistant: Mario Padilla Rodriguez, Master’s student
- Undergraduate Assistants: Sarwar Nazrul and Eyiara Oladipo
Award Information
This project is supported by the National Science Foundation (NSF) under Grants CCF 2246058/00090275. The awarded amount is $174,850.00.
Summary of Project Goals and Outcomes
Despite the growing success of machine learning (ML) systems in accomplishing complex tasks, their increasing use in making or aiding consequential decisions that affect people’s lives (e.g., university admission, healthcare, predictive policing) raises concerns about potential discriminatory practices. Unfair outcomes in ML systems result from historical biases in the data used to train them. A learning algorithm designed merely to minimize prediction error may inherit or even exacerbate such biases, particularly when observed attributes of individuals, critical for generating accurate decisions, are biased by their group identities (e.g., race or gender) due to existing social and cultural inequalities. Understanding and measuring these biases, at the data level, is a challenging yet crucial problem: it yields constructive insights and methodologies for debiasing the data and adapting the learning system to minimize discrimination, and it highlights the need for policy changes and infrastructural development. This project aims to establish a comprehensive framework for precisely quantifying the marginal impact of individuals’ attributes on the accuracy and unfairness of decisions, using tools from information theory, game theory, and causal inference, along with legal and social science definitions of fairness. This multi-disciplinary effort will provide guidelines and design insights for practitioners in the field of fair data-driven automated systems, and inform the public debate on the social consequences of artificial intelligence.
The majority of previous work formulates the fairness problem from the viewpoint of the learning algorithm: a statistical or counterfactual fairness constraint is enforced on the learner’s outcome, and a learner is designed to meet it. Since the fairness problem originates from biased data, merely adding constraints to the prediction task might not provide a holistic view of its fundamental limitations. This project looks at the fairness problem through a different lens: instead of asking “for a given learner, how can we achieve fairness?”, it asks “for a given dataset, what are the inherent tradeoffs in the data, and based on these, what is the best learner we can design?”. In supervised learning models, the challenge lies in the complex correlation/causation structures among individuals’ attributes (covariates), their group identities (protected features), the target variable (label), and the prediction outcome (decision). In analyzing the dataset, the marginal impacts of covariates on the accuracy and discrimination of decisions are quantified via carefully designed measures that account for these structures and for the inherent tension between the accuracy and fairness objectives. Subsequently, the project will investigate methods that exploit the quantified impacts to guide downstream ML systems toward an improved accuracy-fairness tradeoff. Importantly, the proposed framework provides explainable solutions, where the inclusion of certain attributes in the learning system is explained by their importance for accurate as well as fair decisions.
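To make the data-level viewpoint concrete, here is a minimal sketch, not the project’s proposed measures: it scores each covariate by a standard nonparametric estimate of its mutual information with the label (a proxy for accuracy relevance) and with the protected attribute (a proxy for discrimination risk), using scikit-learn on synthetic data. The variable names and data-generating process below are hypothetical.

```python
# Minimal illustration (assumed setup, not the project's methodology):
# score each covariate X_i by estimates of I(X_i; Y) and I(X_i; A).
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
n = 5000

# Hypothetical synthetic data: A is a binary protected attribute and
# Y a binary label whose base rate is shifted by A (historical bias).
A = rng.integers(0, 2, size=n)
Y = (rng.random(n) < 0.4 + 0.2 * A).astype(int)
X = np.column_stack([
    Y + rng.normal(0.0, 1.0, n),   # x0: informative about the label Y
    A + rng.normal(0.0, 0.5, n),   # x1: mostly a proxy for A
    rng.normal(0.0, 1.0, n),       # x2: pure noise
])

# Per-feature marginal scores: I(X_i; Y) measures usefulness for accurate
# prediction; I(X_i; A) measures how much the feature leaks group identity.
mi_label = mutual_info_classif(X, Y, random_state=0)
mi_protected = mutual_info_classif(X, A, random_state=0)

for i, (u, v) in enumerate(zip(mi_label, mi_protected)):
    print(f"x{i}: I(X;Y) ~ {u:.3f} nats, I(X;A) ~ {v:.3f} nats")
```

Such per-feature marginal scores ignore interactions among covariates; accounting for the joint correlation/causation structure is precisely why the project turns to game-theoretic attributions (e.g., Shapley-style decompositions) and causal inference rather than pairwise mutual information alone.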
This award reflects NSF’s statutory mission and has been deemed worthy of support through evaluation using the Foundation’s intellectual merit and broader impacts review criteria.
Selected Publications
- Mario Padilla Rodriguez and Mohamed Nafea. Benchmark for Centralized and Federated Heart Disease Classification Models Using UCI Heart-disease Dataset. Submitted to ECAI Demo 2024.
- Mohamed Nafea and Yahya H. Ezzeldin. Federated Fair, Transferable, and Personalizable Representations via Adversarial Training. In preparation.
Broader Impacts
July 2023: PI Mohamed Nafea delivered a two-hour tutorial titled “Towards Responsible and Private AI” to the incoming Summer-Bridge Science and Engineering Equity Development (SEED) students.
November 2023: PI Mohamed Nafea mentored MSU graduate student Ann Drew in delivering a short tutorial to high school students during the Detroit Mercy iDRAW day.
January 2024: PI Mohamed Nafea served as the ECE department representative and student mentor at the FIRST Robotics Competition Kickoff event held at Detroit Mercy and covered live on TV!
February–April 2024: PI Mohamed Nafea delivered a talk titled “Towards Responsible AI: Learning with Biased, Imperfect, and Decentralized Data” at Texas Tech University, CS; University of New Haven, ECE; and Missouri S&T, ECE.
Acknowledgment
This project is supported in part by the U.S. NSF under grant CCF 2246058. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSF.