NSF Award #2452330

Information Theoretic Measures for Fairness-aware Supervised Learning

This page provides updates on the NSF-funded project "CRII: CIF: Information Theoretic Measures for Fairness-aware Supervised Learning".

Personnel

Missouri University of Science & Technology (NSF CRII: CIF 2452330): 10/01/2024 – 06/30/2026

  • PI: Mohamed Nafea, Assistant Professor, Computer Engineering, Missouri S&T
  • Graduate Research Assistant: Sokrat Aldarmini, PhD student, Computer Engineering, Missouri S&T

University of Detroit Mercy (NSF CRII: CIF 2246058): 06/01/2023 – 11/30/2024

  • PI: Mohamed Nafea, Assistant Professor, ECE Department, University of Detroit Mercy
  • Graduate Assistant: Mario Padilla Rodriguez, Master's student
  • Undergraduate Assistants: Sarwar Nazrul and Eyiara Oladipo

Award Information

This project is supported by the National Science Foundation (NSF) under Grants CCF 2246058/00090275 and CCF 2452330. The total awarded amount is $174,850.00.

Summary of Project Goals and Outcomes

Despite the growing success of machine learning (ML) systems in accomplishing complex tasks, their increasing use in making or aiding consequential decisions that affect people's lives (e.g., university admission, healthcare, predictive policing) raises concerns about potential discriminatory practices. Unfair outcomes in ML systems result from historical biases in the data used to train them. A learning algorithm designed merely to minimize prediction error may inherit or even exacerbate such biases, particularly when observed attributes of individuals, critical for generating accurate decisions, are biased by their group identities (e.g., race or gender) due to existing social and cultural inequalities. Understanding and measuring these biases, at the data level, is a challenging yet crucial problem; it yields constructive insights and methodologies for debiasing the data and adapting the learning system to minimize discrimination, and it highlights the need for policy changes and infrastructural development. This project aims to establish a comprehensive framework for precisely quantifying the marginal impact of individuals' attributes on the accuracy and unfairness of decisions, using tools from information theory, game theory, and causal inference, along with legal and social-science definitions of fairness. This multidisciplinary effort will provide guidelines and design insights for practitioners in the field of fair data-driven automated systems, and inform the public debate on the social consequences of artificial intelligence.

The majority of previous work formulates the fairness problem from the viewpoint of the learning algorithm: it enforces a statistical or counterfactual fairness constraint on the learner's outcome and designs a learner that meets it. Since the fairness problem originates from biased data, merely adding constraints to the prediction task may not provide a holistic view of its fundamental limitations. This project looks at the fairness problem through a different lens: instead of asking "for a given learner, how can we achieve fairness?", it asks "for a given dataset, what are the inherent tradeoffs in the data, and based on these, what is the best learner we can design?". In supervised learning models, the challenge lies in the complex structures of correlation and causation among individuals' attributes (covariates), their group identities (protected features), the target variable (label), and the prediction outcome (decision). In analyzing the dataset, the marginal impacts of covariates on the accuracy and discrimination of decisions are quantified via carefully designed measures that account for these correlation/causation structures and for the inherent tension between the accuracy and fairness objectives. Subsequently, the project investigates methods to exploit the quantified impacts to guide downstream ML systems toward an improved accuracy-fairness tradeoff. Importantly, the proposed framework provides explainable solutions, where the inclusion of certain attributes in the learning system is justified by its importance for accurate as well as fair decisions; a toy illustration of this kind of marginal-impact analysis is sketched below.
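For concreteness, the following minimal Python sketch shows one simple way such a per-attribute decomposition can look. It is a hypothetical illustration, not the measures developed in this project: it attributes, via Shapley values, each covariate's marginal contribution to plug-in estimates of the mutual information I(X; Y) with the label (accuracy relevance) and I(X; A) with the protected attribute (bias relevance). The synthetic data, variable names, and plug-in estimator are all assumptions made for illustration.

    # Hypothetical illustration; not the measures developed in this project.
    from itertools import combinations
    from math import comb
    import numpy as np

    def entropy(cols):
        # Plug-in (empirical) entropy of the joint distribution of the given columns.
        counts = np.unique(np.column_stack(cols), axis=0, return_counts=True)[1]
        p = counts / counts.sum()
        return -np.sum(p * np.log2(p))

    def mutual_info(X_cols, t):
        # I(X_S; T) via the identity I = H(X_S) + H(T) - H(X_S, T).
        if not X_cols:
            return 0.0
        return entropy(X_cols) + entropy([t]) - entropy(X_cols + [t])

    def shapley_mi(X, t):
        # Shapley value of each feature's marginal contribution to I(X; T).
        n = X.shape[1]
        phi = np.zeros(n)
        for i in range(n):
            others = [j for j in range(n) if j != i]
            for k in range(n):
                for S in combinations(others, k):
                    weight = 1.0 / (n * comb(n - 1, k))  # |S|!(n-|S|-1)!/n!
                    with_i = mutual_info([X[:, j] for j in S] + [X[:, i]], t)
                    without_i = mutual_info([X[:, j] for j in S], t)
                    phi[i] += weight * (with_i - without_i)
        return phi

    rng = np.random.default_rng(0)
    A = rng.integers(0, 2, 1000)                        # protected attribute
    X1 = A ^ (rng.random(1000) < 0.2).astype(A.dtype)   # noisy copy of A
    X2 = rng.integers(0, 2, 1000)                       # independent covariate
    Y = X1 ^ X2                                         # label depends on both
    X = np.column_stack([X1, X2])

    print("contribution to I(X; Y):", shapley_mi(X, Y))  # accuracy relevance
    print("contribution to I(X; A):", shapley_mi(X, A))  # bias relevance

In this toy example, X2 receives a sizable accuracy attribution but a near-zero bias attribution, while X1 contributes to both; separating these two roles per attribute, under realistic correlation/causation structures and with principled estimators, is what the project's measures formalize.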

This award reflects NSF’s statutory mission and has been deemed worthy of support through evaluation using the Foundation’s intellectual merit and broader impacts review criteria.

Supported Publications
  1. S. Aldarmini and M. Nafea. Information-theoretic quantification of inherent discrimination bias in training data for supervised learning. In 2nd Workshop on Navigating and Addressing Data Problems for Foundation Models at the International Conference on Learning Representations (DATA-FM @ ICLR 2025), April 2025. (No formal proceedings; available on OpenReview.)
  2. S. Aldarmini and M. Nafea. Model-agnostic Shapley-value analysis benchmark for data feature attribution in explainable and fair machine learning. Work in progress, 2025.

Related work (not supported by this award):

  1. E. Oladipo, S. Nazrul, and M. Nafea. Benchmarking deep learning architectures for ECG-based multi-label heart disease prediction using the MIMIC-IV database. Accepted for publication in the 38th IEEE International Symposium on Computer-Based Medical Systems (CBMS), Madrid, Spain, June 2025.
  2. M. P. Rodriguez, E. Oladipo, and M. Nafea. Centralized and federated heart disease classification using the UCI dataset: A benchmark with interpretability analysis. Accepted for publication in the IEEE Evolution Life Member Conference, Boston, MA, June 2025.
  3. M. Nafea and Y. H. Ezzeldin. Federated Fair, Transferable, and Personalizable Representations via Adversarial Training. Work in progress, 2025.

Broader Impacts
  • May 2025: Eyiara Oladipo graduated with a bachelor's degree in computer science from Detroit Mercy.
  • January 2025: Yeva Vainerman, a sophomore computer science student at Missouri S&T, joined our research team.
  • December 2024: Sarwar Nazrul graduated with a bachelor's degree in computer science from Detroit Mercy.
  • August 2024: PI Nafea joined the Missouri S&T ECE department as a tenure-track Assistant Professor in computer engineering.
  • August 2024: Master's student Mario Padilla Rodriguez successfully defended his master's thesis and graduated from our research team in Fall 2024.
  • June 2024: Eyiara Oladipo and Sarwar Nazrul, senior computer science students at Detroit Mercy, joined our research team.
  • May 2024: Sokrat Aldarmini joined our research team as its first PhD student; he is funded by this NSF award.
  • February–April 2024: PI Nafea delivered a talk titled "Towards Responsible AI: Learning with Biased, Imperfect, and Decentralized Data" at Texas Tech University (CS), the University of New Haven (ECE), and Missouri S&T (ECE).
  • January 2024: PI Nafea served as the ECE department representative and student mentor at the FIRST Robotics Competition Kickoff event held at Detroit Mercy and covered by live TV!
  • November 2023: PI Nafea mentored MSU graduate student Ann Drew in delivering a short tutorial to high school students during Detroit Mercy's iDRAW day.
  • July 2023: PI Nafea delivered a two-hour tutorial titled "Towards Responsible and Private AI" to the incoming Summer-Bridge Science and Engineering Equity Development (SEED) students.

Acknowledgment

This project is supported in part by the U.S. NSF under Grants CCF 2246058 and CCF 2452330. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSF.