TY - GEN
T1 - A comparative study of fairness-enhancing interventions in machine learning
AU - Friedler, Sorelle A.
AU - Choudhary, Sonam
AU - Scheidegger, Carlos
AU - Hamilton, Evan P.
AU - Venkatasubramanian, Suresh
AU - Roth, Derek
N1 - Publisher Copyright:
© 2019 Copyright held by the owner/author(s).
PY - 2019/1/29
Y1 - 2019/1/29
N2 - Computers are increasingly used to make decisions that have significant impact on people's lives. Often, these predictions can affect different population subgroups disproportionately. As a result, the issue of fairness has received much recent interest, and a number of fairness-enhanced classifiers have appeared in the literature. This paper seeks to study the following questions: how do these different techniques fundamentally compare to one another, and what accounts for the differences? Specifically, we seek to bring attention to many under-appreciated aspects of such fairness-enhancing interventions that require investigation for these algorithms to receive broad adoption. We present the results of an open benchmark we have developed that lets us compare a number of different algorithms under a variety of fairness measures and existing datasets. We find that although different algorithms tend to prefer specific formulations of fairness preservation, many of these measures strongly correlate with one another. In addition, we find that fairness-preserving algorithms tend to be sensitive to fluctuations in dataset composition (simulated in our benchmark by varying training-test splits) and to different forms of preprocessing, indicating that fairness interventions might be more brittle than previously thought.
AB - Computers are increasingly used to make decisions that have significant impact on people's lives. Often, these predictions can affect different population subgroups disproportionately. As a result, the issue of fairness has received much recent interest, and a number of fairness-enhanced classifiers have appeared in the literature. This paper seeks to study the following questions: how do these different techniques fundamentally compare to one another, and what accounts for the differences? Specifically, we seek to bring attention to many under-appreciated aspects of such fairness-enhancing interventions that require investigation for these algorithms to receive broad adoption. We present the results of an open benchmark we have developed that lets us compare a number of different algorithms under a variety of fairness measures and existing datasets. We find that although different algorithms tend to prefer specific formulations of fairness preservation, many of these measures strongly correlate with one another. In addition, we find that fairness-preserving algorithms tend to be sensitive to fluctuations in dataset composition (simulated in our benchmark by varying training-test splits) and to different forms of preprocessing, indicating that fairness interventions might be more brittle than previously thought.
KW - Benchmarks
KW - Fairness-aware machine learning
UR - http://www.scopus.com/inward/record.url?scp=85061831874&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85061831874&partnerID=8YFLogxK
U2 - 10.1145/3287560.3287589
DO - 10.1145/3287560.3287589
M3 - Conference contribution
AN - SCOPUS:85061831874
T3 - FAT* 2019 - Proceedings of the 2019 Conference on Fairness, Accountability, and Transparency
SP - 329
EP - 338
BT - FAT* 2019 - Proceedings of the 2019 Conference on Fairness, Accountability, and Transparency
PB - Association for Computing Machinery, Inc
T2 - 2019 ACM Conference on Fairness, Accountability, and Transparency, FAT* 2019
Y2 - 29 January 2019 through 31 January 2019
ER -