arXiv: 1801.05398 · v3 · pith:YBEWM6MInew · submitted 2018-01-16 · cs.IT · cs.LG · math.IT · stat.ML

On the Direction of Discrimination: An Information-Theoretic Analysis of Disparate Impact in Machine Learning

classification cs.IT cs.LG math.IT stat.ML
keywords disparate distributions impact output correction function model discrimination

In the context of machine learning, disparate impact refers to a form of systematic discrimination whereby the output distribution of a model depends on the value of a sensitive attribute (e.g., race or gender). In this paper, we propose an information-theoretic framework to analyze the disparate impact of a binary classification model. We view the model as a fixed channel, and quantify disparate impact as the divergence in output distributions over two groups. Our aim is to find a correction function that can perturb the input distributions of each group to align their output distributions. We present an optimization problem that can be solved to obtain a correction function that will make the output distributions statistically indistinguishable. We derive closed-form expressions to efficiently compute the correction function, and demonstrate the benefits of our framework on a recidivism prediction problem based on the ProPublica COMPAS dataset.
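
To make the idea of "divergence in output distributions over two groups" concrete, here is a minimal sketch in Python. It assumes the divergence is the Kullback-Leibler divergence between the Bernoulli output distributions of a binary classifier; the abstract does not say which divergence the paper uses, so this choice, the helper names `positive_rate` and `bernoulli_kl`, and the toy predictions are all illustrative assumptions rather than the authors' method.

```python
import math


def positive_rate(predictions):
    """Empirical P(Y_hat = 1): the fraction of predictions equal to 1."""
    return sum(predictions) / len(predictions)


def bernoulli_kl(p, q, eps=1e-12):
    """KL divergence D(Bern(p) || Bern(q)) in nats.

    Probabilities are clipped to [eps, 1 - eps] to avoid log(0).
    """
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


# Hypothetical binary model outputs for two groups (not real data).
group_a = [1, 1, 0, 1, 0, 1, 1, 0]
group_b = [0, 1, 0, 0, 0, 1, 0, 0]

rate_a = positive_rate(group_a)
rate_b = positive_rate(group_b)

# Assumed disparate impact measure: zero when both groups receive
# positive predictions at the same rate, and larger as the rates diverge.
divergence = bernoulli_kl(rate_a, rate_b)
print(f"rate_a={rate_a:.3f} rate_b={rate_b:.3f} KL={divergence:.4f}")
```

Under this reading, a correction function would perturb each group's input distribution until a quantity like `divergence` is driven to (near) zero, while the model itself stays fixed.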
