Investigating Human + Machine Complementarity for Recidivism Predictions
When might human input help (or not) when assessing risk in fairness domains? Dressel and Farid (2018) asked Mechanical Turk workers to evaluate a subset of defendants in the ProPublica COMPAS data for risk of recidivism, and concluded that COMPAS predictions were no more accurate or fair than predictions made by humans. We delve deeper into this claim to explore differences in human and algorithmic decision making. We construct a Human Risk Score based on the predictions made by multiple Turk workers, characterize the features that determine agreement and disagreement between COMPAS and Human Scores, and construct hybrid Human+Machine models to predict recidivism. Our key finding is that on this data set, Human and COMPAS decision making differed, but not in ways that could be leveraged to significantly improve ground-truth prediction. We present the results of our analyses and suggestions for data collection best practices to leverage the complementary strengths of humans and machines in the fairness domain.
Forward citations
Cited by 2 Pith papers
- Epistemology gives a Future to Complementarity in Human-AI Interactions
  Complementarity in human-AI teams serves as evidence of epistemic reliability within a justificatory framework rather than acting as a standalone post-hoc accuracy metric.
- AI, Meet Human: Learning Paradigms for Hybrid Decision Making Systems
  Proposes a taxonomy of Hybrid Decision Making Systems as a conceptual and technical framework for modeling human-machine interaction in machine learning literature.