Robustness Disparities in Commercial Face Detection
read the original abstract
Facial detection and analysis systems have been deployed by large companies and critiqued by scholars and activists for the past decade. Critiques that focus on system performance analyze disparity of the system's output, i.e., how frequently is a face detected for different Fitzpatrick skin types or perceived genders. However, we focus on the robustness of these system outputs under noisy natural perturbations. We present the first of its kind detailed benchmark of the robustness of three such systems: Amazon Rekognition, Microsoft Azure, and Google Cloud Platform. We use both standard and recently released academic facial datasets to quantitatively analyze trends in robustness for each. Across all the datasets and systems, we generally find that photos of individuals who are older, masculine presenting, of darker skin type, or have dim lighting are more susceptible to errors than their counterparts in other identities.
This paper has not been read by Pith yet.
Forward citations
Cited by 1 Pith paper
-
Toward Calibrated, Fair, and accurate Deepfake Detection
Face-Feature Tuning is a label-free logit remapping method that reduces FPR/TPR gaps across groups in deepfake detection while preserving overall accuracy.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.