
arXiv: 1707.09457 · v1 · pith:MDTUU4IM · submitted 2017-07-29 · cs.AI · cs.CL · cs.CV · stat.ML

Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints

classification: cs.AI · cs.CL · cs.CV · stat.ML
keywords: bias · models · visual · amplification · classification · constraints · corpus-level · datasets


Language is increasingly being used to define rich visual recognition problems with supporting image collections sourced from the web. Structured prediction models are used in these tasks to take advantage of correlations between co-occurring labels and visual input but risk inadvertently encoding social biases found in web corpora. In this work, we study data and models associated with multilabel object classification and visual semantic role labeling. We find that (a) datasets for these tasks contain significant gender bias and (b) models trained on these datasets further amplify existing bias. For example, the activity cooking is over 33% more likely to involve females than males in a training set, and a trained model further amplifies the disparity to 68% at test time. We propose to inject corpus-level constraints for calibrating existing structured prediction models and design an algorithm based on Lagrangian relaxation for collective inference. Our method results in almost no performance loss for the underlying recognition task but decreases the magnitude of bias amplification by 47.5% and 40.5% for multilabel classification and visual semantic role labeling, respectively.
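The abstract's notion of bias amplification can be illustrated with a small sketch. It assumes one plausible measure: the share of an activity's instances labelled with each gender, compared between the training set and the model's test-time predictions. The counts, the function names `gender_share` and `disparity`, and the choice of a simple difference in shares are all hypothetical illustrations, not figures or code from the paper.

```python
from collections import Counter


def gender_share(counts, activity, gender):
    """Fraction of `activity` instances that carry the label `gender`."""
    total = sum(n for (act, _), n in counts.items() if act == activity)
    if total == 0:
        raise ValueError(f"no instances of activity {activity!r}")
    return counts[(activity, gender)] / total


def disparity(counts, activity):
    """Female share minus male share for one activity (assumed measure)."""
    return (gender_share(counts, activity, "female")
            - gender_share(counts, activity, "male"))


# Hypothetical counts, invented for illustration only.
train_counts = Counter({("cooking", "female"): 60, ("cooking", "male"): 40})
pred_counts = Counter({("cooking", "female"): 75, ("cooking", "male"): 25})

train_gap = disparity(train_counts, "cooking")
pred_gap = disparity(pred_counts, "cooking")

# A positive value means the predictions are more skewed than the training data.
amplification = pred_gap - train_gap

print(f"training disparity:  {train_gap:.2f}")
print(f"predicted disparity: {pred_gap:.2f}")
print(f"amplification:       {amplification:.2f}")
```

In this toy setup the predictions widen the gap that already exists in the training data, which is the effect the corpus-level constraints are meant to reduce.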

This paper has not been read by Pith yet.


Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling

    cs.CL 2023-04 accept novelty 8.0

    Pythia releases 16 identically trained LLMs with full checkpoints and data tools to study training dynamics, scaling, memorization, and bias in language models.

  2. BRIDGE the Gap: Mitigating Bias Amplification in Automated Scoring of English Language Learners via Inter-group Data Augmentation

    cs.CL 2026-02 unverdicted novelty 7.0

    BRIDGE reduces bias against high-scoring ELL students in automated scoring by generating synthetic samples via inter-group content pasting and quality discrimination, achieving fairness gains comparable to additional ...

  3. Segment Anything

    cs.CV 2023-04 unverdicted novelty 7.0

    A promptable model trained on 1B masks achieves competitive zero-shot segmentation performance across tasks and is released publicly with its dataset.

  4. Privacy Beyond Pixels: Latent Anonymization for Privacy-Preserving Video Understanding

    cs.CV 2025-11 conditional novelty 6.0

    A plug-and-play Anonymizing Adapter Module removes private information from video latent features using self-supervised privacy objectives and consistency losses while retaining utility on action recognition, temporal...