
arxiv: 1805.04508 · v1 · pith:F2WVFUIZnew · submitted 2018-05-11 · 💻 cs.CL

Examining Gender and Race Bias in Two Hundred Sentiment Analysis Systems

classification 💻 cs.CL
keywords systems, biases, examining, inappropriate, sentiment, analysis, automatic, bias


Automatic machine learning systems can inadvertently accentuate and perpetuate inappropriate human biases. Past work on examining inappropriate biases has largely focused on just individual systems. Further, there is no benchmark dataset for examining inappropriate biases in systems. Here for the first time, we present the Equity Evaluation Corpus (EEC), which consists of 8,640 English sentences carefully chosen to tease out biases towards certain races and genders. We use the dataset to examine 219 automatic sentiment analysis systems that took part in a recent shared task, SemEval-2018 Task 1 'Affect in Tweets'. We find that several of the systems show statistically significant bias; that is, they consistently provide slightly higher sentiment intensity predictions for one race or one gender. We make the EEC freely available.
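As an illustration of what "consistently provide slightly higher sentiment intensity predictions for one race or one gender" could look like in practice, here is a minimal sketch in Python. It assumes that each sentence exists in two versions that differ only in a gendered word, and that a system has scored both versions. The paired t statistic is an assumed choice of test, since the abstract does not name one, and all scores below are invented for illustration; they are not results from the paper.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Return (mean difference, t statistic) for two paired score lists.

    scores_a[i] and scores_b[i] are assumed to be one system's predictions
    for two sentences that differ only in the group-identifying word.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must have the same length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two sentence pairs")
    mean_diff = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences are constant; t statistic is undefined")
    return mean_diff, mean_diff / (sd / math.sqrt(n))


# Hypothetical predictions for four sentence pairs (not data from the paper).
female_scores = [0.52, 0.61, 0.48, 0.70]
male_scores = [0.50, 0.60, 0.45, 0.69]

mean_diff, t_stat = paired_t_statistic(female_scores, male_scores)
print(f"mean difference = {mean_diff:.4f}, t = {t_stat:.2f}")
```

A small but consistent mean difference can still yield a large t statistic, because the test looks at how stable the gap is across pairs rather than at its size alone. Turning the statistic into a p-value would require the t distribution with n - 1 degrees of freedom, which is left out of this sketch.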



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. SPAGBias: Uncovering and Tracing Structured Spatial Gender Bias in Large Language Models

    cs.CL 2026-04 unverdicted novelty 7.0

    SPAGBias reveals that LLMs form nuanced gender associations with specific urban micro-spaces that exceed real-world distributions and produce failures in planning and descriptive tasks.