On Measuring Social Biases in Sentence Encoders

Chandler May, Alex Wang, Shikha Bordia, Samuel R · 2019 · cs.CL · arXiv 1903.10561

6 Pith papers cite this work. Polarity classification is still indexing.



abstract

The Word Embedding Association Test shows that GloVe and word2vec word embeddings exhibit human-like implicit biases based on gender, race, and other social constructs (Caliskan et al., 2017). Meanwhile, research on learning reusable text representations has begun to explore sentence-level texts, with some sentence encoders seeing enthusiastic adoption. Accordingly, we extend the Word Embedding Association Test to measure bias in sentence encoders. We then test several sentence encoders, including state-of-the-art methods such as ELMo and BERT, for the social biases studied in prior work and two important biases that are difficult or impossible to test at the word level. We observe mixed results including suspicious patterns of sensitivity that suggest the test's assumptions may not hold in general. We conclude by proposing directions for future work on measuring bias in sentence encoders.
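To make the test named in the abstract concrete, here is a minimal sketch of the association effect size as defined by Caliskan et al. (2017), written in plain Python. It assumes the target sets X and Y and the attribute sets A and B have already been encoded as vectors by some word or sentence encoder. The function names and the toy two-dimensional vectors are illustrative assumptions, not the authors' code, and the use of the sample standard deviation is a choice made here for the sketch.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(w, A, B):
    """s(w, A, B): mean similarity of w to set A minus mean similarity to set B."""
    return mean(cosine(w, a) for a in A) - mean(cosine(w, b) for b in B)


def effect_size(X, Y, A, B):
    """Difference of mean associations of X and Y, normalized by the
    standard deviation of the associations over X and Y combined.
    Assumption of this sketch: sample standard deviation (n - 1)."""
    s_x = [association(x, A, B) for x in X]
    s_y = [association(y, A, B) for y in Y]
    return (mean(s_x) - mean(s_y)) / stdev(s_x + s_y)


# Hypothetical toy vectors standing in for encoder outputs.
X = [[1.0, 0.1], [0.9, 0.2]]   # target concept 1
Y = [[0.1, 1.0], [0.2, 0.9]]   # target concept 2
A = [[1.0, 0.0]]               # attribute 1
B = [[0.0, 1.0]]               # attribute 2

d = effect_size(X, Y, A, B)
print(d)  # positive when X leans toward A and Y leans toward B
```

For a sentence-level variant, the same functions could be applied unchanged to vectors produced by a sentence encoder. How the input sentences are constructed is outside the scope of this sketch.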

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks

cs.CL · 2019-08-27 · unverdicted · novelty 8.0

Sentence-BERT adapts BERT with siamese and triplet networks to produce sentence embeddings for efficient cosine-similarity comparisons, cutting computation time from hours to seconds on similarity search while matching BERT accuracy.

Good Secretaries, Bad Truck Drivers? Occupational Gender Stereotypes in Sentiment Analysis

cs.CL · 2019-06-24 · unverdicted · novelty 6.0

The authors release a new 800-sentence, gender-balanced profession dataset and use it to test for occupational gender stereotypes in three sentiment analysis models.

LaMDA: Language Models for Dialog Applications

cs.CL · 2022-01-20 · unverdicted · novelty 6.0

LaMDA shows that fine-tuning on human-value annotations and consulting external knowledge sources significantly improves safety and factual grounding in large dialog models beyond what scaling alone achieves.

DebiasRAG: A Tuning-Free Path to Fair Generation in Large Language Models through Retrieval-Augmented Generation

cs.CL · 2026-05-15 · unverdicted · novelty 5.0

DebiasRAG uses a three-stage RAG process to generate and rerank query-specific debiasing contexts that act as fairness constraints for LLM outputs.

Galactica: A Large Language Model for Science

cs.CL · 2022-11-16 · unverdicted · novelty 5.0

Galactica, a science-specialized LLM, reports higher scores than GPT-3, Chinchilla, and PaLM on LaTeX knowledge, mathematical reasoning, and medical QA benchmarks while outperforming general models on BIG-bench.

Bias in Large Language Models: Origin, Evaluation, and Mitigation

cs.CL · 2024-11-16 · unverdicted · novelty 2.0

A literature review that categorizes bias in LLMs, surveys evaluation and mitigation techniques, and discusses ethical implications.

citing papers explorer

Showing 6 of 6 citing papers.

Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
cs.CL · 2019-08-27 · unverdicted · none · ref 22

Good Secretaries, Bad Truck Drivers? Occupational Gender Stereotypes in Sentiment Analysis
cs.CL · 2019-06-24 · unverdicted · none · ref 22 · internal anchor

LaMDA: Language Models for Dialog Applications
cs.CL · 2022-01-20 · unverdicted · none · ref 66

DebiasRAG: A Tuning-Free Path to Fair Generation in Large Language Models through Retrieval-Augmented Generation
cs.CL · 2026-05-15 · unverdicted · none · ref 37 · internal anchor

Galactica: A Large Language Model for Science
cs.CL · 2022-11-16 · unverdicted · none · ref 103

Bias in Large Language Models: Origin, Evaluation, and Mitigation
cs.CL · 2024-11-16 · unverdicted · none · ref 50 · internal anchor
