Bias and fairness in large language models: A survey.Computational Linguistics, 50(3):1097–1179, 2024

Isabel O Gallegos, Ryan A Rossi, Joe Barrow, Md Mehrab Tanjim, Sungchul Kim, Franck Dernoncourt, Tong Yu, Ruiyi Zhang, Nesreen K Ahmed · 2024

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

browse 2 citing papers

representative citing papers

When Efficiency Backfires: Cascading LLMs Trigger Cascade Failure under Adversarial Attack

cs.CR · 2026-05-17 · unverdicted · novelty 6.0

LLM cascade systems are vulnerable to a new adversarial attack that simultaneously degrades accuracy and destroys the intended cost savings by targeting both the lightweight models and the escalation decision mechanism.

Guardian-as-an-Advisor: Advancing Next-Generation Guardian Models for Trustworthy LLMs

cs.LG · 2026-04-08 · unverdicted · novelty 5.0

Guardian-as-an-Advisor prepends risk labels and explanations from a guardian model to queries, improving LLM safety compliance and reducing over-refusal while adding minimal compute overhead.

citing papers explorer

Showing 2 of 2 citing papers.

When Efficiency Backfires: Cascading LLMs Trigger Cascade Failure under Adversarial Attack cs.CR · 2026-05-17 · unverdicted · none · ref 37
LLM cascade systems are vulnerable to a new adversarial attack that simultaneously degrades accuracy and destroys the intended cost savings by targeting both the lightweight models and the escalation decision mechanism.
Guardian-as-an-Advisor: Advancing Next-Generation Guardian Models for Trustworthy LLMs cs.LG · 2026-04-08 · unverdicted · none · ref 18
Guardian-as-an-Advisor prepends risk labels and explanations from a guardian model to queries, improving LLM safety compliance and reducing over-refusal while adding minimal compute overhead.

Bias and fairness in large language models: A survey.Computational Linguistics, 50(3):1097–1179, 2024

fields

years

verdicts

representative citing papers

citing papers explorer