Bias in Large Language Models: Origin, Evaluation, and Mitigation
Pith reviewed 2026-05-23 17:04 UTC · model grok-4.3
The pith
Biases in large language models arise from data and context and can be detected and reduced through staged evaluation and mitigation methods.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper establishes that bias in LLMs takes two forms: intrinsic bias rooted in training data and architecture, and extrinsic bias introduced during application. Both can be measured with data-level, model-level, and output-level methods and addressed with pre-model, intra-model, and post-model mitigation techniques, which together support the development of fairer AI systems.
What carries the argument
The categorization of biases into intrinsic and extrinsic types, together with the division of evaluation into data, model, and output levels and of mitigation into pre-model, intra-model, and post-model stages.
If this is right
- Biased models can produce harmful decisions in healthcare and criminal justice applications.
- Evaluation at multiple levels (data, model, output) allows earlier detection of bias than output checks alone.
- Mitigation works best when applied across the full model lifecycle rather than at a single stage.
- Legal and ethical review of LLM deployments must account for both intrinsic and extrinsic bias sources.
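As a rough illustration of what a data-level check might look like, the sketch below counts how often occupation words co-occur with gendered terms in a toy corpus. The corpus, the word lists, and the function name `cooccurrence_counts` are hypothetical and chosen only for illustration; they are not taken from the reviewed paper, and a real audit would rely on curated lexicons and a far larger sample.

```python
from collections import Counter
import re

# Hypothetical word lists; a real audit would use curated lexicons.
GROUP_TERMS = {"female": {"she", "her", "woman"}, "male": {"he", "his", "man"}}
TARGET_TERMS = {"nurse", "engineer"}


def cooccurrence_counts(sentences):
    """Count sentences in which a target term appears alongside a group term."""
    counts = Counter()
    for sentence in sentences:
        # Whole-word tokens, lowercased, so "woman" never matches "man".
        tokens = set(re.findall(r"[a-z]+", sentence.lower()))
        for target in TARGET_TERMS & tokens:
            for group, terms in GROUP_TERMS.items():
                if terms & tokens:
                    counts[(target, group)] += 1
    return counts


# Toy corpus, invented for this example.
corpus = [
    "She is a nurse at the clinic.",
    "He is an engineer.",
    "The engineer said he was late.",
    "The nurse said she was tired.",
    "The woman is an engineer.",
]
counts = cooccurrence_counts(corpus)
for key in sorted(counts):
    print(key, counts[key])
```

A skew in these counts does not by itself show that a model trained on the corpus is biased, but it is the kind of signal that can be inspected before any training or output checks take place.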
Where Pith is reading between the lines
- The framework could be extended by testing whether new task-specific biases in areas such as code generation fit the existing intrinsic/extrinsic split.
- A practical next step would be to map the mitigation techniques onto measurable fairness metrics that regulators could adopt.
- If the staged approach proves effective, model developers might adopt it as a default checklist during training and deployment.
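One way to picture a "measurable fairness metric" is the demographic parity difference, a commonly used measure that the reviewed material does not itself name; it is assumed here purely as an example. The sketch computes the gap in positive-outcome rates between groups. The function name and the toy data are hypothetical.

```python
def demographic_parity_difference(outcomes, groups):
    """Return (gap, rates): the largest gap in positive-outcome rate across groups."""
    totals, positives = {}, {}
    for outcome, group in zip(outcomes, groups):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + int(outcome)
    rates = {g: positives[g] / totals[g] for g in totals}
    return max(rates.values()) - min(rates.values()), rates


# Toy data, invented for this example: 1 = favourable model decision.
outcomes = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

gap, rates = demographic_parity_difference(outcomes, groups)
print(rates)  # group "a" receives 3 of 4 favourable decisions, group "b" 1 of 4
print(gap)    # 0.5
```

A gap of zero would mean every group receives favourable outcomes at the same rate. Whether that is the right target depends on the application, which is why a mapping from mitigation techniques to metrics would need to be made case by case.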
Load-bearing premise
The review assumes that its chosen categorization of biases into intrinsic and extrinsic, along with the selected evaluation and mitigation methods, provides a representative and useful overview of the field without significant omissions or selection bias in the surveyed literature.
What would settle it
A comprehensive survey of recent LLM bias papers that identifies a major category or effective mitigation approach falling outside the intrinsic/extrinsic and pre/intra/post-model frameworks would show the review's organization is incomplete.
Original abstract
Large Language Models (LLMs) have revolutionized natural language processing, but their susceptibility to biases poses significant challenges. This comprehensive review examines the landscape of bias in LLMs, from its origins to current mitigation strategies. We categorize biases as intrinsic and extrinsic, analyzing their manifestations in various NLP tasks. The review critically assesses a range of bias evaluation methods, including data-level, model-level, and output-level approaches, providing researchers with a robust toolkit for bias detection. We further explore mitigation strategies, categorizing them into pre-model, intra-model, and post-model techniques, highlighting their effectiveness and limitations. Ethical and legal implications of biased LLMs are discussed, emphasizing potential harms in real-world applications such as healthcare and criminal justice. By synthesizing current knowledge on bias in LLMs, this review contributes to the ongoing effort to develop fair and responsible AI systems. Our work serves as a comprehensive resource for researchers and practitioners working towards understanding, evaluating, and mitigating bias in LLMs, fostering the development of more equitable AI technologies.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper is a review that categorizes bias in LLMs as intrinsic versus extrinsic, surveys evaluation methods at data/model/output levels, organizes mitigation into pre/intra/post-model techniques, and discusses ethical/legal implications in applications such as healthcare and justice, with the central claim that this synthesis advances fair and responsible AI.
Significance. A well-structured synthesis of bias literature could provide a practical reference for the field if the coverage is representative; however, the lack of any documented search protocol means the contribution rests on unverified curation rather than systematic aggregation.
major comments (1)
- [Abstract and Introduction] The manuscript states it provides a 'comprehensive review' and 'synthesizing current knowledge' (abstract) but contains no description of literature search methodology, databases, keywords, date ranges, inclusion criteria, or number of papers screened. This omission directly undermines the representativeness claim and the utility of the intrinsic/extrinsic and pre/intra/post taxonomies as a 'robust toolkit'.
Simulated Author's Rebuttal
We thank the referee for their thoughtful feedback on our review paper. We address the major comment regarding the absence of a documented literature search methodology below and outline revisions that will improve transparency while preserving the paper's value as a synthesized reference.
Point-by-point responses
- Referee: [Abstract and Introduction] The manuscript states it provides a 'comprehensive review' and 'synthesizing current knowledge' (abstract) but contains no description of literature search methodology, databases, keywords, date ranges, inclusion criteria, or number of papers screened. This omission directly undermines the representativeness claim and the utility of the intrinsic/extrinsic and pre/intra/post taxonomies as a 'robust toolkit'.
  Authors: We acknowledge the validity of this observation. The current manuscript presents a narrative synthesis of key literature rather than a systematic review following protocols such as PRISMA. To address the concern, we will add a dedicated subsection (likely in the Introduction) that explicitly describes the literature scope: primary sources include arXiv, ACL Anthology, NeurIPS, and Google Scholar; coverage focuses on works from 2018 to October 2024; inclusion was guided by relevance to LLM bias origins, evaluation benchmarks, and mitigation strategies, with emphasis on highly cited and representative papers. We will also moderate phrasing from 'comprehensive review' to 'extensive review' and 'synthesizing current knowledge' to 'synthesizing key developments' to align with the narrative nature of the work. These changes will clarify the basis for the intrinsic/extrinsic and pre/intra/post categorizations without overstating systematic aggregation, thereby supporting their utility as a practical reference.
  Revision: yes
Circularity Check
Review paper aggregates external literature without internal circular derivations
Full rationale
This is a review paper synthesizing existing literature on LLM bias. The abstract describes categorization of biases (intrinsic/extrinsic) and mitigation strategies (pre/intra/post-model) drawn from surveyed works, with no original equations, predictions, fitted parameters, or derivations presented. No load-bearing self-citations or self-definitional steps are evident in the provided text; the central synthesis claim rests on external sources rather than reducing to inputs defined within the paper itself.
Forward citations
Cited by 7 Pith papers
- ReLay: Personalized LLM-Generated Plain-Language Summaries for Better Understanding, but at What Cost?
  Personalized LLM-generated plain language summaries improve lay readers' comprehension and quality ratings but increase risks of reinforcing biases and introducing hallucinations compared to static expert summaries.
- Counting Worlds Branching Time Semantics for post-hoc Bias Mitigation in generative AI
  CTLF is a branching-time logic with counting-worlds semantics for verifying fairness in probability distributions over protected attributes, predicting bias bounds, and calculating outputs to remove in generative AI series.
- When AI reviews science: Can we trust the referee?
  AI peer review systems are vulnerable to prompt injections, prestige biases, assertion strength effects, and contextual poisoning, as demonstrated by a new attack taxonomy and causal experiments on real conference sub...
- Safe for Whom? Rethinking How We Evaluate the Safety of LLMs for Real Users
  LLM safety evaluations for personal advice must test responses against diverse user vulnerability profiles, since context-blind ratings overestimate safety and realistic prompt context does not fix the problem.
- A Study of LLMs' Preferences for Libraries and Programming Languages
  Empirical study of eight LLMs finds overuse of popular libraries like NumPy in up to 45% of unnecessary cases and strong default preference for Python even when suboptimal.
- FAIR_XAI: Improving Multimodal Foundation Model Fairness via Explainability for Wellbeing Assessment
  Vision-language models for wellbeing assessment exhibit dataset-dependent performance and demographic biases, with explainability interventions providing inconsistent fairness gains at potential accuracy costs.
- A Survey on LLM-as-a-Judge
  A survey on LLM-as-a-Judge that reviews reliability strategies, proposes evaluation methods, and introduces a novel benchmark for assessing such systems.
Appendix A. Examples of Extrinsic Biases
A.1 Natural Language Understanding (NLU) tasks
NLU encompasses a broad range of tasks that aim to improve comprehension of input sequences (Chang et al., 2024). It seeks to grasp the deeper connotations and implications inherent in human communication, focusing on wha...
This task is crucial for accurately interpreting the meaning of sentences, especially in cases where pronouns, names, or other referential expressions are used. The primary goal of coreference resolution is to correctly link pronouns like “he,” “she,” or “it” and definite descriptions like “the CEO” to the appropriate entity mentioned earlier in the tex...