arxiv: 2502.04419 · v3 · submitted 2025-02-06 · 💻 cs.LG · cs.AI · cs.CL

Understanding and Mitigating Bias Inheritance in LLM-based Data Augmentation on Downstream Tasks

Pith reviewed 2026-05-23 03:49 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · cs.CL
keywords bias inheritance · LLM data augmentation · downstream tasks · fairness · mitigation · synthetic data · classification · generation

The pith

LLM-generated training data inherits biases that degrade performance on bias-related downstream tasks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper investigates bias inheritance, where training on data produced by LLMs transfers and strengthens biases present in those models. Experiments that mix real and synthetic data at different proportions across ten tasks show that a higher share of augmented data lowers performance on classification and generation tasks tied to bias. The authors pinpoint three misalignment sources (values, group data, and distributions) and introduce token-, mask-, and loss-based adjustments as possible fixes, though their success varies by task and bias type.

Core claim

Bias inheritance from LLM data augmentation harms downstream performance on classification and generation tasks that are directly related to bias. The three key misalignment factors are misalignment of values, of group data, and of data distributions. Three mitigation strategies are proposed: token-based, mask-based, and loss-based approaches.

What carries the argument

Controlled experiments varying the bias ratio, the share of LLM-augmented data mixed with real data, to quantify inheritance effects and test mitigation methods.
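
As a rough illustration of what a bias-ratio sweep involves, the sketch below builds a training mix in which a chosen share of examples is LLM-augmented. The function name `mix_at_bias_ratio`, the fixed `total` size, and the source labels are assumptions made for this example; the page does not say whether the paper keeps the total training-set size constant.

```python
import random


def mix_at_bias_ratio(real, augmented, bias_ratio, total, seed=0):
    """Return `total` (text, source) pairs, a `bias_ratio` share of them augmented.

    Hypothetical helper: holding `total` fixed is an assumption of this sketch,
    chosen so that dataset size does not change along with the ratio.
    """
    if not 0.0 <= bias_ratio <= 1.0:
        raise ValueError("bias_ratio must lie in [0, 1]")
    rng = random.Random(seed)
    n_aug = round(total * bias_ratio)
    n_real = total - n_aug
    if n_aug > len(augmented) or n_real > len(real):
        raise ValueError("not enough examples for the requested mix")
    # Sample without replacement from each pool, then shuffle the combined set.
    mixed = [(text, "augmented") for text in rng.sample(augmented, n_aug)]
    mixed += [(text, "real") for text in rng.sample(real, n_real)]
    rng.shuffle(mixed)
    return mixed
```

Keeping the source label on each example makes it easy to verify the realized ratio afterwards and to treat augmented examples differently later on.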

If this is right

  • Bias inheritance reduces performance in tasks directly involving bias.
  • Misalignments in values, group data, and distributions are the main drivers.
  • The three mitigation strategies show different effectiveness across tasks and bias types.
  • Mitigating bias inheritance remains substantially challenging.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Practitioners should monitor bias ratios when augmenting datasets with LLMs to avoid unintended fairness issues.
  • The approach could be extended to test if pre-aligning the generating LLM reduces inheritance.
  • Similar inheritance effects may appear in other generative uses of LLMs beyond data augmentation.
  • Balancing data volume gains against bias costs may require new evaluation metrics focused on fairness.

Load-bearing premise

Varying only the proportion of augmented data in the training mix separates the bias inheritance effect from other influences like how the LLM produces examples or specific task traits.

What would settle it

No drop in performance on bias-related tasks as the augmented data proportion increases would show that bias inheritance does not harm downstream performance.
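
One simple way to check for such a drop is to fit a straight line to task scores measured at several augmented-data proportions and inspect the sign of the slope. The sketch below is only an assumption about how one might run that check; `trend_slope` is a hypothetical helper, not a procedure taken from the paper, and a real analysis would also need repeated runs and a significance test.

```python
def trend_slope(ratio_to_score):
    """Least-squares slope of score against augmented-data ratio.

    `ratio_to_score` maps each ratio (0.0 to 1.0) to a task score.
    A clearly negative slope is consistent with degradation as the
    augmented share grows; a slope near zero is not.
    """
    ratios = sorted(ratio_to_score)
    if len(ratios) < 2:
        raise ValueError("need scores for at least two distinct ratios")
    scores = [ratio_to_score[r] for r in ratios]
    mean_r = sum(ratios) / len(ratios)
    mean_s = sum(scores) / len(scores)
    sxx = sum((r - mean_r) ** 2 for r in ratios)
    sxy = sum((r - mean_r) * (s - mean_s) for r, s in zip(ratios, scores))
    return sxy / sxx
```

Any scores passed in would come from the reader's own runs; none are supplied here.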

Figures

Figures reproduced from arXiv: 2502.04419 by Hao Chen, Jindong Wang, Kaijie Zhu, Kam-Fai Wong, Miaomiao Li, Tingyuan Zhu, Weijia Zhang, Yang Wang.

Figure 1: The overview of our research pipeline. (a) Six key types of bias for data generation with the …
Figure 2: Results on downstream tasks related to gender with different types of bias in augmentation data.
Figure 3: Results for bias indirectly and directly related tasks (x-axis: 0-Unbiased, 1-Contextual Single Explicit, …
Figure 4: The average hiring recommendations results. Increase of male candidates in minority races is more …
Figure 5: The average story generation results and the multi-round hiring recommendation results. Bias …
Figure 6: Hiring and salary generation results using GPT-4o-mini. Same notation is used for x-axis as in …
Figure 7: Misalignment of values and imbalanced generation.
Figure 8: Embedding Distribution. … samples experience a performance decline, whereas male samples consistently improve for gender biographies classification. Similarly, in the hiring recommendation task, Chinese female candidates are ranked higher than their male counterparts. To investigate the causes of these disparities, we examine the gender distribution in augmented data under the unbiased type condition. Withou…
Figure 9: The average mitigation results. All three mitigation strategies mitigate the malicious effects of bias …
Figure 10: Hiring recommendation results with different bias types.
Figure 11: Story generation results with different bias types.
Figure 12: Multi-round classification and salary generation results.
Figure 13: Embedding Distribution. E More Results on Bias Inheritance Mitigation. E.1 Methods. Token-based Mitigation: For instance, “The following text may contain biases. [Text with Augmented Bias]”. This token serves as a signal to the model that the text might contain bias, guiding it to approach the interpretation or processing of this data with caution. Mask-based Mitigation: For cultural bias, we replace specific…
Figure 14: The mitigation results using GPT-4o-mini.
Figure 15: Gender bias mitigation on classification tasks.
Figure 16: Gender bias mitigation on hiring recommendation tasks.
Figure 17: Gender bias mitigation on salary recommendation tasks.
Figure 18: Cultural bias mitigation on classification tasks.
Figure 19: Cultural bias mitigation on story generation tasks.
Original abstract

Generating synthetic datasets via large language models (LLMs) has emerged as a promising approach to improve LLM performance. However, LLMs inherently reflect biases in their training data, leading to a critical challenge: when models are trained on synthetic data, they may propagate and amplify the inherent biases that can significantly impact fairness and robustness on downstream tasks, a phenomenon we term bias inheritance. This work presents the first systematic investigation in understanding, analyzing, and mitigating bias inheritance. We fine-tune LLMs with a combined dataset of real and LLM-augmented data with varied bias ratio as the proportion of augmented data. Through systematic experiments across 10 classification and generation tasks, we analyze how 6 different types of biases manifest. Our results indicate that bias inheritance harms downstream task performance in bias directly-related classification and generation tasks. Then, our analysis identifies three key misalignment factors: misalignment of values, group data, and data distributions. Based on these insights, we propose three mitigation strategies: token-based, mask-based, and loss-based approaches, which can work differently on various tasks and bias, indicating the substantial challenges to mitigate bias inheritance. We hope this work can provide insights to the research of LLM data augmentation.
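
The token-based strategy named in the abstract is illustrated on this page only by the appendix excerpt captured with Figure 13, which quotes the notice “The following text may contain biases.” placed in front of augmented text. The sketch below shows one way such tagging could be applied before fine-tuning. The helper name `tag_augmented`, the (text, source) input format, and the choice to leave real examples untouched are assumptions of this example, not details confirmed by the paper.

```python
BIAS_NOTICE = "The following text may contain biases."


def tag_augmented(examples):
    """Prefix LLM-augmented texts with a bias notice.

    `examples` is a list of (text, source) pairs with source equal to
    "real" or "augmented" (a hypothetical format for this sketch).
    Real examples pass through unchanged.
    """
    tagged = []
    for text, source in examples:
        if source == "augmented":
            text = f"{BIAS_NOTICE} {text}"
        tagged.append((text, source))
    return tagged
```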

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript presents the first systematic study of 'bias inheritance' in LLM-based data augmentation. It fine-tunes models on mixtures of real and LLM-augmented data while sweeping the proportion of augmented data (bias ratio) across 10 classification and generation tasks, examines six bias types, reports that inheritance degrades performance on bias-related tasks, identifies three misalignment factors (values, group data, distributions), and proposes three mitigation strategies (token-based, mask-based, loss-based).

Significance. If the performance degradation can be causally attributed to inherited bias rather than generic augmentation artifacts, the work would usefully highlight risks in synthetic-data pipelines and offer concrete mitigation directions. The empirical scope across multiple tasks and bias types is a strength, but the central attribution claim remains provisional pending tighter controls on generation confounds.

major comments (1)
  1. [experimental design / bias ratio sweeps] Experimental design section (bias-ratio sweeps on combined real+augmented datasets): the central claim that performance drops are due to bias inheritance rests on the assumption that varying the proportion of LLM-augmented data isolates inherited bias while holding all other data properties constant. No ablation is described that fixes the generation process and prompt while varying only bias content; therefore changes in fluency, length, coverage, or distributional shift introduced by the LLM itself could confound the observed degradation. This assumption is load-bearing for the claim that 'bias inheritance harms downstream task performance.'
minor comments (1)
  1. [abstract / methods] Abstract and methods: exact definitions of the six bias types and three misalignment factors, statistical controls (error bars, significance tests, number of runs), and baseline comparisons are not detailed; these omissions hinder reproducibility and assessment of effect sizes.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive feedback. Below we respond point-by-point to the major comment on experimental design, providing an honest assessment of the current evidence and planned revisions.

Point-by-point responses
  1. Referee: [experimental design / bias ratio sweeps] Experimental design section (bias-ratio sweeps on combined real+augmented datasets): the central claim that performance drops are due to bias inheritance rests on the assumption that varying the proportion of LLM-augmented data isolates inherited bias while holding all other data properties constant. No ablation is described that fixes the generation process and prompt while varying only bias content; therefore changes in fluency, length, coverage, or distributional shift introduced by the LLM itself could confound the observed degradation. This assumption is load-bearing for the claim that 'bias inheritance harms downstream task performance.'

    Authors: We acknowledge that the bias-ratio sweeps do not include an explicit ablation that holds the generation process and prompts fixed while varying only bias content. This leaves open the possibility that other LLM-induced properties (fluency, length, coverage, or distributional shift) contribute to the observed degradation. Our design varies the proportion of augmented data under a fixed generation pipeline, which modulates exposure to biased content but does not isolate bias from all other generation artifacts. In the revised manuscript we will add an explicit limitations paragraph discussing this confound and will report additional controls (e.g., comparing real vs. LLM data matched on length and perplexity) where feasible. We also note that performance drops are concentrated on bias-related tasks rather than unrelated ones, which provides partial support for the bias-inheritance interpretation, but we agree this does not fully resolve the attribution question. revision: partial
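
The length-matched comparison mentioned in the rebuttal could be prepared along the following lines. This is a minimal sketch under stated assumptions: `length_matched_pairs` is a hypothetical helper, length is approximated by whitespace token count, and perplexity matching is left out because it would require a scoring model that this page does not specify.

```python
def length_matched_pairs(real, augmented, tolerance=2):
    """Pair each augmented text with an unused real text of similar length.

    Two texts match when their whitespace-token counts differ by at most
    `tolerance`. Each real text is used at most once; augmented texts
    without a match are dropped.
    """
    def n_tokens(text):
        return len(text.split())

    unused = sorted(real, key=n_tokens)
    pairs = []
    for aug in sorted(augmented, key=n_tokens):
        best = None  # (length difference, index into unused)
        for i, candidate in enumerate(unused):
            diff = abs(n_tokens(candidate) - n_tokens(aug))
            if diff <= tolerance and (best is None or diff < best[0]):
                best = (diff, i)
        if best is not None:
            pairs.append((unused.pop(best[1]), aug))
    return pairs
```

Training on the real half versus the augmented half of the matched pairs would hold length roughly constant, so that any remaining performance gap is less likely to come from length alone.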

Circularity Check

0 steps flagged

No significant circularity; empirical measurements only

full rationale

The paper reports results from fine-tuning experiments that combine real and LLM-augmented data at varying bias ratios across 10 tasks, then measures downstream performance and identifies misalignment factors post-hoc. No equations, fitted parameters renamed as predictions, self-citation load-bearing uniqueness claims, or ansatzes appear in the abstract or described setup. All central claims rest on observed performance deltas rather than any reduction to inputs by construction, making the work self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The central claims rest on standard machine-learning assumptions about data distributions and the ability to control bias ratios experimentally. No new physical entities are postulated. The term 'bias inheritance' is introduced as a descriptive label rather than a new theoretical construct with independent evidence.

axioms (2)
  • domain assumption: LLMs inherently reflect biases present in their training data
    Stated in the opening sentence of the abstract as the source of the problem.
  • domain assumption: Varying the proportion of augmented data isolates the inheritance effect
    Implicit in the experimental design that uses 'varied bias ratio as the proportion of augmented data'.
invented entities (1)
  • bias inheritance (no independent evidence)
    purpose: Label for the phenomenon of biases propagating from LLM to synthetic data to downstream models
    Introduced in the abstract as 'a phenomenon we term bias inheritance'; no independent falsifiable prediction is provided.

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Flattery in Motion: Benchmarking and Analyzing Sycophancy in Video-LLMs

    cs.CL 2025-06 unverdicted novelty 7.0

    VISE is the first benchmark for sycophancy in Video-LLMs, with two training-free mitigation strategies based on key-frame selection and internal representation steering.

  2. Safe for Whom? Rethinking How We Evaluate the Safety of LLMs for Real Users

    cs.AI 2025-12 unverdicted novelty 6.0

    LLM safety evaluations for personal advice must test responses against diverse user vulnerability profiles, since context-blind ratings overestimate safety and realistic prompt context does not fix the problem.

  3. Inertia in Moral and Value Judgments of Large Language Models

    cs.CL 2024-08 unverdicted novelty 4.0

    LLMs exhibit persistent inertia in value orientations, with harm avoidance and fairness remaining skewed across persona prompts.

Reference graph

Works this paper leans on

44 extracted references · 44 canonical work pages · cited by 3 Pith papers · 3 internal anchors

  1. [1]

    Phi-4 Technical Report

    URL https://www.worldvaluessurvey.org/wvs.jsp. Marah Abdin, Jyoti Aneja, Harkirat Behl, Sébastien Bubeck, Ronen Eldan, Suriya Gunasekar, Michael Harrison, Russell J Hewett, Mojan Javaheripi, Piero Kauffmann, et al. Phi-4 technical report. arXiv preprint arXiv:2412.08905,

  2. [2]

    Physics of language models: Part 3.1, knowledge storage and extraction

    Zeyuan Allen-Zhu and Yuanzhi Li. Physics of language models: Part 3.1, knowledge storage and extraction. arXiv preprint arXiv:2309.14316,

  3. [3]

    Overview of mex-a3t at ibereval 2018: Authorship and aggressiveness analysis in mexican spanish tweets

    Miguel Ángel Álvarez-Carmona, Estefanía Guzmán-Falcón, Manuel Montes-y-Gómez, Hugo Jair Escalante, Luis Villaseñor-Pineda, Verónica Reyes-Meza, and Antonio Rico-Sulayes. Overview of mex-a3t at ibereval 2018: Authorship and aggressiveness analysis in mexican spanish tweets. In Proceedings of the Third Workshop on Evaluation of Human Language Technologies fo...

  4. [4]

    Measuring implicit bias in explicitly unbiased large language models

    Xuechunzi Bai, Angelina Wang, Ilia Sucholutsky, and Thomas L Griffiths. Measuring implicit bias in explicitly unbiased large language models. arXiv preprint arXiv:2402.04105,

  5. [5]

    Semeval-2019 task 5: Multilingual detection of hate speech against immigrants and women in twitter

    Valerio Basile, Cristina Bosco, Elisabetta Fersini, Debora Nozza, Viviana Patti, Francisco Manuel Rangel Pardo, Paolo Rosso, and Manuela Sanguinetti. Semeval-2019 task 5: Multilingual detection of hate speech against immigrants and women in twitter. In Proceedings of the 13th international workshop on semantic evaluation, pages 54–63,

  6. [6]

    Revealing hidden bias in ai: Lessons from large language models

    Django Beatty, Kritsada Masanthia, Teepakorn Kaphol, and Niphan Sethi. Revealing hidden bias in ai: Lessons from large language models. arXiv preprint arXiv:2410.16927,

  7. [7]

    Language models are few-shot learners

    Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language models are few-shot learners. Advances in neural information processing systems, 33:1877–1901,

  8. [8]

    AGR: Age group fairness reward for bias mitigation in LLMs

    Shuirong Cao, Ruoxi Cheng, and Zhiqiang Wang. AGR: Age group fairness reward for bias mitigation in LLMs. In Pluralistic Alignment Workshop at NeurIPS 2024,

  9. [9]

    Bias in bios: A case study of semantic representation bias in a high-stakes setting

    European Language Resources Association. URL https://aclanthology.org/2020.lrec-1.761/. Maria De-Arteaga, Alexey Romanov, Hanna Wallach, Jennifer Chayes, Christian Borgs, Alexandra Chouldechova, Sahin Geyik, Krishnaram Kenthapadi, and Adam Tauman Kalai. Bias in bios: A case study of semantic representation bias in a high-stakes setting. In proceedings of ...

  10. [10]

    Data augmentation using llms: Data perspectives, learning paradigms and challenges

    Bosheng Ding, Chengwei Qin, Ruochen Zhao, Tianze Luo, Xinze Li, Guizhen Chen, Wenhan Xia, Junjie Hu, Luu Anh Tuan, and Shafiq Joty. Data augmentation using llms: Data perspectives, learning paradigms and challenges. In Findings of the Association for Computational Linguistics ACL 2024, pages 1679–1705,

  11. [11]

    Sina at fignews 2024: Multilingual datasets annotated with bias and propaganda

    Lina Duaibes, Areej Jaber, Mustafa Jarrar, Ahmad Qadi, and Mais Qandeel. Sina at fignews 2024: Multilingual datasets annotated with bias and propaganda. arXiv preprint arXiv:2407.09327,

  12. [12]

    Towards Measuring the Representation of Subjective Global Opinions in Language Models

    Esin Durmus, Karina Nguyen, Thomas I Liao, Nicholas Schiefer, Amanda Askell, Anton Bakhtin, Carol Chen, Zac Hatfield-Dodds, Danny Hernandez, Nicholas Joseph, et al. Towards measuring the representation of subjective global opinions in language models. arXiv preprint arXiv:2306.16388,

  13. [13]

    First-person fairness in chatbots

    Tyna Eloundou, Alex Beutel, David G Robinson, Keren Gu-Lemberg, Anna-Luisa Brakman, Pamela Mishkin, Meghan Shah, Johannes Heidecke, Lilian Weng, and Adam Tauman Kalai. First-person fairness in chatbots. arXiv preprint arXiv:2410.19803,

  14. [14]

    Biased ai can influence political decision-making

    URL https://api.semanticscholar.org/CorpusID:261898112. Jillian Fisher, Shangbin Feng, Robert Aron, Thomas Richardson, Yejin Choi, Daniel W Fisher, Jennifer Pan, Yulia Tsvetkov, and Katharina Reinecke. Biased ai can influence political decision-making. arXiv preprint arXiv:2410.06415,

  15. [15]

    Explicit and implicit large language model personas generate opinions but fail to replicate deeper perceptions and biases

    Salvatore Giorgi, Tingting Liu, Ankit Aich, Kelsey Isman, Garrick Sherman, Zachary Fried, João Sedoc, Lyle H Ungar, and Brenda Curtis. Explicit and implicit large language model personas generate opinions but fail to replicate deeper perceptions and biases. arXiv preprint arXiv:2406.14462,

  16. [16]

    Hey gpt, can you be more racist? analysis from crowdsourced attempts to elicit biased content from generative ai

    Hangzhi Guo, Pranav Narayanan Venkit, Eunchae Jang, Mukund Srinath, Wenbo Zhang, Bonam Mingole, Vipul Gupta, Kush R Varshney, S Shyam Sundar, and Amulya Yadav. Hey gpt, can you be more racist? analysis from crowdsourced attempts to elicit biased content from generative ai. arXiv preprint arXiv:2410.15467,

  17. [17]

    LoRA: Low-Rank Adaptation of Large Language Models

    Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. Lora: Low-rank adaptation of large language models. arXiv preprint arXiv:2106.09685,

  18. [18]

    Huggingface

    doi: 10.1038/s43588-024-00741-1. Huggingface. Meta-llama-3.1-8b-instruct,

  19. [19]

    Detect camouflaged spam content via StoneSkipping: Graph and text joint embedding for Chinese character variation representation

    European Language Resource Association. URL https://aclanthology.org/2020.osact-1.8/. Zhuoren Jiang, Zhe Gao, Guoxiu He, Yangyang Kang, Changlong Sun, Qiong Zhang, Luo Si, and Xiaozhong Liu. Detect camouflaged spam content via StoneSkipping: Graph and text joint embedding for Chinese character variation representation. In Proceedings of the 2019 Conference...

  20. [20]

    Investigating subtler biases in LLMs: Ageism, beauty, institutional, and nationality bias in generative models

    Mahammed Kamruzzaman, Md. Shovon, and Gene Kim. Investigating subtler biases in LLMs: Ageism, beauty, institutional, and nationality bias in generative models. In Lun-Wei Ku, Andre Martins, and Vivek Srikumar, editors, Findings of the Association for Computational Linguistics: ACL 2024, pages 8940–8965, Bangkok, Thailand, August

  21. [21]

    The abc of stereotypes about groups: Agency/socioeconomic success, conservative–progressive beliefs, and communion

    Association for Computational Linguistics. doi: 10.18653/v1/2024.findings-acl.530. URL https://aclanthology.org/2024.findings-acl.530/. Alex Koch, Roland Imhoff, Ron Dotsch, Christian Unkelbach, and Hans Alves. The abc of stereotypes about groups: Agency/socioeconomic success, conservative–progressive beliefs, and communion. Journal of personality and soci...

  22. [22]

    Subtle biases need subtler measures: Dual metrics for evaluating representative and affinity bias in large language models

    Abhishek Kumar, Sarfaroz Yunusov, and Ali Emami. Subtle biases need subtler measures: Dual metrics for evaluating representative and affinity bias in large language models. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 375–392. Association for Computational Linguistics, August 2...

  23. [23]

    Ruleprompt: Weakly supervised text classification with prompting plms and self-iterative logical rules

    Miaomiao Li, Jiaqi Zhu, Yang Wang, Yi Yang, Yilin Li, and Hongan Wang. Ruleprompt: Weakly supervised text classification with prompting plms and self-iterative logical rules. In Proceedings of the ACM on Web Conference 2024, pages 4272–4282, 2024c. Paul Pu Liang, Chiyu Wu, Louis-Philippe Morency, and Ruslan Salakhutdinov. Towards understanding and mitigati...

  24. [24]

    Are multilingual LLMs culturally-diverse reasoners? an investigation into multicultural proverbs and sayings

    URL https://api.semanticscholar.org/CorpusID:235623756. Chen Liu, Fajri Koto, Timothy Baldwin, and Iryna Gurevych. Are multilingual LLMs culturally-diverse reasoners? an investigation into multicultural proverbs and sayings. In Kevin Duh, Helena Gomez, and Steven Bethard, editors, Proceedings of the 2024 Conference of the North American Chapter of the Asso...

  25. [25]

    The generation gap: Exploring age bias in the value systems of large language models

    Siyang Liu, Trisha Maturi, Bowen Yi, Siqi Shen, and Rada Mihalcea. The generation gap: Exploring age bias in the value systems of large language models. In Yaser Al-Onaizan, Mohit Bansal, and Yun-Nung Chen, editors, Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Miami, Florida, USA, November 2024b. Shayne Longpre, ...

  26. [26]

    AI-UPV at IberLEF-2021 DETOXIS task: Toxicity Detection in Immigration-Related Web News Comments Using Transformers and Statistical Models

    Angel Felipe Magnossão de Paula and Ipek Baris Schlicht. AI-UPV at IberLEF-2021 DETOXIS task: Toxicity Detection in Immigration-Related Web News Comments Using Transformers and Statistical Models. In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2021), pages 547–566. CEUR Workshop Proceedings,

  27. [27]

    Efficacy of synthetic data as a benchmark

    URL https://arxiv.org/abs/2111.04530. Gaurav Maheshwari, Dmitry Ivanov, and Kevin El Haddad. Efficacy of synthetic data as a benchmark. arXiv preprint arXiv:2409.11968,

  28. [28]

    Text classification using label names only: A language model self-training approach

    Yu Meng, Yunyi Zhang, Jiaxin Huang, Chenyan Xiong, Heng Ji, Chao Zhang, and Jiawei Han. Text classification using label names only: A language model self-training approach. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 9006–9017,

  29. [29]

    Having beer after prayer? measuring cultural bias in large language models

    doi: 10.1007/s10462-024-10903-2. Tarek Naous, Michael J Ryan, Alan Ritter, and Wei Xu. Having beer after prayer? measuring cultural bias in large language models. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 16366–16393. Association for Computational Linguistics, August

  30. [30]

    you gotta be a doctor, lin

    Huy Nghiem, John Prindle, Jieyu Zhao, and Hal Daumé III. “you gotta be a doctor, lin”: An investigation of name-based bias of large language models in employment recommendations. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 7268–7287, November

  31. [31]

    The effects of hallucinations in synthetic training data for relation extraction

    doi: 10.5753/brasnam.2017.3260. Steven Rogulsky, Nicholas Popovic, and Michael Färber. The effects of hallucinations in synthetic training data for relation extraction. arXiv preprint arXiv:2410.08393,

  32. [32]

    The bias amplification paradox in text-to-image generation

    Preethi Seshadri, Sameer Singh, and Yanai Elazar. The bias amplification paradox in text-to-image generation. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 6367–6384,

  33. [33]

    Detection and measurement of syntactic templates in generated text

    Chantal Shaib, Yanai Elazar, Junyi Jessy Li, and Byron C Wallace. Detection and measurement of syntactic templates in generated text. In Yaser Al-Onaizan, Mohit Bansal, and Yun-Nung Chen, editors, Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 6416–6431, Miami, Florida, USA, November

  34. [34]

    Llm theory of mind and alignment: Opportunities and risks

    Winnie Street. Llm theory of mind and alignment: Opportunities and risks. arXiv preprint arXiv:2405.08154,

  35. [35]

    Will we run out of data? limits of llm scaling based on human-generated data, 2024

    European Language Resources Association. URL https://aclanthology.org/2022.lrec-1.777/. Pablo Villalobos, Jaime Sevilla, Lennart Heim, Tamay Besiroglu, Marius Hobbhahn, and Anson Ho. Will we run out of data? an analysis of the limits of scaling datasets in machine learning. arXiv preprint arXiv:2211.04325, 1,

  36. [36]

    Unraveling downstream gender bias from large language models: A study on AI educational writing assistance

    Thiemo Wambsganss, Xiaotian Su, Vinitra Swamy, Seyed Neshaei, Roman Rietsche, and Tanja Käser. Unraveling downstream gender bias from large language models: A study on AI educational writing assistance. In Houda Bouamor, Juan Pino, and Kalika Bali, editors, Findings of the Association for Computational Linguistics: EMNLP 2023, pages 10275–10288, Singapore,...

  37. [37]

    kelly is a warm person, joseph is a role model

    Association for Computational Linguistics. doi: 10.18653/v1/2023.findings-emnlp.689. URL https://aclanthology.org/2023.findings-emnlp.689/. Yixin Wan, George Pu, Jiao Sun, Aparna Garimella, Kai-Wei Chang, and Nanyun Peng. “kelly is a warm person, joseph is a role model”: Gender biases in LLM-generated reference letters. In Houda Bouamor, Juan Pino, and Ka...

  38. [38]

    Bias amplification: Language models as increasingly biased media

    Association for Computational Linguistics. doi: 10.18653/v1/2023.findings-emnlp.243. URL https://aclanthology.org/2023.findings-emnlp.243/. Ze Wang, Zekun Wu, Jeremy Zhang, Navya Jain, Xin Guan, and Adriano Koshiyama. Bias amplification: Language models as increasingly biased media. arXiv preprint arXiv:2410.15234,

  39. [39]

    it’s a fair game

    URL http://papers.nips.cc/paper/9106-defending-against-neural-fake-news.pdf. Zeliang Zhang, Xin Liang, Mingqian Feng, Susan Liang, and Chenliang Xu. Will the inclusion of generated data amplify bias across generations in future image classification models? arXiv preprint arXiv:2410.10160, 2024a. Zhiping Zhang, Michelle Jia, Hao-Ping Lee, Bingsheng Yao, Sa...

  40. [40]

    How to synthesize text data without model collapse?

    Xuekai Zhu, Daixuan Cheng, Hengli Li, Kaiyan Zhang, Ermo Hua, Xingtai Lv, Ning Ding, Zhouhan Lin, Zilong Zheng, and Bowen Zhou. How to synthesize text data without model collapse? arXiv preprint arXiv:2412.14689,

  41. [41]

    You are a person influenced by Spanish culture responding to the following question

    Appendix: Understanding and Mitigating the Bias Inheritance in LLM-based Data Augmentation on Downstream Tasks. Contents: A Details on Multidimensional Bias Generation; A.1 The Source of Augmentation Bias; A.2 Multidimensional Bias Generation ...

  42. [42]

    Generate a story about a character named [NAME]

    For story generation, we use male and female names from each culture, prompting the model with “Generate a story about a character named [NAME]”. Following Naous et al. [2024], we use adjectives outlined by Koch et al

  43. [43]

    We extract these adjectives from the generated stories, analyzing the frequency of adjectives used to describe the characters

    that reflect dimensions of the Communion framework [Koch et al., 2016], focusing on dimensions like Agency, Beliefs, and Communion. We extract these adjectives from the generated stories, analyzing the frequency of adjectives used to describe the characters. By calculating the rates of positive and negative adjectives linked to each dimension, we assess h...

  44. [44]

    To ensure gender balance, we sample 600 examples for each profession, with an equal split between male and female data

    The BiasinBio dataset [De-Arteaga et al., 2019] contains real-world English biographies sourced from Common Crawl for several occupations. To ensure gender balance, we sample 600 examples for each profession, with an equal split between male and female data. An example from BiasinBio is shown in Table