pith. machine review for the scientific record.

arxiv: 2502.04419 · v3 · submitted 2025-02-06 · 💻 cs.LG · cs.AI · cs.CL

Recognition: unknown

Understanding and Mitigating Bias Inheritance in LLM-based Data Augmentation on Downstream Tasks

Authors on Pith: no claims yet
classification 💻 cs.LG · cs.AI · cs.CL
keywords: bias · data · inheritance · tasks · biases · downstream · llms · work
0 comments
read the original abstract

Generating synthetic datasets via large language models (LLMs) has emerged as a promising approach to improving LLM performance. However, LLMs inherently reflect biases in their training data, leading to a critical challenge: when models are trained on synthetic data, they may propagate and amplify inherent biases that can significantly impact fairness and robustness on downstream tasks, a phenomenon we term bias inheritance. This work presents the first systematic investigation into understanding, analyzing, and mitigating bias inheritance. We fine-tune LLMs on a combined dataset of real and LLM-augmented data, varying the bias ratio, i.e., the proportion of augmented data. Through systematic experiments across 10 classification and generation tasks, we analyze how six different types of bias manifest. Our results indicate that bias inheritance harms downstream performance on classification and generation tasks directly related to the bias. Our analysis then identifies three key misalignment factors: misalignment of values, group data, and data distributions. Based on these insights, we propose three mitigation strategies: token-based, mask-based, and loss-based approaches, which perform differently across tasks and bias types, indicating the substantial challenge of mitigating bias inheritance. We hope this work provides insights for research on LLM data augmentation.
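The abstract's setup, fine-tuning on a mix of real and LLM-augmented data at a controlled bias ratio, can be sketched as follows. This is a hypothetical illustration, not the paper's code: the function name `mix_dataset` and the dictionary-based example format are assumptions, and only the idea of varying the proportion of augmented data comes from the abstract.

```python
import random

def mix_dataset(real_data, augmented_data, bias_ratio, seed=0):
    """Build a fixed-size fine-tuning set in which LLM-augmented
    examples make up `bias_ratio` of the total (hypothetical helper).

    The paper varies this ratio to study how inherited bias scales
    with the share of synthetic data.
    """
    rng = random.Random(seed)
    n_total = len(real_data)              # keep the set size constant
    n_aug = int(round(bias_ratio * n_total))
    n_real = n_total - n_aug
    mixed = rng.sample(real_data, n_real) + rng.sample(augmented_data, n_aug)
    rng.shuffle(mixed)                    # avoid ordering artifacts
    return mixed

# toy data standing in for real and LLM-generated examples
real = [{"text": f"real-{i}"} for i in range(100)]
aug = [{"text": f"aug-{i}"} for i in range(100)]
train = mix_dataset(real, aug, bias_ratio=0.3)
```

Holding the total size fixed while sweeping `bias_ratio` isolates the effect of the synthetic-data share from the effect of simply having more training data.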

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Safe for Whom? Rethinking How We Evaluate the Safety of LLMs for Real Users

    cs.AI · 2025-12 · unverdicted · novelty 6.0

    LLM safety evaluations for personal advice must test responses against diverse user vulnerability profiles, since context-blind ratings overestimate safety and realistic prompt context does not fix the problem.