Isolating LLM Lexical Bias: A Curation-Free Triangulated Metric for Preference-Stage Learning

Jose Hernandez; Thomas Stephan Juzek; Xiaoyang Ming

arxiv: 2606.00334 · v1 · pith:3TXYMSSYnew · submitted 2026-05-29 · 💻 cs.CL · cs.AI

Pith reviewed 2026-06-28 22:02 UTC · model grok-4.3

keywords lexical bias · preference learning · LLM alignment · RLHF · Triangulated Preference Shift score · instruct models · model evaluation · lexical misalignment

The pith

A new triangulated metric isolates lexical biases introduced specifically during the preference-learning stage of LLM training by comparing human, base-model, and instruct-model outputs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces the Triangulated Preference Shift score to detect word-use changes that occur when language models receive preference training such as RLHF. The score compares outputs on identical prompts across human-written references, untuned base models, and the final preference-tuned instruct models. This three-way comparison lets researchers measure shifts without first hand-picking example words or phrases. Readers may care because the method offers an automated way to check whether alignment steps are steering models toward particular lexical habits, including patterns that resemble prestige language, and the authors apply it across six model families while linking results to earlier studies of lexical misalignment.

Core claim

The Triangulated Preference Shift score isolates shifts induced specifically by preference learning without manual curation by triangulating between human gold standards, base models, and instruct variants. The metric provides an initial automated method to quantify behavioral shifts attributable to preference tuning across six model families and can test whether those shifts move models toward what could be interpreted as a language of prestige.

What carries the argument

Triangulated Preference Shift score, which measures lexical distribution differences across human text, base-model outputs, and instruct-model outputs on the same prompts to attribute changes to the preference stage.
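The three-way comparison can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's actual formula: the function names, the add-one-half smoothing (`alpha=0.5`), and the rule of keeping only words that the instruct model overuses relative to both the base model and the human text are all hypothetical choices made here for clarity.

```python
import math
from collections import Counter


def smoothed_log_probs(tokens, vocab, alpha=0.5):
    """Additively smoothed log relative frequency for every word in vocab."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: math.log((counts[w] + alpha) / total) for w in vocab}


def triangulated_shift(human, base, instruct, alpha=0.5):
    """Hypothetical per-word score (an assumption, not the published metric).

    A word scores above zero only if the instruct model uses it more than
    the base model AND more than the human reference on the same prompts.
    """
    vocab = set(human) | set(base) | set(instruct)
    lp_h = smoothed_log_probs(human, vocab, alpha)
    lp_b = smoothed_log_probs(base, vocab, alpha)
    lp_i = smoothed_log_probs(instruct, vocab, alpha)

    scores = {}
    for w in vocab:
        stage_shift = lp_i[w] - lp_b[w]  # instruct relative to base
        human_gap = lp_i[w] - lp_h[w]    # instruct relative to human
        if stage_shift > 0 and human_gap > 0:
            scores[w] = min(stage_shift, human_gap)
        else:
            scores[w] = 0.0
    return scores


human = "we look into the data and then look again".split()
base = "we look into the data and look into it".split()
instruct = "we delve into the data and delve into it".split()

scores = triangulated_shift(human, base, instruct)
top = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:3]
print(top)
```

In this toy input only "delve" receives a positive score, because it is the only word the instruct text uses more often than both other sources. No word list has to be chosen in advance, which is the sense in which such a comparison is curation-free.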

If this is right

The score can be applied to outputs from six model families to detect preference-induced lexical changes.
It can test whether preference learning shifts models toward language patterns associated with prestige.
Results can be anchored against existing literature on lexical misalignment from preference stages.
The approach supplies an automated check that may help guide model alignment and trustworthy AI development.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the triangulation reliably separates stages, the same comparison could be used to evaluate how different preference datasets or reward models alter lexical output.
Repeated application of the score at intermediate training checkpoints might locate the precise point at which specific word biases are introduced.
Developers could run the metric on candidate preference data before full training to anticipate and reduce unwanted lexical shifts.

Load-bearing premise

Lexical differences observed when the same prompts are given to humans, base models, and instruct models arise primarily from the preference-learning stage rather than from other training differences, prompt distributions, or sampling choices.

What would settle it

Finding nearly identical lexical distributions between base models and their corresponding instruct models on matched prompts, or large human-instruct differences that already appear in the base models before any preference training occurs.
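Both falsifying outcomes can be checked with a standard distance between word distributions. The sketch below uses Jensen-Shannon divergence as one reasonable choice; the source does not say which distance the authors would use, and the function names are hypothetical.

```python
import math
from collections import Counter


def word_dist(tokens, vocab):
    """Relative frequency of each vocab word (tokens must be non-empty)."""
    counts = Counter(tokens)
    n = len(tokens)
    return [counts[w] / n for w in vocab]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical inputs, at most 1."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def falsification_summary(human, base, instruct):
    """Pairwise divergences between the three sources on matched prompts."""
    vocab = sorted(set(human) | set(base) | set(instruct))
    h, b, i = (word_dist(t, vocab) for t in (human, base, instruct))
    return {
        "base_vs_instruct": js_divergence(b, i),
        "human_vs_base": js_divergence(h, b),
        "human_vs_instruct": js_divergence(h, i),
    }


summary = falsification_summary(
    "we look into the data".split(),
    "we look into the data".split(),
    "we delve into the data".split(),
)
print(summary)
```

A `base_vs_instruct` value near zero would mean there is no base-to-instruct shift for the metric to attribute. A `human_vs_base` value about as large as `human_vs_instruct` would mean the divergence from human usage was already present before preference training.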

Figures

Figures reproduced from arXiv: 2606.00334 by Jose Hernandez, Thomas Stephan Juzek, Xiaoyang Ming.

**Figure 2.** TPS Corpus-level Relative Ratio for Instruct vs

Original abstract

Various language domains have undergone remarkable changes in recent years; these shifts are largely attributed to the advent of Large Language Models and their misalignment with natural language usage. These misalignments are thought to partly originate in the preference-learning stage, e.g. Reinforcement Learning from Human Feedback, which generally makes models more useful but simultaneously may introduce systematic lexical bias. In terms of lexical behavior, this is visible in a model's preference for certain formats or the overuse of words (delve, furthermore), even when such patterns are not present in base model outputs. Research on lexical misalignment induced during preference training is constrained by reliance on manual curation. We address this, by introducing the Triangulated Preference Shift score, a metric that triangulates between human gold standards, base models, and instruct variants to isolate shifts induced specifically by preference learning, without manual curation. We provide data across six model families, anchor the results in the literature, and illustrate the general approach's utility by analyzing whether preference learning shifts models toward what could be interpreted as a "language of prestige". The metric provides an initial automated method to quantify behavioral shifts attributable to preference tuning, and thus, may help inform model alignment and development of trustworthy AI.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper introduces a curation-free triangulated metric for lexical shifts from preference tuning across six model families, but the isolation claim depends on an assumption about base vs instruct differences that lacks clear controls for SFT and other stages.

The main thing to know is that this work defines a Triangulated Preference Shift score by comparing human gold text, base model outputs, and instruct model outputs on the same prompts. The goal is to isolate lexical changes that happen specifically during preference learning without needing hand-curated examples. They run the metric on six model families and check whether the shifts point toward what they call a language of prestige.

The approach is new in its specific three-way setup and the automation angle. Running it at scale across families gives the results some breadth, and tying the findings back to existing lexical bias literature helps ground the claims.

The soft spot is the core assumption that differences between base and instruct outputs are driven mainly by the preference stage. Instruct models usually include supervised fine-tuning on instruction data before or with RLHF, and that step alone can change token distributions. The abstract and stress-test note do not mention controls that hold SFT data, tokenizer, or sampling fixed, so the measured shift may mix in effects from earlier training. If the full paper shows explicit checks against those confounds, the isolation would be stronger; otherwise the metric risks over-attributing changes to preference tuning.

This is for people working on automated evaluation tools for alignment side effects. A reader who wants a practical, curation-free starting point for tracking lexical bias could test the method and build on it.

I would send it to peer review. The multi-family data and the concrete metric give it enough substance for referees to evaluate the controls and validation details.

Referee Report

2 major / 2 minor

Summary. The paper introduces the Triangulated Preference Shift score, a curation-free metric that triangulates lexical statistics across human gold-standard text, base-model outputs, and instruct-model outputs on identical prompts. The goal is to isolate lexical shifts attributable specifically to the preference-learning stage (e.g., RLHF) across six model families, while also examining whether these shifts align with a hypothesized "language of prestige."

Significance. If the triangulation successfully isolates preference-stage effects, the metric would supply an automated, reproducible tool for quantifying and tracking lexical bias introduced during alignment, complementing existing manual-curation approaches and supporting more trustworthy model development.

major comments (2)

[Method / Triangulated Preference Shift score definition] The central claim that the metric isolates shifts induced specifically by preference learning rests on the assumption that base-to-instruct differences are driven primarily by the preference stage. However, the method description does not report controls that hold SFT data, tokenizer, or sampling temperature fixed across the base/instruct pairs; any measured shift therefore conflates the target stage with earlier pipeline differences.
[Method / Triangulated Preference Shift score definition] The triangulation subtracts base-model statistics from instruct-model statistics and further anchors against human text, but without the above controls the subtraction step cannot be shown to remove only preference-induced components rather than SFT-induced token-distribution changes.

minor comments (2)

[Abstract] The abstract states results are "anchored in the literature," but the manuscript should explicitly cite the specific prior lexical-bias studies being used for anchoring.
[Method] Notation for the three-way comparison (human, base, instruct) should be introduced with a compact equation or table to improve readability.
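One compact notation of the kind requested might look like the following. It is offered purely as an illustration: the symbols are hypothetical and are not taken from the manuscript, whose actual definition is not reproduced on this page. Let $f_H(w)$, $f_B(w)$, and $f_I(w)$ be the smoothed relative frequencies of word $w$ in the human, base-model, and instruct-model texts for the same prompts.

```latex
% Illustrative notation only; symbols are assumptions, not the paper's.
\begin{align}
  \Delta_{\mathrm{stage}}(w) &= \log \frac{f_I(w)}{f_B(w)}
    && \text{(instruct relative to base)} \\
  \Delta_{\mathrm{human}}(w) &= \log \frac{f_I(w)}{f_H(w)}
    && \text{(instruct relative to human)}
\end{align}
```

Under this reading, a word is a candidate preference-stage artifact when both quantities are positive. The referee's major comments then amount to noting that $\Delta_{\mathrm{stage}}$ absorbs every difference between the base and instruct checkpoints, including any from SFT.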

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and for identifying a key assumption in the Triangulated Preference Shift score. We respond to each major comment below and will revise the manuscript to increase transparency around methodological limitations.

Referee: [Method / Triangulated Preference Shift score definition] The central claim that the metric isolates shifts induced specifically by preference learning rests on the assumption that base-to-instruct differences are driven primarily by the preference stage. However, the method description does not report controls that hold SFT data, tokenizer, or sampling temperature fixed across the base/instruct pairs; any measured shift therefore conflates the target stage with earlier pipeline differences.

Authors: We agree that the publicly released base/instruct pairs used in the study do not hold SFT corpora, tokenizers, or decoding parameters fixed, and the original method section did not explicitly flag this. The triangulation subtracts base-model lexical statistics from instruct-model statistics (anchored to human text) precisely to surface differences attributable to whatever stages separate the two models; in the families examined, preference tuning is the dominant post-base stage. Nevertheless, the referee is correct that earlier differences could contribute. In revision we will add a dedicated “Assumptions and Limitations” subsection that states this caveat, describes the model pairs examined, and explains why the metric remains useful as a curation-free indicator even if it does not isolate preference effects with perfect purity.

revision: yes

Referee: [Method / Triangulated Preference Shift score definition] The triangulation subtracts base-model statistics from instruct-model statistics and further anchors against human text, but without the above controls the subtraction step cannot be shown to remove only preference-induced components rather than SFT-induced token-distribution changes.

Authors: The subtraction step is intended to cancel components shared by base and instruct outputs; the human anchor then contextualizes whether the residual shift moves toward or away from natural language. We acknowledge that, absent fixed controls, any SFT-induced distributional change unique to the instruct checkpoint will remain in the residual. The revised manuscript will (a) restate the triangulation formula with explicit notation for the stages being compared and (b) add a short paragraph discussing the practical difficulty of obtaining perfectly matched base/instruct pairs while still providing a reproducible, curation-free signal across six families.

revision: yes

Circularity Check

0 steps flagged

No significant circularity in the derivation chain

The paper proposes the Triangulated Preference Shift score as a new metric constructed via direct comparison of human gold standards, base-model outputs, and instruct-model outputs on the same prompts. This is presented as an independent definitional contribution for isolating preference-stage effects without curation. No equations, fitted parameters renamed as predictions, or self-citation chains appear in the provided text that would reduce the claimed isolation to an input by construction. The central claim retains independent content as a proposed automated measurement approach, even though its interpretive validity rests on external assumptions about training stages.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only; no explicit free parameters, axioms, or invented entities are stated. The central claim rests on the unstated assumption that the three-way comparison cleanly separates preference effects.


[1] [1]

AI --- 2024 Stack Overflow Developer Survey , year =

2024

[2] [2]

2024 , month = may, howpublished =

Coffey, Lauren , title =. 2024 , month = may, howpublished =

2024

[3] [3]

2025 , month = jul, howpublished =

O'Brien, Matt and Sanders, Linley , title =. 2025 , month = jul, howpublished =

2025

[4] [4]

2025 , month = jun, howpublished =

Sidoti, Olivia and McClain, Colleen , title =. 2025 , month = jun, howpublished =

2025

[5] [5]

Delving into

Kobak, Dmitry and M. Delving into. arXiv preprint arXiv:2406.07016 , year =

arXiv

[6] [6]

arXiv preprint arXiv:2404.01268 , year =

Liang, Weixin and Zhang, Yaohui and Wu, Zhengxuan and Lepp, Haley and Ji, Wenlong and Zhao, Xuandong and Cao, Hancheng and Liu, Sheng and He, Siyu and Huang, Zhi and others , title =. arXiv preprint arXiv:2404.01268 , year =

arXiv

[7] [7]

Advances in Neural Information Processing Systems , year =

Ouyang, Long and Wu, Jeffrey and Jiang, Xu and Almeida, Diogo and Wainwright, Carroll and Mishkin, Pamela and Zhang, Chong and Agarwal, Sandhini and Slama, Katarina and Ray, Alex and others , title =. Advances in Neural Information Processing Systems , year =

[8] [8]

Advances in Neural Information Processing Systems , year =

Rafailov, Rafael and Sharma, Archit and Mitchell, Eric and Manning, Christopher D and Ermon, Stefano and Finn, Chelsea , title =. Advances in Neural Information Processing Systems , year =

[9] [9]

and Kolossa, Dorothea , title =

Irrgang, Verena and Solopova, Veronika and Zeiler, Steffen and Nickel, Robert M. and Kolossa, Dorothea , title =. Proceedings of the 20th Conference on Natural Language Processing (KONVENS 2024) , address =. 2024 , month = sep, pages =

2024

[10] [10]

Achiam, Josh and Adler, Steven and Agarwal, Sandhini and Ahmad, Lama and Akkaya, Ilge and others , journal =

[11] [11]

Measuring Massive Multitask Language Understanding (

Hendrycks, Dan and Burns, Collin and Basart, Steven and Zou, Andy and Mazeika, Mantas and Song, Dawn and Steinhardt, Jacob , journal =. Measuring Massive Multitask Language Understanding (

[12] [12]

medRxiv , year =

Matsui, Kentaro , title =. medRxiv , year =

[13] [13]

arXiv preprint arXiv:2404.15799 , year =

Liu, Jialin and Bu, Yi , title =. arXiv preprint arXiv:2404.15799 , year =

arXiv

[14] [14]

arXiv preprint arXiv:2403.16887 , year =

Gray, Andrew , title =. arXiv preprint arXiv:2403.16887 , year =

arXiv

[15] [15]

Geng, Mingmeng and Trotta, Roberto , journal =. Is

[16] [16]

arXiv preprint arXiv:2506.05339 , year =

Flattery, Fluff, and Fog: Diagnosing and Mitigating Idiosyncratic Biases in Preference Models , author =. arXiv preprint arXiv:2506.05339 , year =

arXiv

[17] [17]

arXiv preprint arXiv:2508.01930 , year =

Word Overuse and Alignment in Large Language Models: The Influence of Learning from Human Feedback , author =. arXiv preprint arXiv:2508.01930 , year =

arXiv

[18] [18]

Advances in Neural Information Processing Systems (NeurIPS) , year =

Deep Reinforcement Learning from Human Preferences , author =. Advances in Neural Information Processing Systems (NeurIPS) , year =

[19] [19]

arXiv preprint arXiv:2109.01652 , year =

Finetuned Language Models are Zero-Shot Learners , author =. arXiv preprint arXiv:2109.01652 , year =

Pith/arXiv arXiv

[20] [20]

arXiv preprint arXiv:2307.09288 , year =

Llama 2: Open Foundation and Fine-Tuned Chat Models , author =. arXiv preprint arXiv:2307.09288 , year =

Pith/arXiv arXiv

[21] [21]

ACL 2025 Student Research Workshop , year =

Testing English News Articles for Lexical Homogenization Due to Widespread Use of Large Language Models , author =. ACL 2025 Student Research Workshop , year =

2025

[22] [22]

arXiv preprint arXiv:2409.01754 , year =

Empirical Evidence of Large Language Model's Influence on Human Spoken Communication , author =. arXiv preprint arXiv:2409.01754 , year =

arXiv

[23] [23]

, journal =

Anderson, Bryce and Galpin, Riley and Juzek, Tom S. , journal =. Model Misalignment and Language Change: Traces of

[24] [24]

Nature , year =

Gibney, Elizabeth , title =. Nature , year =. doi:10.1038/d41586-025-00229-6 , note =

work page doi:10.1038/d41586-025-00229-6

[25] [25]

arXiv preprint arXiv:2110.14168 , year =

Training Verifiers to Solve Math Word Problems , author =. arXiv preprint arXiv:2110.14168 , year =

Pith/arXiv arXiv

[26] [26]

arXiv preprint arXiv:2409.11704 , year =

Zhang, Xuanchang and Xiong, Wei and Chen, Lichang and Zhou, Tianyi and Huang, Heng and Zhang, Tong , title =. arXiv preprint arXiv:2409.11704 , year =

arXiv

[27] [27]

Nature , year =

Shumailov, Ilia and Shumaylov, Zakhar and Zhao, Yiren and Gal, Yarin and Papernot, Nicolas and Anderson, Ross , title =. Nature , year =

[28] Alemohammad, Sina; Casco-Rodriguez, Josue; Luzi, Lorenzo; Humayun, Ahmed Imtiaz; Babaei, Hossein; LeJeune, Daniel; Siahkoohi, Ali; Baraniuk, Richard G. arXiv preprint arXiv:2307.01850.

[29] Briesch, Martin; Sobania, Dominik; Rothlauf, Franz. arXiv preprint arXiv:2311.16822.

[30] da Silva, Antonio Marcio; Rottava, Lucia. Texto Livre.

[31] Kotz, Gabriela; Salcedo-Lagos, Pedro; Fuentes, Karina. Lengua y Sociedad.

[32] Jin, Houji; Ashrafi, Negin; Abdollahi, Armin; others. 2025.

[33] Zaitsu, Wataru; Jin, Mingzhe. PLOS ONE.

[34] Schaaff, Kristina; Schlippe, Tim; Mindner, Lorenz. International Journal of Speech Technology.

[35] Ter. Linguistic Characteristics of. 2025.

[36] Gehrmann, Sebastian; Strobelt, Hendrik; Rush, Alexander.

[37] Chakraborty, Souradip; others. On the Possibilities of

[38] Mitchell, Eric; Lee, Yoonho; Khazatsky, Alexander; Manning, Christopher D.; Finn, Chelsea.

[39] On the Reliability of Watermarks for Large Language Models. Proceedings of ICLR.

[40] Huang, Yifei; others.

[41] Sadasivan, Vinu Sankar; others. arXiv preprint arXiv:2303.11156.

[42] Weber-Wulff, Debora; others. International Journal for Educational Integrity.

[43] Blodgett, Su Lin; Barocas, Solon; Daum. Language (Technology) is Power: A Critical Survey of "Bias" in. arXiv preprint arXiv:2005.14050.

[44] Bender, Emily M.; Gebru, Timnit; McMillan-Major, Angelina; Shmitchell, Shmargaret. Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 2021.

[45] Kotek, Hadas; Dockum, Rikker; Sun, David. Proceedings of the ACM Collective Intelligence Conference.

[46] Omiye, Jesutofunmi A.; Lester, Jenna C.; Spichak, Simon; Rotemberg, Veronica; Daneshjou, Roxana. NPJ Digital Medicine.

[47] Towards Understanding Sycophancy in Language Models. arXiv preprint arXiv:2310.13548.

[48] Simple Synthetic Data Reduces Sycophancy in Large Language Models. Proceedings of ICLR.

[49] Poushter, Jacob; Smerkovich, Maria; Fagan, Moira; Prozorovsky, Andrew.

[50] Zajonc, Robert B. Journal of Personality and Social Psychology, Monograph Supplement.

[51] Frequency and the Conference of Referential Validity. Journal of Verbal Learning and Verbal Behavior.

[52] The Falcon Series of Open Language Models. arXiv preprint arXiv:2311.16867.

[53] Gemma 3 Technical Report. arXiv preprint arXiv:2503.19786.

[54] The Llama 3 Herd of Models. arXiv preprint arXiv:2407.21783.

[55] Jiang, Albert Q.; Sablayrolles, Alexandre; Mensch, Arthur; others. Mistral

[56] 2. arXiv preprint arXiv:2501.00656.

[57] Young, Alex; Chen, Bei; Li, Chao; others. Yi: Open Foundation Models by 01

[58] Open. 2024.

[59] Krichevsky, R. E.; Trofimov, V. K. IEEE Transactions on Information Theory.

[60] Wiktionary: The Free Dictionary.

[61] Comparing LLM-generated and human-authored news text using formal syntactic theory. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), July 2025. doi:10.18653/v1/2025.acl-long.443

[62] Contrasting Linguistic Patterns in Human and LLM-Generated News Text. Artificial Intelligence Review, 2024.

[63] Do LLMs write like humans? Variation in grammatical and rhetorical styles. 2024.

[64] spaCy: Industrial-Strength Natural Language Processing in Python. Zenodo. doi:10.5281/zenodo.1212303

[65] Universal Dependencies v2: An Evergrowing Multilingual Treebank Collection. Proceedings of the Twelfth Language Resources and Evaluation Conference.

[66] CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies. Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, 2017. doi:10.18653/v1/K17-3001

[67] Verbosity Bias in Preference Labeling by Large Language Models. 2023.

[68] Zhang, Xuanchang; Xiong, Wei; Chen, Lichang; Zhou, Tianyi; Huang, Heng; Zhang, Tong. Verbalized Sampling: How to Mitigate Mode Collapse and Unlock. 2025.

[69] What Comes Next? Evaluating Uncertainty in Neural Text Generators Against Human Production Variability. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023. doi:10.18653/v1/2023.emnlp-main.887

[70] Cloze Distillation: Improving Neural Language Models with Human Next-Word Prediction. Proceedings of the 24th Conference on Computational Natural Language Learning, 2020. doi:10.18653/v1/2020.conll-1.49

[71] English Words: History and Structure. 2001.

[72] The formality of the Latinate lexicon in English. Language and Speech, 1981.

[73] Can Grammarly and ChatGPT accelerate language change? AI-powered technologies and their impact on the English language: wordiness vs. conciseness. arXiv preprint arXiv:2502.04324.

[74] Perrigo, Billy. TIME.

[75] How LLMs Distort Our Written Language. arXiv preprint arXiv:2603.18161.

[76] Twist, Lukas; Zhang, Jie M.; Harman, Mark; Syme, Don; Noppen, Joost; Yannakoudakis, Helen; Nauck, Detlef. A Study of