arXiv preprint arXiv:2409.11704

From lists to emojis: How format bias affects model alignment · arXiv 2409.11704

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Fully Automated Identification of Lexical Alignment and Preference-Stage Shifts in Large Language Models

cs.CL · 2026-06-02 · unverdicted · novelty 7.0

Introduces Lexical Alignment Score and Triangulated Preference Shift metrics to automatically identify lexical overuse in LLMs and attribute portions to preference learning stages via windowed prevalence on PubMed data.

Isolating LLM Lexical Bias: A Curation-Free Triangulated Metric for Preference-Stage Learning

cs.CL · 2026-05-29 · unverdicted · novelty 6.0

Introduces a triangulation-based metric to quantify lexical shifts attributable to preference tuning without requiring manual curation of examples.

citing papers explorer

Showing 2 of 2 citing papers after filters.

Fully Automated Identification of Lexical Alignment and Preference-Stage Shifts in Large Language Models
cs.CL · 2026-06-02 · unverdicted · none · ref 23
Isolating LLM Lexical Bias: A Curation-Free Triangulated Metric for Preference-Stage Learning
cs.CL · 2026-05-29 · unverdicted · none · ref 26
