Whose Facts Win? LLM Source Preferences under Knowledge Conflicts

Jakob Schuster; Katja Markert; Vagrant Gautam

arxiv: 2601.03746 · v3 · submitted 2026-01-07 · 💻 cs.CL

Whose Facts Win? LLM Source Preferences under Knowledge Conflicts

Jakob Schuster , Vagrant Gautam , Katja Markert This is my paper

Pith reviewed 2026-05-16 16:53 UTC · model grok-4.3

classification 💻 cs.CL

keywords large language modelsknowledge conflictssource preferencescredibilitysynthetic sourcesrepetition biasretrieval-augmented generation

0 comments

The pith

LLMs favor government and newspaper sources over people or social media when facts conflict, but repetition of the weaker source can reverse the choice.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines how large language models decide which piece of information to accept when two retrieved contexts contradict each other. It isolates the effect of source type by creating synthetic sources that represent different levels of institutional credibility. Across thirteen open-weight models, the results show a stable preference for government and newspaper sources over personal or social-media accounts. The same experiments reveal that simply repeating the lower-credibility claim several times is enough to flip the model’s selection. A mitigation technique is introduced that cuts the repetition effect while keeping most of the original source preference intact.

Core claim

When LLMs face inter-context knowledge conflicts, they resolve them by preferring information attributed to institutionally corroborated sources such as governments or newspapers over information attributed to individuals or social media. These source preferences are not fixed: repeating the same claim from a less-credible source reverses the preference in favor of that source. A new repetition-bias reduction method lowers the reversal rate by up to 79.2 percent while retaining at least 72.5 percent of the models’ original source preferences.

What carries the argument

A synthetic-source framework that generates controlled knowledge conflicts between different credibility types without inheriting biases from specific real-world documents.

If this is right

Retrieval-augmented pipelines could improve consistency by weighting institutional sources higher when conflicts arise.
Training data that repeatedly features lower-credibility claims may systematically erode LLMs’ source-based fact selection.
Mitigation methods that dampen repetition effects can be applied at inference time without retraining.
Systems that surface multiple conflicting sources will need explicit mechanisms to preserve credibility ordering.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

LLMs appear to have absorbed a broad societal ranking of source credibility from their training data.
In live retrieval settings, the number of times a claim appears may override the source label the model was trained to respect.
Testing the same synthetic-source protocol on closed models or on non-English data would show whether the preference pattern is architecture- or language-specific.

Load-bearing premise

That the preference patterns produced by synthetic sources will appear unchanged when the same models encounter real retrieved documents from actual institutions and social platforms.

What would settle it

Run the same conflict-resolution tests on a fresh set of real web documents drawn from government sites, established newspapers, personal blogs, and social-media posts; check whether the institutional-source preference and the repetition-reversal effect both reappear at comparable rates.

Figures

Figures reproduced from arXiv: 2601.03746 by Jakob Schuster, Katja Markert, Vagrant Gautam.

**Figure 2.** Figure 2: We measure the influence of source credibility on a model’s output by observing how answer probabilities [PITH_FULL_IMAGE:figures/full_fig_p002_2.png] view at source ↗

**Figure 3.** Figure 3: Source preferences when comparing attributed [PITH_FULL_IMAGE:figures/full_fig_p004_3.png] view at source ↗

**Figure 4.** Figure 4: Model preferences between source types un [PITH_FULL_IMAGE:figures/full_fig_p004_4.png] view at source ↗

**Figure 5.** Figure 5: Preferences when conflicting information is [PITH_FULL_IMAGE:figures/full_fig_p005_5.png] view at source ↗

**Figure 6.** Figure 6: Source preferences when conflicting infor [PITH_FULL_IMAGE:figures/full_fig_p005_6.png] view at source ↗

**Figure 7.** Figure 7: Probability deviation from 50% of RHS an [PITH_FULL_IMAGE:figures/full_fig_p006_7.png] view at source ↗

**Figure 9.** Figure 9: LLMs mostly prefer repeated unattributed information, flipping prior preferences for attributed information. Legend in §3.3. No Source Government No Source (Repeated) Government Social Media Government Social Media (1-Table Maj.) Government Social Media (2-Table Maj.) Government -50 -40 -30 -20 -10 0 10 20 30 40 50 SP Social Media (Repeated) Government [PITH_FULL_IMAGE:figures/full_fig_p007_9.png] view at source ↗

**Figure 10.** Figure 10: Source preferences when QWEN models are instructed to consider source credibility (darker), compared to original prompts (lighter). Prompting weakens repetition bias but not enough to ensure consistency with the original source hierarchy. Legend in §3.3. that source credibility still plays a role in this setting, but takes a backseat compared to repetition. Credibility prompting. We investigate whether … view at source ↗

**Figure 11.** Figure 11: GEMMA3-4B when fine-tuned and prompted to consider credibility (darker) in comparison to the original teacher model (lighter). This setup reduces repetition bias and maintains original preferences. We also show results with just fine-tuning in Appendix J. The SPd-gap between government and no source reduces by 99.8% from 45.7 to 1.0, while retaining 88.8% of the original preference from 29.4 to 26.1. For… view at source ↗

**Figure 12.** Figure 12: GPT-4.1 prompt to create alternative values for NeoQA attributes with a small set of possible values. Variations for the eye color attribute of PERSON entities are brown, blue, green, hazel, grey, amber, black, dark brown, light brown, dark blue, light blue, emerald and golden brown. Variations for the hair color attribute of PERSON entities are black, brown, blonde, red, gray, white, dark brown, light b… view at source ↗

**Figure 13.** Figure 13: Prompt used to generate alternative values for [PITH_FULL_IMAGE:figures/full_fig_p016_13.png] view at source ↗

**Figure 14.** Figure 14: Position bias for all models, displaying [PITH_FULL_IMAGE:figures/full_fig_p017_14.png] view at source ↗

**Figure 15.** Figure 15: Example prompt for experiments with government vs. unattributed knowledge in the QWEN2.5 template. Instruction following. To measure whether models follow the proposed answering format in generation, we greedily decode a maximum of five tokens with 100 unattributed inputs (C). After parsing the generations with regular expressions, we report whether they answered with only a single letter in the correct… view at source ↗

**Figure 16.** Figure 16: Example prompt for eliciting prompted preference between [PITH_FULL_IMAGE:figures/full_fig_p019_16.png] view at source ↗

**Figure 17.** Figure 17: Source preferences between attributed and [PITH_FULL_IMAGE:figures/full_fig_p020_17.png] view at source ↗

**Figure 19.** Figure 19: Source preferences between attributed and [PITH_FULL_IMAGE:figures/full_fig_p021_19.png] view at source ↗

**Figure 21.** Figure 21: Source preferences between attributed and [PITH_FULL_IMAGE:figures/full_fig_p022_21.png] view at source ↗

**Figure 23.** Figure 23: Source preference when conflicting informa [PITH_FULL_IMAGE:figures/full_fig_p023_23.png] view at source ↗

**Figure 24.** Figure 24: shows the directly prompted preferences for all inter- and intra-type experiments (15 source contrasts and 13 models = 195 cases). Prompting Newspaper Government Person Government Social Media Government Person Newspaper Social Media Newspaper Social Media Person Low Circulation (Newspaper) High Circulation (Newspaper) Low Followers (Social Media) High Followers (Social Media) No Title (Person) Academic T… view at source ↗

**Figure 25.** Figure 25: Example prompt for experiments a conflicting [PITH_FULL_IMAGE:figures/full_fig_p024_25.png] view at source ↗

**Figure 26.** Figure 26: Source preference when conflicting infor [PITH_FULL_IMAGE:figures/full_fig_p024_26.png] view at source ↗

**Figure 27.** Figure 27: Example prompt for experiments with a conflicting [PITH_FULL_IMAGE:figures/full_fig_p025_27.png] view at source ↗

**Figure 28.** Figure 28: Example prompt for experiments with a conflicting [PITH_FULL_IMAGE:figures/full_fig_p026_28.png] view at source ↗

**Figure 29.** Figure 29: Source preferences when GEMMA models are instructed to consider source credibility (darker), compared to original prompts (lighter). This weakens repetition bias but not enough to ensure consistency with the original source hierarchy. Legend in §3.3. I Credibility Prompting for All Models We add a paragraph to the instruction of every prompt, stating: "When selecting an answer, identify which sources sup… view at source ↗

**Figure 31.** Figure 31: Source preferences when OLMO models are instructed to consider source credibility (darker), compared to original prompts (lighter). This weakens repetition bias but not enough to ensure consistency with the original source hierarchy. Legend in §3.3. 27 [PITH_FULL_IMAGE:figures/full_fig_p027_31.png] view at source ↗

**Figure 32.** Figure 32: GEMMA-3-4B when fine-tuned (darker) in comparison to original results (lighter). They show less repetition bias but do not attend to source credibility to the same degree as when combining fine-tuning and credibility prompting. Fine-tuning results without additional credibility prompting. In the main paper, we report the results of the fine-tuned model plus credibility prompting, which achieves the best … view at source ↗

read the original abstract

As large language models (LLMs) are more frequently used in retrieval-augmented generation pipelines, it is increasingly relevant to study their behavior under knowledge conflicts. Thus far, the role of the source of the retrieved information has gone unexamined. We address this gap with a novel framework to investigate how source preferences affect LLM resolution of inter-context knowledge conflicts in English, motivated by interdisciplinary research on credibility. By using synthetic sources, we study preferences for different types of sources without inheriting the biases of specific real-world sources. With a comprehensive, tightly-controlled evaluation of 13 open-weight LLMs, we find that LLMs prefer institutionally-corroborated information (e.g., government or newspaper sources) over information from people and social media. However, these source preferences can be reversed by simply repeating information from less credible sources. To mitigate repetition effects and maintain consistent preferences, we propose a novel method that reduces repetition bias by up to 79.2%, while also maintaining at least 72.5% of original preferences. We release all data and code to encourage future work on credibility and source preferences in knowledge-intensive NLP.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper shows LLMs favor institutional sources over social media in knowledge conflicts and that repetition can flip this, plus a mitigation that cuts the effect by up to 79 percent in their tests.

read the letter

The main result is straightforward: across 13 open-weight models, LLMs resolve conflicting facts by preferring government or newspaper sources over people or social media, but repeating the weaker source enough times reverses the preference. The authors then test a simple mitigation that reduces the repetition bias while keeping at least 72.5 percent of the original source preference intact. They release the data and code, which is the right move for this kind of work. The synthetic-source design is a reasonable way to isolate credibility signals without inheriting the quirks of any particular real dataset, and the controlled setup across models gives the numbers some weight. The repetition reversal itself is a useful observation for anyone running retrieval-augmented generation, since it points to a cheap way the pipeline can go wrong. The main limitation is that everything stays inside the synthetic regime. The source labels are generated by the experimenters, so the models could be reacting to surface phrasing or framing rather than any internalized sense of institutional credibility. The mitigation is also only measured in the same artificial setting, so it is unclear whether the same numbers would appear with real retrieved documents that carry formatting, length, or provenance noise. That does not invalidate the internal findings, but it caps how far the practical claims can be pushed without follow-up. This is worth a serious referee for groups working on factuality in RAG pipelines. The framework is new enough and the artifacts are open enough that referees can ask for the right additional checks on generalization. I would bring it to a reading group to discuss the repetition result and whether the mitigation scales.

Referee Report

2 major / 2 minor

Summary. The paper introduces a framework using synthetic sources to study how LLMs resolve inter-context knowledge conflicts, motivated by credibility research. Through a controlled evaluation of 13 open-weight models, it reports that LLMs prefer institutionally corroborated sources (e.g., government or newspaper) over personal or social media sources, but that these preferences can be reversed by repeating information from less credible sources. It further proposes a method to reduce repetition bias by up to 79.2% while retaining at least 72.5% of original preferences, and releases all data and code.

Significance. If the results hold, the work provides actionable insights for retrieval-augmented generation pipelines by quantifying source-type biases and offering a mitigation technique. The multi-model evaluation and public release of data/code are strengths that support reproducibility and follow-on research on credibility in knowledge-intensive NLP.

major comments (2)

[§4] §4 (synthetic source construction): the templates used to instantiate sources (e.g., “According to a government report…” vs. “According to a social media user…”) are not shown to be free of surface-level lexical or framing cues that LLMs could exploit directly; without an ablation that varies only the source label while holding content fixed, the observed preferences may reflect template artifacts rather than internalized credibility models.
[§5.3] §5.3 (repetition mitigation evaluation): the proposed bias-reduction method is validated exclusively inside the same synthetic-source regime used for the main experiments; because real-world retrieval introduces provenance, formatting, and noise signals absent from the synthetic setup, it is unclear whether the reported 79.2% bias reduction and 72.5% preference retention would survive outside the controlled synthetic condition.

minor comments (2)

[Abstract, §3] The abstract and §3 claim the framework is “tightly controlled,” yet no quantitative measure of control (e.g., lexical overlap statistics between source templates) is provided.
[Results tables] Table 2 (or equivalent results table) should report per-model variance or confidence intervals for the preference percentages to allow readers to assess stability across the 13 LLMs.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. We address each major comment below and indicate where revisions will be made to strengthen the manuscript.

read point-by-point responses

Referee: [§4] §4 (synthetic source construction): the templates used to instantiate sources (e.g., “According to a government report…” vs. “According to a social media user…”) are not shown to be free of surface-level lexical or framing cues that LLMs could exploit directly; without an ablation that varies only the source label while holding content fixed, the observed preferences may reflect template artifacts rather than internalized credibility models.

Authors: We agree that an explicit ablation isolating the source label would provide stronger evidence that preferences arise from source type rather than template phrasing. The templates were intentionally kept minimal and drawn from standard credibility literature phrasings to reduce framing effects, but we acknowledge this does not fully rule out lexical cues. In the revised manuscript we will add a new ablation experiment that holds the factual content fixed and varies only the source descriptor (e.g., swapping “government report” for “social media user” while preserving identical wording elsewhere). This will be reported in §4 and the associated results table. revision: yes
Referee: [§5.3] §5.3 (repetition mitigation evaluation): the proposed bias-reduction method is validated exclusively inside the same synthetic-source regime used for the main experiments; because real-world retrieval introduces provenance, formatting, and noise signals absent from the synthetic setup, it is unclear whether the reported 79.2% bias reduction and 72.5% preference retention would survive outside the controlled synthetic condition.

Authors: We concur that the synthetic regime, while enabling tight control over source type and repetition, omits real-world signals such as document formatting, provenance metadata, and retrieval noise. The mitigation technique was designed and evaluated within this controlled setting precisely to isolate repetition bias from other confounds. In the revision we will expand the limitations paragraph in §5.3 and the conclusion to explicitly discuss this scope restriction and to recommend that future work test the method on noisy web-retrieved documents. We cannot add new real-world experiments within the current revision timeline, but the released code and data will facilitate such follow-up studies. revision: partial

Circularity Check

0 steps flagged

No circularity: purely empirical evaluation with no derivations or self-referential reductions

full rationale

The paper conducts a controlled empirical study of LLM behavior under synthetic knowledge conflicts, measuring source-type preferences across 13 models via direct prompting experiments. No equations, fitted parameters, uniqueness theorems, or derivation chains are present that could reduce outputs to inputs by construction. All claims rest on observed response distributions from the experimental setup rather than any self-definition, renamed empirical patterns, or load-bearing self-citations. The synthetic source construction is an explicit methodological choice for bias control and is evaluated as such, without circular feedback into the reported preferences.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The study rests on standard assumptions from credibility research and LLM prompting practices; no free parameters, invented entities, or non-standard axioms are mentioned in the abstract.

pith-pipeline@v0.9.0 · 5496 in / 1044 out tokens · 34626 ms · 2026-05-16T16:53:50.430428+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

14 extracted references · 14 canonical work pages · 3 internal anchors

[1]

Angelica Chen, Jason Phang, Alicia Parrish, Vishakh Padmakumar, Chen Zhao, Samuel R

Dissociation of processes in belief: Source recollection, statement familiarity, and the illusion of truth.Journal of Experimental Psychology: General, 121(4):446. Angelica Chen, Jason Phang, Alicia Parrish, Vishakh Padmakumar, Chen Zhao, Samuel R. Bowman, and Kyunghyun Cho. 2024a. Two failures of self- consistency in the multi-step reasoning of LLMs. Tra...

work page 2024
[2]

InProceedings of the 2022 Con- ference on Empirical Methods in Natural Language Processing, pages 2292–2307

Rich knowledge sources bring complex knowl- edge conflicts: Recalibrating models to reflect con- flicting evidence. InProceedings of the 2022 Con- ference on Empirical Methods in Natural Language Processing, pages 2292–2307. Kimberle Crenshaw. 1991. Mapping the margins: In- tersectionality, identity politics, and violence against women of color.Stanford L...

work page 2022
[3]

Vagrant Gautam, Eileen Bingert, Dawei Zhu, Anne Lauscher, and Dietrich Klakow

Bias and fairness in large language models: A survey.Computational Linguistics, 50(3):1097– 1179. Vagrant Gautam, Eileen Bingert, Dawei Zhu, Anne Lauscher, and Dietrich Klakow. 2024. Robust pro- noun fidelity with english llms: Are they reasoning, repeating, or just biased?Transactions of the Associ- ation for Computational Linguistics, 12:1755–1779. Vagr...

work page 2024
[4]

The Llama 3 Herd of Models

A lightweight method to generate unanswer- able questions in English. InFindings of the As- sociation for Computational Linguistics: EMNLP 2023, pages 7349–7360, Singapore. Association for Computational Linguistics. Max Glockner, Xiang Jiang, Leonardo F. R. Ribeiro, Iryna Gurevych, and Markus Dreyer. 2025. NeoQA: Evidence-based question answering with gen...

work page internal anchor Pith review Pith/arXiv arXiv 2023
[5]

Giwon Hong, Jeonghwan Kim, Junmo Kang, Sung- Hyon Myaeng, and Joyce Jiyoung Whang

Frequency and the conference of referential validity.Journal of verbal learning and verbal be- havior, 16(1):107–112. Giwon Hong, Jeonghwan Kim, Junmo Kang, Sung- Hyon Myaeng, and Joyce Jiyoung Whang. 2024. Why so gullible? enhancing the robustness of retrieval-augmented models against counterfactual noise. InFindings of the Association for Computa- tiona...

work page arXiv 2024
[6]

InProceedings of the 2021 Conference on Empirical Methods in Natural Language Process- ing, pages 7052–7063, Online and Punta Cana, Do- minican Republic

Entity-based knowledge conflicts in question answering. InProceedings of the 2021 Conference on Empirical Methods in Natural Language Process- ing, pages 7052–7063, Online and Punta Cana, Do- minican Republic. Association for Computational Linguistics. Chaitanya Malaviya, Sudeep Bhatia, and Mark Yatskar

work page 2021
[7]

2 OLMo 2 Furious

Cascading biases: Investigating the effect of heuristic annotation strategies on data and models. InProceedings of the 2022 Conference on Empiri- cal Methods in Natural Language Processing, pages 6525–6540, Abu Dhabi, United Arab Emirates. As- sociation for Computational Linguistics. Simon Malberg, Roman Poletukhin, Carolin M. Schus- ter, and Georg Groh. ...

work page internal anchor Pith review Pith/arXiv arXiv 2022
[8]

Qwen2.5 Technical Report

Not all contexts are equal: Teaching LLMs credibility-aware generation. InProceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 19844–19863, Miami, Florida, USA. Association for Computational Lin- guistics. Gordon Pennycook, Tyrone D Cannon, and David G Rand. 2018. Prior exposure increases perceived accu- racy of fa...

work page internal anchor Pith review Pith/arXiv arXiv 2024
[9]

InThe Thirty-eight Conference on Neural Information Processing Systems Datasets and Bench- marks Track

$\texttt{ConflictBank}$: A benchmark for evaluating the influence of knowledge conflicts in LLMs. InThe Thirty-eight Conference on Neural Information Processing Systems Datasets and Bench- marks Track. Arjun Subramonian, Vagrant Gautam, Preethi Seshadri, Dietrich Klakow, Kai-Wei Chang, and Yizhou Sun

work page
[10]

Let me speak freely? a study on the impact of format restrictions on performance of large language models.arXiv preprint arXiv:2408.02442, 2024

Agree to disagree? a meta-evaluation of LLM misgendering. InSecond Conference on Language Modeling. Zhi Rui Tam, Cheng-Kuang Wu, Yi-Lin Tsai, Chieh- Yen Lin, Hung-yi Lee, and Yun-Nung Chen. 2024. Let me speak freely? a study on the impact of format restrictions on performance of large language models. arXiv preprint arXiv:2408.02442. Hexiang Tan, Fei Sun,...

work page arXiv 2024
[11]

14 A Creation of Conflict Pairs In Table 1, we show examples of four counterfactually-created alternative values for dif- ferent entity types and attributes

Judging llm-as-a-judge with mt-bench and chatbot arena.Advances in neural information pro- cessing systems, 36:46595–46623. 14 A Creation of Conflict Pairs In Table 1, we show examples of four counterfactually-created alternative values for dif- ferent entity types and attributes. In the following subsections, we describe three different methods of creati...

work page
[12]

Rescalingfor numerical attributes (Appendix A.1)

work page
[13]

Samplingfor categorical attributes with a small number of possible values (Appendix A.2)

work page
[14]

| | Weight | 135 lbs | | Eye Color | Brown | | Hair Color | Black | | Marital Status | Single | | Political Affiliation | Independent | \

Generationfor categorical attributes with a large number of possible values (Appendix A.3) A.1 Rescaling We automatically adjust the values of numerical attributes that are not dates (such asbudget) by up to ±20%. Numbers with five digits or more are rounded to the third most significant decimal place to preserve a consistent level of precision. We scale ...

work page 2025

[1] [1]

Angelica Chen, Jason Phang, Alicia Parrish, Vishakh Padmakumar, Chen Zhao, Samuel R

Dissociation of processes in belief: Source recollection, statement familiarity, and the illusion of truth.Journal of Experimental Psychology: General, 121(4):446. Angelica Chen, Jason Phang, Alicia Parrish, Vishakh Padmakumar, Chen Zhao, Samuel R. Bowman, and Kyunghyun Cho. 2024a. Two failures of self- consistency in the multi-step reasoning of LLMs. Tra...

work page 2024

[2] [2]

InProceedings of the 2022 Con- ference on Empirical Methods in Natural Language Processing, pages 2292–2307

Rich knowledge sources bring complex knowl- edge conflicts: Recalibrating models to reflect con- flicting evidence. InProceedings of the 2022 Con- ference on Empirical Methods in Natural Language Processing, pages 2292–2307. Kimberle Crenshaw. 1991. Mapping the margins: In- tersectionality, identity politics, and violence against women of color.Stanford L...

work page 2022

[3] [3]

Vagrant Gautam, Eileen Bingert, Dawei Zhu, Anne Lauscher, and Dietrich Klakow

Bias and fairness in large language models: A survey.Computational Linguistics, 50(3):1097– 1179. Vagrant Gautam, Eileen Bingert, Dawei Zhu, Anne Lauscher, and Dietrich Klakow. 2024. Robust pro- noun fidelity with english llms: Are they reasoning, repeating, or just biased?Transactions of the Associ- ation for Computational Linguistics, 12:1755–1779. Vagr...

work page 2024

[4] [4]

The Llama 3 Herd of Models

A lightweight method to generate unanswer- able questions in English. InFindings of the As- sociation for Computational Linguistics: EMNLP 2023, pages 7349–7360, Singapore. Association for Computational Linguistics. Max Glockner, Xiang Jiang, Leonardo F. R. Ribeiro, Iryna Gurevych, and Markus Dreyer. 2025. NeoQA: Evidence-based question answering with gen...

work page internal anchor Pith review Pith/arXiv arXiv 2023

[5] [5]

Giwon Hong, Jeonghwan Kim, Junmo Kang, Sung- Hyon Myaeng, and Joyce Jiyoung Whang

Frequency and the conference of referential validity.Journal of verbal learning and verbal be- havior, 16(1):107–112. Giwon Hong, Jeonghwan Kim, Junmo Kang, Sung- Hyon Myaeng, and Joyce Jiyoung Whang. 2024. Why so gullible? enhancing the robustness of retrieval-augmented models against counterfactual noise. InFindings of the Association for Computa- tiona...

work page arXiv 2024

[6] [6]

InProceedings of the 2021 Conference on Empirical Methods in Natural Language Process- ing, pages 7052–7063, Online and Punta Cana, Do- minican Republic

Entity-based knowledge conflicts in question answering. InProceedings of the 2021 Conference on Empirical Methods in Natural Language Process- ing, pages 7052–7063, Online and Punta Cana, Do- minican Republic. Association for Computational Linguistics. Chaitanya Malaviya, Sudeep Bhatia, and Mark Yatskar

work page 2021

[7] [7]

2 OLMo 2 Furious

Cascading biases: Investigating the effect of heuristic annotation strategies on data and models. InProceedings of the 2022 Conference on Empiri- cal Methods in Natural Language Processing, pages 6525–6540, Abu Dhabi, United Arab Emirates. As- sociation for Computational Linguistics. Simon Malberg, Roman Poletukhin, Carolin M. Schus- ter, and Georg Groh. ...

work page internal anchor Pith review Pith/arXiv arXiv 2022

[8] [8]

Qwen2.5 Technical Report

Not all contexts are equal: Teaching LLMs credibility-aware generation. InProceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 19844–19863, Miami, Florida, USA. Association for Computational Lin- guistics. Gordon Pennycook, Tyrone D Cannon, and David G Rand. 2018. Prior exposure increases perceived accu- racy of fa...

work page internal anchor Pith review Pith/arXiv arXiv 2024

[9] [9]

InThe Thirty-eight Conference on Neural Information Processing Systems Datasets and Bench- marks Track

$\texttt{ConflictBank}$: A benchmark for evaluating the influence of knowledge conflicts in LLMs. InThe Thirty-eight Conference on Neural Information Processing Systems Datasets and Bench- marks Track. Arjun Subramonian, Vagrant Gautam, Preethi Seshadri, Dietrich Klakow, Kai-Wei Chang, and Yizhou Sun

work page

[10] [10]

Let me speak freely? a study on the impact of format restrictions on performance of large language models.arXiv preprint arXiv:2408.02442, 2024

Agree to disagree? a meta-evaluation of LLM misgendering. InSecond Conference on Language Modeling. Zhi Rui Tam, Cheng-Kuang Wu, Yi-Lin Tsai, Chieh- Yen Lin, Hung-yi Lee, and Yun-Nung Chen. 2024. Let me speak freely? a study on the impact of format restrictions on performance of large language models. arXiv preprint arXiv:2408.02442. Hexiang Tan, Fei Sun,...

work page arXiv 2024

[11] [11]

14 A Creation of Conflict Pairs In Table 1, we show examples of four counterfactually-created alternative values for dif- ferent entity types and attributes

Judging llm-as-a-judge with mt-bench and chatbot arena.Advances in neural information pro- cessing systems, 36:46595–46623. 14 A Creation of Conflict Pairs In Table 1, we show examples of four counterfactually-created alternative values for dif- ferent entity types and attributes. In the following subsections, we describe three different methods of creati...

work page

[12] [12]

Rescalingfor numerical attributes (Appendix A.1)

work page

[13] [13]

Samplingfor categorical attributes with a small number of possible values (Appendix A.2)

work page

[14] [14]

| | Weight | 135 lbs | | Eye Color | Brown | | Hair Color | Black | | Marital Status | Single | | Political Affiliation | Independent | \

Generationfor categorical attributes with a large number of possible values (Appendix A.3) A.1 Rescaling We automatically adjust the values of numerical attributes that are not dates (such asbudget) by up to ±20%. Numbers with five digits or more are rounded to the third most significant decimal place to preserve a consistent level of precision. We scale ...

work page 2025