Reasoning isn’t enough: Examining truth-bias and sycophancy in LLMs. arXiv preprint arXiv:2506.21561
2 Pith papers cite this work.
Citing papers:
- Pressure, What Pressure? Sycophancy Disentanglement in Language Models via Reward Decomposition: A five-term decomposed reward in GRPO training reduces sycophancy across models and generalizes to unseen pressure types by targeting pressure resistance and evidence responsiveness separately.
- Evaluating Reasoning Models for Queries with Presuppositions: Reasoning models achieve only 2-11% higher accuracy than non-reasoning models on queries with false presuppositions, failing to challenge 26-42% of them and remaining sensitive to presupposition strength.