
Political Bias Audits of LLMs Capture Sycophancy to the Inferred Auditor

2 Pith papers cite this work. Polarity classification is still indexing.

abstract

Large language models (LLMs) are commonly evaluated for political bias based on their responses to fixed questionnaires, which typically place frontier models on the political left. A parallel literature shows that LLMs are sycophantic: they adapt their answers to the views, identities, and expectations of the user. We show that these findings are linked: standard political-bias audits partly capture sycophantic accommodation to the inferred auditor. We employ a factorial experiment across three major audit instruments (the Political Compass Test, the Pew Political Typology, and 1,540 partisan-benchmarked Pew American Trends Panel items) administered to six frontier LLMs while varying only the asker's stated identity (N = 30,990 responses). At baseline, all six models lean left. When the asker identifies as a conservative Republican, responses shift sharply: the share of items closer to Democrats falls by 28-62 percentage points, and all six models move right of center. A mirror-image progressive-Democrat cue produces little change; rightward accommodation is 8.0× larger than leftward. When asked who the default asker is, models identify an auditor, researcher, or academic; when asked what answer that asker expects, they select the Democrat-coded option 75% of the time, nearly the rate under an explicit progressive cue. These patterns are inconsistent with a purely fixed model ideology and indicate that single-prompt audits capture an interaction between model and inferred interlocutor. Political bias in LLMs is therefore not a fixed point on an ideological scale but a response profile that must be mapped across realistic interlocutors.
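The headline quantities in the abstract (the share of items closer to Democrats per asker condition, the shift in percentage points, and the rightward-to-leftward ratio) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the function names, the condition labels, the definition of the asymmetry ratio as rightward shift divided by leftward shift, and the toy counts, which are made up and are not the paper's data.

```python
from collections import defaultdict


def dem_share(responses):
    """Return the share of responses coded closer to Democrats, per condition.

    `responses` is an iterable of (condition, closer_to_dem) pairs, where
    `closer_to_dem` is a bool. This coding scheme is hypothetical.
    """
    counts = defaultdict(lambda: [0, 0])  # condition -> [dem_count, total]
    for condition, closer_to_dem in responses:
        counts[condition][0] += int(closer_to_dem)
        counts[condition][1] += 1
    return {cond: dem / total for cond, (dem, total) in counts.items()}


def accommodation(shares, baseline="baseline",
                  right="conservative", left="progressive"):
    """Return (rightward_pp, leftward_pp, ratio) from per-condition shares.

    rightward_pp: drop in Democrat-closer share under the conservative cue.
    leftward_pp:  rise in Democrat-closer share under the progressive cue.
    ratio:        rightward_pp / leftward_pp (assumed definition).
    """
    rightward_pp = (shares[baseline] - shares[right]) * 100
    leftward_pp = (shares[left] - shares[baseline]) * 100
    ratio = rightward_pp / leftward_pp if leftward_pp > 0 else float("inf")
    return rightward_pp, leftward_pp, ratio


# Toy, made-up responses: 10 items per condition.
toy = (
    [("baseline", True)] * 8 + [("baseline", False)] * 2
    + [("conservative", True)] * 3 + [("conservative", False)] * 7
    + [("progressive", True)] * 9 + [("progressive", False)] * 1
)

shares = dem_share(toy)
rightward_pp, leftward_pp, ratio = accommodation(shares)
print(shares, rightward_pp, leftward_pp, ratio)
```

Reporting the per-condition shares alongside the ratio keeps the baseline visible: a model already near the ceiling under the default prompt has little room to move left, which is one reason a single ratio should be read together with the underlying shares.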

fields

cs.CL (1) · cs.CY (1)

years

2026 (2)

verdicts

unverdicted (2)

representative citing papers

Defeat Devices in AI Systems

cs.CY · 2026-06-27 · unverdicted · novelty 6.0

The paper defines defeat devices in AI via a triadic test (discriminator, concealed swap, performance gap), unifies existing cases under this concept, proposes TADP detection, and claims such devices can emerge naturally in frontier models.

Auditing Stance Asymmetry in Generative Explanations

cs.CL · 2026-05-27 · unverdicted · novelty 6.0

Introduces Symmetry Decomposition Evaluation (SDE) to audit stable stance asymmetries in generative explanations using paired situations, role rewrites, and evidence controls on a 32-family prototype suite.
