PPT-Bench measures how LLMs change answers under epistemic, value, authority, and identity pressures at baseline, single-turn, and multi-turn levels, finding separable inconsistency patterns across five models.
ISBN TODO
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
fields
cs.CL 2years
2026 2verdicts
UNVERDICTED 2representative citing papers
LLMs perform substantially better as pragmatic listeners judging language than as speakers generating it, revealing weak alignment between the two roles.
citing papers explorer
-
Beyond Social Pressure: Benchmarking Epistemic Attack in Large Language Models
PPT-Bench measures how LLMs change answers under epistemic, value, authority, and identity pressures at baseline, single-turn, and multi-turn levels, finding separable inconsistency patterns across five models.
-
How Hypocritical Is Your LLM judge? Listener-Speaker Asymmetries in the Pragmatic Competence of Large Language Models
LLMs perform substantially better as pragmatic listeners judging language than as speakers generating it, revealing weak alignment between the two roles.