Benchmarking prompt sensitivity in large language models

Aryan Razavi, Aref Jafari, Alona Fyshe, Gholamreza Haffari · 2025 · arXiv 2502.06065

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

representative citing papers

Stop Drawing Scientific Claims from LLM Social Simulations Without Robustness Audits

physics.soc-ph · 2026-05-17 · accept · novelty 6.0

Minor perturbations in persona format, instruction framing, and network structure shift cooperation by up to 76 percentage points and polarization metrics consistently, showing that LLM social simulations require per-claim robustness audits via the new TRAILS taxonomy.

Coordinates of Capability: A Unified MTMM-Geometric Framework for LLM Evaluation

cs.CL · 2026-05-08 · unverdicted · novelty 6.0

A new MTMM-geometric framework unifies LLM evaluation metrics into three latent dimensions to separate method variance from true capabilities.

Understanding the Mechanism of Altruism in Large Language Models

econ.GN · 2026-04-21 · unverdicted · novelty 6.0

A small set of sparse autoencoder features in LLMs drives shifts between generous and selfish allocations in dictator games, with causal patching and steering confirming their role and generalization to other social games.

citing papers explorer

Showing 1 of 1 citing paper after filters.

Understanding the Mechanism of Altruism in Large Language Models econ.GN · 2026-04-21 · unverdicted · none · ref 192
A small set of sparse autoencoder features in LLMs drives shifts between generous and selfish allocations in dictator games, with causal patching and steering confirming their role and generalization to other social games.

Benchmarking prompt sensitivity in large language models

fields

years

verdicts

representative citing papers

citing papers explorer