In a Bayesian persuasion model of AI misalignment on bit strings, receiver utility under sender-optimal signaling is at most 3/2 times prior-only utility, with an additive bound for near-product priors and a 6-bit example achieving 39/31.
American Economic Review 101, 6 (October 2011), 2590–2615
9 Pith papers cite this work. Polarity classification is still indexing.
roles: background 2
polarities: background 2
citing papers explorer
- Quantifying Theoretical AI Alignment Guarantees: Receiver-Utility Bounds in Bayesian Persuasion
  In a Bayesian persuasion model of AI misalignment on bit strings, receiver utility under sender-optimal signaling is at most 3/2 times prior-only utility, with an additive bound for near-product priors and a 6-bit example achieving 39/31.
- Robustness of Persuasion to Receiver Preferences
  Continuity and robustness of Bayesian persuasion to infinitesimal Knightian uncertainty about receiver preferences are equivalent and generic.
- Forecasting and Manipulating the Forecasts of Others
  The paper introduces a noise-state recursive representation for finite-player dynamic games with dispersed private information, yielding explicit equilibrium characterizations in continuous-time LQG settings.
- Continuous-Time Information Design for Hurricane Evacuation: Disclosure, Congestion, and Optimal Phasing under Model Uncertainty
  Optimal staggered information disclosure and tiered evacuation orders reduce social costs by 89% in a calibrated hurricane evacuation model by mitigating belief-driven congestion externalities.
- Using Cognitive Models to Improve Language Model Simulation of Human Persuasion Games
  Equation-to-Behavior Prompting lets large LLMs match cognitive models like Bayesian updating in persuasion games; RL training cuts small-model belief error by 26.5% and improves diverse training outcomes by 2.5-12%.
- The Endogeneity of Miscalibration: Impossibility and Escape in Scored Reporting
  Non-affine approval functions create unavoidable miscalibration in proper scoring rules for strategic agents, but step-function thresholds enable first-best screening without it, uniquely for the Brier score.
- Calibrated Forecasting and Persuasion
  For stationary ergodic processes the set of calibration-passing forecast distributions equals the mean-preserving contractions of the conditional distribution, allowing the dynamic game to be solved via static persuasion.
- Learning with Conflicts of Interest
  A game-theoretic framework and algorithms are introduced to maximize beneficial information from ML systems while minimizing biased influences arising from conflicts of interest.