States as strings as strategies: Steering language models with game-theoretic solvers. CoRR, abs/2402.01704
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2). Verdicts: UNVERDICTED (2).
citing papers explorer
- Communicate-Predict-Act: Evaluating Social Intelligence of Agents
A new evaluation framework for LLM social intelligence finds that influence, transparency, and adaptability predict agent success in games better than theory of mind or deep planning, with metrics achieving AUC 0.82 in predicting pairwise outcomes.
- Common-agency Games for Multi-Objective Test-Time Alignment
CAGE uses common-agency games and an EPEC algorithm to compute equilibrium policies that balance multiple conflicting objectives for test-time LLM alignment.