{"paper":{"title":"Common-agency Games for Multi-Objective Test-Time Alignment","license":"http://creativecommons.org/licenses/by/4.0/","headline":"CAGE treats multiple conflicting alignment goals as strategic principals bidding token incentives to produce an equilibrium LLM policy at inference time.","cross_cats":[],"primary_cat":"cs.GT","authors_text":"Baiting Chen, Rui Yu, Tong Zhu, Xiaowu Dai","submitted_at":"2026-05-08T06:56:35Z","abstract_excerpt":"Aligning large language models (LLMs) with human preferences is inherently multi-objective: different users and evaluation criteria impose heterogeneous and often conflicting requirements on model outputs. We propose CAGE (Common-Agency Games for Alignment), a training-free, game-theoretic framework for multi-objective test-time alignment. CAGE models alignment objectives as strategic principals that allocate token-level incentives to a shared LLM, inducing an equilibrium policy that captures the joint effect of competing objectives. We develop an efficient algorithm based on equilibrium probl"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"CAGE enables flexible and fine-grained trade-offs across objectives at inference time, consistently outperforming existing test-time alignment methods while requiring no retraining. 
It further supports weak-to-strong generalization, making multi-objective alignment practical in resource-constrained settings.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That modeling heterogeneous objectives as strategic principals allocating token-level incentives produces an equilibrium policy whose joint effect meaningfully captures real user preferences, and that the EPEC-based algorithm reliably computes this equilibrium with the claimed existence, uniqueness, convergence, and stability guarantees.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"CAGE uses common-agency games and an EPEC algorithm to compute equilibrium policies that balance multiple conflicting objectives for test-time LLM alignment.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"CAGE treats multiple conflicting alignment goals as strategic principals bidding token incentives to produce an equilibrium LLM policy at inference time.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"ba3aeb72703cb6e24891c5dae8672767889a68094fa699046c184755d7700103"},"source":{"id":"2605.13875","kind":"arxiv","version":1},"verdict":{"id":"1c5a5ac2-e479-4d34-af9e-4253368271a2","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-15T06:11:14.544441Z","strongest_claim":"CAGE enables flexible and fine-grained trade-offs across objectives at inference time, consistently outperforming existing test-time alignment methods while requiring no retraining. 
It further supports weak-to-strong generalization, making multi-objective alignment practical in resource-constrained settings.","one_line_summary":"CAGE uses common-agency games and an EPEC algorithm to compute equilibrium policies that balance multiple conflicting objectives for test-time LLM alignment.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That modeling heterogeneous objectives as strategic principals allocating token-level incentives produces an equilibrium policy whose joint effect meaningfully captures real user preferences, and that the EPEC-based algorithm reliably computes this equilibrium with the claimed existence, uniqueness, convergence, and stability guarantees.","pith_extraction_headline":"CAGE treats multiple conflicting alignment goals as strategic principals bidding token incentives to produce an equilibrium LLM policy at inference time."},"references":{"count":230,"sample":[{"doi":"","year":2024,"title":"TinyLlama: An Open-Source Small Language Model","work_id":"4b2a85e5-7934-4e57-ba32-e7ac0a26d388","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"International Conference on Learning Representations","work_id":"7d66a0d4-a1e1-4771-b0cd-ea8c424cb4bd","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"The Twelfth International Conference on Learning Representations","work_id":"44d6bf99-5902-43da-b8fd-83297976c6e0","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2024,"title":"ChatDev: Communicative agents for software development","work_id":"166b34d2-9477-4b53-b493-bcce91575006","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2024,"title":"Encouraging divergent thinking in large language models through multi-agent debate","work_id":"5e657239-8ab5-4c3c-ac58-63b4c9d2e420","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":230,"snapshot_sha256":"0d6a0ec0b4c4362b2aaaab7e88e2dcbc27332505742a1851b0d20041ea2a4151","internal_anchors":19},"formal_canon":{"evidence_count":2,"snapshot_sha256":"aebd8280c17417da790c7bdeedd992a81e4524c1d32020d083c0fcbfb65b478e"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}