Cross-Validation Equilibrium
Pith reviewed 2026-06-27 07:23 UTC · model grok-4.3
The pith
When players delegate predictions to ML agents, Cross-Validation Equilibrium requires each agent to pick the model that minimizes expected out-of-sample squared error on data drawn from equilibrium play itself.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In Cross-Validation Equilibrium, each player's ML agent selects a predictive model to minimize expected out-of-sample squared error given its realized training sample, and each player best-replies to the belief generated by the model her ML agent selected. The training sample is drawn from the outcome distribution generated by players' ML-guided behavior in equilibrium. The paper analyzes this equilibrium concept, relates it to other solution concepts, and shows that endogenous model selection can produce multiple equilibria in applications such as team-effort games with linear-quadratic payoffs.
What carries the argument
Cross-Validation Equilibrium (CVE), the fixed point in which ML agents perform model selection by minimizing expected out-of-sample squared error on samples drawn from the equilibrium outcome distribution and players optimize against the resulting beliefs.
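One way to make this fixed point concrete is to write it in symbols. The notation below is an assumption introduced here for illustration, since the summary does not reproduce the paper's own symbols: \sigma is a strategy profile, p_\sigma is the joint distribution over player i's type t_i and outcome y that \sigma induces, S_i is a training sample of n draws from p_\sigma, \mathcal{M}_i is the agent's model class, and \hat f_m^{S_i} is the predictor obtained by fitting model m to S_i. Writing the payoff as a function of the action and a point prediction is a further simplification.

```latex
\begin{align*}
S_i &\sim p_\sigma^{\,n}
  && \text{(training sample drawn from equilibrium play)} \\
m_i^*(S_i) &\in \arg\min_{m \in \mathcal{M}_i}
  \mathbb{E}_{(t_i, y) \sim p_\sigma}\!\left[\bigl(y - \hat f_m^{S_i}(t_i)\bigr)^2\right]
  && \text{(model selection given the realized sample)} \\
\sigma_i(t_i, S_i) &\in \arg\max_{a_i}\;
  u_i\!\left(a_i,\ \hat f_{m_i^*(S_i)}^{S_i}(t_i)\right)
  && \text{(best reply to the selected model's belief)}
\end{align*}
```

In this sketch, a profile is a CVE when the distribution p_\sigma used in the first two lines is the one generated by the strategies defined in the third line.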
If this is right
- In a team-effort game with linear-quadratic payoffs, endogenous model selection can give rise to multiple equilibria.
- CVE can be applied directly to jury voting and speculative betting to derive equilibrium predictions.
- The concept relates to standard Bayesian Nash equilibrium and other solution concepts that incorporate belief formation.
- Model selection by ML agents on equilibrium-generated data can change the set of stable outcomes relative to exogenous-data settings.
Where Pith is reading between the lines
- The same logic could be used to study repeated interactions in which the training sample grows with observed play.
- Regulators might influence equilibrium selection by restricting the class of models available to the ML agents.
- Experimental designs that vary the amount of feedback players receive could test whether beliefs converge to the CVE prediction.
Load-bearing premise
The training sample for each ML agent is drawn from the outcome distribution generated by players' ML-guided behavior in equilibrium.
What would settle it
Collect data on actions and reported beliefs in a laboratory team-effort game, then check whether subjects' beliefs match the predictions of the model that would have been selected by minimizing out-of-sample squared error on a fresh draw from the observed action distribution.
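The model-selection step of such a test could be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's procedure: the data are synthetic (type, outcome) pairs rather than laboratory observations, the candidate class contains only a constant and a linear predictor, and k-fold cross-validation stands in as an estimate of expected out-of-sample squared error. All function names are hypothetical.

```python
import random

def fit_constant(train):
    """Predict the sample mean of the outcome, ignoring the type."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda t: mean_y

def fit_linear(train):
    """Ordinary least squares of the outcome on the type."""
    n = len(train)
    mean_t = sum(t for t, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in train)
    if var_t == 0:  # no variation in types: fall back to the mean
        return lambda t: mean_y
    slope = sum((t - mean_t) * (y - mean_y) for t, y in train) / var_t
    return lambda t: mean_y + slope * (t - mean_t)

# Assumed candidate class; the paper's model class may differ.
MODELS = {"constant": fit_constant, "linear": fit_linear}

def cv_error(fit, data, k=5):
    """Mean squared prediction error over k held-out folds."""
    folds = [data[i::k] for i in range(k)]
    total = 0.0
    for i, held_out in enumerate(folds):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        predict = fit(train)
        total += sum((y - predict(t)) ** 2 for t, y in held_out)
    return total / len(data)

def select_model(data, k=5):
    """Return the name of the model with the lowest CV error, plus all errors."""
    errors = {name: cv_error(fit, data, k) for name, fit in MODELS.items()}
    return min(errors, key=errors.get), errors

# Synthetic stand-in for observed play: the outcome depends strongly on the type.
rng = random.Random(0)
types = [rng.random() for _ in range(200)]
data = [(t, 2.0 * t + rng.gauss(0.0, 0.1)) for t in types]
chosen, errors = select_model(data)
```

With real data, the predictions of the model named by `chosen` would then be compared against subjects' reported beliefs.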
Original abstract
We study strategic interaction when players delegate belief formation to predictive machine learning (ML). In a static Bayesian game, each player's ML agent predicts a payoff-relevant outcome variable as a function of the player's type. The ML agent's training sample is endogenous: it is drawn from the outcome distribution generated by players' ML-guided behavior. In Cross-Validation Equilibrium (CVE), each player's ML agent selects a predictive model to minimize expected out-of-sample squared error, given its realized training sample, and each player best-replies to the belief generated by the model her ML agent selected. We analyze CVE and relate it to other equilibrium concepts. We apply CVE to jury voting, speculative betting, and games with linear-quadratic payoffs. E.g., in a team-effort game, endogenous model selection can give rise to multiple equilibria.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces Cross-Validation Equilibrium (CVE) for static Bayesian games in which each player delegates belief formation to an ML agent that selects a predictive model minimizing expected out-of-sample squared error on a training sample drawn from the equilibrium outcome distribution induced by all players' ML-guided actions. Players then best-reply to the beliefs generated by their selected models. The paper analyzes CVE, relates it to existing equilibrium concepts, and applies it to jury voting, speculative betting, and linear-quadratic payoff games, with the team-effort example illustrating that endogenous model selection can produce multiple equilibria.
Significance. If a fixed point for the model-selection mapping can be shown to exist under stated conditions, CVE supplies a new equilibrium notion that endogenizes both behavior and the ML models used to form beliefs, with the endogenous data-generating process (DGP) creating a non-standard consistency requirement. The applications demonstrate that this can generate multiplicity even in simple games, which is a concrete contribution. The manuscript earns credit for explicitly linking ML cross-validation to strategic best-reply and for working through three distinct applications rather than remaining purely abstract.
major comments (2)
- [Definition of CVE] (abstract and opening sections): The fixed-point requirement is stated (the model that minimizes out-of-sample error on the equilibrium-induced training sample must itself induce that same equilibrium distribution), but no general existence theorem, continuity conditions on the model class, or compactness argument is supplied. This is load-bearing: without such a result, the set of CVE profiles may be empty for many games, which undermines the claim that CVE is a well-defined equilibrium concept to be analyzed and applied.
- [Team-effort game] Team-effort game application: the manuscript asserts that endogenous model selection gives rise to multiple equilibria, yet provides no explicit verification that the selected models are indeed optimal given the training samples generated by the claimed equilibrium strategies. Without this check the multiplicity claim rests on an unverified fixed point.
minor comments (2)
- Notation for the training-sample distribution and the out-of-sample error functional should be introduced with explicit symbols rather than described only in prose, to allow readers to track the dependence on the endogenous DGP.
- The relation of CVE to existing concepts (e.g., rational expectations, self-confirming equilibrium) is mentioned but would benefit from a short table or paragraph contrasting the information and consistency requirements.
Simulated Author's Rebuttal
We thank the referee for the thoughtful and constructive report. The comments identify two important points where the manuscript can be strengthened. We respond to each below and outline the revisions we will make.
- Referee: [Definition of CVE] (abstract and opening sections): The fixed-point requirement is stated (the model that minimizes out-of-sample error on the equilibrium-induced training sample must itself induce that same equilibrium distribution), but no general existence theorem, continuity conditions on the model class, or compactness argument is supplied. This is load-bearing: without such a result, the set of CVE profiles may be empty for many games, which undermines the claim that CVE is a well-defined equilibrium concept to be analyzed and applied.
Authors: We agree that the manuscript would benefit from an explicit existence result. In the revision we will add a new proposition establishing existence of CVE when the type space, action space, and model class are all finite. The argument proceeds by noting that the mapping from strategy profiles to the induced training-sample distribution is continuous, that the cross-validation objective is continuous in the model parameters for any fixed sample, and that the finite model class therefore admits a best-reply fixed point by standard arguments on a finite set. We will also state the compactness and continuity conditions required for this result and note that they are satisfied in all three applications. This directly addresses the concern that CVE profiles might be empty in general games.
Revision: yes
- Referee: [Team-effort game] Team-effort game application: the manuscript asserts that endogenous model selection gives rise to multiple equilibria, yet provides no explicit verification that the selected models are indeed optimal given the training samples generated by the claimed equilibrium strategies. Without this check the multiplicity claim rests on an unverified fixed point.
Authors: The referee is correct that the current draft does not display the explicit verification. In the revised version we will add a short appendix subsection that computes, for each claimed equilibrium strategy profile, the training sample it induces, evaluates the out-of-sample squared-error objective for every model in the admissible class, and confirms that the model selected by each player is indeed a minimizer. These calculations will be reported both for the symmetric high-effort equilibrium and for the asymmetric equilibria that generate multiplicity, thereby confirming that the fixed-point property holds.
Revision: yes
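The kind of verification described in this exchange can be illustrated with a stylized toy. It is an assumption-laden sketch, not the paper's team-effort game: two players share a binary type t in {0, 1}, player i's outcome is y = b*t + partner's effort + noise, each player's effort is c times the predicted outcome (0 < c < 1), and the admissible class holds a "coarse" model (one pooled mean) and a "fine" model (one mean per type). Errors are computed in expectation over a balanced sample with n/2 draws per type, which departs from selection on a realized sample, and the partner's prediction gap is treated at the population level. All names and parameter values are hypothetical.

```python
def induced_gap(partner_model, b, c):
    """Gap D = E[y | t=1] - E[y | t=0] when the partner uses partner_model."""
    # Coarse partner: effort does not vary with type, so only the direct effect b remains.
    # Fine partner: effort tracks the conditional mean, so D = b + c * D, i.e. D = b / (1 - c).
    return b if partner_model == "coarse" else b / (1.0 - c)

def expected_oos_error(model, gap, noise_var, n):
    """Expected out-of-sample squared error under a balanced sample of size n."""
    if model == "fine":
        # Unbiased; each type's mean is estimated from n/2 draws (variance 2 * noise_var / n).
        return noise_var + 2.0 * noise_var / n
    # Coarse: squared bias gap^2 / 4 at either type, pooled-mean variance noise_var / n.
    return noise_var + gap ** 2 / 4.0 + noise_var / n

def selected_model(gap, noise_var, n):
    """Model in the admissible class with the lowest expected error."""
    return min(("coarse", "fine"),
               key=lambda m: expected_oos_error(m, gap, noise_var, n))

def is_self_consistent(model, b, c, noise_var, n):
    """True if `model` is selected on data generated when both players use it."""
    return selected_model(induced_gap(model, b, c), noise_var, n) == model

# Hypothetical parameter values chosen so that both symmetric profiles pass the check.
params = dict(b=1.0, c=0.5, noise_var=10.0, n=20)
consistent = [m for m in ("coarse", "fine") if is_self_consistent(m, **params)]
```

Under these values the coarse profile induces a gap of 1 and the fine profile a gap of 2, and each model is the minimizer on the data its own profile generates, so both symmetric profiles survive the check. This mirrors the multiplicity logic only in spirit; the paper's asymmetric equilibria are not represented here.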
Circularity Check
No significant circularity detected
The paper introduces CVE as an equilibrium concept defined by mutual consistency between ML model selection (minimizing out-of-sample error on an endogenous training sample) and players' best responses. This fixed-point structure is the explicit definition of the equilibrium notion itself, not a derived prediction that reduces to its inputs by construction. No equations, fitted parameters, or self-citations are exhibited in the provided text that would trigger any of the enumerated circularity patterns. Analysis of specific games (jury voting, team effort) proceeds by solving the resulting fixed point rather than assuming the result tautologically. The derivation is therefore self-contained as a new equilibrium definition.