Cross-Validation Equilibrium

Ran Spiegler; Stephan Waizmann

arxiv: 2606.12571 · v1 · pith:WQVPXAPPnew · submitted 2026-06-10 · 💰 econ.TH

Cross-Validation Equilibrium

Ran Spiegler , Stephan Waizmann This is my paper

Pith reviewed 2026-06-27 07:23 UTC · model grok-4.3

classification 💰 econ.TH

keywords cross-validation equilibriummachine learning in gamesendogenous databelief formationBayesian gamesmultiple equilibriateam effort gamejury voting

0 comments

The pith

When players delegate predictions to ML agents, Cross-Validation Equilibrium requires each agent to pick the model that minimizes expected out-of-sample squared error on data drawn from equilibrium play itself.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper defines Cross-Validation Equilibrium for static Bayesian games in which each player relies on an ML agent to form beliefs about a payoff-relevant outcome. The agent's training sample comes from the distribution of outcomes that arise when all players follow the beliefs produced by their own agents. The agent chooses the predictive model that minimizes expected squared error on held-out data from this endogenous sample, and the player then best-responds to the belief the chosen model generates. The concept is applied to jury voting, speculative betting, and linear-quadratic games, where the endogenous data-generating process can support multiple equilibria.

Core claim

In Cross-Validation Equilibrium, each player's ML agent selects a predictive model to minimize expected out-of-sample squared error given its realized training sample, and each player best-replies to the belief generated by the model her ML agent selected. The training sample is drawn from the outcome distribution generated by players' ML-guided behavior in equilibrium. The paper analyzes this equilibrium concept, relates it to other solution concepts, and shows that endogenous model selection can produce multiple equilibria in applications such as team-effort games with linear-quadratic payoffs.

What carries the argument

Cross-Validation Equilibrium (CVE), the fixed point in which ML agents perform model selection by minimizing expected out-of-sample squared error on samples drawn from the equilibrium outcome distribution and players optimize against the resulting beliefs.

If this is right

In a team-effort game with linear-quadratic payoffs, endogenous model selection can give rise to multiple equilibria.
CVE can be applied directly to jury voting and speculative betting to derive equilibrium predictions.
The concept relates to standard Bayesian Nash equilibrium and other solution concepts that incorporate belief formation.
Model selection by ML agents on equilibrium-generated data can change the set of stable outcomes relative to exogenous-data settings.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same logic could be used to study repeated interactions in which the training sample grows with observed play.
Regulators might influence equilibrium selection by restricting the class of models available to the ML agents.
Experimental designs that vary the amount of feedback players receive could test whether beliefs converge to the CVE prediction.

Load-bearing premise

The training sample for each ML agent is drawn from the outcome distribution generated by players' ML-guided behavior in equilibrium.

What would settle it

Collect data on actions and reported beliefs in a laboratory team-effort game, then check whether subjects' beliefs match the predictions of the model that would have been selected by minimizing out-of-sample squared error on a fresh draw from the observed action distribution.

Figures

Figures reproduced from arXiv: 2606.12571 by Ran Spiegler, Stephan Waizmann.

**Figure 1.** Figure 1: This figure plots f(λ) = 2Φ δ(θ2−θ1) √ 2σε[1−λ(1−δ)] − λ − 1 for different values of δ and ∆ = θ √2−θ1 2σε . A root of f(λ) corresponds to a solution to Equation (16). Note that there are three roots in (0, 1) for each parameter specification. state is approximately (θ1 + θ2) 2/8. Compare this with the Nash equilibrium payoff (θ1 + θ2) 2/4 in the same limit. In this sense, CVE induces a substantial devi… view at source ↗

read the original abstract

We study strategic interaction when players delegate belief formation to predictive machine learning (ML). In a static Bayesian game, each player's ML agent predicts a payoff-relevant outcome variable as a function of the player's type. The ML agent's training sample is endogenous: it is drawn from the outcome distribution generated by players' ML-guided behavior. In Cross-Validation Equilibrium (CVE), each player's ML agent selects a predictive model to minimize expected out-of-sample squared error, given its realized training sample, and each player best-replies to the belief generated by the model her ML agent selected. We analyze CVE and relate it to other equilibrium concepts. We apply CVE to jury voting, speculative betting, and games with linear-quadratic payoffs. E.g., in a team-effort game, endogenous model selection can give rise to multiple equilibria.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

CVE introduces a distinct equilibrium concept with endogenous ML model selection, but existence of the required fixed point is not guaranteed in general.

read the letter

The main takeaway is that this paper defines Cross-Validation Equilibrium to capture players delegating belief formation to ML agents whose training data comes from equilibrium play itself. Each agent picks a model minimizing expected out-of-sample squared error on that endogenous sample, and players then best-reply to the resulting beliefs.

What is new is the explicit combination of cross-validation model selection with a self-consistent data-generating process inside a Bayesian game. The paper relates CVE to existing equilibrium concepts and works through applications in jury voting, speculative betting, and linear-quadratic payoff games. The team-effort example is the clearest illustration: endogenous model choice produces multiple equilibria that would not arise under standard assumptions.

The setup is handled cleanly at the conceptual level and the examples show how the mechanism can generate interesting multiplicity. That part is useful for anyone modeling AI-assisted strategic settings.

The soft spot is existence. The definition imposes a fixed-point requirement: the distribution induced by behavior under the selected models must make those same models optimal for out-of-sample error. The abstract notes that CVE is analyzed and that multiple equilibria arise in one game, but supplies no general existence argument or restrictions on the model class that would ensure the fixed point for arbitrary games. Without those conditions the concept risks being empty in many environments, and the circularity between data and behavior is not resolved by additional theorems.

This paper is for game theorists interested in bounded rationality or learning with modern statistical tools. A reader working on information design or AI in economics would get concrete examples to think with. It deserves a serious referee because the core definition is distinct and the applications demonstrate scope, even though existence and scope conditions will need strengthening.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces Cross-Validation Equilibrium (CVE) for static Bayesian games in which each player delegates belief formation to an ML agent that selects a predictive model minimizing expected out-of-sample squared error on a training sample drawn from the equilibrium outcome distribution induced by all players' ML-guided actions. Players then best-reply to the beliefs generated by their selected models. The paper analyzes CVE, relates it to existing equilibrium concepts, and applies it to jury voting, speculative betting, and linear-quadratic payoff games, with the team-effort example illustrating that endogenous model selection can produce multiple equilibria.

Significance. If a fixed point for the model-selection mapping can be shown to exist under stated conditions, CVE supplies a new equilibrium notion that endogenizes both behavior and the ML models used to form beliefs, with the endogenous DGP creating a non-standard consistency requirement. The applications demonstrate that this can generate multiplicity even in simple games, which is a concrete contribution. The manuscript earns credit for explicitly linking ML cross-validation to strategic best-reply and for working through three distinct applications rather than remaining purely abstract.

major comments (2)

[Definition of CVE] Definition of CVE (abstract and opening sections): the fixed-point requirement—that the model minimizing out-of-sample error on the equilibrium-induced training sample must itself induce that same equilibrium distribution—is stated but no general existence theorem, continuity conditions on the model class, or compactness argument is supplied. This is load-bearing because without it the set of CVE profiles may be empty for many games, undermining the claim that CVE is a well-defined equilibrium concept to be analyzed and applied.
[Team-effort game] Team-effort game application: the manuscript asserts that endogenous model selection gives rise to multiple equilibria, yet provides no explicit verification that the selected models are indeed optimal given the training samples generated by the claimed equilibrium strategies. Without this check the multiplicity claim rests on an unverified fixed point.

minor comments (2)

Notation for the training-sample distribution and the out-of-sample error functional should be introduced with explicit symbols rather than described only in prose, to allow readers to track the dependence on the endogenous DGP.
The relation of CVE to existing concepts (e.g., rational expectations, self-confirming equilibrium) is mentioned but would benefit from a short table or paragraph contrasting the information and consistency requirements.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive report. The comments identify two important points where the manuscript can be strengthened. We respond to each below and outline the revisions we will make.

read point-by-point responses

Referee: [Definition of CVE] Definition of CVE (abstract and opening sections): the fixed-point requirement—that the model minimizing out-of-sample error on the equilibrium-induced training sample must itself induce that same equilibrium distribution—is stated but no general existence theorem, continuity conditions on the model class, or compactness argument is supplied. This is load-bearing because without it the set of CVE profiles may be empty for many games, undermining the claim that CVE is a well-defined equilibrium concept to be analyzed and applied.

Authors: We agree that the manuscript would benefit from an explicit existence result. In the revision we will add a new proposition establishing existence of CVE when the type space, action space, and model class are all finite. The argument proceeds by noting that the mapping from strategy profiles to the induced training-sample distribution is continuous, that the cross-validation objective is continuous in the model parameters for any fixed sample, and that the finite model class therefore admits a best-reply fixed point by standard arguments on a finite set. We will also state the compactness and continuity conditions required for this result and note that they are satisfied in all three applications. This directly addresses the concern that CVE profiles might be empty in general games. revision: yes
Referee: [Team-effort game] Team-effort game application: the manuscript asserts that endogenous model selection gives rise to multiple equilibria, yet provides no explicit verification that the selected models are indeed optimal given the training samples generated by the claimed equilibrium strategies. Without this check the multiplicity claim rests on an unverified fixed point.

Authors: The referee is correct that the current draft does not display the explicit verification. In the revised version we will add a short appendix subsection that computes, for each claimed equilibrium strategy profile, the training sample it induces, evaluates the out-of-sample squared-error objective for every model in the admissible class, and confirms that the model selected by each player is indeed a minimizer. These calculations will be reported both for the symmetric high-effort equilibrium and for the asymmetric equilibria that generate multiplicity, thereby confirming that the fixed-point property holds. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper introduces CVE as an equilibrium concept defined by mutual consistency between ML model selection (minimizing out-of-sample error on an endogenous training sample) and players' best responses. This fixed-point structure is the explicit definition of the equilibrium notion itself, not a derived prediction that reduces to its inputs by construction. No equations, fitted parameters, or self-citations are exhibited in the provided text that would trigger any of the enumerated circularity patterns. Analysis of specific games (jury voting, team effort) proceeds by solving the resulting fixed point rather than assuming the result tautologically. The derivation is therefore self-contained as a new equilibrium definition.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available, so no specific free parameters, axioms, or invented entities can be identified from the provided text.

pith-pipeline@v0.9.1-grok · 5652 in / 1165 out tokens · 21275 ms · 2026-06-27T07:23:13.750258+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

22 extracted references

[1]

The impact of artificial intel- ligence design on pricing

Asker, John, Chaim Fershtman, and Ariel Pakes (2024). “The impact of artificial intel- ligence design on pricing”. In:Journal of Economics & Management Strategy33.2, pp. 276–304

2024
[2]

Adaptive Algorithms and Collusion via Coupling

Banchio, Martino and Giacomo Mantegazza (2024). “Adaptive Algorithms and Collusion via Coupling.” In:working paper

2024
[3]

Artificial intelligence, algorithmic pricing, and collusion

Calvano, Emilio et al. (2020). “Artificial intelligence, algorithmic pricing, and collusion”. In:American Economic Review110.10, pp. 3267–3297

2020
[4]

Algorithmic Collusion and a Folk Theorem from Learning with Bounded Rationality

Cartea, Alvaro et al. (2025). “Algorithmic Collusion and a Folk Theorem from Learning with Bounded Rationality”. In:Games and Economic Behavior

2025
[5]

Proxy variables and feedback effects in decision making

Clyde, Alexander (2025). “Proxy variables and feedback effects in decision making”. In: Games and Economic Behavior. 31

2025
[6]

A Representative-Sampling Model of Stochas- tic Choice

Danenberg, Tuval and Ran Spiegler (2025). “A Representative-Sampling Model of Stochas- tic Choice”. In:JPE: Micro (forthcoming)

2025
[7]

Reinforcement Learning in a Prisoner’s Dilemma

Dolgopolov, Arthur (2024). “Reinforcement Learning in a Prisoner’s Dilemma”. In: Games and Economic Behavior144, pp. 84–103

2024
[8]

The model selection curse

Eliaz, Kfir and Ran Spiegler (2019). “The model selection curse”. In:American Economic Review: Insights1.2, pp. 127–140

2019
[9]

Behavioral equilibrium in economies with adverse selection

Esponda, Ignacio (2008). “Behavioral equilibrium in economies with adverse selection”. In:American Economic Review98.4, pp. 1269–1291

2008
[10]

Berk–Nash equilibrium: A framework for modeling agents with misspecified models

Esponda, Ignacio and Demian Pouzo (2016). “Berk–Nash equilibrium: A framework for modeling agents with misspecified models”. In:Econometrica84.3, pp. 1093–1130

2016
[11]

Cursed equilibrium

Eyster, Erik and Matthew Rabin (2005). “Cursed equilibrium”. In:Econometrica73.5, pp. 1623–1672

2005
[12]

Convicting the innocent: The inferiority of unanimous jury verdicts under strategic voting

Feddersen, Timothy and Wolfgang Pesendorfer (1998). “Convicting the innocent: The inferiority of unanimous jury verdicts under strategic voting”. In:American Political Science Review92.1, pp. 23–35

1998
[13]

Algorithmic collusion: Supra-competitive prices via independent algorithms

Hansen, Karsten T, Kanishka Misra, and Mallesh M Pai (2021). “Algorithmic collusion: Supra-competitive prices via independent algorithms”. In:Marketing Science40.1, pp. 1–12

2021
[14]

Springer

Hastie, Trevor, Robert Tibshirani, and Jerome Friedman (2009).The Elements of Sta- tistical Learning: Data Mining, Inference, and Prediction. Springer

2009
[15]

Analogy-based expectation equilibrium

Jehiel, Philippe (2005). “Analogy-based expectation equilibrium”. In:Journal of Eco- nomic Theory123.2, pp. 81–104

2005
[16]

Categorization in Games: A Bias-Variance Perspective

Jehiel, Philippe and Erik Mohlin (2026). “Categorization in Games: A Bias-Variance Perspective”. In:mimeo

2026
[17]

Endogenous clustering and analogy-based expectation equilibrium

Jehiel, Philippe and Giacomo Weber (2025). “Endogenous clustering and analogy-based expectation equilibrium”. In:Review of Economic Studies, rdaf054

2025
[18]

Games with procedurally rational players

Osborne, Martin J and Ariel Rubinstein (1998). “Games with procedurally rational players”. In:American Economic Review, pp. 834–847

1998
[19]

Reinforcement Learning and Collusion

Possnig, Clemens (2024). “Reinforcement Learning and Collusion”. In:mimeo

2024
[20]

Statistical inference in games

Salant, Yuval and Josh Cherry (2020). “Statistical inference in games”. In:Econometrica 88.4, pp. 1725–1752

2020
[21]

Bayesian networks and boundedly rational expectations

Spiegler, Ran (2016). “Bayesian networks and boundedly rational expectations”. In:The Quarterly Journal of Economics131.3, pp. 1243–1290. — (2026). “Machine-Learning to Trust”. In:working paper. 32

2016
[22]

AI in Action: Algorithmic Learning with Strategic Con- sumers

Waizmann, Stephan (2025). “AI in Action: Algorithmic Learning with Strategic Con- sumers”. In:mimeo. 33

2025

[1] [1]

The impact of artificial intel- ligence design on pricing

Asker, John, Chaim Fershtman, and Ariel Pakes (2024). “The impact of artificial intel- ligence design on pricing”. In:Journal of Economics & Management Strategy33.2, pp. 276–304

2024

[2] [2]

Adaptive Algorithms and Collusion via Coupling

Banchio, Martino and Giacomo Mantegazza (2024). “Adaptive Algorithms and Collusion via Coupling.” In:working paper

2024

[3] [3]

Artificial intelligence, algorithmic pricing, and collusion

Calvano, Emilio et al. (2020). “Artificial intelligence, algorithmic pricing, and collusion”. In:American Economic Review110.10, pp. 3267–3297

2020

[4] [4]

Algorithmic Collusion and a Folk Theorem from Learning with Bounded Rationality

Cartea, Alvaro et al. (2025). “Algorithmic Collusion and a Folk Theorem from Learning with Bounded Rationality”. In:Games and Economic Behavior

2025

[5] [5]

Proxy variables and feedback effects in decision making

Clyde, Alexander (2025). “Proxy variables and feedback effects in decision making”. In: Games and Economic Behavior. 31

2025

[6] [6]

A Representative-Sampling Model of Stochas- tic Choice

Danenberg, Tuval and Ran Spiegler (2025). “A Representative-Sampling Model of Stochas- tic Choice”. In:JPE: Micro (forthcoming)

2025

[7] [7]

Reinforcement Learning in a Prisoner’s Dilemma

Dolgopolov, Arthur (2024). “Reinforcement Learning in a Prisoner’s Dilemma”. In: Games and Economic Behavior144, pp. 84–103

2024

[8] [8]

The model selection curse

Eliaz, Kfir and Ran Spiegler (2019). “The model selection curse”. In:American Economic Review: Insights1.2, pp. 127–140

2019

[9] [9]

Behavioral equilibrium in economies with adverse selection

Esponda, Ignacio (2008). “Behavioral equilibrium in economies with adverse selection”. In:American Economic Review98.4, pp. 1269–1291

2008

[10] [10]

Berk–Nash equilibrium: A framework for modeling agents with misspecified models

Esponda, Ignacio and Demian Pouzo (2016). “Berk–Nash equilibrium: A framework for modeling agents with misspecified models”. In:Econometrica84.3, pp. 1093–1130

2016

[11] [11]

Cursed equilibrium

Eyster, Erik and Matthew Rabin (2005). “Cursed equilibrium”. In:Econometrica73.5, pp. 1623–1672

2005

[12] [12]

Convicting the innocent: The inferiority of unanimous jury verdicts under strategic voting

Feddersen, Timothy and Wolfgang Pesendorfer (1998). “Convicting the innocent: The inferiority of unanimous jury verdicts under strategic voting”. In:American Political Science Review92.1, pp. 23–35

1998

[13] [13]

Algorithmic collusion: Supra-competitive prices via independent algorithms

Hansen, Karsten T, Kanishka Misra, and Mallesh M Pai (2021). “Algorithmic collusion: Supra-competitive prices via independent algorithms”. In:Marketing Science40.1, pp. 1–12

2021

[14] [14]

Springer

Hastie, Trevor, Robert Tibshirani, and Jerome Friedman (2009).The Elements of Sta- tistical Learning: Data Mining, Inference, and Prediction. Springer

2009

[15] [15]

Analogy-based expectation equilibrium

Jehiel, Philippe (2005). “Analogy-based expectation equilibrium”. In:Journal of Eco- nomic Theory123.2, pp. 81–104

2005

[16] [16]

Categorization in Games: A Bias-Variance Perspective

Jehiel, Philippe and Erik Mohlin (2026). “Categorization in Games: A Bias-Variance Perspective”. In:mimeo

2026

[17] [17]

Endogenous clustering and analogy-based expectation equilibrium

Jehiel, Philippe and Giacomo Weber (2025). “Endogenous clustering and analogy-based expectation equilibrium”. In:Review of Economic Studies, rdaf054

2025

[18] [18]

Games with procedurally rational players

Osborne, Martin J and Ariel Rubinstein (1998). “Games with procedurally rational players”. In:American Economic Review, pp. 834–847

1998

[19] [19]

Reinforcement Learning and Collusion

Possnig, Clemens (2024). “Reinforcement Learning and Collusion”. In:mimeo

2024

[20] [20]

Statistical inference in games

Salant, Yuval and Josh Cherry (2020). “Statistical inference in games”. In:Econometrica 88.4, pp. 1725–1752

2020

[21] [21]

Bayesian networks and boundedly rational expectations

Spiegler, Ran (2016). “Bayesian networks and boundedly rational expectations”. In:The Quarterly Journal of Economics131.3, pp. 1243–1290. — (2026). “Machine-Learning to Trust”. In:working paper. 32

2016

[22] [22]

AI in Action: Algorithmic Learning with Strategic Con- sumers

Waizmann, Stephan (2025). “AI in Action: Algorithmic Learning with Strategic Con- sumers”. In:mimeo. 33

2025