
arxiv: 2606.23463 · v1 · pith:AHYGK2ZAnew · submitted 2026-06-22 · 💰 econ.GN · q-fin.EC

Equilibrium World Models

Pith reviewed 2026-06-26 05:52 UTC · model grok-4.3

classification 💰 econ.GN q-fin.EC
keywords equilibrium world models · deep learning solvers · dynamic stochastic models · rare disasters · rational expectations · neural network solvers · heterogeneous agents · binding constraints

The pith

Equilibrium World Models enforce exact rational-expectations conditions on ordinary, rare, stressed, and counterfactual states using a certified learned surrogate for continuations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Equilibrium World Models to globally solve dynamic stochastic models featuring rare disasters, binding constraints, and counterfactual states. Standard neural-network solvers impose equilibrium conditions only on states generated by their own simulated policy, which can yield self-confirming solutions accurate on the path but untested off it. EWMs instead generate a broader distribution of states and enforce the model's exact equilibrium conditions there, carrying continuations via a learned surrogate while certifying the policy strictly against the true conditions. The approach supplies an error decomposition, an off-path residual bound, and a convergence result that connects self-confirming solutions to rational-expectations equilibria. A reader would care because it promises reliable global solutions without repeated expensive expectation evaluations at each step.
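The gap between on-path and off-path accuracy can be made concrete in the textbook Brock–Mirman economy (log utility, full depreciation), where the exact savings rate is αβ. The sketch below is not the paper's implementation. The parameter values, the region [0.05, 0.6], and the deliberately flawed `savings_rate` policy are illustrative assumptions, and the rare-disaster regime is replaced by simple low-capital seeds. It evaluates the exact Euler residual by Gauss–Hermite quadrature, first on states the policy visits in simulation and then on stressed states it never visits.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

ALPHA, BETA = 0.36, 0.96      # capital share, discount factor (assumed values)
RHO, SIGMA = 0.9, 0.02        # assumed AR(1) law for log productivity
S_STAR = ALPHA * BETA         # closed-form Brock-Mirman savings rate

# Gauss-Hermite rule for a standard normal shock, weights normalized to sum to 1.
NODES, WEIGHTS = hermegauss(7)
WEIGHTS = WEIGHTS / WEIGHTS.sum()

def savings_rate(k):
    """Hypothetical self-confirming policy: exact on [0.05, 0.6], wrong elsewhere."""
    k = np.asarray(k, dtype=float)
    return np.where((k >= 0.05) & (k <= 0.6), S_STAR, 0.8 * S_STAR)

def euler_residual(k, z):
    """Exact residual beta * E[(c / c') * alpha * z' * k'^(alpha-1)] - 1."""
    k = np.atleast_1d(np.asarray(k, dtype=float))
    z = np.atleast_1d(np.asarray(z, dtype=float))
    y = z * k**ALPHA                       # output
    s = savings_rate(k)
    c, k_next = (1.0 - s) * y, s * y       # consumption and next-period capital
    # Next-period productivity on the quadrature nodes: shape (n_states, n_nodes).
    z_next = np.exp(RHO * np.log(z)[:, None] + SIGMA * NODES[None, :])
    kn = k_next[:, None]
    c_next = (1.0 - savings_rate(kn)) * z_next * kn**ALPHA
    integrand = (c[:, None] / c_next) * ALPHA * z_next * kn**(ALPHA - 1.0)
    return BETA * (integrand @ WEIGHTS) - 1.0

def simulate_path(T=2000, seed=0):
    """Simulate the policy's own path, starting at the deterministic steady state."""
    rng = np.random.default_rng(seed)
    k, z = np.empty(T), np.empty(T)
    k[0], z[0] = S_STAR ** (1.0 / (1.0 - ALPHA)), 1.0
    for t in range(T - 1):
        k[t + 1] = savings_rate(k[t]) * z[t] * k[t]**ALPHA
        z[t + 1] = np.exp(RHO * np.log(z[t]) + SIGMA * rng.standard_normal())
    return k, z

k_path, z_path = simulate_path()
on_path = euler_residual(k_path, z_path)

# Stressed seeds: capital far below anything the simulated path reaches.
k_stress = np.full(5, 0.01)
z_stress = np.linspace(0.7, 1.0, 5)
off_path = euler_residual(k_stress, z_stress)

print("max |residual| on path :", np.max(np.abs(on_path)))
print("min |residual| off path:", np.min(np.abs(off_path)))
```

Under these assumptions the on-path residuals vanish up to rounding, while every stressed-state residual is at least about 0.25. A solver that only ever evaluates residuals on its own simulated states has no way to see that error.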

Core claim

Equilibrium World Models enforce the model's exact equilibrium conditions on a broader, model-generated distribution of ordinary, rare, stressed, and counterfactual states. They carry the continuation with a learned surrogate, but certify the resulting policy strictly against the true equilibrium conditions. We provide an error decomposition, an off-path residual bound, and a convergence result linking self-confirming solutions to rational-expectations equilibria.

What carries the argument

Enforcement of exact equilibrium conditions on a broad model-generated state distribution, with a learned surrogate for continuation values that is certified against the true conditions.
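As a rough illustration of this division of labor, the sketch below fits a surrogate to the continuation term once on a broad sample of states, then certifies the policy on held-out states against the exact expectation. It is a minimal sketch under stated assumptions, not the paper's method. The economy is the closed-form Brock–Mirman case, a log-linear least-squares fit stands in for the neural surrogate, and the parameter values and the `draw_coverage` sampler are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

ALPHA, BETA = 0.36, 0.96          # illustrative parameter values (assumed)
RHO, SIGMA = 0.9, 0.02            # assumed AR(1) law for log productivity
S_STAR = ALPHA * BETA             # closed-form Brock-Mirman savings rate

NODES, WEIGHTS = hermegauss(7)    # quadrature rule for a standard normal shock
WEIGHTS = WEIGHTS / WEIGHTS.sum()

def consumption(k, z):
    """Policy under test: the closed-form consumption rule."""
    return (1.0 - S_STAR) * z * k**ALPHA

def exact_continuation(k_next, z):
    """Exact E[alpha * z' * k'^(alpha-1) / c(k', z') | z], by quadrature."""
    z_next = np.exp(RHO * np.log(z)[:, None] + SIGMA * NODES[None, :])
    kn = k_next[:, None]
    integrand = ALPHA * z_next * kn**(ALPHA - 1.0) / consumption(kn, z_next)
    return integrand @ WEIGHTS

def features(k_next, z):
    """Regressors of the stand-in surrogate: constant, log k', log z."""
    return np.column_stack([np.ones_like(k_next), np.log(k_next), np.log(z)])

def draw_coverage(n, rng):
    """Hypothetical coverage sample: wide ranges for capital and productivity."""
    k = np.exp(rng.uniform(np.log(0.01), np.log(0.5), n))
    z = rng.uniform(0.6, 1.2, n)
    return k, z

rng = np.random.default_rng(0)

# 1) Evaluate the expensive expectation once on coverage states, fit the surrogate.
k_tr, z_tr = draw_coverage(500, rng)
kn_tr = S_STAR * z_tr * k_tr**ALPHA
coef, *_ = np.linalg.lstsq(
    features(kn_tr, z_tr), np.log(exact_continuation(kn_tr, z_tr)), rcond=None
)

# 2) Certify on held-out coverage states against the exact expectation.
k_ho, z_ho = draw_coverage(200, rng)
kn_ho = S_STAR * z_ho * k_ho**ALPHA
c_ho = consumption(k_ho, z_ho)
surrogate_resid = BETA * c_ho * np.exp(features(kn_ho, z_ho) @ coef) - 1.0
exact_resid = BETA * c_ho * exact_continuation(kn_ho, z_ho) - 1.0

exact_max = np.max(np.abs(exact_resid))
gap_max = np.max(np.abs(surrogate_resid - exact_resid))
print(f"max exact residual {exact_max:.2e}, max surrogate gap {gap_max:.2e}")
```

The point of the design is that the surrogate is only a cheap carrier during training. The number reported at the end, `exact_max`, never touches the surrogate, so any surrogate error shows up as a separate gap (`gap_max`) rather than hiding inside the certified residual.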

If this is right

  • In a rare-disaster Brock-Mirman laboratory, coverage reduces disaster-region residuals by an order of magnitude.
  • In a high-dimensional international real-business-cycle model, EWMs converge from nearly all random starts while classical solvers fail from all.
  • When actions move transition measures, action-conditioned continuations recover the relevant policy margin.
  • In a heterogeneous-agent economy with aggregate risk, EWMs compress the numerical representation of the wealth distribution by at least 25x while imposing exact full-distribution conditions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The certification of surrogates against true conditions on expanded distributions could be adapted to verify approximate solutions in other classes of dynamic models with uncertainty.
  • Lower frequency of continuation evaluations may support faster evaluation of policy counterfactuals in large-scale economies.
  • The convergence result from self-confirming to full rational-expectations solutions suggests an iterative refinement procedure that starts from classical neural outputs.

Load-bearing premise

The learned surrogate for continuation values combined with certification against true equilibrium conditions on the broader state distribution produces policies that satisfy the model's rational-expectations equilibrium without material approximation error from the surrogate.

What would settle it

Simulating an EWM-certified policy on states outside the certified distribution and observing equilibrium residuals that exceed the stated off-path bound would falsify the claim of reliable global solutions without material surrogate error.

Figures

Figures reproduced from arXiv: 2606.23463 by Andreas Schaab, Simon Scheidegger.

Figure 1: Stylized schematic of the coverage measure (…
Figure 2: Per-iteration structure of an unsupervised residual solver (left) versus EWM (right). The …
Figure 3: The κ-homotopy as a bridge from a self-confirming equilibrium to rational expectations on the coverage axis. Each stage enlarges the imagined support μ_κ, from the ergodic path to a neighborhood, the rare regime, and post-shock cross-sections, imposing the same exact residual on strictly more of the reachable set. At every finite κ the fixed point is a coverage-confirmed fixed point on μ_κ, self-confirming i…
Figure 4: How the coverage measure μ_κ of (7) is built in Brock–Mirman, on the state (k, z) of capital and productivity, and how the continuation is amortized on it. Building μ_κ (the three-step coverage sampling of this section): (1) ergodic μ_{π_θ}, the policy's own path, obtained by simulating the exact transition Γ forward (blue), the set DEQN trains on; (2) stress, seeds drawn off the ergodic set, a low-productivity…
Figure 5: Brock–Mirman with a rare disaster: held-out exact disaster residuals by arm. Panel (a) …
Figure 6: Seed-basin certification at N=2 and N=4 (ten seeds per arm). Each point is one seed: the on-path exact residual against the disaster-region exact residual, with the 45° line shown and filled markers for seeds that pass verified stationarity. The pathwise baseline sits an order of magnitude above the diagonal and never verifies; the coverage and surrogate arms collapse onto the diagonal and verify in eight …
Figure 7: What is approximated, and where the encoder enters. The Bewley solve approximates …
Figure 8: The learned embedding keeps the decision-relevant cross-section. From the trained …
Figure 9: Network architecture and training, Brock–Mirman setting (Table …
Figure 10: Endogenous protection. Left: converged normal-regime protection by arm, with the …
Figure 11: Normal-times price of the one-period disaster Arrow claim (implied risk-neutral disaster …
Figure 12: The warm-started surrogate-capacity homotopy on the international real business cycle …
Figure 13: The world model's encoder, drawn for the heterogeneous-agent economy. The state …
Original abstract

We introduce Equilibrium World Models (EWMs), a deep-learning method for globally solving dynamic stochastic models that feature rare disasters, binding constraints, and counterfactual states. Standard unsupervised neural-network-based solvers impose equilibrium conditions only on states generated by their own simulated policy. Their solutions can therefore be self-confirming: accurate on the simulated path, but untested off it, sensitive to initialization, and costly when expectations must be recomputed at each step. EWMs change the computational representation, not the economics. They enforce the model's exact equilibrium conditions on a broader, model-generated distribution of ordinary, rare, stressed, and counterfactual states. They carry the continuation with a learned surrogate, but certify the resulting policy strictly against the true equilibrium conditions. We provide an error decomposition, an off-path residual bound, and a convergence result linking self-confirming solutions to rational-expectations equilibria. We demonstrate EWMs through a sequence of test cases that isolate the main pathologies of classical deep-learning solvers and then scale them to richer economies. In a rare-disaster Brock–Mirman laboratory, coverage reduces disaster-region residuals by an order of magnitude. In a high-dimensional international real-business-cycle model, classical deep-learning solvers fail from all random starts, whereas EWMs converge from nearly all and evaluate continuations up to two orders of magnitude less often. When actions move transition measures, EWMs use action-conditioned continuations to recover the relevant policy margin. In a heterogeneous-agent economy with aggregate risk, EWMs compress the numerical representation of the wealth distribution by at least 25x while imposing exact full-distribution rational-expectations conditions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The paper introduces Equilibrium World Models (EWMs), a deep-learning method for globally solving dynamic stochastic models featuring rare disasters, binding constraints, and counterfactual states. Unlike standard neural-network solvers that impose equilibrium conditions only on states generated by their own policy (risking self-confirming solutions), EWMs enforce the model's exact equilibrium conditions on a broader model-generated distribution of ordinary, rare, stressed, and counterfactual states. They use a learned surrogate for continuation values but certify the resulting policy strictly against true equilibrium conditions, supported by an error decomposition, an off-path residual bound, and a convergence result linking self-confirming solutions to rational-expectations equilibria. Empirical demonstrations include a rare-disaster Brock-Mirman model (order-of-magnitude residual reduction in disaster regions), a high-dimensional international RBC model (improved convergence and reduced continuation evaluations), and a heterogeneous-agent economy with aggregate risk (25x compression of wealth distribution representation while imposing exact full-distribution conditions).

Significance. If the stated guarantees and empirical results hold, EWMs would address a central limitation of unsupervised neural solvers for DSGE models by reducing sensitivity to initialization and off-path errors, enabling more reliable solutions in settings with rare events and high dimensionality. The explicit error decomposition, residual bound, and convergence result are notable strengths, as is the reproducible demonstration across isolated test cases and scaled applications. This could meaningfully advance computational methods in macroeconomics and related fields.

minor comments (2)
  1. The abstract refers to 'a sequence of test cases' and specific models (Brock-Mirman, international RBC, heterogeneous-agent); the main text should include explicit section references or table numbers for each demonstration to allow readers to locate the corresponding error metrics and convergence statistics.
  2. Notation for the surrogate continuation and the certification step should be introduced with a clear equation or definition early in the methods section to distinguish the learned component from the exact equilibrium conditions being enforced.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the detailed and positive summary of our work on Equilibrium World Models, and for recommending minor revision. The report raises no major comments, so there is nothing to address point by point at this stage. We will make minor revisions to improve clarity and presentation as appropriate.

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The paper's central approach enforces the model's exact equilibrium conditions on a broader, model-generated distribution of states (ordinary, rare, stressed, and counterfactual) while using a learned surrogate only for carrying the continuation; the final policy is certified strictly against the true equilibrium conditions via an explicit error decomposition, off-path residual bound, and convergence result that connects self-confirming solutions to rational-expectations equilibria. This structure is self-contained against external model conditions rather than reducing any load-bearing claim to a fitted parameter, self-definition, or self-citation chain. No instances of the enumerated circularity patterns appear in the provided description or abstract.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 1 invented entity

Abstract-only review yields no explicit free parameters, axioms, or invented entities beyond the method name itself; the central contribution is algorithmic rather than resting on new economic assumptions.

invented entities (1)
  • Equilibrium World Models (no independent evidence)
    purpose: Deep-learning solver that enforces equilibrium conditions on broad model-generated state distributions
    Newly proposed method whose properties are asserted in the abstract.

pith-pipeline@v0.9.1-grok · 5814 in / 1224 out tokens · 26667 ms · 2026-06-26T05:52:35.504062+00:00 · methodology

