pith. sign in

arxiv: 2604.09736 · v1 · submitted 2026-04-09 · 💰 econ.EM

Training Neural Networks Embedded in Dynamic Discrete Choice Models

Pith reviewed 2026-05-10 16:50 UTC · model grok-4.3

classification 💰 econ.EM
keywords dynamic discrete choice modelsneural networksBellman's equationfixed point estimationasymptotic normalityinfinite horizonutility approximationunnested fixed point
0
0 comments X

The pith

A dual form of Bellman's equation separates utility parameters from the dynamic programming fixed point, enabling neural network approximations in infinite-horizon dynamic discrete choice models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops estimators for infinite-horizon dynamic discrete choice models whose optimization, after pre-computation, no longer requires solving or constraining large systems of linear equations. By exploiting a dual representation of Bellman's equation, the method isolates the utility parameters from the value-function fixed point. This isolation supports consistent and asymptotically normal estimation while permitting flexible neural-network approximations to the utility function. The resulting UFXP and OUFXP estimators therefore allow researchers to specify more complex, non-parametric utilities without the usual nested computational burden.

Core claim

The authors introduce the unnested fixed point (UFXP) estimator and its optimal variant (OUFXP) that rest on a dual representation of Bellman's equation. This representation cleanly separates the utility parameters from the dynamic programming fixed point for the class of infinite-horizon discrete choice models considered. After a one-time pre-computation step, the estimation problem contains neither embedded linear systems nor constraints enforcing them. The authors prove consistency and asymptotic normality for both estimators and efficiency for OUFXP, thereby justifying neural-network approximations to utility.

What carries the argument

The dual representation of Bellman's equation, which isolates utility parameters from the dynamic programming fixed point.

If this is right

  • Utility functions in infinite-horizon DDC models can be estimated non-parametrically with neural networks.
  • The UFXP estimator is consistent and asymptotically normal.
  • The OUFXP estimator is consistent, asymptotically normal, and efficient.
  • After pre-computation the estimation objective contains no large linear equation systems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The separation may extend to other dynamic models whose Bellman equations admit analogous dual forms.
  • Researchers could test whether deeper or recurrent networks improve fit in settings where state spaces are large.
  • The pre-computation step could be reused across multiple utility specifications, lowering the cost of specification searches.

Load-bearing premise

The dual representation of Bellman's equation continues to separate utility parameters from the fixed point even after neural networks approximate the utility function.

What would settle it

In a Monte Carlo experiment with a known data-generating process, the UFXP estimator fails to recover the true utility parameters at the rate and with the asymptotic distribution predicted by the paper's theorems.

Figures

Figures reproduced from arXiv: 2604.09736 by Ecenur Oguz, Robert L. Bray.

Figure 1
Figure 1. Figure 1: True and estimated holding cost functions for the 540-state model. The black lines [PITH_FULL_IMAGE:figures/full_fig_p020_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Empirical cumulative distribution functions (CDFs) of the holding cost estimation errors, [PITH_FULL_IMAGE:figures/full_fig_p026_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Ensemble of 383 estimated holding cost functions whose UFXP objectives fall within [PITH_FULL_IMAGE:figures/full_fig_p035_3.png] view at source ↗
read the original abstract

We develop the first general-purpose estimator for infinite-horizon dynamic discrete choice models whose estimation problem, after pre-computation, is unencumbered by large systems of linear equations -- either imposed as constraints, or embedded in the objective function. Our unnested fixed point (UFXP) and optimal unnested fixed point (OUFXP) estimators exploit a dual representation of Bellman's equation to separate the utility parameters from the dynamic programming fixed point. We establish the consistency and asymptotic normality of UFXP and OUFXP, as well as the efficiency of the latter. Our estimators enable researchers to model utility functions non-parametrically via flexible neural-network approximations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript develops UFXP and OUFXP estimators for infinite-horizon dynamic discrete choice models. These exploit a dual representation of Bellman's equation to separate the utility parameters from the dynamic programming fixed point after a one-time pre-computation, thereby avoiding large linear systems either as constraints or inside the objective. The approach is claimed to enable non-parametric utility approximation via neural networks while delivering consistency, asymptotic normality, and (for OUFXP) efficiency.

Significance. If the dual separation is rigorously shown to survive neural-network approximations to utility, the estimators would constitute a meaningful computational advance for DDC models by removing the need for nested fixed-point solutions or repeated large-matrix operations during estimation. The theoretical results on consistency and efficiency would strengthen the case for adopting flexible utility specifications in applied work.

major comments (2)
  1. [theoretical results section] The central claim that the dual representation of Bellman's equation cleanly isolates utility parameters (and their neural-network weights ϕ) from the infinite-horizon fixed point is load-bearing for the entire contribution. The manuscript asserts that this separation survives the introduction of a nonlinear neural-network utility u(s,a;ϕ), but supplies no explicit algebraic derivation or operator identity demonstrating that the dual objective remains independent of V*(ϕ) once ϕ enters nonlinearly. (See the derivation of the dual form and its extension to neural networks in the theoretical results section.)
  2. [theoretical results section] No analysis is provided of how neural-network approximation error interacts with the fixed-point separation or with the consistency and asymptotic normality claims. Because the pre-computation step produces an approximation to the fixed point that is then held fixed while optimizing over ϕ, the interaction between approximation error and the outer estimator is central to whether the stated asymptotic properties hold in practice.
minor comments (2)
  1. The abstract is lengthy and repeats the computational advantage multiple times; a tighter version would improve readability.
  2. The manuscript would benefit from at least one set of Monte Carlo experiments that vary the neural-network architecture and report both bias and the computational cost relative to nested fixed-point or MPEC alternatives.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and constructive comments on the theoretical results. These points help clarify the presentation of the dual separation and its implications for neural-network approximations. We address each major comment below and will revise the theoretical results section to incorporate the requested material.

read point-by-point responses
  1. Referee: [theoretical results section] The central claim that the dual representation of Bellman's equation cleanly isolates utility parameters (and their neural-network weights ϕ) from the infinite-horizon fixed point is load-bearing for the entire contribution. The manuscript asserts that this separation survives the introduction of a nonlinear neural-network utility u(s,a;ϕ), but supplies no explicit algebraic derivation or operator identity demonstrating that the dual objective remains independent of V*(ϕ) once ϕ enters nonlinearly. (See the derivation of the dual form and its extension to neural networks in the theoretical results section.)

    Authors: We agree that an explicit algebraic derivation for the nonlinear case strengthens the exposition. The dual representation exploits the fact that the infinite-horizon fixed point can be pre-computed once for the state space and then held fixed while optimizing over ϕ; the dual objective is constructed so that it does not require re-solving the fixed point inside the estimation step. This structure is independent of whether u is linear or nonlinear in ϕ. In the revision we will insert a step-by-step derivation showing that, after the one-time pre-computation of the relevant operator, the dual objective evaluated at any candidate ϕ (including neural-network weights) does not depend on a ϕ-dependent value function V*(ϕ). This will make the separation transparent for the neural-network case. revision: yes

  2. Referee: [theoretical results section] No analysis is provided of how neural-network approximation error interacts with the fixed-point separation or with the consistency and asymptotic normality claims. Because the pre-computation step produces an approximation to the fixed point that is then held fixed while optimizing over ϕ, the interaction between approximation error and the outer estimator is central to whether the stated asymptotic properties hold in practice.

    Authors: We concur that the interaction between neural-network approximation error and the asymptotic properties merits explicit treatment. The current proofs establish consistency and asymptotic normality under the assumption that the utility function is correctly specified (or approximated at a rate that vanishes sufficiently fast). In the revision we will add a subsection analyzing the propagation of approximation error through the pre-computed fixed point. Under standard conditions on the neural-network approximation rate (e.g., as the number of units grows with sample size), we will show that the total error remains o_p(1) and that the UFXP and OUFXP estimators retain consistency and asymptotic normality; for OUFXP we will also confirm that efficiency is preserved asymptotically. revision: yes

Circularity Check

0 steps flagged

Algebraic dual-Bellman separation is independent of fitted NN parameters; no load-bearing self-citation or fitted-input renaming

full rationale

The paper's central claim rests on an algebraic identity in the dual representation of Bellman's equation that isolates utility parameters (including those inside a neural-network approximator) from the fixed-point operator after a single pre-computation step. This identity is presented as holding for the class of models considered and is used to derive consistency and asymptotic normality of UFXP/OUFXP without any indication that the separation itself is obtained by fitting parameters to the evaluation data or by a self-citation chain whose content reduces to the present result. No equation is shown to be definitionally equivalent to its inputs, and the efficiency result for OUFXP follows from standard extremum-estimator arguments once the separation is granted. The derivation is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The approach rests on the existence of a dual representation of Bellman's equation that isolates utility parameters; this is a standard property of discounted dynamic programming but its use for neural-network estimation is new. No free parameters or invented entities are described in the abstract.

axioms (1)
  • domain assumption A dual representation of Bellman's equation exists that separates the utility parameters from the value-function fixed point.
    Invoked to justify moving the fixed-point computation outside the estimation routine.

pith-pipeline@v0.9.0 · 5399 in / 1266 out tokens · 25959 ms · 2026-05-10T16:50:21.809588+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

44 extracted references · 44 canonical work pages

  1. [1]

    Aguirregabiria, Victor. 1999. The dynamics of markups and inventories in retailing firms . The Review of Economic Studies\/ 66 (2) 275--308

  2. [2]

    Aguirregabiria, Victor, Arvind Magesan. 2013. Euler equations for the estimation of dynamic discrete choice structural models. Structural Econometric Models\/ , Advances in Econometrics\/ , vol. 31. Emerald Group Publishing Limited, 3--44

  3. [3]

    Aguirregabiria, Victor, Pedro Mira. 2002. Swapping the nested fixed point algorithm: A class of estimators for discrete markov decision models. Econometrica\/ 70 (4) 1519--1543

  4. [4]

    Aguirregabiria, Victor, Pedro Mira. 2007. Sequential estimation of dynamic discrete games. Econometrica\/ 75 (1) 1--53

  5. [5]

    Aguirregabiria, Victor, Pedro Mira. 2010. Dynamic discrete choice structural models: A survey. Journal of Econometrics\/ 156 (1) 38--67

  6. [6]

    Aouad, Ali, Antoine D\' e sir. 2025. Representing random utility choice models with neural networks. Management Science\/ 0 (0) null

  7. [7]

    Ellickson

    Arcidiacono, Peter, Paul B. Ellickson. 2011. Practical methods for estimation of dynamic discrete choice models. Annual Review of Economics\/ 3 (Volume 3, 2011) 363--394

  8. [8]

    Arcidiacono, Peter, Robert A. Miller. 2011. Conditional choice probability estimation of dynamic discrete choice models with unobserved heterogeneity. Econometrica\/ 79 (6) 1823--1867

  9. [9]

    Arkoudi, Ioanna, Rico Krueger, Carlos Lima Azevedo, Francisco C. Pereira. 2023. Combining discrete choice models and neural networks through embeddings: Formulation, interpretability and performance. Transportation Research Part B: Methodological\/ 175 102783

  10. [10]

    Lanier Benkard, Jonathan Levin

    Bajari, Patrick, C. Lanier Benkard, Jonathan Levin. 2007. Estimating dynamic models of imperfect competition. Econometrica\/ 75 (5) 1331--1370

  11. [11]

    Bray, Robert L. 2019 a . Markov Decision Processes with Exogenous Variables . Management Science\/ 65 (10) 4598--4606

  12. [12]

    Bray, Robert L. 2019 b . Strong convergence and dynamic economic models . Quantitative Economics\/ 10 (1) 43--65

  13. [13]

    Bray, Robert L., Yuliang Yao, Yongrui Duan, Jiazhen Huo. 2019. Ration gaming and the bullwhip effect. Operations Research\/ 67 (2) 453--467

  14. [14]

    Byrd, Richard H., Jorge Nocedal, Richard A. Waltz. 2006. Knitro: An integrated package for nonlinear optimization. G. Di Pillo, M. Roma, eds., Large-Scale Nonlinear Optimization\/ . Springer US, Boston, MA, 35--59

  15. [15]

    Colin, Pravin K

    Cameron, A. Colin, Pravin K. Trivedi. 2005. Microeconometrics: Methods and Applications\/ . Cambridge University Press

  16. [16]

    Dearing, Adam. 2019. Pseudo-value functions and closed-form ccp estimation of dynamic discrete choice models, working paper

  17. [17]

    Han, Yafei, Francisco Camara Pereira, Moshe Ben-Akiva, Christopher Zegras. 2022. A neural-embedded discrete choice model: Learning taste representation with strengthened interpretability. Transportation Research Part B: Methodological\/ 163 166--186

  18. [18]

    He, Kaiming, Xiangyu Zhang, Shaoqing Ren, Jian Sun. 2015. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. Proceedings of the IEEE International Conference on Computer Vision (ICCV)\/ . 1026--1034

  19. [19]

    Joseph, Robert A

    Hotz, V. Joseph, Robert A. Miller. 1993. Conditional Choice Probabilities and the Estimation of Dynamic Models . The Review of Economic Studies\/ 60 (3) 497--529

  20. [20]

    Joseph, Robert A

    Hotz, V. Joseph, Robert A. Miller, Seth Sanders, Jeffrey Smith. 1994. A Simulation Estimator for Dynamic Models of Discrete Choice . The Review of Economic Studies\/ 61 (2) 265--289

  21. [21]

    Hsieh, Sung-Lin, Shaowei Ke, Zhaoran Wang, Chen Zhao. 2025. Logit neural-network utility. Journal of Economic Behavior & Organization\/ 236 107054

  22. [22]

    constrained optimization approaches to estimation of structural models

    Iskhakov, Fedor, Jinhyuk Lee, John Rust, Bertel Schjerning, Kyoungwon Seo. 2016. Comment on “constrained optimization approaches to estimation of structural models”. Econometrica\/ 84 (1) 365--370

  23. [23]

    Judd, Kenneth. 1998. Numerical Methods in Economics\/ , vol. 1. 1st ed. The MIT Press

  24. [24]

    Kasahara, Hiroyuki, Katsumi Shimotsu. 2018. Estimation of Discrete Choice Dynamic Programming Models . The Journal of Japanese Economic Association\/ 69 (1) 28--58

  25. [25]

    Kingma, Diederik P., Jimmy Ba. 2017. Adam: A method for stochastic optimization

  26. [26]

    Lee, John M. 2012. Introduction to Smooth Manifolds\/ , Graduate Texts in Mathematics\/ , vol. 218. 2nd ed. Springer, New York, NY

  27. [27]

    Liu, Dong C., Jorge Nocedal. 1989. On the limited memory bfgs method for large scale optimization. Mathematical Programming\/ 45 503--528

  28. [28]

    Miessi Sanches, Fabio A., Daniel Junior Silva, Sorawoot Srisuma. 2016. Ordinary least squares estimation of a dynamic game model. International Economic Review\/ 57 (2) 623--634

  29. [29]

    Newey, Whitney K., Daniel McFadden. 1994. Large sample estimation and hypothesis testing. Handbook of Econometrics\/ , Handbook of Econometrics\/ , vol. 4, chap. 36. Elsevier, 2111--2245

  30. [30]

    Nocedal, Jorge, Stephen J. Wright. 2006. Numerical Optimization \/ . 2nd ed. Springer Series in Operations Research and Financial Engineering, Springer, New York

  31. [31]

    Pakes, Ariel, Michael Ostrovsky, Steven Berry. 2007. Simple estimators for the parameters of discrete dynamic games (with entry/exit examples). The RAND Journal of Economics\/ 38 (2) 373--399

  32. [32]

    Pesendorfer, Martin, Philipp Schmidt-Dengler. 2008. Asymptotic Least Squares Estimators for Dynamic Games1 . The Review of Economic Studies\/ 75 (3) 901--928

  33. [33]

    Rust, John. 1986. Structural estimation of markov decision processes. Handbook of Econometrics\/ , Handbook of Econometrics\/ , vol. 4, 1st ed., chap. 51. Elsevier, 3081--3143

  34. [34]

    Rust, John. 1987 a . Optimal replacement of GMC bus engines: An empirical model of Harold Zurcher . Econometrica\/ 55 (5) 999--1033

  35. [35]

    Rust, John. 1987 b . Optimal replacement of gmc bus engines: An empirical model of harold zurcher. Econometrica\/ 55 (5) 999--1033

  36. [36]

    Rust, John. 2000. Nested fixed point algorithm documentation manual, unpublished manuscript

  37. [37]

    Shastri, Anant R. 2011. Elements of Differential Topology\/ . CRC Press, Boca Raton, FL

  38. [38]

    Sifringer, Brian, Virginie Lurkin, Alexandre Alahi. 2020. Enhancing discrete choice models with representation learning. Transportation Research Part B: Methodological\/ 140 236--261

  39. [39]

    Su, Che-Lin, Kenneth L. Judd. 2012. Constrained optimization approaches to estimation of structural models. Econometrica\/ 80 (5) 2213--2230

  40. [40]

    van Cranenburgh , Sander, Shenhao Wang, Akshay Vij, Francisco Pereira, Joan Walker. 2022. Choice modelling in the age of machine learning - discussion paper. Journal of Choice Modelling\/ 42 100340

  41. [41]

    van der Laan, Lars, Aurelien Bibaut, Nathan Kallus. 2025. Efficient inference for inverse reinforcement learning and dynamic discrete choice models

  42. [42]

    Wei, Yanhao (Max), Zhenling Jiang. 2025. Estimating parameters of structural models using neural networks. Marketing Science\/ 44 (1) 102--128

  43. [43]

    , " * write output.state after.block = add.period write newline

    ENTRY address author booktitle chapter doi edition editor eid howpublished institution journal key month note number organization pages publisher school series title type url volume year label extra.label sort.label short.list INTEGERS output.state before.all mid.sentence after.sentence after.block FUNCTION init.state.consts #0 'before.all := #1 'mid.sent...

  44. [44]

    write newline

    " write newline "" before.all 'output.state := FUNCTION n.dashify 't := "" t empty not t #1 #1 substring "-" = t #1 #2 substring "--" = not "--" * t #2 global.max substring 't := t #1 #1 substring "-" = "-" * t #2 global.max substring 't := while if t #1 #1 substring * t #2 global.max substring 't := if while FUNCTION word.in "" FUNCTION format.date year ...