Training Neural Networks Embedded in Dynamic Discrete Choice Models
Pith reviewed 2026-05-10 16:50 UTC · model grok-4.3
The pith
A dual form of Bellman's equation separates utility parameters from the dynamic programming fixed point, enabling neural network approximations in infinite-horizon dynamic discrete choice models.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors introduce the unnested fixed point (UFXP) estimator and its optimal variant (OUFXP) that rest on a dual representation of Bellman's equation. This representation cleanly separates the utility parameters from the dynamic programming fixed point for the class of infinite-horizon discrete choice models considered. After a one-time pre-computation step, the estimation problem contains neither embedded linear systems nor constraints enforcing them. The authors prove consistency and asymptotic normality for both estimators and efficiency for OUFXP, thereby justifying neural-network approximations to utility.
What carries the argument
The dual representation of Bellman's equation, which isolates utility parameters from the dynamic programming fixed point.
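For orientation, the primal fixed point that the dual form is meant to keep out of the estimation step can be written in a standard textbook form. The block below is a sketch under assumptions that this review does not state: a finite state space S, a finite action set A, a discount factor \beta, known transition probabilities p(s' | s, a), and additive i.i.d. type-1 extreme value shocks. The paper's own dual representation is not reproduced here.

```latex
% Integrated Bellman equation under logit shocks (assumed, not taken from the paper);
% \gamma denotes the Euler-Mascheroni constant.
\begin{align}
  v_\phi(s,a) &= u(s,a;\phi) + \beta \sum_{s' \in S} p(s' \mid s,a)\, V_\phi(s'), \\
  V_\phi(s)   &= \gamma + \log \sum_{a \in A} \exp\bigl( v_\phi(s,a) \bigr), \\
  P_\phi(a \mid s) &= \frac{\exp\bigl( v_\phi(s,a) \bigr)}{\sum_{a' \in A} \exp\bigl( v_\phi(s,a') \bigr)}.
\end{align}
```

In a nested fixed-point scheme, V_\phi must be re-solved for every candidate \phi, which is the dependence on the fixed point that the separation claim is about.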
If this is right
- Utility functions in infinite-horizon DDC models can be estimated non-parametrically with neural networks.
- The UFXP estimator is consistent and asymptotically normal.
- The OUFXP estimator is consistent, asymptotically normal, and efficient.
- After pre-computation, the estimation problem involves no large systems of linear equations, either as constraints or inside the objective.
Where Pith is reading between the lines
- The separation may extend to other dynamic models whose Bellman equations admit analogous dual forms.
- Researchers could test whether deeper or recurrent networks improve fit in settings where state spaces are large.
- The pre-computation step could be reused across multiple utility specifications, lowering the cost of specification searches.
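The reuse idea in the last bullet can be illustrated with a classical analogue. The sketch below is not the paper's UFXP pre-computation, whose details are not given in this review. It is the policy-valuation step familiar from conditional-choice-probability methods, under the assumption of logit shocks: for fixed choice probabilities P, the matrix (I - beta * F_P)^{-1} does not depend on the utility function, so it can be computed once and applied to several utility specifications. All function and variable names are hypothetical.

```python
import numpy as np

def policy_valuation_matrix(P, F, beta):
    """Pre-compute (I - beta * F_P)^{-1}; depends on CCPs and transitions only.

    P: (S, A) choice probabilities; F: (A, S, S) transition matrices.
    """
    S = P.shape[0]
    F_P = np.einsum("sa,ast->st", P, F)  # transition matrix induced by the policy
    return np.linalg.inv(np.eye(S) - beta * F_P)

def value_under_policy(M, P, u):
    """Value of following P under utility u, assuming type-1 extreme value shocks."""
    e = np.euler_gamma - np.log(P)       # expected shock conditional on the choice
    return M @ np.sum(P * (u + e), axis=1)

# Illustrative inputs (randomly generated, not from the paper).
rng = np.random.default_rng(0)
S, A, beta = 5, 2, 0.9
F = rng.dirichlet(np.ones(S), size=(A, S))  # shape (A, S, S), rows sum to one
P = rng.dirichlet(np.ones(A), size=S)       # shape (S, A), rows sum to one

M = policy_valuation_matrix(P, F, beta)     # one-time pre-computation

# Two utility specifications share the same pre-computed matrix.
u_linear = rng.normal(size=(S, A))
u_other = np.tanh(rng.normal(size=(S, A)))
V_linear = value_under_policy(M, P, u_linear)
V_other = value_under_policy(M, P, u_other)
```

The design point is only that the expensive object is utility-free, so a specification search pays for it once. Whether the paper's pre-computation has this exact shape is an assumption of the sketch.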
Load-bearing premise
The dual representation of Bellman's equation continues to separate utility parameters from the fixed point even after neural networks approximate the utility function.
What would settle it
In a Monte Carlo experiment with a known data-generating process, the UFXP estimator fails to recover the true utility parameters at the rate and with the asymptotic distribution predicted by the paper's theorems.
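A minimal sketch of the data-generating side of such an experiment follows. It assumes a small finite state space, logit shocks, and randomly drawn true utilities and transitions, none of which come from the paper. It solves the model by value iteration, simulates a panel, and compares empirical choice frequencies with the true choice probabilities. The estimator under test (UFXP, OUFXP, or a baseline) is deliberately left out; it would be applied to the simulated `states` and `actions`, and the exercise repeated across seeds to trace out its sampling distribution. All names are hypothetical.

```python
import numpy as np

def solve_logit_bellman(u, F, beta, tol=1e-10, max_iter=10_000):
    """Value iteration on V(s) = gamma + log sum_a exp(u[s, a] + beta * F[a, s] @ V)."""
    V = np.zeros(u.shape[0])
    for _ in range(max_iter):
        v = u + beta * np.einsum("ast,t->sa", F, V)  # choice-specific values
        m = v.max(axis=1)                            # stabilise the log-sum-exp
        V_new = np.euler_gamma + m + np.log(np.exp(v - m[:, None]).sum(axis=1))
        done = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if done:
            break
    v = u + beta * np.einsum("ast,t->sa", F, V)
    P = np.exp(v - v.max(axis=1, keepdims=True))
    return V, P / P.sum(axis=1, keepdims=True)

def draw(cdf_rows, rng):
    """Inverse-CDF draw of one category per row of cumulative probabilities."""
    k = (rng.random(cdf_rows.shape[0])[:, None] > cdf_rows).sum(axis=1)
    return np.minimum(k, cdf_rows.shape[1] - 1)      # guard against rounding in the CDF

def simulate_panel(P, F, n_agents, n_periods, rng):
    """Simulate states and actions for agents who follow the choice probabilities P."""
    s = rng.integers(P.shape[0], size=n_agents)
    states, actions = [], []
    for _ in range(n_periods):
        a = draw(np.cumsum(P[s], axis=1), rng)
        states.append(s)
        actions.append(a)
        s = draw(np.cumsum(F[a, s], axis=1), rng)    # next state given (a, s)
    return np.concatenate(states), np.concatenate(actions)

# Known data-generating process (illustrative values only).
rng = np.random.default_rng(0)
S, A, beta = 4, 2, 0.9
F = rng.dirichlet(np.ones(S), size=(A, S))           # shape (A, S, S)
u_true = rng.normal(size=(S, A))
V_true, P_true = solve_logit_bellman(u_true, F, beta)

states, actions = simulate_panel(P_true, F, n_agents=2000, n_periods=50, rng=rng)

# Frequency estimate of the choice probabilities from the simulated panel.
counts = np.zeros((S, A))
np.add.at(counts, (states, actions), 1)
P_hat = counts / counts.sum(axis=1, keepdims=True)
max_ccp_error = np.abs(P_hat - P_true).max()
```

Checking that `max_ccp_error` shrinks as the panel grows is only a sanity check on the simulator. The test described above would instead track the estimator's error in the utility parameters against the rate and limiting distribution stated in the theorems.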
Original abstract
We develop the first general-purpose estimator for infinite-horizon dynamic discrete choice models whose estimation problem, after pre-computation, is unencumbered by large systems of linear equations -- either imposed as constraints, or embedded in the objective function. Our unnested fixed point (UFXP) and optimal unnested fixed point (OUFXP) estimators exploit a dual representation of Bellman's equation to separate the utility parameters from the dynamic programming fixed point. We establish the consistency and asymptotic normality of UFXP and OUFXP, as well as the efficiency of the latter. Our estimators enable researchers to model utility functions non-parametrically via flexible neural-network approximations.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript develops UFXP and OUFXP estimators for infinite-horizon dynamic discrete choice models. These exploit a dual representation of Bellman's equation to separate the utility parameters from the dynamic programming fixed point after a one-time pre-computation, thereby avoiding large linear systems either as constraints or inside the objective. The approach is claimed to enable non-parametric utility approximation via neural networks while delivering consistency, asymptotic normality, and (for OUFXP) efficiency.
Significance. If the dual separation is rigorously shown to survive neural-network approximations to utility, the estimators would constitute a meaningful computational advance for DDC models by removing the need for nested fixed-point solutions or repeated large-matrix operations during estimation. The theoretical results on consistency and efficiency would strengthen the case for adopting flexible utility specifications in applied work.
major comments (2)
- [theoretical results section] The central claim that the dual representation of Bellman's equation cleanly isolates utility parameters (and their neural-network weights ϕ) from the infinite-horizon fixed point is load-bearing for the entire contribution. The manuscript asserts that this separation survives the introduction of a nonlinear neural-network utility u(s,a;ϕ), but supplies no explicit algebraic derivation or operator identity demonstrating that the dual objective remains independent of V*(ϕ) once ϕ enters nonlinearly. (See the derivation of the dual form and its extension to neural networks in the theoretical results section.)
- [theoretical results section] No analysis is provided of how neural-network approximation error interacts with the fixed-point separation or with the consistency and asymptotic normality claims. Because the pre-computation step produces an approximation to the fixed point that is then held fixed while optimizing over ϕ, the interaction between approximation error and the outer estimator is central to whether the stated asymptotic properties hold in practice.
minor comments (2)
- The abstract is lengthy and repeats the computational advantage multiple times; a tighter version would improve readability.
- The manuscript would benefit from at least one set of Monte Carlo experiments that vary the neural-network architecture and report both bias and the computational cost relative to nested fixed-point or MPEC alternatives.
Simulated Author's Rebuttal
We thank the referee for their careful reading and constructive comments on the theoretical results. These points help clarify the presentation of the dual separation and its implications for neural-network approximations. We address each major comment below and will revise the theoretical results section to incorporate the requested material.
Point-by-point responses
- Referee: [theoretical results section] The central claim that the dual representation of Bellman's equation cleanly isolates utility parameters (and their neural-network weights ϕ) from the infinite-horizon fixed point is load-bearing for the entire contribution. The manuscript asserts that this separation survives the introduction of a nonlinear neural-network utility u(s,a;ϕ), but supplies no explicit algebraic derivation or operator identity demonstrating that the dual objective remains independent of V*(ϕ) once ϕ enters nonlinearly. (See the derivation of the dual form and its extension to neural networks in the theoretical results section.)
Authors: We agree that an explicit algebraic derivation for the nonlinear case strengthens the exposition. The dual representation exploits the fact that the infinite-horizon fixed point can be pre-computed once for the state space and then held fixed while optimizing over ϕ; the dual objective is constructed so that it does not require re-solving the fixed point inside the estimation step. This structure is independent of whether u is linear or nonlinear in ϕ. In the revision we will insert a step-by-step derivation showing that, after the one-time pre-computation of the relevant operator, the dual objective evaluated at any candidate ϕ (including neural-network weights) does not depend on a ϕ-dependent value function V*(ϕ). This will make the separation transparent for the neural-network case.
Revision: yes
- Referee: [theoretical results section] No analysis is provided of how neural-network approximation error interacts with the fixed-point separation or with the consistency and asymptotic normality claims. Because the pre-computation step produces an approximation to the fixed point that is then held fixed while optimizing over ϕ, the interaction between approximation error and the outer estimator is central to whether the stated asymptotic properties hold in practice.
Authors: We concur that the interaction between neural-network approximation error and the asymptotic properties merits explicit treatment. The current proofs establish consistency and asymptotic normality under the assumption that the utility function is correctly specified (or approximated at a rate that vanishes sufficiently fast). In the revision we will add a subsection analyzing the propagation of approximation error through the pre-computed fixed point. Under standard conditions on the neural-network approximation rate (e.g., as the number of units grows with sample size), we will show that the total error remains o_p(1) and that the UFXP and OUFXP estimators retain consistency and asymptotic normality; for OUFXP we will also confirm that efficiency is preserved asymptotically.
Revision: yes
Circularity Check
Algebraic dual-Bellman separation is independent of fitted NN parameters; no load-bearing self-citation or fitted-input renaming
Full rationale
The paper's central claim rests on an algebraic identity in the dual representation of Bellman's equation that isolates utility parameters (including those inside a neural-network approximator) from the fixed-point operator after a single pre-computation step. This identity is presented as holding for the class of models considered and is used to derive consistency and asymptotic normality of UFXP/OUFXP without any indication that the separation itself is obtained by fitting parameters to the evaluation data or by a self-citation chain whose content reduces to the present result. No equation is shown to be definitionally equivalent to its inputs, and the efficiency result for OUFXP follows from standard extremum-estimator arguments once the separation is granted. The derivation is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: a dual representation of Bellman's equation exists that separates the utility parameters from the value-function fixed point.