Graph neural networks for residential location choice: connection to classical logit models
Pith reviewed 2026-05-22 00:21 UTC · model grok-4.3
The pith
Graph neural networks can represent classical nested and spatially correlated logit models as special cases for analyzing residential location choices.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors show that graph neural network discrete choice models incorporate the nested logit model and the spatially correlated logit model as two specific cases. This yields a novel algorithmic interpretation through message passing among alternatives' utilities, while maintaining connections to classical random utility theory and offering improved predictive performance in spatial choice settings.
What carries the argument
Message passing on graphs of choice alternatives, where aggregation of neighboring utilities reproduces correlation structures from nested and spatial logit models.
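For orientation, the standard nested logit probability can be rewritten so that the sum over nest-mates looks like a neighbourhood aggregation. The first line is the textbook form; the adjacency notation $A_{ij}$, the nest map $m(i)$, and the aggregate $S_i$ are assumptions introduced here for illustration and may differ from the paper's own construction.

```latex
% Nested logit with nests B_1, ..., B_M and scale parameters \lambda_m \in (0, 1]
\[
P_i = \frac{e^{V_i/\lambda_m}\,\bigl(\sum_{j \in B_m} e^{V_j/\lambda_m}\bigr)^{\lambda_m - 1}}
           {\sum_{l=1}^{M} \bigl(\sum_{j \in B_l} e^{V_j/\lambda_l}\bigr)^{\lambda_l}},
\qquad i \in B_m .
\]
% Same expression with A_{ij} = 1 if i and j share a nest (including j = i), else 0
\[
S_i = \sum_{j} A_{ij}\, e^{V_j/\lambda_{m(j)}}, \qquad
P_i = \frac{e^{V_i/\lambda_{m(i)}}\, S_i^{\lambda_{m(i)} - 1}}
           {\sum_{k} e^{V_k/\lambda_{m(k)}}\, S_k^{\lambda_{m(k)} - 1}} .
\]
```

The two denominators agree because summing $e^{V_k/\lambda_{m(k)}} S_k^{\lambda_{m(k)}-1}$ over the members of nest $m$ gives $S_m^{\lambda_m}$.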
Load-bearing premise
The graph structure among choice alternatives can be defined such that message passing in the GNN exactly reproduces the desired correlation structure in random utilities.
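A small numerical check can make this premise concrete for the nested logit case. The sketch below is an illustration under stated assumptions, not the authors' implementation: the graph is taken to be block-structured (an edge wherever two alternatives share a nest, self-loops included), each node sends exp(V/λ) as its message, and the function names are hypothetical.

```python
import numpy as np

def nested_logit_closed_form(V, nests, lam):
    """Textbook nested logit probabilities.
    V: (J,) utilities; nests: (J,) integer nest index; lam: (M,) nest scales."""
    V = np.asarray(V, dtype=float)
    nests = np.asarray(nests)
    lam = np.asarray(lam, dtype=float)
    # Sum of exp(V / lambda_m) inside each nest
    nest_sums = np.array([np.exp(V[nests == m] / lam[m]).sum()
                          for m in range(len(lam))])
    denom = (nest_sums ** lam).sum()
    P = np.zeros_like(V)
    for i in range(len(V)):
        m = nests[i]
        P[i] = np.exp(V[i] / lam[m]) * nest_sums[m] ** (lam[m] - 1.0) / denom
    return P

def nested_logit_message_passing(V, nests, lam):
    """Same probabilities via one round of sum aggregation on a nest graph.
    Assumption: A[i, j] = 1 when i and j share a nest (self-loops included)."""
    V = np.asarray(V, dtype=float)
    nests = np.asarray(nests)
    lam_i = np.asarray(lam, dtype=float)[nests]        # per-node scale
    A = (nests[:, None] == nests[None, :]).astype(float)
    msg = np.exp(V / lam_i)                            # message sent by each node
    agg = A @ msg                                      # sum over graph neighbours
    score = msg * agg ** (lam_i - 1.0)                 # node update
    return score / score.sum()

V = np.array([1.0, 0.5, -0.2, 0.3])
nests = [0, 0, 1, 1]
lam = [0.5, 0.8]
p_closed = nested_logit_closed_form(V, nests, lam)
p_graph = nested_logit_message_passing(V, nests, lam)
print(np.allclose(p_closed, p_graph))
```

Setting every λ to 1 makes the aggregation term drop out, so the same function returns plain multinomial logit probabilities.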
What would settle it
Fitting the GNN-DCM to synthetic data generated from a nested logit model, with a graph that mirrors the nesting tree, and checking whether it recovers the nesting parameters and the implied correlations.
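The data-generating side of such a test could look like the sketch below. It is an assumption-laden illustration: a single shared nesting parameter, known utilities, and a grid-search maximum-likelihood fit of the nested logit itself stand in for the GNN-DCM, which would be substituted at the fitting step. All function names are hypothetical.

```python
import numpy as np

def nl_probs(V, nests, lam):
    """Nested logit probabilities with one shared scale lam in (0, 1]."""
    V = np.asarray(V, dtype=float)
    nests = np.asarray(nests)
    same_nest = (nests[:, None] == nests[None, :]).astype(float)
    e = np.exp(V / lam)
    score = e * (same_nest @ e) ** (lam - 1.0)
    return score / score.sum()

def simulate_choices(V, nests, lam, n, seed=0):
    """Draw n synthetic choices from the nested logit probabilities."""
    rng = np.random.default_rng(seed)
    return rng.choice(len(V), size=n, p=nl_probs(V, nests, lam))

def fit_lambda_grid(choices, V, nests, grid):
    """Pick the grid value of lam that maximises the sample log-likelihood."""
    counts = np.bincount(choices, minlength=len(V))
    loglik = [counts @ np.log(nl_probs(V, nests, lam)) for lam in grid]
    return float(grid[int(np.argmax(loglik))])

V = np.array([1.0, 0.5, -0.2, 0.3])
nests = np.array([0, 0, 1, 1])
true_lam = 0.5
choices = simulate_choices(V, nests, true_lam, n=200_000, seed=0)
lam_hat = fit_lambda_grid(choices, V, nests, np.linspace(0.2, 1.0, 17))
shares = np.bincount(choices, minlength=len(V)) / len(choices)
print(lam_hat, shares)
```

A GNN-DCM that contains the nested logit as a special case should reach a comparable likelihood on the same draws when its graph matches `nests`.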
Original abstract
Researchers have adopted deep learning for classical discrete choice analysis as it can capture complex feature relationships and achieve higher predictive performance. However, the existing deep learning approaches cannot explicitly capture the relationship among choice alternatives, which has been a long-lasting focus in classical discrete choice models. To address the gap, this paper introduces Graph Neural Network (GNN) as a novel framework to analyze residential location choice. The GNN-based discrete choice models (GNN-DCMs) offer a structured approach for neural networks to capture dependence among spatial alternatives, while maintaining clear connections to classical random utility theory. Theoretically, we demonstrate that the GNN-DCMs incorporate the nested logit (NL) model and the spatially correlated logit (SCL) model as two specific cases, yielding novel algorithmic interpretation through message passing among alternatives' utilities. Empirically, the GNN-DCMs outperform benchmark MNL, SCL, and feedforward neural networks in predicting residential location choices among Chicago's 77 community areas. Regarding model interpretation, the GNN-DCMs can capture individual heterogeneity and exhibit spatially-aware substitution patterns. Overall, these results highlight the potential of GNN-DCMs as a unified and expressive framework for synergizing discrete choice modeling and deep learning in the complex spatial choice contexts.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Graph Neural Network-based Discrete Choice Models (GNN-DCMs) for residential location choice. It claims that these models capture dependence among spatial alternatives while maintaining connections to random utility theory, and theoretically demonstrates that GNN-DCMs incorporate the nested logit (NL) and spatially correlated logit (SCL) models as specific cases via message passing among alternatives' utilities. Empirically, GNN-DCMs are shown to outperform MNL, SCL, and feedforward neural networks in predicting choices among Chicago's 77 community areas, with additional claims regarding interpretation of individual heterogeneity and spatially-aware substitution patterns.
Significance. If the theoretical equivalences are established with explicit derivations, the work provides a meaningful bridge between graph neural networks and classical discrete choice models, potentially enabling more flexible modeling of spatial correlations while preserving interpretability. The empirical application to real residential choice data suggests practical gains in predictive accuracy for complex spatial settings.
major comments (3)
- [Theoretical demonstration of NL and SCL inclusion] The central theoretical claim that GNN-DCMs incorporate NL and SCL as specific cases is load-bearing but lacks explicit algebraic reductions. The manuscript should specify the exact GNN parameters (aggregation function, normalization, layer count, and edge-weight definitions) that recover the closed-form inclusive-value terms of NL or the covariance adjustments of SCL without approximation.
- [Empirical results and model application] The empirical evaluation asserts outperformance but omits key details on graph construction (how the nodes, i.e. the alternatives, and the edges among the 77 Chicago community areas are defined to encode spatial correlations) and training procedures. Without these, the superiority over MNL, SCL, and feedforward networks cannot be fully assessed.
- [Model interpretation] The interpretation claims regarding individual heterogeneity and substitution patterns require concrete methods (e.g., how attention or message-passing weights map to cross-elasticities or heterogeneity parameters) to demonstrate they extend or align with classical random-utility interpretations.
minor comments (2)
- [Abstract] The abstract would be strengthened by including at least one quantitative performance metric (e.g., log-likelihood improvement or hit rate) alongside the qualitative claim of outperformance.
- [Notation and definitions] Notation for utilities, message-passing functions, and graph adjacency should be introduced with explicit cross-references to classical logit notation to improve readability.
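The graph-construction question raised in the major comments can be made concrete with a toy example. The sketch below assumes one plausible recipe (contiguity edges plus any pair of areas whose centroids lie within a distance threshold); the four-area layout, the threshold, and the function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def build_adjacency(contiguity, centroids, max_dist):
    """Union of contiguity edges and centroid-distance edges, no self-loops."""
    contiguity = np.asarray(contiguity, dtype=bool)
    centroids = np.asarray(centroids, dtype=float)
    diff = centroids[:, None, :] - centroids[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))      # pairwise Euclidean distance
    A = contiguity | (dist <= max_dist)
    A = A | A.T                                   # enforce symmetry
    np.fill_diagonal(A, False)                    # drop self-loops
    return A.astype(int)

def to_edge_index(A):
    """Two-row (source, target) edge list with one column per directed edge."""
    src, dst = np.nonzero(A)
    return np.stack([src, dst])

# Toy layout: four areas on a line, contiguous in a chain 0-1-2-3
contiguity = np.zeros((4, 4), dtype=bool)
for i, j in [(0, 1), (1, 2), (2, 3)]:
    contiguity[i, j] = contiguity[j, i] = True
centroids = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (5.0, 0.0)]

A = build_adjacency(contiguity, centroids, max_dist=2.0)
edge_index = to_edge_index(A)
print(A)
```

The two-row edge list follows the COO layout commonly used for `edge_index` in PyTorch Geometric, so a matrix built this way could be handed to a GNN layer after conversion to a tensor.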
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments. We address each major point below and have revised the manuscript to strengthen the theoretical derivations, add missing methodological details, and enhance the interpretation section.
-
Referee: [Theoretical demonstration of NL and SCL inclusion] The central theoretical claim that GNN-DCMs incorporate NL and SCL as specific cases is load-bearing but lacks explicit algebraic reductions. The manuscript should specify the exact GNN parameters (aggregation function, normalization, layer count, and edge-weight definitions) that recover the closed-form inclusive-value terms of NL or the covariance adjustments of SCL without approximation.
Authors: We appreciate this observation and agree that explicit algebraic reductions strengthen the claim. The original manuscript demonstrates the connection at the level of message passing on utilities but does not spell out the precise parameter settings. In the revised version we will add a dedicated subsection with the required derivations: for the nested logit case we specify a single-layer GNN using sum aggregation, identity normalization, and edge weights defined by the nesting partition (intra-nest edges set to 1, inter-nest edges set to 0); this exactly reproduces the inclusive-value term. An analogous construction with covariance-adjusted edge weights recovers the spatially correlated logit model without approximation.
revision: yes
-
Referee: [Empirical results and model application] The empirical evaluation asserts outperformance but omits key details on graph construction (how the nodes, i.e. the alternatives, and the edges among the 77 Chicago community areas are defined to encode spatial correlations) and training procedures. Without these, the superiority over MNL, SCL, and feedforward networks cannot be fully assessed.
Authors: We agree that these implementation details are essential for reproducibility and evaluation. The revised manuscript will include an expanded methods section that (i) defines the graph explicitly (nodes are the 77 community areas, and edges are constructed from a spatial adjacency matrix augmented by a distance threshold to capture correlations) and (ii) reports the full training protocol, including optimizer (Adam), learning rate, batch size, epoch count, early-stopping criterion, and software framework (PyTorch Geometric). These additions will allow readers to assess the reported performance gains.
revision: yes
-
Referee: [Model interpretation] The interpretation claims regarding individual heterogeneity and substitution patterns require concrete methods (e.g., how attention or message-passing weights map to cross-elasticities or heterogeneity parameters) to demonstrate they extend or align with classical random-utility interpretations.
Authors: We thank the referee for this suggestion. The current manuscript discusses interpretation at a conceptual level. In revision we will add explicit mapping procedures: we will show how the learned message-passing weights can be converted into implied cross-elasticities by differentiating the post-propagation utilities, and how node-level feature embeddings capture individual heterogeneity. Numerical illustrations drawn from the Chicago estimation results will be included to demonstrate consistency with classical random-utility quantities.
revision: yes
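The idea of reading cross-elasticities off the post-propagation utilities can be sketched on a toy model. Everything specific here is an assumption made for illustration: a single attribute x with coefficient beta, one linear propagation step U = V + alpha * A_hat V over a row-normalised adjacency, and hypothetical function names. It is a stand-in, not the paper's architecture.

```python
import numpy as np

def softmax(u):
    z = np.exp(u - np.max(u))
    return z / z.sum()

def row_normalize(A):
    """Divide each row by its degree (rows with no edges stay zero)."""
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1, keepdims=True)
    return A / np.maximum(deg, 1.0)

def choice_probs(x, beta, A_hat, alpha):
    """Toy model: V = beta * x, one propagation step, then softmax."""
    V = beta * x
    U = V + alpha * (A_hat @ V)
    return softmax(U)

def elasticity_matrix(x, beta, A_hat, alpha):
    """E[j, k] = (dP_j / dx_k) * x_k / P_j, obtained by the chain rule."""
    P = choice_probs(x, beta, A_hat, alpha)
    dP_dU = np.diag(P) - np.outer(P, P)            # softmax Jacobian
    dU_dV = np.eye(len(x)) + alpha * A_hat         # propagation Jacobian
    dP_dx = (dP_dU @ dU_dV) * beta                 # dV_k / dx_k = beta
    return dP_dx * x[None, :] / P[:, None]

x = np.array([1.0, 2.0, 0.5, 1.5])                 # e.g. a cost attribute
beta, alpha = -0.4, 0.3
chain = np.array([[0, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]])
A_hat = row_normalize(chain)
E = elasticity_matrix(x, beta, A_hat, alpha)
print(np.round(E, 3))
```

With `alpha = 0` the off-diagonal entries in each column collapse to the single value `-beta * x[k] * P[k]`, which is the familiar proportional-substitution pattern of the multinomial logit. A nonzero `alpha` lets the entries in a column differ depending on the graph.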
Circularity Check
No circularity: the theoretical mapping to NL/SCL is an independent construction
The paper claims to demonstrate that GNN-DCMs recover NL and SCL as specific cases through message passing on defined graphs among alternatives. This is a direct algebraic or structural equivalence shown by choosing particular aggregation functions and edge definitions to match closed-form inclusive values or covariances, rather than fitting parameters to data and relabeling the output as a prediction. No self-citation chain, self-definitional loop, or renaming of known results is indicated in the abstract or derivation outline. The central result remains self-contained against external benchmarks of classical logit forms and does not reduce to its own inputs by construction.
Axiom & Free-Parameter Ledger
free parameters (2)
- Graph adjacency or edge-weight matrix
- Neural network weights and aggregation functions
axioms (1)
- domain assumption: Random utility maximization with Gumbel or similar error distributions
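The Gumbel assumption in this ledger is what ties random utility maximization to the logit formula: with independent standard Gumbel errors, the share of draws in which an alternative has the highest utility approaches the softmax of the systematic utilities. A minimal simulation, with hypothetical utilities and function names chosen for illustration:

```python
import numpy as np

def softmax(v):
    z = np.exp(v - np.max(v))
    return z / z.sum()

def simulate_rum_shares(V, n_draws, seed=0):
    """Share of draws in which each alternative maximises V + Gumbel noise."""
    rng = np.random.default_rng(seed)
    eps = rng.gumbel(loc=0.0, scale=1.0, size=(n_draws, len(V)))
    choices = np.argmax(V[None, :] + eps, axis=1)
    return np.bincount(choices, minlength=len(V)) / n_draws

V = np.array([1.0, 0.0, -0.5])          # systematic utilities (illustrative)
shares = simulate_rum_shares(V, n_draws=200_000)
logit = softmax(V)
print(np.round(shares, 3), np.round(logit, 3))
```

The simulated shares and the closed-form logit probabilities should agree up to sampling noise.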
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
unclear: relation between the paper passage and the cited Recognition theorem.
The GNN-DCMs incorporate the nested logit (NL) model and the spatially correlated logit (SCL) model as two specific cases, yielding novel algorithmic interpretation through message passing among alternatives' utilities.
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
-
Built Environment Reasoning from Remote Sensing Imagery Using Large Vision-Language Models
Large vision-language models applied to multi-scale remote sensing imagery can generate recommendations on built environment design, constructability, land use, and risks for smart city decision-making.