
arxiv: 2507.21334 · v1 · pith:UXGIQMMJnew · submitted 2025-07-28 · 📊 stat.ML · cs.LG

Graph neural networks for residential location choice: connection to classical logit models

Pith reviewed 2026-05-22 00:21 UTC · model grok-4.3

classification 📊 stat.ML cs.LG
keywords graph neural networks · discrete choice models · residential location choice · nested logit · spatially correlated logit · random utility theory · message passing

The pith

Graph neural networks can represent classical nested and spatially correlated logit models as special cases for analyzing residential location choices.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper proposes using graph neural networks to model residential location choices by representing possible locations as nodes in a graph. The GNN framework captures dependencies among alternatives through message passing, linking directly to classical random utility models. It demonstrates that the GNN-based models include the nested logit and spatially correlated logit as specific cases, providing an algorithmic view of utility correlations. Empirically, these models better predict choices in Chicago while revealing heterogeneity and substitution patterns.

Core claim

The authors show that graph neural network discrete choice models incorporate the nested logit model and the spatially correlated logit model as two specific cases. This yields a novel algorithmic interpretation through message passing among alternatives' utilities, while maintaining connections to classical random utility theory and offering improved predictive performance in spatial choice settings.

What carries the argument

Message passing on graphs of choice alternatives, where aggregation of neighboring utilities reproduces correlation structures from nested and spatial logit models.
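
One way to make this concrete is the nested logit case, written here in standard nested logit notation as a sketch rather than as the paper's own derivation. Let $V_i$ be the systematic utility of alternative $i$, $m(i)$ its nest, $B_m$ the set of alternatives in nest $m$, and $\mu_m \in (0, 1]$ the nesting parameter. The symbols $h_i$, $s_i$, and the adjacency $A_{ij}$ (equal to 1 when $i$ and $j$ share a nest, self-loops included, and 0 otherwise) are introduced here for illustration only.

```latex
\begin{align}
h_i &= \exp\!\left(V_i / \mu_{m(i)}\right)
  && \text{(node transform)} \\
s_i &= \sum_{j} A_{ij}\, h_j
     = \sum_{j \in B_{m(i)}} \exp\!\left(V_j / \mu_{m(i)}\right)
  && \text{(sum aggregation over the nest)} \\
P_i &= \frac{h_i\, s_i^{\mu_{m(i)} - 1}}{\sum_{k} h_k\, s_k^{\mu_{m(k)} - 1}}
     = \frac{\exp\!\left(V_i/\mu_{m(i)}\right)
             \left(\sum_{j \in B_{m(i)}} \exp\!\left(V_j/\mu_{m(i)}\right)\right)^{\mu_{m(i)}-1}}
            {\sum_{l} \left(\sum_{j \in B_l} \exp\!\left(V_j/\mu_l\right)\right)^{\mu_l}}
  && \text{(readout)}
\end{align}
```

The two forms of the denominator agree because summing $h_k\, s_k^{\mu_l - 1}$ over the members of nest $l$ gives $\left(\sum_{j \in B_l} \exp(V_j/\mu_l)\right)^{\mu_l}$. Setting every $\mu_m = 1$ removes the dependence on the aggregated message and the expression collapses to the multinomial logit.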

Load-bearing premise

The graph structure among choice alternatives can be defined such that message passing in the GNN exactly reproduces the desired correlation structure in random utilities.

What would settle it

Testing the GNN-DCM on synthetic data generated from a nested logit model to check if it recovers the exact nesting parameters and correlations when the graph reflects the nesting tree.
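
A reduced version of this check can be sketched in a few lines. The example assumes a single nesting parameter shared by all nests, known systematic utilities, and a coarse grid search in place of a full estimator; the utilities, the nesting tree, and the sample size are hypothetical values chosen for illustration, and a GNN-DCM would replace the closed-form likelihood used here.

```python
import numpy as np

def nl_probs(V, nests, mu):
    """Closed-form nested logit probabilities with one nesting parameter mu in (0, 1]."""
    V = np.asarray(V, dtype=float)
    P = np.zeros_like(V)
    # Per-nest sums S_m = sum_{j in nest m} exp(V_j / mu)
    sums = [np.exp(V[idx] / mu).sum() for idx in nests]
    denom = sum(S ** mu for S in sums)
    for idx, S in zip(nests, sums):
        P[idx] = np.exp(V[idx] / mu) * S ** (mu - 1.0) / denom
    return P

rng = np.random.default_rng(0)
V = np.array([1.0, 0.5, 0.0, 0.8, 0.2])          # hypothetical utilities
nests = [np.array([0, 1, 2]), np.array([3, 4])]  # hypothetical nesting tree
mu_true = 0.5

# Drawing choices from the NL probabilities is equivalent to simulating
# utility-maximising agents with the corresponding GEV errors.
p_true = nl_probs(V, nests, mu_true)
choices = rng.choice(len(V), size=20000, p=p_true)
counts = np.bincount(choices, minlength=len(V))

# Grid-search maximum likelihood for the nesting parameter.
grid = np.linspace(0.1, 1.0, 19)
loglik = np.array([counts @ np.log(nl_probs(V, nests, m)) for m in grid])
mu_hat = grid[np.argmax(loglik)]
print(f"true mu = {mu_true}, grid estimate = {mu_hat:.2f}")
```

If the graph-based model is an exact special case, fitting it to data of this kind should return a nesting parameter close to the one used for simulation; a systematic gap would indicate that the equivalence holds only approximately.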

Figures

Figures reproduced from arXiv: 2507.21334 by Lingqian Hu, Shenhao Wang, Yuheng Bu, Yuqi Zhou, Zhanhong Cheng.

Figure 1: Illustration of the alternative graphs in the NL and SCL models for residential location choice.
Figure 2: Illustration of the GNN-DCM framework. Customizable design components are colored in blue.
Figure 3: Using GNN as the NL model to calculate the choice probability for alternative 1, where the
Figure 4: The relationship between classical and NN-based discrete choice models.
Figure 5: The study area and the alternative graph.
Figure 6: Visualization of forecast accuracy of ten-fold cross-validation for different GNN designs.
Figure 7: The impact of housing unit value and transit accessibility on the choice probability.
Figure 8: Percentage change in the residential choice probability for a selected household following a 10%

Original abstract

Researchers have adopted deep learning for classical discrete choice analysis as it can capture complex feature relationships and achieve higher predictive performance. However, the existing deep learning approaches cannot explicitly capture the relationship among choice alternatives, which has been a long-lasting focus in classical discrete choice models. To address the gap, this paper introduces Graph Neural Network (GNN) as a novel framework to analyze residential location choice. The GNN-based discrete choice models (GNN-DCMs) offer a structured approach for neural networks to capture dependence among spatial alternatives, while maintaining clear connections to classical random utility theory. Theoretically, we demonstrate that the GNN-DCMs incorporate the nested logit (NL) model and the spatially correlated logit (SCL) model as two specific cases, yielding novel algorithmic interpretation through message passing among alternatives' utilities. Empirically, the GNN-DCMs outperform benchmark MNL, SCL, and feedforward neural networks in predicting residential location choices among Chicago's 77 community areas. Regarding model interpretation, the GNN-DCMs can capture individual heterogeneity and exhibit spatially-aware substitution patterns. Overall, these results highlight the potential of GNN-DCMs as a unified and expressive framework for synergizing discrete choice modeling and deep learning in the complex spatial choice contexts.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper introduces Graph Neural Network-based Discrete Choice Models (GNN-DCMs) for residential location choice. It claims that these models capture dependence among spatial alternatives while maintaining connections to random utility theory, and theoretically demonstrates that GNN-DCMs incorporate the nested logit (NL) and spatially correlated logit (SCL) models as specific cases via message passing among alternatives' utilities. Empirically, GNN-DCMs are shown to outperform MNL, SCL, and feedforward neural networks in predicting choices among Chicago's 77 community areas, with additional claims regarding interpretation of individual heterogeneity and spatially-aware substitution patterns.

Significance. If the theoretical equivalences are established with explicit derivations, the work provides a meaningful bridge between graph neural networks and classical discrete choice models, potentially enabling more flexible modeling of spatial correlations while preserving interpretability. The empirical application to real residential choice data suggests practical gains in predictive accuracy for complex spatial settings.

major comments (3)
  1. [Theoretical demonstration of NL and SCL inclusion] The central theoretical claim that GNN-DCMs incorporate NL and SCL as specific cases is load-bearing but lacks explicit algebraic reductions. The manuscript should specify the exact GNN parameters (aggregation function, normalization, layer count, and edge-weight definitions) that recover the closed-form inclusive-value terms of NL or the covariance adjustments of SCL without approximation.
  2. [Empirical results and model application] The empirical evaluation asserts outperformance but omits key details on graph construction (how nodes=alternatives and edges among the 77 Chicago community areas are defined to encode spatial correlations) and training procedures. Without these, the superiority over MNL, SCL, and feedforward networks cannot be fully assessed.
  3. [Model interpretation] The interpretation claims regarding individual heterogeneity and substitution patterns require concrete methods (e.g., how attention or message-passing weights map to cross-elasticities or heterogeneity parameters) to demonstrate they extend or align with classical random-utility interpretations.
minor comments (2)
  1. [Abstract] The abstract would be strengthened by including at least one quantitative performance metric (e.g., log-likelihood improvement or hit rate) alongside the qualitative claim of outperformance.
  2. [Notation and definitions] Notation for utilities, message-passing functions, and graph adjacency should be introduced with explicit cross-references to classical logit notation to improve readability.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major point below and have revised the manuscript to strengthen the theoretical derivations, add missing methodological details, and enhance the interpretation section.

Point-by-point responses
  1. Referee: [Theoretical demonstration of NL and SCL inclusion] The central theoretical claim that GNN-DCMs incorporate NL and SCL as specific cases is load-bearing but lacks explicit algebraic reductions. The manuscript should specify the exact GNN parameters (aggregation function, normalization, layer count, and edge-weight definitions) that recover the closed-form inclusive-value terms of NL or the covariance adjustments of SCL without approximation.

    Authors: We appreciate this observation and agree that explicit algebraic reductions strengthen the claim. The original manuscript demonstrates the connection at the level of message passing on utilities but does not spell out the precise parameter settings. In the revised version we will add a dedicated subsection with the required derivations: for the nested logit case we specify a single-layer GNN using sum aggregation, identity normalization, and edge weights defined by the nesting partition (intra-nest edges set to 1, inter-nest edges set to 0); this exactly reproduces the inclusive-value term. An analogous construction with covariance-adjusted edge weights recovers the spatially correlated logit model without approximation. revision: yes

  2. Referee: [Empirical results and model application] The empirical evaluation asserts outperformance but omits key details on graph construction (how nodes=alternatives and edges among the 77 Chicago community areas are defined to encode spatial correlations) and training procedures. Without these, the superiority over MNL, SCL, and feedforward networks cannot be fully assessed.

    Authors: We agree that these implementation details are essential for reproducibility and evaluation. The revised manuscript will include an expanded methods section that (i) defines the graph explicitly, with the 77 community areas as nodes and edges constructed from a spatial adjacency matrix augmented by a distance threshold to capture correlations, and (ii) reports the full training protocol, including optimizer (Adam), learning rate, batch size, epoch count, early-stopping criterion, and software framework (PyTorch Geometric). These additions will allow readers to assess the reported performance gains. revision: yes

  3. Referee: [Model interpretation] The interpretation claims regarding individual heterogeneity and substitution patterns require concrete methods (e.g., how attention or message-passing weights map to cross-elasticities or heterogeneity parameters) to demonstrate they extend or align with classical random-utility interpretations.

    Authors: We thank the referee for this suggestion. The current manuscript discusses interpretation at a conceptual level. In revision we will add explicit mapping procedures: we will show how the learned message-passing weights can be converted into implied cross-elasticities by differentiating the post-propagation utilities, and how node-level feature embeddings capture individual heterogeneity. Numerical illustrations drawn from the Chicago estimation results will be included to demonstrate consistency with classical random-utility quantities. revision: yes
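
The construction described in response 1 and the differentiation step described in response 3 can be sketched together. This is an illustrative reading, not the authors' implementation: the utilities, nest assignment, and nesting parameters are hypothetical, the node transform and readout are assumptions added to the stated sum aggregation with 0/1 edge weights, and the derivative is taken with respect to utility by finite differences rather than with respect to any specific attribute.

```python
import numpy as np

def nl_message_passing(V, A, mu):
    """NL probabilities from one round of sum aggregation on a nest graph.

    V  : (J,) systematic utilities
    A  : (J, J) 0/1 adjacency, 1 for same-nest pairs (self-loops included)
    mu : (J,) nesting parameter of each alternative's nest
    """
    h = np.exp(V / mu)        # node-wise transform of utilities
    s = A @ h                 # sum aggregation over same-nest neighbours
    w = h * s ** (mu - 1.0)   # node update; totals sum_m S_m ** mu_m
    return w / w.sum()

def nl_closed_form(V, nest_id, mu_by_nest):
    """Textbook nested logit probabilities, used only as a reference."""
    P = np.zeros_like(V)
    S = {m: np.exp(V[nest_id == m] / mu_by_nest[m]).sum() for m in mu_by_nest}
    denom = sum(S[m] ** mu_by_nest[m] for m in mu_by_nest)
    for m, mu_m in mu_by_nest.items():
        mask = nest_id == m
        P[mask] = np.exp(V[mask] / mu_m) * S[m] ** (mu_m - 1.0) / denom
    return P

def cross_effects(V, A, mu, j, eps=1e-5):
    """d log P_i / d V_j for every i, by central finite differences."""
    dV = np.zeros_like(V)
    dV[j] = eps
    up = nl_message_passing(V + dV, A, mu)
    dn = nl_message_passing(V - dV, A, mu)
    return (np.log(up) - np.log(dn)) / (2 * eps)

V = np.array([1.0, 0.5, 0.0, 0.8, 0.2])   # hypothetical utilities
nest_id = np.array([0, 0, 0, 1, 1])       # hypothetical nesting partition
mu_by_nest = {0: 0.5, 1: 0.8}             # hypothetical nesting parameters

# Intra-nest edges set to 1, inter-nest edges set to 0.
A = (nest_id[:, None] == nest_id[None, :]).astype(float)
mu = np.array([mu_by_nest[m] for m in nest_id.tolist()])

P_gnn = nl_message_passing(V, A, mu)
P_nl = nl_closed_form(V, nest_id, mu_by_nest)
print("max abs difference:", np.abs(P_gnn - P_nl).max())

# Effect of raising the utility of alternative 0 on every alternative.
E = cross_effects(V, A, mu, j=0)
print("d log P_i / d V_0:", np.round(E, 4))
```

In this sketch the alternatives sharing a nest with alternative 0 lose proportionally more probability than those in the other nest, whose response equals minus the choice probability of alternative 0, as in a multinomial logit. That asymmetry is the classical nested logit substitution pattern, so a comparison of this kind is one way to check whether learned message-passing weights imply sensible cross-effects.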

Circularity Check

0 steps flagged

No circularity: theoretical mapping to NL/SCL is independent construction

full rationale

The paper claims to demonstrate that GNN-DCMs recover NL and SCL as specific cases through message passing on defined graphs among alternatives. This is a direct algebraic or structural equivalence shown by choosing particular aggregation functions and edge definitions to match closed-form inclusive values or covariances, rather than fitting parameters to data and relabeling the output as a prediction. No self-citation chain, self-definitional loop, or renaming of known results is indicated in the abstract or derivation outline. The central result remains self-contained against external benchmarks of classical logit forms and does not reduce to its own inputs by construction.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axioms · 0 invented entities

The framework rests on the random utility maximization assumption and on the existence of a graph whose edges encode the relevant substitution or correlation structure; no new physical entities are introduced.

free parameters (2)
  • Graph adjacency or edge-weight matrix
    Defines which alternatives exchange messages; must be specified or learned and directly affects the recovered correlation structure.
  • Neural network weights and aggregation functions
    Parameters of the GNN layers that are fitted to data and must be chosen to match the target logit closed forms.
axioms (1)
  • domain assumption: Random utility maximization with Gumbel or similar error distributions
    Invoked to maintain connection to classical logit models as stated in the abstract.
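
The role of this axiom can be illustrated with a small simulation. It relies on the standard result that utility maximization with i.i.d. standard Gumbel errors yields multinomial logit probabilities; the utilities and sample size are hypothetical, and the correlated-error cases (NL, SCL) are not covered by this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
V = np.array([1.0, 0.5, 0.0, -0.5])   # hypothetical systematic utilities
n = 200_000                           # number of simulated decision makers

# Each decision maker draws i.i.d. standard Gumbel errors and picks the
# alternative with the highest total utility V_i + eps_i.
eps = rng.gumbel(loc=0.0, scale=1.0, size=(n, V.size))
choices = np.argmax(V + eps, axis=1)
freq = np.bincount(choices, minlength=V.size) / n

# Multinomial logit (softmax) probabilities implied by the same utilities.
softmax = np.exp(V) / np.exp(V).sum()

print("simulated shares:", np.round(freq, 3))
print("logit shares:    ", np.round(softmax, 3))
```

The simulated shares should agree with the softmax values up to sampling noise, which is why a softmax output layer over alternative utilities keeps a neural model inside the random utility framework.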



Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
    Relation between the paper passage and the cited Recognition theorem: unclear
    Paper passage: The GNN-DCMs incorporate the nested logit (NL) model and the spatially correlated logit (SCL) model as two specific cases, yielding novel algorithmic interpretation through message passing among alternatives' utilities.

What do these tags mean?
  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Built Environment Reasoning from Remote Sensing Imagery Using Large Vision--Language Models

    cs.CL 2026-05 unverdicted novelty 3.0

    Large vision-language models applied to multi-scale remote sensing imagery can generate recommendations on built environment design, constructability, land use, and risks for smart city decision-making.
