Information processing constraints in travel behaviour modelling: A generative learning approach

Bilal Farooq; Melvin Wong

arXiv: 1907.07036 · v2 · pith:F5LG4Z7Jnew · submitted 2019-07-16 · econ.EM · stat.ML

Pith reviewed 2026-05-24 20:34 UTC · model grok-4.3

classification econ.EM · stat.ML

keywords rational inattention · generative learning · travel behaviour · multinomial logit · information processing · choice modelling · uncertainty · prior information

The pith

A generative learning model represents rational inattention in travel choices by valuing prior information and allowing variable ignoring under uncertainty.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a data-driven generative model based on rational inattention theory to capture how travelers respond to uncertainty and information limits. It uses a learning process that incorporates the value of prior information directly into choice utilities, leading to behavior where some external variables are disregarded. This setup produces results that align with the theory and can be rewritten as a generalized multinomial logit model grounded in entropy and utility. A sympathetic reader would care because standard travel models often assume full information processing, while this offers a way to handle realistic constraints without losing econometric tractability.

Core claim

We propose a data-driven generative model version of rational inattention theory to emulate behavioural representations in travel decisions. The methodology outlines a generative learning process that captures the value of prior information in the choice utility specification. We demonstrate the effects of information heterogeneity on a travel choice, analyze the econometric interpretation, and show that findings indicate a strong correlation with rational inattention behaviour theory, suggesting individuals may ignore certain exogenous variables and rely on prior information for evaluating decisions under uncertainty. The principles can be formulated as a generalized entropy and utility based multinomial logit model.

What carries the argument

The generative learning process that emulates the value of prior information in the choice utility specification

If this is right

Individuals ignore certain exogenous variables when evaluating travel decisions under uncertainty.
Travelers rely on prior information rather than processing all available data.
The generative approach produces a generalized multinomial logit model based on entropy and utility.
Information heterogeneity can be directly analyzed for its effects on choice probabilities.
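
For orientation, the best-known closed form linking rational inattention to the multinomial logit is the one from Matějka and McKay (2015), in which each alternative's exponentiated utility is weighted by its prior (unconditional) choice probability. The sketch below implements that textbook form only; it is not the paper's generative model, and the function name `ri_logit`, the utilities, the priors, and the information cost `lam` are all illustrative assumptions.

```python
import math

def ri_logit(utilities, priors, lam):
    """Prior-weighted logit: P_i proportional to prior_i * exp(u_i / lam).

    Assumes every prior is strictly positive and lam > 0 (hypothetical inputs).
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    # Work in log space and subtract the maximum for numerical stability.
    logits = [math.log(p) + u / lam for u, p in zip(utilities, priors)]
    top = max(logits)
    weights = [math.exp(z - top) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]

utilities = [1.0, 0.5, 0.0]  # hypothetical utilities for three travel modes
priors = [0.2, 0.3, 0.5]     # hypothetical prior choice probabilities

cheap = ri_logit(utilities, priors, lam=0.1)          # information is cheap
costly = ri_logit(utilities, priors, lam=100.0)       # information is costly
uniform = ri_logit(utilities, [1 / 3] * 3, lam=1.0)   # flat prior
```

When `lam` is small the highest-utility mode dominates; when `lam` is large the probabilities collapse towards the priors, which is the "rely on prior information" behaviour described above. With a flat prior and `lam=1.0` the expression reduces to the standard multinomial logit.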

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same generative structure could be tested on non-travel choices such as housing or energy use to see if rational inattention patterns appear consistently.
Policy models that supply clearer prior signals might shift predicted choices more than models assuming full attention.
If the model fits data better than standard logit specifications, it could justify collecting less detailed attribute data in surveys.
Extensions might incorporate dynamic updating of priors across repeated choices.

Load-bearing premise

The generative learning process accurately emulates the value of prior information in the choice utility specification and produces behavior that genuinely corresponds to rational inattention rather than merely fitting choice data.

What would settle it

Observe travel choices in an experiment that varies the availability and cost of information about exogenous variables while holding priors fixed, then check whether the fitted model predicts the same pattern of variable ignoring as rational inattention requires.

Figures

Figures reproduced from arXiv: 1907.07036 by Bilal Farooq, Melvin Wong.

**Figure 1.** Framework for generative modelling.

**Figure 2.** Visualization of number of trip trajectory origin points by city district from the dataset.

**Figure 3.** Learning curve of the sample negative loglikelihood from the choice model.

**Figure 4.** Mode share forecast.

**Figure 5.** Distribution of data generating parameters.

**Figure 6.** Comparison of data generating output on activity type data.

**Figure 7.** Comparison of data generating output on trip distance data.

**Figure 8.** Comparison of data generating output on trip duration data.

**Figure 9.** β-parameter estimates using mode choice as the dependent variable; the horizontal axis represents the number of latent variables.

Original abstract

Travel decisions tend to exhibit sensitivity to uncertainty and information processing constraints. These behavioural conditions can be characterized by a generative learning process. We propose a data-driven generative model version of rational inattention theory to emulate these behavioural representations. We outline the methodology of the generative model and the associated learning process as well as provide an intuitive explanation of how this process captures the value of prior information in the choice utility specification. We demonstrate the effects of information heterogeneity on a travel choice, analyze the econometric interpretation, and explore the properties of our generative model. Our findings indicate a strong correlation with rational inattention behaviour theory, which suggest that individuals may ignore certain exogenous variables and rely on prior information for evaluating decisions under uncertainty. Finally, the principles demonstrated in this study can be formulated as a generalized entropy and utility based multinomial logit model.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper applies a generative model to rational inattention in travel choice but the link rests on correlation rather than an explicit information-cost mechanism.

The core move is to train a generative model on travel choice data so that it reproduces patterns consistent with rational inattention, then note that the resulting form can be written as a generalized entropy-utility multinomial logit. That specific combination in the travel-demand literature appears new on the abstract alone. The approach also gives an intuitive account of how prior information enters the utility specification, which is useful for readers who want a data-driven route into the theory without starting from the full information-theoretic setup. The authors further examine information heterogeneity across choices and sketch the econometric reading, which keeps the work grounded in the discrete-choice tradition. These pieces are straightforward and worth having on record for the subfield.

The soft spot is the missing link between the generative process and the actual rational-inattention constraint. The abstract reports a strong correlation and says the model emulates the value of prior information, yet supplies no derivation showing an explicit mutual-information penalty, state-dependent attention rule, or updating mechanism that would make the equivalence structural rather than phenomenological. Without that step, any sufficiently flexible generative model could produce similar in-sample patterns, so the claim that the behavior genuinely corresponds to costly information acquisition remains provisional.

The paper sits squarely inside travel-demand modeling and discrete-choice econometrics. Specialists working on behavioral extensions of logit models will find the generative framing and the entropy-utility reformulation worth reading; outsiders will not. It deserves a serious referee because the idea is coherent on its own terms and the authors engage the relevant literature, even if the central equivalence needs tighter justification in the full text.

I would send it out for review rather than desk-reject, with the expectation that the first round will focus on the mechanism and validation metrics.

Referee Report

3 major / 2 minor

Summary. The paper proposes a data-driven generative learning model as a version of rational inattention theory to represent information processing constraints and sensitivity to uncertainty in travel decisions. It claims the model emulates the value of prior information in choice utilities, demonstrates effects of information heterogeneity, and shows strong correlation with rational inattention behavior, ultimately reformulating the approach as a generalized entropy- and utility-based multinomial logit model.

Significance. If the claimed structural link to rational inattention holds, the work could offer a practical route to embed information-acquisition costs into discrete-choice models without hand-crafted parameters, potentially improving behavioral realism in travel demand forecasting under uncertainty. The reformulation into a generalized MNL would be a useful contribution if derived rather than asserted.

major comments (3)

[Abstract, §3] Abstract and §3 (generative model description): the central claim that the generative process 'emulates' rational inattention and produces behavior that 'corresponds' to optimal costly information acquisition is not supported by any explicit mechanism (e.g., mutual-information penalty, state-dependent attention allocation, or prior-updating rule) that would distinguish it from generic flexible fitting of choice data. The reported 'strong correlation' therefore risks being phenomenological rather than structural.
[§4, §5] §4 (econometric interpretation) and §5 (properties): no derivation is supplied showing how the learned generative model reduces to or is equivalent to a generalized entropy-utility MNL; the final claim that 'the principles demonstrated... can be formulated as' such a model appears asserted rather than proven, leaving the econometric interpretation unsupported.
[Empirical demonstration] Empirical section (travel choice demonstration): the analysis of information heterogeneity lacks reported validation metrics, out-of-sample tests, or comparison against standard rational-inattention specifications (e.g., those with explicit information-cost parameters), so it is impossible to assess whether the generative model recovers the theory's predictions or merely fits the observed choices.
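
To make the first major comment concrete, the sketch below shows what an explicit mutual-information penalty looks like in a discrete setting: expected utility minus an information cost `lam` times the mutual information between the state of the world and the chosen alternative. This is a generic rational-inattention objective written for illustration, not the paper's learning objective; the function names, the two-state example, and all numbers are hypothetical.

```python
import math

def mutual_information(state_prior, choice_given_state):
    """Mutual information I(S; A) in nats between states and chosen actions."""
    n_actions = len(choice_given_state[0])
    # Unconditional choice probabilities P(a) = sum_s P(s) P(a | s).
    marginal = [
        sum(ps * row[a] for ps, row in zip(state_prior, choice_given_state))
        for a in range(n_actions)
    ]
    info = 0.0
    for ps, row in zip(state_prior, choice_given_state):
        for a, p in enumerate(row):
            if ps > 0.0 and p > 0.0:  # 0 * log 0 is treated as 0
                info += ps * p * math.log(p / marginal[a])
    return info

def ri_objective(state_prior, choice_given_state, utility, lam):
    """Expected utility minus lam times the information processed."""
    expected_utility = sum(
        ps * p * utility[s][a]
        for s, (ps, row) in enumerate(zip(state_prior, choice_given_state))
        for a, p in enumerate(row)
    )
    return expected_utility - lam * mutual_information(state_prior, choice_given_state)

# Hypothetical two-state, two-mode example.
state_prior = [0.5, 0.5]
utility = [[1.0, 0.0], [0.0, 1.0]]   # utility[s][a]
ignore = [[0.5, 0.5], [0.5, 0.5]]    # choice does not depend on the state
attend = [[1.0, 0.0], [0.0, 1.0]]    # choice tracks the state perfectly

mi_ignore = mutual_information(state_prior, ignore)
mi_attend = mutual_information(state_prior, attend)
```

In this toy case, attending to the state scores higher when `lam=0.5`, while ignoring it scores higher when `lam=1.0`. A structural rational-inattention model would predict that kind of switch as information becomes more costly, which is the behaviour the referee asks the authors to derive rather than observe.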

minor comments (2)

[Abstract] Abstract: 'which suggest' should be 'which suggests'.
[§3] Notation for the generative process and the resulting generalized logit should be introduced with explicit equations rather than intuitive descriptions only.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major comment below and outline the revisions that will be incorporated to strengthen the presentation of the generative model's link to rational inattention theory.

Referee: [Abstract, §3] Abstract and §3 (generative model description): the central claim that the generative process 'emulates' rational inattention and produces behavior that 'corresponds' to optimal costly information acquisition is not supported by any explicit mechanism (e.g., mutual-information penalty, state-dependent attention allocation, or prior-updating rule) that would distinguish it from generic flexible fitting of choice data. The reported 'strong correlation' therefore risks being phenomenological rather than structural.

Authors: We acknowledge that the connection between the generative learning process and rational inattention could be articulated more explicitly to emphasize its structural basis. The model is constructed such that the learning objective directly encodes sensitivity to uncertainty and the value of prior information in the choice utilities, which by design emulates information processing constraints. To address the concern, the revised Section 3 will include an expanded discussion of the generative mechanism and how it aligns with costly information acquisition, supported by additional analytical steps showing the distinction from generic fitting. This will make the reported correlation more clearly structural. revision: yes
Referee: [§4, §5] §4 (econometric interpretation) and §5 (properties): no derivation is supplied showing how the learned generative model reduces to or is equivalent to a generalized entropy-utility MNL; the final claim that 'the principles demonstrated... can be formulated as' such a model appears asserted rather than proven, leaving the econometric interpretation unsupported.

Authors: We agree that a formal derivation is needed to support the reformulation claim. The properties in Section 5 provide the foundation for equivalence, but we will add an explicit step-by-step derivation in the revised Section 4 demonstrating how the generative model reduces to the generalized entropy-utility multinomial logit. This will convert the current statement into a proven result rather than an assertion. revision: yes
Referee: [Empirical demonstration] Empirical section (travel choice demonstration): the analysis of information heterogeneity lacks reported validation metrics, out-of-sample tests, or comparison against standard rational-inattention specifications (e.g., those with explicit information-cost parameters), so it is impossible to assess whether the generative model recovers the theory's predictions or merely fits the observed choices.

Authors: The empirical section is intended to illustrate the effects of information heterogeneity on travel choice. We recognize that additional validation would strengthen the assessment of alignment with rational inattention predictions. The revised manuscript will include out-of-sample performance metrics and direct comparisons against standard rational inattention specifications with explicit information-cost parameters. revision: yes
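
The out-of-sample metrics promised in the last response are not specified. One common choice in discrete-choice work is the held-out mean log-likelihood together with a likelihood-ratio index against an equal-shares null model. The sketch below computes both; the function names, the predicted probabilities, and the chosen alternatives are hypothetical and are not taken from the paper.

```python
import math

def mean_log_likelihood(pred_probs, chosen):
    """Average log-probability assigned to the alternatives actually chosen."""
    return sum(math.log(p[c]) for p, c in zip(pred_probs, chosen)) / len(chosen)

def rho_squared(pred_probs, chosen):
    """Likelihood-ratio index relative to a null model with equal shares."""
    n_alts = len(pred_probs[0])
    ll_model = sum(math.log(p[c]) for p, c in zip(pred_probs, chosen))
    ll_null = len(chosen) * math.log(1.0 / n_alts)
    return 1.0 - ll_model / ll_null

# Hypothetical held-out predictions for four trips and three modes.
holdout_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.3, 0.3, 0.4],
    [0.5, 0.4, 0.1],
]
holdout_chosen = [0, 1, 2, 1]  # index of the mode chosen on each trip

mll = mean_log_likelihood(holdout_probs, holdout_chosen)
rho2 = rho_squared(holdout_probs, holdout_chosen)
```

Computing the same two numbers for the generative model and for a specification with an explicit information-cost parameter, on the same held-out trips, would give the direct comparison the referee requests.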

Circularity Check

0 steps flagged

No circularity detected; derivation chain not reducible to inputs from provided text

The abstract proposes a generative model version of rational inattention theory and reports a correlation with the theory, along with a formulation as a generalized entropy-utility MNL. No equations, self-citations, or derivation steps are quoted that would allow identification of self-definitional reductions, fitted inputs renamed as predictions, or load-bearing self-citations. The central claim is presented as an empirical finding rather than a definitional equivalence. Per hard rules, circularity requires explicit quotes exhibiting reduction (e.g., Eq. X = Eq. Y by construction); none are available here, so the score is 0 and steps array is empty. The paper's content against external benchmarks cannot be assessed as circular from the given material.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no equations or sections from which free parameters, axioms, or invented entities can be extracted; the model presumably introduces parameters governing information cost and prior weighting, but none are identifiable here.

pith-pipeline@v0.9.0 · 5662 in / 1140 out tokens · 23794 ms · 2026-05-24T20:34:05.530946+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel echoes

ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

Eq. (17) ... rational inattentive based choice can be framed as the information difference between the expected energy and the entropy gain ... variational free energy Fq(D)

IndisputableMonolith/Foundation/BlackBodyRadiationDeep.lean Jcost_pos_of_ne_one echoes

Hj = −log Gj ... entropy ... generalized entropy and utility based multinomial logit model

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

39 extracted references · 39 canonical work pages · 2 internal anchors

[1] D. McFadden, K. Train, Mixed MNL models for discrete response, Journal of applied Econometrics 15 (5) (2000) 447–470. doi:10.1002/1099-1255(200009/10)15:5<447::AID-JAE570>3.0.CO;2-1
[2] D. Li, T. Miwa, T. Morikawa, P. Liu, Incorporating observed and unobserved heterogeneity in route choice analysis with sampled choice sets, Transportation Research Part C: Emerging Technologies 67 (2016) 31–46. doi:10.1016/j.trc.2016.02.002
[3] A. Vij, R. Krueger, Random taste heterogeneity in discrete choice models: Flexible nonparametric finite mixture distributions, Transportation Research Part B: Methodological 106 (2017) 76–101. doi:10.1016/j.trb.2017.10.013
[4] M. Nikolić, M. Bierlaire, Data-driven spatio-temporal discretization for pedestrian flow characterization, Transportation research procedia 23 (2017) 188–207. doi:10.1016/j.trc.2017.08.026
[5] D. A. Gopinath, Modeling heterogeneity in discrete choice processes: Application to travel demand, Ph.D. thesis, MIT (1995)
[6] D. Bolduc, R. Alvarez-Daziano, On estimation of hybrid choice models, in: S. Hess, A. Daly (Eds.), Choice Modelling: The State-of-the-art and The State-of-practice, Edward Elgar, 2010, pp. 259–287. doi:10.1108/9781849507738-011
[7] M. Wong, B. Farooq, Modelling latent travel behaviour characteristics with generative machine learning, in: 2018 21st International Conference on Intelligent Transportation Systems (ITSC), 2018, pp. 749–754. doi:10.1109/ITSC.2018.8569581
[8] M. Wong, B. Farooq, G.-A. Bilodeau, Discriminative conditional restricted boltzmann machine for discrete choice and latent variable modelling, Journal of choice modelling 29 (2018) 152–168. doi:10.1016/j.jocm.2017.11.003
[9] E. Cherchi, J. W. Polak, Assessing user benefits with discrete choice models: Implications of specification errors under random taste heterogeneity, Transportation Research Record 1926 (1) (2005) 61–69. doi:10.1177/0361198105192600108
[10] C. A. Sims, Rational inattention and monetary economics, in: B. M. Friedman, M. Woodford (Eds.), Handbook of monetary economics, Vol. 3, Elsevier, 2010, Ch. 4, pp. 155–181. doi:10.1016/B978-0-444-53238-1.00004-1
[11] F. Matějka, A. McKay, Rational inattention to discrete choices: A new foundation for the multinomial logit model, American Economic Review 105 (1) (2015) 272–98. doi:10.1257/aer.20130047
[12] H. Alizadeh, P.-L. Bourbonnais, C. Morency, B. Farooq, N. Saunier, An online survey to enhance the understanding of car drivers route choices, Transportation Research Procedia 32 (2018) 482–494. doi:10.1016/j.trpro.2018.10.042
[13] M. Fosgerau, E. Melo, A. d. Palma, M. Shum, Discrete choice and rational inattention: A general equivalence result, arXiv preprint arXiv:1709.09117
[14] M. Fosgerau, G. Jiang, Travel time variability and rational inattention, Transportation Research Part B: Methodological 120 (2019) 1–14. doi:10.1016/j.trb.2018.12.003
[15] C. A. Sims, Implications of rational inattention, Journal of monetary Economics 50 (3) (2003) 665–690. doi:10.1016/S0304-3932(03)00029-1
[16] K. Friston, J. Kilner, L. Harrison, A free energy principle for the brain, Journal of Physiology-Paris 100 (1-3) (2006) 70–87. doi:10.1016/j.jphysparis.2006.10.001
[17] D. Ellsberg, Risk, ambiguity, and the savage axioms, The quarterly journal of economics (1961) 643–669. doi:10.2307/1884324
[18] D. Kahneman, A. Tversky, Prospect theory: An analysis of decision under risk, Econometrica 47 (2) (1979) 263–292. doi:10.1142/9789814417358_0006
[19] J. Steiner, C. Stewart, F. Matějka, Rational inattention dynamics: Inertia and delay in decision-making, Econometrica 85 (2) (2017) 521–553. doi:10.3982/ECTA13636
[20] C. Teye, M. G. Bell, M. C. Bliemer, Entropy maximising facility location model for port city intermodal terminals, Transportation Research Part E: Logistics and Transportation Review 100 (2017) 1–16. doi:10.1016/j.tre.2017.01.006
[21] B. Leard, Consumer inattention and the demand for vehicle fuel cost savings, Journal of choice modelling 29 (2018) 1–16. doi:10.1016/j.jocm.2018.08.002
[22] A. Anas, Discrete choice theory, information theory and the multinomial logit and gravity models, Transportation Research Part B: Methodological 17 (1) (1983) 13–23. doi:10.1016/0191-2615(83)90023-1
[23] A. Ullah, Entropy, divergence and distance measures with econometric applications, Journal of Statistical Planning and Inference 49 (1) (1996) 137–162. doi:10.1016/0378-3758(95)00034-8
[24] I. Goodfellow, Y. Bengio, A. Courville, Deep Learning, MIT Press, 2016, http://www.deeplearningbook.org
[25] G. E. Hinton, R. R. Salakhutdinov, Reducing the dimensionality of data with neural networks, science 313 (5786) (2006) 504–507
[26] G. E. Hinton, S. Osindero, Y.-W. Teh, A fast learning algorithm for deep belief nets, Neural computation 18 (7) (2006) 1527–1554. doi:10.1162/neco.2006.18.7.1527
[27] M. Wong, B. Farooq, A bi-partite generative model framework for analyzing and simulating large scale multiple discrete-continuous travel behaviour data, arXiv preprint arXiv:1901.06415
[28] M. Ranzato, C. Poultney, S. Chopra, Y. LeCun, Efficient learning of sparse representations with an energy-based model, in: Advances in neural information processing systems, 2007, pp. 1137–1144
[29] D. P. Kingma, M. Welling, Auto-encoding variational bayes, arXiv preprint arXiv:1312.6114
[30] D. M. Blei, A. Kucukelbir, J. D. McAuliffe, Variational inference: A review for statisticians, Journal of the American Statistical Association 112 (518) (2017) 859–877. doi:10.1080/01621459.2017.1285773
[31] K. E. Train, Discrete choice methods with simulation, Cambridge university press, 2009. doi:10.1017/CBO9780511805271
[32] A. Alwosheel, S. van Cranenburgh, C. G. Chorus, Is your dataset big enough? sample size requirements when using artificial neural networks for discrete choice analysis, Journal of choice modelling 28 (2018) 167–182. doi:10.1016/j.jocm.2018.07.002
[33] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770–778
[34] Y. W. Teh, M. Welling, S. Osindero, G. E. Hinton, Energy-based models for sparse overcomplete representations, Journal of Machine Learning Research 4 (2003) 1235–1260
[35] Ville de montréal, Déplacements MTL trajet (2016). URL https://ville.montreal.qc.ca/mtltrajet/
[36] M. Ranzato, Y.-l. Boureau, Y. LeCun, Sparse feature learning for deep belief networks, in: Advances in neural information processing systems, 2008, pp. 1185–1192
[37] X. Glorot, A. Bordes, Y. Bengio, Deep sparse rectifier neural networks, in: Proceedings of the fourteenth international conference on artificial intelligence and statistics, 2011, pp. 315–323
[38] B. Farooq, M. Bierlaire, R. Hurtubia, G. Flötteröd, Simulation based population synthesis, Transportation Research Part B: Methodological 58 (2013) 243–263. doi:10.1016/j.trb.2013.09.012
[39] A. Golan, G. Judge, J. M. Perloff, A maximum entropy approach to recovering information from multinomial response data, Journal of the American Statistical Association 91 (434) (1996) 841–853.

[1] [1]

McFadden, K

D. McFadden, K. Train, Mixed MNL models for discrete response, Journal of applied Econometrics 15 (5) (2000) 447–470. doi:10.1002/1099-1255(200009/10)15:5<447::AID-JAE570>3.0.CO;2-1

work page doi:10.1002/1099-1255(200009/10)15:5 2000

[2] [2]

D. Li, T. Miwa, T. Morikawa, P. Liu, Incorporating observed and unobserved heterogeneity in route choice analysis with sampled choice sets, Transportation Research Part C: Emerging Technologies 67 (2016) 31–46.doi:10.1016/j.trc.2016. 02.002

work page doi:10.1016/j.trc.2016 2016

[3] [3]

A. Vij, R. Krueger, Random taste heterogeneity in discrete choice models: Flexible nonparametric ﬁnite mixture distri- butions, Transportation Research Part B: Methodological 106 (2017) 76–101.doi:10.1016/j.trb.2017.10.013

work page doi:10.1016/j.trb.2017.10.013 2017

[4] [4]

Nikolić, M

M. Nikolić, M. Bierlaire, Data-driven spatio-temporal discretization for pedestrian ﬂow characterization, Transportation research procedia 23 (2017) 188–207.doi:10.1016/j.trc.2017.08.026

work page doi:10.1016/j.trc.2017.08.026 2017

[5] [5]

D. A. Gopinath, Modeling heterogeneity in discrete choice processes: Application to travel demand, Ph.D. thesis, MIT (1995)

work page 1995

[6] [6]

Bolduc, R

D. Bolduc, R. Alvarez-Daziano, On estimation of hybrid choice models, in: S. Hess, A. Daly (Eds.), Choice Modelling: The State-of-the-art and The State-of-practice, Edward Elgar, 2010, pp. 259–287.doi:10.1108/9781849507738-011

work page doi:10.1108/9781849507738-011 2010

[7] [7]

M. Wong, B. Farooq, Modelling latent travel behaviour characteristics with generative machine learning, in: 2018 21st International Conference on Intelligent Transportation Systems (ITSC), 2018, pp. 749–754. doi:10.1109/ITSC.2018. 8569581

work page doi:10.1109/itsc.2018 2018

[8] [8]

M. Wong, B. Farooq, G.-A. Bilodeau, Discriminative conditional restricted boltzmann machine for discrete choice and latent variable modelling, Journal of choice modelling 29 (2018) 152–168.doi:10.1016/j.jocm.2017.11.003

work page doi:10.1016/j.jocm.2017.11.003 2018

[9] [9]

Cherchi, J

E. Cherchi, J. W. Polak, Assessing user beneﬁts with discrete choice models: Implications of speciﬁcation errors under random taste heterogeneity, Transportation Research Record 1926 (1) (2005) 61–69.doi:10.1177/0361198105192600108

work page doi:10.1177/0361198105192600108 1926

[10] [10]

C. A. Sims, Rational inattention and monetary economics, in: B. M. Friedman, M. Woodford (Eds.), Handbook of monetary economics, Vol. 3, Elsevier, 2010, Ch. 4, pp. 155–181.doi:10.1016/B978-0-444-53238-1.00004-1

work page doi:10.1016/b978-0-444-53238-1.00004-1 2010

[11] [11]

Matějka, A

F. Matějka, A. McKay, Rational inattention to discrete choices: A new foundation for the multinomial logit model, American Economic Review 105 (1) (2015) 272–98.doi:10.1257/aer.20130047

work page doi:10.1257/aer.20130047 2015

[12] [12]

Alizadeh, P.-L

H. Alizadeh, P.-L. Bourbonnais, C. Morency, B. Farooq, N. Saunier, An online survey to enhance the understanding of car drivers route choices, Transportation Research Procedia 32 (2018) 482–494.doi:10.1016/j.trpro.2018.10.042

work page doi:10.1016/j.trpro.2018.10.042 2018

[13] [13]

Discrete Choice and Rational Inattention: a General Equivalence Result

M. Fosgerau, E. Melo, A. d. Palma, M. Shum, Discrete choice and rational inattention: A general equivalence result, arXiv preprint arXiv:1709.09117

work page internal anchor Pith review Pith/arXiv arXiv

[14] [14]

Fosgerau, G

M. Fosgerau, G. Jiang, Travel time variability and rational inattention, Transportation Research Part B: Methodological 120 (2019) 1–14. doi:10.1016/j.trb.2018.12.003. 20

work page doi:10.1016/j.trb.2018.12.003 2019

[15] [15]

C. A. Sims, Implications of rational inattention, Journal of monetary Economics 50 (3) (2003) 665–690.doi:10.1016/ S0304-3932(03)00029-1

work page 2003

[16]

K. Friston, J. Kilner, L. Harrison, A free energy principle for the brain, Journal of Physiology-Paris 100 (1-3) (2006) 70–87. doi:10.1016/j.jphysparis.2006.10.001

[17]

D. Ellsberg, Risk, ambiguity, and the savage axioms, The quarterly journal of economics (1961) 643–669. doi:10.2307/1884324

[18]

D. Kahneman, A. Tversky, Prospect theory: An analysis of decision under risk, Econometrica 47 (2) (1979) 263–292. doi:10.1142/9789814417358_0006

[19]

J. Steiner, C. Stewart, F. Matějka, Rational inattention dynamics: Inertia and delay in decision-making, Econometrica 85 (2) (2017) 521–553. doi:10.3982/ECTA13636

[20]

C. Teye, M. G. Bell, M. C. Bliemer, Entropy maximising facility location model for port city intermodal terminals, Transportation Research Part E: Logistics and Transportation Review 100 (2017) 1–16. doi:10.1016/j.tre.2017.01.006

[21]

B. Leard, Consumer inattention and the demand for vehicle fuel cost savings, Journal of choice modelling 29 (2018) 1–16. doi:10.1016/j.jocm.2018.08.002

[22]

A. Anas, Discrete choice theory, information theory and the multinomial logit and gravity models, Transportation Research Part B: Methodological 17 (1) (1983) 13–23. doi:10.1016/0191-2615(83)90023-1

[23]

A. Ullah, Entropy, divergence and distance measures with econometric applications, Journal of Statistical Planning and Inference 49 (1) (1996) 137–162. doi:10.1016/0378-3758(95)00034-8

[24]

I. Goodfellow, Y. Bengio, A. Courville, Deep Learning, MIT Press, 2016, http://www.deeplearningbook.org

[25]

G. E. Hinton, R. R. Salakhutdinov, Reducing the dimensionality of data with neural networks, Science 313 (5786) (2006) 504–507

[26]

G. E. Hinton, S. Osindero, Y.-W. Teh, A fast learning algorithm for deep belief nets, Neural computation 18 (7) (2006) 1527–1554. doi:10.1162/neco.2006.18.7.1527

[27]

M. Wong, B. Farooq, A bi-partite generative model framework for analyzing and simulating large scale multiple discrete-continuous travel behaviour data, arXiv preprint arXiv:1901.06415

[28]

M. Ranzato, C. Poultney, S. Chopra, Y. LeCun, Efficient learning of sparse representations with an energy-based model, in: Advances in neural information processing systems, 2007, pp. 1137–1144

[29]

D. P. Kingma, M. Welling, Auto-encoding variational bayes, arXiv preprint arXiv:1312.6114

[30]

D. M. Blei, A. Kucukelbir, J. D. McAuliffe, Variational inference: A review for statisticians, Journal of the American Statistical Association 112 (518) (2017) 859–877. doi:10.1080/01621459.2017.1285773

[31]

K. E. Train, Discrete choice methods with simulation, Cambridge university press, 2009. doi:10.1017/CBO9780511805271

[32]

A. Alwosheel, S. van Cranenburgh, C. G. Chorus, Is your dataset big enough? Sample size requirements when using artificial neural networks for discrete choice analysis, Journal of choice modelling 28 (2018) 167–182. doi:10.1016/j.jocm.2018.07.002

[33]

K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770–778

[34]

Y. W. Teh, M. Welling, S. Osindero, G. E. Hinton, Energy-based models for sparse overcomplete representations, Journal of Machine Learning Research 4 (2003) 1235–1260

[35]

Ville de Montréal, Déplacements MTL trajet (2016). URL https://ville.montreal.qc.ca/mtltrajet/

[36]

M. Ranzato, Y.-l. Boureau, Y. LeCun, Sparse feature learning for deep belief networks, in: Advances in neural information processing systems, 2008, pp. 1185–1192

[37]

X. Glorot, A. Bordes, Y. Bengio, Deep sparse rectifier neural networks, in: Proceedings of the fourteenth international conference on artificial intelligence and statistics, 2011, pp. 315–323

[38]

B. Farooq, M. Bierlaire, R. Hurtubia, G. Flötteröd, Simulation based population synthesis, Transportation Research Part B: Methodological 58 (2013) 243–263. doi:10.1016/j.trb.2013.09.012

[39]

A. Golan, G. Judge, J. M. Perloff, A maximum entropy approach to recovering information from multinomial response data, Journal of the American Statistical Association 91 (434) (1996) 841–853.