Semi-Parametric Hierarchical Bayes Estimates of New Yorkers' Willingness to Pay for Features of Shared Automated Vehicle Services

Akshay Vij; Rico Krueger; Taha H. Rashidi

arxiv: 1907.09639 · v1 · pith:7I3LH43Tnew · submitted 2019-07-23 · 💰 econ.GN · q-fin.EC

Pith reviewed 2026-05-24 17:23 UTC · model grok-4.3

keywords mixing · distributions · dp-mon · vehicle · data · semi-parametric · services · automated

The pith

A DP-MON mixing distribution in hierarchical Bayes MNL models fits NYC SAV stated choice data better than simpler mixing distributions. It reveals polarized WTP for avoiding ride-splitting (one third of respondents would pay 10-80 USD/h) and a low value placed on vehicle automation and electrification.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Researchers collected survey data from New Yorkers who chose between different shared automated vehicle service options that varied in price, travel time, sharing with strangers, automation, and power source. They analyzed these choices with hierarchical Bayesian multinomial logit models that allow tastes to differ across people. Three ways of modeling that taste variation were tested: a single normal distribution, a fixed number of normal distributions mixed together, and a Dirichlet process mixture of normals that lets the data determine how many groups are needed. The most flexible version fit the survey responses better than the others and did just as well when predicting choices people had not seen during estimation. The results showed that feelings about sharing a ride with strangers split sharply: roughly one third of respondents would pay between 10 and 80 dollars per hour to avoid it, while the rest were neutral or even preferred sharing. Whether the vehicle was self-driving or electric mattered little to most people. This pattern suggests that the main benefits of these technologies may come indirectly through lower costs or better service rather than from the technology features themselves.

Core claim

The DP-MON mixing distribution provides superior fit to the data and performs at least as well as the competing methods at out-of-sample prediction. We find that preferences for in-vehicle travel time by SAV with ride-splitting are strongly polarised. Whereas one third of the sample is willing to pay between 10 and 80 USD/h to avoid sharing a vehicle with strangers, the remainder of the sample is either indifferent to ride-splitting or even desires it. Moreover, we estimate that new technologies such as vehicle automation and electrification are relatively unimportant to travellers.
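A willingness-to-pay figure in USD/h is typically a ratio of a time-related coefficient to a cost coefficient. The sketch below shows that conversion and the share of respondents falling inside a WTP band. It assumes utility is linear in travel time (minutes) and cost (USD), which may differ from the paper's own specification. The function names and the three coefficient pairs are hypothetical and are not estimates from the paper.

```python
def wtp_usd_per_hour(beta_time_per_min, beta_cost_per_usd):
    """Convert a time and a cost coefficient into a WTP in USD per hour."""
    # Ratio of marginal utilities; the factor 60 converts minutes to hours.
    return 60.0 * beta_time_per_min / beta_cost_per_usd


def share_in_band(values, low, high):
    """Fraction of values that fall inside the closed interval [low, high]."""
    return sum(1 for v in values if low <= v <= high) / len(values)


# Hypothetical (time, cost) coefficients for three respondents; not from the paper.
respondents = [(-0.10, -0.20), (-0.02, -0.25), (0.01, -0.30)]
wtp = [wtp_usd_per_hour(b_time, b_cost) for b_time, b_cost in respondents]

print([round(w, 2) for w in wtp])
print(share_in_band(wtp, 10.0, 80.0))
```

A positive time coefficient, as for the third hypothetical respondent, produces a negative WTP, which corresponds to a respondent who prefers the attribute rather than paying to avoid it.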

Load-bearing premise

The stated choice experiment data accurately capture real-world willingness to pay and behavioral responses to SAV service attributes.

Original abstract

In this paper, we contrast parametric and semi-parametric representations of unobserved heterogeneity in hierarchical Bayesian multinomial logit models and leverage these methods to infer distributions of willingness to pay for features of shared automated vehicle (SAV) services. Specifically, we compare the multivariate normal (MVN), finite mixture of normals (F-MON) and Dirichlet process mixture of normals (DP-MON) mixing distributions. The latter promises to be particularly flexible in respect to the shapes it can assume and unlike other semi-parametric approaches does not require that its complexity is fixed prior to estimation. However, its properties relative to simpler mixing distributions are not well understood. In this paper, we evaluate the performance of the MVN, F-MON and DP-MON mixing distributions using simulated data and real data sourced from a stated choice study on preferences for SAV services in New York City. Our analysis shows that the DP-MON mixing distribution provides superior fit to the data and performs at least as well as the competing methods at out-of-sample prediction. The DP-MON mixing distribution also offers substantive behavioural insights into the adoption of SAVs. We find that preferences for in-vehicle travel time by SAV with ride-splitting are strongly polarised. Whereas one third of the sample is willing to pay between 10 and 80 USD/h to avoid sharing a vehicle with strangers, the remainder of the sample is either indifferent to ride-splitting or even desires it. Moreover, we estimate that new technologies such as vehicle automation and electrification are relatively unimportant to travellers. This suggests that travellers may primarily derive indirect, rather than immediate benefits from these new technologies through increases in operational efficiency and lower operating costs.
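To illustrate why a mixture of normals can represent polarised tastes while a single normal cannot, the sketch below evaluates a univariate mixture density. It is a one-dimensional simplification of the multivariate mixing distributions named in the abstract. The weights, means and standard deviations are hypothetical values chosen only to produce two modes.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate normal with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, weights, means, sds):
    """Density of a finite mixture of univariate normals (weights must sum to 1)."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds))


# Hypothetical two-group taste distribution: one averse group, one indifferent group.
weights = [1.0 / 3.0, 2.0 / 3.0]
means = [-2.0, 0.0]
sds = [0.5, 0.5]

for x in (-2.0, -1.0, 0.0):
    print(x, round(mixture_pdf(x, weights, means, sds), 4))
```

The density is higher at both group means than at the point between them, which is the kind of shape a single normal mixing distribution cannot reproduce.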

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

DP-MON gives better in-sample fit than MVN or F-MON on the SAV data and reveals polarized ride-splitting preferences, but the results sit on unvalidated stated choice responses.

The main things to know are that the Dirichlet process mixture of normals mixing distribution outperforms the multivariate normal and finite mixture of normals on in-sample fit for the New York City stated choice data while matching them on out-of-sample prediction, and that it produces a clear split in preferences where roughly one third of respondents show strong aversion to ride-splitting worth 10-80 USD per hour while the rest are indifferent or positive. Automation and electrification come out as relatively low priorities compared with operational features. The paper applies DP-MON to both simulated and real data without fixing the number of components in advance, which is the concrete advance over the baselines. The comparison is straightforward and the behavioral pattern on ride-splitting polarization is new relative to the cited work. The method section appears to deliver what the abstract promises on model performance.

The central limitation is the data source. Stated choice experiments for services that do not yet exist routinely produce WTP estimates that do not match actual behavior, and the paper does not report external validation against revealed preferences or direct tests for hypothetical bias. That leaves the specific dollar ranges and the claim that new technologies are unimportant as plausible but provisional. The estimation details and stability checks are presumably in the full text, but the abstract itself gives no error bars or quantitative deltas.

This paper is aimed at researchers who work on flexible heterogeneity in discrete choice models or who need city-level inputs for SAV planning. A reader focused on semi-parametric mixing distributions will find the simulation and real-data comparison useful. It is solid enough on its own terms to merit a serious referee even if the policy takeaways require the usual caution around stated preference data.

Referee Report

2 major / 2 minor

Summary. The paper compares multivariate normal (MVN), finite mixture of normals (F-MON), and Dirichlet process mixture of normals (DP-MON) mixing distributions within hierarchical Bayesian multinomial logit models. Using both simulated data and a stated choice survey of New York City respondents, it evaluates in-sample fit and out-of-sample prediction performance, then applies the preferred specification to recover distributions of willingness-to-pay for shared automated vehicle attributes. The central claims are that DP-MON yields superior in-sample fit with comparable out-of-sample performance and that preferences for ride-splitting are strongly polarized while automation and electrification are relatively unimportant to travelers.

Significance. If the model-comparison results are confirmed with quantitative metrics, the work provides a practical demonstration of the advantages of flexible semi-parametric mixing distributions for capturing unobserved heterogeneity in discrete choice settings. The behavioral findings on ride-splitting polarization could inform service design for emerging mobility options. The use of both simulated and real data for benchmarking the three mixing distributions is a methodological strength that strengthens the model-selection conclusions.

major comments (2)

[Abstract] Abstract: the claim that DP-MON provides superior in-sample fit and performs at least as well at out-of-sample prediction is stated without any reported quantitative metrics (log-likelihood values, DIC/WAIC, hit rates, or standard errors), preventing verification of the magnitude or statistical significance of the reported advantage over MVN and F-MON.
[Results / behavioral insights] Behavioral results (abstract and results section): the substantive claims of polarized WTP for ride-splitting (one-third of sample at 10–80 USD/h) and low importance of automation/electrification rest entirely on stated-choice data; the manuscript contains no discussion of hypothetical bias, no external validation against revealed-preference data, and no sensitivity checks, which directly affects the credibility of the policy-relevant WTP distributions.

minor comments (2)

[Methods] The description of the Dirichlet process concentration parameter and its prior could be accompanied by an explicit equation or reference to the estimation algorithm used.
[Tables] Tables reporting WTP distributions should include measures of uncertainty (e.g., credible intervals) for the reported quantiles and mixture-component shares.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major comment in turn below and outline the revisions we will make.

Referee: [Abstract] Abstract: the claim that DP-MON provides superior in-sample fit and performs at least as well at out-of-sample prediction is stated without any reported quantitative metrics (log-likelihood values, DIC/WAIC, hit rates, or standard errors), preventing verification of the magnitude or statistical significance of the reported advantage over MVN and F-MON.

Authors: We agree that the abstract would be strengthened by including quantitative metrics. Although the results section contains tables reporting log-likelihoods, DIC, WAIC, and out-of-sample hit rates, these are not summarized in the abstract. In the revised manuscript we will add the key metrics (e.g., in-sample log-likelihood improvements and out-of-sample prediction rates) directly into the abstract so that the magnitude of the DP-MON advantage is verifiable. revision: yes
Referee: [Results / behavioral insights] Behavioral results (abstract and results section): the substantive claims of polarized WTP for ride-splitting (one-third of sample at 10–80 USD/h) and low importance of automation/electrification rest entirely on stated-choice data; the manuscript contains no discussion of hypothetical bias, no external validation against revealed-preference data, and no sensitivity checks, which directly affects the credibility of the policy-relevant WTP distributions.

Authors: We accept that the behavioral claims rest on stated-choice data and that the original manuscript does not discuss hypothetical bias or external validation. We will add a dedicated limitations subsection that (i) cites the literature on hypothetical bias in stated-choice experiments for travel behavior, (ii) explicitly notes the absence of revealed-preference data for external validation, and (iii) reports any sensitivity checks already conducted (e.g., alternative model specifications or prior robustness). While we cannot supply new revealed-preference validation without additional data collection, the expanded discussion will make the scope and limitations of the WTP results transparent. revision: yes

Circularity Check

0 steps flagged

No significant circularity; results from external data and standard model evaluation

The paper fits hierarchical Bayesian MNL models with MVN, F-MON and DP-MON mixing distributions to simulated data and to real stated-choice data collected from NYC respondents. Model performance is assessed via standard in-sample fit metrics and out-of-sample prediction on held-out choice observations; these quantities are not defined in terms of the fitted parameters themselves. WTP distributions are recovered directly from the estimated posteriors. No self-definitional equations, no renaming of fitted quantities as predictions, and no load-bearing self-citations that would make the central claims tautological. The derivation chain rests on external survey data and established econometric procedures without reduction to its own inputs.
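As an illustration of what out-of-sample evaluation on held-out choices can look like, the sketch below computes two generic scores from predicted choice probabilities: the mean log predictive probability of the chosen alternative and the hit rate. These are common choices and are assumptions here; this page does not state which exact metrics the paper reports. The function names and example numbers are hypothetical.

```python
import math


def mean_log_score(prob_rows, chosen):
    """Average log probability assigned to the alternatives actually chosen."""
    return sum(math.log(row[c]) for row, c in zip(prob_rows, chosen)) / len(chosen)


def hit_rate(prob_rows, chosen):
    """Share of held-out choices where the most probable alternative was chosen."""
    hits = 0
    for row, c in zip(prob_rows, chosen):
        predicted = max(range(len(row)), key=lambda j: row[j])
        hits += int(predicted == c)
    return hits / len(chosen)


# Hypothetical predicted probabilities for two held-out choice tasks.
held_out_probs = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3]]
held_out_choices = [0, 2]

print(mean_log_score(held_out_probs, held_out_choices))
print(hit_rate(held_out_probs, held_out_choices))
```

Both scores depend only on the predicted probabilities and the observed held-out choices, so they can be compared across models with different mixing distributions.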

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The analysis relies on standard discrete choice theory and Bayesian nonparametric mixing without introducing new entities or ad-hoc postulates beyond the model comparison itself.

free parameters (2)

number of components in F-MON
Finite mixture requires pre-specifying or selecting the number of normal components, a modeling choice tuned to data.
Dirichlet process concentration parameter
Controls the effective number of mixture components in DP-MON and is estimated from data.
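As a rough guide to how the concentration parameter governs the number of components, the prior expected number of occupied clusters under a Dirichlet process with concentration α and N observations is the sum of α / (α + i - 1) for i from 1 to N. The sketch below evaluates that sum. The function name and the example values are hypothetical, and the paper's sampler updates α from the data rather than fixing it.

```python
def expected_clusters(alpha, n):
    """Prior expected number of occupied clusters for n draws from a DP(alpha)."""
    # Observation i (counting from zero) opens a new cluster with probability alpha / (alpha + i).
    return sum(alpha / (alpha + i) for i in range(n))


# Hypothetical concentration values and sample size, for illustration only.
for alpha in (0.5, 1.0, 5.0):
    print(alpha, round(expected_clusters(alpha, 1000), 2))
```

Larger values of α imply more occupied components a priori, which is why the prior on α and its update step matter for the flexibility of the mixing distribution.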

axioms (1)

Domain assumption: respondents choose according to the multinomial logit rule derived from random utility maximization.
Core maintained assumption of the hierarchical Bayes MNL framework used throughout.
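The multinomial logit rule can be sketched as a softmax over the utilities of the alternatives in a choice task. The example assumes utility is linear in the attributes. The attribute layout (cost in USD, time in minutes) and the coefficient values are hypothetical and are not taken from the paper.

```python
import math


def utility(attributes, beta):
    """Linear-in-parameters utility of one alternative."""
    return sum(x * b for x, b in zip(attributes, beta))


def mnl_probabilities(utilities):
    """Multinomial logit choice probabilities for one choice task."""
    shift = max(utilities)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in utilities]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical choice task: each alternative is (cost in USD, time in minutes).
alternatives = [(10.0, 30.0), (14.0, 20.0), (8.0, 45.0)]
beta = (-0.2, -0.05)  # hypothetical cost and time coefficients

probs = mnl_probabilities([utility(a, beta) for a in alternatives])
print([round(p, 3) for p in probs])
```

In the hierarchical Bayes setting, each respondent has an individual coefficient vector, so these probabilities are evaluated per respondent and per choice task.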

Reference graph

Works this paper leans on

12 extracted references · 12 canonical work pages

[1]

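Update $\zeta$ by sampling $\zeta \sim \mathrm{N}(\mu_\zeta, \Sigma_\zeta)$, where $\Sigma_\zeta = \left(\Sigma_0^{-1} + N \Omega^{-1}\right)^{-1}$ and $\mu_\zeta = \Sigma_\zeta \left(\Sigma_0^{-1} \mu_0 + \Omega^{-1} \sum_{n=1}^{N} \beta_n\right)$.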
[2]

Update $a_r$ for all $r \in \{1, \ldots, R\}$ by sampling $a_r \sim \mathrm{Gamma}\left(\frac{\nu + R}{2}, \frac{1}{A_r^2} + \nu \left(\Omega^{-1}\right)_{rr}\right)$.
[3]

Update $\Omega$ by sampling $\Omega \sim \mathrm{IW}\left(\nu + N + K - 1,\; 2\nu\,\mathrm{diag}(a) + \sum_{n=1}^{N} (\beta_n - \zeta)(\beta_n - \zeta)^{\top}\right)$.
[4]

Update $\beta_n$ for all $n \in \{1, \ldots, N\}$: a) Propose $\tilde{\beta}_n = \beta_n + \sqrt{\rho}\,\mathrm{chol}(\Omega)\,\eta$, where $\eta \sim \mathrm{N}(0, I_K)$. b) Compute $r = \frac{P(y_n \mid X_n, \tilde{\beta}_n)\,\phi(\tilde{\beta}_n \mid \zeta, \Omega)}{P(y_n \mid X_n, \beta_n)\,\phi(\beta_n \mid \zeta, \Omega)}$. c) Draw $u \sim \mathrm{Uniform}(0, 1)$. If $u \leq r$, accept the proposal. If $u > r$, reject the proposal. $\rho$ is a step size, which needs to be tuned. Here, we employ the same tuning mechanism as Train (2009): $\rho$ is set to an ini...
[5]

Update $\zeta_k$ for all $k \in \{1, \ldots, K\}$ by sampling $\zeta_k \sim \mathrm{N}(\mu_{\zeta_k}, \Sigma_{\zeta_k})$, where $\Sigma_{\zeta_k} = \left(\Sigma_0^{-1} + c_k \Omega^{-1}\right)^{-1}$ and $\mu_{\zeta_k} = \Sigma_{\zeta_k} \left(\Sigma_0^{-1} \mu_0 + \Omega^{-1} \sum_{n: q_n = k} \beta_n\right)$.
[6]

Update $a_{kr}$ for all $k \in \{1, \ldots, K\}$ and $r \in \{1, \ldots, R\}$ by sampling $a_{kr} \sim \mathrm{Gamma}\left(\frac{\nu + R}{2}, \frac{1}{A_r^2} + \nu \left(\Omega_k^{-1}\right)_{rr}\right)$.
[7]

Update $\Omega_k$ for all $k \in \{1, \ldots, K\}$ by sampling $\Omega_k \sim \mathrm{IW}\left(\nu + c_k + R - 1,\; 2\nu\,\mathrm{diag}(a_k) + \sum_{n: q_n = k} (\beta_n - \zeta_k)(\beta_n - \zeta_k)^{\top}\right)$.
[8]

Update $\pi$ by sampling $\pi \sim \mathrm{Dirichlet}(\bar{\alpha})$, where $\bar{\alpha}_k = \alpha + c_k$.
[9]

Update $q_n$ for all $n \in \{1, \ldots, N\}$ by sampling $q_n \sim \mathrm{Categorical}(p)$, where $p_k = \frac{\pi_k\,\phi(\beta_n \mid \zeta_k, \Omega_k)}{\sum_{k'=1}^{K} \pi_{k'}\,\phi(\beta_n \mid \zeta_{k'}, \Omega_{k'})}$.
[10]

Update $\beta_n$ for all $n \in \{1, \ldots, N\}$: a) Propose $\tilde{\beta}_n = \beta_n + \sqrt{\rho}\,\mathrm{chol}(\Omega)\,\eta$, where $\eta \sim \mathrm{N}(0, I_K)$. b) Compute $r = \frac{P(y_n \mid X_n, \tilde{\beta}_n)\,\phi(\tilde{\beta}_n \mid \zeta_k, \Omega_k)}{P(y_n \mid X_n, \beta_n)\,\phi(\beta_n \mid \zeta_k, \Omega_k)}$. c) Draw $u \sim \mathrm{Uniform}(0, 1)$. If $u \leq r$, accept the proposal. If $u > r$, reject the proposal. The step size $\rho$ is tuned in the same way as for Algorithm 1. A.3. Mixed logit with a Dirichlet process mixture of ...
[11]

Update $\alpha$ by sampling $\alpha \sim \mathrm{Gamma}\left(2 + K - 1,\; 2 - \sum_{k=1}^{K-1} \ln(1 - \eta_k)\right)$.
[12]

Update $\eta_k$ for all $k \in \{1, \ldots, K-1\}$ by sampling $\eta_k \sim \mathrm{Beta}\left(1 + c_k,\; \alpha + \sum_{j=k+1}^{K} c_j\right)$, set $\eta_K = 1$, and calculate $\pi_k = \eta_k \prod_{l=1}^{k-1} (1 - \eta_l)$ for all $k \in \{1, \ldots, K\}$.
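Several of the update steps listed above can be sketched in a scalar setting, where each respondent has a single taste parameter and covariance matrices reduce to variances. This is a simplified illustration under that assumption, not the authors' implementation. All function names are hypothetical. The acceptance rule follows the standard Metropolis-Hastings convention, in which a proposal is accepted when the uniform draw does not exceed the ratio r.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def zeta_posterior_moments(betas, omega, mu0, sigma0):
    """Scalar conjugate update of the population mean (prior N(mu0, sigma0))."""
    post_var = 1.0 / (1.0 / sigma0 + len(betas) / omega)
    post_mean = post_var * (mu0 / sigma0 + sum(betas) / omega)
    return post_mean, post_var


def accept_proposal(ratio, u):
    """Standard Metropolis-Hastings rule: accept when u <= r."""
    return u <= ratio


def assignment_probs(beta_n, weights, means, variances):
    """Component membership probabilities, proportional to weight times density."""
    unnorm = [w * normal_pdf(beta_n, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(unnorm)
    return [v / total for v in unnorm]


def stick_breaking_weights(etas):
    """Mixture weights pi_k = eta_k * prod_{l<k} (1 - eta_l); the last eta should be 1."""
    weights, remaining = [], 1.0
    for eta in etas:
        weights.append(eta * remaining)
        remaining *= 1.0 - eta
    return weights


rng = random.Random(0)  # fixed seed so the sketch is repeatable
mean, var = zeta_posterior_moments([1.0, 2.0, 3.0], omega=1.0, mu0=0.0, sigma0=1.0)
zeta_draw = rng.gauss(mean, math.sqrt(var))
weights = stick_breaking_weights([0.5, 0.5, 1.0])
probs = assignment_probs(0.1, weights, [0.0, 2.0, 5.0], [1.0, 1.0, 1.0])
component = rng.choices(range(len(probs)), weights=probs)[0]
print(zeta_draw, weights, component)
```

Setting the final stick fraction to 1 ensures the truncated weights sum to one, which is what allows a finite number of components to approximate the Dirichlet process.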
