Estimating Supply Incrementality in Two-sided Marketplaces: A Causal Machine Learning Approach
Pith reviewed 2026-07-01 00:58 UTC · model grok-4.3
The pith
Double/debiased machine learning combined with hierarchical Bayesian priors estimates the causal effect of added supply on total bookings across segments in marketplaces like Airbnb.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that combining double/debiased machine learning with a hierarchical Bayesian framework, using pre-existing knowledge as priors and geospatial similarity measures to construct features, yields plausible estimates of the marketplace returns to additional supply and strong out-of-sample performance. The application is estimating the impact of extra listings on total bookings in heterogeneous two-sided marketplaces such as Airbnb.
What carries the argument
Double/debiased machine learning integrated with a hierarchical Bayesian framework that leverages pre-existing knowledge as priors and geospatial measures of product segment similarity to build tractable features for shared estimation across segments.
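As a rough illustration of the double/debiased machine learning step, the sketch below runs cross-fitted partialling-out on synthetic data. It is not the paper's implementation: the function names (`ols_fit_predict`, `dml_plr`), the linear nuisance learner, and the simulated data are all assumptions made for this example, and the hierarchical Bayesian layer is left out.

```python
import numpy as np

def ols_fit_predict(X_train, y_train, X_test):
    """Least-squares nuisance learner (a stand-in for any ML regressor)."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_test = np.column_stack([np.ones(len(X_test)), X_test])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return A_test @ coef

def dml_plr(y, d, X, n_folds=2, seed=0):
    """Cross-fitted partialling-out estimate of theta in y = theta*d + g(X) + e."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    y_res = np.empty(len(y))
    d_res = np.empty(len(y))
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        # Residualize outcome and treatment using models fit on the other folds.
        y_res[test] = y[test] - ols_fit_predict(X[train], y[train], X[test])
        d_res[test] = d[test] - ols_fit_predict(X[train], d[train], X[test])
    # Regress outcome residuals on treatment residuals.
    theta = (d_res @ y_res) / (d_res @ d_res)
    # Standard error implied by the orthogonal score.
    psi = d_res * (y_res - theta * d_res)
    se = np.sqrt(np.mean(psi**2) / len(y)) / np.mean(d_res**2)
    return theta, se

# Synthetic data (hypothetical): d plays the role of supply, y of bookings.
rng = np.random.default_rng(42)
n = 4000
X = rng.normal(size=(n, 3))
d = X @ np.array([0.5, -0.2, 0.1]) + rng.normal(size=n)
y = 0.7 * d + X @ np.array([1.0, 0.5, -0.5]) + rng.normal(size=n)
theta_hat, se_hat = dml_plr(y, d, X)
print(f"theta_hat={theta_hat:.3f}, se={se_hat:.3f}")
```

In a real application the least-squares learner would typically be swapped for a flexible regressor, and the segment-level estimates would then feed the hierarchical model.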
If this is right
- The estimates can be used to assess returns to additional supply in different product segments while borrowing strength from similar segments.
- The same combination of methods can be applied to other two-sided marketplaces to measure supply incrementality.
- Geospatial similarity features enable the construction of informative inputs even when direct data on segment relationships is limited.
- Strong out-of-sample results support using the model for counterfactual evaluation of supply changes.
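The summary does not say which geospatial similarity measures the paper uses. As one possible reading, the sketch below builds a distance-decay similarity matrix between segment centroids and uses it to form a "supply in similar segments" feature. The Gaussian kernel, the bandwidth, the coordinates, and the supply counts are all hypothetical choices for illustration.

```python
import numpy as np

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between points given in degrees."""
    radius_km = 6371.0
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * np.arcsin(np.sqrt(a))

def similarity_weights(lat, lon, bandwidth_km=50.0):
    """Row-normalized Gaussian distance-decay weights with a zero diagonal."""
    dist = haversine_km(lat[:, None], lon[:, None], lat[None, :], lon[None, :])
    w = np.exp(-(dist / bandwidth_km) ** 2)
    np.fill_diagonal(w, 0.0)  # a segment is not its own neighbor
    return w / w.sum(axis=1, keepdims=True)

# Hypothetical segment centroids and listing counts.
lat = np.array([37.77, 37.80, 34.05, 40.71])
lon = np.array([-122.42, -122.27, -118.24, -74.01])
supply = np.array([1200.0, 800.0, 2500.0, 4000.0])

W = similarity_weights(lat, lon, bandwidth_km=500.0)
neighbor_supply = W @ supply  # similarity-weighted supply in other segments
print(np.round(neighbor_supply, 1))
```

A feature like `neighbor_supply` could enter the nuisance models as a control, which is one way similarity information can be made tractable when direct data on segment relationships is limited.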
Where Pith is reading between the lines
- If the priors turn out to be poorly calibrated for certain segments, the shared estimation could produce misleading results, pointing to the value of robustness checks on prior sensitivity.
- The geospatial feature construction could be replaced or augmented with other similarity metrics from network or text data to test whether the performance gains generalize.
- The method's ability to handle heterogeneous products suggests it could inform supply allocation rules that target segments with higher incremental returns.
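The prior-sensitivity concern can be made concrete with a simple normal-normal model, which is an assumption of this sketch rather than the paper's stated specification. Under that model each segment's posterior mean is a weighted average of its own estimate and the prior mean, so varying the prior standard deviation shows how much the prior drives each segment. All numbers below are hypothetical.

```python
import numpy as np

def posterior_mean(theta_hat, se, prior_mean, prior_sd):
    """Normal-normal posterior mean and the share of weight placed on the prior."""
    data_weight = prior_sd**2 / (prior_sd**2 + se**2)
    post = data_weight * theta_hat + (1.0 - data_weight) * prior_mean
    return post, 1.0 - data_weight

# Hypothetical segment-level estimates and standard errors.
theta_hat = np.array([0.9, 0.4, 0.1])
se = np.array([0.05, 0.20, 0.60])

for prior_sd in (0.05, 0.2, 1.0):
    post, shrink = posterior_mean(theta_hat, se, prior_mean=0.5, prior_sd=prior_sd)
    print(f"prior_sd={prior_sd}: posterior={np.round(post, 3)}, "
          f"prior weight={np.round(shrink, 3)}")
```

Segments with large standard errors move furthest toward the prior mean, so those are the segments where a poorly calibrated prior would matter most.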
Load-bearing premise
The hierarchical Bayesian framework depends on pre-existing knowledge supplying informative and appropriate priors so that estimation can be shared across product segments.
What would settle it
If the fitted model were applied to a hold-out Airbnb dataset and its predicted changes in total bookings after observed supply increases deviated substantially from the actual outcomes, the claimed out-of-sample performance would not hold.
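A minimal version of such a hold-out check compares the model's predicted booking changes with realized changes and with a naive baseline. The paper's actual metric and baseline are not given in this summary, so RMSE, the zero-change baseline, the helper names, and the numbers below are assumptions for illustration.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    diff = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean(diff**2)))

def holdout_report(actual_change, predicted_change, baseline_change):
    """Model and baseline RMSE, plus skill = 1 - model_rmse / baseline_rmse."""
    model_rmse = rmse(actual_change, predicted_change)
    baseline_rmse = rmse(actual_change, baseline_change)
    return {
        "model_rmse": model_rmse,
        "baseline_rmse": baseline_rmse,
        "skill": 1.0 - model_rmse / baseline_rmse,
    }

# Hypothetical hold-out segments: realized vs. predicted change in bookings.
actual = [12.0, 5.0, 30.0, -2.0]
predicted = [10.0, 6.0, 27.0, 0.0]
baseline = [0.0, 0.0, 0.0, 0.0]  # naive "supply has no effect" prediction

report = holdout_report(actual, predicted, baseline)
print(report)
```

A skill value near or below zero would mean the model predicts no better than the naive baseline, which is the kind of outcome that would undercut the out-of-sample claim.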
Original abstract
In two-sided marketplaces with heterogeneous products, it is important to understand the causal relationship between additional supply and marketplace outcomes, such as the total quantity transacted or transaction value in the marketplace. This paper studies a causal machine learning approach to estimating this relationship across product segments. We use the Airbnb marketplace as an example, focusing on the impact of additional listing supply on total bookings, but the methodology applies to other two-sided marketplaces. Our approach combines double/debiased machine learning with a hierarchical Bayesian framework that leverages pre-existing knowledge as priors. We construct tractable and informative features for the model by leveraging measures of product segment similarity from the geospatial literature. We find that such a model provides plausible estimates of the marketplace returns to additional supply and strong out of sample performance.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes combining double/debiased machine learning (DML) with a hierarchical Bayesian model that incorporates pre-existing knowledge as priors and geospatial segment-similarity features to estimate the causal effect of additional supply (listings) on marketplace outcomes such as total bookings in two-sided platforms, using Airbnb as the running example. It claims the resulting estimates are plausible and exhibit strong out-of-sample performance across product segments.
Significance. If the causal identification holds after nuisance estimation, the approach would offer a practical way to pool information across heterogeneous segments while respecting domain structure, which could be useful for supply-side policy in marketplaces. The use of external geospatial priors and hierarchical shrinkage is a concrete strength that distinguishes it from off-the-shelf DML applications.
major comments (2)
- [Methods / Identification strategy] The central claim that DML recovers unbiased marketplace returns to supply rests on the orthogonality condition after nuisance estimation. The manuscript provides no explicit discussion or robustness evidence that the constructed geospatial similarity features (or the hierarchical priors) are sufficient to block segment-specific unobserved demand shocks that could jointly determine listing supply and bookings; without such evidence the reported estimates remain vulnerable to the endogeneity concern raised in the stress test.
- [Hierarchical Bayesian framework] The hierarchical Bayesian component is described as leveraging 'pre-existing knowledge as priors,' yet the paper does not report sensitivity checks to prior specification or the degree of shrinkage across segments. If the priors are weakly informative or misspecified for certain product segments, the pooled estimates could be driven by the prior rather than the data, undermining the claim of plausible causal effects.
minor comments (2)
- [Abstract / Results] The abstract states 'strong out of sample performance' without defining the metric, the hold-out scheme, or the baseline comparator; this should be clarified with concrete numbers and a table in the results section.
- [Methods] Notation for the DML moment condition and the hierarchical prior should be introduced with explicit equations rather than prose descriptions to allow readers to verify the orthogonality construction.
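For orientation, the kind of notation the referee asks for might look like the following. This is the standard partially linear formulation from the double/debiased machine learning literature plus a generic normal hierarchical prior; the manuscript's exact specification is not shown in this summary, so the symbols ($Y$ for bookings, $D$ for supply, $X$ for controls including geospatial features, $\theta_s$ for the segment-$s$ effect, and the hyperparameters $\mu_0, \sigma_0, \tau$) are assumptions.

```latex
\begin{align}
  Y &= \theta_0 D + g_0(X) + \varepsilon, & \mathbb{E}[\varepsilon \mid D, X] &= 0, \\
  D &= m_0(X) + V, & \mathbb{E}[V \mid X] &= 0, \\
  \psi(W; \theta, \eta) &= \bigl(Y - \ell(X) - \theta\,(D - m(X))\bigr)\bigl(D - m(X)\bigr),
    & \eta &= (\ell, m), \quad \ell_0(X) = \mathbb{E}[Y \mid X], \\
  \mathbb{E}\bigl[\psi(W; \theta_0, \eta_0)\bigr] &= 0,
    & \partial_\eta \, \mathbb{E}\bigl[\psi(W; \theta_0, \eta)\bigr]\Big|_{\eta = \eta_0} &= 0, \\
  \theta_s \mid \mu, \tau &\sim \mathcal{N}(\mu, \tau^2),
    & \mu &\sim \mathcal{N}(\mu_0, \sigma_0^2).
\end{align}
```

The fourth line states the moment condition and Neyman orthogonality, which is the property a reader would need to verify; the last line is where pre-existing knowledge would enter through $\mu_0$ and $\sigma_0$.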
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive report. The comments highlight important areas for strengthening the identification discussion and the hierarchical Bayesian component. We address each point below and outline revisions that will be incorporated in the next version of the manuscript.
- Referee: [Methods / Identification strategy] The central claim that DML recovers unbiased marketplace returns to supply rests on the orthogonality condition after nuisance estimation. The manuscript provides no explicit discussion or robustness evidence that the constructed geospatial similarity features (or the hierarchical priors) are sufficient to block segment-specific unobserved demand shocks that could jointly determine listing supply and bookings; without such evidence the reported estimates remain vulnerable to the endogeneity concern raised in the stress test.
Authors: We agree that an explicit discussion of how the geospatial segment-similarity features support the orthogonality condition would improve the manuscript. The DML framework relies on the nuisance estimators (including the geospatial features) being sufficiently rich to capture confounding; our features are constructed from established geospatial similarity measures precisely to proxy for shared demand factors across segments. The stress test already provides supporting evidence against certain endogeneity patterns. We will revise the identification section to include a dedicated subsection clarifying these assumptions, the role of the features in blocking segment-specific demand shocks, and additional robustness checks that leverage the stress-test design. revision: yes
- Referee: [Hierarchical Bayesian framework] The hierarchical Bayesian component is described as leveraging 'pre-existing knowledge as priors,' yet the paper does not report sensitivity checks to prior specification or the degree of shrinkage across segments. If the priors are weakly informative or misspecified for certain product segments, the pooled estimates could be driven by the prior rather than the data, undermining the claim of plausible causal effects.
Authors: We acknowledge that the absence of reported sensitivity analyses to prior choice and shrinkage is a gap. While the hierarchical structure is intended to let the data dominate via partial pooling, explicit checks are needed to demonstrate robustness. We will add a new subsection (or appendix) reporting sensitivity to alternative prior specifications (e.g., varying hyperprior strength and informativeness) and quantitative measures of shrinkage (posterior vs. segment-specific estimates) across product segments. These will be presented alongside the main results to confirm that the reported effects are data-driven. revision: yes
Circularity Check
No circularity; standard causal ML pipeline with external priors and features.
The described approach combines double/debiased ML with a hierarchical Bayesian model that uses pre-existing knowledge as priors and constructs features from the geospatial literature. No equations, fitted parameters, or self-citations are shown that reduce the claimed marketplace returns or out-of-sample performance to inputs by construction. The derivation chain remains self-contained against external benchmarks and does not invoke load-bearing self-citations or ansatzes.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Double/debiased machine learning assumptions hold for identifying causal effects of supply
- domain assumption: Pre-existing knowledge provides informative and appropriate priors for the hierarchical model across segments