Estimating Supply Incrementality in Two-sided Marketplaces: A Causal Machine Learning Approach

Daniel Schmierer; Dan Zylberglejd; Yufei Wu

arXiv: 2606.30999 · v1 · pith:Z2HBE5BDnew · submitted 2026-06-30 · 💻 cs.LG · econ.EM · stat.AP · stat.ME

Pith reviewed 2026-07-01 00:58 UTC · model grok-4.3

classification 💻 cs.LG · econ.EM · stat.AP · stat.ME

keywords causal machine learning · double debiased machine learning · hierarchical Bayesian model · two-sided marketplaces · supply incrementality · Airbnb · geospatial similarity · out-of-sample performance

The pith

Double/debiased machine learning combined with hierarchical Bayesian priors estimates the causal effect of added supply on total bookings across segments in marketplaces like Airbnb.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a causal machine learning method to measure how additional supply, such as new listings, affects overall marketplace outcomes like total bookings in two-sided platforms with varied products. It integrates double/debiased machine learning for handling many controls with a hierarchical Bayesian model that uses existing knowledge as priors and draws on geospatial measures to build features for product segment similarity. The approach is demonstrated on Airbnb data but applies more broadly. The authors report that the resulting model yields plausible estimates of supply returns and performs strongly on data not used in fitting. Understanding these relationships matters for platforms deciding whether and where to encourage more supply without simply displacing existing activity.

Core claim

The central claim is that combining double/debiased machine learning with a hierarchical Bayesian framework produces plausible estimates of the marketplace returns to additional supply, together with strong out-of-sample performance. The framework uses pre-existing knowledge as priors and geospatial similarity measures to construct features, and it is applied to estimating the impact of extra listings on total bookings in heterogeneous two-sided marketplaces such as Airbnb.

What carries the argument

Double/debiased machine learning integrated with a hierarchical Bayesian framework that leverages pre-existing knowledge as priors and geospatial measures of product segment similarity to build tractable features for shared estimation across segments.
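
To make the double/debiased machine learning step concrete, here is a minimal sketch of cross-fitted partialling-out for a partially linear model, run on simulated data. It is not the authors' implementation: the ridge nuisance learner, the function names `ridge_fit_predict`, `dml_plr`, and `simulate`, and the data-generating process are all assumptions made for illustration, and the hierarchical Bayesian layer is left out.

```python
import numpy as np


def ridge_fit_predict(X_train, y_train, X_test, alpha=1.0):
    """Ridge regression with an intercept; a stand-in nuisance learner (assumption)."""
    mu_x = X_train.mean(axis=0)
    mu_y = y_train.mean()
    Xc = X_train - mu_x
    A = Xc.T @ Xc + alpha * np.eye(Xc.shape[1])
    beta = np.linalg.solve(A, Xc.T @ (y_train - mu_y))
    return mu_y + (X_test - mu_x) @ beta


def dml_plr(y, d, X, n_folds=5, seed=0):
    """Cross-fitted partialling-out estimate of theta in y = theta * d + g(X) + e."""
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    y_res = np.empty(n)
    d_res = np.empty(n)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        # Nuisance functions are fit on the other folds, then residualized here.
        y_res[test_idx] = y[test_idx] - ridge_fit_predict(X[train], y[train], X[test_idx])
        d_res[test_idx] = d[test_idx] - ridge_fit_predict(X[train], d[train], X[test_idx])
    theta = (d_res @ y_res) / (d_res @ d_res)
    # Standard error based on the orthogonal score (y_res - theta * d_res) * d_res.
    psi = (y_res - theta * d_res) * d_res
    se = np.sqrt(np.mean(psi ** 2) / np.mean(d_res ** 2) ** 2 / n)
    return theta, se


def simulate(n=4000, p=5, theta=0.5, seed=1):
    """Simulated data in which the controls X drive both supply d and bookings y."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, p))
    d = X @ np.linspace(0.5, 1.0, p) + rng.normal(size=n)
    y = theta * d + X @ np.linspace(1.0, -1.0, p) + rng.normal(size=n)
    return y, d, X


y, d, X = simulate()
theta_hat, se_hat = dml_plr(y, d, X)
print(f"theta_hat={theta_hat:.3f}, se={se_hat:.3f}")
```

In this simulation the confounding runs entirely through observed controls, which is exactly the condition the referee report questions for real marketplace data.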

If this is right

The estimates can be used to assess returns to additional supply in different product segments while borrowing strength from similar segments.
The same combination of methods can be applied to other two-sided marketplaces to measure supply incrementality.
Geospatial similarity features enable the construction of informative inputs even when direct data on segment relationships is limited.
Strong out-of-sample results support using the model for counterfactual evaluation of supply changes.
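
The geospatial similarity idea in the list above can be sketched as distance-based weights between segments. The abstract does not say which similarity measures the authors use, so everything here is an assumption: the Gaussian kernel, the bandwidth, the function names `haversine_km` and `similarity_weights`, and the made-up segment centroids and listing counts.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat, lon):
    """Pairwise great-circle distances in km between points given in degrees."""
    lat = np.radians(np.asarray(lat, dtype=float))
    lon = np.radians(np.asarray(lon, dtype=float))
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


def similarity_weights(dist_km, bandwidth_km=50.0):
    """Gaussian-kernel weights with a zero diagonal; each row is normalized to sum to 1."""
    w = np.exp(-0.5 * (dist_km / bandwidth_km) ** 2)
    np.fill_diagonal(w, 0.0)
    return w / w.sum(axis=1, keepdims=True)


# Hypothetical segment centroids (degrees) and listing counts, for illustration only.
lat = [48.8566, 48.8049, 45.7640, 43.2965]
lon = [2.3522, 2.1204, 4.8357, 5.3698]
supply = np.array([1200.0, 300.0, 800.0, 600.0])

D = haversine_km(lat, lon)
W = similarity_weights(D, bandwidth_km=200.0)

# Similarity-weighted average of supply in the other segments: one candidate feature.
neighbor_supply = W @ supply
print(np.round(neighbor_supply, 1))
```

A row-normalized weight matrix keeps the derived feature on the same scale as the underlying variable, which makes coefficients easier to compare across segments with different numbers of neighbors.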

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the priors turn out to be poorly calibrated for certain segments, the shared estimation could produce misleading results, pointing to the value of robustness checks on prior sensitivity.
The geospatial feature construction could be replaced or augmented with other similarity metrics from network or text data to test whether the performance gains generalize.
The method's ability to handle heterogeneous products suggests it could inform supply allocation rules that target segments with higher incremental returns.

Load-bearing premise

The hierarchical Bayesian framework depends on pre-existing knowledge supplying informative and appropriate priors so that estimation can be shared across product segments.
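
One textbook way to see why this premise is load-bearing is the normal-normal case sketched below. It is an illustration, not the paper's model: the symbols $\hat\theta_s$ (a segment-level estimate with standard error $\sigma_s$), $\mu_0$ (prior mean), and $\tau$ (prior standard deviation) are assumptions introduced here.

```latex
\hat\theta_s \mid \theta_s \sim \mathcal{N}(\theta_s, \sigma_s^2), \qquad
\theta_s \sim \mathcal{N}(\mu_0, \tau^2)
\;\Longrightarrow\;
\mathbb{E}[\theta_s \mid \hat\theta_s] = w_s\,\hat\theta_s + (1 - w_s)\,\mu_0, \qquad
w_s = \frac{\tau^2}{\tau^2 + \sigma_s^2}
```

When a segment is small, $\sigma_s$ is large, $w_s$ is close to zero, and the posterior mean sits near $\mu_0$, so a poorly chosen prior passes almost directly into the estimate.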

What would settle it

Applying the fitted model to a hold-out Airbnb dataset and finding that its predictions for changes in total bookings after observed supply increases deviate substantially from the actual outcomes would indicate the claimed out-of-sample performance does not hold.
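
A check of this kind could be sketched as follows. The abstract does not define the hold-out scheme, metric, or baseline, so the time-ordered split, the RMSE metric, the no-change baseline, the function names, and all numbers below are assumptions for illustration.

```python
import numpy as np


def time_ordered_split(periods, holdout_fraction=0.25):
    """Return boolean (train, test) masks that place the latest periods in the test set."""
    periods = np.asarray(periods)
    cutoff = np.quantile(np.unique(periods), 1.0 - holdout_fraction)
    test = periods > cutoff
    return ~test, test


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    diff = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def holdout_report(actual, predicted):
    """Compare model predictions with a no-change baseline on held-out rows."""
    actual = np.asarray(actual, dtype=float)
    model_rmse = rmse(actual, predicted)
    baseline_rmse = rmse(actual, np.zeros_like(actual))
    return {
        "model_rmse": model_rmse,
        "baseline_rmse": baseline_rmse,
        "skill": 1.0 - model_rmse / baseline_rmse,
    }


# Made-up data: changes in bookings after observed supply increases.
periods = np.array([1, 1, 2, 2, 3, 3, 4, 4])
actual_change = np.array([3.0, 5.0, 2.0, 6.0, 10.0, 4.0, 7.0, 1.0])
# Stands in for predictions from a model fitted on the training periods only.
predicted_change = np.array([3.5, 4.0, 2.5, 5.0, 8.0, 5.0, 6.0, 2.0])

train, test = time_ordered_split(periods)
report = holdout_report(actual_change[test], predicted_change[test])
print(report)
```

A skill value near zero or below would mean the model does no better than assuming supply changes have no effect, which is the kind of result that would undercut the out-of-sample claim.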

Figures

Figures reproduced from arXiv: 2606.30999 by Daniel Schmierer, Dan Zylberglejd, Yufei Wu.

**Figure 2.** Prior vs Posterior for Interaction Coefficient Estimates

**Figure 3.** Prior vs Posterior Estimates at Product Group Level

Original abstract

In two-sided marketplaces with heterogeneous products, it is important to understand the causal relationship between additional supply and marketplace outcomes, such as the total quantity transacted or transaction value in the marketplace. This paper studies a causal machine learning approach to estimating this relationship across product segments. We use the Airbnb marketplace as an example, focusing on the impact of additional listing supply on total bookings, but the methodology applies to other two-sided marketplaces. Our approach combines double/debiased machine learning with a hierarchical Bayesian framework that leverages pre-existing knowledge as priors. We construct tractable and informative features for the model by leveraging measures of product segment similarity from the geospatial literature. We find that such a model provides plausible estimates of the marketplace returns to additional supply and strong out of sample performance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper applies double/debiased ML plus hierarchical Bayes with geospatial features to estimate supply incrementality on Airbnb, but the abstract gives no equations or checks, so the causal claims stay unverified.

The main thing here is a practical extension of double/debiased machine learning to estimate how extra listings affect total bookings across product segments in a two-sided marketplace. The authors add a hierarchical Bayesian layer that borrows strength via priors and builds features from geospatial segment similarity. That combination is new for this exact task even if the pieces are standard elsewhere.

The approach handles heterogeneity across segments without forcing a single global parameter, and the claim of strong out-of-sample performance suggests the model is at least stable on held-out data. Using pre-existing knowledge as priors is a reasonable way to make the estimates more tractable when segments are small.

The soft spots are straightforward. The abstract contains no equations, no description of the exact identifying variation, and no robustness numbers, so we cannot tell whether the DML orthogonality condition actually holds once geospatial features are included. If supply decisions respond to unobserved local demand shocks that the similarity measures miss, the reported marketplace returns will be biased. This endogeneity concern, the one the stress test raises, lands because nothing in the provided text rules it out. "Plausible estimates" is also too vague to evaluate without seeing the actual magnitudes or comparisons to simpler baselines.

This is for applied researchers who already work on causal inference for platforms and need a template for segment-level supply effects. A reader who wants to adapt the method to their own marketplace data could extract useful implementation ideas, but anyone looking for new identification theory or broad external validity will find little.

The work deserves peer review. The application is relevant and the methods are grounded enough that referees can check the identification strategy and validation once the full details are in front of them.

Referee Report

2 major / 2 minor

Summary. The paper proposes combining double/debiased machine learning (DML) with a hierarchical Bayesian model that incorporates pre-existing knowledge as priors and geospatial segment-similarity features to estimate the causal effect of additional supply (listings) on marketplace outcomes such as total bookings in two-sided platforms, using Airbnb as the running example. It claims the resulting estimates are plausible and exhibit strong out-of-sample performance across product segments.

Significance. If the causal identification holds after nuisance estimation, the approach would offer a practical way to pool information across heterogeneous segments while respecting domain structure, which could be useful for supply-side policy in marketplaces. The use of external geospatial priors and hierarchical shrinkage is a concrete strength that distinguishes it from off-the-shelf DML applications.

major comments (2)

[Methods / Identification strategy] The central claim that DML recovers unbiased marketplace returns to supply rests on the orthogonality condition after nuisance estimation. The manuscript provides no explicit discussion or robustness evidence that the constructed geospatial similarity features (or the hierarchical priors) are sufficient to block segment-specific unobserved demand shocks that could jointly determine listing supply and bookings; without such evidence the reported estimates remain vulnerable to the endogeneity concern raised in the stress test.
[Hierarchical Bayesian framework] The hierarchical Bayesian component is described as leveraging 'pre-existing knowledge as priors,' yet the paper does not report sensitivity checks to prior specification or the degree of shrinkage across segments. If the priors are weakly informative or misspecified for certain product segments, the pooled estimates could be driven by the prior rather than the data, undermining the claim of plausible causal effects.

minor comments (2)

[Abstract / Results] The abstract states 'strong out of sample performance' without defining the metric, the hold-out scheme, or the baseline comparator; this should be clarified with concrete numbers and a table in the results section.
[Methods] Notation for the DML moment condition and the hierarchical prior should be introduced with explicit equations rather than prose descriptions to allow readers to verify the orthogonality construction.
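
For reference, the standard partially linear setup and partialling-out score from the double/debiased machine learning literature can be written as below. Whether the manuscript uses exactly this score is not stated in the abstract; the mapping of $Y$ to bookings, $D$ to listing supply, and $X$ to controls (including the geospatial features) is an assumption made here.

```latex
Y = \theta_0 D + g_0(X) + U, \quad \mathbb{E}[U \mid X, D] = 0, \qquad
D = m_0(X) + V, \quad \mathbb{E}[V \mid X] = 0

\psi(W; \theta, \eta) = \bigl(Y - \ell(X) - \theta\,(D - m(X))\bigr)\,(D - m(X)),
\qquad \eta = (\ell, m), \quad \ell_0(X) = \mathbb{E}[Y \mid X]

\mathbb{E}\bigl[\psi(W; \theta_0, \eta_0)\bigr] = 0, \qquad
\partial_\eta\, \mathbb{E}\bigl[\psi(W; \theta_0, \eta)\bigr]\Big|_{\eta = \eta_0} = 0
```

The last condition is Neyman orthogonality: small errors in the estimated nuisance functions $\ell$ and $m$ do not shift the estimate of $\theta_0$ to first order. It does not protect against confounders that are missing from $X$.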

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive report. The comments highlight important areas for strengthening the identification discussion and the hierarchical Bayesian component. We address each point below and outline revisions that will be incorporated in the next version of the manuscript.

Referee: [Methods / Identification strategy] The central claim that DML recovers unbiased marketplace returns to supply rests on the orthogonality condition after nuisance estimation. The manuscript provides no explicit discussion or robustness evidence that the constructed geospatial similarity features (or the hierarchical priors) are sufficient to block segment-specific unobserved demand shocks that could jointly determine listing supply and bookings; without such evidence the reported estimates remain vulnerable to the endogeneity concern raised in the stress test.

Authors: We agree that an explicit discussion of how the geospatial segment-similarity features support the orthogonality condition would improve the manuscript. The DML framework relies on the nuisance estimators (including the geospatial features) being sufficiently rich to capture confounding; our features are constructed from established geospatial similarity measures precisely to proxy for shared demand factors across segments. The stress test already provides supporting evidence against certain endogeneity patterns. We will revise the identification section to include a dedicated subsection clarifying these assumptions, the role of the features in blocking segment-specific demand shocks, and additional robustness checks that leverage the stress-test design.
Revision: yes
Referee: [Hierarchical Bayesian framework] The hierarchical Bayesian component is described as leveraging 'pre-existing knowledge as priors,' yet the paper does not report sensitivity checks to prior specification or the degree of shrinkage across segments. If the priors are weakly informative or misspecified for certain product segments, the pooled estimates could be driven by the prior rather than the data, undermining the claim of plausible causal effects.

Authors: We acknowledge that the absence of reported sensitivity analyses to prior choice and shrinkage is a gap. While the hierarchical structure is intended to let the data dominate via partial pooling, explicit checks are needed to demonstrate robustness. We will add a new subsection (or appendix) reporting sensitivity to alternative prior specifications (e.g., varying hyperprior strength and informativeness) and quantitative measures of shrinkage (posterior vs. segment-specific estimates) across product segments. These will be presented alongside the main results to confirm that the reported effects are data-driven.
Revision: yes
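
A prior-sensitivity check of the kind promised here could look like the sketch below. It uses a simple normal-normal update rather than the paper's hierarchical model, and the function name `shrink`, the prior mean, the prior scales, and the segment estimates are all made-up assumptions.

```python
import numpy as np


def shrink(estimate, se, prior_mean, prior_sd):
    """Normal-normal posterior mean and the weight it places on the data."""
    weight = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    posterior_mean = weight * estimate + (1.0 - weight) * prior_mean
    return posterior_mean, weight


# Hypothetical segment-level estimates and standard errors (illustration only).
estimates = np.array([0.80, 0.20, 0.50])
ses = np.array([0.05, 0.30, 0.10])
prior_mean = 0.40

results = {}
for prior_sd in (0.05, 0.20, 1.00):
    posterior_mean, weight = shrink(estimates, ses, prior_mean, prior_sd)
    results[prior_sd] = (posterior_mean, weight)
    print(f"prior_sd={prior_sd:.2f}  "
          f"posterior={np.round(posterior_mean, 3)}  "
          f"data_weight={np.round(weight, 3)}")
```

If the posterior means move substantially across plausible prior scales, particularly for segments with large standard errors, the reported effects are prior-driven rather than data-driven in those segments.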

Circularity Check

0 steps flagged

No circularity; standard causal ML pipeline with external priors and features.

full rationale

The described approach combines double/debiased ML with a hierarchical Bayesian model that uses pre-existing knowledge as priors and constructs features from the geospatial literature. No equations, fitted parameters, or self-citations are shown that reduce the claimed marketplace returns or out-of-sample performance to inputs by construction. The derivation chain remains self-contained against external benchmarks and does not invoke load-bearing self-citations or ansatzes.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Abstract-only; relies on standard causal ML assumptions and the validity of pre-existing knowledge as Bayesian priors. No free parameters, new entities, or ad-hoc axioms explicitly listed.

axioms (2)

Domain assumption: Double/debiased machine learning assumptions hold for identifying causal effects of supply. Invoked by the choice of DML method in the abstract.
Domain assumption: Pre-existing knowledge provides informative and appropriate priors for the hierarchical model across segments. Stated directly in the abstract as part of the framework.

pith-pipeline@v0.9.1-grok · 5668 in / 1084 out tokens · 31109 ms · 2026-07-01T00:58:38.388200+00:00 · methodology

