
arxiv: 2606.26690 · v1 · pith:EKCW4QOQnew · submitted 2026-06-25 · 💻 cs.IR · cs.LG

Attributed, But Not Incremental: Cannibalization-Corrected Attribution for Large-Scale Advertising

Pith reviewed 2026-06-26 03:11 UTC · model grok-4.3

classification 💻 cs.IR cs.LG
keywords: attribution correction, cannibalization, incrementality experiments, advertising measurement, budget allocation, causal calibration, structural constraints

The pith

An experiment-calibrated framework converts sparse lift measurements into daily attribution corrections that reduce calibration error and measured cannibalization.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that raw paid attribution overstates true incremental conversions when paid channels overlap with organic demand, brand-driven traffic, or other acquisition channels, and that this distorts ROI measurement and budget choices at scale. The proposed method anchors corrections on incrementality experiments, turns those sparse lift measurements into daily estimates, and distributes the calibrated cannibalization volume across business hierarchies while preserving structural consistency, which is what makes the corrected signal usable at the granularity of daily decisions. Offline forward-in-time checks against channel-level experiment readouts show lower calibration error than raw attribution or fine-grained ML baselines. Deployment across multiple global TikTok markets was followed by an approximately 15-percentage-point drop in the measured cannibalization rate.

Core claim

The central claim is that an experiment-calibrated attribution correction framework substantially reduces calibration error relative to raw attribution and fine-grained ML baselines. It does so by converting sparse lift measurements into daily correction estimates and allocating calibrated cannibalization volume across business hierarchies under structural consistency constraints. When deployed, the system supported budget and traffic adjustments that were followed by an approximately 15-percentage-point reduction in the measured cannibalization rate.

What carries the argument

The experiment-calibrated attribution correction framework that allocates calibrated cannibalization volume across business hierarchies under structural consistency constraints.
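To make the idea of a structurally consistent allocation concrete, here is a minimal sketch. It is not the paper's allocation rule, which the abstract does not specify. It assumes a simple proportional split, where a parent node's calibrated cannibalization volume is divided among its children in proportion to their attributed conversions. The function name `allocate_to_children` and the campaign figures are hypothetical.

```python
from typing import Dict


def allocate_to_children(parent_volume: float,
                         child_attributed: Dict[str, float]) -> Dict[str, float]:
    """Split a parent's cannibalization volume across its children.

    Assumed rule (not taken from the paper): each child receives a share
    proportional to its attributed conversions. This keeps two structural
    properties: child volumes sum to the parent volume, and no child is
    assigned more cannibalization than it was attributed.
    """
    if any(volume < 0 for volume in child_attributed.values()):
        raise ValueError("attributed volumes must be non-negative")
    total = sum(child_attributed.values())
    if parent_volume < 0 or parent_volume > total:
        raise ValueError("parent volume must lie between 0 and total attributed volume")
    if total == 0:
        # Nothing was attributed, so there is nothing to allocate.
        return {name: 0.0 for name in child_attributed}
    return {name: parent_volume * volume / total
            for name, volume in child_attributed.items()}


# Hypothetical channel with 300 cannibalized conversions and three campaigns.
campaigns = {"campaign_a": 600.0, "campaign_b": 300.0, "campaign_c": 100.0}
allocation = allocate_to_children(300.0, campaigns)
print(allocation)
```

A proportional split is the simplest rule that satisfies both constraints at once. A production system would likely apply different weights per child, but the sum-to-parent check would stay the same.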

If this is right

  • Corrected daily attribution signals support more accurate budget allocation and channel diagnosis than raw outputs.
  • The framework produces lower calibration error than both uncorrected attribution and fine-grained ML baselines in offline forward validation.
  • Structural consistency constraints keep the allocated correction volumes coherent across business hierarchies.
  • Deployment of the corrected signals enables strategy adjustments that reduce measured cannibalization rates.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the extrapolation step holds across markets, the method could lower the frequency of full-scale experiments needed for ongoing measurement.
  • The approach implies that cannibalization bias is large enough in multi-channel systems to justify explicit correction layers rather than relying on model complexity alone.
  • Similar sparse-anchor calibration could be tested in other domains where attribution or measurement systems face overlapping signals.

Load-bearing premise

Sparse lift measurements from incrementality experiments can be reliably extrapolated into daily correction estimates across all periods and hierarchies without material bias from unmodeled temporal or contextual shifts.
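The premise is easier to evaluate with a concrete extrapolation rule in hand. The sketch below is an assumption, not the paper's estimator. It carries the most recent experiment's incrementality ratio (incremental conversions divided by attributed conversions) forward to every following day until a newer experiment replaces it. The function names, dates, and ratios are hypothetical.

```python
from datetime import date, timedelta
from typing import Dict, List, Tuple


def daily_incrementality_ratios(experiments: List[Tuple[date, float]],
                                start: date,
                                end: date) -> Dict[date, float]:
    """Carry each experiment's ratio forward until the next readout.

    `experiments` holds (readout_date, ratio) pairs. Days before the first
    readout get no ratio, because there is no causal anchor for them yet.
    """
    ordered = sorted(experiments)
    ratios: Dict[date, float] = {}
    current = None
    idx = 0
    day = start
    while day <= end:
        # Advance to the latest readout on or before this day.
        while idx < len(ordered) and ordered[idx][0] <= day:
            current = ordered[idx][1]
            idx += 1
        if current is not None:
            ratios[day] = current
        day += timedelta(days=1)
    return ratios


def corrected_daily(attributed: Dict[date, float],
                    ratios: Dict[date, float]) -> Dict[date, float]:
    """Scale attributed conversions by the ratio in force on each day."""
    return {d: attributed[d] * ratios[d] for d in attributed if d in ratios}


# Hypothetical readouts: two experiments, fifteen days apart.
readouts = [(date(2026, 1, 20), 0.75), (date(2026, 1, 5), 0.6)]
ratios = daily_incrementality_ratios(readouts, date(2026, 1, 1), date(2026, 1, 25))
corrected = corrected_daily({date(2026, 1, 10): 1000.0}, ratios)
```

Under this rule, any seasonality or strategy change between readouts goes straight into the daily estimates as bias. That is exactly the failure mode the premise has to rule out.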

What would settle it

A new set of channel-level incrementality experiments run after the correction model is locked, with forward-in-time comparison of corrected attribution against the fresh lift readouts to check whether calibration error stays low.
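A check of this kind could be scored as follows. This is a sketch under assumptions: the abstract does not name the paper's calibration-error metric, so mean absolute percentage error stands in for it here. The function name `mean_abs_pct_error` and all numbers are hypothetical.

```python
from typing import Sequence


def mean_abs_pct_error(predicted: Sequence[float],
                       measured: Sequence[float]) -> float:
    """Mean absolute percentage error of predictions against lift readouts.

    Readouts equal to zero are skipped, since the percentage error is
    undefined for them.
    """
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have the same length")
    pairs = [(p, m) for p, m in zip(predicted, measured) if m != 0]
    if not pairs:
        raise ValueError("no non-zero readouts to score against")
    return sum(abs(p - m) / abs(m) for p, m in pairs) / len(pairs)


# Hypothetical fresh experiments, run after the correction model is locked.
measured_incremental = [500.0, 800.0]   # lift readouts from the new experiments
raw_attributed = [1000.0, 1000.0]       # uncorrected attribution, same windows
corrected_attributed = [550.0, 760.0]   # output of the locked correction model

raw_error = mean_abs_pct_error(raw_attributed, measured_incremental)
corrected_error = mean_abs_pct_error(corrected_attributed, measured_incremental)
print(f"raw: {raw_error:.3f}, corrected: {corrected_error:.3f}")
```

The comparison only settles the question if the correction model is frozen before the new experiments start. Otherwise the readouts can leak into the estimates they are meant to test.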

Figures

Figures reproduced from arXiv: 2606.26690 by Bowen Yuan, Donghui Li, Lijing Song, Qinxin Chen, Zili Yang.

Figure 1: Attribution-cannibalization mismatch and cor…
Figure 2: Overview of the proposed ETDC+HCA framework.
Figure 3: Operational locality validation under a local pol…
Original abstract

In large-scale paid acquisition and growth advertising systems, production attribution outputs are widely used for daily budget allocation and channel diagnosis. However, paid-attributed conversions such as daily new users (DNU) may systematically overstate true incremental growth when paid channels overlap with organic demand, brand-driven traffic, or other acquisition channels. This attribution-cannibalization mismatch can distort incremental ROI measurement and budget decisions at scale. We propose an experiment-calibrated attribution correction framework that uses incrementality experiments as causal anchors to convert sparse lift measurements into daily correction estimates. To make the corrected signal actionable at production granularity, we further allocate calibrated cannibalization volume across business hierarchies under structural consistency constraints. Offline forward-in-time validation against channel-level incrementality experiment readouts shows that the proposed framework substantially reduces calibration error relative to raw attribution and fine-grained ML baselines. Deployed across multiple global TikTok markets, the system supported budget and traffic strategy adjustments that were followed by an approximately 15-percentage-point reduction in the measured cannibalization rate.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper proposes an experiment-calibrated attribution correction framework for large-scale advertising that converts sparse incrementality experiment lift measurements into daily cannibalization correction estimates and allocates the corrected volume across business hierarchies under structural consistency constraints. Offline forward-in-time validation against channel-level incrementality readouts is reported to show substantially lower calibration error than raw attribution or fine-grained ML baselines; deployment across global TikTok markets is claimed to have supported strategy adjustments followed by an approximately 15-percentage-point reduction in measured cannibalization rate.

Significance. If the central mapping from sparse lifts to stable daily corrections holds without material bias, the framework would provide a practical, production-scale method for aligning attribution outputs with incremental outcomes in paid acquisition systems, directly improving budget allocation and ROI diagnostics where cannibalization from organic/brand overlap is common.

major comments (2)
  1. [Abstract] Abstract (validation paragraph): The forward-in-time validation is anchored to the same sparse incrementality experiment readouts used to derive the daily corrections; this setup cannot detect systematic bias arising from temporal drift, seasonality, or unmeasured contextual shifts between experiment windows and production periods, leaving the claim of substantially reduced calibration error vulnerable to the extrapolation assumption identified in the stress-test note.
  2. [Abstract] Abstract (deployment paragraph): The reported 15-percentage-point reduction in measured cannibalization rate is presented as a deployment outcome, but without details on the measurement protocol, exclusion rules, or controls for concurrent changes in traffic strategy, it is unclear whether the reduction can be attributed to the corrected attribution signal rather than other operational adjustments.
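Part of the measurement-protocol question is how the rate itself is defined. The abstract does not say, so the following is one plausible definition, stated as an assumption. Here $A$ is paid-attributed conversions and $I$ is the incremental conversions measured by experiment over the same window.

```latex
% Assumed definition, not taken from the paper
c = 1 - \frac{I}{A}, \qquad 0 \le I \le A

% A percentage-point change is an absolute difference between two rates
\Delta c = c_{\text{before}} - c_{\text{after}}

% Illustrative numbers only: a drop from c = 0.40 to c = 0.25
\Delta c = 0.40 - 0.25 = 0.15 \quad (\text{15 percentage points}), \qquad
\frac{\Delta c}{c_{\text{before}}} = \frac{0.15}{0.40} = 37.5\%
```

Under this definition the same 15-point drop corresponds to very different relative improvements depending on the starting rate. This is one reason the baseline and measurement window matter for interpreting the deployment claim.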

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below, with proposed revisions where the concerns identify areas for clarification or qualification.

Point-by-point responses
  1. Referee: [Abstract] Abstract (validation paragraph): The forward-in-time validation is anchored to the same sparse incrementality experiment readouts used to derive the daily corrections; this setup cannot detect systematic bias arising from temporal drift, seasonality, or unmeasured contextual shifts between experiment windows and production periods, leaving the claim of substantially reduced calibration error vulnerable to the extrapolation assumption identified in the stress-test note.

    Authors: We thank the referee for this observation. The forward-in-time validation uses a temporal split of the available experiment readouts, deriving corrections from earlier windows and evaluating against later ones to approximate prospective application. We acknowledge that this design cannot fully rule out bias from unmeasured temporal or contextual shifts, consistent with the extrapolation assumption already flagged in the stress-test note. We will revise the validation discussion and abstract to explicitly state this limitation and temper the strength of the calibration-error claim accordingly.
    Revision: yes

  2. Referee: [Abstract] Abstract (deployment paragraph): The reported 15-percentage-point reduction in measured cannibalization rate is presented as a deployment outcome, but without details on the measurement protocol, exclusion rules, or controls for concurrent changes in traffic strategy, it is unclear whether the reduction can be attributed to the corrected attribution signal rather than other operational adjustments.

    Authors: We agree that the abstract deployment claim requires qualification to avoid over-attribution. The manuscript's deployment section describes the measurement approach, but the abstract is brief. We will revise the abstract paragraph to note that the reduction was observed following strategy adjustments informed by the framework, while acknowledging that concurrent operational changes may have contributed, and to direct readers to the full deployment details.
    Revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation anchored on external experiments

Full rationale

The framework converts sparse lift measurements from incrementality experiments into daily correction estimates and allocates under structural constraints. These experiments serve as independent causal anchors rather than self-derived inputs. Offline forward-in-time validation uses the same external readouts as benchmarks, and deployment results reference measured cannibalization reductions outside the model's fitted values. No self-definitional equations, fitted inputs renamed as predictions, or load-bearing self-citations appear in the derivation chain. The central claim remains falsifiable against external experiment data and does not reduce to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no free parameters, axioms, or invented entities are specified in the provided text.


