Online activity prediction via generalized Indian buffet process models

Lorenzo Masoero; Mario Beraha; Stefano Favaro; Thomas S. Richardson


Online A/B tests are the standard tool for data-driven decision-making at scale. Among the design choices with the largest impact on statistical power is the triggering mechanism: how many users to expose and for how long. This often requires forecasting user engagement, i.e., whether enough users will trigger, and when a target participation level will be reached, from limited pilot data. We introduce a Bayesian nonparametric model for predicting both new-user counts and total triggers, accommodating the heavy-tailed engagement patterns typical of web experiments. All predictive quantities can be computed without intensive numerical procedures such as MCMC or variational inference. We evaluate on three public datasets (over 450 public benchmark evaluations) and 1,774 proprietary A/B tests. In all settings, our models show improved accuracy in forecasting new users, total triggers, and time to reach a target sample size compared with state-of-the-art competitors, especially when only a few pilot days are observed.

