Always Valid Inference: Bringing Sequential Analysis to A/B Testing

David J. Walsh; Leo Pekelis; Ramesh Johari

arxiv: 1512.04922 · v3 · pith:DE5LPOVPnew · submitted 2015-12-15 · 🧮 math.ST · stat.AP · stat.ME · stat.TH

classification 🧮 math.ST · stat.AP · stat.ME · stat.TH

keywords valid · always · inference · p-values · sequential · testing · users · confidence

A/B tests are typically analyzed via frequentist p-values and confidence intervals, but these inferences are wholly unreliable if users endogenously choose sample sizes by *continuously monitoring* their tests. We define *always valid* p-values and confidence intervals that let users take advantage of data as fast as it becomes available, providing valid statistical inference whenever they make their decision. Always valid inference can be interpreted as a natural interface for a sequential hypothesis test, which empowers users to implement a modified test tailored to them. In particular, we show in an appropriate sense that the measures we develop trade off sample size and power efficiently, despite a lack of prior knowledge of the user's relative preference between these two goals. We also use always valid p-values to obtain multiple hypothesis testing control in the sequential context. Our methodology has been implemented in a large-scale commercial A/B testing platform to analyze hundreds of thousands of experiments to date.
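
To make the idea of an always valid p-value concrete, here is a minimal sketch in Python. It assumes a particular construction that the abstract does not spell out: a mixture sequential probability ratio test for normally distributed observations with known variance, with a normal mixing distribution centered on the null value. The function names and the parameters `theta0`, `sigma2`, `tau2`, and `alpha` are illustrative choices, not taken from the abstract.

```python
import math


def always_valid_p_values(xs, theta0=0.0, sigma2=1.0, tau2=1.0):
    """Running p-values from a mixture SPRT (illustrative assumption).

    xs      : observations, assumed i.i.d. normal with known variance sigma2
    theta0  : mean under the null hypothesis
    tau2    : variance of the normal mixing distribution centered at theta0
    """
    p = 1.0
    total = 0.0
    out = []
    for n, x in enumerate(xs, start=1):
        total += x
        mean = total / n
        denom = sigma2 + n * tau2
        # Log of the mixture likelihood ratio after n observations.
        log_lr = 0.5 * math.log(sigma2 / denom) + (
            n * n * tau2 * (mean - theta0) ** 2
        ) / (2.0 * sigma2 * denom)
        # The p-value is the running minimum of 1 / likelihood ratio,
        # so it never increases as more data arrive.
        p = min(p, math.exp(-log_lr))
        out.append(p)
    return out


def first_rejection(p_values, alpha=0.05):
    """1-based index of the first look at which p <= alpha, or None."""
    for n, p in enumerate(p_values, start=1):
        if p <= alpha:
            return n
    return None
```

Because the p-value is a running minimum, a user can check it after every observation and stop at the first look where it falls to `alpha` or below; under the stated assumptions, the chance of ever doing so when the null is true stays at most `alpha`, which is the property that fixed-sample p-values lose under continuous monitoring.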

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Simultaneous false discovery rate control in location families
stat.ME · 2026-05 · unverdicted · novelty 7.0

A generalization of the Benjamini-Hochberg procedure controls the FDR curve below any specified level in location families, and the standard procedure simultaneously controls the entire curve for free.