arxiv: 2602.21479 · v2 · pith:MDR65RNEnew · submitted 2026-02-25 · 📊 stat.ML · cs.LG

Global Sequential Testing for Multi-Stream Auditing

classification 📊 stat.ML cs.LG
keywords alpha · frac · sequential · data · global · left · right · streams

Across many risk-sensitive areas, it is critical to continuously audit machine learning systems as we receive more data to quickly determine if they are performing as designed. This auditing task can be modeled as a sequential hypothesis testing problem with $k$ data streams and a global null hypothesis that asserts the system operates as intended across all $k$ streams. Under the alternative, the standard global sequential test, which uses a Bonferroni correction, has an expected stopping time of $O\left(\ln \frac{k}{\alpha}\right)$ for large $k$ and significance level $\alpha$. In this work, we demonstrate that efficient sequential tests, relying on merging martingales via averaging and product rules, provide improved stopping times, and thus more powerful tests against the null. Using these results, we show that a balanced test can match the Bonferroni rate of $O\left(\ln \frac{k}{\alpha}\right)$ in the sparse regime (just a few non-null streams) while achieving $O\left(\frac{1}{k}\ln \frac{1}{\alpha}\right)$ under dense alternatives (many non-null streams). We validate our theory through experiments on both synthetic and real-world data.
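To make the three stopping rules named in the abstract concrete, here is a minimal sketch that tracks one test martingale per stream and compares a Bonferroni rule with averaging and product merging. It is an illustration under stated assumptions, not the paper's construction: the function name `global_stopping_times`, the Gaussian likelihood-ratio martingale, and the parameter `mu` are all hypothetical choices made here, and the product rule additionally assumes the streams are independent. The balanced test from the abstract is not implemented, since the abstract does not specify it.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def global_stopping_times(streams, alpha=0.05, mu=0.5):
    """Return the first rejection time of each rule (None if it never rejects).

    Assumption (not from the source): each stream is N(0, 1) under the null,
    and its test martingale is the likelihood ratio of N(mu, 1) against
    N(0, 1), whose log increment for an observation x is mu*x - mu^2/2.
    """
    k = len(streams)
    horizon = min(len(s) for s in streams)
    log_wealth = [0.0] * k                 # log of each per-stream martingale
    bonferroni_bar = math.log(k / alpha)   # each stream tested at level alpha/k
    merged_bar = math.log(1.0 / alpha)     # Ville threshold for one martingale
    stops = {"bonferroni": None, "average": None, "product": None}

    for t in range(1, horizon + 1):
        for i in range(k):
            x = streams[i][t - 1]
            log_wealth[i] += mu * x - 0.5 * mu * mu

        margins = {
            # Reject if any single martingale exceeds k/alpha.
            "bonferroni": max(log_wealth) - bonferroni_bar,
            # Reject if the average of the martingales exceeds 1/alpha.
            "average": log_sum_exp(log_wealth) - math.log(k) - merged_bar,
            # Reject if the product exceeds 1/alpha (needs independent streams).
            "product": sum(log_wealth) - merged_bar,
        }
        for rule, margin in margins.items():
            if stops[rule] is None and margin >= 0.0:
                stops[rule] = t
    return stops


if __name__ == "__main__":
    dense = [[1.0] * 20 for _ in range(2)]                # every stream drifts
    sparse = [[1.0] * 20] + [[0.0] * 20 for _ in range(3)]  # one stream drifts
    print(global_stopping_times(dense, alpha=0.05, mu=1.0))
    print(global_stopping_times(sparse, alpha=0.05, mu=1.0))
```

In this toy setup the product rule stops earliest when every stream carries signal, because evidence accumulates across streams, but it never rejects in the sparse example, where the null streams drag the product down. The averaging rule can never stop later than Bonferroni, since the average of $k$ nonnegative martingales is at least their maximum divided by $k$. That trade-off is the kind of gap a balanced test would aim to close.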
