Active Statistical Inference

2024 · stat.ML · arXiv 2403.03208

4 Pith papers cite this work. Polarity classification is still indexing.

abstract

Inspired by the concept of active learning, we propose active inference: a methodology for statistical inference with machine-learning-assisted data collection. Assuming a budget on the number of labels that can be collected, the methodology uses a machine learning model to identify which data points would be most beneficial to label, thus effectively utilizing the budget. It operates on a simple yet powerful intuition: prioritize the collection of labels for data points where the model exhibits uncertainty, and rely on the model's predictions where it is confident. Active inference constructs provably valid confidence intervals and hypothesis tests while leveraging any black-box machine learning model and handling any data distribution. The key point is that it achieves the same level of accuracy with far fewer samples than existing baselines relying on non-adaptively-collected data. This means that for the same number of collected samples, active inference enables smaller confidence intervals and more powerful p-values. We evaluate active inference on datasets from public opinion research, census analysis, and proteomics.
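
The intuition in the abstract (label where the model is uncertain, trust predictions elsewhere) can be illustrated for the simplest target, a population mean. The following is a minimal sketch, not the paper's implementation: it assumes a fixed, pre-trained model, a batch (non-sequential) setting, and an inverse-probability-weighted correction with a normal-approximation interval. The function name `active_mean_ci`, the callables `predict`, `uncertainty`, and `label`, and the mixing parameter `tau` are all hypothetical names introduced for illustration; the paper's exact sampling rule and variance estimate may differ.

```python
import math
import random
from statistics import NormalDist


def active_mean_ci(xs, predict, uncertainty, label, budget,
                   alpha=0.1, tau=0.1, seed=0):
    """Sketch of an uncertainty-driven labeling scheme for estimating a mean.

    predict(x)     -> model prediction for x (hypothetical callable)
    uncertainty(x) -> nonnegative uncertainty score for x (hypothetical)
    label(x)       -> true label for x; called only for sampled points
    budget         -> target expected number of labels (budget <= len(xs))
    """
    rng = random.Random(seed)
    n = len(xs)
    u = [max(uncertainty(x), 1e-12) for x in xs]
    total_u = sum(u)

    # Labeling probabilities: mostly proportional to uncertainty, mixed with a
    # uniform rule so that no probability gets arbitrarily close to zero.
    probs = [
        min(1.0, (1 - tau) * budget * ui / total_u + tau * budget / n)
        for ui in u
    ]

    terms = []
    n_labeled = 0
    for x, p in zip(xs, probs):
        f = predict(x)
        if rng.random() < p:
            # Labeled point: correct the prediction, reweighted by 1/p so the
            # term is unbiased for the true label given x.
            n_labeled += 1
            terms.append(f + (label(x) - f) / p)
        else:
            # Unlabeled point: rely on the model's prediction alone.
            terms.append(f)

    est = sum(terms) / n
    var = sum((t - est) ** 2 for t in terms) / (n - 1)
    half = NormalDist().inv_cdf(1 - alpha / 2) * math.sqrt(var / n)
    return est, (est - half, est + half), n_labeled
```

In this sketch, each term has conditional mean equal to the expected label given the features, so the average stays unbiased even when the model is wrong; the model's accuracy affects only the interval width. When the predictions are exact, the correction vanishes and the labels contribute no extra variance.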

representative citing papers

Multi-Armed Bandits With Machine Learning-Generated Surrogate Rewards

math.ST · 2025-06-20 · unverdicted · novelty 7.0

The MLA-UCB algorithm uses ML-generated surrogate rewards from auxiliary data to provably lower cumulative regret in multi-armed bandits, achieving asymptotic optimality under joint Gaussian assumptions without requiring knowledge of the true-surrogate covariance.

Batch-Adaptive Causal Annotations

stat.ML · 2025-02-14 · unverdicted · novelty 6.0

Derives closed-form optimal batch sampling probabilities to minimize asymptotic variance of doubly robust ATE estimator with missing outcomes, achieving lower MSE and matching full-sample precision with 75% fewer labels on simulated and real data.

Industrializing Prediction-Powered Inference: The GLIDE Library for Reliable GenAI and Agentic Systems Evaluation

cs.AI · 2026-05-29 · unverdicted · novelty 3.0

GLIDE is a Python library that packages multiple PPI estimators and samplers for reliable GenAI evaluation and reports annotation savings in an agentic case study.

High-Dimensional Statistics: Reflections on Progress and Open Problems

math.ST · 2026-05-06 · unverdicted · novelty 2.0 · 2 refs

This review synthesizes representative advances in high-dimensional statistics, highlights common themes and open problems, and points to key entry works.

citing papers explorer

Showing 4 of 4 citing papers after filters.

Multi-Armed Bandits With Machine Learning-Generated Surrogate Rewards · math.ST · 2025-06-20 · unverdicted · none · ref 36 · internal anchor
The MLA-UCB algorithm uses ML-generated surrogate rewards from auxiliary data to provably lower cumulative regret in multi-armed bandits, achieving asymptotic optimality under joint Gaussian assumptions without requiring knowledge of the true-surrogate covariance.
Batch-Adaptive Causal Annotations · stat.ML · 2025-02-14 · unverdicted · none · ref 9 · internal anchor
Derives closed-form optimal batch sampling probabilities to minimize asymptotic variance of doubly robust ATE estimator with missing outcomes, achieving lower MSE and matching full-sample precision with 75% fewer labels on simulated and real data.
Industrializing Prediction-Powered Inference: The GLIDE Library for Reliable GenAI and Agentic Systems Evaluation · cs.AI · 2026-05-29 · unverdicted · none · ref 13 · internal anchor
GLIDE is a Python library that packages multiple PPI estimators and samplers for reliable GenAI evaluation and reports annotation savings in an agentic case study.
High-Dimensional Statistics: Reflections on Progress and Open Problems · math.ST · 2026-05-06 · unverdicted · none · ref 109 · 2 links · internal anchor
This review synthesizes representative advances in high-dimensional statistics, highlights common themes and open problems, and points to key entry works.
