Batch Bayesian Active Learning with Partial Batch Label Sampling

Kangping Hu; Stephen Mussmann

arXiv: 2510.09877 · v3 · submitted 2025-10-10 · cs.LG · cs.AI · stat.ML

Pith reviewed 2026-05-18 07:58 UTC · model grok-4.3

classification: cs.LG · cs.AI · stat.ML

keywords: active learning · batch active learning · Bayesian active learning · EPIG · expected predictive information gain · partial label sampling · machine learning

The pith

Partial Batch Label Sampling lets EPIG scale to large batches while preserving performance in Bayesian active learning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper derives Partial Batch Label Sampling (ParBaLS) for the Expected Predictive Information Gain acquisition function by applying a specific formulation of Bayesian Decision Theory. This addresses the scaling problem in batch active learning, where full-batch methods become computationally heavy and simple top-B selection loses effectiveness. Experiments on multiple datasets show that ParBaLS EPIG delivers higher accuracy for a fixed labeling budget when paired with Bayesian logistic regression on fixed embeddings from pre-trained models. A reader would care because labeling data is costly, and better batch selection can reduce the total labels needed to reach a target model quality.

Core claim

Using Bayesian Decision Theory, the authors derive Partial Batch Label Sampling (ParBaLS) for EPIG that approximates batch information gain by considering only partial label realizations within each candidate batch. This yields an acquisition function that avoids both the intractability of exact batch methods like BatchBALD and the performance degradation of greedy top-B selection. Experiments demonstrate that ParBaLS EPIG produces superior test accuracy compared with baselines under a fixed query budget for Bayesian logistic regression on pre-trained embeddings.
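
For orientation, EPIG for a single candidate can be estimated from posterior samples as the mutual information between the candidate's label and the labels of a set of target inputs. The following is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function name `epig_scores`, the array layout `(K, N, C)`, and the `eps` smoothing constant are hypothetical choices made for illustration.

```python
import numpy as np


def epig_scores(probs_pool, probs_targ, eps=1e-12):
    """Monte Carlo EPIG estimate for every pool point (illustrative sketch).

    probs_pool: (K, N, C) class probabilities for N pool points under
                K posterior samples (hypothetical layout).
    probs_targ: (K, M, C) class probabilities for M target points.
    Returns an (N,) array of estimated scores.
    """
    K = probs_pool.shape[0]
    # Joint predictive over (pool label, target label), averaged over samples:
    # joint[n, m, c, d] = mean_k p_k(y=c | x_n) * p_k(y*=d | x*_m)
    joint = np.einsum("knc,kmd->nmcd", probs_pool, probs_targ) / K
    # Marginal predictives for pool and target points.
    marg_pool = probs_pool.mean(axis=0)  # (N, C)
    marg_targ = probs_targ.mean(axis=0)  # (M, C)
    indep = marg_pool[:, None, :, None] * marg_targ[None, :, None, :]
    # Mutual information = KL(joint || product of marginals), per (n, m) pair.
    mi = np.sum(joint * (np.log(joint + eps) - np.log(indep + eps)), axis=(2, 3))
    # Average over target points to get one score per pool point.
    return mi.mean(axis=1)
```

Ranking the pool by these scores and taking the B largest is the top-B baseline the paper argues against: each score is computed in isolation, so near-duplicate candidates can all rank highly.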

What carries the argument

Partial Batch Label Sampling (ParBaLS), a mechanism that computes EPIG by sampling labels for only a subset of points inside each proposed batch under the Bayesian decision-theoretic objective.
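
One way to picture the mechanism is a greedy loop that, before scoring each new candidate, samples pseudo-labels for the points already placed in the partial batch and conditions the posterior on them. The sketch below is an assumption-laden illustration rather than the paper's algorithm: `weighted_epig` and `partial_batch_select` are hypothetical names, and conditioning is done by importance-reweighting a fixed set of posterior samples, a shortcut chosen here for brevity instead of re-fitting the model.

```python
import numpy as np


def weighted_epig(probs_pool, probs_targ, w, eps=1e-12):
    """EPIG per pool point under posterior-sample weights w (shape (K,), sums to 1)."""
    joint = np.einsum("k,knc,kmd->nmcd", w, probs_pool, probs_targ)
    marg_pool = np.einsum("k,knc->nc", w, probs_pool)
    marg_targ = np.einsum("k,kmd->md", w, probs_targ)
    indep = marg_pool[:, None, :, None] * marg_targ[None, :, None, :]
    mi = np.sum(joint * (np.log(joint + eps) - np.log(indep + eps)), axis=(2, 3))
    return mi.mean(axis=1)


def partial_batch_select(probs_pool, probs_targ, batch_size,
                         n_label_samples=8, seed=0):
    """Greedy batch selection with sampled pseudo-labels for the partial batch.

    probs_pool: (K, N, C), probs_targ: (K, M, C) class probabilities under
    K posterior samples (hypothetical layout). Returns a list of pool indices.
    """
    rng = np.random.default_rng(seed)
    K, N, C = probs_pool.shape
    batch = []
    for _ in range(batch_size):
        scores = np.zeros(N)
        # With an empty batch there is nothing to sample, so one pass suffices.
        n_draws = n_label_samples if batch else 1
        for _ in range(n_draws):
            w = np.full(K, 1.0 / K)
            if batch:
                # Draw one posterior sample, then labels for the partial batch
                # from it: a joint draw from the posterior predictive.
                k = rng.integers(K)
                for s in batch:
                    y = rng.choice(C, p=probs_pool[k, s])
                    # Reweight samples by the likelihood of the pseudo-label
                    # (assumed stand-in for a posterior update).
                    w = w * probs_pool[:, s, y]
                w = w / w.sum()
            scores += weighted_epig(probs_pool, probs_targ, w)
        if batch:
            scores[batch] = -np.inf  # never pick the same point twice
        batch.append(int(np.argmax(scores)))
    return batch
```

On a toy pool where two candidates carry the same information, a loop of this shape tends to avoid selecting both: once a pseudo-label for the first is sampled, the second has little left to add. Scoring each point in isolation has no such correction.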

If this is right

For a fixed labeling budget, ParBaLS EPIG reaches higher accuracy than the baselines it was compared against, at least in the tested setting of Bayesian logistic regression on fixed embeddings.
The method remains computationally feasible at batch sizes where exact batch methods become intractable.
The same ParBaLS construction can be used whenever the acquisition function is EPIG and the model admits a posterior over predictions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The partial-sampling idea could be adapted to other information-based acquisition functions such as BALD or EER.
Real-world annotation pipelines that already operate in batches could integrate ParBaLS to lower total labeling cost.
Checking whether the gains persist when the embedding model is fine-tuned jointly with the classifier would test the method's robustness beyond the fixed-feature setting.

Load-bearing premise

The reported performance gains hold when the model is Bayesian logistic regression and the features are fixed embeddings from a pre-trained network.

What would settle it

Repeating the experiments with an end-to-end trained neural network instead of fixed embeddings plus logistic regression and observing no accuracy advantage for ParBaLS EPIG over top-B selection would falsify the practical claim.

Original abstract

Over the past couple of decades, many active learning acquisition functions have been proposed, leaving practitioners with an unclear choice of which to use. Bayesian-based active learning offers principled objectives with explainable intuition, including Expected Error Reduction (EER), Expected Predictive Information Gain (EPIG), and Bayesian Active Learning by Disagreements (BALD). A key challenge of such methods is the difficult scaling to large batch sizes, leading to either computational challenges (BatchBALD) or dramatic performance drops (top-$B$ selection). Here, using a particular formulation of Bayesian Decision Theory, we derive Partial Batch Label Sampling (ParBaLS) for the EPIG algorithm. We show experimentally for several datasets that ParBaLS EPIG gives superior performance for a fixed budget and Bayesian Logistic Regression on embeddings from large pre-trained models. Our code is available at https://github.com/ADDAPT-ML/ParBaLS.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

ParBaLS provides a useful derivation for scaling EPIG in batches but its benefits appear tied to the Bayesian logistic regression and fixed embedding setup used in the experiments.

The punchline here is that the authors derive a new sampling strategy called ParBaLS for the EPIG acquisition function in batch Bayesian active learning, and their experiments suggest it delivers better performance than common baselines when using a fixed labeling budget. They start from a Bayesian Decision Theory formulation to come up with Partial Batch Label Sampling specifically tailored to EPIG. This seems like a direct attempt to fix the scaling issues that plague methods like BatchBALD at larger batch sizes or the performance drops from just picking the top B points. The fact that they release code on GitHub makes it easier to verify and build on.

What the paper does well is keep the focus on a practical problem in active learning pipelines. Many papers propose new acquisition functions without addressing how they behave in batches, so this targeted derivation is a step in the right direction. The experiments cover several datasets with Bayesian logistic regression on embeddings from large pre-trained models, which is a realistic setting for many applications.

The soft spots are mostly around generalization. The results rely on Bayesian logistic regression with fixed features, where posterior updates are straightforward. If the method exploits properties specific to linear models or this setup, the advantages might shrink with more complex models or when training end-to-end. The abstract claims experimental superiority but the details on effect sizes, error bars, and significance tests would need checking in the full text to see how robust they are.

This paper is for researchers and practitioners working on information-theoretic active learning who need better ways to handle batch queries. A reader looking for incremental improvements to EPIG or similar methods could get some value from the derivation and the empirical comparison.

It deserves a serious referee. The work is focused, the derivation is presented as coming from first principles, and the experiments provide evidence in a relevant domain. Even with the limited scope, it is worth sending out for review to get input on whether the approach extends beyond this model class.

Referee Report

1 major / 2 minor

Summary. The paper derives Partial Batch Label Sampling (ParBaLS) from a Bayesian Decision Theory formulation of the Expected Predictive Information Gain (EPIG) acquisition function to address scaling challenges in batch Bayesian active learning. It contrasts this with methods like top-B selection and BatchBALD, and reports experimental results showing that ParBaLS EPIG outperforms baselines on several datasets when using Bayesian logistic regression on fixed embeddings from large pre-trained models, for a fixed labeling budget. Code is provided for reproducibility.

Significance. If the experimental results hold under the stated conditions, the work offers a principled, explainable approach to batch selection within information-theoretic active learning that could improve practical performance for moderate batch sizes. The derivation from Bayesian Decision Theory and the open-sourced implementation are strengths that aid verification and potential extensions.

major comments (1)

[§4] §4 (Experiments): Results are shown only for Bayesian logistic regression on fixed pre-trained embeddings. This setup permits exact posterior updates, which may be essential to the observed gains; the central experimental claim of superiority would be more robust if the authors either qualified the scope or added at least one experiment with approximate inference in a non-linear model class.

minor comments (2)

[Abstract] Abstract: The claim of 'superior performance' lacks any quantitative indication of the magnitude of improvement or the number of datasets; a single sentence summarizing average gains would help readers gauge practical impact.
[§4] §4, tables: Performance figures are given as point estimates without error bars, number of random seeds, or statistical significance tests. Including these would allow proper assessment of whether differences are reliable given the stochasticity of active learning.
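
The reporting asked for in the second minor comment is straightforward to produce once per-seed accuracies are logged. The sketch below is an assumption-based illustration: the function names `summarize_runs` and `paired_difference` are hypothetical, and any numbers passed in would be the reader's own per-seed results, since none are taken from the paper.

```python
import numpy as np


def summarize_runs(acc_by_seed):
    """Mean and standard error of accuracies collected over random seeds."""
    acc = np.asarray(acc_by_seed, dtype=float)
    mean = acc.mean()
    # Sample standard deviation (ddof=1) divided by sqrt(number of seeds).
    sem = acc.std(ddof=1) / np.sqrt(acc.size)
    return mean, sem


def paired_difference(acc_a, acc_b):
    """Mean and standard error of per-seed differences a - b.

    Assumes both methods were run with the same seeds, so that
    entry i of each array comes from a matched run.
    """
    diff = np.asarray(acc_a, dtype=float) - np.asarray(acc_b, dtype=float)
    return summarize_runs(diff)
```

Pairing by seed removes variation shared by both methods, such as the initial labeled set. A difference whose mean sits several standard errors from zero is harder to attribute to seed noise.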

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the positive summary, recognition of the principled derivation, and recommendation for minor revision. We address the single major comment below.

Referee: [§4] §4 (Experiments): Results are shown only for Bayesian logistic regression on fixed pre-trained embeddings. This setup permits exact posterior updates, which may be essential to the observed gains; the central experimental claim of superiority would be more robust if the authors either qualified the scope or added at least one experiment with approximate inference in a non-linear model class.

Authors: We agree that the reported experiments use Bayesian logistic regression on fixed embeddings, which permits exact posterior updates. This design choice was made deliberately to isolate the contribution of the ParBaLS acquisition function from confounding effects of approximate inference (e.g., variational methods or sampling in non-linear networks). The underlying derivation from Bayesian Decision Theory for EPIG is model-agnostic provided the predictive distribution is available, but the controlled setting allows clean attribution of performance differences to the batch selection strategy itself. In the revised manuscript we will explicitly qualify the scope of the empirical claims to state that superiority is demonstrated under exact inference with linear models on pre-trained embeddings. This directly addresses the referee’s suggestion without requiring new, computationally heavy experiments at this stage.

revision: yes

Circularity Check

0 steps flagged

Derivation of ParBaLS from Bayesian Decision Theory formulation of EPIG is self-contained

The paper presents the derivation of Partial Batch Label Sampling (ParBaLS) explicitly as arising from a particular formulation of Bayesian Decision Theory applied to the EPIG acquisition function. This is positioned as a principled derivation rather than a fit or renaming. No equations or steps are shown to reduce by construction to the inputs (e.g., no fitted parameters renamed as predictions, no self-definitional loops where the output is presupposed in the definition of the input). The experimental claims are separate empirical validations on Bayesian logistic regression with fixed pre-trained embeddings and do not feed back into the derivation. No self-citations are invoked as load-bearing uniqueness theorems or ansatzes. The derivation chain is therefore independent and non-circular.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The work rests on standard Bayesian decision theory and the definition of EPIG; no free parameters or new physical entities are introduced in the abstract.

axioms (1)

domain assumption: Bayesian Decision Theory provides a valid objective for deriving acquisition functions in active learning
Invoked to derive ParBaLS from EPIG

invented entities (1)

Partial Batch Label Sampling (ParBaLS) · no independent evidence
purpose: To approximate batch information gain while avoiding full combinatorial computation
New sampling procedure introduced for the EPIG algorithm

pith-pipeline@v0.9.0 · 5681 in / 1243 out tokens · 32299 ms · 2026-05-18T07:58:53.407412+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

$\arg\max_{\hat{x} \in D} \sum_{x \in V} I(Y_x; Y_{\hat{x}} \mid L)$ (EPIG form derived from expected negative-log-loss reduction)

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear

ParBaLS Monte-Carlo estimate over partial-batch pseudo-labels $y_S \sim Y_S \mid L$
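
The two quoted passages can be connected with standard identities. The notation here assumes, as the passages suggest but do not state, that $D$ is the candidate pool, $V$ a set of target inputs, $L$ the labeled data, and $S$ the partially built batch; the sample count $J$ is a hypothetical symbol introduced for illustration. The mutual information inside the EPIG objective expands as

$$ I(Y_x; Y_{\hat{x}} \mid L) = H(Y_x \mid L) - \mathbb{E}_{y_{\hat{x}} \sim Y_{\hat{x}} \mid L}\left[ H(Y_x \mid L, Y_{\hat{x}} = y_{\hat{x}}) \right] $$

which reads as the expected drop in predictive entropy at a target $x$ after observing the label at $\hat{x}$. A Monte Carlo estimate of the kind described would then plausibly score a candidate $\hat{x}$ given a partial batch $S$ by

$$ \frac{1}{J} \sum_{j=1}^{J} \sum_{x \in V} I\left(Y_x; Y_{\hat{x}} \mid L, Y_S = y_S^{(j)}\right), \qquad y_S^{(j)} \sim Y_S \mid L $$

The first line is the textbook definition of conditional mutual information. The second is one reading of the quoted passage, not a formula taken from the paper.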

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.