
arxiv: 2605.21765 · v1 · pith:W5NEZ4E3new · submitted 2026-05-20 · 💻 cs.LG

Position: The Time for Sampling Is Now! Charting a New Course for Bayesian Deep Learning

Pith reviewed 2026-05-22 09:19 UTC · model grok-4.3

classification 💻 cs.LG
keywords Bayesian neural networks · sampling-based inference · posterior exploration · sample distillation · uncertainty quantification · Bayesian deep learning · model averaging

The pith

Sampling-based inference has reached computational parity with optimization-based methods for Bayesian neural networks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This position paper argues that sampling-based inference (SAI) has achieved computational parity with optimization-based methods and is ready to supersede them in Bayesian neural networks. The authors contend that this shift would benefit the community by delivering principled uncertainty quantification along with better predictions and deeper insights into model behavior. A sympathetic reader would care because it could fulfill the original promise of Bayesian approaches in deep learning by making reliable uncertainty estimates practical. The paper identifies overcoming misconceptions about sampling feasibility as the first step, followed by addressing posterior exploration and sample distillation.

Core claim

The central claim is that SAI has achieved computational parity with optimization-based methods and is on the verge of superseding them for effective and efficient inference in BNNs. SAI can yield superior prediction performance through model averaging, serve as the foundation for a wide range of downstream tasks, and provide crucial insights into the landscape of BNNs. Overcoming misconceptions is a necessary first step, after which the community must focus on sufficient exploration of the posterior landscape and high-fidelity distillation of posterior samples.

What carries the argument

The argument rests on two core problems, sufficient exploration of the posterior landscape and high-fidelity distillation of posterior samples, whose solution is expected to unlock the advantages of sampling-based inference in BNNs.

If this is right

  • BNNs become a more widely adopted principled paradigm for uncertainty quantification.
  • Superior prediction performance is achieved through model averaging.
  • SAI provides the foundation for various downstream tasks.
  • Crucial insights into the BNN landscape become available.
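The model-averaging consequence listed above can be illustrated with a minimal sketch. It is not taken from the paper: the function names (`softmax`, `bma_predict`, `entropy`) and the toy logits are hypothetical, and the sketch assumes a classifier whose posterior samples each produce a logit vector for the same input. The predictive distribution is approximated by averaging the per-sample class probabilities, p(y | x, D) ≈ (1/S) Σ_s p(y | x, θ_s).

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bma_predict(per_sample_logits):
    """Average class probabilities over posterior samples.

    per_sample_logits: list of S logit vectors (one per posterior sample),
    all computed for the same input. Returns the averaged distribution.
    """
    probs = [softmax(logits) for logits in per_sample_logits]
    n_samples = len(probs)
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / n_samples for c in range(n_classes)]


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


# Hypothetical logits from three posterior samples that disagree on the class.
sample_logits = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
bma = bma_predict(sample_logits)

# The averaged prediction is less confident than any single confident sample,
# which is how disagreement between samples surfaces as predictive uncertainty.
print(bma, entropy(bma), entropy(softmax(sample_logits[0])))
```

Averaging probabilities rather than logits is a deliberate choice here: it corresponds to the Monte Carlo estimate of the posterior predictive, whereas averaging logits would give a different quantity.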

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Further research could explore how posterior exploration techniques from statistics integrate with neural network training.
  • Efficient sample distillation might enable scalable Bayesian inference in large-scale models.
  • This approach could influence uncertainty handling in applications requiring robust decision making under uncertainty.

Load-bearing premise

The primary barriers to wider use of sampling-based inference are persistent misconceptions about its feasibility and efficiency.

What would settle it

Empirical evidence demonstrating that sampling-based inference remains substantially more computationally expensive or produces inferior results compared to optimization-based methods in practical BNN applications would disprove the position.

Figures

Figures reproduced from arXiv: 2605.21765 by David Rügamer, Emanuel Sommer.

Figure 1: Conceptual overview of the pathway toward practical sampling-based inference (SAI). A strong algorithmic status quo already addresses a substantial portion of common misconceptions, while remaining barriers are navigated through additional algorithmic enablers and tooling. Together, these elements enable practical feasibility and widespread adoption. The (dashed) feedback arrow indicates positive momentum …
Figure 2: Despite these significant advancements, persistent misconceptions about SAI extend beyond the stereotypical focus on computational inefficiency. Many of these lingering beliefs stem from a naive transfer of techniques from classical MCMC to the distinct context of BNNs. The following discussion addresses the most relevant misconceptions prevalent in the community, specifically those concerning time …
Figure 3: Illustrative comparison of inference costs for state-of-the-art BNNs in terms of the number of forward passes required for a single batch of test points. Models and sample counts are taken from recent works (Izmailov et al., 2021; Kim et al., 2024; Paulin et al., 2025; Sommer et al., 2024; 2025) and span a range of architectures from small MLPs to large-scale CNNs with tens of millions of parameters. As a …
Abstract

The practical adoption of sampling-based inference (SAI) in Bayesian neural networks (BNNs) remains limited, partly due to persistent misconceptions about the feasibility and efficiency of sampling. This position paper argues that SAI has achieved computational parity with optimization-based methods and is at the verge of superseding such methods for effective and efficient inference in BNNs. This development should be in the interest of the whole community, promoting BNNs as a principled paradigm with its long-standing yet unfulfilled promise of providing principled uncertainty quantification for neural networks. SAI can even do more -- yielding superior prediction performance through model averaging, serving as the foundation for a plethora of possible downstream tasks, and providing crucial insights into the landscape of BNNs. In order to make such a change happen and unfold the potential of sampling, overcoming current misconceptions is a necessary first step. The next step is to realign research efforts toward addressing remaining challenges in SAI. In particular, the community must focus on two core problems: sufficient exploration of the posterior landscape and high-fidelity distillation of posterior samples for efficient downstream inference. By addressing conceptual and practical obstacles, we can unlock the full potential of SAI and establish it as a central tool in Bayesian deep learning.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. This position paper argues that sampling-based inference (SAI) in Bayesian neural networks (BNNs) has now reached computational parity with optimization-based methods, that misconceptions remain the primary barrier to adoption, and that the community should therefore redirect efforts toward sufficient posterior exploration and high-fidelity sample distillation to realize SAI's advantages in uncertainty quantification, model averaging, and downstream tasks.

Significance. If the parity claim is supported by the cited literature, the paper could usefully reorient Bayesian deep learning research toward sampling methods, potentially delivering the long-promised principled uncertainty estimates and the additional benefits of model averaging. The explicit identification of two concrete research problems (posterior exploration and sample distillation) supplies a clear agenda. The manuscript offers no new algorithms, proofs, or experiments but synthesizes an argument for a shift in priorities.

major comments (1)
  1. [Abstract] Abstract: the claim that 'SAI has achieved computational parity with optimization-based methods' is presented as an established fact without accompanying benchmarks, derivations, or explicit citations to large-scale comparisons (wall-clock time, memory, or ESS per unit resource) on modern architectures such as ResNets or transformers; this assertion is load-bearing for the subsequent recommendation to realign research efforts.
minor comments (1)
  1. The abstract would be strengthened by a single sentence or footnote listing the key prior works that are taken to establish parity, so readers can immediately locate the supporting evidence.
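The "ESS per unit resource" metric named in the major comment can be made concrete with a rough sketch. It is an illustration only and does not reproduce any benchmark from the paper: the helper names are hypothetical, the estimator uses a simple truncation rule (stop summing at the first non-positive autocorrelation) as an assumption, and the two chains are synthetic.

```python
import random


def autocorr(chain, lag):
    """Sample autocorrelation of a scalar chain at the given lag."""
    n = len(chain)
    mean = sum(chain) / n
    var = sum((x - mean) ** 2 for x in chain) / n
    if var == 0.0:
        return 0.0  # constant chain: treat as uncorrelated
    cov = sum((chain[i] - mean) * (chain[i + lag] - mean)
              for i in range(n - lag)) / n
    return cov / var


def effective_sample_size(chain):
    """ESS = N / (1 + 2 * sum of autocorrelations), truncated at the
    first non-positive autocorrelation (a simple heuristic)."""
    n = len(chain)
    rho_sum = 0.0
    for lag in range(1, n):
        rho = autocorr(chain, lag)
        if rho <= 0.0:
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)


def ess_per_second(chain, wall_clock_seconds):
    """ESS normalized by the wall-clock time spent producing the chain."""
    return effective_sample_size(chain) / wall_clock_seconds


# Synthetic chains: independent draws versus a strongly correlated AR(1) chain.
rng = random.Random(0)
n_draws = 2000
iid_chain = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]
ar_chain = [0.0]
for _ in range(n_draws - 1):
    ar_chain.append(0.9 * ar_chain[-1] + rng.gauss(0.0, 1.0))

print(effective_sample_size(iid_chain), effective_sample_size(ar_chain))
```

For neural networks the chain would typically be a scalar summary such as a predictive quantity rather than a raw weight, and established libraries offer more careful estimators than this truncation rule.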

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the careful and constructive review of our position paper. The feedback highlights a key point about the presentation of our central claim, which we address directly below. We have revised the manuscript to strengthen the supporting evidence while preserving the overall argument.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the claim that 'SAI has achieved computational parity with optimization-based methods' is presented as an established fact without accompanying benchmarks, derivations, or explicit citations to large-scale comparisons (wall-clock time, memory, or ESS per unit resource) on modern architectures such as ResNets or transformers; this assertion is load-bearing for the subsequent recommendation to realign research efforts.

    Authors: We appreciate the referee drawing attention to the need for clearer grounding of this claim. The abstract is intentionally concise, but the full manuscript synthesizes results from recent literature on scalable sampling methods (e.g., variants of HMC, SG-MCMC, and ensemble-based approaches) that report wall-clock times, memory footprints, and effective sample sizes competitive with standard optimization on ResNet-scale models and other modern architectures. These works are cited in Sections 2 and 3. We acknowledge that explicit head-to-head transformer benchmarks remain limited in the cited studies and that the abstract itself did not include direct pointers. In the revised version we have (i) added two key citations to the abstract, (ii) inserted a short paragraph in the introduction that summarizes the relevant efficiency metrics from the literature, and (iii) clarified that the parity claim refers to practical, end-to-end training and inference costs rather than theoretical asymptotic rates. This is a partial revision: we retain the position that parity has been reached in many contemporary settings, but we now make the evidentiary basis more explicit. revision: partial
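As one illustration of the SG-MCMC family mentioned in the response above, the following is a minimal sketch of stochastic gradient Langevin dynamics (SGLD) on a toy problem. It is not the authors' method or any cited implementation: the model (inferring the mean of unit-variance Gaussian data under a Gaussian prior), the function name, and all hyperparameters are assumptions chosen so that the exact posterior mean is available for comparison.

```python
import math
import random


def sgld_gaussian_mean(data, n_steps, step_size, batch_size, prior_var, rng):
    """SGLD for the mean of N(theta, 1) data with a N(0, prior_var) prior.

    Each step uses a minibatch estimate of the log-posterior gradient and
    adds Gaussian noise with variance equal to the step size.
    """
    n = len(data)
    theta = 0.0
    samples = []
    for _ in range(n_steps):
        batch = rng.sample(data, batch_size)
        grad_prior = -theta / prior_var
        # Rescale the minibatch likelihood gradient to the full data set.
        grad_lik = (n / batch_size) * sum(x - theta for x in batch)
        noise = rng.gauss(0.0, math.sqrt(step_size))
        theta = theta + 0.5 * step_size * (grad_prior + grad_lik) + noise
        samples.append(theta)
    return samples


rng = random.Random(0)
data = [rng.gauss(1.0, 1.0) for _ in range(100)]  # synthetic observations
prior_var = 100.0

samples = sgld_gaussian_mean(data, n_steps=10000, step_size=1e-3,
                             batch_size=10, prior_var=prior_var, rng=rng)
kept = samples[500:]  # discard burn-in
sgld_mean = sum(kept) / len(kept)

# Closed-form posterior mean for this conjugate model, for comparison.
post_mean = sum(data) / (len(data) + 1.0 / prior_var)
print(sgld_mean, post_mean)
```

With a fixed step size the chain only approximates the posterior (minibatch noise slightly inflates its spread), which is one reason per-step cost alone does not settle the parity question the referee raises.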

Circularity Check

0 steps flagged

Position paper states external field assessment; no derivation or self-referential reduction present

full rationale

The document is a position paper whose central claim—that sampling-based inference has reached computational parity with optimization methods—is framed as an observation drawn from the existing literature rather than a quantity constructed or fitted inside the paper. No equations, parameter fits, or predictive steps appear in the provided text or abstract. The call to address posterior exploration and sample distillation follows directly from the stated position without reducing to a tautology, renamed input, or load-bearing self-citation chain. The argument remains self-contained as a synthesis of external developments and does not exhibit any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The position rests on domain-level assumptions about the current computational status of sampling methods and on the identification of two specific obstacles as the main remaining challenges.

axioms (2)
  • domain assumption Sampling-based inference has achieved computational parity with optimization-based methods in Bayesian neural networks.
    This is the load-bearing premise stated directly in the abstract as the basis for arguing that sampling should now supersede optimization.
  • ad hoc to paper The main obstacles to adoption are misconceptions together with insufficient posterior exploration and high-fidelity sample distillation.
    The abstract presents these two problems as the necessary focus for future work once misconceptions are overcome.

pith-pipeline@v0.9.0 · 5748 in / 1472 out tokens · 49240 ms · 2026-05-22T09:19:24.995572+00:00 · methodology



Reference graph

Works this paper leans on

13 extracted references · 13 canonical work pages · 3 internal anchors

  1. [1]

    bde: A Python Package for Bayesian Deep Ensembles via MILE

    Arvanitis, V., Aslanidis, A., Sommer, E., and Rügamer, D. bde: A Python Package for Bayesian Deep Ensembles via MILE. arXiv preprint arXiv:2605.14146,

  2. [2]

    Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D., Wu, J., Winter, C., Hesse, C., Chen, M., Sigl...

  3. [3]

    Laplace Redux – Effortless Bayesian Deep Learning

    Daxberger, E., Kristiadi, A., Immer, A., Eschenhagen, R., Bauer, M., and Hennig, P. Laplace Redux – Effortless Bayesian Deep Learning. In 35th Conference on Neural Information Processing Systems (NeurIPS 2021), 2021a. Daxberger, E., Nalisnick, E., Allingham, J. U., Antorán, J., and Hernández-Lobato, J. M. Bayesian Deep Learning via Subnetwork Inference. In...

  4. [4]

    Izmailov, P., Podoprikhin, D., Garipov, T., Vetrov, D., and Wilson, A. G. Averaging Weights Leads to Wider Optima and Better Generalization. In Proceedings of the 34th Conference on Uncertainty in Artificial Intelligence 2018,

  5. [5]

    and Kelleher, J

    Jawla, D. and Kelleher, J. Layer wise scaled gaussian priors for markov chain monte carlo sampled deep bayesian neural networks. Frontiers in Artificial Intelligence, Volume 8 - 2025,

  6. [6]

    doi: 10.3389/frai.2025.1444891

    ISSN 2624-8212. doi: 10.3389/frai.2025.1444891. Kaiser, J., Schwethelm, K., Rueckert, D., and Kaissis, G. Laplace sample information: Data informativeness through a bayesian lens. In The Thirteenth International Conference on Learning Representations,

  7. [7]

    Li, G., Lin, G., Zhang, Z., and Zhou, Q

    URL https://arxiv.org/abs/2504.18911. Li, G., Lin, G., Zhang, Z., and Zhou, Q. Fast replica exchange stochastic gradient langevin dynamics. arXiv preprint arXiv:2301.01898,

  8. [9]

    Rønning, O., Nalisnick, E., Ley, C., Smyth, P., and Hamelryck, T

    URL https://arxiv.org/abs/2412.08876. Rønning, O., Nalisnick, E., Ley, C., Smyth, P., and Hamelryck, T. ELBOing stein: Variational bayes with stein mixture inference. In The Thirteenth International Conference on Learning Representations,

  9. [10]

    and Wade, S

    Sheinkman, A. and Wade, S. The architecture and evaluation of bayesian neural networks. arXiv preprint arXiv:2503.11808,

  10. [11]

    Wang, T., Zhu, J.-Y., Torralba, A., and Efros, A. A. Dataset distillation. arXiv preprint arXiv:1811.10959,

  11. [12]

    Willard, B. T. and Louf, R. Efficient guided generation for large language models. arXiv preprint arXiv:2307.09702,

  12. [13]

    Experimental Details: Runtime Illustration

    A. Experimental Details. Runtime Illustration. To calculate the runtime results shown in Figure 2 we were using the code from Kobialka et al. (2026). We adopt the same UCI benchmark setting as in their Table 2, use the airfoil dataset (Dua & Graff,

  13. [14]

    To ensure a fair comparison under equal computational budgets, we additionally include a 35-member Deep Ensemble—the most performant competing method

    and extend their analysis by reporting average runtimes measured on a standard 10-core CPU. To ensure a fair comparison under equal computational budgets, we additionally include a 35-member Deep Ensemble—the most performant competing method. B. Clarification on Terminology: SAI vs. SBI Throughout this paper, we utilize the acronym SAI to denote Sampling-...