Position: The Time for Sampling Is Now! Charting a New Course for Bayesian Deep Learning

David Rügamer; Emanuel Sommer

arxiv: 2605.21765 · v1 · pith:W5NEZ4E3new · submitted 2026-05-20 · 💻 cs.LG

Pith reviewed 2026-05-22 09:19 UTC · model grok-4.3

keywords Bayesian neural networks · sampling-based inference · posterior exploration · sample distillation · uncertainty quantification · Bayesian deep learning · model averaging

The pith

Sampling-based inference has reached computational parity with optimization-based methods for Bayesian neural networks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This position paper argues that sampling-based inference (SAI) has achieved computational parity with optimization-based methods and is ready to supersede them in Bayesian neural networks. The authors contend that this shift would benefit the community by delivering principled uncertainty quantification along with better predictions and deeper insights into model behavior. A sympathetic reader would care because it could fulfill the original promise of Bayesian approaches in deep learning by making reliable uncertainty estimates practical. The paper identifies overcoming misconceptions about sampling feasibility as the first step, followed by addressing posterior exploration and sample distillation.

Core claim

The central claim is that SAI has achieved computational parity with optimization-based methods and is on the verge of superseding them for effective and efficient inference in BNNs. SAI can yield superior prediction performance through model averaging, serve as the foundation for a plethora of possible downstream tasks, and provide crucial insights into the landscape of BNNs. Overcoming misconceptions is a necessary first step, after which the community must focus on sufficient exploration of the posterior landscape and high-fidelity distillation of posterior samples.

What carries the argument

The argument rests on two core problems, sufficient exploration of the posterior landscape and high-fidelity distillation of posterior samples, whose solution would unlock the advantages of sampling-based inference in BNNs.

If this is right

BNNs become a more widely adopted principled paradigm for uncertainty quantification.
Superior prediction performance is achieved through model averaging.
SAI provides the foundation for various downstream tasks.
Crucial insights into the BNN landscape become available.
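
The model-averaging point in the list above can be made concrete with a small sketch. It assumes we already have class probabilities from S posterior samples and approximates the posterior predictive p(y | x, D) by the plain average (1/S) Σ_s p(y | x, θ_s). The function names and the toy probabilities are hypothetical illustrations, not taken from the paper.

```python
import numpy as np

def bma_predict(sample_probs):
    """Average per-sample class probabilities: shape (S, N, C) -> (N, C)."""
    sample_probs = np.asarray(sample_probs, dtype=float)
    return sample_probs.mean(axis=0)

def predictive_entropy(probs, eps=1e-12):
    """Entropy of the averaged predictive distribution, one value per test point."""
    return -(probs * np.log(probs + eps)).sum(axis=-1)

# Toy stand-in (assumed values): S=3 posterior samples, N=2 test points, C=2 classes.
sample_probs = np.array([
    [[0.9, 0.1], [0.6, 0.4]],
    [[0.8, 0.2], [0.4, 0.6]],
    [[0.7, 0.3], [0.5, 0.5]],
])

bma = bma_predict(sample_probs)    # averaged predictive probabilities
entropy = predictive_entropy(bma)  # higher where the samples disagree more
```

In this toy case the second test point, on which the samples disagree about the predicted class, ends up with the higher predictive entropy, which is the kind of uncertainty signal that model averaging is meant to provide.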

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Further research could explore how posterior exploration techniques from statistics integrate with neural network training.
Efficient sample distillation might enable scalable Bayesian inference in large-scale models.
This approach could influence uncertainty handling in applications requiring robust decision making under uncertainty.

Load-bearing premise

The primary barriers to wider use of sampling-based inference are persistent misconceptions about its feasibility and efficiency.

What would settle it

Empirical evidence demonstrating that sampling-based inference remains substantially more computationally expensive or produces inferior results compared to optimization-based methods in practical BNN applications would disprove the position.

Figures

Figures reproduced from arXiv: 2605.21765 by David Rügamer, Emanuel Sommer.

**Figure 1.** Conceptual overview of the pathway toward practical sampling-based inference (SAI). A strong algorithmic status quo already addresses a substantial portion of common misconceptions, while remaining barriers are navigated through additional algorithmic enablers and tooling. Together, these elements enable practical feasibility and widespread adoption. The (dashed) feedback arrow indicates positive momentum …

**Figure 2.** Despite these significant advancements, persistent misconceptions about SAI extend beyond the stereotypical focus on computational inefficiency. Many of these lingering beliefs stem from a naive transfer of techniques from classical MCMC to the distinct context of BNNs. The following discussion addresses the most relevant misconceptions prevalent in the community, specifically those concerning time …

**Figure 3.** Illustrative comparison of inference costs for state-of-the-art BNNs in terms of the number of forward passes required for a single batch of test points. Models and sample counts are taken from recent works (Izmailov et al., 2021; Kim et al., 2024; Paulin et al., 2025; Sommer et al., 2024; 2025) and span a range of architectures from small MLPs to large-scale CNNs with tens of millions of parameters. As a …

Original abstract

The practical adoption of sampling-based inference (SAI) in Bayesian neural networks (BNNs) remains limited, partly due to persistent misconceptions about the feasibility and efficiency of sampling. This position paper argues that SAI has achieved computational parity with optimization-based methods and is at the verge of superseding such methods for effective and efficient inference in BNNs. This development should be in the interest of the whole community, promoting BNNs as a principled paradigm with its long-standing yet unfulfilled promise of providing principled uncertainty quantification for neural networks. SAI can even do more -- yielding superior prediction performance through model averaging, serving as the foundation for a plethora of possible downstream tasks, and providing crucial insights into the landscape of BNNs. In order to make such a change happen and unfold the potential of sampling, overcoming current misconceptions is a necessary first step. The next step is to realign research efforts toward addressing remaining challenges in SAI. In particular, the community must focus on two core problems: sufficient exploration of the posterior landscape and high-fidelity distillation of posterior samples for efficient downstream inference. By addressing conceptual and practical obstacles, we can unlock the full potential of SAI and establish it as a central tool in Bayesian deep learning.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

This position paper argues that sampling has reached parity in BNNs and that research should now focus on posterior exploration and sample distillation, but the parity claim rests on synthesis rather than new evidence.

The main thing to know is that this is a call for the Bayesian deep learning (BDL) community to treat sampling-based inference as computationally viable now and to shift effort toward better posterior exploration and sample distillation for downstream tasks. The authors lay out why sampling should win in principle: proper uncertainty, model averaging for predictions, and visibility into the loss landscape. They identify two concrete bottlenecks that feel actionable and worth prioritizing over further optimization tweaks. That framing is the useful part here; it gives people a short list of problems to attack instead of vague dissatisfaction with current methods. The paper does a reasonable job of reminding readers of the long-term goals of BDL without overclaiming new technical results.

The soft spot is the repeated assertion that sampling has already achieved parity with optimization methods. The abstract presents this as settled, yet the document itself adds no fresh large-scale comparisons on modern architectures or hardware. If the cited prior work is mostly limited to smaller models or older setups, the generalization to ResNets or transformers does not automatically follow, and the concern that the claim has not been stress-tested at that scale stands. This leaves the central claim closer to an informed opinion than a demonstrated fact.

The piece is aimed at researchers already working on uncertainty quantification who are open to reorienting priorities. A reader looking for new algorithms or experiments will not find them, but someone deciding what to work on next might get a clearer sense of where the field could move. It deserves peer review as a position paper so the community can test the parity assessment and sharpen the proposed research directions.

Referee Report

1 major / 1 minor

Summary. This position paper argues that sampling-based inference (SAI) in Bayesian neural networks (BNNs) has now reached computational parity with optimization-based methods, that misconceptions remain the primary barrier to adoption, and that the community should therefore redirect efforts toward sufficient posterior exploration and high-fidelity sample distillation to realize SAI's advantages in uncertainty quantification, model averaging, and downstream tasks.

Significance. If the parity claim is supported by the cited literature, the paper could usefully reorient Bayesian deep learning research toward sampling methods, potentially delivering the long-promised principled uncertainty estimates and the additional benefits of model averaging. The explicit identification of two concrete research problems (posterior exploration and sample distillation) supplies a clear agenda. The manuscript offers no new algorithms, proofs, or experiments but synthesizes an argument for a shift in priorities.

major comments (1)

[Abstract] The claim that 'SAI has achieved computational parity with optimization-based methods' is presented as an established fact without accompanying benchmarks, derivations, or explicit citations to large-scale comparisons (wall-clock time, memory, or ESS per unit resource) on modern architectures such as ResNets or transformers; this assertion is load-bearing for the subsequent recommendation to realign research efforts.
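
To make the "ESS per unit resource" metric named in this comment concrete, here is a minimal sketch of a single-chain effective sample size estimate divided by wall-clock time. It assumes a scalar chain and uses a crude estimator that truncates the autocorrelation sum at the first non-positive lag; established diagnostics are more careful, and nothing here is taken from the paper. The function names and the toy AR(1) chain are hypothetical.

```python
import numpy as np

def effective_sample_size(chain):
    """Crude single-chain ESS: n / (1 + 2 * sum of leading positive autocorrelations)."""
    x = np.asarray(chain, dtype=float)
    n = x.size
    x = x - x.mean()
    var = np.dot(x, x) / n
    if var == 0.0:
        return float(n)
    tau = 1.0
    for lag in range(1, n):
        rho = np.dot(x[:-lag], x[lag:]) / (n * var)
        if rho <= 0.0:  # truncate at the first non-positive autocorrelation
            break
        tau += 2.0 * rho
    return n / tau

def ess_per_second(chain, wall_clock_seconds):
    """ESS normalised by the (externally measured) sampling time in seconds."""
    return effective_sample_size(chain) / wall_clock_seconds

# Toy comparison (assumed data): independent draws vs. a strongly autocorrelated AR(1) chain.
rng = np.random.default_rng(0)
iid = rng.normal(size=2000)
noise = rng.normal(size=2000)
ar = np.empty(2000)
ar[0] = 0.0
for t in range(1, 2000):
    ar[t] = 0.9 * ar[t - 1] + noise[t]

ess_iid = effective_sample_size(iid)
ess_ar = effective_sample_size(ar)
```

A comparison of the kind the referee asks for would report such a number per sampler alongside memory and the wall-clock time used as the denominator, measured on the same hardware.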

minor comments (1)

The abstract would be strengthened by a single sentence or footnote listing the key prior works that are taken to establish parity, so readers can immediately locate the supporting evidence.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the careful and constructive review of our position paper. The feedback highlights a key point about the presentation of our central claim, which we address directly below. We have revised the manuscript to strengthen the supporting evidence while preserving the overall argument.

Referee: [Abstract] The claim that 'SAI has achieved computational parity with optimization-based methods' is presented as an established fact without accompanying benchmarks, derivations, or explicit citations to large-scale comparisons (wall-clock time, memory, or ESS per unit resource) on modern architectures such as ResNets or transformers; this assertion is load-bearing for the subsequent recommendation to realign research efforts.

Authors: We appreciate the referee drawing attention to the need for clearer grounding of this claim. The abstract is intentionally concise, but the full manuscript synthesizes results from recent literature on scalable sampling methods (e.g., variants of HMC, SG-MCMC, and ensemble-based approaches) that report wall-clock times, memory footprints, and effective sample sizes competitive with standard optimization on ResNet-scale models and other modern architectures. These works are cited in Sections 2 and 3. We acknowledge that explicit head-to-head transformer benchmarks remain limited in the cited studies and that the abstract itself did not include direct pointers. In the revised version we have (i) added two key citations to the abstract, (ii) inserted a short paragraph in the introduction that summarizes the relevant efficiency metrics from the literature, and (iii) clarified that the parity claim refers to practical, end-to-end training and inference costs rather than theoretical asymptotic rates. This is a partial revision: we retain the position that parity has been reached in many contemporary settings, but we now make the evidentiary basis more explicit.

revision: partial
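
For readers unfamiliar with the SG-MCMC family named in the rebuttal, the following is a minimal sketch of one member, stochastic gradient Langevin dynamics (SGLD), on a toy problem where the exact posterior is known. It assumes a Gaussian likelihood with unit variance and a Gaussian prior on a single mean parameter; the function name, step size, and batch size are illustrative choices, not the authors' setup.

```python
import numpy as np

def sgld_gaussian_mean(y, n_steps=6000, step=1e-3, batch=10, prior_sd=10.0, seed=0):
    """SGLD draws for the mean of N(theta, 1) data under a N(0, prior_sd^2) prior."""
    rng = np.random.default_rng(seed)
    n = y.size
    theta = 0.0
    draws = np.empty(n_steps)
    for t in range(n_steps):
        idx = rng.choice(n, size=batch, replace=False)
        # Minibatch estimate of the log-likelihood gradient, rescaled to the full data set.
        grad_lik = (n / batch) * np.sum(y[idx] - theta)
        grad_prior = -theta / prior_sd**2
        # Langevin update: half-step along the gradient plus Gaussian noise of variance `step`.
        theta = theta + 0.5 * step * (grad_lik + grad_prior) + np.sqrt(step) * rng.normal()
        draws[t] = theta
    return draws

# Toy data (assumed): 50 observations around a true mean of 2.
rng = np.random.default_rng(1)
y = rng.normal(loc=2.0, scale=1.0, size=50)

draws = sgld_gaussian_mean(y)[1000:]              # discard burn-in
exact_mean = y.sum() / (y.size + 1.0 / 10.0**2)   # closed-form posterior mean
```

Because the posterior here is available in closed form, the sample mean of `draws` can be checked against `exact_mean`; in a BNN no such reference exists, which is why exploration diagnostics matter.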

Circularity Check

0 steps flagged

Position paper states external field assessment; no derivation or self-referential reduction present

The document is a position paper whose central claim—that sampling-based inference has reached computational parity with optimization methods—is framed as an observation drawn from the existing literature rather than a quantity constructed or fitted inside the paper. No equations, parameter fits, or predictive steps appear in the provided text or abstract. The call to address posterior exploration and sample distillation follows directly from the stated position without reducing to a tautology, renamed input, or load-bearing self-citation chain. The argument remains self-contained as a synthesis of external developments and does not exhibit any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The position rests on domain-level assumptions about the current computational status of sampling methods and on the identification of two specific obstacles as the main remaining challenges.

axioms (2)

domain assumption: Sampling-based inference has achieved computational parity with optimization-based methods in Bayesian neural networks.
This is the load-bearing premise stated directly in the abstract as the basis for arguing that sampling should now supersede optimization.
ad hoc to paper: The main obstacles to adoption are misconceptions, insufficient posterior exploration, and the lack of high-fidelity sample distillation.
The abstract presents the latter two as the necessary focus for future work once misconceptions are overcome.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: This position paper argues that SAI has achieved computational parity with optimization-based methods and is at the verge of superseding such methods for effective and efficient inference in BNNs.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

13 extracted references · 13 canonical work pages · 3 internal anchors

[1] Arvanitis, V., Aslanidis, A., Sommer, E., and Rügamer, D. bde: A Python Package for Bayesian Deep Ensembles via MILE. arXiv preprint arXiv:2605.14146.

[2] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D., Wu, J., Winter, C., Hesse, C., Chen, M., Sigl...

[3] Daxberger, E., Kristiadi, A., Immer, A., Eschenhagen, R., Bauer, M., and Hennig, P. Laplace Redux – Effortless Bayesian Deep Learning. In 35th Conference on Neural Information Processing Systems (NeurIPS 2021), 2021a. Daxberger, E., Nalisnick, E., Allingham, J. U., Antorán, J., and Hernández-Lobato, J. M. Bayesian Deep Learning via Subnetwork Inference. In...

[4] Izmailov, P., Podoprikhin, D., Garipov, T., Vetrov, D., and Wilson, A. G. Averaging Weights Leads to Wider Optima and Better Generalization. In Proceedings of the 34th Conference on Uncertainty in Artificial Intelligence 2018.

[5] Jawla, D. and Kelleher, J. Layer wise scaled gaussian priors for markov chain monte carlo sampled deep bayesian neural networks. Frontiers in Artificial Intelligence, Volume 8, 2025.

[6] ISSN 2624-8212. doi: 10.3389/frai.2025.1444891. Kaiser, J., Schwethelm, K., Rueckert, D., and Kaissis, G. Laplace sample information: Data informativeness through a bayesian lens. In The Thirteenth International Conference on Learning Representations.

[7] URL https://arxiv.org/abs/2504.18911. Li, G., Lin, G., Zhang, Z., and Zhou, Q. Fast replica exchange stochastic gradient langevin dynamics. arXiv preprint arXiv:2301.01898.

[9] URL https://arxiv.org/abs/2412.08876. Rønning, O., Nalisnick, E., Ley, C., Smyth, P., and Hamelryck, T. ELBOing stein: Variational bayes with stein mixture inference. In The Thirteenth International Conference on Learning Representations.

[10] Sheinkman, A. and Wade, S. The architecture and evaluation of bayesian neural networks. arXiv preprint arXiv:2503.11808.

[11] Wang, T., Zhu, J.-Y., Torralba, A., and Efros, A. A. Dataset distillation. arXiv preprint arXiv:1811.10959.

[12] Willard, B. T. and Louf, R. Efficient guided generation for large language models. arXiv preprint arXiv:2307.09702.

[13] A. Experimental Details. Runtime Illustration. To calculate the runtime results shown in Figure 2 we were using the code from Kobialka et al. (2026). We adopt the same UCI benchmark setting as in their Table 2, use the airfoil dataset (Dua & Graff,

[14] and extend their analysis by reporting average runtimes measured on a standard 10-core CPU. To ensure a fair comparison under equal computational budgets, we additionally include a 35-member Deep Ensemble, the most performant competing method. B. Clarification on Terminology: SAI vs. SBI. Throughout this paper, we utilize the acronym SAI to denote Sampling-...
