Comment on "Adv-BNN: Improved Adversarial Defense through Robust Bayesian Neural Network"

Roland S. Zimmermann

arxiv: 1907.00895 · v1 · pith:K7UKW6XYnew · submitted 2019-07-01 · 💻 cs.LG · stat.ML


Pith reviewed 2026-05-25 11:53 UTC · model grok-4.3

classification 💻 cs.LG stat.ML

keywords: adversarial robustness · Bayesian neural networks · adversarial training · stochastic models · attack evaluation · robustness evaluation


The pith

Adjusting adversarial attacks for BNN stochasticity removes evidence of improved robustness.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper contends that evaluations of adversarially trained Bayesian neural networks have used attacks that ignore the model's randomness. An attack must instead draw multiple weight samples during optimization to properly test the defense. When this adjustment is applied, the reported robustness advantage over ordinary networks vanishes. A reader would care because it indicates that prior claims of better defense may reflect mismatched attack methods rather than actual security gains.

Core claim

When adversarial attacks are modified to incorporate the stochastic nature of Bayesian neural networks by sampling multiple realizations during the attack, there is no strong evidence that adversarially trained BNNs achieve higher robustness than their non-Bayesian counterparts.

What carries the argument

An adjusted adversarial attack that accounts for BNN stochasticity by incorporating multiple forward passes or weight samples during example generation.
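
The adjusted attack described here can be sketched in a few lines. The following is a minimal illustration and not the author's implementation: it assumes a toy logistic model whose weights are drawn from a factorized Gaussian as a stand-in for a BNN posterior, and it averages the input gradient over several weight draws before each signed step. All names (`sample_weights`, `averaged_pgd`, `n_samples`) and the toy model itself are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sample_weights(mu, sigma, rng):
    """Draw one weight vector from a factorized Gaussian (toy posterior, assumed)."""
    return [rng.gauss(m, sigma) for m in mu]


def loss_grad_x(w, x, y):
    """Gradient of the logistic loss w.r.t. the input x for one weight draw; y is 0 or 1."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return [(sigmoid(z) - y) * wi for wi in w]


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def averaged_pgd(x, y, mu, sigma, eps, step, n_steps, n_samples, seed=0):
    """L-infinity PGD that averages the input gradient over n_samples weight draws per step."""
    rng = random.Random(seed)
    x_adv = list(x)
    for _ in range(n_steps):
        grad = [0.0] * len(x)
        for _ in range(n_samples):
            w = sample_weights(mu, sigma, rng)  # fresh draw for every gradient
            g = loss_grad_x(w, x_adv, y)
            grad = [a + b / n_samples for a, b in zip(grad, g)]
        # Ascend the averaged loss, then project back into the eps-ball around x.
        x_adv = [xa + step * sign(gi) for xa, gi in zip(x_adv, grad)]
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv
```

Setting `n_samples=1` gives a single-draw attack for comparison. Whether that corresponds to the protocol used in the original Adv-BNN evaluation is not stated in the text here.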

If this is right

Standard attacks can underestimate the vulnerability of models that draw stochastic samples at inference time.
Adversarially trained BNNs must be re-evaluated using attacks that match their probabilistic structure.
The combination of adversarial training and Bayesian methods does not automatically yield higher robustness.
Defense evaluations involving randomness require attack procedures adapted to that randomness.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar adjustments may be needed when evaluating other stochastic or ensemble-based defenses.
This critique points to a general requirement that attack generation should reflect the distribution over which the model is defined.
One testable extension is whether uncertainty estimates from BNNs can be explicitly used inside the attack objective to produce stronger examples.

Load-bearing premise

The adjusted attack that incorporates the stochastic nature of the Bayesian network is the appropriate and sufficient method for evaluating the defense's robustness.

What would settle it

Empirical results in which adversarially trained BNNs still exhibit measurably higher robustness than standard networks when evaluated under the adjusted stochastic attack would falsify the central claim.
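
One way to make "measurably higher" concrete is a paired bootstrap interval on the difference in robust accuracy between the two models. The sketch below is a generic statistical check, not a procedure taken from the comment. It assumes that per-example correctness flags (1 if the model still classifies the attacked input correctly, 0 otherwise) have already been collected for both models on the same test points; the function name and parameters are hypothetical.

```python
import random


def bootstrap_diff_ci(flags_a, flags_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for robust_acc(a) - robust_acc(b).

    flags_a and flags_b are equal-length lists of 0/1 correctness flags,
    paired by test example.
    """
    if len(flags_a) != len(flags_b):
        raise ValueError("flag lists must be paired and of equal length")
    rng = random.Random(seed)
    n = len(flags_a)
    diffs = []
    for _ in range(n_boot):
        # Resample test examples with replacement, keeping the pairing intact.
        idx = [rng.randrange(n) for _ in range(n)]
        diffs.append(sum(flags_a[i] - flags_b[i] for i in idx) / n)
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

If the interval for the adversarially trained BNN minus the non-Bayesian baseline lies entirely above zero under the adjusted attack, that would count as evidence against the central claim; an interval that contains zero would be consistent with it.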

Original abstract

A recent paper by Liu et al. combines the topics of adversarial training and Bayesian Neural Networks (BNN) and suggests that adversarially trained BNNs are more robust against adversarial attacks than their non-Bayesian counterparts. Here, I analyze the proposed defense and suggest that one needs to adjust the adversarial attack to incorporate the stochastic nature of a Bayesian network to perform an accurate evaluation of its robustness. Using this new type of attack I show that there appears to be no strong evidence for higher robustness of the adversarially trained BNNs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The comment identifies a real methodological gap in how Adv-BNN was evaluated but gives too little detail to judge how much the robustness claim actually changes.


The central takeaway is that the robustness improvement claimed for adversarially trained Bayesian neural networks may not be real once the attack is adjusted for the model's stochasticity. The comment proposes running the inner maximization over multiple weight samples drawn from the posterior, rather than fixing one draw as in standard PGD attacks on deterministic networks. That adjustment follows logically from the fact that BNN inference involves sampling at test time, so an attack that ignores it can underestimate vulnerability. The paper does a clean job naming this mismatch between training-time stochasticity and attack-time evaluation, which is a point that prior work on Adv-BNN apparently did not address. The logic is straightforward and does not rely on circular assumptions or invented entities.

The main limitation is that the abstract supplies no attack pseudocode, no dataset names, and no quantitative drops in accuracy, so the strength of the re-evaluation cannot be checked from the given text. The claim of “no strong evidence” for higher robustness therefore rests on an unshown experiment. If the full paper contains reproducible attack code and clear numbers, that would strengthen it; otherwise the support stays thin.

This note is mainly useful for researchers who evaluate stochastic or ensemble defenses and want to avoid under-powered attacks. A reader already working on Bayesian robustness would get a useful reminder about attack design. I would bring it to a reading group to discuss the stochastic-attack idea. I would not cite it in my own work unless I were directly addressing the Adv-BNN paper. It deserves peer review as a short comment because the evaluation point is substantive enough to check against the original experiments.

Referee Report

1 major / 0 minor

Summary. The manuscript is a short comment on Liu et al.'s Adv-BNN paper. It argues that standard PGD-style attacks treat BNNs as deterministic and therefore underestimate vulnerability; an adjusted attack that incorporates multiple posterior weight samples inside the inner maximization is required for proper evaluation. Using this adjusted attack the author concludes there is no strong evidence that adversarially trained BNNs are more robust than their non-Bayesian counterparts.

Significance. If the adjusted attack is the appropriate evaluation method and the (unshown) empirical comparison holds, the comment would identify a methodological gap in robustness testing of stochastic defenses. The underlying logic follows directly from the distinction between a deterministic forward pass and an expectation over posterior draws.
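
The distinction between the two evaluation settings can be written out explicitly. The notation below is assumed for illustration and does not come from the comment: f(x; w) is the network with weights w, q(w) the approximate posterior, ℓ the classification loss, ε the L∞ perturbation budget, and n the number of weight samples.

```latex
% Attack against a single fixed weight draw w_0 (the BNN treated as deterministic):
\delta^{\mathrm{det}} = \arg\max_{\|\delta\|_\infty \le \epsilon}
    \ell\bigl(f(x+\delta;\, w_0),\, y\bigr), \qquad w_0 \sim q(w)

% Attack against the expected loss over the posterior, with its Monte Carlo estimate:
\delta^{\mathrm{stoch}} = \arg\max_{\|\delta\|_\infty \le \epsilon}
    \mathbb{E}_{w \sim q(w)}\bigl[\ell\bigl(f(x+\delta;\, w),\, y\bigr)\bigr]
  \approx \arg\max_{\|\delta\|_\infty \le \epsilon}
    \frac{1}{n}\sum_{i=1}^{n} \ell\bigl(f(x+\delta;\, w_i),\, y\bigr), \qquad w_i \sim q(w)
```

A perturbation tuned to one draw w_0 need not transfer to the fresh draw used at prediction time, which is one way the first objective could understate vulnerability relative to the second.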

major comments (1)

[Abstract] Abstract and main text: the central empirical claim ('there appears to be no strong evidence for higher robustness') is asserted without any description of the adjusted attack algorithm, the number of posterior samples used, the datasets, the models, or the quantitative results. This information is load-bearing for the conclusion and is absent from the provided manuscript text.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their review. The central point—that proper evaluation of BNN robustness requires an attack that accounts for posterior stochasticity—is well taken, and we agree the manuscript must supply the missing experimental details to support its empirical claim.

Point-by-point responses

Referee: [Abstract] Abstract and main text: the central empirical claim ('there appears to be no strong evidence for higher robustness') is asserted without any description of the adjusted attack algorithm, the number of posterior samples used, the datasets, the models, or the quantitative results. This information is load-bearing for the conclusion and is absent from the provided manuscript text.

Authors: We agree that the current manuscript text does not contain these details. In the revised version we will insert a concise methods paragraph describing the adjusted attack (PGD with inner maximization performed over multiple posterior samples drawn at each step), the number of samples used (10), the datasets and models evaluated, and the resulting robustness numbers that support the claim of no strong evidence for a BNN advantage. Because the paper is a short comment we will keep the addition brief while making the empirical basis explicit.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The paper is an external methodological critique of Liu et al. It identifies that standard PGD attacks treat BNNs as deterministic and proposes incorporating posterior sampling into the inner maximization loop to evaluate robustness. This adjustment follows directly from the definition of BNN inference and does not rely on any self-referential equations, fitted parameters renamed as predictions, or load-bearing self-citations. No derivation chain exists that reduces to its own inputs; the central claim rests on the observable difference between deterministic and stochastic forward passes, which is independently verifiable.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the premise that the proposed stochastic-aware attack is the correct evaluation standard; this is introduced here rather than derived from prior independent evidence.

axioms (1)

Domain assumption: Standard definitions of adversarial robustness and attack success in machine learning apply without modification to the stochastic setting.
The evaluation of whether robustness is higher depends on these common metrics being appropriate for BNNs.

pith-pipeline@v0.9.0 · 5611 in / 1061 out tokens · 26334 ms · 2026-05-25T11:53:59.856932+00:00 · methodology
