arxiv: 2604.18393 · v1 · submitted 2026-04-20 · 💻 cs.CV

One-Step Diffusion with Inverse Residual Fields for Unsupervised Industrial Anomaly Detection

Boan Zhang , Wen Li , Guanhua Yu , Xiyang Liu , Wenchao Chen , Long Tian

Pith reviewed 2026-05-10 04:38 UTC · model grok-4.3

keywords: unsupervised anomaly detection, diffusion models, one-step inference, inverse residual fields, industrial defect detection, DDPM, Gaussian density scoring

The pith

Inverse residual fields from a single diffusion step distinguish anomalies for fast unsupervised detection.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes training a standard diffusion model on normal industrial images only. For any test image, it runs one forward pass to estimate noise and derives an inverse residual field from it. Anomalies then appear as outliers when the density of this field is scored under a simple Gaussian, allowing detection without iterating the full denoising chain. This yields performance on par with slower diffusion-based detectors while cutting inference time roughly in half. The approach rests on the observation that the residual field already encodes the distinction between normal and off-manifold data at nearby time steps.

Core claim

By computing the inverse residual field from the noise predicted by a pre-trained unconditional DDPM in a single step, test samples can be scored for anomaly by their probability density under a fitted Gaussian; this IRF representation makes anomalies distinguishable, and the property holds for any neighboring denoising step, enabling one-step inference for uIAD.

What carries the argument

The inverse residual field (IRF), derived directly from the estimated noise at a chosen time step, which maps input images into a space where normal data concentrates under a Gaussian density while anomalies deviate.
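
For orientation, the standard DDPM relations that any noise-based construction of this kind starts from can be written as below. This is textbook DDPM background, not the paper's IRF definition, which the abstract does not spell out; the symbols $\bar{\alpha}_t$, $\beta_s$, and $\hat{x}_0$ follow the usual DDPM notation and are assumptions here.

```latex
% Forward (noising) process at step t, with \bar{\alpha}_t = \prod_{s=1}^{t} (1 - \beta_s)
x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon,
\qquad \epsilon \sim \mathcal{N}(0, I)

% One-step estimate of the clean input from a single noise prediction
\hat{x}_0 = \frac{x_t - \sqrt{1 - \bar{\alpha}_t}\, \epsilon_\theta(x_t, t)}{\sqrt{\bar{\alpha}_t}}
```

How the IRF is built from $\epsilon_\theta(x_t, t)$ is left open by the text above, so no formula for it is given here.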

If this is right

uIAD can be performed with a single diffusion step instead of iterative denoising, achieving roughly 2X faster inference.
The method reaches state-of-the-art or competitive results on standard benchmarks across six metrics without any distillation or conditioning.
Anomalies are detected by thresholding the Gaussian probability density of the IRF rather than reconstruction error in pixel space.
The separation property of IRF persists across neighboring time steps in the diffusion process.
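
The density-scoring step can be sketched as follows. This is a minimal illustration assuming a diagonal Gaussian and small hand-written vectors standing in for IRF features; the paper's actual feature layout and covariance structure are not given in the abstract, and the names `fit_diag_gaussian` and `log_density` are hypothetical.

```python
import math
from statistics import fmean, pvariance

def fit_diag_gaussian(rows, eps=1e-6):
    """Fit a per-dimension mean and variance to a list of equal-length vectors."""
    cols = list(zip(*rows))
    means = [fmean(c) for c in cols]
    # eps keeps the variance strictly positive for constant dimensions.
    variances = [pvariance(c, mu) + eps for c, mu in zip(cols, means)]
    return means, variances

def log_density(x, means, variances):
    """Log-density of x under the fitted diagonal Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, means, variances)
    )

# Hypothetical 2-D vectors standing in for IRF values of normal images.
normal_feats = [[0.1, -0.2], [0.0, 0.1], [-0.1, 0.0], [0.2, 0.1], [-0.2, 0.0]]
means, variances = fit_diag_gaussian(normal_feats)

typical = log_density([0.0, 0.0], means, variances)
outlier = log_density([3.0, -3.0], means, variances)
is_anomaly = outlier < typical  # lower density means more anomalous
```

Working in log-density avoids numerical underflow when the feature dimension is large, which is why the sketch never exponentiates.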

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the IRF separation holds, the same one-step scoring could be tested on other diffusion-based tasks like out-of-distribution detection beyond industry.
Combining IRF scoring with existing reconstruction losses might further improve robustness on subtle defects.
Since no conditioning is used, the approach may extend to domains where labeled anomalies are scarce but normal data is abundant.

Load-bearing premise

That the inverse residual field from one noise estimate will place anomalies outside the Gaussian density fitted on normal samples, and that this separation remains reliable for nearby time steps.

What would settle it

A benchmark experiment showing that the Gaussian density scores for known anomalous images overlap substantially with those of normal images on the MVTec-AD, VisA, or MPDD datasets (the three named in the paper's Figure 3 caption) would falsify the central claim.
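
One way to quantify "overlap substantially" is the AUROC of the anomaly scores: a value near 0.5 means the two score distributions are indistinguishable, a value near 1.0 means they are cleanly separated. The sketch below computes it by pairwise comparison; the scores are made up for illustration and the function name `auroc` is hypothetical.

```python
def auroc(normal_scores, anomaly_scores):
    """Probability that a random anomaly scores higher than a random normal sample.

    Ties count as one half. Scores follow 'higher means more anomalous',
    for example the negative log-density.
    """
    wins = 0.0
    for a in anomaly_scores:
        for n in normal_scores:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(anomaly_scores) * len(normal_scores))

# Hypothetical anomaly scores, for illustration only.
separated = auroc([0.1, 0.2, 0.3], [0.9, 1.1, 1.4])
overlapping = auroc([0.1, 0.5, 0.9], [0.2, 0.4, 1.0])
```

The pairwise loop is quadratic in the number of samples; it is meant to make the definition explicit, not to be fast.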

Figures

Figures reproduced from arXiv: 2604.18393 by Boan Zhang, Guanhua Yu, Long Tian, Wenchao Chen, Wen Li, Xiyang Liu.

**Figure 2.** Trajectories of the one-step transition from the input data space

**Figure 3.** Visualizations of anomaly localization by our model on MVTec-AD, VisA, and MPDD.

Original abstract

Diffusion models have achieved outstanding performance in unsupervised industrial anomaly detection (uIAD) by learning a manifold of normal data under the common assumption that off-manifold anomalies are harder to generate, resulting in larger reconstruction errors in data space or lower probability densities in the tractable latent space. However, their iterative denoising and noising nature leads to slow inference. In this paper, we propose OSD-IRF, a novel one-step diffusion with inverse residual fields, to address this limitation for uIAD task. We first train a deep diffusion probabilistic model (DDPM) on normal data without any conditioning. Then, for a test sample, we predict its inverse residual fields (IRF) based on the noise estimated by the well-trained parametric noise function of the DDPM. Finally, uIAD is performed by evaluating the probability density of the IRF under a Gaussian distribution and comparing it with a threshold. Our key observation is that anomalies become distinguishable in this IRF space, a finding that has seldom been reported in prior works. Moreover, OSD-IRF requires only single step diffusion for uIAD, thanks to the property that IRF holds for any neighboring time step in the denoising process. Extensive experiments on three widely used uIAD benchmarks show that our model achieves SOTA or competitive performance across six metrics, along with roughly a 2X inference speedup without distillation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper gives a simple one-step diffusion trick for industrial anomaly detection via inverse residual fields, which is practically useful if the empirical invariance holds, but the lack of derivation for that invariance is the main soft spot.

This paper shows how to do diffusion-based unsupervised anomaly detection in one step instead of many by introducing inverse residual fields. They train a plain DDPM on normal industrial images, then at test time pull the noise estimate from the model and turn it into an IRF, which they model as Gaussian for scoring anomalies. The claim is that anomalies stand out clearly in this space and that the IRF property holds across nearby timesteps, so no need for iteration.

The new part is this IRF representation and the observation that it separates anomalies without extra training or conditioning. It does well on the practical side: the method is simple to implement on top of existing DDPMs, and they report SOTA or competitive numbers on three common benchmarks plus a solid inference speedup. For applications where speed matters more than squeezing every last point of accuracy, this could be a direct win.

The weaker part is the justification for why IRF works this way. The invariance across timesteps is stated as a key property that holds, but it reads more like an empirical finding than something derived from the forward or reverse process. If the noise prediction error behaves differently at different t for anomalous inputs, the single-step guarantee could be fragile. They also fit a Gaussian without showing much about whether the IRF for normals is actually normal or how sensitive the threshold is. The experiments apparently support the claims, but more ablations on timestep choice and IRF definition would strengthen it.

This work is aimed at practitioners in industrial computer vision who need fast anomaly detection without heavy compute at inference. It won't change how we think about diffusion models broadly, but it offers a concrete engineering improvement. I think it deserves a serious referee. The idea is clean enough and the results look promising enough that reviewers should examine the details and see if the invariance holds up under closer inspection.

Referee Report

3 major / 2 minor

Summary. The paper proposes OSD-IRF, a one-step diffusion approach for unsupervised industrial anomaly detection. It trains an unconditional DDPM on normal samples, computes inverse residual fields (IRF) from the model's single noise prediction on a test input, and detects anomalies by fitting a Gaussian to the IRF distribution and thresholding its density. The central claims are that anomalies become distinguishable in IRF space (a seldom-reported property) and that this separation holds for any neighboring timestep, enabling single-step inference with SOTA or competitive results on three uIAD benchmarks plus a 2x speedup.

Significance. If the IRF separation property and its timestep invariance are robustly established, the work would be significant for practical uIAD: it directly tackles the iterative inference bottleneck of diffusion models while repurposing an existing noise estimator without extra conditioning or distillation. The approach could influence efficient generative-model pipelines in industrial inspection if the single-step Gaussian scoring proves reliable across datasets.

major comments (3)

[Method section (IRF definition and timestep invariance claim)] The assertion that 'IRF holds for any neighboring time step in the denoising process' (abstract and method description) is presented only as an empirical observation without derivation from the DDPM forward process, reverse process, or noise-prediction error structure. This property is load-bearing for the one-step guarantee; without showing that the residual distribution remains stationary for both in-distribution and anomalous inputs across small Δt, the single-step claim risks collapse when the noise estimator's behavior shifts with t.
[Method section (Gaussian density evaluation)] The final anomaly score relies on evaluating IRF under a fitted Gaussian density with a threshold, yet no analysis is provided on the normality assumption (e.g., Q-Q plots, Kolmogorov-Smirnov tests) or sensitivity to the specific timestep t chosen for the single noise prediction. This directly affects the reliability of the separation claim for normal vs. anomalous samples.
[Experiments section] While the abstract states 'extensive experiments on three widely used uIAD benchmarks show SOTA or competitive performance across six metrics,' the provided text contains no quantitative tables, ablation studies on timestep choice or IRF computation, or error-bar statistics. This makes it impossible to verify the performance claims or the robustness of the IRF separation observation.
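
The normality check requested in the second major comment could look roughly like the sketch below, which computes the Kolmogorov-Smirnov distance between a one-dimensional sample and a normal distribution fitted to it. The data are synthetic stand-ins, not IRF values, and `ks_statistic_vs_normal` is a hypothetical helper. Because the mean and standard deviation are estimated from the same sample, standard KS p-value tables do not apply directly, so only the statistic is reported.

```python
import math
from statistics import NormalDist, fmean, stdev

def ks_statistic_vs_normal(samples):
    """Kolmogorov-Smirnov distance between the sample and a normal fitted to it."""
    xs = sorted(samples)
    n = len(xs)
    dist = NormalDist(fmean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs):
        cdf = dist.cdf(x)
        # Compare the model CDF with the empirical CDF just before and at x.
        d = max(d, cdf - i / n, (i + 1) / n - cdf)
    return d

# Synthetic stand-ins: evenly spaced quantiles of a normal and of an exponential.
n = 200
gaussian_like = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
skewed = [-math.log(1 - (i + 0.5) / n) for i in range(n)]

d_gaussian = ks_statistic_vs_normal(gaussian_like)
d_skewed = ks_statistic_vs_normal(skewed)
```

A small distance for the first sample and a clearly larger one for the second is the pattern a reader would hope to see for normal-sample IRF values versus a poorly fitting alternative.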

minor comments (2)

[Method section] The notation for 'inverse residual fields (IRF)' is introduced without an explicit equation relating it to the DDPM noise predictor ε_θ(x_t, t) or the forward-process residual; a clear definition (e.g., Eq. (X)) would improve reproducibility.
[Experiments section] The paper mentions 'roughly a 2X inference speedup without distillation' but does not report wall-clock timings, hardware details, or comparison against other one-step diffusion baselines.
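
A wall-clock comparison of the kind the second minor comment asks for can be sketched with a small timing harness. The two workloads below are hypothetical CPU stand-ins for a single model pass and a ten-step loop, not the paper's models; on a GPU the device would also need to be synchronized before each clock read.

```python
import time
from statistics import median

def time_call(fn, repeats=20, warmup=3):
    """Median wall-clock seconds per call of fn(), after a few warm-up calls."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return median(samples)

# Hypothetical stand-ins: one pass through a model vs. a ten-step loop.
def one_step():
    return sum(i * i for i in range(20_000))

def ten_steps():
    return [one_step() for _ in range(10)]

speedup = time_call(ten_steps) / time_call(one_step)
```

Reporting the median over repeated runs, together with the hardware used, makes a "roughly 2X" figure checkable by others.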

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback on our manuscript. We address each major comment point by point below, indicating the revisions we will make to strengthen the theoretical grounding, validation, and experimental presentation of OSD-IRF.

Referee: [Method section (IRF definition and timestep invariance claim)] The assertion that 'IRF holds for any neighboring time step in the denoising process' (abstract and method description) is presented only as an empirical observation without derivation from the DDPM forward process, reverse process, or noise-prediction error structure. This property is load-bearing for the one-step guarantee; without showing that the residual distribution remains stationary for both in-distribution and anomalous inputs across small Δt, the single-step claim risks collapse when the noise estimator's behavior shifts with t.

Authors: We acknowledge that the timestep invariance of IRF is presented primarily as an empirical observation in the current manuscript and is indeed central to the one-step inference claim. While a complete closed-form derivation is challenging given the learned noise estimator, we will revise the method section to include a more detailed justification grounded in the DDPM forward process and the structure of the noise-prediction error. Specifically, we will show why the inverse residual remains approximately stationary for small Δt by analyzing the expected reconstruction behavior for in-distribution samples (where the model was trained to predict noise accurately) versus anomalies. We will also expand the empirical analysis with additional plots demonstrating invariance across neighboring timesteps for both normal and anomalous inputs. revision: yes
Referee: [Method section (Gaussian density evaluation)] The final anomaly score relies on evaluating IRF under a fitted Gaussian density with a threshold, yet no analysis is provided on the normality assumption (e.g., Q-Q plots, Kolmogorov-Smirnov tests) or sensitivity to the specific timestep t chosen for the single noise prediction. This directly affects the reliability of the separation claim for normal vs. anomalous samples.

Authors: We agree that explicit validation of the Gaussian modeling assumption and timestep sensitivity would improve the reliability of the anomaly scoring procedure. In the revised manuscript, we will add Q-Q plots and Kolmogorov-Smirnov test statistics for the IRF distributions on normal samples across the benchmarks. We will also include a sensitivity study varying the single-step timestep t and reporting the resulting anomaly detection metrics to confirm robustness of the separation in IRF space. revision: yes
Referee: [Experiments section] While the abstract states 'extensive experiments on three widely used uIAD benchmarks show SOTA or competitive performance across six metrics,' the provided text contains no quantitative tables, ablation studies on timestep choice or IRF computation, or error-bar statistics. This makes it impossible to verify the performance claims or the robustness of the IRF separation observation.

Authors: We apologize for any lack of clarity in the experimental presentation in the reviewed version. The full manuscript contains quantitative comparisons on the three uIAD benchmarks (MVTec-AD, VisA, and MPDD) across the six metrics, but to directly address the concern we will expand the experiments section with full tables including error bars from multiple random seeds, dedicated ablations on timestep choice and IRF computation variants, and additional visualizations of the IRF separation for normal versus anomalous samples. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected; derivation remains independent of inputs

The paper trains a standard unconditional DDPM on normal data only, then defines IRF directly from the model's single noise prediction on a test sample and scores via an external Gaussian density fit on normal IRF values. The anomaly distinguishability is explicitly labeled an empirical observation, and the neighboring-timestep invariance is asserted as a property enabling one-step use without any equation reducing the final density score to a retraining of the DDPM or to the Gaussian parameters themselves. No self-definitional loop, fitted-input-as-prediction, or load-bearing self-citation appears in the abstract or described chain; the anomaly score is computed from the trained model's output in a manner that is statistically independent of the training objective once the model is fixed.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 1 invented entity

The central claim rests on the novel IRF entity and two domain assumptions whose validity is asserted rather than derived; full parameter counts and exact fitting procedures are not visible in the abstract.

free parameters (1)

anomaly threshold
Decision threshold on Gaussian density of IRF; must be chosen on held-out normal data.
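
Choosing that threshold on held-out normal data can be sketched as a simple quantile rule: fix the share of normal samples one is willing to flag, and take the corresponding low quantile of their log-densities. The numbers and the helper names `threshold_from_normals` and `flag` are hypothetical; the paper's own selection procedure is not visible in the abstract.

```python
def threshold_from_normals(log_densities, false_positive_rate=0.1):
    """Pick the density cutoff that flags roughly the given share of held-out normals."""
    ranked = sorted(log_densities)
    k = int(false_positive_rate * len(ranked))
    return ranked[k]

def flag(log_density, threshold):
    """A sample is flagged when its density falls below the cutoff."""
    return log_density < threshold

# Hypothetical held-out log-densities of normal samples: -1.0 down to -2.9.
held_out = [-(10 + i) / 10 for i in range(20)]
cutoff = threshold_from_normals(held_out, 0.1)
flagged = sum(flag(v, cutoff) for v in held_out)
```

With 20 held-out values and a 10% rate, the two lowest-density samples fall below the cutoff.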

axioms (2)

domain assumption: Off-manifold anomalies produce larger reconstruction errors or lower densities when a DDPM is trained only on normal data.
Stated as the common assumption underlying diffusion-based uIAD.
ad hoc to paper: The inverse residual field property holds for any neighboring time step in the denoising process.
Invoked to justify single-step inference.

invented entities (1)

Inverse Residual Fields (IRF): no independent evidence
purpose: A transformed representation in which anomalies become distinguishable under a Gaussian density.
Newly introduced construct whose definition and computation are described only at the level of the abstract.
