Industrial Surface Defect Detection via Diffusion Generation and Asymmetric Student-Teacher Network

Guangcan Liu; Runlin Zhou; Shuo Feng; Yuyang Li

arxiv: 2604.19240 · v1 · submitted 2026-04-21 · 💻 cs.AI

Pith reviewed 2026-05-10 02:47 UTC · model grok-4.3

classification 💻 cs.AI

keywords industrial defect detection · diffusion models · synthetic data generation · teacher-student network · unsupervised anomaly detection · pixel-level localization

The pith

A diffusion model trained only on normal samples generates realistic defects to train an asymmetric teacher-student network for unsupervised industrial surface defect detection and localization.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to overcome the lack of defect samples for training detection systems in manufacturing by creating artificial defects. It does this by training a denoising diffusion model exclusively on defect-free images and then adding structured noise to produce varied, realistic-looking defects complete with location labels. These synthetic examples are used to train a two-part network where one part stays focused on normal patterns and the other learns to spot and mark deviations. The result is a system that can both classify images as defective or not and highlight the exact defective areas. Such an approach matters because it removes the need to collect and label rare defect occurrences, potentially making automated quality control more practical across industries.

Core claim

Training a denoising diffusion probabilistic model solely on normal samples and using it to generate defects through constant-variance Gaussian perturbations and Perlin noise-based masks, then employing these in an asymmetric teacher-student network with cosine similarity and pixel-wise losses, allows the model to achieve 98.4% image-level AUROC and 98.3% pixel-level AUROC for unsupervised defect detection and localization on the MVTecAD dataset.
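The source does not give the synthesis details, so the following is only a minimal sketch of the general idea, not the authors' implementation. It thresholds a smooth Perlin-style gradient-noise field into a binary defect mask, then adds constant-variance Gaussian noise inside the mask only. The function names, `cell`, `threshold`, and `sigma` are illustrative assumptions, and the DDPM step that the paper applies to the perturbed image is omitted.

```python
import math
import random


def perlin_mask(height, width, cell=8, threshold=0.1, seed=0):
    """Binary mask from thresholded 2D gradient (Perlin-style) noise."""
    rng = random.Random(seed)
    grid_w, grid_h = width // cell + 2, height // cell + 2
    # One random unit gradient vector per lattice corner.
    grads = []
    for _ in range(grid_h):
        row = []
        for _ in range(grid_w):
            angle = rng.uniform(0.0, 2.0 * math.pi)
            row.append((math.cos(angle), math.sin(angle)))
        grads.append(row)

    def fade(t):
        # Perlin's smoothing polynomial 6t^5 - 15t^4 + 10t^3.
        return t * t * t * (t * (t * 6 - 15) + 10)

    def corner(ix, iy, x, y):
        # Dot product of the corner gradient with the offset to (x, y).
        gx, gy = grads[iy][ix]
        return gx * (x - ix) + gy * (y - iy)

    def lerp(a, b, t):
        return a + t * (b - a)

    def noise(x, y):
        x0, y0 = int(x), int(y)
        u, v = fade(x - x0), fade(y - y0)
        top = lerp(corner(x0, y0, x, y), corner(x0 + 1, y0, x, y), u)
        bottom = lerp(corner(x0, y0 + 1, x, y), corner(x0 + 1, y0 + 1, x, y), u)
        return lerp(top, bottom, v)

    return [[1 if noise(c / cell, r / cell) > threshold else 0
             for c in range(width)]
            for r in range(height)]


def synthesize_defect(image, mask, sigma=0.2, seed=0):
    """Add constant-variance Gaussian noise inside the mask, clip to [0, 1]."""
    rng = random.Random(seed)
    return [[min(1.0, max(0.0, px + rng.gauss(0.0, sigma))) if m else px
             for px, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]


normal = [[0.5] * 32 for _ in range(32)]  # stand-in for a defect-free image
mask = perlin_mask(32, 32)                # doubles as the pixel-level label
defective = synthesize_defect(normal, mask)
```

Because the mask is produced before the perturbation, it serves directly as the pixel-level annotation, which is how a pipeline of this kind can yield labels without manual work.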

What carries the argument

Asymmetric teacher-student network in which the teacher extracts stable normal feature representations while the student reconstructs normal patterns and amplifies anomaly discrepancies, trained on data generated by a diffusion model.
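One common way to turn a teacher-student discrepancy into a localization map is a per-location cosine distance between the two feature maps. The sketch below assumes features stored as nested lists of shape height × width × channels and uses the maximum of the map as the image-level score. Both choices are assumptions for illustration; the source does not state how the paper aggregates scores.

```python
import math


def cosine_distance(a, b, eps=1e-8):
    """1 - cosine similarity between two channel vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b + eps)


def anomaly_map(teacher_feats, student_feats):
    """Per-location discrepancy between teacher and student features."""
    return [[cosine_distance(t, s) for t, s in zip(t_row, s_row)]
            for t_row, s_row in zip(teacher_feats, student_feats)]


def image_score(amap):
    """Image-level score: the largest per-location discrepancy (assumed)."""
    return max(max(row) for row in amap)
```

Locations where the student reproduces the teacher's features score near 0, while locations where the two disagree score higher and stand out in the map.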

Load-bearing premise

The defects created by the diffusion model using Gaussian noise and Perlin masks are sufficiently similar to real defects to train a detector that works on actual industrial surfaces.

What would settle it

A comparison where the model performs poorly on a set of real defects whose visual properties do not match those of the generated ones would indicate the generation step is insufficient.

Figures

Figures reproduced from arXiv: 2604.19240 by Guangcan Liu, Runlin Zhou, Shuo Feng, Yuyang Li.

**Figure 1.** The overall pipeline of the local defect synthesis method based on Perlin noise mask.

**Figure 2.** Multi-Task Jointly Driven Training Process.

**Figure 3.** Multi-Task Jointly Driven Inference Process.

**Figure 4.** Visualization of defect localization results on the MVTec AD dataset.

Original abstract

Industrial surface defect detection often suffers from limited defect samples, severe long-tailed distributions, and difficulties in accurately localizing subtle defects under complex backgrounds. To address these challenges, this paper proposes an unsupervised defect detection method that integrates a Denoising Diffusion Probabilistic Model (DDPM) with an asymmetric teacher-student architecture. First, at the data level, the DDPM is trained solely on normal samples. By introducing constant-variance Gaussian perturbations and Perlin noise-based masks, high-fidelity and physically consistent defect samples along with pixel-level annotations are generated, effectively alleviating the data scarcity problem. Second, at the model level, an asymmetric dual-stream network is constructed. The teacher network provides stable representations of normal features, while the student network reconstructs normal patterns and amplifies discrepancies between normal and anomalous regions. Finally, a joint optimization strategy combining cosine similarity loss and pixel-wise segmentation supervision is adopted to achieve precise localization of subtle defects. Experimental results on the MVTecAD dataset show that the proposed method achieves 98.4% image-level AUROC and 98.3% pixel-level AUROC, significantly outperforming existing unsupervised and mainstream deep learning methods. The proposed approach does not require large amounts of real defect samples and enables accurate and robust industrial defect detection and localization.

Keywords: Industrial defect detection · diffusion models · data generation · teacher-student architecture · pixel-level localization

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper generates synthetic defects via DDPM plus Perlin masks from normal samples only, then trains an asymmetric teacher-student network to reach 98.4% image and 98.3% pixel AUROC on MVTecAD, but skips any check that the synthetics actually resemble real defects.

The core idea here is straightforward: train a diffusion model on normal images, apply constant-variance Gaussian noise and Perlin masks to create labeled defect examples, then use those to supervise an asymmetric teacher-student pair where the teacher holds stable normal features and the student learns to highlight differences via cosine similarity plus segmentation loss. That pipeline is the main thing the authors put forward for handling scarce real defects in industrial inspection. It reports strong benchmark numbers without needing actual anomalous training data, which is the practical hook. The approach builds on established diffusion augmentation and teacher-student anomaly detection, but the specific combination of Perlin-masked synthesis with joint pixel-level optimization on MVTecAD is the fresh application. The results look usable on the surface for someone facing long-tailed defect distributions.

The main gap is exactly what the stress test flags: no distributional comparison or ablation shows that the generated defects track real ones instead of Perlin artifacts. Without FID, MMD, expert scoring, or a control run using random masks, the performance lift could come from the student picking up mask-specific patterns rather than genuine defect cues. The abstract also omits error bars, baseline implementation details, and sensitivity analysis on the free parameters like mask settings or loss weights. Those are fixable but leave the central claim under-supported right now.

The work stays empirical with no circular fitting or self-referential loops in the setup. It is aimed at applied computer vision researchers or engineers doing quality control on manufacturing lines who need data-augmentation tricks for unsupervised localization. A reader already working on diffusion for anomaly tasks or teacher-student variants would pick up concrete implementation choices worth testing.

I would send it for peer review. The benchmark results and the end-to-end pipeline are concrete enough that referees can evaluate the missing validation steps and ask for the necessary ablations.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes an unsupervised industrial surface defect detection method that trains a DDPM solely on normal samples, then synthesizes defects via constant-variance Gaussian perturbations and Perlin noise masks to produce both images and pixel-level annotations. These synthetic data train an asymmetric teacher-student network in which the teacher supplies stable normal representations while the student reconstructs normals and amplifies anomalies; joint optimization uses cosine similarity plus pixel-wise segmentation supervision. On MVTecAD the method reports 98.4% image-level AUROC and 98.3% pixel-level AUROC, which the authors describe as outperforming existing unsupervised and mainstream deep learning methods, without requiring real defect samples.
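For readers less familiar with the two reported metrics, the sketch below computes AUROC from ranks (the Mann-Whitney formulation, with average ranks for ties). Image-level AUROC uses one score per image; pixel-level AUROC pools every pixel of every anomaly map against the ground-truth masks. The function names are illustrative, and the paper's exact evaluation code is not given in the source.

```python
def auroc(scores, labels):
    """Area under the ROC curve via rank sums; labels are 0 or 1."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs both positive and negative samples")
    pairs = sorted(zip(scores, labels))
    rank_sum = 0.0
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        # Tied scores occupy ranks i+1 .. j and share their average rank.
        avg_rank = (i + 1 + j) / 2.0
        rank_sum += avg_rank * sum(label for _, label in pairs[i:j])
        i = j
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def pixel_auroc(anomaly_maps, masks):
    """Pool all pixels of all images, then compute a single AUROC."""
    scores = [v for amap in anomaly_maps for row in amap for v in row]
    labels = [m for mask in masks for row in mask for m in row]
    return auroc(scores, labels)
```

Pixel-level AUROC computed this way is dominated by the many normal pixels, which is one reason localization papers often report a region-based metric alongside it.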

Significance. If the synthetic defects prove distributionally faithful to real industrial anomalies, the approach would meaningfully alleviate data scarcity and long-tailed distributions that currently limit supervised defect detectors. The combination of diffusion-based synthesis with an asymmetric distillation architecture is a coherent and potentially reusable design pattern for other anomaly-detection domains.

major comments (3)

[defect synthesis procedure (Section 3.1)] The central performance claim rests on the assertion that DDPM-generated defects (constant-variance Gaussian + Perlin masks) are high-fidelity and physically consistent with real MVTecAD anomalies, yet the manuscript supplies no distributional metrics (FID, MMD, or expert visual scoring) comparing synthetic versus real defect images, nor any ablation replacing the Perlin-masked outputs with random noise or simpler masks. Without this evidence it is impossible to rule out that the student network is merely learning Perlin-specific artifacts rather than genuine defect cues.
[Experiments and Results (Section 4)] The experimental section reports single-point AUROC figures (98.4% / 98.3%) with no error bars, no repeated runs with different random seeds, and no statistical significance tests against the reproduced baselines. This makes the claim of “significant outperformance” difficult to evaluate.
[Implementation details and ablation studies (Section 4.2)] No ablation is presented on the free parameters listed in the method (Gaussian perturbation variance, Perlin noise mask parameters, or the weighting between cosine similarity and segmentation losses). Consequently it is unclear whether the reported numbers are robust or the result of post-hoc tuning on the test set.
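The distributional check requested in the first major comment could take several forms. As one hedged illustration, the sketch below computes a biased estimate of squared Maximum Mean Discrepancy (MMD) with an RBF kernel between two sets of feature vectors. How the features are extracted from synthetic and real defect images, and the bandwidth `gamma`, are assumptions left to the experimenter.

```python
import math


def rbf_kernel(a, b, gamma):
    """Gaussian RBF kernel exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def mmd2(xs, ys, gamma=1.0):
    """Biased estimate of squared MMD between samples xs and ys."""
    def mean_kernel(p, q):
        total = sum(rbf_kernel(a, b, gamma) for a in p for b in q)
        return total / (len(p) * len(q))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)
```

A value close to zero indicates that the two feature sets are hard to distinguish under the chosen kernel. Running the same comparison with random or simpler masks in place of the Perlin masks would give the control the referee asks for.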

minor comments (2)

[Abstract and Introduction] The abstract and introduction repeatedly use the phrase “high-fidelity and physically consistent” without defining the criteria; a short paragraph clarifying the intended meaning would improve readability.
[Experimental setup] Baseline implementation details (exact architectures, training epochs, data augmentations) are referenced only by citation; a brief table summarizing the reproduced settings would aid reproducibility.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback and for recognizing the potential of combining diffusion-based defect synthesis with an asymmetric teacher-student architecture. We agree that stronger empirical validation is required for the synthetic data fidelity, experimental reproducibility, and hyperparameter robustness. We address each major comment below and will revise the manuscript to incorporate the requested evidence and analyses.

Referee: [defect synthesis procedure (Section 3.1)] The central performance claim rests on the assertion that DDPM-generated defects (constant-variance Gaussian + Perlin masks) are high-fidelity and physically consistent with real MVTecAD anomalies, yet the manuscript supplies no distributional metrics (FID, MMD, or expert visual scoring) comparing synthetic versus real defect images, nor any ablation replacing the Perlin-masked outputs with random noise or simpler masks. Without this evidence it is impossible to rule out that the student network is merely learning Perlin-specific artifacts rather than genuine defect cues.

Authors: We agree that quantitative validation of the synthetic defects is essential to support our claims of high fidelity. In the revised manuscript we will add FID and MMD scores computed between the DDPM-generated defect images and the real defect samples in MVTecAD. We will also include side-by-side visual comparisons and, to the extent possible, a small-scale expert visual scoring study. In addition, we will insert an ablation in Section 4.2 that replaces the Perlin masks with random Gaussian noise masks (and simpler alternatives) while keeping all other components fixed, thereby demonstrating that performance gains arise from realistic defect cues rather than mask-specific artifacts.
Revision: yes

Referee: [Experiments and Results (Section 4)] The experimental section reports single-point AUROC figures (98.4% / 98.3%) with no error bars, no repeated runs with different random seeds, and no statistical significance tests against the reproduced baselines. This makes the claim of “significant outperformance” difficult to evaluate.

Authors: We acknowledge that single-run point estimates without statistical support weaken the evaluation. In the revision we will repeat all experiments (including baseline reproductions) with at least five different random seeds, report mean image-level and pixel-level AUROC together with standard deviations, and include paired t-tests (or equivalent non-parametric tests) to establish statistical significance of the improvements over the reproduced baselines.
Revision: yes

Referee: [Implementation details and ablation studies (Section 4.2)] No ablation is presented on the free parameters listed in the method (Gaussian perturbation variance, Perlin noise mask parameters, or the weighting between cosine similarity and segmentation losses). Consequently it is unclear whether the reported numbers are robust or the result of post-hoc tuning on the test set.

Authors: We agree that sensitivity analysis of the listed hyperparameters is necessary to demonstrate robustness. We will expand Section 4.2 with systematic ablations on (i) the constant-variance Gaussian perturbation level, (ii) Perlin noise parameters such as scale and octaves, and (iii) the relative weighting between the cosine-similarity and pixel-wise segmentation losses. These studies will be performed on a held-out validation split to avoid test-set tuning and will be summarized with performance curves or tables.
Revision: yes
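The repeated-seed protocol described in the second response can be summarized with a mean, a sample standard deviation, and a paired t statistic over per-seed scores. The sketch below uses only the Python standard library. The numbers are made-up placeholders for illustration, not results from the paper, and the function names are hypothetical.

```python
import math
import statistics


def summarize(values):
    """Mean and sample standard deviation over repeated runs."""
    return statistics.mean(values), statistics.stdev(values)


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for matched runs."""
    diffs = [x - y for x, y in zip(a, b)]
    spread = statistics.stdev(diffs)
    if spread == 0:
        raise ValueError("all paired differences are identical")
    t_stat = statistics.mean(diffs) / (spread / math.sqrt(len(diffs)))
    return t_stat, len(diffs) - 1


# Placeholder AUROC values for five seeds (NOT results from the paper).
method_runs = [0.90, 0.92, 0.91, 0.93, 0.94]
baseline_runs = [0.88, 0.91, 0.89, 0.90, 0.93]

mean, std = summarize(method_runs)
t_stat, dof = paired_t(method_runs, baseline_runs)
print(f"method: {mean:.3f} +/- {std:.3f}, t = {t_stat:.2f}, dof = {dof}")
```

With five seeds there are four degrees of freedom, so a two-sided test at the 5% level requires |t| above roughly 2.776. Pairing is only valid if the method and the baseline share the same seeds and data splits.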

Circularity Check

0 steps flagged

No circularity; empirical method with independent benchmark evaluation

The paper describes an empirical pipeline: DDPM trained only on normal samples, followed by synthetic defect generation via constant-variance Gaussian noise plus Perlin masks, then training of an asymmetric teacher-student network with cosine similarity and segmentation losses. Reported performance consists of AUROC numbers on the public MVTecAD benchmark. No equations, fitted parameters, or self-citations are invoked in a way that reduces the central claims to the inputs by construction. The derivation chain is self-contained and externally falsifiable via the stated dataset and metrics.

Axiom & Free-Parameter Ledger

3 free parameters · 2 axioms · 0 invented entities

The central claim rests on the unverified assumption that synthetic defects are realistic enough for training, plus multiple unspecified hyperparameters in the diffusion process and network training; no new physical entities are postulated.

free parameters (3)

Gaussian perturbation variance
Constant-variance noise added during defect generation; specific value chosen but not reported in abstract.
Perlin noise mask parameters
Settings controlling mask shape and scale for defect localization; chosen to produce physically consistent samples.
Loss weighting between cosine similarity and segmentation supervision
Balance factor in joint optimization; not specified.
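To make the three free parameters concrete, the sketch below shows one plausible form of the joint objective, a cosine term plus a weighted pixel-wise term, and a grid over the parameters for a sensitivity study. Binary cross-entropy as the pixel-wise loss, the additive weighting, and all names and values are assumptions, since the source does not specify them.

```python
import itertools
import math


def pixel_bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between a predicted map and a 0/1 mask."""
    total, count = 0.0, 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
            total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
            count += 1
    return total / count


def joint_loss(cosine_loss, seg_loss, seg_weight=1.0):
    """Assumed joint objective: cosine term plus weighted segmentation term."""
    return cosine_loss + seg_weight * seg_loss


def sweep(sigmas, perlin_cells, seg_weights):
    """Every combination of the three free parameters, one dict per run."""
    return [{"sigma": s, "perlin_cell": c, "seg_weight": w}
            for s, c, w in itertools.product(sigmas, perlin_cells, seg_weights)]


# Hypothetical grid; the values are illustrative only.
configs = sweep([0.1, 0.2], [4, 8], [0.5, 1.0, 2.0])
```

Each configuration would be scored on a held-out validation split rather than the test set, so that the chosen setting is not tuned on the data used for the final numbers.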

axioms (2)

[domain assumption] DDPM trained only on normal samples can produce high-fidelity, physically consistent defect images when combined with Gaussian perturbations and Perlin masks
Invoked in the data-generation stage to justify synthetic sample quality.
[domain assumption] The teacher network provides stable normal representations that the student can reliably reconstruct except at defect locations
Core premise of the asymmetric architecture.


[1] [1]

Image-based surface defect detection using deep learning: A review[J]

Bhatt P M, Malhan R K, Rajendran P, et al. Image-based surface defect detection using deep learning: A review[J]. Journal of Computing and Information Science in Engineering, 2021, 21(4): 040801

work page 2021

[2] [2]

Defect image sample generation with GAN for improving defect recognition[J]

Niu S, Wang Y, Wang F, et al. Defect image sample generation with GAN for improving defect recognition[J]. IEEE Transactions on Automation Science and Engineering, 2020, 18(3): 1071–1082

work page 2020

[3] [3]

Few-shot defect image generation via defect-aware feature manipulation[C]

Duan Y, Liu J, Wang Z, et al. Few-shot defect image generation via defect-aware feature manipulation[C]. Proceedings of the AAAI Conference on Artificial Intel- ligence, 2023: 1346–1354

work page 2023

[4] [4]

Improved denoising diffusion probabilistic models[C]

Nichol A Q, Dhariwal P. Improved denoising diffusion probabilistic models[C]. International Conference on Machine Learning, 2021: 8162–8171

work page 2021

[5] [5]

On diffusion modeling for anomaly detection[C]

Livernoche V, K¨ ohler L, Eisenbacher M, et al. On diffusion modeling for anomaly detection[C]. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024: 2032–2041

work page 2024

[6] [6]

High-resolution image synthesis with latent diffusion models[C]

Rombach R, Blattmann A, Lorenz R, et al. High-resolution image synthesis with latent diffusion models[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022: 10684–10695

work page 2022

[7] [7]

Denoising diffusion implicit models[C]

Song J, Meng C, Ermon S. Denoising diffusion implicit models[C]. International Conference on Learning Representations, 2020

work page 2020

[8] [8]

Score-based generative modeling through stochastic differential equations[C]

Song Y, Sohl-Dickstein J, Kingma D P, et al. Score-based generative modeling through stochastic differential equations[C]. International Conference on Learning Representations, 2020

work page 2020

[9] [9]

Denoising diffusion probabilistic models[C]

Ho J, Jain A, Abbeel P. Denoising diffusion probabilistic models[C]. Advances in Neural Information Processing Systems, 2020, 33: 6840–6851

work page 2020

[10] [10]

Diffusion models beat GANs on image synthesis[C]

Dhariwal P, Nichol A. Diffusion models beat GANs on image synthesis[C]. Ad- vances in Neural Information Processing Systems, 2021, 34: 8780–8794

work page 2021

[11] [11]

A computational approach to edge detection[J]

Canny J. A computational approach to edge detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1986, 8(6): 679–698

work page 1986

[12] [12]

Multiresolution gray-scale and rotation in- variant texture classification with local binary patterns[J]

Ojala T, Pietikainen M, Maenpaa T. Multiresolution gray-scale and rotation in- variant texture classification with local binary patterns[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002, 24(7): 971–987

work page 2002

[13] [13]

U-net: Convolutional networks for biomedical image segmentation[C]

Ronneberger O, Fischer P, Brox T. U-net: Convolutional networks for biomedical image segmentation[C]. International Conference on Medical Image Computing and Computer-Assisted Intervention, 2015: 234–241

work page 2015

[14] [14]

Faster r-cnn: Towards real-time object detection with region proposal networks[C]

Ren S, He K, Girshick R, et al. Faster r-cnn: Towards real-time object detection with region proposal networks[C]. Advances in Neural Information Processing Sys- tems, 2015, 28: 91–99

work page 2015

[15] [15]

Patchcore: Towards total recall in industrial anomaly detection[C]

Roth K, Pemula L, Zepeda J, et al. Patchcore: Towards total recall in industrial anomaly detection[C]. Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021: 14009–14019

work page 2021

[16] [16]

Student-teacher feature pyramid matching for anomaly detection[C]

Wang G, Han S, Ding E, et al. Student-teacher feature pyramid matching for anomaly detection[C]. Proceedings of the British Machine Vision Conference, 2021: 1–14. 14 S. Feng et al

work page 2021

[17] [17]

Autoaugment: Learning augmentation policies from data[C]

Cubuk E D, Zoph B, Mane D, et al. Autoaugment: Learning augmentation policies from data[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 113–123

work page 2019

[18] Cubuk E D, Zoph B, Shlens J, et al. Randaugment: Practical automated data augmentation with a reduced search space[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020: 702–703

[19] Zhang G, Cui K, Hung T Y, et al. Defect-gan: High-fidelity defect synthesis for automated defect inspection[C]. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2021: 2515–2524

[20] Kingma D P, Welling M. Auto-encoding variational bayes[J]. arXiv preprint arXiv:1312.6114, 2013

[21] Li C L, Sohn K, Yoon J, et al. Cutpaste: Self-supervised learning for anomaly detection and localization[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 9664–9674

[22] Zavrtanik V, Kristan M, Skocaj D. Draem-a discriminatively trained reconstruction embedding for surface anomaly detection[C]. Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021: 8330–8339

[23] Zhang H, Wang Z, Wu Z, et al. Diffusionad: Denoising diffusion for anomaly detection[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023: 843–852

[24] Zhang X, Li S, Li X, et al. Destseg: Segmentation guided denoising student-teacher for anomaly detection[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023: 3914–3923

[25] Jeong J, Kim S, Seo D, et al. Winclip: Zero-/few-shot anomaly classification and segmentation[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023: 19613–19622

[26] Bergmann P, Fauser M, Sattlegger D, et al. MVTec AD – A comprehensive real-world dataset for unsupervised anomaly detection[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 9592–9600

[27] Bergmann P, Fauser M, Sattlegger D, et al. Uninformed students: Student-teacher anomaly detection with discriminative latent features[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020: 8771–8780

[28] Schneider T, Bergmann P, Steger C. The per-region overlap (PRO) score: A fair evaluation metric for anomaly localization[J]. arXiv preprint arXiv:2009.14067, 2020

[29] Liu Z, Wang Y, Han Y, et al. SimpleNet: A simple network for image anomaly detection and localization[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023: 20402–20411

[30] Zhang X, Xu M, Zhou X. RealNet: A feature selection network with realistic synthetic anomaly for anomaly detection[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024: 23678–23687

[31] Gudovskiy D, Ishizaka S, Kozuka K. CFLOW-AD: Real-time unsupervised anomaly detection with localization via conditional normalizing flows[C]. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022: 1434–1442

[32] Lei J, Hu X, Wang Y, et al. PyramidFlow: High-resolution defect contrastive localization using pyramid normalizing flow[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023: 14143–14152

[33] He H, Zhang J, Chen H, et al. DiAD: A diffusion-based framework for multi-class anomaly detection[J]. arXiv preprint arXiv:2312.06607, 2023

[34] Chen L, You Z, Zhang N, et al. UTRAD: Anomaly detection and localization with U-Transformer[J]. Neural Networks, 2022, 147: 53–62