MuSc-V2: Zero-Shot Multimodal Industrial Anomaly Classification and Segmentation with Mutual Scoring of Unlabeled Samples
Pith reviewed 2026-05-17 22:52 UTC · model grok-4.3
The pith
Mutual scoring of unlabeled patches in 2D and 3D separates anomalies for zero-shot industrial detection.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that normal patches have many similar counterparts while anomalies stay isolated, and that MuSc-V2 exploits this contrast to deliver strong zero-shot anomaly classification and segmentation in multimodal settings. The framework combines five components: Iterative Point Grouping, Similarity Neighborhood Aggregation with Multi-Degrees, the Mutual Scoring Mechanism, Cross-modal Anomaly Enhancement, and Re-scoring with Constrained Neighborhood.
What carries the argument
The argument rests on the Mutual Scoring Mechanism (MSM), which lets samples within each modality score one another, combined with Cross-modal Anomaly Enhancement (CAE), which fuses 2D and 3D scores to recover anomalies that a single modality misses.
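To make the mutual-scoring idea concrete, here is a minimal Python sketch under stated assumptions. It is not the authors' implementation: the function names, the Euclidean distance, and the `keep_fraction` parameter are hypothetical choices made for illustration. Each patch is scored by its nearest-patch distance to every other unlabeled sample, and the smallest of those distances are averaged, so a patch that finds close matches in some other samples scores low while an isolated patch scores high.

```python
import math


def mutual_scores(samples, keep_fraction=0.3):
    """Score every patch of every sample against all other samples.

    samples: list of samples, each a list of patch feature vectors.
    keep_fraction: hypothetical parameter (an assumption, not from the
        paper); fraction of the closest reference samples averaged per patch.
    Returns one list of patch scores per sample.
    """
    if len(samples) < 2:
        raise ValueError("mutual scoring needs at least two samples")
    all_scores = []
    for i, sample in enumerate(samples):
        sample_scores = []
        for patch in sample:
            # Distance from this patch to its nearest patch in each other sample.
            nearest = sorted(
                min(math.dist(patch, other_patch) for other_patch in other)
                for j, other in enumerate(samples)
                if j != i
            )
            # Average only the smallest distances, so one odd reference
            # sample cannot inflate the score of a normal patch.
            keep = max(1, math.ceil(keep_fraction * len(nearest)))
            sample_scores.append(sum(nearest[:keep]) / keep)
        all_scores.append(sample_scores)
    return all_scores


def sample_score(patch_scores):
    """Sample-level score, taken here as the highest patch score (an assumption)."""
    return max(patch_scores)


if __name__ == "__main__":
    toy = [
        [[0.0, 0.0], [1.0, 1.0]],
        [[0.0, 0.0], [1.0, 1.0]],
        [[0.0, 0.0], [5.0, 5.0]],  # second patch has no close match elsewhere
    ]
    for scores in mutual_scores(toy):
        print(scores, sample_score(scores))
```

In the toy data only the third sample contains a patch without a close counterpart in the other samples, so that patch receives the largest score and the third sample ranks as most anomalous.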
If this is right
- Delivers a 23.7 percent AP gain on the MVTec 3D-AD dataset.
- Delivers a 19.3 percent boost on the Eyecandies dataset.
- Surpasses previous zero-shot benchmarks and outperforms most few-shot methods.
- Maintains robust performance when applied to the full dataset or smaller subsets.
- Supports flexible use with 2D only, 3D only, or combined modalities.
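The flexible modality support in the last item can be pictured as a fusion step that accepts whichever score lists are available. The Python sketch below is an assumption for illustration only: it min-max normalises each modality and takes the element-wise maximum, which is a generic late-fusion rule and not the paper's Cross-modal Anomaly Enhancement. The names `min_max_normalise` and `fuse_scores` are hypothetical.

```python
def min_max_normalise(scores):
    """Rescale scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_scores(scores_2d=None, scores_3d=None):
    """Fuse per-patch anomaly scores from whichever modalities are given.

    Generic late fusion (assumed, not the paper's CAE): normalise each
    modality, then keep the larger value per patch so that an anomaly
    visible in only one modality is not averaged away.
    """
    available = [s for s in (scores_2d, scores_3d) if s is not None]
    if not available:
        raise ValueError("at least one modality is required")
    if len({len(s) for s in available}) != 1:
        raise ValueError("score lists must have the same length")
    normalised = [min_max_normalise(s) for s in available]
    return [max(values) for values in zip(*normalised)]


if __name__ == "__main__":
    # Patch 1 is flagged only in 2D, patch 2 only in 3D.
    print(fuse_scores([0.0, 1.0, 0.0], [0.0, 0.0, 2.0]))
    # 3D-only use.
    print(fuse_scores(scores_3d=[1.0, 3.0]))
```

Taking the maximum rather than the mean is a deliberate choice in this sketch: a defect that only one sensor sees keeps its full normalised score after fusion.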
Where Pith is reading between the lines
- This approach could extend to other clustering-based unsupervised tasks where normal examples share common features.
- A test on datasets containing anomalies that mimic normal patterns would check the limits of the similarity assumption.
- Combining the mutual scoring with additional sensor types might improve robustness in real-world manufacturing lines.
Load-bearing premise
Normal image patches across industrial products typically find many other similar patches in both 2D appearance and 3D shapes, while anomalies remain diverse and isolated.
What would settle it
Finding an industrial dataset where anomalous patches show as many mutual similarities as normal patches would falsify the separation mechanism.
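A check of this kind could be run on any dataset that has ground-truth patch labels. The Python sketch below is a hypothetical analysis script, not something taken from the paper: it counts, for each patch, how many other patches fall within a chosen radius, then compares the mean count for normal and anomalous patches. The radius, the Euclidean distance, and the function names are assumptions, and the labels are used only for this analysis, never for scoring.

```python
import math


def neighbour_counts(patches, radius):
    """For each patch, count how many other patches lie within `radius`."""
    counts = []
    for i, p in enumerate(patches):
        count = sum(
            1
            for j, q in enumerate(patches)
            if j != i and math.dist(p, q) <= radius
        )
        counts.append(count)
    return counts


def mean_count_by_label(patches, labels, radius):
    """Mean neighbour count for normal (label 0) and anomalous (label 1) patches."""
    counts = neighbour_counts(patches, radius)
    means = {}
    for label in (0, 1):
        selected = [c for c, l in zip(counts, labels) if l == label]
        means[label] = sum(selected) / len(selected) if selected else float("nan")
    return means


if __name__ == "__main__":
    patches = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]]
    labels = [0, 0, 0, 1]  # ground truth, for analysis only
    print(mean_count_by_label(patches, labels, radius=0.5))
```

If the two means came out close on a real dataset, across a range of radii, that would be evidence against the separation the premise relies on.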
Original abstract
Zero-shot anomaly classification (AC) and segmentation (AS) methods aim to identify and outline defects without using any labeled samples. In this paper, we reveal a key property that is overlooked by existing methods: normal image patches across industrial products typically find many other similar patches, not only in 2D appearance but also in 3D shapes, while anomalies remain diverse and isolated. To explicitly leverage this discriminative property, we propose a Mutual Scoring framework (MuSc-V2) for zero-shot AC/AS, which flexibly supports single 2D/3D or multimodality. Specifically, our method begins by improving 3D representation through Iterative Point Grouping (IPG), which reduces false positives from discontinuous surfaces. Then we use Similarity Neighborhood Aggregation with Multi-Degrees (SNAMD) to fuse 2D/3D neighborhood cues into more discriminative multi-scale patch features for mutual scoring. The core comprises a Mutual Scoring Mechanism (MSM) that lets samples within each modality to assign score to each other, and Cross-modal Anomaly Enhancement (CAE) that fuses 2D and 3D scores to recover modality-specific missing anomalies. Finally, Re-scoring with Constrained Neighborhood (RsCon) suppresses false classification based on similarity to more representative samples. Our framework flexibly works on both the full dataset and smaller subsets with consistently robust performance, ensuring seamless adaptability across diverse product lines. In aid of the novel framework, MuSc-V2 achieves significant performance improvements: a +23.7% AP gain on the MVTec 3D-AD dataset and a +19.3% boost on the Eyecandies dataset, surpassing previous zero-shot benchmarks and even outperforming most few-shot methods. The code will be available at The code will be available at https://github.com/HUST-SLOW/MuSc-V2.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents MuSc-V2, a zero-shot framework for multimodal industrial anomaly classification and segmentation. It identifies a key property that normal patches have many similar counterparts in 2D appearance and 3D shapes, while anomalies are diverse and isolated. The method incorporates Iterative Point Grouping (IPG) to improve 3D representations, Similarity Neighborhood Aggregation with Multi-Degrees (SNAMD) for multi-scale features, Mutual Scoring Mechanism (MSM), Cross-modal Anomaly Enhancement (CAE), and Re-scoring with Constrained Neighborhood (RsCon). Experiments on MVTec 3D-AD and Eyecandies datasets report substantial performance gains of +23.7% AP and +19.3% respectively over prior zero-shot methods.
Significance. If the central claims hold, this work would represent a notable advance in zero-shot anomaly detection by explicitly leveraging inter-sample similarities in an unlabeled pool for both 2D and 3D modalities. The reported outperformance over most few-shot methods is particularly striking. The planned code release at https://github.com/HUST-SLOW/MuSc-V2 enhances reproducibility.
major comments (2)
- [Introduction] Introduction (key property paragraph): The foundational claim that 'normal image patches across industrial products typically find many other similar patches, not only in 2D appearance but also in 3D shapes, while anomalies remain diverse and isolated' is presented without any quantitative support such as neighbor-count histograms, average in-neighborhood sizes, or intra-class vs. inter-class similarity distributions on MVTec 3D-AD or Eyecandies. This assumption is load-bearing for the MSM and the reported +23.7% AP gain, as the mutual scoring separation depends on a reliable gap in neighborhood counts.
- [Experiments] Experiments section (main results table): The performance tables show overall AP improvements, but no ablation isolates the contribution of MSM + CAE + RsCon from the IPG and SNAMD components alone. Without such controls it remains possible that the gains derive primarily from the 3D representation improvements rather than the mutual-scoring logic itself.
minor comments (2)
- [Abstract] Abstract: The sentence 'The code will be available at The code will be available at https://github.com/HUST-SLOW/MuSc-V2' contains a duplicated phrase that should be corrected.
- [Method] Method: The aggregation steps in SNAMD would benefit from explicit pseudocode or a small diagram showing how multi-degree neighborhoods are fused across modalities.
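As an illustration of the kind of pseudocode the Method comment asks for, the Python sketch below shows one generic way to aggregate patch features over similarity-defined neighborhoods of several sizes. It is an assumption about the general idea only and does not reproduce the paper's SNAMD: the function name, the Euclidean nearest-neighbor rule, the plain averaging, and the default `degrees` are all hypothetical.

```python
import math


def aggregate_multi_degree(features, degrees=(1, 3, 5)):
    """Average each patch with its nearest patches, once per neighborhood size.

    features: list of patch feature vectors from one sample.
    degrees: hypothetical neighborhood sizes k; each patch is averaged
        with the k closest vectors in the sample (itself included).
    Returns {k: list of aggregated feature vectors}.
    """
    n = len(features)
    result = {}
    for k in degrees:
        size = min(k, n)
        aggregated = []
        for f in features:
            # A patch is at distance 0 from itself, so it (or an identical
            # vector) is always among the nearest.
            nearest = sorted(features, key=lambda g: math.dist(f, g))[:size]
            aggregated.append(
                [sum(g[d] for g in nearest) / size for d in range(len(f))]
            )
        result[k] = aggregated
    return result


if __name__ == "__main__":
    feats = [[0.0], [1.0], [10.0]]
    for k, agg in aggregate_multi_degree(feats, degrees=(1, 2)).items():
        print(k, agg)
```

With k = 1 the features are unchanged; larger k smooths each patch toward its most similar neighbors, giving one feature set per degree that a downstream scoring step could consume separately or combine.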
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and the recommendation for major revision. We address each major comment point by point below, proposing targeted revisions to strengthen the manuscript while maintaining its core contributions.
Point-by-point responses
- Referee: [Introduction] The foundational claim that 'normal image patches across industrial products typically find many other similar patches, not only in 2D appearance but also in 3D shapes, while anomalies remain diverse and isolated' is presented without any quantitative support such as neighbor-count histograms, average in-neighborhood sizes, or intra-class vs. inter-class similarity distributions on MVTec 3D-AD or Eyecandies. This assumption is load-bearing for the MSM and the reported +23.7% AP gain, as the mutual scoring separation depends on a reliable gap in neighborhood counts.
  Authors: We acknowledge that the key property is introduced as an empirical observation without explicit quantitative backing in the current introduction. This property emerged from our analysis of patch distributions in the target datasets and is validated indirectly through the method's performance. To directly address the concern and reinforce the foundation for MSM, we will add quantitative analyses (neighbor-count histograms, average in-neighborhood sizes, and intra- vs. inter-class similarity distributions) on MVTec 3D-AD and Eyecandies in the revised introduction and/or a new supplementary section.
  Revision: yes
- Referee: [Experiments] The performance tables show overall AP improvements, but no ablation isolates the contribution of MSM + CAE + RsCon from the IPG and SNAMD components alone. Without such controls it remains possible that the gains derive primarily from the 3D representation improvements rather than the mutual-scoring logic itself.
  Authors: We agree that a more granular ablation would help isolate the impact of the mutual scoring logic (MSM, CAE, RsCon) from the representation enhancements (IPG, SNAMD). The existing experiments include module-level ablations and overall framework results, but we recognize the value of a dedicated control experiment. In the revision, we will add an ablation study that evaluates the mutual scoring components on top of the base IPG+SNAMD features to clarify their specific contributions to the reported gains.
  Revision: yes
Circularity Check
No significant circularity; heuristic framework rests on explicit empirical assumption
full rationale
The paper states an observed property of normal patches sharing 2D/3D neighbors while anomalies are isolated, then builds MSM + SNAMD + CAE + RsCon to exploit it for scoring. No equations or steps reduce a claimed prediction back to a fitted parameter or self-citation by construction; the reported AP gains are presented as experimental outcomes on MVTec 3D-AD and Eyecandies rather than a closed derivation. The central premise is falsifiable via neighbor-count statistics on the target datasets and does not import uniqueness theorems or rename prior results.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Normal patches find many similar counterparts in 2D and 3D while anomalies are diverse and isolated.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · absolute_floor_iff_bare_distinguishability · tag: unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "normal image patches across industrial products typically find many other similar patches, not only in 2D appearance but also in 3D shapes, while anomalies remain diverse and isolated"
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tag: unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "Mutual Scoring Mechanism (MSM) that lets samples within each modality to assign score to each other"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.