Exploring Plain ViT Reconstruction for Multi-class Unsupervised Anomaly Detection
3 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (3)
years: 2026 (3)
verdicts: UNVERDICTED (3)
roles: baseline (1)
polarities: baseline (1)
representative citing papers
DPDiff-AD conditions a diffusion model on local prototypes (via nearest aggregation) and global prototypes (via optimal transport) to model normality scalably in multi-class anomaly detection, reporting AUROC gains on 160-category data.
GroundingAnomaly uses a Spatial Conditioning Module and Gated Self-Attention in a frozen diffusion U-Net to synthesize spatially accurate few-shot anomalies, reaching SOTA on MVTec AD and VisA for detection, segmentation, and instance detection.
citing papers explorer
- IPAD-CLIP: Teaching CLIP to Detect Image Local Perceptual Artifacts
  IPAD-CLIP adapts CLIP via artifact-aware text embeddings to detect multi-class local perceptual artifacts, backed by a new dataset of 3520 images with pixel-level masks.
- Dual Prototype-Conditioned Diffusion Model for Scalable Multi-Class Unsupervised Anomaly Detection in Large Category Spaces
  DPDiff-AD conditions a diffusion model on local prototypes (via nearest aggregation) and global prototypes (via optimal transport) to model normality scalably in multi-class anomaly detection, reporting AUROC gains on 160-category data.
- GroundingAnomaly: Spatially-Grounded Diffusion for Few-Shot Anomaly Synthesis
  GroundingAnomaly uses a Spatial Conditioning Module and Gated Self-Attention in a frozen diffusion U-Net to synthesize spatially accurate few-shot anomalies, reaching SOTA on MVTec AD and VisA for detection, segmentation, and instance detection.