WinCLIP: Zero-/Few-Shot Anomaly Classification and Segmentation
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CV (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- Text-Guided Multimodal Unified Industrial Anomaly Detection
  A text-semantics-guided multimodal framework with geometry-aware mapping and object-conditioned text adaptation achieves state-of-the-art unsupervised anomaly detection and localization on RGB-3D industrial datasets while enabling a single model for multiple classes.
- Action Hints: Semantic Typicality and Context Uniqueness for Generalizable Skeleton-based Video Anomaly Detection
  A skeleton-based zero-shot VAD method distills LLM knowledge for action typicality during training and performs test-time context uniqueness analysis to derive scene-adaptive normality boundaries, claiming SOTA results on four datasets with over 100 unseen scenes.