GenAU augments a vision-language model with segmentation tokens to unify image-level anomaly detection, pixel-level segmentation, multi-type classification, and language-based defect analysis in a single instruction-following architecture.
Robust visual representation learning with multi-modal prior knowledge for image classification under distribution shift
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
GenAU: Language-Grounded Industrial Anomaly Understanding with Vision-Language Models