
arxiv: 2512.13671 · v2 · submitted 2025-12-15 · 💻 cs.CV


AgentIAD: Agentic Industrial Anomaly Detection via Adaptive Memory Augmentation

keywords: agentiad, memory, agentic, anomaly, industrial, analysis, detection, during

Industrial anomaly detection (IAD) is challenging due to the subtle and highly localized nature of many defects, which single-pass vision-language models (VLMs) often fail to capture. Moreover, existing approaches lack mechanisms to actively acquire complementary evidence during inference. We propose AgentIAD, an agentic vision-language framework that enables iterative industrial inspection through a unified action space. The agent dynamically accesses two forms of memory during inspection: visual memory via the Perceptive Zoomer (PZ) for fine-grained local analysis, and retrieved memory via the Web Searcher (WS) and Comparative Retriever (CR) for external knowledge acquisition and cross-instance verification. This design allows the model to progressively gather evidence through multi-round perception-action reasoning. To effectively learn such policies under sparse supervision, AgentIAD adopts a two-stage training strategy: tool-aware supervised fine-tuning first initializes structured reasoning and memory-access behaviors, followed by agentic reinforcement learning to refine long-horizon decision policies. Extensive experiments show that, under the same backbone, AgentIAD improves classification accuracy by 5.92% over the previous state-of-the-art method on the MMAD benchmark while providing more reliable and interpretable anomaly analysis.
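The multi-round perception-action loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the action names, the `run_inspection` function, the `Step` and `Episode` types, the round budget, and the toy tools are all hypothetical stand-ins chosen for illustration. The abstract only states that the agent chooses among the Perceptive Zoomer, Web Searcher, and Comparative Retriever through a unified action space.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical action names; the abstract names the tools (PZ, WS, CR)
# but does not specify how actions are encoded.
ZOOM, SEARCH, COMPARE, ANSWER = "zoom", "web_search", "compare", "answer"


@dataclass
class Step:
    action: str        # which tool was invoked
    argument: str      # what it was asked to look at or look up
    observation: str   # evidence returned by the tool


@dataclass
class Episode:
    steps: List[Step] = field(default_factory=list)
    verdict: str = "undecided"


# A policy maps the evidence gathered so far to the next (action, argument).
Policy = Callable[[List[Step]], Tuple[str, str]]
Tool = Callable[[str], str]


def run_inspection(policy: Policy, tools: Dict[str, Tool], max_rounds: int = 4) -> Episode:
    """Gather evidence round by round until the policy answers or the budget runs out."""
    episode = Episode()
    for _ in range(max_rounds):
        action, argument = policy(episode.steps)
        if action == ANSWER:
            episode.verdict = argument
            return episode
        observation = tools[action](argument)
        episode.steps.append(Step(action, argument, observation))
    return episode  # budget exhausted; verdict stays "undecided"


# Toy stand-ins: a scripted policy and string-returning tools (assumptions only).
def scripted_policy(history: List[Step]) -> Tuple[str, str]:
    script = [(ZOOM, "top-left region"), (COMPARE, "normal reference"), (ANSWER, "anomalous")]
    return script[len(history)]


toy_tools: Dict[str, Tool] = {
    ZOOM: lambda region: f"crop of {region}",
    SEARCH: lambda query: f"notes about {query}",
    COMPARE: lambda ref: f"difference against {ref}",
}

episode = run_inspection(scripted_policy, toy_tools)
```

In a real system the policy would be the trained vision-language model and the tools would operate on images and retrieved documents; the sketch only shows how a single action space lets one loop interleave zooming, retrieval, and a final answer.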
