AMSbench: A Comprehensive Benchmark for Evaluating MLLM Capabilities in AMS Circuits, October 2025
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.AI (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
representative citing papers
MLLMs exhibit a consistent recognition-reasoning inversion on discrete visual symbols across domains, underperforming on elementary perception while appearing competent on higher-level reasoning via linguistic compensation.
citing papers explorer
- DiagramNet: An End-to-End Recognition Framework and Dataset for Non-Standard System-Level Diagrams
  DiagramNet supplies a new multimodal dataset and a progressive training pipeline with a decoupled multi-agent workflow, allowing a 3B model to outperform GPT-5, Claude-Sonnet-4, and Gemini-2.5-Pro by over 2x on system-level diagram tasks while generalizing to other benchmarks.
- Cognitive Mismatch in Multimodal Large Language Models for Discrete Symbol Understanding
  MLLMs exhibit a consistent recognition-reasoning inversion on discrete visual symbols across domains, underperforming on elementary perception while appearing competent on higher-level reasoning via linguistic compensation.