Multi-modal agent tuning: Building a VLM-driven agent for efficient tool usage. arXiv preprint arXiv:2412.15606

Zhi Gao, Bofei Zhang, Pengxiang Li, Xiaojian Ma, Tao Yuan, Yue Fan, Yuwei Wu, Yunde Jia, Song-Chun Zhu, Qing Li · 2025 · arXiv 2412.15606

2 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

support 1

cs.AI · 2026-04-04 · unverdicted · novelty 6.0

Multi-agent VLM frameworks outperform single VLMs for automated coding of on-screen collaborative learning behaviors using the ICAP framework.

cs.AI · 2025-10-03

Single-agent vs. Multi-agents for Automated Video Analysis of On-Screen Collaborative Learning Behaviors · cs.AI · 2026-04-04 · unverdicted · none · ref 14
Multi-agent VLM frameworks outperform single VLMs for automated coding of on-screen collaborative learning behaviors using the ICAP framework.
Beyond the Final Answer: Evaluating the Reasoning Trajectories of Tool-Augmented Agents · cs.AI · 2025-10-03 · unreviewed · ref 4