IEEE Transactions on Pattern Analysis and Machine Intelligence 46(8), 5625–5644 (2024)

Zhang, J · 2024

5 Pith papers cite this work. Polarity classification is still being indexed.

citation-role summary

background 1

citation-polarity summary

support 1

representative citing papers

DO-Bench: An Attributable Benchmark for Diagnosing Object Hallucination in Vision-Language Models

cs.CV · 2026-04-18 · unverdicted · novelty 7.0

DO-Bench is a controlled benchmark that attributes VLM object hallucination errors to textual prior pressure, perceptual limits, or their interaction via two diagnostic dimensions and metrics.

V-tableR1: Process-Supervised Multimodal Table Reasoning with Critic-Guided Policy Optimization

cs.AI · 2026-04-22 · unverdicted · novelty 6.0

V-tableR1 uses a critic VLM for dense step-level feedback and a new PGPO algorithm to shift multimodal table reasoning from pattern matching to verifiable logical steps, achieving SOTA accuracy with a 4B open-source model.

Single-agent vs. Multi-agents for Automated Video Analysis of On-Screen Collaborative Learning Behaviors

cs.AI · 2026-04-04 · unverdicted · novelty 6.0

Multi-agent VLM frameworks outperform single VLMs for automated coding of on-screen collaborative learning behaviors using the ICAP framework.

Learning from the Unseen: Generative Data Augmentation for Geometric-Semantic Accident Anticipation

cs.CV · 2026-04-29 · unverdicted · novelty 5.0

A generative video synthesis pipeline paired with a semantic graph neural network yields gains in accident anticipation accuracy and lead time on driving datasets, accompanied by a new benchmark release.

DBMF: A Dual-Branch Multimodal Framework for Out-of-Distribution Detection

cs.CV · 2026-04-09 · unverdicted · novelty 5.0

DBMF integrates scores from text-image and vision branches to improve out-of-distribution detection on endoscopic datasets by up to 24.84% over prior methods.

citing papers explorer

Showing 5 of 5 citing papers.

DO-Bench: An Attributable Benchmark for Diagnosing Object Hallucination in Vision-Language Models

cs.CV · 2026-04-18 · unverdicted · none · ref 41

V-tableR1: Process-Supervised Multimodal Table Reasoning with Critic-Guided Policy Optimization

cs.AI · 2026-04-22 · unverdicted · none · ref 39

Single-agent vs. Multi-agents for Automated Video Analysis of On-Screen Collaborative Learning Behaviors

cs.AI · 2026-04-04 · unverdicted · none · ref 34

Learning from the Unseen: Generative Data Augmentation for Geometric-Semantic Accident Anticipation

cs.CV · 2026-04-29 · unverdicted · none · ref 28

DBMF: A Dual-Branch Multimodal Framework for Out-of-Distribution Detection

cs.CV · 2026-04-09 · unverdicted · none · ref 28
