In: Proceedings of the acl workshop on intrinsic and extrinsic evaluation measures for machine translation and/or summa- rization

Banerjee, S · 2005

7 Pith papers cite this work. Polarity classification is still indexing.

7 Pith papers citing it

browse 7 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Lost in Volume: The CT-SpatialVQA Benchmark for Evaluating Semantic-Spatial Understanding of 3D Medical Vision-Language Models

cs.CV · 2026-05-09 · unverdicted · novelty 7.0

CT-SpatialVQA benchmark shows 3D medical VLMs achieve only 34% average accuracy on semantic-spatial reasoning tasks in CT volumes, often below random chance.

GTASA: Ground Truth Annotations for Spatiotemporal Analysis, Evaluation and Training of Video Models

cs.CV · 2026-04-12 · unverdicted · novelty 7.0

GTASA supplies annotated multi-actor videos with exact 3D spatial and temporal ground truth that outperforms neural video generators in physical and semantic validity while enabling new probes of video encoders.

On the Factual Consistency of Text-based Explainable Recommendation Models

cs.IR · 2025-12-30 · unverdicted · novelty 7.0

A prompting pipeline and statement-level metrics show that six state-of-the-art text-based explainable recommendation models achieve high semantic similarity but very low factual consistency on Amazon review data.

Efficient Table QA via TableGrid Navigation and Progressive Inference Prompting

cs.IR · 2026-05-18 · unverdicted · novelty 6.0

Introduces TableGrid Navigation (TGN) and Progressive Inference Prompting (PIP) as training-free structured prompting frameworks that improve LLM performance on table question answering over baselines on TableBench and achieve SOTA on FeTaQa.

Hi-GaTA: Hierarchical Gated Temporal Aggregation Adapter for Surgical Video Report Generation

cs.CV · 2026-05-11 · unverdicted · novelty 6.0 · 2 refs

Hi-GaTA is a hierarchical gated temporal aggregation adapter that uses short-to-long temporal pyramids and gated fusion to enable surgical video report generation, backed by a new 214-video benchmark and a surgical ViViT pretrained on 40,000 minutes of video.

Fake or Real, Can Robots Tell? Evaluating VLM Robustness to Domain Shift in Single-View Robotic Scene Understanding

cs.RO · 2025-06-24 · unverdicted · novelty 6.0

VLMs caption real objects effectively but degrade on 3D-printed fakes in robotic scenes, while some standard metrics fail to detect the factual errors from this domain shift.

STAND: Semantic Anchoring Constraint with Dual-Granularity Disambiguation for Remote Sensing Image Change Captioning

cs.CV · 2026-04-25 · unverdicted · novelty 4.0

STAND adds semantic anchoring and dual-granularity disambiguation modules to address viewpoint, scale, and knowledge ambiguities in remote sensing change captioning.

citing papers explorer

Showing 7 of 7 citing papers.

Lost in Volume: The CT-SpatialVQA Benchmark for Evaluating Semantic-Spatial Understanding of 3D Medical Vision-Language Models cs.CV · 2026-05-09 · unverdicted · none · ref 3
CT-SpatialVQA benchmark shows 3D medical VLMs achieve only 34% average accuracy on semantic-spatial reasoning tasks in CT volumes, often below random chance.
GTASA: Ground Truth Annotations for Spatiotemporal Analysis, Evaluation and Training of Video Models cs.CV · 2026-04-12 · unverdicted · none · ref 8
GTASA supplies annotated multi-actor videos with exact 3D spatial and temporal ground truth that outperforms neural video generators in physical and semantic validity while enabling new probes of video encoders.
On the Factual Consistency of Text-based Explainable Recommendation Models cs.IR · 2025-12-30 · unverdicted · none · ref 2
A prompting pipeline and statement-level metrics show that six state-of-the-art text-based explainable recommendation models achieve high semantic similarity but very low factual consistency on Amazon review data.
Efficient Table QA via TableGrid Navigation and Progressive Inference Prompting cs.IR · 2026-05-18 · unverdicted · none · ref 1
Introduces TableGrid Navigation (TGN) and Progressive Inference Prompting (PIP) as training-free structured prompting frameworks that improve LLM performance on table question answering over baselines on TableBench and achieve SOTA on FeTaQa.
Hi-GaTA: Hierarchical Gated Temporal Aggregation Adapter for Surgical Video Report Generation cs.CV · 2026-05-11 · unverdicted · none · ref 3 · 2 links
Hi-GaTA is a hierarchical gated temporal aggregation adapter that uses short-to-long temporal pyramids and gated fusion to enable surgical video report generation, backed by a new 214-video benchmark and a surgical ViViT pretrained on 40,000 minutes of video.
Fake or Real, Can Robots Tell? Evaluating VLM Robustness to Domain Shift in Single-View Robotic Scene Understanding cs.RO · 2025-06-24 · unverdicted · none · ref 5
VLMs caption real objects effectively but degrade on 3D-printed fakes in robotic scenes, while some standard metrics fail to detect the factual errors from this domain shift.
STAND: Semantic Anchoring Constraint with Dual-Granularity Disambiguation for Remote Sensing Image Change Captioning cs.CV · 2026-04-25 · unverdicted · none · ref 2
STAND adds semantic anchoring and dual-granularity disambiguation modules to address viewpoint, scale, and knowledge ambiguities in remote sensing change captioning.

In: Proceedings of the acl workshop on intrinsic and extrinsic evaluation measures for machine translation and/or summa- rization

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer