Title resolution pending

2021

5 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background: 1

citation-polarity summary

background: 1

representative citing papers

Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding

cs.CV · 2026-01-15 · unverdicted · novelty 8.0

Molmo2 delivers state-of-the-art open-weight video VLMs with new grounding datasets and training methods that outperform prior open models and match or exceed some proprietary ones on pointing and tracking tasks.

Fast 4D Mesh Generation by Spatio-Temporal Attention Chains

cs.CV · 2026-05-19 · unverdicted · novelty 6.0

A training-free Spatio-Temporal Attention Chain framework accelerates 4D mesh generation 13x, improves quality, scales to 16x longer videos, and supports downstream tracking and camera estimation.

Back into Plato's Cave: Examining Cross-modal Representational Convergence at Scale

cs.CV · 2026-04-20 · unverdicted · novelty 6.0

Evidence for cross-modal representational convergence weakens substantially at scale and in realistic many-to-many settings, indicating models learn rich but distinct representations.

Probing Intrinsic Medical Task Relationships: A Contrastive Learning Perspective

cs.CV · 2026-04-07 · unverdicted · novelty 6.0

TaCo contrastively embeds semantic, generative, and transformation tasks from medical imaging into a joint space to reveal which tasks cluster, blend, or remain distinct.

FutureSightDrive: Thinking Visually with Spatio-Temporal CoT for Autonomous Driving

cs.CV · 2025-05-23 · conditional · novelty 6.0

FSDrive uses a generated future scene frame as visual spatio-temporal CoT to improve VLA models for safer autonomous driving trajectory prediction.

citing papers explorer

Showing 5 of 5 citing papers.

Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding

cs.CV · 2026-01-15 · unverdicted · none · ref 119

Molmo2 delivers state-of-the-art open-weight video VLMs with new grounding datasets and training methods that outperform prior open models and match or exceed some proprietary ones on pointing and tracking tasks.

Fast 4D Mesh Generation by Spatio-Temporal Attention Chains

cs.CV · 2026-05-19 · unverdicted · none · ref 55

A training-free Spatio-Temporal Attention Chain framework accelerates 4D mesh generation 13x, improves quality, scales to 16x longer videos, and supports downstream tracking and camera estimation.

Back into Plato's Cave: Examining Cross-modal Representational Convergence at Scale

cs.CV · 2026-04-20 · unverdicted · none · ref 74

Evidence for cross-modal representational convergence weakens substantially at scale and in realistic many-to-many settings, indicating models learn rich but distinct representations.

Probing Intrinsic Medical Task Relationships: A Contrastive Learning Perspective

cs.CV · 2026-04-07 · unverdicted · none · ref 59

TaCo contrastively embeds semantic, generative, and transformation tasks from medical imaging into a joint space to reveal which tasks cluster, blend, or remain distinct.

FutureSightDrive: Thinking Visually with Spatio-Temporal CoT for Autonomous Driving

cs.CV · 2025-05-23 · conditional · none · ref 51

FSDrive uses a generated future scene frame as visual spatio-temporal CoT to improve VLA models for safer autonomous driving trajectory prediction.
