In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 20406–20417

Peter Kulits, Haiwen Feng, Weiyang Liu, Victoria Abrevaya, Michael J Black · 2024 · arXiv 2404.15228

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

$\Delta$ynamics: Language-Based Representation for Inferring Rigid-Body Dynamics From Videos

cs.CV · 2026-05-20 · unverdicted · novelty 6.0

A vision-language framework generates text-based rigid-body scene configurations from videos using motion reasoning and optical flow, reporting 0.30 IoU on CLEVRER (7x over baselines) and transfer to 235 real videos.

Symbolic Grounding Reveals Representational Bottlenecks in Abstract Visual Reasoning

cs.AI · 2026-04-23 · unverdicted · novelty 6.0

LLMs given symbolic image descriptions reach mid-90s accuracy on abstract visual reasoning tasks where end-to-end VLMs stay near chance, showing representation as the primary bottleneck.

citing papers explorer

Showing 2 of 2 citing papers.

$\Delta$ynamics: Language-Based Representation for Inferring Rigid-Body Dynamics From Videos
cs.CV · 2026-05-20 · unverdicted · none · ref 31
Symbolic Grounding Reveals Representational Bottlenecks in Abstract Visual Reasoning
cs.AI · 2026-04-23 · unverdicted · none · ref 2
