RAFT: Recurrent All-Pairs Field Transforms for Optical Flow

Zachary Teed, Jia Deng · 2020 · arXiv 2003.12039

9 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 1 · method: 1

citation-polarity summary

background: 1 · use method: 1

representative citing papers

CaRLi-V: Camera-RADAR-LiDAR Point-Wise 3D Velocity Estimation

cs.RO · 2025-11-03 · unverdicted · novelty 7.0

CaRLi-V fuses RADAR velocity cube, camera optical flow, and LiDAR ranges in a closed-form solution to produce dense point-wise 3D velocity estimates that outperform scene flow methods on a custom dataset.

Zero-Parameter Geometric Gating for Temporally Stable Low-Altitude UAV Video Semantic Segmentation

cs.CV · 2026-06-08 · unverdicted · novelty 6.0

A RANSAC-based geometric gate routes regions to homography or optical flow warping before SSP fusion, improving mIoU by 4.24-4.91% on synthetic UAVid with only 211K added parameters to frozen backbones.

Turning Video Models into Generalist Robot Policies

cs.RO · 2026-05-27 · unverdicted · novelty 6.0

Decouples action-free video world models from embodiment-specific IDMs using Jacobian-based translation to achieve zero-shot cross-embodiment robot policies.

EgoDyn-Bench: Evaluating Ego-Motion Understanding in Vision-Centric Foundation Models for Autonomous Driving

cs.CV · 2026-04-22 · unverdicted · novelty 6.0

EgoDyn-Bench reveals a perception bottleneck in vision-centric foundation models: ego-motion logic derives from language while visual input adds negligible signal, with explicit trajectories restoring consistency.

VisPhyWorld: Probing Physical Reasoning via Code-Driven Video Reconstruction

cs.CV · 2026-02-09 · unverdicted · novelty 6.0

VisPhyWorld evaluates MLLMs' physical reasoning via executable code generation for video reconstruction, with VisPhyBench showing strong semantics but weak parameter inference and dynamics simulation.

Learning Zero-Shot Subject-Driven Video Generation Using 1% Compute

cs.CV · 2025-04-23 · unverdicted · novelty 6.0

A zero-shot subject-driven video generation framework that decomposes the task into identity injection from 200K subject-image pairs and motion preservation from 4K arbitrary videos, trained in 288 A100 GPU hours on CogVideoX-5B to match prior performance at 1% compute.

AV1 Motion Vector Fidelity and Application for Efficient Optical Flow

eess.IV · 2025-10-20 · unverdicted · novelty 5.0

AV1 motion vectors serve as a high-fidelity warm-start for the RAFT optical flow network, delivering a four-fold speedup in convergence with only minor end-point error increase compared to standard initialization.

A Survey on Vision-Language-Action Models: An Action Tokenization Perspective

cs.RO · 2025-07-02 · unverdicted · novelty 5.0

The survey frames VLA models as pipelines that generate progressively grounded action tokens and classifies those tokens into eight types to guide future development.

WorldOdysseyBench: An Open-World Benchmark for Long-Horizon Stability of Interactive World Models

cs.CV · 2026-06-30

citing papers explorer

Showing 8 of 8 citing papers after filters.

CaRLi-V: Camera-RADAR-LiDAR Point-Wise 3D Velocity Estimation · cs.RO · 2025-11-03 · unverdicted · none · ref 20
Zero-Parameter Geometric Gating for Temporally Stable Low-Altitude UAV Video Semantic Segmentation · cs.CV · 2026-06-08 · unverdicted · none · ref 12
Turning Video Models into Generalist Robot Policies · cs.RO · 2026-05-27 · unverdicted · none · ref 42
EgoDyn-Bench: Evaluating Ego-Motion Understanding in Vision-Centric Foundation Models for Autonomous Driving · cs.CV · 2026-04-22 · unverdicted · none · ref 34
VisPhyWorld: Probing Physical Reasoning via Code-Driven Video Reconstruction · cs.CV · 2026-02-09 · unverdicted · none · ref 39
Learning Zero-Shot Subject-Driven Video Generation Using 1% Compute · cs.CV · 2025-04-23 · unverdicted · none · ref 45
AV1 Motion Vector Fidelity and Application for Efficient Optical Flow · eess.IV · 2025-10-20 · unverdicted · none · ref 7
A Survey on Vision-Language-Action Models: An Action Tokenization Perspective · cs.RO · 2025-07-02 · unverdicted · none · ref 216
