MegaSaM: Accurate, fast, and robust structure and motion from casual dynamic videos

Zhengqi Li, Richard Tucker, Forrester Cole, Qianqian Wang, Linyi Jin, Vickie Ye, Angjoo Kanazawa, Aleksander Holynski, Noah Snavely · 2024 · arXiv 2412.04463

6 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary: background 1

citation-polarity summary: background 1

representative citing papers

Large Video Planner Enables Generalizable Robot Control

cs.RO · 2025-12-17 · conditional · novelty 7.0

A video foundation model trained on human demonstrations generates zero-shot plans that convert to executable robot actions on novel scenes and tasks.

Beyond the Frame: Generating 360 Panoramic Videos from Perspective Videos

cs.CV · 2025-04-10 · unverdicted · novelty 7.0

A generative model produces realistic and coherent 360 panoramic videos from in-the-wild perspective videos via curated online data and geometry-motion aware operations.

Learning Efficient 4D Gaussian Representations from Monocular Videos with Flow Splatting

cs.CV · 2026-06-29 · unverdicted · novelty 6.0

Flow Splatting extends 4D Gaussian volumes with time-varying means and covariances, approximates a velocity field, and splats it to render optical flow for supervising dynamic reconstruction from monocular video.

Quantitative Video World Model Evaluation for Geometric-Consistency

cs.CV · 2026-05-14 · unverdicted · novelty 6.0

PDI-Bench computes 3D projective residuals from segmented and tracked points to quantify geometric inconsistency in AI-generated videos.

Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence

cs.CV · 2025-05-29 · unverdicted · novelty 6.0 · 2 refs

Spatial-MLLM adds a 3D spatial encoder initialized from a visual geometry model and space-aware frame sampling to MLLMs to improve spatial understanding and reasoning from purely 2D visual inputs.

MoGe-2: Accurate Monocular Geometry with Metric Scale and Sharp Details

cs.CV · 2025-07-03 · unverdicted · novelty 5.0

MoGe-2 recovers metric-scale 3D point maps with fine details from single images via data refinement and extension of affine-invariant predictions.

citing papers explorer

Showing 5 of 5 citing papers after filters.

Beyond the Frame: Generating 360 Panoramic Videos from Perspective Videos
cs.CV · 2025-04-10 · unverdicted · none · ref 28
A generative model produces realistic and coherent 360 panoramic videos from in-the-wild perspective videos via curated online data and geometry-motion aware operations.

Learning Efficient 4D Gaussian Representations from Monocular Videos with Flow Splatting
cs.CV · 2026-06-29 · unverdicted · none · ref 26
Flow Splatting extends 4D Gaussian volumes with time-varying means and covariances, approximates a velocity field, and splats it to render optical flow for supervising dynamic reconstruction from monocular video.

Quantitative Video World Model Evaluation for Geometric-Consistency
cs.CV · 2026-05-14 · unverdicted · none · ref 18
PDI-Bench computes 3D projective residuals from segmented and tracked points to quantify geometric inconsistency in AI-generated videos.

Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence
cs.CV · 2025-05-29 · unverdicted · none · ref 33 · 2 links
Spatial-MLLM adds a 3D spatial encoder initialized from a visual geometry model and space-aware frame sampling to MLLMs to improve spatial understanding and reasoning from purely 2D visual inputs.

MoGe-2: Accurate Monocular Geometry with Metric Scale and Sharp Details
cs.CV · 2025-07-03 · unverdicted · none · ref 36
MoGe-2 recovers metric-scale 3D point maps with fine details from single images via data refinement and extension of affine-invariant predictions.
