Deep spatial autoencoders for visuomotor learning
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.RO (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
representative citing papers
MSACT improves localization stability and task success rates in limited-data bimanual manipulation by extracting stable 2D attention points and aligning predicted attention sequences across frames without keypoint labels.
citing papers explorer
- GAP: Geometric Anchor Pre-training for Data-Efficient Visuomotor Learning of Manipulation Tasks
GAP pre-trains the spatial adapter on a lightweight simulated proxy task with free object masks to generate repeatable geometric keypoints, yielding higher success rates than baselines in low-data robotic manipulation on RoboMimic and ManiSkill.
- MSACT: Multistage Spatial Alignment for Stable Low-Latency Fine Manipulation
MSACT improves localization stability and task success rates in limited-data bimanual manipulation by extracting stable 2D attention points and aligning predicted attention sequences across frames without keypoint labels.