General Covariant Action Modeling: Constructing Generalized Manifolds via Spatio-Temporal Decoupling

2026 · cs.CV · arXiv 2606.00110

1 Pith paper cites this work. Polarity classification is still indexing.


abstract

Achieving robust generalization from limited data is a central challenge in embodied intelligence. Prevailing methods fail by regressing absolute coordinates, which violates the principle of general covariance. Fundamentally, this conflates the intrinsic task geometry with rigid execution patterns, binding policies to specific motion styles and fixed speeds. To resolve this, we propose the Generalized Action Manifold (GAM) framework that enforces general covariance through structural disentanglement. Specifically, GAM realizes the manifold by enforcing invariance across two orthogonal dimensions: (1) Temporal Invariance, utilizing an Arc-Length Parameterizer to orthogonalize the spatial path geometry from temporal dynamics, ensuring robustness to velocity variations; (2) Geometric Invariance, where a Schema-Affine-Factorization mechanism maps trajectories to canonical "world lines" in a pose-normalized coordinate frame. This distinguishes invariant geometric schemas from affine modulations, ensuring spatial generalizability. By integrating GAM within a structured Vision-Language-Action (VLA) architecture, we enable sparse demonstrations to densely populate a continuous, valid action manifold. Empirical results demonstrate that GAM enables superior transfer and robustness capabilities, outperforming geometry-agnostic baselines.
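
To illustrate the temporal-invariance idea in the abstract, the sketch below resamples a trajectory at equal arc-length intervals, so that two recordings of the same path at different speeds map to the same sequence of points. It is a minimal sketch, not the paper's Arc-Length Parameterizer: the function name `resample_by_arc_length`, the piecewise-linear interpolation, and the toy L-shaped path are all assumptions made for illustration.

```python
import math
from bisect import bisect_right


def resample_by_arc_length(points, n_samples):
    """Resample a polyline at n_samples points equally spaced in arc length.

    Hypothetical helper for illustration only; `points` is a sequence of
    equal-dimension coordinate tuples recorded at arbitrary time steps.
    """
    if n_samples < 2 or len(points) < 2:
        raise ValueError("need at least two waypoints and two samples")

    # Cumulative arc length s_i at every recorded waypoint.
    cum = [0.0]
    for p, q in zip(points, points[1:]):
        cum.append(cum[-1] + math.dist(p, q))
    total = cum[-1]
    if total == 0.0:
        raise ValueError("trajectory has zero length")

    resampled = []
    for k in range(n_samples):
        s = total * k / (n_samples - 1)  # target arc length for sample k
        # Segment index i with cum[i] <= s, clamped to the last segment.
        i = min(bisect_right(cum, s) - 1, len(points) - 2)
        seg = cum[i + 1] - cum[i]
        t = 0.0 if seg == 0.0 else (s - cum[i]) / seg
        # Linear interpolation inside segment i.
        resampled.append(
            tuple(a + t * (b - a) for a, b in zip(points[i], points[i + 1]))
        )
    return resampled


# The same L-shaped path recorded with different timing (assumed toy data).
fast = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
slow = [(0.0, 0.0), (0.1, 0.0), (0.5, 0.0), (1.0, 0.0), (1.0, 0.9), (1.0, 1.0)]

fast_path = resample_by_arc_length(fast, 5)
slow_path = resample_by_arc_length(slow, 5)
```

Under these assumptions, `fast_path` and `slow_path` agree up to floating-point error, because the resampling depends only on the geometry of the path and not on how densely or unevenly it was recorded in time.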

representative citing papers

Orca: The World is in Your Mind

cs.CV · 2026-06-29 · unverdicted · novelty 5.0 · 2 refs

Orca pre-trains a world latent space on 125K hours of video and 160M events via unconscious and conscious next-state-prediction learning, then shows the frozen backbone supports stronger text, image, and action readouts than specialized baselines.
