Learns state-conditioned commitment depth in a 7B vision-language policy that jointly predicts actions and replan intervals, outperforming fixed-depth baselines and larger models on Sliding Puzzle and Sokoban while providing a theoretical dominance result.
Canonical reference
Mixture of horizons in action chunking
Canonical reference. 83% of citing Pith papers cite this work as background.
citation-role summary
citation-polarity summary
years
2026 6verdicts
UNVERDICTED 6representative citing papers
A verifier called Future Forward Dynamics Causal Attention enables adaptive action execution in World Action Models, reducing model inferences by 69% and improving success rates in robotic tasks.
AsyncShield restores VLA geometric intent from latency via kinematic pose mapping and uses PPO-Lagrangian to balance tracking with LiDAR safety constraints in a plug-and-play module.
SV-VLA uses infrequent heavy VLA planning of action chunks plus a lightweight closed-loop verifier to achieve both efficiency and robustness in dynamic robot control.
A3 adaptively selects verifiable action prefixes in VLA models using group-sampled consensus and conditional re-decoding to balance robustness and speed without manual horizon tuning.
LingBot-VA combines video world modeling with policy learning via Mixture-of-Transformers, closed-loop rollouts, and asynchronous inference to improve robot manipulation in simulation and real settings.
citing papers explorer
-
When to Re-Commit: Temporal Abstraction Discovery for Long-Horizon Vision-Language Reasoning
Learns state-conditioned commitment depth in a 7B vision-language policy that jointly predicts actions and replan intervals, outperforming fixed-depth baselines and larger models on Sliding Puzzle and Sokoban while providing a theoretical dominance result.
-
When to Trust Imagination: Adaptive Action Execution for World Action Models
A verifier called Future Forward Dynamics Causal Attention enables adaptive action execution in World Action Models, reducing model inferences by 69% and improving success rates in robotic tasks.
-
AsyncShield: A Plug-and-Play Edge Adapter for Asynchronous Cloud-based VLA Navigation
AsyncShield restores VLA geometric intent from latency via kinematic pose mapping and uses PPO-Lagrangian to balance tracking with LiDAR safety constraints in a plug-and-play module.
-
Open-Loop Planning, Closed-Loop Verification: Speculative Verification for VLA
SV-VLA uses infrequent heavy VLA planning of action chunks plus a lightweight closed-loop verifier to achieve both efficiency and robustness in dynamic robot control.
-
Dynamic Execution Commitment of Vision-Language-Action Models
A3 adaptively selects verifiable action prefixes in VLA models using group-sampled consensus and conditional re-decoding to balance robustness and speed without manual horizon tuning.
-
Causal World Modeling for Robot Control
LingBot-VA combines video world modeling with policy learning via Mixture-of-Transformers, closed-loop rollouts, and asynchronous inference to improve robot manipulation in simulation and real settings.