Recognition: 2 theorem links · Lean Theorem
ReKep: Spatio-Temporal Reasoning of Relational Keypoint Constraints for Robotic Manipulation
Pith reviewed 2026-05-16 08:21 UTC · model grok-4.3
The pith
Manipulation tasks are solved in real time by optimizing sequences of relational keypoint constraints generated automatically from language instructions and RGB-D observations.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
ReKep represents each constraint as a Python function that takes 3D keypoints extracted from the environment and returns a scalar cost; a sequence of such functions defines a complete task that is solved by hierarchical optimization over end-effector trajectories in SE(3), with the functions themselves produced automatically by vision-language models from language instructions and RGB-D input, enabling real-time closed-loop control across diverse manipulation scenarios.
What carries the argument
Relational Keypoint Constraints (ReKep), Python functions that map sets of 3D keypoints to numerical costs and are solved hierarchically to yield end-effector poses.
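To make the representation concrete, here is a minimal sketch of what one such constraint function could look like. The function name, the keypoint indices, and the 10 cm offset are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def subgoal_constraint(keypoints: np.ndarray) -> float:
    """Illustrative ReKep-style constraint (hypothetical indices):
    cost is zero when keypoint 3 (e.g. a grasped teapot spout)
    sits 10 cm directly above keypoint 7 (e.g. a cup opening)."""
    target = keypoints[7] + np.array([0.0, 0.0, 0.10])  # 10 cm above keypoint 7
    return float(np.linalg.norm(keypoints[3] - target))
```

An optimizer drives this scalar toward zero; a cost of zero means the spatial relation holds exactly.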
If this is right
- Robot actions are computed as sequences of end-effector poses in SE(3) at real-time frequencies inside a perception-action loop.
- The approach supports multi-stage, in-the-wild, bimanual, and reactive manipulation behaviors.
- No task-specific training data or environment models are required for new tasks.
- Constraints are generated on the fly from free-form language and RGB-D observations.
Where Pith is reading between the lines
- If vision-language models become more reliable at producing stable constraints, the method could scale to longer-horizon tasks that currently require manual decomposition.
- The same keypoint-based cost functions might be reused across different robot embodiments by simply changing the SE(3) optimization targets.
- Iterative refinement loops that feed execution failures back to the vision-language model could reduce the impact of occasional incorrect constraint generation.
Load-bearing premise
Vision-language models will produce correct, complete, and numerically stable Python constraint functions for arbitrary new tasks and scenes.
What would settle it
A trial on a novel scene and task in which the generated ReKep functions cause the optimizer to fail to converge, produce colliding trajectories, or execute unsafe actions that violate the intended goal; any of these outcomes would falsify the load-bearing premise.
Original abstract
Representing robotic manipulation tasks as constraints that associate the robot and the environment is a promising way to encode desired robot behaviors. However, it remains unclear how to formulate the constraints such that they are 1) versatile to diverse tasks, 2) free of manual labeling, and 3) optimizable by off-the-shelf solvers to produce robot actions in real-time. In this work, we introduce Relational Keypoint Constraints (ReKep), a visually-grounded representation for constraints in robotic manipulation. Specifically, ReKep is expressed as Python functions mapping a set of 3D keypoints in the environment to a numerical cost. We demonstrate that by representing a manipulation task as a sequence of Relational Keypoint Constraints, we can employ a hierarchical optimization procedure to solve for robot actions (represented by a sequence of end-effector poses in SE(3)) with a perception-action loop at a real-time frequency. Furthermore, in order to circumvent the need for manual specification of ReKep for each new task, we devise an automated procedure that leverages large vision models and vision-language models to produce ReKep from free-form language instructions and RGB-D observations. We present system implementations on a wheeled single-arm platform and a stationary dual-arm platform that can perform a large variety of manipulation tasks, featuring multi-stage, in-the-wild, bimanual, and reactive behaviors, all without task-specific data or environment models. Website at https://rekep-robot.github.io/.
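As a toy illustration of the hierarchical scheme the abstract describes — first solve for a subgoal, then for a path reaching it — the following position-only sketch uses a generic solver. The constraint, the straight-line path, and all names are simplifying assumptions; the paper optimizes full SE(3) poses under additional path constraints:

```python
import numpy as np
from scipy.optimize import minimize

# Toy constraint: the end effector should reach the point (0.5, 0.2, 0.3).
GOAL = np.array([0.5, 0.2, 0.3])

def constraint_cost(ee_pos: np.ndarray) -> float:
    # Squared distance to the goal: smooth and easy to minimize.
    return float(np.sum((ee_pos - GOAL) ** 2))

def solve_subgoal(ee_start: np.ndarray) -> np.ndarray:
    # Level 1: optimize a single end-effector position that
    # minimizes the constraint cost.
    return minimize(constraint_cost, ee_start).x

def solve_path(ee_start: np.ndarray, ee_goal: np.ndarray, n: int = 5) -> np.ndarray:
    # Level 2: here just linear interpolation; the real system
    # optimizes a dense trajectory subject to path constraints.
    return np.linspace(ee_start, ee_goal, n)

subgoal = solve_subgoal(np.zeros(3))
path = solve_path(np.zeros(3), subgoal)
```

The two-level split is the point: the subgoal solve fixes where a stage must end, and the path solve fills in how to get there.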
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Relational Keypoint Constraints (ReKep) as Python functions that map sets of 3D keypoints to scalar costs. It claims that representing manipulation tasks as sequences of such constraints enables a hierarchical optimization procedure to produce real-time sequences of SE(3) end-effector poses, and that large vision and vision-language models can automatically generate the required ReKep functions from free-form language instructions and RGB-D observations. Physical system demonstrations on a wheeled single-arm platform and a stationary dual-arm platform are presented for multi-stage, bimanual, in-the-wild, and reactive tasks without task-specific data or environment models.
Significance. If the VLM-generated constraints prove reliable, the work would provide a practical route to versatile, label-free manipulation by composing off-the-shelf perception models with standard optimization solvers, achieving real-time closed-loop control on two distinct physical platforms. The hierarchical formulation and perception-action loop are technically coherent, but the absence of quantitative metrics, ablations, or bounded-error analysis on the generation step limits the strength of the central claim.
major comments (2)
- [Abstract and §4] Abstract and §4 (Experiments): no quantitative success rates, timing statistics, ablation studies, or failure-case analysis are reported for the hierarchical optimizer or the VLM-generated constraints, despite these being required to substantiate real-time convergence and reliability across the claimed task variety.
- [§3.3] §3.3 (Automated ReKep Generation): the procedure that prompts VLMs to emit Python constraint functions contains no verification step, numerical stability checks, or empirical evaluation of error modes (incorrect keypoint indexing, non-differentiable operations, or incomplete temporal sequencing), which directly undermines the claim that manual specification can be circumvented for arbitrary tasks.
minor comments (2)
- [§3.1] Notation for the keypoint set and cost functions is introduced without a compact mathematical definition before the Python implementation; a short formalization would improve clarity.
- [Abstract] The website link is given but no supplementary video timestamps or failure examples are referenced in the text, making it harder for readers to locate the supporting demonstrations.
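The compact formalization requested in the first minor comment could take roughly this form; the notation below is illustrative, not taken from the paper:

```latex
% A ReKep instance over K tracked 3D keypoints
% \mathbf{k} = (k_1, \ldots, k_K), each k_i \in \mathbb{R}^3, is a function
%   f : \mathbb{R}^{K \times 3} \to \mathbb{R},
% where a smaller cost means the relation is closer to holding.
% A task with N stages is then a sequence of constraint sets
%   \{ (C^{\mathrm{sub}}_i,\; C^{\mathrm{path}}_i) \}_{i=1}^{N},
% and stage i solves for end-effector poses in SE(3) that minimize
% the subgoal costs subject to the path constraints.
```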
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address each major comment below and will incorporate revisions to strengthen the quantitative support and evaluation of the automated generation process.
Point-by-point responses
-
Referee: [Abstract and §4] Abstract and §4 (Experiments): no quantitative success rates, timing statistics, ablation studies, or failure-case analysis are reported for the hierarchical optimizer or the VLM-generated constraints, despite these being required to substantiate real-time convergence and reliability across the claimed task variety.
Authors: We acknowledge that the manuscript currently emphasizes qualitative demonstrations to illustrate versatility across diverse tasks. In the revision, we will expand §4 with quantitative success rates from repeated trials on representative tasks, timing statistics for the full perception-action loop and optimizer, ablation studies isolating the hierarchical components, and a dedicated failure-case analysis. These additions will directly support the claims of real-time convergence and reliability.
Revision: yes
-
Referee: [§3.3] §3.3 (Automated ReKep Generation): the procedure that prompts VLMs to emit Python constraint functions contains no verification step, numerical stability checks, or empirical evaluation of error modes (incorrect keypoint indexing, non-differentiable operations, or incomplete temporal sequencing), which directly undermines the claim that manual specification can be circumvented for arbitrary tasks.
Authors: We agree that additional safeguards and empirical evaluation are warranted. The revised §3.3 will include a verification step that invokes a Python interpreter to detect syntax errors and basic numerical instabilities (such as division by zero or non-differentiable operations). We will also add an empirical breakdown of observed error modes across tested tasks, including incorrect keypoint indexing and incomplete temporal sequencing, together with the mitigation strategies employed in the current implementation.
Revision: yes
- A formal bounded-error analysis of the VLM-generated constraints is not feasible in this work, as it would require theoretical guarantees on large vision-language models that are currently unavailable.
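A minimal version of the verification step the rebuttal proposes might look as follows; the function name and the acceptance criteria (finite scalar cost on random keypoints) are assumptions, not the authors' implementation:

```python
import numpy as np

def verify_constraint_source(src: str, n_keypoints: int = 10) -> bool:
    """Hypothetical sanity check for a VLM-generated constraint:
    compile the source, call the first defined function on random
    keypoints, and require a finite scalar cost."""
    namespace = {"np": np}
    try:
        exec(src, namespace)  # surfaces syntax errors immediately
        fns = [v for k, v in namespace.items()
               if callable(v) and not k.startswith("_") and k != "np"]
        if not fns:
            return False  # no constraint function was defined
        cost = fns[0](np.random.rand(n_keypoints, 3))
        # Reject vector-valued, NaN, or infinite costs.
        return bool(np.ndim(cost) == 0 and np.isfinite(cost))
    except Exception:
        return False  # any runtime error fails verification
```

A check like this catches syntax errors and gross numerical failures, though not semantic errors such as referencing the wrong keypoint.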
Circularity Check
No circularity: system relies on external VLMs and standard solvers
Full rationale
The paper defines ReKep as Python functions from 3D keypoints to costs, then uses a hierarchical optimizer on SE(3) poses and delegates generation of those functions to off-the-shelf large vision and vision-language models. No equations or procedures inside the paper reduce by construction to fitted parameters, self-citations, or renamed inputs; the central claims rest on the external models' capabilities and the optimizer's standard behavior rather than any internal derivation that loops back to the paper's own outputs.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: Large vision and language models can map free-form language and RGB-D observations to correct Python constraint functions without task-specific fine-tuning.
- Domain assumption: Hierarchical optimization of sequences of keypoint costs produces feasible real-time robot trajectories in SE(3).
Forward citations
Cited by 18 Pith papers
-
CreFlow: Corrective Reflow for Sparse-Reward Embodied Video Diffusion RL
CreFlow combines LTL compositional rewards with credit-aware NFT and corrective reflow losses in online RL to improve embodied video diffusion models, raising downstream task success by 23.8 percentage points on eight...
-
PaMoSplat: Part-Aware Motion-Guided Gaussian Splatting for Dynamic Scene Reconstruction
PaMoSplat reconstructs dynamic scenes by lifting 2D segmentations to coherent 3D Gaussian parts and estimating their motions via optical flow-guided differential evolution for higher quality rendering and faster training.
-
KITE: Keyframe-Indexed Tokenized Evidence for VLM-Based Robot Failure Analysis
KITE is a training-free method that uses keyframe-indexed tokenized evidence including BEV schematics to enhance VLM performance on robot failure detection, identification, localization, explanation, and correction.
-
ST-BiBench: Benchmarking Multi-Stream Multimodal Coordination in Bimanual Embodied Tasks for MLLMs
ST-BiBench reveals a coordination paradox in which MLLMs show strong high-level strategic reasoning yet fail at fine-grained 16-dimensional bimanual action synthesis and multi-stream fusion.
-
From Reaction to Anticipation: Proactive Failure Recovery through Agentic Task Graph for Robotic Manipulation
AgentChord models manipulation tasks as directed graphs enriched with anticipatory recovery branches, using specialized agents to enable immediate, low-latency failure responses and improve success on long-horizon bim...
-
TriRelVLA: Triadic Relational Structure for Generalizable Embodied Manipulation
TriRelVLA introduces triadic object-hand-task relational representations and a task-grounded graph transformer with a relational bottleneck to improve generalization in robotic manipulation across scenes, objects, and tasks.
-
Decompose and Recompose: Reasoning New Skills from Existing Abilities for Cross-Task Robotic Manipulation
Decompose and Recompose decomposes seen robotic demonstrations into skill-action alignments and recomposes them via visual-semantic retrieval and planning to enable zero-shot cross-task generalization.
-
BridgeACT: Bridging Human Demonstrations to Robot Actions via Unified Tool-Target Affordances
BridgeACT learns robot manipulation from human videos alone by predicting task-relevant grasp regions and 3D motion affordances that map directly to robot controllers.
-
CorridorVLA: Explicit Spatial Constraints for Generative Action Heads via Sparse Anchors
CorridorVLA improves VLA models by using predicted sparse anchors to impose explicit spatial corridors on action trajectories, yielding 3.4-12.4% success rate gains on LIBERO-Plus with GR00T-Corr reaching 83.21%.
-
AssemLM: Spatial Reasoning Multimodal Large Language Models for Robotic Assembly
AssemLM uses a specialized point cloud encoder inside a multimodal LLM to reach state-of-the-art 6D pose prediction for assembly tasks, backed by a new 900K-sample benchmark called AssemBench.
-
InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy
InternVLA-M1 uses spatially guided pre-training on 2.3M examples followed by action post-training to deliver up to 17% gains on robot manipulation benchmarks and 20.6% on unseen objects.
-
CoT-VLA: Visual Chain-of-Thought Reasoning for Vision-Language-Action Models
CoT-VLA is a 7B VLA that generates future visual frames autoregressively as planning goals before actions, outperforming prior VLAs by 17% on real-world tasks and 6% in simulation.
-
HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model
HybridVLA unifies diffusion and autoregression in a single VLA model via collaborative training and ensemble to raise robot manipulation success rates by 14% in simulation and 19% in real-world tasks.
-
FAST: Efficient Action Tokenization for Vision-Language-Action Models
FAST applies discrete cosine transform to robot action sequences for efficient tokenization, enabling autoregressive VLAs to succeed on high-frequency dexterous tasks and scale to 10k hours of data while matching diff...
-
Forecast-aware Gaussian Splatting for Predictive 3D Representation in Language-Guided Pick-and-Place Manipulation
Forecast-GS predicts task-completed 3D states via Gaussian splatting to achieve higher success rates than baselines in real-world language-conditioned manipulation tasks.
-
BioProVLA-Agent: An Affordable, Protocol-Driven, Vision-Enhanced VLA-Enabled Embodied Multi-Agent System with Closed-Loop-Capable Reasoning for Biological Laboratory Manipulation
BioProVLA-Agent integrates protocol parsing, visual state verification, and VLA-based execution in a closed-loop multi-agent framework with AugSmolVLA augmentation to improve robustness for biological lab tasks like t...
-
Synergizing Efficiency and Reliability for Continuous Mobile Manipulation
A framework integrates anticipatory planning and real-time feedback via reliability-aware optimization and phase switching to achieve efficient, reliable continuous mobile manipulation under uncertainty.
-
From Video to Control: A Survey of Learning Manipulation Interfaces from Temporal Visual Data
A survey introduces an interface-centric taxonomy for video-to-control methods in robotic manipulation and identifies the robotics integration layer as the central open challenge.
Reference graph
Works this paper leans on
-
[1]
L. P. Kaelbling and T. Lozano-Pérez. Hierarchical planning in the now. In Workshops at the Twenty-Fourth AAAI Conference on Artificial Intelligence, 2010
work page 2010
-
[2]
D. Driess, J.-S. Ha, M. Toussaint, and R. Tedrake. Learning models as functionals of signed-distance fields for manipulation planning. In Conference on Robot Learning, pages 245–255. PMLR, 2022
work page 2022
-
[3]
A. Simeonov, Y. Du, A. Tagliasacchi, J. B. Tenenbaum, A. Rodriguez, P. Agrawal, and V. Sitzmann. Neural descriptor fields: SE(3)-equivariant object representations for manipulation. In 2022 International Conference on Robotics and Automation (ICRA), pages 6394–6400. IEEE, 2022
work page 2022
-
[4]
L. Manuelli, W. Gao, P. Florence, and R. Tedrake. kPAM: Keypoint affordances for category-level robotic manipulation. In The International Symposium of Robotics Research, pages 132–157. Springer, 2019
work page 2019
-
[5]
DINOv2: Learning Robust Visual Features without Supervision
M. Oquab, T. Darcet, T. Moutakanni, H. Vo, M. Szafraniec, V. Khalidov, P. Fernandez, D. Haziza, F. Massa, A. El-Nouby, et al. DINOv2: Learning robust visual features without supervision. arXiv preprint arXiv:2304.07193, 2023
work page · Pith review · arXiv 2023
- [6]
-
[7]
M. Toussaint, J. Harris, J.-S. Ha, D. Driess, and W. Hönig. Sequence-of-constraints MPC: Reactive timing-optimal control of sequential manipulation. In 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 13753–13760. IEEE, 2022
work page 2022
-
[8]
L. P. Kaelbling and T. Lozano-Pérez. Integrated task and motion planning in belief space. The International Journal of Robotics Research, 32(9-10):1194–1227, 2013
work page 2013
-
[9]
S. Srivastava, E. Fang, L. Riano, R. Chitnis, S. Russell, and P. Abbeel. Combined task and motion planning through an extensible planner-independent interface layer. In 2014 IEEE international conference on robotics and automation (ICRA), 2014
work page 2014
-
[10]
A. Byravan and D. Fox. Se3-nets: Learning rigid body motion using deep neural networks. In 2017 IEEE International Conference on Robotics and Automation (ICRA), pages 173–180. IEEE, 2017
work page 2017
-
[11]
N. T. Dantam, Z. K. Kingston, S. Chaudhuri, and L. E. Kavraki. An incremental constraint-based framework for task and motion planning. The International Journal of Robotics Research, 37(10):1134–1151, 2018
work page 2018
-
[12]
T. Migimatsu and J. Bohg. Object-centric task and motion planning in dynamic environments. IEEE Robotics and Automation Letters, 5(2):844–851, 2020
work page 2020
-
[13]
C. R. Garrett, R. Chitnis, R. Holladay, B. Kim, T. Silver, L. P. Kaelbling, and T. Lozano-Pérez. Integrated task and motion planning. Annual Review of Control, Robotics, and Autonomous Systems, 4:265–293, 2021
work page 2021
- [14]
- [15]
-
[16]
S. Tyree, J. Tremblay, T. To, J. Cheng, T. Mosier, J. Smith, and S. Birchfield. 6-DoF pose estimation of household objects for robotic manipulation: An accessible dataset and benchmark. In 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 13081–13088. IEEE, 2022
work page 2022
-
[17]
C. Pan, B. Okorn, H. Zhang, B. Eisner, and D. Held. TAX-Pose: Task-specific cross-pose estimation for robot manipulation. In Conference on Robot Learning, pages 1783–1792. PMLR, 2023
work page 2023
- [18]
-
[19]
I. Lenz, R. A. Knepper, and A. Saxena. DeepMPC: Learning deep latent features for model predictive control. In Robotics: Science and Systems, volume 10. Rome, Italy, 2015
work page 2015
-
[20]
M. B. Chang, T. Ullman, A. Torralba, and J. B. Tenenbaum. A compositional object-based approach to learning physical dynamics. arXiv preprint arXiv:1612.00341, 2016
work page · Pith review · arXiv 2016
-
[21]
P. Battaglia, R. Pascanu, M. Lai, D. Jimenez Rezende, et al. Interaction networks for learning about objects, relations and physics. Advances in neural information processing systems, 29, 2016
work page 2016
-
[22]
A. Sanchez-Gonzalez, N. Heess, J. T. Springenberg, J. Merel, M. Riedmiller, R. Hadsell, and P. Battaglia. Graph networks as learnable physics engines for inference and control. In International Conference on Machine Learning, pages 4470–4479. PMLR, 2018
work page 2018
-
[23]
E. Jang, C. Devin, V. Vanhoucke, and S. Levine. Grasp2Vec: Learning object representations from self-supervised grasping. arXiv preprint arXiv:1811.06964, 2018
work page · Pith review · arXiv 2018
-
[24]
Deep Object Pose Estimation for Semantic Robotic Grasping of Household Objects
J. Tremblay, T. To, B. Sundaralingam, Y. Xiang, D. Fox, and S. Birchfield. Deep object pose estimation for semantic robotic grasping of household objects. arXiv preprint arXiv:1809.10790, 2018
work page · Pith review · arXiv 2018
-
[25]
Z. Xu, J. Wu, A. Zeng, J. B. Tenenbaum, and S. Song. Densephysnet: Learning dense physical object representations via multi-step dynamic interactions. arXiv preprint arXiv:1906.03853, 2019
work page · Pith review · arXiv 2019
-
[26]
J. Mao, C. Gan, P. Kohli, J. B. Tenenbaum, and J. Wu. The neuro-symbolic concept learner: Interpreting scenes, words, and sentences from natural supervision. arXiv preprint arXiv:1904.12584, 2019
work page · Pith review · arXiv 2019
-
[27]
C. P. Burgess, L. Matthey, N. Watters, R. Kabra, I. Higgins, M. Botvinick, and A. Lerchner. MONet: Unsupervised scene decomposition and representation. arXiv preprint arXiv:1901.11390, 2019
work page · Pith review · arXiv 2019
- [28]
-
[29]
F. Locatello, D. Weissenborn, T. Unterthiner, A. Mahendran, G. Heigold, J. Uszkoreit, A. Dosovitskiy, and T. Kipf. Object-centric learning with slot attention. Advances in neural information processing systems, 33:11525–11538, 2020
work page 2020
-
[30]
N. Heravi, A. Wahid, C. Lynch, P. Florence, T. Armstrong, J. Tompson, P. Sermanet, J. Bohg, and D. Dwibedi. Visuomotor control in multi-object scenes using object-aware representations. In 2023 IEEE International Conference on Robotics and Automation (ICRA), pages 9515–9522. IEEE, 2023
work page 2023
- [31]
-
[32]
W. Yuan, C. Paxton, K. Desingh, and D. Fox. Sornet: Spatial object-centric representations for sequential manipulation. In Conference on Robot Learning, pages 148–157. PMLR, 2022
work page 2022
- [33]
-
[34]
J. Hsu, J. Mao, J. Tenenbaum, and J. Wu. What’s left? concept grounding with logic-enhanced foundation models. Advances in Neural Information Processing Systems, 36, 2024
work page 2024
-
[35]
Y. Li, J. Wu, R. Tedrake, J. B. Tenenbaum, and A. Torralba. Learning particle dynamics for manipulating rigid bodies, deformable objects, and fluids. arXiv preprint arXiv:1810.01566, 2018
work page · Pith review · arXiv 2018
- [36]
- [37]
- [38]
-
[39]
X. Lin, Y. Wang, Z. Huang, and D. Held. Learning visible connectivity dynamics for cloth smoothing. In Conference on Robot Learning, pages 256–266. PMLR, 2022
work page 2022
-
[40]
J. Abou-Chakra, K. Rana, F. Dayoub, and N. Sünderhauf. Physically embodied Gaussian splatting: A realtime correctable world model for robotics. arXiv preprint arXiv:2406.10788, 2024
- [41]
-
[42]
T. Schmidt, R. Newcombe, and D. Fox. Self-supervised visual descriptor learning for dense correspondence. IEEE Robotics and Automation Letters, 2(2):420–427, 2016
work page 2016
-
[43]
P. R. Florence, L. Manuelli, and R. Tedrake. Dense object nets: Learning dense visual object descriptors by and for robotic manipulation. arXiv preprint arXiv:1806.08756, 2018
work page · Pith review · arXiv 2018
-
[44]
T. D. Kulkarni, A. Gupta, C. Ionescu, S. Borgeaud, M. Reynolds, A. Zisserman, and V. Mnih. Unsupervised learning of object keypoints for perception and control. Advances in Neural Information Processing Systems, 32, 2019
work page 2019
-
[45]
Z. Qin, K. Fang, Y. Zhu, L. Fei-Fei, and S. Savarese. KETO: Learning keypoint representations for tool manipulation. In 2020 IEEE International Conference on Robotics and Automation (ICRA), pages 7278–7285. IEEE, 2020
work page 2020
-
[46]
P. Sundaresan, J. Grannen, B. Thananjeyan, A. Balakrishna, M. Laskey, K. Stone, J. E. Gonzalez, and K. Goldberg. Learning rope manipulation policies using dense object descriptors trained on synthetic depth data. In 2020 IEEE International Conference on Robotics and Automation (ICRA), pages 9411–9418. IEEE, 2020
work page 2020
-
[47]
L. Manuelli, Y. Li, P. Florence, and R. Tedrake. Keypoints into the future: Self-supervised correspondence in model-based reinforcement learning. arXiv preprint arXiv:2009.05085, 2020
-
[48]
B. Chen, P. Abbeel, and D. Pathak. Unsupervised learning of visual 3d keypoints for control. In International Conference on Machine Learning, pages 1539–1549. PMLR, 2021
work page 2021
-
[49]
A. Simeonov, Y. Du, Y.-C. Lin, A. R. Garcia, L. P. Kaelbling, T. Lozano-Pérez, and P. Agrawal. SE(3)-equivariant relational rearrangement with neural descriptor fields. In Conference on Robot Learning, pages 835–846. PMLR, 2023
work page 2023
-
[50]
M. Vecerik, C. Doersch, Y. Yang, T. Davchev, Y. Aytar, G. Zhou, R. Hadsell, L. Agapito, and J. Scholz. RoboTAP: Tracking arbitrary points for few-shot visual imitation. arXiv preprint arXiv:2308.15975, 2023
-
[51]
E. Chun, Y. Du, A. Simeonov, T. Lozano-Perez, and L. Kaelbling. Local neural descriptor fields: Locally conditioned object representations for manipulation. In 2023 IEEE International Conference on Robotics and Automation (ICRA), pages 1830–1836. IEEE, 2023
work page 2023
-
[52]
S. Bahl, R. Mendonca, L. Chen, U. Jain, and D. Pathak. Affordances from human videos as a versatile representation for robotics. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 13778–13790, 2023
work page 2023
- [53]
-
[54]
H. Bharadhwaj, R. Mottaghi, A. Gupta, and S. Tulsiani. Track2act: Predicting point tracks from internet videos enables diverse zero-shot robot manipulation, 2024
work page 2024
-
[55]
Z. Kingston, M. Moll, and L. E. Kavraki. Sampling-based methods for motion planning with constraints. Annual review of control, robotics, and autonomous systems, 1:159–185, 2018
work page 2018
-
[56]
N. Ratliff, M. Zucker, J. A. Bagnell, and S. Srinivasa. CHOMP: Gradient optimization techniques for efficient motion planning. In 2009 IEEE International Conference on Robotics and Automation, pages 489–494. IEEE, 2009
work page 2009
-
[57]
J. Schulman, Y. Duan, J. Ho, A. Lee, I. Awwal, H. Bradlow, J. Pan, S. Patil, K. Goldberg, and P. Abbeel. Motion planning with sequential convex optimization and convex collision checking. The International Journal of Robotics Research, 33(9):1251–1270, 2014
work page 2014
-
[58]
B. Sundaralingam, S. K. S. Hari, A. Fishman, C. Garrett, K. Van Wyk, V. Blukis, A. Millane, H. Oleynikova, A. Handa, F. Ramos, et al. cuRobo: Parallelized collision-free robot motion generation. In 2023 IEEE International Conference on Robotics and Automation (ICRA), pages 8112–8119. IEEE, 2023
work page 2023
-
[59]
T. Marcucci, J. Umenberger, P. Parrilo, and R. Tedrake. Shortest paths in graphs of convex sets. SIAM Journal on Optimization, 34(1):507–532, 2024
work page 2024
-
[60]
N. D. Ratliff, J. Issac, D. Kappler, S. Birchfield, and D. Fox. Riemannian motion policies. arXiv preprint arXiv:1801.02854, 2018
work page · Pith review · arXiv 2018
-
[61]
M. Posa, S. Kuindersma, and R. Tedrake. Optimization and stabilization of trajectories for constrained dynamical systems. In 2016 IEEE International Conference on Robotics and Automation (ICRA), pages 1366–1373. IEEE, 2016
work page 2016
-
[62]
I. Mordatch, E. Todorov, and Z. Popović. Discovery of complex behaviors through contact-invariant optimization. ACM Transactions on Graphics (ToG), 31(4):1–8, 2012
work page 2012
-
[63]
I. Mordatch, Z. Popović, and E. Todorov. Contact-invariant optimization for hand manipulation. In Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pages 137–144, 2012
work page 2012
-
[64]
M. Posa, C. Cantu, and R. Tedrake. A direct method for trajectory optimization of rigid bodies through contact. The International Journal of Robotics Research, 33(1):69–81, 2014
work page 2014
- [65]
-
[66]
Z. Liu, G. Zhou, J. He, T. Marcucci, F.-F. Li, J. Wu, and Y. Li. Model-based control with sparse neural dynamics. Advances in Neural Information Processing Systems, 36, 2024
work page 2024
-
[67]
K. M. Lynch and M. T. Mason. Stable pushing: Mechanics, controllability, and planning. The International Journal of Robotics Research, 15(6):533–556, 1996
work page 1996
-
[68]
Y. Hou, Z. Jia, and M. T. Mason. Fast planning for 3D any-pose-reorienting using pivoting. In 2018 IEEE International Conference on Robotics and Automation (ICRA), pages 1631–1638. IEEE, 2018
work page 2018
-
[69]
J.-P. Sleiman, F. Farshidian, and M. Hutter. Versatile multicontact planning and control for legged loco-manipulation. Science Robotics, 8(81):eadg5014, 2023
work page 2023
-
[70]
W. Yang and M. Posa. Dynamic on-palm manipulation via controlled sliding. arXiv preprint arXiv:2405.08731, 2024
- [71]
-
[72]
K. Yunt and C. Glocker. Trajectory optimization of mechanical hybrid systems using SUMT. In 9th IEEE International Workshop on Advanced Motion Control, 2006, pages 665–671. IEEE, 2006
work page 2006
-
[73]
K. Yunt and C. Glocker. A combined continuation and penalty method for the determination of optimal hybrid mechanical trajectories. In IUTAM Symposium on Dynamics and Control of Nonlinear Systems with Uncertainty: Proceedings of the IUTAM Symposium held in Nanjing, China, September 18–22, 2006, pages 187–196. Springer, 2007
work page 2006
-
[74]
K. Yunt. An augmented Lagrangian based shooting method for the optimal trajectory generation of switching Lagrangian systems. Dynamics of Continuous, Discrete and Impulsive Systems Series B: Applications and Algorithms, 18(5):615–645, 2011
work page 2011
-
[75]
F. Lagriffoul, D. Dimitrov, A. Saffiotti, and L. Karlsson. Constraint propagation on interval bounds for dealing with geometric backtracking. In 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 957–964. IEEE, 2012
work page 2012
-
[76]
F. Lagriffoul, D. Dimitrov, J. Bidot, A. Saffiotti, and L. Karlsson. Efficiently combining task and motion planning using geometric constraints. The International Journal of Robotics Research, 33(14):1726–1747, 2014
work page 2014
-
[77]
T. Lozano-Pérez and L. P. Kaelbling. A constraint-based method for solving sequential manipulation planning problems. In 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 3684–3691. IEEE, 2014
work page 2014
- [78]
- [79]
- [80]
discussion (0)